subsequence removal
sequence analysis
combinatorics
algorithmic problem
discrete mathematics

How can I determine all possible ways a subsequence can be removed from a sequence?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

A subsequence removal can be interpreted as choosing indices in the original sequence that match a target subsequence and then deleting those indices. Different index choices can produce different removal "ways" even if the removed values look the same. Solving this well requires separating two tasks: counting all valid removals and enumerating the exact index combinations.

Problem Definition

Given sequence S and target subsequence T:

  • A valid removal way is an increasing index list in S whose values form T.
  • Removing those indices produces a residual sequence.

Example:

  • S = "banana"
  • T = "ban"

The first three letters give one valid removal. If repeated characters exist, multiple index sets may match same target.

Enumerating All Removal Ways

Backtracking is the clearest approach when you need explicit index combinations.

python
1from typing import List, Tuple
2
3
4def all_removal_ways(s: str, t: str) -> List[Tuple[int, ...]]:
5    ways = []
6
7    def dfs(i_s: int, i_t: int, chosen: List[int]) -> None:
8        if i_t == len(t):
9            ways.append(tuple(chosen))
10            return
11        if i_s == len(s):
12            return
13
14        for j in range(i_s, len(s)):
15            if s[j] == t[i_t]:
16                chosen.append(j)
17                dfs(j + 1, i_t + 1, chosen)
18                chosen.pop()
19
20    dfs(0, 0, [])
21    return ways
22
23
24s, t = "ababa", "aba"
25ways = all_removal_ways(s, t)
26print(ways)

Output index tuples represent all distinct ways to remove T as subsequence.

Counting Without Listing All Ways

If you only need count, dynamic programming is much faster and memory-friendly than storing all index tuples.

Define dp[i][j] as number of ways target prefix length j appears as subsequence in source prefix length i.

Transition:

  • Ignore current source character.
  • If characters match, also include ways using that match.
python
1def count_removal_ways(s: str, t: str) -> int:
2    n, m = len(s), len(t)
3    dp = [[0] * (m + 1) for _ in range(n + 1)]
4
5    for i in range(n + 1):
6        dp[i][0] = 1  # empty target always matches
7
8    for i in range(1, n + 1):
9        for j in range(1, m + 1):
10            dp[i][j] = dp[i - 1][j]
11            if s[i - 1] == t[j - 1]:
12                dp[i][j] += dp[i - 1][j - 1]
13
14    return dp[n][m]
15
16
17print(count_removal_ways("ababa", "aba"))

This is the classic distinct-subsequence count formulation.

Residual Sequences After Removal

Sometimes question is not about index sets but resulting sequences after deletion. Different index sets may produce same residue, so deduplication may be needed.

python
1def residuals_after_removal(s: str, ways):
2    out = set()
3    for idxs in ways:
4        remove = set(idxs)
5        residue = ''.join(ch for i, ch in enumerate(s) if i not in remove)
6        out.add(residue)
7    return sorted(out)
8
9s, t = "ababa", "aba"
10ways = all_removal_ways(s, t)
11print(residuals_after_removal(s, ways))

This distinction matters in combinatorics tasks where unique outcomes are required.

Complexity Discussion

For enumeration:

  • Runtime can grow with number of valid matches, which can be exponential in repeated-character cases.
  • Memory grows with number of stored ways.

For DP counting:

  • Time complexity is O(len(S) * len(T)).
  • Space can be reduced to one dimension if only final count is needed.

Space-optimized counting:

python
1def count_removal_ways_1d(s: str, t: str) -> int:
2    m = len(t)
3    dp = [0] * (m + 1)
4    dp[0] = 1
5
6    for ch in s:
7        for j in range(m, 0, -1):
8            if ch == t[j - 1]:
9                dp[j] += dp[j - 1]
10
11    return dp[m]

Reverse update order is essential to avoid reusing updated values from same source character.

When to Use Which Method

Use enumeration when:

  • You need explicit index paths.
  • You need to reconstruct specific removals.

Use DP count when:

  • You only need number of ways.
  • Input size is larger and enumeration is infeasible.

Hybrid strategy can first count via DP, then enumerate only when count is manageable.

Common Pitfalls

  • Confusing subsequence with substring. Fix by remembering subsequence allows gaps while preserving order.
  • Counting value matches instead of index-based matches. Fix by defining removal way as index tuple.
  • Forgetting duplicate residues from different index choices. Fix by deduplicating residual strings when required.
  • Using forward one-dimensional DP update. Fix by updating target index in reverse order.
  • Enumerating blindly on large repeated strings. Fix by using DP counting first to estimate search size.

Summary

  • A removal way is best modeled as a strictly increasing index set matching target subsequence.
  • Backtracking enumerates exact removal paths.
  • Dynamic programming counts paths efficiently without full enumeration.
  • Residual-sequence counting may require deduplication separate from path counting.
  • Choose method by output requirement: explicit paths or just total count.

Course illustration
Course illustration

All Rights Reserved.