最長共同子序列 LCS

Vibe Prompt

「幫我用 DP 找出 'abcde' 與 'ace' 的 LCS，印出 dp 表格與最長共同子序列本身。」

def lcs(s1, s2):
    m, n = len(s1), len(s2)
    dp = [[0] * (n+1) for _ in range(m+1)]
    
    for i in range(1, m+1):
        for j in range(1, n+1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    # 回溯
    i, j = m, n
    result = []
    while i > 0 and j > 0:
        if s1[i-1] == s2[j-1]:
            result.append(s1[i-1])
            i -= 1; j -= 1
        elif dp[i-1][j] > dp[i][j-1]:
            i -= 1
        else:
            j -= 1
    
    return dp[m][n], ''.join(reversed(result))

length, seq = lcs("abcde", "ace")
print(f"LCS 長度: {length}")
print(f"LCS: {seq}")

# 應用：Git Diff 比較
old = "int x = 10;"
new = "int y = 20;"
_, common = lcs(old, new)
print(f"Git diff 共同: '{common}'")

DP 表格追蹤

"abcde" vs "ace" 的 LCS 表格

      ""  a   c   e
""    0   0   0   0
a    0   1   1   1
b    0   1   1   1
c    0   1   2   2
d    0   1   2   2
e    0   1   2   3  ← LCS 長度 = 3 ("ace")

應用場景

| 領域 | 用途 | |------|------| | Git Diff | 比較兩個版本找出共通行與差異行 | | DNA 序列比對 | 計算物種間的演化距離 | | 抄襲檢測 | 找出兩份文件中的共同段落 | | 資料壓縮 | Unix diff 指令的核心演算法 | | 最長共同子字串 | LCS 的連續版本（LCSubstring） |

總結

| 面向 | 詳情 | |------|------| | 狀態 | dp[i][j] = s1[:i] 和 s2[:j] 的 LCS 長度 | | 遞迴 | 相符: dp[i-1][j-1]+1 | 不符: max(dp[i-1][j], dp[i][j-1]) | | 初始 | dp[0][j] = dp[i][0] = 0 | | 時間 | O(m × n) | | 回溯 | 相符走斜角，不符走向 max(左, 上)始** | dp[0][j] = dp[i][0] = 0 |\n| 時間 | O(m × n) |\n| 回溯 | 相符走斜角，不符走向 max(左, 上) |"}]}

常見問題與解決方案

| 問題 | 原因 | 解法 | |------|------|------| | 結果不如預期 | 參數設定不當 | 檢查預設值與邊界條件 | | 執行速度慢 | 演算法效率低 | 考慮使用更高效的資料結構 | | 記憶體不足 | 資料量過大 | 使用批次處理或串流方式 | | 難以除錯 | 缺乏日誌 | 加入詳細的 log 輸出 |

Performance Considerations

When working with large datasets or complex computations:

Time Complexity: Analyze and optimize Big O
Space Complexity: Balance memory vs speed
Caching: Store computed results to avoid recalculation
Parallelism: Use multi-threading for independent tasks
Profiling: Measure before optimizing - use profilers

Real-World Application

This concept is widely used in:

Web development (routing, authentication)
Data science (feature engineering, model training)
Game development (pathfinding, physics)
Mobile apps (state management, caching)

LCS：字串比對的核心演算法

最長共同子序列（Longest Common Subsequence）是字串比對的基礎演算法。跟 substring 不同，subsequence 不要求連續——"abc" 是 "aebfc" 的 subsequence。

LCS 的 DP 定義

狀態：dp[i][j] = 字串 A[0:i] 和 字串 B[0:j] 的 LCS 長度

轉移：
if A[i-1] == B[j-1]:  # 字元相同
    dp[i][j] = dp[i-1][j-1] + 1
else:  # 字元不同
    dp[i][j] = max(dp[i-1][j], dp[i][j-1])

真實應用

| 應用 | 說明 | |:----|:----| | Git Diff | 比較兩個檔案的差異 | | 拼寫檢查 | 計算輸入字串與字典的編輯距離 | | DNA 比對 | 找出基因序列的相似度 | | 抄襲偵測 | 比對兩篇文章的相似程度 |

下一章預告：拼寫檢查 API

LCS 是編輯距離（Edit Distance / Levenshtein Distance）的基礎。下一章將編輯距離包裝成一個 FastAPI 服務——輸入錯誤的單字，回傳最接近的正確拼寫建議。