Hands-on: Training Linear Regression from Scratch
Vibe Prompt
"Train a linear regression model from scratch with the Adam optimizer, and compare the hand-written Adam implementation against sklearn's result."
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression
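# Generate synthetic regression data
X, y = make_regression(n_samples=1000, n_features=10, noise=10, random_state=42)
y = y.reshape(-1, 1)  # column vector so shapes match X @ w + b
# Fix random_state so the split (and the printed scores) are reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize using training-set statistics only (avoids test-set leakage)
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std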
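# Hand-written Adam training loop (full-batch gradients)
def train_adam(X, y, lr=0.01, epochs=200):
    m, n = X.shape  # m = number of samples, n = number of features
    w = np.zeros((n, 1))
    b = 0.0
    # First (m_*) and second (v_*) moment estimates for w and b
    m_w, v_w = np.zeros((n, 1)), np.zeros((n, 1))
    m_b, v_b = 0.0, 0.0
    beta1, beta2 = 0.9, 0.999
    eps = 1e-8
    losses = []
    for t in range(1, epochs + 1):
        # Forward pass and MSE loss
        pred = X @ w + b
        loss = np.mean((pred - y) ** 2)
        losses.append(loss)
        # Gradients of the MSE with respect to w and b
        dw = (2 / m) * X.T @ (pred - y)
        db = (2 / m) * np.sum(pred - y)
        # Exponential moving averages of the gradient and its square
        m_w = beta1 * m_w + (1 - beta1) * dw
        v_w = beta2 * v_w + (1 - beta2) * dw * dw
        m_b = beta1 * m_b + (1 - beta1) * db
        v_b = beta2 * v_b + (1 - beta2) * db * db
        # Bias correction (the averages start at zero, so early values are too small)
        m_w_hat = m_w / (1 - beta1 ** t)
        v_w_hat = v_w / (1 - beta2 ** t)
        m_b_hat = m_b / (1 - beta1 ** t)
        v_b_hat = v_b / (1 - beta2 ** t)
        # Parameter update
        w -= lr * m_w_hat / (np.sqrt(v_w_hat) + eps)
        b -= lr * m_b_hat / (np.sqrt(v_b_hat) + eps)
    return w, b, losses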
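# Adam moves each weight by roughly lr per step, so the defaults
# (0.01 x 200 steps, about 2 in total) fall short of make_regression
# coefficients, which can be as large as ~100. The values below are one
# workable choice, not tuned optima.
w, b, losses = train_adam(X_train, y_train, lr=0.1, epochs=3000)
pred = X_test @ w + b
print(f"Hand-written Adam R²: {r2_score(y_test, pred):.4f}")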
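# Comparison with sklearn's ordinary least-squares fit
sk_model = LinearRegression()  # named sk_model to avoid confusion with the learning rate lr
sk_model.fit(X_train, y_train)
print(f"sklearn R²: {r2_score(y_test, sk_model.predict(X_test)):.4f}")
# coef_ has shape (1, n_features) because y is 2-D, so flatten both sides
print(f"Mean absolute weight difference: {np.mean(np.abs(w.ravel() - sk_model.coef_.ravel())):.6f}")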
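The training loop depends on the analytic MSE gradients `dw` and `db` being right, so it can be worth checking them against finite differences before trusting the optimizer. The following is a minimal sketch of such a check, not part of the original script: the helper names (`mse_loss`, `analytic_grads`, `numeric_grads`) and the small random `*_chk` arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

def mse_loss(X, y, w, b):
    """Mean squared error of the linear model X @ w + b."""
    return np.mean((X @ w + b - y) ** 2)

def analytic_grads(X, y, w, b):
    """Same gradient formulas as in the training loop above."""
    m = X.shape[0]
    err = X @ w + b - y
    dw = (2 / m) * X.T @ err
    db = (2 / m) * np.sum(err)
    return dw, db

def numeric_grads(X, y, w, b, h=1e-6):
    """Central finite differences, one parameter at a time."""
    dw = np.zeros_like(w)
    for i in range(w.shape[0]):
        step = np.zeros_like(w)
        step[i, 0] = h
        dw[i, 0] = (mse_loss(X, y, w + step, b)
                    - mse_loss(X, y, w - step, b)) / (2 * h)
    db = (mse_loss(X, y, w, b + h) - mse_loss(X, y, w, b - h)) / (2 * h)
    return dw, db

# Small random problem (illustrative data, unrelated to make_regression)
rng = np.random.default_rng(0)
X_chk = rng.normal(size=(50, 3))
y_chk = rng.normal(size=(50, 1))
w_chk = rng.normal(size=(3, 1))
b_chk = 0.5

dw_a, db_a = analytic_grads(X_chk, y_chk, w_chk, b_chk)
dw_n, db_n = numeric_grads(X_chk, y_chk, w_chk, b_chk)
max_diff = max(np.max(np.abs(dw_a - dw_n)), abs(db_a - db_n))
print(f"max gradient difference: {max_diff:.2e}")
```

Because the MSE is quadratic in `w` and `b`, central differences agree with the analytic gradient up to floating-point rounding, so a large difference here would point to a bug in the gradient formulas rather than in the Adam update.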
Today's Summary
- ✅ Gradient descent fundamentals
- ✅ Momentum / RMSProp / Adam
- ✅ SGD / Mini-Batch
- ✅ Automatic differentiation engine
- ✅ Training linear regression from scratch