T1 — Classical ML + matched projection (Type 1)
Paper evidence: main.pdf · Block findings
Nuisance: D1 subspace · Mode B: estimate $\hat{W}$, project features (or matched ridge penalty), then classical models — ridge, SVM, k-NN, logistic.
Notebook: t01-classical.ipynb — one Run All path:
| § | Content |
|---|---|
| 1–4 | Core loop + falsification (logistic, SVM, k-NN) |
| 5 | Office-31 ResNet extract + t1_office31_sklearn (toggle RUN_OFFICE31) |
| 6 | Soft k-NN (pmh.classical.compare_knn_hard_vs_soft) |
| 7 | Ridge + California housing tabular |
| 8 | MNIST pixels + data-driven $W$ |
Quick smoke: python scripts/demos/office31_sklearn.py or Run All on t01-classical.ipynb.
What T1 achieved (headline)
Matched anisotropic regularization along a nuisance subspace $W$ beats isotropic and wrong-$W$ controls — across ridge (theorem), SVM / k-NN / logistic, synthetic and real features.
| Evidence | Result |
|---|---|
| Theorem 1 (ridge, closed form) | Matched-ridge test MSE flat in OOD nuisance scale; B0 / iso / wrong grow with $\sigma_{test}$ |
| Oracle $W$ (MNIST, Fashion-MNIST) | Strict four-arm ordering for SVM, k-NN, logistic — matched wins |
| Data-driven $W$ (DCT drift) | Strong on Fashion-MNIST SVM/logistic; k-NN needs soft Mahalanobis metric |
| SVHN (real street digits) | SVM matched +20 pp over B0 |
| California Housing (tabular ridge) | Same flat-MSE behaviour as theorem on real UCI features |
| EB-GP | Marginal likelihood picks matched limit when $W$ is right, isotropic when wrong |
| Office-31 (Amazon → DSLR, ResNet-18 512-d) | Honest mixed result: CORAL beats PMH on all three classifiers; PMH still > B0 on SVM |
Paper arms: B0 · E1_iso · E1_matched · E1_wrong
Library keys: b0 · isotropic · matched · wrong_w (+ optional coral)
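The ridge row of the evidence table can be illustrated with a small synthetic sketch. It borrows only $d{=}50$ and $r{=}5$ from the ridge-theorem setup; the data model, the penalty form $\lambda I + \mu W W^\top$, the hyperparameter values, and every function name below are assumptions made for illustration and are not taken from ridge_theory.py.

```python
import numpy as np

def fit_ridge(X, y, penalty):
    """Closed-form generalized ridge: solve (X^T X + penalty) beta = X^T y."""
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def make_data(n, W, beta, sigma_nuis, rng, noise=0.1):
    """Signal lives in span(W)^perp; label-independent nuisance is injected along W."""
    d, r = W.shape
    signal = rng.standard_normal((n, d)) @ (np.eye(d) - W @ W.T)
    nuisance = sigma_nuis * rng.standard_normal((n, r)) @ W.T
    y = signal @ beta + noise * rng.standard_normal(n)
    return signal + nuisance, y

def run_ridge_arms(sigmas=(1.0, 5.0, 20.0), d=50, r=5, n=400,
                   lam=1.0, mu=1e6, seed=0):
    rng = np.random.default_rng(seed)
    # Two mutually orthogonal r-dimensional subspaces: the true W and a wrong one.
    Q, _ = np.linalg.qr(rng.standard_normal((d, 2 * r)))
    W, W_wrong = Q[:, :r], Q[:, r:]
    beta = rng.standard_normal(d)
    X, y = make_data(n, W, beta, 0.1, rng)  # small nuisance at train time
    eye = np.eye(d)
    penalties = {
        "b0": lam * eye,
        "isotropic": 10.0 * lam * eye,                    # stronger, but direction-blind
        "matched": lam * eye + mu * W @ W.T,              # heavy penalty along W only
        "wrong_w": lam * eye + mu * W_wrong @ W_wrong.T,  # heavy penalty along wrong W
    }
    fits = {arm: fit_ridge(X, y, P) for arm, P in penalties.items()}
    results = {arm: [] for arm in penalties}
    for sigma in sigmas:
        X_te, y_te = make_data(2000, W, beta, sigma, rng)  # OOD nuisance scale
        for arm, b in fits.items():
            results[arm].append(float(np.mean((X_te @ b - y_te) ** 2)))
    return results

results = run_ridge_arms()
for arm, mses in results.items():
    print(arm, [round(m, 3) for m in mses])
```

In this toy setting the matched penalty drives the coefficients along $W$ to nearly zero, so the test error should barely depend on the test-time nuisance scale, while the other arms keep some weight along $W$ and degrade as that scale grows.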
How data is plugged in (paper → you)
Every T1 script follows the same pattern:
- Fixed features $x = \phi(\text{input})$ — pixels, ResNet embeddings, or tabular rows.
- Site A / site B — train vs deploy (or oracle-injected nuisance along known $W$).
- Estimate $\hat W$ — cross-domain SVD (DCT drift, Office-31) or oracle orthogonal subspace.
- Apply PMH — hard projection $P_{W^\perp}$ for SVM/logistic; soft metric for k-NN under estimated $W$.
- Report on OOD / target test only (pool vs test split on Office-31).
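The estimate-then-project steps above can be sketched in NumPy. This is one possible reading of "cross-domain SVD", assuming paired rows (the same input observed at site A and site B); the paper's estimate_cross_domain_subspace may work differently. The helper names and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_subspace(x_a, x_b, rank):
    """Top-`rank` right singular vectors of the paired differences x_b - x_a.

    Returns an orthonormal basis W_hat of shape (d, rank).
    Assumes row i of x_a and row i of x_b describe the same input.
    """
    diff = x_b - x_a
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    return vt[:rank].T

def project_perp(x, W):
    """Hard projection onto the orthogonal complement of span(W): x (I - W W^T)."""
    return x - (x @ W) @ W.T

# Synthetic check: site B equals site A plus drift along a known subspace.
rng = np.random.default_rng(0)
d, r, n = 20, 3, 500
W_true, _ = np.linalg.qr(rng.standard_normal((d, r)))
x_a = rng.standard_normal((n, d))
drift = 3.0 * rng.standard_normal((n, r)) @ W_true.T
x_b = x_a + drift + 0.01 * rng.standard_normal((n, d))

W_hat = estimate_subspace(x_a, x_b, r)

# Cosines of the principal angles between span(W_true) and span(W_hat).
overlap = np.linalg.svd(W_true.T @ W_hat, compute_uv=False)
# After projection, the two sites should nearly coincide.
max_gap = float(np.max(np.abs(project_perp(x_b, W_hat) - project_perp(x_a, W_hat))))
print("min principal-angle cosine:", overlap.min())
print("max post-projection gap:", max_gap)
```

The projected features can then be passed to any classical model. With unpaired domains (as in Office-31) a different estimator is needed, which this sketch does not cover.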
| T1 experiment | Paper script | Data plug-in | Classifiers |
|---|---|---|---|
| Ridge theorem | ridge_theory.py, ridge_theorem_check.py | Synthetic Gaussian, $d{=}50$, $r{=}5$ | Ridge only |
| Oracle-W images | run_benchmarks.py | MNIST / Fashion-MNIST subsets | SVM, k-NN, logistic |
| DCT drift | mnist_drift.py | Pixels + estimated $W$ from cross-domain SVD | SVM, k-NN (hard/soft), logistic |
| Soft k-NN fix | soft_knn.py | Same drift features | k-NN with CV on $\alpha,\beta$ |
| Baselines | baselines.py | Oracle + drift | PMH vs CORAL, LMNN, IRM |
| SVHN | svhn_subspace.py | Real SVHN digits | SVM |
| Tabular ridge | ridge_tabular.py | fetch_california_housing + injected $W$ | Ridge |
| EB-GP | eb_gp.py | Synthetic GP inputs | GP regression |
| Office-31 | office31_pmh.py | download_office31.py → ResNet-18 per domain | SVM, k-NN, logistic |
Full T1 battery: see main.pdf §T1 (seven sub-experiments).
Run with matching-pmh (your pipeline)
You only need frozen [N, d] features and labels from both sites, where the feature dimensions and labels carry the same meaning on each site.
pip install "matching-pmh[sklearn,vision]"
Office-31 (paper §9 — main real DA benchmark)
python scripts/download_office31.py --root YOUR_OFFICE31_ROOT
python scripts/demos/office31_sklearn.py --office31-root YOUR_OFFICE31_ROOT --source amazon --target dslr
Preset t1_office31_sklearn: rank 32, target pool 200, test 250 (matches paper protocol).
Batch table (accuracy + TDI):
python scripts/demos/benchmark_sklearn_table.py --office31-root YOUR_OFFICE31_ROOT --report results/sklearn_benchmark
Any classical head on your features
from pmh import compare_arms_sklearn
result = compare_arms_sklearn(
    x_source, y_source, x_target, y_target,
    preset="t1_office31_sklearn",
    include_coral=True,
)
The default classifier is logistic (one call). The paper table reports SVM, k-NN, and logistic separately; for parity, pass a custom classifier_factory to run_sklearn_benchmark.
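A sketch of what per-classifier factories might look like follows. It assumes that classifier_factory is a zero-argument callable returning a fresh, unfitted scikit-learn estimator; that contract is not confirmed here, so run_sklearn_benchmark is deliberately not called and the factories are exercised on synthetic data instead. The hyperparameters are placeholders, not the paper's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical factories: each call builds a new, unfitted estimator so that
# every arm (b0, isotropic, matched, wrong_w) trains from scratch.
CLASSIFIER_FACTORIES = {
    "svm": lambda: make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "knn": lambda: make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "logistic": lambda: make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}

# Stand-in features; replace with your own frozen [N, d] arrays.
X, y = make_classification(n_samples=300, n_features=16, n_informative=8,
                           random_state=0)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

scores = {}
for name, factory in CLASSIFIER_FACTORIES.items():
    clf = factory()
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
print(scores)
```

Using factories rather than shared estimator instances avoids accidentally reusing fitted state across arms.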
Protocol check (not a headline result)
The built-in arrays (load_g2_demo_arrays) exercise the D1 and falsification arms without downloading data. They use the same matched / wrong_w / isotropic logic but are not a substitute for MNIST drift or Office-31.
Office-31 numbers (paper, 3 seeds)
| Classifier | B0 | PMH | CORAL | LMNN |
|---|---|---|---|---|
| SVM | 21.6% | 23.3% | 25.2% | 20.0% |
| k-NN | 20.4% | 15.5% | 24.4% | 18.0% |
| Logistic | 22.4% | 22.4% | 26.8% | 18.8% |
Read: CORAL wins on ResNet embeddings; PMH is still part of the evidence map (toy → drift → real DA). Deep PMH (Types 3–7) is where feature-level training continues.
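To make the reading concrete, the gains over B0 can be computed directly from the table. The snippet only re-uses the accuracies listed above; the variable names are illustrative.

```python
# Accuracies (%) copied from the Office-31 table above.
acc = {
    "SVM":      {"B0": 21.6, "PMH": 23.3, "CORAL": 25.2, "LMNN": 20.0},
    "k-NN":     {"B0": 20.4, "PMH": 15.5, "CORAL": 24.4, "LMNN": 18.0},
    "Logistic": {"B0": 22.4, "PMH": 22.4, "CORAL": 26.8, "LMNN": 18.8},
}

# Percentage-point change of each method relative to the B0 baseline.
deltas = {
    clf: {m: round(v - row["B0"], 1) for m, v in row.items() if m != "B0"}
    for clf, row in acc.items()
}

for clf, row in deltas.items():
    print(clf, row)
```

PMH gains 1.7 pp on SVM, ties on logistic, and loses 4.9 pp on k-NN, while CORAL gains between 3.6 and 4.4 pp on every classifier.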
Limitations (from FINAL.md)
- Linear projection fails on severe non-linear shift (e.g. raw USPS↔MNIST at chance).
- Hard projection hurts k-NN when $W$ is noisy; use the soft Mahalanobis metric (paper soft_knn.py).
- Office-31: CORAL > PMH on 512-d ResNet features.
- IRM can win when two training environments are available.
- Rotation nuisance in pixel space can overlap signal (documented failure mode).
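The hard-versus-soft k-NN limitation can be sketched as follows. The soft metric here is an assumed form, $d^2(x, y) = \lVert P_{W^\perp}(x - y)\rVert^2 + s\,\lVert P_W (x - y)\rVert^2$ with a single shrink factor $s \in [0, 1]$; it is not necessarily the $\alpha,\beta$ parameterization used in soft_knn.py. The function names and synthetic data are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def soften_along(x, W, shrink):
    """Map x -> P_perp x + sqrt(shrink) * P_W x.

    Euclidean distance on the mapped features equals the assumed soft metric:
    shrink=0 is the hard projection, shrink=1 leaves the features unchanged.
    """
    return x - (1.0 - np.sqrt(shrink)) * (x @ W) @ W.T

def make_site(n, W, sigma_nuis, rng):
    """Two classes separated along coordinate 0, plus nuisance along W."""
    d, r = W.shape
    y = rng.integers(0, 2, size=n)
    x = rng.standard_normal((n, d))
    x[:, 0] += 3.0 * (2 * y - 1)
    x = x - (x @ W) @ W.T                       # keep the signal outside span(W)
    x += sigma_nuis * rng.standard_normal((n, r)) @ W.T
    return x, y

rng = np.random.default_rng(0)
d, r = 10, 2
W, _ = np.linalg.qr(rng.standard_normal((d, r)))
# A noisy estimate of W, standing in for a data-driven W_hat.
W_hat, _ = np.linalg.qr(W + 0.3 * rng.standard_normal((d, r)))

x_train, y_train = make_site(400, W, 0.5, rng)
x_test, y_test = make_site(400, W, 5.0, rng)    # larger nuisance at deploy time

accuracies = {}
for shrink in (0.0, 0.1, 1.0):
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(soften_along(x_train, W_hat, shrink), y_train)
    accuracies[shrink] = knn.score(soften_along(x_test, W_hat, shrink), y_test)
print(accuracies)
```

In practice the shrink factor would be chosen by cross-validation, as the experiment table notes for the soft k-NN fix.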
Library ↔ paper map
| Paper | Library |
|---|---|
| estimate_cross_domain_subspace | PMHMatcher / estimate_sigma_task_numpy, nuisance="subspace" |
| project_perp + classifier | compare_arms_sklearn arm matched |
| random_orthonormal wrong $W$ | arm wrong_w |
| D4-iso / domain Gram control | arm isotropic |
| coral_align | arm coral |
| Office-31 feature extract | pmh.datasets.office31.extract_office31_features |