13 paper tasks (T1 → T7)

Tasks are listed in paper order. Your pipeline does not need to match a paper ID — pick the row whose deploy change sounds like yours, open the notebook, Run All on the demo, then edit §8 with your data.

Full examples and estimation detail: README — Find your deployment story.

Matching principle (main.pdf): estimate $\Sigma_{\text{task}}$ → matched PMH on h → Step 5 (matched vs wrong vs isotropic on deploy holdout).
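To make the matching principle concrete, here is a minimal NumPy sketch of how the three covariances compared in Step 5 (matched, wrong, isotropic) could be built from paired feature deltas. It is an illustration under stated assumptions, not the package API or the paper's estimator: the function names `estimate_sigma`, `isotropic_like`, and `energy_along`, the toy data, and the choice of a "wrong" covariance as an axis-shifted copy of the matched one are all hypothetical. The training step itself (matched PMH on h) is not shown.

```python
import numpy as np

def estimate_sigma(h_source, h_deploy):
    # Sample covariance of paired feature deltas (deploy minus source).
    # Hypothetical stand-in for the Sigma_task estimate.
    deltas = h_deploy - h_source
    deltas = deltas - deltas.mean(axis=0, keepdims=True)
    return deltas.T @ deltas / (deltas.shape[0] - 1)

def isotropic_like(sigma):
    # Isotropic covariance with the same trace, so total variance is equal.
    d = sigma.shape[0]
    return np.eye(d) * (np.trace(sigma) / d)

def energy_along(sigma, direction):
    # Variance that `sigma` assigns to a unit-normalized direction.
    v = direction / np.linalg.norm(direction)
    return float(v @ sigma @ v)

rng = np.random.default_rng(0)
n, d = 2000, 8

# Toy features h: the deploy shift lives mostly along feature axis 0.
scales = np.array([3.0] + [0.1] * (d - 1))
h_source = rng.normal(size=(n, d))
h_deploy = h_source + rng.normal(size=(n, d)) * scales

sigma_task = estimate_sigma(h_source, h_deploy)   # matched
sigma_iso = isotropic_like(sigma_task)            # isotropic baseline
# "Wrong" covariance: same spectrum, but every axis shifted by one position.
sigma_wrong = np.roll(sigma_task, shift=(1, 1), axis=(0, 1))

# Deploy holdout: find the dominant shift direction on unseen deltas.
holdout_deltas = rng.normal(size=(500, d)) * scales
sigma_holdout = estimate_sigma(np.zeros((500, d)), holdout_deltas)
_, eigvecs = np.linalg.eigh(sigma_holdout)        # eigenvalues ascending
top_direction = eigvecs[:, -1]

energy = {
    "matched": energy_along(sigma_task, top_direction),
    "isotropic": energy_along(sigma_iso, top_direction),
    "wrong": energy_along(sigma_wrong, top_direction),
}
print(energy)
```

All three covariances carry the same total variance, so any difference comes only from where that variance points. In this toy setup the matched covariance puts the most variance along the holdout shift direction, the isotropic one spreads it evenly, and the wrong one places it on a different axis.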

| # | Task | Page | Notebook |
|---|------|------|----------|
| 1 | T1 Classical ML + matched projection (ridge, SVM, k-NN, logistic) | t01-classical.md | t01-classical.ipynb |
| 2 | T2A ViT / image classifier | t02a-vit-isotropic.md | t02a-vit-isotropic.ipynb |
| 3 | T2B Medical imaging | t02b-chexpert-isotropic.md | t02b-chexpert-isotropic.ipynb |
| 4 | T3A Pose / keypoints | t03a-pose-gradient.md | t03a-pose-gradient.ipynb |
| 5 | T3B Depth estimation | t03b-depth-augmentation.md | t03b-depth-augmentation.ipynb |
| 6 | T4A Vision domain shift (single-layer / ResNet) | t04a-vision-domain.md | t04a-vision-domain.ipynb |
| 7 | T4B Vision domain shift (multilayer FPN / U-Net) | t04b-multilayer-vision.md | t04b-multilayer-vision.ipynb |
| 8 | T5A Molecules / graphs (QM9-style) | t05a-qm9-molecule.md | t05a-qm9-molecule.ipynb |
| 9 | T5B Code models | t05b-code-tokens.md | t05b-code-tokens.ipynb |
| 10 | T6A Speech / ASR | t06a-speech-whisper.md | t06a-speech-whisper.ipynb |
| 11 | T6B Time-series / HAR | t06b-temporal-har.md | t06b-temporal-har.ipynb |
| 12 | T7A LLM | t07a-llm-style.md | t07a-llm-style.ipynb |
| 13 | T7B Adversarial / PGD perturbations | t07b-adversarial-pgd.md | t07b-adversarial-pgd.ipynb |

Which task fits your deploy change?

| Task | What changes at deploy | Examples | What we estimate | `nuisance=` |
|------|------------------------|----------|------------------|-------------|
| T1 | Frozen embeddings shift between sites | Office-31; two hospitals’ features; lab A→B tabular | Source−target subspace on features | `subspace` |
| T2A | Generic input noise / corruption | ImageNet-C; camera noise; blur/JPEG | Isotropic noise level σ | `isotropic` |
| T2B | Scanner / hospital appearance on X-ray | CheXpert site shift; DICOM pipeline change | Isotropic σ (medical deploy stress) | `isotropic` |
| T3A | Camera/lighting; same keypoints | Studio→in-the-wild pose; broadcast→fan photos | Augmentation feature deltas | `augmentation` |
| T3B | Photometric shift; depth meaning fixed | Lighting on depth maps; synthetic→real RGB-D | Augmentation deltas | `augmentation` |
| T4A | New camera, site, or visual domain | Photo→sketch; warehouse A→B; day→night cls | Train vs deploy feature Gram | `domain_shift` |
| T4B | Sim→real texture + layout (segmentation) | GTA5→Cityscapes; synthetic IR→real seg | Domain Gram (multilayer in paper) | `domain_shift` |
| T5A | Atom positions move; property label fixed | QM9 conformers; docked poses | Nuisance coordinates (positions) | `compositional` |
| T5B | Token groups change; task label fixed | Renames; comment strip; obfuscation | Nuisance token/block indices | `compositional` |
| T6A | Mic, room, codec; same words | Libri conditions; new microphone | Temporal / content-residual (see doc) | `temporal` |
| T6B | Sensor drift over time | HAR placement; IMU aging | Temporal residual on sequences | `temporal` |
| T7A | Tone/format; facts unchanged | Bulleted vs prose; formal vs casual bot | Style pairs (same content) | `style` |
| T7B | Adversarial perturbations at deploy | PGD robustness; spoof patches | Subspace from attack deltas | `style` (PGD path) |

T1 bundles seven classical subtasks in one notebook. T2–T7 map to blocks in main.pdf. Clone any row for a similar deploy change — not only the benchmark named in the paper.
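When scripting over several tasks, the `nuisance=` column of the table can be kept as a plain lookup. The sketch below copies the values from the table; the dictionary name `NUISANCE_BY_TASK` and the helper `tasks_for` are hypothetical conveniences, not part of the package API, and how the value is passed on to the notebooks is not shown here.

```python
# Task ID -> value of the `nuisance=` argument, copied from the table above.
# T7B uses `style` via the PGD path.
NUISANCE_BY_TASK = {
    "T1": "subspace",
    "T2A": "isotropic",
    "T2B": "isotropic",
    "T3A": "augmentation",
    "T3B": "augmentation",
    "T4A": "domain_shift",
    "T4B": "domain_shift",
    "T5A": "compositional",
    "T5B": "compositional",
    "T6A": "temporal",
    "T6B": "temporal",
    "T7A": "style",
    "T7B": "style",
}

def tasks_for(nuisance):
    """Return the task IDs that share a given nuisance type, in table order."""
    return [task for task, kind in NUISANCE_BY_TASK.items() if kind == nuisance]

# Example: find the notebooks to clone for a sensor-drift style deploy change.
print(tasks_for("temporal"))
```

Grouping by nuisance type rather than by benchmark name mirrors the advice above: choose the row by the kind of deploy change, then adapt that notebook.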

Regenerate: `python scripts/render_handcrafted_tasks.py`

Quickstart · Will PMH help? · API