API reference

Start with Quickstart and 13 tasks. Import from pmh (flat public API in 2.0).


Tier 0 — from pmh import …

| Symbol | Role |
| --- | --- |
| robust_fit | Mode A bundle (estimate + train) |
| PMHTrainer | Estimate + train |
| PMHMatcher | Mode B adapt |
| PMHConfig | Cap / warmup |
| check_applicability | Scope gate |
| evaluate_baseline_vs_pmh | Mode B eval + falsification |
| evaluate_robust_fit | Mode A eval + falsification |
| explain_task, get_task, list_tasks | Application routing |

Recipe spine: pmh.recipe (plan_recipe, control_modes, default_protocol_config).

Benchmark / Step 5 arms: pmh.benchmark, compare_arms_sklearn.


Generated detail (mkdocstrings)

pmh.matcher.PMHMatcher

Bases: _SklearnMixin, _SklearnBase

Estimate deployment shift geometry and optionally project features.

The nuisance= argument names the shift type (e.g. domain_shift = site A vs. site B, same labels); it does not mean “nuisance” in the everyday sense. See format_shift_types() or docs/WHAT_IS_DEPLOYMENT_SHIFT.md.

sklearn contract
  • fit(X, y=None): standard entry point. For D4, pass X_target in one of three ways: via __init__, as a keyword in fit(..., X_target=...), or through metadata routing with pipe.fit(X, y, pmh__X_target=xt) after calling set_fit_request(X_target=True).
  • transform(X): project onto the complement of the estimated nuisance subspace.
  • get_params / set_params / clone: compatible with sklearn.pipeline.Pipeline and sklearn.model_selection.GridSearchCV.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| nuisance | str | Human-readable name (e.g. "domain_shift" → D4) or "D1" through "D7". | 'domain_shift' |
| rank | int | Subspace rank for D1/D4/D7. When None, resolved from the data as min(32, d//4). | None |
| shrinkage | float | PSD regularization on Sigma_task. | 1e-06 |
| dim | float | For D2 isotropic noise. | None |
| noise_level | float | For D2 isotropic noise. | 0.1 |
| nuisance_indices | list[int] | For D5 compositional blocks. | None |
| seed | int | D1 class-pair sampling seed. | 0 |
| X_target | array | Target-domain data stored at construction (enables Pipeline.fit(X, y) without metadata routing). | None |
| y_target | array | Target-domain data stored at construction (enables Pipeline.fit(X, y) without metadata routing). | None |
| has_source_labels | bool | Flag for nuisance="auto". | True |
| has_target_labels | bool | Flag for nuisance="auto". | False |
| has_target_domain | bool | Flag for nuisance="auto". | True |

Defaults for noise_level and has_target_labels follow the __init__ signature shown further down this page.
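To make the rank and shrinkage entries concrete, here is a minimal NumPy sketch. The rank rule min(32, d//4) comes from the table above. The exact form of the regularizer is not documented here, so `shrink_psd` (a hypothetical helper, not part of pmh) assumes the common choice of adding shrinkage times the identity.

```python
import numpy as np

def default_rank(d: int) -> int:
    """Documented fallback when rank=None: min(32, d // 4)."""
    return min(32, d // 4)

def shrink_psd(sigma: np.ndarray, shrinkage: float = 1e-6) -> np.ndarray:
    """ASSUMED regularizer form: symmetrize, then add shrinkage * I."""
    sym = 0.5 * (sigma + sigma.T)
    return sym + shrinkage * np.eye(sym.shape[0])

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 20))   # 5 samples, 20 features
sigma = a.T @ a / a.shape[0]       # PSD, but rank-deficient (rank <= 5)
sigma_reg = shrink_psd(sigma)      # every eigenvalue is now strictly positive

print(default_rank(20), default_rank(512))
print(np.linalg.eigvalsh(sigma_reg).min() > 0)
```

Note that d//4 is zero for d < 4, so very low-dimensional inputs would likely need an explicit rank.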

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| artifact_ | SigmaTaskEstimate | Fitted nuisance estimate (use with PMHLoss in PyTorch). |
| w_ | ndarray | D1 subspace basis [d, r] when applicable. |
| n_features_in_ | int | Number of features seen during fit. |

Examples:

>>> import numpy as np
>>> from pmh import PMHMatcher
>>> rng = np.random.default_rng(0)
>>> xs = rng.standard_normal((100, 20), dtype=np.float32)
>>> xt = xs + 0.3
>>> m = PMHMatcher(nuisance="domain_shift", rank=8).fit(xs, X_target=xt)
>>> m.transform(xs).shape
(100, 20)

__init__(nuisance='domain_shift', *, rank=None, shrinkage=1e-06, dim=None, noise_level=0.1, nuisance_indices=None, seed=0, n_pairs_per_class=100, X_target=None, y_target=None, has_source_labels=True, has_target_labels=False, has_target_domain=True, has_augmentation_modes=False, has_style_pairs=False)

fit(X, y=None, *args, X_target=None, y_target=None, aug_deltas=None)

Estimate Sigma_task from source (and optional target) feature matrices.

Standard sklearn: fit(X, y=None) with X_target / y_target in __init__, as keyword arguments, or via metadata routing.

Legacy positional: fit(X, y, X_target) or fit(X, y, X_target, y_target); fit(X, X_target) when the second argument is a 2D feature matrix (D4).
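The call forms above can be read as a small dispatch rule. The sketch below is a hypothetical helper, not pmh's actual implementation. It only illustrates how the documented standard and legacy forms could map onto named arguments, assuming a 2D second argument always means a feature matrix.

```python
import numpy as np

def resolve_fit_args(X, y=None, *args, X_target=None, y_target=None):
    """Hypothetical helper: map the documented call forms onto named arguments."""
    if len(args) > 2:
        raise TypeError("at most two legacy positional arguments are expected")
    if len(args) >= 1:            # fit(X, y, X_target[, y_target])
        X_target = args[0]
    if len(args) == 2:
        y_target = args[1]
    # fit(X, X_target): a 2D second argument is treated as features, not labels
    if not args and X_target is None and y is not None and np.ndim(y) == 2:
        X_target, y = y, None
    return {"X": X, "y": y, "X_target": X_target, "y_target": y_target}

xs, ys, xt = np.zeros((4, 3)), np.zeros(4), np.ones((5, 3))
legacy_d4 = resolve_fit_args(xs, xt)              # fit(X, X_target)
legacy_xy = resolve_fit_args(xs, ys, xt)          # fit(X, y, X_target)
standard = resolve_fit_args(xs, ys, X_target=xt)  # keyword form
```

Under this rule a 2D label array (multi-output targets) would be mistaken for X_target, which is one reason to prefer the keyword form.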

transform(X)

Project onto complement of estimated nuisance subspace (optional preprocessing).
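As a rough illustration of what projecting onto the complement of a subspace means, the following NumPy sketch removes the component of each row that lies in the span of an orthonormal basis W of shape [d, r]. The basis here is random and stands in for a fitted one; `project_out` is a hypothetical name, and how pmh builds or applies its basis internally is not shown on this page.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 20, 4
# Stand-in for an estimated nuisance basis: orthonormal columns, shape [d, r].
w, _ = np.linalg.qr(rng.standard_normal((d, r)))

def project_out(x: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Apply P = I - W W^T to each row without forming the d x d matrix."""
    return x - (x @ basis) @ basis.T

x = rng.standard_normal((100, d))
x_proj = project_out(x, w)

# Projected rows have no component along the basis, and the shape is unchanged.
print(x_proj.shape, np.allclose(x_proj @ w, 0.0))
```

Because the projection is idempotent, applying it twice gives the same result as applying it once.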

pmh.trainer.PMHTrainer

Estimate Sigma_task once, then train with matched PMH on hook h.

Supports D1–D7 on PyTorch batches (see the per-method estimate() kwargs). For hybrid nuisances, pass artifacts= or call add_artifact() and use multi_loss=True.

Set train_mode="feature_diff" together with forward_features and layer_names for paper T4B-style per-layer Gram + feature-diff PMH (see estimate_multilayer()).

__init__(model, *, hook='backbone', head=None, nuisance='domain_shift', rank=None, shrinkage=1e-06, pmh_config=None, artifact_path=None, artifacts=None, device=None, pool_spatial=True, nuisance_indices=None, noise_level=0.1, data_context=None, has_source_labels=True, has_target_labels=False, has_target_domain=True, has_augmentation_modes=False, has_style_pairs=False, train_mode='jacobian', forward_features=None, layer_names=None, head_layer=None, noise_std=0.05, noise_rank=64)

fit(train_loader, *, source_batches=None, target_batches=None, sequences_batches=None, val_loader=None, epochs=10, optimizer=None, max_batches_estimate=50, max_steps_per_epoch=None, reestimate=False, estimate_kwargs=None)

Phase A (if needed) + Phase B training loop.

estimate(source_batches=None, target_batches=None, *, max_batches=50, save=True, aug_deltas=None, augmentations=None, sequences_batches=None, style_jsonl=None, hf_model=None, hf_tokenizer=None, d6_source='content')

Phase A: estimate Sigma_task (D1–D7).

Extra kwargs by method

  • D1: labeled (x, y) in source and target loaders
  • D2: source_batches only (dim from h)
  • D3: aug_deltas [K,d] or [K,N,d], or augmentations + source_batches
  • D4: source_batches + target_batches (class-aligned Gram when (x,y) batches)
  • D5: set nuisance_indices= on trainer; source_batches only
  • D6: sequences_batches (encoder returns [B,T,d]); d6_source='content' (default, paper 6A) or 'temporal' for consecutive diffs
  • D7: style_jsonl + hf_model / hf_tokenizer (Transformers)
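For the D6 'temporal' option, "consecutive diffs" on an encoder output of shape [B,T,d] can be pictured as follows. This is a NumPy sketch on random data, not the library's code. The final pooling into a d x d second-moment matrix is an assumption about how such differences might feed a Sigma_task estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
B, T, d = 8, 16, 32
h = rng.standard_normal((B, T, d))     # stand-in for encoder output [B, T, d]

diffs = h[:, 1:, :] - h[:, :-1, :]     # consecutive differences, [B, T-1, d]
flat = diffs.reshape(-1, d)            # pool over batch and time, [B*(T-1), d]
gram = flat.T @ flat / flat.shape[0]   # ASSUMED second-moment estimate, [d, d]

print(diffs.shape, gram.shape)
```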

Single entry: check applicability, resolve hook, estimate + train.

Pass nuisance=None (default) to auto-pick from your data flags — you do not need to know D1–D7. Set has_target_labels=True when deploy batches are labeled, has_augmentation_modes=True when you enumerate deploy transforms, etc.

Human-readable §7 recipe (docs / format_five_step_guide).