Calibrate Package (swimrs.calibrate)

PEST++ IES integration for parameter estimation and inverse modeling.

PestBuilder

Builder for PEST++ IES calibration control files.

Constructs PEST++ control files, observation files, and parameter templates for calibrating SWIM-RS model parameters against ET fraction and SWE observations.

The builder handles:

- Parameter setup with prior information from soil and vegetation data
- Observation file generation from remote sensing ET and SNODAS SWE
- Localization matrix construction for ensemble methods
- Forward run script generation

Attributes:

Name Type Description
config

ProjectConfig instance with calibration settings.

pest_run_dir

Root directory for PEST++ files.

pest_dir

Directory containing the .pst control file.

master_dir

Directory for PEST++ master process.

pst_file

Path to the generated .pst control file.

Example

from swimrs.swim import ProjectConfig
from swimrs.calibrate import PestBuilder

config = ProjectConfig()
config.read_config("project.toml", calibrate=True)

with PestBuilder(config) as builder:
    builder.spinup()
    builder.build_pest(target_etf='ssebop')
    builder.build_localizer()
    builder.write_control_settings(noptmax=4, reals=250)

config = config instance-attribute

project_ws = config.project_ws instance-attribute

pest_run_dir = config.pest_run_dir instance-attribute

_container = None instance-attribute

_container_path = None instance-attribute

_owns_container = False instance-attribute

observation_index = {} instance-attribute

masks = ['inv_irr', 'irr', 'no_mask'] instance-attribute

pest = None instance-attribute

etf_std = None instance-attribute

etf_capture_indexes = [] instance-attribute

params_file = os.path.join(self.pest_run_dir, 'params.csv') instance-attribute

prior_contstraint = prior_constraint instance-attribute

conflicted_obs = conflicted_obs instance-attribute

pest_dir = os.path.join(config.pest_run_dir, 'pest') instance-attribute

master_dir = os.path.join(config.pest_run_dir, 'master') instance-attribute

workers_dir = os.path.join(config.pest_run_dir, 'workers') instance-attribute

obs_dir = os.path.join(config.pest_run_dir, 'obs') instance-attribute

pst_file = os.path.join(self.pest_dir, f'{self.config.project_name}.pst') instance-attribute

obs_idx_file = os.path.join(self.pest_dir, f'{self.config.project_name}.idx.csv') instance-attribute

pest_args = self.get_pest_builder_args() instance-attribute

verbose = verbose instance-attribute

python_script = python_script instance-attribute

overwrite_build = False instance-attribute

__init__(config, container, use_existing: bool = False, python_script: str | None = None, prior_constraint: dict | None = None, conflicted_obs: str | None = None, verbose: bool = True) -> None

Initialize PestBuilder for PEST++ calibration.

Parameters:

Name Type Description Default
config

ProjectConfig instance

required
container

SwimContainer instance or path to .swim directory. Required - all data is sourced from the container.

required
use_existing bool

If True, use existing PEST++ setup

False
python_script str | None

Path to custom forward run script

None
prior_constraint dict | None

Prior constraint settings

None
conflicted_obs str | None

Path to conflicted observations file

None
verbose bool

If False, suppress pyemu/PstFrom output. Default True.

True

_init_container(container) -> None

Initialize container from instance or path.

_load_data_from_container() -> None

Load all data from container (replaces SamplePlots).

Populates:

- self.plot_order: field UIDs
- self.plot_properties: field properties dict
- self.irr: irrigation data dict
- self.ke_max: bare soil evaporation coefficient dict
- self.kc_max: max crop coefficient dict
- self.date_range: (start_date, end_date) tuple

_get_etf_data(fid: str, model: str = 'ssebop') -> pd.DataFrame

Get ETf data for a field from container.

Returns a DataFrame with columns like '{model}_etf_{mask}' for each mask.

If model='ensemble', computes the mean across all available ETf models.
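
The ensemble option can be pictured as a column-wise alignment of the per-model series followed by a row-wise mean. The snippet below is a minimal sketch of that idea, not the method's actual implementation: the model names, values, and the choice to skip missing values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical per-model ETf series for one field, indexed by date.
dates = pd.date_range("2020-06-01", periods=3, freq="D")
etf_by_model = {
    "ssebop": pd.Series([0.50, 0.60, None], index=dates),
    "ptjpl": pd.Series([0.70, 0.80, 0.90], index=dates),
}

# Align the models as columns, then average across models for each date.
# skipna=True (an assumption here) keeps dates where only some models report.
stacked = pd.DataFrame(etf_by_model)
ensemble_mean = stacked.mean(axis=1, skipna=True)

print(ensemble_mean)
```

Whether the real method skips missing values or requires all models to be present on a date is not stated in this reference.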

_discover_etf_models() -> list[str]

Discover available ETf models in the container.

_get_swe_data(fid: str) -> pd.DataFrame

Get SWE data for a field from container.

close() -> None

Close container if we own it.

__enter__() -> PestBuilder

__exit__(exc_type, exc_val, exc_tb) -> bool
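
The "close container if we own it" behavior follows a common ownership pattern: an object only closes resources it opened itself. The sketch below illustrates that pattern with hypothetical stand-in classes (`FakeContainer`, `Owner`); it does not reproduce PestBuilder's internals.

```python
class FakeContainer:
    """Hypothetical stand-in for a container object with a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Owner:
    """Closes the container on exit only if it opened the container itself."""

    def __init__(self, container):
        if isinstance(container, str):
            # A path was given: open it here, so this object owns it.
            self._container = FakeContainer()
            self._owns_container = True
        else:
            # An instance was given: the caller remains responsible for it.
            self._container = container
            self._owns_container = False

    def close(self):
        if self._owns_container and self._container is not None:
            self._container.close()
            self._container = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()
        return False  # do not suppress exceptions


# A caller-supplied container is left open after the with block.
shared = FakeContainer()
with Owner(shared):
    pass

# A container opened from a path is closed on exit.
with Owner("project.swim") as owner:
    opened = owner._container
```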

get_pest_builder_args() -> dict

build_pest(target_etf: str = 'openet', members: list[str] | None = None) -> None

Build the PEST++ control file and supporting files.

Creates the .pst control file, observation files, parameter templates, and forward run script in the pest directory.

Uses the process package with a portable swim_input.h5 file, so workers are fully self-contained and can run without shared storage.

Parameters:

Name Type Description Default
target_etf str

ET model to use as calibration target ('ssebop', 'ptjpl', etc.).

'openet'
members list[str] | None

Optional list of ensemble member models for uncertainty weighting. If provided, observation weights are computed from inter-model spread.

None

Raises:

Type Description
NotImplementedError

If use_existing=True was set in constructor.
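
One plausible reading of "observation weights are computed from inter-model spread" is an inverse-standard-deviation weighting across the member models. The sketch below assumes that reading; the member names, values, and the `min_std` floor are hypothetical, and the actual weighting formula used by build_pest is not documented here.

```python
import pandas as pd

# Hypothetical ETf values from three ensemble members on three dates.
dates = pd.date_range("2020-06-01", periods=3, freq="D")
members = pd.DataFrame(
    {
        "ssebop": [0.50, 0.60, 0.70],
        "ptjpl": [0.70, 0.60, 0.90],
        "sims": [0.60, 0.60, 0.80],
    },
    index=dates,
)

# Spread across members for each date (sample standard deviation).
std = members.std(axis=1)

# Floor the spread so perfect agreement does not give an infinite weight.
# The floor value is an assumption made for this sketch.
min_std = 0.05
std = std.clip(lower=min_std)

# Larger spread means less confidence, hence a smaller weight.
weights = 1.0 / std

print(weights)
```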

print_build_diagnostics(max_groups: int = 25) -> pd.DataFrame

Print a compact diagnostics table after building the PEST++ project.

This is meant to make it obvious whether calibration is actually using the intended observations/weights (e.g., ETf weights not all zero).

Returns:

Type Description
pd.DataFrame

Per-observation-group summary table (also printed).

_build_obs_diagnostics_table(obs: pd.DataFrame) -> pd.DataFrame staticmethod

Build per-observation-group diagnostics for a PEST++ observation table.
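
A per-group diagnostics table of this kind can be approximated with a pandas group-by over the standard PEST observation columns (`obsnme`, `obgnme`, `obsval`, `weight`). The sketch below is illustrative only: the observation names, values, and summary column names are assumptions, and the real method may report different statistics.

```python
import pandas as pd

# Hypothetical observation table using standard PEST column names.
obs = pd.DataFrame(
    {
        "obsnme": ["etf_1", "etf_2", "etf_3", "swe_1", "swe_2"],
        "obgnme": ["etf", "etf", "etf", "swe", "swe"],
        "obsval": [0.55, 0.60, 0.72, 12.0, 8.5],
        "weight": [2.0, 0.0, 4.0, 0.0, 0.0],
    }
)

# Summarize each observation group: count, nonzero-weight count, mean weight.
summary = obs.groupby("obgnme").agg(
    n_obs=("weight", "size"),
    n_nonzero=("weight", lambda w: int((w > 0).sum())),
    weight_mean=("weight", "mean"),
)

print(summary)
```

A group whose `n_nonzero` is 0 contributes nothing to the objective function, which is the situation this kind of table is meant to expose.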

_write_forward_run_script() -> None

Generate custom_forward_run.py with portable relative paths.

Uses the process package with swim_input.h5 for fully portable workers. All paths are relative to the worker directory - no shared storage needed.

build_localizer() -> None

Build the localization matrix for ensemble Kalman methods.

Creates a sparse matrix that restricts parameter-observation correlations to physically meaningful relationships. ET observations only update ET-related parameters, SWE observations only update snow parameters.

Writes loc.mat and localizer_summary.json to the pest directory.
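
Conceptually, the localizer is a matrix with observations on the rows and parameters on the columns, holding 1 where an observation is allowed to update a parameter and 0 elsewhere. The sketch below builds a small dense version with pandas to show the idea; the parameter and observation names and their ET/snow labels are hypothetical, and the real method writes a PEST-format matrix file rather than a DataFrame.

```python
import pandas as pd

# Hypothetical names, each tagged with the process it belongs to.
par_kinds = {"aw_1": "et", "ndvi_k_1": "et", "swe_alpha_1": "snow"}
obs_kinds = {"etf_obs_1": "et", "swe_obs_1": "snow"}

# Start with all zeros: no observation informs any parameter.
loc = pd.DataFrame(0.0, index=list(obs_kinds), columns=list(par_kinds))

# Switch on only the pairs that share a process.
for obs_name, obs_kind in obs_kinds.items():
    for par_name, par_kind in par_kinds.items():
        if obs_kind == par_kind:
            loc.loc[obs_name, par_name] = 1.0

print(loc)
```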

write_control_settings(noptmax: int = -2, reals: int = 250) -> None

Write PEST++ IES control settings to the .pst file.

Parameters:

Name Type Description Default
noptmax int

Maximum optimization iterations. Use -2 for parameter estimation mode, positive values for optimization.

-2
reals int

Number of realizations in the ensemble.

250

initial_parameter_dict() -> OrderedDict

dry_run(exe: str = 'pestpp-ies') -> None

spinup(overwrite: bool = False) -> None

Run model spinup to initialize state variables.

Runs the model with initial parameters and saves the final state to the spinup JSON file for warm-starting calibration runs.

This method also creates the initial swim_input.h5 file (without spinup state). After spinup completes, _build_swim_input() rebuilds the h5 with the spinup state baked in.

Parameters:

Name Type Description Default
overwrite bool

If True, regenerate spinup even if file exists.

False
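
The `overwrite` flag suggests a cache-style check: reuse the spinup file when it exists, regenerate it otherwise. The sketch below shows that pattern with a JSON file; the function name `run_spinup`, the state keys, and the file name are hypothetical, and the model run is replaced by a fixed dictionary.

```python
import json
import tempfile
from pathlib import Path


def run_spinup(path, overwrite=False):
    """Return (state, ran), where ran says whether spinup was regenerated."""
    path = Path(path)
    if path.exists() and not overwrite:
        # Reuse the saved state for a warm start.
        return json.loads(path.read_text()), False
    # Stand-in for running the model and collecting its final state.
    state = {"field_1": {"depl_root": 12.5, "swe": 0.0}}
    path.write_text(json.dumps(state))
    return state, True


with tempfile.TemporaryDirectory() as tmp:
    spinup_file = Path(tmp) / "spinup.json"
    _, ran_first = run_spinup(spinup_file)
    _, ran_second = run_spinup(spinup_file)
    _, ran_forced = run_spinup(spinup_file, overwrite=True)
```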

_build_swim_input() -> str

Build portable swim_input.h5 file for workers with spinup state.

Creates a self-contained HDF5 file with all input data needed for model execution, including spinup state if available. This file is copied to each PEST++ worker for isolated execution.

If spinup() was called first, this rebuilds the h5 with spinup state baked in. The rebuild is necessary because spinup creates the initial h5 before any spinup state exists.

Returns:

Name Type Description
str str

Path to the created swim_input.h5 file.

_write_params() -> None

_write_swe_obs(count: int) -> None

_write_etf_obs(target: str, members: list[str] | None) -> int

_finalize_obs() -> None

Write standard deviations (std) to the observation dataframes.

Ideally the ETf and SWE writers would write std to the observation dataframes themselves, but those values are lost in the pest build call, so they are written here instead.

add_regularization() -> None

_drop_conflicts(i: int, fid: str) -> None

PestResults

Parse, summarize, and clean up PEST++ IES calibration results.

Provides utilities for checking calibration success, extracting summary metrics, and cleaning up intermediate files after PEST++ runs.

Attributes:

Name Type Description
pest_dir

Path to the pest/ directory containing .pst and master/.

master_dir

Path to the master/ directory with output files.

project_name

Name of the project (matches .pst filename stem).

Example

from swimrs.calibrate import PestResults

results = PestResults("./pest_run/pest", "my_project")
success, issues = results.is_successful()

if success:
    summary = results.get_summary()
    print(f"Phi reduction: {summary['phi_reduction_pct']:.1f}%")
    results.cleanup(archive_dir="./archive")
else:
    for issue in issues:
        print(f"Issue: {issue}")

ARCHIVE_FILES = ['{project}.pst', '{project}.rec', '{project}.phi.meas.csv', '{project}.phi.composite.csv', 'params.csv', 'localizer_summary.json', 'loc.mat'] class-attribute instance-attribute

DEBUG_FILES = ['panther_master.rec', '{project}.*.obs.csv', '{project}.*.par.csv', '{project}.*.pdc.csv', '{project}.*.pcs.csv'] class-attribute instance-attribute

CLEANUP_PATTERNS = ['*.jcb', '*.jco', '*.rei', '*.rst'] class-attribute instance-attribute
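
The file lists above mix a `{project}` placeholder with glob wildcards. A minimal sketch of how such templates could be resolved against a directory follows; the helper name `resolve` and the demo file names are hypothetical, and the lists are shortened copies of the class attributes.

```python
import tempfile
from pathlib import Path

ARCHIVE_FILES = ["{project}.pst", "{project}.rec", "params.csv"]
DEBUG_FILES = ["{project}.*.par.csv"]


def resolve(directory, templates, project):
    """Fill in the project name, then expand each pattern with glob."""
    matches = []
    for template in templates:
        pattern = template.format(project=project)
        matches.extend(sorted(directory.glob(pattern)))
    return matches


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    for name in ["demo.pst", "demo.rec", "demo.0.par.csv", "demo.1.par.csv", "other.pst"]:
        (directory / name).touch()

    archive_names = [p.name for p in resolve(directory, ARCHIVE_FILES, "demo")]
    debug_names = [p.name for p in resolve(directory, DEBUG_FILES, "demo")]

print(archive_names)
print(debug_names)
```

Templates with no matching file (here `params.csv`) simply contribute nothing, so missing outputs do not raise.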

pest_dir = Path(pest_dir) instance-attribute

master_dir = self.pest_dir / 'master' instance-attribute

project_name = project_name instance-attribute

pest_run_dir = self.pest_dir.parent instance-attribute

workers_dir = self.pest_run_dir / 'workers' instance-attribute

_rec_content = None instance-attribute

_phi_data = None instance-attribute

_noptmax = None instance-attribute

rec_file: Path property

Path to main record file.

pst_file: Path property

Path to control file.

__init__(pest_dir: str, project_name: str)

Initialize results handler.

Parameters:

Name Type Description Default
pest_dir str

Path to pest/ directory (contains master/, pst file, etc.)

required
project_name str

Project name (e.g., '2_Fort_Peck')

required

_read_rec_file() -> str

Read and cache record file content.

_read_phi_data() -> pd.DataFrame | None

Read and cache phi measurement data.

_get_noptmax() -> int | None

Extract noptmax from control file or record.

_get_par_files() -> list[Path]

Get all parameter CSV files sorted by iteration.
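
Sorting by iteration needs a numeric key, since a plain string sort would place iteration 10 before iteration 2. The sketch below assumes file names of the form `{project}.{iteration}.par.csv`, consistent with the `'{project}.*.par.csv'` pattern in DEBUG_FILES; the helper name `iteration_of` is hypothetical.

```python
import re
from pathlib import Path


def iteration_of(path):
    """Extract the iteration number from '<project>.<n>.par.csv', or -1."""
    match = re.search(r"\.(\d+)\.par\.csv$", path.name)
    return int(match.group(1)) if match else -1


files = [Path("demo.10.par.csv"), Path("demo.2.par.csv"), Path("demo.0.par.csv")]
ordered = sorted(files, key=iteration_of)

print([p.name for p in ordered])
```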

is_successful() -> tuple[bool, list[str]]

Check if calibration succeeded.

Returns:

Type Description
tuple[bool, list[str]]

Tuple of (success, issues) where:
  • success: True if all primary criteria pass
  • issues: List of problems found (empty if successful)

get_summary() -> dict

Extract key metrics from calibration results.

Returns:

Type Description
dict

Dictionary with summary metrics.
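
The class example reads a `phi_reduction_pct` key from this dictionary. One way such a metric could be derived is as the relative drop in mean phi between the first and last iteration. The sketch below assumes that definition and a phi table with `iteration` and `mean` columns; neither the formula nor the column names are confirmed by this reference.

```python
import pandas as pd

# Hypothetical phi history: one row per iteration, mean phi across realizations.
phi = pd.DataFrame({"iteration": [0, 1, 2], "mean": [200.0, 120.0, 50.0]})

initial_phi = phi["mean"].iloc[0]
final_phi = phi["mean"].iloc[-1]

# Percentage reduction relative to the starting objective function value.
phi_reduction_pct = 100.0 * (initial_phi - final_phi) / initial_phi

print(f"Phi reduction: {phi_reduction_pct:.1f}%")
```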

get_final_parameters() -> pd.DataFrame | None

Get final calibrated parameter ensemble.

Returns:

Type Description
DataFrame | None

DataFrame with parameter values, or None if not found.

_calculate_dir_size(path: Path) -> float

Calculate directory size in MB.
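
A directory size in MB can be computed by summing file sizes recursively. The sketch below uses 1 MB = 1024 × 1024 bytes, which is an assumption; the actual method may use a different divisor. The function name `dir_size_mb` is hypothetical.

```python
import tempfile
from pathlib import Path


def dir_size_mb(path):
    """Sum the sizes of all files under path, in MB (1024 * 1024 bytes)."""
    total_bytes = sum(f.stat().st_size for f in path.rglob("*") if f.is_file())
    return total_bytes / (1024 * 1024)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.bin").write_bytes(b"\0" * 1024 * 1024)        # 1 MB
    (root / "sub" / "b.bin").write_bytes(b"\0" * 512 * 1024)  # 0.5 MB
    size_mb = dir_size_mb(root)

print(f"{size_mb:.2f} MB")
```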

cleanup(archive_dir: str | None = None, keep_debug: bool = False, dry_run: bool = False) -> dict

Clean up calibration files based on success status.

Parameters:

Name Type Description Default
archive_dir str | None

Directory to archive important files (None = pest_dir/archive).

None
keep_debug bool

Force keeping debug files even on success.

False
dry_run bool

If True, report what would be done without doing it.

False

Returns:

Type Description
dict

Dictionary with cleanup report.

_get_recommendations(issues: list[str]) -> list[str]

Generate debugging recommendations based on issues.

print_summary() -> None

Print a formatted summary to stdout.

run_pst(_dir: str, _cmd: str, pst_file: str, num_workers: int, worker_root: str, master_dir: str | None = None, verbose: bool = True, cleanup: bool = True) -> None

Run PEST++ calibration with parallel workers.

Launches the PEST++ master and worker processes using pyemu's os_utils. Workers execute the forward model in parallel across multiple cores.

Parameters:

Name Type Description Default
_dir str

Directory containing the .pst control file.

required
_cmd str

PEST++ executable command (e.g., 'pestpp-ies').

required
pst_file str

Name of the .pst control file.

required
num_workers int

Number of parallel worker processes.

required
worker_root str

Directory for worker process files.

required
master_dir str | None

Directory for master process output. Defaults to None.

None
verbose bool

Print progress messages. Defaults to True.

True
cleanup bool

Clean up worker directories on completion. Defaults to True.

True

Raises:

Type Description
ValueError

If the pest directory does not exist.

Example

run_pst(
    _dir='/path/to/pest',
    _cmd='pestpp-ies',
    pst_file='project.pst',
    num_workers=4,
    worker_root='/path/to/workers',
    master_dir='/path/to/master'
)