June 29, 2026
A rebuilt execution engine for parallel data fits, name-based measurement lookup, Arbin subschedule/loop support, and clearer model and duplicate errors

A rebuilt execution engine for data fitting

The pipeline’s parallelisation layer has been rebuilt as a dedicated execution engine. DataFit.fit is now DataFit.run on this engine, which drives optimizers (ask/tell), point evaluation, multistart coordination, and progress reporting through a single set of executor protocols (serial, multiprocessing, and — on the backend — Ray over a warm actor pool). The old distributed.py Joblib configuration zoo and its num_workers / parallel schema fields are retired in favour of engine-owned parallelism, and backend datafit jobs now run through a thin datafit.run() host with classified-failure error mapping and per-generation progress streaming.
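As a rough illustration of the ask/tell pattern behind a swappable executor, the sketch below uses only the standard library. Every name in it (Executor, SerialExecutor, ThreadedExecutor, RandomSearch, run) is hypothetical and is not the engine's API; a thread pool stands in for the multiprocessing and Ray executors mentioned above.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Protocol, Sequence


class Executor(Protocol):
    """Hypothetical executor protocol: evaluate a batch of points in order."""

    def map(self, fn: Callable[[float], float], points: Sequence[float]) -> List[float]: ...


class SerialExecutor:
    def map(self, fn, points):
        return [fn(p) for p in points]


class ThreadedExecutor:
    """Stand-in for a parallel executor (threads instead of processes or Ray)."""

    def __init__(self, workers=4):
        self.workers = workers

    def map(self, fn, points):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(fn, points))  # pool.map preserves input order


class RandomSearch:
    """Toy ask/tell optimizer: proposes a batch, then records the returned costs."""

    def __init__(self, low, high, batch, seed=0):
        self.rng = random.Random(seed)
        self.low, self.high, self.batch = low, high, batch
        self.best_x, self.best_cost = None, float("inf")

    def ask(self):
        return [self.rng.uniform(self.low, self.high) for _ in range(self.batch)]

    def tell(self, points, costs):
        for x, c in zip(points, costs):
            if c < self.best_cost:
                self.best_x, self.best_cost = x, c


def run(optimizer, cost, executor, generations, on_progress=None):
    """Drive the optimizer; the executor alone decides how points are evaluated."""
    for generation in range(generations):
        points = optimizer.ask()
        costs = executor.map(cost, points)
        optimizer.tell(points, costs)
        if on_progress is not None:
            on_progress(generation, optimizer.best_cost)  # per-generation progress
    return optimizer.best_x, optimizer.best_cost


def cost(x):
    return (x - 2.0) ** 2


progress = []
serial = run(RandomSearch(-5, 5, batch=8), cost, SerialExecutor(), generations=20,
             on_progress=lambda g, c: progress.append((g, c)))
threaded = run(RandomSearch(-5, 5, batch=8), cost, ThreadedExecutor(), generations=20)
```

Because the optimizer only sees ask and tell, the same seed gives the same result under either executor; only the evaluation strategy changes.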

Name-based measurement lookup

client.resolve_measurement(cell_specification, cell_instance, measurement) resolves a measurement_id from human-readable names, walking the spec → instance → measurement hierarchy with server-side filtering at each level. Scripts no longer need to hand-walk three list endpoints and repeat name-matching boilerplate just to get an id. parameterized_model.create_or_get also lands, giving parameterized models the re-runnable create-or-resolve behaviour the other cell resources already had.
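The spec → instance → measurement walk can be pictured with a small in-memory stand-in. This is not the client's implementation: the record layout, the helper _one, and the choice to raise LookupError on a missing or ambiguous name are all assumptions made for illustration.

```python
# Hypothetical in-memory records standing in for the three list endpoints.
SPECS = [{"id": "spec-1", "name": "LGM50"}, {"id": "spec-2", "name": "Other"}]
INSTANCES = [
    {"id": "inst-1", "spec_id": "spec-1", "name": "cell-007"},
    {"id": "inst-2", "spec_id": "spec-2", "name": "cell-007"},  # same name, other spec
]
MEASUREMENTS = [
    {"id": "meas-1", "instance_id": "inst-1", "name": "GITT 25C"},
    {"id": "meas-2", "instance_id": "inst-2", "name": "GITT 25C"},
]


def _one(rows, kind, name, **filters):
    """Return the single row matching name and parent filters, else raise."""
    matches = [
        r for r in rows
        if r["name"] == name and all(r[k] == v for k, v in filters.items())
    ]
    if len(matches) != 1:
        raise LookupError(f"expected one {kind} named {name!r}, found {len(matches)}")
    return matches[0]


def resolve_measurement_sketch(cell_specification, cell_instance, measurement):
    """Walk spec -> instance -> measurement, filtering by the parent id at each level."""
    spec = _one(SPECS, "cell specification", cell_specification)
    inst = _one(INSTANCES, "cell instance", cell_instance, spec_id=spec["id"])
    meas = _one(MEASUREMENTS, "measurement", measurement, instance_id=inst["id"])
    return meas["id"]


measurement_id = resolve_measurement_sketch("LGM50", "cell-007", "GITT 25C")
```

Filtering by the parent id at each level is what lets two instances share a name without ambiguity, as long as they belong to different specifications.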

Clearer errors for model failures and duplicates

Two opaque “An unexpected error occurred” cases — a fit whose model setup needs geometry the user didn’t supply, and a custom model with no discretisation recipe — now surface as a clear ModelError naming what failed. Separately, attempting to create a resource that already exists now returns a 409 CONFLICT (with the existing id) instead of a 500, both via a catch-all for Postgres unique-constraint violations and per-path translation at every insert site, so duplicate-creation attempts no longer page on-call.
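The duplicate-to-409 translation at a single insert site might look roughly like the sketch below. It uses the standard library's sqlite3 as a stand-in for Postgres, and the table name, function name, and response shape are assumptions rather than the service's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cell_spec (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
)


def create_cell_spec(name):
    """Insert a row; translate a unique-constraint violation into a 409 with the existing id."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            cur = conn.execute("INSERT INTO cell_spec (name) VALUES (?)", (name,))
        return 201, {"id": cur.lastrowid}
    except sqlite3.IntegrityError:
        # Here the only constraint a non-null name can violate is UNIQUE(name).
        row = conn.execute(
            "SELECT id FROM cell_spec WHERE name = ?", (name,)
        ).fetchone()
        return 409, {"error": "CONFLICT", "existing_id": row[0]}


first = create_cell_spec("LGM50")
second = create_cell_spec("LGM50")
```

Returning the existing id with the 409 lets a re-run script carry on with the resource that is already there instead of treating the duplicate as a failure.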
Improvements
  • “Save as template” for an optimization now builds the template server-side via a new POST /optimization_templates/from-optimization endpoint that copies an optimization’s saved config directly into the template.
Fixes
  • Completed validation/datafit results no longer sporadically show “Failed to load plot data.” for one of a pair of plots. A job’s metadata is rewritten in place as it runs (checkpoints, then the final write that adds the validation plot config), but the in-process metadata cache assumed metadata was immutable after completion; it now self-heals stale entries, and per-job storage blobs are written no-cache so a stale early write can’t win.
  • CycleAgeing experiments passed as a lazy DataLoader (experiment="from data") now resolve their db: measurement references server-side, instead of surviving to the credential-less fit worker and failing there.
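One way a metadata cache can self-heal when the underlying blob is rewritten in place is to compare a cheap version token on every read. The release note does not say how the real cache detects staleness, so the version counter and the Storage and MetadataCache classes below are purely illustrative.

```python
class Storage:
    """Hypothetical blob store that bumps a version on every in-place rewrite."""

    def __init__(self):
        self._blobs = {}  # job_id -> (version, metadata)

    def write(self, job_id, metadata):
        version = self._blobs.get(job_id, (0, None))[0] + 1
        self._blobs[job_id] = (version, dict(metadata))

    def version(self, job_id):
        return self._blobs[job_id][0]

    def read(self, job_id):
        version, metadata = self._blobs[job_id]
        return version, dict(metadata)


class MetadataCache:
    """Cache that reloads an entry whenever its stored version is out of date."""

    def __init__(self, storage):
        self.storage = storage
        self._entries = {}
        self.reloads = 0

    def get(self, job_id):
        current = self.storage.version(job_id)
        entry = self._entries.get(job_id)
        if entry is None or entry[0] != current:  # missing or stale: refetch
            entry = self.storage.read(job_id)
            self._entries[job_id] = entry
            self.reloads += 1
        return entry[1]


storage = Storage()
cache = MetadataCache(storage)
storage.write("job-1", {"status": "running"})               # checkpoint write
early = cache.get("job-1")
storage.write("job-1", {"status": "complete",
                        "validation_plot": {"kind": "pair"}})  # final write
late = cache.get("job-1")
again = cache.get("job-1")                                   # served from cache
```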
Improvements
  • A bare element-wise cost (SSE/MSE/RMSE/MAE/Max) applied to a multi-variable objective now warns when variable lengths mismatch — e.g. a model-axis dQ/dV variable against a data-axis voltage variable — instead of silently broadcasting to a meaningless residual.
  • All 1-D interpolant calculations now accept a "pchip" interpolator (monotone cubic), the right choice for sparse, order-of-magnitude D(sto) tables such as half-cell GITT per-pulse diffusivity.
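The length-mismatch warning can be sketched as a check that runs before any element-wise cost is computed. The function name and message are hypothetical, and what the library does after warning is not stated in the note; this sketch only reports the mismatch.

```python
import warnings


def warn_on_length_mismatch(variables):
    """Warn if the objective's variables do not all share one length.

    `variables` maps a variable name to its sequence of values.
    Returns True when every length matches, False otherwise.
    """
    lengths = {name: len(values) for name, values in variables.items()}
    if len(set(lengths.values())) > 1:
        warnings.warn(
            f"objective variables have mismatched lengths: {lengths}",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = warn_on_length_mismatch({
        "dQdV": [0.1] * 50,      # sampled on a model axis
        "voltage": [3.7] * 200,  # sampled on the data axis
    })
```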
Fixes
  • The MSMR half-cell logistic value and derivative are now evaluated through a single per-species-stabilized helper whose exponent is always ≤ 0, so it can never overflow.
  • The object path (iwp.Pipeline.from_schema(...).run()) now rebuilds a serialized model dict in objective options the same way the config/server path does, so both paths agree.
  • A custom model with serialised geometry/mesh now survives stdlib pickle / raw multiprocessing multistart — the previous anonymous-subclass approach could only be serialised by value (cloudpickle/Ray).
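The overflow-free logistic in the MSMR fix rests on a standard identity: a logistic term s(z) = 1 / (1 + exp(z)) and its derivative can both be written in terms of exp(-|z|), whose exponent is never positive. The sketch below assumes that generic logistic form; the function name is hypothetical and the per-species scaling of z is left out.

```python
import math


def stable_logistic(z):
    """Return (s, ds/dz) for s = 1 / (1 + exp(z)) without ever calling exp on z > 0."""
    e = math.exp(-abs(z))        # exponent <= 0, so 0 <= e <= 1 and no overflow
    if z >= 0:
        s = e / (1.0 + e)        # 1/(1+exp(z)) == exp(-z)/(1+exp(-z))
    else:
        s = 1.0 / (1.0 + e)      # here e == exp(z)
    ds = -e / (1.0 + e) ** 2     # ds/dz = -s(1-s); the expression is even in z
    return s, ds


mid = stable_logistic(0.0)
far_positive = stable_logistic(1000.0)
far_negative = stable_logistic(-1000.0)
```

For comparison, the naive form fails at the same input, because math.exp(1000.0) raises OverflowError.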
Improvements
  • Adds client.resolve_measurement and parameterized_model.create_or_get (both introduced above), plus Model.config for inspecting a model’s stored configuration.
  • Passing a bare pandas/polars DataFrame to a schema data field (objective data_input, OCPDataInterpolant.data, the Arrhenius calculations’ data) is now auto-wrapped correctly on serialization, instead of failing server-side validation with the opaque “Required field ‘data’ missing”.
Improvements
  • Arbin protocols now support subschedules (mapped to a UCP subroutine with the referenced .subsdx registered and parsed recursively), counter loops, and temperature add-ins, so schedules that previously crashed on upload with “Unsupported step type: SubSchedule” now parse and simulate end-to-end.
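Recursive subschedule handling can be pictured as flattening a step list in which some steps point at another schedule. The dictionary layout, the "ref" key, and the cycle check below are assumptions for illustration and do not reflect the parser's real data model.

```python
def expand(schedule_name, schedules, _stack=()):
    """Flatten a schedule, recursively inlining any SubSchedule steps."""
    if schedule_name in _stack:
        chain = " -> ".join(_stack + (schedule_name,))
        raise ValueError(f"subschedule cycle: {chain}")
    flat = []
    for step in schedules[schedule_name]:
        if step["type"] == "SubSchedule":
            # Parse the referenced subschedule in place of this step.
            flat.extend(expand(step["ref"], schedules, _stack + (schedule_name,)))
        else:
            flat.append(step["name"])
    return flat


# Hypothetical schedules: "main" references one subschedule file.
schedules = {
    "main": [
        {"type": "Rest", "name": "initial rest"},
        {"type": "SubSchedule", "ref": "pulse.subsdx"},
        {"type": "Rest", "name": "final rest"},
    ],
    "pulse.subsdx": [
        {"type": "Current", "name": "discharge pulse"},
        {"type": "Rest", "name": "relaxation"},
    ],
}

steps = expand("main", schedules)
```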
Fixes
  • BioLogic .mps simulation fixes: the loop counter is no longer off by one (EC-Lab’s ctrl_repeat does not count the first pass, so a loop now runs ctrl_repeat + 1 times), plus BCD/EIS support and drive-cycle fixes, each reproduced against the reported setting files before fixing.
  • User-correctable protocol and configuration errors across the Protocol Simulator and pipeline libraries now raise specific domain error types (ProtocolConfigurationError, UserConfigurationError) instead of generic ValueError / RuntimeError, so they surface to the user rather than paging on-call.
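The ctrl_repeat rule from the BioLogic fix above comes down to simple arithmetic: the body runs once, then repeats ctrl_repeat more times. The helper below is a hypothetical illustration of that count, not the simulator's code.

```python
def expand_loop(body, ctrl_repeat):
    """Return the executed step sequence: one first pass plus ctrl_repeat repeats."""
    if ctrl_repeat < 0:
        raise ValueError("ctrl_repeat must be non-negative")
    return list(body) * (ctrl_repeat + 1)


single_pass = expand_loop(["charge", "discharge"], 0)   # no repeats: body runs once
three_passes = expand_loop(["charge", "discharge"], 2)  # first pass + 2 repeats
```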
Improvements
  • The process-data reader gotchas are now documented: selecting the record sheet for multi-sheet Neware BTSDA .xlsx files, and current-unit handling — alongside the one-call read.time_series_and_steps entry point.