Objective Functions

A DataFit has two coupled pieces:

Objectives (iws.objectives.*) — what experiments to compare model output against.
Cost (iws.costs.*) — how the per-point disagreements are aggregated into a single number.

For the math behind each cost, see the Objective Functions Guide.

Available cost functions

Schema	Formula	When to use
`iws.costs.SSE()`	$\sum_i r_i^2$	Default; works with every optimiser
`iws.costs.MSE()`	$\frac{1}{N}\sum_i r_i^2$	Mean of squared residuals; normalised by the number of points $N$, so comparable across datasets of different length
`iws.costs.RMSE()`	$\sqrt{\frac{1}{N}\sum_i r_i^2}$	Interpretable units; scalar-only (won’t work with residual-array optimisers)
`iws.costs.MAE()`	$\frac{1}{N}\sum_i \lvert r_i \rvert$	Robust to outliers
`iws.costs.Max()`	$\max_i \lvert r_i \rvert$	Minimise the worst-case (largest absolute) residual
`iws.costs.Wasserstein()`	$\frac{1}{N}\sum_i \lvert \tilde y_{\text{model},i} - \tilde y_{\text{data},i} \rvert$	Match distributions (sorted samples) rather than point-wise time series. Set `position_variable` and `weight_variable` for weighted point-cloud mode
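The formulas in the table can be checked by hand with a few lines of plain Python. This is a minimal sketch of the arithmetic only, not the library's implementation: the function names are illustrative, and the sorted-sample Wasserstein form assumes the model and data samples have the same length.

```python
import math


def sse(residuals):
    """Sum of squared residuals."""
    return sum(r * r for r in residuals)


def mse(residuals):
    """Mean of squared residuals."""
    return sse(residuals) / len(residuals)


def rmse(residuals):
    """Root of the mean squared residual (same units as the data)."""
    return math.sqrt(mse(residuals))


def mae(residuals):
    """Mean absolute residual."""
    return sum(abs(r) for r in residuals) / len(residuals)


def max_abs(residuals):
    """Largest absolute residual."""
    return max(abs(r) for r in residuals)


def wasserstein_sorted(model, data):
    """Mean absolute difference between the sorted samples (uniform weights)."""
    if len(model) != len(data):
        raise ValueError("this sketch assumes equal-length samples")
    pairs = zip(sorted(model), sorted(data))
    return sum(abs(m - d) for m, d in pairs) / len(model)


model = [3.0, 1.0, 2.0]
data = [1.0, 2.0, 5.0]
residuals = [m - d for m, d in zip(model, data)]  # [2.0, -1.0, -3.0]

print(sse(residuals))      # 14.0
print(mae(residuals))      # 2.0
print(max_abs(residuals))  # 3.0
print(wasserstein_sorted(model, data))  # only the largest sorted pair differs
```

Note how the Wasserstein value ignores the ordering of the samples: the point-wise residuals are all non-zero, but after sorting only one pair differs.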

For maximum-likelihood estimation (MLE), see iws.costs.GaussianLogLikelihood. It accepts per-variable noise standard deviations, or can estimate them alongside the fitting parameters, and produces a Gaussian negative log-likelihood suitable for Bayesian and MAP estimation.

Wiring a cost into a fit

import pybamm
import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "1C": iws.objectives.CurrentDriven(
            data_input="file:.../1C.csv",
            options={"model": pybamm.lithium_ion.SPMe()},
        ),
    },
    parameters={
        "Negative particle diffusivity [m2.s-1]": iws.Parameter(
            "Negative particle diffusivity [m2.s-1]",
            initial_value=2e-14,
            bounds=(1e-14, 1e-13),
        ),
    },
    cost=iws.costs.RMSE(),
)

If cost is omitted, the optimiser’s default cost function is used (typically a least-squares form).

cost accepts a cost schema instance (e.g. iws.costs.RMSE()) or a config dict with an explicit type key (e.g. {"type": "RMSE"}). A bare name string like cost="RMSE" is rejected with a validation error — wrap it as {"type": "RMSE"} instead.

Wasserstein weighted point-cloud mode

By default iws.costs.Wasserstein() compares the model and data samples for each objective variable with uniform weights (sorted point-wise comparison). Set both position_variable and weight_variable to switch to weighted point-cloud mode: one variable supplies the positions, the other supplies the (sign-stripped, renormalised) weights, and a single Wasserstein-1 distance is computed per objective. Use this when you want to match a density by position rather than sample-by-sample values — for example, lining up dQ/dV peaks in voltage rather than penalising every dQ/dV residual. Both iws.objectives.MSMRFullCell and iws.objectives.ElectrodeBalancing expose the matching Differential capacity [Ah/V] values alongside their Voltage [V] (dQdU) masked-axis sibling, so either can drive a weighted point-cloud fit:

import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "ocp": iws.objectives.ElectrodeBalancing(
            data_input="file:.../ocv.csv",
            options={
                "objective variables": [
                    "Differential capacity [Ah/V]",
                    "Voltage [V] (dQdU)",
                ],
            },
        ),
    },
    parameters={...},
    cost=iws.costs.Wasserstein(
        position_variable="Voltage [V] (dQdU)",
        weight_variable="Differential capacity [Ah/V]",
    ),
)

position_variable and weight_variable must be set together — providing only one raises a validation error. Weights are taken as absolute values and renormalised internally, so sign conventions on dQ/dV don’t matter. Residual-array output is not available in this mode.
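To make the weighted point-cloud idea concrete, the sketch below computes a generic one-dimensional Wasserstein-1 distance between two weighted point clouds by integrating the absolute gap between their cumulative distributions. It is an illustration under stated assumptions, not the library's code: the function name and argument names are hypothetical, and only the two behaviours described above (absolute-value weights, internal renormalisation) are taken from this page.

```python
def weighted_wasserstein_1d(pos_a, weights_a, pos_b, weights_b):
    """Wasserstein-1 distance between two weighted 1-D point clouds."""
    if len(pos_a) != len(weights_a) or len(pos_b) != len(weights_b):
        raise ValueError("each position needs exactly one weight")

    def normalise(weights):
        # Strip signs, then rescale so the weights sum to one.
        magnitudes = [abs(w) for w in weights]
        total = sum(magnitudes)
        if total == 0:
            raise ValueError("weights must not all be zero")
        return [m / total for m in magnitudes]

    mass_a = normalise(weights_a)
    mass_b = normalise(weights_b)

    # Signed mass events: cloud A adds to the CDF gap, cloud B subtracts.
    events = sorted(
        [(p, m) for p, m in zip(pos_a, mass_a)]
        + [(p, -m) for p, m in zip(pos_b, mass_b)]
    )

    distance = 0.0
    cdf_gap = 0.0
    previous = events[0][0]
    for position, mass in events:
        # Integrate |F_a - F_b| over the interval since the last event.
        distance += abs(cdf_gap) * (position - previous)
        cdf_gap += mass
        previous = position
    return distance


# A single dQ/dV peak at 3.0 V versus the same peak shifted to 3.1 V.
# The sign and magnitude of the weights do not matter.
shift = weighted_wasserstein_1d([3.0], [2.0], [3.1], [-5.0])
print(shift)  # approximately 0.1, the peak shift in volts
```

The result has the units of the position variable, which is why this mode suits peak alignment: a peak displaced by 0.1 V costs about 0.1 regardless of its height.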

`ElectrodeBalancing` options for OCV fitting

ElectrodeBalancing accepts the following keys in its options dict to control how the full-cell OCV is processed before the objective is evaluated. These apply regardless of which cost function the fit uses (not only weighted Wasserstein):

Option	Type	Default	Purpose
`objective variables`	list of str	`["Voltage [V]", "Differential voltage [V/Ah]"]`	Variables compared between model and data. Add `"Differential capacity [Ah/V]"` to also emit model dQ/dU on the data voltage grid (plus the masked-axis siblings `"Voltage [V] (dQdU)"` and `"Capacity [A.h] (dQdU)"`) for weighted Wasserstein costs.
`dUdQ cutoff`	float \| None	`None`	Drop data points whose `dU/dQ` exceeds this value — useful for masking the near-vertical regions at the voltage limits.
`dQdU cutoff`	float \| None	`None`	Drop data points whose `dQ/dU` exceeds this value — useful for masking flat OCV regions where `dQ/dU` diverges. Negative or zero values are always dropped so the resulting weights stay non-negative for Wasserstein.
`direction`	`"charge"` \| `"discharge"` \| None	`None`	Direction of the OCV scan. `None` makes no directional assumption.
`GITT`	bool	`False`	Treat the data as sparse GITT samples and upsample by interpolation before computing the derivatives.
`dQdU model axis`	bool	`False`	When `True`, additionally emit dQ/dV on the model’s own full-window voltage axis as `"Differential capacity [Ah/V] (model axis)"` / `"Voltage [V] (model axis)"` — see Aligning dQ/dV peaks on the model voltage axis.

import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "ocp": iws.objectives.ElectrodeBalancing(
            data_input="file:.../ocv.csv",
            options={
                "direction": "discharge",
                "GITT": True,
                "dUdQ cutoff": 1.0,
                "dQdU cutoff": 50.0,
                "objective variables": [
                    "Differential capacity [Ah/V]",
                    "Voltage [V] (dQdU)",
                ],
            },
        ),
    },
    parameters={...},
    cost=iws.costs.Wasserstein(
        position_variable="Voltage [V] (dQdU)",
        weight_variable="Differential capacity [Ah/V]",
    ),
)
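The effect of the dQdU cutoff option can be pictured as a simple mask over the data. The sketch below is a hypothetical helper, not part of the library: it drops points whose dQ/dU exceeds the cutoff and, as an assumption about the "always dropped" rule above, drops non-positive values even when no cutoff is set. The surviving voltages play the role of the masked-axis sibling.

```python
def mask_dqdu(voltage, dqdu, cutoff=None):
    """Return the (voltage, dQ/dU) pairs that survive the cutoff mask."""
    kept_voltage, kept_dqdu = [], []
    for v, q in zip(voltage, dqdu):
        if q <= 0:
            # Non-positive values would give invalid Wasserstein weights.
            continue
        if cutoff is not None and q > cutoff:
            # Diverging dQ/dU in a flat OCV region.
            continue
        kept_voltage.append(v)
        kept_dqdu.append(q)
    return kept_voltage, kept_dqdu


voltage = [3.0, 3.1, 3.2, 3.3]
dqdu = [10.0, -2.0, 80.0, 0.0]

print(mask_dqdu(voltage, dqdu, cutoff=50.0))  # ([3.0], [10.0])
print(mask_dqdu(voltage, dqdu))               # ([3.0, 3.2], [10.0, 80.0])
```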

Scoping a cost with `calculation_structure`

By default, every cost on a DataFit consumes every objective and every objective variable in the outputs. Set calculation_structure on a cost to scope it explicitly. The value is a mapping from objective name to the list of variable names that cost should compute, or to None to compute all of that objective’s variables (an empty list computes none).

Objectives you leave out of the mapping are not dropped. Inside a DataFit, each unscoped objective is bound to all of its variables, the same as mapping it to None, so scoping one objective (e.g. {"ocp": ["Voltage [V]"]} while a "cc" objective also exists) still computes "cc" in full.

Use this when one cost should only see a subset of variables, most commonly when you pair a per-variable cost (e.g. SSE) with a weighted Wasserstein. The Wasserstein owns the dQ/dV variables (whose model and data sides may have different lengths by construction), and the SSE is scoped to skip them so the lengths never collide.

import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "ocp": iws.objectives.ElectrodeBalancing(
            data_input="file:.../ocv.csv",
            options={
                "objective variables": [
                    "Voltage [V]",
                    "Differential capacity [Ah/V] (model axis)",
                    "Voltage [V] (model axis)",
                ],
                "dQdU model axis": True,
            },
        ),
    },
    parameters={...},
    cost=[
        iws.costs.SSE(
            calculation_structure={"ocp": ["Voltage [V]"]},
        ),
        iws.costs.Wasserstein(
            position_variable="Voltage [V] (model axis)",
            weight_variable="Differential capacity [Ah/V] (model axis)",
            calculation_structure={
                "ocp": [
                    "Voltage [V] (model axis)",
                    "Differential capacity [Ah/V] (model axis)",
                ],
            },
        ),
    ],
)

calculation_structure replaces the deprecated objective_names field (a flat list of objective names with no per-variable control). Specifying both on the same cost raises a validation error.
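The scoping rules for calculation_structure can be summarised as a small resolution function. This is a sketch of the rules as described in this section, with a hypothetical function name and a plain dict standing in for the fit outputs; it is not how the library is implemented.

```python
def resolve_scope(outputs, calculation_structure=None):
    """Map each objective to the variables a cost would compute.

    outputs: dict of objective name -> list of available variable names.
    calculation_structure: dict of objective name -> list of names, or None.
    """
    structure = calculation_structure or {}
    resolved = {}
    for objective, variables in outputs.items():
        requested = structure.get(objective)
        if requested is None:
            # Missing from the mapping, or mapped to None: use everything.
            resolved[objective] = list(variables)
        else:
            # Explicit list, including the empty list (compute nothing).
            resolved[objective] = list(requested)
    return resolved


outputs = {
    "ocp": ["Voltage [V]", "Voltage [V] (model axis)"],
    "cc": ["Voltage [V]"],
}

# Scoping "ocp" leaves the unscoped "cc" objective computed in full.
print(resolve_scope(outputs, {"ocp": ["Voltage [V]"]}))
# {'ocp': ['Voltage [V]'], 'cc': ['Voltage [V]']}
```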

Length-mismatch warning

Element-wise costs (SSE, MSE, RMSE, MAE, Max) combine the model and data arrays point-by-point, so a variable whose model and data sides have different lengths almost never gives a meaningful score. At fit setup, DataFit checks the shapes of every variable each cost is configured to score and emits a UserWarning for each mismatch — for example:

UserWarning: variable 'Voltage [V] (model axis)' of objective 'ocp' has mismatched
model/data shapes ((512,) vs (128,)). An element-wise cost will combine them
point-by-point, which is almost never intended. Scope the cost with an explicit
`calculation_structure` so each variable is compared against a matching-length
counterpart.

The check runs once at fit setup (not on every objective evaluation), so it has no impact on fit performance. When you see this warning, scope the cost with calculation_structure so it only sees variables whose model and data lengths match — and route any model-axis variables to a Wasserstein cost (or another distribution metric) instead. Distribution costs like Wasserstein are skipped by the check, since unequal-length sample sets are expected there.
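A setup-time check of this kind might look like the sketch below. The helper name, the dict layout, and the abbreviated message are assumptions made for illustration; only the set of element-wise costs and the use of UserWarning come from this section.

```python
import warnings

ELEMENT_WISE_COSTS = {"SSE", "MSE", "RMSE", "MAE", "Max"}


def check_lengths(cost_name, model_outputs, data_outputs):
    """Warn once per variable whose model and data lengths differ.

    Both output dicts map (objective, variable) -> sequence of values.
    Returns the number of mismatches found.
    """
    if cost_name not in ELEMENT_WISE_COSTS:
        # Distribution costs expect unequal-length sample sets.
        return 0
    mismatches = 0
    for (objective, variable), model_values in model_outputs.items():
        data_values = data_outputs[(objective, variable)]
        if len(model_values) != len(data_values):
            warnings.warn(
                f"variable {variable!r} of objective {objective!r} has "
                f"mismatched model/data shapes "
                f"(({len(model_values)},) vs ({len(data_values)},)).",
                UserWarning,
                stacklevel=2,
            )
            mismatches += 1
    return mismatches


model_outputs = {
    ("ocp", "Voltage [V]"): [0.0] * 128,
    ("ocp", "Voltage [V] (model axis)"): [0.0] * 512,
}
data_outputs = {
    ("ocp", "Voltage [V]"): [0.0] * 128,
    ("ocp", "Voltage [V] (model axis)"): [0.0] * 128,
}

check_lengths("SSE", model_outputs, data_outputs)  # emits one UserWarning
```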

Aligning dQ/dV peaks on the model voltage axis

iws.objectives.ElectrodeBalancing can emit dQ/dV on the model’s own full-window voltage axis in addition to (or instead of) the data voltage grid. Set "dQdU model axis": True in options and add the two model-axis variables, "Differential capacity [Ah/V] (model axis)" and "Voltage [V] (model axis)", to objective variables. Use this when you want a weighted cost (typically Wasserstein in point-cloud mode) to align dQ/dV peaks by shifting them in voltage, rather than comparing residual-by-residual on the data grid. The model and data sides have different lengths by construction, so only a weighted cost should consume them; pair that cost with a sibling per-variable cost scoped via calculation_structure (see above) so the remaining variables are still compared point by point. The existing data-axis variables ("Differential capacity [Ah/V]" plus the masked siblings "Voltage [V] (dQdU)" / "Capacity [A.h] (dQdU)") remain available, so both axes can be requested side by side.

Available objectives

Schema	Use for
`iws.objectives.CurrentDriven(data_input=..., options={...})`	Time-series voltage vs. current loads (drive cycles, custom loads)
`iws.objectives.Pulse(data_input=..., options={...})`	Pulse experiments — GITT, HPPC, ICI — with optional feature-extraction variants
`iws.objectives.OCPHalfCell(electrode=..., data_input=...)`	Half-cell OCP curves
`iws.objectives.MSMRHalfCell(...)`	Fit MSMR parameters to half-cell data
`iws.objectives.MSMRFullCell(...)`	Fit MSMR parameters to full-cell data. Supports `Differential voltage [V/Ah]` and `Differential capacity [Ah/V]` as objective variables
`iws.objectives.ElectrodeBalancing(...)`	Stoichiometry windows from full-cell discharge. Supports `Differential voltage [V/Ah]` and `Differential capacity [Ah/V]` as objective variables — see `ElectrodeBalancing` options
`iws.objectives.EIS(...)`	Electrochemical impedance spectra
`iws.objectives.Resistance(...)`	DC resistance extracted from pulse data
`iws.objectives.CalendarAgeing(...)` / `iws.objectives.CycleAgeing(...)`	Ageing curves

Combine several by passing a dict[str, objective] to DataFit.objectives.

`GITTModel`: diffusion-only model for GITT and pulse fits

GITTModel is a fitting-only model intended for extracting solid-phase diffusivities (and a single lumped ohmic resistance) from GITT or pulse-relaxation measurements. It solves x-averaged spherical particle diffusion in each modelled electrode, with the surface flux set by the applied current, and computes the cell voltage from the electrode open-circuit potentials evaluated at the particle-surface stoichiometries, minus an ohmic drop through a lumped "Ohmic resistance [Ohm]" parameter. There are no reaction kinetics (Butler-Volmer), no electrolyte dynamics, and no thermal effects — all parameters are constant except the OCPs. Use it when you want fast, well-conditioned fits to diffusion-dominated portions of GITT or pulse data, and reach for SPM / SPMe / DFN when you need a full physics simulation. Select the cell configuration via the "working electrode" option:

`"working electrode"`	Configuration
`"both"` (default)	Full cell. Both electrodes are modelled. A positive (discharge) current delithiates the negative electrode and lithiates the positive electrode.
`"positive"`	Half-cell against a lithium-metal counter electrode (pybamm half-cell convention). Only the working electrode is modelled. A positive current lithiates the working electrode (discharge for a cathode material, charge for an anode material). Anode-material half cells also use `"positive"` — rename the anode’s parameters to the positive convention first.

Each modelled electrode is parameterised with the standard full-cell parameter names (thickness, active material volume fraction, particle radius, diffusivity, OCP, maximum and initial concentrations), plus the current function, electrode cross-sectional area, initial temperature, and "Ohmic resistance [Ohm]".
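The full-cell voltage expression described above reduces to a single line once the surface stoichiometries are known. The sketch below is a reading of that description, not the model's source: the function name is hypothetical, the two linear OCP curves are toy placeholders, and a positive current is taken to mean discharge.

```python
def gitt_full_cell_voltage(
    sto_n_surf, sto_p_surf, current, ocp_n, ocp_p, ohmic_resistance
):
    """Cell voltage from surface OCPs minus a lumped ohmic drop."""
    return ocp_p(sto_p_surf) - ocp_n(sto_n_surf) - current * ohmic_resistance


# Toy linear OCP curves, for illustration only.
def toy_ocp_n(sto):
    return 0.2 - 0.1 * sto


def toy_ocp_p(sto):
    return 4.3 - 0.8 * sto


# 1 A discharge through a 0.02 Ohm lumped resistance.
under_load = gitt_full_cell_voltage(0.5, 0.5, 1.0, toy_ocp_n, toy_ocp_p, 0.02)
at_rest = gitt_full_cell_voltage(0.5, 0.5, 0.0, toy_ocp_n, toy_ocp_p, 0.02)
print(under_load, at_rest)  # about 3.73 V under load, 3.75 V at rest
```

During a pulse the surface stoichiometries drift away from the particle averages, and during relaxation they return; with no kinetics or electrolyte terms, that drift and the ohmic step are the only sources of voltage change in this picture.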

Fitting a full-cell GITT measurement

import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "gitt": iws.objectives.Pulse(
            data_input="file:.../gitt.csv",
            options={
                "model": iws.models.GITTModel(),
            },
        ),
    },
    parameters={
        "Negative particle diffusivity [m2.s-1]": iws.Parameter(
            "Negative particle diffusivity [m2.s-1]",
            initial_value=2e-14,
            bounds=(1e-15, 1e-12),
        ),
        "Positive particle diffusivity [m2.s-1]": iws.Parameter(
            "Positive particle diffusivity [m2.s-1]",
            initial_value=2e-15,
            bounds=(1e-16, 1e-13),
        ),
        "Ohmic resistance [Ohm]": iws.Parameter(
            "Ohmic resistance [Ohm]",
            initial_value=0.02,
            bounds=(1e-3, 1e-1),
        ),
    },
)

Fitting a half-cell pulse measurement

Pass "working electrode": "positive" to model a single electrode against a lithium-metal counter. Only the working-electrode parameters are needed.

import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "pulse": iws.objectives.Pulse(
            data_input="file:.../half_cell_pulse.csv",
            options={
                "model": iws.models.GITTModel(
                    options={"working electrode": "positive"},
                ),
            },
        ),
    },
    parameters={
        "Positive particle diffusivity [m2.s-1]": iws.Parameter(
            "Positive particle diffusivity [m2.s-1]",
            initial_value=2e-15,
            bounds=(1e-16, 1e-13),
        ),
        "Ohmic resistance [Ohm]": iws.Parameter(
            "Ohmic resistance [Ohm]",
            initial_value=0.02,
            bounds=(1e-3, 1e-1),
        ),
    },
)

"working electrode" only accepts "both" or "positive" — anything else fails schema validation. Any other keys in options are forwarded to the underlying battery-model options for parameter bookkeeping; they do not change the diffusion-only physics.

Specifying `data_input`

Every objective’s data_input (and any other data field on a calculation or interpolant) accepts the same set of forms:

A reference string. "db:<id>" references an uploaded measurement. "file:..." and "folder:..." are read from your local machine and inlined into the config by the API client on submit, so they work both locally and when you submit a fit to Ionworks, subject to the same 1,000-row inline limit as a bare DataFrame. For larger datasets, upload a measurement and reference it with "db:<id>".
An ionworksdata.DataLoader (local or fetched with DataLoader.from_db(...)).
A bare pandas or polars DataFrame of pre-loaded columns.

import pybamm
import ionworks_schema as iws
import pandas as pd

df = pd.DataFrame(
    {
        "Time [s]": [...],
        "Voltage [V]": [...],
        "Current [A]": [...],
    }
)

obj = iws.objectives.CurrentDriven(
    data_input=df,
    options={"model": pybamm.lithium_ion.SPMe()},
)

When a bare DataFrame is passed, it is auto-wrapped on serialization to match the parser’s expected {"data": <columns>} shape — so data_input=df and data_input={"data": df} behave the same. String paths and already-wrapped dicts are left untouched.

Inline DataFrames are capped at 1,000 rows per call. For larger datasets, upload as a measurement and reference it by ID instead. See inline time series size limit.
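The wrapping rule and the inline row limit can be sketched as two small helpers. Both function names are hypothetical, and the sketch only mirrors the behaviour described in this section; what the API client actually does when the limit is exceeded is not covered here. The helpers rely on len(), which returns the row count for a pandas DataFrame.

```python
INLINE_ROW_LIMIT = 1000  # per-call cap on inline rows, as stated above


def wrap_data_input(data_input):
    """Wrap a bare table as {"data": table}; leave other forms untouched."""
    if isinstance(data_input, str):
        # Reference strings such as "db:<id>" pass through unchanged.
        return data_input
    if isinstance(data_input, dict) and "data" in data_input:
        # Already in the parser's expected shape.
        return data_input
    return {"data": data_input}


def fits_inline(table):
    """True when the table is small enough to send inline."""
    return len(table) <= INLINE_ROW_LIMIT


# A list of rows stands in for a DataFrame in this sketch.
rows = [(0.0, 4.2, 1.0)] * 10

print(wrap_data_input(rows) == {"data": rows})  # True
print(wrap_data_input("db:<id>"))               # db:<id>
print(fits_inline(rows))                        # True
```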

Generating a CycleAgeing experiment from data

iws.objectives.CycleAgeing normally requires an explicit pybamm.Experiment describing the cycling protocol. When the protocol is already encoded in the cycler step information attached to your data, set experiment="from data" to skip rebuilding it by hand. The experiment is generated lazily, when the fit starts, by calling DataLoader.generate_experiment() on the loaded step table. Use this when:

The fitted data carries its own step information (a local ionworksdata.DataLoader, or one fetched with DataLoader.from_db(...)).
You want the simulated protocol to track the measurement protocol exactly — including any per-step current, voltage limits, or durations recorded by the cycler.

import pybamm
import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "ageing": iws.objectives.CycleAgeing(
            data_input="db:<measurement-id>",
            options={
                "model": pybamm.lithium_ion.SPM(options={"SEI": "ec reaction limited"}),
                "experiment": "from data",
                "objective variables": ["LLI [%]"],
            },
        ),
    },
    parameters={...},
)

If the data you are fitting against (for example, a per-cycle summary table) is a different object from the measurement that defines the protocol, pass a separate DataLoader as experiment instead — the steps come from that loader, while the residuals are still computed against data_input:

import pybamm
import ionworksdata as iwdata
import ionworks_schema as iws

protocol = iwdata.DataLoader.from_db("<protocol-measurement-id>")

fit = iws.DataFit(
    objectives={
        "ageing": iws.objectives.CycleAgeing(
            data_input="db:<summary-measurement-id>",
            options={
                "model": pybamm.lithium_ion.SPM(),
                "experiment": protocol,
                "objective variables": ["LLI [%]"],
            },
        ),
    },
    parameters={...},
)

experiment="from data" requires data_input to resolve to a DataLoader (or a dict whose "data" entry is a DataLoader) that carries step information. When you pass a separate DataLoader as experiment, that loader must carry the step information instead. Either way, configurations missing steps fail fast at objective construction with a clear error, before any simulation runs.

Tuning the auto-built solver

Simulation-backed objectives (CurrentDriven, Pulse, CalendarAgeing, CycleAgeing, MSMRFullCell, …) build an IonworksSolver for you when no explicit solver is provided. Pass solver_kwargs inside simulation_kwargs to override individual pieces of that default without restating the rest:

Nested options are merged over the default IDAKLU options. For example, {"options": {"compile": True}} flips on model compilation but keeps every other tuned option.
Other top-level keys (atol, rtol, on_extrapolation, …) override the corresponding default solver kwargs.

solver_kwargs is ignored (with a warning) when an explicit solver is supplied; set the equivalent options on the solver instance directly. It is also ignored when the model’s default solver isn’t IDAKLU-based.

import pybamm
import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "1C": iws.objectives.CurrentDriven(
            data_input="file:.../1C.csv",
            options={
                "model": pybamm.lithium_ion.SPMe(),
                "simulation_kwargs": {
                    "solver_kwargs": {
                        "options": {"compile": True},
                        "atol": 1e-8,
                    },
                },
            },
        ),
    },
    parameters={...},
)

Enabling compile ahead of time ({"options": {"compile": True}}) trades a one-off compilation cost for faster repeated evaluations — useful when the same objective is solved many times during a fit or sweep.
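The merge rule described above (nested options merged key by key, other top-level keys replaced) can be sketched in a few lines. The function name and the default values are placeholders chosen for illustration; the real defaults are whatever the auto-built solver uses.

```python
def merge_solver_kwargs(defaults, overrides):
    """Merge user solver_kwargs over the defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if key == "options":
            # Nested options are merged so untouched defaults survive.
            merged["options"] = {**defaults.get("options", {}), **value}
        else:
            # Other top-level keys simply replace the default.
            merged[key] = value
    return merged


# Placeholder defaults, not the library's actual values.
defaults = {
    "atol": 1e-6,
    "rtol": 1e-6,
    "options": {"compile": False, "some_tuned_option": 42},
}

merged = merge_solver_kwargs(
    defaults, {"options": {"compile": True}, "atol": 1e-8}
)
print(merged["options"])  # {'compile': True, 'some_tuned_option': 42}
print(merged["atol"], merged["rtol"])  # 1e-08 1e-06
```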

Forwarding kwargs to the runtime solve

simulation_kwargs also accepts solve_kwargs, a dict forwarded to the runtime sim.solve(...) call on every objective evaluation. Use it for arguments that belong on the solve itself rather than the solver — for example starting_solution to warm-start from a previous solution, or any other pybamm.Simulation.solve argument.

solve_kwargs is applied regardless of whether the objective auto-built the solver or you supplied an explicit solver. It is the recommended way to pass solve-time arguments that work with any solver.
solver_kwargs (above) tunes the auto-built solver at construction time; solve_kwargs configures each solve call. The two are independent and can be combined.
Keys the objective controls directly — inputs, initial_soc, t_eval, t_interp, frequencies — are reserved and raise a ValueError if passed via solve_kwargs.
For CycleAgeing, save_at_cycles is derived automatically from the metrics; any value passed via solve_kwargs is ignored with a warning so that the cycles required by the metrics are preserved.

import pybamm
import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "pulse": iws.objectives.Pulse(
            data_input="file:.../pulse.csv",
            options={
                "model": pybamm.lithium_ion.SPMe(),
                "simulation_kwargs": {
                    # Tunes the auto-built solver (construction time):
                    "solver_kwargs": {"options": {"compile": True}},
                    # Forwarded to every sim.solve(...) call (runtime).
                    # prior_solution is a pybamm.Solution you obtained from an
                    # earlier sim.solve(...) — substitute your own:
                    "solve_kwargs": {"starting_solution": prior_solution},
                },
            },
        ),
    },
    parameters={...},
)
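The reserved-key rule and the CycleAgeing special case can be pictured as a validation step that runs before each solve. The helper below is a hypothetical sketch of those two rules; the warning category is an assumption, since this page only says that a warning is emitted.

```python
import warnings

RESERVED_SOLVE_KEYS = {"inputs", "initial_soc", "t_eval", "t_interp", "frequencies"}


def validate_solve_kwargs(solve_kwargs, is_cycle_ageing=False):
    """Reject reserved keys and return the kwargs that would be forwarded."""
    clashes = sorted(RESERVED_SOLVE_KEYS & set(solve_kwargs))
    if clashes:
        raise ValueError(f"solve_kwargs may not set reserved keys: {clashes}")
    cleaned = dict(solve_kwargs)
    if is_cycle_ageing and "save_at_cycles" in cleaned:
        # The objective derives this from the metrics, so drop the user value.
        warnings.warn(
            "save_at_cycles is derived from the metrics; ignoring the value "
            "passed via solve_kwargs",
            UserWarning,
            stacklevel=2,
        )
        del cleaned["save_at_cycles"]
    return cleaned


# Non-reserved keys pass through unchanged.
print(validate_solve_kwargs({"starting_solution": None}))
# {'starting_solution': None}
```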

`CycleAgeing`: automatic `store_first_last` for first/last-only metrics

CycleAgeing lets you supply metrics: a mapping from each objective variable to a .by_cycle() metric that pulls the value of interest out of the simulation. Defaults are provided for "LLI [%]", "LAM_ne [%]", and "LAM_pe [%]", all of which read a single per-step sample. When every metric in that mapping reads only the first or last sample of a step (that is, the defaults, or any First/Last .by_cycle() metric), CycleAgeing defaults solver_kwargs["store_first_last"] to True. The solver then stores only the endpoints of each step, which uses far less memory on long cycling solves and produces identical results for these metrics. The flag is only auto-set when it is safe to do so:

Metrics that read interior points (e.g. Mean(...).by_cycle()) leave the default off so no samples are dropped.
Composed metrics (arithmetic of First/Last) are conservatively left alone.
An explicit store_first_last in solver_kwargs is always respected.
Supplying your own solver skips solver-kwargs injection entirely (as elsewhere).

import pybamm
import ionworks_schema as iws

fit = iws.DataFit(
    objectives={
        "ageing": iws.objectives.CycleAgeing(
            data_input="file:.../ageing.csv",
            options={
                "model": pybamm.lithium_ion.SPM(),
                "experiment": "from data",
                "objective variables": ["LLI [%]", "LAM_ne [%]"],
                # Defaults already read first/last only, so store_first_last
                # is enabled automatically. Override explicitly when needed:
                # "simulation_kwargs": {
                #     "solver_kwargs": {"store_first_last": False},
                # },
            },
        ),
    },
    parameters={...},
)
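The decision rules listed above can be written down as a short function. This sketch replaces the real metric objects with plain labels ("first", "last", "interior", "composed"), and the function name is hypothetical; it mirrors the rules on this page rather than the library's code.

```python
def default_store_first_last(metric_kinds, solver_kwargs=None, explicit_solver=False):
    """Return the solver kwargs after the automatic store_first_last default.

    metric_kinds: one label per metric, each "first", "last", "interior",
    or "composed". Returns None when the user supplied their own solver.
    """
    if explicit_solver:
        # No solver-kwargs injection when the solver is user-supplied.
        return None
    kwargs = dict(solver_kwargs or {})
    if "store_first_last" in kwargs:
        # An explicit user choice always wins.
        return kwargs
    if all(kind in ("first", "last") for kind in metric_kinds):
        kwargs["store_first_last"] = True
    return kwargs


print(default_store_first_last(["last", "last"]))      # {'store_first_last': True}
print(default_store_first_last(["last", "interior"]))  # {}
print(default_store_first_last(["first"], {"store_first_last": False}))
# {'store_first_last': False}
```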

For most optimisers, SSE is the safest choice: it has both a residual-array form and a scalar form, so it’s compatible with every algorithm. Use MSE or RMSE when you want a reported value that does not grow with the number of data points.