---
openTEPES documentation master file, created by Andres Ramos
---

# Multiple runs

Many studies run the same model over many related cases — scenario ensembles, demand or fuel-cost scans, Monte-Carlo samples. **openTEPES** offers three modes for this, trading set-up cost for reuse, all driven by one orchestrator (`openTEPES_Runner.run`). A single-case run reproduces a direct `openTEPES_run`.

The code calls a set of related runs a *sweep*: `openTEPES_Runner` is the sweep runner, the three modes below are the sweep modes A/B/C, and the merged output tables are named `oT_Sweep_*`.

| Mode | What each worker repeats | Cost per case | When to use |
| ---- | ------------------------ | ------------- | ----------- |
| A — pre-build | read input + build + solve | full | cases that differ structurally, each in its own folder |
| B — in-memory overlay | build + solve (input read once) | build + solve | one baseline perturbed by a scale factor or a table patch |
| C — post-build hot-swap | solve only (built once) | solve | one built model re-solved over a range of parameter values |

```{image} ../img/sweep_modes.png
:alt: The three sweep modes as a reuse matrix over the openTEPES pipeline; each mode reuses more stages, so each case costs less.
:width: 100%
:align: center
```

## Mode A — pre-build sweep

Each case reads its own input source, builds its own model, solves, and writes its own results. It is the most general mode: the cases may differ in any way, since nothing is shared between them.

A `Case` names one input source (a CSV case directory **or** a `.duckdb` file) plus an optional output directory and label. Cases that share the same `(dir_name, case_name)` need distinct `out_path` values so their result writers do not collide.

```python
from openTEPES import openTEPES_Cases, openTEPES_Runner

cases = [
    openTEPES_Cases.Case("cases", "SE2035_low"),
    openTEPES_Cases.Case("cases", "SE2035_ref"),
    openTEPES_Cases.Case("cases", "SE2035_high"),
]

summary = openTEPES_Runner.run(cases, solver_name="gurobi",
                               mode="pre-build", backend="multiprocessing", n_workers=8)
```

**Backends.** `backend="serial"` (default, no extra dependency) runs the cases one after another; `"multiprocessing"` spreads them across a standard-library `Pool` of `n_workers`; `"joblib"` uses `joblib.Parallel` (an optional dependency — `pip install joblib`). All three preserve input order in the returned list.

**Results come back as summaries, not models.** A Pyomo model cannot cross a worker-process boundary, so each worker writes its results to disk and `run` reads back that case's `openTEPES_run_status_*.json`, returning one summary dict per case. A case that raises is captured as `status="error"`, so one failure does not abort the run.

## Mode B — in-memory overlay

When every case starts from one baseline, re-reading the input for each is wasteful. Mode B reads the baseline once into memory and re-uses it, so the run pays the input I/O once and only rebuilds and solves per case.

**How the baseline is shared.** The runner calls `InMemorySource.materialize(open_source(path))` once per distinct baseline, holding every input table in RAM as a DataFrame. Each case gets a cheap clone via `with_overlay(...)` that shares those frames. Under the `multiprocessing` backend on Unix the workers are forked, so they see the baseline through copy-on-write — no extra memory, no re-read. On Windows the baseline is pickled once per task instead (still no CSV re-read, just without the copy-on-write saving). Mode B needs a distinct `out_path` per case, since the cases share the baseline directory.

**The overlay** maps a data-table stem (e.g. `"Demand"`) to one of:

- a **number** — multiply every numeric value of that table (a uniform scale, `1.10` for +10 %);
- a **`df -> df` callable** — any transform of the baseline frame;
- a **replacement DataFrame** — swap the table outright (in the wide, indexed shape the model reads).

Overlays target the `oT_Data_*` tables; the dimension tables pass through unchanged. An empty overlay reproduces a direct run exactly.

```python
openTEPES_Runner.run(
    [openTEPES_Cases.Case("cases", "SE2035_base", out_path="out/d105", overlay={"Demand": 1.05}),
     openTEPES_Cases.Case("cases", "SE2035_base", out_path="out/d110", overlay={"Demand": 1.10})],
    solver_name="gurobi", mode="in-memory", backend="multiprocessing", n_workers=8)
```

The diagram below traces one such run.

```{image} ../img/sweep_mode_b.png
:alt: Mode B flow — the baseline is read once into an InMemorySource, each worker patches a table and rebuilds, and the results are merged by aggregate().
:width: 100%
:align: center
```

## Mode C — post-build hot-swap

Mode C is the cheapest and, despite handling the largest runs, the simplest in code: it builds the model **once** and only re-solves it, one overlay at a time. It is a plain serial loop in a single process — no worker pool, no forking — so it runs unchanged on every platform. Its speed comes not from parallelism but from skipping the rebuild, the slowest step on large cases. It is driven directly by `openTEPES_ProblemSolvingResolve.resolve`, not through `openTEPES_Runner`.

```python
from openTEPES.openTEPES_ProblemSolvingResolve import resolve, overlay_scaled

# OptModel is an already-built model (from openTEPES_run's build step)
results = resolve(OptModel, "gurobi",
                  overlays=[overlay_scaled(OptModel, "pDemandElec", 1.05),
                            overlay_scaled(OptModel, "pDemandElec", 1.10)])
```

**How the hot-swap works.** Each overlay is a dict mapping a Param name to new values. For each overlay, `resolve` writes the new numbers into the built model with `Param.store_values(...)` and re-solves. This works only for Params declared `mutable=True`, so Mode C sweeps the operational set only — `pDemandElec`, `pENSCost`, `pLinearVarCost`, `pEFOR`, `pReserveMargin`, `pRESEnergy`. Structural and topology Params stay immutable (a mutable Param read in a build-time `if` would raise), so anything that changes the model's structure needs Mode A or B.

**The solver must be non-persistent.** `resolve` makes a fresh `SolverFactory(SolverName)`; a non-persistent solver re-exports the model from the current Param values on every `solve`, so `store_values` takes effect with no extra call. Pass a plain solver name (`"gurobi"`, `"highs"`), not a persistent one.

**Overlays are independent, not cumulative.** `resolve` snapshots the baseline of every touched Param and resets to it before each overlay, so overlay *k* does not build on overlay *k−1*. An empty overlay `{}` re-solves the baseline; `restore=True` (default) restores the baseline at the end. `overlay_scaled(model, name, factor)` builds a scale-factor overlay for one Param.

`resolve` returns one dict per overlay with the solver termination condition and the total system cost (`None` if the solve was not optimal). It solves one overlay at a time; to run a large Monte-Carlo in parallel, wrap `resolve` in your own worker pool.

## Collecting sweep results

`openTEPES_ResultAggregate.aggregate` stacks a sweep's cases into one long table per result, with a leading `case` column, reading either the per-case DuckDB files or the per-case CSV folders:

```python
from openTEPES.openTEPES_ResultAggregate import aggregate

aggregate(sources=["out/d105", "out/d110"], out_path="out/sweep", to="duckdb")
```

`openTEPES_Runner.run` can do this in one step with `aggregate_to=...`. Per-case DuckDB output keeps a sweep single-writer-safe (one database per case). See {doc}`OutputResults` for the `output_format` option that writes those per-case DuckDB files.