# Fixtures

> The test harness. Pure transform comparisons against canonical JSON.

A fixture is a `(raw_<label>.json, expected_<label>.json)` pair. The harness diffs `pipeline.transform(raw).model_dump(mode="json")` against `expected`. No network, no clocks, no RNG.

`esker test` runs fixtures locally. `esker push` runs the same harness as a pre-seal gate: push refuses to proceed if any fixture fails, or if there are no fixtures at all (unless you pass `--force-untested`).

## Layout

The harness probes two layouts, in order:

**Package form**: `<pipeline_package>/fixtures/`. Used when the pipeline is a package (multiple files, one of them `__init__.py`). Convention for the [three-class form](https://esker.so/docs/sdk/three-class-form.md).

```
my_pipelines/us_treasury_yields/
├── __init__.py
├── source.py
├── schema.py
├── pipeline.py
└── fixtures/
    ├── raw_basic.json
    └── expected_basic.json
```

**Single-file form**: `<pipeline_file>_fixtures/` sibling dir. Used when the pipeline is a single `.py` file. Convention for the [decorator form](https://esker.so/docs/sdk/pipelines.md).

```
my_pipelines/
├── sec_companies.py
└── sec_companies_fixtures/
    ├── raw_basic.json
    ├── expected_basic.json
    ├── raw_short_cik.json
    └── expected_short_cik.json
```

The first existing layout wins. Missing fixtures dir → `esker test` prints `no fixtures` (dim) and treats the pipeline as untested-but-not-failed. `esker push` treats no-fixtures as a failure unless you pass `--force-untested`.
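The probe order can be pictured with a short sketch. This is not the harness's code: `find_fixtures_dir` is a hypothetical helper, and it assumes the single-file directory name is the file stem plus `_fixtures`, as in the `sec_companies` example.

```python
from pathlib import Path
from typing import Optional


def find_fixtures_dir(pipeline_path: Path) -> Optional[Path]:
    """Return the first existing fixtures dir, or None (hypothetical helper)."""
    candidates = [
        # Package form: <pipeline_package>/fixtures/
        pipeline_path / "fixtures",
        # Single-file form: sibling <stem>_fixtures/ next to the .py file
        pipeline_path.with_name(pipeline_path.stem + "_fixtures"),
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate  # first existing layout wins
    return None  # caller reports "no fixtures"
```

A `None` result corresponds to the `no fixtures` case: not a failure for `esker test`, a failure for `esker push` without `--force-untested`.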

## File naming

```
raw_<label>.json        the raw input (whatever your source yields as Fetched.raw)
expected_<label>.json   the expected transform(raw).model_dump(mode="json")
```

`<label>` is arbitrary. Use names that describe the case: `basic`, `falcon1`, `short_cik`, `null_rate`. The harness pairs files by the `<label>` portion.
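A minimal sketch of label pairing, assuming the label is simply the file stem with the `raw_` or `expected_` prefix removed. `pair_fixtures` is a hypothetical name, not part of the SDK.

```python
from pathlib import Path


def pair_fixtures(fixtures_dir: Path) -> dict[str, dict[str, Path]]:
    """Group raw_/expected_ files by label (illustrative, not the harness code)."""
    pairs: dict[str, dict[str, Path]] = {}
    for path in sorted(fixtures_dir.glob("*.json")):
        for kind in ("raw", "expected"):
            prefix = kind + "_"
            if path.name.startswith(prefix):
                label = path.stem[len(prefix):]
                pairs.setdefault(label, {})[kind] = path
    return pairs
```

A label that ends up with only a `"raw"` entry has no expected file yet; a label with only an `"expected"` entry is an orphan.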

## Canonical JSON

Both reading and writing use canonical JSON: `indent=2`, `sort_keys=True`, `ensure_ascii=False`. So `expected_*.json` files are stably ordered and diff-friendly.
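For illustration, those settings map directly onto the standard library's `json.dumps`. `canonical_json` is a hypothetical wrapper name; whether the harness adds a trailing newline is not specified here, so this sketch adds none.

```python
import json


def canonical_json(obj) -> str:
    """Serialize with indent=2, sorted keys, and non-ASCII characters kept as-is."""
    return json.dumps(obj, indent=2, sort_keys=True, ensure_ascii=False)


# Key order in the input does not affect the output.
a = canonical_json({"ticker": "AAPL", "cik": "0000320193", "title": "Apple Inc."})
b = canonical_json({"title": "Apple Inc.", "cik": "0000320193", "ticker": "AAPL"})
assert a == b
```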

Example `expected_basic.json` for the SEC pipeline:

```json
{
  "cik": "0000320193",
  "ticker": "AAPL",
  "title": "Apple Inc."
}
```

Note: no `esker_id`, no `esker_source_url`, no `esker_lineage_id`, no `schema_version`. Those are injected at `pipeline.run()` time, not by `transform`. Fixtures only contain author-domain fields.

## What gets compared

```python
actual = pipeline.transform(raw).model_dump(mode="json")
```

That's the **draft** model, without the fields injected at `pipeline.run()` time (`esker_*`, `schema_version`). `mode="json"` means `date` → ISO string, `datetime` → ISO string, `UUID` → string. The same form that lands in parquet.

`transform` must be a **pure function**. The harness compares the output exactly, so any clock-, RNG-, or environment-dependent value will make the fixture fail.
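A small sketch of why purity matters. Plain dicts stand in for the draft model, and both functions and the `row_id` field are hypothetical, not taken from a real pipeline.

```python
import uuid


def impure_transform(raw: dict) -> dict:
    # Generates a fresh random id on every call, so no fixed expected file can match.
    return {"cik": raw["cik"].zfill(10), "row_id": str(uuid.uuid4())}


def pure_transform(raw: dict) -> dict:
    # Output depends only on the input, so it can be pinned in an expected file.
    return {"cik": raw["cik"].zfill(10)}


raw = {"cik": "320193"}
assert pure_transform(raw) == pure_transform(raw)
assert impure_transform(raw) != impure_transform(raw)
```

If a record genuinely needs an id or a timestamp, derive it from the raw input rather than generating it inside `transform`.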

## Failure reasons

| reason             | when                                                                                |
| ------------------ | ----------------------------------------------------------------------------------- |
| `mismatch`         | actual ≠ expected. Detail: unified diff (`fromfile=expected`, `tofile=actual`).     |
| `raised`           | `transform(raw)` threw. Detail: `<Type>: <msg>`.                                    |
| `missing_expected` | `raw_<label>.json` exists, no matching `expected_<label>.json`, and not `--update`. |
| `orphan_expected`  | `expected_<label>.json` exists, no matching `raw_<label>.json`.                     |

`orphan_expected` catches the common mistake of renaming a `raw_*.json` and forgetting the matching expected (or vice versa).
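The `mismatch` detail can be approximated with the standard library's `difflib.unified_diff`. This is a sketch under the assumption that both sides are rendered as canonical JSON before diffing; `mismatch_detail` is a hypothetical name.

```python
import difflib
import json


def mismatch_detail(expected: dict, actual: dict) -> str:
    """Unified diff of two canonical JSON renderings; empty string when equal."""
    def render(obj) -> list[str]:
        text = json.dumps(obj, indent=2, sort_keys=True, ensure_ascii=False)
        return text.splitlines()

    diff = difflib.unified_diff(
        render(expected),
        render(actual),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    )
    return "\n".join(diff)


expected = {"cik": "0000320193", "ticker": "AAPL", "title": "Apple Inc."}
actual = {"cik": "320193", "ticker": "AAPL", "title": "Apple Inc."}
print(mismatch_detail(expected, actual))
```

Lines prefixed with `-` come from the expected file and lines prefixed with `+` from the transform output.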

## `--update`

```
no expected file       → write it
mismatch               → overwrite
existing matching      → leave alone
orphan_expected        → still flagged as failed (even with --update)
```

Destructive — overwrites without prompting. The standard workflow is "make a code change, run with `--update`, inspect the git diff."
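The first three rules can be sketched for a single fixture as follows. `update_expected` and its return values are hypothetical, and the sketch assumes `actual_text` is already canonical JSON.

```python
import tempfile
from pathlib import Path


def update_expected(expected_path: Path, actual_text: str) -> str:
    """Apply the --update rules to one fixture (sketch); returns what happened."""
    if not expected_path.exists():
        expected_path.write_text(actual_text, encoding="utf-8")
        return "wrote"
    if expected_path.read_text(encoding="utf-8") != actual_text:
        # Mismatch: overwrite without prompting.
        expected_path.write_text(actual_text, encoding="utf-8")
        return "overwrote"
    return "unchanged"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "expected_basic.json"
    first = update_expected(target, '{\n  "cik": "0000320193"\n}')
    second = update_expected(target, '{\n  "cik": "0000320193"\n}')
    third = update_expected(target, '{\n  "cik": "320193"\n}')
```

An orphan never reaches this step: with no `raw_<label>.json` there is nothing to transform, which is why it stays a failure.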

## Running

```sh
$ esker test us.sec.companies
  us.sec.companies@2.0.0
  2 passed · 0.0s
```

No domain → iterate every registered pipeline:

```sh
$ esker test
  global.spacex.rockets@1.0.0
  2 passed · 0.0s

  us.sec.companies@2.0.0
  2 passed · 0.0s

  us.treasury.yields@2.0.0
  2 passed · 0.0s
```

On failure:

```
  us.sec.companies@2.0.0
  1 passed · 1 failed · 0.0s

  mismatch: short_cik
  --- expected
  +++ actual
  @@ -1,5 +1,5 @@
   {
  -  "cik": "0000320193",
  +  "cik": "320193",
     "ticker": "AAPL",
     "title": "Apple Inc."
   }
```

Exit 0 if every fixture passed. Exit 1 if anything failed.

## Programmatic API

```python
from esker import run_fixtures, FixtureReport, FixtureFailure


report: FixtureReport = run_fixtures(MyPipeline, update=False)
report.ok            # bool
report.passed        # list[str] of labels
report.failed        # list[FixtureFailure]
report.wrote         # list[str] of labels (for --update mode)
```

The CLI uses this directly, as does `esker push`'s pre-seal gate.
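A sketch of consuming a report in a script or CI step. So that it runs without `esker` installed, stub classes stand in for `FixtureReport` and `FixtureFailure`. The failure fields `label`, `reason`, and `detail` are assumptions, as is `ok` meaning "no failures"; check the real classes before relying on them.

```python
from dataclasses import dataclass, field


@dataclass
class StubFailure:
    # Stand-in for FixtureFailure; these field names are assumptions.
    label: str
    reason: str
    detail: str = ""


@dataclass
class StubReport:
    # Stand-in exposing the FixtureReport attributes listed above.
    passed: list = field(default_factory=list)
    failed: list = field(default_factory=list)
    wrote: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # Assumption: a report is ok when nothing failed.
        return not self.failed


def exit_code(report) -> int:
    """Map a report to the CLI convention: 0 when all passed, 1 otherwise."""
    for failure in report.failed:
        print(f"{failure.reason}: {failure.label}")
    return 0 if report.ok else 1
```

With the real API, the same function would take the result of `run_fixtures(MyPipeline, update=False)` and pass its return value to `sys.exit`.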

## See also

- [Publishing](https://esker.so/docs/sdk/publishing.md) — the push-time fixture gate
- [Pipelines](https://esker.so/docs/sdk/pipelines.md) — single-file form (single-file fixtures)
- [Three-class form](https://esker.so/docs/sdk/three-class-form.md) — package form (package fixtures)
