Errors and footguns
Every error path, verbatim, plus the non-obvious behaviors worth knowing in advance.
This page collects the exact error messages the CLI emits and the behaviors that surprise people. Read top-to-bottom once; come back when something goes red.
Error format
Errors render as two lines. The first, in red, gives the exception type and message. The second, dimmed, is a → file:line pointer into your code.
ValueError: esker_id jurisdiction 'us' does not match DOMAIN_ID jurisdiction 'ca'
→ my_pipelines/sec_companies.py:34
-v / --verbose adds the full traceback after a blank line.
The pointer identifies the user frame: the deepest stack frame that belongs to your pipeline file, not one inside the SDK internals.
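A minimal sketch of that frame selection, assuming the stack is available as a standard-library traceback summary ordered oldest-first. The helper name pick_user_frame, the exact-filename match, and the esker/... file paths in the sample stack are illustrative assumptions, not the SDK's actual code.

```python
import traceback
from typing import Iterable, Optional


def pick_user_frame(
    frames: Iterable[traceback.FrameSummary], pipeline_file: str
) -> Optional[str]:
    """Return a '→ file:line' pointer for the deepest frame in pipeline_file.

    Hypothetical helper: `frames` is ordered oldest-first, as returned by
    traceback.extract_tb(), so the deepest user frame is the last match.
    """
    for frame in reversed(list(frames)):
        if frame.filename == pipeline_file:
            return f"→ {frame.filename}:{frame.lineno}"
    return None  # no frame from the pipeline file was found


# Sample stack (made-up SDK paths): entry point, two user frames, SDK internals.
stack = [
    traceback.FrameSummary("esker/cli.py", 12, "main", lookup_line=False),
    traceback.FrameSummary("my_pipelines/sec_companies.py", 20, "run", lookup_line=False),
    traceback.FrameSummary("my_pipelines/sec_companies.py", 34, "transform", lookup_line=False),
    traceback.FrameSummary("esker/model.py", 88, "validate", lookup_line=False),
]
pointer = pick_user_frame(stack, "my_pipelines/sec_companies.py")
```

Walking the stack in reverse is what makes line 34 win over line 20: both frames are in the pipeline file, but only the deeper one is reported.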
Pipeline lookup
$ esker run nonexistent.foo
No pipeline registered for domain 'nonexistent.foo'
test, check, push, and schema print the same message. The underlying cause is a KeyError from the registry.
Bare-name resolution
$ esker view us.sec.companies # no binding
no binding for 'us.sec.companies' · run 'esker add <owner>/us.sec.companies'
or use a full ref
Raised by bindings.resolve as UnboundDatasetError. Add a binding or use a full ref.
Schema with no binding (special case)
$ esker schema us.sec.companies # registered locally, no binding
us.sec.companies@1.0.0
via local
<field-table>
esker schema skips the bindings lookup entirely if the bare name matches a registered pipeline. The header drops the owner prefix because there isn't a published owner yet.
--remote forces the bindings lookup; an unbound name then fails as elsewhere.
Invalid refs
$ esker view Bad/Ref
ValueError: invalid owner 'Bad'
DatasetRef.__post_init__ validates owner → name → version in that order. Uppercase fails the regex.
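The ordering matters when more than one part is invalid: only the first failure is reported. The sketch below shows the idea with a stand-in dataclass. The three regexes and the 'invalid name' / 'invalid version' messages are assumptions for illustration; only the 'invalid owner' message appears above.

```python
import re
from dataclasses import dataclass

# Assumed patterns, for illustration only; the real regexes are not shown here.
OWNER_RE = re.compile(r"[a-z0-9][a-z0-9-]*")
NAME_RE = re.compile(r"[a-z0-9]+(\.[a-z0-9]+)+")
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


@dataclass(frozen=True)
class DatasetRefSketch:
    """Stand-in for DatasetRef, not the real class."""

    owner: str
    name: str
    version: str = "1.0.0"

    def __post_init__(self) -> None:
        # Checks run in a fixed order, so the first invalid part wins.
        if not OWNER_RE.fullmatch(self.owner):
            raise ValueError(f"invalid owner {self.owner!r}")
        if not NAME_RE.fullmatch(self.name):
            raise ValueError(f"invalid name {self.name!r}")
        if not VERSION_RE.fullmatch(self.version):
            raise ValueError(f"invalid version {self.version!r}")


def first_error(owner: str, name: str) -> str:
    """Return the validation message for owner/name, or 'ok'."""
    try:
        DatasetRefSketch(owner, name)
    except ValueError as exc:
        return str(exc)
    return "ok"
```

With Bad/Ref, both the owner and the name are invalid under these assumed patterns, but the owner check runs first, so that is the only error you see.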
Auth gate
$ esker push us.sec.companies # no creds
not signed in — run 'esker login'
Raised by auth.auth_header() as CredentialsError (subclass of HubError, but renders without the hub 0: prefix).
esker check requires either credentials or --owner:
$ esker check us.sec.companies # neither
not signed in — run 'esker login'
Whoami without credentials
$ esker whoami # no creds
not signed in
→ run 'esker login'
Two lines: the message in red, then a dim hint.
Hub down (network)
$ esker manifest archie/us.sec.companies
RemoteDisconnected: Remote end closed connection without response
$ esker check us.sec.companies
ConnectionRefusedError: [Errno 61] Connection refused
Transport failures are wrapped at the hub.py boundary as HubUnreachableError(HubError), so every CLI command's except HubError handler catches 4xx/5xx responses and unreachable-hub failures uniformly.
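A sketch of that wrapping pattern, using only the standard library. The class names HubError and HubUnreachableError come from the text above; the call_hub wrapper and the two failing callables are hypothetical and stand in for whatever hub.py actually does.

```python
import http.client


class HubError(Exception):
    """Base class for hub failures."""


class HubUnreachableError(HubError):
    """Transport-level failure: the hub could not be reached."""


def call_hub(request):
    """Run request() and convert transport errors into HubUnreachableError.

    Hypothetical wrapper: `request` is any zero-argument callable that talks
    to the hub. OSError covers ConnectionRefusedError; HTTPException covers
    http.client failures such as RemoteDisconnected.
    """
    try:
        return request()
    except (OSError, http.client.HTTPException) as exc:
        raise HubUnreachableError(f"{type(exc).__name__}: {exc}") from exc


def refused():
    raise ConnectionRefusedError(61, "Connection refused")


def dropped():
    raise http.client.RemoteDisconnected(
        "Remote end closed connection without response"
    )


caught = []
for failing_request in (refused, dropped):
    try:
        call_hub(failing_request)
    except HubError as exc:  # one handler covers every hub failure
        caught.append(type(exc.__cause__).__name__)
```

Chaining with `from exc` keeps the original transport exception available as `__cause__`, which is what a verbose traceback would show.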
Owner handle validation
$ esker config set-owner BadHandle
invalid handle 'BadHandle'
$ esker config set-owner api
invalid handle 'api'
Same red message for length, regex, and reserved-word failures. The message doesn't say why it's invalid — see Handles.
Visibility validation
$ esker visibility archie/foo unknown
setting must be 'public' or 'private'
$ esker visibility archie/foo private
private not yet supported · landing in phase 2
Phase 1 only accepts public. private exits 1 without contacting the server.
EskerModel construction
>>> SecCompany(cik="0000320193", esker_id="esker:us:corp:0000320193")
ValidationError: 1 validation error for SecCompany
esker_id
Extra inputs are not permitted [type=extra_forbidden, ...]
extra="forbid" blocks setting esker_* on the draft. The injection happens inside EskerPipeline.run(); user code can't bypass it.
Subclass enforcement
>>> class Bad(EskerModel):
... x: int = 0
TypeError: Bad must declare a DOMAIN_ID class variable
>>> class Bad2(EskerModel):
... DOMAIN_ID: ClassVar[str] = 'BAD-ID'
... schema_version: ClassVar[str] = '1.0.0'
TypeError: Bad2.DOMAIN_ID 'BAD-ID' must match
^[a-z0-9]+(\.[a-z0-9]+)+$ (lowercase a-z0-9, dot-separated)
>>> NoVersion.declared_version() # no schema_version ClassVar
ValueError: NoVersion must declare a schema_version class variable
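These checks can be pictured as an __init_subclass__ hook plus a lazy classmethod. The sketch below is a plain-Python stand-in (no Pydantic) that reuses the messages and the DOMAIN_ID regex shown above; the class name ModelSketch and the try_define helper are assumptions, and the real EskerModel may enforce this differently.

```python
import re

DOMAIN_ID_RE = re.compile(r"[a-z0-9]+(\.[a-z0-9]+)+")


class ModelSketch:
    """Stand-in for EskerModel: a plain class, not a Pydantic model."""

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        domain_id = getattr(cls, "DOMAIN_ID", None)
        if domain_id is None:
            raise TypeError(f"{cls.__name__} must declare a DOMAIN_ID class variable")
        if not DOMAIN_ID_RE.fullmatch(domain_id):
            raise TypeError(
                f"{cls.__name__}.DOMAIN_ID {domain_id!r} must match "
                r"^[a-z0-9]+(\.[a-z0-9]+)+$ (lowercase a-z0-9, dot-separated)"
            )

    @classmethod
    def declared_version(cls) -> str:
        # Checked on call, not at class-definition time.
        version = getattr(cls, "schema_version", None)
        if version is None:
            raise ValueError(
                f"{cls.__name__} must declare a schema_version class variable"
            )
        return version


def try_define(domain_id):
    """Define a subclass named Bad and return the TypeError message, or 'ok'."""
    namespace = {} if domain_id is None else {"DOMAIN_ID": domain_id}
    try:
        type("Bad", (ModelSketch,), namespace)
    except TypeError as exc:
        return str(exc)
    return "ok"
```

The split explains the two exception types: DOMAIN_ID problems stop the class statement itself with TypeError, while a missing schema_version only surfaces as ValueError when declared_version() is called.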
Decorator validation
All TypeError. All fire at module-import (decoration) time:
TypeError: @pipeline ref must be '<domain>@<semver>', got 'badref'
TypeError: @pipeline entity_type must match /^[a-z]+$/, got 'Corp1'
TypeError: @pipeline requires exactly one of `url=` or `source=`
TypeError: @pipeline key='nonexistent' is not a field on E
TypeError: @pipeline class NoTransform must define `transform(cls, raw) -> cls` as a classmethod
TypeError: @pipeline decorates plain classes; use the explicit three-class form when subclassing EskerModel directly.
These propagate up from the entry-point load. The CLI command exits 1.
Source URL template misuse
KeyError: source_url template 'https://example.com/{nonexistent_field}'
references field 'nonexistent_field' which is not on X
(available: ['name', 'wid'])
The pipeline wraps str.format's KeyError with the template, the missing key, and the available draft fields.
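A minimal sketch of that wrapping, assuming the draft fields are available as a dict. The function name render_source_url and the sample field values are hypothetical; the message layout follows the error shown above.

```python
def render_source_url(template: str, fields: dict, model_name: str) -> str:
    """Format `template` with draft fields, re-raising a descriptive KeyError."""
    try:
        return template.format(**fields)
    except KeyError as exc:
        missing = exc.args[0]  # the placeholder name str.format could not find
        raise KeyError(
            f"source_url template {template!r} references field {missing!r} "
            f"which is not on {model_name} (available: {sorted(fields)})"
        ) from exc


draft = {"wid": "Q42", "name": "Example"}  # made-up draft values
ok_url = render_source_url("https://example.com/{wid}", draft, "X")

try:
    render_source_url("https://example.com/{nonexistent_field}", draft, "X")
except KeyError as exc:
    message = exc.args[0]
```

Reading `exc.args[0]` rather than `str(exc)` avoids the extra quoting that KeyError adds when it is converted to a string.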
Compat (push-time)
field 'cik': pattern '^\d{10}$' → '^\d{8}$'
required bump: major
CompatError rendering: each breaking change on its own line, then the message in red.
major bump 1.0.0 → 2.0.0 requires --force-major
Pass --force-major if you mean it.
archie/us.sec.companies@1.0.0 already published with a different schema; bump schema_version
Raised when you re-publish the same version with any schema change, breaking or additive.
Fixture failure
$ esker test
global.spacex.rockets@1.0.0
0 passed · 2 failed · 0.0s
mismatch: falcon1
--- expected
+++ actual
@@ ...
See Fixtures for the four failure reasons.
Footguns
Non-obvious behaviors. Worth knowing in advance.
schema_version: SemVer = "1.0.0" silently breaks the model
The most-flagged pitfall. Writing it as a Pydantic field instead of a ClassVar turns schema_version into a per-record column, breaks declared_version(), and pollutes the JSON Schema. Always use:
schema_version: ClassVar[str] = "1.0.0"
The decorator path bypasses this trap entirely.
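The field-versus-ClassVar distinction can be seen with standard-library dataclasses, which use the same typing.ClassVar marker to decide what counts as a per-instance field. This is an analogy under that assumption, not EskerModel itself; the class names are made up.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class WrongRecord:
    cik: str
    schema_version: str = "1.0.0"  # plain annotation: becomes a per-record field


@dataclass
class RightRecord:
    cik: str
    schema_version: ClassVar[str] = "1.0.0"  # class-level metadata, not a field


wrong_fields = [f.name for f in fields(WrongRecord)]
right_fields = [f.name for f in fields(RightRecord)]
```

In WrongRecord the version travels with every record and shows up wherever fields are enumerated; in RightRecord it stays on the class.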
Two consecutive runs produce different content_hash
Because Fetched.fetched_at and per-batch lineage_id change every run, the parquet bytes change, and content_hash changes. The compat engine doesn't care — it diffs JSON Schemas, not parquet bytes — but a user expecting identical bytes for identical inputs will be surprised.
The supersedes chain is the right way to think about re-publishes: each push is a new run with a new content hash, linked back to the previous run at the same version.
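The effect is easy to reproduce in miniature. The sketch below hashes a JSON encoding as a stand-in for parquet bytes (an assumption; the real content_hash is computed over the parquet file) and stamps each hypothetical run with a different fetch time.

```python
import hashlib
import json


def content_hash(records: list) -> str:
    """SHA-256 over a canonical JSON encoding (stand-in for parquet bytes)."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run(fetched_at: str) -> list:
    """Hypothetical run: same source data, stamped with the fetch time."""
    return [{"cik": "0000320193", "fetched_at": fetched_at}]


first = run("2024-01-01T00:00:00Z")
second = run("2024-01-02T00:00:00Z")

# Same inputs, different timestamps: the hashes differ.
hashes_differ = content_hash(first) != content_hash(second)
# The column names, which a schema-level diff looks at, are unchanged.
same_columns = first[0].keys() == second[0].keys()
```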
esker sync reports drift but doesn't fix it
esker sync prints <name> · hash drift · run 'esker upgrade <name>' when the lockfile's content_hash differs from the hub's latest. It doesn't auto-upgrade. Run upgrade per drifted name.
That's intentional — drift is a security signal, not a routine event — but worth knowing.
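A report-only check of this kind might look like the sketch below. The dict shapes for the lockfile and the hub response are assumptions made for illustration; only the output line format comes from the text above.

```python
def sync_report(lockfile: dict, hub_latest: dict) -> list:
    """Return one drift line per name whose locked hash differs from the hub's.

    Hypothetical shapes: both arguments map dataset name -> content_hash.
    Nothing is modified; the caller decides whether to upgrade.
    """
    lines = []
    for name, locked_hash in sorted(lockfile.items()):
        latest = hub_latest.get(name)
        if latest is not None and latest != locked_hash:
            lines.append(f"{name} · hash drift · run 'esker upgrade {name}'")
    return lines


lock = {"us.sec.companies": "aaa", "global.spacex.rockets": "bbb"}
hub = {"us.sec.companies": "ccc", "global.spacex.rockets": "bbb"}
report = sync_report(lock, hub)
```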
esker config set-owner doesn't write anything
It validates the handle and prints a paste-able snippet. If you're expecting state mutation, you'll be surprised.
esker config set-handle makes the local creds stale
It only PATCHes the server. The local ~/.esker/credentials owner_handle field is unchanged. Subsequent pushes use the cached old handle until you re-login. The success message says so but it's easy to miss.
BulkJsonSource re-fetches on every run
No bulk-cache primitive. Per-id sources can use fetch_cached; bulk sources hit the network every time. Big payloads get re-downloaded for every test run.
BulkJsonSource.SOURCE_ID = "bulk-json" (default)
If you subclass BulkJsonSource directly without setting SOURCE_ID, the manifest will record source_id="bulk-json" — meaningless. The decorator path overrides this to domain_id, so you only hit it if you go three-class with BulkJsonSource as base. Always set SOURCE_ID explicitly.
esker list reads ./output/<domain>.parquet for "last run"
The "last run" timestamp comes from the mtime of ./output/<domain>.parquet. The path is fixed and ignores --output, so a user who runs with -o data/ will always see "never run".
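The behavior follows directly from a fixed-path lookup. In the sketch below, the function name last_run and the ISO 8601 formatting are assumptions; the directory is a parameter only so the example can run in a temporary folder, whereas the CLI as described always looks in ./output.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def last_run(domain: str, output_dir: Path = Path("output")) -> str:
    """Return the parquet file's mtime as ISO 8601, or 'never run'."""
    path = output_dir / f"{domain}.parquet"
    if not path.exists():
        return "never run"
    mtime = path.stat().st_mtime
    return datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()


with tempfile.TemporaryDirectory() as tmp:
    default_dir = Path(tmp) / "output"
    custom_dir = Path(tmp) / "data"  # where a run with `-o data/` would write
    default_dir.mkdir()
    custom_dir.mkdir()
    (custom_dir / "us.sec.companies.parquet").write_bytes(b"")

    # The lookup only checks the default directory, so the run is invisible.
    status = last_run("us.sec.companies", default_dir)
```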
JWT signature is not verified by the SDK
The SDK reads only the JWT's exp claim. The server validates the token on every request, so a tampered token passes local checks until it hits an authenticated endpoint.
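An exp-only check can be sketched with the standard library alone. The helper names below are hypothetical and the token is built by hand for the example; the point is that the third segment, the signature, is never inspected.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_exp(token: str) -> int:
    """Decode the payload segment and return `exp`; the signature is ignored."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return int(payload["exp"])


def looks_valid_locally(token: str, now: float) -> bool:
    """Local check in the style described above: not expired, nothing more."""
    return read_exp(token) > now


now = time.time()
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"exp": int(now) + 3600}).encode())

# The signature segment is garbage, yet the exp-only check still passes.
tampered = f"{header}.{payload}.not-a-real-signature"
passes_local_check = looks_valid_locally(tampered, now)
```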
Pattern-constrained strings lose their pattern in Arrow
Annotated[str, Field(pattern=r"^\d{10}$")] (CIK) renders as Arrow string with no constraint metadata. Pydantic validates on construction; parquet is inert.
Literal of one value confuses the compat checker
Pydantic emits Literal["x"] as {"const": "x", "type": "string"} (no enum keyword). But Literal["x", "y"] is {"enum": ["x", "y"], "type": "string"}.
A transition Literal["x", "y"] → Literal["x"] is reported as "enum keyword added or removed" (breaking) rather than "enum values removed". The classification is right; the message could be clearer.
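The two messages fall out of a keyword-level comparison. The sketch below is an assumed, simplified version of such a check (the function name and return strings are illustrative, not the compat engine's code), applied to the two schema shapes quoted above.

```python
def classify_enum_change(old: dict, new: dict) -> str:
    """Classify a field-schema change by looking only at the `enum` keyword."""
    old_has, new_has = "enum" in old, "enum" in new
    if old_has != new_has:
        return "enum keyword added or removed"
    if old_has and set(old["enum"]) - set(new["enum"]):
        return "enum values removed"
    return "no enum change"


two_values = {"enum": ["x", "y"], "type": "string"}  # Literal["x", "y"]
one_value = {"const": "x", "type": "string"}         # Literal["x"]
single_enum = {"enum": ["x"], "type": "string"}      # hand-written, for contrast

literal_narrowing = classify_enum_change(two_values, one_value)
enum_narrowing = classify_enum_change(two_values, single_enum)
```

Because the single-value form uses const instead of enum, the checker sees the keyword disappear before it ever gets to compare values.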
See also
- Compatibility — what gets blocked at push time
- Records — the ClassVar discipline
- CLI overview — the visual contract behind the error format