Compatibility
How schema changes are classified and what bumps are required.
A published schema is consumer-facing. Silently changing it would break consumers, so Esker has a compat engine that diffs the proposed schema against the last published one and decides what's allowed.
The engine lives in esker.schemas.compat (pure functions, no I/O); the I/O wrapping for the push gate lives in esker.client.compat. Both esker check and esker push route through the same logic.
What the engine does
For every push at a non-major version bump, the engine:
- Fetches the prior schema.json from the hub.
- Normalizes both schemas: inlines $ref from $defs, strips doc-only keys (title, description, default, $defs, examples), canonicalizes shape variants.
- Walks properties and required recursively, classifying each change as breaking or additive.
- Decides the minimum required SemVer bump.
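The normalization step can be pictured with a minimal sketch. It is not the code in esker.schemas.compat: the function name normalize is hypothetical, it only handles local "#/$defs/..." references, it assumes definitions are not recursive, and it leaves out shape canonicalization entirely.

```python
from copy import deepcopy
from typing import Any

# Doc-only keys named in the text; they are dropped before diffing.
DOC_ONLY_KEYS = {"title", "description", "default", "$defs", "examples"}


def normalize(schema: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical normalizer: inline local $ref targets, strip doc-only keys."""
    defs = schema.get("$defs", {})

    def walk(node: Any) -> Any:
        if isinstance(node, list):
            return [walk(item) for item in node]
        if not isinstance(node, dict):
            return node
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#/$defs/"):
            # Replace the reference with a normalized copy of its target.
            # Assumption: definitions are not recursive.
            return walk(deepcopy(defs[ref.removeprefix("#/$defs/")]))
        out: dict[str, Any] = {}
        for key, value in node.items():
            if key in DOC_ONLY_KEYS:
                continue
            if key == "properties" and isinstance(value, dict):
                # Keys here are field names, not keywords, so a field
                # called "title" or "default" must survive.
                out[key] = {name: walk(sub) for name, sub in value.items()}
            else:
                out[key] = walk(value)
        return out

    return walk(schema)
```

Two schemas that differ only in descriptions, defaults, or how they split definitions into $defs normalize to the same dictionary, which is why such edits are invisible to the diff.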
The output is a CompatReport:
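```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class CompatReport:
    breaking: list[str]
    additive: list[str]
    required_bump: Literal["patch", "minor", "major"]

    @property
    def compatible(self) -> bool:
        # Compatible means no breaking changes were found.
        return not self.breaking
```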
Mapping:
- Any breaking → required_bump = "major".
- Only additive → required_bump = "minor".
- Nothing → required_bump = "patch".
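As a sketch, the mapping reduces to a three-way branch. The function name required_bump_for is hypothetical; only the rule itself comes from the list above.

```python
from typing import Literal

Bump = Literal["patch", "minor", "major"]


def required_bump_for(breaking: list[str], additive: list[str]) -> Bump:
    """Hypothetical helper: map classified changes to the minimum SemVer bump."""
    if breaking:
        return "major"  # any breaking change forces a major bump
    if additive:
        return "minor"  # only additive changes
    return "patch"      # nothing consumer-visible changed
```

Note that a single breaking change wins regardless of how many additive changes accompany it.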
Classification rules
For each field, in order:
- In old, not in new → breaking (field 'X' removed).
- In new, not in old → additive if optional, breaking if required (field 'X' added [as required]).
- Required toggle: optional → required is breaking. Required → optional is additive.
- Type signature mismatch → breaking (field 'X': string → integer).
- Same object type → recurse into the nested schema.
- Same array<object> type → recurse into items.
- anyOf on both sides → pair the unique object/array members and recurse.
- Otherwise → constraint diff.
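The walk can be sketched roughly as follows. This is an illustration, not the engine's code: diff_fields and signature are hypothetical names, the "became required" and "became optional" messages are invented for the example, and the anyOf pairing and constraint diff are left out. The text does not say whether the engine stops at the first matching rule; this sketch reports a required toggle and then still compares types.

```python
from typing import Any


def signature(field: dict[str, Any]) -> str:
    """Hypothetical type signature, e.g. 'string', 'object', 'array<object>'."""
    kind = field.get("type")
    if kind == "array":
        return f"array<{field.get('items', {}).get('type')}>"
    return str(kind)


def diff_fields(
    old: dict[str, Any], new: dict[str, Any], prefix: str = ""
) -> tuple[list[str], list[str]]:
    """Return (breaking, additive) messages for two normalized object schemas."""
    breaking: list[str] = []
    additive: list[str] = []
    old_props, new_props = old.get("properties", {}), new.get("properties", {})
    old_req, new_req = set(old.get("required", [])), set(new.get("required", []))

    for name in sorted(old_props.keys() | new_props.keys()):
        path = f"{prefix}{name}"
        if name not in new_props:
            breaking.append(f"field '{path}' removed")
            continue
        if name not in old_props:
            if name in new_req:
                breaking.append(f"field '{path}' added as required")
            else:
                additive.append(f"field '{path}' added")
            continue
        if name in new_req and name not in old_req:
            breaking.append(f"field '{path}' became required")
        elif name in old_req and name not in new_req:
            additive.append(f"field '{path}' became optional")

        old_f, new_f = old_props[name], new_props[name]
        old_sig, new_sig = signature(old_f), signature(new_f)
        if old_sig != new_sig:
            breaking.append(f"field '{path}': {old_sig} → {new_sig}")
        elif old_sig == "object":
            b, a = diff_fields(old_f, new_f, f"{path}.")
            breaking += b
            additive += a
        elif old_sig == "array<object>":
            b, a = diff_fields(old_f["items"], new_f["items"], f"{path}[].")
            breaking += b
            additive += a
    return breaking, additive
```

Carrying the path prefix through the recursion keeps messages for nested fields unambiguous, for example a field added inside an array element is reported under its parent's name.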
Constraint diff
| change | classification |
|---|---|
| enum value removed | breaking |
| enum value added | additive |
| enum keyword added/removed | breaking |
| pattern changed (any way) | breaking |
| format changed (any way) | breaking |
| minLength / minimum tightened | breaking |
| minLength / minimum relaxed | additive |
| maxLength / maximum tightened | breaking |
| maxLength / maximum relaxed | additive |
Pattern and format changes are always breaking, even if the new pattern is strictly looser: the engine does not statically analyze regexes, so it cannot tell whether a new pattern accepts a superset of the old one.
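A minimal sketch of the table follows. The names diff_constraints and bound_change and the message strings are hypothetical, apart from "enum keyword added or removed". One assumption is not covered by the table: a bound that is newly added counts as tightened, and a bound that is dropped counts as relaxed.

```python
from typing import Any, Optional

# Which direction each bound constrains.
BOUNDS = {
    "minLength": "lower",
    "minimum": "lower",
    "maxLength": "upper",
    "maximum": "upper",
}


def bound_change(
    kind: str, old_v: Optional[float], new_v: Optional[float]
) -> Optional[str]:
    """Return 'tightened', 'relaxed', or None when the bound is unchanged."""
    if old_v == new_v:
        return None
    if new_v is None:
        return "relaxed"    # assumption: dropping a bound relaxes it
    if old_v is None:
        return "tightened"  # assumption: adding a bound tightens it
    raised = new_v > old_v
    if kind == "lower":
        return "tightened" if raised else "relaxed"
    return "relaxed" if raised else "tightened"


def diff_constraints(
    old: dict[str, Any], new: dict[str, Any]
) -> tuple[list[str], list[str]]:
    """Return (breaking, additive) messages for one field's constraints."""
    breaking: list[str] = []
    additive: list[str] = []

    if ("enum" in old) != ("enum" in new):
        breaking.append("enum keyword added or removed")
    elif "enum" in old:
        removed = [v for v in old["enum"] if v not in new["enum"]]
        added = [v for v in new["enum"] if v not in old["enum"]]
        if removed:
            breaking.append(f"enum values removed: {removed}")
        if added:
            additive.append(f"enum values added: {added}")

    # Any change to pattern or format is breaking, looser or not.
    for key in ("pattern", "format"):
        if old.get(key) != new.get(key):
            breaking.append(f"{key} changed")

    for key, kind in BOUNDS.items():
        change = bound_change(kind, old.get(key), new.get(key))
        if change == "tightened":
            breaking.append(f"{key} tightened")
        elif change == "relaxed":
            additive.append(f"{key} relaxed")
    return breaking, additive
```

The bounds are symmetric: raising a lower bound and lowering an upper bound both shrink the set of accepted values, so both are breaking.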
What's not diffed
The engine short-circuits in three cases. Each case sets a skip_reason instead of running the diff.
| skip reason | meaning |
|---|---|
| first_publish | No prior manifest on the hub. Allowed. |
| major_skip | Major version bump. Different schema is expected; no diff to do. |
| grandfather | Prior manifest exists but no schema.json artifact. Common with old datasets. Allowed. |
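The short-circuit can be sketched as a small function that runs before any diffing. The name pick_skip_reason and its parameters are hypothetical, and the order of the checks is an assumption borrowed from the push gate's decision tree.

```python
from typing import Any, Optional


def pick_skip_reason(
    prior_manifest: Optional[dict[str, Any]],
    prior_schema: Optional[dict[str, Any]],
    declared_bump: str,
) -> Optional[str]:
    """Hypothetical helper: return a skip reason, or None if the diff should run."""
    if prior_manifest is None:
        return "first_publish"  # nothing on the hub to compare against
    if declared_bump == "major":
        return "major_skip"     # a different schema is expected
    if prior_schema is None:
        return "grandfather"    # manifest exists, but no schema.json artifact
    return None
```

A return value of None means both schemas are available and the bump is non-major, so the full diff applies.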
Doc-only fields are also ignored: title, description, default, $defs, examples. So changing a description or a default value is invisible to compat.
Literal[X] collapses to {"const": X} in the JSON Schema that Pydantic emits; Literal[X, Y] becomes {"enum": [X, Y]}. The diff special-cases enum, so a transition between a single-element and a multi-element literal is classified correctly but can be reported with a less specific message ("enum keyword added or removed").
The push gate
esker push calls enforce(...), which raises CompatError when the push should be blocked.
Decision tree:
- first_publish → allowed.
- Same version (declared_bump == "none"):
  - If no prior schema.json → allowed (first artifact for this version).
  - If schema unchanged → allowed.
  - If schema changed at all → blocked with "<owner>/<name>@<v> already published with a different schema; bump schema_version".
- Major bump:
  - With --force-major → allowed.
  - Without → blocked with "major bump <a> → <b> requires --force-major".
- grandfather → allowed.
- Patch / minor:
  - Declared bump covers required → allowed.
  - Required bump exceeds declared → blocked with "required bump: <required>" and the breaking changes listed.
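The decision tree can be sketched as one function. This is not the real enforce(...), whose signature is not shown here: the function gate, its keyword parameters, and the local CompatError class are stand-ins, and the error strings are shortened versions of the messages above.

```python
from typing import Optional, Sequence

# Ordering used to decide whether a declared bump covers the required one.
BUMP_ORDER = {"patch": 0, "minor": 1, "major": 2}


class CompatError(Exception):
    """Stand-in for the error raised when a push is blocked."""


def gate(
    *,
    skip_reason: Optional[str],
    declared_bump: str,  # "none", "patch", "minor", or "major"
    required_bump: str = "patch",
    schema_changed: bool = False,
    has_prior_schema: bool = True,
    force_major: bool = False,
    breaking: Sequence[str] = (),
) -> None:
    """Hypothetical push gate: return None if allowed, raise if blocked."""
    if skip_reason == "first_publish":
        return
    if declared_bump == "none":
        # Same-version re-publish: only an unchanged schema may land.
        if not has_prior_schema or not schema_changed:
            return
        raise CompatError(
            "already published with a different schema; bump schema_version"
        )
    if declared_bump == "major":
        if force_major:
            return
        raise CompatError("major bump requires --force-major")
    if skip_reason == "grandfather":
        return
    if BUMP_ORDER[declared_bump] >= BUMP_ORDER[required_bump]:
        return
    raise CompatError("\n".join([f"required bump: {required_bump}", *breaking]))
```

Keeping the gate free of I/O, with the hub lookups resolved into plain arguments first, mirrors the split described earlier between the pure engine and the client-side wrapper.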
A green esker check ≈ "push won't be blocked by compat." Both call the same diagnose function.
Same-version re-publish
A push at the same schema_version is only allowed if the schema hasn't changed at all (no breaking, no additive). Re-publishing v1.0.0 with a tweaked field type fails with the "already published with a different schema" error. Bump the version.
The exception is "no schema.json on hub for that version yet" (grandfather-like): then re-publish lands as the first schema artifact for that version.
See also
- Publishing: esker check and esker push walkthroughs
- Manifests: what schema_version is and where it lives
- Records: Literal vs pattern for enum-like fields