Naming and IDs

The four ID patterns Esker uses, their separators, and what each is for.

ID formats are a deliberate visual discipline, not arbitrary choices. You should be able to tell what kind of identifier you are looking at by glancing at the punctuation.

The four patterns

Concept              Example                                Regex or template                        Separator
Domain (DOMAIN_ID)   ca.corporations.registry               ^[a-z0-9]+(\.[a-z0-9]+)+$                .
Schema               ca.corporations.registry@1.0.0         <domain>@<semver>                        @
Entity (esker_id)    esker:ca:corp:0123456789               ^esker:[a-z]{2,}:[a-z]+:[\w.-]+$         :
Owner (OwnerHandle)  archie                                 ^[a-z0-9](?:-?[a-z0-9])*$, max 39 chars  n/a
Ref                  archie/ca.corporations.registry@1.0.0  <owner>/<domain>[@<version>]             /

Why the separators differ:

  • Dots: domain hierarchy (jurisdiction → namespace → name).
  • Colons: entity coordinates (esker: prefix, then jurisdiction, type, native id).
  • @: "version of" (mirrors the pkg@version convention of package managers, e.g. npm install pkg@1.0.0).
  • /: "owner of" (mirrors github.com/owner/repo).

If you see colons, it's a record's join key. If you see slashes, it's a publisher's namespaced path.
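The glance test can be sketched in Python as a small classifier. This is a minimal illustration, not Esker's own code: classify_id is a hypothetical helper, the Domain, Entity, and Owner patterns are copied from the table, and the Schema and Ref patterns are assumptions assembled from the <domain>@<semver> and <owner>/<domain>[@<version>] templates.

```python
import re

# Building blocks. DOMAIN and OWNER come from the table; SEMVER is an
# assumed rendering of "<major>.<minor>.<patch>".
DOMAIN = r"[a-z0-9]+(?:\.[a-z0-9]+)+"
OWNER = r"[a-z0-9](?:-?[a-z0-9])*"
SEMVER = r"[0-9]+\.[0-9]+\.[0-9]+"

# Order matters: the most punctuated forms are tried first.
PATTERNS = [
    ("entity", re.compile(r"^esker:[a-z]{2,}:[a-z]+:[\w.-]+$")),
    ("ref", re.compile(rf"^{OWNER}/{DOMAIN}(?:@(?:{SEMVER}|latest))?$")),
    ("schema", re.compile(rf"^{DOMAIN}@{SEMVER}$")),
    ("domain", re.compile(rf"^{DOMAIN}$")),
    ("owner", re.compile(rf"^{OWNER}$")),
]


def classify_id(text: str) -> str:
    """Return the kind of identifier, or "unknown" (hypothetical helper)."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(text):
            # Owner handles are additionally capped at 39 characters.
            if kind == "owner" and len(text) > 39:
                continue
            return kind
    return "unknown"


for sample in [
    "ca.corporations.registry",
    "ca.corporations.registry@1.0.0",
    "esker:ca:corp:0123456789",
    "archie",
    "archie/ca.corporations.registry@1.0.0",
]:
    print(f"{sample:40}{classify_id(sample)}")
```

Because each separator appears in only one kind of identifier, the patterns never compete: a string with colons can only be an entity id, and a string with a slash can only be a ref.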

Domain rules

  • Lowercase a–z, 0–9 only. No hyphens, no uppercase, no underscores in the domain itself.
  • At least two dot-segments. A one-segment name (for example, companies) will not parse.
  • The first segment is the jurisdiction, which is used to build esker_id. For global data, use global. Conventional codes are two or more characters: us, ca, gb, eu, un.

Examples:

us.sec.companies              valid
ca.corporations.registry      valid
global.spacex.rockets         valid
companies                     invalid (single segment)
US.Sec.Companies              invalid (uppercase)
us-sec.companies              invalid (hyphen)
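
The same checks can be reproduced with the Domain regex from the table. The sketch below is illustrative only: domain_problem is a hypothetical helper, and the reason strings are made up for this example rather than taken from Esker's error messages.

```python
import re
from typing import Optional

# Domain pattern as given in the table of ID patterns.
DOMAIN_RE = re.compile(r"^[a-z0-9]+(\.[a-z0-9]+)+$")


def domain_problem(domain: str) -> Optional[str]:
    """Return None for a valid domain, else a short reason (hypothetical)."""
    if DOMAIN_RE.fullmatch(domain):
        return None
    if "." not in domain:
        return "single segment"
    if domain != domain.lower():
        return "uppercase"
    return "invalid character or empty segment"


for candidate in [
    "us.sec.companies",
    "ca.corporations.registry",
    "global.spacex.rockets",
    "companies",
    "US.Sec.Companies",
    "us-sec.companies",
]:
    print(f"{candidate:30}{domain_problem(candidate) or 'valid'}")
```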

Owner rules

Owners live in a single global namespace shared by users and orgs. Handles follow GitHub-style rules:

  • Lowercase a–z and digits.
  • Hyphens are allowed, but not at the start or end, and not doubled.
  • Length 1–39.
  • Cannot match a reserved word.

A small list of names is reserved because they collide with platform routes or artifact filenames:

api app auth cli dashboard datasets docs hub
login logout me new orgs owners search settings
sign-in sign-up static users www
data manifest schema lineage
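
Taken together, the owner rules amount to a regex check, a length cap, and a reserved-word lookup. The following is a minimal sketch under those rules; is_valid_owner is a hypothetical name, not a documented Esker function.

```python
import re

# Reserved names, copied from the list above.
RESERVED = frozenset(
    """
    api app auth cli dashboard datasets docs hub
    login logout me new orgs owners search settings
    sign-in sign-up static users www
    data manifest schema lineage
    """.split()
)

# Owner pattern from the table: a hyphen may only appear between
# two alphanumeric characters, so it cannot lead, trail, or double.
HANDLE_RE = re.compile(r"^[a-z0-9](?:-?[a-z0-9])*$")
MAX_HANDLE_LENGTH = 39


def is_valid_owner(handle: str) -> bool:
    """Apply the three owner rules (hypothetical helper)."""
    return (
        len(handle) <= MAX_HANDLE_LENGTH
        and HANDLE_RE.fullmatch(handle) is not None
        and handle not in RESERVED
    )


for handle in ["archie", "my-org", "-archie", "ar--chie", "api", "sign-in"]:
    print(f"{handle:12}{'valid' if is_valid_owner(handle) else 'invalid'}")
```

Note that sign-in and sign-up pass the regex and are rejected only by the reserved-word lookup, which is why the lookup has to be a separate step.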

See Handles for the full owner story.

Versioning

SemVer 3-part only. No pre-releases (1.0.0-alpha), no build metadata (1.0.0+abc). Just <major>.<minor>.<patch>.
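
A strict three-part parser is short to write. The sketch below is an assumption about how such a check could look, not the compat engine's code: parse_version and is_downgrade are hypothetical names, and the handling of leading zeros (accepted here) is not specified on this page. Parsing into a tuple of integers gives numeric ordering, so 1.10.0 sorts after 1.9.0.

```python
import re
from typing import Tuple

# Exactly <major>.<minor>.<patch>; no pre-release or build suffix.
VERSION_RE = re.compile(r"^([0-9]+)\.([0-9]+)\.([0-9]+)$")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a 3-part version into integers (hypothetical helper)."""
    match = VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a 3-part version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_downgrade(current: str, proposed: str) -> bool:
    """True if proposed is numerically lower than current (hypothetical)."""
    return parse_version(proposed) < parse_version(current)


print(parse_version("1.0.0"))
print(is_downgrade("1.9.0", "1.10.0"))  # numeric, not string, comparison
```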

The compat engine compares numerically and rejects downgrades. @latest is sugar for "no version" — it parses as version=None, equivalent to leaving the version off entirely.

archie/us.sec.companies@1.0.0     versioned
archie/us.sec.companies@latest    same as below
archie/us.sec.companies           latest published
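
The three forms above can be parsed with one pattern. This is a sketch under the rules on this page, not Esker's parser: Ref and parse_ref are hypothetical names, and the regex is assembled from the Owner and Domain patterns plus the 3-part version rule.

```python
import re
from typing import NamedTuple, Optional


class Ref(NamedTuple):
    owner: str
    domain: str
    version: Optional[str]  # None means "latest published"


REF_RE = re.compile(
    r"^(?P<owner>[a-z0-9](?:-?[a-z0-9])*)"
    r"/(?P<domain>[a-z0-9]+(?:\.[a-z0-9]+)+)"
    r"(?:@(?P<version>[0-9]+\.[0-9]+\.[0-9]+|latest))?$"
)


def parse_ref(text: str) -> Ref:
    """Split <owner>/<domain>[@<version>] into parts (hypothetical helper)."""
    match = REF_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid ref: {text!r}")
    version = match.group("version")
    if version == "latest":
        # @latest is sugar for "no version".
        version = None
    return Ref(match.group("owner"), match.group("domain"), version)


print(parse_ref("archie/us.sec.companies@1.0.0"))
print(parse_ref("archie/us.sec.companies@latest") == parse_ref("archie/us.sec.companies"))
```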

esker_id decomposition

esker:us:corp:0123456789
^^^^^ ^^ ^^^^ ^^^^^^^^^^
|     |  |    native_id
|     |  entity_type
|     jurisdiction
prefix

  • prefix: literal esker.
  • jurisdiction: first segment of DOMAIN_ID (e.g. us for us.sec.companies). Built at run time as DOMAIN_ID.split(".", 1)[0].
  • entity_type: from the pipeline's entity_type= (decorator) or _ENTITY_TYPE ClassVar. Lowercase letters only. Examples: corp, rocket, curve.
  • native_id: the value of the key= field on the record. Word characters, dots, and hyphens.

The synthesis is a string concatenation built into EskerPipeline.run(). No escaping — a native id containing : would produce a malformed esker_id. In practice native ids are integers or short alphanumeric strings (CIK is ^\d{10}$, Mongo ObjectId is ^[a-f0-9]{24}$).
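
To make the concatenation and its lack of escaping concrete, here is a rough sketch. It is not the body of EskerPipeline.run(): build_esker_id and split_esker_id are hypothetical helpers that follow the description above, and the validation regex is the Entity pattern from the table.

```python
import re
from typing import Tuple

# Entity pattern from the table of ID patterns.
ESKER_ID_RE = re.compile(r"^esker:[a-z]{2,}:[a-z]+:[\w.-]+$")


def build_esker_id(domain_id: str, entity_type: str, native_id: object) -> str:
    """Plain concatenation with no escaping (hypothetical helper)."""
    jurisdiction = domain_id.split(".", 1)[0]
    return f"esker:{jurisdiction}:{entity_type}:{native_id}"


def split_esker_id(esker_id: str) -> Tuple[str, str, str]:
    """Return (jurisdiction, entity_type, native_id) for a well-formed id."""
    if ESKER_ID_RE.fullmatch(esker_id) is None:
        raise ValueError(f"malformed esker_id: {esker_id!r}")
    _prefix, jurisdiction, entity_type, native_id = esker_id.split(":", 3)
    return jurisdiction, entity_type, native_id


good = build_esker_id("us.sec.companies", "corp", "0123456789")
print(good, split_esker_id(good))

# A native id containing ":" is concatenated as-is and fails validation.
bad = build_esker_id("us.sec.companies", "corp", "a:b")
print(bad, ESKER_ID_RE.fullmatch(bad) is None)
```

A pipeline whose source keys may contain colons would need to normalize them before they reach the key= field, since nothing downstream of the concatenation can recover the original boundaries.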

esker_id identifies the entity the record is about, not the record itself. Multiple records about the same corporation share one esker_id. For relationship records (e.g. an officer appointment), the esker_id is the appointment entity; references to other entities appear as domain-specific fields like corporation_id.

See also