Arrow & TypeScript artifacts

How JSON Schema becomes schema.arrow and schema.d.ts on every push.

Every esker push uploads two derived schema artifacts alongside the canonical schema.json: an Arrow IPC schema (schema.arrow) and a TypeScript interface (schema.d.ts). Both are pure functions of the published model's JSON Schema — the same input always produces the same bytes.
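Because the artifacts are deterministic, a consumer can cache or compare them by content hash. The sketch below illustrates the idea on a JSON Schema dict using only the standard library. `canonical_bytes` and `fingerprint` are hypothetical helper names, not part of esker, and the compact separators are an assumption; the only serialization detail stated on this page is that schema.json uses sorted keys.

```python
import hashlib
import json


def canonical_bytes(schema: dict) -> bytes:
    """Serialize with sorted keys so dict insertion order cannot change the bytes."""
    # Hypothetical helper: compact separators are an assumption for this sketch.
    return json.dumps(schema, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(schema: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(schema)).hexdigest()


# Same content, different key order.
a = {"type": "object", "required": ["quote_date"],
     "properties": {"quote_date": {"type": "string"}}}
b = {"properties": {"quote_date": {"type": "string"}},
     "required": ["quote_date"], "type": "object"}

same = fingerprint(a) == fingerprint(b)
```

With sorted keys, both dicts serialize to identical bytes, so the two fingerprints match.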

If you want to consume these artifacts yourself, the rules below tell you what to expect.

Arrow

schema.arrow is the Arrow IPC-serialized pa.Schema for the published parquet. It is the same schema the parquet writer was opened with, so its column types and nullability match the parquet file exactly.

Type mapping

JSON Schema	Arrow
`string`	`string()`
`string` + `format: date-time`	`timestamp('us', tz='UTC')`
`string` + `format: date`	`date32()`
`integer`	`int64()`
`number`	`float64()`
`boolean`	`bool_()`
`array<inner>`	`list_(<inner>)`
`anyOf: [T, null]`	`<T>`, field marked nullable

Other string formats (uuid, uri) fall through to string(). Pydantic validates the format on construction; parquet stores plain text.

Anything not in the table — nested objects, unions other than T | null, etc. — raises ValueError("unmappable JSON Schema: {schema}"). The push fails before bytes leave your machine.

Nullability

A field is nullable iff it's not in the JSON Schema's required set:

pa.field(name, _arrow_type(prop), nullable=name not in required)

Same rule the parquet writer uses.

Use

import pyarrow as pa
import requests

r = requests.get(
    "https://esker.so/archie/us.treasury.yields@2.0.0/schema.arrow",
    timeout=30,
)
r.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
schema = pa.ipc.read_schema(pa.BufferReader(r.content))

Hand schema to a ParquetWriter, an Arrow Flight reader, or anything else that wants a typed schema.

TypeScript

schema.d.ts is a single export interface <Name> { ... } declaration. Drop it into a TypeScript project and your records are typed.

Type mapping

JSON Schema	TS
`null`	`null`
`string`	`string`
`integer` / `number`	`number`
`boolean`	`boolean`
`array<inner>`	`<inner>[]`
`anyOf: [...]`	`<a> \| <b> \| ...`
`enum: [v1, v2]`	`"v1" \| "v2"` (JSON-encoded literals)
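The table above can be read as a recursive mapping, sketched here in Python with the standard library only. `ts_type` is a hypothetical name, not esker's emitter. Two details are assumptions beyond the table: unions inside arrays are parenthesized so the result is valid TypeScript, and unknown shapes raise ValueError.

```python
import json

# JSON Schema primitive -> TypeScript type, transcribed from the table.
_PRIMITIVES = {
    "null": "null",
    "string": "string",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
}


def ts_type(prop: dict) -> str:
    """Map one JSON Schema property to a TypeScript type expression."""
    if "enum" in prop:
        # JSON-encode each value so strings come out as quoted literals.
        return " | ".join(json.dumps(v) for v in prop["enum"])
    if "anyOf" in prop:
        return " | ".join(ts_type(p) for p in prop["anyOf"])
    kind = prop.get("type")
    if kind == "array" and "items" in prop:
        inner = ts_type(prop["items"])
        # Assumption: wrap unions so `(string | null)[]` is not emitted
        # as `string | null[]`, which means something different.
        return f"({inner})[]" if " | " in inner else f"{inner}[]"
    if isinstance(kind, str) and kind in _PRIMITIVES:
        return _PRIMITIVES[kind]
    # Assumption: this sketch rejects anything the table does not list.
    raise ValueError(f"unmappable JSON Schema: {prop}")
```

Note that integer and number both collapse to number, so the Arrow schema remains the place to look for the int64 versus float64 distinction.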

Required vs optional

The required set drives the TS ? marker. A field listed in required is emitted without ?, and every other field gets one:

export interface UsTreasuryYields {
  quote_date: string;
  rate_1m?: number;
  rate_3m?: number;
  // ...
  esker_id: string;
  esker_source_url: string;
  esker_lineage_id: string;
}

Same rule as Arrow nullability.

Name stripping

The published model class is named Published<X> internally. The TS emitter strips the prefix so consumers see <X> — the domain-native name.
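Putting the optional marker and the name stripping together, an emitter could look roughly like the sketch below. `ts_interface` is a hypothetical helper that takes already-resolved TypeScript types; it is not esker's to_typescript, and the three-field example is abbreviated from the interface shown earlier.

```python
_PREFIX = "Published"


def ts_interface(class_name: str, fields: dict, required: list) -> str:
    """Render `export interface <Name> { ... }`.

    `fields` maps field name -> TypeScript type string.
    """
    # Strip the internal prefix so consumers see the domain-native name.
    if class_name.startswith(_PREFIX):
        name = class_name[len(_PREFIX):]
    else:
        name = class_name
    lines = [f"export interface {name} {{"]
    for field, ts in fields.items():
        # Fields outside `required` get the optional marker.
        marker = "" if field in required else "?"
        lines.append(f"  {field}{marker}: {ts};")
    lines.append("}")
    return "\n".join(lines)


text = ts_interface(
    "PublishedUsTreasuryYields",
    {"quote_date": "string", "rate_1m": "number", "esker_id": "string"},
    required=["quote_date", "esker_id"],
)
```

For this input, `text` declares `UsTreasuryYields` with `rate_1m?: number` as the only optional member.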

Use

curl -O https://esker.so/archie/us.treasury.yields@2.0.0/schema.d.ts

curl -O saves the file as schema.d.ts. Rename it to us.treasury.yields.d.ts, drop it into your project, and import the type:

import type { UsTreasuryYields } from "./us.treasury.yields";

What lands per push

Six artifacts per esker push:

artifact	source	mime
`data.parquet`	the records	`application/vnd.apache.parquet`
`schema.json`	Pydantic JSON Schema (sorted-key)	`application/json`
`schema.arrow`	`to_arrow_bytes(model)`	`application/vnd.apache.arrow.file`
`schema.d.ts`	`to_typescript(model)`	`text/plain; charset=utf-8`
`lineage.json`	`LineageBundle`	`application/json`
`manifest.json`	`DatasetManifest` (POSTed)	`application/json`

URL pattern:

GET https://esker.so/<owner>/<name>@<version>/<artifact>

Versionless URLs (<owner>/<name>/data.parquet) resolve to the latest published version.
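A small helper makes the two URL forms explicit. `artifact_url` is a hypothetical convenience function, not part of the esker API; it only formats strings according to the pattern above.

```python
from typing import Optional

BASE = "https://esker.so"


def artifact_url(owner: str, name: str, artifact: str,
                 version: Optional[str] = None) -> str:
    """Build an artifact URL; omit `version` to address the latest published one."""
    dataset = f"{name}@{version}" if version else name
    return f"{BASE}/{owner}/{dataset}/{artifact}"


pinned = artifact_url("archie", "us.treasury.yields", "schema.arrow", "2.0.0")
latest = artifact_url("archie", "us.treasury.yields", "data.parquet")
```

Pinning a version is the safer default for pipelines, since a versionless URL can start returning a different schema after the next push.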

Programmatic emit

All three functions are part of the public API:

from esker import to_arrow_bytes, to_arrow_schema, to_typescript

arrow_schema = to_arrow_schema(MyPublishedModel)
arrow_bytes = to_arrow_bytes(MyPublishedModel)
ts_text = to_typescript(MyPublishedModel)

MyPublishedModel is MyModel.published() — the variant with the three injected fields. Calling these on the draft variant gives you a schema without esker_id / esker_source_url / esker_lineage_id.
