How it works
markdown-contract parses a document once and runs everything — structure checks, content checks, cross-cutting rules, and the typed model — off that single parse.
One parse
Each document is parsed with remark/mdast (GitHub-flavored markdown plus YAML frontmatter), extended with an in-house dialect for the Obsidian conventions many docs corpora already use:
- line-terminal `^block-id` anchors, which make sections, tables, and blocks addressable,
- `[[wikilinks]]` and `![[transclusions]]`, with `|alias` and `#fragment` parts recognized.
The parse is projected into a position-carrying section tree: every
section, table, list, and code block knows its source line. That is what lets
every finding land as path:line instead of an opaque JSON pointer.
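As a rough illustration of what a position-carrying section tree could look like, the sketch below scans headings by hand instead of going through remark/mdast. The names `SectionNode`, `projectSections`, and `locate` are hypothetical and not part of the library; the point is only that each node keeps its 1-based source line, so a location can be printed as `path:line`.

```typescript
// Sketch only: a hand-rolled heading scan stands in for the remark/mdast parse.
interface SectionNode {
  title: string;
  depth: number; // 1 for "#", 2 for "##", ...
  line: number; // 1-based source line of the heading
  anchor?: string; // line-terminal ^block-id, when present
  children: SectionNode[];
}

function projectSections(src: string): SectionNode[] {
  const roots: SectionNode[] = [];
  const stack: SectionNode[] = []; // open ancestors, outermost first
  let inFence = false;

  src.split(/\r?\n/).forEach((text, index) => {
    // Skip fenced code so a "#" comment inside it is not read as a heading.
    if (/^\s*(`{3}|~{3})/.test(text)) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;

    const m = /^(#{1,6})\s+(.*?)(?:\s+\^([A-Za-z0-9-]+))?\s*$/.exec(text);
    if (m === null) return;
    const [, hashes = "", title = "", anchor] = m;

    const node: SectionNode = { title, depth: hashes.length, line: index + 1, children: [] };
    if (anchor !== undefined) node.anchor = anchor;

    // Close every open section at the same or a deeper level.
    while (stack.length > 0 && (stack[stack.length - 1]?.depth ?? 0) >= node.depth) stack.pop();
    const parent = stack[stack.length - 1];
    (parent === undefined ? roots : parent.children).push(node);
    stack.push(node);
  });
  return roots;
}

// Every node carries its line, so a finding can point at path:line directly.
function locate(path: string, node: SectionNode): string {
  return `${path}:${node.line}`;
}
```

A real projection would also cover tables, lists, and code blocks; this sketch handles headings and their anchors only.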
Three cooperating planes
A contract declares three kinds of expectations, each handled by the mechanism that is actually good at it:
- Structure: a small tree grammar over sections and block kinds (`sections`, `section`, `optional`, `oneOf`, and `gap`), nested to any depth, with `order` and `allowUnknown` set per level. This covers the one axis a schema language cannot express: these sections, in this order, with room for extras.
- Content: Zod at every leaf. The frontmatter is a plain Zod schema; inside sections, a finite leaf vocabulary (`table`, `list`, `code`, `maxWords`) compiles to Zod checks over the projected nodes, with raw Zod as the escape hatch for anything richer.
- Rules: named functions for what neither plane covers. `rule` attaches to a section, `docRule` sees the whole document (so a frontmatter field can gate a body section), and `textRule`/`requires`/`forbids` declare text constraints without writing a function at all.
The split is deliberate: schema languages and tree grammars are formally incomparable, so markdown-contract never forces one to fake the other. It is the same “grammar + datatype library + rules” shape the XML structured-authoring world (RELAX NG, Schematron) ran on for decades — applied to markdown.
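To make the structure plane concrete, here is a minimal sketch of a check for a single level: required sections, optional sections, ordering, and tolerance for extras. It is not the library's grammar engine. The type names and `checkLevel` are hypothetical, and apart from `structure/section-missing` the finding ids and messages are invented for illustration.

```typescript
// Sketch of a one-level structure check. Only "structure/section-missing" and its
// message come from the documentation; the other ids and messages are hypothetical.
interface Heading { title: string; line: number }
interface SectionSpec { section: string; optional?: boolean }
interface LevelSpec { sections: SectionSpec[]; order: boolean; allowUnknown: boolean }
interface Finding { id: string; level: "error" | "warn" | "report"; line: number; message: string }

function checkLevel(spec: LevelSpec, headings: Heading[]): Finding[] {
  const findings: Finding[] = [];
  const declared = spec.sections.map((s) => s.section);

  // Required sections must be present somewhere at this level.
  for (const s of spec.sections) {
    if (s.optional !== true && !headings.some((h) => h.title === s.section)) {
      findings.push({
        id: "structure/section-missing",
        level: "error",
        line: 1, // placeholder: where a missing section is anchored is an assumption
        message: `required section '${s.section}' is missing`,
      });
    }
  }

  let lastRank = -1;
  for (const h of headings) {
    const rank = declared.indexOf(h.title);
    if (rank === -1) {
      // Extras are tolerated only when allowUnknown is set for this level.
      if (!spec.allowUnknown) {
        findings.push({ id: "structure/section-unknown", level: "error", line: h.line, message: `unexpected section '${h.title}'` });
      }
    } else if (spec.order && rank < lastRank) {
      // A declared section appeared after one that is declared later.
      findings.push({ id: "structure/section-order", level: "error", line: h.line, message: `section '${h.title}' is out of order` });
    } else {
      lastRank = rank;
    }
  }
  return findings;
}
```

The check says nothing about what a section contains, which is the part the content plane hands to Zod.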
One finding shape
Every mechanism reports the same record:

    {
      id: "structure/section-missing",
      level: "error",
      path: "D-9001.md",
      pos: { line: 8, col: 1 },
      message: "required section 'Decision' is missing"
    }

Rule ids are namespaced, severity (`error` / `warn` / `report`) is contract data, and the CLI renders findings as human text, raw JSON, or SARIF 2.1.0 for code scanning.
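The record above can be written down as a type and rendered two ways. This is a sketch under assumptions: the one-line text layout, the `toText` and `toSarif` helpers, and the mapping of `warn` to SARIF `warning` and `report` to SARIF `note` are guesses, not the CLI's documented behavior. The SARIF field names themselves follow the 2.1.0 schema.

```typescript
type Level = "error" | "warn" | "report";

interface Finding {
  id: string; // namespaced rule id, e.g. "structure/section-missing"
  level: Level;
  path: string;
  pos: { line: number; col: number };
  message: string;
}

// Hypothetical human-readable layout: path:line:col level id message
function toText(f: Finding): string {
  return `${f.path}:${f.pos.line}:${f.pos.col} ${f.level} ${f.id} ${f.message}`;
}

// Assumed mapping from contract severities to SARIF result levels.
const sarifLevel: Record<Level, "error" | "warning" | "note"> = {
  error: "error",
  warn: "warning",
  report: "note",
};

// Minimal SARIF 2.1.0 log: one run, one result per finding.
function toSarif(findings: Finding[]) {
  return {
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "markdown-contract" } },
        results: findings.map((f) => ({
          ruleId: f.id,
          level: sarifLevel[f.level],
          message: { text: f.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: f.path },
                region: { startLine: f.pos.line, startColumn: f.pos.col },
              },
            },
          ],
        })),
      },
    ],
  };
}
```

Because every mechanism emits the same record, a renderer like this never needs to know which plane produced a finding.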
Validation and consumption are the same contract
A compiled contract has two doors:
- `contract.validate(src, { path })` returns `{ findings, doc?, tree }` and never throws, which is the shape CI wants.
- `contract.read(src, { path })` returns the typed `Doc` or throws a `ContractError` carrying the findings, which is the shape a consumer wants.
The Doc is a navigable typed view of the document: doc.frontmatter is
typed by the frontmatter schema; doc.body reaches sections by camelCase key
(doc.body.summary) or exact heading (doc.body.section("Summary")); tables
are iterable typed-row collections; anchors resolve with doc.byAnchor(id).
Because the model is derived from the contract, checking and consuming can
never drift apart.
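One way to picture the two doors is to treat the throwing one as a thin wrapper over the non-throwing one. The sketch below uses a toy validator and a stand-in `ContractError`; `readVia`, `toyValidate`, and `ToyDoc` are hypothetical names, and the rule that only error-level findings cause a throw is an assumption rather than documented behavior.

```typescript
interface Finding { id: string; level: "error" | "warn" | "report"; message: string }
interface ValidateResult<D> { findings: Finding[]; doc?: D } // the real result also carries `tree`

// Hypothetical stand-in for the library's ContractError.
class ContractError extends Error {
  findings: Finding[];
  constructor(findings: Finding[]) {
    super(`${findings.length} contract finding(s)`);
    this.name = "ContractError";
    this.findings = findings;
  }
}

// The throwing door, written as a thin wrapper over the non-throwing one.
function readVia<D>(validate: (src: string) => ValidateResult<D>, src: string): D {
  const { findings, doc } = validate(src);
  // Assumption: only error-level findings (or a missing doc) make the read throw.
  if (doc === undefined || findings.some((f) => f.level === "error")) {
    throw new ContractError(findings);
  }
  return doc;
}

// Toy validator: requires a "## Decision" heading and returns the headings it saw.
interface ToyDoc { headings: string[] }

function toyValidate(src: string): ValidateResult<ToyDoc> {
  const headings = src
    .split(/\r?\n/)
    .filter((l) => /^## /.test(l))
    .map((l) => l.slice(3).trim());
  if (headings.indexOf("Decision") === -1) {
    return {
      findings: [
        { id: "structure/section-missing", level: "error", message: "required section 'Decision' is missing" },
      ],
    };
  }
  return { findings: [], doc: { headings } };
}
```

A CI job would call the validator and inspect `findings`; a consumer would call `readVia(toyValidate, src)` and catch `ContractError` when it needs the findings.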
Architecture
The package is three layers, and imports flow one way:
| Layer | Role |
|---|---|
| `core` | one document × one contract → findings + tree + doc. Pure: no file system, no process. |
| `runner` | a corpus config (globs → contracts) → aggregated findings across a tree of files, with first-match routing. |
| `cli` | the `markdown-contract` bin: argv → runner → format → exit code. A thin shell over the library. |
The engine carries no knowledge of any particular repository — corpora are described declaratively (see Getting started) — and the validator is strictly read-only: it never rewrites a document.
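First-match routing in the runner layer can be sketched as an ordered list of glob and contract pairs where the first matching glob wins. The `Route` type, `globToRegExp`, and `route` below are hypothetical, and the tiny glob dialect (`*`, `?`, `**`) is an assumption; the real runner's config format and glob syntax are not shown here.

```typescript
interface Route<C> { glob: string; contract: C }

// Tiny glob dialect, assumed for illustration: "*" and "?" stay within one path
// segment, "**/" spans zero or more directories, and a bare "**" matches anything.
function globToRegExp(glob: string): RegExp {
  let re = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        if (glob.charAt(i + 2) === "/") {
          re += "(?:.*/)?";
          i += 2;
        } else {
          re += ".*";
          i += 1;
        }
      } else {
        re += "[^/]*";
      }
    } else if (ch === "?") {
      re += "[^/]";
    } else {
      // Escape regex metacharacters so they match literally.
      re += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${re}$`);
}

// Routes are tried in declaration order; the first matching glob decides.
function route<C>(routes: Route<C>[], path: string): C | undefined {
  for (const r of routes) {
    if (globToRegExp(r.glob).test(path)) return r.contract;
  }
  return undefined;
}
```

With this ordering, a specific pattern such as `docs/decisions/*.md` has to be listed before a catch-all such as `docs/**/*.md`, otherwise the catch-all claims every file.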