Hedda internal shell

Source books in. Reviewed historical claims out.

Phase 1 is intentionally read-only. This surface exists to prove the project boundary: documents, evidence, extraction runs, and publication stay explicit, separate concepts before later workflow automation lands.

Source documents
Foundational

Books and PDFs are the first-class input. Upload and page persistence arrive in Phase 2.

Extraction runs
Versioned

Runs are explicit records with provenance and rerun boundaries. Execution wiring starts in the worker runtime.

Review boundary
Protected

Reviewed claims remain the source of truth. Publication and downstream payloads stay separate.
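
As a rough illustration of this boundary, the sketch below derives a publication payload from reviewed claims without touching the claims themselves. The names `ReviewedClaim` and `build_payload` are hypothetical and chosen for this example; they are not taken from Hedda's codebase.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ReviewedClaim:
    """Hypothetical reviewed-claim record; frozen so publication cannot edit it."""
    claim_id: str
    text: str
    reviewed: bool


def build_payload(claims: List[ReviewedClaim]) -> List[Dict[str, str]]:
    """Build a separate downstream payload from claims that passed review."""
    return [
        {"claim_id": c.claim_id, "text": c.text}
        for c in claims
        if c.reviewed
    ]
```

Because the payload is a fresh list of dictionaries, downstream changes to it leave the reviewed claims as they were.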

What this shell is showing

Hedda starts from a conservative source-first model. Documents, evidence spans, extraction runs, and reviewed claims are distinct concepts because the system must preserve provenance and rerun history instead of flattening everything into one event row.

  • Documents remain visible even if the source file later goes missing.
  • Extraction runs are versioned records, not silent in-place reruns.
  • Published outputs arrive only after review and grouping decisions.
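
The first two points can be sketched as follows, assuming hypothetical `Document`, `ExtractionRun`, and `RunLog` types that only illustrate the idea and do not reflect Hedda's actual schema. A document record survives a missing file, and each rerun appends a new versioned record instead of overwriting the previous one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Document:
    """Hypothetical document record, kept even when the file is gone."""
    document_id: str
    title: str
    file_path: Optional[str]  # None once the source file is missing


@dataclass(frozen=True)
class ExtractionRun:
    """Hypothetical immutable run record carrying its own provenance."""
    document_id: str
    version: int
    extractor: str  # which extractor produced this run


@dataclass
class RunLog:
    """Append-only list of runs; earlier runs are never modified."""
    runs: List[ExtractionRun] = field(default_factory=list)

    def rerun(self, document_id: str, extractor: str) -> ExtractionRun:
        # Find the highest existing version for this document, then add one.
        prior = [r.version for r in self.runs if r.document_id == document_id]
        run = ExtractionRun(document_id, max(prior, default=0) + 1, extractor)
        self.runs.append(run)
        return run
```

Calling `rerun` twice for the same document leaves two records in the log, with versions 1 and 2, so the rerun history stays inspectable.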

Road ahead

  • Phase 2 adds document ingest and page persistence.
  • Phase 3 adds structured draft extraction with evidence spans.
  • Phase 4 adds the review queue and claim editing surfaces.
  • Later phases add normalization, event grouping, and publication.

Sample inspection paths

Once the Phase 1 sample has been seeded into the local Hedda database, these routes read the canonical document and extraction-run records directly from it.

  • Open review queue
  • Open canonical document
  • Open canonical run