— Door 04 · For the decision-maker

One reader.
A high-signal training pipeline.

The honest pitch behind the other three doors. The role asks one person to challenge frontier models, catch where they fail at literary reasoning, and feed that back as structured signal, at volume and without lowering the bar. This is how I keep both the volume and the bar: my own AI Air Team handles the routine half (surfacing candidate passages, drafting prompt variants, logging traces) so the human half (the close reading, the judgement on what is genuinely a failure, the metacognitive feedback) gets all of me.

CONCEPT ONLY — The dashboard figures below are an illustrative capacity model, not graded client data or a performance guarantee. Real throughput would be set with the client against their evaluation guidelines, calibration rounds and quality bar. The point here is the method: where a human must own the judgement and where tooling can responsibly carry the load.

01 The division of labour

The discipline of the whole system is one rule: tooling never decides what counts as a failure or what good reading looks like. It clears the runway. The human lands the plane.

— Carried by my AI Air Team

The routine half

Mechanical, repeatable, verifiable-by-me. Speeds the work without touching the judgement.

▹ Surfacing candidate passages: Pulls stanzas, scenes and paragraphs across the six literary domains for me to choose from.
▹ Drafting prompt variants: Generates first-pass adversarial prompts; I rewrite, sharpen and approve every one before it's issued.
▹ Logging & formatting traces: Captures prompt, model output and metadata into the trace template so nothing is lost or retyped.
▹ Tagging & trend-spotting: Clusters my annotations by failure mode so recurring patterns surface for the model team faster.

Guardrail: the AI never scores, never decides a reading is "wrong," and never writes the feedback. It hands me clean inputs; I do the reading.
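
A minimal sketch of the tagging and trend-spotting step, assuming each annotation is a simple record that already carries a failure-mode label assigned by me. The field names (`trace_id`, `failure_mode`) and the `min_count` threshold are hypothetical, not taken from a real trace template; the point is that the tooling only groups and counts labels it did not choose.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records. The failure_mode label is always
# assigned by the human reader; the tooling only groups and counts.
annotations = [
    {"trace_id": "t-001", "failure_mode": "fabrication"},
    {"trace_id": "t-002", "failure_mode": "theme-flattening"},
    {"trace_id": "t-003", "failure_mode": "fabrication"},
    {"trace_id": "t-004", "failure_mode": "instruction-following"},
]


def cluster_by_failure_mode(records):
    """Group trace ids under their human-assigned failure mode."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["failure_mode"]].append(record["trace_id"])
    return dict(clusters)


def recurring_modes(records, min_count=2):
    """Return failure modes seen at least min_count times, most frequent first."""
    counts = Counter(record["failure_mode"] for record in records)
    return [mode for mode, n in counts.most_common() if n >= min_count]


clusters = cluster_by_failure_mode(annotations)
recurring = recurring_modes(annotations)
```

Nothing in this sketch reads a response or assigns a label, which is the guardrail expressed in code.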

— Owned by the human (me)

The judgement half

Irreducible. This is the part the role is actually paying for — and the part I refuse to automate.

◆ The close reading itself: Sitting with the text, hearing the line breaks, holding the ambiguity the model flattens.
◆ Calling the failure: Deciding whether a response is a genuine misreading, a defensible alternative, or a fabrication. This is the judgement call.
◆ Metacognitive feedback: Writing the "why it's wrong / what a strong reading does instead" that actually trains the model.
◆ Knowing my edges: Flagging when a text sits at the frontier of formal theory or specialist period scholarship, where I'd defer.

Why it stays human: a fabrication call made by a careless eye is worse than no call at all, because it feeds the wrong signal back to the model. The judgement is the product.

02 The prompt-to-feedback pipeline

Five stages from a blank passage to trainable signal. The coloured tag on each stage says who owns it — so you can see exactly where the human judgement lives.

STAGE 01

Surface

Candidate passages pulled across the six domains; I pick what's worth probing.

AI surfaces · I choose

STAGE 02

Provoke

Adversarial prompt written to expose a specific weakness — instruction-following, fabrication, theme-flattening.

AI drafts · I author

STAGE 03

Probe & log

Prompt issued to the model; response and metadata captured into the trace template automatically.

Tooling

STAGE 04

Judge

I read the response closely, decide if it's a genuine failure, classify the mode and score it.

Human only

STAGE 05

Feed back

I write the metacognitive feedback; tooling clusters it with related traces for the model team.

I write · AI clusters
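
The five stages and their ownership tags can be sketched as a small data structure. This is an illustration under stated assumptions, not the actual tooling: the `Owner` and `Stage` names and the helper function are hypothetical, and the first two stages are folded into one "AI assists, human decides" category for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    """Hypothetical ownership categories mirroring the stage tags."""
    AI_ASSISTS_HUMAN_DECIDES = "AI assists, human decides"
    TOOLING = "tooling"
    HUMAN_ONLY = "human only"
    HUMAN_WRITES_AI_CLUSTERS = "human writes, AI clusters"


@dataclass(frozen=True)
class Stage:
    number: int
    name: str
    owner: Owner


PIPELINE = (
    Stage(1, "Surface", Owner.AI_ASSISTS_HUMAN_DECIDES),
    Stage(2, "Provoke", Owner.AI_ASSISTS_HUMAN_DECIDES),
    Stage(3, "Probe & log", Owner.TOOLING),
    Stage(4, "Judge", Owner.HUMAN_ONLY),
    Stage(5, "Feed back", Owner.HUMAN_WRITES_AI_CLUSTERS),
)


def stages_with_human_decision(pipeline):
    """List every stage where a human choice or sign-off is required."""
    return [stage.name for stage in pipeline if stage.owner is not Owner.TOOLING]


human_stages = stages_with_human_decision(PIPELINE)
```

In this sketch only the capture stage runs without a human decision, which matches the ownership tags on the stages.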

03 Capacity model · the command centre

An honest sketch of what this division of labour buys — illustrative figures for a focused day, not a promise. The number that matters is the last one.

MERIDIAL · LITERARY EVAL · CAPACITY SKETCH · Illustrative

Traces / focused day

30–40

Fully judged & annotated by me, with tooling carrying capture and formatting.

Domains in rotation

6

Poetry · prose fiction · drama · non-fiction · classical/period · comparative.

Human time on judgement

~70%

The share of my hours spent reading and feeding back — not on logistics.

Quality gate

100%

Every shipped trace read and scored by a human. No exceptions, at any volume.

Throughput scales with tooling; the quality gate does not move. If pace and the bar ever conflict, the bar wins — and I'll say so.

04 How pace never lowers the bar

SG-1 · GATE

Human-read, always

Tooling can draft and capture, but a trace is never "done" until I've read the model's response in full and made the call myself. Volume comes from removing my typing, not my reading.

SG-2 · FABRICATION

Fabrication caps the score

Factual accuracy is a gate dimension (see Door 03). A confident invented citation caps the overall score regardless of prose quality — so a fast pass can never let a fabrication through as "strong."
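
A minimal sketch of how a gate dimension can cap an overall score. The rubric itself lives in Door 03 and is not reproduced here, so everything specific is an assumption: the 1 to 5 scale, the dimension names, the simple mean, and the cap value of 2 are hypothetical placeholders. The dimension scores and the fabrication flag are inputs supplied by the human reader; the code only does the arithmetic.

```python
# Hypothetical cap on an assumed 1-5 scale; the real value would come
# from the client's rubric and calibration rounds.
FABRICATION_CAP = 2


def overall_score(dimension_scores, fabrication_found):
    """Average human-assigned dimension scores, then apply the fabrication gate."""
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    mean = sum(dimension_scores.values()) / len(dimension_scores)
    if fabrication_found:
        # The gate: polished prose cannot lift a trace above the cap.
        return min(mean, FABRICATION_CAP)
    return mean


# Hypothetical dimension names, scored by the human reader.
scores = {"interpretation": 5, "prose": 5, "instruction_following": 4}

capped = overall_score(scores, fabrication_found=True)
uncapped = overall_score(scores, fabrication_found=False)
```

With these placeholder numbers, the same strong dimension scores average above 4 without a fabrication and are held at the cap with one.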

SG-3 · CALIBRATION

Calibrated, not solo

The rubric and failure modes would be calibrated against the client's guidelines and through inter-rater rounds, so my judgement tracks the team's standard, not just my taste.

SG-4 · DEFERRAL

I flag my edges

At the frontier of formal literary theory or specialist period scholarship I read as an informed generalist. Those traces get marked for specialist review rather than waved through.

SG-5 · TRACEABILITY

Every claim is sourced

Each trace records the exact prompt, the verbatim response and my reasoning steps — so any score can be audited and any disagreement resolved against the actual evidence.
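
One way to picture such a trace record, as a sketch only: the `Trace` class, its field names and the `is_done` rule are hypothetical and would be replaced by the client's actual trace template. The prompt and response strings are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Trace:
    """Hypothetical trace record: exact prompt, verbatim response, human reasoning."""
    trace_id: str
    prompt: str
    response: str
    reasoning_steps: List[str] = field(default_factory=list)
    score: Optional[int] = None
    read_in_full_by_human: bool = False

    def is_done(self) -> bool:
        # A trace counts as done only once a human has read it, scored it
        # and recorded at least one reasoning step.
        return (
            self.read_in_full_by_human
            and self.score is not None
            and len(self.reasoning_steps) > 0
        )


def to_audit_json(trace: Trace) -> str:
    """Serialise the full record so any score can be checked against the evidence."""
    return json.dumps(asdict(trace), ensure_ascii=False, sort_keys=True)


trace = Trace(
    trace_id="t-001",
    prompt="Placeholder prompt text",
    response="Placeholder model response",
)
done_before = trace.is_done()  # captured by tooling, not yet judged

# Filled in by the human reader after the close reading.
trace.read_in_full_by_human = True
trace.score = 2
trace.reasoning_steps.append("Placeholder note on where the reading fails")
done_after = trace.is_done()

audit_record = to_audit_json(trace)
```

Keeping the verbatim response and the reasoning steps in the same record is what lets a disagreement be settled against the evidence rather than from memory.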

SG-6 · METACOGNITION

Feedback shows the work

Every annotation spells out the interpretive steps — why a reading fails and what a strong one does instead — the explicit, metacognitive reasoning the brief asks for, written for a human reader.

The pitch in one line: automate the typing, never the judgement.

That's the whole system. Tooling clears the runway so the human close reading — the part Meridial is actually buying — gets all of me, at a pace that's useful rather than precious. You've now seen all four doors: the materials, the engine, the rubric, and the pipeline behind them.
