Review & quality
The 4-layer review pipeline
Every PR runs the same gauntlet before a founder ever sees it.
The framing matters: AI is an intelligent context generator, not a judge. It summarises, flags concerns with evidence, and scores against structured criteria — but in V1 it never autonomously approves or rejects. The owner makes the final call.
The four layers
1. Automated validation: build · tests · lint · scope · security
2. Context assembly: task · criteria · diff · history
3. AI review: structured verdict via Claude
4. Risk scoring: auto-approve · owner · reviewer
Layer 1 — Automated validation
Build, tests, lint, type-check, security scan (Semgrep), file-scope validation, and the FORKE_SUBMISSION.md completeness check all run in a sandboxed container in about 30 seconds. Any failure is an auto-reject, and the specific reason is shown to you. No AI is called at this layer, so a failing build costs nothing, which makes it both the biggest cost saving and the fastest quality gate.
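The gating behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Forke's implementation: `CheckResult`, `run_pipeline`, and the shape of the returned dictionary are all hypothetical names chosen for the example. The point it shows is that the AI reviewer is only invoked when every automated check has passed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one automated check (hypothetical shape)."""
    name: str
    passed: bool
    reason: str = ""


def run_pipeline(
    checks: List[Callable[[], CheckResult]],
    ai_review: Callable[[], Dict],
) -> Dict:
    # Layer 1: run every automated check and collect the failures.
    failures = [r for r in (check() for check in checks) if not r.passed]
    if failures:
        # Auto-reject with the specific reasons; the AI reviewer is never called.
        return {
            "status": "auto-reject",
            "reasons": [f"{f.name}: {f.reason}" for f in failures],
        }
    # Only submissions that pass every check reach the AI review.
    return {"status": "validated", "verdict": ai_review()}
```

Passing the AI step in as a callable keeps the cost argument visible in the code: if any check fails, `ai_review` is simply never executed.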
Layer 2 — Context assembly
Forke builds the AI prompt from exactly what matters: the task description and acceptance criteria, the git diff (changed files only — never the whole repo), repo patterns detected when the mirror was created, the automated results, and your trust score and history. Good context is what makes the verdict meaningful.
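As a rough sketch of what this assembly step might look like, the example below gathers the inputs listed above into one prompt string. The `ReviewContext` fields, the `build_prompt` function, the section titles, and the numeric trust score are assumptions made for illustration; the actual prompt format is not documented here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class ReviewContext:
    """Inputs to the AI prompt (hypothetical field names)."""
    task_description: str
    acceptance_criteria: List[str]
    diff: str                        # changed files only, never the whole repo
    repo_patterns: List[str]         # detected when the mirror was created
    automated_results: Dict[str, bool]
    trust_score: float               # scale is an assumption
    history: List[str]


def _bullets(items: Iterable[str]) -> str:
    # Render items as a bullet list, with a placeholder when empty.
    return "\n".join(f"- {item}" for item in items) or "- (none)"


def build_prompt(ctx: ReviewContext) -> str:
    checks = _bullets(
        f"{name}: {'pass' if ok else 'fail'}"
        for name, ok in ctx.automated_results.items()
    )
    contributor = f"trust score: {ctx.trust_score}\n{_bullets(ctx.history)}"
    sections = [
        ("Task", ctx.task_description),
        ("Acceptance criteria", _bullets(ctx.acceptance_criteria)),
        ("Diff", ctx.diff),
        ("Repo patterns", _bullets(ctx.repo_patterns)),
        ("Automated results", checks),
        ("Contributor", contributor),
    ]
    # One titled section per input, separated by blank lines.
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```

Keeping each input in its own titled section makes it easy to see that nothing beyond the listed items is sent to the model.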
Layer 3 — AI review
Claude reads the assembled context and returns a strict JSON verdict: task-completion score (0–10), code-quality score (0–10), architecture fit (0–10), security status, up to five concrete issues, up to three positives, a plain-English summary, a recommendation, and a confidence score. Low confidence escalates to a human regardless of the other scores.
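To make the verdict constraints concrete, here is a minimal validation sketch. The JSON key names, the 0 to 1 confidence scale, and the `CONFIDENCE_FLOOR` threshold are assumptions for illustration only; the source specifies the fields and their limits but not their exact names or the escalation cutoff.

```python
import json
from typing import Any, Dict

# Hypothetical key names for the three 0-10 scores.
SCORE_FIELDS = ("task_completion", "code_quality", "architecture_fit")
# Hypothetical threshold on an assumed 0-1 confidence scale.
CONFIDENCE_FLOOR = 0.6


def parse_verdict(raw: str) -> Dict[str, Any]:
    """Parse the model's JSON reply and enforce the documented limits."""
    verdict = json.loads(raw)
    for name in SCORE_FIELDS:
        score = verdict[name]
        if not 0 <= score <= 10:
            raise ValueError(f"{name} must be in 0-10, got {score}")
    if len(verdict["issues"]) > 5:
        raise ValueError("at most five issues allowed")
    if len(verdict["positives"]) > 3:
        raise ValueError("at most three positives allowed")
    return verdict


def needs_human(verdict: Dict[str, Any]) -> bool:
    # Low confidence escalates regardless of the other scores.
    return verdict["confidence"] < CONFIDENCE_FLOOR
```

Note that `needs_human` looks only at confidence, so a verdict with high scores across the board is still escalated when the model is unsure.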
Layer 4 — Risk scoring
A composite 0–100 score, computed from the AI verdict, task criticality, and your history, decides routing:
| Risk score | Routing |
|---|---|
| 0–30 | Auto-approve eligible (low-risk, simple tasks) |
| 31–70 | Owner review — verdict card shown, owner decides |
| 71–100 | Trusted reviewer queue first, then owner |

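The routing table above maps directly onto a small lookup function. The sketch below uses the thresholds from the table; the function name `route` and the returned labels are hypothetical, and how the composite score itself is weighted is not covered here.

```python
def route(risk_score: int) -> str:
    """Map a composite 0-100 risk score to a review route."""
    if not 0 <= risk_score <= 100:
        raise ValueError(f"risk score must be in 0-100, got {risk_score}")
    if risk_score <= 30:
        return "auto-approve eligible"
    if risk_score <= 70:
        return "owner review"
    return "trusted reviewer queue, then owner"
```

For example, `route(30)` falls in the lowest band, while `route(31)` goes to owner review.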