System Architecture Overview

Purpose

PMFKit v1 is a structured AI diagnostic engine: URL → evidence → PMF stage → recommendations. This document describes the high-level flow, the Evidence Layer, orchestration, the single entry point, and the dependency map.

High-Level Flow

Crawl: Discovery → product crawl + UI capture (run in parallel) → optional enrichment. The output is persisted to company_crawls.
Evidence Layer: Crawl output is normalized into tagged evidence categories (see evidence-taxonomy.md). It is a single artifact; agents draw on it either directly (CMO, CDO) or through upstream reports (BD, COO).
CMO: Consumes Evidence Layer (and legacy page evidence for the current pipeline). Produces Report 1 (discovery verdict, component analyses, DRL).
BD: Consumes Report 1 only. Produces Report 2 (context, quotes, cases).
COO: Consumes Report 1 + Report 2. Produces Report 3 (strategies, next actions, one thing this week).
CDO: Consumes Evidence Layer + UI captures. Produces design audit (UX blockers, design system, rewrite plan).
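
The consumption rules above can be expressed as function signatures, so that each agent can only read what the flow allows. The following is a minimal sketch, not the actual agent code: every type and field name is a hypothetical placeholder, the agents are synchronous stubs (the real ones are presumably asynchronous), and the legacy page evidence that CMO also reads in the current pipeline is left out.

```typescript
// All shapes here are hypothetical placeholders, not the real PMFKit types.
interface EvidenceLayer { categories: Record<string, string[]>; }
interface UiCapture { pageUrl: string; }
interface Report1 { discoveryVerdict: { stage: string; recommendation: string }; }
interface Report2 { quotes: string[]; }
interface Report3 { oneThingThisWeek: string; }
interface DesignAudit { uxBlockers: string[]; }

// Each signature encodes what the agent is allowed to read.
interface Agents {
  cmo(evidence: EvidenceLayer): Report1;
  bd(report1: Report1): Report2;
  coo(report1: Report1, report2: Report2): Report3;
  cdo(evidence: EvidenceLayer, captures: UiCapture[]): DesignAudit;
}

interface FlowResult {
  report1: Report1;
  report2: Report2;
  report3: Report3;
  designAudit: DesignAudit;
}

function runFlow(evidence: EvidenceLayer, captures: UiCapture[], agents: Agents): FlowResult {
  const report1 = agents.cmo(evidence);          // Evidence Layer -> Report 1
  const report2 = agents.bd(report1);            // Report 1 only -> Report 2
  const report3 = agents.coo(report1, report2);  // Report 1 + Report 2 -> Report 3
  const designAudit = agents.cdo(evidence, captures); // Evidence Layer + UI captures
  return { report1, report2, report3, designAudit };
}

// Stub agents with canned output, only to show the wiring end to end.
const stubAgents: Agents = {
  cmo: (evidence) => ({
    discoveryVerdict: {
      stage: "example-stage",
      recommendation: `Reviewed ${Object.keys(evidence.categories).length} evidence categories`,
    },
  }),
  bd: (report1) => ({ quotes: [`Context for stage ${report1.discoveryVerdict.stage}`] }),
  coo: (report1, report2) => ({
    oneThingThisWeek: `Act on ${report1.discoveryVerdict.stage} using ${report2.quotes.length} quote(s)`,
  }),
  cdo: (_evidence, captures) => ({ uxBlockers: captures.map((c) => `Review ${c.pageUrl}`) }),
};

const flowResult = runFlow(
  { categories: { pricing: ["Free tier listed"], onboarding: ["Signup form"] } },
  [{ pageUrl: "https://example.com/" }],
  stubAgents,
);
```

Because BD receives only Report 1 as an argument, it cannot reach into the Evidence Layer by accident; the type checker enforces the same boundaries the flow describes.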

Conflict resolution: The stage and recommendation are owned by CMO (Report 1 discoveryVerdict). No other agent overwrites them, and orchestration passes the Report 1 stage through without overriding it.
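
One way to picture this ownership rule is a resolver that always returns the Report 1 verdict, whatever downstream agents say. This is a sketch under assumptions: the doc does not state that downstream outputs carry a stage suggestion at all, so `suggestedStage`, `DownstreamOutput`, `resolveVerdict`, and the stage strings are hypothetical names used only for illustration.

```typescript
// Hypothetical shapes; stage values such as "pre-pmf" are illustrative only.
interface Verdict { stage: string; recommendation: string; }
interface CmoReport { discoveryVerdict: Verdict; }

// Assumption: a downstream output might carry its own opinion about the stage.
interface DownstreamOutput {
  agent: "bd" | "coo" | "cdo";
  suggestedStage?: string;
}

interface ResolvedVerdict extends Verdict {
  // Agents whose differing suggestion was ignored, kept for logging.
  overruled: string[];
}

function resolveVerdict(report1: CmoReport, downstream: DownstreamOutput[]): ResolvedVerdict {
  // Stage and recommendation come from Report 1 and nowhere else.
  const { stage, recommendation } = report1.discoveryVerdict;
  const overruled = downstream
    .filter((d) => d.suggestedStage !== undefined && d.suggestedStage !== stage)
    .map((d) => d.agent);
  return { stage, recommendation, overruled };
}

const resolved = resolveVerdict(
  { discoveryVerdict: { stage: "pre-pmf", recommendation: "Narrow the target segment" } },
  [{ agent: "coo", suggestedStage: "scaling" }, { agent: "bd" }],
);
```

Recording which agents disagreed, rather than silently dropping the suggestion, is a design choice of this sketch; it keeps the rule auditable without weakening it.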

Single Entry Point

runDiagnosis(input) in lib/pipeline/run-diagnosis.ts is the canonical orchestration entry point.
Input: url (or crawlId/projectId), userId, supabase, addLog, and the flags includeBd, includeCoo, includeCdo, runCrawlFirst, enrichmentTier.
Output: DiagnosisResult, containing stage, recommendation, finalDiagnosis (canonical FinalDiagnosisV1), evidenceLayer, report1, report2?, report3?, cdo?, crawlId, and executionLog.
Canonical contract: The FinalDiagnosisV1 shape is the single contract for the API, export, and the UI single-diagnosis view (see lib/schemas/final-diagnosis-v1.ts and lib/pipeline/to-final-diagnosis-v1.ts).
Public extract: generatePublicSummary(finalDiagnosis) in lib/public-summary.ts produces a compressed, tweet-ready summary: PMF stage, confidence, primary risk, one fix this week, and a short explanation (max 120 words). It is a transformation only, not a separate engine.
Existing API routes (POST /api/roles/crawl, POST /api/roles/cmo, etc.) can call runDiagnosis or continue to call the shared crawl and report-from-crawl modules directly. Either way, the contract is that the data they pass is the same Evidence Layer / crawl output.
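
To illustrate what "transformation only" means for the public extract, here is a minimal sketch of a summary function that copies fields and caps the explanation at 120 words. It is not the code in lib/public-summary.ts: the field names on `FinalDiagnosisLike`, the numeric type of `confidence`, the helper `capWords`, and the trailing "..." on truncation are all assumptions.

```typescript
// Hypothetical subset of FinalDiagnosisV1; the real schema is in
// lib/schemas/final-diagnosis-v1.ts and may use different field names.
interface FinalDiagnosisLike {
  stage: string;
  confidence: number;
  primaryRisk: string;
  oneFixThisWeek: string;
  explanation: string;
}

interface PublicSummarySketch {
  stage: string;
  confidence: number;
  primaryRisk: string;
  oneFixThisWeek: string;
  explanation: string;
}

const MAX_EXPLANATION_WORDS = 120;

// Keep at most maxWords whitespace-separated words; mark truncation with "...".
function capWords(text: string, maxWords: number): string {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length <= maxWords) return words.join(" ");
  return words.slice(0, maxWords).join(" ") + "...";
}

// Pure function: no crawling, no model calls, only field selection and trimming.
function generatePublicSummarySketch(diagnosis: FinalDiagnosisLike): PublicSummarySketch {
  return {
    stage: diagnosis.stage,
    confidence: diagnosis.confidence,
    primaryRisk: diagnosis.primaryRisk,
    oneFixThisWeek: diagnosis.oneFixThisWeek,
    explanation: capWords(diagnosis.explanation, MAX_EXPLANATION_WORDS),
  };
}

const sampleSummary = generatePublicSummarySketch({
  stage: "example-stage",
  confidence: 0.7,
  primaryRisk: "Unclear target customer",
  oneFixThisWeek: "Rewrite the homepage headline",
  explanation: Array.from({ length: 200 }, () => "word").join(" "),
});
```

Since the function depends only on its argument, the same diagnosis always yields the same summary, which is what makes it safe to call from the API, export, and UI paths alike.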

Dependency Map

Artifact	Depends on
Crawl	—
Evidence Layer	Crawl
Report 1 (CMO)	Evidence Layer
Report 2 (BD)	Report 1
Report 3 (COO)	Report 1, Report 2
CDO report	Evidence Layer, UI captures

This map is defined in code as DIAGNOSIS_DEPENDENCY_MAP in lib/pipeline/run-diagnosis.ts.
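
As an illustration of how such a map can drive orchestration, the sketch below mirrors the table as a plain object and derives a valid execution order with a depth-first topological sort. The key names, the `DependencyMap` type, and `executionOrder` are hypothetical; the real DIAGNOSIS_DEPENDENCY_MAP may be shaped differently. UI captures have no row of their own in the table, so the sketch treats them as a leaf with no dependencies.

```typescript
// Hypothetical mirror of the dependency table; identifiers are assumptions.
type DependencyMap = { readonly [artifact: string]: readonly string[] | undefined };

const DEPENDENCY_MAP_SKETCH: DependencyMap = {
  crawl: [],
  evidenceLayer: ["crawl"],
  report1: ["evidenceLayer"],
  report2: ["report1"],
  report3: ["report1", "report2"],
  cdoReport: ["evidenceLayer", "uiCaptures"],
};

// Depth-first topological sort: every artifact appears after its dependencies.
function executionOrder(map: DependencyMap): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (node: string): void => {
    const seen = state.get(node);
    if (seen === "done") return;
    if (seen === "visiting") throw new Error(`Dependency cycle at ${node}`);
    state.set(node, "visiting");
    // Names without their own entry (e.g. uiCaptures) are treated as leaves.
    for (const dep of map[node] ?? []) visit(dep);
    state.set(node, "done");
    order.push(node);
  };

  Object.keys(map).forEach(visit);
  return order;
}

const artifactOrder = executionOrder(DEPENDENCY_MAP_SKETCH);
```

Keeping the dependencies as data rather than as hard-coded call order means a cycle or a missing prerequisite can be detected before any agent runs.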

Key Modules

Area	Path
Evidence Layer	`core/evidence/` (types, normalize, index)
Crawl	`lib/pipeline/run-shared-crawl.ts`, `core/crawlers/`
Report from crawl (CMO path)	`lib/pipeline/run-report-from-crawl.ts`
Orchestration	`lib/pipeline/run-diagnosis.ts`
Final diagnosis (canonical output)	`lib/schemas/final-diagnosis-v1.ts`, `lib/pipeline/to-final-diagnosis-v1.ts`
Public summary (tweet-ready extract)	`lib/public-summary.ts`
Agents	`core/agents/` (cmo-agent, cdo-agent, bd-agent, coo-agent)
Persistence	`lib/persistence/company-crawls.ts`, `lib/persistence/snapshots.ts`