ConductGene Swarm · v0.5
UCWS Singapore 2026 · AGENT track · dual submission with AttestRWA (APPLICATION)
Human-approved AI conduct QA
Multi-agent review of call transcripts with evidence-grounded citations. Supervisor corrections become rollbackable Policy Genes — institutional memory regulators can audit.
Problem → solution
QA teams repeat the same corrections → Policy Genes preserve supervisor-approved patterns
LLMs hallucinate policy → Evidence-grounded citations + abstain when unsure
Regulators need accountability → Supervisor approval, audit trail, rollback
Supervisor-approved conduct QA with Policy Genes — auditable institutional memory, not black-box scoring.
How the swarm works
Three agents debate each transcript; the supervisor approves before a Policy Gene is stored.
Prosecutor
Surfaces policy violations with evidence-grounded citations from the knowledge base.
Defender
Challenges weak citations and flags where the transcript does not support a verdict.
Arbiter
Synthesizes a verdict of pass, needs review, or fail. A human operator must approve before any gene is stored.
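The three roles above can be sketched as a small pipeline. This is a minimal illustration, not the project's implementation: the `Finding` type, the verbatim-quote grounding check, the arbitration rule, and the `POL-001` / `POL-002` identifiers are all assumptions made for the example. The prosecutor's output is passed in as a plain list, standing in for whatever a model would produce.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Finding:
    policy_id: str  # hypothetical knowledge-base identifier
    quote: str      # transcript text cited as evidence


def defend(transcript: str, findings: List[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Defender: uphold a finding only if its quote appears verbatim in the transcript."""
    upheld, challenged = [], []
    haystack = transcript.lower()
    for finding in findings:
        if finding.quote.lower() in haystack:
            upheld.append(finding)
        else:
            challenged.append(finding)
    return upheld, challenged


def arbitrate(upheld: List[Finding], challenged: List[Finding]) -> str:
    """Arbiter (assumed rule): grounded evidence fails, weak-only evidence goes to review."""
    if upheld:
        return "fail"
    if challenged:
        return "needs_review"
    return "pass"


def review(transcript: str, prosecutor_findings: List[Finding]) -> Dict[str, object]:
    """Run defender then arbiter over the prosecutor's findings."""
    upheld, challenged = defend(transcript, prosecutor_findings)
    return {
        "verdict": arbitrate(upheld, challenged),
        "upheld": upheld,
        "challenged": challenged,
        "gene_stored": False,  # nothing is stored without supervisor approval
    }


transcript = "Agent: I guarantee this fund will never lose money."
grounded = Finding("POL-001", "guarantee this fund will never lose money")
ungrounded = Finding("POL-002", "you must sign today")

print(review(transcript, [grounded])["verdict"])    # fail
print(review(transcript, [ungrounded])["verdict"])  # needs_review
print(review(transcript, [])["verdict"])            # pass
```

Routing ungrounded findings to review rather than dropping them keeps the human in the loop whenever the evidence is weak, which matches the operator-in-the-loop framing above.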
Policy Genes
When a supervisor overrides a verdict, the system stores a supervisor-approved Policy Gene — auditable, versioned, rollbackable. The swarm reuses approved patterns on similar cases instead of re-litigating the same correction every week.
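One way to picture "auditable, versioned, rollbackable" is an append-only store with an active-version pointer. The sketch below is an assumption-laden illustration: the `Gene` and `GeneStore` names, their fields, and the in-memory storage are hypothetical and are not taken from the project. Only the `escalation_offered` pattern name comes from the eval results.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    pattern: str      # case pattern the gene applies to
    version: int      # monotonically increasing per pattern
    verdict: str      # supervisor-approved verdict
    approved_by: str  # supervisor identifier (hypothetical field)


class GeneStore:
    """Append-only gene history; rollback moves a pointer and never deletes."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Gene]] = {}
        self._active: Dict[str, int] = {}  # index of active version, -1 if none
        self.audit_log: List[Tuple[str, str, int, str]] = []

    def approve(self, pattern: str, verdict: str, supervisor: str) -> Gene:
        """Store a new supervisor-approved version and make it active."""
        versions = self._versions.setdefault(pattern, [])
        gene = Gene(pattern, len(versions) + 1, verdict, supervisor)
        versions.append(gene)
        self._active[pattern] = len(versions) - 1
        self.audit_log.append(("approve", pattern, gene.version, supervisor))
        return gene

    def active(self, pattern: str) -> Optional[Gene]:
        """Return the gene currently in force for a pattern, if any."""
        idx = self._active.get(pattern, -1)
        return self._versions[pattern][idx] if idx >= 0 else None

    def rollback(self, pattern: str, supervisor: str) -> Optional[Gene]:
        """Deactivate the current version and fall back to the previous one."""
        idx = self._active.get(pattern, -1)
        if idx < 0:
            raise KeyError(f"no active gene for pattern {pattern!r}")
        rolled_back = self._versions[pattern][idx]
        self._active[pattern] = idx - 1
        self.audit_log.append(("rollback", pattern, rolled_back.version, supervisor))
        return self.active(pattern)


store = GeneStore()
store.approve("escalation_offered", "pass", "supervisor_a")
store.approve("escalation_offered", "needs_review", "supervisor_b")
store.rollback("escalation_offered", "supervisor_a")
print(store.active("escalation_offered").verdict)  # pass
```

Keeping every version and logging rollbacks as events, rather than deleting history, is what lets an auditor reconstruct which gene was in force at any point.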
Proof
Held-out eval — numbers from the synthetic suite.
Scenario pass rate
16 / 16
Synthetic held-out eval suite — 100% pass in mock mode.
Citation coverage
100%
Every verdict traceable to policy sources in the KB.
Abstain rate
~6.3% (1 of 16)
CASE-007 abstains explicitly instead of guessing.
Gene learning
CASE-002 → CASE-005
escalation_offered: needs_review → pass after the supervisor-approved gene is applied.
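The abstain rate above follows from simple arithmetic, assuming CASE-007 is the only abstention among the 16 scenarios. The helper name `abstain_rate` is hypothetical.

```python
def abstain_rate(total_scenarios: int, abstained: int) -> float:
    """Fraction of scenarios where the swarm declined to issue a verdict."""
    if total_scenarios <= 0:
        raise ValueError("total_scenarios must be positive")
    return abstained / total_scenarios


# Assumption: one abstention (CASE-007) out of 16 scenarios.
rate = abstain_rate(16, 1)
print(f"{rate:.2%}")  # 6.25%
```

The exact value is 6.25%, which the page reports as roughly 6.3%.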
Try the stack
A five-panel Streamlit demo, a FastAPI service on port 8090, and a mock mode for CI. The live stack (Qdrant + BGE reranking) is documented in the repo.