RAG agent

Retrieval-augmented question answering with citation enforcement and recall/precision eval.

When to use

Use this blueprint when your project answers questions grounded in a document corpus (docs, knowledge base, manuals, internal wiki) and every claim the agent makes must cite a retrieved chunk.

What gets generated

| File | Role |
| --- | --- |
| AGENTS.md | RAG-specific entry document (loop, conventions, definition of done) |
| SOUL.md | "Careful research assistant" voice |
| TOOLS.md | Recommended vector stores + retrieval MCP servers |
| MEMORY.md | Three-layer memory (session / corpus / skill-derived) |
| SKILLS/chunk-and-embed/ | Ingestion + stable chunk_id assignment |
| SKILLS/retrieve-and-rerank/ | Top-k retrieval + cross-encoder rerank |
| SKILLS/answer-with-citations/ | Sentence-level citation enforcement |
| SKILLS/eval-recall-precision/ | Eval set runner |
| scripts/test_task.sh | Sample question runner |
| scripts/verify_output.py | Citation + schema check |

Validators

| Validator | What it checks |
| --- | --- |
| structure | Every blueprint-generated file is present and parses |
| citations | Every inline [chunk_id] matches an entry in citations[] (both directions) |
| schema | Sample answer JSON validates against the Pydantic shape |
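
The "both directions" rule of the citations validator can be sketched as a set comparison. This is a minimal illustration, not the contents of scripts/verify_output.py: the function name `check_citations`, the answer shape (a text string plus a list of dicts with a `chunk_id` key), and the regex for inline markers are all assumptions.

```python
import re

# Assumed marker shape: [chunk_id], where the id uses letters, digits, and _ . : -
CITATION_RE = re.compile(r"\[([A-Za-z0-9_.:-]+)\]")


def check_citations(answer_text: str, citations: list[dict]) -> dict:
    """Compare inline [chunk_id] markers with the citations[] list in both directions."""
    inline_ids = set(CITATION_RE.findall(answer_text))
    listed_ids = {entry["chunk_id"] for entry in citations}
    return {
        # Cited in the text but absent from citations[]
        "missing_from_list": sorted(inline_ids - listed_ids),
        # Present in citations[] but never cited in the text
        "never_cited_inline": sorted(listed_ids - inline_ids),
        "ok": inline_ids == listed_ids,
    }


# Hypothetical sample answer, for illustration only.
sample_text = "Routers are registered on the app [chunk-a1]. Settings load from env [chunk-b2]."
sample_citations = [{"chunk_id": "chunk-a1"}, {"chunk_id": "chunk-c3"}]
report = check_citations(sample_text, sample_citations)
```

With this sample, `report` flags `chunk-b2` as cited but unlisted and `chunk-c3` as listed but never cited. The regex would also match any other bracketed token in the answer, so a real check would likely need a stricter marker format.
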

Recommended MCP servers

  • filesystem: read corpus + write chunk index
  • fetch: re-verify URL citations
  • postgres: conversation + retrieval log persistence
  • qdrant: vector store (>100k chunks)
  • chroma: vector store (prototypes, <100k chunks)

Memory schemas

  • conversation — session messages + retrieved_chunks log
  • rag-chunks — persistent chunk records (chunk_id, source, text, embedding, metadata)

Eval

eval/questions.yaml ships with 4 sample questions and the thresholds min_recall: 0.7 and min_precision: 0.5. Replace the sample questions with corpus-specific ones before shipping.
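
To make the thresholds concrete, here is a sketch of how recall and precision over retrieved chunk ids could be scored against them. The `retrieved` and `expected` keys, the function names, and the choice to average across questions are assumptions; the actual eval runner may gate per question or use a different data shape.

```python
def recall_precision(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Recall = hits / relevant chunks; precision = hits / retrieved chunks."""
    retrieved_set = set(retrieved)
    hits = retrieved_set & relevant
    recall = len(hits) / len(relevant) if relevant else 1.0
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    return recall, precision


def evaluate(results: list[dict], min_recall: float = 0.7,
             min_precision: float = 0.5) -> dict:
    """Average both metrics over all questions and compare them with the thresholds."""
    if not results:
        raise ValueError("no eval results to score")
    scores = [recall_precision(r["retrieved"], set(r["expected"])) for r in results]
    mean_recall = sum(s[0] for s in scores) / len(scores)
    mean_precision = sum(s[1] for s in scores) / len(scores)
    return {
        "recall": mean_recall,
        "precision": mean_precision,
        "passed": mean_recall >= min_recall and mean_precision >= min_precision,
    }


# Hypothetical per-question results: retrieved chunk ids vs. expected chunk ids.
sample_results = [
    {"retrieved": ["a", "b"], "expected": ["a"]},
    {"retrieved": ["c", "d", "e", "f"], "expected": ["c", "d", "x"]},
]
summary = evaluate(sample_results)
```

In this sample the first question scores recall 1.0 and precision 0.5, the second recall 2/3 and precision 0.5, so the averaged run clears both default thresholds.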

Hero demo

The FastAPI + RAG cookbook shows this blueprint applied to fastapi/full-stack-fastapi-template.