
Knowledge Base Optimization: The RAG Foundation

For custom GPTs and Gems, the quality of the "Knowledge Base" (the uploaded files) often matters more than the instructions, because the model can only ground its answers in what those files make easy to find. In this lesson, we learn how to architect high-quality knowledge bases that reduce hallucinations and support technically detailed answers.

🏗️ The Knowledge Optimization Hierarchy

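  1. File Format: Prefer .md or .txt over .pdf. PDF layouts (multi-column pages, tables, repeated headers and footers) often extract as scrambled or out-of-order text, which makes the content harder for the model to use.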
  2. Chunking: Break large documents into smaller, thematic files (e.g., pricing_v2.md, onboarding_flow.md).
  3. Metadata Tagging: Use consistent headers and tags within each file so the model can match a question to the relevant section.
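
The chunking step above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: it assumes the large document is already Markdown and that each top-level `# ` header marks one theme. The function names (`slugify`, `split_by_top_level_header`, `write_chunks`) are hypothetical.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a header title into a lowercase, underscore-separated file stem."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def split_by_top_level_header(text: str) -> dict[str, str]:
    """Split a Markdown document into one chunk per top-level '# ' header.

    Returns a mapping of file name -> chunk text. Assumptions of this sketch:
    text before the first header is ignored, and two headers with the same
    title would overwrite each other.
    """
    chunks: dict[str, str] = {}
    current_name = None
    current_lines: list[str] = []
    for line in text.splitlines():
        if line.startswith("# "):
            # A new theme starts: store the previous chunk, if any.
            if current_name is not None:
                chunks[current_name] = "\n".join(current_lines).strip() + "\n"
            current_name = slugify(line[2:]) + ".md"
            current_lines = [line]
        elif current_name is not None:
            current_lines.append(line)
    if current_name is not None:
        chunks[current_name] = "\n".join(current_lines).strip() + "\n"
    return chunks


def write_chunks(chunks: dict[str, str], out_dir: str) -> None:
    """Write each chunk to its own file inside out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, body in chunks.items():
        (out / name).write_text(body, encoding="utf-8")
```

For example, a document with the headers "# Pricing v2" and "# Onboarding Flow" would yield pricing_v2.md and onboarding_flow.md, matching the naming style in step 2.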

🛠️ Technical Snippet: Structural Markdown for Knowledge

# MODULE: CRM_INTEGRATION
## SUB-TASK: API_SYNC
Description: Logic for syncing leads from Typeform to HubSpot.
Logic Steps:
1. Verify email via Hunter.io.
2. If verified, create contact in HubSpot.
3. If score > 8, create 'High Priority' task.
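
One way to see why this header pattern helps is to index a file by its MODULE and SUB-TASK headers. The sketch below is illustrative only: `index_sections` is a hypothetical helper, and it assumes headers follow the exact `# MODULE:` and `## SUB-TASK:` prefixes shown above. It says nothing about how any particular platform parses uploads internally.

```python
def index_sections(text: str) -> dict[tuple[str, str], str]:
    """Map (module, sub_task) -> section body for the header pattern above."""
    index: dict[tuple[str, str], list[str]] = {}
    module, sub_task = "", ""
    for line in text.splitlines():
        if line.startswith("# MODULE:"):
            module = line.split(":", 1)[1].strip()
            sub_task = ""  # a new module resets the current sub-task
        elif line.startswith("## SUB-TASK:"):
            sub_task = line.split(":", 1)[1].strip()
        elif module and line.strip():
            # Non-empty body lines are stored under the current headers.
            index.setdefault((module, sub_task), []).append(line.strip())
    return {key: "\n".join(lines) for key, lines in index.items()}
```

With the snippet above as input, the body would be stored under the key `("CRM_INTEGRATION", "API_SYNC")`, so a question about API sync maps to exactly one clearly labeled section.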

🔍 Nuance: Reference Anchoring

When uploading a knowledge base, always add this instruction to your Gem or custom GPT: "When providing an answer based on the knowledge base, always cite the specific file and header you used. If the answer is not in the files, state 'Data Not Found' rather than guessing."
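
If you save model answers while testing, a small script can check whether the anchoring instruction was followed. This is a rough sketch under one loud assumption: that you also ask the model to cite in the form `[file_name.md > Header]`. That format, the `CITATION` pattern, and the `check_answer` function are hypothetical and not part of any platform.

```python
import re

NOT_FOUND = "Data Not Found"

# Assumed citation format (not a platform standard): [file_name.md > Header Text]
CITATION = re.compile(r"\[([\w.\-]+\.(?:md|txt))\s*>\s*([^\]]+)\]")


def check_answer(answer: str, known_files: set[str]) -> str:
    """Classify an answer as 'not_found', 'cited', 'unknown_file', or 'uncited'."""
    if NOT_FOUND.lower() in answer.lower():
        return "not_found"
    citations = CITATION.findall(answer)
    if not citations:
        return "uncited"  # an answer with no file/header reference at all
    if all(file_name in known_files for file_name, _header in citations):
        return "cited"
    return "unknown_file"  # cites a file that was never uploaded
```

An "uncited" or "unknown_file" result is a signal to inspect that answer by hand, since it may be a guess rather than something drawn from the files.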


⚡ Practice Lab: The Hallucination Test

  1. Upload: Create a text file with 5 "Fake" business rules (e.g., "We offer 90% discounts on Fridays").
  2. Query: Ask the model about your discount policy.
  3. Refactor: Rewrite the file using the Structural Markdown pattern above and rerun the query.
  4. Result: Compare the answers from both runs and note whether the structured file produces more accurate citations.
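
For the Refactor step, the conversion from flat rules to the structural pattern can be scripted. The following is a minimal sketch; `to_structured_markdown`, the module name `BUSINESS_RULES`, and the sub-task name `FRIDAY_DISCOUNT` are made up for illustration.

```python
def to_structured_markdown(module: str, rules: dict[str, str]) -> str:
    """Render rules as one MODULE header with a SUB-TASK section per rule."""
    lines = [f"# MODULE: {module}"]
    for sub_task, description in rules.items():
        lines.append(f"## SUB-TASK: {sub_task}")
        lines.append(f"Description: {description}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical fake rule, taken from the example in step 1.
fake_rules = {"FRIDAY_DISCOUNT": "We offer 90% discounts on Fridays."}
print(to_structured_markdown("BUSINESS_RULES", fake_rules))
```

Giving each fake rule its own named section makes it easy to tell, in the rerun, whether the model cited the right header.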

📝 Homework: The Agency Wiki

Build a 5-page Knowledge Base for your growth agency. Include: (1) pricing tiers, (2) tech stack, and (3) Standard Operating Procedures (SOPs). Verify that your custom Gem can answer complex "What if" questions using this data.