
Lesson 6: The Augmentation Step

Where Retrieval Meets Prompting

You’ve retrieved relevant chunks from your knowledge base. Now comes a step that’s easy to overlook but crucial to get right: augmentation. Augmentation is where retrieval meets prompting. You’re taking the user’s question, adding the retrieved context, and crafting a prompt that helps the AI generate a useful, grounded response. If you’ve worked through the prompt engineering curriculum, you already know that how you phrase things matters enormously. The augmentation step is where those skills become essential for RAG.

Core Concepts

The Anatomy of an Augmented Prompt

A basic augmented prompt has three components:
  1. Instructions: Tell the AI how to behave and how to use the provided context.
  2. Retrieved Context: The chunks retrieved from your knowledge base.
  3. User Question: What the user actually asked.
Here’s what a simple template might look like:
You are a helpful assistant. Use the following context to answer the user's question. If the context doesn't contain enough information to fully answer, say so.

Context:
{retrieved_chunks}

Question: {user_question}

Answer:
The structure matters. The AI reads top to bottom: instructions at the top prime its behavior, the context that follows stays fresh in “memory,” and the question at the end clearly signals what needs to be answered.
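In code, assembling this template is a small string-formatting step. Here’s a minimal sketch in Python; the function name and the exact instruction wording are illustrative, not a fixed API:

```python
def build_prompt(retrieved_chunks: list[str], user_question: str) -> str:
    """Assemble an augmented prompt: instructions, then context, then question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "You are a helpful assistant. Use the following context to answer "
        "the user's question. If the context doesn't contain enough "
        "information to fully answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_question}\n\n"
        "Answer:"
    )

prompt = build_prompt(
    ["Refunds are available within 30 days of purchase."],
    "What is the refund window?",
)
```

The resulting string preserves the instructions-context-question ordering described above and can be sent as-is to a chat or completion API.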

Instruction Design: Setting the Rules

The instruction section is where you establish how the AI should handle the task. Common elements include:
  • Role setting: “You are a customer support agent for Acme Corp.”
  • Context usage guidance: “Base your answer on the provided context. Do not use information from outside the context.”
  • Handling uncertainty: “If the context doesn’t contain enough information, say ‘I don’t have enough information to answer that fully.’”
  • Tone and format: “Answer in a friendly, professional tone. Use bullet points for lists.”
  • Constraints: “Do not make up information. Do not speculate beyond what the context supports.”
These instructions shape everything that follows. Without them, the AI might hallucinate, ignore the context, or respond in ways that don’t fit your use case.

Presenting Retrieved Context

How you present the retrieved chunks affects how well the AI uses them. Several approaches work.

Simple concatenation: Join all chunks with line breaks. Easy to implement, but it can get messy.
Context:
[Chunk 1 content]

[Chunk 2 content]

[Chunk 3 content]
Labeled chunks: Add markers to help the AI track sources.
Context:

[Source: Employee Handbook, Section 3.2]
[Chunk 1 content]

[Source: HR Policy Update 2024]
[Chunk 2 content]
Structured format: Use clear delimiters and organization.
Context:

---
Document: Refund Policy
Last Updated: January 2024
Content: [Chunk content]
---

---
Document: Customer FAQ
Last Updated: March 2024
Content: [Chunk content]
---
Labeled approaches help the AI cite sources and distinguish between different types of information.
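The labeled format is easy to generate if each retrieved chunk carries its metadata. A small sketch, assuming each chunk is a dict with `source` and `text` keys (the key names are an assumption, not a standard):

```python
def format_chunks(chunks: list[dict]) -> str:
    """Render retrieved chunks with source labels so the model can cite them."""
    blocks = []
    for chunk in chunks:
        # Each block starts with its source label, then the chunk text.
        blocks.append(f"[Source: {chunk['source']}]\n{chunk['text']}")
    return "Context:\n\n" + "\n\n".join(blocks)

context = format_chunks([
    {"source": "Employee Handbook, Section 3.2", "text": "PTO accrues monthly."},
    {"source": "HR Policy Update 2024", "text": "Carryover is capped at 5 days."},
])
```

The same loop extends naturally to the structured format: add fields like `last_updated` to each dict and emit them between `---` delimiters.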

Managing Context Length

Here’s a practical constraint: AI models have limited context windows. Every token in your prompt (instructions, context, question, AND the response) counts against that limit. This creates trade-offs: More context = More information for the AI to work with, but:
  • Higher cost (most APIs charge by token)
  • Risk of hitting context limits
  • Possible dilution of focus (too much noise)
Less context = Cheaper and more focused, but:
  • Might miss relevant information
  • Less comprehensive answers
Strategies for managing context length:
  • Compress or summarize: Instead of including full chunks, include summaries.
  • Prioritize ruthlessly: If you retrieved 10 chunks, maybe only include the top 5.
  • Truncate strategically: If a chunk is too long, include only the most relevant portion.
  • Use models with larger context windows: When available and cost-effective.
The right balance depends on your use case, budget, and quality requirements.
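The prioritization strategy can be sketched as a greedy selection under a token budget. This sketch approximates token counts as `len(text) // 4` (a rough rule of thumb for English text); a real system would use the model’s own tokenizer instead:

```python
def select_chunks(ranked_chunks: list[str], max_tokens: int) -> list[str]:
    """Greedily keep the highest-ranked chunks that fit the token budget.

    Assumes ranked_chunks is ordered best-first. Token counts are
    approximated as len(text) // 4; swap in the model's tokenizer
    for accurate budgeting.
    """
    selected, used = [], 0
    for chunk in ranked_chunks:
        cost = len(chunk) // 4
        if used + cost > max_tokens:
            continue  # this chunk doesn't fit; a smaller later one might
        selected.append(chunk)
        used += cost
    return selected

chunks = ["a" * 400, "b" * 400, "c" * 40]  # ~100, ~100, ~10 tokens
kept = select_chunks(chunks, max_tokens=120)
```

Here the second chunk is skipped because it would exceed the budget, while the cheap third chunk still makes it in, preserving the original ranking order.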

Instructing the AI on Context Usage

A critical aspect of augmentation is telling the AI how to use (and not misuse) the context. Without clear instructions, you might encounter:
  • Ignoring context: The AI answers from general knowledge instead of the provided information.
  • Overclaiming: The AI states things definitively that the context only implies or doesn’t support.
  • Mixing context with hallucination: The AI starts with real information but adds fabricated details.
Effective instructions address these risks:
Instructions:
- Answer ONLY based on the information provided in the context below
- If the context contradicts your general knowledge, defer to the context
- If you cannot answer from the context, say "Based on the available information, I cannot answer that question"
- Do not invent facts, statistics, or details not present in the context
- If the context contains conflicting information, acknowledge the conflict
These guardrails help keep responses grounded and honest.

Handling Multiple Chunks

When you retrieve multiple chunks, they might:
  • Overlap: Say the same thing in different words
  • Complement: Each provides different pieces of the answer
  • Conflict: Provide contradictory information
Your prompt template should help the AI navigate these situations:
The context below may contain multiple sources. If sources provide complementary information, synthesize them. If sources conflict, note the discrepancy and, if possible, indicate which source is more authoritative or recent.
This instruction prevents the AI from being confused by multiple inputs and helps it produce coherent responses.

Question Reformulation

Sometimes the user’s question can be improved before it reaches the AI. Common enhancements:
  • Adding specificity: If the user asks “What’s the policy?” and you know from context they’re looking at the refund page, expand it to “What’s the refund policy?”
  • Removing ambiguity: Clarifying pronouns or vague references based on conversation history.
  • Decomposing complex questions: Breaking “What’s the price and how do I sign up?” into sub-questions that can be addressed separately.
This is more advanced, but it can significantly improve response quality.
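To make the decomposition idea concrete, here is a deliberately naive sketch that splits a compound question on “ and ”. A production system would use an LLM call for reformulation; this heuristic only illustrates the shape of the output:

```python
def decompose_question(question: str) -> list[str]:
    """Naively split a compound question into sub-questions.

    Toy heuristic: splits on the literal word " and ". Real systems
    would delegate this to an LLM, which handles grammar and context.
    """
    parts = [p.strip(" ?") for p in question.rstrip("?").split(" and ")]
    # Re-capitalize each fragment and restore the question mark.
    return [p[0].upper() + p[1:] + "?" for p in parts if p]

subqs = decompose_question("What's the price and how do I sign up?")
```

Each sub-question can then be run through retrieval separately, and the answers combined in the final response.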

Try It Yourself

Exercise 1: Write a Prompt Template

Design a prompt template for a RAG system that answers questions about a company’s product documentation. Include:
  1. A role/persona for the AI
  2. Clear instructions on how to use the context
  3. Guidance on what to do when information is missing
  4. Placeholders for retrieved context and user question
Test your template mentally with a few example questions. Would it produce good results?

Exercise 2: Compare Template Approaches

Here are two different instruction sets. For each, consider what kind of responses it would produce.

Template A:
Answer the user's question using the context provided.

Context:
{context}

Question: {question}
Template B:
You are a helpful assistant for Acme Corporation. Your role is to help customers understand our products and policies.

Instructions:
- Only use information from the provided context to answer
- If the context doesn't contain enough information, say "I don't have that information in my available resources"
- Be concise but thorough
- If relevant, mention which document the information comes from

Context:
{context}

Customer Question: {question}

Your Response:
Which would you trust more for customer support? Why?

Exercise 3: Handle Conflicting Context

Imagine you retrieve these two chunks:

Chunk 1 (from 2022 policy): “Refunds are available within 14 days of purchase.”

Chunk 2 (from 2024 policy update): “Refunds are available within 30 days of purchase.”

How would you structure your prompt to help the AI handle this conflict? Write out the specific instructions you would include.

Common Pitfalls

Pitfall 1: No Instructions at All

Just throwing context and a question at the AI without any guidance leads to inconsistent, unpredictable responses. The fix: Always include clear instructions. Even a few sentences make a difference.

Pitfall 2: Instructions Too Vague

“Answer based on the context” is a start, but doesn’t address edge cases. What if the context is insufficient? What if it conflicts? The fix: Think through edge cases and address them explicitly in your instructions.

Pitfall 3: Overloading Context

Stuffing every retrieved chunk into the prompt, regardless of relevance or length, leads to bloated prompts that dilute focus and inflate costs. The fix: Be selective. Prioritize quality over quantity in what you include.

Pitfall 4: Ignoring Source Attribution

If the AI responds without indicating where information came from, users can’t verify the answer and trust erodes. The fix: Include instructions for the AI to reference sources. Use labeled chunks so the AI can cite them.

Level Up

Here’s a design challenge: Scenario: You’re building a RAG system for a medical information service. Users ask health questions, and the system retrieves from peer-reviewed medical literature. Constraints:
  • Responses must not be interpreted as personal medical advice
  • Sources must be clearly cited
  • Uncertainty must be acknowledged
  • Information should be accessible to non-medical users
Your task: Design a complete prompt template that addresses all these requirements. Include:
  1. An appropriate role/persona
  2. Detailed instructions on context usage and limitations
  3. Specific guidance on disclaimers and uncertainty
  4. A format for presenting retrieved context
  5. Instructions for generating the response
This exercise forces you to think carefully about how augmentation shapes responsible AI behavior.

Key Takeaway

Augmentation is where retrieval meets prompting. The quality of your augmented prompt directly impacts response quality. A good augmented prompt includes clear instructions on how to use the context, presents retrieved information in a well-organized way, and guides the AI on handling edge cases like missing information or conflicting sources. Think of it as setting up the AI for success by giving it exactly what it needs to generate a grounded, useful response.

What’s Next

You’ve prepared the perfect augmented prompt. Now the AI generates a response. In Lesson 7: The Generation Step, we’ll explore how the AI uses retrieved context to produce grounded responses. You’ll learn about staying faithful to source material, encouraging appropriate uncertainty, and making AI responses verifiable through citations.