
How Do I Tell AI What I Need It To Do?

So now we know what AI models are and how they’re trained. But how do we actually use them? This section covers how we interact with models through prompts, how generation works behind the scenes, and what affects the quality of a model’s response. At Gloo AI, we believe one of the most empowering things you can learn is how to write clear prompts and understand what influences model behavior.

Prompt

What it means: A prompt is the input or question you give to an AI model. It tells the model what to generate or how to respond. Examples:
  • “Summarize this paragraph.”
  • “Write a prayer for a new mother.”
  • “What are 5 ways a church could use data in outreach?”
Why it matters: The quality of the prompt heavily influences the quality of the output. Prompting well is often called “prompt engineering.” How it shows up in Gloo: Prompts are used throughout Gloo products, especially in Chat for Teams. When staff ask questions, write messages, or request summaries, the prompt determines how the model retrieves and uses their organization’s content through the Data Engine.
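In most chat-style APIs, a prompt is simply text wrapped in a "message" with a role. A minimal sketch of that structure (the helper function is illustrative, not part of any Gloo API):

```python
def build_prompt_message(user_text: str) -> dict:
    """Wrap user text in the message format most chat APIs expect."""
    return {"role": "user", "content": user_text}

msg = build_prompt_message("Write a prayer for a new mother.")
print(msg)
# {'role': 'user', 'content': 'Write a prayer for a new mother.'}
```

The same structure carries every example prompt above; only the `content` string changes.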

System Prompt

What it means: A system prompt is a special instruction given to the model to set its overall behavior, tone, or rules before it interacts with a user. Use case: A system prompt might say: “You are a friendly assistant who always answers concisely and avoids strong opinions.” This tells the model how to behave for the entire session. Note: Users don’t always see the system prompt, but it often shapes how the model responds. How it shows up in Gloo: Gloo uses carefully designed system prompts to enforce organizational voice, theological alignment, safety expectations, and content boundaries. These system instructions guide model behavior consistently, even when users ask open-ended questions.
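In message-based APIs, the system prompt is typically the first message in the conversation, with the role `system`. A sketch of how a conversation might be assembled (the function and wording are illustrative assumptions):

```python
def build_conversation(system_prompt: str, user_text: str) -> list:
    """The system message goes first and shapes every later reply,
    even though the end user never sees it."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

convo = build_conversation(
    "You are a friendly assistant who always answers concisely "
    "and avoids strong opinions.",
    "What should our church post on social media this week?",
)
```

Because the system message sits at the top of every request, it applies to the whole session rather than a single question.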

Prompt Engineering

What it means: Prompt engineering is the skill of designing prompts to get better, more accurate, or more useful responses from an AI model. Why it’s useful: Sometimes a vague prompt gives a vague answer. Prompt engineering helps you guide the model clearly by setting expectations in the prompt. Example: Instead of saying “Explain photosynthesis,” a better prompt might be “Explain photosynthesis in simple terms for a middle school science student.” How it shows up in Gloo: Prompt engineering helps organizations get the most out of Chat for Teams and our APIs. Clear prompts improve retrieval accuracy, content grounding, and response alignment, especially when working with complex or sensitive ministry topics.
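One common prompt-engineering move is turning a vague request into an explicit one by stating the audience and style. A toy sketch of that idea (the template function is hypothetical):

```python
def engineer_prompt(task: str, audience: str, style: str) -> str:
    """Sharpen a vague task by spelling out audience and style."""
    return f"{task} Explain it in {style} terms for {audience}."

vague = "Explain photosynthesis."
better = engineer_prompt(
    "Explain photosynthesis.",
    "a middle school science student",
    "simple",
)
print(better)
# Explain photosynthesis. Explain it in simple terms for a middle school science student.
```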

Inference

What it means: Inference is the process the model goes through to generate a response based on your prompt. It happens after training, during actual usage. Analogy: Training is like studying for an exam. Inference is answering the question during the test. Why it matters: Inference speed and accuracy affect how useful the model is in real-time settings like chat apps or search tools. How it shows up in Gloo: Inference happens every time users interact with Gloo AI. The system retrieves relevant content from the Data Engine and generates aligned responses in real time. Fast inference ensures smooth chat experiences and accurate document enrichment.

Tokens

What it means: Tokens are chunks of text the model reads or generates. They can be as small as a character or as large as a word. Example: The sentence “Hello there!” is about 3 tokens: “Hello,” “there,” and “!” Why it matters: Models have token limits. Longer prompts or responses use more tokens. Most models process somewhere between 2,000 and 100,000 tokens depending on their size. How it shows up in Gloo: Token usage impacts request size and model behavior within the Gloo API. When documents are uploaded, they are chunked into token-sized sections for embedding.
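Real models use learned subword tokenizers (such as BPE), so exact counts vary by model, but a very rough approximation shows the idea of splitting text into chunks:

```python
import re

def rough_token_count(text: str) -> int:
    """Very rough approximation of tokenization: split into words and
    punctuation marks. Real subword tokenizers produce different counts,
    but the text-to-chunks idea is the same."""
    return len(re.findall(r"\w+|[^\w\s]", text))

print(rough_token_count("Hello there!"))  # 3 chunks: Hello / there / !
```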

Context Window

What it means: The context window is the total number of tokens a model can “remember” at once. It includes both your prompt and the model’s reply. Analogy: Think of it like a whiteboard. If you write too much, older stuff gets erased to make room. Bigger context windows mean the model can reference more of the conversation. How it shows up in Gloo: Gloo uses models with large context windows to allow better grounding in an organization’s uploaded content. This improves RAG accuracy, ensures longer documents can be analyzed, and helps maintain conversation continuity in Chat for Teams.
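The whiteboard analogy can be sketched directly: when a conversation exceeds the token budget, the oldest messages are dropped first. A minimal sketch (this is a simplification; real systems use smarter strategies like summarization):

```python
def fit_context(messages: list[str], token_counts: list[int], limit: int) -> list[str]:
    """Keep the most recent messages that fit within the token limit,
    dropping the oldest first -- like erasing the top of a whiteboard."""
    kept, used = [], 0
    for msg, n in zip(reversed(messages), reversed(token_counts)):
        if used + n > limit:
            break
        kept.append(msg)
        used += n
    return list(reversed(kept))

history = ["greeting", "first question", "follow-up"]
sizes = [5, 5, 5]
print(fit_context(history, sizes, 10))  # ['first question', 'follow-up']
```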

Temperature

What it means: Temperature is a setting that controls how creative or focused a model’s response is. How it works:
  • Lower temperature (like 0.2): More focused and predictable answers
  • Higher temperature (like 0.9): More random and creative responses
Use case: Use a low temperature for fact summaries. Use a higher one for brainstorming. How it shows up in Gloo: Temperature is preset by Gloo for most experiences to balance creativity and safety.

Top-k / Top-p Sampling

What it means: These are technical settings that influence which words the model chooses from when generating text. They help balance randomness and coherence. Quick comparison:
  • Top-k looks at the top K likely words and picks from them
  • Top-p looks at the smallest number of words whose probabilities add up to P (like 90 percent), then chooses from those
Why it matters: You usually don’t have to change these yourself, but understanding them can explain why a model sometimes repeats itself or surprises you. How it shows up in Gloo: Gloo manages these sampling settings internally to ensure stable, predictable behavior. By tuning these parameters, Gloo reduces repetition and maintains the accuracy and reliability that champion partners and developers expect.
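The two filters from the comparison above can be sketched over a toy word distribution (the words and probabilities are invented for illustration):

```python
def top_k_filter(probs: dict, k: int) -> dict:
    """Keep only the k most likely words."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(kept)

def top_p_filter(probs: dict, p: float) -> dict:
    """Keep the smallest set of top words whose probabilities sum to at least p."""
    kept, total = {}, 0.0
    for word, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[word] = prob
        total += prob
        if total >= p:
            break
    return kept

probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}
print(top_k_filter(probs, 2))    # {'the': 0.5, 'a': 0.3}
print(top_p_filter(probs, 0.9))  # {'the': 0.5, 'a': 0.3, 'cat': 0.15}
```

The model then samples only from the surviving words, which is how both settings trade off randomness against coherence.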

Stop Sequence

What it means: A stop sequence is a specific signal that tells the model to stop generating text. Use case: You might set “###” as a stop sequence so the model stops there and doesn’t keep talking after a section. How it shows up in Gloo: Gloo uses stop sequences behind the scenes to structure outputs, enforce formatting rules, and keep responses within safe boundaries. This helps ensure that generated content does not run on too long or produce unintended sections.
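Conceptually, applying a stop sequence is just cutting the output at the first occurrence of the marker (real APIs stop generation itself, which also saves tokens, but the effect on the text is the same):

```python
def apply_stop_sequence(text: str, stop: str) -> str:
    """Cut the model's output at the first stop sequence, if present."""
    idx = text.find(stop)
    return text if idx == -1 else text[:idx]

raw = "Section one content.\n###\nUnwanted extra rambling."
print(apply_stop_sequence(raw, "###"))  # keeps only the text before ###
```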
Next Up: How AI Uses and Stores Knowledge

In the next section, we’ll answer: “What are vectors, embeddings, and retrieval systems, and how do they help AI remember or reason?”