Grounded Completions

The Gloo Grounded Completions API extends the Completions V2 architecture with Retrieval-Augmented Generation (RAG), combining intelligent routing with powerful source-grounded response capabilities. Ground your AI responses in your own content—upload datasets, point the API at your publisher, and every response will be backed by your specific sources.

Grounded completions (RAG), intelligent routing, and tradition-based personalization are Completions V2 features. For general, OpenAI-compatible model access, the Responses API (v1) is the recommended surface for new integrations; these grounding capabilities are planned for it but are available only on V2 today.

When you need responses backed by specific content rather than pure model knowledge, grounded completions retrieve relevant context from your uploaded datasets and provide it to the model during generation. Set the rag_publisher parameter to your publisher name and the API handles retrieval, grounding, and attribution automatically. Like Completions V2, you get the same three routing options—auto-routing, model family selection, or direct model choice—plus tradition-based personalization and input guardrails, all while ensuring responses remain grounded in retrievable sources.

Why Grounded Completions?

Grounded completions solve the core challenge of AI trustworthiness: verifying what the model tells you. Here’s what you get:

Reduced Hallucinations
By grounding responses in actual content from your specified dataset rather than relying solely on model training, you significantly reduce fabricated or incorrect information. The model generates answers based on retrieved content it can reference.

Content-Grounded Responses
Every response is informed by relevant sources retrieved from your uploaded content. The sources_returned flag in the response confirms that RAG was used to ground the generation.

Publisher-Scoped Knowledge
Query your own uploaded content, ensuring responses draw from approved, relevant sources rather than generic web knowledge. Control exactly what knowledge base powers your AI.

Routing Flexibility
Keep all the intelligent routing capabilities from Completions V2: let Gloo choose the best model automatically, select by provider family, or pick a specific model for your use case.

Key Features

Feature	Capability	Configuration
Intelligent Routing	Auto-routing, model family selection, or direct model choice	`auto_routing`, `model_family`, or `model`
RAG with Attribution	Retrieve relevant sources before generation, include references in response	`rag_publisher` set to your publisher name
Tradition Personalization	Customize responses for specific theological perspectives	`tradition`: `evangelical`, `catholic`, `mainline`, `not_faith_specific`
Input Guardrails	Content validation before processing	Automatic

RAG Configuration

Using Your Own Content

Set the rag_publisher parameter to your publisher name to ground responses in your uploaded content:

{
  "messages": [
    {
      "role": "user",
      "content": "What resources do you have on family counseling?"
    }
  ],
  "auto_routing": true,
  "rag_publisher": "YourPublisherName"
}

The model will only retrieve and reference content from your specified publisher. To find your publisher name, navigate to the Publishers page in Studio.

You must have already uploaded content before using grounded completions with your publisher.

If you omit the rag_publisher parameter, the API falls back to GlooGrounded, a shared dataset assembled by Gloo. For best results, we recommend always specifying your own publisher.

Source Limits

Control how many sources are retrieved and considered with the sources_limit parameter (1-10, default is 3):

{
  "messages": [
    { "role": "user", "content": "What does Scripture say about forgiveness?" }
  ],
  "auto_routing": true,
  "sources_limit": 5
}

More sources provide broader context but increase processing time. Choose based on your use case—3 sources work well for most queries, while complex theological questions may benefit from more.
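
As a rough illustration of the two RAG parameters described so far, the sketch below assembles a request body and rejects a sources_limit outside the documented 1-10 range before anything is sent. The helper name build_grounded_payload and its constants are hypothetical and not part of the Gloo API; only the JSON field names come from this page.

```python
from typing import Any, Dict, Optional

# Range and default as documented above.
MIN_SOURCES, MAX_SOURCES, DEFAULT_SOURCES = 1, 10, 3


def build_grounded_payload(
    question: str,
    rag_publisher: Optional[str] = None,
    sources_limit: int = DEFAULT_SOURCES,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a grounded-completions request body."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(sources_limit, bool) or not isinstance(sources_limit, int):
        raise TypeError("sources_limit must be an integer")
    if not MIN_SOURCES <= sources_limit <= MAX_SOURCES:
        raise ValueError(
            f"sources_limit must be between {MIN_SOURCES} and {MAX_SOURCES}"
        )

    payload: Dict[str, Any] = {
        "messages": [{"role": "user", "content": question}],
        "auto_routing": True,
        "sources_limit": sources_limit,
    }
    # Leaving rag_publisher out means the API falls back to GlooGrounded.
    if rag_publisher:
        payload["rag_publisher"] = rag_publisher
    return payload
```

Checking the range on the client is a convenience only; the API remains the authority on which values it accepts.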

Include Citations

Set include_citations to true to include citation metadata for the sources used by RAG (defaults to false):

{
  "messages": [
    { "role": "user", "content": "What does Scripture say about forgiveness?" }
  ],
  "auto_routing": true,
  "include_citations": true
}

Each citation has the following shape. For streaming responses, the citations are prepended to the stream as a single data: event, shown on the last line:

  "citations": [
    {
      "item_title": "",
      "item_url": "",
      "author": [
        ""
      ],
      "publisher": "",
      "publication_date": "",
      "snippets": [
        "",
        ""
      ]
    }
  ]

data: {"citations": [{"item_title": "", "item_url": "", "author": [""], "publisher": "", "publication_date": "", "snippets": ["", ""]}]}

Tradition-Based Personalization

Customize responses to align with specific theological perspectives using the tradition parameter:

{
  "messages": [
    { "role": "user", "content": "Explain the concept of salvation" }
  ],
  "auto_routing": true,
  "tradition": "evangelical"
}

Supported traditions:

"evangelical" - Evangelical Protestant perspective
"catholic" - Roman Catholic perspective
"mainline" - Mainline Protestant perspective
"not_faith_specific" - General Christian perspective

When specified, both the RAG retrieval and response generation adapt to the theological tradition, ensuring appropriate language, concepts, and emphases.
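
A misspelled tradition value is easy to miss in a JSON body, so a client may want to check it against the four supported values before sending. The following is a minimal sketch; with_tradition and SUPPORTED_TRADITIONS are hypothetical names introduced here for illustration.

```python
from typing import Any, Dict, Optional

# The four values listed under "Supported traditions".
SUPPORTED_TRADITIONS = frozenset(
    {"evangelical", "catholic", "mainline", "not_faith_specific"}
)


def with_tradition(payload: Dict[str, Any], tradition: Optional[str]) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of payload with a validated tradition."""
    if tradition is None:
        # tradition is optional, so leave the payload unchanged.
        return dict(payload)
    if tradition not in SUPPORTED_TRADITIONS:
        raise ValueError(
            f"unsupported tradition {tradition!r}; "
            f"expected one of {sorted(SUPPORTED_TRADITIONS)}"
        )
    return {**payload, "tradition": tradition}
```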

Code Examples

curl -X POST 'https://platform.ai.gloo.com/ai/v2/chat/completions/grounded' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H 'Content-Type: application/json' \
  -d '{
  "messages": [
    {
      "role": "user",
      "content": "What are practical ways to build stronger community in a local church?"
    }
  ],
  "auto_routing": true,
  "rag_publisher": "YourPublisherName",
  "sources_limit": 3,
  "tradition": "evangelical"
}'

from openai import OpenAI

client = OpenAI(
    api_key="your-gloo-access-token",
    base_url="https://platform.ai.gloo.com/ai"
)

response = client.chat.completions.create(
    model="grounded",  # Special model identifier for grounded endpoint
    messages=[
        {
            "role": "user",
            "content": "What are practical ways to build stronger community in a local church?"
        }
    ],
    extra_body={
        "auto_routing": True,
        "rag_publisher": "YourPublisherName",
        "sources_limit": 3,
        "tradition": "evangelical"
    }
)

print(response.choices[0].message.content)

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-gloo-access-token',
  baseURL: 'https://platform.ai.gloo.com/ai',
});

const response = await client.chat.completions.create({
  model: 'grounded', // Special model identifier for grounded endpoint
  messages: [
    {
      role: 'user',
      content:
        'What are practical ways to build stronger community in a local church?',
    },
  ],
  // @ts-ignore - extra_body not in types
  extra_body: {
    auto_routing: true,
    rag_publisher: 'YourPublisherName',
    sources_limit: 3,
    tradition: 'evangelical',
  },
});

console.log(response.choices[0].message.content);

Prerequisites

Before starting, ensure you have:

A Gloo AI Studio account
Your Client ID and Client Secret from the API Credentials page
Authentication setup - Complete the Authentication Tutorial first
OpenAI SDK installed (for Python/TypeScript examples): pip install openai or npm install openai

Endpoint Details

URL: https://platform.ai.gloo.com/ai/v2/chat/completions/grounded
Operation: POST

Example cURL Request

curl -X POST 'https://platform.ai.gloo.com/ai/v2/chat/completions/grounded' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H 'Content-Type: application/json' \
  -d '{
  "messages": [
    {
      "role": "user",
      "content": "How can I incorporate contemplative prayer into daily life?"
    }
  ],
  "auto_routing": true,
  "rag_publisher": "YourPublisherName",
  "sources_limit": 3,
  "tradition": "catholic",
  "stream": false,
  "include_citations": true
}'

Request Parameters

Parameter	Type	Required?	Description
`messages`	array	Yes	Chat message history in standard format
Routing (Choose exactly ONE):
`auto_routing`	boolean	Conditional	Enable smart routing (recommended)
`model`	string	Conditional	Specific Gloo model ID (e.g., `gloo-anthropic-claude-sonnet-4.5`)
`model_family`	string	Conditional	Provider family: `openai`, `anthropic`, `google`, `open source`
RAG Parameters:
`rag_publisher`	string	No	Your publisher name. Set this to ground responses in your own content
`sources_limit`	integer	No	Number of sources to retrieve (1-10, default: 3)
Optional:
`tradition`	string	No	Theological perspective: `evangelical`, `catholic`, `mainline`, `not_faith_specific`
`max_tokens`	integer	No	Maximum response length
`temperature`	float	No	Sampling temperature (0.0-2.0)
`stream`	boolean	No	Enable streaming responses (default: `false`)
`tools`	array	No	Function calling definitions (see Tool Use Guide)
`tool_choice`	string	No	Tool invocation strategy: `auto`, `none`, or specific tool
`parallel_tool_calls`	boolean	No	Allow parallel tool execution (default: `true`)
`prompt_cache_key`	string	No	Optional cache key that can improve cache hit rates on OpenAI models (see Prompt Caching)
`include_citations`	boolean	No	Include citation metadata in the response (default: `false`)

Exactly one routing mechanism must be specified: auto_routing: true, model, or model_family.
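
The exactly-one rule can be checked on the client before a request is sent. This is a minimal sketch: routing_mechanism is a hypothetical helper, and treating "auto_routing": false as "not selected" is an assumption rather than documented server behaviour.

```python
from typing import Any, Dict, List


def routing_mechanism(payload: Dict[str, Any]) -> str:
    """Hypothetical helper: return the single routing field set in payload.

    Raises ValueError when zero or several routing fields are present.
    """
    chosen: List[str] = []
    # Assumption: only an explicit true counts as selecting auto-routing.
    if payload.get("auto_routing") is True:
        chosen.append("auto_routing")
    if payload.get("model"):
        chosen.append("model")
    if payload.get("model_family"):
        chosen.append("model_family")

    if len(chosen) != 1:
        found = ", ".join(chosen) or "none"
        raise ValueError(
            "specify exactly one of auto_routing, model, model_family "
            f"(found: {found})"
        )
    return chosen[0]
```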

For Anthropic models, add the X-Cache-TTL header to enable explicit caching. OpenAI and DeepSeek models use implicit caching automatically. For OpenAI, you can optionally add prompt_cache_key to improve hit rates. See the Prompt Caching Guide for details.

Response Format

Non-Streaming Response

{
  "id": "gen-1769792268-xTKWyq7A9HmEhxzDsAJj",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Based on the retrieved sources, here are practical ways to build stronger community...",
        "refusal": null,
        "role": "assistant",
        "annotations": null,
        "audio": null,
        "function_call": null,
        "tool_calls": null,
        "reasoning": null
      },
      "native_finish_reason": "stop"
    }
  ],
  "created": 1769792268,
  "model": "gloo-anthropic-claude-sonnet-4.5",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 592,
    "prompt_tokens": 2137,
    "total_tokens": 2729,
    "completion_tokens_details": {
      "accepted_prediction_tokens": null,
      "audio_tokens": 0,
      "reasoning_tokens": 0,
      "rejected_prediction_tokens": null
    }
  },
  "provider": "Gloo AI",
  "model_family": "Anthropic",
  "auto_routing": true,
  "routing_mechanism": "auto_routing",
  "routing_tier": "tier_3",
  "routing_confidence": 0.782,
  "tradition": "evangelical",
  "sources_returned": true,
  "citations": [
    {
      "item_title": "Daily Prayer",
      "item_url": "https://www.studio.ai.gloo.com",
      "author": ["John Doe"],
      "publisher": "The Publisher",
      "publication_date": "Apr 08 2017",
      "snippets": [
        "Daily prayer is an essential part of life as...",
        "Making time to spend with the Lord is..."
      ]
    }
  ]
}
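
To show how the citations array in a response like this might be surfaced to a user, the sketch below turns each citation into a one-line attribution. format_citations is a hypothetical helper; it assumes the response has already been parsed into a Python dict and that any citation field may be missing or empty.

```python
from typing import Any, Dict, List


def format_citations(response: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: one human-readable line per citation."""
    lines: List[str] = []
    # citations is absent unless include_citations was true, so default to [].
    for citation in response.get("citations") or []:
        parts = [citation.get("item_title") or "Untitled"]
        authors = ", ".join(citation.get("author") or [])
        if authors:
            parts.append(authors)
        for key in ("publisher", "publication_date", "item_url"):
            if citation.get(key):
                parts.append(citation[key])
        lines.append(" | ".join(parts))
    return lines
```

Applied to the sample response above, this yields a single line beginning with "Daily Prayer | John Doe".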

Streaming Response

When stream: true, responses are sent as Server-Sent Events. Routing and RAG metadata is provided in HTTP headers:

Header	Description
`X-Routing-Mechanism`	The routing mode used: `auto_routing`, `model_family`, `direct_model_selection`
`X-Selected-Model`	The Gloo model ID that handled the request
`X-Model-Family`	The provider family (OpenAI, Anthropic, Google, Open Source)
`X-Tradition`	Theological perspective used (if specified)
`X-Sources-Returned`	Whether RAG was used to retrieve and ground the response: `true` or `false`
`X-Routing-Tier`	Model tier selected (auto-routing and model family modes only)
`X-Routing-Confidence`	Routing confidence score 0-1 (auto-routing and model family modes only)
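
Because this metadata arrives as header strings, a client usually needs to normalise it. The sketch below is one possible approach: parse_stream_metadata is a hypothetical helper that accepts any mapping of header names to values, matches names case-insensitively, and converts the boolean and numeric headers.

```python
from typing import Any, Dict, Mapping


def parse_stream_metadata(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Hypothetical helper: read routing and RAG metadata from response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    confidence = h.get("x-routing-confidence")
    return {
        "routing_mechanism": h.get("x-routing-mechanism"),
        "selected_model": h.get("x-selected-model"),
        "model_family": h.get("x-model-family"),
        # Only present when a tradition was specified in the request.
        "tradition": h.get("x-tradition"),
        "sources_returned": h.get("x-sources-returned", "").strip().lower() == "true",
        # Tier and confidence are only sent for auto-routing and model family modes.
        "routing_tier": h.get("x-routing-tier"),
        "routing_confidence": float(confidence) if confidence else None,
    }
```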

When include_citations: true is set in the request and sources are found, a citations event is emitted as the first SSE chunk before any LLM content, with the shape:

data: {"citations": [{"item_title": "", "item_url": "", "author": [""], "publisher": "", "publication_date": "", "snippets": ["", ""]}]}

The event stream follows standard Server-Sent Events format. For detailed streaming implementation guidance, see the Completions V2 streaming documentation.
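
The following sketch shows one way to separate the leading citations event from the generated text when consuming the stream line by line. Several details are assumptions, not taken from this page: content chunks are assumed to follow the OpenAI-style shape with choices[].delta.content, and the stream is assumed to end with a data: [DONE] sentinel. split_stream is a hypothetical helper.

```python
import json
from typing import Any, Dict, Iterable, List, Tuple


def split_stream(lines: Iterable[str]) -> Tuple[List[Dict[str, Any]], str]:
    """Hypothetical helper: return (citations, full_text) from SSE lines."""
    citations: List[Dict[str, Any]] = []
    text_parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumption: OpenAI-style end-of-stream marker
            break
        event = json.loads(data)
        if "citations" in event:
            # Documented shape: {"citations": [...]}, sent before any LLM content.
            citations.extend(event["citations"])
            continue
        # Assumption: OpenAI-style chunk with incremental text in delta.content.
        for choice in event.get("choices", []):
            delta = choice.get("delta") or {}
            text_parts.append(delta.get("content") or "")
    return citations, "".join(text_parts)
```

In practice the lines would come from an HTTP client's streaming iterator; a plain list of strings works equally well for testing.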

Response Metadata Fields

Field	Description
`sources_returned`	Boolean indicating whether RAG was used to retrieve content and ground the response
`routing_mechanism`	How the model was selected: `auto_routing`, `model_family`, or `direct_model_selection`
`tradition`	Theological perspective applied (only included if specified in request)
`model_family`	Provider family of the selected model
`provider`	Always `"Gloo AI"`
`routing_tier`	Model tier selected (included for auto-routing and model family modes)
`routing_confidence`	Confidence score for routing decision (included for auto-routing and model family modes)
`citations`	Array of source citation objects used to ground the response
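
Several of these fields appear only under certain conditions, so client code should treat them as optional. The sketch below is one way to collect them from a parsed non-streaming response; GroundingMetadata and read_metadata are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class GroundingMetadata:
    sources_returned: bool
    routing_mechanism: Optional[str]
    model_family: Optional[str]
    tradition: Optional[str]              # only present if specified in the request
    routing_tier: Optional[str]           # auto-routing and model family modes only
    routing_confidence: Optional[float]   # auto-routing and model family modes only
    citation_count: int


def read_metadata(response: Dict[str, Any]) -> GroundingMetadata:
    """Hypothetical helper: collect Gloo metadata fields, tolerating absent ones."""
    return GroundingMetadata(
        sources_returned=bool(response.get("sources_returned", False)),
        routing_mechanism=response.get("routing_mechanism"),
        model_family=response.get("model_family"),
        tradition=response.get("tradition"),
        routing_tier=response.get("routing_tier"),
        routing_confidence=response.get("routing_confidence"),
        citation_count=len(response.get("citations") or []),
    )
```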

Related Documentation

Completions V2 API - Core routing mechanisms and streaming details
Search API - Standalone RAG queries without generation
Tool Use Guide - Using function calling with grounded completions
Supported Models - Model capabilities and context windows