Grounded Completions

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

X-Cache-TTL

enum<string>

Enables Anthropic prompt caching. When set, system messages are automatically wrapped with cache_control blocks. '5m' bills cache writes at 1.25x input rate, '1h' bills at 2x input rate. Only effective for Anthropic models.

Available options:

5m,

1h

Body

application/json

Request body for the grounded completions endpoint. Exactly one routing mechanism (auto_routing, model, or model_family) must be specified. Messages are optional when omitted and default to a single user message of "Hello world!". An explicit empty messages array is invalid and returns 400.

messages

LlmMessage · object[]

Chat message history with role and content fields. If omitted, the current service defaults to one user message with content "Hello world!". Explicitly passing an empty array is invalid.

Show child attributes

auto_routing

boolean

Enables intelligent model selection. Mutually exclusive with model and model_family.

model

string

Specific Gloo model identifier. Mutually exclusive with auto_routing and model_family.

model_family

enum<string>

Provider family for model selection. Mutually exclusive with auto_routing and model.

Available options:

openai,

anthropic,

google,

open source

rag_publisher

string

default:GlooGrounded

Publisher name to retrieve sources from. Defaults to GlooGrounded when omitted.

sources_limit

integer

default:3

Number of sources to retrieve for grounding (1-10).

Required range: 1 <= x <= 10

tradition

enum<string>

Theological perspective to apply. Options: evangelical, catholic, mainline, not_faith_specific.

Available options:

evangelical,

catholic,

mainline,

not_faith_specific

stream

boolean

default:false

Enables streaming responses via server-sent events.

temperature

number

default:0.7

Sampling temperature controlling randomness.

Required range: 0 <= x <= 2

max_tokens

integer

Maximum number of tokens to generate in the response.

Required range: x >= 1

tools

Tool · object[]

Function calling definitions.

Show child attributes

tool_choice

default:none

Controls which tool (if any) the model should use.

Available options:

none,

auto,

required

prompt_cache_key

string

OpenAI Responses API cache key for persistent prompt caching. When set, the prompt prefix is cached and reused across requests with the same key, reducing cost and latency. Cache TTL is 5 minutes (server-side). Only effective for OpenAI models. Ignored for other providers.

parallel_tool_calls

boolean

default:true

Allow parallel tool execution.

include_citations

boolean

default:false

Include citation metadata for sources utilized by RAG. For streaming responses, citations are prepended as the first SSE event before any content chunks.

stream_options

Stream Options · object

Streaming options. Use include_usage=true with stream=true to receive a final usage event.

Show child attributes

Response

Successful grounded completion response. If stream is false or omitted, the response is JSON. If stream=true, the response is a Server-Sent Events stream. Once streaming has started, failures are delivered as stream events rather than non-2xx HTTP responses. Content moderation ends the stream with a content_filter event. When include_citations=true and citations are available, the first stream event contains citations.

Response from the grounded completions endpoint.

string

Unique completion identifier.

object

string

default:chat.completion

Object type, always 'chat.completion'.

created

integer

Unix timestamp of when the completion was created.

model

string

The model that was selected and used for the completion.

provider

string

The model provider name.

model_family

string

Provider family of the selected model.

auto_routing

boolean

Indicates whether auto routing was used.

routing_mechanism

string

The routing method used: auto_routing, model_family, or direct_model_selection.

routing_tier

string

The performance tier assigned by the routing system.

routing_confidence

number

Confidence score for the routing decision (0.0 to 1.0).

tradition

string

The theological perspective that was applied to the response.

sources_returned

boolean

Whether RAG was used to retrieve sources and ground the response.

citations

CitationObject · object[] | null

Array of source citations used to ground the response. Only present when include_citations is true in the request.

Show child attributes

choices

CompletionsV2Choice · object[]

List of completion choices.

Show child attributes

usage

CompletionsV2Usage · object

Token consumption metrics.

Show child attributes

service_tier

null

system_fingerprint

null

Error Reference: Grounded Completions

Authorizations

Headers

Body

Response