Using the Completions API

This guide is a step-by-step tutorial for using the Gloo AI Completions V2 API and its routing features.

Why V2? Completions V2 offers auto-routing for optimal model selection, model family preferences, and tradition-aware responses, all while maintaining compatibility with the standard chat completions format.

Prerequisites

Before starting, ensure you have:

A Gloo AI Studio account
Your Client ID and Client Secret from the API Credentials page
Authentication setup - Complete the Authentication Tutorial first

Choose Your Routing Strategy

Completions V2 offers three routing modes:

Mode	Use Case	Parameter
AI Core (Recommended)	Let Gloo AI automatically select the best model	`"auto_routing": true`
AI Core Select	Choose a provider family, let Gloo pick the model	`"model_family": "anthropic"`
AI Select	Specify an exact model	`"model": "gloo-openai-gpt-5-mini"`
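
As a rough sketch of how the three modes map onto the request body, the helper below sets exactly one routing parameter from the table. Note that `build_payload` is a hypothetical name used for illustration, not part of the Gloo API, and the sketch assumes the three modes are mutually exclusive within a single request, which the table suggests but does not state.

```python
def build_payload(message, *, auto_routing=False, model_family=None, model=None):
    """Build a V2 request body using exactly one routing mode (hypothetical helper)."""
    chosen = [bool(auto_routing), model_family is not None, model is not None]
    # Assumption: only one routing parameter should be sent per request.
    if sum(chosen) != 1:
        raise ValueError("Pick exactly one of auto_routing, model_family, or model")

    payload = {"messages": [{"role": "user", "content": message}]}
    if auto_routing:
        payload["auto_routing"] = True          # AI Core
    elif model_family is not None:
        payload["model_family"] = model_family  # AI Core Select
    else:
        payload["model"] = model                # AI Select
    return payload
```

For instance, `build_payload("Hello", model_family="anthropic")` returns a body containing only `messages` and `model_family`, which can then be passed as `json=` to `requests.post`.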

Example 1: Auto-Routing (Recommended)

Let Gloo AI analyze your query and automatically select the optimal model:

import requests

def make_v2_completion_auto(token_info):
    """Makes a V2 completion request with auto-routing."""

    api_url = "https://platform.ai.gloo.com/ai/v2/chat/completions"
    headers = {
        "Authorization": f"Bearer {token_info['access_token']}",
        "Content-Type": "application/json"
    }

    payload = {
        "messages": [
            {"role": "user", "content": "How does the Old Testament connect to the New Testament?"}
        ],
        "auto_routing": True,
        "tradition": "evangelical"  # Optional: evangelical, catholic, or mainline
    }

    response = requests.post(api_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

Example 2: Model Family Selection

Specify a provider family and let Gloo AI pick the best model within that family:

def make_v2_completion_family(token_info):
    """Makes a V2 completion request with model family selection."""

    api_url = "https://platform.ai.gloo.com/ai/v2/chat/completions"
    headers = {
        "Authorization": f"Bearer {token_info['access_token']}",
        "Content-Type": "application/json"
    }

    payload = {
        "messages": [
            {"role": "user", "content": "Draft a short sermon outline on forgiveness."}
        ],
        "model_family": "anthropic",  # Options: openai, anthropic, google, open source
        "stream": False
    }

    response = requests.post(api_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

Available model families: openai, anthropic, google, open source

Example 3: Direct Model Selection

Choose a specific model for full control:

def make_v2_completion_direct(token_info):
    """Makes a V2 completion request with direct model selection."""

    api_url = "https://platform.ai.gloo.com/ai/v2/chat/completions"
    headers = {
        "Authorization": f"Bearer {token_info['access_token']}",
        "Content-Type": "application/json"
    }

    payload = {
        "messages": [
            {"role": "user", "content": "Summarize the book of Romans in 3 sentences."}
        ],
        "model": "gloo-anthropic-claude-sonnet-4.5",
        "temperature": 0.7,
        "max_tokens": 500
    }

    response = requests.post(api_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

See the Supported Model IDs page for all available models.

Understanding the Response

V2 responses include additional routing metadata:

{
  "id": "chatcmpl-xyz",
  "object": "chat.completion",
  "created": 1733184562,
  "model": "gloo-anthropic-claude-sonnet-4.5",
  "routing_mechanism": "auto_routing",
  "routing_tier": "tier_2",
  "routing_confidence": 0.87,
  "tradition": "evangelical",
  "provider": "Anthropic",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The response content..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 125,
    "completion_tokens": 78,
    "total_tokens": 203
  }
}

Key metadata fields:

routing_mechanism: How the model was selected (auto_routing, model_family, or direct_model_selection)
routing_tier: The complexity tier determined by auto-routing (tier_1, tier_2, tier_3)
routing_confidence: Confidence score for the routing decision (0.0-1.0)
tradition: The theological perspective applied (if specified)
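
To log or inspect these fields, something like the following minimal sketch may help. The function name `summarize_routing` is hypothetical, and because this guide does not say which fields are always present, the sketch reads each one with `.get()` so that a missing field becomes `None` rather than raising an error.

```python
def summarize_routing(response):
    """Collect the routing metadata fields from a parsed V2 response dict."""
    fields = (
        "model",
        "routing_mechanism",
        "routing_tier",
        "routing_confidence",
        "tradition",
    )
    # Assumption: any of these may be absent, so missing keys map to None.
    return {name: response.get(name) for name in fields}


# Sample values copied from the response shown above.
sample = {
    "model": "gloo-anthropic-claude-sonnet-4.5",
    "routing_mechanism": "auto_routing",
    "routing_tier": "tier_2",
    "routing_confidence": 0.87,
    "tradition": "evangelical",
}
print(summarize_routing(sample))
```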

Streaming Responses

Enable streaming for real-time responses:

import requests

def make_v2_completion_streaming(token_info):
    """Makes a streaming V2 completion request."""

    api_url = "https://platform.ai.gloo.com/ai/v2/chat/completions"
    headers = {
        "Authorization": f"Bearer {token_info['access_token']}",
        "Content-Type": "application/json"
    }

    payload = {
        "messages": [
            {"role": "user", "content": "Explain the significance of the resurrection."}
        ],
        "auto_routing": True,
        "stream": True
    }

    with requests.post(api_url, headers=headers, json=payload, stream=True) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if line:
                print(line.decode('utf-8'))
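
The example above prints each raw line as it arrives. If only the generated text is wanted, the lines need to be parsed. The sketch below assumes the stream uses the common chat-completions server-sent-events layout: lines of the form `data: {...}` with the text under `choices[0].delta.content`, ending with `data: [DONE]`. This guide does not confirm that layout, so check the raw output before relying on it. The name `extract_stream_text` is hypothetical.

```python
import json


def extract_stream_text(lines):
    """Yield text fragments from an iterable of raw streamed lines (bytes).

    ASSUMPTION: each payload line looks like b'data: {...}' with the text in
    choices[0].delta.content, and the stream ends with b'data: [DONE]'.
    """
    for raw in lines:
        if not raw:
            continue  # skip blank separator lines
        line = raw.decode("utf-8")
        if not line.startswith("data:"):
            continue  # ignore anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text
```

Inside the `with` block, this could replace the print loop: `for piece in extract_stream_text(response.iter_lines()): print(piece, end="", flush=True)`.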

Complete Examples

The following examples combine token retrieval, expiration checking, and all three routing strategies (auto-routing, model family selection, and direct model selection) into a single, runnable script for each language; the Python version is shown below. First, set your environment variables, either in a .env file:

GLOO_CLIENT_ID=YOUR_CLIENT_ID
GLOO_CLIENT_SECRET=YOUR_CLIENT_SECRET

Or export them in your shell for Go and Java:

export GLOO_CLIENT_ID="your_actual_client_id_here"
export GLOO_CLIENT_SECRET="your_actual_client_secret_here"
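
Before making any requests, it can be worth confirming that both variables are actually set. The check below is a minimal sketch: `load_credentials` is a hypothetical helper, and failing fast with an error is a design choice of this sketch rather than something the API requires.

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret), or raise if either variable is unset."""
    names = ("GLOO_CLIENT_ID", "GLOO_CLIENT_SECRET")
    # Treat empty strings the same as missing values.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["GLOO_CLIENT_ID"], env["GLOO_CLIENT_SECRET"]
```

Accepting the environment as a parameter keeps the function easy to test with a plain dictionary. When using a .env file, call `load_dotenv()` first so the values are present in `os.environ`.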

import requests
import time
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# --- Configuration ---
CLIENT_ID = os.getenv("GLOO_CLIENT_ID", "YOUR_CLIENT_ID")
CLIENT_SECRET = os.getenv("GLOO_CLIENT_SECRET", "YOUR_CLIENT_SECRET")
TOKEN_URL = "https://platform.ai.gloo.com/oauth2/token"
API_URL = "https://platform.ai.gloo.com/ai/v2/chat/completions"

# --- State Management ---
access_token_info = {}

# --- Token Management ---
def get_access_token():
    """Retrieves a new access token."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    data = {"grant_type": "client_credentials", "scope": "api/access"}
    response = requests.post(TOKEN_URL, headers=headers, data=data, auth=(CLIENT_ID, CLIENT_SECRET))
    response.raise_for_status()
    token_data = response.json()
    token_data['expires_at'] = int(time.time()) + token_data['expires_in']
    return token_data

def is_token_expired(token_info):
    """Checks if the token is expired or close to expiring."""
    if not token_info or 'expires_at' not in token_info:
        return True
    return time.time() > (token_info['expires_at'] - 60)

def ensure_valid_token():
    """Ensures we have a valid token, refreshing if needed."""
    global access_token_info
    if is_token_expired(access_token_info):
        print("Token is expired or missing. Fetching a new one...")
        access_token_info = get_access_token()
    return access_token_info

# --- V2 Completion Functions ---
def make_v2_auto_routing(message, tradition="evangelical"):
    """Example 1: Auto-routing - Let Gloo AI select the optimal model."""
    token = ensure_valid_token()
    headers = {
        "Authorization": f"Bearer {token['access_token']}",
        "Content-Type": "application/json"
    }
    payload = {
        "messages": [{"role": "user", "content": message}],
        "auto_routing": True,
        "tradition": tradition
    }
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

def make_v2_model_family(message, model_family="anthropic"):
    """Example 2: Model family selection - Choose a provider family."""
    token = ensure_valid_token()
    headers = {
        "Authorization": f"Bearer {token['access_token']}",
        "Content-Type": "application/json"
    }
    payload = {
        "messages": [{"role": "user", "content": message}],
        "model_family": model_family
    }
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

def make_v2_direct_model(message, model="gloo-anthropic-claude-sonnet-4.5"):
    """Example 3: Direct model selection - Specify an exact model."""
    token = ensure_valid_token()
    headers = {
        "Authorization": f"Bearer {token['access_token']}",
        "Content-Type": "application/json"
    }
    payload = {
        "messages": [{"role": "user", "content": message}],
        "model": model,
        "temperature": 0.7,
        "max_tokens": 500
    }
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

# --- Main Execution ---
if __name__ == "__main__":
    try:
        # Example 1: Auto-routing
        print("=== Example 1: Auto-Routing ===")
        result1 = make_v2_auto_routing("How does the Old Testament connect to the New Testament?")
        print(f"Model used: {result1.get('model')}")
        print(f"Routing: {result1.get('routing_mechanism')}")
        print(f"Response: {result1['choices'][0]['message']['content'][:200]}...")

        # Example 2: Model family selection
        print("\n=== Example 2: Model Family Selection ===")
        result2 = make_v2_model_family("Draft a short sermon outline on forgiveness.", "anthropic")
        print(f"Model used: {result2.get('model')}")
        print(f"Response: {result2['choices'][0]['message']['content'][:200]}...")

        # Example 3: Direct model selection
        print("\n=== Example 3: Direct Model Selection ===")
        result3 = make_v2_direct_model("Summarize the book of Romans in 3 sentences.")
        print(f"Model used: {result3.get('model')}")
        print(f"Response: {result3['choices'][0]['message']['content'][:200]}...")

    except requests.exceptions.HTTPError as err:
        print(f"An HTTP error occurred: {err}")
    except Exception as err:
        print(f"An error occurred: {err}")

Working Code Sample

View Complete Code

Clone or browse the complete working examples for all 6 languages (JavaScript, TypeScript, Python, PHP, Go, Java) with setup instructions.

Next Steps

Now that you understand the Completions V2 API, explore:

Completions V2 Guide - Full API documentation
Supported Model IDs - All available models
Tool Use - Function calling with completions
Chat Tutorial - Stateful chat interactions
