Building Custom Search with the Search API

This tutorial shows you how to build custom search functionality using the Gloo AI Search API. You’ll learn to authenticate, perform semantic search, work with rich results, and optionally combine search with Completions V2 for Retrieval Augmented Generation (RAG). The Search API gives you full control over how search works in your application — from the query to the UI. Whether you’re building a knowledge base, a chatbot, or a content discovery experience, this tutorial covers the backend patterns you need.

The Discovery Widget in Gloo AI Studio provides a quick, embeddable search experience. This tutorial shows you how to build equivalent functionality using the Search API directly, giving you full control over branding, UI, and integration patterns.

Prerequisites

Before starting, ensure you have:

A Gloo AI Studio account
Your Client ID and Client Secret from the API Credentials page
Your Tenant (publisher) name from Organizations in Studio
Content uploaded to the Data Engine (see Upload Files Tutorial)
Authentication setup — Complete the Authentication Tutorial first

Working Code Sample

Follow along with complete working examples in all 6 languages (JavaScript, TypeScript, Python, PHP, Go, Java). Each includes a proxy server and a browser-based frontend. Setup and testing instructions are provided later.

The code snippets in this tutorial are simplified and self-contained — designed for readability and easy copy-paste. The cookbook examples use a modular architecture plus production niceties. Both implement the same APIs and patterns.

Understanding the Search API

The Search API provides AI-powered semantic search across your ingested content. Unlike keyword search, semantic search understands the meaning behind queries, so a search for “secrets to a happy marriage” will find content about “rules for keeping a marriage healthy” even without exact word matches.

Endpoint: POST /ai/data/v1/search

Key Features

Semantic Search: Near-text search that understands meaning, not just keywords
Rich Metadata: AI-generated summaries, biblical analysis, content classifications
Snippet Extraction: Pre-chunked content ready for display or RAG
Relevance Scoring: Distance, certainty, and score metrics for ranking

Required Parameters

Parameter	Description
`query`	The search query string
`collection`	Always `"GlooProd"`
`tenant`	Your publisher (tenant) name
`limit`	Number of results to return (10-100 recommended)

Optional Parameters

Parameter	Type	Description
`certainty`	float (0-1)	Minimum relevance threshold. The Search Playground defaults to `0.5`. We recommend starting with `0.5` and adjusting as needed.

Important: The API’s default certainty is 0.75 when omitted, which is stricter than the Playground’s 0.5. If you’re getting no results, add "certainty": 0.5 to your request to match Playground behavior.
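
To make the certainty behavior concrete, here is a minimal sketch of a payload builder that always sends an explicit `certainty`, so the stricter API default never applies by accident. The helper name `build_search_payload` and its validation rules are illustrative assumptions, not part of the Gloo AI API or the cookbook.

```python
def build_search_payload(query, tenant, limit=10, certainty=0.5):
    """Build the JSON body for POST /ai/data/v1/search (hypothetical helper)."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be between 0 and 1")
    return {
        "query": query.strip(),
        "collection": "GlooProd",  # always "GlooProd", per the parameter table
        "tenant": tenant,
        "limit": limit,
        "certainty": certainty,    # sent explicitly so the 0.75 default is not used
    }


payload = build_search_payload("How can I know my purpose?", "your-tenant-name")
print(payload)
```

Pass the returned dictionary as the JSON body of the search request. Raising `certainty` narrows the results; lowering it widens them.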

Response Structure

Each result in the data array contains:

{
  "uuid": "unique-result-id",
  "metadata": {
    "distance": 0.396,
    "certainty": 0.802,
    "score": 0.0
  },
  "properties": {
    "item_title": "Finding True Happiness",
    "type": "Article",
    "author": ["Author Name"],
    "snippet": "Content text...",
    "summaries": { ... },
    "biblical_analysis": { ... }
  },
  "collection": "GlooProd"
}

Key fields:

metadata.certainty — Relevance score (0-1, higher = more relevant)
properties.snippet — Content chunk text, ideal for display or RAG context
properties.summaries — AI-generated summaries in multiple styles
properties.biblical_analysis — Bible references, concepts, and lessons (if applicable)
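
As a sketch of how these fields might be read defensively, the hypothetical helper below flattens one result into a small dictionary for display. The field names come from the response structure above; the helper name, its output keys, and the fallback values are assumptions made for illustration.

```python
def summarize_result(result):
    """Flatten one search result into display-friendly fields (hypothetical helper)."""
    props = result.get("properties") or {}
    meta = result.get("metadata") or {}
    authors = props.get("author") or []
    if isinstance(authors, str):  # tolerate a single string instead of a list
        authors = [authors]
    return {
        "id": result.get("uuid"),
        "title": props.get("item_title", "N/A"),
        "type": props.get("type", "N/A"),
        "authors": authors,
        "certainty": float(meta.get("certainty") or 0.0),
        "snippet": props.get("snippet", ""),
        "has_biblical_analysis": bool(props.get("biblical_analysis")),
    }


# Sample shaped like the response structure shown above (optional fields omitted).
sample = {
    "uuid": "unique-result-id",
    "metadata": {"distance": 0.396, "certainty": 0.802, "score": 0.0},
    "properties": {
        "item_title": "Finding True Happiness",
        "type": "Article",
        "author": ["Author Name"],
        "snippet": "Content text...",
    },
    "collection": "GlooProd",
}

summary = summarize_result(sample)
print(summary["title"], summary["certainty"])
```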

Test in the Playground First

Before writing code, test your queries in the Search Playground to verify your content is indexed and understand the response structure.

Navigate to Playground in Gloo AI Studio
Select the Search tab
Choose your publisher from the dropdown
Enter a query and review the results

Try it now: Search for a topic covered in your uploaded content. The Playground displays each result with its title, snippet text, and AI-generated insights.

Step 1: Basic Search

Let’s make a search request with proper authentication. This is the foundation for everything that follows. Each implementation below handles token management, makes the search request, and displays results with titles, types, authors, and relevance scores.

#!/usr/bin/env python3
"""Basic search using the Gloo AI Search API."""

import requests
import os
import sys
import time
from dotenv import load_dotenv

load_dotenv()

# Configuration
CLIENT_ID = os.getenv("GLOO_CLIENT_ID", "YOUR_CLIENT_ID")
CLIENT_SECRET = os.getenv("GLOO_CLIENT_SECRET", "YOUR_CLIENT_SECRET")
TENANT = os.getenv("GLOO_TENANT", "your-tenant-name")
TOKEN_URL = "https://platform.ai.gloo.com/oauth2/token"
SEARCH_URL = "https://platform.ai.gloo.com/ai/data/v1/search"

# --- Token Management ---

access_token_info = {}

def get_access_token():
    """Retrieve a new access token using OAuth2 client credentials."""
    global access_token_info
    response = requests.post(
        TOKEN_URL,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        data={"grant_type": "client_credentials", "scope": "api/access"},
        auth=(CLIENT_ID, CLIENT_SECRET),
        timeout=30
    )
    response.raise_for_status()
    token_data = response.json()
    token_data['expires_at'] = int(time.time()) + token_data['expires_in']
    access_token_info = token_data
    return token_data

def ensure_valid_token():
    """Ensure we have a valid (non-expired) access token."""
    if not access_token_info or time.time() > (access_token_info.get('expires_at', 0) - 60):
        get_access_token()
    return access_token_info['access_token']

# --- Search ---

def search(query, limit=10):
    """Perform a semantic search query."""
    token = ensure_valid_token()

    payload = {
        "query": query,
        "collection": "GlooProd",
        "tenant": TENANT,
        "limit": limit,
        "certainty": 0.5
    }

    response = requests.post(
        SEARCH_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        },
        json=payload,
        timeout=60
    )
    response.raise_for_status()
    return response.json()

# --- Run ---

query = sys.argv[1] if len(sys.argv) > 1 else "How can I know my purpose?"
limit = int(sys.argv[2]) if len(sys.argv) > 2 else 10

print(f"Searching for: '{query}'")
print(f"Limit: {limit} results\n")

results = search(query, limit)

if not results.get('data'):
    print("No results found.")
else:
    print(f"Found {len(results['data'])} results:\n")
    for i, result in enumerate(results['data'], 1):
        props = result.get('properties', {})
        meta = result.get('metadata', {})
        print(f"--- Result {i} ---")
        print(f"Title: {props.get('item_title', 'N/A')}")
        print(f"Type: {props.get('type', 'N/A')}")
        authors = props.get('author') or ['N/A']  # guard against a missing or null author list
        print(f"Author: {', '.join(authors)}")
        print(f"Relevance Score: {meta.get('certainty', 0):.4f}")
        snippet = props.get('snippet', '')
        if snippet:
            print(f"Snippet: {snippet[:200]}...")
        print()

What You’ll See

A successful search returns results with titles, types, and relevance scores:

Searching for: 'How can I know my purpose?'
Limit: 10 results

Found 9 results:

--- Result 1 ---
Title: Finding True Happiness
Type: Article
Author: Automated Ingestion
Relevance Score: 0.7920
Snippet: # Finding True Happiness: A Christian Perspective  In a world obsessed...

--- Result 2 ---
Title: Finding True Happiness
Type: Article
Author: Automated Ingestion
Relevance Score: 0.7700
Snippet: ## The Beatitudes: God's Blueprint for Blessedness Jesus outlined the path...

Key Points

Collection must always be "GlooProd"
Tenant scopes results to your publisher’s content only
certainty: 0.5 matches the Playground default — adjust as needed
Request time increases non-linearly with larger limit values

Run the Cookbook Example

The cookbook includes a ready-to-run basic search script for each language:

cd search-tutorial/python
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python search_basic.py "How can I know my purpose?" 5

Step 2: Search + RAG with Completions V2

Search results become even more powerful when used as context for AI-generated responses. This is Retrieval Augmented Generation (RAG) — search for relevant content, then generate an answer grounded in that content.

Two approaches to RAG with Gloo AI:

Search API + Completions V2 (this section): Full control over context, prompts, and formatting
Grounded Completions: Single API call, simpler but less control

Both retrieve identical content. The difference is who controls how that content is presented to the LLM.

The RAG Workflow

Search — Query the Search API for relevant content
Extract — Pull snippets from results
Format — Build context for the LLM
Generate — Call Completions V2 with the context
Return — Deliver the response with source citations

#!/usr/bin/env python3
"""Search + RAG using the Gloo AI Search API and Completions V2."""

import requests
import os
import sys
import time
from dotenv import load_dotenv

load_dotenv()

# Configuration
CLIENT_ID = os.getenv("GLOO_CLIENT_ID", "YOUR_CLIENT_ID")
CLIENT_SECRET = os.getenv("GLOO_CLIENT_SECRET", "YOUR_CLIENT_SECRET")
TENANT = os.getenv("GLOO_TENANT", "your-tenant-name")
TOKEN_URL = "https://platform.ai.gloo.com/oauth2/token"
SEARCH_URL = "https://platform.ai.gloo.com/ai/data/v1/search"
COMPLETIONS_URL = "https://platform.ai.gloo.com/ai/v2/chat/completions"

# --- Token Management (same as Step 1) ---

access_token_info = {}

def ensure_valid_token():
    global access_token_info
    if not access_token_info or time.time() > (access_token_info.get('expires_at', 0) - 60):
        response = requests.post(
            TOKEN_URL,
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            data={"grant_type": "client_credentials", "scope": "api/access"},
            auth=(CLIENT_ID, CLIENT_SECRET), timeout=30
        )
        response.raise_for_status()
        access_token_info = response.json()
        access_token_info['expires_at'] = int(time.time()) + access_token_info['expires_in']
    return access_token_info['access_token']

# --- Step 1: Search ---

def search(query, limit=5):
    token = ensure_valid_token()
    response = requests.post(SEARCH_URL, headers={
        "Authorization": f"Bearer {token}", "Content-Type": "application/json"
    }, json={
        "query": query, "collection": "GlooProd",
        "tenant": TENANT, "limit": limit, "certainty": 0.5
    }, timeout=60)
    response.raise_for_status()
    return response.json()

# --- Step 2: Extract Snippets ---

def extract_snippets(results, max_snippets=5, max_chars=500):
    snippets = []
    for result in results.get("data", [])[:max_snippets]:
        props = result.get("properties", {})
        snippets.append({
            "text": props.get("snippet", "")[:max_chars],
            "title": props.get("item_title", "N/A"),
            "type": props.get("type", "N/A"),
        })
    return snippets

# --- Step 3: Format Context ---

def format_context(snippets):
    parts = []
    for i, s in enumerate(snippets, 1):
        parts.append(f"[Source {i}: {s['title']} ({s['type']})]\n{s['text']}\n")
    return "\n---\n".join(parts)

# --- Step 4: Generate Response ---

def generate_with_context(query, context):
    token = ensure_valid_token()
    payload = {
        "messages": [
            {"role": "system", "content":
                "You are a helpful assistant. Answer the user's question based on the "
                "provided context. If the context doesn't contain relevant information, "
                "say so honestly."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}
        ],
        "auto_routing": True,
        "max_tokens": 3000
    }
    response = requests.post(COMPLETIONS_URL, headers={
        "Authorization": f"Bearer {token}", "Content-Type": "application/json"
    }, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# --- Run Complete RAG Flow ---

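# argv[1] is the mode ("rag"), matching the cookbook command:
#   python search_advanced.py rag "<query>" <limit>
query = sys.argv[2] if len(sys.argv) > 2 else "How can I know my purpose?"
limit = int(sys.argv[3]) if len(sys.argv) > 3 else 5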

print(f"RAG Search for: '{query}'\n")

print("Step 1: Searching for relevant content...")
results = search(query, limit)
print(f"Found {len(results.get('data', []))} results\n")

print("Step 2: Extracting snippets...")
snippets = extract_snippets(results)
context = format_context(snippets)
print(f"Extracted {len(snippets)} snippets\n")

print("Step 3: Generating response with context...\n")
response = generate_with_context(query, context)

print("=== Generated Response ===")
print(response)
print("\n=== Sources Used ===")
for s in snippets:
    print(f"- {s['title']} ({s['type']})")

What You’ll See

The RAG flow searches, extracts context, then generates an AI response grounded in your content:

RAG Search for: 'How can I know my purpose?'

Step 1: Searching for relevant content...
Found 5 results

Step 2: Extracting snippets...
Extracted 5 snippets

Step 3: Generating response with context...

=== Generated Response ===
Based on the provided articles, finding purpose involves orienting your life
toward God and cultivating specific spiritual qualities. Here are a few key ideas:

- **Relationship with God:** True happiness and purpose are found in your
  relationship with God and aligning your life with His will. (Sources 3, 5)

- **The Beatitudes:** Jesus provided a "blueprint for blessedness" in the
  Beatitudes (Matthew 5:3-12). (Sources 1, 5)

=== Sources Used ===
- Finding True Happiness (Article)
- Finding True Happiness (Article)
- Beatitudes True Happiness (Article)
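
Because several snippets can come from the same item, the source list can repeat a title, as in the sample output above. If you prefer one line per source, a sketch like the following collapses duplicates while preserving order. The helper name `unique_sources` is an assumption. It is meant only for the displayed list, so the `[Source N]` labels passed to the LLM stay unchanged.

```python
def unique_sources(snippets):
    """Return one entry per (title, type) pair, in first-seen order (hypothetical helper)."""
    seen = set()
    sources = []
    for s in snippets:
        key = (s.get("title", "N/A"), s.get("type", "N/A"))
        if key not in seen:
            seen.add(key)
            sources.append({"title": key[0], "type": key[1]})
    return sources


# Snippet dictionaries shaped like the output of extract_snippets() in Step 2.
example_snippets = [
    {"text": "chunk one", "title": "Finding True Happiness", "type": "Article"},
    {"text": "chunk two", "title": "Finding True Happiness", "type": "Article"},
    {"text": "chunk three", "title": "Beatitudes True Happiness", "type": "Article"},
]

for source in unique_sources(example_snippets):
    print(f"- {source['title']} ({source['type']})")
```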

Run the Cookbook Example

python search_advanced.py rag "How can I know my purpose?" 5

Key Concepts

auto_routing: true — Lets Gloo AI automatically select the best model
System prompt — Customize to match your use case (tone, format, domain rules)
Context formatting — Source labels help the LLM cite correctly
Token budget — Keep context concise. 3-5 snippets of ~500 chars each works well
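
One way to enforce the token budget is to cap both the length of each snippet and the total context size before formatting. The sketch below counts characters as a rough stand-in for tokens. The helper name `fit_to_budget` and the default limits are illustrative assumptions to tune for your content.

```python
def fit_to_budget(snippets, max_total_chars=2500, max_chars_each=500):
    """Keep snippets in ranked order until the character budget is used up."""
    selected = []
    used = 0
    for s in snippets:
        text = (s.get("text") or "")[:max_chars_each]
        if not text:
            continue  # skip results that carry no snippet text
        if used + len(text) > max_total_chars:
            break     # stop at the first snippet that would exceed the budget
        selected.append({**s, "text": text})
        used += len(text)
    return selected


# Ten long snippets with a 1,200-character budget: only two fit.
long_snippets = [{"title": f"Doc {i}", "type": "Article", "text": "x" * 600} for i in range(10)]
trimmed = fit_to_budget(long_snippets, max_total_chars=1200)
print(len(trimmed), [len(s["text"]) for s in trimmed])
```

Because search results arrive in relevance order, stopping at the first overflow keeps the most relevant snippets and drops the tail.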

Search + Completions V2 vs Grounded Completions

	Search + Completions V2	Grounded Completions
Control	Full control over context, prompts, ordering	Gloo handles context automatically
Complexity	More code, more flexibility	Single API call
Custom prompts	Yes — any system prompt	Limited customization
Context formatting	You control structure and ordering	Gloo optimizes automatically
Best for	Custom UX, domain-specific needs	Quick prototyping, standard Q&A

Both approaches use the same underlying search. Start with Grounded Completions if you want simplicity, then switch to Search + Completions V2 when you need more control.

Try It: Frontend Example

The cookbook includes a browser-based frontend that connects to a proxy server, giving you a visual way to test both search and RAG. The proxy server keeps your credentials secure on the server side.

Architecture

Browser (HTML/JS) → Proxy Server (localhost:3000) → Gloo AI APIs

The proxy server exposes two endpoints:

GET /api/search?q=<query>&limit=<limit> — Basic search
POST /api/search/rag — Search + RAG with Completions V2
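
For orientation, here is a minimal sketch of how the GET endpoint could be served with only the Python standard library. It is not the cookbook's server. The names `parse_search_request` and `make_handler`, the limit clamping, and the error bodies are all assumptions, and `search_fn` stands in for a function like `search(query, limit)` from Step 1.

```python
import json
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs


def parse_search_request(path, default_limit=10, max_limit=100):
    """Extract (query, limit) from a path such as /api/search?q=hope&limit=5."""
    params = parse_qs(urlparse(path).query)
    query = params.get("q", [""])[0].strip()
    try:
        limit = int(params.get("limit", [default_limit])[0])
    except ValueError:
        limit = default_limit  # fall back when limit is not an integer
    return query, max(1, min(limit, max_limit))


def make_handler(search_fn):
    """Build a request handler that forwards GET /api/search to search_fn."""

    class SearchProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if urlparse(self.path).path != "/api/search":
                self._send(404, {"error": "not found"})
                return
            query, limit = parse_search_request(self.path)
            if not query:
                self._send(400, {"error": "missing q parameter"})
                return
            # Credentials stay on the server; the browser only sees the results.
            self._send(200, search_fn(query, limit))

        def _send(self, status, body):
            data = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return SearchProxyHandler


print(parse_search_request("/api/search?q=purpose&limit=5"))
```

To try it, pass the handler to the standard library's `HTTPServer`, for example `HTTPServer(("localhost", 3000), make_handler(search)).serve_forever()`. A production proxy would also need error handling around `search_fn` and the POST route for RAG.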

Start the Proxy Server

Each language includes a proxy server. Start one:

cd search-tutorial/python
source venv/bin/activate
python server.py

Then open http://localhost:3000 in your browser.

Search Results

Enter a query and click Search to see results with titles, content types, and relevance scores.

AI-Powered Answers (RAG)

Click Ask AI to send the same query through the RAG pipeline. The AI generates a response grounded in your search results, with sources listed.

[Screenshot: AI response generated from search results with source citations]

The frontend is language-agnostic — the same HTML/JS works with any language’s proxy server. This is one approach; customize for your branding and framework.

Complete Working Examples

View Complete Code

Clone or browse the complete working examples for all 6 languages (JavaScript, TypeScript, Python, PHP, Go, Java) with setup instructions, proxy servers, and a browser-based frontend.

What’s Included

Each language implementation provides:

auth — Shared OAuth2 token management with automatic refresh
config — Centralized configuration (URLs, env vars, RAG settings)
search_basic — Basic search (CLI script)
search_advanced — Advanced search + RAG helpers (CLI script)
server — Proxy server exposing REST endpoints for the frontend

Troubleshooting

No Results Returned

Missing certainty: Add "certainty": 0.5 to your payload. The API defaults to 0.75 when omitted, which may be too strict.
Content not indexed: Test in the Search Playground first. If results appear there but not via API, check your tenant name.
Wrong tenant: Results are scoped to your publisher. Verify the tenant name matches your publisher in Organizations.
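
When debugging empty results, it can help to retry the same query at progressively lower thresholds and see where results start to appear. The sketch below assumes a hypothetical `search_fn(query, certainty)` callable that returns the API's JSON response. The threshold values are illustrative, not recommended settings.

```python
def search_with_fallback(search_fn, query, thresholds=(0.75, 0.5, 0.3)):
    """Try each certainty threshold in order; return the first non-empty response."""
    for certainty in thresholds:
        results = search_fn(query, certainty)
        if results.get("data"):
            return results, certainty
    return {"data": []}, None


# Stand-in for a real search call: only returns hits at certainty <= 0.5.
def fake_search(query, certainty):
    return {"data": [{"uuid": "abc"}]} if certainty <= 0.5 else {"data": []}


results, used = search_with_fallback(fake_search, "How can I know my purpose?")
print(used, len(results["data"]))
```

If even the lowest threshold returns nothing, the tenant name or indexing status is a more likely cause than the certainty setting.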

403 Forbidden

You can only access your own publisher’s content. Other tenant names return 403.
Verify your Client ID and Client Secret are correct and not expired.

Slow Responses

Reduce the limit parameter. Request time rises non-linearly with larger result sets.
Start with limit=10 and increase only as needed.

Authentication Errors

Tokens expire. Implement token refresh logic (see Authentication Tutorial).
Ensure you’re using Bearer {token} in the Authorization header.
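
Beyond checking expiry before each call, some clients also retry once with a fresh token when a request is rejected. The sketch below assumes the API signals an invalid or expired token with HTTP 401, which is the usual convention but worth confirming against the API reference. The three callables are hypothetical stand-ins for your own request and token functions.

```python
def call_with_token_retry(send_request, get_token, refresh_token):
    """Send a request; if the response is 401, refresh the token and retry once."""
    response = send_request(get_token())
    if response.status_code == 401:
        response = send_request(refresh_token())
    return response
```

With the Step 1 code, `get_token` could be `ensure_valid_token`, `refresh_token` could be `lambda: get_access_token()["access_token"]`, and `send_request` a function that takes a token and returns the `requests` response.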

Empty RAG Responses

Verify search returns results before calling Completions V2.
Use auto_routing: true instead of specifying a model name.
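
The first check can be made explicit in code by skipping the Completions V2 call whenever search yields no usable snippets. In this sketch, `answer_if_grounded` is a hypothetical helper and `generate_fn(query, hits)` stands in for your own extract, format, and generate steps.

```python
def answer_if_grounded(query, results, generate_fn):
    """Call generate_fn only when at least one result carries snippet text."""
    hits = [
        r for r in (results.get("data") or [])
        if (r.get("properties") or {}).get("snippet")
    ]
    if not hits:
        return None  # nothing to ground the answer in, so skip the LLM call
    return generate_fn(query, hits)


# Stand-in generator that reports how many snippets it received.
def fake_generate(query, hits):
    return f"Answer based on {len(hits)} snippet(s)"


response = answer_if_grounded(
    "How can I know my purpose?",
    {"data": [{"properties": {"snippet": "Content text..."}}, {"properties": {}}]},
    fake_generate,
)
print(response)
```

Returning `None` lets the caller show a "no matching content" message instead of an answer that is not grounded in your content.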

Next Steps

Grounded Completions

Simpler RAG approach — single API call with automatic context management.

Upload Files

Add more content to your Data Engine for richer search results.

Completions V2 API

Full control over LLM interactions with custom prompts and parameters.

Search API Reference

Complete endpoint documentation with request/response schemas.
