This is Part 3 of the Build an End-to-End RAG Pipeline series. Parts 1 and 2 walked the happy path. Production code can’t assume it: requests fail, services blip, and you need to confirm that an operation actually took effect. This part builds a small resilient client and uses it to interpret API errors, retry transient failures, and verify ingestion health.
Gloo AI does not include a monitoring or health-check endpoint. Resilience is built from the same item APIs you’ve already used, plus disciplined error handling on the client side.

Pipeline at a Glance

  1. Publisher setup (Studio): create the publisher that owns your content (Part 1).
  2. Ingest content with metadata: upload files and enrich them (Part 1).
  3. Verify indexing: poll item status until your content is searchable (Part 1).
  4. Semantic search: query your content (deep dive: Building Custom Search).
  5. Grounded completions with sources: answer questions from your content with citations (deep dive: Grounded Completions with RAG).
  6. Content lifecycle: update, bulk-edit, and delete content (Part 2).
  7. Verification, errors & resilience: error handling and retry patterns (covered below).

Prerequisites

Before starting, ensure you have:

Step 1: Parse Errors and Retry

A resilient client does two things on every request: it turns failures into a normalized error (status, code, message) so callers can react to them, and it retries only transient failures — server errors (500, 502, 503, 504) and network blips — with exponential backoff. Client errors (400, 401, 403, 404, 422) are bugs in the request, not blips, so they fail fast. The Data Engine returns a few error shapes — {"detail": {"code", "message"}}, {"detail": "..."}, and {"error", "message"} — so the parser normalizes all of them.
import time
import requests

RETRYABLE = {500, 502, 503, 504}
MAX_RETRIES = 4


class ApiError(Exception):
    def __init__(self, status, code, message):
        super().__init__(f"[{status} {code}] {message}")
        self.status, self.code, self.message = status, code, message


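def parse_error(response):
    """Return (code, message) from any of the known error shapes."""
    try:
        body = response.json()
    except ValueError:
        # Non-JSON body (for example, an HTML gateway page): use the reason phrase.
        return None, response.reason
    if not isinstance(body, dict):
        # Valid JSON but not an object: nothing to extract.
        return None, response.reason
    detail = body.get("detail")
    if isinstance(detail, dict):  # {"detail": {"code", "message"}}
        return detail.get("code"), detail.get("message") or response.reason
    if isinstance(detail, str):  # {"detail": "..."}
        return None, detail
    # {"error", "message"}
    return body.get("error"), body.get("message") or response.reason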


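def request(method, url, token, **kwargs):
    # Attempts 0..MAX_RETRIES: one initial try plus up to MAX_RETRIES retries.
    for attempt in range(MAX_RETRIES + 1):
        response = requests.request(
            method, url, headers={"Authorization": f"Bearer {token}"}, timeout=30, **kwargs
        )
        # Transient server error with retries left: back off 1s, 2s, 4s, 8s.
        if response.status_code in RETRYABLE and attempt < MAX_RETRIES:
            delay = 2 ** attempt
            print(f"    Attempt {attempt + 1} failed ({response.status_code}); retrying in {delay}s")
            time.sleep(delay)
            continue
        # Client error, or a server error after the last retry: raise a normalized error.
        if not response.ok:
            code, message = parse_error(response)
            raise ApiError(response.status_code, code, message)
        # Some successful responses (for example, 204 No Content) have no JSON body.
        return response.json() if response.content else None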
These snippets are simplified for readability. The cookbook client also refreshes the access token once on a 401, retries network-level failures (the request function above retries only on HTTP status codes), and applies the same retry policy to multipart uploads. The snippets and the cookbook client follow the same patterns.
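As a rough illustration of the network-level retry that the simplified snippet leaves out, here is a minimal sketch. The helper name retry_transient and its parameters are hypothetical and not taken from the cookbook. It defaults to Python's built-in ConnectionError and TimeoutError so it runs without extra dependencies; with the requests library you would likely pass transient=(requests.ConnectionError, requests.Timeout), since those do not inherit from the built-ins.

```python
import time


def retry_transient(call, transient=(ConnectionError, TimeoutError),
                    max_retries=4, sleep=time.sleep):
    """Call `call()` and retry it when it raises a transient exception.

    Hypothetical helper, not part of the cookbook. `sleep` is injectable
    so the backoff can be observed without actually waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except transient:
            if attempt == max_retries:
                raise  # out of retries: surface the last failure
            sleep(2 ** attempt)  # same 1s, 2s, 4s, 8s schedule as Step 1


# Usage: a stand-in call that drops the connection twice, then succeeds.
attempts = {"n": 0}


def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("connection reset")
    return {"ok": True}


waits = []
result = retry_transient(flaky_call, sleep=waits.append)  # record delays instead of sleeping
print(result, waits)
```

Only the exception types listed in transient are retried; anything else, including ApiError for a 4xx, propagates immediately.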

Step 2: Interpret API Error Responses

With request and parse_error in place, error handling becomes uniform: catch the normalized error and read its status, code, and message. The calls below deliberately trigger three common failures — a missing item (404), a malformed ID (400), and a rejected token (403).
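import uuid

# ITEMS_URL (the items endpoint) and token (a valid access token) are assumed to be defined already.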
cases = [
    ("Missing item (random UUID)", f"{ITEMS_URL}/{uuid.uuid4()}", token),
    ("Malformed item ID", f"{ITEMS_URL}/not-a-valid-uuid", token),
    ("Rejected bearer token", f"{ITEMS_URL}/{uuid.uuid4()}", "invalid-token"),
]
for label, url, tok in cases:
    try:
        request("GET", url, tok)
        print(f"  {label}: unexpectedly succeeded")
    except ApiError as e:
        print(f"  {label}: status={e.status} code={e.code!r} message={e.message!r}")

What You’ll See

  Missing item (random UUID): status=404 code='Item not found' message='The requested item does not exist or has been permanently deleted'
  Malformed item ID: status=400 code='Invalid item ID format' message='Item ID must be a valid UUID format'
  Rejected bearer token: status=403 code='Forbidden - insufficient permissions' message='Forbidden'
A rejected or malformed token returns 403, not 401. A genuinely expired token typically returns 401 — which is why the cookbook client refreshes the token once on a 401 and retries. It does not refresh on 403, since 403 can be a legitimate permission denial (for example, an item that belongs to another publisher).
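The refresh-once-on-401 behavior described above could look roughly like the sketch below. This is not the cookbook's implementation: call_with_refresh, do_request, and refresh_token are hypothetical names, ApiError is repeated so the block runs on its own, and the stand-in error codes are invented for the demo.

```python
class ApiError(Exception):
    def __init__(self, status, code, message):
        super().__init__(f"[{status} {code}] {message}")
        self.status, self.code, self.message = status, code, message


def call_with_refresh(do_request, token, refresh_token):
    """Run do_request(token); on a 401, refresh the token once and retry.

    do_request and refresh_token are hypothetical callables supplied by
    the caller. Any other error, including a 403, is re-raised untouched.
    """
    try:
        return do_request(token)
    except ApiError as e:
        if e.status != 401:
            raise
    # Only reached after a 401: one refresh, one retry, no loop.
    return do_request(refresh_token())


# Usage with stand-ins: the first token is expired, the refreshed one works.
def fake_request(token):
    if token == "expired":
        raise ApiError(401, "unauthorized", "Token expired")
    if token == "wrong-publisher":
        raise ApiError(403, "forbidden", "Forbidden")
    return {"token_used": token}


result = call_with_refresh(fake_request, "expired", lambda: "fresh")
print(result)
```

Because the retry happens at most once, a token that is still rejected after the refresh surfaces as an error instead of looping.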

Step 3: Retry Transient Failures

Transient failures — a 503, a dropped connection — should be retried, not surfaced. The retry loop from Step 1 already does this; here it is in isolation, recovering from a service that fails twice before succeeding.
A healthy API won’t return a 5xx on demand, so this example simulates a transient failure to exercise the backoff path. In production the same path handles real 5xx responses and network errors.
RETRYABLE_DELAYS = [2 ** i for i in range(MAX_RETRIES)]

# Simulated service: raises a 503 on the first two calls, then succeeds on the third.
calls = 0


def flaky():
    global calls
    calls += 1
    if calls < 3:
        raise ApiError(503, "service_unavailable", "Service temporarily unavailable")
    return {"ok": True}

for attempt in range(MAX_RETRIES + 1):
    try:
        flaky()
        break
    except ApiError as e:
        if e.status in RETRYABLE and attempt < MAX_RETRIES:
            delay = RETRYABLE_DELAYS[attempt]
            print(f"    Attempt {attempt + 1} failed ({e.status}: {e.code}); retrying in {delay}s")
            time.sleep(delay)
        else:
            raise
print(f"  Succeeded after {calls} attempts")

What You’ll See

    Attempt 1 failed (503: service_unavailable); retrying in 1s
    Attempt 2 failed (503: service_unavailable); retrying in 2s
  Succeeded after 3 attempts

Step 4: Verify Ingestion Health

There’s no health endpoint, so you verify by checking the items you care about. Upload a batch, wait for indexing, then fetch each item’s status and roll it up into a summary — treating a 404 as “not found” rather than an error so one missing item doesn’t abort the check. The example deliberately includes a random ID to show that path.
# item_ids: uploaded and indexed via the resilient client (see the cookbook)
to_check = item_ids + [str(uuid.uuid4())]
summary = {"completed": 0, "pending": 0, "failed": 0, "not_found": 0}
for item_id in to_check:
    try:
        # A missing or null status counts as pending rather than raising AttributeError.
        status = (request("GET", f"{ITEMS_URL}/{item_id}", token).get("status") or "").upper()
        if status == "COMPLETED":
            summary["completed"] += 1
        elif status in ("FAILED", "ERROR"):
            summary["failed"] += 1
        else:
            summary["pending"] += 1
    except ApiError as e:
        if e.status == 404:
            summary["not_found"] += 1
        else:
            raise
print(f"  Health: {summary['completed']} completed, {summary['pending']} pending, "
      f"{summary['failed']} failed, {summary['not_found']} not found")

What You’ll See

  Health: 2 completed, 0 pending, 0 failed, 1 not found
The two uploaded items report COMPLETED; the random ID is counted as not found instead of throwing.
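The "wait for indexing" step that precedes the health check is not shown in the snippet. One way to sketch it is a bounded polling loop. The function wait_for_indexing and the fetch_status callable are hypothetical, and the "PROCESSING" status in the demo is an assumed placeholder for any non-terminal state; only COMPLETED, FAILED, and ERROR appear in the text above.

```python
import time


def wait_for_indexing(fetch_status, item_ids, timeout=120, interval=5,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until every item reaches a terminal status or the timeout passes.

    fetch_status(item_id) is a hypothetical callable returning a status
    string. Returns {item_id: last status seen}.
    """
    terminal = {"COMPLETED", "FAILED", "ERROR"}
    statuses = {item_id: None for item_id in item_ids}
    deadline = clock() + timeout
    while True:
        for item_id in item_ids:
            # Skip items that already finished; re-check the rest.
            if statuses[item_id] not in terminal:
                statuses[item_id] = (fetch_status(item_id) or "").upper()
        if all(s in terminal for s in statuses.values()) or clock() >= deadline:
            return statuses
        sleep(interval)


# Usage with canned responses: item "a" needs two polls, item "b" needs one.
responses = {"a": iter(["PROCESSING", "COMPLETED"]), "b": iter(["COMPLETED"])}
result = wait_for_indexing(lambda i: next(responses[i]), ["a", "b"],
                           sleep=lambda seconds: None)  # skip real waiting in the demo
print(result)
```

The timeout keeps a stuck item from blocking the check forever; items still pending at the deadline are returned with their last observed status so the summary can count them.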

Run the Complete Example

The cookbook ties it together — a resilient client that handles errors, retries, and a full upload → verify → clean-up health check — in all six languages. From the cookbook repository, install dependencies, copy .env.example to .env and add your credentials, then run it:
cd rag-pipeline-part-3/python
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env       # then add your Client ID, Secret, and Publisher ID
python main.py
You’ll see:
Step 1: Resilient client ready (token refresh, error parsing, retry/backoff).

Step 2: Interpreting API error responses...
  Missing item (random UUID): status=404 code='Item not found' message='The requested item does not exist or has been permanently deleted'
  Malformed item ID: status=400 code='Invalid item ID format' message='Item ID must be a valid UUID format'
  Rejected bearer token: status=403 code='Forbidden - insufficient permissions' message='Forbidden'

Step 3: Retrying transient failures with backoff...
    Attempt 1 failed (503: service_unavailable); retrying in 1s
    Attempt 2 failed (503: service_unavailable); retrying in 2s
  Succeeded after 3 attempts

Step 4: Verifying ingestion health...
  Uploading and indexing a batch...
  Waiting for 2 item(s) to finish indexing...
  Health: 2 completed, 0 pending, 0 failed, 1 not found
  Cleaned up 2 item(s)

Done. The resilient client handled errors, retries, and verification end to end.

Working Code Sample

View Complete Code

Clone or browse the complete resilient client for all 6 languages (JavaScript, TypeScript, Python, PHP, Go, Java) with setup instructions and the sample content files.

Troubleshooting

A request fails immediately instead of retrying

That’s intended for client errors (400, 401, 403, 404, 422) — they won’t succeed on retry. Only 500/502/503/504 and network failures are retried.

Retries never stop / take too long

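Check your retry budget. With base-2 exponential backoff and MAX_RETRIES = 4, the waits are 1s, 2s, 4s, 8s, about 15 seconds in total before the final attempt. The snippets in this part do not cap the delay, so each extra retry doubles the longest wait. Lower MAX_RETRIES or cap the delay for latency-sensitive paths.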
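To see how MAX_RETRIES and a cap affect the schedule, the delays can be computed up front. This is a small sketch; backoff_delays and its cap parameter are hypothetical names, not part of the cookbook client.

```python
def backoff_delays(max_retries=4, base=1, cap=None):
    """Return the wait before each retry: base * 2**attempt, optionally capped."""
    delays = [base * 2 ** attempt for attempt in range(max_retries)]
    if cap is not None:
        delays = [min(d, cap) for d in delays]
    return delays


print(backoff_delays())                     # [1, 2, 4, 8]
print(sum(backoff_delays()))                # 15
print(backoff_delays(cap=3))                # [1, 2, 3, 3]
print(sum(backoff_delays(max_retries=2)))   # 3
```

Summing the list gives the worst-case time spent sleeping, which is a quick way to check a retry policy against a latency budget.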

Empty error message

Some error responses carry a code but no message. The parser falls back to the HTTP reason phrase (for example, Forbidden) so you always have something to log.
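The fallback can be checked offline against each error shape. The sketch below repeats parse_error (with a guard for non-object JSON) and feeds it a hypothetical FakeResponse stand-in for requests.Response. The sample bodies are invented for illustration, except the third, which mirrors the 403 output shown in Step 2.

```python
import json


class FakeResponse:
    """Stand-in for requests.Response with only what parse_error reads."""

    def __init__(self, reason, text):
        self.reason, self._text = reason, text

    def json(self):
        return json.loads(self._text)  # raises ValueError on non-JSON bodies


def parse_error(response):
    try:
        body = response.json()
    except ValueError:
        return None, response.reason
    if not isinstance(body, dict):
        return None, response.reason
    detail = body.get("detail")
    if isinstance(detail, dict):
        return detail.get("code"), detail.get("message") or response.reason
    if isinstance(detail, str):
        return None, detail
    return body.get("error"), body.get("message") or response.reason


samples = [
    FakeResponse("Not Found", '{"detail": {"code": "not_found", "message": "No such item"}}'),
    FakeResponse("Bad Request", '{"detail": "Invalid ID"}'),
    FakeResponse("Forbidden", '{"error": "Forbidden - insufficient permissions"}'),
    FakeResponse("Bad Gateway", "<html>502</html>"),
]
results = [parse_error(r) for r in samples]
for code, message in results:
    print(f"code={code!r} message={message!r}")
```

The third and fourth samples carry no message, so the reason phrase ("Forbidden", "Bad Gateway") is what ends up in the log.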

403 on a token you believe is valid

A malformed or rejected token returns 403. Re-check the token; if it’s genuinely expired you’ll typically get a 401, which the cookbook client refreshes automatically. See the Authentication Tutorial.

Next Steps

That completes the Build an End-to-End RAG Pipeline series — you’ve set up a publisher and ingested content (Part 1), managed its lifecycle (Part 2), and made the integration resilient (Part 3). From here:
  1. Building Custom Search — surface your content to users
  2. Grounded Completions with RAG — answer questions from your content with citations
  3. Fold the resilient client into your own service so every Gloo AI call gets consistent error handling and retries.