Prerequisites
Before starting, ensure you have:
- A Gloo AI Studio account
- Your Client ID and Client Secret from the API Credentials page
- Authentication setup - Complete the Authentication Tutorial first
The Realtime Ingestion API requires Bearer token authentication. If you haven’t set up authentication yet, follow the Authentication Tutorial to learn how to exchange your credentials for access tokens and manage token expiration.
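For reference, here is a minimal sketch of that exchange in Python. The token URL is a placeholder (use the endpoint from the Authentication Tutorial), and the GLOO_CLIENT_ID/GLOO_CLIENT_SECRET variable names are just this guide's convention:

```python
import os

import requests

# Placeholder: substitute the token endpoint from the Authentication Tutorial.
TOKEN_URL = "https://example.gloo.ai/oauth2/token"

def get_access_token(client_id: str, client_secret: str) -> str:
    """Exchange your Client ID/Secret for a Bearer access token (client credentials flow)."""
    resp = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials"},
        auth=(client_id, client_secret),  # HTTP Basic auth with your credentials
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

token = get_access_token(os.environ["GLOO_CLIENT_ID"], os.environ["GLOO_CLIENT_SECRET"])
```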
Step 1: Understanding the Realtime API
The Realtime Ingestion API allows you to upload content that gets processed and made available for search and AI interaction. The primary endpoint is:
POST /ingestion/v1/real_time_upload
Key Features
- Real-time Processing: Content is processed upon upload
- Rich Metadata: Support for comprehensive content categorization
- Flexible Content Types: Articles, documents, media, and structured content
- Automatic Indexing: Content becomes searchable instantly
Required Fields
Optional Metadata Fields
- Content Details: author, publication_date, item_title, item_subtitle, item_summary
- Categorization: type, pub_type, denomination, item_tags
- Media: item_image, item_url, hosted_url
- Access Control: drm, evergreen
- Hierarchical Structure: h2_title, h3_title for document sections
Step 2: Basic Content Upload
Let’s start with a simple content upload example. This demonstrates the core API call with proper authentication and error handling.
API Request Structure
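Here is a minimal request sketch in Python. The base URL is a placeholder, the payload uses the optional metadata fields listed above, and the required fields are not reproduced in this extract, so treat the payload as illustrative rather than complete:

```python
import requests

# Placeholder base URL; use the host given in Gloo AI Studio.
BASE_URL = "https://example.gloo.ai"

def upload_content(token: str, payload: dict) -> dict:
    """POST a single content item to the realtime ingestion endpoint."""
    resp = requests.post(
        f"{BASE_URL}/ingestion/v1/real_time_upload",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=60,
    )
    resp.raise_for_status()  # surface 4xx/5xx errors to the caller
    return resp.json()

# Illustrative payload built from the optional metadata fields above;
# check the API reference for the exact required fields.
payload = {
    "item_title": "Getting Started with Realtime Ingestion",
    "item_summary": "A short overview of the upload flow.",
    "author": "Jane Doe",
    "publication_date": "2024-01-15",
    "item_tags": ["tutorial", "ingestion"],
}
print(upload_content(token, payload))  # token from the Prerequisites sketch
```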
Expected Response
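The response schema is not reproduced in this extract, so rather than guess at field names, a sketch like the following simply branches on the HTTP status and prints whatever JSON body the API returns:

```python
import requests

def describe_response(resp: requests.Response) -> None:
    """Report the outcome of an upload; see the API reference for the exact body schema."""
    if resp.ok:
        print("Upload accepted:", resp.json())
    else:
        print(f"Upload failed: HTTP {resp.status_code}: {resp.text}")
```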
Step 3: Verifying Content Upload
After uploading content, you can check on progress through Gloo AI Studio:
- Log in to Gloo AI Studio.
- Navigate to the Data Engine section from the main Studio sidebar.
- Click on Your Data.

Step 4: Setting Up File Monitoring
For automated content ingestion, you’ll want to monitor directories for new files and automatically process them. This creates a real-time content pipeline. The strategy below outlines the steps, with a minimal sketch after the list.
File Watching Strategy
- Monitor Target Directory: Watch for new files or changes
- Extract Metadata: Parse filename and content for metadata
- Validate Content: Ensure required fields are present
- Upload with Retry: Handle failures gracefully
- Log Results: Track successful and failed uploads
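Below is a minimal sketch of that strategy using the third-party watchdog package, which is an implementation choice rather than anything the API mandates; it reuses upload_content from the Step 2 sketch, and the item_text field name is hypothetical:

```python
import time
from pathlib import Path

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class IngestHandler(FileSystemEventHandler):
    """Upload every newly created file in the watched directory."""

    def on_created(self, event):
        if event.is_directory:
            return
        path = Path(event.src_path)
        payload = {
            "item_title": path.stem,        # crude metadata parsed from the filename
            "item_text": path.read_text(),  # hypothetical field name for the body text
        }
        try:
            result = upload_content(token, payload)  # from the Step 2 sketch
            print(f"Uploaded {path.name}: {result}")
        except Exception as exc:
            print(f"Failed to upload {path.name}: {exc}")  # retry logic: see Step 6

observer = Observer()
observer.schedule(IngestHandler(), "content/incoming", recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)  # keep the process alive while the observer thread watches
except KeyboardInterrupt:
    observer.stop()
observer.join()
```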
Step 5: Batch Processing Pipeline
For processing multiple files or handling large volumes of content, batch processing provides better performance and resource management. The benefits below are followed by a simplified sketch.
Batch Processing Benefits
- Rate Limiting: Control API call frequency
- Progress Tracking: Monitor processing status
- Error Recovery: Retry failed uploads
- Resource Management: Efficient memory and network usage
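A simplified sketch of such a pipeline is below; the delay and retry counts are arbitrary starting points, upload_content comes from the Step 2 sketch, and item_text is a hypothetical field name:

```python
import time
from pathlib import Path

def process_batch(paths, token, delay_seconds=1.0, max_retries=3):
    """Upload files sequentially with rate limiting, retries, and progress tracking."""
    succeeded, failed = [], []
    for i, path in enumerate(paths, start=1):
        payload = {"item_title": path.stem, "item_text": path.read_text()}
        for attempt in range(1, max_retries + 1):
            try:
                upload_content(token, payload)  # from the Step 2 sketch
                succeeded.append(path)
                break
            except Exception as exc:
                if attempt == max_retries:
                    failed.append((path, str(exc)))
        print(f"[{i}/{len(paths)}] {len(succeeded)} ok, {len(failed)} failed")
        time.sleep(delay_seconds)  # crude rate limiting between API calls
    return succeeded, failed

ok, errors = process_batch(sorted(Path("content/batch").glob("*.txt")), token)
```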
Step 6: Production Considerations
Error Handling
- Authentication Failures: Token refresh and retry
- Rate Limiting: Exponential backoff (see the retry sketch after this list)
- Network Issues: Connection retry logic
- Validation Errors: Content preprocessing
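A combined sketch of these behaviors, assuming the upload_content and get_access_token helpers from earlier: retry transient failures with exponential backoff, refresh the token on 401, and give up immediately on validation errors:

```python
import time

import requests

def upload_with_backoff(token_provider, payload, max_attempts=5):
    """Retry transient failures with exponential backoff; refresh the token on 401."""
    token = token_provider()
    for attempt in range(max_attempts):
        try:
            return upload_content(token, payload)  # from the Step 2 sketch
        except requests.HTTPError as exc:
            status = exc.response.status_code
            if status == 401:
                token = token_provider()  # expired token: fetch a fresh one
            elif status != 429 and status < 500:
                raise  # validation errors will not succeed on retry
        except requests.ConnectionError:
            pass  # network blip: fall through to the backoff sleep
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s, ...
    raise RuntimeError("upload failed after retries")

# Usage: upload_with_backoff(lambda: get_access_token(client_id, client_secret), payload)
```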
Monitoring and Logging
- Success Metrics: Upload counts and timing (a structured-logging sketch follows this list)
- Error Tracking: Failed uploads with reasons
- Performance Monitoring: API response times
- Health Checks: System status monitoring
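One lightweight way to cover these points, sketched with only the standard library, is to emit one structured JSON log line per upload that downstream metrics and alerting can consume:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("ingestion")

def log_upload(path, ok, duration_ms, error=None):
    """Emit one structured JSON log line per upload attempt."""
    log.info(json.dumps({
        "event": "upload",
        "file": str(path),
        "ok": ok,
        "duration_ms": round(duration_ms, 1),
        "error": error,
        "ts": time.time(),
    }))
```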
Complete Examples
The following examples combine token management, file processing, and error handling into complete, production-ready solutions for each language. First, set up your environment variables in a .env file:
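(The variable names here are this guide's convention; match them to whatever the sandbox examples expect.)

```
GLOO_CLIENT_ID=your-client-id
GLOO_CLIENT_SECRET=your-client-secret
```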
Testing Your Implementation
To test any of the complete examples:
1. Environment Setup
Create a .env file with your Gloo AI credentials, as shown above.
2. Install Dependencies
Each language has specific setup requirements - check the README files in the sandbox examples.
3. Test Single File Upload
4. Test Directory Monitoring
5. Test Batch Processing
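Sample invocations for steps 3-5 are below; the script names are hypothetical placeholders for the entry points in the sandbox examples:

```bash
# Hypothetical script names; substitute the entry points from the sandbox examples.
python upload_file.py content/article.txt      # 3. single file upload
python watch_directory.py content/incoming/    # 4. directory monitoring
python batch_upload.py content/batch/          # 5. batch processing
```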
When run, the complete examples will:
- Automatically handle token retrieval and refresh
- Extract metadata from filenames and content
- Upload content with proper error handling
- Display success/failure status for each operation
- Provide structured JSON responses from the API
Production Deployment Considerations
Error Handling
- Authentication: Automatic token refresh with exponential backoff
- Rate Limiting: Built-in delays between API calls
- Network Issues: Retry logic with configurable timeouts
- File Processing: Validation and error recovery
Monitoring
- Logging: Structured logs for all operations
- Metrics: Success rates, processing times, error counts
- Alerting: Failed upload notifications
- Health Checks: System status monitoring
Scaling
- Concurrent Processing: Parallel file processing capabilities (see the sketch after this list)
- Queue Management: Async processing for high-volume scenarios
- Resource Management: Memory and CPU optimization
- Load Balancing: Multiple instance coordination
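For the concurrent-processing point, here is a small sketch using a thread pool bounded to respect rate limits, again reusing the upload_content helper from Step 2 and the hypothetical item_text field:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

def upload_file(path: Path, token: str) -> dict:
    payload = {"item_title": path.stem, "item_text": path.read_text()}
    return upload_content(token, payload)  # from the Step 2 sketch

paths = sorted(Path("content/batch").glob("*.txt"))
with ThreadPoolExecutor(max_workers=4) as pool:  # cap workers to respect rate limits
    futures = {pool.submit(upload_file, p, token): p for p in paths}
    for future in as_completed(futures):
        path = futures[future]
        try:
            future.result()
            print(f"ok: {path.name}")
        except Exception as exc:
            print(f"failed: {path.name}: {exc}")
```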
Next Steps
Now that you have a working content ingestion pipeline, consider exploring:
- Search API - Query your ingested content
- Chat Integration - Use ingested content in conversations
- Content Management - Organize and categorize content
- Advanced Metadata - Rich content classification and tagging
- Custom Pipelines - Integrate with CMS, RSS feeds, or other content sources