Channel Types
GPT-Load supports multiple mainstream AI service providers and offers fully transparent proxying that preserves each provider's native API format and behavior.
Supported Services
OpenAI
- Chat Completions API
- Embeddings API
- Images API
- Audio API
- Files API
- Models API
Google Gemini
- Generate Content API
- Streaming Support
- Multi-modal Inputs
- Safety Settings
- Generation Config
- Models Management
Anthropic Claude
- Messages API
- Streaming Responses
- System Prompts
- Tool Use
- Token Counting
- Models Access
Extensibility
The architecture is designed so that new AI service providers can be added quickly: each provider is integrated through a standardized adapter layer that exposes unified access. A conceptual sketch of such a layer follows.
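As a rough illustration, the sketch below models such an adapter layer in Python. It is purely conceptual: GPT-Load's actual internals differ, and the ChannelAdapter interface and adapter registry here are invented names.

```python
from abc import ABC, abstractmethod


class ChannelAdapter(ABC):
    """Hypothetical adapter interface: one implementation per upstream provider."""

    @abstractmethod
    def upstream_base_url(self) -> str: ...

    @abstractmethod
    def auth_headers(self, api_key: str) -> dict: ...


class OpenAIAdapter(ChannelAdapter):
    def upstream_base_url(self) -> str:
        return "https://api.openai.com"

    def auth_headers(self, api_key: str) -> dict:
        return {"Authorization": f"Bearer {api_key}"}


class ClaudeAdapter(ChannelAdapter):
    def upstream_base_url(self) -> str:
        return "https://api.anthropic.com"

    def auth_headers(self, api_key: str) -> dict:
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}


# Supporting a new provider then amounts to registering one more adapter.
ADAPTERS = {"openai": OpenAIAdapter(), "claude": ClaudeAdapter()}
```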
Proxy Format
Unified Proxy Endpoint
```
http://localhost:3001/proxy/{group-name}
```

Parameter Description
- group-name: the group name created in the management interface
- Arbitrary path suffixes are supported and forwarded fully transparently
- All functionality of the original API is preserved
Authentication
- Use the original service's API key
- It is passed through via the Authorization: Bearer {token} header (see the sketch below)
- Group-level key rotation and load balancing are supported
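Putting the pieces together, here is a minimal sketch of a raw HTTP call through the proxy, assuming a group named openai and GPT-Load listening on localhost:3001:

```python
import os

import requests

# Any path suffix after the group name is forwarded as-is to the upstream API.
url = "http://localhost:3001/proxy/openai/v1/chat/completions"

resp = requests.post(
    url,
    headers={
        "Content-Type": "application/json",
        # The Authorization header is passed through unchanged.
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    },
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]},
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```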
OpenAI Format Integration
Authentication Configuration
GPT-Load is fully compatible with the OpenAI SDK; changing base_url is all that is needed to switch over seamlessly.
Original OpenAI Request
```bash
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```

Via GPT-Load Proxy
```bash
curl http://localhost:3001/proxy/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```

Supported Endpoints
Core APIs
- /v1/chat/completions - Chat completions
- /v1/embeddings - Vector embeddings
- /v1/images/generations - Image generation
- /v1/audio/speech - Text-to-speech
- /v1/audio/transcriptions - Speech-to-text
Other APIs
- /v1/models - Model listing (see the sanity check after this list)
- /v1/files - File management
- /v1/fine_tuning/jobs - Fine-tuning jobs
- /v1/assistants - Assistants API
- /v1/threads - Conversation threads
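A quick way to verify that a group is wired up correctly is to list models through the proxy. A minimal sketch, again assuming a group named openai:

```python
import os

import requests

resp = requests.get(
    "http://localhost:3001/proxy/openai/v1/models",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
)
resp.raise_for_status()
# Print the model IDs returned by the upstream API.
for model in resp.json()["data"]:
    print(model["id"])
```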
SDK Configuration
Python SDK
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="http://localhost:3001/proxy/openai"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
```
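Streaming works through the proxy unchanged. A minimal sketch with the same SDK configuration:

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="http://localhost:3001/proxy/openai",
)

# stream=True forwards server-sent events through the proxy as they arrive.
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```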
Node.js SDK

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'http://localhost:3001/proxy/openai'
});

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});
```

Gemini Format Integration
Authentication Configuration
Fully compatible with the Google Gemini API, supporting all native features, including multi-modal inputs and streaming responses.
Original Gemini Request
```bash
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a story about a magic backpack."
      }]
    }]
  }'
```

Via GPT-Load Proxy
```bash
curl "http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a story about a magic backpack."
      }]
    }]
  }'
```

Supported Endpoints
Content Generation
- /v1beta/models/*/generateContent - Content generation
- /v1beta/models/*/streamGenerateContent - Streaming generation
- /v1beta/models/*/countTokens - Token counting (example after this list)
- /v1beta/models/*/embedContent - Vector embeddings
Model Management
- /v1beta/models - Model listing
- /v1beta/models/* - Model details
- /v1beta/tunedModels (POST) - Fine-tuning creation
- /v1beta/tunedModels (GET) - Fine-tuned model listing
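As an example of the auxiliary endpoints above, the sketch below calls countTokens through the proxy with plain HTTP; the group name gemini and the GEMINI_API_KEY environment variable are assumptions:

```python
import os

import requests

resp = requests.post(
    "http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:countTokens",
    params={"key": os.environ["GEMINI_API_KEY"]},
    json={"contents": [{"parts": [{"text": "Write a story about a magic backpack."}]}]},
)
resp.raise_for_status()
# The response reports the token count for the given contents.
print(resp.json()["totalTokens"])
```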
SDK Configuration
Python SDK
```python
import google.generativeai as genai

# Configure the API key and point the SDK at the GPT-Load proxy.
genai.configure(
    api_key="your-gemini-api-key",
    transport="rest",  # assumed: REST transport so requests go over HTTP to the proxy
    client_options={"api_endpoint": "http://localhost:3001/proxy/gemini"},
)

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Hello!")
print(response.text)
```

HTTP Request
```http
POST http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:generateContent?key=YOUR_API_KEY
Content-Type: application/json

{
  "contents": [{
    "parts": [{
      "text": "Explain how AI works"
    }]
  }]
}
```

Claude Format Integration
Authentication Configuration
Fully compatible with the Anthropic Claude API, supporting the Messages API, tool use, streaming responses, and other advanced features.
Original Claude Request
```bash
curl https://api.anthropic.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude"}
    ]
  }'
```

Via GPT-Load Proxy
```bash
curl http://localhost:3001/proxy/claude/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude"}
    ]
  }'
```

Supported Endpoints
Core APIs
- /v1/messages - Message conversations
- Streaming conversations via "stream": true on /v1/messages (see the sketch after this list)
- /v1/complete - Text completion (legacy)
- Tool use via the tools parameter of /v1/messages
Model Management
- /v1/models - Available model list
- Supports the full Claude 3 model series
- Supports custom max_tokens limits
- Supports system prompt configuration
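As noted in the list above, streaming uses the same /v1/messages endpoint with "stream": true. A minimal sketch over plain HTTP, assuming a group named claude:

```python
import json
import os

import requests

resp = requests.post(
    "http://localhost:3001/proxy/claude/v1/messages",
    headers={
        "Content-Type": "application/json",
        "x-api-key": os.environ["ANTHROPIC_API_KEY"],
        "anthropic-version": "2023-06-01",
    },
    json={
        "model": "claude-3-sonnet-20240229",
        "max_tokens": 1024,
        "stream": True,  # server-sent events are forwarded through the proxy
        "messages": [{"role": "user", "content": "Hello, Claude"}],
    },
    stream=True,
)
resp.raise_for_status()
# Each SSE line carries an event payload; print text deltas as they arrive.
for line in resp.iter_lines():
    if line.startswith(b"data: "):
        event = json.loads(line[len(b"data: "):])
        if event.get("type") == "content_block_delta":
            print(event["delta"].get("text", ""), end="")
```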
SDK Configuration
Python SDK
```python
from anthropic import Anthropic

client = Anthropic(
    api_key="your-claude-api-key",
    base_url="http://localhost:3001/proxy/claude"
)

message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
```

HTTP Request
```http
POST http://localhost:3001/proxy/claude/v1/messages
Content-Type: application/json
x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01

{
  "model": "claude-3-sonnet-20240229",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}
```

Group Management
Creating Groups
1. Access the GPT-Load management interface
2. Navigate to "Environment Management" -> "Group Settings"
3. Click "Add Group" and fill in the group information
4. Select the corresponding channel type (OpenAI/Gemini/Claude)
5. Configure the upstream address and test path
6. Add API keys and test the connection
7. Save the configuration and enable the group
Configuration Points
- The group name becomes part of the proxy path
- Multiple API keys per group are supported
- Keys are rotated automatically with load balancing (a conceptual sketch follows this list)
- Key health checks and failover are supported
- Request rate limits and quota management can be configured
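Conceptually, rotation with failover can be pictured as a round-robin over the group's healthy keys. The sketch below illustrates the idea only; it is not GPT-Load's actual implementation:

```python
from itertools import cycle


class KeyPool:
    """Conceptual round-robin key pool with failover (illustrative only)."""

    def __init__(self, keys):
        self.healthy = list(keys)
        self._rotation = cycle(self.healthy)

    def next_key(self) -> str:
        if not self.healthy:
            raise RuntimeError("no healthy keys left in the group")
        return next(self._rotation)

    def mark_failed(self, key: str) -> None:
        # A failed key is dropped; traffic fails over to the remaining keys.
        if key in self.healthy:
            self.healthy.remove(key)
            self._rotation = cycle(self.healthy)


pool = KeyPool(["sk-key-1", "sk-key-2", "sk-key-3"])
for _ in range(3):
    print(pool.next_key())    # rotates: sk-key-1, sk-key-2, sk-key-3
pool.mark_failed("sk-key-2")  # subsequent calls skip the failed key
```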
Migration Guide
Migration Steps
1. Assess the current state: analyze the AI services in use and how their APIs are called
2. Deploy GPT-Load: deploy the GPT-Load service following the quick start guide
3. Update configuration: point the API base addresses in your applications at GPT-Load
Seamless Migration
GPT-Load's design philosophy is complete transparency: migration requires no changes to business logic. Changing the API endpoint addresses is enough to gain unified management and load balancing, as the sketch below shows.
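In practice the change is usually a single line. A sketch with the OpenAI Python SDK, where the group name openai is whatever you configured:

```python
from openai import OpenAI

# Before: the SDK talks to OpenAI directly.
# client = OpenAI(api_key="your-openai-api-key")

# After: the same code, pointed at the GPT-Load proxy.
client = OpenAI(
    api_key="your-openai-api-key",
    base_url="http://localhost:3001/proxy/openai",
)
```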
Summary
Transparent Proxy
- Maintains native API format
- No need to modify business code
- Supports all functionality
Unified Management
- Multi-service unified access
- Centralized key management
- Unified monitoring and alerting
Highly Scalable
- Load balancing and failover
- Horizontal scaling support
- Enterprise-grade performance