Channel Types

GPT-Load supports multiple mainstream AI service providers, offering fully transparent proxy access that preserves each provider's native API format and behavior.

Supported Services

OpenAI

  • Chat Completions API
  • Embeddings API
  • Images API
  • Audio API
  • Files API
  • Models API

Google Gemini

  • Generate Content API
  • Streaming Support
  • Multi-modal Inputs
  • Safety Settings
  • Generation Config
  • Models Management

Anthropic Claude

  • Messages API
  • Streaming Responses
  • System Prompts
  • Tool Use
  • Token Counting
  • Models Access

Extensibility

The architecture is designed so that new AI service providers can be added quickly: a standardized adapter layer maps each provider onto the unified access model.

Proxy Format

Unified Proxy Endpoint

http://localhost:3001/proxy/{group-name}

Parameter Description

  • group-name: Group name created in the management interface
  • Supports arbitrary path suffixes with fully transparent forwarding
  • Maintains all functionality of the original API

Authentication

  • Use original service's API Key
  • Pass through Authorization: Bearer {token} header
  • Supports group-level key rotation and load balancing
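
The endpoint pattern above amounts to simple string assembly, which can be sketched as follows (illustrative only; the base address and group names are example values):

```python
# Build a GPT-Load proxy URL from the proxy base address, a group name,
# and the original provider path. Purely illustrative string construction.
def proxy_url(base: str, group: str, original_path: str) -> str:
    return f"{base.rstrip('/')}/proxy/{group}{original_path}"

url = proxy_url("http://localhost:3001", "openai", "/v1/chat/completions")
print(url)  # http://localhost:3001/proxy/openai/v1/chat/completions
```

The original Authorization header is passed through unchanged, so the same API key works against both addresses.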

OpenAI Format Integration

Authentication Configuration

GPT-Load is fully compatible with the OpenAI SDK; switching over only requires changing the base_url.

Original OpenAI Request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Via GPT-Load Proxy

curl http://localhost:3001/proxy/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
Only the API base address changes; all other code stays the same
All OpenAI SDK functionality is supported

Supported Endpoints

Core APIs

  • /v1/chat/completions - Chat completions
  • /v1/embeddings - Vector embeddings
  • /v1/images/generations - Image generation
  • /v1/audio/speech - Text-to-speech
  • /v1/audio/transcriptions - Speech-to-text

Other APIs

  • /v1/models - Model listing
  • /v1/files - File management
  • /v1/fine_tuning/jobs - Fine-tuning jobs
  • /v1/assistants - Assistants API
  • /v1/threads - Conversation threads

SDK Configuration

Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="http://localhost:3001/proxy/openai"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

Node.js SDK

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'http://localhost:3001/proxy/openai'
});

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});

Gemini Format Integration

Authentication Configuration

Fully compatible with Google Gemini API, supporting all native features including multi-modal inputs and streaming responses.

Original Gemini Request

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a story about a magic backpack."
      }]
    }]
  }'

Via GPT-Load Proxy

curl "http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a story about a magic backpack."
      }]
    }]
  }'
Replace the request base address with the GPT-Load proxy address
Keep all parameters and the authentication method unchanged

Supported Endpoints

Content Generation

  • /v1beta/models/*/generateContent - Content generation
  • /v1beta/models/*/streamGenerateContent - Streaming generation
  • /v1beta/models/*/countTokens - Token counting
  • /v1beta/models/*/embedContent - Vector embeddings
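
Unlike OpenAI's header-based authentication, Gemini passes the API key as a query parameter. Assembling a proxied generateContent URL can be sketched like this (the base address, group name, and key are placeholder values):

```python
from urllib.parse import urlencode

# Sketch: assembling a proxied Gemini generateContent URL. The API key
# travels as a "key" query parameter rather than an Authorization header.
base = "http://localhost:3001/proxy/gemini"
model = "gemini-pro"
url = f"{base}/v1beta/models/{model}:generateContent?" + urlencode({"key": "your-gemini-api-key"})
print(url)
```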

Model Management

  • /v1beta/models - Model listing
  • /v1beta/models/* - Model details
  • /v1beta/tunedModels - Create and list tuned models (fine-tuning)

SDK Configuration

Python SDK

import google.generativeai as genai

# Configure the API key and point the SDK at the proxy;
# transport="rest" makes the SDK use the custom HTTP endpoint
genai.configure(
    api_key="your-gemini-api-key",
    transport="rest",
    client_options={"api_endpoint": "http://localhost:3001/proxy/gemini"}
)

model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Hello!")

HTTP Request

POST http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:generateContent?key=YOUR_API_KEY
Content-Type: application/json

{
  "contents": [{
    "parts": [{
      "text": "Explain how AI works"
    }]
  }]
}

Claude Format Integration

Authentication Configuration

Fully compatible with Anthropic Claude API, supporting Messages API, tool usage, streaming responses and all advanced features.

Original Claude Request

curl https://api.anthropic.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude"}
    ]
  }'

Via GPT-Load Proxy

curl http://localhost:3001/proxy/claude/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude"}
    ]
  }'
Update API base address to GPT-Load proxy endpoint
Keep all headers and request format unchanged

Supported Endpoints

Core APIs

  • /v1/messages - Message conversations, including streaming ("stream": true) and tool use (tools parameter)
  • /v1/complete - Text completion (legacy)
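
Streaming on the Messages API is requested by adding "stream": true to the request body. A sketch of the payload (values mirror the curl examples above):

```python
import json

# Sketch: a /v1/messages request body with streaming enabled. The model
# name and max_tokens mirror the curl examples above.
payload = {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "stream": True,
    "messages": [{"role": "user", "content": "Hello!"}],
}
print(json.dumps(payload, indent=2))
```

With streaming enabled, the response arrives as server-sent events instead of a single JSON object, and GPT-Load forwards those events transparently.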

Model Management

  • /v1/models - Available model list
  • Supports full Claude-3 model series
  • Supports custom max_tokens limits
  • Supports system prompt configuration

SDK Configuration

Python SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="your-claude-api-key",
    base_url="http://localhost:3001/proxy/claude"
)

message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

HTTP Request

POST http://localhost:3001/proxy/claude/v1/messages
Content-Type: application/json
x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01

{
  "model": "claude-3-sonnet-20240229",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}

Group Management

Creating Groups

  1. Access the GPT-Load management interface
  2. Navigate to "Environment Management" -> "Group Settings"
  3. Click "Add Group" and fill in the group information
  4. Select the corresponding channel type (OpenAI/Gemini/Claude)
  5. Configure the upstream address and test path
  6. Add API keys and test the connection
  7. Save the configuration and enable the group

Configuration Points

  • Group name becomes part of the proxy path
  • Supports multiple API keys per group
  • Automatic key rotation and load balancing
  • Supports key health checks and failover
  • Can set request rate limits and quota management
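
The rotation-with-failover behavior described above can be modeled in a few lines. This is NOT GPT-Load's actual implementation, just a minimal sketch of the idea:

```python
# Minimal model of round-robin key rotation with failover: unhealthy keys
# are skipped, and requests cycle evenly over the healthy ones.
class KeyPool:
    def __init__(self, keys):
        self.keys = list(keys)
        self.unhealthy = set()
        self._i = 0

    def next_key(self):
        # Advance round-robin, skipping keys marked unhealthy.
        for _ in range(len(self.keys)):
            key = self.keys[self._i % len(self.keys)]
            self._i += 1
            if key not in self.unhealthy:
                return key
        raise RuntimeError("no healthy keys available")

    def mark_failed(self, key):
        self.unhealthy.add(key)

pool = KeyPool(["key-a", "key-b", "key-c"])
pool.mark_failed("key-b")  # simulate a failed health check
print([pool.next_key() for _ in range(4)])  # ['key-a', 'key-c', 'key-a', 'key-c']
```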

Migration Guide

Migration Steps

  1. Assess Current State: Analyze current AI services and API calling methods
  2. Deploy GPT-Load: Deploy the GPT-Load service following the quick start guide
  3. Update Configuration: Modify API base addresses in applications to point to GPT-Load

Seamless Migration

GPT-Load's design philosophy is complete transparency: migration requires no changes to business logic. Simply switch the API endpoint addresses to gain unified management and load balancing.
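
One common way to make that endpoint switch a pure configuration change is to read the base address from an environment variable. A sketch (the variable name OPENAI_BASE_URL is an example, not something GPT-Load requires):

```python
import os

# Sketch: choose the API base address from the environment so migrating
# to GPT-Load requires no code change, only configuration.
base_url = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")

# To migrate, set the variable before starting the application:
#   export OPENAI_BASE_URL=http://localhost:3001/proxy/openai
print(base_url)
```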

Summary

Transparent Proxy

  • Maintains native API format
  • No need to modify business code
  • Supports all functionality

Unified Management

  • Multi-service unified access
  • Centralized key management
  • Unified monitoring and alerting

Highly Scalable

  • Load balancing and failover
  • Horizontal scaling support
  • Enterprise-grade performance