Channel Types

GPT-Load supports multiple mainstream AI service providers, offering fully transparent proxy access that preserves each provider's native API format and behavior.

Supported Services

OpenAI

  • Chat Completions API
  • Embeddings API
  • Images API
  • Audio API
  • Files API
  • Models API

Google Gemini

  • Generate Content API
  • Streaming Support
  • Multi-modal Inputs
  • Safety Settings
  • Generation Config
  • Models Management

Anthropic Claude

  • Messages API
  • Streaming Responses
  • System Prompts
  • Tool Use
  • Token Counting
  • Models Access

Extensibility

The architecture is designed so that new AI service providers can be added quickly: a standardized adapter layer maps each provider onto the unified access model.

Proxy Format

Unified Proxy Endpoint

http://localhost:3001/proxy/{group-name}

Parameter Description

  • group-name: Group name created in the management interface
  • Supports arbitrary path suffixes with fully transparent forwarding
  • Maintains all functionality of the original API

Authentication

  • Use original service's API Key
  • Pass through Authorization: Bearer {token} header
  • Supports group-level key rotation and load balancing
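
The endpoint pattern above amounts to simple string assembly, which can be sketched as follows (illustrative only; the base address and group names are example values):

```python
# Build a GPT-Load proxy URL from the proxy base address, a group name,
# and the original provider path. Purely illustrative string construction.
def proxy_url(base: str, group: str, original_path: str) -> str:
    return f"{base.rstrip('/')}/proxy/{group}{original_path}"

url = proxy_url("http://localhost:3001", "openai", "/v1/chat/completions")
print(url)  # http://localhost:3001/proxy/openai/v1/chat/completions
```

The original Authorization header is passed through unchanged, so the same API key works against both addresses.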

OpenAI Format Integration

Authentication Configuration

GPT-Load is fully compatible with the OpenAI SDK; switching over only requires changing the base_url.

Original OpenAI Request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Via GPT-Load Proxy

curl http://localhost:3001/proxy/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
Only the API base address changes; all other code stays the same
All OpenAI SDK functionality is supported

Supported Endpoints

Core APIs

  • /v1/chat/completions - Chat completions
  • /v1/embeddings - Vector embeddings
  • /v1/images/generations - Image generation
  • /v1/audio/speech - Text-to-speech
  • /v1/audio/transcriptions - Speech-to-text

Other APIs

  • /v1/models - Model listing
  • /v1/files - File management
  • /v1/fine_tuning/jobs - Fine-tuning jobs
  • /v1/assistants - Assistants API
  • /v1/threads - Conversation threads

SDK Configuration

Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="http://localhost:3001/proxy/openai"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

Node.js SDK

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-openai-api-key',
  baseURL: 'http://localhost:3001/proxy/openai'
});

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});

Gemini Format Integration

Authentication Configuration

Fully compatible with Google Gemini API, supporting all native features including multi-modal inputs and streaming responses.

Original Gemini Request

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a story about a magic backpack."
      }]
    }]
  }'

Via GPT-Load Proxy

curl "http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [{
      "parts": [{
        "text": "Write a story about a magic backpack."
      }]
    }]
  }'
Replace the request base address with the GPT-Load proxy address
Keep all parameters and the authentication method unchanged

Supported Endpoints

Content Generation

  • /v1beta/models/*/generateContent - Content generation
  • /v1beta/models/*/streamGenerateContent - Streaming generation
  • /v1beta/models/*/countTokens - Token counting
  • /v1beta/models/*/embedContent - Vector embeddings
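
Unlike OpenAI's header-based authentication, Gemini passes the API key as a query parameter. Assembling a proxied generateContent URL can be sketched like this (the base address, group name, and key are placeholder values):

```python
from urllib.parse import urlencode

# Sketch: assembling a proxied Gemini generateContent URL. The API key
# travels as a "key" query parameter rather than an Authorization header.
base = "http://localhost:3001/proxy/gemini"
model = "gemini-pro"
url = f"{base}/v1beta/models/{model}:generateContent?" + urlencode({"key": "your-gemini-api-key"})
print(url)
```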

Model Management

  • /v1beta/models - Model listing
  • /v1beta/models/* - Model details
  • /v1beta/tunedModels - Create and list tuned models (fine-tuning)

SDK Configuration

Python SDK

import google.generativeai as genai

# Configure the API key and point the SDK at the proxy;
# transport="rest" makes the SDK use the custom HTTP endpoint
genai.configure(
    api_key="your-gemini-api-key",
    transport="rest",
    client_options={"api_endpoint": "http://localhost:3001/proxy/gemini"}
)

model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Hello!")

HTTP Request

POST http://localhost:3001/proxy/gemini/v1beta/models/gemini-pro:generateContent?key=YOUR_API_KEY
Content-Type: application/json

{
  "contents": [{
    "parts": [{
      "text": "Explain how AI works"
    }]
  }]
}

Claude Format Integration

Authentication Configuration

Fully compatible with Anthropic Claude API, supporting Messages API, tool usage, streaming responses and all advanced features.

Original Claude Request

curl https://api.anthropic.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude"}
    ]
  }'

Via GPT-Load Proxy

curl http://localhost:3001/proxy/claude/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello, Claude"}
    ]
  }'
Update API base address to GPT-Load proxy endpoint
Keep all headers and request format unchanged

Supported Endpoints

Core APIs

  • /v1/messages - Message conversations, including streaming ("stream": true) and tool use (tools parameter)
  • /v1/complete - Text completion (legacy)
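
Streaming on the Messages API is requested by adding "stream": true to the request body. A sketch of the payload (values mirror the curl examples above):

```python
import json

# Sketch: a /v1/messages request body with streaming enabled. The model
# name and max_tokens mirror the curl examples above.
payload = {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "stream": True,
    "messages": [{"role": "user", "content": "Hello!"}],
}
print(json.dumps(payload, indent=2))
```

With streaming enabled, the response arrives as server-sent events instead of a single JSON object, and GPT-Load forwards those events transparently.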

Model Management

  • /v1/models - Available model list
  • Supports full Claude-3 model series
  • Supports custom max_tokens limits
  • Supports system prompt configuration

SDK Configuration

Python SDK

from anthropic import Anthropic

client = Anthropic(
    api_key="your-claude-api-key",
    base_url="http://localhost:3001/proxy/claude"
)

message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

HTTP Request

POST http://localhost:3001/proxy/claude/v1/messages
Content-Type: application/json
x-api-key: YOUR_API_KEY
anthropic-version: 2023-06-01

{
  "model": "claude-3-sonnet-20240229",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}

Group Management

Creating Groups

  1. Access the GPT-Load management interface
  2. Navigate to "Environment Management" -> "Group Settings"
  3. Click "Add Group" and fill in the group information
  4. Select the corresponding channel type (OpenAI/Gemini/Claude)
  5. Configure the upstream address and test path
  6. Add API keys and test the connection
  7. Save the configuration and enable the group

Configuration Points

  • Group name becomes part of the proxy path
  • Supports multiple API keys per group
  • Automatic key rotation and load balancing
  • Supports key health checks and failover
  • Can set request rate limits and quota management
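
The rotation-with-failover behavior described above can be modeled in a few lines. This is NOT GPT-Load's actual implementation, just a minimal sketch of the idea:

```python
# Minimal model of round-robin key rotation with failover: unhealthy keys
# are skipped, and requests cycle evenly over the healthy ones.
class KeyPool:
    def __init__(self, keys):
        self.keys = list(keys)
        self.unhealthy = set()
        self._i = 0

    def next_key(self):
        # Advance round-robin, skipping keys marked unhealthy.
        for _ in range(len(self.keys)):
            key = self.keys[self._i % len(self.keys)]
            self._i += 1
            if key not in self.unhealthy:
                return key
        raise RuntimeError("no healthy keys available")

    def mark_failed(self, key):
        self.unhealthy.add(key)

pool = KeyPool(["key-a", "key-b", "key-c"])
pool.mark_failed("key-b")  # simulate a failed health check
print([pool.next_key() for _ in range(4)])  # ['key-a', 'key-c', 'key-a', 'key-c']
```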

Migration Guide

Migration Steps

  1. Assess Current State: Analyze current AI services and API calling methods
  2. Deploy GPT-Load: Deploy the GPT-Load service following the quick start guide
  3. Update Configuration: Modify API base addresses in applications to point to GPT-Load

Seamless Migration

GPT-Load's design philosophy is complete transparency: migration requires no changes to business logic. Simply switch the API endpoint addresses to gain unified management and load balancing.
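
One common way to make that endpoint switch a pure configuration change is to read the base address from an environment variable. A sketch (the variable name OPENAI_BASE_URL is an example, not something GPT-Load requires):

```python
import os

# Sketch: choose the API base address from the environment so migrating
# to GPT-Load requires no code change, only configuration.
base_url = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")

# To migrate, set the variable before starting the application:
#   export OPENAI_BASE_URL=http://localhost:3001/proxy/openai
print(base_url)
```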

Summary

Transparent Proxy

  • Maintains native API format
  • No need to modify business code
  • Supports all functionality

Unified Management

  • Multi-service unified access
  • Centralized key management
  • Unified monitoring and alerting

Highly Scalable

  • Load balancing and failover
  • Horizontal scaling support
  • Enterprise-grade performance