Aggregate Groups

Multi-key pool integration and intelligent load balancing solution

Overview

What are Aggregate Groups?

Aggregate Groups are an advanced grouping type provided by GPT-Load, allowing you to combine multiple standard groups into a logical group. Through aggregate groups, you can unify management of multiple key pools and achieve intelligent load balancing and high availability.

Unified multi-key pool management

Integrate different key pools into a single access point

Intelligent load balancing

Automatically distribute request traffic to sub-groups based on weight

Improved availability

Automatically switch to other sub-groups when one has no available keys

Flexible traffic control

Dynamically adjust traffic distribution among sub-groups

Use Cases

1. Multi-key pool load balancing

Aggregate key pools from multiple standard groups for balanced traffic distribution

2. Resource isolation and aggregation

Independent management of departmental key pools, unified service through aggregate groups

3. Elastic scaling

Quickly add new key pools to existing aggregate groups without modifying client configuration

4. Canary deployment

Control traffic percentage to new key pools through weight adjustment for gradual migration

5. High availability architecture

Automatically switch to backup key pools when primary pools are exhausted

Core Concepts

Aggregate Group

A special group type that contains no keys itself. Instead, it references standard groups and provides unified management and load balancing across their key pools.

Sub-group

Standard groups referenced by aggregate groups that actually store and manage API keys. Each sub-group can be independently configured and maintained.

Weight

Determines a sub-group's share of traffic in load balancing. Valid values range from 0 to 1000; a higher weight means more traffic, and a weight of 0 temporarily disables the sub-group.

Status

Sub-group operational status includes valid (weight > 0 and has available keys), disabled (weight = 0), and invalid (weight > 0 but no available keys).

State Management

Weight   Has Keys   Status     Description
> 0      Yes        Valid      Participates normally in load balancing
= 0      -          Disabled   Does not participate in load balancing (temporarily disabled)
> 0      No         Invalid    Automatically skipped when selected
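
The state rules above can be written as a small classification function. This is a minimal sketch for illustration only; the names SubGroupStatus and sub_group_status are hypothetical and are not taken from GPT-Load's code.

```python
from enum import Enum


class SubGroupStatus(Enum):
    VALID = "valid"
    DISABLED = "disabled"
    INVALID = "invalid"


def sub_group_status(weight: int, has_available_keys: bool) -> SubGroupStatus:
    """Classify a sub-group from its weight and key availability."""
    if not 0 <= weight <= 1000:
        raise ValueError("weight must be between 0 and 1000")
    if weight == 0:
        # A weight of 0 disables the sub-group whether or not it has keys.
        return SubGroupStatus.DISABLED
    if has_available_keys:
        return SubGroupStatus.VALID
    # Weight > 0 but nothing usable: skipped when selected.
    return SubGroupStatus.INVALID
```

Checking the weight first mirrors the table: the "Has Keys" column is irrelevant once the weight is 0.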

Design Principles

Load Balancing Algorithm

GPT-Load uses the Smooth Weighted Round-Robin algorithm for sub-group selection. Its characteristics:

Smooth distribution: More evenly distributed traffic, avoiding burst traffic

Accurate weighting: Strictly follows configured weight ratios for traffic distribution

High performance: O(n) time complexity per selection, where n is the number of sub-groups; suitable for high-concurrency scenarios
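
To make the algorithm concrete, here is a minimal sketch of the commonly used formulation of smooth weighted round-robin (the one popularized by nginx). It is an assumption that GPT-Load's internals resemble this; the SubGroup class and pick function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubGroup:
    name: str
    weight: int       # configured weight, 0-1000
    current: int = 0  # running counter maintained by the algorithm


def pick(sub_groups: list[SubGroup]) -> Optional[SubGroup]:
    """Run one selection step; O(n) in the number of sub-groups."""
    # Sub-groups with weight 0 are disabled and never considered.
    active = [g for g in sub_groups if g.weight > 0]
    if not active:
        return None
    total = sum(g.weight for g in active)
    # Every active sub-group gains its own weight...
    for g in active:
        g.current += g.weight
    # ...the one with the largest counter wins (first one on ties)...
    best = max(active, key=lambda g: g.current)
    # ...and the winner is pushed back by the total weight.
    best.current -= total
    return best
```

With weights A=500, B=300, C=200, ten consecutive calls to pick select A five times, B three times, and C twice, and the picks for each sub-group are interleaved rather than grouped together.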

Sub-group Selection Logic

1. Calculate candidate group

Calculate next candidate sub-group based on weighted round-robin algorithm

2. Check availability

Check if this sub-group has available keys

3. Select or skip

Use sub-group if keys available, otherwise mark as tried and continue to next candidate

4. Failure handling

Return error if all sub-groups have no available keys

Key Features

  • Smart skip: Automatically skip sub-groups without available keys
  • Fast fail: Immediately return error after exhausting all sub-groups
  • Stateless: Each request makes independent decisions without external state dependency
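
The selection steps and the skip and fast-fail behavior can be sketched as follows. This is an illustrative outline, not GPT-Load's actual code: the function name, the exception class, and the idea of passing the candidate order as a finite sequence are all assumptions made to keep the example self-contained.

```python
from typing import Callable, Iterable


class NoAvailableSubGroupError(Exception):
    """Raised when every candidate sub-group has been tried without success."""


def select_sub_group(
    candidates: Iterable[str],
    has_available_keys: Callable[[str], bool],
) -> str:
    """Return the first candidate with available keys, skipping the rest."""
    tried: set[str] = set()
    for name in candidates:      # step 1: next candidate from the balancer
        if name in tried:
            continue             # already known to have no keys
        if has_available_keys(name):  # step 2: check availability
            return name          # step 3: select
        tried.add(name)          # step 3: mark as tried and move on
    # Step 4: all candidates exhausted, fail immediately.
    raise NoAvailableSubGroupError("No available sub-groups")
```

Here candidates stands for the order produced by the weighted round-robin for one request, and has_available_keys is a placeholder for whatever availability check the sub-group exposes.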

Usage Rules

Sub-group Restrictions

No nesting

Allowed: Aggregate group → Standard group
Not allowed: Aggregate group → Aggregate group

Consistent channel type

Allowed: All sub-groups are OpenAI
Not allowed: Mixed OpenAI and Gemini sub-groups

Consistent validation endpoint

Allowed: All sub-groups use /v1/chat/completions
Not allowed: Sub-group A uses /v1/chat/completions, sub-group B uses /v1/completions
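
As a rough illustration, the three restrictions could be checked like this before sub-groups are accepted. The Group fields and the "standard" / "aggregate" labels are hypothetical and chosen for the example; they are not GPT-Load's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    name: str
    group_type: str           # "standard" or "aggregate" (assumed labels)
    channel_type: str         # e.g. "openai", "gemini"
    validation_endpoint: str  # e.g. "/v1/chat/completions"


def check_sub_groups(sub_groups: list[Group]) -> list[str]:
    """Return rule violations; an empty list means the set is acceptable."""
    errors = []
    # Rule 1: no nesting, only standard groups may be sub-groups.
    for g in sub_groups:
        if g.group_type != "standard":
            errors.append(f"{g.name}: aggregate groups cannot be nested")
    # Rule 2: all sub-groups share one channel type.
    if len({g.channel_type for g in sub_groups}) > 1:
        errors.append("sub-groups must share one channel type")
    # Rule 3: all sub-groups share one validation endpoint.
    if len({g.validation_endpoint for g in sub_groups}) > 1:
        errors.append("sub-groups must share one validation endpoint")
    return errors
```

Returning every violation at once, rather than stopping at the first, makes it easier to fix a configuration in a single pass.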

Weight Configuration Rules

Value range

0-1000

Special value 0

Disable sub-group (does not participate in load balancing)

Percentage calculation

Sub-group percentage = Sub-group weight / Sum of all sub-group weights × 100%

Recommended practice

Use round hundred values (e.g., 100, 200, 500) for easy understanding and maintenance
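
The percentage formula above is easy to check with a few lines of code. The sketch below is illustrative; traffic_percentages is a hypothetical helper, not part of GPT-Load.

```python
def traffic_percentages(weights: dict[str, int]) -> dict[str, float]:
    """Compute each sub-group's expected share of traffic in percent."""
    for name, weight in weights.items():
        if not 0 <= weight <= 1000:
            raise ValueError(f"{name}: weight must be between 0 and 1000")
    total = sum(weights.values())
    if total == 0:
        # Every sub-group is disabled, so nothing receives traffic.
        return {name: 0.0 for name in weights}
    # Sub-group weight / sum of all sub-group weights x 100%
    return {name: weight * 100 / total for name, weight in weights.items()}
```

For example, weights of 500, 300, and 200 give 50%, 30%, and 20%, and a split of 980 and 20 gives 98% and 2%. These figures assume every sub-group with a non-zero weight has available keys.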

Best Practices

Equal Distribution

Multiple key pools with similar capacity, aiming for even traffic distribution

Sub-group A: 100
Sub-group B: 100
Sub-group C: 100

Benefits

  • Simple and intuitive, balanced traffic
  • Suitable for multiple equally important key pools

Capacity-based Distribution

Key pools with significantly different capacities, distribute by capacity ratio

Sub-group A (50 keys): 500  # 50%
Sub-group B (30 keys): 300  # 30%
Sub-group C (20 keys): 200  # 20%

Benefits

  • Balanced key utilization
  • Avoid overloading smaller pools

Primary-Backup Mode

Prioritize primary key pool, backup pool as insurance

Sub-group A (primary): 900  # 90%
Sub-group B (backup): 100  # 10%

Benefits

  • Prioritize consuming primary pool
  • Backup pool as buffer

Canary Deployment

Test new key pool stability, gradually switch traffic

Stage 1: Old(980) / New(20)
Stage 2: Old(800) / New(200)
Stage 3: Old(200) / New(800)

Benefits

  • Controllable risk
  • Limited impact scope for issues

Integration with Model Redirect

When an aggregate group contains sub-groups of the same channel type that connect to different service providers, configure model redirect in each sub-group so that clients can use a single, unified model name.

Use Case

An aggregate group contains three OpenAI-format sub-groups that connect to different providers, and clients use the unified model name gpt-4:

Sub-group A (OpenAI Official): {"gpt-4": "gpt-4-turbo"}
Sub-group B (Azure OpenAI): {"gpt-4": "gpt-35-turbo"}
Sub-group C (OpenRouter): {"gpt-4": "openai/gpt-4-turbo"}

Clients only need to request gpt-4. The aggregate group distributes traffic according to the weights, and each sub-group automatically maps the name to the model it actually supports.
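
Conceptually, the per-sub-group mapping behaves like a dictionary lookup. The sketch below only illustrates the idea: resolve_model and SUB_GROUP_REDIRECTS are hypothetical names, and passing unmapped model names through unchanged is an assumption, not documented behavior.

```python
def resolve_model(requested: str, redirects: dict[str, str]) -> str:
    """Map the client's model name to the name the sub-group sends upstream."""
    # Assumption: a name without a redirect rule is passed through unchanged.
    return redirects.get(requested, requested)


# Redirect rules from the use case above, keyed by sub-group.
SUB_GROUP_REDIRECTS = {
    "A": {"gpt-4": "gpt-4-turbo"},         # OpenAI Official
    "B": {"gpt-4": "gpt-35-turbo"},        # Azure OpenAI
    "C": {"gpt-4": "openai/gpt-4-turbo"},  # OpenRouter
}
```

Whichever sub-group the load balancer selects, the client's request body still says gpt-4; only the selected sub-group's rules decide the upstream model name.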

Troubleshooting

Request returns "No available sub-groups"

Cause

One of the following: no sub-group has available keys, every sub-group is disabled (weight = 0), or the aggregate group has no sub-groups added.

Solution

  1. Check sub-group list and view status
  2. Enter "invalid" sub-groups, validate and replenish keys
  3. Restore weight for disabled sub-groups
  4. Ensure at least one sub-group is in "valid" status

Traffic distribution doesn't match weight configuration

Cause

Some sub-groups have no available keys and are skipped automatically, the weight configuration was changed recently and needs a longer observation period, or the request volume is too low to be statistically significant.

Solution

  1. Check if all sub-groups have available keys
  2. Analyze traffic distribution over longer period (e.g., 1 hour) in log page
  3. Confirm weight configuration is saved and effective

Cannot modify standard group's channel type

Cause

Standard group is referenced as sub-group by one or more aggregate groups

Solution

  1. Check "Referenced by aggregate groups" list in standard group details page
  2. Remove sub-group from each aggregate group in the list
  3. Modify channel type
  4. Re-add to aggregate groups if needed