Aggregate Groups
Multi-key pool integration and intelligent load balancing solution
Overview
What are Aggregate Groups?
Aggregate Groups are an advanced grouping type provided by GPT-Load, allowing you to combine multiple standard groups into a logical group. Through aggregate groups, you can unify management of multiple key pools and achieve intelligent load balancing and high availability.
Unified multi-key pool management
Integrate different key pools into a single access point
Intelligent load balancing
Automatically distribute request traffic to sub-groups based on weight
Improved availability
Automatically switch to other sub-groups when one has no available keys
Flexible traffic control
Dynamically adjust traffic distribution among sub-groups
Use Cases
Multi-key pool load balancing
Aggregate key pools from multiple standard groups for balanced traffic distribution
Resource isolation and aggregation
Independent management of departmental key pools, unified service through aggregate groups
Elastic scaling
Quickly add new key pools to existing aggregate groups without modifying client configuration
Canary deployment
Control traffic percentage to new key pools through weight adjustment for gradual migration
High availability architecture
Automatically switch to backup key pools when primary pools are exhausted
Core Concepts
Aggregate Group
A special group type that holds no keys itself. Instead, it references standard groups and provides unified management and weighted load balancing across their key pools.
Sub-group
Standard groups referenced by aggregate groups that actually store and manage API keys. Each sub-group can be independently configured and maintained.
Weight
Determines the share of traffic a sub-group receives in load balancing, ranging from 0 to 1000. A higher weight means more traffic; a weight of 0 temporarily disables the sub-group.
Status
A sub-group's operational status is one of: valid (weight > 0 and has available keys), disabled (weight = 0), or invalid (weight > 0 but no available keys).
State Management
| Weight | Has Keys | Status | Description |
|---|---|---|---|
| > 0 | Yes | Valid | Participates normally in load balancing |
| = 0 | - | Disabled | Does not participate in load balancing (temporarily disabled) |
| > 0 | No | Invalid | Automatically skipped when selected |
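The table above can be read as a small decision function. The sketch below is illustrative only: the names `SubGroupStatus` and `sub_group_status` are hypothetical and are not taken from GPT-Load's code.

```python
from enum import Enum


class SubGroupStatus(Enum):
    VALID = "valid"
    DISABLED = "disabled"
    INVALID = "invalid"


def sub_group_status(weight: int, has_available_keys: bool) -> SubGroupStatus:
    """Derive a sub-group's status from its weight and key availability."""
    if not 0 <= weight <= 1000:
        raise ValueError("weight must be between 0 and 1000")
    if weight == 0:
        # A weight of 0 disables the sub-group whether or not it has keys.
        return SubGroupStatus.DISABLED
    if has_available_keys:
        return SubGroupStatus.VALID
    # Weight > 0 but no usable keys: skipped when selected.
    return SubGroupStatus.INVALID
```

Note that the weight check comes first, which matches the "-" in the Has Keys column for disabled sub-groups.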
Design Principles
Load Balancing Algorithm
GPT-Load uses the Smooth Weighted Round-Robin algorithm for sub-group selection. Its main properties are:
Smooth distribution: Selections are interleaved across sub-groups rather than sent in bursts to one sub-group
Accurate weighting: Traffic follows the configured weight ratios
High performance: O(n) time per selection, where n is the number of sub-groups, which suits high-concurrency scenarios
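For readers unfamiliar with the algorithm, here is a minimal sketch of the textbook smooth weighted round-robin step. It is not GPT-Load's actual implementation; the `SubGroup` class, its `current` counter, and `pick_next` are names assumed for illustration.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass
class SubGroup:
    name: str
    weight: int       # configured weight (0-1000)
    current: int = 0  # running counter maintained by the algorithm


def pick_next(sub_groups: list[SubGroup]) -> SubGroup | None:
    """Perform one selection step; O(n) in the number of sub-groups."""
    active = [g for g in sub_groups if g.weight > 0]  # weight 0 = disabled
    if not active:
        return None
    total = sum(g.weight for g in active)
    # Every active sub-group gains its own weight...
    for g in active:
        g.current += g.weight
    # ...the one with the highest counter wins and pays back the total.
    best = max(active, key=lambda g: g.current)
    best.current -= total
    return best


groups = [SubGroup("A", 500), SubGroup("B", 300), SubGroup("C", 200)]
counts = Counter(pick_next(groups).name for _ in range(10))
print(counts)
```

With weights 500/300/200, ten consecutive picks select A five times, B three times, and C twice, and the picks for A are spread across the cycle instead of arriving back to back.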
Sub-group Selection Logic
Calculate candidate group
Calculate next candidate sub-group based on weighted round-robin algorithm
Check availability
Check if this sub-group has available keys
Select or skip
Use sub-group if keys available, otherwise mark as tried and continue to next candidate
Failure handling
Return error if all sub-groups have no available keys
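The four steps above can be sketched as a loop that keeps asking the weighted round-robin for a candidate until one has keys. This is a sketch under assumptions: `SubGroup`, `available_keys`, `select_sub_group`, and `NoAvailableSubGroups` are hypothetical names, and the way GPT-Load actually tracks tried sub-groups may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


class NoAvailableSubGroups(Exception):
    """Raised when every sub-group is disabled or out of keys."""


@dataclass
class SubGroup:
    name: str
    weight: int          # 0 disables the sub-group
    available_keys: int  # number of currently usable keys (assumed field)
    current: int = 0     # smooth weighted round-robin counter


def select_sub_group(sub_groups: list[SubGroup]) -> SubGroup:
    tried: set[str] = set()
    while True:
        # Step 1: compute the next candidate among untried, enabled sub-groups.
        candidates = [g for g in sub_groups if g.weight > 0 and g.name not in tried]
        if not candidates:
            # Step 4: nothing left to try, so fail fast.
            raise NoAvailableSubGroups("No available sub-groups")
        total = sum(g.weight for g in candidates)
        for g in candidates:
            g.current += g.weight
        candidate = max(candidates, key=lambda g: g.current)
        candidate.current -= total
        # Step 2: check availability.
        if candidate.available_keys > 0:
            return candidate  # Step 3: select.
        tried.add(candidate.name)  # Step 3: skip and move to the next candidate.
```

Each pass either returns a sub-group or adds one name to `tried`, so the loop runs at most once per sub-group before it returns or raises.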
Key Features
- Smart skip: Automatically skip sub-groups without available keys
- Fast fail: Immediately return error after exhausting all sub-groups
- Stateless: Each request makes independent decisions without external state dependency
Usage Rules
Sub-group Restrictions
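No nesting: Sub-groups must be standard groups; an aggregate group cannot reference another aggregate group
Consistent channel type: All sub-groups in an aggregate group must use the same channel type
Consistent validation endpoint: All sub-groups in an aggregate group must use the same validation endpoint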
Weight Configuration Rules
Value range
0-1000
Special value 0
Disable sub-group (does not participate in load balancing)
Percentage calculation
Sub-group percentage = Sub-group weight / Sum of all sub-group weights × 100%
Recommended practice
Use round hundred values (e.g., 100, 200, 500) for easy understanding and maintenance
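The percentage rule above is simple arithmetic. The helper below, with the hypothetical name `traffic_percentages`, is one way to sanity-check a planned weight configuration before applying it.

```python
def traffic_percentages(weights: dict[str, int]) -> dict[str, float]:
    """Return each sub-group's expected share of traffic, in percent."""
    for name, weight in weights.items():
        if not 0 <= weight <= 1000:
            raise ValueError(f"{name}: weight must be between 0 and 1000")
    total = sum(weights.values())
    if total == 0:
        # Every sub-group is disabled, so nothing receives traffic.
        return {name: 0.0 for name in weights}
    # Sub-group percentage = weight / sum of all weights x 100%
    return {name: weight * 100 / total for name, weight in weights.items()}


print(traffic_percentages({"A": 500, "B": 300, "C": 200}))
print(traffic_percentages({"Old": 980, "New": 20}))
```

For weights 500/300/200 the shares are 50%, 30%, and 20%; for 980/20 they are 98% and 2%. These are expected shares only: they assume every sub-group with a non-zero weight has available keys.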
Best Practices
Equal Distribution
Multiple key pools with similar capacity, aiming for even traffic distribution
Sub-group A: 100
Sub-group B: 100
Sub-group C: 100
Benefits
- ✓ Simple and intuitive, balanced traffic
- ✓ Suitable for multiple equally important key pools
Capacity-based Distribution
Key pools with significantly different capacities, distribute by capacity ratio
Sub-group A (50 keys): 500  # 50%
Sub-group B (30 keys): 300  # 30%
Sub-group C (20 keys): 200  # 20%
Benefits
- ✓ Balanced key utilization
- ✓ Avoid overloading smaller pools
Primary-Backup Mode
Prioritize primary key pool, backup pool as insurance
Sub-group A (primary): 900  # 90%
Sub-group B (backup): 100   # 10%
Benefits
- ✓ Prioritize consuming primary pool
- ✓ Backup pool as buffer
Canary Deployment
Test new key pool stability, gradually switch traffic
Stage 1: Old(980) / New(20)
Stage 2: Old(800) / New(200)
Stage 3: Old(200) / New(800)
Benefits
- ✓ Controllable risk
- ✓ Limited impact scope for issues
Integration with Model Redirect
When aggregate groups contain sub-groups of the same channel type but connecting to different service providers, configure model redirect in each sub-group to enable unified model name access from clients.
Use Case
Aggregate group contains three OpenAI-format sub-groups connecting to different providers, with clients using unified gpt-4 model name:
Sub-group A (OpenAI Official): {"gpt-4": "gpt-4-turbo"}
Sub-group B (Azure OpenAI): {"gpt-4": "gpt-35-turbo"}
Sub-group C (OpenRouter): {"gpt-4": "openai/gpt-4-turbo"}
Clients only need to request gpt-4. The aggregate group distributes traffic based on weights, and each sub-group automatically maps the name to the model it actually supports.
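Conceptually, each sub-group's redirect rule is a lookup from the client-facing model name to the provider-specific one. The sketch below only illustrates that idea: `MODEL_REDIRECTS` and `resolve_model` are hypothetical names, and passing unmapped names through unchanged is an assumption rather than documented GPT-Load behavior.

```python
# Redirect rules per sub-group, copied from the use case above.
MODEL_REDIRECTS: dict[str, dict[str, str]] = {
    "Sub-group A": {"gpt-4": "gpt-4-turbo"},         # OpenAI Official
    "Sub-group B": {"gpt-4": "gpt-35-turbo"},        # Azure OpenAI
    "Sub-group C": {"gpt-4": "openai/gpt-4-turbo"},  # OpenRouter
}


def resolve_model(sub_group: str, requested_model: str) -> str:
    """Map the client's model name to the one the chosen sub-group uses."""
    redirects = MODEL_REDIRECTS.get(sub_group, {})
    # Assumption: a model name without a redirect rule is forwarded as-is.
    return redirects.get(requested_model, requested_model)


for name in MODEL_REDIRECTS:
    print(name, "->", resolve_model(name, "gpt-4"))
```

The client always sends `gpt-4`; which upstream model name is used depends only on the sub-group that load balancing selected for that request.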
Troubleshooting
Request returns "No available sub-groups"
Cause
No sub-group has available keys, every sub-group is disabled (weight = 0), or the aggregate group has no sub-groups.
Solution
1. Check the sub-group list and view each sub-group's status
2. Open each "invalid" sub-group, then validate and replenish its keys
3. Restore the weight of disabled sub-groups
4. Ensure at least one sub-group is in "valid" status
Traffic distribution doesn't match weight configuration
Cause
Sub-groups without available keys are being skipped automatically, the weight configuration was changed recently and needs a longer observation period, or there are too few requests for the distribution to be statistically meaningful.
Solution
1. Check if all sub-groups have available keys
2. Analyze traffic distribution over a longer period (e.g., 1 hour) in the log page
3. Confirm the weight configuration is saved and effective
Cannot modify standard group's channel type
Cause
The standard group is referenced as a sub-group by one or more aggregate groups.
Solution
1. Check the "Referenced by aggregate groups" list in the standard group's details page
2. Remove the standard group from each aggregate group in the list
3. Modify the channel type
4. Re-add it to the aggregate groups if needed