Deployment
Components for auto-scaling and for deployment strategies: rolling deploys, canary releases, and metric evaluators.
AutoScaler
AutoScaler(
name: str,
load_balancer: Entity,
server_factory: Callable[[str], Entity],
policy: ScalingPolicy | None = None,
min_instances: int = 1,
max_instances: int = 10,
evaluation_interval: float = 10.0,
scale_out_cooldown: float = 30.0,
scale_in_cooldown: float = 60.0,
)
Bases: Entity
Auto-scaling controller for load balancer backends.
Periodically evaluates backend utilization and adds/removes instances to maintain desired performance. Supports cooldown periods to prevent oscillation.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | | Scaler identifier. |
| `stats` | `AutoScalerStats` | Frozen statistics snapshot. |
| `scaling_history` | `list[ScalingEvent]` | List of scaling events. |
Initialize the auto scaler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Scaler identifier. | required |
| `load_balancer` | `Entity` | LoadBalancer to manage. | required |
| `server_factory` | `Callable[[str], Entity]` | Callable that creates new server instances given a name. | required |
| `policy` | `ScalingPolicy \| None` | Scaling decision policy (default TargetUtilization(0.7)). | `None` |
| `min_instances` | `int` | Minimum backend count. | `1` |
| `max_instances` | `int` | Maximum backend count. | `10` |
| `evaluation_interval` | `float` | Seconds between evaluations. | `10.0` |
| `scale_out_cooldown` | `float` | Seconds after scale-out before next action. | `30.0` |
| `scale_in_cooldown` | `float` | Seconds after scale-in before next action. | `60.0` |
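The interaction between the two cooldowns can be sketched as follows. This is a minimal, self-contained illustration rather than the library's implementation: `CooldownGate` and `try_act` are hypothetical names, and the sketch assumes that either cooldown blocks any subsequent action (out or in), following the "before next action" wording in the table.

```python
from dataclasses import dataclass


@dataclass
class CooldownGate:
    """Hypothetical helper: tracks when the next scaling action is allowed."""

    scale_out_cooldown: float = 30.0
    scale_in_cooldown: float = 60.0
    blocked_until: float = 0.0  # simulation time, in seconds
    cooldown_blocks: int = 0

    def try_act(self, now: float, delta: int) -> bool:
        """Return True if a change of `delta` instances may proceed at `now`."""
        if delta == 0:
            return False  # nothing to do
        if now < self.blocked_until:
            self.cooldown_blocks += 1  # assumed meaning of the cooldown_blocks stat
            return False
        cooldown = self.scale_out_cooldown if delta > 0 else self.scale_in_cooldown
        self.blocked_until = now + cooldown
        return True


gate = CooldownGate()
results = [
    gate.try_act(now=0.0, delta=+2),   # scale out, allowed
    gate.try_act(now=10.0, delta=-1),  # inside the 30 s scale-out cooldown
    gate.try_act(now=30.0, delta=-1),  # cooldown elapsed, scale in allowed
    gate.try_act(now=60.0, delta=+1),  # inside the 60 s scale-in cooldown
]
print(results, gate.cooldown_blocks)
```

A longer scale-in cooldown than scale-out cooldown, as in the defaults, makes the scaler quick to add capacity and slow to remove it.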
AutoScalerStats dataclass
AutoScalerStats(
evaluations: int = 0,
scale_out_count: int = 0,
scale_in_count: int = 0,
instances_added: int = 0,
instances_removed: int = 0,
cooldown_blocks: int = 0,
)
Statistics tracked by AutoScaler.
QueueDepthScaling
Scale based on aggregate queue depth across backends.
ScalingEvent dataclass
Record of a scaling action.
ScalingPolicy
Bases: Protocol
Protocol for scaling decision algorithms.
evaluate
evaluate(
backends: list[Entity],
current_count: int,
min_instances: int,
max_instances: int,
) -> int
Evaluate and return desired instance count.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `backends` | `list[Entity]` | Current backend entities. | required |
| `current_count` | `int` | Current number of instances. | required |
| `min_instances` | `int` | Minimum allowed instances. | required |
| `max_instances` | `int` | Maximum allowed instances. | required |
Returns:
| Type | Description |
|---|---|
| `int` | Desired instance count. |
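Because ScalingPolicy is a Protocol, any object with a matching `evaluate` method should satisfy it. The sketch below shows the shape of a custom policy under that assumption. It redeclares the protocol locally so that it runs on its own; `FixedCount` is a hypothetical policy, and whether clamping to the bounds is the policy's job or the scaler's is not stated here, so the sketch clamps defensively.

```python
from typing import Any, Protocol


class ScalingPolicy(Protocol):
    """Local stand-in mirroring the documented protocol signature."""

    def evaluate(
        self,
        backends: list[Any],
        current_count: int,
        min_instances: int,
        max_instances: int,
    ) -> int: ...


class FixedCount:
    """Hypothetical policy: always ask for the same count, clamped to bounds."""

    def __init__(self, desired: int) -> None:
        self.desired = desired

    def evaluate(self, backends, current_count, min_instances, max_instances) -> int:
        return max(min_instances, min(self.desired, max_instances))


# Structural typing: FixedCount never inherits from ScalingPolicy.
policy: ScalingPolicy = FixedCount(desired=12)
clamped_high = policy.evaluate([], current_count=3, min_instances=1, max_instances=10)
clamped_low = FixedCount(0).evaluate([], current_count=3, min_instances=1, max_instances=10)
print(clamped_high, clamped_low)
```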
StepScaling
Step-based scaling with utilization thresholds.
Each step defines a utilization threshold and the adjustment to make. Steps are evaluated from highest threshold to lowest.
Initialize step scaling.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `steps` | `list[tuple[float, int]]` | List of (utilization_threshold, adjustment) tuples. Sorted by threshold descending internally. | required |
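The highest-threshold-first evaluation can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the library's code: `step_scaling_desired` is a hypothetical name, a step is assumed to match when utilization is greater than or equal to its threshold, the adjustment is assumed to be added to the current count, and the result is clamped to the instance bounds.

```python
def step_scaling_desired(
    steps: list[tuple[float, int]],
    utilization: float,
    current_count: int,
    min_instances: int,
    max_instances: int,
) -> int:
    """Hypothetical sketch of step evaluation; not the library's code."""
    # Highest threshold first, so the most severe matching step wins.
    for threshold, adjustment in sorted(steps, key=lambda s: s[0], reverse=True):
        if utilization >= threshold:  # assumption: inclusive comparison
            desired = current_count + adjustment
            return max(min_instances, min(desired, max_instances))
    return current_count  # no step matched: hold steady


steps = [(0.7, 1), (0.9, 3)]  # deliberately unsorted; sorted inside the function
high = step_scaling_desired(steps, 0.95, current_count=4, min_instances=1, max_instances=10)
mid = step_scaling_desired(steps, 0.75, current_count=4, min_instances=1, max_instances=10)
low = step_scaling_desired(steps, 0.40, current_count=4, min_instances=1, max_instances=10)
print(high, mid, low)
```

Evaluating from the top down means a utilization of 0.95 triggers only the +3 step, even though it also exceeds the 0.7 threshold.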
TargetUtilization
Scale to keep average utilization near a target.
Calculates average utilization across backends and scales to bring utilization close to the target value.
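One common way to "scale toward a target" is a proportional rule of the kind used by the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current_count * average / target). The sketch below applies that rule. It is an assumption that TargetUtilization behaves this way; the function name and the handling of an empty backend list are hypothetical.

```python
import math


def target_utilization_desired(
    utilizations: list[float],
    target: float,
    min_instances: int,
    max_instances: int,
) -> int:
    """Hypothetical proportional rule; the library's exact formula may differ."""
    current_count = len(utilizations)
    if current_count == 0:
        return min_instances
    average = sum(utilizations) / current_count
    # Total load is roughly current_count * average; divide by the target
    # to find how many instances would run at the target utilization.
    desired = math.ceil(current_count * average / target)
    return max(min_instances, min(desired, max_instances))


overloaded = target_utilization_desired([0.9, 0.9, 0.9, 0.9], 0.7, 1, 10)
idle = target_utilization_desired([0.2, 0.2, 0.2, 0.2], 0.7, 1, 10)
print(overloaded, idle)
```

With four backends at 90% and a 0.7 target, the rule asks for six instances; with four backends at 20%, it asks for two.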
CanaryDeployer
CanaryDeployer(
name: str,
load_balancer: Entity,
server_factory: Callable[[str], Entity],
stages: list[CanaryStage] | None = None,
metric_evaluator: MetricEvaluator | None = None,
evaluation_interval: float = 5.0,
)
Bases: Entity
Canary deployment with progressive traffic shifting.
Creates a canary instance and progressively shifts traffic through stages while monitoring metrics. Rolls back if metrics degrade.
Default stages: [1%, 5%, 25%, 100%].
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | | Deployer identifier. |
| `stats` | `CanaryDeployerStats` | Frozen statistics snapshot. |
| `state` | | Current deployment state. |
Initialize the canary deployer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Deployer identifier. | required |
| `load_balancer` | `Entity` | LoadBalancer to manage. | required |
| `server_factory` | `Callable[[str], Entity]` | Creates new server instances. | required |
| `stages` | `list[CanaryStage] \| None` | Traffic stages (default [1%, 5%, 25%, 100%]). | `None` |
| `metric_evaluator` | `MetricEvaluator \| None` | Health evaluator (default ErrorRateEvaluator). | `None` |
| `evaluation_interval` | `float` | Seconds between metric evaluations. | `5.0` |
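The stage-by-stage progression with rollback can be sketched without the simulator. Everything here is illustrative: `Stage` is a stand-in that reuses the two documented CanaryStage fields, `run_canary` and the status strings are hypothetical, the 30-second evaluation periods are made up, and the sketch evaluates each stage once instead of repeatedly over the evaluation period.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    """Stand-in for CanaryStage, using the two documented fields."""

    traffic_percentage: float  # fraction of traffic, 0.0-1.0
    evaluation_period: float   # seconds to observe before advancing


def run_canary(stages: list[Stage], is_healthy: Callable[[Stage], bool]) -> tuple[str, float]:
    """Hypothetical walk through the stages; returns (status, final canary fraction)."""
    for stage in stages:
        # A real deployer would shift traffic here and observe for
        # stage.evaluation_period seconds; this sketch evaluates once.
        if not is_healthy(stage):
            return ("rolled_back", 0.0)  # all traffic returns to the baseline
    return ("completed", stages[-1].traffic_percentage if stages else 0.0)


# The documented default traffic split; the 30 s periods are made up here.
default_stages = [Stage(p, 30.0) for p in (0.01, 0.05, 0.25, 1.0)]

ok = run_canary(default_stages, lambda stage: True)
# Pretend the canary degrades once it receives a quarter of the traffic.
bad = run_canary(default_stages, lambda stage: stage.traffic_percentage < 0.25)
print(ok, bad)
```

Starting at 1% limits the number of requests exposed to a bad release before the first evaluation can fail it.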
CanaryDeployerStats dataclass
CanaryDeployerStats(
deployments_started: int = 0,
deployments_completed: int = 0,
deployments_rolled_back: int = 0,
stages_completed: int = 0,
evaluations_performed: int = 0,
evaluations_passed: int = 0,
evaluations_failed: int = 0,
)
Statistics tracked by CanaryDeployer.
CanaryStage dataclass
Definition of a canary traffic stage.
Attributes:
| Name | Type | Description |
|---|---|---|
| `traffic_percentage` | `float` | Fraction of traffic to send to canary (0.0-1.0). |
| `evaluation_period` | `float` | Seconds to observe at this stage before advancing. |
CanaryState dataclass
CanaryState(
status: str = "idle",
current_stage: int = 0,
total_stages: int = 0,
canary_traffic_pct: float = 0.0,
)
Current state of a canary deployment.
ErrorRateEvaluator
Evaluate canary health based on error/failure rates.
Compares the canary's failure rate to the baseline average. If the canary's rate exceeds the baseline average scaled by the threshold multiplier, the canary is considered unhealthy.
LatencyEvaluator
Evaluate canary health based on response latency.
Compares the canary's latency to the baseline average. If the canary's latency exceeds the baseline average scaled by the threshold multiplier, the canary is considered unhealthy.
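ErrorRateEvaluator and LatencyEvaluator are both described as a comparison against a multiple of the baseline average, which can be sketched as a single function. This is an illustration only: `is_canary_healthy` is a hypothetical name, the comparison is assumed to be `canary <= baseline_average * threshold_multiplier`, and treating an empty baseline as healthy is a choice made for this sketch.

```python
def is_canary_healthy(
    canary_value: float,
    baseline_values: list[float],
    threshold_multiplier: float,
) -> bool:
    """Hypothetical check shared by both evaluator styles."""
    if not baseline_values:
        return True  # assumption: nothing to compare against, so do not fail
    baseline_average = sum(baseline_values) / len(baseline_values)
    return canary_value <= baseline_average * threshold_multiplier


# Error rates: baseline averages 1%, canary at 1.5% with a 2x multiplier.
error_ok = is_canary_healthy(0.015, [0.01, 0.01], threshold_multiplier=2.0)
# Latencies in seconds: baseline averages 0.10 s, canary at 0.25 s.
latency_ok = is_canary_healthy(0.25, [0.08, 0.12], threshold_multiplier=2.0)
print(error_ok, latency_ok)
```

Under a purely multiplicative rule like this one, a baseline with zero errors makes any canary error a failure, so a practical evaluator may also want an absolute floor.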
DeploymentState dataclass
DeploymentState(
status: str = "idle",
total_instances: int = 0,
replaced: int = 0,
failed: int = 0,
current_batch: int = 0,
)
Current state of a rolling deployment.
RollingDeployer
RollingDeployer(
name: str,
load_balancer: Entity,
server_factory: Callable[[str], Entity],
batch_size: int = 1,
health_check_interval: float = 2.0,
healthy_threshold: int = 2,
max_failures: int = 1,
)
Bases: Entity
Rolls out new backend versions one batch at a time.
For each batch:
1. Create new instances and add them to the load balancer.
2. Health check the new instances until healthy_threshold consecutive passes.
3. Remove the old instances in the batch.
4. If health checks fail too many times, roll back.
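The batch steps above can be sketched as a plain loop over backend names. This is a simplified illustration, not the library's implementation: `rolling_deploy`, the `-v2` naming, and the status strings are hypothetical. The sketch assumes that failures are counted across the whole deployment and that a rollback simply restores the original backend list.

```python
from typing import Callable


def rolling_deploy(
    old: list[str],
    batch_size: int,
    health_check: Callable[[str], bool],
    healthy_threshold: int = 2,
    max_failures: int = 1,
) -> tuple[str, list[str]]:
    """Hypothetical batch loop; returns (status, final backend names)."""
    backends = list(old)
    failures = 0
    for start in range(0, len(old), batch_size):
        batch = old[start:start + batch_size]
        new = [f"{name}-v2" for name in batch]  # step 1: create and add
        backends.extend(new)
        for name in new:  # step 2: check until enough consecutive passes
            passes = 0
            while passes < healthy_threshold:
                if health_check(name):
                    passes += 1
                else:
                    passes = 0  # a failure resets the consecutive count
                    failures += 1
                    if failures >= max_failures:  # step 4: roll back
                        return ("rolled_back", list(old))
        backends = [b for b in backends if b not in batch]  # step 3: remove old
    return ("completed", backends)


ok = rolling_deploy(["s1", "s2", "s3"], 2, lambda name: True)
bad = rolling_deploy(["s1", "s2", "s3"], 2, lambda name: name != "s3-v2")
print(ok, bad)
```

Adding the new instances before removing the old ones keeps capacity at or above the starting level during each batch.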
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | | Deployer identifier. |
| `stats` | `RollingDeployerStats` | Frozen statistics snapshot. |
| `state` | | Current deployment state. |
Initialize the rolling deployer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Deployer identifier. | required |
| `load_balancer` | `Entity` | LoadBalancer whose backends to replace. | required |
| `server_factory` | `Callable[[str], Entity]` | Creates new server instances. | required |
| `batch_size` | `int` | Number of instances to replace per batch. | `1` |
| `health_check_interval` | `float` | Seconds between health checks. | `2.0` |
| `healthy_threshold` | `int` | Consecutive passes required. | `2` |
| `max_failures` | `int` | Failures before rollback. | `1` |
RollingDeployerStats dataclass
RollingDeployerStats(
deployments_started: int = 0,
deployments_completed: int = 0,
deployments_rolled_back: int = 0,
instances_replaced: int = 0,
health_checks_performed: int = 0,
health_checks_passed: int = 0,
health_checks_failed: int = 0,
)
Statistics tracked by RollingDeployer.