Distributions

Latency and value distributions for stochastic simulation parameters.

ConstantLatency

ConstantLatency(latency: Duration | float)

Bases: LatencyDistribution

Latency distribution that always returns the same value.

Every call to get_latency() returns the configured latency exactly. No randomness is involved.

Use for deterministic tests or when modeling fixed processing delays.

Initialize with a fixed latency value.

Parameters:

Name     Type              Description                                           Default
latency  Duration | float  The constant latency as Duration or seconds (float).  required

get_latency

get_latency(current_time: Instant) -> Duration

Return the constant latency value.

DistributionType

Bases: Enum

Distribution type identifiers.

- POISSON: Exponential inter-arrival times (memoryless, random).
- CONSTANT: Deterministic inter-arrival times (perfectly regular).
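To make the two identifiers concrete, the sketch below generates inter-arrival gaps for each kind using only the standard library. The `ArrivalKind` enum and the `inter_arrival_times` helper are hypothetical stand-ins written for this illustration; they are not part of this module, and how the library consumes `DistributionType` is not shown on this page.

```python
import random
from enum import Enum


class ArrivalKind(Enum):
    """Hypothetical stand-in mirroring the two documented identifiers."""

    POISSON = "poisson"
    CONSTANT = "constant"


def inter_arrival_times(kind: ArrivalKind, rate: float, n: int, seed: int = 0) -> list[float]:
    """Return n gaps (in seconds) between arrivals for a rate in events/second."""
    rng = random.Random(seed)
    if kind is ArrivalKind.CONSTANT:
        # Perfectly regular: every gap is exactly 1/rate.
        return [1.0 / rate] * n
    # Poisson process: gaps are exponentially distributed with mean 1/rate.
    return [rng.expovariate(rate) for _ in range(n)]


constant_gaps = inter_arrival_times(ArrivalKind.CONSTANT, rate=10.0, n=5)
poisson_gaps = inter_arrival_times(ArrivalKind.POISSON, rate=10.0, n=10_000)
poisson_mean = sum(poisson_gaps) / len(poisson_gaps)

print(constant_gaps)
print(f"mean Poisson gap ~ {poisson_mean:.4f} s")
```

Both kinds produce the same average arrival rate; they differ only in how much the individual gaps vary around it.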

ExponentialLatency

ExponentialLatency(mean_latency: Duration | float)

Bases: LatencyDistribution

Latency distribution sampling from an exponential distribution.

Uses random.expovariate() to generate exponentially distributed latencies with the specified mean. The exponential distribution has the memoryless property: the remaining wait time has the same distribution regardless of time already waited.

Samples have high variance (coefficient of variation = 1), so values can range from near-zero to several multiples of the mean.
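The variance claim can be checked empirically. The following is a minimal sketch using `random.expovariate()` directly rather than the `ExponentialLatency` class; the mean of 0.2 seconds and the seed are arbitrary values chosen for illustration.

```python
import random
import statistics

mean_latency = 0.2  # seconds; illustrative value
rng = random.Random(42)

# random.expovariate takes the rate (lambd = 1 / mean), not the mean itself.
samples = [rng.expovariate(1.0 / mean_latency) for _ in range(50_000)]

sample_mean = statistics.fmean(samples)
coefficient_of_variation = statistics.pstdev(samples) / sample_mean

print(f"mean ~ {sample_mean:.3f} s")
print(f"CV   ~ {coefficient_of_variation:.2f}")
print(f"min = {min(samples):.5f} s, max = {max(samples):.2f} s")
```

The sample mean should land close to the configured mean and the coefficient of variation close to 1, while the minimum and maximum show how widely individual samples spread.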

Initialize with mean latency (expected value of distribution).

Parameters:

Name          Type              Description                                            Default
mean_latency  Duration | float  Expected mean latency as Duration or seconds (float).  required

get_latency

get_latency(current_time: Instant) -> Duration

Sample a random latency from the exponential distribution.

LatencyDistribution

LatencyDistribution(mean_latency: Duration | float)

Bases: ABC

Abstract base class for latency sampling.

Subclasses implement get_latency() to return sampled delay values. The mean_latency parameter configures the expected average latency; actual samples may vary based on the distribution type.

Supports +/- operators to adjust the mean latency, returning new instances (original is unchanged).
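One way such copy-on-adjust operators could be written is sketched below. `LatencySketch` and `FixedLatencySketch` are hypothetical stand-ins, not the library's classes: they work in plain float seconds and omit the `current_time` argument so the example stays self-contained. The actual implementation may differ.

```python
import copy
from abc import ABC, abstractmethod


class LatencySketch(ABC):
    """Hypothetical stand-in for the documented base class (float seconds only)."""

    def __init__(self, mean_latency: float) -> None:
        self._mean_latency = float(mean_latency)

    @abstractmethod
    def get_latency(self) -> float:
        """Return one sampled latency in seconds."""

    def __add__(self, additional: float) -> "LatencySketch":
        # Shallow-copy so the original instance keeps its mean.
        shifted = copy.copy(self)
        shifted._mean_latency = self._mean_latency + additional
        return shifted

    def __sub__(self, subtraction: float) -> "LatencySketch":
        return self + (-subtraction)


class FixedLatencySketch(LatencySketch):
    """Concrete subclass that always returns the mean."""

    def get_latency(self) -> float:
        return self._mean_latency


base = FixedLatencySketch(0.100)
slower = base + 0.050
faster = base - 0.025
print(base.get_latency(), slower.get_latency(), faster.get_latency())
```

Because the operators copy the instance, the result keeps the concrete subclass type, and `base` can still be reused after deriving `slower` and `faster` from it.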

Attributes:

Name           Type   Description
_mean_latency  float  Mean latency in seconds (stored as float for calculations).

Initialize with a mean latency value.

Parameters:

Name          Type              Description                                            Default
mean_latency  Duration | float  Expected mean latency as Duration or seconds (float).  required

get_latency abstractmethod

get_latency(current_time: Instant) -> Duration

Sample a latency value.

Parameters:

Name          Type     Description                                                Default
current_time  Instant  Current simulation time (for time-varying distributions).  required

Returns:

Type      Description
Duration  Sampled latency as a Duration.

__add__

__add__(additional: float) -> LatencyDistribution

Return a copy with increased mean latency.

__sub__

__sub__(subtraction: float) -> LatencyDistribution

Return a copy with decreased mean latency.

PercentileFittedLatency

PercentileFittedLatency(
    p50: float | None = None,
    p90: float | None = None,
    p99: float | None = None,
    p999: float | None = None,
    p9999: float | None = None,
)

Bases: LatencyDistribution

Exponential latency distribution fitted to percentile targets.

Fits an exponential distribution to match provided percentile values using least-squares optimization. At least one percentile must be provided.

For an exponential distribution with rate λ:

- CDF: F(x) = 1 - e^(-λx)
- Quantile: Q(p) = -ln(1-p) / λ
- Mean: 1/λ

The fitting minimizes the squared error between the target percentile values and the exponential quantile function.
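As an illustration of what such a fit can look like, the sketch below minimizes the squared error over the mean m = 1/λ. Writing q_i = -ln(1 - p_i) and x_i for the target latency at p_i, the objective Σ (x_i - m·q_i)² is quadratic in m, so the minimizer has the closed form m = Σ x_i·q_i / Σ q_i². The parametrization and the helper names (`fit_exponential_mean`, `quantile`) are assumptions made for this example; the library may use a different optimizer or optimize over λ directly.

```python
import math


def fit_exponential_mean(targets: dict[float, float]) -> float:
    """Least-squares mean (1/lambda) for targets given as {fraction p: latency in seconds}."""
    if not targets:
        raise ValueError("at least one percentile target is required")
    # For an exponential, Q(p) = mean * q with q = -ln(1 - p).
    # Minimizing sum((x - mean * q)^2) over mean gives sum(x * q) / sum(q * q).
    numerator = 0.0
    denominator = 0.0
    for p, x in targets.items():
        q = -math.log(1.0 - p)
        numerator += x * q
        denominator += q * q
    return numerator / denominator


def quantile(mean: float, p: float) -> float:
    """Exponential quantile: Q(p) = -ln(1 - p) / lambda, with lambda = 1 / mean."""
    return -math.log(1.0 - p) * mean


mean = fit_exponential_mean({0.50: 0.1, 0.99: 0.5})
print(f"fitted mean = {mean:.4f} s")
print(f"p50 = {quantile(mean, 0.50):.4f} s, p99 = {quantile(mean, 0.99):.4f} s")
```

An exponential has a single free parameter, so it reproduces one target exactly but generally cannot match several at once; with multiple targets the fit is a compromise between them, and in this unweighted form the higher percentiles carry more weight because their q values are larger.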

Parameters:

Name   Type          Description                                                     Default
p50    float | None  Optional target value for 50th percentile (median) in seconds.  None
p90    float | None  Optional target value for 90th percentile in seconds.           None
p99    float | None  Optional target value for 99th percentile in seconds.           None
p999   float | None  Optional target value for 99.9th percentile in seconds.         None
p9999  float | None  Optional target value for 99.99th percentile in seconds.        None

Raises:

Type        Description
ValueError  If no percentiles are provided.

Example

dist = PercentileFittedLatency(p50=0.1, p99=0.5)
latency = dist.get_latency(current_time)

Initialize by fitting an exponential to the provided percentiles.

Parameters:

Name   Type          Description                                       Default
p50    float | None  Target latency at 50th percentile in seconds.     None
p90    float | None  Target latency at 90th percentile in seconds.     None
p99    float | None  Target latency at 99th percentile in seconds.     None
p999   float | None  Target latency at 99.9th percentile in seconds.   None
p9999  float | None  Target latency at 99.99th percentile in seconds.  None

get_latency

get_latency(current_time: Instant) -> Duration

Sample a random latency from the fitted exponential distribution.

get_percentile

get_percentile(p: float) -> Duration

Get the latency value at a given percentile.

Parameters:

Name  Type   Description                            Default
p     float  Percentile as a fraction (0 < p < 1).  required

Returns:

Type      Description
Duration  Latency value at the given percentile as a Duration.

UniformDistribution

UniformDistribution(
    values: Sequence[T], seed: int | None = None
)

Bases: ValueDistribution[T]

Uniform random sampling from a population.

Each value in the population has an equal probability of being selected. Useful as a baseline comparison against skewed distributions.

Parameters:

Name    Type         Description                                Default
values  Sequence[T]  The population of values to sample from.   required
seed    int | None   Optional random seed for reproducibility.  None

Example

# Uniform distribution over regions
dist = UniformDistribution(["us-east", "us-west", "eu"], seed=42)
region = dist.sample()  # Each region equally likely

# Uniform distribution over integer IDs
dist = UniformDistribution(range(1000))
item_id = dist.sample()  # Each ID has 1/1000 probability

Initialize with population.

Parameters:

Name    Type         Description                         Default
values  Sequence[T]  Sequence of values to sample from.  required
seed    int | None   Random seed for reproducibility.    None

Raises:

Type        Description
ValueError  If values is empty.

population property

population: Sequence[T]

Return the complete population of possible values.

size property

size: int

Return the number of distinct values in the population.

sample

sample() -> T

Sample a value uniformly at random.

Returns:

Type  Description
T     A randomly selected value from the population.

probability

probability() -> float

Return the probability of selecting any single value.

Returns:

Type   Description
float  1/n where n is the population size.
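The 1/n probability can be checked against observed frequencies. This sketch uses `random.Random.choice` from the standard library rather than `UniformDistribution` itself, so it illustrates the behavior described above without claiming to match the class internals.

```python
import random
from collections import Counter

regions = ["us-east", "us-west", "eu"]
rng = random.Random(42)  # a dedicated generator keeps runs reproducible

draws = [rng.choice(regions) for _ in range(30_000)]
counts = Counter(draws)

expected_probability = 1 / len(regions)
for region in regions:
    observed = counts[region] / len(draws)
    print(f"{region}: observed {observed:.3f}, expected {expected_probability:.3f}")
```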

ValueDistribution

Bases: ABC

Abstract base for sampling discrete values from a distribution.

Subclasses implement sample() to return values according to their specific probability distribution. The population of possible values is finite and defined at construction time.

Class Type Parameters:

Name  Bound or Constraints  Description                                                              Default
T     -                     The type of values in the distribution (int, str, custom objects, etc.)  required

Example

class MyDistribution(ValueDistribution[str]):
    def sample(self) -> str:
        return "value"

population abstractmethod property

population: Sequence[T]

Return the complete population of possible values.

Returns:

Type         Description
Sequence[T]  Sequence of all values that can be sampled.

size abstractmethod property

size: int

Return the number of distinct values in the population.

Returns:

Type  Description
int   Population size.

sample abstractmethod

sample() -> T

Sample a single value from the distribution.

Returns:

Type  Description
T     A value from the population according to the distribution.

sample_n

sample_n(n: int) -> list[T]

Sample n values from the distribution.

Parameters:

Name  Type  Description                     Default
n     int   Number of samples to generate.  required

Returns:

Type     Description
list[T]  List of n sampled values.

ZipfDistribution

ZipfDistribution(
    values: Sequence[T],
    s: float = 1.0,
    seed: int | None = None,
)

Bases: ValueDistribution[T]

Samples values following Zipf's law (power-law distribution).

Zipf's law states that the frequency of an item is inversely proportional to its rank raised to a power: P(rank=k) proportional to 1/k^s

The distribution is implemented using inverse transform sampling with precomputed cumulative probabilities for O(1) sampling after initialization.
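Inverse transform sampling over a precomputed cumulative table can be sketched as follows. This is an illustration, not the library's code: the helper names are hypothetical, and the lookup here uses `bisect`, which costs O(log n) per draw after the one-time O(n) table build.

```python
import bisect
import itertools
import random


def build_zipf_cdf(n: int, s: float) -> list[float]:
    """Cumulative probabilities for ranks 1..n with P(k) proportional to 1/k^s."""
    weights = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(weights)
    return list(itertools.accumulate(w / total for w in weights))


def sample_rank(cdf: list[float], rng: random.Random) -> int:
    """Inverse transform: draw u in [0, 1) and return the first rank whose CDF exceeds u."""
    u = rng.random()
    index = bisect.bisect_right(cdf, u)
    # Guard against floating-point round-off leaving cdf[-1] slightly below 1.0.
    return min(index, len(cdf) - 1) + 1


values = ["hot", "warm", "cool", "cold"]
cdf = build_zipf_cdf(len(values), s=1.5)
rng = random.Random(42)
draws = [values[sample_rank(cdf, rng) - 1] for _ in range(10_000)]
print({v: draws.count(v) for v in values})
```

The printed counts should fall off with rank, with "hot" drawn most often and "cold" least.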

Parameters:

Name    Type         Description                                                          Default
values  Sequence[T]  The population of values to sample from. The first value is rank 1   required
                     (most frequent), second is rank 2, etc.
s       float        Zipf exponent (default 1.0).                                         1.0
                     - s=0: uniform distribution
                     - s=1: classic Zipf (item at rank k appears 1/k as often as rank 1)
                     - s>1: more extreme skew toward popular items
seed    int | None   Optional random seed for reproducibility.                            None

Example

# Customer IDs 0-999 with classic Zipf distribution
dist = ZipfDistribution(range(1000), s=1.0)
customer_id = dist.sample()  # Most likely returns low IDs

# String values with extreme skew
dist = ZipfDistribution(["hot", "warm", "cool", "cold"], s=1.5, seed=42)
category = dist.sample()  # "hot" appears most frequently

Initialize with population and Zipf parameters.

Parameters:

Name    Type         Description                                         Default
values  Sequence[T]  Sequence of values to sample from (rank ordered).   required
s       float        Zipf exponent controlling skew (default 1.0).       1.0
seed    int | None   Random seed for reproducibility.                    None

Raises:

Type        Description
ValueError  If values is empty or s is negative.

population property

population: Sequence[T]

Return the complete population of possible values.

size property

size: int

Return the number of distinct values in the population.

s property

s: float

Return the Zipf exponent.

sample

sample() -> T

Sample a value using inverse transform sampling.

Returns:

Type  Description
T     A value from the population according to Zipf distribution.

probability

probability(rank: int) -> float

Return the probability for a given rank (1-indexed).

Parameters:

Name  Type  Description                               Default
rank  int   The rank of the item (1 = most popular).  required

Returns:

Type   Description
float  Probability of sampling an item with this rank.

Raises:

Type        Description
ValueError  If rank is out of range.
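The per-rank probability follows from normalizing 1/k^s over all ranks in the population. A standalone sketch of that calculation is shown here; `zipf_probability` is a hypothetical free function written for this example, not the method itself, and the loop shows how the exponent changes the shape.

```python
def zipf_probability(rank: int, n: int, s: float) -> float:
    """P(rank) = (1 / rank^s) / sum over k=1..n of (1 / k^s), for a 1-indexed rank."""
    if not 1 <= rank <= n:
        raise ValueError(f"rank must be in 1..{n}, got {rank}")
    normalizer = sum(1.0 / (k ** s) for k in range(1, n + 1))
    return (1.0 / (rank ** s)) / normalizer


n = 4
for s in (0.0, 1.0, 2.0):
    probabilities = [zipf_probability(rank, n, s) for rank in range(1, n + 1)]
    print(f"s={s}: " + ", ".join(f"{p:.3f}" for p in probabilities))
```

With s=0 every rank gets the same probability (0.25 for four items); with s=1 rank 2 is half as likely as rank 1; larger exponents concentrate more of the mass on rank 1.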

probability_for_value

probability_for_value(value: T) -> float

Return the probability for a specific value.

Parameters:

Name   Type  Description                        Default
value  T     The value to get probability for.  required

Returns:

Type   Description
float  Probability of sampling this value.

Raises:

Type        Description
ValueError  If value is not in the population.

expected_frequency

expected_frequency(rank: int, n_samples: int) -> float

Return the expected count for a rank given n samples.

Parameters:

Name       Type  Description                               Default
rank       int   The rank of the item (1 = most popular).  required
n_samples  int   Total number of samples.                  required

Returns:

Type   Description
float  Expected count for items with this rank.

top_n_probability

top_n_probability(n: int) -> float

Return the combined probability of the top n ranked items.

Parameters:

Name  Type  Description                      Default
n     int   Number of top items to include.  required

Returns:

Type   Description
float  Combined probability (0.0 to 1.0).

Raises:

Type        Description
ValueError  If n is out of range.
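Both `expected_frequency` and `top_n_probability` reduce to simple arithmetic on the per-rank probabilities: the expected count is n_samples times P(rank), and the top-n mass is the sum of the first n probabilities. The sketch below uses hypothetical free functions that take the probability list explicitly, and it assumes the valid range for n is 1 through the population size, which this page does not specify.

```python
def zipf_probabilities(n: int, s: float) -> list[float]:
    """Normalized probabilities for ranks 1..n with P(k) proportional to 1/k^s."""
    weights = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_frequency(probabilities: list[float], rank: int, n_samples: int) -> float:
    """Expected number of times a 1-indexed rank appears in n_samples draws."""
    if not 1 <= rank <= len(probabilities):
        raise ValueError("rank is out of range")
    return n_samples * probabilities[rank - 1]


def top_n_probability(probabilities: list[float], n: int) -> float:
    """Combined probability of the n most popular ranks (assumed range: 1..size)."""
    if not 1 <= n <= len(probabilities):
        raise ValueError("n is out of range")
    return sum(probabilities[:n])


probs = zipf_probabilities(1000, s=1.0)
print(f"expected hits on rank 1 in 10,000 samples: {expected_frequency(probs, 1, 10_000):.1f}")
print(f"share of traffic going to the top 10 of 1000: {top_n_probability(probs, 10):.3f}")
```

Figures like these are useful for sizing caches or spotting hot keys: under classic Zipf, a small fraction of the population absorbs a large share of the samples.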