chaos
sqllocks_spindle.chaos
Chaos Engine for Spindle — deterministic data-quality issue injection.
Public API::

    from sqllocks_spindle.chaos import ChaosEngine, ChaosConfig, ChaosCategory

    cfg = ChaosConfig(enabled=True, intensity="stormy")
    engine = ChaosEngine(cfg)
    df = engine.corrupt_dataframe(df, day=15)

Classes
ChaosMutator
Bases: ABC
Base class for all chaos-category mutators.
Subclasses implement `mutate`, which receives the data to corrupt,
the current simulation day, a seeded NumPy `Generator`, and the
intensity multiplier from the active preset.
Attributes
category (abstractmethod, property)
Short category label matching `ChaosCategory` values.
Methods:
mutate(data, day, rng, intensity_multiplier) (abstractmethod)
Apply chaos to data and return the mutated result.
The concrete type of data depends on the category (DataFrame, bytes, dict of DataFrames, etc.).
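
As an illustration of the subclassing contract described above, here is a minimal sketch. It does not import the library: `ChaosMutator` below is a stand-in that mirrors only the documented interface, `NullOutMutator` is a hypothetical subclass, the `"value"` label and the 10% base rate are assumptions, and `random.Random` stands in for the seeded NumPy `Generator` so the example needs only the standard library.

```python
from abc import ABC, abstractmethod
import random


class ChaosMutator(ABC):
    """Stand-in mirroring the documented interface (not the library's code)."""

    @property
    @abstractmethod
    def category(self) -> str:
        """Short category label."""

    @abstractmethod
    def mutate(self, data, day, rng, intensity_multiplier):
        """Apply chaos to data and return the mutated result."""


class NullOutMutator(ChaosMutator):
    """Hypothetical mutator: nulls one field in some rows of a list of dicts."""

    @property
    def category(self) -> str:
        return "value"  # assumed label; the real ChaosCategory values may differ

    def mutate(self, data, day, rng, intensity_multiplier):
        # Assumed base rate of 10% per row, scaled by the preset multiplier.
        p = min(1.0, 0.1 * intensity_multiplier)
        out = []
        for row in data:
            row = dict(row)  # copy so the caller's rows are left untouched
            if rng.random() < p:
                key = rng.choice(sorted(row))
                row[key] = None
            out.append(row)
        return out


rows = [{"id": i, "amount": 10.0 * i} for i in range(5)]
mutator = NullOutMutator()
# Two runs with the same seed produce the same corruption.
first = mutator.mutate(rows, day=15, rng=random.Random(42), intensity_multiplier=2.0)
second = mutator.mutate(rows, day=15, rng=random.Random(42), intensity_multiplier=2.0)
```

Passing the generator in, rather than creating one inside `mutate`, is what keeps the injection reproducible for a given seed.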
FileChaosMutator
Bases: ChaosMutator
Corrupt raw file bytes: truncation, encoding corruption, partial writes, zero-byte files, wrong extension content, wrong delimiters, invalid JSON/Parquet poison payloads.
ReferentialChaosMutator
Bases: ChaosMutator
Corrupt referential integrity: orphan foreign keys, duplicate primary keys.
Expects data to be a dict[str, pd.DataFrame] (table name to DF).
Returns the same structure with mutations applied.
SchemaChaosMutator
Bases: ChaosMutator
Add, remove, rename, reorder, or retype columns.
Before breaking_change_day only additive changes (add column,
reorder) are applied. After that day, destructive mutations (drop,
rename, retype) are also possible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| breaking_change_day | int | Simulation day after which destructive schema mutations are enabled. | 20 |
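
The day-based gating described above could be sketched as follows. The function name and the mutation labels are hypothetical, and the sketch assumes that the breaking-change day itself is still additive-only, which follows the "after which" wording of the parameter description.

```python
# Hypothetical labels for the two groups of schema mutations.
ADDITIVE = ("add_column", "reorder")
DESTRUCTIVE = ("drop", "rename", "retype")


def allowed_schema_mutations(day: int, breaking_change_day: int = 20) -> tuple:
    """Return the mutation kinds that may be drawn on the given day."""
    # Assumption: destructive changes unlock strictly after the threshold day.
    if day > breaking_change_day:
        return ADDITIVE + DESTRUCTIVE
    return ADDITIVE


on_threshold = allowed_schema_mutations(20)
after_threshold = allowed_schema_mutations(21)
```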
TemporalChaosMutator
Bases: ChaosMutator
Corrupt temporal columns: late arrivals, out-of-order timestamps, timezone mismatches, DST boundary issues.
ValueChaosMutator
Bases: ChaosMutator
Corrupt individual cell values: nulls, out-of-range, wrong types, encoding issues (BOM, Latin-1), future dates, negative amounts.
VolumeChaosMutator
ChaosCategory
Bases: Enum
Categories of chaos that can be injected into generated data.
ChaosConfig (dataclass)
Top-level configuration for the Chaos Engine.
Attributes:
| Name | Type | Description |
|---|---|---|
| enabled | bool | Master switch. When … |
| intensity | str | One of … |
| seed | int | Seed for the chaos RNG (independent of the main generation seed). |
| warmup_days | int | Number of days at the start with no chaos. |
| chaos_start_day | int | First day chaos may fire (must be > warmup_days). |
| escalation | str | How injection probability grows over time. |
| categories | dict[str, dict[str, Any]] | Per-category configuration. Keys are :class: … |
| overrides | list[ChaosOverride] | Explicit per-day overrides that bypass probability checks. |
| breaking_change_day | int | Day on which schema-breaking mutations are allowed (column drops / renames). Before this day only additive schema changes are injected. |
Attributes
intensity_multiplier (property)
Return the numeric multiplier for the current intensity preset.
Methods:
is_category_enabled(category)
Return True if the given category string is enabled.
category_weight(category)
Return the base weight for a category (0.0 if missing/disabled).
overrides_for_day(day)
Return any explicit overrides scheduled for day.
validate()
Return a list of validation error messages (empty = valid).
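
The "list of messages, empty means valid" convention can be sketched like this. `ConfigSketch` is a hypothetical stand-in with only two of the documented fields; the default values and the non-negative check are assumptions, while the `chaos_start_day > warmup_days` rule comes from the attribute description.

```python
from dataclasses import dataclass


@dataclass
class ConfigSketch:
    """Hypothetical stand-in carrying two of the documented fields."""

    warmup_days: int = 7      # assumed default
    chaos_start_day: int = 8  # assumed default

    def validate(self) -> list:
        """Collect every problem instead of raising on the first one."""
        errors = []
        if self.warmup_days < 0:  # assumed rule, not stated in the reference
            errors.append("warmup_days must be >= 0")
        if self.chaos_start_day <= self.warmup_days:  # documented constraint
            errors.append("chaos_start_day must be > warmup_days")
        return errors


ok_errors = ConfigSketch().validate()
bad_errors = ConfigSketch(warmup_days=10, chaos_start_day=5).validate()
```

Returning messages rather than raising lets a caller report all configuration problems in one pass.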
ChaosOverride (dataclass)
Per-issue override that forces a specific chaos event on a given day.
Attributes:
| Name | Type | Description |
|---|---|---|
| day | int | The simulation day on which to inject. |
| category | str | Which :class: … |
| params | dict[str, Any] | Extra parameters forwarded to the mutator. |
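
A minimal sketch of how such overrides might be stored and looked up per day. `OverrideSketch` copies the three documented fields; the lookup function, the category labels, and the `null_rate` parameter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class OverrideSketch:
    """Hypothetical stand-in with the three documented fields."""

    day: int
    category: str
    params: dict = field(default_factory=dict)


def overrides_for_day(overrides: list, day: int) -> list:
    """Return the overrides scheduled for the given simulation day."""
    return [o for o in overrides if o.day == day]


schedule = [
    OverrideSketch(day=10, category="schema"),                     # assumed label
    OverrideSketch(day=15, category="value", params={"null_rate": 0.5}),
    OverrideSketch(day=15, category="volume"),                     # assumed label
]
day_15 = overrides_for_day(schedule, 15)
```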
ChaosEngine
Orchestrates chaos injection across all categories.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | ChaosConfig \| None | A :class: … | None |
| seed | int \| None | If provided, overrides … | None |
Methods:
should_inject(day, category)
Decide whether chaos should fire on day for category.
Returns False immediately if the engine is disabled, the day is
within the warmup window, or the category is disabled. Otherwise
draws against the effective probability (base weight * intensity
multiplier * escalation factor).
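
The decision described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the engine's code: the parameter names are hypothetical, "within the warmup window" is read as `day <= warmup_days`, the product is clamped to 1.0, and `random.Random` stands in for the engine's seeded generator.

```python
import random


def should_inject_sketch(day, *, enabled, warmup_days, category_enabled,
                         base_weight, intensity_multiplier, escalation_factor,
                         rng):
    """Return True if chaos should fire, following the documented order of checks."""
    # Early exits: engine off, still warming up, or category switched off.
    if not enabled or day <= warmup_days or not category_enabled:
        return False
    # Effective probability = base weight * intensity multiplier * escalation factor.
    probability = min(1.0, base_weight * intensity_multiplier * escalation_factor)
    # rng.random() lies in [0, 1), so probability 0 never fires and 1 always fires.
    return rng.random() < probability


common = dict(enabled=True, warmup_days=7, category_enabled=True,
              intensity_multiplier=2.0, escalation_factor=1.0)
in_warmup = should_inject_sketch(3, base_weight=1.0, rng=random.Random(0), **common)
certain = should_inject_sketch(15, base_weight=1.0, rng=random.Random(0), **common)
never = should_inject_sketch(15, base_weight=0.0, rng=random.Random(0), **common)
```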
corrupt_dataframe(df, day)
Apply value-level chaos to a DataFrame.
Injects nulls, out-of-range values, wrong types, encoding issues, future dates, and negative amounts.
drift_schema(df, day)
Apply schema-level chaos: add/remove/rename/reorder/retype columns.
Destructive mutations (drop, rename) only fire after
config.breaking_change_day.
corrupt_file(file_bytes, day)
Corrupt raw file bytes: truncation, encoding damage, partial writes, zero-byte, garbage headers.
inject_referential_chaos(tables_dict, day)
Corrupt referential integrity: orphan FKs, duplicate PKs.
inject_temporal_chaos(df, date_columns, day)
Corrupt temporal columns: late arrivals, out-of-order, timezone mismatches, DST boundary issues.
inject_volume_chaos(df, day)
Alter data volume: a 10x spike, an empty batch, or a single-row batch.
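
The three volume modes can be illustrated on a plain list of rows. This sketch assumes the spike is produced by repeating the batch ten times and that the mode is drawn uniformly; the reference does not say how the real method builds the spike or picks a mode, and the mode names are hypothetical.

```python
import random


def volume_chaos_sketch(rows, rng):
    """Return (mode, new_rows) for one randomly chosen volume anomaly."""
    mode = rng.choice(["spike", "empty", "single_row"])
    if mode == "spike":
        return mode, rows * 10  # assumed: repeat the batch to reach 10x volume
    if mode == "empty":
        return mode, []
    return mode, rows[:1]       # keep only the first row


batch = [{"id": i} for i in range(4)]
results = [volume_chaos_sketch(batch, random.Random(seed)) for seed in range(20)]
```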
apply_all(df, day, *, tables_dict=None, date_columns=None)
Run through every category and inject chaos where
`should_inject` returns True.
This is a convenience wrapper; callers who need fine-grained control should call individual methods directly.
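
The wrapper pattern described above amounts to a loop over categories guarded by the injection check. The sketch below is hypothetical throughout: the function name, the mapping of category labels to callables, and the returned list of fired categories are illustrative choices, not the engine's actual signature or return value.

```python
def apply_all_sketch(data, day, injectors, should_inject):
    """Run each category's injector when should_inject(day, category) is True."""
    fired = []
    for category, inject in injectors.items():
        if should_inject(day, category):
            data = inject(data, day)  # each injector returns the mutated data
            fired.append(category)
    return data, fired


# Toy injectors that just record which category touched the data.
injectors = {
    "value": lambda data, day: data + ["value"],
    "volume": lambda data, day: data + ["volume"],
}
result, fired = apply_all_sketch([], 15, injectors,
                                 should_inject=lambda day, cat: cat == "volume")
```

Calling a single injector directly skips the guard, which is the fine-grained route the reference recommends when exact control is needed.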