scale_router

sqllocks_spindle.engine.scale_router

Entry point for multi-process chunked generation with multi-sink fan-out.

Classes

ScaleRouter

Entry point for multi-process chunked generation with multi-sink fan-out.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| schema_path | str | Path to a .json file containing a serialized SpindleSchema. | required |
| sinks | list[Sink] | List of Sink instances to receive generated data. | required |
| chunk_size | int | Rows per chunk. Default 500_000. | 500000 |
| max_workers | int \| None | Subprocess count. Default os.cpu_count() - 1. Capped automatically if the estimated working set would exceed 80% of available RAM. | None |

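
The max_workers rule can be illustrated with a minimal sketch. The helper name resolve_max_workers and its per_worker_bytes and available_bytes inputs are hypothetical; how ScaleRouter actually estimates the working set is not documented here, so this only shows the shape of "default to os.cpu_count() - 1, then cap at 80% of available RAM".

```python
import os


def resolve_max_workers(max_workers, per_worker_bytes, available_bytes,
                        cpu_count=None, ram_fraction=0.8):
    """Hypothetical helper: pick a worker count and cap it by a RAM budget.

    per_worker_bytes and available_bytes are assumed inputs; the real
    estimate used by ScaleRouter may be computed differently.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    if max_workers is None:
        max_workers = cpu_count - 1  # documented default
    # Number of workers whose combined working set fits in the RAM budget.
    budget = available_bytes * ram_fraction
    fits_in_budget = int(budget // per_worker_bytes)
    # Always keep at least one worker.
    return max(1, min(max_workers, fits_in_budget))
```

Under these assumptions, an 8-core machine with 8 GiB available and a 2 GiB working set per worker would be limited by memory rather than by core count.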
Methods:
run(total_rows, seed=42)

Generate total_rows rows and fan out to all sinks.

Tables whose schema-derived row count is < chunk_size are treated as static (reference/dimension) tables: they are generated once with their natural cardinality and written to the sinks a single time. Their PK data is broadcast into every chunk worker so FK references resolve correctly without replication.

Tables whose schema-derived row count is >= chunk_size are treated as dynamic (fact) tables: they are generated chunk_size rows per chunk across ceil(total_rows / chunk_size) chunks.
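
The static/dynamic split and the chunk count described above can be sketched as follows. The function plan_generation and the table_rows mapping are hypothetical names used for illustration, not part of the ScaleRouter API; only the threshold comparison and the ceil(total_rows / chunk_size) formula come from the description.

```python
def plan_generation(table_rows, total_rows, chunk_size=500_000):
    """Hypothetical sketch: classify tables and count chunks.

    table_rows maps a table name to its schema-derived row count.
    """
    # Tables below the threshold are static: generated once, written once.
    static = [name for name, rows in table_rows.items() if rows < chunk_size]
    # Tables at or above the threshold are dynamic: generated chunk by chunk.
    dynamic = [name for name, rows in table_rows.items() if rows >= chunk_size]
    # Integer form of ceil(total_rows / chunk_size), avoiding float rounding.
    n_chunks = -(-total_rows // chunk_size)
    return static, dynamic, n_chunks


static, dynamic, n_chunks = plan_generation(
    {"customer": 10_000, "order": 5_000_000},
    total_rows=1_200_000,
)
print(static, dynamic, n_chunks)
```

With the default chunk_size of 500_000, a request for 1_200_000 rows yields 3 chunks, and a table whose row count equals chunk_size exactly is classified as dynamic.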

Returns:

| Type | Description |
| --- | --- |
| dict | Stats dict: rows_generated, elapsed_seconds, throughput_rows_per_sec, memory_peak_gb (estimated). |

Functions: