scale_router
sqllocks_spindle.engine.scale_router
Entry point for multi-process chunked generation with multi-sink fan-out.
Classes
ScaleRouter
Entry point for multi-process chunked generation with multi-sink fan-out.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| schema_path | str | Path to a .json file containing a serialized SpindleSchema. | required |
| sinks | list[Sink] | List of Sink instances to receive generated data. | required |
| chunk_size | int | Rows per chunk. Default 500_000. | 500000 |
| max_workers | int \| None | Subprocess count. Default os.cpu_count() - 1. Capped automatically if the estimated working set would exceed 80% of available RAM. | None |
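The max_workers default and RAM cap can be pictured with a small sketch. This is not the library's implementation: the helper name `resolve_max_workers`, the per-worker memory estimate, and the way available RAM is obtained are all assumptions made for illustration. Only the `os.cpu_count() - 1` default and the 80% threshold come from the parameter description.

```python
import os


def resolve_max_workers(max_workers, est_bytes_per_worker, available_bytes, ram_percent=80):
    """Hypothetical helper: choose a worker count, then cap it by a RAM budget.

    est_bytes_per_worker and available_bytes are assumed inputs; how the
    real ScaleRouter estimates its working set is not documented here.
    """
    if max_workers is None:
        # Documented default: os.cpu_count() - 1, kept to at least one worker.
        max_workers = max(1, (os.cpu_count() or 2) - 1)

    # Budget is the stated share of available RAM (integer arithmetic).
    budget = available_bytes * ram_percent // 100

    # Largest worker count whose combined estimate fits inside the budget.
    fit = budget // est_bytes_per_worker

    return max(1, min(max_workers, fit))


GIB = 2**30
# 8 workers requested, 2 GiB each, 10 GiB available: budget is 8 GiB, so 4 workers fit.
capped = resolve_max_workers(8, 2 * GIB, 10 * GIB)
```

Under these assumptions, a request that already fits the budget is returned unchanged, and the result never drops below one worker.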
Methods:
run(total_rows, seed=42)
Generate total_rows rows and fan out to all sinks.
Tables whose schema-derived row count is < chunk_size are treated as static (reference/dimension) tables: they are generated once with their natural cardinality and written to the sinks a single time. Their PK data is broadcast into every chunk worker so FK references resolve correctly without replication.
Tables whose schema-derived row count is >= chunk_size are treated as dynamic (fact) tables: they are generated chunk_size rows per chunk, across ceil(total_rows / chunk_size) chunks.
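The static/dynamic split and the chunk count described above can be sketched as follows. The function `plan_tables`, the table names, and the row counts are hypothetical; only the `< chunk_size` / `>= chunk_size` rule and the `ceil(total_rows / chunk_size)` formula are taken from the description.

```python
import math


def plan_tables(row_counts, chunk_size, total_rows):
    """Hypothetical planner: classify tables and count chunks.

    row_counts maps table name -> schema-derived row count.
    """
    # Static (reference/dimension) tables: fewer rows than one chunk.
    static = [name for name, n in row_counts.items() if n < chunk_size]
    # Dynamic (fact) tables: at least one full chunk of rows.
    dynamic = [name for name, n in row_counts.items() if n >= chunk_size]
    # Number of chunks needed to cover total_rows.
    num_chunks = math.ceil(total_rows / chunk_size)
    return static, dynamic, num_chunks


# Illustrative row counts, not taken from any real schema.
static, dynamic, num_chunks = plan_tables(
    {"country": 250, "customer": 50_000, "order": 2_000_000},
    chunk_size=500_000,
    total_rows=1_200_000,
)
```

With these inputs, `country` and `customer` fall below the chunk size and are classified as static, `order` is dynamic, and 1,200,000 rows at 500,000 rows per chunk require 3 chunks.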
Returns:
| Type | Description |
|---|---|
| dict | Stats dict: rows_generated, elapsed_seconds, throughput_rows_per_sec, memory_peak_gb (estimated). |
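As a usage sketch, the returned stats dict can be turned into a one-line summary. The helper `summarize_stats` is hypothetical, and the sample values are made up for illustration; only the four key names come from the table above.

```python
def summarize_stats(stats):
    """Hypothetical helper: format the documented stats keys for logging."""
    return (
        f"{stats['rows_generated']:,} rows in {stats['elapsed_seconds']:.1f} s "
        f"({stats['throughput_rows_per_sec']:,.0f} rows/s, "
        f"~{stats['memory_peak_gb']:.2f} GB peak, estimated)"
    )


# Illustrative values only; a real dict would come from ScaleRouter.run().
sample = {
    "rows_generated": 1_000_000,
    "elapsed_seconds": 20.0,
    "throughput_rows_per_sec": 50_000.0,
    "memory_peak_gb": 1.5,
}
summary = summarize_stats(sample)
```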