delta_writer
sqllocks_spindle.output.delta_writer
Delta Lake writer for Spindle output.
Classes
DeltaWriter
Write generated tables as Delta Lake tables.
Uses the deltalake (delta-rs) package, which works both locally and inside
Microsoft Fabric Notebooks without requiring Spark or a JVM.
Install the required extra::

    pip install sqllocks-spindle[fabric]

Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_dir` | `str \| Path \| None` | Root directory for Delta tables. Each table is written to a subdirectory. | `None` |
| `partition_by` | `dict[str, list[str]] \| None` | Per-table partition specs from the schema output config. Keys are table names; values are lists of partition specs. | `None` |
| `mode` | `str` | Write mode. | `'overwrite'` |
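Example::

    from sqllocks_spindle.output import DeltaWriter

    # `result` is assumed to be an existing Spindle generation result;
    # `result.tables` maps table names to DataFrames.
    writer = DeltaWriter(output_dir="./delta_output")
    paths = writer.write_all(result.tables)  # one Path per written table
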
Methods:
write_all(tables, max_workers=4)
Write all tables as Delta Lake tables in parallel.
delta-rs releases the GIL during Parquet/Delta writes, so ThreadPoolExecutor gives genuine parallelism here.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tables` | `dict[str, DataFrame]` | Mapping of table name to DataFrame (the standard Spindle output format). | required |
| `max_workers` | `int` | Maximum number of parallel write threads. | `4` |
Returns:
| Type | Description |
|---|---|
| `list[Path]` | List of paths to the written Delta table directories. |
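
The thread-pool pattern described for `write_all` can be illustrated with a minimal, standard-library-only sketch. This is not the library's implementation: `write_one` and `write_all_parallel` are hypothetical names, and the stand-in writer emits a JSON file per table instead of a Delta table so that the example runs without `deltalake` installed. It only shows the shape of the approach: one task per table, submitted to a `ThreadPoolExecutor`, with the resulting directories collected in input order.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_one(output_dir, table_name, rows):
    """Hypothetical stand-in for a single-table write.

    Writes JSON rather than Delta so the sketch has no third-party
    dependencies. Returns the table's subdirectory.
    """
    table_dir = Path(output_dir) / table_name
    table_dir.mkdir(parents=True, exist_ok=True)
    (table_dir / "data.json").write_text(json.dumps(rows))
    return table_dir


def write_all_parallel(output_dir, tables, max_workers=4):
    """Submit one write task per table and collect the resulting paths."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(write_one, output_dir, name, rows)
            for name, rows in tables.items()
        ]
        # result() re-raises any exception from the worker thread,
        # so a failed write is not silently dropped.
        return [future.result() for future in futures]


# Illustrative data only; real Spindle output would be DataFrames.
tables = {
    "customer": [{"id": 1}],
    "orders": [{"id": 10, "customer_id": 1}],
}

with tempfile.TemporaryDirectory() as tmp:
    paths = write_all_parallel(tmp, tables, max_workers=2)
    names = [p.name for p in paths]
```

Collecting results from the futures list, rather than in completion order, keeps the returned paths aligned with the order of the input mapping.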
write(table_name, df)
Write a single table as a Delta Lake table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `table_name` | `str` | Name of the table (becomes the subdirectory name). | required |
| `df` | `DataFrame` | DataFrame to write. | required |
Returns:
| Type | Description |
|---|---|
| `Path` | Path to the written Delta table directory. |
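
One way to sanity-check a returned path is sketched below. The helper names `is_delta_table` and `load_delta_table` are hypothetical and not part of Spindle. The first relies only on the Delta Lake convention that a table directory contains a `_delta_log` folder with JSON commit files; the second assumes the `deltalake` package (and pandas) is installed and uses its `DeltaTable` class to read the table back.

```python
from pathlib import Path


def is_delta_table(path):
    """Return True if `path` has a `_delta_log` folder with a JSON commit."""
    log_dir = Path(path) / "_delta_log"
    return log_dir.is_dir() and any(log_dir.glob("*.json"))


def load_delta_table(path):
    """Read a Delta table into a pandas DataFrame.

    The import is deferred because `deltalake` is an optional dependency.
    """
    from deltalake import DeltaTable

    return DeltaTable(str(path)).to_pandas()
```

A typical use would be to call `is_delta_table` on each path returned by the writer before handing the directory to downstream tooling.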