transform
sqllocks_spindle.transform
Transform layer — post-processing for star schema and CDM output.
Classes
DimSpec
dataclass
Specification for building one dimension table.
FactSpec
dataclass
Specification for building one fact table.
StarSchemaMap
Describes how to transform a domain result into a star schema.
StarSchemaResult
dataclass
StarSchemaTransform
Transform a 3NF GenerationResult into a star schema using a StarSchemaMap.
Methods:
transform(tables, schema_map)
Apply the star schema transform.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tables` | `dict[str, DataFrame]` | Dict of table_name → DataFrame (from GenerationResult.tables). | required |
| `schema_map` | `StarSchemaMap` | Mapping spec defining dims and facts. | required |
Returns:
| Type | Description |
|---|---|
| `StarSchemaResult` | StarSchemaResult with dimensions, facts, and date_dim. |
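To make the 3NF-to-star reshaping concrete, here is a minimal sketch of the general technique: deduplicate a source table into a dimension with surrogate keys, then rewrite the fact rows to point at those keys. This is not Spindle's implementation and does not use `StarSchemaMap`, `DimSpec`, or `FactSpec`, whose fields are not documented on this page. The helper names `build_dimension` and `build_fact` are hypothetical, and plain lists of dicts stand in for DataFrames to keep the example dependency-free.

```python
def build_dimension(rows, natural_key, attributes, surrogate_key):
    """Deduplicate rows on natural_key and assign 1-based surrogate keys.

    Hypothetical helper for illustration; not part of sqllocks_spindle.
    """
    dim = []
    key_lookup = {}  # natural key value -> surrogate key
    for row in rows:
        nk = row[natural_key]
        if nk in key_lookup:
            continue  # keep the first occurrence of each natural key
        sk = len(dim) + 1
        key_lookup[nk] = sk
        dim.append({surrogate_key: sk, natural_key: nk,
                    **{a: row[a] for a in attributes}})
    return dim, key_lookup


def build_fact(rows, foreign_key, key_lookup, surrogate_key, measures):
    """Replace each row's natural foreign key with the dimension's surrogate key."""
    return [
        {surrogate_key: key_lookup[row[foreign_key]],
         **{m: row[m] for m in measures}}
        for row in rows
    ]


# Two small 3NF tables (sample data made up for this sketch).
customers = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 20, "name": "Lin"},
]
orders = [
    {"order_id": 1, "customer_id": 20, "amount": 5.0},
    {"order_id": 2, "customer_id": 10, "amount": 7.5},
]

dim_customer, lookup = build_dimension(customers, "customer_id", ["name"], "customer_sk")
fact_orders = build_fact(orders, "customer_id", lookup, "customer_sk", ["amount"])
```

In the real API, the choice of keys, attributes, and measures would presumably come from the `StarSchemaMap` passed to `transform`, rather than from positional arguments as in this sketch.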
CdmMapper
Export tables as a Microsoft CDM folder (model.json + data files).
The output folder structure is compatible with Fabric CDM connectors, Dataverse, Power Platform, and Azure Data Lake Storage CDM folders.
Usage::

    from sqllocks_spindle.transform import CdmMapper, CdmEntityMap

    # `result` is assumed to be an existing GenerationResult.
    mapper = CdmMapper()
    mapper.write_cdm_folder(
        tables=result.tables,
        output_dir="./cdm",
        domain_name="SpindleRetail",
        entity_map=CdmEntityMap({"customer": "Contact", "order": "SalesOrder"}),
    )
Methods:
write_cdm_folder(tables, output_dir, domain_name='SpindleOutput', entity_map=None, fmt='csv')
Write a CDM folder to disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `tables` | `dict[str, DataFrame]` | Dict of table_name → DataFrame. | required |
| `output_dir` | `str \| Path` | Root directory for CDM folder output. | required |
| `domain_name` | `str` | The CDM model name (appears in model.json). | `'SpindleOutput'` |
| `entity_map` | `CdmEntityMap \| None` | Optional mapping of table names to CDM entity names. | `None` |
| `fmt` | `str` | Data file format: "csv" (default) or "parquet". | `'csv'` |
Returns:
| Type | Description |
|---|---|
| `list[Path]` | List of written file paths. |
to_model_json(tables, domain_name='SpindleOutput', entity_map=None, fmt='csv')
Generate a model.json manifest dict without writing to disk.
Useful for in-memory CDM metadata generation or Fabric notebook use.
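For orientation, the sketch below builds a dict in the general shape of a CDM `model.json` manifest: a model `name`, a `version`, and a list of `LocalEntity` entries with `attributes` and `partitions`. It is an illustration under assumptions, not the output of `to_model_json`: the function `sketch_model_json` is hypothetical, it takes column types as plain strings instead of DataFrames, and the partition `location` layout is invented for the example.

```python
import json


def sketch_model_json(tables, domain_name="SpindleOutput", fmt="csv"):
    """Build a model.json-style dict from {table_name: {column: data_type}}.

    Hypothetical helper; the fields Spindle actually emits may differ.
    """
    entities = []
    for table_name, columns in tables.items():
        entities.append({
            "$type": "LocalEntity",
            "name": table_name,
            "attributes": [
                {"name": col, "dataType": data_type}
                for col, data_type in columns.items()
            ],
            # Assumed layout: one data file per entity in its own subfolder.
            "partitions": [
                {"name": table_name,
                 "location": f"{table_name}/{table_name}.{fmt}"}
            ],
        })
    return {"name": domain_name, "version": "1.0", "entities": entities}


manifest = sketch_model_json(
    {"customer": {"customer_id": "int64", "name": "string"}},
    domain_name="SpindleRetail",
)
manifest_text = json.dumps(manifest, indent=2)  # serializable without touching disk
```

Because the result is an ordinary dict, it can be inspected, amended, or serialized in a notebook before any files are written.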
CdmEntityMap
Optional mapping from source table names to CDM entity names.
If a table is not in the map, it defaults to PascalCase of the table name.
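The fallback rule can be sketched as follows. This is a guess at the behavior, not the library's code: `to_pascal_case` and `resolve_entity_name` are hypothetical names, and the handling of separators other than underscores is an assumption.

```python
def to_pascal_case(table_name: str) -> str:
    """Convert e.g. 'order_line' to 'OrderLine' (hypothetical helper)."""
    # Assumption: hyphens and spaces are treated like underscores.
    parts = table_name.replace("-", "_").replace(" ", "_").split("_")
    return "".join(part[:1].upper() + part[1:] for part in parts if part)


def resolve_entity_name(table_name, mapping=None):
    """Return the mapped CDM entity name, or the PascalCase default if unmapped."""
    mapping = mapping or {}
    return mapping.get(table_name, to_pascal_case(table_name))


mapping = {"customer": "Contact", "order": "SalesOrder"}
mapped = resolve_entity_name("customer", mapping)       # explicit mapping wins
defaulted = resolve_entity_name("order_line", mapping)  # falls back to PascalCase
```

Under these assumptions, a table named `order_line` that is absent from the map would be exported as `OrderLine`, while `customer` would use the explicit `Contact` mapping.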