comparator
sqllocks_spindle.inference.comparator
Fidelity comparator: compares the quality of synthetic data against real data.
Produces a FidelityReport with per-column and per-table scores (0-100) based on statistical tests, distribution matching, null rates, and cardinality analysis.
Classes
ColumnFidelity
dataclass
Fidelity metrics for a single column.
TableFidelity
dataclass
Fidelity metrics for a table.
FidelityReport
dataclass
Complete fidelity report comparing real vs synthetic data.
Methods:
summary()
Generate a plain-text summary.
to_markdown()
Generate markdown report.
failing_columns(threshold=85.0)
Return (table, column, score) tuples for columns below threshold.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `threshold` | `float` | Score threshold (0-100). Columns with score < threshold are included. | `85.0` |
Returns:
| Type | Description |
|---|---|
| `list[tuple[str, str, float]]` | List of (table_name, column_name, score) tuples, sorted by score (lowest first). |
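The filter-and-sort behaviour described for `failing_columns()` can be sketched with plain tuples. This is an illustrative stand-in rather than the library's implementation: the helper name `failing_columns_sketch`, the `ColumnScore` alias, and the sample scores are all hypothetical.

```python
from typing import Iterable

# (table_name, column_name, score), matching the documented return shape.
ColumnScore = tuple[str, str, float]


def failing_columns_sketch(
    scores: Iterable[ColumnScore], threshold: float = 85.0
) -> list[ColumnScore]:
    """Hypothetical helper: keep scores strictly below threshold, lowest first."""
    failing = [row for row in scores if row[2] < threshold]
    return sorted(failing, key=lambda row: row[2])


# Made-up scores, for illustration only.
sample = [
    ("orders", "amount", 91.0),
    ("orders", "status", 62.5),
    ("customers", "email", 84.9),
    ("customers", "id", 85.0),  # equal to the threshold, so not failing
]

print(failing_columns_sketch(sample))
# → [('orders', 'status', 62.5), ('customers', 'email', 84.9)]
```

Because the comparison is strict (`score < threshold`), a column that scores exactly 85.0 is not reported at the default threshold.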
to_dict()
Return a JSON-serializable dict representation.
to_dataframe()
Return a flat pandas DataFrame with one row per column.
to_html(title='Spindle Fidelity Report')
Render fidelity report as a self-contained HTML page.
Uses inline CSS, so the page has no external dependencies. Score bands: green ≥ 85, amber 70-84, red < 70.
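The score bands can be expressed as a small lookup. The sketch below is a guess at the mapping, not the code `to_html()` uses: `score_band` is a hypothetical name, and it assumes that fractional scores between 84 and 85 fall into the amber band.

```python
def score_band(score: float) -> str:
    """Hypothetical helper mapping a 0-100 fidelity score to a colour band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    if score >= 85.0:
        return "green"
    if score >= 70.0:
        # Assumption: anything from 70 up to (but not including) 85 is amber.
        return "amber"
    return "red"


for value in (92.0, 84.9, 70.0, 55.0):
    print(value, score_band(value))
```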
score(real, synthetic, table_name='table', threshold=85.0)
classmethod
Compare two DataFrames and return a FidelityReport.
Convenience classmethod for single-table comparison.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `real` | `pd.DataFrame` | Real data DataFrame. | required |
| `synthetic` | `pd.DataFrame` | Synthetic data DataFrame to compare. | required |
| `table_name` | `str` | Name for the table in the report (default: "table"). | `'table'` |
| `threshold` | `float` | Score threshold for failing_columns() (default: 85.0). | `85.0` |
Returns:
| Type | Description |
|---|---|
| `FidelityReport` | FidelityReport comparing the two DataFrames. |
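A single-table comparison might be wrapped as follows. This is a minimal sketch that uses only the signatures documented on this page. The wrapper name `compare_single_table` and the `"orders"` table name are hypothetical, and it assumes the module path shown at the top of this page is importable in your environment.

```python
def compare_single_table(real, synthetic, table_name="orders", threshold=85.0):
    """Score one real/synthetic DataFrame pair and list its weakest columns."""
    # Imported lazily so the sketch can be defined without the package installed.
    from sqllocks_spindle.inference.comparator import FidelityReport

    report = FidelityReport.score(
        real, synthetic, table_name=table_name, threshold=threshold
    )
    # failing_columns() returns (table_name, column_name, score) tuples,
    # sorted with the lowest score first.
    return report, report.failing_columns(threshold=threshold)
```

Called with two pandas DataFrames, the wrapper returns the report together with its failing columns. The report can then be rendered with `report.summary()`, `report.to_markdown()`, or `report.to_html()`.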