registry
sqllocks_spindle.profiles.registry
Profile registry — CRUD, search, tagging, bulk import, diff, and reindex.
Classes
ProfileRegistry
Manages named, tagged profiles under a configurable root directory.
Directory layout::

    <root>/
        <system>/
            <table>/
                <profile_name>.json
        _index.json    ← auto-maintained index
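The layout implies a direct mapping between a system/table/name identity and a file path. A minimal sketch of that mapping follows; `profile_path` and `identity_from_path` are hypothetical helpers written for illustration and are not part of ProfileRegistry.

```python
from pathlib import Path


def profile_path(root: Path, identity: str) -> Path:
    """Map a 'system/table/name' identity to its JSON file under root.

    Hypothetical helper; the real registry may resolve paths differently.
    """
    parts = identity.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'system/table/name', got {identity!r}")
    system, table, name = parts
    return root / system / table / f"{name}.json"


def identity_from_path(root: Path, path: Path) -> str:
    """Inverse mapping: recover the identity from a profile file path."""
    system, table, filename = path.relative_to(root).parts
    return f"{system}/{table}/{Path(filename).stem}"


# Round trip for one illustrative identity.
demo_root = Path("registry")
demo_path = profile_path(demo_root, "crm/customers/baseline")
demo_identity = identity_from_path(demo_root, demo_path)
```

Rejecting identities that do not have exactly three non-empty parts keeps stray files from landing outside the `<system>/<table>/` hierarchy.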
Methods:
save(profile)
Save a profile to disk and update the index.
load(identity)
Load a profile by identity (system/table/name).
delete(identity)
Delete a profile from disk and index.
list_all()
Return all index entries sorted by identity.
search(query=None, system=None, table=None, tags=None)
Filter index entries by query string, system, table, and/or tags.
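How the filters combine is not spelled out here. The sketch below shows one plausible reading, assuming that every supplied filter must match (logical AND), that all requested tags must be present, and that the query is a case-insensitive substring match on identity and description. The function `search_entries` and the entry field names are assumptions for illustration, not the library's implementation.

```python
from __future__ import annotations

from typing import Iterable, Optional


def search_entries(
    entries: Iterable[dict],
    query: Optional[str] = None,
    system: Optional[str] = None,
    table: Optional[str] = None,
    tags: Optional[Iterable[str]] = None,
) -> list[dict]:
    """Return entries matching every supplied filter (assumed AND semantics)."""
    wanted = set(tags or ())
    needle = query.lower() if query else None
    results = []
    for entry in entries:
        if system is not None and entry.get("system") != system:
            continue
        if table is not None and entry.get("table") != table:
            continue
        # Assumption: an entry must carry all requested tags, not just one.
        if wanted and not wanted.issubset(entry.get("tags", [])):
            continue
        if needle is not None:
            haystack = f'{entry.get("identity", "")} {entry.get("description", "")}'
            if needle not in haystack.lower():
                continue
        results.append(entry)
    return sorted(results, key=lambda e: e.get("identity", ""))


# Illustrative index entries; the field names are hypothetical.
INDEX = [
    {"identity": "erp/invoices/q1", "system": "erp", "table": "invoices",
     "tags": ["draft"], "description": "First quarter snapshot"},
    {"identity": "crm/orders/baseline", "system": "crm", "table": "orders",
     "tags": ["prod"], "description": "Orders baseline"},
    {"identity": "crm/customers/baseline", "system": "crm", "table": "customers",
     "tags": ["prod", "pii"], "description": "Nightly baseline"},
]

crm_prod = [e["identity"] for e in search_entries(INDEX, system="crm", tags=["prod"])]
by_query = [e["identity"] for e in search_entries(INDEX, query="INVOICE")]
```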
add_tags(identity, tags)
Add tags to a profile (in-place, no duplicates).
remove_tags(identity, tags)
Remove tags from a profile.
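The tag operations can be pictured as order-preserving list updates. The sketch below is a stand-alone illustration: `merge_tags` and `drop_tags` are hypothetical names, and ignoring unknown tags on removal is an assumption rather than documented behavior.

```python
from __future__ import annotations


def merge_tags(existing: list[str], new: list[str]) -> list[str]:
    """Append tags to the list in place, skipping ones already present."""
    for tag in new:
        if tag not in existing:
            existing.append(tag)
    return existing


def drop_tags(existing: list[str], unwanted: list[str]) -> list[str]:
    """Remove tags in place; tags that are not present are ignored (assumed)."""
    drop = set(unwanted)
    existing[:] = [tag for tag in existing if tag not in drop]
    return existing


tags = ["prod"]
after_add = list(merge_tags(tags, ["pii", "prod", "pii"]))
after_remove = list(drop_tags(tags, ["prod", "missing"]))
```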
import_from_dir(source_dir, overwrite=False)
Import all *.json profile files from a directory tree.
Returns a list of imported identity strings.
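A bulk import of this kind can be sketched with the standard library alone. The version below assumes each profile file stores its own system, table, and name fields (the RegistryProfile wrapper fields), that files lacking them are skipped, and that existing profiles are left alone unless overwrite is set. `import_profiles` is a hypothetical stand-in, not the method's actual code.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def import_profiles(root: Path, source_dir: Path, overwrite: bool = False) -> list[str]:
    """Copy profile JSON files from source_dir into the registry layout."""
    imported = []
    for src in sorted(source_dir.rglob("*.json")):
        data = json.loads(src.read_text(encoding="utf-8"))
        try:
            system, table, name = data["system"], data["table"], data["name"]
        except (KeyError, TypeError):
            continue  # assumed rule: not a profile file, so skip it
        dest = root / system / table / f"{name}.json"
        if dest.exists() and not overwrite:
            continue  # keep the stored profile unless overwrite is requested
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(json.dumps(data, indent=2), encoding="utf-8")
        imported.append(f"{system}/{table}/{name}")
    return imported


with tempfile.TemporaryDirectory() as tmp:
    source, root = Path(tmp, "incoming"), Path(tmp, "registry")
    (source / "nested").mkdir(parents=True)
    profile = {"system": "crm", "table": "customers", "name": "baseline"}
    (source / "nested" / "a.json").write_text(json.dumps(profile), encoding="utf-8")
    (source / "notes.json").write_text("[1, 2, 3]", encoding="utf-8")
    first = import_profiles(root, source)
    second = import_profiles(root, source)  # nothing new without overwrite
    forced = import_profiles(root, source, overwrite=True)
```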
diff(identity_a, identity_b)
Compare two profiles column by column.
Returns a dict with keys: added, removed, changed.
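The shape of the values under each key is not described here. As a rough sketch, assuming each profile exposes a mapping of column name to a dict of statistics, that "added" means present only in the second profile, and that "changed" maps a column to its differing statistics, a column-by-column comparison could look like this. `diff_columns` is a hypothetical helper.

```python
from __future__ import annotations


def diff_columns(cols_a: dict[str, dict], cols_b: dict[str, dict]) -> dict:
    """Compare two column-statistics mappings (assumed profile shape)."""
    added = sorted(set(cols_b) - set(cols_a))    # only in B
    removed = sorted(set(cols_a) - set(cols_b))  # only in A
    changed = {}
    for name in sorted(set(cols_a) & set(cols_b)):
        a, b = cols_a[name], cols_b[name]
        # Record (old, new) for every statistic whose value differs.
        delta = {
            key: (a.get(key), b.get(key))
            for key in sorted(set(a) | set(b))
            if a.get(key) != b.get(key)
        }
        if delta:
            changed[name] = delta
    return {"added": added, "removed": removed, "changed": changed}


profile_a = {
    "id": {"dtype": "int", "null_rate": 0.0},
    "email": {"dtype": "str", "null_rate": 0.1},
    "fax": {"dtype": "str", "null_rate": 0.9},
}
profile_b = {
    "id": {"dtype": "int", "null_rate": 0.0},
    "email": {"dtype": "str", "null_rate": 0.2},
    "phone": {"dtype": "str", "null_rate": 0.3},
}
result = diff_columns(profile_a, profile_b)
```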
reindex()
Rebuild _index.json from all .json files on disk. Returns the number of profiles indexed.
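A rebuild of this kind amounts to walking the directory tree and rewriting the index from what is found. The sketch below assumes `_index.json` sits directly under the root, that only files at `<system>/<table>/<name>.json` depth count as profiles, and that each index entry stores identity, system, table, and tags. `rebuild_index` is a hypothetical stand-in, and the entry fields are illustrative.

```python
import json
import tempfile
from pathlib import Path


def rebuild_index(root: Path) -> int:
    """Rewrite root/_index.json from the profile files found on disk."""
    entries = []
    for path in root.rglob("*.json"):
        rel = path.relative_to(root).parts
        if len(rel) != 3:
            continue  # skips _index.json and anything outside the layout
        system, table, _ = rel
        data = json.loads(path.read_text(encoding="utf-8"))
        entries.append({
            "identity": f"{system}/{table}/{path.stem}",
            "system": system,
            "table": table,
            "tags": data.get("tags", []),
        })
    entries.sort(key=lambda e: e["identity"])
    (root / "_index.json").write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return len(entries)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "crm" / "customers").mkdir(parents=True)
    (root / "crm" / "customers" / "baseline.json").write_text(
        '{"tags": ["prod"]}', encoding="utf-8"
    )
    (root / "_index.json").write_text("[]", encoding="utf-8")  # stale index
    count = rebuild_index(root)
    recount = rebuild_index(root)  # running again gives the same result
    index = json.loads((root / "_index.json").read_text(encoding="utf-8"))
```

Filtering on path depth rather than file name means the index file never indexes itself, whatever it is called.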
save_from_dataset_profile(dataset_profile, system, name, tags=None, description='', config=None)
Convert a DatasetProfile into registry profiles via the SafeProfile mapper.
STORY-014 (ADR-001): per-column stats are now built through
SafeProfile.from_dataset_profile, the canonical safe-and-correct mapper.
This replaces the old hand-read of the non-existent .min/.max/.top_values
attributes (the B2 attribute-mismatch bug). Registry profiles therefore
carry the SAFE statistic set (dtype, null_rate, cardinality, mean, std,
quantiles, bounds, categorical_weights, categorical_histogram) and no raw
values.
The RegistryProfile wrapper (system, table, name, tags, description,
source_rows) is unchanged, so the registry read side (load, diff, tag,
reindex) is unaffected: there is no on-disk format break and no sidecar.
Legacy registry files (old min/max/top_values columns) still load as-is.
config is forwarded to the SafeProfile mapper (e.g. k, sensitive).
One RegistryProfile is created per table. Returns the saved profiles.
validate(identity, result, sample_rows=500)
Compare a GenerationResult against a stored profile.
Reconstructs an approximate reference DataFrame from stored column statistics and runs FidelityComparator against the new generation. Returns a FidelityReport. Requires scipy.
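The reconstruction step can be pictured as drawing synthetic values from the stored statistics. The sketch below is only an illustration of that idea for one numeric column: it assumes the stored stats include mean, std, and bounds as a (low, high) pair, samples from a normal distribution, and clips to the bounds. The actual reconstruction and the FidelityComparator logic may differ; `approximate_column` is a hypothetical helper.

```python
from __future__ import annotations

import random


def approximate_column(stats: dict, sample_rows: int = 500, seed: int = 0) -> list[float]:
    """Draw a synthetic numeric column from stored summary statistics.

    Assumption: a clipped normal distribution is an acceptable stand-in
    for the original column; the real registry may use quantiles instead.
    """
    rng = random.Random(seed)  # seeded so the reference is reproducible
    low, high = stats["bounds"]
    values = []
    for _ in range(sample_rows):
        draw = rng.gauss(stats["mean"], stats["std"])
        values.append(min(max(draw, low), high))
    return values


stored_stats = {"mean": 50.0, "std": 10.0, "bounds": (0.0, 100.0)}
reference = approximate_column(stored_stats)
```

A reference built this way matches the stored profile only in aggregate, so any comparison against it measures distributional similarity rather than row-level equality.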