safe_profile_adapter

sqllocks_spindle.inference.safe_profile_adapter

SafeProfileAdapter — bridge a loaded SafeProfile into the generator (STORY-005).

A persisted :class:~sqllocks_spindle.inference.safe_profile.SafeProfile (written and re-read via :class:~sqllocks_spindle.inference.profile_store.ProfileStore) carries ONLY the safe-and-sufficient statistic set (ADR-001 / ADR-007): no raw min_value / max_value / enum_values / value_counts_ext. Numeric extremes live in winsorized bounds (ADR-002); categorical mass lives in categorical_weights with sub-k values folded into __OTHER__ (ADR-003).

The generation engine (engine/generator.py) consumes a :class:~sqllocks_spindle.schema.parser.SpindleSchema whose per-column generator dicts name a registered strategy. This module is the adapter that maps a loaded SafeProfile to such a schema, selecting generator strategies that consume the safe statistics:

  • numeric with a fitted distribution + distribution_params → distribution strategy, with bounds threaded in as min/max so the engine clips regenerated values to the winsorized bounds (ADR-002).
  • numeric with quantiles but no usable distribution → empirical strategy (quantile interpolation), still clipped to bounds.
  • categorical with categorical_weights → weighted_enum strategy (samples the post-suppression weights, __OTHER__ included).
  • date/datetime → temporal strategy.
  • string → faker strategy (name-heuristic provider; pattern-only PII columns are refined in STORY-008).
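
The selection rules in the list above can be sketched as a single dispatch function. This is only an illustration, not the module's actual code: ColumnStats, its field names, the dtype labels, the keys of the returned generator dict, and the ValueError fallback are all assumptions made for the example. Only the strategy names (distribution, empirical, weighted_enum, temporal, faker) and the min/max clipping come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class ColumnStats:
    """Hypothetical stand-in for one column of a loaded SafeProfile."""
    dtype: str
    distribution: Optional[str] = None
    distribution_params: dict = field(default_factory=dict)
    quantiles: dict = field(default_factory=dict)
    bounds: Optional[Tuple[float, float]] = None
    categorical_weights: dict = field(default_factory=dict)


def select_generator(col: ColumnStats) -> dict:
    """Pick a generator dict from safe statistics only (illustrative keys)."""
    if col.dtype in ("integer", "float"):
        # Winsorized bounds become min/max so regenerated values get clipped.
        clip = {}
        if col.bounds is not None:
            clip = {"min": col.bounds[0], "max": col.bounds[1]}
        if col.distribution and col.distribution_params:
            return {"strategy": "distribution",
                    "distribution": col.distribution,
                    "params": dict(col.distribution_params), **clip}
        if col.quantiles:
            return {"strategy": "empirical",
                    "quantiles": dict(col.quantiles), **clip}
    if col.categorical_weights:
        # Post-suppression weights are sampled as-is, __OTHER__ included.
        return {"strategy": "weighted_enum",
                "values": dict(col.categorical_weights)}
    if col.dtype in ("date", "datetime"):
        return {"strategy": "temporal"}
    if col.dtype == "string":
        return {"strategy": "faker"}
    # Assumption of this sketch: refuse rather than guess a strategy.
    raise ValueError(f"no safe statistics to drive a generator for {col.dtype!r}")


amount = ColumnStats(dtype="float", distribution="normal",
                     distribution_params={"mean": 10.0, "std": 2.0},
                     bounds=(1.0, 20.0))
status = ColumnStats(dtype="string",
                     categorical_weights={"open": 0.7, "__OTHER__": 0.3})
print(select_generator(amount)["strategy"])  # distribution
print(select_generator(status)["strategy"])  # weighted_enum
```

In this sketch a string column that carries categorical_weights is treated as categorical before the faker rule is reached; the real adapter may order these checks differently.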

The loaded SafeProfile ALSO satisfies the structural contract the generator's fidelity_profile= path reads (.tables[t].row_count, .tables[t].columns[c].null_rate / .cardinality), so the SAME loaded object can be passed straight back in as fidelity_profile to obtain a FidelityReport — no live in-memory DatasetProfile required.
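
The structural contract described above is a duck-typed one: any object exposing those attributes would do. The sketch below reads exactly the attributes named in the contract from a stand-in object. The helper read_fidelity_inputs is hypothetical, and treating .tables and .columns as dicts (so they can be iterated with .items()) is an assumption; the text only shows that they can be indexed by name.

```python
from types import SimpleNamespace


def read_fidelity_inputs(profile):
    """Collect only the attributes the structural contract names."""
    summary = {}
    for table_name, table in profile.tables.items():
        summary[table_name] = {
            "row_count": table.row_count,
            "columns": {
                col_name: {"null_rate": col.null_rate,
                           "cardinality": col.cardinality}
                for col_name, col in table.columns.items()
            },
        }
    return summary


# A stand-in with the same shape; a loaded SafeProfile would be passed instead.
fake_profile = SimpleNamespace(tables={
    "orders": SimpleNamespace(
        row_count=100,
        columns={"status": SimpleNamespace(null_rate=0.0, cardinality=4)},
    )
})
print(read_fidelity_inputs(fake_profile))
```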

Scope (STORY-005): make profile → save → load → generate run end-to-end and shape-correct. Real winsorization (STORY-006), real k-anon suppression (STORY-007), the pattern-only PII gate (STORY-008) and the >=90% fidelity assertion (STORY-011) are owned by their stories; the STORY-002 stubs supply bounds / categorical_weights here.

Classes

SafeProfileAdapter

Adapt a loaded :class:SafeProfile to a generatable :class:SpindleSchema.

Stateless; instantiate and call :meth:to_schema, or use the module-level :func:safe_profile_to_schema convenience wrapper.

Methods:
to_schema(profile, domain_name='safe_inferred')

Build a :class:SpindleSchema from a loaded :class:SafeProfile.

The returned schema is ready to pass to Spindle().generate(schema=..., fidelity_profile=profile).
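
A minimal sketch of that call sequence, written against stand-ins so it runs without the package installed. FakeAdapter, FakeSpindle, the regenerate helper, and the dict-shaped profile are hypothetical; only the to_schema(profile, domain_name=...) and generate(schema=..., fidelity_profile=...) call shapes are taken from this page, and the return value of the real generate is not assumed here.

```python
class FakeAdapter:
    """Hypothetical stand-in exposing the documented to_schema signature."""

    def to_schema(self, profile, domain_name="safe_inferred"):
        return {"domain": domain_name, "tables": sorted(profile["tables"])}


class FakeSpindle:
    """Hypothetical stand-in accepting the documented generate keywords."""

    def generate(self, schema, fidelity_profile=None):
        return {"schema": schema, "scored_against": fidelity_profile}


def regenerate(profile, adapter, spindle, domain_name="safe_inferred"):
    """Map the loaded profile to a schema, then generate against the same object."""
    schema = adapter.to_schema(profile, domain_name=domain_name)
    # The same loaded profile doubles as the fidelity reference.
    return spindle.generate(schema=schema, fidelity_profile=profile)


loaded = {"tables": {"orders": {}, "customers": {}}}
result = regenerate(loaded, FakeAdapter(), FakeSpindle())
print(result["schema"]["domain"])  # safe_inferred
```

With the real package, SafeProfileAdapter() and Spindle() would take the place of the two stand-ins, and the profile would come from ProfileStore.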

Raw fields are never consulted (there are none on the safe model); numeric clipping is driven by the winsorized bounds (ADR-002).
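
To make the clipping concrete, here is a rough sketch of what sampling by quantile interpolation and then clipping to winsorized bounds could look like. It is not the engine's implementation: sample_empirical is a hypothetical helper, and storing quantiles as a mapping from probability to value is an assumption about the data layout.

```python
import random


def sample_empirical(quantiles, bounds, rng):
    """Draw one value by linear interpolation between stored quantiles,
    then clip it to the (low, high) winsorized bounds."""
    probs = sorted(quantiles)
    u = rng.uniform(probs[0], probs[-1])
    value = quantiles[probs[0]]  # used when only one quantile is stored
    for lo, hi in zip(probs, probs[1:]):
        if lo <= u <= hi:
            frac = (u - lo) / (hi - lo) if hi > lo else 0.0
            value = quantiles[lo] + frac * (quantiles[hi] - quantiles[lo])
            break
    low, high = bounds
    return min(max(value, low), high)


rng = random.Random(0)
stored = {0.0: 0.0, 0.5: 10.0, 1.0: 100.0}
draws = [sample_empirical(stored, (5.0, 50.0), rng) for _ in range(200)]
print(min(draws) >= 5.0, max(draws) <= 50.0)  # True True
```

Because the clip is applied last, no regenerated value can fall outside the stored bounds, whatever the interpolation produces.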

Functions:

safe_profile_to_schema(profile, domain_name='safe_inferred')

Convenience wrapper around :meth:SafeProfileAdapter.to_schema.