To get started, install the sdnist Python package:
# Optionally create a virtualenv
# and install the package from pypi.org
pip install sdnist importlib_resources
You can then use the library to evaluate a synthetic dataset:
import sdnist
# Fetch data
dataset, schema = sdnist.census()
# Synthesize data
# (replace this line with your synthetic data generator function)
synthetic = dataset.sample(n=20000)
# Evaluate your synthetic data
result = sdnist.score(dataset, synthetic, schema, challenge="census")
# Print the score
print(result.score)
# Display the results on a map
result.html()
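The "Synthesize data" step above is where your own generator plugs in. As a rough illustration of what such a function might look like, here is a minimal sketch that resamples every column independently with pandas. The function name `sample_columns_independently` and the toy data are hypothetical and not part of sdnist. The approach keeps each column's marginal distribution but discards correlations between columns, and it is not differentially private.

```python
import pandas as pd


def sample_columns_independently(data: pd.DataFrame, n: int, seed: int = 0) -> pd.DataFrame:
    """Hypothetical baseline generator: draw n values per column, with replacement.

    Each column is sampled on its own, so per-column distributions are roughly
    preserved while relationships between columns are lost.
    This provides NO differential privacy guarantee.
    """
    columns = {}
    for offset, name in enumerate(data.columns):
        # Use a different seed per column so the columns are not sampled in lockstep.
        column = data[name].sample(n=n, replace=True, random_state=seed + offset)
        # Drop the original row labels so the columns align by position.
        columns[name] = column.reset_index(drop=True)
    return pd.DataFrame(columns)


# Toy data standing in for the real dataset (values are made up for illustration).
toy = pd.DataFrame({"age": [23, 35, 47, 59], "state": ["OH", "IL", "OH", "IL"]})
toy_synthetic = sample_columns_independently(toy, n=6)
print(toy_synthetic)
```

In the evaluation snippet, a function like this would stand in for `dataset.sample(n=20000)`, assuming its output keeps the same columns and value ranges that the schema describes.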
You can also submit the generative model itself:
# You can also subclass sdnist.challenge.submission.Model
from sdnist.challenge.submission import run
from sdnist.challenge.subsample import SubsampleModel
model = SubsampleModel()
run(model, challenge="census")
Running the model this way produces a score for each of several levels of privacy loss (ε).
The results can be displayed on a map to see where the synthetic data model performed best.
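For readers less familiar with the privacy loss parameter, the sketch below shows how ε typically trades privacy for accuracy, using the standard Laplace mechanism on a single count. This is a generic illustration and not how sdnist or any particular challenge model works. The helper names and the ε values are chosen for illustration only.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale of the Laplace mechanism: sensitivity divided by epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise (a counting query has sensitivity 1)."""
    scale = laplace_scale(1.0, epsilon)
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(0)
for epsilon in (0.1, 1.0, 10.0):  # illustrative values only
    print(f"epsilon={epsilon}: scale={laplace_scale(1.0, epsilon)}, "
          f"noisy count={noisy_count(1000, epsilon, rng):.2f}")
```

Smaller ε means a larger noise scale and stronger privacy, which is why a model's score is generally expected to drop as ε decreases.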
Examples that use sdnist to evaluate several of the top-performing generative models from the Differential Privacy Temporal Map Challenge are available on GitHub.