schneiderkamplab / syntheval
Software for evaluating the quality of synthetic data compared with real data.
License: MIT License
In get_cat_variables in variable_detection.py in the utils folder, the code checks whether a column's dtype is object and, if so, appends the column to the cat_variables list, which is the list of categorical features of my dataset. Why does it also check for the dtypes int64 and float64? Shouldn't those columns belong to the list of numerical features, num_cols?
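One plausible reading of such a check, offered as an assumption and not as a description of syntheval's actual code: integer or float columns with only a few distinct values (binary flags, encoded classes) are often better treated as categorical. The sketch below illustrates that heuristic. The helper name split_columns_by_kind and the max_unique parameter are hypothetical and do not come from the library.

```python
import pandas as pd


def split_columns_by_kind(df: pd.DataFrame, max_unique: int = 10):
    """Hypothetical helper: split columns into categorical and numerical lists.

    Assumption: a numeric column with fewer than `max_unique` distinct
    values is treated as categorical (for example a 0/1 flag).
    """
    cat_cols, num_cols = [], []
    for col in df.columns:
        if not pd.api.types.is_numeric_dtype(df[col]):
            # Strings and other non-numeric columns are categorical.
            cat_cols.append(col)
        elif df[col].nunique() < max_unique:
            # Low-cardinality numeric columns are treated as categorical too.
            cat_cols.append(col)
        else:
            num_cols.append(col)
    return cat_cols, num_cols


# Toy data: one string column, one integer flag, one integer measurement.
df = pd.DataFrame({
    "city": ["A", "B", "A", "C"] * 5,
    "has_disease": [0, 1, 0, 1] * 5,
    "age": list(range(20, 40)),
})
cat_cols, num_cols = split_columns_by_kind(df, max_unique=10)
print(cat_cols, num_cols)
```

Under this heuristic the integer flag lands in the categorical list even though its dtype is int64, while the high-cardinality age column stays numerical.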
Hello! Firstly, thank you for creating this wonderful library!
I had a question regarding the DCR privacy metric implementation.
In your paper and code, you compare the median DCR of (synthetic, train) and (train, train) distances.
I was wondering if you considered comparing the median DCR of (synthetic, train) and (holdout, train)? I have usually seen this version (page 4 in ref), and I was wondering if you had any insights into why (train, train) was used.
I am using the DCR metric in a project and wanted to know if a holdout dataset should be used or not.
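To make the two baselines concrete, here is a minimal NumPy sketch of a median distance-to-closest-record (DCR) comparison. It assumes Euclidean distance on already-numeric, scaled features and reports the comparison as a ratio of medians; both choices are assumptions for illustration. The function names nearest_distances and median_dcr_ratio are hypothetical and are not syntheval's API.

```python
import numpy as np


def nearest_distances(queries, reference, exclude_self=False):
    """Distance from each query row to its closest row in `reference`."""
    diffs = queries[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    if exclude_self:
        # Only meaningful when queries and reference are the same set:
        # ignore the zero distance of each record to itself.
        np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1)


def median_dcr_ratio(candidate, train, baseline=None):
    """Median DCR(candidate, train) divided by a baseline median DCR.

    baseline=None  -> (train, train) baseline, self-matches excluded.
    baseline=array -> (holdout, train) baseline.
    """
    cand = np.median(nearest_distances(candidate, train))
    if baseline is None:
        base = np.median(nearest_distances(train, train, exclude_self=True))
    else:
        base = np.median(nearest_distances(baseline, train))
    return cand / base


# Toy data standing in for real, holdout and synthetic tables.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 3))
holdout = rng.normal(size=(100, 3))
synth = rng.normal(size=(200, 3))

ratio_train_baseline = median_dcr_ratio(synth, train)
ratio_holdout_baseline = median_dcr_ratio(synth, train, baseline=holdout)
print(ratio_train_baseline, ratio_holdout_baseline)
```

The only difference between the two variants is the denominator. The (train, train) baseline needs no extra data, while the (holdout, train) baseline requires setting records aside before fitting the generator.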
May I ask if there is any guide or documentation on how to interpret the metrics? For example, I am trying to figure out whether a higher score in Nearest Neighbour Adversarial Accuracy represents higher similarity between my synthetic and real data, and vice versa.
Or would it be possible to provide a list of references to the original proposals of the metrics?
Thank you!
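As a rough aid for the interpretation question, the sketch below implements nearest neighbour adversarial accuracy as it is commonly defined: the average of two fractions, each counting how often a record's nearest neighbour in the other set is farther away than its nearest neighbour in its own set. This assumes Euclidean distance on numeric features. It may not match how syntheval computes or reports the score, and nn_adversarial_accuracy is a hypothetical name.

```python
import numpy as np


def _nn_dist(a, b, exclude_self=False):
    """Distance from each row of `a` to its nearest row in `b`."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    if exclude_self:
        # Leave-one-out: a record must not count as its own neighbour.
        np.fill_diagonal(d, np.inf)
    return d.min(axis=1)


def nn_adversarial_accuracy(real, synth):
    """Nearest neighbour adversarial accuracy under the stated assumptions."""
    d_rs = _nn_dist(real, synth)                     # real -> synthetic
    d_rr = _nn_dist(real, real, exclude_self=True)   # real -> other real
    d_sr = _nn_dist(synth, real)                     # synthetic -> real
    d_ss = _nn_dist(synth, synth, exclude_self=True) # synthetic -> other synthetic
    return 0.5 * (np.mean(d_rs > d_rr) + np.mean(d_sr > d_ss))


rng = np.random.default_rng(0)
real = rng.normal(size=(100, 2))

score_far = nn_adversarial_accuracy(real, real + 100.0)  # clearly different data
score_copy = nn_adversarial_accuracy(real, real.copy())  # exact copies of real data
print(score_far, score_copy)
```

Under this definition a higher score does not mean higher similarity. A score near 1 means the two sets are easy to tell apart, a score near 0 means the synthetic records sit closer to the real records than real records sit to each other (a sign of copying), and a score near 0.5 means the two sets are hard to distinguish.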