diffix / syndiffix-fs
F# implementation of the SynDiffix synthetic data generation mechanism.
Home Page: https://www.open-diffix.org
License: Other
Currently missing dependence measure, clustering, and stitching.
I wonder if there is a problem with boolean columns in syndiffix.
I ran the following experiment using alarms.csv, which has a number of boolean columns.
I made three datasets (they are in abData/csvTest):
alarm.csv: The original alarm.csv (with booleans).
alarm_change_csv.csv: All of the True/False values changed to 'TVAL' and 'FVAL'.
alarm_change_results.csv: Also the original alarm.csv with booleans.
After I built the results files (abData/resultsTest), I manually modified the results file for alarm_change_results, changing all of the true and false values to "TVAL" and "FVAL". I also modified the csvOrder.json file for alarm_change_results.csv, changing every boolean label to text.
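The manual edit described above could also be scripted. The following is a minimal sketch, not part of the repository: it assumes that boolean cells appear as the literal strings True/False or true/false and that no other column legitimately contains those strings. The function names are hypothetical.

```python
import csv
import io

# Replacement strings taken from the experiment description.
REPLACEMENTS = {"true": "TVAL", "false": "FVAL"}


def replace_booleans(src, dst):
    """Copy CSV rows from src to dst, rewriting boolean literals as text."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to copy
    writer.writerow(header)  # the header row is copied unchanged
    for row in reader:
        # Match case-insensitively so both True/False and true/false are caught.
        writer.writerow([REPLACEMENTS.get(cell.strip().lower(), cell) for cell in row])


def convert(text):
    """Convenience wrapper that converts CSV text held in memory."""
    src = io.StringIO(text)
    dst = io.StringIO()
    replace_booleans(src, dst)
    return dst.getvalue()
```

For files on disk, replace_booleans can be called with two handles opened with newline="" as the csv module recommends. The csvOrder.json change is not covered here because its layout is not shown in the issue.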
Then I ran quality measures on all three (see abData/measuresTest).
The 1dim quality measures for alarm.csv and alarm_change_results.csv are very bad. The 1dim quality measures for alarm_change_csv.csv are good.
The only reason I can think of for these results is that syndiffix itself is doing something wrong with boolean values. alarm_change_results.csv has no booleans, and its TVAL and FVAL values are treated as text, so the quality measure encountered no booleans for that dataset. The bad score is therefore not caused by the quality measure per se.
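One way to cross-check this independently of the project's quality measure is to compare the relative frequency of each value in a column between the original and the synthetic CSV. This is only a rough sanity check under stated assumptions: it is not the 1dim measure used above, the function names are hypothetical, and values are compared as raw strings, so True and true would count as different values unless normalized first.

```python
import csv
import io
from collections import Counter


def value_shares(src, column):
    """Return the relative frequency of each value in one CSV column."""
    counts = Counter(row[column] for row in csv.DictReader(src))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}


def max_share_gap(original, synthetic):
    """Largest absolute difference in relative frequency over all values."""
    values = set(original) | set(synthetic)
    return max(
        (abs(original.get(v, 0.0) - synthetic.get(v, 0.0)) for v in values),
        default=0.0,
    )


# Tiny in-memory example; real use would pass open file handles instead.
original = value_shares(io.StringIO("flag\nTVAL\nTVAL\nFVAL\nFVAL\n"), "flag")
synthetic = value_shares(io.StringIO("flag\nTVAL\nTVAL\nTVAL\nFVAL\n"), "flag")
gap = max_share_gap(original, synthetic)
```

A large gap on the boolean columns of the synthetic output, together with a small gap on the same columns after the TVAL/FVAL conversion, would be consistent with the suspicion described above.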
The current design bumps the dependency measurements of the main column with other columns and relies on the clustering solver to select it for stitching and the main cluster.
In some edge cases, that might not happen, so it is better to include it explicitly instead.
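The idea of including the main column explicitly can be illustrated with a toy sketch. It does not reflect the solver's real data structures: a cluster is modeled here as a plain list of column names, and the function name is hypothetical.

```python
def with_main_column(cluster_columns, main_column):
    """Return the cluster's columns with main_column guaranteed to be present."""
    if main_column in cluster_columns:
        # Already selected by the solver; keep the existing order.
        return list(cluster_columns)
    # Not selected: add it explicitly instead of relying on the boosted
    # dependency scores to bring it in.
    return [main_column, *cluster_columns]
```

Applied after the solver runs, a step like this makes the presence of the main column a guarantee rather than a likely outcome of the scoring.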