Comments (6)
How about side-by-side comparison of anonymous and raw data?
Yes, that's a good point!
That can be addressed in any number of ways really. Let me wire up something.
Ok, I think this is likely where the majority of effort needs to be spent...
The hardest thing to do well might be a version that shows both the anonymized and the raw data inline in the same table and highlights the data/rows that were removed due to anonymization. I tried something like this when I made the playground version of reference earlier.
Conceptually much simpler would be to show the raw and the anonymized data in separate tabs that can be toggled between.
A more advanced version of the tabs would be two tables (either below each other or next to each other) that scroll in sync, one anonymized and the other raw. That gives a clear "side-by-side" view of what the anonymization does.
The absolute simplest option (other than doing nothing) would be to show some simple stats, such as "10% of the rows were suppressed".
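As a rough illustration of the "simple stats" option, here is a minimal Python sketch. It assumes both query results are available as dictionaries mapping a bucket key (for example `(age, gender)`) to a count; the function name `suppression_summary`, the field names, and the sample numbers are all hypothetical and not taken from the project.

```python
from typing import Dict, Tuple

# Hypothetical representation: a bucket key is a tuple of column values.
Bucket = Tuple[object, ...]


def suppression_summary(raw: Dict[Bucket, int], anon: Dict[Bucket, int]) -> Dict[str, float]:
    """Summarize how much of the raw result is missing from the anonymized one."""
    # A bucket counts as suppressed when it exists in the raw result
    # but has no counterpart in the anonymized result.
    suppressed = [key for key in raw if key not in anon]
    suppressed_rows = sum(raw[key] for key in suppressed)
    total_rows = sum(raw.values())
    return {
        "suppressed_buckets": len(suppressed),
        "suppressed_bucket_share": len(suppressed) / len(raw) if raw else 0.0,
        "suppressed_row_share": suppressed_rows / total_rows if total_rows else 0.0,
    }


# Made-up sample data, for illustration only.
raw = {(20, "F"): 50, (21, "M"): 2, (22, "F"): 48}
anon = {(20, "F"): 52, (22, "F"): 47}

summary = suppression_summary(raw, anon)
print(f"{summary['suppressed_row_share']:.0%} of the rows were suppressed")
```

The row share is computed against the raw counts, since the anonymized counts are noisy and the suppressed buckets have no anonymized count at all.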
I think that in any event we'll want a stats summary:
- number of missing buckets
- average error
But my preference, if it isn't too unwieldy, would be for a table that shows the diffix aggregate value, the true aggregate value, and an indication of error (possibly color coded).
| age | gender | diffix count | true count | error |
|---|---|---|---|---|
| 10 | F | 42 | 39 | 16% |
| 11 | M | --- | 2 | --- |
etc.
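A table like the one above could be assembled roughly as in the following Python sketch. It assumes the same dictionary-per-result representation (bucket key to count) and defines error as `|diffix count - true count| / true count`, which is an assumption, since the thread does not say how error should be computed. The names `compare_buckets` and `average_error` and the sample numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a bucket key is a tuple of column values.
Bucket = Tuple[object, ...]


def compare_buckets(raw: Dict[Bucket, int], anon: Dict[Bucket, int]) -> List[dict]:
    """Build one comparison row per raw bucket, marking suppressed buckets."""
    rows = []
    for key, true_count in raw.items():
        anon_count: Optional[int] = anon.get(key)  # None means the bucket was suppressed
        if anon_count is None or true_count == 0:
            error = None  # no meaningful error for suppressed or empty buckets
        else:
            # Assumed definition: relative error against the true count.
            error = abs(anon_count - true_count) / true_count
        rows.append({
            "bucket": key,
            "anon_count": anon_count,
            "true_count": true_count,
            "error": error,
        })
    return rows


def average_error(rows: List[dict]) -> Optional[float]:
    """Average the relative error over buckets that were not suppressed."""
    errors = [row["error"] for row in rows if row["error"] is not None]
    return sum(errors) / len(errors) if errors else None


# Made-up sample data, for illustration only.
raw = {(20, "F"): 50, (21, "M"): 2}
anon = {(20, "F"): 52}

rows = compare_buckets(raw, anon)
missing_buckets = sum(1 for row in rows if row["anon_count"] is None)
mean_error = average_error(rows)
```

Suppressed buckets are kept as rows with empty values rather than dropped, so the same list can feed both the per-bucket table and the summary stats (number of missing buckets, average error).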
Sure, that's possible. The only slightly tricky part is in controlling for the rows that were suppressed. It's doable though!
This issue is obsolete, closing.