myanon's People

Contributors

asgrim, pierrepomes, ppomes, sjourdan, trilliot

myanon's Issues

Using mysqldump with `--hex-blob` flag breaks myanon

Given the following SQL:

DROP TABLE IF EXISTS `the_blobs`;
CREATE TABLE `the_blobs` (
  `blob1` blob,
  `blob2` tinyblob,
  `blob3` mediumblob,
  `blog4` longblob
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `the_blobs` VALUES (
  0x68656c6c6f,
  0x68656c6c6f,
  0x68656c6c6f,
  0x68656c6c6f
);

And the following myanon configuration:

# Config file for test1.sql
secret = 'lapin'
stats  = 'no'

tables = {
   `the_blobs` = {
     `blob1`     = fixed '0x0000000000'
   }
}

When I run `build/main/myanon -f tests/test1.conf < tests/test1.sql`, I expect to see:

DROP TABLE IF EXISTS `the_blobs`;
CREATE TABLE `the_blobs` (
  `blob1` blob,
  `blob2` tinyblob,
  `blob3` mediumblob,
  `blog4` longblob
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `the_blobs` VALUES (
  0x0000000000,
  0x68656c6c6f,
  0x68656c6c6f,
  0x68656c6c6f
);

But I actually see:

DROP TABLE IF EXISTS `the_blobs`;
CREATE TABLE `the_blobs` (
  `blob1` blob,
  `blob2` tinyblob,
  `blob3` mediumblob,
  `blog4` longblob
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO `the_blobs` VALUES (
  0x0000000000x68656c6c6f,
  0x68656c6c6f,
  0x68656c6c6f,
  0x68656c6c6f
);

Dump parsing error at line 7: syntax error - Unexpected []
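
The doubled `0x` in the actual output is consistent with the replacement being written when only the leading `0` had been read, with the remaining `x68656c6c6f` passed through; that reading is a guess, not something confirmed from myanon's source. A minimal sketch of the expected behaviour, assuming the whole hex literal is matched as a single token (the function name `replace_first_hex_literal` is hypothetical and not part of myanon):

```python
import re

# A hex literal as written by mysqldump --hex-blob: "0x" followed by hex digits.
HEX_LITERAL = re.compile(r"0x[0-9A-Fa-f]+")


def replace_first_hex_literal(values_sql: str, replacement: str) -> str:
    """Replace the first complete hex literal in a VALUES list."""
    # A callable replacement avoids any backslash processing by re.sub.
    return HEX_LITERAL.sub(lambda _match: replacement, values_sql, count=1)


row = "(0x68656c6c6f,0x68656c6c6f)"
print(replace_first_hex_literal(row, "0x0000000000"))
# prints (0x0000000000,0x68656c6c6f)
```

Because the pattern consumes the `0x` prefix and all following hex digits together, nothing of the original literal is left behind after the substitution.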

Config errors

It would be nice to get some feedback when a column specified in the config file does not exist, rather than having it silently ignored. I had misspelled one column name, which could have leaked PII (exactly what we are trying to avoid).
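
The kind of check being asked for can be sketched outside myanon as a pre-flight step. This is an illustration only: the helper names `columns_in_create_table` and `unknown_columns` are hypothetical, and the sketch assumes the mysqldump layout where each column definition sits on its own line and starts with a backtick-quoted name.

```python
import re


def columns_in_create_table(create_sql: str) -> set:
    """Collect backtick-quoted names that begin a line of a CREATE TABLE body."""
    # Lines such as "KEY `idx` (...)" do not start with a backtick, so they are skipped.
    return set(re.findall(r"^\s*`([^`]+)`\s", create_sql, flags=re.MULTILINE))


def unknown_columns(configured: list, create_sql: str) -> list:
    """Return the configured column names that the table does not define."""
    known = columns_in_create_table(create_sql)
    return [name for name in configured if name not in known]


create_sql = (
    "CREATE TABLE `people` (\n"
    "  `id` int NOT NULL,\n"
    "  `username` varchar(64)\n"
    ") ENGINE=InnoDB;"
)
print(unknown_columns(["id", "usernmae"], create_sql))
# prints ['usernmae']
```

Failing loudly on a non-empty result would catch the misspelling before any data is written out.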

Support for data-only dumps (--no-create-info)

It would be a really nice feature for me if myanon could parse a data-only dump (`mysqldump --no-create-info`). I fiddled with it for a while and couldn't get it to work.

An easy workaround for now is to process the entire dump, and then just remove the create statements from the resulting script. This is a great little tool, thank you!
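
The workaround can be sketched as a small post-processing filter. It assumes that `DROP TABLE` and `CREATE TABLE` statements start at the beginning of a line and end on a line whose last character is `;`, as in the dumps shown in these issues; the function name `strip_create_statements` is hypothetical.

```python
def strip_create_statements(dump: str) -> str:
    """Drop DROP TABLE / CREATE TABLE statements, keep everything else."""
    kept = []
    skipping = False
    for line in dump.splitlines():
        if not skipping and line.startswith(("DROP TABLE", "CREATE TABLE")):
            skipping = True
        if skipping:
            # The statement is assumed to end on the line that finishes with ';'.
            if line.rstrip().endswith(";"):
                skipping = False
            continue
        kept.append(line)
    return "\n".join(kept)


dump = (
    "DROP TABLE IF EXISTS `t`;\n"
    "CREATE TABLE `t` (\n"
    "  `a` int\n"
    ") ENGINE=InnoDB;\n"
    "INSERT INTO `t` VALUES (1);"
)
print(strip_create_statements(dump))
# prints INSERT INTO `t` VALUES (1);
```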

Generate fields with data from other fields

Given a field like id and a field like username, I'd like to keep id the same but set username to user<id>.

I imagine something like:

tables = {
   `people` = {
     `id`   = texthash 10
     `username` = sql CONCAT('user', id);
   }
}

Consider this a feature request. Thanks :)

NULL values are not maintained for rows with both NULL and non-NULL values

When anonymizing a column that contains both NULL and non-NULL values, the NULL values are hashed. I would expect only the non-NULL values to be hashed and the NULL rows to keep their NULL value.

I've attached a tar.gz with an example:

  • null-example.conf = the myanon config
  • null-example-mysqldump.sql = the mysqldump of the database I'm using as an example
  • null-example-anonymized.sql = the myanon result after anonymizing null-example-mysqldump.sql

On line 48 of null-example-anonymized.sql you can see that rows 3 & 5 hash the NULL value to 'ahavykafkojauwmdriqpohobuuttmiif'. I would expect those values to remain NULL.
null-example.tar.gz
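
The expected behaviour can be stated as a rule on individual values of a `VALUES` list: an unquoted `NULL` passes through, anything else is anonymized. A minimal sketch under that assumption; `anonymize_token` is a hypothetical name, and the `anonymize` callable stands in for whatever hashing rule is configured.

```python
from typing import Callable


def anonymize_token(token: str, anonymize: Callable[[str], str]) -> str:
    """Anonymize one value from a VALUES list, leaving an unquoted NULL untouched."""
    if token == "NULL":
        # SQL NULL is a keyword, not data, so there is nothing to hide.
        return token
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        # Quoted string: anonymize the content and restore the quotes.
        return "'" + anonymize(token[1:-1]) + "'"
    return anonymize(token)


print(anonymize_token("NULL", str.upper))   # prints NULL
print(anonymize_token("'bob'", str.upper))  # prints 'BOB'
```

Note that the quoted string `'NULL'` is still treated as data and anonymized; only the bare keyword is preserved.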

Documentation

Please update the documentation for Ubuntu to include the following:

sudo apt-get install build-essential

Once the user runs that step on a fresh Ubuntu 20.04 server, the ./configure process runs without error.

Can't parse "set"

myanon can't parse a `set` column, as in this table definition:

CREATE TABLE some_table (
some_field int NOT NULL AUTO_INCREMENT,
some_field timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
some_field timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
some_field int NOT NULL,
some_field int NOT NULL,
some_field decimal(28,8) NOT NULL,
flags set('vat_included') CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
some_field int NOT NULL,

It fails with the error:

Dump parsing error at line 8: Unable to read table definition - Unexpected [s]

Could you fix it, please?

Fixed text truncated

We have a rule:

`key` = fixed '0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF'

The value is 64 characters long and goes into a varchar(64) column, but the last character is being truncated, which then affects all our other data. If we try to extend the value, we get an error because it exceeds 64 characters. We would really love a fix :) or a push of new code, if you have any.

Adding column names results in syntax error

Using SQL:

CREATE TABLE `test_with_column_names` (
    `a` int(10) unsigned NOT NULL
) ENGINE=InnoDB;
INSERT INTO `test_with_column_names` (`a`) VALUES (1);

And configuration:

# Config file for test1.sql
secret = 'lapin'
stats  = 'no'

tables = {
   `test_with_column_names` = {
     `a` = inthash 2
   }
}

This gives a syntax error. Running with `myanon -d`, the output is:

main/myanon -d -f tests/test1.conf
CREATE TABLE `test_with_column_names` (
    `a` int(10) unsigned NOT NULL
) ENGINE=InnoDB;
INSERT INTO `test_with_column_names` (FOUND TABLE `test_with_column_names`

ENTERING STATE ST_TABLELOOKING FOR  `test_with_column_names`:`a`

ENTERING STATE INITIALFOUND TABLE `test_with_column_names`

ENTERING STATE ST_VALUES
Dump parsing error at line 3: syntax error - Unexpected [(]

Process finished with exit code 1
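
For illustration, the two `INSERT` forms can be accepted by one pattern in which the column list is optional. This is a sketch of the grammar being requested, not myanon's parser; `parse_insert_head` and the pattern name are hypothetical.

```python
import re

# Table name, then an optional parenthesised column list, then VALUES.
INSERT_HEAD = re.compile(
    r"INSERT INTO `(?P<table>[^`]+)`\s*(?:\((?P<columns>[^)]*)\)\s*)?VALUES"
)


def parse_insert_head(statement: str):
    """Return (table, column names or None), or None if the head does not match."""
    match = INSERT_HEAD.match(statement)
    if match is None:
        return None
    columns = match.group("columns")
    names = re.findall(r"`([^`]+)`", columns) if columns is not None else None
    return match.group("table"), names


print(parse_insert_head("INSERT INTO `test_with_column_names` (`a`) VALUES (1);"))
# prints ('test_with_column_names', ['a'])
print(parse_insert_head("INSERT INTO `test_with_column_names` VALUES (1);"))
# prints ('test_with_column_names', None)
```

When names are present, they could also be used to map values to configured columns instead of relying on positional order.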

Randomise the seed

After running this a few times on different environments, I've noticed that my first `texthash` username is always the same "random" value. It seems some randomisation needs to be added to the base?
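
If the anonymized value is derived from the input and the config `secret` by a keyed hash, identical output across environments is what one would see whenever the secret is the same. The sketch below illustrates that property with HMAC-SHA-256 as a stand-in; the choice of algorithm and the function name `keyed_hash` are assumptions, not taken from myanon's source.

```python
import hashlib
import hmac


def keyed_hash(value: str, secret: str) -> str:
    """Deterministic keyed hash: same value and secret give the same digest."""
    return hmac.new(secret.encode("utf-8"), value.encode("utf-8"), hashlib.sha256).hexdigest()


first_run = keyed_hash("john", "lapin")
second_run = keyed_hash("john", "lapin")
other_secret = keyed_hash("john", "another-secret")

print(first_run == second_run)    # prints True
print(first_run == other_secret)  # prints False
```

Under that assumption, using a different `secret` per environment would vary the output while keeping each environment internally consistent.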

Support for complex fields

I have field data in JSON arrays and in simple comma-separated lists stored as strings. Right now the whole field is anonymized, which "destroys" its array format.
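
One way to picture the request is element-wise anonymization that keeps the container format. The sketch below is an assumption about the desired behaviour, not an existing myanon feature; `anonymize_elements` is a hypothetical name and `anonymize` stands in for the configured rule.

```python
import json
from typing import Callable


def anonymize_elements(field: str, anonymize: Callable[[str], str]) -> str:
    """Anonymize each element of a JSON array or comma-separated list."""
    try:
        parsed = json.loads(field)
    except ValueError:
        parsed = None
    if isinstance(parsed, list):
        # Only string elements are anonymized; numbers and nested values pass through.
        return json.dumps([anonymize(x) if isinstance(x, str) else x for x in parsed])
    # Anything that is not a JSON array is treated as a comma-separated list.
    return ",".join(anonymize(part) for part in field.split(","))


print(anonymize_elements('["a", "b"]', str.upper))  # prints ["A", "B"]
print(anonymize_elements("a,b,c", str.upper))       # prints A,B,C
```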

Does fixed even replace if null?

I have some areas where I have to use `fixed` to set the value, but it looks like a NULL value is still replaced with the fixed value (I believe?). `texthash` only seems to replace values that are not NULL, so maybe `fixed` needs a "leave NULL" flag or option? For example: fixed "1234567" true (not the default; don't replace NULL).

Blob values are not quoted

Given the following config:

# Config file for test1.sql
secret = 'lapin'
stats  = 'no'

tables = {
   `lottypes` = {
     `int1`      = inthash 2
     `int2`      = fixed '9'
     `datetime1` = fixed '1970-01-01 12:00:00'
     `text1`     = texthash 5
     `text2`     = fixed null
     `blob1`     = fixed 'hello'
     `blob2`     = texthash 5
#      `blob3`     = fixed '\'hi\''
   }
}

When I run

build/main/myanon -f tests/test1.conf < tests/test1.sql

I expect to see

INSERT INTO `lottypes` VALUES (... ,'hello','migez', ...);

But I actually see

INSERT INTO `lottypes` VALUES (... ,hello,migez, ...);

I tried quoting/escaping (by uncommenting the config line for blob3), but I received the error:

Config parsing error at line 14: Syntax error - Unexpected [h]
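
The expected output corresponds to writing replacement values as quoted SQL string literals. A minimal sketch of such quoting, assuming the backslash-escaping style that mysqldump itself uses for string values; `sql_quote` is a hypothetical helper, not a myanon function.

```python
def sql_quote(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    # Backslashes are escaped first so the quote escapes are not doubled afterwards.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


print(sql_quote("hello"))  # prints 'hello'
print(sql_quote("migez"))  # prints 'migez'
```

With quoting applied on output, the config would not need embedded escapes such as the commented-out `blob3` rule.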
