Comments (3)

greenlion commented on August 27, 2024

You can probably write a simple script to do the following.
Note: this only works if you are using the directory mapper.

Identify how you want to split the shard. For example, to split a shard into even and odd components, select which half you want to move. Arbitrarily, I'll pick moving the odd-numbered keys to another shard. You could even move just a single key.

On the directory, get a list of the shard keys to move:
[pseudo-code]
FOR EACH $key_value IN (SELECT key_value FROM shard_map WHERE MOD(key_value,2) = 1)
  START XA TRANSACTION on source, dest and directory
  FOR EACH sharded_table AS $table_name
    SOURCE: SELECT * FROM $table_name WHERE shard_key = $key_value INTO OUTFILE ...
    DEST: LOAD DATA INFILE ....
    SOURCE: DELETE FROM $table_name WHERE shard_key = $key_value
  DIRECTORY: UPDATE shard_map SET shard_id = $dest_shard_id WHERE key_value = $key_value
  XA COMMIT

Let me know if you have questions, or if you want to sponsor development of a tool to split shards.

The XA transaction ensures that other queries running during the move, such as COUNT(*), still return correct results, because the copy, the delete, and the directory update all commit together.
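As a rough illustration of the per-key loop described above, here is a minimal Python sketch that only builds the ordered list of statements for moving one key; it does not connect to any server. The function name `plan_key_move`, the xid format, the `/tmp` output directory, and the example table names are all hypothetical. The directory table name defaults to `shard_map` because that is the table the SELECT in the pseudo-code reads from; adjust it if your directory schema differs. It also assumes integer keys and that the outfile path is readable by the destination server.

```python
from typing import List, Tuple

Step = Tuple[str, str]  # (server role, SQL statement)


def plan_key_move(key_value: int, tables: List[str], dest_shard_id: int,
                  directory_table: str = "shard_map",
                  outfile_dir: str = "/tmp") -> List[Step]:
    """Return the ordered statements for moving one shard key (illustrative only)."""
    xid = f"'move_key_{key_value}'"  # hypothetical xid naming scheme
    servers = ("source", "dest", "directory")

    # Open an XA transaction on all three servers.
    steps: List[Step] = [(s, f"XA START {xid}") for s in servers]

    # Copy the rows for this key out of the source, into the dest, then delete.
    for table in tables:
        path = f"{outfile_dir}/{table}_{key_value}.txt"
        steps.append(("source",
                      f"SELECT * FROM {table} WHERE shard_key = {key_value} "
                      f"INTO OUTFILE '{path}'"))
        steps.append(("dest", f"LOAD DATA INFILE '{path}' INTO TABLE {table}"))
        steps.append(("source",
                      f"DELETE FROM {table} WHERE shard_key = {key_value}"))

    # Point the directory at the new shard.
    steps.append(("directory",
                  f"UPDATE {directory_table} SET shard_id = {dest_shard_id} "
                  f"WHERE key_value = {key_value}"))

    # Two-phase commit: end and prepare everywhere before committing anywhere.
    for phase in ("XA END", "XA PREPARE", "XA COMMIT"):
        steps.extend((s, f"{phase} {xid}") for s in servers)
    return steps


if __name__ == "__main__":
    for server, sql in plan_key_move(7, ["orders", "items"], dest_shard_id=2):
        print(f"{server.upper()}: {sql}")
```

A real script would execute each statement on the matching connection and use parameterized queries instead of string interpolation; the sketch is only meant to make the ordering of the steps explicit.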

from swanhart-tools.

greenlion commented on August 27, 2024

To split an existing non-Shard-Query server, use mysqldump to dump the data into a flat file, then use the Shard-Query loader to reload it. The loader will do the splitting for you.

Basically, load the data into Shard-Query as if it were just a dump from an online data source.

The loader is in the bin/ folder.

You need a loader.spec file. It looks like:
[default]
delimiter=","

[table_name]
file=/path/to/file.txt

You can use globs for the file name if you have multiple files for the same table. You can specify the same table multiple times if you have different paths to load.
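If you have many tables or paths, the spec file can be generated rather than written by hand. The following is a small sketch, not part of Shard-Query: `render_loader_spec` is a hypothetical helper, the table names and paths are made up, and it emits only the two keys shown above (`delimiter` and `file`). It builds the text directly instead of using `configparser`, since the same table section may appear more than once.

```python
from typing import Iterable, Tuple


def render_loader_spec(files: Iterable[Tuple[str, str]],
                       delimiter: str = ",") -> str:
    """Build loader.spec text from (table_name, path_or_glob) pairs."""
    lines = ["[default]", f'delimiter="{delimiter}"', ""]
    for table_name, path in files:
        # One section per entry; repeating a table name is allowed.
        lines += [f"[{table_name}]", f"file={path}", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    spec = render_loader_spec([
        ("orders", "/data/dump/orders_*.txt"),   # glob covering several files
        ("orders", "/data/archive/orders.txt"),  # same table, different path
        ("customers", "/data/dump/customers.txt"),
    ])
    with open("loader.spec", "w") as fh:
        fh.write(spec)
```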

cd bin
php loader --spec=loader.spec

That will fire off a bunch of loader jobs. Unless you have a shared filesystem, run loader workers on a single node only: the same node you invoke the bin/loader script from. If you do have a shared filesystem, place the files there, and make sure the path to the filesystem is the same on all nodes running loader workers.

Run bin/update_jobs_table to check on the status of the jobs. It will stop producing output when the jobs are completed.

Please let me know if you have problems.

greenlion commented on August 27, 2024

I noticed that you are in Santa Clara. If you want to meet up sometime for coffee and talk about your data and Shard-Query, I'd be happy to do so.
