k2vtune's Introduction

K2vTune: A Workload-aware Configuration Tuning for RocksDB

This repository contains the source code for the paper "K2vTune: A Workload-aware Configuration Tuning for RocksDB", published in Information Processing & Management in January 2024. K2vTune is a workload-aware configuration tuning framework that recognizes RocksDB configuration knobs according to the workload type and uses our knob2vec method to consider multiple performance metrics effectively.

Requirements

  • lifelines
  • pytorch == 1.7.0
  • python >= 3.8

SMAC library

How to install

conda install gxx_linux-64 gcc_linux-64 swig
pip install smac

QuickStart

Run main.py to train the entire model. The parser arguments are explained below:

target       : target workload number  
tf           : use teacher forcing; if not specified, the model is trained without teacher forcing  
train        : training mode  
eval         : training mode that uses a pre-trained model (.pt)  
model_path   : path to the pre-trained model, used in eval mode  
batch_size   : batch size for the dataset  
hidden_size  : hidden size of the model  
lr           : learning rate of the model  
mode         : regression model type ['raw', 'dnn', 'gru', 'attngru']  
attn_mode    : attention type ['dot', 'general', 'concat', 'bahdanau']  
generation   : number of generations in the genetic algorithm  
pool         : pool size in the genetic algorithm  
optimization : optimization algorithm ['ga', 'smac']
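
As an illustration only, the arguments listed above could be declared with `argparse` roughly as follows. This is a sketch, not the repository's actual main.py: the `build_parser` and `parse` helpers are hypothetical, the `batch_size` default is an assumption, and the check that `--eval` needs `--model_path` is inferred from the description of `model_path`. The other defaults reuse the parameter values reported later in this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments listed in this README."""
    p = argparse.ArgumentParser(description="K2vTune argument sketch")
    p.add_argument("--target", type=int, required=True, help="target workload number")
    p.add_argument("--tf", action="store_true", help="use teacher forcing")
    p.add_argument("--train", action="store_true", help="training mode")
    p.add_argument("--eval", action="store_true", help="use a pre-trained model (.pt)")
    p.add_argument("--model_path", type=str, default=None, help="pre-trained model path")
    p.add_argument("--batch_size", type=int, default=32, help="assumed default")
    p.add_argument("--hidden_size", type=int, default=128)
    p.add_argument("--lr", type=float, default=0.001)
    p.add_argument("--mode", choices=["raw", "dnn", "gru", "attngru"], default="attngru")
    p.add_argument("--attn_mode", choices=["dot", "general", "concat", "bahdanau"],
                   default="general")
    p.add_argument("--generation", type=int, default=100)
    p.add_argument("--pool", type=int, default=128)
    p.add_argument("--optimization", choices=["ga", "smac"], default="ga")
    return p


def parse(argv):
    """Parse argv and apply the assumed rule that eval mode needs a model path."""
    args = build_parser().parse_args(argv)
    if args.eval and args.model_path is None:
        raise ValueError("--eval requires --model_path")
    return args


if __name__ == "__main__":
    print(parse(["--target", "0", "--tf", "--train"]))
```

Using `choices` for `mode`, `attn_mode`, and `optimization` makes a mistyped option fail at parse time rather than later during training.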
  • Training the model

Please modify arguments in the bash file. See train.sh

./main.sh train

or

python main.py --target ${target_idx} --tf --train --hidden_size ${hidden_size} --lr ${learning_rate} \
--generation ${generation_num} --pool ${pool_num}
  • Training with pre-trained model path

Please modify arguments in the bash file. See eval.sh

./main.sh eval

or

python main.py --target ${target_idx} --tf --eval --model_path ${model_path} \
--generation ${generation_num} --pool ${pool_num}

We set the parameters as follows:

  • hidden_size = 128
  • lr = 0.001
  • generation = 100
  • pool = 128
  • mode = 'attngru'
  • attn_mode = 'general'
  • optimization = 'ga'
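
To make the roles of `generation` and `pool` concrete, here is a minimal genetic-algorithm sketch over normalized knob vectors. It is an assumption-laden illustration, not the repository's implementation: `genetic_search`, `toy_fitness`, the one-point crossover, the keep-the-better-half selection, and `mutation_rate` are all hypothetical choices, and `toy_fitness` stands in for the trained performance predictor.

```python
import random


def toy_fitness(knobs):
    """Stand-in for a trained predictor: best score when every knob is 0.5."""
    return -sum((k - 0.5) ** 2 for k in knobs)


def genetic_search(fitness, n_knobs, pool=128, generation=100,
                   mutation_rate=0.1, seed=0):
    """Maximize `fitness` over knob vectors in [0, 1]^n_knobs.

    Requires n_knobs >= 2 and pool >= 4 for crossover and parent sampling.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_knobs)] for _ in range(pool)]
    best_history = []
    for _ in range(generation):
        ranked = sorted(population, key=fitness, reverse=True)
        best_history.append(fitness(ranked[0]))
        parents = ranked[: pool // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pool:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_knobs)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Resample each gene with probability mutation_rate.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness), best_history


if __name__ == "__main__":
    best, history = genetic_search(toy_fitness, n_knobs=4, pool=16, generation=20)
    print(best, history[-1])
```

Because the better half of the pool is carried over unchanged, the best score in this sketch never decreases from one generation to the next. A larger `pool` explores more candidates per generation, while a larger `generation` gives the search more rounds to refine them.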

k2vtune's People

Contributors

addb-swstarlab, dkdl012

k2vtune's Issues

About optimization algorithm.

Hello,

Is there a particular reason you used the Genetic Algorithm optimization method in Knob-Representation/models/steps.py/?

Was GA the optimization method with the best performance?

Thanks for reading.

About knobs selection

Hello,
Did you use all the knobs in RocksDB? If not, by what criteria did you choose the knobs?

Thank you

Some questions about the model operation

Hi.

I have some questions about how the model operates.

Since this model aims to optimize the knobs' parameters, I am wondering how the model operates during training and inference.

Also, the relation between the model output and the knobs' parameters is confusing without the README file.

So it would be great if you could upload more information about this model.

Thank you!

About scaler in models/knobs.py

Hello,

Your code only uses MinMaxScaler.
Have you tried any other scalers (e.g. StandardScaler, LogScaler, RobustScaler)?
If so, what were the results of the experiment?

Thanks for reading
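
For readers comparing the scalers named in this issue, the sketch below contrasts min-max scaling with standard (z-score) scaling on a single feature. It uses plain Python rather than the repository's code; `min_max_scale` and `standard_scale` are hypothetical helpers, the sample values are made up, and the formulas are intended to match scikit-learn's MinMaxScaler and StandardScaler under default settings.

```python
import math


def min_max_scale(values):
    """Map values linearly onto [0, 1]; a constant feature maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    return [(v - mean) / std if std else 0.0 for v in values]


if __name__ == "__main__":
    # Made-up knob values with one large outlier.
    sample = [16, 32, 64, 1024]
    print(min_max_scale(sample))
    print(standard_scale(sample))
```

With an outlier such as the last sample value, min-max scaling squeezes the remaining values close to 0, which is one reason the choice of scaler can matter for knobs with wide value ranges.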

Can you upload the data?

Hi,
Thank you for providing the code.

I wonder whether there is any data available to run the model.
In the data folder, the readme only lists the names of the data files.

Could you upload the data files so that I can run the model?

Thank you.

About model network.

Hello,

I read in Knob-Representation/models/network.py/ that input_dim has a greater value than hidden_dim.

I know it's good to specify the number of units in the hidden layer as a multiple of the number of input units.

Is there any special reason that input_dim is greater than hidden_dim?

Thanks for reading.

About Workload Mapping

Hi,

Thanks for the interesting code.

When mapping workloads, I think it might be difficult to map all 16 workloads.
Was the actual performance always good?

Thank you for reading!

ImportError with 'from benchmark import exec_benchmark'

I have been working on your project and encountered an ImportError when trying to import the exec_benchmark function from the benchmark module.

I've searched for the corresponding installation package, but unfortunately, I was not able to find it in the project's documentation or in the usual package repositories.
Could you kindly assist in clarifying how I might be able to resolve this? Specifically, if there are any additional packages that need to be installed, could you please guide me on how to do so, or update the documentation to include this information?

Thank you for your time and assistance in this matter.

Questions about the code

While looking at the Attention class in Knob-Representation/models/network.py, I noticed the code for Bahdanau attention. Was this attention more effective than general attention?
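
As background for this question, the sketch below shows the textbook scoring functions behind three of the attention types: dot, general (multiplicative), and Bahdanau (additive). It is written in plain Python with hypothetical helper names and hand-picked weights, and it does not reproduce the repository's network.py, whose exact formulation may differ.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def score_dot(query, key):
    """Dot attention: query^T key."""
    return dot(query, key)


def score_general(query, key, w):
    """General (multiplicative) attention: query^T W key."""
    return dot(query, matvec(w, key))


def score_additive(query, key, w_q, w_k, v):
    """Bahdanau (additive) attention: v^T tanh(W_q query + W_k key)."""
    hidden = [math.tanh(a + b)
              for a, b in zip(matvec(w_q, query), matvec(w_k, key))]
    return dot(v, hidden)


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # illustrative weights, not learned
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0]]
    print(softmax([score_general(query, k, identity) for k in keys]))
```

General attention adds one learned matrix on top of dot attention, while additive attention passes the query and key through separate projections and a tanh, so it has more parameters and allows the query and key to have different sizes.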

Inquiry about RocksDB Version Compatibility

Hello,

I am currently considering using your project and wanted to clarify if there are any specific version dependencies that I should be aware of. Could you please provide details on which version(s) of RocksDB are recommended or required for compatibility purposes?

Thank you for your assistance.

AdamW

Hi, thank you for your code.

I wonder why you use AdamW as the optimizer.
Is there any performance difference compared with Adam?

Thank you.
