addb-swstarlab / k2vtune

License: GNU General Public License v2.0

Jupyter Notebook 89.79% Python 10.12% Shell 0.10%

k2vtune's Introduction

K2vTune: A Workload-aware Configuration Tuning for RocksDB

This repository contains the source code for the paper "K2vTune: A Workload-aware Configuration Tuning for RocksDB", published in Information Processing & Management in January 2024. K2vTune is a workload-aware configuration tuning framework that recognizes the configuration knobs of RocksDB according to the workload type and effectively considers multiple performance metrics using our knob2vec method.

Requirements

lifelines
pytorch == 1.7.0
python >= 3.8

SMAC library

https://automl.github.io/SMAC3/main/index.html

How to install

conda install gxx_linux-64 gcc_linux-64 swig
pip install smac

QuickStart

Run main.py to train the entire model. The parser arguments are explained below.

target       : target workload number
tf           : use teacher forcing; if not specified, the model is trained without teacher forcing
train        : training mode
eval         : mode that uses a pre-trained model (.pt)
model_path   : path to the pre-trained model, used with eval mode
batch_size   : batch size for the dataset
hidden_size  : hidden size of the model
lr           : learning rate of the model
mode         : regression model type ['raw', 'dnn', 'gru', 'attngru']
attn_mode    : attention type ['dot', 'general', 'concat', 'bahdanau']
generation   : number of generations in the genetic algorithm
pool         : pool size in the genetic algorithm
optimization : optimization algorithm ['ga', 'smac']
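
The argument list above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the repository's actual main.py: the `build_parser` helper name, the argument types, the mutually exclusive `--train`/`--eval` group, and the defaults (borrowed from the parameter settings reported later in this README) are all assumptions. No default for `batch_size` is documented, so it is left as `None`.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented arguments (not the repo's code)."""
    p = argparse.ArgumentParser(description="K2vTune argument sketch")
    p.add_argument("--target", type=int, required=True, help="target workload number")
    p.add_argument("--tf", action="store_true", help="use teacher forcing")
    # Assumption: --train and --eval are modelled as mutually exclusive flags.
    run_mode = p.add_mutually_exclusive_group()
    run_mode.add_argument("--train", action="store_true", help="training mode")
    run_mode.add_argument("--eval", action="store_true", help="use a pre-trained model (.pt)")
    p.add_argument("--model_path", type=str, default=None, help="pre-trained model path")
    p.add_argument("--batch_size", type=int, default=None, help="batch size (no documented default)")
    p.add_argument("--hidden_size", type=int, default=128)
    p.add_argument("--lr", type=float, default=0.001)
    p.add_argument("--mode", choices=["raw", "dnn", "gru", "attngru"], default="attngru")
    p.add_argument("--attn_mode", choices=["dot", "general", "concat", "bahdanau"], default="general")
    p.add_argument("--generation", type=int, default=100)
    p.add_argument("--pool", type=int, default=128)
    p.add_argument("--optimization", choices=["ga", "smac"], default="ga")
    return p


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--target", "1", "--tf", "--train"])
print(args.mode, args.attn_mode, args.generation, args.pool)
```

Restricting `mode`, `attn_mode`, and `optimization` with `choices` makes a misspelled value fail at parse time instead of deep inside training.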

Training the model

Please modify arguments in the bash file. See train.sh

./main.sh train

python main.py --target ${target_idx} --tf --train --hidden_size ${hidden_size} --lr ${learning_rate} \
--generation ${generation_num} --pool ${pool_num}

Training with pre-trained model path

Please modify arguments in the bash file. See eval.sh

./main.sh eval

python main.py --target ${target_idx} --tf --eval --model_path ${model_path} \
--generation ${generation_num} --pool ${pool_num}

We set the parameters as follows:

hidden_size = 128
lr = 0.001
generation = 100
pool = 128
mode = 'attngru'
attn_mode = 'general'
optimization = 'ga'
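
To make the `generation` and `pool` settings concrete, here is a toy genetic algorithm in plain Python. It only sketches the general technique and is not the code in this repository: the fitness function, the knob ranges, and the selection, crossover, and mutation details are all assumptions chosen for illustration.

```python
import random


def genetic_search(fitness, bounds, generation=100, pool=128, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over real-valued vectors within `bounds` (toy GA)."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    population = [random_individual() for _ in range(pool)]
    for _ in range(generation):
        # Selection: keep the better half of the pool as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pool // 2]
        children = []
        while len(parents) + len(children) < pool:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally resample a gene within its bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Hypothetical stand-in for a performance model: best score at (0.3, 0.7).
def toy_score(x):
    return -((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)


best = genetic_search(toy_score, bounds=[(0.0, 1.0), (0.0, 1.0)], generation=50, pool=32)
print(best, toy_score(best))
```

In this sketch `pool` fixes how many candidate configurations are evaluated per round and `generation` fixes how many rounds run, so the number of fitness evaluations grows with their product.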

k2vtune's People

dkdl012 ychanho mathcom hyojoys rlagnlrns

k2vtune's Issues

Questions about RocksDB Space Amplification Factor

Hello,

I recently studied RocksDB and often use the "compact" command in my experiments. I have a hard time understanding how it relates to the Space Amplification Factor (SAF).

Thanks.

About optimization algorithm.

Hello,

Is there any special reason for using the Genetic Algorithm as the optimization method in Knob-Representation/models/steps.py?

Was GA the optimization method with the best performance?

Thanks for reading.

About knobs selection

Hello,
Did you use all the knobs in RocksDB? If not, by what criteria did you choose the knobs?

Thank you

Some questions about the model operation

Hi.

I have some questions about how the model operates.

Since this model aims to optimize the knobs' parameters, I am wondering how the model operates during training and inference.

Also, the relation between the model output and the knobs' parameters is confusing without a README file.

So it would be great if you could upload more information about this model.

Thank you!

About scaler in models/knobs.py

Hello,

Your code only uses MinMaxScaler.
Have you tried any other scalers (e.g. StandardScaler, LogScaler, RobustScaler)?
If so, what were the results of the experiments?

Thanks for reading
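
For readers comparing the scalers named in this question, the sketch below implements min-max, standard, and robust scaling in plain Python on a made-up knob column. It illustrates the general formulas only; the sample values are hypothetical and nothing here reflects the authors' experiments.

```python
import statistics


def min_max_scale(xs):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standard_scale(xs):
    """Subtract the mean and divide by the population standard deviation."""
    mean, std = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]


def robust_scale(xs):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in xs]


# Hypothetical knob values with one large outlier.
values = [1.0, 2.0, 3.0, 4.0, 100.0]
print(min_max_scale(values))
print(standard_scale(values))
print(robust_scale(values))
```

With an outlier like the one above, min-max scaling squeezes the four small values close to 0, while robust scaling keeps them spread out because the median and interquartile range ignore the extreme value.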

Can you upload the data?

Hi,
Thank you for providing the code.

I wonder whether there is any data available to run the model.
In the data folder, the readme only lists the names of the data files.

Could you upload the data files so that I can run the model?

Thank you.

About model network.

Hello,

I read in Knob-Representation/models/network.py/ that input_dim has a greater value than hidden_dim.

I know it's good to specify the number of units in the hidden layer as a multiple of the number of input units.

Is there any special reason that input_dim is greater than hidden_dim?

Thanks for reading.

How is overfitting handled?

There is no regularization term or dropout in the code. Does overfitting not occur?

About Workload Mapping

Hi,

Thanks for the interesting code.

When mapping workloads, I think it might be difficult to map all 16 workloads.
Was the actual performance always good?

Thank you for reading!

ImportError with 'from benchmark import exec_benchmark'

I have been working on your project and encountered an ImportError when trying to import the exec_benchmark function from the benchmark module.

I've searched for the corresponding installation package, but unfortunately, I was not able to find it in the project's documentation or in the usual package repositories.
Could you kindly assist in clarifying how I might be able to resolve this? Specifically, if there are any additional packages that need to be installed, could you please guide me on how to do so, or update the documentation to include this information?

Thank you for your time and assistance in this matter.

Questions about the code

When I looked at the Attention class in Knob-Representation/models/network.py, I saw the code for Bahdanau attention. Was this attention more effective than general attention?
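
For context on the `attn_mode` options, the sketch below computes the three score functions usually called "dot", "general", and "concat" in the attention literature, followed by a softmax. It is a plain-Python illustration with hypothetical weights and vectors, not the repository's Attention class, and it does not show how the repository's 'bahdanau' mode is implemented.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def score(mode, query, key, w=None, v=None):
    """Alignment score between a decoder state `query` and an encoder state `key`."""
    if mode == "dot":
        return dot(query, key)  # q . k
    if mode == "general":
        return dot(query, matvec(w, key))  # q . (W k)
    if mode == "concat":
        hidden = [math.tanh(x) for x in matvec(w, query + key)]
        return dot(v, hidden)  # v . tanh(W [q; k])
    raise ValueError(f"unknown mode: {mode}")


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2-dimensional states and weights, for illustration only.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
w_general = [[2.0, 0.0], [0.0, 2.0]]
weights = softmax([score("general", query, k, w=w_general) for k in keys])
print(weights)
```

The "dot" score has no trainable parameters, "general" adds one weight matrix, and "concat" adds a matrix and a vector, so the modes differ in capacity as well as cost.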

Inquiry about RocksDB Version Compatibility

Hello,

I am currently considering using your project and wanted to clarify if there are any specific version dependencies that I should be aware of. Could you please provide details on which version(s) of RocksDB are recommended or required for compatibility purposes?

Thank you for your assistance.

AdamW

Hi, thank you for your code.

I wonder why you use AdamW as the optimizer.
Is there any performance difference compared with Adam?

Thank you.

What exactly do you mean by external and internal?

I understand that external and internal are performance indicators, but could you kindly elaborate on the difference between the two?
