hyrise / index_selection_evaluation

Platform to evaluate index selection algorithms

License: MIT License

Languages: CSS 0.01%, HTML 0.26%, JavaScript 1.70%, Python 55.17%, Shell 0.54%, Jupyter Notebook 42.09%, AMPL 0.24%

Topics: indexes, dbms

index_selection_evaluation's Issues

Change indexable columns and possible index methods

We currently treat ALL columns referenced in a query as indexable. However, it should be sufficient to consider only the columns that appear in the WHERE clause, right? That would reduce the number of candidate indexes we have to evaluate.
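A minimal sketch of the proposed restriction, not the repository's implementation. The function name `where_clause_columns`, the `known_columns` argument, and the regex-based approach are assumptions made for illustration; it only handles simple, non-nested SELECT statements and does not understand subqueries or string literals that happen to look like column names.

```python
import re

# Keywords that end the WHERE clause of a simple, non-nested SELECT (assumption).
_CLAUSE_END = re.compile(r"\b(?:group\s+by|order\s+by|having|limit)\b")


def where_clause_columns(query_text, known_columns):
    """Return the known columns that occur in the WHERE clause of query_text.

    Hypothetical helper: known_columns is the list of column names of the
    schema, supplied by the caller.
    """
    lowered = query_text.lower()
    match = re.search(r"\bwhere\b", lowered)
    if match is None:
        # No WHERE clause: nothing to index under this heuristic.
        return set()
    # Keep only the text between WHERE and the next clause keyword.
    predicate_text = _CLAUSE_END.split(lowered[match.end():], maxsplit=1)[0]
    # Collect identifier-like tokens; "table.column" yields both parts.
    identifiers = set(re.findall(r"[a-z_][a-z0-9_]*", predicate_text))
    return {column for column in known_columns if column.lower() in identifiers}


query = (
    "SELECT l_returnflag, sum(l_quantity) FROM lineitem "
    "WHERE l_shipdate <= date '1998-09-01' AND l_discount > 0.05 "
    "GROUP BY l_returnflag ORDER BY l_returnflag"
)
columns = ["l_returnflag", "l_quantity", "l_shipdate", "l_discount"]
print(sorted(where_clause_columns(query, columns)))
```

In this example only `l_shipdate` and `l_discount` remain as candidates, so single-column candidates drop from four to two. Whether join, GROUP BY, or ORDER BY columns should also stay indexable is a separate design question that this sketch does not answer.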

Refactor architecture of CostEvaluation

I am not fully convinced by the current indirection, in which CostEvaluation uses WhatIfIndexCreation, which in turn uses DBConnector.
This may not be a high priority, though.
Besides, I see a danger that CostEvaluation and WhatIfIndexCreation become inconsistent when reset() is called.
Calling all_simulated_indexes() should not be that expensive.
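One way to avoid the inconsistency could look like the sketch below, in which the cost evaluation keeps no copy of the index set and always asks the what-if component. `WhatIfStub`, `CostEvaluationSketch`, `simulate_index`, `drop_all_simulated_indexes`, and `current_indexes` are hypothetical stand-ins, not the repository's classes or methods; only the names `all_simulated_indexes()` and `reset()` come from the issue text.

```python
class WhatIfStub:
    """Hypothetical stand-in for WhatIfIndexCreation: sole owner of the state."""

    def __init__(self):
        self._simulated = {}  # oid -> index name
        self._next_oid = 1

    def simulate_index(self, index_name):
        oid = self._next_oid
        self._next_oid += 1
        self._simulated[oid] = index_name
        return oid

    def drop_all_simulated_indexes(self):
        self._simulated.clear()

    def all_simulated_indexes(self):
        return set(self._simulated.values())


class CostEvaluationSketch:
    """Hypothetical stand-in for CostEvaluation without its own index cache."""

    def __init__(self, what_if):
        self.what_if = what_if

    def current_indexes(self):
        # Always derived from the what-if component, so the two cannot diverge.
        return self.what_if.all_simulated_indexes()

    def reset(self):
        # Resetting only touches the single source of truth.
        self.what_if.drop_all_simulated_indexes()


what_if = WhatIfStub()
evaluation = CostEvaluationSketch(what_if)
what_if.simulate_index("lineitem(l_shipdate)")
print(evaluation.current_indexes())
evaluation.reset()
print(evaluation.current_indexes())
```

The trade-off is one extra lookup per call in exchange for a single source of truth, which matches the observation that all_simulated_indexes() should be cheap.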

Where is the CoPhy code?

I didn't find the implementation of CoPhy in the code. Could you please add the CoPhy code?

Cleanup csv_to_tikz.py

Remove all magic numbers and unused functions.

Add information to readme about how to generate diagrams

python3 csv_to_tikz.py tpcds.csv tpcds_cost.tex cost
pdflatex tpcds_cost.tex

Algorithm EPIC: Fix budget check for multi column indexes

The algorithm currently does not check whether multi-attribute extensions are within budget if the corresponding single-column index is not.

Add paper reference to each implemented algorithm

Add a DOI, a verbal DBLP reference, or a link to the paper.

Clean up example configs

Add index unit tests

Do you have a code implementation for INUM?

Hello, thank you for sharing the code!
INUM refers to this paper: "Efficient Use of the Query Optimizer for Automated Physical Design".
Building on INUM, CoPhy splits costs into internal subplan costs and access costs and models them with integer programming. Although the code in cophy_input_generation.py indicates that this approach is not necessary, I am still interested in the implementation details of INUM. If an INUM implementation exists, I would greatly appreciate it if you could share it.

Remove 2nd create statistics (vacuum analyze)

▶ python3 -m selection
INFO:root:Starting Index Selection Evaluation
INFO:root:Using config file example_configs/config.json
DEBUG:root:Database connector created: None
DEBUG:root:Postgres connector created: None
DEBUG:root:Database with given scale factor already existing
DEBUG:root:Database connector created: indexselection_tpch___0_1
DEBUG:root:Postgres connector created: indexselection_tpch___0_1
INFO:root:Generating TPC-H Queries
DEBUG:root:No need to run make
INFO:root:Queries generated
INFO:root:Dropping indexes
INFO:root:Postgres: Run `vacuum analyze`
INFO:root:Dropping indexes
INFO:root:Postgres: Run `vacuum analyze`
DEBUG:root:Init selection algorithm
INFO:root:Dropping indexes
DEBUG:root:Init cost evaluation
INFO:root:Cost estimation with whatif
DEBUG:root:Init WhatIfIndexCreation

Do we have to create statistics before each algorithm?
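A rough sketch of how the duplicate `vacuum analyze` run in the log above could be avoided, under the assumption that statistics only need refreshing after something marks them stale. `StatisticsGuard`, `mark_stale`, and `ensure_fresh` are hypothetical names, and the callback stands in for whatever actually issues `vacuum analyze`; nothing here is taken from the repository.

```python
class StatisticsGuard:
    """Hypothetical helper that runs the statistics refresh at most once
    per change, instead of once per caller."""

    def __init__(self, run_analyze):
        self._run_analyze = run_analyze  # callable that issues `vacuum analyze`
        self._stale = True  # assume statistics are unknown at startup

    def mark_stale(self):
        # Call after anything that is believed to invalidate statistics.
        self._stale = True

    def ensure_fresh(self):
        """Refresh statistics if needed; return True if a refresh ran."""
        if not self._stale:
            return False
        self._run_analyze()
        self._stale = False
        return True


calls = []
guard = StatisticsGuard(lambda: calls.append("vacuum analyze"))
guard.ensure_fresh()  # during setup: runs the refresh
guard.ensure_fresh()  # before the algorithm starts: skipped
print(calls)
```

Whether dropping indexes should call `mark_stale()` is exactly the open question of this issue; the sketch leaves that decision to the caller.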
