This repository demonstrates in-memory join and aggregation query implementations for a general database, written in C.
This project was done as part of COMS W4112 Database Systems Implementation for Fall 2017 at Columbia University.
- Problem statement: 4112_project_2.pdf
- q4112_main.c: Project API
- q4112_nlj_1.c: single-threaded nested loop join
- q4112_nlj.c: multi-threaded nested loop join
- q4112_hj_1.c: single-threaded hash join
- q4112_hj.c: multi-threaded hash join
- q4112.c: aggregation query (described in Problem Statement)
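
To make the difference between the two join families concrete, here is a minimal single-threaded sketch, not the code in this repository. It assumes each relation is just an array of 32-bit join keys and that the query only counts matching pairs; the actual tuple layout and query are defined in the problem statement. The names `nlj_count` and `hj_count` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Nested loop join: compare every outer key against every inner key. */
uint64_t nlj_count(const uint32_t *inner, size_t n_inner,
                   const uint32_t *outer, size_t n_outer) {
  uint64_t matches = 0;
  for (size_t o = 0; o < n_outer; ++o)
    for (size_t i = 0; i < n_inner; ++i)
      if (inner[i] == outer[o]) ++matches;
  return matches;
}

/* Hash join: build a linear-probing table on the inner keys, then probe
   it once per outer key. Assumed layout, chosen only for illustration. */
uint64_t hj_count(const uint32_t *inner, size_t n_inner,
                  const uint32_t *outer, size_t n_outer) {
  size_t buckets = 2;
  while (buckets < 2 * n_inner) buckets *= 2; /* power of two, load <= 50% */
  uint32_t *keys = malloc(buckets * sizeof *keys);
  uint8_t *used = calloc(buckets, 1);
  assert(keys != NULL && used != NULL);

  /* Build phase: insert every inner key (duplicates get their own slot). */
  for (size_t i = 0; i < n_inner; ++i) {
    size_t h = (inner[i] * 2654435761u) & (buckets - 1);
    while (used[h]) h = (h + 1) & (buckets - 1);
    keys[h] = inner[i];
    used[h] = 1;
  }

  /* Probe phase: walk the probe sequence until the first empty slot. */
  uint64_t matches = 0;
  for (size_t o = 0; o < n_outer; ++o) {
    size_t h = (outer[o] * 2654435761u) & (buckets - 1);
    for (; used[h]; h = (h + 1) & (buckets - 1))
      if (keys[h] == outer[o]) ++matches;
  }
  free(keys);
  free(used);
  return matches;
}
```

Both functions return the same count; the nested loop join does `n_inner * n_outer` comparisons, while the hash join does one build pass and one short probe per outer key.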
1. `make clean`
2. `make all`
NOTE: DO NOT DELETE q4112_gen.o. It implements in-memory data generation for database queries and the source file is unavailable.
Each executable compiled above runs a given configuration for 5 repetitions and logs the observations to a CSV file.
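
The repeat-and-log pattern described above could look roughly like the following sketch. It is an illustration only: the column layout (`label,repetition,result,seconds`), the `query_fn` type, and the `example_query` placeholder are assumptions, not the format or API used by `q4112_main.c`. Wall-clock time is taken with C11 `timespec_get`.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define REPETITIONS 5

/* Hypothetical signature for the query being measured. */
typedef unsigned long (*query_fn)(void);

/* Placeholder workload so the sketch is self-contained. */
unsigned long example_query(void) {
  unsigned long sum = 0;
  for (unsigned long i = 0; i < 1000000UL; ++i) sum += i & 1UL;
  return sum;
}

/* Runs `query` REPETITIONS times and appends one CSV row per run.
   Returns the number of rows written, or -1 if the file cannot be opened. */
int run_and_log(query_fn query, const char *label, const char *csv_path) {
  assert(query != NULL && label != NULL && csv_path != NULL);
  FILE *out = fopen(csv_path, "a"); /* append, so earlier runs are kept */
  if (out == NULL) return -1;
  int rows = 0;
  for (int rep = 0; rep < REPETITIONS; ++rep) {
    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);
    unsigned long result = query();
    timespec_get(&t1, TIME_UTC);
    double secs = (double)(t1.tv_sec - t0.tv_sec) +
                  (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (fprintf(out, "%s,%d,%lu,%.6f\n", label, rep, result, secs) > 0) ++rows;
  }
  fclose(out);
  return rows;
}
```

Opening the file in append mode means calling `run_and_log` for several configurations accumulates rows in one file instead of overwriting it.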
The executables take in a number of optional arguments (in order):
- inner_tuples - number of tuples in inner relation table (default: 1000)
- inner_selectivity - fraction of tuples that satisfy query (default: 1.0)
  - this must be in the range [0, 1]
- inner_val_max - max value of inner relation field (default: 10000000) - used for data generation in q4112_gen.o
- outer_tuples - number of tuples in outer table (default: 1000000)
- outer_selectivity - fraction of tuples that satisfy query (default: 1.0)
  - this must be in the range [0, 1]
- outer_val_max - max value of outer relation field (default: 1000)
- groups - number of aggregation groups (group by query)
- hh_groups - number of hard hitter groups - tests for memory access contention in multi-threaded code
- hh_probability - fraction of groups that are hard hitter (default: 0)
- threads - number of threads to be used by query (default: 1)
- res_file - result csv format file (default: q4112.csv) - results of successive runs/configurations are appended
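
Because the arguments are positional and optional, any trailing arguments can be left off and fall back to their defaults. The sketch below shows one way such parsing could be written; it is not the parser in `q4112_main.c`. The struct `q4112_args` and function `parse_args` are hypothetical names, and the defaults of 0 for `groups` and `hh_groups` are assumptions, since the list above does not state them.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
  unsigned long long inner_tuples;
  double inner_selectivity;
  unsigned long long inner_val_max;
  unsigned long long outer_tuples;
  double outer_selectivity;
  unsigned long long outer_val_max;
  unsigned long long groups;    /* default assumed, not documented */
  unsigned long long hh_groups; /* default assumed, not documented */
  double hh_probability;
  int threads;
  const char *res_file;
} q4112_args;

static int in_unit_range(double x) { return x >= 0.0 && x <= 1.0; }

/* Fills `a` with defaults, then overrides them with whichever positional
   arguments are present. Returns 0 on success, -1 on an invalid value. */
int parse_args(int argc, char **argv, q4112_args *a) {
  assert(a != NULL);
  *a = (q4112_args){1000, 1.0, 10000000, 1000000, 1.0, 1000,
                    0, 0, 0.0, 1, "q4112.csv"};
  if (argc > 1) a->inner_tuples = strtoull(argv[1], NULL, 10);
  if (argc > 2) a->inner_selectivity = strtod(argv[2], NULL);
  if (argc > 3) a->inner_val_max = strtoull(argv[3], NULL, 10);
  if (argc > 4) a->outer_tuples = strtoull(argv[4], NULL, 10);
  if (argc > 5) a->outer_selectivity = strtod(argv[5], NULL);
  if (argc > 6) a->outer_val_max = strtoull(argv[6], NULL, 10);
  if (argc > 7) a->groups = strtoull(argv[7], NULL, 10);
  if (argc > 8) a->hh_groups = strtoull(argv[8], NULL, 10);
  if (argc > 9) a->hh_probability = strtod(argv[9], NULL);
  if (argc > 10) a->threads = atoi(argv[10]);
  if (argc > 11) a->res_file = argv[11];

  /* Selectivities and the probability must lie in [0, 1]. */
  if (!in_unit_range(a->inner_selectivity) ||
      !in_unit_range(a->outer_selectivity) ||
      !in_unit_range(a->hh_probability))
    return -1;
  return a->threads >= 1 ? 0 : -1;
}
```

With this shape, `./q4112_hj 100 1.0 99999` would set only the first three fields and leave the remaining ones at their defaults.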
Example-1
./q4112_hj 100 1.0 99999 1000000000 0.5 99999 0 0 0.0 16 q4112_hj.csv
- inner_tuples: 100
- inner_selectivity: 1.0
- inner_val_max: 99999
- outer_tuples: 1000000000
- outer_selectivity: 0.5
- outer_val_max: 99999
- groups: 0
- hh_groups: 0
- hh_probability: 0
- threads: 16
- res_file: q4112_hj.csv
Example-2
./q4112 100 1.0 99999 1000000000 1.0 99999 100000000 100 0.5 16 q4112.csv
- inner_tuples: 100
- inner_selectivity: 1.0
- inner_val_max: 99999
- outer_tuples: 1000000000
- outer_selectivity: 1.0
- outer_val_max: 99999
- groups: 100000000
- hh_groups: 100
- hh_probability: 0.5
- threads: 16
- res_file: q4112.csv
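
Both examples run with 16 threads. One common way to divide the outer table among threads is to give each thread a contiguous, nearly equal range of tuples. The sketch below shows only that arithmetic; whether the multi-threaded sources here partition work this way is an assumption, and `thread_range` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the half-open range [*begin, *end) of outer tuples assigned to
   `thread_id`. The first (tuples % threads) threads get one extra tuple. */
void thread_range(size_t tuples, size_t threads, size_t thread_id,
                  size_t *begin, size_t *end) {
  assert(threads > 0 && thread_id < threads);
  assert(begin != NULL && end != NULL);
  size_t base = tuples / threads;
  size_t extra = tuples % threads;
  *begin = thread_id * base + (thread_id < extra ? thread_id : extra);
  *end = *begin + base + (thread_id < extra ? 1 : 0);
}
```

For the example configurations, 1000000000 outer tuples over 16 threads gives each thread exactly 62500000 tuples, so the last thread covers the range starting at 937500000.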