In this exercise we will learn how to implement linear regression using only SQL.
First create the database using the raw file before attempting the following:
- Data for regression
- Regression coefficients
- Linear time trend
- Statistical significance test
- Orthogonal contrast codes
- Linear time trend and seasonality
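
As a rough illustration of the "Regression coefficients" step, the sketch below computes the slope and intercept of a simple one-predictor regression in plain SQL. It is not the repository's actual code: the table name `points`, the columns `x` and `y`, and the toy data are all hypothetical, and SQLite (via Python's built-in `sqlite3` module) is assumed only so that the example runs on its own. The SQL applies the usual closed-form estimates: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x).

```python
import sqlite3

# Hypothetical toy data lying exactly on y = 2x + 1 (not the repository's data).
rows = [(1, 3.0), (2, 5.0), (3, 7.0), (4, 9.0), (5, 11.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL)")  # assumed schema
conn.executemany("INSERT INTO points VALUES (?, ?)", rows)

query = """
WITH means AS (
    -- Sample means of the predictor and the response
    SELECT AVG(x) AS x_bar, AVG(y) AS y_bar FROM points
),
sums AS (
    -- Sums of cross-products (sxy) and squared deviations (sxx)
    SELECT
        SUM((x - x_bar) * (y - y_bar)) AS sxy,
        SUM((x - x_bar) * (x - x_bar)) AS sxx,
        MAX(x_bar) AS x_bar,
        MAX(y_bar) AS y_bar
    FROM points CROSS JOIN means
)
SELECT
    sxy / sxx                   AS slope,
    y_bar - (sxy / sxx) * x_bar AS intercept
FROM sums
"""

slope, intercept = conn.execute(query).fetchone()
print(slope, intercept)
```

Because the toy data sits exactly on a line, the query should recover that line's slope and intercept. Other database engines may need small syntax adjustments.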
The R code file is also shared so that you can run the final analysis and calculate the parameters needed to understand the output of the regression from this exercise.
The data for the experiment is available inside the 'data' folder in this repository. The output from each step listed above will also be shared in the same folder.
I came across this idea through an online article by Julia Glick on mode.com. I have uploaded the reading material to the repository. Given the size of the file, you must download the folder to access it. I don't claim exclusive rights to the technique, but it is an impressive take on statistics using SQL.
The following answers on Stack Overflow were quite helpful in understanding the execution in SQL: