
codenummy_correlationcoefficient's Introduction

Overview

This is a Code Nummy about the correlation coefficient.

Theory

Correlation

Knowledge about correlation is a powerful tool in maths, statistics and data analysis. You have two (seemingly unrelated?) measures x and y. How strongly does an increase in x go along with an increase in y? This is the question that is answered by the correlation between x and y.

This leads to all sorts of interesting as well as entertaining observations [1, 2, 3]. But remember,

"Correlation does not mean causation!" (Wikipedia)

But to be able to show off this sentence in front of your friends, you need to understand how this correlation thing works internally and how to calculate it.

Assume a set of x and y value pairs, for example the number of nuclear power plants per year and the number of swimming pool drownings per year. Are they correlated or not?

The correlation coefficient r will answer this question. It is a value in the range [-1, 1], where 0 means "no linear relationship", 1 means "perfectly linearly related" (y rises as x rises), and -1 means "perfectly linearly related in the opposite direction" (y falls as x rises). It is calculated as follows:
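
A standard way to write Pearson's correlation coefficient for n value pairs (x_i, y_i) is the computational form below. It is assumed here because it uses exactly the sums that the exercise asks for (sum, sum of squares, sum of products); the symbol n for the number of pairs is introduced for illustration.

```latex
r = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
         {\sqrt{\left( n \sum_{i=1}^{n} x_i^2 - \Bigl( \sum_{i=1}^{n} x_i \Bigr)^2 \right)
                \left( n \sum_{i=1}^{n} y_i^2 - \Bigl( \sum_{i=1}^{n} y_i \Bigr)^2 \right)}}
```

Note that r is undefined when all x values or all y values are identical, because the denominator is then zero.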

Exercise

  • implement the function calculate_sum(values) in src/correlation, which calculates the sum of the values
  • implement the function calculate_sum_of_squares(values) in src/correlation, which calculates the sum of the squared values
  • implement the function calculate_sum_of_multiplies in src/correlation, which calculates the sum of the pairwise products of the x and y values
  • now implement the function correlation in src/correlation, which calculates the value r by using all the previously defined functions
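
One possible shape for the four functions, as a minimal pure-Python sketch rather than the reference solution. It assumes the computational form of Pearson's r (built from the sum, sum of squares and sum of products). The argument lists of calculate_sum_of_multiplies and correlation are not given in the exercise, so the (xs, ys) parameters are assumptions.

```python
import math


def calculate_sum(values):
    """Return the sum of all values."""
    return sum(values)


def calculate_sum_of_squares(values):
    """Return the sum of the squared values."""
    return sum(v * v for v in values)


def calculate_sum_of_multiplies(xs, ys):
    """Return the sum of the pairwise products x_i * y_i.

    The two-argument signature is an assumption; the exercise does not state it.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return sum(x * y for x, y in zip(xs, ys))


def correlation(xs, ys):
    """Return Pearson's r for the value pairs (xs[i], ys[i])."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")

    sum_x = calculate_sum(xs)
    sum_y = calculate_sum(ys)
    sum_xx = calculate_sum_of_squares(xs)
    sum_yy = calculate_sum_of_squares(ys)
    sum_xy = calculate_sum_of_multiplies(xs, ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    if denominator == 0:
        # All x values or all y values are identical: r is undefined.
        raise ValueError("correlation is undefined for constant input")
    return numerator / denominator
```

With made-up data, correlation([1, 2, 3, 4], [2, 4, 6, 8]) is 1 (y always rises with x) and correlation([1, 2, 3], [3, 2, 1]) is -1 (y always falls as x rises), up to floating-point rounding.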

Hints for C++

std::accumulate and std::inner_product can prove helpful.

Hints for Python

np.multiply can prove helpful.
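
As a small illustration of that hint, np.multiply multiplies two arrays element by element, so summing its result gives the sum of products. The helper name sum_of_products and the sample values are made up for this sketch, and NumPy is assumed to be installed.

```python
import numpy as np


def sum_of_products(xs, ys):
    """Sum of the element-wise products of xs and ys (hypothetical helper)."""
    products = np.multiply(xs, ys)  # element-wise: [x0*y0, x1*y1, ...]
    return float(products.sum())


# Made-up sample values: 1*4 + 2*5 + 3*6
print(sum_of_products([1, 2, 3], [4, 5, 6]))
```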

Application

Think of any measurement of two (possibly?) related values that you can easily perform on your own. Some ideas:

  • grab some books from your bookshelf and measure the width and height of each book
  • the width and length of individual strands of spaghetti from a pack
  • pick two measures from csgostats or league of graphs
  • stock market example: S&P500 and bitcoin value in USD
  • any of the examples from [1, 2, 3]

Further Reading and references
