The Probability Density Function - Lab

Introduction

In this lab, we will look at building visualizations known as density plots to estimate the probability density for a given set of data.

Objectives

You will be able to:

  • Plot and interpret density plots and comment on the shape of the plot
  • Estimate probabilities for continuous variables by using interpolation

Let's get started

Let's import the necessary libraries for this lab.

# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')

Import the data, and calculate the mean and the standard deviation

  • Import the dataset 'weight-height.csv' as a pandas dataframe.

  • Next, calculate the mean and standard deviation for weights and heights for men and women individually. You can simply use the pandas .mean() and .std() to do so.

Hint: Use your pandas dataframe subsetting skills like .loc[], .iloc[], and groupby()

data = None
male_df =  None
female_df =  None

# Male Height mean: 69.02634590621737
# Male Height sd: 2.8633622286606517
# Male Weight mean: 187.0206206581929
# Male Weight sd: 19.781154516763813
# Female Height mean: 63.708773603424916
# Female Height sd: 2.696284015765056
# Female Weight mean: 135.8600930074687
# Female Weight sd: 19.022467805319007
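As one possible approach to this step, the sketch below computes the per-gender mean and standard deviation, first with boolean subsetting and then with groupby(). It assumes the CSV has columns named Gender, Height, and Weight (the lab does not state the column names), and it builds a small synthetic dataframe in place of 'weight-height.csv', so its numbers will not match the expected values listed above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for weight-height.csv. The column names 'Gender',
# 'Height', and 'Weight' are assumptions, not confirmed by the lab.
rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "Gender": ["Male"] * n + ["Female"] * n,
    "Height": np.concatenate([rng.normal(69.0, 2.9, n), rng.normal(63.7, 2.7, n)]),
    "Weight": np.concatenate([rng.normal(187.0, 19.8, n), rng.normal(135.9, 19.0, n)]),
})

# Subset the rows for each gender with boolean indexing
male_df = data[data["Gender"] == "Male"]
female_df = data[data["Gender"] == "Female"]

print("Male Height mean:", male_df["Height"].mean())
print("Male Height sd:", male_df["Height"].std())
print("Female Height mean:", female_df["Height"].mean())
print("Female Height sd:", female_df["Height"].std())

# The same statistics for both genders and both columns in one call
summary = data.groupby("Gender")[["Height", "Weight"]].agg(["mean", "std"])
print(summary)
```

With the real file, data would come from pd.read_csv('weight-height.csv') instead of the synthetic dataframe, and the rest of the sketch would stay the same if the assumed column names hold.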

Plot histograms (with densities on the y-axis) for male and female heights

  • Make sure to create overlapping plots
  • Use bins = 10 and set the alpha level so that the overlap between the two histograms is visible
# Your code here
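The requirements above can be sketched as follows. This is a minimal illustration on synthetic heights, not the lab's reference solution: the arrays male_heights and female_heights are hypothetical stand-ins for the height columns of the dataset. Passing density=True puts densities rather than counts on the y-axis.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic heights in inches (hypothetical stand-ins for the male and
# female height columns of the dataset)
rng = np.random.default_rng(1)
male_heights = rng.normal(69.0, 2.9, 1000)
female_heights = rng.normal(63.7, 2.7, 1000)

fig, ax = plt.subplots(figsize=(8, 5))

# density=True rescales the bars so that each histogram has a total area of 1;
# alpha < 1 makes the bars translucent so the overlap stays visible
male_n, male_bins, _ = ax.hist(male_heights, bins=10, density=True,
                               alpha=0.7, label="Male heights")
female_n, female_bins, _ = ax.hist(female_heights, bins=10, density=True,
                                   alpha=0.7, label="Female heights")

ax.set_xlabel("Height")
ax.set_ylabel("Density")
ax.legend()
# In a script (outside a notebook), call plt.show() to display the figure
```

Because each histogram is normalized separately, the two shapes stay comparable even when the groups have different sample sizes.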

# Record your observations - are these in line with your personal observations?

Create a density function using interpolation

  • Write a density function density() that uses interpolation and takes in a random variable
  • Use np.histogram()
  • The function should return two lists carrying x and y coordinates for plotting the density function
def density(x):
    
    pass


# Generate test data and test the function - uncomment to run the test
# np.random.seed(5)
# mu, sigma = 0, 0.1 # mean and standard deviation
# s = np.random.normal(mu, sigma, 100)
# x,y = density(s)
# plt.plot(x,y, label = 'test')
# plt.legend()
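One way to write such a function is sketched below. It is an assumption about what the exercise intends, not the lab's official solution: it takes the normalized bin heights from np.histogram() and places each at the midpoint of its bin, so that joining the points with straight lines gives a piecewise-linear density curve. The bins parameter is an addition for illustration.

```python
import numpy as np

def density(x, bins=10):
    """Return bin midpoints and histogram densities for the sample x."""
    # density=True normalizes the counts so the histogram integrates to 1
    n, edges = np.histogram(x, bins=bins, density=True)
    # Place each density value at the midpoint of its bin
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    # Return plain lists, as the exercise asks for
    return midpoints.tolist(), n.tolist()

# Same test data as in the commented-out test above
np.random.seed(5)
mu, sigma = 0, 0.1
s = np.random.normal(mu, sigma, 100)
x, y = density(s)

# Linear interpolation between the midpoints estimates the density at any
# point inside the range covered by x
estimate_at_zero = np.interp(0.0, x, y)
print("Estimated density at 0:", estimate_at_zero)
```

The returned lists can be passed straight to plt.plot(x, y), which draws the straight-line segments between the midpoints.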

Add overlapping density plots to the histograms plotted earlier

# Your code here 
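A hedged sketch of this step: draw each histogram, then plot the midpoint-based density curve over it on the same axes. The density() helper here is one possible implementation (bin midpoints from np.histogram()), and the height arrays are synthetic stand-ins for the real data.

```python
import numpy as np
import matplotlib.pyplot as plt

def density(x, bins=10):
    # One possible density(): histogram densities placed at the bin midpoints
    n, edges = np.histogram(x, bins=bins, density=True)
    return 0.5 * (edges[:-1] + edges[1:]), n

# Synthetic heights (hypothetical stand-ins for the dataset columns)
rng = np.random.default_rng(2)
samples = {
    "Male": rng.normal(69.0, 2.9, 1000),
    "Female": rng.normal(63.7, 2.7, 1000),
}

fig, ax = plt.subplots(figsize=(8, 5))
for label, sample in samples.items():
    # Histogram and density curve use the same bins, so the curve passes
    # through the top center of each bar
    ax.hist(sample, bins=10, density=True, alpha=0.7, label=f"{label} heights")
    x, y = density(sample)
    ax.plot(x, y, label=f"{label} density")

ax.set_xlabel("Height")
ax.set_ylabel("Density")
ax.legend()
```

Swapping the synthetic arrays for the weight columns gives the corresponding plot for weights.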

Repeat the above exercise for male and female weights

# Your code here 

Write your observations in the cell below

# Record your observations - are these in line with your personal observations?


# What is the takeaway when comparing male and female heights and weights?

Repeat the above experiments in seaborn and compare with your results

# Code for heights here

# Code for weights here

# Your comments on the two approaches here.
# Are they similar? If not, what makes them different?

Summary

In this lab, you learned how to build probability density curves for a given dataset and how to compare distributions visually by looking at their spread, center, and overlap. This is a useful EDA technique that can answer some initial questions before you embark on a more complex analysis.
