Data_Explortion_Pandas_College_Major

Use .head(), .tail(), .shape and .columns to explore your DataFrame and find out the number of rows and columns as well as the column names.
Look for NaN (not a number) values with .isna() and consider using .dropna() to clean up your DataFrame.
You can access entire columns of a DataFrame using the square bracket notation: df['column name'] or df[['column name 1', 'column name 2', 'column name 3']]
You can access individual cells in a DataFrame by chaining square brackets df['column name'][index] or using df['column name'].loc[index]
The largest and smallest values, as well as their positions, can be found with methods like .max(), .min(), .idxmax() and .idxmin()
You can sort the DataFrame with .sort_values() and add new columns with .insert()
To create an Excel Style Pivot Table by grouping entries that belong to a particular category use the .groupby() method
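The exploration steps above can be sketched on a small DataFrame. The column names and values below are made up for illustration and are not taken from the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: names and numbers are invented for this sketch.
df = pd.DataFrame({
    "Major": ["Physics", "History", "Nursing", "Economics", None],
    "Group": ["STEM", "HASS", "STEM", "Business", "HASS"],
    "Starting Salary": [50000.0, 39000.0, 54000.0, 50100.0, np.nan],
    "Mid-Career Salary": [97000.0, 71000.0, 67000.0, 98600.0, np.nan],
})

print(df.head())     # first rows
print(df.shape)      # (number of rows, number of columns)
print(df.columns)    # column names

# Flag missing values, then drop the rows that contain them.
print(df.isna())
clean_df = df.dropna()

# Largest value in a column and the row label where it occurs.
top_idx = clean_df["Starting Salary"].idxmax()
top_major = clean_df["Major"].loc[top_idx]

# Add a new column at position 1, then sort by it.
spread = clean_df["Mid-Career Salary"] - clean_df["Starting Salary"]
clean_df.insert(1, "Spread", spread)
sorted_df = clean_df.sort_values("Spread", ascending=False)

# Pivot-table style summary: one row per category.
group_means = clean_df.groupby("Group")["Starting Salary"].mean()
print(group_means)
```

Using .loc with the label returned by .idxmax() avoids chained indexing, which can behave unexpectedly when assigning values.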

Programming_Languages_using_matplotlib

used .groupby() to explore the number of posts and entries per programming language
converted strings to Datetime objects with to_datetime() for easier plotting
reshaped our DataFrame by converting categories to columns using .pivot()
used .count() and isna().values.any() to look for NaN values in our DataFrame, which we then replaced using .fillna()
created (multiple) line charts using .plot() with a for-loop
styled our charts by changing the size, the labels, and the upper and lower bounds of our axes.
added a legend to tell apart which line is which by colour
smoothed out our time-series observations with .rolling().mean() and plotted them to better identify trends over time.
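As a rough sketch of the data-preparation steps in this section (plotting is left out here), assuming a long-format table with hypothetical DATE, TAG and POSTS columns:

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, language) pair.
df = pd.DataFrame({
    "DATE": ["2020-01-01 00:00:00", "2020-01-01 00:00:00",
             "2020-02-01 00:00:00", "2020-03-01 00:00:00",
             "2020-03-01 00:00:00"],
    "TAG": ["python", "java", "python", "python", "java"],
    "POSTS": [10, 8, 14, 18, 6],
})

# Strings to Datetime objects.
df["DATE"] = pd.to_datetime(df["DATE"])

# Total posts and number of entries per language.
posts_per_tag = df.groupby("TAG")["POSTS"].sum()
entries_per_tag = df.groupby("TAG")["POSTS"].count()

# Reshape: one column per language, one row per date.
reshaped = df.pivot(index="DATE", columns="TAG", values="POSTS")

# java has no entry for February, so the pivot contains a NaN.
has_nan = reshaped.isna().values.any()
reshaped = reshaped.fillna(0)

# Two-period rolling average to smooth the series.
rolling = reshaped.rolling(window=2).mean()
print(rolling)
```

The first row of the rolling average is NaN because a window of two needs two observations.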

Lego_Anaylsis

use HTML and Markdown in Notebooks, such as section headings with # and embedding images with the <img> tag.
combine the groupby() and count() functions to aggregate data
use the .value_counts() function
slice DataFrames using the square bracket notation e.g., df[:-2] or df[:10]
use the .agg() function to run an operation on a particular column
rename() columns of DataFrames
create a line chart with two separate axes to visualise data that have different scales.
create a scatter plot in Matplotlib
work with tables in a relational database by using primary and foreign keys
.merge() DataFrames along a particular column
create a bar chart with Matplotlib
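The aggregation and merge steps above might look like the following. The two tables, their columns and their values are hypothetical: "themes" uses "id" as its primary key and "sets" refers to it through the foreign key "theme_id".

```python
import pandas as pd

# Hypothetical tables linked by a primary key (themes.id) and a foreign key (sets.theme_id).
themes = pd.DataFrame({"id": [1, 2, 3], "name": ["Technic", "City", "Space"]})
sets = pd.DataFrame({
    "set_num": ["a-1", "b-1", "c-1", "d-1", "e-1"],
    "year": [1999, 1999, 2005, 2005, 2005],
    "theme_id": [1, 2, 2, 3, 2],
    "num_parts": [120, 80, 200, 150, 90],
})

# groupby() + count(): number of sets released per year.
sets_per_year = sets.groupby("year")["set_num"].count()

# agg(): number of distinct themes per year, then rename the column.
themes_per_year = sets.groupby("year").agg({"theme_id": pd.Series.nunique})
themes_per_year = themes_per_year.rename(columns={"theme_id": "nr_themes"})

# value_counts(): how many sets each theme has, as a DataFrame ready to merge.
counts = sets["theme_id"].value_counts()
set_theme_count = pd.DataFrame({"id": counts.index, "set_count": counts.values})

# merge(): attach theme names via the shared "id" column.
merged = pd.merge(set_theme_count, themes, on="id")

# Slicing: every row except the last two.
all_but_last_two = sets[:-2]
print(merged)
```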

Google Trends and Data Visualisation

How to use .describe() to quickly see some descriptive statistics at a glance.
How to use .resample() to make one time series comparable to another by changing its periodicity.
How to work with matplotlib.dates Locators to better style a timeline (e.g., an axis on a chart).
How to find the number of NaN values with .isna().values.sum()
How to change the resolution of a chart using the figure's dpi
How to create dashed '--', dash-dot '-.' and dotted ':' lines using linestyles
How to use different kinds of markers (e.g., 'o' or '^') on charts.
Fine-tuning the styling of Matplotlib charts by using limits, labels, linewidth and colours (both in the form of named colours and HEX codes).
Using .grid() to help visually identify seasonality in a time series.
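A minimal sketch of these steps, assuming a made-up daily price series and a made-up monthly search-interest series. The Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (an assumption for headless runs)
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: a daily series and a monthly series covering the same year.
days = pd.date_range("2020-01-01", "2020-12-31", freq="D")
daily = pd.DataFrame({"PRICE": np.linspace(100, 200, len(days))}, index=days)
months = pd.date_range("2020-01-01", periods=12, freq="MS")
search = pd.DataFrame(
    {"SEARCH": [20, 25, np.nan, 40, 38, 45, 50, 48, 60, 65, 70, 80]},
    index=months,
)

print(daily.describe())                  # descriptive statistics at a glance
missing = search.isna().values.sum()     # number of NaN values
search = search.dropna()

# Daily to monthly, so both series share the same periodicity.
monthly = daily.resample("MS").last()

# Two y-axes for two different scales; dpi sets the resolution.
fig, ax1 = plt.subplots(figsize=(8, 4), dpi=120)
ax2 = ax1.twinx()
ax1.plot(monthly.index, monthly["PRICE"], color="#E6232E",
         linewidth=2, linestyle="--")
ax2.plot(search.index, search["SEARCH"], color="skyblue",
         linewidth=2, marker="o")

ax1.set_xlabel("Month")
ax1.set_ylabel("Price", color="#E6232E")
ax2.set_ylabel("Search interest", color="skyblue")
ax1.set_ylim(0, 250)

# Locators control where ticks appear on the timeline.
ax1.xaxis.set_major_locator(mdates.YearLocator())
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
ax1.xaxis.set_minor_locator(mdates.MonthLocator())

ax1.grid(color="grey", linestyle=":")
```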

Google Play Store App Analytics

Pull a random sample from a DataFrame using .sample()
How to find duplicate entries with .duplicated() and .drop_duplicates()
How to convert string and object data types into numbers with pd.to_numeric()
How to use plotly to generate beautiful pie, donut, and bar charts as well as box and scatter plots
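The cleaning steps in this section can be sketched as follows (charts are left out). The app names, install counts and prices are invented for illustration.

```python
import pandas as pd

# Hypothetical app listing with a duplicate row and numbers stored as strings.
apps = pd.DataFrame({
    "App": ["Notes", "Notes", "Maps", "Chess", "Radio"],
    "Installs": ["1,000", "1,000", "50,000", "500", "10,000"],
    "Price": ["$0", "$0", "$1.99", "$0.99", "$0"],
})

# Random sample of rows; random_state makes the draw repeatable.
print(apps.sample(2, random_state=1))

# Rows that repeat an earlier row, then a copy with duplicates removed.
dupes = apps[apps.duplicated()]
apps_clean = apps.drop_duplicates(subset=["App"]).copy()

# Strip formatting characters, then convert the strings to numbers.
apps_clean["Installs"] = pd.to_numeric(
    apps_clean["Installs"].str.replace(",", "", regex=False)
)
apps_clean["Price"] = pd.to_numeric(
    apps_clean["Price"].str.replace("$", "", regex=False)
)
print(apps_clean.dtypes)
```

Note that to_numeric is a top-level pandas function called as pd.to_numeric(series), not a DataFrame method.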

Computation_with_NumPy_and_N_Dimensional_Arrays

Create arrays manually with np.array()
Generate arrays using np.arange(), np.random.random(), and np.linspace()
Analyse the shape and dimensions of an ndarray
Slice and subset an ndarray based on its indices
Do linear-algebra-like operations with scalars and matrix multiplication
Use NumPy's broadcasting to make ndarray shapes compatible
Manipulate images in the form of ndarrays
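A short sketch of the array operations listed above. The "image" is a random array standing in for a real picture, and the greyscale weights are the commonly used sRGB luminance coefficients, included here as an assumption.

```python
import numpy as np

# Create arrays manually and inspect their dimensions and shape.
matrix = np.array([[1, 2, 3, 9], [5, 6, 7, 8]])
print(matrix.ndim, matrix.shape)

# Generate arrays.
a = np.arange(10, 30)               # integers 10 through 29
x = np.linspace(0, 100, num=9)      # 9 evenly spaced values from 0 to 100
noise = np.random.random((3, 3, 3)) # uniform random values in [0, 1)

# Slice and subset by index.
last_three = a[-3:]
single_value = matrix[1, 2]
first_row = matrix[0, :]

# Broadcasting: a scalar or a 1-D array is stretched to match the 2-D shape.
shifted = matrix + 10
masked = matrix * np.array([1, 0, 1, 0])

# Matrix multiplication: (4, 2) @ (2, 3) gives (4, 3).
a1 = np.array([[1, 3], [0, 1], [6, 2], [9, 7]])
b1 = np.array([[4, 1, 3], [5, 8, 5]])
product = a1 @ b1

# Hypothetical 4x6 RGB image as an ndarray of shape (height, width, 3).
img = np.random.randint(0, 256, size=(4, 6, 3), dtype=np.uint8)
grey = (img / 255) @ np.array([0.2126, 0.7152, 0.0722])  # weighted sum over channels
flipped = np.flip(img, axis=1)                           # mirror left to right
inverted = 255 - img                                     # invert the colours
```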

Seaborn_and_Linear_Regression

Use nested loops to remove unwanted characters from multiple columns
Filter Pandas DataFrames based on multiple conditions using both .loc[] and .query()
Create bubble charts using the Seaborn Library
Style Seaborn charts using the pre-built styles and by modifying Matplotlib parameters
Use floor division (i.e., integer division) to convert years to decades
Use Seaborn to superimpose a linear regression over our data
Make a judgement if our regression is good or bad based on how well the model fits our data and the r-squared metric
Run regressions with scikit-learn and calculate the coefficients.
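The cleaning, filtering and regression steps above could be sketched like this (charts are left out). The film data is invented, and the column names are assumptions rather than the project's actual schema.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical film data with currency strings.
films = pd.DataFrame({
    "Year": [1995, 1999, 2004, 2008, 2012, 2016],
    "Budget": ["$10,000,000", "$20,000,000", "$40,000,000",
               "$60,000,000", "$80,000,000", "$100,000,000"],
    "Revenue": ["$25,000,000", "$55,000,000", "$90,000,000",
                "$150,000,000", "$170,000,000", "$230,000,000"],
})

# Nested loops: for each column, remove each unwanted character.
for col in ["Budget", "Revenue"]:
    for char in ["$", ","]:
        films[col] = films[col].str.replace(char, "", regex=False)
    films[col] = pd.to_numeric(films[col])

# Floor division drops the last digit of the year; multiplying restores the decade.
films["Decade"] = films["Year"] // 10 * 10

# The same multi-condition filter written with .loc[] and with .query().
recent_hits = films.loc[(films["Year"] >= 2000) & (films["Revenue"] > 100_000_000)]
same_hits = films.query("Year >= 2000 and Revenue > 100000000")

# Linear regression of revenue on budget.
X = films[["Budget"]]   # features must be 2-D
y = films["Revenue"]
model = LinearRegression().fit(X, y)
slope = model.coef_[0]
intercept = model.intercept_
r_squared = model.score(X, y)
print(slope, intercept, r_squared)
```

With .loc[], each condition needs its own parentheses because & binds more tightly than the comparison operators.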

Nobel_Prize_Analysis

How to uncover and investigate NaN values.
How to convert objects and string data types to numbers.
Creating donut and bar charts with plotly.
Create a rolling average to smooth out time-series data and show a trend.
How to use .value_counts(), .groupby(), .merge(), .sort_values() and .agg().
Create a Choropleth to display data on a map.
Create bar charts showing different segments of the data with plotly.
Create Sunburst charts with plotly.
Use Seaborn's .lmplot() and show best-fit lines across multiple categories using the row, hue, and lowess parameters.
Understand how a different picture emerges when looking at the same data in different ways (e.g., box plots vs a time series analysis).
See the distribution of our data and visualise descriptive statistics with the help of a histogram in Seaborn.

Dr_Semmelweis_Handwashing_Discovery

How to use histograms to visualise distributions
How to superimpose histograms on top of each other even when the data series have different lengths
How to smooth out kinks in a histogram and visualise a distribution with a Kernel Density Estimate (KDE)
How to improve a KDE by specifying boundaries on the estimates
How to use scipy and test for statistical significance by looking at p-values.
How to highlight different parts of a time series chart in Matplotlib.
How to add and configure a Legend in Matplotlib.
Use NumPy's .where() function to process elements depending on a condition.
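As an illustration of the np.where() and significance-testing steps, here is a sketch on invented monthly figures. The cut-off date and all counts are hypothetical, and an independent two-sample t-test from scipy is used as one possible test.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical monthly records: values are made up for this sketch.
monthly = pd.DataFrame({
    "date": pd.date_range("1847-01-01", periods=10, freq="MS"),
    "births": [300, 310, 290, 305, 295, 300, 310, 290, 305, 295],
    "deaths": [30, 33, 28, 31, 29, 6, 5, 7, 4, 6],
})
monthly["pct_deaths"] = monthly["deaths"] / monthly["births"]

# np.where(): label each row depending on a condition.
start = pd.Timestamp("1847-06-01")  # assumed cut-off date
monthly["washing"] = np.where(monthly["date"] < start, "No", "Yes")

before = monthly.loc[monthly["washing"] == "No", "pct_deaths"]
after = monthly.loc[monthly["washing"] == "Yes", "pct_deaths"]

# Two-sample t-test: a small p-value suggests the difference in means
# is unlikely to be due to chance alone.
t_stat, p_value = stats.ttest_ind(before, after)
print(t_stat, p_value)
```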

arinjain373 / data-analysis-and-visualization
