GRIP-Tasks

Graduate Rotational Internship Program @The Sparks Foundation

These tasks were completed by Marisha Bhatti as part of the July 2021 batch of the internship program. The projects fall mainly under the domain of Data Science & Business Analytics.

Task 1: Prediction using Supervised ML.
- Simple Linear Regression Task. Predict the percentage score of a student based on the number of study hours.
- The dataset is available at http://bit.ly/w-data.
- Mean Absolute Error: 4.183859899002982
- R2 Score: 0.9454906892105354
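
The fit and the two metrics reported above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the function names and the (hours, score) pairs are made up for the example, and the notebook itself may well use a library such as scikit-learn.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def r2_score(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up (hours, score) pairs for illustration only; not the README's dataset.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [12.0, 22.0, 32.0, 42.0, 52.0]

slope, intercept = fit_line(hours, scores)
predicted = [slope * h + intercept for h in hours]
print(slope, intercept)
print(mean_absolute_error(scores, predicted), r2_score(scores, predicted))
```

Because the toy data lies exactly on a line, the error here is essentially zero; on real data the metrics would normally be computed on a held-out test split.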

Task 2: Prediction using Unsupervised ML.
- K-Means Clustering Task. Predict the optimum number of clusters and represent it visually.
- Dataset used is Iris.csv (can also be imported from sklearn.datasets).
- This notebook has two parts. The first part uses a K-Nearest Neighbors (KNN) model to perform a simple multi-class classification task (Steps 1-6). The second part tackles the unsupervised machine learning problem using a K-Means Clustering model (Steps 7-9).
- The KNN model has an accuracy of 0.9736842105263158
- From the K-Means model we find that the optimum number of clusters is 3.
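
One common way to choose the number of clusters is the elbow method: run K-Means for several values of k and look for the point where the within-cluster sum of squares (inertia) stops dropping sharply. The README does not say which method the notebook uses, so the sketch below is an assumption. It is written in plain Python with made-up points and a deterministic farthest-point seeding rather than a library implementation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point that is farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centers, inertia)."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia


# Made-up 2-D points forming three well-separated groups.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

inertias = [kmeans(points, k)[1] for k in range(1, 5)]
for k, value in enumerate(inertias, start=1):
    print(k, round(value, 2))
```

On this toy data the inertia falls steeply up to k = 3 and only marginally afterwards, which is the "elbow" one would look for in a plot of inertia against k.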

Task 3: Exploratory Data Analysis - Retail
- Finding out the weak areas where more profit can be made.
- Dataset used is SampleSuperstore.csv (to view the dataset on GitHub, select "View raw").
- Category-wise:
  - Highest Profit: Furniture
  - Lowest Profit: Technology
  - Maximum Sales in Category: Technology
- Sub-Category-wise:
  - Highest Profit: Copiers
  - Lowest Profit: Tables
  - Top 3 High Discount Products: Binders, Machines, Tables
- State-wise:
  - Average number of deals per state: 203.9591836734694
  - Highest Profit: Vermont
  - Lowest Profit: Ohio
  - Highest amount of Sales: Wyoming
- City-wise:
  - Highest Profit: Jamestown
  - Lowest Profit: Bethlehem
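
Rankings like the ones above come from grouping rows by a column and summing a numeric column per group. The sketch below shows that pattern with the standard library only. The column names (`Category`, `Sales`, `Profit`) are assumed to match SampleSuperstore.csv, the helper `total_by` is hypothetical, and the rows are made up, so the output does not reproduce the findings listed here.

```python
import csv
import io
from collections import defaultdict


def total_by(rows, key, value):
    """Sum the numeric column `value` for each distinct entry in column `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)


# Made-up rows standing in for SampleSuperstore.csv; column names are assumed.
sample_csv = io.StringIO(
    "Category,Sales,Profit\n"
    "Group A,100.0,20.0\n"
    "Group B,300.0,-5.0\n"
    "Group A,50.0,10.0\n"
    "Group C,200.0,40.0\n"
)
rows = list(csv.DictReader(sample_csv))

profit = total_by(rows, "Category", "Profit")
sales = total_by(rows, "Category", "Sales")

highest_profit = max(profit, key=profit.get)
lowest_profit = min(profit, key=profit.get)
highest_sales = max(sales, key=sales.get)
print(highest_profit, lowest_profit, highest_sales)
```

The same helper works for the sub-category, state and city breakdowns by changing the `key` argument, provided those columns exist in the file.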

Task 4: Exploratory Data Analysis - Terrorism
- Finding out the hot zone of terrorism.
- The dataset can be downloaded from https://www.kaggle.com/START-UMD/gtd.
- Middle East & North Africa has the most terrorist attacks. South Asia has the second most.
- Iraq has the most terrorist attacks in the Middle East. Pakistan, Afghanistan and India are the Top 3 in South Asia.
- Iraq, Pakistan and Afghanistan are the Top 3 countries with the most terrorist attacks.
- Eastern Europe, Middle East, South Asia, Southeast Asia and Sub-Saharan Africa have seen a huge increase in terrorist attacks since 2001, whereas other regions have seen a decrease.

Task 5: Exploratory Data Analysis - Sports
- Finding out the most successful teams and players, and the factors contributing to a team's win or loss.
- Datasets used are matches.csv and deliveries.csv.
- Mumbai Indians, Chennai Super Kings and Kolkata Knight Riders are the top three teams with the most wins.
- Top 3 Players based on Player of the Match Awards: Chris Gayle, AB de Villiers, MS Dhoni.
- Top Batsmen: Virat Kohli, SK Raina, Rohit Sharma.
- Top Bowlers: TG Southee, AD Mathews, SK Raina.
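
"Top N" lists such as these are frequency counts over a column. A minimal sketch with `collections.Counter` follows. The field names `winner` and `player_of_match` are assumptions about matches.csv, and the match records are made up for illustration.

```python
from collections import Counter

# Made-up match records; "winner" and "player_of_match" are assumed column names.
matches = [
    {"winner": "Team X", "player_of_match": "Player 1"},
    {"winner": "Team Y", "player_of_match": "Player 2"},
    {"winner": "Team X", "player_of_match": "Player 1"},
    {"winner": "Team Z", "player_of_match": "Player 3"},
    {"winner": "Team X", "player_of_match": "Player 2"},
    {"winner": "Team Y", "player_of_match": "Player 1"},
]

# Count how often each team won and how often each player took the award.
wins = Counter(m["winner"] for m in matches)
awards = Counter(m["player_of_match"] for m in matches)

top_teams = wins.most_common(2)
top_players = awards.most_common(2)
print(top_teams)
print(top_players)
```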

Task 6: Prediction using Decision Tree Algorithm.
- Decision Tree Classifier Task. Once trained, the classifier should be able to predict the class of new, unseen data.
- Dataset used is Iris.csv (can also be imported from sklearn.datasets).
- Mean of Cross Validation Score: 0.9466666666666667
- Standard Deviation of Cross Validation Score: 0.04521553322083511
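
The mean and standard deviation above summarise per-fold accuracies from k-fold cross-validation. The sketch below illustrates the procedure in plain Python under several assumptions: the model is a depth-1 decision tree (a "stump") on a single feature rather than a full tree, folds are formed by taking every k-th sample, and the data is made up. None of this is taken from the notebook.

```python
from statistics import mean, pstdev


def fit_stump(xs, ys):
    """Depth-1 decision tree on one numeric feature: choose the threshold
    with the fewest training errors and return a predict function."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        # Majority label on each side of the split.
        left_label = max(sorted(set(left)), key=left.count)
        right_label = max(sorted(set(right)), key=right.count)
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return lambda x: left_label if x <= t else right_label


def cross_val_scores(xs, ys, k):
    """Accuracy on each of k folds; fold i holds out samples i, i+k, i+2k, ..."""
    scores = []
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in held_out]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in held_out]
        predict = fit_stump([x for x, _ in train], [y for _, y in train])
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return scores


# Made-up, cleanly separable data: small values are class "a", large are "b".
xs = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]
ys = ["a"] * 5 + ["b"] * 5

scores = cross_val_scores(xs, ys, k=5)
print(scores, mean(scores), pstdev(scores))
```

`pstdev` is used because NumPy's `std` also defaults to the population standard deviation; if the notebook used a different estimator, the reported spread would differ slightly.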

Task 7: Stock Market Prediction using Numerical and Textual analysis.
- Create a hybrid model for stock price/performance prediction using numerical analysis of historical stock prices and sentiment analysis of news headlines.
- The historical stock price dataset can be downloaded from finance.yahoo.com, or the SENSEX.csv file in the repository can be used for numerical analysis. Textual (news) data can be downloaded from https://bit.ly/36fFPI6 or https://www.kaggle.com/therohk/india-headlines-news-dataset?select=india-news-headlines.csv.
- Mean Absolute Error: 0.5019762845849802
- Mean Squared Error: 0.5019762845849802
- Root Mean Squared Error: 0.7085028472666713
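
The three metrics above are related: the reported RMSE is the square root of the reported MSE. MAE and MSE coincide whenever every individual error is exactly 0 or ±1, for example when predicting 0/1 labels. Whether that is the situation in the notebook is an assumption. The sketch below uses made-up labels to show the effect.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Made-up 0/1 labels: every error is 0 or 1, so |e| == e**2 for each sample.
y_true = [1, 0, 1, 1]
y_pred = [1, 1, 0, 1]

print(mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred))
```

With continuous predictions the absolute and squared errors differ, so MAE and MSE would generally not match.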
