
Amir Helmy Shawky's Projects

analyzing-ab-test-results

# Analyzing-AB-Test-Results

Understanding the results of an A/B test run by an e-commerce website, to help the company decide whether to implement the new web page, keep the old web page, or run the experiment longer before making a decision.
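One common way to frame the "new page vs. old page" decision is a one-sided two-proportion z-test on conversion rates. The sketch below is only an illustration under that assumption; it is not taken from the repository. The function name `two_proportion_z_test`, the visitor and conversion counts, and the 0.05 significance level are all hypothetical.

```python
import math


def two_proportion_z_test(conv_old, n_old, conv_new, n_new):
    """One-sided z-test of H0: p_new <= p_old against H1: p_new > p_old.

    Returns the z statistic and the one-sided p-value.
    """
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    # Pooled conversion rate under the null hypothesis of equal rates.
    p_pool = (conv_old + conv_new) / (n_old + n_new)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / se
    # Upper-tail probability of the standard normal: P(Z > z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts, made up for illustration only.
z, p_value = two_proportion_z_test(conv_old=1200, n_old=10000,
                                   conv_new=1260, n_new=10000)

alpha = 0.05  # assumed significance level
if p_value < alpha:
    decision = "implement the new page"
else:
    decision = "keep the old page or run the experiment longer"

print(f"z = {z:.3f}, p-value = {p_value:.3f}, decision: {decision}")
```

A small p-value would favor the new page; a large one leaves the choice between keeping the old page and collecting more data, which depends on how much traffic the experiment has seen so far.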

bigmart-sales

### PROBLEM STATEMENT

The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and find out the sales of each product at a particular store. Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.

### DATA

* Item_Identifier : Unique product ID
* Item_Weight : Weight of product
* Item_Fat_Content : Whether the product is low fat or not
* Item_Visibility : The % of total display area of all products in a store allocated to the particular product
* Item_Type : The category to which the product belongs
* Item_MRP : Maximum Retail Price (list price) of the product
* Outlet_Identifier : Unique store ID
* Outlet_Establishment_Year : The year in which store was established
* Outlet_Size : The size of the store in terms of ground area covered
* Outlet_Location_Type : The type of city in which the store is located
* Outlet_Type : Whether the outlet is just a grocery store or some sort of supermarket
* Item_Outlet_Sales : Sales of the product in the particular store.

breast-cancer-prediction

Breast Cancer (Diagnostic) Data Set

Task: To predict whether the cancer is benign or malignant.

What Are the Symptoms of Breast Cancer?

* New lump in the breast or underarm (armpit).
* Thickening or swelling of part of the breast.
* Irritation or dimpling of breast skin.
* Redness or flaky skin in the nipple area or the breast.
* Pulling in of the nipple or pain in the nipple area.
* Nipple discharge other than breast milk, including blood.

ford-gobike-data-analysis

This is an analysis of GoBike Data to highlight COVID-19's impact, to gain insights about users and to help the company target offers in certain stations.

identifying-safe-loans-with-decision-trees

Identifying safe loans with decision trees

The LendingClub is a peer-to-peer lending company that directly connects borrowers and potential lenders/investors. In this notebook, you will build a classification model to predict whether or not a loan provided by LendingClub is likely to default. You will use data from the LendingClub to predict whether a loan will be paid off in full or will be charged off and possibly go into default.

In this assignment you will:

* Use SFrames to do some feature engineering.
* Train a decision tree on the LendingClub dataset.
* Visualize the tree.
* Predict whether a loan will default, along with prediction probabilities (on a validation set).
* Train a complex tree model and compare it to a simple tree model.
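To make the idea of a simple tree concrete, here is a minimal pure-Python sketch of a one-level decision tree (a decision stump). It picks the binary feature with the lowest weighted classification error and reads prediction probabilities off the resulting leaves. It is a stand-in for illustration, not the notebook's SFrame-based code. The feature names `grade_A` and `short_term`, the `safe` target (+1 safe, -1 risky), and the six toy rows are all hypothetical.

```python
from collections import Counter


def classification_error(labels):
    """Fraction of labels that differ from the majority label."""
    if not labels:
        return 0.0
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1 - majority_count / len(labels)


def best_split(rows, features, target):
    """Return the binary feature with the lowest weighted error, and that error."""
    best_feature, best_error = None, float("inf")
    for feature in features:
        left = [r[target] for r in rows if r[feature] == 0]
        right = [r[target] for r in rows if r[feature] == 1]
        error = (len(left) * classification_error(left)
                 + len(right) * classification_error(right)) / len(rows)
        if error < best_error:
            best_feature, best_error = feature, error
    return best_feature, best_error


def leaf_probability(rows, target, positive=1):
    """Share of rows in a leaf that carry the positive (safe) label."""
    return sum(1 for r in rows if r[target] == positive) / len(rows)


# Toy loans, made up for illustration (+1 = safe, -1 = risky).
loans = [
    {"grade_A": 1, "short_term": 1, "safe": 1},
    {"grade_A": 1, "short_term": 0, "safe": 1},
    {"grade_A": 1, "short_term": 1, "safe": 1},
    {"grade_A": 0, "short_term": 1, "safe": -1},
    {"grade_A": 0, "short_term": 0, "safe": -1},
    {"grade_A": 0, "short_term": 1, "safe": 1},
]

feature, error = best_split(loans, ["grade_A", "short_term"], "safe")
p_safe_if_1 = leaf_probability([r for r in loans if r[feature] == 1], "safe")
p_safe_if_0 = leaf_probability([r for r in loans if r[feature] == 0], "safe")
print(feature, round(error, 3), p_safe_if_1, round(p_safe_if_0, 3))
```

A deeper ("complex") tree would repeat the same split search recursively inside each leaf, which typically lowers training error but can hurt accuracy on a validation set.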

leetcode

Collection of LeetCode questions to ace the coding interview! - Created using [LeetHub](https://github.com/QasimWani/LeetHub)

loan-prediction

# Loan Approval Prediction:

### EDA + Decision Tree, Random Forest & Logistic Regression Modeling

## Introduction

We are going to work on a **binary classification problem**: given some information about a sample of people, we need to predict whether or not someone should be given a loan. The sample is small (614 rows), so we will use machine learning techniques to solve the problem.

### Problem Statement:

__About Company__ <br>
Dream Housing Finance company deals in all home loans. They have a presence across all urban, semi-urban and rural areas. A customer first applies for a home loan, after which the company validates the customer's eligibility for the loan.

__Problem__ <br>
The company wants to automate the loan eligibility process (in real time) based on the customer details provided in the online application form. These details are Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History and others. To automate this process, they have posed the problem of identifying the customer segments that are eligible for a loan amount, so that they can specifically target these customers. Here they have provided a partial data set.

#### Dataset Description:

| Variable | Description |
|------|------|
| Loan_ID | Unique Loan ID |
| Gender | Male/ Female |
| Married | Applicant married (Y/N) |
| Dependents | Number of dependents |
| Education | Applicant Education (Graduate/ Under Graduate) |
| Self_Employed | Self employed (Y/N) |
| ApplicantIncome | Applicant income |
| CoapplicantIncome | Coapplicant income |
| LoanAmount | Loan amount in thousands |
| Loan_Amount_Term | Term of loan in months |
| Credit_History | Credit history meets guidelines |
| Property_Area | Urban/ Semi Urban/ Rural |
| Loan_Status | Loan approved (Y/N) |

men-s-shoe-pricess

Context

A list of 10,000 men's shoes and the various prices at which they are sold.

Content

This is a list of 10,000 men's shoes provided by Datafiniti's Product Database. The dataset includes shoe name, brand, price, and more. Each shoe will have an entry for each price found for it, and some shoes may have multiple entries. Note that this is a sample of a large dataset. The full dataset is available through Datafiniti.

What You Can Do with This Data

You can use this data to determine brand markups, pricing strategies, and trends for luxury shoes. E.g.:

* What is the average price of each distinct brand listed?
* Which brands have the highest prices?
* Which ones have the widest distribution of prices?
* Is there a typical price distribution (e.g., normal) across brands or within specific brands?

Further processing data would also let you:

* Correlate specific product features with changes in price.
* Cross-reference this data with a sample of our Women's Shoe Prices to see if there are any differences between women's brands and men's brands.

Data Schema

A full schema for the data is available in our support documentation.

About Datafiniti

Datafiniti provides instant access to web data. We compile data from thousands of websites to create standardized databases of business, product, and property information.

Feature Description

For feature descriptions, refer to the Feature Description link.

starbucks-challenge

Analyzing Starbucks rewards mobile app's data to determine which demographic groups respond best to which offer type.

statistics-with-python

Practicing the coding part of the Statistics With Python Specialization.

Repo's Contents:

* Inferential Statistics - exercises on:
  * Confidence Levels Construction.
  * Hypothesis Testing.
* Fitting Statistical Models to Data - exercises on:
  * Fitting models to independent data: Linear Regression, Logistic Regression.

us-bikeshare-data

In this project, I will make use of Python to explore data related to bike share systems for three major cities in the United States: Chicago, New York City, and Washington. I will write code to import the data and answer interesting questions about it by computing descriptive statistics. I will also write a script that takes in raw input to create an interactive experience in the terminal to present these statistics.

The Datasets

Randomly selected data for the first six months of 2017 are provided for all three cities. All three of the data files contain the same core six (6) columns:

* Start Time (e.g., 2017-01-01 00:07:57)
* End Time (e.g., 2017-01-01 00:20:53)
* Trip Duration (in seconds - e.g., 776)
* Start Station (e.g., Broadway & Barry Ave)
* End Station (e.g., Sedgwick St & North Ave)
* User Type (Subscriber or Customer)

The Chicago and New York City files also have the following two columns:

* Gender
* Birth Year

(Figure: data for the first 10 rides in the new_york_city.csv file)

The original files are much larger and messier, and you don't need to download them, but they can be accessed here if you'd like to see them (Chicago, New York City, Washington). These files had more columns and they differed in format in many cases. Some data wrangling has been performed to condense these files to the above core six columns to make your analysis and the evaluation of your Python skills more straightforward. In the Data Wrangling course that comes later in the Data Analyst Nanodegree program, students learn how to wrangle the dirtiest, messiest datasets, so don't worry, you won't miss out on learning this important skill!

Statistics Computed

You will learn about bike share use in Chicago, New York City, and Washington by computing a variety of descriptive statistics. In this project, you'll write code to provide the following information:

#1 Popular times of travel (i.e., occurs most often in the start time)

* most common month
* most common day of week
* most common hour of day

#2 Popular stations and trip

* most common start station
* most common end station
* most common trip from start to end (i.e., most frequent combination of start station and end station)

#3 Trip duration

* total travel time
* average travel time

#4 User info

* counts of each user type
* counts of each gender (only available for NYC and Chicago)
* earliest, most recent, most common year of birth (only available for NYC and Chicago)

The Files

To answer these questions using Python, you will need to write a Python script. To help guide your work in this project, a template with helper code and comments is provided in a bikeshare.py file, and you will do your scripting in there also. You will need the three city dataset files too:

* chicago.csv
* new_york_city.csv
* washington.csv

All four of these files are zipped up in the Bikeshare file in the resource tab in the sidebar on the left side of this page. You may download and open up that zip file to do your project work on your local machine. Some versions of this project also include a Project Workspace page in the classroom where the bikeshare.py file and the city dataset files are all included, and you can do all your work with them there.
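The statistics listed above can be sketched with the Python standard library alone. This is not the project's bikeshare.py; the `describe` function and its result keys are hypothetical names, and the rides below are made up for illustration (the first row reuses the example values quoted in the column list, and the "Station A" and "Station B" names are placeholders). The column names follow the six core columns described above.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical sample rides; a real run would read them from a city CSV file.
rides = [
    {"Start Time": "2017-01-01 00:07:57", "Trip Duration": 776,
     "Start Station": "Broadway & Barry Ave",
     "End Station": "Sedgwick St & North Ave", "User Type": "Subscriber"},
    {"Start Time": "2017-01-08 00:30:00", "Trip Duration": 600,
     "Start Station": "Broadway & Barry Ave",
     "End Station": "Sedgwick St & North Ave", "User Type": "Customer"},
    {"Start Time": "2017-02-14 08:15:00", "Trip Duration": 300,
     "Start Station": "Station A",
     "End Station": "Station B", "User Type": "Subscriber"},
    {"Start Time": "2017-01-11 17:45:00", "Trip Duration": 424,
     "Start Station": "Station B",
     "End Station": "Broadway & Barry Ave", "User Type": "Subscriber"},
]


def most_common(values):
    """Return the value that occurs most often."""
    return Counter(values).most_common(1)[0][0]


def describe(rides):
    """Compute the descriptive statistics for a list of ride dictionaries."""
    starts = [datetime.strptime(r["Start Time"], "%Y-%m-%d %H:%M:%S")
              for r in rides]
    durations = [r["Trip Duration"] for r in rides]
    return {
        "month": most_common(s.strftime("%B") for s in starts),
        "day_of_week": most_common(s.strftime("%A") for s in starts),
        "hour": most_common(s.hour for s in starts),
        "start_station": most_common(r["Start Station"] for r in rides),
        "end_station": most_common(r["End Station"] for r in rides),
        "trip": most_common((r["Start Station"], r["End Station"])
                            for r in rides),
        "total_duration": sum(durations),
        "mean_duration": mean(durations),
        "user_types": dict(Counter(r["User Type"] for r in rides)),
    }


stats = describe(rides)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Gender and birth-year statistics would follow the same pattern, guarded by a check that the columns exist, since the Washington file does not include them.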
