Richard Taracha's Projects
A descriptive analysis report generated using SQL to determine the causes of car crashes in the city of Chicago
This repository contains code and resources for building credit risk models for Safaricom customers. It may be useful for data scientists and analysts who are interested in developing and deploying such models.
Classification Analysis with Python to analyze relevant customer data and develop a solution that will help determine whether a customer will churn.
Data science interview questions and answers
Performing a descriptive analysis of the prices of ride-sharing apps (Uber and Lyft) using SQL to deduce useful relationships in the data that can help us understand the factors that influence cab pricing variations
Practice the workflow you will use when working locally instead of in IllumiDesk on Canvas, and get hands-on practice with important tools that a professional Data Scientist needs to know: the command line, git, GitHub, and running Jupyter Notebooks locally.
Using Bagging and Boosting Ensemble Techniques from the Scikit-Learn library to predict whether small business loans offered by Kopo Kopo will be paid off.
Statistical Analysis to identify the main factors that help determine the total number of children ever born to a woman of reproductive age in Kenya.
This data analysis project uses data from IMDb, Rotten Tomatoes, and The Numbers to help Microsoft's new movie studio make informed decisions about what types of films to create. The project provides recommendations on which genres, directors, actors, and budgets are associated with higher ratings and revenues.
A Python program that reads the number of minutes and text messages used in a month, then displays the base charge, additional minutes (if any), additional text messages (if any), tax charged, and the total bill.
Wrangling data using Python to provide recommendations on how to reduce bus breakdowns in New York City.
This repository explores the use of Multiple Linear Regression to predict mortality rates, as represented by the "TARGET_deathRate" variable.
Enhancing Political Brand Reputation Through Twitter Sentiment Analysis
My GitHub profile README
Data Visualization using Seaborn and Matplotlib
Using K-Means and Agglomerative Clustering to segment the results of a chemical analysis of wines grown in the same region in Kenya but derived from three different cultivars, and to better understand the clusters formed.