Leondra R. Gonzalez's Projects

adclick_fraud

Capstone project #2 for the Harvard University Professional Certificate in Data Science.

awssagemaker_pythonxgboosttutorial

A Python XGBoost model built with Amazon SageMaker, EC2 instances, and S3 buckets. The workflow covers preparing and partitioning the data, then training, tuning, predicting with, and evaluating the model. The project predicts which customers will sign up for a financial product at a bank.

bikeshare-exploratory-analysis

An exploratory analysis of the Kaggle bikeshare data set that applies linear regression models, which turn out not to be optimal for predicting the number of bikes rented.

boston-housing---random-forest-xgboost

Used random forest regression and XGBoost with cross-validation and grid search to tune the best-performing model on the Boston Housing dataset. Analyzed and visualized the most statistically significant features for both models. Achieved an RMSE of $2K.

chipotlelocations

A descriptive and exploratory data analysis project from DataCamp that explores real data on every Chipotle location to identify franchising opportunities. The goal is to scout out the next Chipotle location using interactive maps (e.g., leaflet) and external data to compare proposed locations on several important factors, such as proximity to current Chipotle locations, the distribution of the state's population, and the distance from interstates and tourist attractions.

customer-churn-w-logistic-regression

Using tools such as Spark, Python (PySpark), SQL, and Databricks, fit a logistic regression model to predict which customers are at higher risk of churning, then applied the model to an unseen "new customers" data set.

degrees-that-pay-you-back

A cluster analysis that uses the k-means algorithm to determine which degrees are likely to yield which levels of income, based on historical data.

disney-movies-box-office-hits

Analysis of Disney's top-grossing films (adjusted for inflation) in Python, using regression to relate film genre to success. The project fits a regression to the data and uses bootstrap regression to estimate confidence intervals for the intercept and coefficients.
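The bootstrap step can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes a single numeric predictor (the project regresses on film genre, which would need dummy variables), a percentile interval, and hypothetical helper names `ols` and `bootstrap_ci`.

```python
import random


def ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def percentile_interval(values, alpha):
    """Percentile bootstrap interval at level 1 - alpha."""
    values = sorted(values)
    m = len(values)
    lo = values[int((alpha / 2) * m)]
    hi = values[min(m - 1, int((1 - alpha / 2) * m))]
    return lo, hi


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Resample (x, y) pairs with replacement and refit the regression each time."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(xs)
    intercepts, slopes = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples where the slope is undefined
        b0, b1 = ols(bx, by)
        intercepts.append(b0)
        slopes.append(b1)
    return percentile_interval(intercepts, alpha), percentile_interval(slopes, alpha)


# Illustrative synthetic data only, not the Disney data set.
xs = list(range(20))
ys = [1 + 2 * x + 0.5 * (-1) ** x for x in xs]
intercept_ci, slope_ci = bootstrap_ci(xs, ys)
```

Resampling whole (x, y) pairs, rather than residuals, is one common choice; the original project may have used a different scheme.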

film-similarity-nlp-with-kmeans-hierarchical-clustering

Used NLP techniques (tokenization, stemming, vectorization for TF-IDF) and clustering algorithms (k-means and hierarchical clustering) to mine the "similarities" between films based on their plots provided by IMDb and Wikipedia. The dataset contains the titles of the top 100 movies on IMDb.

goldenageofgaming

Video games are big business: the global gaming market is projected to be worth more than $300 billion by 2027, according to Mordor Intelligence. With so much money at stake, the major game publishers are increasingly incentivized to create the next big hit. But are games getting better, or has the golden age of video games already passed? In this project, I explore the top 400 best-selling video games created between 1977 and 2020 by comparing gaming sales data with critic and user review data. In doing so, we can discover whether video games have improved as the gaming market has grown. Each table is limited to 400 rows for this experiment, but the complete dataset with over 13,000 games can be found on Kaggle.

gotnetworkanalysis

Analysis of the co-occurrence network of Game of Thrones characters in the Game of Thrones books. Here, two characters are considered to co-occur if their names appear within 15 words of one another in the books. This project used graph analysis and modeling techniques such as Google's PageRank algorithm.
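The 15-word co-occurrence rule can be illustrated with a short sketch. It is an assumption-laden example rather than the project's code: the function name `cooccurrence_edges` is hypothetical, names are treated as single words, and the sample sentence is made up.

```python
import re
from collections import Counter


def cooccurrence_edges(text, characters, window=15):
    """Count character pairs whose names appear within `window` words of each other."""
    words = re.findall(r"[A-Za-z']+", text)
    names = set(characters)
    # Word positions of every name mention (single-word names only in this sketch).
    mentions = [(i, w) for i, w in enumerate(words) if w in names]
    edges = Counter()
    for a, (i, name_a) in enumerate(mentions):
        for j, name_b in mentions[a + 1:]:
            if j - i > window:
                break  # mentions are in order, so later ones are even farther away
            if name_a != name_b:
                edges[tuple(sorted((name_a, name_b)))] += 1
    return edges


# Made-up sample text: Tyrion is mentioned more than 15 words after the others.
sample = "Jon spoke with Sam while Arya trained. " + "filler " * 20 + "Tyrion drank alone."
edges = cooccurrence_edges(sample, ["Jon", "Sam", "Arya", "Tyrion"])
```

The resulting pair counts can serve as edge weights in a graph, on which a ranking such as PageRank could then be computed.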

hyundai-cruise-ship-crew-prediction

Predicting the number of crew members needed to staff a Hyundai cruise ship from information such as the number of cabins and passengers, using linear regression. Leveraged SQL and PySpark.

loanpaymentprediction_svm

My first attempt at building an SVM model and tuning the cost and gamma parameters of a Gaussian kernel using grid search.

mobilegameabtest

Two A/B tests comparing 1) average 1-day player retention and 2) average 7-day player retention between the control (old player level) and the new version (new player level).
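One way to compare retention between two groups is a two-proportion z-test, sketched below. This is an assumption: the description does not say which test the project used, the function name `two_proportion_ztest` is hypothetical, and the counts are invented for illustration.

```python
import math


def two_proportion_ztest(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for a difference in retention rates between groups A and B."""
    p_a = retained_a / n_a
    p_b = retained_b / n_b
    # Pooled rate under the null hypothesis that both groups share one retention rate.
    pooled = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Invented counts: 60 of 100 players retained in control, 40 of 100 in the new version.
diff, z, p_value = two_proportion_ztest(60, 100, 40, 100)
```

The same function would be applied once to the 1-day retention counts and once to the 7-day counts.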

netflix-content-duration-analysis

The large number of movies and series available on Netflix makes it a good opportunity to dive into the entertainment industry with an analysis of Netflix content durations. This analysis examines trends in content duration on the Netflix platform from 2011 through 2020.

predicttaxifares

An analysis and prediction of taxi fares based on 2013 NYC data using decision trees and random forests.
