About

Bachelor of Science degrees in psychology and sociology gave me a solid foundation in research methods, statistics, and data handling, and helped inspire my transition to data science. Before starting a Master's degree program at Bellevue University, I dedicated my career to serving people with severe and persistent mental health diagnoses. That work quickly led me to leadership positions, where I developed a growing interest in data practices that could improve service quality, impact, and outcomes.

Resume

Technical Skills

Languages: Python, R, SQL
Software: Tableau, Power BI, Excel, Hadoop, Spark, HBase
Methods: Statistics, data mining, web scraping, cleaning, transformation, machine learning, visualization, dashboarding

Education

M.S. Data Science
B.S. Sociology
B.S. Psychology

Work Experience

Data Analyst Intern @ DataNicely

Omaha, NE (Jan 2024 - Present)

Provided ad hoc data cleaning, transformation, and reporting to meet client needs.
Automated file cleaning and transformation with Power Query and Power BI to reduce labor and data redundancy.
Repaired and reformatted existing Tableau dashboards to improve performance and consistency.

Research Analyst @ United Way of the Midlands

Omaha, NE (June 2023 - Oct 2023)

Identified Omaha-area service gaps through team-based qualitative meta-analysis and coding of external reports.
Analyzed internal 211 caller data to identify where top needs intersect with top unmet needs.
Influenced grantmaking changes through reporting on post-COVID philanthropic trends and non-profit feedback.
Co-authored the 2023 Community Needs Assessment and wrote assigned blog posts in full.
Used Excel, Python, and Power BI to perform ad hoc analysis of volunteer engagement and donations.

Projects

Data Mining the Workplace for Mental Health

Image by Freepik

This project aims to inform company practices and offerings that reduce work interference due to employee mental health, using machine learning classification and model interpretation.

A decision tree classifier was used to predict work interference. Once the model was optimized, feature impact was analyzed with the SHAP library to highlight workplace culture and practices associated with increased or decreased work interference. Using only employee perceptions of culture and company practices, the model predicted whether employees experienced work interference from mental health "never", "rarely", "sometimes", or "often" with 57% accuracy, over double the 25% expected from guessing uniformly among the four categories. See Project Repo
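The comparison with chance can be checked with a little arithmetic. The sketch below assumes "chance" means guessing uniformly at random among the four interference levels; the project's actual class balance is not stated here, so a majority-class baseline could be higher than this figure.

```python
# The four outcome levels named in the project description.
LEVELS = ["never", "rarely", "sometimes", "often"]


def uniform_chance_accuracy(n_classes: int) -> float:
    """Expected accuracy when guessing uniformly at random among n_classes."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    return 1.0 / n_classes


reported_accuracy = 0.57  # accuracy reported for the decision tree
chance = uniform_chance_accuracy(len(LEVELS))
lift = reported_accuracy / chance  # how many times better than chance

print(f"chance accuracy: {chance:.2f}")
print(f"lift over chance: {lift:.2f}x")
```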

Key Findings

Work interference from mental health appears to be minimized by cultures that favor openness about mental health as a recognized issue.
Feeling safe to discuss mental health with an employer was associated with less work interference.
Unclear or unavailable mental health resources and difficulty taking time off for mental health were associated with higher work interference.

Attrition Prediction of Healthcare Employees

This project's goal is to create data solutions for attrition analysis and prediction. Dashboards were created to explore the drivers of attrition. Recursive feature elimination was used to identify the best features for modeling. Three supervised classifiers were tested and evaluated, with a multi-layer perceptron (MLP) demonstrating the best performance at 92% accuracy. See Project Repo
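Recursive feature elimination repeatedly fits a model and drops the weakest feature until a target number remains. The sketch below illustrates the idea with scikit-learn's `RFE`. The synthetic data, the logistic regression estimator, and the choice to keep three features are all assumptions for illustration; the project summary does not say which estimator or feature count was used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (hypothetical, not the attrition dataset).
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

# Assumed estimator: logistic regression ranks features by coefficient size.
selector = RFE(
    estimator=LogisticRegression(max_iter=1000),
    n_features_to_select=3,  # assumed target count
)
selector.fit(X, y)

# support_ is a boolean mask over columns; ranking_ is 1 for kept features.
selected = [i for i, keep in enumerate(selector.support_) if keep]
print("selected feature indices:", selected)
print("ranking (1 = kept):", selector.ranking_.tolist())
```

The selected columns would then be passed to the candidate classifiers in place of the full feature set.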

Fetal Health Predictive Analytics

This project aims to develop a predictive model that classifies fetal health outcomes from cardiotocogram (CTG) features. After exploratory data analysis (EDA), Naive Bayes and multi-layer perceptron (MLP) models were trained and evaluated. A grid search identified the best hyperparameters for the neural network, which outperformed the Naive Bayes model with 91% accuracy compared to 84%. See Project Repo

Covid-19 Vaccine Hesitancy Correlates

This project was designed to draw insights into hesitant attitudes toward Covid-19 vaccination. Correlational analysis was conducted between hesitancy data, reports of Covid-19 impacts, demographics, and political influences at state and county levels. Census Bureau datasets were gathered on Covid-19 impact, income loss during the pandemic, and eviction or expected eviction. See Project Repo
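A correlational analysis of this kind comes down to computing a coefficient between paired measures. The sketch below is a minimal illustration, assuming Pearson's r as the statistic (the summary does not say which coefficient was used). The variable names and county values are made up for the example and are not project data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical county-level shares, for illustration only.
hesitancy_share = [0.10, 0.15, 0.20, 0.25, 0.30]
income_loss_share = [0.30, 0.35, 0.45, 0.50, 0.60]

r = pearson_r(hesitancy_share, income_loss_share)
print(f"Pearson r = {r:.3f}")
```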

Nursing Home Quality Analysis

This project combined flat-file, API, and web-scraped data sources into a SQL database to examine how variables relate to nursing home quality. Data for all nursing homes in the United States was combined at state and county levels. State funding and local demographics were explored along with features of nursing care facilities. SQLite was used to query and visualize bivariate relationships. See Project Repo
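Combining sources at the county level and querying a bivariate relationship might look roughly like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database. The table names, column names, and rows are hypothetical placeholders, not the project's schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table per source, joined on county.
conn.executescript(
    """
    CREATE TABLE nursing_homes (
        name TEXT,
        county TEXT,
        overall_rating INTEGER
    );
    CREATE TABLE county_demographics (
        county TEXT PRIMARY KEY,
        median_income INTEGER
    );
    """
)

# Placeholder rows standing in for the flat-file, API, and scraped sources.
conn.executemany(
    "INSERT INTO nursing_homes VALUES (?, ?, ?)",
    [("Home 1", "County A", 4), ("Home 2", "County A", 2), ("Home 3", "County B", 5)],
)
conn.executemany(
    "INSERT INTO county_demographics VALUES (?, ?)",
    [("County A", 60000), ("County B", 80000)],
)

# One row per county: a demographic measure paired with average quality rating.
rows = conn.execute(
    """
    SELECT d.county, d.median_income, AVG(h.overall_rating) AS avg_rating
    FROM nursing_homes AS h
    JOIN county_demographics AS d ON d.county = h.county
    GROUP BY d.county, d.median_income
    ORDER BY d.county
    """
).fetchall()

for county, income, avg_rating in rows:
    print(county, income, avg_rating)

conn.close()
```

The resulting pairs (here, income and average rating per county) are the kind of output that could feed a scatter plot.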

Twitter Analysis of Airline Sentiment

This project examined airline-related tweets for positive and negative sentiment. Logistic regression, gradient boosting, and random forest classifiers were trained and compared. The random forest classifier performed best, with 91% accuracy on test data. See Project Repo

Strategic Marketing, Visualizations, Analysis for Airlines

This project is scenario-based: Sunset Air is suffering from a media scare campaign, and media claims of a rise in flight accidents are threatening the industry. The project uses currently available data from a range of sources to investigate those claims. The findings are then presented in reports and visualizations tailored to different audiences to combat the issue. Primary tools include Power BI and PowerPoint. Deliverables include an internal dashboard for the technical and executive teams, an infographic poster, a mock-interview blog post, and a written rationale for the visual designs and strategy. See Project Repo

Malware Detection with Gradient Boosting Classifier

A gradient boosting classifier was trained on features extracted from Android applications to detect the presence of malware. The model yielded 96% accuracy with a false negative rate of 2%. The SHAP library was used to interpret the final model and gain insight into how the presence or absence of certain features influenced the classification outcome. See Project Repo
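For malware detection, the false negative rate matters because each false negative is malware that slips through. The sketch below shows how accuracy and false negative rate are derived from confusion-matrix counts. The counts are hypothetical, picked only so the arithmetic is easy to follow; they are not the project's actual confusion matrix.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


def false_negative_rate(tp: int, fn: int) -> float:
    """Share of actual positives (malware) the model missed."""
    positives = tp + fn
    if positives == 0:
        raise ValueError("no positive samples to score")
    return fn / positives


# Hypothetical counts for illustration (positive class = malware).
tp, fn = 490, 10   # malware caught vs. malware missed
tn, fp = 470, 30   # benign apps passed vs. benign apps flagged

acc = accuracy(tp, tn, fp, fn)
fnr = false_negative_rate(tp, fn)
print(f"accuracy: {acc:.2%}")
print(f"false negative rate: {fnr:.2%}")
```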

Visualization Gallery

This project was conducted to build fluency in data visualization across tools including Python, R, Power BI, and Tableau. PDFs are provided for easy viewing of various visual types across tools, including maps, line charts, bubble charts, bullet charts, area charts, stacked area charts, step charts, density plots, and more. See Project Repo

Extremism EDA

This project explores correlates of extremist attitudes. Likert questions aimed at capturing these attitudes are based on J.M. Berger’s extensive writings on the topic. His work explains that extreme attitudes (political, racial, religious, etc.) all relate to perceptions of in-groups and out-groups. In the extreme, people come to believe the “success of us is inseparable from hostile acts against them”. See Project Repo
