
ucb-dataviz-project1-group8's Introduction

ucb-dataviz-project1-group7

Mental Health Data Analysis

In this project, mental health is characterized by factors and symptoms that negatively affect an individual’s mental well-being.

Objectives: Compare mental health indicators across different groups and attributes (e.g., self-employed vs. employed individuals, countries, gender) to identify disparities or similarities and their potential causes, and provide insights that could inform public health policies, interventions, or initiatives aimed at improving mental health outcomes at local, national, or global levels.

Data: Survey Results (https://www.kaggle.com/datasets/bhavikjikadara/mental-health-dataset/data) (refer to Mental-Health-Dataset.csv)

Data Cleaning and Preprocessing: (refer to clean_up_file.ipynb)

  • Removed blanks and the timestamp

  • Converted categorical values into integers for better statistical analysis

  • Added 2 features:

    Mental Health Candidacy = Family history + treatment + mental health history + growing stress + changes_habits

    Mental Health Severity = coping_struggles + mood_swings + work_interest + social_weakness
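The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of clean_up_file.ipynb: the lowercase column names, the `timestamp` column name, and the `ANSWER_CODES` mapping are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical column names, lowercased to match the lists in this README.
CANDIDACY_COLS = ["family_history", "treatment", "mental_health_history",
                  "growing_stress", "changes_habits"]
SEVERITY_COLS = ["coping_struggles", "mood_swings", "work_interest",
                 "social_weakness"]

# Assumed encoding; the notebook's actual mapping may differ.
ANSWER_CODES = {"No": 0, "Maybe": 1, "Yes": 2,
                "Low": 0, "Medium": 1, "High": 2}


def add_derived_features(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the timestamp and blank rows, encode answers, add the two sums."""
    out = df.drop(columns=["timestamp"], errors="ignore").dropna().copy()
    cols = CANDIDACY_COLS + SEVERITY_COLS
    # Map each categorical answer to its integer code, column by column.
    out[cols] = out[cols].apply(lambda s: s.map(ANSWER_CODES))
    out["mental_health_candidacy"] = out[CANDIDACY_COLS].sum(axis=1)
    out["mental_health_severity"] = out[SEVERITY_COLS].sum(axis=1)
    return out
```

Summing integer codes treats every factor as equally weighted, which is what the two formulas above imply.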


Exploratory Data Analysis (EDA):

Q1 Does gender play a significant role in mental health?

A p-value of 1.0 from the chi-squared test means we fail to reject the null hypothesis: the sample provides no evidence of a relationship between gender and mental health history. Any observed differences in mental health history between genders are consistent with random chance or factors other than gender. (refer to gender_mental_health.ipynb)
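A chi-squared test of independence of this kind can be sketched with `scipy.stats.chi2_contingency`. The helper name, column names, and toy data below are assumptions for illustration and do not reproduce gender_mental_health.ipynb. The toy data is built so both genders have identical answer proportions, which is the situation in which the test yields a p-value of 1.0.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def chi_squared_independence(df, row_col, col_col, alpha=0.05):
    """Test whether two categorical columns are independent."""
    # Contingency table of observed counts, e.g. gender x history.
    table = pd.crosstab(df[row_col], df[col_col])
    chi2, p_value, dof, _expected = chi2_contingency(table)
    return {"chi2": float(chi2), "p_value": float(p_value),
            "dof": int(dof), "reject_null": bool(p_value < alpha)}


# Toy data (not the survey): identical proportions in both groups.
toy = pd.DataFrame({
    "gender": ["Male"] * 40 + ["Female"] * 40,
    "mental_health_history": (["Yes"] * 20 + ["No"] * 20) * 2,
})
result = chi_squared_independence(toy, "gender", "mental_health_history")
print(result)
```

The same helper would apply to any other pair of categorical columns, such as occupation and mental health history.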

Q2 Does one’s chosen occupation play a significant role in mental health?

The chi-squared test indicates no significant association between occupation and mental health history. Any observed differences in mental health history between occupations are consistent with random chance or factors other than occupation. (refer to occupation_mental_health.ipynb)

Q3 Do mental health factors have an impact on one’s interest in their work?

A large share of respondents showed a mid-range rating for propensity to mental health issues while also being negatively impacted by mental health issues, and respondents' interest in work was evenly spread across the low, mid, and high impact ratings. (refer to Question 3 Final folder)

Q4 Are self-employed individuals less susceptible to mental health issues?

Based on the significant t-statistic and very low p-value, there is strong evidence for the alternative hypothesis that self-employed individuals are less susceptible to mental health issues than non-self-employed individuals, based on the identified factors (Family history + treatment + mental health history + growing stress + changes_habits). (refer to final_self_employed.ipynb)
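A one-sided two-sample t-test like the one described can be sketched as below. The README does not say whether equal variances were assumed, so this example uses Welch's variant (`equal_var=False`) as an assumption. The function name and the synthetic scores are hypothetical and only show the mechanics.

```python
import numpy as np
from scipy.stats import ttest_ind


def one_sided_welch_ttest(group_a, group_b, alpha=0.05):
    """Test H1: mean(group_a) < mean(group_b), without assuming equal variances."""
    t_stat, p_value = ttest_ind(group_a, group_b,
                                equal_var=False, alternative="less")
    return float(t_stat), float(p_value), bool(p_value < alpha)


# Synthetic candidacy scores (not the survey data).
rng = np.random.default_rng(0)
self_employed = rng.normal(loc=4.0, scale=1.5, size=200)
not_self_employed = rng.normal(loc=5.0, scale=1.5, size=200)

t_stat, p_value, reject_null = one_sided_welch_ttest(self_employed,
                                                     not_self_employed)
print(f"t={t_stat:.2f}, p={p_value:.3g}, reject H0: {reject_null}")
```

A negative t-statistic with a small p-value supports the claim that the first group's mean is lower, which matches the direction of the hypothesis in Q4.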

Q5 Is mental health severity lower in the US?

The derived metric "mental_health_severity" was analyzed as the dependent variable across two samples: data points from the United States formed one group, and measurements from all other countries were aggregated into a non-US group. Starting from df_factors.csv, the categorical data was cleaned by removing null values and mapping categorical values to numerical values; the US and non-US populations were then split before the statistical tests. Because the metric is derived from a combination of variables whose underlying distributions are unknown, a non-parametric, one-sided, two-sample Mann-Whitney U-test was performed. A one-sided two-sample t-test was also performed, since with a large sample size the sampling distribution of the mean is approximately normal.

Neither test indicated that mental health severity is lower in the US than in countries outside the US. Mann-Whitney U-test: no significant indication that median mental health severity is lower in the United States. Two-sample t-test: no significant indication that mean mental health severity is lower in the United States. (refer to question-5 folder for analysis and slides)
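The US versus non-US comparison can be sketched with `scipy.stats.mannwhitneyu`. The `country` column name, the "United States" label, and the toy scores are assumptions for illustration, and the code in the question-5 folder may differ.

```python
import pandas as pd
from scipy.stats import mannwhitneyu


def us_vs_non_us_severity(df, alpha=0.05):
    """One-sided Mann-Whitney U-test; H1: severity is lower in the US."""
    is_us = df["country"] == "United States"
    us = df.loc[is_us, "mental_health_severity"]
    non_us = df.loc[~is_us, "mental_health_severity"]
    u_stat, p_value = mannwhitneyu(us, non_us, alternative="less")
    return {"u_stat": float(u_stat), "p_value": float(p_value),
            "reject_null": bool(p_value < alpha)}


# Toy data (not the survey): both groups have identical score distributions.
scores = list(range(9)) * 5
toy = pd.DataFrame({
    "country": ["United States"] * 45 + ["Canada"] * 45,
    "mental_health_severity": scores + scores,
})
outcome = us_vs_non_us_severity(toy)
print(outcome)
```

With identical distributions the p-value is far above 0.05, so the null hypothesis is not rejected.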

Q6 Weighted population analysis of cluster samples classified by highest similarity (TOP 50/Bottom 50)

Modeling Approach: Columns with the least normal distributions were dropped first to increase similarity between subject traits.

Results: The TOP 50 showed a much higher correlation (0.95) between treatment and family history than the Bottom 50. The 50 percent most common answer sets in this data set show a high correlation between treatment and family history, whereas the bottom 50 percent, the least common answer sets, show a negative correlation.

Conclusions: With a ranking system weighted by the mental traits in the following columns (family_history, treatment, growing_stress, mental_health_history, mood_swings, coping_struggles, work_interest, social_weakness), the top 50 percent of results show a strong positive correlation between treatment and family history, while the bottom 50 percent show a weak negative correlation. Subjects with more random or unique sets of answers are therefore more likely to show no correlation between treatment and family history, while subjects who answered similarly to their peers were also more likely to answer family history and treatment the same way. This suggests that family history and treatment are powerful traits in mental health diagnosis. (refer to question_6 folder)

Contributors: grant-i, nchakicherla, jpvicencio, riosrose, rponticelli0

