diptarup794 / stackoverflow-analysis

This project forked from recodehive/stackoverflow-analysis


Stack Overflow is a professional community for developers. This repo analyzes 3 years of the Stack Overflow Developer Survey, visualizes the results, and predicts the future salary of data scientists.

Home Page: https://www.canva.com/design/DAEdO-AUjRY/beEs8h-x6erz4d8s_XXsbw/view

License: MIT License

Jupyter Notebook 100.00%


Stackoverflow Analysis Guidelines

👨‍💻 Demo video

Final.video.mp4

You can start working with the repo by making simple changes; a couple of issues have been updated for this purpose. Check out the main code: Stackoverflow-Analysis

👇 Prerequisites

Before installation, please make sure you have already installed the following tools:

🛠️ Installation Steps

  1. Fork the project: fork the sanjay-kv/Stackoverflow-Analysis/ repository, following these instructions on how to fork a repository.
  2. Clone the project:
git clone git@github.com:your-username/Stackoverflow-Analysis.git
  3. Download the original data from the drive link.
  4. Open Jupyter Notebook and place the file in the project folder. Make sure you select the correct path.
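
As a quick sanity check for the last step, a minimal sketch like the following can confirm that the downloaded file sits where the notebook expects it. The file name `survey_results_public.csv` and the helper `find_data_file` are assumptions for illustration; use whatever name the downloaded file actually has.

```python
from pathlib import Path


def find_data_file(project_dir, filename="survey_results_public.csv"):
    """Return the resolved path to the data file, or raise if it is missing.

    `filename` is a hypothetical default; replace it with the real file name.
    """
    path = Path(project_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{filename} not found in {Path(project_dir).resolve()}; "
            "check that the download was placed in the project folder"
        )
    return path.resolve()
```

In a notebook cell, calling `find_data_file(".")` before loading the data gives a readable error when the path is wrong, rather than a failure later in the analysis.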

Development

We love your desire to give back, and want to make the process as welcoming to newcomers and experts as possible. We're working on developing more intuitive tutorials for individuals of all skill levels and expertise, so if you think the community would benefit from being walked through the steps you're going through, please share! ❤️

Finding Insights from Stackoverflow Developer Survey

Stack Overflow is a professional community for developers. It conducts a survey every year, and the data collected since 2011 is openly available on the web; the latest dataset (2020) was released on March 5th, 2021. Analysed professionally with modern tools, the dataset would enable us to answer real-world questions effectively. The dataset covers 275 questions in total.

Project Goal:

  1. To analyze 3 years of the Stack Overflow survey dataset and extract insights.

  2. To perform Data Analysis and answer the below questions.

    • Impact of higher education on salary of the surveyed developers.
    • Impact of education/experience/responsibilities on gender inequalities.
    • Impact of ethnicity on participation rate.
    • To find whether there is any difference between men and women's income.
    • Impact on the increase in popularity of a language in the current year due to developer’s interest in the previous year.
  3. To perform data visualization on

    • The most commonly used language.

    • Distribution of survey respondents based on their developer role.

    • Factors affecting Job satisfaction.

    • Predicting the growth of languages for upcoming years based on the survey answers.

      The insights can inform decisions about the IT work environment and hiring, and can help job seekers build a solid résumé.
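
For the "most commonly used language" goal, one possible approach is to tally multi-select answers. The sketch below assumes each respondent's languages arrive as a single semicolon-separated string (for example `"Python;SQL"`); that format, the sample answers, and the helper name `count_languages` are assumptions for illustration, not taken from this repository.

```python
from collections import Counter


def count_languages(responses, sep=";"):
    """Count how many respondents mention each language.

    `responses` is an iterable of strings such as "Python;SQL"; None or
    empty entries (unanswered questions) are skipped.
    """
    counts = Counter()
    for answer in responses:
        if not answer:
            continue
        # Use a set so a language repeated within one answer counts once.
        languages = {part.strip() for part in answer.split(sep) if part.strip()}
        counts.update(languages)
    return counts


# Made-up sample answers, not real survey data.
sample = ["Python;SQL", "JavaScript;Python", None, "SQL; Python", ""]
top = count_languages(sample).most_common(2)
print(top)
```

The resulting counts can be passed straight to a bar chart for the visualization goal.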

Data Source and Background

Code.Explanation.Video_1.mp4

The dataset is very diverse: it comes from the Stack Overflow developer survey, with 275 questions answered by respondents from 180 countries. Stack Overflow has collected survey data from 2011 to 2020, but this project analyzes only the last 3 years. Most respondents were from the US, India, and the EMEA region, and the majority had a developer or coding background. The data is available in CSV format, in files ranging from 40 to 150 MB, and covers about 1.5 lakh (150,000) survey participants. Responses to rating questions range from "Not at all important" to "Very important", or from "Not at all satisfied" to "Very satisfied".
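
Because many answers sit on an ordered scale, it can help to encode them as integers before analysis. The following is a minimal pandas sketch; the intermediate labels, the helper name `encode_ordinal`, and the 1 to 5 coding are assumptions, since only the two endpoints of the scale are named above.

```python
import pandas as pd

# Hypothetical five-point scale; only the two endpoints are named in the text.
SATISFACTION_SCALE = [
    "Not at all satisfied",
    "Slightly satisfied",
    "Moderately satisfied",
    "Satisfied",
    "Very satisfied",
]


def encode_ordinal(series, ordered_labels):
    """Map ordered text labels to 1..N; unknown or missing answers become NaN."""
    mapping = {label: rank for rank, label in enumerate(ordered_labels, start=1)}
    return series.map(mapping)


# Made-up answers, not real survey data.
answers = pd.Series(["Very satisfied", "Not at all satisfied", None, "Satisfied"])
scores = encode_ordinal(answers, SATISFACTION_SCALE)
```

Keeping missing answers as NaN, rather than coding them as 0, stops them from dragging down averages later.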

Data Format

The data is in a schema CSV file that consists of 252,199 observations and 62 variables.

Projected Work Needed for Insights

Data Wrangling

Dealing with null values: This is a developer survey, and respondents left some questions unanswered or marked them ‘NA’ or ‘Not Applicable’, so handling null values is important for getting precise information. Data conversion and manipulation are also required, because respondents answered through radio buttons rather than a simple yes/no pattern (univariate analysis).
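
A minimal pandas sketch of the null-handling step described above. It assumes unanswered questions appear as the literal strings `NA` or `Not Applicable`; the column names and sample rows are made up for illustration.

```python
import io

import pandas as pd

# Made-up rows standing in for the survey CSV.
raw_csv = io.StringIO(
    "Respondent,Salary,Hobbyist\n"
    "1,50000,Yes\n"
    "2,NA,No\n"
    "3,72000,Not Applicable\n"
)

# Treat both placeholder strings as missing values while reading.
df = pd.read_csv(raw_csv, na_values=["NA", "Not Applicable"])

# Share of missing answers per column, useful for deciding what to drop.
missing_share = df.isna().mean()

# Keep only rows where the salary was actually answered.
with_salary = df.dropna(subset=["Salary"])

# Convert a Yes/No answer into True/False; missing answers stay missing.
df["Hobbyist"] = df["Hobbyist"].map({"Yes": True, "No": False})
```

Whether to drop or fill missing rows depends on the question being asked; dropping is shown here only because it is the simplest option.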

Techniques expected to be used in the project

The plan is to use ML algorithms and techniques that may include Random, KNN, logistic regression, and linear regression, with AUC for classification problems and root mean square error for evaluation, alongside model training, parameter analysis, and data visualization.
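
Two of the evaluation measures named above can be written out directly. The sketch below computes root mean square error for a regression and AUC for a binary classifier using the pairwise-ranking definition; the function names and any sample numbers are illustrative assumptions, and in practice a library such as scikit-learn would typically supply these.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of rows, so it suits small checks; for the full survey a library implementation would be the better choice.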

👨‍💻 Contributing

  • Contributions make the open source community such an amazing place to learn, inspire, and create.
  • Any contributions you make are greatly appreciated.
  • Check out our contribution guidelines (yet to be updated) for more information.

🛡️ License

This project is licensed under the MIT License - see the LICENSE file for details.

💪 Thanks to all Contributors

Thanks a lot for spending your time helping this project grow. Thanks a lot! Keep rocking 🍻

🙏 Support

This project needs a ⭐️ from you. Don't forget to leave a star ⭐️

This repo is crafted with ♥ and owned/maintained by @sanjay-kv

Contributors

sanjay-kv, code-with-aneesh, gss0c24, devnandini02, saksh8, sibam-paul, sdprogramer
