lagoujob's Introduction

#Data analysis of Lagou

###Main Function

  1. Scrape job data from Lagou to track the latest information on Internet careers

  2. Analyze and visualize the data

  3. Crawl job detail pages and generate a word cloud as a "Job Impression"

###Note

Because Lagou's back-end API has changed, this repository may not work well.

I will try to fix these problems and publish V2.0 in the near future.

Thanks for starring and watching this project!

I will do my best to make it better and more robust, and to add new features as well.

Sorry for any inconvenience this may cause!

V2.0_ALPHA is under development.

###Install Prerequisites

  1. Python Version >= 3.4
  2. Third Party Library:

    pip install requests
    pip install beautifulsoup4
    pip install jieba
    pip install openpyxl

###Basic Usage

  1. Clone this project from GitHub

  2. Change the path of job.xml passed to readconfig() in lagouspider.py: configmap = toolkit.readconfig(YourLocalPath)

  3. Run lagouspider.py to get job data in JSON

  4. Run excelhelper.py to generate an Excel file for each job

  5. Run jobdetailspider.py to get job recruitment details (updated in V1.3)

  6. Run analyser.py to segment sentences into words and return the TOP20 hot words (updated in V1.3)
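The last step above (segment the text, then return the most frequent words) can be illustrated with a minimal sketch. This is not the code in analyser.py: the names `top_hot_words`, `simple_tokenize`, and `STOP_WORDS` are hypothetical, and the stop-word set is only a placeholder. The fallback tokenizer just splits on non-word characters, so it does not segment Chinese text.

```python
import re
from collections import Counter

# Hypothetical stop-word set for illustration; not taken from the project.
STOP_WORDS = {"and", "the", "to", "of", "的", "和"}


def simple_tokenize(text):
    """Fallback tokenizer: split on non-word characters (no Chinese segmentation)."""
    return re.findall(r"\w+", text)


def top_hot_words(descriptions, tokenize=simple_tokenize, n=20):
    """Count tokens across all job descriptions and return the n most common."""
    counts = Counter()
    for text in descriptions:
        for token in tokenize(text):
            word = token.strip().lower()
            # Skip empty tokens, stop words, and pure punctuation.
            if not word or word in STOP_WORDS or not re.search(r"\w", word):
                continue
            counts[word] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["Python and Java", "python, SQL", "Python"]
    for word, freq in top_hot_words(sample):
        print(word, freq)
```

For Chinese job descriptions, a segmenter such as `jieba.lcut` could be passed as the `tokenize` argument, for example `top_hot_words(texts, tokenize=jieba.lcut)`, assuming jieba is installed as listed in the prerequisites.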

###Analysis Results

Image1 Image2 Image3 Image4 Image5

For more information, please see my answer on Zhihu.

lagoujob's People

Contributors

lucasxlu
