This project was created to understand the main requirements for Data Analyst and Data Scientist jobs. To do that, we built a crawler to scrape job pages from LinkedIn and processed the text to produce a report of the most frequently requested skills for each position.
Carlos Madriz (LinkedIn | Github)
Inês Garcia (LinkedIn | Github)
To use the code from this project you should fork this repository and clone it locally.
To run the scraping code you will need to make sure you have the following Python libraries installed:
For the text processing you will need:
- Job search with your own credentials.
- Your account is protected from bot detection - it's only used to get the job links. To scrape each job, you don't need to be logged in.
- Searches by job title and by location (specified by you).
- Gets all the job position links for the specified parameters.
- Gets, cleans and analyses all the job information for each job position.
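
To illustrate the last step, here is a minimal sketch of how the cleaned job text could be turned into a skill-frequency report. It is not the project's actual code: the `SKILLS` list, the `count_skills` function and the sample descriptions are hypothetical and only meant to show the idea, using nothing beyond the Python standard library.

```python
import re
from collections import Counter

# Hypothetical skill vocabulary; the project's real list may differ.
SKILLS = ["python", "sql", "excel", "tableau", "machine learning"]


def count_skills(descriptions, skills=SKILLS):
    """Count how many job descriptions mention each skill at least once."""
    counts = Counter()
    for text in descriptions:
        # Lowercase and collapse whitespace so multi-word skills still match.
        cleaned = re.sub(r"\s+", " ", text.lower())
        for skill in skills:
            # Lookarounds keep "sql" from matching inside words like "mysql".
            pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
            if re.search(pattern, cleaned):
                counts[skill] += 1
    return counts


# Made-up sample descriptions, standing in for scraped job pages.
jobs = [
    "We need strong SQL and Python skills.",
    "Experience with Excel, SQL and Tableau.",
    "Python and machine   learning required.",
]

report = count_skills(jobs)
for skill, n in report.most_common():
    print(skill, n)
```

Counting each skill once per job description (rather than once per occurrence) keeps a single verbose posting from dominating the report.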
- David Craven, who wrote an article on how to use Selenium to scrape LinkedIn, which was very helpful.
This project is licensed under the MIT License - see the LICENSE.md file for details.