webscrapping_angellist_crunchbase's Introduction

Scraping Startups' Data from AngelList and Crunchbase

Websites are an ocean of information that anyone can access. However, the sheer volume of existing data, and its growth by petabytes, makes it a mammoth task for enterprises to obtain accurate data and the right insights for the company. Convenient access to the digital world and the spread of social media have increased the influx of data, but progressive businesses can stay ahead only with the help of web scraping.

Some of the benefits of web scraping to organizations across industries are as follows:

  • Industry Research
  • Emerging Markets Analysis
  • Competitor Analysis
  • Target Audience Identification
  • Price Optimization / Pricing Strategy Analysis
  • Brand Image Analysis

In this repository, I show how to scrape AngelList and Crunchbase to get valuable data about startups.

Crunchbase is a platform for finding business information about private and public companies. Crunchbase information includes investments and funding, founding members and individuals in leadership positions, mergers and acquisitions, news, and industry trends. I used the free tier of the Crunchbase API to get general information about the companies registered on Crunchbase. I also show how to scrape the Crunchbase profile page of a given company to get more information. To do this, you first need to give the class your username and password so that it can log in through the app, because you cannot access company profile pages without signing in. Note that this could be illegal, so make sure you already have permission to scrape company profile pages with your app.
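
Since the scraper class needs a username and password, one option is to read them from environment variables rather than writing them into the code. The following is a minimal sketch of that idea, not the repository's actual code: the `Credentials` class, the `load_credentials` function, and the variable names `CRUNCHBASE_USERNAME` and `CRUNCHBASE_PASSWORD` are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Credentials:
    """Hypothetical container for the login details the scraper class needs."""
    username: str
    password: str


def load_credentials(env: Mapping[str, str] = os.environ) -> Credentials:
    """Read the login details from environment variables (names are assumed)."""
    try:
        return Credentials(env["CRUNCHBASE_USERNAME"], env["CRUNCHBASE_PASSWORD"])
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError.
        raise RuntimeError(
            f"Set {missing.args[0]} before running the scraper"
        ) from None
```

The resulting object can then be passed to whichever class performs the login, which keeps the password out of the repository.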

AngelList is a platform that connects startups with investors and with job candidates looking to work at startups. Its goal is to democratize the investment process, helping startups with both fundraising and talent. Scraping AngelList is a bit tricky. You first need to scrape the general information about companies from the ‘companies’ page by providing some search keywords and automatically clicking the ‘More’ button to load further results. To avoid being blocked, you need to make the app sleep between scraping two pages. Moreover, AngelList doesn’t show more than 400 results per search, so you need to narrow each search with filters to get as much data as you can from your scraping. Then, to get more information about a company (such as fundraising rounds, social media pages, and size), you need to scrape its profile page on AngelList as well.
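
The pacing and the 400-result cap described above can be sketched as a loop that runs several narrow searches, pauses for a random interval after each page, and removes duplicate companies across searches. This is an illustrative outline under stated assumptions, not the repository's implementation: `scrape_with_delay` and its parameters are hypothetical, `fetch_page` stands in for whatever code loads one page of results (for example by clicking ‘More’), each result is assumed to be a dict with a `name` key, and the delay bounds are arbitrary.

```python
import random
import time
from typing import Callable, Iterable, List


def scrape_with_delay(
    keywords: Iterable[str],
    fetch_page: Callable[[str, int], List[dict]],
    max_results_per_search: int = 400,  # cap mentioned in the text
    min_delay: float = 2.0,             # assumed bounds, tune as needed
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Run one narrow search per keyword, sleeping between page loads."""
    seen = set()
    companies = []
    for keyword in keywords:
        collected = 0
        page = 1
        while collected < max_results_per_search:
            rows = fetch_page(keyword, page)
            if not rows:
                break  # no more results for this search
            for row in rows:
                collected += 1
                # Different searches can return the same company.
                if row["name"] not in seen:
                    seen.add(row["name"])
                    companies.append(row)
            page += 1
            # Random pause so requests are not sent at a fixed rhythm.
            sleep(random.uniform(min_delay, max_delay))
    return companies
```

Splitting one broad query into several narrower keywords or filters keeps each search under the cap, and the deduplication step merges the overlapping results.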

webscrapping_angellist_crunchbase's People

Contributors

asafilian
