Code Monkey home page Code Monkey logo

instascrape's Introduction

instascrape: powerful Instagram data scraping toolkit

Version Downloads Release License

Activity Dependencies Issues Code style: black

What is it?

instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block on the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.

Key features

Here are a few of the things that instascrape does well:

  • Powerful, object-oriented scraping tools for profiles, posts, hashtags, reels, and IGTV
  • Scrapes HTML, BeautifulSoup, and JSON
  • Download content to your computer as png, jpg, mp4, and mp3
  • Dynamically retrieve HTML embed code for posts
  • Expressive and consistent API for concise and elegant code
  • Designed for seamless integration with Selenium, Pandas, and other industry standard tools for data collection and analysis
  • Lightweight; no boilerplate or configurations necessary
  • The only hard dependencies are Requests and Beautiful Soup
  • Proven to work as of January, 2021

Table of Contents


πŸ’» Installation

Minimum Python version

This library currently requires Python 3.7 or higher.

pip

Install from PyPI using

$ pip3 install insta-scrape

WARNING: make sure you install insta-scrape and not a package with a similar name!


πŸ”Ž Sample Usage

All top-level, ready-to-use features can be imported using:

from instascrape import *

instascrape uses clean, consistent, and expressive syntax to make the developer experience as painless as possible.

# Instantiate the scraper objects 
google = Profile('https://www.instagram.com/google/')
google_post = Post('https://www.instagram.com/p/CG0UU3ylXnv/')
google_hashtag = Hashtag('https://www.instagram.com/explore/tags/google/')

# Scrape their respective data 
google.scrape()
google_post.scrape()
google_hashtag.scrape()

print(google.followers)
print(google_post['hashtags'])
print(google_hashtag.amount_of_posts)
>>> 12262794
>>> ['growwithgoogle']
>>> 9053408

See the Scraped data points section of the Wiki for a complete list of the scraped attributes provided by each scraper.

πŸ“š Documentation

The official documentation can be found on Read The Docs


πŸ“° Blog Posts

Check out blog posts on the official site or DEV for ideas and tutorials!


πŸ™ Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome!

Feel free to open an Issue, check out existing Issues, or start a discussion.

Beginners to open source are highly encouraged to participate and ask questions if you're unsure what to do/where to start ❀️


πŸ•ΈοΈ Dependencies


πŸ’³ License

This library operates under the MIT license.


❔ Support

Check out the FAQ

Reach out to me if you want to connect or have any questions!


DISCLAIMER: With great power comes great responsibility. This is a research project and I am not responsible for how you use it. Independently, the library is designed to be responsible and respectful and it is up to you to decide what you do with it.

instascrape's People

Contributors

benji011 avatar chris-greening avatar dependabot[bot] avatar dragid10 avatar fernando24164 avatar o3661606 avatar paola351 avatar tarob0ba avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.