
instagramy's Introduction

Instagramy

Python Package for Instagram Without Any external dependencies


Scrape Instagram user information, post data, hashtags and location data. This package scrapes a user's recent posts along with information such as likes, comments and captions. No external dependencies.

Features

Download

Installation

pip install instagramy

Upgrade

pip install instagramy --upgrade

Sample Usage

Getting the Session Id of Instagram

To log in to Instagram via instagramy, a session id is required. No username or password is needed. You must be logged in to Instagram in a browser to get the session id.

  1. Log in to Instagram in your default web browser
  2. Open the browser's developer tools
  3. Copy the sessionid
    • Go to Storage, then Cookies, and copy the sessionid (Firefox)
    • Go to Application, then Storage, then Cookies, and copy the sessionid (Chrome)

Note: Check the session id frequently; Instagram may change it.
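
Because the session id can change, it helps to keep it out of the source code. A minimal sketch, assuming the value is stored in an environment variable named SESSION_ID (the name used later in this README); load_session_id is a hypothetical helper and not part of instagramy:

```python
import os


def load_session_id(var_name="SESSION_ID"):
    """Return the session id stored in an environment variable.

    Hypothetical helper, not part of instagramy.
    """
    session_id = os.environ.get(var_name, "").strip()
    if not session_id:
        # Fail early with a readable message instead of sending a bad request.
        raise RuntimeError(
            f"{var_name} is not set; copy the sessionid cookie from your browser"
        )
    return session_id
```

The returned string can then be passed as the sessionid argument of the classes described below.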

Instagram User details

Class InstagramUser scrapes some of the information related to an Instagram user.

>>> from instagramy import InstagramUser

>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

>>> user = InstagramUser('google', sessionid=session_id)

>>> user.is_verified
True

>>> user.biography
'Google unfiltered—sometimes with filters.'

>>> user.user_data # More data about user as dict

After a user's data has been fetched once, instagramy stores it in a cache file to help avoid errors on later requests. You can also load the data from the cache, in which case the sessionid is not needed.

>>> from instagramy import InstagramUser

>>> user = InstagramUser('google', from_cache=True)

>>> user.is_verified
True

This option is available in the classes InstagramUser, InstagramHashTag and InstagramPost.

Show all Properties

  • biography
  • connected_fb_page
  • followed_by_viewer
  • follows_viewer
  • fullname
  • has_blocked_viewer
  • has_country_block
  • has_requested_viewer
  • is_blocked_by_viewer
  • is_joined_recently
  • is_private
  • is_verified
  • no_of_mutual_follower
  • number_of_followers
  • number_of_followings
  • number_of_posts
  • other_info
  • posts
  • posts_display_urls
  • profile_picture_url
  • requested_by_viewer
  • restricted_by_viewer
  • username
  • website

InstagramUser.user_data holds more data than what is defined as properties.

Instagram Hashtag details

Class InstagramHashTag scrapes some of the information related to an Instagram hashtag.

You can also set your sessionid as an environment variable:

$ export SESSION_ID="38566737751%3Ah7JpgePGAoLxJe%er40q"

>>> import os

>>> from instagramy import InstagramHashTag

>>> session_id = os.environ.get("SESSION_ID")

>>> tag = InstagramHashTag('google', sessionid=session_id)

>>> tag.number_of_posts
9556876

>>> tag.tag_data # More data about hashtag as dict
Show all Properties

  • number_of_posts
  • posts_display_urls
  • profile_pic_url
  • tagname
  • top_posts

InstagramHashTag.tag_data holds more data than what is defined as properties.

Instagram Post details

Class InstagramPost scrapes some of the information related to a particular Instagram post. It takes the post id as its parameter. You can get the post id from the URL of the Instagram post, or from the properties InstagramUser.posts or InstagramHashTag.top_posts.

>>> from instagramy import InstagramPost

>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

>>> post = InstagramPost('CLGkNCoJkcM', sessionid=session_id)

>>> post.author
'ipadpograffiti'

>>> post.number_of_likes
1439

>>> post.post_data # More data about post as dict
Show all Properties

  • author
  • caption
  • display_url
  • get_json
  • number_of_comments
  • number_of_likes
  • post_source
  • text
  • type_of_post
  • upload_time

InstagramPost.post_data holds more data than what is defined as properties.
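
The post id mentioned at the start of this section is the path segment after /p/ in a post URL. A small sketch of pulling it out with the standard library; parse_post_id is a hypothetical helper, not part of instagramy, and it assumes the URL has the /p/<post_id>/ form:

```python
from urllib.parse import urlparse


def parse_post_id(url):
    """Return the post id from an Instagram post URL.

    Hypothetical helper, not part of instagramy.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path: p/<post_id>
    if len(parts) < 2 or parts[0] != "p":
        raise ValueError(f"not a post URL: {url!r}")
    return parts[1]


print(parse_post_id("https://www.instagram.com/p/CNrtoeOp_Et/"))  # CNrtoeOp_Et
```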

Instagram Location details

Class InstagramLocation scrapes some of the information and posts related to the given location. It takes the location id and slug as parameters. You can get the location id and slug from the URL of the Instagram location, or from the properties InstagramPost.location.id and InstagramPost.location.slug.

>>> from instagramy import InstagramPost

>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

>>> post = InstagramPost('CLGkNCoJkcM', sessionid=session_id)

>>> location_id, slug = post.location.id, post.location.slug

>>> from instagramy import InstagramLocation

>>> location = InstagramLocation(location_id, slug, session_id)

>>> location.latitude
28.6139

>>> location.longitude
77.2089

>>> location.address
{'street_address': 'T2, Indira Gandhi International Airport', 'zip_code': '', 'city_name': 'New Delhi', 'region_name': '', 'country_code': 'IN', 'exact_city_match': False, 'exact_region_match': False, 'exact_country_match': False}

You can also get the location id and slug from the Instagram URL:

https://www.instagram.com/explore/locations/977862530/mrc-nagar
https://www.instagram.com/explore/locations/<location_id>/<slug>
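
The URL pattern above can be split with the standard library. A minimal sketch; parse_location_url is a hypothetical helper, not part of instagramy:

```python
from urllib.parse import urlparse


def parse_location_url(url):
    """Split an Instagram location URL into (location_id, slug).

    Hypothetical helper, not part of instagramy.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path: explore/locations/<location_id>/<slug>
    if len(parts) < 4 or parts[:2] != ["explore", "locations"]:
        raise ValueError(f"not a location URL: {url!r}")
    return parts[2], parts[3]


location_id, slug = parse_location_url(
    "https://www.instagram.com/explore/locations/977862530/mrc-nagar"
)
print(location_id, slug)  # 977862530 mrc-nagar
```
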
Show all Properties

  • address
  • id
  • latitude
  • location_data
  • longitude
  • name
  • number_of_posts
  • phone
  • profile_pic_url
  • sessionid
  • slug
  • top_posts
  • url
  • viewer
  • website

InstagramLocation.location_data holds more data than what is defined as properties.

Plugins

Instagramy ships with some plugins for convenience.

Plugins for Data Analyzing

  • analyze_users_popularity
  • analyze_hashtags
  • analyze_user_recent_posts
>>> import pandas as pd
>>> from instagramy.plugins.analysis import analyze_users_popularity

>>> session_id = "38566737751%3Ah7JpgePGAoLxJe%334"

>>> teams = ["chennaiipl", "mumbaiindians",
        "royalchallengersbangalore", "kkriders",
        "delhicapitals", "sunrisershyd",
        "kxipofficial"]
>>> data = analyze_users_popularity(teams, session_id)
>>> pd.DataFrame(data)

                   Usernames  Followers  Following  Posts
0                 chennaiipl    6189292        194   5646
1              mumbaiindians    6244961        124  12117
2  royalchallengersbangalore    5430018         59   8252
3                   kkriders    2204739         68   7991
4              delhicapitals    2097515         75   9522
5               sunrisershyd    2053824         70   6227
6               kxipofficial    1884241         67   7496
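
The same numbers can be compared without pandas. The sketch below ranks accounts by followers per post. It assumes the returned data is a dict of equal-length lists keyed by the column names in the table above; that shape is an assumption based on the printed DataFrame, and followers_per_post is a hypothetical helper rather than one of the plugins:

```python
def followers_per_post(data):
    """Return (username, followers / posts) pairs, highest ratio first.

    Assumes `data` is a dict of equal-length lists with the keys
    "Usernames", "Followers" and "Posts" (an assumption, see above).
    """
    rows = zip(data["Usernames"], data["Followers"], data["Posts"])
    # Skip accounts with zero posts to avoid dividing by zero.
    ratios = [(name, followers / posts) for name, followers, posts in rows if posts]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


sample = {
    "Usernames": ["chennaiipl", "mumbaiindians"],
    "Followers": [6189292, 6244961],
    "Posts": [5646, 12117],
}
for name, ratio in followers_per_post(sample):
    print(f"{name}: {ratio:.1f} followers per post")
```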

Plugins for Downloading Posts

  • download_hashtags_posts
  • download_post
  • download_profile_pic
>>> import os

>>> from instagramy.plugins.download import *

>>> session_id = os.environ.get('SESSION_ID')

>>> download_profile_pic(username='google', sessionid=session_id, filepath='google.png')

>>> download_post(id="ipadpograffiti", sessionid=session_id, filepath='post.mp4')

>>> download_hashtags_posts(tag="tamil", session_id=session_id, count=2)

Use Without Login

You can use this package without logging in. The sessionid is not required, but the package may raise RedirectionError after four to five requests.

>>> from instagramy import *

>>> user = InstagramUser('google')
>>> user.fullname
'Google'
>>> tag = InstagramHashTag('python')
>>> tag.tag_data
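
One way to cope with errors that appear after a few requests is to retry with a growing pause. This is a generic sketch, not something instagramy provides: call_with_backoff and its parameters are hypothetical, and whether waiting actually clears the redirect is not guaranteed.

```python
import time


def call_with_backoff(func, retry_on, attempts=3, base_delay=5.0, sleep=time.sleep):
    """Call func(), retrying with a doubling delay when retry_on is raised.

    Hypothetical helper, not part of instagramy.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the original error
            sleep(base_delay * 2 ** attempt)  # 5s, 10s, 20s, ...
```

With the exception class that appears in the tracebacks under Issues (instagramy.core.exceptions.RedirectionError), a call might look like call_with_backoff(lambda: InstagramUser('google'), RedirectionError).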

Caching Feature

From version 4.3, instagramy caches the data it fetches. After the data for a user has been fetched once, instagramy stores it as a JSON cache file to help avoid errors. You can then load the data from the cache without providing the sessionid: pass the optional parameter from_cache=True instead.

>>> from instagramy import InstagramUser

>>> user = InstagramUser('google', from_cache=True)

>>> user.is_verified
True

This option is available in all classes: InstagramUser, InstagramHashTag, InstagramPost and InstagramLocation.

Clear all caches created by instagramy in the current directory:

>>> from instagramy.core.cache import clear_caches

>>> clear_caches() # clear all caches of instagramy

List all cache files created by instagramy in the current directory:

>>> from instagramy import list_caches

>>> list_caches() # list all caches of instagramy

Sample Scripts

Getting Email address and phone number

from instagramy import InstagramUser

user = InstagramUser('username')
email, phone_number = user.user_data['business_email'], user.user_data['business_phone_number']
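
Indexing with square brackets raises KeyError if a key is absent. A slightly more defensive sketch, assuming these keys can be missing for accounts that publish no contact details (that is an assumption); get_business_contact is a hypothetical helper that works on any user_data dict:

```python
def get_business_contact(user_data):
    """Return (email, phone) from a user_data dict, with None for missing keys.

    Hypothetical helper, not part of instagramy.
    """
    return (
        user_data.get("business_email"),
        user_data.get("business_phone_number"),
    )


# Plain dict used here so the example runs without a network request.
print(get_business_contact({"business_email": "hello@example.com"}))
# ('hello@example.com', None)
```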

✏️ Important Notes

  • Don't send a large number of requests to Instagram with a sessionid; Instagram may ban you.
  • You can use this package without a sessionid (login), but it may raise RedirectionError after four to five requests.
  • Class Viewer provides data about the currently logged-in user.
  • Check the session id frequently; Instagram may change it.
  • If code execution never completes, check and change your session id and try again.
  • Don't provide a wrong session_id.
  • InstagramUser.user_data, InstagramPost.post_data, InstagramHashTag.tag_data and InstagramLocation.location_data are Python dicts that hold more data than what is defined as properties.
  • This package does not scrape all the posts from an account; the limit is 12 posts (for non-private accounts).
  • This package does not scrape all the posts of a given hashtag or location; it only scrapes the top 60 - 72 posts.

Disclaimer

If you send a large number of requests to Instagram with a session id, Instagram may ban you. I am not responsible for any misuse or damage caused by this program.

License

MIT License

Contributions

Contributions are welcome. Feel free to report bugs in the issue tracker and fix bugs by creating pull requests. Comments, suggestions, improvements and enhancements are always welcome. Let's discuss them here.

Made with Python ❤️


instagramy's Issues

Amazing repo, but the README could use an update for InstagramUser

README seems pretty outdated. I had to read the core code to figure out how to use the package.

Here is just an example:

# Connecting the profile
user = InstagramUser(uname, os.environ.get("INSTAGRAM_SESSIONID"))
# get user data
hls = user.user_data
# get high level stats such as followers, following, total_posts
followers, following, total_posts = hls['edge_followed_by']['count'], hls['edge_follow']['count'], hls["edge_owner_to_timeline_media"]["count"]
print("followers, following, total_posts = ", followers, following, total_posts)

pip install instagramy not working

pip install instagramy
Collecting instagramy
Using cached https://files.pythonhosted.org/packages/01/4c/5376dd3567e7b31af3d8fd996ee181c9e0d77373d9970114a0c18054042c/instagramy-3.7.tar.gz
ERROR: Command errored out with exit status 1:
command: 'C:\Users\user\Anaconda3\envs\env\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\user\AppData\Local\Temp\pip-install-hjnkteis\instagramy\setup.py'"'"'; file='"'"'C:\Users\user\AppData\Local\Temp\pip-install-hjnkteis\instagramy\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base pip-egg-info
cwd: C:\Users\user\AppData\Local\Temp\pip-install-hjnkteis\instagramy
Complete output (7 lines):
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\user\AppData\Local\Temp\pip-install-hjnkteis\instagramy\setup.py", line 4, in
long_description = fh.read()
File "C:\Users\user\Anaconda3\envs\env\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 3018: character maps to
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
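
The traceback above ends in the cp1252 codec, which suggests setup.py read a UTF-8 file using the Windows default encoding. A small sketch of that failure mode follows. The pencil character is only an illustrative guess at the kind of character involved, not a confirmed cause, and decodes_with is a hypothetical helper:

```python
def decodes_with(raw, encoding):
    """Return True if the bytes decode cleanly with the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


text = "\u270f"             # pencil character, chosen for illustration only
raw = text.encode("utf-8")  # b'\xe2\x9c\x8f', the last byte is 0x8f

print(decodes_with(raw, "cp1252"))  # False: 0x8f has no mapping in cp1252
print(decodes_with(raw, "utf-8"))   # True
```

The usual remedy for this class of error is to pass encoding="utf-8" to the open() call in setup.py; the file name opened there is not shown in the log.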

KeyError: 'graphql'

It was working a month ago. I don't know what happened now.
from instagramy import InstagramPost

post = InstagramPost('CLGkNCoJkcM')

File ~\Anaconda3\lib\site-packages\instagramy\InstagramPost.py:67 in __init__
post_id, data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"]

post.caption is not returning right value

I was using this post as a reference- "https://www.instagram.com/p/CNrtoeOp_Et/"

Given caption= Photo shared by LLuvia Bakery on April 15, 2021 tagging @yogisthaan, @healthybuddhaorganic, @studioalaya_proj, fabcafe.in, @masmaracrafts, @Pronatureorganic, @ping_this, and @madansolanki10. May be an image of 5 people and text.

Original caption= It's been a long journey and we have nothing else filling our hearts today except gratitude.
Gratitude for the support and love and patience that everyone in the entire small business ecosystem has showered on us in the last few years. Nothing more to be said , just deep felt Thanks.

Post Text

Would it be possible to include the post text when scraping user posts? I'm interested in grabbing the number of hashtags used over the last X (say 12) posts. For example, say over the last 12 posts, on average there are 5 hashtags used per post. If the Post object included the post text, I think I could get what I'm looking for. This isn't really an issue, more of a request or suggestion for your consideration.
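
Once caption strings are available, counting hashtags needs only the standard library. A sketch under that assumption: how the captions are obtained is left out, and average_hashtags is a hypothetical helper, not part of instagramy.

```python
import re

# A hashtag is taken to be '#' followed by word characters (a simplification).
HASHTAG_RE = re.compile(r"#\w+")


def average_hashtags(captions):
    """Return the mean number of hashtags per caption (0.0 for no captions)."""
    if not captions:
        return 0.0
    # Treat a missing caption (None) as an empty string.
    counts = [len(HASHTAG_RE.findall(caption or "")) for caption in captions]
    return sum(counts) / len(counts)


print(average_hashtags(["New drop #art #ipad", "#graffiti"]))  # 1.5
```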

KeyError: 'graphql'

I am getting the error when reproducing the example.

from instagramy import InstagramPost

post = InstagramPost('CLGkNCoJkcM', sessionid=session_id)

Fixing issue with analyze_user_post.

analyze_user_post raises an error saying that a tuple cannot be accessed the way a dictionary is. I fixed the problem by editing the for loop as shown in the screenshot (I did not have a use for post_url in my code, but it can be edited the same way).

(Screenshot attached: Screen Shot 2022-04-07 at 11 59 22 PM)

PostPage empty, parsing update necessary?

Hi,

I think Instagram changed something again.

I am using this code:

from instagramy import InstagramPost
sessionid='MYSESSIONID3AXFvgRizsd7NJNR%3A26'
code= 'CLt6RG7pfqk'
post= InstagramPost(code, sessionid = sessionid)

error:

Traceback (most recent call last):
  File "/home/john/python-code/webAPI/social_media/proj/qa/instaloot.py", line 4, in <module>
    post= InstagramPost(code, sessionid = sessionid)
  File "/home/john/.virtualenvs/some_proj/lib/python3.8/site-packages/instagramy/InstagramPost.py", line 56, in __init__
    raise RedirectionError
instagramy.core.exceptions.RedirectionError: Instagram Redirects you to login page, Try After Sometime or Reboot your PC Provide the sessionid to Login

I tried debugging a bit and I think it has to do with the parsing. PostPage is empty inside the dict returned by urlopen, but when I look at the post I see data for the right post, so I am guessing it's the parsing.

Unfortunately I don't know enough about HTML-parsing to fix it myself.

Error with InstagramPost

Hello Yogeshwarn!

Thank you for your work!
I would like to use your scraper in my research to analyze museums visibility and how it translates to increase of museums visitors in Italy. As I understood, you updated some packages 8 days ago, however, still I have some problems.
Unfortunately, I have an error using InstagramPost and user.posts_display_urls
(see the attached screenshots from 2022-03-30).

Thank you again and have a nice day!

key errors

Generates key errors, relevant json is moved.
my tests:

import instagramy

hashtag = "….."
post_id = "….."
user = "…."

ig_session_id = "…."

t = instagramy.InstagramHashTag(hashtag)
Error Info: <class 'KeyError'>: 'TagPage'

worked correctly for some days without login, now generates this error

t = instagramy.InstagramHashTag(hashtag, sessionid=ig_session_id)
Error Info: <class 'KeyError'>: 'graphql'

p = instagramy.InstagramPost(post_id, sessionid=ig_session_id)
Error Info: <class 'KeyError'>: 'graphql'

I think relevant post data moved to script block: window.__additionalDataLoaded

u = instagramy.InstagramUser(user, sessionid=ig_session_id)

this works correctly

KeyError: 'ProfilePage' error while using "InstagramUser"

facing error on running the following query:

from instagramy import InstagramUser
user = InstagramUser("github")
user.number_of_followers

Error:


JSONDecodeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/instagramy/InstagramUser.py in get_json(self)
44 try:
---> 45 return extract_user_profile(scripts[4])
46 except (json.decoder.JSONDecodeError, KeyError):

7 frames
JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

KeyError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/instagramy/InstagramUser.py in extract_user_profile(script)
16 data = script.contents[0]
17 info = json.loads(data[data.find('{"config"') : -1])
---> 18 return info["entry_data"]["ProfilePage"][0]["graphql"]["user"]
19
20

KeyError: 'ProfilePage'

Posts Limits is 12

Hi, I was trying to scrape all the posts from a certain account, but the number of posts returned is always limited to 12.
It would be convenient to get all the posts.

Impossible to use it

Traceback (most recent call last):
  File "rs_i.py", line 6, in <module>
    user = InstagramUser('github')
  File "/home/phenx/.local/lib/python3.8/site-packages/instagramy/InstagramUser.py", line 32, in __init__
    if check_username(username):
  File "/home/phenx/.local/lib/python3.8/site-packages/instagramy/InstagramChecks.py", line 26, in check_username
    token = response.cookies["csrftoken"]
  File "/usr/lib/python3/dist-packages/requests/cookies.py", line 328, in __getitem__
    return self._find_no_duplicates(name)
  File "/usr/lib/python3/dist-packages/requests/cookies.py", line 399, in _find_no_duplicates
    raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='csrftoken', domain=None, path=None"

Error scraping hashtags - Graphql KeyError

When I execute this code:

from instagramy import InstagramHashTag
tag_latest = InstagramHashTag("my_hashtag", sessionid="my_session_id")

I receive the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-1-bb45efb6917a> in <module>
      1 from instagramy import InstagramHashTag
----> 2 tag_latest = InstagramHashTag("******", sessionid="4609********************")

C:\ProgramData\Miniconda3\envs\scrap38\lib\site-packages\instagramy\InstagramHashTag.py in __init__(self, tag, sessionid, from_cache)
     58             data = self.get_json()
     59             cache.make_cache(
---> 60                 tag, data["entry_data"]["TagPage"][0]["graphql"]["hashtag"]
     61             )
     62             try:

KeyError: 'graphql'

Apparently this graphql key doesn't exist inside the object data["entry_data"]["TagPage"][0].
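
When a nested lookup like this fails, it helps to know exactly which step was missing. A minimal sketch of a defensive lookup; dig is a hypothetical helper, not part of instagramy, and it does not restore the missing data:

```python
def dig(data, path):
    """Follow `path` through nested dicts and lists.

    Raises LookupError naming the first step that is missing.
    """
    current = data
    for depth, key in enumerate(path):
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            walked = "".join(f"[{k!r}]" for k in path[:depth])
            raise LookupError(f"missing {key!r} after data{walked}") from None
    return current


# Hand-made dict that mimics the situation described above.
page = {"entry_data": {"TagPage": [{"other": 1}]}}
try:
    dig(page, ["entry_data", "TagPage", 0, "graphql", "hashtag"])
except LookupError as exc:
    print(exc)  # missing 'graphql' after data['entry_data']['TagPage'][0]
```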

What does this error mean?

raise AttributeError(f"module 'pandas' has no attribute '{name}'")
AttributeError: module 'pandas' has no attribute 'Dataframe'

UsernameNotFound?

For some days now, it doesn't work anymore. Every time I try to run the program with simple accounts like google, it gives me the following error as output:

raise UsernameNotFound(self.url.split("/")[-2])
instagramy.core.exceptions.UsernameNotFound: InstagramUser('google') not Found

Instagramy doesn't work on a hosted API

I am building an API in Flask that gets Instagram hashtag posts. My problem is that when I run the Flask application locally, it works fine. However, when I host it on Azure's web service or PythonAnywhere, I keep getting an internal server error. Has anyone faced this issue?

Error while using instagramy:

2022-05-11 07:33:07,644: Exception on /api/instaposts [GET]
Traceback (most recent call last):
  File "/home/datamindsetstest/mysite/app.py", line 136, in api_posts
    hashtag = InstagramHashTag(word)
  File "/home/datamindsetstest/.local/lib/python3.9/site-packages/instagramy/InstagramHashTag.py", line 60, in __init__
    tag, data["entry_data"]["TagPage"][0]["graphql"]["hashtag"]
KeyError: 'TagPage'
