
Comments (14)

bwyyoung commented on June 1, 2024

Ok Sir. Thank you for your help.

from instamancer.

ScriptSmith commented on June 1, 2024

It's not really something this project is intended for, but I'll consider adding it.

In the meantime, you can do this with plugins in ES2018 TypeScript:

import { IPlugin, IPluginContext, createApi } from "instamancer";

// Shape of the part of window._sharedData that the plugin reads
type PageData = { entry_data: { ProfilePage: [{ graphql: { user: {} } }] } }

class UserData<PostType> implements IPlugin<PostType> {
    constructionEvent(this: IPluginContext<UserData<PostType>, PostType>) {
        // Keep a reference to the original start method so it can be wrapped
        const oldStart = this.state.start

        this.state.start = async () => {
            await oldStart.bind(this.state)()
            // Read Instagram's page data from inside the browser context
            const data: PageData = await this.state.page.evaluate(() => {
                //@ts-ignore
                return window["_sharedData"]
            })
            console.log(data.entry_data.ProfilePage[0].graphql.user);
            // Stop once the profile data has been printed
            await this.state.forceStop(true)
        }
    }
}

const user = createApi("user", "spyvonne_chloe", {
    plugins: [
        new UserData(),
    ],
})

user.start()


bwyyoung commented on June 1, 2024

I got this error:
(node:47481) UnhandledPromiseRejectionWarning: TypeError: Cannot read property '0' of undefined
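The message suggests that `entry_data.ProfilePage` was undefined when the plugin indexed it with `[0]`. A minimal sketch of a defensive accessor follows; the `SharedData` type and the `extractUser` helper are hypothetical names introduced here for illustration and are not part of instamancer.

```typescript
// Hypothetical shape: only the fields the plugin reads, all marked optional
// because the page may not be a profile page.
type SharedData = {
    entry_data?: {
        ProfilePage?: Array<{ graphql?: { user?: Record<string, unknown> } }>;
    };
};

// Returns the user object, or undefined when any level of the path is missing,
// instead of throwing a TypeError.
function extractUser(data: SharedData | undefined): Record<string, unknown> | undefined {
    return data?.entry_data?.ProfilePage?.[0]?.graphql?.user;
}
```

A caller can then check for `undefined` and log or retry rather than leaving the promise rejection unhandled.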


bwyyoung commented on June 1, 2024

I found the issue. Instagram requires a login after several calls of the plugin you wrote. It works initially, and I am able to retrieve the graphql data. However, after several calls it returns an HTML page asking the user to log in.

Is there any way to resolve this issue without using the plugin?
I also tried the method below, but it eventually stops working if I stay on the same IP address:
https://learnscraping.com/scraping-instagram-profile-data-with-nodejs/

If I change my IP address, for example by using a mobile phone hotspot, it works again.


bwyyoung commented on June 1, 2024

Is there a way user profile scraping can be integrated into Instamancer? If not, would you have a rough idea of how to do it with puppeteer, or by adapting your code to add this function?


ScriptSmith commented on June 1, 2024

Well you're probably being rate limited because you're asking too much of Instagram. Make sure you're not doing anything else with Instagram in the background, and try sleeping 5 seconds between scraping each profile.
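As a rough sketch of what sleeping between profiles could look like: the `scrapeOne` callback below is a hypothetical stand-in for whatever scrapes a single profile, and the 5000 ms default only mirrors the 5 seconds suggested above.

```typescript
// Resolves after the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
    new Promise<void>((resolve) => setTimeout(resolve, ms));

// Scrapes usernames one at a time, pausing delayMs between consecutive profiles.
// scrapeOne is a hypothetical callback, not an instamancer function.
async function scrapeSequentially<T>(
    usernames: string[],
    scrapeOne: (username: string) => Promise<T>,
    delayMs: number = 5000,
): Promise<T[]> {
    const results: T[] = [];
    for (let i = 0; i < usernames.length; i++) {
        // No pause is needed before the first profile.
        if (i > 0) {
            await sleep(delayMs);
        }
        results.push(await scrapeOne(usernames[i]));
    }
    return results;
}
```

Running the profiles strictly in sequence also avoids overlapping requests, which fits the advice not to do anything else with Instagram in the background.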


bwyyoung commented on June 1, 2024

Already tried sleeping. The same problem happens when using the plugin.

However, instamancer itself works just fine. I am still able to see the whole JSON that was output from instamancer, but the plugin method does not work.

If Instagram were rate limiting me, shouldn't instamancer fail as well? Is there a way around this problem by doing things the way instamancer does?


ScriptSmith commented on June 1, 2024

I'm unable to reproduce rate-limiting when sleeping between users. Does the following example work for you?

https://gist.github.com/ScriptSmith/b437b33c4f2005eb197f63c3a28f9dab


bwyyoung commented on June 1, 2024

Thank you for the example. It worked initially, and I tested it from a new IP address.
However, after about 1 hour of sleep-based requests for user info, Instagram blocks further requests and asks the user to log in.

It seems that this method only works temporarily and isn't very reliable.


ScriptSmith commented on June 1, 2024

Well, I don't think what you're after is really possible without either logging in, or sleeping longer and 'hibernating' for a while when Instagram rate limits you. You might have some luck with the other tools listed at the bottom of the README, but I doubt they'd be any better.
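One way to sketch the 'hibernating' idea is an exponential backoff, which is a swapped-in technique for illustration rather than anything instamancer provides. The `nextDelayMs` helper and its default values are assumptions, not tuned numbers.

```typescript
// Returns how long to wait before the next attempt.
// consecutiveBlocks counts how many attempts in a row were rate limited;
// the delay doubles with each one and is capped at maxMs.
// Defaults (5 seconds base, 1 hour cap) are illustrative assumptions.
function nextDelayMs(
    consecutiveBlocks: number,
    baseMs: number = 5000,
    maxMs: number = 3600000,
): number {
    return Math.min(baseMs * Math.pow(2, consecutiveBlocks), maxMs);
}
```

The caller would reset `consecutiveBlocks` to 0 after a successful scrape, so normal requests keep the short base delay and only repeated blocks stretch the wait.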


bwyyoung commented on June 1, 2024

I see. Is it not possible to obtain profile information with the way Instamancer works? This is because despite being rate limited, I can still use Instamancer's technique to obtain hashtag info.


ScriptSmith commented on June 1, 2024

The plugin implements how I'll add profile scraping to instamancer. There's no 'api' as such to retrieve profile information like there is for posts in a hashtag, so it has to be read from the page or memory. You can read about how instamancer works here.

The only difference is that the plugin uses a new session for each profile. Using the same session (like instamancer post postid1,postid2...) to gather multiple profiles may work better to mitigate rate limiting, or it may have the opposite effect.


bwyyoung commented on June 1, 2024

I understand. Thank you for clarifying that.
But do you think there might be a way to do it if we could pass a page token or user API token from Facebook into instamancer as an option?
https://developers.facebook.com/docs/instagram-api/reference/user

Here, business user information can be obtained with a GET request to the Facebook API. The only thing we need is a token passed with the request to retrieve the JSON information. This way, we wouldn't be rate limited or blocked by Instagram.
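To make the idea concrete, here is a sketch of how such a GET URL might be assembled. It assumes the usual Graph API layout of host, version, user ID, and `fields` and `access_token` query parameters; the `buildUserUrl` helper is hypothetical, and the version string, ID, token, and field names in the usage lines are placeholders that should be checked against the documentation linked above.

```typescript
// Builds a request URL for an Instagram business user.
// All arguments are supplied by the caller; nothing here contacts the network.
function buildUserUrl(
    apiVersion: string,
    igUserId: string,
    accessToken: string,
    fields: string[],
): string {
    const url = new URL(`https://graph.facebook.com/${apiVersion}/${igUserId}`);
    // searchParams.set percent-encodes values, so tokens and commas are safe.
    url.searchParams.set("fields", fields.join(","));
    url.searchParams.set("access_token", accessToken);
    return url.toString();
}

// Placeholder values for illustration only.
const exampleUrl = buildUserUrl("vX.Y", "1234567890", "PLACEHOLDER_TOKEN", [
    "username",
    "followers_count",
]);
console.log(exampleUrl);
```

The resulting string could then be passed to any HTTP client to fetch the JSON.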


ScriptSmith commented on June 1, 2024

Instamancer is a web scraper, so I wouldn't consider implementing something that directly interacts with a regular API; there are better tools for that.

There are many tools available that work with Facebook's Graph API. I'd recommend using one of those instead.

