Comments (8)
It might be worth using the debug logging level to see if there are any more clues.
My guess is that your VPS has limited memory, which causes the Chrome process to be killed.
from instamancer.
Here's the log output.
{"message":"Starting API at 1580682529697","level":"info"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Launching","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Navigating","total":3},"level":"debug"}
{"name":"TimeoutError","level":"error","message":"ErrorNavigation timeout of 30000 ms exceeded","stack":"TimeoutError: Navigation timeout of 30000 ms exceeded\n at /usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/LifecycleWatcher.js:142:21\n -- ASYNC --\n at Frame.<anonymous> (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/helper.js:111:15)\n at Page.goto (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/Page.js:670:49)\n at Page.<anonymous> (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/helper.js:112:23)\n at User.constructPage (/usr/lib/node_modules/instamancer/src/api/instagram.js:587:29)\n at processTicksAndRejections (internal/process/task_queues.js:97:5)\n at async User.start (/usr/lib/node_modules/instamancer/src/api/instagram.js:222:9)\n at async spawn (/usr/lib/node_modules/instamancer/src/cli.js:343:5)\n at async Object.handler (/usr/lib/node_modules/instamancer/src/cli.js:72:9)"}
{"message":"https://instagram.com/therock","level":"error"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Request aborted","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":60,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":59,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":58,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":57,"state":"Scraping","total":3},"level":"debug"}
...
{"message":{"id":"therock","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Launching","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Navigating","total":3},"level":"debug"}
{"level":"error","message":"ErrorNavigation failed because browser has disconnected!","stack":"Error: Navigation failed because browser has disconnected!\n at CDPSession.<anonymous> (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/LifecycleWatcher.js:46:107)\n at CDPSession.emit (events.js:321:20)\n at CDPSession._onClosed (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/Connection.js:215:10)\n at Connection._onClose (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/Connection.js:138:15)\n at WebSocket.<anonymous> (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/WebSocketTransport.js:48:22)\n at WebSocket.onClose (/usr/lib/node_modules/instamancer/node_modules/ws/lib/event-target.js:124:16)\n at WebSocket.emit (events.js:321:20)\n at WebSocket.emitClose (/usr/lib/node_modules/instamancer/node_modules/ws/lib/websocket.js:191:10)\n at Socket.socketOnClose (/usr/lib/node_modules/instamancer/node_modules/ws/lib/websocket.js:850:15)\n at Socket.emit (events.js:321:20)\n -- ASYNC --\n at Frame.<anonymous> (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/helper.js:111:15)\n at Page.goto (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/Page.js:670:49)\n at Page.<anonymous> (/usr/lib/node_modules/instamancer/node_modules/puppeteer/lib/helper.js:112:23)\n at User.constructPage (/usr/lib/node_modules/instamancer/src/api/instagram.js:587:29)\n at processTicksAndRejections (internal/process/task_queues.js:97:5)\n at async User.constructPage (/usr/lib/node_modules/instamancer/src/api/instagram.js:620:13)\n at async User.start (/usr/lib/node_modules/instamancer/src/api/instagram.js:222:9)\n at async spawn (/usr/lib/node_modules/instamancer/src/cli.js:343:5)\n at async Object.handler (/usr/lib/node_modules/instamancer/src/cli.js:72:9)"}
{"message":"https://instagram.com/therock","level":"error"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Request aborted","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":60,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":59,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":58,"state":"Scraping","total":3},"level":"debug"}
...
{"message":{"id":"therock","index":0,"sleepRemaining":3,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":2,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
I also suspected a lack of memory, so I built VMs with very similar and even worse specs. Sometimes it took longer, but after that it scraped with no problem.
I think I'm going to upgrade the VPS to test it out.
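The debug output in this comment is one JSON object per line, so the error entries can be pulled out of the countdown noise mechanically. The following is a minimal sketch of that idea, not part of instamancer: the names `LogEntry`, `parseLogLines`, and `errorsOnly` are hypothetical, and the sample lines are copied from the log above with the long `stack` field left out.

```typescript
// Hypothetical helpers; only the log line shape is taken from the output above.
interface LogEntry {
    level?: string;
    message?: unknown;
    stack?: string;
}

// Parse newline-delimited JSON, skipping lines that are not JSON (such as "...").
function parseLogLines(text: string): LogEntry[] {
    const entries: LogEntry[] = [];
    for (const line of text.split("\n")) {
        const trimmed = line.trim();
        if (trimmed === "") {
            continue;
        }
        try {
            const parsed = JSON.parse(trimmed);
            if (parsed !== null && typeof parsed === "object") {
                entries.push(parsed as LogEntry);
            }
        } catch {
            // Not valid JSON; ignore the line.
        }
    }
    return entries;
}

// Keep only the entries logged at the "error" level.
function errorsOnly(entries: LogEntry[]): LogEntry[] {
    return entries.filter((entry) => entry.level === "error");
}

// Sample lines copied from the log above (stack trace omitted for brevity).
const sample = [
    '{"message":"Starting API at 1580682529697","level":"info"}',
    '{"message":{"id":"therock","index":0,"sleepRemaining":0,"state":"Navigating","total":3},"level":"debug"}',
    '{"name":"TimeoutError","level":"error","message":"ErrorNavigation timeout of 30000 ms exceeded"}',
    "...",
    '{"message":"https://instagram.com/therock","level":"error"}',
].join("\n");

for (const entry of errorsOnly(parseLogLines(sample))) {
    console.log(entry.message);
}
```

Filtering this way makes it easier to see that the two attempts failed differently: the first with a navigation timeout, the second with a browser disconnect.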
from instamancer.
I've made the upgrade and the "page crashed" error is gone, but now the logs show the HTML of Instagram's login page.
{"message":"Starting API at 1580867364091","level":"info"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Launching","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Navigating","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":2,"state":"Scraping","total":3},"level":"debug"}
{"message":"Failed: https://www.facebook.com/x/oauth/status?client_id=124024574287414&input_token&origin=1&redirect_uri=https%3A%2F%2Fwww.instagram.com%2Faccounts%2Flogin%2F&sdk=joey&wants_cookie_data=true","level":"info"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":2,"state":"Request aborted","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":2,"state":"Scraping","total":3},"level":"debug"}
{"_type":"warning","_text":"The resource https://www.instagram.com/static/bundles/metro/FBSignupPage.css/8caefd531e0f.css was preloaded using link preload but not used within a few seconds from the window's load event. Please make sure it has an appropriate `as` value and it is preloaded intentionally.","_args":[],"_location":{},"level":"info","message":"Console log"}
{"_type":"warning","_text":"The resource https://www.instagram.com/static/bundles/metro/LoginAndSignupPage.css/8acb7a798e78.css was preloaded using link preload but not used within a few seconds from the window's load event. Please make sure it has an appropriate `as` value and it is preloaded intentionally.","_args":[],"_location":{},"level":"info","message":"Console log"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":2,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":2,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":2,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":1,"state":"Scraping","total":3},"level":"debug"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Scraping","total":3},"level":"debug"}
{"content":"<!DOCTYPE html><html lang=\"en\" class=\"js not-logged-in client-root js-focus-visible sDN5V\"><head>
....
<div class=\"Z2m7o\"><div class=\"CgFia \"></div></div><div id=\"fb-root\" class=\" fb_reset\"><div style=\"position: absolute; top: -10000px; width: 0px; height: 0px;\"><div></div></div></div></body></html>","level":"error","message":"Page failed to make requests"}
{"message":{"id":"arianagrande","index":0,"sleepRemaining":0,"state":"Closing","total":3},"level":"debug"}
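The `content` field of the "Page failed to make requests" entry carries the page HTML, and its `<html>` tag has the class `not-logged-in`. A rough sketch of checking for that class is below. The names `FailureEntry` and `looksLoggedOut` are hypothetical, the sample HTML is shortened from the log above, and the check only shows that the session was anonymous. It cannot tell a login wall apart from a public page viewed while logged out.

```typescript
// Hypothetical shape of the failure entry; field names are taken from the log above.
interface FailureEntry {
    content?: string;
    level?: string;
    message?: string;
}

// Return true when the <html> tag in `content` carries the "not-logged-in" class.
function looksLoggedOut(entry: FailureEntry): boolean {
    if (typeof entry.content !== "string") {
        return false;
    }
    // Capture the class attribute of the opening <html ...> tag.
    const match = entry.content.match(/<html[^>]*\bclass="([^"]*)"/i);
    if (match === null) {
        return false;
    }
    const classes = match[1].split(/\s+/);
    return classes.indexOf("not-logged-in") !== -1;
}

// Shortened version of the entry logged above; the page body is omitted.
const failure: FailureEntry = {
    content:
        '<!DOCTYPE html><html lang="en" class="js not-logged-in client-root js-focus-visible sDN5V">' +
        "<head></head><body></body></html>",
    level: "error",
    message: "Page failed to make requests",
};

console.log(looksLoggedOut(failure));
```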
from instamancer.
That's ok. That message should only be appearing because you're looking for a small number of posts, so the page never gets the chance to initiate a request.
from instamancer.
I tried with 50 and with unlimited, but it didn't work. Unfortunately, the same messages appeared.
from instamancer.
Try adjusting the time spent sleeping between page jumps with -s20
from instamancer.
Adjusting the time didn't work either, but I managed to test the module through a proxy and got createApi("hashtag") to work.
Then I tried the same code with createApi("user") and it didn't work. It looked like the loop wasn't running (a console.log inside it never printed).
Here's what I tried:
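import * as fs from "fs";
import {createApi, IOptions} from "instamancer";

// Scrape up to 10 posts in headless mode through a local SOCKS5 proxy.
const options: IOptions = {
    total: 10,
    headless: true,
    proxyURL: '--proxy-server=socks5://127.0.0.1:9050',
};
const user = createApi("user", "therock", options);
const result = [];
(async () => {
    // Collect every post the generator yields, then write them all to disk.
    for await (const post of user.generator()) {
        result.push(post);
    }
    fs.writeFileSync('test.json', JSON.stringify(result));
})();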
Changing "user" to "hashtag" works.
from instamancer.
Does it work locally? If so, are you using your local connection as the proxy on the VPS?
from instamancer.