NickJS


⛔ This project is deprecated, please consider using Puppeteer instead. ⛔

NickJS predates Puppeteer and is no longer the best tool around. This project isn't maintained anymore.


Web scraping library made by the Phantombuster team. Modern, simple & works on all websites.

NickJS.org | Inline doc ↓

  • Supports both Headless Chrome and PhantomJS as drivers
  • Simple high-level API
  • Async/await, Promises and callback coding styles

NickJS allows you to automate navigation and collect data from any website. By controlling an instance of either Headless Chrome or PhantomJS with CasperJS, your bots will simulate a human.

It's simple and allows for an easy implementation of our 3 scraping steps theory.

Example code

const Nick = require("nickjs")
const nick = new Nick()

;(async () => {

	const tab = await nick.newTab()
	await tab.open("news.ycombinator.com")

	await tab.untilVisible("#hnmain") // Make sure we have loaded the page

	await tab.inject("http://code.jquery.com/jquery-3.2.1.min.js") // We're going to use jQuery to scrape
	const hackerNewsLinks = await tab.evaluate((arg, callback) => {
		// Here we're in the page context. It's like being in your browser's inspector tool
		const data = []
		$(".athing").each((index, element) => {
			data.push({
				title: $(element).find(".storylink").text(),
				url: $(element).find(".storylink").attr("href")
			})
		})
		callback(null, data)
	})

	console.log(JSON.stringify(hackerNewsLinks, null, 2))

})()
.then(() => {
	console.log("Job done!")
	nick.exit()
})
.catch((err) => {
	console.log(`Something went wrong: ${err}`)
	nick.exit(1)
})

Usage

First of all, install NickJS: npm install nickjs.

NickJS will choose which headless browser to use depending on how you launch it. When launching your script with node, Headless Chrome will be used. When launched with casperjs, CasperJS+PhantomJS will be used.

To get started with the PhantomJS driver, read this. However we recommend using Headless Chrome (read on).

You'll need Node 7+ and Chrome 63+ installed on your system (read the next section for more info about which Chrome version you should use). The path to the Chrome executable can be specified with export CHROME_PATH=/path/to/chrome; otherwise, the binary google-chrome-beta will be used.

Launching a bot is then as simple as node my_nickjs_script.js.

Headless Chrome version

NickJS makes use of the latest DevTools protocol methods, so you'll need a very recent version of Chrome.

At the time of writing, NickJS is using some methods from Chrome 63, which is the Beta Channel. Having the correct version of Chrome is critical for a smooth experience with NickJS. Go to the Chrome Release Channels page and download a version compatible with your system. If you want this to be taken care of for you, check out Phantombuster, which is basically our "NickJS as a service" platform.

Environment variables

The following environment variables have an effect on NickJS:

  • CHROME_PATH: specifies where to find the Google Chrome binary — this is important! Example: "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
  • NICKJS_LOAD_IMAGES (0 or 1): controls whether images are loaded (equivalent to NickJS' constructor option loadImages)
  • NICKJS_NO_SANDBOX (0 or 1): disables Chrome's sandboxing (no effect when the CasperJS+PhantomJS driver is used)
  • NICKJS_PROXY or http_proxy: see below

HTTP proxy

NickJS supports HTTP (and HTTPS) proxies. Other protocols are not yet supported. To specify which proxy to use, set the httpProxy option in NickJS' constructor. You can also set the environment variable NICKJS_PROXY or the standard http_proxy (but the constructor option takes precedence).

Your proxy must be specified in the following format: http://username:[email protected]:3128 (the protocol portion is optional).

Contrary to some other libraries, yes, NickJS supports proxy authentication with Headless Chrome.
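
As a rough illustration of the format and precedence described above, the sketch below splits a proxy string into its parts and picks which proxy setting applies. parseProxy and resolveProxy are hypothetical helpers, not part of NickJS, and proxy.example.com is a placeholder host. Checking NICKJS_PROXY before http_proxy is an assumption: only the precedence of the constructor option is documented here.

```javascript
// Hypothetical helpers for illustration only; they are not part of NickJS.

// Splits "http://username:password@host:port" into its parts.
// The protocol portion is optional, so "http://" is assumed when it is missing.
function parseProxy(proxy) {
  const hasProtocol = /^[a-z][a-z0-9+.-]*:\/\//i.test(proxy)
  const url = new URL(hasProtocol ? proxy : `http://${proxy}`)
  return {
    protocol: url.protocol.slice(0, -1),
    host: url.hostname,
    port: url.port === "" ? null : Number(url.port),
    username: url.username === "" ? null : decodeURIComponent(url.username),
    password: url.password === "" ? null : decodeURIComponent(url.password)
  }
}

// Picks the proxy string: the constructor option wins over environment variables.
// Checking NICKJS_PROXY before http_proxy is an assumption.
function resolveProxy(options = {}, env = process.env) {
  return options.httpProxy || env.NICKJS_PROXY || env.http_proxy || null
}

const proxy = resolveProxy({}, { http_proxy: "user:secret@proxy.example.com:3128" })
console.log(parseProxy(proxy))
```

Parsing the string up front makes a malformed proxy fail early (the URL constructor throws) rather than surfacing later as a network error.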

Documentation

Nick

Nick([options])

This is Nick's constructor. options is an optional argument that lets you configure your Nick instance.

Nick must be instantiated only once. Behind the scenes, the headless browser driver is initialized. The next step is to open a tab with newTab().

— [options] (PlainObject)

Optional settings for the Nick instance.

  • printNavigation (Boolean): when true (the default), Nick will log important navigation information like page changes, redirections and form submissions
  • printResourceErrors (Boolean): when true (the default), Nick will log all the errors encountered when loading pages, images and all other resources needed by the pages you visit
  • printPageErrors (Boolean): when true (the default), Nick will log all JavaScript errors and exceptions coming from the scripts executed in the page context
  • resourceTimeout (Number): milliseconds after which Nick will abort loading a resource (page, images and all other resources needed by the pages you visit)
  • userAgent (String): sets the User-Agent header
  • loadImages (Boolean): whether or not to load the images embedded in the pages (defaults to true) (note: specifying this parameter overrides the agent's Phantombuster setting "Load Images")
  • blacklist (Array): soon!
  • whitelist (Array): soon!
  • childStdout (String): when set to "stderr", redirects the child process's stdout to stderr
  • childStderr (String): when set to "stdout", redirects the child process's stderr to stdout
  • additionalChildOptions (Array): When chrome is used this is an Array of string (e.g ["--ignore-certificate-errors", "--ignore-urlfetcher-cert-requests"]), for CasperJs though this is an array of objects (e.g [{verbose: true, logLevel: "debug" }])
Basic (ES6+)
const Nick = require("nickjs")
const nick = new Nick()
All options (ES6+)
const Nick = require("nickjs")

// these are the default options
const nick = new Nick({
  printNavigation: true,
  printResourceErrors: true,
  printPageErrors: true,
  timeout: 10000,
  userAgent: "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"
})

deleteAllCookies([callback])

Deletes all cookies set to the headless browser.

— callback (Function)

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
try {
  await nick.deleteAllCookies()
  // All cookies are cleaned up
} catch (err) {
  console.log("Could not delete all cookies:", err)
}
🚫 Warning

This method deletes every cookie, including any that your bot may need.

deleteCookie(cookieName, cookieDomain[, callback])

Deletes a specific cookie set in the headless browser.

— callback (Function)

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong.
Example
const cookieName = "cookieName"
const cookieDomain = ".domain.com"

try {
  await nick.deleteCookie(cookieName, cookieDomain)
} catch (err) {
  console.log("Could not delete cookie:", err)
}

driver

nick.driver lets you access the underlying headless browser driver instance that is being used by Nick.

This is useful when doing trickier things in your navigation and for accessing driver-specific methods that are not available in Nick.

PhantomJS+CasperJS driver
// In this case we're using the PhantomJS+CasperJS driver
// This gets the CasperJS instance and clears the cache
nick.driver.casper.page.clearMemoryCache()

exit([code])

Immediately stops the whole bot and exits the process with code.

— [code] (Number)

Optional exit code that the process should return. 0 by default.

Example 1
nick.exit() // All is well
Example 2
nick.exit(1) // Something went horribly wrong

getAllCookies([callback])

Gets an object containing all cookies set in the headless browser.

— callback (Function)

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong.
  • cookies (PlainObject): an object containing all cookies of the headless browser and their properties
Example
try {
  const cookies = await nick.getAllCookies()
  // cookies contains all your cookies
  console.log(JSON.stringify(cookies, null, 2))
} catch (err) {
  console.log("Could not get all cookies:", err)
}

newTab()

Opens a new tab.

This is the first step in manipulating a website.

To open multiple tabs, call this method multiple times. If your bot opens many tabs to do different tasks, it's a good idea to close() them when their work is finished (to keep memory usage down).

Example
try {
  const tab = await nick.newTab()
  // You can now browse any website using `tab`
} catch (err) {
  console.log("An error occurred:", err)
}
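
To keep memory usage down when opening many tabs, the open/close pairing can be wrapped in a small helper. This is a minimal sketch: withTab is a hypothetical name, not part of NickJS, and it relies only on newTab() and close() as documented in this README.

```javascript
// Hypothetical helper: opens a tab, runs `work` with it, and always closes
// the tab afterwards, even when `work` throws.
// `nick` is expected to be a Nick instance; only newTab() and close() are used.
async function withTab(nick, work) {
  const tab = await nick.newTab()
  try {
    return await work(tab)
  } finally {
    await tab.close()
  }
}

// Usage inside an async function, assuming `nick` was created with `new Nick()`:
// const url = await withTab(nick, async (tab) => {
//   await tab.open("https://phantombuster.com/")
//   return tab.getUrl()
// })
```

Keep in mind the warning under close(): closing a tab also clears the cookies and cache of the whole Nick instance.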

setCookie(cookie[, callback])

Sets a cookie.

Set the name, the value and the domain of a cookie. This cookie can be seen with getAllCookies() and deleted with deleteAllCookies() or deleteCookie().

— cookie (PlainObject)

An object containing all attributes of a cookie.

  • name (String): Name of the cookie you want to set.
  • value (String): Value of the cookie you want to set.
  • domain (String): Domain linked to the cookie set.

— callback (Function)

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong.
Example
const cookie = {
  name: "cookieName",
  value: "cookieValue",
  domain: ".domain.com"
}

try {
  await nick.setCookie(cookie)
  // You can navigate with your cookie set
} catch (err) {
  console.log("Could not create cookie:", err)
}

Nick-tab

click(selector[, callback])

Performs a click on the element matching the CSS selector selector.

Clicking on elements is one of the main ways to manipulate web pages with Nick. Clicking is an easy way to navigate where you want, but keep in mind that it can be more efficient to scrape URLs (for example with evaluate()) and then call open().

— selector (String)

CSS selector targeting what element to click. Probably a button or an a but can be anything you want. Make sure the target element is visible or present by calling waitUntilVisible() or waitUntilPresent() beforehand.

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a string describing what went wrong with the click (typically the CSS selector did not match any element)
Example
const selector = "button.cool-button"
const pageTimeout = 5000

try {
  await tab.waitUntilPresent(selector, pageTimeout)
  await tab.click(selector)
} catch (err) {
  console.log("An error occurred:", err)
}
// Continue your navigation in this branch
// You should probably do a waitUntilVisible() or waitUntilPresent() here
⚠️ Make sure your target is here

Before calling click() you should make sure the element you are trying to click on is actually visible or present in the page by using waitUntilVisible() or waitUntilPresent().

close([callback])

Closes the tab in current use.

After close() is called, the tab becomes unusable. All subsequent method calls will throw an exception saying that this specific tab instance has stopped. Drop all references to the instance so that it can be garbage-collected. Closing a tab also clears the cookies and cache used by the whole Nick instance.

— callback (Function)

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
try {
  await tab.close()
  // tab can not be used here anymore
  // but you may continue other actions
} catch (err) {
  console.log("Could not close tab:", err)
}
ℹ️ It's like closing a tab in your browser

This method is useful when using multiple Nick instances to simulate browsing on multiple tabs. Calling close() is the equivalent of closing a tab.

ℹ️ Tips

It is also useful when iterating over many URLs: because close() clears the cache and cookies, it frees a lot of memory.

🚫 Warning

Calling close() will clear the cookies and cache of the whole Nick instance.

evaluate(inPageFunction [, argumentObject, callback])

Executes inPageFunction in the current page context.

Nick provides you with two separate JavaScript contexts:

  1. Where the Nick code runs: this is your script environment, with all your locally declared variables and all your calls to Nick methods
  2. Where the page code runs: this is where the page executes jQuery or AngularJS code for example

The evaluate() method allows you to declare a function in your Nick context (1) and executes it in the page context (2). It's like executing code in your browser's inspector tool: you can do anything you want with the page.

In the page context, you have access to all the global variables declared by the page, as well as the DOM (window, document, ...). Any JavaScript libraries included by the page can also be used.

If the page does not include what you want (jQuery or underscore for example), you can inject any JavaScript file with inject() before calling evaluate().

— inPageFunction (Function(argumentObject, callback))

Function to execute in the current page context. argumentObject will be passed as its first argument and a callback as its second argument. argumentObject is an empty PlainObject by default. callback is the function to call when finished.

  • err (String): null if the function succeeds otherwise put a description of what went wrong
  • res (Any): return value of inPageFunction in case of success (this value is serialized to be transferred back to the Nick context, so complex objects like DOM elements, functions or jQuery objects cannot be returned to the Nick context reliably)

— [argumentObject] (PlainObject)

Object that will be passed as an argument of inPageFunction (optional). This object is serialized to be transferred to the page context, so complex objects like functions or JavaScript modules cannot be passed as arguments reliably.

— callback (Function(err, res))

Function called when finished.

  • err (String): null or a string describing what went wrong during the evaluation of inPageFunction
  • res (Any): return value of inPageFunction in case of success (this value is serialized to be transferred back to the Nick context, so complex objects like DOM elements, functions or jQuery objects cannot be returned to the Nick context reliably)
Example
const scraper = (arg, done) => {
  // In this case, the current page uses a typical jQuery declared as $
  done(null, $(arg.link).attr("href"))
}
const arg = { link: "#header > a.title" }

try {
  const res = await tab.evaluate(scraper, arg)
  console.log("Scraped this link:", res)
  // Continue your navigation here
} catch (err) {
  console.log("Something went wrong:", err)
}
🚫 Local variables not accessible

Because inPageFunction is executed in the current page context, your local variables that have been declared before your evaluate() call will not be accessible. You can, however, transfer variables using the argumentObject parameter.

For this reason, Nick methods won't be available inside evaluate.

🚫 Error in callback

When returning data with the callback in the inPageFunction take care to always set the first argument as null if there is no error.
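
To see why the first argument matters, here is a small simulation of a Node-style callback being turned into a promise. callInPageFunction is a hypothetical stand-in written for this illustration; it is not how NickJS is implemented, but it follows the same (err, res) convention.

```javascript
// Hypothetical stand-in for the Nick side of evaluate(): it calls an
// inPageFunction-style function and turns its callback into a promise.
function callInPageFunction(inPageFunction, arg = {}) {
  return new Promise((resolve, reject) => {
    inPageFunction(arg, (err, res) => {
      if (err) {
        reject(err) // anything truthy in first position is treated as a failure
      } else {
        resolve(res)
      }
    })
  })
}

// Correct: null first, data second
const good = (arg, callback) => callback(null, ["a", "b"])
// Mistake: the data lands in the error slot
const bad = (arg, callback) => callback(["a", "b"])

callInPageFunction(good).then((res) => console.log("Resolved with", res))
callInPageFunction(bad).catch((err) => console.log("Rejected with", err))
```

With the second function, the scraped data ends up in the catch branch, which is the kind of silent mix-up the warning above is about.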

⚠️ Serialization subtleties

Keep in mind that to transfer inPageFunction and its return value to and from the page context, serialization has to occur. Everything becomes a string at some point. So you cannot return DOM elements or jQuery objects from the page. Moreover, the underlying PhantomJS browser has a bug where serialization of null gives an empty string "" (even in nested objects and arrays). Beware!
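
A quick way to get a feel for what survives the transfer is to round-trip a value through JSON. This is only an approximation chosen for illustration: the exact serialization used by NickJS and its drivers may differ, and roundTrip is a hypothetical helper.

```javascript
// Approximates the transfer between contexts with a JSON round trip.
// This is an assumption for illustration; NickJS may serialize differently.
function roundTrip(value) {
  return JSON.parse(JSON.stringify(value))
}

const sent = {
  link: "#header > a.title", // plain data survives
  limit: 10,
  when: new Date(0),         // becomes an ISO date string
  format: (s) => s.trim()    // functions are dropped
}

console.log(roundTrip(sent))
// { link: '#header > a.title', limit: 10, when: '1970-01-01T00:00:00.000Z' }
```

Running the same check on your argumentObject before calling evaluate() shows which fields are likely to arrive intact in the page context.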

fill(selector, inputs [, options, callback])

Fills a form with the given values and optionally submits it.

Inputs are referenced by their name attribute.

— selector (String)

CSS selector targeting what form to fill. It should point to a form tag. Make sure the target form is visible or present by calling waitUntilVisible() or waitUntilPresent() beforehand.

— inputs (PlainObject)

An object containing the data you want to enter in the form. Keys must correspond to the inputs' name attribute. This method supports single select fields in the same way as normal input fields. For select fields allowing multiple selections, supply an array of values to match against.

— [options] (PlainObject)

  • submit (Boolean): Whether or not to submit the form after filling it (false by default).

— callback (Function(err))

Function called when finished.

  • err (String): null or a string describing what went wrong when filling the form
Example
const selector = "#contact-form"
const inputs = {
  "subject": "I am watching you",
  "content": "So be careful.",
  "civility": "Mr",
  "name": "Chuck Norris",
  "email": "[email protected]",
  "cc": true,
  "attachment": "roundhousekick.doc" // file taken from your agent's disk
}

try {
  await tab.waitUntilVisible(selector, 5000)
  await tab.fill(selector, inputs, { submit: true })
  console.log("Form sent!")
  // Continue your navigation in this branch
  // You should probably do a waitUntilVisible() or waitUntilPresent() here
} catch (err) {
  console.log("Form not found:", err)
}
Form used in the example (HTML)
<form action="/contact" id="contact-form" enctype="multipart/form-data">
  <input type="text" name="subject"/>
  <textarea name="content"></textarea>
  <input type="radio" name="civility" value="Mr"/> Mr
  <input type="radio" name="civility" value="Mrs"/> Mrs
  <input type="text" name="name"/>
  <input type="email" name="email"/>
  <input type="file" name="attachment"/>
  <input type="checkbox" name="cc"/> Receive a copy
  <input type="submit"/>
</form>

getContent([callback])

Returns the current page content as a string.

— callback (Function(err, content))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
  • content (String): the full HTML content of the current webpage.
Example
try {
  const content = await tab.getContent()
  // content contains the content of the current webpage
} catch (err) {
  console.log("Could not get the content of the page:", err)
}
⚠️ Note

When the current page is a dynamic JavaScript powered HTML page, getContent() will return a snapshot of the current state of the DOM and not the initial source code.

getUrl([callback])

Returns the current page URL as a string.

— callback (Function(err, url))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
  • url (String): the full URL of the current page.
Example
try {
  const url = await tab.getUrl()
  console.log("The url of the page is", url)
  // You can use the variable url and continue your actions
} catch (err) {
  console.log("Could not get the current url:", err)
}
ℹ️ Note

The URL you get will be URL-decoded.

inject(urlOrPath [, callback])

Injects a script in the current DOM page context.

The script can be stored locally on disk or on a remote server.

— urlOrPath (String)

Path to a local or remote script.

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
const urlOrPath = "https://code.jquery.com/jquery-3.2.1.min.js"

try {
  await tab.inject(urlOrPath)
  console.log("jQuery script inserted!")
  // You may now use tab.evaluate() and use jQuery functions
} catch (err) {
  console.log("Could not inject jQuery:", err)
}

isPresent(selectors[, condition, callback])

Checks whether the CSS selectors in selectors are present in the DOM and returns a boolean: true if they are present, false otherwise.

— selectors (Array or String)

What to look for in the DOM. Can be an array of CSS selectors (array of strings) or a single CSS selector (string).

— [condition] (String)

When selectors is an array, this optional argument lets you choose how the CSS selectors are checked. If condition is "and" (the default), the method will check for the presence of all CSS selectors. On the other hand, if condition is "or", the method will check for the presence of any CSS selector.

— callback (Function(err, present))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if the function fails to check
  • present (Boolean): true if the condition succeeds, false otherwise
Example
const selectors = ["div.first", "div.second"]

const present = await tab.isPresent(selectors, "or")
if (present) {
  // Either .first or .second is present at this time
} else {
  console.log("Elements aren't present")
}
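
The "and" / "or" conditions can be modeled without a browser, which may help when reasoning about a check before running it. In this sketch, matches and presentInPage are hypothetical names: the set stands in for the selectors that currently match something in the page.

```javascript
// Models the "and" / "or" conditions with a plain lookup instead of a real page.
// `presentInPage` is a hypothetical set of selectors that currently match.
function matches(selectors, presentInPage, condition = "and") {
  const list = Array.isArray(selectors) ? selectors : [selectors]
  const isPresent = (selector) => presentInPage.has(selector)
  return condition === "or" ? list.some(isPresent) : list.every(isPresent)
}

const presentInPage = new Set(["div.first"])
console.log(matches(["div.first", "div.second"], presentInPage, "and")) // false
console.log(matches(["div.first", "div.second"], presentInPage, "or"))  // true
console.log(matches("div.first", presentInPage))                        // true
```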

isVisible(selectors[, condition, callback])

Checks whether the CSS selectors in selectors are visible in the page and returns a boolean: true if they are visible, false otherwise.

— selectors (Array or String)

What to check for. Can be an array of CSS selectors (array of strings) or a single CSS selector (string).

— [condition] (String)

When selectors is an array, this optional argument lets you choose how the CSS selectors are checked. If condition is "and" (the default), the method will check for the visibility of all CSS selectors. On the other hand, if condition is "or", the method will check for the visibility of any CSS selector.

— callback (Function(err, visible))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if the function fails to check
  • visible (Boolean): true if the condition succeeds, false otherwise
Example
const selectors = ["div.first", "div.second"]

const visible = await tab.isVisible(selectors, "or")
if (visible) {
  // Either .first or .second is visible at this time
} else {
  console.log("Elements aren't visible")
}

onConfirm

Sets a handler for JavaScript confirm dialogs. The function assigned to this property is executed whenever a confirm dialog is opened by window.confirm(). Its only parameter is the message sent by the dialog, and it must return the user's response as a boolean.

— message (String)

A string containing the message from the confirm dialog.

ES6+
tab.onConfirm = (message) => {
  console.log("The confirm message is", message)
  return true
}
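
Because the handler is an ordinary function, it can decide its answer from the message. The policy below (refuse anything that mentions deletion) is a made-up example, and confirmHandler is a hypothetical name.

```javascript
// Hypothetical policy: accept every confirm dialog except those that mention deletion.
const confirmHandler = (message) => {
  const accept = !/delete|remove/i.test(message)
  console.log(`Confirm dialog "${message}" ->`, accept ? "OK" : "Cancel")
  return accept
}

// With a tab obtained from nick.newTab(), the handler would be assigned like this:
// tab.onConfirm = confirmHandler

confirmHandler("Leave this page?")     // returns true
confirmHandler("Delete your account?") // returns false
```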

onPrompt

Sets a handler for JavaScript prompt dialogs. The function assigned to this property is executed whenever a prompt dialog is opened by window.prompt(). Its only parameter is the message sent by the dialog, and it must return the user's response as a string.

— message (String)

A string containing the message from the prompt dialog.

ES6+
tab.onPrompt = (message) => {
  console.log("The prompt message is", message)
  return "Response"
}

open(url [, options, callback])

Opens the webpage at url.

By default, it's a GET but you can forge any type of HTTP request using the options parameter.

Opening a page will time out after 10 seconds. This can be changed with the resourceTimeout Nick option (see Nick's options). Note: this timeout applies to the initial page but not to the resources the page requires thereafter.

— url (String)

URL of the page to open. Should begin with http:// or https:// (or file:// to open a page that was previously downloaded to your agent's disk).

— [options] (PlainObject)

Request configuration (optional).

— callback (Function(err, httpCode, httpStatus, url))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong (typically if there was a network error or timeout)
  • httpCode (Number): the received HTTP code or null if there was a network error
  • httpStatus (String): text equivalent of the received HTTP code or null if there was a network error
  • url (String): the actually opened URL (can be different from the input URL because of 3xx redirects for example) or null if there was a network error
Example
const url = "https://phantombuster.com/"

try {
  const [httpCode, httpStatus] = await tab.open(url)

  if ((httpCode >= 300) || (httpCode < 200)) {
    console.log("The site responded with", httpCode, httpStatus)
  } else {
    console.log("Successfully opened", url, ":", httpCode, httpStatus)
    // Manipulate the page in this branch
    // You should probably do a waitUntilVisible() or waitUntilPresent() here
  }
} catch(err) {
  console.log("Could not open page:", err)
}
JavaScript: sample options
{
  method: "post",
  data:   {
    "some param": "some data",
    "another field":  "this is sent in x-www-form-urlencoded format"
  },
  headers: {
    "Accept-Language": "fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3"
  }
}
🚫 Know your errors

This method will NOT return an error when the received HTTP code isn't 200. An error is returned only when a network error happens. It's your job to check for 404s or 500s with httpCode if needed.
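
If you would rather have non-2xx responses end up in your catch block, the check can be moved into a small helper. assertHttpSuccess is a hypothetical name, not part of NickJS; it takes the [httpCode, httpStatus] pair that open() resolves with in the example above.

```javascript
// Hypothetical helper: turns the result of open() into an explicit failure
// when the HTTP code is outside the 2xx range.
function assertHttpSuccess([httpCode, httpStatus], url) {
  if (httpCode < 200 || httpCode >= 300) {
    throw new Error(`${url} responded with ${httpCode} ${httpStatus}`)
  }
  return httpCode
}

// Usage inside an async function, with a tab obtained from nick.newTab():
// assertHttpSuccess(await tab.open(url), url)

try {
  assertHttpSuccess([404, "Not Found"], "https://example.com/missing")
} catch (err) {
  console.log(err.message) // https://example.com/missing responded with 404 Not Found
}
```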

⚠️ Always wait for DOM elements

Many pages on the web load slowly and unreliably. Many more make numerous asynchronous queries. For these reasons, after opening a page, you should always wait for the DOM elements that interest you with waitUntilVisible() or waitUntilPresent().

screenshot(filename [, callback])

Takes a screenshot of the current page.

— filename (String)

The local path of the screenshot. The format is defined by the file extension. 'image.jpg' will create a JPEG image in the current folder.

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
const path = "./image.jpg"

try {
  await tab.screenshot(path)
  console.log("Screenshot saved at", path)
  // Your screenshot is available at this path
} catch (err) {
  console.log("Could not take a screenshot:", err)
}

scroll(x, y[, callback])

Scrolls to coordinates [x,y] on the page.

— x (Number)

The X-axis coordinate in pixels to scroll to (horizontally).

— y (Number)

The Y-axis coordinate in pixels to scroll to (vertically).

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
const x = 1000
const y = 2000

try {
  await tab.scroll(x, y)
  // Your position will be [1000, 2000] in the page now
} catch (err) {
  console.log("Could not scroll to coordinates:", err)
}
ℹ️ Tips

scroll() can also be called using scrollTo()

scrollToBottom([callback])

Scrolls to the bottom of the page.

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
try {
  await tab.scrollToBottom()
  // You are now at the bottom of the page
} catch (err) {
  console.log("An error occurred during the scroll to bottom:", err)
}

sendKeys(selector, keys[, options, callback])

Writes keys in an <input>, <textarea> or any DOM element with contenteditable="true" in the current page.

— selector (String)

A CSS3 or XPath expression that describes the path to DOM elements.

— keys (String)

Keys to send to the editable DOM element.

— [options] (PlainObject)

The three options available are:

  • reset (Boolean): remove the content of the targeted element before sending key presses.
  • keepFocus (Boolean): keep the focus in the editable DOM element after keys have been sent (useful for input with dropdowns).
  • modifiers (PlainObject): modifier string concatenated with a + (available modifiers are ctrl, alt, shift, meta and keypad).

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
Example
const selector = '#message'
const keys = "Boo!"
const options = {
  reset: true,
  keepFocus: false,
  modifiers: {}
}

try {
  await tab.sendKeys(selector, keys, options)
  console.log("Keys sent!")
  // You may continue your actions here
} catch (err) {
  console.log("Could not send keys:", err)
}

wait(duration[, callback])

Waits for duration milliseconds.

— duration (Number)

The number of milliseconds to wait for.

— callback (Function(err))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if something went wrong
🚫 Warning

This function has nothing to do with the tab you are using; it is pure syntactic sugar to replace Promise.delay() (from Bluebird). It is like waiting in front of your computer after opening a web page.

Example
try {
  await tab.doSomething()
  await tab.wait(10000)
  // After waiting 10 seconds the script continues
  await tab.doSomething()
} catch (err) {
  console.log("An error occurred during the execution:", err)
}
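
For comparison, a promise-based delay of the kind this method stands in for can be written in a few lines with setTimeout. This is an illustration of the idea, not NickJS's implementation, and delay is a hypothetical name.

```javascript
// A promise-based delay built on setTimeout, similar in spirit to tab.wait().
// This is an illustration, not NickJS's implementation.
const delay = (duration) => new Promise((resolve) => setTimeout(resolve, duration))

;(async () => {
  const start = Date.now()
  await delay(100)
  console.log(`Waited about ${Date.now() - start} ms`)
})()
```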

waitUntilPresent(selectors [, timeout, condition, callback])

Waits for the CSS selectors in selectors to be present in the DOM. Aborts with an error if the elements have not become present in the DOM after timeout milliseconds.

selectors can be an array of CSS selectors (array of strings) or a single CSS selector (string).

By default, condition is "and" (wait for all CSS selectors) but it can be changed to "or" (wait for any CSS selector).

— selectors (Array or String)

What to wait for. Can be an array of CSS selectors (array of strings) or a single CSS selector (string).

— timeout (Number)

Maximum number of milliseconds to wait (optional, 5000 by default). callback will be called with an error if the elements have not become present after timeout milliseconds.

— [condition] (String)

When selectors is an array, this optional argument lets you choose how to wait for the CSS selectors. If condition is "and" (the default), the method will wait for all CSS selectors. On the other hand, if condition is "or", the method will wait for any CSS selector.

— callback (Function(err, selector))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if the CSS selectors were not present after timeout milliseconds
  • selector (String):
    • In case of success (err is null):
      • If condition was "and" then selector is null because all CSS selectors are present
      • If condition was "or" then selector is one of the present CSS selectors of the given array
    • In case of failure (err is not null):
      • If condition was "and" then selector is one of the non-present CSS selectors of the given array
      • If condition was "or" then selector is null because none of the CSS selectors are present
Example
const selectors = "#header > h1.big-title"
const pageTimeout = 5000

try {
  await tab.waitUntilPresent(selectors, pageTimeout)
  // The element is present in the DOM
} catch(err) {
  console.log("Oh no! Even after 5s, the element was still not present. ", err)
}
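
The four (err, selector) outcomes listed above can be easier to follow as code. The sketch below models them without a real page: waitOutcome and presentInPage are hypothetical names, the set stands in for the selectors that matched before the timeout, and the "timeout" string is a placeholder for whatever error description Nick actually produces.

```javascript
// Models the (err, selector) outcome described above, without a real page.
// `presentInPage` is a hypothetical set of selectors that matched before the timeout.
function waitOutcome(selectors, presentInPage, condition = "and") {
  const list = Array.isArray(selectors) ? selectors : [selectors]
  const present = list.filter((s) => presentInPage.has(s))
  const missing = list.filter((s) => !presentInPage.has(s))

  if (condition === "or") {
    return present.length > 0
      ? { err: null, selector: present[0] }    // success: one of the present selectors
      : { err: "timeout", selector: null }     // failure: none are present
  }
  return missing.length === 0
    ? { err: null, selector: null }            // success: all are present
    : { err: "timeout", selector: missing[0] } // failure: one of the missing selectors
}

const found = new Set(["#header"])
console.log(waitOutcome(["#header", "#footer"], found, "and")) // { err: 'timeout', selector: '#footer' }
console.log(waitOutcome(["#header", "#footer"], found, "or"))  // { err: null, selector: '#header' }
```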

waitUntilVisible(selectors [, timeout, condition, callback])

Waits for the CSS selectors in selectors to be visible. Aborts with an error if the elements have not become visible after timeout milliseconds.

selectors can be an array of CSS selectors (array of strings) or a single CSS selector (string).

By default, condition is "and" (wait for all CSS selectors) but it can be changed to "or" (wait for any CSS selector).

— selectors (Array or String)

What to wait for. Can be an array of CSS selectors (array of strings) or a single CSS selector (string).

— timeout (Number)

Maximum number of milliseconds to wait (optional, 5000 by default). callback will be called with an error if the elements have not become visible after timeout milliseconds.

— [condition] (String)

When selectors is an array, this optional argument lets you choose how to wait for the CSS selectors. If condition is "and" (the default), the method will wait for all CSS selectors. On the other hand, if condition is "or", the method will wait for any CSS selector.

— callback (Function(err, selector))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if the CSS selectors were not visible after timeout milliseconds
  • selector (String):
    • In case of success (err is null):
      • If condition was "and" then selector is null because all CSS selectors are visible
      • If condition was "or" then selector is one of the visible CSS selectors of the given array
    • In case of failure (err is not null):
      • If condition was "and" then selector is one of the non-visible CSS selectors of the given array
      • If condition was "or" then selector is null because none of the CSS selectors are visible
Example
const selectors = "#header > h1.big-title"
const pageTimeout = 5000

try {
  await tab.waitUntilVisible(selectors, pageTimeout)
  // Manipulate the element here
  // for example with a click() or evaluate()
} catch(err) {
  console.log("Oh no! Even after 5s, the element was still not visible:", err)
}
Example
const selectors = ["#header > h1", "img.product-image"]
const pageTimeout = 6000

try {
  await tab.waitUntilVisible(selectors, pageTimeout)
  // Manipulate the element here
  // for example with a click() or evaluate()
} catch(err) {
  console.log("Oh no! Even after 6s, at least one of the elements was still not visible:", err)
}
Example
var selectors = ["section.footer", "section.header"]
var pageTimeout = 7000

try {
  // Pass "or" so the call resolves as soon as any one of the selectors is visible
  const selector = await tab.waitUntilVisible(selectors, pageTimeout, "or")
  console.log("This element is visible: " + selector)
  // Manipulate the element here
  // For example with a click() or evaluate()
} catch(err) {
  console.log("Oh no! Even after 7s, all the elements were still not visible. " + err)
  // in this case, the callback does not return which element is not visible
  // because ALL the elements are not visible
}
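
The (err, selector) rules for "and" and "or" can be hard to keep in mind, so here is a minimal sketch that models them. It is not NickJS's implementation: describeOutcome and the visible set are hypothetical names used only for illustration, and the error strings are made up.

```javascript
// Hypothetical helper: models the documented (err, selector) outcome.
// `visible` is a Set holding the selectors that are visible when the wait ends.
function describeOutcome(selectors, visible, condition = "and") {
  const list = Array.isArray(selectors) ? selectors : [selectors]
  const found = list.filter((s) => visible.has(s))
  const missing = list.filter((s) => !visible.has(s))

  if (condition === "or") {
    // Success: report one visible selector. Failure: nothing to report.
    return found.length > 0
      ? { err: null, selector: found[0] }
      : { err: "timeout: none of the selectors are visible", selector: null }
  }

  // "and" condition. Success: selector is null. Failure: report one missing selector.
  return missing.length === 0
    ? { err: null, selector: null }
    : { err: "timeout: some selectors are not visible", selector: missing[0] }
}

const visible = new Set(["section.header"])
console.log(describeOutcome(["section.footer", "section.header"], visible, "or"))
console.log(describeOutcome(["section.footer", "section.header"], visible, "and"))
```

The same table applies to the other wait methods once "visible" is swapped for the state they wait for.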

waitWhilePresent(selectors [, timeout, condition, callback])

Waits for the CSS selectors listed in selectors to become non-present in the DOM. Aborts with an error if the elements are still present in the DOM after timeout milliseconds.

selectors can be an array of CSS selectors (array of strings) or a single CSS selector (string).

By default, condition is "and" (wait for all CSS selectors) but it can be changed to "or" (wait for any CSS selector).

— selectors (Array or String)

What to wait for. Can be an array of CSS selectors (array of strings) or a single CSS selector (string).

— timeout (Number)

Maximum number of milliseconds to wait (optional, 5000 by default). callback will be called with an error if the elements are still present after timeout milliseconds.

— [condition] (String)

When selectors is an array, this optional argument lets you choose how to wait for the CSS selectors. If condition is "and" (the default), the method will wait for all CSS selectors. On the other hand, if condition is "or", the method will wait for any CSS selector.

— callback (Function(err, selector))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if the CSS selectors were still present after timeout milliseconds
  • selector (String):
    • In case of success (err is null):
      • If condition was "and" then selector is null because none of the CSS selectors are present
      • If condition was "or" then selector is one of the non-present CSS selectors of the given array
    • In case of failure (err is not null):
      • If condition was "and" then selector is one of the still present CSS selectors of the given array
      • If condition was "or" then selector is null because all of the CSS selectors are still present
Example
const selectors = "#header > h1.big-title"
const pageTimeout = 5000

try {
  await tab.waitWhilePresent(selectors, pageTimeout)
  // The selector has successfully become non-present
} catch(err) {
  console.log("Oh no! Even after 5s, the element was still present:", err)
}

waitWhileVisible(selectors [, timeout, condition, callback])

Waits for the CSS selectors listed in selectors to become non-visible. Aborts with an error if the elements are still visible after timeout milliseconds.

selectors can be an array of CSS selectors (array of strings) or a single CSS selector (string).

By default, condition is "and" (wait for all CSS selectors) but it can be changed to "or" (wait for any CSS selector).

— selectors (Array or String)

What to wait for. Can be an array of CSS selectors (array of strings) or a single CSS selector (string).

— timeout (Number)

Maximum number of milliseconds to wait (optional, 5000 by default). callback will be called with an error if the elements are still visible after timeout milliseconds.

— [condition] (String)

When selectors is an array, this optional argument lets you choose how to wait for the CSS selectors. If condition is "and" (the default), the method will wait for all CSS selectors. On the other hand, if condition is "or", the method will wait for any CSS selector.

— callback (Function(err, selector))

Function called when finished (optional).

  • err (String): null or a description of what went wrong if the CSS selectors were still visible after timeout milliseconds
  • selector (String):
    • In case of success (err is null):
      • If condition was "and" then selector is null because none of the CSS selectors are visible
      • If condition was "or" then selector is one of the non-visible CSS selectors of the given array
    • In case of failure (err is not null):
      • If condition was "and" then selector is one of the still visible CSS selectors of the given array
      • If condition was "or" then selector is null because all of the CSS selectors are still visible
Example
const selectors = "#header > h1.big-title"
const pageTimeout = 5000

try {
  await tab.waitWhileVisible(selectors, pageTimeout)
  // The selector has successfully become non-visible
} catch(err) {
  console.log("Oh no! Even after 5s, the element was still visible:", err)
}
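
The "wait, then abort after timeout milliseconds" behaviour shared by the wait methods can be pictured as a polling loop. The sketch below is an assumption about the general technique, not NickJS's actual code: waitWhile, check and interval are hypothetical names, and check stands in for "is the element still visible?".

```javascript
// Hypothetical polling helper: resolves once check() returns false,
// rejects if check() is still true after `timeout` milliseconds.
function waitWhile(check, timeout = 5000, interval = 50) {
  return new Promise((resolve, reject) => {
    const start = Date.now()
    const poll = () => {
      if (!check()) return resolve()
      if (Date.now() - start >= timeout) {
        return reject(new Error(`timeout: condition still true after ${timeout}ms`))
      }
      setTimeout(poll, interval) // try again a little later
    }
    poll()
  })
}

// Usage: the fake "element" disappears after 200ms
const hiddenAt = Date.now() + 200
waitWhile(() => Date.now() < hiddenAt, 1000)
  .then(() => console.log("The element is gone"))
  .catch((err) => console.log("Still there: " + err.message))
```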

nickjs's People

Contributors

bogdanrada, crra, dependabot[bot], gperrin01, guillaumeboiret, kpennell, mrzor, paps


nickjs's Issues

I am getting error when it is run

I want to use casper + phantom, not headless chrome.

I have the following source code.

require('babel-polyfill')
const Nick = require('nickjs')
const Promise = require('bluebird')

const nick = new Nick()

nick.newTab().then(async function(tab) {
  await tab.open("news.ycombinator.com")
  await tab.untilVisible("#hnmain") // Make sure we have loaded the page
  await tab.inject("/home/vagrant/test/nickjs_casper/src/js/jquery-3.3.1.slim.min.js") // We're going to use jQuery to scrape
  const hackerNewsLinks = await tab.evaluate((arg, callback) => {
    // Here we're in the page context. It's like being in your browser's inspector tool
    const data = []
    $(".athing").each((index, element) => {
      data.push({
        title: $(element).find(".storylink").text(),
        url: $(element).find(".storylink").attr("href")
      })
    })
    callback(null, data)
  })
  console.log(JSON.stringify(hackerNewsLinks, null, 2))
})
.then(() => nick.exit())
.catch((err) => {
    console.log('Oops, an error occurred: ' + err)
    nick.exit(1)
})

But, I am getting the following error.
Oops, an error occurred: Error: in evaluated code (initial call): EvalError: Refused to evaluate a string as JavaScript because 'unsafe-eval' is not an allowed source of script in the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://cdnjs.cloudflare.com/".

Do you know why I am getting this error?

How to use Node API (Or How to export results as file)

Hi, it's me again.
It sounds dumb (and not really related to Nick), but how can I use a Node API like fs (for writing my scraping results to a JSON file)?
Here is my code:

require('babel-polyfill')
const Nick = require('nickjs')
const nick = new Nick()
var fs = require('fs');
nick.newTab().then(async function (tab) {
  await tab.open('http://example.com')
  await tab.waitUntilVisible('.products-grid')
  await tab.inject('./jquery.js')
  const urls = await tab.evaluate((arg, callback) => {
    const urls = []
    $("li.item").each((index, element) => {
      urls.push($(element).find("div.product-content-wrapper").find('h3.product-name').find('a').attr('href'))
    })
    callback(null, urls)
  })
  var data = []
  for (var i = 0; i < urls.length; i++) {
    await tab.open(urls[i])
    await tab.waitUntilVisible('.catalog-product-view')
    await tab.inject('./jquery.js')
    const type = await tab.evaluate((arg, callback) => {
      var type = $('div.product-name').find('h1').text()
      callback(null, type)
    })
    const img = await tab.evaluate((arg, callback) => {
      var img = $('div#map_container').find('img').attr('src')
      callback(null, img)
    })
    data.push({ type: type, img: img })    
    await tab.screenshot('./uploads/' + i + '.png')
  }
  fs.writeFile('./myjsonfile.json', data, 'utf8', ()=>{
    nick.exit()
  });  
}).catch((err) => {
  console.log('Oops, an error occurred: ' + err)
  nick.exit(1)
})

At the end I want to write a JSON file with my results (the data variable is an object).
But during execution I get an error:

undefined is not a function (evaluating 'fs.writeFile('./myjson.json', data, function () {

})')

In my defense, I'm really new to PhantomJS/CasperJS and Babel.
I know my script is executed in a CasperJS context, but I can't figure out why I can't use fs (even with the require('babel-polyfill')).

Can anyone help me learn and become a better dev?
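
A note on the error above: it likely comes from the script running under CasperJS/PhantomJS, where require('fs') does not return Node's fs module, so writeFile is undefined. The sketch below assumes the script is launched with node instead (the Headless Chrome driver). saveResults and the sample data are hypothetical. Note that the results are serialized with JSON.stringify first, since the fs write functions expect a string or Buffer rather than an array of objects.

```javascript
const fs = require("fs")
const os = require("os")
const path = require("path")

// Hypothetical helper: serialize scraped results and write them to disk
function saveResults(filePath, data) {
  // Node's fs expects a string or Buffer, so the array is serialized first
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2), "utf8")
  return filePath
}

// Made-up sample with the same shape as the `data` array in the question
const sample = [{ type: "Example product", img: "http://example.com/a.png" }]
const out = saveResults(path.join(os.tmpdir(), "myjsonfile.json"), sample)
console.log("Wrote " + out)
```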

I cannot login with tab.fill

Hi,
thank you for your time in advance.

I am having login issue with nickjs.

I took the cookies from my browser and I set it up before opening the URL.

This is my code. As you can see from the screenshot, it fills in the email and the password (the "name" attributes match the keys of the inputs object), but it stops there (you can see my shell console output below).

I tried to navigate to a URL which requires login, but it gives me the login form again, which means the login process wasn't successful.

Would you help me to solve this issue? I would appreciate your help.

Regards,
Manol.

const selector = "form[name='signIn']";
const inputs = {
  'email': "real email",
  'password': "real password"
};

try {
  await tab.waitUntilVisible(selector, 5000)
  await tab.fill(selector, inputs, { submit: true })
  //await tab.click("input#signInSubmit")
  console.log("Form sent!")
  await tab.screenshot('./login.png')

} catch (err) {
  console.log("Form not found:", err)
}

[Screenshot: login]

root@06131ed62a3b:/app# node index.js

Fatal: Chrome subprocess exited with code 1

Tab 1: Navigation (open): https://www.amazon.com/ap/signin?_encoding=UTF8&accountStatusPolicy=P1&openid.assoc_handle=usflex&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.mode=checkid_setup&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.amazon.com%2Fgp%2Fcss%2Forder-history%3Fie%3DUTF8%26ref_%3Dnav_youraccount_orders&pageId=webcs-yourorder&showRmrMe=1
Tab 1: Navigation (formSubmissionPost): https://www.amazon.com/ap/signin
Form sent!
Job done!

help, how to ignore javascript error

> Tab 1: Page JavaScript error: ReferenceError: c is not defined
    at sendLog (https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/plugins/every_cookie_mac_82990d4.js:4:177)
    at https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/plugins/every_cookie_mac_82990d4.js:5:127
> Tab 1: Navigation (open): https://www.baidu.com/
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=he&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=2&pwd=h&cb=jQuery1102022635599116432248_1531837205312&_=1531837205315:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=hea&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=3&pwd=he&cb=jQuery1102022635599116432248_1531837205312&_=1531837205316:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=h&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=1&cb=jQuery1102022635599116432248_1531837205312&_=1531837205314:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headle&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=6&pwd=headl&cb=jQuery1102022635599116432248_1531837205312&_=1531837205319:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=9&pwd=headless&cb=jQuery1102022635599116432248_1531837205312&_=1531837205322:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headl&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=5&pwd=head&cb=jQuery1102022635599116432248_1531837205312&_=1531837205318:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=8&pwd=headles&cb=jQuery1102022635599116432248_1531837205312&_=1531837205321:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=head&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=4&pwd=hea&cb=jQuery1102022635599116432248_1531837205312&_=1531837205317:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headles&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=7&pwd=headle&cb=jQuery1102022635599116432248_1531837205312&_=1531837205320:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20c&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=10&pwd=headless%20&cb=jQuery1102022635599116432248_1531837205312&_=1531837205323:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20ch&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=11&pwd=headless%20c&cb=jQuery1102022635599116432248_1531837205312&_=1531837205324:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20chro&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=13&pwd=headless%20chr&cb=jQuery1102022635599116432248_1531837205312&_=1531837205326:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20chr&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=12&pwd=headless%20ch&cb=jQuery1102022635599116432248_1531837205312&_=1531837205325:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20chrom&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=14&pwd=headless%20chro&cb=jQuery1102022635599116432248_1531837205312&_=1531837205327:1:1
> Tab 1: Page JavaScript error: TypeError: jQuery1102022635599116432248_1531837205312 is not a function
    at https://sp0.baidu.com/5a1Fazu8AA54nxGko9WTAnF6hhy/su?wd=headless%20chrome&json=1&p=3&sid=1423_21122_26350_26810&req=2&csor=15&pwd=headless%20chrom&cb=jQuery1102022635599116432248_1531837205312&_=1531837205328:1:1

tab.open() hangs indefinitely

My NickJS script was working fine a few weeks ago. I don't think I changed anything, but it stopped working. I tried updating from NickJS 2.0 to 2.3 and Chrome dev to the latest version with no improvement.

Apparently it is hanging at the call to tab.open(url). I also tried running the sample code at https://nickjs.org/, and it hangs at the same method. How do I figure out why NickJS is hanging there? (The PhantomBuster demo still works, so I think it must be something with my environment)

Environment:

  • Windows 10 64-bit
  • Node: v8.4.0
  • NickJS 2.0, 2.3
  • Chrome: 63.0.3236.0 (developer stream)
  • Executing from Git bash 2.12.0.windows.1

Simple code to reproduce (CoffeeScript):

Note the callback is never called.

Nick = require 'nickjs'
nick = new Nick()

callback = (err, httpCode, httpStatus, url) =>
    console.log 'Callback:'
    console.log err

( () =>
    tab = await nick.newTab()
    console.log 'tab.open()'
    tab.open 'https://mixergy.com/interviews/', callback
)()

Simple code to reproduce (JS):

Note the callback is never called.

var Nick, callback, nick;

Nick = require('nickjs');

nick = new Nick();

callback = (err, httpCode, httpStatus, url) => {
  console.log('Callback:');
  return console.log(err);
};

(async() => {
  var tab;
  tab = (await nick.newTab());
  console.log('tab.open()');
  return tab.open('https://mixergy.com/interviews/', callback);
})();
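
One generic way to narrow down a hang like this is to race the suspect call against a timer, so the script reports which step never settled instead of waiting forever. This is a plain Promise sketch and not a NickJS feature: withTimeout is a hypothetical helper, and neverSettles stands in for the tab.open() call that never calls back.

```javascript
// Hypothetical helper: reject if `promise` has not settled after `ms` milliseconds
function withTimeout(promise, ms, label) {
  let timer
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} did not settle after ${ms}ms`)), ms)
  })
  // Clear the timer either way so it does not keep the process alive
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}

// Stand-in for a call that never resolves or rejects
const neverSettles = new Promise(() => {})

withTimeout(neverSettles, 100, "tab.open()")
  .catch((err) => console.log("Diagnostic: " + err.message))
```

In a real script the first argument would be the promise returned by the step under suspicion, and the label tells you which step to investigate.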

Unable to Add Header on await tab.open().

I'm using NickJs browser emulation and am attempting to add a header when opening a page. I'm monitoring the request in Fiddler and see that the header is not being added.

How do I add the header?

Here's my current code:

await tab.open("https://www.google.com/", {headers: {"Accept-Language": "en-US,en;q=0.9"}})

Dockerise NickJs (Latest version)

Hi.
Do you have a Docker image for NickJS (latest version, with Chrome)?
I tried to dockerize my app with a very simple Dockerfile based on the latest Node image:

FROM node:latest
ADD ./app /app
WORKDIR /app
ENV CHROME_PATH=/usr/bin/google-chrome
RUN apt-get update && install google-chrome-stable
RUN npm install
CMD npm start

My app is your ycombinator example and nothing more.
The google-chrome-stable version is 62 (in accordance with your documentation).

The image build is OK, Chrome's installation is OK, and the path is OK (checked with which google-chrome and echo "$CHROME_PATH").
But when I run a new container, or when I run node index.js to launch my Node script inside a new container, I get this error:

Fatal: Chrome subprocess exited with code 1

And I can't figure out what my problem is.

Is it possible to change userAgent?

Hello there,

I would like to change the userAgent after initialization of NickJS.

This is what I found:

tab = await nick.newTab();
tab.driver.client.Network.setUserAgentOverride({"userAgent":userAgent});

Is this the correct way to do it?

regards,

Any advice for deploying on Heroku?

I've run heroku buildpacks:add https://github.com/heroku/heroku-buildpack-google-chrome and have been rooting around on the forums, but I'm not having much luck.

Seems like I'm heading in the right direction:

[Screenshot: screen shot 2017-11-24 at 5 31 32 pm]

Use Proxy

Hi !

I've started using your lib, and I'm wondering if and how I could use a proxy to connect to the web?

thanks for your work

Hugo

nick.exit() is killing the whole node.js

Hi guys,

You made a great tool, but I'm running in an issue.

When I try the sample code in Node.js, the whole Node process is killed as soon as the code reaches nick.exit(0). I was running some tests with Postman, and the response from the server is always "Could not get any response", but in the Node logs I have:
"server started on: 3000"
"Tab 1: Navigation (open): http://:www.google.com"
"Process finished with exit code 0"
After that, the Node server stops responding and is killed.

I also tried on a remote server: the Node server restarts, but the process is killed every time.

For now I found a workaround, tab.close(), but it looks like Chrome stays in the background, and every time a new Nick is created a new Chrome is launched.

Can you give me some advice about this, or is it a bug?

Error: Could not start chrome: Error: spawn google-chrome-unstable ENOENT

With Node 6.11.3 LTS and Node 8.5.0
Chrome Version 61.0.3163.91 (Official Build) (64-bit)

Using sample on https://nickjs.org/
It always says: "Error: Could not start chrome: Error: spawn google-chrome-unstable ENOENT"

In the same environment, Puppeteer runs normally:

const puppeteer = require('puppeteer');
(async() => {
    const browser = await puppeteer.launch();
    console.log(await browser.version());
    browser.close();
})();

Result is HeadlessChrome/62.0.3198.0

Error: timeout: load event did not fire

I get this:
Error: timeout: load event did not fire after xxx ms
when I try to open websites such as azlyrics.com or genius.com, even when I extend the timeout option to 60000ms.
I can quickly connect to https://news.ycombinator.com/ and other websites, though. Are some websites preventing headless browser connections? If so, is there any workaround?

Can't download from sites

What I've tried

  • Reinstalling Chrome
  • Modifying my installation of Nickjs to try enabling downloads
  • Reinstalling a clean Nickjs installation

Expected behavior

Downloadable files will be downloaded when navigated to

Actual behavior

Downloadable files are not downloaded

I am getting error from Google Chrome

Google Chrome: 66.0.3359.117
Nickjs: 0.3.6

Error:

2018-05-15 11:17 : CHROME STDERR: [0515/111735.657188:ERROR:gles2_cmd_decoder.cc(3350)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
2018-05-15 11:17 : CHROME STDERR: [0515/111735.660133:ERROR:gles2_cmd_decoder.cc(3350)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
2018-05-15 11:17 : CHROME STDERR: [0515/111745.945705:ERROR:gles2_cmd_decoder.cc(3350)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
2018-05-15 11:17 : CHROME STDERR: [0515/111745.948669:ERROR:gles2_cmd_decoder.cc(3350)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
2018-05-15 11:17 : CHROME STDERR: [0515/111748.365746:WARNING:spdy_session.cc(2991)] Received HEADERS for invalid stream 767

Do you know how to solve this issue?

NickJS does not work with recent versions of Chrome (v64+): Network.setRequestInterceptionEnabled wasn't found

This error occurs after upgrading to latest version of Chrome:

error when initializing new chrome tab: Error: 'Network.setRequestInterceptionEnabled' wasn't found

Confirmed with these release channels of Chrome:

  • Release (Version 64.0.3282.119 (Official Build) (64-bit))
  • Beta (Version 64.0.3282.119 (Official Build) (64-bit))
  • Dev (Version 65.0.3325.18 (Official Build) dev (32-bit))

Environment:

  • Windows 10
  • NickJS 0.2.6
  • Node v8.9.1

The code:

Nick = require 'nickjs'
nick = new Nick()

nick.newTab().then (tab) =>
    await console.log "Testing..."
.then () =>
    console.log 'Completed with no errors!'
    nick.exit 0
.catch (err) =>
    console.log 'Something went wrong:'
    console.log err
    nick.exit 1

About sendKeys()

Hello there,

Is the reset option working?

I found that it is not working, so the previous value is left in the input box.

Here is my code.

await tab.sendKeys('#username', id, {reset:true,keepForcus:false});

Can you confirm that?

OS: Ubuntu 16.04
NickJS: The latest one
Google Chrome: 67.0.3396.62

Best regards,

Found a bug when I crawl on google map

Hello there,

This is possibly a bug in headless Chrome, but I want to confirm whether it is a bug.

When I crawl Google Maps as follows, I get fewer results than when I use the regular Chrome browser.

  1. Open this URL.
    https://www.google.co.jp/maps/@35.6603142,139.7055783,21z

  2. Send keys such as shibuya web consultant

Regular Google Chrome shows more than 20 search results.

On the other hand, headless Chrome shows around 8 search results.

Do you get the same result?

My environment is as follows.

ubuntu 18.04
Google Chrome 67.0.3396.99
NickJs the latest one
Regards,

`printResourceErrors` or `printPageErrors` not honored

I'm not sure if one or both of these options is being ignored:

$ node build/test.js
tab.open()
> Tab 1: Aborted (blacklisted by /.*\.js.*/): https://mixergy.com/wp-content/plugins/swiftype-search/assets/install_swiftype.min.js?ver=4.8.2
> Tab 1: Aborted (blacklisted by /.*\.js.*/): https://mixergy.com/wp-includes/js/jquery/jquery.js?ver=1.12.4
> Tab 1: Aborted (blacklisted by /.*\.js.*/): https://mixergy.com/wp-includes/js/jquery/jquery-migrate.min.js?ver=1.4.1
...

test.js:

Nick = require('nickjs');

nick = new Nick(options = {
  printNavigation: false,
  printResourceErrors: false,
  printPageErrors: false,
  blacklist: [/.*\.js.*/]
});

(async() => {
  var tab;
  tab = (await nick.newTab());
  console.log('tab.open()');
  return tab.open('https://mixergy.com/interviews/');
})();

I am getting following errors from Google Chrome

Nickjs: 0.3.6
Chrome: 67.0.3396.62
Ubuntu: 16.04.4

I am trying to crawl Google Maps and got the following error.

2018-06-01 10:34 : CHROME STDERR: [0601/103455.634767:ERROR:gpu_process_transport_factory.cc(1017)] Lost UI shared context.
2018-06-01 10:34 : CHROME STDERR:

Best regards,

Electron Possibilities

Can Nick run inside an existing running Electron instance and use the API to control it?
If so, what would the configuration look like?

nick.exit() terminate caller process

In the following code, the last console.log should be printed, but the app is forced to exit as soon as nick.exit() is called. If I remove the nick.exit() line, console.log prints the result, but the app does not end as it should (I have to press Ctrl-C to terminate it).

const nick = new Nick()
let tab = await nick.newTab()
await tab.open(url)
await tab.untilVisible('.tbl')
let content = await tab.getContent()

nick.exit()

console.log('This line should be printed with content', content)
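
Since exit() stops the whole process, anything that needs the scraped content has to run before it is called. Here is a minimal sketch of that ordering. It uses a hypothetical fakeNick object in place of a real Nick instance so that it runs without a browser, and runJob is an illustrative name rather than part of NickJS.

```javascript
// Hypothetical stand-in for a Nick instance: records the exit code instead of exiting
const fakeNick = {
  exitCode: null,
  exit(code = 0) { this.exitCode = code },
}

// Run the scraping job, consume its result, and only then call exit()
async function runJob(nick, job, onResult) {
  try {
    const result = await job()
    onResult(result) // use the content while the process is still alive
    nick.exit(0)
  } catch (err) {
    console.log("Something went wrong: " + err)
    nick.exit(1)
  }
}

const seen = []
const done = runJob(
  fakeNick,
  async () => "<html>...</html>", // stands in for tab.getContent()
  (content) => seen.push(content)
)
```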

Error: "Chrome subprocess killed by signal SIGTERM" in NodeJS while NickJs is executing

Hello guys!

You've been doing a great job with NickJS library!

But working with NickJS on Node.js, I got the following server error: "Chrome subprocess killed by signal SIGTERM". In fact, Chrome is killed by SIGTERM while NickJS tabs are evaluating.

It seems the problem occurs when the code opens and closes tabs for a long time, in particular when pages have a lot of content and scripts.
The test script is written to reflect the problem in the real project, where pages also run simultaneously in different tabs for a long time. During the run, CPU and memory consumption in Task Manager look fine, so the killing of the Chrome process is very unexpected.

My environment: NodeJS v8.8.1, NickJs + Desktop Chrome 63.0.3239.108 (Official Build) (64-bit).
Additional packages: Bluebird - for advanced Promise usage.
OS: Windows 8 (But the same error occurs on Ubuntu 16.04)

Here is my test file 'pureNickTest.js':

const Promise = require('bluebird');
const NickJs = require('nickjs'); // missing from the original snippet
const nick = new NickJs();
const testUrl = "http://edition.cnn.com/2017/12/15/politics/next-alabama-states/index.html";

const runTab = async (url) => {
    const tab = await nick.newTab();
    await tab.open(url);
    await tab.untilVisible("body");
    await tab.inject(`${__dirname}/libs/jquery-3.2.1.min.js`);
    await tab.wait(3000);
    await tab.scrollToBottom();

    const results = await tab.evaluate((arg, callback)=>{
        const blocks = $("body").find('div');
        callback(null, blocks.length);
    });

    await tab.close();
};

const runTestTabs = () => {
    const fakeArray = new Array(100);

    return Promise.each(fakeArray, () => {
        return runTab(testUrl);
    });
}

(async ()=> {
    await runTestTabs();
    console.log("Test complete!");
    nick.exit();
})();

Equivalent code for Puppeteer works fine and doesn't crash with this error.

I would appreciate any help on this issue!

Support for SOCKS5 proxy

I am trying to connect through a SOCKS5 proxy, but "http://" is getting prepended to the proxy URL. The SOCKS5 proxy URL takes this form: "socks5://username:password@domain:1080".
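
To illustrate the behaviour being asked for, here is a small sketch that adds "http://" only when the proxy string carries no scheme of its own. normalizeProxy is a hypothetical helper, not an existing NickJS function. The standard URL class is used only to show that the socks5 URL from the report parses cleanly once it is left untouched.

```javascript
// Hypothetical helper: only add "http://" when the proxy string has no scheme
function normalizeProxy(proxy) {
  return /^[a-z][a-z0-9+.-]*:\/\//i.test(proxy) ? proxy : "http://" + proxy
}

const parsed = new URL(normalizeProxy("socks5://username:password@domain:1080"))
console.log(parsed.protocol, parsed.username, parsed.hostname, parsed.port)

// A bare host:port still gets the default scheme
console.log(normalizeProxy("domain:8080"))
```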

Heroku: Error: connect ECONNREFUSED 127.0.0.1:9222

could not connect to chrome debugger after 24 tries (10s): Error: connect ECONNREFUSED 127.0.0.1:9222

This just popped up for some reason on Heroku. I have legacy dynos running the same code and working perfectly. I tried to add a fresh dyno and am getting this error for the first time.

settings are:
Stack: heroku-16
node: 8.9.0
npm: 5.5.1

Using google-chrome-unstable ??

Need some help getting Chrome running in different environments.

In a different thread, you said nickjs looks for google-chrome-unstable before checking CHROME_PATH in environment variables.

Is using google-chrome-unstable the best way to have the same code execute in both a local (Mac) and remote (EC2/Elastic Beanstalk) node.js setup?

If so, can you help with download/install instructions of google-chrome-unstable?

If not, can you provide help with getting nickjs to recognize Chrome in an EC2 instance?

Thx - I know these are probably noob questions.

Updating to .0.3.6 causes Error: timeout: load event did not fire after 10002ms

First off, Nick is amazing. Thank you.

I tried upgrading to .0.3.6 (from .0.3.0) to get around the Chrome subprocess killed by signal SIGTERM issues.

Now I get a Error: timeout: load event did not fire after 10002ms on my code. Any ideas? The HN example still works so I'm not sure what's wrong with my code.


const Nick = require("nickjs");
const nick = new Nick();

(async () => {
  const tab = await nick.newTab();

  let urls = ["https://www.lawfwfewef.com", "https://www.awfwaffe.com"];

  let outsideresults = [];

  for (let url of urls) {
    const insideresults = [];

    await tab.open(url);

    await tab.wait(9000);

    await tab.untilVisible("html"); // Make sure we have loaded the page
  } // for of loop
})()
  .then(() => {
    console.log("Job done!");
    nick.exit();
  })
  .catch(err => {
    console.log(`Something went wrong: ${err}`);
    nick.exit(1);
  });

Change Proxy and User-Agent after Nick instantiation

What would be the best way to set a different proxy and user agent each time a tab is opened? I'm currently testing this out, but I'm not sure it is the right approach:

const nick = new Nick({
  printNavigation: true,
  printResourceErrors: true,
  printPageErrors: false,
  resourceTimeout: 10000,
})

// start loop logic

;(async () => {
        nick.options.userAgent = randomUserAgent;
        nick.options.httpProxy = randomProxy;

        const tab = await nick.newTab()
        await tab.open(url)

        // Clean up
        await nick.deleteAllCookies()
        await tab.close()
})()

// end loop logic

Initially I had const nick = new Nick inside my loop logic, and that crashed node after 100 tab opens. I'm pretty sure I know why :)

Simulate an Enter event

How do I simulate an Enter event, like pressing Enter after sending keys to a textarea?

Scaling

How is scaling supposed to work with NickJS?
Does instantiating a new NickJS object start a new session (for example, if you want to keep tabs independent from each other)?

Allow changing NickJS options after the initial constructor

Please enable changing NickJS options after the initial constructor. Specifically, I would like to modify:

  • loadImages flag
  • blacklist/whitelist
  • (and the print* flags would be nice, too)

Motivation: I would like to use a special configuration just when logging in

  • Login page requires loadImages=true and no blacklists
  • but other pages are OK with loadImages=false and blacklists (and load faster without images & JS files)

Open new tabs loop through array

Hi, and first of all thanks for NickJS and Phantombuster.
This is a "help me please" issue.
I'm not very familiar with async/await functions, so I'm practicing with this.

My question is:
How can I grab a bunch of URLs (as in my example code), then loop through the array to open a new tab for each URL, take a screenshot, close the tab, and finally close Nick?
I'm a Phantombuster user too (we are experimenting with it at my company) and I'm really stuck on this.
Can anyone help me?

import 'babel-polyfill'
import Nick from 'nickjs'
const nick = new Nick()
nick.newTab().then(async function (tab) {
  await tab.open('https://news.ycombinator.com/')
  await tab.waitUntilVisible('#hnmain')
  await tab.inject('https://code.jquery.com/jquery-3.1.1.slim.min.js')
  const urls = tab.evaluate((arg, callback) => {
    const data = []
    $('.athing').each((index, element) => {
      data.push($(element).find('.storylink').attr('href'))
    })
    callback(null, data)
  })
  return urls
}).then((urls) => {
  for (var i = 0; i < urls.length; i++) {
    // here i want open a new tab for each url i have in my urls array
    // And i want to perform a screenshot or a evaluate function
  }
//and after that i want to quit nick
  nick.exit()
})
  .catch((err) => {
    console.log('Oops, an error occurred: ' + err)
    nick.exit(1)
  })

Best Regards
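
A minimal sketch of the loop described above, assuming tab.screenshot(filename) and tab.close() behave as described in the NickJS documentation. The function name screenshotAll and the shot-N.png filenames are made up for illustration. Each URL is handled one after the other with await inside a plain for loop, and the tab is closed in finally so that a failed page does not leave a tab open.

```javascript
// Hypothetical helper: open one tab per URL, take a screenshot, close the tab.
// `nick` is an existing Nick instance; `urls` is an array of strings.
async function screenshotAll(nick, urls) {
  const files = []
  for (let i = 0; i < urls.length; i++) {
    const tab = await nick.newTab()
    try {
      await tab.open(urls[i])
      const filename = `shot-${i}.png` // illustrative filename
      await tab.screenshot(filename)
      files.push(filename)
    } finally {
      await tab.close() // always release the tab, even if open() failed
    }
  }
  return files
}
```

In the code above, the second .then callback could become async (urls) => { await screenshotAll(nick, urls); nick.exit() }, so that nick.exit() only runs once every screenshot has been taken.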

Phantom.js + casper : syntaxError: Unexpected token '>'

When I try to launch the example I get this error :

SyntaxError: Unexpected token '>'

  phantomjs://code/test.js:4 in injectJs
  phantomjs://code/bootstrap.js:456

To launch the script I use :

node_modules/casperjs/bin/casperjs test.js

test.js contains the example from readme.md

EDIT :

I just discovered that CasperJS or PhantomJS does not support ES6:

setTimeout( () =>  console.log('Hello'), 3000 );

This fails :'(
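
Arrow functions are ES2015 syntax, which is what the "Unexpected token '>'" error above points at. For comparison, here is a sketch of the same call written in ES5 with a plain function expression (the name sayHello is only for illustration):

```javascript
// ES5 version: a function expression instead of an arrow function
var sayHello = function () {
  console.log('Hello')
}
setTimeout(sayHello, 3000)
```

This only covers the one-line test; the README example also relies on async/await, which would need the same kind of rewrite or a transpiler.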

Running NickJs in node:9-alpine

I tried running NickJS in the node:9-alpine image and I get this error:

CHROME STDERR: [0204/101402.974481:WARNING:dns_config_service_posix.cc(326)] Failed to read DnsConfig.
CHROME STDERR: [0204/101402.984254:ERROR:devtools_http_handler.cc(759)] Error writing DevTools active port to file
> It took 105ms to start and connect to Chrome (1 tries)
{
  "level": "error",
  "message": "cannot connect to chrome tab: Error: Unknown command: protocol"
}

I have added the Chromium browser to the image and set CHROME_PATH to chromium-browser. The binary is present there; I checked by running bash in the container.
Does anyone know what to do?

My Dockerfile:

FROM node:9-alpine

ENV CHROME_PATH=/usr/bin/chromium-browser
ENV NICKJS_NO_SANDBOX=1
# ╒═════════════════════════
# │ Copy from builder
# ╘═════════════════════════
RUN mkdir -p /app
WORKDIR /app

# Node modules
COPY --from=builder /app-prod/node_modules ./node_modules

# App files
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/res ./res

COPY --from=builder /app/package.json .
COPY --from=builder /app/util/healthcheck.js .
COPY --from=builder /app/start.sh .

RUN rm -f /app/dist/source.js /app/dist/source.js.map

# ╒═════════════════════════
# │ INSTALL NATIVE
# ╘═════════════════════════
# Install native app dependencies
RUN set -ex; \
  apk add --no-cache \
    rsync \
    openssh \
    udev \
    ttf-freefont \
    chromium

# ╒═════════════════════════
# │ SETUP HEALTH CHECK
# ╘═════════════════════════
HEALTHCHECK --interval=12s --timeout=12s --retries=3 --start-period=60s \
 CMD node ./util/healthcheck.js

# Run the container under "node" user by default
#USER node
CMD [ "npm", "run", "serve" ]

getaddrinfo ENOTFOUND

Hi, when I run the demo, this error appears: Error: getaddrinfo ENOTFOUND localhost undefined:9222
