Code Monkey home page Code Monkey logo

node-worker-farm's Introduction

Worker Farm Build Status

NPM

Distribute processing tasks to child processes with an über-simple API and baked-in durability & custom concurrency options. Available in npm as worker-farm.

Example

Given a file, child.js:

module.exports = function (inp, callback) {
  callback(null, inp + ' BAR (' + process.pid + ')')
}

And a main file:

var workerFarm = require('worker-farm')
  , workers    = workerFarm(require.resolve('./child'))
  , ret        = 0

for (var i = 0; i < 10; i++) {
  workers('#' + i + ' FOO', function (err, outp) {
    console.log(outp)
    if (++ret == 10)
      workerFarm.end(workers)
  })
}

We'll get an output something like the following:

#1 FOO BAR (8546)
#0 FOO BAR (8545)
#8 FOO BAR (8545)
#9 FOO BAR (8546)
#2 FOO BAR (8548)
#4 FOO BAR (8551)
#3 FOO BAR (8549)
#6 FOO BAR (8555)
#5 FOO BAR (8553)
#7 FOO BAR (8557)

This example is contained in the examples/basic directory.

Example #1: Estimating π using child workers

You will also find a more complex example in examples/pi that estimates the value of π by using a Monte Carlo area-under-the-curve method and compares the speed of doing it all in-process vs using child workers to complete separate portions.

Running node examples/pi will give you something like:

Doing it the slow (single-process) way...
π ≈ 3.1416269360000006  (0.0000342824102075312 away from actual!)
took 8341 milliseconds
Doing it the fast (multi-process) way...
π ≈ 3.1416233600000036  (0.00003070641021052367 away from actual!)
took 1985 milliseconds

Durability

An important feature of Worker Farm is call durability. If a child process dies for any reason during the execution of call(s), those calls will be re-queued and taken care of by other child processes. In this way, when you ask for something to be done, unless there is something seriously wrong with what you're doing, you should get a result on your callback function.

My use-case

There are other libraries for managing worker processes available but my use-case was fairly specific: I need to make heavy use of the node-java library to interact with JVM code. Unfortunately, because the JVM garbage collector is so difficult to interact with, it's prone to killing your Node process when the GC kicks under heavy load. For safety I needed a durable way to make calls so that (1) it wouldn't kill my main process and (2) any calls that weren't successful would be resubmitted for processing.

Worker Farm allows me to spin up multiple JVMs to be controlled by Node, and have a single, uncomplicated API that acts the same way as an in-process API and the calls will be taken care of by a child process even if an error kills a child process while it is working as the call will simply be passed to a new child process.

But, don't think that Worker Farm is specific to that use-case, it's designed to be very generic and simple to adapt to anything requiring the use of child Node processes.

API

Worker Farm exports a main function and an end() method. The main function sets up a "farm" of coordinated child-process workers and it can be used to instantiate multiple farms, all operating independently.

workerFarm([options, ]pathToModule[, exportedMethods])

In its most basic form, you call workerFarm() with the path to a module file to be invoked by the child process. You should use an absolute path to the module file, the best way to obtain the path is with require.resolve('./path/to/module'), this function can be used in exactly the same way as require('./path/to/module') but it returns an absolute path.

exportedMethods

If your module exports a single function on module.exports then you should omit the final parameter. However, if you are exporting multiple functions on module.exports then you should list them in an Array of Strings:

var workers = workerFarm(require.resolve('./mod'), [ 'doSomething', 'doSomethingElse' ])
workers.doSomething(function () {})
workers.doSomethingElse(function () {})

Listing the available methods will instruct Worker Farm what API to provide you with on the returned object. If you don't list a exportedMethods Array then you'll get a single callable function to use; but if you list the available methods then you'll get an object with callable functions by those names.

It is assumed that each function you call on your child module will take a callback function as the last argument.

options

If you don't provide an options object then the following defaults will be used:

{
    workerOptions               : {}
  , maxCallsPerWorker           : Infinity
  , maxConcurrentWorkers        : require('os').cpus().length
  , maxConcurrentCallsPerWorker : 10
  , maxConcurrentCalls          : Infinity
  , maxCallTime                 : Infinity
  , maxRetries                  : Infinity
  , autoStart                   : false
  , onChild                     : function() {}
}
  • workerOptions allows you to customize all the parameters passed to child nodes. This object supports all possible options of child_process.fork. The default options passed are the parent execArgv, cwd and env. Any (or all) of them can be overridden, and others can be added as well.

  • maxCallsPerWorker allows you to control the lifespan of your child processes. A positive number will indicate that you only want each child to accept that many calls before it is terminated. This may be useful if you need to control memory leaks or similar in child processes.

  • maxConcurrentWorkers will set the number of child processes to maintain concurrently. By default it is set to the number of CPUs available on the current system, but it can be any reasonable number, including 1.

  • maxConcurrentCallsPerWorker allows you to control the concurrency of individual child processes. Calls are placed into a queue and farmed out to child processes according to the number of calls they are allowed to handle concurrently. It is arbitrarily set to 10 by default so that calls are shared relatively evenly across workers, however if your calls predictably take a similar amount of time then you could set it to Infinity and Worker Farm won't queue any calls but spread them evenly across child processes and let them go at it. If your calls aren't I/O bound then it won't matter what value you use here as the individual workers won't be able to execute more than a single call at a time.

  • maxConcurrentCalls allows you to control the maximum number of calls in the queue—either actively being processed or waiting for a worker to be processed. Infinity indicates no limit but if you have conditions that may endlessly queue jobs and you need to set a limit then provide a >0 value and any calls that push the limit will return on their callback with a MaxConcurrentCallsError error (check err.type == 'MaxConcurrentCallsError').

  • maxCallTime (use with caution, understand what this does before you use it!) when !== Infinity, will cap a time, in milliseconds, that any single call can take to execute in a worker. If this time limit is exceeded by just a single call then the worker running that call will be killed and any calls running on that worker will have their callbacks returned with a TimeoutError (check err.type == 'TimeoutError'). If you are running with maxConcurrentCallsPerWorker value greater than 1 then all calls currently executing will fail and will be automatically resubmitted unless you've changed the maxRetries option. Use this if you have jobs that may potentially end in infinite loops that you can't programatically end with your child code. Preferably run this with a maxConcurrentCallsPerWorker so you don't interrupt other calls when you have a timeout. This timeout operates on a per-call basis but will interrupt a whole worker.

  • maxRetries allows you to control the max number of call requeues after worker termination (unexpected or timeout). By default this option is set to Infinity which means that each call of each terminated worker will always be auto requeued. When the number of retries exceeds maxRetries value, the job callback will be executed with a ProcessTerminatedError. Note that if you are running with finite maxCallTime and maxConcurrentCallsPerWorkers greater than 1 then any TimeoutError will increase the retries counter for each concurrent call of the terminated worker.

  • autoStart when set to true will start the workers as early as possible. Use this when your workers have to do expensive initialization. That way they'll be ready when the first request comes through.

  • onChild when new child process starts this callback will be called with subprocess object as an argument. Use this when you need to add some custom communication with child processes.

workerFarm.end(farm)

Child processes stay alive waiting for jobs indefinitely and your farm manager will stay alive managing its workers, so if you need it to stop then you have to do so explicitly. If you send your farm API to workerFarm.end() then it'll cleanly end your worker processes. Note though that it's a soft ending so it'll wait for child processes to finish what they are working on before asking them to die.

Any calls that are queued and not yet being handled by a child process will be discarded. end() only waits for those currently in progress.

Once you end a farm, it won't handle any more calls, so don't even try!

Related

  • farm-cli – Launch a farm of workers from CLI.

License

Worker Farm is Copyright (c) Rod Vagg and licensed under the MIT license. All rights not explicitly granted in the MIT license are reserved. See the included LICENSE.md file for more details.

node-worker-farm's People

Contributors

amasad avatar amilajack avatar arlolra avatar bnoordhuis avatar dustinsoftware avatar egavr avatar ericbyers avatar gdborton avatar j0tunn avatar kikobeats avatar localjo avatar notslang avatar quasic avatar realityking avatar rvagg avatar ryandesign avatar thaumant avatar xamgore avatar zeroevidence avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

node-worker-farm's Issues

Have it work with node's debugger

Hi,
First thank you very much for the lib, it did save me a lot of time.
The only issue I have now is that I cannot debug my app when a WorkerFarm is running.

Those are the errors I'm having:

Failed to open socket on port XXX, waiting 1000 ms before retrying

Looks like each child process' debugger is bound to the same debug port thus the error message and the connection loss with the main process.
Looks like it's a bug in node, which was fixed in v0.11 but not backported to v0.10 (which I'm using) nodejs/node-v0.x-archive#5318

How to pass class instance as parameters to the child process

runner.js:

function run(options, callback) {
    // some code
    options.user.getCurrentUser()
}
class UserService {
  currentUser() {
    return {id: 123};
  }
}

// ...

const workers = workerFarm(workerOptions, require.resolve('./runner'), ['run']);
const options = {
  str: 'aa'
  num: 1,
  user: new UserService(),
}
workers.run(options, (err, out) => {
  //do something
})

When I run the code above, And trying to exec options.user.currentUser() in the runner.js, it throws an error TypeError: options.user.currentUser is not a function. It seems the options.user be serialized after passed to the child process.

Is there some way to pass class instance to the child process?

Keeps using the same worker

I'm having a problem where the worker farm just keeps using a single worker no matter what I set in the options, I'm still not used to the node event tick and programming in general so pardon my ignorance.

I have a app.js file that calls the worker, the worker does a request (https://github.com/request/request) and returns the response to app.js, inside the request response in worker.js I did a simple console.log(process.pid) and it's always the same number even when I'm requesting hundreds of times all the time.

The options:

var requesting = workerFarm({
        maxCallsPerWorker: Infinity,
        maxConcurrentWorkers: 3, // Processes
        maxConcurrentCallsPerWorker: 10, // default: 10
        maxConcurrentCalls: Infinity,
        maxCallTime: ((2) * (60) * (1000)), // Careful, default: Infinity
        maxRetries: Infinity,
        autoStart: true // default: false
    }, require.resolve('./lib/worker.js'));

What am I doing wrong here?

Ability to run a function on all workers

I'm looking at setting some persistent state across the workers, for example initialization information that can change so doesn't necessarily make sense to pass as an environment variable and restart all the workers each time.

It would be nice if there was a way to run a function on all workers to ensure that they all get that state regardless of which worker applies subsequent operations.

Perhaps offering this API something like - workerFarm([options, ]pathToModule[, exportedMethods][, exportedStateMethods])

So that one could run workerFarm('mod', ['transform'], ['setState']).

By assigning an id to individual states (setState (state) => stateId, transform(stateId, options)), one can use this same technique to reuse input in-memory between runs as well.

For now I'm just creating many different pools to manage the multiple state cases, but thought it worth mentioning this to see if it might be a possibility.

Module not found: Error: Can't resolve 'child_process'

Hi. I am building my project with [email protected] and I am getting the following error. As it looks on npmjs.org, child_process is not in use. Is there any chance that it could be fixed here?

This package name is not currently in use, but was formerly occupied by another package. To avoid malicious use, npm is hanging on to the package name, but loosely, and we'll probably give it to you if you want it.

Error I am getting is:

ERROR in ./node_modules/worker-farm/lib/fork.js
Module not found: Error: Can't resolve 'child_process' in 'C:\Programming\Keia-AOT\node_modules\worker-farm\lib'
 @ ./node_modules/worker-farm/lib/fork.js 3:21-45
 @ ./node_modules/worker-farm/lib/farm.js
 @ ./node_modules/worker-farm/lib/index.js
 @ ./node_modules/terser-webpack-plugin/dist/TaskRunner.js
 @ ./node_modules/terser-webpack-plugin/dist/index.js
 @ ./node_modules/terser-webpack-plugin/dist/cjs.js
 @ (webpack)/lib/WebpackOptionsDefaulter.js
 @ (webpack)/lib/webpack.js
 @ ./config/webpack.server.js

Error sent by `childTimeout` is an empty object

For some reason the TimeoutError used in childTimeout is generating an empty object, or something that looks like an empty object when logged. Using a normal Error generates the expected output.

Children working in series

It seem like the children is working in series and they wait for the previous one to finish.
When i try to run the pi program it returns

Doing it the slow (single-process) way...
5348
5348
5348
5348
π ≈ 3.1413785799999996  (0.00021407358979352864 away from actual!)
took 9561 milliseconds
Doing it the fast (multi-process) way...
5355
5355
5355
5355
π ≈ 3.1417808   (0.00018814641020670209 away from actual!)
took 9597 milliseconds

It's supposed to run the multi-process way faster, but it doesn't, it seems that it's waiting for the previous child to finish, before starting the next one. Why is that?

An option to start child workers as early as possible

Right now, child workers are only lazily started (i.e. when we have work for them). I want to start the child processes early because they have heavy initialization code to do that is taxing on the first few requests that comes through.

I can write a pull requests, but wanted check if you're interested in pulling this. Here is what I'm currently doing to hack around this:

function warmupWorkers() {
  os.cpus().forEach(function() {
    workers(1, function() {});
  });
}

Thats ilegal!!!

Hi, just wanna say That using child labour is ilegal and you should stop. Especialy when you use inocent children to something like Estimating π using child workers and making manuals to guide others do such things. You are evil and should go to jail!

Make number of running processes configurable

We're using node-worker-farm on React Native's transformer. Thanks to this library we can speed up the transform step by parallelizing it, so thanks a lot for making this!.

One problem we have though, is that once the farm starts N processes it will never kill them. Our child processes consume a ton of memory. It would be great if we could tweak have a policy to manage how and when processes are created and killed. For instance, if many processes were idle during a long period of time would be great to kill some of them, and then, once the farm receives lots of new requests we could spawn them back again.

Do you think this feature would fit nice in the library? We could submit a PR :)

ERR_IPC_CHANNEL_CLOSED: Channel closed

I'm experimenting this error using a more than one cpu for calculate fibonacci numbers.

events.js:136
      throw er; // Unhandled 'error' event
      ^

Error [ERR_IPC_CHANNEL_CLOSED]: Channel closed
    at process.target.send (internal/child_process.js:606:16)
    at callback (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/worker-farm/lib/child/index.js:32:17)
    at /Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/async/dist/async.js:958:16
    at check (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/async/dist/async.js:3087:28)
    at Command.redis.get [as callback] (/Users/josefranciscoverdugambin/Projects/farm-cli/examples/fibonacci-redis/index.js:25:48)
    at normal_reply (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/redis/index.js:726:21)
    at RedisClient.return_reply (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/redis/index.js:824:9)
    at JavascriptRedisParser.returnReply (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/redis/index.js:192:18)
    at JavascriptRedisParser.execute (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/redis-parser/lib/parser.js:574:12)
    at Socket.<anonymous> (/Users/josefranciscoverdugambin/Projects/farm-cli/node_modules/redis/index.js:274:27)

reproducible code

This is what I'm running
https://github.com/Kikobeats/farm-cli/blob/master/examples/fibonacci-redis/index.js

git clone [email protected]:Kikobeats/farm-cli.git && cd farm-cli && npm install
redis-server &
DEBUG=farm-cli node ./bin/cli/index.js --cores 2 --workers 1 --delay 0 examples/fibonacci-redis --n 40

environment

node --version
v9.2.0
```

```
npm --version
5.5.1
```

Probably related with https://github.com/rvagg/node-worker-farm/pull/64

Error: spawn E2BIG with big object argument

Worker code:

const wkx = require('wkx')
const gdal = require('gdal')

module.exports = (geojson, cb) => {
  const wkb = wkx.Geometry.parseGeoJSON(geojson).toWkb()
  const geo = gdal.Geometry.fromWKB(wkb)
  cb(null, geo.isValid())
}

Giving it a sufficiently large GeoJSON object (lets say a polygon of the united states) will fail with Error: spawn E2BIG from the OS.

Is there any strategy for passing large objects to workers or is this something nobody has encountered yet?

How to limit job CPU time

My workers wrap a C library using ffi. Certain inputs to the library can consume inordinate (possibly infinite) CPU time. I need to be able to detect this and kill a job that hasn't returned an answer after X milliseconds. Could you add an example of how this could be accomplished? Thanks.

Unhandled rejection TypeError: child.send is not a function

Getting error in Jenkins job (Ubuntu box)

node[3708]: pthread_create: Resource temporarily unavailable
Unhandled rejection TypeError: child.send is not a function
at fork (/app/onejenkins/workspace/app.Build/build/node_modules/worker-farm/lib/fork.js:22:9)
at Farm.startChild (/app/onejenkins/workspace/app.Build/build/node_modules/worker-farm/lib/farm.js:106:16)
at Farm.processQueue (/app/onejenkins/workspace/app.Build/build/node_modules/worker-farm/lib/farm.js:279:10)
at Farm.addCall (/app/onejenkins/workspace/app.Build/build/node_modules/worker-farm/lib/farm.js:307:8)
at Farm. (/app/onejenkins/workspace/app.Build/build/node_modules/worker-farm/lib/farm.js:38:10)
at _class.boundWorkers (/app/onejenkins/workspace/app.Build/build/node_modules/uglifyjs-webpack-plugin/dist/uglify/index.js:71:24)
at enqueue (/app/onejenkins/workspace/app.Build/build/node_modules/uglifyjs-webpack-plugin/dist/uglify/index.js:96:17)
at tryCatcher (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/util.js:16:23)
at Promise._settlePromiseFromHandler (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/promise.js:512:31)
at Promise._settlePromise (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/promise.js:569:18)
at Promise._settlePromise0 (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/promise.js:614:10)
at Promise._settlePromises (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/promise.js:689:18)
at Async._drainQueue (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/async.js:133:16)
at Async._drainQueues (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/async.js:143:10)
at Immediate.Async.drainQueues (/app/onejenkins/workspace/app.Build/build/node_modules/bluebird/js/release/async.js:17:14)
at runCallback (timers.js:789:20)
at tryOnImmediate (timers.js:751:5)
at processImmediate [as _immediateCallback] (timers.js:722:5)

How to limit length of job queue

If my server is very busy and has a zillion jobs queued up already, I don't want to queue up any more; I'd rather send an HTTP 503 Service Unavailable response until some jobs finish. Can I specify a maximum queue length, or somehow determine the length of the queue?

Uncaught exception on worker spawn

Hello,

Spawning a worker when the server doesn't have enough free memory, an error event won't be emitted, but exception is thrown instead. Looking at the child_process module source, it seems that only a certain errors will be emitted via 'error' event at this point.

Example:
(node:2177497) UnhandledPromiseRejectionWarning: Error: spawn ENOMEM at _errnoException (util.js:992:11) at ChildProcess.spawn (internal/child_process.js:323:11) at exports.spawn (child_process.js:502:9) at Object.exports.fork (child_process.js:103:10) at fork (<omitted>/node_modules/worker-farm/lib/fork.js:17:36) at Farm.setup (<omitted>/node_modules/worker-farm/lib/farm.js:68:12) at farm (<omitted>/node_modules/worker-farm/lib/index.js:16:15)

Node.js version: 8.11.3

EDIT1: I forgot to mention that the worker farm do get created inside an async function, thus resulting in rejected promise instead of uncaught exception.

Feature request: Be able to call a "warmup" function on a worker

I have a scenario in which I want to have workers "warm" so that they are ready to serve requests (they have some "heavy" initialisation).
This works currently by using autoStart: true option. The only problem is that during the setup of my worker I depend on configuration from the code that creates the workers, so the only way I can do that setup now is during the call itself (and thus it is not a "warm" worker anymore, since it needs to configure/setup when it's called).

It would be nice to have the possibility to call a initialiser method for all workers that can take options and make the worker ready for calls.

Is there anyway to use it with Promises?

Has anyone managed to do something like

worker('getusers')
.then(function(result) {
    console.log(result);
})
.catch(function(error) {
    console.error(error);
});

I tried but didn't work out, maybe I'm doing something wrong, any examples?

Another question, is there anyway to "cancel" the current running task if a user closes the webpage (when using it with express), something like:

req.on('close', function(){
    worker.cancel_current_task();
});

If not, are there any other modules that are capable of this? thanks for your time.

debugging spawned workers

I know this is not a problem unique to this module, but how do I debug spawned workers? Is there a way to --debug=[unique-port] for each worker spawned?

RFC: sticky workers

Problem

While using worker-farm, I've found that workers get assigned from the list based on whoever is free, starting from the first worker. This means that there is no predictability on which worker would execute which call. This is good for a simple usage and I'm convinced that the default should be kept as-is.

There are use cases where delegated tasks might take advantage of executing on a particular process, especially when the process is expensive. A practical example comes when workers cache the result or part of the result. Caches can end up being n-plicated (at most once per worker); also increasing memory consumption. As a side-effect, processing time can be increased because of these cache misses.

The counter-argument for the example above would probably be to keep the cache on the main process (and avoid the round trip as well), but workers can die freely (e.g. because an OOM), while the main process should not.

Proposed solution

As said before, the default behavior should remain the same. An additional method (getWorker) would be added to the options object. The getWorker method would accept the same object as addCall to let the user decide what to do with the call.

  • When getWorker is not present, its default would be a refactor of the code that picks the first available one from the queue (pointed on the "Problem" section).
  • When present, it will be called, and its return has to be the index of the worker that will be used. Returning NaN will imply that worker-farm can choose whatever worker it wants, and a number outside the range [0, maxWorkers) will throw a RangeError.

Signature method

getWorker({
  method: string,
  callback: (err: ?Error, data: any),
  args: $ReadOnlyArray<any>
  retries: number,
}): number

Overflow control

It could be that a custom getWorker implementation keeps pushing data into the same worker, up to the point where maxConcurrentCallsPerWorker is overflowed. In that case, an error should be thrown by the implementation. However, if maxCallsPerWorker is reached, the worker should be killed, and a new one spawned without any error.

emitting events

It would be really nice if workers could emit events for dealing with things other than the return value of the job... Perhaps an interface similar to webworker-threads?

Thoughts?

3 concurrency tests fail in master

Github says the build is passing, but 3 concurrency tests are failing.

# many concurrent calls
not ok 19 processed tasks concurrently (191ms)
  ---
    operator: ok
    expected: true
    actual:   false
  ...
ok 20 workerFarm ended
# single concurrent call
not ok 21 processed tasks sequentially (192ms)
  ---
    operator: ok
    expected: true
    actual:   false
  ...
ok 22 workerFarm ended
# multiple concurrent calls
not ok 23 processed tasks concurrently (183ms)
  ---
    operator: ok
    expected: true
    actual:   false
  ...
ok 24 workerFarm ended

Job retry limit?

I have a job that emits the following error: FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory.

These kinds of errors make node-worker-farm resubmit a particular job indefinitely that emit these kinds of errors. I'd like a way to have a limit on the number of retries of tasks before they're dropped as complete failures.

Would this be possible with current API?

At the moment, I'm using maxCallTime which isn't really what I want; and isn't the strongest solution.

Worker Farm: Received message for unknown index for existing child. This should not happen!

Hi - First of all thanks for this quality multi-process library. I looked into many options before settling on this one - it's great! But I need help with an issue. In my setup I am getting the following error very often:

Worker Farm: Received message for unknown index for existing child. This should not happen!

What could cause this? If needed I could come up with a sample app to reproduce this, but it will take some time to setup so I thought I'd ask first.

Thanks!

[Question] How to stop a currently-running work?

workerFarm.end(farm) documentation says:

If you send your farm API to workerFarm.end() then it'll cleanly end your worker processes. Note though that it's a soft ending so it'll wait for child processes to finish what they are working on before asking them to die.

What if I need to hard-end an ongoing jobs, before queuing new ones?

Use case: the long-running (forever) ongoing jobs has gone bad because of top-level (parent process) config changes, so the parent needs to spin new long-running ongoing jobs, but only after killing the original, bad jobs first.

Detect heap out of memory

Context

I'm doing memory intensive calculations at worker, and thus heap out of memory occasionally happens. I want to be able to detect failures of those jobs from parent process and blacklist them in my database as "too-much memory intensive".

Feature Request 1

How to detect and handle that child has hit the node's memory limit?

Feature Request 2

When worker crashes, it outputs the heap out of memory trace. I want a way to suppress that output because I'll be handling that.

Unable to run test example

Hello,

I was trying to run the basic pi example. but looks like it is not able to find the extend function in the util module.

require('util')._extend

I am not able to locate that function here eithere 👍

http://nodejs.org/api/util.html

Any thoughts ?

Thanks

Error: channel closed

Hello guys, I'm having a trouble:

events.js:85
      throw er; // Unhandled 'error' event
            ^
Error: channel closed
    at process.target.send (child_process.js:414:26)
    at callback (/path/to/my/app/node_modules/worker-farm/lib/child/index.js:25:17)
    at Request._callback (/path/to/my/app/functions/getContent.js:26:13)
    at Request.self.callback (/path/to/my/app/node_modules/request/request.js:344:22)
    at Request.emit (events.js:110:17)
    at Request.<anonymous> (/path/to/my/app/node_modules/request/request.js:1239:14)
    at Request.emit (events.js:129:20)
    at IncomingMessage.<anonymous> (/path/to/my/app/node_modules/request/request.js:1187:12)
    at IncomingMessage.emit (events.js:129:20)
    at _stream_readable.js:908:16
events.js:85
      throw er; // Unhandled 'error' event
            ^
Error: channel closed
    at process.target.send (child_process.js:414:26)
    at callback (/path/to/my/app/node_modules/worker-farm/lib/child/index.js:25:17)
    at Request._callback (/path/to/my/app/functions/getContent.js:26:13)
    at Request.self.callback (/path/to/my/app/node_modules/request/request.js:344:22)
    at Request.emit (events.js:110:17)
    at Request.<anonymous> (/path/to/my/app/node_modules/request/request.js:1239:14)
    at Request.emit (events.js:129:20)
    at IncomingMessage.<anonymous> (/path/to/my/app/node_modules/request/request.js:1187:12)
    at IncomingMessage.emit (events.js:129:20)
    at _stream_readable.js:908:16
events.js:85
      throw er; // Unhandled 'error' event

Any idea?

Add/Remove concurrent workers dynamically

Hello,

I'm trying to add an autobalance feature for add/remove concurrents calls per workers automatically when the CPU is idle or saturate (respectively).

See more information in: Kikobeats/farm-cli#12 😄

I reviewed the code for know the current code approach for that and I saw this:

if (this.children[childId].activeCalls < this.options.maxConcurrentCallsPerWorker
        && this.children[childId].calls.length < this.options.maxCallsPerWorker) {
      this.send(childId, this.callQueue.shift())

so for add an extra worker I can use .send with the child id and generate an extra calllQueue. Fine!

My problematic is how to kill this concurrent workers.

In the code you use .stopChild but this call all the worker; I need kill just one (the last) concurrent call in the worker.

I tried to code a mockup for that, something like:

function killLastConcurrentCall (farm) {
  Object.keys(farm.children).forEach(function (childId) {
    const calls = farm.children[childId].calls
    const call = calls[calls.length -1]
    const idx = calls.length - 1

    farm.receive({
      idx,
      child: childId,
      args: [ new ProcessTerminatedError('cancel by autobalance') ]
    })
  })
}

But doesn't worker. Any idea how I can handle this feature from the code side?

How to return an error from a child

Could you add an example showing how to return an error from a child? I'm trying to call the callback function with a new Error('...') object in the first parameter as usual, but although the master process recognizes that there's something in err, it appears to be an empty object, so I can't get any details about the error that occurred.

This is with node 0.10.12 on OS X 10.8.4.

cluster.isMaster is returning true for spawned workers

Checking cluster.isMaster for workers returns true. This was some unexpected behavior because I'm using a logging library and am relying on the cluster.isMaster check to be false for child workers so that log messages can be pumped to the master process to be written to file and prevent file collisions with the logger. Why does cluster.isMaster return true for all workers?

Child processes taking more time than parent for JSON processing

I am trying JSON stringify of very large datasets. I am creating child processes to perform the operations and doing equivalent operations in the parent process.

The child processes should execute the results faster as they are executing the process in parallel but I am observing opposite results in my test JSON.
For running this code please replace the url's with the actual JSON(unable to paste them due to character limit).
The code for main.js is as follows:

let workerFarm = require("worker-farm")
    , workers    = workerFarm(require.resolve('./child'))
    , ret        = 0
    ,final_out =0;

let start1 = new Date().getTime();
for (let i = 0; i < 100; i++) {
    workers('#' + i + ' FOO', function (err, outp) {
        if (++ret == 100){
            console.log('Doing it the fast (multi-process) way...');
            let end1 = new Date().getTime();
            console.log('fast took',  (end1 - start1), 'milliseconds')
            workerFarm.end(workers)
        }
    })
}

let start2 = new Date().getTime();

for (let i = 0; i < 100; i++) {

    let json= JSON.stringify([https://pkgstore.datahub.io/datahq/1mb-test/1mb-test_json/data/ca5fd34861cc68b4f519b1c1e15c510e/1mb-test_json.json](url));

        if (i == 99){
        
        console.log('Doing it the slow (single-process) way...')
        console.log('slow took', new Date().getTime() - start2, 'milliseconds')
    }
}

The code for child.js file is as follows:

'use strict'

module.exports = function (inp, callback) {
    let srt= new Date().getTime();
    let json= JSON.stringify([https://pkgstore.datahub.io/datahq/1mb-test/1mb-test_json/data/ca5fd34861cc68b4f519b1c1e15c510e/1mb-test_json.json](url));

    let end = new Date().getTime();
    return callback(null, end-srt);
}

Kindly help me identify the issue as the results of this iteration are as follows:

Doing it the slow (single-process) way...
slow took 967 milliseconds
Doing it the fast (multi-process) way...
fast took 1079 milliseconds

The child processes are taking more time. I have tried this for less as well as more iterations but the results are quite similar.

workerFarm.end is not working

I'm using node 10.0.0 on macOS.

When I call workerFarm.end it immediately terminates my running processes which I understand is not supposed to happen, and then they all start again. Possibly because they are retrying, but regardless, the queue picks up again and keeps going.

const workerFarm = require('worker-farm');

const options = {
    maxConcurrentCallsPerWorker: 1
};

// This worker spawns a mongodb client inside it
// But that shouldn't be relevant since node-worker-farm sends SIGKILL
// to close the processes.
const queue = workerFarm(options, require.resolve('./worker.js'));

// While my script is running I press CTRL-C to terminate the process
// This is where things go wrong because the processes all immediately exit and
// start up again.
process.on('SIGINT', () => {
    console.info('\rGracefully shutting down...');
    workerFarm.end(queue, finish);
});

// this is close to 300,000 items
const arr = [];

for (const item of arr) {
    queue(item, onProcessed);
}

let counter = 0;

function onProcessed() {
    counter++;
    if (counter === arr.length) {
        workerFarm.end(queue, finish);
    }
}

function finish() {
    console.info('Done');
}

Missing typescript definitions

Hi folks! I can see index.d.ts file in the repository root, but can't find it inside the npm package v1.3.1:

image

Typescript compiler can't find them too, so it will throw an error.

image

Worker timeout not always occurring

I have a typical node-worker-farm setup, where a single farm (in the main process of my application) delegates work to its workers and executes a callback upon receiving a response from each worker. If I put sufficient load on the farm, a small fraction of the work that gets delegated never comes back (i.e., a long-running worker won't time out). The basic configuration of my system is:

In main.js

var workers = require('./my-farm');
// in a voucher-controlled loop (makes ~100 calls then waits until they return before making more calls)
workers.doWork(work);

In my-farm.js

var workerFarm = require('worker-farm');
var workers = createWorkerPool();

module.exports = {
    doWork: doWork
}

function createWorkerPool() {
    var options = {
        maxCallsPerWorker: Infinity,
        maxConcurrentWorkers: 1,
        maxConcurrentCallsPerWorker: 1,
        maxConcurrentCalls: Infinity,
        maxCallTime: 10000,
        maxRetries: 0,
        autoStart: true
    };
    return workerFarm(options, require.resolve('./worker'));
}

function doWork(workObj, callback) {
    workers(workObj, function (err, response) {
        callback(err, response);
    });
}

In worker.js

module.exports = function(msg, callback) {
    try {
        ...
        // do a mix of work involving both CPU- and IO-bound executions
        ...
        callback(null, results);
    } catch(err) {
        callback(err, null);
    }
}

I have experimented with different farm configurations, like using more workers and increasing maxCallTime, but it still happens; most work is processed just fine and timeouts occur when they should, but every once in a while the child worker doesn't return and doesn't time out, and both the main process and child are still alive. The problem is definitely in this part of my code, as the application continues running if I reset the worker farm every so often. It might be worth noting that my application expects workers to hit the timeout quite often, but the timeout is sufficiently large relative to the worker initialization time (~1 second), so that shouldn't be a problem.

@rvagg, @amasad, @thaumant or anyone else: have you encountered this behavior before?

Having more than two farms causes the first farm call to be delayed until it's called again

Steps for re-creating the issue:

  1. Setup two farms
  2. Call both farms
  3. They both fire as soon as they're called as expected
  4. Setup a third farm
  5. Call any of the farm
  6. Call that farm again

You'll notice the first and second call fire when the second call is made.

EDIT:
This only occurs when I'm running this locally on my Mac.
It appears to work fine when deployed on linux.

How to treat crashed worker as failed job

Thanks for worker-farm! It's looking great so far.

When a worker crashes, worker-farm re-forks it and sends the job again.

Can you add an example that shows how to instead treat a crashed worker as a failed job? My workers are wrapping a C library using ffi and certain inputs can crash the library; I don't want to retry such jobs as they'll just continue crashing each time.

workerFarm.end(workers)

In the first example with child and main, you have the ret variable that counts the number of workers that have been called, but then later in the documentation I read that workerFarm.end will wait for all the work to finish before killing, meaning it sounds like you can put the workerFarm.end outside of the for loop?

I tested it out and it seemed to work, but wondering if this is proper usage. I haven't quite dug around in the source yet where I could probably answer for myself... =P

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.