
Comments (5)

hibiken commented on July 25, 2024

We're planning to support this uniqueness across multiple workers yeah?

If your worker processes are connecting to the same Redis instance, then the answer is yes. My plan is to create a lock key in Redis with the given TTL and check for it before inserting the task into Redis.
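
A minimal sketch of the lock-then-insert idea described above. It is not asynq's implementation: `lockStore`, `queue`, `enqueueUnique`, and `manualClock` are hypothetical names, and an in-memory map stands in for Redis so the example runs on its own. The `acquire` method imitates the semantics of Redis `SET` with the `NX` and `EX` options (set only if the key is absent, with an expiry).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrDuplicateTask mirrors the error name proposed in this thread.
var ErrDuplicateTask = errors.New("task already exists")

// lockStore is an in-memory stand-in for Redis (an assumption for this sketch).
type lockStore struct {
	mu     sync.Mutex
	expiry map[string]time.Time
	now    func() time.Time // injectable clock keeps the demo deterministic
}

func newLockStore(now func() time.Time) *lockStore {
	return &lockStore{expiry: make(map[string]time.Time), now: now}
}

// acquire succeeds only if the key is absent or its TTL has expired.
func (s *lockStore) acquire(key string, ttl time.Duration) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	now := s.now()
	if exp, ok := s.expiry[key]; ok && now.Before(exp) {
		return false // lock is still held
	}
	s.expiry[key] = now.Add(ttl)
	return true
}

// queue is a hypothetical client that takes the lock before inserting.
type queue struct {
	locks *lockStore
	tasks []string
}

func (q *queue) enqueueUnique(key string, ttl time.Duration) error {
	if !q.locks.acquire(key, ttl) {
		return ErrDuplicateTask
	}
	q.tasks = append(q.tasks, key)
	return nil
}

// manualClock only moves when advance is called.
type manualClock struct{ t time.Time }

func (c *manualClock) now() time.Time          { return c.t }
func (c *manualClock) advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	clk := &manualClock{t: time.Unix(0, 0)}
	q := &queue{locks: newLockStore(clk.now)}

	fmt.Println(q.enqueueUnique("email:42", 10*time.Minute)) // <nil>
	fmt.Println(q.enqueueUnique("email:42", 10*time.Minute)) // task already exists
	clk.advance(11 * time.Minute)
	fmt.Println(q.enqueueUnique("email:42", 10*time.Minute)) // <nil>
}
```

With a real Redis instance shared by several workers, the check and the write would need to happen atomically on the server (for example a single `SET ... NX EX` command or a script); the mutex here only plays that role within one process.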

I'm planning the feature right now and should be able to start working on it in the next few days 👍

Additional note: while planning this feature, I realized that we do not need the IgnoreUnique option, since simply omitting the Unique option when enqueuing gives you the same effect.

from asynq.

hibiken commented on July 25, 2024

Thanks for opening this issue!

I've seen Sidekiq and other libraries support uniqueness of tasks (jobs) in Redis. It sounds like a useful feature, and I'm on board with adding it to this package.

I haven't personally used this feature in the past. Would you mind adding more context as to why/when you would need this feature?

I've taken a look at Sidekiq's uniqueness feature, and it seems like they support uniqueness based on the (type, args, queue) of a task, which seems reasonable. The semantics around the uniqueness TTL (unique_for: <duration>) are a bit confusing to me, especially when you consider task retries.
For example, in Sidekiq, if you set unique_for: 10.minutes in the MyWorker class, a job of that class is guaranteed to be unique for the next 10 minutes or until the first job is processed successfully. But the doc states that:

If your job retries for a while, 10 minutes can pass, thus allowing another copy of the same job to be pushed to Redis. Design your jobs so that uniqueness is considered best effort, not a 100% guarantee

I'm not quite sure if this is the behavior we want from the uniqueness feature.

I think I need to think this through and come up with something more intuitive and simpler to understand (better for both implementors & users of the package).


cayter commented on July 25, 2024

Here's one example use case: https://blog.francium.tech/avoiding-duplicate-jobs-in-sidekiq-dcbb1aca1e20.

In general, when we have a set of servers running behind a load balancer, it's very likely that our servers will enqueue multiple jobs that do the same thing. Having the unique job feature would ensure that we don't enqueue the same job multiple times and waste computing resources (especially when the job writes to the SQL DB).

At my current job, payment gateways sometimes hit our webhook endpoint with the exact same payload multiple times (possibly within a short period of time). This causes issues because the same data in the SQL database gets updated repeatedly, which is redundant and can be incorrect.


hibiken commented on July 25, 2024

I see. Thank you for the link and the explanation!

After reading the use cases and Sidekiq's wiki, it sounds reasonable to have best-effort uniqueness (i.e., if the unique TTL expires, then it's okay to enqueue a duplicate task).

Here's my initial proposal for the API of the feature (Please provide feedback!).

Provide a function to create a uniqueness option.

// Unique returns an option to specify a uniqueness lock for the given task.
//
// Uniqueness of a task is based on (type, payload, queue).
func Unique(ttl time.Duration) Option
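
To illustrate what "uniqueness based on (type, payload, queue)" could mean in practice, here is a rough sketch that derives a lock key from those three values. The `uniqueKey` helper, the `unique:` key prefix, the choice of SHA-256, and the `map[string]interface{}` payload type are all assumptions made for this example, not part of the proposal.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// uniqueKey is a hypothetical helper; the key layout is an assumption.
func uniqueKey(queue, taskType string, payload map[string]interface{}) (string, error) {
	// json.Marshal writes map keys in sorted order, so payloads with the
	// same contents always serialize to the same bytes.
	b, err := json.Marshal(payload)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return fmt.Sprintf("unique:%s:%s:%s", queue, taskType, hex.EncodeToString(sum[:])), nil
}

func main() {
	a, _ := uniqueKey("default", "email:send", map[string]interface{}{"user_id": 42, "tmpl": "welcome"})
	b, _ := uniqueKey("default", "email:send", map[string]interface{}{"tmpl": "welcome", "user_id": 42})
	c, _ := uniqueKey("critical", "email:send", map[string]interface{}{"user_id": 42, "tmpl": "welcome"})

	fmt.Println(a == b) // true: same type, payload, and queue
	fmt.Println(a == c) // false: different queue
}
```

Hashing the payload keeps the key short and fixed-length regardless of payload size.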

Example of enqueueing a task:

// task will be enqueued if duplicates don't exist.
// if enqueued, duplicate tasks won't be enqueued for the next 10 minutes.
err := client.Enqueue(t, asynq.Unique(10*time.Minute))

Example of scheduling a task (I have two proposals):

Option 1: Treat TTL the same as Enqueue, so TTL behaves the same.

// task is unique for the next 70 minutes.
err := client.EnqueueIn(time.Hour, t, asynq.Unique(70*time.Minute))

Option 2: TTL behaves differently when using scheduling (TTL is set to delay + duration_provided_by_unique)

// task is unique for the next 70 minutes (delay + unique TTL).
err := client.EnqueueIn(time.Hour, t, asynq.Unique(10*time.Minute))

I'm not sure if we want to treat uniqueness TTL differently when scheduling like Sidekiq does. Let me know your thoughts.
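
The difference between the two options comes down to how the lock TTL is computed for a scheduled task. A small sketch, where `lockTTL` and its `addDelay` flag are hypothetical names used only for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// lockTTL returns how long the uniqueness lock would be held for a task
// scheduled to run after delay. This helper is hypothetical.
func lockTTL(delay, uniqueTTL time.Duration, addDelay bool) time.Duration {
	if addDelay {
		return delay + uniqueTTL // Option 2: TTL is counted from the scheduled time
	}
	return uniqueTTL // Option 1: TTL is counted from the time of the call
}

func main() {
	// Both calls yield a 70-minute lock for a task delayed by one hour.
	fmt.Println(lockTTL(time.Hour, 70*time.Minute, false)) // 1h10m0s
	fmt.Println(lockTTL(time.Hour, 10*time.Minute, true))  // 1h10m0s
}
```

Under Option 1, a caller who passes a TTL shorter than the delay would see the lock expire before the task even becomes ready to run; Option 2 avoids that by construction.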

Return ErrDuplicateTask when the Unique option is used and a duplicate exists.

err := client.Enqueue(t, asynq.Unique(10*time.Minute))
if errors.Is(err, asynq.ErrDuplicateTask) {
    // logic to handle duplicate, if any.
}

Retry should be treated the same as Sidekiq does:

A task that is pending retry will still hold the unique lock and prevent further tasks from being enqueued until the retry succeeds or the timeout passes. Manually removing the task from the retry queue will not remove the lock.

I don't think we need this Unlock Policy. Let me know if you disagree!
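
The retry behavior quoted above can be sketched as follows: a failed attempt leaves the lock in place, and only a successful run releases it. The `locks` type and `process` function are hypothetical, and TTL expiry is left out to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
)

// errTemporary simulates a handler failure that triggers a retry.
var errTemporary = errors.New("temporary failure")

// locks is a hypothetical in-memory set of held uniqueness locks.
// TTL expiry is intentionally omitted from this sketch.
type locks map[string]bool

// process runs handler once and releases the lock only on success.
func process(l locks, key string, handler func() error) error {
	if err := handler(); err != nil {
		return err // task goes to retry; the lock stays held
	}
	delete(l, key)
	return nil
}

func main() {
	l := locks{"email:42": true}
	attempts := 0
	flaky := func() error {
		attempts++
		if attempts < 3 {
			return errTemporary
		}
		return nil
	}
	for {
		err := process(l, "email:42", flaky)
		fmt.Printf("attempt=%d err=%v locked=%v\n", attempts, err, l["email:42"])
		if err == nil {
			break
		}
	}
}
```

The lock is reported as held after the first two attempts and as released after the third, successful one.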

For bypassing uniqueness, I suggest that we also provide an option to do so:

// IgnoreUnique returns an option to ignore task uniqueness when enqueuing. 
func IgnoreUnique() Option

Example for ignoring uniqueness:

// Really want to push this task, bypass uniqueness constraint!
err := client.Enqueue(t, asynq.IgnoreUnique())

Let me know your thoughts and feedback!


cayter commented on July 25, 2024

For example of scheduling a task...

I am leaning towards option 2, as this library is meant to be friendly to those who have used Sidekiq before.

For bypassing uniqueness...

The suggestion is good.

For ignoring uniqueness...

The suggestion is good.

I don't think we need this Unlock Policy. Let me know if you disagree!

I took a rough look at the possible options here, and I can imagine how much effort we would need to put in to support all of them. So I wouldn't push for it now, as I think the default policy, which is unlocking right after a job is successfully processed, should be sufficient for most use cases.

Question
We're planning to support this uniqueness across multiple workers yeah? Thanks!

