geofmureithi / apalis
Simple, extensible multithreaded background job and message processing library for Rust
Home Page: https://crates.io/crates/apalis
License: MIT License
How would I add a job to an apalis queue from a nodejs process?
It looks like to cancel a running task (using Redis as the queue) you need the worker_id and job_id, the latter of which can be obtained via ctx.id() when the task is initiated. However, it's not clear how to obtain the worker_id associated with a running job_id.
Hey - what's the timeline for the lock-free Postgres implementation?
Feel free to outline the rough problem that needs to be solved, and we (Bundlr) might be able to solve it, depending on time commitment.
Currently, each instance of PostgresStorage grabs a connection to LISTEN for new jobs. This means that if you have 10 workers, you constantly consume 10 connections listening on the exact same topic. Connections can be notably expensive with Postgres, so it would be nice if multiple PostgresStorage instances could share the same notification stream (maybe through a tokio::sync::watch channel or similar?) instead of each grabbing its own connection per worker.
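A minimal sketch of the shared-stream idea, assuming sqlx's PgListener plus a tokio::sync::watch channel (the channel name "apalis::job" is a placeholder, not the actual name apalis uses):

use sqlx::postgres::PgListener;
use tokio::sync::watch;

// One task owns the single LISTEN connection and fans notifications
// out to every worker through a watch channel.
async fn shared_notifications(db_url: &str) -> anyhow::Result<watch::Receiver<String>> {
    let (tx, rx) = watch::channel(String::new());
    let mut listener = PgListener::connect(db_url).await?;
    listener.listen("apalis::job").await?; // placeholder channel name
    tokio::spawn(async move {
        while let Ok(notification) = listener.recv().await {
            // Every subscribed worker sees the latest payload.
            let _ = tx.send(notification.payload().to_owned());
        }
    });
    Ok(rx)
}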
Currently apalis-sql uses version 0.5 of sqlx. This makes it harder than necessary to use in an application that is on sqlx 0.6. I'm unable to use the PostgresStorage::new method due to the differing sqlx versions.
Pass a connection string to PostgresStorage::connect.
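The suggested workaround, sketched (the connection string is a placeholder): this lets apalis manage its own connection instead of sharing the application's sqlx 0.6 pool.

// apalis builds its own internal pool from the connection string.
let storage = PostgresStorage::connect("postgres://localhost/my_app").await?;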
It would be great if SurrealDB support could be added. It can run in memory, as a single node backed by a file, as well as distributed; it has events support and is written in Rust.
SurrealDB in 100 Seconds gives a good overview: https://www.youtube.com/watch?v=C7WFwgDRStM
I searched all the examples in this repo but could not find any sample code for kill, reschedule, and update_by_id.
My use case is that I want to schedule a job and then edit the time it needs to run later (I think this should be reschedule?). I would also like to know how to change the information parameter sent to a scheduled job (I think that's the update_by_id method). Finally, I need a method to completely kill a scheduled job.
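A purely hypothetical sketch of what those calls might look like. The method names come from this issue, but none of the signatures below are verified, and job_id/worker_id/updated_job are placeholders:

// All signatures are guesses, for illustration only.
storage.reschedule(&job_id, Utc::now() + Duration::hours(2)).await?; // move the run time
storage.update_by_id(&job_id, &updated_job).await?;                  // swap the stored payload
storage.kill(&worker_id, &job_id).await?;                            // needs both ids (see the question above)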
Change email-service:
pub async fn send_email(job: Email, _ctx: JobContext) -> Result<JobResult, JobError> {
    log::info!("Attempting to send email to {}", job.to);
    Err(JobError::WorkerCrashed)
}
2022-12-22T05:57:53.733964Z INFO job{job_id="b1ad3d63-0fc1-4c65-8c38-e071b9f51344" current_attempt=1}: email_service: Attempting to send email to 1
2022-12-22T05:57:53.734014Z ERROR job{job_id="b1ad3d63-0fc1-4c65-8c38-e071b9f51344" current_attempt=1}: apalis_core::layers::tracing::on_failure: Job Failed: Attempted to communicate with a crashed background worker done_in=0 ms
async fn handle(&mut self, job: JobRequestWrapper<T>) -> Self::Result {
    match job.0 {
        Ok(Some(job)) => {
            self.handle_job(job).await.unwrap(); // << THIS LINE
        }
        Ok(None) => {
            // on drain
        }
        Err(_e) => {
            todo!()
        }
    };
}
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Failed(WorkerCrashed)', /home/dan/tests/apalis/packages/apalis-core/src/worker/mod.rs:351:44
Hi!
I found your strange example without any code, and nobody has asked about it.
Maybe I'm missing something and now everybody should use telepathy to use the examples? :)
Currently it only shows one worker, send_email. It would be good to showcase more types of jobs being consumed.
Monitor::new()
    .register_with_count(5, move |_| {
        WorkerBuilder::new(sqlite.clone())
            .layer(TraceLayer::new())
            .build_fn(send_email)
    })
    .run()
    .await
Hello, this project depends on sqlx = "^0.6", but the latest version is 0.7. This is introducing errors in the dependency graph. Would it be possible to update the dependency?
I can't find any apalis examples using extensions for shared data.
There is only a wrong example in the docs:
/// Extension data for jobs.
///
/// forked from [axum::Extensions]
/// # In Context
///
/// This is commonly used to share state across jobs.
///
/// ```rust,ignore
/// use apalis::{
/// Extension,
/// WorkerBuilder,
/// JobContext
/// };
/// use std::sync::Arc;
///
/// // Some shared state used throughout our application
/// struct State {
/// // ...
/// }
///
/// async fn email_service(email: Email, ctx: JobContext) {
/// let state: &Arc<State> = ctx.data_opt().unwrap();
/// }
///
/// let state = Arc::new(State { /* ... */ });
///
/// let worker = WorkerBuilder::new(storage)
/// .layer(Extension(state))
/// .build_fn(email_service);
/// ```
Here even the import
use apalis::{
    Extension,
    WorkerBuilder,
    JobContext
};
throws an error: apalis does not have Extension. Please make a working example with shared state.
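For what it's worth, a snippet later in this thread imports it from the layers module, so a working version is probably close to the following sketch (Email and the Redis storage are placeholders borrowed from other snippets here):

use std::sync::Arc;
use apalis::{layers::Extension, prelude::*, redis::RedisStorage};

// Some shared state used throughout the application.
struct State { /* ... */ }

// The shared state is read back out of the job context inside the job.
async fn email_service(_email: Email, ctx: JobContext) {
    let _state: &Arc<State> = ctx.data_opt().unwrap();
}

fn build_worker(storage: RedisStorage<Email>) {
    let state = Arc::new(State {});
    let _worker = WorkerBuilder::new(storage)
        .layer(Extension(state))
        .build_fn(email_service);
}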
The concept would be something like:
.build_pipeline(Pipeline::new(start_here).then_daily((do_this, and_this_too)).until(complete))
This would be a sort of persistent job that runs for a long period of time.
This may possibly go to its own repo
https://github.com/tembo-io/tembo/blob/5d985810c0a305dea469d5630157566a71aeedac/pgmq/core/examples/basic/src/main.rs
It would be really helpful if an example with async-graphql were present. I am using axum and async-graphql, and I cannot access the storage in the gql_ctx. I cannot figure out how to push a new job.
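A sketch of one way this can work with async-graphql's context data. Email and RedisStorage are placeholders borrowed from other snippets in this thread, and Storage::push is assumed from the "push a new job" wording:

use async_graphql::{Context, Object, Result};

struct Mutation;

#[Object]
impl Mutation {
    async fn enqueue_email(&self, ctx: &Context<'_>, to: String) -> Result<bool> {
        // The storage was registered at schema build time:
        // Schema::build(..).data(storage.clone()).finish()
        let storage = ctx.data::<RedisStorage<Email>>()?;
        let mut storage = storage.clone();
        storage
            .push(Email { to })
            .await
            .map_err(|e| async_graphql::Error::new(e.to_string()))?;
        Ok(true)
    }
}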
These APIs are conflicting and making it hard to work with MQs.
geofmureithi/apalis-amqp#1
Hi, thanks for sharing this excellent library with the community.
Currently the job function accepts two arguments: the request type and the JobContext. I want to share a database pool connection with my job function - is there a way I can do it?
Example:
#[derive(Deserialize, Serialize)]
struct Youtubelink(String);

impl Job for Youtubelink {
    const NAME: &'static str = "youtube-transcript";
}

async fn transcript(
    job: impl Into<Youtubelink>,
    ctx: JobContext,
    // pgpool: &PgPool
) -> Result<YoutubeContent, Serror> {
    let data = Youtube::link(&job.into().0).content().await?;
    // do something with pgpool and data
}
I'm looking for something like this, where the db pool is created and registered once, and later used in handlers.
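Based on the Extension doc example quoted earlier, a sketch of injecting a PgPool (names are placeholders and this is not verified against a specific apalis version):

use apalis::layers::Extension;
use sqlx::PgPool;

async fn transcript(job: Youtubelink, ctx: JobContext) -> Result<YoutubeContent, Serror> {
    // The pool registered below is read back out of the job context.
    let pool: &PgPool = ctx.data_opt().unwrap();
    let data = Youtube::link(&job.0).content().await?;
    // do something with pool and data ...
    todo!()
}

// At startup:
// let pool = PgPool::connect(&database_url).await?;
// WorkerBuilder::new(storage)
//     .layer(Extension(pool.clone()))
//     .build_fn(transcript);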
Hi, firstly thanks for the great library - it looks great and I'd love to use it.
I have a project in mind that I'd like to use it on, but the project uses the sqlx feature runtime-tokio-native-tls, and this seems to conflict with apalis, giving the error:
only one of ['runtime-actix-native-tls', 'runtime-async-std-native-tls', 'runtime-tokio-native-tls', 'runtime-actix-rustls', 'runtime-async-std-rustls', 'runtime-tokio-rustls'] can be enabled
I don't think I can easily get around this requirement of using sqlx with the feature runtime-tokio-native-tls enabled in this particular project.
I'm fairly new to Rust, so sorry if I'm missing something obvious, but is it possible to use this library in a manner compatible with the sqlx feature runtime-tokio-native-tls?
https://github.com/google/osv-scanner finds a lot of security issues in the dependencies.
It would be a good idea to bump many of the deps to fix these.
osv-scanner --lockfile Cargo.lock
Scanned /home/jayvdb/rust/apalis/Cargo.lock file and found 418 packages
╭─────────────────────────────────────┬───────────┬─────────────────┬─────────┬────────────╮
│ OSV URL (ID IN BOLD) │ ECOSYSTEM │ PACKAGE │ VERSION │ SOURCE │
├─────────────────────────────────────┼───────────┼─────────────────┼─────────┼────────────┤
│ https://osv.dev/RUSTSEC-2021-0145 │ crates.io │ atty │ 0.2.14 │ Cargo.lock │
│ https://osv.dev/GHSA-qc84-gqf4-9926 │ crates.io │ crossbeam-utils │ 0.7.2 │ Cargo.lock │
│ https://osv.dev/RUSTSEC-2022-0041 │ │ │ │ │
│ https://osv.dev/RUSTSEC-2021-0141 │ crates.io │ dotenv │ 0.15.0 │ Cargo.lock │
│ https://osv.dev/GHSA-jw36-hf63-69r9 │ crates.io │ libsqlite3-sys │ 0.24.2 │ Cargo.lock │
│ https://osv.dev/RUSTSEC-2022-0090 │ │ │ │ │
│ https://osv.dev/GHSA-5wg8-7c9q-794v │ crates.io │ lock_api │ 0.3.4 │ Cargo.lock │
│ https://osv.dev/GHSA-gmv4-vmx3-x9f3 │ │ │ │ │
│ https://osv.dev/GHSA-hj9h-wrgg-hgmx │ │ │ │ │
│ https://osv.dev/GHSA-ppj3-7jw3-8vc4 │ │ │ │ │
│ https://osv.dev/GHSA-vh4p-6j7g-f4j9 │ │ │ │ │
│ https://osv.dev/RUSTSEC-2020-0070 │ │ │ │ │
│ https://osv.dev/RUSTSEC-2020-0016 │ crates.io │ net2 │ 0.2.38 │ Cargo.lock │
│ https://osv.dev/GHSA-fg7r-2g4j-5cgr │ crates.io │ tokio │ 0.1.22 │ Cargo.lock │
│ https://osv.dev/RUSTSEC-2021-0124 │ │ │ │ │
╰─────────────────────────────────────┴───────────┴─────────────────┴─────────┴────────────╯
The current implementation allocates a random worker-id from a UUID. This works well, but it may produce anomalies: job processing is unaffected, yet invalid data may be shown when the number of workers is queried.
See #41
Currently a cron job is scheduled and run in the worker itself.
Another use case is to schedule and re-schedule a job on the job server, so that only one worker can pick up the job.
I have a Docker container that inherits from scratch. I have an apalis cron job that is supposed to execute at 0000 hrs and 1200 hrs every day. It does execute, but at exactly 0530 hrs for me (I live in the Asia/Kolkata timezone, which has that offset). How can I configure apalis so that it executes at 0000 hrs in my timezone?
Setting the TZ env variable does not work, probably due to the scratch Docker environment.
I'm not really sure if this problem even belongs in this repository, but I would love any pointers you have on debugging this issue.
When trying to terminate the application with Ctrl + C (or by sending SIGINT directly) the monitor shuts down, but the application is stuck until it receives a SIGTERM signal.
$ cargo run --package axum-example
Finished dev [unoptimized + debuginfo] target(s) in 0.18s
Running `target/debug/axum-example`
2022-10-07T13:33:36.526309Z DEBUG axum_example: listening on 127.0.0.1:3000
2022-10-07T13:33:36.526681Z DEBUG apalis_core::worker::monitor: Listening shut down command (ctrl + c)
2022-10-07T13:33:46.759826Z DEBUG apalis_core::worker::monitor: Workers shutdown complete
^C^C
I tested this with the axum example and my own app; both have the same result.
The actix-web example does exit properly, so maybe axum needs some special handling for signals.
$ cargo run --package actix-web-example
Finished dev [unoptimized + debuginfo] target(s) in 0.19s
Running `target/debug/actix-web-example`
[2022-10-07T13:40:11Z INFO actix_server::builder] Starting 4 workers
[2022-10-07T13:40:11Z INFO actix_server::server] Actix runtime found; starting in Actix runtime
[2022-10-07T13:40:11Z DEBUG apalis_core::worker::monitor] Listening shut down command (ctrl + c)
[2022-10-07T13:40:13Z INFO actix_server::server] SIGINT received; starting forced shutdown
[2022-10-07T13:40:13Z INFO actix_server::worker] Shutting down idle worker
[2022-10-07T13:40:13Z DEBUG actix_server::accept] Paused accepting connections on 127.0.0.1:8000
[2022-10-07T13:40:13Z INFO actix_server::worker] Shutting down idle worker
[2022-10-07T13:40:13Z INFO actix_server::accept] Accept thread stopped
[2022-10-07T13:40:13Z INFO actix_server::worker] Shutting down idle worker
[2022-10-07T13:40:13Z INFO actix_server::worker] Shutting down idle worker
[2022-10-07T13:40:13Z DEBUG apalis_core::worker::monitor] Workers shutdown complete
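If it helps triage: axum itself doesn't install signal handlers, so the usual pattern is to pass an explicit shutdown future to the server. A sketch of what the axum example might add, using only standard tokio/axum APIs (not apalis-specific, so treat it as a starting point):

use tokio::signal;

// Resolves on either SIGINT (ctrl+c) or SIGTERM.
async fn shutdown_signal() {
    let ctrl_c = async { signal::ctrl_c().await.expect("ctrl_c handler") };
    #[cfg(unix)]
    let terminate = async {
        signal::unix::signal(signal::unix::SignalKind::terminate())
            .expect("SIGTERM handler")
            .recv()
            .await;
    };
    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();
    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }
}

// Then wire it into the server:
// axum::Server::bind(&addr)
//     .serve(app.into_make_service())
//     .with_graceful_shutdown(shutdown_signal())
//     .await?;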
Currently we have docs showing the main way of building worker job functions:
async fn do_job(job: Job, ctx: JobContext) -> Result<JobResult, JobError> {
    // ...
}
Nonetheless, apalis can accept more than that, e.g.:
async fn do_job(job: Job, ctx: JobContext) {
    // ...
}
We should be able to support anyhow::Result too.
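What the proposed anyhow support might look like (hypothetical; not yet in the library, and do_something_fallible is a stand-in):

async fn do_job(job: Job, ctx: JobContext) -> anyhow::Result<()> {
    do_something_fallible()?; // any error type converts via `?`
    Ok(())
}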
Currently, integration tests run on CI only for the MySQL backend. Let's add integration tests for the following backends.
Let's strive to have a different pull request for each storage; this will help with collaboration.
Also, before starting on any backend, please add a comment and get assigned.
It is currently not possible to use apalis on the same Postgres database as the main application. This is due to the sqlx Migrator currently not supporting multiple applications using the same database. As the Postgres migrations already use a separate schema, it would be quite nice to be able to set up apalis in the same database as the main application itself.
I currently see two options, one being a dedicated sqlx migrator for Postgres until sqlx supports multiple applications in the same database. This isn't a problem for SQLite or MySQL, as both support PostgreSQL-like schemas.
In the example below, there are two ways of building workers: one involving ServiceBuilder and the other involving WorkerBuilder. This should be fixed so that there is only one approach.
#[tokio::main]
async fn main() -> Result<()> {
    let storage = RedisStorage::new(redis).await?;
    let schedule = Schedule::from_str("@daily").unwrap();
    let service = ServiceBuilder::new()
        .layer(Extension(storage.clone()))
        .service(job_fn(enqueue_daily));
    let cron_worker = CronWorker::new(schedule, service);
    Monitor::new()
        .register(cron_worker)
        .register_with_count(2, move || {
            WorkerBuilder::new(storage.clone())
                .build_fn(do_distributed_stuff)
        })
        .run()
        .await
        .unwrap();
    Ok(())
}
Is it possible to deploy a job that gets triggered at a specific time? For example: 2 hrs from now (a duration), or at 24-08-2023 10:05:30 (a specific time).
Right now I do it manually. It would be nice if it were automatic.
Steps needed:
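As a side note on the question above: a snippet later in this thread already schedules jobs at an absolute time with Storage::schedule, so a duration-based trigger can be derived from it. A sketch (Event is a placeholder job type):

use chrono::{Duration, Utc};

// 2 hrs from now (duration) ...
let run_at = Utc::now() + Duration::hours(2);
// ... or an absolute instant such as 2023-08-24 10:05:30 built via chrono.
storage.schedule(Event { id: 1 }, run_at).await?;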
I want to build workers for multiple async functions. How can I do this? The following cannot be achieved:
Monitor::new()
    .register_with_count(6, move |_| {
        WorkerBuilder::new(worker_storage.clone())
            .layer(SentryJobLayer)
            .layer(TraceLayer::new())
            .build_fn(queue::send_email)
            .build_fn(queue::send_message)
            .build_fn(queue::and_more)
    })
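A sketch of the usual alternative: register one worker (group) per job type, each built from its own storage, using the Monitor::register_with_count API shown elsewhere in this thread (the storage names are placeholders):

Monitor::new()
    .register_with_count(6, move |_| {
        WorkerBuilder::new(email_storage.clone())
            .layer(TraceLayer::new())
            .build_fn(queue::send_email)
    })
    .register_with_count(6, move |_| {
        WorkerBuilder::new(message_storage.clone())
            .layer(TraceLayer::new())
            .build_fn(queue::send_message)
    })
    .run()
    .await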
Currently polling happens continuously within a specific interval.
A busy worker should not spend resources competing for jobs with other workers who might be idle.
- Add a WorkerContext that checks if the worker is busy.
- Pass WorkerContext instead of WorkerId into consume_jobs(&ctx).
- Use an is_idle/is_busy method in the polling logic.
Adding support for sqlx AnyPool would allow any database to be used dynamically.
apalis uses sleepers when polling. The current sleeper implementation uses cfg flags and is not optimal.
In the next version, it might be good to have these together with an Executor trait:
trait Executor {
    fn spawn<F>(&self, future: F);
    fn sleep(&self) -> SleepFuture;
}
I might be blind but it would be cool to add a tip/"buy a coffee" etc thing in your README. I'd definitely contribute!
It's difficult to check the error messages from the DB on job failures.
Let the workers emit logs to stdout and stderr.
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
This repository currently has no open or pending branches.
Cargo.toml
document-features 0.2
tokio 1
async-std 1
tower 0.4
tracing-futures 0.2.5
sentry-core 0.34.0
metrics 0.23.0
metrics-exporter-prometheus 0.15
thiserror 1.0.59
futures 0.3.30
pin-project-lite 0.2.14
uuid 1.8
ulid 1
serde 1.0
tracing 0.1.40
criterion 0.5.1
pprof 0.13
paste 1.0.14
serde 1
tokio 1
redis 0.25.3
sqlx 0.7.4
packages/apalis-core/Cargo.toml
serde 1.0
futures 0.3.30
tower 0.4
pin-project-lite 0.2.14
async-oneshot 0.5.9
thiserror 1.0.59
ulid 1.1.2
futures-timer 3.0.3
serde_json 1
document-features 0.2
tokio 1.37.0
tokio-stream 0.1.15
packages/apalis-cron/Cargo.toml
cron 0.12.1
futures 0.3.30
tower 0.4
chrono 0.4.38
async-stream 0.3.5
async-std 1.12.0
tokio 1
serde 1.0
packages/apalis-redis/Cargo.toml
redis 0.25.3
serde 1
log 0.4.21
chrono 0.4.38
async-stream 0.3.5
futures 0.3.30
tokio 1
async-std 1.12.0
async-trait 0.1.80
tokio 1
packages/apalis-sql/Cargo.toml
sqlx 0.7.4
serde 1
serde_json 1
log 0.4.21
futures 0.3.30
async-stream 0.3.5
tokio 1
futures-lite 2.3.0
async-std 1.12.0
tokio 1
once_cell 1.19.0
.github/workflows/bench.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
boa-dev/criterion-compare-action v3
postgres 16
mysql 8
.github/workflows/cd.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
actions-rs/cargo v1
actions-rs/cargo v1
actions-rs/cargo v1
actions-rs/cargo v1
actions-rs/cargo v1
actions-rs/cargo v1
.github/workflows/ci.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
actions-rs/cargo v1
actions-rs/cargo v1
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
actions-rs/cargo v1
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
actions-rs/cargo v1
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
actions-rs/cargo v1
.github/workflows/mysql.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
mysql 8
.github/workflows/postgres.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
postgres 16
.github/workflows/redis.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
.github/workflows/sqlite.yaml
actions/checkout v4@692973e3d937129bcbf40652eb9f2f61becf3332
actions-rs/toolchain v1
Hello, in some cases (as follows) a worker may stop listening for new tasks; this happens when the connection between the worker and Redis is interrupted.
Start a worker:
pub async fn send_email(job: Email, _ctx: JobContext) -> impl IntoJobResponse {
    println!("sleeping");
    // Blocking sleep to simulate a long-running job.
    let dur = time::Duration::from_millis(5_000);
    thread::sleep(dur);
    println!("job number {}", job.text);
}
#[tokio::main]
async fn main() -> Result<()> {
    let storage = RedisStorage::connect("redis://127.0.0.1/").await.expect("");
    Monitor::new()
        .register(
            WorkerBuilder::new(storage.clone())
                .build_fn(send_email),
        )
        .run()
        .await
}
Then, with a client, send 20 tasks for example.
Once some jobs are processed, restart Redis; under macOS: brew services restart redis.
If the worker was already processing a job (e.g. in our example, sleeping...), then the worker will finish this task but will NOT handle the rest of the 20 initial tasks.
But you can start another worker in parallel, and it will handle the rest of the tasks without any issue.
If this issue can be fixed, it would enable a try-reconnect feature in the middle of the application's lifetime and also at startup (Celery-style).
Hello!
First of all, thanks for your amazing work!!
Here are some features that could be really amazing.
A few questions, by the way: does a worker run in a dedicated thread? Is it possible to run multiple instances of the worker app?
The storage API for MySQL doesn't support this feature.
// Worker not seen in 5 minutes yet has running jobs
StorageWorkerPulse::RenqueueOrpharned { count: _ } => {
    // ...
    Ok(true)
}
The goal is to rewrite the query at mysql.rs#L202 to be MySQL-compatible and uncomment it.
Currently, version 0.3 is designed with the actor model in mind. This works OK but has several problems, like those described in #42 and #41.
The next version will lose the following:
Rethinking these issues, we can see that a better approach is to use tower layers.
We can change these to optional layers, e.g.:
The example provided doesn't work with a simple string argument for WorkerBuilder's new method, i.e. WorkerBuilder::new("email-worker-1"); instead it works if it's WorkerBuilder::new(storage.clone()). I feel .with_storage(storage.clone()) is defunct.
apalis version: 0.3.6
The only example with multiple jobs is examples/rest-api, but it uses a different storage type for each job. I have a requirement wherein my application has multiple jobs, but I want to use the same database for each job. This does not seem possible for now, since when I do this, new jobs are often not deployed.
Right now I am creating a new in-memory database for each job I have, but I would like to store them in an actual SQLite database. How can I do this?
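A sketch of the shared-database direction, assuming the SQL storages can be constructed from an existing sqlx pool — this is not verified against a specific apalis version, and Email and Sms are placeholder job types:

use sqlx::SqlitePool;

let pool = SqlitePool::connect("sqlite://jobs.db").await?;
// One physical database, one logical storage per job type.
let email_storage: SqliteStorage<Email> = SqliteStorage::new(pool.clone());
let sms_storage: SqliteStorage<Sms> = SqliteStorage::new(pool.clone());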
I have my own migrations folder in the root of the project. Apalis fails to run workers with a database migration error:
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/controller`
Error: Database("migration 20230420060200 was previously applied but is missing in the resolved migrations")
Note: 20230420060200 is my project's migration file.
Hi, I really like the crate. I'm looking for a way to create a pipeline where I can post jobs from another job. I can pass the storage as an Extension, but is there another way?
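For reference, the Extension approach looks roughly like this, based on the snippet further down this thread that layers Extension(storage.clone()). Email and Followup are placeholder job types, and Storage::push is assumed:

async fn send_email(job: Email, ctx: JobContext) -> Result<JobResult, JobError> {
    // The worker was built with .layer(Extension(followup_storage.clone())).
    let mut followups: RedisStorage<Followup> =
        ctx.data_opt::<RedisStorage<Followup>>().unwrap().clone();
    followups.push(Followup { email_id: job.id }).await.unwrap();
    Ok(JobResult::Success)
}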
I started exploring this crate and attempted the following:
use std::ops::Add;

use apalis::{layers::Extension, prelude::*, redis::RedisStorage};
use chrono::{Duration, Utc};
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, Serialize)]
struct Event {
    id: u64,
}

impl Job for Event {
    const NAME: &'static str = "test::Event";
}

async fn produce_jobs(mut storage: RedisStorage<Event>) -> anyhow::Result<()> {
    for index in 0..5 {
        tracing::info!("Scheduling event: {}", index);
        let time = Utc::now().add(Duration::seconds((2 * (index + 1)) as i64));
        storage.schedule(Event { id: index }, time).await?
    }
    Ok(())
}

async fn handle_event(job: Event, _ctx: JobContext) -> Result<JobResult, JobError> {
    tracing::info!("Handling event: {:?}", job);
    Ok(JobResult::Success)
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt::init();
    let storage = RedisStorage::connect("redis://127.0.0.1/").await?;
    let storage_clone = storage.clone();
    let _ = tokio::spawn(async move {
        tokio::time::sleep(tokio::time::Duration::from_secs(3)).await;
        produce_jobs(storage_clone).await.ok();
    });
    Monitor::new()
        .register(
            WorkerBuilder::new(storage.clone())
                .layer(Extension(storage.clone()))
                .build_fn(handle_event),
        )
        .run()
        .await
}
Events queued after the app has started do not get handled at all. Restarting the application does handle them retroactively, though. I was expecting that after 3 seconds, one queued event would be handled every 2 seconds. Is there something I am missing here?
{
    "job": null,
    "context": {
        "id": "JID-01H1F31FYEDKGH94XQEDX4ZSYA",
        "status": "Pending",
        "run_at": "2023-05-27T17:10:33.166137782Z",
        "attempts": 0,
        "max_attempts": 25,
        "last_error": null,
        "lock_at": null,
        "lock_by": null,
        "done_at": null
    }
}
{"v":0,"name":"majinn","msg":"Sending email: EmailApalis","level":30,"hostname":"samuel","pid":87291,"time":"2023-05-27T17:10:32.792390087Z","line":23}
I'm getting this from Redis and the trace, respectively. Any hints on what the cause could be?