Comments (11)
Yes it's a small query and the table is indexed well so I believe we are fine but this is a good concern, thanks for discussing it.
The big picture that I was thinking of last night is:
What if we made the process of registering the new job is happening through action scheduler and CRON too?
So the process will be as follows:
- The not cached page is visited for the first time, we will create a row in RUCSS table with the status
pending_create
for example and in this step we will not send the RUCSS SaaS job creation request. - Will have a CRON to get some of those
pending_create
rows and create async job in action scheduler to create them. - Once the job is created, the status will be pending and get into the normal process.
This will give us the control of how many jobs are created at once and not to miss any job but also will add a layer of complexity.
from wp-rocket.
From @wordpressfan: (Slack)
Let's think about this, we need a way to stop sending the request of registering a new job when we have enough pending jobs (we can define the word enough later)
to do that we can easily create a new filter here:
As a switch to enable/disable registering new jobs from the plugin, then we can add a callback to change this switch based on the number of pending+inprogress jobs so for example we get 100 pending rows to process every minute, then we need to stop registering new jobs once we reach 500 rows (pending+inprogress)
from wp-rocket.
@wordpressfan I agree with the approach. However, it should not impact the 408 retry mechanism here
What do you think of adding an optional argument force
to add_url_to_the_queue
, set to False
by default.
And in the method add_url_to_the_queue
:
if ( force == False ) {
$max_pending_jobs = apply_filters( 'rocket_rucss_max_pending_jobs', MAGIC_CONSTANT )
if ( 0 < $max_pending_jobs) {
$nb_pending_jobs = count_pending_jobs()
if ( nb_pending_jobs >= max_pending_jobs) {
return false;
}
}
}
Where MAGIC_CONSTANT is to be defined (> 100 not to block the default batches, and not too big to avoid the problem we currently have. Maybe 300?)
And where count_pending_jobs
is just a query on the RUCSS DB.
@piotrbak, what do you think in terms of product impact?
from wp-rocket.
@wordpressfan @vmanthos what was discussed during the daily is that we could replace MAGIC_CONSTANT
in the above suggestion with a calculated value?
The one that could be relevant would be this one: n * apply_filters( 'rocket_rucss_pending_jobs_cron_rows_count', 100 )
What do you think?
This way, we would scale how many jobs we send based on how many jobs we are trying to collect.
I would suggest n to be ~3 (it has to be more than 2 and not to big...)
So something like:
$max_pending_jobs = apply_filters('rocket_rucss_max_pending_jobs', 3 * apply_filters( 'rocket_rucss_pending_jobs_cron_rows_count', 100 ) )
(And please, let's not hard code 3 and 100 but use (shared) constants instead? ; maybe put the inner apply_filter in a method as we use it at different places in the codebase)
from wp-rocket.
I'm just worried about hammering the DB with the COUNT query every time a job is added 🤔
It's not a costly query and the DB is small... I don't really know
from wp-rocket.
Reproduce the issue
Yes I can see the issue in one of our customer's site.
Root cause
Jobs are kept in plugin's database even when they are removed from SaaS side so every check status after that time will fail because SaaS at this point returns empty response.
Scope a solution
As We mentioned above we need to add a new status pending_create
and stop registering jobs into SaaS directly.
-
Here
wp-rocket/inc/Engine/Optimization/RUCSS/Database/Queries/UsedCSS.php
Lines 164 to 179 in 8f94841
we need to make $job_id and $queue_name optional arguments, also change the status here from pending topending_create
-
Make $job_id argument optional here with default value as empty string:
wp-rocket/inc/Engine/Optimization/RUCSS/Database/Queries/UsedCSS.php
Lines 209 to 225 in 8f94841
And change the status to pending_create
.
-
Remove this part of code from here
wp-rocket/inc/Engine/Optimization/RUCSS/Controller/UsedCSS.php
Lines 219 to 236 in 06381b8
-
Change the functionality of this method
add_url_to_the_queue
to only:
$used_css_row = $this->used_css_query->get_row( $url, $is_mobile );
if ( empty ( $used_css_row ) ) {
$this->used_css_query->create_new_job( $url, '', '', $is_mobile );
} else {
$this->used_css_query->reset_job( $used_css_row->id );
}
-
Remove this part:
wp-rocket/inc/Engine/Optimization/RUCSS/Controller/UsedCSS.php
Lines 684 to 687 in 06381b8
-
Remove this part:
wp-rocket/inc/Engine/Optimization/RUCSS/Controller/UsedCSS.php
Lines 866 to 869 in 06381b8
-
For the CRON we can use the current CRON
rocket_rucss_pending_jobs
to process also pending_create jobs by adding a new callback here:
-
Here: https://github.com/wp-media/wp-rocket/blob/06381b8860eb46e570911b2812652f54a676dd6d/inc/Engine/Optimization/RUCSS/Controller/UsedCSS.php#L871-L870
Create a new methodprocess_pending_create_jobs
that will be called exactly likeprocess_pending_jobs
-
In this new method
process_pending_create_jobs
we will get rows with pending_create status and do the same functionality we already do insideprocess_pending_jobs
with a filter to change the number of rows to handle, with a default value (we need to agree on that default value because we only need to keep the number of pending rows below a specific number and use this max_pending_count to decide the number of pending_create to process) -
Here we are between two approaches of creating async action scheduler's task for each row to call SaaS to create the job.
Effort
[M] or [L]
@CrochetFeve0251 @wp-media/php to review plz.
from wp-rocket.
Looks good overall. Some comments:
On point 1:
we need to make $job_id and $queue_name optional arguments
-> Let's just remove those arguments in the method, and pass an empty string to the DB, no?- To ease discussion, let's not re-use pending in the status name. Maybe 'to_submit'?
Point 2: Same here, let's just remove the job_id argument.
Point 3 & 4:
Here again, for naming clarity we should properly name methods to avoid confusion.
If I recap your suggestion, my understanding would be:
public function prepare_job( $url, $is_mobile ) {
$used_css_row = $this->used_css_query->get_row( $url, $is_mobile );
if ( empty ( $used_css_row ) ) {
$this->used_css_query->create_new_job( $url, '', '', $is_mobile );
} else {
$this->used_css_query->reset_job( $used_css_row->id );
}
}
public function send_job( $used_css_row ) {
$url = $used_css_row->url;
$is_mobile = $used_css_row->is_mobile;
// CURRENT CONTENT OF add_url_to_the_queue
/**
* Filters the RUCSS safelist
*
* @since 3.11
*
* @param array $safelist Array of safelist values.
*/
$safelist = apply_filters( 'rocket_rucss_safelist', $this->options->get( 'remove_unused_css_safelist', [] ) );
/**
* Filters the styles attributes to be skipped (blocked) by RUCSS.
*
* @since 3.14
*
* @param array $skipped_attr Array of safelist values.
*/
$skipped_attr = apply_filters( 'rocket_rucss_skip_styles_with_attr', [] );
$skipped_attr = ( is_array( $skipped_attr ) ) ? $skipped_attr : [];
$config = [
'treeshake' => 1,
'rucss_safelist' => $safelist,
'skip_attr' => $skipped_attr,
'is_mobile' => $is_mobile,
'is_home' => $this->is_home( $url ),
];
$add_to_queue_response = $this->api->add_to_queue( $url, $config );
if ( 200 !== $add_to_queue_response['code'] ) {
Logger::error(
'Error when contacting the RUCSS API.',
[
'rucss error',
'url' => $url,
'code' => $add_to_queue_response['code'],
'message' => $add_to_queue_response['message'],
]
);
return false;
}
// END OF CURRENT CONTENT OF add_url_to_the_queue
/**
* Lock preload URL.
*
* @param string $url URL to lock
*/
do_action( 'rocket_preload_lock_url', $url );
// We got jobid and queue name so save them into the DB and change status to be pending.
$this->used_css_query->add_job_reference(
$add_to_queue_response['contents']['jobId'],
$add_to_queue_response['contents']['queueName'],
);
add_job_reference
has to be created.
We can delete add_url_to_the_queue
and replace occurrences with prepare_job
in the if ( empty( $used_css ) ) {
we would just call prepare_job
.
Point 5: Make sure to add do_action( 'rocket_preload_unlock_url', $row_details->url );
after calling prepare_job
: the job will not be immediately sent to the server, so we should unlock preload in the meantime.
Point 9: We can use this limit for the number of "to_submit" jobs:
$max_pending_jobs = apply_filters('rocket_rucss_max_pending_jobs', 3 * apply_filters( 'rocket_rucss_pending_jobs_cron_rows_count', 100 ) )
Point 10: Do you foresee an issue in using a for loop to go through the (up to) max_pending_jobs RUL and making the API calls? for fetching the results, we use async tasks for each API call. Is there a reason for this approach?
from wp-rocket.
@wordpressfan The grooming looks good but there is one point I am not sure about is the rocket_preload_lock_url
filter disapearing. Why are we sure we do not need to lock the URLs for the preload anymore?
Another thing this issue is working on the exact same methods with different AC. I guess we need some communication to not end up with monster conflicts.
from wp-rocket.
rocket_preload_lock_url
I think it should be kept and used in the CRON job part, when actually sending the job to the SaaS. that could be in the send_job
method as suggested above
from wp-rocket.
After a discussion with @MathieuLamiot and @CrochetFeve0251
- Yes we will keep the
rocket_preload_lock_url
action but will move it from its current place to be in the method that will run for each url after sending the job to SaaS. - For the action scheduler part we need to do the same functionality we do now for pending rows, so we will create Async job for every database row to send the request to SaaS.
from wp-rocket.
@CrochetFeve0251 Can you provide a documentation for the new submission system and the related filters, so that support can learn how to use them? Thanks!
from wp-rocket.
Related Issues (20)
- 3.16 - Secure LCP/ATF data from the AJAX endpoint and DB HOT 2
- Commented URLs at home page shouldn't be sent to saas
- 3.16 - When RUCSS & LCP/ATF are both enabled, URLs sent to the SaaS should have the wpr_imagedimensions=1 query string HOT 3
- 3.16 - Error & timeout handling of Beacon Script HOT 4
- Reduce the processing on not-cached page when checking the LCP/ATF data HOT 12
- Should fetch up to 10 valid (internal) links from Homepage and expand search for valid links beyond the 1st 10 if needed HOT 5
- 3.16 - Remove console.log of the beacon script when the debug mode is not enabled HOT 3
- 3.16 Preload - Shouldn't fetch RSS feed or restAPI links from the Homepage HOT 5
- In case rocket_atf_warmup_links filter is given a wrong input, the fetch mechanism should default to 10 URLs + home
- LazyLoad for CSS background images breaks linear-gradient applied to images
- Should fetch the 1st 10 links in home page on multisite HOT 3
- WPR Settings page fatal error when debug file is so large even though debug mode is disabled
- LCP/ATF Warm up process not blocked on local env
- Warmup needs to generate desktop and mobile data HOT 3
- LCP/ATF data is malformed when no LCP is found by the beacon HOT 2
- Clear Used CSS for this URL does not appear in the post editor HOT 1
- Exploratory testing on 3.16 HOT 7
- Guard beacon script against saving not expected values into the database HOT 22
- Update build script to generate production scripts individually HOT 2
- Fetchpriority of LCP is not applied on the LCP markup when the LCP URL is relative HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from wp-rocket.