
capo.js's Introduction

Get your <head> in order

Inspired by Harry Roberts' work on ct.css and Vitaly Friedman's Nordic.js 2022 presentation:

image

Why it matters

The order of elements in the <head> can affect the (perceived) performance of the page.

This script helps you identify which elements are out of order.

How to use it

New: Install the Capo Chrome extension

  1. Copy capo.js
  2. Run it in a new DevTools snippet, or use a bookmarklet generator
  3. Explore the console logs
capo screenshot

For applications that add lots of dynamic content to the <head> on the client, it'd be more accurate to look at the server-rendered <head> instead.

Chrome extension

Capo.js Chrome extension

WIP see crx/

WebPageTest

You can use the capo WebPageTest custom metric to evaluate only the server-rendered HTML <head>. Note that because this approach doesn't output to the console, we lose the visualization.

BigQuery

You can also use the httparchive.fn.CAPO function on BigQuery to process HTML response bodies in the HTTP Archive dataset. Similar to the WebPageTest approach, the output is very basic.

Other

Alternatively, you can use local overrides in DevTools to manually inject the capo.js script into the document so that it runs before anything else, e.g. as the first child of <body>. Harry Roberts also has a nifty video showing how to use this feature. This has some drawbacks as well. For example, the inline script might be blocked by CSP.

Another idea would be to use something like Cloudflare Workers to inject the script into the HTML stream. To work around CSP issues, you can write the worker in such a way that it parses out the correct nonce and adds it to the inline script. (Note: Not tested, but please share examples if you get it working! 😄)
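
As a rough, untested sketch of the nonce idea, the helpers below pull the nonce out of a Content-Security-Policy header value and build an inline script tag that carries it. The names extractNonce and buildInlineScript are hypothetical, and wiring them into a worker's response stream is deliberately left out.

```javascript
// Hypothetical helper: return the nonce from the script-src directive of a
// CSP header value, or null when there isn't one.
function extractNonce(cspHeader) {
  const directives = (cspHeader || '').split(';').map((d) => d.trim());
  const scriptSrc = directives.find((d) => /^script-src(\s|$)/i.test(d));
  if (!scriptSrc) return null;
  const match = scriptSrc.match(/'nonce-([^']+)'/);
  return match ? match[1] : null;
}

// Hypothetical helper: wrap script source in an inline <script>,
// adding the nonce attribute only when a nonce was found.
function buildInlineScript(source, nonce) {
  const attr = nonce ? ` nonce="${nonce}"` : '';
  return `<script${attr}>${source}</script>`;
}

const csp = "default-src 'self'; script-src 'self' 'nonce-abc123=='";
console.log(buildInlineScript('/* capo.js source */', extractNonce(csp)));
```

This only looks at script-src. A policy that relies on default-src or script-src-elem would need extra handling.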

Summary view

The script logs two info groups to the console: the actual order of the <head>, and the optimal order. In this collapsed view, you can see at a glance whether there are any high impact elements out of order.

Each "weight" has a corresponding color, with red being the highest and blue/grey being the lowest. See capo.js for the exact mapping.

Here are a few examples.

image

docs.github.io

image

web.dev

image

stackoverflow.com

image

Detailed view

Expanding the actual or sorted views reveals the detailed view. This includes an itemized list of each <head> element and its weight as well as a reference to the actual or sorted <head> element.

Here you can see a drilled-down view of the end of the <head> for the NYT site, where high impact origin trial meta elements are set too late.

image

capo.js's People

Contributors

rviscomi, dgrammatiko, kurtextrem, kilian, radum, stoyan

capo.js's Issues

Validate meta CSP directives

Address a TODO in the code to validate meta CSP elements to ensure that they don't include forbidden directives: sandbox, report-uri, and frame-ancestors.
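
A minimal sketch of what that check could look like, assuming the validator receives the content attribute of the meta CSP element as a string. The function name is hypothetical and is not taken from capo.js.

```javascript
// Directives that this issue lists as forbidden in a <meta> CSP.
const FORBIDDEN_META_CSP_DIRECTIVES = ['sandbox', 'report-uri', 'frame-ancestors'];

// Hypothetical helper: return the forbidden directive names found in the
// content attribute of a meta[http-equiv=content-security-policy] element.
function findForbiddenMetaCspDirectives(content) {
  const names = content
    .split(';')
    .map((directive) => directive.trim().split(/\s+/)[0].toLowerCase())
    .filter(Boolean);
  return FORBIDDEN_META_CSP_DIRECTIVES.filter((name) => names.includes(name));
}

console.log(findForbiddenMetaCspDirectives("default-src 'self'; frame-ancestors 'none'; sandbox"));
```

Only the directive name (the first token of each semicolon-separated part) is compared, so source values such as 'none' are ignored.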

Support demo permalinks

It could be useful to have a permanent link to specific capo.js results. I really like how the Lighthouse viewer uses GitHub gists as a storage layer, which completely absolves the LH developers of any data storage concerns (ownership, costs, legal).

There are a few use cases:

  • An extension user wants to take a "snapshot" of the dynamic capo.js results to share with their development team.
  • I want to showcase a few different validation features of capo.js by using preset HTML snippets
  • A user wants to get a live look at the same URL periodically (URL persistence already exists)

If a gist ID is passed into the demo page, we should be able to read from the gist without an API. The contents could be added to the raw HTML field.

Validate response headers

TODO: need to think through what exactly we'd be validating but it seems useful to include HTTP response headers in the scope of the tool. Only the HTML document should be validated.

Improve viewport warning for `shrink-to-fit`

According to Stack Overflow and this blog post, shrink-to-fit is a non-standard directive briefly supported by older versions of Safari (9.0–9.2, circa 2015).

Capo will warn if it sees this directive, flagging it as invalid:

❌ Invalid viewport directive "shrink-to-fit".

Here's an example of it in the wild: https://rviscomi.github.io/capo.js/user/demo/?url=https%3A%2F%2Fwww.cnn.com%2F (cnn.com)

<meta name="viewport" content="width=device-width,initial-scale=1,shrink-to-fit=no">
image

We could provide better messaging for developers who believe that this directive is valid and they still need it, similar to the IE-specific obsoletion warnings.

Incorrect origin trial validation for subdomain

image

On https://www.youtube.com/ there's an origin trial registered to https://youtube.com with isSubdomain set to true:

{
    "origin": "https://youtube.com:443",
    "feature": "PrivacySandboxAdsAPIs",
    "expiry": "2023-09-19T23:59:59.000Z",
    "isSubdomain": true
}

capo.js warns that the origin is invalid. There shouldn't be any validation warnings for this origin trial because the isSubdomain check permits it to be used by the www subdomain.
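
One way the check could account for isSubdomain is sketched below. The function name is hypothetical, and requiring the scheme and port to match for subdomain tokens is an assumption rather than something stated in this issue.

```javascript
// Hypothetical helper: does an origin trial token registered for tokenOrigin
// apply to a page served from pageOrigin?
function isTokenOriginValid(tokenOrigin, isSubdomain, pageOrigin) {
  const token = new URL(tokenOrigin);
  const page = new URL(pageOrigin);

  // URL normalizes default ports, so https://youtube.com:443 and
  // https://youtube.com have the same origin.
  if (token.origin === page.origin) return true;
  if (!isSubdomain) return false;

  // Assumption: scheme and port must still match for subdomain tokens.
  return (
    token.protocol === page.protocol &&
    token.port === page.port &&
    page.hostname.endsWith(`.${token.hostname}`)
  );
}

console.log(isTokenOriginValid('https://youtube.com:443', true, 'https://www.youtube.com/'));
```

The leading dot in the endsWith check keeps an unrelated host such as notyoutube.com from passing as a subdomain.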

Log origin trial metadata

The blanket rule for all meta[http-equiv] tags is to put them first, but there's more nuance than that. Some origin trials alter the behavior or feature support of the page and their meta tags really would benefit from coming first. Other origin trials are less consequential.

To help differentiate between them, and to give developers a better sense of urgency when a critical OT is too low in the head, include some metadata logging.

@samdutton created this excellent OT debug tool, which includes a decoder function for the OT token:

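// Assumption: base64decode returns a Uint8Array. The tool's own helper isn't
// shown here, so this stand-in uses atob.
function base64decode(str) {
    return Uint8Array.from(atob(str), (c) => c.charCodeAt(0));
}

function decodeToken(token) {
    const buf = base64decode(token);
    const view = new DataView(buf.buffer);
    const version = view.getUint8(0);
    const signature = buf.slice(1, 65);
    const length = view.getUint32(65, false);
    const payload = JSON.parse((new TextDecoder()).decode(buf.slice(69, 69 + length)));
    return {payload, version, length, signature};
}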

It doesn't look like there would be any licensing incompatibilities if we integrated that into the tool with proper attribution.

Add the ability to toggle features on/off

Some features like static head evaluation seem to have buggy edge cases (#31). Other features like validation might not be useful to all users. Create a way for users to enable/disable features or customize the script in some ways.

Features:

  • static/dynamic head
  • validation
  • automatically log on page load (extension)
  • validation error count badge (extension)

Customizations:

  • color palette
capo-pink.mov

Will this support Node.js-based usage?

It would be nice to be able to use this in a Node.js environment, where the HTML is passed as an argument instead of the script interacting with the page directly in the browser. It would also help if the results were returned as JSON rather than logged with console.log.

Is this something you're thinking of?

Try fetching/parsing HTML client-side

In HTTPArchive/custom-metrics#12 (comment) @jroakes shared a screencast of some code for a document to parse its own body. Taking inspiration from that, I've modified the code a bit to produce a snippet that yields the entire contents of the static (server-rendered) <head>:

// Fetch the raw, server-rendered HTML of the current page.
async function parse_own_body() {
    const url = document.location.href;
    const response = await fetch(url);
    return response.text();
}

let html = await parse_own_body();
// Rename <head> and </head> so the HTML parser doesn't cut the head short at invalid elements.
// The lookahead leaves tags like <header> untouched.
html = html.replace(/(<\/?)head(?=[\s>])/ig, '$1static-head');
const staticDoc = document.implementation.createHTMLDocument("New Document");
staticDoc.documentElement.innerHTML = html;
const staticHead = staticDoc.querySelector('static-head');
staticHead;

The insight I had was that by renaming the head element to anything else, we can circumvent the HTML parser's behavior to truncate on invalid elements. Combined with fetching the raw HTML from the server, this should give us a pristine copy of the original head to use for capo analysis.

Screen Shot 2023-06-20 at 9 28 29 PM

This screenshot shows it working on the NYT site.

It should be possible to drop this approach into capo.js and fall back to the dynamic head as needed—I'd imagine some CSP rules blocking this use of fetch.

There might be sandbox limitations of doing something like this in a Chrome extension. I'll investigate.

Flag invalid elements

Content model:
If the document is an iframe srcdoc document or if title information is available from a higher-level protocol: Zero or more elements of metadata content, of which no more than one is a title element and no more than one is a base element.
Otherwise: One or more elements of metadata content, of which exactly one is a title element and no more than one is a base element.

https://html.spec.whatwg.org/multipage/semantics.html#the-head-element

Metadata content is content that sets up the presentation or behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other "out of band" information.

https://html.spec.whatwg.org/multipage/dom.html#metadata-content-2

Ensure there is:

  • exactly one <title>
  • no more than one <base>
  • only the listed valid elements
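
The three checks above can be sketched as a small function over the tag names of the head's children. The function name is hypothetical, and the iframe srcdoc exception from the content model is ignored for simplicity.

```javascript
// Metadata content elements, per the HTML spec section linked above.
const VALID_HEAD_ELEMENTS = new Set([
  'base', 'link', 'meta', 'noscript', 'script', 'style', 'template', 'title',
]);

// Hypothetical helper: takes the tag names of the <head>'s children, in
// document order, and returns a list of human-readable warnings.
function validateHeadElements(tagNames) {
  const names = tagNames.map((name) => name.toLowerCase());
  const count = (tag) => names.filter((name) => name === tag).length;
  const warnings = [];

  if (count('title') !== 1) {
    warnings.push(`Expected exactly one <title>, found ${count('title')}.`);
  }
  if (count('base') > 1) {
    warnings.push(`Expected at most one <base>, found ${count('base')}.`);
  }
  // Report each invalid tag name once, however often it appears.
  for (const name of new Set(names)) {
    if (!VALID_HEAD_ELEMENTS.has(name)) {
      warnings.push(`<${name}> is not valid in the <head>.`);
    }
  }
  return warnings;
}

console.log(validateHeadElements(['META', 'div', 'base', 'base']));
```

In the browser, the input could come from something like Array.from(document.head.children, (el) => el.tagName).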

Validate `meta[http-equiv]`

In general, the WHATWG supports a limited set of keywords that are valid attribute values for http-equiv:

  • content-language
  • content-type
  • default-style
  • refresh
  • set-cookie
  • x-ua-compatible
  • content-security-policy

Notable omissions include:

  • origin-trial
  • etag
  • x-* (besides x-ua-compatible)
  • cache-control
  • expires
  • pragma
  • accept-ch
  • content-style-type
  • content-script-type

These are all http-equiv attribute values used by over 100k pages, according to HTTP Archive, in descending order of popularity.

See the full results and query, if interested
http_equiv                      pages
x-ua-compatible                 5,849,869
content-type                    4,064,550
origin-trial                    3,741,447
etag                            432,755
x-wix-published-version         432,595
x-wix-application-instance-id   432,594
x-wix-meta-site-id              432,593
content-language                430,009
cache-control                   351,196
expires                         301,664
pragma                          296,342
accept-ch                       232,735
x-dns-prefetch-control          176,712
content-style-type              172,497
content-script-type             136,871
imagetoolbar                    97,656
cleartype                       93,802
content-security-policy         82,064
refresh                         28,243
keywords                        28,027
last-modified                   14,478
x-xrds-location                 13,495
page-enter                      11,945
encoding                        10,936
description                     10,716
x-rim-auto-match                10,361
msthemecompatible               9,564
reply-to                        9,113
language                        8,653
content-location                6,896
copyright                       6,435
x-frame-options                 6,323
window-target                   4,930
title                           4,601
x-ua-compatiable                4,493
page-exit                       4,468
pics-label                      3,269
screenorientation               3,105
audience                        2,378
author                          2,140
access-control-allow-origin     2,072
dc.description                  1,836
cache                           1,759
robots                          1,501
distribution                    1,464
vary                            1,386
x-webkit-csp                    1,376
p3p                             1,258
revisit-after                   1,226
default-style                   1,054

Query:

WITH meta AS (
  SELECT
    page,
    LOWER(JSON_VALUE(meta, '$.http-equiv')) AS http_equiv
  FROM
    `httparchive.all.pages`,
    UNNEST(JSON_QUERY_ARRAY(custom_metrics, '$.almanac.meta-nodes.nodes')) AS meta
  WHERE
    date = '2023-06-01' AND
    client = 'mobile' AND
    is_root_page
)


SELECT
  http_equiv,
  COUNT(DISTINCT page) AS pages
FROM
  meta
WHERE
  http_equiv IS NOT NULL
GROUP BY
  http_equiv
ORDER BY
  pages DESC

The biggest one that jumps out to me is origin-trial, which is used on ~3.7M pages. Given that it is explicitly supported and endorsed by Chrome, Edge, and Firefox (Safari doesn't support origin trials), I've left a comment on the WHATWG issue recommending its standardization.

I don't think capo.js should complain about spec validity for these keywords as long as browsers support them. But there are some specific usages worth validating.

http-equiv=content-type

According to the W3C spec, the content attribute of a meta[http-equiv=content-type] tag must be set to a "specially formatted string providing a character encoding name... in exactly the following order":

  1. The literal string "text/html;".
  2. Optionally, one or more space characters.
  3. The literal string "charset=".
  4. One of the following:

The WHATWG further requires that the character encoding name be exactly utf-8 and that:

A document must not contain both a meta element with an http-equiv attribute in the Encoding declaration state and a meta element with the charset attribute present.

capo.js should validate that HTML5 pages set a charset to utf-8 and don't have redundant meta tags. Not sure about HTTP header vs meta tag redundancy, but that's also worth exploring (related #59).
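
A minimal sketch of the format check, assuming the validator is handed the content attribute as a string. The function name and the returned shape are hypothetical, and matching case-insensitively is an assumption.

```javascript
// Hypothetical helper: validate the content attribute of
// meta[http-equiv=content-type] against the format quoted above.
function validateContentType(content) {
  // "text/html;", optional whitespace, "charset=", then an encoding name.
  const match = content.trim().match(/^text\/html;\s*charset=([^\s;"']+)$/i);
  if (!match) {
    return { valid: false, reason: 'Unexpected format.' };
  }
  const charset = match[1].toLowerCase();
  if (charset !== 'utf-8') {
    return { valid: false, reason: `Charset is "${charset}", expected "utf-8".` };
  }
  return { valid: true, reason: null };
}

console.log(validateContentType('text/html; charset=utf-8'));
console.log(validateContentType('text/html; charset=ISO-8859-1'));
```

Detecting a redundant meta[charset] alongside this element would be a separate check over the whole head.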

Could capo be used to automate ordering of head elements as part of a build process?

I just learned about capo and think it's very cool. I applied it to my own site and found that I could fix some ordering, but due to the way that Astro injects CSS and scripts at the end of the <head>, I wasn't able to entirely make capo happy. I noticed that the docs site is built using Astro's starlight, so has the same issue.

I wondered if capo could be used to automatically re-order the items in the <head> so that it could be run during the build process of a site. That way you can write the elements out in whatever order makes sense to you, and frameworks can inject parts in whatever order works for them, and then a capo plugin could come in and set the order straight.

Is this possible with how capo works today? And if not, is it something that capo could be extended to do?
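
To make the request concrete, here is a rough sketch of the reordering step such a plugin might perform: a stable sort of head elements by descending weight. The weights and element shapes below are hypothetical and for illustration only; capo.js defines the real mapping.

```javascript
// Hypothetical weights for illustration only; see capo.js for the real mapping.
const EXAMPLE_WEIGHTS = { meta: 3, title: 2, script: 1 };

// Return a new array sorted by descending weight, leaving the input untouched.
function sortHeadByWeight(elements, getWeight) {
  // Array.prototype.sort is stable, so equal-weight elements keep their authored order.
  return [...elements].sort((a, b) => getWeight(b) - getWeight(a));
}

const head = [
  { tag: 'script', id: 'a' },
  { tag: 'title', id: 'b' },
  { tag: 'script', id: 'c' },
  { tag: 'meta', id: 'd' },
];
const sorted = sortHeadByWeight(head, (el) => EXAMPLE_WEIGHTS[el.tag] ?? 0);
console.log(sorted.map((el) => el.id).join(','));
```

Keeping the sort stable matters here because the relative order of same-weight elements, such as two synchronous scripts, can affect behavior.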

OT metadata discrepancy on extension click

  1. Run the extension on https://web.dev/
  2. Click the red element that sticks out at the end
  3. Check the console log
image image

There are a couple of discrepancies that I noticed between the actual element and what's getting logged:

  • The token is missing 0= at the end

Ah3H7DwyoUUsaRQdSySa1hMCS/JFQn/VVmrQODVDnRJGH9mU/uG6G0Uhh+4atnFGAoiEwDq+r9TzCyBi7f7wRw4AAABfeyJvcmlnaW4iOiJodHRwczovL3dlYi5kZXY6NDQzIiwiZmVhdHVyZSI6IlNwZWN1bGF0aW9uUnVsZXNQcmVmZXRjaEZ1dHVyZSIsImV4cGlyeSI6MTY5NDEzMTE5OX

  • Expiry is missing

I'd expect the metadata to match DevTools:

image

We don't have this bug when logging the full list of elements:

image

Seems like some kind of serialization issue, or a problem with how the custom validations are retrieved specifically for the popup.

Capo extension crashes for my site

Hey Rick, I hope you're well. I just wanted to play with the new extension and it's crashing for my site. 🫣

Might be because there are too many inline styles? 😅

image

Scripts before meta data

Capo positions scripts before metadata.

There is an issue with that, though. A script in the <head> can introduce an element that belongs in the <body>. For example, an advertiser could add an <iframe> for tracking through a Google Tag Manager script. When Google (and probably other search engines) reads the page, it treats that element as the start of the body and ignores any head tags that follow. This can lead to meta descriptions and canonicals being missed completely. Because of this, my advice has been to always put scripts last in the head unless you know what is in them, and with tag managers you probably will not.

But perhaps this is beyond the scope of this very good project!

Can't expand to detail view

Hey! Just wanted to mention that this tool is awesome!
Although I am having one issue with expanding the visual output from the script into a detailed view.

I ran the script on the citi.com website and this is what I see:

Screenshot 2023-05-30 at 11 12 23 AM

Only make meaningful `http-equiv` highly weighted

Related to #70, I think capo.js should go a step farther and de-weight http-equiv elements that are invalid.

For example, on https://github.com/:

image

None of these http-equiv=x-* elements are meaningful and there's no compelling reason why they must be loaded before other more important elements (title, preconnect, style, etc).

In cases like these where http-equiv is known with high confidence to be ignored by the browser, assign a weight of 1 (lowest).

Effectively this is the list of keywords provided by the HTML spec plus a few extras that browsers care about—origin-trial being one, but there are others.
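
A small sketch of that lookup, assuming the allowlist is the WHATWG keyword list from the "Validate meta[http-equiv]" issue plus origin-trial. The names are hypothetical, and the other browser-supported extras mentioned above are not filled in because they aren't enumerated here.

```javascript
// WHATWG keywords plus origin-trial. Other browser-supported extras would
// need to be added once they are identified.
const MEANINGFUL_HTTP_EQUIV = new Set([
  'content-language', 'content-type', 'default-style', 'refresh',
  'set-cookie', 'x-ua-compatible', 'content-security-policy',
  'origin-trial',
]);

// Hypothetical helper: true when the http-equiv value is on the allowlist.
function isMeaningfulHttpEquiv(value) {
  return MEANINGFUL_HTTP_EQUIV.has(String(value).trim().toLowerCase());
}

console.log(isMeaningfulHttpEquiv('Origin-Trial'));   // true
console.log(isMeaningfulHttpEquiv('x-pjax-version')); // false
```

Elements that fail this check would be the candidates for the lowest weight.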

Manage code duplication

The core capo logic is effectively duplicated across the snippet, WPT custom metric, BQ function, and now the Chrome extension.

Devise a way to have all of the core logic defined in one place and integrated into each surface. This will probably require a build process.

Investigate high Linux and Windows uninstall rates

The Chrome Web Store gives me some usage stats. There might be some buggy behavior in non-Mac OSes worth investigating.

41% of installs are on Windows, 9% on Linux

image

51% of uninstalls are on Windows, 17% on Linux

image

If anyone has Windows or Linux and is willing to file any bugs with the extension, please do!

Is it really invalid to omit a meta viewport?

Current behavior:

image

I see this on https://www.amazon.com/ when I'm logged in using a desktop device. When I emulate a mobile viewport, they adaptively serve a meta viewport, albeit with validation warnings of its own:

image

Given that meta viewports are really only useful for mobile devices, it's not necessarily bad for a desktop site to omit one. The thing is that it's hard for Capo to know whether we're looking at the desktop version of a site or whether it adaptively serves a meta viewport for the mobile version.

I'd be interested to hear feedback on whether it's generally a good idea to always have one, in which case we should keep the validation warning, or if there are too many use cases where omitting it is valid and removing the warning would be better.

Broken link

In the BigQuery README, httparchive.fn.CAPO has a broken link.

To analyze pages in HTTP Archive, pass the HTML response body to the httparchive.fn.CAPO function:

Validate against unnecessary preloads

From @ebizindia: https://twitter.com/ebizindia/status/1683146602113044486

image

image

I agree that it doesn't make much sense to preload after the image already starts loading, so capo.js isn't necessarily right here. But I still think the priorities are correct. So rather than find a way to preload first, the issue here seems to be that it's not needed at all.

Add a validation warning against unnecessary preloads. Check if the href of the preload matches the href/src of another resource in the head.
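
That comparison could look roughly like the sketch below. It assumes the head has already been reduced to plain objects with tag, rel, href, and src fields, and that URLs are already resolved to a comparable form. The function name and the object shape are hypothetical.

```javascript
// Hypothetical helper: each item describes one <head> element, e.g.
// { tag: 'link', rel: 'preload', href: '/app.css' } or { tag: 'script', src: '/app.js' }.
function findUnnecessaryPreloads(elements) {
  const isPreload = (el) =>
    el.tag === 'link' && (el.rel || '').toLowerCase().split(/\s+/).includes('preload');

  // URLs referenced by every non-preload element in the head.
  const loadedUrls = new Set(
    elements
      .filter((el) => !isPreload(el))
      .map((el) => el.href || el.src)
      .filter(Boolean)
  );

  // A preload is flagged when another element already references the same URL.
  return elements.filter((el) => isPreload(el) && loadedUrls.has(el.href));
}

const example = [
  { tag: 'link', rel: 'preload', href: '/a.css' },
  { tag: 'link', rel: 'stylesheet', href: '/a.css' },
  { tag: 'link', rel: 'preload', href: '/font.woff2' },
  { tag: 'script', src: '/app.js' },
];
console.log(findUnnecessaryPreloads(example).map((el) => el.href));
```

A fuller version would probably only count elements that actually trigger a load, so that something like a canonical link with the same href isn't treated as a match.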

Use a service worker

There's a UX issue that a service worker might be able to help with:

image

When there's high latency to load the HTML resource, it takes a noticeably long time from clicking the extension to seeing the color bars. It appears blank for several seconds.

The same thing happens with the DevTools snippet: there's a delay from running the script to seeing the logs.

The extension is able to use a service worker to run in the background before the icon is clicked. We could use this to preload the HTML resource, so that when the icon is clicked, the response is immediately available. Unfortunately, this doesn't make sense for the snippet.

The same service worker could be used for a feature mentioned in #44, showing a badge with the number of validation warnings.

Unable to parse the static (server-rendered) `<head>`. Falling back to `document.head`

image

I'm consistently getting this warning on https://web.dev/ when the "prefer static head" option is enabled.

On one hand, this page is sending HTML without any <head> element:

image

Maybe this is some kind of extreme minification setting. But because capo can't extract the static head from the markup, it seems ok to fall back.

On the other hand, the browser is forgiving enough to make everything between html and body the head, so maybe capo could be more flexible and try falling back to that.

So one idea would be to check for the static head, fall back to the implicit static head, and if all else fails, fall back to the dynamic head. If we ever get into this middle state like on web.dev, there should be a different warning message about the lack of any discernible head in the static markup. It's not invalid but it's sketchy.
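
The fallback order could be expressed as a small selection function, sketched below. The function name, the source labels, and the wording of the middle-state warning are hypothetical. How each candidate head is actually obtained is left out.

```javascript
// Hypothetical helper: pick the best available head and describe where it came from.
// Each candidate is whatever object represents a parsed head, or null if unavailable.
function selectHead({ staticHead, implicitStaticHead, dynamicHead }) {
  if (staticHead) {
    return { head: staticHead, source: 'static', warning: null };
  }
  if (implicitStaticHead) {
    return {
      head: implicitStaticHead,
      source: 'implicit-static',
      // Hypothetical wording for the middle state described above.
      warning: 'No explicit <head> found in the static markup. Using the implicit head.',
    };
  }
  return {
    head: dynamicHead,
    source: 'dynamic',
    warning: 'Unable to parse the static (server-rendered) <head>. Falling back to document.head',
  };
}

console.log(selectHead({ staticHead: null, implicitStaticHead: { children: [] }, dynamicHead: {} }).source);
```

Returning the source alongside the head lets the caller decide which warning, if any, to show.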

Would be good to understand other notable examples of this warning in the wild. (please comment if you have one)

Test external stylesheets for `@import`

There's a TODO to support external stylesheets in the isImportStyles matcher. The code is mostly implemented but commented out due to async complexity in an earlier version of the script. Now that we have #29, which also relies on async code, I think it'll be easier to fit all the async plumbing together.

Write demo output to the UI

Right now the demo writes capo.js results to the console. It's not totally natural to be using the demo web page and have to open DevTools to see the output so I think it'd be nicer if the results appeared in the page itself in a sort of emulated console UI.

One of the biggest challenges will be rendering HTML elements. Things like expand/collapse and syntax highlighting are free in DevTools but non-trivial to reimplement.

Maybe a tradeoff is to only print the color bars with tooltips, similar to the extension popup.

Show a loading animation while extension fetches static head

When the extension is slow to fetch the static head, users see a blank color bar:

image

There isn't any indication that work is being done, so it's unclear whether there's an error or whether to expect results soon.

Design some kind of loading animation to suggest progress being made in the background. Consider also including some kind of loading message.

One design idea is to play on the rectangle shapes of the populated color bar:

image

The colors can be desaturated so it's obvious that it's not the final result:

image

And maybe some kind of pulsing animation on random "elements"/rectangles to mesmerize the user while they wait :)

Side note: the desaturated demo image uses the pink palette in place of the default theme because it uses a linear luminance scale, which preserves the visual "sort order" of the results. Only important if the bottom row is actually sorted, otherwise we can randomize both rows.

The dynamic mode shouldn't need a loading animation because it can ~immediately process the document.head without any network latency. The snippet is also affected by the static loading delay, but we're limited in how much we can do in the console.

Base element

Would you consider adding some documentation on the best practices concerning the order of the base element? I feel it's good to put it somewhat early because it may apply to subsequent elements but the script makes me feel like I'm doing something wrong (I might be, but would like to know why).

Port extension to Firefox

I imagine this extension could be ported to Firefox fairly easily. Would this be something you'd consider?

Should print stylesheets be a lower weight?

Print stylesheets have the same weight as all other synchronous styles (5).

Given that they're not necessary for rendering UI to screens, and browsers still attempt to load them anyway (at a lower priority), consider lowering their weight to 0 accordingly.

Validate `meta[name=viewport]`

Ensure that the viewport meta:

  • permits zooming (a11y)
  • avoids double tap to zoom delay (perf)

For example: <meta name="viewport" content="width=device-width, initial-scale=1">
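
A rough sketch of both checks follows. The function names are hypothetical. Treating a maximum-scale below 2 as blocking zoom is an assumed threshold, and treating width=device-width as the signal that avoids the double-tap delay is likewise an assumption about how the check would be implemented.

```javascript
// Parse "width=device-width, initial-scale=1" into
// { width: 'device-width', 'initial-scale': '1' }.
function parseViewport(content) {
  const directives = {};
  for (const part of content.split(/[,;]/)) {
    const [key, value = ''] = part.split('=').map((s) => s.trim().toLowerCase());
    if (key) directives[key] = value;
  }
  return directives;
}

// Hypothetical helper: return warnings for a viewport content string.
// minMaxScale is an assumed threshold, not a value taken from capo.js.
function validateViewport(content, minMaxScale = 2) {
  const d = parseViewport(content);
  const warnings = [];

  if (d['user-scalable'] === 'no' || d['user-scalable'] === '0') {
    warnings.push('user-scalable disables zooming.');
  }
  if ('maximum-scale' in d && parseFloat(d['maximum-scale']) < minMaxScale) {
    warnings.push(`maximum-scale=${d['maximum-scale']} restricts zooming.`);
  }
  if (d.width !== 'device-width') {
    warnings.push('Missing width=device-width, which may cause a double-tap-to-zoom delay.');
  }
  return warnings;
}

console.log(validateViewport('width=device-width, initial-scale=1'));
console.log(validateViewport('initial-scale=1, user-scalable=no'));
```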

capo.js web app client

The default behavior of capo.js is to output logs to the console. The extension does that plus visualizing the color bar in HTML.

Create a new client type that outputs both the color bar and logs in HTML. We can embed that in the documentation site to enable users to interactively paste static HTML snippets and see how it would be evaluated. With some HTML presets, we can also use it to demo some of the script's capabilities.

Add footer with info and link to repo

We can add a footer with info/documentation and a link to this repo.

Code

<footer class="capo-footer">
  <ul>
    <li>Click in the rectangles to see the element in the console</li>
    <li><a href="https://github.com/rviscomi/capo.js" target="_blank">GitHub</a></li>
  </ul>
</footer>

:root {
  --text-color: #353535;
}

body {
  color: var(--text-color);
  padding: 0;
  margin: 0;
}

.capo-footer {
  padding: 1ch;

  & ul {
    display: flex;
    justify-content: space-between;
    list-style: none;
    margin: 0;
    padding: 0;
    width: 100%;
  }

  & a {
    color: var(--text-color);
  }
}

Screenshot

footer
