Comments (20)

mnot commented on May 13, 2024

I see lots of terminology problems here. Misusing "resource" here is just going to lead to yet more confusion. "decoded" isn't helpful here either, since it could refer to transfer-coding or content-coding. "as delivered by the server" also implies that the client somehow knows what bits the server sent in every case; if the connection drops, it will not have this information.

In general, this section should use terminology from HTTP, since it's talking about HTTP.

Furthermore, this is going to be even muddier when HTTP/2 gains hold, because the headers are compressed, and because the status code is part of the headers.

I think you want:

  • Transfer size (transferSize): the amount of application data, in octets received by the client, consumed by the response header fields and the response message body [http://httpwg.github.io/specs/rfc7230.html#message.body]. This SHOULD include HTTP overhead (such as HTTP/1.1 chunked encoding and whitespace around header fields, including newlines, and HTTP/2 frame overhead, along with other server-to-client frames on the same stream), but SHOULD NOT include lower-layer protocol overhead (such as TLS or TCP).
  • Decoded size (decodedSize): the size, in octets, of the payload body [http://httpwg.github.io/specs/rfc7230.html#message.body] used, after removing any content-codings applied [http://httpwg.github.io/specs/rfc7231.html#data.encoding]. Note that the response may have been cached; decodedSize does not reflect how much data was sent "on the wire."
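To make the difference between the two proposed sizes concrete, here is a minimal sketch that counts octets for an HTTP/1.1 response sent with chunked transfer-coding. The names `SizeSample` and `measureChunkedResponse` are hypothetical and not part of any specification. The sketch assumes ASCII-only input (so string length equals octet count), no content-coding, no trailers, and that the status line is counted together with the header fields.

```typescript
// Hypothetical shape holding the two sizes proposed in this thread.
interface SizeSample {
  transferSize: number; // header fields + message body, including HTTP framing
  decodedSize: number;  // payload body after content-codings are removed
}

// Assumption: ASCII-only input, so string length equals the octet count,
// and no content-coding, so the decoded payload is just the chunk data.
function measureChunkedResponse(headerBlock: string, chunks: string[]): SizeSample {
  let messageBody = 0;
  let payload = 0;
  for (const chunk of chunks) {
    if (chunk.length === 0) continue; // a zero-length chunk would end the body
    const sizeLine = chunk.length.toString(16) + "\r\n"; // hex chunk size + CRLF
    messageBody += sizeLine.length + chunk.length + 2;   // data + trailing CRLF
    payload += chunk.length;
  }
  messageBody += "0\r\n\r\n".length; // last-chunk marker, no trailers
  return { transferSize: headerBlock.length + messageBody, decodedSize: payload };
}

const exampleHeaders = "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n";
const exampleSample = measureChunkedResponse(exampleHeaders, ["hello", "world!"]);
// exampleSample.decodedSize counts only "hello" + "world!";
// exampleSample.transferSize also counts the header block and chunk framing.
```

The gap between the two numbers is exactly the HTTP overhead the definition asks implementations to include; TLS and TCP overhead are deliberately left out.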

I'd argue that the response status code is necessary, since you're asking people to use heuristics to discriminate between a fresh cache hit and various other conditions (e.g., errors). I think that's another bug, however.

from navigation-timing.

igrigorik commented on May 13, 2024

@mnot thanks, that's a big improvement! I've updated the original proposal with your suggestions.

> I'd argue that the response status code is necessary, since you're asking people to use heuristics to discriminate between a fresh cache hit and various other conditions (e.g., errors). I think that's another bug, however.

Do you have some examples in mind?


Also, thinking about Fetch... Assuming we plumb through the ability to pull out the underlying Fetch object for each resource fetch (JS or element initiated), then the client can get access to the status message, header list, and other metadata: https://fetch.spec.whatwg.org/#responses. However, I don't believe we can get reliable access to both the transfer and decoded sizes via the same approach... And those still make sense to expose in ResourceTiming?

@annevk does that sound sane?

annevk commented on May 13, 2024

Fetch automatically deals with 304 and transfer codings. I think it would be confusing if fetch() returned a 200 (either directly from cache or revalidated) and the sizing information was from something else.

For progress events we report the size post-transfer-codings. I don't think we have access to the other sizing information in Gecko. @bzbarsky / @mayhemer / @mcmanus probably know better.

Also, we can only offer this data for CORS or same-origin responses. That needs to be made clear.

bzbarsky commented on May 13, 2024

Progress events report size of the HTTP payload body, before undoing any content encodings, in RFC 7230 terminology. That is, they report numbers compatible with the number that goes in the "Content-Length" header.

The network library Gecko uses does not provide HTTP message body size information, I'm pretty sure. It provides size information for the payload body and provides the actual data after undoing content encodings.

Note that the message body is a transient transmission artifact that can be freely modified by proxies and the like (unlike the payload body), so I'm not sure how much sense it makes to report anything about the message body.

igrigorik commented on May 13, 2024

@annevk @bzbarsky thanks guys, a few notes below.

> Fetch automatically deals with 304 and transfer codings. I think it would be confusing if fetch() returned a 200 (either directly from cache or revalidated) and the sizing information was from something else.

Just to clarify, given that fetch() deals with revalidation and content-coding under the hood... calling size on response body should exactly match the decodedSize proposed here.

> Also, we can only offer this data for CORS or same-origin responses. That needs to be made clear.

Proposed attributes are part of Nav+Resource Timing entries, so we already have TAO restrictions that would/could apply. That said, it's not immediately clear to me that transferSize and decodedSize should be behind a CORS flag? After all, if I can get a Fetch object out of every JS/element initiated request (would that be CORS restricted? if it is, then ok...), I can listen to progress events and also query the final response body size to get this data? In effect, I think this is similar to duration which is available on all resource events.

> Note that the message body is a transient transmission artifact that can be freely modified by proxies and the like (unlike the payload body), so I'm not sure how much sense it makes to report anything about the message body.

That's precisely why it is relevant for performance and monitoring - e.g. I want to see that my CDN/optimization proxy has, in fact, applied configured optimizations; I want to see if anyone else is modifying the resource as it transits the network; etc. Similarly, we know there are cases where malware (and even antivirus software) are stripping gzip headers on the client, which forces the client to fetch uncompressed resources -- reporting transfer size allows us to identify these cases.
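As a rough sketch of the monitoring use case described above, a script could flag responses that appear to have crossed the network without a content-coding such as gzip. The `SizedEntry` shape and `likelyUncompressed` helper are hypothetical, as are the sample URLs and numbers. The heuristic assumes that transferSize includes header overhead, so an uncompressed response has transferSize at or above decodedSize. It is only meaningful for compressible text resources, since images and other already-compressed formats would be flagged too.

```typescript
// Hypothetical entry shape using the attribute names proposed in this thread.
interface SizedEntry {
  url: string;
  transferSize: number; // octets received, including HTTP overhead
  decodedSize: number;  // payload size after removing content-codings
}

// Heuristic: if at least as many octets crossed the wire as the decoded
// payload contains, no effective content-coding was applied in transit.
function likelyUncompressed(entry: SizedEntry): boolean {
  if (entry.transferSize === 0) {
    return false; // nothing was transferred, so there is nothing to judge
  }
  return entry.transferSize >= entry.decodedSize;
}

// Made-up sample data for illustration only.
const sampleEntries: SizedEntry[] = [
  { url: "/static/app.js", transferSize: 120300, decodedSize: 120000 },
  { url: "/static/vendor.js", transferSize: 31000, decodedSize: 118000 },
  { url: "/static/cached.css", transferSize: 0, decodedSize: 9000 },
];

// URLs worth investigating for stripped or missing compression.
const suspectUrls: string[] = sampleEntries
  .filter(likelyUncompressed)
  .map((entry) => entry.url);
```

Aggregating such flags per client or per network could hint at intermediaries or local software that remove compression, though a single flagged response proves little on its own.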

nikmd23 commented on May 13, 2024

I could be wrong here, but it would seem that access to headers, at least the headers typically used in content negotiation scenarios (Content-Type, Content-Language, etc.), would be necessary to identify a representation.

For example, given the following:

URL             TransferSize  DecodedSize
/images/logo    43567         43498
/images/sprite  7893          7704

How would I know if logo and sprite were served as .webp or .png? Other examples could include the differences between the English and German representation of a resource.

Perhaps this example is contrived, but I hope the point is clear. I know it isn't this specification's job to provide headers in general, but it seems to be necessary context for size* to make sense.

Note: Content negotiation information seems like necessary context for timing too. Do people not run into this problem with Resource Timing?

igrigorik commented on May 13, 2024

@nikmd23 that's an orthogonal use case, the problem we're trying to solve here is to surface transferSize and decodedSize of the resource, regardless of its negotiated format/representation.

That said, in the future, assuming we can get the fetch object out of each request, you could get more data that way, including the headers.

igrigorik commented on May 13, 2024

public-web-perf thread: http://lists.w3.org/Archives/Public/public-web-perf/2014Oct/0060.html

annevk commented on May 13, 2024

How would duration give you access to the size of something? And no, we wouldn't want to give progress events for things not going through CORS.

mcmclx commented on May 13, 2024

@annevk I think Ilya's point is that every resource, CORS/same origin or not, exposes a 'duration', and in the same vein, every resource should also expose its size. As an analyst, I love the idea, but I remember way back in the day that there were general security concerns about resource timing exposing data that could be used by an attacker to guess which sites a user had visited, by seeing which resources were cached by the browser. This obviously opens that hole (as does exposing HTTP status codes to non-CORS requests; another thing I want, but orthogonal to the issue at hand).

annevk commented on May 13, 2024

Yes and I disagree. Duration was already exposed as you can measure time. Size is not exposed and should not be (unless it's already observable as it is with CORS).

igrigorik commented on May 13, 2024

@mcmclx yep, thanks for the clarification, that's exactly what I meant.

@annevk makes sense, I'm OK with applying the TAO restrictions on what we're discussing here. That said, as a brief aside: we've previously talked about ability to pull out a fetch object from an element-initiated fetch... (a) do we have some concrete proposals for how that could look, (b) would that be limited to CORS/same origin for same underlying reasons?

annevk commented on May 13, 2024

There's no such thing as a fetch object. There are Request and Response objects. The latter has constraints in place when the underlying response is opaque (no CORS). Generalizing the Request object and exposing it is https://www.w3.org/Bugs/Public/show_bug.cgi?id=26533 though I have not looked into that in detail yet.

igrigorik commented on May 13, 2024

@annevk got it, makes sense. Thanks.

mnot commented on May 13, 2024

A good illustration of why capturing transfer overhead is important:
https://gist.github.com/mnot/1138792

avanderhoorn commented on May 13, 2024

Looks like this made progress in Navigation Timing. Is getting the same data into Resource Timing still on the radar?

igrigorik commented on May 13, 2024

@avanderhoorn yep: https://w3c.github.io/resource-timing/#dom-performanceresourcetiming-transfersize :)

avanderhoorn commented on May 13, 2024

Great! Lastly, did we make any progress on whether accessing headers was going to be an option?

avanderhoorn commented on May 13, 2024

Very lastly, how would you expect one to interpret these values to determine if it came from cache or not? Or would that be something else again?

igrigorik commented on May 13, 2024

  • Headers: I'm not sure what that means, sounds orthogonal to what we're discussing here.
  • Cache vs not: if transferSize is 0, then it was served from cache.
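The cache rule above could be sketched roughly as follows. The `TimingSizes` shape and both helpers are hypothetical. One caveat, stated as an assumption: if entries restricted by TAO also report a transferSize of 0, this check alone cannot tell them apart from cache hits.

```typescript
// Hypothetical entry shape using the attribute names proposed in this thread.
interface TimingSizes {
  transferSize: number;
  decodedSize: number;
}

type Delivery = "cache" | "network";

// Applies the rule from the comment above: zero octets transferred
// is read as "served from cache", anything else as a network fetch.
function classifyDelivery(entry: TimingSizes): Delivery {
  return entry.transferSize === 0 ? "cache" : "network";
}

// Fraction of entries classified as cache hits; 0 for an empty list.
function cacheHitRatio(entries: TimingSizes[]): number {
  if (entries.length === 0) {
    return 0;
  }
  const hits = entries.filter((e) => classifyDelivery(e) === "cache").length;
  return hits / entries.length;
}
```

Note that a revalidated response still transfers header octets, so under this rule it counts as a network fetch even though the payload came from the local cache.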
