WebPerfWG call - April 29th 2021

Participants

Yoav Weiss, Michal Mocny, Nic Jansma, Giacomo Zecchini, Peter Perlepes, Noam Helfman, Nicolás Peña Moreno, Patrick Hamann, Sean Feng, Steven Bougon, Marcel Duran, Benjamin De Kosnik

Minutes

Topic

Resource initiator information to enable creation of dependency trees

Yoav: A couple of weeks ago we talked about a proposal to expose render-blocking information in RT
... Another proposal here to enable creation of dependency trees
... Currently with RT we have initiator type information, which is weird and doesn't provide a ton of information
... Something that would be useful on top of it would be a specific initiator
... In dev tools for example, we have a link from a resource to what requested it
... Even the specific line in the HTML that triggered that resource request
... I don't know if we need line-by-line attribution, but from my perspective it would be helpful to add initiator information, perhaps alongside some concept of a Fetch ID or links to an RT entry
... Which would be able to provide dependency trees from RUM data
... e.g. Script A triggered Script B, which triggered Script C, which triggered an image load
... And if that script or image caused issues, we can go back to Script A to assign blame
… Whether the author of Script A is on our team or a third party provider.
… Thoughts? Would that be useful?
… In the past I thought it can be used to find long dependency chains and then flatten them using preload.
NoamH: Personally I would find it very helpful
... Would find it more useful if like dev tools it showed full call tree or stack traces
... In terms of usefulness for diagnostic or determining issues
Yoav: I think it might be interesting to tie that to JS Profiling proposal in some way
... When JS Profiling is enabled, maybe we can provide more initiator data
... I would tie it to the same security primitives that JS Profiling has
... Main difference is that JS Profiling is an opt-in, where initiator is something we'd want to enable for everyone ideally
Marcel: What about when you have a service worker involved and several tabs open requesting resources. How can we organize this mess?
Yoav: In this case, are we talking about the renderer-based initiator or the SW as the initiator?
Marcel: The SW would be handling the events, but it would be triggered by something in the page
Yoav: That's a good question
... I would expect this info would be tied to the Request object in Fetch, and if the SW generated a new Request object, then the SW would become the initiator instead of the original renderer-based one
… But I’d love for you to be involved in those discussions
... It's definitely more complex than "let's add initiator info"
Marcel: Currently with SW, I have enough problems figuring out where a resource came from. Hard to map without ID. ID would help in this case.
Yoav: Do you know what dev tools in browsers do for this case?
Marcel: I'm not sure, when I'm debugging I have dedicated dev tools for this situation
... Not sure how we'd map one to another
Yoav: If you could outline your use-case in this issue, that’d be great.
Benjamin: Would you consider Preload an initiator, that would be a root?
Yoav: Yes for Preload, that would be an initiator
... There's HTML-based Preload, and for that I'd consider the HTML the initiator of that Preload, and tie that back to the link rel=preload tag that kicked it off
... For HTTP header preloads, I'd consider the resource that triggered it the initiator; line-by-line attribution may be hard in that case
Benjamin: Ideally we'd be able to pick apart where the Preload request came from, any light you could shine on that the better
Noam: Another use-case: when you have an app w/ 3P resources that you don't control, they often initiate requests for resources and it’s hard to troubleshoot. Even though you can see them in RT, you don't know where they're coming from
Yoav: Very much the use case I had in mind
Nic: It would be similar to Simon Hearne's RequestMap tool, but doing that from RUM would be great
... From Akamai’s point of view, this would be useful. From a RUM perspective, we try to do some of this heuristically or when looking at waterfalls for customers. Would love a more definitive answer for what to point to… assigning blame / biggest hitters
... The edge itself could make more intelligent decisions to deliver content in different ways
... Security perspective: could you find rogue requests going out? Security products could find things.. what triggered what.. may enable things we haven’t even thought of! As long as we do that in a safe & secure way.
Yoav: Yeah the security is interesting.. Tied to report-only CSP? Then can tie CSP violations back to the RUM report, and find which 3p was responsible for the request?
Marcel: Another point would be with extensions, not knowing which extension a request came from
Yoav: I'm not sure if we can expose scripts from "another world" in that case
Marcel: If a known resource from a screen-reader for example
Yoav: There may be concerns from a privacy or an implementation perspective
... e.g. an extension is not a "previous resource"
... Also, going back to security case, I'm not sure we can guarantee to have attribution to the initiator. If a request is initiated during a setTimeout callback, does the browser keep this info? There may be loopholes in the security story
... But interesting to keep that use case in mind, and see what we can do. Maybe we can only tackle non-sophisticated hackers
Nic: +1 to the extension case. Customers often ask why resources are triggered, when it can be extensions that don’t reproduce locally. Some attribution there, or even an indication that it was triggered by something outside the page would be helpful. Causes confusion for sure.
Yoav: Would probably be safe to say “triggered by extension”, but need to verify.
Yoav: Thanks all for the feedback, I expect there's a lot of value in this that we can unlock
... Critical to figure out use cases and edge cases that could trip us up
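
To make the dependency-tree idea discussed above concrete, here is a minimal sketch of how RUM code might walk initiator links back to a root. It assumes each entry carried an `id` and an `initiatorId`; neither field exists on Resource Timing entries today, and both names (as well as the `InitiatedResource` shape and the sample data) are hypothetical stand-ins for the "Fetch ID" concept mentioned in the discussion.

```typescript
// Hypothetical entry shape: `id` and `initiatorId` are NOT part of
// Resource Timing today; they illustrate the proposal only.
interface InitiatedResource {
  id: string;
  name: string;
  initiatorId: string | null; // null: requested directly by the document
}

// Walk initiator links from one resource back to its root.
// Returns the chain of resource names, root first.
function initiatorChain(entries: InitiatedResource[], id: string): string[] {
  const byId = new Map<string, InitiatedResource>();
  for (const entry of entries) {
    byId.set(entry.id, entry);
  }
  const chain: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current: InitiatedResource | undefined = byId.get(id);
  while (current !== undefined && !seen.has(current.id)) {
    seen.add(current.id);
    chain.unshift(current.name);
    current =
      current.initiatorId === null ? undefined : byId.get(current.initiatorId);
  }
  return chain;
}

// Find the longest chain, i.e. a candidate for flattening with preload.
function longestChain(entries: InitiatedResource[]): string[] {
  let longest: string[] = [];
  for (const entry of entries) {
    const chain = initiatorChain(entries, entry.id);
    if (chain.length > longest.length) {
      longest = chain;
    }
  }
  return longest;
}

// Made-up sample data matching the "Script A -> B -> C -> image" example.
const sampleEntries: InitiatedResource[] = [
  { id: "1", name: "a.js", initiatorId: null },
  { id: "2", name: "b.js", initiatorId: "1" },
  { id: "3", name: "c.js", initiatorId: "2" },
  { id: "4", name: "hero.jpg", initiatorId: "3" },
  { id: "5", name: "style.css", initiatorId: null },
];

console.log(longestChain(sampleEntries).join(" -> "));
// prints "a.js -> b.js -> c.js -> hero.jpg"
```

With data like this, blame for a slow `hero.jpg` could be traced back to `a.js`, whether that script is first party or third party.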

Reference mess

Yoav: I'm not sure if this will be actionable on the call
... We need to set up auto-publishing
... The Fetch spec currently has references to both RT and RT2
... There's a new SpecProd GitHub repo that I need to figure out and deploy. It offers a great way to build ReSpec specs and avoid the situation we had in the past, where a spec is published but later breaks
... The build is part of the PR process, so breakage surfaces at that point rather than later

When should a UA clear its entry buffers?

Sean: Context was for PerformanceEventTiming
... We always buffer timing entries if their duration is more than 100ms
... That will grow indefinitely. Should the UA be able to clear entry buffers at some point?
... Days-old entry buffers are probably not going to be helpful anymore
Yoav: Resource timing has the ability to explicitly clear buffers, as well as offering an overall buffer limit.
... This proved to be an antipattern, since multiple scripts on a page can each think they are done with the data and clear each other's entries.
Nic: We already have an overall buffer limit of 150. Are you worried about old entries?
Sean: Do we have a limit for Event Timing?
Nicolás: We do indeed have limits. The reason for the buffers is to get data from early in the page load, until the observers are registered.
… the current buffer limits are arbitrary, we didn’t really do enough studies to see what the perfect limits would be.
… once the buffer is full, it doesn’t keep increasing.
… The expectation is that you would have already registered listeners by then.
… There is one entry type that has no buffer limit: User Timing, because it is a developer-initiated signal. Also, we need to support getEntriesByName(), which doesn’t have a buffer size limit.
Sean: So is there an expectation that we should be able to clear the buffer?
Nicolás: Who, the browser?
Sean: Yes.
Yoav: That runs the risk of violating developer expectations. I don’t know if developers are collecting User Timing entries from many days ago, but they can. Is the main concern a memory concern?
Sean: Memory concern.
Nicolás: Is this still hypothetical or have you seen it in practice?
Sean: I was thinking of Event Timing, but if it’s limited that may not be an issue. We have seen some cases, like user timing, where it takes lots of memory (maybe it was Facebook, cannot recall exactly)
Yoav: Facebook has a separate use case where they want to use User Timing for DevTools and not for perf timeline.
Michal: I have some context here: they were adding entries to UT and immediately clearing them, so they would show up in Dev Tools. But the entries didn’t persist in the timeline. I don’t know if that removes them from the buffer
Nicolás: It should remove them.
Michal: They use lots of timers and others may persist. The use case they discussed was for advanced debugging, but they probably have others.
Yoav: So we have buffer limits, other than for UT, which has clearMarks and clearMeasures. Developers can clean those up for long-lived apps, either because they only wanted the Dev Tools annotation, or because they've collected those marks/measures and sent them to their server, so they can be cleared from the buffer.
... That seems like it would address the developer concern
Sean: That seems fine
Nic: We have the Timing Entry Names Registry, where we can change the max buffer sizes if we need to, in either direction, if they turn out to be too egregious or too small (e.g. ResourceTiming increased from 150 to 250)
Nicolás: We should implement the “number of dropped entries”, which we specified
Yoav: That would enable us to know if our limits are too small.
Michal: Sounds like the intention of the buffer is to be a bootstrapping buffer for late observers. If you haven’t registered your observers and you have many dropped entries, is it worthwhile at this point to still keep those entries? The measurement is already tainted.
Nic: Can talk about Resource Timing. It was defined as 150 entries, we often found customers with waterfalls that didn’t include all the entries, and often had to bump the buffer size for them. But having a partial result was way better than having no result. My preference would be to present whatever you can and say how many were dropped. Understand the memory concerns.
Michal: If you could drop timeline data from pages that are no longer relevant, would you have more room for keeping entries on the current page?
Nicolás: Something to keep in mind is that adding an auto-clearing heuristic might cause issues for those looking at the data
Yoav: Tradeoff between memory utilization and consistency
... Because those buffers are bound to what seem like reasonable values, I wouldn't expect dropping them would gain you a ton of memory
Michal: In the absence of numDroppedEntries, having predictable fixed numbers for buffers is important (or else you don’t know when the buffer is full), but I wonder whether, once that is added, we could have flexible buffer sizes. The buffer limits were apparently arbitrarily picked.
… Sean - were your questions addressed?
Sean: Yes
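
As an illustration of the "keep a partial result and report how many entries were dropped" approach discussed above, here is a toy model of a bounded bootstrapping buffer. It is a sketch only, not any UA's implementation; the class name `BoundedEntryBuffer`, the method names, and the `droppedEntriesCount` field name are assumptions chosen for this example.

```typescript
// Toy model of a per-entry-type buffer with a fixed limit.
// Not a UA implementation; all names here are illustrative.
class BoundedEntryBuffer<T> {
  private readonly maxSize: number;
  private entries: T[] = [];
  private dropped = 0;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // Keep the entry while there is room; otherwise only count the drop,
  // so memory use stays bounded.
  add(entry: T): void {
    if (this.entries.length < this.maxSize) {
      this.entries.push(entry);
    } else {
      this.dropped++;
    }
  }

  // What a late-registering observer would receive: whatever was
  // buffered, plus how many entries did not fit.
  snapshot(): { entries: T[]; droppedEntriesCount: number } {
    return { entries: this.entries.slice(), droppedEntriesCount: this.dropped };
  }
}

const bootstrapBuffer = new BoundedEntryBuffer<string>(3);
for (const name of ["a.js", "b.js", "c.css", "d.png", "e.png"]) {
  bootstrapBuffer.add(name);
}

const late = bootstrapBuffer.snapshot();
console.log(late.entries.length, late.droppedEntriesCount);
// prints "3 2"
```

A non-zero dropped count tells the consumer that the data is partial, which is the signal that would also indicate whether a given limit is too small.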

Allow HTTP headers to be defined for the preload request

Yoav: Issue we talked about many years ago as part of Fetch parameters
... Essentially what Angular folks are trying to do is Preload Fetch requests that are being sent later
... They're being fitted with custom Accept headers, e.g. JSON over XML for various REST API endpoints
... Because of those custom Accept headers, the unspecified Preload cache in Chromium and WebKit doesn't match those requests, so when they later call Fetch they trigger a second request. That's the desired behavior here, because if the Preload cache were to match requests with different headers, they could get different responses
... So Preload cache behavior here seems correct even though it's not specified
... No way to trigger Preload without double-download
... Ideally what they'd like is an attribute on Preload to specify the Accept header, or any arbitrary header, on the request
... On one hand I think the use-case is legit, but I'm concerned about adding cruft to HTML
... In the past we've asked about adding Fetch object attributes to elements that load resources, so they could specify a JSON with the various Fetch options to apply to the resource request, to define new headers or other parameters, credentials mode, etc
... At the same time I'm concerned about adding arbitrary cruft
... Tradeoff between HTML legibility and Preload for those use-cases
... We can continue discussions on issue itself
... AI: Yoav to respond on issue
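
The double download described above can be sketched with a deliberately simplified model of request matching. The preload cache is unspecified, so this is not the Chromium or WebKit logic; `SimpleRequest`, `canReusePreload`, and the sample URL are hypothetical, and the model ignores the default headers a real browser would add.

```typescript
// Simplified request model; real requests carry far more state.
interface SimpleRequest {
  url: string;
  headers: Record<string, string>;
}

// Lower-case header names so "Accept" and "accept" compare equal.
function normalizeHeaders(headers: Record<string, string>): Record<string, string> {
  const normalized: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    normalized[name.toLowerCase()] = headers[name].trim();
  }
  return normalized;
}

// Toy matching rule: reuse the preloaded response only when the URL and
// the full header set are identical. Differing headers could mean a
// different response, so a mismatch falls through to a new request.
function canReusePreload(preload: SimpleRequest, later: SimpleRequest): boolean {
  if (preload.url !== later.url) {
    return false;
  }
  const a = normalizeHeaders(preload.headers);
  const b = normalizeHeaders(later.headers);
  const namesA = Object.keys(a);
  if (namesA.length !== Object.keys(b).length) {
    return false;
  }
  return namesA.every((name) => a[name] === b[name]);
}

// A plain preload cannot carry the custom Accept header...
const plainPreload: SimpleRequest = { url: "/api/items", headers: {} };
// ...but the later fetch() does, so the two do not match.
const apiFetch: SimpleRequest = {
  url: "/api/items",
  headers: { Accept: "application/json" },
};
// If a preload could declare headers (the requested feature), it would match.
const preloadWithHeaders: SimpleRequest = {
  url: "/api/items",
  headers: { accept: "application/json" },
};

console.log(canReusePreload(plainPreload, apiFetch)); // prints "false"
console.log(canReusePreload(preloadWithHeaders, apiFetch)); // prints "true"
```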

Can't preload for type-switching <picture>s

Yoav: When using <picture> to load new and exciting image formats that are not yet supported anywhere, we currently have a way to use the type attribute to just Preload the latest one.
... That worked well with WebP, but if we're interested in preloading more than just the latest format (e.g. AVIF has support in some browsers, and we want to Preload AVIF, WebP and JPEG-XL), it becomes a problem to load only the correct one.
... Want to be able to Preload AVIF, if you don't support AVIF Preload WebP, if not JPEG, etc
... We avoided tackling this use-case in the past because it didn't seem like it had a lot of real-life implications, and the support matrix for newer formats matched Preload support, but with AVIF and more common Preload support, it seems like there is now a need for a mechanism
... Opinions on what that may look like?
Nic: Do you want to tie multiple preloads together?
Yoav: Yes, but we can’t really add a new element to the head. And we can’t use <link> for that, because link is self closing.
Nic: If they all shared a common ID?
Yoav: That would be one way of doing that, but not sure that the processing model for that would be reasonable. E.g. multiple links in the head and another that arrives later. May work if we still take ordering into account.
… What do folks think about the use case? A reasonable one for us to tackle?
... We had similar requests in the past for fonts, which were dismissed in the hope that all UAs would just support WOFF2
... But with progressively rendered webfonts, the same mechanism could potentially be useful for both
Nicolás: Why can't you have rel=preload inside <picture> element sources?
Yoav: Preload is typically something you'd put before something is defined, i.e. when picture is dynamically generated or added later on
... You would want to Preload that image in those cases
... If you're already discovering the Picture element and the images inside it, there's no difference between loading and preloading it
Nic: would you be able to define picture twice, with only links in it and no images? You’d define it twice, I guess.
Michal: Or a hidden picture
Nicolás: In that case picture becomes a link, so better to just add that link
Yoav: Another thing to bear in mind - whatever we come up with should also work in HTTP Header format for Link
... So maybe instead of an ID that requires a cache across elements or headers
... Instead have a "not-type" attribute, i.e. this is a WebP but don't load it if you support AVIF, load this JPEG if you support neither AVIF nor WebP
… Maybe cumbersome, but expressive
Benjamin: What you said sounds more like a header, where you put Preload priority in this order
Yoav: Main conclusion is that we need to explore this more, multiple possible sketches and none is ideal
... Take to the issue and discuss further there
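
Two of the sketches floated above can be compared in a small model: grouped preloads evaluated in order, and the "not-type" idea where each preload is evaluated on its own. Nothing here is specified behavior; `PreloadCandidate`, the `notTypes` field, both function names, and the file names are hypothetical.

```typescript
// Hypothetical model of a preload candidate. `type` mirrors the existing
// type attribute; `notTypes` models the "not-type" idea from the call.
interface PreloadCandidate {
  href: string;
  type: string;
  notTypes?: string[]; // skip this candidate if any of these is supported
}

// Grouped/ordered sketch: take the first candidate whose type is supported.
// This needs the candidates to be tied together and ordered.
function pickFirstSupported(
  candidates: PreloadCandidate[],
  supported: string[],
): PreloadCandidate | undefined {
  for (const candidate of candidates) {
    if (supported.includes(candidate.type)) {
      return candidate;
    }
  }
  return undefined;
}

// "not-type" sketch: each candidate is decided independently, so no
// shared state across elements or Link headers is required.
function shouldPreload(candidate: PreloadCandidate, supported: string[]): boolean {
  if (!supported.includes(candidate.type)) {
    return false;
  }
  const blockers = candidate.notTypes ?? [];
  return blockers.every((type) => !supported.includes(type));
}

const heroCandidates: PreloadCandidate[] = [
  { href: "hero.avif", type: "image/avif" },
  { href: "hero.webp", type: "image/webp", notTypes: ["image/avif"] },
  { href: "hero.jpg", type: "image/jpeg", notTypes: ["image/avif", "image/webp"] },
];

// A browser that supports WebP and JPEG but not AVIF.
const supportedTypes = ["image/webp", "image/jpeg"];

console.log(pickFirstSupported(heroCandidates, supportedTypes)?.href);
// prints "hero.webp"
console.log(
  heroCandidates.filter((c) => shouldPreload(c, supportedTypes)).map((c) => c.href),
);
// prints [ 'hero.webp' ]
```

Both approaches select the same single image here. The "not-type" variant is more verbose to author, but each candidate stands alone, which fits the HTTP Link header case better.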