WebPerf WG @ TPAC 2024

bit.ly/webperf-tpac24

Table of Contents

Logistics

Where

When

Registering

Calling in

Masking Policy

Attendees

Agenda

Agenda Proposals

Lightning Topics List

Monday - September 23 (09:00 - 16:00 PDT)

Tuesday - September 24 (09:00 - 16:00 PDT)

Wednesday - September 25

Thursday - September 26 (09:00 - 16:00 PDT)

Friday - September 27

Joint Meeting with WHATWG

Meeting Minutes

Day 1 - Monday

Day 2 - Tuesday

Day 3 - Wednesday

Day 4 - Thursday

Day 5 - Friday

Logistics

TPAC 2024 Home Page

Where

Hilton Anaheim, California, USA

When

September 23-27, 2024

Registering

Calling in

Join the Zoom meeting through:
https://w3c.zoom.us/j/4556160000?pwd=2024

Or join from your phone using one of the local phone numbers at:
https://w3c.zoom.us/u/kb8tBvhWMN

Meeting ID: 455 616 0000
Passcode: 2024

Masking Policy

We will not require masks to be worn in the WebPerf WG meeting room.

Attendees

Agenda

Lightning Topics List

Have something to discuss that didn't make it into the official agenda?  Want to have a low-overhead (no slides) discussion?  Add your ideas here.  Note: you can request discussion topics without presenting them.

Times in PDT

Monday - September 23 (09:00 - 16:00 PDT)

Recordings

-1 Lower Level - Catalina 5

Timeslot (PT) | Dur | Subject | Links | POC
09:00-09:30 | 30m | Intros, CEPC, agenda review, meeting goals, introspection | minutes | Nic, Yoav
09:30-10:30 | 1h | Container Timing Status Update | minutes | Jase
10:30-11:00 | 30m | Break | |
11:00-11:45 | 45m | Gather (New) Insights from Happy Eyeballs Behaviors | minutes | Nic, Utkarsh
11:45-12:30 | 45m | TTFB—what does it mean and why’s it so messy? | minutes | Barry Pollard (remote)
12:30-14:00 | 1h 30m | Lunch | |
14:00-15:00 | 1h | Cross-origin caching of pervasive assets | minutes | Pat (remote)
15:00-15:30 | 30m | WebPerf Admin / Specs / Incubations | minutes | Yoav

Tuesday - September 24 (09:00 - 16:00 PDT)

Recordings

-1 Lower Level - Catalina 5

Timeslot (PT) | Dur | Subject | Links | POC
09:00-09:45 | 45m | Web Performance APIs in Excel | minutes | Noam Helfman
09:45-10:30 | 45m | Conditional tracing | minutes | Noam Rosenthal
10:30-12:30 | 2h | Power outage break | |
12:30-14:00 | 1h 30m | Lunch | |
14:00-15:00 | 1h | Scheduling APIs | minutes | Scott
15:00-16:00 | 1h | RUM Pain Points 2024 | minutes | Nic
After Hours 18:00-🥳 | ??? | RUM Archive Hacking | | Nic

Wednesday - September 25

(breakout sessions)

Thursday - September 26 (09:00 - 16:00 PDT)

Recordings

4 Concourse Level - Huntington

Timeslot (PT) | Dur | Subject | Links | POC
09:00-09:30 | 30m | Service Worker TimingInfo | minutes | Shunya & Keita
09:30-10:30 | 1h | Exposing style/layout | minutes | Noam Rosenthal
10:30-11:00 | 30m | Break | |
11:00-12:00 | 1h | Soft Navigations | minutes | Michal
12:00-12:30 | 30m | Long Animation Frames: Status & what’s next | minutes | Noam Rosenthal
12:30-14:00 | 1h 30m | Lunch | |
14:00-14:20 | 20m | Protobuf encode/decode | minutes | Colin
14:20-15:00 | 40m | Task Attribution: what is it, and how it might help beyond soft-navs (...RUM tooling, LCP hover, async event timing, resource initiators…) | minutes | Scott & Michal
15:00-15:30 | 30m | Javascript loading | minutes | Yoav
15:30-16:00 | 30m | Monitoring and Deployment Principles doc | minutes | Yoav

Friday - September 27

Joint Meeting with WHATWG

https://w3c.zoom.us/j/7639586116?pwd=1NQ1Zj2DjGVdP0uSVTJYcbizkQQM5c.1 

https://github.com/whatwg/meta/issues/326

https://whatwg.org/chat

https://app.element.io/#/room/#whatwg:matrix.org

Minutes

4 Concourse Level - Capistrano

Timeslot (PT) | Dur | Subject | POC
14:00-14:40 | 40m | FetchLater consolidation | Noam
14:40-15:20 | 40m | Scheduling API updates | Scott
15:20-16:00 | 40m | CompressionStreams | Adam

Meeting Minutes

Day 1 - Monday

Intro - Nic Jansma

Nic: Highlights..
… Rechartering - mostly clarifications and Event Timing adoption

… We’ve been evolving ideas around adapting our APIs with privacy in mind

… Working on a `confidence` attribute to enable privacy preserving dimensions

… Been also talking about LoAF and Element Timing for containers

… Worked with IETF and WHATWG on Compression dictionaries and fetchLater

… Closed 77 issues (last year was better)

… Charter, we’ll be talking about it again at the middle of next year

… Goals to revitalize: there’s a WG primer that we want to rework and modernize

… Another deliverable we haven’t updated in a while: “Performance APIs, Security and Privacy”

… we have thoughts on that subject, and we’ll talk about principles related to it later in the week

… A few of us submitted a proposal for a RUM Community Group. We’ll talk about that more tomorrow

… Submitted last night, but there’d be a voting process at some point

… Lots of incubations that we’re interested in following:

… Interesting to go through these past discussions and see if it’s still relevant

… Could be interesting to potentially do a lightning talk about these

… Was interesting to go over the conversations we had in the last year

… Market adoption (based on Chrome)

… Small dips with Server Timing and Reporting Observer, everything else went up!!

… A few new members joining: Techfriar, Netcetera, Datadog and Tim Kadlec

<intros>

… Been thinking about lightning topics. If there are small things you want to talk about, have questions, etc. let us know

… There’s a section in the agenda where you can suggest ideas. If that’s useful, we’ll schedule some time for it

… <Housekeeping note>

… <health rules>

… Follow up with a poll on meeting cadence

Container Timing Status Update - Jase Williams

Presentation recording

Summary

Minutes

Gather (New) Insights from Happy Eyeballs Behaviors - Nic Jansma, Utkarsh Goel, Erik Nygren

Presentation recording

Summary

Minutes

TTFB—what does it mean and why’s it so messy? - Barry Pollard

Presentation recording

Summary

Minutes

Barry: Investigated TTFB and it’s painful

… TTFB is one of the oldest metrics and also one of the most ill-defined

… There’s no spec for TTFB

… MDN defines it as responseStart - navigationStart

… uses the older navigation timing, so needs updating

… web.dev has a similar definition

… but also has alternative definitions.. :/
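(For reference, that definition with the modern Navigation Timing Level 2 entry; a sketch, since TTFB itself has no spec:)

    const nav = performance.getEntriesByType("navigation")[0] as PerformanceNavigationTiming;
    // responseStart is already relative to the page's time origin, so "TTFB"
    // falls out directly; it lumps together redirects, DNS, connect and request.
    const ttfb = nav.responseStart;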

… not useful for what people say it’s useful for

… It tries to cover all of

… What we actually need is parts of that section

… I tried to do this in web-vitals.js, but it turned out not to be possible

… redirects have an old issue - it’s kinda measurable, but not really

… No timing for the HTTP cache, if there’s no worker you can kinda guess it

… connections get messy with http3

… because browsers use happy eyeballs, the connections used may not be the ones reported

… There are lots of bits where browsers are doing unmeasurable things

… <insert image on subparts>

… TTFB can’t tell you what is wrong, just that something is wrong. Very easy to misunderstand

… Also, what does the first byte measure?

… We fixed it (issue #345)

… Implemented firstInterimResponseStart and made responseStart back to be the old response start

… This was effectively a breaking change and browsers that don’t support this are sending the older value

… Now we’re getting different results from different browsers

… So RUM gives you very different TTFB results

… Also not implemented well across tools, and CrUX doesn’t use it
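(A sketch of the branching RUM scripts are left doing as a result; firstInterimResponseStart is the spec field discussed here:)

    const nav = performance.getEntriesByType("navigation")[0] as PerformanceNavigationTiming;
    // Browsers without the newer field keep the older responseStart semantics:
    const interim = (nav as any).firstInterimResponseStart ?? 0; // 0 if absent or no interim response
    const firstByte = interim > 0 ? interim : nav.responseStart;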


… Changing ill-defined things is hard

(let the record show, that Barry is NOT throwing shade)

… so we just broke it more..

… Is it only Early Hints?

… But why are early hints different from early flushing?

… For both of them the goal is to let the browser start work earlier

… differences around HTTP status and rendering, but they are both similar

… Should we try to fix this? What should it be?

… Are Firefox and Safari interested in implementing firstInterimResponseStart?

… We need to do better and avoid changing semantics of existing things

… Do we also need to have more holistic solutions?

… Could we have solved early hints and early flush together?

… Can we try to revert firstInterimResponse? Or is it too late?

Philip: TTFB came from “80% of load time is frontend”

… It was added originally as a comparison with existing tools

… I am OK with getting rid of the concept of TTFB

Pat: Less eager to get rid of it. Helpful to track down infra vs frontend

… Out of the gate, we need to stay away from durations as they don’t handle async things very well

… If we race things, durations just fall apart

… UNO represents a real pain. Renderers on slow Android can be very slow and developers can’t do anything about it

… being able to account for that is valuable

Barry: So we’d need to flesh out resource timing for that

Pat: How granular do we need to be? Should RT become tracing from the field?

Michal: My intuition was “time to first useful byte”. The “render blocking” concept may be it?

… Minimal time before the UA could actually start rendering anything at all. Not necessarily when it actually unblocked rendering, but the first chance it could have allowed rendering if it weren't blocked.

… May wait up for a second to get blocking resources if you had enough content to start rendering

… As long as there’s a stream of bytes waiting from the server and we haven’t started rendering..

Barry: Not just rendering. For LCP, it’s the first time you could start fetching an LCP resource

… Even while fetching blocking content, the fact you could fetch that LCP resource is useful to show to people

… Delaying that to render, is basically paint timing

Michal: even if we’re not rendering because we’re render blocked, it’d mark the first point in time where we could start doing work

Bas: You’d almost want to tell people how much time the UA spent idle

Barry: kinda what we measure in LCP sub-parts

NoamH: so “time to head read”, minimum bar to start reading more, fetch blocking scripts, etc

Yoav: </head> end?

NoamH: start of the head tag?

Michal: but start of head could still not be useful

Philip: that’s how we used to implement it - a script tag at the start of the head

Pat: The point at which you can start measuring things in script. So you could theoretically measure this in script

… but is header the first byte of content

Barry: early hints can be used to start useful work from the browser. So in some ways it’s better than early flush

Pat: But the results show up in later measurements.

Barry: In practice they won’t

Bas: So in the slide of “things TTFB is useful for”, are all of those things needed? Do we have ways of solving those?

Barry: Server timing could be useful for server response time.

… “cross-browser supported” is interesting.

… We kinda have metrics for most of them

Michal: Interop issue with firstinterimResponseStart, what part of this is exposed?

Barry: We’ve change the semantics of responseStart

Bas: Doesn’t sound like it’d be difficult to change

Sean: responseStart changes - did Chrome do an analysis of the change?

… we haven’t so didn’t implement it

Barry: the original plan was for finalResponseHeaderStart

Bas: so there’s nothing that’s the current responseStart

Barry: so we’d need responseStart to still point at firstInterimResponseStart

Nic: This would allow RUM providers to choose which version of this they want to support

Bas: So chrome would change back

Jase: What's the baseline?

Barry: “for things like LCP, look at your TTFB first”

Bas: We also do both TTFB and time origin as a baseline

Yoav: With API owner hat on, this all depends on how big of a breakage we would have if we want to expose another thing, and keep interimResponseStart where it is, but at the same time change back the responseStart semantics.

... People that collect interim would be a good proxy

... If they got broken, they already got broken when Chromium shipped the first version

... Use counter data would be useful

Michal: I'll follow-up after

Barry: We use responseStart in web-vitals.js, just measuring firstInterimResponseStart won't measure breakage

Yoav: People who use firstInterimResponseStart and have Early Hints would be useful to measure on the Chrome side

... This could inform that decision

Barry: On Chromium side we haven't implemented firstInterimResponseStart in telemetry

Michal: What is firstInterimResponseStart if there is no response?

Barry: 0

Michal: What if we exposed all of the timings, and responseStart is the smallest of those if non-0/non-null

Yoav: You're suggesting two breaking changes rather than one?

Michal: Value that responseStart returned changed recently, the values in dashboards will change but usage won't break

Bas: What I like is that responseStart then does what the name suggests

Yoav: More complex from compat perspective if there's reasonable usage from interim one

Bas: From FF perspective we're interested in people having useful information

Yoav: w/ Shopify hat on, there is code that distinguishes between browsers that determines what the value means.  Would be great to fix that.

Bas: I would prefer a complete solution, to only work on something once

Yoav: Figure out the wanted end state

Barry: I will open an issue (update: commented on the existing issue which is still open)

Cross-origin caching of pervasive assets - Pat Meenan

Presentation recording

Summary

Minutes

Pat: We've discussed this for decades, cache partitioning, etc

... Let's look at it again

... One of the things we did in HTTP Archive ~2mo ago was added sha256 hash of every response body, even things we don't store

... images, responses, etc

... So we can see duplication of resource across web independent of URLs

... e.g. jQuery shipped by thousands of sites

... 17,034 "pervasive" responses

... 150 common across > 1 million sites

... Since this is a SHA hash of the entire body of the response, it doesn't handle bundled code that's been minified and grouped together

... Anything edited to e.g. add/remove copyright
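(For concreteness, the kind of whole-body digest being matched; a Web Crypto sketch with a hypothetical URL, not HTTP Archive's actual pipeline:)

    const url = "https://example.com/lib.js"; // hypothetical
    const body = await (await fetch(url)).arrayBuffer();
    const digest = await crypto.subtle.digest("SHA-256", body);
    // Any edit to the body (minification, a copyright tweak) changes the digest:
    const hex = [...new Uint8Array(digest)]
      .map((b) => b.toString(16).padStart(2, "0"))
      .join("");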

... Ran query against BigQuery for HTTP archive

... CSV of 17k of resources

... Basically 2 or 3 types of resources that showed up

... 3P embeds, YouTube desktop player

... Analytics, ads code.  Common URL used by sites, controlled by 3P.

... Same thing goes for first-party things like Shopify, Wix, where they control the platform

... Then slower-changing long-tail of resources: libraries, CMS, wordpress, jQuery

... Rev less frequently but when new versions get released

... By far most popular things were scripts

... Surprised fonts showed as much as they did, including Font Awesome

... Some binary fonts showing up on many origins

... Some CSS and a few images from e.g. WordPress templates or Wix template

... By far most popular was Google Analytics

... On 14.5 M pages of ~50M

... WordPress by far drives the lion's share of standalone jQuery file usage on the web

... Libraries for jQuery e.g. jQuery UI

... Recaptcha is relatively large at 217K compressed

... YouTube largest one at 800 KB for player, a few million sites as well

... But not just largest libraries, Font Awesome, Google Maps, lot of Shopify code and libraries embedded in sites

...

... Cache partitioning and how we got to where we are

... Most browsers do triple-keyed cache for resources, so no more sharing of downloaded resources across top-level sites

... On Chrome it's top-level site, frame origin and resource URL (e.g. 2 resources from 2 iframes are fetched independently)

... TKC performance impact is negligible, but the data coming from that report and UMA is usage-based.  People that visit the same site aren't going to see a difference.  It's a single or first page load.  Doesn't show in the 75th percentile of UMA data

... Unpartitioned cached risks

... If you put unique data inside payload you can use it as a cookie

... Put User ID in response, re-generate as origin, you can use it as a cookie

... Or, existence of a file at all.  e.g. 32 files you have a 32-bit number depending on what resources you put into a user's cache, existence of those tells you who those are.

... Leaking the history of sites or technologies the user has been to is sensitive.  Not a site decision, it's a user decision.  Example is a map tile or auth provider, knowing you've looked at a map tile for a sensitive location could target you.

... Or knowing you've been to a certain banks' website makes you an easier phishing target

... Site may not care, but individual users could

... Unpartitioned cache doesn't solve jQuery problem as they're all hosted on many sites (unless you're using a well-known CDN which doesn't happen and has connection overhead and everything)

... If we have the client use integrity attribute on resource or on a fetch, we can eliminate the explicit tracking of the payload

... No way to add additional data to the client that the client didn't already know
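(The integrity mechanism referenced, as it exists today on fetch(); the digest value is illustrative:)

    // The fetch only resolves if the body matches the expected digest, so a
    // shared-cache hit can't smuggle per-user data into the page:
    const res = await fetch("https://cdn.example/lib.js", {
      integrity: "sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=", // illustrative
      mode: "cors",
    });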

... Can we identify resources that are well-known, public, static, unchanging.  We allowlist them for example

... Is there a way to declaratively say they're immutable and public (opt-in for consideration)

... Need to know at time of fetch if you're looking at share cache if it's public

... In HTTP we have public and immutable headers, but we don't have anything until response that we know that

... For now let's assume we have new attributes that says it's public and immutable, and you have the integrity

... If there's a trusted cache and it's requested by a lot of clients, can we get to a point where then it's used in a shared cache that we know it's pervasive, unmodified and not targeting specific users

... Not probe-able

... Doesn't solve problems, it makes them more probabilistic

... Doesn't completely eliminate the privacy concerns

... Still leaking information about some level of history

... A browser with an empty cache, probing with a known resource like GA.  If a user doesn't have GA in it, maybe it's never been to a site with GA or maybe it's just not populated.

... Doesn't solve jQuery problem with sites saving same copy at different URLs

... Maybe use SRI for cache indexing?  But security nightmare

... Resources served from a given origin, and for that origin to not be partitioned

... Nidhi and others can give more background on the experiment we can try

... Ran an experiment, manual list items had expired, difficult to tackle without automation

... Brings me to dictionaries

... Is a pervasive dictionary something safer to share perhaps

... May not solve jQuery problem unless we look at pre installed dictionary

... Brotli shows with a small dictionary with common web terms in it is useful

... Can we ship with a larger dictionary, Compression Dictionary stuff, Available-Dictionary that the client has for compression

... Like Brotli dictionary but maybe larger, versioned over the years, 2024 version of "web" dictionary that has the current versions of React and other things commonly seen on the web

... We previously talked about this, e.g. no 1 version of jQuery everyone's using -- and we pin people to stick towards that

... Compression Dictionaries are interesting because even if we ship with one version, subsequent versions would compress very well against it

... Site is sending what it will send, it's still on the sites to update as it wants

... Not just the one version we happen to include

... Deters updates a little bit, but solves problem self-hosting standalone or bundling jQuery within another file

... We will be favoring whatever resources in that dictionary -- does React have a perf benefit in e.g. 2024 that new libraries will have to overcome

... Tradeoff of pushing innovation on the web vs. savings bytes

... For jQuery where would we draw that line?

... Legal issues to sort out about what can be included in a dictionary, are there copyright concerns?

MNot: I like the first approach, it seems promising to me.  I don't know if I would leverage cache-control: immutable and public; they're specific, might cause confusion to reuse them

... Sounds promising

... If predicated on privacy-preserving proxies, maybe we could do some interesting things with that

... Doesn't have problem with 2nd solution which is by choosing what to do in dictionary, we're choosing winners

... Distorts marketplace, seems problematic

Christian: Generally a statement of support

... Thanks for sharing data

... We have a breakout on Wed AM about the problem that could be solved with cross-site caching of pervasive assets

... We download country data on every store across many domains, if could cache it could help

Pat: Country data is example where both paths have intersection and divergence

... If we make it immutable, that exact copy from that origin may be available

... If we include JSON blob in well-known dictionary, even if it's not a library, we could compress everyone's data with the same thing

... Sites could still make those decisions without relying on one known shared version, still compress as well

... Libraries and site-specific code may be verboten?  Large blobs in JSON maybe makes sense?

... Shame for one specific shopify dictionary be cached but other sites not be able to benefit

... Both potential ways to solve Shopify-specific case

Michal: Question about 2nd solution and "Deters updates/code changes"

... If updated to a newer version not included in dictionary, less efficiently compressed

... Better than the status quo today

... Any use-case I can think of, this only improves that problem (people not wanting to update because inefficient)

Pat: Unless there's a strong reason to move to the ".2" version if it's e.g. ~5 KB bigger, and a site w/ infrequent visits, they're usually downloading the full version today.  We give them a magic bullet where jQuery is faster.  Now if the two versions are a different size, the second version may be bigger.

Michal: Over the course of the year we'd slowly see it perform less, but it's better than the status quo

Yoav: Could be a monthly thing

Pat: Where origins or CDNs that have a rolling window of dictionaries available

... If we're asking sites to update dictionaries they're compressing against monthly, manual has coordination problem

... I think there's a people problem, yes cost was way worse yesterday, but new baseline going forward

Michal: Everyone after update is seeing less efficient resources, I think with this everyone would be more efficient going forward

Noam: Can you expand more on the SRI idea that had security concerns?

Pat: If you have the hash of jQuery "1234" and you want to access it from your URL as SRI://1234.  If you pull it directly from the cache, you could in theory bypass CSP where given origin may not serve that resource, but since it was already in cache you could pretend it came from that origin without having to fetch it

Noam: No hash collision?

Pat: No stuffing something into the cache with SRI, it's available from every origin you want to pull it from

... You want to be able to go to an origin to ensure the SRI from that origin matches hash

Evan: When you update the dictionary, do you not need the old dictionary to decompress?

Pat: Browser impl detail -- Chrome is the only one that's implemented?  Chrome caches the decompressed version of resources, otherwise you get into a cascade problem: fetch origin resource, then delta update, etc, you need to keep the whole chain

... Chrome's current version once fetched over wire, keeps the dictionary-decompressed version

... Whether that's compressed on disk or not

... If you have well-known dictionaries, you may want to keep older versions of that dictionary around

Nic: Is it possible to have multiple “standard” dictionaries?

Pat: It complicates things a bit and you’d have to have attributes on the fetch side

… You can use compression dictionaries with the link rel dictionary.

… you’d have to fetch it on demand, and it won’t be usable across origins

… That has privacy concerns

… We could have defined 10 well-known dictionaries, but it’s more complicated than having a single large dictionary
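(The per-site mechanism Pat contrasts with, from Compression Dictionary Transport; a sketch with a hypothetical path, and the rel token has varied across drafts:)

    // Declare a resource as a dictionary for future same-origin fetches; the
    // browser then advertises it via the Available-Dictionary request header.
    const link = document.createElement("link");
    link.rel = "compression-dictionary";
    link.href = "/dictionaries/2024-09.dat"; // hypothetical
    document.head.append(link);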

Bas: It changes the “pick the winner” dynamics

Pat: to avoid all privacy concerns, you want this to be something that the browser already downloaded

Yoav: To tackle the point of deciding winners, the dictionary, have you looked into diffs of React vs. Future React vs. Preact, how much of a difference does that make in practice?

... Do developers care about JS size a whole lot?

... Diff in libraries that perform similar tasks

Pat: React 17 vs. previous version, how effective, it was fairly effective.  50-60% maybe.  How much of a complete rewrite?

... Wrinkle is bundlers, where they rename things differently, work with bundlers to be more consistent with e.g. naming so it compresses better

... If function names and vars, maybe the code flow still compresses away

... Idiosyncrasies that creep in when things get bundled

... How much does Preact look like React for core?  Modules that depend on the two may be similar in how they use the libraries for example

Mnot: Maybe not target any specific libraries, or common functions.  AI ALL THE THINGS. <joke>

Bas: As a matter of principle, I struggle with the idea of favoring any particular distributor of frameworks.  Even 10-20% optics are very bad.

Yoav: Crawl the web, throw all resources into a thing.

Mnot: Creates a feedback loop

Eric: Is there a way where we can have some things that are popular, some of "new" upcoming things?

Bas: Whatever methodology would have to be agnostic

Eric: If I want to publish a new library, if I can get me and 100 friends be in the "newcomers" part of the list

Mnot: A new library already has a lot going against it, adding "now less performant", you'll have a less diverse field

Michal: Relative cost may decrease

... Today the overall increased cost, of a new library is a lot

... Fetching jQuery may be 100% efficient, but a new library is only 10%, but still a lot cheaper

MNot: Today but not for new baseline

Kannan: ??

Michal: Inequality

Bas: Optics of someone having to pick

Pat: Would we be whitewashing by throwing AI on it?  Would pick popular things on the web today anyway.

... Whether or not that's explicit decision to de-prioritize some exact copies of React in library, I'm not sure just throwing machine algorithm at it

Bas: Bias with extra steps

Dominic: User Agents do this sort of thing today, optics not bad.  Brotli dictionary.  Non-perf things where top 200k sites get media auto-play.

... Other optimizations in the browser, bias exists

... e.g. JavaScript Map options

Bas: PGO

Yoav: Dictionary isn't necessarily standardized

Mnot: Might be interested in data with React version N, N+1 bump vs. another library with similar functionality but different code base

Yoav: How much bias would be introduced

Kannan: I feel like part of the discussion with picking winners and losers and biases, almost all decisions we make in standards picks some winners and some losers.  Sounds like valuable optimization that may help users right now.

... Maybe log scale weighing against popularity, present and publish that's what we're doing

Pat: Other wrinkle to throw into it, do we include third-party embeds in this scrape?  Always from same origin

... Like facebook events JS, ga.js, always coming from one URL?

... Pierce cache boundaries in some way?

... If we create dictionary crawl based, those would show up

... Or only things common from different URLs

Dominic: We want users to see the benefits

Mnot: Users also benefit from a diverse ecosystem

Bas: and innovation

Pat: May want to discuss this further

... Keep thinking about both paths, see if there are avenues to make it more valuable

Guohui: What if we throw in some randomness from the browser, so we can see protect against history testing

Pat: You can reduce the likelihood, but not eliminate it

... Is there a line where probabilistic coding goes away

... Some options where rolling the dice you miss the cache intentionally?  Is that enough?

... Or only after seeing N number of sites?

... Some options to make it more probabilistic, but nothing that eliminates all concerns

... You wouldn't want to pre-download all 150 resources, to completely eliminate it

... Those resources get rev'd frequently

... There is some line where each browser vendor's team makes trade-offs, where it is and if it's absolute I'm not sure where it lands for everyone

... No way I've seen to be able to do some form of piercing

Yoav: One thing that could be interesting is to compare approaches from benefit perspective

... Seems to me naively Compression Dictionary approach is more resilient to changes over time

... Where exact-cache-match would potentially be less efficient

Pat: Dictionaries resilient to change, solve first-visit problem

... For something like a YT embed, that put videos into their pages, the exact match over the course of a year (YT revs code) the dictionary will age over time

Yoav: If we used shared cache for Compression Dictionaries that would be better

Pat: If you can exclude resources you put in shared cache from this dictionary it would be better

Philip: With Shared cache we'd want to avoid steering site developers toward specific vendors, e.g. Google Analytics and Boomerang are always top sites.

Pat: If we're OK with self-declaring immutable and it shows up on e.g. 5+ sites then you lower that bar significantly.  Doesn't have to be GA well-known

Nidhi: In our experiment, several URLs were changing, but the changes were minor, so dictionaries would be more resilient to this

WebPerf Admin / Specs / Incubations

Summary:

Yoav: We have LCP adopted but e.g. ElementTiming is not

Michal: Ian refactored, some of those things moved into LCP

... ElementTiming is minimal now

Michal: Chromium's implementation that supports LCP and ElementTiming was leveraged for Container Timing

... Under the hood, ElementTiming off, Container Timing could still work

... ElementTiming attribute
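(For context, the existing ElementTiming surface under discussion; elements opt in via an elementtiming attribute and entries are observed like so:)

    // Assumes markup such as <img elementtiming="hero-image" ...>
    new PerformanceObserver((list) => {
      for (const entry of list.getEntries()) {
        // identifier echoes the elementtiming attribute value
        console.log((entry as any).identifier, (entry as any).renderTime);
      }
    }).observe({ type: "element", buffered: true });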

Yoav: Expand semantics of Element Timing or have opt-in to this new Container Mode

Bas: From the perspective of using Container Timing, would it be adopted in some form close to what it is now?  Would it eliminate the need for ElementTiming or not?

Yoav: Feels like a superset

Barry: I thought we had not adopted ElementTiming because use-cases weren't clear.  Isn't Container Timing a superset?  How can we say one has a use case and the other doesn't?

Bas: Where and when the use-case for ElementTiming wasn't clear, do those concerns apply to Container Timing?  Any more context?

... Seems like some clear use-cases for Container Timing that it seems reasonable

NoamH: Some argument that ElementTiming didn't support use-cases of containers, so a variant of ET with containers would solve problem

Bas: Individual elements may not have correct granularity so unless you have Container timing it's not useful enough

Michal: If you aggressively apply ElementTimings it gives you a lot of visibility into many timings

... Also argument that polyfills could do this

Yoav: Some argument about :visited visibility.  Maybe those arguments are no longer valid now that the visited cache could be partitioned?

Michal: That's already leaky, and ET gives you a bit more control

Yoav: If we eliminate that leak, observing the paint of something that doesn't leak

Michal: FCP is exposing same details, for a specific :visited

Barry: Are we saying there's interest in CT vs ET, or both?

Yoav: Seems more interest in CT potentially

... Seems too early to talk about adoption, but glad to see interest

Michal: Web developers are positive, Chrome has shipped, if Firefox would ship would that be the bar?

Bas: Don't know we've done the work, LCP work would make it easier to do now

Nic: It’s a good rough start for people that may want to use web performance

Michal: there’s no way to know what resource to fetch before the first byte arrives

… but in practice you really need that resource to be fetched and a slot to stuff that resource into

… Early hints can get the resource fetched early, but that won’t help you if the LCP resource is not there

… The slot on the page is a different “byte”

… Both timings are interesting

Bas: assumption that other than the preloadScanner the static DOM contains all the slots

… that’s not true today

Michal: pendulum is swinging

Bas: DCL

Barry: We published some analysis on this, put some recommendations for people for how to improve their LCP

... e.g. we can say things like stop optimizing servers, or minimizing images, because developers will often concentrate on the wrong part

... e.g. slow part is images in HTML is being detected later

... Getting image downloaded faster isn't going to help

Yoav: I think TTFB isn't the right tool, ResourceTiming initiator, play around with resource loading graph

Bas: Complete critical path you'd analyze locally

... But to see if you did good and optimized

Yoav: Critical path for LCP could be consent management, third party

Barry: Redirect times, measure locally, go to site via an ad that goes to many things, you need RUM to measure

Michal: Don't disagree, but it's focused on resource flow. There's two flows.

Yoav: Create 2 initiators

Michal: TTFB in my mind it's saying what's the earliest time the slot could've been discovered?  Resource initiator.

... And then you finally discovered the actual slot.  Load delay.

Bas: For IMG, is it the time you get the IMG, the bytes describing the image tag, or when the renderer processed the IMG tag?

Michal: Prescanner would be sufficient I think

Yoav: RT gives you that

Michal: LCP gives final time after render blocking has been unblocked, you already have that

... If it took a long time, what were we waiting for?

Bas: ElementTiming or some API should have some field, this was the "last" bit of work I had to do to get it

... As a site creator you could make changes

... If decoding of IMG that was big to decode, the last step to get image on the screen it was decode.  Decode was bottleneck.

... Decode faster, change decompression, then the thing you need to fix is the DOM element created or reflowed.

... Way to tell developer to optimize X to get faster

Barry: We've done this on the Chrome side, we have 4 segments.  Before TTFB, doc problem.  From TTFB until resource download is initiated.  Download time.  Time after download until rendered, decode time.  Those phases work pretty well for optimizations
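(A sketch of stitching those four segments together in the field from the LCP and Resource Timing entries; the variable names here are illustrative, not spec'd:)

    new PerformanceObserver((list) => {
      const lcp = list.getEntries().at(-1) as any;
      const nav = performance.getEntriesByType("navigation")[0] as PerformanceNavigationTiming;
      const res = lcp.url ? performance.getEntriesByName(lcp.url)[0] : undefined; // absent for text LCP
      const ttfb = nav.responseStart;
      const loadDelay = res ? res.startTime - ttfb : 0;           // discovery until fetch start
      const loadTime = res ? res.responseEnd - res.startTime : 0; // download
      const renderDelay = lcp.startTime - (res ? res.responseEnd : ttfb);
      console.log({ ttfb, loadDelay, loadTime, renderDelay });
    }).observe({ type: "largest-contentful-paint", buffered: true });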

Michal: What happens w/ Early Hints?

Barry: EH doesn't work, as implemented by Chrome is busted, both get resource, put in cache, then forget it ever got it.  Not linked to download times.

... Duration effectively 0

Noam: Race to download fetches

Bas: I don't think we fix race cache with EH things

Barry: One reason why Chrome devrel team didn't want to move responseStart because it doesn't help w/ metrics

... Agree this is a powerful way of doing it

... At today you can still carve up timings

Michal: ElementTiming, right now we have loadTime, and renderTime.  Often there's some amount of delay from the actual paint instruction until you render.  Some proposals to expose paint that issued the final image.  First Layout where that image was in the layout.

... This slot was already in the page

... Start of the layout

Noam: Resource was ready?

Michal: Document was ready to have the resource slotted in if the document was available

... Those pieces all need to align

... Document-centric

... If there was no problem getting to layout to slot, then etc -- you look into ResourceTiming to find out why?

Michal: When is the first time the actual IMG element goes to get it

Bas: Creation of the frame, layoutstart

... Creation of DOM element, IMG maybe special-cased to decode earlier

... Not yet slot you're talking about

... What defines the slot here?

Michal: In my mind, if resource is fully loaded and cached and available, you could begin decode now

Bas: IMGs get cached decoded

Michal: Will improve performance of decode once requested

.... Promse.all() on those things

... Sounds like timings are all more document-centric than expected

Barry: Always going to be boundaries when you have many different segments

... Due to X reasons

... Difference between download finishing and paint happening, and there could be various reasons

Bas: When you're exposing to web developers, you assume this is when the browser had doc, had your slot, up to user agent to do all the other work.  You can't do anything about that anyway, so we don't give you times

... But you can affect it, so what should we expose?

Michal: Barry's breakdowns are useful.

... In particular let's say a 10 second long task, IMG maybe already decoded, we have a 10s render delay in breakdowns.  An alternative is to split that further or move those timings.  I want to know about the input delay.

Barry: We've been advocating these 4 breakdowns.  Could have more, but won't cover every scenario. Simplicity is better, would rather have 4

Tim: Perf stuff is more like "enhance" on TV shows.  Each time you zoom in you get more detail.  Every metric is a bit of a black box consisting of sub-phases and breakdowns.

... Good exercise to make sure we can always answer "so what" do we do about it?

... Here's exactly what's delaying my render

... Do we have the proper metrics underneath, so I can figure out granular things for what to do about it

... Render delay is one of the harder ones to pinpoint the "why"

Bas: Can include anything browser wants to do

Barry: At what point do we gather too much info in field, and move to tracing instead

... Same discussion with INP.  3 phases instead of 4.  Need more details.

... Latest proposal is everyone needs to collect 56 pieces of data and send back to RUM provider

... That's too much for average user, developer

... We'd all love that level of detail

... Too much is dangerous

... "Enhance" is great, but simplicity wins a lot of the time

Bas: Tim's suggestion is there is something relatively simple.  If it is possible for UA to say primary factor in render delay is long running JS on main thread, or X X X, then actionable for most developers

... That's a thing that's hard for the user agent to do well

Barry: We've looked at this into things like Lighthouse

... Works great when 95% of the time it's X

... But harder when less percentages

... Gets really messy

... We want to do that, but we're having problems aggregating

Bas: Not impossible for UA to categorize things in its top level event loop

... Provide list (censored) saying N time decode, N time internal, etc

... Quite extensive thing to do

Noam: Usually I feel there's no such thing as too much data

... Even if not actionable by "regular" web dev, but could be useful if shared by UA vendors

... e.g. issue in 1% of cases, repro is hard.  But if aggregate data and internal metrics, then you can share that with a bug report, it can be useful to solve issues.

Bas: We can already do that, repro in the wild.  Same as grabbing a profile.

Noam: Without enough samples

Bas: We're not just sampling profiler, most essential information in there

... In FF that mechanism already exists

... I suspect the same is true for Chrome

Barry: Even w/out profiler, ResourceTiming has 100s of bits there.

... I'm trying to distill that down to guidance for developers

Tim: Can say same thing for any standard out there

... e.g. Push giving "foot cannon"

... We have to think of consequences, side effects.  Onto developers and tools for how to use it.

... Going deeper enables people to get to the point where RUM tooling and monitoring solves problem

... You need someone to read/interpret to mine insights

... One of the things really exciting with attribution, RUM tooling isn't just data now

... LCP blipped, this element, particular delay

... Most people monitoring to solve problems

... Ability slice and dice we give ability to tools to solve problems

... Some tools will just get all data and overwhelm

Nic: LongTasks was a tough API to collect data, not much actionable

Bas: More interesting thing is what things are more actionable and implementation complexity

Adam: For TTFB we thought people always wanted to know about the network time, but from the discussions here it's the time when the rendering process reads it?

Michal: A bunch of mechanisms to feed bytes early, even coming from CDN, maybe improve performance of that fetch.  But your specific request requires some server processing.  A little bit of byte fetching, e.g. Early Hints, knowing when it first arrived is useful.  But really what you want to know is when the server was really done processing, that's when TTFB matters.

Adam: When was the server done, from network POV, we know when they get headers.  But there may not be a render process, other work needs to be done first.

Bas: Confusing to web developer if they included that time

... Slow machine can affect render process

Michal: If it takes 2.5 seconds to hit LCP, is it because render was slow from too much JS, or huge document, or because rendered is sitting around "idle", that distinction may be interesting

... 2 second in render heating up caches

... Could be useful on its own diagnostic

Philip: We've seen cases where boomerang.js has had a 2-second load, and it's because the host is too busy to process it

Michal: We've talked about gaps in NT.  How do we turn it into 4 useful values?

... We're trying to say higher-level what's important, then "Enhance"

Eric: A set of things that preclude useful work

... And are useful work

... e.g. TTFB where I didn't have a connection yet

... If I don't have a connection yet, nothing useful will come across it

Anadeep: To distill this down: whenever TTFB comes up, browsers differ in the intricacies of how they deal with it.  Would it make sense for it to be a higher-level metric?  The time it takes for the entire request to complete; each browser would expose that number based on its own implementation specifics.

... If you want to monitor metrics, as Barry pointed out, you want to know if it's web server, CDN, front-end trying to do too much

Bas: RT already gives you that

Anadeep: Fetching resources, parsing resources, in overall rendering process timeline, chunk into 3/4 key parts.  One is network request and network response.  Give the user developer insight into how much time it's thinking for the server to get back.

... Give a simple number for how to interpret

... Give FCP

Bas: Isn't that NT responseStart?

Michal: Maybe we leave TTFB exactly as undefined as it is, but a LCP breakdown or whatever it is, all complexity

Philip: TTFB is a duration, not a timepoint

... It has a undefined start and end

Barry: It has a defined start and undefined end

Bas: TTFB is relative to origin 0?

Michal: Yes

Jase: Other metrics using TTFB, are they not using time origin?

Barry: Can't understand LCP unless you understand TTFB

Day 2 - Tuesday

Web Performance APIs in Excel - Noam Helfman

Presentation recording

Summary

Minutes

Conditional tracing - Noam Rosenthal

Presentation recording

Summary

Minutes

Scheduling APIs

Presentation recording

Summary

Minutes

Scott: Update since last year

… Been focused on schedule.yield

… first proposed in 2019; with the focus on INP, it got shipped in Chrome 129

… TAG review wanted a big picture explainer

… outlines the direction and things we’re thinking about

… Also nascent ideas around extensions

… scheduling matters during congestion

… browsers chunk up work and prioritize it internally. This exposes that to developers

… improving responsiveness is the focus

… long task can block event from processing, or frames from rendering

… Sites can use this to prioritize work and improve user perceived latency

… Lots of things run async, we keep adding more

… the HTML spec guarantees that tasks from the same task source run in order

… but different queues can be prioritized compared to other queues

… Rendering is special - it’s a separate task with variable priority

… Other browsers may vary the frame rate

… scheduler.postTask was the first API and thought of as a tool to improve responsiveness

… helps developers break tasks into individual pieces

… Modern API - promise, TaskController (inherits from AbortController), Signal

… defines the ordering between the tasks

… “user-visible” is the default

… scheduler.yield() - doesn’t require a function boundary, yield then resume

… heard a lot from developers that they are hesitant to give up the thread, as they’d need to wait until everything runs

… With yielding, if you’re breaking things up, setTimeout delays the continuation. With yield, the continuations continue to have the same priority as the task that scheduled them

… Yield inherits priority from the task that called it

… Treating it conceptually as a single async task

… Also works for requestIdleCallback. Yielding from rIC would keep the continuation as “background”

… Idle until urgent - you want to yield but you may lose data if e.g. navigation happens

… Related to async event listeners which may be related

… Rendering!

… You’re yielding but want the yield to only return after the render has actually happened

… proposed scheduler.render(), but people weren’t happy

… if the site knows that the above the fold content was loaded, and want to display ASAP

… currently no way to signal this to the browser

… seems bad to couple them

Yoav: For the renderNow() one, what do you think are the use-cases?

... How would developers know that all the content they need for rendering is already there?

Scott: Search has talked that they know

... Amazon may know this as well

Michal: Rumor: servers will block the response so the byte stream temporarily pauses, and the parser forces a yield point

Yoav: Introduce that delay, in order to force parser heuristics, to flush

... Seems like you'd need a declarative element

... Better for this use-case maybe for others

Michal: Previous thinking, we have a API to request not resuming after next rendering is done

... Inline script that asks to not resume after rendering is done

... Nothing explicitly blocking parser from moving forward

... A bit of magic

Yoav: Sounds very indirect

Scott: Suggest de-coupling them

... I don't think it's the right fit, separate use-cases

Noam: Link rel=expect (without blocking=render) pointing to an ID, hint expecting a specific element, yield parser

Michal: Opposite of blocking=render, don't continue doing more work until

Noam: Link rel=expect Important milestone in document

Yoav: Wonder if other implementers have parser heuristics regarding blocking or not?

Olli: User-input very old heuristics

Yoav: No user input, loading HTML spec, when do you first render

Bas: Don't render in short period after page load

Olli: Timer based (short)

... Multiple heuristics

Michal: Chromium has some heuristics and timers as well

... e.g. first bytes of body start streaming

... Later than document.commit

Bas: Not relative to origin

Yoav: Not relative to request, relative to response

Bas: Main point is to avoid painting something useless

Yoav: You don't necessarily want to block here, but if you see this, render

... Different semantic, link rel=expect with some other signal

Bas: Worries me there are ways to use it that are stupid

Michal: Idle-Until-Urgent, status quo is anyone allowed to register event listener

... Default priority is infinite

... Hold that task as long as you want

... Lot of book-keeping happening

... If they try to be nice and be good citizens, they see a loss

... Guarantee will run before doc is unloaded

... Go a long way for better actors on the web

Noam: Run continuation before unload

Michal: Is it possible a task broken up with setTimeout, would just yield

Noam: If yielding for input, in an event-prioritized task?

Scott: No, main app would do small DOM update then do big thing

... Not update app state

Michal: Maybe if browser provided, script run in background as page is unloaded

... Maybe flush with higher priority

... Too much work and it doesn't run

Alex: Infinite amount of work and yielding, browser can say it's done

Michal: If I was blocking a doc, now yielding and I get 100ms, very good policy

Noam: Another thing we could do is priority if a navigation is pending, not yield in the same way

... Started nav but no response yet, several things we could do w/out additional API

... Polyfill-able

Olli: Plenty of idle time after beforeunload and when we actually unload

Michal: Reasons why existing players think it's not enough, lack of guarantee is important

Yoav: Would you get that guarantee in cases not today?

Michal: Crashes no, in case where browser is trying to unload, browser is too eager to unload previous document is feedback we hear over and over

Yoav: Browsers maybe be less eager?

Scott: Changed recently to be more eager, want to be as fast as possible

Christian: OS is probably interrupting app

Bas: Process will be at risk of being killed once it says no longer it should be in foreground

Yoav: Don't kill prev rendering process until the next one is ready

Michal: Is explicit signal important, or just anything scheduled to go FIFO, good enough?

Scott: One concern is if you have >100ms continuation in main app state change

Bas: Hydration taking >2s

... Nav to something else

Michal: A well-behaved site can abort doing that work

Scott: Minimal work, like to do the minimal thing and unblock as soon as possible

... Need to investigate when all continuations finish running.  Good enough with guarantee?

Michal: Event listener and you yield, priority?

Scott: default, user-visible

Michal: Opt-in was scheduler.postTask() instead of scheduler.yield()?

Scott: You can add same task to either-or

Alex: Perverse incentive we want to do as much as possible and aggressive as possible, try to solve somewhat with fetchLater(), what kinds of other things do people want to do at unload

Scott: Could be processing that you need to send that data

... Could do that in event handler and fetchLater()
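(The fetchLater() shape being referenced; experimental, per its explainer, with illustrative URL and payload:)

    const serializedState = JSON.stringify({ /* app state, hypothetical */ });
    // Queue a beacon the browser will send even if the page unloads before
    // script ever runs again:
    const result = fetchLater("/beacon", {
      method: "POST",
      body: serializedState,  // must be fully materialized up front
      activateAfter: 60_000,  // also send after 60s even without an unload
    });
    // result.activated reports whether the request has already gone out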

Yoav: Processing has to happen right before unload

Michal: If you yield for being a good citizen, and doesn't run, fetchLater() doesn't get to go

Nic: in Boomerang we try to send resource timing data at the very end, as a concrete example

Scott: In this case it wouldn’t help

NoamR: another example is putting things in local storage

Michal: Could you do that work incrementally?

Nic: Depends. We use a trie of all the urls on the page

Bas: Is this a proposed API?

Scott: brainstorming. It’s similar to yield but a different concept

Michal: Only when we’re at that last moment do we know that this needs to be flushed

… A snapshot of the page needs to be very late

… Other things can run in the background, in idle periods

NoamR: I’d be curious to see how far we can go with the polyfill. Would inform decisions

… e.g. If you’re after “beforeunload” you don’t yield

… promise race between yield and visibility change

… Phil Walton did some work

Scott: maybe we need to come back to the group when we have more

Michal: Even if a polyfill works, having it baked in would help adoption and be worth the cost

Bas: What’s the downside? Could people abuse/misuse it?

Scott: We could add limits

NoamR: I can see libraries that use it and create too many such tasks

Michal: Right now events have no way to get interrupted. It can only be better than status quo

NoamR: It can make things worse with continuation task getting to block things

Michal: Only one of them

RUM Pain Points 2024

Presentation recording

Summary

Minutes

Nic: We talked about RUM pain points already in previous years

… Made pretty good progress on these things

… We’ve been thinking through the things that are important to RUM providers

… given that list, here are the things we’re struggling with

… So we started looking into forming a CG

… Hoping to get more folks involved beyond the WG (because no W3C membership is required)

… Not be developing specs, but bring inputs to the WG

… Waiting for the W3C to review the proposal

… << Click here to sign up >>

… Took a poll amongst the interested folks

… Non blocking script - mPulse uses preload and a snippet as a fallback, but an attribute would be nicer

… Interested in server timing and increasing its semantic stability across the industry (see the sketch after this list)

… Better RT initiator support - building the resource fetching tree and do critical path analysis

… Small RT enhancements, LCP issues, Container timing..
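(Sketch for the Server Timing item above; the metric names are made up, and the lack of shared semantics for them is exactly the pain point:)

    // Response header:
    //   Server-Timing: cache;desc="HIT";dur=0.3, db;dur=52, app;dur=47
    const nav = performance.getEntriesByType("navigation")[0] as PerformanceNavigationTiming;
    for (const { name, duration, description } of nav.serverTiming) {
      console.log(name, duration, description);
    }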

Christian: LCP popup behavior?

Tim: you got a hover event that pops in content and that becomes your LCP

Bas: Standard position on Resource Timing enhancements?

Yoav: too early

NoamR: browser work funded by the interested parties would be great!

Nic: There would be challenges, but it would be interesting to consider

Bas: Non-blocking script loading. Are browsers even consistent in how they work?

Olli: Chrome changed async loading a few years ago

Philip: The issue is about blocking the onload event. Only preload doesn’t block the onload event (Ref to original bugzilla ticket: https://www.w3.org/Bugs/Public/show_bug.cgi?id=21074)

… Also true for dynamic imports that started loading before onload

… The iframe hack is tackling it

Yoav: preload was supposed to fix this. This would just be additive

Nic: Yeah but that would lead us to a better future

NoamR: Server timing standards were already discussed. So it might be something that the CG can take on and create a registry

Olli: question about observer effect - have we considered notifying the site when they are misusing the APIs. E.g. when adding a million user-timing entries

Yoav: maybe we should limit the buffer size?

NoamH: configurable limited buffer?

Nic: we have that for Resource Timing
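(The Resource Timing knob Nic refers to:)

    performance.setResourceTimingBufferSize(500);
    performance.addEventListener("resourcetimingbufferfull", () => {
      shipEntries(performance.getEntriesByType("resource")); // shipEntries: app-defined
      performance.clearResourceTimings();
    });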

Bas: a very common observer effect

NoamR: maybe we could make it so that all that work is done in a worker. Create a PerformanceObserver in the worker and the worker gets all the entries, but it’s hard

… In chrome you could move the objects between threads, but not serialize it

Nic: Otherwise, there’s a RUM archive session at 6pm

… Data on how real users are using the web

Day 3 - Wednesday (no WebPerf meetings)

(breakout sessions)

Day 4 - Thursday

Service Worker TimingInfo

Presentation recording

Summary

Minutes

Exposing style/layout

Summary

Minutes

Soft Navigations

Summary

Minutes

Long Animation Frames: Status & what’s next

Presentation recording

Summary

Minutes

Protobuf encode/decode

Presentation recording

Slides

Summary

Minutes

Task Attribution: what is it, and how it might help beyond soft-navs (...RUM tooling, LCP hover, async event timing, resource initiators…)

Summary

Minutes

Javascript loading

Presentation recording

Summary

Minutes

Monitoring and Deployment Principles doc

Summary

Minutes

Day 5 - Friday

(joint meetings with WHATWG)

Minutes

fetchLater

Summary

Scheduling APIs

Summary

CompressionStreams

Summary