WebPerfWG call - September 12th 2024

Participants

Barry Pollard, Amiya Gupta, Sean Feng, Noam Helfman, Alex Christensen, Nic Jansma, Giacomo Zecchini, Michal Mocny, Annie Sullivan, Hao Liu, Ian Clelland, Aoyuan Zou, Carine Bournez, Jase Williams, Dan Shappir, Pat Meenan, Mike Henniger, Yoav Weiss

Admin

W3C Community survey (again)

Please take if you get a chance - closes tomorrow

TPAC!!

Minutes

https://github.com/w3c/event-timing/issues/139

Event Timing + Interactions

Michal: The specific context is issue #139, #124
... Related to more tightly coupling with Interactions and Interaction ID
... Mozilla has EventTiming but not Interaction ID yet
... Quick Review
... A single interaction by a user may be many events dispatched to different parts
... Pointer down then up, click synthetically generated
... Any single part there might be some pieces in there, it can get complicated
... Takes effort to interpret raw stream of timings
... Much easier for browser to do so
... Impossible to do so from EventTiming as it does its own filtering
... Goal of interactions is to count interactions by user, group by those interactions
... When user inputs, there can be any number of events dispatched
... But fundamental events, and a pattern
... Goal is to say this is one user interaction
... Key frames are presented
... Interesting to consider interactions as an umbrella of time that could be interesting to the user
... From interaction to when they saw the paint
... Minimum set of events so you know the time range involved and can do attribution
... There are some exceptions
... List of raw Event Timings
... Only report discrete events, not continuous, but some are rather noisy, e.g. drag
... Some interesting events are not captured by EventTiming today
... Assign Interaction ID in the state machine
... Sequence of operations based on the order events are dispatched
... Decide how to group things together
... Need the raw stream which only the browser has
... Goal is to map to what the user did, not the browser events
... Goal is to reduce redundancy, and make it actionable by developer
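Editor's sketch of the grouping idea discussed above: a minimal example, assuming entries shaped like Event Timing entries that carry an `interactionId` (with 0 meaning "not part of an interaction"). The `EventTimingLike` and `InteractionSummary` types and the `groupByInteraction` helper are hypothetical names for illustration, not part of any spec or library.

```typescript
// Minimal shape of an Event Timing entry; only the fields this sketch needs.
interface EventTimingLike {
  name: string;          // event type, e.g. "pointerdown"
  startTime: number;     // ms, when the input happened
  duration: number;      // ms, from input to the next paint
  interactionId: number; // 0 means "not part of a user interaction"
}

// One row per user interaction (hypothetical summary type).
interface InteractionSummary {
  interactionId: number;
  entries: EventTimingLike[];
  startTime: number; // earliest startTime in the group
  duration: number;  // longest duration in the group
}

function groupByInteraction(entries: EventTimingLike[]): InteractionSummary[] {
  const byId = new Map<number, InteractionSummary>();
  for (const entry of entries) {
    if (entry.interactionId === 0) continue; // skip non-interaction events
    const group = byId.get(entry.interactionId);
    if (group === undefined) {
      byId.set(entry.interactionId, {
        interactionId: entry.interactionId,
        entries: [entry],
        startTime: entry.startTime,
        duration: entry.duration,
      });
    } else {
      group.entries.push(entry);
      group.startTime = Math.min(group.startTime, entry.startTime);
      group.duration = Math.max(group.duration, entry.duration);
    }
  }
  // Map iteration preserves insertion order, so interactions come out
  // in the order they were first seen.
  return Array.from(byId.values());
}
```

A pointerdown, pointerup and click sharing one ID collapse into a single row whose duration is the longest of the three, which is roughly what a higher-level library would report as "one interaction".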
... What we've learned from real world
... Some improvements from different origins
... Some of the highest ranking origins saw some of the largest gains
... Interaction IDs meant to highlight most interesting events and animation frames
... Long Animation Frame API (LoAF) does attribution for animation frames
... EventTiming still useful as it reports the event dispatch, helps repros
... LoAF has a different perspective, but supplemental, does more attribution of scripts in frame
... Model that seems to be working for attribution, is to use labels from EventTiming (processing time ranges) to map to LoAFs
... Get those LoAF entries, you have the full animation frame timeline. Then you can get all interesting entries
... "event-centric" perspective to find a repro in lab
... "animation-frame-centric" to define breakdowns and attributions, useful in aggregate
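Editor's sketch of the EventTiming-to-LoAF mapping described above: a minimal example, assuming the match is a simple time-overlap test between an event's processing range and each animation frame. The overlap rule, the `EventRange` and `FrameRange` types, and the `framesForEvent` helper are assumptions for illustration; the minutes do not specify the exact matching rule.

```typescript
// Processing time range taken from an Event Timing entry.
interface EventRange {
  name: string;
  processingStart: number; // ms
  processingEnd: number;   // ms
}

// Time range of a long animation frame entry.
interface FrameRange {
  startTime: number; // ms
  duration: number;  // ms
}

// Returns the frames whose time span overlaps the event's processing range.
// Generic over F so callers keep any extra fields (e.g. script attribution).
function framesForEvent<F extends FrameRange>(event: EventRange, frames: F[]): F[] {
  return frames.filter((frame) => {
    const frameEnd = frame.startTime + frame.duration;
    return frame.startTime < event.processingEnd && frameEnd > event.processingStart;
  });
}
```

Once the matching frames are found, their own attribution data can be used for breakdowns, while the event entry remains the handle for reproducing the issue.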
... Issue #139
... EventTiming has "event" entries and "first-input" entry type
... Main difference is "first-input" doesn't have a duration threshold, all first inputs added
... Also an older API, similar to interactions
... Overlaps with goals of Interaction ID, but we've evolved to handle more use-cases
... Question: Should we define first-input in terms of first event with a non-0 interaction ID
... Second question: A single interaction may have more than a single EventTiming. Often you have to measure future events before you can dispatch first-input
... Most interesting is not the very first event (click, no pointerdown). Keypress/beforeinput not keydown
... So "first-input" is only useful for initial delay (FID), but if you care about latency of first interaction, you want more than just the first entry
... Should "first-input" report the whole list of entries for that first interactionID, or maybe just the longest
... For EventTiming the default duration filter is >= 104ms; you can change it via PerformanceObserver, but that motivates eager JS loading
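Editor's sketch of the two questions above: a minimal example, assuming "first interaction" means the earliest entry with a non-zero `interactionId`, and then exposing both the whole list for that ID and the longest entry. The `TimedEvent` and `FirstInteraction` types and the `firstInteraction` helper are hypothetical and only illustrate the options; they do not reflect a decided API shape.

```typescript
interface TimedEvent {
  name: string;
  startTime: number;       // ms
  processingStart: number; // ms
  duration: number;        // ms
  interactionId: number;   // 0 means "not part of an interaction"
}

interface FirstInteraction {
  entries: TimedEvent[]; // every entry sharing the first interaction's ID
  longest: TimedEvent;   // the entry with the largest duration
  inputDelay: number;    // FID-style delay of the earliest entry
}

function firstInteraction(all: TimedEvent[]): FirstInteraction | undefined {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = all.slice().sort((a, b) => a.startTime - b.startTime);
  const first = sorted.find((e) => e.interactionId !== 0);
  if (first === undefined) return undefined;
  const entries = sorted.filter((e) => e.interactionId === first.interactionId);
  const longest = entries.reduce((a, b) => (b.duration > a.duration ? b : a));
  return {
    entries,
    longest,
    inputDelay: first.processingStart - first.startTime,
  };
}
```

In this sketch the earliest entry only yields the input delay, while `longest` captures the latency of the interaction as a whole, which mirrors the distinction drawn above between FID and first-interaction latency.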
... Alternative way to solve, via Issue #124
... Part of this issue stems from buffering strategy we have for EventTiming.
... Share a single buffer, capped to 150 entries
... But events can fire in quick succession. Buffer can flood and be filled up before Observers are registered
... In Chrome ~30% of interactions are dropped from EventTiming buffer before first PO is registered
... Maybe we change the approach to buffering?
... First Interaction to page is uniquely useful
... But events with interactions are more useful than regular events
... Ensure first event happens to be in the buffer always
... Default 104ms threshold doesn't apply to interactions, so maybe all interactions regardless of duration should be included?
... We have some choices
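Editor's sketch of one of the buffering choices raised above: a minimal simulation, assuming a shared buffer that drops new entries once full, plus a reserved slot list so entries of the first interaction are always kept. The `FirstInteractionAwareBuffer` class is purely hypothetical; the 150 cap comes from the discussion, everything else is an illustrative assumption rather than a proposal from the call.

```typescript
interface BufferedEvent {
  name: string;
  interactionId: number; // 0 means "not part of an interaction"
}

class FirstInteractionAwareBuffer {
  readonly entries: BufferedEvent[] = [];                 // shared, capped buffer
  readonly firstInteractionEntries: BufferedEvent[] = []; // reserved, never dropped
  droppedCount = 0;
  private firstInteractionId = 0;
  private readonly maxSize: number;

  constructor(maxSize: number = 150) {
    this.maxSize = maxSize;
  }

  add(entry: BufferedEvent): void {
    // Remember the ID of the first interaction ever seen.
    if (entry.interactionId !== 0 && this.firstInteractionId === 0) {
      this.firstInteractionId = entry.interactionId;
    }
    // Entries of the first interaction bypass the cap.
    if (entry.interactionId !== 0 && entry.interactionId === this.firstInteractionId) {
      this.firstInteractionEntries.push(entry);
      return;
    }
    // Everything else competes for the capped buffer; overflow is dropped.
    if (this.entries.length < this.maxSize) {
      this.entries.push(entry);
    } else {
      this.droppedCount += 1;
    }
  }
}
```

With this policy a flood of early events can still fill the shared buffer, but an observer registered late would at least find the first interaction intact.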
Noam: Diagram of interactions and frames, what I'm familiar with, that may be true in many cases, but in many of our cases the first paint can be the most important one
... PointerDown and MouseMove and PointerUp if you're selecting a range of characters
... In many cases, it could be interesting more in the first event, and later frames would be less interesting
Michal: I would not propose to replace first-input entry, I would say we need all of them
... Or just collapsing duration down to the first, I think the longest is better than the first
... First-input will always report a single entry
... We could report a list of events, breaking change?
... Most snippets that I see that do FID, register PO and take [0]th value, which they wouldn't read
... EventTiming also reports every event, assuming they meet the duration threshold
... If you use higher-level library, we will group those together for you, and decide the 2nd may be more interesting as it's longer
... Under the hood you can have full list of entries
Yoav: A lot of cases where the very first input is before JavaScript ever loaded on the page so it was very fast but useless.
... First Input is not uniquely significant in all cases
... Here you're suggesting to expand that concept to interactions
Michal: FID was flawed. The first-input and only measuring inputDelay of the first input
... The first input of an interaction in Chrome will still report the whole ET data, including processing and presentation time
... The very first variants of ET did not do that
... FID is a metric that only looks at the first part of it. Only report on input delay.
... Even if you only report the first part, with FID you can still look at this whole range.
... Problem with hydration libraries is they yield until first input arrived, you'd get no input delay, then sync render the whole page and reprocess the event
... React did this for a while
... Separate issue where you might not do anything interesting because you're waiting for JS to load. Event doesn't do anything. That's not a timing issue but perf metrics don't capture that
Yoav: For pre-hydration case, the focus on FID made people say this is a great pattern as we're getting good FID scores, moving to interactions won't necessarily change that
Michal: True, the ecosystem is solving it because it's still a problem
... Many frameworks have changed time to bootstrap and hydrate, better
... Can use semantic encoding, forms, anchors that work
... Ecosystem is shaking out that way
... Either way, it would be difficult for a perf metric to understand whether it did what the user expected
... Cases where we think things are fast when it wasn't
... One option is to measure the whole span of time
... Another which is if interaction triggers fetch event... discussion at TPAC to measure across that
Yoav: Issue is moving from first-input to first-interaction, is there a way to get better coverage. You can do that in userland via interaction IDs, when they go above the threshold
Michal: Not sure if there's precedent about being clever for entry buffering
... Change first-input entry to be part of interaction ID. Existing consumers would handle that nicely
Noam: Could we provide an API hinting when the full interaction has ended
... For example pointerUp, considered end of interaction, or indicate Element ID that you're interested in observing. End to end from first input to last activity you're interested in.
Michal: Another topic Noam's covering at TPAC about tracing annotations, related to LoAF scripts
... What browser thinks of a task is different than developer knows. Might want to encode time ranges and link things together and have attribution
... EventTiming might be a good fit for that as well
Noam: You need an event listener to have an interaction?
Michal: You don't need to, we measure even if no registered event. Default browser actions.
Noam: If they did they could provide hints on the ending condition of that interaction
Michal: Philosophically we tried to change it so the values provided don't change
... We have Marks and Measures if you know start/end points
... Proposal from Noam can be a lighter-weight and higher-performance that's better for tooling
... OTOH we have ElementTiming where annotations can expose more
... Change buffering approach?
... I'm not sure we need more than UserTimings
... Please weigh-in on those two issues if you have preferences
... Implementers may want fallback for first-input
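Editor's sketch of the Marks and Measures point above: a minimal example of wrapping a developer-known start and end point with User Timing. The `UserTimingLike` interface and the `startSpan` helper are hypothetical names; the interface only mirrors the `mark(name)` and `measure(name, startMark, endMark)` calls, so the global `performance` object can be passed in where it is available.

```typescript
// The subset of the User Timing API this sketch relies on.
interface UserTimingLike {
  mark(name: string): unknown;
  measure(name: string, startMark: string, endMark: string): unknown;
}

// Marks the start of a developer-defined span and returns a function
// that marks the end and records a measure between the two marks.
function startSpan(perf: UserTimingLike, label: string): () => void {
  const startMark = label + ":start";
  const endMark = label + ":end";
  perf.mark(startMark);
  return () => {
    perf.mark(endMark);
    perf.measure(label, startMark, endMark);
  };
}
```

For example, a page could call `const end = startSpan(performance, "open-menu")` in its pointerup handler and `end()` once the menu has rendered, which covers the "first input to last activity" range without any new API.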
Sean: No questions, it looks reasonable
... Maybe instead of changing first-input, the first-interaction
... We haven't implemented Interaction ID yet
Michal: The part of it exposing Interaction ID property could be decoupled from spec changes
Noam: Should there be an InteractionTiming API?
Michal: The intersection between it and LoAF. We want to know total interactions and targets.
... But animation frames are important part of that
... Within animation frames there's N interactions with XYZ targets
... You could think of cluster of things in LoAF, that seems to be working better
... Maybe eventually that could be an API but it's probably just libraries built on top of primitives that we have

`Confidence` API shape - continued discussion

Yoav: Continued discussion around Confidence attribute and its API shape
... Challenge is Mike Jackson isn't available to join us today
... I'll try to share what we did last time
... 3 different proposals on API shape
... #1 is on NT in timeline
... #2 is dedicated "metadata: ["confidence"]" ends up on PT entry
... #3 is dedicated performance.getConfidence() for a particular entry
... From my POV I think #3 is interesting API shape if we can extend confidence to other PerformanceEntrys.
... e.g. for RT entry, how confident is it that the entry was loaded w/out interference?
... Discussion on last call effectively indicated that we have low-confidence in our ability to do that w/out leaking privacy-sensitive information, and other things happening on the user's machine
... Which brings us to proposal #1 and exposing on the PerformanceNavigationTiming entry
... IIRC main argument for proposal #2 is it's less discoverable and they'd need to indicate they specifically want that data
... So they would only use it when they know how to use it properly, de-bias the data/etc
... Leaning towards proposal #1 accompanied by developer documentation w/ de-biasing
... Prevent abuse or mis-use
Michal: Agreed.
... I don't know that hiding this from developers is a good thing
... Naming and that this was broken out from a raw value, wouldn't know how to interpret it unless you read docs which is a good idea
... For NavTiming, all timing values remain what they are, they're not fuzzed?
... For some of the other types of APIs, IIRC the values are fuzzed?
Yoav: That is something we discussed as server-side aggregation, a different API shape for a different purpose
... Value itself is a boolean with coin flip, and it's used to discard the entire navigation or not. Nothing is fuzzed.
... All values are the same, but collector may choose to ignore in cases probabilistically. Not for specific entry, but for histogram calc is done differently if they want to take in confidence.
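Editor's sketch of what collector-side de-biasing could look like: a minimal example, assuming a generic randomized-response model in which, with some known probability, the reported confidence is replaced by a fair coin flip and is otherwise truthful. That model, the `flipRate` parameter, and the `debiasHighFraction` helper are assumptions for illustration; the minutes do not define the actual mechanism or rate.

```typescript
// Estimates the true fraction of "high confidence" navigations from the
// observed fraction, under the assumed model:
//   observed = (1 - flipRate) * actual + flipRate * 0.5
function debiasHighFraction(observedHighFraction: number, flipRate: number): number {
  if (flipRate < 0 || flipRate >= 1) {
    throw new RangeError("flipRate must be in [0, 1)");
  }
  const estimate = (observedHighFraction - flipRate / 2) / (1 - flipRate);
  // Sampling noise can push the estimate slightly outside [0, 1]; clamp it.
  return Math.min(1, Math.max(0, estimate));
}
```

The timing values themselves are untouched in this model; only the aggregate share of high-confidence navigations is corrected, which matches the point that nothing is fuzzed per entry.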
Michal: Interested in future use-cases. Need the confidence and not hidden. If any values could be affected by coin flip, but in this case they're not.
... Would be another reason to go for proposal #1
Yoav: If we end up doing server-side aggregation, will have a different shape.
... Noam also talks about worklet fuzzing but that could be different
Jase: Supportive of proposal #1
Michal: What about actual when we trigger the value. #1 is how we expose.
Yoav: I think it will have to be UA-defined. We have confidence rather than a more specific thing. We made it fuzzier because more use-cases
... Good question on when we'll fill in data
... For NavTiming entry, in timeline, it is there rather early and values get added over time
... Would be one of the values added later
Nic: how would we know if the value is already filled in?
Yoav: could be another polled attribute
Michal: For the current usecase, we’d know right away
Nic: The other usecases mentioned were around extensions, and that’s something we’d know late
Yoav: I don’t think that’s in the current implementation, and could complicate privacy
Michal: at session restore you’d want this to be “high”, but there’s another part. The background tabs that are not loaded immediately, there could be tabs that would have been loaded, but they got delayed. They also have slow loading but how would you present it?
Yoav: You're saying the tab started, timeOrigin kicked in?
Michal: Not sure how timeOrigin starts, tab is there that the user can click in at any moment
... Would act different if I had session restore where all background tabs would've loaded. Now we're deferring those initial loads
... Time load may be different, background vs. foreground
Yoav: I think it depends, if timeOrigin hasn't started, clicking on tab starts nav, then if nothing interferes, high confidence compared to if browser started loading page then froze it.
Michal: From POV of data collection, page immediately loaded in background would have low confidence and perhaps bad loading performance but quick to view. Vs throttled load. Have interesting effects on histograms coming into play.
Amiya: Also have other signals like tab not being visible, used to distinguish.
... Other pointers to break out that data
Nic: people would use this alongside “hidden”, “has the user navigated the site a lot”, etc
Nic: otherwise, no meeting in 2 weeks. See y’all at TPAC!!