WebPerfWG call - May 27th 2021

Participants

Yoav Weiss, Nic Jansma, Alex Christensen, Nicolás Peña Moreno, Carine Bournez, Patrick Hamann, Sean Feng, Michelle Vu, Noam Helfman, Hongbo Song, Annie Sullivan, Giacomo Zecchini, Steven Bougon, Benjamin De Kosnik

Next Meeting

June 10th 10am PST / 1pm EST

Minutes

Presentation will be recorded, but discussion afterwards will not be

Web Perf APIs for responsiveness - Nicolás

nicolás: Goals: assess responsiveness of a webpage
... Current metric used in CWV: FID
... Problems: Only captures the first input, not interactions after the first one
... Only looks at input delay, not how long it takes for input to be processed
... Does not measure scrolling at all
... Using an Event's duration, as part of PerformanceEventTiming interface
... Delta between event timeStamp to the next rendering update after event handlers have run (after event has been dispatched), from the beginning till the first yellow star
... Capturing async work is difficult, so now focusing on event duration, which is the minimal duration it’d take for users to see a response
... How can we aggregate events across the page?
... different number of events depending on the interaction
... Aggregation by considering all events equal does not make sense, keyboard keydown/press/up are part of the same interaction
... Clicks/taps have more events associated with them but they're a single user interaction
... We want to consider tap/keyboard to be equally weighted
... Measure something that is user perceptible, which is not individual DOM events but the actual interaction they did
... Look at the user interactions themselves
... Could also be a drag event
... For these interactions we'd like to find something similar to duration but they're not events, so we need to define something about how long they take
... For these slides, I'm calling them latency
... One option: Use maximum duration of any associated event
... Other option: Use non-overlapping sum of durations of events
... Example from pressing key and releasing it
... Key press takes more than a single frame
... In this case key down duration is separate from key up duration (non-overlapping)
... How can a web developer calculate this right now?
... Problems for a web developer: We have a minimum duration threshold (e.g. 16ms), where anything less is not surfaced to the API
... Would have to use event counts somehow to count those properly
... Only receiving DOM events you got, you'd have to match/group into interactions they correspond to
... Can try to do this via event timestamps, but this is not an easy process
... Proposing interactionID, a new attribute on the interface holding an ID that is shared across the entries that correspond to the same interaction
... Other issue is measuring scrolling
... We want to include scroll latency
... Latency is just the scroll update, from trigger of scroll to first render after that
... No real DOM events associated with this latency
... Scroll can be performed without going to JavaScript thread so it's not blocked on DOM handlers
... For WebPerf APIs, we'd need scroll latency
... Options: Have a new PerformanceEventTiming entry with a name equal to 'scroll' or 'scrollBegin' (name right now is event type)
... Other option is have a new Scroll Timing API where we expose the latency we want
... Benefit of another API is it could be nicer to augment in the future
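
The two latency options from the presentation (maximum duration vs. non-overlapping sum of durations) can be sketched as follows. This is only an illustration under assumptions: `TimingEntry` is a simplified stand-in for a PerformanceEventTiming entry, `interactionId` models the proposed attribute rather than a shipped field, and the helper names and sample values are hypothetical.

```typescript
// Simplified stand-in for an Event Timing entry (assumption for this sketch).
interface TimingEntry {
  name: string;          // event type, e.g. "keydown"
  startTime: number;     // event timeStamp, in ms
  duration: number;      // ms from timeStamp to the next rendering update
  interactionId: number; // hypothetical: shared by entries of one interaction
}

// Option 1: latency is the longest duration among the interaction's events.
function maxDurationLatency(entries: TimingEntry[]): number {
  return entries.reduce((max, e) => Math.max(max, e.duration), 0);
}

// Option 2: latency is the time covered by the events' [start, end) intervals,
// counting any overlapping time only once.
function nonOverlappingLatency(entries: TimingEntry[]): number {
  const intervals = entries
    .map((e) => [e.startTime, e.startTime + e.duration] as [number, number])
    .sort((a, b) => a[0] - b[0]);
  let total = 0;
  let coveredUntil = -Infinity;
  for (const [start, end] of intervals) {
    if (end <= coveredUntil) continue; // fully inside time already counted
    total += end - Math.max(start, coveredUntil);
    coveredUntil = end;
  }
  return total;
}

// Group entries by the (proposed) interaction ID.
function groupByInteraction(entries: TimingEntry[]): Record<number, TimingEntry[]> {
  const groups: Record<number, TimingEntry[]> = {};
  for (const e of entries) {
    (groups[e.interactionId] = groups[e.interactionId] || []).push(e);
  }
  return groups;
}

// Key press held for longer than a frame: keydown and keyup do not overlap.
const sampleEntries: TimingEntry[] = [
  { name: "keydown", startTime: 0, duration: 40, interactionId: 1 },
  { name: "keyup", startTime: 100, duration: 30, interactionId: 1 },
];
const byInteraction = groupByInteraction(sampleEntries);
console.log(maxDurationLatency(byInteraction[1]));    // 40
console.log(nonOverlappingLatency(byInteraction[1])); // 70
```

Without a shared ID, the grouping step would have to be inferred from event timestamps, which is the difficulty raised in the presentation.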
Yoav: Looking for opinions on API shape for scrolls?
nicolás: Two questions would be thoughts on scrolling, and thoughts on EventTiming threshold/interactionID
... Worry with changing minimum duration threshold is that it could become too noisy
... Would offer enabling filtering by event type, which would save developer time and would be better for performance
... Avoid getting callbacks for entries you don’t care about
Nic: Any other observer APIs that allow you to pre-filter or specify a filter?
nicolás: I don’t think we do
Nic: So you’d need to specify the event types?
nicolás: Yeah, in the observe call. We already have duration to query only for durations you care about. Would that be confusing?
Nic: No, was just looking for precedents. Seems reasonable.
nicolás: We know that this is important for long tasks, so that kind of filtering may be useful for other entry types. And we do have the observer dictionary where we could set the types
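
As a rough model of the filtering discussed here, the sketch below applies a duration threshold and an optional event-type list to a batch of entries. It is not an observer API: `durationThreshold` mirrors the existing Event Timing observe option, while `eventTypes`, `ObservedEntry`, and `selectEntries` are hypothetical names used only to illustrate the proposal.

```typescript
// Simplified entry shape (assumption for this sketch).
interface ObservedEntry {
  name: string;     // event type, e.g. "click"
  duration: number; // ms
}

interface FilterOptions {
  durationThreshold: number; // mirrors the existing observe() option, in ms
  eventTypes?: string[];     // hypothetical: the event-type filter proposed on the call
}

// Returns only the entries that such a filtered observer would surface.
function selectEntries(entries: ObservedEntry[], options: FilterOptions): ObservedEntry[] {
  return entries.filter(
    (e) =>
      e.duration >= options.durationThreshold &&
      (options.eventTypes === undefined || options.eventTypes.indexOf(e.name) !== -1)
  );
}

const batch: ObservedEntry[] = [
  { name: "click", duration: 120 },
  { name: "mouseover", duration: 200 },
  { name: "keydown", duration: 8 },
];
// Only the slow click matches both the threshold and the type list.
console.log(selectEntries(batch, { durationThreshold: 16, eventTypes: ["click", "keydown"] }));
```

Today this kind of type filtering has to happen inside the observer callback, so the callback still runs for entries the page does not care about; filtering in the observe call would avoid that.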
Noam: Been using Event Timing for a few months now, and never needed event duration smaller than 100ms, but never dealt with grouping of related events. My intention was to minimize the number of events and 100ms was good enough for that - just filter all events below that. But I like the idea of interaction ID for grouping
… Why do we need to capture all events? What’s the use case?
nicolás: the idea is: if you’re tapping on a website, developer can perform the work in any of the events that are being dispatched. The idea is to help with the metric being agnostic to the way the site is implemented.
… FID looks at the first discrete interaction with the page, which is probably meaningful. So it doesn’t include hover events, as they are not typically meaningful.
… It’s with that constraint that we want to measure interactivity, where we think an interaction ID can be meaningful
… Grouping by interaction ID can help RUM providers
Noam: But why smaller durations?
nicolás: Events not present would have a duration of zero
Noam: The challenge - if not all events are captured because of the duration threshold (everything below it is missing), you can’t directly calculate the percentage of responsive events. You can calculate relative percentiles instead - if you dropped 90% of the events, you can calculate the overall 95 %ile by finding the p50 of the captured events. We can count the totals from the event handler
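
Noam's percentile arithmetic can be sketched as below. The sketch assumes every dropped event was faster than every captured one (which holds when events are dropped by a duration threshold) and uses the nearest-rank percentile definition. `overallPercentile` is a hypothetical helper, and the total count is assumed to come from the page's own event handlers or from event counts.

```typescript
// Returns the requested percentile over ALL events, given only the durations
// of the captured (slowest) events and the total number of events.
// Returns undefined when the percentile falls among the dropped events,
// whose durations are unknown.
function overallPercentile(
  capturedDurations: number[],
  totalCount: number,
  percentile: number
): number | undefined {
  // Nearest-rank: 1-based position in the ascending list of all events.
  const rank = Math.ceil((percentile * totalCount) / 100);
  const droppedCount = totalCount - capturedDurations.length;
  if (rank <= droppedCount) return undefined;
  const sorted = capturedDurations.slice().sort((a, b) => a - b);
  return sorted[rank - droppedCount - 1];
}

// 1000 events in total, 900 dropped by the threshold, 100 captured.
const captured: number[] = [];
for (let d = 101; d <= 200; d++) captured.push(d);

// The overall p95 is the 50th of the 100 captured values,
// which is the p50 of the captured set.
console.log(overallPercentile(captured, 1000, 95)); // 150
console.log(overallPercentile(captured, 1000, 50)); // undefined
```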
nicolás: Makes sense. Another use case for the minimum duration - if you want to compute FID from BFCache, whatever you get is not guaranteed to be the first, as events could’ve been dropped. But it may not be related to the discussion.
… Another way to solve this is to just fire the first input. So no minimum duration makes sense
Sean: Like the idea of interactionID, but do we really need to expose it? If we see the key down, is that not enough? What’s the use case of calculating latency and grouping the events together?
nicolás: The use case is focused on analytics and browser metrics - an analytics provider wants to provide data on the site without having control over the content. If you have no control over the content and still want to make sense of event timing data, this can help you aggregate entries together.
Annie: Metrics try to measure what the user is experiencing - interaction ID brings us closer to the user experience
Sean: If you know that key down is slow, you’d still be able to tell
Annie: You’d need to understand what the page is doing, which is not always true
Noam: question regarding scrolling - we correlate event handlers based on the event timestamps and event timing timestamps and figure out what the user did. I’m trying to think how that would work for scroll. The timestamp on the scroll event won’t match any event timestamp.
nicolás: The timestamp we’d use for scroll latency would be something like a mouse move, which is not exposed in event timing
Noam: But there’s also mouse wheel, touch, etc.
nicolás: yeah, and the scroll event cannot be used because it’s dispatched later
Noam: So how do we correlate between the event timing entry and the operation the user did? We can expose the target so we know which DOM element was scrolled on, but e.g. I wouldn’t know if it’s a horizontal/vertical scroll. Need a way to correlate between what happened and the event timing entry
nicolás: sounds like you’re arguing for the second option, because you’d need extra information
Noam: I would need extra information on the scroll event. Maybe we need a scroll ID? It’s another correlation problem similar to the interaction ID
nicolás: not entirely sure
Alex: What exactly do you want to measure when measuring scroll timing? It’s simple for mousewheel/click where you have a single event. But the world has changed and we have continuous scrolling and you can have events that can cause you to scroll up and down for ~10 seconds. What would we measure?
nicolás: The initial scroll latency. The first frame that the scroll initiated.
Alex: makes sense. It’s a complicated problem with many moving parts, threads, etc
nicolás: yeah. Which makes Noam’s question even more interesting
… In terms of making sense of later data - what happens with other scroll updates? We don’t currently look into that, but it may be interesting to developers. If we’d expose it, it’d need to follow a similar shape of what we ship here. It’d need to fit the mold of event timing
Nic: You had 3 challenges, one of them was around measuring async work. Any of the work here would help with that?
nicolás: We think it’d be hard to come up with heuristics that measure the work we want to measure. There’s work posted from the event handler, using setTimeout, etc. We’re interested in exploring the problem, but haven’t made progress there
Yoav: For the scroll timing API, it sounds like the feedback was for a separate Scroll Timing API
nicolás: I would like to hear feedback from any other vendor or implementor
Noam: InteractionID could help capture and correlate between scroll events and handlers
Yoav: You could do that with a passive scroll event that gives you a target on which the scroll happens
Noam: Right now we can use target or event timestamp, but today for scroll we don't have any options

https://bugs.webkit.org/show_bug.cgi?id=225733 spec clarification

Nic: An issue with WebKit specifically around TLS connection reuse and the connectionStart being 0 in that case. The spec needs to clarify that secureConnectionStart needs to be equal to fetchStart only in the HTTPS case. So we’re returning 0 in a reused connection that has TLS, but it’s fetchStart without TLS.
Yoav: Current definition is in Fetch, and it differs from what implementations are doing
nicolás: should we change the spec or change the implementation?
Alex: just aligned to the current behavior, so prefer to fix the spec
Yoav: current behavior gives us info on whether a secure connection was used
Nic: Can take an action item to file an issue
nicolás: when did we get 58 open issues on RT?

https://github.com/whatwg/fetch/issues/1215

Yoav: ideal answer: none?
Nic: There’s a middle-ground answer of exposing them as if they are cross-origin resources with no TAO
Alex: Remember needing to add all HTTP status codes to this, and now there are more errors. There’s a lot that can go wrong: DNS, TLS, certs. Do you want to allow visibility into this?
Nic: Today we have visibility into DNS, TCP and TLS errors
Yoav: what about overlap with NEL? Would NEL be sufficient?
Nic: NEL is hard to tie back to the original session. Would be preferable from an analytics perspective to have them be part of RUM, rather than a scrubbed report later
Yoav: We’re over time, let’s continue the discussion on the issue and on the next call in a couple of weeks