WebPerfWG call - August 17th 2023

Participants

Yoav Weiss, Nic Jansma, Michal Mocny, Amiya Gupta, Andy Davies, Jase Williams, Mike Jackson, Aoyuan Zuo, Andy Luhrs, Hao Liu, Patricija Cerkaite, Giacomo Zecchini, Alex Jose, Andrew Galloni, Barry Pollard, Benjamin De Kosnik, Carine, Dan Shappir, Patrick Meenan, Rafael Lebre

Admin

Next: Aug 31 1pm ET / 10am PT
Request for TPAC 2023 topics

https://docs.google.com/document/d/1ijP7yc6UXZVgzpXDVmBa7h70gxLpFcqdlOZzL3DRT_0/edit

Minutes

EventTimings Updates - Aoyuan

Recording

Aoyuan: Closing gaps in EventTiming, want to propose changes to the spec
... Interaction is a group of events triggered by same user input
... Unique interactionId for each group
... INP captures longest event during an interaction
… Started computing interaction ID with a minimal set of events, just for computing INP
… other events would have an interaction ID of 0, so INP is not calculated for them
… want to expose INP to keyboard interactions
... Random number of IME events are dispatched, depending on the platform
... IME is used e.g. when an English keyboard is used to input characters from other character sets
... With IME, the number of keyups can be random, e.g. one keydown and two keyups.
... Now the input event becomes special, as there's just one per interaction. Key event associated with text input on the page, triggering rendering and paint.
... However there are some cases where there are user interactions with no input event
... How can we capture this type of interaction?
... Need to do that in order to be able to correctly count the number of interactions on the page
... After composition starts, the first keydown starts it. After composition ends, the first keyup ends it.
... Pro is all important events get interactionIds, but Con is there could be duplicate event types
... e.g. 4 key-ups in a row back-to-back, but they come from a single user interaction
... We want to add interactionId to keypress, input events under composition, and keydown/keyup clusters under composition updates
... Questions?
Nic: How will this affect existing data?
Aoyuan: No data just yet, but planning to experiment with a prototype, can share data back with the group
Amiya: Beyond new events, other types of interactions that would come in the future?
Aoyuan: We think it’d be ideal to expose interaction ID to all events, and would reduce confusion. We prioritize the important events ATM, but it’d be easy to extend this to more event types. E.g. Input events outside composition
Michal: two classes of long-term improvements: This is in the first class where we only need the longest interaction, and will have many events clustered together
… this is closing gaps and making sure we capture duration more consistently
… But if you want to know why your INP is high, omitting interaction ID from the events can degrade the quality of attribution.
… Complicated when you have high level synthetic events, so slimmed the initial set
… The second reason is to enable monitoring for other event types: drag and drop, etc
… Interaction ID can open possibilities there
Jase: Before this change, could keydown and keyup sometimes register with different interaction IDs?
Aoyuan: currently only measuring them outside of composition, so not assigning interaction ID to them under composition
Michal: But when they get an interaction ID, they should always match up
… In some cases the keypress event is also useful
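
A minimal sketch of the grouping discussed above, assuming entries carry the `name`, `duration` and `interactionId` fields of Event Timing entries. The `EventTimingLike` type, the `longestDurationPerInteraction` helper and the sample values are hypothetical and only illustrate the idea: events sharing an `interactionId` form one interaction, the longest event in the group is kept, and events with an ID of 0 are skipped.

```typescript
// Hypothetical subset of the Event Timing entry fields used in this sketch.
interface EventTimingLike {
  name: string;          // event type, e.g. "keydown"
  duration: number;      // event duration in milliseconds
  interactionId: number; // 0 means the event is not assigned to an interaction
}

// Group entries by interactionId and keep the longest duration per interaction.
function longestDurationPerInteraction(
  entries: EventTimingLike[]
): Record<number, number> {
  const longest: Record<number, number> = {};
  for (const entry of entries) {
    if (entry.interactionId === 0) {
      continue; // no interaction ID, so the event is not counted
    }
    const previous = longest[entry.interactionId] ?? 0;
    longest[entry.interactionId] = Math.max(previous, entry.duration);
  }
  return longest;
}

// Made-up sample: one keyboard interaction with duplicate keyups (as can
// happen under composition), one pointer interaction, one unassigned event.
const sampleEntries: EventTimingLike[] = [
  { name: "keydown", duration: 16, interactionId: 101 },
  { name: "input", duration: 48, interactionId: 101 },
  { name: "keyup", duration: 8, interactionId: 101 },
  { name: "keyup", duration: 8, interactionId: 101 },
  { name: "pointerdown", duration: 24, interactionId: 102 },
  { name: "mousemove", duration: 120, interactionId: 0 },
];

const perInteraction = longestDurationPerInteraction(sampleEntries);
console.log("interactions counted:", Object.keys(perInteraction).length); // 2
```

Duplicate event types within one interaction (the two keyups here) do not inflate the interaction count, because counting is done per `interactionId` rather than per event. In a browser the entries would come from a `PerformanceObserver` observing the "event" entry type rather than from a hard-coded array.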

Fix Bimodal Performance Timings in Web Applications - Microsoft

Recording

Mike: Previously discussed this in W3C Webperf in June '22
... Had talked about a "user agent launch" attribute
... Feedback was this could happen for all navigation types, not just launch
... Timings can be thrown off in back-forward and other things
... Impractical adding enums to all nav types
... etc
... Went back and thought about it a bit more, to restate goals
... Goal: Allow user-agent to specify it's in a non-optimal performance state
... When the user agent was under load, we'd see spikes from start time to requestStart, during DNS lookups, etc
... Avoid being a real-time method for determining the performance state of the UA
... Instead of adding a new nav type, add a new property called systemEntropy
... Idea being for top-level navigations we'd return "high" or "normal" values
... Other navigations would be empty string
... Focus on user-agent launches today
... Android-specific behavior
... Custom Tabs, WebView, Restored Tabs have specific behavior
... Why can't we use compute pressure?
... By the time the developer is able to register for PressureObserver, system state is back to normal
... Discussion?
Michal: Could you share primary motivation or use-case?
Mike: The problem we're trying to solve is that, during browser load, developers get NavigationTiming metrics that are skewed by delays outside of their control
... Want to partition that data off, what's in control vs. outside
Michal: I'm on the fence. On one hand it's useful to know the state of the machine or navigation. e.g. BFCache is not same as regular nav.
... But RUM is to measure the real world.
... If you're wanting consistent data, should you go all out and build a lab?
... I see use-case, but wonder how it's used in practice
Amiya: One common scenario I've seen with certain teams is they're reporting perf data up to leadership. There's a set of pages that tend to be correlated with browser launches more than others. Their performance at higher percentiles is much slower.
... Executives ask why? Engineers point out launch page loads are higher for that page.
... How much of page load is browser launch vs. slower page?
Michal: Makes sense. Not changes over time but comparing the impact of a change on different pages within an organization.
Dan: Additional information tends to be a good thing. As long as there are no security issues.
... Concerned with what Michal brought up -- what is an edge case?
... Would low-battery scenario count as such an edge case? Or if a machine heats up and is throttled, is that a special case?
... On the other hand, if they're common for your application, are they special cases? And if they’re not common, they should be drowned out by the rest of the data
Amiya: Goal isn't to classify every edge case. Want to look at distribution. Browser-launch can add multiple seconds to the page load, so we just want to know it was a contributing factor, it's a discrete thing.
... Many other things that can contribute
Nic: The system entropy attribute would be an attribute of the navigation so will not change over time?
Mike: yup
Nic: As a RUM provider additional dimensions to segment data on can help, and can allow customers to choose including or excluding this kind of data. Data looks different for these types of navigation, and customers make different choices about including it
… Tim Vereecke did a talk about “noise canceling RUM” and this could be an additional tool in the toolbox to segment data on
… The attribute could indicate browser launch scenarios but could other scenarios set this bit?
Mike: Could imagine a world where this integrates with the compute pressure API, or e.g. with restoring 1000 tabs
Nic: So could be useful beyond your initial use case? cool
… last question: customers may be interested in more data beyond a boolean? E.g. a list of causes of the higher entropy?
Mike: That hasn’t come up in our research, but can see a world where we might want to provide that
… Good point. We have no data suggesting this would be useful atm, but it could be
Nic: Other dimensions are very specific (e.g. prerender), where this is more vague. So could be useful to break this down
Mike: But there could be multiple reasons (low battery during launch)
Nic: BFCache not-restored reasons could be an interesting precedent
Hao: Curious about confidence level on system entropy? Should it be a numerical value? The UA is making that decision instead of the API’s user
Mike: Haven’t considered a numerical value, as a boolean enables the UA to treat situations as high entropy states. (e.g. user restores 1000 tabs, switching profiles)
… Could be challenging to assign a numerical value in these situations
… Inputs could take into account many different factors (launch, memory, etc)
Yoav: There's a cost to exposing more values, as it exposes more fingerprinting information (privacy-sensitive). A single boolean is more condensed in its exposure.
Andy Davies: As another RUM provider, having more data to eliminate or identify data on the extreme edge is really useful. To Dan's point, I don't particularly want battery level or cpu-hot, I just want to know if the device is in low power mode or not
Sean: I think my question is similar to Hao's: if this is something that is not going to be clearly defined in the spec, browsers should still be able to clearly know whether to return high or normal.
... Concerned this is a new fingerprinting factor. Users with "high" entropy have lower-spec devices or less computing power.
Mike: Definitely a concern. Initially we're targeting startup as the primary case we're chasing. In our analysis of that, what we're asserting is that this could tell a website that it is someone's homepage or something, but it would be difficult to tell that apart from Restore All Tabs.
... With that setting as well, it becomes harder to assert this is someone's home page.
Sean: I still have a concern
Michal: You pointed out gaps with the ComputePressure API. Would it suffice to just close those gaps, in particular if you had an initial estimated compute pressure?
... Providing it for default performance entries earlier on? Or not enough?
Mike: I don't think ComputePressure is enough by itself, but it could be a factor we'd consider. e.g. other things like loading binaries off disk can cause slowdowns.
... One aspect, but not sufficient.
… Further feedback on the doc, please!
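
A minimal sketch of how a RUM pipeline might partition data along the lines discussed above, assuming the proposed `systemEntropy` property returns "high" or "normal" for top-level navigations and an empty string otherwise. The property is a proposal, not a shipped API, and the `NavigationSample` type, the helper functions and the sample durations are hypothetical.

```typescript
// Values of the proposed (not shipped) systemEntropy property, as described
// in the discussion: "high" or "normal" for top-level navigations, "" otherwise.
type SystemEntropy = "high" | "normal" | "";

// Hypothetical RUM record: a page load duration plus the proposed property.
interface NavigationSample {
  duration: number; // page load duration in milliseconds
  systemEntropy: SystemEntropy;
}

interface EntropyBuckets {
  high: number[];
  normal: number[];
  unknown: number[]; // navigations reporting an empty string
}

// Partition durations so each bucket can be analysed separately.
function partitionByEntropy(samples: NavigationSample[]): EntropyBuckets {
  const buckets: EntropyBuckets = { high: [], normal: [], unknown: [] };
  for (const sample of samples) {
    if (sample.systemEntropy === "high") {
      buckets.high.push(sample.duration);
    } else if (sample.systemEntropy === "normal") {
      buckets.normal.push(sample.duration);
    } else {
      buckets.unknown.push(sample.duration);
    }
  }
  return buckets;
}

// Median of a list of numbers; undefined for an empty list.
function median(values: number[]): number | undefined {
  if (values.length === 0) return undefined;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Made-up sample data for illustration only.
const navigationSamples: NavigationSample[] = [
  { duration: 4200, systemEntropy: "high" },
  { duration: 3800, systemEntropy: "high" },
  { duration: 900, systemEntropy: "normal" },
  { duration: 1100, systemEntropy: "normal" },
  { duration: 1000, systemEntropy: "normal" },
  { duration: 700, systemEntropy: "" },
];

const entropyBuckets = partitionByEntropy(navigationSamples);
console.log("median (high):", median(entropyBuckets.high));     // 4000
console.log("median (normal):", median(entropyBuckets.normal)); // 1000
```

Keeping the buckets separate, rather than dropping the "high" samples, leaves the choice of including or excluding that data to whoever consumes the report.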