WebPerfWG call - March 18th 2021

Participants

Jake Archibald, Yoav Weiss, Nic Jansma, Ben Kelly, Andrew Galloni, Steven Bougon, Sean Feng, Pat Meenan, Michal, Noam Rosenthal, Nicolás, Carine Bournez

Next Meeting

April 1st 2021 at 10am PST / 1pm EST

ServiceWorkers and workerStart

Yoav: We want to talk about workerStart, and how we can properly define what it does

… Defining potentially a more complete solution, since we think it’s not sufficient to measure Service Worker Startup Time in many cases

Nic: Briefly summarize from last time

… Goal of the attribute was to help us measure the critical path cost of SW start, which is non-zero and measurable in many cases

… The specs are not synchronized or consistent

… PRs to address the processing model and make it more consistent

… Focusing on how long it takes SW to start up - what should we be measuring?

… With redirects, there could be multiple SWs that are in the critical path

… The current proposal aims to place the workerStart before the final same-origin redirects

… Redirects that happen before that, aiming to capture the worker that started up before all of that

… Could be multiple workers in that path as well, including ones from other origins that we cannot expose

… Given that, there’s a PR that addresses the inconsistencies - workerStart is the time when the first worker starts up in the same-origin redirect chain

… Need to see where that belongs relative to fetch start

… How do we define this workerStart? Should we be thinking about it differently

… Do we need workerReadyTime? Something else?

Jake: When I looked at the diagram, it’s confusing that workerStart is before fetchStart, then I realized that the wording meant “the final of those things” after redirects.

… Would it make sense to rename `workerStart` to account for that? `firstWorkerStart` and `workerStart` that’s the final one. firstWorkerStart may be earlier, if it was involved in the redirect chain.

… Extra values to know why your SW was slow would be useful. workerStart is when we decide that a SW needs to be involved

… workerReady - it’s ready now

… Something else to signify when we dispatched the fetch event

… Those three points in time can help you understand why it’s taking long to start up

… If you have a big gap between workerReady and fetch event start, that may indicate main thread contention
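
A minimal sketch of how the three points in time Jake describes could be read, assuming hypothetical field names. `workerReady` and `fetchEventDispatch` are illustrative names for the proposed timestamps, not shipped attributes, and `workerStart` is used here in the sense Jake gives it (the moment a SW is deemed necessary).

```typescript
// Hypothetical shape: only workerStart exists today, and its exact
// semantics are what this discussion is trying to pin down.
interface WorkerPhases {
  workerStart: number;        // decided that a SW needs to be involved
  workerReady: number;        // hypothetical: the SW is up and ready
  fetchEventDispatch: number; // hypothetical: fetch event dispatched
}

interface StartupBreakdown {
  boot: number;          // time spent getting the worker ready
  dispatchDelay: number; // time between ready and fetch event dispatch
}

// Splits the startup cost into the two gaps discussed above.
function explainStartup(p: WorkerPhases): StartupBreakdown {
  return {
    boot: p.workerReady - p.workerStart,
    dispatchDelay: p.fetchEventDispatch - p.workerReady,
  };
}
```

A large `dispatchDelay` relative to `boot` would correspond to the main thread contention case mentioned above.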

Nic: Makes sense. If we make changes here it makes sense to try and capture all of that.

Jake: What do we do in a case where the first and final service worker are the same?

Ben: Today at workerStart we take a timestamp there, even in the redirect case we take that timestamp, it’s just short

Nic: Today we’re usually measuring the not-expensive case of the “last” service worker to startup

Ben: Moving workerStart after firstStart would break the only reliable measurement

Nic: Not reliable if there are redirects

Ben: Reliable for people that know they don’t have redirects. Reliable for their use-case

… Separate final navigation workerStart from the first one. That could correspond to the same heuristics

Noam: Doing work on Fetch right now. There are other metrics that can happen several times and we’re only measuring the last one.

… Two separate issues

… Measuring redirects for resource is one issue - separate entries per redirect request

… Other issue is measuring more stuff in a worker - maybe belongs in the performance timeline of the worker

… Not sure if the lifecycle of the worker is a 1:1 match with the resource fetch

… Maybe inside the SW, we should have entries that give us information about it

… “When did we send a message to the SW?” “When did we hand off the Fetch?”

Ben: I think that conceptually we have that with workerStart.

… Made me think of another case which is navigationPreload, complication in navigation case

… Agree that a lot of this comes down to the clunkiness of redirect handling and a more holistic solution would be better

Noam: Doing that, workerStart would be consistent with other attributes. Like where there would be a path where you would get connectStart/domainLookupStart/etc

… And a separate path for workers

… And then try to find a solution for redirects

Jake: If solving redirects comes later, workerStart should be after fetchStart

Noam: Around the time of connectEnd or connectStart. The same time you would go to create an HTTP connection.

Ben: If we go that route, can we make a fetchEventStart or fetchEventDispatch

Yoav: I don’t think we can change the semantics because people are using that today

… It could be multi-phased, one would be on the current RT entry, and another for redirects in general, but we can’t change existing semantics

Ben: People got confused by it not saying service worker, so maybe a rebranding opportunity here

Jake: Makes sense to me. If we can’t change what workerStart does, what are we doing here?

Nic: I think we want to make sure it’s useful for the original intent, measuring Service Worker Startup. The simple path would be to also measure the end time. A more complex path would be to measure an array of start and end. Then the most complex is to fix redirects

Ben: We should just do an array for SWs. Need to be associated with redirects

Pat: A redirect has every part of that path and they are a separate request

… You need an array of resource timings, but that doesn’t solve the simple case of fetchStart-workerStart

Noam: But it only gives you that if you assume no redirects

… Putting it after fetchStart would still keep the semantics of the useful case where you don’t have redirects

Jake: So if we’re keeping workerStart, what does it mean?

Noam: Handing off to SW

Jake: After fetchStart then?

Noam: Yes

Ben: Today when a SW is involved, fetchStart is changed to be when the fetchEvent is dispatched on the worker thread
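
A minimal sketch of the simple no-redirect measurement discussed here (the gap between workerStart and fetchStart), assuming entries shaped like ResourceTiming entries. The `TimingLike` interface and the `swStartupTime` helper are illustrative names, and treating `redirectStart > 0` as "redirects were involved" is an assumption that only catches redirects the entry exposes.

```typescript
// Minimal shape of the ResourceTiming fields used in this sketch.
interface TimingLike {
  name: string;
  redirectStart: number;
  workerStart: number;
  fetchStart: number;
}

// Returns the apparent Service Worker startup cost in milliseconds,
// or undefined when these fields cannot give a meaningful answer.
function swStartupTime(entry: TimingLike): number | undefined {
  // workerStart is 0 when no Service Worker intercepted the request.
  if (entry.workerStart === 0) return undefined;
  // Assumption: a non-zero redirectStart means redirects were involved,
  // which is the case the group agrees is not captured properly.
  if (entry.redirectStart > 0) return undefined;
  return entry.fetchStart - entry.workerStart;
}
```

In a browser, the input could come from `performance.getEntriesByType("navigation")` or `performance.getEntriesByType("resource")`.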

Jake: When a worker wasn’t involved then?

Nic: This was called fetchStart before Fetch.

… It’s been adapted over time to the Fetch event

… To Ben’s point, the definition of fetchStart is a little nebulous

Noam: fetchStart is right after redirectEnd

Ben: In implementations fetchStart starts after the SW is started. If the worker is with 0 overhead, the measurement would be the same as redirects.

… One option is to leave workerStart and fetchStart and create new names for our approach

… Instead of fetchStart we have something else named

Yoav: I’m supportive of that, to not break existing content

Ben: Maybe it’s easier to create a new ResourceTiming type

Yoav: We can also measure attributes usage

Ben: Dumb idea for redirects - creating timeline entries for each request. But then you’d need a way to chain them together.

… If you have multiple overlapping requests, it’d be hard to sort them

Noam: For ResourceTiming, if we have an image, we want to know everything about the image, so need some way to connect the requests to the image

Pat: That largely fits with this. If the resource is the HTML, then you have the array of fetches needed to fetch the HTML, whether it’s 3 redirects or whatever.

Jake: I support the idea. Rather than have all these things appear in a specific order, you have little events in time depending on the redirects happening
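
A minimal sketch of the "entry per request, chained together" idea, assuming a hypothetical `chainId` field that ties every request in a redirect chain to the resource it was fetched for. No such field exists in ResourceTiming today, and `RequestEntry` and `groupChains` are illustrative names.

```typescript
// Hypothetical per-request entry; chainId is an assumed linking key.
interface RequestEntry {
  chainId: string;
  url: string;
  startTime: number;
}

// Groups entries by chain and orders each chain by start time, so
// overlapping requests from different chains no longer interleave.
function groupChains(entries: RequestEntry[]): Map<string, RequestEntry[]> {
  const chains = new Map<string, RequestEntry[]>();
  for (const entry of entries) {
    const chain = chains.get(entry.chainId) ?? [];
    chain.push(entry);
    chains.set(entry.chainId, chain);
  }
  chains.forEach((chain) => chain.sort((a, b) => a.startTime - b.startTime));
  return chains;
}
```

Without some linking key of this kind, overlapping requests would be hard to sort, as Ben notes above.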

Ben: Another idea. Proposed the fetch event worker timing, where the fetch event can inject UserTiming

… What if instead of workerStart, we would populate this array with native points in time, as well as things developers contribute

Benjamin: Would that include registration time to workerStart

Ben: That’s not normally part of a resource fetch

Noam: Reminds me of Server Timing. Can treat the timings the SW does as ServerTiming

Ben: I tried to model it after Server Timing so it has a lot of parallels

Pat: Would be nice for this to be a bit more structured. If a SW combines 2 fetches to a response, that would make it hard for analytics to rationalize, it’d be best if different SWs would make their metrics in the same way. ServerTiming today is just a complete blackbox and is implementation specific.

Ben: Currently defined to create mark entries, but then be able to stick fetches in there

… It’s partially implemented in Chrome

Jake: In most cases, we’ve done the fetch in the SW, got a response and sent it back. Would we be able to associate that with the final response?

Ben: We could if the SW does a pass through. But any repackaging of the response would lose that data

… Comes with Noam’s work on the Fetch spec. Once it’s in the response object, we can pass it along, but would lose it for synthetic things

Jake: Covers a lot of the use cases

Ben: Can also indicate things came from storage.

… Also, kudos on the Fetch work

Yoav: I don’t know if we’ve reached any conclusions on the definition of workerStart. Assuming we have to maintain compat for the current use cases… if we provided a more supported alternative, maybe people will move to that.

… Long-term collaboration with ServiceWorker folks to figure out better metrics

… Nic do you have any comments on Noam’s proposal?

Nic: Doesn’t Noam’s proposal switch the order of workerStart and fetchStart?

Noam: Not suggesting to switch the ordering. Just that workerStart would be before the last resource in the redirect case.

Nic: Can’t change the order, but we could define it as the last redirect in the chain. Makes it simpler to define, but less useful in the same origin redirect case.

Pat: But the time is not lost, you’d just have a long redirect time

Nic: Yeah, we could get details at that point

… I don’t have a good sense today of sites that do that with redirects. Sounds like Ben’s customers haven’t had that concern.

Ben: The ones I know using this API don’t have that problem

… Some sites have own redirects, but they don’t have that info today

Nic: Current processing model defines it as the final SW startup time, but if we’re talking about simplifying that and solving the redirect case later, maybe we can make it clear that redirects are not properly captured.

Noam: Maybe not familiar enough with SW - if the SW does an extra fetch, would the redirects be inside the SW?

Jake: Depends on navigation or subresources. For subresources, it would be.

Noam: So redirects after workerStart are only relevant to navigation. Otherwise they are blackbox inside the SW

Ben: There are cases where the SW doesn’t do what Jake said, for example it returns without calling respondWith()

… Other cases where the SW opts not to handle it (not call respondWith), the redirect would be handled outside of the SW.

… In the navigation case, if you do this, it will reenter a potentially different SW

Jake: HTML spec handles redirects for navigation, whereas Fetch spec handles redirects for resources

Noam: Navigation redirects are different

Ben: I like this plan. Removing redirects from workerStart for now would get us to a solid predictable base and then we can improve from there

Nic: Would we still need a workerReady timestamp

Noam: Very similar to responseStart

Nic: But this all happens before fetchStart

Ben: Propose to defer this until we have a more complete proposal.

… extra attributes vs. array of entries

Nic: workerReady was proposed for redirects, so saying we’re not trying to solve redirects, kinda discounts that

… For our customers it would be enough

… So simplify it for now, it may not be useful for redirects, but you already don’t have visibility into that

… And later we can get more detailed timing for SW and redirects

Ben: Sounds like a plan to me

Nic: Concerns about that plan?

Noam: About fetchStart being different in the workerStart scenario. Found that surprising because ResourceTiming wasn’t talking about that

Ben: Random question - do you have a sense for views of Service Worker performance?

… I hear from a lot of people about navigationPreload

Nic: Can talk about mPulse customers. A small set that are using SW and a smaller set that know SW cause issues. Had some cases where performance analysts helped sites

… Doesn’t feel like it’s a well-known place to look for issues

Ben: Cross browsers?

Nic: Only get data from Chrome

Pat: Field experiments from Facebook showed that without navigationPreload, shipping SW is too much of a penalty.

Yoav: Agreement on workerStart bit and that we need more work on getting a better solution of measurement

Nic: AI for me to adjust the PR on NavigationTiming. Followup on Resource Timing and then long term discuss what we want

Jake: Looking forward to reviewing

Resource direct lookup support · Issue #255 · w3c/resource-timing

Nicolás: Wants to be able to get the exact performance timing entry associated with a given element.

… Right now there’s not really a way to do that

… You can use the image source, but there can be multiple entries with the same URL

… So need to know which of the entries with the same URL is the one you care about

… A feature request
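
A minimal sketch of the ambiguity described above, assuming entries shaped like ResourceTiming entries. `EntryLike` and `entriesForUrl` are illustrative names; in a browser the equivalent lookup would be `performance.getEntriesByName(url, "resource")`.

```typescript
// Minimal shape of the ResourceTiming fields used in this sketch.
interface EntryLike {
  name: string;      // the resource URL
  startTime: number;
}

// Returns every entry recorded for a URL. When the same URL was fetched
// more than once, nothing here says which entry belongs to which element.
function entriesForUrl(entries: EntryLike[], url: string): EntryLike[] {
  return entries.filter((entry) => entry.name === url);
}

// Sample data (made up): the same image URL fetched twice.
const sampleEntries: EntryLike[] = [
  { name: "https://example.com/hero.png", startTime: 120 },
  { name: "https://example.com/app.js", startTime: 130 },
  { name: "https://example.com/hero.png", startTime: 900 },
];

const candidates = entriesForUrl(sampleEntries, "https://example.com/hero.png");
console.log(candidates.length);
```

A lookup by URL alone returns both candidates here, which is the gap the feature request is about.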

Yoav: I feel like this is something we had already discussed in the past

… Another issue that had a reverse-item mapping

Nicolás: Entry from element, but presumably you may want to get the element from the entry

Yoav: We talked about having an initiator resource, although the initiator is the injector

… For terms of triaging we can say this is a feature request for L3

… Which is approaching, where we can start to talk about new and exciting things

… Once we reach a stable state with RT, we can look at the feature requests and see what makes sense

… Will label as L3 and Enhancement