Jake Archibald, Yoav Weiss, Nic Jansma, Ben Kelly, Andrew Galloni, Steven Bougon, Sean Feng, Pat Meenan, Michal, Noam Rosenthal, Nicolás, Carine Bournez
April 1st 2021 at 10am PST / 1pm EST
Yoav: We want to talk about workerStart, and how we can properly define what it does
… Defining potentially a more complete solution, since we think it’s not sufficient to measure Service Worker Startup Time in many cases
Nic: Briefly summarize from last time
… Goal of the attribute was to help us measure the critical path cost of SW start, which is non-zero and measurable in many cases
… The specs are not synchronized or consistent
… PRs to address the processing model and make it more consistent
… Focusing on how long it takes SW to start up - what should we be measuring?
… With redirects, there could be multiple SWs that are in the critical path
… The current proposal aims to place the workerStart before the final same-origin redirects
… For redirects that happen before that, it aims to capture the worker that started up before all of them
… Could be multiple workers in that path as well, including ones from other origins that we cannot expose
… Given that, there’s a PR that addresses the inconsistencies - workerStart is the time when the first worker starts up in the same-origin redirect chain
… Need to see where that belongs relative to fetchStart
… How do we define this workerStart? Should we be thinking about it differently?
… Do we need workerReadyTime? Something else?
Jake: When I looked at the diagram, it’s confusing that workerStart is before fetchStart, then I realized that the wording meant “the final of those things” after redirects.
… Would it make sense to rename `workerStart` to account for that? `firstWorkerStart` and `workerStart` that’s the final one. firstWorkerStart may be earlier, if it was involved in the redirect chain.
… Extra values to know why your SW was slow would be useful. workerStart is when we decide that a SW needs to be involved
… workerReady - it’s ready now
… Something else to signify when we dispatched the fetch event
… Those three points in time can help you understand why it’s taking long to start up
… If you have a big gap between workerReady and fetch event start, that may indicate main thread contention
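A minimal sketch of how the three points in time Jake describes could be read together. Only `workerStart` exists on resource timing entries today; `workerReady` and `fetchEventDispatch` are hypothetical names taken from this discussion, and the `WorkerPhaseTimestamps` shape is an assumption for illustration only.

```typescript
// Hypothetical timestamps (milliseconds on the same clock).
// Only workerStart is a real attribute today; the other two are
// illustrative names for the proposals discussed in the meeting.
interface WorkerPhaseTimestamps {
  workerStart: number;        // a SW was found to be needed
  workerReady: number;        // hypothetical: the SW is up and running
  fetchEventDispatch: number; // hypothetical: the fetch event was dispatched
}

interface WorkerPhaseGaps {
  startup: number;       // time spent starting the worker
  dispatchDelay: number; // time between "ready" and the fetch event
}

// Split the pre-fetch time into the two gaps described in the discussion.
function workerPhaseGaps(t: WorkerPhaseTimestamps): WorkerPhaseGaps {
  return {
    startup: t.workerReady - t.workerStart,
    dispatchDelay: t.fetchEventDispatch - t.workerReady,
  };
}
```

Under these assumptions, a large `dispatchDelay` with a small `startup` would point at contention after the worker was ready rather than at the startup itself.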
Nic: Makes sense. If we make changes here it makes sense to try and capture all of that.
Jake: What do we do in a case where the first and final service worker are the same?
Ben: Today at workerStart we take a timestamp there, even in the redirect case we take that timestamp, it’s just short
Nic: Today we’re usually measuring the not-expensive case of the “last” service worker to startup
Ben: Moving workerStart after firstStart would break the only reliable measurement
Nic: Not reliable if there are redirects
Ben: Reliable for people that know they don’t have redirects. Reliable for their use-case
… Separate final navigation workerStart from the first one. That could correspond to the same heuristics
Noam: Doing work on Fetch right now. There are other metrics that can happen several times and we’re only measuring the last one.
… Two separate issues
… Measuring redirects for resource is one issue - separate entries per redirect request
… Other issue is measuring more stuff in a worker - maybe belongs in the performance timeline of the worker
… Not sure if the lifecycle of the worker is a 1:1 match with the resource fetch
… Maybe inside the SW, we should have entries that give us information about it
… “When did we send a message to the SW?” “When did we hand off the Fetch?”
Ben: I think that conceptually we have that with workerStart.
… Made me think of another case which is navigationPreload, complication in navigation case
… Agree that a lot of this comes down to the clunkiness of redirect handling and a more holistic solution would be better
Noam: Doing that, workerStart would be consistent with other attributes. Like where there would be a path where you would get connectStart/domainLookupStart/etc
… And a separate path for workers
… And then try to find a solution for redirects
Jake: If solving redirects comes later, workerStart should be after fetchStart
Noam: Around the time of connectEnd or connectStart. The same time you would go to create an HTTP connection.
Ben: If we go that route, can we make a fetchEventStart or fetchEventDispatch
Yoav: I don’t think we can change the semantics because people are using that today
… It could be multi-phased: one would be on the current RT entry, and another for redirects in general, but we can’t change existing semantics
Ben: People got confused by it not saying service worker, so maybe a rebranding opportunity here
Jake: Makes sense to me. If we can’t change what workerStart does, what are we doing here?
Nic: I think we want to make sure it’s useful for the original intent, measuring Service Worker Startup. The simple path would be to also measure the end time. A more complex path would be to measure an array of start and end. Then the most complex is to fix redirects
Ben: We should just do an array for SWs. Need to be associated with redirects
Pat: A redirect has every part of that path and they are a separate request
… You need an array of resource timings, but that doesn’t solve the simple case of fetchStart-workerStart
Noam: But it only gives you that if you assume no redirects
… Putting it after fetchStart would still keep the semantics of the useful case where you don’t have redirects
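The “simple case of fetchStart-workerStart” can be sketched as below. It assumes the current behavior described in this meeting, where `workerStart` precedes `fetchStart` when a Service Worker is involved. The `TimingLike` interface and the function name are illustrative; the three fields mirror real attributes on resource timing entries, where `workerStart` and `redirectStart` are 0 when not applicable or not exposed.

```typescript
// Minimal subset of a resource/navigation timing entry.
interface TimingLike {
  workerStart: number;
  fetchStart: number;
  redirectStart: number;
}

// Returns an approximate SW startup time in milliseconds, or null when the
// simple subtraction should not be trusted.
function serviceWorkerStartupTime(entry: TimingLike): number | null {
  // 0 means no Service Worker was involved (or the value is not exposed).
  if (entry.workerStart === 0) return null;
  // With visible redirects the value is ambiguous, as discussed above.
  // Note: redirectStart is also 0 for redirects that are not exposed,
  // so this check cannot catch every redirect.
  if (entry.redirectStart !== 0) return null;
  return Math.max(0, entry.fetchStart - entry.workerStart);
}
```

In a page, entries such as those from `performance.getEntriesByType("navigation")` could be passed in directly, since they carry these three attributes.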
Jake: So if we’re keeping workerStart, what does it mean?
Noam: Handing off to SW
Jake: After fetchStart then?
Noam: Yes
Ben: Today when a SW is involved, fetchStart is changed to be when the fetchEvent is dispatched on the worker thread
Jake: When a worker wasn’t involved then?
Nic: This was called fetchStart before Fetch.
… It’s been adapted over time to the Fetch event
… To Ben’s point, the definition of fetchStart is a little nebulous
Noam: fetchStart is right after redirectEnd
Ben: In implementations fetchStart starts after the SW is started. If the worker is with 0 overhead, the measurement would be the same as redirects.
… One option is to leave workerStart and fetchStart and create new names for our approach
… Instead of fetchStart we have something else named
Yoav: I’m supportive of that, to not break existing content
Ben: Maybe it’s easier to create a new ResourceTiming type
Yoav: We can also measure attributes usage
Ben: Dumb idea for redirects - creating timeline entries for each request. But then you’d need a way to chain them together.
… If you have multiple overlapping requests, it’d be hard to sort them
Noam: For ResourceTiming, if we have an image, we want to know everything about the image, so need some way to connect the requests to the image
Pat: That largely fits with this. If the resource is the HTML, then you have the array of fetches needed to fetch the HTML, whether it’s 3 redirects or whatever.
Jake: I support the idea. Rather than have all these things appear in a specific order, you have little events in time depending on the redirects happening
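A rough sketch of the “entry per request, chained together” idea. Nothing here is a specified API: the `RequestTiming` shape, the `chainId` field used to link requests to one resource, and `groupIntoChains` are all hypothetical, shown only to illustrate how overlapping requests could be sorted back into per-resource arrays.

```typescript
// Hypothetical per-request entry; chainId is an assumed linking key that
// ties every request (redirect hops included) to one resource.
interface RequestTiming {
  chainId: string;
  url: string;
  startTime: number;
  workerStart: number; // 0 if no SW handled this hop
}

// Group entries by resource and order each chain by start time.
function groupIntoChains(entries: RequestTiming[]): Record<string, RequestTiming[]> {
  const chains: Record<string, RequestTiming[]> = {};
  for (const entry of entries) {
    const chain = chains[entry.chainId] ?? [];
    chain.push(entry);
    chains[entry.chainId] = chain;
  }
  for (const id of Object.keys(chains)) {
    const chain = chains[id];
    if (chain) chain.sort((a, b) => a.startTime - b.startTime);
  }
  return chains;
}
```

With such a linking key, a navigation with three redirects would show up as one array of four entries, each able to carry its own worker timing.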
Ben: Another idea. Proposed the fetch event worker timing, where the fetch event can inject UserTiming
… What if instead of workerStart, we would populate this array with native points in time, as well as things developers contribute
Benjamin: Would that include registration time to workerStart?
Ben: That’s not normally part of a resource fetch
Noam: Reminds me of Server Timing. Can treat the timings the SW does as ServerTiming
Ben: I tried to model it after Server Timing so it has a lot of parallels
Pat: Would be nice for this to be a bit more structured. If a SW combines 2 fetches to a response, that would make it hard for analytics to rationalize, it’d be best if different SWs would make their metrics in the same way. ServerTiming today is just a complete blackbox and is implementation specific.
Ben: Currently defined to create mark entries, but then be able to stick fetches in there
… It’s partially implemented in Chrome
Jake: In most cases, we’ve done the fetch in the SW, got a response and sent it back. Would we be able to associate that with the final response?
Ben: We could if the SW does a pass through. But any repackaging of the response would lose that data
… Comes with Noam’s work on the Fetch spec. Once it’s in the response object, we can pass it along, but would lose it for synthetic things
Jake: Covers a lot of the use cases
Ben: Can also indicate things came from storage.
… Also, kudos on the Fetch work
Yoav: I don’t know if we’ve reached any conclusions on the definition of workerStart. Assuming we have to maintain compat for the current use cases… if we provided a more supported alternative, maybe people will move to that.
… Long-term collaboration with ServiceWorker folks to figure out better metrics
… Nic do you have any comments on Noam’s proposal?
Nic: Doesn’t Noam’s proposal switch the order of workerStart and fetchStart?
Noam: Not suggesting to switch the ordering. Just that workerStart would be before the last resource in the redirect case.
Nic: Can’t change the order, but we could define it as the last redirect in the chain. Makes it simpler to define, but less useful in the same origin redirect case.
Pat: But the time is not lost, you’d just have a long redirect time
Nic: Yeah, we could get details at that point
… I don’t have a good sense today of sites that do that with redirects. Sounds like Ben’s customers haven’t had that concern.
Ben: The ones I know using this API don’t have that problem
… Some sites have own redirects, but they don’t have that info today
Nic: Current processing model defines it as the final SW startup time, but if we’re talking about simplifying that and solving the redirect case later, maybe we can make it clear that redirects are not properly captured.
Noam: Maybe not familiar enough with SW - if the SW does an extra fetch, would the redirects be inside the SW?
Jake: Depends on navigation or subresources. For subresources, it would be.
Noam: So redirects after workerStart are only relevant to navigation. Otherwise they are blackbox inside the SW
Ben: There are cases where the SW doesn’t do what Jake said, for example it returns without calling respondWith()
… Other cases where the SW opts not to handle it (not call respondWith), the redirect would be handled outside of the SW.
… In the navigation case, if you do this, it will reenter a potentially different SW
Jake: HTML spec handles redirects for navigation, whereas Fetch spec handles redirects for resources
Noam: Navigation redirects are different
Ben: I like this plan. Removing redirects from workerStart for now would get us to a solid predictable base and then we can improve from there
Nic: Would we still need a workerReady timestamp?
Noam: Very similar to responseStart
Nic: But this all happens before fetchStart
Ben: Propose to defer this until we have a more complete proposal.
… extra attributes vs. array of entries
Nic: workerReady was proposed for redirects, so saying we’re not trying to solve redirects, kinda discounts that
… For our customers it would be enough
… So simplify it for now, it may not be useful for redirects, but you already don’t have visibility into that
… And later we can get more detailed timing for SW and redirects
Ben: Sounds like a plan to me
Nic: Concerns about that plan?
Noam: About fetchStart being different in the workerStart scenario. Found that surprising because ResourceTiming wasn’t talking about that
Ben: Random question - do you have a sense for views of Service Worker performance?
… I hear from a lot of people about navigationPreload
Nic: Can talk about mPulse customers. A small set that are using SW and a smaller set that know SW cause issues. Had some cases where performance analysts helped sites
… Doesn’t feel like it’s a well-known place to look for issues
Ben: Cross browsers?
Nic: Only get data from Chrome
Pat: Field experiments from Facebook showed that without navigationPreload, shipping SW is too much of a penalty.
Yoav: Agreement on workerStart bit and that we need more work on getting a better solution of measurement
Nic: AI for me to adjust the PR on NavigationTiming. Followup on Resource Timing and then long term discuss what we want
Jake: Looking forward to reviewing
Nicolás: Wants to be able to get the exact performance timing entry associated with a given element.
… Right now there’s not really a way to do that
… You can use the image source, but there can be multiple entries with the same URL
… So need to know which of the entries with the same URL is the one you care about
… A feature request
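The ambiguity Nicolás describes can be sketched as follows. The `EntryLike` shape and `candidatesForElement` are illustrative names; the only assumption about the real API is that a resource timing entry’s `name` is the resolved URL of the request, so matching on an element’s source URL can return more than one entry.

```typescript
// Minimal subset of a resource timing entry.
interface EntryLike {
  name: string;      // resolved URL of the request
  startTime: number;
}

// Return every entry whose URL matches the element's source URL.
// More than one result means the URL alone cannot identify the entry.
function candidatesForElement(entries: EntryLike[], src: string): EntryLike[] {
  return entries.filter((entry) => entry.name === src);
}
```

In a page, `entries` could come from `performance.getEntriesByType("resource")` and `src` from an image’s `currentSrc`; when the result has more than one element there is currently no reliable way to tell which entry belongs to that element.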
Yoav: I feel like this is something we had already discussed in the past
… Another issue that had a reverse-item mapping
Nicolás: Entry from element, but presumably you may want to get the element from the entry
Yoav: We talked about having an initiator resource, although the initiator is the injector
… In terms of triaging we can say this is a feature request for L3
… Which is approaching, where we can start to talk about new and exciting things
… Once we reach a stable state with RT, we can look at the feature requests and see what makes sense
… Will label as L3 and Enhancement