Participants

Lucas Pardue, Yoav Weiss, Yash Joshi, Rafael Lebre, Giacomo Zecchini, Pat Meenan, Ian Clelland, Hao Liu, Aoyuan Zuo, Nic Jansma, Neil Craig, Patricija Cerkaite, Sean Feng, Carine Bournez, Katie Sylor Miller

Admin

Next call on June 8th - we’ll talk about Managed Components
TPAC hotel registration is open

Minutes

FetchID - Yash Joshi

Recording

Yash: Fetch ID - proposed addition to Resource Timing
… Unique identifier to each resource fetch
… Talked about it in issue 263
… Major use case is the initiator attribute, which will be a part of the resource timing entry, and create a clear connection between resources and their initiator
… Would help create a dependency tree and help RUM providers identify resources on the critical path to LCP, etc
… Aim to add a read only attribute to the Resource Timing entry
… Use would be similar to other entry attributes
… Characteristics - uniqueness to enable identification
… randomness to be on the safe side
… Plan to reuse the UUID from WebCrypto
… Spec changes - add the attribute to the interface and add a processing model to add the fetch ID, which will be done in Fetch
… Open questions
… Should multiple fetches to the same resource have the same fetch ID? Can simplify use
… May want to call the attribute resource ID
… Need to come up with the right mechanisms to pass this info from Fetch to Resource Timing
… Should there be a fetch/resource ID for the navigation timing entry as well? What should be the initiator of resources that are kicked off by the HTML.
… One option is to have a null initiator, the other is to have a UUID for the HTML resource itself, enabling a single dependency tree
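A minimal sketch of how a RUM script might assemble the dependency tree described above. The `fetchId` and `initiator` field names, and the use of `null` for a missing initiator, are assumptions for illustration only; nothing here is specified yet, and the entries are plain objects rather than real Resource Timing entries.

```typescript
// Hypothetical entry shape: `fetchId` and `initiator` are NOT shipped attributes.
interface TimedEntry {
  name: string;
  fetchId: string;          // assumed unique per fetch
  initiator: string | null; // fetchId of the initiating resource, or null
}

interface TreeNode {
  entry: TimedEntry;
  children: TreeNode[];
}

// Link every entry to its initiator. Entries whose initiator is null
// or not present in the list become roots of the tree.
function buildDependencyTree(entries: TimedEntry[]): TreeNode[] {
  const nodes = new Map<string, TreeNode>();
  for (const entry of entries) {
    nodes.set(entry.fetchId, { entry, children: [] });
  }
  const roots: TreeNode[] = [];
  for (const entry of entries) {
    const node = nodes.get(entry.fetchId)!;
    const parent = entry.initiator !== null ? nodes.get(entry.initiator) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}
```

If the navigation entry carried its own ID, it would be the single root; with a null initiator instead, every resource kicked off by the HTML would show up as a separate root.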
Nic: Excited to get this info! What’s Timing Info?
Yash: A spec structure that’s built in Fetch.
Nic: No strong opinions. Can you elaborate on e.g. image initiated by a navigation response?
Yash: <img> tag in the HTML that initiates an image request. The question here is if the initiator should be null or the navigation’s resource
Nic: So for orphaned entries, what would be their initiator? As a RUM provider, there’s no real difference here
Ian: If you look at who initiated the fetch, it’s the HTML document that represents the navigation itself. Given that NT is inheriting from RT, I thought it’d make sense.
Yoav: NT’s ID would enable us to expose a full dependency tree, setting the ID to null in those cases would enable easily identifying them.
Nic: The ID we’re talking about - would I ever need to export that UUID as part of my payload? Can’t think of a reason why. More likely would use that in the page itself.
… But if I did, the UUID is larger than an int
… Is there a reason for me to transfer it?
Yash: Can’t think of a reason where these values are useful outside the scope of a page.
Yoav: I imagine you’d translate them to unique but smaller ints
Nic: I can just assign a smaller int UID and export that way.
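The remapping Nic and Yoav describe could look roughly like the sketch below. The class and method names are made up for illustration; it only assumes the IDs arrive as strings.

```typescript
// Assign small sequential integers to UUID-style IDs.
// The mapping is only meaningful within a single page load.
class CompactIds {
  private readonly ids = new Map<string, number>();

  // Return the existing integer for this UUID, or allocate the next one.
  idFor(uuid: string): number {
    let id = this.ids.get(uuid);
    if (id === undefined) {
      id = this.ids.size;
      this.ids.set(uuid, id);
    }
    return id;
  }
}
```

A beacon payload could then carry the small integers for both the entry and its initiator instead of full UUID strings.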
Nic: Yash, how are we collecting feedback?
Yash: I can pose them as an issue. I didn’t hear strong objection.
Nic: Thanks!!

NEL Issues

Data protection and security issues with Network Error Logging

Yoav: Issue filed last week that linked to a paper that had various claims and issues on NEL
… I closed a larger issue because it’s hard to keep tabs on specific things there, I opened up more specific issues where we can improve NEL privacy in practice, some are implementation realities vs. spec issues

NEL cache should be partitioned

Yoav: Possible to deduce information about user’s history and other related things by fact that there are or aren’t NEL reports registered under certain origins
… The issue is correct, but it is no longer an issue in Chromium as of a few months ago, when the entire network state was partitioned
… NEL was partitioned along with that effort
… Still spec work to define that partitioning in the spec (which is not partitioned)
… Don’t really have a strong standard concept for the network partition key
… Ian are you aware of the state of those discussions?
… Multiple different keys for partitioning different things
… HTTP vs. network cache, different by each browser implementation
… Need to define something in NEL but don’t have a solid spec to define that on
Ian: Privacy WG has a ueber-spec for how to define these
… Pointed to Storage Spec in WHATWG, which does define a storage key
… Intended to be used as a partitioning key, I think we could use that
… At least if we have a common key for other APIs, that would be the right one for standardization
Yoav: I think it makes sense to partition based on that (better than not at all)
… There are subtle differences between the keys used for storage
… Double-key, triple-key and double-key plus iframe bit
… I think this is what the Storage Spec landed on
… NEL work shouldn’t be blocked on these though
… Makes sense to point to Storage Key but UAs can do different things based on their own privacy requirements
… Open an issue on Fetch to specify all partitioning keys
Ian: That makes sense
… NEL is different from other specs, the origin you’re trying to reach vs. reporting to may be different
… If there are additional keying requirements we should address that here
Yoav: Are there additional requirements?
… The origin you’re reporting to is the value being cached, I wouldn’t expect it to be part of the key
Ian: I think you’re right, I just wanted to bring it up
Neil: Could you explain what this caching/partitioning would affect?
Yoav: Essentially if we have publisher.com that embeds ads.com IFRAME that sends itself NEL reports so it knows it doesn’t have any network config issues, and if we have commerce.com that embeds ads.com, those two NEL reports will not be automatically sent in commerce.com if it was in publisher.com
… Essentially the cache key based on which the NEL reports will be cached will be based on the top-level origin of the embedder site plus the embedding sub-resource that registered
Neil: Thank you
Nic: To clarify Chrome already has partitioning built in? This wouldn’t change anything.
Yoav: Correct
Neil: I don’t think this would affect us
Yoav: Correct
… Third-parties need to re-register on each top-level site that embeds them
… The same is true for DNS caching and other types, connection re-use, etc.
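A rough sketch of the partitioning behaviour discussed above, assuming a simple double key of (top-level site, policy origin). The class and field names are hypothetical, and real browsers may use different or richer keys.

```typescript
interface NelPolicy {
  reportTo: string;      // the cached value, not part of the key
  maxAgeSeconds: number;
}

class PartitionedPolicyCache {
  private readonly policies = new Map<string, NelPolicy>();

  // Assumption: double key of (top-level site, origin that registered the policy).
  private key(topLevelSite: string, origin: string): string {
    return JSON.stringify([topLevelSite, origin]);
  }

  register(topLevelSite: string, origin: string, policy: NelPolicy): void {
    this.policies.set(this.key(topLevelSite, origin), policy);
  }

  lookup(topLevelSite: string, origin: string): NelPolicy | undefined {
    return this.policies.get(this.key(topLevelSite, origin));
  }
}
```

With this keying, a policy that ads.com registers while embedded in publisher.com is not found when ads.com is embedded in commerce.com, so the third party has to register again on each top-level site.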

NEL cache should be time-bound

Yoav: Another issue raised in the paper is that if a site has a vulnerability or is being MITM’d for a short period of time
… or another temporary vulnerability could be permanent if the attacker is using the NEL cache to register reports that are then being sent forever to whatever origin
… True for errors as well as success reports
… Theoretically an attacker could set success reports to 100%, when it’s vulnerable or after the attacker could know when they’re visiting the site
… An easy way to solve this is how service workers solve this problem, making sure that caching time for NEL reports is bound for 24h (aligned to Service Workers), then have the cache maintain stale-while-revalidate until it expires
… A single last report, but the origin needs to renew the registration otherwise reports are gone
… A temporary vulnerability could then be resolved
Ian: Caching the policies themselves and revalidate, not the reports themselves
Yoav: NEL cache would have stale-while-revalidate semantics
Ian: Are we proposing an eventual timeout that the sender can define?
Yoav: One last one defined by the sender
… Right now those policies are being cached by a timeout bound to the sender
… I think the stale-while-revalidate if not renewed by 24h then it’s evicted
Neil: Does that break the max-age directive? We have a lot of users who come to us regularly but not frequently, so we set max-age to 1 month. So if someone hits us and gets NEL and comes back in 3 weeks time?
Yoav: They will send one last report (stale-while-revalidate) for the month, but once they reach the server after 3 weeks if you didn’t send back a NEL policy, renewing that cache, that is the last report they will send
Nic: To build on this, if a visitor comes after 3 weeks, no response, then again 48 hours later, they’d still send that second report? It’s not until they actually receive a response that has no NEL that it would be cleared?
Yoav: Correct, eviction would happen once they get a response without a NEL policy
Neil: Does spec say if there’s no NEL it will clear policy?
Ian: You have to say max-age 0
Yoav: This would require changes in Chromium
… Some of Chrome’s privacy people are not convinced this is an actual issue that attackers can’t work around, but I think it’s something we should at least consider
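One possible reading of the proposal above, sketched under stated assumptions: a policy is fresh for 24 hours, then stale but still usable until its sender-defined max age, and a stale policy is evicted as soon as a response from the origin arrives without a renewing NEL header. The function and field names are hypothetical and this is not how any browser currently behaves.

```typescript
// Assumption: 24h revalidation window, mirroring Service Workers.
const REVALIDATE_AFTER_MS = 24 * 60 * 60 * 1000;

interface CachedPolicy {
  registeredAt: number; // ms timestamp of the last (re)registration
  maxAgeMs: number;     // sender-defined lifetime
}

type PolicyState = "fresh" | "stale" | "expired";

function policyState(policy: CachedPolicy, now: number): PolicyState {
  const age = now - policy.registeredAt;
  if (age >= policy.maxAgeMs) return "expired";
  return age < REVALIDATE_AFTER_MS ? "fresh" : "stale";
}

// Called when a response from the policy's origin arrives.
// `renewedMaxAgeMs` is null when the response carried no NEL header.
// Returns the policy to keep, or null to evict it.
function onResponse(
  policy: CachedPolicy,
  now: number,
  renewedMaxAgeMs: number | null
): CachedPolicy | null {
  if (renewedMaxAgeMs !== null) {
    // Header present: a max age of 0 removes the policy, anything else renews it.
    return renewedMaxAgeMs === 0 ? null : { registeredAt: now, maxAgeMs: renewedMaxAgeMs };
  }
  // No header: a fresh policy is kept, a stale or expired one is evicted.
  return policyState(policy, now) === "fresh" ? policy : null;
}
```

In Neil's scenario (max age of one month, visitor returning after three weeks) the policy is stale at the time of the visit, so reports can still be sent, but it survives only if the response renews it.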

Consider limiting scope of NEL registration

Yoav: A somewhat similar scenario
… Cases where an attacker controls a path on an origin but not the full one
… Classic example is university sites that have per-user directories, with users having access to header configuration
… In these cases an attacker could register NEL reports for the entire origin
… One of the claims in the paper is that this opens up possibilities of that attacker getting error reports and success reports for the entire origin
… It seems like we should consider allowing trimmed-down scopes to only register DNS reports for an entire origin or reports scoped to where the NEL headers were accepted
… At the same time this adds a lot of complexities
… Not sure this is something we should do
Ian: I’m not convinced this is totally effective
… Worried this means your policies may not be effective based on how you structure your URLs
… If you always use /a/b/c/d/img.png you could only register a URL for that path
… I don’t know of a better way to do it
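To make Ian's concern concrete, here is a small sketch of an enforced path scope, loosely modelled on how Service Worker scopes are derived. It assumes the scope is the directory of the path that delivered the NEL header; both function names are hypothetical.

```typescript
// Assumption: the enforced scope is the directory of the registering path.
function scopeFor(registeringPath: string): string {
  return registeringPath.slice(0, registeringPath.lastIndexOf("/") + 1);
}

// A request is covered only if its path falls under the scope.
function inScope(scope: string, requestPath: string): boolean {
  return requestPath.startsWith(scope);
}
```

Under this rule a policy delivered with `/a/b/c/d/img.png` would only cover `/a/b/c/d/`, while a policy delivered from a per-user directory could not reach another user's paths.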
Nic: Seems very limiting. Don’t want all visits to go through the root
Neil: want to second that. People could be in specific parts of the site. Enforcing path scopes would significantly reduce the value
Nic: Easy mechanism to validate with the root path that it has the same NEL policy? Would require the root path to know about that. Seems complex
Neil: If you’re in a big traffic spike, you don’t want to double your requests
Yoav: complexity cost and operational cost
Nic: Other platform features that had subpath concerns?
Yoav: Service Workers, megacorp.com/email and /maps etc could be different apps with different policies
… Adding scope is different than enforcing. Voluntary scope vs. enforced scope.
… Enforcing the scope to not go above its path, seems mainly for university website/multi-user scenario
… For SW we dealt with it by limiting scope
Pat: Any multi-tenant situation the operator would be better to not allow any tenants to set NEL header
… Rather than requiring scope for everyone
Ian: Previous proposals from webapp-sec for suborigins. They’ve abandoned and are saying that origins are the security boundary and nothing beneath that.
… Might be a good group to look at this
… Cookies are another place where we’ve done this but that’s voluntary, set the level if you want
Yoav: Voluntary scope is easy, but enforcement is tricky
Pat: Service Worker case the main reasons were about damage a sub-origin application could do to others, whether it be multi-tenant and able to intercept traffic, or a megacorp where a worker would accidentally work on scope for the entire origin
… The risk on this side is visibility into traffic volumes on the success path, not so much the breaking case
Yoav: At worst it can be visibility into all traffic, 100% of success reports is valid use
Pat: Issue for tenant operator, doesn’t break the site
Yoav: Can leak all IPs for example
Pat: If you don’t want operator to leak traffic, then you don’t allow others to set NEL headers and steal reports
Yoav: Maybe solution here is to add to privacy and security considerations, any multi-tenant folks should be aware of this and set NEL max-age=0 or something along those lines
Pat: Or make clear origin is the privacy and security boundary, if not, don’t allow origins to set NEL headers
Nic: Should we revisit allowing set 100% of success traffic
Neil: Some people use it in place of server access logs, because they don’t have access
Yoav: Is that a valid use-case?
Neil: Some don’t get access logs, more of an organization problem than a technical one
Yoav: I think original reason for success logs was to estimate quantities, but sampled success logs
Ian: Any real-world use-cases sampling is probably fine, I don’t know what you would set maximum to though. Different for BBC vs. small site.
… Would we want to spec 1% is max you can go on success rates?
Yoav: Amount of visits would depend on that
Pat: Does it solve anything
Nic: To not get all IPs?
Pat: But is sampling there better?
Ian: If you see just one person, you could follow them, privacy issue
… But maybe 1% is still an issue?
Yoav: This scenario isn’t really possible because of secure connection restrictions
… In the privacy and security section we should add that this should be a consideration for any site that’s not using origin as a security boundary
… See if other solutions beyond that, but no obvious ones
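For reference, the cap on success reports floated earlier in this discussion could be sketched as below. The 1% ceiling is only the figure mentioned in the call, not an agreed value, and the function names are hypothetical; the requested value corresponds to the policy's `success_fraction`.

```typescript
// Hypothetical cap; 1% was only floated in discussion.
const MAX_SUCCESS_FRACTION = 0.01;

// Clamp the requested success fraction into [0, MAX_SUCCESS_FRACTION].
function effectiveSuccessFraction(requested: number): number {
  return Math.min(Math.max(requested, 0), MAX_SUCCESS_FRACTION);
}

// Decide whether to report one successful request.
// `random` is injectable so the decision can be tested deterministically.
function shouldReportSuccess(
  requested: number,
  random: () => number = Math.random
): boolean {
  return random() < effectiveSuccessFraction(requested);
}
```

As noted in the discussion, it is unclear whether sampling at any fixed rate actually addresses the privacy concern, since the right ceiling would differ between a very large site and a small one.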