W3C

DPVCG Meeting Call

20 MAR 2025

Attendees

Present
arthitSuriyawongkul, delaramGolpayegani, harshPandit, iainHenderson, julianFlake, stratisKoulierakis, tyttiRintamaki
Regrets
georgKrog, paulRyan
Chair
harshPandit
Scribe
harshPandit

Meeting minutes

Repository: w3c/dpv

Agenda: https://www.w3.org/events/meetings/178d1c71-a92d-4da7-a196-6a89d0fe2277/20250320T133000/

Meeting minutes: https://w3id.org/dpv/meetings

Persistent ID for current minutes: https://w3id.org/dpv/meetings/meeting-2025-03-20

Intro

new attendees / intro for non-regular members

stratisKoulierakis: PhD in Law on data protection, with connections to semweb technologies; has recommended some additions to DPV based on GDPR, as a proposal of concepts from the PhD thesis to capture specific expressions in formal documents.

arthitSuriyawongkul: working in ADAPT on AI, accountability, ML devops

2.1 release

<ghurlbot> Issue 235 DPV v2.1-RC hotfixes and feedback (by coolharsh55)

DPV 2.1 has been released. See announcement on https://lists.w3.org/Archives/Public/public-dpvcg/2025Mar/0002.html and release on github https://github.com/w3c/dpv/releases/tag/dpv-2.1

Datasheets and Model Cards

<ghurlbot> Issue 94 Represent Datasheets and Model Cards with DPV (by coolharsh55)

harshPandit: Working on modelling datasheets and model cards using DPV, and have questions about state of the art.

arthitSuriyawongkul: MLCommons and Croissant https://mlcommons.org/working-groups/data/croissant/ and RAI https://docs.mlcommons.org/croissant/docs/croissant-rai-spec.html

harshPandit: We have looked at that before, and it was modelling ML specific information but no information on regulations or GDPR specific things. They were supposed to be doing this in the new project, but AFAIK that has been changed / hasn't happened.

delaramGolpayegani: Open Datasheets: https://arxiv.org/abs/2312.06153

harshPandit: Am aware of this work from Microsoft - similar issue that it doesn't address regulation or align with legal requirements
… Not exactly a datasheet, but legal metadata for media https://www.joanneum.at/digital/en/projects/fairmedia. I only know about FAIRmedia because they participate in Croissant: mlcommons/croissant#808

harshPandit: two issues when looking at such relevant SotA and deciding what we should be doing in this group — (1) we don't want to simply transform text to digital form -- it must be helpful beyond capturing information and assist in auditing and accountability; (2) we want the information to be legally relevant / aligned e.g. in terminology and structure even if we don't directly address specific compliance tasks

harshPandit: In modelling datasheets, two questions for the group: (1) to what extent do we retain the terminology e.g. instances, labels – do we replace with DPV data categories; (2) to what extent do we model technical information that we haven't considered so far e.g. split in testing/validation data

delaramGolpayegani: we can look at AI Act which mentions testing / validation etc. in technical documentation and see what info is needed for compliance, and then use that as a guide for what should be included

harshPandit: Okay, that's a good approach. Though it would still mean deciding on how much technical stuff should be provided in DPV, and how much we should be looking elsewhere to model. Will share the draft of work and then we can continue the discussion.

ACTION: Harsh to share drafts for Datasheet/Model Cards

Move wiki to Github / Update wiki

<ghurlbot> Issue 129 Move content from W3C wiki to Github wiki, and close W3C wiki (by coolharsh55)

harshPandit: proposal to move docs to wiki from code/repo

accepted

ACTION: Move w3c wiki and the docs in code repo to Github repo

Simplify Purposes for adoption

<ghurlbot> Issue 258 Consolidate Purposes into Groups for Simplifying Adoption with P7012 (by coolharsh55)

harshPandit: with Iain, the proposal is that for works like P7012, where it is expected that individuals or orgs will choose specific purposes in an agreement, we want to create a grouping of related purposes that are commonly expected (while being privacy friendly) so that it's easier to pick and use them. For example, if you are getting a service, then in DPV there are many different purposes that would have to be individually chosen, like ServiceProvision, ServiceImprovement, FraudPreventionAndDetection, CustomerCare, etc. Instead, we want to group them together under a common concept in the P7012 extension so that it can be used with the standard as a single concept. Different groupings can then be established for different use-cases and verticals as needed, e.g. for banking, for a pharmacy, and so on.

iainHenderson: This comes from Customer Commons, which is like Creative Commons and aims to 'standardise' privacy policies, which would then be useful for implementing IEEE P7012; policies are written from the perspective of individuals, and the machine-readability and standardisation would be implemented with DPV (context in chat). The goal is to provide high-level concepts that organisations and individuals understand and can easily use.
… Overarching Context - Customer Management (dpv:CustomerManagement) refers to all purposes associated with managing activities related to past, current, and future customers. So we want to have a purpose such as USE: dpv:CustomerManagement with:
… 1 Inquiry Management (propose - dpv:InquiryManagement)
… 2 Relationship set-up (propose - dpv:RelationshipSetUp)
… 3 Product/ service provision (use – dpv:ServiceProvision)
… 4 Account management (use - dpv:AccountManagement)
… 5 Relationship Closure (propose - dpv:RelationshipClosure)
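
Editorial sketch (not discussed in the meeting): the grouping above could be read as a simple mapping from one high-level purpose to its member purposes. The `PURPOSE_GROUPS` structure and the `expand` and `covers` helpers are hypothetical names used only for illustration; they are not part of DPV or IEEE P7012, and the concepts marked "proposed" do not exist in DPV yet.

```python
# Sketch only: the structure and helper names are assumptions for
# illustration, not part of DPV or IEEE P7012.
PURPOSE_GROUPS = {
    "dpv:CustomerManagement": [
        "dpv:InquiryManagement",    # proposed in the minutes
        "dpv:RelationshipSetUp",    # proposed in the minutes
        "dpv:ServiceProvision",     # listed as "use" in the minutes
        "dpv:AccountManagement",    # listed as "use" in the minutes
        "dpv:RelationshipClosure",  # proposed in the minutes
    ],
}


def expand(purpose: str) -> list[str]:
    """Return the member purposes of a group, or the purpose itself."""
    return PURPOSE_GROUPS.get(purpose, [purpose])


def covers(chosen: str, requested: str) -> bool:
    """True if choosing `chosen` in an agreement also covers `requested`."""
    return requested == chosen or requested in expand(chosen)
```

Under this reading, an agreement that names only `dpv:CustomerManagement` would cover `dpv:ServiceProvision` without listing it separately, which is the simplification the proposal is after.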

harshPandit: If there are no concerns, issues then we will discuss this next with concrete proposals and examples.

ACTION: Iain and Harsh to share proposal and examples for simplifying purposes for P7012

Legal extensions for East/SouthEast Asian jurisdictions

<ghurlbot> Issue 253 [Concept]: Legal Concepts for East Asian and Southeast Asian jurisdictions (by bact)

arthitSuriyawongkul: Data protection regulations and authorities from these countries (see issue) are proposed to be modelled – Hong Kong, Japan, Macau, South Korea, Thailand, Malaysia. Used India and US extensions as an example to model this. For some countries like South Korea and Thailand there are additional authorities - have modelled them as dpv:NationalAuthority for the moment

harshPandit: let's discuss this (re. authority) later, when we also have to see how the authorities from the AI Act, which are sectoral but are also responsible for enforcing the AI Act, should be modelled in DPV.

harshPandit: For these extensions, no issues are apparent for now, so we will add them as proposed work and discuss next week.

ACTION: Review proposed SEA extensions, add them to legal spreadsheets

Other Topics

Subjective locations e.g. Home, Work; Inverted locations e.g. Non-EU; AI training

<ghurlbot> Issue 209 [Concept]: `AtX` subjective Location concepts (by coolharsh55)

<ghurlbot> Issue 208 [Concept]: Add Non-X (X = specific location) to represent locations that are not X (by coolharsh55)

<ghurlbot> Issue 82 Provide vocabulary to specify purposes and permissions related to AI training (by scottkellum)

harshPandit: propose we add them to our spreadsheets / live outputs and discuss from there

julianFlake: +1

ACTION: Add concepts from subjective locations, inverted locations, AI training to spreadsheets

Existing data taxonomies

harshPandit: had a brief discussion with Georg about modelling existing data taxonomies e.g. IAB, Google in PD extension as a scope discussion. I have two proposals: 1) directly add them into PD to say we cover those categories; OR 2) do this and also have a 'virtual layer' that shows these are the subset from IAB, Google, etc. so that if someone wants to use that specific subset e.g. to replace their existing uses then they can do so. In all of this, we are very clear and explicit that we are not promoting either the IAB or the Google taxonomy as there are issues with their use in terms of being privacy respecting and in terms of legal compliance as has been known/researched for a while. Instead, we are trying to keep DPV competitive and have an alternative ready for use with systems that want to provide a better way of going about things, and where they can use DPV.
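
Editorial sketch (not discussed in the meeting): one possible reading of the 'virtual layer' in proposal 2 is a set of named subsets over the PD categories, one per external taxonomy. Every identifier below is an illustrative placeholder, not an actual PD, IAB, or Google taxonomy entry, and the helper names are hypothetical.

```python
# Sketch only: all identifiers are illustrative placeholders, not actual
# PD, IAB, or Google taxonomy entries.
PD_CATEGORIES = {"pd:ExampleAge", "pd:ExampleInterest", "pd:ExampleLocation"}

# "Virtual layer": each external taxonomy is a named subset of PD categories.
VIRTUAL_LAYERS = {
    "iab": {"pd:ExampleAge", "pd:ExampleInterest"},
    "google": {"pd:ExampleInterest", "pd:ExampleLocation"},
}


def subset_for(source: str) -> set[str]:
    """Return the PD categories that mirror the given external taxonomy."""
    layer = VIRTUAL_LAYERS[source]
    # A layer may only reference categories that PD itself defines.
    missing = layer - PD_CATEGORIES
    if missing:
        raise ValueError(f"{source} references unknown categories: {sorted(missing)}")
    return layer


def sources_of(category: str) -> list[str]:
    """List the external taxonomies whose subset includes a PD category."""
    return sorted(name for name, layer in VIRTUAL_LAYERS.items() if category in layer)
```

Keeping the layers as subsets means the categories live once in PD (proposal 1), while an adopter replacing an existing taxonomy can still ask for only the matching subset.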

iainHenderson: good idea to do these, we can also go over their limitations and be human-centric e.g. Google taxonomy is 7 levels deep and they stop at things like whiskey or age. We are not enabling the IAB/Google, but rather enabling detailed representations for humans. Google is good for products, but not for services. So when we go in this direction, we will see gaps, which will need to be plugged from the human perspective.

to be discussed further when relevant members are present

Next Meeting

The next meeting will be on MAR-27 Thursday 13:30WET/14:30CET

Agenda will be continuing discussion on topics started under v2.2 and any other updates that come up.

Summary of action items

  1. Harsh to share drafts for Datasheet/Model Cards
  2. Move w3c wiki and the docs in code repo to Github repo
  3. Iain and Harsh to share proposal and examples for simplifying purposes for P7012
  4. Review proposed SEA extensions, add them to legal spreadsheets
  5. Add concepts from subjective locations, inverted locations, AI training to spreadsheets
Minutes manually created (not a transcript), formatted by scribe.perl version 217 (Fri Apr 7 17:23:01 2023 UTC).