DPVCG Meeting Call – 27 FEB 2025

Meeting minutes

Repository: w3c/dpv

Meeting minutes: https://w3id.org/dpv/meetings

PURL for this meeting: https://w3id.org/dpv/meetings/meeting-2025-02-27

AI Training

<ghurlbot> Issue 82 Provide vocabulary to specify purposes and permissions related to AI training (by scottkellum)

email by harsh on mailing list https://lists.w3.org/Archives/Public/public-dpvcg/2025Feb/0010.html

harsh: I tried out the implications of defining AI training as a dpv:Process, as a dpv:Processing, and as an ai:Technique. See the comment in the GitHub issue and the linked blog post for examples and more details. Based on this, I propose to add training as part of the technique taxonomy in the AI extension, and to distinguish between techniques that use data - which we define as dpv:Processing - and those that don't. We already have some training-related concepts in the taxonomy, such as supervised training, so this will also avoid having to create duplicate technique and training taxonomies. IncrementalTraining and FederatedTraining have AI-specific meanings, so they can go in the Technique taxonomy.

harsh: The concepts for frequency, duration, location, actor, etc. associated with training should be expressed through dpv:Process, like we do for the other activities. Otherwise we will need more properties just for training - it's better to use a common pattern, as training is just another factor in the process.

delaramGolpayegani: Where will we get the definitions for these from?

harsh: These are common terms in the ML/AI domain, so there should be a reference or a link to some glossary that we can use, e.g. from IEEE or IBM. Since these are technical aspects / techniques, and not procedural or policy concepts, we can find references to definitions/descriptions in existing implementations.

delaramGolpayegani: IBM risk atlas https://www.ibm.com/docs/en/watsonx/saas?topic=ai-risk-atlas for AI risk taxonomy enrichment

mayraRusso: not sure about training under technique, seems more like a process

harsh: I see the point, so there's a proposal also to add ai:AIProcess to flag that AI is being used in the process - which could be training or something else. Then the technique or method used for training, data, etc. can also be specified there.

group will evaluate the proposal / use-case and discuss this in later meetings
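
A minimal sketch of the pattern discussed above, written as plain Python triples rather than a definitive RDF model. It assumes that training is described through a dpv:Process, that the proposed ai:AIProcess flags the use of AI, and that a technique which uses data is also linked as processing. The namespace IRIs, the `ai:hasTechnique` property, the `example.com` identifiers, and the function name are illustrative assumptions, not agreed vocabulary.

```python
# Illustrative sketch only: prefixes and terms marked "hypothetical" are assumptions.
DPV = "https://w3id.org/dpv#"
AI = "https://w3id.org/dpv/ai#"
EX = "https://example.com/"  # hypothetical namespace for instance data


def training_process(process_id, technique, uses_data, frequency=None, location=None):
    """Return (subject, predicate, object) triples describing one training process."""
    s = EX + process_id
    triples = [
        (s, "rdf:type", DPV + "Process"),
        (s, "rdf:type", AI + "AIProcess"),  # proposed in the discussion, not yet adopted
        (s, AI + "hasTechnique", AI + technique),  # hypothetical property name
    ]
    if uses_data:
        # Techniques that use data are treated as dpv:Processing in the proposal
        triples.append((s, DPV + "hasProcessing", AI + technique))
    # Context such as frequency and location is attached to the process itself,
    # following the common pattern rather than training-specific properties
    if frequency is not None:
        triples.append((s, DPV + "hasFrequency", frequency))
    if location is not None:
        triples.append((s, DPV + "hasLocation", location))
    return triples


if __name__ == "__main__":
    for triple in training_process(
        "training-01",
        "FederatedTraining",
        uses_data=True,
        frequency=DPV + "ContinuousFrequency",
    ):
        print(triple)
```

The point of the design is that frequency, duration, location, and actor stay on the process, so no training-specific properties are needed.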

Subjective Locations

<ghurlbot> Issue 209 [Concept]: `AtX` subjective Location concepts (by coolharsh55)

harsh: use-cases are as described last week, with RDF examples; the only unclear bit is the browser cookie - how to state a third-party managed location within a personal space? E.g. a privately owned shop but owned by someone else. This is a complex concept to state. Also see the email to the mailing list https://lists.w3.org/Archives/Public/public-dpvcg/2025Feb/0010.html and the GitHub issue

group will evaluate the proposal / use-case and discuss this in later meetings

Inverted Locations

<ghurlbot> Issue 208 [Concept]: Add Non-X (X = specific location) to represent locations that are not X (by coolharsh55)

harsh: see the email https://lists.w3.org/Archives/Public/public-dpvcg/2025Feb/0010.html and the GitHub issue

group will evaluate the proposal / use-case and discuss this in later meetings

Uncategorised/Unstructured Data

<ghurlbot> Issue 240 Add `UnstructuredData` and `UncategorisedData` concepts/properties (by coolharsh55)

harsh: Proposal from RobBrennan and myself to add uncategorised and unstructured as categorisations for data. Uncategorised refers to not knowing whether some data is personal data or not (or other categories) - which is a risk. Unstructured refers to not knowing how the data is structured or organised - which is a different risk. The context is a health governance project where flagging that data is unstructured/uncategorised helps with data quality and legal assessments.

paulRyan: +1

harsh: These concepts will also be helpful later, as AI has labelled data

paulRyan: is there a difference?

harsh: yes, e.g. personal data without labels can mean it is not categorised (labelled)
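
The distinction between the two proposed flags can be sketched as follows. This is an illustration under stated assumptions, not part of the proposal: the `Dataset` class, its fields, the `data_flags` function, and the use of the DPV namespace for the proposed `UncategorisedData` and `UnstructuredData` terms are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

DPV = "https://w3id.org/dpv#"  # assumed namespace for the proposed concepts


@dataclass
class Dataset:
    """Hypothetical record of what is known about a dataset."""
    name: str
    category: Optional[str] = None  # e.g. "PersonalData"; None means not assessed
    schema: Optional[str] = None    # how the data is organised; None means unknown


def data_flags(ds: Dataset) -> List[str]:
    """Return the proposed flags that apply to a dataset."""
    flags = []
    if ds.category is None:
        # Not knowing whether the data is personal (or another category) is one risk
        flags.append(DPV + "UncategorisedData")
    if ds.schema is None:
        # Not knowing how the data is structured or organised is a different risk
        flags.append(DPV + "UnstructuredData")
    return flags


if __name__ == "__main__":
    print(data_flags(Dataset("clinic-notes")))
    print(data_flags(Dataset("registry", category="PersonalData", schema="csv-v1")))
```

The two checks are independent, so a dataset can be categorised but unstructured, or structured but uncategorised.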

Next Meeting

The next meeting will be on MAR-06 Thursday 13:30WET/14:30CET

Agenda will be reviewing and finalising release of v2.1, and continuing discussion on topics started under v2.2
