W3C Workshop on Web and Machine Learning

🌱 A virtual event with pre-recorded talks and interactive sessions

The Program Committee has identified the following speakers and presentations as input to the September 2020 live discussions.

Discussions emerging from the presentations are welcome in our GitHub repository issues, either by creating a new issue or commenting on an existing one.

  1. Opportunities and Challenges of Browser-Based Machine Learning
  2. Web Platform Foundations for Machine Learning
  3. Machine Learning Experiences on the Web: A Developer's Perspective
  4. Machine Learning Experiences on the Web: A User's Perspective

Opportunities and Challenges of Browser-Based Machine Learning

Goal: Determine the unique opportunities of browser-based ML and the obstacles hindering its adoption

Privacy-first approach to machine learning by Philip Laszkowicz - 11 min

11 minutes presentation

Speaker
Philip Laszkowicz
Abstract
The presentation will discuss how developers should build modern web apps and what is missing in the existing ecosystem to make privacy-first ML possible, including the challenges with WASI, modular web architecture, and localized analytics.
Machine Learning and Web Media by Bernard Aboba (Microsoft) - 7 min

7 minutes presentation

Speaker
Bernard Aboba (Microsoft)
Abstract
The presentation will discuss efficient processing of raw video in machine learning, highlighting the need to minimize memory copies and enable integration with WebGPU.

10 minutes presentation

Speaker
Jason Mayes (Google)
Developer Advocate for TensorFlow.js
Abstract
This talk will give a brief overview of TensorFlow.js and how it helps developers build ML-powered applications, along with examples of work that is pushing the boundaries of the web. It will also discuss future directions for the web tech stack to help overcome barriers to ML on the web that the TF.js community has encountered.
Machine Learning in Web Architecture by Sangwhan Moon - 4 min

4 minutes presentation

Speaker
Sangwhan Moon
Extending W3C ML Work to Embedded Systems by Peter Hoddie (Moddable Tech) - 6 min

6 minutes presentation

Speaker
Peter Hoddie (Moddable Tech)

Peter is the chair of Ecma TC53, ECMAScript Modules for Embedded Systems, working to bring standard JavaScript APIs to IoT.

He is a delegate to Ecma TC39, the JavaScript language standards committee, where his focus is ensuring JavaScript remains a viable language on resource constrained devices.

He is a co-founder and CEO of Moddable Tech, building XS, the only modern JavaScript engine for embedded systems, and the Moddable SDK, a JavaScript framework for delivering consumer and industrial IoT products.

Peter is the co-author of “IoT Development for ESP32 and ESP8266 with JavaScript”, published in 2020 by Apress, the professional books imprint of Springer Nature.

He contributed to the ISO MPEG-4 file format standard.


Abstract

JavaScript's dominance on the web often obscures its many successes beyond the web, such as in embedded systems. New silicon for embedded systems is beginning to include hardware to accelerate ML, bringing ML to edge devices. These embedded systems are capable of running the same modern JavaScript used on the web. Would it be possible for the embedded systems to be coded in JavaScript in a way that is compatible with the ML APIs of the web?

This talk will briefly present two examples of JavaScript APIs developed for the web to support hardware features -- the W3C Sensor API and the Chrome Serial API. It will describe how each has been bridged to the embedded world in a different way -- perhaps suggesting a model for how W3C ML JavaScript APIs can bridge the embedded and browser worlds as well.
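To make the bridging concrete, here is a hedged sketch of the W3C Generic Sensor API mentioned above: the same JavaScript could, in principle, run against a browser implementation or an embedded shim. The `Accelerometer` global comes from the Generic Sensor API; the callback shape and fallback behavior are illustrative.

```javascript
// Sketch: reading an accelerometer via the W3C Generic Sensor API.
// The same code could run in a browser or, with a compatible shim,
// on an embedded JavaScript engine. Defined here but not invoked.
function watchAcceleration(onSample) {
  // Accelerometer is defined by the Generic Sensor API; hosts that
  // lack it should fail loudly rather than silently do nothing.
  if (typeof Accelerometer === "undefined") {
    throw new Error("Generic Sensor API not available on this host");
  }
  const sensor = new Accelerometer({ frequency: 10 }); // 10 Hz sampling
  sensor.addEventListener("reading", () => {
    onSample({ x: sensor.x, y: sensor.y, z: sensor.z });
  });
  sensor.addEventListener("error", (e) => console.error(e.error));
  sensor.start();
  return () => sensor.stop(); // caller can stop sampling
}
```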

Web Platform Foundations for Machine Learning

Goal: Understand how machine learning fits into the Web technology stack

10 minutes presentation

Speaker
Dominique Hazaël-Massieux (W3C)
Dominique is part of the full-time technical staff employed by W3C to animate the Web standardization work. He is in particular responsible for the work on WebRTC, WebXR and Web & Networks, led the effort to start a WebTransport Working Group and is one of the organizers of the Web and Machine Learning workshop.
Abstract
Background talk on the specificities of the Web browser as a development platform.
Media processing hooks for the Web by François Daoust (W3C) - 12 min

12 minutes presentation

Speaker
François Daoust (W3C)
François is part of the full-time technical staff employed by W3C, where he supervises the work related to media technologies.
Abstract
This talk will provide an overview of existing, planned or possible hooks for processing muxed and demuxed media (audio and video) in real time in Web applications, and rendering the results. It will also present high-level requirements for efficient media processing.

10 minutes presentation

Speaker
Ningxin Hu (Intel)
Ningxin is a principal software engineer at Intel. Ningxin is co-editing the Web Neural Network (WebNN) API spec within the W3C Machine Learning for the Web Community Group.
Abstract
The WebNN API is a new web standard proposal that allows web apps and frameworks to accelerate deep neural networks with dedicated on-device hardware such as GPUs, CPUs with deep learning extensions, or purpose-built AI accelerators. A prototype of the WebNN API will be used to demonstrate the near-native speed of deep neural network execution for object detection by accessing AI accelerators on phones and PCs.
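As a rough illustration of the programming model, the following sketch builds and runs a one-layer graph with the proposed API. The WebNN surface was still evolving at the time, so names such as `navigator.ml.createContext` and `MLGraphBuilder` follow the explainer and should be treated as illustrative, not final.

```javascript
// Sketch of the proposed WebNN API: build a tiny graph (a single
// matmul + add, i.e. one fully connected layer) and execute it.
// Defined here but not invoked; names may differ from shipped code.
async function runFullyConnected(inputData) {
  const context = await navigator.ml.createContext({ devicePreference: "gpu" });
  const builder = new MLGraphBuilder(context);

  // Graph inputs and constants (weights/bias values are placeholders).
  const input = builder.input("input", { type: "float32", dimensions: [1, 4] });
  const weights = builder.constant(
    { type: "float32", dimensions: [4, 2] },
    new Float32Array(8).fill(0.25));
  const bias = builder.constant(
    { type: "float32", dimensions: [1, 2] },
    new Float32Array([0.1, 0.1]));

  // output = input × weights + bias
  const output = builder.add(builder.matmul(input, weights), bias);

  // Compile the graph, then run it on the chosen device.
  const graph = await builder.build({ output });
  const results = { output: new Float32Array(2) };
  await context.compute(graph, { input: new Float32Array(inputData) }, results);
  return results.output;
}
```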

10 minutes presentation

Speaker
Jonathan Bingham (Google)
Jonathan is a web product manager at Google.
Abstract
The Model Loader API is a new proposal for a web standard to make it easy to load and run ML models from JavaScript, taking advantage of available hardware acceleration. The API surface is similar to existing model serving APIs (like TensorFlow Serving, TensorRT, and MXNet Model Server), and it is complementary to the Web NN graph API proposal as well as lower level WebGL and WebGPU APIs.
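A hedged sketch of what using such an API might look like. The proposal was exploratory, so the factory and method names below (`createModelLoader`, `load`, `compute`) are illustrative placeholders rather than a stable surface, and the model path and shape are made up.

```javascript
// Sketch of the proposed Model Loader API: fetch a pre-trained model
// and run inference, letting the browser pick hardware acceleration.
// Defined here but not invoked; all names are illustrative.
async function classifyImage(pixels) {
  const loader = navigator.ml.createModelLoader(); // hypothetical factory
  const model = await loader.load("/models/mobilenet_v2.tflite"); // placeholder path
  // Hypothetical compute call: one named input tensor, one output.
  const [scores] = await model.compute([
    { data: pixels, dimensions: [1, 224, 224, 3] },
  ]);
  return scores; // e.g. class probabilities
}
```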
SIMD operations in WebGPU for ML by Mehmet Oguz Derin - 5 min

5 minutes presentation

Speaker
Mehmet Oguz Derin
Accelerated graphics and compute API for Machine Learning - DirectML by Chai Chaoweeraprasit (Microsoft) - 10 min

10 minutes presentation

Speaker
Chai Chaoweeraprasit (Microsoft)
Chai leads the development of the machine learning platform at Microsoft.
Abstract
DirectML is Microsoft's hardware-accelerated machine learning platform that powers popular frameworks such as TensorFlow and ONNX Runtime. It expands those frameworks' hardware footprint by enabling high-performance training and inference on any device with a DirectX-capable GPU.

7 minutes presentation

Speaker
Miao Wang (Google)
Software Engineer for Android Neural Networks API
Abstract
The Android Neural Networks API (NNAPI) is an Android C API designed for running computationally intensive operations for machine learning on Android devices. NNAPI is designed to provide a base layer of functionality for higher-level machine learning frameworks, such as TensorFlow Lite and Caffe2, that build and train neural networks. The API is available on all Android devices running Android 8.1 (API level 27) or higher. Based on an app’s requirements and the hardware capabilities on an Android device, NNAPI can efficiently distribute the computation workload across available on-device processors, including dedicated neural network hardware (NPUs and TPUs), graphics processing units (GPUs), and digital signal processors (DSPs).

12 minutes presentation

Speaker
Jeff Hammond (Intel)
Jeff Hammond is a Principal Engineer at Intel where he works on a wide range of high-performance computing topics, including parallel programming models, system architecture and open-source software. He has published more than 60 journal and conference papers on parallel computing, computational chemistry, and linear algebra software. Jeff received his PhD in Physical Chemistry from the University of Chicago.
Abstract
Diversity in computer architecture and the unceasing demand for application performance in data-intensive workloads are never-ending challenges for programmers. This talk will describe Intel’s oneAPI initiative, which is an open ecosystem for heterogeneous computing that supports high-performance data analytics, machine learning and other workloads. A key component of this is Data Parallel C++, which is based on C++17 and Khronos SYCL and supports direct programming of CPU, GPU and FPGA platforms. We will describe how oneAPI and Data Parallel C++ can be used to build high-performance applications for a range of devices.

9 minutes presentation

Speaker
Yakun Huang & Xiuquan Qiao (BUPT)
Abstract
This talk introduces two deep learning technologies for the mobile web across cloud, edge, and end devices. One is an adaptive DNN execution scheme that partitions the model and performs the computation that can be done within the mobile web, reducing the computing pressure on the edge cloud. The other is a lightweight collaborative DNN across cloud, edge, and devices, which provides a collaborative mechanism with the edge cloud for accurate compensation.
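The partitioning idea can be illustrated with a toy cost model: given per-layer compute times on the device and on the edge, plus the cost of uploading each intermediate activation, pick the split point with the lowest end-to-end latency. A simplified sketch, with all costs hypothetical:

```javascript
// Toy DNN partitioner: layers 0..k-1 run on the device, layers k..n-1
// on the edge server, and the input activation of layer k is uploaded
// once. Returns the split index k with the lowest estimated latency.
// (k === n means fully on-device; k === 0 means fully offloaded.)
function bestSplit(deviceMs, edgeMs, uploadMs) {
  const n = deviceMs.length;
  let best = { k: 0, cost: Infinity };
  for (let k = 0; k <= n; k++) {
    const onDevice = deviceMs.slice(0, k).reduce((a, b) => a + b, 0);
    const onEdge = edgeMs.slice(k).reduce((a, b) => a + b, 0);
    const transfer = k === n ? 0 : uploadMs[k]; // ship activation k's input
    const cost = onDevice + transfer + onEdge;
    if (cost < best.cost) best = { k, cost };
  }
  return best;
}
```

For example, with a slow device, a fast edge, and a large raw input, the model favors running the first layer locally to shrink the upload.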
Collaborative Learning by Wolfgang Maß (DFKI) - 10 min

10 minutes presentation

Speaker
Wolfgang Maß (DFKI)
Professor at Saarland University and scientific director at DFKI
Abstract
The execution of data analysis services in a browser on devices has recently gained momentum, but the lack of computing resources on devices and data protection regulations impose strong constraints. In our talk we will present a browser-based collaborative learning approach for running data analysis services on peer-to-peer networks of devices. Our platform is developed in JavaScript and supports modularization of services, model training and usage on devices (TensorFlow.js), sensor communication (MQTT), and peer-to-peer communication (WebRTC) with role-based access control (OAuth 2.0).
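The aggregation step at the heart of such collaborative learning can be illustrated with plain federated averaging, where peers combine their locally trained weights. This is a generic sketch of the technique, not the platform's actual protocol:

```javascript
// Federated averaging: merge weight vectors from several peers into
// one model, weighting each peer by the number of local samples it
// trained on. Real systems add secure aggregation, peer dropout
// handling, and multiple training rounds on top of this step.
function federatedAverage(updates) {
  const total = updates.reduce((sum, u) => sum + u.samples, 0);
  const dim = updates[0].weights.length;
  const merged = new Array(dim).fill(0);
  for (const { weights, samples } of updates) {
    const share = samples / total; // this peer's contribution
    for (let i = 0; i < dim; i++) merged[i] += weights[i] * share;
  }
  return merged;
}
```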
Introducing WASI-NN by Mingqiu Sun & Andrew Brown (Intel) - 7 min

7 minutes presentation

Speaker
Mingqiu Sun & Andrew Brown (Intel)
Senior Principal Engineer at Intel & software engineer at Intel
Abstract
Trained machine learning models are typically deployed on a variety of devices with different architectures and operating systems. WebAssembly provides an ideal portable form of deployment for those models. In this talk, we will introduce the WASI-NN initiative we have started in the WebAssembly System Interface (WASI) community, which would standardize the neural network system interface for WebAssembly programs.

Machine Learning Experiences on the Web: A Developer's Perspective

Goal: Authoring ML experiences on the Web; challenges and opportunities of reusing existing ML models on the Web; on-device training, known technical solutions, gaps

Fast client-side ML with TensorFlow.js by Ann Yuan (Google) - 8 min

8 minutes presentation

Speaker
Ann Yuan (Google)
Software Engineer for TensorFlow.js
Abstract
This talk will present how TensorFlow.js enables ML execution in the browser using web technologies such as WebGL for GPU acceleration and WebAssembly, and will cover technical design considerations.
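For example, backend selection in TensorFlow.js is explicit: an application can request the WebAssembly backend and fall back to WebGL. A minimal sketch, assuming the tfjs and tfjs-backend-wasm bundles are already loaded as the global `tf` (defined here but not invoked):

```javascript
// Sketch: choosing a TensorFlow.js execution backend. setBackend
// resolves to false if the requested backend failed to initialize,
// in which case we fall back to the WebGL (GPU) backend.
async function pickBackend() {
  const wasmOk = await tf.setBackend("wasm"); // WebAssembly on CPU
  if (!wasmOk) {
    await tf.setBackend("webgl"); // GPU via WebGL
  }
  await tf.ready(); // wait until the backend is fully initialized
  return tf.getBackend(); // e.g. "wasm" or "webgl"
}
```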

14 minutes presentation

Speaker
Emma Ning (Microsoft)
Emma Ning is a senior product manager on the AI Frameworks team in the Microsoft Cloud + AI group, focusing on AI model operationalization and acceleration with ONNX/ONNX Runtime for open and interoperable AI. She has more than five years of product experience with search engines that take advantage of machine learning techniques, and has spent more than three years exploring AI adoption across various businesses. She is passionate about bringing AI solutions to solve business problems as well as enhance product experiences.
Abstract
ONNX.js is a JavaScript library for running ONNX models in browsers and on Node.js, on both CPU and GPU. Thanks to ONNX interoperability, it is also compatible with TensorFlow and PyTorch models. For running on CPU, ONNX.js adopts WebAssembly to execute the model at near-native speed and utilizes Web Workers to provide a "multi-threaded" environment, achieving very promising performance gains. For running on GPU, ONNX.js takes advantage of WebGL, a popular standard for accessing GPU capabilities. By reducing data transfer between CPU and GPU as well as GPU processing cycles, ONNX.js pushes performance further.
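In code, the backend choice surfaces as a hint when creating an inference session. A hedged sketch, assuming the ONNX.js bundle is loaded as the global `onnx` and using a placeholder model path and shape (defined here but not invoked):

```javascript
// Sketch: running an ONNX model with ONNX.js, choosing the WebGL
// backend for GPU or the WebAssembly backend for CPU execution.
async function runOnnxModel(inputData, useGpu) {
  const session = new onnx.InferenceSession({
    backendHint: useGpu ? "webgl" : "wasm",
  });
  await session.loadModel("./model.onnx"); // placeholder model path
  const input = new onnx.Tensor(
    new Float32Array(inputData), "float32", [1, inputData.length]);
  const outputMap = await session.run([input]); // Map of name -> Tensor
  return outputMap.values().next().value.data; // first output's data
}
```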
Paddle.js - Machine Learning for the Web by Ping Wu (Baidu) - 5 min

5 minutes presentation

Speaker
Ping Wu (Baidu)
Architect at Baidu, Lead of Paddle.js
Abstract
Paddle.js is a high-performance JavaScript deep learning framework for diverse web runtimes that helps build a PaddlePaddle ecosystem with the web community. This talk will introduce Paddle.js's design principles, implementation, usage scenarios, and future work the project would like to explore.
ml5.js: Friendly Machine Learning for the Web by Yining Shi (New York University, RunwayML) - 8 min

8 minutes presentation

Speaker
Yining Shi (New York University, RunwayML)
ml5.js contributor and adjunct professor at Interactive Telecommunications Program (ITP)
Pipcook, a front-end oriented DL framework by Wenhe Eric Li (Alibaba) - 10 min

10 minutes presentation

Speaker
Wenhe Eric Li (Alibaba)
ML/DL on the web, contributor to ml5 & tfjs, member of Pipcook, SDE @ Alibaba
Abstract
We are going to present a front-end oriented platform based on TensorFlow.js. We will cover what Pipcook is and its design philosophy, as well as some examples and use cases from our internal community. Apart from that, we will show a brand-new solution for bridging the flourishing Python DL/ML environment with JavaScript runtimes in both browsers and Node.js.

11 minutes presentation

Speaker
Oleksandr Paraska (eyeo)
eyeo GmbH is the company behind Adblock plus
Abstract
eyeo GmbH has recently deployed TensorFlow.js in its product for better ad-blocking functionality and has identified gaps in what the WebNN draft covers, e.g. using the DOM as input data, or primitives needed for graph convolutional networks. The talk will present the relevant use case and give indications of how it can best be supported by the new standard.
Exploring unsupervised image segmentation results by Piotr Migdal & Bartłomiej Olechno - 6 min

6 minutes presentation

Speaker
Piotr Migdal & Bartłomiej Olechno
Abstract
This talk will present the usage of web-based tools to interactively explore machine learning models, with the example of an interactive D3.js-based visualization to see the results of unsupervised image segmentation.
Mobile-first web-based Machine Learning by Josh Meyer & Lindy Rauchenstein (Artie) - 11 min

11 minutes presentation

Speaker
Josh Meyer & Lindy Rauchenstein (Artie)
Lead Scientist at Artie, Inc. and Machine Learning Fellow at Mozilla; Lead Scientist at Artie
Abstract
This talk is an overview of some of Artie's machine learning tech stack, which is web-based and mobile-first. It will discuss peculiarities of working with voice, text, and images originating from a user's phone while running an application in the browser, and will include discussions about balancing user preferences with privacy, latency, and performance.

Machine Learning Experiences on the Web: A User's Perspective

Goal: Web & ML for all: education, learning, accessibility, cross-industry experiences, cross-disciplinary ML: music, art, and media meet ML; Share learnings and best practices across industries

We Count: Fair Treatment, Disability and Machine Learning by Jutta Treviranus (OCAD University) - 13 min

13 minutes presentation

Speaker
Jutta Treviranus (OCAD University)
Director & Professor, Inclusive Design Research Centre, OCAD University
Abstract

The risks of AI Bias have recently received attention in public discourse. Numerous stories of the automation and amplification of existing discrimination and inequity are emerging, as more and more critical decisions and functions are handed over to machine learning systems. There is a growing movement to tackle non-representative data and to prevent the introduction of human biases into machine learning algorithms.

However, these efforts are not addressing a fundamental characteristic of data-driven decisions that presents significant risk if you have a disability. Even if there is full proportional representation, and even if all human bias is removed from AI systems, the systems will favour the majority and dominant patterns. This has implications for individuals and groups that are outliers, small minorities, or highly heterogeneous. The only common characteristic of disability is sufficient difference from the average such that most systems are a misfit and present a barrier. Machine learning requires large data sets; many people with disabilities represent a data set of one. Decisions based on population data will decide against small minorities and for the majority. The further you are from average, the harder it will be to train machine learning systems to serve your needs. To add insult to injury, if you are an outlier and highly unique, privacy protections won't work for you, and you will be most vulnerable to data abuse and misuse.

This presentation will:

  • outline the risks and opportunities presented by machine learning systems;
  • address strategies to mitigate the risks; and
  • discuss steps needed to support decisions that do not discriminate against outliers and small minorities.

The benefits for innovation and the well-being of society as a whole will also be discussed.

AI (Machine Learning): Bias & Garbage In, Bias & Garbage Out by John Rochford (University of Massachusetts Medical School) - 10 min

10 minutes presentation

Speaker
John Rochford (University of Massachusetts Medical School)
Director, INDEX Program, Eunice Kennedy Shriver Center, University of Massachusetts Medical School
Abstract
Biased training data produces untrustworthy, unfair, useless results. Such results include:
  • predicting that black prisoners are the most likely recidivists; and
  • autonomous-car ML models killing a wheelchair user in a street crosswalk.

Training data must include representation of people with disabilities, all races, all ethnicities, all genders, etc. Creation of training data must include those populations. There are open-source and commercial toolkits and APIs to facilitate bias mitigation.

John is an expert in this area focused on AI fairness and empowerment for people with disabilities and is a member of the Machine Learning for the Web Community Group.

Cognitive Accessibility and Machine Learning by Lisa Seeman, Joshue O’Connor - 13 min

13 minutes presentation

Speaker
Lisa Seeman, Joshue O’Connor
Interactive ML - Powered Music Applications on the Web by Tero Parviainen (Counterpoint) - 10 min

10 minutes presentation

Speaker
Tero Parviainen (Counterpoint)
Tero Parviainen is a software developer in music, media, and the arts. As a co-founder of creative technology studio Counterpoint, he's recently built installations for The Barbican Centre, Somerset House, The Helsinki Festival, The Dallas Museum of Art, and various corners of the web. He also contributes at Wavepaths, building generative music systems for psychedelic therapy.
Abstract
This talk will present a few projects Counterpoint has built with TensorFlow.js and Magenta.js over the past couple of years. Ranging from experimental musical instruments to interactive artworks, they've really stretched what can be done in the browser context. It will focus on the special considerations needed in music & audio applications, the relationship between ML models and Web Audio, and the limitations encountered while combining the two.

6 minutes presentation

Speaker
Kelly Davis (Mozilla)
Manager of the machine learning group at Mozilla. Kelly's work at Mozilla includes Deep Speech (an open speech recognition system), Common Voice (a crowdsourced tool for creating open speech corpora), Mozilla's TTS (an open-source speech synthesis system), Snakepit (an open-source ML job scheduler), as well as ML research and many other projects.
Privacy focused machine translation in Firefox by Nikolay Bogoychev (University of Edinburgh) - 6 min

6 minutes presentation

Speaker
Nikolay Bogoychev (University of Edinburgh)
Postdoctoral researcher at the University of Edinburgh
Abstract
In recent years, machine translation has been widely adopted by end users, making online content in foreign languages more accessible than ever. However, machine translation has always been treated as a computationally heavy problem, and as such it is usually delivered to the end user via online services such as Google Translate, which may not be appropriate for sensitive content. We present a privacy-focused machine translation system that runs locally on the user's machine and is accessible through a Firefox browser extension. The translation models used are just 16 MB, and translation speed is high enough for a seamless user experience even on laptops from 2012.
AI-Powered Per-Scene Live Encoding by Anita Chen (Fraunhofer FOKUS) - 9 min

9 minutes presentation

Speaker
Anita Chen (Fraunhofer FOKUS)
Project Manager at Fraunhofer FOKUS
Abstract
This presentation will provide an overview of utilizing machine learning methods in automating per-title encoding for Video on Demand (VoD) and live streaming in order to improve the viewing experience. It will also address the behaviors of various regression models that can predict encoding ladders in a browser in real-time, including a future outlook in terms of optimization.
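As a toy stand-in for such a regression model, the sketch below scales a base bitrate ladder by a scene-complexity score; the coefficients and ladder values are made up for illustration and do not reflect the models discussed in the talk.

```javascript
// Toy per-scene ladder prediction: given a scene-complexity score in
// [0, 1], scale a base bitrate ladder so that complex scenes receive
// more bits. A real system would fit the model to encode/quality data.
function predictLadder(complexity, baseKbps = [400, 1200, 3500]) {
  const factor = 0.6 + 0.8 * complexity; // hypothetical linear fit
  return baseKbps.map((kbps) => Math.round(kbps * factor));
}
```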

8 minutes presentation

Speaker
Zelun Chen (Netease)
Front-end and Client Development Engineer at Netease
Abstract
This talk will cover the use of machine learning to enhance participants' expressions in virtual-character web meetings, and will highlight the problems of using WebAssembly to run AI models in the browser.

7 minutes presentation

Speaker
Jean-Marc Valin
Jean-Marc Valin has previously contributed to the Opus and AV1 codecs. He is employed by Amazon, but is giving this talk as an individual.
Abstract
This talk presents RNNoise, a small and fast real-time noise suppression algorithm that combines classical signal processing with deep learning. We will discuss the algorithm and how the browser can be improved to make RNNoise and other neural speech enhancement algorithms more efficient.
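The hybrid structure can be illustrated without the network itself: RNNoise's recurrent network outputs a gain per frequency band, and classical DSP applies those gains to the spectrum. The sketch below shows only the trivial gain-application step, with the gains passed in rather than predicted; it is not the actual RNNoise algorithm.

```javascript
// Toy band-gain suppression in the spirit of RNNoise: scale each
// frequency band's magnitude by a gain in [0, 1]. In RNNoise, a
// small recurrent network predicts roughly 22 such gains per frame;
// here the gains are supplied by the caller and simply clamped.
function applyBandGains(bandMagnitudes, gains) {
  if (bandMagnitudes.length !== gains.length) {
    throw new Error("bands and gains must have the same length");
  }
  return bandMagnitudes.map(
    (mag, i) => mag * Math.min(1, Math.max(0, gains[i])));
}
```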

7 minutes presentation

Speaker
Louis McCallum (University of London)
Louis is an experienced software developer, researcher, artist, and musician. Currently, he holds a postdoctoral position at the Embodied AudioVisual Interaction Group, Goldsmiths, University of London, where he is also an Associate Lecturer. He is also lead developer on the MIMIC platform and the accompanying Learner.js and MaxiInstrument.js libraries.
Abstract
Over the past two years, as part of the RCUK AHRC-funded MIMIC project, we have provided platforms and libraries for musicians and artists to use, perform, and collaborate online using machine learning. Although machine learning has a lot to offer these communities, their skill sets and requirements often diverge from more conventional use cases. For example, requirements for dynamic provision of data and on-the-fly training in the browser raise challenges with performance, connectivity, and storage.

We seek to address the non-trivial challenges of connecting inputs from a variety of sources, running potentially computationally expensive feature extractors alongside lightweight machine learning models, and generating audio and visual output, in real time, without interference. Whilst technologies like AudioWorklet address this to some extent, there remain issues with implementation, documentation, and adoption (currently limited to Chrome). For example, issues with garbage collection (created by the worker-thread messaging system) caused wide-scale disruption to many developers using AudioWorklets and were only addressed by a ring-buffer solution that developers must integrate outside of the core API. We are also keen to ensure the WebGPU API takes real-time media into consideration as it is introduced.

Our talk will cover both the users' perspective, as uncovered by our user-centered research, and a developer's perspective on the technical challenges we have faced developing tools to meet the needs of these users in both creative and educational settings.
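The ring-buffer workaround mentioned in the abstract amounts to a preallocated single-producer/single-consumer queue, so audio samples can cross threads without per-frame message allocations. A simplified single-threaded sketch; a production version would back the indices with a SharedArrayBuffer and update them with Atomics.

```javascript
// Minimal single-producer/single-consumer ring buffer over a fixed
// Float32Array, in the spirit of the GC-free AudioWorklet messaging
// workaround. No allocations happen per frame once constructed.
class RingBuffer {
  constructor(capacity) {
    this.buf = new Float32Array(capacity);
    this.read = 0;   // next index to read
    this.write = 0;  // next index to write
    this.size = 0;   // samples currently stored
  }
  // Append samples; returns how many were actually written
  // (extras are dropped when the buffer is full).
  push(samples) {
    let written = 0;
    for (const s of samples) {
      if (this.size === this.buf.length) break;
      this.buf[this.write] = s;
      this.write = (this.write + 1) % this.buf.length;
      this.size++;
      written++;
    }
    return written;
  }
  // Remove and return up to `count` samples in FIFO order.
  pop(count) {
    const out = new Float32Array(Math.min(count, this.size));
    for (let i = 0; i < out.length; i++) {
      out[i] = this.buf[this.read];
      this.read = (this.read + 1) % this.buf.length;
      this.size--;
    }
    return out;
  }
}
```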