Hello, everyone.
This is Ping Wu from Baidu.
And my presentation today is about Paddle.js:
machine learning for the web.
My presentation has five parts.
First is a brief introduction to Paddle.js,
then the design principles, the implementation, a new scenario, and finally the conclusion and
future work.
What's Paddle.js?
Paddle.js is a high performance deep learning
framework for JavaScript, which provides on-device computation in diverse web runtimes, including
PC, mobile browser and mini programs.
It's part of Baidu PaddlePaddle ecosystem.
It has APIs compatible with the PaddlePaddle
Python and C++ parts.
Paddle.js currently supports only inference, not
training, but we provide remote RPC JS APIs to the PaddlePaddle Serving component.
As to the opportunities and challenges for
web AI, we have similar opinions.
Opportunities include the vast front-end developer
community, which continues to grow and expand.
The web has a low barrier to developing and
deploying applications.
For the end user, it's also easy to experience
and share, thanks to cross-platform web runtime support.
And on-device computation provides privacy,
real-time performance, and offloaded, decentralized computation.
Challenges include high-performance computation
in web runtimes, as well as the diversity of web runtimes and cross-browser compatibility.
Paddle.js has three design principles.
The first is integration with the PaddlePaddle
ecosystem.
We fully utilize the PaddlePaddle model, toolchain,
and inference experience we have on other on-device platforms.
Paddle.js is a good start for entry-level
developers and also helps experienced PaddlePaddle developers easily migrate their work to the JS
environment.
The second is high performance.
It has an efficient WebGL backend for operator
and kernel implementation, and also efficient data IO.
The third is compatibility.
Although the web runtime is designed to be cross-platform,
we still face implementation differences across browsers and devices.
We use extensive unit tests to hide as much of this
inconsistency from developers as we can.
As to the overall architecture and APIs, the
architecture of Paddle.js is divided into two parts. The offline model converter optimizes
graphs by operator fusion and operator substitution, and converts the model from a binary
format into JSON.
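To make the converter's operator-fusion step concrete, here is a minimal sketch in plain JavaScript. The JSON op schema and the op names (`conv2d`, `batch_norm`, `conv2d_bn`) are illustrative assumptions, not the real Paddle.js model format:

```javascript
// Hypothetical sketch of an operator-fusion pass over a JSON op list.
// Assumes ops look like { type, inputs: [...], outputs: [...] } and are
// listed in topological order — both are assumptions for illustration.
function fuseConvBnOps(ops) {
  const fused = [];
  for (let i = 0; i < ops.length; i++) {
    const cur = ops[i];
    const next = ops[i + 1];
    // Fuse a conv2d immediately followed by a batch_norm that consumes
    // its output, so the runtime launches one kernel instead of two.
    if (cur.type === 'conv2d' && next && next.type === 'batch_norm'
        && next.inputs[0] === cur.outputs[0]) {
      fused.push({
        type: 'conv2d_bn',
        inputs: cur.inputs,
        outputs: next.outputs
      });
      i++; // skip the batch_norm we just absorbed
    } else {
      fused.push(cur);
    }
  }
  return fused;
}
```

Operator substitution works the same way at the graph level: rewrite a node (or a small pattern of nodes) into an equivalent form that the target backend executes more efficiently.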
And the PaddlePredictor consists of a loader and an
executor; the former is responsible for loading the model, and the latter computes the model graph.
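The loader/executor split can be sketched roughly as below. Class names, the in-memory model format, and the kernel registry are all illustrative assumptions; in the real framework the loader fetches the converted JSON plus weight chunks over the network, and the executor dispatches to GPU kernels:

```javascript
// Illustrative sketch of the predictor split, not the real Paddle.js API.
class Loader {
  constructor(modelJson) { this.modelJson = modelJson; }
  load() {
    // Real loader: fetch model.json and binary weight chunks.
    // Here we just parse an in-memory JSON string.
    return JSON.parse(this.modelJson);
  }
}

class Executor {
  // kernels: map from op type to a JS function (stand-in for GPU kernels)
  constructor(kernels) { this.kernels = kernels; }
  run(model, feed) {
    const vars = { ...feed };
    // Ops are assumed to be topologically sorted by the converter,
    // so a single forward pass over the list computes the graph.
    for (const op of model.ops) {
      const inputs = op.inputs.map(name => vars[name]);
      vars[op.outputs[0]] = this.kernels[op.type](...inputs);
    }
    return vars[model.fetch];
  }
}
```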
As to the performance part, Paddle.js is compatible
with different computation backends like WebGL.
We also have tentative experiments with new
JS standards like WebGPU and WebNN.
We optimize WebGL backend performance with
texture packing, which proves very useful, resulting in an almost 30% improvement on models
like MobileNetV2, across both mobile and desktop devices.
And we also have some performance improvements
in initialization cost and memory usage.
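To illustrate the texture-packing idea: instead of storing one float per RGBA texel (wasting three channels), a 2x2 block of the tensor can be packed into the four channels of a single texel, so the texture shrinks to half the width and half the height and each shader fetch reads four values at once. The layout below is one common packing scheme, shown as a CPU-side sketch; it is not necessarily the exact layout Paddle.js uses:

```javascript
// Pack a rows x cols float matrix into RGBA texels, 2x2 block per texel.
// Channel order R,G,B,A = (r,c), (r,c+1), (r+1,c), (r+1,c+1) — an
// illustrative convention, not a documented Paddle.js layout.
function packTexture(data, rows, cols) {
  const outRows = Math.ceil(rows / 2);
  const outCols = Math.ceil(cols / 2);
  const packed = new Float32Array(outRows * outCols * 4);
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const texel = (r >> 1) * outCols + (c >> 1);
      const channel = (r & 1) * 2 + (c & 1);
      packed[texel * 4 + channel] = data[r * cols + c];
    }
  }
  return { packed, outRows, outCols };
}
```

The win comes from fewer texture fetches per shader invocation and full use of the four-wide channels, which is where improvements of the reported magnitude typically come from.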
As to the compatibility part, Paddle.js supports
WebGL 1.0 and 2.0.
At runtime it is compatible with devices that
support the OES_texture_float extension.
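A capability check of this kind might look as follows. The function name is hypothetical; the underlying facts are standard WebGL behavior: WebGL 2.0 supports float textures natively, while WebGL 1.0 requires the `OES_texture_float` extension, queried via `getExtension`:

```javascript
// Hedged sketch of a float-texture capability check for a WebGL backend.
// `gl` is a WebGLRenderingContext (or any object exposing getExtension);
// taking it as a parameter keeps the check testable outside a browser.
function supportsFloatTextures(gl, isWebGL2) {
  if (isWebGL2) {
    // WebGL 2.0 supports float textures natively.
    return true;
  }
  // WebGL 1.0 needs the extension; getExtension returns null if absent.
  return gl.getExtension('OES_texture_float') !== null;
}
```

A framework would run checks like this during backend selection and fall back (for example, to half-float textures) on devices where the extension is missing.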
As to mobile device support, due to the lack
of 32-bit float texture support on almost all mobile GPUs, we may have precision loss, but we also
find that half-float quantization works efficiently in many situations.
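The precision loss from half floats can be demonstrated directly: IEEE 754 half precision keeps only a 10-bit mantissa, so round-tripping a float32 through float16 perturbs most values slightly. Below is a simplified conversion (mantissa truncated rather than rounded, subnormals flushed to zero) purely to illustrate the magnitude of the error:

```javascript
// Simplified float32 -> float16 -> float32 round trip, for illustration.
function f32ToF16Bits(val) {
  const f32 = new Float32Array(1);
  const u32 = new Uint32Array(f32.buffer);
  f32[0] = val;
  const x = u32[0];
  const sign = (x >>> 16) & 0x8000;
  const exp = (x >>> 23) & 0xff;
  const mant = x & 0x7fffff;
  if (exp === 0xff) return sign | 0x7c00 | (mant ? 1 : 0); // Inf / NaN
  const e = exp - 127 + 15; // rebias exponent for 5-bit f16 exponent
  if (e >= 31) return sign | 0x7c00;      // overflow -> Infinity
  if (e <= 0) return sign;                // underflow: flush to zero
  return sign | (e << 10) | (mant >> 13); // truncate mantissa to 10 bits
}

function f16BitsToF32(h) {
  const sign = (h & 0x8000) ? -1 : 1;
  const exp = (h >> 10) & 0x1f;
  const mant = h & 0x3ff;
  if (exp === 31) return mant ? NaN : sign * Infinity;
  if (exp === 0) return sign * mant * Math.pow(2, -24); // subnormal
  return sign * (1 + mant / 1024) * Math.pow(2, exp - 15);
}

const roundTripF16 = v => f16BitsToF32(f32ToF16Bits(v));
```

Values like 0.5 or 2.0 survive exactly, while something like 1/3 comes back off by roughly 1e-4; errors of that size accumulate through a network but often stay small enough for tasks like classification, which matches the observation that half-float quantization works well in many situations.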
Here is a use case for real-time gesture
recognition and tracking with Paddle.js.
The whole optimization process may include
operator fusion, workflow optimization, and a GPU backend implementation for some pre-processing
work.
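As a reference for what moving pre-processing to the GPU saves, here is the CPU-side equivalent of a typical pre-processing step for a camera pipeline like this: converting 0-255 RGBA pixels into normalized RGB floats. The mean/std values are illustrative defaults, not the model's actual parameters; on the GPU backend the same arithmetic would run in a shader:

```javascript
// CPU reference for camera-frame pre-processing: RGBA bytes -> normalized
// RGB floats. mean/std defaults are illustrative placeholders.
function normalizePixels(rgba, mean = [0.5, 0.5, 0.5], std = [0.5, 0.5, 0.5]) {
  const n = rgba.length / 4;
  const out = new Float32Array(n * 3);
  for (let i = 0; i < n; i++) {
    for (let c = 0; c < 3; c++) {
      // Scale byte to [0,1], then normalize; the alpha channel is dropped.
      out[i * 3 + c] = (rgba[i * 4 + c] / 255 - mean[c]) / std[c];
    }
  }
  return out;
}
```

For real-time tracking, doing this per frame on the CPU and uploading the result is a bottleneck; implementing it in the GPU backend keeps the data on the GPU for the whole pipeline.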
And the conclusion and future work.
Paddle.js is a high-performance JavaScript
deep learning framework for diverse web runtimes, which helps build the PaddlePaddle ecosystem's
web community.
The future work may include a general and
high-performance numerical computing programming model for web runtimes,
more toolchain and development framework support
for Paddle.js developers, and more innovation in new classes of web AI applications.