This document describes use cases and requirements for Danmaku.
This is still a work in progress. The proposal is being incubated in the W3C Bullet Chatting Community Group.
To create more immersive media experiences, many video-sharing platforms embed social mechanisms that allow users to share comments and view others' comments at specific points in a media timeline. One of these mechanisms is bullet chatting, also known as danmaku (弾幕) in Japanese, where a possibly large number of comments and annotations get rendered and animated as video overlays during playback (see ).
Bullet chatting was first introduced by the Japanese video-sharing website Niconico (ニコニコ). In China, besides its use on video-sharing websites such as Bilibili and AcFun, bullet chatting is also supported by the video players embedded in major video websites such as Tencent video, iQiyi video, Youku video and Migu video (see ).
A bullet chatting comment can be described with the following three attributes:
A bullet chatting experience has three characteristics:
There are four basic modes for a bullet chatting comment:
In addition, Bullet Chatting supports more advanced levels of customization, which are out of scope for this document.
This document uses the following terms:
During live streaming, two popular text chatting features are often provided to encourage user interaction: Chatroom and Bullet Chatting.
As the figure below shows, when many people are chatting in the same room during live streaming, messages scroll quickly. The display time of each message depends heavily on the chatroom's activity: the more active the users are, the less time each message is displayed.
Bullet Chatting tries to provide a better way to display each message when many people are chatting online at the same time.
Density of the Information
Compared to the chatroom, Bullet Chatting has a wider display area, which makes each message easier to read even at the same font size.
Update Frequency of the Information
In the Chatroom display mode, every message scrolls up at the same speed as the others, so it is very difficult to apply special handling to individual messages. In the Bullet Chatting mode, each message moves along its own path and is rarely affected by the update frequency of the other messages, so it is possible, and encouraged, to ensure a proper display time for each message algorithmically.
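One way to guarantee a display time per message can be sketched as follows. This is a minimal illustration, not a method defined by this document: it assumes right-to-left scrolling, a fixed display duration per comment, and a set of horizontal lanes. All names (`Lane`, `Placement`, `scrollSpeed`, `placeComment`) are hypothetical.

```typescript
// Hypothetical sketch: lane-based placement for right-to-left scrolling comments.
// None of these names come from a specification.

interface Lane {
  // Time (s) at which the last comment in this lane has fully entered the stage.
  tailEnteredAt: number;
  // Time (s) at which the last comment in this lane has fully left the stage.
  clearedAt: number;
}

interface Placement {
  lane: number;     // index of the chosen lane
  speed: number;    // pixels per second
  duration: number; // seconds the comment stays on screen
}

// A comment must travel the stage width plus its own width within `duration`,
// so longer comments scroll faster but every comment gets the same display time.
function scrollSpeed(stageWidth: number, commentWidth: number, duration: number): number {
  return (stageWidth + commentWidth) / duration;
}

// Returns a placement in the first lane where the new comment cannot overlap
// the previous one, or null if every lane is busy.
function placeComment(
  lanes: Lane[],
  now: number,
  stageWidth: number,
  commentWidth: number,
  duration: number
): Placement | null {
  const speed = scrollSpeed(stageWidth, commentWidth, duration);
  // Time at which the head of the new comment reaches the left edge.
  const headArrivesAt = now + stageWidth / speed;
  for (let i = 0; i < lanes.length; i++) {
    const lane = lanes[i];
    // Both comments move at constant speed, so checking the two end points
    // (entry at the right edge, exit at the left edge) rules out any overlap.
    if (lane.tailEnteredAt <= now && lane.clearedAt <= headArrivesAt) {
      lane.tailEnteredAt = now + commentWidth / speed;
      lane.clearedAt = now + duration;
      return { lane: i, speed, duration };
    }
  }
  return null;
}
```

When `placeComment` returns null, a player could delay or drop the comment, or add lanes; that policy is left open here.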
Movement of Users' Sight
In the Chatroom display mode, it's difficult for a user to concentrate on both the video content and the text comments. In the Bullet Chatting mode, the text comments are overlaid on the video content, so the user's sight doesn't need to move back and forth between the video and the text messages, which provides a more immersive experience.
In a lot of languages, text is read from left to right and from top to bottom, so many people are used to reading a single-line message horizontally. In the Bullet Chatting mode, text mostly moves from right to left, which allows people to read from left to right, so the user can grasp the meaning of a message very quickly.
In the Chatroom display mode, by contrast, even though the user would like to read from left to right, the text messages are actually moving from bottom to top. The reading direction and the direction of motion are mismatched, which makes it harder for the user to scan the comments.
From the perspective of social psychology, the feeling of participating in a group activity makes the user more cheerful than watching a video alone. By showing the text comments and the video content together in a comfortable way, Bullet Chatting can give the user a sense of participating in a group activity. Without looking away from the video content, the user is able to read others' comments on the ongoing clip or the upcoming clip, which helps to increase everyone's social presence.
When watching a video on a video website, viewers often have thoughts or reactions to the content that they want to share with more people. Bullet Chatting meets this demand. Comments made at the same point in the video can be displayed in the video area by scrolling in a fixed direction, or displayed statically at the top or bottom of the video area, which increases the interaction between the viewers and the video, and among the viewers themselves. Comments sent at the same point in the video generally share the same theme.
In this scenario, the Bullet Chatting data is mostly offline (not real-time) data, with a small amount of real-time data.
Bullet Chatting can also serve as direct interaction between the anchor and the viewers in the live streaming scenario. Compared with traditional real-time comments, the Bullet Chatting displayed on the screen lets the anchor understand the audience's needs and feedback more intuitively, adjust what to do next more conveniently, and interact in response to the viewers' input.
In this scenario, the Bullet Chatting data is generally real-time data.
Since a Bullet Chatting comment only appears at a specific point in the video, a large number of comments at a certain point in time indicates that this point is a highlight. It shows that the audience is interested in the event at this time point, and the data can also be used as a reference for video management, recommendation and other functions.
During video on demand or live streaming, there are moments when the user cares less about the picture content of the video itself than about the emotions stirred up at a certain point. At such moments, Bullet Chatting superimposed on the video can enhance the effect and provide a better experience. For example, it can render the atmosphere at a victory in a game or at the climax of a plot, or cover a frightening scene in a horror video to reduce fear.
Sometimes, to make a web page more engaging, the product operator wants to present the relevant content as a visually striking Bullet Chatting effect, so that the relevant activities can be promoted on the page, attracting the attention of young people and increasing revenue.
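The idea of reading highlights from comment density can be sketched as a simple histogram over the media timeline. This is only an illustration under stated assumptions: comment timestamps are non-negative seconds on the media timeline, and the bucket size is chosen by the application. The function names `commentDensity` and `busiestBucketStart` are hypothetical.

```typescript
// Hypothetical sketch: count comments per fixed-size time bucket and
// report the start time of the busiest bucket.

// Returns an array where index i holds the number of comments whose
// timestamp falls in [i * bucketSeconds, (i + 1) * bucketSeconds).
function commentDensity(timestamps: number[], bucketSeconds: number): number[] {
  const counts: number[] = [];
  for (const t of timestamps) {
    const index = Math.floor(t / bucketSeconds);
    counts[index] = (counts[index] || 0) + 1;
  }
  // Fill buckets that received no comments with zero.
  for (let i = 0; i < counts.length; i++) {
    if (counts[i] === undefined) counts[i] = 0;
  }
  return counts;
}

// Returns the start time (s) of the bucket with the most comments,
// or null when there are no comments at all.
function busiestBucketStart(counts: number[], bucketSeconds: number): number | null {
  let bestIndex = -1;
  let bestCount = 0;
  for (let i = 0; i < counts.length; i++) {
    if (counts[i] > bestCount) {
      bestCount = counts[i];
      bestIndex = i;
    }
  }
  return bestIndex < 0 ? null : bestIndex * bucketSeconds;
}
```

A real service would likely normalize by audience size or smooth across neighbouring buckets; the raw count is used here only to keep the sketch short.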
In this kind of scenario, the Bullet Chatting is displayed separately and is not attached to the video.
In this scenario, the user can send Bullet Chatting content to the display wall at an offline event. The wall can be a pure Bullet Chatting application, without any video or other content on the big screen. The wall only shows the discussions at the venue or online, and enhances the event atmosphere to give participants a stronger sense of participation.
In this case, computer vision and AI technologies can be used to analyze the video content and identify the previously defined "main content" of the video, generate a mask and distribute it to the client side. The browser uses CSS to render the bullet chatting without covering the defined "video body content". This kind of technique is called masking.
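As a rough illustration of the client-side half of masking, the sketch below builds CSS declarations that apply a mask image to the comment overlay layer. It assumes the mask arrives as an image URL whose transparent region covers the "main content", so the overlay is hidden there; how masks are generated and delivered is not specified here. The function names `overlayMaskStyle` and `toCssText` are hypothetical; `mask-image`, `mask-size` and `mask-repeat` are standard CSS masking properties.

```typescript
// Hypothetical sketch: CSS declarations that hide the comment overlay
// wherever the supplied mask image is transparent.

function overlayMaskStyle(maskUrl: string): Record<string, string> {
  // Escape quotes and backslashes so the URL stays inside the CSS string.
  const escaped = maskUrl.replace(/["\\]/g, "\\$&");
  const image = `url("${escaped}")`;
  return {
    // Prefixed form kept for engines that only support the -webkit- property.
    "-webkit-mask-image": image,
    "mask-image": image,
    // Stretch the mask over the whole overlay so it lines up with the video.
    "mask-size": "100% 100%",
    "mask-repeat": "no-repeat",
  };
}

// Serializes the declarations into a single CSS text string.
function toCssText(style: Record<string, string>): string {
  return Object.keys(style)
    .map((name) => `${name}: ${style[name]};`)
    .join(" ");
}
```

In a browser, the resulting string could be assigned to the overlay element's `style.cssText` each time a new mask arrives, assuming the overlay is a single element stacked above the video.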
Bullet chatting can contain non-text content such as emoji and images, to express the viewer's thoughts and opinions more vividly.
Please refer to Recommended API.
[[webvtt1]] is a file format intended for marking up external text track resources. One of its typical uses is to provide captions or subtitles for video content. Bullet Chatting is also often intended to provide text descriptions of video content, so it was once considered as a subset of WebVTT and a special format of subtitles.
However, in the Scenarios section, we can see that the usage of Bullet Chatting is not limited to videos. For example, it is also widely used for web page interaction and interactive walls, running independently as a part of the web page instead of a part of the video player.
If Bullet Chatting is designed as a subset of WebVTT, then it has to follow all the rules of WebVTT, and the Bullet Chatting messages have to be cued as a vtt file in the track element of a video element. However, in the live streaming interaction scenario, the Bullet Chatting messages come from comments submitted by users in real time, so it is impossible to prepare in advance a vtt file which contains all the Bullet Chatting messages. This real-time requirement also applies to the on-demand video interaction scenario.
From the perspective of usage scenarios, bullet chatting and WebVTT have different ways of interaction. WebVTT is used for captions or subtitles, and there is basically no special interaction: the browser only displays the textual expression of the content in a fixed time period on the video timeline. The content carried by bullet chatting is not only the textual expression of the content in the video, but also the viewer's subjective understanding of the video content. Some bullet chatting needs to be interactive. For example, a viewer who wants to look carefully at a fast-scrolling comment can hover over it to make it stop scrolling, or click on it to see more information. Therefore, there is a clear difference between bullet chatting and WebVTT when used interactively.
In addition, the presentation of bullet chatting and of WebVTT subtitles is also very different. The subtitles of WebVTT can only be displayed in a fixed position of the video, and only one cue can be displayed at a specific time point. Bullet chatting is more flexible: it can be displayed in a fixed position, but is more often scrolling. The length of a WebVTT cue display is limited, but bullet chatting often has much more content than WebVTT subtitles. Therefore, in terms of the amount of content carried, WebVTT cannot meet the requirements of bullet chatting, which is another clear difference between the two.
In summary, bullet chatting and WebVTT are somewhat similar in typical usage scenarios, but there is a big difference between the functions and implementation principles behind them. Therefore, when considering the standardization of the bullet chatting, it is not designed as a subset or extension of WebVTT.
Like WebVTT, [[ttml1]] is also a format for subtitling and captioning, so the comparison made in the above section applies here as well. TTML is mainly used for video content, while Bullet Chatting is a kind of dynamic, interactive presentation for comments, which is quite different. TTML describes timed text via XML. Although XML is more readable, web developers are more accustomed to using JSON to describe data structures in Bullet Chatting.
Bullet Chatting is widely used in China and Japan, and mainstream video sites and their apps support it well. The monthly activity of the relevant video websites is shown below (only counting monthly active users for video-sharing websites/apps):
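To illustrate what a JSON description of a comment might look like, the sketch below parses and validates one comment. The shape is purely an assumption for illustration: the field names `text`, `time` and `mode`, the mode values, and the function `parseComment` are hypothetical and are not defined by this document or by any existing format.

```typescript
// Hypothetical sketch: a JSON shape for a single bullet chatting comment.
// The modes listed are illustrative only, not the document's full list.
type DisplayMode = "scroll" | "top" | "bottom";

interface BulletComment {
  text: string;      // comment content
  time: number;      // position on the media timeline, in seconds
  mode: DisplayMode; // how the comment is displayed
}

// Parses a JSON string and returns a comment, or null if the input is not
// valid JSON or does not match the assumed shape.
function parseComment(json: string): BulletComment | null {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  const text = record.text;
  const time = record.time;
  const mode = record.mode;
  if (typeof text !== "string") return null;
  if (typeof time !== "number" || !isFinite(time) || time < 0) return null;
  if (mode !== "scroll" && mode !== "top" && mode !== "bottom") return null;
  return { text, time, mode };
}
```

Because each comment is a small self-contained object, comments arriving in real time could be parsed one by one as they are received, without preparing a complete file in advance.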
Source: 2019 Latest Mobile App TOP1000