This document describes use cases and requirements for Danmaku.

This is still a work in progress. The proposal is being incubated in the W3C Bullet Chatting Community Group.

Introduction

To create more immersive media experiences, many video-sharing platforms embed social mechanisms that allow users to share comments and view others' comments at specific points in a media timeline. One of these mechanisms is bullet chatting, also known as danmaku (弾幕) in Japanese, in which a potentially large number of comments and annotations are rendered and animated as video overlays during playback.

Bullet chatting was first introduced by the Japanese video-sharing website Niconico (ニコニコ). In China, besides its use on video-sharing websites such as Bilibili and AcFun, bullet chatting is also supported by the video players embedded in major video websites such as Tencent video, iQiyi video, Youku video and Migu video.

An example of bullet chatting

Bullet Chatting attributes

A bullet chatting comment can be described with the following three attributes:

Bullet Chatting characteristics

A Bullet Chatting experience has three characteristics:

Basic modes of the Bullet Chatting

There are four basic modes for a bullet chatting comment:

  1. Rolling Bullet Chatting: the bullet chat moves from right to left at a constant speed, stacked from top to bottom.
  2. Reverse Bullet Chatting: the bullet chat moves from left to right at a constant speed, stacked from top to bottom. This is the opposite of Rolling Bullet Chatting.
  3. Top Bullet Chatting: the bullet chat is static and horizontally centered, stacked from top to bottom.
  4. Bottom Bullet Chatting: the bullet chat is static and horizontally centered, stacked from bottom to top.

Bullet Chatting also supports more advanced customization, which is out of scope for this document.
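The four basic modes can be illustrated with a small sketch that computes the horizontal position of a comment over time. This is not part of any proposed API: the type and function names, the pixel and second units, and the assumption that a scrolling comment travels from fully off one side to fully off the other within a fixed duration are all illustrative.

```typescript
// The four basic modes described above; the string names are illustrative.
type BulletChatMode = "rolling" | "reverse" | "top" | "bottom";

// Hypothetical layout inputs, in CSS pixels and seconds.
interface LayoutInput {
  mode: BulletChatMode;
  stageWidth: number; // width of the display area
  textWidth: number;  // rendered width of the comment
  duration: number;   // how long the comment stays on screen
}

// Returns the x coordinate of the comment's left edge after `elapsed` seconds.
function leftEdgeAt(input: LayoutInput, elapsed: number): number {
  const { mode, stageWidth, textWidth, duration } = input;
  // Clamp progress to [0, 1] so times outside the display window are safe.
  const progress = Math.min(Math.max(elapsed / duration, 0), 1);
  // Assumed travel distance: from fully off one side to fully off the other.
  const distance = stageWidth + textWidth;
  switch (mode) {
    case "rolling":
      // Starts just beyond the right edge and ends just beyond the left edge.
      return stageWidth - distance * progress;
    case "reverse":
      // Starts just beyond the left edge and ends just beyond the right edge.
      return -textWidth + distance * progress;
    case "top":
    case "bottom":
      // Static modes stay horizontally centered for the whole duration.
      return (stageWidth - textWidth) / 2;
  }
}
```

The vertical position (the stacking order) is independent of this calculation: rolling, reverse and top comments would fill rows from the top down, and bottom comments from the bottom up.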

Terminology

This document uses the following terms:

Background

Analysis from the perspective of user experience

During live streaming, two popular text chatting features are often provided to encourage user interaction: Chatroom and Bullet Chatting.

an example of chatroom
Chatroom: as users enter text, the message list scrolls from bottom to top at a constant speed.
an example of Bullet Chatting
Bullet Chatting: as users enter text, each message appears as a single line of text at the right side of the video and moves from right to left along its own path.

Advantage of displaying with Bullet Chatting

The figure below shows that when many people chat in the same room during live streaming, messages scroll quickly and the display time of each message depends heavily on the chatroom's activity: the more active the users are, the less time each message is displayed.

The display time of a single message in a chatroom

Bullet Chatting tries to provide a better way to display each message when many people are chatting online at the same time.

  • Density of the Information

    Compared to the chatroom, Bullet Chatting has a wider display area, which makes a message easier to read even at the same font size.

    a figure of the display time of a single message in Bullet Chatting
  • Update Frequency of the Information

    In the Chatroom display mode, every message scrolls up at the same speed as the others, so it is very difficult to handle individual messages specially. In the Bullet Chatting mode, each message moves along its own path and is rarely affected by the update frequency of the other messages, so it is possible, and encouraged, to use algorithms to ensure a proper display time for each message.

  • Movement of Users' Sight

    In the Chatroom display mode, it is difficult for a user to concentrate on both the video content and the text comments. In the Bullet Chatting mode, the text comments are overlaid on the video content, so the user's eyes do not need to move back and forth between the video and the text messages, which provides a more immersive experience.

    a figure of the movement of users' sight in the Chatroom display mode
    a figure of the movement of users' sight in the Bullet Chatting display mode
  • Reading Habit

    In many languages, text is read from left to right and from top to bottom, so many people are used to reading a single-line message horizontally. In the Bullet Chatting mode, text mostly moves from right to left, which lets people read it from left to right, so the user can grasp the meaning of a message very quickly.

    a figure of the direction of reading in the Bullet Chatting display mode

    In the Chatroom display mode, by contrast, even though the user would like to read from left to right, the text messages move from bottom to top. The reading direction and the movement direction are therefore at an angle to each other, which makes it harder for the user to scan the comments.

    a figure of the direction of reading in the chatroom display mode
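The point above about ensuring a proper display time "by algorithms" can be illustrated with a lane-assignment sketch for rolling comments. It is only one possible approach and is not taken from any existing implementation: the `Lane` type, the function names, and the rule that every comment crosses the stage in the same fixed `duration` (so wider comments move faster) are assumptions made for this example.

```typescript
// Per-lane bookkeeping for the most recent rolling comment (times in seconds).
interface Lane {
  tailClearsAt: number; // when that comment's tail has fully entered the stage
  exitsAt: number;      // when that comment has fully left the stage
}

function createLanes(count: number): Lane[] {
  return Array.from({ length: count }, () => ({
    tailClearsAt: Number.NEGATIVE_INFINITY,
    exitsAt: Number.NEGATIVE_INFINITY,
  }));
}

// Picks the first lane in which a new comment overlaps nothing, records the
// comment in that lane, and returns the lane index. Returns -1 if no lane is
// free, in which case the caller could delay or drop the comment.
function assignLane(
  lanes: Lane[],
  now: number,
  textWidth: number,
  stageWidth: number,
  duration: number,
): number {
  // Assumption: the comment travels stageWidth + textWidth in `duration`.
  const speed = (stageWidth + textWidth) / duration;
  // Time at which the new comment's head reaches the left edge of the stage.
  const headReachesLeftAt = now + stageWidth / speed;
  const index = lanes.findIndex(
    (lane) =>
      // No overlap at entry: the previous tail is already inside the stage.
      now >= lane.tailClearsAt &&
      // No catching up: the previous comment is gone before the new head
      // reaches the left edge. Both move linearly, so these two checks suffice.
      headReachesLeftAt >= lane.exitsAt,
  );
  if (index !== -1) {
    lanes[index] = {
      tailClearsAt: now + textWidth / speed,
      exitsAt: now + duration,
    };
  }
  return index;
}
```

Because each comment keeps its own lane and its own speed, its time on screen stays equal to `duration` regardless of how many other comments arrive, which is the property the chatroom display lacks.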

Analysis of the Psychological Factors

From the perspective of social psychology, the feeling of participating in a group activity makes the user more cheerful than watching a video alone. By showing the text comments and the video content together in a comfortable way, Bullet Chatting can give the user a sense of participating in a group activity. Without taking their eyes off the video content, the user can read others' comments on the ongoing or upcoming clip, which helps to increase everyone's social presence.

Scenarios

This section describes specific usage scenarios of Bullet Chatting.

On-demand video interaction

When watching a video provided by a video site, viewers often have thoughts or points they want to voice about the content, and want to share them with more people. Bullet Chatting meets this need. With Bullet Chatting, the comments that viewers made at the same point in the video can be displayed in the video area, either scrolling in a fixed direction or shown statically at the top or bottom of the video area. This increases interaction both between the viewer and the video and among viewers. Bullet chats sent at the same point in the video generally share the same theme.

Example of on-demand video interaction

In this scenario, the Bullet Chatting data is mostly offline (not real-time) data, with a small amount of real-time data.

Requirement

  • From the user's point of view: the user needs to input the content of the bullet chat and set its presentation (including the font size, font color, whether it contains images, the area for displaying it, etc.), and the bullet chat should be displayed on the screen in real time after it is sent.
  • From the service provider's point of view: the bullet chat sent by the user needs to be displayed over the video. The bullet chat may have been sent by a user at an earlier time, or may be sent in real time. For real-time bullet chatting, the data sent in real time must be associated with the timeline of the video, the bullet chat data sent by the user must be saved in a database on the server, and the bullet chat should be synchronized to other users viewing the video at that moment. For bullet chats saved on the server, the developer needs to get the bullet chat data from the server and render it in the specified area. The bullet chats displayed on the web page and in the native app need to be consistent in content and timing.
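The requirement to associate bullet chats with the video timeline can be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed data format: the `BulletChat` shape, the function names, and the idea of polling a half-open time window on each playback tick are all hypothetical.

```typescript
// Hypothetical minimal record: a comment anchored to a point on the timeline.
interface BulletChat {
  time: number; // position on the video timeline, in seconds
  text: string;
}

// Returns the stored comments whose timeline position falls in [from, to).
// A player could call this on every playback tick with the previous and
// current playback positions to find the comments that should start now.
function dueBetween(
  chats: readonly BulletChat[],
  from: number,
  to: number,
): BulletChat[] {
  return chats.filter((chat) => chat.time >= from && chat.time < to);
}

// Associates a comment sent in real time with the current playback position
// and returns a new list that is still sorted by timeline position.
function addLive(
  chats: readonly BulletChat[],
  text: string,
  currentTime: number,
): BulletChat[] {
  const chat: BulletChat = { time: currentTime, text };
  const at = chats.findIndex((c) => c.time > currentTime);
  const index = at === -1 ? chats.length : at;
  return [...chats.slice(0, index), chat, ...chats.slice(index)];
}
```

In this sketch a comment sent in real time becomes ordinary stored data as soon as it is stamped with the playback position, so later viewers of the same video receive it through the same lookup.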

Live streaming interaction

Bullet Chatting can also serve as direct interaction between the anchor and the viewers in the live streaming scenario. Compared with traditional real-time comments, the display of Bullet Chatting on the screen lets the anchor understand the audience's needs and feedback more intuitively, adjust what to do next more conveniently, and interact with viewers based on their input.

Example of live streaming interaction: control the game being live streamed by sending bullet chatting commands to vote

In this scenario, the Bullet Chatting data is generally real-time data.

Requirement

  • From the user's point of view: the content of the bullet chat sent by the user should be displayed on the screen in real time.
  • From the service provider's point of view: the bullet chat sent by the user needs to be displayed in real time on both the anchor's and the viewers' screen.

Identify video highlights

Since a bullet chat only appears at a specific point in the video, a large number of bullet chats at a certain point in time indicates that this point is a highlight of the video. The audience is interested in the event at this point in time, and this information can also serve as reference data for video management, recommendation, and other functions.

Example of identifying video highlights: visualize the density or time distribution of bullet chatting to identify the highlights of the video

Requirement

  • From the user's point of view: users can enter the content of the bullet chat at any time, set the presentation style, and display it on the screen in real time after sending it.
  • From the service provider's point of view: when bullet chatting is enabled, a bullet chatting list needs to be provided to the user. The bullet chats sent by users need to be displayed on the video, and the number of bullet chats at each point on the video timeline needs to be visualized and displayed to the user.
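The density visualization described above needs a count of bullet chats per segment of the timeline. The sketch below shows one straightforward way to compute it; the function names and the fixed-size bucketing are assumptions made for illustration, not part of the requirements.

```typescript
// Counts bullet chats per fixed-size segment ("bucket") of the video timeline.
// `times` holds the timeline position of each bullet chat, in seconds.
function densityByBucket(
  times: readonly number[],
  videoDuration: number,
  bucketSeconds: number,
): number[] {
  const bucketCount = Math.ceil(videoDuration / bucketSeconds);
  if (bucketCount <= 0) return [];
  const counts = new Array<number>(bucketCount).fill(0);
  for (const t of times) {
    // Ignore positions outside the video.
    if (t < 0 || t > videoDuration) continue;
    // A comment exactly at the end of the video goes into the last bucket.
    const index = Math.min(Math.floor(t / bucketSeconds), bucketCount - 1);
    counts[index] = (counts[index] ?? 0) + 1;
  }
  return counts;
}

// Returns the index of the densest bucket (the first one in case of a tie),
// which is a candidate highlight of the video.
function peakBucket(counts: readonly number[]): number {
  return counts.reduce(
    (best, count, i) => (count > (counts[best] ?? 0) ? i : best),
    0,
  );
}
```

The resulting counts can be drawn as a curve or heat strip above the progress bar, and the peak bucket multiplied by the bucket size gives the approximate start time of the highlight.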

Video content enhancement

During video on demand or live streaming, there are moments when the user cares less about the picture content of the video itself than about the emotions stirred up at a certain point. At such moments, viewers can enhance the video by superimposing bullet chats on it to achieve a better experience. For example, bullet chats can heighten the atmosphere at a victory in a game or at the climax of a plot, or cover up a frightening scene in a horror movie to reduce fear.

Example of video content enhancement: screenshot of a horror scene in a horror movie, without bullet chatting
Example of video content enhancement: screenshot of a horror scene in a horror movie, with bullet chatting

Requirement

  • From the user's point of view: the user should be able to turn on or off the bullet chatting function.
  • From the service provider's point of view: when bullet chatting is enabled, a bullet chatting list needs to be provided to the user. Each bullet chat is displayed when playback reaches the point on the video timeline at which it was sent.

Interaction within a webpage

Sometimes, to make a webpage more engaging, the product operator wants to present relevant content as a visually striking Bullet Chatting effect, so that related activities can be promoted on the page. This increases the impact of the page, attracts the attention of young people, and increases revenue.

In this kind of scenario, the Bullet Chatting is displayed separately and is not attached to the video.

Example of interaction within a webpage: discussion on specific content in the webpage

Requirement

  • From the user's point of view: the user needs to input the content of the bullet chat and set its presentation (including the font size, font color, whether it contains images, the area for displaying it, etc.), and the bullet chat should be displayed on the screen in real time after it is sent.
  • From the service provider's point of view: the bullet chat may have been sent by a user at an earlier time, or may be sent in real time. For real-time bullet chatting, the bullet chat must be displayed in real time, the bullet chat data sent by the user must be saved in a database on the server, and the bullet chat should be synchronized to other users viewing the page at that moment. For bullet chats saved on the server, the developer needs to get the bullet chat data from the server and render it in the specified area.

Interactive wall

In this scenario, users can send bullet chats to a display wall at an offline (in-person) event. The wall can be a pure Bullet Chatting application, without any video or other content on the big screen. It shows only the discussions taking place at the venue or online, and enhances the atmosphere of the event so that participants feel a stronger sense of participation.

Example of interactive wall: in an event, the audience participates in the event by sending bullet chatting to the big screen.

Requirement

  • From the user's point of view: the user enters the content and presentation style of the bullet chat at the event venue through a device that can access the web page, and the bullet chat is displayed on the big screen in real time.
  • From the service provider's point of view: provide a page for sending bullet chat and display the received content on the big screen in the event in real time.

Masking

In this scenario, computer vision and AI technologies can be used to analyze the video content, identify the previously defined "main content" of the video, generate a mask, and distribute it to the client side. The browser then uses CSS to render the bullet chats without covering the defined "main content". This technique is called masking.

Masking
[To be translated]

Requirement

  • From the user's point of view: the user should be able to turn on or off the masking function.
  • From the service provider's point of view: computer vision and AI capabilities are needed to identify the "main content" and calculate the mask area, then return it to the client side to achieve the masking effect.
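One possible way to apply such a mask on the client side is the CSS `mask-image` property, set on the container that holds the bullet chats. The text above only says that CSS is used, so this choice is an assumption, as are the function name and the mask URL. The sketch also assumes the mask is an image that is transparent over the "main content" and opaque elsewhere, and that the URL contains no double quotes.

```typescript
// Builds CSS declarations for the bullet chat container so that bullet chats
// are hidden wherever the mask image is transparent.
// Assumption: the server-generated mask is transparent over the "main content".
function maskDeclarations(maskUrl: string): Record<string, string> {
  const image = `url("${maskUrl}")`;
  return {
    "mask-image": image,
    "-webkit-mask-image": image, // prefixed form for older WebKit-based browsers
    // Stretch the mask over the whole container, which is assumed to have
    // the same size as the video area.
    "mask-size": "100% 100%",
    "-webkit-mask-size": "100% 100%",
  };
}

// Serializes the declarations into a string suitable for a style attribute.
function toInlineStyle(declarations: Record<string, string>): string {
  return Object.keys(declarations)
    .map((property) => `${property}: ${declarations[property] ?? ""};`)
    .join(" ");
}
```

Since the "main content" moves as the video plays, a player following this approach would have to replace the mask as playback advances. Turning masking off, as required from the user's point of view, then amounts to removing these declarations.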

Non-text bullet chatting

Bullet chatting can contain non-text content such as emoji and images, to express the viewer's thoughts and opinions more vividly.

Bullet chatting containing images and emoji

Recommended API

Please refer to Recommended API.

A Gap Analysis of Bullet Chatting and Subtitles

Bullet Chatting vs WebVTT

[[webvtt1]] is a file format intended for marking up external text track resources. One of its typical uses is to provide captions or subtitles for video content. Bullet Chatting is also often intended to provide text descriptions of video content, so it was once considered a subset of WebVTT and a special format of subtitles.

However, the Scenarios section shows that the use of Bullet Chatting is not limited to videos. For example, it is also widely used for interaction within a webpage and for interactive walls, where it runs independently as part of the web page rather than as part of the video player.

If Bullet Chatting were designed as a subset of WebVTT, it would have to follow all the rules of WebVTT, and the Bullet Chatting messages would have to be delivered as cues in a vtt file referenced by the track element of a video element. However, in the live streaming interaction scenario, the Bullet Chatting messages come from comments submitted by users in real time, so it is impossible to prepare in advance a vtt file that contains all of them. This real-time requirement also applies to the on-demand video interaction scenario.

From the perspective of usage scenarios, bullet chatting and WebVTT involve different kinds of interaction. WebVTT is used for captions or subtitles, which basically involve no special interaction: the browser only displays a textual expression of the content for a fixed period on the video timeline. The content carried by bullet chatting is not only a textual expression of the content of the video, but also the viewer's subjective understanding of it. Some bullet chatting needs to be interactive. For example, a viewer who wants to read a fast-scrolling bullet chat carefully can hover over it to stop it scrolling, or click on it to see more information. Therefore, there is a clear difference between bullet chatting and WebVTT in terms of interaction.

In addition, the presentation of bullet chatting and of WebVTT subtitles is very different. WebVTT subtitles are typically displayed in a fixed position on the video, and usually only one cue is displayed at a given point in time. Bullet chatting is more flexible: it can be displayed in a fixed position, but more often it scrolls. The amount of text shown for a WebVTT cue is limited, whereas bullet chatting often carries much more content than WebVTT subtitles. Therefore, in terms of the amount of content carried, WebVTT cannot meet the requirements of bullet chatting, which is another clear difference between the two.

In summary, bullet chatting and WebVTT are somewhat similar in their typical usage scenarios, but the functions and implementation principles behind them differ greatly. Therefore, for the purposes of standardization, bullet chatting is not designed as a subset or extension of WebVTT.

Bullet Chatting vs TTML

As with WebVTT, [[ttml1]] is a format for subtitling and captioning, so much of the comparison in the section above also applies. TTML is mainly used for video content, while Bullet Chatting is a dynamic, interactive presentation of comments, which is quite different. TTML describes timed text using XML. Although XML is readable, web developers are more accustomed to using JSON to describe the data structures in Bullet Chatting.

Commercial operation of bullet chatting

Bullet Chatting is widely used in China and Japan, and mainstream video sites and their apps support it well. The monthly activity of the relevant video websites is as follows (counting only monthly active users of video-sharing websites and apps):

Source: 2019 Latest Mobile App TOP1000