Copyright © 2016 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document is addressed to people who want to develop Modality Components for Multimodal Applications distributed over a local network or "in the cloud". In a multimodal system implemented according to the Multimodal Architecture Specification and distributed over a network, the system must discover and register its Modality Components in order to configure the technical conditions needed for the interaction and to monitor and preserve the overall state of the distributed elements. Modality Components can then be composed with automation mechanisms in order to adapt the Application to the state of the surrounding environment.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the 11 April 2016 W3C Working Draft on "Discovery & Registration of Multimodal Modality Components: State Handling".
This draft was modified to enhance the contrast in graphics to meet WCAG requirements for color contrast. Fonts in graphics were increased in size, a longdesc was added, and support for keyboard interaction was included. We modified the state diagram, updated the normative part about states, edited the code examples and removed some informative text.
A diff-marked version of this document is also available for comparison purposes.
This W3C Working Draft has been developed by the Multimodal Interaction Working Group of the W3C Multimodal Interaction Activity.
This document was published by the Multimodal Interaction Working Group as a Working Draft. If you wish to make comments regarding this document, please send them to www-multimodal@w3.org (subscribe, archives). All comments are welcome and should have a subject starting with the prefix '[dis]'.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Sections in this document that are not marked as Normative are Informative.
This document is governed by the 1 September 2015 W3C Process Document.
To the best of our knowledge, there is no standardized way to build a web Application that can dynamically combine and control discovered components by querying a registry built from the multimodal types of the modalities and their states. This document addresses three needs in Discovery & Registration for this kind of web Application implemented following the Multimodal Architecture Specification.
First, we define a new component responsible for the management of the state of a Multimodal System, extending the control layer already defined in the Multimodal Architecture Specification (Table 1, col. 1). This component is responsible for handling the messages exchanged in order to declare the presence (or absence) of the Modality Components of the system.
Second, this document presents an adaptive push/pull mechanism, needed to inform the system about changes in the state of the Modality Components (Table 1, col. 2). These changes are not necessarily related to the interaction functional context itself, but they can affect it, for example, in the case of the unavailability of a given Modality Component.
And finally, to allow the advertisement of the state of the Modality Components by using the adaptive mechanism, two new events are needed (Table 1, col. 3). The semantics of these new events is not directly related to the interaction context but to the system's configuration; for this reason a new component responsible for the management of the state of the Multimodal System is needed.
Resources Handling | A new direction in the message flow | Events for system updates |
The state management through events and the pull mechanism must be supported by a dedicated component, responsible for the management of the state of the Modality Components in the Multimodal System. | An adaptive pull mechanism is needed to inform the system periodically of the availability of the Modality Components or of any other evolution in their state. | A new event and a new notification support the pull mechanism and the advertisement, registration, search and update of Modality Components' availability. |
In the current state of the Multimodal Architecture Specification, the events that are responsible for handling the control of the user-system interaction, like Prepare or Start, must be triggered only by the Interaction Manager and sent to the Modality Components. As a result, a Modality Component cannot send a StartRequest or a PrepareRequest to the Interaction Manager. In both cases the Modality Component depends on the Interaction Manager to begin the interaction cycle by raising an event, originated by an internal command or in reaction to a previous notification sent by a Modality Component (Figure 1).
A Modality Component may send a NewContextRequest to the Interaction Manager to request the creation of a new context of interaction. The interaction can be started independently by different Modality Components. Nevertheless, to start an interaction the Modality Component needs to be already part of the system and to be registered, given that a context represents a single extended interaction with one or more Modality Components.
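For illustration, such a request could be expressed as follows (a non-normative sketch using the standard MMI life-cycle event syntax; the URIs and the request identifier are placeholders):

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <!-- A Modality Component asks the Interaction Manager to create a new interaction context -->
  <mmi:newContextRequest source="URIForMC" target="URIForIM" requestID="request-0"/>
</mmi:mmi>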
This means that the Multimodal System has two complementary phases: the runtime phase (defined by the execution of one or multiple interaction cycles), and the system configuration phase (defined by the loading of components and their monitoring and adaptation in real-time).
The semantics of the NewContextRequest event is different and mostly oriented to the interaction phase, while the registering process is part of a previous phase, in which even the presence of the user is not mandatory. This phase is designed for a system that will handle one or more interaction processes at the same time.
In addition, in the current state of the MMI Recommendation, the Interaction Manager is supposed to know ahead of time the address and port of all the Modality Components available in the system. Consequently, the preparation of the media or the start of the interaction cycle also currently implies the setting up of a "multimodal session" that is not completely defined at the current stage of the specification.
The Extension and Status notifications are dedicated to the exchange of interaction data, while the data exchanged in a discovery process mostly precedes any interaction between the user and the system. During the configuration phase (or a reconfiguration following a change of the overall state), the system prepares and registers the information about the Modality Components (availability, technical characteristics, cost). In this way, all this information can be used later, when the user-system interaction actually takes place.
In other terms, the semantics of the two existing notifications differ from the features needed for discovery. The communication protocol paradigm (the flow of messages always initiated by the Interaction Manager) is not sufficient if the Recommendation is used to address use cases evolving in dynamic environments, as described in use cases like [UC 2.1] Personal Externalized Interfaces: Smart Cars, [UC 3.1] Public Spaces: Interactive Spaces or [UC 3.2] Public Spaces: In-Office Events Assistance, MMI Use Cases, or some of the use cases described in our current charter.
In all these cases, the Modality Components enter and quit the multimodal system dynamically, and they must declare to the system their existence, availability and capabilities in some way:
In the first case, [UC 2.1] Personal Externalized Interfaces: Smart Cars, the Modality Components provided by a smartphone must be detected by the multimodal system to relate these features to the features provided by the Modality Components in the car.
In the second case, [UC 3.1] Public Spaces: Interactive Spaces, the discovery of the Modality Components installed on the client's smartphone can affect the behavior of the multimodal application in the public space.
In the third case, [UC 3.2] Public Spaces: In-Office Events Assistance, the announcement and discovery of the Modality Component capabilities in a smart conference room can allow the attendees to access some of the multimodal services provided by the conference room, providing a fine-grained adaptation of the application features to the state of the multimodal interaction environment.
For all these reasons the current document addresses the need for discovery and registration support in highly dynamic environments, like the ones described above, by proposing a Resources Manager, a new flow of messages and two events specifically designed to carry discovery and registration data.
A Modality Component discovery protocol needs a mechanism that traces the relevant session data to be handled in the control layer. This is the first responsibility of a Resources Manager. This manager is responsible for handling the evolution of the "multimodal session" [See: Functions of Session Component in W3C Multimodal Interaction Framework] and the modifications in any of the participants of the system that could affect its global state. This component is also aware of the system's capabilities, like the address of modalities, their availability or their processing state.
The inclusion of the Resources Manager responds to the functional requirement concerning the management of the interaction cycles locally and globally, the requirement of an appropriate real-time sensing for dynamic uses; and, partially, to the requirement concerning the support of processing of dynamic and incomplete data. [See: MMI Framework requirements]
The Resources Manager is nested in the control layer of the multimodal system (turquoise in Figure 2), which is slightly different from the proposal of a Session Component described in the W3C Multimodal Interaction Framework.
In the MVC model (Figure 3), the Controller translates the user's actions into method calls on the Model. The Model broadcasts a notification to the View and to the Controller to inform that its state has changed. The View queries the Model to determine the exact change. Upon reception of the response, the View updates the display according to the information received. Thus, in the MVC pattern, the View is directly linked with its controller, but it can also query and communicate with the Model.
In this pattern, the Model offers a registration mechanism so that multiple Views and Controllers can express their interest in the Model through anonymous callbacks. This allows an easy implementation of multiple renderings of the same domain concepts, either on one local device or across multiple distributed devices.
The Resources Manager described in the current document allows the management of the states of the Modality Component (which represents the MVC View in the MMI Architecture), putting this function in the control layer (dark gray in Figure 4).
The Resources Manager translates the user's actions into method calls on the Data Component, as the MVC pattern proposes. While the Interaction Manager handles the user interaction, the Resources Manager takes care of the state of the system, the type and availability of the Modality Components, and the state of the multimodal session.
The Modality Component's communication and request of state information is restricted to exchanges with the control layer, as the MMI Recommendation defines. The Model broadcasts a notification to the Resources Manager (Figure 4), and then the Resources Manager informs the Modality Component that the state has changed using a flow of messages through an UpdateNotification or a CheckUpdateResponse. Upon reception of the UpdateNotification or the CheckUpdateResponse, the Modality Component updates the user interface according to the information received.
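For example, a notification from the Resources Manager to a Modality Component signaling a user interface change might look like the following (a non-normative sketch using the placeholder URIs of the later examples; the InterfaceUpdate type is defined in the normative sections below, and the State value shown here, assumed to mirror the target component's registered state, is purely illustrative):

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <!-- The Resources Manager informs a Modality Component that the shared user interface changed -->
  <mmi:UpdateNotification mmi:Source="URIForRM" mmi:Target="URIForMC"
      mmi:RequestID="request-1" mmi:UpdateType="INTERFACEUPDATE" mmi:State="AVAILABLE">
    <mmi:Timeout sleep="1000" validity="360000" interval="500"/>
  </mmi:UpdateNotification>
</mmi:mmi>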
Thus, the Resources Manager delivers information about the state and the resources of the multimodal system during and outside the interaction cycle. Some of its responsibilities can be:
The Resources Manager can also process and serialize in data structures the traces of external and internal phenomena. Depending on the complexity of the implementation, the application can store in the Data Component:
Requirements | |
Distribution | The Resources Manager supports the coordination between distributed components, and their communication through the control layer. This enables it to synchronize the input constraints across modalities [MMI-I16] and also enhances the resolution of input conflicts from distributed modalities [MMI-I17]. |
Advertisement | The Resources Manager is the starting point to declare and process the advertised announcements and to keep them up to date. |
Discovery | The Resources Manager is also the core support for mediated and passive discovery, and it can also be used to trigger active discovery using the push mechanism or to execute some of the tasks of fixed discovery. |
Registration | The Resources Manager is also the interface that can be requested to register the Modality Component's information. It handles all the communication between the Modality Components and the registry handled by the Data Component, and it manages the multiple renderings of private and public data related to the state of the multimodal system or the state of the interaction cycle. |
Querying | The flow of queries transits through the Resources Manager, which dispatches the requests to the Data Component and notifies the Interaction Manager if needed. Some of these queries must be produced using the state handling events proposed in this document. |
According to the current MMI life-cycle events protocol, the command of Modality Components is initiated by the Interaction Manager, which means that if there is an HTTP client-server implementation, it can be designed following a push notification technique.
In the communication protocol designed for the MMI life-cycle events, the direction of the message flow (mostly from the Interaction Manager to the Modality Components) is suggested by the specification through the description of the control events, even if the specific communication mechanism is not currently described in detail in the normative section and is, for the moment, implementation dependent.
This document describes the flow of messages in both directions, which is needed for the Discovery & Registration of Modality Components. With this proposal, the MMI architecture will respond more accurately to architectural requirements like completeness, extensibility, integratability and interoperability concerning the relations allowed between requesters and providers of messages.
Our intention is to allow multimodal developers the use of a communication flow initiated by Modality Components arriving dynamically in the system: an extension that authorizes the Modality Component client to request or provide new data from the server, using, for example, form submissions or AJAX-based technologies with the XMLHttpRequest object.
With this mechanism the change in the state of the multimodal session (i.e. the dynamic inclusion of new distributed modalities) is instigated from the Modality Component itself.
After a certain period, the Modality Component's client sends a request to the Resources Manager (e.g. on a server), which notifies the Modality Component about changes in the user interface displayed with other distant components or in the data related to the overall state of the system, eventually causing the Modality Component's state to evolve, for example, by putting it on stand-by. The connection is closed after each transfer and the Modality Component is told when to open a new connection, and what data to fetch when it does so.
The inclusion of this new direction in the flow of messages is the best option for tightly coupled clients to which the Resources Manager has reliable access.
Nevertheless, adding a new direction in the message flow can raise issues related to the risk of high network traffic reducing the overall performance.
In a distributed multimodal system, Modality Components can be idle for a long time if no interaction happens or the situation is not optimal for a specific type of interaction. Given that the data rate is very low during this period, it is not necessary to keep the client requesting all the time.
To fine-tune the Modality Component's requests, we propose a new attribute: the timeout attribute. The sleep value of this attribute can reduce the requesting time by putting the client (e.g. a Modality Component using recognition services) into a periodic sleep state. This makes it possible to control the frequency of the requests used to update the state data in the Modality Component.
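For instance, a Resources Manager that expects no changes for a while could answer a check with a long sleep period, postponing the next request (a non-normative sketch; the Timeout element is defined normatively below and all values are illustrative):

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:CheckUpdateResponse mmi:Source="URIForRM" mmi:Target="URIForMC"
      mmi:RequestID="request-1" mmi:State="IDLE" mmi:UpdateType="MONITORING"
      mmi:AutomaticUpdate="false">
    <!-- the client sleeps 60 seconds before sending its next CheckUpdateRequest -->
    <mmi:Timeout sleep="60000" validity="360000" interval="500"/>
  </mmi:CheckUpdateResponse>
</mmi:mmi>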
Requirements | |
Distribution | Modality Components can be distributed in a centralized way, a hybrid way or a fully decentralized way in order to support distributed processing [MMI-A14], [MMI-A15] and distributed input/output synchronization [MMI-A13]. Given the number of devices that could be used, a more flexible way to recognize and include the device in the multimodal system's registry requires adding a new direction in the flow of messages to allow an announcement of modalities coming from the device every time an important change occurs. This reduces the number of permanent connections, and allows a more pertinent monitoring of the availability and changes of Modality Components at the session level [MMI-A6]. |
Advertisement | With a pull mechanism, the unique identifier of the Modality Component, its name, address, port number, its embedded services, constructor, version and lifetime can be announced when important changes affecting this information occur. This proactive updating of information facilitates the management of scalable multimodal systems across wide ranges of devices, and supports the application's adaptability [MMI-G2] and the coordination capabilities of the multimodal session [MMI-I8]. It also supports the announcement of evolution in the user profile or user preferences [MMI-G13] - [MMI-G14]. A new direction in the flow of messages also supports the extensibility of the system, through the active announcement of the new modalities or new devices and capabilities to be dynamically added [MMI-I12] - [MMI-O8]. This implies the management of external input events during the announcement process [MMI-A16]. |
Discovery | This new direction in the flow of messages facilitates the mediated and passive discovery of Modality Components. Functions can be partitioned and distributed across several servers or devices that periodically notify their availability and general state [MMI-C1], [MMI-C2]. It also facilitates deployments using mobile networks, preventing bandwidth limitations and delays, because the embedded Modality Component itself can announce and update its current state [MMI-R1], [MMI-R2]. |
Registration | Using a new direction in the flow of messages, the updates to the register are triggered by changes dynamically declared by the Modality Component itself, without the need for a persistent connection to update data that is not very frequently modified. This also helps in the registration of high-level information used to specify the preconditions and effects produced by the addition of this new Modality Component to the system or its unavailability [MMI-G15]. It also supports the registering of other information that does not change very often, like the semantics of some kinds of inputs, or any specification of the meaning of the embedded modalities implemented in the Modality Component to be registered [MMI-I13]. |
Querying | To enable information gathering in a multimodal system, the simplest strategy is to have all Modality Components provide a continuous stream of all the data that they gather to the Interaction Manager. However, for many types of applications where only a small subset of the collected information is likely to be useful, updated or pertinent, this simple approach can become very inefficient. For this reason, a tunable communication strategy offers significant advantages for optimizing querying. |
With these mechanisms of communication a Modality Component can register its services for a specific period of time. This is the basis for the handling of the Modality Component's state. Every Modality Component can have a life-time that begins at discovery and ends at a date provided at registration. If the Modality Component does not re-register the service before its lifetime expires, the Modality Component's index entry is purged. This depends on the parameters given by the Application logic, the distribution of the Modality Components or the context of interaction.
When the lifetime has no end, the Modality Component is part of the multimodal system indefinitely. In contrast, in more dynamic environments, a limited life-time can be associated with the Modality Component, and if it is not renewed before expiration, the Modality Component will be assumed to no longer be part of the multimodal system. Thus, by the use of this kind of registering, the multimodal system can implement a procedure to confirm its global state and update the "inventory" of the components that could eventually participate in the interaction cycle. Therefore, registering involves some Modality Component timeout information, which can always be exchanged between components and, in the case of a dynamic environment, can be updated from time to time.
For this reason, a registration renewal mechanism is needed. We define a renewal mechanism based on the use of the timeout attribute and two new events: the CheckUpdate event and the UpdateNotification, used in conjunction with an automatic process that ensures periodic requests.
The CheckUpdate event provides a mechanism to request and check the state of the system's components; the UpdateNotification provides a mechanism to announce changes in that state.
A dedicated data structure is defined for registration: the timeout attribute. A timeout is an ordered list of three elements: the communication sleep period, the communication life-time and the communication interval, described below.
Each Modality Component can sleep for some time, and then wake up and check whether changes are planned on the system's side (by requesting the component responsible for the management of the system states). While sleeping, the client turns off CheckUpdate requests, and sets a timer to awake itself later.
The sleep value is calculated by the Resources Manager (on the server side, for example) based on the context-awareness level of the multimodal system. It can be static and defined with a set of basic rules or more dynamic, linked to the semantic analysis of the environment.
The second element of the timeout tuple is the communication life-time. A Modality Component leaves the multimodal system when its life-time is exceeded and needs to restart its registering mechanism to obtain a new Modality Component ID and timeout pace. This supports periodic updates of the availability of the Component (e.g. authorization) or the renewal of its metadata (See Figure 6).
The third element is the communication interval, which is modulated according to the multimodal system's needs by a set of static rules or by a prediction mechanism used in the state handler Component. This element informs the Modality Component beforehand about the frequency of requests that can be allowed by the recipient component (a Resources Manager in a server, for example) in the current conditions. This value is exchanged on each request, which means that it can be changed at any moment in the multimodal session.
The communication intervals will be synchronized, because the Modality Component knows the exact publish interval beforehand according to a time pattern. In this way, data coherence is ensured and network performance is maintained. Since the Resources Manager has access to all the state data, it can, for example, use a prediction algorithm implemented in the Data Component to foresee a time when the data is going to change. The Resources Manager then attaches this time value in the timeout triplet to the outgoing data, enabling data synchronization.
Finally, if the Resources Manager's prediction is wrong and a change still occurs in the data, and if the Resources Manager knows the address of the Modality Component, it can push the change to it, using the original push technique proposed here. In this case the push command is handled as an interruption of the default pull update mechanism. In this way, the system maintains its reliability.
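As a concrete illustration of this renewal mechanism, a Modality Component whose communication life-time is about to expire could re-register with a request like the following (a non-normative sketch; the URIs are placeholders and Monitoring is one of the application-specific UpdateType values listed below):

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <!-- Renewal request sent before the communication validity period expires -->
  <mmi:CheckUpdateRequest mmi:Source="URIForMC" mmi:Target="URIForRM"
      mmi:RequestID="request-2" mmi:State="AVAILABLE" mmi:UpdateType="MONITORING">
    <mmi:Timeout sleep="1000" validity="360000" interval="500"/>
  </mmi:CheckUpdateRequest>
</mmi:mmi>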
This section is Normative.
The CheckUpdate events are the request/response pair, CheckUpdateRequest and CheckUpdateResponse. CheckUpdateRequest and CheckUpdateResponse are used to check if there are any changes in the system. They share the Context, Source, Target, RequestID and Data fields with MMI Life Cycle Events. A CheckUpdate event MUST include Source, Target and RequestID. It MAY include a Data field. It MAY also include a Context field, if the event pertains to a specific context.
In addition, both CheckUpdate events MUST include the additional fields UpdateType, State, and Timeout. The CheckUpdateResponse MUST also include the field "AutomaticUpdate". The CheckUpdate events can be sent from either the Modality Component to the Resources Manager or from the Resources Manager to the Modality Components.
An attribute that MUST indicate the type of check to be performed. Some values can be: Handshake, Monitoring, Reporting, DataCheck, Resuming, Leaving. These values are application specific.
An attribute that MUST indicate the state of the requesting component. Its value MUST be one of: Alive, Loading, Registering, Available, Idle, Busy Waiting, Processing, Unavailable, Unregistered. (See Figure 7)
A Modality Component MUST be in Alive state when it is already started and ready to be identified and registered on the multimodal system.
A Modality Component MAY be in Loading state if it is currently loading resources that it will need to function.
A Modality Component MAY be in Registering state when it has already requested a registration id in the multimodal system through the Resources Manager.
A Modality Component MUST be in Available state when it is already registered, ready to function and not busy.
A Modality Component MAY be in Idle state when it is already registered, functioning and waiting for a user input.
A Modality Component MAY be in Busy Waiting state when it is already registered, functioning and waiting for a system's event or a system's response.
A Modality Component MUST be in Processing state when it is already registered and processing some task. The process could be any multimodal or unimodal task like transferring, searching, recognizing or any other kind of process. The processing state is related to a given multimodal session (the same Modality Component can handle multiple tasks in parallel from different users and sessions).
A Modality Component MUST be in Unregistered state if the system's rules command the unregistration and the Modality Component is no longer authorized to interact with the system (for example if it has to update its access credentials).
A Modality Component MUST be in Unavailable state when it has a failure, or when it is unregistered and does not update its registration, or when it lacks resources or must reload them. In short, when the Modality Component is no longer able to correctly perform its task.
The following list shows the flow between these nine states:
The component MUST pass from the ALIVE state to Loading, Registering or Available state.
The component MUST pass from the LOADING state to Registering or Available state.
The component MUST pass from the REGISTERING state only to Available state.
The component MUST pass from the AVAILABLE state to Idle, Busy Waiting or Unavailable state.
The component MUST pass from the IDLE state to Busy Waiting, Processing or Unavailable state.
The component MUST pass from the BUSY WAITING state to Processing, Idle, Unregistered or Unavailable state.
The component MUST pass from the PROCESSING state to Processing, Idle or Unavailable state.
The component MUST pass from the UNREGISTERED state only to Unavailable state.
Some examples of this flow between the states are:
- Unauthorized Component
ALIVE | AVAILABLE | BUSY WAITING | UNREGISTERED | UNAVAILABLE |
First the Modality Component announces that it is ALIVE and declares its AVAILABLE state to the system. After sending this announcement, the Modality Component enters the BUSY WAITING state, waiting for a response from the system. The system does not allow the Modality Component to continue joining the system (it is no longer authorized to join the system), so the Component passes to an UNREGISTERED state, and becomes UNAVAILABLE. |
- Failure of a Registered Component
ALIVE | REGISTERING | AVAILABLE | IDLE | PROCESSING | UNAVAILABLE |
The Modality Component is ALIVE and announces its address and port to the Resources Manager, which registers this data, allowing the component to pass to the REGISTERING state. The Modality Component then passes to the AVAILABLE state. When the system is ready to interact with a user, the Modality Component passes to the IDLE state, waiting for a user action. If the user interacts with the Modality Component it passes to a PROCESSING state, but then the current process fails and the component becomes UNAVAILABLE. |
- Unavailability of a Registered Component
ALIVE | REGISTERING | AVAILABLE | IDLE | PROCESSING | BUSY WAITING | UNAVAILABLE |
The Modality Component is ALIVE and announces its address and port. It is allowed to pass to the REGISTERING state. The Modality Component then passes to the AVAILABLE state. When the system is ready to interact with a user, the Modality Component passes to the IDLE state, waiting for a user action. When the user interacts with the Modality Component, it passes to a PROCESSING state. The process needs an exchange with another component of the system, so it waits for a response. After a certain time with no response (or after a response making it impossible to continue the process), the current process fails and the component becomes UNAVAILABLE. |
- Registration of a Component needing multimodal resources
ALIVE | LOADING | REGISTERING | AVAILABLE | IDLE | PROCESSING | BUSY WAITING | PROCESSING | IDLE |
The Modality Component is ALIVE. It needs to load some resources, passing to the LOADING state. Then the Modality Component announces its address, port and resources and is allowed to pass to the REGISTERING state. The Modality Component then passes to the AVAILABLE state. When the system is ready to interact with a user, the Modality Component passes to the IDLE state, waiting for a user action. When the user interacts with the Modality Component, it passes to a PROCESSING state. The process communicates with another component of the system, and receives a response. The process ends and the Modality Component returns to its IDLE state to wait for another user interaction. |
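For instance, the final transition to UNAVAILABLE in the failure example above could be advertised with an UpdateNotification like the following (a non-normative sketch; this event is defined normatively later in this document and the URIs are placeholders):

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <!-- The Modality Component reports that it is no longer able to perform its task -->
  <mmi:UpdateNotification mmi:Source="URIForMC" mmi:Target="URIForRM"
      mmi:RequestID="request-3" mmi:UpdateType="REPORTING" mmi:State="UNAVAILABLE">
    <mmi:Timeout sleep="1000" validity="360000" interval="200"/>
  </mmi:UpdateNotification>
</mmi:mmi>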
A boolean-valued attribute indicating whether the state of the Modality Component will be automatically updated by UpdateNotification events, that is, whether the Modality Component will keep sending UpdateNotification events in the future without waiting for another CheckUpdateRequest event. If the Resources Manager is temporarily unavailable, the Modality Component will continue to send messages according to the interval defined by the last timeout information received.
An element with a src attribute to link to external complementary metadata and an info attribute for inline data. The metadata carries non-functional information complementary to the Data field, which carries functional information.
An element used to temporize the exchanges between components. The values of this element are defined by the Resources Manager. These values can be changed by a Modality Component if the Modality Component arrives in a state that makes it impossible to preserve the pace of communication (i.e. error, failure, unavailability). This element MUST include three attributes: a sleep attribute to define the "communication sleep period", a validity attribute to represent the "communication validity period" in milliseconds, and an interval attribute to express the "communication interval" in milliseconds. Example:
<mmi:Timeout sleep="1000" validity="5000" interval="500"/>
<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:CheckUpdateRequest mmi:Source="URIForMC" mmi:Target="URIForRM"
      mmi:RequestID="request-1" mmi:State="LOADING" mmi:UpdateType="HANDSHAKE"
      mmi:AutomaticUpdate="true">
    <mmi:metadata src="URIForMetadata"
        info="{medium:{acoustic}, modality:{acoustic:SPEECH}}" />
    <mmi:Timeout sleep="0" validity="500" interval="500"/>
  </mmi:CheckUpdateRequest>
</mmi:mmi>
<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:CheckUpdateResponse mmi:Source="URIForRM" mmi:Target="URIForMC"
      mmi:RequestID="request-1" mmi:State="REGISTERED" mmi:UpdateType="HANDSHAKE"
      mmi:AutomaticUpdate="true" mmi:data="MCRegistrationID Data">
    <mmi:Timeout sleep="1000" validity="360000" interval="500"/>
  </mmi:CheckUpdateResponse>
</mmi:mmi>
This section is Normative.
The UpdateNotification event informs other system components (periodically or not) about changes in the state of a Component. If automatic updates are enabled, the Component may send multiple UpdateNotification messages after a single CheckUpdateRequest message. It shares the Context, Source, Target, RequestID and Data fields with the MMI Life Cycle Events. An UpdateNotification event MUST include Source, Target, and RequestID. It MAY include a Data field. It MAY also include a Context field, if the notification pertains to a specific context.
In addition, an UpdateNotification MUST include the additional fields UpdateType, State, and Timeout. The UpdateNotification event can be sent from either the Modality Component to the Resources Manager or from the Resources Manager to the Modality Components.
An attribute that MUST indicate the type of update being notified. Some values can be: Reporting, in the case of an important change to the Modality Component that needs to be reported to the Resources Manager, like a noisy situation in some audio capture, for example. An update notification can also be triggered when the Modality Component uses or produces new data: in this case the UpdateType can be DataUpdate. Finally, a Modality Component can need to inform other components about some user interface changes, for example when the loading of some data is finished and this affects the user interface display. In this case the UpdateType will be InterfaceUpdate.
An attribute that MUST indicate the state of the notifying component. Its values correspond to the values supported by the CheckUpdate event: Alive, Loading, Registering, Available, Idle, Busy Waiting, Processing, Unavailable, Unregistered.
An element used to indicate the pace of the notification process when automatic updates are enabled. This element MUST include three attributes. It MUST include a sleep attribute to define the "communication sleep period", a validity attribute to represent the "communication validity period" in milliseconds and an interval attribute to express the "communication interval" in milliseconds.
<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:UpdateNotification mmi:Source="URIForMC" mmi:Target="URIForRM"
      mmi:RequestID="request-1" mmi:UpdateType="REPORTING" mmi:State="BUSY WAITING">
    <mmi:Timeout sleep="1000" validity="360000" interval="200"/>
  </mmi:UpdateNotification>
</mmi:mmi>
Requirements | |
Distribution | For notification of failures, progress or delays in distributed processing [MMI-A14], the UpdateNotification ensures periodic requests informing other components if any change occurs in the Modality Component's state. This can support, for example, grammar updates or image recognition updates for a subset of differential data (the general recognized image is the same but one little part of the image has changed, e.g. the face is the same but there is a smile). On the other hand, if a Modality Component is waiting for some processing provided by another distributed component, the CheckUpdate event allows the recovery of progress information and the fine-tuning of requests by changing the timeout attribute. This enhances input/output synchronization in distributed environments [MMI-A13]. |
Advertisement | The use of the timeout attribute helps in the management of the validity of the advertised data. If a Modality Component's communication is out of date, the system can infer that the data risks being inaccurate or invalid. |
Discovery | The UpdateNotification and the CheckUpdate event support mediated and passive discovery of Modality Components, by allowing servers or devices to announce their capabilities at bootstrapping and periodically notify or check availability and session state changes [MMI-C1]. |
Registration | The UpdateNotification and the CheckUpdate event, tuned by a timeout mechanism for pull requests, allow the dynamic registration and update of the information about the capabilities of the Modality Component [MMI-G2] or the user preferences [MMI-G13] and profile [MMI-G14] collected on the device. |
Querying | The CheckUpdate event allows the recovery of a small subset of the information provided by the Interaction Manager or the Data Component, to keep the data up to date in the Modality Components as well as in the Data Component. |
This proposal is designed to support the annotation of Modality Components, to allow their discovery and registration in a multimodal system. The focus is the dynamic discovery of Modality Components as services using generic information about the underlying properties and types of processes. This information is provided by an announcement and a description (a capabilities manifest, for example) advertised on some network. In this document we will illustrate this point with an example of a multimodal greeting service in a smart environment.
The Modality Components can be described with a document whose complexity varies depending on the application needs. This description can be limited to indications about the input and output interfaces, or be more detailed, describing functional and non-functional properties inspired by some of the Extensible Multimodal Annotation Markup Language (EMMA) properties [W3C-EMMA 2009] like emma:function, emma:media-type, emma:medium and emma:mode.
The meaning of the terms for a controlled vocabulary, in the form of a Glossary for the annotation of Modality Components, is divided into two parts (Figure 2): subsumption terms and behavior terms.
Subsumption concerns the attributes classifying the Modality Components. It is structured with metadata classifying the Modality Component according to its membership or association with a Multimodal class in conformance with the modes handled by the System. This first description allows discovery filtering for a precise target mode. There are four properties:
Based on this term, the Modality Component's capabilities can be classified from a high-level perspective; for example, we can infer that the first component is part of the device class "TEXT_DISPLAYS", and the second of the class "MEDIA_CONTROLLERS". The triplet is inspired by the intentional name schema [ADJIE-1999] and shows hierarchical tree relationships between general concepts (including some negative differentiating aspects). These names are intentional; they describe the intent of the Modality Component and its implementation in the form of a tuple of attributes.
The functions are the technical entities supporting a limited number of modalities according to the semantics of the message and the capabilities of the support itself. A Modality Component acts as a complex set of functions. Each function uses one or more modalities that realize some mode. For example, in Figure 2 the Avatar uses a 3D mesh modality through a visual mode. The functions term defines a list of functions used by the service, ordered by importance and by mode. For example, a gesture recognizer service uses the sign language function, using the single hand gesture modality that is executed in the haptic mode and is perceived in the visual mode.
Finally, the operations term is the IOPE list of the Modality Component capabilities. IOPE means Inputs, Outputs, Preconditions and Effects of a service [YU-2007] [OWL-S].
In Figure 2 the "Face Synthesizer Service" acts in some mode that is perceived by a final user through a modality that is part of some functions, i.e. a face synthesis service acts in the visual mode that is perceived through a 3D mesh modality that is part of an avatar function.
Thus, for the "Face Syntesizer" service illustrated in Figure 2 the Modality Component's description (description.js document) shows an operation description. It could be a list of other expressions but we propose the smile operation as an example:
{ "name": "VRML_FACE_SYNTHESIZER", "affiliation": "ANIMATED_3D_RENDERER", "version": "1.0", "endpoints": { "1.0" : { "description":"http://localhost:5000/vrml_face_synthesizer/1-0/description.js", "uri": "http://localhost:5000/vrml_face_synthesizer/1-0/" } }, "modalities":{ "visual":["REALTIME_SINTHESIZER"] } }, "functions":{ "visual":["VR_GRAPHICS"] }, "operations": { "smile": { "method":"POST", "endpoint":"http://localhost:5000/vrml_face_synthesizer/1-0", "documentation": "Operation to change the expression to a smiling face. ", "metadata": {"emotion":"emotionML_uri","behavior":"behaviorML_uri"}, "input": { "key": { "position": 1, "metadata": { "Content-Type":{ "cognitive":["text/plain"] } }, "documentation": "The user key to acces this API" }, "event": { "position": 0, "metadata": { "Content-Type":{ "cognitive":["ExtensionNotification","StartRequest"] } }, "documentation": "If the event type is extension, the service returns just true or fail (for a steady smile, for example). If the event type is start request (for a time-controlled smile), the service can receive the starting time and returns the acceleration info." "data": { "metadata": { "Content-Type":{ "cognitive":["data/integer","data/time"] } }, "documentation": "If the event's data is a notification, the event will include the easing integer value for the acceleration. If the event is a StartRequest the event can also include the start time in milliseconds for the smile process." } } }, "output": { "event": { "position": 0, "metadata": { "Content-Type":{ "cognitive":["StartResponse"] } }, "documentation": "The type of response event.", "data": { "metadata": { "Content-Type":{ "cognitive":["data/integer" } }, "documentation": "In the case of a startRequest, a confirmation of the starting time of the animation." } } }, "preconditions": {"documentation": "No precondition is needed other than the loading of the face visual data."}, "effects": {"documentation": "Asynchronous modality. It will not block the rest of the application rendering."} } } }
This description can be parsed before the execution of the service, in a discovery process. To call the service and execute a smile operation, the service query with a POST method must be structured as follows:
POST /vrml_face_synthesizer/1-0 HTTP/1.1
Host: localhost:5000
Content-Type: text/xml
<?xml version="1.0"?>
<smile>
  <input>
    <event>
      <mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
        <mmi:startRequest source="IM_1" target="smile" context="c_1" requestID="r_1">
          <mmi:data>
            <ease value="0.5"/>
            <starting_time value="300"/>
          </mmi:data>
        </mmi:startRequest>
      </mmi:mmi>
    </event>
  </input>
</smile>
The smile tag represents the operation that has been requested, the input tag expresses that this is a request, and the event tag contains the MMI Lifecycle event used to control the operation. There can be multiple MMI events inside the input and output elements to support concurrent or parallel commands to the interface. The MMI Lifecycle event sent to the operation provided by the Modality Component can be any of the events defined to handle inputs in the MMI specification; a variant using an extension event is sketched after the response below.
The POST response of the service will be:
<?xml version="1.0"?>
<smile>
  <output>
    <event>
      <mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
        <mmi:startResponse source="smile" target="IM_1" context="c_1" requestID="r_1" status="success"/>
      </mmi:mmi>
    </event>
  </output>
</smile>
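A variant using an extension event, which the description.js above also accepts, might look like the following (a non-normative sketch requesting a steady smile; the easing value and the request identifier are illustrative):

<?xml version="1.0"?>
<smile>
  <input>
    <event>
      <mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
        <!-- Extension event for a steady smile: the service returns just true or fail -->
        <mmi:extensionNotification source="IM_1" target="smile" context="c_1" requestID="r_2">
          <mmi:data>
            <ease value="0.5"/>
          </mmi:data>
        </mmi:extensionNotification>
      </mmi:mmi>
    </event>
  </input>
</smile>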
The possible GET request to the REST endpoint for the same service could be:
GET /vrml_face_synthesizer/1-0/IM_1/c_1/event/startRequest/r_1/smile?data[ease]=0.5&data[starting_time]=300 HTTP/1.1
Host: localhost:5000
The possible JSON response to the REST request:
{ "output": {
"event": [{
"mmi": "startResponse",
"context": "c_1", "source": "smile", "target": "IM_1",
"requestID": "r_1",
"status": "success",
"data": {}
}]
}
}
Security techniques are kept separate from the communication protocol both in the architecture and in this document: we assume that the network is private. Security issues for this protocol in public networks will be addressed later.
Also, this document focuses on the flow of messages and the building blocks needed to support this flow. The details of the communication between the Interaction Manager and the State Manager, as well as the interfaces between the Data Component and the State Manager, will be described later.
Another open issue is the management of multiple instances of the Interaction Manager and the flow of messages between them, the Resources Manager and multiple Modality Components.
Finally, a common vocabulary for the description of the Modality Component's attributes, in order to register and compose them, is an important subject to be treated to allow better interoperability between multimodal systems. Vocabulary and capabilities will be addressed in a subsequent document.
The authors wish to acknowledge the contributions by all the members of the Multimodal Interaction Working Group.
Finally, the authors would also like to acknowledge the people outside of the MMI Working Group who helped with the process of developing this document, especially Jean-Claude Moissinac and Isabelle Demeure.