Recipes for building service-oriented knowledge-driven systems

Draft 2016-07-08

Latest version: https://www.w3.org/community/kiss/latest_recipes/
Editor: Andrei Lobov, Tampere University of Technology

Authors

Pavel Balda, University of West Bohemia

Sergii Iarovyi, Tampere University of Technology

Andrei Lobov, Tampere University of Technology

Wael Mohammed, Tampere University of Technology

Borja Ramis Ferrer, Tampere University of Technology

Copyright © 2016 the Contributors to the Recipes for building service-oriented knowledge-driven systems Specification, published by the open Knowledge-driven Service-oriented System architectures and APIs (KiSS) community group under the W3C Community Contributor License Agreement (CLA). A human-readable summary is available.

Abstract

This document is the first draft specification for the recipes to build service-oriented knowledge-driven systems.

Status of This Document

This specification was published by the open Knowledge-driven Service-oriented System architectures and APIs (KiSS) community group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

Introduction
Terminology
Devices
Knowledge representation and reasoning
Service protocols
Service composition and integration

1. Introduction

This document summarizes different problems and solutions (via APIs) for implementing service-oriented knowledge-driven systems. Services are the basic components for building loosely coupled applications, in which the services can be integrated and invoked on demand. Combined with knowledge representation, these basic components allow building powerful applications that exploit self-descriptive and discoverable component features at run time. The ultimate goal of an application in this context can be seen as a continuous search for, and improvement of, ways to efficiently address the changing goals of a system by integrating the necessary resources (e.g. services). Such integration can be automated thanks to the knowledge representation and reasoning performed by machines (computers) at run time.

This document is organized as follows. The next section provides the terminology as understood and defined by the KiSS community group. Dedicated chapters then follow on devices; knowledge representation and reasoning; service protocols; and service composition and integration. Each of those chapters has a problem definition subsection and an APIs subsection, outlining the main issues at each level and the APIs that could help in finding solutions to those problems.

This document is a living document which will be further improved and extended based on new insights from the KiSS community group members and on feedback received. Feedback on the document can be provided via the corresponding interface specified at https://www.w3.org/community/kiss/feedback.

2. Terminology

API - Application Programming Interface, a set of software libraries and tools for application/service development, deployment, configuration and maintenance.
Device - an embedded device - computer - that can host service(s).
JSON - JavaScript Object Notation.
JSON-LD - JSON for Linking Data.
Knowledge - An organized representation of data making it possible to systematically establish connections between separate measurements.
Knowledge base - A software platform to manage and organize knowledge representation models.
Knowledge-driven system (KDS) - A system that relies on knowledge bases for storing and managing its data/knowledge.
Ontology - A format for knowledge representation that can be processed by machines.
RDF - Resource Description Framework.
Service - A software application having standardized description to be automatically processed to discover capabilities of a service and decide if and when to integrate it into the bigger application.
Service-oriented system - A system built out of loosely-coupled components (services) that are arranged into application depending on the current goals of the system.
Reasoning - A discovery of new, previously unknown facts about the system represented with the knowledge base (e.g. by ontology).

3. Devices

In knowledge-driven systems, devices are computer-based modules which perform some non-trivial functionality (usually via services) and which are distributed and interconnected via some network (often a wireless network). Other terms, such as embedded systems, cyber-physical systems (CPS) and remote terminal units (RTU), are also used instead of the term devices. Devices usually provide interfaces to real-world equipment (via sensors and actuators).

3.1. Problem definition

There are several problems that can be attributed to devices when building service-oriented knowledge-driven systems. Some of them may look similar to those of any embedded or cyber-physical system. However, due to the potential exposure of the devices to global networks and/or their frequent reuse in changing applications, those problems may require new approaches.

Device hardware selection guidelines.

Selection of hardware should correspond to the purpose of a device. There is a wide variety of possible hardware, from devices based on low-power chips used in wireless sensor networks to powerful industrial computers (usually PCs) for implementing complex cyber-physical systems. Another selection criterion can be the immediate availability of the device: from custom designs to off-the-shelf solutions (like Arduino, Raspberry Pi, BeagleBone Black, etc.).

Device software selection guidelines.

Software can also vary widely. It can range from bare hardware (a microcontroller) without any operating system, to an embedded real-time operating system (e.g. FreeRTOS, eCos, Zephyr, etc.), to a full, complex operating system (e.g. Linux, Windows). An operating system alone, however, is not sufficient. Devices should be equipped with software that manages communication, device discovery, reading inputs / writing outputs and dynamic work with services. See, for example, eScop RTU based on REX control system or Inico S1000 devices.

Selection of network communication protocol(s).

It is highly probable that the majority of devices will implement standard internetworking protocols based on the TCP/IP family of protocols (IP, TCP, UDP, HTTP, HTTPS, FTP, etc.). Also some newer messaging protocols can be used (e.g. MQTT). The new generation of wireless devices will very likely support 6LoWPAN which is suitable for the smallest devices with limited processing capabilities based on low-power radio communication with lower data rates.

Device discovery.

Device discovery is a must for larger systems with dynamically added/removed devices. There are several discovery protocols that usually use some broadcast technique.
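The probe-and-match exchange that broadcast-based discovery protocols typically rely on can be sketched as follows. This is a minimal illustration only: the message fields (`msg`, `wanted_type`, `id`, `address`), the function names and the device records are hypothetical and do not follow the wire format of any particular discovery protocol. The broadcast itself is simulated in memory; a real device would receive the probe on a broadcast or multicast socket.

```python
import json

def build_probe(wanted_type):
    """Encode a discovery probe asking for devices of one type."""
    return json.dumps({"msg": "probe", "wanted_type": wanted_type}).encode("utf-8")

def answer_probe(raw_probe, device):
    """Return an encoded match if this device satisfies the probe, else None."""
    probe = json.loads(raw_probe.decode("utf-8"))
    if probe.get("msg") != "probe":
        return None
    if probe["wanted_type"] not in device["types"]:
        return None
    reply = {"msg": "match", "id": device["id"], "address": device["address"]}
    return json.dumps(reply).encode("utf-8")

def discover(wanted_type, devices):
    """Simulate a broadcast: every device sees the probe, matching ones reply."""
    probe = build_probe(wanted_type)
    replies = (answer_probe(probe, d) for d in devices)
    return [json.loads(r.decode("utf-8")) for r in replies if r is not None]

# Hypothetical devices; the addresses come from the documentation range 192.0.2.0/24.
devices = [
    {"id": "rtu-1", "types": ["conveyor"], "address": "192.0.2.10"},
    {"id": "rtu-2", "types": ["robot", "gripper"], "address": "192.0.2.11"},
]
found = discover("robot", devices)
```

Because only matching devices answer, the requester learns the addresses of suitable devices without any prior configuration, which is what makes dynamic addition and removal of devices manageable.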

Device inputs and outputs (incl. safety).

Software of the device should support drivers for reading of built-in direct inputs (analog, digital, counter, frequency, encoder inputs, etc.) and writing of direct outputs (analog, digital, pulse width modulated outputs, etc.). The quality of this piece of software influences substantially the real-time behaviour of the device.

Device services.

Device services are based on the same principles as the other services in the system. The only difference is that they are embedded into devices. See details in the following sections.

Semantics of services.

Semantics of services is a general issue in knowledge-driven systems; see the discussion below.

3.2. APIs

Network communication protocol(s) API(s)
- HTTP, HTTPS
- MQTT
- CoAP, etc.
Device discovery protocol API(s)
API of drivers for reading inputs and writing outputs
- Direct inputs/outputs (registers, memory mapping, I2C, SPI, etc.)
- Communicated inputs/outputs (fieldbus protocols, industrial Ethernet, etc.)
API for management of device services (Adding, Deleting, Executing)
- OPC Unified Architecture
- REST based APIs
- Device Profile for Web Services (DPWS)
API and tools for device configuration
- Direct inputs/outputs (registers, memory mapping, I2C, SPI, etc.)
- Communicated inputs/outputs (fieldbus protocols, industrial Ethernet, etc.)

4. Knowledge representation and reasoning

Conceptually, Knowledge Representation and Reasoning (KR&R) is an AI discipline that systematizes knowledge description independently of any domain and provides a set of decision-making algorithms. Currently, KR&R is being researched by our community due to the possibility of managing semantic descriptions that represent the status and requirements of, for example, personnel, equipment and products.

Semantic descriptions are easily understandable by both humans and machines, so they have become a valuable resource for the interaction between machines, e.g. to control processes in contemporary manufacturing systems.

In addition, semantic reasoning permits the conclusion of implicit facts that are derived from the explicit descriptions stored in the Knowledge Base (KB). This is a powerful feature of KR&R because machines can extend the KB, which contains the knowledge about a specific domain, at run time. Therefore, the correct use of KR&R brings new capabilities to machines, such as automated decision-making or learning.
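How implicit facts follow from explicit ones can be illustrated with a small forward-chaining sketch over subject-predicate-object triples. It applies only two RDFS-style rules (transitivity of `subClassOf` and inheritance of `type` along `subClassOf`); the predicate names and the example individuals are simplified placeholders, and a production reasoner would support a far richer rule set.

```python
def infer(triples):
    """Forward-chain two RDFS-style rules until no new triple appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if o != s2:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and p2 == "subClassOf":
                    new.add((s, "subClassOf", o2))
                # Rule 2: an instance of a class is an instance of its superclass.
                if p == "type" and p2 == "subClassOf":
                    new.add((s, "type", o2))
        if not new <= facts:
            facts |= new
            changed = True
    return facts

# Explicit descriptions, as they might be stored in the KB (hypothetical names).
explicit = {
    ("robot1", "type", "WeldingRobot"),
    ("WeldingRobot", "subClassOf", "Robot"),
    ("Robot", "subClassOf", "Equipment"),
}
implicit = infer(explicit) - explicit
```

Here `implicit` holds the three facts that were never stated: `robot1` is a `Robot`, `robot1` is an `Equipment`, and `WeldingRobot` is a subclass of `Equipment`.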

Moreover, KBs can nowadays be implemented with web-based standards. Hence, semantic descriptions can be accessed and manipulated remotely through the semantic web. The industrial automation domain is taking this opportunity to integrate KR&R descriptions with current ICT tools and developments, allowing e.g. remote access and control of processes, data mappings, complex event processing, flexible re-configuration or model validation, among other applications.

4.1. Problem definition

Adequate way to represent knowledge.

One of the main problems when employing KR&R techniques to formally describe knowledge for a knowledge-driven system is the manner in which the knowledge is represented. The ongoing discussion on this matter, for example in the industrial automation domain, concentrates on which formalism must be employed, e.g. ontologies, semantic nets, frames, production rules or even databases, among others. Nevertheless, the discussion should also cover the fact that once a formalism is selected, its implementation must be optimal, or at least good enough to be more efficient than the alternatives. In other words, a formalism is made good not by its capabilities alone but by the quality of its implementation and its expressiveness. Therefore, the implementation must exploit the full potential of the chosen formalism to ensure that the KR&R will be satisfactory and usable for other modules of the knowledge-driven solution.

Use of accepted standards for semantic descriptions.

Knowledge-driven systems can be implemented using CPS to integrate cyber and physical system domains. Such domains have a set of standards that can work together, allowing a logical and mappable description. The use of standardised descriptions is recommended to support the acceptance of knowledge representation. When the same domain concepts are defined or named differently in different models, translation principles between those models may have to be found. This can be a cause of failure when two systems exchange the same type of information that is hosted and was created by different organisations.

Scope of KiSS in terms of semantic descriptions.

The KiSS W3C Community Group believes that ontology is a mature formalism for fully describing knowledge of different domains and the services that may be executed for application integration purposes. RDF-based languages provide a rich vocabulary and functions to describe information so that it is understandable and usable for both humans and machines. In addition, the possibility of adding a layer of knowledge inference through semantic reasoning makes ontologies an attractive approach for the domain, because implicit knowledge not known at the design phase can be extremely helpful when making decisions at run time. Moreover, the flexibility and reusability of models are features of ontologies that are critical in very dynamic environments such as industrial automation.

Efficiency of ontologies.
Time constraints/requirements when querying and updating model.
Representative use cases for reasoning.

One of the main concerns of KiSS for KR&R in knowledge-driven systems is the efficiency of employing ontologies in real industrial scenarios. The completion of the eScop project has demonstrated that ontologies can be a trustworthy approach for describing and consuming knowledge in industrial environments. Nevertheless, additional and extended cases have to be developed to identify representative use cases for reasoning techniques, as industrial systems generally tend to be implemented in a predictable way, with a set of hardcoded rules for their control. The engineering methodologies for such systems may not easily allow moving away from the “hardcoding” perspective. Performance aspects, such as the time needed for interactions with the KB, are important criteria for assessing the KB and comparing it with the traditional database approach. Degradation in performance may undermine the benefits of the expressive power of ontologies. Yet another problem is the management of the ontology life time. As changes in the models may be requested from various sources (e.g. devices), maintaining the entire picture, as it is seen from various sources, may be a challenging task. The situations range from one source simply overwriting the input of another source to contradictions between different inputs being detected, understood and reacted upon, which requires another level of conceptualization and reasoning.
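The difference between silently overwriting an input and detecting a contradiction can be sketched as follows. The sketch assumes that each property holds a single value per subject, so that two sources reporting different values contradict each other; the source names, subjects and the `find_conflicts` helper are hypothetical.

```python
from collections import defaultdict

def find_conflicts(assertions):
    """Group assertions by (subject, property) and keep the contradictory ones.

    Each assertion is a (source, subject, property, value) tuple. A conflict is
    reported when sources disagree on the value of the same property.
    """
    values = defaultdict(dict)
    for source, subject, prop, value in assertions:
        values[(subject, prop)][source] = value
    return {
        key: by_source
        for key, by_source in values.items()
        if len(set(by_source.values())) > 1
    }

assertions = [
    ("plc-a", "conveyor1", "state", "running"),
    ("plc-b", "conveyor1", "state", "stopped"),
    ("plc-a", "robot1", "state", "idle"),
    ("plc-b", "robot1", "state", "idle"),
]
conflicts = find_conflicts(assertions)
```

Keeping the value per source, rather than the last value written, is what makes the contradiction visible; deciding which source to trust is the part that needs further conceptualization and reasoning.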

4.2. APIs and tools

Frameworks

Jena
Sesame
OWL API
SKOS

Ontology languages

RDF, OWL, OWL 2, OWL-S
SPARQL, SPARUL
- SPARQL over HTTP
SWRL
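As an illustration of SPARQL over HTTP, the sketch below builds (without sending) a query request in the "query via GET" form of the SPARQL Protocol, where the query text travels in the `query` URL parameter. The endpoint URL, dataset name and the example class IRI are assumptions made for illustration; only the Python standard library is used.

```python
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import Request

def sparql_get_request(endpoint, query):
    """Build a SPARQL Protocol query-via-GET request; nothing is sent."""
    url = endpoint + "?" + urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return Request(url, headers=headers, method="GET")

# Hypothetical endpoint and vocabulary.
query = "SELECT ?s WHERE { ?s a <http://example.org/Robot> } LIMIT 10"
request = sparql_get_request("http://localhost:3030/plant/sparql", query)

# Round trip: the query can be recovered unchanged from the encoded URL.
sent_query = parse_qs(urlparse(request.full_url).query)["query"][0]
```

Passing `request` to `urllib.request.urlopen` would execute the query against a running endpoint.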

Ontology editors

Olingvo
Protege
FAST Semantic Workbench

RDF stores / Triple stores

Fuseki
Stardog

Reasoners

Pellet
Sesame
HermiT
RacerPro

5. Service protocols

There exist several approaches to service implementation. The term “services” is widely used in general. Here and further on, it is used in tight relation to the named expected features. This is related to the need for a proper encapsulation of the functionality at the right level of abstraction.

Among the currently existing approaches to service implementation, two main ones should be mentioned: the WS-* specifications and the RESTful architectural style. These approaches are not strictly contradictory, but they follow different philosophies. WS-* proposes the most complete set of specifications for implementing web services, while the RESTful approach suggests an architectural style for implementing capable but lightweight web services that exploit the nature of the web. As a result, the specifications and the architectural style cannot be effectively compared.

Considering the application of distributed systems in manufacturing, several options exist, with different levels of distribution and of acceptance by the community:

OPC-UA (with SOAP binding implements some part of WS-* spec)
DPWS (implements WS-* spec)
CoAP (follows the RESTful architecture, used in general IoT)
RESTful RTUs
OSGi

5.1. Problem definition

Which “philosophy” to follow?

The first problem encountered when implementing a web-service-based solution is deciding how to implement the services. Considering the need to be integrated in the wider community, and leaving aside the theoretical possibility of developing a new approach to service implementation, the implementer is left with few possibilities: to follow the WS-* specifications, the RESTful architectural style, or some combination of both.

Which protocol to use?

Assuming that the implementer has decided which approach to web services to follow, the next choice concerns the implementation of that approach. Within the WS-* specifications there exist multiple possibilities for service binding. Each possibility has its own benefits and disadvantages, rooted in the nature of the underlying communication protocol. If one follows the RESTful approach, different protocols can be used. RESTful services are normally implemented using HTTP, but HTTP carries a certain overhead and is not applicable, for example, to devices with constrained capabilities. To implement RESTful services in such an environment, the CoAP protocol can be used. It is easily mapped to HTTP, but enables better performance. A combination of HTTP and CoAP is possible when gateways are used between the corresponding network segments. As a result, the selection of the protocol has to be based on the application requirements.

How to select a service protocol for implementation in application?

In our opinion it should be possible to identify a set of mnemonic rules or a decision table which will support the designer in this decision. The most important factors influencing the decision are: service host capabilities (computational power, networking capabilities), network parameters (reliability, bandwidth), the nature of the overall application (number of service nodes, size and frequency of messages, reliability requirements), and the nature of the related equipment if relevant (e.g. how often a controlled device should receive input from the overall application).
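What such a decision table could look like is sketched below. The rows are purely illustrative and are not rules proposed by this document: the three requirement flags, the row order and the resulting choices are assumptions chosen only to show the mechanism (an ordered table in which the first matching row wins).

```python
from dataclasses import dataclass

@dataclass
class Requirements:
    constrained_host: bool  # limited CPU, memory or power on the service host
    lossy_network: bool     # low bandwidth or unreliable links
    needs_ws_star: bool     # must interoperate with WS-* based systems

# Ordered decision table: the first matching row wins. Rows are illustrative only.
DECISION_TABLE = [
    (lambda r: r.needs_ws_star, "WS-* (e.g. DPWS)"),
    (lambda r: r.constrained_host or r.lossy_network, "REST over CoAP"),
    (lambda r: True, "REST over HTTP"),
]

def select_protocol(req):
    """Return the choice of the first row whose condition holds."""
    for condition, choice in DECISION_TABLE:
        if condition(req):
            return choice

choice = select_protocol(
    Requirements(constrained_host=True, lossy_network=False, needs_ws_star=False)
)
```

Keeping the rules as data rather than as nested conditionals makes the table easy to review, reorder and extend with further factors such as message size or reliability requirements.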

How to implement services in applications to create KDS?

After the basic questions of technology and tool selection for basic service capabilities are resolved, the next question is how to implement services. In the opinion of the authors, it is generally required to follow the basic rules identified by the very concept of web services and by the selected specification or architectural style. Otherwise the basic goal of applying the selected concepts is endangered, and the resulting solution may not perform in line with the specification and expectations. Additionally, such deviations will complicate the integration with other solutions.

The aim of having a knowledge-driven system places additional requirements on the service design. The services in a KDS should be combined with the related knowledge about the service itself. The knowledge about the service should include the details required for the service invocation, such as the communication protocol, the expected communication pattern, the parameterized and semantically annotated request and response, and the possible exceptions and exception resolution mechanisms. Some of this knowledge may be implicit or agreed upon by default, but this reduces the possibility of interaction with systems that follow a different convention of service description. Additionally, on a higher level, the meaning of the service-related operation should be defined. This will allow the selection of the service for the needs of external systems or of composed services.
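The knowledge items listed above can be collected in a simple record, sketched here as a data class. The field names, the example service and all IRIs are hypothetical; the helper `implicit_parts` merely reports which parts of the description were left empty and therefore rely on convention.

```python
from dataclasses import dataclass, field, fields

@dataclass
class ServiceDescription:
    name: str
    protocol: str = ""       # e.g. "HTTP", "CoAP"
    pattern: str = ""        # e.g. "request-response", "publish-subscribe"
    request: dict = field(default_factory=dict)     # parameter -> concept IRI
    response: dict = field(default_factory=dict)    # result field -> concept IRI
    exceptions: dict = field(default_factory=dict)  # exception -> resolution hint
    meaning: str = ""        # IRI of the concept describing the operation

def implicit_parts(desc):
    """List the parts of a description left empty, i.e. relying on convention."""
    return [f.name for f in fields(desc) if not getattr(desc, f.name)]

# Hypothetical service with a partially explicit description.
desc = ServiceDescription(
    name="moveToPosition",
    protocol="HTTP",
    pattern="request-response",
    request={"x": "http://example.org/onto#XCoordinate"},
    meaning="http://example.org/onto#Transport",
)
left_implicit = implicit_parts(desc)
```

For this example `left_implicit` is `["response", "exceptions"]`, pointing at the knowledge a consumer following a different convention would be missing.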

This leads to a tradeoff between reducing complexity and improving flexibility in design of KDS services. The balance between them is to be defined by common sense depending on the expectation set for the solution.

It is important to understand the nature of the operation underlying the service. In some cases the underlying operation can be abstracted as a combination of CRUD (Create/Read/Update/Delete) actions. Such an operation can be natively mapped to e.g. HTTP. In other cases the operation has a different meaning than a basic CRUD operation. The operation may then be reformulated so that it maps to CRUD, or it needs a specific approach. A generalized approach to this problem is needed.
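The native mapping of CRUD actions to HTTP can be sketched with the conventional assignment of Create to POST, Read to GET, Update to PUT and Delete to DELETE. The function name and the resource URLs are hypothetical.

```python
# Conventional CRUD-to-HTTP mapping used by many RESTful services.
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}

def to_http(action, resource_url):
    """Translate a CRUD action on a resource into an HTTP method and URL."""
    try:
        return CRUD_TO_HTTP[action.lower()], resource_url
    except KeyError:
        raise ValueError(
            f"{action!r} is not a CRUD action; it needs reformulating"
        ) from None

# A plain CRUD operation maps directly.
read_call = to_http("Read", "http://device.example/conveyor1/state")

# A non-CRUD operation such as "start the conveyor" can be reformulated
# as an update of the conveyor's state resource.
start_call = to_http("update", "http://device.example/conveyor1/state")
```

An operation that cannot be reformulated this way raises `ValueError` here, which marks the cases that need a specific approach.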

One of the practical approaches that encompasses the description of CRUD HTTP RESTful services at the application level is the Hydra API.

The service description (extended contract) can be provided by the service itself or registered in a service inventory separately from the service. Furthermore, the automated registration of services in the service inventory is possible. Considering that complex systems may have hundreds of services dynamically appearing and disappearing, the automation of service registration should provide significant advantages over a manual process.
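Automated registration can be sketched as an in-memory inventory in which services announce themselves and entries expire unless refreshed. The class, its method names and the time-to-live mechanism are assumptions made for illustration, not a prescribed design; the clock is injectable so that the behaviour can be checked without waiting.

```python
import time

class ServiceInventory:
    """In-memory registry where services announce themselves and expire."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # service id -> (description, expiry time)

    def register(self, service_id, description):
        """Called by a service on start-up and periodically as a heartbeat."""
        self._entries[service_id] = (description, self._clock() + self._ttl)

    def deregister(self, service_id):
        """Called by a service that leaves the system in an orderly way."""
        self._entries.pop(service_id, None)

    def active(self):
        """Drop services whose heartbeat lapsed and return the remaining ones."""
        now = self._clock()
        self._entries = {k: v for k, v in self._entries.items() if v[1] > now}
        return {k: v[0] for k, v in self._entries.items()}

# Simulated clock, so that expiry can be shown deterministically.
now = [0.0]
inventory = ServiceInventory(ttl_seconds=10, clock=lambda: now[0])
inventory.register("drill-1", {"operation": "drill"})
now[0] = 5.0
inventory.register("move-1", {"operation": "move"})
now[0] = 12.0
remaining = inventory.active()  # drill-1 has expired, move-1 is still listed
```

Expiry covers services that disappear without notice, while `deregister` covers orderly shutdown; neither requires manual maintenance of the inventory.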

How to represent data?

As discussed before, the data included in the service-oriented communication in a KDS requires description, just like the services themselves. In practice, the data should be mapped to the concepts in the knowledge system. One approach would be to include all related information in the message, but this would significantly increase the communication overhead and create possibilities for desynchronization of data and model. A more feasible and secure approach is to reference concepts that live outside the message itself. The concept of linked data provides a possibility to give context to a message. Two major linked data serializations, based on JSON and XML, are JSON-LD and RDF/XML respectively. It is possible, but not required, to use self-descriptive data with more complex definitions based on RDFS or OWL at this level.
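A message that references concepts instead of embedding their definitions can be sketched in JSON-LD as follows. All IRIs are hypothetical placeholders, and `expand_keys` is a deliberately simplified stand-in for JSON-LD expansion: it only replaces top-level keys using an inline `@context`, whereas a complete JSON-LD processor handles much more. In practice the `@context` may itself be a single URL, which keeps the message even smaller.

```python
import json

# Hypothetical vocabulary IRIs; a real system would point at its own ontology.
message = {
    "@context": {
        "temperature": "http://example.org/onto#temperature",
        "unit": "http://example.org/onto#unit",
    },
    "@id": "http://example.org/devices/sensor1/readings/42",
    "temperature": 21.5,
    "unit": {"@id": "http://example.org/onto#DegreeCelsius"},
}

def expand_keys(doc):
    """Replace short keys by the IRIs given in an inline @context (simplified)."""
    context = doc.get("@context", {})
    return {context.get(k, k): v for k, v in doc.items() if k != "@context"}

wire = json.dumps(message)                # what travels between services
expanded = expand_keys(json.loads(wire))  # what the receiver maps to concepts
```

The value of `unit` is itself a link (`@id`) to a concept, so the relation between the reading and its unit is expressed as a hyperlink rather than as copied data.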

How to represent data relations?

Following the concept of linked data, the relations may be hyperlinks. Furthermore, the availability of open, public, crowd-sourced resources for data schemas and relations should accelerate and simplify the convergence of solutions in different domains.

5.2. APIs and tools

Frameworks for service implementations

Spring (WS-* & REST for Java)
Express (REST for Node.js)
Flask (REST for Python)
WS4D (DPWS implementation for C, C++, Java ...)

Clients

Web browser - general
Postman, Advanced REST Client - REST
Service Explorer - DPWS

To implement on device level

Some PLCs implementing OPC-UA
REX, Inico S1000, eScop RTUs

Data representation

Concept of hypermedia
Linked data RDF, JSON-LD
Web Ontology Language (OWL)
Hydra: Hypermedia-Driven Web APIs

6. Service composition and integration

As presented in the previous section, services generally bind several parties in order to ensure proper communication. These parties may exist on different platforms, including cloud-based platforms. Thus, services are recommended to be independent of the consumer technology. Services generally follow a request and response approach: the request expresses a need for a certain service, and the response of the requested service contains the answer. Two main players collaborate in such communication: servers, which receive the request, and clients, which generate the request. Many organizations (e.g. W3C and OASIS) have provided protocols for implementing services.

The term service composition and integration can be seen as the method for constructing and using the communication data according to the selected protocol. It has to be noted that the user selects a web service protocol and then uses the features of the protocol such as discovery of web services or the sequence of flow.

6.1. Problem definition

After choosing the proper protocol, the user has to define the method(s) of exploiting the protocol and deploy them in the KBS. The problem definition can be captured in the following questions:

What is the nature of the exchanged data?

The nature of the data to be exchanged governs the implementation of the web service protocol. As an example, if a resource provides a continuous stream of data for some consumer, then the selection and the integration of the protocol have to be compatible with that kind of data. Besides that, the notation format of the data (e.g. JSON, XML, plain text or CSV) matters in terms of parsing.
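Why the notation format matters for parsing can be seen in a small dispatcher that turns a payload into a uniform list of records depending on its media type. The function name, the record layout and the assumed XML structure (a root element whose children are records with one child element per field) are illustrative assumptions; only the Python standard library is used.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

def parse_payload(content_type, body):
    """Parse a text payload into a list of {name: value} records by media type."""
    if content_type == "application/json":
        data = json.loads(body)
        return data if isinstance(data, list) else [data]
    if content_type == "text/csv":
        # The first CSV line is taken as the header row.
        return list(csv.DictReader(io.StringIO(body)))
    if content_type == "application/xml":
        # Assumed layout: <root><record><field>value</field>...</record>...</root>
        root = ET.fromstring(body)
        return [{child.tag: child.text for child in record} for record in root]
    if content_type == "text/plain":
        return [{"text": body}]
    raise ValueError(f"unsupported media type: {content_type}")

from_json = parse_payload("application/json", '{"sensor": "s1", "value": "21.5"}')
from_csv = parse_payload("text/csv", "sensor,value\ns1,21.5\n")
from_xml = parse_payload(
    "application/xml",
    "<readings><reading><sensor>s1</sensor><value>21.5</value></reading></readings>",
)
```

All three calls yield the same record, so the rest of the application can stay independent of the notation chosen by a particular service.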

What is the level of required security?

Another factor that the user has to consider is the security level. The sequence flow of the web services might change according to the security level which the user chooses for the same protocol. As an example, HTTP and HTTPS are similar in terms of functionality. However, HTTPS adds exchanges to the flow, because a TLS handshake has to complete before the first request is sent.

How to discover web services of a certain resource?

As a KBS provides reasoning capabilities, the discovery of web services improves the interactions and allows autonomous integration. The role of reasoning can be to find the right services for the problem that the application has to solve.

Which approach optimizes the communications efficiency (time, running cost, security)?

Many approaches could lead to the same functionality within the same web service protocol. However, the user needs to take performance into account when the system requires real-time integration. As an example, in a KBS an information query could have more than one structure, but with an optimized structure the KBS may perform better, for example in the time needed for querying the data or for parsing the response.

6.2. APIs and tools

All modern APIs and tools support the majority of the available communication protocols. Building the service composition methods is therefore left to the user. The following are some APIs and tools that help the user achieve more consistent results: