For safety assurance in conventional engineering, the safety view as a refined quality view is of particular interest. Even though functional models and aspects such as tests are also important building blocks for safety assurance, all safety-relevant information is eventually compiled in dedicated safety models and the complete safety assurance process is controlled based on these safety models. Iteratively, the residual risk is evaluated, cause-effect chains are identified, appropriate countermeasures are selected, their appropriateness is evaluated, and the process starts again with a re-evaluation of the residual risk – until the residual risk falls below an acceptable threshold.
In principle, this process is transferred, at least partially, into runtime for dynamic safety assurance. B-Spaces shall possess the required knowledge (in the form of models at runtime) as well as the means (in the form of management mechanisms that, for instance, realize dynamic self-adaptation) for transferring certain aspects of the assurance process from the human engineer at development time to the systems at runtime. How much of the process should actually be transferred depends on the degrees of openness and adaptivity of the respective systems and on the corresponding B-Space “level” and “scope” (cf. Fig. 4). In [12], we identified a set of distinct “safety-intelligence” levels (safety certificates at runtime, safety cases at runtime, and hazard and risk analysis at runtime), each implying a certain amount of shifted safety intelligence and each appropriate for different degrees of openness and adaptivity, i.e., for different levels and scopes of the B-Space. The higher categories of “level” and “scope” require (or at least greatly benefit from) higher levels of shifted safety intelligence. Safety certificates at runtime are well suited for the system/mission level and an operational scope. Operating on the ecosystem level (i.e., in an open ecosystem with formerly unknown constituent systems entering dynamically), in contrast, requires some means to dynamically analyze hazards and risks for the completely new collaboration schemes that form within the ecosystem. In the following, we reiterate these distinct levels of runtime safety intelligence before we exemplify in more detail (based on ConSerts) how a runtime safety certificate approach can be set up.
Certificates at runtime (such as ConSerts) are post-certification artifacts. This means that only a few unknowns exist regarding the system’s environment and collaborations and that, based on this general knowledge, a certain variability is built into the safety certificates, which is bound to formal conditions. These conditions are associated with properties of the environment (such as safety requirements of an external service) that can be resolved at runtime. ConSerts, as an instance of certificates at runtime, enable a compositional safety evaluation in a heterarchical system-of-systems setting. Due to their limitations, however, ConSerts are mostly suited for closed ecosystems and pre-engineered adaptive behavior. Overall, ConSerts are well suited to constitute a B-Space for dynamic safety assurance on the mission level and with an operational scope. The B-Space would then consist of the ConSerts for all cooperating systems (as well as auxiliary models such as type systems for services and safety properties). Based thereon, safety guarantees can be calculated dynamically and dynamic management can be triggered to maintain necessary guarantees and to optimize performance. Even though ConSerts only cover a slice of the capabilities and dimensions outlined in Fig. 4 (i.e., mission level, operational scope, focus on safety as a quality), they show what we understand by a runtime model and what implications this has for established engineering practice. ConSerts will therefore be discussed in more detail in the subsequent chapter. Of course, apart from ConSerts there are other approaches from the state of the art that fit into the B-Space concept as well, either addressing other dimensions (cf. Fig. 4) or addressing orthogonal concerns (such as monitoring properties by runtime verification, which can be relevant for arbitrary B-Space dimensions). Examples are brought up in the following and in the conclusion.
The next level is safety/assurance cases at runtime, which, as SM@RTs, would be the backbone for the dynamic acquisition of evidence by means of runtime V&V (i.e., verification and validation) strategies. Based thereon, dynamic adjustment of V&V models would enable an additional dimension of flexibility. One example is Digital Dependability Identities (DDIs). A DDI can be understood as a dependability-related aspect of a B-Space. A DDI contains all the information that uniquely describes the dependability characteristics of a system or component [4]. This includes attributes that describe the system’s or component’s dependability behavior, such as fault propagations, requirements on how the component interacts with other entities in a dependable way, and the level of trust and assurance, respectively. The latter can be described using concepts from the theory of safety contracts. A DDI is a living modular dependability assurance case. It contains an expression of dependability requirements for the respective component or system, arguments of how these requirements are met, and evidence in the form of safety analysis artifacts that substantiate these arguments. A DDI is produced during design, issued when the component is released, and then continually maintained over the complete lifetime of the component or system. DDIs are used for the integration of components into systems during development as well as for the dynamic integration of systems into systems of systems in the field.
The penultimate level of runtime safety intelligence would be to also shift parts of the hazard and risk analysis to runtime, hence enabling dynamic risk assessment or runtime risk analysis. This, in turn, would be a starting point for enabling dynamic adjustments in the safety argumentation, which might then affect the dynamic certificates.
Ultimately, safety assurance would be shifted completely to runtime. Comprehensive models within the B-Space covering all relevant concerns would constitute the basis for a fully automated dynamic safety lifecycle yielding an optimal trade-off between safety and performance within an ecosystem. Approaches from the field of artificial intelligence might provide the reasoning capabilities required on top of the information provided by the B-Space. Clearly, this idea of emergent safety is provocative and seems far-fetched; it should be understood as an ultimate vision for tomorrow rather than a concrete research goal for today.
4.1 Certificates at runtime – Conditional safety certificates
In this chapter, we reiterate the concept of ConSerts as introduced in [2, 3] and outlined in [4] and illustrate how ConSerts are an aspect of and an initial step towards the concept of B-Spaces. In particular, it is shown what B-Space models could look like and what the implications would be with respect to established engineering practice.
ConSerts operate on the level of safety requirements. They are issued at development time and certify specific safety guarantees (i.e., guaranteed safety requirements) that depend on the fulfillment of specific demands (i.e., safety requirements demanded from the environment) regarding the environment. In the same way as “static” certificates, ConSerts shall be issued by safety experts, independent organizations, or authorized bodies after a stringent manual check of the safety argument. To this end, it is mandatory to prove all claims regarding the fulfillment of provided safety guarantees by means of suitable evidence and to provide adequate documentation of the overall argument – including the external demands and their implications.
There are, however, some significant differences between ConSerts and static certificates that are owed to the nature of open systems: A ConSert is not static but variable and conditional; i.e., it comprises a number of variants that are conditional with respect to the (dynamic) fulfillment of demands. Moreover, a ConSert must be available in an executable (and composable) form at runtime (i.e., as a safety model at runtime) and systems must be equipped with corresponding mechanisms to operate on the ConSerts. Conditions within a ConSert manifest in relations between potentially guaranteed safety requirements, which can simply be denoted as guarantees, and the corresponding demanded safety requirements, i.e., demands. Demands always represent safety requirements relating to the environment of a component, which cannot be verified at development time because the required information is not available yet. These demands might directly relate to required functionalities from other components. On the other hand, evidence can be required beyond that since safety is not a purely modular property and it cannot be assumed that a composition of safe components is automatically safe. To this end, ConSerts support the concept of so-called Runtime Evidences (RtE) as an additional operand of the conditions. RtEs are a very flexible concept. In principle, any runtime analysis providing a Boolean result can be used. RtEs might relate to properties of the composition or to any context information, e.g., a physical phenomenon such as the temperature of the environment that is safety-relevant and could be measured with a sensor. Other RtEs require dynamic negotiation between components, such as for determining independence between different services used. To this end, a dedicated protocol could be used that builds traces through the composition hierarchy starting from the services in question to identify common elements in the traces.
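A trace-based independence check of this kind could be sketched as follows. This is only a minimal illustration and not the protocol used in the ConSerts literature: the `providers` mapping, the function names, and the example element names are assumptions made for this sketch. Each trace collects all elements a service transitively depends on, and the RtE holds if the two traces share no element.

```python
from typing import Dict, List, Set


def trace(service: str, providers: Dict[str, List[str]]) -> Set[str]:
    """Collect the service itself and every element it transitively depends on."""
    seen: Set[str] = {service}
    stack = [service]
    while stack:
        current = stack.pop()
        for dependency in providers.get(current, []):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


def independent(a: str, b: str, providers: Dict[str, List[str]]) -> bool:
    """Boolean RtE: True if the traces of both services have no common element."""
    return trace(a, providers).isdisjoint(trace(b, providers))


# Hypothetical composition hierarchy: service -> elements it directly relies on.
providers = {
    "speed_a": ["wheel_sensor", "ecu_1"],
    "speed_b": ["gps_receiver", "ecu_2"],
    "speed_c": ["radar", "ecu_1"],
}

print(independent("speed_a", "speed_b", providers))  # no shared element
print(independent("speed_a", "speed_c", providers))  # both rely on ecu_1
```

In a distributed setting, each constituent system would only know its own direct dependencies, so the traces would have to be assembled by message exchange rather than from one global dictionary.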
Accordingly, a ConSert can be specified as a set of Boolean functions, which, in turn, can generally be represented by a corresponding set of (potentially overlapping) rooted directed acyclic graphs in a graphical specification. The root of each of these directed acyclic graphs is constituted by a potential safety guarantee, which becomes true if, at runtime, related (according to the Boolean logic) demands and runtime evidences are satisfied. Each graph consists of:
• A set of Boolean input variables; i.e., demands and runtime evidences
• A set of Boolean gates; i.e., {and, or}
• A set of directed edges connecting the elements
• A Boolean output variable; i.e., the guarantee
The guarantees and demands can be specified based on the grammar introduced in [3]. An example follows later in this article.
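To make the graph structure concrete, the following sketch evaluates one such rooted directed acyclic graph. It is an illustration under assumptions: the dictionary-based encoding, the function name `evaluate`, and all node names are hypothetical and not taken from [2, 3]. Input variables (demands and runtime evidences) are leaves, gates are inner nodes, and the guarantee is the root.

```python
from typing import Dict, List, Optional, Tuple

# A gate is ("and" | "or", names of the nodes its incoming edges start from).
Gate = Tuple[str, List[str]]


def evaluate(node: str,
             gates: Dict[str, Gate],
             inputs: Dict[str, bool],
             cache: Optional[Dict[str, bool]] = None) -> bool:
    """Evaluate a node of the DAG; shared sub-graphs are computed only once."""
    if cache is None:
        cache = {}
    if node in inputs:  # demand or runtime evidence
        return inputs[node]
    if node not in cache:
        op, children = gates[node]
        values = [evaluate(child, gates, inputs, cache) for child in children]
        cache[node] = all(values) if op == "and" else any(values)
    return cache[node]


# Hypothetical guarantee: g_full = d_speed_service AND (rte_temperature_ok OR d_backup_sensor)
GATES: Dict[str, Gate] = {
    "g_full": ("and", ["d_speed_service", "alternatives"]),
    "alternatives": ("or", ["rte_temperature_ok", "d_backup_sensor"]),
}

runtime_inputs = {
    "d_speed_service": True,
    "rte_temperature_ok": False,
    "d_backup_sensor": True,
}
print(evaluate("g_full", GATES, runtime_inputs))
```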
Note that ConSerts are designed to harmonize with pre-engineered dynamic reconfiguration. Each constituent system (or component) can have a series of configurations, which can be switched at runtime. Each of these configurations is characterized by a specific profile of required and provided services and equipped with its own specific ConSert, which provides the mapping function between guarantees, demands and runtime evidences as described above (cf. Figure 5). Within the B-Space, the reconfiguration models are an important additional ingredient for enabling not only dynamic assessment of safety guarantees, but also their management and enforcement.
To be utilized as a runtime model of the B-Space, ConSerts (as well as other relevant models) need to be transferred into a suitable representation. Suitability clearly depends on the characteristics of systems and domains and on the corresponding constraints. When we are talking about tiny microcontrollers where resources are very scarce, it may make sense to bring ConSerts into a BDD representation (i.e. a set of BDDs, one for each potential guarantee) [2, 3], which can be optimized at development time to enable very easy evaluation and a minimal memory footprint at runtime. If we are talking about more powerful IoT devices, an XML-based representation might be more appropriate, and if an Internet connection is always available, even a cloud-based ConSert evaluation is conceivable. Looking at the B-Space as a whole, different runtime models for different concerns might reside in different places. ConSerts and reconfiguration models might be distributed modularly over all constituent systems, while other models, such as those working on a larger time scale and with a more strategic scope, may reside and be managed in the cloud.
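The appeal of a BDD representation on small devices can be seen from how little code the evaluation needs. The sketch below is an assumption-laden illustration, not the encoding from [2, 3]: the node table, the variable order, and the encoded function g = d1 AND (d2 OR r1) are hypothetical. Evaluation is a single walk from the root to a terminal, with one table lookup per variable.

```python
from typing import List, Optional, Sequence, Tuple

FALSE, TRUE = 0, 1  # indices of the two terminal nodes

# Each inner node: (variable index, next node if variable is False, next node if True).
# Hypothetical variable order: 0 = demand d1, 1 = demand d2, 2 = runtime evidence r1.
BDD: List[Optional[Tuple[int, int, int]]] = [
    None,              # 0: terminal FALSE
    None,              # 1: terminal TRUE
    (0, FALSE, 3),     # 2: root, tests d1
    (1, 4, TRUE),      # 3: tests d2
    (2, FALSE, TRUE),  # 4: tests r1
]
ROOT = 2


def evaluate_bdd(bdd: Sequence[Optional[Tuple[int, int, int]]],
                 root: int,
                 assignment: Sequence[bool]) -> bool:
    """Follow one path from the root to a terminal node."""
    node = root
    while node > TRUE:
        variable, low, high = bdd[node]
        node = high if assignment[variable] else low
    return node == TRUE


print(evaluate_bdd(BDD, ROOT, [True, False, True]))
```

On a microcontroller the same table would typically be stored as a constant array in flash memory, which is where the development-time optimization mentioned above pays off.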
At runtime, dynamic hierarchies are formed when a top-level application is instantiated in an open adaptive system of systems. Thus, the top-level application requires basic services from other systems, which might in turn require basic services from yet another system, and so on. In this process of forming a dynamic hierarchy, involved constituent systems might be reconfigured/adapted to be fit for their tasks. This also implies that constituent systems deliver services for consumers of different composition hierarchies. Therefore, a heterarchical system-of-systems structure is formed overall.
Throughout these hierarchies, ConSerts are composed and evaluated, yielding the current safety guarantees for the respective hierarchy at their root node. The evaluation is performed from leaves to root, starting from leaf ConSerts that have only runtime evidences and no demands. Generally, based on the fulfillment of demands and runtime evidences, it is determined for each ConSert through Boolean logic which of its potential guarantees are currently valid. If several guarantees of a ConSert are valid, one (i.e. the “best” one) shall be selected. This can be done based on a simple absolute order from best to worst over the guarantees of a ConSert or based on more sophisticated utility functions, which might include context information in the equation to make a more informed decision on what is actually to be considered “best” under given circumstances.
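The composition and selection step can be sketched as follows. The sketch makes several simplifying assumptions that are not part of the ConSerts definition: conditions are pure conjunctions, a demand is expressed as the worst acceptable rank of the provider's selected guarantee, the hierarchy is acyclic, and "best" follows the simple absolute order. All class, function, and system names are hypothetical. The recursion bottoms out at leaf ConSerts that have only runtime evidences, so leaves are effectively resolved before the root.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Guarantee:
    name: str
    # provider system -> worst rank of its selected guarantee that is still acceptable
    demands: Dict[str, int] = field(default_factory=dict)
    # names of runtime evidences that must all hold
    rtes: List[str] = field(default_factory=list)


@dataclass
class ConSert:
    system: str
    guarantees: List[Guarantee]  # ordered from best (rank 1) to worst


def select_guarantee(consert: ConSert,
                     conserts: Dict[str, ConSert],
                     rte_values: Dict[str, bool]) -> Optional[int]:
    """Return the rank (1 = best) of the best currently valid guarantee, or None."""
    for rank, guarantee in enumerate(consert.guarantees, start=1):
        if not all(rte_values.get(rte, False) for rte in guarantee.rtes):
            continue
        demands_ok = True
        for provider, worst_rank in guarantee.demands.items():
            provided = select_guarantee(conserts[provider], conserts, rte_values)
            if provided is None or provided > worst_rank:
                demands_ok = False
                break
        if demands_ok:
            return rank
    return None


# Hypothetical two-level hierarchy: an application on top of a drive service.
conserts = {
    "drive": ConSert("drive", [Guarantee("high", rtes=["sensor_ok"]),
                               Guarantee("basic")]),
    "app": ConSert("app", [Guarantee("full", demands={"drive": 1}),
                           Guarantee("limited", demands={"drive": 2})]),
}

print(select_guarantee(conserts["app"], conserts, {"sensor_ok": True}))
print(select_guarantee(conserts["app"], conserts, {"sensor_ok": False}))
```

A utility-based selection would replace the first-match loop with a scoring of all valid guarantees against the current context.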
Based on the continuously monitored safety guarantees, managing actions might be performed by the system(s) in the sense of the MAPE-M@RT (monitoring, analysis, planning and execution based on models at runtime) [13] control loop. Thus, the context is monitored and the current safety guarantees are determined; it is analyzed whether the system has the appropriate configuration and parameterization; potential adaptations are planned and executed – all based on the M@RT of the B-Space, in particular the ConSerts and correlated adaptation/reconfiguration models.
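One iteration of such a loop might look like the sketch below. It is a deliberately reduced illustration and not the MAPE-M@RT architecture of [13]: the reconfiguration model `CONFIGS`, the rank-based guarantee, and the callback names are assumptions. The monitor delivers the rank of the currently valid guarantee, analysis and planning pick the most capable configuration that this guarantee permits, and execution only happens if the configuration has to change.

```python
from typing import Callable, List, Tuple

# Hypothetical reconfiguration model: (configuration, worst guarantee rank it tolerates),
# ordered from most to least capable. Rank 1 denotes the best guarantee.
CONFIGS: List[Tuple[str, int]] = [
    ("full_automation", 1),
    ("restricted", 2),
    ("manual_only", 3),
]


def plan(rank: int) -> str:
    """Pick the most capable configuration permitted by the current guarantee."""
    for name, worst_tolerated in CONFIGS:
        if rank <= worst_tolerated:
            return name
    return CONFIGS[-1][0]  # fall back to the most constrained configuration


def mape_step(active: str,
              monitor: Callable[[], int],
              execute: Callable[[str], None]) -> str:
    rank = monitor()      # Monitor: determine the currently valid guarantee
    target = plan(rank)   # Analyze and plan: which configuration fits this guarantee?
    if target != active:  # Execute only if an adaptation is needed
        execute(target)
    return target


executed: List[str] = []
active = mape_step("full_automation", lambda: 2, executed.append)
print(active, executed)
```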
4.2 Example
We further illustrate ConSerts with an example from the agricultural domain, which has previously been presented in [4]. The agricultural domain today pioneers innovative applications involving systems of systems and dynamic integration. One of these applications is the so-called Tractor Implement Management (TIM). The TIM functionality enables agricultural implements to control typical tractor functions such as velocity, steering angle, power take-off, or auxiliary valves. It is possible to fully automate implement-specific work procedures and to optimize them with respect to parameters such as performance, efficiency, or wear and tear. TIM utilizes a standardized bus system for communication between the participating devices and machines. During TIM operation, control is typically assumed by the implement. It uses the TIM functions of the tractor (e.g., controlling velocity, steering angle, power take-off, hydraulic valves) and auxiliary devices, such as sensors for the respective automation purpose, displays data to the operator, and executes operator inputs. Between different tractors, implements, and auxiliary devices such as virtual terminals (providing the operator UI) or GPS systems of different manufacturers, a huge space of configurations arises, which makes it infeasible to analyze each potential configuration a priori at development time. For this reason, those TIM applications already available on the market today only work for predetermined concrete pairs of tractors and implements whose integration has been thoroughly analyzed at development time by the involved manufacturers.
The benefit of ConSerts (or SM@RT in general) in this setting is pretty clear. Assume there is a farmer who owns a TIM-capable tractor and a TIM-capable round baler. The TIM baling application is running on the implement, and the user interface is displayed on a virtual terminal in the cabin. In addition to a standard configuration, the baling application also supports an extended configuration, which additionally incorporates a swath scanner device. This device is mounted on the front of the tractor and measures the volumetric flow and the location of the swath to further optimize the baling operation in terms of tractor speed and steering angle. The baling application can then be enabled when tractor, implement, virtual terminal, and swath scanner are connected and the automated ConSert-based interoperability and safety checks have been successful. Corresponding information is provided to the operator via the terminal. Parameterization and constraints are set appropriately. The actual round baling process can then be activated by the operator, who thus relinquishes control to the round baler. The round baler commands the tractor to drive over the swath with optimal acceleration rates and speed. When the bale reaches a preset size, the tractor decelerates to a standstill and the bale is ejected. The process can then be re-started by the operator.
4.3 Engineering of ConSerts
For the engineering of ConSerts, the role and viewpoint of the implement manufacturer are assumed in this example. The goal of the manufacturer is to develop a round baler with TIM support. From a functional point of view, it is known by the manufacturer (due to existing standards) what the interfaces between the potential participants look like and how they are to be used. However, the implement manufacturer does not know anything about the safety properties of these functions.
From a safety point of view, the engineering of the baling application starts top-down with an application-level hazard and risk analysis. Assume that the agricultural manufacturers agreed by convention that during the operation of a TIM application, the application (and thus the application manufacturer) has the responsibility for the overall automated system. Therefore, the safety engineering goal is to ensure adequate safety not only for the TIM baling application or for the implement, but for the whole collaboration of systems that will be rendering the application service at runtime. Thanks to the ConSert-based modularization, it is thus sufficient to only consider the direct dependencies of the system under development on its environment. Potential “external” safety requirements will be associated either with demands regarding required services or with RtEs. At runtime, it will be determined whether the demands can be satisfied based on guarantees given by external systems (which, in turn, might have demands depending on yet (an)other external system(s)). This negotiation can thus range across several layers and incorporate a series of dependent systems and guarantee-demand relationships.
Relating the ConSerts from the example to the classification of B-Space models, we are talking about quality models (i.e., safety). ConSerts will mostly be utilized with an operational scope, but they might as well be input to considerations with a tactical or even strategic scope (e.g., strategic deployment of TIM machines with respect to weather conditions so that their performance over a season can be optimized). As regards the B-Space level where ConSerts reside, this is at least twofold. There are ConSerts at the system level, but due to their compositional and hierarchical nature, there is always at least one ConSert in a composition hierarchy that is at the mission level – the ConSert of the root node of the hierarchy. For this ConSert, the overall collaboration of the cooperative application (i.e., mission) has been considered in the corresponding safety engineering as exemplified above for the TIM baling application. Beyond that, it would also be conceivable to utilize ConSerts at the levels above, system of systems and ecosystems. However, to date we have not engineered any in-depth case study in that respect.
Going back to the engineering of the baler, relevant hazards of the TIM baling application might be self-acceleration or self-steering during operation or self-acceleration or power take-off during standstill. These hazards would be assessed with respect to their associated risks based on the risk assessment tables provided by ISO 25119 (i.e., the safety standard of the agricultural domain). In a subsequent step, corresponding top-level safety requirements would be derived. In addition, reasonable guarantee levels need to be identified. In the given example, it is conceivable that different safety guarantee levels are required for different locations (e.g., a remote field vs. a field close to a children’s playground) or that guarantee levels are defined in interplay with application-specific parameters (e.g., acceleration or velocity levels, different degrees of automation, etc.) or with relevant conditions of the operating environment (weather, topography).
The next step is to develop a safety concept that ensures the satisfaction of the safety requirements and of the associated ConSert guarantees. This is done in a standard way: Safety analyses are applied to identify cause-effect relationships and to specify the failure logic; corresponding safety measures are identified; and, eventually, a conclusive safety argument is built up that factors in suitable evidence. The main difference to the engineering of closed systems is that besides possible internal causes, there might be external causes that may either be associated with safety properties of the required services or with RtEs. Moreover, there is also some degree of variability to be considered due to different ConSert guarantees and corresponding differences in the correlated demands.
With regard to the causes related to required services, there are two possibilities. First, it is possible to define internal measures, such as error detection mechanisms, so that failures of the required services can be tolerated. Alternatively or in addition, it is possible to demand that the external service provider has to guarantee certain safety properties for the service. These safety properties need to be formalized and standardized for a domain in order to constitute the basis for the definition of ConSert guarantees and demands. As for the RtEs, two categories can be distinguished: intra-device and inter-device RtEs. The former can be designed and implemented rather freely because they do not require any information from other external systems. The latter do require such information and thus need to be standardized or at least described in guidelines for a given domain. In reference to the example, assume that there is a top-level safety requirement that self-acceleration must not occur during standstill. Based on the hazard and risk analyses, it has been determined that this requirement needs to be assigned the integrity level AgPL d. However, this is due to a relatively high degree of exposure assumed for bystanders, as would be the case for operation in the vicinity of a residential area. In other areas, AgPL c would be sufficient. With ConSerts, it is now conceivable to optimize the trade-off between availability and safety by factoring in dynamic context knowledge. Specifically, this means that it would be possible to use, for instance, a GPS position and (in this case) annotated map data to distinguish between different usage contexts of the TIM system that imply different levels of safety requirements. Or, based on the vision of B-Spaces, we might have a wealth of up-to-date context information from and about systems and persons in the vicinity at our disposal.
Of course, such a context-sensitivity mechanism needs to be safe in its own right, but for now let us just assume this can be done.
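A context-dependent determination of the required integrity level could be sketched as follows. The sketch is purely illustrative: the bounding-box encoding of the annotated map, the coordinates, and the function names are assumptions, and the conservative fallback to AgPL d when no position is available is a design choice of this sketch rather than something prescribed by ISO 25119. The ordering of levels assumes the scale AgPL a (lowest) to AgPL e (highest).

```python
from typing import List, Optional, Tuple

# Hypothetical map annotation: (min_lat, min_lon, max_lat, max_lon) of residential areas.
Box = Tuple[float, float, float, float]


def required_agpl(position: Optional[Tuple[float, float]],
                  residential_areas: List[Box]) -> str:
    """AgPL required for 'no self-acceleration during standstill' in the current context."""
    if position is None:
        return "d"  # no trustworthy position: assume the demanding context
    lat, lon = position
    for min_lat, min_lon, max_lat, max_lon in residential_areas:
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            return "d"  # vicinity of a residential area
    return "c"  # other areas


def satisfies(guaranteed: str, required: str) -> bool:
    """True if the guaranteed level is at least as high as the required one."""
    order = "abcde"
    return order.index(guaranteed) >= order.index(required)


areas = [(49.40, 7.70, 49.45, 7.80)]  # made-up coordinates
print(required_agpl((49.42, 7.75), areas), required_agpl((49.60, 7.75), areas))
print(satisfies("c", required_agpl((49.60, 7.75), areas)))
```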
Thus, following these considerations, three different levels of ConSert guarantees are defined in the example: a) a high integrity one, enabling full automation features of the TIM application; b) a medium integrity one, enabling full features only in specific areas (or, alternatively, enabling operation with some constraints); and c) a default guarantee that can always be granted, enabling only a very constrained operation, e.g., without acceleration from standstill or automated steering. The high integrity guarantee would include AgPL d for self-acceleration in standstill as well as a series of other relevant guarantees omitted here for the sake of simplicity. The specification of the guarantee given next is based on a grammar [14] and on service types, safety property types, and rules of refinement specified in a domain-specific standard or guidelines:
TIMBalingSwSc(1): AgPL = b, SelfAcc{,Standstill}.AgPL = d, LateAcc{30s,Standstill}.AgPL = d, (...)
The first element of the specification denominates the associated service (by type) and gives an (absolute) order number for the guarantee (from 1 (best) to n (worst)). More sophisticated orderings could be useful but have not been considered yet for ConSerts. The next element describes an integrity level for the whole service. This is basically a shortcut and implies that all safety properties of the service (as specified by the standard or guideline) are guaranteed with the named integrity level. Then follows a series of concrete safety properties, whose types and parameters (for refinement) are also defined by the standard.
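To illustrate how such a specification decomposes into the elements just described, the following sketch parses the visible part of the example. It is not an implementation of the grammar from [14], which is not reproduced here: the regular expressions are inferred from this single example (with the integrity-level keyword written uniformly as AgPL), the class and function names are hypothetical, and the elided remainder of the specification is simply left out.

```python
import re
from typing import Dict, NamedTuple, Tuple


class GuaranteeSpec(NamedTuple):
    service: str          # associated service type
    order: int            # absolute order number, 1 = best
    service_level: str    # integrity level for the whole service
    # (property type, refinement parameters) -> integrity level
    properties: Dict[Tuple[str, Tuple[str, ...]], str]


# Patterns inferred from the single example in the text (assumption).
HEADER = re.compile(r"^(\w+)\((\d+)\):\s*AgPL\s*=\s*(\w)")
PROPERTY = re.compile(r"(\w+)\{([^}]*)\}\.AgPL\s*=\s*(\w)")


def parse_guarantee(spec: str) -> GuaranteeSpec:
    header = HEADER.match(spec)
    if header is None:
        raise ValueError("unrecognized guarantee header")
    service, order, level = header.groups()
    properties = {}
    for name, params, prop_level in PROPERTY.findall(spec):
        # An empty parameter slot is kept as an empty string.
        key = (name, tuple(p.strip() for p in params.split(",")))
        properties[key] = prop_level
    return GuaranteeSpec(service, int(order), level, properties)


spec = ("TIMBalingSwSc(1): AgPL = b, SelfAcc{,Standstill}.AgPL = d, "
        "LateAcc{30s,Standstill}.AgPL = d")
parsed = parse_guarantee(spec)
print(parsed.service, parsed.order, parsed.service_level)
print(parsed.properties)
```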
The next step from the ConSert perspective is to determine the demands (i.e., service-related demands as well as RtEs) that relate to the identified guarantees. This relation is modeled by means of a Boolean function, where the demands are input variables and the guarantee is the output variable. There is also a corresponding graphical specification technique based on directed acyclic graphs, where each function is represented by a tuple (D, R, BG, E, g): a set of Boolean input variables D representing service-related demands and RtEs R, a set of Boolean gates BG, a set of directed edges E connecting the elements, and a Boolean output variable g.
Overall, a ConSert is a set of such mapping functions, one for each guarantee level (of each offered service/function of a unit of composition). A unit of composition is typically a self-contained piece of hardware and software, i.e., a system. But it could also be just a piece of software, i.e., an application. If an application is deployed concurrently with other applications on shared hardware, and particularly if dynamic download and update shall be possible, it is also important to include vertical dependencies between applications and (shared) resources in the equation. A corresponding extension of ConSerts has been developed in the EMC2 project [15,16,17].
The definition of the guarantees, the demands, and the mapping functions is generally done conjointly with the development of the safety concept and safety argument. In fact, the resulting ConSert becomes an integral part of the safety argument because it needs to be shown in a convincing manner that the ConSert guarantees are actually valid given the fulfillment of their related demands.
4.4 ConSerts at runtime
ConSerts need to be transferred into a machine-readable form to enable dynamic evaluation, and there need to be corresponding mechanisms built into the systems that operate on this information (i.e., ConSerts as SM@RT) to conduct the evaluation. Of course, the evaluation protocols need to be standardized to ensure that every participating system is interoperable from a ConSert point of view. Assume that the operator has installed the swath scanner on the tractor and that tractor and round baler are coupled. The operator initiates TIM via the virtual terminal and explicitly selects the application variant that provides flexible control of speed and steering based on the input from the swath scanner. The first step is now to start the application, i.e., to dynamically integrate the participating systems. After this has succeeded, the evaluation of the safety guarantees of the application is started. Note that the application forms the root of a dynamically formed composition hierarchy (in contrast to basic services or functions, which are rendered by lower-level components/systems and which are consumed by superordinate components/systems) and the correlated ConSert has the scope of this whole system-of-systems application. The evaluation of ConSerts starts from the leaf systems that have no external service-related dependencies. These systems determine their RtEs and propagate them up in the composition hierarchy. Eventually, all service-related demands of the root (i.e., the TIM baling application) can be checked and, together with the evaluation results of the RtEs, the top-level safety guarantees are determined.