
Open Access 04.05.2024

A framework for monitored dynamic slicing of reaction systems

Authors: Linda Brodo, Roberto Bruni, Moreno Falaschi

Published in: Natural Computing


Abstract

Reaction systems (RSs) are a computational framework inspired by biochemical mechanisms. A RS defines a finite set of reactions over a finite set of entities. Typically each reaction has a local scope, because it is concerned with a small set of entities, but complex models can involve a large number of reactions and entities, and their computation can manifest unforeseen emergent behaviours. When a deviation is detected, like the unexpected production of some entities, it is often difficult to establish its causes, e.g., which entities were directly responsible or whether some reaction was misconceived. Slicing is a well-known technique for debugging, which can point out the program lines containing the faulty code. In this paper, we define the first dynamic slicer for RSs and show that it can help to detect the causes of erroneous behaviour and highlight the involved reactions for a closer inspection. To fully automate the debugging process, we propose to distil monitors that start the slicing whenever a violation of a safety specification is detected. We have integrated our slicer in BioReSolve, written in Prolog, which provides many useful features for the formal analysis of RSs. We define the slicing algorithm for basic RSs and then enhance it for dealing with quantitative extensions of RSs, where timed processes and linear processes can be represented. Our framework is shown at work on suitable biologically inspired RS models.
Notes
Linda Brodo, Roberto Bruni and Moreno Falaschi have contributed equally to this work.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Reaction Systems (RSs) (Ehrenfeucht and Rozenberg 2007b; Brijder et al. 2011a) are a computational framework inspired by systems of living cells. Their constituents are a finite set of entities and a finite set of reactions. Each reaction is a triple that consists of: a set of entities whose presence is needed to enable the reaction, called reactants; a set of entities whose absence is needed to enable the reaction, called inhibitors; and a set of entities that are produced when the reaction takes place, called products. RSs have been shown to be a general computational model whose application ranges from the modelling of biological phenomena (Azimi et al. 2014; Corolli et al. 2012; Azimi 2017; Barbuti et al. 2016) to molecular chemistry (Okubo and Yokomori 2016). The classical semantics of RSs is defined as a reduction system whose states are sets of entities (those produced at the previous step, possibly joined with others provided by an external context, thus modelling the interaction with the environment). Notably, as reactions exploit facilitation and inhibition mechanisms, the behaviour of RSs is generally non-monotonic, i.e., a computational effect, like the production of a given entity, is not necessarily preserved when more entities are present in the source state.

1.1 Problem statement

Several tools are already available to simulate RSs or to verify that certain properties are met. Besides our own prototype BioReSolve1 proposed in Brodo et al. (2021a, 2023b) and further extended in this paper, some notable examples are, e.g.: brsim,2 a Basic Reaction System Simulator written in Haskell and distributed under the terms of the GNU GPLv3 license (Azimi et al. 2015); its online version WEBRSIM,3 which makes all functionalities of brsim available through a friendly web interface (Ivanov et al. 2018); the GPU-based Highly Efficient REaction SYstem simulator HERESY,4 which exploits the large number of computational units inside GPUs to boost performance (Nobile et al. 2017); and the optimised Common Lisp simulator for RSs cl-rs,5 presented in Ferretti et al. (2020).
However, the design of RSs for modelling some natural phenomenon is often done by domain experts to validate their hypotheses and requires some degree of abstraction. Consequently, inaccuracies or false assumptions may be introduced as early as the design stage. In addition, writing reactions is an error-prone activity, and verifying their execution can be difficult even for RSs with a couple of dozen entities and reactions. For example, if some mistake is made at the design level and some inexplicable result is observed during the simulation, then a manual inspection of the computation may be necessary in order to understand the nature of the problem.
The aim of this paper is to propose the first automatic technique to ease the debugging of RSs. In fact, we are not aware of any debugging systems available for the above RS tools.

1.2 The approach

Slicing is the classical technique to circumscribe the most relevant parts of a program that may have caused a certain observable effect. Slicing was first introduced as a static technique by Weiser (1984). It was then extended in Korel and Laski (1988) with the so-called dynamic program slicing, which supports the debugging process by selecting a portion of the program containing the faulty code. Dynamic program slicing has been applied to several programming paradigms (see Silva (2012) for a survey).
The idea explored in this paper is to tailor the slicing technique to RSs and then to automatically trigger the slicing of the computation as soon as certain events are detected. The expected benefit is to discard all irrelevant entities and reactions and to focus the attention on those directly responsible for the observed undesirable effect.
In order to trigger the slicing in a fully automated manner we adapt the monitoring framework in Aceto et al. (2021a) to the setting of RSs. To this aim, first we build on the available Labelled Transition System (LTS) semantics of RSs defined in Brodo et al. (2021a) following the well-known Structural Operational Semantics (SOS) style (Plotkin 1981, 2004). Second, we introduce a flexible modal logic of assertions over transition labels that can be used to express safety conditions. Formulas are then used to synthesize suitable monitors for the LTS semantics of RSs: they inspect the labels as the computation progresses and when a violation is detected they trigger the slicing of the trace from the faulty state.
Our slicing framework follows three main steps as in Falaschi et al. (2016) (there instantiated for the quite different setting of Concurrent Constraint Programming, which is a monotonic language). First, the dynamics of RSs is extended to an enriched semantics that considers states paired with their computation history. Second, we take a partial computation which is considered faulty by the user, or which is determined automatically as faulty by the monitor: the fault is identified by marking a set of entities in the last state reached by the computation. The marked entities are considered responsible for the bug and the goal of the slicing is to determine the causes which led to their production. The third step is the execution of an algorithm that removes from the history any information that is irrelevant w.r.t. the production of the marked entities.

1.3 Contribution

First, we consider a slicing algorithm for ‘closed’ RSs, i.e., RSs where the environment provides just a set of initial entities and then no longer interacts at subsequent computation steps. Then, we extend the slicing algorithm to take into account the step-by-step interaction of RSs with the environment, represented by a ‘context’, which can provide new entities at each computation step. In both cases, we show that the sliced computation is a simplification of the original one, which highlights the entities, contexts and reactions that are essential to produce the marked entities. Then we show how to define monitors to fully automate the slicing process.
Finally, we consider slicing for enhanced RSs with quantitative features, such as delays, duration and linearly constrained reactions.
We have developed a prototype implementation in Prolog, freely available online. Here the use of a declarative approach is useful to guarantee the correctness of the implementation, which closely mirrors the theoretical definitions, and it can easily be modified to accommodate several quantitative extensions.

1.4 Organisation

In Sect. 2 we summarise the basics of RSs. In Sect. 3 we recall a more convenient process algebra for RSs (Brodo et al. 2021a). In Sect. 4 we define the slicing technique for RSs. In Sect. 5 we show a biological example to illustrate our framework and implemented tools. In Sect. 6 we present a specialised assertion language and we introduce monitors. Monitors can be derived from safety formulas in modal logic to automatically mark the entities and start the slicing. In Sect. 7 we recall the definition of extensions of RSs for modelling quantitative information, namely processes with delays and processes with quantities which can be computed by solving linear expressions. Then we show how the slicing algorithm naturally extends to delays and linear quantities. We discuss some related work in Sect. 8 and future work in Sect. 9, together with concluding remarks.
This paper is an extended version of Brodo et al. (2023a). We have added an entire new section to show that our slicing framework is flexible and that our algorithm can be easily modified to deal with previous extensions of RSs such as timed processes and linear processes, which introduce quantitative information in RSs. We have included the correctness proof of our base slicing algorithm. We have added some examples and further explanations to improve the readability of this paper and we have introduced a better and clearer notation to represent the RS computations and their sliced simplifications.

2 Reaction systems

The theory of Reaction Systems (RSs) was born in the field of Natural Computing to model the qualitative behaviour of biochemical reactions in living cells. We briefly account here for the main concepts as introduced in the classical set theoretic presentation of RSs (Brijder et al. 2011a). In the following, we use the term entities to denote generic molecular substances (e.g., atoms, ions, molecules) that may be present in the states of a biochemical system.
Let S be a (finite) set of entities; we call a state any subset \(W\subseteq S\). A reaction in S is a triple \(r = (R,I,P)\), where \(R, I, P\subseteq S\) are finite, non empty sets and \(R \cap I = \emptyset \). The sets R, I, P are the sets of reactants, inhibitors, and products, respectively. We let \( rac (S)\) denote the set of all reactions over S.
Given a state \(W\subseteq S\), the result of a reaction \(r = (R,I,P)\in rac (S)\) on W, denoted by \( res _r(W)\), is defined as follows, where \( en _r(W)\) is called the enabling predicate of r:
$$\begin{aligned}{} & {} res _r(W) \triangleq \; \left\{ \begin{array}{ll} P &{} \hbox { if}\ en _r(W)\\ \emptyset &{} \text{ otherwise } \end{array} \right. \qquad \\{} & {} en _r(W)\triangleq \; R \subseteq W\; \wedge \; I \cap W = \emptyset \end{aligned}$$
From the above definition it follows that all reactants have to be present in the current state for the reaction to take place, while the presence of any of the inhibitors would block the reaction. Products are the outcome of the reaction, to be released in the next state.
A Reaction System is a pair \(\mathcal{A} = (S, A)\) where S is the set of entities, and \(A \subseteq rac (S)\) is a finite set of reactions over S. Given \(W\subseteq S\), the result of the reactions A on W, denoted \( res _A(W)\), is just the lifting of \( res _r\), i.e., \( res _A(W) \triangleq \cup _{r \in A} res _r(W)\).
Since living cells are seen as open systems that react to environmental stimuli, the behaviour of a RS is formalised in terms of an interactive process. Let \(\mathcal{A} = (S, A)\) be a RS and let \(n \ge 0\). An \((n+1)\)-steps interactive process in \(\mathcal{A}\) is a pair \(\pi = (\gamma , \delta )\) s.t. \( \gamma =\{C_i\}_{i\in [0,n]}\) is the context sequence and \(\delta =\{D_i\}_{i\in [0,n]} \) is the result sequence, where \( C_{i}, D_{i} \subseteq S\) for any \(i\in [0,n]\), \(D_0 = \emptyset \), and \(D_{i+1} = res _{A}(D_{i} \cup C_{i})\) for any \(i \in [0,n-1]\). The context sequence \(\gamma \) represents the environment, while the result sequence \(\delta \) is entirely determined by \(\gamma \) and the set of reactions A. We call \(\tau = W_{0}, \ldots , W_{n}\) the state sequence, with \(W_{i} \triangleq C_{i} \cup D_{i}\) for any \(i \in [0, n]\). Note that each state \(W_{i}\) in \(\tau \) is the union of the context \(C_{i}\) at step i and the result set \(D_i= res _{A}(W_{i-1})\) from step \(i-1\). Note also that the result of a computation step does not depend on the order of application of the reactions.
Example 1
We consider the simple RS \(\mathcal{A} \triangleq (S, A)\), where \(S\triangleq \{\textsf{a},\textsf{b},\textsf{c},\textsf{d}\}\) and the set \(A\triangleq \{r_1,r_2\}\) contains the reactions \(r_1\triangleq (\{\textsf{a}\},\{\textsf{d}\},\{\textsf{b},\textsf{c}\})\) and \(r_2\triangleq (\{\textsf{a},\textsf{c}\},\{\textsf{d}\},\{\textsf{d}\})\), more concisely written as \((\textsf{a}, \textsf{d},\textsf{b}\textsf{c})\) and \((\textsf{a}\textsf{c}, \textsf{d},\textsf{d})\). Then, we consider a \(3 - steps\) interactive process \(\pi \triangleq (\gamma ,\delta )\), where \(\gamma \triangleq C_0,C_1,C_2 =\{\textsf{a},\textsf{b}\},\{ \textsf{a},\textsf{b}\},\{\textsf{a},\textsf{b}\}\) and \(\delta \triangleq D_0,D_1,D_2=\emptyset ,\{\textsf{b},\textsf{c}\},\{\textsf{b},\textsf{c},\textsf{d}\}\). The resulting state sequence is \(\tau \triangleq W_0, W_1, W_2= \{\textsf{a},\textsf{b}\}, \{\textsf{a},\textsf{b},\textsf{c}\}, \{\textsf{a},\textsf{b},\textsf{c},\textsf{d}\}\). In fact, it is easy to check that, e.g., \(W_0 =C_0\), \(D_1 = res _{A}(W_0) = res _{A}(\{\textsf{a},\textsf{b}\}) = \{\textsf{b},\textsf{c}\}\) because \( en _{r_1}(W_0)\wedge \lnot en _{r_2}(W_0)\), and \(W_1 = C_1\cup D_1 = \{ \textsf{a},\textsf{b}\}\cup \{ \textsf{b},\textsf{c}\} = \{\textsf{a}, \textsf{b},\textsf{c}\}\).
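For readers more comfortable with code than with set notation, the following Python sketch (our own rendering, not part of BioReSolve; all names are illustrative) implements the enabling predicate, the result function and the state sequence, and reproduces Example 1.

```python
# A minimal sketch of the set-theoretic semantics of Sect. 2 (names are ours).
# Entities are single characters, states are frozensets of entities.

def en(r, W):
    """Enabling predicate en_r(W): all reactants in W, no inhibitor in W."""
    R, I, _P = r
    return R <= W and not (I & W)

def res(A, W):
    """Result res_A(W): union of the products of all enabled reactions."""
    out = set()
    for r in A:
        if en(r, W):
            out |= r[2]
    return frozenset(out)

def state_sequence(A, gamma):
    """State sequence tau with W_i = D_i | C_i and D_{i+1} = res_A(W_i)."""
    D, tau = frozenset(), []
    for C in gamma:
        W = D | C
        tau.append(W)
        D = res(A, W)
    return tau

# Example 1: r1 = (a, d, bc), r2 = (ac, d, d), gamma = ab.ab.ab
r1 = (frozenset("a"), frozenset("d"), frozenset("bc"))
r2 = (frozenset("ac"), frozenset("d"), frozenset("d"))
print([sorted(W) for W in state_sequence([r1, r2], [frozenset("ab")] * 3)])
# [['a', 'b'], ['a', 'b', 'c'], ['a', 'b', 'c', 'd']]
```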

3 SOS rules for reaction systems

In order to define our automatic slicing technique, we find it convenient to exploit the algebraic syntax for RSs introduced in Brodo et al. (2021a). Inspired by process algebras such as CCS (Milner 1980), simple SOS inference rules define the behaviour of each operator. This induces a LTS semantics for RSs, where states are terms of the algebra, each transition corresponds to a step of the RS and transition labels retain some information on the entities involved at each step. Transition labels and SOS rules will allow us to pair RSs with monitors and to easily enrich state information with histories, respectively.
Definition 1
(RS processes) Let S be a set of entities. A RS process \(\textsf{P}\) is any term defined by the following grammar:
$$\begin{aligned}{} & {} \textsf {P}{:}{=}[\textsf {M}] \qquad \textsf {M}{:}{=}(R,I,P)\; \big \vert \; D\; \big \vert \; \textsf {K}\; \big \vert \; \textsf {M}|\textsf {M}\\{} & {} \textsf {K}{:}{:}{=}\textbf{0}\; \big \vert \; X\; \big \vert \; C.\textsf {K}\; \big \vert \; \textsf {K}+\textsf {K}\; \big \vert \; \textsf {rec}~X.~\textsf {K} \end{aligned}$$
where \(R,I,P\subseteq S\) are non empty sets of entities, \(C,D\subseteq S\) are possibly empty sets of entities, and X is a process variable.
While in principle reactions, entities and contexts could be seen as components to be handled separately (and in the implementation it is more convenient to do so), we mix them together to provide a compositional account of their interactions. A RS process \(\textsf{P}\) embeds a mixture process \(\textsf{M}\) obtained as the parallel composition of some reactions (R, I, P), some set of current entities D (possibly the empty set), and some context processes \(\textsf{K}\). We write \(\prod _{i\in I} \textsf{M}_i\) for the parallel composition of all \(\textsf{M}_i\) with \(i\in I\).
A context process \(\textsf{K}\) is a possibly non-deterministic and recursive system: the nil context \(\textbf{0}\) stops the computation; the prefixed context \(C.\textsf{K}\) makes the entities C available to the reactions at the current step, and then lets \(\textsf{K}\) be the context offered at the next step; the non-deterministic choice \(\textsf{K}_1+\textsf{K}_2\) allows the context to behave as either \(\textsf{K}_1\) or \(\textsf{K}_2\); X is a process variable, and \(\textsf{rec}~X.~\textsf{K}\) is the usual recursive operator of process algebras. Choice and recursion can combine in-breadth (different evolutions) and in-depth (evolutions of different length, possibly infinite) analysis. The ability to compose contexts in parallel is useful, e.g., to handle different entities with different strategies or to reduce combinatorial explosion at the specification level. For example, letting \(\textsf{K}^?_{\textsf{a}} \triangleq \textsf{rec}~X.~\textsf{a}.X + \emptyset .X\) be the context that can recursively offer \(\textsf{a}\) or not at any step, the process \(\textsf{K}^?_{\textsf{a}} | \textsf{K}^?_{\textsf{b}} | \textsf{K}^?_{\textsf{c}}\) can recursively offer any combination of the entities \(\textsf{a}\), \(\textsf{b}\) and \(\textsf{c}\).
We say that \(\textsf{P}\) and \(\textsf{P}'\) are structurally equivalent, written \(\textsf{P} \equiv \textsf{P}'\), when they denote the same term up to the laws of commutative monoids (unit, associativity and commutativity) for parallel composition \(\cdot | \cdot \), with \(\emptyset \) as the unit, and the laws of idempotent and commutative monoids for choice \(\cdot +\cdot \), with \(\textbf{0}\) as the unit. We also assume \(D_1 | D_2 \equiv D_1\cup D_2\) for any \(D_1,D_2\subseteq S\).
Definition 2
(RSs as RS processes) Let \(\mathcal{A} = (S, A)\) be a RS, and \(\pi = (\gamma , \delta )\) an \((n+1)\)-steps interactive process in \(\mathcal{A}\), with \(\gamma = \{C_i\}_{i\in [0,n]}\) and \(\delta = \{D_i\}_{i\in [0,n]}\). For any step \(i\in [0,n]\), the corresponding RS process \(\llbracket \mathcal{A},\pi \rrbracket _i\) is defined as follows:
$$\begin{aligned} \llbracket \mathcal{A},\pi \rrbracket _i \triangleq \left[ \prod _{r\in A} r~|~D_i~|~\textsf{K}_{\gamma ^i}\right] \end{aligned}$$
where the context \(\textsf{K}_{\gamma ^i} \triangleq C_i.C_{i+1}.\cdots .C_n.\textbf{0}\) is the serialisation of the entities offered by \(\gamma ^i\) (the shifting of \(\gamma \) at the i-th step). We write \(\llbracket \mathcal{A},\pi \rrbracket \) as a shorthand for \(\llbracket \mathcal{A},\pi \rrbracket _0\).
Example 2
The encoding of the RS \(\mathcal{A} = (S, A)\), in Example 1, is as follows:
$$\begin{aligned}{} & {} \textsf{P}_0 \triangleq \llbracket \mathcal{A},\pi \rrbracket = [(\textsf{a}, \textsf{d},\textsf{b}\textsf{c})~|~(\textsf{a}\textsf{c}, \textsf{d},\textsf{d})~|~ \emptyset ~|~\textsf{K}_{\gamma }]\\{} & {} \equiv [(\textsf{a}, \textsf{d},\textsf{b}\textsf{c})~|~(\textsf{a}\textsf{c}, \textsf{d},\textsf{d})~|~ \textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}] \end{aligned}$$
where \(\textsf{K}_{\gamma } = \{\textsf{a},\textsf{b}\}.\{\textsf{a},\textsf{b}\}. \{\textsf{a},\textsf{b}\}.\textbf{0}\) is written more concisely as \(\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}\). Note that \(D_0=\emptyset \) is inessential and can be discarded thanks to structural congruence.
A transition label \(\ell \) is a tuple \(\langle (D,C)\vartriangleright R,I,P\rangle \) with \(D,C,R,I,P\subseteq S\). The sets D, C record the entities available in the current state of the system, where C is provided by the context and D is the result from the previous step; the set R records entities whose presence is assumed (either acting as reactants or as inhibitors); the set I records entities whose absence is assumed (either acting as inhibitors or as missing reactants); and the set P records the products of enabled reactions.
The operational semantics of RS processes is defined by the SOS rules in Fig. 1.
The process \(\textbf{0}\) has no transition. The rule \((\textit{Ent})\) makes available the entities in the (possibly empty) set D, then reduces to \(\emptyset \). The rule \((\textit{Cxt})\) says that a prefixed context process \(C.\textsf{K}\) makes available the entities in the set C and then reduces to \(\textsf{K}\). The rule \((\textit{Rec})\) is the classical rule for recursion. The rules \((\textit{Suml})\) and \((\textit{Sumr})\) select a move of either the left or the right component, resp., discarding the other process. The rule \((\textit{Pro})\) executes the reaction (R, I, P) (its reactants, inhibitors, and products are recorded in the label), which remains available at the next step together with the newly produced entities P. The rule \((\textit{Inh})\) applies when the reaction (R, I, P) should not be executed; its label records the causes for which the reaction is disabled: possibly some inhibiting entities \((J \subseteq I)\) are present or some reactants \((Q \subseteq R)\) are missing, with \(J \cup Q \ne \emptyset \), as at least one cause is needed. The rule \((\textit{Par})\) puts two processes in parallel by pooling their labels and joining all the set components of the labels. The sanity check \(\ell _1\frown \ell _2\) is required to guarantee that reactants and inhibitors are consistent (see the definition below, where we let \(W_i \triangleq D_i \cup C_i\) for brevity):
$$\begin{aligned}{} & {} \langle (D_1,C_1)\vartriangleright R_1,I_1,P_1\rangle \frown \langle (D_2,C_2)\vartriangleright R_2,I_2,P_2\rangle \\{} & {} \quad \triangleq (W_1\cup W_2 \cup R_1 \cup R_2) \cap (I_1 \cup I_2) = \emptyset \end{aligned}$$
In the conclusion of rule \((\textit{Par})\) we write \(\ell _1\cup \ell _2\) for the component-wise union of labels (see definition below, where the notation \(X_{1,2} \triangleq X_1 \cup X_2\) is used):
$$\begin{aligned}{} & {} \langle (D_1,C_1)\vartriangleright R_1,I_1,P_1\rangle \cup \langle (D_2,C_2)\vartriangleright R_2,I_2,P_2\rangle \\{} & {} \quad \triangleq \langle (D_{1,2},C_{1,2})\vartriangleright R_{1,2},I_{1,2},P_{1,2}\rangle \end{aligned}$$
Finally, the rule \((\textit{Sys})\) requires that all the processes of the systems have been considered, and also checks that all the needed reactants are actually available in the system (\(R \subseteq D \cup C\)). In fact this constraint can only be met on top of all processes. The check that inhibitors are absent (\(I\cap (D \cup C) = \emptyset \)) is embedded in rule \((\textit{Par})\). In the following we assume transitions \(\textsf{P}\xrightarrow {\langle (D,C)\vartriangleright R,I,P\rangle } \textsf{P}'\) guarantee that any instance of the rule \((\textit{Inh})\) is applied in a way that maximises the sets J and Q (see Brodo et al. 2021a).
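As a small illustration of how rule \((\textit{Par})\) combines labels, here is a Python sketch (labels encoded as 5-tuples of sets; the encoding and names are ours) of the consistency check \(\ell _1\frown \ell _2\) and of the component-wise union \(\ell _1\cup \ell _2\), applied to the labels pooled in the second transition of Example 3 below.

```python
# Sketch of the label operations used by rule (Par): labels are tuples
# (D, C, R, I, P) of sets; the encoding is illustrative only.

def consistent(l1, l2):
    """l1 ⌢ l2: present/assumed entities must be disjoint from inhibitors."""
    (D1, C1, R1, I1, P1), (D2, C2, R2, I2, P2) = l1, l2
    return not ((D1 | C1 | D2 | C2 | R1 | R2) & (I1 | I2))

def union(l1, l2):
    """l1 ∪ l2: component-wise union of the two labels."""
    return tuple(x | y for x, y in zip(l1, l2))

# Contributions pooled in the second transition of Example 3:
# entities b, c, the context prefix ab and reaction r1 ...
l1 = ({'b', 'c'}, {'a', 'b'}, {'a'}, {'d'}, {'b', 'c'})
# ... and reaction r2, executed by rule (Pro).
l2 = (set(), set(), {'a', 'c'}, {'d'}, {'d'})

print(consistent(l1, l2))   # True: {a,b,c} is disjoint from the inhibitors {d}
print(union(l1, l2))        # ({b,c}, {a,b}, {a,c}, {d}, {b,c,d}), as in Example 3
```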
Example 3
Let us consider the RS process \(\textsf{P}_0\triangleq [\textsf{M}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}]\) from Example 2, where we let \(\textsf{M}\triangleq (\textsf{a}, \textsf{d},\textsf{b}\textsf{c})~|~(\textsf{a}\textsf{c}, \textsf{d},\textsf{d})\) for brevity. The process \(\textsf{P}_0\) has a unique outgoing transition:
$$\begin{aligned} \textsf{P}_0= [\textsf{M}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}] \xrightarrow {\langle (\emptyset ,\textsf{a}\textsf{b})\vartriangleright \textsf{a}, \textsf{c}\textsf{d},\textsf{b}\textsf{c}\rangle } [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}] . \end{aligned}$$
Letting \(\textsf{P}_1\triangleq [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}]\), there is also a unique outgoing transition from \(\textsf{P}_1\):
$$\begin{aligned} \textsf{P}_1= & {} [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}] \xrightarrow {\langle (\textsf{b}\textsf{c},\textsf{a}\textsf{b})\vartriangleright \textsf{a}\textsf{c}, \textsf{d},\textsf{b}\textsf{c}\textsf{d}\rangle }\\{} & {} [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{d}~|~\textsf{a}\textsf{b}.\textbf{0}] . \end{aligned}$$
Finally, letting \(\textsf{P}_2\triangleq [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{d}~|~\textsf{a}\textsf{b}.\textbf{0}]\), there is also a unique outgoing transition from \(\textsf{P}_2\):
$$\begin{aligned} \textsf{P}_2= [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{d}~|~\textsf{a}\textsf{b}.\textbf{0}] \xrightarrow {\langle (\textsf{b}\textsf{c}\textsf{d},\textsf{a}\textsf{b})\vartriangleright \textsf{d}, \emptyset ,\emptyset \rangle } [\textsf{M}~|~\textbf{0}] . \end{aligned}$$
As the process \(\textsf{P}_3\triangleq [\textsf{M}~|~\textbf{0}]\) contains the stopping context \(\textbf{0}\), it has no outgoing transition.
The following theorem from Brodo et al. (2021a) shows how the set-theoretic dynamics of a RS matches the SOS semantics of its RS process.
Theorem 1
(see Brodo et al. 2021a) Let \(\mathcal{A} = (S, A)\) be a RS, and \(\pi =(\gamma ,\delta )\) an \((n+1)\)-steps interactive process in \(\mathcal{A}\), with \(\gamma = \{C_i\}_{i\in [0,n]}\), \(\delta = \{D_i\}_{i\in [0,n]}\), and let \(\textsf{P}_i \triangleq \llbracket \mathcal{A},\pi \rrbracket _i\) for any \(i\in [0,n]\). Then, for any \(i\in [0,n-1]\):
1.
if \(\textsf{P}_i \xrightarrow {\langle (D,C)\vartriangleright R,I,P\rangle } \textsf{P}\) then \(D=D_i\), \(C=C_i\), \(P= D_{i+1}\) and \(\textsf{P} \equiv \textsf{P}_{i+1}\);
 
2.
there exist \(R,I \subseteq S\) such that \(\textsf{P}_i \xrightarrow {\langle (D_i,C_i)\vartriangleright R,I,D_{i+1}\rangle } \textsf{P}_{i+1}\).
 

4 Slicing RS computations

Dynamic slicing is a technique that helps to debug a program by simplifying a partial execution trace, thus purging it of the parts which are irrelevant to finding the bug. It can also help to highlight parts of the program which have been wrongly ignored by the execution.

4.1 SOS rules for the slicing computation

The slicing algorithm, presented in the next section, needs a slightly different state configuration that includes the whole past state sequence \(W_i = C_i \cup D_i\). To carry the relevant information, we enrich the state configuration \([\textsf{M}]\) by prefixing it with the list of the previous result and context sets, written \(\frac{D_0}{C_0}\frac{D_1}{C_1}\cdots \frac{D_n}{C_n}[\textsf{M}]\).
First, we extend the syntax of RS processes as follows:
$$\begin{aligned} \textstyle \textsf{P}{:}{=}[\textsf{M}]\;\big \vert \;\frac{D}{C}\textsf{P} \end{aligned}$$
The next step consists in enriching the operational semantics to deal with histories. We add the new rule (Hist) and modify the (Sys) inference rule in Fig. 1 as below:
[Display of the new rule (Hist) and of the modified rule (Sys) not reproduced here.]
The formal Definition 2 is thus updated to carry along the history of the computation.
Definition 3
(RS processes with history) Let \(\mathcal{A} = (S, A)\) be a RS, and \(\pi = (\gamma , \delta )\) an \((n+1)\)-steps interactive process in \(\mathcal{A}\), with \(\gamma = \{C_i\}_{i\in [0,n]}\) and \(\delta = \{D_i\}_{i\in [0,n]}\). For any step \(i\in [0,n]\), the corresponding new process configuration \(\llbracket \mathcal{A},\pi \rrbracket _i\) is defined as follows:
$$\begin{aligned} \llbracket \mathcal{A},\pi \rrbracket _i \triangleq \frac{D_0}{C_0}\frac{D_1}{C_1}\cdots \frac{D_{i-1}}{C_{i-1}}\left[ \prod _{r\in A} r~|~D_i~|~\textsf{K}_{\gamma ^i}\right] . \end{aligned}$$
Example 4
Let us revisit the computation in Example 3 to take histories into account. The corresponding sequence of transitions is thus:
$$\begin{aligned}{} & {} \textsf{P}_0 \xrightarrow {\langle (\emptyset ,\textsf{a}\textsf{b})\vartriangleright \textsf{a}, \textsf{c}\textsf{d},\textsf{b}\textsf{c}\rangle } \frac{\emptyset }{\textsf{a}\textsf{b}} \textsf{P}_1 \xrightarrow {\langle (\textsf{b}\textsf{c},\textsf{a}\textsf{b})\vartriangleright \textsf{a}\textsf{c}, \textsf{d},\textsf{b}\textsf{c}\textsf{d}\rangle } \frac{\emptyset }{\textsf{a}\textsf{b}} \frac{\textsf{b}\textsf{c}}{\textsf{a}\textsf{b}} \textsf{P}_2\\{} & {} \quad \xrightarrow {\langle (\textsf{b}\textsf{c}\textsf{d},\textsf{a}\textsf{b})\vartriangleright \textsf{d}, \emptyset ,\emptyset \rangle } \frac{\emptyset }{\textsf{a}\textsf{b}} \frac{\textsf{b}\textsf{c}}{\textsf{a}\textsf{b}} \frac{\textsf{b}\textsf{c}\textsf{d}}{\textsf{a}\textsf{b}} \textsf{P}_3 . \end{aligned}$$
where we recall that \(\textsf{P}_0\triangleq [\textsf{M}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}]\), \(\textsf{P}_1\triangleq [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{a}\textsf{b}.\textsf{a}\textsf{b}.\textbf{0}]\), \(\textsf{P}_2 \triangleq [\textsf{M}~|~\textsf{b}~|~\textsf{c}~|~\textsf{d}~|~\textsf{a}\textsf{b}.\textbf{0}]\) and \(\textsf{P}_3 \triangleq [\textsf{M}~|~\textbf{0}]\) with \(\textsf{M}\triangleq (\textsf{a}, \textsf{d},\textsf{b}\textsf{c})~|~(\textsf{a}\textsf{c}, \textsf{d},\textsf{d})\).

4.2 Slicing algorithms

In the following we assume that the reactions are numbered consecutively by positive integer numbers, and denote the j-th reaction in the RS by the notation \(r_j\). As a matter of notation, please notice that each history \(\frac{D_0}{C_0}\cdots \frac{D_m}{C_m}\) computed in m steps by Definition 3, defines a trace \(\frac{D_0}{C_0}\xrightarrow {N_1} \cdots \xrightarrow {N_m}\frac{D_m}{C_m}\) on which we perform the slicing computation, where \(N_i\) is the set of reactions applied in the i-th computation step. Here each reaction is simply represented by its numeric position in the list of reactions, i.e., \(N_{i} = \{j\mid en _{r_j}(D_{i-1}\cup C_{i-1})\}\) for any \(i\in [1,m]\). Abusing the notation, in the following we write \(r_j\in N\) whenever \(j\in N\).
Example 5
The trace that corresponds to the computation in Example 4 is just:
$$\begin{aligned} \frac{\emptyset }{\textsf{a}\textsf{b}} \xrightarrow {\{r_1\}} \frac{\textsf{b}\textsf{c}}{\textsf{a}\textsf{b}} \xrightarrow {\{r_1,r_2\}} \frac{\textsf{b}\textsf{c}\textsf{d}}{\textsf{a}\textsf{b}} . \end{aligned}$$
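Such a trace can be tabulated directly from the set-theoretic semantics; the following Python sketch (our own helper, not part of Algorithm 1 or of BioReSolve) computes the lists of result sets and of applied reaction indices for Example 5.

```python
# Sketch: tabulate the trace D0/C0 --N1--> ... --Nm--> Dm/Cm from a list of
# reactions (R, I, P) and a context sequence C0..Cm. N[i] holds the (1-based)
# indices of the reactions applied in the i-th step; names are illustrative.

def build_trace(reactions, contexts):
    D, N = [set()], [set()]          # D[0] = empty result, N[0] unused
    for C in contexts[:-1]:          # C_m is recorded in the trace, not consumed
        W = D[-1] | C
        applied = {j for j, (R, I, P) in enumerate(reactions, 1)
                   if R <= W and not (I & W)}
        N.append(applied)
        D.append(set().union(*(reactions[j - 1][2] for j in applied))
                 if applied else set())
    return D, N

r1 = ({'a'}, {'d'}, {'b', 'c'})
r2 = ({'a', 'c'}, {'d'}, {'d'})
D, N = build_trace([r1, r2], [{'a', 'b'}] * 3)
print(D, N[1:])   # [set(), {b,c}, {b,c,d}] and [{1}, {1, 2}], as in Example 5
```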
Our slicing technique consists of three main steps.
Enriched Semantics (Step S1). The slicing process requires some extra information from the execution of the processes. More precisely, (1) at each operational step we need to highlight the reactions that have been applied; and (2) we need to determine the part of the context which adds to the previous state the entities which are necessary to produce the marked entities in the following state. For solving (1) and (2), in Sect. 4.1 we have introduced an enriched semantics that records computation sequences. We need to keep track of the state sequence of the computation for the slicing process, keeping the entities \(D_i\) produced at each computation step separate from the context \(C_i\).
Marking the state (Step S2). Let us suppose that the final configuration in a partial computation is \(\frac{D_m}{C_m}\). The user selects a subset \(D_{ sliced }\subseteq D_m\) that may explain the (wrong) behaviour of the program. In Sect. 6 we describe an assertion language and monitors to automatize the selection.
Trace Slice (Step S3). Starting from the user’s selection \(\frac{D_{ sliced }}{\emptyset }\), we define a backward slicing step. Roughly, this step allows us to eliminate from the execution trace all the information not related to \(D_{ sliced }\). Starting from this sliced final state and proceeding backwards we can compute for each computation step the information which is relevant to produce the marked elements in the final state. At the generic step \(i\in [1,m]\) we assume the marked entities \(\frac{D'_i}{C'_i}\) are already available and infer the entities \(\frac{D'_{i-1}}{C'_{i-1}}\) to be marked at step \(i-1\). Notably, the set \(C'_i\) is irrelevant for computing \(\frac{D'_{i-1}}{C'_{i-1}}\) (only the component \(D'_i\) matters).

4.2.1 Marking algorithms

Let us explain how the slicing Algorithm 1 works.
Our algorithm returns a sliced trace which contains only the (usually rather small) subset of the entities that are necessary for deriving the marked entities. Let us describe it informally, considering first the more complex case of context dependent computations. First of all the user has to indicate the subset \(D_{ sliced }\) of the entities in the last state of the computation \(D_m\) that she wants to mark. Then the backward slicing process can start. Now let us consider the iteration of the slicer that marks the relevant information in the previous state \(\frac{D_{i-1}}{C_{i-1}}\): it requires analyzing the rules applied in the i-th computation step. So, if \(r_j \in N_{i}\) and \(r_j = (R_{j},I_{j},P_{j})\), then we need to check whether \(r_j\) produces at least one entity which is marked in the next state. If this is the case, then j is added to the set of marked rules. Then it is necessary to check whether the context \(C_{i-1}\) was essential for applying rule \(r_j\), or whether all necessary entities were already included in \(D_{i-1}\). Thus it is necessary to compute the entities in \(C_{i-1}\) which are missing in \(D_{i-1}\) in order for rule \(r_j\) to be enabled, and those entities are marked in (added to) the context \(C'_{i-1}\). The elements of \(R_{j}\) that are already present in \(D_{i-1}\) are added to the marked entities in \(D'_{i-1}\). For context independent computations, the part of the transformation related to the context can clearly be eliminated (see Algorithm 2).
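To fix ideas, here is a compact Python sketch of this backward marking (our own rendering; it is neither the actual Algorithm 1 nor the Prolog code of BioReSolve). It assumes the trace is given by the lists D, C and N introduced above; run on the trace of Example 5 with \(D_{ sliced }=\{\textsf{d}\}\), it reproduces the sliced trace of Example 6 below.

```python
# Sketch of the backward slicing step over a trace D[0..m], C[0..m], N[1..m];
# reactions maps a (1-based) index j to the triple (R_j, I_j, P_j).

def slice_trace(reactions, D, C, N, D_sliced):
    m = len(D) - 1
    Dm = [set() for _ in range(m + 1)]     # marked results D'_i
    Cm = [set() for _ in range(m + 1)]     # marked contexts C'_i
    Nm = [set() for _ in range(m + 1)]     # marked reactions N'_i
    Dm[m] = set(D_sliced)
    for i in range(m, 0, -1):              # walk the trace backwards
        for j in N[i]:
            R, _I, P = reactions[j]
            if Dm[i] & P:                  # r_j produced a marked entity
                Nm[i].add(j)
                Dm[i - 1] |= R & D[i - 1]  # reactants found in D_{i-1} ...
                Cm[i - 1] |= R - D[i - 1]  # ... or supplied by C_{i-1}, since
                                           # r_j was enabled on D_{i-1} ∪ C_{i-1}
    return Dm, Cm, Nm

reactions = {1: ({'a'}, {'d'}, {'b', 'c'}), 2: ({'a', 'c'}, {'d'}, {'d'})}
D = [set(), {'b', 'c'}, {'b', 'c', 'd'}]
C = [{'a', 'b'}] * 3
N = [set(), {1}, {1, 2}]
print(slice_trace(reactions, D, C, N, {'d'}))
# Dm = [set(), {'c'}, {'d'}], Cm = [{'a'}, {'a'}, set()], Nm = [set(), {1}, {2}]
```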
Example 6
Let us consider the trace in Example 5:
$$\begin{aligned} \frac{\emptyset }{\textsf{a}\textsf{b}} \xrightarrow {\{r_1\}} \frac{\textsf{b}\textsf{c}}{\textsf{a}\textsf{b}} \xrightarrow {\{r_1,r_2\}} \frac{\textsf{b}\textsf{c}\textsf{d}}{\textsf{a}\textsf{b}} . \end{aligned}$$
and suppose we let \(D_{ sliced }=\{\textsf{d}\}\), i.e., we mark the entity \(\textsf{d}\) in the target state. The slicing algorithm returns the sliced trace:
$$\begin{aligned} \frac{\emptyset }{\textsf{a}} \xrightarrow {\{r_1\}} \frac{\textsf{c}}{\textsf{a}} \xrightarrow {\{r_2\}} \frac{\textsf{d}}{\emptyset } . \end{aligned}$$
In the following we will sometimes illustrate the sliced trace as embedded in the original trace, by boxing the sliced entities and reactions:
[Boxed representation of the sliced trace embedded in the original trace not reproduced here.]
The following proposition states what is kept or removed by the slicing algorithm at each step.
Proposition 2
Let \(\frac{D_0}{C_0}\xrightarrow {N_1} \cdots \xrightarrow {N_m}\frac{D_m}{C_m}\) be a context dependent computation and let \(\frac{D'_0}{C'_0}\xrightarrow {N_1} \cdots \xrightarrow {N_m}\frac{D'_m}{C'_m}\) be the sliced trace corresponding to a given marking \(D_{ sliced }=D'_m\subseteq D_m\) (with \(C'_m=\emptyset )\). Then, we have all of the following:
1.
\(\forall i\in [1, m]\), \(r_j = (R_j,I_j,P_j) \in N'_{i}\) iff \(r_j\in N_i\) and there exists \(e \in D'_{i}\) s.t. \(e \in P_j\).
 
2.
\(\forall i\in [0, m-1]\), \(e \in D'_{i}\) iff there exists \(r_j = (R_j,I_j,P_j) \in N'_{i+1}\) such that \(e \in D_{i} \cap R_j\).
 
3.
\(\forall i\in [0, m-1]\), \(e \in C'_{i}\) iff there exists \(r_j = (R_j,I_j,P_j) \in N'_{i+1}\) such that \(e \in (C_i \cap R_j)\setminus D_i\).
 
Proof
The proof is by induction on the number n of computation steps.
Base case \(n=1\))
  • Let us first prove property (1).
    Let the marked subset of \(D_1\) be the set \(D'_1 \subseteq D_1\) as defined in line 2 of Algorithm 1. Then, by line 5, for a reaction \(r_j = (R_j,I_j,P_j) \in N_1\), we have that: \(r_j \in N'_1\) (by line 6) iff \(D'_1 \cap P_j \ne \emptyset \) (by line 5) iff there exists \(e \in D'_1\) s.t. \(e \in P_j\). Thus, property (1) holds.
  • Let us now prove property (2).
    In Algorithm 1, \(D'_0\) is initialised to the empty set in line 4. Thus, we have that an entity \(e \in D'_0\) iff (by line 7 we have two cases) (a) \(r_j\) is enabled on \(D_{0}\), which means that \(R_j \subseteq D_{0}\), hence \(D'_{0} = D'_{0} \cup R_j = D'_{0} \cup (D_0 \cap R_j)\), and the property holds, or (b) if \(\lnot en_{r_j}(D_{0})\) then \(D'_{0} = D'_{0} \cup (D_{0} \cap R_j)\) and hence the property holds.
  • Let us now prove property (3).
    By line 4, \(C'_0 = \emptyset \). Then a new entity e can be added to \(C'_0\) only by line 10 and only if the condition in line 5 holds but not the one in line 7. Hence, by line 5, \(j \in N_1\), with \(r_j = (R_j,I_j,P_j)\), and, by line 6, \(r_j \in N'_1\). By line 10, we derive that \(C'_0 = C'_0 \cup (R_j\backslash D_0)\). Now let us consider \(e \in R_j\backslash D_0\). By line 5, \(r_j\) is enabled in this (current) first step, hence, by definition, \(e \in R_j \subseteq D_0 \cup C_0\). By line 7, \(\lnot en_{r_j}(D_{0})\), and hence \(R_j\backslash D_0 \subseteq C_0\) must be non empty and \(R_j\backslash D_0 \subseteq C_0 \cap R_j\); hence for all \(e \in R_j\backslash D_0 \subseteq C_0 \cap R_j\) we have that \(e \in C'_0\) (by line 10) and \(e \not \in D_0\), and the property holds.
Inductive case \(n>1\))
    We start by considering the step \(\frac{D_{n-1}}{C_{n-1}}\xrightarrow {N_n}\frac{D_n}{C_n}\). In the first iteration of the \(\textbf{for}\) statement in line 5 in Algorithm 1, thus with \(i = n-1\), we can show that properties (1-3) hold. Thus we prove that property (1) holds for \(i = n\), and properties (2) and (3) hold for \(i = n-1\).
  • Let us first prove property (1).
    Let the marked subset of \(D_n\) be the set \(D_{sliced} = D'_n \subseteq D_n\) as defined in line 2 of Algorithm 1. Then, by line 5, for a reaction \(r_j = (R_j,I_j,P_j) \in N_n\), we have that: \(r_j \in N'_n\) (by line 6) iff \(D'_n \cap P_j \ne \emptyset \) (by line 5) iff there exists \(e \in D'_n\) s.t. \(e \in P_j\). Thus, property (1) holds for the last computation step.
  • Let us now prove property (2).
    In Algorithm 1, \(D'_{n-1}\) is initialised to the empty set in line 4. Thus, we have that an entity \(e \in D'_{n-1}\) iff (by line 7 we have two cases) (a) \(r_j\) is enabled on \(D_{n-1}\), which means that \(R_j \subseteq D_{n-1}\). Hence \(D'_{n-1} = D'_{n-1} \cup R_j = D'_{n-1} \cup (D_{n-1} \cap R_j)\), and the property holds. (b) if \(\lnot en_{r_j}(D_{n-1})\) then \(D'_{n-1} = D'_{n-1} \cup (D_{n-1} \cap R_j)\) and hence the property holds.
  • Let us now prove property (3).
    By line 4, \(C'_{n-1} = \emptyset \). Then a new entity e can be added to \(C'_{n-1}\) only by line 10 and only if the condition in line 5 holds but not the one in line 7. Hence, by line 5, \(j \in N_{n}\), with \(r_j = (R_j,I_j,P_j)\), and, by line 6, \(r_j \in N'_{n}\). By line 10, we derive that \(C'_{n-1} = C'_{n-1} \cup (R_j\backslash D_{n-1})\). Now let us consider \(e \in R_j\backslash D_{n-1}\). By line 5, \(r_j\) is enabled in this (current) last step, hence, by definition, \(e \in R_j \subseteq D_{n-1} \cup C_{n-1}\). By line 7, \(\lnot en_{r_j}(D_{n-1})\), and hence \(R_j\backslash D_{n-1} \subseteq C_{n-1}\) must be non empty and \(R_j\backslash D_{n-1} \subseteq C_{n-1} \cap R_j\); hence for all \(e \in R_j\backslash D_{n-1} \subseteq C_{n-1} \cap R_j\) we have that \(e \in C'_{n-1}\) (by line 10) and \(e \not \in D_{n-1}\), and the property holds for the last computation step. Now we consider the computation \(\frac{D_0}{C_0}\xrightarrow {N_1} \cdots \xrightarrow {N_{n-1}}\frac{D_{n-1}}{C_{n-1}}\) w.r.t. the marking \(D'_{n-1}\) computed in the last computation step (see line 7 of Algorithm 1). All the entities added to \(D'_{n-1}\) belong to \(D_{n-1}\), hence \(D'_{n-1}\) is a subset of \(D_{n-1}\) and determines a marking \(D_{ sliced }\) for the trace \(\frac{D_0}{C_0}\xrightarrow {N_1} \cdots \xrightarrow {N_{n-1}}\frac{D_{n-1}}{C_{n-1}}\); then the property follows by the inductive hypothesis, as the number of steps of the remaining computation is \(n-1<n\). \(\square \)
Proposition 2 also covers context independent computations, which correspond to the case where the contexts are empty.

5 Implementation

In this section we show how to check a biological model with our slicing methodology. The implementation is available on-line,6 together with a small user manual. It extends the tool BioReSolve,7 which already provided a friendly environment for simulation, analysis and verification of RSs. The tool has been developed and tested under SWI-Prolog and exploits a few library predicates for efficiency reasons. DCG grammar rules are used to ease the writing of RS specifications. Its features include the possibility to simulate single traces or generate the whole graph of the LTS (where user defined predicates can be used to color each node depending on its content, so as to improve readability), to verify modal logic formulas, to check bisimulation based equivalences between different RSs, and to deal with quantitative aspects of RSs, such as delays and duration for produced entities and linear handling of concentration levels (see Brodo et al. 2021a, c for more details).
For a computation which does not use assertions the interpreter gives the users some choices: (1) whether they want to make a context independent computation; (2) the possibility to specify the maximum number m of computation steps.
The interpreter will show the corresponding trace \(\frac{D_0}{C_0}\xrightarrow {N_1} \cdots \xrightarrow {N_m}\frac{D_m}{C_m}\), emphasizing the elements \(D_i, C_i, N_i\). Then the users have to provide the entities that they want to mark in \(D_m\). Finally, the interpreter will compute the corresponding sliced computation and present it. Let us see an example illustrating our tool.
Example 7
We consider a RS defined in Barbuti et al. (2021), to model a network for gene regulation. These networks represent the interactions among genes regulating the activation of specific cell functions. The RS models a fragment of the network for controlling the process of differentiation of T helper (Th) lymphocytes, which play a fundamental role in the immune system. We introduced one wrong reaction on purpose in the RS model that can be found on our tool website8 and performed some experiments. For instance we made a context free computation, starting from the initial state containing only the entity ifngammah. The computation, limited to 6 steps, produced the following sequence of states.
Now we wanted to focus on the molecule tbeth in the last computation state, and hence we used our slicer, marking tbeth. The resulting sliced trace was the following:
with the following sequence of reaction numbers applied in the steps:
The sliced sequence can now easily be interpreted. Clearly tbeth was produced as a result of the application of four reaction rules in a sequence, those with numbers 5, 8, 19, 18. Then reaction 27 reintroduced tbeth in each following step. So, the slicing produced a much simpler trace, with an easy interpretation. It is now immediate to see that tbeth was introduced by rule 18, due to the reactant socs1, which is recognised as an error, because tbeth should be introduced by stat1h. Thus the user can now correct reaction 18 to use stat1h.

6 A logical framework for the slicing algorithm

In this section we try to further automate the slicing process whenever the user can describe a property to be checked along a computation: the computation is stopped as soon as the current state does not satisfy the property. Then slicing starts from that state, which is automatically marked.
To specify properties we reuse the simple assertion language introduced in Brodo et al. (2021a), which is tailored to the labels of the LTS generated by the SOS semantics in Sect. 3. Then, we rely on a fragment of the recursive extension of the Hennessy-Milner Logic, called sHML (Aceto et al. 2021a), to formally express properties to be verified along a RS process execution. We exploit the monitor technique from Aceto et al. (2020) to check whether the current state of the RS process execution respects the required property. To this aim, we apply the translation from sHML formulas to monitors given in Aceto et al. (2021a). Some modifications are required, as in the original proposal the sHML logic works on action names, whereas we work with logical assertions.

6.1 The assertion language

The labels of our LTS carry a large amount of information about the activity performed during each transition; an assertion is a formula that predicates over those labels. Hereafter, we assume that the context can be non-deterministic.
Example 8
Here follows an example of some properties which we may want to verify:
  • Has the reaction \((\textsf{a}\textsf{b},\textsf{c},\textsf{b})\) been applied?
  • Has the entity \(\textsf{a}\) played both as reactant and as product?
Definition 4
(Assertion Language) Given a set of entities S, assertions \(\textsf{F}\) on S are built from the following syntax, where \(E\subseteq S\) and \( Pos \in \{ {\mathcal {D}}, {\mathcal {C}}, {\mathcal {W}}, {\mathcal {R}}, {\mathcal {I}}, {\mathcal {P}}\}\):
$$\begin{aligned} \begin{array}{lcl} \textsf{F}&{:}{:}{=}&\textbf{tt}~\mid ~ E \subseteq Pos ~\mid ~ ? \in Pos ~\mid ~ \textsf{F} \vee \textsf{F} ~\mid ~ \textsf{F} \wedge \textsf{F}~\mid ~ \lnot \textsf{F} \end{array} \end{aligned}$$
\( Pos \) distinguishes different positions in the labels: \({\mathcal {D}}\) stands for entities produced in the previous transition, \({\mathcal {C}}\) for entities provided by the context, \({\mathcal {W}}\) for their union, \({\mathcal {R}}\) for reactants, \({\mathcal {I}}\) for inhibitors, and \({\mathcal {P}}\) for products. An assertion \(E\subseteq Pos \) checks the inclusion of a set of entities E in the given \( Pos \); \(?\in Pos \) tests the non-emptiness of \( Pos \); \(\textsf{F}_1 \vee \textsf{F}_2\) denotes disjunction, \(\textsf{F}_1 \wedge \textsf{F}_2\) conjunction, and \(\lnot \textsf{F}\) negation.
Definition 5
(Satisfaction of Assertion) Let \(\textsf{P}\) be a RS process, let \(\ell = \langle (D,C)\vartriangleright R,I,P\rangle \) be a transition label, and \(\textsf{F}\) be an assertion. We write \(\ell \models \textsf{F}\) (read as the transition label \(\ell \) satisfies the assertion \(\textsf{F}\)) if and only if the following hold, where \(\textsf{select}(\ell , Pos )\) extracts the desired component from the label \(\ell \):
$$\begin{aligned}{} & {} \begin{array}{lcl} \ell \models E \subseteq Pos &{} \text{ iff } &{} E\subseteq \textsf{select}(\ell , Pos )\\ \ell \models ? \in Pos &{} \text{ iff } &{} \textsf{select}(\ell , Pos ) \ne \emptyset \\ \ell \models \textsf{F}_1 \vee \textsf{F}_2 &{} \text{ iff } &{}\ell \models \textsf{F}_1 \vee \ell \models \textsf{F}_2 \\ \ell \models \textsf{F}_1 \wedge \textsf{F}_2 &{} \text{ iff } &{}\ell \models \textsf{F}_1 \wedge \ell \models \textsf{F}_2 \\ \ell \models \lnot \textsf{F} &{} \text{ iff } &{}\ell \not \models \textsf{F} \end{array} \\{} & {} \textsf{select}(\langle (D,C)\vartriangleright R,I,P\rangle , Pos ) \triangleq \left\{ \begin{array}{ll} D &{} \text{ if } Pos = {\mathcal {D}},\\ C &{} \text{ if } Pos = {\mathcal {C}},\\ D\cup C &{} \text{ if } Pos = {\mathcal {W}},\\ R &{} \text{ if } Pos = {\mathcal {R}},\\ I &{} \text{ if } Pos = {\mathcal {I}},\\ P &{} \hbox { if }\ Pos = {\mathcal {P}} \end{array} \right. \end{aligned}$$
Example 9
Some assertions matching the queries listed in Example 8 are:
  • \(\textsf{F}_1 \triangleq \{\textsf{ab}\}\subseteq {\mathcal {R}} \wedge \{\textsf{c}\}\subseteq {\mathcal {I}}\), while \(\textsf{F}_1' \triangleq \lnot \textsf{F}_1\) is verified if \((\textsf{a}\textsf{b},\textsf{c},\textsf{b})\) is not applied,
  • \(\textsf{F}_2 \triangleq \mathsf {\{a\}} \subseteq {\mathcal {R}}\wedge \mathsf {\{a\}} \subseteq {\mathcal {P}}\).
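For concreteness, here is a small Python sketch of the satisfaction relation of Definition 5 (assertions encoded as nested tuples; the encoding and names are ours), checking \(\textsf{F}_2\) of Example 9 on the first transition label of Example 3.

```python
# Sketch of Definitions 4-5: labels are dicts with keys 'D','C','R','I','P',
# assertions are nested tuples; encoding and names are illustrative only.

def select(label, pos):
    return label['D'] | label['C'] if pos == 'W' else label[pos]

def sat(label, f):
    tag = f[0]
    if tag == 'tt':   return True
    if tag == 'sub':  return f[1] <= select(label, f[2])    # E ⊆ Pos
    if tag == 'some': return bool(select(label, f[1]))      # ? ∈ Pos
    if tag == 'or':   return sat(label, f[1]) or sat(label, f[2])
    if tag == 'and':  return sat(label, f[1]) and sat(label, f[2])
    if tag == 'not':  return not sat(label, f[1])
    raise ValueError(f)

# F2 of Example 9: entity a plays both as a reactant and as a product.
F2 = ('and', ('sub', {'a'}, 'R'), ('sub', {'a'}, 'P'))
# First transition label of Example 3: <(∅, ab) ▷ a, cd, bc>.
ell = {'D': set(), 'C': {'a', 'b'}, 'R': {'a'}, 'I': {'c', 'd'}, 'P': {'b', 'c'}}
print(sat(ell, F2))   # False: a is assumed as a reactant but is not produced
```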

6.2 Monitors

Differently from Aceto et al. (2020), in our setting monitors check whether transition labels satisfy a given property. A process monitor stops when a verdict is reached, thus we omit the nil process \(\textbf{0}\). The \(\textbf{no}_s\) verdict is equipped with a set of entities, \(s \subseteq S\), used as markers for the slicing.
Definition 6
A monitor process is defined by the grammar:
$$\begin{aligned} \begin{array}{c} m,n \in \textsf {Mon} {:}{:}{=} {\textbf {no}}_{s}~\big \vert ~{\textbf {yes}} ~\big \vert ~\textsf {F}.m~\big \vert ~m+n~\big \vert ~m\otimes n~\big \vert ~\textsf {rec}~X.m~\big \vert ~X \\ \end{array} \end{aligned}$$
where X comes from a countably infinite set of monitor variables, and the set \(s \subseteq S\).
The syntax of a monitor is similar to that of a context process: action prefixes are replaced by assertions to be verified on the transition labels of the process. A ‘verdict’ can be \(\textbf{yes}\) or \(\textbf{no}_s\), for acceptance or rejection respectively. The sum \(m+n\) is used to provide monitors with different behaviours, while \(m\otimes n\) is used to compose monitors in parallel.
The semantics is given in Fig. 2, where the symmetric rules are omitted. The set of transition labels consists of the formulas of the assertion language in Definition 4 plus a special silent action \(\tau \). The verdicts do nothing.
Definition 7
A monitored system is a monitor \(m \in \textsf{Mon}\) and a process \(p \in \textsf{Proc}\) that run side-by-side, denoted \(m \vartriangleleft p\). The behaviour of a monitored system is defined by the inference rules in Table 1.
Table 1
Monitored systems
[Inference rules for monitored systems not reproduced here.]
If a monitored system \(m \vartriangleleft p\) reaches a verdict like \(\textbf{no}_s\vartriangleleft q\), then a violation is detected and q is the state where the slicing starts by marking the entities in the set s.

6.3 Monitorability

Typically we are interested in verifying whether certain assertions are met along a process execution. Writing the corresponding monitors is an error-prone task. Following Aceto et al. (2021a), we prefer to write property specifications as formulas in a fragment of the (recursive) Hennessy-Milner logic, called sHML. The syntax and semantics of sHML are reported in Table 2. For example, given a certain assertion \(\textsf{F}\) (see Definition 4) we can write the sHML formula \(\textsf{max}~X.([ \textsf{F}].X \wedge [\lnot \textsf{F}].\textbf{ff})\) to specify that “the computation should always exhibit transition labels satisfying \( \textsf{F}\) and stops as soon as a violation is detected”. Of course, other properties can be required, e.g. that \(\textsf{F}_1\) and \(\textsf{F}_2\) are satisfied in alternation. Following the monitor synthesis in Aceto et al. (2021b), recalled in Table 3, we obtain that, after some logical simplifications, a monitor implementing the previous formula is: \(\textsf{m}= \textsf{rec}~X. (\textsf{F}.X + \lnot \textsf{F}.\textbf{no})\). While it may seem that monitors closely resemble sHML formulas, we argue that the box modality \([\textsf{F}].\phi \) is much more convenient to write and to manage than the sum \(\textsf{F}.\textsf{m}(\phi ) + \lnot \textsf{F}.\textbf{yes}\).
Example 10
Let \(\textsf{P}_0\) be the process in Examples 34. We want to study a computation where all the visited states satisfy the following assertion: \(\textsf{F} \triangleq \{\textsf{b}\} \subseteq {\mathcal {C}} \wedge \{\textsf{d}\} \not \subseteq {\mathcal {R}}\). Then we need a sHML formula of the format described above: \(\phi \triangleq \textsf{max}~X. ([\textsf{F}].X \wedge [\lnot \textsf{F}]\textbf{ff}_{\{\textsf{d}\}})\), where the idea is to flag the presence of the entity \(\textsf{d}\) as a fault to understand why it has been produced. The corresponding process monitor is thus: \(\textsf{m} \triangleq \textsf{rec}~X.(\textsf{F}.X + \lnot \textsf{F}.\textbf{no}_{\{\textsf{d}\}})\). The execution of the monitored process \(\textsf{m} \vartriangleleft \textsf{P}_0 \) proceeds by applying the rules in Table 1.
The first transition has label \(\langle (\emptyset ,\textsf{a}\textsf{b})\vartriangleright \textsf{a}, \textsf{c}\textsf{d},\textsf{b}\textsf{c}\rangle \), which satisfies \(\textsf{F}\), so rule \((\textit{Exec})\) applies and the monitored computation continues. The next step is derived by using rule \((\textit{Exec})\) again, since the label \(\langle (\textsf{b}\textsf{c},\textsf{a}\textsf{b})\vartriangleright \textsf{a}\textsf{c}, \textsf{d},\textsf{b}\textsf{c}\textsf{d}\rangle \) also satisfies \(\textsf{F}\). The computation ends after applying the \((\textit{Exec})\) rule to the last transition, whose label \(\langle (\textsf{b}\textsf{c}\textsf{d},\textsf{a}\textsf{b})\vartriangleright \textsf{d}, \emptyset ,\emptyset \rangle \) violates \(\textsf{F}\), so the monitor reaches the verdict \(\textbf{no}_{\{\textsf{d}\}}\). (The full derivations are not reproduced here.)
The computation stops and \(\frac{\emptyset }{\textsf{a}\textsf{b}} \frac{\textsf{b}\textsf{c}}{\textsf{a}\textsf{b}} \frac{\textsf{b}\textsf{c}\textsf{d}}{\textsf{a}\textsf{b}} \textsf{P}_3\) is a starting point for backward slicing on \({\{\textsf{d}\}}\).
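The monitored run of Example 10 can be summarised by the following self-contained Python sketch (the helper names are ours): the run stops at the first label violating F and returns the length of the prefix to be sliced together with the marker set s of the verdict.

```python
# Sketch of a monitored run for an "always F" safety monitor: check each
# transition label against F and stop at the first violation, returning the
# prefix length and the marker set s; names are illustrative only.

def monitored_run(labels, holds_F, s):
    for i, ell in enumerate(labels):
        if not holds_F(ell):        # the monitor reaches the verdict no_s
            return i + 1, s         # slice the (i+1)-step prefix, marking s
    return None                     # no verdict reached: no violation observed

# F of Example 10: b offered by the context and d not among the reactants.
holds_F = lambda ell: 'b' in ell['C'] and 'd' not in ell['R']
labels = [                          # the three labels of Examples 3-4
    {'C': {'a', 'b'}, 'R': {'a'}},
    {'C': {'a', 'b'}, 'R': {'a', 'c'}},
    {'C': {'a', 'b'}, 'R': {'d'}},  # violates F: d occurs as a reactant
]
print(monitored_run(labels, holds_F, {'d'}))   # (3, {'d'})
```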
Table 2
Syntax and semantics for the sHML
Syntax: \(\phi ,\psi \in \text{sHML} {:}{:}{=} \textbf{tt}~\mid ~\textbf{ff}_s~\mid ~[ \textsf{F}].\phi ~\mid ~\phi \wedge \psi ~\mid ~\textsf{max}~X.\phi ~\mid ~X\)
Semantics:
\(\llbracket \textbf{tt},\rho \rrbracket \triangleq \mathcal{P}\)
\(\llbracket \textbf{ff}_s,\rho \rrbracket \triangleq \emptyset \)
\(\llbracket [\textsf{F}].\phi ,\rho \rrbracket \triangleq \{ p~\mid ~ \forall q.~ p \xrightarrow { \ell } q \, , \ \ell \models \textsf{F} \text{ and } q \in \llbracket \phi ,\rho \rrbracket \}\)
\(\llbracket \phi \wedge \psi , \rho \rrbracket \triangleq \llbracket \phi , \rho \rrbracket \cap \llbracket \psi , \rho \rrbracket \)
\(\llbracket \textsf{max}~X.\phi ,\rho \rrbracket \triangleq \bigcup \{\textsf{P}~\mid ~\textsf{P}\subseteq \llbracket \phi ,\rho [X \mapsto \textsf{P}]\rrbracket \}\)
\(\llbracket X,\rho \rrbracket \triangleq \rho (X)\)
where \(\rho \) is a set of formula definitions, and \(\mathcal{P}\) is the whole set of processes.
Table 3
Rules for deriving a process monitor
\(\textsf{m}(\textbf{ff}_s) \triangleq \textbf{no}_s \qquad \textsf{m}(\textbf{tt}) \triangleq \textbf{yes} \qquad \textsf{m}([ \textsf{F}].\phi ) \triangleq \textsf{F}.\textsf{m}(\phi ) + \lnot \textsf{F}.\textbf{yes}\)
\(\textsf{m}(\phi \wedge \psi ) \triangleq \textsf{m}(\phi ) \otimes \textsf{m}(\psi ) \qquad \textsf{m}(\textsf{max}~X.\phi ) \triangleq \textsf{rec}~x.\textsf{m}(\phi ) \qquad \textsf{m}(X) \triangleq x\)
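The synthesis function \(\textsf{m}(\cdot )\) of Table 3 is a direct structural recursion; the following Python sketch (formulas and monitors encoded as nested tuples, an encoding of our own) builds the unsimplified monitor for the formula \(\phi \) of Example 10, which after the logical simplifications mentioned above collapses to \(\textsf{rec}~X.(\textsf{F}.X + \lnot \textsf{F}.\textbf{no}_{\{\textsf{d}\}})\).

```python
# Sketch of the monitor synthesis of Table 3; formulas and monitors are
# nested tuples and the encoding is illustrative only.

def synth(phi):
    tag = phi[0]
    if tag == 'tt':  return ('yes',)
    if tag == 'ff':  return ('no', phi[1])                    # ff_s  ->  no_s
    if tag == 'box':                                          # [F].phi
        F, body = phi[1], phi[2]
        return ('sum', ('act', F, synth(body)),
                       ('act', ('not', F), ('yes',)))
    if tag == 'and': return ('par', synth(phi[1]), synth(phi[2]))
    if tag == 'max': return ('rec', phi[1], synth(phi[2]))    # max X.phi
    if tag == 'var': return ('mvar', phi[1])
    raise ValueError(phi)

# phi of Example 10: max X.([F].X /\ [not F].ff_{d})
phi = ('max', 'X', ('and',
                    ('box', 'F', ('var', 'X')),
                    ('box', ('not', 'F'), ('ff', {'d'}))))
print(synth(phi))
```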
Example 11
Given the assertion \(\textsf{F} = \{ \texttt {ifngammah}, \texttt {tbeth} \} \not \subseteq \mathcal{W}\), the formula \(\phi \triangleq \textsf{max}~X. ([\textsf{F}].X \wedge [\lnot \textsf{F}]\textbf{ff}_{\{\textsf{tbeth}\}})\) would automatize the slicing in Example 7, by letting the monitor select tbeth if it is found present.
Let us now consider a complex biological example in which there is a continuous interaction with the environment, represented by the sequence of the provided contexts.
Example 12
This example is due to Nobile et al. (2017), where a RS was introduced to replicate one of the experiments in Helikar et al. (2013) related to a dynamical model of ErbB receptor signal transduction in human mammary epithelial cells. The non-receptor tyrosine kinase Src and the receptor tyrosine kinase epidermal growth factor receptor (EGFR/ErbB1) have been established as collaborators in cellular signaling and their combined dysregulation plays key roles in human cancers, including breast cancer. The RS contains 6,720 reactions and a sequence of 1,000 contexts.
We installed a monitor looking for the introduction of an EGFR lysosome, which is essential for a bifurcation in the pathway of our experiment. We discovered that the lysosome is introduced after 11 computation steps. Then we marked the lysosome in the last state of the computation of length 11, and computed the slicing.
The resulting sliced sequence contains states with at most a couple of dozen entities, and the largest context contains 17 entities. Moreover, in the sliced sequence the biologist can identify other entities that can be used for further slicing simplifications. We report here an excerpt of the sliced sequence. The first and last three states of the sliced computation are:
The last states of the original computation consisted of about 100 different molecules each, including molecules which are unrelated to the introduction of the marked entities. On the other hand, the sliced computation gives a sequence containing fewer than 33 entities in each state, which is easier to inspect, and can be used for further slicing derivations. Moreover, all elements in the sequence are essential to derive the marked entities. We show the sequence of sliced contexts, which emphasizes the entities provided by the contexts which are really useful/necessary for the sliced computation:
Each context in the original computation adds 30 entities at every step, most of which become irrelevant after the first computation step, while only 17 were used in the first sliced step. In the last two computation steps the context does not contribute at all to the computation of the marked entity. Concerning the applied reactions, at most 33 are used in each sliced step. In the last three steps of the computation the relevant reactions are just the following ones:
Thus, it is possible to know exactly which of the 6,720 reactions are relevant to compute the marked entities, as well as which of the entities introduced by the contexts are relevant. Then it is possible to proceed to identify a bug in the model following the same strategy outlined in Example 7. In practice, the biologists analyze the simplified sliced sequence using their knowledge of the intended model, or they can express the properties to be monitored so that the marking is performed automatically.
We performed our experiments on a MacBook Pro with OSX 11.7 and a 2.6 GHz 6-core Intel i7, with 16 GB RAM. The slicing for the small Example 7 was executed in 9 ms, while the big Example 12 took 6000 ms. Considering the number of reactions (about 300 times more) and the fact that the context is absent in the first example and rather large in the second one, we can observe that the computation time increases roughly linearly with the number of reactions. This is a result we might also expect theoretically from an analysis of the algorithm, which depends directly on the number of reactions and on the size of the context. We are working on an optimisation of the implementation by also recording in the history the list of the reactions applied in the computation.

7 Slicing enhanced reaction systems

Reaction Systems have been designed as a qualitative model. In fact, their theory is based on assumptions such as: no permanency, i.e., any entity vanishes unless it is sustained by a reaction, in the same way as a living cell would die for lack of energy without chemical reactions; no counting, i.e., the quantity of entities that are present in a cell is not taken into account; and the threshold nature of resources, i.e., either an entity is available for all reactions, or it is not available at all.
More recently, different versions of RSs have also been studied in several papers (Ehrenfeucht and Rozenberg 2007a; Brijder et al. 2011b; Meski et al. 2016; Ehrenfeucht et al. 2017; Bottoni et al. 2019, 2020; Koutny et al. 2021) to account for certain aspects of biological systems more accurately and to increase their predictive power about natural phenomena. In particular, the work in Brodo et al. (2021c, 2023b) shows that the process algebraic approach described in Sect. 3 allows RS models to be enhanced with several quantitative features in a modular way.
In this section we show that the basic slicing algorithm can be rather naturally extended to deal with quantitative extensions of RSs, like those including delays and linear processes. The extension is rather direct in the case of delays, whereas linear processes require more elaborate changes due to the peculiarities of this type of RS.

7.1 Delays, durations and timed processes

In biology it is well known that reactions occur with different frequencies. For example, since enzymes catalyse reactions, many reactions are more frequent when some enzymes are present and less frequent when such enzymes are absent. Moreover, reactions describing complex transformations may require time before releasing their products. To capture these dynamical aspects in our framework while preserving the discrete and abstract nature of RSs, in Brodo et al. (2021c, 2023b) we have proposed a discretisation of the delay between two occurrences of a reaction by using a scale of natural numbers, from 0 (smallest delay, highest frequency) up to n (increasing delay, lower frequency).
Intuitively, the notation \(D^n\) stands for making the entities D available after n time units, and we use the shorthand D for \(D^0\), meaning that the entities are immediately available. Similarly, we can associate a delay value to the product of each reaction by writing \((R,I,P)^n\) when the product of the reaction will be available after n time units, and we write (RIP) for \((R,I,P)^0\). The syntax for mixture processes is thus extended as below and the operational semantics is changed accordingly:
$$\begin{aligned} \textsf{M}\ {:}{:}{=}\ (R,I,P)^n ~\big \vert ~ D^n ~\big \vert ~ \textsf{K} ~\big \vert ~ \textsf{M}\mid \textsf{M} \end{aligned}$$
Figure 3 only reports the rules that are new and those that override the ones in Fig. 1 (e.g., the semantics of context processes is unchanged). Rule \((\textit{Tick})\) represents the passing of one time unit, while rule \((\textit{Ent})\) makes available those entities whose delay has expired. Note that the entities \(D^{n+1}\) that appear in the transition label of rule \((\textit{Tick})\) cannot be used as reactants by any reaction, because their delay is not yet zero. Rule \((\textit{Pro})\) delays the product of the reaction as specified by the reaction itself, while rule \((\textit{Inh})\) is used when the reaction is not enabled.
Example 13
Let us consider two RSs sharing the same entity set \(S=\{\textsf{a},\textsf{b},\textsf{c},\textsf{d}\}\) and the same reactions \(r_1 = (\textsf{a},\textsf{b},\textsf{b})\), \(r_2=(\textsf{b},\textsf{a},\textsf{a})\), \(r_3=(\textsf{a}\textsf{c},\textsf{b},\textsf{d})\), \(r_4=(\textsf{d},\textsf{a},\textsf{c})\), but working with different reaction speeds. For simplicity, we assume that only two speed levels are distinguished: 0, the fastest, and 1, the slowest. The reaction system \({\mathcal {A}}_1\) provides the speed assignment \(\{r_1^1,r_2,r_3,r_4^1\}\) to its reactions, while \({\mathcal {A}}_2\) provides the speed assignment \(\{r_1,r_2^1,r_3^1,r_4\}\). We assume that the context process for both reaction systems is just \(\textsf{K}\triangleq \textsf{a}\textsf{c}.\emptyset .\emptyset .\textbf{0}\). The LTSs of the corresponding timed processes are in Fig. 4, where, for brevity, we let:
$$\begin{aligned} \textsf{M}_1 \triangleq r_1^1\mid r_2\mid r_3\mid r_4^1 \qquad \textsf{M}_2 \triangleq r_1\mid r_2^1\mid r_3^1\mid r_4 \qquad \textsf{K}_{2\emptyset } \triangleq \emptyset .\emptyset .\textbf{0} \qquad \textsf{K}_\emptyset \triangleq \emptyset .\textbf{0} \end{aligned}$$
In both cases, the execution is deterministic and the corresponding traces are, respectively:
$$\begin{aligned} \frac{\emptyset }{\textsf{a}\textsf{c}} \xrightarrow {\{r_1^1,r_3\}} \frac{\textsf{b}^1\textsf{d}}{\emptyset } \xrightarrow {\{r_4^1\}} \frac{\textsf{b}\textsf{c}^1}{\emptyset } \quad \text {and}\quad \frac{\emptyset }{\textsf{a}\textsf{c}} \xrightarrow {\{r_1,r_3^1\}} \frac{\textsf{b}\textsf{d}^1}{\emptyset } \xrightarrow {\{r_2^1\}} \frac{\textsf{a}^1\textsf{d}}{\emptyset } . \end{aligned}$$
Inspired by Brijder et al. (2011b), we can also provide entities with a duration, i.e., entities that last a finite number of steps. To this aim, we use the syntax \(D^{[n,m]}\) to represent the availability of D for \(m>0\) time units starting n time units after the current time. In Brodo et al. (2021c) we have shown that durations can be encoded using just delays as follows:
$$\begin{aligned} D^{[n,m]} \triangleq \prod _{k=n}^{n+m-1} D^k \quad (R,I,P)^{[n,m]} \triangleq \prod _{k=n}^{n+m-1} (R,I,P)^k . \end{aligned}$$
For example, we have \(\textsf{a}^{[2,3]} \equiv \textsf{a}^2 \mid \textsf{a}^3 \mid \textsf{a}^4\) and \(\textsf{a}^{[0,1]} \equiv \textsf{a}^0 \equiv \textsf{a}\).
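As a small illustration, the following Prolog fragment is a minimal sketch of the encoding above (illustrative only, not code from our implementation), assuming a term representation delayed(E, K) for \(E^K\): it expands a duration into the list of delayed copies prescribed by the encoding.

% Expand a duration E^[N,M] into E^N, ..., E^(N+M-1).
expand_duration(E, N, M, Delayed) :-
    M > 0,
    Last is N + M - 1,
    findall(delayed(E, K), between(N, Last, K), Delayed).

% ?- expand_duration(a, 2, 3, Ds).
% Ds = [delayed(a,2), delayed(a,3), delayed(a,4)].   % i.e. a^2 | a^3 | a^4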
We use the name timed processes for processes with delays and durations. Notably, our extension is conservative, i.e., it does not change the semantics of processes without delays and durations. Therefore, the encoding of standard RSs described in Definition 2 still applies.

7.1.1 Dynamic slicing of timed processes

We now explain how the base slicing algorithm for RSs can be extended to timed processes. The corresponding pseudocode is Algorithm 3. Here we have an additional case to consider: if a delayed entity \(e^k\) is marked in \(D_i\), then it can derive from an entity \(e^{k+1}\) in the previous state \(D_{i-1}\), whose delay is one tick larger than the marked one. We also note that the context only introduces entities without delay; a minimal sketch of this additional backward case is given below.
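The following Prolog clause sketches only the additional backward case (cf. lines 10 and 11 of Algorithm 3), under the assumed representation delayed(Entity, Delay) for delayed entities, with delay 0 for plain entities; it is illustrative only, not code from our implementation.

% A marked entity with delay K in the current sliced state is traced back to
% the same entity with delay K+1 in the previous state (delayed entities never
% come from the context, which only provides undelayed entities).
trace_back_delay(PrevState, delayed(E, K), delayed(E, K1)) :-
    K1 is K + 1,
    member(delayed(E, K1), PrevState).

% ?- trace_back_delay([delayed(b,1), delayed(d,0)], delayed(b,0), M).
% M = delayed(b, 1).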
Example 14
Let us consider Example 13, and the evolution (computation) for \(\mathsf {M_1}\) in Fig. 4, whose corresponding trace is:
$$\begin{aligned} \frac{\emptyset }{\textsf{a}\textsf{c}} \xrightarrow {\{r_1^1,r_3\}} \frac{\textsf{b}^1\textsf{d}}{\emptyset } \xrightarrow {\{r_4^1\}} \frac{\textsf{b}\textsf{c}^1}{\emptyset } . \end{aligned}$$
By marking the entity \(\textsf{b}\) in the last state we get the simplified sliced computation as follows:
$$\begin{aligned} \frac{\emptyset }{\textsf{a}} \xrightarrow {\{r_1^1\}} \frac{\textsf{b}^1}{\emptyset } \xrightarrow {\emptyset } \frac{\textsf{b}}{\emptyset } . \end{aligned}$$
As previously discussed, the sliced trace can also be highlighted as part of the original trace using boxes:
$$\begin{aligned} \frac{\emptyset }{\boxed{\textsf{a}}\,\textsf{c}} \xrightarrow {\{\boxed{r_1^1},r_3\}} \frac{\boxed{\textsf{b}^1}\,\textsf{d}}{\emptyset } \xrightarrow {\{r_4^1\}} \frac{\boxed{\textsf{b}}\,\textsf{c}^1}{\emptyset } \end{aligned}$$
This shows that only reaction \(r_1\) is involved in the first computation step, while no reaction is involved in the second computation step: by the rules in lines 10 and 11 of Algorithm 3, entity \(\textsf{b}^1\) is marked because of the presence of \(\textsf{b}\) in the last computation state and of the delayed \(\textsf{b}^1\) in the previous state.

7.2 Concentration levels and linear processes

Quantitative modelling of chemical reactions requires taking molecule concentrations into account. An abstract representation of concentrations that is considered in many formalisms is based on concentration levels: rather than representing these quantities as real numbers, a finite classification is considered (e.g., low/medium/high) with a granularity that reflects the number of concentration levels at which significant changes in the molecule’s behaviour are observed. In classical RSs, the modelling of concentration levels would require using different entities for the same molecule (e.g., \(\textsf {a}_{\texttt {l}}\), \(\textsf {a}_{\texttt {m}}\), and \(\textsf {a}_{\texttt {h}}\) for low, medium and high concentration of \(\textsf {a}\), respectively). This may introduce some additional complexity due to the need to guarantee that only one of these entities is present at any time for the state to be consistent. Moreover, consistency would be put at risk whenever the context sequence could provide a molecule with a concentration level different from the one present in the result sequence (e.g., how do we handle conflicting levels \(\textsf {a}_{\texttt {l}}\in D_i\) and \(\textsf {a}_{\texttt {h}}\in C_i\)?).
In Brodo et al. (2021c) we have enhanced RS processes by representing concentration levels with natural numbers associated with entities and using symbolic reactions whose product concentration levels can depend linearly on reactant levels. The idea is to associate linear expressions, such as \(e=m\cdot x+n\) (where \(m\in {\mathbb {N}}\) and \(n \in {\mathbb {N}}^+\) are two constants and x stands for a variable ranging over natural numbers),9 to reactants and products of each reaction. As a special case, when \(m=0\) the expression e is called ground and it is just a positive natural number. In the definition of states we exploit ground expressions: each entity in the state is paired with a natural number representing its current concentration level. In the following we write s(e) to state that the linear expression e is associated to entity s, and s(e)[k/x] to denote the substitution of the variable x by the actual value k in e. Substitutions are extended to sets of entities R as expected, by writing R[k/x] to denote the substitution of the variable x by the actual value k in all the expressions present in R. Expressions associated to reactants are used as patterns to match the current levels of the entities involved in the reaction. Pattern matching finds the largest value for the variable x (the same for all reactants) that is consistent with the concentration levels in the current state. Then, linear expressions associated to products (which, again, can contain the variable x) can be evaluated to compute the concentration levels of those entities. For example, the reactant \(\textsf{a}(x+1)\) requires that the concentration level of \(\textsf{a}\) is at least 1 and uses x to count the additional quantity of \(\textsf{a}\) in the current state; then the product \(\textsf{b}(2x+2)\) can be used to produce a molecule \(\textsf{b}\) with twice the concentration level of \(\textsf{a}\). Expressions can also be associated with reaction inhibitors in order to let such entities inhibit the reaction only when their concentration level reaches a given threshold. However, we require inhibitor expressions to be ground.
For the sake of simplicity we do not report here all the technical definitions in Brodo et al. (2021c) that formalise the computation step, but rather we try to give an intuition of how one computation step is performed. Let us consider a state W and a reaction \(r=(R,I,P)\). We call multiplicity of r w.r.t. W the maximum value k (a natural number) that the variable x can assume so that the concentration levels of the entities in R[k/x] are below or equal to those in W. Of course, the enabling of the reaction also requires that the concentration thresholds of all inhibitors in I are strictly above the corresponding levels in W. The multiplicity k of the reaction determines the level P[k/x] of the products. As a special case, when R and P are ground10 we let the multiplicity be 1 by default. The following example (and the code sketch after it) instantiates the previous description.
Example 15
Assume that we want to write a reaction that produces \(\textsf{c}\) with a concentration level that corresponds to the current concentration level of \(\textsf{a}\) (but at least one occurrence of \(\textsf{a}\) must be present), and that requires \(\textsf{b}\) not to be present at a concentration level higher than 1. Such a reaction would be \(r_1\triangleq (\textsf{a}(x+1),\textsf{b}(2),\textsf{c}(x+1))\). When \(C_0\triangleq \{\textsf{a}(3),\textsf{b}(2)\}\) the reaction \(r_1\) is not enabled because the concentration of the inhibitor is too high (\(\textsf{b}(2)\not < \textsf{b}(2)\)). On the contrary, if \(C'_0\triangleq \{\textsf{a}(3),\textsf{b}(1)\}\) it holds \(\textsf{b}(1)< \textsf{b}(2)\), and the reaction \(r_1\) is enabled with multiplicity \(k=2\), that is the maximum value for x that satisfies \(\textsf{a}(x+1)\le \textsf{a}(3)\). Therefore, the concentration level of the product \(\textsf{c}\) is given by \(\textsf{c}(x+1)[2/x]=\textsf{c}(2+1)=\textsf{c}(3)\). The corresponding trace can be written:
$$\begin{aligned} \frac{\emptyset }{3\textsf{a}+\textsf{b}} \xrightarrow {2r_1} \frac{3\textsf{c}}{\emptyset } \end{aligned}$$
where the concentration levels and multiplicities are represented more concisely using standard multiset notation: we write, e.g., \(3\textsf{a}+\textsf{b}\) for the context where the concentration level of \(\textsf{a}\) is 3 and that of \(\textsf{b}\) is 1, and \(2r_1\) for the enabling of the reaction \(r_1\) with multiplicity 2.
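For concreteness, here is a minimal Prolog sketch of enabling and multiplicity (illustrative only, not code from our implementation), assuming states are represented as lists of Entity-Level pairs and reactant/inhibitor expressions as lin(M,N) terms standing for \(m\cdot x+n\), so that lin(0,N) is ground.

level(W, S, L) :- ( member(S-L0, W) -> L = L0 ; L = 0 ).   % absent entities have level 0

enabled(W, Inhibitors) :-                        % inhibitor expressions are ground:
    forall(member(S-lin(0,N), Inhibitors),       % every threshold strictly above
           (level(W, S, L), L < N)).             % the corresponding level in W

reactant_bound(W, S-lin(M,N), B) :-
    level(W, S, L),
    L >= N,                                      % enough copies for the constant part
    ( M =:= 0 -> B = inf ; B is (L - N) // M ).  % largest x allowed by this reactant

multiplicity(W, Reactants, K) :-
    maplist(reactant_bound(W), Reactants, Bs),
    exclude(==(inf), Bs, Finite),
    ( Finite = [] -> K = 1 ; min_list(Finite, K) ).   % all ground: default multiplicity 1

% Example 15: ?- multiplicity([a-3, b-1], [a-lin(1,1)], K).  gives K = 2, and
% enabled([a-3, b-1], [b-lin(0,2)]) succeeds, while it fails for the state [a-3, b-2].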
In the following we further exploit multiset notation, e.g.,
  • By writing W(s) to denote the concentration level of the entity s in the state W;
  • By writing \(R\subseteq W\) whenever \(R(s)\le W(s)\) for each entity \(s\in S\), i.e., to denote multiset inclusion;
  • By writing \(s(k)\in W\) if s is present in W with a concentration level \(k'\ge k\);
  • By letting \(W(s)=0\) if s is not present in W;
  • By writing s instead of s(1).
It is worth remarking that the concentration level of an entity in the result state is the maximum among the concentration levels assigned to it by the products of the applied reactions. Similarly, the concentration level of an entity in the current state is the maximum between its level in the result state and its level in the context. To this aim, we introduce the notation \(D \sqcup C\) to combine multisets as follows:
$$\begin{aligned} (D \sqcup C)(s) \triangleq \max \{D(s),C(s)\} \end{aligned}$$
i.e., the concentration level of s in \(D \sqcup C\) is the maximum between D(s) and C(s).
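Under the same Entity-Level list representation assumed in the sketch above, the pointwise maximum can be sketched as follows (again illustrative only).

% D ⊔ C as the pointwise maximum of two Entity-Level lists (absent entities count as 0).
lub(D, C, W) :-
    findall(S, (member(S-_, D) ; member(S-_, C)), Ss0),
    sort(Ss0, Ss),                                   % entities occurring in either multiset
    findall(S-L, ( member(S, Ss),
                   ( member(S-LD, D) -> true ; LD = 0 ),
                   ( member(S-LC, C) -> true ; LC = 0 ),
                   L is max(LD, LC) ), W).

% ?- lub([a-2, c-3], [a-5], W).   gives W = [a-5, c-3].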

7.2.1 Dynamic slicing of linear processes

We finally consider the extension of the base slicing algorithm to linear processes. We assume that the user may decide to mark only a subset s(k), with \(k<m\), of the m available s-entities (represented by the notation s(m)) in the state, so that the slicing is concerned with how such a subset of the entities was derived. Moreover, the algorithm has to consider the multiplicity of each reaction in a computation step, and also how many entities come from the context and how many from the result state.
The algorithm we propose is Algorithm 4, where we assume that \(D_i,C_i,N_i,\ldots \) are multisets. Line 5 selects from \(N_i\) all reactions \(r_j=(R_j,I_j,P_j)\) with multiplicity \(k_j\) such that their product \(P_j[k_j/x]\) includes at least one entity \(s\in D'_i\) with a multiplicity \(P_j[k_j/x](s)\) greater than or equal to the one s has in \(D'_i\): in such cases, \(r_j\) can be responsible for the production of the marked entities and its reactants should be considered at stage \(i-1\). Line 6 updates \(N'_i\) to include \(r_j\) with its multiplicity \(k_j\). Then, for any reactant s of \(r_j\), we let m be the multiplicity of s required to enable \(k_j\) instances of \(r_j\) and update one of \(D'_{i-1}\) or \(C'_{i-1}\) accordingly, as sketched below.
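The core of this backward step can be sketched in Prolog as follows (illustrative only, not code from our implementation; for brevity, the sketch collects all marks for step \(i-1\) in a single list, leaving out the split between \(D'_{i-1}\) and \(C'_{i-1}\) and ignoring inhibitors). Reactions applied at step i are assumed to be represented as reaction(Reactants, Products, K) terms with multiplicity K and lin(M,N) expressions as before.

% One backward step of linear slicing: select the reactions whose products
% cover some marked entity, and compute the reactant levels to mark earlier.
responsible(Di, reaction(_, Ps, K)) :-          % product covers a marked entity of Di
    member(S-L, Di),
    member(S-lin(M, N), Ps),
    L =< M*K + N.

mark_reactants(reaction(Rs, _, K), Marks) :-    % levels needed to enable K instances
    findall(S-L, ( member(S-lin(M, N), Rs), L is M*K + N ), Marks).

slice_step(Di, Ni, Responsible, PrevMarks) :-
    include(responsible(Di), Ni, Responsible),
    maplist(mark_reactants, Responsible, MarkLists),
    append(MarkLists, PrevMarks).

% Example 16: ?- slice_step([c-3], [reaction([a-lin(1,1)], [c-lin(1,1)], 2)], Rs, Ms).
% gives Rs = [reaction([a-lin(1,1)], [c-lin(1,1)], 2)] and Ms = [a-3].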
Example 16
Let us consider Example 15. By marking \(\textsf{c}(3)\) in the final state we obtain the following sliced computation step:
$$\begin{aligned} \frac{\emptyset }{\boxed{3\textsf{a}}+\textsf{b}} \xrightarrow {\boxed{2r_1}} \frac{\boxed{3\textsf{c}}}{\emptyset } \end{aligned}$$
We note that, by line 10 of Algorithm 4, we derive a marking of the three \(\textsf{a}\)-entities in the first computation state, which is the quantity required by reaction \(r_1\).
Let us consider one last example which illustrates the differences in the sliced computations when different quantities of entities are marked.
Example 17
Let us consider the following linear RS over \(S=\{\textsf{a},\textsf{b},\textsf{c}\}\) whose reactions are \(r_1=(\{\textsf{a}(x+1),\textsf{c}(x+1)\}, \{\textsf{b}(3)\},\{\textsf{c}(x+1)\})\), \(r_2=(\{\textsf{a}(x+4)\}, \{\textsf{b}(2)\},\{\textsf{c}(x+1),\textsf{a}(x+1)\})\), \(r_3=(\{\textsf{c}(x+1)\}, \{\textsf{b}(1)\},\{\textsf{c}(x+2)\})\), and whose context is \(\textsf{K} = \{ \textsf{a}(5),\textsf{c}(3),\textsf{b}(1) \}.\{ \textsf{a}(2) \}.\emptyset .\textbf{0}\). Then we get the following computation trace:
$$\begin{aligned} \frac{\emptyset }{5\textsf{a}+3\textsf{c}+\textsf{b}} \xrightarrow {2r_1+r_2} \frac{2\textsf{a}+3\textsf{c}}{2\textsf{a}} \xrightarrow {r_1+2r_3} \frac{4\textsf{c}}{\emptyset } \end{aligned}$$
If the user marks \(\textsf{c}(4)\) in the last state, we obtain the following sliced computation:
$$\begin{aligned} \frac{\emptyset }{\boxed{3\textsf{a}}+2\textsf{a}+\boxed{3\textsf{c}}+\textsf{b}} \xrightarrow {\boxed{2r_1}+r_2} \frac{2\textsf{a}+\boxed{3\textsf{c}}}{2\textsf{a}} \xrightarrow {r_1+\boxed{2r_3}} \frac{\boxed{4\textsf{c}}}{\emptyset } \end{aligned}$$
which, by erasing all non-boxed elements, corresponds to:
$$\begin{aligned} \frac{\emptyset }{3\textsf{a}+3\textsf{c}} \xrightarrow {2r_1} \frac{3\textsf{c}}{\emptyset } \xrightarrow {2r_3} \frac{4\textsf{c}}{\emptyset } \end{aligned}$$
If instead the user marks \(\textsf{c}(2)\) in the last state, the resulting sliced computation (again obtained by boxing the relevant elements of the original trace) is larger: since only two \(\textsf{c}\)-entities are selected in the last state, there are two reactions (\(r_1\) and \(r_3\)) that are able to produce at least two \(\textsf{c}\)-entities in the last computation step, and hence we obtain a sliced computation which is a kind of ‘superset’ of the previous one.
Let us note that Proposition 2 can be generalised to the cases of Algorithms 3 and 4 for the quantitative extensions, and it can be proven following a similar pattern.

8 Related work

Dynamic program slicing has been applied to several programming paradigms, for instance to imperative programming (Korel and Laski 1988), functional programming (Ochoa et al. 2008), term rewriting (Alpuente et al. 2011), functional logic programming (Alpuente et al. 2014, 2016), and concurrent constraint programming (Falaschi et al. 2020). RSs use term rewriting and set manipulation as their basic computation mechanism. Thus, Alpuente et al. (2011, 2016) bear some similarity with our work in the adoption of a backward style of computation for the slicer, and Alpuente et al. (2016) uses assertions to stop the computation and to start the slicing process. However, our framework and the ones in Alpuente et al. (2011, 2016) are quite different: the latter are oriented towards functional computations and the Maude language, respectively. In the languages they consider there are neither inhibitors, which introduce a kind of non-monotonic behaviour in the rewriting rules, nor a notion of interactive context. We tried to apply the framework in Alpuente et al. (2011, 2016) to some of our examples, but we were losing too much information, and the resulting simplified computations were not informative. We have instead defined a framework which is totally specialised and suitable for RSs, we use different logics specific to RSs, and we are the first to employ monitors to control the verification process in a slicer. The automatization of the slicing has been inspired by the recent work of Aceto et al. (2020, 2021a, b), here extended to exploit assertions over labels and the explicit marking of the entities that are primarily responsible for the fault detection.
Azimi et al. (2016) define a framework for model checking RSs. There is some similarity with our system, in the sense that both frameworks use a logic for checking properties of the underlying model. However, our system considers set-oriented properties specific to RSs and analyses one computation per execution, while the specifications considered in model checking can be more general. The strategy is also different: slicing does not consider the full set of possible states in all possible execution traces as model checking does. Model checking aims to prove whether a property holds considering all possible computation states, and it provides a counterexample if it does not; slicing returns a simplified trace for one specific computation, which helps the users in their analyses.
In Brodo et al. (2019, 2021b) we derived an LTS similar to the one in this paper by encoding RSs into cCNA, a multi-party process algebra (a variant of the link-calculus defined in Bodei et al. 2019; Brodo and Olarte 2017). In comparison with the encoding of RSs in cCNA, here the SOS semantics is extended to traces and tailored to RSs, without relying on an ad hoc translation.
Our implementation introduces several novel features not covered in the literature and has been designed as a tool for verification, for slicing, and for rapid prototyping of extensions of RSs, not just for their simulation. For ordinary interactive RSs there are already some performant simulators, such as brsim, written in Haskell (Azimi et al. 2015), or HERESY, a GPU-based simulator of RSs written using CUDA (Nobile et al. 2017). Using Prolog has given us the advantage of more flexible, safe and rapid prototyping.
Ehrenfeucht and Rozenberg (2007a) define the notion of event for “extended reaction systems” (which are not standard RSs): given a computation \(W_0, \ldots , W_n\) of the RS, an event roughly consists of a subsequence of the generated states, starting from a set \(U \subseteq W_i\), with \(0 \le i < n\). The sets in the event sequence are called modules. Ehrenfeucht and Rozenberg (2007a) consider that two different events can merge when one event generates a module of state \(W_k\) which is a subset of the corresponding module in the other event. The notion of event is reminiscent of our notion of sliced sequence, but there are many differences. An event corresponds to a forward computation, while we compute a slice backwards. We focus on a set of marked entities, a notion which is not considered in Ehrenfeucht and Rozenberg (2007a): we look for “a minimal module” which includes the entities of interest and try to determine the sliced subsequence responsible for their introduction. Moreover, the notions of event and module do not make a clear separation of the role of the context in deriving the marked entities, which is essential in our framework.
The quantitative enhancements of RSs based on delays and linear processes were first studied in Brodo et al. (2021c, 2023b). Some previous work (Ehrenfeucht and Rozenberg 2009) introduced a notion of time in RSs with the purpose of computing the time distance between two different states in a single state sequence: the authors define, when it exists, a positive and strictly increasing time function, called universal time function, for an RS, which assigns a value to each state and determines how many time units elapse between two consecutive states in an interactive process for that RS. In our approach, the time elapsing between two consecutive states is always one time unit. We define a duration for each reactant as a value indicating for how many time units it will be active, and we associate a delay value n with each reaction, whose effect is to make the produced reagents available only after n time units. We believe that the concept of delay as we defined it in Brodo et al. (2021c, 2023b) is novel in the literature on RSs.
The work in Brijder et al. (2011b) introduces a time duration for each entity that indicates how many steps the entity lasts, instead of decaying at the next time step; the authors point out that this mechanism can also be formalised as an interacting context. The entity duration function is then applied to define entity concentrations and to formalise some theoretical results. Later on, Salomaa (2017) introduced a time duration definition for each entity similar to the one in Brijder et al. (2011b), which is then exploited to state a theoretical result. As in Brijder et al. (2011b) and Salomaa (2017), we specify a time duration for each entity by assigning a value indicating for how many time steps it will be active. Differently from Brijder et al. (2011b) and Salomaa (2017), we define a delay value n associated with each reaction, whose effect is to make the produced reagents available only after n time steps. Thus, we make explicit both the time duration of each entity and the time delay each reaction takes to be completed. We also remark that in Brijder et al. (2011b) and Salomaa (2017) the duration is fixed for each entity, while in our work we can define a different duration for each reaction, leading to different durations depending on the enabled reactions. We believe that our slicing framework is a good basis to be extended also to the works in Brijder et al. (2011b) and Salomaa (2017), even if they mainly seem devoted to the development of theoretical results.
In Meski et al. (2016) the authors introduce an extension of RSs with discrete concentrations, adding some quantitative modelling. There, ‘bags’ are used to count the number of copies of each entity that are necessary to enable a reaction. However, differently from our system, they do not consider a concept of multiplicity: whenever a reaction is applied, the number of produced entities is the one indicated explicitly by the bags of the products of the reaction. Similarly to us, Meski et al. (2016) take the maximum number of entities produced by the reactions, and then the maximum considering also the contribution of the context. We believe that our slicing algorithm for linear processes might be easily instantiated to the system in Meski et al. (2016).

9 Conclusions and future work

We have presented the first framework for dynamic slicing of RSs. We have defined a slicing algorithm which has been applied first to standard RSs: both closed systems and interactive (context-dependent) ones can be dealt with. Then we have shown that the slicer can also be applied to several quantitative extensions of RSs by defining some natural modifications of the basic algorithm. In this way we have been able to define a slicer for analysing RSs that include notions of reaction speed and delays in the activation of entities, as well as discrete concentration levels in reactions (Brodo et al. 2021c, 2023b).
The main advantage of the slicer is the possibility to automatically extract a simplified computation with the minimal amount of information that is relevant to explain the production of the marked entities. Monitors help to identify states which violate a property and to automate the marking of such states, using a flexible language to specify assertions over computations. We have implemented the slicer (Algorithm 1) for RSs, including the monitors, in our working environment called BioResolve11, written in Prolog, and we have applied our tool to some bioinformatic examples. We are currently working on extending our implementation to the quantitative extensions (Algorithms 3 and 4).
As future work, we plan to add forward slicing to our interpreter and to explore the advantages of combining the information that can be derived by proceeding in the two directions (backward and forward), making the analysis more precise. We also want to extend our slicing algorithm to keep track of inhibitors, which are essential for the computation. This would provide the user with fully detailed information about the entities involved in the computation of the marked elements.

Acknowledgements

We thank the anonymous reviewers for their comments and suggestions that helped us to improve our paper.

Declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Ethical approval

This is not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References
Brijder R, Ehrenfeucht A, Rozenberg G (2011b) Reaction systems with duration. In: Kelemen J, Kelemenová A (eds) Computation, cooperation, and life: essays dedicated to Gheorghe Păun on the occasion of his 60th birthday. Springer, Berlin, pp 191–202. https://doi.org/10.1007/978-3-642-20000-7_16
Brodo L, Bruni R, Falaschi M, et al (2021c) Exploiting modularity of SOS semantics to define quantitative extensions of reaction systems. In: Aranha C, Martín-Vide C, Vega-Rodríguez MA (eds) Proc. of TPNC 2021, Lecture Notes in Computer Science, vol 13082. Springer, Berlin, pp 15–32. https://doi.org/10.1007/978-3-030-90425-8_2
Brodo L, Bruni R, Falaschi M (2023a) Dynamic slicing of reaction systems based on assertions and monitors. In: Hanus M, Inclezan D (eds) Proc. of Practical Aspects of Declarative Languages–25th Int. Symp., PADL 2023, LNCS, vol 13880. Springer, Berlin, pp 107–124. https://doi.org/10.1007/978-3-031-24841-2_8
Ehrenfeucht A, Rozenberg G (2007) Reaction systems. Fundam Inform 75(1–4):263–280
Ivanov S, Rogojin V, Azimi S, et al (2018) WEBRSIM: a web-based reaction systems simulator. In: Díaz CG, Riscos-Núnez A, Paun G et al (eds) Enjoying natural computing–essays dedicated to Mario de Jesús Pérez-Jiménez on the occasion of his 70th birthday, Lecture Notes in Computer Science, vol 11270. Springer, Berlin, pp 170–181. https://doi.org/10.1007/978-3-030-00265-7_14
Plotkin GD (1981) A structural approach to operational semantics. Tech. Rep. DAIMI FN-19, Computer Science Department, Aarhus University