Steady-state property veriﬁcation: a comparison study

Model checking of probabilistic models can be done either by numerical analysis or by simulation and statistical methods. In this paper, we compare the efﬁciency and the scalability of different model checking approaches when they are applied to the veriﬁcation of steady-state properties of large models. We provide an experimental comparison study between the statistical model checking using perfect sampling implemented in Ψ 2 [15] and proposed in [11, 10] and the numerical method implemented in PRISM [6], for the veriﬁcation of CSL [2] steady-state properties. We show that the proposed statistical approach lets us to consider very large models


INTRODUCTION
Model Checking is a technique for automated verification of software, hardware and network systems.It has been introduced to verify functional properties of systems expressed in a formal logic like Computational Tree Logic (CTL).It is done by accepting as input system models and the properties or specifications that the final system is expected to satisfy and by giving outputs Yes if the given model satisfies given specifications and No otherwise.Probabilistic model checking is an extension for the formal verification of systems exhibiting stochastic behavior.The system model is usually specified as a state transition system, with probability values attached to transitions, for example Markov Chains.¿From this model, a wide range of quantitative performance, reliability, and dependability measures of the original system can be computed.These measures can be specified using temporal logics as PCTL for Discrete Time Markov Chains (DTMC) and CSL for Continuous Time Markov Chains (CTMC).These temporal logics are extensions of the Computational Tree Logic (CTL) [3].The underlying stochastic models which are usually Markov chains is defined by a high-level formalisms such as stochastic Petri nets, stochastic process algebras, and queueing networks, a finitestate model of a real-life system.We consider queueing networks as high level formalism for the case studies in this paper.
There are two distinct approaches to perform probabilistic model checking: Numerical techniques based on computation of transient-state or steadystate distributions of the underlying Markov chain and statistical techniques based on hypothesis testing and on sampling by means of discrete event simulation or by measurement.In fact, numerical approach is highly accurate but it suffers from state space explosion problem while statistical approach can overcome the state space explosion problem but it provides verification results with probabilistic guarantees of correctness.Thus statistical model checking techniques can be seen as an alternative to numerical techniques and they can be applied when it is infeasible to use numerical techniques.In the last years, different statistical model checkers have been proposed [12,17,13] especially for properties specified by time-bounded until formulas.Moreover, the statistical model checker MRMC [8] has been proposed and extended to support the statistical model checking of CSL steady-state property.For this formula the probability is estimated based on steady-state simulation of bottom strongly connected components (BSCCs) and estimates for the probabilities to reach those BSCCs.On the other hand, for numerical techniques, the PRISM numerical model checker [6] is largely used.It makes use of symbolic data representation in order to reduce memory requirements for numerical techniques and it supports the verification of CSL steady-state property.
In [5] numerical and statistical techniques have been compared when they are applied to the verification of time-bounded until formulas in the temporal stochastic logic CSL.In this work, we compare the efficiency of the numerical model checking implemented in PRISM tool [6] and our statistical model checking approach proposed in [11,10] which combines perfect sampling and statistical hypothesis testing to study the steady-state properties of large Markovian models.The significant advantage of perfect sampling is that it provides an unbiased sampling of the steady-state distribution, hence the accuracy of the verification result only belongs to the statistical testing.In other words, we ensure the correctness of our results considering a specified precision level.This paper is organized as follows: Section 2 briefly presents the temporal logic CSL.In section 3 we present our proposed approach based on perfect sampling.We give a brief introduction of the studied tools in section 4. Section 5 is devoted to the case studies.First we present the models and their validation.Next, we compare and analyze the results of our experiments.Finally, in section 6 we summarize the conclusions and provide the future works.

CSL MODEL CHECKING
The Continuous Stochastic Logic (CSL) is a branching-time temporal logic with state and path formulas over CTMCs [2].Thus it is useful to specify performance and dependability measures as logical formulas over CTMCs.We consider indeed a labelled CTMC defined over a finite state space X .Let AP denote the finite set of atomic propositions.L : X → 2 AP is the labeling function which assigns to each state s ∈ X the list of atomic propositions satisfied in this state.Intuitively speaking, when the system is in state s, the properties defined by the set of atomic propositions L(s) assigned to this state are satisfied.The satisfaction operator is denoted by |=, then for all state s ∈ X , s |= true.Atomic proposition a is satisfied by state s (s |= a) iff a ∈ L(s).The logic operators are obtained using standard logic equivalence rules : We give here only the steady-state operator that will be used in this paper.The steady-state operator (formula) S ⊲⊳θ (ϕ) lets us to analyze the long-run behaviour of the system where θ is a probability threshold, ⊲⊳ a comparison operator, for example ⊲⊳∈ {<, >, ≤, ≥}, ϕ is a state formula.In this work, we consider ergodic CTMCs, hence there is unique steady-state distribution independent of the initial state.Contrary to the model checking formalism, the satisfaction property is assigned to the model but not to an inital state.We check if the underlying model M satisfies the steady-state property or not.M |= S ⊲⊳θ (ϕ), if the property specified by the steadystate operator S is satisfied by the model M .Note that for a steady-state property ψ = S ≥θ (ϕ), the verification of S ≥θ (ϕ) is the same as S ≤1−θ (¬ϕ) and also is the same as ¬S <θ (ϕ).

STATISTICAL MODEL CHECKING BY PERFECT SAMPLING
In statistical model checkers, it is generally focused on the time-bounded until formulas.Recently, we have proposed to statistically check steady-state properties by means of perfect sampling and hypothesis testing [10,11].In this section, we briefly present how sample paths are generated and tested for the statistical model checking of the steady-state formula ψ = S ≥θ (ϕ).

Perfect Sampling
Let {X n } n∈N be an irreducible and aperiodic discrete time Markov chain with a finite state space X and a transition matrix P = (p i,j ).Let π denote the steady state distribution of the chain: π = πP .The evolution of the Markov chain can always be described by a stochastic recurrence sequence with {e n } n∈Z an independent and identically distributed sequence of events e n ∈ E, n ∈ N. The transition function η : X × E → X verifies the property that P (η(i, e) = j) = p i,j for every pair of states (i, j) ∈ X × X and a random event e ∈ E.
An execution of the Markov chain is defined by an initial state x 0 and a sequence of events {e n } n∈Z .The sequence of states {x n } n∈Z defined by equation ( 1) is called a trajectory.
Z ← η(Z, e i ) ; { η (2 n ) (X , e −2 n +1→0 ) ⊂ Z the bounding set of all possible states at time 0 knowing the event process starting at time −2 n } 9: from all possible initial states in X at time −τ ) are coupled before time 0 then the sampled state at time 0 is exactly distributed according to the stationary distribution.
Definition 1 Given a partial order on X , an event e is said to be monotone if it preserves the partial ordering on X .That is If all events are monotone, the global system is said to be monotone.
According to an order on X , there exists a set M ⊂ X of extremal states (maximal and minimal states).When a Markov chain is monotone, all trajectories issued from X are always bounded by trajectories issued from M .Thus, it is sufficient to compute trajectories issued from M since when they couple, global coupling also occurs.As the size of M is usually drastically smaller than the size of X , monotone perfect sampling [9] significantly improves the sampling time.
Efficiency of simulations is also improved by functional perfect sampling [16].The algorithm sample a reward value, according to a user defined reward function r : X → R; The algorithm then stops when all trajectories are in a set of states at time 0 that belong to the same reward value (going further in the past will inevitably couple in a state that belong to this reward value).To combine monotone and functional perfect sampling, the reward function r must be monotone, that is x y ⇒ r(x) r(y).As |R| is smaller than |X |, this technique may lead to an important reduction of the coupling time.
Algorithm 1 is the perfect sampling algorithm that we use in this paper for steady-state property verification.It combines the monotone and functional techniques explained above.The main loop follows a doubling period scheme to find a time −τ sufficiently far so that coupling occurs (Propp and Wilson have shown that doubling period scheme provides a better mean complexity).In a property verification context, we focus on reward functions that correspond to properties we want to check, so that R = {0, 1}.In Figure 1, we show an illustration of the behaviour of algorithm 1.In this example, like in case studies of section 5, the set of extremal states is composed of one maximum and one minimum; M = {M ax, M in}.
Note that, as the reward function is monotone, values 0 and 1 cover contiguous zones of the state-space.Then, an interesting phenomenum happens when the property to be checked has a small set of positive states {x ∈ X |r ϕ (x) = 1} (ϕ corresponds to a rare property / event): coupling frequently occurs in reward value 0 and the coupling time is very short (the time needed by the trajectory issued from M ax to decrease until it leaves the "positive zone").Moreover, if |{x ∈ X |r ϕ (x) = 1}| does not depend on |X | (case of saturation properties for example), then the performance of algorithm 1 will be as good for very large state-spaces as for small ones.This intuition is validated by results of section 5.

Statistical hypothesis testing
The probabilistic model checking consists in deciding whether the probability that the considered system satisfies the underlying property ϕ meets a given threshold θ or not.Let p be the probability that the system satisfies ϕ, then the verification problem of ψ = S ≥θ (ϕ) where ϕ is for example the availability property of a network system, can be formulated as an hypothesis testing: H : p ≥ θ against the alternative hypothesis K : p < θ.In fact, the strength of the statistical test was determined by two parameters: α and β, where α is a bound on the probability of accepting K when H holds (known as a type I error, or false negative) and β is a bound on the probability of accepting H when K holds (a type II error, or false positive).
In practice, two thresholds, p 0 and p 1 are defined in terms of the probability threshold, θ, and the halfwidth δ of the indifference region: Suppose that we have generated n samples, and a sample X i is a positive sample (X i = 1) if it satisfies ϕ and negative (X i = 0) otherwise.X i is a random variable with Bernoulli distribution with parameter p.Thus the probability to obtain a positive sample is p.

Single Sampling Plan (SSP):
It is based on the acceptance sampling with fixed sample size and with a given acceptance strength (α, β).
accepted, where m is the acceptance threshold.The hypothesis H 1 will be accepted with probability F (m, n, p) and the null hypothesis H 0 will be accepted with the probability It is required that the probability of accepting H 1 when H 0 holds is at most α, and the probability of accepting H 0 when H 1 holds is at most β.These constraints can be illustrated as below: The number of samples n and the acceptance threshold m must be chosen under these constraints and their approximation formulas are given in [19].

Verification of steady-state properties
We have as input parameters: the model defined by a labelled CTMC, M , the property ϕ (to be verified on each sample).Formally, the steady-state property is specified ψ = S ≥θ (ϕ).Moreover, the threshold parameter θ, the indifference region parameter δ, α, β for the strength of statistical hypothesis testing are the other input parameters.
We propose to apply functional perfect sampling (Algorithm 1, Figure 1), so at time 0, we test if the rewards are coupled at reward 0 or 1.In other words, we test if it is a positive or negative sample.Thus we associate the reward r ϕ (x) to each state x ∈ X for the given property ϕ: On the other hand, the statistical decision method we use when performing our statistical hypothesis testing on generated samples for the steady state formulas is inspired from the Single Sampling Plan (SSP) method.In SSP method the number of samples n and the acceptance threshold m will be computed by using the approximation formulas given in [19].In fact, this decision method tests if ϕ is verified (positive sample) or not (negative sample) on each generated sample path, when counting the number of positive samples.Then it provides decision either Yes if the number of positive samples is greater or equal to m (ψ is satisfied) or No otherwise (ψ is not satisfied).

Probabilistic Symbolic Model Checker: PRISM
PRISM [6] is a largely used probabilistic model checker developped at the University of Birmingham (http://www.prismmodelchecker.org/).It supports three types of probabilistic models: discretetime Markov chains (DTMCs), continuous-time Markov chains (CTMCs) and Markov decision processes (MDPs).PRISM has been used to analyse systems from a wide range of application domains, including communication and multimedia protocols, randomised distributed algorithms, security protocols, biological systems and many others.Models are described using the PRISM language, a high level language.PRISM supports automated analysis of a wide range of quantitative properties of these models.In fact, the property specification language incorporates the temporal logics PCTL, CSL.In addition, PRISM incorporates symbolic data structures and algorithms used for state space representation, based on BDDs (Binary Decision Diagrams) and MTBDDs (Multi-Terminal Binary Decision Diagrams).For numerical computation, PRISM includes three separate engines making varying use of symbolic methods.These engines use different data structures: The first engine generates an MTBDD to represent the transition matrix, the sparse engine permits to convert the transition matrix to a sparse matrix.The hybrid engine is generally faster than MTBDD one, and while handling larger systems is expected to be faster and to consume less memory than sparse matrices, and hence is the one used in this paper.The user interface and parsers of PRISM are written in Java; the core algorithms are mostly implemented in C++.It also features discrete-event simulation functionality for generating approximate results to quantitative analysis but it does not support steady-state properties.
We represent in Figure 2, a simplified architecture of the PRISM tool.As input the PRISM engine takes the model file (DTMC, CTMC) and the property specification file (PCTL, CSL).This engine performs numerical computation of the probability p that we look for by solving a numerical equations system, then it compares the computed probability p to the property threshold, θ and then it generates an output file containing the model checking decision (True, False) and a log file containing the model checking time (VT) in seconds and the memory consumption (VSZ) in Kbytes.Queueing systems description is based on an events library which is continually improved by the MESCAL team.All events of this library are monotone.Typical events of queueing networks are client arrivals, end of services and routing from one queue to another.Complex routing strategies had been captured in a common framework, based on index functions [14], so that a large scope of monotone queueing networks can be studied.Moreover, the sandwiching principle of monotone backward coupling had been generalized to non-monotone queueing networks (envelopes techniques [1]) and implemented in Ψ 2 .Ψ 2 has been used in various domains, including networks dimensioning, telecommunications systems, resource brokering problems, etc. Ψ 2 is well suited for probabilistic model checking, and in particular for steady state formula verification, since it provides an unbiased sampling of stationary rewards and guarantees independence of samples.
It is specifically suited for rare event probability estimation, as was ever done in [14].
We represent this tool in Figure 3.As input the Ψ 2 engine takes the model file (Queueing network model) and the property specification file (Reward function).This engine performs a statistical hypothesis testing for the computed probability p (number of positive samples over the total number of samples).It compares the computed probability p to the property threshold and it generates then an output file containing the model checking decision (True, False) and a log file containing the model checking time (VT) in seconds.

EXPERIMENTAL COMPARISON STUDY
We now evaluate three case studies, taken from Ψ 2 and PRISM benchmarks, on which we will base our performance and scalability comparison.In fact, we verify the steady-state formula for these three case studies using both the numerical (PRISM tool) and the statistical approach (Ψ 2 tool), by varying the problem size (state space size related to the maximal queue capacity).We illustrate the verification time in seconds for these case studies, as a function of the maximal queue capacity (state space size).Since the considered Markovian models are ergodic, thus the steady-state probabilities are independent of the initial state.Thus, the considered steady-state property is satisfied or not whatever the initial states.

Tandem network
This model is taken from the Ψ 2 benchmark and we implemented it as a PRISM model.We consider b finite buffers in tandem where each buffer is a M/M/1/N max queue (Figure 4).This tandem network is defined by an input Poisson process (rate λ) at the first stage and by an exponential service rates in each stage.Let µ i be the service rate in stage i.In fact, the end of service in stage i, 1 ≤ i ≤ b − 1 constitutes an arrival to stage i + 1.The packet acceptance mechanism is the rejection: a packet which arrives to a full buffer is lost.Denote by N max the maximal capacity of each queue.The state space associated to this tandem network is defined by (N max + 1) b .Three types of events occur in this system : Arrival from exterior, end of service in stage i and arrival to stage i + 1, and departure to exterior after end of service in the last stage.The monotonicity of these events is shown in [14].Thus this considered model is monotone.
In the sequel, we denote by s = (n 1 , n 2 , • • • , n b ) a state of this Markov chain.We are interested in saturation properties in the last stage and we can also conclude about availability properties.Since all earlier stages must be taken into account to compute saturation probabilities in the last buffer b, we must consider whole Markov chain of (N max + 1) b size.Thus the numerical complexity to solve the underlying model increases rapidly with N max .We define the following atomic proposition related to buffer b : last-full is valid if the b th buffer is full.Based on this atomic proposition, we check the following steady-state formula: S ≤θ (last-full) to check whether the probability that buffer b is full in steady-state is less than θ or not.

Multistage Delta Network (MDN)
In telecommunication networks, multistage models are used for modelling switches.This model is taken from the Ψ 2 benchamrk and we implemented it as a PRISM model.We show the monotonicity of this model in the following.The considered model is a delta network with y stages and z buffers at each stage (Figure 5).Thus the total number of queues (buffers) is b = y * z.With Markovian arrival and service hypothesis, the model can be defined as a CTMC with a state vector (N 1 , N 2 , • • • , N b ) where N i is the number of packets in the i th queue.The size of the state space is then (N max + 1) b , if the maximum queue size is N max .We suppose an homogeneous input trafic with arrival rate λ to the first stage and service rate is µ in each queue.The routing policy is rejection (packets are lost if the queue is full) and at the end of a service in stage i the routing service rates to stage i + 1 are (τ rout1 , τ rout2 ) with 1 ≤ i ≤ y − 1.There are events (z external arrivals at the 1 st stage, z departures at the y th stage, 2 × z routing events between stage i and stage i + 1 with 1 ≤ i ≤ y − 1.The monotonicity of these events and thus the monotonicity of this model has been shown in [14].State labels are defined through atomic propositions depending on the number of packets in queues.For a given k ∈ {0, • • • , N max }, the atomic proposition a i (k) is true if N i ≥ k and false otherwise.For example, a i (N max ) is true if the i th buffer is full.
The underlying CTMC is labelled with these atomic propositions depending on the considered property.It is indeed possible to express different interesting availability and reliability measures for the underlying system by means of these atomic propositions.For instance, with formula ψ=S <θ (a i (N max )) we can check the saturation property in the i th buffer to see the long run saturation probability of the i th buffer is less than θ or not.This lets us also to check the availability property, S >1−θ (¬ a i (N max )).

Figure 5: Multistage Delta network
We define the atomic proposition last-stage-full that is valid if at least a queue at the second level is saturated.Thus it is defined as the disjunction of atomic propositions a i (N max ), 4 ≤ i ≤ 7. Based on this atomic proposition, we consider to verify the following steady-state formula ψ = S <θ (last-stagefull) that let us to study saturation or availability properties (S >1−θ (¬ last-stage-full)).

Tandem Queueing Network with coxian phase (TQN)
This model is taken from PRISM benchmark and we implemented it as a Ψ 2 model.The non-monotonicity of this model is shown in the following.The system consists of an M/ Cox 2 /1 queue sequentially composed with an M/M/1 queue (Figure 6).Let N max to be the maximal capacity of each queue then the state space is O((N max + 1) 2 ).Messages arrive at the first queue with rate λ, and exit the system from the second queue with rate κ.If the first queue is not empty and the second queue is not full, then messages are routed from the first to the second queue.The routing time is governed by a two-phase Coxian distribution with parameters µ 1 , µ 2 , and a.
Here, µ i is the exit rate for the i th phase of the distribution, and 1 -a is the probability of skipping the second phase.Let x i ∈ {0, • • • , N max }, for i ∈ {1, 2}, denote the number of messages currently in queue i, and x ph ∈ {1, 2} denote the current phase of the Coxian distribution.We define the atomic proposition that the system is full with the formula sys-full = Based on this atomic proposition, we check the following Steady-state formula: S ≤θ (sys-full) to check whether the probability that the system is full in steady-state is less than θ or not.In Ψ 2 , the evolution of the state of a Markov chain is described by a sequence of events, that is arrival of a client in a queue, routing to another queue, end of service etc.In one state x of the Markov chain, an event e models a transition to another state x ′ = η(x, e) where η is called the transition function of the system.
To model a Coxian server with two phases, we consider 3 events: phase 1 service d 1 , phase 1 service skipping the second phase d ′ 1 and phase 2 service d 2 .
The implementation of a Coxian server in Ψ 2 then consists in encoding the transition function of these three events in the Ψ 2 event library.If the events are monotone, the transition function is directly encoded according to the dynamics of the system.However, if an event is not monotone, non monotone techniques such as defined in [1] must be involved.Let X = {0, . . ., N max } × {1, 2} × {0, . . ., N max } be the state space of the system, we consider the component-wise partial ordering on X .Given two states x, y ∈ X , we have Unfortunately, in the case a cox-2 server, events d ′ 1 and d 2 are non monotone, given the following counterexamples : In fact, the event d ′ 1 (resp.d 2 ) has no effect on a state where the server is in phase 2 (resp.phase 1), and the corresponding transition function is the identity function.On another hand, the effect of the event on a phase 1 state (resp.phase 2 state) is the move of a client from Q 1 to Q 2 , which increases the load of Q 2 (resp.decreases the load of Q 1 ) and can result in an order inversion on Q 2 (resp.on Q 1 ), as in above counterexamples.
The events d ′ 1 and d 2 has been implemented in Ψ 2 by defining an envelope function for each one.The technical details of these envelope functions are beyond the scope of this paper and it is explained in a research report [4].The interested reader can also refer to [1] for a detailed introduction to the envelope technique.Instead, in the next section, we prove the validity of our implementation using classical goodness-of-fit tests.

Model Validation
In this section, we assess the correctness of our implementation of a two phases Coxian server in Ψ 2 .The validation method is defined as follows: We compute an empirical estimation of the stationary distribution from a sample of size n, obtained with Ψ 2 .Then, the resulting estimation is compared with the exact stationary distribution.We use PRISM (numerical method) to compute the exact distribution, so that we also assess that the same models are considered in both PRISM and Ψ 2 .We use a Chi-square test to measure the fitting of a Ψ 2 sample with the theoretical expected distribution.Details on the Chi-square test can be found in [7].Note that grouping into classes (the distance between the estimated and theoretical distributions is measured on groups of states) is needed for some models, in particular when there are a lots of states with a probability of almost 0.
Results of the Chi-square tests are reported in Tables 1 and 2 for various values of the capacity of queues, the arrival rate λ and the service rate κ.
In every cases, we have D < χ 2 (0.95,k−1) , so that the Chi-square test tell us that samples obtained with Ψ 2 are statistically correct with a confidence level of 95%.We also applied this validation method to assess the correctness of our implementation of Tandem network and of Multistage delta network models in PRISM.

Experimental setup
Tools and hardware settings The results were generated on a 1.5 GHz Intel Core 2 Duo PC running Linux, and with a 2 GB of RAM.The PRISM tool has mainly two parameters ǫ : the desired convergence error or precision; maxiternumber : the maximal number of iterations to obtain the result with certain precision.Ψ 2 tool has four parameters N samp : the desired number of samples; α : the desired probability of false-positive answer for H.T. ; β : the desired probability of the false-negative answer for H.T. ; δ : the half width of the indifference region.In fact, to match these tools parameters and in order to have a fair comparison we take ǫ = 2.δ [18].
Timing In (probabilistic) model checking, two time factors are of interest: the model construction time, i. e. the time to build the internal representation from the input model, and the model checking time, i. e. the time to verify the property on the internal representation.We mainly focused on the model check time.In our comparison study, we use the time as reported by the system command T ime (real value).

Verification time precision
All experiments were repeated many times using shell scripts.An experiment consists of verifying one property on one particular model using one of the model checkers.Each experiment was repeated 20 times, except that experiments for which a single run took more than 30 minutes were repeated only three times.Thus, from the collected data (runtime), we calculated the mean and the standard deviation with 95% of confidence level.Generally, the obtained precision on the collected data will be greater or equal to 10 −2 .

Memory consumption
In the case of the numerical solution method, all experiments were run using the hybrid engine which, although not necessarily the fastest engine, in general allows the analysis of larger problems than the other engines.The limiting factor in the hybrid approach is the space required to store the iteration vector, then the memory is proportional to the number of states.On the other hand, the memory requirements for the statistical approaches are very conservative.In principle, the current state is needed to be stored during verification, which only requires memory logarithmic in the size of the state space.Then memory is never exhausted during verification when using the statistical solution method.For numerical application, we consider λ =0.9, all service rates will be state-independent with rate
In all of the tables and figures we denote by: P RISM : numerical model checking time in seconds for the steady-state formula by using PRISM hybrid engine.P SI2 : statistical model checking time in seconds for the steady-state formula by using our verification method (section 3.3) implemented in Ψ 2 engine.outmem : an out of memory message in PRISM tool.iterpr: a maximal iteration number problem (we consider maxiternumber = 100000 in PRISM tool).

Discussions
Our results confirm that for steady-state property, our statistical method scales better with the size  of the state space.Moreover it is generally faster.However, high accuracy comes at a greater price than for numerical method.Tables 3, 4, and 5 and their corresponding figures show the verification time in seconds as function of state space size in the three case studies.In fact, for the smaller models (monotone and non-monotone cases), the PRISM tool is slightly faster.However, for the larger models (monotone and non-monotone cases), our statistical method implemented in Ψ 2 is the fastest.Moreover, Tables 3, 4 and 5 show the memory limits of the PRISM tool for large state space size while the Ψ 2 tool scales better, is efficient and do not have any memory limitation for these large state space size.For Tandem network case, we obtain an out of memory message in PRISM tool from the value |X | = 1.0 * 10 8 (corresponding to N max = 99).For Multistage delta network case, we obtain an out of memory message in PRISM tool from the value |X | = 2.1 * 10 8 (corresponding to N max = 99).Moreover, for Tandem network with coaxian phase case, we obtain an out of memory message in PRISM tool from the value |X | = 1.1 * 10 8 (corresponding to N max = 7500).
On the other hand, we have variated the precision parameters of numerical (ǫ) and of statistical (δ) methods.Thus we have note from Tables 3, 4 and 5 that the numerical verification time dependence on ǫ is negligible while the statistical verification time dependence on δ is considerable.Moreover, we refer to section 3.1 to explain why in some of our case studies, the obtained statistical verification time does not depend on |X |.Finally note that, we have used single acceptance sampling method in our statistical method implemented in Ψ 2 .However, even if we have not use sequential acceptance sampling which is more efficient than single one, we have obtain more efficient and more scalable results than numerical method.

CONCLUSION AND FUTURE WORKS
In this paper, we have empirically compared numerical and statistical solution techniques for probabilistic model checking on case studies taken from the PRISM and Ψ 2 benchmarks.We focused our attention on steady-state properties.For these properties, we have found that our statistical method implemented in Ψ 2 scales better with the state space size and it is faster than PRISM tool especially for large models.In fact, we aim to find the limiting problem sizes for the considered case studies.We see that the statistical approach scales well with the problem size and it lets us to consider very large models.We consider to compare our statistical method (single and sequential acceptance sampling) with the one implemented in MRMC tool.

Figure 1 :
Figure 1: Example of a backward monotone steady-state property-sampling

Figure 6 :
Figure 6: Tandem queueing network with coxian phase Nmax : Capacity of both the first and the second queue.n : Number of samples.|X | : State-space size.k : number of classes used for the Chi-square test.D : Chi-square computed distance.χ 2 (0.95,k−1) : 95-percentiles of the Chi-square distribution with k − 1 degrees of freedom.

Table 1 :
Chi-square tests varying the capacity.

Table 3 :
Tandem network: Verification time as function of state space size |X | and of queue capacity Nmax for S<0.001 (last-full)

Table 4 :
MDN : Verification time as function of state space size |X | and of queue capacity Nmax for S<0.001 (laststage-full) Figures 10, 11, and 12.In fact, for N max = 10 we obtain an out of memory message with PRISM.

Table 5 :
TQN: Verification time as function of state space size |X | and of queue capacity Nmax for S<0.001 (sys-full)