
3 The Termination Detection Algorithm

We will extend the System interface to

SystemT_{n in N}(act in Z_n->B, chan in Z_n x Z_n->Seq(MSG), term in B)

such that term, initialized by

Init(...) :<=>

/\

...
~term

signals termination of the system.

Basic Idea

The derivation of the termination detection algorithm is based on the following idea (see Figure 2): If Process 0 wants to detect termination, it sends a signal to process n-1. If an inactive process i > 0 receives the signal, it forwards the signal to process i-1. An active process keeps the signal until it becomes inactive. The signal therefore represents a "token" that circulates through the ring of processes.

We model this circulation by introducing a variable

sig in Z_n->B

initialized by

Init(...) :<=>

/\

...
forall i in Z_n: ~sig_i

and by two actions

Start_i(...) :<=>

/\

i = 0
sig' = sig[i-_n1 |-> true]

Forward_i(...) :<=>

/\

sig' = if i != 0 then sig[i |-> false, i-_n1 |-> true] else sig[i |-> false]

where a-_nb denotes the difference modulo n (e.g., 0-_n1 = n-1).

(Modeling token circulation by an array of boolean signals reflects the behavior of the distributed algorithm more closely than modeling it by a single integer position that "automatically" guarantees token uniqueness.)
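The two actions can be sketched in Python as pure functions on the signal array. This is only an illustration, not part of the specification: the function names `start` and `forward` are chosen here, returning `None` stands for "action not enabled", and the guard that process i actually holds the token is borrowed from the final form of Forward in Specification 2.

```python
def start(sig, i):
    """Start_i: only process 0 may inject the token; it goes to process n-1."""
    n = len(sig)
    if i != 0:
        return None                      # action not enabled
    new = list(sig)
    new[(i - 1) % n] = True              # sig' = sig[i -_n 1 |-> true]
    return new


def forward(sig, i):
    """Forward_i: process i passes the token to i-1; process 0 consumes it."""
    n = len(sig)
    if not sig[i]:
        return None                      # assumed guard: i must hold the token
    new = list(sig)
    new[i] = False
    if i != 0:
        new[(i - 1) % n] = True
    return new


# One full circulation in a ring of four processes.
sig = start([False] * 4, 0)
for i in (3, 2, 1, 0):
    sig = forward(sig, i)

# Nothing prevents a second Start while a token is still travelling.
two_tokens = start(forward(start([False] * 4, 0), 3), 0)
```

After the loop every entry of `sig` is false again, while `two_tokens` contains two true entries, which the array model makes visible and a single integer position would hide.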

Figure 2: Token Circulation

However, the above specification allows Process 0 to submit a new token before it has received the token it has previously submitted. We therefore introduce another variable

run in B

that records whether there is a token in the system and then disables Start:

Start_i(...) :<=>

/\

...
~run/\ run'

Forward_i(...) :<=>

/\

...
run' = if i != 0 then run else false
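As a minimal sketch of how run disables Start, the following Python class keeps run next to the signal array. The class name `Ring` and the convention of returning `False` when an action is not enabled are assumptions made for this illustration; the guard `sig[i]` is again taken from Specification 2.

```python
class Ring:
    """Token ring with the run flag; process activity is ignored here."""

    def __init__(self, n):
        self.n = n
        self.run = False
        self.sig = [False] * n

    def start(self):
        """Start_0: enabled only if no token is in the system."""
        if self.run:
            return False
        self.run = True
        self.sig[(0 - 1) % self.n] = True
        return True

    def forward(self, i):
        """Forward_i: pass the token on; process 0 clears run instead."""
        if not self.sig[i]:
            return False
        self.sig[i] = False
        if i != 0:
            self.sig[(i - 1) % self.n] = True
        else:
            self.run = False
        return True


ring = Ring(3)
first = ring.start()        # token submitted
second = ring.start()       # refused: a token is already circulating
for i in (2, 1, 0):
    ring.forward(i)
third = ring.start()        # allowed again after the token has returned
```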

However, when Process 0 finally receives the token, it can only deduce that each process has been inactive at some time in the past. Since a process may have received a message after forwarding the token, it may have become active again. Only if the algorithm could maintain the invariant

forall k in Z_n: sig_k =>
forall j in Z_n: j > k => ~act_j,

Process 0 could deduce from the receipt of the token (i.e., sig₀) and its own inactivity (i.e., ~act₀) that all processes are currently inactive.

Message Counters

Even if Process 0 receives the token under the above invariant, there may still be messages pending in the network that can cause processes to become active again. We therefore need some means to keep track of the number of messages pending in the network. Since a process only knows about the messages that it itself sends or receives, this can only be achieved by introducing a distributed counter

cnt in Z_n->Z

such that every process sending a message increases its counter and every process receiving a message decreases its counter (see Figure 3):

Send_{i, j}(...) :<=>

/\

...
cnt' = cnt[i |-> cnt_i+1],

Receive_{i, j}(...) :<=>

/\

...
cnt' = cnt[j |-> cnt_j-1]

Figure 3: Message Counters

Then clearly the total sum of the distributed counter equals the number of the messages in the network, i.e., the system maintains the invariant

sum_{i in Z_n}cnt_i = sum_{i in Z_n}sum_{j in Z_n}len(chan_{i, j})

where len(ch) := such l in N: exists m₀, ..., m_{l-1}: ch = < m₀, ..., m_{l-1} > .
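The counter invariant can be checked on random executions with a few lines of Python. This is a sketch under stated assumptions: Send_{i, j} is read as "process i appends a message to chan_{i, j}" and Receive_{i, j} as "process j removes the head of chan_{i, j}", process activity is ignored, and the function name `simulate_counters` and its parameters are invented for the example.

```python
import random
from collections import deque


def simulate_counters(n=4, steps=1000, seed=0):
    """Randomly send and receive messages, checking the invariant each step."""
    rng = random.Random(seed)
    cnt = [0] * n
    chan = {(i, j): deque() for i in range(n) for j in range(n)}
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if rng.random() < 0.5:
            chan[(i, j)].append("msg")   # Send_{i,j}
            cnt[i] += 1                  # cnt' = cnt[i |-> cnt_i + 1]
        elif chan[(i, j)]:
            chan[(i, j)].popleft()       # Receive_{i,j}
            cnt[j] -= 1                  # cnt' = cnt[j |-> cnt_j - 1]
        # Sum of all counters equals the number of messages in transit.
        assert sum(cnt) == sum(len(q) for q in chan.values())
    return cnt, chan
```

Individual counters may well become negative (a process that mostly receives), which is why cnt ranges over Z rather than N; only the total is meaningful.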

Token Value

How can Process 0 learn about this sum? Apparently, this is possible only if the circulating token collects this information and delivers it to Process 0. We therefore introduce

tval in Z

modeling the value carried by the token. Process 0 initializes the value to 0 when submitting the token; each forwarding process adds the value of its counter (see Figure 4):

Start_i(...) :<=>

/\

...
tval' = 0

Forward_i(...) :<=>

/\

...
tval' = tval+cnt_i

Figure 4: Token Value

If the algorithm could maintain the invariant

forall k in Z_n: sig_k =>

/\

forall j in Z_n: j > k => ~act_j
tval = sum_{j in Z_n, j > k}cnt_j

then Process 0 could on receipt of the token deduce from ~act₀ that all processes are inactive and from cnt₀+tval = 0 that no message is in the network, i.e., that the system has terminated.
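To see where the equation tval = sum_{j > k} cnt_j comes from, the following Python sketch traces the token value through one round. It assumes that the counters do not change while the token travels, which is exactly the idealization that the real system does not satisfy; the function name `token_values` is chosen for this illustration.

```python
def token_values(cnt):
    """Return a list whose entry k is tval while the token sits at process k."""
    n = len(cnt)
    seen = [None] * n
    tval = 0                          # Start_0: tval' = 0
    seen[n - 1] = tval                # the token is now at process n-1
    for i in range(n - 1, 0, -1):     # Forward_i for i = n-1, ..., 1
        tval += cnt[i]                # tval' = tval + cnt_i
        seen[i - 1] = tval
    return seen


cnt = [2, -1, 0, -1]                  # counters summing to 0: no message pending
seen = token_values(cnt)
core_holds = all(seen[k] == sum(cnt[k + 1:]) for k in range(len(cnt)))
network_empty = (cnt[0] + seen[0] == 0)
```

With frozen counters, `core_holds` is true at every position, and the final test `cnt[0] + seen[0] == 0` performed by Process 0 coincides with the total counter sum being zero.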

Unfortunately, the algorithm cannot maintain this invariant. We therefore have to disjoin the conclusion part (the "core") of the invariant with some weakening conditions such that the algorithm is able to maintain the weaker invariant but Process 0 can still conclude termination from the information stored locally and received by the token.

Awaking a Process

Figure 5: Invalidating the Core Invariant

Let us investigate how the above invariant can be falsified by an action (i.e., the invariant holds in the state before the action but no longer holds in the state after the action). Since all processes that have already forwarded the token are inactive, the only possibility is that such a process receives a message and becomes active again (see Figure 5). This, however, means that there must still be pending messages in the network. Since the invariant still holds before invalidation, a weaker version of the invariant is

forall k in Z_n: sig_k =>

\/

/\

forall j in Z_n: j > k => ~act_j
tval = sum_{j in Z_n, j > k}cnt_j

0 < tval+sum_{j in Z_n, j <= k}cnt_j

If Process 0 receives the token (i.e., sig₀ holds), it can still conclude termination from ~act₀ and cnt₀+tval = 0.

Process Marks

Is it still possible for an action to falsify the weaker invariant? Clearly it can be invalidated if a process that has not yet forwarded the token receives a message, which causes the summation term in the above condition to drop by one (see Figure 6). A simple way to record whether such an incident has taken place is to associate with each process a marker that is set on message receipt. We therefore introduce

mark in Z_n->B

which is initialized to false in every position and which is set to true when a message is received:

Init(...) :<=>

/\

...
forall i in Z_n: ~mark_i

Receive_{i, j}(...) :<=>

/\

...
mark' = mark[j |-> true],

Then we can weaken the invariant further to

forall k in Z_n: sig_k =>

\/

/\

forall j in Z_n: j > k => ~act_j
tval = sum_{j in Z_n, j > k}cnt_j

0 < tval+sum_{j in Z_n, j <= k}cnt_j
exists j in Z_n: j <= k/\ mark_j

If Process 0 receives the token, i.e., sig₀ holds, it can still conclude termination from ~act₀, cnt₀+tval = 0, and ~mark₀.

Figure 6: Process Markers

Token Marker

Clearly this weakened invariant is still falsified when the marked process with the least index forwards the token (see Figure 7). Therefore the token itself has to record the information when it passes a marked process. We introduce a token marker

tmark in B

which is initialized by Process 0 and forwarded by each process as

Start_i(...) :<=>

/\

...
~tmark'

Forward_i(...) :<=>

/\

...
tmark' = tmark\/ mark_i

Then we can weaken the invariant further to

forall k in Z_n: sig_k =>

\/

/\

forall j in Z_n: j > k => ~act_j
tval = sum_{j in Z_n, j > k}cnt_j

0 < tval+sum_{j in Z_n, j <= k}cnt_j
exists j in Z_n: j <= k/\ mark_j
tmark

This invariant cannot be falsified any more by the activity of any process.
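The final invariant translates directly into a Python predicate over a state, which may help in reading its four disjuncts. The function name `invariant` and the representation of the state as plain lists and scalars are assumptions of this sketch; the example states are constructed by hand and are not taken from the paper.

```python
def invariant(act, cnt, sig, mark, tval, tmark):
    """Check the weakened invariant for every position k that holds the token."""
    n = len(act)
    for k in range(n):
        if not sig[k]:
            continue
        # Core: everyone behind the token is inactive and tval is their sum.
        core = (not any(act[k + 1:])) and tval == sum(cnt[k + 1:])
        # Weakening 1: messages are still pending in the network.
        pending = 0 < tval + sum(cnt[:k + 1])
        # Weakening 2: some process ahead of (or at) the token is marked.
        marked = any(mark[:k + 1])
        # Weakening 3: the token itself is marked.
        if not (core or pending or marked or tmark):
            return False
    return True


# Token at process 1; one message from process 0 to process 2 is in transit.
before = dict(act=[False, False, False], cnt=[1, 0, 0],
              sig=[False, True, False], mark=[False, False, False],
              tval=0, tmark=False)
# Process 2 receives it and wakes up: the core fails, "pending" still holds.
after = dict(act=[False, False, True], cnt=[1, 0, -1],
             sig=[False, True, False], mark=[False, False, True],
             tval=0, tmark=False)
# An active process behind the token with no justification violates it.
bad = dict(act=[False, False, True], cnt=[0, 0, 0],
           sig=[False, True, False], mark=[False, False, False],
           tval=0, tmark=False)
```

Here `invariant(**before)` and `invariant(**after)` both hold, the latter only because of the pending-message disjunct, whereas `invariant(**bad)` does not.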

Process 0 can then conclude termination from sig₀, ~act₀, cnt₀+tval = 0, ~mark₀, and ~tmark. By definition of tmark' and tval', this can be stated as

Forward_i(...) :<=>

/\

...
term' <=> if i != 0 then term else (~tmark'/\ tval' = 0).

Figure 7: Token Marker

Resetting Process Markers

If Process 0 cannot yet conclude termination after a termination detection round, it may later start another round. Without resetting the process markers, however, such an attempt is doomed to fail. Fortunately, after it has marked the token, the process marker has fulfilled its task and can be reset:

Forward_i(...) :<=>

/\

...
mark' = mark[i |-> false]

Since this action takes place simultaneously with token transmission (i.e., ~sig'_i holds), it cannot falsify the invariant.

If the system terminates during a termination detection round, Process 0 may not yet be able to conclude termination at the end of that round. After that round, however, no process is active, no messages remain in the network, and the sum of the message counters equals 0. Some processes may still be marked, so the next termination detection round may fail, too. After that second round, however, all processes are unmarked, and the following round will succeed.

Algorithm Specification

The complete algorithm is compiled in Specification 2. The existential quantifier

var v in T: ...

introduces a local program variable v in a formula, just as (exists v in T: ...) introduces a local mathematical variable v. The crucial difference between the two kinds is that the program "variable" v may have different values in different states of a program behavior, while the mathematical "constant" v has the same value in all states.

Specification 2: The Termination Detection Algorithm

SystemT_{n in N}(act in Z_n->B, chan in Z_n x Z_n->Seq(MSG), term in B) :<=>
(a system that may detect its termination)
let

Init(in act, in chan, in term, in run, in cnt, in sig, in mark) :<=>

/\
- ...(as in the basic system)
- ~term/\ ~run
- forall i in Z_n: cnt_i = 0/\ ~sig_i/\ ~mark_i,
Send_{i, j}(in a, io chan, io cnt) :<=>

/\
- ...(as in the basic system)
- cnt' = cnt[i |-> cnt_i+1],
Receive_{i, j}(io act, io chan, io cnt, io mark) :<=>

/\
- ...(as in the basic system)
- cnt' = cnt[j |-> cnt_j-1]
- mark' = mark[j |-> true],
Inactivate_i(io act) :<=> ...(as in the basic system)
Start_i(io run, io sig, io mark, out tval, out tmark) :<=>
(Process 0 starts termination detection)
/\
- i = 0/\ ~run
- run'
- sig' = sig[i-_n1 |-> true]
- mark' = mark[i |-> false]
- tval' = 0
- ~tmark',
Forward_i(in a, in c, io sig, io mark, io tval, io tmark, io term) :<=> (process i gets token, forwards it, or detects termination)

/\
- ~a/\ sig_i
- run' = if i != 0 then run else false
- sig' = if i != 0 then sig[i |-> false, i-_n1 |-> true] else sig[i |-> false]
- mark' = mark[i |-> false]
- tval' = tval+c
- tmark' = tmark\/ mark_i
- term' <=> if i != 0 then term else (~tmark'/\ tval' = 0),
Action_{n, i}(io act, io chan, io term, io run, io cnt, io sig, io mark, io tval, io tmark) :<=>

\/
- exists j in Z_n: Send_{i, j}(in act_i, io chan, io cnt)
- exists j in Z_n: Receive_{i, j}(io act, io chan, io cnt, io mark)
- Inactivate_i(io act)
- Start_i(io run, io sig, io mark, out tval, out tmark)
- Forward_i(in act_i, in cnt_i, io sig, io mark, io tval, io tmark, io term):

var
(internal variables of the algorithm)

run in B, cnt in Z_n->Z, sig in Z_n->B, mark in Z_n->B,
tval in Z, tmark in B:

Init(in act, in chan, in term, in run, in cnt, in sig, in mark)
[][exists i in Z_n: Action_{n, i}(io act, io chan, io term, io run, io cnt, io sig, io mark, io tval, io tmark)]
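As an informal cross-check of Specification 2, the actions can be executed under a random scheduler in Python. This is a sketch, not the author's implementation, and it rests on several assumptions about the elided basic system: all processes start active with empty channels, only an active process may send, receiving a message activates the receiver, and Inactivate_i simply clears act_i. The function name `run_once`, the random scheduler, and the `send_budget` bound (added only so that the run eventually quiesces) are likewise inventions of this example.

```python
import random
from collections import deque


def run_once(n=4, seed=0, max_steps=100_000, send_budget=20):
    """Run until term is set; return the step count, or None if never set."""
    rng = random.Random(seed)
    act = [True] * n                                   # assumed initial state
    chan = {(i, j): deque() for i in range(n) for j in range(n)}
    cnt, sig, mark = [0] * n, [False] * n, [False] * n
    run = term = tmark = False
    tval = 0
    kinds = ["send", "receive", "inactivate", "start", "forward"]
    for step in range(max_steps):
        i, j = rng.randrange(n), rng.randrange(n)
        kind = rng.choice(kinds)
        if kind == "send" and act[i] and send_budget > 0:
            chan[(i, j)].append("m")
            cnt[i] += 1
            send_budget -= 1
        elif kind == "receive" and chan[(i, j)]:
            chan[(i, j)].popleft()
            act[j] = True                              # assumed basic system
            cnt[j] -= 1
            mark[j] = True
        elif kind == "inactivate" and act[i]:
            act[i] = False
        elif kind == "start" and i == 0 and not run:
            run = True
            sig[(i - 1) % n] = True
            mark[i] = False
            tval, tmark = 0, False
        elif kind == "forward" and not act[i] and sig[i]:
            tval += cnt[i]                             # tval' = tval + cnt_i
            tmark = tmark or mark[i]                   # tmark' = tmark \/ mark_i
            mark[i] = False
            sig[i] = False
            if i != 0:
                sig[(i - 1) % n] = True
            else:
                run = False
                term = (not tmark) and tval == 0
        if term:
            # Safety: term may only be set in a truly terminated state.
            assert not any(act) and not any(chan.values())
            return step
    return None
```

A call such as `run_once(seed=1)` raises an `AssertionError` if termination were ever announced while a process is active or a message is pending; such a test can only expose errors on the sampled executions and is no substitute for a correctness proof.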

Maintainer: Wolfgang Schreiner
Last Modification: August 20, 1998