Distributed Snapshots

RISC-Linz logo


We are going to design and to implement a distributed program that simulates the behavior of a banking system:

Let the network value be the sum of all money stored in some node or contained in some message of the network; by above operation this value remains constant but is unknown to every node.

The program is to determine the network value in every node without interrupting normal operation; when the value is known to every node, execution may terminate. An assertion should check whether the result is correct.


You should see a button labelled "Press Me" embedded below. If you do not see this button but the string "Please enable Java!" instead, you must enable Java on your Web browser and then reload thiis page.

Please enable Java!

Initially all nodes in the network are in mode RUNNING; when you press the "Run" button, the network starts execution managing the deposit and exchanging messages as described above. The current deposit of each node and the values of the messages exchanged between nodes may be determined by moving the mouse pointer over a node respectively channel.

Taking the Snapshot

At some random time, node 0 triggers the request to determine the value of the network. At this time, the execution of the network is interrupted, which gives you the possibility to examine the switch of the node's mode from RUNNING to SNAPPING.

We implement the solution of the problem using the Chandy-Lamport Algorithm for determining consistent global snapshots of a network 1. In our program, this algorithm is applied as follows:

The following figure illustrates the basic idea of the algorithm: the stream of messages passing through a channel is split by the snap message into two parts; a node's snapValue eventually represents the value of the node after having processed all messages within the partition bounded by the snap messages of the incoming and outgoing channels. Since every message is contained in exactly one such partition, the sum of all snapValues represents the total network value.

Distributed Snapshots: Basic Idea

If you now press the "Run" button, the network continues execution until node 0 has determined its final snapValue. The node switches then from mode SNAPPING to mode BROADCASTING and you can determine the node's internal state.

Please note that at this time the other nodes may have not yet determined their corresponding values, some of them may be even still in the mode RUNNING (i.e. they have not yet been informed about taking the snapshot). Please take a look at the different node states.

Broadcasting the Snapshot

The program then continues to determine the total value of the network:

The program thus ultimately terminates in a state where every node holds the same totalValue which represents the value of the network.

If you press "Run" again, the network will continue execution until every node has determined totalValue (which will take place at a network time of about 350).

After the program has finally terminated, you may take a look at the totalValue of every node and should find that they agree. The program implements an assertion that checks in every node, whether totalValue really equals the sum of all current deposit and message values of the network. If the program does not show an "Assertion failed" message in the bottom line, also these values agree.

You may reset the program and run it again with another random network value and different execution behavior.

Source Code

This is the full source code of above program.

Correctness Proof

This is a section of the DAJ report.

Maintained by: Wolfgang Schreiner
Last Modification: November 13, 1997

[Up] [RISC-Linz] [University] [Search]