Joint work with Michael
Norrish, Andrei Serjantov,
and Keith Wansbrough.
Papers describing the work are available here.
Network programming is notoriously hard to understand: one has to deal
with a variety of protocols (IP, ICMP, UDP, TCP etc), concurrency,
packet loss, host failure, timeouts, the complex <em>sockets</em>
interface to the protocols, and subtle portability issues. Moreover,
the behavioural properties of operating systems and the network are
not well documented.
A few of these issues have been addressed in the process calculus and
distributed algorithm communities, but there remains a wide gulf
between what has been captured in semantic models and what is required
for a precise understanding of the behaviour of practical distributed
programs that use these protocols.
In this work we demonstrate (in a preliminary way) that the gulf
can be bridged. We present a semantic model for distributed programs
that use the standard sockets interface; it covers message loss, host
failure and temporary disconnection, and supports reasoning about
distributed infrastructure. We consider interaction via the UDP and
To do this, it has been necessary to: construct an
experimentally-validated post-hoc specification of the UDP/ICMP
sockets interface; develop a timed operational semantics with threads,
as such programs are typically multithreaded and depend on timeouts;
model the behaviour of partial systems, making explicit the
interactions that the infrastructure offers to applications; integrate
the above with semantics for an executable fragment of a programming
language (OCaml) with OS library primitives; and use tool support to
manage complexity, mechanizing the model with the HOL theorem prover.
We illustrate the whole with some simple network programs, including a
module that provides naive heartbeat failure detection.