Replay, recovery, replication, and snapshots of nondeterministic concurrent programs

Haim Gaifman, Michael J. Maher, Ehud Shapiro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

The problem of replaying computations of nondeterministic concurrent programs arises in contexts such as debugging and recovery. We investigate the problem for an abstract model of concurrency, which generalizes dataflow networks, processors with shared variables, and logic programming models of concurrency. We say that nondeterminism is visible if the state is determined, up to some (appropriately defined) notion of equivalence, by the external behavior. We show that if nondeterminism is visible then replay is achievable using a one-step lookahead sequential simulation algorithm. If the program has an additional monotonicity property called stability then recovery is possible without simulating the original computation, by restarting the program from a certain easily constructed state. Also, for stable programs with visible nondeterminism, a process composed of identical parallel processes has the same external behavior as each of its components. Hence high crash-failure resilience is achievable by simple process replication. For such programs there is also an easy solution to the asynchronous snapshot problem. Stability holds for certain concurrent logic/constraint programming languages. We describe an efficient method for transforming a given stable concurrent logic/constraint program to an equivalent one with visible nondeterminism. The transformation has acceptable execution overhead; thus it could be employed in a practical realization of the proposed methods.
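
As a rough illustration of the replay idea in the abstract, the following toy sketch resolves each nondeterministic choice by looking one step ahead and keeping the successor whose emitted event matches the recorded external behavior. It is not the paper's algorithm or model: the `steps` interface, the `replay` function, and the `counter_steps` example are hypothetical names introduced here, and "visible nondeterminism" is simplified to mean that an emitted event identifies the successor state exactly, rather than up to an equivalence.

```python
from typing import Callable, Hashable, Iterable, List, Tuple

State = Hashable
Event = Hashable
# Hypothetical interface: steps(state) lists every enabled (event, next_state) pair.
Steps = Callable[[State], Iterable[Tuple[Event, State]]]


def replay(initial: State, steps: Steps, trace: List[Event]) -> List[State]:
    """Re-run a recorded trace, resolving each choice by one-step lookahead."""
    state = initial
    visited = [state]
    for event in trace:
        # Look one step ahead: keep only successors that emit the recorded event.
        matches = [nxt for ev, nxt in steps(state) if ev == event]
        if not matches:
            raise ValueError(f"trace not reproducible at event {event!r}")
        if len(set(matches)) > 1:
            # Toy visibility assumption violated: the event does not fix the state.
            raise ValueError(f"nondeterminism not visible at event {event!r}")
        state = matches[0]
        visited.append(state)
    return visited


# Toy program (an assumption for illustration): a counter that
# nondeterministically adds 1 or 2 and emits its new value.
def counter_steps(n: int) -> List[Tuple[int, int]]:
    return [(n + 1, n + 1), (n + 2, n + 2)]
```

For example, `replay(0, counter_steps, [1, 3, 4])` walks the states 0, 1, 3, 4, because each recorded event rules out all but one enabled transition.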

Original language: English
Title of host publication: Proceedings of the Annual ACM Symposium on Principles of Distributed Computing
Publisher: Association for Computing Machinery
Pages: 241-255
Number of pages: 15
ISBN (Print): 0897914392
State: Published - 1 Jul 1991
Event: 10th Annual ACM Symposium on Principles of Distributed Computing, PODC 1991 - Montreal, Canada
Duration: 19 Aug 1991 to 21 Aug 1991

Publication series

Name: Proceedings of the Annual ACM Symposium on Principles of Distributed Computing

Conference

Conference: 10th Annual ACM Symposium on Principles of Distributed Computing, PODC 1991
Country/Territory: Canada
City: Montreal
Period: 19/08/91 to 21/08/91

Bibliographical note

Publisher Copyright:
© 1991 ACM.
