A synchronous checkpointing protocol for mobile distributed systems: probabilistic approach

doi:10.1504/IJICS.2007.013957

Journal Article

Lalit Kumar Awasthi, +1 more

- 01 Jun 2007 -

International Journal of Information and...

- Vol. 1, Iss: 3, pp 298-314

TLDR

A minimum process coordinated checkpointing algorithm where the number of useless checkpoints and blocking are reduced using a probabilistic approach that computes an interacting set of processes on checkpoint initiation.

Abstract:

Minimum-process coordinated checkpointing minimises the number of processes that must checkpoint for an initiation, but it may require blocking of processes, extra synchronisation messages or useless checkpoints. We propose a minimum-process coordinated checkpointing algorithm in which the number of useless checkpoints and the amount of blocking are reduced using a probabilistic approach that computes an interacting set of processes on checkpoint initiation. A process checkpoints if the probability that it will get a checkpoint request in the current initiation is high. A few processes may be blocked, but they can continue their normal computation and may send messages. We also modified the methodology for maintaining exact dependencies.
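The two ideas in the abstract can be illustrated with a minimal sketch; it is not the authors' algorithm, which the abstract does not specify. It assumes that the interacting set is the transitive closure of direct dependencies, where a process depends on every process from which it has received a message since its last checkpoint. The names `minimum_checkpoint_set`, `should_checkpoint_early`, the `depends_on` map and the 0.5 threshold are all hypothetical.

```python
from collections import deque


def minimum_checkpoint_set(initiator, depends_on):
    """Return the processes that must checkpoint with the initiator.

    depends_on[p] is assumed to hold the processes from which p has
    received a message since its last checkpoint (hypothetical layout).
    The result is the transitive closure of those direct dependencies.
    """
    must_checkpoint = {initiator}
    queue = deque([initiator])
    while queue:
        process = queue.popleft()
        for sender in depends_on.get(process, ()):
            if sender not in must_checkpoint:
                must_checkpoint.add(sender)
                queue.append(sender)
    return must_checkpoint


def should_checkpoint_early(request_probability, threshold=0.5):
    """Checkpoint before a request arrives only if one is likely.

    Both the probability estimate and the threshold are assumptions
    made for illustration; the abstract only says "high".
    """
    return request_probability >= threshold


# Illustrative dependency map: P1 received from P2, P2 from P3, P4 from P1.
depends_on = {"P1": {"P2"}, "P2": {"P3"}, "P3": set(), "P4": {"P1"}}
interacting = minimum_checkpoint_set("P1", depends_on)
```

In this example P4 is left out of the set: it received a message from P1, but P1 does not depend on it, so an initiation by P1 does not force P4 to checkpoint.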

Citations

IJCSI Publicity Board 2010

Borislav D Dimitrov, +2 more

TL;DR: This paper presents a systematic analysis of a variety of different ad hoc network topologies in terms of node placement, node mobility and routing protocols through several simulated scenarios.

Journal Article

A low-cost hybrid coordinated checkpointing protocol for mobile distributed systems

Parveen Kumar

- 01 Jan 2008 -

Mobile Information Systems

TL;DR: A blocking algorithm is designed, where no useless checkpoints are taken and an effort has been made to optimize the blocking of processes for minimum-process checkpointing, and selective messages at the receiver end are delayed.

Journal Article

Soft-Checkpointing Based Hybrid Synchronous Checkpointing Protocol for Mobile Distributed Systems

Parveen Kumar, +1 more

- 01 Jan 2011 -

International Journal of Distributed Sys...

TL;DR: In order to balance the checkpointing overhead and the loss of computation on recovery, the authors propose a hybrid checkpointing algorithm, wherein an all-process coordinated checkpoint is taken after the execution of minimum-process coordinated checkpointing algorithms for a fixed number of times.

Posted Content

Minimum Process Coordinated Checkpointing Scheme for Ad Hoc Networks

Ruchi Tuli, +1 more

- 09 Nov 2011 -

arXiv: Distributed, Parallel, and Cluste...

TL;DR: Performance analysis shows that the proposed minimum process coordinated checkpointing scheme for cluster based ad hoc routing protocols outperforms the existing related works and is a novel idea in the field.

Journal Article

Evaluating Overheads of Integrated Multilevel Checkpointing Algorithms in Cloud Computing Environment

Dilbag Singh, +2 more

- 01 Jun 2012 -

International Journal of Computer Networ...

TL;DR: This paper presents a methodology for providing high availability to cloud clients by integrating a checkpointing feature with load balancing algorithms and taking multilevel checkpoints to decrease checkpointing overheads.

References

Book Chapter

Time, clocks, and the ordering of events in a distributed system

Leslie Lamport

- 04 Oct 2019 -

Concurrency and Computation: Practice an...

TL;DR: In this paper, the concept of one event happening before another in a distributed system is examined, and a distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events.

Journal Article

Distributed snapshots: determining global states of distributed systems

K. Mani Chandy, +1 more

- 01 Feb 1985 -

ACM Transactions on Computer Systems

TL;DR: An algorithm by which a process in a distributed system determines a global state of the system during a computation, which helps to solve an important class of problems: stable property detection.

Journal Article

A survey of rollback-recovery protocols in message-passing systems

Elmootazbellah Nabil Elnozahy, +3 more

- 01 Sep 2002 -

ACM Computing Surveys

TL;DR: This survey covers rollback-recovery techniques that do not require special language constructs and distinguishes between checkpoint-based protocols, which rely solely on checkpointing for system state restoration, and log-based protocols, which combine checkpointing with logging of nondeterministic events.

Journal Article

Checkpointing and Rollback-Recovery for Distributed Systems

Richard Koo, +1 more

- 01 Jan 1987 -

IEEE Transactions on Software Engineerin...

TL;DR: In this article, the authors consider the problem of bringing a distributed system to a consistent state after transient failures, and propose a distributed algorithm to create consistent checkpoints, as well as a rollback-recovery algorithm to recover the system from transient failures.

Proceedings Article

The performance of consistent checkpointing

Elmootazbellah Nabil Elnozahy, +2 more

TL;DR: It is argued that these measurements show that consistent checkpointing is an efficient way to provide fault tolerance for long-running distributed applications.
