Design principles of operating systems for large scale multicomputers

Amnon Barak, Yoram Kornatzky

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

6 Scopus citations

Abstract

Future multicomputer systems are expected to consist of thousands of interconnected computers. To simplify the usage of these systems, multicomputer operating systems must be developed to integrate a cluster of computers into a unified and coherent environment. Using existing multicomputer operating systems is inappropriate as many commonly used techniques get clogged and lead to congestion once the system is enlarged over a certain size. This paper deals with the various issues involved with designing an operating system for a large scale multicomputer. We identify the difficulties of using existing operating systems in large multicomputer configurations. Then, based on insight gained in the design of several algorithms, we present eight principles which should serve as guidelines for the designer of such systems. These principles include symmetry, customer-server protocols, and partiality. Another component of our approach is the use of randomness in the system's control. We present probabilistic algorithms for information scattering and load estimation. Tolerating node failures, and garbage collection due to node failures, are part of a distributed operating system's routine operations. We present a robust algorithm for locating processes, and an efficient algorithm for garbage collection in a large scale system, which are in line with our principles.

Original language: English
Title of host publication: Experiences with Distributed Systems - International Workshop, Proceedings
Editors: Jurgen Nehmer
Publisher: Springer Verlag
Pages: 104-123
Number of pages: 20
ISBN (Print): 9783540193333
State: Published - 1988
Event: International workshop on Experiences with Distributed Systems, 1987 - Kaiserslautern, Germany
Duration: 28 Sep 1987 to 30 Sep 1987

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 309 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International workshop on Experiences with Distributed Systems, 1987
Country/Territory: Germany
City: Kaiserslautern
Period: 28/09/87 to 30/09/87

Bibliographical note

Publisher Copyright:
© 1988, Springer-Verlag.
