Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/10194
Title: Dependability Mechanisms for Desktop Grids
Authors: Domingues, Patrício Rodrigues 
Advisor: Silva, Luís Moura e
Keywords: Fault tolerance; Desktop grids; Volunteer computing; Checkpointing; Scheduling; Sabotage-tolerance
Issue Date: 11-May-2009
Citation: DOMINGUES, Patrício Rodrigues - Dependability Mechanisms for Desktop Grids [online]. Coimbra : [s.n.], 2009. Doctoral thesis. Available at: <http://hdl.handle.net/10316/10194>
Abstract: It is well known that most of the computing power spread over the Internet goes unused, with CPU and other resources sitting idle most of the time: on average, less than 5% of the CPU time is effectively used. Desktop grids are software infrastructures that aim to exploit this otherwise idle processing power, making it available to users who need computational resources to run long-running applications. The outcome of some efforts to harness idle machines can be seen in public projects such as SETI@home and Folding@home, which boast impressive performance figures, on the order of several hundred TFLOPS each. At the same time, many institutions, both academic and corporate, run their own desktop grid platforms. However, while desktop grids provide free computing power, they must deal with important issues such as fault tolerance and security, two of the main problems that hinder the widespread use of desktop grid computing. In this thesis, we aim to exploit a set of fault tolerance techniques, such as checkpointing and redundant execution, to promote faster turnaround times. We start with an experimental study in which we analyze the availability of the computing resources of an academic institution. We then focus on the benefits of sharing checkpoints in both institutional and wide-scale environments. We also explore hybrid schemes, where the traditional centralized desktop grid organization is complemented with peer-to-peer resources. Another major issue for desktop grids is the level of trust that can be placed in the volunteered hosts that carry out the executions. We propose and explore several mechanisms aimed at reducing the waste of computational resources needed to detect incorrect computations. For this purpose, we detail a checkpoint-based scheme for early detection of errors. 
We also propose and analyze an invitation-based strategy coupled with a credit-reward scheme, to allow the enrollment and filtering of more trustworthy and more motivated resource donors. To summarize, we propose and study several fault tolerance methodologies oriented toward a more efficient usage of resources, resorting to techniques such as checkpointing, replication and sabotage tolerance to speed up executions carried out on desktop grid resources and to make them more reliable. Techniques like these will be of utmost importance for the wider deployment of applications over desktop grids.
Description: Doctoral thesis in Informatics Engineering presented to the Faculdade de Ciências e Tecnologia de Coimbra
URI: https://hdl.handle.net/10316/10194
Rights: embargoedAccess
Appears in Collections: FCTUC Eng.Informática - Teses de Doutoramento

Files in This Item:
File: PatricioDomingues_PhD_thesis---WEB.pdf
Size: 4.09 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.