Showing 1 - 10 of 17 for search: '"Nematollah Bidokhti"'
Authors:
Nematollah Bidokhti, Caitie McCaffrey, Gary Grider, Andree Jacobson, Tim Emami, Deepthi Srinivasan, Riza O. Suminto, Peter Alvaro, Casey Golliher, Robert Ross, Biswaranjan Panda, Kevin Harms, Xing Lin, Robert Ricci, H. Birali Runesha, Russell Sears, Huaicheng Li, Haryadi S. Gunawi, Andrew D. Baptist, Kirk Webb, Weiguang Sheng, Mingzhe Hao, Swaminathan Sundararaman, Parks Fields
Published in:
ACM Transactions on Storage. 14:1-26
Fail-slow hardware is an under-studied failure mode. We present a study of 114 reports of fail-slow hardware incidents, collected from large-scale cluster deployments in 14 institutions. We show that all hardware types such as disk, SSD, CPU, memory,
Published in:
ISSRE
In order to plan for failure recovery, the designers of cloud systems need to understand how their system can potentially fail. Unfortunately, analyzing the failure behavior of such systems can be very difficult and time-consuming, due to the large v
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::927594d4c830dbe742fc7ee20e3b4b65
http://arxiv.org/abs/1908.11640
Published in:
ESEC/SIGSOFT FSE
Cloud management systems provide abstractions and APIs for programmatically configuring cloud infrastructures. Unfortunately, residual software bugs in these systems can potentially lead to high-severity failures, such as prolonged outages and data l
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::86f4987dea2dbc191b371e71192579ee
https://hdl.handle.net/11588/766427
Author:
Nematollah Bidokhti
Published in:
2016 Annual Reliability and Maintainability Symposium (RAMS).
Reliability Demonstration Test (RDT) is an essential part of the overall product reliability development life cycle validation and it is required by all customers. This test allows equipment manufacturers to demonstrate that the product meets its int
Published in:
ATS
Increased functional density with shrinking technology could result in escalating noise-induced failures in the field. Further, the low correlation between system level functional test and production test is making it difficult to better screen parts
Published in:
ISSRE Workshops
The fact that many software systems are still plagued by critical software bugs has led researchers to study methodologies that introduce bugs into software and recover from them. Since such methodologies need to be well-tested, it necessit
Published in:
IOLTS
Single Event Effects negatively impact the reliability of complex electronic devices and systems. System architects, reliability engineers and digital designers have to invest considerable resources to successfully meet the reliability goals set by t
Published in:
2014 Reliability and Maintainability Symposium.
This paper discusses the challenges, strategy and mitigation methods to prolong the operation of a product even after it is either about to fail or has experienced actual failures. The focus is mostly on two component types: ASICs and memor
Author:
Nematollah Bidokhti
Published in:
2013 Proceedings Annual Reliability and Maintainability Symposium (RAMS).
This paper discusses the challenges and solutions for reliability allocation analysis. The sophistication of today's designs requires a re-evaluation of reliability allocation methodologies to account for complex system designs where a single board c