Search results - "Soila Kavulya"

Performance troubleshooting in data centers

Authors: Liting Hu, Soila Kavulya, Mike Kasick, Jiaqi Tan, Karsten Schwan, Priya Narasimhan, Mahendra Kutare, Chengwei Wang, Rajeev Gandhi

Published in: ACM SIGOPS Operating Systems Review. 47:50-62

In the emerging cloud computing era, enterprise data centers host a plethora of web services and applications, including those for e-Commerce, distributed multimedia, and social networks, which jointly serve many aspects of our daily lives and busin

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::ec0d8707feb1559a647471e23e022399
https://doi.org/10.1145/2553070.2553079

Ganesha

Authors: Xinghao Pan, Soila Kavulya, Jiaqi Tan, Priya Narasimhan, Rajeev Gandhi

Published in: ACM SIGMETRICS Performance Evaluation Review. 37:8-13

Ganesha aims to diagnose faults transparently (in a black-box manner) in MapReduce systems, by analyzing OS-level metrics. Ganesha's approach is based on peer-symmetry under fault-free conditions, and can diagnose faults that manifest asymmetrically

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::4007b5a7cff00ad973b53f731b94f219
https://doi.org/10.1145/1710115.1710118

Diagnostic Fusion for Time-Triggered Automotive Networks

Authors: Kunal Mankodiya, Priya Narasimhan, Thomas E. Fuhrman, Utsav Drolia, Soila Kavulya

Published in: HASE

Modern vehicles with semi-autonomous (driver-assistance systems) and autonomous capabilities require sophisticated on-board and off-board diagnostics for safe operation, and to reduce unnecessary component replacements at the service garage. We prese

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::06066bbea3710dffffa686a030630b43
https://doi.org/10.1109/hase.2012.11

Light-weight black-box failure detection for distributed systems

Authors: Soila Kavulya, Rajeev Gandhi, Jiaqi Tan, Priya Narasimhan

Published in: Proceedings of the 2012 workshop on Management of big data systems.

Detecting failures in distributed systems is challenging, as modern datacenters run a variety of applications. Current techniques for detecting failures often require training, have limited scalability, or have results that are hard to interpret. We

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::a6203f4fcfe11fa42100374dcbd5adc0
https://doi.org/10.1145/2378356.2378360

Draco: Statistical diagnosis of chronic problems in large distributed systems

Authors: Matti Hiltunen, Soila Kavulya, Kaustubh Joshi, Scott Daniels, Priya Narasimhan, Rajeev Gandhi

Published in: DSN

Chronics are recurrent problems that often fly under the radar of operations teams because they do not affect enough users or service invocations to set off alarm thresholds. In contrast with major outages that are rare, often have a single cause, an

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::d95476d50bd7b50ba6f880d92c9fb6ca
https://doi.org/10.1109/dsn.2012.6263927

Failure Diagnosis of Complex Systems

Authors: Kaustubh Joshi, Priya Narasimhan, Felicita Di Giandomenico, Soila Kavulya

Published in: Resilience Assessment and Evaluation of Computing Systems, edited by Wolter K., Avritzer A., Vieira M., van Moorsel A., pp. 239–261. Berlin: Springer-Verlag, 2012. ISBN: 9783642290312

Failure diagnosis is the process of identifying the causes of impairment in a system’s function based on observable symptoms, i.e., determining which fault led to an observed failure. Since multiple faults can often lead to very similar symptoms, f

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ca2729c84b04f277680a4ffbf80965f1
https://openportal.isti.cnr.it/doc?id=people______::9a5bfb5bed5ad75f774fd83457b72700

Understanding and improving the diagnostic workflow of MapReduce users

Authors: Ben Gotow, Soila Kavulya, Mark Shuster, Jason Campbell, Jiaqi Tan, Priya Narasimhan, Sriram Ramasubramanian, Arun B. Ganesan, James Mulholland

Published in: Proceedings of the 5th ACM Symposium on Computer Human Interaction for Management of Information Technology.

New abstractions are simplifying the programming of large clusters, but diagnosis nonetheless gets more and more challenging as cluster sizes grow: Debugging information increases linearly with cluster size, and the count of intercomponent relationshi

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::ab9028c1f8049e42f81e80172223b9e5
https://doi.org/10.1145/2076444.2076445

Practical experiences with chronics discovery in large telecommunications systems

Authors: Kaustubh Joshi, Soila Kavulya, Scott Daniels, Matti A. Hiltunen, Rajeev Gandhi, Priya Narasimhan

Published in: Managing Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques.

Chronics are recurrent problems that fly under the radar of operations teams because they do not perturb the system enough to set off alarms or violate service-level objectives. The discovery and diagnosis of never-before-seen chronics poses new chal

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::634b38a0ed1e96c1b6f2d31508d535ce
https://doi.org/10.1145/2038633.2038640

Visual, Log-Based Causal Tracing for Performance Debugging of MapReduce Systems

Authors: Soila Kavulya, Rajeev Gandhi, Jiaqi Tan, Priya Narasimhan

Published in: ICDCS

The distributed nature and large scale of MapReduce programs and systems pose two challenges in using existing profiling and debugging tools to understand MapReduce programs. Existing tools produce too much information because of the large scale of

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::df0c589e7bc82b6d7edff0359cad1950
https://doi.org/10.1109/icdcs.2010.63

Kahuna: Problem diagnosis for Mapreduce-based cloud computing environments

Authors: Rajeev Gandhi, Xinghao Pan, Eugene Marinelli, Soila Kavulya, Jiaqi Tan, Priya Narasimhan

Published in: NOMS

We present Kahuna, an approach that aims to diagnose performance problems in MapReduce systems. Central to Kahuna's approach is our insight on peer-similarity, that nodes behave alike in the absence of performance problems, and that a node that behav

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::11f8e0b83c0cc06ad5c1c4c1450043c7
https://doi.org/10.1109/noms.2010.5488446
