Description: |
In this paper, we propose a management framework for protecting large computer systems against operator mistakes. By detecting and confining mistakes to isolated portions of the managed system, our framework facilitates correct operation even by inexperienced operators. We built a prototype management system called Barricade based on our framework. We evaluate Barricade by deploying it for two different systems, a prototype Internet service and an enterprise computer infrastructure, and conducting experiments with 20 volunteer operators. Our results are promising: for example, we show that Barricade can detect and contain 39 of the 43 mistakes that we observed in 49 live operator experiments performed with our Internet service. |