7 Responses

  1. Jeramiah Dooley

    There are two interesting bits here:

    1) You aren't sure what actually happened and don't know whether a full postmortem will be made available.
    2) Your SLA was violated, but will that make you whole?

    It's interesting to see how differently service providers handle these things. So many times I've heard an outsourced IT org say "We are paying for the SLA" while at the same time the provider is saying "Make the SLA 100% for marketing purposes, we're never on the hook for actual damages." In your example of the company losing $10M an hour, I promise that if a third-party data center or service provider was involved, the SLA didn't compensate them for that loss!

    Postmortems are the best evidence of transparency between a provider and a customer. Before I sign up, I want to see where they have published all of the communication related to previous outages. I want to see how they communicate with their customers. I want to *know* that if there's an outage I'll hear the good, the bad and the ugly of the event, because that's what I need in order to gauge risk and decide whether I want to stay.

    Having been on the service provider side, I'd say the idea that customers buy SLAs is the single best piece of misdirection ever marketed.

  2. » Heads Up Alert: vSphere 5.5 U1 NFS Random Disconnection Bug! Long White Virtual Clouds

    […] It is a coincidence that this comes right on the coattails of my previous article Hardware Fails, Software Has Bugs and People Make Mistakes – Usually You Get All At Once! During the disconnects VMs will appear frozen and the NFS datastores may be greyed out. […]

  3. Brett Weaver

    "hopefully a full root cause analysis is made available to customers, along with a preventative action plan that gives customers confidence that steps are being taken to reduce the risk of another similar incident."
    In my experience, infrastructure is the best area in IT for adopting this approach. I am quite sure you will find things improve.
    <RANT>Unfortunately, software projects crash and burn, or succeed, and no one works out why! That's how you get project after project failing and costing a fortune.
    Trying to get clients to run Post Implementation Reviews seems like the hardest thing in the world. What are the universities teaching people in business courses? </RANT>

  4. Luca Dell'Oca

    From a data protection perspective, I've felt SLAs are often just numbers that are good for the service provider's marketing. An SLA is an agreement, and the customer has no way to check whether the SP is actually able to guarantee that value. We agree on an SLA; as a customer I hope the real availability will always be 100%, and if the SP violates the SLA, I at least want some money back.

    Also, your examples show the other problem with SLAs: an SLA is an average value, usually calculated on a yearly basis, while every single outage impacts the business. I'd prefer to discuss RPO and RTO with a service provider (and write them into the contract) rather than an SLA.
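
To put rough numbers on the point about yearly averages, here is a minimal Python sketch. The function names, the 99.9% figure, the single 8-hour outage and the 4-hour RTO are all illustrative assumptions, not values from the article or from any real contract. It shows how one long outage can stay inside a yearly availability target while still exceeding a per-incident recovery time objective.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years (assumption)


def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by an availability percentage over a period."""
    return (100.0 - availability_pct) / 100.0 * period_hours


def meets_yearly_sla(outage_hours, availability_pct):
    """True if the total downtime for the year fits inside the yearly budget."""
    return sum(outage_hours) <= allowed_downtime_hours(availability_pct)


def rto_violations(outage_hours, rto_hours):
    """Return the individual outages that lasted longer than the agreed RTO."""
    return [hours for hours in outage_hours if hours > rto_hours]


# Hypothetical year: one 8-hour outage and nothing else.
outages = [8.0]

print(allowed_downtime_hours(99.9))      # roughly 8.76 hours per year
print(meets_yearly_sla(outages, 99.9))   # the yearly average is still met
print(rto_violations(outages, 4.0))      # but the single event breaks a 4-hour RTO
```

Under these assumptions the provider reports a met SLA for the year, even though the one outage that occurred lasted twice as long as the business said it could tolerate.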

