An old fable says the fox knows many things, but the hedgehog knows one important thing. I came across this saying the other day, and it got me thinking. I hear a lot about "Cloud-first" strategies at seemingly every company, and it occurred to me that companies are moving to the cloud because clouds support 'many things' fairly well. However, HPE has seen a number of repatriations from the cloud back to the Data Center, driven by cost, security, and control of data. The idea of paying only for what you use sounds very appealing, but in the public cloud it does require you to remember to shut the lights off. Many customers are surprised at the end of the month by machines they forgot to shut down; the meter keeps running, so vigilance is in order. For applications that will always be running, or are supposed to always be running, the cheaper solution in the long run is, in most cases, to keep them in your own Data Center. I might add that it is also the better solution for applications running on NonStop, since availability is the one important thing that, like the hedgehog, we know.
Availability, often measured in “nines,” is a critical metric for mission-critical IT systems. “Five nines” (99.999%) availability equates to just over five minutes of downtime per year and is considered the gold standard for industries where downtime can have severe consequences, such as finance, healthcare, and manufacturing.
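To put those figures in concrete terms, here is a minimal sketch (Python, purely illustrative) that converts an availability percentage into the downtime it allows per year:

```python
# Convert an availability level ("nines") into allowed downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

for availability in (0.999, 0.9999, 0.99999):
    downtime_minutes = (1 - availability) * MINUTES_PER_YEAR
    print(f"{availability:.3%} uptime -> ~{downtime_minutes:.1f} minutes of downtime per year")
```

Running this gives roughly 525.6 minutes per year at three nines, 52.6 minutes at four nines, and about 5.3 minutes at five nines, which is where the "just over five minutes" figure comes from.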
Leading public cloud providers (such as AWS, Microsoft Azure, and Google Cloud) typically advertise Service Level Agreements (SLAs) in the range of 99.9% (three nines) to 99.99% (four nines) for most core services. Some specialized services offer higher SLAs, but five nines is rare and often limited to specific configurations or regions. Achieving five nines in public cloud environments is challenging because of the complexity of interconnected infrastructure: servers, storage, networks, inter-data center links, and WANs all introduce potential points of failure. Roughly 70–80% of data center managers report experiencing an outage within a three-year period, and while public cloud reliability has improved, significant outages still occur and can be highly impactful because of the scale and centralization of services. In the public cloud, customers are responsible for architecting their applications for high availability (e.g., using multi-region deployments, redundancy, and failover mechanisms); a rough sketch of that arithmetic follows below. The underlying infrastructure may support high availability, but the realized uptime depends on customer design choices.
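To see why chained infrastructure makes five nines hard, and why the redundancy burden falls on the customer, here is a rough sketch using hypothetical component availabilities (assumed figures for illustration, not published SLA numbers):

```python
# Illustrative availability arithmetic; component figures below are hypothetical.

def serial(*components):
    """Availability when every component must be up (each is a point of failure)."""
    a = 1.0
    for c in components:
        a *= c
    return a

def parallel(a, n):
    """Availability of n redundant copies, assuming independent failures."""
    return 1 - (1 - a) ** n

compute, storage, network, wan = 0.9999, 0.9999, 0.999, 0.999
single_region = serial(compute, storage, network, wan)
# Assumes perfectly independent regions and instant, flawless failover.
two_regions = parallel(single_region, 2)

MINUTES_PER_YEAR = 525_600
print(f"Single-region chain: {single_region:.5f} "
      f"(~{(1 - single_region) * MINUTES_PER_YEAR:.0f} minutes/year down)")
print(f"Two-region redundancy: {two_regions:.7f}")
```

Even though each hypothetical component looks respectable on its own, the chained result lands well below four nines; redundancy pulls it back up only under the optimistic assumptions noted in the comments, and building and operating that redundancy is the customer's job.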
It would appear the public cloud is more expensive in three areas. First, the general cost of running a 24x7x365 application environment. Second, whatever a customer's specific cost of downtime is calculated to be. The numbers show roughly 5 minutes of downtime per year for a NonStop system, whereas studies indicate between about 53 minutes (99.99% uptime) and 8.76 hours (99.9% uptime) in the cloud. Once you know your cost of downtime per minute, you can estimate what running in the cloud will cost you in downtime compared to running on NonStop: multiply that per-minute cost by roughly 48 additional minutes per year at the low end and by roughly 521 additional minutes per year at the high end (a worked example follows below). Finally, as mentioned, the customer, not the cloud provider, is responsible for architecting availability. NonStop has been architecting mission-critical, high-availability systems for over 50 years. Generally, high-availability features added onto an existing system, rather than designed in from the start (as in "built-in availability"), will not compare to a NonStop system designed from the ground up to be 24x7x365. The point being, a customer will spend more on development, support, and testing for a mission-critical application they want to run in the cloud.
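Here is that calculation worked through with an assumed, hypothetical cost-per-minute figure; substitute your own number:

```python
# Worked example of the downtime-cost comparison above.
cost_per_minute = 10_000  # hypothetical USD per minute of outage; plug in your own figure

nonstop_minutes = 5.0  # ~five nines (NonStop)
scenarios = {
    "four nines cloud (99.99%)": 52.6,    # annual downtime minutes
    "three nines cloud (99.9%)": 525.6,
}

for label, minutes in scenarios.items():
    extra_minutes = minutes - nonstop_minutes
    extra_cost = extra_minutes * cost_per_minute
    print(f"{label}: ~{extra_minutes:.0f} extra minutes/year "
          f"-> ${extra_cost:,.0f} in added downtime cost")
```

With the assumed $10,000 per minute, that works out to roughly $480,000 per year of additional downtime cost at four nines and over $5 million at three nines, before adding the extra development, support, and testing spend noted above.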
The public cloud providers do many things well. The question is: do you want 'well,' or do you want NonStop?
