What the Xbox Live Outage Can Teach Us

Silver Lining

By Erik Linask, Group Editorial Director  |  April 06, 2015

What happens when the cloud fails and customers aren’t able to access their resources, applications, data, or services? The cloud is supposed to be the great equalizer, providing scalability, security, and reliability, ensuring that all your apps and services are available whenever you need them. In fact, Ted Brown writes starting on page 22 about why managed Office 365 is a great alternative for SMBs.

But what happens when a provider like Microsoft can’t hold up its end of the deal, and services become unavailable for a period of time? For the six million or so new Xbox owners during the holiday period, the next month may have been a painful experience, beginning with the highly publicized Xbox Live outage starting on Christmas Day. (To be sure, Sony’s PSN experienced a similar outage as a result of a major denial-of-service attack that shut down both networks.)

What it meant was that many features of either system were unavailable, for many customers, for an extended period. Reports indicate varying periods of network downtime for each but, in the greater New York City area, PSN was up within a few days, while Xbox Live was unavailable for nearly a month. I own both an Xbox and a PlayStation console, and know that my children were able to access Xbox Live on two occasions between Christmas and ITEXPO Miami.

While some of the features and games are available offline, there are many that are not – with the Xbox One, some games are completely unplayable without network access, turning the console into an oversized paperweight for kids around the world. To be honest, the PlayStation is a much more offline-friendly system than the Xbox One.

What’s worse, because outages were largely localized to a number of major metro areas, based on downdetector.com, Microsoft, logically, took the opportunity to claim it was not experiencing any network-wide issues and suggested that the problems lay with local Internet access, WiFi, or the console. Downdetector.com claims to be able to display an accurate outage map based on reported issues. Assuming the reported issues are accurate, one can surmise that such a map would indicate lower-than-true levels of outages, not higher, because not every affected user files a report. I can confirm that each day we were unable to access the network, the site showed a significant outage in the region – among others across the US and Europe.

Now, extend a similar scenario to a business using Office 365, Dropbox, or one of the many cloud communications services available – what would happen if service were unavailable for a month? Oddly, Microsoft doesn’t seem to have taken as much heat for its outages as it could have, thanks to some good customer service protocols (and the fact that kids’ emotions are in play, which no parent wants to mess with by throwing out the consoles).

We’ve heard for the past year that security and reliability issues with the cloud have been largely addressed, and that businesses are starting to move their critical apps and data into the cloud. These latest hacks call into question how easy it is for hackers to get through security measures and expose user data, corporate secrets, and other information, and whether both business and consumer technology is evolving faster than the security measures required to thwart such attacks.

The cloud has proven to be a tremendous asset for many, and it will continue as such, but breaches such as these – and even outages that occur after hacks have been stopped – underscore the need not only for a constant focus on security, but also for increased awareness of how well networks and servers will handle a sudden increase in workload, such as the addition of six million users within a few days’ time. That requires testing, validation, and constant application and network performance monitoring and load balancing. The good news is that the emergence of SDN and NFV makes all of these things possible in a virtual environment, so scaling and resource allocation can be done automatically and efficiently.

This won’t be the last major hack we experience. What we can hope, though, is that the damage is limited thanks to high levels of diligence on our providers’ part.




Edited by Maurice Nagle