I thought they had a button in their office that shuts down a server, as a means of testing. Maybe someone brought in their 3-year-old and he just kept hitting it.
They have a "Chaos Monkey" [1] feature that is intended to bring down individual nodes. "Exposing engineers to failures more frequently incentivizes them to build resilient services."
If Chaos Monkey had been responsible for setting off a global outage, I could imagine business leaders getting cold feet about using a tool like this. At traditional companies, anyway, they'd never have seen the benefit of it, and having heard only about the costs, they'd probably be livid that something like this had caused a widespread outage.