By now nearly everyone has seen the video of the San Diego 4th of July fireworks debacle. Here’s one version (taken from about where I was watching).
I took two car-loads of friends and family downtown through the traffic, paid $12 per car to park, waded through the crowds (over 500,000 strong) to get a good spot to watch from, only to have to take everyone home disappointed.
The show was supposed to last 18 minutes and be “one of the most logistically complex displays in the world,” according to Garden State Fireworks, the New Jersey company that produced the show. In business since 1890, Garden State produced hundreds of other shows across the country on July 4. It has staged pyrotechnic displays for such events as the 1988 Winter Olympics, the Statue of Liberty Bicentennial Celebration, Macy’s New York July Fourth Celebration on the Hudson and the Washington, D.C. July Fourth Celebration. Only ours failed.
“Everyone’s seen their computers crash, everyone’s seen their cell phones drop calls,” August Santore, Garden State’s owner, told NJ.com. “The only way to correct anything that’s not working properly, you have to live it. In this particular case, it’s something that was unknown. It’s never happened.”
As Donald Rumsfeld would have it, such unknown unknowns provide our greatest difficulties.
A highly technical (and convoluted) explanation released by Garden State stated that an “anomaly” caused about 7,000 shells to go off at nearly the same time over San Diego Bay. A doubling of code commands in the Big Bay Boom fireworks computer system caused all of the show’s fireworks, from four separate barges and five locations over a 14-mile span, to launch within 30 seconds. Thankfully, no one was hurt.
Garden State’s statement included an explanation of how fireworks shows are produced through code, with a primary launch file and a secondary back-up. The two files are then merged to create a new launch file, and sent to each of the five fireworks locations. Apparently, an “unintentional procedural step” occurred during that process, causing an “anomaly” that doubled the primary firing sequence.
“The command code was initiated, and the ‘new’ file did exactly what it ‘thought’ it was supposed to do,” the report says. “It executed all sequences simultaneously because the new primary file contained two sets of instructions. It executed the file we designed as well as the file that was created in the back-up downloading process.” The statement placed the blame generally upon its “effort to be over-prepared for any disruption in communications.”
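For readers who think in code, here is a toy sketch of how merging a primary file with its back-up can quietly double a command list. To be clear, this is not Garden State’s system: the cue format (a firing time plus a shell id) and the function names `naive_merge`, `checked_merge` and `duplicated_shells` are my own assumptions, invented purely for illustration. It also does not try to explain why the real show fired all at once, only how a merge can leave “two sets of instructions” in one file.

```python
from collections import Counter

# Hypothetical cue format (an assumption): (seconds_from_start, shell_id).
primary = [(0.0, "A1"), (2.5, "A2"), (5.0, "B1")]
backup = list(primary)  # the back-up is simply a copy of the primary


def naive_merge(primary, backup):
    """Concatenate both files, so every cue ends up in the result twice."""
    return primary + backup


def checked_merge(primary, backup):
    """Keep each distinct cue once, ordered by firing time."""
    return sorted(set(primary) | set(backup))


def duplicated_shells(cues):
    """Return the shell ids that are commanded more than once."""
    counts = Counter(shell for _, shell in cues)
    return sorted(shell for shell, n in counts.items() if n > 1)


if __name__ == "__main__":
    merged = naive_merge(primary, backup)
    print(len(merged), duplicated_shells(merged))
    print(checked_merge(primary, backup) == primary)
```

The point of a check like `duplicated_shells` is that it runs before anything launches. A redundancy step meant to add safety becomes a new source of failure unless something verifies its output.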
In the world of computer specialization, it’s easy to forget how many of our systems depend on complex code that may be extremely difficult to understand. More broadly, it’s remarkable how complex nearly everything in our society is. IT security measures have largely focused on sabotage, but as Edward Tenner points out in The Atlantic, there is also a complexity risk, especially as the scope of cloud computing increases — including risks to back-ups in the cloud. The Technology Review blog recently presented the concerns of Professor Bryan Ford at Yale University:
“Non-transparent layering structures…may create unexpected and potentially catastrophic failure correlations, reminiscent of financial industry crashes,” he says.
But the lack of transparency is only part of the story. A more general risk arises when systems are complex because seemingly unrelated parts can become coupled in unexpected ways.
A growing number of complexity theorists are beginning to recognise this problem. The growing consensus is that bizarre and unpredictable behaviour often emerges in systems made up of “networks of networks”.
An obvious example is the flash crashes that now plague many financial markets in which prices plummet dramatically for no apparent reason.
The issue relates to more than complexity, however. Complexity, optimization, leverage and efficiency all conspire against redundancy, nature’s primary risk management tool. They can also readily lead to hubris, rather than intellectual humility, which causes us to think we’ve “got everything covered” (think VaR, or value-at-risk, models and the 2008-09 financial crisis). As Nassim Taleb points out, “Nature builds with extra spare parts (two kidneys), and extra capacity in many, many things (say lungs, neural system, arterial apparatus, etc.), while design by humans tend to be spare [and] overoptimized.”
My point is not to denigrate complexity, optimization and the like. Instead, I merely wish to emphasize that these benefits also come with risks, and that we are foolish to the extent that we fail to recognize and deal with those risks. In the broader context, these risks can include market crashes and other catastrophes. More personally, they can mean the failure of a retirement income portfolio withdrawal plan or other individual catastrophes. When we’re dealing with important matters, we should at least explore a good quality back-up plan, insurance of some kind, or both. If (when) our plans fail, the resulting explosions can be real and debilitating.