December 2013

Why We Can’t Develop Software

by Jonathan Wallace jw@bway.net

In the 1990’s, I worked for a family of companies that included several software operations. I am not a programmer, but I took an interest in how software is developed, read books and articles, and talked to developers and architects. When I was CEO of one such organization from 1995-2000, I hired the strongest people I could and encouraged the development of a software methodology.

I am fascinated by the huge, spectacular failures of software projects. Like plane crashes, nuclear plant meltdowns and other failed human endeavors, software projects which cost millions of dollars without resulting in a working application are rich topics for study of the psychological, political and economic factors which contribute to disaster.

The failure of the Affordable Care Act website to work on October 1 is the latest in a series of government projects which predictably failed to result (either immediately or ever) in working applications. Among the most prominent were an IRS taxpayer compliance project, an FBI virtual case file project, and a Homeland Security financial system. In each case, millions or even billions of dollars were sunk into development of systems which were either declared failures, or required substantial remediation. There have been notable commercial failures too; government projects fail for many of the same reasons as private ones, though government likely has a few unique issues as well.

The ACA website was no more complicated than commercial websites we use every day, such as Amazon, eBay and those of banks. The problems which the developers failed to solve, such as pulling information from the IRS on the back end, are no different from the functions Amazon carries out routinely, such as checking your credit card.

Semantics and optics

As with everything, in politics especially, much depends on the way we define terms, and on the spin things are given. All projects fail before they succeed; all software is buggy, and almost all projects go through a stage where they work very badly before they work well. The exercise is, of course, to have an appropriate PRIVATE testing process, and not to release software before it’s ready. The lamentable state of the ACA website on October 1 would have been a normal condition for the same application if it were still in test mode and not released to the public. One huge error the government made was announcing a hard and fast release date, October 1, and then keeping to it despite the incompleteness of the project. In a perfect world, all software deadlines would be, within reason, moving targets; goals rather than immutable commitments.

I think the Obama administration fucked up badly on this project, so none of this is intended to let them off the hook. But Republican spin, generated by a party that actually does not want us to have affordable health insurance, vastly contributed to the “optics” of failure. Large, transformative software projects are often plagued by similar internal politics within corporations: there is so often a contingent that does not want the transformation to happen and wants the project to fail.

Spin versus reality

There is a phenomenon I have described before which results in our mistaking metaphor for reality. I see this happening at all levels of human endeavor, but noticed it particularly in two manifestations during the Bush administration: the shipboard ceremony with the huge “Mission Accomplished” sign when the mission was not nearly done, and Bush’s praise to his incompetent FEMA administrator during Hurricane Katrina: “Brownie, you’re doing a heck of a job”.

Politics (which, again, is not restricted to the government world, but is rife within corporations and everywhere else in life) involves a lot of rhetoric, of defining one’s own terms, using labels which may be devoid of content (like “socialist” applied to someone who does not believe the workers should own the means of production), and then, at the end of the debate, declaring victory. This behavior, from the Greek sophists and orators on down, has always been most successful in human affairs when the things discussed were purely abstract (how many angels can dance on the head of a pin?) or where the links to real outcomes were diffuse or took decades or even centuries to detect. I remember the Republicans declaring victory in the 1980’s over Gramm-Rudman, a bill to coerce balancing of the federal budget; fast forward thirty years and it has failed to have any effect.

The grave and often terminal problem comes when we hold to the rhetoric even when it’s contradicted by the optics. The fact that countries like Greece and Spain are sliding towards mass unemployment, hopelessness and even fascist movements like Golden Dawn in response to austerity programs has not, apparently, caused any of the more ideological proponents to question the effectiveness of austerity. On the other hand, sometimes the real world failures are so obvious and immediate, there can be no spin; Katrina was an example, with its floating bodies, epidemic disease, and mass anarchy including police lawlessness. Nobody except possibly Bush could believe Brownie was doing a heck of a job.

I suppose that declaring victory, and believing it, is a propensity of any culture that is too involved with words; there comes a moment when we forget that words are not things, and that predicting or affirming a real world outcome does not necessarily cause that outcome to happen. Another example is American exceptionalism, declaring this country the greatest in the world at the point where we can no longer build bridges, do math, or even collaborate in Congress to vote on legislation.

Since companies and governments have wide latitude to sell things to others which don’t exist or don’t function, why would they handle internal sales any differently? There is a great similarity between a Wall Street firm selling crappy mortgage-backed securities to clients, and internally selling itself on a crappy software project. When you lose the ability to detect reality in one zone, you tend to lose it in all others.

Slice and dice

Another major problem in huge software projects, in corporations and governments, is that they are handled in a very decentralized way, without strong leadership. Government and banks both, when they need three hundred people writing software, bring in a lot of intermediate firms, essentially body shops, to provide them. These people do not share a common culture, values or goals, and typically are not strongly supervised by one leader with a vision of, and for, the entire project. Imagine trying to build a skyscraper by hiring three hundred workers to carry bricks, unsupervised by an authoritative architect. Hmmm, that's the story of the Tower of Babel.

I see this as being a fascinating, disturbing trend in capitalism itself. All manufacturing started on an artisanal model, where one expert designed and built a piece of furniture or a boat from scratch in a world of masters and apprentices. The industrial revolution brought the idea that you could have a mass of workers each responsible for one part or operation, without an overview of the entire entity. In medieval times, every apprentice had the opportunity to become a master; but alienated factory workers who only ever turn one screw came not to care if they missed a few.

This process of decentralization, alienation, leadership vacuums and general dumbing down is visible everywhere in human endeavor: workers failing to insert o-rings or tighten bolts, unaware or uncaring that these errors will scrub missions or cause the loss of life; bureaucratic organizations where people tasked with filling out a form lose sight of the fact that the goal is to save or improve a life; customer support organizations outsourced to foreign countries where the workers, however polite, educated or well intentioned, can have little conception of the culture or idiom of the company for which they work or the customers to whom they are speaking; companies “managing” mortgages that have no idea what they own or how the mortgage originated or where the original paperwork is.

An astonishing example, about which I want to write an essay one day soon, is the burgeoning field of document review in large litigations. You would think that law is one of the last of the artisanal, master-apprentice fields. Thirty years ago, when I started out, teams of the most junior lawyers of a large firm--the apprentices--would sit in rooms full of boxes and review paper documents, supervised by a partner (the master) with exacting standards as to what to look for. Today, teams of mainly otherwise unemployable law graduates, their eyes glazed over, sit (many of them seething with resentment) in cavernous rooms, clicking away at terminals, where they are expected to classify each document in a few seconds (each reviewer is expected to get through more than 1,000 per day). It is the law as assembly line factory, and the results are predictable (thousands of privileged documents misclassified and released to the adversary, because the people doing the work don’t care and nobody is supervising them).

Almost every large software project I have personally witnessed is similarly run. If you have endless time and money, you might end up with something workable, on what I call the “Normandy invasion” theory: the bloodshed at the beaches was unbearable, none of the paratroopers landed where intended or linked up with the units they were supposed to join, but in the end we could deploy so many more millions of men than the Germans could that we won anyway.

The vast majority of three-hundred- or three-thousand-person projects could, in my estimation, be better done by teams of one tenth or even one hundredth the size, led by a really strong architect and project manager. There is almost no software out there, maybe none at all, which really should cost billions, or hundreds of millions of dollars to build.

In 1975, Fred Brooks, formerly of IBM, published a famous book of essays, The Mythical Man-Month, which argued that adding more people to a late project makes it later: when one hundred people have to communicate with one another, time is lost in a way that may not hamper a team of ten.
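To put rough numbers on Brooks's point: if every person on a team may need to coordinate with every other, the number of possible pairwise conversations is n(n-1)/2, the formula Brooks himself uses. The short Python sketch below is only an illustration. The function name is invented for this purpose, and it assumes the worst case in which any pair might have to talk, which real teams try to avoid by how they organize.

```python
def communication_channels(team_size: int) -> int:
    """Count the distinct pairs of people in a team of the given size.

    Assumes the worst case, where any two people may need to talk.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # n people form n * (n - 1) / 2 distinct pairs.
    return team_size * (team_size - 1) // 2


for size in (10, 100, 300):
    channels = communication_channels(size)
    print(f"{size:>4} people -> {channels:>6} possible pairwise channels")
```

Going from ten people to one hundred multiplies the headcount by ten, but the possible channels by more than a hundred (45 versus 4,950); at three hundred people the figure is 44,850.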

Vision

The single most important factor that makes software projects fail is a lack of vision and leadership, again a problem that cuts across society and causes nations as well as other types of projects to fail. Someone needs to analyze the requirements, define the specifications, and supervise the developers; although any of these functions may be delegated, there needs to be one human being with the responsibility and the authority to make the project come out right. "Authority" means the ability to fire anyone on the team, or to buy everyone new laptops. "Responsibility" means that if the project doesn't work, the leader falls on her sword. In a slice and dice world, where CEOs commonly blame their underlings for failure while absolving themselves, classic concepts of leadership have been forgotten. I doubt there was anyone at the top of the ACA website project who fits this definition of leadership. President Obama, unlike some of his predecessors (remember Reagan's "Mistakes were made"?), seems to remember the buck stops on his desk; but he forgets that the same must be true of the people he hires: the buck stops on everyone's desk in a truly effective organization.

Conclusion

Software is science, is architecture, is an artisanal field where you’d better be careful, experienced, and keyed in to objective results. Software is also a field in which small is beautiful: more can be better accomplished by teams of fewer but more masterful people.