Causation and Common Sense

By Dr. David Murray
Statistical Assessment Service

There is a specter haunting health policy, brought on, ironically enough, by a novel scientific strength: our ability to discover computer-generated associations. This development has worrisome dimensions, particularly when we remember the alacrity with which journalists and policy makers move from scanty evidence of risk to ironclad assumptions of demonstrated causality. A good example can be found in an issue that has recently perked up: the effects of coffee drinking.

In 1981 the New England Journal of Medicine (NEJM) reported that coffee "might account" for cases of pancreatic cancer. Further, a 1998 study linked coffee to difficulty getting pregnant, to miscarriages, and even to low infant birth weight. But now the tables have turned. Based on a study in the Journal of the American Medical Association (JAMA, May 24), news headlines claimed "Coffee Might Prevent Parkinson's" (Dayton Daily News). The clear sense is that coffee has been convicted of doing something: in one case something negative, in the other something positive. In fact, none of these studies has proven particularly compelling to critics; they point out only possible associations. But it is also clear that we haven't hit the dregs yet. This past summer, yet another medical claim about coffee made instant news.

In this case the report from the Associated Press was headlined "Coffee, Rheumatoid Arthritis Linked." This time a Finnish study published in the Annals of the Rheumatic Diseases followed 19,000 people over 15 years. The researchers found that among people who drank more than three cups a day, 0.5 percent developed the disease, as opposed to only 0.2 percent of those who drank three cups a day or less.
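It is worth making the arithmetic behind such figures explicit, since a relative risk can sound alarming while the absolute difference remains tiny. A minimal sketch, using only the two rates reported in the article:

```python
# Rates cited in the Finnish study, as reported in the article.
rate_heavy = 0.005  # more than three cups a day: 0.5 percent got the disease
rate_light = 0.002  # three cups a day or less: 0.2 percent

relative_risk = rate_heavy / rate_light   # ratio of the two rates
absolute_diff = rate_heavy - rate_light   # difference in percentage points

print(f"Relative risk: {relative_risk:.1f}")        # 2.5
print(f"Absolute difference: {absolute_diff:.1%}")  # 0.3%
```

A "2.5 times the risk" headline and a "three extra cases per thousand drinkers" summary describe the same numbers; only the second conveys the scale.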

While the effect is not large and the study was criticized by several scientific commentators (particularly for failing to control for confounding factors that may affect the development of the disease), the AP stressed that the study was important because "it is the first to produce evidence of a possible link" between coffee and rheumatoid arthritis. Well, isn't discovering and then publicizing "possible links" a public health service? It would seem so, at first sip. Perhaps we would be served by a general device that would automatically find "possible links," warning us of health risks.

Several agencies are moving in that direction. For instance, a campaign for a new "national strategy" on health, called HealthTrack, has recently been launched by the Pew Charitable Trusts through a series of full-page advertisements and editorials in national newspapers. The campaign argues that Americans need a federally funded "national system for tracking, monitoring and responding to health threats caused by environmental factors." As the executive director of HealthTrack put it in the Wall Street Journal, the program would track diseases and their relationship to environmental factors through the "sophisticated use of computer programs and mapping."

However, in a realization that extends far beyond concerns over coffee beans, "possible links" may not prove substantial grounds for guiding behavior. Climatologist Dr. Richard Lindzen of MIT captured our dilemma well when he framed the following paradigm for the misuse of science: "Everything is connected to everything. Nothing is certain. Anything may cause anything. Therefore, something must be done." And here is the heart of the matter. Like the sorcerer's apprentice Mickey Mouse, overwhelmed by his multiplying brooms, we must be cautious once we automate data-dredging: how do we get the blasted thing to stop? The sleepless sifting of a computer program, sorting, combining, suggesting associations all night long, is not always accompanied by an evaluative component to judge the patterns discerned. We should remind ourselves that while much of science depends upon coming up with good ideas, finding new patterns or connections, it depends at least as much on sorting out and defeating bad ideas. Too much is connected already to too much, and nearly everything is instantly connected to the media. The challenge is to discriminate between valuable connections and mere mirages. Hence the need for mechanisms or procedures that disable spurious associations. Statistical tests of significance and the like are part of this mechanism, but they cannot assure validity (real-world applicability) even when they work.
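The arithmetic of automated dredging makes the point concrete. A standard significance test allows a 5 percent chance of a false positive; screen enough candidate variables and a spurious "possible link" becomes nearly certain. A minimal sketch, assuming the usual case of independent tests at the 0.05 level:

```python
# Chance of at least one spurious "significant" association when
# screening n independent candidate variables at the 0.05 level.
alpha = 0.05  # conventional false-positive rate per test

for n in (1, 20, 100):
    p_any_false_link = 1 - (1 - alpha) ** n
    print(f"{n:3d} tests -> {p_any_false_link:.0%} chance of a false link")
```

With one test the false-positive chance is the advertised 5 percent; with 20 it is about 64 percent, and with 100 it exceeds 99 percent. This is why a sleepless program sorting variables all night will reliably surface "links" that no careful analyst would credit.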

This threat is not just hypothetical. The Centers for Disease Control and Prevention (CDC) recently introduced its latest web engine, the FERRET (Federal Electronic Research and Review Extraction Tool), which greatly expands a user's capacity to find things. One dimension, however, is disquieting. As Dr. Maureen Storey of Georgetown University described it, "Like the furry ferret, a tenacious cousin of the skunk, FERRET burrows through the government's massive datasets to match up the user's requested variables, combines the data, and presents it in a table of cross-tabulations or frequencies. Users may also download the data, making it easy to write simple reports from the whole range of surveys available." So what's wrong with that? Dr. Storey continues: "But like ferrets, FERRET has the potential to create havoc. FERRET digs to find the data all right, but it doesn't distinguish between the high-quality and the 'bargain' data sets ... So the invaluable tool can be a delight or a demon, depending on who's doing the asking and whether or not there is careful, responsible statistical analysis noting the documentation and limitations of the datasets. The predictable result is that more groups, some with public relations talent for the pithy sound bite, will have the capacity to mine these data and vaunt their non-peer-reviewed reports."

We have arrived at a curious juncture. With supercomputing capacity, our searching, connecting, and data-dredging associational power grows exponentially (the computations are cheap, fast, and fun to watch). But the more information we splice together, the more tangled our knot of knowledge grows. And in the absence of evaluative judgment, mere guilt by association quickly accelerates to class-action convictions. With the postulation of "possible links" proceeding at a gallop, the need for an adequate disabling mechanism becomes an imperative, lest mere perception of risk should keep us up all night.
Coleridge once claimed that poetic faith depended on the "willing suspension of disbelief." Perhaps good science depends, at least initially, on the opposite.