## Saturday, February 16, 2008

### probability and public policy VII: cancer

[part one of this series here]

The variety of interpretations of probability is supplemented by a variety of applications. Unfortunately, the most certain and convenient applications of statistical analysis (those implementable in scientific experiments) don't often provide an answer to our questions about how to act in an uncertain world.

Causal connections, for example, are notoriously difficult to justify: how can we separate the case where smoking causes cancer from that where smoking and cancer share a mutual cause (say, a gene predisposing one to both the practice of smoking and the tendency to contract cancer)? The problem is a vexed one, but it misconstrues the situation. Such questions about causality can always be reduced to the more pertinent: how should I act?
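To see why the two cases are so hard to tell apart from data alone, here is a minimal simulation sketch of the "mutual cause" scenario. Every number in it is invented for illustration (the gene frequency, the smoking rates, the cancer rates), and the names `simulate` and `cancer_rate` are hypothetical helpers, not anything from a real study. In this toy world smoking has no effect on cancer whatsoever, yet smokers still show a much higher cancer rate.

```python
import random

def simulate(n=200_000, seed=0):
    """Simulate a population where a hidden gene drives both smoking and cancer."""
    rng = random.Random(seed)
    # counts[(gene, smokes)] = [number of people, number of cancer cases]
    counts = {(g, s): [0, 0] for g in (0, 1) for s in (0, 1)}
    for _ in range(n):
        gene = rng.random() < 0.3                        # assumed gene frequency
        smokes = rng.random() < (0.8 if gene else 0.2)   # gene makes smoking likelier
        # Cancer depends on the gene only, never on smoking.
        cancer = rng.random() < (0.30 if gene else 0.05)
        cell = counts[(int(gene), int(smokes))]
        cell[0] += 1
        cell[1] += int(cancer)
    return counts

def cancer_rate(counts, smokes, gene=None):
    """Cancer rate among smokers (1) or non-smokers (0), optionally within one gene group."""
    genes = (0, 1) if gene is None else (gene,)
    people = sum(counts[(g, smokes)][0] for g in genes)
    cases = sum(counts[(g, smokes)][1] for g in genes)
    return cases / people

counts = simulate()
print("all smokers:     ", round(cancer_rate(counts, 1), 3))
print("all non-smokers: ", round(cancer_rate(counts, 0), 3))
print("gene carriers, smokers vs non-smokers:",
      round(cancer_rate(counts, 1, gene=1), 3),
      round(cancer_rate(counts, 0, gene=1), 3))
```

Comparing all smokers with all non-smokers shows a large gap; comparing them within the gene carriers (or within the non-carriers) shows essentially none. The catch, of course, is that in real life nobody hands us the gene column.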

This latter question, the question of what to do, is precisely the kind whose answer can never be "read off" a statistical analysis. Although there are some cases where the answer may be clear (if analysis determines the probability that you will die if you leap off a 15-storey building to be 98%, you probably shouldn't jump off that building), in most cases the answer provided by a statistical analysis will be orthogonal to the question of what to do. Some examples:

#### Threshold Effects

Consider, for example, the question: Does the sugar substitute saccharin cause cancer? In fact, this question is largely irrelevant to the more pertinent: Should I ingest saccharin?

Suppose a study demonstrates that feeding lab mice some substantial amount of saccharin each day dramatically increases the probability that they will contract cancer. This seems to demonstrate, fairly conclusively, that, in some sense, saccharin "causes" cancer.

However, many of the natural substances ingested regularly by human beings would cause serious medical problems if consumed in suitably large amounts. The problem here is that there is a threshold for such substances: less than some amount, and they are easily processed by the body; once that amount is surpassed, however, there is a sudden increase in the probability that the body will suffer ill effects.

So, the result that saccharin "causes" cancer in some sense is irrelevant to the determination of "safe" behavior; really, we must know what the pertinent threshold for saccharin is: how much must be ingested before the chance of getting cancer spikes? How close is this threshold to the amount of saccharin I might consume on a daily basis? In the words of E. T. Jaynes, Probability Theory, 2003:

> For virtually every organic substance (such as saccharin or cyclomates), the existence of a finite metabolic rate means that there must exist a finite threshold dose rate, below which the substance is decomposed, eliminated, or chemically altered so rapidly that it causes no ill effects. If this were not true, the human race could never have survived to the present time, in view of all the things we have been eating.
>
> Indeed, every mouthful of food you and I have ever taken contained many billions of kinds of complex molecules whose structure and physiological effects have never been determined - and many millions of which would be toxic or fatal in large doses. We cannot doubt that we are daily ingesting thousands of substances that are far more dangerous than saccharin - but in amounts that are safe, because they are far below the various thresholds of toxicity. At present there are hardly any substances, except some common drugs, for which we actually know the threshold.
>
> Therefore, the goal of inference in this field should be to estimate not only the slope of the response curve but, far more importantly, to decide whether there is evidence for a threshold; and, if there is, to estimate its magnitude (the 'maximum safe dose'). For example, to tell us that a sugar substitute can produce a barely detectable incidence of cancer in doses 1000 times greater than would ever be encountered in practice, is hardly an argument against using the substitute; indeed, the fact that it is necessary to go to kilodoses in order to detect any ill effects at all, is rather conclusive evidence, not of the danger, but of the safety, of a tested substance. A similar overdose of sugar would be far more dangerous, leading not to barely detectable harmful effects, but to sure, immediate death by diabetic coma; yet nobody has proposed to ban the use of sugar in food.
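The difference between a threshold response and a naive straight-line extrapolation can be made concrete with a small sketch. The threshold, slope, and doses below are made-up numbers in arbitrary units, and the "hockey stick" shape is just one simple way a threshold might look; nothing here is a measured property of saccharin or any other substance.

```python
import math

def excess_risk_threshold(dose, threshold, slope):
    """Hockey-stick model: no excess risk below the threshold, linear above it."""
    return max(0.0, slope * (dose - threshold))

def excess_risk_linear(dose, high_dose, risk_at_high_dose):
    """Naive extrapolation: a straight line from zero dose to the high-dose result."""
    return risk_at_high_dose * dose / high_dose

# Hypothetical values, arbitrary dose units.
THRESHOLD = 50.0       # assumed maximum safe dose
SLOPE = 0.0001         # assumed excess risk per unit of dose above the threshold
LAB_DOSE = 1000.0      # the "kilodose" given to the lab mice
EVERYDAY_DOSE = 1.0    # what a person might actually consume

risk_seen_in_lab = excess_risk_threshold(LAB_DOSE, THRESHOLD, SLOPE)
print("excess risk at lab dose:             ", risk_seen_in_lab)
print("everyday dose, linear extrapolation: ",
      excess_risk_linear(EVERYDAY_DOSE, LAB_DOSE, risk_seen_in_lab))
print("everyday dose, threshold model:      ",
      excess_risk_threshold(EVERYDAY_DOSE, THRESHOLD, SLOPE))
```

Both models agree about the lab result, since that is the one data point we have. They disagree about the everyday dose: the straight line assigns it a small positive risk, while the threshold model assigns it none. The high-dose experiment by itself cannot tell us which is right.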

#### Modeling Continuity where Discreteness is Needed

A problem related to that of thresholds is how to take a continuous model and parse it into the discrete chunks relevant for discussion in natural language and, by extension, for policy decisions.

Consider, for example, the question of whether or not second hand smoking laws are scientifically justified. Research that shows second hand smoke is "just as dangerous" as "first hand" smoke depends upon data involving the ingestion of equivalent amounts of smoke. Second hand smoke disperses, however, and is not inhaled by bystanders with the same intensity as "first hand" smoke. The pertinent question, then, is not "is second hand smoke dangerous?" but rather "at what distance, and in what ventilation conditions, is second hand smoke dangerous?"

The dispersal of second hand smoke is a phenomenon suitable for statistical modeling. Such models do not tell us what is safe, however, merely the density of particles due to second hand smoke at various distances from the source in various ventilation conditions. Here, we have a continuous phenomenon, the dispersal of second hand smoke, and a discrete classification, safe or not safe. The problem is that statistical modeling of the continuous situation does not fully determine the locus upon that continuum of the boundary between safe and non-safe; yet it is this second question which is relevant to policy decision.

For example, a recent Stanford University study, one of the first peer-reviewed articles on the topic, emphasizes that its findings support public conceptions about second hand smoke dispersal (the results are summarized here) and, in particular, that they count as evidence for a ban on smoking in public places. However, personal communication with one of the graduate students working on the project revealed that at prior stages of analysis there was surprise at how little public fears were confirmed and how little justification for such laws could be found. Whence this apparent disagreement? The study shows that, on the one hand, yes, there are quite high levels of particles in the air near an outdoor smoker; on the other hand, the density of particles drops off quite dramatically within a fairly small distance of the smoker and is highly susceptible to ventilation effects. Furthermore, the effect lasts for a relatively short time. Since the "safe" background level of particles is defined in terms of average density over a 24 hour period, it is not at all clear how the localized and relatively short period of increased particle density compares with that contributed by, say, a passing car.
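A bit of arithmetic shows how a brief spike interacts with a 24 hour average. The concentrations and durations below are invented purely for illustration (they are not figures from the Stanford study or from any air quality standard), and `time_weighted_average` is a hypothetical helper.

```python
def time_weighted_average(segments):
    """Average concentration over a day.

    segments: list of (duration_in_minutes, concentration) pairs.
    """
    total_minutes = sum(duration for duration, _ in segments)
    weighted_sum = sum(duration * conc for duration, conc in segments)
    return weighted_sum / total_minutes

# Hypothetical day: a steady background level, plus ten minutes
# spent standing next to an outdoor smoker.
day = [
    (1430, 10.0),   # 23 h 50 min at an assumed background concentration
    (10, 500.0),    # 10 min at an assumed peak concentration
]

print(round(time_weighted_average(day), 1))  # 13.4
```

In this made-up example a peak fifty times the background level moves the daily average by only about a third. Whether the peak or the average is what matters for health is exactly the question the dispersal model cannot answer on its own.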

#### The Clustering Illusion

The clustering illusion occurs when humans see patterns in "random," or mathematically unpatterned, data. Everyday examples of this phenomenon include hearing the murmur of voices in the sound of a bubbling brook, the perception of a gambler that he is on a "winning streak," and instances of religious pareidolia, such as the "nun bun."
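The gambler's "streak" is easy to reproduce with nothing but a fair coin. The sketch below (function names and parameters are my own, chosen for illustration) flips a simulated coin 200 times, records the longest run of identical outcomes, and repeats the experiment many times.

```python
import random

def longest_run(flips):
    """Length of the longest run of identical consecutive outcomes."""
    best = current = 0
    previous = None
    for flip in flips:
        # Extend the current run if this flip matches the last one, else start over.
        current = current + 1 if flip == previous else 1
        best = max(best, current)
        previous = flip
    return best

def simulate_longest_runs(n_flips=200, n_trials=1000, seed=0):
    """Longest run in each of n_trials independent sequences of fair coin flips."""
    rng = random.Random(seed)
    return [
        longest_run([rng.random() < 0.5 for _ in range(n_flips)])
        for _ in range(n_trials)
    ]

runs = simulate_longest_runs()
share = sum(r >= 5 for r in runs) / len(runs)
print("share of sequences with a run of 5 or more:", share)
```

Runs of five or more identical outcomes turn up in the large majority of the simulated sequences, even though each flip is independent of the last. A gambler who sees five wins in a row has seen nothing that chance alone does not routinely produce.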

Such perceptions of non-existent pattern are relatively benign, as the poor choices they inspire tend to affect only a handful of individuals. When the objects of the illusion are potentially relevant to public policy, however, the clustering illusion can waste millions of dollars, inspire fear, and generate spurious "research." An example is the power lines ~ leukemia scare.

Lawsuits, local legislation, and widespread fear were caused throughout the '80s by the idea that power lines might, via their surrounding electromagnetic fields, somehow increase the possibility that a child contracts leukemia (or, perhaps, some other form of cancer). The inspiration for the studies centered on clustering phenomena, in particular, the apparent clustering of leukemia sufferers in a neighborhood "near" power lines. Subsequent research demonstrated conclusively that there was no measurable correlation between proximity to power-line-generated magnetic fields and tendency to contract leukemia, a result consonant with current scientific theories of magnetic fields. Nevertheless, public misconception and superstition persist; the damage of misguided and statistically unfounded claims continues to play out in those corners of the public sphere dominated by paranoia and underinformed superstition.

A nice summary of the issue by Edward Campion can be found here. We quote only the conclusion:

> Serious limitations have been pointed out in nearly all the studies of power lines and cancer. These limitations include unblinded assessment of exposure, difficulty in making direct measurements of the constantly varying electromagnetic fields, inconsistencies between the measured levels and the estimates of exposure based on wiring configurations, recall bias with respect to exposure, post hoc definitions of exposure categories, and huge numbers of comparisons with selective emphasis on those that were positive. Both study participation and residential wire-code categories may be confounded by socioeconomic factors. Often the number of cases of [acute lymphoblastic leukemia] in the high-exposure categories has been very small, and controls may not have been truly comparable. Moreover, all these epidemiologic studies have been conducted in pursuit of a cause of cancer for which there is no plausible biologic basis. There is no convincing evidence that exposure to electromagnetic fields causes cancer in animals, and electromagnetic fields have no reproducible biologic effects at all, except at strengths that are far beyond those ever found in people's homes.
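The complaint about "huge numbers of comparisons with selective emphasis on those that were positive" has a simple arithmetic core. As a rough sketch, assume each comparison is independent and tested at the conventional 5% significance level (both are simplifying assumptions of mine, not details of the studies Campion reviews); then the chance of at least one spurious "positive" grows quickly with the number of comparisons.

```python
def prob_at_least_one_false_positive(n_comparisons, alpha=0.05):
    """Chance that at least one of n independent tests is 'significant' by luck alone.

    Assumes no real effect exists and that the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** n_comparisons

for n in (1, 5, 20, 100):
    print(n, round(prob_at_least_one_false_positive(n), 3))
```

With twenty comparisons and no real effect anywhere, the odds are already better than even that something will look "significant." Report only that one result, and a cluster is born.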

The point here is not just that fear and irrational paranoia on the part of the public can motivate policy choices (in contradiction to or via abuse of the results of statistical analysis), but further that such paranoia can motivate bad research and lower standards of scientific rigor below those necessary to ensure meaningful results.

next: intelligent design