Ian St James-Roberts
Intentional bias is a common feature of scientific research. The danger is that fallacious results of such studies might be inflicted on the public.
IAN ST. JAMES-ROBERTS is Lecturer in Psychology at the University of London Institute of Education.
In his studies of twins, Sir Cyril Burt faked his research findings by falsifying some of his data.
By way of coincidence, the Burt disclosure came at a time when an attempt to quantify the extent and importance of bias in scientific research was already under way. On Sept. 2, 1976, New Scientist had published an article arguing that what could be called "intentional bias" in research required investigation. The argument was based on the premise that the inducements to bias research deliberately were considerable, whereas the constraints operating to detect or punish miscreants were paltry. Perhaps more important, the article proposed that science's uncritical attitude and the consequent lack of information on intentional bias in research were inimical to a discipline whose way of life is based on skepticism. The September 2 article was accompanied by a questionnaire that invited readers to provide information concerning their experiences of intentional bias. Analyses of the questionnaire replies, 204 of which were received, amply justified the view that intentional bias in science was more prevalent than many would allow.

Like beauty, intentional bias is in the eye of the beholder, or rather in this case the experimenter, since only he can know whether an "error" is intentional or unintentional. The subtlety of this distinction should not be underemphasized. One colleague, for example, reported that when he had completed an analysis of results he sometimes had a "niggling feeling" that all was not well. It was his experience that if the result of the analysis contradicted his hypothesis, he would check it. If, however, the analysis confirmed his hypothesis, he found that, although never making a deliberate decision not to check, he didn't quite get around to doing it.
The subject of unintentional bias has received considerable attention in scientific literature, and the common use in research of control groups and double-blind procedures is one consequence of the realization of its importance. Perhaps the best-known example of bias presumed to be of this sort is the N-ray scandal of 1903. The case concerned a mysterious ray, analogous to the X-ray but with the considerable advantage of being able to penetrate metals. This ray, initially isolated by René Blondlot, was soon also identified by dozens of other respectable laboratories, and its characteristics and properties quickly became well known. In 1904, however, Robert W. Wood was able to demonstrate conclusively that the rays did not exist. Given the reputations of those concerned, it seems most likely that the rays were the result of unintentionally biased observation resulting from excessive experimental zeal. In any event, they provide a perfect example of the extent to which fashionability and expectation can overrule the effects of common sense and scientific training.

Science has undoubtedly made considerable progress in developing controls to minimize unintentional bias, and it seems unlikely that an N-ray-like affair could occur today. In the process, however, the idea of intentional bias has been more or less swept under the carpet, and the thin and indistinct nature of the line separating the two has been ignored. In this context, it is worthwhile to examine in some detail the cause célèbre of scientific fraud, Paul Kammerer's experiments on the midwife toad, since it provides an excellent demonstration of how difficult absolute proof of deliberate deception can be.
The midwife toad
The case of the midwife toad (to borrow the title of Arthur Koestler's excellent book on the subject) concerns a species of toad that, unlike most others, normally mates on land. In the years up to 1909, Kammerer had managed to persuade several generations of the toad to mate instead in water. This was no mean feat, from both Kammerer's and the toad's point of view, and the technical difficulty of these breeding experiments may be one reason why they do not appear to have been repeated. The difficulty the male toad faced was to remain attached, during the long time required for fertilization, to the slippery back of the female. In toads that habitually mate in water, this behavior is facilitated by the existence on the male's hands and feet of "nuptial pads," which assist in clinging. Kammerer's claim was that he had caused these nuptial pads to appear on the limbs of the land-mating toad after only a few generations of forced mating in water. Such a finding would be important because it would contradict orthodox Darwinism, favoring instead Jean-Baptiste de Lamarck's theory of the inheritance of acquired characteristics. According to Darwinian theory, the effects of the environment can be incorporated into the genetic makeup of a species only indirectly, as a result of the "survival of the fittest" dictum. Kammerer's findings suggested that such effects had been incorporated directly into the genetic material and thereafter were passed on as an inherited characteristic to subsequent generations.
Kammerer's results were greeted with hostility because of their controversial nature. Initially, there was no suggestion of fraud, Kammerer's reputation in general being excellent. Some years after the original work, however, the only laboratory specimen of midwife toad that Kammerer had preserved was found to have been tampered with: the nuptial pads were merely judiciously applied India ink. Kammerer subsequently committed suicide and so implicitly accepted the blame for the tampering. However, as Koestler emphasized, it is by no means certain that Kammerer's suicide is attributable solely to the faked specimen, and a possibility also exists that he did not himself apply the ink. One interpretation of the evidence is that, when the midwife toad specimen began to deteriorate, a technician attempted to restore the essential characteristics with ink so that they might be better seen. This kind of refurbishment is by no means uncommon in biology. The existence of the nuptial pads was never, however, verified by any other scientific observer.
A significant aspect of the case is that no attempt to repeat Kammerer's results appears to have been made. This is not just because the experiments are so difficult to perform. Scientists are as sensitive to impropriety and stigma as any other group, and one scandal of this sort can make a complete area of inquiry disreputable.
The Kammerer case raises a number of questions. The ethical issues involved in intentional bias are too complex to receive attention here. However, since the subject of the gray area between intentional and unintentional bias is under consideration, it is appropriate to point out that a similarly indistinct area exists for moral perspectives. Two famous examples may even be seen as evidence that bias in some instances may be to the ultimate good. The best-known concerns the statistical reanalysis by R. A. Fisher of Gregor Mendel's data, which form the basis of modern views on heredity. Fisher showed that Mendel's results were just too good to be true: the chances of his getting them, given his research techniques, were something like 1 in 10,000.
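Fisher's "too good to be true" argument can be made concrete with a Pearson chi-square goodness-of-fit test. The sketch below uses hypothetical counts for a single 3:1 Mendelian cross, not Mendel's actual figures; his 1-in-10,000 estimate came from aggregating such tests across Mendel's entire series of experiments.

```python
# Sketch of Fisher's reanalysis logic. The observed counts are
# hypothetical, chosen only to illustrate the calculation.

def chi_square(observed, expected):
    """Pearson chi-square statistic for observed vs. expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# A cross predicted to yield a 3:1 ratio of dominant to recessive traits.
observed = [705, 224]                      # hypothetical experimental counts
total = sum(observed)
expected = [total * 3 / 4, total * 1 / 4]  # the 3:1 Mendelian expectation

chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.4f}")  # prints chi-square = 0.3907

# Fisher's point: any one small chi-square value is unremarkable, but
# when many independent experiments all sit this close to expectation,
# the combined probability of such agreement arising by chance becomes
# vanishingly small.
```

A single value of 0.39 is well below the 5% critical value of 3.84 for one degree of freedom; it is the consistent smallness of such statistics across a whole series that Fisher found implausible.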
Nobody knows whether Mendel deliberately misrepresented his data or not, but it is clear that, whatever the means, the results of his work are of inestimable importance for modern society. A more controversial and recent instance concerns the alleged publication of misleading data on wheat radiation mutation by the influential Indian agriculturalist M. S. Swaminathan. Swaminathan claimed that he had increased the protein and lysine content of a strain of wheat by subjecting seeds of a parent strain to a combination of gamma radiation and ultraviolet light. In this case, the issue is not so much whether Swaminathan deliberately fabricated his experiments but rather whether he was less than vigilant in his attitude toward the data after they had been discredited. Swaminathan's supporters argued that any carelessness on his part was more than justified by the contribution he had made to the Green Revolution that brought about greatly increased agricultural yields.
Pressures of competition
In trying to understand the reasons for the existence of intentional bias one must think of the scientist as an individual under considerable pressure to obtain particular results. The pressure comes from a number of sources. Research funding from industry, for example from pharmaceutical firms, is normally assigned to groups producing results that look promising from the funder's point of view. The temptation to design experiments that yield such results is, consequently, a strong one. At a different level, the postgraduate scientist, working strenuously for his Ph.D., is all too aware that his research is funded for a very limited period. If, toward the end of that time, he is failing to get "good" results, the temptation to "improve" the data a little so as to secure the Ph.D. must be extraordinary. At all levels, moreover, advancement in science depends primarily on publication of impressive research findings. All journals receive far more material for publication than they can possibly handle, so they have to be selective. Selection relates to the importance of findings, and consequently "failed" experiments (those in which hypotheses have not been confirmed and so no "positive" results have been obtained) are seldom published. Once again, therefore, the emphasis is on obtaining clear-cut experimental evidence in favor of predicted phenomena.
Two recent examples of fraud were generated in large part by such pressures. The first concerned the work of William T. Summerlin at the Sloan-Kettering Institute for Cancer Research in New York, who in 1974 admitted having darkened skin grafts on white mice with a felt-tip pen in order to make transplant experiments appear successful.
Every scientist is likely at some time in his career to have to choose between morally acceptable and expedient courses of action. While the ethical standards of science are no doubt keenly felt on such occasions, the individual's need to survive must also be taken into account. When ambition and career, to say nothing of more mundane considerations like holding down a job and salary, depend on getting results of a particular sort, it is obvious that expediency must sometimes win.
In contrast to the considerable pressures working in favor of intentional bias, the sanctions operating to prevent it are negligible. The most significant is replication, and it can be argued that intentional bias may safely be ignored because important experiments are always replicated independently by other researchers before their findings are accepted or applied.

For some major advances, there is undoubtedly something in this viewpoint. For less celebrated work, though, exact replication is seldom carried out and is published even less often; journals are understandably not sympathetic to repetitious material. Moreover, the increasing expense and complexity of research is making replication even less common. Many experiments involve extremely sophisticated and costly apparatus, which, consequently, exists only in a few laboratories. Access to such apparatus is keenly sought, and it is unlikely that precious apparatus time will be allowed to be used simply for repeating experiments. In other cases, experiments are simply not reproducible, either for technical reasons (as in the Kammerer case) or because of some peculiarity in the design or subject matter. A notorious example of the latter sort of problem is the Piltdown man, a fraudulent skull specimen that led archaeology astray for more than 40 years before it was discredited.
It could even be argued that the increasing complexity of modern experimentation provides a ready cloak for the would-be charlatan, since discrepant results may be explained away as the consequences of equipment or sample idiosyncrasies. Indeed, for obvious and laudable reasons, researchers normally go to great trouble to detect possible reasons for discrepancies between their own results and those of others. If replication of the experiment, involving systematic testing of each idiosyncrasy, is then attempted, the cost in time and resources is likely to be considerable; a recent disclosure itemized one case in which four man-years had been wasted in this way on a faked original result.
The questions that made up the New Scientist questionnaire were written with the sort of issues thus far considered very much in mind. The researchers hoped to obtain information about the circumstances most likely to give rise to intentional bias, about the sort of individual most likely to succumb, and about the likelihood and consequences of detection. In addition, it was hoped that recommendations could be made with respect to the development of safeguards to minimize intentional bias if they proved to be needed.

Five of the 204 questionnaires received were spoiled, and so analysis involved 199. The questionnaire consisted largely of multiple-choice questions, where one or more of several alternatives had to be selected, but in some cases respondents were encouraged to provide additional information. Some did this, to the extent of sending in letters and complete documented case histories of fraud they had encountered. An important qualification of the survey's data is that respondents were a self-selected (rather than randomly selected) group. In the vast majority of cases (92%) they were individuals who had had some experience of intentional bias, and almost all of them (90%) were in favor of investigation of fraud in science. Although no figures exist, it seems unlikely that such high proportions would be obtained if scientists were selected at random. Hence, the group of respondents must be regarded as "unrepresentative" in the formal statistical sense in that they almost all shared an attitude not necessarily characteristic of all scientists. Of course, this kind of selectivity does not imply dishonesty or even that respondents were necessarily people with a personal ax to grind. Indeed, the reasoned and dispassionate nature of most of their responses and the certainty of their evidence suggest that their information is reliable (75% were reporting unequivocal evidence, and in 52% of cases the evidence was based on direct personal experience).
There are other good reasons for assuming respondents to be reasonable, responsible, and mature individuals. Nearly two-thirds were over 30 years of age and one-third were over 40. Their job backgrounds and status varied considerably, but 23% were tenured academic staff and an additional 12% were senior industrial officials.
Taken as a whole, the results of the survey suggest that intentional bias of one sort or another is a common feature of scientific research and that existing controls are incapable of preventing it. One question, included with the subject of controls specifically in mind, asked how many scientists were involved in each case of bias, since it was anticipated that the presence of several experimenters might act as a check on any one of them. The responses were difficult to evaluate because the relative amounts of research done by groups of different sizes were unknown. It seemed reasonable to assume, however, that most research is done by one or two workers, with decreasing amounts as the size of the group increases. With this proviso taken into account, the responses to the question offer less support than expected for the effectiveness of multiple experimenters as controls of intentional bias.

Although the proportion of fraudulent experimentation diminishes as the number of experimenters increases, nearly half of the intentional bias reported involved more than one experimenter and approximately 15% involved three or more. The informal comments of respondents suggested that an important consideration may be whether experimenters genuinely work together throughout a study or collaborate only afterward, as often happens in multidisciplinary research.
Insistence on multiple authors for papers and joint running of experiments might, therefore, provide at least some measure of control of intentional bias. What other constraints are possible? Perhaps the simplest would be for journals to insist that experimenters oversee one another's research and retain "open" data books so that anyone can have ready access to their entire data. Such controls are unlikely to be wholly effective, but they may reduce at least some kinds of intentional bias. Whatever methods are used, the cost of controls must be evaluated in terms of inconvenience and loss of time and personal liberty to the researcher.