University of Oxford

Wellcome Unit for the History of Medicine

 

Conference on

Beating biases in therapeutic research: historical perspectives

Thursday 5th and Friday 6th September 2002

Osler-McGovern Centre, Green College, Oxford

Introduction

National drug licensing authorities and other organisations acting on behalf of the public reach their decisions after considering research evidence assessing the extent to which healthcare interventions achieve hoped-for benefits without unwanted side effects. An increasing number of these agencies require research evidence that has been generated in ways intended to ensure that assessments of healthcare interventions are fair, and avoid biases of various kinds. Indeed, citizens are increasingly seeking such evidence to inform their own choices in health care.

This conference will begin by considering the kinds of biases that are considered important today; go on to show how biases and ways of reducing them have been conceptualised and developed over the past three centuries; and conclude with an account of how the National Institute for Clinical Excellence takes account of biases in its work.

PROGRAMME

Thursday 5th September

09.00-09.30 Registration and coffee
Chair: Dr Irvine Loudon, Wellcome Unit and Green College, Oxford, UK
09.35-10.35 Where are we now? Current expectations for the control of biases
Professor Matthias Egger, Department of Social Medicine, University of Bristol, UK
10.35-11.00 Coffee
11.00-12.00 Conceptualisation and control of biases in 18th century British medicine and surgery
Dr Ulrich Tröhler, Institut für Geschichte der Medizin, Freiburg i Br, Germany
12.00-13.00 Comparing like with like: the evolution of prospective control of selection biases
Sir Iain Chalmers, UK Cochrane Centre, NHS Research & Development Programme, UK
13.00-14.30 Lunch
Chair: Sir Iain Chalmers, UK Cochrane Centre, NHS Research & Development Programme, UK
14.30-17.00 Fisher, Bradford Hill and Randomization
Professor Peter Armitage, St Peter's College, University of Oxford, UK
Sir Richard Doll, Clinical Trials Service Unit, University of Oxford, UK
Professor Harry Marks, Johns Hopkins University, Baltimore, USA
Sir Walter Bodmer, Hertford College, University of Oxford, UK
17.00 Reception
Friday 6th September
Chair: Dr Maureen Malowany, Wellcome Unit and Green College, Oxford, UK
09.00-10.00 The placebo problem: historical background to current concerns
Dr Michael Dean, Department of Health Sciences, University of York, UK
10.00-11.00 Comparing like with like in observational studies: real and imaginary biases
Professor Jan Vandenbroucke, Department of Clinical Epidemiology, Leiden University, The Netherlands
11.00-11.30 Coffee
11.30-12.30 Biased reporting of research evidence
Professor Kay Dickersin, Brown University, Rhode Island, USA
12.30-14.00 Lunch
Chair: Sir Iain Chalmers, UK Cochrane Centre, NHS Research & Development Programme, UK
14.00-15.00 Controlling biases on behalf of the public today
Sir Michael Rawlins, National Institute for Clinical Excellence, London, UK
15.00-15.30 Reflections and final discussion
Professor Frederick Mosteller, Harvard University, Boston, USA
15.30 Close of conference


Abstracts of Presentations

Where are we now?
Current expectations for the control of biases

Mike Clarke
UK Cochrane Centre, Oxford

Bias can enter evaluations of health care deliberately and surreptitiously. It can affect the findings of these evaluations and alter their interpretation. It can, thereby, affect how these evaluations are used by people making decisions about health care. It is important, therefore, that people performing, reporting, interpreting and using studies of health care try to understand the biases that can arise, and try to minimise their impact. This presentation will discuss a selection of biases. The reasons for such biases will be considered along with, where available, the evidence of their extent and the ways in which they can be minimised or countered.

It is first worth thinking about what is meant by bias. To a bowler, it means the way to make a bowl swerve; to an electrician, the voltage applied to a transistor to help it work; and to a seamstress it is a diagonal cut across the weave of a fabric. Within health care evaluations, it has both a qualitative and a quantitative meaning. The presence of the former might be suspected but can be difficult to measure and is a preference or inclination that prevents impartial judgement. The latter might be measurable if there is a good estimate of the "true value", since it is the process that produces results that differ systematically from this true value.

Among the biases to be considered are those involved in choosing the topics and areas of health care to be evaluated; in choosing the intervention to compare against a new therapy; in allocating people to the different interventions; in measuring outcomes; in reporting the findings of the research; in reviewing and combining the results of separate studies; and in interpreting the results of research, be it a review or a single study.

Research agenda bias can affect health care research in general, as well as randomised trials and systematic reviews in particular. It influences the types of intervention to be studied (for example, pharmaceutical drugs over physiotherapy, exercise and alternative medicine in osteoarthritis of the knee) and the outcome measures assessed (for example, blood chemicals over a patient's perception of their fatigue).

Bias in the choice of the comparator for a new intervention can serve to make the new treatment appear good (by comparing it with no treatment); make the comparator appear neutral (by using a placebo even if there is an effective reference treatment); or make the comparator look bad (by using a toxic "old" drug or an ineffective or harmful dose).

In 1979, Sackett identified 24 different types of bias relating to the conduct of a study. Among the most important of these are selection bias, leading to biased assignment of patients to the comparison groups (dealt with by concealed, random allocation); performance bias, leading to unequal provision of care to patients in the comparison groups (dealt with by blinding and strict protocols if appropriate); detection and attrition bias, leading to differences in how outcomes are assessed and losses to follow-up are dealt with in the comparison groups.

Publication bias makes it more likely that studies with statistically significant results in favour of the experimental intervention will be published compared to those with non-significant results, or results favouring the comparator. Added to this, time-lag bias means that even if non-significant results are published, this publication will take longer.

Trialists need to do what they can to protect their studies from these biases, reviewers need to do what they can to minimise their influence on their review, and the users of health care research need to be wary of biases and their impact on this research.

References:

Kunz R, Vist G, Oxman AD. Randomisation to protect against selection bias in healthcare trials (Cochrane Methodology Review). In: The Cochrane Library, Issue 4, 2002. Oxford: Update Software.

Hopewell S, Clarke M, Stewart L, Tierney J. Time to publication for results of clinical trials (Cochrane Methodology Review). In: The Cochrane Library, Issue 4, 2002. Oxford: Update Software.

Scherer RW, Langenberg P. Full publication of results initially presented in abstracts (Cochrane Methodology Review). In: The Cochrane Library, Issue 4, 2002. Oxford: Update Software.

Altman DG, Schulz KF, Moher D, et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Annals of internal medicine 2001; 134: 663-694.

Clarke M. Ovarian ablation in breast cancer, 1896 to 1998: milestones along hierarchy of evidence from case report to Cochrane review. BMJ 1998;317:1246-1248.

Sackett DL. Bias in analytic research. Journal of Chronic Diseases 1979; 32: 51-63.

Tallon D, Chard J, Dieppe P. Relation between agendas of the research community and the research consumer. Lancet 2000;355:2037-2040.

Djulbegovic B, Lacevic M, Cantor A, et al. The uncertainty principle and industry-sponsored research. Lancet 2000; 356: 635-638.

Dickersin K. How important is publication bias? A synthesis of the available data. AIDS Education and Prevention 1997; 9: 15-21.

Egger M, Zellweger-Zahner T, Schneider M, et al. Language bias in randomised controlled trials published in English and German. Lancet 1997; 350: 326-329.

Vickers A, Goyal N, Harland R, Rees R. Do certain countries produce only positive results? A systematic review of controlled trials. Controlled Clinical Trials 1998; 19: 159-166.

Gøtzsche P. Reference bias in reports of drug trials. BMJ 1987; 295: 654-656.

Clarke M, Alderson P, Chalmers I. Discussion sections in reports of controlled trials published in general medical journals. JAMA 2002;287:2799-2801.


Conceptualisation and control of biases
in 18th century British medicine and surgery

Ulrich Troehler
Institut für Geschichte der Medizin, Freiburg i Br, Germany

Until the 18th century, certainty of knowledge (preferably in accordance with the wisdom of the Ancients) was the essential basis for clinical decisions. Evidence of therapeutic effectiveness was not an issue. Truth was attained by sound argument based on a pathophysiological system. A physician knew and failures could always be explained away. Personal experience and the empirical approach to truth by trial and error were therefore considered a matter for craftsmen, mountebanks and surgeons, that is, men considered to be of lower standing. Were there not itinerant stone-cutters who boasted of their successes using numbers?

This view of the assessment of therapeutic effectiveness changed in the 18th century for a number of reasons. 1 Within the profession, new therapies, particularly in surgery, but also in internal medicine, challenged traditional treatments. Much maligned empirical assessment became a virtue, particularly in Britain where strong cases were made for comparisons of data from clinical observations, and even for formal experimentation. Data should include successes and failures alike so that truth was not distorted by “reporting bias”. 2 This, of course, presupposed accurate recordings to avoid the fallacy of trusting memory alone - “recall bias”. 3 In order to induce “more certainty” (sic!), observations needed to be numerous and numerically presented, hence the importance of records from the Army, Navy, voluntary hospitals and dispensaries, questionnaires and privately kept notes. 4

The issue of fair comparison arose to a certain extent because of this type of quantification. Two approaches can be distinguished, that of the “arithmetic observationists”, and that of the experimentalists.1 In acute diseases and injuries the criterion for assessing effects was often death. Thus, arithmetic observationists refined crude mortality rates in case series of operations by calculating age-specific death rates for lithotomy,5,6 or by taking account of the anatomical localization of amputations. 7 In prospective clinical experiments, comparison groups were formed in order to compare like with like.8 Some experimenters furthermore realized that the results of a study might be flawed by prejudiced patients’ statements and used blinding to avoid this type of bias.9

In summary, the critical assessment of therapeutic effectiveness became topical in 18th century British medicine and surgery. 1 In order to minimize the influence of the prejudices of doctors and patients, a kind of “procedural objectivity” was conceptualised, and appropriate rules applied in practice.

References:

1.

Troehler U. "To improve the evidence of medicine": The 18th century British origins of a critical approach. Edinburgh: Royal College of Physicians, 2000.

2.

Ferriar J. Medical histories and reflexions. Vol.1. London: Cadell and Davies, 1792.

3.

Fowler T. Medical reports on the effects of tobacco in the cure of dropsies and dysenteries..., quoted from the 2nd ed., London: for the author, 1788.

4.

Millar J. Observations on the management of diseases in the Army and Navy, London: for the author, 1783.

5.

Cheselden W. The anatomy of the human body, 4th ed., London: Bettesworth and Hitch, 1732.

6.

Ibid., 5th ed., London: Bowyer, 1740.

7.

Guthrie GJ. On gun-shot wounds of the extremities. London: Longman, 1815.

8.

Lind J. A treatise on the scurvy..., 2nd ed., London: Millar, 1757.

9.

Haygarth J. Of the imagination as a cause and as a cure of disorders of the body..., Bath and London: Cruttwell and Cadell, 1800.


Comparing like with like:
the evolution of prospective control of selection biases

Iain Chalmers
UK Cochrane Centre,
NHS Research & Development Programme, UK

When did people start thinking about how to test the validity of therapeutic claims? And how did they begin to conceptualise ‘fair tests’ of medical treatments? A prerequisite for fair tests is that steps be taken to reduce biases; yet histories of clinical trials seem only rarely to have identified the conceptualisation and reduction of biases as major themes. Kaptchuk’s history 1 of measures taken to reduce observer biases is an important exception.

As Kaptchuk has shown with respect to the control of observer biases, the need to control selection biases (to ensure that like is compared with like) has been appreciated for at least 250 years. 2-4 James Lind’s 1753 account of his comparison of different treatments for scurvy describes how he took steps to ensure that the sailors who participated in his trial were clinically “as similar as I could have them”, “lay together in one place”, and had “one diet common to all”.5 In an account of a controlled evaluation of bloodletting published in 1816, Hamilton described how sick soldiers had been “admitted, alternately” under the care of surgeons who either used or withheld venesection, but were otherwise “attended as nearly as possible with the same care and accommodated with the same comforts”.6 Balfour’s 1854 account of an assessment of the effects of homeopathic belladonna in preventing scarlet fever describes how he divided the participants in the experiment into two groups, “taking them alternately from the list, to avoid the imputation of selection”.7

These 18th and 19th century examples of measures taken to reduce selection biases are important because they suggest that one of the requirements for ‘fair tests’ – that ‘like be compared with like’ – had been conceptualised at least a century before Fisher’s development of the statistical theory underpinning the design and analysis of experiments using random allocation.8 Indeed, not only does treatment allocation based on strict alternation abolish selection bias just as effectively as treatment allocation based on random allocation; these different allocation methods also seem unlikely to have importantly different implications for statistical analysis.9

Although examples exist of random allocation being used during the 1930s,10,11 alternation remained the principal method used to achieve prospective control of selection biases until after the 2nd World War. An allocation schedule based on random number tables was used in the Medical Research Council trial of streptomycin for pulmonary tuberculosis,12 not because of statistical considerations, but because it would help to conceal allocations until after eligible patients had been irrevocably entered into the study.13,14 This measure seems likely to have reflected Bradford Hill’s critical assessment of an earlier clinical trial in which an allocation schedule based on alternation had clearly sometimes been ignored by the clinicians entering patients into the study. It is the measures taken to conceal the allocation schedule, not the use of random number tables to generate it, which make the MRC streptomycin trial a methodological milestone.3,14
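The point about foreknowledge of allocations can be illustrated with a small Python sketch. It is not a reconstruction of the procedure used in any historical trial; the function names, the treatment labels "A" and "B", the number of patients and the seed are all assumptions made for the illustration. It asks how often someone who knows only the previous patient's treatment can predict the next allocation.

```python
import random


def alternation_schedule(n, first="A"):
    """Strict alternation: A, B, A, B, ... (hypothetical labels)."""
    other = "B" if first == "A" else "A"
    return [first if i % 2 == 0 else other for i in range(n)]


def random_schedule(n, seed=None):
    """Simple random allocation: each patient independently gets A or B."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n)]


def guess_next_from_last(schedule):
    """Fraction of allocations correctly predicted by guessing
    'the opposite of the previous patient's treatment'."""
    pairs = list(zip(schedule, schedule[1:]))
    hits = sum(1 for prev, cur in pairs if cur != prev)
    return hits / len(pairs)


alt = alternation_schedule(1000)
rnd = random_schedule(1000, seed=1)

print("alternation:", guess_next_from_last(alt))  # every guess is correct
print("random:     ", guess_next_from_last(rnd))  # roughly half are correct
```

Under alternation the next allocation is always foreseeable, so a clinician who wishes to steer a particular patient towards a particular treatment can do so by choosing when to enter that patient. With a random schedule the same guess succeeds only about half the time, which is what makes concealment of the schedule workable.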

References:
1.

Kaptchuk TJ. Intentional ignorance: a history of blind assessment and placebo controls in medicine. Bulletin of the History of Medicine 1998;72:389-433.

2.

Tröhler U. “To improve the Evidence of Medicine”: the 18th century British origins of a critical approach. Edinburgh: Royal College of Physicians, 2000.

3.

Chalmers I. Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments. International Journal of Epidemiology 2001;30:1170-78.

4.

Milne I, Chalmers I. Tackling bias in assessing the effects of health care interventions: early contributions from James Lind, Alexander Lesassier Hamilton and T. Graham Balfour. Proc Roy Coll Phys Edin 2001;31 Suppl 9:46-48.

5.

Lind J. A treatise of the scurvy. In three parts. Containing an inquiry into the nature, causes and cure, of that disease. Together with a critical and chronological view of what has been published on the subject. Edinburgh: Printed by Sands, Murray and Cochran for A Kincaid and A Donaldson, 1753.

6.

Hamilton AL. Dissertatio Medica Inauguralis De Synocho Castrensi. Edinburgh: J Ballantyne, 1816.

7.

Balfour TG. Quoted in West C. Lectures on the Diseases of Infancy and Childhood. London: Longman, Brown, Green and Longmans, 1854, p 600.

8.

Fisher RA. The design of experiments. London: Oliver and Boyd, 1935.

9.

Armitage P. Quoted in: Chalmers I. Comparing like with like: some historical milestones in the evolution of methods to create unbiased comparison groups in therapeutic experiments. International Journal of Epidemiology 2001;30:1170-78.

10.

Amberson JB, McMahon BT, Pinner M. A clinical trial of sanocrysin in pulmonary tuberculosis. The American Review of Tuberculosis 1931;24:401-35.  

11.

Theobald GW. Effect of calcium and vitamin A and D on incidence of pregnancy toxaemia. Lancet 1937;2:1397-1399.

12.

Medical Research Council. Streptomycin treatment of pulmonary tuberculosis: a Medical Research Council investigation. BMJ 1948;2:769-782.

13.

D'Arcy Hart P. A change in scientific approach: from alternation to randomised allocation in clinical trials in the 1940s. BMJ 1999; 319: 572-573.

14.

Chalmers I. Why transition from alternation to randomisation in clinical trials was made. BMJ 1999;319:1372.


Fisher, Bradford Hill and Randomization

Peter Armitage
St Peter's College, University of Oxford, UK

In the 1920s R.A. Fisher presented randomization as an essential ingredient of his approach to the design and analysis of experiments, validating significance tests. In its absence the experimenter had to rely on his judgement that the effects of biases could be discounted. Twenty years later, A. Bradford Hill promulgated the random assignment of treatments in clinical trials as the only means of avoiding systematic bias between the characteristics of patients assigned to different treatments. The two approaches were complementary, Fisher appealing to statistical theory, Hill to practical needs. The two men remained on good terms throughout most of their careers.

Selected bibliography:
 
Armitage P. The search for optimality in clinical trials. Int Stat Rev 1985;53:15-24.
Atkins WRB, Fisher RA. The therapeutic use of vitamin C. J Roy Army Med Corps 1943;83:251-252.
Box JF. R.A. Fisher: The Life of a Scientist. New York: Wiley, 1978.
Doll R, Hill AB. Smoking and carcinoma of the lung. BMJ 1950;2:739-748.
Fisher RA. The arrangement of field experiments. J Ministr Agric Gr Brit 1926;33:503-513.
Fisher RA. The Design of Experiments. Edinburgh: Oliver and Boyd, 1935.
Fisher RA. Development of the theory of experimental design. Proceedings of the International Statistical Conferences 1947;3:434-439.
Fisher RA, Bartlett S. Pasteurised and raw milk. Nature 1931;127:591-592.

Hill AB. Principles of Medical Statistics. London: Lancet, 1937.

Hill AB. The clinical trial. New Eng J Med 1952;247:113-119.

Hill AB. Memories of the British streptomycin trial in tuberculosis: the first randomized clinical trial. Contr Clin Trials 1990;11:77-79.
Marks HM. The Progress of Experiment: Science and Therapeutic Reform in the United States, 1900-1990. Cambridge, England: Cambridge University Press, 1997.


Fisher and Bradford Hill: their personal impact

Richard Doll
Clinical Trials Service Unit, University of Oxford, UK

Fisher’s book on statistical methods for research workers came to my notice when I was a medical student and led to my appreciation of the chi-square test and consequently to my first medical publication,1 in which I showed that the observations that one of my teachers had cited as vindication of a new treatment could easily have arisen by chance (p= 0.6). Fisher’s book did not, however, lead me to use randomisation rather than alternation in the first two trials I undertook (or sought to undertake) after qualification (in 1937). A few years later Bradford Hill’s example did, when I came to test the value of many treatments advocated for patients with gastric ulcers. From him, also, I learnt (i) the central importance of ethical considerations in the conduct of trials,2 as exemplified in the first large randomised clinical trial (of streptomycin for pulmonary tuberculosis); and (ii) the logical approach, adapted for each case, that could lead to the determination of whether an epidemiological association should be taken to imply causation.3

References:

1.

Doll R. Medical statistics. St Thomas’s Hospital Gazette 1936:294-297.

2.

Hill AB. Medical ethics and controlled trials. BMJ 1963;1:1043.

3.

Hill AB. The environment and disease: association or causation? Proc Roy Soc Med 1965;58:295-300.

Rigorous Uncertainty: Why R.A. Fisher is important

Harry M. Marks
The Johns Hopkins University, USA

Two epistemological claims underwrite the randomized clinical trial. The first, associated with A. Bradford Hill, asserts that randomization prevents biased estimates of the value of new therapies. The second, associated with R.A. Fisher, maintains that randomization is necessary for the valid interpretation of statistical significance.

This paper places Fisher’s views on randomization in the larger context of his views on statistical inference and the nature of science. Fisher’s life-long interest in the problem of induction, how to draw valid empirical conclusions about the world, led him to emphasize the provisional character of knowledge. The statistical methods he developed - randomization, likelihood - were aimed at producing what Fisher termed “rigorously specified uncertainty”. Fisher’s highly technical arguments about the nature of probability and likelihood are rooted in his more general concerns about the evolutionary and political importance of intellectual autonomy - concerns rooted in his early eugenic views but strongly reinforced by his ideological critique of socialist science during the Cold War.

Notwithstanding the esoteric character of some of his arguments, Fisher’s ideas have great relevance for contemporary debates within evidence-based medicine. How can we best distinguish what is known from what is not known? How can we emphasize the uncertainty of existing knowledge; and how can we best capture that uncertainty? All these Fisherian concerns are crucial in a time where the passage from the journal article to the press conference is increasingly swift.

Selected bibliography:

Fisher Box J. R.A. Fisher. The Life of A Scientist. New York: John Wiley, 1978.

Fisher RA. Collected Papers, ed. JH Bennett. University of Adelaide, 1971.

Fisher RA. Statistical Inference and Analysis. Selected Correspondence of R.A. Fisher, ed. JH Bennett. Oxford: Oxford University Press, 1990.

Jones G. Science, Politics and the Cold War. London: Routledge, 1988.

Marks HM. The Progress of Experiment. Science and Therapeutic Reform in the United States, 1900-1990. Cambridge: Cambridge University Press, 1997, 2000.

Mazumdar P. Eugenics, Human Genetics and Human Failings. The Eugenics Society, Its Sources and Critics in Britain. London: Routledge, 1992.

Perkin H. The Rise of Professional Society. England Since 1880. London: Routledge, 1990.

Savage LJ. “On Rereading R.A. Fisher.” Annals of Statistics 1976;4:441-445.


The early use of placebos

Michael Emmans Dean
Department of Health Sciences, University of York, UK

Inert palliatives, called placebos, were traditionally used as psychological props in clinical practice in the centuries before their widespread adoption in clinical trials as sham treatments designed to reduce observer bias. Currently, the ethics of dummy controls are under scrutiny, as is the clinical relevance of therapeutic efficacy defined merely as the group difference in outcomes between a treatment being tested and dummy. At the same time, the so-called ‘placebo response’ is suspected of being a dustbin category for psychosocial variables neglected by biomedicine, or even an artefact of research that forgets the natural history of disease and regression to the mean.
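Regression to the mean, mentioned above as one possible source of an apparent ‘placebo response’, can be illustrated with a short simulation. The sketch below is an illustration only: the severity scale, the enrolment threshold of 65, the noise level and the seed are all assumed values, and no treatment of any kind is modelled. Hypothetical patients with stable underlying severity are measured twice with random measurement error, and only those with a high first score are "enrolled".

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable


def measure(true_level, noise_sd=10.0):
    """One noisy measurement of a patient's stable underlying severity."""
    return true_level + rng.gauss(0.0, noise_sd)


# Hypothetical population: underlying severity does not change over time.
true_levels = [rng.gauss(50.0, 10.0) for _ in range(5000)]
first = [measure(t) for t in true_levels]
second = [measure(t) for t in true_levels]

# Enrol only patients whose first score is high (assumed threshold).
enrolled = [(f, s) for f, s in zip(first, second) if f >= 65.0]

mean_first = statistics.mean(f for f, _ in enrolled)
mean_second = statistics.mean(s for _, s in enrolled)

print("enrolled patients:", len(enrolled))
print("mean score at enrolment:", round(mean_first, 1))
print("mean score at follow-up:", round(mean_second, 1))
```

The follow-up mean is lower than the enrolment mean even though nothing was done to the patients, because selection on a high first reading preferentially picks up readings inflated by chance. An untreated comparison group is what allows this artefact to be separated from any effect of a dummy treatment.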

Historical perspectives cannot solve our placebo problems, but can help in their analysis. In this paper I give an overview of important early experiments involving masking with dummy treatments, drawing particularly on the work of Ted Kaptchuk, and my own systematic review of early prospective trials of homeopathy. I make four main points, and one recommendation:

The emergence of dummy control treatments was not a clear-cut and one-sided application of scientific scrutiny to unorthodox treatments. Experiment and clinical practice using dummies within mesmerism and homeopathy appears to have given the lead, while external evaluations were often more rhetorical than scientific.

Dummy controls penetrated allopathic and biomedical self-evaluation much less quickly than homeopathy and mesmerism, perhaps because they threatened to expose the mistaken nature of much institutionalised therapeutic practice.

Many of the most decisive early clinical trials did not use dummy controls, but involved pragmatic comparisons of competing treatments or untreated controls.

Important aspects of the ‘placebo effect’, including therapist empathy and expectations, milieu, and negative responses to dummy treatments (‘nocebos’) were identified at a very early date, but have not been the subject of research until recently.

A refinement of technical vocabulary is needed, since the single term ‘placebo’ no longer seems adequate to contain (1) paternalistic patient management; (2) dummy controls; and (3) context effects.

Selected bibliography (* = Historical reviews containing references to primary sources mentioned in the paper):

Dean ME. A homeopathic origin for placebo controls: ‘An invaluable gift of God’. Alternative Therapies in Health and Medicine 2000;6:58–66.*

Dean ME. The trials of homeopathy: a critical-historical account of the origins, structure and development of Hahnemann's scientific therapeutics, and two systematic reviews of homeopathic clinical trials, 1821–1953 and 1940–1998 (PhD thesis). University of York, 2001.*

de Craen AJM, Kaptchuk TJ, Tijssen JGP, Kleijnen J. Placebos and placebo effects in medicine: historical overview. J Roy Soc Med 1999;92:511–515.*

Di Blasi Z, Harkness E, Ernst E, Georgiou A, Kleijnen J. Influence of context effects on health outcomes: a systematic review. Lancet 2001;357:757-62.

Donner F. Zwölf Vorlesungen über Homöopathie. Berlin: Haug, 1948.*

Ellenberger HF. The discovery of the unconscious. The history and evolution of dynamic psychiatry. London: Allen Lane, 1970.*

Glaser EM. Volunteers, controls, placebos, and questionaries in clinical trials. In: Witts L J, ed. Medical surveys and clinical trials: some methods and applications of group research in medicine. London: Oxford University Press, 1959:105–117.

Gøtzsche PC. The logic of the placebo concept. Is there any? Lancet 1994;344:925–927.

Haas H, Fink H, Härtfelder G. Das Placeboproblem. Fortschritte der Arzneimittelforschung / Progress in Drug Research. 1959;1:279–454.*

Hahn RA. The nocebo phenomenon: concept, evidence and implications for public health. Preventive Medicine 1997;26:607–611.

Harley D. Rhetoric and the social construction of sickness and healing. Soc Hist Med 1999;12:407–435.

Hróbjartsson A, Gøtzsche PC. Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. New England Journal of Medicine 2001;344:1594-602.

Jobst K. A matter of mind or matter? Network 1997;64:6–8.

Kaptchuk TJ. Intentional ignorance: a history of blind assessment and placebo controls in medicine. Bulletin of the History of Medicine 1998;72:389-433.*

Kaptchuk TJ. Powerful placebo: the dark side of the randomised controlled trial. Lancet 1998;351:1722–1725.*

Kaptchuk TJ, Dean ME. Debate over the history of placebos in medicine. Alternative Therapies in Health and Medicine 2000;6:18–20.*

Kienle GS, Kiene H. The placebo effect: a scientific critique. Comp Ther Med 1998;6:14–24.

Reubi FC. Armand Trousseau (1801-1867) et l’effet placebo. Schweiz Med Woch 1986;116:27-28.*

Shelley JH, Baur MP. Paul Martini: the first clinical pharmacologist? Lancet 1999; 353:1870–1873.*

Suttie I. The origins of love and hate. Harmondsworth: Penguin, 1952 [1936].

Walach H, Maidhof C. Is the placebo effect dependent on time? A meta-analysis. In: Kirsch I., ed. How expectancies shape experience. Washington, DC, American Psychological Association, 1999:321–332.


Comparing like with like in observational studies of adverse effects of treatment: real and imaginary biases.

Jan Vandenbroucke
Department of Clinical Epidemiology, Leiden University, The Netherlands

The history of research on side effects of drugs is not older than about half a century. It parallels the history of modern chronic disease epidemiology. Several ideas in side effect research have been influenced by concepts from chronic disease epidemiology, and vice versa. The historical record can improve our insight into the essential conditions that make side effect research by observational means possible, and in this way it can also enlighten a broader discussion about the relative place of observational vs. experimental research in clinical medicine. A key question is: Under what conditions in observational studies of side effects is it possible to assume that treatment allocation was unbiased (that is, approaching randomisation) in respect of particular prognostic factors? In this presentation, I will explore the origin and development of these ideas in theoretical papers on side effect research.

Bibliography of key references to the older literature:

Finney DJ. The design and logic of a monitor of drug use. J Chron Dis 1965;18:77-98. [an update was published in 1971, and reprinted with postscript as: Finney DJ. Statistical logic in the monitoring of reactions to therapeutic drugs. In: Inman WHW, Ed. Monitoring drug safety. Philadelphia: Lippincott, 1980]

Jick H, Vessey MP. Case-control studies in the evaluation of drug induced illness. Am J Epidemiol 1978;107:1-7.

Jick H, Walker AM, Spriet-Pourra CI. Postmarketing follow-up. JAMA 1979;242:2310-4.

Skegg DCG. Medical record linkage. In: Inman WHW, Ed. Monitoring drug safety. Philadelphia: Lippincott, 1980:337-48.


Biased reporting of research evidence

Kay Dickersin
Brown University, Rhode Island, USA

“Many excellent notions or experiments are, by sober and modest men, suppressed”

Robert Boyle, 1661

Publication bias is a tendency to prepare, submit and publish research findings, based on the nature and direction of the results. If the literature is more likely to contain reports showing “positive” results than “negative” or null findings, estimates of the effects of treatments, for example, are likely to be biased. Although publication bias has been recognized for centuries,1 only recently have the size of the problem and its sources been the subject of research. There is now a large body of evidence confirming the existence of publication bias.
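How selective publication can bias an estimate of treatment effect may be seen in a small simulation. The sketch below is illustrative only and uses no real data: the true effect (0.1), the trial size (30 per arm), the number of trials, the publication rule (only statistically significant results favouring treatment are "published") and the seed are all assumptions chosen for the example.

```python
import random
import statistics


def simulate_trial(rng, true_effect, n_per_arm, sd=1.0):
    """Simulate one two-arm trial; return (estimated effect, z statistic)."""
    treated = [rng.gauss(true_effect, sd) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / n_per_arm
          + statistics.variance(control) / n_per_arm) ** 0.5
    return diff, diff / se


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_EFFECT = 0.1       # assumed small true benefit

results = [simulate_trial(rng, TRUE_EFFECT, 30) for _ in range(2000)]

all_effects = [diff for diff, _ in results]
# Assumed publication rule: only significant results favouring treatment.
published = [diff for diff, z in results if z > 1.96]

print("true effect:            ", TRUE_EFFECT)
print("mean of all trials:     ", round(statistics.mean(all_effects), 3))
print("mean of published only: ", round(statistics.mean(published), 3))
print("trials published:       ", len(published), "of", len(results))
```

The average over all simulated trials stays close to the true effect, whereas the average over the "published" subset is several times larger. A reader or reviewer who sees only the published subset would therefore overestimate the benefit of the treatment.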

Early cross sectional studies by Sterling2 and Smart3 showed that the majority of published articles in psychology and the social sciences had statistically significant findings. Others, including those examining medical journals, have confirmed these findings.4, 5, 6 Surveys of peer reviewers and investigator authors have also shown that studies with “positive” results are more likely to be submitted or recommended for publication.7, 8, 9, 10, 11 Experimental studies, in which test manuscripts with “positive” and “negative” findings were submitted to journals or referees, have shown that “positive” results are associated with acceptance.12, 13, 14

The most direct evidence of publication bias in the medical field comes from six follow-up studies of research projects identified at the time of funding or ethical approval.15, 16, 17, 18, 19 These found that having “positive” findings was the main factor associated with subsequent publication.20 Investigators said that the main reason projects went unreported was that they had never written them up or submitted a manuscript for publication, because they “were not interested.” Editorial rejection by journals was a rare cause of failure to publish.

There is mixed evidence about whether full publication of studies originally published in abstract form is associated with positive findings, depending on how “positive findings” is defined.21 These and other investigators have examined the time taken to publish research findings and have found mixed results: in some cases negative findings take longer to reach publication22 and in others they do not.23, 24

Despite the evidence identifying investigators themselves as the main source of publication bias, investigators continue to claim that editorial bias is the main reason that negative or null results are not published, and that this is why they do not submit negative findings. A recent study examining editorial review of reports of controlled trials at JAMA found little evidence of editorial bias in that journal.25

If trials were registered at inception, reviewers and others would at least know when relevant studies had been done, and could pursue the responsible investigators to obtain study results. Under-reporting of research is unethical and constitutes scientific misconduct.26 Unless this problem is addressed, estimates of the effects of treatments based on published evidence may be biased.

References:

1. Ferriar J. Medical histories and reflexions. Vol 1. Cadell and Davies, London 1792.

2. Sterling TD. Publication decisions and their possible effects on inferences drawn from tests of significance - or vice versa. Journal of the American Statistical Association 1959;54:30-4.

3. Smart RG. The importance of negative results in psychological research. Canadian Psychologist 1964;5:225-32.

4. Sterling TD, Rosenbaum WL, Weinkam JJ. Publication decisions revisited: The effect of the outcome of statistical tests on the decision to publish. Am Statist 1995;49:108-112.

5. Moscati R, Jehle D, Ellis D, Fiorello A, Landi M. Positive-outcome bias: comparison of emergency medicine and general medicine literatures. Acad Emerg Med 1994;1:267-71.

6. Vickers A, Goyal N, Harland R, Rees R. Do certain countries produce only positive results? A systematic review of controlled trials. Control Clin Trials 1998;19:159-66.

7. Greenwald AG. Consequences of prejudice against the null hypothesis. Psychol Bull 1975;82:1-20.

8. Coursol A, Wagner EE. Effect of positive findings on submission and acceptance rates: A note on meta-analysis bias. Professional Psychology: Research and Practice 1986;17:136-7.

9. Shadish WR, Doherty M, Montgomery LM. How many studies are in the file drawer? An estimate from the family/marital psychotherapy literature. Clin Psychol Rev 1989;9:589-603.

10. Sommer B. The file drawer effect and publication rates in menstrual cycle research. Psychology of Women Quarterly 1987;11:233-42.

11. Dickersin K, Chan S, Chalmers TC, Sacks HS, Smith H Jr. Publication bias and clinical trials. Control Clin Trials 1987;8:343-53.

12. Mahoney MJ. Publication prejudices: An experimental study of confirmatory bias in the peer review system. Cognitive Therapy and Research 1977;1:161-75.

13. Peters D, Ceci S. Peer-review practices of psychological journals: The fate of published articles, submitted again. Behav Brain Sci 1982;5:187-95.

14. Epstein WM. Confirmational response bias among social work journals. Science, Technology, & Human Values 1990;15:9-37.

15. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research. Lancet 1991;337:867-72.

16. Dickersin K, Min YI, Meinert CL. Factors influencing publication of research results. Follow-up of applications submitted to two institutional review boards. JAMA 1992;267:374-8.

17. Dickersin K, Min YI. NIH clinical trials and publication bias. Online J Curr Clin Trials 1993; Doc No 50:[4967 words; 53 paragraphs].

18. Stern JM, Simes RJ. Publication bias: evidence of delayed publication in a cohort study of clinical research projects. BMJ 1997;315:640-5.

19. Ioannidis JP. Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. JAMA 1998;279:281-6.

20. Dickersin K. How important is publication bias? A synthesis of available data. AIDS Educ Prev 1997;9(1 Suppl):15-21.

21. Scherer RW, Langenberg P. Full publication of results initially presented in abstracts (Cochrane Methodology Review). In: The Cochrane Library, Issue 3, 2002. Oxford: Update Software.

22. Hopewell S, Clarke M, Stewart L, Tierney J. Time to publication for results of clinical trials (Cochrane Methodology Review). In: The Cochrane Library, Issue 3, 2002. Oxford: Update Software.

23. Tierney J, Stewart L. Looking for the evidence: is there bias in the publication of individual patient data meta-analyses? Presented at 5th Cochrane Colloquium. Amsterdam, 1997.

24. Dickersin K, Olson CM, Rennie D, Cook D, Flanagin A, Zhu Q, Reiling J, Pace B. Association between time interval to publication and statistical significance. JAMA 2002;287:2829-31.

25. Olson CM, Rennie D, Cook D, Dickersin K, Flanagin A, Hogan J, Zhu Q, Reiling J, Pace B. Publication bias in editorial decision making. JAMA 2002;287:2825-8.

26. Chalmers I. Under-reporting research is scientific misconduct. JAMA 1990;263:1405-1408.

Controlling biases on behalf of the public today

Michael D Rawlins
Department of Clinical Pharmacology, University of Newcastle and National Institute for Clinical Excellence, London, UK

Controlling bias in studies designed to investigate the efficacy, effectiveness and safety of therapeutic measures plays a critical role in the design and assessment of formal clinical studies. For organisations such as the Committee on Safety of Medicines and the National Institute for Clinical Excellence, with broad responsibilities to promote public health, controlling for these sources of scientific bias is pivotal. Nevertheless, such bodies also have to exercise judgement; and controlling for judgemental bias is equally important if their decisions are to yield appropriate health gains and retain broad professional and public support.

The necessary judgements that have to be made by public bodies fall into two types. Some forms of judgement are essentially scientific and rely on a skilful evaluation of the evidence. They apply particularly to circumstances where extrapolation, beyond the studies’ immediate findings, has to be made. Examples include the relevance of intermediate endpoints (or surrogate markers) to predict longer-term outcomes; judgements about the generalisability of data, from the often relatively artificial environments of randomised controlled trials, to a heterogeneous population; and judgements about assumptions made in constructing models.

The second form of judgement often involves balancing conflicting issues, where a decision must be based on intuition rather than analysis. For the Committee on Safety of Medicines it is the interplay between efficacy and safety; whilst for the National Institute for Clinical Excellence it is primarily the tension between clinical benefit and cost.

In exercising such judgements on behalf of the public, national organisations must both act, and be seen to act, in the best interests of patients and the public as a whole. In particular, they must avoid inappropriate influences (sometimes known as 'capture') from any of a wide group of stakeholders. The stakeholder group most frequently cited as protagonists in these circumstances is the pharmaceutical industry. But there are others including politicians, patients (individually or collectively), the clinical professions, and the health service itself. Stakeholders' agendas are often similar, and they may form strategic alliances.

Modern capture theory 1 offers important insights into the judgemental biases associated with evaluating risk-benefit and cost-effectiveness by national bodies. It provides a coherent basis for controlling and avoiding them, and needs to be more widely understood by both "regulators" and the "regulated".

Reference:

1. Hancher L. Regulating for Competition: Government, Law, and the Pharmaceutical Industry in the United Kingdom and France. Clarendon Press: Oxford, 1990.

Visit the BMJ editorial page http://bmj.com/cgi/reprint/325/7377/1372

 
