Bayesian
Summary

Statistical analysis of both experimental and observational data is central to medical research. Unfortunately, the process of conventional statistical analysis is poorly understood by many medical scientists. This is due, in part, to the counter-intuitive nature of the basic tools of traditional (frequency-based) statistical inference. For example, the proper definition of a conventional 95% confidence interval is quite confusing. It is based upon the imaginary results of a series of hypothetical repetitions of the data generation process and subsequent analysis. Not surprisingly, this formal definition is often ignored and a 95% confidence interval is widely taken to represent a range of values that is associated with a 95% probability of containing the true value of the parameter being estimated. Working within the traditional framework of frequency-based statistics, this interpretation is fundamentally incorrect. It is perfectly valid, however, if one works within the framework of Bayesian statistics and assumes a ‘prior distribution’ that is uniform on the scale of the main outcome variable. This reflects a limited equivalence between conventional and Bayesian statistics that can be used to facilitate a simple Bayesian interpretation based on the results of a standard analysis. Such inferences provide direct and understandable answers to many important types of question in medical research. For example, they can be used to assist decision making based upon studies with unavoidably low statistical power, where non-significant results are all too often, and wrongly, interpreted as implying ‘no effect’. They can also be used to overcome the confusion that can result when statistically significant effects are too small to be clinically relevant. This paper describes the theoretical basis of the Bayesian-based approach and illustrates its application with a practical example that investigates the prevalence of major cardiac defects in a cohort of children born using the assisted reproduction technique known as ICSI (intracytoplasmic sperm injection).

Correspondence

Dr Lyle C. Gurrin
Biostatistician
Women and Infants Research Foundation
King Edward Memorial Hospital
PO Box 134, Subiaco
Perth, W.A., 6008
Australia

Keywords: assisted reproduction technology, Bayesian statistics, confidence intervals, ICSI, medical statistics, P values, statistical inference

Accepted for publication: 21 September 1999
of the null hypothesis requires simultaneous consideration of the relative plausibility of a variety of competing hypotheses that are also consistent with the data and cannot, therefore, be based on the calculation of a single P-value, assuming that the null hypothesis is true!

Degrees of belief and subjective probability

In order to overcome the problems discussed in the last paragraph, we need to subscribe to a more general notion of probability. While we would wish to maintain the simple frequentist interpretation of probability as the long-run frequency of events in circumstances where it is appropriate, we would also like to make probabilistic statements and judgements about statistical parameters and, ultimately, scientific hypotheses.

Most statisticians now accept the concept of subjective probability, where statements involving the use of probability are taken to represent a ‘degree of personal belief’ about the quantity or event of interest (Lindley 1965a). This removes the need to associate probability with observable events (however hypothetical such events may be) and allows us to make quantitative judgements about the likelihood of an assertion being correct in circumstances where there is no reasonable long-run frequency interpretation.

A typical example occurs when we attach a probability to an event in public affairs, such as the statement ‘there is a 10% chance that Australia will become a republic before 31 December 2005’. Clearly Australia will not debate the transition from a constitutional monarchy to a republic a large number of times under identical conditions and so a relative frequency interpretation of the probability of this event is simply not possible. Frequentist statisticians should refrain from attaching probabilities to such one off events, though many events (for example, sporting contests) can be similar enough for the associated probabilities to warrant a frequentist interpretation. Although it is not strictly necessary for subjective probabilities to be based on data, they should change, in a rational manner, as new data accrue. More formally, a sequence of subjective probability statements must be internally consistent or coherent in the sense of Walley (1991) in order to avoid irrational behaviour in response to them (Walley 1991; Walley et al. 1996).

Bayesian statistics

Suppose that we plan to conduct an observational or experimental study to further our knowledge about some quantity of interest (called a statistical parameter), and thus to collect information that will provide evidence to support or refute a current hypothesis about the quantity of interest. The Bayesian approach to statistical inference (named after the 18th century English clergyman the Reverend Thomas Bayes) initially asks the researcher to collate all pre-existing information, reflecting both evidence based on past studies and current beliefs, before prospectively collecting any new data. This information is then expressed in mathematical form as a prior probability distribution. The prior distribution is simply a quantification of the current state of understanding about the unknown quantity of interest and can be thought of as attaching a weight, expressed as a probability, to each possible value of the quantity of interest before additional data are recorded and examined. Values of the quantity of interest that are viewed as being a priori fairly likely to represent the true quantity are assigned a high prior probability and those that are viewed as less likely receive a correspondingly lower prior probability.

The prior distribution by definition allows investigators to incorporate pre-existing information into their analysis, something that is more difficult to do in the frequentist theory of statistics. Lilford et al. (1995) comment that ‘Bayesian methods utilise all available data’. This provides a distinct advantage over conventional methods of analysis, which Lilford & Braunholtz (1996) rightly observe ‘do not allow decision makers to take explicit account of additional evidence’. The choice of an appropriate prior distribution is usually based on a combination of the following three sources of information:
(i) evidence from previous studies via the inspection of historical data;
(ii) consultation with experts in the field to elicit their clinical opinion, which potentially involves a degree of subjective judgement;
logical models.

New evidence from the data collected during the current study is summarized by the likelihood function (Edwards 1992; Berger & Wolpert 1988). This is a mathematical object that describes how the probability distribution of the observed data depends on the particular values of the statistical parameters that govern the chosen class of statistical models.

The last step in the Bayesian process is to combine the prior distribution with the likelihood function

[Figure: posterior density (vertical axis, 0.0 to 0.4) plotted against the fall in diastolic blood pressure in response to drug 'X' (mm Hg) (horizontal axis, 0 to 20).]
Thus there is a 95.2% posterior probability that the true fall in DBP in response to drug ‘X’ is at least 5 mm Hg.

Choosing a prior distribution

Specification of the prior distribution is a matter of ongoing concern for those contemplating the use of Bayesian methods in medical research. Clearly any conclusions drawn from a Bayesian analysis will potentially be sensitive to the choice of prior distribution. Some authors have devoted considerable thought to the process of formalizing the choice of a prior probability distribution. Freedman & Spiegelhalter (1983), Spiegelhalter & Freedman (1986), Chaloner et al. (1993) and Kadane et al. (1980) have all made some suggestions as to eliciting and quantifying the prior opinions of clinicians, but this remains a difficult task. It is sometimes fancifully suggested that clinicians and ‘consumers’ should come equipped with their own prior distribution which they can then combine with the likelihood function provided by the statistician!

If there is important pre-existing information that needs to be taken into account then it can be incorporated into a subsequent analysis by formulating a suitably descriptive prior distribution. This is a crucial step in the Bayesian process, despite the fact that it is often treated with scepticism by traditionally minded statisticians and clinicians. Nevertheless, although we do not wish to downplay the importance of choosing an appropriate prior distribution in situations where there is considerable prior knowledge, there are, to be realistic, many circumstances where little or no relevant pre-existing information is available. It is perhaps a reasonable criticism of the Bayesian approach to statistical analysis that, in this situation, attempting to specify a prior distribution is effectively trying to quantify something that does not exist. Alternatively, we may wish to restrict attention to the current data so that we can, in some sense, let the data ‘speak for themselves’, or, in the words of Lilford et al. (1995) ‘represent the information arising just from the data’. Lindley (1965a) comments ‘even when one has some appreciable prior knowledge of theta [a quantity of interest] one may like to express the posterior beliefs about theta without reference to them [i.e. the prior distribution]’. Equivalently, it is not unusual to hear researchers in a scientific context say that they need to draw an ‘objective’ inference that is untainted by the personal opinions and prejudices of those participating in the project.

Under these circumstances some statisticians have proposed what might loosely be called an objective Bayesian theory of statistical inference. They advocate the use of ‘vague’, ‘flat’ or ‘non-informative’ prior distributions that in some sense emphasize the role of the current experimental data and obviate the need for specific reference to prior beliefs (Lindley 1965b; Hughes 1993). One such distribution is the uniform probability distribution, which assigns equal prior weight to each possible value of the quantity of interest on the scale of the chosen outcome measure. Each value of the quantity of interest is viewed as ‘equally likely’ before the new data are observed, which seems intrinsically reasonable. The use of a uniform prior probability distribution focuses attention on current rather than pre-existing data (Lindley 1965b), in that the shape of the posterior distribution depends entirely on the likelihood function.

Although the use of the prior distribution has a number of shortcomings and does not truly represent a formal mathematical expression of the state of ‘prior ignorance’ (Walley 1991; Walley et al. 1996; see also the discussion), it provides an ad hoc standard or reference analysis, from a common starting point, that aids comparison between current experimental or observational data, and that obtained from other sources. Furthermore, the uniform prior distribution provides an important link between frequentist and Bayesian theories of statistical analysis, which can be conveniently illustrated by exploring the role of the confidence interval in statistical inference.

The interpretation of confidence intervals

The concept of a confidence interval was developed by frequentist statisticians in order to represent the precision of a parameter estimate as the size of an interval of values that necessarily includes the estimate itself. Confidence intervals are generated by inverting a probability statement about the data given the value of the parameters, in order to come up with a range of values for the true parameter to which we can attach a probabilistic interpretation. In order to remain faithful to the frequency based
the 14 (3.33%) infants with cardiac malformations defined as major by Kurinczuk & Bower (1997). There was some concern, however, that because of the unusually close surveillance of the Belgian cohort, the increased risk of cardiac birth defects described by Kurinczuk & Bower (1997) may have been due to the over-diagnosis of defects that would otherwise never have come to medical attention (Kurinczuk & Bower 1997; Bonduelle et al. 1997). Having excluded all cardiac defects that may (even remotely) have fallen into this category, 5 of the 420 (1.19%) infants were deemed to have at least one major cardiac defect that would definitely have been identified under routine surveillance. This was then compared to the corresponding prevalence of major cardiac defects in the population of Western Australian live births, that is, 0.67%.

Researchers in Western Australia wished to determine whether these results warranted the submission of a grant application to investigate this issue further using local ICSI data. They wished to know how likely it was that ICSI was associated with an increase in the birth prevalence of major birth defects, particularly cardiac defects, and if so, how likely it was that such an increase in cardiac defects was large, for example, greater than two-fold.

Let us initially consider how these data might be analysed in a conventional setting. The ‘null’ hypothesis will be that ‘the birth prevalence of major cardiac defects in the ICSI birth cohort is the same (0.0067) as in the general Western Australian population’. A conventional test of the null hypothesis based upon the standard Normal approximation to the binomial distribution (Armitage & Berry 1994, pp. 70–71, 118–125) would utilize a standard error for the proportion of √((0.0067 × 0.9933)/420) = 0.00398. The observed proportion in the ICSI cohort is 5/420 = 0.0119 and so the standardized Normal deviate (Z) is (0.0119 - 0.0067)/0.00398 = 1.31, which (from the usual statistical tables) is equivalent to a 2-tailed P-value of 0.191. In calculating the 95% confidence interval for the proportion, we now ignore the null hypothesis and use the observed proportion to calculate its standard error (Armitage & Berry 1994, sections 4.7, 4.9): √((0.0119 × 0.9881)/420) = 0.00529. This produces a 95% confidence interval of 0.0119 ± 1.96 × 0.00529 = 0.00153 to 0.0223.

Using these standard results the data are likely to be interpreted in one of three ways. First, it may be noted that P > 0.05 and this may be interpreted as suggesting that the null hypothesis should be ‘accepted’ and the conclusion drawn that there is no evidence of an increased prevalence of major cardiac defects in children born following ICSI. This interpretation is, of course, fundamentally incorrect. Second, it may be noted that there appears to be a potentially important increase in the prevalence of major cardiac defects in the ICSI cohort that is close to twice the corresponding proportion in the general population. However, because the result based on the five cases of cardiac defects was not statistically significant, it might be argued that this data set is too small to draw any meaningful inferences. This interpretation is safer than the first, but fails to use the data to their full potential. A third alternative is to interpret the 95% confidence interval. This interval (calculated above as 0.00153 to 0.0223) is wide and encompasses values that would lead to quite different inferences. For example, a birth prevalence of 0.002 would suggest that ICSI infants had a prevalence of major cardiac defects that was only 30% of that in the general population, whereas a prevalence of 0.0201 would suggest that it was three times as high. Both of these values are contained in the confidence interval and are therefore, in some sense, consistent with the observed data. This confirms that the sample size is too small and suggests that further study is important. This interpretation is both valid and informative and there is no question that if a standard approach to analysis is to be adopted it should be based upon confidence intervals. This approach does not, however, allow us to express some of these qualitative impressions in a quantitative manner. For example, although a prevalence of 0.0201 falls within the 95% confidence interval and is therefore ‘consistent’ with the data, it is unclear how likely it is, on the basis of this preliminary analysis, that the true birth prevalence really is this high or maybe even higher.

As an alternative, we would propose that a Bayesian analysis be carried out using a ‘non-informative’ prior distribution that is uniform on the scale of proportions. Having made this assumption we can now make use of the equivalence of a standard C% confidence interval and a Bayesian C%
credible interval. In order to generalize the ensuing calculations, let us consider what may be called a critical confidence interval. This is defined as the confidence interval with a midpoint at the observed value (in this example, at a proportion of 0.0119) and a lower limit at the value of a threshold of interest (in this example, at a proportion of 0.0067 corresponding to the ‘null value’ associated with the rate in the general population), and an upper limit that is by symmetry the same distance above the observed value as the lower limit is below. Using Z tables it is straightforward to determine the percentage coverage of this critical confidence interval. In this example, such a confidence interval on the proportion scale extends from 0.0067 to 0.0119 + (0.0119 - 0.0067) = 0.0171. This is symmetric about the observed proportion, that is 0.0119, and extends 0.0052/0.00529 = 0.983 standard errors in either direction. Reference to a table of the Z distribution indicates that 83.71% of the area under the curve lies below Z = 0.983, thus 16.29% lies above this point and, by symmetry, 16.29% of the area lies below Z = -0.983. This particular confidence interval is therefore a (100 - [2 × 16.29])% = 67.42% confidence interval which means that, having adopted a prior distribution that is uniform on the scale of proportions, the range 0.0067 to 0.0171 is a Bayesian 67.42% credible interval. This means that there is 67.42% posterior probability that the true proportion lies between 0.0067 and 0.0171 and a 16.29% posterior probability that it is greater than 0.0171. There is therefore a 67.42% + 16.29% = 83.71% posterior probability that the true proportion is greater than 0.0067 and thus a relatively high probability that the risk of a major cardiac defect in a baby conceived using ICSI is higher than the risk in the general population. Readers should note that, in this particular case, the posterior probability of 83.71% could have been obtained directly from the table of the Z distribution: ‘83.71% of the area under the curve lies below Z = +0.983’. However, we explain the calculation in terms of a two-sided confidence interval, because we believe that this clarifies the full procedure and it is appropriate under all circumstances.

In general, if the percentage coverage of the confidence interval is C%, the posterior probability that the true value of the quantity of interest exceeds the stated threshold is C% + (100% - C%)/2. This is because any value which falls inside the critical confidence interval (posterior probability = C%) must by definition exceed the threshold of interest and symmetry dictates that one half of all values which fall outside the confidence interval (posterior probability = (100% - C%)/2) will also exceed the threshold. In order to calculate the probability that the true value of a quantity of interest is less than a given threshold, one may carry out a series of analogous calculations using the critical confidence interval whose upper limit falls at the threshold.

Returning to the example, let us calculate the probability that the true proportion exceeds 0.0134, which is twice the rate in the general population. Since this value exceeds the observed value of 0.0119, we set the upper bound of the critical confidence interval to the threshold of interest, namely 0.0134, and calculate the lower bound to be as far below the observed value of 0.0119 as 0.0134 is above, giving a value of 0.0119 - (0.0134 - 0.0119) = 0.0104. This confidence interval, extending from 0.0104 to 0.0134, is ±0.2835 standard errors around the estimated proportion of 0.0119. This is a 22.32% confidence interval and the posterior probability that the true proportion exceeds 0.0134 is half of the probability lying outside this interval, or (100% - 22.32%)/2 = 38.84%.

These results tell the researcher that it is very likely (approximately 84%) that the true prevalence of major cardiac defects is greater in the ICSI cohort than in the general population and that there is close to a 40% probability that it exceeds twice the background rate. Similar calculations demonstrate that the chance that the true proportion in the ICSI cohort is as high as three times the rate in the general population is only 6.06%. To extend the characterization further, Table 1 details the posterior probability that the true proportion exceeds a series of thresholds of interest.

Analyses such as those illustrated above proved to be of considerable value to the medical scientists in Western Australia investigating the risks associated with ICSI therapy. The investigators were subsequently successful in obtaining a research grant (from the March of Dimes Birth Defects Foundation in New York) to continue their work in this area.
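The conventional analysis and the critical confidence interval calculation described above can be sketched in a few lines of Python. This is a minimal illustration under the assumptions stated in the text (a Normal approximation and a prior that is uniform on the scale of proportions), not the authors' own software. The function names (conventional_analysis, critical_interval_coverage, posterior_prob_exceeds) are hypothetical and chosen only for this example. The sketch works with the unrounded proportion 5/420, so its results may differ slightly in the last digit from the hand calculations, which use rounded values.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal distribution (mean 0, sd 1)


def conventional_analysis(cases, n, null_prop):
    """Z test and 95% confidence interval via the Normal approximation."""
    estimate = cases / n
    # Standard error under the null hypothesis, used for the Z test.
    se_null = sqrt(null_prop * (1 - null_prop) / n)
    z = (estimate - null_prop) / se_null
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    # Standard error based on the observed proportion, used for the interval.
    se_observed = sqrt(estimate * (1 - estimate) / n)
    half_width = STD_NORMAL.inv_cdf(0.975) * se_observed
    return {
        "estimate": estimate,
        "z": z,
        "p_value": p_value,
        "se_observed": se_observed,
        "ci_low": estimate - half_width,
        "ci_high": estimate + half_width,
    }


def critical_interval_coverage(estimate, se, threshold):
    """Coverage C (as a fraction) of the interval centred on the estimate
    with one limit placed exactly at the threshold of interest."""
    z = abs(estimate - threshold) / se
    return 2 * STD_NORMAL.cdf(z) - 1


def posterior_prob_exceeds(estimate, se, threshold):
    """Posterior probability that the true value exceeds the threshold,
    assuming a uniform prior so that confidence and credible intervals agree."""
    coverage = critical_interval_coverage(estimate, se, threshold)
    outside_one_tail = (1 - coverage) / 2
    if threshold <= estimate:
        # Threshold is the lower limit: the whole interval plus the upper tail.
        return coverage + outside_one_tail
    # Threshold is the upper limit: only the upper tail exceeds it.
    return outside_one_tail


if __name__ == "__main__":
    background = 0.0067  # prevalence in the general population
    result = conventional_analysis(cases=5, n=420, null_prop=background)
    print(f"Z = {result['z']:.2f}, two-tailed P = {result['p_value']:.3f}")
    print(f"95% CI: {result['ci_low']:.5f} to {result['ci_high']:.4f}")
    for multiple in (1, 2, 3):
        prob = posterior_prob_exceeds(
            result["estimate"], result["se_observed"], multiple * background
        )
        print(f"P(true proportion > {multiple} x background) = {prob:.1%}")
```

Run as a script, this should give posterior probabilities of roughly 84%, 39% and 6% for one, two and three times the background rate, in line with the figures quoted above. The probability that the true value is less than a threshold is simply one minus the value returned by posterior_prob_exceeds.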
Table 1 The posterior probability that the prevalence of major cardiac defects in the ICSI cohort exceeds a series of thresholds based on the prevalence in the general population
Threshold of interest | Threshold as a multiple of the prevalence in general population | Posterior probability that the true rate exceeds the stated threshold | Posterior probability that the true rate is less than the stated threshold
foundation on which to base a bold new theory of statistical analysis!

First, the uniform prior probability distribution does not provide a formal mathematical representation of ‘prior ignorance’. No single prior distribution is appropriate when one is faced with a complete lack of information (Walley 1991). Walley et al. (1996) note that ‘. . . any [single] Bayesian prior distribution assigns precise probabilities to hypotheses and therefore has strong behavioural implications, e.g. it precisely determines “fair” betting rates on the truth of the hypotheses.’ Bayesian statisticians would endorse repeating the analysis using many different prior distributions in the hope of encapsulating a wide range of prior beliefs about the values of the relevant parameters. This is known as a Robust Bayesian approach to analysis (Berger 1984, 1990, 1994; Greenhouse & Wasserman 1995). Such a sensitivity analysis is clearly important if we are to ascertain how the posterior distribution is affected by changes to the prior probability distribution or by changes to the model used to create the likelihood.

A second difficulty with the uniform prior distribution is its sensitivity to transformation; the uniform distribution may in fact be very non-uniform when transformed to another scale of analysis. A prior distribution that is uniform on the scale of proportions, for example, cannot simultaneously be uniform on the scale of odds and vice versa, and yet in many cases either scale would be appropriate for analysis. We would argue that if two scales really are equally appropriate, and the use of a prior which is uniform on one scale leads to a qualitatively different conclusion to an analysis based upon a prior which is uniform on the other scale, then inferences must, of course, be viewed as uncertain. One hopes that in situations where more than one analytical scale is appropriate, the choice of scale would result in relatively small quantitative changes rather than large qualitative alterations to the principal conclusions. To illustrate, if Example 2 had been worked assuming uniformity on the scale of log_e(odds) rather than on the scale of proportions, the estimated posterior probability that the true rate of cardiovascular birth defects in ICSI babies exceeded the general population rate, or twice that rate, would have been 90.1% and 39.5%, respectively. Because the sample size is so small (only five cases), these probabilities are noticeably different to the original values of 83.7% and 38.8%, respectively. Nevertheless, this change would make little or no difference to the principal conclusion of the analysis.

In most settings in medical statistics, confidence intervals are calculated in a way that assumes that the distribution of the data, or at least a relevant summary statistic, can be approximated by a suitable Normal distribution. The correspondence of a C% confidence interval calculated using such an approximation to a C% credible interval is therefore only exact when the data are Normally distributed. For most of the standard probability distributions used to analyse and model medical data, the approximation is, in general, quite close even when the sample size is relatively small (for an example, see Burton (1994)).

One of the problems with Bayesian analysis is that it is often a non-trivial problem to combine the prior information and the current data to produce the posterior distribution. Despite the increasing availability of purpose-designed software for Bayesian analysis (BUGS, Spiegelhalter et al. 1995), specialist advice and software are generally required in order to bring Bayesian statistics into the medical research workplace. The congruence between conventional confidence intervals and Bayesian credible intervals generated using a uniform prior distribution does, however, provide a simple way to obtain inferences in Bayesian form which can be implemented using standard software based on the results and output of a conventional statistical analysis.

The use of Bayesian methods is growing amongst clinical scientists and clinicians. The congruence between a Bayesian analysis using a uniform prior and a conventional analysis provides a non-threatening introduction to Bayesian methods and means that analyses of the type we describe can be carried out on standard software. Our approach is straightforward to implement, offers the potential to describe the results of conventional analyses in a manner that is more easily understood, and leads naturally to rational decisions. We do not suggest that this approach should be used all the time, nor should it be used as an excuse for designing studies which are too small or as a fallback position when a conventional analysis fails to produce a statistically significant result. However, when it is used appropriately, we
believe that this approach is a useful addition to conventional methods.

Acknowledgements

Jennifer Kurinczuk gratefully acknowledges receipt of a two year project grant from the March of Dimes Birth Defects Foundation, New York (#6-FY98–497; #6-FY99–683).

This work was funded in part by the National Health and Medical Research Council of Australia as one component of Program Grant #96\3209.

References

Armitage P. & Berry G. (1994) Statistical Methods in Medical Research. 3rd edn. Blackwell Scientific Publications, Oxford.
Berger J. (1984) The robust Bayesian viewpoint (with discussion). In: Robustness in Bayesian Analyses. (Ed. J. Kadane). North-Holland, Amsterdam. pp. 63–144.
Berger J. (1990) Robust Bayesian analysis: sensitivity to the prior. Journal of Statistical Planning and Inference 25, 303–328.
Berger J.O. (1994) An overview of robust Bayesian analysis (with discussion). Test 3, 5–124.
Berger J.O. & Wolpert R.L. (1988) The Likelihood Principle. 2nd edn. Institute of Mathematical Statistics, Hayward, California.
Bonduelle M., Desmyttere S., Buysse A., Van Assche E., Schietecatte J., Devroey P. et al. (1994) Prospective follow-up study of 55 children born after subzonal insemination and intracytoplasmic sperm injection. Human Reproduction 9, 1765–1769.
Bonduelle M., Legein J., Buysse A., Van Assche E., Wisanto A., Devroey P. et al. (1996) Prospective follow-up study of 423 children born after intracytoplasmic sperm injection. Human Reproduction 11, 1558–1564.
Bonduelle M., Devroey P., Liebaers I. & Van Steirteghem A. (1997) Commentary: Major defects are overestimated. British Medical Journal 315, 1265–1266.
Box G.E.P. & Tiao G.C. (1973) Bayesian inference in statistical analysis. Addison-Wesley, Reading, Massachusetts.
Burton P.R. (1994) Helping doctors to draw appropriate inferences from the analysis of medical studies. Statistics in Medicine 13, 1699–1713.
Burton P.R., Gurrin L.C. & Campbell M.J. (1998) Clinical significance not statistical significance: a simple Bayesian alternative to p values. Journal of Epidemiology and Community Health 52, 318–323.
Chaloner K., Church T., Louis T.A. & Matts J.P. (1993) Graphical elicitation of a prior distribution for a clinical trial. The Statistician 42, 341–353.
Cummins J.M. & Jequier A.M. (1994) Treating male infertility needs more clinical andrology, not less. Human Reproduction 9, 1214–1219.
De Kretser D.M. (1995) The potential of intracytoplasmic sperm injection (ICSI) to transmit genetic defects causing male infertility. Reproduction Fertility and Development 7, 137–142.
Edwards A.W.F. (1992) Likelihood. Johns Hopkins University Press, Baltimore.
Freedman L.S. & Spiegelhalter D.J. (1983) The assessment of subjective opinion and its use in relation to stopping rules for clinical trials. The Statistician 32, 153–160.
Freeman P.R. (1993) The role of p-values in analysing trials results. Statistics in Medicine 12, 1443–1452.
Greenhouse J.B. & Wasserman L. (1995) Robust Bayesian methods for monitoring clinical trials. Statistics in Medicine 14, 1379–1391.
Hughes M.D. (1993) Reporting Bayesian analyses of clinical trials. Statistics in Medicine 12, 1651–1663.
Kadane J.B., Dickey J.M., Winkler R.L., Smith W.S. & Peters S.C. (1980) Interactive elicitation of opinion for a normal linear model. Journal of the American Statistical Association 75, 845–854.
Kurinczuk J.J. & Bower C. (1997) Birth defects in infants conceived by intracytoplasmic sperm injection: an alternative interpretation. British Medical Journal 315, 1260–1265.
Lee P.M. (1989) Bayesian statistics: an introduction. Arnold, London.
Liebaers I., Bonduelle M., Legein J., Wilikens E., Van Assche E., Buysse A. et al. (1995) Follow-up of children born after intracytoplasmic sperm injection. In: Fertility and Sterility: a Current Overview. (Hedon B., Bringer J., Mares P. eds.) Parthenon, New York.
Lilford R.J., Thornton J.G. & Braunholtz D. (1995) Clinical trials and rare diseases: a way out of a conundrum. British Medical Journal 311, 1621–1625.
Lilford R.J. & Braunholtz D. (1996) The statistical basis of public policy: a paradigm shift is overdue. British Medical Journal 313, 603–607.
Lindley D.V. (1965a) Introduction to probability and statistics from a Bayesian viewpoint. Part 1 Probability. Cambridge University Press, Cambridge. pp. 19–25, 29–42, 50, 58.
Lindley D.V. (1965b) Introduction to probability and statistics from a Bayesian viewpoint. Part 2 Inference. Cambridge University Press, Cambridge. pp. 1–13, 15, 18, 19.
Palermo G., Joris H., Devroey P. & Van Steirteghem A.C. (1992) Pregnancies after intracytoplasmic injection of a single spermatozoon into an oocyte. Lancet 340, 17–18.
Patrizio P. (1995) Intracytoplasmic sperm injection (ICSI): potential genetic concerns. Human Reproduction 10, 2520–2523.
Spiegelhalter D.J. & Freedman L.S. (1986) A predictive approach to selecting the size of a clinical trial, based on subjective opinion. Statistics in Medicine 5, 1–13.
Spiegelhalter D., Thomas A., Best N. & Gilks W. (1995) BUGS. Bayesian inference using Gibbs sampling, Version 0.60. MRC Biostatistics Unit, Cambridge. http://www.mrc-bsu.cam.ac.uk/bugs.
Tournaye H., Liu J., Nagy Z., Joris H., Wisanto A., Bonduelle M. et al. (1995) Intracytoplasmic sperm injection (ICSI): the Brussels Experience. Reproduction Fertility and Development 7, 269–279.
Van Steirteghem A.C., Nagy P., Liu J., Joris H., Smitz J., Camus M. et al. (1994) Intracytoplasmic sperm injection – ICSI. Reproductive Medical Review 3, 199–207.
Walley P. (1991) Statistical reasoning with imprecise probabilities. Chapman & Hall, London.
Walley P., Gurrin L. & Burton P. (1996) Analysis of clinical data using imprecise prior probabilities. The Statistician 45, 457–486.
Winkler R.L. (1972) An introduction to Bayesian inference and decision. Holt, Rinehart and Winston Inc., New York. pp. 395–396.