Statistical significance — a misconstrued notion in medical research

Nurminen M.

Scandinavian Journal of Work Environment and Health 1997;23(3):232-5.

Abstract. The P-value, or significance probability, is the probability of obtaining, under the null
hypothesis, a value of the test statistic at least as extreme as that observed. Medical researchers may, in some
situations, disagree on its appropriate use or on its interpretation as a summary measure of
consistency with the null hypothesis in a particular data set. More informative statistical measures,
such as the likelihood ratio and the Bayesian posterior probability, have been suggested for drawing
inferences from clinical trials and epidemiologic studies. Causal inference is not statistical in nature;
rather it strives to provide scientific explanations or criticisms of proposed explanations that would
describe the observed data pattern. In this context, it is important to remember that a finding may
not be medically important, and a causal hypothesis may not even be true, even when a study yields a
statistically significant P-value.

Key words: Bayesian tests, Causal inference, Data interpretation, Epidemiologic methods, Multiple
comparisons, Null hypothesis, Power, Statistical inference, Tests of significance

In clinical trials and epidemiology, “statistically significant” is a chronically misinterpreted concept.
The misconception stems both from confusing terminology and from the difficulty of statistical theory.
The everyday word “significant” has a special meaning in statistical work: the consistency of the data
with a hypothesis is measured by the “significance probability”, or P-value.
Finney (1) has proposed that the adverb “statistically” should always precede the word
“significant” whenever its meaning could otherwise be in doubt.
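
To make the term “significance probability” concrete, the following minimal sketch computes a one-sided P-value for an exact binomial test. The data are hypothetical and are not taken from this article: 15 responders among 20 patients, tested against a null response probability of 0.5. The function name `binomial_upper_tail_p` is likewise an illustrative assumption, not an established routine.

```python
from math import comb


def binomial_upper_tail_p(k, n, p0):
    """One-sided P-value: P(X >= k) when X ~ Binomial(n, p0).

    This is the probability, under the null hypothesis, of a result
    at least as extreme (in the upper direction) as the observed k.
    """
    return sum(
        comb(n, i) * p0**i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical data (assumption for illustration only):
# 15 responders out of 20 patients, null response probability 0.5.
p_value = binomial_upper_tail_p(k=15, n=20, p0=0.5)
print(round(p_value, 4))  # 0.0207
```

A value of about 0.02 says only that such data would be unusual if the null hypothesis were true. It does not measure the size of the effect, its medical importance, or the probability that the null hypothesis is true.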


