Differences and non-differences
July 10, 2019•173 words
Statistical significance has nothing to do with practical importance or scientific relevance; statistical significance reflects sampling uncertainty. Moreover, the number of statistically significant findings that can be expected in a study depends on the study design, not least the sample size, the number of statistical tests performed, and the strategy used for addressing multiplicity issues.
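To make the dependence on the number of tests concrete, here is a minimal sketch. It assumes that all tests are independent and that every null hypothesis is true, which is an idealization; the function names and the 5% significance level are illustrative choices, not something taken from a particular study.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of significant results when every null hypothesis is true."""
    return num_tests * alpha


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        print(
            f"tests={m:3d}  "
            f"expected false positives={expected_false_positives(m):.2f}  "
            f"P(at least one)={familywise_error_rate(m):.3f}  "
            f"Bonferroni threshold={bonferroni_threshold(m):.4f}"
        )
```

Under these assumptions, twenty uncorrected tests are expected to produce about one "significant" finding even when no real effect exists. Bonferroni is only one of several possible multiplicity strategies and is used here because it is the simplest to state.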
Successful investigators design their experiments so that practically important and scientifically relevant differences or effects can be detected. Such a design includes a sample size calculation based on a reasonable estimate of what is practically important, a data collection procedure that prevents selection bias and confounding, and a strategy for addressing multiplicity issues. In observational research, similar problems have to be resolved in the statistical analysis instead of in the study design. However, entirely disregarding these problems and simply interpreting statistical significance as an indication of practical importance, and statistical non-significance as an indication of equivalence, reflects a fundamental misunderstanding. P-values are not, and have never been, a substitute for scientific reasoning.
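As an illustration of the sample size calculation mentioned above, the following sketch uses the common normal-approximation formula for comparing two group means, n = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2 per group. The function name, the default 5% two-sided significance level, and the 80% power are assumptions chosen for the example; a real study would justify these values and might use a t-based or simulation-based calculation instead.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    min_important_diff: float,
    std_dev: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate sample size per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation and assumes equal group sizes and a common
    standard deviation in both groups.
    """
    if min_important_diff <= 0 or std_dev <= 0:
        raise ValueError("min_important_diff and std_dev must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)              # quantile for the desired power
    n = 2.0 * (z_alpha + z_power) ** 2 * (std_dev / min_important_diff) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical numbers: a difference of 5 units is considered practically
    # important, and the outcome has a standard deviation of 10 units.
    print(sample_size_per_group(min_important_diff=5.0, std_dev=10.0))
```

The point of the calculation is that the smallest practically important difference is an input decided on scientific grounds before any data are collected. Halving that difference roughly quadruples the required sample size, which is why a p-value cannot be interpreted without knowing what the study was designed to detect.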