The Importance of Sample Size Calculation in Clinical Research

By Francesca Botta - Senior Biostatistician

INTRODUCTION

In clinical research, the definition of sample size for a clinical trial is one of the most important aspects of the trial’s success.
The sample size should be properly calculated during the study design phase, because it is a fundamental step of a clinical study from several points of view, including methodological, ethical, and financial perspectives.
The aim of this article is to define sample size, describe the assumptions and hypotheses underlying its calculation, and discuss why sample size calculation is so important for clinical research to be successful.

WHAT IS THE SAMPLE SIZE?

The sample size is the number of subjects that need to be included in a clinical trial to allow the detection of a clinically relevant treatment effect (i.e., a treatment outcome that physicians would generally identify as important).
The main objective of a clinical study is to demonstrate that an experimental treatment has a good safety profile and has a given effect on the population of interest, i.e. a homogeneous group affected by a given disease with a certain severity.
Since it is not possible to recruit the entire population of interest (e.g., everyone with a given disease, or everyone who would be treated with the experimental treatment), a subgroup of individuals is enrolled in the study to represent the overall population.
The collaboration between biostatisticians and clinicians is important to calculate the most appropriate sample size to be enrolled in a trial.

WHAT ARE THE KEY ELEMENTS OF SAMPLE SIZE CALCULATION?

An appropriate sample size calculation depends on the pre-specified statistical hypotheses and the following assumptions: 

  • Primary endpoint: the primary endpoint of the study is a fundamental aspect of sample size calculation because the statistical methodology that will be applied to estimate the sample size depends on the nature of the primary endpoint (i.e., continuous, categorical, or time-to-event endpoint). Moreover, the statistical test that will be applied during the final analysis of the primary endpoint should be the same one considered in the assumptions of sample size calculation (i.e., if the primary analysis will be based on a Chi-square test between two independent groups, the same statistical test should be considered during the sample size calculation).
  • Number of treatment groups involved: the choice of the statistical methodology (i.e., independent or paired test, etc.) that will be applied for sample size calculation depends on the number of treatment groups involved in the clinical investigation.
  • Type I error (alpha): it is the rejection of a true null hypothesis (i.e., of the hypothesis that there is no difference between the effects of two treatments). The type I error rate (α), also called the significance level, is the probability of rejecting the null hypothesis given that it is true (i.e., of concluding that the difference between two treatments’ outcomes is statistically significant when no real difference exists). In other words, it corresponds to a false-positive result; at the analysis stage, the observed p-value is compared against α. By convention, the alpha level is set to 0.05, which means that it is tolerable to have a 5% chance of wrongly rejecting the null hypothesis. The lower the alpha error, the larger the required sample size will be.
  • Type II error (beta): it is the non-rejection of a false null hypothesis. The type II error rate, indicated by β, is the probability of not rejecting the null hypothesis given that it is false (i.e., of concluding that there is no difference between two treatments’ outcomes when a difference actually exists). In other words, it corresponds to a false-negative result. By convention, the beta level is set to 0.20, which means that it is tolerable to have a 20% chance of a false-negative conclusion.
  • Power: it is the complement of β (i.e., 1 − β) and represents the desired probability of detecting the expected difference between two treatments’ outcomes, if it truly exists, at significance level α (i.e., the probability of correctly rejecting the null hypothesis when it is false). By convention, the power is set to 0.80, meaning the chance of correctly rejecting a false null hypothesis is at least 80%. The higher the power, the larger the required sample size will be.
  • Minimal meaningful detectable difference: it is the smallest difference between treatments regarded as clinically relevant for the management of subjects, or based on a judgment concerning the anticipated effect of a new treatment (i.e., the difference between two treatments’ outcomes that study investigators consider both biologically plausible and relevant from a clinical point of view). The larger this expected difference is, the smaller the required sample size will be.
  • Outcome variability: it is a measure of the dispersion of the data points in a specific population. It can be obtained from published literature or from pilot studies, but this is not always feasible. In those cases in which the outcome variability is not known, an estimate should be used and its impact carefully assessed (e.g., through sensitivity analyses). The greater the expected variability in the specified outcome, the larger the required sample size will be.
  • Drop-out rate: it is the expected proportion of subjects who may leave the study for any reason. The sample size calculation gives the number of subjects that must be analyzed to test the study hypothesis at the desired power; therefore, in clinical practice, more subjects usually need to be enrolled to compensate for these potential dropouts. Given the estimated sample size n and the drop-out rate d (expressed as a percentage), the total sample size to be enrolled in the study, according to Freedman’s formula, is:

N = n / (1 - d/100)
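The elements above can be combined in a short worked calculation. The sketch below assumes a two-sided test comparing the means of two independent groups, using the standard normal-approximation formula n = 2(z₁₋α/₂ + z₁₋β)²σ²/Δ², and then applies Freedman’s drop-out adjustment. The numerical values (σ, the minimal detectable difference, and the drop-out rate) are purely hypothetical, chosen for illustration:

```python
from math import ceil
from statistics import NormalDist  # standard library, Python 3.8+


def per_group_n(alpha: float, power: float, sigma: float, delta: float) -> int:
    """Per-group sample size for a two-sided test comparing two means:
    n = 2 * (z_{1-alpha/2} + z_{1-beta})^2 * sigma^2 / delta^2."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * (sigma / delta) ** 2
    return ceil(n)  # always round up to a whole subject


def adjust_for_dropout(n: int, d: float) -> int:
    """Freedman's adjustment: N = n / (1 - d/100), with d in percent."""
    return ceil(n / (1 - d / 100))


# Hypothetical assumptions: alpha = 0.05, power = 0.80,
# outcome SD = 10, minimal detectable difference = 5, drop-out rate = 10%.
n = per_group_n(alpha=0.05, power=0.80, sigma=10, delta=5)
total = adjust_for_dropout(2 * n, d=10)
print(n, total)  # 63 subjects per group; 140 subjects to enroll in total
```

In practice, such calculations are performed with validated statistical software and adapted to the actual design and endpoint type; the sketch only illustrates how alpha, power, variability, the minimal detectable difference, and the drop-out rate jointly drive the required sample size.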

WHY IS THE SAMPLE SIZE SO IMPORTANT?

The calculation of an appropriate sample size is an important aspect of clinical investigation because two situations should be avoided:

  • Underpowered studies: these are studies with a high chance of missing their objectives because the sample size is too small. An underpowered clinical trial lacks the power needed to answer the clinical question clearly and to demonstrate the beneficial effect of a new intervention, to the detriment of future subjects. This may lead to improperly negative or inconclusive trial results. Moreover, it would be unethical to expose individuals, even if few, to pointless and unnecessary risks: the uncertainty associated with clinical research (e.g., severe adverse events or a less effective treatment) is worthwhile only if there is an actual chance of obtaining conclusive results.
  • Overpowered studies: these are studies with an unduly large sample size. An overpowered study may be unnecessarily time-consuming, expensive, and unethical, since it deprives some patients of the superior treatment. Moreover, oversized trials may detect statistically significant differences that are not actually of clinical interest.

Last but not least, a formal sample size calculation is required by Health Authorities (and also by high-impact journals) for confirmatory trials. Furthermore, a sample size justification is also required for non-pivotal studies when a power-based calculation is not feasible.

CONCLUSION

In conclusion, sample size calculation is one of the first and most important steps in planning a clinical trial: any negligence in its estimation may lead to the rejection of an efficacious treatment or the approval of an ineffective one. Although techniques for sample size calculation are described in many statistical textbooks, performing these calculations can be complicated, and it is advisable to consult an experienced biostatistician to estimate this vital study parameter.

REFERENCES AND NOTES

  • Andrade C. Sample Size and its Importance in Research. Indian J Psychol Med. 2020 Jan 6;42(1):102-103. doi: 10.4103/IJPSYM.IJPSYM_504_19. PMID: 31997873; PMCID: PMC6970301.
  • Biau DJ, Kernéis S, Porcher R. Statistics in brief: the importance of sample size in the planning and interpretation of medical research. Clin Orthop Relat Res. 2008 Sep;466(9):2282-8. doi: 10.1007/s11999-008-0346-9. Epub 2008 Jun 20. PMID: 18566874; PMCID: PMC2493004.
  • Faber J, Fonseca LM. How sample size influences research outcomes. Dental Press J Orthod. 2014 Jul-Aug;19(4):27-9. doi: 10.1590/2176-9451.19.4.027-029.ebo. PMID: 25279518; PMCID: PMC4296634.
  • Fogel DB. Factors associated with clinical trials that fail and opportunities for improving the likelihood of success: A review. Contemp Clin Trials Commun. 2018 Aug 7;11:156-164. doi: 10.1016/j.conctc.2018.08.001. PMID: 30112460; PMCID: PMC6092479.
  • Gupta KK, Attri JP, Singh A, Kaur H, Kaur G. Basic concepts for sample size calculation: Critical step for any clinical trials! Saudi J Anaesth 2016;10:328-31.
  • ICH (International Council for Harmonisation of Technical Requirements for Human Use) Topic E 9 Statistical Principles for Clinical Trials. Chapter 3.5.
  • Martínez-Mesa J, González-Chica DA, Bastos JL, Bonamigo RR, Duquia RP. Sample size: how many participants do I need in my research? An Bras Dermatol. 2014 Jul-Aug;89(4):609-15. doi: 10.1590/abd1806-4841.20143705. PMID: 25054748; PMCID: PMC4148275.
  • Noordzij M, Tripepi G, Dekker FW, Zoccali C, Tanck MW, Jager KJ. Sample size calculations: basic principles and common pitfalls. Nephrol Dial Transplant. 2010 May;25(5):1388-93. doi: 10.1093/ndt/gfp732. Epub 2010 Jan 12. Erratum in: Nephrol Dial Transplant. 2010 Oct;25(10):3461-2. PMID: 20067907.
  • Nyirongo VB, Mukaka MM, Kalilani-Phiri LV. Statistical pitfalls in medical research. Malawi Med J. 2008 Mar;20(1):15-8. doi: 10.4314/mmj.v20i1.10949. PMID: 19260441; PMCID: PMC3345655.
  • Wang X, Ji X. Sample Size Estimation in Clinical Research: From Randomized Controlled Trials to Observational Studies. Chest. 2020 Jul;158(1S):S12-S20. doi: 10.1016/j.chest.2020.03.010. PMID: 32658647.