Introduction
Statistics is an inseparable part of biomedical research from the stages of planning
to the final publication.[1] For research conducted by an undergraduate or postgraduate
medical student, statistics is a fairly new domain to explore. Although a few basic
statistical methods are taught in the undergraduate course, their practical application
is limited in the majority of institutions. Hence, students seek help from their immediate
seniors, mentors, or expert statisticians.[2] Many statistical tests can be
calculated manually, but doing so is time-consuming. Software packages minimize the
effort; some of these packages are free and others are
paid. Resource-limited settings in developing countries may find it difficult
to procure paid software because of financial constraints. The next option is to use
free software packages (e.g., Epi Info) on a personal computer. However, many researchers
may not have access to a personal computer or may have technical difficulty using the software.
Hence, we searched for websites that run the software in the internet browser and provide
a free service to users.
In this article, we aimed to provide a brief technical guide on common statistical
tests that can be conducted from any computer connected to the internet.
Selection of Statistical Test
The common statistical tests used for numerical data (e.g., the age, height, weight
of research participants) are shown in Figure 1 and those for categorical data (e.g.,
sex [male/female/intersex], presence of disease [yes/no], socioeconomic status [I–V])
are shown in Figure 2.[1,3]
Figure 1
Examples of some common statistical tests for numerical data
Figure 2
Examples of some common statistical tests for categorical data
Websites for Statistical Tests
Many websites provide multiple statistical tests. Table 1 shows the tests and some
of the websites that provide these tests. The table is intended to make readers aware of different
websites so that they can make an informed choice for their future statistical tests.
Along with the statistical tests listed in Figures 1 and 2, we included central
tendency, frequency distribution, and normality tests, as these are basic statistics
needed even before the selection of appropriate tests.
Table 1
Websites to conduct common statistical tests online
Statistics
https://www.statskingdom.com/
https://www.socscistatistics.com/
https://www.graphpad.com/quickcalcs/
https://epitools.ausvet.com.au/
https://astatsa.com/
Central tendency and frequency distribution
√
√
√
Normality test
√
√
√
One-sample t-test
√
√
√
√
One-sample median test
√
√
√
Unpaired t-test
√
√
√
√
Mann-Whitney U test
√
√
√
√
Paired t-test
√
√
√
√
Wilcoxon signed-rank test
√
√
√
√
One-way analysis of variance (ANOVA)
√
√
√
Kruskal-Wallis test
√
√
√
Repeated-measure ANOVA
√
Friedman Test
√
√
Pearson correlation test
√
√
√
Spearman correlation test
√
√
√
Binomial test
√
√
Chi-square test
√
√
√
√
Fisher’s exact test
√
√
√
McNemar test
√
√
√
This is not a comprehensive list of statistical tests. Because websites are dynamic,
some of the tests offered by each website may be missing from this list. Readers
are encouraged to find more free tools online.
Descriptive Statistics
Central tendency and frequency distribution
Measures of central tendency and dispersion are the most commonly used descriptive statistics.
Almost all research data are summarized as mean, standard deviation, median, interquartile
range, mode, and range. A frequency distribution is also used to group the observations
into different categories.
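As a rough illustration of the quantities named above, the following Python sketch computes them with the standard library only. The ages are hypothetical values invented for this example, and the function names `describe` and `frequency_distribution` are our own. Quartiles can be computed by several conventions, so a website may report slightly different quartile values for the same data.

```python
import statistics
from collections import Counter

# Hypothetical ages (years) of ten participants; not data from any study.
ages = [23, 25, 25, 27, 30, 31, 34, 38, 41, 56]

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    # Quartile cut points; Python's default is the 'exclusive' method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
    }

def frequency_distribution(values, width):
    """Count observations in bins of the given width (e.g., 20-29, 30-39)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

summary = describe(ages)
print(summary)
print(frequency_distribution(ages, 10))  # keys are the lower bin limits
```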
Normality test
Figures 1 and 2 show why it is important to check the normality of
the data set: one type of test applies to normally distributed data and another
type applies to data that are not normally distributed. The normality check therefore
guides the choice of inferential statistical test.[4] If the data are not normally distributed,
they are commonly presented as median, quartiles, and interquartile range.
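The way the normality result steers the choice of test can be sketched as a simple lookup. The pairings below are the ones described in the numerical-data sections of this article. The design labels and the function name `choose_test` are hypothetical names chosen for this sketch, not part of any website or package.

```python
# Keys: (study design, are the data normally distributed?).
# Design labels are hypothetical; test names follow this article.
TESTS = {
    ("one_sample", True): "One-sample t-test",
    ("one_sample", False): "One-sample median test",
    ("two_independent_samples", True): "Unpaired t-test",
    ("two_independent_samples", False): "Mann-Whitney U test",
    ("two_paired_measurements", True): "Paired t-test",
    ("two_paired_measurements", False): "Wilcoxon signed-rank test",
    ("more_than_two_independent_samples", True): "One-way ANOVA",
    ("more_than_two_independent_samples", False): "Kruskal-Wallis test",
    ("more_than_two_repeated_measurements", True): "Repeated-measure ANOVA",
    ("more_than_two_repeated_measurements", False): "Friedman test",
    ("correlation", True): "Pearson correlation test",
    ("correlation", False): "Spearman correlation test",
}

def choose_test(design, normally_distributed):
    """Return the test paired with a design and a normality result."""
    try:
        return TESTS[(design, normally_distributed)]
    except KeyError:
        raise ValueError(f"unknown design: {design!r}") from None

print(choose_test("two_independent_samples", False))
```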
Inferential Statistics: Numerical Data
One-sample t-test and median test
When the observations come from one sample and their mean or median needs
to be compared with a reference value, a one-sample t-test or one-sample
median test is used. When the data are normally distributed, the one-sample t-test
is used to compare the sample mean with a reference value. When the data are not normally
distributed, the one-sample median test is used to compare the sample median with
the reference value.
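For readers who want to see what a website computes for the one-sample t-test, the sketch below calculates the t statistic and its degrees of freedom. The haemoglobin values and the reference value are hypothetical, and `one_sample_t` is our own helper name. Converting t to a p-value requires the t distribution, which the Python standard library does not provide, so that step is left to the online calculators.

```python
import math
import statistics

def one_sample_t(values, reference):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean uses the sample standard deviation (n - 1).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (mean - reference) / standard_error, n - 1

# Hypothetical haemoglobin values (g/dL) compared with a reference of 12.0.
haemoglobin = [12.1, 13.4, 11.8, 12.9, 13.0, 12.5, 11.9, 13.2]
t_stat, df = one_sample_t(haemoglobin, 12.0)
print(f"t = {t_stat:.3f}, df = {df}")
```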
Unpaired t-test and Mann–Whitney U test
When the observations come from two independent samples, either the unpaired t-test or the
Mann–Whitney U test is used to compare the mean or median, respectively. For example,
if the mean urticarial activity score is to be compared between male and female research
participants, and the data are normally distributed, the unpaired t-test is used.
If the data are not normally distributed, the Mann–Whitney U test is used.
Paired t-test and Wilcoxon signed-rank test
When two measurements come from the same sample (paired data), either the paired t-test or the Wilcoxon signed-rank
test is used. For example, a new treatment regime was applied to a sample and the
eosinophil count was measured before and after the treatment. If the data are normally
distributed, a paired t-test is used to compare the mean eosinophil count before and
after the treatment. If the data are not normally distributed, the median eosinophil counts
before and after the treatment are compared with the Wilcoxon signed-rank test.
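A paired t-test is equivalent to a one-sample t-test on the within-participant differences, tested against zero. The sketch below shows that calculation. The eosinophil counts are hypothetical and `paired_t` is our own helper name. As with any t-test, the p-value lookup needs the t distribution and is left to the online tools.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("paired data need equal numbers of measurements")
    # Work on the difference within each participant.
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

# Hypothetical eosinophil counts (cells/µL) before and after treatment.
before = [520, 610, 480, 700, 560, 630]
after = [430, 580, 470, 610, 500, 560]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```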
One-way analysis of variance (ANOVA) and Kruskal–Wallis test
When the observations come from more than two samples, the one-way analysis of variance (ANOVA)
or Kruskal–Wallis test is used. For example, if the urticarial activity scores among
males, females, and the intersex group are to be compared and the data are normally
distributed, one-way ANOVA is used to compare the means. If the data are not normally
distributed, the Kruskal–Wallis test is used to compare the medians.
A significant result establishes that there is a difference
among the means or medians of the three groups. However, the ANOVA or Kruskal–Wallis
test does not reveal which pairs (e.g., male–female, male–intersex, female–intersex)
differ significantly. To find this out, a post-hoc test is carried out. For ANOVA,
Tukey’s honestly significant difference (Tukey’s HSD) is used and for the Kruskal–Wallis
test, Dunn’s test is used. If Dunn’s test is not available online, a pair-wise Mann–Whitney
U test with Bonferroni correction can be used to compare the pairs (e.g., male–female,
male–intersex, female–intersex).[5] In the Bonferroni correction, α = 0.05 is divided by the
number of pair-wise comparisons; three groups give three pairs, so the corrected α = 0.05/3 ≈ 0.0167.
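The Bonferroni arithmetic can be sketched in a few lines of Python. The divisor is the number of pair-wise comparisons, which equals the number of groups only when there are exactly three groups. With four groups there are six pairs. The function name `bonferroni_alpha` is our own.

```python
from itertools import combinations

def bonferroni_alpha(groups, alpha=0.05):
    """Return (list of pairs, corrected alpha) for all pair-wise comparisons."""
    pairs = list(combinations(groups, 2))
    return pairs, alpha / len(pairs)

pairs, corrected = bonferroni_alpha(["male", "female", "intersex"])
for first, second in pairs:
    print(f"{first}-{second}: significant if p < {corrected:.4f}")
```

A pair-wise p-value is then declared significant only if it falls below the corrected α rather than below 0.05.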
Repeated-measure ANOVA and Friedman test
When the observations come from one sample with more than two measurements, repeated-measure
ANOVA or the Friedman test is used. For example, after the application of a new drug regimen,
the eosinophil count was measured after the 1st, 2nd, and 3rd weeks of treatment.
If the data are normally distributed, repeated-measure ANOVA is used to compare the
means. If the data are not normally distributed, the Friedman test is used to compare
the medians.
If there is a significant difference, a post-hoc test is to be run. For ANOVA, a pair-wise
paired t-test with Bonferroni correction is carried
out to compare the means. For the Friedman test, a multiple pair-wise Wilcoxon signed-rank
test with Bonferroni correction is used to compare the medians.
Pearson’s correlation test and Spearman’s correlation test
When the relationship between two numerical variables needs to be tested, Pearson’s correlation
test or Spearman’s correlation test is used. For example, if the relationship between
the urticarial activity score and the Pittsburgh sleep quality index score is to be tested
and the data are normally distributed, Pearson’s correlation test is used; if the data are not
normally distributed, Spearman’s correlation test is used. The correlation coefficient
ranges from –1 to +1.
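The two coefficients are closely related: Spearman’s coefficient is Pearson’s coefficient calculated on the ranks of the data, with tied values sharing their average rank. The sketch below illustrates this. The scores are hypothetical and the function names are our own. It returns the coefficients only, not the p-values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical urticarial activity and sleep quality index scores.
activity = [4, 10, 15, 21, 28, 35]
sleep = [3, 5, 6, 9, 14, 19]
print(f"Pearson r = {pearson(activity, sleep):.3f}")
print(f"Spearman rho = {spearman(activity, sleep):.3f}")
```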
Inferential Statistics: Categorical Data
Binomial test
When there is one variable with a dichotomous outcome, a binomial test is used. For
example, the outcome of a treatment regimen can be recorded as success or failure, and the
binomial test compares the observed proportion of successes with an expected proportion.
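As an illustration, the exact two-sided binomial p-value can be computed directly from the binomial probabilities. The sketch below uses one common two-sided convention: it sums the probabilities of all outcomes that are no more likely than the observed one. Websites may use a different convention. The counts are hypothetical and `binomial_two_sided_p` is our own helper name.

```python
from math import comb

def binomial_two_sided_p(successes, n, p0=0.5):
    """Exact two-sided binomial p-value against an expected proportion p0."""
    # Probability of each possible number of successes under p0.
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Sum outcomes no more likely than the observed one; the small relative
    # tolerance guards against floating-point ties.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical example: 8 successes in 10 patients, expected proportion 0.5.
p_value = binomial_two_sided_p(8, 10, 0.5)
print(f"p = {p_value:.4f}")
```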
Chi-square test and Fisher’s exact test
When there are two or more samples and two or more outcome categories, the Chi-square test is used. For
example, with two samples (smokers and non-smokers) and two outcomes (having
oral carcinoma and not having carcinoma), a 2 × 2 contingency table can be created
to conduct the Chi-square test to find any relationship between smoking and oral cancer.
When there are more than two rows or columns (e.g., a 4 × 4 contingency table), the
Chi-square test should be coupled with a post-hoc 2 × 2 Chi-square test with Bonferroni
correction of alpha. If the expected frequency in any cell is less than five, Fisher’s exact test is used
instead of the Chi-square test.
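For a 2 × 2 table, the Chi-square calculation can be sketched with the standard library alone. The counts below are hypothetical, the function names are our own, and no continuity (Yates) correction is applied, so a website that applies one will report a slightly different result. With one degree of freedom, the p-value can be obtained from the complementary error function.

```python
import math

def chi_square_2x2(table):
    """Return (chi-square, p-value, expected counts) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count = row total x column total / grand total.
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]
    chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, expected

def needs_fisher(expected):
    """True if any expected count is below five."""
    return any(cell < 5 for row in expected for cell in row)

# Hypothetical counts. Rows: smokers, non-smokers.
# Columns: oral carcinoma, no carcinoma.
table = [[30, 70], [10, 90]]
chi2, p_value, expected = chi_square_2x2(table)
print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")
print("Use Fisher's exact test instead:", needs_fisher(expected))
```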
McNemar test
When there is one sample and two variables or two matched samples and one variable,
the McNemar test is used. For example, a new drug was applied to a sample and two
variables were measured: decrease in itching (yes/no) and decrease in eosinophil count
(yes/no). A 2 × 2 contingency table is then created with the number of patients of each of
the four types: decreased itching + decreased eosinophil, decreased itching + not decreased
eosinophil, not decreased itching + decreased eosinophil, and not decreased itching +
not decreased eosinophil. The result shows whether the proportion of participants with
decreased itching differs from the proportion with decreased eosinophil count after the treatment.[6]
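The McNemar statistic depends only on the two discordant cells of the table, that is, the patients in whom one outcome decreased and the other did not. The sketch below shows the uncorrected form of the statistic with one degree of freedom. Some websites apply a continuity correction or an exact version and will give slightly different results. The counts are hypothetical and `mcnemar` is our own helper name.

```python
import math

def mcnemar(b, c):
    """Return (statistic, p-value) from the two discordant counts.

    b: decreased itching but not decreased eosinophil count
    c: not decreased itching but decreased eosinophil count
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    statistic = (b - c) ** 2 / (b + c)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical discordant counts from a 2 x 2 table of 60 patients.
statistic, p_value = mcnemar(b=15, c=5)
print(f"McNemar chi-square = {statistic:.2f}, p = {p_value:.4f}")
```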
Discussion
We presume that this article will help readers learn the basics of biomedical statistics
and get an idea of the websites where these tests can be carried out with limited
resources.
Although we have listed some of the available websites for the statistical tests,
the list may not be comprehensive. In addition, descriptive statistics were
not described in detail; these can be found in the article contributed by Kaliyadan and
Kulkarni.[7] Furthermore, in many cases, multiple tests are available for analyzing
the same set of data. For example, there are several tests for checking the normality
of data.[8] Similarly, there may be other websites that offer the same test. We presume
that researchers will find the most suitable websites for their statistical tests.
This article was written with the sole purpose of training novice researchers. We
do not claim it to be a complete guide to inferential statistics. However, we presume
that this glimpse of common statistical tests with examples will enhance the learning
of physician-researchers.
Conclusion
We briefly described how common statistical tests used in biomedical research can
be conducted online without installing any dedicated software. However, access to a
computer with an internet connection, at minimal cost, is a prerequisite. Novice
researchers in resource-limited settings may carry out these statistical tests. The
application of statistical tests for analyzing clinical data is evolving. Hence, researchers
are advised to update themselves continually.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.