Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Recently, a new theory of hypothesis testing was introduced: safe testing. Within the safe testing framework, random variables called S-values are used for hypothesis testing. S-values can be interpreted as both conservative p-values and Bayes factors. Further, they allow for optional continuation: S-values from multiple studies can be multiplied while retaining a type-I error guarantee, and some S-values are even robust under the frequentist interpretation of optional stopping. For this thesis, I developed safe tests for two classical frequentist hypothesis tests: the 2x2 contingency table test and its stratified equivalent, the Cochran-Mantel-Haenszel test. These tests were designed to be GROW (growth-rate optimal in the worst case) for certain subsets of the alternative hypothesis. Two versions of the tests were presented: a version that provides the GROW S-value for a restricted alternative hypothesis based on a minimal absolute difference between group means, and a version that is based on the Kullback-Leibler divergence between the alternative and null hypothesis. For the ‘minimal absolute difference’ version, an analytically computable ‘simple’ S-value turned out to exist, which is robust under optional stopping. I showed that when using this safe test for optional stopping, the expected sample size needed to achieve a desired power can be lower than when using Fisher’s exact test. No ‘simple’ definition could be found for the Kullback-Leibler version: this GROW safe test has to be found through numerical optimization. Nevertheless, the Kullback-Leibler version could still be preferred in some cases: it was shown to gain higher power for certain data-generating distributions compared to the simple S-value.
Both S-values were implemented in an R package: the safe2x2 package.
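
To make optional continuation concrete, here is a minimal Python sketch under strong simplifying assumptions. It is not the GROW S-value developed in the thesis and not the safe2x2 interface. It assumes a point null (both groups share a known success probability `p0`) and a point alternative (`pa`, `pb`), in which case the likelihood ratio has expectation 1 under the null and so serves as a toy S-value. All function and parameter names are hypothetical. Because an S-value has expectation at most 1 under the null, Markov's inequality gives a type-I error of at most alpha when the null is rejected as soon as the S-value reaches 1/alpha, and the same holds for a product of S-values from independent studies.

```python
import math


def binom_loglik(successes, n, p):
    """Log-likelihood of `successes` out of `n` Bernoulli(p) trials.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    return successes * math.log(p) + (n - successes) * math.log(1 - p)


def toy_s_value(table, p0, pa, pb):
    """Likelihood ratio of a point alternative (pa, pb) against a point null p0.

    `table` is ((successes_a, n_a), (successes_b, n_b)).
    Assumption: a simple null, unlike the composite null of the 2x2 test.
    """
    (sa, na), (sb, nb) = table
    log_alt = binom_loglik(sa, na, pa) + binom_loglik(sb, nb, pb)
    log_null = binom_loglik(sa, na, p0) + binom_loglik(sb, nb, p0)
    return math.exp(log_alt - log_null)


def combined_s_value(tables, p0, pa, pb):
    """Optional continuation: multiply the S-values of separate studies."""
    return math.prod(toy_s_value(t, p0, pa, pb) for t in tables)


def reject_null(s_value, alpha=0.05):
    """Reject the null once the (combined) S-value reaches 1 / alpha."""
    return s_value >= 1 / alpha
```

For example, `combined_s_value([((7, 10), (2, 10)), ((6, 10), (3, 10))], 0.45, 0.65, 0.25)` multiplies the evidence from two hypothetical studies, and `reject_null` applies the 1/alpha threshold to the product.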