Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
In clinical trials, treatment effects are often heterogeneous across patients with different pretreatment characteristics, such as age, gender, and weight. In response to this issue, various subgroup identification approaches have been proposed. Two of these methods, Qualitative Interaction Tree (QUINT) and a method adapted from the optimal treatment regimes (OTR) approach proposed by Zhang et al. (2012), are compared in this paper. In a situation with two treatments (A and B), both methods identify three types of subgroups: one for which treatment A is better than treatment B, one for which treatment B is better than treatment A, and one for which the difference between the two treatment outcomes is negligible (called the "indifference group"). A simulation study was conducted to compare the two methods with regard to their recovery performance (quantified by type I error rates, type II error rates, Cohen's κ agreement with the true subgroups, and the splitting performance of the derived trees) and their predictive performance (quantified by the difference between the true expected treatment outcome and the estimated treatment outcome, for both sample data and population data). The results of the simulation study suggest that QUINT is better at recovering the subgroups, whereas the method adapted from the OTR approach is better at predicting treatment outcome.
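To make the Cohen's κ criterion mentioned above concrete, the following is a minimal sketch of how agreement between true and recovered subgroup memberships could be computed. The labels "A", "B", and "I" (for the indifference group), the function name, and the toy data are assumptions made for illustration; they are not taken from the thesis.

```python
from collections import Counter


def cohens_kappa(true_labels, assigned_labels):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    if len(true_labels) != len(assigned_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(true_labels)

    # Observed agreement: share of patients placed in their true subgroup.
    observed = sum(t == a for t, a in zip(true_labels, assigned_labels)) / n

    # Chance agreement: product of the marginal proportions, summed over labels.
    true_counts = Counter(true_labels)
    assigned_counts = Counter(assigned_labels)
    expected = sum(true_counts[c] * assigned_counts[c] for c in true_counts) / n**2

    if expected == 1.0:
        # Degenerate case (both labelings use one identical label).
        # Returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy example: six patients, one of them misassigned.
true_groups = ["A", "A", "B", "B", "I", "I"]
found_groups = ["A", "A", "B", "I", "I", "I"]
print(cohens_kappa(true_groups, found_groups))
```

A value of 1 indicates perfect recovery of the true subgroups, and a value near 0 indicates agreement no better than chance.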
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Prediction rule ensembles (PREs) aim to offer a good compromise between prediction accuracy and interpretability by selecting a small set of the most important prediction rules. The accuracy of tree-based methods, such as single decision trees, is known to be negatively affected by measurement error. The PRE algorithm is based on single decision trees, which are turned into an ensemble of multiple rules, and may thus inherit the negative effect of measurement error. However, the influence of measurement error on the performance of PREs has not been investigated extensively before. Therefore, we evaluated the impact of measurement error on the performance of PREs through two simulation studies: one for data with continuous predictor variables and the other for data with binary predictor variables. In both, the focus is solely on binary classification. We found that the predictive accuracy of PREs, as measured by AUC values, deteriorated in the presence of measurement error. More importantly, performance deteriorated further as the amount of measurement error increased, in both the binary and the continuous predictor scenarios. In addition, the performance of PREs was evaluated in terms of the number of correctly selected rules and type I and type II errors. We found that, apart from deteriorating the predictive performance of PREs, measurement error can also deteriorate the interpretability of the fitted ensemble by selecting wrong rules, resulting in unreliable and wrong conclusions. Keywords: RuleFit, prediction rule ensembles, measurement error, classification error, reliability, type I error, type II error
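The abstract does not say how measurement error was generated, so the following is only a sketch under stated assumptions: classical additive Gaussian error controlled by a reliability value for continuous predictors, and random misclassification for binary predictors. All function names, the toy data-generating model, and the use of a single predictor as the score (rather than a fitted PRE) are hypothetical and chosen purely to illustrate how AUC can be tracked as measurement error grows.

```python
import random
import statistics


def add_continuous_error(values, reliability, rng):
    """Add Gaussian noise so that var(true) / var(observed) is about `reliability`.

    Assumes the classical additive error model; not taken from the thesis.
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    true_var = statistics.pvariance(values)
    noise_sd = (true_var * (1 - reliability) / reliability) ** 0.5
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def add_binary_error(values, flip_prob, rng):
    """Flip each 0/1 value independently with probability `flip_prob`."""
    return [1 - v if rng.random() < flip_prob else v for v in values]


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one continuous predictor and a noisy binary outcome.
rng = random.Random(1)
x_true = [rng.gauss(0, 1) for _ in range(500)]
y = [1 if xi + rng.gauss(0, 1) > 0 else 0 for xi in x_true]

# Score the outcome with the observed predictor at decreasing reliability.
for reliability in (1.0, 0.8, 0.5):
    x_observed = add_continuous_error(x_true, reliability, rng)
    print(reliability, round(auc(x_observed, y), 3))
```

In a full study, the contaminated predictors would be passed to the ensemble-fitting routine and the AUC computed on the predictions of the fitted model.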