A single tree model is very easy to understand and interpret. However, it is often unstable and relatively inaccurate. The aim of this article is to evaluate and improve the performance of single tree algorithms. In total, three single tree algorithms were evaluated: Classification and Regression Tree (CART), applied with the R package 'rpart'; Evolutionary Tree, applied with the R package 'evtree'; and a new method combining Bayesian Additive Regression Trees (BART) and Born Again Trees. We conducted a benchmark study on six different datasets and found that the evolutionary trees and born-again trees both perform better than CART in terms of accuracy. The relative performance of evolutionary and born-again trees depended on the dataset. Evolutionary trees performed better on relatively larger datasets and born-again trees performed better on relatively smaller datasets. However, these single tree methods still showed a large gap in performance compared to BART, especially when applied to large datasets. We conclude that there is still room for improvement of single trees compared to ensemble methods.
Objective: The Generalized linear mixed-model (GLMM) tree is a decision-tree method which allows for subgroup detection in a wide range of multilevel datasets. This thesis provides a first evaluation of how missing data can be handled in GLMM trees, by assessing the performance of listwise deletion (LD), mean or mode single imputation (SI), multiple imputation (MI) and missingness incorporated in attributes (MIA), in terms of predictive accuracy and tree size accuracy. Method: Different missingness mechanisms, proportions of missing cases and missing data were artificially introduced into data retrieved from the Early Childhood Longitudinal Study Kindergarten class of 1998-1999. Results: As expected, MI yielded the highest performance overall, closely followed by MIA, which performed approximately as well. SI performed somewhat worse than MIA and MI, whereas LD showed a substantially inferior performance. Individually, MI and MIA performed very similarly for lower amounts of missing data. MI slightly outperformed MIA for higher amounts of missing data and for missing completely at random (MCAR) and missing at random (MAR) data, whereas MIA slightly outperformed MI for missing not at random (MNAR) data. When comparing the size of fitted GLMM trees with those fitted on the complete data, MI tended to overfit and yield ensembles of more complex trees, whereas LD, SI and MIA tended to underfit and yield simpler decision trees. Furthermore, the performance of LD was lowest across all conditions and deteriorated even further as the number of cases with missing values increased. Conclusion: For handling missing data in GLMM trees, MI is recommended predominantly for prediction purposes, but lacks interpretability. Alternatively, MIA is recommended for interpretability and when a smaller tree size is preferred. Conversely, using either LD or SI is discouraged, even though SI is preferred over LD.
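As an illustration of the MIA idea mentioned in the abstract above, the following is a minimal sketch and not the thesis's implementation. It assumes a plain regression-tree setting with a sum-of-squared-errors criterion (the thesis itself works with GLMM trees), and the function names `sse` and `best_mia_split` are hypothetical. For one candidate threshold it compares three ways of treating missing predictor values: sending them left, sending them right, or splitting missing against observed.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_mia_split(x, y, threshold):
    """Pick the best MIA option at one threshold.

    x holds predictor values, with None marking a missing value.
    y holds the numeric response. Returns (option_name, total_sse).
    Assumption: node impurity is measured by SSE, as in a basic regression tree.
    """
    left = [yi for xi, yi in zip(x, y) if xi is not None and xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi is not None and xi > threshold]
    missing = [yi for xi, yi in zip(x, y) if xi is None]

    options = {
        # Missing values join the left child (x <= threshold).
        "missing_left": sse(left + missing) + sse(right),
        # Missing values join the right child (x > threshold).
        "missing_right": sse(left) + sse(right + missing),
        # Split on missingness itself: missing versus observed.
        "missing_vs_observed": sse(missing) + sse(left + right),
    }
    best = min(options, key=options.get)
    return best, options[best]


# Toy data: the cases with a missing predictor resemble the high-x cases.
x = [1.0, 2.0, None, 8.0, 9.0, None]
y = [1.0, 1.2, 5.0, 5.1, 4.9, 5.2]
best_option, best_sse = best_mia_split(x, y, threshold=5.0)
print(best_option, round(best_sse, 4))
```

In this toy example the missing cases are grouped with the right child, because their responses are close to those of the high-x observations. A full tree learner would repeat this comparison over all predictors and thresholds.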
The article screening process for meta-analyses is time-intensive and laborious. Statistical learning and natural language processing techniques can be used to partially automate this process. In this study, the performance of four models was compared using a range of evaluation metrics. The first model was built using Latent Dirichlet Allocation (LDA) to extract the topics from the articles to be used as input for a random forest. The second model was built using LDA topics as input for an Extreme Gradient Boosted (XGBoost) tree. The third and fourth models added to the first two by also incorporating a bibliometric feature as input to the respective classifiers. To compare these models, the article catalogues from two meta-analyses in the field of psychology were gathered and processed. Thus, two real-life datasets were used for the analysis. All four models were built using the full body of text from the article as input for the LDA. These four models were compared against a benchmark model which represented the more conventional approach to automated article screening. In both datasets, all four proposed models outperformed the benchmark model across all the performance metrics. In the first dataset, the model using LDA topics and bibliometric features as input to XGBoost was the highest performing model. In the second dataset, the model using LDA topics and bibliometric features as input to the random forest was the highest performing model. The results of this study support the growing evidence that the partial automation of article screening for meta-analyses is indeed possible with a high level of efficiency.
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
In this thesis, two regression models for the nonlinear analysis of interaction effects are proposed. The regression models are based on the Optimal Scaling methodology and specifically target the analysis of Factor-by-Curve interactions between a categorical and a continuous variable. The Optimal Scaling methodology was originally developed for analysis of categorical data, but is also applicable to continuous data. It estimates optimal quantifications for the original observed values in an iterative process by maximising the squared multiple regression coefficient (R²), thereby transforming the original variable. These quantifications are restricted according to a prespecified scaling level, indicating the stringency of the transformation. These scaling levels can restrict the quantifications to be unsmoothed (non)monotone, or to be smooth (non)monotone. Unsmoothed nonmonotone quantifications are not restricted to any relation between the original observed values, whereas the monotone restriction preserves the ordering of the original observed values in the quantifications. The smooth restrictions are similar, but the quantifications are then also smoothed using a spline function. The quantifications can also be restricted to a linear transformation of the original observed values. This (ordinary) Optimal Scaling regression model, however, does not take into account any interaction effects between the variables. The interactions considered in this thesis are Factor-by-Curve interactions, that is, interactions between a categorical variable (factor) and a continuous variable. The models proposed in this thesis will be referred to as the Factor-by-Curve Optimal Scaling regression (FbC-OS-regression) models.
Both models fit a separate curve for the continuous variable in the interaction for each level of the factor. For example, an interaction between a continuous variable and a factor of three levels is then fitted with three curves on that continuous variable. The difference between the two proposed models is that they either fit main and interaction effects separately or fit the joint effects in a single term. The models are illustrated with two applications on real data. The advantage of both FbC-OS-regression models, compared to existing methods for modelling Factor-by-Curve interactions, is that the Optimal Scaling methodology allows for monotone restrictions of the effects. This is demonstrated using the applications shown in this thesis, which are fitted using monotone spline restrictions. Results for the fitted FbC-OS-regression models are then compared to fitted linear regression models with interactions. Finally, the two approaches of modelling Factor-by-Curve interactions with OS-regression are compared to each other and to the additive model, which is also suitable for nonlinear analysis of Factor-by-Curve interactions, after which suggestions for further study of the proposed models are given.
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
In clinical trials, heterogeneity of treatment effect often exists between patients with different pretreatment characteristics, such as age, gender, weight, etc. In response to this issue, various subgroup identification approaches have been proposed. Two of these methods, Qualitative Interaction Tree (QUINT) and a method adapted from an optimal treatment regimes (OTR) approach proposed by Zhang et al. (2012), are compared in this paper. These two methods identify three types of subgroups in a situation with two treatments (A and B): one subgroup for which treatment A is better than treatment B, one for which treatment B is better than treatment A, and one for which the difference between the two treatment outcomes is negligible (called the "indifference group"). A simulation study was conducted to compare the two methods with regard to their recovery performance (quantified by type I error rates, type II error rates, Cohen's κ agreement with the true subgroups, and splitting performance of the derived trees) and their predictive performance (quantified using the difference between the true expected treatment outcome and the estimated treatment outcome of sample data and population data). Results of the simulation study suggested that QUINT has an advantage in recovering the subgroups, and the method adapted from the OTR approach has an advantage in predicting treatment outcome.
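To make the Cohen's κ agreement measure mentioned above concrete, here is a minimal sketch of how agreement between true and recovered subgroup labels could be computed. It is not taken from the paper: the function name `cohens_kappa` and the labels "A", "B" and "I" (for the indifference group) are assumptions chosen for illustration.

```python
from collections import Counter


def cohens_kappa(true_labels, predicted_labels):
    """Cohen's kappa: chance-corrected agreement between two labelings."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(true_labels)

    # Observed proportion of cases on which the two labelings agree.
    observed = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n

    # Agreement expected by chance, from the marginal label frequencies.
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    expected = sum(true_counts[k] * pred_counts[k] for k in true_counts) / n ** 2

    if expected == 1.0:
        # Both labelings use one and the same label for every case.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical subgroup assignments for six patients.
true_groups = ["A", "A", "B", "B", "I", "I"]
found_groups = ["A", "A", "B", "I", "I", "I"]
kappa = cohens_kappa(true_groups, found_groups)
print(round(kappa, 2))  # 0.75
```

Here five of six patients are assigned to the correct subgroup (observed agreement 5/6), chance agreement is 1/3, and κ is therefore 0.75.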