Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
In this thesis, two regression models for the nonlinear analysis of interaction effects are proposed. The regression models are based on the Optimal Scaling methodology and specifically target the analysis of Factor-by-Curve interactions between a categorical and a continuous variable. The Optimal Scaling methodology was originally developed for the analysis of categorical data, but is also applicable to continuous data. It estimates optimal quantifications for the original observed values in an iterative process that maximises the squared multiple correlation coefficient (R²), thereby transforming the original variable. These quantifications are restricted according to a prespecified scaling level, which indicates the stringency of the transformation. The scaling levels can restrict the quantifications to be unsmoothed (non)monotone or smooth (non)monotone. Unsmoothed nonmonotone quantifications are not restricted to any relation between the original observed values, whereas the monotone restriction preserves the ordering of the original observed values in the quantifications. The smooth restrictions are similar, but the quantifications are then also smoothed using a spline function. The quantifications can also be restricted to a linear transformation of the original observed values. This (ordinary) Optimal Scaling regression model, however, does not take into account any interaction effects between the variables. The interactions considered in this thesis are Factor-by-Curve interactions, that is, interactions between a categorical variable (factor) and a continuous variable. The models proposed in this thesis are referred to as the Factor-by-Curve Optimal Scaling regression (FbC-OS-regression) models.
Both models fit a separate curve for the continuous variable in the interaction for each level of the factor. For example, an interaction between a continuous variable and a factor with three levels is fitted with three curves on that continuous variable. The two proposed models differ in that one fits main and interaction effects separately, while the other fits the joint effects in a single term. The models are illustrated with two applications on real data. The advantage of both FbC-OS-regression models over existing methods for modelling Factor-by-Curve interactions is that the Optimal Scaling methodology allows for monotone restrictions on the effects. This is demonstrated in the applications shown in this thesis, which are fitted using monotone spline restrictions. Results for the fitted FbC-OS-regression models are then compared to fitted linear regression models with interactions. Finally, the two approaches to modelling Factor-by-Curve interactions with OS-regression are compared to each other and to the additive model, which is also suitable for nonlinear analysis of Factor-by-Curve interactions, after which suggestions for further study of the proposed models are given.
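The idea of fitting one monotone curve per factor level can be illustrated with a minimal sketch. This is not the FbC-OS-regression algorithm of the thesis: it swaps in plain isotonic regression (the pool-adjacent-violators algorithm) as a stand-in for an unsmoothed monotone restriction, ignores main effects and splines, and does not force tied values of the continuous variable to share a fitted value. The function names `pava` and `fit_curves_by_level` are hypothetical.

```python
from collections import defaultdict


def pava(values):
    """Least-squares non-decreasing fit to `values` (pool-adjacent-violators)."""
    # Each block holds [mean of pooled values, number of pooled points].
    blocks = []
    for v in values:
        blocks.append([float(v), 1])
        # Pool neighbouring blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_right, n_right = blocks.pop()
            mean_left, n_left = blocks.pop()
            n_total = n_left + n_right
            pooled = (mean_left * n_left + mean_right * n_right) / n_total
            blocks.append([pooled, n_total])
    fitted = []
    for mean, n in blocks:
        fitted.extend([mean] * n)
    return fitted


def fit_curves_by_level(x, factor, y):
    """Fit a separate non-decreasing curve of y on x for each factor level.

    Returns a dict mapping each level to a list of (x, fitted y) pairs,
    sorted by x. Assumption: a monotone increasing effect within each level.
    """
    groups = defaultdict(list)
    for xi, level, yi in zip(x, factor, y):
        groups[level].append((xi, yi))
    curves = {}
    for level, pairs in groups.items():
        pairs.sort(key=lambda p: p[0])
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        curves[level] = list(zip(xs, pava(ys)))
    return curves


# Toy data: a factor with two levels, hence two fitted curves.
x = [1, 2, 3, 1, 2, 3]
factor = ["a", "a", "a", "b", "b", "b"]
y = [1, 3, 2, 5, 4, 6]
curves = fit_curves_by_level(x, factor, y)
```

Each level receives its own curve, and within a level the fitted values never decrease as the continuous variable increases, which is the ordering-preserving property that the monotone scaling level imposes.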
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
In clinical trials, heterogeneity of treatment effect often exists between patients with different pretreatment characteristics, such as age, gender, and weight. In response to this issue, various subgroup identification approaches have been proposed. Two of these methods, Qualitative Interaction Tree (QUINT) and a method adapted from an optimal treatment regimes (OTR) approach proposed by Zhang et al. (2012), are compared in this paper. In a situation with two treatments (A and B), both methods identify three types of subgroups: one for which treatment A is better than treatment B, one for which treatment B is better than treatment A, and one for which the difference between the two treatment outcomes is negligible (called the "indifference group"). A simulation study was conducted to compare the two methods with regard to their recovery performance (quantified by type I error rates, type II error rates, Cohen's κ agreement with the true subgroups, and splitting performance of the derived trees) and their predictive performance (quantified using the difference between the true expected treatment outcome and the estimated treatment outcome for sample data and population data). The results of the simulation study suggest that QUINT has the advantage in recovering the subgroups, whereas the method adapted from the OTR approach has the advantage in predicting treatment outcome.
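As an illustration of one of the recovery measures named above, the following minimal sketch computes Cohen's κ between true and recovered subgroup memberships. It is not taken from the thesis: the helper `assign_subgroup`, its `margin` threshold for a negligible difference, and the toy labels are hypothetical and serve only to make the example self-contained.

```python
from collections import Counter


def assign_subgroup(effect_difference, margin):
    """Label a patient by the sign and size of the A-minus-B outcome difference.

    Hypothetical rule: differences within +/- margin count as negligible.
    """
    if effect_difference > margin:
        return "A"
    if effect_difference < -margin:
        return "B"
    return "indifference"


def cohens_kappa(true_labels, recovered_labels):
    """Cohen's kappa: chance-corrected agreement between two labelings."""
    if len(true_labels) != len(recovered_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(true_labels)
    # Observed proportion of patients placed in the same subgroup.
    observed = sum(t == r for t, r in zip(true_labels, recovered_labels)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    true_counts = Counter(true_labels)
    recovered_counts = Counter(recovered_labels)
    expected = sum(true_counts[c] * recovered_counts[c] for c in true_counts) / n**2
    if expected == 1.0:
        return float("nan")  # kappa is undefined when both labelings are constant
    return (observed - expected) / (1.0 - expected)


# Toy example: six patients, one of whom is placed in the wrong subgroup.
true_diffs = [2.0, 1.5, -2.0, -1.2, 0.1, -0.3]
estimated_diffs = [1.8, 1.1, -1.7, -0.4, 0.2, 0.0]
true_groups = [assign_subgroup(d, margin=0.5) for d in true_diffs]
recovered_groups = [assign_subgroup(d, margin=0.5) for d in estimated_diffs]
kappa = cohens_kappa(true_groups, recovered_groups)
```

A κ of 1 indicates perfect recovery of the true subgroups, while a value near 0 indicates agreement no better than chance.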