Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Multivariate binary data are often collected in scientific fields such as psychology, economics and epidemiology. Worku and de Rooij (2018) proposed a marginal model for the analysis of this type of data in a distance framework: the multivariate logistic distance (MLD) model. Worku and de Rooij introduced two versions of the model: a restricted and an unrestricted MLD model. The interpretation of both models is clear, and a log-odds as well as a biplot representation can be used. In this work we proposed three extensions to the restricted model and showed the implications of the extensions for the interpretation of the corresponding biplot as well as for the log-odds. First, we showed how the model can be extended by making it possible for a response variable to belong to multiple dimensions. Consequently, the extended model can be used to examine dimensionality structures other than those allowed by the original model. Second, we allowed for non-linear relationships between the predictor variables and the response variables, thereby making the model more flexible. Finally, the dimensionality structure as well as the final predictor variables need to be selected. We showed how to use the prediction capability of a model as a criterion to select between competing models. This approach, based on the bias-variance trade-off, is a more versatile method of model selection than the likelihood-based criterion used in the original model. We fitted 16 variations of the model to an empirical data set to compare their performance based on prediction capability. All variations of the model can be estimated using standard statistical software for univariate models.