The data used for this thesis concern Bacterial Vaginosis (BV) and have some special characteristics: the numerical values are semi-quantitative, the response is categorical (BV negative, intermediate, and BV positive), and the data are high-dimensional. Categorical regression (CATREG) is a method that can be used to analyze such data. To determine how well CATREG predicts future outcomes from these data, it is compared to Random Forests, one of the gold standards in statistical learning. The dataset was randomly divided into a training set and a test set. The training set was used for variable selection and for determining the values of the regularization parameters, and the test set was used for estimating the prediction accuracy. Based on the training set, a Random Forests model and a CATREG model were chosen and used for prediction. Random Forests and CATREG both classify 68% of the outcomes correctly, but neither model distinguishes well between intermediate and BV positive women. When the intermediate and BV positive women are combined into a single class, the percentage of correctly classified women increases to 95% for Random Forests and 97% for CATREG. Overall, this analysis shows that CATREG performs as well as Random Forests in prediction and can therefore be considered a worthwhile alternative.
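The comparison between three-class accuracy and accuracy after combining the intermediate and BV positive classes can be illustrated with a minimal sketch. The function names (`accuracy`, `merge_classes`), the label strings, and the toy label lists below are assumptions made purely for illustration; they are not the thesis data or the author's code, and they do not reproduce the reported percentages.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the fraction of predictions that equal the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def merge_classes(
    labels: Sequence[str],
    merged: Sequence[str] = ("intermediate", "positive"),
    new_label: str = "not_negative",
) -> List[str]:
    """Relabel every class listed in `merged` as a single combined class."""
    return [new_label if lab in merged else lab for lab in labels]


# Hypothetical toy labels (NOT the thesis data): the only errors are
# confusions between "intermediate" and "positive".
y_true = ["negative", "negative", "intermediate", "positive", "positive", "intermediate"]
y_pred = ["negative", "negative", "positive", "positive", "intermediate", "intermediate"]

# Accuracy on the original three classes.
three_class_acc = accuracy(y_true, y_pred)

# Accuracy after combining intermediate and positive into one class.
two_class_acc = accuracy(merge_classes(y_true), merge_classes(y_pred))

print(f"three-class accuracy: {three_class_acc:.2f}")
print(f"two-class accuracy:   {two_class_acc:.2f}")
```

In this toy case every misclassification is a confusion between the two merged classes, so merging them removes all errors. This is the same mechanism by which the accuracy of both models rises once intermediate and BV positive women are taken together.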