In this study the performance of feature-based dissimilarity space (FDS) classification is evaluated by comparing it to conventional classification techniques. In FDS classification, a classifier is trained in a dissimilarity space instead of a feature vector space. Since FDS classification can be applied to a wide range of classifiers, a new, model-independent dissimilarity feature selection method is presented and tested. This selection method is grounded in the compactness hypothesis (Arkadev and Braverman, 1966), and its performance is evaluated in a Monte Carlo simulation experiment and a bootstrap study. The performance of FDS classification itself is estimated with a bootstrap procedure and compared to that of conventional classification techniques. The results indicate that FDS classification is beneficial in combination with a linear classifier and a complex classification task: combining a linear classifier with FDS classification fits a linear decision boundary in the dissimilarity space, and this boundary becomes non-linear in the original feature vector space.
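The core idea of the abstract above can be illustrated with a minimal sketch: represent each object by its distances to a few prototype objects, then fit a linear classifier in that dissimilarity space. This is not the study's implementation; the prototype choice, the least-squares linear classifier, and the toy ring-shaped data are all illustrative assumptions.

```python
import numpy as np

def dissimilarity_space(X, prototypes):
    """Map feature vectors to a dissimilarity space: each new feature is
    the Euclidean distance to one prototype object."""
    return np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)

# Toy two-class problem that is NOT linearly separable in feature space:
# class 0 is a blob inside a ring formed by class 1.
rng = np.random.default_rng(0)
inner = rng.normal(0.0, 0.4, size=(60, 2))                 # class 0
angles = rng.uniform(0, 2 * np.pi, 60)
outer = np.c_[3 * np.cos(angles), 3 * np.sin(angles)]      # class 1
outer += rng.normal(0.0, 0.2, size=(60, 2))
X = np.vstack([inner, outer])
y = np.r_[np.zeros(60), np.ones(60)]

# Pick a handful of training objects as prototypes and represent every
# object by its distances to them (illustrative choice, not a rule).
prototypes = X[::20]
D = dissimilarity_space(X, prototypes)

# Fit a linear classifier (least squares on {-1, +1} targets) in the
# dissimilarity space; the bias column makes the boundary affine.
Db = np.c_[D, np.ones(len(D))]
w = np.linalg.lstsq(Db, 2 * y - 1, rcond=None)[0]
pred = (Db @ w > 0).astype(float)
accuracy = (pred == y).mean()
```

The linear boundary found in `D` corresponds to a curved (non-linear) boundary back in the original two-dimensional feature space, which is the mechanism the abstract describes.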
High dimensional classification problems are becoming increasingly frequent, and these problems are notoriously difficult. Classifying Alzheimer patients using MRI or fMRI data is one such challenge: often no more than 50 subjects are measured, while the number of variables or features observed per patient or object can be as high as 10,000. Specialized statistical learners attempt to combat the challenges these high dimensional classification problems present. In this thesis we propose an extension of an ensemble learner called Stacked Generalization that combines the idea of stacking multiple classification techniques with subsetting the feature space. We call it Stacked Domain Learning. We argue that Stacked Domain Learning may improve prediction performance in high dimensional classification problems, mainly in situations where the data present different modalities. We investigate this claim in a simulation study, applying state-of-the-art (high dimensional) classification techniques both as members of the ensemble learner and as comparisons for the extension. Differential performance between the learners and the extension when applied to relatively simple data sets, without different modalities, shows that the extension could improve on both Stacked Generalization in general and on choosing, through cross-validation, the single best performing statistical learner. Performance improvement is highly dependent on the characteristics of the data and most notable in conditions that are relatively noiseless.
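The stacking-with-feature-subsets idea can be sketched as follows: one level-0 learner per feature block ("domain"), and a level-1 meta-learner trained on out-of-fold scores of the domain learners. This is a hypothetical minimal numpy version, not the thesis's implementation; the least-squares base learner, the fold scheme, and the two-modality toy data are assumptions for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear scorer on {-1, +1} targets (stand-in base learner)."""
    Xb = np.c_[X, np.ones(len(X))]
    return np.linalg.lstsq(Xb, 2 * y - 1, rcond=None)[0]

def score(w, X):
    return np.c_[X, np.ones(len(X))] @ w

def stacked_domain_learning(X, y, domains, n_folds=5):
    """Level 0: one learner per feature block ('domain').
    Level 1: a linear meta-learner on out-of-fold domain scores."""
    n = len(y)
    folds = np.arange(n) % n_folds
    Z = np.zeros((n, len(domains)))            # out-of-fold meta-features
    for j, cols in enumerate(domains):
        for f in range(n_folds):
            tr, te = folds != f, folds == f
            w = fit_linear(X[tr][:, cols], y[tr])
            Z[te, j] = score(w, X[te][:, cols])
    w_meta = fit_linear(Z, y)                  # learn how to combine domains
    ws = [fit_linear(X[:, cols], y) for cols in domains]  # refit on all data
    def predict(Xnew):
        Znew = np.column_stack([score(w, Xnew[:, cols])
                                for w, cols in zip(ws, domains)])
        return (score(w_meta, Znew) > 0).astype(float)
    return predict

# Toy data with two "modalities": the signal is split across two feature blocks,
# so no single-domain learner sees the whole picture.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 10))
y = (X[:, 0] + X[:, 5] > 0).astype(float)
predict = stacked_domain_learning(X, y, domains=[range(0, 5), range(5, 10)])
acc = (predict(X) == y).mean()
```

Because each domain learner only recovers part of the signal, the meta-learner's weighted combination of the two domain scores is what restores good performance, mirroring the multimodality argument in the abstract.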
Performance increase is, however, not universal, even in the most favorable conditions; Stacked Domain Learning is therefore best used not as a replacement for existing techniques but as an addition to the library of techniques the statistician might consider. The results warrant further study of Stacked Domain Learning to investigate performance improvement in practical settings, with or without explicit modalities. Results of the simulation study also indicate that further improvements could be made, for instance in the way the ensemble is combined. We also attempt to assess the prediction performance of the stacking ensemble by measuring the size of the classifier space (or hypothesis space), the enlargement of which is the main argument in favor of the extension. Favorably interpreted, the results indicate a negligible relation, but they are not conclusive.