Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
National statistical institutes (NSIs) try to construct datasets that are rich in information as efficiently and cost-effectively as possible. This can be achieved by combining available data, such as administrative data or survey data. When datasets do not pertain to the same units, one can sometimes resort to statistical matching to integrate them. Statistical matching is a data fusion technique that can be used when different datasets contain different units but share a set of common (background) variables. The main goal of statistical matching is to estimate the relationship between the non-common variables in the different datasets. This paper investigates how best to utilize a small overlap of units in a statistical matching situation where the data consist only of categorical variables. A small overlap of units contains joint information on all variables for only a limited number of units. A new statistical matching method, the combined estimator, is developed in this paper, employing an idea from small area estimation. The performance of the combined estimator was compared with that of a couple of pre-existing statistical matching methods for categorical data under various data conditions. The results show that, even though the combined estimator itself does not perform better than the pre-existing statistical matching method (the EM algorithm), using the combined estimator as the starting point of the EM algorithm helps increase its accuracy under certain data conditions. The improvement in accuracy was observed in cases where the number of matching variables was large.
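The abstract does not give the form of the combined estimator. As a rough illustration only, the sketch below assumes a composite estimator of the kind used in small area estimation: a weighted average of a direct estimate from the overlap and a synthetic estimate obtained under conditional independence of the non-common variables given the common variable. The function names, the fixed `weight`, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import product


def ci_estimate(file_a, file_b, x_levels, y_levels, z_levels):
    """Synthetic estimate of P(x, y, z), assuming Y and Z are independent given X.

    file_a holds (x, y) pairs and file_b holds (x, z) pairs for different units.
    """
    count_x_a = Counter(x for x, _ in file_a)
    count_x_b = Counter(x for x, _ in file_b)
    count_xy = Counter(file_a)
    count_xz = Counter(file_b)
    n_total = len(file_a) + len(file_b)
    estimate = {}
    for x, y, z in product(x_levels, y_levels, z_levels):
        p_x = (count_x_a[x] + count_x_b[x]) / n_total
        p_y_given_x = count_xy[(x, y)] / count_x_a[x] if count_x_a[x] else 0.0
        p_z_given_x = count_xz[(x, z)] / count_x_b[x] if count_x_b[x] else 0.0
        estimate[(x, y, z)] = p_x * p_y_given_x * p_z_given_x
    return estimate


def direct_estimate(overlap, x_levels, y_levels, z_levels):
    """Direct estimate of P(x, y, z) from the small overlap of (x, y, z) units."""
    counts = Counter(overlap)
    n = len(overlap)
    return {cell: counts[cell] / n
            for cell in product(x_levels, y_levels, z_levels)}


def combined_estimate(direct, synthetic, weight):
    """Composite estimate: weight * direct + (1 - weight) * synthetic.

    The fixed weight is an assumption made for illustration.
    """
    return {cell: weight * direct[cell] + (1.0 - weight) * synthetic[cell]
            for cell in direct}


# Hypothetical toy data with binary variables.
levels = [0, 1]
file_a = [(0, 0), (0, 1), (1, 1), (1, 1)]   # (x, y)
file_b = [(0, 0), (0, 0), (1, 1), (1, 0)]   # (x, z)
overlap = [(0, 0, 0), (1, 1, 1)]            # (x, y, z)

synthetic = ci_estimate(file_a, file_b, levels, levels, levels)
direct = direct_estimate(overlap, levels, levels, levels)
combined = combined_estimate(direct, synthetic, weight=0.2)
```

A small weight reflects how little information a small overlap carries. Using such an estimate as the starting point of the EM algorithm, as the abstract describes, would be a separate step that is not shown here.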
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
The association of neurological and psychological conditions with changes in coactivation patterns of brain regions in the ‘resting state’ is of recent interest in neuroscience. To uncover such latent functional connectivity, series of functional Magnetic Resonance Imaging (fMRI) scans are typically reduced by averaging activations in brain atlas regions. The averaged activations are further reduced to pairwise correlations in sliding, fixed-width time windows. Unfortunately, such a reduction in dimensions also reduces the scan resolution and complicates interpretation. Changing to a text mining perspective, this thesis interprets the high-dimensional scans as documents with categorical words drawn from a study bag. Consecutive scans measure the activation in V discrete voxels of brain volumes. Activation series in each voxel are segmented into stationary subsequences. Similar correlated segments within voxels and from distinct voxels are then bagged as words. The words capture correlated activation both within and between voxels. Instead of being predefined in an atlas, regions emerge as neighbourhoods of voxels drawing the same word at the original scan resolution. The word counts that document voxels draw from the bag of categorical words define the document state. Document state transition probabilities measure the dynamics in coactivated brain locations at the original fMRI resolution, as a possible marker for a neurological condition. This alternative fMRI activation reduction method avoids a priori selection of regions, tuning of fixed time window widths, and selection of the number of principal components required by the existing method with which it is contrasted; the alternative method allows a more direct interpretation of activations.
However, the direct state-switching interpretation of scan document voxels drawing categorical word counts does not sufficiently separate subject groups for reliable classification of neurological conditions.
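To make the notion of document state transition probabilities concrete, the sketch below estimates them from a sequence of discrete state labels by counting consecutive pairs and normalising each row. How a scan is mapped to a state label is not specified in the abstract, so the labels and the function name `transition_probabilities` are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate P(next state | current state) from one sequence of labels."""
    counts = defaultdict(Counter)
    # Count each observed transition between consecutive scans.
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    # Normalise every row so that its probabilities sum to one.
    return {
        current: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for current, row in counts.items()
    }


# Hypothetical state labels for six consecutive scans.
probabilities = transition_probabilities(["a", "a", "b", "a", "b", "b"])
```

States that never occur before the final scan have no outgoing row in the result, because no transition from them was observed.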
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Functional connectivity (FC) is an important metric for characterizing brain mechanisms. Assessment of resting-state FC is a popular tool for studying brain disease mechanisms. Correlations between functional magnetic resonance imaging (fMRI) blood-oxygenation-level-dependent (BOLD) time courses in different brain regions can measure FC, which has revealed a meaningful organization of spontaneous fluctuations in the brain during rest. Therefore, in most studies, the temporal and spatial dynamics of FC are measured by the correlation coefficients between the fMRI signals of several brain regions. However, recent research has shown that FC is not stationary. That is, FC changes dynamically over time, reflecting additional and rich information about brain organization. In 2013, Leonardi et al. proposed a new approach based on principal component analysis (PCA) to reveal hidden patterns of coherent FC dynamics across multiple subjects. This thesis evaluates this new approach in a simulation study. Moreover, a framework to test the new approach is proposed. The simulation study revealed both advantages and disadvantages of the new approach. The results showed that the new approach can extract the most important dynamic connectivity features underlying fMRI data. It can effectively retrieve time-varying connectivity between dynamic brain regions during rest. The new approach identified connections with similar fluctuations and gave an efficient linear representation. However, it is sensitive only to linear relations between connectivity pairs, and it yielded robust results under restricted conditions.
Finally, some recommendations are provided for researchers using this method to study dynamic functional brain connectivity at rest.
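The abstract does not describe the pipeline in detail. A minimal sketch of a PCA-based analysis of dynamic FC, under the assumption that connectivity is first estimated by sliding-window correlation, could look as follows. The window width, the function names, and the use of power iteration to obtain only the leading component are illustrative choices, not the procedure of Leonardi et al. or of the thesis.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def sliding_window_correlation(x, y, width):
    """Correlation between two regional time courses in each window of `width` scans."""
    return [pearson(x[s:s + width], y[s:s + width])
            for s in range(len(x) - width + 1)]


def leading_component(rows, iterations=100):
    """Leading principal component of `rows` (windows x connectivity pairs).

    Uses power iteration on the column-centred data. The uniform start vector
    is an assumption; it fails if it is orthogonal to the leading component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        # Multiply by X^T X without forming the covariance matrix explicitly.
        scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
        w = [sum(centred[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0:
            break
        v = [a / norm for a in w]
    return v
```

Each row passed to `leading_component` would hold the windowed correlations of all region pairs at one window position; pairs whose correlations fluctuate together receive loadings of similar magnitude.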
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Synchrony of neuronal responses across subjects is known as neural reliability. The level of neural reliability evoked by natural stimuli has been shown to predict the preferences of a larger audience (Dmochowski et al., 2014). The same authors also proposed the state-of-the-art method for calculating neural reliability in an EEG setting (Dmochowski et al., 2014). However, that method is indirect and rather ad hoc. Therefore, some existing alternative methods are considered, together with a newly proposed algorithm for calculating neural reliability. All methods are compared by means of a simulation study, which tests their ability to recover the actual neural reliability in the data as well as their performance in predicting a population measure. Furthermore, the wavelet transform is investigated as a denoising step for EEG data. The results of the simulation study show that the method of Dmochowski and colleagues (2014) performs well on undenoised data and when the relationship between the “true” inter-subject correlation (ISC) and buying behaviour is strong. However, the adapted neural reliability method of Hasson and colleagues (2004), originally intended for fMRI studies, stands out not only in terms of performance but also in consistency of performance under different data characteristics, such as the strength of the ISC, the signal-to-noise ratio, and the strength of the relation between true ISC and buying behaviour. Moreover, this method is more direct and easier to calculate. The proposed denoising by wavelet transform only hurts the performance of the proposed neural reliability methods.
It can be concluded that the adapted method of Hasson and colleagues (2004) can be recommended both for determining the ISC and for relating the ISC to a population measure.
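As a rough illustration of a direct, correlation-based measure of neural reliability, the sketch below averages the Pearson correlation over all pairs of subjects for a single channel. This is a generic pairwise ISC and only an assumption about what a direct method might look like; the exact adaptation of the method of Hasson and colleagues used in the thesis is not given in the abstract, and the name `pairwise_isc` is hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pairwise_isc(signals):
    """Mean Pearson correlation over all subject pairs.

    `signals` holds one time course per subject, all recorded from the same
    channel while the subjects were exposed to the same stimulus.
    """
    pairs = list(combinations(signals, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

For multi-channel EEG, one simple option would be to compute this value per channel and then average across channels; that aggregation step is likewise an assumption.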