Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Association of neurological and psychological conditions with changes in coactivation patterns of brain regions in ‘resting state’ is of recent interest in neuroscience. To uncover such latent functional connectivity, series of functional Magnetic Resonance Imaging (fMRI) scans are typically reduced by averaging activations in brain atlas regions. The averaged activations are further reduced to pairwise correlations in sliding time windows of fixed width. Unfortunately, such a reduction in dimensions also reduces the scan resolution and complicates interpretation. Changing to a text mining perspective, this thesis interprets the high-dimensional scans as documents with categorical words drawn from a study bag. Consecutive scans measure the activation in V discrete voxels of brain volumes. The activation series in each voxel is segmented into stationary subsequences. Similar correlated segments within voxels and from distinct voxels are then bagged as words. The words capture correlated activation both within and between voxels. Instead of being predefined in an atlas, regions emerge as neighbourhoods of voxels drawing the same word at the original scan resolution. The word counts that document voxels draw from the bag of categorical words define the document state. Document state transition probabilities measure the dynamics in coactivated brain locations at the original fMRI resolution, as a possible marker for a neurological condition. This alternative fMRI activation reduction method avoids the a priori selection of regions, the tuning of fixed time window widths, and the selection of the number of principal components required by the existing method it is contrasted with; the alternative method allows a more direct interpretation of activations. 
However, the direct state switching interpretation of scan document voxels drawing categorical word counts does not sufficiently separate subject groups for reliable classification of neurological conditions.
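The state transition probabilities mentioned in this abstract can be illustrated with a minimal sketch. It assumes each scan has already been assigned an integer state label, which is a hypothetical encoding chosen for illustration and not the thesis's own word-count representation; the function name `transition_matrix` is likewise an assumption.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate transition probabilities as row-normalised transition counts."""
    counts = np.zeros((n_states, n_states))
    # Count each move from the state of one scan to the state of the next.
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # States with no observed outgoing transition keep an all-zero row.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical state sequence for eight consecutive scans.
states = [0, 0, 1, 2, 1, 0, 0, 2]
P = transition_matrix(states, n_states=3)
```

Each row of `P` then gives the estimated probabilities of moving from one state to every other state between consecutive scans.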
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Functional connectivity (FC) is an important metric to characterize brain mechanisms. Assessment of resting-state FC is a popular tool for studying brain disease mechanisms. Correlations between functional magnetic resonance imaging (fMRI) blood-oxygenation-level-dependent (BOLD) time courses in different brain regions can measure FC, which has revealed a meaningful organization of spontaneous fluctuations in the brain during rest. Therefore, in most studies, the temporal and spatial dynamics of FC are measured by the correlation coefficients between the fMRI signals of several brain regions. However, recent research has shown that FC is not stationary. That is, FC changes dynamically over time, reflecting additional and rich information about brain organization. In 2013, Leonardi et al. proposed a new approach based on principal component analysis (PCA) to reveal hidden patterns of coherent FC dynamics across multiple subjects. This thesis evaluates this new approach in a simulation study. Moreover, a framework to test the new approach is proposed. The simulation study showed advantages and disadvantages of the new approach. The results showed that the new approach can extract the most important dynamic connectivity features underlying fMRI data. It can effectively retrieve time-varying connectivity between dynamic brain regions during rest. The new approach identified connections with similar fluctuations and gave an efficient linear representation, but it was only sensitive to linear relations between connectivity pairs, and it yielded robust results under restricted conditions. 
Finally, some recommendations are provided for researchers using this method to study dynamic functional brain connectivity at rest.
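The general idea of applying PCA to time-varying correlations can be sketched as follows. This is a simplified single-subject illustration on random data, not the pipeline of Leonardi et al. or of the thesis; the function names, window width, and simulated time courses are all assumptions.

```python
import numpy as np

def sliding_window_fc(ts, width, step=1):
    """ts: (T, R) regional time courses -> (n_windows, n_pairs) correlations."""
    T, R = ts.shape
    upper = np.triu_indices(R, k=1)  # each region pair once, no diagonal
    rows = []
    for start in range(0, T - width + 1, step):
        corr = np.corrcoef(ts[start:start + width], rowvar=False)
        rows.append(corr[upper])
    return np.array(rows)

def pca_components(X, n_components):
    """PCA via SVD of the column-centred matrix X (windows x pairs)."""
    Xc = X - X.mean(axis=0)  # remove each pair's mean correlation over time
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return Vt[:n_components], explained[:n_components]

# Hypothetical data: 200 time points, 5 regions, pure noise.
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 5))
dfc = sliding_window_fc(ts, width=30)
components, explained = pca_components(dfc, n_components=3)
```

Each row of `components` is a pattern over region pairs whose correlations fluctuate together across windows, and `explained` gives the share of variance each pattern accounts for.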
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Aerosols are tiny particles of various kinds and compositions suspended in the atmosphere, some of which have a critical, adverse impact on public health. Hence, modelling the prevalence and distribution of these separate types is vital for shaping informed policy on air quality. In this work, methods are described to identify clusters of similar aerosol type mixtures in the Earth’s atmosphere on a global scale, on the basis of microphysical data from the space-borne remote sensing instrument POLDER-3. We report an unsupervised learning approach using the Self-Organizing Map (SOM) and k-means clustering, which allows for clustering without a priori assumptions on existing aerosol types, nature or prevalence. Two methods are introduced to stabilize these clustering algorithms over multiple identical runs and so manage their tendency to converge to local optima: the k-means nstart option is extended to the SOM, and a set-up is given for a new method, Expectation-Maximization-centered Mahalanobis clustering (EMcMc). A (repeated) v-fold cross-validation framework is presented to find the optimal number of clusters k in the data by means of cluster validation measures, currently including Prediction Strength and validated variants of the Silhouette Width. Using a separate test set, the method can be used to optimize a generic k, countering overfitting. A novel validation index is developed which extends the Silhouette Width to data sets with many observations (large N): the Gridded Silhouette Width. All described methods are implemented in the statistical software package R and shown to work for simulated examples, originating from scaled Gaussian distributions with varying degrees of overlap. 
Analysis of the POLDER-3 data indicated that, using only four variables, eight clusters can be found in a stable and reproducible fashion. The Silhouette indices did not appear to perform well for data as widely dispersed as these. The clusters found were characterized based on their variable distributions and geographical occurrence, which proved to be feasible and meaningful for real-life interpretations. The proposed aerosol types were dust, marine, urban-industrial, smoke and mixtures thereof. Keywords: aerosol typing; unsupervised learning; self-organizing map; k-means clustering; cluster validation measures; cross-validation; gridded silhouette.
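The restart idea behind the k-means nstart option and the standard Silhouette Width can be sketched in a few lines. The thesis implements its methods in R; Python with NumPy is used here purely for illustration, and the function names, simulated data, and parameter values are assumptions rather than the thesis's code. The sketch does not cover the SOM extension, EMcMc, or the Gridded Silhouette Width.

```python
import numpy as np

def kmeans_once(X, k, rng, n_iter=100):
    """One k-means run from a random start; returns labels and within-cluster SS."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # An empty cluster keeps its previous centre.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), d.min(axis=1).sum()

def kmeans_nstart(X, k, nstart=10, seed=0):
    """Repeat k-means from nstart random starts; keep the lowest within-cluster SS."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, rng) for _ in range(nstart)]
    return min(runs, key=lambda run: run[1])

def silhouette_width(X, labels):
    """Average silhouette width using Euclidean distances (needs >= 2 clusters)."""
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton clusters get a silhouette of 0
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

# Hypothetical data: two well-separated Gaussian clusters in two dimensions.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(10.0, 1.0, (50, 2))])
labels, within_ss = kmeans_nstart(X, k=2)
avg_sil = silhouette_width(X, labels)
```

The full pairwise distance matrix in `silhouette_width` grows quadratically with the number of observations, which is the kind of cost that motivates a variant for large N.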
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Synchronous neuronal responses across subjects are also known as neural reliability. The level of neural reliability evoked by natural stimuli has been shown to be a predictor of the preferences of a larger audience (Dmochowski et al., 2014). The same authors also proposed the state-of-the-art method for calculating neural reliability in an EEG setting (Dmochowski et al., 2014). However, that method is indirect and rather ad hoc. Therefore, some existing alternative methods are considered, as well as a newly proposed algorithm for calculating neural reliability. All the different methods are compared by means of a simulation study. Here, the methods are tested on their ability to recover the actual neural reliability in the data, and also on their performance in predicting a population measure. Furthermore, the wavelet transform is investigated as a denoising step for EEG data. The results of the simulation study show that the method of Dmochowski and colleagues (2014) performs well on undenoised data and when the relationship between the “true” inter-subject correlation (ISC) and buying behaviour is strong. However, the adapted neural reliability method of Hasson and colleagues (2004), originally intended for fMRI studies, stands out not only in terms of performance, but also in consistency of performance under different data characteristics, such as the strength of the ISC, the signal-to-noise ratio and the strength of the relation between true ISC and buying behaviour. Moreover, this method is also more direct and easier to calculate. The proposed way of denoising by wavelet transform only hurts the performance of the neural reliability methods considered. 
It can be concluded that the adapted method of Hasson and colleagues (2004) can be recommended both for determining the ISC and for relating the ISC to a population measure.
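One direct way to quantify synchrony across subjects is the average pairwise correlation of their time courses. The sketch below illustrates that generic idea on simulated data; it is an assumption made for illustration and is not claimed to be the adapted method evaluated in the thesis. The function name `pairwise_isc` and the simulated signals are hypothetical.

```python
import numpy as np

def pairwise_isc(data):
    """data: (n_subjects, n_timepoints) for one channel.

    Returns the mean correlation over all distinct subject pairs.
    """
    corr = np.corrcoef(data)  # subjects are rows, so rows are the variables
    upper = np.triu_indices(data.shape[0], k=1)
    return corr[upper].mean()

# Hypothetical data: 8 subjects, 500 time points.
rng = np.random.default_rng(1)
shared = rng.standard_normal(500)  # stimulus-driven signal common to all
synchronous = shared + rng.standard_normal((8, 500))  # shared signal plus noise
independent = rng.standard_normal((8, 500))  # no shared signal

isc_synchronous = pairwise_isc(synchronous)
isc_independent = pairwise_isc(independent)
```

With equal signal and noise variance, as simulated here, the synchronous group should yield a clearly positive value, while the independent group should stay close to zero.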