Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
The task of finding suitable candidates for a job has never been an easy one, and now that recruiters have access to various online job boards and are not necessarily constrained by national...Show moreThe task of finding suitable candidates for a job has never been an easy one, and now that recruiters have access to various online job boards and are not necessarily constrained by national borders, it can be argued that shortlisting relevant candidates is more difficult than ever. This is especially true for online recruitment agencies that have huge databases of potential candidates and no effective ways to quickly identify which of those candidates have the required experience and skills for the vacancy at hand. There are many ways that different companies go about solving the aforementioned problem. In case of YoungCapital, a Dutch recruitment agency, all candidates can state their preferred profession and location when creating a profile on the company’s website, and recruiters can then create a search query based on those stated preferences. It is also possible to get keyword matches with candidates’ resumes, which, however, is a manual task where recruiters have to decide on the specific keywords they want to find. Given recent advances in machine learning and natural language processing, it was decided that a learning-to-rank (LTR) approach should be tried to see whether the candidate search process could be improved by presenting recruiters with a ranked list of candidates for each job, with the most suitable candidates at the top of the list. The LambdaMART model was chosen for this task as the state-of-the-art algorithm, and the baseline ranking model was a simple Linear Regression. Most of the features were designed using custom word embeddings. The results were evaluated with common rank-based measures: Normalised Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP). Precision, which ignores the order of results, was reported as well. Overall, we found a significant improvement over the current method according to all three measurements. We also demonstrated the impact of different feature sets on the performance of ranking models.Show less
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Online Linear Regression is a sequential variant of regression in which the data points arrive one by one. It is normally studied in the gametheoretic framework of Online Convex Optimization, which...Show moreOnline Linear Regression is a sequential variant of regression in which the data points arrive one by one. It is normally studied in the gametheoretic framework of Online Convex Optimization, which models the data as being generated by an adversary. In this framework, the standard statistical procedure of Online Ridge Regression is known to be essentially optimal. In Statistics, there is an improvement for Ridge Regression when the noise is not constant. This improvement is Weighted Ridge Regression, which relies on weighting the data by their variances. In this thesis, we will employ weighting in Online Ridge Regression to show that an improvement over Online Ridge Regression can be made. We furthermore explored the situation where weighting is disadvantageous, mathematically and experimentally using simulations. Finally we applied Online Weighted Ridge Regression to different real-world datasets and found that we also can improve Online Ridge Regression in practical situationsShow less
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
Machine Learning classifiers are naturally black boxes when it comes to interpretation. In this thesis, Decision Boundary Approximation (DBA), a new algorithm for locally explaining complex binary...Show moreMachine Learning classifiers are naturally black boxes when it comes to interpretation. In this thesis, Decision Boundary Approximation (DBA), a new algorithm for locally explaining complex binary classifiers is developed, tested experimentally and discussed. The algorithm explains predictions of individual instances, by approximating their most relevant region of the decision boundary with a linear model. We overview and discuss limitations of existing methods when applied to classification, with specific focus on LIME due to the similarity with DBA concepts. Experiments with DBA, cover both low dimensions and sparse high dimensional data. In Experiment 1 we show that DBA can provide stable explanations for various decision boundary structures in a 2D simulated case. Experiment 2 demonstrates that DBA outperforms LIME for low dimensionalities, while in Experiment 3 (MNIST data) we show that when data are sparse, DBA explanations can include features that are absent from the explained example, making the explanation more complete compared to LIME. In Experiment 4 we explain a Naive Bayes trained on SMS ham/spam messages and show that the DBA solution is in agreement with the Naive Bayes posterior. Finally, the benefits and drawbacks of DBA are discussed elaborately and future recommendations for modifications are given.Show less
The Multi-Armed Bandit(MAB) problem is named after slot machine games. When playing slot machines, one player has to decide which machine to play, in which order to play them and how many times to...Show moreThe Multi-Armed Bandit(MAB) problem is named after slot machine games. When playing slot machines, one player has to decide which machine to play, in which order to play them and how many times to play each machine. After the choice, that specific machine will offer a random reward from a probability distribution, and the player’s target is to maximize the sum of rewards earned through a sequence of lever pulls. In order to figure out the distribution of each machine as soon as possible and to get as much profit as possible, we will consider the popular Thompson Sampling (TS) method, which is based on Bayesian ideas. TS is a heuristic for choosing actions that maximize the expected reward with respect to a randomly drawn belief.[9] In this thesis, for the first half part, we test the performance of TS and also compare a variation called Top-two Thompson Sampling(TTTS) method to the normal TS, based on uniform sampling. Computationally, TTTS is a slow algorithm, so we also try to improve its performance and create another algorithm: Top-two Gibbs Thompson sampling, which combines TTTS and Gibbs Sampling methods and improves the computation speed of the TTTS method. In the second half of the thesis, we try to take a step forward in the application of TS, so we combine TS with the Maximin Action Identification(MAI) problem. Maximin is a concept from two-player zero-sum games in game theory. The main idea behind maximin action selection is a constant alternation of minimizing and maximizing the value of moves to account for an adversarial opponent in the game. Existing methods of maximin are related to games such as Checkers, Chess and Go. We try to broaden its application area and apply it to our new algorithms created in the first half part. First of all, the time budget is limited and we use different divisions to test the performance; Afterwards, we create a new algorithm to pick out the right arm with one step for two layers. The results show that there is not any significant difference between the time division method, which consists of even time budgets for he child layer and no budget for parent layer, and the Maximin Thompson Sampling method.Show less
Master thesis | Statistical Science for the Life and Behavioural Sciences (MSc)
open access
The Kalman filter has numerous applications in spatial-temporal prediction. A common application is for guidance, navigation, and control of vehicles, particularly aircraft and spacecraft. [1] In...Show moreThe Kalman filter has numerous applications in spatial-temporal prediction. A common application is for guidance, navigation, and control of vehicles, particularly aircraft and spacecraft. [1] In this thesis, we focus on one typical spatial-temporal data type of discrete time and discrete space. We consider a rectangular grid for the space domain. We make a first order Markov property assumption in both time and space to reduce complexity. In addition, several input control features are introduced into the Kalman filter. In other words, the distribution of future states depends only on the current states and input control features in their own area and their neighboring areas. Under our Markov assumption, it is natural for the transition matrix in the Kalman filter to be sparse for spatial-temporal data where sparse transition matrices with constrained structure are designed to interpret the spatial correlation among all the areas. We will derive the equations for inference in this particular spatial system, namely the Kalman filter and Kalman smoother. Using the results for the Kalman filter and Kalman smoother, we further consider the determination of the parameters of the Kalman filter model through a modified Expectation–Maximization(EM) algorithm that estimates sparse transition matrices. This stands in contrast with the standard EM algorithm, which usually produces a dense estimate for the matrices. To respect the spatial pre-constrained sparsity structure, we specify greedy EM updates that work on rows of the transition matrix. We study the properties of our new method in simulations and apply the method to a real data set on aviation safety where the goal is to predict which areas at Schiphol airport are at risk of having a large density of birds in the near future.Show less
Introduction Action recognition is an important task for domestic care robots. Current action recognition literature exclusively studies closed-set recognition problems, where performance is...Show moreIntroduction Action recognition is an important task for domestic care robots. Current action recognition literature exclusively studies closed-set recognition problems, where performance is evaluated on videos which were also available in the training set. However, the real-world environment is by definition open, and it is both theoretically and practically infeasible to supply the robot with all necessary information beforehand. This work develops novelty detection methodology applicable to HMM-based classifiers, which have shown earlier success in action recognition. By filtering unknown actions instances, our novelty detection module increases system robustness in open environments, and is a first step towards adaptively learning robots. Methodology We first develop an ordinary action recognition system based on a new skeleton-derived feature and a HMM back-end classifier. The HMM system is estimated in three ways: clustered, Expectation-Maximization (EM) and Extended Baum-Welch (EBW). The latter is a discriminative training criterion, which could theoretically improve novelty detection accuracy. Since the EBW algorithm has only been implemented in speech recognition software, we wrote its first open-source implementation (in Matlab, publicly available from www.github.com/thomasmoerland/Thesis). Then, novelty detection is approached from both a posterior probability and hypothesis-testing perspective, which we unify as background models. Since novelty detection for action recognition has not been reported before, we investigate a diverse set of background models: sum over competing models, filler models, flat models, anti-models, reweighted anti-models, and some combinations of them. Results Our HMM classification system has around 95% closed-set recognition accuracy on the Microsoft Action 3D dataset, which is near the state-of-the-art. Performance did not di↵er between the clustered, EM an EBW estimation methods, although our results do indicate the latter two might be beneficial on more challenging datasets. The optimal novelty detection module combining anti-models with flat models had 78% novelty accuracy, while maintaining 78% recognition accuracy as well. Novelty detection results were consistent over various dataset splits. Discriminative training did not alter novelty detection performance. Conclusion We are the first to study novelty detection for action recognition. Our results could increase system robustness in an open-set real-world environment, and furthermore serve as a first step towards an adaptively learning robot.Show less