In this research, a quantum and a classical version of a value-based reinforcement learning (RL) method are compared. Each model is trained to play a simple game called Fox in a hole. The two models are compared on performance, training stability, convergence speed, and number of trainable parameters. After hyperparameter tuning and further experimentation, no clear difference is found between their performance and training stability. Nonetheless, the quantum model does seem to converge more slowly as the dimensionality of the game grows, and it also seems to require longer computation times than the classical model to match its performance. Thus, the results suggest that, for the task at hand, a classical value-based RL method is preferred over its quantum counterpart.