Search results

(1 - 2 of 2)

Spoor, Lindsay 2024

Quantum error correction on the toric code using two distinct reinforcement learning game frameworks

Master thesis | Physics (MSc)

open access

This project employs reinforcement learning (RL) techniques to explore novel decoding strategies for quantum error correction, focusing on the toric code, to address the challenge of maintaining stable quantum states for fault-tolerant quantum computing. Two game frameworks are established, including a novel dynamic game framework applicable to training and measuring RL agents, with potential application in multiagent scenarios. The RL agents use Stable Baselines 3’s Proximal Policy Optimization and are shown to achieve Minimum Weight Perfect Matching performance on 3 × 3 toric code lattices in both the static and dynamic game frameworks.

Sinttruije, Deborah van 2023

Simulating Anticipatory Centering Behavior in a Robotic Sequential Reaching Task using Deep Reinforcement Learning

Master thesis | Psychology (MSc)

open access

Humans use inferred statistical properties of sequential events to smooth subsequent actions through anticipatory movements. These anticipatory movements have been studied in the serial reaction time (SRT) task, in which participants anticipate the target stimuli in learned sequences; under uncertainty, however, participants seem to adhere to a centering strategy. It remains unclear whether this centering behavior is a statistically inferred way to compensate for the absence of sequence knowledge, using the center as an optimal anticipatory position. In this study, two state-of-the-art Deep Reinforcement Learning (Deep RL) algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), are compared and used to train artificial agents to investigate the scope of centering behavior by manipulating the frequency distributions of target stimuli. While SAC clearly outperformed PPO in terms of performance and stability, both algorithms displayed an effect of frequency distribution on centering position. Specifically, the centering position shifted proportionally toward more probable target stimuli, suggesting that centering behavior is indeed anticipatory behavior that compensates for the absence of explicit sequence knowledge.

Leiden University Student Repository
