M. Lanctot

Pub Categories

Computer Science - Learning (5)
Computer Science - Computer Science and Game Theory (5)
Computer Science - Artificial Intelligence (4)
Physics - Plasma Physics (2)
Computer Science - Neural and Evolutionary Computing (2)
Physics - Classical Physics (1)
Physics - Physics Education (1)
Computer Science - Computer Vision and Pattern Recognition (1)
Computer Science - Multiagent Systems (1)

Publications Authored By M. Lanctot

Deep reinforcement learning (RL) has achieved several high-profile successes in difficult control problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor.

A long-term energy option that is just approaching the horizon after decades of struggle is fusion. Recent developments allow us to apply techniques from spin physics to advance its viability. The cross section for the primary fusion fuel in a tokamak reactor, $D + T \rightarrow \alpha + n$, would be increased by a factor of 1...

Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the choice to cooperate or defect as an atomic action. In real-world social dilemmas, these choices are temporally extended.
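For concreteness, a matrix game of this kind can be written as a small payoff table in which each player makes a single atomic choice. The sketch below is only an illustration: the payoff values (3, 0, 5, 1) are conventional textbook numbers assumed here, not taken from the paper, and the names `PAYOFF` and `best_response` are hypothetical.

```python
# Row player's payoffs in a one-shot Prisoner's Dilemma.
# The numbers are conventional textbook values, assumed for illustration.
C, D = "cooperate", "defect"
PAYOFF = {
    (C, C): 3,  # reward for mutual cooperation
    (C, D): 0,  # sucker's payoff
    (D, C): 5,  # temptation to defect
    (D, D): 1,  # punishment for mutual defection
}


def best_response(opponent_action):
    """Return the row player's payoff-maximizing action against a fixed opponent action."""
    return max((C, D), key=lambda a: PAYOFF[(a, opponent_action)])


# Defection is the best response to either opponent action (a dominant strategy),
# even though mutual cooperation pays more than mutual defection.
print(best_response(C), best_response(D))
```

Each player picks one entry from a two-element action set, which is exactly the "atomic action" simplification the abstract contrasts with temporally extended choices.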

We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance a trade-off between caching of intermediate results and recomputation. The algorithm is capable of tightly fitting within almost any user-set memory budget while finding an optimal execution policy minimizing the computational cost.
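The caching-versus-recomputation trade-off can be illustrated with a small dynamic program. The sketch below is not necessarily the paper's exact formulation: it assumes a simplified cost model in which only forward steps are counted, `m` is the number of hidden states that may be cached, and with a single slot every step is recomputed from the start of the segment. The function name `min_forward_steps` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_forward_steps(t, m):
    """Minimum number of forward steps needed to backpropagate through t time
    steps when at most m hidden states can be cached (assumed cost model)."""
    if t == 1:
        return 1  # a single step is run once
    if m == 1:
        return t * (t + 1) // 2  # no spare slot: recompute from the start each time
    # Run y steps forward and cache the resulting state, handle the tail of
    # length t - y with one slot fewer, then handle the head of length y.
    return min(
        y + min_forward_steps(t - y, m - 1) + min_forward_steps(y, m)
        for y in range(1, t)
    )


# Under this model, a larger memory budget never increases the cost.
for m in (1, 2, 4, 8):
    print(m, min_forward_steps(64, m))
```

Recording the minimizing `y` at each `(t, m)` pair would give an execution policy in addition to its cost; that bookkeeping is omitted here to keep the sketch short.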

In this work we introduce a differentiable version of the Compositional Pattern Producing Network, called the DPPN. Unlike a standard CPPN, the topology of a DPPN is evolved but the weights are learned. A Lamarckian algorithm that combines evolution and learning produces DPPNs to reconstruct an image.

In recent years there have been many successes in using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders. In this paper, we present a new neural network architecture for model-free reinforcement learning.

Non-rotating ('locked') magnetic islands often lead to complete losses of confinement in tokamak plasmas, called major disruptions. Here, locked islands were suppressed for the first time by a combination of applied three-dimensional magnetic fields and injected millimetre waves. The applied fields were used to control the phase of locking and so align the island O-point with the region where the injected waves generated non-inductive currents.

Monte Carlo Tree Search (MCTS) has improved the performance of game engines in domains such as Go, Hex, and general game playing. MCTS has been shown to outperform classic alpha-beta search in games where good heuristic evaluations are difficult to obtain. In recent years, incorporating ideas from traditional minimax search into MCTS has been shown to be advantageous in some domains, such as Lines of Action, Amazons, and Breakthrough.

This article discusses two contributions to decision-making in complex partially observable stochastic games. First, we apply two state-of-the-art search techniques that use Monte-Carlo sampling to the task of approximating a Nash equilibrium (NE) in such games, namely Monte-Carlo Tree Search (MCTS) and Monte-Carlo Counterfactual Regret Minimization (MCCFR). MCTS has been proven to approximate an NE in perfect-information games.

We study Monte Carlo tree search (MCTS) in zero-sum extensive-form games with perfect information and simultaneous moves. We present a general template of MCTS algorithms for these games, which can be instantiated by various selection methods. We formally prove that if a selection method is $\epsilon$-Hannan consistent in a matrix game and satisfies additional requirements on exploration, then the MCTS algorithm eventually converges to an approximate Nash equilibrium (NE) of the extensive-form game.

This paper introduces Monte Carlo *-Minimax Search (MCMS), a Monte Carlo search algorithm for turn-based, stochastic, two-player, zero-sum games of perfect information. The algorithm is designed for the class of densely stochastic games; that is, games where one would rarely expect to sample the same successor state multiple times at any particular chance node. Our approach combines sparse sampling techniques from MDP planning with classic pruning techniques developed for adversarial expectimax planning.

Counterfactual Regret Minimization (CFR) is an efficient no-regret learning algorithm for decision problems modeled as extensive games. CFR's regret bounds depend on the requirement of perfect recall: players always remember information that was revealed to them and the order in which it was revealed. In games without perfect recall, however, CFR's guarantees do not apply.
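As background, CFR in its standard form updates the strategy at each information set with regret matching: actions are played in proportion to their positive cumulative regret. The sketch below shows only that local update. It omits the counterfactual value computation and the game-tree traversal, and the function name `regret_matching` is illustrative and not taken from the paper.

```python
def regret_matching(cumulative_regrets):
    """Turn cumulative regrets into a strategy: play each action in proportion
    to its positive regret, or uniformly if no regret is positive."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


print(regret_matching([1.0, 3.0, -2.0]))  # [0.25, 0.75, 0.0]
print(regret_matching([-1.0, -4.0]))      # [0.5, 0.5]
```

The regrets are accumulated per information set, which is where the perfect-recall requirement enters: the bounds rely on a player's information sets reflecting everything that player has previously observed.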

We investigate a simple variation of the series RLC circuit in which anti-parallel diodes replace the resistor. This results in a damped harmonic oscillator with a nonlinear damping term that is maximal at zero current and decreases with an inverse current relation for currents far from zero. A set of nonlinear differential equations for the oscillator circuit is derived and integrated numerically for comparison with circuit measurements.