Dream to Control: Learning Behaviors by Latent Imagination (ICLR ‘20)
HackMD
Raj |
This paper learns long-horizon behaviors by propagating analytic value gradients through trajectories imagined with a recurrent state-space model (PlaNet, Hafner et al.).
The Value Equivalence Principle for Model-Based Reinforcement Learning (NeurIPS ‘20) |
HackMD |
Raj |
This paper introduces and studies the concept of value equivalence for reinforcement-learning models with respect to a set of policies and value functions. It further shows that this principle can be leveraged to find models that, under limited representational capacity, outperform their maximum-likelihood counterparts.
Stackelberg Actor-Critic: A Game-Theoretic Perspective
HackMD |
Sharath |
This paper formulates the interaction between the actor and critic as a Stackelberg game and leverages the implicit function theorem to compute accurate gradient updates for the actor and critic.
Curriculum Learning for Reinforcement Learning Domains
HackMD |
Sharath |
This is a survey paper on curriculum learning methods in reinforcement learning. |
Policy Gradient Methods for Reinforcement Learning with Function Approximation (NIPS 1999) |
HackMD |
Raj |
This paper introduces the policy gradient theorem and gives the first convergence proof for policy iteration with arbitrary differentiable function approximation.
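For reference, the paper's central result (the policy gradient theorem) expresses the gradient of the expected return in terms of the discounted state distribution d^π and the action-value function Q^π, with no gradient of the state distribution required:

```latex
\nabla_\theta J(\theta)
  = \sum_{s} d^{\pi}(s) \sum_{a} \nabla_\theta \pi_\theta(a \mid s)\, Q^{\pi}(s, a)
```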
Reinforcement Learning via Fenchel-Rockafellar Duality
HackMD |
Sharath |
This paper reviews the basic concepts of Fenchel duality and f-divergences, and shows how these tools can be applied in the context of reinforcement learning to derive theoretically grounded as well as practically robust algorithms.
High-Dimensional Continuous Control Using Generalized Advantage Estimation |
HackMD |
Raj |
This paper combines an exponentially weighted advantage estimator (GAE) with TRPO, trading a small amount of bias for reduced variance in the policy-gradient estimate and yielding stable policy improvement on high-dimensional continuous control tasks.
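A minimal sketch of the GAE recursion (function and variable names here are illustrative, not from the paper's code): each advantage is an exponentially weighted sum of TD residuals, accumulated backward over a trajectory.

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Compute Generalized Advantage Estimates for one trajectory.

    rewards: sequence of length T
    values:  sequence of length T+1 (V(s_0) ... V(s_T), bootstrap value last)
    """
    T = len(rewards)
    adv = np.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        # One-step TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted accumulation controlled by lambda
        gae = delta + gamma * lam * gae
        adv[t] = gae
    return adv
```

Setting lam=0 recovers the one-step TD residual (low variance, high bias), while lam=1 recovers the Monte Carlo advantage (high variance, low bias).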
Off-Policy Actor-Critic (ICML ‘12) |
HackMD |
Sharath |
This paper presents the first off-policy version of the actor-critic algorithm, deriving a simple and elegant method that outperforms existing algorithms on standard reinforcement-learning benchmark problems.
Combining Physical Simulators and Object-Based Networks for Control (ICRA ‘19) |
HackMD |
Sharath |
In this paper the authors propose a hybrid dynamics model, Simulation-Augmented Interaction Networks, which incorporates interaction networks into a physics engine for solving complex real-world robotic control tasks.
Learning Agile and Dynamic Motor Skills for Legged Robots |
HackMD |
Sharath |
This paper tackles the sim-to-real transfer problem for legged robots, training control policies in simulation with a learned actuator model and deploying them on the ANYmal quadruped.
PAC Bounds for Multi-Armed Bandit (COLT ‘02)
HackMD |
Raj |
This paper provides an algorithm with PAC guarantees that exploits the reward distribution of the particular problem to achieve better sample complexity.
Deep Reinforcement Learning for Dialogue Generation |
HackMD |
Om |
This paper discusses how better dialogue generation can be achieved using RL, providing a technique to encode conversational properties such as informativity, coherence, and ease of answering as reward functions.
Rainbow: Combining Improvements in Deep Reinforcement Learning |
HackMD |
Om |
The paper combines six extensions to DQN: Double DQN, prioritized experience replay, the dueling network architecture, multi-step learning (as in A3C), distributional Q-learning, and noisy networks, and shows that their combination substantially improves performance.
The Option-Critic Architecture |
HackMD |
Om |
This paper presents the option-critic architecture, a hierarchical reinforcement learning method based on temporal abstractions (options), in which intra-option policies and termination conditions are learned end-to-end.
Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets |
HackMD |
Om |
The paper proposes and experimentally validates methods for mitigating distribution shift when fine-tuning agents online from offline datasets.
FeUdal Networks for Hierarchical Reinforcement Learning |
HackMD |
Om |
This paper describes the FeUdal Networks model, which employs a Manager-Worker hierarchy: the Manager sets abstract goals that the Worker learns to fulfill.