Reinforcement learning with prediction-based rewards

We’ve developed Random Network Distillation (RND), a prediction-based method that encourages reinforcement learning agents to explore their environments through curiosity. With RND, an agent for the first time exceeds average human performance on Montezuma’s Revenge.
