Category: OpenAI AI
All the latest AI news from OpenAI
- Learning dexterity
- Variational option discovery algorithms
- OpenAI Scholars 2018: Meet our Scholars
- OpenAI Five Benchmark
- Glow: Better reversible generative models
- Learning Montezuma’s Revenge from a single demonstration
- OpenAI Five
- Retro Contest: Results
- Learning policy representations in multiagent systems
- Improving language understanding with unsupervised learning
- GamePad: A learning environment for theorem proving
- OpenAI Fellows Fall 2018
- Gym Retro
- AI and compute
- AI safety via debate
- Evolved Policy Gradients
- Gotta Learn Fast: A new benchmark for generalization in RL
- Retro Contest
- Variance reduction for policy gradient with action-dependent factorized baselines
- Improving GANs using optimal transport
- Report from the OpenAI hackathon
- On first-order meta-learning algorithms
- Reptile: A scalable meta-learning algorithm
- OpenAI Scholars
- Some considerations on learning to explore via meta-reinforcement learning
- Multi-Goal Reinforcement Learning: Challenging robotics environments and request for research
- Ingredients for robotics research
- OpenAI hackathon
- OpenAI supporters
- Preparing for malicious uses of AI
- Interpretable machine learning through teaching
- Discovering types for entity disambiguation
- Requests for Research 2.0
- Scaling Kubernetes to 2,500 nodes
- Block-sparse GPU kernels
- Learning sparse neural networks through L₀ regularization
- Interpretable and pedagogical examples
- Learning a hierarchy
- Generalizing from simulation
- Sim-to-real transfer of robotic control with dynamics randomization
- Asymmetric actor critic for image-based robot learning
- Domain randomization and generative models for robotic grasping
- Meta-learning for wrestling
- Competitive self-play
- Nonlinear computation in deep linear networks
- Learning to model other minds
- Learning with opponent-learning awareness
- OpenAI Baselines: ACKTR & A2C
- More on Dota 2
- Dota 2
- Gathering human feedback
- Better exploration with parameter noise
- Proximal Policy Optimization
- Robust adversarial inputs
- Hindsight Experience Replay
- Teacher–student curriculum learning
- Faster physics in Python
- Learning from human preferences
- Learning to cooperate, compete, and communicate
- UCB exploration via Q-ensembles
- OpenAI Baselines: DQN
- Robots that learn
- Roboschool
- Equivalence between policy gradients and soft Q-learning
- Stochastic Neural Networks for hierarchical reinforcement learning
- Unsupervised sentiment neuron
- Spam detection in the physical world
- Evolution strategies as a scalable alternative to reinforcement learning
- One-shot imitation learning
- Distill
- Learning to communicate
- Emergence of grounded compositional language in multi-agent populations
- Prediction and control with temporal segment models
- Third-person imitation learning
- Attacking machine learning with adversarial examples
- Adversarial attacks on neural network policies
- Team update
- PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications
- Faulty reward functions in the wild
- Universe
- #Exploration: A study of count-based exploration for deep reinforcement learning
- OpenAI and Microsoft
- On the quantitative analysis of decoder-based generative models
- A connection between generative adversarial networks, inverse reinforcement learning, and energy-based models
- RL²: Fast reinforcement learning via slow reinforcement learning
- Variational lossy autoencoder
- Extensions and limitations of the neural GPU
- Semi-supervised knowledge transfer for deep learning from private training data
- Report from the self-organizing conference
- Transfer from simulation to real world through learning deep inverse dynamics model
- Infrastructure for deep learning
- Machine Learning Unconference
- Team update
- Special projects