Google's DeepMind has taught AI agents to navigate a virtual parkour course using reinforcement learning.
Reinforcement learning is the practice of rewarding desirable behaviour. The faster the AI could navigate the virtual parkour course, the greater the reward. Further incentives and penalties were added for various other metrics.
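As a rough illustration of that reward idea, here is a minimal Python sketch. It is not the researchers' actual reward function: the name `forward_progress_reward`, the `effort` term and its weight are all hypothetical, chosen only to show how "faster means more reward" plus an optional penalty might be expressed.

```python
def forward_progress_reward(x_before, x_after, dt, effort=0.0, effort_weight=0.0):
    """Hypothetical reward: forward velocity along the course, minus an optional penalty.

    x_before, x_after: position along the course before and after one time step
    dt:                duration of the time step
    effort:            assumed stand-in for a penalised metric (e.g. energy used)
    effort_weight:     assumed weight of that penalty
    """
    velocity = (x_after - x_before) / dt
    return velocity - effort_weight * effort


# Covering the same distance in less time earns a larger reward.
slow = forward_progress_reward(0.0, 1.0, dt=0.5)
fast = forward_progress_reward(0.0, 1.0, dt=0.25)
print(slow, fast)
```

Moving backwards gives a negative reward in this sketch, so the agent is pushed to keep progressing along the course.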
“We train several simulated bodies on a diverse set of challenging terrains and obstacles, using a simple reward function based on forward progress,” explains Nicolas Heess, a researcher on the project. “Using a novel scalable variant of policy gradient reinforcement learning, our agents learn to run, jump, crouch and turn as required by the environment without explicit reward-based guidance.”
The virtual parkour course designed by the researchers featured drops, hurdles, and ledges. All of the navigation was self-taught by the AI using a trial-and-error approach to working out how to move forward and progress across the course as fast as possible.
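To give a feel for that trial-and-error process, the sketch below uses a plain textbook policy gradient update (REINFORCE) on a toy problem. It is not the scalable variant the researchers describe, and everything in it is an assumption made for illustration: a single hurdle, two hypothetical actions ("walk" and "jump"), and made-up reward values.

```python
import math
import random

# Hypothetical toy setup: at a hurdle the agent either walks into it or jumps over it.
ACTIONS = ["walk", "jump"]
REWARDS = {"walk": 0.0, "jump": 1.0}  # assumed forward progress for each action


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=500, lr=0.5, seed=0):
    """Learn action preferences by trial and error with a REINFORCE update."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]  # start with no preference between the actions
    for _ in range(steps):
        probs = softmax(prefs)
        # Trial: sample an action from the current policy.
        a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        reward = REWARDS[ACTIONS[a]]
        # Update: reward * gradient of log-probability of the chosen action.
        for k in range(len(prefs)):
            grad = (1.0 if k == a else 0.0) - probs[k]
            prefs[k] += lr * reward * grad
    return softmax(prefs)


probs = train()
print(dict(zip(ACTIONS, probs)))
```

Nothing tells the agent that jumping is correct. It simply tries both actions, and the one that yields forward progress becomes steadily more likely.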
You can see how DeepMind got on with parkour in the video below:
Everything the stick figure is doing is self-taught. It’s fascinating (and humorous) to observe all the leaps, crouches, and limbos the AI decided were the best way of navigating the course. Perhaps we could all learn a thing or two.
What are your thoughts on DeepMind’s use of reinforcement learning? Let us know in the comments.