Meta-learning for wrestling

To test our continuous adaptation approach we designed 3 types of agents—Ant (4-leg), Bug (6-leg), and Spider (8-leg)—and set up a multi-round game where each agent played several matches against the same opponent and adapted its policy parameters between the rounds to better counter the opponent’s policy. In tests, we found that agents that could adapt their tactics are much better competitors than agents that have fixed policies. After training over a hundred agents, some of which learned fixed policies and others learned to adapt, we evaluated the fitness of each agent.

Learning on the fly can also let agents deal with unusual changes in their own bodies as well, like adapting to some of their own limbs losing functionality over time. This suggests we can use techniques like this to develop agents that can handle both changes in their external environment and also changes in their own bodies or internal states.