This is interesting, but it's still far from how a human would learn to play the game. Humans don't have inbuilt rewards for Montezuma's Revenge; they acquire them culturally. How much of what was learned (by the machine, not the researchers) in playing Montezuma's Revenge could be applied to a game like Zelda? A human would instantly notice many of the connections between the two games: enemies that follow simple patterns and harm the player on contact, rooms laid out on a grid that connect to one another, single-use consumable keys that open doors, valuable gems to collect. Is the machine able to make any of these connections on its own?

I think this technique -- the one named in the paper's title -- would transfer somewhat to other games, because it essentially encourages the player to go places it hasn't been [without requiring it to explore everything exhaustively].

So you use existing tools to encourage the player to survive, and then you add extra motivation to encourage the player to learn what it needs to do [i.e. explore]. And exploration is a common theme in games like Zelda.
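To make that concrete, here's a minimal sketch of a count-based novelty bonus, one common way to implement this kind of exploration incentive. This is a hypothetical illustration, not the paper's actual method; the class and parameter names are my own.

```python
from collections import defaultdict

class NoveltyBonus:
    """Count-based exploration bonus: states visited less often yield a
    larger intrinsic reward. A stand-in for a novelty signal; real systems
    typically hash or discretize raw observations into state keys."""

    def __init__(self, scale=1.0):
        self.counts = defaultdict(int)  # visit count per state key
        self.scale = scale              # weight of the intrinsic reward

    def bonus(self, state):
        # state must be hashable, e.g. a discretized observation
        self.counts[state] += 1
        # bonus decays with repeated visits: scale / sqrt(count)
        return self.scale / (self.counts[state] ** 0.5)

# the agent then optimizes extrinsic reward (survival, score)
# plus this bonus, so unfamiliar rooms look attractive
nb = NoveltyBonus(scale=1.0)
extrinsic = 0.0  # e.g. no score change this step
total_reward = extrinsic + nb.bonus(("room3", 5, 7))
```

The first visit to a state gets the full bonus; each revisit pays out less, so the agent is gradually pushed toward places it hasn't been yet.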

I actually think that this technique will generalize to learning lots of games [e.g. the same engine might play Zelda & Montezuma's Revenge], but I don't think techniques in this vein alone will allow it to learn how to play one game and then immediately understand how to play the other.

[but "transfer learning" is a separate, challenging problem in AI]

That's a shame. For a human, each game played helps to make the person better at all games. Play enough adventure games, for example, and one begins to recognize so many of the patterns and tropes at work that the solutions to puzzles jump out obviously and immediately in a single play-through.

Moreover, we already have machines that are extremely proficient at solving very complex games when given enough hand-coded domain knowledge: expert systems. One notable example is the bot that successfully completed the game NetHack [0]. Would DeepMind's novelty-based reward technique work for NetHack?

[0] https://github.com/krajj7/BotHack