Tom Zahavy

Research Scientist, Google DeepMind

I co-lead the Discovery team at Google DeepMind in London, where I've worked since 2019. My recent research focuses on hard problems that demand creativity and aesthetic taste — in particular chess, mathematics, and computational origami. A few highlights are below.

Featured Projects

AlphaProof

An AI system that taught itself to prove mathematical theorems in Lean, reaching silver-medal performance at the International Mathematical Olympiad through continuous reinforcement learning.

COrigami

Combines reinforcement learning with Gemini to design origami crease patterns, using a semantic representation and visual feedback loop to fold arbitrary target shapes.

PuzzleGen

Generates original chess puzzles with reinforcement learning, rewarding uniqueness, counter-intuitiveness, and novelty. Evaluated by chess grandmasters and featured on lichess and chess.com.

AlphaZero db

Explores diversity in AI decision-making by training a league of agents with distinct playing styles, revealing multiple qualitatively different ways to play chess at a superhuman level.

Rewards are Gradients

A general theory of what an RL agent can be asked to do. When the goal is a convex function of the states an agent visits — exploration, imitation, diversity, constraints — the reward that solves it is the gradient of that objective.

LLMs can't jump

A position paper examining a fundamental limitation of large language models: their difficulty with abductive reasoning, a capacity central to genuine scientific invention.