Alexander Pondaven

I’m a PhD student at Oxford’s Torr Vision Group, working on controllable and interactive video generation — teaching world models to be steered, and even played.

Right now I’m a research intern at Odyssey, building multiplayer world models: generative environments that many players can act in at once. My recent paper ActionParty (ECCV 2026) is a step towards this — the first video world model that binds distinct actions to up to seven players in a scene.

🎮 Now — Research intern at Odyssey, building multiplayer world models
🎓 PhD (2023–) — Controllable video diffusion at Oxford AIMS CDT, funded by a Snap studentship
📄 ECCV 2026 — ActionParty: binding actions to multiple players in a world model
📄 CVPR 2025 — DiTFlow: motion transfer for diffusion transformers
🌍 NeurIPS 2022 — Satellite inpainting with neural processes for climate

Curious about anything below? The publications, projects and experience pages have the details — otherwise the best way to reach me is by email.

news

Jun 1, 2026	Started a research internship at Odyssey, building multiplayer world models 🎮
Apr 3, 2026	ActionParty accepted to ECCV 2026 🎉 — a multi-subject world model that controls up to seven players at once. [arXiv]
Feb 27, 2025	DiTFlow — video motion transfer with diffusion transformers — accepted at CVPR 2025 🎉

selected publications

ECCV
ActionParty: Multi-Subject Action Binding in Generative Video Games

Pondaven, Alexander, Wu, Ziyi, Gilitschenski, Igor, Torr, Philip, Tulyakov, Sergey, Pizzati, Fabio, and Siarohin, Aliaksandr

In ECCV 2026


Recent advances in video diffusion have enabled world models that simulate interactive environments, but these are largely restricted to single-agent settings. We tackle action binding — associating specific actions with their corresponding subjects — and propose ActionParty, an action-controllable multi-subject world model for generative video games. It introduces subject state tokens that persistently capture each subject’s state, and a spatial biasing mechanism that disentangles global frame rendering from individual action-controlled subject updates. On the Melting Pot benchmark, ActionParty is the first video world model to control up to seven players simultaneously across 46 environments.
@inproceedings{pondaven2026actionparty,
  title = {ActionParty: Multi-Subject Action Binding in Generative Video Games},
  author = {Pondaven, Alexander and Wu, Ziyi and Gilitschenski, Igor and Torr, Philip and Tulyakov, Sergey and Pizzati, Fabio and Siarohin, Aliaksandr},
  booktitle = {ECCV},
  year = {2026},
}

CVPR

Video Motion Transfer with Diffusion Transformers

Pondaven, Alexander, Siarohin, Aliaksandr, Tulyakov, Sergey, Torr, Philip, and Pizzati, Fabio

In CVPR 2025


@inproceedings{pondaven2025ditflow,
  title = {Video Motion Transfer with Diffusion Transformers},
  author = {Pondaven, Alexander and Siarohin, Aliaksandr and Tulyakov, Sergey and Torr, Philip and Pizzati, Fabio},
  booktitle = {CVPR},
  year = {2025},
}


NeurIPS-W

Convolutional Neural Processes for Inpainting Satellite Images: Application to Water Body Segmentation

Pondaven, Alexander, Bakler, Märt, Guo, Donghu, Hashim, Hamzah, Ignatov, Martin G, Bhatt, Samir, Flaxman, Seth, Mishra, Swapnil, Alhajjar, Elie, and Zhu, Harrison

In NeurIPS 2022 Workshop on Tackling Climate Change with Machine Learning


@inproceedings{pondaven2022convolutional,
  title = {Convolutional Neural Processes for Inpainting Satellite Images: Application to Water Body Segmentation},
  author = {Pondaven, Alexander and Bakler, Märt and Guo, Donghu and Hashim, Hamzah and Ignatov, Martin G and Bhatt, Samir and Flaxman, Seth and Mishra, Swapnil and Alhajjar, Elie and Zhu, Harrison},
  booktitle = {NeurIPS 2022 Workshop on Tackling Climate Change with Machine Learning},
  url = {https://www.climatechange.ai/papers/neurips2022/24},
  year = {2022},
}