Alexander Pondaven
I am a PhD student at the University of Oxford in the Torr Vision Group, part of the Autonomous Intelligent Machines and Systems CDT, with funding from SNAP. My research is on video and 3D generative models, including diffusion models. I am particularly interested in how these models can learn generalisable representations in a compositional fashion.
Research: I completed my four-year MEng degree in Electronic and Information Engineering (computer engineering) at Imperial College London in 2023. My Master’s project, supervised by Dr. Yingzhen Li, focused on increasing the diversity of diffusion model sampling by applying particle-based methods to Stable Diffusion 2. The method adds a repulsion force between image samples during the denoising step, which reduces redundancy across samples and gives users extra control over the direction in which generations are spread when exploring the image space. See the project page and slides for more information.
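To give a flavour of the idea, here is a minimal sketch of one denoising step with pairwise repulsion between a batch of samples. The function and parameter names (`denoiser`, `repulsion_scale`, `bandwidth`) and the RBF kernel are illustrative assumptions, not the exact formulation used in the project.

```python
import torch

def repulsive_denoise_step(samples, denoiser, t, repulsion_scale=0.1, bandwidth=1.0):
    """One denoising step with a pairwise repulsion term between samples.

    `denoiser(x, t)` stands in for the model's usual update (e.g. a DDIM/DDPM
    step); all names and scales here are illustrative only.
    """
    # Standard denoising update for the whole batch of particles.
    denoised = denoiser(samples, t)

    # Pairwise differences and squared distances in flattened image space.
    flat = denoised.flatten(start_dim=1)                    # (N, D)
    diffs = flat.unsqueeze(1) - flat.unsqueeze(0)           # (N, N, D)
    sq_dists = (diffs ** 2).sum(-1)                         # (N, N)

    # RBF-kernel weights: nearby samples push each other apart more strongly.
    weights = torch.exp(-sq_dists / (2 * bandwidth ** 2))   # (N, N)
    repulsion = (weights.unsqueeze(-1) * diffs).sum(dim=1)  # (N, D)

    # Nudge each sample away from its neighbours.
    flat = flat + repulsion_scale * repulsion
    return flat.view_as(denoised)
```

One way to steer the direction of diversification would be to compute the repulsion in a chosen feature space rather than directly in pixel space; see the project page for the actual formulation.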
While in my third year at Imperial, I worked on a research problem involving inpainting satellite images within a meta-learning framework. After demonstrating its downstream impact in a water classification setting, our work was accepted at the 2022 NeurIPS climate AI workshop. See the workshop paper and project page for more information. I worked with a great team of fellow Imperial students supervised by Harrison Zhu, and collaborated with researchers from the University of Copenhagen and the University of Oxford.
Industrial Experience: Before starting my final year at Imperial, I completed a 6-month placement at Humanising Autonomy, a machine learning startup building object detection systems to better understand human behaviour. I implemented active learning approaches that select the most informative frames in videos for labelling, so that object detection models can be trained in a more data-efficient manner. I then built an automated data ingestion pipeline on AWS that extracts frames from client videos without human intervention. I also improved the smoothness of object tracking, wrote production-level Python code, and developed a reproducible way of creating on-edge demos of the pose estimation and object detection systems.
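As a rough illustration of the frame-selection idea (not the production pipeline), the sketch below scores frames by an uncertainty measure and picks the top-k while keeping them spread out in time; `frame_scores`, `k`, and `min_gap` are hypothetical parameters.

```python
import numpy as np

def select_key_frames(frame_scores, k=50, min_gap=10):
    """Pick the k most 'informative' frames for labelling.

    `frame_scores` is a per-frame uncertainty score (e.g. entropy of the
    detector's confidences); the scoring and the `min_gap` temporal spacing
    rule are illustrative choices only.
    """
    order = np.argsort(frame_scores)[::-1]   # highest uncertainty first
    chosen = []
    for idx in order:
        # Skip frames too close in time to an already-chosen frame,
        # so the labelled set is not dominated by near-duplicates.
        if all(abs(int(idx) - c) >= min_gap for c in chosen):
            chosen.append(int(idx))
        if len(chosen) == k:
            break
    return sorted(chosen)
```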
I was also previously a software engineering intern at MathWorks, where I worked on the MATLAB Deep Learning Toolbox. It was a unique experience to build machine learning tools rather than just use them out of the box. I developed a video action recognition classifier based on a research paper to demonstrate new functionality in the toolbox, and extended the toolbox's pooling layers to support spatiotemporal operations for the R2022a release.