|
I am a Research Manager and Senior Research Scientist at NVIDIA Spatial Intelligence (TorontoAI) Lab, leading a group focusing on generative world models.
Contact me at: linghuan at cs.toronto.edu
|
|
|
|
Open-sourced world foundation models and tools. |
|
Jay Zhangjie Wu*, Xuanchi Ren*, Tianchang Shen, Tianshi Cao, Kai He, Yifan Lu, Ruiyuan Gao, Enze Xie, Shiyi Lan, Jose M. Alvarez, Jun Gao, Sanja Fidler, Zian Wang, Huan Ling* (*: equally contributed) Technical Report, 2025 project page / code A framework for temporal reasoning that enables advanced image editing and world simulation through understanding time-based relationships. |
|
Xuanchi Ren*, Yifan Lu*, Tianshi Cao*, Ruiyuan Gao*, Shengyu Huang, Amirmojtaba Sabour, Tianchang Shen, Tobias Pfaff, Jay Zhangjie Wu, Runjian Chen, Seung Wook Kim, Jun Gao, Laura Leal-Taixe, Mike Chen, Sanja Fidler, Huan Ling (*: equally contributed) White Paper, 2025 project page / Code A world-model-based synthetic data generation (SDG) pipeline designed to enhance downstream tasks for autonomous vehicles. |
|
Leading the autonomous driving post-training research, data curation, model training and evaluation White Paper, 2025 website / code / white paper Large-scale multimodal control for conditional world generation. |
|
Core contributor. My contributions include data curation, large-scale base model training, and leading autonomous driving post-training. White Paper, 2025 website / code / white paper / video / Jensen Huang Keynote at CES 2025 NVIDIA's Cosmos product and open-source models.
|
|
Full list at Google Scholar. |
Generative Models: | |
|
Sherwin Bahmani, Tianchang Shen, Jiawei Ren, Jiahui Huang, Yifeng Jiang, Haithem Turki, Andrea Tagliasacchi, David B. Lindell, Zan Gojcic, Sanja Fidler, Huan Ling, Jun Gao*, Xuanchi Ren* (*: equally contributed) ArXiv, 2025 project page / paper / code A generative 3D scene reconstruction method that uses video diffusion model self-distillation to create high-quality 3D representations. |
|
Jay Zhangjie Wu*, Yuxuan Zhang*, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic^, Huan Ling^ (*, ^: equally contributed) CVPR, 2025 (Oral & Best Paper Nomination) project page / paper / Code Enhancing 3D reconstructions and novel-view synthesis via single-step diffusion inference. |
|
Ruofan Liang*, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang* (* : equally contributed) CVPR, 2025 (Oral) project page / paper A neural approach that addresses the dual problem of inverse and forward rendering within a holistic framework. |
|
Xuanchi Ren*, Tianchang Shen*, Jiahui Huang, Huan Ling, Yifan Lu, Merlin Nimier-David, Thomas Müller, Alexander Keller, Sanja Fidler, Jun Gao (* : equally contributed) CVPR, 2025 (Highlight) project page / paper A generative video model with precise camera control and temporal 3D consistency, using a 3D cache.
|
|
Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling NeurIPS, 2024 project page A scalable and expressive framework for high-quality dynamic 3D scene reconstruction. |
|
Huan Ling*, Seung Wook Kim*, Antonio Torralba, Sanja Fidler, Karsten Kreis (* : equally contributed) CVPR, 2024 (Highlight) project page We propose a framework that aligns dynamic 3D Gaussians with text-driven 4D generations. |
|
Andreas Blattmann*, Robin Rombach*, Huan Ling*, Tim Dockhorn*, Seung Wook Kim, Sanja Fidler, Karsten Kreis (* : equally contributed) CVPR, 2023 project page We present a latent diffusion framework for high-resolution video synthesis. A follow-up open-source model, Stable Video Diffusion (SVD), is available on Hugging Face, featuring enhanced datasets and fine-tuned results.
|
|
Huan Ling*, Karsten Kreis*, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler (* : equally contributed) NeurIPS, 2021 project page / code & demo EditGAN enables fine-grained and high-quality semantic edits to images by directly manipulating the latent space of GANs with explicit control over object-level attributes.
|
Generative Representation Learning: | |
|
Chenfeng Xu, Huan Ling, Sanja Fidler, Or Litany ICCV, 2023 project page A novel 3D object detection approach that leverages geometry-aware features derived from diffusion models for more robust 3D understanding. |
|
Daiqing Li*, Huan Ling*, Amlan Kar, David Acuna, Seung Wook Kim, Karsten Kreis, Antonio Torralba, Sanja Fidler (* : equally contributed) ICCV, 2023 project page We propose a self-supervised pretraining framework that uses generative models to supervise image encoders without requiring labeled data. |
|
Daiqing Li, Huan Ling, Seung Wook Kim, Karsten Kreis, Adela Barriuso, Sanja Fidler, Antonio Torralba CVPR, 2022 project page / code & demo We extend DatasetGAN to synthesize large-scale datasets like ImageNet with dense pixel-wise annotations, significantly reducing the manual labeling burden. |
|
Yuxuan Zhang*, Huan Ling*, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler (* : equally contributed) CVPR, 2021 project page / code & data DatasetGAN leverages the rich latent space of GANs to synthesize annotated datasets with minimal human effort, enabling pixel-level labeling from a few examples. |
|
Huan Ling*, David Acuna, Karsten Kreis, Seung Wook Kim, Sanja Fidler (* : equally contributed) NeurIPS, 2020 project page We introduce a variational framework for completing partially occluded objects, producing plausible and diverse amodal completions for scene understanding. |
3D Vision: | |
|
Yuxuan Zhang*, Wenzheng Chen*, Huan Ling, Yinan Zhang, Sanja Fidler (* : equally contributed) ICLR, 2021 paper / project page We unify GANs and differentiable rendering for interpretable 3D reconstruction and inverse graphics from single images using GAN priors and neural rendering techniques. |
|
Wenzheng Chen, Jun Gao*, Huan Ling*, Edward J. Smith*, Jaakko Lehtinen, Alec Jacobson, Sanja Fidler (* : equally contributed) NeurIPS, 2019 paper / project page We propose DIB-R, a differentiable renderer that enables training of 3D object predictors end-to-end with supervision from 2D images alone. |
Interactive Annotation: | |
|
Bowen Chen*, Huan Ling*, Jun Gao, Xiaohui Zeng, Ziyue Xu, Sanja Fidler (* : equally contributed) ECCV, 2020 paper / project page ScribbleBox is a practical system that allows users to quickly annotate video object segmentations with scribbles, enabling high-quality masks with minimal effort. |
|
Huan Ling*, Jun Gao*, Amlan Kar, Wenzheng Chen, Sanja Fidler (* : equally contributed) CVPR, 2019 paper / code We introduce Curve-GCN, a real-time annotation tool that models object contours as closed curves and uses graph convolution to refine annotations interactively. |
|
David Acuna*, Huan Ling*, Amlan Kar*, Sanja Fidler (* : equally contributed) CVPR, 2018 demo video / paper Polygon-RNN++ is an annotation system that speeds up segmentation mask creation by predicting object boundaries with minimal clicks via recurrent polygon prediction. |
Fun note from 2023: Yes, we took a bite of RLHF back in 2017 :) | |
|
Huan Ling, Sanja Fidler NeurIPS, 2017 project page / paper We introduce an early reinforcement learning from human feedback (RLHF) approach for image captioning, where a model learns to improve descriptions through natural language feedback rather than numerical rewards. |
|
|
|
Research Manager at NVIDIA, July 2025 - Present
Senior Research Scientist at NVIDIA, Mar. 2025 - July 2025
Research Scientist at NVIDIA, Jan. 2020 - Mar. 2025
Research Intern at NVIDIA, Sep. 2018 - Dec. 2019 |
|
|
|
University of Toronto, 2021 - 2024: Doctor of Philosophy - PhD
University of Toronto, 2018 - 2020: Master of Science - MS, Artificial Intelligence
University of Toronto, 2014 - 2018: Bachelor's degree, Computer Science |
|
I’m incredibly proud to work alongside world-class students and interns at the Toronto AI Lab. We recruit PhD interns year-round and usually send offers from September to December. Feel free to reach out if you're interested in joining us. |
|
Sherwin Bahmani, CS Ph.D. student at University of Toronto
Runjian Chen, CS Ph.D. student at University of Hong Kong and HKU-MMLab
Jiawei Ren, CS Ph.D. student at MMLab@NTU
Jay Zhangjie Wu, CS Ph.D. student at Show Lab, National University of Singapore
Chenfeng Xu, CS Ph.D. student at UC Berkeley
Yuxuan Zhang, CS Ph.D. student at Princeton University
Bowen Chen, CS undergrad student at University of Toronto |
|
|
|
I recognize the information asymmetry between junior and senior students regarding research topics, career directions, and navigating the challenges and excitement of research. This problem is more severe for people from underrepresented groups.
Following Krishna, Wei-Chiu, Shangzhe, and Jun, I have decided to commit 1 hour per week to hosting pro bono office hours to help reduce this information asymmetry. Please send me an email if you are interested. |
|
|