Thinking with Spatial Code for Physical-World Video Reasoning
Jieneng Chen*, Wenxin Ma*, Ruisheng Yuan*, Yunzhi Zhang*, Jiajun Wu†, Alan Yuille†
Johns Hopkins University & Stanford University
Spatial Code: transform visual scenes into structured, executable 3D representations for spatial reasoning.
Stay tuned!
- arXiv paper: released on March 5 (arXiv:2603.05591)
- Codebase: releasing by March 17
- Reinforcement training details: releasing by March 22
- Reproducible models: releasing by March 31
If you find this work useful, please consider citing:
@article{chen2025spatialcode,
title={Thinking with Spatial Code for Physical-World Video Reasoning},
author={Chen, Jieneng and Ma, Wenxin and Yuan, Ruisheng and Zhang, Yunzhi and Wu, Jiajun and Yuille, Alan},
journal={arXiv preprint arXiv:2603.05591},
year={2026}
}