# Understanding Physical Dynamics with Counterfactual World Modeling

[**Rahul Venkatesh***](https://rahulvenkk.github.io/)1 · [**Honglin Chen***](https://web.stanford.edu/~honglinc/)1 · [**Kevin Feigelis***](https://neuroscience.stanford.edu/people/kevin-t-feigelis)1 · [**Daniel M. Bear**](https://twitter.com/recursus?lang=en)1 · [**Khaled Jedoui**](https://web.stanford.edu/~thekej/)1 · [**Klemen Kotar**](https://klemenkotar.github.io/)1 · [**Felix Binder**](https://ac.felixbinder.net/)2 · [**Wanhee Lee**](https://www.linkedin.com/in/wanhee-lee-31102820b/)1 · [**Sherry Liu**](https://neuroailab.github.io/cwm-physics/)1 · [**Kevin A. Smith**](https://www.mit.edu/~k2smith/)3 · [**Judith E. Fan**](https://cogtoolslab.github.io/)1 · [**Daniel L. K. Yamins**](https://stanford.edu/~yamins/)1

(* equal contribution)

1Stanford · 2UCSD · 3MIT

This work presents the Counterfactual World Modeling (CWM) framework. CWM supports counterfactual prediction and the extraction of visual structures useful for understanding physical dynamics.

![](assets/cwm_teaser.gif)

## 📣 News

- 2024-06-01: Released the [project page](https://neuroailab.github.io) and [code](https://github.com/rahulvenkk/cwm_release.git)

## 🔨 Installation

```
git clone https://github.com/rahulvenkk/cwm_release.git
# Enter the cloned repository before the editable install
cd cwm_release
pip install -e .
```

## ✨ Usage

To download and use a pre-trained model, run the following:

```
from cwm.model.model_factory import model_factory

# Builds the matching model class and fetches its weights
model = model_factory.load_model('vitbase_8x8patch_3frames_1tube')
```

This automatically initializes the appropriate model class and downloads the specified weights to your `$CACHE` directory.

## 🔄 Pre-training

To train the model, run the following script:

```
./scripts/pretrain/3frame_patch8x8_mr0.90_gpu.sh
```
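
The model identifier passed to `load_model` in the Usage section, `vitbase_8x8patch_3frames_1tube`, appears to encode the backbone, patch size, number of input frames, and tubelet size. The sketch below is a hypothetical helper, not part of the `cwm` package, that splits such a name into fields and counts the patches it implies. The field meanings are inferred from the name alone, and the 224-pixel input resolution is an assumption used only for illustration.

```python
import re

# Hypothetical pattern inferred from the identifier shown in the Usage section;
# it is not taken from the cwm codebase.
_NAME_PATTERN = re.compile(
    r"(?P<backbone>[a-z0-9]+)_"
    r"(?P<patch_h>\d+)x(?P<patch_w>\d+)patch_"
    r"(?P<frames>\d+)frames_"
    r"(?P<tube>\d+)tube"
)


def parse_model_name(name: str) -> dict:
    """Split a name like 'vitbase_8x8patch_3frames_1tube' into its fields."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    fields = match.groupdict()
    return {
        "backbone": fields["backbone"],
        "patch_size": (int(fields["patch_h"]), int(fields["patch_w"])),
        "num_frames": int(fields["frames"]),
        "tubelet_size": int(fields["tube"]),
    }


def patches_per_frame(cfg: dict, image_size: int = 224) -> int:
    """Count non-overlapping patches in one square frame.

    image_size=224 is an assumed resolution, not a value stated in this README.
    """
    patch_h, patch_w = cfg["patch_size"]
    return (image_size // patch_h) * (image_size // patch_w)


cfg = parse_model_name("vitbase_8x8patch_3frames_1tube")
per_frame = patches_per_frame(cfg)
print(cfg)
print("patches per frame:", per_frame)
print("patches across all frames:", per_frame * cfg["num_frames"])
```

Under the assumed resolution, an 8x8 patch grid yields far more tokens per frame than the 16x16 patches common in ViT models, which is worth keeping in mind when estimating memory for pre-training.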