<!--
* @Date: 2021-01-13 20:32:12
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-07-12 15:27:41
* @FilePath: /EasyMocapRelease/Readme.md
-->
# EasyMocap
**EasyMocap** is an open-source toolbox for **markerless human motion capture** from RGB videos. In this project, we provide motion capture demos for a variety of settings.

![python](https://img.shields.io/github/languages/top/zju3dv/EasyMocap)
![star](https://img.shields.io/github/stars/zju3dv/EasyMocap?style=social)

---
## Core features
### Multiple views of a single person
[![report](https://img.shields.io/badge/quickstart-green)](./doc/quickstart.md)

This is the basic code for fitting SMPL[1]/SMPL+H[2]/SMPL-X[3]/MANO[2] models to capture body, hand, and face poses from multiple views.
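To illustrate the general idea of combining calibrated views, the sketch below triangulates a single 3D keypoint from one ray per camera by least squares. It is not EasyMocap's implementation (see the quickstart for the actual pipeline). The function names `triangulate_rays`, `solve3`, and `det3` are hypothetical, and the camera centers and ray directions are made-up values; in practice the rays would be derived from 2D keypoint detections and the calibrated camera parameters.

```python
import math


def det3(M):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def solve3(A, b):
    """Solve the 3x3 linear system A x = b with Cramer's rule."""
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("rays are (nearly) parallel; more distinct views are needed")
    solution = []
    for k in range(3):
        # Replace column k of A with b.
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(Ak) / det)
    return solution


def triangulate_rays(centers, directions):
    """Return the 3D point closest (in least squares) to all rays.

    centers: camera centers in world coordinates, one (x, y, z) per view.
    directions: ray directions through the 2D detection, one per view.
    """
    # Normal equations: sum(I - d d^T) X = sum((I - d d^T) c).
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        for i in range(3):
            for j in range(3):
                m = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += m
                b[i] += m * c[j]
    return solve3(A, b)


# Hypothetical setup: three cameras looking at a keypoint near (1, 2, 5).
centers = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
directions = [(1.0, 2.0, 5.0), (-3.0, 2.0, 5.0), (1.0, -2.0, 5.0)]
point = triangulate_rays(centers, directions)
print(point)  # approximately [1.0, 2.0, 5.0]
```

With noisy detections the rays no longer meet in a single point, which is why a least-squares solution over all views is used rather than an exact intersection of two rays.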
<div align="center">
<img src="doc/feng/mv1pmf-smplx.gif" width="80%">
<br>
<sup>Videos are from ZJU-MoCap, with 23 calibrated and synchronized cameras.</sup>
</div>
<div align="center">
<img src="doc/feng/mano.gif" width="80%">
<br>
<sup>Captured with 8 cameras.</sup>
</div>

### Internet video with a mirror
[![report](https://img.shields.io/badge/CVPR21-mirror-red)](https://arxiv.org/pdf/2104.00340.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](https://github.com/zju3dv/Mirrored-Human)
<div align="center">
<img src="https://raw.githubusercontent.com/zju3dv/Mirrored-Human/main/doc/assets/smpl-avatar.gif" width="80%">
<br>
<sup>The raw video is from <a href="https://www.youtube.com/watch?v=KOCJJ27hhIE">YouTube</a>.</sup>
</div>
<div align="center">
<img src="doc/imocap/mv1p-mirror.gif" width="80%"><br/>
<sup>Captured with 6 cameras and a mirror</sup>
</div>

### Multiple Internet videos with a specific action (Coming soon)
[![report](https://img.shields.io/badge/ECCV20-imocap-red)](https://arxiv.org/pdf/2008.07931.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](./doc/todo.md)
<div align="center">
<img src="doc/imocap/imocap.gif" width="80%"><br/>
<sup>Internet videos of Roger Federer serving</sup>
</div>

### Multiple views of multiple people
[![report](https://img.shields.io/badge/CVPR19-mvpose-red)](https://arxiv.org/pdf/1901.04111.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](./doc/mvmp.md)
<div align="center">
<img src="doc/assets/mvmp1f.gif" width="80%"><br/>
<sup>Captured with 8 consumer cameras</sup>
</div>

### Novel view synthesis from sparse views
[![report](https://img.shields.io/badge/CVPR21-neuralbody-red)](https://arxiv.org/pdf/2012.15838.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](https://github.com/zju3dv/neuralbody)
<div align="center">
<img src="doc/neuralbody/sida-frame0.jpg" width="80%"><br/>
<img src="doc/neuralbody/sida.gif" width="80%"><br/>
<sup>Captured with 8 consumer cameras</sup>
</div>

## ZJU-MoCap
With our proposed method, we release two large datasets of human motion: LightStage and Mirrored-Human. See the [website](https://chingswy.github.io/Dataset-Demo/) for more details.
## Other features
### 3D Realtime visualization
[![quickstart](https://img.shields.io/badge/quickstart-green)](./doc/realtime_visualization.md)
<div align="center">
<img src="https://raw.githubusercontent.com/chingswy/Dataset-Demo/main/assets/vis3d/skel-body25.gif" width="26%">
<img src="https://raw.githubusercontent.com/chingswy/Dataset-Demo/main/assets/vis3d/skel-total.gif" width="26%">
<img src="https://raw.githubusercontent.com/chingswy/Dataset-Demo/main/assets/vis3d/skel-multi.gif" width="26%">
</div>
<div align="center">
<img src="https://raw.githubusercontent.com/chingswy/Dataset-Demo/main/assets/vis3d/mesh-smpl.gif" width="26%">
<img src="https://raw.githubusercontent.com/chingswy/Dataset-Demo/main/assets/vis3d/mesh-smplx.gif" width="26%">
<img src="https://raw.githubusercontent.com/chingswy/Dataset-Demo/main/assets/vis3d/mesh-manol.gif" width="26%">
</div>

### Other
- [Camera calibration](apps/calibration/Readme.md): a simple calibration tool based on OpenCV
- [Pose guided synchronization](./doc/todo.md) (coming soon)
- [Annotator](apps/annotation/Readme.md): a simple GUI annotator based on OpenCV
- [Export to multiple data formats (bvh, asf/amc, ...)](./doc/02_output.md)
## Updates
- 06/28/2021: The **Multi-view Multi-person** part is released!
- 06/10/2021: The **real-time 3D visualization** part is released!
- 04/11/2021: The calibration tool and the annotator are released.
- 04/11/2021: **Mirrored-Human** part is released.
## Installation
See [doc/install](./doc/installation.md) for more instructions.
## Acknowledgements
Here are the great works this project is built upon:
- The SMPL models and layers are from the MPII [SMPL-X model](https://github.com/vchoutas/smplx).
- Some functions are borrowed from [SPIN](https://github.com/nkolot/SPIN), [VIBE](https://github.com/mkocabas/VIBE), and [SMPLify-X](https://github.com/vchoutas/smplify-x).
- The method for fitting the 3D skeleton and the SMPL model is similar to [TotalCapture](http://www.cs.cmu.edu/~hanbyulj/totalcapture/), but does not use point clouds.
- We integrate some easy-to-use functions for previous great work:
    - `easymocap/estimator/SPIN`: an SMPL estimator[5]
    - `easymocap/estimator/YOLOv4`: an object detector[6] (coming soon)
    - `easymocap/estimator/HRNet`: a 2D human pose estimator[7] (coming soon)

We would also like to thank Wenduo Feng, Di Huang, Yuji Chen, Hao Xu, Qing Shuai, Qi Fang, Ting Xie, Junting Dong, Sida Peng and Xiaopeng Ji, who are the performers in the sample data.
## Contact
Please open an issue if you have any questions. We appreciate all contributions to improve our project.
## Citation
This project is part of our work on [iMocap](https://zju3dv.github.io/iMoCap/), [Mirrored-Human](https://zju3dv.github.io/Mirrored-Human/) and [Neural Body](https://zju3dv.github.io/neuralbody/).

Please consider citing these works if you find this repo useful for your projects.
```bibtex
@inproceedings{dong2020motion,
    title={Motion capture from internet videos},
    author={Dong, Junting and Shuai, Qing and Zhang, Yuanqing and Liu, Xian and Zhou, Xiaowei and Bao, Hujun},
    booktitle={European Conference on Computer Vision},
    pages={210--227},
    year={2020},
    organization={Springer}
}

@inproceedings{peng2021neural,
    title={Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans},
    author={Peng, Sida and Zhang, Yuanqing and Xu, Yinghao and Wang, Qianqian and Shuai, Qing and Bao, Hujun and Zhou, Xiaowei},
    booktitle={CVPR},
    year={2021}
}

@inproceedings{fang2021mirrored,
    title={Reconstructing 3D Human Pose by Watching Humans in the Mirror},
    author={Fang, Qi and Shuai, Qing and Dong, Junting and Bao, Hujun and Zhou, Xiaowei},
    booktitle={CVPR},
    year={2021}
}

@inproceedings{dong2021fast,
    title={Fast and Robust Multi-Person 3D Pose Estimation and Tracking from Multiple Views},
    author={Dong, Junting and Fang, Qi and Jiang, Wen and Yang, Yurou and Bao, Hujun and Zhou, Xiaowei},
    booktitle={T-PAMI},
    year={2021}
}
```
## Reference
```bash
[1] Loper, Matthew, et al. "SMPL: A skinned multi-person linear model." ACM transactions on graphics (TOG) 34.6 (2015): 1-16.
[2] Romero, Javier, Dimitrios Tzionas, and Michael J. Black. "Embodied hands: Modeling and capturing hands and bodies together." ACM Transactions on Graphics (ToG) 36.6 (2017): 1-17.
[3] Pavlakos, Georgios, et al. "Expressive body capture: 3d hands, face, and body from a single image." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
Bogo, Federica, et al. "Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image." European conference on computer vision. Springer, Cham, 2016.
[4] Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: Openpose: real-time multi-person 2d pose estimation using part affinity fields. arXiv preprint arXiv:1812.08008 (2018)
[5] Kolotouros, Nikos, et al. "Learning to reconstruct 3D human pose and shape via model-fitting in the loop." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
[6] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. "Yolov4: Optimal speed and accuracy of object detection." arXiv preprint arXiv:2004.10934 (2020).
[7] Sun, Ke, et al. "Deep high-resolution representation learning for human pose estimation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
```