🚀 update to v0.2

shuaiqing 2021-04-14 15:22:51 +08:00
parent 3f2aec6b44
commit da43e1bf5a
67 changed files with 15004 additions and 163 deletions

2
.gitignore vendored

@ -110,3 +110,5 @@ exp/coco*
output output
data data
.DS* .DS*
code_deprecate
code


@ -2,7 +2,7 @@
* @Date: 2021-01-13 20:32:12 * @Date: 2021-01-13 20:32:12
* @Author: Qing Shuai * @Author: Qing Shuai
* @LastEditors: Qing Shuai * @LastEditors: Qing Shuai
* @LastEditTime: 2021-04-02 12:26:56 * @LastEditTime: 2021-04-13 17:42:17
* @FilePath: /EasyMocapRelease/Readme.md * @FilePath: /EasyMocapRelease/Readme.md
--> -->
@ -13,11 +13,13 @@
![python](https://img.shields.io/github/languages/top/zju3dv/EasyMocap) ![python](https://img.shields.io/github/languages/top/zju3dv/EasyMocap)
![star](https://img.shields.io/github/stars/zju3dv/EasyMocap?style=social) ![star](https://img.shields.io/github/stars/zju3dv/EasyMocap?style=social)
---- ---
## Core features ## Core features
### Multiple views of single person ### Multiple views of a single person
[![report](https://img.shields.io/badge/quickstart-green)](./doc/quickstart.md)
This is the basic code for fitting SMPL[1]/SMPL+H[2]/SMPL-X[3] model to capture body+hand+face poses from multiple views. This is the basic code for fitting SMPL[1]/SMPL+H[2]/SMPL-X[3] model to capture body+hand+face poses from multiple views.
@ -29,7 +31,7 @@ This is the basic code for fitting SMPL[1]/SMPL+H[2]/SMPL-X[3] model to capture
### Internet video with a mirror ### Internet video with a mirror
[![report](https://img.shields.io/badge/mirrored-link-red)](https://arxiv.org/pdf/2104.00340.pdf) [![report](https://img.shields.io/badge/CVPR21-mirror-red)](https://arxiv.org/pdf/2104.00340.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](https://github.com/zju3dv/Mirrored-Human)
<div align="center"> <div align="center">
<img src="https://raw.githubusercontent.com/zju3dv/Mirrored-Human/main/doc/assets/smpl-avatar.gif" width="80%"> <img src="https://raw.githubusercontent.com/zju3dv/Mirrored-Human/main/doc/assets/smpl-avatar.gif" width="80%">
@ -39,46 +41,36 @@ This is the basic code for fitting SMPL[1]/SMPL+H[2]/SMPL-X[3] model to capture
### Multiple Internet videos with a specific action (Coming soon) ### Multiple Internet videos with a specific action (Coming soon)
[![report](https://img.shields.io/badge/imocap-link-red)](https://arxiv.org/pdf/2008.07931.pdf) [![report](https://img.shields.io/badge/ECCV20-imocap-red)](https://arxiv.org/pdf/2008.07931.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](./doc/todo.md)
<div align="center"> ### Multiple views of multiple people (Coming soon)
<img src="doc/imocap/frame_00036_036.jpg" width="80%">
</div>
### Multiple views of multiple people (Comming soon)
[![report](https://img.shields.io/badge/mvpose-link-red)](https://arxiv.org/pdf/1901.04111.pdf)
[![report](https://img.shields.io/badge/CVPR20-mvpose-red)](https://arxiv.org/pdf/1901.04111.pdf) [![quickstart](https://img.shields.io/badge/quickstart-green)](./doc/todo.md)
### Others ### Others
This project is used by many other projects: This project is used by many other projects:
- [[CVPR21] Dense Reconstruction and View Synthesis from **Sparse Views**](https://zju3dv.github.io/neuralbody/) - [[CVPR21] Dense Reconstruction and View Synthesis from **Sparse Views**](https://zju3dv.github.io/neuralbody/)
## Other features ## Other features
- [Camera calibration](./doc/todo.md) - [Camera calibration](apps/calibration/Readme.md): a simple calibration tool based on OpenCV
- [Pose guided synchronization](./doc/todo.md) - [Pose guided synchronization](./doc/todo.md) (coming soon)
- [Annotator](./doc/todo.md) - [Annotator](apps/calibration/Readme.md): a simple GUI annotator based on OpenCV
- [Exporting of multiple data formats(bvh, asf/amc, ...)](./doc/todo.md) - [Exporting of multiple data formats(bvh, asf/amc, ...)](./doc/02_output.md)
## Updates ## Updates
- 04/02/2021: We are now rebuilding our project for `v0.2`, please stay tuned. `v0.1` is available at [this link](https://github.com/zju3dv/EasyMocap/releases/tag/v0.1).
- 04/12/2021: Mirrored-Human part is released. We also release the calibration tool and the annotator.
## Installation ## Installation
See [doc/install](./doc/install.md) for more instructions. See [doc/install](./doc/installation.md) for more instructions.
## Quick Start
See [doc/quickstart](doc/quickstart.md) for more instructions.
## Not Quick Start
See [doc/notquickstart](doc/notquickstart.md) for more instructions.
## Evaluation ## Evaluation
The weight parameters can be set according your data. The weight parameters can be set according to your data.
More quantitative reports will be added in [doc/evaluation.md](doc/evaluation.md) More quantitative reports will be added in [doc/evaluation.md](doc/evaluation.md)
@ -88,7 +80,11 @@ Here are the great works this project is built upon:
- SMPL models and layer are from MPII [SMPL-X model](https://github.com/vchoutas/smplx). - SMPL models and layer are from MPII [SMPL-X model](https://github.com/vchoutas/smplx).
- Some functions are borrowed from [SPIN](https://github.com/nkolot/SPIN), [VIBE](https://github.com/mkocabas/VIBE), [SMPLify-X](https://github.com/vchoutas/smplify-x) - Some functions are borrowed from [SPIN](https://github.com/nkolot/SPIN), [VIBE](https://github.com/mkocabas/VIBE), [SMPLify-X](https://github.com/vchoutas/smplify-x)
- The method for fitting 3D skeleton and SMPL model is similar to [TotalCapture](http://www.cs.cmu.edu/~hanbyulj/totalcapture/), without using point cloud. - The method for fitting 3D skeleton and SMPL model is similar to [TotalCapture](http://www.cs.cmu.edu/~hanbyulj/totalcapture/), without using point clouds.
- We integrate some easy-to-use functions for previous great work:
- `easymocap/estimator/SPIN` : an SMPL estimator[5]
- `easymocap/estimator/YOLOv4`: an object detector[6](Coming soon)
- `easymocap/estimator/HRNet` : a 2D human pose estimator[7](Coming soon)
We also would like to thank Wenduo Feng who is the performer in the sample data. We also would like to thank Wenduo Feng who is the performer in the sample data.
@ -128,10 +124,14 @@ Please consider citing these works if you find this repo is useful for your proj
``` ```
## Reference ## Reference
```bash ```bash
[1] Loper, Matthew, et al. "SMPL: A skinned multi-person linear model." ACM transactions on graphics (TOG) 34.6 (2015): 1-16. [1] Loper, Matthew, et al. "SMPL: A skinned multi-person linear model." ACM transactions on graphics (TOG) 34.6 (2015): 1-16.
[2] Romero, Javier, Dimitrios Tzionas, and Michael J. Black. "Embodied hands: Modeling and capturing hands and bodies together." ACM Transactions on Graphics (ToG) 36.6 (2017): 1-17. [2] Romero, Javier, Dimitrios Tzionas, and Michael J. Black. "Embodied hands: Modeling and capturing hands and bodies together." ACM Transactions on Graphics (ToG) 36.6 (2017): 1-17.
[3] Pavlakos, Georgios, et al. "Expressive body capture: 3d hands, face, and body from a single image." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. [3] Pavlakos, Georgios, et al. "Expressive body capture: 3d hands, face, and body from a single image." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
Bogo, Federica, et al. "Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image." European conference on computer vision. Springer, Cham, 2016. Bogo, Federica, et al. "Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image." European conference on computer vision. Springer, Cham, 2016.
[4] Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: Openpose: real-time multi-person 2d pose estimation using part affinity fields. arXiv preprint arXiv:1812.08008 (2018) [4] Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: Openpose: real-time multi-person 2d pose estimation using part affinity fields. arXiv preprint arXiv:1812.08008 (2018)
[5] Kolotouros, Nikos, et al. "Learning to reconstruct 3D human pose and shape via model-fitting in the loop." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019
[6] Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao. "Yolov4: Optimal speed and accuracy of object detection." arXiv preprint arXiv:2004.10934 (2020).
[7] Sun, Ke, et al. "Deep high-resolution representation learning for human pose estimation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
``` ```

61
apps/annotation/Readme.md Normal file

@ -0,0 +1,61 @@
<!--
* @Date: 2021-04-13 17:30:14
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-04-13 17:36:26
* @FilePath: /EasyMocapRelease/apps/annotation/Readme.md
-->
# EasyMocap - Annotator
## Usage
### Example
It takes about a minute to learn the annotator. Start by running our example script:
```bash
python3 apps/annotation/annot_example.py ${data}
```
#### Mouse
In this example, you can try the two basic operations: `click` and `move`.
- `click`: press the left mouse button. Use it to select something.
- `move`: hold the left mouse button and drag the mouse. Use it to draw a line or move something.
#### Keyboard
We list some common keys:
|key|usage|
|----|----|
|`h`|help|
|`w`, `a`, `s`, `d`|switch the frame|
|`q`|quit|
|`p`|start/stop recording the frame|
## Annotate tracking
```bash
python3 apps/annotation/annot_track.py ${data}
```
- `click` the center of a bbox to select a person
- press `0-9` to set the person's ID
- press `x` to delete the bbox
- `drag` a corner to reshape the bounding box
- press `t` to track the person to the previous frame
## Annotate vanishing line
```bash
python3 apps/annotation/annot_vanish.py ${data}
```
- `drag` to draw a line
- `X`, `Y`, `Z` to add this line to the set of vanishing lines of the corresponding axis
- `k` to calculate the intrinsic matrix from the vanishing points of the x and y directions
- `b` to calculate the vanishing point from human keypoints
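
The relation behind `k` can be sketched as follows. For two vanishing points of orthogonal scene directions, and assuming square pixels, zero skew, and a principal point at the image center, the focal length follows from a single dot product. This is a minimal sketch of that textbook relation, not necessarily what `get_calc_intrinsic` does internally; the function names below are hypothetical.

```python
import math


def focal_from_vanishing_points(vp_a, vp_b, principal_point):
    """Estimate the focal length (in pixels) from two vanishing points.

    Assumes the two vanishing points belong to orthogonal scene directions,
    square pixels, zero skew, and a known principal point.
    """
    cx, cy = principal_point
    # Orthogonal directions satisfy (vp_a - c) . (vp_b - c) + f^2 = 0.
    dot = (vp_a[0] - cx) * (vp_b[0] - cx) + (vp_a[1] - cy) * (vp_b[1] - cy)
    if dot >= 0:
        raise ValueError('vanishing points are inconsistent with orthogonal directions')
    return math.sqrt(-dot)


def intrinsic_matrix(focal, principal_point):
    """Build a 3x3 intrinsic matrix K as nested lists."""
    cx, cy = principal_point
    return [[focal, 0.0, cx],
            [0.0, focal, cy],
            [0.0, 0.0, 1.0]]


if __name__ == '__main__':
    # Hypothetical 1920x1080 image with the principal point at its center.
    center = (960.0, 540.0)
    f = focal_from_vanishing_points((1960.0, 540.0), (-40.0, 540.0), center)
    print(intrinsic_matrix(f, center))
```

In practice the annotated lines are noisy, so vanishing points far outside the image give a more stable estimate than points close to the principal point.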
## Annotate keypoints (coming soon)
## Annotate calibration board (coming soon)
## Define your annotator


@ -0,0 +1,26 @@
# This script shows an example of our annotator
from easymocap.annotator import ImageFolder
from easymocap.annotator import vis_point, vis_line
from easymocap.annotator import AnnotBase

def annot_example(path):
    # define datasets
    dataset = ImageFolder(path)
    # define visualize
    vis_funcs = [vis_point, vis_line]
    # construct annotations
    annotator = AnnotBase(
        dataset=dataset,
        key_funcs={},
        vis_funcs=vis_funcs)
    while annotator.isOpen:
        annotator.run()

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str, default='/home/')
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    annot_example(args.path)


@ -0,0 +1,48 @@
'''
@ Date: 2021-03-28 21:22:38
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-28 21:23:19
@ FilePath: /EasyMocap/annotation/annot_track.py
'''
import os
from os.path import join
from easymocap.annotator import ImageFolder
from easymocap.annotator import plot_text, plot_bbox_body, vis_active_bbox, vis_line
from easymocap.annotator import AnnotBase
from easymocap.annotator import callback_select_bbox_corner, callback_select_bbox_center, auto_pose_track

def annot_example(path, subs, annot, step):
    for sub in subs:
        # define datasets
        dataset = ImageFolder(path, sub=sub, annot=annot)
        key_funcs = {
            't': auto_pose_track
        }
        callbacks = [callback_select_bbox_corner, callback_select_bbox_center]
        # define visualize
        vis_funcs = [vis_line, plot_bbox_body, vis_active_bbox]
        # construct annotations
        annotator = AnnotBase(
            dataset=dataset,
            key_funcs=key_funcs,
            vis_funcs=vis_funcs,
            callbacks=callbacks,
            name=sub,
            step=step)
        while annotator.isOpen:
            annotator.run()

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str)
    parser.add_argument('--sub', type=str, nargs='+', default=[],
        help='the sub folder lists when in video mode')
    parser.add_argument('--annot', type=str, default='annots')
    parser.add_argument('--step', type=int, default=100)
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    if len(args.sub) == 0:
        args.sub = sorted(os.listdir(join(args.path, 'images')))
    annot_example(args.path, annot=args.annot, subs=args.sub, step=args.step)


@ -0,0 +1,42 @@
# This script shows an example to annotate vanishing lines
from easymocap.annotator import ImageFolder
from easymocap.annotator import plot_text, plot_skeleton_simple, vis_active_bbox, vis_line
from easymocap.annotator import AnnotBase
from easymocap.annotator.vanish_callback import get_record_vanish_lines, get_calc_intrinsic, clear_vanish_points, vanish_point_from_body, copy_edges, clear_body_points
from easymocap.annotator.vanish_visualize import vis_vanish_lines

def annot_example(path, annot, sub=None, step=100):
    # define datasets
    dataset = ImageFolder(path, sub=sub, annot=annot)
    key_funcs = {
        'X': get_record_vanish_lines(0),
        'Y': get_record_vanish_lines(1),
        'Z': get_record_vanish_lines(2),
        'k': get_calc_intrinsic('xy'),
        'K': get_calc_intrinsic('yz'),
        'b': vanish_point_from_body,
        'C': clear_vanish_points,
        'B': clear_body_points,
        'c': copy_edges,
    }
    # define visualize
    vis_funcs = [vis_line, plot_skeleton_simple, vis_vanish_lines]
    # construct annotations
    annotator = AnnotBase(
        dataset=dataset,
        key_funcs=key_funcs,
        vis_funcs=vis_funcs,
        step=step)
    while annotator.isOpen:
        annotator.run()

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str)
    parser.add_argument('--annot', type=str, default='annots')
    parser.add_argument('--step', type=int, default=100)
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    annot_example(args.path, args.annot, step=args.step)


@ -0,0 +1,82 @@
<!--
* @Date: 2021-03-02 16:14:48
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-03-27 21:56:34
* @FilePath: /EasyMocap/scripts/calibration/Readme.md
-->
# Camera Calibration
Before reading this document, you should read the OpenCV-Python Tutorials of [Camera Calibration](https://docs.opencv.org/master/dc/dbb/tutorial_py_calibration.html) carefully.
## Some Tips
1. Use a chessboard as big as possible.
2. You must keep the same resolution during all the steps.
## 0. Prepare your chessboard
## 1. Record videos
Usually, we need to record two sets of videos, one for intrinsic parameters and one for extrinsic parameters.
First, you should record a video with your chessboard for each camera separately. The videos of each camera should be placed into the `<intri_data>/videos` directory. The following code will take the file name as the name of each camera.
```bash
<intri_data>
└── videos
    ├── 01.mp4
    ├── 02.mp4
    ├── ...
    └── xx.mp4
```
For the extrinsic parameters, you should place the chessboard pattern where it will be visible to all the cameras (on the floor for example) and then take a picture or a short video on all of the cameras.
```bash
<extri_data>
└── videos
    ├── 01.mp4
    ├── 02.mp4
    ├── ...
    └── xx.mp4
```
## 2. Detect the chessboard
For both intrinsic and extrinsic parameters, we need to detect the corners of the chessboard. In this step, we first extract images from the videos, then detect the corners and write them to disk.
```bash
# extract images from the videos
python3 scripts/preprocess/extract_video.py ${data} --no2d
# detect chessboard
python3 apps/calibration/detect_chessboard.py ${data} --out ${data}/output/calibration --pattern 9,6 --grid 0.1
```
The results will be saved in `${data}/chessboard`, and the visualization will be saved in `${data}/output/calibration`.
To describe your chessboard, set the options `--pattern` (number of corners along each side) and `--grid` (grid size in meters).
Repeat this step for `<intri_data>` and `<extri_data>`.
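
To make `--pattern` and `--grid` concrete, here is a minimal pure-Python sketch of the 3D corner coordinates that a `9,6` pattern with a `0.1` m grid corresponds to. It follows the ordering of `get_object` in `apps/calibration/detect_chessboard.py` (short side along x, long side along y, z = 0); the function name is hypothetical.

```python
def chessboard_object_points(pattern, grid_size):
    """Return the 3D corner coordinates of a planar chessboard.

    pattern:   (corners along the long side, corners along the short side)
    grid_size: edge length of one square, in meters
    """
    n_long, n_short = pattern
    points = []
    for j in range(n_short):      # index along the short side -> x
        for i in range(n_long):   # index along the long side  -> y (varies fastest)
            points.append((j * grid_size, i * grid_size, 0.0))
    return points


if __name__ == '__main__':
    corners = chessboard_object_points((9, 6), 0.1)
    print(len(corners), corners[:3])
```

All corners lie in the plane z = 0, so the board itself defines the world coordinate frame used for the extrinsic parameters.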
## 3. Intrinsic Parameter Calibration
```bash
python3 apps/calibration/calib_intri.py ${data} --step 5
```
## 4. Extrinsic Parameter Calibration
```bash
python3 apps/calibration/calib_extri.py ${extri} --intri ${intri}/output/intri.yml
```
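
The script prints the center of each camera after solving PnP. Assuming the usual convention `x_cam = R @ x_world + T`, the center in world coordinates is `C = -R^T T`. A minimal sketch without NumPy (the helper name is hypothetical):

```python
def camera_center(R, T):
    """Camera center in world coordinates for x_cam = R @ x_world + T.

    R is a 3x3 rotation matrix (nested lists), T a length-3 translation.
    """
    # C = -R^T T; element i of R^T T is the dot product of column i of R with T.
    return [-sum(R[k][i] * T[k] for k in range(3)) for i in range(3)]


if __name__ == '__main__':
    # Hypothetical camera rotated 90 degrees about the z-axis.
    R = [[0, -1, 0],
         [1, 0, 0],
         [0, 0, 1]]
    T = [1, 0, 0]
    print(camera_center(R, T))
```

Printing the centers is a quick sanity check: the distances between them should roughly match the physical camera layout.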
## 5. (Optional) Bundle Adjustment
Coming soon
## 6. Check the calibration
1. Check the calibration results with chessboard:
```bash
python3 apps/calibration/check_calib.py ${extri} --out ${intri}/output --vis --show
```
Check the results with a cube.
```bash
python3 apps/calibration/check_calib.py ${extri} --out ${extri}/output --cube
```
2. (TODO) Check the calibration results with people.
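
The number reported by the chessboard check is a confidence-weighted mean reprojection error in pixels. The sketch below shows the idea for a single camera, assuming a 3x4 projection matrix `P` and detections stored as `(u, v, conf)`; the helper names are hypothetical, and the real script additionally triangulates the corners from all views first.

```python
import math


def project(P, point3d):
    """Project a 3D point with a 3x4 projection matrix; returns pixel (u, v)."""
    X = list(point3d) + [1.0]
    u, v, w = (sum(P[r][c] * X[c] for c in range(4)) for r in range(3))
    return u / w, v / w


def mean_reprojection_error(P, points3d, points2d):
    """Confidence-weighted mean pixel distance between projections and detections."""
    total, weight = 0.0, 0.0
    for X, (u, v, conf) in zip(points3d, points2d):
        pu, pv = project(P, X)
        total += conf * math.hypot(pu - u, pv - v)
        weight += conf
    return total / weight if weight > 0 else float('nan')


if __name__ == '__main__':
    # Hypothetical camera at the origin: f = 1000 px, principal point (960, 540).
    P = [[1000.0, 0.0, 960.0, 0.0],
         [0.0, 1000.0, 540.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
    points3d = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    points2d = [(960.0, 540.0, 1.0), (1463.0, 544.0, 1.0)]
    print(mean_reprojection_error(P, points3d, points2d))
```

Detections with zero confidence contribute nothing to either the sum or the count, so undetected corners do not bias the result.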


@ -0,0 +1,46 @@
'''
@ Date: 2021-03-02 16:13:03
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-27 22:08:18
@ FilePath: /EasyMocap/scripts/calibration/calib_extri.py
'''
import os
from glob import glob
from os.path import join
import numpy as np
import cv2
from easymocap.mytools import read_intri, write_extri, read_json

def calib_extri(path, intriname):
    assert os.path.exists(intriname), intriname
    intri = read_intri(intriname)
    camnames = list(intri.keys())
    extri = {}
    for ic, cam in enumerate(camnames):
        imagenames = sorted(glob(join(path, 'images', cam, '*.jpg')))
        chessnames = sorted(glob(join(path, 'chessboard', cam, '*.json')))
        chessname = chessnames[0]
        data = read_json(chessname)
        k3d = np.array(data['keypoints3d'], dtype=np.float32)
        k3d[:, 0] *= -1
        k2d = np.array(data['keypoints2d'], dtype=np.float32)
        k2d = np.ascontiguousarray(k2d[:, :-1])
        ret, rvec, tvec = cv2.solvePnP(k3d, k2d, intri[cam]['K'], intri[cam]['dist'])
        extri[cam] = {}
        extri[cam]['Rvec'] = rvec
        extri[cam]['R'] = cv2.Rodrigues(rvec)[0]
        extri[cam]['T'] = tvec
        center = - extri[cam]['R'].T @ tvec
        print('{} center => {}'.format(cam, center.squeeze()))
    write_extri(join(os.path.dirname(intriname), 'extri.yml'), extri)

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str)
    parser.add_argument('--intri', type=str)
    parser.add_argument('--step', type=int, default=1)
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    calib_extri(args.path, intriname=args.intri)


@ -0,0 +1,50 @@
'''
@ Date: 2021-03-02 16:12:59
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-02 16:12:59
@ FilePath: /EasyMocap/scripts/calibration/calib_intri.py
'''
# This script calibrates the intrinsic parameters of each camera
from easymocap.mytools import write_intri
import numpy as np
import cv2
import os
from os.path import join
from glob import glob
from easymocap.mytools import read_json, Timer

def calib_intri(path, step):
    camnames = sorted(os.listdir(join(path, 'images')))
    cameras = {}
    for ic, cam in enumerate(camnames):
        imagenames = sorted(glob(join(path, 'images', cam, '*.jpg')))
        chessnames = sorted(glob(join(path, 'chessboard', cam, '*.json')))
        k3ds, k2ds = [], []
        for chessname in chessnames[::step]:
            data = read_json(chessname)
            k3d = np.array(data['keypoints3d'], dtype=np.float32)
            k2d = np.array(data['keypoints2d'], dtype=np.float32)
            # skip frames in which no corner was detected
            if k2d[:, -1].sum() < 0.01:
                continue
            k3ds.append(k3d)
            k2ds.append(np.ascontiguousarray(k2d[:, :-1]))
        gray = cv2.imread(imagenames[0], 0)
        print('>> Detect {}/{:3d} frames'.format(cam, len(k2ds)))
        with Timer('calibrate'):
            ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
                k3ds, k2ds, gray.shape[::-1], None, None)
        cameras[cam] = {
            'K': K,
            'dist': dist # dist: (1, 5)
        }
    write_intri(join(path, 'output', 'intri.yml'), cameras)

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str, default='/home/')
    parser.add_argument('--step', type=int, default=1)
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    calib_intri(args.path, step=args.step)


@ -0,0 +1,19 @@
<!--
* @Date: 2021-04-13 16:49:12
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-04-13 16:51:16
* @FilePath: /EasyMocapRelease/apps/calibration/camera_parameters.md
-->
# Camera Parameters Format
For example, if a video is named `1.mp4`, then `intri.yml` must contain `K_1` and `dist_1`, and `extri.yml` must contain `R_1` (the rotation vector of the camera, shape `(3, 1)`) and `T_1` (shape `(3, 1)`). The files follow the [OpenCV format](https://docs.opencv.org/master/dd/d74/tutorial_file_input_output_with_xml_yml.html).
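
The naming rule can be written down as a small helper. This is only an illustration of the convention described above, assuming the camera name is the video file name without its extension; the function name is hypothetical and not part of EasyMocap.

```python
import os


def expected_camera_keys(video_name):
    """List the entries that should exist for a given video file."""
    cam = os.path.splitext(os.path.basename(video_name))[0]
    return {
        'intri.yml': ['K_' + cam, 'dist_' + cam],
        'extri.yml': ['R_' + cam, 'T_' + cam],
    }


if __name__ == '__main__':
    print(expected_camera_keys('videos/1.mp4'))
```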
## Write/Read
See the `write_camera` and `read_camera` functions in `easymocap/mytools/camera_utils.py`.
## Conversion between different formats
TODO


@ -0,0 +1,125 @@
'''
@ Date: 2021-03-27 19:13:50
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-02 22:01:10
@ FilePath: /EasyMocap/scripts/calibration/check_calib.py
'''
import cv2
import numpy as np
import os
from os.path import join
from easymocap.mytools import read_json, merge
from easymocap.mytools import read_camera, plot_points2d
from easymocap.mytools import batch_triangulate, projectN3, Undistort
from tqdm import tqdm

def load_grids():
    # corners of a unit cube and the 12 edges connecting them
    points3d = np.array([
        [0., 0., 0.],
        [1., 0., 0.],
        [1., 1., 0.],
        [0., 1., 0.],
        [0., 0., 1.],
        [1., 0., 1.],
        [1., 1., 1.],
        [0., 1., 1.]
    ])
    # `np.int` was removed in recent NumPy releases; the builtin `int` is equivalent
    lines = np.array([
        [0, 1],
        [1, 2],
        [2, 3],
        [3, 0],
        [4, 5],
        [5, 6],
        [6, 7],
        [7, 4],
        [0, 4],
        [1, 5],
        [2, 6],
        [3, 7]
    ], dtype=int)
    points3d = np.hstack((points3d, np.ones((points3d.shape[0], 1))))
    return points3d, lines

def check_calib(path, out, vis=False, show=False, debug=False):
    if vis:
        out_dir = join(out, 'check')
        os.makedirs(out_dir, exist_ok=True)
    cameras = read_camera(join(out, 'intri.yml'), join(out, 'extri.yml'))
    cameras.pop('basenames')
    total_sum, cnt = 0, 0
    for nf in tqdm(range(10000)):
        imgs = []
        k2ds = []
        for cam, camera in cameras.items():
            if vis:
                imgname = join(path, 'images', cam, '{:06d}.jpg'.format(nf))
                assert os.path.exists(imgname), imgname
                img = cv2.imread(imgname)
                img = Undistort.image(img, camera['K'], camera['dist'])
                imgs.append(img)
            annname = join(path, 'chessboard', cam, '{:06d}.json'.format(nf))
            if not os.path.exists(annname):
                break
            data = read_json(annname)
            k2d = np.array(data['keypoints2d'], dtype=np.float32)
            k2d = Undistort.points(k2d, camera['K'], camera['dist'])
            k2ds.append(k2d)
        if len(k2ds) == 0:
            break
        Pall = np.stack([camera['P'] for camera in cameras.values()])
        k2ds = np.stack(k2ds)
        # triangulate the corners from all views, then reproject them
        k3d = batch_triangulate(k2ds, Pall)
        kpts_repro = projectN3(k3d, Pall)
        for nv in range(len(k2ds)):
            conf = k2ds[nv][:, -1]
            dist = conf * np.linalg.norm(kpts_repro[nv][:, :2] - k2ds[nv][:, :2], axis=1)
            total_sum += dist.sum()
            cnt += conf.sum()
            if debug:
                print('{:2d}-{:2d}: {:6.2f}/{:2d}'.format(nf, nv, dist.sum(), int(conf.sum())))
            if vis:
                plot_points2d(imgs[nv], kpts_repro[nv], [], col=(0, 0, 255), lw=1, putText=False)
                plot_points2d(imgs[nv], k2ds[nv], [], lw=1, putText=False)
                if show:
                    cv2.imshow('vis', imgs[nv])
                    cv2.waitKey(0)
        if vis:
            imgout = merge(imgs, resize=False)
            outname = join(out, 'check', '{:06d}.jpg'.format(nf))
            cv2.imwrite(outname, imgout)
    print('{:.2f}/{} = {:.2f} pixel'.format(total_sum, int(cnt), total_sum/cnt))

def check_scene(path, out):
    cameras = read_camera(join(out, 'intri.yml'), join(out, 'extri.yml'))
    cameras.pop('basenames')
    points3d, lines = load_grids()
    nf = 0
    for cam, camera in cameras.items():
        imgname = join(path, 'images', cam, '{:06d}.jpg'.format(nf))
        assert os.path.exists(imgname), imgname
        img = cv2.imread(imgname)
        img = Undistort.image(img, camera['K'], camera['dist'])
        kpts_repro = projectN3(points3d, camera['P'][None, :, :])[0]
        plot_points2d(img, kpts_repro, lines, col=(0, 0, 255), lw=1, putText=True)
        cv2.imshow('vis', img)
        cv2.waitKey(0)

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str,
        help='the directory contains the extrinsic images')
    parser.add_argument('--out', type=str,
        help='with camera parameters')
    parser.add_argument('--vis', action='store_true')
    parser.add_argument('--show', action='store_true')
    parser.add_argument('--debug', action='store_true')
    parser.add_argument('--cube', action='store_true')
    args = parser.parse_args()
    if args.cube:
        check_scene(args.path, args.out)
    else:
        check_calib(args.path, args.out, args.vis, args.show, args.debug)


@ -0,0 +1,68 @@
# detect the corners of the chessboard
from easymocap.annotator.file_utils import getFileList, read_json, save_json
from tqdm import tqdm
from easymocap.annotator import ImageFolder, findChessboardCorners
import numpy as np
from os.path import join
import cv2
import os

def get_object(pattern, gridSize):
    object_points = np.zeros((pattern[1]*pattern[0], 3), np.float32)
    # Note: so that the z-axis of the board points up, the short side is x and the long side is y
    object_points[:,:2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1,2)
    object_points[:, [0, 1]] = object_points[:, [1, 0]]
    object_points = object_points * gridSize
    return object_points

def create_chessboard(path, pattern, gridSize):
    print('Create chessboard {}'.format(pattern))
    keypoints3d = get_object(pattern, gridSize=gridSize)
    keypoints2d = np.zeros((keypoints3d.shape[0], 3))
    imgnames = getFileList(path, ext='.jpg')
    template = {
        'keypoints3d': keypoints3d.tolist(),
        'keypoints2d': keypoints2d.tolist(),
        'visited': False
    }
    for imgname in tqdm(imgnames, desc='create template chessboard'):
        annname = imgname.replace('images', 'chessboard').replace('.jpg', '.json')
        annname = join(path, annname)
        if os.path.exists(annname):
            # overwrite keypoints3d
            data = read_json(annname)
            data['keypoints3d'] = template['keypoints3d']
            save_json(annname, data)
        else:
            save_json(annname, template)

def detect_chessboard(path, out, pattern, gridSize):
    create_chessboard(path, pattern, gridSize)
    dataset = ImageFolder(path, annot='chessboard')
    dataset.isTmp = False
    for i in tqdm(range(len(dataset))):
        imgname, annotname = dataset[i]
        # detect the 2d chessboard
        img = cv2.imread(imgname)
        annots = read_json(annotname)
        show = findChessboardCorners(img, annots, pattern)
        save_json(annotname, annots)
        if show is None:
            continue
        outname = join(out, imgname.replace(path + '/images/', ''))
        os.makedirs(os.path.dirname(outname), exist_ok=True)
        cv2.imwrite(outname, show)

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str)
    parser.add_argument('--out', type=str)
    parser.add_argument('--pattern', type=lambda x: (int(x.split(',')[0]), int(x.split(',')[1])),
        help='The pattern of the chessboard', default=(9, 6))
    parser.add_argument('--grid', type=float, default=0.1,
        help='The length of the grid size (unit: meter)')
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    detect_chessboard(args.path, args.out, pattern=args.pattern, gridSize=args.grid)

157
apps/demo/1v1p_mirror.py Normal file

@ -0,0 +1,157 @@
from operator import imod
import numpy as np
from tqdm import tqdm
from os.path import join
from easymocap.dataset.mv1pmf_mirror import ImageFolderMirror as ImageFolder
from easymocap.mytools import Timer
from easymocap.smplmodel import load_model, merge_params, select_nf
from easymocap.estimator import SPIN, init_with_spin
from easymocap.pipeline.mirror import multi_stage_optimize

def demo_1v1p1f_smpl_mirror(path, body_model, spin_model, args):
    "Optimization for single image"
    # 0. construct the dataset
    dataset = ImageFolder(path, out=args.out, kpts_type=args.body)
    if args.gtK:
        dataset.gtK = True
        dataset.load_gt_cameras()
    start, end = args.start, min(args.end, len(dataset))
    for nf in tqdm(range(start, end, args.step), desc='Optimizing'):
        image, annots = dataset[nf]
        if len(annots) < 2:
            continue
        annots = annots[:2]
        camera = dataset.camera(nf)
        # initialize the SMPL parameters
        body_params_all = []
        bboxes, keypoints2d, pids = [], [], []
        for i, annot in enumerate(annots):
            assert annot['id'] == i, (i, annot['id'])
            result = init_with_spin(body_model, spin_model, image,
                annot['bbox'], annot['keypoints'], camera)
            body_params_all.append(result['body_params'])
            bboxes.append(annot['bbox'])
            keypoints2d.append(annot['keypoints'])
            pids.append(annot['id'])
        bboxes = np.vstack(bboxes)
        keypoints2d = np.stack(keypoints2d)
        body_params = merge_params(body_params_all)
        # bboxes: (nViews(2), 1, 5); keypoints2d: (nViews(2), 1, nJoints, 3)
        bboxes = bboxes[:, None]
        keypoints2d = keypoints2d[:, None]
        if args.normal:
            normal = dataset.normal(nf)[None, :, :]
        else:
            normal = None
        body_params = multi_stage_optimize(body_model, body_params, bboxes, keypoints2d, Pall=camera['P'], normal=normal, args=args)
        vertices = body_model(return_verts=True, return_tensor=False, **body_params)
        keypoints = body_model(return_verts=False, return_tensor=False, **body_params)
        write_data = [{'id': pids[i], 'keypoints3d': keypoints[i]} for i in range(len(pids))]
        # write out the results
        dataset.write_keypoints3d(write_data, nf)
        for i in range(len(pids)):
            write_data[i].update(select_nf(body_params, i))
        if args.vis_smpl:
            # render the results
            render_data = {pids[i]: {
                'vertices': vertices[i],
                'faces': body_model.faces,
                'vid': 0, 'name': 'human_{}'.format(pids[i])} for i in range(len(pids))}
            dataset.vis_smpl(render_data, image, camera, nf)
        dataset.write_smpl(write_data, nf)

def demo_1v1pmf_smpl_mirror(path, body_model, spin_model, args):
    subs = args.sub
    assert len(subs) > 0
    # loop over all sub-folders
    for sub in subs:
        dataset = ImageFolder(path, subs=[sub], out=args.out, kpts_type=args.body)
        start, end = args.start, min(args.end, len(dataset))
        frames = list(range(start, end, args.step))
        nFrames = len(frames)
        pids = [0, 1]
        body_params_all = {pid:[None for nf in frames] for pid in pids}
        bboxes = {pid:[None for nf in frames] for pid in pids}
        keypoints2d = {pid:[None for nf in frames] for pid in pids}
        for nf in tqdm(frames, desc='loading'):
            image, annots = dataset[nf]
            # frames with missing annots can no longer be skipped here; they need to be filled in
            camera = dataset.camera(nf)
            # initialize the SMPL parameters of each person
            for i, annot in enumerate(annots):
                pid = annot['id']
                if pid not in pids:
                    continue
                result = init_with_spin(body_model, spin_model, image,
                    annot['bbox'], annot['keypoints'], camera)
                body_params_all[pid][nf-start] = result['body_params']
                bboxes[pid][nf-start] = annot['bbox']
                keypoints2d[pid][nf-start] = annot['keypoints']
        # stack [p1f1, p1f2, p1f3, ..., p1fn, p2f1, p2f2, p2f3, ..., p2fn]
        # TODO: handle missing bboxes
        body_params = merge_params([merge_params(body_params_all[pid]) for pid in pids])
        # bboxes: (nViews, nFrames, 5)
        bboxes = np.stack([np.stack(bboxes[pid]) for pid in pids])
        # keypoints: (nViews, nFrames, nJoints, 3)
        keypoints2d = np.stack([np.stack(keypoints2d[pid]) for pid in pids])
        # optimize
        P = dataset.camera(start)['P']
        if args.normal:
            normal = dataset.normal_all(start=start, end=end)
        else:
            normal = None
        body_params = multi_stage_optimize(body_model, body_params, bboxes, keypoints2d, Pall=P, normal=normal, args=args)
        # write
        vertices = body_model(return_verts=True, return_tensor=False, **body_params)
        keypoints = body_model(return_verts=False, return_tensor=False, **body_params)
        dataset.no_img = not args.vis_smpl
        for nf in tqdm(frames, desc='rendering'):
            idx = nf - start
            write_data = [{'id': pids[i], 'keypoints3d': keypoints[i*nFrames+idx]} for i in range(len(pids))]
            dataset.write_keypoints3d(write_data, nf)
            for i in range(len(pids)):
                write_data[i].update(select_nf(body_params, i*nFrames+idx))
            dataset.write_smpl(write_data, nf)
            # save the rendered results
            if args.vis_smpl:
                image, annots = dataset[nf]
                camera = dataset.camera(nf)
                render_data = {pids[i]: {
                    'vertices': vertices[i*nFrames+idx],
                    'faces': body_model.faces,
                    'vid': 0, 'name': 'human_{}'.format(pids[i])} for i in range(len(pids))}
                dataset.vis_smpl(render_data, image, camera, nf)

if __name__ == "__main__":
    from easymocap.mytools import load_parser, parse_parser
    parser = load_parser()
    parser.add_argument('--skel', type=str, default=None,
        help='path to keypoints3d')
    parser.add_argument('--direct', action='store_true')
    parser.add_argument('--video', action='store_true')
    parser.add_argument('--gtK', action='store_true')
    parser.add_argument('--normal', action='store_true',
        help='set to use the normal of the mirror')
    args = parse_parser(parser)
    helps = '''
  Demo code for single view and one person with mirror:
    - Input : {}: [{}]
    - Output: {}
    - Body : {} => {}, {}
'''.format(args.path, ', '.join(args.sub), args.out,
        args.model, args.gender, args.body)
    print(helps)
    with Timer('Loading {}, {}'.format(args.model, args.gender)):
        body_model = load_model(args.gender, model_type=args.model)
    with Timer('Loading SPIN'):
        spin_model = SPIN(
            SMPL_MEAN_PARAMS='data/models/smpl_mean_params.npz',
            checkpoint='data/models/spin_checkpoint.pt',
            device=body_model.device)
    if args.video:
        demo_1v1pmf_smpl_mirror(args.path, body_model, spin_model, args)
    else:
        demo_1v1p1f_smpl_mirror(args.path, body_model, spin_model, args)

110
apps/demo/mv1p.py Normal file

@ -0,0 +1,110 @@
'''
@ Date: 2021-04-13 19:46:51
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-14 11:33:00
@ FilePath: /EasyMocapRelease/apps/demo/mv1p.py
'''
from tqdm import tqdm
from easymocap.smplmodel import check_keypoints, load_model, select_nf
from easymocap.mytools import simple_recon_person, Timer, projectN3
from easymocap.pipeline import smpl_from_keypoints3d2d
import os
from os.path import join
import numpy as np
def check_repro_error(keypoints3d, kpts_repro, keypoints2d, P, MAX_REPRO_ERROR):
# a joint is checked only if both its 3D estimate and its 2D detection have positive confidence
conf = (keypoints3d[None, :, -1:] > 0) * (keypoints2d[:, :, -1:] > 0)
dist = np.sqrt((((kpts_repro[..., :2] - keypoints2d[..., :2])*conf)**2).sum(axis=-1))
vv, jj = np.where(dist > MAX_REPRO_ERROR)
if vv.shape[0] > 0:
keypoints2d[vv, jj, -1] = 0.
keypoints3d, kpts_repro = simple_recon_person(keypoints2d, P)
return keypoints3d, kpts_repro
def mv1pmf_skel(dataset, check_repro=True, args=None):
MIN_CONF_THRES = args.thres2d
no_img = not (args.vis_det or args.vis_repro)
dataset.no_img = no_img
kp3ds = []
start, end = args.start, min(args.end, len(dataset))
kpts_repro = None
for nf in tqdm(range(start, end), desc='triangulation'):
images, annots = dataset[nf]
check_keypoints(annots['keypoints'], WEIGHT_DEBUFF=1, min_conf=MIN_CONF_THRES)
keypoints3d, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall)
if check_repro:
keypoints3d, kpts_repro = check_repro_error(keypoints3d, kpts_repro, annots['keypoints'], P=dataset.Pall, MAX_REPRO_ERROR=args.MAX_REPRO_ERROR)
# keypoints3d, kpts_repro = robust_triangulate(annots['keypoints'], dataset.Pall, config=config, ret_repro=True)
kp3ds.append(keypoints3d)
if args.vis_det:
dataset.vis_detections(images, annots, nf, sub_vis=args.sub_vis)
if args.vis_repro:
dataset.vis_repro(images, kpts_repro, nf=nf, sub_vis=args.sub_vis)
# smooth the skeleton
if args.smooth3d > 0:
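# NOTE: smooth_skeleton is not imported in this file, so this branch raises NameError when args.smooth3d > 0
kp3ds = smooth_skeleton(kp3ds, args.smooth3d)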
for nf in tqdm(range(len(kp3ds)), desc='dump'):
dataset.write_keypoints3d(kp3ds[nf], nf+start)
def mv1pmf_smpl(dataset, args, weight_pose=None, weight_shape=None):
dataset.skel_path = args.skel
kp3ds = []
start, end = args.start, min(args.end, len(dataset))
keypoints2d, bboxes = [], []
dataset.no_img = True
for nf in tqdm(range(start, end), desc='loading'):
images, annots = dataset[nf]
keypoints2d.append(annots['keypoints'])
bboxes.append(annots['bbox'])
kp3ds = dataset.read_skeleton(start, end)
keypoints2d = np.stack(keypoints2d)
bboxes = np.stack(bboxes)
kp3ds = check_keypoints(kp3ds, 1)
# optimize the human shape
with Timer('Loading {}, {}'.format(args.model, args.gender), not args.verbose):
body_model = load_model(gender=args.gender, model_type=args.model)
params = smpl_from_keypoints3d2d(body_model, kp3ds, keypoints2d, bboxes,
dataset.Pall, config=dataset.config, args=args,
weight_shape=weight_shape, weight_pose=weight_pose)
# write out the results
dataset.no_img = not (args.vis_smpl or args.vis_repro)
for nf in tqdm(range(start, end), desc='render'):
images, annots = dataset[nf]
param = select_nf(params, nf-start)
dataset.write_smpl(param, nf)
if args.vis_smpl:
vertices = body_model(return_verts=True, return_tensor=False, **param)
dataset.vis_smpl(vertices=vertices[0], faces=body_model.faces, images=images, nf=nf, sub_vis=args.sub_vis, add_back=True)
if args.vis_repro:
keypoints = body_model(return_verts=False, return_tensor=False, **param)[0]
kpts_repro = projectN3(keypoints, dataset.Pall)
dataset.vis_repro(images, kpts_repro, nf=nf, sub_vis=args.sub_vis)
if __name__ == "__main__":
from easymocap.mytools import load_parser, parse_parser
from easymocap.dataset import CONFIG, MV1PMF
parser = load_parser()
parser.add_argument('--skel', action='store_true')
args = parse_parser(parser)
help="""
Demo code for multiple views and one person:
- Input : {} => {}
- Output: {}
- Body : {}=>{}, {}
""".format(args.path, ', '.join(args.sub), args.out,
args.model, args.gender, args.body)
print(help)
skel_path = join(args.out, 'keypoints3d')
dataset = MV1PMF(args.path, annot_root=args.annot, cams=args.sub, out=args.out,
config=CONFIG[args.body], kpts_type=args.body,
undis=args.undis, no_img=False, verbose=args.verbose)
dataset.writer.save_origin = args.save_origin
if args.skel or not os.path.exists(skel_path):
mv1pmf_skel(dataset, check_repro=True, args=args)
mv1pmf_smpl(dataset, args)
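
The reprojection check done by `check_repro_error` in this file can be illustrated without NumPy. The following is a minimal sketch under stated assumptions, not the project's implementation: `flag_outliers` is a hypothetical helper, keypoints are plain nested lists of `(x, y, conf)`, and the 3D confidence mask used in the original is left out. After flagging, the original code re-triangulates with `simple_recon_person`; that step is not reproduced here.

```python
import math

def flag_outliers(kpts2d, kpts_repro, max_error):
    """Zero the confidence of detections whose reprojection error is too large.

    kpts2d, kpts_repro: nested lists shaped (nViews, nJoints, 3) as (x, y, conf).
    Returns the list of (view, joint) pairs that were rejected.
    """
    rejected = []
    for v, (dets, repros) in enumerate(zip(kpts2d, kpts_repro)):
        for j, (det, rep) in enumerate(zip(dets, repros)):
            if det[2] <= 0:  # already invalid, nothing to compare
                continue
            err = math.hypot(det[0] - rep[0], det[1] - rep[1])
            if err > max_error:
                det[2] = 0.0  # this detection is ignored in the next triangulation
                rejected.append((v, j))
    return rejected

# Two views, two joints; joint 1 in view 1 is far from its reprojection.
detections = [[[100.0, 100.0, 0.9], [200.0, 200.0, 0.8]],
              [[110.0, 100.0, 0.9], [400.0, 200.0, 0.7]]]
reprojected = [[[101.0, 100.0, 1.0], [201.0, 199.0, 1.0]],
               [[111.0, 101.0, 1.0], [210.0, 205.0, 1.0]]]
rejected = flag_outliers(detections, reprojected, max_error=20.0)
```

Here only the detection at view 1, joint 1 is rejected, and its confidence is set to zero so that a later triangulation would skip it.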

38
apps/demo/mv1p_mirror.py Normal file

@ -0,0 +1,38 @@
'''
@ Date: 2021-04-13 22:21:39
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-14 12:22:59
@ FilePath: /EasyMocap/apps/demo/mv1p_mirror.py
'''
import os
from os.path import join
from mv1p import mv1pmf_skel, mv1pmf_smpl
from easymocap.dataset import CONFIG
if __name__ == "__main__":
from easymocap.mytools import load_parser, parse_parser
parser = load_parser()
parser.add_argument('--skel', action='store_true')
args = parse_parser(parser)
help="""
Demo code for multiple views and one person with mirror:
- Input : {} => {}
- Output: {}
- Body : {}=>{}, {}
""".format(args.path, ', '.join(args.sub), args.out,
args.model, args.gender, args.body)
print(help)
from easymocap.dataset import MV1PMF_Mirror as MV1PMF
dataset = MV1PMF(args.path, annot_root=args.annot, cams=args.sub, out=args.out,
config=CONFIG[args.body], kpts_type=args.body,
undis=args.undis, no_img=False, verbose=args.verbose)
dataset.writer.save_origin = args.save_origin
skel_path = join(args.out, 'keypoints3d')
if args.skel or not os.path.exists(skel_path):
mv1pmf_skel(dataset, check_repro=False, args=args)
from easymocap.pipeline.weight import load_weight_pose, load_weight_shape
weight_shape = load_weight_shape(args.opts)
weight_pose = load_weight_pose(args.model, args.opts)
mv1pmf_smpl(dataset, args=args, weight_pose=weight_pose, weight_shape=weight_shape)


@ -2,10 +2,7 @@
* @Date: 2021-01-15 21:12:49 * @Date: 2021-01-15 21:12:49
* @Author: Qing Shuai * @Author: Qing Shuai
* @LastEditors: Qing Shuai * @LastEditors: Qing Shuai
* @LastEditTime: 2021-01-15 21:13:35 * @LastEditTime: 2021-04-13 17:42:28
* @FilePath: /EasyMocapRelease/doc/evaluation.md * @FilePath: /EasyMocapRelease/doc/evaluation.md
--> -->
# Evaluation # Evaluation
## Evaluation of fitting SMPL
### Human3.6M


@ -2,12 +2,14 @@
* @Date: 2021-04-02 11:52:33 * @Date: 2021-04-02 11:52:33
* @Author: Qing Shuai * @Author: Qing Shuai
* @LastEditors: Qing Shuai * @LastEditors: Qing Shuai
* @LastEditTime: 2021-04-02 11:52:59 * @LastEditTime: 2021-04-13 17:15:49
* @FilePath: /EasyMocapRelease/doc/installation.md * @FilePath: /EasyMocapRelease/doc/installation.md
--> -->
# EasyMocap - Installation # EasyMocap - Installation
### 1. Download SMPL models ## 0. Download models
## 0.1 SMPL models
This step is the same as [smplx](https://github.com/vchoutas/smplx#model-loading). This step is the same as [smplx](https://github.com/vchoutas/smplx#model-loading).
@ -40,7 +42,31 @@ data
└── SMPLX_NEUTRAL.pkl └── SMPLX_NEUTRAL.pkl
``` ```
### 2. Requirements ## 0.2 (Optional) SPIN model
This part is used in `1v1p*.py`. You can skip this step if you only use the multiple views dataset.
Download the pretrained SPIN model [here](http://visiondata.cis.upenn.edu/spin/model_checkpoint.pt) and save it as `data/models/spin_checkpoint.pt`.
Fetch the extra data [here](http://visiondata.cis.upenn.edu/spin/dataset_extras.tar.gz) and copy `smpl_mean_params.npz` to `data/models/smpl_mean_params.npz`.
## 0.3 (Optional) 2D model
You can skip this step if you use openpose as your human keypoints detector.
Download [yolov4.weights]() and place it into `data/models/yolov4.weights`.
Download pretrained HRNet [weight]() and place it into `data/models/pose_hrnet_w48_384x288.pth`.
```bash
data
└── models
├── smpl_mean_params.npz
├── spin_checkpoint.pt
├── pose_hrnet_w48_384x288.pth
└── yolov4.weights
```
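
Before running the demos, it may help to confirm that the optional model files are where the tree above expects them. This is a small sketch, not part of EasyMocap: `missing_models` is a hypothetical helper, and the file names are copied from the directory tree above.

```python
import os
import tempfile

# File names taken from the directory tree above.
EXPECTED = ['smpl_mean_params.npz', 'spin_checkpoint.pt',
            'pose_hrnet_w48_384x288.pth', 'yolov4.weights']

def missing_models(root='data/models', expected=EXPECTED):
    """Return the expected files that are not present under `root`."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(root, name))]

# Demonstration on a throwaway directory that contains only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'yolov4.weights'), 'w').close()
    demo_missing = missing_models(tmp)
```

In a real checkout you would call `missing_models()` with the default root and download whatever it reports.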
## 2. Requirements
- python>=3.6 - python>=3.6
- torch==1.4.0 - torch==1.4.0
@ -51,3 +77,9 @@ data
- OpenPose[4]: for 2D pose - OpenPose[4]: for 2D pose
Some of python libraries can be found in `requirements.txt`. You can test different version of PyTorch. Some of python libraries can be found in `requirements.txt`. You can test different version of PyTorch.
## 3. Install
```bash
python3 setup.py develop --user
```


@ -1,61 +0,0 @@
<!--
* @Date: 2021-04-02 11:53:55
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-04-02 11:53:55
* @FilePath: /EasyMocapRelease/doc/notquickstart.md
-->
### 0. Prepare Your Own Dataset
```bash
zju-ls-feng
├── intri.yml
├── extri.yml
└── videos
├── 1.mp4
├── 2.mp4
├── ...
├── 8.mp4
└── 9.mp4
```
The input videos are placed in `videos/`.
Here `intri.yml` and `extri.yml` store the camera intrinsic and extrinsic parameters. For example, if the name of a video is `1.mp4`, then `intri.yml` must contain `K_1` and `dist_1`, and `extri.yml` must contain `R_1` (shape `(3, 1)`, the rotation vector of the camera) and `T_1` (shape `(3, 1)`). The file format follows the [OpenCV format](https://docs.opencv.org/master/dd/d74/tutorial_file_input_output_with_xml_yml.html).
### 1. Run [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose)
```bash
data=path/to/data
out=path/to/output
python3 scripts/preprocess/extract_video.py ${data} --openpose <openpose_path> --handface
```
- `--openpose`: specify the openpose path
- `--handface`: detect hands and face keypoints
### 2. Run the code
```bash
# 1. example for skeleton reconstruction
python3 code/demo_mv1pmf_skel.py ${data} --out ${out} --vis_det --vis_repro --undis --sub_vis 1 7 13 19
# 2. example for SMPL reconstruction
python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19
```
The input flags:
- `--undis`: use to undistort the images
- `--start, --end`: control the begin and end number of frames.
The output flags:
- `--vis_det`: visualize the detection
- `--vis_repro`: visualize the reprojection
- `--sub_vis`: use to specify the views to visualize. If not set, the code will use all views
- `--vis_smpl`: use to render the SMPL mesh to images.
### 3. Output
Please refer to [output.md](doc/02_output.md)


@ -2,9 +2,12 @@
* @Date: 2021-04-02 11:53:16 * @Date: 2021-04-02 11:53:16
* @Author: Qing Shuai * @Author: Qing Shuai
* @LastEditors: Qing Shuai * @LastEditors: Qing Shuai
* @LastEditTime: 2021-04-02 11:53:16 * @LastEditTime: 2021-04-13 16:56:19
* @FilePath: /EasyMocapRelease/doc/quickstart.md * @FilePath: /EasyMocapRelease/doc/quickstart.md
--> -->
# Quick Start
## Demo
We provide an example multiview dataset[[dropbox](https://www.dropbox.com/s/24mb7r921b1g9a7/zju-ls-feng.zip?dl=0)][[BaiduDisk](https://pan.baidu.com/s/1lvAopzYGCic3nauoQXjbPw)(vg1z)], which has 800 frames from 23 synchronized and calibrated cameras. After downloading the dataset, you can run the following example scripts. We provide an example multiview dataset[[dropbox](https://www.dropbox.com/s/24mb7r921b1g9a7/zju-ls-feng.zip?dl=0)][[BaiduDisk](https://pan.baidu.com/s/1lvAopzYGCic3nauoQXjbPw)(vg1z)], which has 800 frames from 23 synchronized and calibrated cameras. After downloading the dataset, you can run the following example scripts.
@ -13,14 +16,61 @@ data=path/to/data
out=path/to/output out=path/to/output
# 0. extract the video to images # 0. extract the video to images
python3 scripts/preprocess/extract_video.py ${data} python3 scripts/preprocess/extract_video.py ${data}
# 1. example for skeleton reconstruction
python3 code/demo_mv1pmf_skel.py ${data} --out ${out} --vis_det --vis_repro --undis --sub_vis 1 7 13 19
# 2.1 example for SMPL reconstruction # 2.1 example for SMPL reconstruction
python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19 --gender male python3 apps/demo/mv1p.py ${data} --out ${out} --vis_det --vis_repro --undis --sub_vis 1 7 13 19 --vis_smpl
# 2.2 example for SMPL-X reconstruction # 2.2 example for SMPL-X reconstruction
python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --undis --body bodyhandface --sub_vis 1 7 13 19 --start 400 --model smplx --vis_smpl --gender male python3 apps/demo/mv1p.py ${data} --out ${out} --vis_det --vis_repro --undis --sub_vis 1 7 13 19 --body bodyhandface --model smplx --gender male --vis_smpl
# 3.1 example for rendering SMPLX to ${out}/smpl
python3 code/vis_render.py ${data} --out ${out} --skel ${out}/smpl --model smplx --gender male --undis --start 400 --sub_vis 1
# 3.2 example for rendering skeleton of SMPL to ${out}/smplskel
python3 code/vis_render.py ${data} --out ${out} --skel ${out}/smpl --model smplx --gender male --undis --start 400 --sub_vis 1 --type smplskel --body bodyhandface
``` ```
# Demo On Your Dataset
## 0. Prepare Your Own Dataset
```bash
<seq>
├── intri.yml
├── extri.yml
└── videos
├── 1.mp4
├── 2.mp4
├── ...
├── 8.mp4
└── 9.mp4
```
The input videos are placed in `videos/`.
Here `intri.yml` and `extri.yml` store the camera intrinsic and extrinsic parameters.
See [`apps/calibration/Readme`](../apps/calibration/Readme.md) for instructions on camera calibration.
See [`apps/calibration/camera_parameters`](../apps/calibration/camera_parameters.md) for the format of camera parameters.
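
To make the role of the two files concrete, here is a sketch of how intrinsics and extrinsics combine into a projection matrix. It rests on assumptions that should be checked against the camera parameter document linked above: the rotation is taken to be a Rodrigues rotation vector, the translation a 3-vector, and the projection `P = K [R | T]`. The helper names `rodrigues`, `projection_matrix` and `project` are hypothetical, and the numbers are made up for illustration.

```python
import math

def rodrigues(rvec):
    """Convert a rotation vector (axis * angle) to a 3x3 rotation matrix."""
    theta = math.sqrt(sum(comp * comp for comp in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (comp / theta for comp in rvec)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def projection_matrix(K, rvec, T):
    """Build the 3x4 matrix P = K [R | T]."""
    R = rodrigues(rvec)
    RT = [R[i] + [T[i]] for i in range(3)]
    return [[sum(K[i][k] * RT[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]

def project(P, X):
    """Project a 3D point X = (x, y, z) to pixel coordinates."""
    Xh = list(X) + [1.0]
    u, v, w = (sum(P[i][j] * Xh[j] for j in range(4)) for i in range(3))
    return u / w, v / w

# Illustrative camera: focal length 1000 px, principal point (960, 540),
# no rotation, placed so that the world origin is 2 units in front of it.
K = [[1000.0, 0.0, 960.0], [0.0, 1000.0, 540.0], [0.0, 0.0, 1.0]]
P = projection_matrix(K, rvec=[0.0, 0.0, 0.0], T=[0.0, 0.0, 2.0])
pixel = project(P, (0.1, 0.0, 0.0))
```

With these values the point 0.1 units to the right of the origin lands 50 pixels to the right of the principal point, at roughly `(1010, 540)`.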
### 1. Run [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose)
```bash
data=path/to/data
out=path/to/output
python3 scripts/preprocess/extract_video.py ${data} --openpose <openpose_path> --handface
```
- `--openpose`: specify the openpose path
- `--handface`: detect hands and face keypoints
### 2. Run the code
The input flags:
- `--undis`: undistort the images
- `--start, --end`: the first frame to process and the frame at which to stop (`--end` is exclusive)
The output flags:
- `--vis_det`: visualize the detections
- `--vis_repro`: visualize the reprojection
- `--sub_vis`: specify the views to visualize. If not set, all views are used
- `--vis_smpl`: render the SMPL mesh onto the images
### 3. Output
Please refer to [output.md](../doc/02_output.md)


@ -0,0 +1,7 @@
from .basic_dataset import ImageFolder
from .basic_visualize import vis_point, vis_line
from .basic_visualize import plot_bbox_body, plot_skeleton, plot_skeleton_simple, plot_text, vis_active_bbox
from .basic_annotator import AnnotBase
from .chessboard import findChessboardCorners
# bbox callbacks
from .bbox_callback import callback_select_bbox_center, callback_select_bbox_corner, auto_pose_track


@ -0,0 +1,137 @@
import shutil
import cv2
from .basic_keyboard import register_keys
from .basic_visualize import resize_to_screen
from .basic_callback import point_callback, CV_KEY, get_key
from .file_utils import load_annot_to_tmp, save_annot
class ComposedCallback:
def __init__(self, callbacks=[point_callback], processes=[]) -> None:
self.callbacks = callbacks
self.processes = processes
def call(self, event, x, y, flags, param):
scale = param['scale']
x, y = int(x/scale), int(y/scale)
for callback in self.callbacks:
callback(event, x, y, flags, param)
for key in ['click', 'start', 'end']:
if param[key] is not None:
break
else:
return 0
for process in self.processes:
process(**param)
class AnnotBase:
def __init__(self, dataset, key_funcs={}, callbacks=[], vis_funcs=[],
name = 'main',
step=1) -> None:
self.name = name
self.dataset = dataset
self.nFrames = len(dataset)
self.step = step
self.register_keys = register_keys.copy()
self.register_keys.update(key_funcs)
self.vis_funcs = vis_funcs + [resize_to_screen]
self.isOpen = True
self._frame = 0
self.visited_frames = set([self._frame])
self.param = {'select': {'bbox': -1, 'corner': -1},
'start': None, 'end': None, 'click': None,
'capture_screen':False}
self.set_frame(0)
cv2.namedWindow(self.name)
callback = ComposedCallback(processes=callbacks)
cv2.setMouseCallback(self.name, callback.call, self.param)
@property
def working(self):
param = self.param
flag = False
if param['click'] is not None or param['start'] is not None:
flag = True
for key in self.param['select']:
if self.param['select'][key] != -1:
flag = True
return flag
def clear_working(self):
self.param['click'] = None
self.param['start'] = None
self.param['end'] = None
for key in self.param['select']:
self.param['select'][key] = -1
def save_and_quit(self):
self.frame = self.frame
self.isOpen = False
cv2.destroyWindow(self.name)
# get the input
while True:
key = input('Save the annotations? [y/n] ')
if key in ['y', 'n']:
break
print('Please specify [y/n]')
if key == 'n':
return 0
for frame in self.visited_frames:
self.dataset.isTmp = True
_, annname = self.dataset[frame]
self.dataset.isTmp = False
_, annname_ = self.dataset[frame]
shutil.copy(annname, annname_)
@property
def frame(self):
return self._frame
def previous(self):
if self.frame == 0:
print('Reached the first frame')
return None
imgname, annname = self.dataset[self.frame-1]
annots = load_annot_to_tmp(annname)
return annots
def set_frame(self, nf):
self.clear_working()
imgname, annname = self.dataset[nf]
img0 = cv2.imread(imgname)
annots = load_annot_to_tmp(annname)
# clear the recorded mouse input
for key in ['click', 'start', 'end']:
self.param[key] = None
# clear the selection
for key in self.param['select']:
self.param['select'][key] = -1
self.param['imgname'] = imgname
self.param['annname'] = annname
self.param['frame'] = nf
self.param['annots'] = annots
self.param['img0'] = img0
# self.param['pid'] = len(annot['annots'])
self.param['scale'] = min(CV_KEY.WINDOW_HEIGHT/img0.shape[0], CV_KEY.WINDOW_WIDTH/img0.shape[1])
@frame.setter
def frame(self, value):
self.visited_frames.add(value)
self._frame = value
# save current frames
save_annot(self.param['annname'], self.param['annots'])
self.set_frame(value)
def run(self, key=None):
if key is None:
key = chr(get_key())
if key in self.register_keys.keys():
self.register_keys[key](self, param=self.param)
if not self.isOpen:
return 0
img = self.param['img0'].copy()
for func in self.vis_funcs:
img = func(img, **self.param)
cv2.imshow(self.name, img)
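
`ComposedCallback.call` in this file relies on Python's `for ... else` to decide whether any mouse state is set before running the processes. The sketch below isolates that idiom; `has_mouse_input` is a hypothetical name, and the dictionaries only mimic the `click`, `start` and `end` entries of `param`.

```python
def has_mouse_input(param, keys=('click', 'start', 'end')):
    """Return True if any mouse-state entry is set."""
    for key in keys:
        if param[key] is not None:
            break
    else:
        # the loop finished without `break`: every entry is None
        return False
    return True

idle = {'click': None, 'start': None, 'end': None}
dragging = {'click': None, 'start': (10, 20), 'end': (30, 40)}

idle_result = has_mouse_input(idle)          # False
dragging_result = has_mouse_input(dragging)  # True
```

The same test can be written as `any(param[k] is not None for k in keys)`, which some readers may find easier to follow.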


@ -0,0 +1,56 @@
import cv2
class CV_KEY:
BLANK = 32
ENTER = 13
LSHIFT = 225 # does not work on Mac
NONE = 255
TAB = 9
q = 113
ESC = 27
BACKSPACE = 8
WINDOW_WIDTH = int(1920*0.9)
WINDOW_HEIGHT = int(1080*0.9)
LEFT = ord('a')
RIGHT = ord('d')
UP = ord('w')
DOWN = ord('s')
MINUS = 45
PLUS = 61
def get_key():
k = cv2.waitKey(10) & 0xFF
if k == CV_KEY.LSHIFT:
key1 = cv2.waitKey(500) & 0xFF
if key1 == CV_KEY.NONE:
return key1
# convert to uppercase
k = key1 - ord('a') + ord('A')
return k
def point_callback(event, x, y, flags, param):
"""
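A simple callback for OpenCV that implements two basic functions:
1. For press-and-drag, record the start point and the end point (the current point).
2. For a click, record the selected point.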
"""
if event not in [cv2.EVENT_LBUTTONDOWN, cv2.EVENT_MOUSEMOVE, cv2.EVENT_LBUTTONUP]:
return 0
# the position of the selected point is known, so write it directly
if event == cv2.EVENT_LBUTTONDOWN:
param['click'] = None
param['start'] = (x, y)
param['end'] = (x, y)
# clear all selections
for key in param['select'].keys():
param['select'][key] = -1
elif event == cv2.EVENT_MOUSEMOVE and flags == cv2.EVENT_FLAG_LBUTTON:
param['end'] = (x, y)
elif event == cv2.EVENT_LBUTTONUP:
if x == param['start'][0] and y == param['start'][1]:
param['click'] = param['start']
param['start'] = None
param['end'] = None
else:
param['click'] = None
return 1
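
The click-versus-drag logic of `point_callback` can be exercised without opening a window. The following sketch replays synthetic events through a simplified state update. The event constants are hypothetical stand-ins for `cv2.EVENT_LBUTTONDOWN`, `cv2.EVENT_MOUSEMOVE` and `cv2.EVENT_LBUTTONUP`, the `left_pressed` flag stands in for the `flags` comparison, and the reset of `param['select']` is omitted.

```python
# Hypothetical stand-ins for the OpenCV mouse event constants.
DOWN, MOVE, UP = 'down', 'move', 'up'

def update_mouse_state(event, x, y, left_pressed, state):
    """Update `state` in place, following the click/drag rules described above."""
    if event == DOWN:
        state.update(click=None, start=(x, y), end=(x, y))
    elif event == MOVE and left_pressed:
        state['end'] = (x, y)
    elif event == UP:
        if state['start'] == (x, y):
            # pressed and released on the same pixel: this is a click
            state.update(click=(x, y), start=None, end=None)
        else:
            # released elsewhere: keep start/end as the dragged segment
            state['click'] = None
    return state

def replay(events):
    """Feed a list of (event, x, y, left_pressed) tuples through the state update."""
    state = {'click': None, 'start': None, 'end': None}
    for event, x, y, pressed in events:
        update_mouse_state(event, x, y, pressed, state)
    return state

click_state = replay([(DOWN, 5, 5, True), (UP, 5, 5, False)])
drag_state = replay([(DOWN, 5, 5, True), (MOVE, 40, 30, True), (UP, 40, 30, False)])
```

A click leaves only `click` set, while a drag leaves `start` and `end` set and `click` empty, which is what the downstream bbox callbacks distinguish.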


@ -0,0 +1,32 @@
from os.path import join
from .file_utils import getFileList
class ImageFolder:
def __init__(self, path, sub=None, annot='annots') -> None:
self.root = path
self.image = 'images'
self.annot = annot
self.image_root = join(path, self.image)
self.annot_root = join(path, self.annot)
self.annot_root_tmp = join(path, self.annot + '_tmp')
if sub is None:
self.imgnames = getFileList(self.image_root, ext='.jpg')
self.annnames = getFileList(self.annot_root, ext='.json')
else:
self.imgnames = getFileList(join(self.image_root, sub), ext='.jpg')
self.annnames = getFileList(join(self.annot_root, sub), ext='.json')
self.imgnames = [join(sub, name) for name in self.imgnames]
self.annnames = [join(sub, name) for name in self.annnames]
self.isTmp = True
assert len(self.imgnames) == len(self.annnames)
def __getitem__(self, index):
imgname = join(self.image_root, self.imgnames[index])
if self.isTmp:
annname = join(self.annot_root_tmp, self.annnames[index])
else:
annname = join(self.annot_root, self.annnames[index])
return imgname, annname
def __len__(self):
return len(self.imgnames)


@ -0,0 +1,92 @@
from tqdm import tqdm
from .basic_callback import get_key
def print_help(annotator, **kwargs):
"""print the help"""
print('Here is the help:')
print( '------------------')
for key, val in annotator.register_keys.items():
print(' {}: {}'.format(key, val.__doc__))
def close(annotator, param, **kwargs):
"""quit the annotation"""
if annotator.working:
annotator.clear_working()
else:
annotator.save_and_quit()
# annotator.pbar.close()
def get_move(wasd):
get_frame = {
'a': lambda x, f: f - 1,
'd': lambda x, f: f + 1,
'w': lambda x, f: f - x.step,
's': lambda x, f: f + x.step
}[wasd]
text = {
'a': 'Move to last frame',
'd': 'Move to next frame',
'w': 'Move to last step frame',
's': 'Move to next step frame'
}
clip_frame = lambda x, f: max(0, min(x.nFrames-1, f))
def move(annotator, **kwargs):
newframe = get_frame(annotator, annotator.frame)
newframe = clip_frame(annotator, newframe)
annotator.frame = newframe
move.__doc__ = text[wasd]
return move
def set_personID(i):
def func(self, param, **kwargs):
active = param['select']['bbox']
if active == -1:
return 0
else:
param['annots']['annots'][active]['personID'] = i
return 0
func.__doc__ = "set the bbox ID to {}".format(i)
return func
def delete_bbox(self, param, **kwargs):
"delete the person"
active = param['select']['bbox']
if active == -1:
return 0
else:
param['annots']['annots'].pop(active)
param['select']['bbox'] = -1
return 0
def capture_screen(self, param):
"capture the screen"
if param['capture_screen']:
param['capture_screen'] = False
else:
param['capture_screen'] = True
def automatic(self, param):
"Automatic running"
keys = input('Enter the ordered key(separate with blank): ').split(' ')
repeats = int(input('Input the repeat times: (0->{})'.format(len(self.dataset)-self.frame)))
for nf in tqdm(range(repeats), desc='auto {}'.format('->'.join(keys))):
for key in keys:
self.run(key=key)
if chr(get_key()) == 'q':
break
self.run(key='d')
register_keys = {
'h': print_help,
'q': close,
'x': delete_bbox,
'p': capture_screen,
'A': automatic
}
for key in 'wasd':
register_keys[key] = get_move(key)
for i in range(10):
register_keys[str(i)] = set_personID(i)
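
The `get_move` factory in this file builds one handler per key and clips the target frame to the valid range. Below is a self-contained sketch of the same pattern. `make_move` and `FakeAnnotator` are hypothetical names, and the fake annotator only carries the fields the handlers read, so no files are loaded or saved.

```python
def make_move(delta_fn, doc):
    """Build a key handler that moves the current frame and clips it to the valid range."""
    def move(annotator):
        new_frame = delta_fn(annotator, annotator.frame)
        annotator.frame = max(0, min(annotator.n_frames - 1, new_frame))
    move.__doc__ = doc  # shown by a help command that prints each handler's docstring
    return move

class FakeAnnotator:
    """Stand-in for the annotator: only the fields the handlers use."""
    def __init__(self, n_frames, step):
        self.n_frames, self.step, self.frame = n_frames, step, 0

handlers = {
    'a': make_move(lambda a, f: f - 1, 'Move to last frame'),
    'd': make_move(lambda a, f: f + 1, 'Move to next frame'),
    'w': make_move(lambda a, f: f - a.step, 'Move to last step frame'),
    's': make_move(lambda a, f: f + a.step, 'Move to next step frame'),
}

annotator = FakeAnnotator(n_frames=25, step=10)
visited = []
for key in 'asssd':
    handlers[key](annotator)
    visited.append(annotator.frame)
```

Pressing `a` at frame 0 stays at 0, and the third `s` is clipped to the last frame (24) rather than jumping to 30.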


@ -0,0 +1,116 @@
import numpy as np
import cv2
import os
from os.path import join
from ..mytools import plot_cross, plot_line, plot_bbox, plot_keypoints, get_rgb
from ..dataset import CONFIG
# click and (start, end) is the output of the OpenCV callback
def vis_point(img, click, **kwargs):
if click is not None:
plot_cross(img, click[0], click[1], (255, 255, 255))
return img
def vis_line(img, start, end, **kwargs):
if start is not None and end is not None:
cv2.line(img, (int(start[0]), int(start[1])),
(int(end[0]), int(end[1])), (0, 255, 0), 1)
return img
def resize_to_screen(img, scale=1, capture_screen=False, **kwargs):
if capture_screen:
from datetime import datetime
time_now = datetime.now().strftime("%m-%d-%H:%M:%S")
outname = join('capture', time_now+'.jpg')
os.makedirs('capture', exist_ok=True)
cv2.imwrite(outname, img)
print('Capture current screen to {}'.format(outname))
img = cv2.resize(img, None, fx=scale, fy=scale)
return img
def plot_text(img, annots, **kwargs):
if annots['isKeyframe']: # keyframes are marked with a red border
cv2.rectangle(img, (0, 0), (img.shape[1], img.shape[0]), (0, 0, 255), img.shape[1]//100)
else: # non-keyframes are marked with a green border
cv2.rectangle(img, (0, 0), (img.shape[1], img.shape[0]), (0, 255, 0), img.shape[1]//100)
text_size = int(max(1, img.shape[0]//1500))
border = 20 * text_size
width = 2 * text_size
cv2.putText(img, '{}'.format(annots['filename']), (border, img.shape[0]-border), cv2.FONT_HERSHEY_SIMPLEX, text_size, (0, 0, 255), width)
return img
def plot_bbox_body(img, annots, **kwargs):
annots = annots['annots']
for data in annots:
bbox = data['bbox']
# draw an X shape
x1, y1, x2, y2 = bbox[:4]
pid = data['personID']
color = get_rgb(pid)
lw = max(1, int((x2 - x1)//100))
plot_line(img, (x1, y1), (x2, y2), lw, color)
plot_line(img, (x1, y2), (x2, y1), lw, color)
# border
cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), color, lw+1)
ratio = (y2-y1)/(x2-x1)/2
w = 10*lw
cv2.rectangle(img,
(int((x1+x2)/2-w), int((y1+y2)/2-w*ratio)),
(int((x1+x2)/2+w), int((y1+y2)/2+w*ratio)),
color, -1)
cv2.putText(img, '{}'.format(pid), (int(x1), int(y1)+20), cv2.FONT_HERSHEY_SIMPLEX, 1, color, 2)
return img
def plot_skeleton(img, annots, **kwargs):
annots = annots['annots']
vis_conf = False
for data in annots:
bbox, keypoints = data['bbox'], data['keypoints']
if False:
pid = data.get('matchID', -1)
else:
pid = data.get('personID', -1)
plot_bbox(img, bbox, pid)
if True:
plot_keypoints(img, keypoints, pid, CONFIG['body25'], vis_conf=vis_conf, use_limb_color=True)
if 'handl2d' in data.keys():
plot_keypoints(img, data['handl2d'], pid, CONFIG['hand'], vis_conf=vis_conf, lw=1, use_limb_color=False)
plot_keypoints(img, data['handr2d'], pid, CONFIG['hand'], vis_conf=vis_conf, lw=1, use_limb_color=False)
plot_keypoints(img, data['face2d'], pid, CONFIG['face'], vis_conf=vis_conf, lw=1, use_limb_color=False)
return img
def plot_keypoints_whole(img, points, kintree):
for ii, (i, j) in enumerate(kintree):
if i >= len(points) or j >= len(points):
continue
col = (255, 240, 160)
lw = 4
pt1, pt2 = points[i], points[j]
if pt1[-1] > 0.01 and pt2[-1] > 0.01:
image = cv2.line(
img, (int(pt1[0]+0.5), int(pt1[1]+0.5)), (int(pt2[0]+0.5), int(pt2[1]+0.5)),
col, lw)
def plot_skeleton_simple(img, annots, **kwargs):
annots = annots['annots']
vis_conf = False
for data in annots:
bbox, keypoints = data['bbox'], data['keypoints']
pid = data.get('personID', -1)
plot_keypoints_whole(img, keypoints, CONFIG['body25']['kintree'])
return img
def vis_active_bbox(img, annots, select, **kwargs):
active = select['bbox']
if active == -1:
return img
else:
bbox = annots['annots'][active]['bbox']
pid = annots['annots'][active]['personID']
mask = np.zeros_like(img, dtype=np.uint8)
cv2.rectangle(mask,
(int(bbox[0]), int(bbox[1])),
(int(bbox[2]), int(bbox[3])),
get_rgb(pid), -1)
img = cv2.addWeighted(img, 0.6, mask, 0.4, 0)
return img


@ -0,0 +1,83 @@
import numpy as np
import cv2
MIN_PIXEL = 50
def callback_select_bbox_corner(start, end, annots, select, **kwargs):
if start is None or end is None:
select['corner'] = -1
return 0
if start[0] == end[0] and start[1] == end[1]:
return 0
# determine which corner is selected
annots = annots['annots']
start = np.array(start)[None, :]
if select['bbox'] == -1 and select['corner'] == -1:
for i in range(len(annots)):
l, t, r, b = annots[i]['bbox'][:4]
corners = np.array([(l, t), (l, b), (r, t), (r, b)])
dist = np.linalg.norm(corners - start, axis=1)
mindist = dist.min()
if mindist < MIN_PIXEL:
mincor = dist.argmin()
select['bbox'] = i
select['corner'] = mincor
break
else:
select['corner'] = -1
elif select['bbox'] != -1 and select['corner'] == -1:
i = select['bbox']
l, t, r, b = annots[i]['bbox'][:4]
corners = np.array([(l, t), (l, b), (r, t), (r, b)])
dist = np.linalg.norm(corners - start, axis=1)
mindist = dist.min()
if mindist < MIN_PIXEL:
mincor = dist.argmin()
select['corner'] = mincor
elif select['bbox'] != -1 and select['corner'] != -1:
# Move the corner
x, y = end
(i, j) = [(0, 1), (0, 3), (2, 1), (2, 3)][select['corner']]
data = annots[select['bbox']]
data['bbox'][i] = x
data['bbox'][j] = y
elif select['bbox'] == -1 and select['corner'] != -1:
select['corner'] = -1
def callback_select_bbox_center(click, annots, select, **kwargs):
if click is None:
return 0
annots = annots['annots']
bboxes = np.array([d['bbox'] for d in annots])
center = (bboxes[:, [2, 3]] + bboxes[:, [0, 1]])/2
click = np.array(click)[None, :]
dist = np.linalg.norm(click - center, axis=1)
mindist, minid = dist.min(), dist.argmin()
if mindist < MIN_PIXEL:
select['bbox'] = minid
def auto_pose_track(self, param, **kwargs):
"auto tracking with poses"
MAX_SPEED = 100
if self.frame == 0:
return 0
previous = self.previous()
annots = param['annots']['annots']
keypoints_pre = np.array([d['keypoints'] for d in previous['annots']])
keypoints_now = np.array([d['keypoints'] for d in annots])
conf = np.sqrt(keypoints_now[:, None, :, -1] * keypoints_pre[None, :, :, -1])
diff = np.linalg.norm(keypoints_now[:, None, :, :] - keypoints_pre[None, :, :, :], axis=-1)
dist = np.sum(diff * conf, axis=-1)/np.sum(conf, axis=-1)
nows, pres = np.where(dist < MAX_SPEED)
edges = []
for n, p in zip(nows, pres):
edges.append((n, p, dist[n, p]))
edges.sort(key=lambda x:x[2])
used_n, used_p = [], []
for n, p, _ in edges:
if n in used_n or p in used_p:
continue
annots[n]['personID'] = previous['annots'][p]['personID']
used_n.append(n)
used_p.append(p)
# TODO:stop when missing
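
The greedy ID propagation in `auto_pose_track` can be sketched in plain Python. This is an illustration under assumptions, not the original code: `weighted_distance` and `greedy_track` are hypothetical names, the distance uses only the `(x, y)` coordinates (the original takes the norm over all three channels), and a guard for zero total confidence is added.

```python
import math

def weighted_distance(kpts_a, kpts_b):
    """Confidence-weighted mean 2D distance between two poses of (x, y, conf) joints."""
    num = den = 0.0
    for (xa, ya, ca), (xb, yb, cb) in zip(kpts_a, kpts_b):
        w = math.sqrt(ca * cb)
        num += w * math.hypot(xa - xb, ya - yb)
        den += w
    return num / den if den > 0 else float('inf')

def greedy_track(previous, current, max_speed=100.0):
    """Copy person IDs from `previous` to `current`, closest pairs first."""
    edges = []
    for n, cur in enumerate(current):
        for p, pre in enumerate(previous):
            d = weighted_distance(cur['keypoints'], pre['keypoints'])
            if d < max_speed:
                edges.append((d, n, p))
    edges.sort()
    used_n, used_p = set(), set()
    for d, n, p in edges:
        if n in used_n or p in used_p:
            continue  # each detection and each previous person is matched at most once
        current[n]['personID'] = previous[p]['personID']
        used_n.add(n)
        used_p.add(p)
    return current

previous = [{'personID': 7, 'keypoints': [[100.0, 100.0, 1.0], [100.0, 150.0, 1.0]]},
            {'personID': 3, 'keypoints': [[400.0, 100.0, 1.0], [400.0, 150.0, 1.0]]}]
# The detections arrive in the opposite order in the current frame.
current = [{'personID': -1, 'keypoints': [[405.0, 102.0, 0.9], [404.0, 151.0, 0.9]]},
           {'personID': -1, 'keypoints': [[103.0, 101.0, 0.8], [102.0, 149.0, 0.8]]}]
tracked_ids = [d['personID'] for d in greedy_track(previous, current)]
```

Even though the detection order is swapped between frames, each person keeps the ID of the nearest pose from the previous frame. Greedy matching is simple and fast, but it is not guaranteed to find the globally best assignment when people are close together.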


@ -0,0 +1,38 @@
import numpy as np
import cv2
def _findChessboardCorners(img, pattern):
"basic function"
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
retval, corners = cv2.findChessboardCorners(img, pattern,
flags=cv2.CALIB_CB_ADAPTIVE_THRESH + cv2.CALIB_CB_FAST_CHECK + cv2.CALIB_CB_FILTER_QUADS)
if not retval:
return False, None
corners = cv2.cornerSubPix(img, corners, (11, 11), (-1, -1), criteria)
corners = corners.squeeze()
return True, corners
def _findChessboardCornersAdapt(img, pattern):
"Adapt mode"
img = cv2.adaptiveThreshold(img, 255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
cv2.THRESH_BINARY,21, 2)
return _findChessboardCorners(img, pattern)
def findChessboardCorners(img, annots, pattern):
if annots['visited']:
return None
annots['visited'] = True
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Find the chess board corners
for func in [_findChessboardCornersAdapt, _findChessboardCorners]:
ret, corners = func(gray, pattern)
if ret:break
else:
return None
# found the corners
show = img.copy()
show = cv2.drawChessboardCorners(show, pattern, corners, ret)
assert corners.shape[0] == len(annots['keypoints2d'])
corners = np.hstack((corners, np.ones((corners.shape[0], 1))))
annots['keypoints2d'] = corners.tolist()
return show


@ -0,0 +1,41 @@
import os
import json
import numpy as np
from os.path import join
import shutil
def read_json(path):
with open(path) as f:
data = json.load(f)
return data
def save_json(file, data):
if not os.path.exists(os.path.dirname(file)):
os.makedirs(os.path.dirname(file))
with open(file, 'w') as f:
json.dump(data, f, indent=4)
save_annot = save_json
def getFileList(root, ext='.jpg'):
files = []
dirs = os.listdir(root)
while len(dirs) > 0:
path = dirs.pop()
fullname = join(root, path)
if os.path.isfile(fullname) and fullname.endswith(ext):
files.append(path)
elif os.path.isdir(fullname):
for s in os.listdir(fullname):
newDir = join(path, s)
dirs.append(newDir)
files = sorted(files)
return files
def load_annot_to_tmp(annotname):
if not os.path.exists(annotname):
dirname = os.path.dirname(annotname)
os.makedirs(dirname, exist_ok=True)
shutil.copy(annotname.replace('_tmp', ''), annotname)
annot = read_json(annotname)
return annot
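
`load_annot_to_tmp` follows a copy-on-first-read pattern: edits go to a working copy and the original annotation is only overwritten when the user chooses to save. The sketch below shows that pattern on a temporary directory. `load_to_tmp` is a hypothetical variant that takes both paths explicitly instead of deriving the original path by removing `_tmp` from the name.

```python
import json
import os
import shutil
import tempfile

def load_to_tmp(tmp_name, origin_name):
    """Return the working copy of an annotation, creating it from the original if needed."""
    if not os.path.exists(tmp_name):
        os.makedirs(os.path.dirname(tmp_name), exist_ok=True)
        shutil.copy(origin_name, tmp_name)
    with open(tmp_name) as f:
        return json.load(f)

with tempfile.TemporaryDirectory() as root:
    origin = os.path.join(root, 'annots', '000000.json')
    working = os.path.join(root, 'annots_tmp', '000000.json')
    os.makedirs(os.path.dirname(origin))
    with open(origin, 'w') as f:
        json.dump({'annots': []}, f)

    first = load_to_tmp(working, origin)      # copies, then reads
    with open(working, 'w') as f:             # simulate an edit in the working copy
        json.dump({'annots': [{'personID': 0}]}, f)
    second = load_to_tmp(working, origin)     # reads the edited working copy
    with open(origin) as f:
        untouched = json.load(f)              # the original is unchanged
```

Passing both paths avoids one subtlety of the string-replacement approach: `str.replace('_tmp', '')` removes every occurrence, so a dataset path that happens to contain `_tmp` elsewhere would resolve to the wrong original.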


@ -0,0 +1,148 @@
import numpy as np
from easymocap.dataset.mirror import flipPoint2D
def clear_vanish_points(self, param):
    "remove all vanishing points"
    annots = param['annots']
    annots['vanish_line'] = [[], [], []]
    annots['vanish_point'] = [[], [], []]
def clear_body_points(self, param):
    "remove vanish lines of body"
    annots = param['annots']
    for i in range(3):
        vanish_lines = []
        for data in annots['vanish_line'][i]:
            # manually recorded lines carry confidence 2 (see record_vanish_lines);
            # lines derived from body keypoints carry the keypoint confidence
            if data[0][-1] > 1 and data[1][-1] > 1:
                vanish_lines.append(data)
        annots['vanish_line'][i] = vanish_lines
        if len(vanish_lines) > 1:
            annots['vanish_point'][i] = update_vanish_points(vanish_lines)
def calc_vanishpoint(keypoints2d, thres=0.3):
    '''
    keypoints2d: (2, N, 3), start and end points of N lines as (x, y, confidence)
    returns: (3,), the weighted least-squares intersection (x, y) and the mean confidence
    '''
    valid_idx = []
    for nj in range(keypoints2d.shape[1]):
        if keypoints2d[0, nj, 2] > thres and keypoints2d[1, nj, 2] > thres:
            valid_idx.append(nj)
    assert len(valid_idx) > 0, 'ATTN: cannot calculate the mirror pose'
    keypoints2d = keypoints2d[:, valid_idx]
    # weight: (N, 1)
    weight = keypoints2d[:, :, 2:].mean(axis=0)
    conf = weight.mean()
    # each row of A is the normal (dy, -dx) of one line
    A = np.hstack([
        keypoints2d[1, :, 1:2] - keypoints2d[0, :, 1:2],
        -(keypoints2d[1, :, 0:1] - keypoints2d[0, :, 0:1])
    ])
    b = -keypoints2d[0, :, 0:1]*(keypoints2d[1, :, 1:2] - keypoints2d[0, :, 1:2]) \
        + keypoints2d[0, :, 1:2] * (keypoints2d[1, :, 0:1] - keypoints2d[0, :, 0:1])
    b = -b
    A = A * weight
    b = b * weight
    # solve the normal equations (A^T A) x = A^T b
    avgInsec = np.linalg.inv(A.T @ A) @ (A.T @ b)
    result = np.zeros(3)
    result[0] = avgInsec[0, 0]
    result[1] = avgInsec[1, 0]
    result[2] = conf
    return result
def update_vanish_points(lines):
    # lines: (N, 2, 3) -> (2, N, 3)
    vline0 = np.array(lines).transpose(1, 0, 2)
    # vline0 = np.dstack((vline0, np.ones((vline0.shape[0], vline0.shape[1], 1))))
    dim1points = vline0.copy()
    points = calc_vanishpoint(dim1points)
    return points.tolist()
def get_record_vanish_lines(index):
    def record_vanish_lines(self, param, **kwargs):
        "record vanish lines, X: mirror edge, Y: into mirror, Z: Up"
        annots = param['annots']
        if 'vanish_line' not in annots.keys():
            annots['vanish_line'] = [[], [], []]
        if 'vanish_point' not in annots.keys():
            annots['vanish_point'] = [[], [], []]
        start, end = param['start'], param['end']
        if start is not None and end is not None:
            annots['vanish_line'][index].append([[start[0], start[1], 2], [end[0], end[1], 2]])
            # update the vanishing point
            if len(annots['vanish_line'][index]) > 1:
                annots['vanish_point'][index] = update_vanish_points(annots['vanish_line'][index])
        param['start'] = None
        param['end'] = None
    func = record_vanish_lines
    text = ['parallel to mirror edges', 'vertical to mirror', 'vertical to ground']
    func.__doc__ = 'vanish line of ' + text[index]
    return record_vanish_lines
def vanish_point_from_body(self, param, **kwargs):
    "calculating the vanish point from human keypoints"
    annots = param['annots']
    bodies = annots['annots']
    if len(bodies) < 2:
        return 0
    assert len(bodies) == 2, 'Please make sure that there are only two bboxes!'
    kpts0 = np.array(bodies[0]['keypoints'])
    kpts1 = flipPoint2D(np.array(bodies[1]['keypoints']))
    vanish_line = annots['vanish_line'][1] # the y-dim
    MIN_CONF = 0.5
    for i in range(15):
        conf = min(kpts0[i, -1], kpts1[i, -1])
        if kpts0[i, -1] > MIN_CONF and kpts1[i, -1] > MIN_CONF:
            vanish_line.append([[kpts0[i, 0], kpts0[i, 1], conf], [kpts1[i, 0], kpts1[i, 1], conf]])
    if len(vanish_line) > 1:
        annots['vanish_point'][1] = update_vanish_points(vanish_line)
def copy_edges(self, param, **kwargs):
    "copy the static edges from previous frame"
    if self.frame == 0:
        return 0
    previous = self.previous()
    annots = param['annots']
    # copy the vanish points
    vanish_lines_pre = previous['vanish_line']
    vanish_lines = param['annots']['vanish_line']
    for i in range(3):
        vanish_lines[i] = []
        for data in vanish_lines_pre[i]:
            if data[0][-1] > 1 and data[1][-1] > 1:
                vanish_lines[i].append(data)
        if len(vanish_lines[i]) > 1:
            annots['vanish_point'][i] = update_vanish_points(vanish_lines[i])
def get_calc_intrinsic(mode='xy'):
    def calc_intrinsic(self, param, **kwargs):
        "calculating intrinsic matrix according to vanish points"
        annots = param['annots']
        if mode == 'xy':
            point0 = annots['vanish_point'][0]
            point1 = annots['vanish_point'][1]
        elif mode == 'yz':
            point0 = annots['vanish_point'][1]
            point1 = annots['vanish_point'][2]
        else:
            raise ValueError('unknown mode: {}'.format(mode))
        if len(point0) < 1 or len(point1) < 1:
            return 0
        vanish_point = np.stack([np.array(point0), np.array(point1)])
        H = annots['height']
        W = annots['width']
        # the principal point is assumed to be the image center
        K = np.eye(3)
        K[0, 2] = W/2
        K[1, 2] = H/2
        vanish_point[:, 0] -= W/2
        vanish_point[:, 1] -= H/2
        # two orthogonal vanishing points v0, v1 satisfy v0 . v1 + f^2 = 0
        dot = vanish_point[0][0]*vanish_point[1][0] + vanish_point[0][1]*vanish_point[1][1]
        if dot >= 0:
            # no real focal length exists for this pair of points
            print('>>> cannot estimate K: invalid vanish points')
            return 0
        focal = np.sqrt(-dot)
        K[0, 0] = focal
        K[1, 1] = focal
        annots['K'] = K.tolist()
        print('>>> estimated K: ')
        print(K)
    calc_intrinsic.__doc__ = 'calculate K with {}'.format(mode)
    return calc_intrinsic
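
The two geometric steps in this file, intersecting several annotated lines to get a vanishing point and recovering the focal length from two orthogonal vanishing points, can be sketched without NumPy as follows. This is an illustrative sketch only: the names `intersect_lines` and `focal_from_vanishing_points` are hypothetical, the lines are unweighted (the code above weights each line by its confidence), and the principal point is assumed to be the image center, as in `calc_intrinsic`.

```python
import math


def intersect_lines(lines):
    """Least-squares intersection of 2D lines given as ((x0, y0), (x1, y1)).

    Each line contributes the constraint n . p = n . p0, where n = (dy, -dx)
    is its normal. The 2x2 normal equations are solved in closed form.
    Returns None when the lines are (numerically) parallel.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x0, y0), (x1, y1) in lines:
        nx, ny = y1 - y0, -(x1 - x0)
        c = nx * x0 + ny * y0
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        return None
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def focal_from_vanishing_points(vp0, vp1, width, height):
    """Focal length from two vanishing points of orthogonal directions.

    With the principal point c at the image center, orthogonality gives
    (vp0 - c) . (vp1 - c) + f^2 = 0. Returns None if no real solution exists.
    """
    cx, cy = width / 2.0, height / 2.0
    dot = (vp0[0] - cx) * (vp1[0] - cx) + (vp0[1] - cy) * (vp1[1] - cy)
    if dot >= 0:
        return None
    return math.sqrt(-dot)
```

For example, on a 1920x1080 image the vanishing points (1960, 540) and (-40, 540) correspond to a focal length of 1000 pixels under these assumptions.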


@ -0,0 +1,28 @@
import cv2
import numpy as np
from .basic_visualize import plot_cross
def vis_vanish_lines(img, annots, **kwargs):
    if 'vanish_line' not in annots.keys():
        annots['vanish_line'] = [[], [], []]
    if 'vanish_point' not in annots.keys():
        annots['vanish_point'] = [[], [], []]
    colors = [(96, 96, 255), (96, 255, 96), (255, 64, 64)]
    for i in range(3):
        point = annots['vanish_point'][i]
        if len(point) == 0:
            continue
        x, y, c = point
        plot_cross(img, x, y, colors[i])
        points = np.array(annots['vanish_line'][i]).reshape(-1, 3)
        for (xx, yy, conf) in points:
            plot_cross(img, xx, yy, colors[i])
            cv2.line(img, (int(x), int(y)), (int(xx), int(yy)), colors[i], 2)
    for i in range(3):
        for pt1, pt2 in annots['vanish_line'][i]:
            cv2.line(img, (int(pt1[0]), int(pt1[1])), (int(pt2[0]), int(pt2[1])), colors[i], 2)
    return img


@ -0,0 +1,12 @@
'''
@ Date: 2021-01-13 18:50:31
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-28 22:11:58
@ FilePath: /EasyMocap/code/dataset/__init__.py
'''
from .config import CONFIG
from .base import ImageFolder
from .mv1pmf import MV1PMF
from .mv1pmf_mirror import MV1PMF_Mirror
from .mvmpmf import MVMPMF

608
easymocap/dataset/base.py Normal file

@ -0,0 +1,608 @@
'''
@ Date: 2021-01-13 16:53:55
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 15:59:35
@ FilePath: /EasyMocap/easymocap/dataset/base.py
'''
import os
import json
from os.path import join
from glob import glob
import cv2
import sys
import numpy as np
from ..mytools.camera_utils import read_camera, get_fundamental_matrix, Undistort
from ..mytools import FileWriter, read_annot, getFileList
from ..mytools.reader import read_keypoints3d, read_json
# from ..mytools.writer import FileWriter
# from ..mytools.camera_utils import read_camera, undistort, write_camera, get_fundamental_matrix
# from ..mytools.vis_base import merge, plot_bbox, plot_keypoints
# from ..mytools.file_utils import read_json, save_json, read_annot, read_smpl, write_smpl, get_bbox_from_pose
# from ..mytools.file_utils import merge_params, select_nf, getFileList
def crop_image(img, annot, vis_2d=False, config={}, crop_square=True):
for det in annot:
bbox = det['bbox']
l, t, r, b = det['bbox'][:4]
if crop_square:
if b - t > r - l:
diff = (b - t) - (r - l)
l -= diff//2
r += diff//2
else:
diff = (r - l) - (b - t)
t -= diff//2
b += diff//2
l = max(0, int(l+0.5))
t = max(0, int(t+0.5))
r = min(img.shape[1], int(r+0.5))
b = min(img.shape[0], int(b+0.5))
det['bbox'][:4] = [l, t, r, b]
if vis_2d:
crop_img = img.copy()
plot_keypoints(crop_img, det['keypoints'], pid=det['id'],
config=config, use_limb_color=True, lw=2)
else:
crop_img = img
crop_img = crop_img[t:b, l:r, :]
if crop_square:
crop_img = cv2.resize(crop_img, (256, 256))
else:
crop_img = cv2.resize(crop_img, (128, 256))
det['crop'] = crop_img
det['img'] = img
return 0
class ImageFolder:
"""Dataset for image folders"""
def __init__(self, root, subs=[], out=None, image_root='images', annot_root='annots',
kpts_type='body15', config={}, no_img=False) -> None:
self.root = root
self.image_root = join(root, image_root)
self.annot_root = join(root, annot_root)
self.kpts_type = kpts_type
self.no_img = no_img
if len(subs) == 0:
self.imagelist = getFileList(self.image_root, '.jpg')
self.annotlist = getFileList(self.annot_root, '.json')
else:
self.imagelist, self.annotlist = [], []
for sub in subs:
images = sorted([join(sub, i) for i in os.listdir(join(self.image_root, sub))])
self.imagelist.extend(images)
annots = sorted([join(sub, i) for i in os.listdir(join(self.annot_root, sub))])
self.annotlist.extend(annots)
# output
assert out is not None
self.out = out
self.writer = FileWriter(self.out, config=config)
self.gtK, self.gtRT = False, False
def load_gt_cameras(self):
cameras = load_cameras(self.root)
gtCameras = []
for i, name in enumerate(self.annotlist):
cam = os.path.dirname(name)
gtcams = {key:cameras[cam][key].copy() for key in ['K', 'R', 'T', 'dist']}
gtCameras.append(gtcams)
self.gtCameras = gtCameras
def __len__(self) -> int:
return len(self.imagelist)
def __getitem__(self, index: int):
imgname = join(self.image_root, self.imagelist[index])
annname = join(self.annot_root, self.annotlist[index])
assert os.path.exists(imgname) and os.path.exists(annname), (imgname, annname)
assert os.path.basename(imgname).split('.')[0] == os.path.basename(annname).split('.')[0], '{}, {}'.format(imgname, annname)
if not self.no_img:
img = cv2.imread(imgname)
else:
img = None
annot = read_annot(annname, self.kpts_type)
return img, annot
def camera(self, index=0, annname=None):
if annname is None:
annname = join(self.annot_root, self.annotlist[index])
data = read_json(annname)
if 'K' not in data.keys():
height, width = data['height'], data['width']
# focal = 1.2*max(height, width) # as colmap
focal = 1.2*min(height, width) # as colmap
K = np.array([focal, 0., width/2, 0., focal, height/2, 0. ,0., 1.]).reshape(3, 3)
else:
K = np.array(data['K']).reshape(3, 3)
camera = {'K':K ,'R': np.eye(3), 'T': np.zeros((3, 1)), 'dist': np.zeros((1, 5))}
if self.gtK:
camera['K'] = self.gtCameras[index]['K']
if self.gtRT:
camera['R'] = self.gtCameras[index]['R']
camera['T'] = self.gtCameras[index]['T']
# camera['T'][2, 0] = 5. # guess to 5 meters
camera['RT'] = np.hstack((camera['R'], camera['T']))
camera['P'] = camera['K'] @ np.hstack((camera['R'], camera['T']))
return camera
def basename(self, nf):
return self.annotlist[nf].replace('.json', '')
def write_keypoints3d(self, results, nf):
outname = join(self.out, 'keypoints3d', '{}.json'.format(self.basename(nf)))
self.writer.write_keypoints3d(results, outname)
def write_smpl(self, results, nf):
outname = join(self.out, 'smpl', '{}.json'.format(self.basename(nf)))
self.writer.write_smpl(results, outname)
def vis_smpl(self, render_data, image, camera, nf):
outname = join(self.out, 'smpl', '{}.jpg'.format(self.basename(nf)))
images = [image]
for key in camera.keys():
camera[key] = camera[key][None, :, :]
self.writer.vis_smpl(render_data, images, camera, outname, add_back=True)
class VideoFolder(ImageFolder):
"Folder of images extracted from a single video"
def __init__(self, root, name, out=None,
image_root='images', annot_root='annots',
kpts_type='body15', config={}, no_img=False) -> None:
self.root = root
self.image_root = join(root, image_root, name)
self.annot_root = join(root, annot_root, name)
self.name = name
self.kpts_type = kpts_type
self.no_img = no_img
self.imagelist = sorted(os.listdir(self.image_root))
self.annotlist = sorted(os.listdir(self.annot_root))
self.ret_crop = False
def load_annot_all(self, path):
# this does not use the person ID; it simply lists all annotations
assert os.path.exists(path), '{} not exists!'.format(path)
results = []
annnames = sorted(glob(join(path, '*.json')))
for annname in annnames:
datas = read_annot(annname, self.kpts_type)
if self.ret_crop:
# TODO: modify imgname
basename = os.path.basename(annname)
imgname = annname\
.replace('annots-cpn', 'images')\
.replace('annots', 'images')\
.replace('.json', '.jpg')
assert os.path.exists(imgname), imgname
img = cv2.imread(imgname)
crop_image(img, datas)
results.append(datas)
return results
def load_annot(self, path, pids=[]):
# group the annotations by person ID in advance
assert os.path.exists(path), '{} not exists!'.format(path)
results = {}
annnames = sorted(glob(join(path, '*.json')))
for annname in annnames:
nf = int(os.path.basename(annname).replace('.json', ''))
datas = read_annot(annname, self.kpts_type)
for data in datas:
pid = data['id']
if len(pids) > 0 and pid not in pids:
continue
# NOTE: the starting frame is not taken into account here
if pid not in results.keys():
results[pid] = {'bboxes': [], 'keypoints2d': []}
results[pid]['bboxes'].append(data['bbox'])
results[pid]['keypoints2d'].append(data['keypoints'])
for pid, val in results.items():
for key in val.keys():
val[key] = np.stack(val[key])
return results
def load_smpl(self, path, pids=[]):
""" load SMPL parameters from files
Args:
path (str): root path of smpl
pids (list, optional): used person ids. Defaults to [], loading all person.
"""
assert os.path.exists(path), '{} not exists!'.format(path)
results = {}
smplnames = sorted(glob(join(path, '*.json')))
for smplname in smplnames:
nf = int(os.path.basename(smplname).replace('.json', ''))
datas = read_smpl(smplname)
for data in datas:
pid = data['id']
if len(pids) > 0 and pid not in pids:
continue
# NOTE: the starting frame is not taken into account here
if pid not in results.keys():
results[pid] = {'body_params': [], 'frames': []}
results[pid]['body_params'].append(data)
results[pid]['frames'].append(nf)
for pid, val in results.items():
val['body_params'] = merge_params(val['body_params'])
return results
class _VideoBase:
"""Dataset for single sequence data
"""
def __init__(self, image_root, annot_root, out=None, config={}, kpts_type='body15', no_img=False) -> None:
self.image_root = image_root
self.annot_root = annot_root
self.kpts_type = kpts_type
self.no_img = no_img
self.config = config
assert out is not None
self.out = out
self.writer = FileWriter(self.out, config=config)
imgnames = sorted(os.listdir(self.image_root))
self.imagelist = imgnames
self.annotlist = sorted(os.listdir(self.annot_root))
self.nFrames = len(self.imagelist)
self.undis = False
self.read_camera()
def read_camera(self):
# read the camera parameters
annname = join(self.annot_root, self.annotlist[0])
data = read_json(annname)
if 'K' not in data.keys():
height, width = data['height'], data['width']
focal = 1.2*max(height, width)
K = np.array([focal, 0., width/2, 0., focal, height/2, 0. ,0., 1.]).reshape(3, 3)
else:
K = np.array(data['K']).reshape(3, 3)
self.camera = {'K':K ,'R': np.eye(3), 'T': np.zeros((3, 1))}
def __getitem__(self, index: int):
imgname = join(self.image_root, self.imagelist[index])
annname = join(self.annot_root, self.annotlist[index])
assert os.path.exists(imgname) and os.path.exists(annname)
assert os.path.basename(imgname).split('.')[0] == os.path.basename(annname).split('.')[0], '{}, {}'.format(imgname, annname)
if not self.no_img:
img = cv2.imread(imgname)
else:
img = None
annot = read_annot(annname, self.kpts_type)
return img, annot
def __len__(self) -> int:
return self.nFrames
def write_smpl(self, peopleDict, nf):
results = []
for pid, people in peopleDict.items():
result = {'id': pid}
result.update(people.body_params)
results.append(result)
self.writer.write_smpl(results, nf)
def vis_detections(self, image, detections, nf, to_img=True):
return self.writer.vis_detections([image], [detections], nf,
key='keypoints', to_img=to_img, vis_id=True)
def vis_repro(self, peopleDict, image, annots, nf):
# visualize the reprojected keypoints against the input keypoints
detections = []
for pid, data in peopleDict.items():
keypoints3d = (data.keypoints3d @ self.camera['R'].T + self.camera['T'].T) @ self.camera['K'].T
keypoints3d[:, :2] /= keypoints3d[:, 2:]
keypoints3d = np.hstack([keypoints3d, data.keypoints3d[:, -1:]])
det = {
'id': pid,
'repro': keypoints3d
}
detections.append(det)
return self.writer.vis_detections([image], [detections], nf, key='repro',
to_img=True, vis_id=False)
def vis_smpl(self, peopleDict, faces, image, nf, sub_vis=[],
mode='smpl', extra_data=[], add_back=True,
axis=np.array([1., 0., 0.]), degree=0., fix_center=None):
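# To keep the interface uniform, the rotated view is implemented here; it is only used for single-view data
# it is implemented by modifying the camera parameters
# the camera correction can be obtained by computing the center of the points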
# render the smpl to each view
render_data = {}
for pid, data in peopleDict.items():
render_data[pid] = {
'vertices': data.vertices, 'faces': faces,
'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
for iid, extra in enumerate(extra_data):
render_data[10000+iid] = {
'vertices': extra['vertices'],
'faces': extra['faces'],
'colors': extra['colors'],
'name': extra['name']
}
camera = {}
for key in self.camera.keys():
camera[key] = self.camera[key][None, :, :]
# render another view point
if np.abs(degree) > 1e-3:
vertices_all = np.vstack([data.vertices for data in peopleDict.values()])
if fix_center is None:
center = np.mean(vertices_all, axis=0, keepdims=True)
new_center = center.copy()
new_center[:, 0:2] = 0
else:
center = fix_center.copy()
new_center = fix_center.copy()
new_center[:, 2] *= 1.5
direc = np.array(axis)
rot, _ = cv2.Rodrigues(direc*degree/90*np.pi/2)
# If we rotate the data, it is like:
# V = Rnew @ (V0 - center) + new_center
# = Rnew @ V0 - Rnew @ center + new_center
# combine with the camera
# VV = Rc(Rnew @ V0 - Rnew @ center + new_center) + Tc
# = Rc@Rnew @ V0 + Rc @ (new_center - Rnew@center) + Tc
blank = np.zeros_like(image, dtype=np.uint8) + 255
images = [image, blank]
Rnew = camera['R'][0] @ rot
Tnew = camera['R'][0] @ (new_center.T - rot @ center.T) + camera['T'][0]
camera['K'] = np.vstack([camera['K'], camera['K']])
camera['R'] = np.vstack([camera['R'], Rnew[None, :, :]])
camera['T'] = np.vstack([camera['T'], Tnew[None, :, :]])
else:
images = [image]
self.writer.vis_smpl(render_data, nf, images, camera, mode, add_back=add_back)
def load_cameras(path):
# read the camera parameters
intri_name = join(path, 'intri.yml')
extri_name = join(path, 'extri.yml')
if os.path.exists(intri_name) and os.path.exists(extri_name):
cameras = read_camera(intri_name, extri_name)
cams = cameras.pop('basenames')
else:
print('\n\n!!!there is no camera parameters, maybe bug: \n', intri_name, extri_name, '\n')
cameras = None
return cameras
class MVBase:
""" Dataset for multiview data
"""
def __init__(self, root, cams=[], out=None, config={},
image_root='images', annot_root='annots',
kpts_type='body15',
undis=True, no_img=False) -> None:
self.root = root
self.image_root = join(root, image_root)
self.annot_root = join(root, annot_root)
self.kpts_type = kpts_type
self.undis = undis
self.no_img = no_img
# use when debug
self.ret_crop = False
self.config = config
# results path
# the results store keypoints3d
self.skel_path = None
self.out = out
self.writer = FileWriter(self.out, config=config)
self.cams = cams
self.imagelist = {}
self.annotlist = {}
for cam in cams: # TODO: add start, end
# ATTN: when image name's frame number is not continuous,
imgnames = sorted(os.listdir(join(self.image_root, cam)))
self.imagelist[cam] = imgnames
if os.path.exists(self.annot_root):
self.annotlist[cam] = sorted(os.listdir(join(self.annot_root, cam)))
self.has2d = True
else:
self.has2d = False
nFrames = min([len(val) for key, val in self.imagelist.items()])
self.nFrames = nFrames
self.nViews = len(cams)
self.read_camera(self.root)
def read_camera(self, path):
# read the camera parameters
intri_name = join(path, 'intri.yml')
extri_name = join(path, 'extri.yml')
if os.path.exists(intri_name) and os.path.exists(extri_name):
self.cameras = read_camera(intri_name, extri_name)
self.cameras.pop('basenames')
# NOTE: always index the cameras by the user-defined camera list; otherwise using only a subset of the cameras breaks
cams = self.cams
self.cameras_for_affinity = [[cam['invK'], cam['R'], cam['T']] for cam in [self.cameras[name] for name in cams]]
self.Pall = np.stack([self.cameras[cam]['P'] for cam in cams])
self.Fall = get_fundamental_matrix(self.cameras, cams)
else:
print('\n!!!\n!!!there is no camera parameters, maybe bug: \n', intri_name, extri_name, '\n')
self.cameras = None
def undistort(self, images):
if self.cameras is not None and len(images) > 0:
images_ = []
for nv in range(self.nViews):
mtx = self.cameras[self.cams[nv]]['K']
dist = self.cameras[self.cams[nv]]['dist']
if images[nv] is not None:
frame = cv2.undistort(images[nv], mtx, dist, None)
else:
frame = None
images_.append(frame)
else:
images_ = images
return images_
def undis_det(self, lDetections):
for nv in range(len(lDetections)):
camera = self.cameras[self.cams[nv]]
for det in lDetections[nv]:
det['bbox'] = Undistort.bbox(det['bbox'], K=camera['K'], dist=camera['dist'])
keypoints = det['keypoints']
det['keypoints'] = Undistort.points(keypoints=keypoints, K=camera['K'], dist=camera['dist'])
return lDetections
def select_person(self, annots_all, index, pid):
annots = {'bbox': [], 'keypoints': []}
for nv, cam in enumerate(self.cams):
data = [d for d in annots_all[nv] if d['id'] == pid]
if len(data) == 1:
data = data[0]
bbox = data['bbox']
keypoints = data['keypoints']
else:
if getattr(self, 'verbose', False): print('not found pid {} in frame {}, view {}'.format(pid, index, nv))
keypoints = np.zeros((self.config['nJoints'], 3))
bbox = np.array([0, 0, 100., 100., 0.])
annots['bbox'].append(bbox)
annots['keypoints'].append(keypoints)
for key in ['bbox', 'keypoints']:
annots[key] = np.stack(annots[key])
return annots
def __getitem__(self, index: int):
images, annots = [], []
for cam in self.cams:
imgname = join(self.image_root, cam, self.imagelist[cam][index])
assert os.path.exists(imgname), imgname
if self.has2d:
annname = join(self.annot_root, cam, self.annotlist[cam][index])
assert os.path.exists(annname), annname
assert self.imagelist[cam][index].split('.')[0] == self.annotlist[cam][index].split('.')[0]
annot = read_annot(annname, self.kpts_type)
else:
annot = []
if not self.no_img:
img = cv2.imread(imgname)
images.append(img)
else:
img = None
images.append(None)
# TODO: index 0 is taken directly here
if self.ret_crop:
crop_image(img, annot, True, self.config)
annots.append(annot)
if self.undis:
images = self.undistort(images)
annots = self.undis_det(annots)
return images, annots
def __len__(self) -> int:
return self.nFrames
def vis_detections(self, images, lDetections, nf, mode='detec', to_img=True, sub_vis=[]):
outname = join(self.out, mode, '{:06d}.jpg'.format(nf))
if len(sub_vis) != 0:
valid_idx = [self.cams.index(i) for i in sub_vis]
images = [images[i] for i in valid_idx]
lDetections = [lDetections[i] for i in valid_idx]
return self.writer.vis_keypoints2d_mv(images, lDetections, outname=outname, vis_id=False)
def vis_match(self, images, lDetections, nf, to_img=True, sub_vis=[]):
if len(sub_vis) != 0:
valid_idx = [self.cams.index(i) for i in sub_vis]
images = [images[i] for i in valid_idx]
lDetections = [lDetections[i] for i in valid_idx]
return self.writer.vis_detections(images, lDetections, nf,
key='match', to_img=to_img, vis_id=True)
def basename(self, nf):
return '{:06d}'.format(nf)
def write_keypoints3d(self, results, nf):
outname = join(self.out, 'keypoints3d', self.basename(nf)+'.json')
self.writer.write_keypoints3d(results, outname)
def write_smpl(self, results, nf):
outname = join(self.out, 'smpl', self.basename(nf)+'.json')
self.writer.write_smpl(results, outname)
def vis_smpl(self, peopleDict, faces, images, nf, sub_vis=[],
mode='smpl', extra_data=[], extra_mesh=[],
add_back=True, camera_scale=1, cameras=None):
# render the smpl to each view
render_data = {}
for pid, data in peopleDict.items():
render_data[pid] = {
'vertices': data.vertices, 'faces': faces,
'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
for iid, extra in enumerate(extra_data):
render_data[10000+iid] = {
'vertices': extra['vertices'],
'faces': extra['faces'],
'name': extra['name']
}
if 'colors' in extra.keys():
render_data[10000+iid]['colors'] = extra['colors']
elif 'vid' in extra.keys():
render_data[10000+iid]['vid'] = extra['vid']
if len(sub_vis) == 0:
sub_vis = self.cams
images = [images[self.cams.index(cam)] for cam in sub_vis]
if cameras is None:
cameras = {'K': [], 'R':[], 'T':[]}
for key in cameras.keys():
cameras[key] = [self.cameras[cam][key] for cam in sub_vis]
for key in cameras.keys():
cameras[key] = np.stack([self.cameras[cam][key] for cam in sub_vis])
# the camera_back parameter controls how far the camera is moved backwards
# position of the camera optical center: -R.T @ T
if False:
R = cameras['R']
T = cameras['T']
cam_center = np.einsum('bij,bjk->bik', -R.transpose(0, 2, 1), T)
# viewing direction of the camera: R @ [0, 0, 1]
zdir = np.array([0., 0., 1.]).reshape(-1, 3, 1)
direction = np.einsum('bij,bjk->bik', R, zdir)
cam_center = cam_center - direction * 1
# updated camera T: -R @ C
Tnew = - np.einsum('bij,bjk->bik', R, cam_center)
cameras['T'] = Tnew
else:
cameras['K'][:, 0, 0] /= camera_scale
cameras['K'][:, 1, 1] /= camera_scale
return self.writer.vis_smpl(render_data, nf, images, cameras, mode, add_back=add_back, extra_mesh=extra_mesh)
def read_skeleton(self, start, end):
keypoints3ds = []
for nf in range(start, end):
skelname = join(self.out, 'keypoints3d', '{:06d}.json'.format(nf))
skeletons = read_keypoints3d(skelname)
skeleton = [i for i in skeletons if i['id'] == self.pid]
assert len(skeleton) == 1, 'There must be only 1 keypoints3d, id = {} in {}'.format(self.pid, skelname)
keypoints3ds.append(skeleton[0]['keypoints3d'])
keypoints3ds = np.stack(keypoints3ds)
return keypoints3ds
def read_skel(self, nf, path=None, mode='none'):
if path is None:
path = self.skel_path
assert path is not None, 'please set the skeleton path'
if mode == 'a4d':
outname = join(path, '{}.txt'.format(nf))
assert os.path.exists(outname), outname
skels = readReasultsTxt(outname)
elif mode == 'none':
outname = join(path, '{:06d}.json'.format(nf))
assert os.path.exists(outname), outname
skels = readResultsJson(outname)
else:
raise ValueError('unknown mode: {}'.format(mode))
return skels
def read_smpl(self, nf, path=None):
if path is None:
path = self.skel_path
assert path is not None, 'please set the skeleton path'
outname = join(path, '{:06d}.json'.format(nf))
assert os.path.exists(outname), outname
datas = read_json(outname)
outputs = []
for data in datas:
for key in ['Rh', 'Th', 'poses', 'shapes']:
data[key] = np.array(data[key])
outputs.append(data)
return outputs
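
When an annotation file carries no `K`, the datasets above fall back to a guessed pinhole camera: the focal length is 1.2 times the image side (`min` in `ImageFolder.camera`, `max` in `_VideoBase.read_camera`) and the principal point is the image center. The sketch below restates that fallback in plain Python and projects a camera-space point with it. The function names and the `use_max` flag are hypothetical and exist only for this illustration; identity rotation and zero translation are assumed.

```python
def default_intrinsics(height, width, use_max=False):
    """Guess a 3x3 intrinsic matrix from the image size alone.

    focal = 1.2 * min(height, width) by default, or 1.2 * max(...) when
    use_max is True; the principal point is the image center.
    """
    side = max(height, width) if use_max else min(height, width)
    focal = 1.2 * side
    return [
        [focal, 0.0, width / 2.0],
        [0.0, focal, height / 2.0],
        [0.0, 0.0, 1.0],
    ]


def project(K, point):
    """Project a 3D point given in camera coordinates (R = I, T = 0)."""
    X, Y, Z = point
    u = K[0][0] * X / Z + K[0][2]
    v = K[1][1] * Y / Z + K[1][2]
    return u, v
```

A point on the optical axis, such as `(0, 0, 5)`, projects to the image center regardless of the guessed focal length, which is a quick sanity check for the fallback.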

696
easymocap/dataset/config.py Normal file

@ -0,0 +1,696 @@
'''
* @ Date: 2020-09-26 16:52:55
* @ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-03 18:30:13
@ FilePath: /EasyMocap/easymocap/dataset/config.py
'''
import numpy as np
CONFIG = {}
CONFIG['smpl'] = {'nJoints': 24, 'kintree':
[
[ 0, 1 ],
[ 0, 2 ],
[ 0, 3 ],
[ 1, 4 ],
[ 2, 5 ],
[ 3, 6 ],
[ 4, 7 ],
[ 5, 8 ],
[ 6, 9 ],
[ 7, 10],
[ 8, 11],
[ 9, 12],
[ 9, 13],
[ 9, 14],
[12, 15],
[13, 16],
[14, 17],
[16, 18],
[17, 19],
[18, 20],
[19, 21],
[20, 22],
[21, 23],
],
'joint_names': [
'MidHip', # 0
'LUpLeg', # 1
'RUpLeg', # 2
'spine', # 3
'LLeg', # 4
'RLeg', # 5
'spine1', # 6
'LFoot', # 7
'RFoot', # 8
'spine2', # 9
'LToeBase', # 10
'RToeBase', # 11
'neck', # 12
'LShoulder', # 13
'RShoulder', # 14
'head', # 15
'LArm', # 16
'RArm', # 17
'LForeArm', # 18
'RForeArm', # 19
'LHand', # 20
'RHand', # 21
'LHandIndex1', # 22
'RHandIndex1', # 23
]
}
CONFIG['body25'] = {'nJoints': 25, 'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
[ 4, 3],
[ 5, 1],
[ 6, 5],
[ 7, 6],
[ 8, 1],
[ 9, 8],
[10, 9],
[11, 10],
[12, 8],
[13, 12],
[14, 13],
[15, 0],
[16, 0],
[17, 15],
[18, 16],
[19, 14],
[20, 19],
[21, 14],
[22, 11],
[23, 22],
[24, 11]],
'joint_names': [
"Nose", "Neck", "RShoulder", "RElbow", "RWrist", "LShoulder", "LElbow", "LWrist", "MidHip", "RHip","RKnee","RAnkle","LHip","LKnee","LAnkle","REye","LEye","REar","LEar","LBigToe","LSmallToe","LHeel","RBigToe","RSmallToe","RHeel"]}
CONFIG['body25']['kintree_order'] = [
[1, 8], # put the torso first
[1, 2],
[2, 3],
[3, 4],
[1, 5],
[5, 6],
[6, 7],
[8, 9],
[8, 12],
[9, 10],
[10, 11],
[12, 13],
[13, 14],
[1, 0],
[0, 15],
[0, 16],
[15, 17],
[16, 18],
[11, 22],
[11, 24],
[22, 23],
[14, 19],
[19, 20],
[14, 21]
]
CONFIG['body25']['colors'] = ['k', 'r', 'r', 'r', 'b', 'b', 'b', 'k', 'r', 'r', 'r', 'b', 'b', 'b', 'r', 'b', 'r', 'b', 'b', 'b', 'b', 'r', 'r', 'r']
CONFIG['body25']['skeleton'] = \
{
( 0, 1): {'mean': 0.228, 'std': 0.046}, # Nose ->Neck
( 1, 2): {'mean': 0.144, 'std': 0.029}, # Neck ->RShoulder
( 2, 3): {'mean': 0.283, 'std': 0.057}, # RShoulder->RElbow
( 3, 4): {'mean': 0.258, 'std': 0.052}, # RElbow ->RWrist
( 1, 5): {'mean': 0.145, 'std': 0.029}, # Neck ->LShoulder
( 5, 6): {'mean': 0.281, 'std': 0.056}, # LShoulder->LElbow
( 6, 7): {'mean': 0.258, 'std': 0.052}, # LElbow ->LWrist
( 1, 8): {'mean': 0.483, 'std': 0.097}, # Neck ->MidHip
( 8, 9): {'mean': 0.106, 'std': 0.021}, # MidHip ->RHip
( 9, 10): {'mean': 0.438, 'std': 0.088}, # RHip ->RKnee
(10, 11): {'mean': 0.406, 'std': 0.081}, # RKnee ->RAnkle
( 8, 12): {'mean': 0.106, 'std': 0.021}, # MidHip ->LHip
(12, 13): {'mean': 0.438, 'std': 0.088}, # LHip ->LKnee
(13, 14): {'mean': 0.408, 'std': 0.082}, # LKnee ->LAnkle
( 0, 15): {'mean': 0.043, 'std': 0.009}, # Nose ->REye
( 0, 16): {'mean': 0.043, 'std': 0.009}, # Nose ->LEye
(15, 17): {'mean': 0.105, 'std': 0.021}, # REye ->REar
(16, 18): {'mean': 0.104, 'std': 0.021}, # LEye ->LEar
(14, 19): {'mean': 0.180, 'std': 0.036}, # LAnkle ->LBigToe
(19, 20): {'mean': 0.038, 'std': 0.008}, # LBigToe ->LSmallToe
(14, 21): {'mean': 0.044, 'std': 0.009}, # LAnkle ->LHeel
(11, 22): {'mean': 0.182, 'std': 0.036}, # RAnkle ->RBigToe
(22, 23): {'mean': 0.038, 'std': 0.008}, # RBigToe ->RSmallToe
(11, 24): {'mean': 0.044, 'std': 0.009}, # RAnkle ->RHeel
}
CONFIG['body15'] = {'nJoints': 15, 'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
[ 4, 3],
[ 5, 1],
[ 6, 5],
[ 7, 6],
[ 8, 1],
[ 9, 8],
[10, 9],
[11, 10],
[12, 8],
[13, 12],
[14, 13],]}
CONFIG['body15']['joint_names'] = CONFIG['body25']['joint_names'][:15]
CONFIG['body15']['skeleton'] = {key: val for key, val in CONFIG['body25']['skeleton'].items() if key[0] < 15 and key[1] < 15}
CONFIG['body15']['kintree_order'] = CONFIG['body25']['kintree_order'][:14]
CONFIG['body15']['colors'] = CONFIG['body25']['colors'][:15]
CONFIG['panoptic'] = {
'nJoints': 19,
'joint_names': ['Neck', 'Nose', 'MidHip', 'LShoulder', 'LElbow', 'LWrist', 'LHip', 'LKnee', 'LAnkle', 'RShoulder','RElbow', 'RWrist', 'RHip','RKnee', 'RAnkle', 'LEye', 'LEar', 'REye', 'REar']
}
CONFIG['hand'] = {'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
[ 4, 3],
[ 5, 0],
[ 6, 5],
[ 7, 6],
[ 8, 7],
[ 9, 0],
[10, 9],
[11, 10],
[12, 11],
[13, 0],
[14, 13],
[15, 14],
[16, 15],
[17, 0],
[18, 17],
[19, 18],
[20, 19]],
'colors': [
'k', 'k', 'k', 'k', 'r', 'r', 'r', 'r',
'g', 'g', 'g', 'g', 'b', 'b', 'b', 'b',
'y', 'y', 'y', 'y']
}
CONFIG['bodyhand'] = {'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
[ 4, 3],
[ 5, 1],
[ 6, 5],
[ 7, 6],
[ 8, 1],
[ 9, 8],
[10, 9],
[11, 10],
[12, 8],
[13, 12],
[14, 13],
[15, 0],
[16, 0],
[17, 15],
[18, 16],
[19, 14],
[20, 19],
[21, 14],
[22, 11],
[23, 22],
[24, 11],
[26, 7], # handl
[27, 26],
[28, 27],
[29, 28],
[30, 7],
[31, 30],
[32, 31],
[33, 32],
[34, 7],
[35, 34],
[36, 35],
[37, 36],
[38, 7],
[39, 38],
[40, 39],
[41, 40],
[42, 7],
[43, 42],
[44, 43],
[45, 44],
[47, 4], # handr
[48, 47],
[49, 48],
[50, 49],
[51, 4],
[52, 51],
[53, 52],
[54, 53],
[55, 4],
[56, 55],
[57, 56],
[58, 57],
[59, 4],
[60, 59],
[61, 60],
[62, 61],
[63, 4],
[64, 63],
[65, 64],
[66, 65]
],
'nJoints': 67,
'skeleton':{
( 0, 1): {'mean': 0.251, 'std': 0.050},
( 1, 2): {'mean': 0.169, 'std': 0.034},
( 2, 3): {'mean': 0.292, 'std': 0.058},
( 3, 4): {'mean': 0.275, 'std': 0.055},
( 1, 5): {'mean': 0.169, 'std': 0.034},
( 5, 6): {'mean': 0.295, 'std': 0.059},
( 6, 7): {'mean': 0.278, 'std': 0.056},
( 1, 8): {'mean': 0.566, 'std': 0.113},
( 8, 9): {'mean': 0.110, 'std': 0.022},
( 9, 10): {'mean': 0.398, 'std': 0.080},
(10, 11): {'mean': 0.402, 'std': 0.080},
( 8, 12): {'mean': 0.111, 'std': 0.022},
(12, 13): {'mean': 0.395, 'std': 0.079},
(13, 14): {'mean': 0.403, 'std': 0.081},
( 0, 15): {'mean': 0.053, 'std': 0.011},
( 0, 16): {'mean': 0.056, 'std': 0.011},
(15, 17): {'mean': 0.107, 'std': 0.021},
(16, 18): {'mean': 0.107, 'std': 0.021},
(14, 19): {'mean': 0.180, 'std': 0.036},
(19, 20): {'mean': 0.055, 'std': 0.011},
(14, 21): {'mean': 0.065, 'std': 0.013},
(11, 22): {'mean': 0.169, 'std': 0.034},
(22, 23): {'mean': 0.052, 'std': 0.010},
(11, 24): {'mean': 0.061, 'std': 0.012},
( 7, 26): {'mean': 0.045, 'std': 0.009},
(26, 27): {'mean': 0.042, 'std': 0.008},
(27, 28): {'mean': 0.035, 'std': 0.007},
(28, 29): {'mean': 0.029, 'std': 0.006},
( 7, 30): {'mean': 0.102, 'std': 0.020},
(30, 31): {'mean': 0.040, 'std': 0.008},
(31, 32): {'mean': 0.026, 'std': 0.005},
(32, 33): {'mean': 0.023, 'std': 0.005},
( 7, 34): {'mean': 0.101, 'std': 0.020},
(34, 35): {'mean': 0.043, 'std': 0.009},
(35, 36): {'mean': 0.029, 'std': 0.006},
(36, 37): {'mean': 0.024, 'std': 0.005},
( 7, 38): {'mean': 0.097, 'std': 0.019},
(38, 39): {'mean': 0.041, 'std': 0.008},
(39, 40): {'mean': 0.027, 'std': 0.005},
(40, 41): {'mean': 0.024, 'std': 0.005},
( 7, 42): {'mean': 0.095, 'std': 0.019},
(42, 43): {'mean': 0.033, 'std': 0.007},
(43, 44): {'mean': 0.020, 'std': 0.004},
(44, 45): {'mean': 0.018, 'std': 0.004},
( 4, 47): {'mean': 0.043, 'std': 0.009},
(47, 48): {'mean': 0.041, 'std': 0.008},
(48, 49): {'mean': 0.034, 'std': 0.007},
(49, 50): {'mean': 0.028, 'std': 0.006},
( 4, 51): {'mean': 0.101, 'std': 0.020},
(51, 52): {'mean': 0.041, 'std': 0.008},
(52, 53): {'mean': 0.026, 'std': 0.005},
(53, 54): {'mean': 0.024, 'std': 0.005},
( 4, 55): {'mean': 0.100, 'std': 0.020},
(55, 56): {'mean': 0.044, 'std': 0.009},
(56, 57): {'mean': 0.029, 'std': 0.006},
(57, 58): {'mean': 0.023, 'std': 0.005},
( 4, 59): {'mean': 0.096, 'std': 0.019},
(59, 60): {'mean': 0.040, 'std': 0.008},
(60, 61): {'mean': 0.028, 'std': 0.006},
(61, 62): {'mean': 0.023, 'std': 0.005},
( 4, 63): {'mean': 0.094, 'std': 0.019},
(63, 64): {'mean': 0.032, 'std': 0.006},
(64, 65): {'mean': 0.020, 'std': 0.004},
(65, 66): {'mean': 0.018, 'std': 0.004},
}
}
CONFIG['bodyhandface'] = {'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
[ 4, 3],
[ 5, 1],
[ 6, 5],
[ 7, 6],
[ 8, 1],
[ 9, 8],
[10, 9],
[11, 10],
[12, 8],
[13, 12],
[14, 13],
[15, 0],
[16, 0],
[17, 15],
[18, 16],
[19, 14],
[20, 19],
[21, 14],
[22, 11],
[23, 22],
[24, 11],
[26, 7], # handl
[27, 26],
[28, 27],
[29, 28],
[30, 7],
[31, 30],
[32, 31],
[33, 32],
[34, 7],
[35, 34],
[36, 35],
[37, 36],
[38, 7],
[39, 38],
[40, 39],
[41, 40],
[42, 7],
[43, 42],
[44, 43],
[45, 44],
[47, 4], # handr
[48, 47],
[49, 48],
[50, 49],
[51, 4],
[52, 51],
[53, 52],
[54, 53],
[55, 4],
[56, 55],
[57, 56],
[58, 57],
[59, 4],
[60, 59],
[61, 60],
[62, 61],
[63, 4],
[64, 63],
[65, 64],
[66, 65],
[ 67, 68],
[ 68, 69],
[ 69, 70],
[ 70, 71],
[ 72, 73],
[ 73, 74],
[ 74, 75],
[ 75, 76],
[ 77, 78],
[ 78, 79],
[ 79, 80],
[ 81, 82],
[ 82, 83],
[ 83, 84],
[ 84, 85],
[ 86, 87],
[ 87, 88],
[ 88, 89],
[ 89, 90],
[ 90, 91],
[ 91, 86],
[ 92, 93],
[ 93, 94],
[ 94, 95],
[ 95, 96],
[ 96, 97],
[ 97, 92],
[ 98, 99],
[ 99, 100],
[100, 101],
[101, 102],
[102, 103],
[103, 104],
[104, 105],
[105, 106],
[106, 107],
[107, 108],
[108, 109],
[109, 98],
[110, 111],
[111, 112],
[112, 113],
[113, 114],
[114, 115],
[115, 116],
[116, 117],
[117, 110]
],
'nJoints': 118,
'skeleton':{
( 0, 1): {'mean': 0.251, 'std': 0.050},
( 1, 2): {'mean': 0.169, 'std': 0.034},
( 2, 3): {'mean': 0.292, 'std': 0.058},
( 3, 4): {'mean': 0.275, 'std': 0.055},
( 1, 5): {'mean': 0.169, 'std': 0.034},
( 5, 6): {'mean': 0.295, 'std': 0.059},
( 6, 7): {'mean': 0.278, 'std': 0.056},
( 1, 8): {'mean': 0.566, 'std': 0.113},
( 8, 9): {'mean': 0.110, 'std': 0.022},
( 9, 10): {'mean': 0.398, 'std': 0.080},
(10, 11): {'mean': 0.402, 'std': 0.080},
( 8, 12): {'mean': 0.111, 'std': 0.022},
(12, 13): {'mean': 0.395, 'std': 0.079},
(13, 14): {'mean': 0.403, 'std': 0.081},
( 0, 15): {'mean': 0.053, 'std': 0.011},
( 0, 16): {'mean': 0.056, 'std': 0.011},
(15, 17): {'mean': 0.107, 'std': 0.021},
(16, 18): {'mean': 0.107, 'std': 0.021},
(14, 19): {'mean': 0.180, 'std': 0.036},
(19, 20): {'mean': 0.055, 'std': 0.011},
(14, 21): {'mean': 0.065, 'std': 0.013},
(11, 22): {'mean': 0.169, 'std': 0.034},
(22, 23): {'mean': 0.052, 'std': 0.010},
(11, 24): {'mean': 0.061, 'std': 0.012},
( 7, 26): {'mean': 0.045, 'std': 0.009},
(26, 27): {'mean': 0.042, 'std': 0.008},
(27, 28): {'mean': 0.035, 'std': 0.007},
(28, 29): {'mean': 0.029, 'std': 0.006},
( 7, 30): {'mean': 0.102, 'std': 0.020},
(30, 31): {'mean': 0.040, 'std': 0.008},
(31, 32): {'mean': 0.026, 'std': 0.005},
(32, 33): {'mean': 0.023, 'std': 0.005},
( 7, 34): {'mean': 0.101, 'std': 0.020},
(34, 35): {'mean': 0.043, 'std': 0.009},
(35, 36): {'mean': 0.029, 'std': 0.006},
(36, 37): {'mean': 0.024, 'std': 0.005},
( 7, 38): {'mean': 0.097, 'std': 0.019},
(38, 39): {'mean': 0.041, 'std': 0.008},
(39, 40): {'mean': 0.027, 'std': 0.005},
(40, 41): {'mean': 0.024, 'std': 0.005},
( 7, 42): {'mean': 0.095, 'std': 0.019},
(42, 43): {'mean': 0.033, 'std': 0.007},
(43, 44): {'mean': 0.020, 'std': 0.004},
(44, 45): {'mean': 0.018, 'std': 0.004},
( 4, 47): {'mean': 0.043, 'std': 0.009},
(47, 48): {'mean': 0.041, 'std': 0.008},
(48, 49): {'mean': 0.034, 'std': 0.007},
(49, 50): {'mean': 0.028, 'std': 0.006},
( 4, 51): {'mean': 0.101, 'std': 0.020},
(51, 52): {'mean': 0.041, 'std': 0.008},
(52, 53): {'mean': 0.026, 'std': 0.005},
(53, 54): {'mean': 0.024, 'std': 0.005},
( 4, 55): {'mean': 0.100, 'std': 0.020},
(55, 56): {'mean': 0.044, 'std': 0.009},
(56, 57): {'mean': 0.029, 'std': 0.006},
(57, 58): {'mean': 0.023, 'std': 0.005},
( 4, 59): {'mean': 0.096, 'std': 0.019},
(59, 60): {'mean': 0.040, 'std': 0.008},
(60, 61): {'mean': 0.028, 'std': 0.006},
(61, 62): {'mean': 0.023, 'std': 0.005},
( 4, 63): {'mean': 0.094, 'std': 0.019},
(63, 64): {'mean': 0.032, 'std': 0.006},
(64, 65): {'mean': 0.020, 'std': 0.004},
(65, 66): {'mean': 0.018, 'std': 0.004},
(67, 68): {'mean': 0.012, 'std': 0.002},
(68, 69): {'mean': 0.013, 'std': 0.003},
(69, 70): {'mean': 0.014, 'std': 0.003},
(70, 71): {'mean': 0.012, 'std': 0.002},
(72, 73): {'mean': 0.014, 'std': 0.003},
(73, 74): {'mean': 0.014, 'std': 0.003},
(74, 75): {'mean': 0.015, 'std': 0.003},
(75, 76): {'mean': 0.013, 'std': 0.003},
(77, 78): {'mean': 0.014, 'std': 0.003},
(78, 79): {'mean': 0.014, 'std': 0.003},
(79, 80): {'mean': 0.015, 'std': 0.003},
(81, 82): {'mean': 0.009, 'std': 0.002},
(82, 83): {'mean': 0.010, 'std': 0.002},
(83, 84): {'mean': 0.010, 'std': 0.002},
(84, 85): {'mean': 0.010, 'std': 0.002},
(86, 87): {'mean': 0.009, 'std': 0.002},
(87, 88): {'mean': 0.009, 'std': 0.002},
(88, 89): {'mean': 0.008, 'std': 0.002},
(89, 90): {'mean': 0.008, 'std': 0.002},
(90, 91): {'mean': 0.009, 'std': 0.002},
(86, 91): {'mean': 0.008, 'std': 0.002},
(92, 93): {'mean': 0.009, 'std': 0.002},
(93, 94): {'mean': 0.009, 'std': 0.002},
(94, 95): {'mean': 0.009, 'std': 0.002},
(95, 96): {'mean': 0.009, 'std': 0.002},
(96, 97): {'mean': 0.009, 'std': 0.002},
(92, 97): {'mean': 0.009, 'std': 0.002},
(98, 99): {'mean': 0.016, 'std': 0.003},
(99, 100): {'mean': 0.013, 'std': 0.003},
(100, 101): {'mean': 0.008, 'std': 0.002},
(101, 102): {'mean': 0.008, 'std': 0.002},
(102, 103): {'mean': 0.012, 'std': 0.002},
(103, 104): {'mean': 0.014, 'std': 0.003},
(104, 105): {'mean': 0.015, 'std': 0.003},
(105, 106): {'mean': 0.012, 'std': 0.002},
(106, 107): {'mean': 0.009, 'std': 0.002},
(107, 108): {'mean': 0.009, 'std': 0.002},
(108, 109): {'mean': 0.013, 'std': 0.003},
(98, 109): {'mean': 0.016, 'std': 0.003},
(110, 111): {'mean': 0.021, 'std': 0.004},
(111, 112): {'mean': 0.009, 'std': 0.002},
(112, 113): {'mean': 0.008, 'std': 0.002},
(113, 114): {'mean': 0.019, 'std': 0.004},
(114, 115): {'mean': 0.018, 'std': 0.004},
(115, 116): {'mean': 0.008, 'std': 0.002},
(116, 117): {'mean': 0.009, 'std': 0.002},
(110, 117): {'mean': 0.020, 'std': 0.004},
}
}
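The `'skeleton'` tables above store a mean and a standard deviation per limb. How they are consumed is not shown in this part of the diff. One plausible use, sketched here purely as an assumption, is scoring a measured limb length against the prior. The helper name `limb_zscore` and the choice of metres as the unit are hypothetical.

```python
import math

# One entry copied from the table above; the unit is assumed to be metres.
SKELETON_PRIOR = {(0, 1): {'mean': 0.251, 'std': 0.050}}

def limb_zscore(joints, limb, prior=SKELETON_PRIOR):
    """Return (length - mean) / std for one limb.

    joints: dict mapping joint index -> (x, y, z)
    limb:   (parent, child) key into the prior table
    """
    a, b = limb
    length = math.dist(joints[a], joints[b])
    stats = prior[limb]
    return (length - stats['mean']) / stats['std']

# A limb that is one standard deviation longer than the mean.
joints = {0: (0.0, 0.0, 0.0), 1: (0.0, 0.301, 0.0)}
z = limb_zscore(joints, (0, 1))
```

A large absolute z-score would flag a limb whose length is implausible under the prior.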
face_kintree_without_contour = [[ 0, 1],
[ 1, 2],
[ 2, 3],
[ 3, 4],
[ 5, 6],
[ 6, 7],
[ 7, 8],
[ 8, 9],
[10, 11],
[11, 12],
[12, 13],
[14, 15],
[15, 16],
[16, 17],
[17, 18],
[19, 20],
[20, 21],
[21, 22],
[22, 23],
[23, 24],
[24, 19],
[25, 26],
[26, 27],
[27, 28],
[28, 29],
[29, 30],
[30, 25],
[31, 32],
[32, 33],
[33, 34],
[34, 35],
[35, 36],
[36, 37],
[37, 38],
[38, 39],
[39, 40],
[40, 41],
[41, 42],
[42, 31],
[43, 44],
[44, 45],
[45, 46],
[46, 47],
[47, 48],
[48, 49],
[49, 50],
[50, 43]]
CONFIG['face'] = {'kintree':[ [0,1],[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,9],[9,10],[10,11],[11,12],[12,13],[13,14],[14,15],[15,16], #outline (ignored)
[17,18],[18,19],[19,20],[20,21], #right eyebrow
[22,23],[23,24],[24,25],[25,26], #left eyebrow
[27,28],[28,29],[29,30], #nose upper part
[31,32],[32,33],[33,34],[34,35], #nose lower part
[36,37],[37,38],[38,39],[39,40],[40,41],[41,36], #right eye
[42,43],[43,44],[44,45],[45,46],[46,47],[47,42], #left eye
[48,49],[49,50],[50,51],[51,52],[52,53],[53,54],[54,55],[55,56],[56,57],[57,58],[58,59],[59,48], #Lip outline
[60,61],[61,62],[62,63],[63,64],[64,65],[65,66],[66,67],[67,60] #Lip inner line
], 'colors': ['g' for _ in range(100)]}
CONFIG['h36m'] = {
'kintree': [[0, 1], [1, 2], [2, 3], [0, 4], [4, 5], [5, 6], [0, 7], [7, 8], [8, 9], [9, 10], [8, 11], [11, 12], [
12, 13], [8, 14], [14, 15], [15, 16]],
'color': ['r', 'r', 'r', 'g', 'g', 'g', 'k', 'k', 'k', 'k', 'g', 'g', 'g', 'r', 'r', 'r'],
'joint_names': [
'hip', # 0
'LHip', # 1
'LKnee', # 2
'LAnkle', # 3
'RHip', # 4
'RKnee', # 5
'RAnkle', # 6
'Spine (H36M)', # 7
'Neck', # 8
'Head (H36M)', # 9
'headtop', # 10
'LShoulder', # 11
'LElbow', # 12
'LWrist', # 13
'RShoulder', # 14
'RElbow', # 15
'RWrist', # 16
],
'nJoints': 17}
NJOINTS_BODY = 25
NJOINTS_HAND = 21
NJOINTS_FACE = 70
NLIMBS_BODY = len(CONFIG['body25']['kintree'])
NLIMBS_HAND = len(CONFIG['hand']['kintree'])
NLIMBS_FACE = len(CONFIG['face']['kintree'])
def getKintree(name='total'):
    if name == 'total':
        # order: body25, hand, hand, face (matches the concatenation below)
        kintree = CONFIG['body25']['kintree'] + CONFIG['hand']['kintree'] + CONFIG['hand']['kintree'] + CONFIG['face']['kintree']
        kintree = np.array(kintree)
        kintree[NLIMBS_BODY:NLIMBS_BODY + NLIMBS_HAND] += NJOINTS_BODY
        kintree[NLIMBS_BODY + NLIMBS_HAND:NLIMBS_BODY + 2*NLIMBS_HAND] += NJOINTS_BODY + NJOINTS_HAND
        kintree[NLIMBS_BODY + 2*NLIMBS_HAND:] += NJOINTS_BODY + 2*NJOINTS_HAND
    elif name == 'smplh':
        # order: body25, lhand, rhand
        kintree = CONFIG['body25']['kintree'] + CONFIG['hand']['kintree'] + CONFIG['hand']['kintree']
        kintree = np.array(kintree)
        kintree[NLIMBS_BODY:NLIMBS_BODY + NLIMBS_HAND] += NJOINTS_BODY
        kintree[NLIMBS_BODY + NLIMBS_HAND:NLIMBS_BODY + 2*NLIMBS_HAND] += NJOINTS_BODY + NJOINTS_HAND
    else:
        # previously an unknown name fell through to an UnboundLocalError
        raise ValueError('unknown kintree name: {}'.format(name))
    return kintree
CONFIG['total'] = {}
CONFIG['total']['kintree'] = getKintree('total')
CONFIG['total']['nJoints'] = 137
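The index arithmetic in `getKintree` shifts each part's joint indices past the joints of the parts stacked before it. A minimal sketch of the same idea, using toy skeletons rather than the real body25/hand/face tables (the helper name `stack_kintrees` is hypothetical):

```python
def stack_kintrees(parts):
    """Concatenate per-part limb lists, shifting joint indices by the
    number of joints in all preceding parts.

    parts: list of (n_joints, kintree) tuples, kintree = [[a, b], ...]
    """
    stacked, offset = [], 0
    for n_joints, kintree in parts:
        stacked.extend([a + offset, b + offset] for a, b in kintree)
        offset += n_joints
    return stacked

# Toy parts with made-up sizes: a 3-joint body and a 2-joint hand used twice.
body = (3, [[1, 0], [2, 1]])
hand = (2, [[0, 1]])
total = stack_kintrees([body, hand, hand])
print(total)  # [[1, 0], [2, 1], [3, 4], [5, 6]]
```

With the real sizes the same bookkeeping gives 25 + 21 + 21 + 70 = 137 joints, which agrees with `CONFIG['total']['nJoints']`.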
COCO17_IN_BODY25 = [0,16,15,18,17,5,2,6,3,7,4,12,9,13,10,14,11]
def coco17tobody25(points2d):
dim = 3
if len(points2d.shape) == 2:
points2d = points2d[None, :, :]
dim = 2
kpts = np.zeros((points2d.shape[0], 25, 3))
kpts[:, COCO17_IN_BODY25, :2] = points2d[:, :, :2]
kpts[:, COCO17_IN_BODY25, 2:3] = points2d[:, :, 2:3]
kpts[:, 8, :2] = kpts[:, [9, 12], :2].mean(axis=1)
kpts[:, 8, 2] = kpts[:, [9, 12], 2].min(axis=1)
kpts[:, 1, :2] = kpts[:, [2, 5], :2].mean(axis=1)
kpts[:, 1, 2] = kpts[:, [2, 5], 2].min(axis=1)
if dim == 2:
kpts = kpts[0]
return kpts
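`coco17tobody25` scatters the 17 COCO joints into the 25-joint layout and then synthesizes the two joints COCO lacks: the mid-hip (index 8) and the neck (index 1) are midpoints of the hips and shoulders, with the lower of the two confidences. A dependency-free sketch of that logic for a single person (the function name `coco17_to_body25` is hypothetical):

```python
COCO17_IN_BODY25 = [0, 16, 15, 18, 17, 5, 2, 6, 3, 7, 4, 12, 9, 13, 10, 14, 11]

def coco17_to_body25(points):
    """points: 17 (x, y, conf) tuples in COCO order -> 25 tuples in BODY25 order."""
    kpts = [(0.0, 0.0, 0.0)] * 25          # joints COCO does not provide stay zero
    for coco_idx, body_idx in enumerate(COCO17_IN_BODY25):
        kpts[body_idx] = tuple(points[coco_idx])

    def midpoint(i, j):
        (xi, yi, ci), (xj, yj, cj) = kpts[i], kpts[j]
        return ((xi + xj) / 2, (yi + yj) / 2, min(ci, cj))

    kpts[8] = midpoint(9, 12)   # mid-hip from the two hips
    kpts[1] = midpoint(2, 5)    # neck from the two shoulders
    return kpts

# Synthetic input: joint i sits at (i, 2i) with confidence 1.
points = [(float(i), float(2 * i), 1.0) for i in range(17)]
body25 = coco17_to_body25(points)
print(body25[1])   # (5.5, 11.0, 1.0)
print(body25[8])   # (11.5, 23.0, 1.0)
```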
for skeltype, config in CONFIG.items():
if 'joint_names' in config.keys():
torsoid = [config['joint_names'].index(name) if name in config['joint_names'] else None for name in ['LShoulder', 'RShoulder', 'LHip', 'RHip']]
torsoid = [i for i in torsoid if i is not None]
config['torso'] = torsoid

easymocap/dataset/mirror.py Normal file

@ -0,0 +1,138 @@
'''
@ Date: 2021-01-21 19:34:48
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-06 18:57:47
@ FilePath: /EasyMocap/code/dataset/mirror.py
'''
import numpy as np
from os.path import join
import os
import cv2
FLIP_BODY25 = [0,1,5,6,7,2,3,4,8,12,13,14,9,10,11,16,15,18,17,22,23,24,19,20,21]
FLIP_BODYHAND = [
0,1,5,6,7,2,3,4,8,12,13,14,9,10,11,16,15,18,17,22,23,24,19,20,21, # body 25
22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, # right hand
]
# np.int was removed in NumPy 1.24; the builtin int is equivalent here
FLIP_SMPL_VERTICES = np.loadtxt(join(os.path.dirname(__file__), 'smpl_vert_sym.txt'), dtype=int)
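def flipPoint2D(point):
    # swap left/right entries along the second-to-last (joint or vertex) axis
    if point.shape[-2] == 25:
        return point[..., FLIP_BODY25, :]
    elif point.shape[-2] == 15:
        return point[..., FLIP_BODY25[:15], :]
    elif point.shape[-2] == 6890:
        return point[..., FLIP_SMPL_VERTICES, :]
    else:
        # other layouts (e.g. the 67-joint body+hand one) are not handled yet;
        # fail loudly instead of dropping into a debugger or returning None
        raise NotImplementedError('flipPoint2D: unsupported shape {}'.format(point.shape))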
# Permutation of SMPL pose parameters when flipping the shape
_PERMUTATION = {
'smpl': [0, 2, 1, 3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13, 15, 17, 16, 19, 18, 21, 20, 23, 22],
'smplh': [0, 2, 1, 3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13, 15, 17, 16, 19, 18, 21, 20, 24, 25, 23, 24],
'smplx': [0, 2, 1, 3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13, 15, 17, 16, 19, 18, 21, 20, 24, 25, 23, 24, 26, 28, 27],
'smplhfull': [
0, 2, 1, 3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13, 15, 17, 16, 19, 18, 21, 20, # body
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36
],
'smplxfull': [
0, 2, 1, 3, 5, 4, 6, 8, 7, 9, 11, 10, 12, 14, 13, 15, 17, 16, 19, 18, 21, 20, # body
22, 24, 23, # jaw, left eye, right eye
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, # right hand
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, # left hand
]
}
PERMUTATION = {}
for key in _PERMUTATION.keys():
res = []
for i in _PERMUTATION[key]:
res.extend([3*i + j for j in range(3)])
PERMUTATION[max(res)+1] = res
def flipSMPLPoses(pose):
    """Flip SMPL-family pose parameters left/right.
    input (N, D) -> output (N, D), e.g. D = 72 for SMPL; the input array is not modified.
    The flipping is based on SMPL parameters.
    """
    # swap left/right joints (fancy indexing returns a copy)
    pose = pose[:, PERMUTATION[pose.shape[-1]]]
    # negate the second and the third dimension of each axis-angle vector
    if pose.shape[1] in [72, 156, 165]:
        pose[:, 1::3] = -pose[:, 1::3]
        pose[:, 2::3] = -pose[:, 2::3]
    elif pose.shape[1] in [78, 87]:
        pose[:, 1:66:3] = -pose[:, 1:66:3]
        pose[:, 2:66:3] = -pose[:, 2:66:3]
    else:
        raise NotImplementedError('flipSMPLPoses: unsupported pose size {}'.format(pose.shape[1]))
    return pose
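The two steps in `flipSMPLPoses` are: permute the joints so that left and right swap, then negate the y and z components of every axis-angle vector. A small sketch on a toy 3-joint skeleton (the helper names and the toy permutation are assumptions, not part of EasyMocap), which also shows that flipping twice restores the input:

```python
def expand_permutation(joint_perm):
    """Turn a per-joint permutation into per-parameter indices (3 values per joint)."""
    return [3 * j + k for j in joint_perm for k in range(3)]

def flip_axis_angle_pose(pose, joint_perm):
    """pose: flat list of 3*J axis-angle values -> mirrored copy."""
    flipped = [pose[i] for i in expand_permutation(joint_perm)]
    for j in range(len(joint_perm)):
        flipped[3 * j + 1] = -flipped[3 * j + 1]
        flipped[3 * j + 2] = -flipped[3 * j + 2]
    return flipped

# Toy skeleton: joint 0 is the root, joints 1 and 2 form a left/right pair.
perm = [0, 2, 1]
pose = [0.1, 0.2, 0.3, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
flipped = flip_axis_angle_pose(pose, perm)
print(flipped)  # [0.1, -0.2, -0.3, 4.0, -5.0, -6.0, 1.0, -2.0, -3.0]
```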
def mirrorPoint3D(point, M):
point_homo = np.hstack([point, np.ones([point.shape[0], 1])])
point_m = (M @ point_homo.T).T[..., :3]
return flipPoint2D(point_m)
def calc_mirror_transform(m):
coeff_mat = np.eye(4)[None, :, :]
coeff_mat = coeff_mat.repeat(m.shape[0], 0)
norm = np.linalg.norm(m[:, :3], keepdims=True, axis=1)
m[:, :3] /= norm
coeff_mat[:, 0, 0] = 1 - 2*m[:, 0]**2
coeff_mat[:, 0, 1] = -2*m[:, 0]*m[:, 1]
coeff_mat[:, 0, 2] = -2*m[:, 0]*m[:, 2]
coeff_mat[:, 0, 3] = -2*m[:, 0]*m[:, 3]
coeff_mat[:, 1, 0] = -2*m[:, 1]*m[:, 0]
coeff_mat[:, 1, 1] = 1-2*m[:, 1]**2
coeff_mat[:, 1, 2] = -2*m[:, 1]*m[:, 2]
coeff_mat[:, 1, 3] = -2*m[:, 1]*m[:, 3]
coeff_mat[:, 2, 0] = -2*m[:, 2]*m[:, 0]
coeff_mat[:, 2, 1] = -2*m[:, 2]*m[:, 1]
coeff_mat[:, 2, 2] = 1-2*m[:, 2]**2
coeff_mat[:, 2, 3] = -2*m[:, 2]*m[:, 3]
return coeff_mat
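The entries filled in by `calc_mirror_transform` are those of a reflection across the plane n·x + d = 0 with unit normal n: the upper-left block is I - 2nnᵀ and the last column is -2dn. A dependency-free sketch for a single plane (the names `mirror_matrix` and `reflect` are hypothetical, and the normal is assumed to be unit length already):

```python
def mirror_matrix(normal, d):
    """4x4 reflection across the plane normal . x + d = 0 (unit-length normal)."""
    M = [[(1.0 if i == j else 0.0) - 2.0 * normal[i] * normal[j] for j in range(3)]
         + [-2.0 * d * normal[i]]
         for i in range(3)]
    M.append([0.0, 0.0, 0.0, 1.0])
    return M

def reflect(M, p):
    """Apply the top three rows of M to the 3D point p."""
    return tuple(sum(M[i][k] * p[k] for k in range(3)) + M[i][3] for i in range(3))

# Mirror lying in the plane y = 1, i.e. normal (0, 1, 0) and d = -1.
M = mirror_matrix((0.0, 1.0, 0.0), -1.0)
print(reflect(M, (3.0, 4.0, 5.0)))  # (3.0, -2.0, 5.0)
```

Reflecting the result again returns the original point, which is a quick sanity check for any plane.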
def get_rotation_from_two_directions(direc0, direc1):
direc0 = direc0/np.linalg.norm(direc0)
direc1 = direc1/np.linalg.norm(direc1)
rotdir = np.cross(direc0, direc1)
if np.linalg.norm(rotdir) < 1e-2:
return np.eye(3)
rotdir = rotdir/np.linalg.norm(rotdir)
rotdir = rotdir * np.arccos(np.dot(direc0, direc1))
rotmat, _ = cv2.Rodrigues(rotdir)
return rotmat
def mirror_Rh(Rh, normals):
rvecs = np.zeros_like(Rh)
for nf in range(Rh.shape[0]):
normal = normals[nf]
rotmat = cv2.Rodrigues(Rh[nf])[0]
rotmat_m = np.zeros((3, 3))
for i in range(3):
rot = rotmat[:, i] - 2*(rotmat[:, i] * normal).sum()*normal
rotmat_m[:, i] = rot
rotmat_m[:, 0] *= -1
rvecs[nf] = cv2.Rodrigues(rotmat_m)[0].T
return rvecs
def flipSMPLParams(params, mirror):
"""Mirror SMPL parameters across a plane.
params: dict with 'poses', 'shapes', 'Rh' and 'Th'.
mirror: (1, 4) or (N, 4) plane coefficients; mirror[:, :3] is normalized in place.
"""
mirror[:, :3] /= np.linalg.norm(mirror[:, :3], keepdims=True, axis=1)
if mirror.shape[0] == 1 and mirror.shape[0] != params['Rh'].shape[0]:
mirror = mirror.repeat(params['Rh'].shape[0], 0)
M = calc_mirror_transform(mirror)
T = params['Th']
rvecm = mirror_Rh(params['Rh'], mirror[:, :3])
Tnew = np.einsum('bmn,bn->bm', M[:, :3, :3], params['Th']) + M[:, :3, 3]
params = {
'poses': flipSMPLPoses(params['poses']),
'shapes': params['shapes'],
'Rh': rvecm,
'Th': Tnew
}
return params


@ -0,0 +1,80 @@
'''
@ Date: 2021-01-12 17:12:50
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 10:59:22
@ FilePath: /EasyMocap/easymocap/dataset/mv1pmf.py
'''
from ..mytools.file_utils import get_bbox_from_pose
from os.path import join
import numpy as np
from .base import MVBase
class MV1PMF(MVBase):
def __init__(self, root, cams=[], pid=0, out=None, config={},
image_root='images', annot_root='annots', kpts_type='body15',
undis=True, no_img=False, verbose=False) -> None:
super().__init__(root=root, cams=cams, out=out, config=config,
image_root=image_root, annot_root=annot_root,
kpts_type=kpts_type, undis=undis, no_img=no_img)
self.pid = pid
self.verbose = verbose
def write_keypoints3d(self, keypoints3d, nf):
results = [{'id': self.pid, 'keypoints3d': keypoints3d}]
super().write_keypoints3d(results, nf)
def write_smpl(self, params, nf):
result = {'id': self.pid}
result.update(params)
super().write_smpl([result], nf)
def vis_smpl(self, vertices, faces, images, nf, sub_vis=[],
mode='smpl', extra_data=[], add_back=True):
outname = join(self.out, 'smpl', '{:06d}.jpg'.format(nf))
render_data = {}
assert vertices.shape[1] == 3 and len(vertices.shape) == 2, 'shape {} != (N, 3)'.format(vertices.shape)
pid = self.pid
render_data[pid] = {'vertices': vertices, 'faces': faces,
'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
cameras = {'K': [], 'R':[], 'T':[]}
if len(sub_vis) == 0:
sub_vis = self.cams
for key in cameras.keys():
cameras[key] = [self.cameras[cam][key] for cam in sub_vis]
images = [images[self.cams.index(cam)] for cam in sub_vis]
self.writer.vis_smpl(render_data, images, cameras, outname, add_back=add_back)
def vis_detections(self, images, annots, nf, to_img=True, sub_vis=[]):
lDetections = []
for nv in range(len(images)):
det = {
'id': self.pid,
'bbox': annots['bbox'][nv],
'keypoints2d': annots['keypoints'][nv]
}
lDetections.append([det])
return super().vis_detections(images, lDetections, nf, sub_vis=sub_vis)
def vis_repro(self, images, kpts_repro, nf, to_img=True, sub_vis=[]):
lDetections = []
for nv in range(len(images)):
det = {
'id': -1,
'keypoints2d': kpts_repro[nv],
'bbox': get_bbox_from_pose(kpts_repro[nv], images[nv])
}
lDetections.append([det])
return super().vis_detections(images, lDetections, nf, mode='repro', sub_vis=sub_vis)
def __getitem__(self, index: int):
images, annots_all = super().__getitem__(index)
annots = self.select_person(annots_all, index, self.pid)
return images, annots
if __name__ == "__main__":
root = '/home/qian/zjurv2/mnt/data/ftp/Human/vis/lightstage/CoreView_302_sync/'
dataset = MV1PMF(root)
images, annots = dataset[0]


@ -0,0 +1,189 @@
'''
@ Date: 2021-01-12 17:12:50
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-14 11:26:36
@ FilePath: /EasyMocapRelease/easymocap/dataset/mv1pmf_mirror.py
'''
import os
from os.path import join
import numpy as np
import cv2
from .base import ImageFolder, MVBase
from .mirror import calc_mirror_transform, flipSMPLParams, mirrorPoint3D, flipPoint2D, mirror_Rh
from ..mytools.file_utils import get_bbox_from_pose, read_json
class MV1PMF_Mirror(MVBase):
def __init__(self, root, cams=[], pid=0, out=None, config={},
image_root='images', annot_root='annots', kpts_type='body15',
undis=True, no_img=False,
verbose=False) -> None:
self.mirror = np.array([[0., 1., 0., 0.]])
super().__init__(root=root, cams=cams, out=out, config=config,
image_root=image_root, annot_root=annot_root,
kpts_type=kpts_type, undis=undis, no_img=no_img)
self.pid = pid
self.verbose = verbose
def __str__(self) -> str:
return 'Dataset for MultiMirror: {} views'.format(len(self.cams))
def write_keypoints3d(self, keypoints3d, nf):
results = []
M = self.Mirror[0]
pid = self.pid
val = {'id': pid, 'keypoints3d': keypoints3d}
results.append(val)
kpts = keypoints3d
kpts3dm = (M[:3, :3] @ kpts[:, :3].T + M[:3, 3:]).T
kpts3dm = np.hstack([kpts3dm, kpts[:, 3:]])
kpts3dm = flipPoint2D(kpts3dm)
val1 = {'id': pid + 1, 'keypoints3d': kpts3dm}
results.append(val1)
super().write_keypoints3d(results, nf)
def write_smpl(self, params, nf):
outname = join(self.out, 'smpl', '{:06d}.json'.format(nf))
results = []
M = self.Mirror[0]
pid = self.pid
val = {'id': pid}
val.update(params)
results.append(val)
# add the parameters of the person in the mirror
val = {'id': pid + 1}
val.update(flipSMPLParams(params, self.mirror))
results.append(val)
self.writer.write_smpl(results, outname)
def vis_smpl(self, vertices, faces, images, nf, sub_vis=[],
mode='smpl', extra_data=[], add_back=True):
outname = join(self.out, 'smpl', '{:06d}.jpg'.format(nf))
render_data = {}
if len(vertices.shape) == 3:
vertices = vertices[0]
pid = self.pid
render_data[pid] = {'vertices': vertices, 'faces': faces,
'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
vertices_m = mirrorPoint3D(vertices, self.Mirror[0])
render_data[pid+1] = {'vertices': vertices_m, 'faces': faces,
'vid': pid, 'name': 'human_mirror_{}_{}'.format(nf, pid)}
cameras = {'K': [], 'R':[], 'T':[]}
if len(sub_vis) == 0:
sub_vis = self.cams
for key in cameras.keys():
cameras[key] = [self.cameras[cam][key] for cam in sub_vis]
images = [images[self.cams.index(cam)] for cam in sub_vis]
self.writer.vis_smpl(render_data, images, cameras, outname, add_back=add_back)
def vis_detections(self, images, annots, nf, to_img=True, sub_vis=[]):
outname = join(self.out, 'detec', '{:06d}.jpg'.format(nf))
lDetections = []
nViews = len(images)
for nv in range(len(images)):
det = {
'id': self.pid,
'bbox': annots['bbox'][nv],
'keypoints2d': annots['keypoints'][nv]
}
det_m = {
'id': self.pid + 1,
'bbox': annots['bbox'][nv+nViews],
'keypoints2d': annots['keypoints'][nv+nViews]
}
lDetections.append([det, det_m])
if len(sub_vis) != 0:
valid_idx = [self.cams.index(i) for i in sub_vis]
images = [images[i] for i in valid_idx]
lDetections = [lDetections[i] for i in valid_idx]
return self.writer.vis_keypoints2d_mv(images, lDetections, outname=outname, vis_id=False)
def vis_repro(self, images, kpts_repro, nf, to_img=True, sub_vis=[]):
outname = join(self.out, 'repro', '{:06d}.jpg'.format(nf))
lDetections = []
for nv in range(len(images)):
det = {
'id': -1,
'keypoints2d': kpts_repro[nv],
'bbox': get_bbox_from_pose(kpts_repro[nv], images[nv])
}
det_mirror = {
'id': -1,
'keypoints2d': kpts_repro[nv+len(images)],
'bbox': get_bbox_from_pose(kpts_repro[nv+len(images)], images[nv])
}
lDetections.append([det, det_mirror])
if len(sub_vis) != 0:
valid_idx = [self.cams.index(i) for i in sub_vis]
images = [images[i] for i in valid_idx]
lDetections = [lDetections[i] for i in valid_idx]
return self.writer.vis_keypoints2d_mv(images, lDetections, outname=outname, vis_id=False)
@property
def Mirror(self):
M = calc_mirror_transform(self.mirror)
return M
@property
def Pall(self):
return self.Pall_
@Pall.setter
def Pall(self, value):
M = self.Mirror
if M.shape[0] == 1 and M.shape[0] != value.shape[0]:
M = M.repeat(value.shape[0], 0)
Pall_mirror = np.einsum('bmn,bno->bmo', value, M)
Pall = np.vstack((value, Pall_mirror))
self.Pall_ = Pall
def __getitem__(self, index: int):
images, annots_all = super().__getitem__(index)
annots0 = self.select_person(annots_all, index, self.pid)
annots1 = self.select_person(annots_all, index, self.pid + 1)
# flip points
# stack it as only one person
annots = {
'bbox': np.vstack([annots0['bbox'], annots1['bbox']]),
'keypoints': np.vstack([annots0['keypoints'], flipPoint2D(annots1['keypoints'])]),
}
return images, annots
class ImageFolderMirror(ImageFolder):
def normal(self, nf):
annname = join(self.annot_root, self.annotlist[nf])
data = read_json(annname)
if 'vanish_point' in data.keys():
vp1 = np.array(data['vanish_point'][1])
vp1[2] = 1
K = self.camera(nf)['K']
normal = np.linalg.inv(K) @ vp1.reshape(3, 1)
normal = normal.T / np.linalg.norm(normal)
else:
normal = None
# normal: (1, 3)
return normal
def normal_all(self, start, end):
normals = []
for nf in range(start, end):
annname = join(self.annot_root, self.annotlist[nf])
data = read_json(annname)
if 'vanish_point' in data.keys():
vp1 = np.array(data['vanish_point'][1])
vp1[2] = 1
K = self.camera(nf)['K']
normal = np.linalg.inv(K) @ vp1.reshape(3, 1)
normal = normal.T / np.linalg.norm(normal)
normals.append(normal)
# nFrames, 1, 3
if len(normals) > 0:
normals = np.stack(normals)
else:
normals = None
return normals
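`normal` and `normal_all` turn an annotated vanishing point into a 3D direction by back-projecting it through the intrinsics, K⁻¹·(u, v, 1), and normalizing. A stdlib-only sketch of that step, assuming a zero-skew pinhole camera so that K⁻¹ has a closed form (the function name and the intrinsics values are made up for illustration):

```python
import math

def normal_from_vanishing_point(vp, fx, fy, cx, cy):
    """Back-project pixel (u, v) through a zero-skew pinhole camera and normalize."""
    u, v = vp
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    norm = math.sqrt(sum(c * c for c in ray))
    return tuple(c / norm for c in ray)

# A vanishing point at the principal point gives the optical axis.
print(normal_from_vanishing_point((500.0, 500.0), 1000.0, 1000.0, 500.0, 500.0))  # (0.0, 0.0, 1.0)

# One focal length to the right of the principal point: 45 degrees off-axis.
n = normal_from_vanishing_point((1500.0, 500.0), 1000.0, 1000.0, 500.0, 500.0)
```

Presumably the annotated point is the vanishing point of the lines joining each real keypoint to its mirrored counterpart, which are perpendicular to the mirror; the diff itself does not say so.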
if __name__ == "__main__":
pass

File diff suppressed because it is too large


@ -0,0 +1,8 @@
'''
@ Date: 2020-11-17 15:04:25
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-05 13:58:07
@ FilePath: /EasyMocap/code/estimator/SPIN/__init__.py
'''
from .spin_api import SPIN, init_with_spin


@ -0,0 +1,180 @@
import torch
import torch.nn as nn
import torchvision.models.resnet as resnet
import numpy as np
import math
from torch.nn import functional as F
def rot6d_to_rotmat(x):
"""Convert 6D rotation representation to 3x3 rotation matrix.
Based on Zhou et al., "On the Continuity of Rotation Representations in Neural Networks", CVPR 2019
Input:
(B,6) Batch of 6-D rotation representations
Output:
(B,3,3) Batch of corresponding rotation matrices
"""
x = x.view(-1,3,2)
a1 = x[:, :, 0]
a2 = x[:, :, 1]
b1 = F.normalize(a1)
b2 = F.normalize(a2 - torch.einsum('bi,bi->b', b1, a2).unsqueeze(-1) * b1)
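# pass dim explicitly: without it torch.cross picks the first axis of size 3, which is the batch axis when B == 3
b3 = torch.cross(b1, b2, dim=1)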
return torch.stack((b1, b2, b3), dim=-1)
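`rot6d_to_rotmat` is a Gram-Schmidt step: the first 3-vector is normalized, the second is made orthogonal to it and normalized, and the third column is their cross product. Because of `x.view(-1, 3, 2)`, the two vectors are interleaved in the 6-D input (elements 0, 2, 4 and 1, 3, 5). A pure-Python sketch for one sample, written without torch so it can be run standalone (this is an illustration, not the batched implementation above):

```python
import math

def rot6d_to_rotmat(x6):
    """x6: 6 floats read as a row-major 3x2 matrix -> 3x3 rotation (list of rows)."""
    a1 = [x6[0], x6[2], x6[4]]
    a2 = [x6[1], x6[3], x6[5]]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def normalize(v):
        n = math.sqrt(dot(v, v))
        return [vi / n for vi in v]

    b1 = normalize(a1)
    proj = dot(b1, a2)
    b2 = normalize([a2[i] - proj * b1[i] for i in range(3)])
    b3 = [b1[1] * b2[2] - b1[2] * b2[1],
          b1[2] * b2[0] - b1[0] * b2[2],
          b1[0] * b2[1] - b1[1] * b2[0]]
    # columns of the result are b1, b2, b3
    return [[b1[i], b2[i], b3[i]] for i in range(3)]

# a1 = (2, 0, 0) and a2 = (1, 1, 0) orthonormalize to the identity.
R = rot6d_to_rotmat([2.0, 1.0, 0.0, 1.0, 0.0, 0.0])
```

The appeal of this representation is that any input with two non-parallel vectors maps to a valid rotation, so the network output needs no extra constraint.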
class Bottleneck(nn.Module):
""" Redefinition of Bottleneck residual block
Adapted from the official PyTorch implementation
"""
expansion = 4
def __init__(self, inplanes, planes, stride=1, downsample=None):
super(Bottleneck, self).__init__()
self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
self.bn1 = nn.BatchNorm2d(planes)
self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
padding=1, bias=False)
self.bn2 = nn.BatchNorm2d(planes)
self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
self.bn3 = nn.BatchNorm2d(planes * 4)
self.relu = nn.ReLU(inplace=True)
self.downsample = downsample
self.stride = stride
def forward(self, x):
residual = x
out = self.conv1(x)
out = self.bn1(out)
out = self.relu(out)
out = self.conv2(out)
out = self.bn2(out)
out = self.relu(out)
out = self.conv3(out)
out = self.bn3(out)
if self.downsample is not None:
residual = self.downsample(x)
out += residual
out = self.relu(out)
return out
class HMR(nn.Module):
""" SMPL Iterative Regressor with ResNet50 backbone
"""
def __init__(self, block, layers, smpl_mean_params):
self.inplanes = 64
super(HMR, self).__init__()
npose = 24 * 6
self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
bias=False)
self.bn1 = nn.BatchNorm2d(64)
self.relu = nn.ReLU(inplace=True)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
self.layer1 = self._make_layer(block, 64, layers[0])
self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
self.avgpool = nn.AvgPool2d(7, stride=1)
self.fc1 = nn.Linear(512 * block.expansion + npose + 13, 1024)
self.drop1 = nn.Dropout()
self.fc2 = nn.Linear(1024, 1024)
self.drop2 = nn.Dropout()
self.decpose = nn.Linear(1024, npose)
self.decshape = nn.Linear(1024, 10)
self.deccam = nn.Linear(1024, 3)
nn.init.xavier_uniform_(self.decpose.weight, gain=0.01)
nn.init.xavier_uniform_(self.decshape.weight, gain=0.01)
nn.init.xavier_uniform_(self.deccam.weight, gain=0.01)
for m in self.modules():
if isinstance(m, nn.Conv2d):
n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
m.weight.data.normal_(0, math.sqrt(2. / n))
elif isinstance(m, nn.BatchNorm2d):
m.weight.data.fill_(1)
m.bias.data.zero_()
mean_params = np.load(smpl_mean_params)
init_pose = torch.from_numpy(mean_params['pose'][:]).unsqueeze(0)
init_shape = torch.from_numpy(mean_params['shape'][:].astype('float32')).unsqueeze(0)
init_cam = torch.from_numpy(mean_params['cam']).unsqueeze(0)
self.register_buffer('init_pose', init_pose)
self.register_buffer('init_shape', init_shape)
self.register_buffer('init_cam', init_cam)
def _make_layer(self, block, planes, blocks, stride=1):
downsample = None
if stride != 1 or self.inplanes != planes * block.expansion:
downsample = nn.Sequential(
nn.Conv2d(self.inplanes, planes * block.expansion,
kernel_size=1, stride=stride, bias=False),
nn.BatchNorm2d(planes * block.expansion),
)
layers = []
layers.append(block(self.inplanes, planes, stride, downsample))
self.inplanes = planes * block.expansion
for i in range(1, blocks):
layers.append(block(self.inplanes, planes))
return nn.Sequential(*layers)
def forward(self, x, init_pose=None, init_shape=None, init_cam=None, n_iter=3):
batch_size = x.shape[0]
if init_pose is None:
init_pose = self.init_pose.expand(batch_size, -1)
if init_shape is None:
init_shape = self.init_shape.expand(batch_size, -1)
if init_cam is None:
init_cam = self.init_cam.expand(batch_size, -1)
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.maxpool(x)
x1 = self.layer1(x)
x2 = self.layer2(x1)
x3 = self.layer3(x2)
x4 = self.layer4(x3)
xf = self.avgpool(x4)
xf = xf.view(xf.size(0), -1)
pred_pose = init_pose
pred_shape = init_shape
pred_cam = init_cam
for i in range(n_iter):
xc = torch.cat([xf, pred_pose, pred_shape, pred_cam],1)
xc = self.fc1(xc)
xc = self.drop1(xc)
xc = self.fc2(xc)
xc = self.drop2(xc)
pred_pose = self.decpose(xc) + pred_pose
pred_shape = self.decshape(xc) + pred_shape
pred_cam = self.deccam(xc) + pred_cam
pred_rotmat = rot6d_to_rotmat(pred_pose).view(batch_size, 24, 3, 3)
return pred_rotmat, pred_shape, pred_cam
def hmr(smpl_mean_params, pretrained=True, **kwargs):
""" Constructs an HMR model with ResNet50 backbone.
Args:
pretrained (bool): If True, returns a model pre-trained on ImageNet
"""
model = HMR(Bottleneck, [3, 4, 6, 3], smpl_mean_params, **kwargs)
if pretrained:
resnet_imagenet = resnet.resnet50(pretrained=True)
model.load_state_dict(resnet_imagenet.state_dict(),strict=False)
return model


@ -0,0 +1,233 @@
'''
@ Date: 2020-10-23 20:07:49
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-05 13:43:01
@ FilePath: /EasyMocap/code/estimator/SPIN/spin_api.py
'''
"""
Demo code
To run our method, you need a bounding box around the person. The person needs to be centered inside the bounding box and the bounding box should be relatively tight. You can either supply the bounding box directly or provide an [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) detection file. In the latter case we infer the bounding box from the detections.
In summary, we provide 3 different ways to use our demo code and models:
1. Provide only an input image (using ```--img```), in which case it is assumed that it is already cropped with the person centered in the image.
2. Provide an input image as before, together with the OpenPose detection .json (using ```--openpose```). Our code will use the detections to compute the bounding box and crop the image.
3. Provide an image and a bounding box (using ```--bbox```). The expected format for the json file can be seen in ```examples/im1010_bbox.json```.
Example with OpenPose detection .json
```
python3 demo.py --checkpoint=data/model_checkpoint.pt --img=examples/im1010.png --openpose=examples/im1010_openpose.json
```
Example with predefined Bounding Box
```
python3 demo.py --checkpoint=data/model_checkpoint.pt --img=examples/im1010.png --bbox=examples/im1010_bbox.json
```
Example with cropped and centered image
```
python3 demo.py --checkpoint=data/model_checkpoint.pt --img=examples/im1010.png
```
Running the previous command will save the results in ```examples/im1010_{shape,shape_side}.png```. The file ```im1010_shape.png``` shows the overlayed reconstruction of human shape. We also render a side view, saved in ```im1010_shape_side.png```.
"""
import torch
from torchvision.transforms import Normalize
import numpy as np
import cv2
from .models import hmr
class constants:
FOCAL_LENGTH = 5000.
IMG_RES = 224
# Mean and standard deviation for normalizing input image
IMG_NORM_MEAN = [0.485, 0.456, 0.406]
IMG_NORM_STD = [0.229, 0.224, 0.225]
def get_transform(center, scale, res, rot=0):
"""Generate transformation matrix."""
h = 200 * scale
t = np.zeros((3, 3))
t[0, 0] = float(res[1]) / h
t[1, 1] = float(res[0]) / h
t[0, 2] = res[1] * (-float(center[0]) / h + .5)
t[1, 2] = res[0] * (-float(center[1]) / h + .5)
t[2, 2] = 1
if not rot == 0:
rot = -rot # To match direction of rotation from cropping
rot_mat = np.zeros((3,3))
rot_rad = rot * np.pi / 180
sn,cs = np.sin(rot_rad), np.cos(rot_rad)
rot_mat[0,:2] = [cs, -sn]
rot_mat[1,:2] = [sn, cs]
rot_mat[2,2] = 1
# Need to rotate around center
t_mat = np.eye(3)
t_mat[0,2] = -res[1]/2
t_mat[1,2] = -res[0]/2
t_inv = t_mat.copy()
t_inv[:2,2] *= -1
t = np.dot(t_inv,np.dot(rot_mat,np.dot(t_mat,t)))
return t
def transform(pt, center, scale, res, invert=0, rot=0):
"""Transform pixel location to different reference."""
t = get_transform(center, scale, res, rot=rot)
if invert:
t = np.linalg.inv(t)
new_pt = np.array([pt[0]-1, pt[1]-1, 1.]).T
new_pt = np.dot(t, new_pt)
return new_pt[:2].astype(int)+1
def crop(img, center, scale, res, rot=0, bias=0):
"""Crop image according to the supplied bounding box."""
# Upper left point
ul = np.array(transform([1, 1], center, scale, res, invert=1))-1
# Bottom right point
br = np.array(transform([res[0]+1,
res[1]+1], center, scale, res, invert=1))-1
# Padding so that when rotated proper amount of context is included
pad = int(np.linalg.norm(br - ul) / 2 - float(br[1] - ul[1]) / 2)
if not rot == 0:
ul -= pad
br += pad
new_shape = [br[1] - ul[1], br[0] - ul[0]]
if len(img.shape) > 2:
new_shape += [img.shape[2]]
new_img = np.zeros(new_shape) + bias
# Range to fill new array
new_x = max(0, -ul[0]), min(br[0], len(img[0])) - ul[0]
new_y = max(0, -ul[1]), min(br[1], len(img)) - ul[1]
# Range to sample from original image
old_x = max(0, ul[0]), min(len(img[0]), br[0])
old_y = max(0, ul[1]), min(len(img), br[1])
new_img[new_y[0]:new_y[1], new_x[0]:new_x[1]] = img[old_y[0]:old_y[1],
old_x[0]:old_x[1]]
if not rot == 0:
# Remove padding
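# scipy is not imported here and scipy.misc.imrotate was removed from SciPy, so rotate with OpenCV instead
rot_mat = cv2.getRotationMatrix2D((new_img.shape[1] / 2, new_img.shape[0] / 2), rot, 1.0)
new_img = cv2.warpAffine(new_img, rot_mat, (new_img.shape[1], new_img.shape[0]))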
new_img = new_img[pad:-pad, pad:-pad]
new_img = cv2.resize(new_img, (res[0], res[1]))
return new_img
def process_image(img, bbox, input_res=224):
"""Preprocess an image for the network: convert BGR to RGB, crop it to the bounding box, and normalize it.
bbox is (l, t, r, b, ...); only the first four values are used.
Returns the cropped image tensor and its normalized copy with a batch dimension.
"""
img = img[:, :, ::-1].copy()
normalize_img = Normalize(mean=constants.IMG_NORM_MEAN, std=constants.IMG_NORM_STD)
l, t, r, b = bbox[:4]
center = [(l+r)/2, (t+b)/2]
width = max(r-l, b-t)
scale = width/200.0
img = crop(img, center, scale, (input_res, input_res))
img = img.astype(np.float32) / 255.
img = torch.from_numpy(img).permute(2,0,1)
norm_img = normalize_img(img.clone())[None]
return img, norm_img
def estimate_translation_np(S, joints_2d, joints_conf, K):
"""Find the camera translation that brings the 3D joints S closest to the corresponding 2D joints_2d.
Input:
S: (N, 3) 3D joint locations
joints_2d: (N, 2) 2D joint locations
joints_conf: (N,) confidence of each 2D joint
K: (3, 3) camera intrinsic matrix
Returns:
(3,) camera translation vector
"""
num_joints = S.shape[0]
# focal length
f = np.array([K[0, 0], K[1, 1]])
# optical center
center = np.array([K[0, 2], K[1, 2]])
# transformations
Z = np.reshape(np.tile(S[:,2],(2,1)).T,-1)
XY = np.reshape(S[:,0:2],-1)
O = np.tile(center,num_joints)
F = np.tile(f,num_joints)
weight2 = np.reshape(np.tile(np.sqrt(joints_conf),(2,1)).T,-1)
# least squares
Q = np.array([F*np.tile(np.array([1,0]),num_joints), F*np.tile(np.array([0,1]),num_joints), O-np.reshape(joints_2d,-1)]).T
c = (np.reshape(joints_2d,-1)-O)*Z - F*XY
# weighted least squares
W = np.diagflat(weight2)
Q = np.dot(W,Q)
c = np.dot(W,c)
# square matrix
A = np.dot(Q.T,Q)
b = np.dot(Q.T,c)
# solution
trans = np.linalg.solve(A, b)
return trans
class SPIN:
def __init__(self, SMPL_MEAN_PARAMS, checkpoint, device) -> None:
model = hmr(SMPL_MEAN_PARAMS).to(device)
checkpoint = torch.load(checkpoint, map_location=device)
model.load_state_dict(checkpoint['model'], strict=False)
# Switch the network to evaluation mode
model.eval()
self.model = model
self.device = device
def forward(self, img, bbox, use_rh_th=True):
# Preprocess input image and generate predictions
img, norm_img = process_image(img, bbox, input_res=constants.IMG_RES)
with torch.no_grad():
pred_rotmat, pred_betas, pred_camera = self.model(norm_img.to(self.device))
results = {
'shapes': pred_betas.detach().cpu().numpy()
}
rotmat = pred_rotmat[0].detach().cpu().numpy()
poses = np.zeros((1, rotmat.shape[0]*3))
for i in range(rotmat.shape[0]):
p, _ = cv2.Rodrigues(rotmat[i])
poses[0, 3*i:3*i+3] = p[:, 0]
results['poses'] = poses
if use_rh_th:
body_params = {
'poses': results['poses'],
'shapes': results['shapes'],
'Rh': results['poses'][:, :3].copy(),
'Th': np.zeros((1, 3)),
}
body_params['Th'][0, 2] = 5
body_params['poses'][:, :3] = 0
results = body_params
return results
def init_with_spin(body_model, spin_model, img, bbox, kpts, camera):
body_params = spin_model.forward(img.copy(), bbox)
body_params = body_model.check_params(body_params)
# only use the body joints to estimate the translation
nJoints = 15
keypoints3d = body_model(return_verts=False, return_tensor=False, **body_params)[0]
trans = estimate_translation_np(keypoints3d[:nJoints], kpts[:nJoints, :2], kpts[:nJoints, 2], camera['K'])
body_params['Th'] += trans[None, :]
# convert to world coordinate
Rhold = cv2.Rodrigues(body_params['Rh'])[0]
Thold = body_params['Th']
Rh = camera['R'].T @ Rhold
Th = (camera['R'].T @ (Thold.T - camera['T'])).T
body_params['Th'] = Th
body_params['Rh'] = cv2.Rodrigues(Rh)[0].reshape(1, 3)
vertices = body_model(return_verts=True, return_tensor=False, **body_params)[0]
keypoints3d = body_model(return_verts=False, return_tensor=False, **body_params)[0]
results = {'body_params': body_params, 'vertices': vertices, 'keypoints3d': keypoints3d}
return results
if __name__ == '__main__':
pass
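The weighted least-squares step in `estimate_translation_np` can be checked on synthetic data. The sketch below is a self-contained restatement, not the file's code: the name `estimate_translation`, the use of `np.linalg.lstsq` instead of the normal equations, and the test camera are assumptions made for illustration. It relies on the pinhole relation `f_x * t_x - (u - c_x) * t_z = (u - c_x) * Z - f_x * X` (and the same for `y`), which is linear in the translation.

```python
import numpy as np

def estimate_translation(S, joints_2d, conf, K):
    """Solve for t so that K @ (S + t) projects onto joints_2d (hypothetical helper)."""
    n = S.shape[0]
    f = np.array([K[0, 0], K[1, 1]])       # focal lengths
    c = np.array([K[0, 2], K[1, 2]])       # optical center
    F = np.tile(f, n)                      # (2n,) fx, fy, fx, fy, ...
    O = np.tile(c, n)                      # (2n,) cx, cy, cx, cy, ...
    Z = np.repeat(S[:, 2], 2)              # depth of each joint, once per x/y row
    XY = S[:, :2].reshape(-1)
    uv = joints_2d.reshape(-1)
    w = np.sqrt(np.repeat(conf, 2))        # one weight per equation
    sel_x = np.tile([1.0, 0.0], n)         # picks the x rows
    sel_y = np.tile([0.0, 1.0], n)         # picks the y rows
    Q = np.stack([F * sel_x, F * sel_y, O - uv], axis=1)
    b = (uv - O) * Z - F * XY
    trans, *_ = np.linalg.lstsq(Q * w[:, None], b * w, rcond=None)
    return trans

# Synthetic check: project joints with a known translation, then recover it.
rng = np.random.default_rng(0)
S = rng.normal(size=(15, 3)) * 0.3
t_true = np.array([0.1, -0.2, 4.0])
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 500.0], [0.0, 0.0, 1.0]])
uvw = (S + t_true) @ K.T
joints_2d = uvw[:, :2] / uvw[:, 2:]
t_est = estimate_translation(S, joints_2d, np.ones(15), K)
print(t_est)
```

With noise-free projections the recovered translation should match `t_true` up to floating-point error; with real detections the confidences down-weight unreliable joints.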

View File

@ -0,0 +1,264 @@
import cv2
import numpy as np
from tqdm import tqdm
import os
class FileStorage(object):
def __init__(self, filename, isWrite=False):
version = cv2.__version__
self.major_version = int(version.split('.')[0])
self.second_version = int(version.split('.')[1])
if isWrite:
os.makedirs(os.path.dirname(filename), exist_ok=True)
self.fs = cv2.FileStorage(filename, cv2.FILE_STORAGE_WRITE)
else:
self.fs = cv2.FileStorage(filename, cv2.FILE_STORAGE_READ)
def __del__(self):
cv2.FileStorage.release(self.fs)
def write(self, key, value, dt='mat'):
if dt == 'mat':
cv2.FileStorage.write(self.fs, key, value)
elif dt == 'list':
if self.major_version == 4: # 4.4
self.fs.startWriteStruct(key, cv2.FileNode_SEQ)
for elem in value:
self.fs.write('', elem)
self.fs.endWriteStruct()
else: # 3.4
self.fs.write(key, '[')
for elem in value:
self.fs.write('none', elem)
self.fs.write('none', ']')
def read(self, key, dt='mat'):
if dt == 'mat':
output = self.fs.getNode(key).mat()
elif dt == 'list':
results = []
n = self.fs.getNode(key)
for i in range(n.size()):
val = n.at(i).string()
if val == '':
val = str(int(n.at(i).real()))
if val != 'none':
results.append(val)
output = results
else:
raise NotImplementedError
return output
def close(self):
self.__del__()
def safe_mkdir(path):
if not os.path.exists(path):
os.makedirs(path)
def read_intri(intri_name):
assert os.path.exists(intri_name), intri_name
intri = FileStorage(intri_name)
camnames = intri.read('names', dt='list')
cameras = {}
for key in camnames:
cam = {}
cam['K'] = intri.read('K_{}'.format(key))
cam['invK'] = np.linalg.inv(cam['K'])
cam['dist'] = intri.read('dist_{}'.format(key))
cameras[key] = cam
return cameras
def write_intri(intri_name, cameras):
intri = FileStorage(intri_name, True)
results = {}
camnames = list(cameras.keys())
intri.write('names', camnames, 'list')
for key_, val in cameras.items():
key = key_.split('.')[0]
K, dist = val['K'], val['dist']
assert K.shape == (3, 3), K.shape
assert dist.shape == (1, 5) or dist.shape == (5, 1), dist.shape
intri.write('K_{}'.format(key), K)
intri.write('dist_{}'.format(key), dist.reshape(1, 5))
def write_extri(extri_name, cameras):
extri = FileStorage(extri_name, True)
results = {}
camnames = list(cameras.keys())
extri.write('names', camnames, 'list')
for key_, val in cameras.items():
key = key_.split('.')[0]
extri.write('R_{}'.format(key), val['Rvec'])
extri.write('Rot_{}'.format(key), val['R'])
extri.write('T_{}'.format(key), val['T'])
return 0
def read_camera(intri_name, extri_name, cam_names=[]):
assert os.path.exists(intri_name), intri_name
assert os.path.exists(extri_name), extri_name
intri = FileStorage(intri_name)
extri = FileStorage(extri_name)
cams, P = {}, {}
cam_names = intri.read('names', dt='list')
for cam in cam_names:
# Only read the intrinsics of the sub-stream
cams[cam] = {}
cams[cam]['K'] = intri.read('K_{}'.format( cam))
cams[cam]['invK'] = np.linalg.inv(cams[cam]['K'])
Rvec = extri.read('R_{}'.format(cam))
Tvec = extri.read('T_{}'.format(cam))
R = cv2.Rodrigues(Rvec)[0]
RT = np.hstack((R, Tvec))
cams[cam]['RT'] = RT
cams[cam]['R'] = R
cams[cam]['T'] = Tvec
P[cam] = cams[cam]['K'] @ cams[cam]['RT']
cams[cam]['P'] = P[cam]
cams[cam]['dist'] = intri.read('dist_{}'.format(cam))
cams['basenames'] = cam_names
return cams
def write_camera(camera, path):
from os.path import join
intri_name = join(path, 'intri.yml')
extri_name = join(path, 'extri.yml')
intri = FileStorage(intri_name, True)
extri = FileStorage(extri_name, True)
results = {}
camnames = [key_.split('.')[0] for key_ in camera.keys()]
intri.write('names', camnames, 'list')
extri.write('names', camnames, 'list')
for key_, val in camera.items():
if key_ == 'basenames':
continue
key = key_.split('.')[0]
intri.write('K_{}'.format(key), val['K'])
intri.write('dist_{}'.format(key), val['dist'])
if 'Rvec' not in val.keys():
val['Rvec'] = cv2.Rodrigues(val['R'])[0]
extri.write('R_{}'.format(key), val['Rvec'])
extri.write('Rot_{}'.format(key), val['R'])
extri.write('T_{}'.format(key), val['T'])
class Undistort:
@staticmethod
def image(frame, K, dist):
return cv2.undistort(frame, K, dist, None)
@staticmethod
def points(keypoints, K, dist):
# keypoints: (N, 3)
assert len(keypoints.shape) == 2, keypoints.shape
kpts = keypoints[:, None, :2]
kpts = np.ascontiguousarray(kpts)
kpts = cv2.undistortPoints(kpts, K, dist, P=K)
keypoints[:, :2] = kpts[:, 0]
return keypoints
@staticmethod
def bbox(bbox, K, dist):
keypoints = np.array([[bbox[0], bbox[1], 1], [bbox[2], bbox[3], 1]])
kpts = Undistort.points(keypoints, K, dist)
bbox = np.array([kpts[0, 0], kpts[0, 1], kpts[1, 0], kpts[1, 1], bbox[4]])
return bbox
def undistort(camera, frame=None, keypoints=None, output=None, bbox=None):
# bbox: 1, 7
mtx = camera['K']
dist = camera['dist']
if frame is not None:
frame = cv2.undistort(frame, mtx, dist, None)
if output is not None:
output = cv2.undistort(output, mtx, dist, None)
if keypoints is not None:
for nP in range(keypoints.shape[0]):
kpts = keypoints[nP][:, None, :2]
kpts = np.ascontiguousarray(kpts)
kpts = cv2.undistortPoints(kpts, mtx, dist, P=mtx)
keypoints[nP, :, :2] = kpts[:, 0]
if bbox is not None:
kpts = np.zeros((2, 1, 2))
kpts[0, 0, 0] = bbox[0]
kpts[0, 0, 1] = bbox[1]
kpts[1, 0, 0] = bbox[2]
kpts[1, 0, 1] = bbox[3]
kpts = cv2.undistortPoints(kpts, mtx, dist, P=mtx)
bbox[0] = kpts[0, 0, 0]
bbox[1] = kpts[0, 0, 1]
bbox[2] = kpts[1, 0, 0]
bbox[3] = kpts[1, 0, 1]
return bbox
return frame, keypoints, output
def get_bbox(points_set, H, W, thres=0.1, scale=1.2):
bboxes = np.zeros((points_set.shape[0], 6))
for iv in range(points_set.shape[0]):
pose = points_set[iv, :, :]
use_idx = pose[:,2] > thres
if np.sum(use_idx) < 1:
continue
ll, rr = np.min(pose[use_idx, 0]), np.max(pose[use_idx, 0])
bb, tt = np.min(pose[use_idx, 1]), np.max(pose[use_idx, 1])
center = (int((ll + rr) / 2), int((bb + tt) / 2))
length = [int(scale*(rr-ll)/2), int(scale*(tt-bb)/2)]
l = max(0, center[0] - length[0])
r = min(W, center[0] + length[0]) # img.shape[1]
b = max(0, center[1] - length[1])
t = min(H, center[1] + length[1]) # img.shape[0]
conf = pose[:, 2].mean()
cls_conf = pose[use_idx, 2].mean()
bboxes[iv, 0] = l
bboxes[iv, 1] = r
bboxes[iv, 2] = b
bboxes[iv, 3] = t
bboxes[iv, 4] = conf
bboxes[iv, 5] = cls_conf
return bboxes
def filterKeypoints(keypoints, thres = 0.1, min_width=40, \
min_height=40, min_area= 50000, min_count=6):
add_list = []
# TODO: parallelize
for ik in range(keypoints.shape[0]):
pose = keypoints[ik]
vis_count = np.sum(pose[:15, 2] > thres) #TODO:
if vis_count < min_count:
continue
ll, rr = np.min(pose[pose[:,2]>thres,0]), np.max(pose[pose[:,2]>thres,0])
bb, tt = np.min(pose[pose[:,2]>thres,1]), np.max(pose[pose[:,2]>thres,1])
center = (int((ll+rr)/2), int((bb+tt)/2))
length = [int(1.2*(rr-ll)/2), int(1.2*(tt-bb)/2)]
l = center[0] - length[0]
r = center[0] + length[0]
b = center[1] - length[1]
t = center[1] + length[1]
if (r - l) < min_width:
continue
if (t - b) < min_height:
continue
if (r - l)*(t - b) < min_area:
continue
add_list.append(ik)
keypoints = keypoints[add_list, :, :]
return keypoints, add_list
def get_fundamental_matrix(cameras, basenames):
skew_op = lambda x: np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])
fundamental_op = lambda K_0, R_0, T_0, K_1, R_1, T_1: np.linalg.inv(K_0).T @ (
R_0 @ R_1.T) @ K_1.T @ skew_op(K_1 @ R_1 @ R_0.T @ (T_0 - R_0 @ R_1.T @ T_1))
fundamental_RT_op = lambda K_0, RT_0, K_1, RT_1: fundamental_op (K_0, RT_0[:, :3], RT_0[:, 3], K_1,
RT_1[:, :3], RT_1[:, 3] )
F = np.zeros((len(basenames), len(basenames), 3, 3)) # N x N x 3 x 3 matrix
F = {(icam, jcam): np.zeros((3, 3)) for jcam in basenames for icam in basenames}
for icam in basenames:
for jcam in basenames:
F[(icam, jcam)] += fundamental_RT_op(cameras[icam]['K'], cameras[icam]['RT'], cameras[jcam]['K'], cameras[jcam]['RT'])
if F[(icam, jcam)].sum() == 0:
F[(icam, jcam)] += 1e-12 # to avoid nan
return F
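A fundamental matrix can be sanity-checked with the epipolar constraint `x1^T F x0 = 0`. The sketch below uses the textbook construction `F = K1^{-T} [t]_x R K0^{-1}` with `R = R1 R0^T` and `t = T1 - R T0`; it is an assumption-laden illustration and may use a different convention (ordering or transpose) from `fundamental_op` above. All function names and camera values are hypothetical.

```python
import numpy as np

def skew(x):
    # Cross-product matrix: skew(x) @ y == np.cross(x, y)
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])

def fundamental_from_cameras(K0, R0, T0, K1, R1, T1):
    # Relative pose taking camera-0 coordinates to camera-1 coordinates
    R = R1 @ R0.T
    t = T1 - R @ T0
    E = skew(t) @ R                      # essential matrix
    return np.linalg.inv(K1).T @ E @ np.linalg.inv(K0)

def project(K, R, T, X):
    x = K @ (R @ X + T)
    return x / x[2]                      # homogeneous pixel (u, v, 1)

K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
R0, T0 = np.eye(3), np.array([0.0, 0.0, 5.0])
a = np.deg2rad(30)
R1 = np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
T1 = np.array([0.5, 0.0, 5.0])

F = fundamental_from_cameras(K, R0, T0, K, R1, T1)
X = np.array([0.3, -0.2, 0.4])           # a world point seen by both cameras
x0 = project(K, R0, T0, X)
x1 = project(K, R1, T1, X)
residual = float(x1 @ F @ x0)
print(residual)
```

The residual should vanish up to rounding for any point visible in both views, which makes it a cheap test of calibration files read with `read_camera`.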

View File

@ -0,0 +1,91 @@
'''
@ Date: 2021-01-15 12:09:27
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 19:45:18
@ FilePath: /EasyMocapRelease/easymocap/mytools/cmd_loader.py
'''
import os
import argparse
def load_parser():
parser = argparse.ArgumentParser('EasyMocap command line tools')
parser.add_argument('path', type=str)
parser.add_argument('--out', type=str, default=None)
parser.add_argument('--annot', type=str, default='annots', help="sub directory name to store the generated annotation files, default to be annots")
parser.add_argument('--sub', type=str, nargs='+', default=[],
help='the sub folder lists when in video mode')
parser.add_argument('--pid', type=int, nargs='+', default=[0],
help='the person IDs')
parser.add_argument('--max_person', type=int, default=-1,
help='maximum number of persons')
parser.add_argument('--start', type=int, default=0,
help='frame start')
parser.add_argument('--end', type=int, default=100000,
help='frame end')
parser.add_argument('--step', type=int, default=1,
help='frame step')
#
# keypoints and body model
#
parser.add_argument('--body', type=str, default='body25', choices=['body15', 'body25', 'h36m', 'bodyhand', 'bodyhandface', 'total'])
parser.add_argument('--model', type=str, default='smpl', choices=['smpl', 'smplh', 'smplx', 'mano'])
parser.add_argument('--gender', type=str, default='neutral',
choices=['neutral', 'male', 'female'])
# Input control
detec = parser.add_argument_group('Detection control')
detec.add_argument("--thres2d", type=float, default=0.3,
help="The threshold for suppressing noisy kpts")
#
# Optimization control
#
recon = parser.add_argument_group('Reconstruction control')
recon.add_argument('--smooth3d', type=int,
help='the size of window to smooth keypoints3d', default=0)
recon.add_argument('--MAX_REPRO_ERROR', type=int,
help='The threshold of reprojection error', default=50)
recon.add_argument('--MAX_SPEED_ERROR', type=int,
help='The threshold of speed error', default=50)
recon.add_argument('--robust3d', action='store_true')
#
# visualization part
#
parser.add_argument('--vis_det', action='store_true')
parser.add_argument('--vis_repro', action='store_true')
parser.add_argument('--vis_smpl', action='store_true')
parser.add_argument('--undis', action='store_true')
parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
help='the sub folder lists for visualization')
#
# debug
#
parser.add_argument('--verbose', action='store_true')
parser.add_argument('--save_origin', action='store_true')
parser.add_argument('--debug', action='store_true')
parser.add_argument('--opts',
help="Modify config options using the command-line",
default=[],
nargs=argparse.REMAINDER)
return parser
from os.path import join
def save_parser(args):
import yaml
res = vars(args)
os.makedirs(args.out, exist_ok=True)
with open(join(args.out, 'exp.yml'), 'w') as f:
yaml.dump(res, f)
def parse_parser(parser):
args = parser.parse_args()
if args.out is None:
print(' - [Warning] Please specify the output path `--out ${out}`')
print(' - [Warning] Default to {}/output'.format(args.path))
args.out = join(args.path, 'output')
if len(args.sub) == 0 and os.path.exists(join(args.path, 'images')):
args.sub = sorted(os.listdir(join(args.path, 'images')))
if args.sub[0].isdigit():
args.sub = sorted(args.sub, key=lambda x:int(x))
args.opts = {args.opts[2*i]:float(args.opts[2*i+1]) for i in range(len(args.opts)//2)}
save_parser(args)
return args
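The `--opts` handling in `parse_parser` turns a flat list of `key value` tokens into a dictionary of floats, and numeric sub-folder names are sorted by value instead of lexicographically. A minimal sketch of both steps follows; the option names `smooth_poses` and `reg_poses` and the path `data/demo` are placeholders, not options confirmed by the source.

```python
import argparse

def parse_opts(tokens):
    # Pair up consecutive tokens: [k0, v0, k1, v1, ...] -> {k0: float(v0), ...}
    return {tokens[2 * i]: float(tokens[2 * i + 1]) for i in range(len(tokens) // 2)}

parser = argparse.ArgumentParser('demo')
parser.add_argument('path', type=str)
parser.add_argument('--opts', default=[], nargs=argparse.REMAINDER)

# Everything after --opts is collected verbatim as strings
args = parser.parse_args(
    ['data/demo', '--opts', 'smooth_poses', '1e-4', 'reg_poses', '0.001'])
opts = parse_opts(args.opts)
print(opts)

# Sub-folders named with digits are sorted numerically, as in parse_parser
subs = sorted(['10', '2', '1'], key=lambda x: int(x))
print(subs)
```

Because every value is passed through `float`, non-numeric overrides would raise a `ValueError`, and a trailing key without a value is silently dropped.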

View File

@ -0,0 +1,158 @@
'''
@ Date: 2021-03-15 12:23:12
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-01 16:17:34
@ FilePath: /EasyMocap/easymocap/mytools/file_utils.py
'''
import os
import json
import numpy as np
from os.path import join
mkdir = lambda x:os.makedirs(x, exist_ok=True)
mkout = lambda x:mkdir(os.path.dirname(x))
def read_json(path):
assert os.path.exists(path), path
with open(path) as f:
data = json.load(f)
return data
def save_json(file, data):
if not os.path.exists(os.path.dirname(file)):
os.makedirs(os.path.dirname(file))
with open(file, 'w') as f:
json.dump(data, f, indent=4)
def getFileList(root, ext='.jpg'):
files = []
dirs = os.listdir(root)
while len(dirs) > 0:
path = dirs.pop()
fullname = join(root, path)
if os.path.isfile(fullname) and fullname.endswith(ext):
files.append(path)
elif os.path.isdir(fullname):
for s in os.listdir(fullname):
newDir = join(path, s)
dirs.append(newDir)
files = sorted(files)
return files
def read_annot(annotname, mode='body25'):
data = read_json(annotname)
if not isinstance(data, list):
data = data['annots']
for i in range(len(data)):
if 'id' not in data[i].keys():
data[i]['id'] = data[i].pop('personID')
if 'keypoints2d' in data[i].keys() and 'keypoints' not in data[i].keys():
data[i]['keypoints'] = data[i].pop('keypoints2d')
for key in ['bbox', 'keypoints', 'handl2d', 'handr2d', 'face2d']:
if key not in data[i].keys():continue
data[i][key] = np.array(data[i][key])
if key == 'face2d':
# TODO: Make parameters, 17 is the offset for the eye brows,
# etc. 51 is the total number of FLAME compatible landmarks
data[i][key] = data[i][key][17:17+51, :]
data[i]['bbox'] = data[i]['bbox'][:5]
if data[i]['bbox'][-1] < 0.001:
# print('{}/{} bbox conf = 0, may be error'.format(annotname, i))
data[i]['bbox'][-1] = 1
if mode == 'body25':
data[i]['keypoints'] = data[i]['keypoints']
elif mode == 'body15':
data[i]['keypoints'] = data[i]['keypoints'][:15, :]
elif mode == 'total':
data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d', 'face2d']])
elif mode == 'bodyhand':
data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d']])
elif mode == 'bodyhandface':
data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d', 'face2d']])
conf = data[i]['keypoints'][..., -1]
conf[conf<0] = 0
data.sort(key=lambda x:x['id'])
return data
def write_common_results(dumpname, results, keys, fmt='%.3f'):
mkout(dumpname)
format_out = {'float_kind':lambda x: fmt % x}
with open(dumpname, 'w') as f:
f.write('[\n')
for idata, data in enumerate(results):
f.write(' {\n')
output = {}
output['id'] = data['id']
for key in keys:
if key not in data.keys():continue
output[key] = np.array2string(data[key], max_line_width=1000, separator=', ', formatter=format_out)
for key in output.keys():
f.write(' \"{}\": {}'.format(key, output[key]))
if key != keys[-1]:
f.write(',\n')
else:
f.write('\n')
f.write(' }')
if idata != len(results) - 1:
f.write(',\n')
else:
f.write('\n')
f.write(']\n')
def write_keypoints3d(dumpname, results):
# TODO:rewrite it
keys = ['keypoints3d']
write_common_results(dumpname, results, keys, fmt='%.3f')
def write_smpl(dumpname, results):
keys = ['Rh', 'Th', 'poses', 'expression', 'shapes']
write_common_results(dumpname, results, keys)
def get_bbox_from_pose(pose_2d, img, rate = 0.1):
# this function returns bounding box from the 2D pose
# here use pose_2d[:, -1] instead of pose_2d[:, 2]
# because when vis reprojection, the result will be (x, y, depth, conf)
validIdx = pose_2d[:, -1] > 0
if validIdx.sum() == 0:
return [0, 0, 100, 100, 0]
y_min = int(min(pose_2d[validIdx, 1]))
y_max = int(max(pose_2d[validIdx, 1]))
x_min = int(min(pose_2d[validIdx, 0]))
x_max = int(max(pose_2d[validIdx, 0]))
dx = (x_max - x_min)*rate
dy = (y_max - y_min)*rate
# TODO: append the class label and similar fields later
bbox = [x_min-dx, y_min-dy, x_max+dx, y_max+dy, 1]
correct_bbox(img, bbox)
return bbox
def correct_bbox(img, bbox):
# this function marks a bbox that lies completely outside the image as invalid
h = img.shape[0]
w = img.shape[1]
if bbox[2] <= 0 or bbox[0] >= w or bbox[1] >= h or bbox[3] <= 0:
bbox[4] = 0
return bbox
def merge_params(param_list, share_shape=True):
output = {}
for key in ['poses', 'shapes', 'Rh', 'Th', 'expression']:
if key in param_list[0].keys():
output[key] = np.vstack([v[key] for v in param_list])
if share_shape:
output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
return output
def select_nf(params_all, nf):
output = {}
for key in ['poses', 'Rh', 'Th']:
output[key] = params_all[key][nf:nf+1, :]
if 'expression' in params_all.keys():
output['expression'] = params_all['expression'][nf:nf+1, :]
if params_all['shapes'].shape[0] == 1:
output['shapes'] = params_all['shapes']
else:
output['shapes'] = params_all['shapes'][nf:nf+1, :]
return output
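`merge_params` and `select_nf` are inverse operations up to the shared shape: per-frame parameters are stacked along the first axis, and a single frame is sliced back out. The sketch below restates both in a self-contained form to show the resulting array shapes; the helper name `select_frame` and the parameter sizes (72 pose values, 10 shape values) are assumptions for illustration.

```python
import numpy as np

def merge_params(param_list, share_shape=True):
    output = {}
    for key in ['poses', 'shapes', 'Rh', 'Th', 'expression']:
        if key in param_list[0]:
            output[key] = np.vstack([p[key] for p in param_list])
    if share_shape:
        # One body shape for the whole sequence: average over frames
        output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
    return output

def select_frame(params_all, nf):
    out = {key: params_all[key][nf:nf + 1] for key in ['poses', 'Rh', 'Th']}
    shapes = params_all['shapes']
    # A shared shape has a single row and is reused for every frame
    out['shapes'] = shapes if shapes.shape[0] == 1 else shapes[nf:nf + 1]
    return out

frames = [{
    'poses': np.full((1, 72), float(i)),
    'shapes': np.full((1, 10), float(i)),
    'Rh': np.zeros((1, 3)),
    'Th': np.full((1, 3), float(i)),
} for i in range(3)]

merged = merge_params(frames)
frame1 = select_frame(merged, 1)
print(merged['poses'].shape, merged['shapes'].shape, frame1['Th'])
```

Slicing with `nf:nf+1` rather than `nf` keeps the leading batch dimension, so the selected frame has the same layout as a single-frame result.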

View File

@ -0,0 +1,59 @@
# function to read data
"""
This class provides:
| write | vis
- keypoints2d | x | o
- keypoints3d | x | o
- smpl | x | o
"""
import numpy as np
from .file_utils import read_json
def read_keypoints2d(filename):
pass
def read_keypoints3d(filename):
data = read_json(filename)
res_ = []
for d in data:
pid = d['id'] if 'id' in d.keys() else d['personID']
pose3d = np.array(d['keypoints3d'])
if pose3d.shape[0] > 25:
# When hands are present, set the root joint of each hand to the corresponding body25 wrist joint
pose3d[25, :] = pose3d[7, :]
pose3d[46, :] = pose3d[4, :]
res_.append({
'id': pid,
'keypoints3d': pose3d
})
return res_
def read_smpl(filename):
datas = read_json(filename)
outputs = []
for data in datas:
for key in ['Rh', 'Th', 'poses', 'shapes']:
data[key] = np.array(data[key])
# for smplx results
outputs.append(data)
return outputs
def read_keypoints3d_a4d(outname):
res_ = []
with open(outname, "r") as file:
lines = file.readlines()
if len(lines) < 2:
return res_
nPerson, nJoints = int(lines[0]), int(lines[1])
# keep only the per-person results
lines = lines[1:]
# each person's block also records the number of keypoints
line_per_person = 1 + 1 + nJoints
for i in range(nPerson):
trackId = int(lines[i*line_per_person+1])
content = ''.join(lines[i*line_per_person+2:i*line_per_person+2+nJoints])
pose3d = np.fromstring(content, dtype=float, sep=' ').reshape((nJoints, 4))
# association4d uses a joint order that differs from the standard definition
pose3d = pose3d[[4, 1, 5, 9, 13, 6, 10, 14, 0, 2, 7, 11, 3, 8, 12], :]
res_.append({'id':trackId, 'keypoints3d':np.array(pose3d)})
return res_
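The indexing in `read_keypoints3d_a4d` implies a text layout of one person count, followed for each person by a joint count, a track id, and one `x y z conf` line per joint. The sketch below parses that layout from a string; the layout is inferred from the indexing above rather than from a format specification, `parse_a4d` is a hypothetical name, and the joint reordering is left out so that a two-joint toy example works.

```python
import numpy as np

def parse_a4d(text):
    lines = text.strip().splitlines()
    if len(lines) < 2:
        return []
    n_person, n_joints = int(lines[0]), int(lines[1])
    lines = lines[1:]                    # drop the person count
    block = 1 + 1 + n_joints             # joint count + track id + joints
    results = []
    for i in range(n_person):
        track_id = int(lines[i * block + 1])
        rows = lines[i * block + 2: i * block + 2 + n_joints]
        pose3d = np.array([[float(v) for v in row.split()] for row in rows])
        results.append({'id': track_id, 'keypoints3d': pose3d})
    return results

sample = """2
2
7
0 0 0 1
1 1 1 0.5
2
9
2 2 2 1
3 3 3 1
"""
people = parse_a4d(sample)
print([p['id'] for p in people], people[1]['keypoints3d'])
```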

View File

@ -0,0 +1,116 @@
'''
* @ Date: 2020-09-14 11:01:52
* @ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 20:31:34
@ FilePath: /EasyMocapRelease/media/qing/Project/mirror/EasyMocap/easymocap/mytools/reconstruction.py
'''
import numpy as np
def solveZ(A):
u, s, v = np.linalg.svd(A)
X = v[-1, :]
X = X / X[3]
return X[:3]
def projectN3(kpts3d, Pall):
# kpts3d: (N, 3)
nViews = len(Pall)
kp3d = np.hstack((kpts3d[:, :3], np.ones((kpts3d.shape[0], 1))))
kp2ds = []
for nv in range(nViews):
kp2d = Pall[nv] @ kp3d.T
kp2d[:2, :] /= kp2d[2:, :]
kp2ds.append(kp2d.T[None, :, :])
kp2ds = np.vstack(kp2ds)
kp2ds[..., -1] = kp2ds[..., -1] * (kpts3d[None, :, -1] > 0.)
return kp2ds
def simple_reprojection_error(kpts1, kpts1_proj):
# (N, 3)
error = np.mean((kpts1[:, :2] - kpts1_proj[:, :2])**2)
return error
def simple_triangulate(kpts, Pall):
# kpts: (nViews, 3)
# Pall: (nViews, 3, 4)
# return: kpts3d(3,), conf: float
nViews = len(kpts)
A = np.zeros((nViews*2, 4), dtype=np.float64)
result = np.zeros(4)
result[3] = kpts[:, 2].sum()/(kpts[:, 2]>0).sum()
for i in range(nViews):
P = Pall[i]
A[i*2, :] = kpts[i, 2]*(kpts[i, 0]*P[2:3,:] - P[0:1,:])
A[i*2 + 1, :] = kpts[i, 2]*(kpts[i, 1]*P[2:3,:] - P[1:2,:])
result[:3] = solveZ(A)
return result
def batch_triangulate(keypoints_, Pall, keypoints_pre=None, lamb=1e3):
# keypoints: (nViews, nJoints, 3)
# Pall: (nViews, 3, 4)
# A: (nJoints, nViewsx2, 4), x: (nJoints, 4, 1); b: (nJoints, nViewsx2, 1)
v = (keypoints_[:, :, -1]>0).sum(axis=0)
valid_joint = np.where(v > 1)[0]
keypoints = keypoints_[:, valid_joint]
conf3d = keypoints[:, :, -1].sum(axis=0)/v[valid_joint]
# P2: the last row of each P matrix: (1, nViews, 4)
P0 = Pall[None, :, 0, :]
P1 = Pall[None, :, 1, :]
P2 = Pall[None, :, 2, :]
# uP2: the x coordinate multiplied by P2: (nJoints, nViews, 4)
uP2 = keypoints[:, :, 0].T[:, :, None] * P2
vP2 = keypoints[:, :, 1].T[:, :, None] * P2
conf = keypoints[:, :, 2].T[:, :, None]
Au = conf * (uP2 - P0)
Av = conf * (vP2 - P1)
A = np.hstack([Au, Av])
if keypoints_pre is not None:
# keypoints_pre: (nJoints, 4)
B = np.eye(4)[None, :, :].repeat(A.shape[0], axis=0)
B[:, :3, 3] = -keypoints_pre[valid_joint, :3]
confpre = lamb * keypoints_pre[valid_joint, 3]
# 1, 0, 0, -x0
# 0, 1, 0, -y0
# 0, 0, 1, -z0
# 0, 0, 0, 0
B[:, 3, 3] = 0
B = B * confpre[:, None, None]
A = np.hstack((A, B))
u, s, v = np.linalg.svd(A)
X = v[:, -1, :]
X = X / X[:, 3:]
# out: (nJoints, 4)
result = np.zeros((keypoints_.shape[1], 4))
result[valid_joint, :3] = X[:, :3]
result[valid_joint, 3] = conf3d
return result
eps = 0.01
def simple_recon_person(keypoints_use, Puse):
out = batch_triangulate(keypoints_use, Puse)
# compute reprojection error
kpts_repro = projectN3(out, Puse)
square_diff = (keypoints_use[:, :, :2] - kpts_repro[:, :, :2])**2
conf = np.repeat(out[None, :, -1:], len(Puse), 0)
kpts_repro = np.concatenate((kpts_repro, conf), axis=2)
return out, kpts_repro
def check_limb(keypoints3d, limb_means, thres=0.5):
# keypoints3d: (nJ, 4)
valid = True
cnt = 0
for (src, dst), val in limb_means.items():
if not (keypoints3d[src, 3] > 0 and keypoints3d[dst, 3] > 0):
continue
cnt += 1
# compute the bone length
l_est = np.linalg.norm(keypoints3d[src, :3] - keypoints3d[dst, :3])
if abs(l_est - val['mean'])/val['mean']/val['std'] > thres:
valid = False
break
# at least two bones must be usable (note that the check below requires cnt > 2)
valid = valid and cnt > 2
return valid
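The triangulation in this file is a confidence-weighted direct linear transform (DLT): each view contributes two rows `conf * (x * P[2] - P[0])` and `conf * (y * P[2] - P[1])`, and the 3D point is the right singular vector with the smallest singular value. A minimal round-trip sketch follows; the three cameras and the helper names are made up for the check.

```python
import numpy as np

def triangulate_point(kpts, Pall):
    # kpts: (nViews, 3) rows of (x, y, conf); Pall: (nViews, 3, 4)
    rows = []
    for (x, y, c), P in zip(kpts, Pall):
        rows.append(c * (x * P[2] - P[0]))
        rows.append(c * (y * P[2] - P[1]))
    A = np.stack(rows)                   # (2 * nViews, 4)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                           # null-space direction of A
    return X[:3] / X[3]                  # back from homogeneous coordinates

K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])

def make_camera(tx):
    # Identity rotation, translated along x, looking down +z (toy setup)
    RT = np.hstack((np.eye(3), np.array([[tx], [0.0], [5.0]])))
    return K @ RT

Pall = np.stack([make_camera(-1.0), make_camera(0.0), make_camera(1.0)])
X_true = np.array([0.2, -0.1, 0.3])
proj = Pall @ np.append(X_true, 1.0)     # (nViews, 3) homogeneous pixels
kpts = np.concatenate([proj[:, :2] / proj[:, 2:], np.ones((3, 1))], axis=1)
X_est = triangulate_point(kpts, Pall)
print(X_est)
```

Setting a view's confidence to zero removes its two rows from the system, which is why joints need at least two confident views before they are triangulated.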

View File

@ -0,0 +1,157 @@
'''
@ Date: 2020-11-28 17:23:04
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-28 22:19:34
@ FilePath: /EasyMocap/easymocap/mytools/vis_base.py
'''
import cv2
import numpy as np
import json
def generate_colorbar(N = 20, cmap = 'jet'):
bar = ((np.arange(N)/(N-1))*255).astype(np.uint8).reshape(-1, 1)
colorbar = cv2.applyColorMap(bar, cv2.COLORMAP_JET).squeeze()
if False:
colorbar = np.clip(colorbar + 64, 0, 255)
import random
random.seed(666)
index = [i for i in range(N)]
random.shuffle(index)
rgb = colorbar[index, :]
rgb = rgb.tolist()
return rgb
colors_bar_rgb = generate_colorbar(cmap='hsv')
colors_table = {
'b': [0.65098039, 0.74117647, 0.85882353],
'_pink': [.9, .7, .7],
'_mint': [ 166/255., 229/255., 204/255.],
'_mint2': [ 202/255., 229/255., 223/255.],
'_green': [ 153/255., 216/255., 201/255.],
'_green2': [ 171/255., 221/255., 164/255.],
'r': [ 251/255., 128/255., 114/255.],
'_orange': [ 253/255., 174/255., 97/255.],
'y': [ 250/255., 230/255., 154/255.],
'_r':[255/255,0,0],
'g':[0,255/255,0],
'_b':[0,0,255/255],
'k':[0,0,0],
'_y':[255/255,255/255,0],
'purple':[128/255,0,128/255],
'smap_b':[51/255,153/255,255/255],
'smap_r':[255/255,51/255,153/255],
'smap_g':[51/255,255/255,153/255],
}
def get_rgb(index):
if isinstance(index, int):
if index == -1:
return (255, 255, 255)
if index < -1:
return (0, 0, 0)
col = colors_bar_rgb[index%len(colors_bar_rgb)]
else:
col = colors_table.get(index, (1, 0, 0))
col = tuple([int(c*255) for c in col[::-1]])
return col
def plot_point(img, x, y, r, col, pid=-1):
cv2.circle(img, (int(x+0.5), int(y+0.5)), r, col, -1)
if pid != -1:
cv2.putText(img, '{}'.format(pid), (int(x+0.5), int(y+0.5)), cv2.FONT_HERSHEY_SIMPLEX, 1, col, 2)
def plot_line(img, pt1, pt2, lw, col):
cv2.line(img, (int(pt1[0]+0.5), int(pt1[1]+0.5)), (int(pt2[0]+0.5), int(pt2[1]+0.5)),
col, lw)
def plot_cross(img, x, y, col, width=10, lw=2):
cv2.line(img, (int(x-width), int(y)), (int(x+width), int(y)), col, lw)
cv2.line(img, (int(x), int(y-width)), (int(x), int(y+width)), col, lw)
def plot_bbox(img, bbox, pid, vis_id=True):
# draw the bbox: (l, t, r, b)
x1, y1, x2, y2 = bbox[:4]
x1 = int(round(x1))
x2 = int(round(x2))
y1 = int(round(y1))
y2 = int(round(y2))
color = get_rgb(pid)
lw = max(img.shape[0]//300, 2)
cv2.rectangle(img, (x1, y1), (x2, y2), color, lw)
if vis_id:
cv2.putText(img, '{}'.format(pid), (x1, y1+20), cv2.FONT_HERSHEY_SIMPLEX, 1, color, 2)
def plot_keypoints(img, points, pid, config, vis_conf=False, use_limb_color=True, lw=2):
for ii, (i, j) in enumerate(config['kintree']):
if i >= len(points) or j >= len(points):
continue
pt1, pt2 = points[i], points[j]
if use_limb_color:
col = get_rgb(config['colors'][ii])
else:
col = get_rgb(pid)
if pt1[-1] > 0.01 and pt2[-1] > 0.01:
image = cv2.line(
img, (int(pt1[0]+0.5), int(pt1[1]+0.5)), (int(pt2[0]+0.5), int(pt2[1]+0.5)),
col, lw)
for i in range(len(points)):
x, y = points[i][0], points[i][1]
c = points[i][-1]
if c > 0.01:
col = get_rgb(pid)
cv2.circle(img, (int(x+0.5), int(y+0.5)), lw*2, col, -1)
if vis_conf:
cv2.putText(img, '{:.1f}'.format(c), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 1, col, 2)
def plot_points2d(img, points2d, lines, lw=4, col=(0, 255, 0), putText=True):
# draw the 2D points on the image
for i, (x, y, v) in enumerate(points2d):
if v < 0.01:
continue
c = col
plot_cross(img, x, y, width=10, col=c, lw=lw)
if putText:
cv2.putText(img, '{}'.format(i), (int(x), int(y)), cv2.FONT_HERSHEY_SIMPLEX, 1, c, 2)
for i, j in lines:
if points2d[i][2] < 0.01 or points2d[j][2] < 0.01:
continue
plot_line(img, points2d[i], points2d[j], 2, (255, 255, 255))
def merge(images, row=-1, col=-1, resize=False, ret_range=False):
if row == -1 and col == -1:
from math import sqrt
row = int(sqrt(len(images)) + 0.5)
col = int(len(images)/ row + 0.5)
if row > col:
row, col = col, row
if len(images) == 8:
# basketball scene
row, col = 2, 4
images = [images[i] for i in [0, 1, 2, 3, 7, 6, 5, 4]]
if len(images) == 7:
row, col = 3, 3
elif len(images) == 2:
row, col = 2, 1
height = images[0].shape[0]
width = images[0].shape[1]
ret_img = np.zeros((height * row, width * col, images[0].shape[2]), dtype=np.uint8) + 255
ranges = []
for i in range(row):
for j in range(col):
if i*col + j >= len(images):
break
img = images[i * col + j]
# resize the image size
img = cv2.resize(img, (width, height))
ret_img[height * i: height * (i+1), width * j: width * (j+1)] = img
ranges.append((width*j, height*i, width*(j+1), height*(i+1)))
if resize:
scale = min(1000/ret_img.shape[0], 1800/ret_img.shape[1])
while ret_img.shape[0] > 2000:
ret_img = cv2.resize(ret_img, None, fx=scale, fy=scale)
if ret_range:
return ret_img, ranges
return ret_img
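The grid layout in `merge` rounds both the row and the column count, which can leave too few cells for some image counts (this appears to be why 7 and 8 images are special-cased). The sketch below is a NumPy-only variation, not the function above: it uses a floor/ceil grid that always has enough cells, assumes all images already share one size (so no `cv2.resize`), and uses the hypothetical names `grid_shape` and `tile`.

```python
import math
import numpy as np

def grid_shape(n):
    row = max(int(math.sqrt(n)), 1)      # floor of the square root
    col = math.ceil(n / row)             # enough columns to hold all images
    return row, col

def tile(images):
    row, col = grid_shape(len(images))
    h, w = images[0].shape[:2]
    # White canvas, as in merge(); unused cells stay white
    canvas = np.full((h * row, w * col, images[0].shape[2]), 255, dtype=np.uint8)
    ranges = []
    for idx, img in enumerate(images):
        i, j = divmod(idx, col)
        canvas[h * i: h * (i + 1), w * j: w * (j + 1)] = img
        ranges.append((w * j, h * i, w * (j + 1), h * (i + 1)))
    return canvas, ranges

images = [np.full((4, 6, 3), idx, dtype=np.uint8) for idx in range(7)]
canvas, ranges = tile(images)
print(canvas.shape, ranges[4])
```

The returned `ranges` hold `(left, top, right, bottom)` for each tile, matching the tuples collected by `merge` when `ret_range=True`.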

160
easymocap/mytools/writer.py Normal file
View File

@ -0,0 +1,160 @@
import os
from os.path import join
import numpy as np
import cv2
# from mytools import save_json, merge
# from ..mytools import merge, plot_bbox, plot_keypoints
# from mytools.file_utils import read_json, save_json, read_annot, read_smpl, write_smpl, get_bbox_from_pose
from .vis_base import plot_bbox, plot_keypoints, merge
from .file_utils import write_keypoints3d, write_smpl, mkout, mkdir, save_json, get_bbox_from_pose
class FileWriter:
"""
This class provides:
| write | vis
- keypoints2d | x | o
- keypoints3d | x | o
- smpl | x | o
"""
def __init__(self, output_path, config=None, basenames=[], cfg=None) -> None:
self.out = output_path
keys = ['keypoints3d', 'match', 'smpl', 'skel', 'repro', 'keypoints']
output_dict = {key:join(self.out, key) for key in keys}
self.output_dict = output_dict
self.basenames = basenames
if cfg is not None:
print(cfg, file=open(join(output_path, 'exp.yml'), 'w'))
self.save_origin = False
self.config = config
    def write_keypoints2d(self, ):
        pass

    def vis_keypoints2d_mv(self, images, lDetections, outname=None,
        vis_id=True):
        mkout(outname)
        images_vis = []
        for nv, image in enumerate(images):
            img = image.copy()
            for det in lDetections[nv]:
                pid = det['id']
                if 'keypoints2d' in det.keys():
                    keypoints = det['keypoints2d']
                else:
                    keypoints = det['keypoints']
                if 'bbox' not in det.keys():
                    bbox = get_bbox_from_pose(keypoints, img)
                else:
                    bbox = det['bbox']
                plot_bbox(img, bbox, pid=pid, vis_id=vis_id)
                plot_keypoints(img, keypoints, pid=pid, config=self.config, use_limb_color=False, lw=2)
            images_vis.append(img)
        if len(images_vis) > 1:
            images_vis = merge(images_vis, resize=not self.save_origin)
        else:
            images_vis = images_vis[0]
        if outname is not None:
            # savename = join(self.output_dict[key], '{:06d}.jpg'.format(nf))
            cv2.imwrite(outname, images_vis)
        return images_vis

    def write_keypoints3d(self, results, outname):
        write_keypoints3d(outname, results)

    def vis_keypoints3d(self, result, outname):
        # visualize the repro of keypoints3d
        # placeholder: not implemented; the later vis_keypoints3d definition
        # in this class overrides this one
        raise NotImplementedError
    def vis_smpl(self, render_data, images, cameras, outname, add_back):
        mkout(outname)
        from ..visualize import Renderer
        render = Renderer(height=1024, width=1024, faces=None)
        render_results = render.render(render_data, cameras, images, add_back=add_back)
        image_vis = merge(render_results, resize=not self.save_origin)
        cv2.imwrite(outname, image_vis)
        return image_vis

    def _write_keypoints3d(self, results, nf=-1, base=None):
        os.makedirs(self.output_dict['keypoints3d'], exist_ok=True)
        if base is None:
            base = '{:06d}'.format(nf)
        savename = join(self.output_dict['keypoints3d'], '{}.json'.format(base))
        save_json(savename, results)

    def vis_detections(self, images, lDetections, nf, key='keypoints', to_img=True, vis_id=True):
        os.makedirs(self.output_dict[key], exist_ok=True)
        images_vis = []
        for nv, image in enumerate(images):
            img = image.copy()
            for det in lDetections[nv]:
                if key == 'match' and 'id_match' in det.keys():
                    pid = det['id_match']
                else:
                    pid = det['id']
                if key not in det.keys():
                    keypoints = det['keypoints']
                else:
                    keypoints = det[key]
                if 'bbox' not in det.keys():
                    bbox = get_bbox_from_pose(keypoints, img)
                else:
                    bbox = det['bbox']
                plot_bbox(img, bbox, pid=pid, vis_id=vis_id)
                plot_keypoints(img, keypoints, pid=pid, config=self.config, use_limb_color=False, lw=2)
            images_vis.append(img)
        image_vis = merge(images_vis, resize=not self.save_origin)
        if to_img:
            savename = join(self.output_dict[key], '{:06d}.jpg'.format(nf))
            cv2.imwrite(savename, image_vis)
        return image_vis

    def write_smpl(self, results, outname):
        write_smpl(outname, results)
    def vis_keypoints3d(self, infos, nf, images, cameras, mode='repro'):
        out = join(self.out, mode)
        os.makedirs(out, exist_ok=True)
        # cameras: (K, R, T)
        images_vis = []
        for nv, image in enumerate(images):
            img = image.copy()
            K, R, T = cameras['K'][nv], cameras['R'][nv], cameras['T'][nv]
            P = K @ np.hstack([R, T])
            for info in infos:
                pid = info['id']
                keypoints3d = info['keypoints3d']
                # reprojection: homogeneous 3D points times P^T, then divide by depth
                kcam = np.hstack([keypoints3d[:, :3], np.ones((keypoints3d.shape[0], 1))]) @ P.T
                kcam = kcam[:, :2]/kcam[:, 2:]
                # keep the 3D confidence as the confidence of the reprojected 2D point
                k2d = np.hstack((kcam, keypoints3d[:, -1:]))
                bbox = get_bbox_from_pose(k2d, img)
                plot_bbox(img, bbox, pid=pid, vis_id=pid)
                plot_keypoints(img, k2d, pid=pid, config=self.config, use_limb_color=False, lw=2)
            images_vis.append(img)
        savename = join(out, '{:06d}.jpg'.format(nf))
        image_vis = merge(images_vis, resize=False)
        cv2.imwrite(savename, image_vis)
        return image_vis
    def _vis_smpl(self, render_data_, nf, images, cameras, mode='smpl', base=None, add_back=False, extra_mesh=[]):
        out = join(self.out, mode)
        os.makedirs(out, exist_ok=True)
        # relative import, consistent with vis_smpl above
        from ..visualize import Renderer
        render = Renderer(height=1024, width=1024, faces=None, extra_mesh=extra_mesh)
        if isinstance(render_data_, list): # different views have different data
            for nv, render_data in enumerate(render_data_):
                render_results = render.render(render_data, cameras, images)
                image_vis = merge(render_results, resize=not self.save_origin)
                savename = join(out, '{:06d}_{:02d}.jpg'.format(nf, nv))
                cv2.imwrite(savename, image_vis)
        else:
            render_results = render.render(render_data_, cameras, images, add_back=add_back)
            image_vis = merge(render_results, resize=not self.save_origin)
            if nf != -1:
                if base is None:
                    base = '{:06d}'.format(nf)
                savename = join(out, '{}.jpg'.format(base))
                cv2.imwrite(savename, image_vis)
        return image_vis
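The reprojection in `vis_keypoints3d` builds `P = K [R | T]`, multiplies homogeneous 3D points by it, and divides by the third coordinate. As a hedged illustration of that arithmetic for a single point, here is a pure-Python sketch. The function `project_point` and the camera values are made up for the example, and `T` is a flat list here rather than the (3, 1) array the writer appears to expect.

```python
def project_point(K, R, T, X):
    """Project a 3D point X with intrinsics K, rotation R and translation T.

    Illustrative only: plain nested lists stand in for the NumPy arrays
    used in the writer code.
    """
    # camera coordinates: Xc = R @ X + T
    Xc = [sum(R[r][c] * X[c] for c in range(3)) + T[r] for r in range(3)]
    # homogeneous pixel coordinates: x = K @ Xc
    x = [sum(K[r][c] * Xc[c] for c in range(3)) for r in range(3)]
    # perspective division by the depth component
    return x[0] / x[2], x[1] / x[2]


# made-up camera: focal length 1000 px, principal point (320, 240),
# identity rotation, camera 2 m in front of the origin
K = [[1000., 0., 320.], [0., 1000., 240.], [0., 0., 1.]]
R = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]
T = [0., 0., 2.]
u, v = project_point(K, R, T, [0.1, -0.2, 0.])
```

With these values the point lands at roughly (370, 140) pixels: the x offset of 0.1 m at 2 m depth moves it 50 px right of the principal point, and the y offset moves it 100 px up.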


@ -0,0 +1 @@
from .basic import smpl_from_keypoints3d, smpl_from_keypoints3d2d


@ -0,0 +1,89 @@
'''
@ Date: 2021-04-13 20:43:16
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-14 13:38:34
@ FilePath: /EasyMocapRelease/easymocap/pipeline/basic.py
'''
from ..pyfitting import optimizeShape, optimizePose2D, optimizePose3D
from ..smplmodel import init_params
from ..mytools import Timer
from ..dataset import CONFIG
from .weight import load_weight_pose, load_weight_shape
from .config import Config
def multi_stage_optimize(body_model, params, kp3ds, kp2ds=None, bboxes=None, Pall=None, weight={}, cfg=None):
    with Timer('Optimize global RT'):
        cfg.OPT_R = True
        cfg.OPT_T = True
        params = optimizePose3D(body_model, params, kp3ds, weight=weight, cfg=cfg)
        # params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
    with Timer('Optimize 3D Pose/{} frames'.format(kp3ds.shape[0])):
        cfg.OPT_POSE = True
        cfg.ROBUST_3D = False
        params = optimizePose3D(body_model, params, kp3ds, weight=weight, cfg=cfg)
        if False:
            cfg.ROBUST_3D = True
            params = optimizePose3D(body_model, params, kp3ds, weight=weight, cfg=cfg)
        if cfg.model in ['smplh', 'smplx']:
            cfg.OPT_HAND = True
            params = optimizePose3D(body_model, params, kp3ds, weight=weight, cfg=cfg)
        if cfg.model == 'smplx':
            cfg.OPT_EXPR = True
            params = optimizePose3D(body_model, params, kp3ds, weight=weight, cfg=cfg)
    if kp2ds is not None:
        with Timer('Optimize 2D Pose/{} frames'.format(kp3ds.shape[0])):
            # bboxes => (nFrames, nViews, 5), keypoints2d => (nFrames, nViews, nJoints, 3)
            params = optimizePose2D(body_model, params, bboxes, kp2ds, Pall, weight=weight, cfg=cfg)
    return params
def smpl_from_keypoints3d2d(body_model, kp3ds, kp2ds, bboxes, Pall, config, args,
    weight_shape=None, weight_pose=None):
    model_type = body_model.model_type
    params_init = init_params(nFrames=1, model_type=model_type)
    if weight_shape is None:
        weight_shape = load_weight_shape(args.opts)
    if model_type in ['smpl', 'smplh', 'smplx']:
        # when using an SMPL-family model, optimize the shape with limbs 1-14 only;
        # the first limb (nose, neck) is skipped
        params_shape = optimizeShape(body_model, params_init, kp3ds,
            weight_loss=weight_shape, kintree=CONFIG['body15']['kintree'][1:])
    else:
        params_shape = optimizeShape(body_model, params_init, kp3ds,
            weight_loss=weight_shape, kintree=config['kintree'])
    # optimize 3D pose
    cfg = Config(args)
    cfg.device = body_model.device
    params = init_params(nFrames=kp3ds.shape[0], model_type=model_type)
    params['shapes'] = params_shape['shapes'].copy()
    if weight_pose is None:
        weight_pose = load_weight_pose(model_type, args.opts)
    # This step is split into two functions so that different initialization methods can be used
    params = multi_stage_optimize(body_model, params, kp3ds, kp2ds, bboxes, Pall, weight_pose, cfg)
    return params

def smpl_from_keypoints3d(body_model, kp3ds, config, args,
    weight_shape=None, weight_pose=None):
    model_type = body_model.model_type
    params_init = init_params(nFrames=1, model_type=model_type)
    if weight_shape is None:
        weight_shape = load_weight_shape(args.opts)
    if model_type in ['smpl', 'smplh', 'smplx']:
        # when using an SMPL-family model, optimize the shape with limbs 1-14 only;
        # the first limb (nose, neck) is skipped
        params_shape = optimizeShape(body_model, params_init, kp3ds,
            weight_loss=weight_shape, kintree=CONFIG['body15']['kintree'][1:])
    else:
        params_shape = optimizeShape(body_model, params_init, kp3ds,
            weight_loss=weight_shape, kintree=config['kintree'])
    # optimize 3D pose
    cfg = Config(args)
    cfg.device = body_model.device
    cfg.model_type = model_type
    params = init_params(nFrames=kp3ds.shape[0], model_type=model_type)
    params['shapes'] = params_shape['shapes'].copy()
    if weight_pose is None:
        weight_pose = load_weight_pose(model_type, args.opts)
    # This step is split into two functions so that different initialization methods can be used
    params = multi_stage_optimize(body_model, params, kp3ds, None, None, None, weight_pose, cfg)
    return params


@ -0,0 +1,18 @@
class Config:
    OPT_R = False
    OPT_T = False
    OPT_POSE = False
    OPT_SHAPE = False
    OPT_HAND = False
    OPT_EXPR = False
    ROBUST_3D_ = False
    ROBUST_3D = False
    verbose = False
    model = 'smpl'
    device = None
    def __init__(self, args=None) -> None:
        if args is not None:
            self.verbose = args.verbose
            self.model = args.model
            self.ROBUST_3D_ = args.robust3d
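`Config` keeps its flags as class attributes, and the pipeline code switches them on one stage at a time by assigning to an instance. The following sketch shows why that is safe: assigning to an instance creates an instance attribute and leaves the class defaults untouched for later `Config()` objects. `StageConfig` and `enabled` are hypothetical names used only for this illustration.

```python
class StageConfig:
    # class-level defaults, shared until an instance overrides them
    OPT_R = False
    OPT_T = False
    OPT_POSE = False


def enabled(cfg):
    """List the optimization flags that are currently switched on."""
    return [name for name in ('OPT_R', 'OPT_T', 'OPT_POSE') if getattr(cfg, name)]


cfg = StageConfig()
stages = []

# stage 1: only the global rotation and translation are optimized
cfg.OPT_R = True
cfg.OPT_T = True
stages.append(enabled(cfg))

# stage 2: the pose is optimized as well; earlier flags stay on
cfg.OPT_POSE = True
stages.append(enabled(cfg))

# a new instance still sees the untouched class defaults
fresh = StageConfig()
```

Because the flags accumulate, each later stage refines everything the earlier stages already optimized.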


@ -0,0 +1,54 @@
from .config import Config
from ..mytools import Timer
from ..pyfitting import optimizeMirrorSoft, optimizeMirrorDirect
def load_weight_mirror(model, opts):
    if model == 'smpl':
        weight = {
            'k2d': 2e-4,
            'init_poses': 1e-3, 'init_shapes': 1e-2,
            'smooth_body': 5e-1, 'smooth_poses': 1e-1,
            'par_self': 5e-2, 'ver_self': 2e-2,
            'par_mirror': 5e-2
        }
    elif model == 'smplh':
        weight = {'repro': 1, 'repro_hand': 0.1,
            'init_poses': 10., 'init_shapes': 10., 'init_Th': 0.,
            'reg_poses': 0., 'reg_shapes':10., 'reg_poses_zero': 10.,
            # 'smooth_poses': 100., 'smooth_Rh': 1000., 'smooth_Th': 1000.,
            'parallel_self': 10., 'vertical_self': 10., 'parallel_mirror': 0.
        }
    elif model == 'smplx':
        weight = {'repro': 1, 'repro_hand': 0.2, 'repro_face': 1.,
            'init_poses': 1., 'init_shapes': 0., 'init_Th': 0.,
            'reg_poses': 0., 'reg_shapes': 10., 'reg_poses_zero': 10., 'reg_head': 1., 'reg_expression': 1.,
            # 'smooth_body': 1., 'smooth_hand': 10.,
            # 'smooth_poses_l1': 1.,
            'parallel_self': 1., 'vertical_self': 1., 'parallel_mirror': 0.}
    else:
        weight = {}
    for key in opts.keys():
        if key in weight.keys():
            weight[key] = opts[key]
    return weight

def multi_stage_optimize(body_model, body_params, bboxes, keypoints2d, Pall, normal, args):
    weight = load_weight_mirror(args.model, args.opts)
    config = Config()
    config.device = body_model.device
    config.verbose = args.verbose
    config.OPT_R = True
    config.OPT_T = True
    config.OPT_SHAPE = True
    with Timer('Optimize 2D Pose/{} frames'.format(keypoints2d.shape[1]), not args.verbose):
        if args.direct:
            config.OPT_POSE = False
            body_params = optimizeMirrorDirect(body_model, body_params, bboxes, keypoints2d, Pall, normal, weight, config)
            config.OPT_POSE = True
            body_params = optimizeMirrorDirect(body_model, body_params, bboxes, keypoints2d, Pall, normal, weight, config)
        else:
            config.OPT_POSE = False
            body_params = optimizeMirrorSoft(body_model, body_params, bboxes, keypoints2d, Pall, normal, weight, config)
            config.OPT_POSE = True
            body_params = optimizeMirrorSoft(body_model, body_params, bboxes, keypoints2d, Pall, normal, weight, config)
    return body_params


@ -0,0 +1,43 @@
'''
@ Date: 2021-04-13 20:12:58
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 22:51:39
@ FilePath: /EasyMocapRelease/easymocap/pipeline/weight.py
'''
def load_weight_shape(opts):
    weight = {'s3d': 1., 'reg_shapes': 5e-3}
    for key in opts.keys():
        if key in weight.keys():
            weight[key] = opts[key]
    return weight

def load_weight_pose(model, opts):
    if model == 'smpl':
        weight = {
            'k3d': 1., 'reg_poses_zero': 1e-2, 'smooth_body': 5e-1,
            'smooth_poses': 1e-1, 'reg_poses': 1e-3,
            'k2d': 1e-4
        }
    elif model == 'smplh':
        weight = {
            'k3d': 1., 'k3d_hand': 5.,
            'reg_poses_zero': 1e-2,
            'smooth_body': 5e-1, 'smooth_poses': 1e-1, 'smooth_hand': 1e-3,
            'reg_hand': 1e-4,
            'k2d': 1e-4
        }
    elif model == 'smplx':
        weight = {
            'k3d': 1., 'k3d_hand': 5., 'k3d_face': 2.,
            'reg_poses_zero': 1e-2,
            'smooth_body': 5e-1, 'smooth_poses': 1e-1, 'smooth_hand': 1e-3,
            'reg_hand': 1e-4, 'reg_expr': 1e-2, 'reg_head': 1e-2,
            'k2d': 1e-4
        }
    else:
        raise NotImplementedError
    for key in opts.keys():
        if key in weight.keys():
            weight[key] = opts[key]
    return weight
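Both weight loaders end with the same override rule: a key from `opts` replaces a default only if that key already exists, so unknown keys are silently ignored. A small sketch of that rule, using a hypothetical helper `override_weights` and made-up option values:

```python
def override_weights(defaults, opts):
    """Copy `defaults` and overwrite only the keys that already exist in it.

    Hypothetical helper that isolates the override loop used by the
    weight loaders above; it is not part of EasyMocap.
    """
    weight = dict(defaults)  # copy, so the defaults stay untouched
    for key, value in opts.items():
        if key in weight:
            weight[key] = value
    return weight


defaults = {'k3d': 1.0, 'smooth_body': 0.5, 'k2d': 1e-4}
# 'unknown_term' is not a known loss term, so it is dropped
weight = override_weights(defaults, {'smooth_body': 1.0, 'unknown_term': 3.0})
```

One consequence worth keeping in mind: a misspelled key in `opts` has no effect and raises no error.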


@ -0,0 +1,9 @@
'''
@ Date: 2021-04-03 17:00:31
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-03 17:06:05
@ FilePath: /EasyMocap/easymocap/pyfitting/__init__.py
'''
from .optimize_simple import optimizePose2D, optimizePose3D, optimizeShape
from .optimize_mirror import optimizeMirrorDirect, optimizeMirrorSoft


@ -0,0 +1,471 @@
import torch
from functools import reduce
from torch.optim.optimizer import Optimizer
def _cubic_interpolate(x1, f1, g1, x2, f2, g2, bounds=None):
    # ported from https://github.com/torch/optim/blob/master/polyinterp.lua
    # Compute bounds of interpolation area
    if bounds is not None:
        xmin_bound, xmax_bound = bounds
    else:
        xmin_bound, xmax_bound = (x1, x2) if x1 <= x2 else (x2, x1)
    # Code for most common case: cubic interpolation of 2 points
    #   w/ function and derivative values for both
    # Solution in this case (where x2 is the farthest point):
    #   d1 = g1 + g2 - 3*(f1-f2)/(x1-x2);
    #   d2 = sqrt(d1^2 - g1*g2);
    #   min_pos = x2 - (x2 - x1)*((g2 + d2 - d1)/(g2 - g1 + 2*d2));
    #   t_new = min(max(min_pos,xmin_bound),xmax_bound);
    d1 = g1 + g2 - 3 * (f1 - f2) / (x1 - x2)
    d2_square = d1**2 - g1 * g2
    if d2_square >= 0:
        d2 = d2_square.sqrt()
        if x1 <= x2:
            min_pos = x2 - (x2 - x1) * ((g2 + d2 - d1) / (g2 - g1 + 2 * d2))
        else:
            min_pos = x1 - (x1 - x2) * ((g1 + d2 - d1) / (g1 - g2 + 2 * d2))
        return min(max(min_pos, xmin_bound), xmax_bound)
    else:
        return (xmin_bound + xmax_bound) / 2.
def _strong_wolfe(obj_func,
                  x,
                  t,
                  d,
                  f,
                  g,
                  gtd,
                  c1=1e-4,
                  c2=0.9,
                  tolerance_change=1e-9,
                  max_ls=25):
    # ported from https://github.com/torch/optim/blob/master/lswolfe.lua
    d_norm = d.abs().max()
    g = g.clone()
    # evaluate objective and gradient using initial step
    f_new, g_new = obj_func(x, t, d)
    ls_func_evals = 1
    gtd_new = g_new.dot(d)
    # bracket an interval containing a point satisfying the Wolfe criteria
    t_prev, f_prev, g_prev, gtd_prev = 0, f, g, gtd
    done = False
    ls_iter = 0
    while ls_iter < max_ls:
        # check conditions
        if f_new > (f + c1 * t * gtd) or (ls_iter > 1 and f_new >= f_prev):
            bracket = [t_prev, t]
            bracket_f = [f_prev, f_new]
            bracket_g = [g_prev, g_new.clone()]
            bracket_gtd = [gtd_prev, gtd_new]
            break
        if abs(gtd_new) <= -c2 * gtd:
            bracket = [t]
            bracket_f = [f_new]
            bracket_g = [g_new]
            done = True
            break
        if gtd_new >= 0:
            bracket = [t_prev, t]
            bracket_f = [f_prev, f_new]
            bracket_g = [g_prev, g_new.clone()]
            bracket_gtd = [gtd_prev, gtd_new]
            break
        # interpolate
        min_step = t + 0.01 * (t - t_prev)
        max_step = t * 10
        tmp = t
        t = _cubic_interpolate(
            t_prev,
            f_prev,
            gtd_prev,
            t,
            f_new,
            gtd_new,
            bounds=(min_step, max_step))
        # next step
        t_prev = tmp
        f_prev = f_new
        g_prev = g_new.clone()
        gtd_prev = gtd_new
        f_new, g_new = obj_func(x, t, d)
        ls_func_evals += 1
        gtd_new = g_new.dot(d)
        ls_iter += 1
    # reached max number of iterations?
    if ls_iter == max_ls:
        bracket = [0, t]
        bracket_f = [f, f_new]
        bracket_g = [g, g_new]
    # zoom phase: we now have a point satisfying the criteria, or
    # a bracket around it. We refine the bracket until we find the
    # exact point satisfying the criteria
    insuf_progress = False
    # find high and low points in bracket
    low_pos, high_pos = (0, 1) if bracket_f[0] <= bracket_f[-1] else (1, 0)
    while not done and ls_iter < max_ls:
        # compute new trial value
        t = _cubic_interpolate(bracket[0], bracket_f[0], bracket_gtd[0],
                               bracket[1], bracket_f[1], bracket_gtd[1])
        # test that we are making sufficient progress:
        # in case `t` is so close to boundary, we mark that we are making
        # insufficient progress, and if
        #   + we have made insufficient progress in the last step, or
        #   + `t` is at one of the boundary,
        # we will move `t` to a position which is `0.1 * len(bracket)`
        # away from the nearest boundary point.
        eps = 0.1 * (max(bracket) - min(bracket))
        if min(max(bracket) - t, t - min(bracket)) < eps:
            # interpolation close to boundary
            if insuf_progress or t >= max(bracket) or t <= min(bracket):
                # evaluate at 0.1 away from boundary
                if abs(t - max(bracket)) < abs(t - min(bracket)):
                    t = max(bracket) - eps
                else:
                    t = min(bracket) + eps
                insuf_progress = False
            else:
                insuf_progress = True
        else:
            insuf_progress = False
        # Evaluate new point
        f_new, g_new = obj_func(x, t, d)
        ls_func_evals += 1
        gtd_new = g_new.dot(d)
        ls_iter += 1
        if f_new > (f + c1 * t * gtd) or f_new >= bracket_f[low_pos]:
            # Armijo condition not satisfied or not lower than lowest point
            bracket[high_pos] = t
            bracket_f[high_pos] = f_new
            bracket_g[high_pos] = g_new.clone()
            bracket_gtd[high_pos] = gtd_new
            low_pos, high_pos = (0, 1) if bracket_f[0] <= bracket_f[1] else (1, 0)
        else:
            if abs(gtd_new) <= -c2 * gtd:
                # Wolfe conditions satisfied
                done = True
            elif gtd_new * (bracket[high_pos] - bracket[low_pos]) >= 0:
                # old high becomes new low
                bracket[high_pos] = bracket[low_pos]
                bracket_f[high_pos] = bracket_f[low_pos]
                bracket_g[high_pos] = bracket_g[low_pos]
                bracket_gtd[high_pos] = bracket_gtd[low_pos]
            # new point becomes new low
            bracket[low_pos] = t
            bracket_f[low_pos] = f_new
            bracket_g[low_pos] = g_new.clone()
            bracket_gtd[low_pos] = gtd_new
        # line-search bracket is so small
        if abs(bracket[1] - bracket[0]) * d_norm < tolerance_change:
            break
    # return stuff
    t = bracket[low_pos]
    f_new = bracket_f[low_pos]
    g_new = bracket_g[low_pos]
    return f_new, g_new, t, ls_func_evals
class LBFGS(Optimizer):
    """Implements L-BFGS algorithm, heavily inspired by `minFunc
    <https://www.cs.ubc.ca/~schmidtm/Software/minFunc.html>`.

    .. warning::
        This optimizer doesn't support per-parameter options and parameter
        groups (there can be only one).

    .. warning::
        Right now all parameters have to be on a single device. This will be
        improved in the future.

    .. note::
        This is a very memory intensive optimizer (it requires additional
        ``param_bytes * (history_size + 1)`` bytes). If it doesn't fit in memory
        try reducing the history size, or use a different algorithm.

    Arguments:
        lr (float): learning rate (default: 1)
        max_iter (int): maximal number of iterations per optimization step
            (default: 20)
        max_eval (int): maximal number of function evaluations per optimization
            step (default: max_iter * 1.25).
        tolerance_grad (float): termination tolerance on first order optimality
            (default: 1e-5).
        tolerance_change (float): termination tolerance on function
            value/parameter changes (default: 1e-9).
        history_size (int): update history size (default: 100).
        line_search_fn (str): either 'strong_wolfe' or None (default: None).
    """
    def __init__(self,
                 params,
                 lr=1,
                 max_iter=20,
                 max_eval=None,
                 tolerance_grad=1e-5,
                 tolerance_change=1e-9,
                 history_size=100,
                 line_search_fn=None):
        if max_eval is None:
            max_eval = max_iter * 5 // 4
        defaults = dict(
            lr=lr,
            max_iter=max_iter,
            max_eval=max_eval,
            tolerance_grad=tolerance_grad,
            tolerance_change=tolerance_change,
            history_size=history_size,
            line_search_fn=line_search_fn)
        super(LBFGS, self).__init__(params, defaults)
        if len(self.param_groups) != 1:
            raise ValueError("LBFGS doesn't support per-parameter options "
                             "(parameter groups)")
        self._params = self.param_groups[0]['params']
        self._numel_cache = None
    def _numel(self):
        if self._numel_cache is None:
            self._numel_cache = reduce(lambda total, p: total + p.numel(), self._params, 0)
        return self._numel_cache

    def _gather_flat_grad(self):
        views = []
        for p in self._params:
            if p.grad is None:
                view = p.new(p.numel()).zero_()
            elif p.grad.is_sparse:
                view = p.grad.to_dense().view(-1)
            else:
                view = p.grad.view(-1)
            views.append(view)
        return torch.cat(views, 0)

    def _add_grad(self, step_size, update):
        offset = 0
        for p in self._params:
            numel = p.numel()
            # view as to avoid deprecated pointwise semantics
            # add_(scalar, tensor) is deprecated in recent PyTorch; pass the scale as alpha
            p.data.add_(update[offset:offset + numel].view_as(p.data), alpha=step_size)
            offset += numel
        assert offset == self._numel()

    def _clone_param(self):
        return [p.clone() for p in self._params]

    def _set_param(self, params_data):
        for p, pdata in zip(self._params, params_data):
            p.data.copy_(pdata)

    def _directional_evaluate(self, closure, x, t, d):
        # move to x + t*d, evaluate loss and gradient there, then restore x
        self._add_grad(t, d)
        loss = float(closure())
        flat_grad = self._gather_flat_grad()
        self._set_param(x)
        return loss, flat_grad
    def step(self, closure):
        """Performs a single optimization step.

        Arguments:
            closure (callable): A closure that reevaluates the model
                and returns the loss.
        """
        assert len(self.param_groups) == 1
        group = self.param_groups[0]
        lr = group['lr']
        max_iter = group['max_iter']
        max_eval = group['max_eval']
        tolerance_grad = group['tolerance_grad']
        tolerance_change = group['tolerance_change']
        line_search_fn = group['line_search_fn']
        history_size = group['history_size']
        # NOTE: LBFGS has only global state, but we register it as state for
        # the first param, because this helps with casting in load_state_dict
        state = self.state[self._params[0]]
        state.setdefault('func_evals', 0)
        state.setdefault('n_iter', 0)
        # evaluate initial f(x) and df/dx
        orig_loss = closure()
        loss = float(orig_loss)
        current_evals = 1
        state['func_evals'] += 1
        flat_grad = self._gather_flat_grad()
        opt_cond = flat_grad.abs().max() <= tolerance_grad
        # optimal condition
        if opt_cond:
            return orig_loss
        # tensors cached in state (for tracing)
        d = state.get('d')
        t = state.get('t')
        old_dirs = state.get('old_dirs')
        old_stps = state.get('old_stps')
        ro = state.get('ro')
        H_diag = state.get('H_diag')
        prev_flat_grad = state.get('prev_flat_grad')
        prev_loss = state.get('prev_loss')
        n_iter = 0
        # optimize for a max of max_iter iterations
        while n_iter < max_iter:
            # keep track of nb of iterations
            n_iter += 1
            state['n_iter'] += 1
            ############################################################
            # compute gradient descent direction
            ############################################################
            if state['n_iter'] == 1:
                d = flat_grad.neg()
                old_dirs = []
                old_stps = []
                ro = []
                H_diag = 1
            else:
                # do lbfgs update (update memory)
                y = flat_grad.sub(prev_flat_grad)
                s = d.mul(t)
                ys = y.dot(s)  # y*s
                if ys > 1e-10:
                    # updating memory
                    if len(old_dirs) == history_size:
                        # shift history by one (limited-memory)
                        old_dirs.pop(0)
                        old_stps.pop(0)
                        ro.pop(0)
                    # store new direction/step
                    old_dirs.append(y)
                    old_stps.append(s)
                    ro.append(1. / ys)
                    # update scale of initial Hessian approximation
                    H_diag = ys / y.dot(y)  # (y*y)
                # compute the approximate (L-BFGS) inverse Hessian
                # multiplied by the gradient
                num_old = len(old_dirs)
                if 'al' not in state:
                    state['al'] = [None] * history_size
                al = state['al']
                # iteration in L-BFGS loop collapsed to use just one buffer
                q = flat_grad.neg()
                for i in range(num_old - 1, -1, -1):
                    al[i] = old_stps[i].dot(q) * ro[i]
                    # add_(scalar, tensor) is deprecated; pass the scale as alpha
                    q.add_(old_dirs[i], alpha=-al[i])
                # multiply by initial Hessian
                # r/d is the final direction
                d = r = torch.mul(q, H_diag)
                for i in range(num_old):
                    be_i = old_dirs[i].dot(r) * ro[i]
                    r.add_(old_stps[i], alpha=al[i] - be_i)
            if prev_flat_grad is None:
                prev_flat_grad = flat_grad.clone()
            else:
                prev_flat_grad.copy_(flat_grad)
            prev_loss = loss
            ############################################################
            # compute step length
            ############################################################
            # reset initial guess for step size
            if state['n_iter'] == 1:
                t = min(1., 1. / flat_grad.abs().sum()) * lr
            else:
                t = lr
            # directional derivative
            gtd = flat_grad.dot(d)  # g * d
            # directional derivative is below tolerance
            if gtd > -tolerance_change:
                break
            # optional line search: user function
            ls_func_evals = 0
            if line_search_fn is not None:
                # perform line search, using user function
                if line_search_fn != "strong_wolfe":
                    raise RuntimeError("only 'strong_wolfe' is supported")
                else:
                    x_init = self._clone_param()
                    def obj_func(x, t, d):
                        return self._directional_evaluate(closure, x, t, d)
                    loss, flat_grad, t, ls_func_evals = _strong_wolfe(
                        obj_func, x_init, t, d, loss, flat_grad, gtd)
                self._add_grad(t, d)
                opt_cond = flat_grad.abs().max() <= tolerance_grad
            else:
                # no line search, simply move with fixed-step
                self._add_grad(t, d)
                if n_iter != max_iter:
                    # re-evaluate function only if not in last iteration
                    # the reason we do this: in a stochastic setting,
                    # no use to re-evaluate that function here
                    loss = float(closure())
                    flat_grad = self._gather_flat_grad()
                    opt_cond = flat_grad.abs().max() <= tolerance_grad
                    ls_func_evals = 1
            # update func eval
            current_evals += ls_func_evals
            state['func_evals'] += ls_func_evals
            ############################################################
            # check conditions
            ############################################################
            if n_iter == max_iter:
                break
            if current_evals >= max_eval:
                break
            # optimal condition
            if opt_cond:
                break
            # lack of progress
            if d.mul(t).abs().max() <= tolerance_change:
                break
            if abs(loss - prev_loss) < tolerance_change:
                break
        state['d'] = d
        state['t'] = t
        state['old_dirs'] = old_dirs
        state['old_stps'] = old_stps
        state['ro'] = ro
        state['H_diag'] = H_diag
        state['prev_flat_grad'] = prev_flat_grad
        state['prev_loss'] = prev_loss
        return orig_loss
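The line search above relies on `_cubic_interpolate` to pick the next trial step from two points with known values and slopes. As a sanity check of the closed-form solution quoted in its comments, here is a scalar sketch in plain Python. The name `cubic_min` is hypothetical, and `math.sqrt` replaces the tensor `.sqrt()` call; the formula itself is copied from the code above.

```python
import math


def cubic_min(x1, f1, g1, x2, f2, g2, bounds=None):
    """Scalar version of the cubic-interpolation step used by the line search.

    (x1, f1, g1) and (x2, f2, g2) are position, value and slope at two points.
    Returns the minimizer of the interpolating cubic, clamped to the bounds,
    or the midpoint of the bounds when the cubic has no real minimizer.
    """
    if bounds is not None:
        xmin_bound, xmax_bound = bounds
    else:
        xmin_bound, xmax_bound = (x1, x2) if x1 <= x2 else (x2, x1)
    d1 = g1 + g2 - 3 * (f1 - f2) / (x1 - x2)
    d2_square = d1 ** 2 - g1 * g2
    if d2_square < 0:
        return (xmin_bound + xmax_bound) / 2.
    d2 = math.sqrt(d2_square)
    if x1 <= x2:
        min_pos = x2 - (x2 - x1) * ((g2 + d2 - d1) / (g2 - g1 + 2 * d2))
    else:
        min_pos = x1 - (x1 - x2) * ((g1 + d2 - d1) / (g1 - g2 + 2 * d2))
    return min(max(min_pos, xmin_bound), xmax_bound)


# f(x) = (x - 1)^2 sampled at x = 0 and x = 3: the minimizer is x = 1
t_quadratic = cubic_min(0., 1., -2., 3., 4., 4.)
# inconsistent data (no real minimizer): falls back to the midpoint
t_fallback = cubic_min(0., 0., 1., 1., 0.5, 1.)
```

For the quadratic test function the interpolating cubic is exact, so the returned step matches the true minimizer up to rounding.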


@ -0,0 +1,419 @@
'''
@ Date: 2020-11-19 17:46:04
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-14 11:46:56
@ FilePath: /EasyMocap/easymocap/pyfitting/lossfactory.py
'''
import numpy as np
import torch
from .operation import projection, batch_rodrigues
funcl2 = lambda x: torch.sum(x**2)
# L1 norm: sum of absolute values (abs(x**2) would just repeat the L2 loss)
funcl1 = lambda x: torch.sum(torch.abs(x))

def gmof(squared_res, sigma_squared):
    """
    Geman-McClure error function
    """
    return (sigma_squared * squared_res) / (sigma_squared + squared_res)

def ReprojectionLoss(keypoints3d, keypoints2d, K, Rc, Tc, inv_bbox_sizes, norm='l2'):
    img_points = projection(keypoints3d, K, Rc, Tc)
    # weight the residual by the 2D keypoint confidence
    residual = (img_points - keypoints2d[:, :, :2]) * keypoints2d[:, :, -1:]
    # squared_res: (nFrames, nJoints, 2)
    if norm == 'l2':
        squared_res = (residual ** 2) * inv_bbox_sizes
    elif norm == 'l1':
        squared_res = torch.abs(residual) * inv_bbox_sizes
    else:
        # unsupported norm: fail explicitly instead of dropping into a debugger
        raise NotImplementedError
    return torch.sum(squared_res)
class LossKeypoints3D:
    def __init__(self, keypoints3d, cfg, norm='l2') -> None:
        self.cfg = cfg
        keypoints3d = torch.Tensor(keypoints3d).to(cfg.device)
        self.nJoints = keypoints3d.shape[1]
        self.keypoints3d = keypoints3d[..., :3]
        self.conf = keypoints3d[..., 3:]
        self.nFrames = keypoints3d.shape[0]
        self.norm = norm

    def loss(self, diff_square):
        # despite its name, diff_square holds confidence-weighted differences
        # that have not been squared yet
        if self.norm == 'l2':
            loss_3d = funcl2(diff_square)
        elif self.norm == 'l1':
            loss_3d = funcl1(diff_square)
        elif self.norm == 'gm':
            # threshold set to 0.2^2 m
            loss_3d = torch.sum(gmof(diff_square**2, 0.04))
        else:
            raise NotImplementedError
        return loss_3d/self.nFrames

    def body(self, kpts_est, **kwargs):
        "distance of keypoints3d"
        nJoints = min([kpts_est.shape[1], self.keypoints3d.shape[1], 25])
        diff_square = (kpts_est[:, :nJoints, :3] - self.keypoints3d[:, :nJoints, :3])*self.conf[:, :nJoints]
        return self.loss(diff_square)

    def hand(self, kpts_est, **kwargs):
        "distance of 3d hand keypoints"
        diff_square = (kpts_est[:, 25:25+42, :3] - self.keypoints3d[:, 25:25+42, :3])*self.conf[:, 25:25+42]
        return self.loss(diff_square)

    def face(self, kpts_est, **kwargs):
        "distance of 3d face keypoints"
        diff_square = (kpts_est[:, 25+42:, :3] - self.keypoints3d[:, 25+42:, :3])*self.conf[:, 25+42:]
        return self.loss(diff_square)

    def __str__(self) -> str:
        return 'Loss function for keypoints3D, norm = {}'.format(self.norm)
class LossRegPoses:
    def __init__(self, cfg) -> None:
        self.cfg = cfg

    def reg_hand(self, poses, **kwargs):
        "regularizer for hand pose"
        assert self.cfg.model in ['smplh', 'smplx']
        hand_poses = poses[:, 66:78]
        loss = funcl2(hand_poses)
        return loss/poses.shape[0]

    def reg_head(self, poses, **kwargs):
        "regularizer for head pose"
        assert self.cfg.model in ['smplx']
        poses = poses[:, 78:]
        loss = funcl2(poses)
        return loss/poses.shape[0]

    def reg_expr(self, expression, **kwargs):
        "regularizer for expression"
        assert self.cfg.model in ['smplh', 'smplx']
        return torch.sum(expression**2)

    def reg_body(self, poses, **kwargs):
        "regularizer for body poses"
        if self.cfg.model in ['smplh', 'smplx']:
            poses = poses[:, :66]
        loss = funcl2(poses)
        return loss/poses.shape[0]

    def __str__(self) -> str:
        return 'Loss function for Regularizer of Poses'

class LossRegPosesZero:
    def __init__(self, keypoints, cfg) -> None:
        model_type = cfg.model
        if keypoints.shape[-2] <= 15:
            use_feet = False
            use_head = False
        else:
            use_feet = keypoints[..., [19, 20, 21, 22, 23, 24], -1].sum() > 0.1
            use_head = keypoints[..., [15, 16, 17, 18], -1].sum() > 0.1
        if model_type == 'smpl':
            SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14, 20, 21, 22, 23]
        elif model_type == 'smplh':
            SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14]
        elif model_type == 'smplx':
            SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14]
        else:
            raise NotImplementedError
        if not use_feet:
            SMPL_JOINT_ZERO_IDX.extend([7, 8])
        if not use_head:
            SMPL_JOINT_ZERO_IDX.extend([12, 15])
        # each joint owns three consecutive axis-angle entries in the pose vector
        SMPL_POSES_ZERO_IDX = [[j for j in range(3*i, 3*i+3)] for i in SMPL_JOINT_ZERO_IDX]
        SMPL_POSES_ZERO_IDX = sum(SMPL_POSES_ZERO_IDX, [])
        # SMPL_POSES_ZERO_IDX.extend([36, 37, 38, 45, 46, 47])
        self.idx = SMPL_POSES_ZERO_IDX

    def __call__(self, poses, **kwargs):
        "regularizer for zero joints"
        return torch.sum(torch.abs(poses[:, self.idx]))/poses.shape[0]

    def __str__(self) -> str:
        return 'Loss function for Regularizer of Poses'
class LossSmoothBody:
    def __init__(self, cfg) -> None:
        self.norm = 'l2'

    def __call__(self, kpts_est, **kwargs):
        N_BODY = min(25, kpts_est.shape[1])
        assert kpts_est.shape[0] > 1, 'The smooth loss requires more than 1 frame'
        if self.norm == 'l2':
            loss = funcl2(kpts_est[:-1, :N_BODY] - kpts_est[1:, :N_BODY])
        else:
            loss = funcl1(kpts_est[:-1, :N_BODY] - kpts_est[1:, :N_BODY])
        return loss/kpts_est.shape[0]

    def __str__(self) -> str:
        return 'Loss function for Smooth of Body'

class LossSmoothBodyMean:
    def __init__(self, cfg) -> None:
        self.cfg = cfg

    def smooth(self, kpts_est, **kwargs):
        "distance of each frame to the mean of its two neighbouring frames"
        kpts_interp = kpts_est.clone().detach()
        kpts_interp[1:-1] = (kpts_interp[:-2] + kpts_interp[2:])/2
        loss = funcl2(kpts_est[1:-1] - kpts_interp[1:-1])
        return loss/(kpts_est.shape[0] - 2)

    def body(self, kpts_est, **kwargs):
        "smooth body"
        return self.smooth(kpts_est[:, :25])

    def hand(self, kpts_est, **kwargs):
        "smooth hand"
        return self.smooth(kpts_est[:, 25:25+42])

    def __str__(self) -> str:
        return 'Loss function for Smooth of Body'
class LossSmoothPoses:
    def __init__(self, nViews, nFrames, cfg=None) -> None:
        self.nViews = nViews
        self.nFrames = nFrames
        self.norm = 'l2'
        self.cfg = cfg

    def _poses(self, poses):
        "smooth poses"
        loss = 0
        for nv in range(self.nViews):
            poses_ = poses[nv*self.nFrames:(nv+1)*self.nFrames, ]
            # compute the interpolated poses (3-frame moving average)
            poses_interp = poses_.clone().detach()
            poses_interp[1:-1] = (poses_interp[1:-1] + poses_interp[:-2] + poses_interp[2:])/3
            loss += funcl2(poses_[1:-1] - poses_interp[1:-1])
        return loss/(self.nFrames-2)/self.nViews

    def poses(self, poses, **kwargs):
        "smooth body poses"
        if self.cfg.model in ['smplh', 'smplx']:
            poses = poses[:, :66]
        return self._poses(poses)

    def hands(self, poses, **kwargs):
        "smooth hand poses"
        if self.cfg.model in ['smplh', 'smplx']:
            poses = poses[:, 66:66+12]
        else:
            raise NotImplementedError
        return self._poses(poses)

    def head(self, poses, **kwargs):
        "smooth head poses"
        if self.cfg.model == 'smplx':
            poses = poses[:, 66+12:]
        else:
            raise NotImplementedError
        return self._poses(poses)

    def __str__(self) -> str:
        return 'Loss function for Smooth of Body'
class LossSmoothBodyMulti(LossSmoothBody):
    def __init__(self, dimGroups, cfg) -> None:
        super().__init__(cfg)
        self.cfg = cfg
        self.dimGroups = dimGroups

    def __call__(self, kpts_est, **kwargs):
        "Smooth body"
        assert kpts_est.shape[0] > 1, 'The smooth loss requires more than 1 frame'
        loss = 0
        # dimGroups holds the start offsets of each group of frames
        for nv in range(len(self.dimGroups) - 1):
            kpts = kpts_est[self.dimGroups[nv]:self.dimGroups[nv+1]]
            loss += super().__call__(kpts_est=kpts)
        return loss/(len(self.dimGroups) - 1)

    def __str__(self) -> str:
        return 'Loss function for Multi Smooth of Body'

class LossSmoothPosesMulti:
    def __init__(self, dimGroups, cfg) -> None:
        self.dimGroups = dimGroups
        self.norm = 'l2'

    def __call__(self, poses, **kwargs):
        "Smooth poses"
        loss = 0
        for nv in range(len(self.dimGroups) - 1):
            poses_ = poses[self.dimGroups[nv]:self.dimGroups[nv+1]]
            poses_interp = poses_.clone().detach()
            poses_interp[1:-1] = (poses_interp[1:-1] + poses_interp[:-2] + poses_interp[2:])/3
            loss += funcl2(poses_[1:-1] - poses_interp[1:-1])/(poses_.shape[0] - 2)
        return loss/(len(self.dimGroups) - 1)

    def __str__(self) -> str:
        return 'Loss function for Multi Smooth of Poses'
class LossRepro:
def __init__(self, bboxes, keypoints2d, cfg) -> None:
device = cfg.device
bbox_sizes = np.maximum(bboxes[..., 2] - bboxes[..., 0], bboxes[..., 3] - bboxes[..., 1])
# 这里的valid不是一维的因为不清楚总共有多少维所以不能遍历去做
bbox_conf = bboxes[..., 4]
bbox_mean_axis = -1
bbox_sizes = (bbox_sizes * bbox_conf).sum(axis=bbox_mean_axis)/(1e-3 + bbox_conf.sum(axis=bbox_mean_axis))
bbox_sizes = bbox_sizes[..., None, None, None]
# suppress completely invisible views: a huge bbox size drives their confidence to ~0
bbox_sizes[bbox_sizes < 10] = 1e6
inv_bbox_sizes = torch.Tensor(1./bbox_sizes).to(device)
keypoints2d = torch.Tensor(keypoints2d).to(device)
self.keypoints2d = keypoints2d[..., :2]
self.conf = keypoints2d[..., 2:] * inv_bbox_sizes * 100
self.norm = 'gm'
def __call__(self, img_points):
residual = (img_points - self.keypoints2d) * self.conf
# squared_res: (nFrames, nJoints, 2)
if self.norm == 'l2':
squared_res = residual ** 2
elif self.norm == 'l1':
squared_res = torch.abs(residual)
elif self.norm == 'gm':
squared_res = gmof(residual**2, 200)
else:
raise NotImplementedError('unknown norm: {}'.format(self.norm))
return torch.sum(squared_res)
class LossInit:
def __init__(self, params, cfg) -> None:
self.norm = 'l2'
self.poses = torch.Tensor(params['poses']).to(cfg.device)
self.shapes = torch.Tensor(params['shapes']).to(cfg.device)
def init_poses(self, poses, **kwargs):
"distance to poses_0"
if self.norm == 'l2':
return torch.sum((poses - self.poses)**2)/poses.shape[0]
def init_shapes(self, shapes, **kwargs):
"distance to shapes_0"
if self.norm == 'l2':
return torch.sum((shapes - self.shapes)**2)/shapes.shape[0]
class LossKeypointsMV2D(LossRepro):
def __init__(self, keypoints2d, bboxes, Pall, cfg) -> None:
"""
Args:
keypoints2d (ndarray): (nViews, nFrames, nJoints, 3)
bboxes (ndarray): (nViews, nFrames, 5)
"""
super().__init__(bboxes, keypoints2d, cfg)
assert Pall.shape[0] == keypoints2d.shape[0] and Pall.shape[0] == bboxes.shape[0], \
'check your P shape: {} and keypoints2d shape: {}'.format(Pall.shape, keypoints2d.shape)
device = cfg.device
self.Pall = torch.Tensor(Pall).to(device)
self.nViews, self.nFrames, self.nJoints = keypoints2d.shape[:3]
self.kpt_homo = torch.ones((self.nFrames, self.nJoints, 1), device=device)
def __call__(self, kpts_est, **kwargs):
"reprojection loss for multiple views"
# kpts_est: (nFrames, nJoints, 3+1), P: (nViews, 3, 4)
# => projection: (nViews, nFrames, nJoints, 3)
kpts_homo = torch.cat([kpts_est[..., :self.nJoints, :], self.kpt_homo], dim=2)
point_cam = torch.einsum('vab,fnb->vfna', self.Pall, kpts_homo)
img_points = point_cam[..., :2]/point_cam[..., 2:]
return super().__call__(img_points)/self.nViews/self.nFrames
def __str__(self) -> str:
return 'Loss function for Reprojection error'
class SMPLAngleLoss:
def __init__(self, keypoints, model_type='smpl'):
if keypoints.shape[1] <= 15:
use_feet = False
use_head = False
else:
use_feet = keypoints[:, [19, 20, 21, 22, 23, 24], -1].sum() > 0.1
use_head = keypoints[:, [15, 16, 17, 18], -1].sum() > 0.1
if model_type == 'smpl':
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14, 20, 21, 22, 23]
elif model_type == 'smplh':
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14]
elif model_type == 'smplx':
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14]
else:
raise NotImplementedError
if not use_feet:
SMPL_JOINT_ZERO_IDX.extend([7, 8])
if not use_head:
SMPL_JOINT_ZERO_IDX.extend([12, 15])
SMPL_POSES_ZERO_IDX = [[j for j in range(3*i, 3*i+3)] for i in SMPL_JOINT_ZERO_IDX]
SMPL_POSES_ZERO_IDX = sum(SMPL_POSES_ZERO_IDX, [])
# SMPL_POSES_ZERO_IDX.extend([36, 37, 38, 45, 46, 47])
self.idx = SMPL_POSES_ZERO_IDX
def loss(self, poses):
return torch.sum(torch.abs(poses[:, self.idx]))
def SmoothLoss(body_params, keys, weight_loss, span=4, model_type='smpl'):
spans = [i for i in range(1, span)]
span_weights = {i:1/i for i in range(1, span)}
# normalize by the sum of the weights (summing the dict itself would add up its keys)
span_weights = {key: i/sum(span_weights.values()) for key, i in span_weights.items()}
loss_dict = {}
nFrames = body_params['poses'].shape[0]
nPoses = body_params['poses'].shape[1]
if model_type == 'smplh' or model_type == 'smplx':
nPoses = 66
for key in ['poses', 'Th', 'poses_hand', 'expression']:
if key not in keys:
continue
k = 'smooth_' + key
if k in weight_loss.keys() and weight_loss[k] > 0.:
loss_dict[k] = 0.
for span in spans:
if key == 'poses_hand':
val = torch.sum((body_params['poses'][span:, 66:] - body_params['poses'][:nFrames-span, 66:])**2)
else:
val = torch.sum((body_params[key][span:, :nPoses] - body_params[key][:nFrames-span, :nPoses])**2)
loss_dict[k] += span_weights[span] * val
k = 'smooth_' + key + '_l1'
if k in weight_loss.keys() and weight_loss[k] > 0.:
loss_dict[k] = 0.
for span in spans:
if key == 'poses_hand':
val = torch.sum((body_params['poses'][span:, 66:] - body_params['poses'][:nFrames-span, 66:]).abs())
else:
val = torch.sum((body_params[key][span:, :nPoses] - body_params[key][:nFrames-span, :nPoses]).abs())
loss_dict[k] += span_weights[span] * val
# smooth rotation
rot = batch_rodrigues(body_params['Rh'])
key, k = 'Rh', 'smooth_Rh'
if key in keys and k in weight_loss.keys() and weight_loss[k] > 0.:
loss_dict[k] = 0.
for span in spans:
val = torch.sum((rot[span:, :] - rot[:nFrames-span, :])**2)
loss_dict[k] += span_weights[span] * val
return loss_dict
def RegularizationLoss(body_params, body_params_init, weight_loss):
loss_dict = {}
for key in ['poses', 'shapes', 'Th', 'hands', 'head', 'expression']:
if 'init_'+key in weight_loss.keys() and weight_loss['init_'+key] > 0.:
if key == 'poses':
loss_dict['init_'+key] = torch.sum((body_params[key][:, :66] - body_params_init[key][:, :66])**2)
elif key == 'hands':
loss_dict['init_'+key] = torch.sum((body_params['poses'][: , 66:66+12] - body_params_init['poses'][:, 66:66+12])**2)
elif key == 'head':
loss_dict['init_'+key] = torch.sum((body_params['poses'][: , 78:78+9] - body_params_init['poses'][:, 78:78+9])**2)
elif key in body_params.keys():
loss_dict['init_'+key] = torch.sum((body_params[key] - body_params_init[key])**2)
for key in ['poses', 'shapes', 'hands', 'head', 'expression']:
if 'reg_'+key in weight_loss.keys() and weight_loss['reg_'+key] > 0.:
if key == 'poses':
loss_dict['reg_'+key] = torch.sum((body_params[key][:, :66])**2)
elif key == 'hands':
loss_dict['reg_'+key] = torch.sum((body_params['poses'][: , 66:66+12])**2)
elif key == 'head':
loss_dict['reg_'+key] = torch.sum((body_params['poses'][: , 78:78+9])**2)
elif key in body_params.keys():
loss_dict['reg_'+key] = torch.sum((body_params[key])**2)
return loss_dict
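
A minimal sketch of the idea behind `LossSmoothPosesMulti` above: each interior frame is compared against the mean of itself and its two neighbours, with that mean treated as a fixed target. It uses plain Python lists instead of tensors, and it assumes that `funcl2` (not shown in this part of the diff) is a plain sum of squares. The name `smooth_residual` is hypothetical and not part of the repository.

```python
def smooth_residual(seq):
    """Squared deviation of interior frames from a 3-frame moving average.

    seq: list of per-frame vectors (lists of floats), with at least 3 frames.
    The result is divided by the number of interior frames, as in the loss above.
    """
    n_frames = len(seq)
    assert n_frames >= 3, 'need at least 3 frames'
    total = 0.0
    for t in range(1, n_frames - 1):
        for prev, cur, nxt in zip(seq[t - 1], seq[t], seq[t + 1]):
            # the moving average plays the role of the detached target
            target = (prev + cur + nxt) / 3.0
            total += (cur - target) ** 2
    return total / (n_frames - 2)


steady = [[0.1 * t, 1.0] for t in range(5)]   # linear motion: nothing to penalize
jitter = [[0.0], [0.0], [0.9], [0.0], [0.0]]  # single-frame spike
print(smooth_residual(steady))
print(smooth_residual(jitter))
```

Linear motion gives a residual of (numerically) zero, while the one-frame spike is penalized both at the spike and at its two neighbours.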

View File

@ -0,0 +1,74 @@
'''
@ Date: 2020-11-19 11:39:45
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-20 15:06:28
@ FilePath: /EasyMocap/code/pyfitting/operation.py
'''
import torch
def batch_rodrigues(rot_vecs, epsilon=1e-8, dtype=torch.float32):
''' Calculates the rotation matrices for a batch of rotation vectors
Parameters
----------
rot_vecs: torch.tensor Nx3
array of N axis-angle vectors
Returns
-------
R: torch.tensor Nx3x3
The rotation matrices for the given axis-angle parameters
'''
batch_size = rot_vecs.shape[0]
device = rot_vecs.device
angle = torch.norm(rot_vecs + 1e-8, dim=1, keepdim=True)
rot_dir = rot_vecs / angle
cos = torch.unsqueeze(torch.cos(angle), dim=1)
sin = torch.unsqueeze(torch.sin(angle), dim=1)
# Bx1 arrays
rx, ry, rz = torch.split(rot_dir, 1, dim=1)
K = torch.zeros((batch_size, 3, 3), dtype=dtype, device=device)
zeros = torch.zeros((batch_size, 1), dtype=dtype, device=device)
K = torch.cat([zeros, -rz, ry, rz, zeros, -rx, -ry, rx, zeros], dim=1) \
.view((batch_size, 3, 3))
ident = torch.eye(3, dtype=dtype, device=device).unsqueeze(dim=0)
rot_mat = ident + sin * K + (1 - cos) * torch.bmm(K, K)
return rot_mat
def projection(points3d, camera_intri, R=None, T=None, distance=None):
""" project the 3D points onto the image plane
Arguments:
points3d {Tensor} -- (bn, N, 3)
camera_intri {Tensor} -- (bn, 3, 3)
distance {Tensor} -- (bn, 1, 1)
R: bn, 3, 3
T: bn, 3, 1
Returns:
points2d -- (bn, N, 2)
"""
if R is not None:
Rt = torch.transpose(R, 1, 2)
if T.shape[-1] == 1:
Tt = torch.transpose(T, 1, 2)
points3d = torch.matmul(points3d, Rt) + Tt
else:
points3d = torch.matmul(points3d, Rt) + T
if distance is None:
img_points = torch.div(points3d[:, :, :2],
points3d[:, :, 2:3])
else:
img_points = torch.div(points3d[:, :, :2],
distance)
camera_mat = camera_intri[:, :2, :2]
center = torch.transpose(camera_intri[:, :2, 2:3], 1, 2)
img_points = torch.matmul(img_points, camera_mat.transpose(1, 2)) + center
# img_points = torch.einsum('bki,bji->bjk', [camera_mat, img_points]) \
# + center
return img_points
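
As a hedged illustration of what `batch_rodrigues` above computes, here is the same formula, R = I + sin(θ)·K + (1 − cos(θ))·K², for a single axis-angle vector in plain Python. The helper names `rodrigues` and `apply_rotation` are made up for this sketch, and the explicit small-angle branch is an assumption of the sketch (the batched version adds a small epsilon instead).

```python
import math


def rodrigues(rot_vec):
    """Axis-angle vector (3 floats) -> 3x3 rotation matrix as nested lists."""
    angle = math.sqrt(sum(v * v for v in rot_vec))
    if angle < 1e-8:
        # sketch-only shortcut: a near-zero rotation is the identity
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    rx, ry, rz = (v / angle for v in rot_vec)
    # skew-symmetric cross-product matrix of the unit axis
    K = [[0.0, -rz, ry],
         [rz, 0.0, -rx],
         [-ry, rx, 0.0]]
    KK = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(angle), math.cos(angle)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * KK[i][j]
             for j in range(3)] for i in range(3)]


def apply_rotation(R, p):
    """Multiply a 3x3 matrix with a 3-vector."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]


R = rodrigues([0.0, 0.0, math.pi / 2])      # 90 degrees about the z axis
print(apply_rotation(R, [1.0, 0.0, 0.0]))   # approximately [0, 1, 0]
```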

View File

@ -0,0 +1,131 @@
'''
@ Date: 2020-06-26 12:06:25
@ LastEditors: Qing Shuai
@ LastEditTime: 2020-06-26 12:08:37
@ Author: Qing Shuai
@ Mail: s_q@zju.edu.cn
'''
import numpy as np
import os
from tqdm import tqdm
import torch
import json
def rel_change(prev_val, curr_val):
return (prev_val - curr_val) / max([np.abs(prev_val), np.abs(curr_val), 1])
class FittingMonitor:
def __init__(self, ftol=1e-5, gtol=1e-6, maxiters=100, visualize=False, verbose=False, **kwargs):
self.maxiters = maxiters
self.ftol = ftol
self.gtol = gtol
self.visualize = visualize
self.verbose = verbose
if self.visualize:
from utils.mesh_viewer import MeshViewer
self.mv = MeshViewer(width=1024, height=1024, bg_color=[1.0, 1.0, 1.0, 1.0],
body_color=[0.65098039, 0.74117647, 0.85882353, 1.0],
offscreen=False)
def run_fitting(self, optimizer, closure, params, smpl_render=None, **kwargs):
prev_loss = None
grad_require(params, True)
if self.verbose:
trange = tqdm(range(self.maxiters), desc='Fitting')
else:
trange = range(self.maxiters)
for iter in trange:
loss = optimizer.step(closure)
if torch.isnan(loss).sum() > 0:
print('NaN loss value, stopping!')
break
if torch.isinf(loss).sum() > 0:
print('Infinite loss value, stopping!')
break
# if all([torch.abs(var.grad.view(-1).max()).item() < self.gtol
# for var in params if var.grad is not None]):
# print('Small grad, stopping!')
# break
if iter > 0 and prev_loss is not None and self.ftol > 0:
loss_rel_change = rel_change(prev_loss, loss.item())
if loss_rel_change <= self.ftol:
break
if self.visualize:
vertices = smpl_render.GetVertices(**kwargs)
self.mv.update_mesh(vertices[::10], smpl_render.faces)
prev_loss = loss.item()
grad_require(params, False)
return prev_loss
def close(self):
if self.visualize:
self.mv.close_viewer()
class FittingLog:
if False:
from tensorboardX import SummaryWriter
swriter = SummaryWriter()
def __init__(self, log_name, useVisdom=False):
if not os.path.exists(log_name):
log_file = open(log_name, 'w')
self.index = {log_name:0}
else:
log_file = open(log_name, 'r')
log_pre = log_file.readlines()
log_file.close()
self.index = {log_name:len(log_pre)}
log_file = open(log_name, 'a')
self.log_file = log_file
self.useVisdom = useVisdom
if useVisdom:
import visdom
self.vis = visdom.Visdom(env=os.path.realpath(
os.path.join(os.path.dirname(log_name), '..')).replace(os.sep, '_'))
elif False:
self.writer = FittingLog.swriter
self.log_name = log_name
def step(self, loss_dict, weight_loss):
print(' '.join([key + ' %f'%(loss_dict[key].item()*weight_loss[key])
for key in loss_dict.keys() if weight_loss[key]>0]), file=self.log_file)
loss = {key:loss_dict[key].item()*weight_loss[key]
for key in loss_dict.keys() if weight_loss[key]>0}
if self.useVisdom:
name = list(loss.keys())
val = list(loss.values())
x = self.index.get(self.log_name, 0)
if len(val) == 1:
y = np.array(val)
else:
y = np.array(val).reshape(-1, len(val))
self.vis.line(Y=y,X=np.ones(y.shape)*x,
win=str(self.log_name),#unicode
opts=dict(legend=name,
title=self.log_name),
update=None if x == 0 else 'append'
)
elif False:
self.writer.add_scalars('data/{}'.format(self.log_name), loss, self.index[self.log_name])
self.index[self.log_name] += 1
def log_loss(self, weight_loss):
loss = json.dumps(weight_loss, indent=4)
self.log_file.writelines(loss)
self.log_file.write('\n')
def close(self):
self.log_file.close()
def grad_require(paras, flag=False):
if isinstance(paras, list):
for par in paras:
par.requires_grad = flag
elif isinstance(paras, dict):
for key, par in paras.items():
par.requires_grad = flag
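
The stopping rule in `FittingMonitor.run_fitting` can be tried in isolation. The sketch below replays a list of loss values through the same `rel_change` test; `iterations_until_stop` is a hypothetical helper written for this illustration and does not exist in the repository.

```python
def rel_change(prev_val, curr_val):
    """Signed relative change, normalized by the larger magnitude (at least 1)."""
    return (prev_val - curr_val) / max(abs(prev_val), abs(curr_val), 1)


def iterations_until_stop(losses, ftol=1e-5):
    """Return the iteration index at which the loop would break, or None."""
    prev = None
    for it, loss in enumerate(losses):
        if it > 0 and prev is not None and ftol > 0:
            if rel_change(prev, loss) <= ftol:
                return it
        prev = loss
    return None


print(iterations_until_stop([10.0, 5.0, 4.0, 4.0]))  # 3
print(iterations_until_stop([10.0, 5.0, 6.0, 1.0]))  # 2
```

Because the change is signed, the loop stops not only when the loss plateaus (first call) but also as soon as the loss increases (second call).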

View File

@ -0,0 +1,317 @@
'''
@ Date: 2021-03-05 15:21:33
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-31 23:02:58
@ FilePath: /EasyMocap/easymocap/pyfitting/optimize_mirror.py
'''
from .optimize_simple import _optimizeSMPL, deepcopy_tensor, get_prepare_smplx, dict_of_tensor_to_numpy
from .lossfactory import LossRepro, LossInit, LossSmoothBody, LossSmoothPoses, LossSmoothBodyMulti, LossSmoothPosesMulti
from ..dataset.mirror import flipSMPLPoses, flipPoint2D, flipSMPLParams
import torch
import numpy as np
# There are several possible formulations:
# 1. theta, beta, R, T, (a, b, c, d) || L_r
# 2. theta, beta, R, T, R', T' || L_r, L_s
# 3. theta, beta, R, T, theta', beta', R', T' || L_r, L_s
def flipSMPLPosesV(params, reverse=False):
# the first half holds the real person (outside the mirror), the second half the person in the mirror
nFrames = params['poses'].shape[0] // 2
if reverse:
params['poses'][:nFrames] = flipSMPLPoses(params['poses'][nFrames:])
else:
params['poses'][nFrames:] = flipSMPLPoses(params['poses'][:nFrames])
return params
def flipSMPLParamsV(params, mirror):
params_mirror = flipSMPLParams(params, mirror)
params_new = {}
for key in params.keys():
if key == 'shapes':
params_new['shapes'] = params['shapes']
else:
params_new[key] = np.vstack([params[key], params_mirror[key]])
return params_new
def calc_mirror_transform(m_):
""" From mirror vector to mirror matrix
Args:
m (bn, 4): (a, b, c, d)
Returns:
M: (bn, 3, 4)
"""
norm = torch.norm(m_[:, :3], dim=1, keepdim=True)
m = m_[:, :3] / norm
d = m_[:, 3]
coeff_mat = torch.zeros((m.shape[0], 3, 4), device=m.device)
coeff_mat[:, 0, 0] = 1 - 2*m[:, 0]**2
coeff_mat[:, 0, 1] = -2*m[:, 0]*m[:, 1]
coeff_mat[:, 0, 2] = -2*m[:, 0]*m[:, 2]
coeff_mat[:, 0, 3] = -2*m[:, 0]*d
coeff_mat[:, 1, 0] = -2*m[:, 1]*m[:, 0]
coeff_mat[:, 1, 1] = 1-2*m[:, 1]**2
coeff_mat[:, 1, 2] = -2*m[:, 1]*m[:, 2]
coeff_mat[:, 1, 3] = -2*m[:, 1]*d
coeff_mat[:, 2, 0] = -2*m[:, 2]*m[:, 0]
coeff_mat[:, 2, 1] = -2*m[:, 2]*m[:, 1]
coeff_mat[:, 2, 2] = 1-2*m[:, 2]**2
coeff_mat[:, 2, 3] = -2*m[:, 2]*d
return coeff_mat
class LossKeypointsMirror2D(LossRepro):
def __init__(self, keypoints2d, bboxes, Pall, cfg) -> None:
super().__init__(bboxes, keypoints2d, cfg)
self.Pall = torch.Tensor(Pall).to(cfg.device)
self.nJoints = keypoints2d.shape[-2]
self.nViews, self.nFrames = self.keypoints2d.shape[0], self.keypoints2d.shape[1]
self.kpt_homo = torch.ones((keypoints2d.shape[0]*keypoints2d.shape[1], keypoints2d.shape[2], 1), device=cfg.device)
self.norm = 'l2'
def residual(self, kpts_est):
# kpts_est: (2xnFrames, nJoints, 3)
kpts_homo = torch.cat([kpts_est[..., :self.nJoints, :], self.kpt_homo], dim=2)
point_cam = torch.einsum('ab,fnb->fna', self.Pall, kpts_homo)
img_points = point_cam[..., :2]/point_cam[..., 2:]
img_points = img_points.view(self.nViews, self.nFrames, self.nJoints, 2)
residual = (img_points - self.keypoints2d) * self.conf
return residual
def __call__(self, kpts_est, **kwargs):
"reprojection error for mirror"
# kpts_est: (2xnFrames, 25, 3)
kpts_homo = torch.cat([kpts_est[..., :self.nJoints, :], self.kpt_homo], dim=2)
point_cam = torch.einsum('ab,fnb->fna', self.Pall, kpts_homo)
img_points = point_cam[..., :2]/point_cam[..., 2:]
img_points = img_points.view(self.nViews, self.nFrames, self.nJoints, 2)
return super().__call__(img_points)/self.nViews/self.nFrames
def __str__(self) -> str:
return 'Loss function for Reprojection error of Mirror'
class LossKeypointsMirror2DDirect(LossKeypointsMirror2D):
def __init__(self, keypoints2d, bboxes, Pall, normal=None, cfg=None, mirror=None) -> None:
super().__init__(keypoints2d, bboxes, Pall, cfg)
nFrames = 1
if mirror is None:
self.mirror = torch.zeros([nFrames, 4], device=cfg.device)
if normal is not None:
self.mirror[:, :3] = torch.Tensor(normal).to(cfg.device)
else:
# roughly initialize the mirror => n = (0, 0, 1), d = -10
self.mirror[:, 2] = 1.
self.mirror[:, 3] = -10.
else:
self.mirror = torch.Tensor(mirror).to(cfg.device)
self.norm = 'l2'
def __call__(self, kpts_est, **kwargs):
"reprojection error for direct mirror"
# kpts_est: (nFrames, 25, 3)
M = calc_mirror_transform(self.mirror)
if M.shape[0] != kpts_est.shape[0]:
M = M.expand(kpts_est.shape[0], -1, -1)
homo = torch.ones((kpts_est.shape[0], kpts_est.shape[1], 1), device=kpts_est.device)
kpts_homo = torch.cat([kpts_est, homo], dim=2)
kpts_mirror = flipPoint2D(torch.bmm(M, kpts_homo.transpose(1, 2)).transpose(1, 2))
# for videos, pay attention to the concatenation order
kpts_new = torch.cat([kpts_est, kpts_mirror])
# the mirrored keypoints were obtained by flipping through the mirror
return super().__call__(kpts_new)
def __str__(self) -> str:
return 'Loss function for Reprojection error of Mirror '
class LossMirrorSymmetry:
def __init__(self, N_JOINTS=25, normal=None, cfg=None) -> None:
idx0, idx1 = np.meshgrid(np.arange(N_JOINTS), np.arange(N_JOINTS))
idx0, idx1 = idx0.reshape(-1), idx1.reshape(-1)
idx_diff = np.where(idx0!=idx1)[0]
self.idx00, self.idx11 = idx0[idx_diff], idx1[idx_diff]
self.N_JOINTS = N_JOINTS
self.idx0 = idx0
self.idx1 = idx1
if normal is not None:
self.normal = torch.Tensor(normal).to(cfg.device)
self.normal = self.normal.expand(-1, N_JOINTS, -1)
else:
self.normal = None
self.device = cfg.device
def parallel_mirror(self, kpts_est, **kwargs):
"encourage parallel to mirror"
# kpts_est: (nFramesxnViews, nJoints, 3)
if self.normal is None:
return torch.tensor(0.).to(self.device)
nFrames = kpts_est.shape[0] // 2
kpts_out = kpts_est[:nFrames, ...]
kpts_in = kpts_est[nFrames:, ...]
kpts_in = flipPoint2D(kpts_in)
direct = kpts_in - kpts_out
direct_norm = direct/torch.norm(direct, dim=-1, keepdim=True)
loss = torch.sum(torch.norm(torch.cross(self.normal, direct_norm), dim=2))
return loss / nFrames / kpts_est.shape[1]
def parallel_self(self, kpts_est, **kwargs):
"encourage parallel to self"
# kpts_est: (nFramesxnViews, nJoints, 3)
nFrames = kpts_est.shape[0] // 2
kpts_out = kpts_est[:nFrames, ...]
kpts_in = kpts_est[nFrames:, ...]
kpts_in = flipPoint2D(kpts_in)
direct = kpts_in - kpts_out
direct_norm = direct/torch.norm(direct, dim=-1, keepdim=True)
loss = torch.sum(torch.norm(
torch.cross(direct_norm[:, self.idx0, :], direct_norm[:, self.idx1, :]), dim=2))/self.idx0.shape[0]
return loss / nFrames
def vertical_self(self, kpts_est, **kwargs):
"encourage vertical to self"
# kpts_est: (nFramesxnViews, nJoints, 3)
nFrames = kpts_est.shape[0] // 2
kpts_out = kpts_est[:nFrames, ...]
kpts_in = kpts_est[nFrames:, ...]
kpts_in = flipPoint2D(kpts_in)
direct = kpts_in - kpts_out
direct_norm = direct/torch.norm(direct, dim=-1, keepdim=True)
mid_point = (kpts_in + kpts_out)/2
inner = torch.abs(torch.sum((mid_point[:, self.idx00, :] - mid_point[:, self.idx11, :])*direct_norm[:, self.idx11, :], dim=2))
loss = torch.sum(inner)/self.idx00.shape[0]
return loss / nFrames
def __str__(self) -> str:
return 'Loss function for Mirror Symmetry'
class MirrorLoss():
def __init__(self, N_JOINTS=25) -> None:
N_JOINTS = min(N_JOINTS, 25)
idx0, idx1 = np.meshgrid(np.arange(N_JOINTS), np.arange(N_JOINTS))
idx0, idx1 = idx0.reshape(-1), idx1.reshape(-1)
idx_diff = np.where(idx0!=idx1)[0]
self.idx00, self.idx11 = idx0[idx_diff], idx1[idx_diff]
self.N_JOINTS = N_JOINTS
self.idx0 = idx0
self.idx1 = idx1
def loss(self, lKeypoints, weight_loss):
loss_dict = {}
for key in ['parallel_self', 'parallel_mirror', 'vertical_self']:
if weight_loss[key] > 0.:
loss_dict[key] = 0.
# mirror loss for two person
kpts0 = lKeypoints[0][..., :self.N_JOINTS, :]
kpts1 = flipPoint2D(lKeypoints[1][..., :self.N_JOINTS, :])
# direct: (N, 25, 3)
direct = kpts1 - kpts0
direct_norm = direct/torch.norm(direct, dim=2, keepdim=True)
if weight_loss['parallel_self'] > 0.:
loss_dict['parallel_self'] += torch.sum(torch.norm(
torch.cross(direct_norm[:, self.idx0, :], direct_norm[:, self.idx1, :]), dim=2))/self.idx0.shape[0]
mid_point = (kpts0 + kpts1)/2
if weight_loss['vertical_self'] > 0:
inner = torch.abs(torch.sum((mid_point[:, self.idx00, :] - mid_point[:, self.idx11, :])*direct_norm[:, self.idx11, :], dim=2))
loss_dict['vertical_self'] += torch.sum(inner)/self.idx00.shape[0]
return loss_dict
def optimizeMirrorDirect(body_model, params, bboxes, keypoints2d, Pall, normal, weight, cfg):
"""
simple function for optimizing mirror
# written for single images first
Args:
body_model (SMPL model)
params (DictParam): poses(2, 72), shapes(1, 10), Rh(2, 3), Th(2, 3)
bboxes (nViews, nFrames, 5): 2D bbox of each view, stacked in temporal order
keypoints2d (nViews, nFrames, nJoints, 3): 2D keypoints of each view, stacked in temporal order
weight (Dict): string:float
cfg (Config): Config Node controlling running mode
"""
nViews, nFrames = keypoints2d.shape[:2]
assert nViews == 2, 'Please make sure that there exists only 2 views'
# keep the parameters of the real person
for key in ['poses', 'Rh', 'Th']:
# select the parameters of first person
params[key] = params[key][:nFrames]
prepare_funcs = [
deepcopy_tensor,
get_prepare_smplx(params, cfg, nFrames),
]
loss_repro = LossKeypointsMirror2DDirect(keypoints2d, bboxes, Pall, normal, cfg,
mirror=params.pop('mirror', None))
loss_funcs = {
'k2d': loss_repro,
'init_poses': LossInit(params, cfg).init_poses,
'init_shapes': LossInit(params, cfg).init_shapes,
}
postprocess_funcs = [
dict_of_tensor_to_numpy,
]
params = _optimizeSMPL(body_model, params, prepare_funcs, postprocess_funcs, loss_funcs,
extra_params=[loss_repro.mirror],
weight_loss=weight, cfg=cfg)
mirror = loss_repro.mirror.detach().cpu().numpy()
params = flipSMPLParamsV(params, mirror)
params['mirror'] = mirror
return params
def viewSelection(params, body_model, loss_repro, nFrames):
# view selection
params_inp = {key: val.copy() for key, val in params.items()}
params_inp = flipSMPLPosesV(params_inp)
kpts_est = body_model(return_verts=False, return_tensor=True, **params_inp)
residual = loss_repro.residual(kpts_est)
res_i = torch.norm(residual, dim=-1).mean(dim=-1).sum(dim=0)
params_rev = {key: val.copy() for key, val in params.items()}
params_rev = flipSMPLPosesV(params_rev, reverse=True)
kpts_est = body_model(return_verts=False, return_tensor=True, **params_rev)
residual = loss_repro.residual(kpts_est)
res_o = torch.norm(residual, dim=-1).mean(dim=-1).sum(dim=0)
for nf in range(res_i.shape[0]):
if res_i[nf] < res_o[nf]: # use the real person (outside the mirror)
params['poses'][[nFrames+nf]] = flipSMPLPoses(params['poses'][[nf]])
else:
params['poses'][[nf]] = flipSMPLPoses(params['poses'][[nFrames+nf]])
return params
def optimizeMirrorSoft(body_model, params, bboxes, keypoints2d, Pall, normal, weight, cfg):
"""
simple function for optimizing mirror
Args:
body_model (SMPL model)
params (DictParam): poses(2, 72), shapes(1, 10), Rh(2, 3), Th(2, 3)
bboxes (nViews, nFrames, 5): 2D bbox of each view, stacked in temporal order
keypoints2d (nViews, nFrames, nJoints, 3): 2D keypoints of each view, stacked in temporal order
weight (Dict): string:float
cfg (Config): Config Node controlling running mode
"""
nViews, nFrames = keypoints2d.shape[:2]
assert nViews == 2, 'Please make sure that there exists only 2 views'
prepare_funcs = [
deepcopy_tensor,
flipSMPLPosesV, #
get_prepare_smplx(params, cfg, nFrames*nViews)
]
loss_sym = LossMirrorSymmetry(normal=normal, cfg=cfg)
loss_repro = LossKeypointsMirror2D(keypoints2d, bboxes, Pall, cfg)
params = viewSelection(params, body_model, loss_repro, nFrames)
init = LossInit(params, cfg)
loss_funcs = {
'k2d': loss_repro.__call__,
'init_poses': init.init_poses,
'init_shapes': init.init_shapes,
'par_self': loss_sym.parallel_self,
'ver_self': loss_sym.vertical_self,
'par_mirror': loss_sym.parallel_mirror,
}
if nFrames > 1:
loss_funcs['smooth_body'] = LossSmoothBodyMulti([0, nFrames, nFrames*2], cfg)
loss_funcs['smooth_poses'] = LossSmoothPosesMulti([0, nFrames, nFrames*2], cfg)
postprocess_funcs = [
dict_of_tensor_to_numpy,
flipSMPLPosesV
]
params = _optimizeSMPL(body_model, params, prepare_funcs, postprocess_funcs, loss_funcs, weight_loss=weight, cfg=cfg)
return params
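
To make the matrix built in `calc_mirror_transform` concrete, the sketch below constructs the same 3x4 reflection [I − 2nnᵀ | −2dn] for a single plane (a, b, c, d) in plain Python and applies it to a point. As in the function above, only the normal is normalized and `d` is used as given. `mirror_matrix` and `reflect` are hypothetical names introduced for this illustration.

```python
import math


def mirror_matrix(plane):
    """Plane (a, b, c, d) -> 3x4 reflection matrix as nested lists."""
    a, b, c, d = plane
    norm = math.sqrt(a * a + b * b + c * c)
    n = (a / norm, b / norm, c / norm)
    return [[(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(3)]
            + [-2.0 * n[i] * d]
            for i in range(3)]


def reflect(M, p):
    """Apply a 3x4 matrix to a 3D point in homogeneous coordinates."""
    ph = list(p) + [1.0]
    return [sum(M[i][j] * ph[j] for j in range(4)) for i in range(3)]


# n = (0, 0, 1), d = -10 describes the plane z = 10
M = mirror_matrix((0.0, 0.0, 1.0, -10.0))
q = reflect(M, [1.0, 2.0, 3.0])
print(q)               # approximately [1, 2, 17]
print(reflect(M, q))   # reflecting twice returns approximately [1, 2, 3]
```

Reflecting twice recovers the original point, which is a quick sanity check for any plane passed in.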

View File

@ -0,0 +1,350 @@
'''
@ Date: 2020-11-19 10:49:26
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 22:52:28
@ FilePath: /EasyMocapRelease/easymocap/pyfitting/optimize_simple.py
'''
import numpy as np
import torch
from .lbfgs import LBFGS
from .optimize import FittingMonitor, grad_require, FittingLog
from .lossfactory import LossSmoothBodyMean, LossRegPoses
from .lossfactory import LossKeypoints3D, LossKeypointsMV2D, LossSmoothBody, LossRegPosesZero, LossInit, LossSmoothPoses
def optimizeShape(body_model, body_params, keypoints3d,
weight_loss, kintree, cfg=None):
""" simple function for optimizing model shape given 3d keypoints
Args:
body_model (SMPL model)
body_params (DictParam): poses(1, 72), shapes(1, 10), Rh(1, 3), Th(1, 3)
keypoints3d (nFrames, nJoints, 4): 3D keypoints with confidence
weight_loss (Dict): string:float
kintree ([[src, dst]]): list of list:int
cfg (Config): Config Node controlling running mode
"""
device = body_model.device
# compute the length of each limb
# (np.int was removed in NumPy 1.24; the builtin int is equivalent here)
kintree = np.array(kintree, dtype=int)
# limb_length: nFrames, nLimbs, 1
limb_length = np.linalg.norm(keypoints3d[:, kintree[:, 1], :3] - keypoints3d[:, kintree[:, 0], :3], axis=2, keepdims=True)
# conf: nFrames, nLimbs, 1
limb_conf = np.minimum(keypoints3d[:, kintree[:, 1], 3:], keypoints3d[:, kintree[:, 0], 3:])
limb_length = torch.Tensor(limb_length).to(device)
limb_conf = torch.Tensor(limb_conf).to(device)
body_params = {key:torch.Tensor(val).to(device) for key, val in body_params.items()}
body_params_init = {key:val.clone() for key, val in body_params.items()}
opt_params = [body_params['shapes']]
grad_require(opt_params, True)
optimizer = LBFGS(
opt_params, line_search_fn='strong_wolfe', max_iter=10)
nFrames = keypoints3d.shape[0]
verbose = False
def closure(debug=False):
optimizer.zero_grad()
keypoints3d = body_model(return_verts=False, return_tensor=True, only_shape=True, **body_params)
src = keypoints3d[:, kintree[:, 0], :3] #.detach()
dst = keypoints3d[:, kintree[:, 1], :3]
direct_est = (dst - src).detach()
direct_norm = torch.norm(direct_est, dim=2, keepdim=True)
direct_normalized = direct_est/(direct_norm + 1e-4)
err = dst - src - direct_normalized * limb_length
loss_dict = {
's3d': torch.sum(err**2*limb_conf)/nFrames,
'reg_shapes': torch.sum(body_params['shapes']**2)}
if 'init_shape' in weight_loss.keys():
loss_dict['init_shape'] = torch.sum((body_params['shapes'] - body_params_init['shapes'])**2)
# fittingLog.step(loss_dict, weight_loss)
if verbose:
print(' '.join([key + ' %.3f'%(loss_dict[key].item()*weight_loss[key])
for key in loss_dict.keys() if weight_loss[key]>0]))
loss = sum([loss_dict[key]*weight_loss[key]
for key in loss_dict.keys()])
if not debug:
loss.backward()
return loss
else:
return loss_dict
fitting = FittingMonitor(ftol=1e-4)
final_loss = fitting.run_fitting(optimizer, closure, opt_params)
fitting.close()
grad_require(opt_params, False)
loss_dict = closure(debug=True)
for key in loss_dict.keys():
loss_dict[key] = loss_dict[key].item()
optimizer = LBFGS(
opt_params, line_search_fn='strong_wolfe')
body_params = {key:val.detach().cpu().numpy() for key, val in body_params.items()}
return body_params
N_BODY = 25
N_HAND = 21
def interp(left_value, right_value, weight, key='poses'):
if key == 'Rh':
return left_value * weight + right_value * (1 - weight)
elif key == 'Th':
return left_value * weight + right_value * (1 - weight)
elif key == 'poses':
return left_value * weight + right_value * (1 - weight)
def get_interp_by_keypoints(keypoints):
if len(keypoints.shape) == 3: # (nFrames, nJoints, 3)
conf = keypoints[..., -1]
elif len(keypoints.shape) == 4: # (nViews, nFrames, nJoints, 3)
conf = keypoints[..., -1].sum(axis=0)
else:
raise NotImplementedError
not_valid_frames = np.where(conf.sum(axis=1) < 0.01)[0].tolist()
# walk through the empty frames and pick the start and end of each run
ranges = []
if len(not_valid_frames) > 0:
start = not_valid_frames[0]
for i in range(1, len(not_valid_frames)):
if not_valid_frames[i] == not_valid_frames[i-1] + 1:
pass
else: # a new run of empty frames starts
end = not_valid_frames[i-1]
ranges.append((start, end))
start = not_valid_frames[i]
ranges.append((start, not_valid_frames[-1]))
def interp_func(params):
for start, end in ranges:
# for each interval to fill: interpolate directly between the nearest frames on both sides
left = start - 1
right = end + 1
for nf in range(start, end+1):
weight = (nf - left)/(right - left)
for key in ['Rh', 'Th', 'poses']:
params[key][nf] = interp(params[key][left], params[key][right], 1-weight, key=key)
return params
return interp_func
def interp_by_k3d(conf, params):
for key in ['Rh', 'Th', 'poses']:
params[key] = params[key].clone()
# Totally invalid frames
not_valid_frames = torch.nonzero(conf.sum(dim=1).squeeze() < 0.01)[:, 0].detach().cpu().numpy().tolist()
# 遍历空白帧,选择起点和终点
ranges = []
if len(not_valid_frames) > 0:
start = not_valid_frames[0]
for i in range(1, len(not_valid_frames)):
if not_valid_frames[i] == not_valid_frames[i-1] + 1:
pass
else: # a new run of empty frames starts
end = not_valid_frames[i-1]
ranges.append((start, end))
start = not_valid_frames[i]
ranges.append((start, not_valid_frames[-1]))
for start, end in ranges:
# for each interval to fill: interpolate directly between the nearest frames on both sides
left = start - 1
right = end + 1
for nf in range(start, end+1):
weight = (nf - left)/(right - left)
for key in ['Rh', 'Th', 'poses']:
params[key][nf] = interp(params[key][left], params[key][right], 1-weight, key=key)
return params
def deepcopy_tensor(body_params):
for key in body_params.keys():
body_params[key] = body_params[key].clone()
return body_params
def dict_of_tensor_to_numpy(body_params):
body_params = {key:val.detach().cpu().numpy() for key, val in body_params.items()}
return body_params
def get_prepare_smplx(body_params, cfg, nFrames):
zero_pose = torch.zeros((nFrames, 3), device=cfg.device)
if not cfg.OPT_HAND and cfg.model in ['smplh', 'smplx']:
zero_pose_hand = torch.zeros((nFrames, body_params['poses'].shape[1] - 66), device=cfg.device)
elif cfg.OPT_HAND and not cfg.OPT_EXPR and cfg.model == 'smplx':
zero_pose_face = torch.zeros((nFrames, body_params['poses'].shape[1] - 78), device=cfg.device)
def pack(new_params):
if not cfg.OPT_HAND and cfg.model in ['smplh', 'smplx']:
new_params['poses'] = torch.cat([zero_pose, new_params['poses'][:, 3:66], zero_pose_hand], dim=1)
else:
new_params['poses'] = torch.cat([zero_pose, new_params['poses'][:, 3:]], dim=1)
return new_params
return pack
def get_optParams(body_params, cfg, extra_params):
for key, val in body_params.items():
body_params[key] = torch.Tensor(val).to(cfg.device)
if cfg is None:
opt_params = [body_params['Rh'], body_params['Th'], body_params['poses']]
else:
if extra_params is not None:
opt_params = extra_params
else:
opt_params = []
if cfg.OPT_R:
opt_params.append(body_params['Rh'])
if cfg.OPT_T:
opt_params.append(body_params['Th'])
if cfg.OPT_POSE:
opt_params.append(body_params['poses'])
if cfg.OPT_SHAPE:
opt_params.append(body_params['shapes'])
if cfg.OPT_EXPR and cfg.model == 'smplx':
opt_params.append(body_params['expression'])
return opt_params
def _optimizeSMPL(body_model, body_params, prepare_funcs, postprocess_funcs,
loss_funcs, extra_params=None,
weight_loss={}, cfg=None):
""" A common interface for different optimization.
Args:
body_model (SMPL model)
body_params (DictParam): poses(1, 72), shapes(1, 10), Rh(1, 3), Th(1, 3)
prepare_funcs (List): functions for prepare
loss_funcs (Dict): functions for loss
weight_loss (Dict): weight
cfg (Config): Config Node controlling running mode
"""
loss_funcs = {key: val for key, val in loss_funcs.items() if key in weight_loss.keys() and weight_loss[key] > 0.}
if cfg.verbose:
print('Loss Functions: ')
for key, func in loss_funcs.items():
print(' -> {:15s}: {}'.format(key, func.__doc__))
opt_params = get_optParams(body_params, cfg, extra_params)
grad_require(opt_params, True)
optimizer = LBFGS(opt_params,
line_search_fn='strong_wolfe')
PRINT_STEP = 100
records = []
def closure(debug=False):
# 0. Prepare body parameters => new_params
optimizer.zero_grad()
new_params = body_params.copy()
for func in prepare_funcs:
new_params = func(new_params)
# 1. Compute keypoints => kpts_est
kpts_est = body_model(return_verts=False, return_tensor=True, **new_params)
# 2. Compute loss => loss_dict
loss_dict = {key:func(kpts_est=kpts_est, **new_params) for key, func in loss_funcs.items()}
# 3. Summary and log
cnt = len(records)
if cfg.verbose and cnt % PRINT_STEP == 0:
print('{:-6d}: '.format(cnt) + ' '.join([key + ' %f'%(loss_dict[key].item()*weight_loss[key])
for key in loss_dict.keys() if weight_loss[key]>0]))
loss = sum([loss_dict[key]*weight_loss[key]
for key in loss_dict.keys()])
records.append(loss.item())
if debug:
return loss_dict
loss.backward()
return loss
fitting = FittingMonitor(ftol=1e-4)
final_loss = fitting.run_fitting(optimizer, closure, opt_params)
fitting.close()
grad_require(opt_params, False)
loss_dict = closure(debug=True)
if cfg.verbose:
print('{:-6d}: '.format(len(records)) + ' '.join([key + ' %f'%(loss_dict[key].item()*weight_loss[key])
for key in loss_dict.keys() if weight_loss[key]>0]))
loss_dict = {key:val.item() for key, val in loss_dict.items()}
# post-process the body_parameters
for func in postprocess_funcs:
body_params = func(body_params)
return body_params
def optimizePose3D(body_model, params, keypoints3d, weight, cfg):
"""
simple function for optimizing model pose given 3d keypoints
Args:
body_model (SMPL model)
params (DictParam): poses(1, 72), shapes(1, 10), Rh(1, 3), Th(1, 3)
keypoints3d (nFrames, nJoints, 4): 3D keypoints
weight (Dict): string:float
cfg (Config): Config node controlling the running mode
"""
nFrames = keypoints3d.shape[0]
prepare_funcs = [
deepcopy_tensor,
get_prepare_smplx(params, cfg, nFrames),
get_interp_by_keypoints(keypoints3d)
]
loss_funcs = {
'k3d': LossKeypoints3D(keypoints3d, cfg).body,
'smooth_body': LossSmoothBodyMean(cfg).body,
'smooth_poses': LossSmoothPoses(1, nFrames, cfg).poses,
'reg_poses': LossRegPoses(cfg).reg_body,
'init_poses': LossInit(params, cfg).init_poses,
'reg_poses_zero': LossRegPosesZero(keypoints3d, cfg).__call__,
}
if cfg.OPT_HAND:
loss_funcs['k3d_hand'] = LossKeypoints3D(keypoints3d, cfg, norm='l1').hand
loss_funcs['reg_hand'] = LossRegPoses(cfg).reg_hand
# loss_funcs['smooth_hand'] = LossSmoothPoses(1, nFrames, cfg).hands
loss_funcs['smooth_hand'] = LossSmoothBodyMean(cfg).hand
if cfg.OPT_EXPR:
loss_funcs['k3d_face'] = LossKeypoints3D(keypoints3d, cfg, norm='l1').face
loss_funcs['reg_head'] = LossRegPoses(cfg).reg_head
loss_funcs['reg_expr'] = LossRegPoses(cfg).reg_expr
loss_funcs['smooth_head'] = LossSmoothPoses(1, nFrames, cfg).head
postprocess_funcs = [
get_interp_by_keypoints(keypoints3d),
dict_of_tensor_to_numpy
]
params = _optimizeSMPL(body_model, params, prepare_funcs, postprocess_funcs, loss_funcs, weight_loss=weight, cfg=cfg)
return params
def optimizePose2D(body_model, params, bboxes, keypoints2d, Pall, weight, cfg):
"""
simple function for optimizing model pose given multi-view 2D keypoints
Args:
body_model (SMPL model)
params (DictParam): poses(1, 72), shapes(1, 10), Rh(1, 3), Th(1, 3)
bboxes (nFrames, nViews, 5): bounding boxes of each view
keypoints2d (nFrames, nViews, nJoints, 3): 2D keypoints of each view
Pall: projection matrices of all views
weight (Dict): string:float
cfg (Config): Config node controlling the running mode
"""
# transpose to (nViews, nFrames, 5)
bboxes = bboxes.transpose(1, 0, 2)
# transpose to => keypoints2d: (nViews, nFrames, nJoints, 3)
keypoints2d = keypoints2d.transpose(1, 0, 2, 3)
nViews, nFrames = keypoints2d.shape[:2]
prepare_funcs = [
deepcopy_tensor,
get_prepare_smplx(params, cfg, nFrames),
get_interp_by_keypoints(keypoints2d)
]
loss_funcs = {
'k2d': LossKeypointsMV2D(keypoints2d, bboxes, Pall, cfg).__call__,
'smooth_body': LossSmoothBodyMean(cfg).body,
'init_poses': LossInit(params, cfg).init_poses,
'smooth_poses': LossSmoothPoses(nViews, nFrames, cfg).poses,
# 'reg_poses': LossRegPoses(cfg).reg_body,
'reg_poses_zero': LossRegPosesZero(keypoints2d, cfg).__call__,
}
if cfg.OPT_HAND:
loss_funcs['reg_hand'] = LossRegPoses(cfg).reg_hand
# loss_funcs['smooth_hand'] = LossSmoothPoses(1, nFrames, cfg).hands
loss_funcs['smooth_hand'] = LossSmoothBodyMean(cfg).hand
if cfg.OPT_EXPR:
loss_funcs['reg_head'] = LossRegPoses(cfg).reg_head
loss_funcs['reg_expr'] = LossRegPoses(cfg).reg_expr
loss_funcs['smooth_head'] = LossSmoothPoses(1, nFrames, cfg).head
loss_funcs = {key:val for key, val in loss_funcs.items() if key in weight.keys()}
postprocess_funcs = [
get_interp_by_keypoints(keypoints2d),
dict_of_tensor_to_numpy
]
params = _optimizeSMPL(body_model, params, prepare_funcs, postprocess_funcs, loss_funcs, weight_loss=weight, cfg=cfg)
return params
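The way `_optimizeSMPL` combines its losses can be sketched in plain Python: losses without a strictly positive weight are dropped, and the rest are summed after scaling. The helper names `select_active_losses` and `weighted_total`, and the constant-valued loss functions, are hypothetical stand-ins for illustration and are not part of the repository.

```python
def select_active_losses(loss_funcs, weight_loss):
    """Keep only the losses that have a strictly positive weight."""
    return {key: func for key, func in loss_funcs.items()
            if weight_loss.get(key, 0.) > 0.}


def weighted_total(loss_dict, weight_loss):
    """Sum the individual loss values, each scaled by its weight."""
    return sum(loss_dict[key] * weight_loss[key] for key in loss_dict)


# Hypothetical stand-ins for the real loss objects: each returns a plain float.
loss_funcs = {
    'k3d': lambda: 2.0,
    'smooth_body': lambda: 0.5,
    'reg_poses': lambda: 4.0,
}
weight_loss = {'k3d': 1.0, 'smooth_body': 10.0, 'reg_poses': 0.0}

active = select_active_losses(loss_funcs, weight_loss)
loss_dict = {key: func() for key, func in active.items()}
total = weighted_total(loss_dict, weight_loss)
print(sorted(active), total)
```

In the real function the values are tensors and the total is backpropagated inside the L-BFGS closure; the selection and weighting logic is the same.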


@ -0,0 +1,10 @@
'''
@ Date: 2020-11-18 14:33:20
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-20 16:33:02
@ FilePath: /EasyMocap/code/smplmodel/__init__.py
'''
from .body_model import SMPLlayer
from .body_param import load_model
from .body_param import merge_params, select_nf, init_params, check_params, check_keypoints


@ -0,0 +1,280 @@
'''
@ Date: 2020-11-18 14:04:10
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-15 22:04:32
@ FilePath: /EasyMocap/code/smplmodel/body_model.py
'''
import torch
import torch.nn as nn
from .lbs import lbs, batch_rodrigues
import os.path as osp
import pickle
import numpy as np
import os
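def to_tensor(array, dtype=torch.float32, device=torch.device('cpu')):
    # torch.is_tensor replaces a string check on the type name that looked
    # for 'torch.tensor' and therefore never matched 'torch.Tensor'
    if not torch.is_tensor(array):
        return torch.tensor(array, dtype=dtype).to(device)
    else:
        return array.to(device)

def to_np(array, dtype=np.float32):
    # sparse matrices (e.g. the joint regressor) are densified first
    if 'scipy.sparse' in str(type(array)):
        array = array.todense()
    return np.array(array, dtype=dtype)

def load_regressor(regressor_path):
    if regressor_path.endswith('.npy'):
        X_regressor = to_tensor(np.load(regressor_path))
    elif regressor_path.endswith('.txt'):
        # sparse text format: the first line holds the matrix shape,
        # every following row is (row index, column index, value)
        data = np.loadtxt(regressor_path)
        with open(regressor_path, 'r') as f:
            shape = f.readline().split()[1:]
        reg = np.zeros((int(shape[0]), int(shape[1])))
        for i, j, v in data:
            reg[int(i), int(j)] = v
        X_regressor = to_tensor(reg)
    else:
        # fail with a clear error instead of dropping into a debugger
        raise ValueError('Unsupported regressor file: {}'.format(regressor_path))
    return X_regressor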
NUM_POSES = {'smpl': 72, 'smplh': 78, 'smplx': 66 + 12 + 9}
NUM_EXPR = 10
class SMPLlayer(nn.Module):
def __init__(self, model_path, model_type='smpl', gender='neutral', device=None,
regressor_path=None) -> None:
super(SMPLlayer, self).__init__()
dtype = torch.float32
self.dtype = dtype
self.device = device
self.model_type = model_type
# create the SMPL model
if osp.isdir(model_path):
model_fn = 'SMPL_{}.{ext}'.format(gender.upper(), ext='pkl')
smpl_path = osp.join(model_path, model_fn)
else:
smpl_path = model_path
assert osp.exists(smpl_path), 'Path {} does not exist!'.format(
smpl_path)
with open(smpl_path, 'rb') as smpl_file:
data = pickle.load(smpl_file, encoding='latin1')
self.faces = data['f']
self.register_buffer('faces_tensor',
to_tensor(to_np(self.faces, dtype=np.int64),
dtype=torch.long))
# Pose blend shape basis: 6890 x 3 x 207, reshaped to 6890*3 x 207
num_pose_basis = data['posedirs'].shape[-1]
# 207 x 20670
posedirs = data['posedirs']
data['posedirs'] = np.reshape(data['posedirs'], [-1, num_pose_basis]).T
for key in ['J_regressor', 'v_template', 'weights', 'posedirs', 'shapedirs']:
val = to_tensor(to_np(data[key]), dtype=dtype)
self.register_buffer(key, val)
# indices of parents for each joints
parents = to_tensor(to_np(data['kintree_table'][0])).long()
parents[0] = -1
self.register_buffer('parents', parents)
if self.model_type == 'smplx':
# shape
self.num_expression_coeffs = 10
self.num_shapes = 10
self.shapedirs = self.shapedirs[:, :, :self.num_shapes+self.num_expression_coeffs]
# joints regressor
if regressor_path is not None:
X_regressor = load_regressor(regressor_path)
X_regressor = torch.cat((self.J_regressor, X_regressor), dim=0)
j_J_regressor = torch.zeros(self.J_regressor.shape[0], X_regressor.shape[0], device=device)
for i in range(self.J_regressor.shape[0]):
j_J_regressor[i, i] = 1
j_v_template = X_regressor @ self.v_template
#
j_shapedirs = torch.einsum('vij,kv->kij', [self.shapedirs, X_regressor])
# (25, 24)
j_weights = X_regressor @ self.weights
j_posedirs = torch.einsum('ab, bde->ade', [X_regressor, torch.Tensor(posedirs)]).numpy()
j_posedirs = np.reshape(j_posedirs, [-1, num_pose_basis]).T
j_posedirs = to_tensor(j_posedirs)
self.register_buffer('j_posedirs', j_posedirs)
self.register_buffer('j_shapedirs', j_shapedirs)
self.register_buffer('j_weights', j_weights)
self.register_buffer('j_v_template', j_v_template)
self.register_buffer('j_J_regressor', j_J_regressor)
if self.model_type == 'smplh':
# load smplh data
self.num_pca_comps = 6
from os.path import join
for key in ['LEFT', 'RIGHT']:
left_file = join(os.path.dirname(smpl_path), 'MANO_{}.pkl'.format(key))
with open(left_file, 'rb') as f:
data = pickle.load(f, encoding='latin1')
val = to_tensor(to_np(data['hands_mean'].reshape(1, -1)), dtype=dtype)
self.register_buffer('mHandsMean'+key[0], val)
val = to_tensor(to_np(data['hands_components'][:self.num_pca_comps, :]), dtype=dtype)
self.register_buffer('mHandsComponents'+key[0], val)
self.use_pca = True
self.use_flat_mean = True
elif self.model_type == 'smplx':
# hand pose
self.num_pca_comps = 6
from os.path import join
for key in ['Ll', 'Rr']:
val = to_tensor(to_np(data['hands_mean'+key[1]].reshape(1, -1)), dtype=dtype)
self.register_buffer('mHandsMean'+key[0], val)
val = to_tensor(to_np(data['hands_components'+key[1]][:self.num_pca_comps, :]), dtype=dtype)
self.register_buffer('mHandsComponents'+key[0], val)
self.use_pca = True
self.use_flat_mean = True
def extend_pose(self, poses):
if self.model_type not in ['smplh', 'smplx']:
return poses
elif self.model_type == 'smplh' and poses.shape[-1] == 156:
return poses
elif self.model_type == 'smplx' and poses.shape[-1] == 165:
return poses
NUM_BODYJOINTS = 22 * 3
if self.use_pca:
NUM_HANDJOINTS = self.num_pca_comps
else:
NUM_HANDJOINTS = 15 * 3
NUM_FACEJOINTS = 3 * 3
poses_lh = poses[:, NUM_BODYJOINTS:NUM_BODYJOINTS + NUM_HANDJOINTS]
poses_rh = poses[:, NUM_BODYJOINTS + NUM_HANDJOINTS:NUM_BODYJOINTS+NUM_HANDJOINTS*2]
if self.use_pca:
poses_lh = poses_lh @ self.mHandsComponentsL
poses_rh = poses_rh @ self.mHandsComponentsR
if self.use_flat_mean:
poses_lh = poses_lh + self.mHandsMeanL
poses_rh = poses_rh + self.mHandsMeanR
if self.model_type == 'smplh':
poses = torch.cat([poses[:, :NUM_BODYJOINTS], poses_lh, poses_rh], dim=1)
elif self.model_type == 'smplx':
# the head part has only three joints
# poses_head: (N, 9), jaw_pose, leye_pose, reye_pose respectively
poses_head = poses[:, NUM_BODYJOINTS+NUM_HANDJOINTS*2:]
# body, head, left hand, right hand
poses = torch.cat([poses[:, :NUM_BODYJOINTS], poses_head, poses_lh, poses_rh], dim=1)
return poses
def get_root(self, poses, shapes, return_tensor=False):
if 'torch' not in str(type(poses)):
dtype, device = self.dtype, self.device
poses = to_tensor(poses, dtype, device)
shapes = to_tensor(shapes, dtype, device)
vertices, joints = lbs(shapes, poses, self.v_template,
self.shapedirs, self.posedirs,
self.J_regressor, self.parents,
self.weights, pose2rot=True, dtype=self.dtype, only_shape=True)
# N x 3
j0 = joints[:, 0, :]
if not return_tensor:
j0 = j0.detach().cpu().numpy()
return j0
def convert_from_standard_smpl(self, poses, shapes, Rh=None, Th=None, expression=None):
if 'torch' not in str(type(poses)):
dtype, device = self.dtype, self.device
poses = to_tensor(poses, dtype, device)
shapes = to_tensor(shapes, dtype, device)
Rh = to_tensor(Rh, dtype, device)
Th = to_tensor(Th, dtype, device)
if expression is not None:
expression = to_tensor(expression, dtype, device)
bn = poses.shape[0]
# process shapes
if shapes.shape[0] < bn:
shapes = shapes.expand(bn, -1)
vertices, joints = lbs(shapes, poses, self.v_template,
self.shapedirs, self.posedirs,
self.J_regressor, self.parents,
self.weights, pose2rot=True, dtype=self.dtype, only_shape=True)
# N x 3
j0 = joints[:, 0, :]
Rh = poses[:, :3].clone()
# N x 3 x 3
rot = batch_rodrigues(Rh)
Tnew = Th + j0 - torch.einsum('bij,bj->bi', rot, j0)
poses[:, :3] = 0
res = dict(poses=poses.detach().cpu().numpy(),
shapes=shapes.detach().cpu().numpy(),
Rh=Rh.detach().cpu().numpy(),
Th=Tnew.detach().cpu().numpy()
)
return res
def forward(self, poses, shapes, Rh=None, Th=None, expression=None, return_verts=True, return_tensor=True, only_shape=False, **kwargs):
""" Forward pass for SMPL model
Args:
poses (n, 72)
shapes (n, 10)
Rh (n, 3): global orientation
Th (n, 3): global translation
return_verts (bool, optional): if True, return the mesh vertices (6890, 3); otherwise return the regressed keypoints. Defaults to True.
"""
if 'torch' not in str(type(poses)):
dtype, device = self.dtype, self.device
poses = to_tensor(poses, dtype, device)
shapes = to_tensor(shapes, dtype, device)
Rh = to_tensor(Rh, dtype, device)
Th = to_tensor(Th, dtype, device)
if expression is not None:
expression = to_tensor(expression, dtype, device)
bn = poses.shape[0]
# process Rh, Th
if Rh is None:
Rh = torch.zeros(bn, 3, device=poses.device)
if len(Rh.shape) == 2: # angle-axis
rot = batch_rodrigues(Rh)
else:
rot = Rh
transl = Th.unsqueeze(dim=1)
# process shapes
if shapes.shape[0] < bn:
shapes = shapes.expand(bn, -1)
if expression is not None and self.model_type == 'smplx':
shapes = torch.cat([shapes, expression], dim=1)
# process poses
if self.model_type == 'smplh' or self.model_type == 'smplx':
poses = self.extend_pose(poses)
if return_verts:
vertices, joints = lbs(shapes, poses, self.v_template,
self.shapedirs, self.posedirs,
self.J_regressor, self.parents,
self.weights, pose2rot=True, dtype=self.dtype)
else:
vertices, joints = lbs(shapes, poses, self.j_v_template,
self.j_shapedirs, self.j_posedirs,
self.j_J_regressor, self.parents,
self.j_weights, pose2rot=True, dtype=self.dtype, only_shape=only_shape)
vertices = vertices[:, self.J_regressor.shape[0]:, :]
vertices = torch.matmul(vertices, rot.transpose(1, 2)) + transl
if not return_tensor:
vertices = vertices.detach().cpu().numpy()
return vertices
def check_params(self, body_params):
model_type = self.model_type
nFrames = body_params['poses'].shape[0]
if body_params['poses'].shape[1] != NUM_POSES[model_type]:
body_params['poses'] = np.hstack((body_params['poses'], np.zeros((nFrames, NUM_POSES[model_type] - body_params['poses'].shape[1]))))
if model_type == 'smplx' and 'expression' not in body_params.keys():
body_params['expression'] = np.zeros((nFrames, NUM_EXPR))
return body_params
@staticmethod
def merge_params(param_list, share_shape=True):
output = {}
for key in ['poses', 'shapes', 'Rh', 'Th', 'expression']:
if key in param_list[0].keys():
output[key] = np.vstack([v[key] for v in param_list])
if share_shape:
output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
return output
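The compact-to-full pose expansion performed by `extend_pose` for SMPL-X can be sketched in NumPy. This is a minimal illustration under stated assumptions: the PCA bases and mean hand poses below are random placeholders (the real ones are read from the model file), and `extend_pose_smplx` is a hypothetical standalone name, not a function of the repository.

```python
import numpy as np

NUM_BODY = 22 * 3   # body joints, axis-angle
NUM_PCA = 6         # PCA coefficients per hand, as in the layer above
NUM_HAND = 15 * 3   # full hand pose, axis-angle
NUM_HEAD = 3 * 3    # jaw, left eye, right eye


def extend_pose_smplx(poses, comps_l, comps_r, mean_l, mean_r):
    """Expand (N, 87) compact poses to (N, 165) full SMPL-X poses."""
    lh = poses[:, NUM_BODY:NUM_BODY + NUM_PCA] @ comps_l + mean_l
    rh = poses[:, NUM_BODY + NUM_PCA:NUM_BODY + 2 * NUM_PCA] @ comps_r + mean_r
    head = poses[:, NUM_BODY + 2 * NUM_PCA:]
    # Output order: body, head, left hand, right hand
    return np.hstack([poses[:, :NUM_BODY], head, lh, rh])


rng = np.random.default_rng(0)
# Hypothetical PCA bases and mean poses, for illustration only.
comps_l = rng.normal(size=(NUM_PCA, NUM_HAND))
comps_r = rng.normal(size=(NUM_PCA, NUM_HAND))
mean_l = rng.normal(size=(1, NUM_HAND))
mean_r = rng.normal(size=(1, NUM_HAND))

compact = np.zeros((2, NUM_BODY + 2 * NUM_PCA + NUM_HEAD))
full = extend_pose_smplx(compact, comps_l, comps_r, mean_l, mean_r)
print(compact.shape, full.shape)
```

With all PCA coefficients at zero, each hand block of the full pose reduces to the mean hand pose, which is a quick sanity check on the slicing.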


@ -0,0 +1,101 @@
'''
@ Date: 2020-11-20 13:34:54
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-04-13 20:31:49
@ FilePath: /EasyMocapRelease/easymocap/smplmodel/body_param.py
'''
import numpy as np
from os.path import join
def merge_params(param_list, share_shape=True):
output = {}
for key in ['poses', 'shapes', 'Rh', 'Th', 'expression']:
if key in param_list[0].keys():
output[key] = np.vstack([v[key] for v in param_list])
if share_shape:
output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
return output
def select_nf(params_all, nf):
output = {}
for key in ['poses', 'Rh', 'Th']:
output[key] = params_all[key][nf:nf+1, :]
if 'expression' in params_all.keys():
output['expression'] = params_all['expression'][nf:nf+1, :]
if params_all['shapes'].shape[0] == 1:
output['shapes'] = params_all['shapes']
else:
output['shapes'] = params_all['shapes'][nf:nf+1, :]
return output
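A short usage sketch of the two helpers above, under the assumption that per-frame results are stored as dictionaries of NumPy arrays. The functions are re-declared in compact form so the example runs on its own; the constant-valued frames are made up for illustration.

```python
import numpy as np

KEYS = ['poses', 'shapes', 'Rh', 'Th', 'expression']


def merge_params(param_list, share_shape=True):
    """Stack per-frame parameters; optionally average the shape over frames."""
    output = {key: np.vstack([p[key] for p in param_list])
              for key in KEYS if key in param_list[0]}
    if share_shape:
        output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
    return output


def select_nf(params_all, nf):
    """Extract the parameters of frame `nf`, keeping the leading axis."""
    output = {key: params_all[key][nf:nf + 1] for key in ['poses', 'Rh', 'Th']}
    if 'expression' in params_all:
        output['expression'] = params_all['expression'][nf:nf + 1]
    shapes = params_all['shapes']
    output['shapes'] = shapes if shapes.shape[0] == 1 else shapes[nf:nf + 1]
    return output


# Three made-up frames whose values equal the frame index.
frames = [{'poses': np.full((1, 72), float(i)),
           'shapes': np.full((1, 10), float(i)),
           'Rh': np.zeros((1, 3)),
           'Th': np.full((1, 3), float(i))} for i in range(3)]

merged = merge_params(frames)
frame1 = select_nf(merged, 1)
print(merged['poses'].shape, merged['shapes'].shape, frame1['Th'])
```

Sharing the shape means every selected frame receives the same averaged `shapes` row, while poses and translations stay per-frame.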
NUM_POSES = {'smpl': 72, 'smplh': 78, 'smplx': 66 + 12 + 9}
NUM_EXPR = 10
def init_params(nFrames=1, model_type='smpl'):
params = {
'poses': np.zeros((nFrames, NUM_POSES[model_type])),
'shapes': np.zeros((1, 10)),
'Rh': np.zeros((nFrames, 3)),
'Th': np.zeros((nFrames, 3)),
}
if model_type == 'smplx':
params['expression'] = np.zeros((nFrames, NUM_EXPR))
return params
def check_params(body_params, model_type):
nFrames = body_params['poses'].shape[0]
if body_params['poses'].shape[1] != NUM_POSES[model_type]:
body_params['poses'] = np.hstack((body_params['poses'], np.zeros((nFrames, NUM_POSES[model_type] - body_params['poses'].shape[1]))))
if model_type == 'smplx' and 'expression' not in body_params.keys():
body_params['expression'] = np.zeros((nFrames, NUM_EXPR))
return body_params
def load_model(gender='neutral', use_cuda=True, model_type='smpl', skel_type='body25', device=None, model_path='data/smplx'):
    # prepare SMPL model
    # print('[Load model {}/{}]'.format(model_type, gender))
    import torch
    if device is None:
        if use_cuda and torch.cuda.is_available():
            device = torch.device('cuda')
        else:
            device = torch.device('cpu')
    from .body_model import SMPLlayer
    if model_type == 'smpl':
        if skel_type == 'body25':
            reg_path = join(model_path, 'J_regressor_body25.npy')
        elif skel_type == 'h36m':
            reg_path = join(model_path, 'J_regressor_h36m.npy')
        else:
            raise NotImplementedError
        body_model = SMPLlayer(join(model_path, 'smpl'), gender=gender, device=device,
            regressor_path=reg_path)
    elif model_type == 'smplh':
        # NOTE: this path always points at the male SMPL+H model, whatever `gender` is
        body_model = SMPLlayer(join(model_path, 'smplh/SMPLH_MALE.pkl'), model_type='smplh', gender=gender, device=device,
            regressor_path=join(model_path, 'J_regressor_body25_smplh.txt'))
    elif model_type == 'smplx':
        body_model = SMPLlayer(join(model_path, 'smplx/SMPLX_{}.pkl'.format(gender.upper())), model_type='smplx', gender=gender, device=device,
            regressor_path=join(model_path, 'J_regressor_body25_smplx.txt'))
    else:
        # fail early instead of calling .to() on None
        raise NotImplementedError('Unknown model type: {}'.format(model_type))
    body_model.to(device)
    return body_model
def check_keypoints(keypoints2d, WEIGHT_DEBUFF=1, min_conf=0.3):
# keypoints2d: nFrames, nJoints, 3
#
# wrong feet
# if keypoints2d.shape[-2] > 25 + 42:
# keypoints2d[..., 0, 2] = 0
# keypoints2d[..., [15, 16, 17, 18], -1] = 0
# keypoints2d[..., [19, 20, 21, 22, 23, 24], -1] /= 2
if keypoints2d.shape[-2] > 25:
# set the hand keypoints
keypoints2d[..., 25, :] = keypoints2d[..., 7, :]
keypoints2d[..., 46, :] = keypoints2d[..., 4, :]
keypoints2d[..., 25:, -1] *= WEIGHT_DEBUFF
# reduce the confidence of hand and face
MIN_CONF = min_conf
conf = keypoints2d[..., -1]
conf[conf<MIN_CONF] = 0
return keypoints2d
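The confidence thresholding at the end of `check_keypoints` can be sketched on its own. Unlike the function above, which edits its input in place through a view of the last channel, this hypothetical helper `suppress_low_confidence` works on a copy; the sample keypoints are made up.

```python
import numpy as np


def suppress_low_confidence(keypoints, min_conf=0.3):
    """Zero the confidence of detections below `min_conf` (returns a copy)."""
    keypoints = keypoints.copy()
    conf = keypoints[..., -1]    # a view into the copy, so edits write through
    conf[conf < min_conf] = 0
    return keypoints


# Rows are (x, y, confidence).
kpts = np.array([[10., 20., 0.9],
                 [30., 40., 0.2],
                 [50., 60., 0.3]])
out = suppress_low_confidence(kpts)
print(out[:, -1])
```

Only the confidence channel changes: the coordinates of a suppressed joint are kept, and a loss that multiplies by the confidence then ignores that joint.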

378
easymocap/smplmodel/lbs.py Normal file

@ -0,0 +1,378 @@
# -*- coding: utf-8 -*-
# Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
# holder of all proprietary rights on this computer program.
# You can only use this computer program if you have closed
# a license agreement with MPG or you get the right to use the computer
# program from someone who is authorized to grant you that right.
# Any use of the computer program without a valid license is prohibited and
# liable to prosecution.
#
# Copyright©2019 Max-Planck-Gesellschaft zur Förderung
# der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
# for Intelligent Systems. All rights reserved.
#
# Contact: ps-license@tuebingen.mpg.de
from __future__ import absolute_import
from __future__ import print_function
from __future__ import division
import numpy as np
import torch
import torch.nn.functional as F
def rot_mat_to_euler(rot_mats):
# Calculates rotation matrix to euler angles
# Be careful with extreme cases of Euler angles like [0.0, pi, 0.0]
sy = torch.sqrt(rot_mats[:, 0, 0] * rot_mats[:, 0, 0] +
rot_mats[:, 1, 0] * rot_mats[:, 1, 0])
return torch.atan2(-rot_mats[:, 2, 0], sy)
def find_dynamic_lmk_idx_and_bcoords(vertices, pose, dynamic_lmk_faces_idx,
dynamic_lmk_b_coords,
neck_kin_chain, dtype=torch.float32):
''' Compute the faces, barycentric coordinates for the dynamic landmarks
To do so, we first compute the rotation of the neck around the y-axis
and then use a pre-computed look-up table to find the faces and the
barycentric coordinates that will be used.
Special thanks to Soubhik Sanyal (soubhik.sanyal@tuebingen.mpg.de)
for providing the original TensorFlow implementation and for the LUT.
Parameters
----------
vertices: torch.tensor BxVx3, dtype = torch.float32
The tensor of input vertices
pose: torch.tensor Bx(Jx3), dtype = torch.float32
The current pose of the body model
dynamic_lmk_faces_idx: torch.tensor L, dtype = torch.long
The look-up table from neck rotation to faces
dynamic_lmk_b_coords: torch.tensor Lx3, dtype = torch.float32
The look-up table from neck rotation to barycentric coordinates
neck_kin_chain: list
A python list that contains the indices of the joints that form the
kinematic chain of the neck.
dtype: torch.dtype, optional
Returns
-------
dyn_lmk_faces_idx: torch.tensor, dtype = torch.long
A tensor of size BxL that contains the indices of the faces that
will be used to compute the current dynamic landmarks.
dyn_lmk_b_coords: torch.tensor, dtype = torch.float32
A tensor of size BxLx3 that contains the barycentric coordinates that
will be used to compute the current dynamic landmarks.
'''
batch_size = vertices.shape[0]
aa_pose = torch.index_select(pose.view(batch_size, -1, 3), 1,
neck_kin_chain)
rot_mats = batch_rodrigues(
aa_pose.view(-1, 3), dtype=dtype).view(batch_size, -1, 3, 3)
rel_rot_mat = torch.eye(3, device=vertices.device,
dtype=dtype).unsqueeze_(dim=0)
for idx in range(len(neck_kin_chain)):
rel_rot_mat = torch.bmm(rot_mats[:, idx], rel_rot_mat)
y_rot_angle = torch.round(
torch.clamp(-rot_mat_to_euler(rel_rot_mat) * 180.0 / np.pi,
max=39)).to(dtype=torch.long)
neg_mask = y_rot_angle.lt(0).to(dtype=torch.long)
mask = y_rot_angle.lt(-39).to(dtype=torch.long)
neg_vals = mask * 78 + (1 - mask) * (39 - y_rot_angle)
y_rot_angle = (neg_mask * neg_vals +
(1 - neg_mask) * y_rot_angle)
dyn_lmk_faces_idx = torch.index_select(dynamic_lmk_faces_idx,
0, y_rot_angle)
dyn_lmk_b_coords = torch.index_select(dynamic_lmk_b_coords,
0, y_rot_angle)
return dyn_lmk_faces_idx, dyn_lmk_b_coords
def vertices2landmarks(vertices, faces, lmk_faces_idx, lmk_bary_coords):
''' Calculates landmarks by barycentric interpolation
Parameters
----------
vertices: torch.tensor BxVx3, dtype = torch.float32
The tensor of input vertices
faces: torch.tensor Fx3, dtype = torch.long
The faces of the mesh
lmk_faces_idx: torch.tensor L, dtype = torch.long
The tensor with the indices of the faces used to calculate the
landmarks.
lmk_bary_coords: torch.tensor Lx3, dtype = torch.float32
The tensor of barycentric coordinates that are used to interpolate
the landmarks
Returns
-------
landmarks: torch.tensor BxLx3, dtype = torch.float32
The coordinates of the landmarks for each mesh in the batch
'''
# Extract the indices of the vertices for each face
# BxLx3
batch_size, num_verts = vertices.shape[:2]
device = vertices.device
lmk_faces = torch.index_select(faces, 0, lmk_faces_idx.view(-1)).view(
batch_size, -1, 3)
lmk_faces += torch.arange(
batch_size, dtype=torch.long, device=device).view(-1, 1, 1) * num_verts
lmk_vertices = vertices.view(-1, 3)[lmk_faces].view(
batch_size, -1, 3, 3)
landmarks = torch.einsum('blfi,blf->bli', [lmk_vertices, lmk_bary_coords])
return landmarks
def lbs(betas, pose, v_template, shapedirs, posedirs, J_regressor, parents,
lbs_weights, pose2rot=True, dtype=torch.float32, only_shape=False):
''' Performs Linear Blend Skinning with the given shape and pose parameters
Parameters
----------
betas : torch.tensor BxNB
The tensor of shape parameters
pose : torch.tensor Bx(J + 1) * 3
The pose parameters in axis-angle format
v_template : torch.tensor BxVx3
The template mesh that will be deformed
shapedirs : torch.tensor Vx3xNB
The tensor of PCA shape displacements
posedirs : torch.tensor Px(V * 3)
The pose PCA coefficients
J_regressor : torch.tensor JxV
The regressor array that is used to calculate the joints from
the position of the vertices
parents: torch.tensor J
The array that describes the kinematic tree for the model
lbs_weights: torch.tensor N x V x (J + 1)
The linear blend skinning weights that represent how much the
rotation matrix of each part affects each vertex
pose2rot: bool, optional
Flag on whether to convert the input pose tensor to rotation
matrices. The default value is True. If False, then the pose tensor
should already contain rotation matrices and have a size of
Bx(J + 1)x9
dtype: torch.dtype, optional
Returns
-------
verts: torch.tensor BxVx3
The vertices of the mesh after applying the shape and pose
displacements.
joints: torch.tensor BxJx3
The joints of the model
'''
batch_size = max(betas.shape[0], pose.shape[0])
device = betas.device
# Add shape contribution
v_shaped = v_template + blend_shapes(betas, shapedirs)
# Get the joints
# NxJx3 array
J = vertices2joints(J_regressor, v_shaped)
if only_shape:
return v_shaped, J
# 3. Add pose blend shapes
# N x J x 3 x 3
ident = torch.eye(3, dtype=dtype, device=device)
if pose2rot:
rot_mats = batch_rodrigues(
pose.view(-1, 3), dtype=dtype).view([batch_size, -1, 3, 3])
pose_feature = (rot_mats[:, 1:, :, :] - ident).view([batch_size, -1])
# (N x P) x (P, V * 3) -> N x V x 3
pose_offsets = torch.matmul(pose_feature, posedirs) \
.view(batch_size, -1, 3)
else:
pose_feature = pose[:, 1:].view(batch_size, -1, 3, 3) - ident
rot_mats = pose.view(batch_size, -1, 3, 3)
pose_offsets = torch.matmul(pose_feature.view(batch_size, -1),
posedirs).view(batch_size, -1, 3)
v_posed = pose_offsets + v_shaped
# 4. Get the global joint location
J_transformed, A = batch_rigid_transform(rot_mats, J, parents, dtype=dtype)
# 5. Do skinning:
# W is N x V x (J + 1)
W = lbs_weights.unsqueeze(dim=0).expand([batch_size, -1, -1])
# (N x V x (J + 1)) x (N x (J + 1) x 16)
num_joints = J_regressor.shape[0]
T = torch.matmul(W, A.view(batch_size, num_joints, 16)) \
.view(batch_size, -1, 4, 4)
homogen_coord = torch.ones([batch_size, v_posed.shape[1], 1],
dtype=dtype, device=device)
v_posed_homo = torch.cat([v_posed, homogen_coord], dim=2)
v_homo = torch.matmul(T, torch.unsqueeze(v_posed_homo, dim=-1))
verts = v_homo[:, :, :3, 0]
return verts, J_transformed
def vertices2joints(J_regressor, vertices):
''' Calculates the 3D joint locations from the vertices
Parameters
----------
J_regressor : torch.tensor JxV
The regressor array that is used to calculate the joints from the
position of the vertices
vertices : torch.tensor BxVx3
The tensor of mesh vertices
Returns
-------
torch.tensor BxJx3
The location of the joints
'''
return torch.einsum('bik,ji->bjk', [vertices, J_regressor])
def blend_shapes(betas, shape_disps):
''' Calculates the per vertex displacement due to the blend shapes
Parameters
----------
betas : torch.tensor Bx(num_betas)
Blend shape coefficients
shape_disps: torch.tensor Vx3x(num_betas)
Blend shapes
Returns
-------
torch.tensor BxVx3
The per-vertex displacement due to shape deformation
'''
# Displacement[b, m, k] = sum_{l} betas[b, l] * shape_disps[m, k, l]
# i.e. Multiply each shape displacement by its corresponding beta and
# then sum them.
blend_shape = torch.einsum('bl,mkl->bmk', [betas, shape_disps])
return blend_shape
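The two einsum contractions used by `blend_shapes` and `vertices2joints` can be checked against more explicit NumPy expressions. This is a small sketch with made-up sizes and random data, not the real model dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
B, V, J, NB = 2, 5, 3, 4                    # batch, vertices, joints, betas
betas = rng.normal(size=(B, NB))
shape_disps = rng.normal(size=(V, 3, NB))
J_regressor = rng.normal(size=(J, V))
v_template = rng.normal(size=(V, 3))

# blend_shapes: per-vertex offsets as a weighted sum of the shape bases
offsets = np.einsum('bl,mkl->bmk', betas, shape_disps)
offsets_loop = np.stack([(shape_disps * betas[b]).sum(axis=-1)
                         for b in range(B)])

# vertices2joints: each joint is a linear combination of the shaped vertices
v_shaped = v_template + offsets
joints = np.einsum('bik,ji->bjk', v_shaped, J_regressor)
joints_matmul = J_regressor @ v_shaped      # (J, V) @ (B, V, 3) -> (B, J, 3)

print(offsets.shape, joints.shape)
```

Both operations are linear, which is why the layer above can fold an extra keypoint regressor into the template, the shape basis and the skinning weights once, instead of regressing keypoints from the full mesh at every step.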
def batch_rodrigues(rot_vecs, epsilon=1e-8, dtype=torch.float32):
''' Calculates the rotation matrices for a batch of rotation vectors
Parameters
----------
rot_vecs: torch.tensor Nx3
array of N axis-angle vectors
Returns
-------
R: torch.tensor Nx3x3
The rotation matrices for the given axis-angle parameters
'''
batch_size = rot_vecs.shape[0]
device = rot_vecs.device
angle = torch.norm(rot_vecs + epsilon, dim=1, keepdim=True)
rot_dir = rot_vecs / angle
cos = torch.unsqueeze(torch.cos(angle), dim=1)
sin = torch.unsqueeze(torch.sin(angle), dim=1)
# Bx1 arrays
rx, ry, rz = torch.split(rot_dir, 1, dim=1)
K = torch.zeros((batch_size, 3, 3), dtype=dtype, device=device)
zeros = torch.zeros((batch_size, 1), dtype=dtype, device=device)
K = torch.cat([zeros, -rz, ry, rz, zeros, -rx, -ry, rx, zeros], dim=1) \
.view((batch_size, 3, 3))
ident = torch.eye(3, dtype=dtype, device=device).unsqueeze(dim=0)
rot_mat = ident + sin * K + (1 - cos) * torch.bmm(K, K)
return rot_mat
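The same Rodrigues formula, R = I + sin(θ)·K + (1 − cos(θ))·K², can be written in NumPy as a way to sanity-check the conversion. This is a sketch that mirrors the function above; the name `rodrigues` and the test vectors are chosen for illustration.

```python
import numpy as np


def rodrigues(rot_vecs, epsilon=1e-8):
    """Convert (N, 3) axis-angle vectors to (N, 3, 3) rotation matrices."""
    # The small offset avoids a division by zero for the zero vector
    angle = np.linalg.norm(rot_vecs + epsilon, axis=1, keepdims=True)  # (N, 1)
    axis = rot_vecs / angle
    rx, ry, rz = axis[:, 0], axis[:, 1], axis[:, 2]
    zeros = np.zeros_like(rx)
    # Skew-symmetric cross-product matrix K of each rotation axis
    K = np.stack([zeros, -rz, ry,
                  rz, zeros, -rx,
                  -ry, rx, zeros], axis=1).reshape(-1, 3, 3)
    sin = np.sin(angle)[:, :, None]
    cos = np.cos(angle)[:, :, None]
    return np.eye(3)[None] + sin * K + (1 - cos) * (K @ K)


# A quarter turn about the z axis, and the zero rotation
rot_vecs = np.array([[0., 0., np.pi / 2],
                     [0., 0., 0.]])
R = rodrigues(rot_vecs)
print(np.round(R[0] @ np.array([1., 0., 0.]), 6))
```

The quarter turn maps the x axis onto the y axis, and the zero vector yields the identity because its normalized axis is zero and K vanishes.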
def transform_mat(R, t):
''' Creates a batch of transformation matrices
Args:
- R: Bx3x3 array of a batch of rotation matrices
- t: Bx3x1 array of a batch of translation vectors
Returns:
- T: Bx4x4 Transformation matrix
'''
# No padding left or right, only add an extra row
return torch.cat([F.pad(R, [0, 0, 0, 1]),
F.pad(t, [0, 0, 0, 1], value=1)], dim=2)
def batch_rigid_transform(rot_mats, joints, parents, dtype=torch.float32):
"""
Applies a batch of rigid transformations to the joints
Parameters
----------
rot_mats : torch.tensor BxNx3x3
Tensor of rotation matrices
joints : torch.tensor BxNx3
Locations of joints
parents : torch.tensor BxN
The kinematic tree of each object
dtype : torch.dtype, optional:
The data type of the created tensors, the default is torch.float32
Returns
-------
posed_joints : torch.tensor BxNx3
The locations of the joints after applying the pose rotations
rel_transforms : torch.tensor BxNx4x4
The relative (with respect to the root joint) rigid transformations
for all the joints
"""
joints = torch.unsqueeze(joints, dim=-1)
rel_joints = joints.clone()
rel_joints[:, 1:] -= joints[:, parents[1:]]
transforms_mat = transform_mat(
rot_mats.view(-1, 3, 3),
rel_joints.contiguous().view(-1, 3, 1)).view(-1, joints.shape[1], 4, 4)
transform_chain = [transforms_mat[:, 0]]
for i in range(1, parents.shape[0]):
# Subtract the joint location at the rest pose
# No need for rotation, since it's identity when at rest
curr_res = torch.matmul(transform_chain[parents[i]],
transforms_mat[:, i])
transform_chain.append(curr_res)
transforms = torch.stack(transform_chain, dim=1)
# The last column of the transformations contains the posed joints
posed_joints = transforms[:, :, :3, 3]
joints_homogen = F.pad(joints, [0, 0, 0, 1])
rel_transforms = transforms - F.pad(
torch.matmul(transforms, joints_homogen), [3, 0, 0, 0, 0, 0, 0, 0])
return posed_joints, rel_transforms
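The chaining of per-joint transforms in `batch_rigid_transform` can be illustrated for a single skeleton without a batch axis. The sketch below assumes that every parent index is smaller than its child's, as the loop above also requires; `make_transform` and `forward_kinematics` are hypothetical names, and the three-joint chain is made up.

```python
import numpy as np


def make_transform(R, t):
    """Build a 4x4 rigid transform from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def forward_kinematics(rot_mats, joints, parents):
    """Posed joint locations for rot_mats (J, 3, 3) and rest joints (J, 3)."""
    rel_joints = joints.copy()
    # Express every non-root joint relative to its parent
    rel_joints[1:] -= joints[parents[1:]]
    chain = [make_transform(rot_mats[0], rel_joints[0])]
    for i in range(1, len(parents)):
        # World transform = parent's world transform @ local transform
        chain.append(chain[parents[i]] @ make_transform(rot_mats[i], rel_joints[i]))
    transforms = np.stack(chain)
    # The last column of each transform holds the posed joint position
    return transforms[:, :3, 3]


# Three joints in a straight line along x; bend the middle one by 90 degrees
joints = np.array([[0., 0., 0.], [1., 0., 0.], [2., 0., 0.]])
parents = np.array([-1, 0, 1])
Rz90 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
rot_mats = np.stack([np.eye(3), Rz90, np.eye(3)])
posed = forward_kinematics(rot_mats, joints, parents)
print(posed)
```

Rotating the middle joint leaves that joint where it was and swings its child from (2, 0, 0) to (1, 1, 0), since a joint's rotation only affects its descendants.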


@ -0,0 +1,213 @@
'''
@ Date: 2021-03-05 19:29:49
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-03-31 22:46:05
@ FilePath: /EasyMocap/scripts/postprocess/eval_k3d.py
'''
# Evaluate any 3d keypoints
from glob import glob
from tqdm import tqdm
from os.path import join
import os
import numpy as np
from easymocap.dataset import CONFIG
from easymocap.mytools.reader import read_keypoints3d
from easymocap.mytools import read_camera
from eval_utils import keypoints_error
from pprint import pprint
class Conversion:
def __init__(self, type_i, type_o, type_e=None):
names_i = CONFIG[type_i]['joint_names']
names_o = CONFIG[type_o]['joint_names']
if type_e is None:
self.commons = [i for i in names_o if i in names_i]
else:
names_e = CONFIG[type_e]['joint_names']
self.commons = [i for i in names_e if i in names_i and i in names_o]
self.idx_i = [names_i.index(i) for i in self.commons]
self.idx_o = [names_o.index(i) for i in self.commons]
def inp(self, inp):
return inp[..., self.idx_i, :]
def out(self, out):
return out[..., self.idx_o, :]
def __call__(self, inp, out):
return inp[..., self.idx_i, :], out[..., self.idx_o, :]
def run_eval_keypoints(inp, out, type_i, type_o, step_gt, mode='single', args=None):
# Iterate over the output folder
conversion = Conversion(type_i, type_o)
inplists = sorted(glob(join(inp, '*.json')))[::step_gt]
outlists = sorted(glob(join(out, '*.json')))[args.start:args.end]
assert len(inplists) == len(outlists), '{} != {}'.format(len(inplists), len(outlists))
results = []
for nf, inpname in enumerate(tqdm(inplists)):
outname = outlists[nf]
gts = read_keypoints3d(inpname)
ests = read_keypoints3d(outname)
# Convert the GT to the current coordinate system
for gt in gts:
gt['keypoints3d'] = conversion.inp(gt['keypoints3d'])
if gt['keypoints3d'].shape[1] == 3:
gt['keypoints3d'] = np.hstack([gt['keypoints3d'], np.ones((gt['keypoints3d'].shape[0], 1))])
for est in ests:
est['keypoints3d'] = conversion.out(est['keypoints3d'])
if est['keypoints3d'].shape[1] == 3:
est['keypoints3d'] = np.hstack([est['keypoints3d'], np.ones((est['keypoints3d'].shape[0], 1))])
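# This step reorders the estimates
if mode == 'single':
# Single person: matched directly
pass
elif mode == 'matched': # IDs are already matched
pass
else: # perform the matching
# Clear all estimated ids
for est in ests:
est['id'] = -1
# Compute the distances first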
kpts_gt = np.stack([v['keypoints3d'] for v in gts])
kpts_dt = np.stack([v['keypoints3d'] for v in ests])
distances = np.linalg.norm(kpts_gt[:, None, :, :3] - kpts_dt[None, :, :, :3], axis=-1)
conf = (kpts_gt[:, None, :, -1] > 0) * (kpts_dt[None, :, :, -1] > 0)
dist = (distances * conf).sum(axis=-1)/conf.sum(axis=-1)
# Greedy matching
ests_new = []
for igt, gt in enumerate(gts):
bestid = np.argmin(dist[igt])
ests_new.append(ests[bestid])
ests = ests_new
# Compute the errors
for i, data in enumerate(gts):
kpts_gt = data['keypoints3d']
kpts_est = ests[i]['keypoints3d']
# Compute the various errors and store them in a dict
result = keypoints_error(kpts_gt, kpts_est, conversion.commons, joint_level=args.joint, use_align=args.align)
result['nf'] = nf
result['id'] = data['id']
results.append(result)
write_to_csv(join(out, '..', 'report.csv'), results)
return 0
keys = list(results[list(results.keys())[0]][0].keys())
reports = {}
for pid, result in results.items():
vals = {key: sum([res[key] for res in result])/len(result) for key in keys}
reports[pid] = vals
from tabulate import tabulate
headers = [''] + keys
table = []
for pid, report in reports.items():
res = ['{}'.format(pid)] + ['{:.2f}'.format(report[key]) for key in keys]
table.append(res)
savename = 'tmp.txt'
print(tabulate(table, headers, tablefmt='fancy_grid'))
print(tabulate(table, headers, tablefmt='fancy_grid'), file=open(savename, 'w'))
def write_to_csv(filename, results):
    from tabulate import tabulate
    keys = list(results[0].keys())
    headers, table = [], []
    for key in keys:
        if isinstance(results[0][key], float):
            headers.append(key)
            table.append('{:.3f}'.format(sum([res[key] for res in results])/len(results)))
    print('>> Totally {} samples:'.format(len(results)))
    print(tabulate([table], headers, tablefmt='fancy_grid'))
    with open(filename, 'w') as f:
        # Write the header row
        header = list(results[0].keys())
        f.write(','.join(header) + '\n')
        for res in results:
            f.write(','.join(['{}'.format(res[key]) for key in header]) + '\n')
def run_eval_keypoints_mono(inp, out, type_i, type_o, type_e, step_gt, cam_path, mode='single'):
    # NOTE: this function reads the module-level `args` parsed in __main__
    conversion = Conversion(type_i, type_o, type_e)
    inplists = sorted(glob(join(inp, '*.json')))[::step_gt]
    # TODO: only evaluate a subset of views
    if len(args.sub) == 0:
        views = sorted(os.listdir(out))
    else:
        views = args.sub
    # read camera
    cameras = read_camera(join(cam_path, 'intri.yml'), join(cam_path, 'extri.yml'), views)
    cameras = {key: cameras[key] for key in views}
    if args.cam_res is not None:
        cameras_res = read_camera(join(args.cam_res, 'intri.yml'), join(args.cam_res, 'extri.yml'), views)
        cameras_res = {key: cameras_res[key] for key in views}
    results = []
    for view in views:
        outlists = sorted(glob(join(out, view, '*.json')))
        RT = cameras[view]['RT']
        for outname in outlists:
            basename = os.path.basename(outname)
            gtname = join(inp, basename)
            gts = read_keypoints3d(gtname)
            ests = read_keypoints3d(outname)
            # Convert the GT to the current coordinate system
            for gt in gts:
                keypoints3d = conversion.inp(gt['keypoints3d'])
                conf = keypoints3d[:, -1:].copy()
                keypoints3d[:, -1] = 1
                keypoints3d = (RT @ keypoints3d.T).T
                gt['keypoints3d'] = np.hstack([keypoints3d, conf])
            for est in ests:
                est['keypoints3d'] = conversion.out(est['keypoints3d'])
                if est['keypoints3d'].shape[1] == 3:
                    # Append a confidence of 1
                    est['keypoints3d'] = np.hstack([est['keypoints3d'], np.ones((est['keypoints3d'].shape[0], 1))])
            # Compute the errors
            for i, data in enumerate(gts):
                kpts_gt = data['keypoints3d']
                kpts_est = ests[i]['keypoints3d']
                # Compute the error metrics and store them in a dict
                result = keypoints_error(kpts_gt, kpts_est, conversion.commons, joint_level=args.joint, use_align=True)
                result['pid'] = data['id']
                result['view'] = view
                results.append(result)
    write_to_csv(join(out, '..', 'report.csv'), results)
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('path', type=str)
    parser.add_argument('--out', type=str, default=None)
    parser.add_argument('--type_i', type=str, default='body25',
        help='Type of ground-truth keypoints')
    parser.add_argument('--type_o', type=str, default='body25',
        help='Type of output keypoints')
    parser.add_argument('--type_e', type=str, default=None,
        help='Type of evaluation keypoints')
    parser.add_argument('--mode', type=str, default='single', choices=['single', 'matched', 'greedy'],
        help='the mode used to match 3D persons')
    # parser.add_argument('--dataset', type=str, default='h36m')
    parser.add_argument('--start', type=int, default=0,
        help='frame start')
    parser.add_argument('--end', type=int, default=100000,
        help='frame end')
    parser.add_argument('--step', type=int, default=1,
        help='frame step')
    parser.add_argument('--step_gt', type=int, default=1)
    parser.add_argument('--joint', action='store_true',
        help='report each joint')
    parser.add_argument('--align', action='store_true',
        help='also report root-aligned (ra) and Procrustes-aligned (pa) errors')
    # Multiple views dataset
    parser.add_argument('--mono', action='store_true',
        help='use this option if the estimated joints use monocular images. '
             'The results are stored in different folders.')
    parser.add_argument('--sub', type=str, nargs='+', default=[],
        help='the sub folder lists when in video mode')
    parser.add_argument('--cam', type=str, default=None)
    parser.add_argument('--cam_res', type=str, default=None)
    parser.add_argument('--debug', action='store_true')
    args = parser.parse_args()
    if args.mono:
        run_eval_keypoints_mono(args.path, args.out, args.type_i, args.type_o, args.type_e, cam_path=args.cam, step_gt=args.step_gt, mode=args.mode)
    else:
        run_eval_keypoints(args.path, args.out, args.type_i, args.type_o, args.step_gt, mode=args.mode, args=args)
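
The `greedy` branch of `run_eval_keypoints` above can be tried in isolation. The following is a minimal sketch that mirrors its distance computation on toy data; the helper name `greedy_match` and the toy arrays are illustrative assumptions, not part of the script. Note that a per-row `argmin` does not enforce a one-to-one assignment, so two ground-truth persons may pick the same estimate.

```python
import numpy as np

def greedy_match(kpts_gt, kpts_dt):
    """For each GT person, return the index of the closest estimate.

    kpts_gt: (G, J, 4) and kpts_dt: (D, J, 4); the last column is a confidence.
    (Hypothetical helper, written only for this illustration.)
    """
    # Per-joint Euclidean distance for every (GT, estimate) pair: (G, D, J)
    distances = np.linalg.norm(
        kpts_gt[:, None, :, :3] - kpts_dt[None, :, :, :3], axis=-1)
    # A joint counts only if it is valid in both the GT and the estimate
    conf = (kpts_gt[:, None, :, -1] > 0) * (kpts_dt[None, :, :, -1] > 0)
    # Mean distance over the jointly valid joints: (G, D)
    dist = (distances * conf).sum(axis=-1) / conf.sum(axis=-1)
    # Greedy choice: the nearest estimate for each GT row
    return dist.argmin(axis=1)

# Two toy persons with three joints each, all with confidence 1
gt = np.zeros((2, 3, 4))
gt[1, :, :3] = 1.0
gt[..., 3] = 1.0
# The estimates are the same persons in swapped order, slightly shifted
dt = gt[::-1].copy()
dt[..., :3] += 0.01

matches = greedy_match(gt, dt)
print(matches)
```

If a one-to-one assignment is required, a Hungarian-style solver on the same `dist` matrix would be one option; that is a design suggestion, not something the script does.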


@ -0,0 +1,102 @@
import numpy as np

def compute_similarity_transform(S1, S2):
    """
    Computes a similarity transform (sR, t) that takes
    a set of 3D points S1 (3 x N) closest to a set of 3D points S2,
    where R is a 3x3 rotation matrix, t 3x1 translation, s scale.
    i.e. solves the orthogonal Procrustes problem.
    """
    transposed = False
    if S1.shape[0] != 3 and S1.shape[0] != 2:
        S1 = S1.T
        S2 = S2.T
        transposed = True
    assert(S2.shape[1] == S1.shape[1])
    # 1. Remove mean.
    mu1 = S1.mean(axis=1, keepdims=True)
    mu2 = S2.mean(axis=1, keepdims=True)
    X1 = S1 - mu1
    X2 = S2 - mu2
    # 2. Compute variance of X1 used for scale.
    var1 = np.sum(X1**2)
    # 3. The outer product of X1 and X2.
    K = X1.dot(X2.T)
    # 4. Solution that maximizes trace(R'K) is R=U*V', where U, V are
    # singular vectors of K.
    U, s, Vh = np.linalg.svd(K)
    V = Vh.T
    # Construct Z that fixes the orientation of R to get det(R)=1.
    Z = np.eye(U.shape[0])
    Z[-1, -1] *= np.sign(np.linalg.det(U.dot(V.T)))
    # Construct R.
    R = V.dot(Z.dot(U.T))
    # 5. Recover scale.
    scale = np.trace(R.dot(K)) / var1
    # 6. Recover translation.
    t = mu2 - scale*(R.dot(mu1))
    # 7. Apply the transform to S1.
    S1_hat = scale*R.dot(S1) + t
    if transposed:
        S1_hat = S1_hat.T
    return S1_hat

def reconstruction_error(S1, S2, reduction='mean'):
    """Do Procrustes alignment and compute reconstruction error."""
    S1_hat = compute_similarity_transform(S1, S2)
    re = np.sqrt(((S1_hat - S2) ** 2).sum(axis=-1))
    if reduction == 'mean':
        re = re.mean()
    elif reduction == 'sum':
        re = re.sum()
    return re

def align_by_pelvis(joints, names):
    l_id = names.index('LHip')
    r_id = names.index('RHip')
    pelvis = joints[[l_id, r_id], :].mean(axis=0, keepdims=True)
    return joints - pelvis

def keypoints_error(gt, est, names, use_align=False, joint_level=True):
    assert gt.shape[-1] == 4
    assert est.shape[-1] == 4
    isValid = est[..., -1] > 0
    isValidGT = gt[..., -1] > 0
    isValid_common = isValid * isValidGT
    est = est[..., :-1]
    gt = gt[..., :-1]
    dist = {}
    # Absolute per-joint error, scaled by 1000 (metres to millimetres)
    dist['abs'] = np.sqrt(((gt - est)**2).sum(axis=-1)) * 1000
    dist['pck@50'] = dist['abs'] < 50
    # dist['pck@100'] = dist['abs'] < 100
    # dist['pck@150'] = dist['abs'] < 0.15
    if use_align:
        l_id = names.index('LHip')
        r_id = names.index('RHip')
        assert isValid[l_id] and isValid[r_id]
        assert isValidGT[l_id] and isValidGT[r_id]
        # root align
        gt, est = align_by_pelvis(gt, names), align_by_pelvis(est, names)
        # Root-aligned error (MPJPE after pelvis alignment)
        dist['ra'] = np.sqrt(((est - gt) ** 2).sum(axis=-1)) * 1000
        # Reconstruction error (after Procrustes alignment)
        est_hat = compute_similarity_transform(est, gt)
        dist['pa'] = np.sqrt(((est_hat - gt) ** 2).sum(axis=-1)) * 1000
    result = {}
    for key in ['abs', 'ra', 'pa', 'pck@50', 'pck@100']:
        if key not in dist:
            continue
        result[key+'_mean'] = dist[key].mean()
        if joint_level:
            for i, name in enumerate(names):
                result[key+'_'+name] = dist[key][i]
    return result
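
To see what the Procrustes-aligned (`pa`) error measures, the sketch below re-implements the same SVD-based alignment in compact form and applies it to a point set that differs from its target only by a known scale, rotation and translation. The function name `procrustes_align` and the toy data are assumptions made for this illustration; the steps follow `compute_similarity_transform` above but take `N x 3` arrays directly.

```python
import numpy as np

def procrustes_align(src, dst):
    """Align src (N x 3) to dst (N x 3) with a similarity transform.

    Illustrative re-implementation of the steps in
    compute_similarity_transform; not the repository function itself.
    """
    mu_src = src.mean(axis=0, keepdims=True)
    mu_dst = dst.mean(axis=0, keepdims=True)
    x_src, x_dst = src - mu_src, dst - mu_dst
    var_src = np.sum(x_src ** 2)
    # 3 x 3 cross-covariance between the centred point sets
    K = x_src.T @ x_dst
    U, _, Vh = np.linalg.svd(K)
    V = Vh.T
    # Flip the last axis if needed so that det(R) = +1 (no reflection)
    Z = np.eye(3)
    Z[-1, -1] *= np.sign(np.linalg.det(U @ V.T))
    R = V @ Z @ U.T
    scale = np.trace(R @ K) / var_src
    t = mu_dst.T - scale * (R @ mu_src.T)
    return (scale * (R @ src.T) + t).T

# Toy data: dst is src scaled, rotated about the z axis, and shifted
rng = np.random.default_rng(0)
src = rng.normal(size=(15, 3))
theta = 0.7
R0 = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
dst = 2.5 * src @ R0.T + np.array([0.1, -0.2, 0.3])

aligned = procrustes_align(src, dst)
raw_error = np.linalg.norm(src - dst, axis=-1).mean()
pa_error = np.linalg.norm(aligned - dst, axis=-1).mean()
print('raw: {:.3f}, after alignment: {:.2e}'.format(raw_error, pa_error))
```

Because the toy target is an exact similarity transform of the source, the aligned error should vanish up to floating-point precision, while the raw error stays large. With real estimates the residual after alignment is what gets reported as `pa`.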


@ -2,8 +2,8 @@
@ Date: 2021-01-13 20:38:33 @ Date: 2021-01-13 20:38:33
@ Author: Qing Shuai @ Author: Qing Shuai
@ LastEditors: Qing Shuai @ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-27 10:41:48 @ LastEditTime: 2021-04-13 21:43:52
@ FilePath: /EasyMocap/scripts/preprocess/extract_video.py @ FilePath: /EasyMocapRelease/scripts/preprocess/extract_video.py
''' '''
import os, sys import os, sys
import cv2 import cv2
@ -11,8 +11,6 @@ from os.path import join
from tqdm import tqdm from tqdm import tqdm
from glob import glob from glob import glob
import numpy as np import numpy as np
code_path = join(os.path.dirname(__file__), '..', '..', 'code')
sys.path.append(code_path)
mkdir = lambda x: os.makedirs(x, exist_ok=True) mkdir = lambda x: os.makedirs(x, exist_ok=True)
@ -22,24 +20,33 @@ def extract_video(videoname, path, start, end, step):
return base return base
outpath = join(path, 'images', base) outpath = join(path, 'images', base)
if os.path.exists(outpath) and len(os.listdir(outpath)) > 0: if os.path.exists(outpath) and len(os.listdir(outpath)) > 0:
num_images = len(os.listdir(outpath))
print('>> exists {} frames'.format(num_images))
return base return base
else: else:
os.makedirs(outpath) os.makedirs(outpath, exist_ok=True)
video = cv2.VideoCapture(videoname) video = cv2.VideoCapture(videoname)
totalFrames = int(video.get(cv2.CAP_PROP_FRAME_COUNT)) totalFrames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
for cnt in tqdm(range(totalFrames)): for cnt in tqdm(range(totalFrames), desc='{:-10s}'.format(os.path.basename(videoname))):
ret, frame = video.read() ret, frame = video.read()
if cnt < start:continue if cnt < start:continue
if cnt > end:break if cnt >= end:break
if not ret:break if not ret:continue
cv2.imwrite(join(outpath, '{:06d}.jpg'.format(cnt)), frame) cv2.imwrite(join(outpath, '{:06d}.jpg'.format(cnt)), frame)
video.release() video.release()
return base return base
def extract_2d(openpose, image, keypoints, render): def extract_2d(openpose, image, keypoints, render, args):
if not os.path.exists(keypoints): skip = False
if os.path.exists(keypoints):
# check the number of images and keypoints
if len(os.listdir(image)) == len(os.listdir(keypoints)):
skip = True
if not skip:
os.makedirs(keypoints, exist_ok=True) os.makedirs(keypoints, exist_ok=True)
cmd = './build/examples/openpose/openpose.bin --image_dir {} --write_json {} --display 0'.format(image, keypoints) cmd = './build/examples/openpose/openpose.bin --image_dir {} --write_json {} --display 0'.format(image, keypoints)
if args.highres!=1:
cmd = cmd + ' --net_resolution -1x{}'.format(int(16*((368*args.highres)//16)))
if args.handface: if args.handface:
cmd = cmd + ' --hand --face' cmd = cmd + ' --hand --face'
if args.render: if args.render:
@ -117,17 +124,14 @@ def load_openpose(opname):
out.append(annot) out.append(annot)
return out return out
def convert_from_openpose(src, dst): def convert_from_openpose(src, dst, annotdir):
# convert the 2d pose from openpose # convert the 2d pose from openpose
inputlist = sorted(os.listdir(src)) inputlist = sorted(os.listdir(src))
for inp in tqdm(inputlist): for inp in tqdm(inputlist, desc='{:-10s}'.format(os.path.basename(dst))):
annots = load_openpose(join(src, inp)) annots = load_openpose(join(src, inp))
base = inp.replace('_keypoints.json', '') base = inp.replace('_keypoints.json', '')
annotname = join(dst, base+'.json') annotname = join(dst, base+'.json')
imgname = annotname.replace('annots', 'images').replace('.json', '.jpg') imgname = annotname.replace(annotdir, 'images').replace('.json', '.jpg')
if not os.path.exists(imgname):
os.remove(join(src, inp))
continue
annot = create_annot_file(annotname, imgname) annot = create_annot_file(annotname, imgname)
annot['annots'] = annots annot['annots'] = annots
save_json(annotname, annot) save_json(annotname, annot)
@ -145,12 +149,7 @@ def detect_frame(detector, img, pid=0):
annots.append(annot) annots.append(annot)
return annots return annots
def extract_yolo_hrnet(image_root, annot_root, ext='jpg'): config_high = {
imgnames = sorted(glob(join(image_root, '*.{}'.format(ext))))
import torch
device = torch.device('cuda')
from estimator.detector import Detector
config = {
'yolov4': { 'yolov4': {
'ckpt_path': 'data/models/yolov4.weights', 'ckpt_path': 'data/models/yolov4.weights',
'conf_thres': 0.3, 'conf_thres': 0.3,
@ -168,6 +167,33 @@ def extract_yolo_hrnet(image_root, annot_root, ext='jpg'):
'MIN_BBOX_LEN': 150 'MIN_BBOX_LEN': 150
} }
} }
config_low = {
'yolov4': {
'ckpt_path': 'data/models/yolov4.weights',
'conf_thres': 0.1,
'box_nms_thres': 0.9 # a threshold of 0.9 means boxes with an IoU of 0.9 are not filtered out
},
'hrnet':{
'nof_joints': 17,
'c': 48,
'checkpoint_path': 'data/models/pose_hrnet_w48_384x288.pth'
},
'detect':{
'MIN_PERSON_JOINTS': 0,
'MIN_BBOX_AREA': 0,
'MIN_JOINTS_CONF': 0.0,
'MIN_BBOX_LEN': 0
}
}
def extract_yolo_hrnet(image_root, annot_root, ext='jpg', use_low=False):
imgnames = sorted(glob(join(image_root, '*.{}'.format(ext))))
import torch
device = torch.device('cuda')
from easymocap.estimator import Detector
config = config_low if use_low else config_high
print(config)
detector = Detector('yolo', 'hrnet', device, config) detector = Detector('yolo', 'hrnet', device, config)
for nf, imgname in enumerate(tqdm(imgnames)): for nf, imgname in enumerate(tqdm(imgnames)):
annotname = join(annot_root, os.path.basename(imgname).replace('.{}'.format(ext), '.json')) annotname = join(annot_root, os.path.basename(imgname).replace('.{}'.format(ext), '.json'))
@ -186,47 +212,65 @@ def extract_yolo_hrnet(image_root, annot_root, ext='jpg'):
if __name__ == "__main__": if __name__ == "__main__":
import argparse import argparse
parser = argparse.ArgumentParser() parser = argparse.ArgumentParser()
parser.add_argument('path', type=str, default=None) parser.add_argument('path', type=str, default=None, help="the path of data")
parser.add_argument('--mode', type=str, default='openpose', choices=['openpose', 'yolo-hrnet']) parser.add_argument('--mode', type=str, default='openpose', choices=['openpose', 'yolo-hrnet'], help="model to extract joints from image")
parser.add_argument('--ext', type=str, default='jpg', choices=['jpg', 'png']) parser.add_argument('--ext', type=str, default='jpg', choices=['jpg', 'png'], help="image file extension")
parser.add_argument('--annot', type=str, default='annots', help="sub directory name to store the generated annotation files, default to be annots")
parser.add_argument('--highres', type=float, default=1)
parser.add_argument('--handface', action='store_true') parser.add_argument('--handface', action='store_true')
parser.add_argument('--openpose', type=str, parser.add_argument('--openpose', type=str,
default='/media/qing/Project/openpose') default='/media/qing/Project/openpose')
parser.add_argument('--render', action='store_true', help='use to render the openpose 2d') parser.add_argument('--render', action='store_true',
parser.add_argument('--no2d', action='store_true') help='use to render the openpose 2d')
parser.add_argument('--no2d', action='store_true',
help='only extract the images')
parser.add_argument('--start', type=int, default=0, parser.add_argument('--start', type=int, default=0,
help='frame start') help='frame start')
parser.add_argument('--end', type=int, default=10000, parser.add_argument('--end', type=int, default=10000,
help='frame end') help='frame end')
parser.add_argument('--step', type=int, default=1, parser.add_argument('--step', type=int, default=1,
help='frame step') help='frame step')
parser.add_argument('--low', action='store_true',
help='decrease the threshold of human detector')
parser.add_argument('--gtbbox', action='store_true',
help='use the ground-truth bounding box, and hrnet to estimate human pose')
parser.add_argument('--debug', action='store_true') parser.add_argument('--debug', action='store_true')
args = parser.parse_args() args = parser.parse_args()
mode = args.mode mode = args.mode
if os.path.isdir(args.path): if os.path.isdir(args.path):
image_path = join(args.path, 'images')
os.makedirs(image_path, exist_ok=True)
subs_image = sorted(os.listdir(image_path))
subs_videos = sorted(glob(join(args.path, 'videos', '*.mp4')))
if len(subs_videos) > len(subs_image):
videos = sorted(glob(join(args.path, 'videos', '*.mp4'))) videos = sorted(glob(join(args.path, 'videos', '*.mp4')))
subs = [] subs = []
for video in videos: for video in videos:
basename = extract_video(video, args.path, start=args.start, end=args.end, step=args.step) basename = extract_video(video, args.path, start=args.start, end=args.end, step=args.step)
subs.append(basename) subs.append(basename)
else:
subs = sorted(os.listdir(image_path))
print('cameras: ', ' '.join(subs)) print('cameras: ', ' '.join(subs))
if not args.no2d: if not args.no2d:
for sub in subs: for sub in subs:
image_root = join(args.path, 'images', sub) image_root = join(args.path, 'images', sub)
annot_root = join(args.path, 'annots', sub) annot_root = join(args.path, args.annot, sub)
if os.path.exists(annot_root): if os.path.exists(annot_root):
# check the number of annots and images
if len(os.listdir(image_root)) == len(os.listdir(annot_root)):
print('skip ', annot_root) print('skip ', annot_root)
continue continue
if mode == 'openpose': if mode == 'openpose':
extract_2d(args.openpose, image_root, extract_2d(args.openpose, image_root,
join(args.path, 'openpose', sub), join(args.path, 'openpose', sub),
join(args.path, 'openpose_render', sub)) join(args.path, 'openpose_render', sub), args)
convert_from_openpose( convert_from_openpose(
src=join(args.path, 'openpose', sub), src=join(args.path, 'openpose', sub),
dst=annot_root dst=annot_root,
annotdir=args.annot
) )
elif mode == 'yolo-hrnet': elif mode == 'yolo-hrnet':
extract_yolo_hrnet(image_root, annot_root, args.ext) extract_yolo_hrnet(image_root, annot_root, args.ext, args.low)
else: else:
print(args.path, ' not exists') print(args.path, ' not exists')
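
The `--highres` option in `extract_2d` scales the default OpenPose network height of 368 and rounds it down to a multiple of 16 before passing it to `--net_resolution`. A minimal sketch of that arithmetic, with the hypothetical helper name `openpose_net_height` introduced only for this example:

```python
def openpose_net_height(highres, base=368, multiple=16):
    """Scale the base height and round down to a multiple of `multiple`.

    Mirrors int(16*((368*args.highres)//16)) from extract_2d;
    the function name and keyword arguments are illustrative.
    """
    return int(multiple * ((base * highres) // multiple))

for highres in (1, 1.5, 2):
    height = openpose_net_height(highres)
    # Same flag format as the command string built in extract_2d
    print(' --net_resolution -1x{}'.format(height))
```

The `-1` for the width is taken from the command string in the script; the rounding keeps the height divisible by 16, as the original expression does.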

setup.py Normal file

@ -0,0 +1,28 @@
'''
  @ Date: 2021-03-02 16:53:55
  @ Author: Qing Shuai
  @ LastEditors: Qing Shuai
  @ LastEditTime: 2021-04-14 15:17:28
  @ FilePath: /EasyMocapRelease/setup.py
'''
from setuptools import setup

setup(
    name='easymocap',
    version='0.2',
    description='Easy Human Motion Capture Toolbox',
    author='Qing Shuai',
    author_email='s_q@zju.edu.cn',
    # test_suite='setup.test_all',
    packages=[
        'easymocap',
        'easymocap.dataset',
        'easymocap.smplmodel',
        'easymocap.pyfitting',
        'easymocap.mytools',
        'easymocap.annotator',
        'easymocap.estimator'
    ],
    install_requires=[],
    data_files = []
)
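
One thing worth checking in a `packages` list like the one in `setup.py`: Python concatenates adjacent string literals, so two entries that are not separated by a comma silently become a single, non-existent package name. The short sketch below shows the effect on two of the names from the list; the variable names are illustrative.

```python
# Without a comma, the two literals are joined into one string
packages_missing_comma = [
    'easymocap.annotator'
    'easymocap.estimator',
]

# With the comma, each sub-package is its own entry
packages = [
    'easymocap.annotator',
    'easymocap.estimator',
]

print(len(packages_missing_comma), packages_missing_comma)
print(len(packages), packages)
```

A quick `len()` check or a trailing comma after every entry is a cheap way to catch this class of mistake.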