🚀 support SMPL+H/SMPL-X

shuaiqing 2021-01-24 22:33:08 +08:00
parent 43e15512e3
commit f19058750f
25 changed files with 1460 additions and 293 deletions

View File

@ -2,43 +2,56 @@
* @Date: 2021-01-13 20:32:12
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-01-17 21:07:07
* @LastEditTime: 2021-01-24 22:11:37
* @FilePath: /EasyMocapRelease/Readme.md
-->
# EasyMocap
**EasyMocap** is an open-source toolbox for **markerless human motion capture** from RGB videos.
## Features
- [x] multi-view, single person => 3d body keypoints
- [x] multi-view, single person => SMPL parameters
In this project, we provide the basic code for fitting the SMPL[1]/SMPL+H[2]/SMPL-X[3] models to capture body+hand+face poses from multiple views.
|:heavy_check_mark: Skeleton|:heavy_check_mark: SMPL|
|----|----|
|![repro](doc/feng/repro_512.gif)|![smpl](doc/feng/smpl_512.gif)|
|Input|:heavy_check_mark: Skeleton|:heavy_check_mark: SMPL|
|----|----|----|
|![input](doc/feng/000400.jpg)|![repro](doc/feng/skel.gif)|![smpl](doc/feng/smplx.gif)|
> The following features are not released yet. We are now working hard on them. Please stay tuned!
> We plan to integrate more interesting algorithms, please stay tuned!
|Input|Output|
|----|----|
|multi-view, single person | whole body 3d keypoints|
|multi-view, single person | SMPL+H/SMPL-X/MANO parameters|
|sparse view, single person | dense reconstruction and view synthesis: [NeuralBody](https://zju3dv.github.io/neuralbody/).|
|:black_square_button: Whole Body|:black_square_button: [Detailed Mesh](https://zju3dv.github.io/neuralbody/)|
|----|----|
|<div align="center"><img src="doc/feng/total_512.gif" height="300" alt="mesh" align=center /></div>|<div align="center"><img src="doc/feng/body_256.gif" height="300" width="300" alt="mesh" align=center /></div>|
1. [Multi-Person from Multiple Views](https://github.com/zju3dv/mvpose)
2. [Mocap from Multiple **Uncalibrated** and **Unsynchronized** Videos](https://arxiv.org/pdf/2008.07931.pdf)
3. [Dense Reconstruction and View Synthesis from **Sparse Views**](https://zju3dv.github.io/neuralbody/)
## Installation
### 1. Download SMPL models
This step is the same as [smplx](https://github.com/vchoutas/smplx#model-loading).
To download the *SMPL* model go to [this](http://smpl.is.tue.mpg.de) (male and female models, version 1.0.0, 10 shape PCs) and [this](http://smplify.is.tue.mpg.de) (gender neutral model) project website and register to get access to the downloads section.
To download the *SMPL+H* model go to [this project website](http://mano.is.tue.mpg.de) and register to get access to the downloads section.
To download the *SMPL-X* model go to [this project website](https://smpl-x.is.tue.mpg.de) and register to get access to the downloads section.
**Place them as follows:**
```bash
data
└── smplx
    ├── J_regressor_body25.npy
    ├── J_regressor_body25_smplh.txt
    ├── J_regressor_body25_smplx.txt
    ├── smpl
    │   ├── SMPL_FEMALE.pkl
    │   ├── SMPL_MALE.pkl
    │   └── SMPL_NEUTRAL.pkl
    ├── smplh
    │   ├── MANO_LEFT.pkl
    │   ├── MANO_RIGHT.pkl
    │   ├── SMPLH_female.pkl
    │   ├── SMPLH_FEMALE.pkl
    │   ├── SMPLH_male.pkl
    │   └── SMPLH_MALE.pkl
    └── smplx
        ├── SMPLX_FEMALE.pkl
        ├── SMPLX_MALE.pkl
        └── SMPLX_NEUTRAL.pkl
```
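If the files are placed correctly, the body model should load without errors. A minimal sanity check, based on the loader used in `code/demo_mv1pmf_smpl.py` (`smplmodel.SMPLlayer` is this repo's own wrapper, so run this with `code/` on the Python path):

```python
# minimal sketch: check that the SMPL files can be found and parsed (CPU is enough)
import torch
from smplmodel import SMPLlayer
body_model = SMPLlayer('data/smplx/smpl', gender='neutral',
                       device=torch.device('cpu'),
                       regressor_path='data/smplx/J_regressor_body25.npy')
```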
### 2. Requirements
@ -47,15 +60,13 @@ data
- opencv-python
- [pyrender](https://pyrender.readthedocs.io/en/latest/install/index.html#python-installation): for visualization
- chumpy: for loading SMPL model
- OpenPose[4]: for 2D pose
Some of the Python libraries can be found in `requirements.txt`. You can try different versions of PyTorch.
## Quick Start
We provide an example multiview dataset[[dropbox](https://www.dropbox.com/s/24mb7r921b1g9a7/zju-ls-feng.zip?dl=0)][[BaiduDisk](https://pan.baidu.com/s/1lvAopzYGCic3nauoQXjbPw)(vg1z)], which has 800 frames from 23 synchronized and calibrated cameras. After downloading the dataset, you can run the following example scripts.
```bash
data=path/to/data
out=path/to/output
@ -64,15 +75,17 @@ python3 scripts/preprocess/extract_video.py ${data}
# 1. example for skeleton reconstruction
python3 code/demo_mv1pmf_skel.py ${data} --out ${out} --vis_det --vis_repro --undis --sub_vis 1 7 13 19
# 2. example for SMPL reconstruction
python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19 --gender male
# 3. example for SMPL-X reconstruction
python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --undis --body bodyhandface --sub_vis 1 7 13 19 --start 400 --model smplx --vis_smpl --gender male
```
## Not Quick Start
### 0. Prepare Your Own Dataset
```bash
zju-ls-feng
├── intri.yml
├── extri.yml
└── videos
├── 1.mp4
├── 2.mp4
@ -88,8 +101,10 @@ Here `intri.yml` and `extri.yml` store the camera intrinsic and extrinsic parameters
```bash
data=path/to/data
out=path/to/output
python3 scripts/preprocess/extract_video.py ${data} --openpose <openpose_path> --handface
```
- `--openpose`: specify the openpose path
- `--handface`: detect hands and face keypoints
### 2. Run the code
```bash
@ -98,12 +113,15 @@ python3 code/demo_mv1pmf_skel.py ${data} --out ${out} --vis_det --vis_repro --un
# 2. example for SMPL reconstruction
python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19
```
The input flags:
- `--undis`: use to undistort the images
- `--start, --end`: control the first and last frame numbers.
The output flags:
- `--vis_det`: visualize the detection
- `--vis_repro`: visualize the reprojection
- `--sub_vis`: use to specify the views to visualize. If not set, the code will use all views
- `--vis_smpl`: use to render the SMPL mesh to images.
### 3. Output
The results are saved in `json` format.
@ -131,14 +149,19 @@ The data in `smpl/000000.json` is also a list, each element represents the SMPL
"id": <id>,
"Rh": <(1, 3)>,
"Th": <(1, 3)>,
"poses": <(1, 72)>,
"poses": <(1, 72/78/87)>,
"expression": <(1, 10)>,
"shapes": <(1, 10)>
}
```
We set the first 3 dimensions of `poses` to zero and add a new parameter `Rh` to represent the global orientation; the vertices of the SMPL model are V = R X(theta, beta) + T, where R = Rodrigues(Rh) and T = Th.
If you use the SMPL+H model, `poses` contains `22x3+6+6` dimensions; we use `6` PCA coefficients for each hand. For the SMPL-X model, `3x3` poses of the head (jaw, left eye, right eye) are added.
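A minimal sketch of consuming these files (the body-model forward pass X(theta, beta) is assumed to come from this repo's `smplmodel` package; the file path follows the layout above):

```python
# minimal sketch: load one frame of SMPL parameters and apply Rh/Th
import json
import cv2
import numpy as np

with open('smpl/000000.json') as f:
    params = json.load(f)[0]  # first person in this frame
R = cv2.Rodrigues(np.array(params['Rh'], dtype=np.float64).reshape(3, 1))[0]
Th = np.array(params['Th']).reshape(1, 3)
# with vertices = X(theta, beta) from the body model:
# vertices_world = vertices @ R.T + Th
```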
## Evaluation
In our code, we do not set the best weight parameters; you can adjust them according to your data. If you find a set of good weights, feel free to tell us.
We will add more quantitative reports in [doc/evaluation.md](doc/evaluation.md).
## Acknowledgements
@ -175,3 +198,12 @@ Please consider citing these works if you find this repo is useful for your proj
year={2020}
}
```
## Reference
```bash
[1] Loper, Matthew, et al. "SMPL: A skinned multi-person linear model." ACM transactions on graphics (TOG) 34.6 (2015): 1-16.
[2] Romero, Javier, Dimitrios Tzionas, and Michael J. Black. "Embodied hands: Modeling and capturing hands and bodies together." ACM Transactions on Graphics (ToG) 36.6 (2017): 1-17.
[3] Pavlakos, Georgios, et al. "Expressive body capture: 3d hands, face, and body from a single image." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
Bogo, Federica, et al. "Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image." European conference on computer vision. Springer, Cham, 2016.
[4] Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: Openpose: real-time multi-person 2d pose estimation using part affinity fields. arXiv preprint arXiv:1812.08008 (2018)
```

View File

@ -2,7 +2,7 @@
@ Date: 2021-01-13 16:53:55
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 19:55:58
@ LastEditTime: 2021-01-24 22:27:01
@ FilePath: /EasyMocapRelease/code/dataset/base.py
'''
import os
@ -15,7 +15,7 @@ import numpy as np
code_path = join(os.path.dirname(__file__), '..')
sys.path.append(code_path)
from mytools.camera_utils import read_camera, undistort, write_camera
from mytools.camera_utils import read_camera, undistort, write_camera, get_fundamental_matrix
from mytools.vis_base import merge, plot_bbox, plot_keypoints
def read_json(path):
@ -30,18 +30,40 @@ def save_json(file, data):
json.dump(data, f, indent=4)
def read_annot(annotname, add_hand_face=False):
data = read_json(annotname)['annots']
def read_annot(annotname, mode='body25'):
data = read_json(annotname)
if not isinstance(data, list):
data = data['annots']
for i in range(len(data)):
data[i]['id'] = data[i].pop('personID')
if 'id' not in data[i].keys():
data[i]['id'] = data[i].pop('personID')
if 'keypoints2d' in data[i].keys() and 'keypoints' not in data[i].keys():
data[i]['keypoints'] = data[i].pop('keypoints2d')
for key in ['bbox', 'keypoints', 'handl2d', 'handr2d', 'face2d']:
if key not in data[i].keys():continue
data[i][key] = np.array(data[i][key])
if key == 'face2d':
# TODO: Make parameters, 17 is the offset for the eye brows,
# etc. 51 is the total number of FLAME compatible landmarks
data[i][key] = data[i][key][17:17+51, :]
if mode == 'body25':
data[i]['keypoints'] = data[i]['keypoints']
elif mode == 'body15':
data[i]['keypoints'] = data[i]['keypoints'][:15, :]
elif mode == 'total':
data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d', 'face2d']])
elif mode == 'bodyhand':
data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d']])
elif mode == 'bodyhandface':
data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d', 'face2d']])
data.sort(key=lambda x:x['id'])
return data
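# usage sketch (hypothetical annot file following the format above):
#   annots = read_annot('annots/1/000000.json', mode='bodyhand')
#   annots[0]['keypoints'].shape == (67, 3)  # 25 body + 21 per hand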
def get_bbox_from_pose(pose_2d, img, rate = 0.1):
# this function returns bounding box from the 2D pose
validIdx = pose_2d[:, 2] > 0
# here use pose_2d[:, -1] instead of pose_2d[:, 2]
# because when vis reprojection, the result will be (x, y, depth, conf)
validIdx = pose_2d[:, -1] > 0
if validIdx.sum() == 0:
return [0, 0, 100, 100, 0]
y_min = int(min(pose_2d[validIdx, 1]))
@ -65,10 +87,10 @@ def correct_bbox(img, bbox):
class FileWriter:
def __init__(self, output_path, config=None, basenames=[], cfg=None) -> None:
self.out = output_path
keys = ['keypoints3d', 'smpl', 'repro', 'keypoints']
keys = ['keypoints3d', 'match', 'smpl', 'skel', 'repro', 'keypoints']
output_dict = {key:join(self.out, key) for key in keys}
for key, p in output_dict.items():
os.makedirs(p, exist_ok=True)
# for key, p in output_dict.items():
# os.makedirs(p, exist_ok=True)
self.output_dict = output_dict
self.basenames = basenames
@ -78,19 +100,30 @@ class FileWriter:
self.config = config
def write_keypoints3d(self, results, nf):
os.makedirs(self.output_dict['keypoints3d'], exist_ok=True)
savename = join(self.output_dict['keypoints3d'], '{:06d}.json'.format(nf))
save_json(savename, results)
def vis_detections(self, images, lDetections, nf, key='keypoints', to_img=True, vis_id=True):
os.makedirs(self.output_dict[key], exist_ok=True)
images_vis = []
for nv, image in enumerate(images):
img = image.copy()
for det in lDetections[nv]:
keypoints = det[key]
bbox = det.pop('bbox', get_bbox_from_pose(keypoints, img))
# bbox = det['bbox']
plot_bbox(img, bbox, pid=det['id'], vis_id=vis_id)
plot_keypoints(img, keypoints, pid=det['id'], config=self.config, use_limb_color=False, lw=2)
if key == 'match':
pid = det['id_match']
else:
pid = det['id']
if key not in det.keys():
keypoints = det['keypoints']
else:
keypoints = det[key]
if 'bbox' not in det.keys():
bbox = get_bbox_from_pose(keypoints, img)
else:
bbox = det['bbox']
plot_bbox(img, bbox, pid=pid, vis_id=vis_id)
plot_keypoints(img, keypoints, pid=pid, config=self.config, use_limb_color=False, lw=2)
images_vis.append(img)
image_vis = merge(images_vis, resize=not self.save_origin)
if to_img:
@ -99,46 +132,229 @@ class FileWriter:
return image_vis
def write_smpl(self, results, nf):
os.makedirs(self.output_dict['smpl'], exist_ok=True)
format_out = {'float_kind':lambda x: "%.3f" % x}
filename = join(self.output_dict['smpl'], '{:06d}.json'.format(nf))
with open(filename, 'w') as f:
f.write('[\n')
for data in results:
for idata, data in enumerate(results):
f.write(' {\n')
output = {}
output['id'] = data['id']
output['Rh'] = np.array2string(data['Rh'], max_line_width=1000, separator=', ', formatter=format_out)
output['Th'] = np.array2string(data['Th'], max_line_width=1000, separator=', ', formatter=format_out)
output['poses'] = np.array2string(data['poses'], max_line_width=1000, separator=', ', formatter=format_out)
output['shapes'] = np.array2string(data['shapes'], max_line_width=1000, separator=', ', formatter=format_out)
for key in ['id', 'Rh', 'Th', 'poses', 'shapes']:
f.write(' \"{}\": {},\n'.format(key, output[key]))
f.write(' },\n')
for key in ['Rh', 'Th', 'poses', 'expression', 'shapes']:
if key not in data.keys():continue
output[key] = np.array2string(data[key], max_line_width=1000, separator=', ', formatter=format_out)
for key in output.keys():
f.write(' \"{}\": {}'.format(key, output[key]))
if key != 'shapes':
f.write(',\n')
else:
f.write('\n')
f.write(' }')
if idata != len(results) - 1:
f.write(',\n')
else:
f.write('\n')
f.write(']\n')
def vis_smpl(self, render_data, nf, images, cameras):
def vis_smpl(self, render_data_, nf, images, cameras, mode='smpl', add_back=False):
out = join(self.out, mode)
os.makedirs(out, exist_ok=True)
from visualize.renderer import Renderer
render = Renderer(height=1024, width=1024, faces=None)
render_results = render.render(render_data, cameras, images)
image_vis = merge(render_results, resize=not self.save_origin)
savename = join(self.output_dict['smpl'], '{:06d}.jpg'.format(nf))
cv2.imwrite(savename, image_vis)
if isinstance(render_data_, list): # different view have different data
for nv, render_data in enumerate(render_data_):
render_results = render.render(render_data, cameras, images)
image_vis = merge(render_results, resize=not self.save_origin)
savename = join(out, '{:06d}_{:02d}.jpg'.format(nf, nv))
cv2.imwrite(savename, image_vis)
else:
render_results = render.render(render_data_, cameras, images, add_back=add_back)
image_vis = merge(render_results, resize=not self.save_origin)
savename = join(out, '{:06d}.jpg'.format(nf))
cv2.imwrite(savename, image_vis)
def readReasultsTxt(outname, isA4d=True):
res_ = []
with open(outname, "r") as file:
lines = file.readlines()
if len(lines) < 2:
return res_
nPerson, nJoints = int(lines[0]), int(lines[1])
# from here on, the lines contain only the per-person results
lines = lines[1:]
# each person's block also stores its own keypoint count
line_per_person = 1 + 1 + nJoints
for i in range(nPerson):
trackId = int(lines[i*line_per_person+1])
content = ''.join(lines[i*line_per_person+2:i*line_per_person+2+nJoints])
pose3d = np.fromstring(content, dtype=float, sep=' ').reshape((nJoints, 4))
if isA4d:
# the joint order of association4d differs from the standard definition
pose3d = pose3d[[4, 1, 5, 9, 13, 6, 10, 14, 0, 2, 7, 11, 3, 8, 12], :]
res_.append({'id':trackId, 'keypoints3d':np.array(pose3d)})
return res_
def readResultsJson(outname):
with open(outname) as f:
data = json.load(f)
res_ = []
for d in data:
pose3d = np.array(d['keypoints3d'])
if pose3d.shape[0] > 25:
# when hands are present, set the hand root joints to the corresponding body25 joints
pose3d[25, :] = pose3d[7, :]
pose3d[46, :] = pose3d[4, :]
res_.append({
'id': d['id'] if 'id' in d.keys() else d['personID'],
'keypoints3d': pose3d
})
return res_
class VideoBase(Dataset):
"""Dataset for single sequence data
"""
def __init__(self, image_root, annot_root, out=None, config={}, mode='body15', no_img=False) -> None:
self.image_root = image_root
self.annot_root = annot_root
self.mode = mode
self.no_img = no_img
self.config = config
assert out is not None
self.out = out
self.writer = FileWriter(self.out, config=config)
imgnames = sorted(os.listdir(self.image_root))
self.imagelist = imgnames
self.annotlist = sorted(os.listdir(self.annot_root))
self.nFrames = len(self.imagelist)
self.undis = False
self.read_camera()
def read_camera(self):
# read in the camera parameters
annname = join(self.annot_root, self.annotlist[0])
data = read_json(annname)
if 'K' not in data.keys():
height, width = data['height'], data['width']
focal = 1.2*max(height, width)
K = np.array([focal, 0., width/2, 0., focal, height/2, 0. ,0., 1.]).reshape(3, 3)
else:
K = np.array(data['K']).reshape(3, 3)
self.camera = {'K':K ,'R': np.eye(3), 'T': np.zeros((3, 1))}
def __getitem__(self, index: int):
imgname = join(self.image_root, self.imagelist[index])
annname = join(self.annot_root, self.annotlist[index])
assert os.path.exists(imgname), imgname
assert os.path.exists(annname), annname
assert os.path.basename(imgname).split('.')[0] == os.path.basename(annname).split('.')[0], (imgname, annname)
if not self.no_img:
img = cv2.imread(imgname)
else:
img = None
annot = read_annot(annname, self.mode)
return img, annot
def __len__(self) -> int:
return self.nFrames
def write_smpl(self, peopleDict, nf):
results = []
for pid, people in peopleDict.items():
result = {'id': pid}
result.update(people.body_params)
results.append(result)
self.writer.write_smpl(results, nf)
def vis_detections(self, image, detections, nf, to_img=True):
return self.writer.vis_detections([image], [detections], nf,
key='keypoints', to_img=to_img, vis_id=True)
def vis_repro(self, peopleDict, image, annots, nf):
# visualize the reprojected keypoints together with the input keypoints
detections = []
for pid, data in peopleDict.items():
keypoints3d = (data.keypoints3d @ self.camera['R'].T + self.camera['T'].T) @ self.camera['K'].T
keypoints3d[:, :2] /= keypoints3d[:, 2:]
keypoints3d = np.hstack([keypoints3d, data.keypoints3d[:, -1:]])
det = {
'id': pid,
'repro': keypoints3d
}
detections.append(det)
return self.writer.vis_detections([image], [detections], nf, key='repro',
to_img=True, vis_id=False)
def vis_smpl(self, peopleDict, faces, image, nf, sub_vis=[],
mode='smpl', extra_data=[], add_back=True,
axis=np.array([1., 0., 0.]), degree=0., fix_center=None):
# to keep the interface unified, the rotated-view rendering is implemented here; it is only used for single-view data
# it works by modifying the camera parameters
# the camera correction can be derived from the center of the points
# render the smpl to each view
render_data = {}
for pid, data in peopleDict.items():
render_data[pid] = {
'vertices': data.vertices, 'faces': faces,
'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
for iid, extra in enumerate(extra_data):
render_data[10000+iid] = {
'vertices': extra['vertices'],
'faces': extra['faces'],
'colors': extra['colors'],
'name': extra['name']
}
camera = {}
for key in self.camera.keys():
camera[key] = self.camera[key][None, :, :]
# render another view point
if np.abs(degree) > 1e-3:
vertices_all = np.vstack([data.vertices for data in peopleDict.values()])
if fix_center is None:
center = np.mean(vertices_all, axis=0, keepdims=True)
new_center = center.copy()
new_center[:, 0:2] = 0
else:
center = fix_center.copy()
new_center = fix_center.copy()
new_center[:, 2] *= 1.5
direc = np.array(axis)
rot, _ = cv2.Rodrigues(direc*degree/90*np.pi/2)
# If we rotate the data, it becomes:
# V = Rnew @ (V0 - center) + new_center
# = Rnew @ V0 - Rnew @ center + new_center
# combine with the camera
# VV = Rc(Rnew @ V0 - Rnew @ center + new_center) + Tc
# = Rc@Rnew @ V0 + Rc @ (new_center - Rnew@center) + Tc
blank = np.zeros_like(image, dtype=np.uint8) + 255
images = [image, blank]
Rnew = camera['R'][0] @ rot
Tnew = camera['R'][0] @ (new_center.T - rot @ center.T) + camera['T'][0]
camera['K'] = np.vstack([camera['K'], camera['K']])
camera['R'] = np.vstack([camera['R'], Rnew[None, :, :]])
camera['T'] = np.vstack([camera['T'], Tnew[None, :, :]])
else:
images = [image]
self.writer.vis_smpl(render_data, nf, images, camera, mode, add_back=add_back)
class MVBase(Dataset):
""" Dataset for multiview data
"""
def __init__(self, root, cams=[], out=None, config={},
image_root='images', annot_root='annots',
add_hand_face=True,
mode='body25',
undis=True, no_img=False) -> None:
self.root = root
self.image_root = join(root, image_root)
self.annot_root = join(root, annot_root)
self.add_hand_face = add_hand_face
self.mode = mode
self.undis = undis
self.no_img = no_img
self.config = config
# results path
# the results store keypoints3d
self.skel_path = None
if out is None:
out = join(root, 'output')
self.out = out
@ -146,6 +362,8 @@ class MVBase(Dataset):
if len(cams) == 0:
cams = sorted([i for i in os.listdir(self.image_root) if os.path.isdir(join(self.image_root, i))])
if cams[0].isdigit(): # for camera folders named with digits
cams.sort(key=lambda x:int(x))
self.cams = cams
self.imagelist = {}
self.annotlist = {}
@ -168,6 +386,7 @@ class MVBase(Dataset):
self.cameras.pop('basenames')
self.cameras_for_affinity = [[cam['invK'], cam['R'], cam['T']] for cam in [self.cameras[name] for name in self.cams]]
self.Pall = [self.cameras[cam]['P'] for cam in self.cams]
self.Fall = get_fundamental_matrix(self.cameras, self.cams)
else:
print('!!!there is no camera parameters, maybe bug', intri_name, extri_name)
self.cameras = None
@ -205,7 +424,7 @@ class MVBase(Dataset):
img = cv2.imread(imgname)
images.append(img)
# TODO: index 0 is taken directly here
annot = read_annot(annname, self.add_hand_face)
annot = read_annot(annname, self.mode)
annots.append(annot)
if self.undis:
images = self.undistort(images)
@ -214,3 +433,58 @@ class MVBase(Dataset):
def __len__(self) -> int:
return self.nFrames
def vis_detections(self, images, lDetections, nf, to_img=True, sub_vis=[]):
if len(sub_vis) != 0:
valid_idx = [self.cams.index(i) for i in sub_vis]
images = [images[i] for i in valid_idx]
lDetections = [lDetections[i] for i in valid_idx]
return self.writer.vis_detections(images, lDetections, nf,
key='keypoints', to_img=to_img, vis_id=True)
def vis_match(self, images, lDetections, nf, to_img=True, sub_vis=[]):
if len(sub_vis) != 0:
valid_idx = [self.cams.index(i) for i in sub_vis]
images = [images[i] for i in valid_idx]
lDetections = [lDetections[i] for i in valid_idx]
return self.writer.vis_detections(images, lDetections, nf,
key='match', to_img=to_img, vis_id=True)
def write_keypoints3d(self, peopleDict, nf):
results = []
for pid, people in peopleDict.items():
result = {'id': pid, 'keypoints3d': people.keypoints3d.tolist()}
results.append(result)
self.writer.write_keypoints3d(results, nf)
def write_smpl(self, peopleDict, nf):
results = []
for pid, people in peopleDict.items():
result = {'id': pid}
result.update(people.body_params)
results.append(result)
self.writer.write_smpl(results, nf)
def read_skel(self, nf, mode='none'):
if mode == 'a4d':
outname = join(self.skel_path, '{}.txt'.format(nf))
assert os.path.exists(outname), outname
skels = readReasultsTxt(outname)
elif mode == 'none':
outname = join(self.skel_path, '{:06d}.json'.format(nf))
assert os.path.exists(outname), outname
skels = readResultsJson(outname)
else:
import ipdb; ipdb.set_trace()
return skels
def read_smpl(self, nf):
outname = join(self.skel_path, '{:06d}.json'.format(nf))
assert os.path.exists(outname), outname
datas = read_json(outname)
outputs = []
for data in datas:
for key in ['Rh', 'Th', 'poses', 'shapes']:
data[key] = np.array(data[key])
outputs.append(data)
return outputs

View File

@ -2,14 +2,14 @@
* @ Date: 2020-09-26 16:52:55
* @ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-13 14:04:46
@ FilePath: /EasyMocap/code/dataset/config.py
@ LastEditTime: 2021-01-24 20:21:50
@ FilePath: /EasyMocapRelease/code/dataset/config.py
'''
import numpy as np
CONFIG = {}
CONFIG['body25'] = {'kintree':
CONFIG['body25'] = {'nJoints': 25, 'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
@ -33,9 +33,38 @@ CONFIG['body25'] = {'kintree':
[21, 14],
[22, 11],
[23, 22],
[24, 11]]}
[24, 11]],
'joint_names': ["Nose", "Neck", "RShoulder", "RElbow", "RWrist", "LShoulder", "LElbow", "LWrist", "MidHip", "RHip","RKnee","RAnkle","LHip","LKnee","LAnkle","REye","LEye","REar","LEar","LBigToe","LSmallToe","LHeel","RBigToe","RSmallToe","RHeel"]}
CONFIG['body15'] = {'kintree':
CONFIG['body25']['skeleton'] = \
{
( 0, 1): {'mean': 0.228, 'std': 0.046}, # Nose ->Neck
( 1, 2): {'mean': 0.144, 'std': 0.029}, # Neck ->RShoulder
( 2, 3): {'mean': 0.283, 'std': 0.057}, # RShoulder->RElbow
( 3, 4): {'mean': 0.258, 'std': 0.052}, # RElbow ->RWrist
( 1, 5): {'mean': 0.145, 'std': 0.029}, # Neck ->LShoulder
( 5, 6): {'mean': 0.281, 'std': 0.056}, # LShoulder->LElbow
( 6, 7): {'mean': 0.258, 'std': 0.052}, # LElbow ->LWrist
( 1, 8): {'mean': 0.483, 'std': 0.097}, # Neck ->MidHip
( 8, 9): {'mean': 0.106, 'std': 0.021}, # MidHip ->RHip
( 9, 10): {'mean': 0.438, 'std': 0.088}, # RHip ->RKnee
(10, 11): {'mean': 0.406, 'std': 0.081}, # RKnee ->RAnkle
( 8, 12): {'mean': 0.106, 'std': 0.021}, # MidHip ->LHip
(12, 13): {'mean': 0.438, 'std': 0.088}, # LHip ->LKnee
(13, 14): {'mean': 0.408, 'std': 0.082}, # LKnee ->LAnkle
( 0, 15): {'mean': 0.043, 'std': 0.009}, # Nose ->REye
( 0, 16): {'mean': 0.043, 'std': 0.009}, # Nose ->LEye
(15, 17): {'mean': 0.105, 'std': 0.021}, # REye ->REar
(16, 18): {'mean': 0.104, 'std': 0.021}, # LEye ->LEar
(14, 19): {'mean': 0.180, 'std': 0.036}, # LAnkle ->LBigToe
(19, 20): {'mean': 0.038, 'std': 0.008}, # LBigToe ->LSmallToe
(14, 21): {'mean': 0.044, 'std': 0.009}, # LAnkle ->LHeel
(11, 22): {'mean': 0.182, 'std': 0.036}, # RAnkle ->RBigToe
(22, 23): {'mean': 0.038, 'std': 0.008}, # RBigToe ->RSmallToe
(11, 24): {'mean': 0.044, 'std': 0.009}, # RAnkle ->RHeel
}
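# sketch: these priors support a 3-sigma limb-length test (as used by
# simple_recon_person in mytools/reconstruction.py), e.g.
#   limb = CONFIG['body25']['skeleton'][(1, 8)]   # Neck -> MidHip
#   plausible = abs(length - limb['mean']) < 3 * limb['std']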
CONFIG['body15'] = {'nJoints': 15, 'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
@ -50,6 +79,8 @@ CONFIG['body15'] = {'kintree':
[12, 8],
[13, 12],
[14, 13],]}
CONFIG['body15']['joint_names'] = CONFIG['body25']['joint_names'][:15]
CONFIG['body15']['skeleton'] = CONFIG['body25']['skeleton']
CONFIG['hand'] = {'kintree':
[[ 1, 0],
@ -99,48 +130,392 @@ CONFIG['bodyhand'] = {'kintree':
[22, 11],
[23, 22],
[24, 11],
[26, 25], # handl
[26, 7], # handl
[27, 26],
[28, 27],
[29, 28],
[30, 25],
[30, 7],
[31, 30],
[32, 31],
[33, 32],
[34, 25],
[34, 7],
[35, 34],
[36, 35],
[37, 36],
[38, 25],
[38, 7],
[39, 38],
[40, 39],
[41, 40],
[42, 25],
[42, 7],
[43, 42],
[44, 43],
[45, 44],
[47, 46], # handr
[47, 4], # handr
[48, 47],
[49, 48],
[50, 49],
[51, 46],
[51, 4],
[52, 51],
[53, 52],
[54, 53],
[55, 46],
[55, 4],
[56, 55],
[57, 56],
[58, 57],
[59, 46],
[59, 4],
[60, 59],
[61, 60],
[62, 61],
[63, 46],
[63, 4],
[64, 63],
[65, 64],
[66, 65]
]
],
'nJoints': 67,
'skeleton':{
( 0, 1): {'mean': 0.251, 'std': 0.050},
( 1, 2): {'mean': 0.169, 'std': 0.034},
( 2, 3): {'mean': 0.292, 'std': 0.058},
( 3, 4): {'mean': 0.275, 'std': 0.055},
( 1, 5): {'mean': 0.169, 'std': 0.034},
( 5, 6): {'mean': 0.295, 'std': 0.059},
( 6, 7): {'mean': 0.278, 'std': 0.056},
( 1, 8): {'mean': 0.566, 'std': 0.113},
( 8, 9): {'mean': 0.110, 'std': 0.022},
( 9, 10): {'mean': 0.398, 'std': 0.080},
(10, 11): {'mean': 0.402, 'std': 0.080},
( 8, 12): {'mean': 0.111, 'std': 0.022},
(12, 13): {'mean': 0.395, 'std': 0.079},
(13, 14): {'mean': 0.403, 'std': 0.081},
( 0, 15): {'mean': 0.053, 'std': 0.011},
( 0, 16): {'mean': 0.056, 'std': 0.011},
(15, 17): {'mean': 0.107, 'std': 0.021},
(16, 18): {'mean': 0.107, 'std': 0.021},
(14, 19): {'mean': 0.180, 'std': 0.036},
(19, 20): {'mean': 0.055, 'std': 0.011},
(14, 21): {'mean': 0.065, 'std': 0.013},
(11, 22): {'mean': 0.169, 'std': 0.034},
(22, 23): {'mean': 0.052, 'std': 0.010},
(11, 24): {'mean': 0.061, 'std': 0.012},
( 7, 26): {'mean': 0.045, 'std': 0.009},
(26, 27): {'mean': 0.042, 'std': 0.008},
(27, 28): {'mean': 0.035, 'std': 0.007},
(28, 29): {'mean': 0.029, 'std': 0.006},
( 7, 30): {'mean': 0.102, 'std': 0.020},
(30, 31): {'mean': 0.040, 'std': 0.008},
(31, 32): {'mean': 0.026, 'std': 0.005},
(32, 33): {'mean': 0.023, 'std': 0.005},
( 7, 34): {'mean': 0.101, 'std': 0.020},
(34, 35): {'mean': 0.043, 'std': 0.009},
(35, 36): {'mean': 0.029, 'std': 0.006},
(36, 37): {'mean': 0.024, 'std': 0.005},
( 7, 38): {'mean': 0.097, 'std': 0.019},
(38, 39): {'mean': 0.041, 'std': 0.008},
(39, 40): {'mean': 0.027, 'std': 0.005},
(40, 41): {'mean': 0.024, 'std': 0.005},
( 7, 42): {'mean': 0.095, 'std': 0.019},
(42, 43): {'mean': 0.033, 'std': 0.007},
(43, 44): {'mean': 0.020, 'std': 0.004},
(44, 45): {'mean': 0.018, 'std': 0.004},
( 4, 47): {'mean': 0.043, 'std': 0.009},
(47, 48): {'mean': 0.041, 'std': 0.008},
(48, 49): {'mean': 0.034, 'std': 0.007},
(49, 50): {'mean': 0.028, 'std': 0.006},
( 4, 51): {'mean': 0.101, 'std': 0.020},
(51, 52): {'mean': 0.041, 'std': 0.008},
(52, 53): {'mean': 0.026, 'std': 0.005},
(53, 54): {'mean': 0.024, 'std': 0.005},
( 4, 55): {'mean': 0.100, 'std': 0.020},
(55, 56): {'mean': 0.044, 'std': 0.009},
(56, 57): {'mean': 0.029, 'std': 0.006},
(57, 58): {'mean': 0.023, 'std': 0.005},
( 4, 59): {'mean': 0.096, 'std': 0.019},
(59, 60): {'mean': 0.040, 'std': 0.008},
(60, 61): {'mean': 0.028, 'std': 0.006},
(61, 62): {'mean': 0.023, 'std': 0.005},
( 4, 63): {'mean': 0.094, 'std': 0.019},
(63, 64): {'mean': 0.032, 'std': 0.006},
(64, 65): {'mean': 0.020, 'std': 0.004},
(65, 66): {'mean': 0.018, 'std': 0.004},
}
}
CONFIG['bodyhandface'] = {'kintree':
[[ 1, 0],
[ 2, 1],
[ 3, 2],
[ 4, 3],
[ 5, 1],
[ 6, 5],
[ 7, 6],
[ 8, 1],
[ 9, 8],
[10, 9],
[11, 10],
[12, 8],
[13, 12],
[14, 13],
[15, 0],
[16, 0],
[17, 15],
[18, 16],
[19, 14],
[20, 19],
[21, 14],
[22, 11],
[23, 22],
[24, 11],
[26, 7], # handl
[27, 26],
[28, 27],
[29, 28],
[30, 7],
[31, 30],
[32, 31],
[33, 32],
[34, 7],
[35, 34],
[36, 35],
[37, 36],
[38, 7],
[39, 38],
[40, 39],
[41, 40],
[42, 7],
[43, 42],
[44, 43],
[45, 44],
[47, 4], # handr
[48, 47],
[49, 48],
[50, 49],
[51, 4],
[52, 51],
[53, 52],
[54, 53],
[55, 4],
[56, 55],
[57, 56],
[58, 57],
[59, 4],
[60, 59],
[61, 60],
[62, 61],
[63, 4],
[64, 63],
[65, 64],
[66, 65],
[ 67, 68],
[ 68, 69],
[ 69, 70],
[ 70, 71],
[ 72, 73],
[ 73, 74],
[ 74, 75],
[ 75, 76],
[ 77, 78],
[ 78, 79],
[ 79, 80],
[ 81, 82],
[ 82, 83],
[ 83, 84],
[ 84, 85],
[ 86, 87],
[ 87, 88],
[ 88, 89],
[ 89, 90],
[ 90, 91],
[ 91, 86],
[ 92, 93],
[ 93, 94],
[ 94, 95],
[ 95, 96],
[ 96, 97],
[ 97, 92],
[ 98, 99],
[ 99, 100],
[100, 101],
[101, 102],
[102, 103],
[103, 104],
[104, 105],
[105, 106],
[106, 107],
[107, 108],
[108, 109],
[109, 98],
[110, 111],
[111, 112],
[112, 113],
[113, 114],
[114, 115],
[115, 116],
[116, 117],
[117, 110]
],
'nJoints': 118,
'skeleton':{
( 0, 1): {'mean': 0.251, 'std': 0.050},
( 1, 2): {'mean': 0.169, 'std': 0.034},
( 2, 3): {'mean': 0.292, 'std': 0.058},
( 3, 4): {'mean': 0.275, 'std': 0.055},
( 1, 5): {'mean': 0.169, 'std': 0.034},
( 5, 6): {'mean': 0.295, 'std': 0.059},
( 6, 7): {'mean': 0.278, 'std': 0.056},
( 1, 8): {'mean': 0.566, 'std': 0.113},
( 8, 9): {'mean': 0.110, 'std': 0.022},
( 9, 10): {'mean': 0.398, 'std': 0.080},
(10, 11): {'mean': 0.402, 'std': 0.080},
( 8, 12): {'mean': 0.111, 'std': 0.022},
(12, 13): {'mean': 0.395, 'std': 0.079},
(13, 14): {'mean': 0.403, 'std': 0.081},
( 0, 15): {'mean': 0.053, 'std': 0.011},
( 0, 16): {'mean': 0.056, 'std': 0.011},
(15, 17): {'mean': 0.107, 'std': 0.021},
(16, 18): {'mean': 0.107, 'std': 0.021},
(14, 19): {'mean': 0.180, 'std': 0.036},
(19, 20): {'mean': 0.055, 'std': 0.011},
(14, 21): {'mean': 0.065, 'std': 0.013},
(11, 22): {'mean': 0.169, 'std': 0.034},
(22, 23): {'mean': 0.052, 'std': 0.010},
(11, 24): {'mean': 0.061, 'std': 0.012},
( 7, 26): {'mean': 0.045, 'std': 0.009},
(26, 27): {'mean': 0.042, 'std': 0.008},
(27, 28): {'mean': 0.035, 'std': 0.007},
(28, 29): {'mean': 0.029, 'std': 0.006},
( 7, 30): {'mean': 0.102, 'std': 0.020},
(30, 31): {'mean': 0.040, 'std': 0.008},
(31, 32): {'mean': 0.026, 'std': 0.005},
(32, 33): {'mean': 0.023, 'std': 0.005},
( 7, 34): {'mean': 0.101, 'std': 0.020},
(34, 35): {'mean': 0.043, 'std': 0.009},
(35, 36): {'mean': 0.029, 'std': 0.006},
(36, 37): {'mean': 0.024, 'std': 0.005},
( 7, 38): {'mean': 0.097, 'std': 0.019},
(38, 39): {'mean': 0.041, 'std': 0.008},
(39, 40): {'mean': 0.027, 'std': 0.005},
(40, 41): {'mean': 0.024, 'std': 0.005},
( 7, 42): {'mean': 0.095, 'std': 0.019},
(42, 43): {'mean': 0.033, 'std': 0.007},
(43, 44): {'mean': 0.020, 'std': 0.004},
(44, 45): {'mean': 0.018, 'std': 0.004},
( 4, 47): {'mean': 0.043, 'std': 0.009},
(47, 48): {'mean': 0.041, 'std': 0.008},
(48, 49): {'mean': 0.034, 'std': 0.007},
(49, 50): {'mean': 0.028, 'std': 0.006},
( 4, 51): {'mean': 0.101, 'std': 0.020},
(51, 52): {'mean': 0.041, 'std': 0.008},
(52, 53): {'mean': 0.026, 'std': 0.005},
(53, 54): {'mean': 0.024, 'std': 0.005},
( 4, 55): {'mean': 0.100, 'std': 0.020},
(55, 56): {'mean': 0.044, 'std': 0.009},
(56, 57): {'mean': 0.029, 'std': 0.006},
(57, 58): {'mean': 0.023, 'std': 0.005},
( 4, 59): {'mean': 0.096, 'std': 0.019},
(59, 60): {'mean': 0.040, 'std': 0.008},
(60, 61): {'mean': 0.028, 'std': 0.006},
(61, 62): {'mean': 0.023, 'std': 0.005},
( 4, 63): {'mean': 0.094, 'std': 0.019},
(63, 64): {'mean': 0.032, 'std': 0.006},
(64, 65): {'mean': 0.020, 'std': 0.004},
(65, 66): {'mean': 0.018, 'std': 0.004},
(67, 68): {'mean': 0.012, 'std': 0.002},
(68, 69): {'mean': 0.013, 'std': 0.003},
(69, 70): {'mean': 0.014, 'std': 0.003},
(70, 71): {'mean': 0.012, 'std': 0.002},
(72, 73): {'mean': 0.014, 'std': 0.003},
(73, 74): {'mean': 0.014, 'std': 0.003},
(74, 75): {'mean': 0.015, 'std': 0.003},
(75, 76): {'mean': 0.013, 'std': 0.003},
(77, 78): {'mean': 0.014, 'std': 0.003},
(78, 79): {'mean': 0.014, 'std': 0.003},
(79, 80): {'mean': 0.015, 'std': 0.003},
(81, 82): {'mean': 0.009, 'std': 0.002},
(82, 83): {'mean': 0.010, 'std': 0.002},
(83, 84): {'mean': 0.010, 'std': 0.002},
(84, 85): {'mean': 0.010, 'std': 0.002},
(86, 87): {'mean': 0.009, 'std': 0.002},
(87, 88): {'mean': 0.009, 'std': 0.002},
(88, 89): {'mean': 0.008, 'std': 0.002},
(89, 90): {'mean': 0.008, 'std': 0.002},
(90, 91): {'mean': 0.009, 'std': 0.002},
(86, 91): {'mean': 0.008, 'std': 0.002},
(92, 93): {'mean': 0.009, 'std': 0.002},
(93, 94): {'mean': 0.009, 'std': 0.002},
(94, 95): {'mean': 0.009, 'std': 0.002},
(95, 96): {'mean': 0.009, 'std': 0.002},
(96, 97): {'mean': 0.009, 'std': 0.002},
(92, 97): {'mean': 0.009, 'std': 0.002},
(98, 99): {'mean': 0.016, 'std': 0.003},
(99, 100): {'mean': 0.013, 'std': 0.003},
(100, 101): {'mean': 0.008, 'std': 0.002},
(101, 102): {'mean': 0.008, 'std': 0.002},
(102, 103): {'mean': 0.012, 'std': 0.002},
(103, 104): {'mean': 0.014, 'std': 0.003},
(104, 105): {'mean': 0.015, 'std': 0.003},
(105, 106): {'mean': 0.012, 'std': 0.002},
(106, 107): {'mean': 0.009, 'std': 0.002},
(107, 108): {'mean': 0.009, 'std': 0.002},
(108, 109): {'mean': 0.013, 'std': 0.003},
(98, 109): {'mean': 0.016, 'std': 0.003},
(110, 111): {'mean': 0.021, 'std': 0.004},
(111, 112): {'mean': 0.009, 'std': 0.002},
(112, 113): {'mean': 0.008, 'std': 0.002},
(113, 114): {'mean': 0.019, 'std': 0.004},
(114, 115): {'mean': 0.018, 'std': 0.004},
(115, 116): {'mean': 0.008, 'std': 0.002},
(116, 117): {'mean': 0.009, 'std': 0.002},
(110, 117): {'mean': 0.020, 'std': 0.004},
}
}
face_kintree_without_contour = [[ 0, 1],
[ 1, 2],
[ 2, 3],
[ 3, 4],
[ 5, 6],
[ 6, 7],
[ 7, 8],
[ 8, 9],
[10, 11],
[11, 12],
[12, 13],
[14, 15],
[15, 16],
[16, 17],
[17, 18],
[19, 20],
[20, 21],
[21, 22],
[22, 23],
[23, 24],
[24, 19],
[25, 26],
[26, 27],
[27, 28],
[28, 29],
[29, 30],
[30, 25],
[31, 32],
[32, 33],
[33, 34],
[34, 35],
[35, 36],
[36, 37],
[37, 38],
[38, 39],
[39, 40],
[40, 41],
[41, 42],
[42, 31],
[43, 44],
[44, 45],
[45, 46],
[46, 47],
[47, 48],
[48, 49],
[49, 50],
[50, 43]]
CONFIG['face'] = {'kintree':[ [0,1],[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,9],[9,10],[10,11],[11,12],[12,13],[13,14],[14,15],[15,16], #outline (ignored)
[17,18],[18,19],[19,20],[20,21], #right eyebrow
@ -176,6 +551,7 @@ def getKintree(name='total'):
return kintree
CONFIG['total'] = {}
CONFIG['total']['kintree'] = getKintree('total')
CONFIG['total']['nJoints'] = 137
COCO17_IN_BODY25 = [0,16,15,18,17,5,2,6,3,7,4,12,9,13,10,14,11]
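# usage sketch: reorder (25, 3) body25 keypoints into COCO-17 order
#   coco17 = keypoints_body25[COCO17_IN_BODY25, :]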

View File

@ -2,7 +2,7 @@
@ Date: 2021-01-12 17:12:50
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 17:14:34
@ LastEditTime: 2021-01-21 14:51:45
@ FilePath: /EasyMocap/code/dataset/mv1pmf.py
'''
import os
@ -15,10 +15,10 @@ from .base import MVBase
class MV1PMF(MVBase):
def __init__(self, root, cams=[], pid=0, out=None, config={},
image_root='images', annot_root='annots', add_hand_face=True,
image_root='images', annot_root='annots', mode='body15',
undis=True, no_img=False) -> None:
super().__init__(root, cams, out, config, image_root, annot_root,
add_hand_face, undis, no_img)
mode, undis, no_img)
self.pid = pid
def write_keypoints3d(self, keypoints3d, nf):
@ -30,20 +30,21 @@ class MV1PMF(MVBase):
result.update(params)
self.writer.write_smpl([result], nf)
def vis_smpl(self, vertices, faces, images, nf, sub_vis):
def vis_smpl(self, vertices, faces, images, nf, sub_vis=[],
mode='smpl', extra_data=[], add_back=True):
render_data = {}
if len(vertices.shape) == 3:
vertices = vertices[0]
pid = self.pid
render_data[pid] = {'vertices': vertices, 'faces': faces,
'vid': pid, 'name': '{}_{}'.format(nf, pid)}
'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
cameras = {'K': [], 'R':[], 'T':[]}
if len(sub_vis) == 0:
sub_vis = self.cams
for key in cameras.keys():
cameras[key] = [self.cameras[cam][key] for cam in sub_vis]
images = [images[self.cams.index(cam)] for cam in sub_vis]
self.writer.vis_smpl(render_data, nf, images, cameras)
self.writer.vis_smpl(render_data, nf, images, cameras, mode, add_back=add_back)
def vis_detections(self, images, annots, nf, to_img=True, sub_vis=[]):
lDetections = []
@ -87,7 +88,10 @@ class MV1PMF(MVBase):
keypoints = data['keypoints']
else:
print('not found pid {} in {}, {}'.format(self.pid, index, nv))
keypoints = np.zeros((25, 3))
keypoints = np.zeros((self.config['nJoints'], 3)) # match the configured keypoint count
bbox = np.array([0, 0, 100., 100., 0.])
annots['bbox'].append(bbox)
annots['keypoints'].append(keypoints)

View File

@ -2,15 +2,17 @@
@ Date: 2021-01-12 17:08:25
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 17:08:05
@ FilePath: /EasyMocap/code/demo_mv1pmf_skel.py
@ LastEditTime: 2021-01-24 20:57:35
@ FilePath: /EasyMocapRelease/code/demo_mv1pmf_skel.py
'''
# show skeleton and reprojection
from dataset.mv1pmf import MV1PMF
from dataset.config import CONFIG
from mytools.reconstruction import simple_recon_person, projectN3
# from mytools.robust_triangulate import robust_triangulate
from tqdm import tqdm
import numpy as np
from smplmodel import check_keypoints
def smooth_skeleton(skeleton):
# nFrames, nJoints, 4: [[(x, y, z, c)]]
@ -32,10 +34,37 @@ def smooth_skeleton(skeleton):
skeleton[span:nFrames-span, :, :3] = skel
return skeleton
def get_limb_length(config, keypoints):
skeleton = {}
for i, j_ in config['kintree']:
# hand root joints (25: left, 46: right) are remapped to the wrists (7, 4)
if j_ == 25:
j = 7
elif j_ == 46:
j = 4
else:
j = j_
key = tuple(sorted([i, j]))
length, confs = 0, 0
for nf in range(keypoints.shape[0]):
limb_length = np.linalg.norm(keypoints[nf, i, :3] - keypoints[nf, j, :3])
conf = keypoints[nf, [i, j], -1].min()
length += limb_length * conf
confs += conf
limb_length = length/confs
skeleton[key] = {'mean': limb_length, 'std': limb_length*0.2}
print('{')
for key, val in skeleton.items():
res = ' ({:2d}, {:2d}): {{\'mean\': {:.3f}, \'std\': {:.3f}}}, '.format(*key, val['mean'], val['std'])
if 'joint_names' in config.keys():
res += '# {:9s}->{:9s}'.format(config['joint_names'][key[0]], config['joint_names'][key[1]])
print(res)
print('}')
def mv1pmf_skel(path, sub, out, mode, args):
MIN_CONF_THRES = 0.5
MIN_CONF_THRES = 0.3
no_img = not (args.vis_det or args.vis_repro)
dataset = MV1PMF(path, cams=sub, config=CONFIG[mode], add_hand_face=args.add_hand_face,
config = CONFIG[mode]
dataset = MV1PMF(path, cams=sub, config=config, mode=mode,
undis=args.undis, no_img=no_img, out=out)
kp3ds = []
start, end = args.start, min(args.end, len(dataset))
@ -43,7 +72,9 @@ def mv1pmf_skel(path, sub, out, mode, args):
images, annots = dataset[nf]
conf = annots['keypoints'][..., -1]
conf[conf < MIN_CONF_THRES] = 0
keypoints3d, _, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall, ret_repro=True)
annots['keypoints'] = check_keypoints(annots['keypoints'], WEIGHT_DEBUFF=1)
keypoints3d, _, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall, config=config, ret_repro=True)
# keypoints3d, _, kpts_repro = robust_triangulate(annots['keypoints'], dataset.Pall, config=config, ret_repro=True)
kp3ds.append(keypoints3d)
if args.vis_det:
dataset.vis_detections(images, annots, nf, sub_vis=args.sub_vis)
@ -51,32 +82,16 @@ def mv1pmf_skel(path, sub, out, mode, args):
dataset.vis_repro(images, annots, kpts_repro, nf, sub_vis=args.sub_vis)
# smooth the skeleton
kp3ds = np.stack(kp3ds)
if args.smooth:
kp3ds = smooth_skeleton(kp3ds)
# compute the limb lengths
# get_limb_length(config, kp3ds)
# if args.smooth:
# kp3ds = smooth_skeleton(kp3ds)
for nf in tqdm(range(kp3ds.shape[0]), desc='dump'):
dataset.write_keypoints3d(kp3ds[nf], nf + start)
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser('multi_view one_person multi_frame skel')
parser.add_argument('path', type=str)
parser.add_argument('--out', type=str, default=None)
parser.add_argument('--sub', type=str, nargs='+', default=[],
help='the sub folder lists when in video mode')
parser.add_argument('--start', type=int, default=0,
help='frame start')
parser.add_argument('--end', type=int, default=10000,
help='frame end')
parser.add_argument('--step', type=int, default=1,
help='frame step')
parser.add_argument('--body', type=str, default='body25', choices=['body15', 'body25', 'total'])
parser.add_argument('--undis', action='store_true')
parser.add_argument('--add_hand_face', action='store_true')
parser.add_argument('--smooth', action='store_true')
parser.add_argument('--vis_det', action='store_true')
parser.add_argument('--vis_repro', action='store_true')
parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
help='the sub folder lists for visualization')
from mytools.cmd_loader import load_parser
parser = load_parser()
args = parser.parse_args()
mv1pmf_skel(args.path, args.sub, args.out, args.body, args)

View File

@ -2,106 +2,141 @@
@ Date: 2021-01-12 17:08:25
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 20:49:25
@ FilePath: /EasyMocap/code/demo_mv1pmf_smpl.py
@ LastEditTime: 2021-01-24 22:26:09
@ FilePath: /EasyMocapRelease/code/demo_mv1pmf_smpl.py
'''
# show skeleton and reprojection
import pyrender # first import the pyrender
from pyfitting.optimize_simple import optimizeShape, optimizePose
from dataset.mv1pmf import MV1PMF
from dataset.config import CONFIG
from mytools.reconstruction import simple_recon_person, projectN3
from smplmodel import select_nf, init_params, Config
from mytools.utils import Timer
from smplmodel import select_nf, init_params, Config, load_model, check_keypoints
from os.path import join
from tqdm import tqdm
import numpy as np
def load_model(use_cuda=True):
# prepare SMPL model
import torch
if use_cuda:
device = torch.device('cuda')
else:
device = torch.device('cpu')
from smplmodel import SMPLlayer
body_model = SMPLlayer('data/smplx/smpl', gender='neutral', device=device,
regressor_path='data/smplx/J_regressor_body25.npy')
body_model.to(device)
return body_model
def load_weight_shape():
weight = {'s3d': 1., 'reg_shape': 5e-3}
weight = {'s3d': 1., 'reg_shapes': 5e-3}
return weight
def load_weight_pose():
weight = {
'k3d': 1., 'reg_poses_zero': 1e-2,
'smooth_Rh': 1e-2, 'smooth_Th': 1e-2, 'smooth_poses': 1e-2
}
def load_weight_pose(model):
if model == 'smpl':
weight = {
'k3d': 1., 'reg_poses_zero': 1e-2,
'reg_expression': 1e-1,
'smooth_joints': 1e-5
# 'smooth_Rh': 1e-1, 'smooth_Th': 1e-1, 'smooth_poses': 1e-1, 'smooth_hands': 1e-2
}
elif model == 'smplh':
weight = {
'k3d': 1., 'reg_poses_zero': 1e-3,
'smooth_body': 1e-2, 'smooth_hand': 1e-2
}
elif model == 'smplx':
weight = {
'k3d': 1., 'reg_poses_zero': 1e-3,
'reg_expression': 1e-2,
'smooth_body': 1e-2, 'smooth_hand': 1e-2
# 'smooth_Rh': 1e-1, 'smooth_Th': 1e-1, 'smooth_poses': 1e-1, 'smooth_hands': 1e-2
}
else:
raise NotImplementedError
return weight
def print_mean_skel(mode):
with Timer('Loading {}, {}'.format(args.model, args.gender)):
body_model = load_model(args.gender, model_type=args.model)
params_init = init_params(nFrames=1, model_type=args.model)
skel = body_model(return_verts=False, return_tensor=False, **params_init)[0]
# skel: nJoints, 3
config = CONFIG[mode]
skeleton = {}
for i, j_ in config['kintree']:
# hand root joints (25: left, 46: right) are remapped to the wrists (7, 4)
if j_ == 25:
j = 7
elif j_ == 46:
j = 4
else:
j = j_
key = tuple(sorted([i, j]))
limb_length = np.linalg.norm(skel[i] - skel[j])
skeleton[key] = {'mean': limb_length, 'std': limb_length*0.2}
print('{')
for key, val in skeleton.items():
res = ' ({:2d}, {:2d}): {{\'mean\': {:.3f}, \'std\': {:.3f}}}, '.format(*key, val['mean'], val['std'])
if 'joint_names' in config.keys():
res += '# {:9s}->{:9s}'.format(config['joint_names'][key[0]], config['joint_names'][key[1]])
print(res)
print('}')
def mv1pmf_smpl(path, sub, out, mode, args):
config = CONFIG[mode]
MIN_CONF_THRES = 0.5
no_img = False
dataset = MV1PMF(path, cams=sub, config=CONFIG[mode], add_hand_face=False,
no_img = True
dataset = MV1PMF(path, cams=sub, config=CONFIG[mode], mode=args.body,
undis=args.undis, no_img=no_img, out=out)
if args.skel is None:
from demo_mv1pmf_skel import mv1pmf_skel
mv1pmf_skel(path, sub, out, mode, args)
args.skel = join(out, 'keypoints3d')
dataset.skel_path = args.skel
kp3ds = []
start, end = args.start, min(args.end, len(dataset))
dataset.no_img = True
annots_all = []
for nf in tqdm(range(start, end), desc='triangulation'):
for nf in tqdm(range(start, end), desc='loading'):
images, annots = dataset[nf]
conf = annots['keypoints'][..., -1]
conf[conf < MIN_CONF_THRES] = 0
keypoints3d, _, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall, ret_repro=True)
kp3ds.append(keypoints3d)
infos = dataset.read_skel(nf)
kp3ds.append(infos[0]['keypoints3d'])
annots_all.append(annots)
# smooth the skeleton
kp3ds = np.stack(kp3ds)
kp3ds = check_keypoints(kp3ds, 1)
# optimize the human shape
body_model = load_model()
params_init = init_params(nFrames=1)
with Timer('Loading {}, {}'.format(args.model, args.gender)):
body_model = load_model(args.gender, model_type=args.model)
params_init = init_params(nFrames=1, model_type=args.model)
weight = load_weight_shape()
params_shape = optimizeShape(body_model, params_init, kp3ds, weight_loss=weight, kintree=config['kintree'])
if args.model in ['smpl', 'smplh', 'smplx']:
# for SMPL-family models, optimize the shape using only the first 14 body limbs
params_shape = optimizeShape(body_model, params_init, kp3ds, weight_loss=weight, kintree=CONFIG['body15']['kintree'])
else:
params_shape = optimizeShape(body_model, params_init, kp3ds, weight_loss=weight, kintree=config['kintree'])
# optimize 3D pose
cfg = Config()
params = init_params(nFrames=kp3ds.shape[0])
cfg.VERBOSE = args.verbose
cfg.MODEL = args.model
params = init_params(nFrames=kp3ds.shape[0], model_type=args.model)
params['shapes'] = params_shape['shapes'].copy()
weight = load_weight_pose()
cfg.OPT_R = True
cfg.OPT_T = True
params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
cfg.OPT_POSE = True
params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
# optimize 2D pose
# render the mesh
weight = load_weight_pose(args.model)
with Timer('Optimize global RT'):
cfg.OPT_R = True
cfg.OPT_T = True
params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
with Timer('Optimize Pose/{} frames'.format(end-start)):
cfg.OPT_POSE = True
params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
if args.model in ['smplh', 'smplx']:
cfg.OPT_HAND = True
params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
if args.model == 'smplx':
cfg.OPT_EXPR = True
params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
# TODO:optimize 2D pose
# write out the results
dataset.no_img = not args.vis_smpl
for nf in tqdm(range(start, end), desc='render'):
images, annots = dataset[nf]
dataset.write_smpl(select_nf(params, nf-start), nf)
if args.vis_smpl:
vertices = body_model(return_verts=True, return_tensor=False, **select_nf(params, nf-start))
dataset.vis_smpl(vertices=vertices, faces=body_model.faces, images=images, nf=nf, sub_vis=args.sub_vis)
dataset.vis_smpl(vertices=vertices, faces=body_model.faces, images=images, nf=nf, sub_vis=args.sub_vis, add_back=True)
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser('multi_view one_person multi_frame skel')
parser.add_argument('path', type=str)
parser.add_argument('--out', type=str, default=None)
parser.add_argument('--sub', type=str, nargs='+', default=[],
help='the sub folder lists when in video mode')
parser.add_argument('--start', type=int, default=0,
help='frame start')
parser.add_argument('--end', type=int, default=10000,
help='frame end')
parser.add_argument('--step', type=int, default=1,
help='frame step')
parser.add_argument('--body', type=str, default='body15', choices=['body15', 'body25', 'total'])
parser.add_argument('--undis', action='store_true')
parser.add_argument('--add_hand_face', action='store_true')
from mytools.cmd_loader import load_parser
parser = load_parser()
parser.add_argument('--skel', type=str, default=None,
help='path to keypoints3d')
parser.add_argument('--vis_smpl', action='store_true')
parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
help='the sub folder lists for visualization')
args = parser.parse_args()
# print_mean_skel(args.body)
mv1pmf_smpl(args.path, args.sub, args.out, args.body, args)

View File

@ -228,3 +228,19 @@ def filterKeypoints(keypoints, thres = 0.1, min_width=40, \
add_list.append(ik)
keypoints = keypoints[add_list, :, :]
return keypoints, add_list
def get_fundamental_matrix(cameras, basenames):
skew_op = lambda x: np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])
fundamental_op = lambda K_0, R_0, T_0, K_1, R_1, T_1: np.linalg.inv(K_0).T @ (
R_0 @ R_1.T) @ K_1.T @ skew_op(K_1 @ R_1 @ R_0.T @ (T_0 - R_0 @ R_1.T @ T_1))
fundamental_RT_op = lambda K_0, RT_0, K_1, RT_1: fundamental_op (K_0, RT_0[:, :3], RT_0[:, 3], K_1,
RT_1[:, :3], RT_1[:, 3] )
F = {(icam, jcam): np.zeros((3, 3)) for jcam in basenames for icam in basenames} # pairwise fundamental matrices
for icam in basenames:
for jcam in basenames:
F[(icam, jcam)] += fundamental_RT_op(cameras[icam]['K'], cameras[icam]['RT'], cameras[jcam]['K'], cameras[jcam]['RT'])
if F[(icam, jcam)].sum() == 0:
F[(icam, jcam)] += 1e-12 # to avoid nan
return F
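if __name__ == '__main__':
    # minimal self-check with two synthetic cameras (hypothetical data):
    # identity intrinsics and rotations, the second camera shifted along x
    cams = {
        '1': {'K': np.eye(3), 'RT': np.hstack([np.eye(3), np.zeros((3, 1))])},
        '2': {'K': np.eye(3), 'RT': np.hstack([np.eye(3), np.array([[1.], [0.], [0.]])])},
    }
    F = get_fundamental_matrix(cams, ['1', '2'])
    print(F[('1', '2')])  # non-zero 3x3; F[('1', '1')] degenerates to ~1e-12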

View File

@ -0,0 +1,44 @@
'''
@ Date: 2021-01-15 12:09:27
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-24 20:57:22
@ FilePath: /EasyMocapRelease/code/mytools/cmd_loader.py
'''
import argparse
def load_parser():
parser = argparse.ArgumentParser('EasyMocap command line tools')
parser.add_argument('path', type=str)
parser.add_argument('--out', type=str, default=None)
parser.add_argument('--annot', type=str, default=None)
parser.add_argument('--sub', type=str, nargs='+', default=[],
help='the sub folder lists when in video mode')
parser.add_argument('--start', type=int, default=0,
help='frame start')
parser.add_argument('--end', type=int, default=10000,
help='frame end')
parser.add_argument('--step', type=int, default=1,
help='frame step')
#
# keypoints and body model
#
parser.add_argument('--body', type=str, default='body25', choices=['body15', 'body25', 'bodyhand', 'bodyhandface', 'total'])
parser.add_argument('--model', type=str, default='smpl', choices=['smpl', 'smplh', 'smplx', 'mano'])
parser.add_argument('--gender', type=str, default='neutral',
choices=['neutral', 'male', 'female'])
#
# visualization part
#
parser.add_argument('--vis_det', action='store_true')
parser.add_argument('--vis_repro', action='store_true')
parser.add_argument('--undis', action='store_true')
parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
help='the sub folder lists for visualization')
#
# debug
#
parser.add_argument('--verbose', action='store_true')
parser.add_argument('--debug', action='store_true')
return parser
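if __name__ == '__main__':
    # usage sketch: scripts extend this shared parser with their own flags,
    # as demo_mv1pmf_smpl.py does with --skel and --vis_smpl
    parser = load_parser()
    parser.add_argument('--vis_smpl', action='store_true')
    args = parser.parse_args(['/path/to/data', '--body', 'bodyhandface', '--model', 'smplx'])
    print(args.body, args.model, args.gender)  # bodyhandface smplx neutral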

View File

@ -2,8 +2,8 @@
* @ Date: 2020-09-14 11:01:52
* @ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-13 11:30:38
@ FilePath: /EasyMocap/code/mytools/reconstruction.py
@ LastEditTime: 2021-01-24 22:28:09
@ FilePath: /EasyMocapRelease/code/mytools/reconstruction.py
'''
import numpy as np
@ -45,13 +45,9 @@ def simple_triangulate(kpts, Pall):
A[i*2 + 1, :] = kpts[i, 2]*(kpts[i, 1]*P[2:3,:] - P[1:2,:])
result[:3] = solveZ(A)
return result
# kpts_proj = projectN3(result, Pall)
# repro_error = simple_reprojection_error(kpts, kpts_proj)
# return kpts3d, conf/nViews, repro_error/nViews
# else:
# return kpts3d, conf
def simple_recon_person(keypoints_use, Puse, ret_repro=False, max_error=100):
def simple_recon_person(keypoints_use, Puse, config=None, ret_repro=False):
eps = 0.01
nJoints = keypoints_use[0].shape[0]
if isinstance(keypoints_use, list):
keypoints_use = np.stack(keypoints_use)
@ -61,23 +57,33 @@ def simple_recon_person(keypoints_use, Puse, ret_repro=False, max_error=100):
if (keypoints[:, 2] > 0.01).sum() < 2:
continue
out[nj] = simple_triangulate(keypoints, Puse)
if config is not None:
# remove implausible limbs using the limb-length prior
for (i, j), mean_std in config['skeleton'].items():
ii, jj = min(i, j), max(i, j)
if out[ii, -1] < eps:
out[jj, -1] = 0
if out[jj, -1] < eps:
continue
length = np.linalg.norm(out[ii, :3] - out[jj, :3])
if abs(length - mean_std['mean'])/(3*mean_std['std']) > 1: # the length deviates by more than 3 sigma from the prior
# print((i, j), length, mean_std)
out[jj, :] = 0
# compute the reprojection error
kpts_repro = projectN3(out, Puse)
square_diff = (keypoints_use[:, :, :2] - kpts_repro[:, :, :2])**2
conf = (out[None, :, -1] > 0.01) * (keypoints_use[:, :, 2] > 0.01)
# conf = (out[None, :, -1] > 0.01) * (keypoints_use[:, :, 2] > 0.01)
conf = np.repeat(out[None, :, -1:], len(Puse), 0)
kpts_repro = np.concatenate((kpts_repro, conf), axis=2)
if conf.sum() < 3: # at least 3 valid joints are required
repro_error = 1e3
else:
repro_error_joint = np.sqrt(square_diff.sum(axis=2))*conf
num_valid_view = conf.sum(axis=0)
# joints seen by too few views are forced to invisible
repro_error_joint[:, num_valid_view==0] = max_error * 2
num_valid_view[num_valid_view==0] = 1
repro_error_joint_ = repro_error_joint.sum(axis=0)/num_valid_view
# print(repro_error_joint_)
not_valid = np.where(repro_error_joint_>max_error)[0]
out[not_valid, -1] = 0
# (nViews, nJoints): reprojection error for each joint in each view
repro_error_joint = np.sqrt(square_diff.sum(axis=2, keepdims=True))*conf
# remove the not valid joints
# remove the bad views
repro_error = repro_error_joint.sum()/conf.sum()
if ret_repro:
return out, repro_error, kpts_repro
return out, repro_error
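# usage sketch (mirrors demo_mv1pmf_skel.py): triangulate one frame from all views,
# using the limb-length priors in CONFIG to drop implausible joints:
#   keypoints3d, error, kpts_repro = simple_recon_person(
#       annots['keypoints'], dataset.Pall, config=CONFIG['body25'], ret_repro=True)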

code/mytools/utils.py Normal file
View File

@ -0,0 +1,21 @@
'''
@ Date: 2021-01-15 11:12:00
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-15 11:19:55
@ FilePath: /EasyMocap/code/mytools/utils.py
'''
import time
class Timer:
def __init__(self, name, silent=False):
self.name = name
self.silent = silent
def __enter__(self):
self.start = time.time()
def __exit__(self, exc_type, exc_value, exc_tb):
end = time.time()
if not self.silent:
print('-> [{}]: {:.2f}s'.format(self.name, end-self.start))
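if __name__ == '__main__':
    # usage sketch: time any block of code
    with Timer('sleep'):
        time.sleep(0.1)
    # prints: -> [sleep]: 0.10s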

View File

@ -2,7 +2,7 @@
@ Date: 2020-11-28 17:23:04
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 17:11:51
@ LastEditTime: 2021-01-21 15:16:52
@ FilePath: /EasyMocap/code/mytools/vis_base.py
'''
import cv2
@ -73,12 +73,13 @@ def plot_keypoints(img, points, pid, config, vis_conf=False, use_limb_color=True
col = get_rgb(config['colors'][ii])
else:
col = get_rgb(pid)
if pt1[2] > 0.01 and pt2[2] > 0.01:
if pt1[-1] > 0.01 and pt2[-1] > 0.01:
image = cv2.line(
img, (int(pt1[0]+0.5), int(pt1[1]+0.5)), (int(pt2[0]+0.5), int(pt2[1]+0.5)),
col, lw)
for i in range(len(points)):
x, y, c = points[i]
x, y = points[i][0], points[i][1]
c = points[i][-1]
if c > 0.01:
col = get_rgb(pid)
cv2.circle(img, (int(x+0.5), int(y+0.5)), lw*2, col, -1)
@ -98,9 +99,11 @@ def merge(images, row=-1, col=-1, resize=False, ret_range=False):
images = [images[i] for i in [0, 1, 2, 3, 7, 6, 5, 4]]
if len(images) == 7:
row, col = 3, 3
elif len(images) == 2:
row, col = 2, 1
height = images[0].shape[0]
width = images[0].shape[1]
ret_img = np.zeros((height * row, width * col, 3), dtype=np.uint8) + 255
ret_img = np.zeros((height * row, width * col, images[0].shape[2]), dtype=np.uint8) + 255
ranges = []
for i in range(row):
for j in range(col):


@ -2,51 +2,86 @@
@ Date: 2020-11-19 17:46:04
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 15:02:39
@ LastEditTime: 2021-01-22 16:51:55
@ FilePath: /EasyMocap/code/pyfitting/lossfactory.py
'''
import torch
from .operation import projection, batch_rodrigues
def ReprojectionLoss(keypoints3d, keypoints2d, K, Rc, Tc, inv_bbox_sizes):
def ReprojectionLoss(keypoints3d, keypoints2d, K, Rc, Tc, inv_bbox_sizes, norm='l2'):
img_points = projection(keypoints3d, K, Rc, Tc)
residual = (img_points - keypoints2d[:, :, :2]) * keypoints2d[:, :, 2:3]
squared_res = (residual ** 2) * inv_bbox_sizes
residual = (img_points - keypoints2d[:, :, :2]) * keypoints2d[:, :, -1:]
# squared_res: (nFrames, nJoints, 2)
if norm == 'l2':
squared_res = (residual ** 2) * inv_bbox_sizes
elif norm == 'l1':
squared_res = torch.abs(residual) * inv_bbox_sizes
else:
import ipdb; ipdb.set_trace()
return torch.sum(squared_res)
class SMPLAngleLoss:
def __init__(self, keypoints):
use_feet = keypoints[:, [19, 20, 21, 22, 23, 24], -1].sum() > 0.1
use_head = keypoints[:, [15, 16, 17, 18], -1].sum() > 0.1
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 13, 14, 20, 21, 22, 23]
def __init__(self, keypoints, model_type='smpl'):
if keypoints.shape[1] <= 15:
use_feet = False
use_head = False
else:
use_feet = keypoints[:, [19, 20, 21, 22, 23, 24], -1].sum() > 0.1
use_head = keypoints[:, [15, 16, 17, 18], -1].sum() > 0.1
if model_type == 'smpl':
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14, 20, 21, 22, 23]
elif model_type == 'smplh':
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14]
elif model_type == 'smplx':
SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14]
else:
raise NotImplementedError
if not use_feet:
SMPL_JOINT_ZERO_IDX.extend([7, 8])
if not use_head:
SMPL_JOINT_ZERO_IDX.extend([12, 15])
SMPL_POSES_ZERO_IDX = [[j for j in range(3*i, 3*i+3)] for i in SMPL_JOINT_ZERO_IDX]
SMPL_POSES_ZERO_IDX = sum(SMPL_POSES_ZERO_IDX, [])
# SMPL_POSES_ZERO_IDX.extend([36, 37, 38, 45, 46, 47])
self.idx = SMPL_POSES_ZERO_IDX
def loss(self, poses):
return torch.sum(torch.abs(poses[:, self.idx]))
def SmoothLoss(body_params, keys, weight_loss, span=4):
def SmoothLoss(body_params, keys, weight_loss, span=4, model_type='smpl'):
spans = [i for i in range(1, span)]
span_weights = {i:1/i for i in range(1, span)}
span_weights = {key: i/sum(span_weights) for key, i in span_weights.items()}
loss_dict = {}
nFrames = body_params['poses'].shape[0]
for key in ['poses', 'Th']:
nPoses = body_params['poses'].shape[1]
if model_type == 'smplh' or model_type == 'smplx':
nPoses = 66
for key in ['poses', 'Th', 'poses_hand', 'expression']:
if key not in keys:
continue
k = 'smooth_' + key
if k in weight_loss.keys() and weight_loss[k] > 0.:
loss_dict[k] = 0.
for span in spans:
val = torch.sum((body_params[key][span:, :] - body_params[key][:nFrames-span, :])**2)
if key == 'poses_hand':
val = torch.sum((body_params['poses'][span:, 66:] - body_params['poses'][:nFrames-span, 66:])**2)
else:
val = torch.sum((body_params[key][span:, :nPoses] - body_params[key][:nFrames-span, :nPoses])**2)
loss_dict[k] += span_weights[span] * val
k = 'smooth_' + key + '_l1'
if k in weight_loss.keys() and weight_loss[k] > 0.:
loss_dict[k] = 0.
for span in spans:
if key == 'poses_hand':
val = torch.sum((body_params['poses'][span:, 66:] - body_params['poses'][:nFrames-span, 66:]).abs())
else:
val = torch.sum((body_params[key][span:, :nPoses] - body_params[key][:nFrames-span, :nPoses]).abs())
loss_dict[k] += span_weights[span] * val
# smooth rotation
rot = batch_rodrigues(body_params['Rh'])
key, k = 'Rh', 'smooth_Rh'
if k in weight_loss.keys() and weight_loss[k] > 0.:
if key in keys and k in weight_loss.keys() and weight_loss[k] > 0.:
loss_dict[k] = 0.
for span in spans:
val = torch.sum((rot[span:, :] - rot[:nFrames-span, :])**2)
@ -55,10 +90,24 @@ def SmoothLoss(body_params, keys, weight_loss, span=4):
def RegularizationLoss(body_params, body_params_init, weight_loss):
loss_dict = {}
for key in ['poses', 'shapes', 'Th']:
if 'init_'+key in weight_loss.keys() and weight_loss['init_'+key] > 0.:
for key in ['poses', 'shapes', 'Th', 'hands', 'head', 'expression']:
if 'init_'+key in weight_loss.keys() and weight_loss['init_'+key] > 0.:
if key == 'poses':
loss_dict['init_'+key] = torch.sum((body_params[key][:, :66] - body_params_init[key][:, :66])**2)
elif key == 'hands':
loss_dict['init_'+key] = torch.sum((body_params['poses'][: , 66:66+12] - body_params_init['poses'][:, 66:66+12])**2)
elif key == 'head':
loss_dict['init_'+key] = torch.sum((body_params['poses'][: , 78:78+9] - body_params_init['poses'][:, 78:78+9])**2)
elif key in body_params.keys():
loss_dict['init_'+key] = torch.sum((body_params[key] - body_params_init[key])**2)
for key in ['poses', 'shapes']:
for key in ['poses', 'shapes', 'hands', 'head', 'expression']:
if 'reg_'+key in weight_loss.keys() and weight_loss['reg_'+key] > 0.:
loss_dict['reg_'+key] = torch.sum((body_params[key])**2)
if key == 'poses':
loss_dict['reg_'+key] = torch.sum((body_params[key][:, :66])**2)
elif key == 'hands':
loss_dict['reg_'+key] = torch.sum((body_params['poses'][: , 66:66+12])**2)
elif key == 'head':
loss_dict['reg_'+key] = torch.sum((body_params['poses'][: , 78:78+9])**2)
elif key in body_params.keys():
loss_dict['reg_'+key] = torch.sum((body_params[key])**2)
return loss_dict
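The hard-coded slices in `RegularizationLoss` (`66:78` for hands, `78:87` for head) follow the flattened SMPL-X pose layout used throughout this commit: 66 body values (22 joints x 3), 6 hand-PCA coefficients per hand, then 9 head values. A short sanity sketch of those offsets, read off the code above:

```python
import torch

poses = torch.zeros(1, 66 + 12 + 9)      # SMPL-X with 6 PCA comps per hand: 87 dims
body, hands, head = poses[:, :66], poses[:, 66:78], poses[:, 78:87]
assert hands.shape[-1] == 12             # 6 left + 6 right, as in 'reg_hands'
assert head.shape[-1] == 9               # jaw + two eyes, as in 'reg_head'
```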


@ -2,7 +2,7 @@
@ Date: 2020-11-19 11:39:45
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2020-11-19 11:50:20
@ LastEditTime: 2021-01-20 15:06:28
@ FilePath: /EasyMocap/code/pyfitting/operation.py
'''
import torch
@ -47,12 +47,18 @@ def projection(points3d, camera_intri, R=None, T=None, distance=None):
points3d {Tensor} -- (bn, N, 3)
camera_intri {Tensor} -- (bn, 3, 3)
distance {Tensor} -- (bn, 1, 1)
R: bn, 3, 3
T: bn, 3, 1
Returns:
points2d -- (bn, N, 2)
"""
if R is not None:
Rt = torch.transpose(R, 1, 2)
points3d = torch.matmul(points3d, Rt) + T
if T.shape[-1] == 1:
Tt = torch.transpose(T, 1, 2)
points3d = torch.matmul(points3d, Rt) + Tt
else:
points3d = torch.matmul(points3d, Rt) + T
if distance is None:
img_points = torch.div(points3d[:, :, :2],

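The `projection` fix above accepts translations stored either as `(bn, 3, 1)` columns or `(bn, 1, 3)` rows; with row-vector points the camera transform is `X @ R^T + T_row`. The same convention in a standalone numpy sketch:

```python
import numpy as np

bn, N = 2, 25
points3d = np.random.rand(bn, N, 3)
R = np.tile(np.eye(3), (bn, 1, 1))        # (bn, 3, 3)
T = np.random.rand(bn, 3, 1)              # column form, as calibration often stores it

Tt = np.transpose(T, (0, 2, 1))           # -> (bn, 1, 3), broadcasts over the N points
cam = points3d @ np.transpose(R, (0, 2, 1)) + Tt
assert cam.shape == (bn, N, 3)
```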

@ -2,8 +2,8 @@
@ Date: 2020-11-19 10:49:26
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 20:19:34
@ FilePath: /EasyMocap/code/pyfitting/optimize_simple.py
@ LastEditTime: 2021-01-24 21:29:12
@ FilePath: /EasyMocapRelease/code/pyfitting/optimize_simple.py
'''
import numpy as np
import torch
@ -213,6 +213,7 @@ def optimizeShape(body_model, body_params, keypoints3d,
limb_length = torch.Tensor(limb_length).to(device)
limb_conf = torch.Tensor(limb_conf).to(device)
body_params = {key:torch.Tensor(val).to(device) for key, val in body_params.items()}
body_params_init = {key:val.clone() for key, val in body_params.items()}
opt_params = [body_params['shapes']]
grad_require(opt_params, True)
optimizer = LBFGS(
@ -226,14 +227,16 @@ def optimizeShape(body_model, body_params, keypoints3d,
dst = keypoints3d[:, kintree[:, 1], :3]
direct_est = (dst - src).detach()
direct_norm = torch.norm(direct_est, dim=2, keepdim=True)
direct_normalized = direct_est/direct_norm
direct_normalized = direct_est/(direct_norm + 1e-4)
err = dst - src - direct_normalized * limb_length
loss_dict = {
's3d': torch.sum(err**2*limb_conf)/nFrames,
'reg_shape': torch.sum(body_params['shapes']**2)}
'reg_shapes': torch.sum(body_params['shapes']**2)}
if 'init_shape' in weight_loss.keys():
loss_dict['init_shape'] = torch.sum((body_params['shapes'] - body_params_init['shapes'])**2)
# fittingLog.step(loss_dict, weight_loss)
if verbose:
print(' '.join([key + ' %f'%(loss_dict[key].item()*weight_loss[key])
print(' '.join([key + ' %.3f'%(loss_dict[key].item()*weight_loss[key])
for key in loss_dict.keys() if weight_loss[key]>0]))
loss = sum([loss_dict[key]*weight_loss[key]
for key in loss_dict.keys()])
@ -255,6 +258,9 @@ def optimizeShape(body_model, body_params, keypoints3d,
body_params = {key:val.detach().cpu().numpy() for key, val in body_params.items()}
return body_params
N_BODY = 25
N_HAND = 21
def optimizePose(body_model, body_params, keypoints3d,
weight_loss, kintree, cfg=None):
""" simple function for optimizing model pose given 3d keypoints
@ -268,22 +274,16 @@ def optimizePose(body_model, body_params, keypoints3d,
cfg (Config): Config Node controling running mode
"""
device = body_model.device
model_type = body_model.model_type
# compute the length of each limb
kintree = np.array(kintree, dtype=np.int)
nFrames = keypoints3d.shape[0]
# limb_length: nFrames, nLimbs, 1
limb = keypoints3d[:, kintree[:, 1], :3] - keypoints3d[:, kintree[:, 0], :3]
limb_length = np.linalg.norm(limb, axis=2, keepdims=True)
# conf: nFrames, nLimbs, 1
limb_conf = np.minimum(keypoints3d[:, kintree[:, 1], 3:], keypoints3d[:, kintree[:, 0], 3:])
limb_dir = limb/limb_length
nJoints = keypoints3d.shape[1]
keypoints3d = torch.Tensor(keypoints3d).to(device)
limb_dir = torch.Tensor(limb_dir).to(device).unsqueeze(2)
limb_conf = torch.Tensor(limb_conf).to(device)
angle_prior = SMPLAngleLoss(keypoints3d)
angle_prior = SMPLAngleLoss(keypoints3d, body_model.model_type)
body_params = {key:torch.Tensor(val).to(device) for key, val in body_params.items()}
body_params_init = {key:val.clone() for key, val in body_params.items()}
if cfg is None:
opt_params = [body_params['Rh'], body_params['Th'], body_params['poses']]
verbose = False
@ -297,35 +297,46 @@ def optimizePose(body_model, body_params, keypoints3d,
opt_params.append(body_params['poses'])
if cfg.OPT_SHAPE:
opt_params.append(body_params['shapes'])
if cfg.OPT_EXPR and model_type == 'smplx':
opt_params.append(body_params['expression'])
verbose = cfg.VERBOSE
grad_require(opt_params, True)
optimizer = LBFGS(
opt_params, line_search_fn='strong_wolfe')
zero_pose = torch.zeros((nFrames, 3), device=device)
if not cfg.OPT_HAND and model_type in ['smplh', 'smplx']:
zero_pose_hand = torch.zeros((nFrames, body_params['poses'].shape[1] - 66), device=device)
nJoints = N_BODY
keypoints3d = keypoints3d[:, :nJoints]
elif cfg.OPT_HAND and not cfg.OPT_EXPR and model_type == 'smplx':
zero_pose_face = torch.zeros((nFrames, body_params['poses'].shape[1] - 78), device=device)
nJoints = N_BODY + N_HAND * 2
keypoints3d = keypoints3d[:, :nJoints]
else:
nJoints = keypoints3d.shape[1]
def closure(debug=False):
optimizer.zero_grad()
new_params = body_params.copy()
new_params['poses'] = torch.cat([zero_pose, body_params['poses'][:, 3:]], dim=1)
kpts_est = body_model(return_verts=False, return_tensor=True, **new_params)
diff_square = (kpts_est - keypoints3d[..., :3])**2
if False:
pass
if not cfg.OPT_HAND and cfg.MODEL in ['smplh', 'smplx']:
new_params['poses'] = torch.cat([zero_pose, body_params['poses'][:, 3:66], zero_pose_hand], dim=1)
else:
conf = keypoints3d[..., 3:]
new_params['poses'] = torch.cat([zero_pose, body_params['poses'][:, 3:]], dim=1)
kpts_est = body_model(return_verts=False, return_tensor=True, **new_params)[:, :nJoints, :]
diff_square = (kpts_est[:, :nJoints, :3] - keypoints3d[..., :3])**2
# TODO:add robust loss
conf = keypoints3d[..., 3:]
loss_3d = torch.sum(conf * diff_square)
if False:
src = keypoints3d[:, kintree[:, 0], :3].detach()
dst = keypoints3d[:, kintree[:, 1], :3]
direct_est = dst - src
direct_norm = torch.norm(direct_est, dim=2, keepdim=True)
direct_normalized = direct_est/direct_norm
loss_dict = {
'k3d': loss_3d,
'reg_poses_zero': angle_prior.loss(body_params['poses'])
}
# regularize
loss_dict.update(RegularizationLoss(body_params, body_params_init, weight_loss))
# smooth
loss_dict.update(SmoothLoss(body_params, ['poses', 'Th'], weight_loss))
smooth_conf = keypoints3d[1:, ..., -1:]**2
loss_dict['smooth_body'] = torch.sum(smooth_conf[:, :N_BODY] * torch.abs(kpts_est[:-1, :N_BODY] - kpts_est[1:, :N_BODY]))
if cfg.OPT_HAND and cfg.MODEL in ['smplh', 'smplx']:
loss_dict['smooth_hand'] = torch.sum(smooth_conf[:, N_BODY:N_BODY+N_HAND*2] * torch.abs(kpts_est[:-1, N_BODY:N_BODY+N_HAND*2] - kpts_est[1:, N_BODY:N_BODY+N_HAND*2]))
for key in loss_dict.keys():
loss_dict[key] = loss_dict[key]/nFrames
# fittingLog.step(loss_dict, weight_loss)
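When `OPT_HAND` is disabled for SMPL-H/SMPL-X, the closure rebuilds the pose tensor with a zero block for the hands, so the optimizer never receives gradients for them. A toy demonstration of that masking trick:

```python
import torch

nFrames, nPoses = 4, 78                        # e.g. SMPL-H with 6 PCA comps per hand
poses = torch.randn(nFrames, nPoses, requires_grad=True)
zero_pose = torch.zeros(nFrames, 3)            # global rotation lives in Rh instead
zero_hand = torch.zeros(nFrames, nPoses - 66)  # frozen hand block

full = torch.cat([zero_pose, poses[:, 3:66], zero_hand], dim=1)
full.pow(2).sum().backward()
assert poses.grad[:, 66:].abs().sum() == 0     # no gradient reaches the hand entries
```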


@ -2,8 +2,9 @@
@ Date: 2020-11-18 14:33:20
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 20:12:26
@ LastEditTime: 2021-01-20 16:33:02
@ FilePath: /EasyMocap/code/smplmodel/__init__.py
'''
from .body_model import SMPLlayer
from .body_param import merge_params, select_nf, init_params, Config
from .body_param import load_model
from .body_param import merge_params, select_nf, init_params, Config, check_params, check_keypoints


@ -2,7 +2,7 @@
@ Date: 2020-11-18 14:04:10
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 20:14:34
@ LastEditTime: 2021-01-22 16:04:54
@ FilePath: /EasyMocap/code/smplmodel/body_model.py
'''
import torch
@ -11,6 +11,7 @@ from .lbs import lbs, batch_rodrigues
import os.path as osp
import pickle
import numpy as np
import os
def to_tensor(array, dtype=torch.float32, device=torch.device('cpu')):
if 'torch.tensor' not in str(type(array)):
@ -23,13 +24,29 @@ def to_np(array, dtype=np.float32):
array = array.todense()
return np.array(array, dtype=dtype)
def load_regressor(regressor_path):
if regressor_path.endswith('.npy'):
X_regressor = to_tensor(np.load(regressor_path))
elif regressor_path.endswith('.txt'):
data = np.loadtxt(regressor_path)
with open(regressor_path, 'r') as f:
shape = f.readline().split()[1:]
reg = np.zeros((int(shape[0]), int(shape[1])))
for i, j, v in data:
reg[int(i), int(j)] = v
X_regressor = to_tensor(reg)
else:
import ipdb; ipdb.set_trace()
return X_regressor
class SMPLlayer(nn.Module):
def __init__(self, model_path, gender='neutral', device=None,
def __init__(self, model_path, model_type='smpl', gender='neutral', device=None,
regressor_path=None) -> None:
super(SMPLlayer, self).__init__()
dtype = torch.float32
self.dtype = dtype
self.device = device
self.model_type = model_type
# create the SMPL model
if osp.isdir(model_path):
model_fn = 'SMPL_{}.{ext}'.format(gender.upper(), ext='pkl')
@ -58,13 +75,18 @@ class SMPLlayer(nn.Module):
parents = to_tensor(to_np(data['kintree_table'][0])).long()
parents[0] = -1
self.register_buffer('parents', parents)
if self.model_type == 'smplx':
# shape
self.num_expression_coeffs = 10
self.num_shapes = 10
self.shapedirs = self.shapedirs[:, :, :self.num_shapes+self.num_expression_coeffs]
# joints regressor
if regressor_path is not None:
X_regressor = to_tensor(np.load(regressor_path))
X_regressor = load_regressor(regressor_path)
X_regressor = torch.cat((self.J_regressor, X_regressor), dim=0)
j_J_regressor = torch.zeros(24, X_regressor.shape[0], device=device)
for i in range(24):
j_J_regressor = torch.zeros(self.J_regressor.shape[0], X_regressor.shape[0], device=device)
for i in range(self.J_regressor.shape[0]):
j_J_regressor[i, i] = 1
j_v_template = X_regressor @ self.v_template
#
@ -79,8 +101,65 @@ class SMPLlayer(nn.Module):
self.register_buffer('j_weights', j_weights)
self.register_buffer('j_v_template', j_v_template)
self.register_buffer('j_J_regressor', j_J_regressor)
if self.model_type == 'smplh':
# load smplh data
self.num_pca_comps = 6
from os.path import join
for key in ['LEFT', 'RIGHT']:
mano_file = join(os.path.dirname(smpl_path), 'MANO_{}.pkl'.format(key))
with open(mano_file, 'rb') as f:
data = pickle.load(f, encoding='latin1')
val = to_tensor(to_np(data['hands_mean'].reshape(1, -1)), dtype=dtype)
self.register_buffer('mHandsMean'+key[0], val)
val = to_tensor(to_np(data['hands_components'][:self.num_pca_comps, :]), dtype=dtype)
self.register_buffer('mHandsComponents'+key[0], val)
self.use_pca = True
self.use_flat_mean = True
elif self.model_type == 'smplx':
# hand pose
self.num_pca_comps = 6
from os.path import join
for key in ['Ll', 'Rr']:
val = to_tensor(to_np(data['hands_mean'+key[1]].reshape(1, -1)), dtype=dtype)
self.register_buffer('mHandsMean'+key[0], val)
val = to_tensor(to_np(data['hands_components'+key[1]][:self.num_pca_comps, :]), dtype=dtype)
self.register_buffer('mHandsComponents'+key[0], val)
self.use_pca = True
self.use_flat_mean = True
def forward(self, poses, shapes, Rh=None, Th=None, return_verts=True, return_tensor=True, only_shape=False, **kwargs):
def extend_pose(self, poses):
if self.model_type not in ['smplh', 'smplx']:
return poses
elif self.model_type == 'smplh' and poses.shape[-1] == 156:
return poses
elif self.model_type == 'smplx' and poses.shape[-1] == 165:
return poses
NUM_BODYJOINTS = 22 * 3
if self.use_pca:
NUM_HANDJOINTS = self.num_pca_comps
else:
NUM_HANDJOINTS = 15 * 3
NUM_FACEJOINTS = 3 * 3
poses_lh = poses[:, NUM_BODYJOINTS:NUM_BODYJOINTS + NUM_HANDJOINTS]
poses_rh = poses[:, NUM_BODYJOINTS + NUM_HANDJOINTS:NUM_BODYJOINTS+NUM_HANDJOINTS*2]
if self.use_pca:
poses_lh = poses_lh @ self.mHandsComponentsL
poses_rh = poses_rh @ self.mHandsComponentsR
if self.use_flat_mean:
poses_lh = poses_lh + self.mHandsMeanL
poses_rh = poses_rh + self.mHandsMeanR
if self.model_type == 'smplh':
poses = torch.cat([poses[:, :NUM_BODYJOINTS], poses_lh, poses_rh], dim=1)
elif self.model_type == 'smplx':
# the head part has only three joints
# poses_head: (N, 9), jaw_pose, leye_pose, reye_pose respectively
poses_head = poses[:, NUM_BODYJOINTS+NUM_HANDJOINTS*2:]
# body, head, left hand, right hand
poses = torch.cat([poses[:, :NUM_BODYJOINTS], poses_head, poses_lh, poses_rh], dim=1)
return poses
def forward(self, poses, shapes, Rh=None, Th=None, expression=None, return_verts=True, return_tensor=True, only_shape=False, **kwargs):
""" Forward pass for SMPL model
Args:
@ -96,13 +175,23 @@ class SMPLlayer(nn.Module):
shapes = to_tensor(shapes, dtype, device)
Rh = to_tensor(Rh, dtype, device)
Th = to_tensor(Th, dtype, device)
if expression is not None:
expression = to_tensor(expression, dtype, device)
bn = poses.shape[0]
# process Rh, Th
if Rh is None:
Rh = torch.zeros(bn, 3, device=poses.device)
rot = batch_rodrigues(Rh)
transl = Th.unsqueeze(dim=1)
# process shapes
if shapes.shape[0] < bn:
shapes = shapes.expand(bn, -1)
if expression is not None and self.model_type == 'smplx':
shapes = torch.cat([shapes, expression], dim=1)
# process poses
if self.model_type == 'smplh' or self.model_type == 'smplx':
poses = self.extend_pose(poses)
if return_verts:
vertices, joints = lbs(shapes, poses, self.v_template,
self.shapedirs, self.posedirs,
@ -113,7 +202,7 @@ class SMPLlayer(nn.Module):
self.j_shapedirs, self.j_posedirs,
self.j_J_regressor, self.parents,
self.j_weights, pose2rot=True, dtype=self.dtype, only_shape=only_shape)
vertices = vertices[:, 24:, :]
vertices = vertices[:, self.J_regressor.shape[0]:, :]
vertices = torch.matmul(vertices, rot.transpose(1, 2)) + transl
if not return_tensor:
vertices = vertices.detach().cpu().numpy()
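`extend_pose` above expands the compact optimization vector to the model's native pose size: SMPL-H goes from 66 + 6 + 6 = 78 values to 156 (52 joints x 3), SMPL-X from 66 + 6 + 6 + 9 = 87 to 165 (55 joints x 3). The per-hand PCA expansion in isolation (the component matrix below is a random stand-in for MANO's `hands_components`):

```python
import torch

num_pca, full_hand = 6, 45                      # 15 hand joints x 3 axis-angle values
components = torch.randn(num_pca, full_hand)    # stand-in for mHandsComponentsL
hands_mean = torch.zeros(1, full_hand)          # stand-in for mHandsMeanL

coeffs = torch.randn(1, num_pca)                # what body_params['poses'] stores
axis_angle = coeffs @ components + hands_mean   # same two steps as extend_pose
assert axis_angle.shape == (1, full_hand)
```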

View File

@ -2,15 +2,16 @@
@ Date: 2020-11-20 13:34:54
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 20:09:40
@ FilePath: /EasyMocap/code/smplmodel/body_param.py
@ LastEditTime: 2021-01-24 18:39:45
@ FilePath: /EasyMocapRelease/code/smplmodel/body_param.py
'''
import numpy as np
def merge_params(param_list, share_shape=True):
output = {}
for key in ['poses', 'shapes', 'Rh', 'Th']:
output[key] = np.vstack([v[key] for v in param_list])
for key in ['poses', 'shapes', 'Rh', 'Th', 'expression']:
if key in param_list[0].keys():
output[key] = np.vstack([v[key] for v in param_list])
if share_shape:
output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
return output
@ -19,24 +20,83 @@ def select_nf(params_all, nf):
output = {}
for key in ['poses', 'Rh', 'Th']:
output[key] = params_all[key][nf:nf+1, :]
if 'expression' in params_all.keys():
output['expression'] = params_all['expression'][nf:nf+1, :]
if params_all['shapes'].shape[0] == 1:
output['shapes'] = params_all['shapes']
else:
output['shapes'] = params_all['shapes'][nf:nf+1, :]
return output
def init_params(nFrames=1):
NUM_POSES = {'smpl': 72, 'smplh': 78, 'smplx': 66 + 12 + 9}
NUM_EXPR = 10
def init_params(nFrames=1, model_type='smpl'):
params = {
'poses': np.zeros((nFrames, 72)),
'poses': np.zeros((nFrames, NUM_POSES[model_type])),
'shapes': np.zeros((1, 10)),
'Rh': np.zeros((nFrames, 3)),
'Th': np.zeros((nFrames, 3)),
}
if model_type == 'smplx':
params['expression'] = np.zeros((nFrames, NUM_EXPR))
return params
def check_params(body_params, model_type):
nFrames = body_params['poses'].shape[0]
if body_params['poses'].shape[1] != NUM_POSES[model_type]:
body_params['poses'] = np.hstack((body_params['poses'], np.zeros((nFrames, NUM_POSES[model_type] - body_params['poses'].shape[1]))))
if model_type == 'smplx' and 'expression' not in body_params.keys():
body_params['expression'] = np.zeros((nFrames, NUM_EXPR))
return body_params
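`check_params` lets parameters fitted with a smaller model seed a richer one by zero-padding the pose vector and filling in a missing `expression`. A hedged usage sketch (the import path is an assumption; it holds if `code/` is on `sys.path`):

```python
import numpy as np
from smplmodel.body_param import check_params  # import path is an assumption

params = {'poses': np.zeros((10, 72)), 'shapes': np.zeros((1, 10)),
          'Rh': np.zeros((10, 3)), 'Th': np.zeros((10, 3))}
params = check_params(params, 'smplx')
assert params['poses'].shape == (10, 87)        # padded to NUM_POSES['smplx']
assert params['expression'].shape == (10, 10)   # created and zero-filled
```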
class Config:
OPT_R = False
OPT_T = False
OPT_POSE = False
OPT_SHAPE = False
OPT_HAND = False
OPT_EXPR = False
VERBOSE = False
MODEL = 'smpl'
def load_model(gender='neutral', use_cuda=True, model_type='smpl'):
# prepare SMPL model
import torch
if use_cuda:
device = torch.device('cuda')
else:
device = torch.device('cpu')
from .body_model import SMPLlayer
if model_type == 'smpl':
body_model = SMPLlayer('data/smplx/smpl', gender=gender, device=device,
regressor_path='data/smplx/J_regressor_body25.npy')
elif model_type == 'smplh':
body_model = SMPLlayer('data/smplx/smplh/SMPLH_MALE.pkl', model_type='smplh', gender=gender, device=device,
regressor_path='data/smplx/J_regressor_body25_smplh.txt')
elif model_type == 'smplx':
body_model = SMPLlayer('data/smplx/smplx/SMPLX_{}.pkl'.format(gender.upper()), model_type='smplx', gender=gender, device=device,
regressor_path='data/smplx/J_regressor_body25_smplx.txt')
else:
body_model = None
body_model.to(device)
return body_model
def check_keypoints(keypoints2d, WEIGHT_DEBUFF=1.2):
# keypoints2d: nFrames, nJoints, 3
#
# wrong feet
# if keypoints2d.shape[-2] > 25 + 42:
# keypoints2d[..., 0, 2] = 0
# keypoints2d[..., [15, 16, 17, 18], -1] = 0
# keypoints2d[..., [19, 20, 21, 22, 23, 24], -1] /= 2
if keypoints2d.shape[-2] > 25:
# seed the hand roots from the body wrists (25 <- LWrist 7, 46 <- RWrist 4)
keypoints2d[..., 25, :] = keypoints2d[..., 7, :]
keypoints2d[..., 46, :] = keypoints2d[..., 4, :]
keypoints2d[..., 25:, -1] *= WEIGHT_DEBUFF
# zero out joints below the minimum confidence
MIN_CONF = 0.3
conf = keypoints2d[..., -1]
conf[conf<MIN_CONF] = 0
return keypoints2d
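End to end, loading a model and regressing keypoints looks roughly like this (a sketch, not a documented API: paths follow the `data/smplx` layout from the installation section, and `code/` is assumed to be on `sys.path`):

```python
from smplmodel.body_param import load_model, init_params

body_model = load_model(gender='neutral', use_cuda=False, model_type='smplh')
params = init_params(nFrames=1, model_type='smplh')   # all-zero pose/shape/Rh/Th
# regressed joints: body25 followed by the two 21-joint hands
joints = body_model(return_verts=False, return_tensor=False, **params)
print(joints.shape)   # (1, nJoints, 3)
```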


@ -79,13 +79,14 @@ if render_flags['rgba']:
class Renderer(object):
def __init__(self, focal_length=1000, height=512, width=512, faces=None,
bg_color=[0.0, 0.0, 0.0, 0.0] # render settings
bg_color=[1.0, 1.0, 1.0, 0.0], down_scale=1 # render settings
):
self.renderer = pyrender.OffscreenRenderer(height, width)
self.faces = faces
self.focal_length = focal_length
self.bg_color = bg_color
self.ambient_light = (0.3, 0.3, 0.3)
self.ambient_light = (0.5, 0.5, 0.5)
self.down_scale = down_scale
def add_light(self, scene):
trans = [0, 0, 0]
@ -101,7 +102,7 @@ class Renderer(object):
scene.add(light, pose=light_pose)
def render(self, render_data, cameras, images,
use_white=False,
use_white=False, add_back=True,
ret_depth=False, ret_color=False):
# Need to flip x-axis
rot = trimesh.transformations.rotation_matrix(
@ -112,7 +113,11 @@ class Renderer(object):
img = np.zeros_like(img_, dtype=np.uint8) + 255
else:
img = img_.copy()
K, R, T = cameras['K'][nv], cameras['R'][nv], cameras['T'][nv]
K, R, T = cameras['K'][nv].copy(), cameras['R'][nv], cameras['T'][nv]
# downscale the image to speed up rendering
img = cv2.resize(img, None, fx=1/self.down_scale, fy=1/self.down_scale)
K[:2, :] /= self.down_scale
self.renderer.viewport_height = img.shape[0]
self.renderer.viewport_width = img.shape[1]
scene = pyrender.Scene(bg_color=self.bg_color,
@ -120,20 +125,31 @@ class Renderer(object):
for trackId, data in render_data.items():
vert = data['vertices'].copy()
faces = data['faces']
# if a 'vid' key is given, color the mesh by vid
col = get_colors(data.get('vid', trackId))
vert = vert @ R.T + T.T
mesh = trimesh.Trimesh(vert, faces)
mesh.apply_transform(rot)
if 'colors' not in data.keys():
# if a 'vid' key is given, color the mesh by vid
col = get_colors(data.get('vid', trackId))
mesh = trimesh.Trimesh(vert, faces)
mesh.apply_transform(rot)
material = pyrender.MetallicRoughnessMaterial(
metallicFactor=0.0,
alphaMode='OPAQUE',
baseColorFactor=col)
mesh = pyrender.Mesh.from_trimesh(
mesh,
material=material)
scene.add(mesh, 'mesh')
material = pyrender.MetallicRoughnessMaterial(
metallicFactor=0.0,
alphaMode='OPAQUE',
baseColorFactor=col)
mesh = pyrender.Mesh.from_trimesh(
mesh,
material=material)
scene.add(mesh, data['name'])
else:
mesh = trimesh.Trimesh(vert, faces, vertex_colors=data['colors'], process=False)
# mesh = trimesh.Trimesh(vert, faces, process=False)
mesh.apply_transform(rot)
material = pyrender.MetallicRoughnessMaterial(
metallicFactor=0.0,
alphaMode='OPAQUE',
baseColorFactor=(1., 1., 1.))
mesh = pyrender.Mesh.from_trimesh(mesh, material=material)
scene.add(mesh, data['name'])
camera_pose = np.eye(4)
camera = pyrender.camera.IntrinsicsCamera(fx=K[0, 0], fy=K[1, 1], cx=K[0, 2], cy=K[1, 2])
scene.add(camera, pose=camera_pose)
@ -144,7 +160,12 @@ class Renderer(object):
if rend_rgba.shape[2] == 3: # fail to generate transparent channel
valid_mask = (rend_depth > 0)[:, :, None]
rend_rgba = np.dstack((rend_rgba, (valid_mask*255).astype(np.uint8)))
rend_cat = cv2.addWeighted(cv2.bitwise_and(img, 255 - rend_rgba[:, :, 3:4].repeat(3, 2)), 1, rend_rgba[:, :, :3], 1, 0)
if add_back:
rend_cat = cv2.addWeighted(
cv2.bitwise_and(img, 255 - rend_rgba[:, :, 3:4].repeat(3, 2)), 1,
cv2.bitwise_and(rend_rgba[:, :, :3], rend_rgba[:, :, 3:4].repeat(3, 2)), 1, 0)
else:
rend_cat = rend_rgba
output_colors.append(rend_rgba)
output_depths.append(rend_depth)
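The new `add_back` branch masks the frame with the inverse alpha and the render with the alpha before summing, instead of blending the render over the untouched frame. The same composite on synthetic images:

```python
import cv2
import numpy as np

h, w = 64, 64
img = np.full((h, w, 3), 128, np.uint8)   # background frame
rend = np.zeros((h, w, 4), np.uint8)      # RGBA render output
rend[16:48, 16:48] = (0, 255, 0, 255)     # an opaque green square

alpha = rend[:, :, 3:4].repeat(3, 2)      # (h, w, 3) alpha mask
out = cv2.addWeighted(cv2.bitwise_and(img, 255 - alpha), 1,
                      cv2.bitwise_and(rend[:, :, :3], alpha), 1, 0)
```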

doc/evaluation.md (new file, +11 lines)

@ -0,0 +1,11 @@
<!--
* @Date: 2021-01-15 21:12:49
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-01-15 21:13:35
* @FilePath: /EasyMocapRelease/doc/evaluation.md
-->
# Evaluation
## Evaluation of fitting SMPL
### Human3.6M

doc/feng/000400.jpg (new binary file, 81 KiB)

doc/feng/skel.gif (new binary file, 1.9 MiB)

doc/feng/smplx.gif (new binary file, 2.1 MiB)

doc/log.md (new file, +13 lines)

@ -0,0 +1,13 @@
<!--
* @Date: 2021-01-24 22:30:40
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-01-24 22:32:53
* @FilePath: /EasyMocapRelease/doc/log.md
-->
## 2021.01.24
1. Support the SMPL+H and SMPL-X models.
2. Upgrade `body_model.py`.
3. Update the optimization functions.
4. Add limb-length checking.
5. Update the example figures.

doc/tutorial_new_task.md (new file, +18 lines)

@ -0,0 +1,18 @@
<!--
* @Date: 2021-01-21 11:18:47
* @Author: Qing Shuai
* @LastEditors: Qing Shuai
* @LastEditTime: 2021-01-21 11:20:42
* @FilePath: /EasyMocapRelease/doc/tutorial_add_new_task.md
-->
# Add new tasks
## 0. Prepare the data and dataset
## 1. Add new loss functions
## 2. Add new optimization
## 3. Write your own main function
## 4. Evaluation for the new tasks


@ -2,15 +2,17 @@
@ Date: 2021-01-13 20:38:33
@ Author: Qing Shuai
@ LastEditors: Qing Shuai
@ LastEditTime: 2021-01-14 16:59:06
@ FilePath: /EasyMocapRelease/scripts/preprocess/extract_video.py
@ LastEditTime: 2021-01-22 20:45:37
@ FilePath: /EasyMocap/scripts/preprocess/extract_video.py
'''
import os
import os, sys
import cv2
from os.path import join
from tqdm import tqdm
from glob import glob
import numpy as np
code_path = join(os.path.dirname(__file__), '..', '..', 'code')
sys.path.append(code_path)
mkdir = lambda x: os.makedirs(x, exist_ok=True)
@ -18,12 +20,12 @@ def extract_video(videoname, path, start=0, end=10000, step=1):
base = os.path.basename(videoname).replace('.mp4', '')
if not os.path.exists(videoname):
return base
video = cv2.VideoCapture(videoname)
outpath = join(path, 'images', base)
if os.path.exists(outpath) and len(os.listdir(outpath)) > 0:
return base
else:
os.makedirs(outpath)
video = cv2.VideoCapture(videoname)
totalFrames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
for cnt in tqdm(range(totalFrames)):
ret, frame = video.read()
@ -36,6 +38,7 @@ def extract_video(videoname, path, start=0, end=10000, step=1):
def extract_2d(openpose, image, keypoints, render):
if not os.path.exists(keypoints):
os.makedirs(keypoints, exist_ok=True)
cmd = './build/examples/openpose/openpose.bin --image_dir {} --write_json {} --display 0'.format(image, keypoints)
if args.handface:
cmd = cmd + ' --hand --face'
@ -87,7 +90,7 @@ def bbox_from_openpose(keypoints, rescale=1.2, detection_thresh=0.01):
center[1] - bbox_size[1]/2,
center[0] + bbox_size[0]/2,
center[1] + bbox_size[1]/2,
keypoints[valid, :2].mean()
keypoints[valid, 2].mean()
]
return bbox
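The one-line fix above makes the fifth bbox element the mean keypoint *confidence* (`keypoints[valid, 2]`) rather than a meaningless mean of the xy coordinates. A minimal sketch of the whole `[x1, y1, x2, y2, conf]` computation under that convention (the helper name and details are illustrative, not the exact upstream code):

```python
import numpy as np

def bbox_from_keypoints(keypoints, rescale=1.2, thresh=0.01):
    """keypoints: (nJoints, 3) as (x, y, conf)."""
    valid = keypoints[:, 2] > thresh
    pts = keypoints[valid, :2]
    center = (pts.max(axis=0) + pts.min(axis=0)) / 2
    size = rescale * (pts.max(axis=0) - pts.min(axis=0))
    return [center[0] - size[0] / 2, center[1] - size[1] / 2,
            center[0] + size[0] / 2, center[1] + size[1] / 2,
            keypoints[valid, 2].mean()]
```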
@ -129,10 +132,62 @@ def convert_from_openpose(src, dst):
annot['annots'] = annots
save_json(annotname, annot)
def detect_frame(detector, img, pid=0):
lDetections = detector.detect([img])[0]
annots = []
for i in range(len(lDetections)):
annot = {
'bbox': [float(d) for d in lDetections[i]['bbox']],
'personID': pid + i,
'keypoints': lDetections[i]['keypoints'].tolist(),
'isKeyframe': True
}
annots.append(annot)
return annots
def extract_yolo_hrnet(image_root, annot_root):
imgnames = sorted(glob(join(image_root, '*.jpg')))
import torch
device = torch.device('cuda')
from estimator.detector import Detector
config = {
'yolov4': {
'ckpt_path': 'data/models/yolov4.weights',
'conf_thres': 0.3,
'box_nms_thres': 0.5 # e.g. 0.9 means boxes with IoU 0.9 are not suppressed
},
'hrnet':{
'nof_joints': 17,
'c': 48,
'checkpoint_path': 'data/models/pose_hrnet_w48_384x288.pth'
},
'detect':{
'MIN_PERSON_JOINTS': 10,
'MIN_BBOX_AREA': 5000,
'MIN_JOINTS_CONF': 0.3,
'MIN_BBOX_LEN': 150
}
}
detector = Detector('yolo', 'hrnet', device, config)
for nf, imgname in enumerate(tqdm(imgnames)):
annotname = join(annot_root, os.path.basename(imgname).replace('.jpg', '.json'))
annot = create_annot_file(annotname, imgname)
img0 = cv2.imread(imgname)
annot['annots'] = detect_frame(detector, img0, 0)
for i in range(len(annot['annots'])):
x = annot['annots'][i]
x['area'] = max(x['bbox'][2] - x['bbox'][0], x['bbox'][3] - x['bbox'][1])**2
annot['annots'].sort(key=lambda x:-x['area'])
# re-assign person IDs (largest bbox first)
for i in range(len(annot['annots'])):
annot['annots'][i]['personID'] = i
save_json(annotname, annot)
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('path', type=str, default=None)
parser.add_argument('--mode', type=str, default='openpose', choices=['openpose', 'yolo-hrnet'])
parser.add_argument('--handface', action='store_true')
parser.add_argument('--openpose', type=str,
default='/media/qing/Project/openpose')
@ -140,24 +195,31 @@ if __name__ == "__main__":
parser.add_argument('--no2d', action='store_true')
parser.add_argument('--debug', action='store_true')
args = parser.parse_args()
mode = args.mode
if os.path.isdir(args.path):
videos = sorted(glob(join(args.path, 'videos', '*.mp4')))
subs = []
for video in videos:
basename = extract_video(video, args.path)
subs.append(basename)
print('cameras: ', ' '.join(subs))
if not args.no2d:
os.makedirs(join(args.path, 'openpose'), exist_ok=True)
for sub in subs:
image_root = join(args.path, 'images', sub)
annot_root = join(args.path, 'annots', sub)
if os.path.exists(annot_root):
print('skip ', annot_root)
continue
extract_2d(args.openpose, join(args.path, 'images', sub),
join(args.path, 'openpose', sub),
join(args.path, 'openpose_render', sub))
convert_from_openpose(
src=join(args.path, 'openpose', sub),
dst=annot_root
)
if mode == 'openpose':
extract_2d(args.openpose, image_root,
join(args.path, 'openpose', sub),
join(args.path, 'openpose_render', sub))
convert_from_openpose(
src=join(args.path, 'openpose', sub),
dst=annot_root
)
elif mode == 'yolo-hrnet':
extract_yolo_hrnet(image_root, annot_root)
else:
print(args.path, 'does not exist')