diff --git a/Readme.md b/Readme.md
index b2a5fb2..e2dbaa9 100644
--- a/Readme.md
+++ b/Readme.md
@@ -2,43 +2,56 @@
  * @Date: 2021-01-13 20:32:12
  * @Author: Qing Shuai
  * @LastEditors: Qing Shuai
- * @LastEditTime: 2021-01-17 21:07:07
+ * @LastEditTime: 2021-01-24 22:11:37
  * @FilePath: /EasyMocapRelease/Readme.md
 -->
 # EasyMocap
 **EasyMocap** is an open-source toolbox for **markerless human motion capture** from RGB videos.
 
-## Features
-- [x] multi-view, single person => 3d body keypoints
-- [x] multi-view, single person => SMPL parameters
+In this project, we provide the basic code for fitting the SMPL[1]/SMPL+H[2]/SMPL-X[3] models to capture body+hand+face poses from multiple views.
 
-|:heavy_check_mark: Skeleton|:heavy_check_mark: SMPL|
-|----|----|
-|![repro](doc/feng/repro_512.gif)|![smpl](doc/feng/smpl_512.gif)|
+|Input|:heavy_check_mark: Skeleton|:heavy_check_mark: SMPL|
+|----|----|----|
+|![input](doc/feng/000400.jpg)|![repro](doc/feng/skel.gif)|![smpl](doc/feng/smplx.gif)|
 
-> The following features are not released yet. We are now working hard on them. Please stay tuned!
+> We plan to integrate more interesting algorithms, please stay tuned!
 
-|Input|Output|
-|----|----|
-|multi-view, single person | whole body 3d keypoints|
-|multi-view, single person | SMPL-H/SMPLX/MANO parameters|
-|sparse view, single person | dense reconstruction and view synthesis: [NeuralBody](https://zju3dv.github.io/neuralbody/).|
-
-|:black_square_button: Whole Body|:black_square_button: [Detailed Mesh](https://zju3dv.github.io/neuralbody/)|
-|----|----|
-|mesh|mesh|
+1. [Multi-Person from Multiple Views](https://github.com/zju3dv/mvpose)
+2. [Mocap from Multiple **Uncalibrated** and **Unsynchronized** Videos](https://arxiv.org/pdf/2008.07931.pdf)
+3. [Dense Reconstruction and View Synthesis from **Sparse Views**](https://zju3dv.github.io/neuralbody/)
 
 ## Installation
 ### 1. Download SMPL models
-To download the *SMPL* model go to [this](http://smpl.is.tue.mpg.de) (male and female models, version 1.0.0, 10 shape PCs) and [this](http://smplify.is.tue.mpg.de) (gender neutral model) project website and register to get access to the downloads section. Prepare the model as [smplx](https://github.com/vchoutas/smplx#model-loading). **Place them as following:**
+This step is the same as [smplx](https://github.com/vchoutas/smplx#model-loading).
+
+To download the *SMPL* model go to [this](http://smpl.is.tue.mpg.de) (male and female models, version 1.0.0, 10 shape PCs) and [this](http://smplify.is.tue.mpg.de) (gender neutral model) project website and register to get access to the downloads section.
+
+To download the *SMPL+H* model go to [this project website](http://mano.is.tue.mpg.de) and register to get access to the downloads section.
+
+To download the *SMPL-X* model go to [this project website](https://smpl-x.is.tue.mpg.de) and register to get access to the downloads section.
+
+**Place them as follows:**
 ```bash
 data
 └── smplx
     ├── J_regressor_body25.npy
-    └── smpl
-       ├── SMPL_FEMALE.pkl
-       ├── SMPL_MALE.pkl
-       └── SMPL_NEUTRAL.pkl
+    ├── J_regressor_body25_smplh.txt
+    ├── J_regressor_body25_smplx.txt
+    ├── smpl
+    │   ├── SMPL_FEMALE.pkl
+    │   ├── SMPL_MALE.pkl
+    │   └── SMPL_NEUTRAL.pkl
+    ├── smplh
+    │   ├── MANO_LEFT.pkl
+    │   ├── MANO_RIGHT.pkl
+    │   ├── SMPLH_female.pkl
+    │   ├── SMPLH_FEMALE.pkl
+    │   ├── SMPLH_male.pkl
+    │   └── SMPLH_MALE.pkl
+    └── smplx
+        ├── SMPLX_FEMALE.pkl
+        ├── SMPLX_MALE.pkl
+        └── SMPLX_NEUTRAL.pkl
 ```
 
 ### 2. Requirements
@@ -47,15 +60,13 @@ data
 - opencv-python
 - [pyrender](https://pyrender.readthedocs.io/en/latest/install/index.html#python-installation): for visualization
 - chumpy: for loading SMPL model
+- OpenPose[4]: for 2D pose
 
 Some of the Python libraries can be found in `requirements.txt`. You can test different versions of PyTorch.
 
-
 ## Quick Start
-We provide an example multiview dataset[[dropbox](https://www.dropbox.com/s/24mb7r921b1g9a7/zju-ls-feng.zip?dl=0)][[BaiduDisk](https://pan.baidu.com/s/1lvAopzYGCic3nauoQXjbPw)(vg1z)]. After downloading the dataset, you can run the following example scripts.
+We provide an example multiview dataset[[dropbox](https://www.dropbox.com/s/24mb7r921b1g9a7/zju-ls-feng.zip?dl=0)][[BaiduDisk](https://pan.baidu.com/s/1lvAopzYGCic3nauoQXjbPw)(vg1z)], which has 800 frames from 23 synchronized and calibrated cameras. After downloading the dataset, you can run the following example scripts.
 ```bash
 data=path/to/data
 out=path/to/output
@@ -64,15 +75,17 @@ python3 scripts/preprocess/extract_video.py ${data}
 # 1. example for skeleton reconstruction
 python3 code/demo_mv1pmf_skel.py ${data} --out ${out} --vis_det --vis_repro --undis --sub_vis 1 7 13 19
 # 2. example for SMPL reconstruction
-python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19
+python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19 --gender male
+# 3. example for SMPL-X reconstruction
+python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --undis --body bodyhandface --sub_vis 1 7 13 19 --start 400 --model smplx --vis_smpl --gender male
 ```
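For reference, a minimal Python sketch of what `code/demo_mv1pmf_skel.py` does per frame (paths, camera list, and options here are placeholders; see the script itself for the full pipeline):

```python
from dataset.mv1pmf import MV1PMF
from dataset.config import CONFIG
from mytools.reconstruction import simple_recon_person

config = CONFIG['body25']
dataset = MV1PMF('path/to/data', cams=[], config=config, mode='body25',
                 undis=True, no_img=True, out='path/to/output')
for nf in range(len(dataset)):
    images, annots = dataset[nf]   # per-view images and 2D detections
    keypoints3d, error, kpts_repro = simple_recon_person(
        annots['keypoints'], dataset.Pall, config=config, ret_repro=True)
    dataset.write_keypoints3d(keypoints3d, nf)
```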
 ## Not Quick Start
 ### 0. Prepare Your Own Dataset
 ```bash
 zju-ls-feng
-├── extri.yml
 ├── intri.yml
+├── extri.yml
 └── videos
     ├── 1.mp4
     ├── 2.mp4
@@ -88,8 +101,10 @@ Here `intri.yml` and `extri.yml` store the camera intrinsic and extrinsic param
 ```bash
 data=path/to/data
 out=path/to/output
-python3 scripts/preprocess/extract_video.py ${data} --openpose
+python3 scripts/preprocess/extract_video.py ${data} --openpose --handface
 ```
+- `--openpose`: specify the openpose path
+- `--handface`: detect hands and face keypoints
 
 ### 2. Run the code
 ```bash
@@ -98,12 +113,15 @@ python3 code/demo_mv1pmf_skel.py ${data} --out ${out} --vis_det --vis_repro --un
 # 2. example for SMPL reconstruction
 python3 code/demo_mv1pmf_smpl.py ${data} --out ${out} --end 300 --vis_smpl --undis --sub_vis 1 7 13 19
 ```
+The input flags:
+- `--undis`: use to undistort the images
+- `--start, --end`: control the begin and end number of frames.
+
+The output flags:
 - `--vis_det`: visualize the detection
 - `--vis_repro`: visualize the reprojection
-- `--undis`: use to undistort the images
 - `--sub_vis`: use to specify the views to visualize. If not set, the code will use all views
 - `--vis_smpl`: use to render the SMPL mesh to images.
-- `--start, --end`: control the begin and end number of frames.
 
 ### 3. Output
 The results are saved in `json` format.
@@ -131,14 +149,19 @@ The data in `smpl/000000.json` is also a list, each element represents the SMPL
     "id": <id>,
     "Rh": <(1, 3)>,
     "Th": <(1, 3)>,
-    "poses": <(1, 72)>,
+    "poses": <(1, 72/78/87)>,
+    "expression": <(1, 10)>,
     "shapes": <(1, 10)>
 }
 ```
 We set the first 3 dimensions of `poses` to zero, and add a new parameter `Rh` to represent the global orientation; the vertices of the SMPL model are V = R X(theta, beta) + T.
 
+If you use the SMPL+H model, `poses` contains `22x3+6+6` dimensions: 22 body joints in axis-angle format plus `6` PCA coefficients for each hand. For the SMPL-X model, `3x3` poses of the head (jaw, left eye, right eye) are added. A sketch of loading these files is given after the reference list below.
+
 ## Evaluation
 
+In our code, we do not set the best weight parameters; you can adjust them according to your data. If you find a set of good weights, feel free to tell us.
+
 We will add more quantitative reports in [doc/evaluation.md](doc/evaluation.md)
 
 ## Acknowledgements
@@ -174,4 +197,13 @@ Please consider citing these works if you find this repo is useful for your proj
     journal={arXiv preprint arXiv:2012.15838},
     year={2020}
 }
+```
+
+## Reference
+```bash
+[1] Loper, Matthew, et al. "SMPL: A skinned multi-person linear model." ACM Transactions on Graphics (TOG) 34.6 (2015): 1-16.
+[2] Romero, Javier, Dimitrios Tzionas, and Michael J. Black. "Embodied hands: Modeling and capturing hands and bodies together." ACM Transactions on Graphics (TOG) 36.6 (2017): 1-17.
+[3] Pavlakos, Georgios, et al. "Expressive body capture: 3D hands, face, and body from a single image." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
+    Bogo, Federica, et al. "Keep it SMPL: Automatic estimation of 3D human pose and shape from a single image." European Conference on Computer Vision. Springer, Cham, 2016.
+[4] Cao, Z., Hidalgo, G., Simon, T., Wei, S.-E., Sheikh, Y. "OpenPose: Realtime multi-person 2D pose estimation using part affinity fields." arXiv preprint arXiv:1812.08008 (2018)
+```
\ No newline at end of file
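A minimal sketch of loading the output files described in "3. Output" (paths are placeholders; the commented `body_model` call mirrors the usage in `code/demo_mv1pmf_smpl.py`):

```python
import json
import numpy as np
import cv2

with open('output/smpl/000000.json') as f:
    person = json.load(f)[0]        # one dict per person
Rh = np.array(person['Rh'])         # (1, 3) global orientation, axis-angle
Th = np.array(person['Th'])         # (1, 3) global translation
R = cv2.Rodrigues(Rh[0])[0]         # (3, 3) rotation matrix
# With X = X(theta, beta) from the SMPL layer, the posed vertices are
# V = X @ R.T + Th, e.g. (hypothetical call, see demo_mv1pmf_smpl.py):
# X = body_model(return_verts=True, return_tensor=False,
#                poses=np.array(person['poses']), shapes=np.array(person['shapes']),
#                Rh=Rh*0, Th=Th*0)[0]
```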
diff --git a/code/dataset/base.py b/code/dataset/base.py
index 4ced246..880d877 100644
--- a/code/dataset/base.py
+++ b/code/dataset/base.py
@@ -2,7 +2,7 @@
     @ Date: 2021-01-13 16:53:55
     @ Author: Qing Shuai
     @ LastEditors: Qing Shuai
-    @ LastEditTime: 2021-01-14 19:55:58
+    @ LastEditTime: 2021-01-24 22:27:01
     @ FilePath: /EasyMocapRelease/code/dataset/base.py
 '''
 import os
@@ -15,7 +15,7 @@ import numpy as np
 code_path = join(os.path.dirname(__file__), '..')
 sys.path.append(code_path)
 
-from mytools.camera_utils import read_camera, undistort, write_camera
+from mytools.camera_utils import read_camera, undistort, write_camera, get_fundamental_matrix
 from mytools.vis_base import merge, plot_bbox, plot_keypoints
 
 def read_json(path):
@@ -30,18 +30,40 @@ def save_json(file, data):
         json.dump(data, f, indent=4)
 
-def read_annot(annotname, add_hand_face=False):
-    data = read_json(annotname)['annots']
+def read_annot(annotname, mode='body25'):
+    data = read_json(annotname)
+    if not isinstance(data, list):
+        data = data['annots']
     for i in range(len(data)):
-        data[i]['id'] = data[i].pop('personID')
+        if 'id' not in data[i].keys():
+            data[i]['id'] = data[i].pop('personID')
+        if 'keypoints2d' in data[i].keys() and 'keypoints' not in data[i].keys():
+            data[i]['keypoints'] = data[i].pop('keypoints2d')
         for key in ['bbox', 'keypoints', 'handl2d', 'handr2d', 'face2d']:
             if key not in data[i].keys():continue
             data[i][key] = np.array(data[i][key])
+            if key == 'face2d':
+                # TODO: make these parameters; 17 is the offset of the eyebrows,
+                # and 51 is the total number of FLAME-compatible landmarks
+                data[i][key] = data[i][key][17:17+51, :]
+        if mode == 'body25':
+            data[i]['keypoints'] = data[i]['keypoints']
+        elif mode == 'body15':
+            data[i]['keypoints'] = data[i]['keypoints'][:15, :]
+        elif mode == 'total':
+            data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d', 'face2d']])
+        elif mode == 'bodyhand':
+            data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d']])
+        elif mode == 'bodyhandface':
+            data[i]['keypoints'] = np.vstack([data[i][key] for key in ['keypoints', 'handl2d', 'handr2d', 'face2d']])
     data.sort(key=lambda x:x['id'])
     return data
 
 def get_bbox_from_pose(pose_2d, img, rate = 0.1):
     # this function returns bounding box from the 2D pose
-    validIdx = pose_2d[:, 2] > 0
+    # use pose_2d[:, -1] instead of pose_2d[:, 2] here, because when
+    # visualizing the reprojection the rows are (x, y, depth, conf)
+    validIdx = pose_2d[:, -1] > 0
     if validIdx.sum() == 0:
         return [0, 0, 100, 100, 0]
     y_min = int(min(pose_2d[validIdx, 1]))
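# A hypothetical usage sketch of read_annot above (the path is a placeholder):
# with mode='bodyhand', each person's 'keypoints' stacks body25 + both hands
# into a (25+21+21, 3) = (67, 3) array, matching CONFIG['bodyhand']['nJoints']
# defined in code/dataset/config.py.
def _demo_read_annot():
    annots = read_annot('annots/1/000000.json', mode='bodyhand')
    for person in annots:
        print(person['id'], person['keypoints'].shape)  # e.g. 0 (67, 3)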
@@ -65,10 +87,10 @@ def correct_bbox(img, bbox):
 class FileWriter:
     def __init__(self, output_path, config=None, basenames=[], cfg=None) -> None:
         self.out = output_path
-        keys = ['keypoints3d', 'smpl', 'repro', 'keypoints']
+        keys = ['keypoints3d', 'match', 'smpl', 'skel', 'repro', 'keypoints']
         output_dict = {key:join(self.out, key) for key in keys}
-        for key, p in output_dict.items():
-            os.makedirs(p, exist_ok=True)
+        # for key, p in output_dict.items():
+        #     os.makedirs(p, exist_ok=True)
         self.output_dict = output_dict
         self.basenames = basenames
@@ -78,19 +100,30 @@ class FileWriter:
         self.config = config
 
     def write_keypoints3d(self, results, nf):
+        os.makedirs(self.output_dict['keypoints3d'], exist_ok=True)
         savename = join(self.output_dict['keypoints3d'], '{:06d}.json'.format(nf))
         save_json(savename, results)
 
     def vis_detections(self, images, lDetections, nf, key='keypoints', to_img=True, vis_id=True):
+        os.makedirs(self.output_dict[key], exist_ok=True)
         images_vis = []
         for nv, image in enumerate(images):
             img = image.copy()
             for det in lDetections[nv]:
-                keypoints = det[key]
-                bbox = det.pop('bbox', get_bbox_from_pose(keypoints, img))
-                # bbox = det['bbox']
-                plot_bbox(img, bbox, pid=det['id'], vis_id=vis_id)
-                plot_keypoints(img, keypoints, pid=det['id'], config=self.config, use_limb_color=False, lw=2)
+                if key == 'match':
+                    pid = det['id_match']
+                else:
+                    pid = det['id']
+                if key not in det.keys():
+                    keypoints = det['keypoints']
+                else:
+                    keypoints = det[key]
+                if 'bbox' not in det.keys():
+                    bbox = get_bbox_from_pose(keypoints, img)
+                else:
+                    bbox = det['bbox']
+                plot_bbox(img, bbox, pid=pid, vis_id=vis_id)
+                plot_keypoints(img, keypoints, pid=pid, config=self.config, use_limb_color=False, lw=2)
             images_vis.append(img)
         image_vis = merge(images_vis, resize=not self.save_origin)
         if to_img:
@@ -99,46 +132,229 @@ class FileWriter:
         return image_vis
 
     def write_smpl(self, results, nf):
+        os.makedirs(self.output_dict['smpl'], exist_ok=True)
         format_out = {'float_kind':lambda x: "%.3f" % x}
         filename = join(self.output_dict['smpl'], '{:06d}.json'.format(nf))
         with open(filename, 'w') as f:
             f.write('[\n')
-            for data in results:
+            for idata, data in enumerate(results):
                 f.write('    {\n')
                 output = {}
                 output['id'] = data['id']
-                output['Rh'] = np.array2string(data['Rh'], max_line_width=1000, separator=', ', formatter=format_out)
-                output['Th'] = np.array2string(data['Th'], max_line_width=1000, separator=', ', formatter=format_out)
-                output['poses'] = np.array2string(data['poses'], max_line_width=1000, separator=', ', formatter=format_out)
-                output['shapes'] = np.array2string(data['shapes'], max_line_width=1000, separator=', ', formatter=format_out)
-                for key in ['id', 'Rh', 'Th', 'poses', 'shapes']:
-                    f.write('        \"{}\": {},\n'.format(key, output[key]))
-                f.write('    },\n')
+                for key in ['Rh', 'Th', 'poses', 'expression', 'shapes']:
+                    if key not in data.keys():continue
+                    output[key] = np.array2string(data[key], max_line_width=1000, separator=', ', formatter=format_out)
+                for key in output.keys():
+                    f.write('        \"{}\": {}'.format(key, output[key]))
+                    if key != 'shapes':
+                        f.write(',\n')
+                    else:
+                        f.write('\n')
+
+                f.write('    }')
+                if idata != len(results) - 1:
+                    f.write(',\n')
+                else:
+                    f.write('\n')
             f.write(']\n')
 
-    def vis_smpl(self, render_data, nf, images, cameras):
+    def vis_smpl(self, render_data_, nf, images, cameras, mode='smpl', add_back=False):
+        out = join(self.out, mode)
+        os.makedirs(out, exist_ok=True)
         from visualize.renderer import Renderer
         render = Renderer(height=1024, width=1024, faces=None)
-        render_results = render.render(render_data, cameras, images)
-        image_vis = merge(render_results, resize=not self.save_origin)
-        savename = join(self.output_dict['smpl'], '{:06d}.jpg'.format(nf))
-        cv2.imwrite(savename, image_vis)
+        if isinstance(render_data_, list): # different views have different data
+            for nv, render_data in enumerate(render_data_):
+                render_results = render.render(render_data, cameras, images)
+                image_vis = merge(render_results, resize=not self.save_origin)
+                savename = join(out, '{:06d}_{:02d}.jpg'.format(nf, nv))
+                cv2.imwrite(savename, image_vis)
+        else:
+            render_results = render.render(render_data_, cameras, images, add_back=add_back)
+            image_vis = merge(render_results, resize=not self.save_origin)
+            savename = join(out, '{:06d}.jpg'.format(nf))
+            cv2.imwrite(savename, image_vis)
open(outname, "r") as file: + lines = file.readlines() + if len(lines) < 2: + return res_ + nPerson, nJoints = int(lines[0]), int(lines[1]) + # 只包含每个人的结果 + lines = lines[1:] + # 每个人的都写了关键点数量 + line_per_person = 1 + 1 + nJoints + for i in range(nPerson): + trackId = int(lines[i*line_per_person+1]) + content = ''.join(lines[i*line_per_person+2:i*line_per_person+2+nJoints]) + pose3d = np.fromstring(content, dtype=float, sep=' ').reshape((nJoints, 4)) + if isA4d: + # association4d 的关节顺序和正常的定义不一样 + pose3d = pose3d[[4, 1, 5, 9, 13, 6, 10, 14, 0, 2, 7, 11, 3, 8, 12], :] + res_.append({'id':trackId, 'keypoints3d':np.array(pose3d)}) + return res_ + +def readResultsJson(outname): + with open(outname) as f: + data = json.load(f) + res_ = [] + for d in data: + pose3d = np.array(d['keypoints3d']) + if pose3d.shape[0] > 25: + # 对于有手的情况,把手的根节点赋值成body25上的点 + pose3d[25, :] = pose3d[7, :] + pose3d[46, :] = pose3d[4, :] + res_.append({ + 'id': d['id'] if 'id' in d.keys() else d['personID'], + 'keypoints3d': pose3d + }) + return res_ + +class VideoBase(Dataset): + """Dataset for single sequence data + """ + def __init__(self, image_root, annot_root, out=None, config={}, mode='body15', no_img=False) -> None: + self.image_root = image_root + self.annot_root = annot_root + self.mode = mode + self.no_img = no_img + self.config = config + assert out is not None + self.out = out + self.writer = FileWriter(self.out, config=config) + imgnames = sorted(os.listdir(self.image_root)) + self.imagelist = imgnames + self.annotlist = sorted(os.listdir(self.annot_root)) + self.nFrames = len(self.imagelist) + self.undis = False + self.read_camera() + + def read_camera(self): + # 读入相机参数 + annname = join(self.annot_root, self.annotlist[0]) + data = read_json(annname) + if 'K' not in data.keys(): + height, width = data['height'], data['width'] + focal = 1.2*max(height, width) + K = np.array([focal, 0., width/2, 0., focal, height/2, 0. 
+
+class VideoBase(Dataset):
+    """Dataset for single sequence data
+    """
+    def __init__(self, image_root, annot_root, out=None, config={}, mode='body15', no_img=False) -> None:
+        self.image_root = image_root
+        self.annot_root = annot_root
+        self.mode = mode
+        self.no_img = no_img
+        self.config = config
+        assert out is not None
+        self.out = out
+        self.writer = FileWriter(self.out, config=config)
+        imgnames = sorted(os.listdir(self.image_root))
+        self.imagelist = imgnames
+        self.annotlist = sorted(os.listdir(self.annot_root))
+        self.nFrames = len(self.imagelist)
+        self.undis = False
+        self.read_camera()
+
+    def read_camera(self):
+        # read the camera parameters
+        annname = join(self.annot_root, self.annotlist[0])
+        data = read_json(annname)
+        if 'K' not in data.keys():
+            height, width = data['height'], data['width']
+            focal = 1.2*max(height, width)
+            K = np.array([focal, 0., width/2, 0., focal, height/2, 0., 0., 1.]).reshape(3, 3)
+        else:
+            K = np.array(data['K']).reshape(3, 3)
+        self.camera = {'K':K ,'R': np.eye(3), 'T': np.zeros((3, 1))}
+
+    def __getitem__(self, index: int):
+        imgname = join(self.image_root, self.imagelist[index])
+        annname = join(self.annot_root, self.annotlist[index])
+        assert os.path.exists(imgname), imgname
+        assert os.path.exists(annname), annname
+        assert os.path.basename(imgname).split('.')[0] == os.path.basename(annname).split('.')[0], (imgname, annname)
+        if not self.no_img:
+            img = cv2.imread(imgname)
+        else:
+            img = None
+        annot = read_annot(annname, self.mode)
+        return img, annot
+
+    def __len__(self) -> int:
+        return self.nFrames
+
+    def write_smpl(self, peopleDict, nf):
+        results = []
+        for pid, people in peopleDict.items():
+            result = {'id': pid}
+            result.update(people.body_params)
+            results.append(result)
+        self.writer.write_smpl(results, nf)
+
+    def vis_detections(self, image, detections, nf, to_img=True):
+        return self.writer.vis_detections([image], [detections], nf,
+            key='keypoints', to_img=to_img, vis_id=True)
+
+    def vis_repro(self, peopleDict, image, annots, nf):
+        # visualize the reprojected keypoints together with the input keypoints
+        detections = []
+        for pid, data in peopleDict.items():
+            keypoints3d = (data.keypoints3d @ self.camera['R'].T + self.camera['T'].T) @ self.camera['K'].T
+            keypoints3d[:, :2] /= keypoints3d[:, 2:]
+            keypoints3d = np.hstack([keypoints3d, data.keypoints3d[:, -1:]])
+            det = {
+                'id': pid,
+                'repro': keypoints3d
+            }
+            detections.append(det)
+        return self.writer.vis_detections([image], [detections], nf, key='repro',
+            to_img=True, vis_id=False)
+
+    def vis_smpl(self, peopleDict, faces, image, nf, sub_vis=[],
+        mode='smpl', extra_data=[], add_back=True,
+        axis=np.array([1., 0., 0.]), degree=0., fix_center=None):
+        # to keep the interface uniform, rendering from a rotated viewpoint is
+        # implemented here; it is only used for single-view data
+        # it is realized by modifying the camera parameters; the camera
+        # correction can be obtained from the center of the points
+        # render the smpl to each view
+        render_data = {}
+        for pid, data in peopleDict.items():
+            render_data[pid] = {
+                'vertices': data.vertices, 'faces': faces,
+                'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
+        for iid, extra in enumerate(extra_data):
+            render_data[10000+iid] = {
+                'vertices': extra['vertices'],
+                'faces': extra['faces'],
+                'colors': extra['colors'],
+                'name': extra['name']
+            }
+        camera = {}
+        for key in self.camera.keys():
+            camera[key] = self.camera[key][None, :, :]
+        # render another view point
+        if np.abs(degree) > 1e-3:
+            vertices_all = np.vstack([data.vertices for data in peopleDict.values()])
+            if fix_center is None:
+                center = np.mean(vertices_all, axis=0, keepdims=True)
+                new_center = center.copy()
+                new_center[:, 0:2] = 0
+            else:
+                center = fix_center.copy()
+                new_center = fix_center.copy()
+                new_center[:, 2] *= 1.5
+            direc = np.array(axis)
+            rot, _ = cv2.Rodrigues(direc*degree/90*np.pi/2)
+            # If we rotate the data, it is like:
+            # V = Rnew @ (V0 - center) + new_center
+            #   = Rnew @ V0 - Rnew @ center + new_center
+            # combine with the camera
+            # VV = Rc(Rnew @ V0 - Rnew @ center + new_center) + Tc
+            #    = Rc@Rnew @ V0 + Rc @ (new_center - Rnew@center) + Tc
+            blank = np.zeros_like(image, dtype=np.uint8) + 255
+            images = [image, blank]
+            Rnew = camera['R'][0] @ rot
+            Tnew = camera['R'][0] @ (new_center.T - rot @ center.T) + camera['T'][0]
+            camera['K'] = np.vstack([camera['K'], camera['K']])
+            camera['R'] = np.vstack([camera['R'], Rnew[None, :, :]])
+            camera['T'] = np.vstack([camera['T'], Tnew[None, :, :]])
+        else:
+            images = [image]
+        self.writer.vis_smpl(render_data, nf, images, camera, mode, add_back=add_back)
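# An illustrative numeric check (hypothetical helper, arbitrary values) of the
# composition used in VideoBase.vis_smpl above: rotating the vertices by `rot`
# about `center` and then viewing with (Rc, Tc) equals viewing the original
# vertices with Rnew = Rc @ rot and Tnew = Rc @ (new_center - rot @ center) + Tc.
def _check_rotate_view_composition():
    rng = np.random.default_rng(0)
    V0 = rng.normal(size=(10, 3))
    center = V0.mean(0, keepdims=True)
    new_center = np.zeros_like(center)
    rot, _ = cv2.Rodrigues(np.array([1., 0., 0.]) * np.pi / 4)
    Rc, _ = cv2.Rodrigues(np.array([0., 1., 0.]) * 0.3)
    Tc = np.array([[0.], [0.], [3.]])
    lhs = Rc @ (rot @ (V0 - center).T + new_center.T) + Tc
    rhs = (Rc @ rot) @ V0.T + Rc @ (new_center.T - rot @ center.T) + Tc
    assert np.allclose(lhs, rhs)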
 class MVBase(Dataset):
     """ Dataset for multiview data
     """
     def __init__(self, root, cams=[], out=None, config={},
         image_root='images', annot_root='annots',
-        add_hand_face=True,
+        mode='body25',
         undis=True, no_img=False) -> None:
         self.root = root
         self.image_root = join(root, image_root)
         self.annot_root = join(root, annot_root)
-        self.add_hand_face = add_hand_face
+        self.mode = mode
         self.undis = undis
         self.no_img = no_img
         self.config = config
-
+        # results path
+        # the results store keypoints3d
+        self.skel_path = None
         if out is None:
             out = join(root, 'output')
         self.out = out
@@ -146,6 +362,8 @@ class MVBase(Dataset):
         if len(cams) == 0:
             cams = sorted([i for i in os.listdir(self.image_root) if os.path.isdir(join(self.image_root, i))])
+            if cams[0].isdigit(): # for camera folders that are named by digits
+                cams.sort(key=lambda x:int(x))
         self.cams = cams
         self.imagelist = {}
         self.annotlist = {}
@@ -168,6 +386,7 @@ class MVBase(Dataset):
             self.cameras.pop('basenames')
             self.cameras_for_affinity = [[cam['invK'], cam['R'], cam['T']] for cam in [self.cameras[name] for name in self.cams]]
             self.Pall = [self.cameras[cam]['P'] for cam in self.cams]
+            self.Fall = get_fundamental_matrix(self.cameras, self.cams)
         else:
             print('!!!there is no camera parameters, maybe bug', intri_name, extri_name)
             self.cameras = None
@@ -205,7 +424,7 @@ class MVBase(Dataset):
             img = cv2.imread(imgname)
             images.append(img)
         # TODO: this directly takes index 0 here
-        annot = read_annot(annname, self.add_hand_face)
+        annot = read_annot(annname, self.mode)
         annots.append(annot)
         if self.undis:
             images = self.undistort(images)
@@ -213,4 +432,59 @@ class MVBase(Dataset):
         return images, annots
 
     def __len__(self) -> int:
-        return self.nFrames
\ No newline at end of file
+        return self.nFrames
+
+    def vis_detections(self, images, lDetections, nf, to_img=True, sub_vis=[]):
+        if len(sub_vis) != 0:
+            valid_idx = [self.cams.index(i) for i in sub_vis]
+            images = [images[i] for i in valid_idx]
+            lDetections = [lDetections[i] for i in valid_idx]
+        return self.writer.vis_detections(images, lDetections, nf,
+            key='keypoints', to_img=to_img, vis_id=True)
+
+    def vis_match(self, images, lDetections, nf, to_img=True, sub_vis=[]):
+        if len(sub_vis) != 0:
+            valid_idx = [self.cams.index(i) for i in sub_vis]
+            images = [images[i] for i in valid_idx]
+            lDetections = [lDetections[i] for i in valid_idx]
+        return self.writer.vis_detections(images, lDetections, nf,
+            key='match', to_img=to_img, vis_id=True)
+
+    def write_keypoints3d(self, peopleDict, nf):
+        results = []
+        for pid, people in peopleDict.items():
+            result = {'id': pid, 'keypoints3d': people.keypoints3d.tolist()}
+            results.append(result)
+        self.writer.write_keypoints3d(results, nf)
+
+    def write_smpl(self, peopleDict, nf):
+        results = []
+        for pid, people in peopleDict.items():
+            result = {'id': pid}
+            result.update(people.body_params)
+            results.append(result)
+        self.writer.write_smpl(results, nf)
+
+    def read_skel(self, nf, mode='none'):
+        if mode == 'a4d':
+            outname = join(self.skel_path, '{}.txt'.format(nf))
+            assert os.path.exists(outname), outname
+            skels = readReasultsTxt(outname)
+        elif mode == 'none':
+            outname = join(self.skel_path, '{:06d}.json'.format(nf))
+            assert os.path.exists(outname), outname
+            skels = readResultsJson(outname)
+        else:
+            import ipdb; ipdb.set_trace()
+        return skels
+
+    def read_smpl(self, nf):
+        outname = join(self.skel_path, '{:06d}.json'.format(nf))
+        assert os.path.exists(outname), outname
+        datas = read_json(outname)
+        outputs = []
+        for data in datas:
+            for key in ['Rh', 'Th', 'poses', 'shapes']:
+                data[key] = np.array(data[key])
+            outputs.append(data)
+        return outputs
\ No
newline at end of file diff --git a/code/dataset/config.py b/code/dataset/config.py index 9c0c678..51d7625 100644 --- a/code/dataset/config.py +++ b/code/dataset/config.py @@ -2,14 +2,14 @@ * @ Date: 2020-09-26 16:52:55 * @ Author: Qing Shuai @ LastEditors: Qing Shuai - @ LastEditTime: 2021-01-13 14:04:46 - @ FilePath: /EasyMocap/code/dataset/config.py + @ LastEditTime: 2021-01-24 20:21:50 + @ FilePath: /EasyMocapRelease/code/dataset/config.py ''' import numpy as np CONFIG = {} -CONFIG['body25'] = {'kintree': +CONFIG['body25'] = {'nJoints': 25, 'kintree': [[ 1, 0], [ 2, 1], [ 3, 2], @@ -33,9 +33,38 @@ CONFIG['body25'] = {'kintree': [21, 14], [22, 11], [23, 22], - [24, 11]]} + [24, 11]], + 'joint_names': ["Nose", "Neck", "RShoulder", "RElbow", "RWrist", "LShoulder", "LElbow", "LWrist", "MidHip", "RHip","RKnee","RAnkle","LHip","LKnee","LAnkle","REye","LEye","REar","LEar","LBigToe","LSmallToe","LHeel","RBigToe","RSmallToe","RHeel"]} -CONFIG['body15'] = {'kintree': +CONFIG['body25']['skeleton'] = \ +{ + ( 0, 1): {'mean': 0.228, 'std': 0.046}, # Nose ->Neck + ( 1, 2): {'mean': 0.144, 'std': 0.029}, # Neck ->RShoulder + ( 2, 3): {'mean': 0.283, 'std': 0.057}, # RShoulder->RElbow + ( 3, 4): {'mean': 0.258, 'std': 0.052}, # RElbow ->RWrist + ( 1, 5): {'mean': 0.145, 'std': 0.029}, # Neck ->LShoulder + ( 5, 6): {'mean': 0.281, 'std': 0.056}, # LShoulder->LElbow + ( 6, 7): {'mean': 0.258, 'std': 0.052}, # LElbow ->LWrist + ( 1, 8): {'mean': 0.483, 'std': 0.097}, # Neck ->MidHip + ( 8, 9): {'mean': 0.106, 'std': 0.021}, # MidHip ->RHip + ( 9, 10): {'mean': 0.438, 'std': 0.088}, # RHip ->RKnee + (10, 11): {'mean': 0.406, 'std': 0.081}, # RKnee ->RAnkle + ( 8, 12): {'mean': 0.106, 'std': 0.021}, # MidHip ->LHip + (12, 13): {'mean': 0.438, 'std': 0.088}, # LHip ->LKnee + (13, 14): {'mean': 0.408, 'std': 0.082}, # LKnee ->LAnkle + ( 0, 15): {'mean': 0.043, 'std': 0.009}, # Nose ->REye + ( 0, 16): {'mean': 0.043, 'std': 0.009}, # Nose ->LEye + (15, 17): {'mean': 0.105, 'std': 0.021}, # REye ->REar + (16, 18): {'mean': 0.104, 'std': 0.021}, # LEye ->LEar + (14, 19): {'mean': 0.180, 'std': 0.036}, # LAnkle ->LBigToe + (19, 20): {'mean': 0.038, 'std': 0.008}, # LBigToe ->LSmallToe + (14, 21): {'mean': 0.044, 'std': 0.009}, # LAnkle ->LHeel + (11, 22): {'mean': 0.182, 'std': 0.036}, # RAnkle ->RBigToe + (22, 23): {'mean': 0.038, 'std': 0.008}, # RBigToe ->RSmallToe + (11, 24): {'mean': 0.044, 'std': 0.009}, # RAnkle ->RHeel +} + +CONFIG['body15'] = {'nJoints': 15, 'kintree': [[ 1, 0], [ 2, 1], [ 3, 2], @@ -50,7 +79,9 @@ CONFIG['body15'] = {'kintree': [12, 8], [13, 12], [14, 13],]} - +CONFIG['body15']['joint_names'] = CONFIG['body25']['joint_names'][:15] +CONFIG['body15']['skeleton'] = CONFIG['body25']['skeleton'] + CONFIG['hand'] = {'kintree': [[ 1, 0], [ 2, 1], @@ -99,48 +130,392 @@ CONFIG['bodyhand'] = {'kintree': [22, 11], [23, 22], [24, 11], - [26, 25], # handl + [26, 7], # handl [27, 26], [28, 27], [29, 28], - [30, 25], + [30, 7], [31, 30], [32, 31], [33, 32], - [34, 25], + [34, 7], [35, 34], [36, 35], [37, 36], - [38, 25], + [38, 7], [39, 38], [40, 39], [41, 40], - [42, 25], + [42, 7], [43, 42], [44, 43], [45, 44], - [47, 46], # handr + [47, 4], # handr [48, 47], [49, 48], [50, 49], - [51, 46], + [51, 4], [52, 51], [53, 52], [54, 53], - [55, 46], + [55, 4], [56, 55], [57, 56], [58, 57], - [59, 46], + [59, 4], [60, 59], [61, 60], [62, 61], - [63, 46], + [63, 4], [64, 63], [65, 64], [66, 65] - ] + ], + 'nJoints': 67, + 'skeleton':{ + ( 0, 1): {'mean': 0.251, 'std': 0.050}, + ( 1, 2): {'mean': 0.169, 
'std': 0.034}, + ( 2, 3): {'mean': 0.292, 'std': 0.058}, + ( 3, 4): {'mean': 0.275, 'std': 0.055}, + ( 1, 5): {'mean': 0.169, 'std': 0.034}, + ( 5, 6): {'mean': 0.295, 'std': 0.059}, + ( 6, 7): {'mean': 0.278, 'std': 0.056}, + ( 1, 8): {'mean': 0.566, 'std': 0.113}, + ( 8, 9): {'mean': 0.110, 'std': 0.022}, + ( 9, 10): {'mean': 0.398, 'std': 0.080}, + (10, 11): {'mean': 0.402, 'std': 0.080}, + ( 8, 12): {'mean': 0.111, 'std': 0.022}, + (12, 13): {'mean': 0.395, 'std': 0.079}, + (13, 14): {'mean': 0.403, 'std': 0.081}, + ( 0, 15): {'mean': 0.053, 'std': 0.011}, + ( 0, 16): {'mean': 0.056, 'std': 0.011}, + (15, 17): {'mean': 0.107, 'std': 0.021}, + (16, 18): {'mean': 0.107, 'std': 0.021}, + (14, 19): {'mean': 0.180, 'std': 0.036}, + (19, 20): {'mean': 0.055, 'std': 0.011}, + (14, 21): {'mean': 0.065, 'std': 0.013}, + (11, 22): {'mean': 0.169, 'std': 0.034}, + (22, 23): {'mean': 0.052, 'std': 0.010}, + (11, 24): {'mean': 0.061, 'std': 0.012}, + ( 7, 26): {'mean': 0.045, 'std': 0.009}, + (26, 27): {'mean': 0.042, 'std': 0.008}, + (27, 28): {'mean': 0.035, 'std': 0.007}, + (28, 29): {'mean': 0.029, 'std': 0.006}, + ( 7, 30): {'mean': 0.102, 'std': 0.020}, + (30, 31): {'mean': 0.040, 'std': 0.008}, + (31, 32): {'mean': 0.026, 'std': 0.005}, + (32, 33): {'mean': 0.023, 'std': 0.005}, + ( 7, 34): {'mean': 0.101, 'std': 0.020}, + (34, 35): {'mean': 0.043, 'std': 0.009}, + (35, 36): {'mean': 0.029, 'std': 0.006}, + (36, 37): {'mean': 0.024, 'std': 0.005}, + ( 7, 38): {'mean': 0.097, 'std': 0.019}, + (38, 39): {'mean': 0.041, 'std': 0.008}, + (39, 40): {'mean': 0.027, 'std': 0.005}, + (40, 41): {'mean': 0.024, 'std': 0.005}, + ( 7, 42): {'mean': 0.095, 'std': 0.019}, + (42, 43): {'mean': 0.033, 'std': 0.007}, + (43, 44): {'mean': 0.020, 'std': 0.004}, + (44, 45): {'mean': 0.018, 'std': 0.004}, + ( 4, 47): {'mean': 0.043, 'std': 0.009}, + (47, 48): {'mean': 0.041, 'std': 0.008}, + (48, 49): {'mean': 0.034, 'std': 0.007}, + (49, 50): {'mean': 0.028, 'std': 0.006}, + ( 4, 51): {'mean': 0.101, 'std': 0.020}, + (51, 52): {'mean': 0.041, 'std': 0.008}, + (52, 53): {'mean': 0.026, 'std': 0.005}, + (53, 54): {'mean': 0.024, 'std': 0.005}, + ( 4, 55): {'mean': 0.100, 'std': 0.020}, + (55, 56): {'mean': 0.044, 'std': 0.009}, + (56, 57): {'mean': 0.029, 'std': 0.006}, + (57, 58): {'mean': 0.023, 'std': 0.005}, + ( 4, 59): {'mean': 0.096, 'std': 0.019}, + (59, 60): {'mean': 0.040, 'std': 0.008}, + (60, 61): {'mean': 0.028, 'std': 0.006}, + (61, 62): {'mean': 0.023, 'std': 0.005}, + ( 4, 63): {'mean': 0.094, 'std': 0.019}, + (63, 64): {'mean': 0.032, 'std': 0.006}, + (64, 65): {'mean': 0.020, 'std': 0.004}, + (65, 66): {'mean': 0.018, 'std': 0.004}, } +} + +CONFIG['bodyhandface'] = {'kintree': + [[ 1, 0], + [ 2, 1], + [ 3, 2], + [ 4, 3], + [ 5, 1], + [ 6, 5], + [ 7, 6], + [ 8, 1], + [ 9, 8], + [10, 9], + [11, 10], + [12, 8], + [13, 12], + [14, 13], + [15, 0], + [16, 0], + [17, 15], + [18, 16], + [19, 14], + [20, 19], + [21, 14], + [22, 11], + [23, 22], + [24, 11], + [26, 7], # handl + [27, 26], + [28, 27], + [29, 28], + [30, 7], + [31, 30], + [32, 31], + [33, 32], + [34, 7], + [35, 34], + [36, 35], + [37, 36], + [38, 7], + [39, 38], + [40, 39], + [41, 40], + [42, 7], + [43, 42], + [44, 43], + [45, 44], + [47, 4], # handr + [48, 47], + [49, 48], + [50, 49], + [51, 4], + [52, 51], + [53, 52], + [54, 53], + [55, 4], + [56, 55], + [57, 56], + [58, 57], + [59, 4], + [60, 59], + [61, 60], + [62, 61], + [63, 4], + [64, 63], + [65, 64], + [66, 65], + [ 67, 68], + [ 68, 69], + [ 69, 70], + [ 70, 71], + [ 72, 73], + [ 73, 
74], + [ 74, 75], + [ 75, 76], + [ 77, 78], + [ 78, 79], + [ 79, 80], + [ 81, 82], + [ 82, 83], + [ 83, 84], + [ 84, 85], + [ 86, 87], + [ 87, 88], + [ 88, 89], + [ 89, 90], + [ 90, 91], + [ 91, 86], + [ 92, 93], + [ 93, 94], + [ 94, 95], + [ 95, 96], + [ 96, 97], + [ 97, 92], + [ 98, 99], + [ 99, 100], + [100, 101], + [101, 102], + [102, 103], + [103, 104], + [104, 105], + [105, 106], + [106, 107], + [107, 108], + [108, 109], + [109, 98], + [110, 111], + [111, 112], + [112, 113], + [113, 114], + [114, 115], + [115, 116], + [116, 117], + [117, 110] + ], + 'nJoints': 118, + 'skeleton':{ + ( 0, 1): {'mean': 0.251, 'std': 0.050}, + ( 1, 2): {'mean': 0.169, 'std': 0.034}, + ( 2, 3): {'mean': 0.292, 'std': 0.058}, + ( 3, 4): {'mean': 0.275, 'std': 0.055}, + ( 1, 5): {'mean': 0.169, 'std': 0.034}, + ( 5, 6): {'mean': 0.295, 'std': 0.059}, + ( 6, 7): {'mean': 0.278, 'std': 0.056}, + ( 1, 8): {'mean': 0.566, 'std': 0.113}, + ( 8, 9): {'mean': 0.110, 'std': 0.022}, + ( 9, 10): {'mean': 0.398, 'std': 0.080}, + (10, 11): {'mean': 0.402, 'std': 0.080}, + ( 8, 12): {'mean': 0.111, 'std': 0.022}, + (12, 13): {'mean': 0.395, 'std': 0.079}, + (13, 14): {'mean': 0.403, 'std': 0.081}, + ( 0, 15): {'mean': 0.053, 'std': 0.011}, + ( 0, 16): {'mean': 0.056, 'std': 0.011}, + (15, 17): {'mean': 0.107, 'std': 0.021}, + (16, 18): {'mean': 0.107, 'std': 0.021}, + (14, 19): {'mean': 0.180, 'std': 0.036}, + (19, 20): {'mean': 0.055, 'std': 0.011}, + (14, 21): {'mean': 0.065, 'std': 0.013}, + (11, 22): {'mean': 0.169, 'std': 0.034}, + (22, 23): {'mean': 0.052, 'std': 0.010}, + (11, 24): {'mean': 0.061, 'std': 0.012}, + ( 7, 26): {'mean': 0.045, 'std': 0.009}, + (26, 27): {'mean': 0.042, 'std': 0.008}, + (27, 28): {'mean': 0.035, 'std': 0.007}, + (28, 29): {'mean': 0.029, 'std': 0.006}, + ( 7, 30): {'mean': 0.102, 'std': 0.020}, + (30, 31): {'mean': 0.040, 'std': 0.008}, + (31, 32): {'mean': 0.026, 'std': 0.005}, + (32, 33): {'mean': 0.023, 'std': 0.005}, + ( 7, 34): {'mean': 0.101, 'std': 0.020}, + (34, 35): {'mean': 0.043, 'std': 0.009}, + (35, 36): {'mean': 0.029, 'std': 0.006}, + (36, 37): {'mean': 0.024, 'std': 0.005}, + ( 7, 38): {'mean': 0.097, 'std': 0.019}, + (38, 39): {'mean': 0.041, 'std': 0.008}, + (39, 40): {'mean': 0.027, 'std': 0.005}, + (40, 41): {'mean': 0.024, 'std': 0.005}, + ( 7, 42): {'mean': 0.095, 'std': 0.019}, + (42, 43): {'mean': 0.033, 'std': 0.007}, + (43, 44): {'mean': 0.020, 'std': 0.004}, + (44, 45): {'mean': 0.018, 'std': 0.004}, + ( 4, 47): {'mean': 0.043, 'std': 0.009}, + (47, 48): {'mean': 0.041, 'std': 0.008}, + (48, 49): {'mean': 0.034, 'std': 0.007}, + (49, 50): {'mean': 0.028, 'std': 0.006}, + ( 4, 51): {'mean': 0.101, 'std': 0.020}, + (51, 52): {'mean': 0.041, 'std': 0.008}, + (52, 53): {'mean': 0.026, 'std': 0.005}, + (53, 54): {'mean': 0.024, 'std': 0.005}, + ( 4, 55): {'mean': 0.100, 'std': 0.020}, + (55, 56): {'mean': 0.044, 'std': 0.009}, + (56, 57): {'mean': 0.029, 'std': 0.006}, + (57, 58): {'mean': 0.023, 'std': 0.005}, + ( 4, 59): {'mean': 0.096, 'std': 0.019}, + (59, 60): {'mean': 0.040, 'std': 0.008}, + (60, 61): {'mean': 0.028, 'std': 0.006}, + (61, 62): {'mean': 0.023, 'std': 0.005}, + ( 4, 63): {'mean': 0.094, 'std': 0.019}, + (63, 64): {'mean': 0.032, 'std': 0.006}, + (64, 65): {'mean': 0.020, 'std': 0.004}, + (65, 66): {'mean': 0.018, 'std': 0.004}, + (67, 68): {'mean': 0.012, 'std': 0.002}, + (68, 69): {'mean': 0.013, 'std': 0.003}, + (69, 70): {'mean': 0.014, 'std': 0.003}, + (70, 71): {'mean': 0.012, 'std': 0.002}, + (72, 73): {'mean': 0.014, 'std': 0.003}, + 
(73, 74): {'mean': 0.014, 'std': 0.003}, + (74, 75): {'mean': 0.015, 'std': 0.003}, + (75, 76): {'mean': 0.013, 'std': 0.003}, + (77, 78): {'mean': 0.014, 'std': 0.003}, + (78, 79): {'mean': 0.014, 'std': 0.003}, + (79, 80): {'mean': 0.015, 'std': 0.003}, + (81, 82): {'mean': 0.009, 'std': 0.002}, + (82, 83): {'mean': 0.010, 'std': 0.002}, + (83, 84): {'mean': 0.010, 'std': 0.002}, + (84, 85): {'mean': 0.010, 'std': 0.002}, + (86, 87): {'mean': 0.009, 'std': 0.002}, + (87, 88): {'mean': 0.009, 'std': 0.002}, + (88, 89): {'mean': 0.008, 'std': 0.002}, + (89, 90): {'mean': 0.008, 'std': 0.002}, + (90, 91): {'mean': 0.009, 'std': 0.002}, + (86, 91): {'mean': 0.008, 'std': 0.002}, + (92, 93): {'mean': 0.009, 'std': 0.002}, + (93, 94): {'mean': 0.009, 'std': 0.002}, + (94, 95): {'mean': 0.009, 'std': 0.002}, + (95, 96): {'mean': 0.009, 'std': 0.002}, + (96, 97): {'mean': 0.009, 'std': 0.002}, + (92, 97): {'mean': 0.009, 'std': 0.002}, + (98, 99): {'mean': 0.016, 'std': 0.003}, + (99, 100): {'mean': 0.013, 'std': 0.003}, + (100, 101): {'mean': 0.008, 'std': 0.002}, + (101, 102): {'mean': 0.008, 'std': 0.002}, + (102, 103): {'mean': 0.012, 'std': 0.002}, + (103, 104): {'mean': 0.014, 'std': 0.003}, + (104, 105): {'mean': 0.015, 'std': 0.003}, + (105, 106): {'mean': 0.012, 'std': 0.002}, + (106, 107): {'mean': 0.009, 'std': 0.002}, + (107, 108): {'mean': 0.009, 'std': 0.002}, + (108, 109): {'mean': 0.013, 'std': 0.003}, + (98, 109): {'mean': 0.016, 'std': 0.003}, + (110, 111): {'mean': 0.021, 'std': 0.004}, + (111, 112): {'mean': 0.009, 'std': 0.002}, + (112, 113): {'mean': 0.008, 'std': 0.002}, + (113, 114): {'mean': 0.019, 'std': 0.004}, + (114, 115): {'mean': 0.018, 'std': 0.004}, + (115, 116): {'mean': 0.008, 'std': 0.002}, + (116, 117): {'mean': 0.009, 'std': 0.002}, + (110, 117): {'mean': 0.020, 'std': 0.004}, +} +} + +face_kintree_without_contour = [[ 0, 1], + [ 1, 2], + [ 2, 3], + [ 3, 4], + [ 5, 6], + [ 6, 7], + [ 7, 8], + [ 8, 9], + [10, 11], + [11, 12], + [12, 13], + [14, 15], + [15, 16], + [16, 17], + [17, 18], + [19, 20], + [20, 21], + [21, 22], + [22, 23], + [23, 24], + [24, 19], + [25, 26], + [26, 27], + [27, 28], + [28, 29], + [29, 30], + [30, 25], + [31, 32], + [32, 33], + [33, 34], + [34, 35], + [35, 36], + [36, 37], + [37, 38], + [38, 39], + [39, 40], + [40, 41], + [41, 42], + [42, 31], + [43, 44], + [44, 45], + [45, 46], + [46, 47], + [47, 48], + [48, 49], + [49, 50], + [50, 43]] CONFIG['face'] = {'kintree':[ [0,1],[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,9],[9,10],[10,11],[11,12],[12,13],[13,14],[14,15],[15,16], #outline (ignored) [17,18],[18,19],[19,20],[20,21], #right eyebrow @@ -176,6 +551,7 @@ def getKintree(name='total'): return kintree CONFIG['total'] = {} CONFIG['total']['kintree'] = getKintree('total') +CONFIG['total']['nJoints'] = 137 COCO17_IN_BODY25 = [0,16,15,18,17,5,2,6,3,7,4,12,9,13,10,14,11] diff --git a/code/dataset/mv1pmf.py b/code/dataset/mv1pmf.py index adc74b1..8367ba1 100644 --- a/code/dataset/mv1pmf.py +++ b/code/dataset/mv1pmf.py @@ -2,7 +2,7 @@ @ Date: 2021-01-12 17:12:50 @ Author: Qing Shuai @ LastEditors: Qing Shuai - @ LastEditTime: 2021-01-14 17:14:34 + @ LastEditTime: 2021-01-21 14:51:45 @ FilePath: /EasyMocap/code/dataset/mv1pmf.py ''' import os @@ -15,10 +15,10 @@ from .base import MVBase class MV1PMF(MVBase): def __init__(self, root, cams=[], pid=0, out=None, config={}, - image_root='images', annot_root='annots', add_hand_face=True, + image_root='images', annot_root='annots', mode='body15', undis=True, no_img=False) -> None: 
         super().__init__(root, cams, out, config, image_root, annot_root,
-            add_hand_face, undis, no_img)
+            mode, undis, no_img)
         self.pid = pid
 
     def write_keypoints3d(self, keypoints3d, nf):
@@ -30,20 +30,21 @@ class MV1PMF(MVBase):
         result.update(params)
         self.writer.write_smpl([result], nf)
 
-    def vis_smpl(self, vertices, faces, images, nf, sub_vis):
+    def vis_smpl(self, vertices, faces, images, nf, sub_vis=[],
+        mode='smpl', extra_data=[], add_back=True):
         render_data = {}
         if len(vertices.shape) == 3:
             vertices = vertices[0]
         pid = self.pid
         render_data[pid] = {'vertices': vertices, 'faces': faces,
-            'vid': pid, 'name': '{}_{}'.format(nf, pid)}
+            'vid': pid, 'name': 'human_{}_{}'.format(nf, pid)}
         cameras = {'K': [], 'R':[], 'T':[]}
         if len(sub_vis) == 0:
             sub_vis = self.cams
         for key in cameras.keys():
             cameras[key] = [self.cameras[cam][key] for cam in sub_vis]
         images = [images[self.cams.index(cam)] for cam in sub_vis]
-        self.writer.vis_smpl(render_data, nf, images, cameras)
+        self.writer.vis_smpl(render_data, nf, images, cameras, mode, add_back=add_back)
 
     def vis_detections(self, images, annots, nf, to_img=True, sub_vis=[]):
         lDetections = []
@@ -87,7 +88,10 @@ class MV1PMF(MVBase):
                 keypoints = data['keypoints']
             else:
                 print('not found pid {} in {}, {}'.format(self.pid, index, nv))
-                keypoints = np.zeros((25, 3))
+                # self.add_hand_face no longer exists after the switch to `mode`;
+                # take the joint count from the keypoint config instead
+                keypoints = np.zeros((self.config['nJoints'], 3))
                 bbox = np.array([0, 0, 100., 100., 0.])
             annots['bbox'].append(bbox)
             annots['keypoints'].append(keypoints)
diff --git a/code/demo_mv1pmf_skel.py b/code/demo_mv1pmf_skel.py
index e85a9d9..60696cb 100644
--- a/code/demo_mv1pmf_skel.py
+++ b/code/demo_mv1pmf_skel.py
@@ -2,15 +2,17 @@
     @ Date: 2021-01-12 17:08:25
     @ Author: Qing Shuai
     @ LastEditors: Qing Shuai
-    @ LastEditTime: 2021-01-14 17:08:05
-    @ FilePath: /EasyMocap/code/demo_mv1pmf_skel.py
+    @ LastEditTime: 2021-01-24 20:57:35
+    @ FilePath: /EasyMocapRelease/code/demo_mv1pmf_skel.py
 '''
 # show skeleton and reprojection
 from dataset.mv1pmf import MV1PMF
 from dataset.config import CONFIG
 from mytools.reconstruction import simple_recon_person, projectN3
+# from mytools.robust_triangulate import robust_triangulate
 from tqdm import tqdm
 import numpy as np
+from smplmodel import check_keypoints
 
 def smooth_skeleton(skeleton):
     # nFrames, nJoints, 4: [[(x, y, z, c)]]
@@ -32,10 +34,37 @@ def smooth_skeleton(skeleton):
     skeleton[span:nFrames-span, :, :3] = skel
     return skeleton
 
+def get_limb_length(config, keypoints):
+    skeleton = {}
+    for i, j_ in config['kintree']:
+        if j_ == 25:
+            j = 7
+        elif j_ == 46:
+            j = 4
+        else:
+            j = j_
+        key = tuple(sorted([i, j]))
+        length, confs = 0, 0
+        for nf in range(keypoints.shape[0]):
+            limb_length = np.linalg.norm(keypoints[nf, i, :3] - keypoints[nf, j, :3])
+            conf = keypoints[nf, [i, j], -1].min()
+            length += limb_length * conf
+            confs += conf
+        limb_length = length/confs
+        skeleton[key] = {'mean': limb_length, 'std': limb_length*0.2}
+    print('{')
+    for key, val in skeleton.items():
+        res = '    ({:2d}, {:2d}): {{\'mean\': {:.3f}, \'std\': {:.3f}}}, '.format(*key, val['mean'], val['std'])
+        if 'joint_names' in config.keys():
+            res += '# {:9s}->{:9s}'.format(config['joint_names'][key[0]], config['joint_names'][key[1]])
+        print(res)
+    print('}')
+
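# An illustrative worked example of the confidence-weighted limb length that
# get_limb_length above computes (hypothetical helper, all numbers made up):
# two frames of one limb (i, j) = (0, 1) with confidences 0.9 and 0.3.
def _demo_limb_length():
    kpts = np.array([   # (nFrames=2, nJoints=2, 4): x, y, z, conf
        [[0., 0., 0., 0.9], [0.30, 0., 0., 0.9]],
        [[0., 0., 0., 0.3], [0.40, 0., 0., 0.3]],
    ])
    lengths = np.linalg.norm(kpts[:, 0, :3] - kpts[:, 1, :3], axis=1)  # [0.30, 0.40]
    conf = kpts[:, [0, 1], -1].min(axis=1)                             # [0.9, 0.3]
    mean = (lengths * conf).sum() / conf.sum()                         # = 0.325
    print({'mean': round(mean, 3), 'std': round(0.2 * mean, 3)})       # std = mean * 0.2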
 def mv1pmf_skel(path, sub, out, mode, args):
-    MIN_CONF_THRES = 0.5
+    MIN_CONF_THRES = 0.3
     no_img = not (args.vis_det or args.vis_repro)
-    dataset = MV1PMF(path, cams=sub, config=CONFIG[mode], add_hand_face=args.add_hand_face,
+    config = CONFIG[mode]
+    dataset = MV1PMF(path, cams=sub, config=config, mode=mode,
         undis=args.undis, no_img=no_img, out=out)
     kp3ds = []
     start, end = args.start, min(args.end, len(dataset))
@@ -43,7 +72,9 @@
         images, annots = dataset[nf]
         conf = annots['keypoints'][..., -1]
         conf[conf < MIN_CONF_THRES] = 0
-        keypoints3d, _, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall, ret_repro=True)
+        annots['keypoints'] = check_keypoints(annots['keypoints'], WEIGHT_DEBUFF=1)
+        keypoints3d, _, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall, config=config, ret_repro=True)
+        # keypoints3d, _, kpts_repro = robust_triangulate(annots['keypoints'], dataset.Pall, config=config, ret_repro=True)
         kp3ds.append(keypoints3d)
         if args.vis_det:
             dataset.vis_detections(images, annots, nf, sub_vis=args.sub_vis)
@@ -51,32 +82,16 @@
             dataset.vis_repro(images, annots, kpts_repro, nf, sub_vis=args.sub_vis)
     # smooth the skeleton
     kp3ds = np.stack(kp3ds)
-    if args.smooth:
-        kp3ds = smooth_skeleton(kp3ds)
+    # compute the limb lengths
+    # get_limb_length(config, kp3ds)
+    # if args.smooth:
+    #     kp3ds = smooth_skeleton(kp3ds)
     for nf in tqdm(range(kp3ds.shape[0]), desc='dump'):
         dataset.write_keypoints3d(kp3ds[nf], nf + start)
 
 if __name__ == "__main__":
-    import argparse
-    parser = argparse.ArgumentParser('multi_view one_person multi_frame skel')
-    parser.add_argument('path', type=str)
-    parser.add_argument('--out', type=str, default=None)
-    parser.add_argument('--sub', type=str, nargs='+', default=[],
-        help='the sub folder lists when in video mode')
-    parser.add_argument('--start', type=int, default=0,
-        help='frame start')
-    parser.add_argument('--end', type=int, default=10000,
-        help='frame end')
-    parser.add_argument('--step', type=int, default=1,
-        help='frame step')
-    parser.add_argument('--body', type=str, default='body25', choices=['body15', 'body25', 'total'])
-    parser.add_argument('--undis', action='store_true')
-    parser.add_argument('--add_hand_face', action='store_true')
-    parser.add_argument('--smooth', action='store_true')
-    parser.add_argument('--vis_det', action='store_true')
-    parser.add_argument('--vis_repro', action='store_true')
-    parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
-        help='the sub folder lists for visualization')
+    from mytools.cmd_loader import load_parser
+    parser = load_parser()
     args = parser.parse_args()
     mv1pmf_skel(args.path, args.sub, args.out, args.body, args)
\ No newline at end of file
diff --git a/code/demo_mv1pmf_smpl.py b/code/demo_mv1pmf_smpl.py
index cef8db8..55d8aeb 100644
--- a/code/demo_mv1pmf_smpl.py
+++ b/code/demo_mv1pmf_smpl.py
@@ -2,106 +2,141 @@
     @ Date: 2021-01-12 17:08:25
     @ Author: Qing Shuai
     @ LastEditors: Qing Shuai
-    @ LastEditTime: 2021-01-14 20:49:25
-    @ FilePath: /EasyMocap/code/demo_mv1pmf_smpl.py
+    @ LastEditTime: 2021-01-24 22:26:09
+    @ FilePath: /EasyMocapRelease/code/demo_mv1pmf_smpl.py
 '''
 # show skeleton and reprojection
 import pyrender # first import the pyrender
 from pyfitting.optimize_simple import optimizeShape, optimizePose
 from dataset.mv1pmf import MV1PMF
 from dataset.config import CONFIG
-from mytools.reconstruction import simple_recon_person, projectN3
-from smplmodel import select_nf, init_params, Config
-
+from mytools.utils import Timer
+from smplmodel import select_nf, init_params, Config, load_model, check_keypoints
+from os.path import join
 from tqdm import tqdm
 import numpy as np
 
-def load_model(use_cuda=True):
-    # prepare SMPL model
-    import torch
-    if use_cuda:
-        device = 
torch.device('cuda') - else: - device = torch.device('cpu') - from smplmodel import SMPLlayer - body_model = SMPLlayer('data/smplx/smpl', gender='neutral', device=device, - regressor_path='data/smplx/J_regressor_body25.npy') - body_model.to(device) - return body_model - def load_weight_shape(): - weight = {'s3d': 1., 'reg_shape': 5e-3} + weight = {'s3d': 1., 'reg_shapes': 5e-3} return weight -def load_weight_pose(): - weight = { - 'k3d': 1., 'reg_poses_zero': 1e-2, - 'smooth_Rh': 1e-2, 'smooth_Th': 1e-2, 'smooth_poses': 1e-2 - } +def load_weight_pose(model): + if model == 'smpl': + weight = { + 'k3d': 1., 'reg_poses_zero': 1e-2, + 'reg_expression': 1e-1, + 'smooth_joints': 1e-5 + # 'smooth_Rh': 1e-1, 'smooth_Th': 1e-1, 'smooth_poses': 1e-1, 'smooth_hands': 1e-2 + } + elif model == 'smplh': + weight = { + 'k3d': 1., 'reg_poses_zero': 1e-3, + 'smooth_body': 1e-2, 'smooth_hand': 1e-2 + } + elif model == 'smplx': + weight = { + 'k3d': 1., 'reg_poses_zero': 1e-3, + 'reg_expression': 1e-2, + 'smooth_body': 1e-2, 'smooth_hand': 1e-2 + # 'smooth_Rh': 1e-1, 'smooth_Th': 1e-1, 'smooth_poses': 1e-1, 'smooth_hands': 1e-2 + } + else: + raise NotImplementedError return weight +def print_mean_skel(mode): + with Timer('Loading {}, {}'.format(args.model, args.gender)): + body_model = load_model(args.gender, model_type=args.model) + params_init = init_params(nFrames=1, model_type=args.model) + skel = body_model(return_verts=False, return_tensor=False, **params_init)[0] + # skel: nJoints, 3 + config = CONFIG[mode] + skeleton = {} + for i, j_ in config['kintree']: + if j_ == 25: + j = 7 + elif j_ == 46: + j = 4 + else: + j = j_ + key = tuple(sorted([i, j])) + limb_length = np.linalg.norm(skel[i] - skel[j]) + skeleton[key] = {'mean': limb_length, 'std': limb_length*0.2} + print('{') + for key, val in skeleton.items(): + res = ' ({:2d}, {:2d}): {{\'mean\': {:.3f}, \'std\': {:.3f}}}, '.format(*key, val['mean'], val['std']) + if 'joint_names' in config.keys(): + res += '# {:9s}->{:9s}'.format(config['joint_names'][key[0]], config['joint_names'][key[1]]) + print(res) + print('}') + def mv1pmf_smpl(path, sub, out, mode, args): config = CONFIG[mode] - MIN_CONF_THRES = 0.5 - no_img = False - dataset = MV1PMF(path, cams=sub, config=CONFIG[mode], add_hand_face=False, + no_img = True + dataset = MV1PMF(path, cams=sub, config=CONFIG[mode], mode=args.body, undis=args.undis, no_img=no_img, out=out) + if args.skel is None: + from demo_mv1pmf_skel import mv1pmf_skel + mv1pmf_skel(path, sub, out, mode, args) + args.skel = join(out, 'keypoints3d') + dataset.skel_path = args.skel kp3ds = [] start, end = args.start, min(args.end, len(dataset)) dataset.no_img = True annots_all = [] - for nf in tqdm(range(start, end), desc='triangulation'): + for nf in tqdm(range(start, end), desc='loading'): images, annots = dataset[nf] - conf = annots['keypoints'][..., -1] - conf[conf < MIN_CONF_THRES] = 0 - keypoints3d, _, kpts_repro = simple_recon_person(annots['keypoints'], dataset.Pall, ret_repro=True) - kp3ds.append(keypoints3d) + infos = dataset.read_skel(nf) + kp3ds.append(infos[0]['keypoints3d']) annots_all.append(annots) - # smooth the skeleton kp3ds = np.stack(kp3ds) + kp3ds = check_keypoints(kp3ds, 1) # optimize the human shape - body_model = load_model() - params_init = init_params(nFrames=1) + with Timer('Loading {}, {}'.format(args.model, args.gender)): + body_model = load_model(args.gender, model_type=args.model) + params_init = init_params(nFrames=1, model_type=args.model) weight = load_weight_shape() - params_shape = 
optimizeShape(body_model, params_init, kp3ds, weight_loss=weight, kintree=config['kintree'])
+    if args.model in ['smpl', 'smplh', 'smplx']:
+        # when using a model of the SMPL family, optimize the shape only with
+        # the first 14 body limbs
+        params_shape = optimizeShape(body_model, params_init, kp3ds, weight_loss=weight, kintree=CONFIG['body15']['kintree'])
+    else:
+        params_shape = optimizeShape(body_model, params_init, kp3ds, weight_loss=weight, kintree=config['kintree'])
     # optimize 3D pose
     cfg = Config()
-    params = init_params(nFrames=kp3ds.shape[0])
+    cfg.VERBOSE = args.verbose
+    cfg.MODEL = args.model
+    params = init_params(nFrames=kp3ds.shape[0], model_type=args.model)
     params['shapes'] = params_shape['shapes'].copy()
-    weight = load_weight_pose()
-    cfg.OPT_R = True
-    cfg.OPT_T = True
-    params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
-    cfg.OPT_POSE = True
-    params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
-    # optimize 2D pose
-    # render the mesh
+    weight = load_weight_pose(args.model)
+    with Timer('Optimize global RT'):
+        cfg.OPT_R = True
+        cfg.OPT_T = True
+        params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
+    with Timer('Optimize Pose/{} frames'.format(end-start)):
+        cfg.OPT_POSE = True
+        params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
+        if args.model in ['smplh', 'smplx']:
+            cfg.OPT_HAND = True
+            params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
+        if args.model == 'smplx':
+            cfg.OPT_EXPR = True
+            params = optimizePose(body_model, params, kp3ds, weight_loss=weight, kintree=config['kintree'], cfg=cfg)
+    # TODO: optimize 2D pose
+    # write out the results
     dataset.no_img = not args.vis_smpl
     for nf in tqdm(range(start, end), desc='render'):
         images, annots = dataset[nf]
         dataset.write_smpl(select_nf(params, nf-start), nf)
         if args.vis_smpl:
             vertices = body_model(return_verts=True, return_tensor=False, **select_nf(params, nf-start))
-            dataset.vis_smpl(vertices=vertices, faces=body_model.faces, images=images, nf=nf, sub_vis=args.sub_vis)
+            dataset.vis_smpl(vertices=vertices, faces=body_model.faces, images=images, nf=nf, sub_vis=args.sub_vis, add_back=True)
 
 if __name__ == "__main__":
-    import argparse
-    parser = argparse.ArgumentParser('multi_view one_person multi_frame skel')
-    parser.add_argument('path', type=str)
-    parser.add_argument('--out', type=str, default=None)
-    parser.add_argument('--sub', type=str, nargs='+', default=[],
-        help='the sub folder lists when in video mode')
-    parser.add_argument('--start', type=int, default=0,
-        help='frame start')
-    parser.add_argument('--end', type=int, default=10000,
-        help='frame end')
-    parser.add_argument('--step', type=int, default=1,
-        help='frame step')
-    parser.add_argument('--body', type=str, default='body15', choices=['body15', 'body25', 'total'])
-    parser.add_argument('--undis', action='store_true')
-    parser.add_argument('--add_hand_face', action='store_true')
+    from mytools.cmd_loader import load_parser
+    parser = load_parser()
+    parser.add_argument('--skel', type=str, default=None,
+        help='path to keypoints3d')
     parser.add_argument('--vis_smpl', action='store_true')
-    parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
-        help='the sub folder lists for visualization')
     args = parser.parse_args()
+    # print_mean_skel(args.body)
     mv1pmf_smpl(args.path, args.sub, args.out, args.body, args)
\ No newline at end of file
diff --git a/code/mytools/camera_utils.py b/code/mytools/camera_utils.py
index 83e088b..838a075 100644
--- a/code/mytools/camera_utils.py
+++ b/code/mytools/camera_utils.py
@@ -228,3 +228,19 @@ def filterKeypoints(keypoints, thres = 0.1, min_width=40, \
         add_list.append(ik)
     keypoints = keypoints[add_list, :, :]
     return keypoints, add_list
+
+
+def get_fundamental_matrix(cameras, basenames):
+    skew_op = lambda x: np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])
+    fundamental_op = lambda K_0, R_0, T_0, K_1, R_1, T_1: np.linalg.inv(K_0).T @ (
+        R_0 @ R_1.T) @ K_1.T @ skew_op(K_1 @ R_1 @ R_0.T @ (T_0 - R_0 @ R_1.T @ T_1))
+    fundamental_RT_op = lambda K_0, RT_0, K_1, RT_1: fundamental_op(K_0, RT_0[:, :3], RT_0[:, 3], K_1,
+        RT_1[:, :3], RT_1[:, 3])
+    F = {(icam, jcam): np.zeros((3, 3)) for jcam in basenames for icam in basenames}
+    for icam in basenames:
+        for jcam in basenames:
+            F[(icam, jcam)] += fundamental_RT_op(cameras[icam]['K'], cameras[icam]['RT'], cameras[jcam]['K'], cameras[jcam]['RT'])
+            if F[(icam, jcam)].sum() == 0:
+                F[(icam, jcam)] += 1e-12 # to avoid nan
+    return F
\ No newline at end of file
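# An illustrative numeric check of get_fundamental_matrix above (hypothetical
# helper, arbitrary camera values): for a 3D point X imaged as x_i and x_j, the
# epipolar constraint x_i^T F[(i, j)] x_j = 0 should hold up to float error;
# 'RT' is the 3x4 [R|T] as assembled by read_camera.
def _check_fundamental_matrix():
    import cv2
    K = np.array([[1000., 0., 500.], [0., 1000., 500.], [0., 0., 1.]])
    Rb, _ = cv2.Rodrigues(np.array([0., 0.3, 0.]))
    cameras = {
        'a': {'K': K, 'RT': np.hstack([np.eye(3), np.zeros((3, 1))])},
        'b': {'K': K, 'RT': np.hstack([Rb, np.array([[0.5], [0.], [0.]])])},
    }
    F = get_fundamental_matrix(cameras, ['a', 'b'])
    X = np.array([0.2, 0.1, 3.0])
    def project(cam):
        x = cam['K'] @ (cam['RT'][:, :3] @ X + cam['RT'][:, 3])
        return x / x[2]
    xa, xb = project(cameras['a']), project(cameras['b'])
    assert abs(xa @ F[('a', 'b')] @ xb) < 1e-5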
diff --git a/code/mytools/cmd_loader.py b/code/mytools/cmd_loader.py
new file mode 100644
index 0000000..bc93f4c
--- /dev/null
+++ b/code/mytools/cmd_loader.py
@@ -0,0 +1,44 @@
+'''
+    @ Date: 2021-01-15 12:09:27
+    @ Author: Qing Shuai
+    @ LastEditors: Qing Shuai
+    @ LastEditTime: 2021-01-24 20:57:22
+    @ FilePath: /EasyMocapRelease/code/mytools/cmd_loader.py
+'''
+
+import argparse
+
+def load_parser():
+    parser = argparse.ArgumentParser('EasyMocap command line tools')
+    parser.add_argument('path', type=str)
+    parser.add_argument('--out', type=str, default=None)
+    parser.add_argument('--annot', type=str, default=None)
+    parser.add_argument('--sub', type=str, nargs='+', default=[],
+        help='the sub folder lists when in video mode')
+    parser.add_argument('--start', type=int, default=0,
+        help='frame start')
+    parser.add_argument('--end', type=int, default=10000,
+        help='frame end')
+    parser.add_argument('--step', type=int, default=1,
+        help='frame step')
+    #
+    # keypoints and body model
+    #
+    parser.add_argument('--body', type=str, default='body25', choices=['body15', 'body25', 'bodyhand', 'bodyhandface', 'total'])
+    parser.add_argument('--model', type=str, default='smpl', choices=['smpl', 'smplh', 'smplx', 'mano'])
+    parser.add_argument('--gender', type=str, default='neutral',
+        choices=['neutral', 'male', 'female'])
+    #
+    # visualization part
+    #
+    parser.add_argument('--vis_det', action='store_true')
+    parser.add_argument('--vis_repro', action='store_true')
+    parser.add_argument('--undis', action='store_true')
+    parser.add_argument('--sub_vis', type=str, nargs='+', default=[],
+        help='the sub folder lists for visualization')
+    #
+    # debug
+    #
+    parser.add_argument('--verbose', action='store_true')
+    parser.add_argument('--debug', action='store_true')
+    return parser
\ No newline at end of file
diff --git a/code/mytools/reconstruction.py b/code/mytools/reconstruction.py
index 7861ce1..3657a82 100644
--- a/code/mytools/reconstruction.py
+++ b/code/mytools/reconstruction.py
@@ -2,8 +2,8 @@
  * @ Date: 2020-09-14 11:01:52
  * @ Author: Qing Shuai
  @ LastEditors: Qing Shuai
- @ LastEditTime: 2021-01-13 11:30:38
- @ FilePath: /EasyMocap/code/mytools/reconstruction.py
+ @ LastEditTime: 2021-01-24 22:28:09
+ @ FilePath: /EasyMocapRelease/code/mytools/reconstruction.py
 '''
 import numpy as np
@@ -45,13 +45,9 @@ def simple_triangulate(kpts, Pall):
         A[i*2 + 1, :] = kpts[i, 2]*(kpts[i, 1]*P[2:3,:] - P[1:2,:])
     result[:3] = solveZ(A)
     return result
-    # kpts_proj = projectN3(result, Pall)
-    # repro_error = simple_reprojection_error(kpts, kpts_proj)
-    # return kpts3d, conf/nViews, repro_error/nViews
-    # else:
-    #     return kpts3d, conf
 
-def simple_recon_person(keypoints_use, Puse, ret_repro=False, max_error=100):
+def simple_recon_person(keypoints_use, Puse, config=None, ret_repro=False):
+    eps = 0.01
     nJoints = keypoints_use[0].shape[0]
     if isinstance(keypoints_use, list):
         keypoints_use = np.stack(keypoints_use)
@@ -61,23 +57,33 @@
         if (keypoints[:, 2] > 0.01).sum() < 2:
             continue
         out[nj] = simple_triangulate(keypoints, Puse)
+    if config is not None:
+        # remove implausible limbs with the help of the limb-length prior
+        for (i, j), mean_std in config['skeleton'].items():
+            ii, jj = min(i, j), max(i, j)
+            if out[ii, -1] < eps:
+                out[jj, -1] = 0
+            if out[jj, -1] < eps:
+                continue
+            length = np.linalg.norm(out[ii, :3] - out[jj, :3])
+            if abs(length - mean_std['mean'])/(3*mean_std['std']) > 1:
+                # print((i, j), length, mean_std)
+                out[jj, :] = 0
     # compute the reprojection error
     kpts_repro = projectN3(out, Puse)
     square_diff = (keypoints_use[:, :, :2] - kpts_repro[:, :, :2])**2
-    conf = (out[None, :, -1] > 0.01) * (keypoints_use[:, :, 2] > 0.01)
+    # conf = (out[None, :, -1] > 0.01) * (keypoints_use[:, :, 2] > 0.01)
+    conf = np.repeat(out[None, :, -1:], len(Puse), 0)
+    kpts_repro = np.concatenate((kpts_repro, conf), axis=2)
     if conf.sum() < 3:
         # at least 3 valid joints are required
         repro_error = 1e3
     else:
-        repro_error_joint = np.sqrt(square_diff.sum(axis=2))*conf
-        num_valid_view = conf.sum(axis=0)
-        # for joints seen by too few views, force them to be invisible
-        repro_error_joint[:, num_valid_view==0] = max_error * 2
-        num_valid_view[num_valid_view==0] = 1
-        repro_error_joint_ = repro_error_joint.sum(axis=0)/num_valid_view
-        # print(repro_error_joint_)
-        not_valid = np.where(repro_error_joint_>max_error)[0]
-        out[not_valid, -1] = 0
+        # (nViews, nJoints): reprojection error for each joint in each view
+        repro_error_joint = np.sqrt(square_diff.sum(axis=2, keepdims=True))*conf
+        # remove the not valid joints
+        # remove the bad views
         repro_error = repro_error_joint.sum()/conf.sum()
+
     if ret_repro:
         return out, repro_error, kpts_repro
     return out, repro_error
diff --git a/code/mytools/utils.py b/code/mytools/utils.py
new file mode 100644
index 0000000..878b68a
--- /dev/null
+++ b/code/mytools/utils.py
@@ -0,0 +1,21 @@
+'''
+    @ Date: 2021-01-15 11:12:00
+    @ Author: Qing Shuai
+    @ LastEditors: Qing Shuai
+    @ LastEditTime: 2021-01-15 11:19:55
+    @ FilePath: /EasyMocap/code/mytools/utils.py
+'''
+import time
+
+class Timer:
+    def __init__(self, name, silent=False):
+        self.name = name
+        self.silent = silent
+
+    def __enter__(self):
+        self.start = time.time()
+
+    def __exit__(self, exc_type, exc_value, exc_tb):
+        end = time.time()
+        if not self.silent:
+            print('-> [{}]: {:.2f}s'.format(self.name, end-self.start))
diff --git a/code/mytools/vis_base.py b/code/mytools/vis_base.py
index 986da91..acfea4a 100644
--- a/code/mytools/vis_base.py
+++ b/code/mytools/vis_base.py
@@ -2,7 +2,7 @@
     @ Date: 2020-11-28 17:23:04
     @ Author: Qing Shuai
     @ LastEditors: Qing Shuai
-    @ LastEditTime: 2021-01-14 17:11:51
+    @ LastEditTime: 2021-01-21 15:16:52
     @ FilePath: /EasyMocap/code/mytools/vis_base.py
 '''
 import cv2
@@ -73,12 +73,13 @@ def plot_keypoints(img, points, pid, config, vis_conf=False, use_limb_color=True
             col = get_rgb(config['colors'][ii])
         else:
col = get_rgb(pid) - if pt1[2] > 0.01 and pt2[2] > 0.01: + if pt1[-1] > 0.01 and pt2[-1] > 0.01: image = cv2.line( img, (int(pt1[0]+0.5), int(pt1[1]+0.5)), (int(pt2[0]+0.5), int(pt2[1]+0.5)), col, lw) for i in range(len(points)): - x, y, c = points[i] + x, y = points[i][0], points[i][1] + c = points[i][-1] if c > 0.01: col = get_rgb(pid) cv2.circle(img, (int(x+0.5), int(y+0.5)), lw*2, col, -1) @@ -98,9 +99,11 @@ def merge(images, row=-1, col=-1, resize=False, ret_range=False): images = [images[i] for i in [0, 1, 2, 3, 7, 6, 5, 4]] if len(images) == 7: row, col = 3, 3 + elif len(images) == 2: + row, col = 2, 1 height = images[0].shape[0] width = images[0].shape[1] - ret_img = np.zeros((height * row, width * col, 3), dtype=np.uint8) + 255 + ret_img = np.zeros((height * row, width * col, images[0].shape[2]), dtype=np.uint8) + 255 ranges = [] for i in range(row): for j in range(col): diff --git a/code/pyfitting/lossfactory.py b/code/pyfitting/lossfactory.py index 0f5ed2c..7556fd9 100644 --- a/code/pyfitting/lossfactory.py +++ b/code/pyfitting/lossfactory.py @@ -2,51 +2,86 @@ @ Date: 2020-11-19 17:46:04 @ Author: Qing Shuai @ LastEditors: Qing Shuai - @ LastEditTime: 2021-01-14 15:02:39 + @ LastEditTime: 2021-01-22 16:51:55 @ FilePath: /EasyMocap/code/pyfitting/lossfactory.py ''' import torch from .operation import projection, batch_rodrigues -def ReprojectionLoss(keypoints3d, keypoints2d, K, Rc, Tc, inv_bbox_sizes): +def ReprojectionLoss(keypoints3d, keypoints2d, K, Rc, Tc, inv_bbox_sizes, norm='l2'): img_points = projection(keypoints3d, K, Rc, Tc) - residual = (img_points - keypoints2d[:, :, :2]) * keypoints2d[:, :, 2:3] - squared_res = (residual ** 2) * inv_bbox_sizes + residual = (img_points - keypoints2d[:, :, :2]) * keypoints2d[:, :, -1:] + # squared_res: (nFrames, nJoints, 2) + if norm == 'l2': + squared_res = (residual ** 2) * inv_bbox_sizes + elif norm == 'l1': + squared_res = torch.abs(residual) * inv_bbox_sizes + else: + import ipdb; ipdb.set_trace() return torch.sum(squared_res) class SMPLAngleLoss: - def __init__(self, keypoints): - use_feet = keypoints[:, [19, 20, 21, 22, 23, 24], -1].sum() > 0.1 - use_head = keypoints[:, [15, 16, 17, 18], -1].sum() > 0.1 - SMPL_JOINT_ZERO_IDX = [3, 6, 9, 13, 14, 20, 21, 22, 23] + def __init__(self, keypoints, model_type='smpl'): + if keypoints.shape[1] <= 15: + use_feet = False + use_head = False + else: + use_feet = keypoints[:, [19, 20, 21, 22, 23, 24], -1].sum() > 0.1 + use_head = keypoints[:, [15, 16, 17, 18], -1].sum() > 0.1 + if model_type == 'smpl': + SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14, 20, 21, 22, 23] + elif model_type == 'smplh': + SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14] + elif model_type == 'smplx': + SMPL_JOINT_ZERO_IDX = [3, 6, 9, 10, 11, 13, 14] + else: + raise NotImplementedError if not use_feet: SMPL_JOINT_ZERO_IDX.extend([7, 8]) if not use_head: SMPL_JOINT_ZERO_IDX.extend([12, 15]) SMPL_POSES_ZERO_IDX = [[j for j in range(3*i, 3*i+3)] for i in SMPL_JOINT_ZERO_IDX] SMPL_POSES_ZERO_IDX = sum(SMPL_POSES_ZERO_IDX, []) + # SMPL_POSES_ZERO_IDX.extend([36, 37, 38, 45, 46, 47]) self.idx = SMPL_POSES_ZERO_IDX def loss(self, poses): return torch.sum(torch.abs(poses[:, self.idx])) -def SmoothLoss(body_params, keys, weight_loss, span=4): +def SmoothLoss(body_params, keys, weight_loss, span=4, model_type='smpl'): spans = [i for i in range(1, span)] span_weights = {i:1/i for i in range(1, span)} span_weights = {key: i/sum(span_weights) for key, i in span_weights.items()} loss_dict = {} nFrames = body_params['poses'].shape[0] 
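
The `norm` switch added to `ReprojectionLoss` above trades the usual squared residual for an L1 penalty that is less sensitive to outlier detections; both variants weight residuals by detection confidence and normalize by bounding-box size. A self-contained sketch, assuming the shapes noted in the diff's comments:

```python
import torch

def reproj_loss(img_points, keypoints2d, inv_bbox_sizes, norm='l2'):
    # img_points: (nFrames, nJoints, 2) projected model joints
    # keypoints2d: (nFrames, nJoints, 3), detection confidence in the last channel
    residual = (img_points - keypoints2d[..., :2]) * keypoints2d[..., -1:]
    if norm == 'l2':
        return torch.sum(residual ** 2 * inv_bbox_sizes)    # squared error
    elif norm == 'l1':
        return torch.sum(residual.abs() * inv_bbox_sizes)   # robust to outliers
    raise NotImplementedError(norm)
```
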
- for key in ['poses', 'Th']: + nPoses = body_params['poses'].shape[1] + if model_type == 'smplh' or model_type == 'smplx': + nPoses = 66 + for key in ['poses', 'Th', 'poses_hand', 'expression']: + if key not in keys: + continue k = 'smooth_' + key if k in weight_loss.keys() and weight_loss[k] > 0.: loss_dict[k] = 0. for span in spans: - val = torch.sum((body_params[key][span:, :] - body_params[key][:nFrames-span, :])**2) + if key == 'poses_hand': + val = torch.sum((body_params['poses'][span:, 66:] - body_params['poses'][:nFrames-span, 66:])**2) + else: + val = torch.sum((body_params[key][span:, :nPoses] - body_params[key][:nFrames-span, :nPoses])**2) + loss_dict[k] += span_weights[span] * val + k = 'smooth_' + key + '_l1' + if k in weight_loss.keys() and weight_loss[k] > 0.: + loss_dict[k] = 0. + for span in spans: + if key == 'poses_hand': + val = torch.sum((body_params['poses'][span:, 66:] - body_params['poses'][:nFrames-span, 66:]).abs()) + else: + val = torch.sum((body_params[key][span:, :nPoses] - body_params[key][:nFrames-span, :nPoses]).abs()) loss_dict[k] += span_weights[span] * val # smooth rotation rot = batch_rodrigues(body_params['Rh']) key, k = 'Rh', 'smooth_Rh' - if k in weight_loss.keys() and weight_loss[k] > 0.: + if key in keys and k in weight_loss.keys() and weight_loss[k] > 0.: loss_dict[k] = 0. for span in spans: val = torch.sum((rot[span:, :] - rot[:nFrames-span, :])**2) @@ -55,10 +90,24 @@ def SmoothLoss(body_params, keys, weight_loss, span=4): def RegularizationLoss(body_params, body_params_init, weight_loss): loss_dict = {} - for key in ['poses', 'shapes', 'Th']: - if 'init_'+key in weight_loss.keys() and weight_loss['init_'+key] > 0.: + for key in ['poses', 'shapes', 'Th', 'hands', 'head', 'expression']: + if 'init_'+key in weight_loss.keys() and weight_loss['init_'+key] > 0.: + if key == 'poses': + loss_dict['init_'+key] = torch.sum((body_params[key][:, :66] - body_params_init[key][:, :66])**2) + elif key == 'hands': + loss_dict['init_'+key] = torch.sum((body_params['poses'][: , 66:66+12] - body_params_init['poses'][:, 66:66+12])**2) + elif key == 'head': + loss_dict['init_'+key] = torch.sum((body_params['poses'][: , 78:78+9] - body_params_init['poses'][:, 78:78+9])**2) + elif key in body_params.keys(): loss_dict['init_'+key] = torch.sum((body_params[key] - body_params_init[key])**2) - for key in ['poses', 'shapes']: + for key in ['poses', 'shapes', 'hands', 'head', 'expression']: if 'reg_'+key in weight_loss.keys() and weight_loss['reg_'+key] > 0.: - loss_dict['reg_'+key] = torch.sum((body_params[key])**2) + if key == 'poses': + loss_dict['reg_'+key] = torch.sum((body_params[key][:, :66])**2) + elif key == 'hands': + loss_dict['reg_'+key] = torch.sum((body_params['poses'][: , 66:66+12])**2) + elif key == 'head': + loss_dict['reg_'+key] = torch.sum((body_params['poses'][: , 78:78+9])**2) + elif key in body_params.keys(): + loss_dict['reg_'+key] = torch.sum((body_params[key])**2) return loss_dict \ No newline at end of file diff --git a/code/pyfitting/operation.py b/code/pyfitting/operation.py index 5272249..f45ea63 100644 --- a/code/pyfitting/operation.py +++ b/code/pyfitting/operation.py @@ -2,7 +2,7 @@ @ Date: 2020-11-19 11:39:45 @ Author: Qing Shuai @ LastEditors: Qing Shuai - @ LastEditTime: 2020-11-19 11:50:20 + @ LastEditTime: 2021-01-20 15:06:28 @ FilePath: /EasyMocap/code/pyfitting/operation.py ''' import torch @@ -47,12 +47,18 @@ def projection(points3d, camera_intri, R=None, T=None, distance=None): points3d {Tensor} -- (bn, N, 3) camera_intri {Tensor} 
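
The hard-coded slices in `RegularizationLoss` above (`:66`, `66:66+12`, `78:78+9`) follow the flattened pose layout this diff uses for the SMPL-X branch, i.e. the `66 + 12 + 9` parameters that `NUM_POSES` later declares in `body_param.py`. Spelled out (the slice names are ours):

```python
# Flattened SMPL-X pose layout assumed by the fitting code (66 + 12 + 9 = 87)
BODY = slice(0, 66)     # global orientation + 21 body joints, axis-angle
HANDS = slice(66, 78)   # 6 PCA coefficients per hand (left, then right)
HEAD = slice(78, 87)    # jaw, left-eye and right-eye rotations

def reg_hands(poses):
    # e.g. the 'reg_hands' term penalizes only the hand PCA coefficients
    return (poses[:, HANDS] ** 2).sum()
```
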
-- (bn, 3, 3)
         distance {Tensor} -- (bn, 1, 1)
+        R: bn, 3, 3
+        T: bn, 3, 1
     Returns:
         points2d -- (bn, N, 2)
     """
     if R is not None:
         Rt = torch.transpose(R, 1, 2)
-        points3d = torch.matmul(points3d, Rt) + T
+        if T.shape[-1] == 1:
+            Tt = torch.transpose(T, 1, 2)
+            points3d = torch.matmul(points3d, Rt) + Tt
+        else:
+            points3d = torch.matmul(points3d, Rt) + T
     if distance is None:
         img_points = torch.div(points3d[:, :, :2],
diff --git a/code/pyfitting/optimize_simple.py b/code/pyfitting/optimize_simple.py
index e91ffdd..0fd3a07 100644
--- a/code/pyfitting/optimize_simple.py
+++ b/code/pyfitting/optimize_simple.py
@@ -2,8 +2,8 @@
   @ Date: 2020-11-19 10:49:26
   @ Author: Qing Shuai
   @ LastEditors: Qing Shuai
-  @ LastEditTime: 2021-01-14 20:19:34
-  @ FilePath: /EasyMocap/code/pyfitting/optimize_simple.py
+  @ LastEditTime: 2021-01-24 21:29:12
+  @ FilePath: /EasyMocapRelease/code/pyfitting/optimize_simple.py
 '''
 import numpy as np
 import torch
@@ -213,6 +213,7 @@ def optimizeShape(body_model, body_params, keypoints3d,
     limb_length = torch.Tensor(limb_length).to(device)
     limb_conf = torch.Tensor(limb_conf).to(device)
     body_params = {key:torch.Tensor(val).to(device) for key, val in body_params.items()}
+    body_params_init = {key:val.clone() for key, val in body_params.items()}
     opt_params = [body_params['shapes']]
     grad_require(opt_params, True)
     optimizer = LBFGS(
@@ -226,14 +227,16 @@ def optimizeShape(body_model, body_params, keypoints3d,
         dst = keypoints3d[:, kintree[:, 1], :3]
         direct_est = (dst - src).detach()
         direct_norm = torch.norm(direct_est, dim=2, keepdim=True)
-        direct_normalized = direct_est/direct_norm
+        direct_normalized = direct_est/(direct_norm + 1e-4)
         err = dst - src - direct_normalized * limb_length
         loss_dict = {
             's3d': torch.sum(err**2*limb_conf)/nFrames,
-            'reg_shape': torch.sum(body_params['shapes']**2)}
+            'reg_shapes': torch.sum(body_params['shapes']**2)}
+        if 'init_shape' in weight_loss.keys():
+            loss_dict['init_shape'] = torch.sum((body_params['shapes'] - body_params_init['shapes'])**2)
         # fittingLog.step(loss_dict, weight_loss)
         if verbose:
-            print(' '.join([key + ' %f'%(loss_dict[key].item()*weight_loss[key])
+            print(' '.join([key + ' %.3f'%(loss_dict[key].item()*weight_loss[key])
                 for key in loss_dict.keys() if weight_loss[key]>0]))
         loss = sum([loss_dict[key]*weight_loss[key]
                     for key in loss_dict.keys()])
@@ -255,6 +258,9 @@ def optimizeShape(body_model, body_params, keypoints3d,
     body_params = {key:val.detach().cpu().numpy() for key, val in body_params.items()}
     return body_params
 
+N_BODY = 25
+N_HAND = 21
+
 def optimizePose(body_model, body_params, keypoints3d,
     weight_loss, kintree, cfg=None):
     """ simple function for optimizing model pose given 3d keypoints
@@ -268,22 +274,16 @@ def optimizePose(body_model, body_params, keypoints3d,
         cfg (Config): Config Node controlling running mode
     """
     device = body_model.device
+    model_type = body_model.model_type
     # compute the limb lengths
     kintree = np.array(kintree, dtype=np.int)
     nFrames = keypoints3d.shape[0]
-    # limb_length: nFrames, nLimbs, 1
-    limb = keypoints3d[:, kintree[:, 1], :3] - keypoints3d[:, kintree[:, 0], :3]
-    limb_length = np.linalg.norm(limb, axis=2, keepdims=True)
-    # conf: nFrames, nLimbs, 1
-    limb_conf = np.minimum(keypoints3d[:, kintree[:, 1], 3:], keypoints3d[:, kintree[:, 0], 3:])
-    limb_dir = limb/limb_length
-
+    nJoints = keypoints3d.shape[1]
     keypoints3d = torch.Tensor(keypoints3d).to(device)
-    limb_dir = torch.Tensor(limb_dir).to(device).unsqueeze(2)
-    limb_conf = torch.Tensor(limb_conf).to(device)
-    angle_prior = SMPLAngleLoss(keypoints3d)
+
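
The branch added to `projection` above exists because `T` may arrive either as a `(bn, 3, 1)` column vector or a `(bn, 1, 3)` row vector; with row-vector points the world-to-camera map is `X_cam = X R^T + T^T`. A minimal sketch of just that transform:

```python
import torch

def world_to_cam(points3d, R, T):
    # points3d: (bn, N, 3); R: (bn, 3, 3); T: (bn, 3, 1) or (bn, 1, 3)
    Rt = torch.transpose(R, 1, 2)
    if T.shape[-1] == 1:               # column-vector translation
        T = torch.transpose(T, 1, 2)   # -> (bn, 1, 3), broadcasts over the N points
    return torch.matmul(points3d, Rt) + T
```
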
angle_prior = SMPLAngleLoss(keypoints3d, body_model.model_type) body_params = {key:torch.Tensor(val).to(device) for key, val in body_params.items()} + body_params_init = {key:val.clone() for key, val in body_params.items()} if cfg is None: opt_params = [body_params['Rh'], body_params['Th'], body_params['poses']] verbose = False @@ -297,35 +297,46 @@ def optimizePose(body_model, body_params, keypoints3d, opt_params.append(body_params['poses']) if cfg.OPT_SHAPE: opt_params.append(body_params['shapes']) + if cfg.OPT_EXPR and model_type == 'smplx': + opt_params.append(body_params['expression']) verbose = cfg.VERBOSE grad_require(opt_params, True) optimizer = LBFGS( opt_params, line_search_fn='strong_wolfe') zero_pose = torch.zeros((nFrames, 3), device=device) + if not cfg.OPT_HAND and model_type in ['smplh', 'smplx']: + zero_pose_hand = torch.zeros((nFrames, body_params['poses'].shape[1] - 66), device=device) + nJoints = N_BODY + keypoints3d = keypoints3d[:, :nJoints] + elif cfg.OPT_HAND and not cfg.OPT_EXPR and model_type == 'smplx': + zero_pose_face = torch.zeros((nFrames, body_params['poses'].shape[1] - 78), device=device) + nJoints = N_BODY + N_HAND * 2 + keypoints3d = keypoints3d[:, :nJoints] + else: + nJoints = keypoints3d.shape[1] def closure(debug=False): optimizer.zero_grad() new_params = body_params.copy() - new_params['poses'] = torch.cat([zero_pose, body_params['poses'][:, 3:]], dim=1) - kpts_est = body_model(return_verts=False, return_tensor=True, **new_params) - diff_square = (kpts_est - keypoints3d[..., :3])**2 - if False: - pass + if not cfg.OPT_HAND and cfg.MODEL in ['smplh', 'smplx']: + new_params['poses'] = torch.cat([zero_pose, body_params['poses'][:, 3:66], zero_pose_hand], dim=1) else: - conf = keypoints3d[..., 3:] + new_params['poses'] = torch.cat([zero_pose, body_params['poses'][:, 3:]], dim=1) + kpts_est = body_model(return_verts=False, return_tensor=True, **new_params)[:, :nJoints, :] + diff_square = (kpts_est[:, :nJoints, :3] - keypoints3d[..., :3])**2 + # TODO:add robust loss + conf = keypoints3d[..., 3:] loss_3d = torch.sum(conf * diff_square) - if False: - src = keypoints3d[:, kintree[:, 0], :3].detach() - dst = keypoints3d[:, kintree[:, 1], :3] - direct_est = dst - src - direct_norm = torch.norm(direct_est, dim=2, keepdim=True) - direct_normalized = direct_est/direct_norm - loss_dict = { 'k3d': loss_3d, 'reg_poses_zero': angle_prior.loss(body_params['poses']) } + # regularize + loss_dict.update(RegularizationLoss(body_params, body_params_init, weight_loss)) # smooth - loss_dict.update(SmoothLoss(body_params, ['poses', 'Th'], weight_loss)) + smooth_conf = keypoints3d[1:, ..., -1:]**2 + loss_dict['smooth_body'] = torch.sum(smooth_conf[:, :N_BODY] * torch.abs(kpts_est[:-1, :N_BODY] - kpts_est[1:, :N_BODY])) + if cfg.OPT_HAND and cfg.MODEL in ['smplh', 'smplx']: + loss_dict['smooth_hand'] = torch.sum(smooth_conf[:, N_BODY:N_BODY+N_HAND*2] * torch.abs(kpts_est[:-1, N_BODY:N_BODY+N_HAND*2] - kpts_est[1:, N_BODY:N_BODY+N_HAND*2])) for key in loss_dict.keys(): loss_dict[key] = loss_dict[key]/nFrames # fittingLog.step(loss_dict, weight_loss) diff --git a/code/smplmodel/__init__.py b/code/smplmodel/__init__.py index 4e63938..08988ac 100644 --- a/code/smplmodel/__init__.py +++ b/code/smplmodel/__init__.py @@ -2,8 +2,9 @@ @ Date: 2020-11-18 14:33:20 @ Author: Qing Shuai @ LastEditors: Qing Shuai - @ LastEditTime: 2021-01-14 20:12:26 + @ LastEditTime: 2021-01-20 16:33:02 @ FilePath: /EasyMocap/code/smplmodel/__init__.py ''' from .body_model import SMPLlayer -from .body_param 
import merge_params, select_nf, init_params, Config \ No newline at end of file +from .body_param import load_model +from .body_param import merge_params, select_nf, init_params, Config, check_params, check_keypoints \ No newline at end of file diff --git a/code/smplmodel/body_model.py b/code/smplmodel/body_model.py index daa5bcf..5e9a045 100644 --- a/code/smplmodel/body_model.py +++ b/code/smplmodel/body_model.py @@ -2,7 +2,7 @@ @ Date: 2020-11-18 14:04:10 @ Author: Qing Shuai @ LastEditors: Qing Shuai - @ LastEditTime: 2021-01-14 20:14:34 + @ LastEditTime: 2021-01-22 16:04:54 @ FilePath: /EasyMocap/code/smplmodel/body_model.py ''' import torch @@ -11,6 +11,7 @@ from .lbs import lbs, batch_rodrigues import os.path as osp import pickle import numpy as np +import os def to_tensor(array, dtype=torch.float32, device=torch.device('cpu')): if 'torch.tensor' not in str(type(array)): @@ -23,13 +24,29 @@ def to_np(array, dtype=np.float32): array = array.todense() return np.array(array, dtype=dtype) +def load_regressor(regressor_path): + if regressor_path.endswith('.npy'): + X_regressor = to_tensor(np.load(regressor_path)) + elif regressor_path.endswith('.txt'): + data = np.loadtxt(regressor_path) + with open(regressor_path, 'r') as f: + shape = f.readline().split()[1:] + reg = np.zeros((int(shape[0]), int(shape[1]))) + for i, j, v in data: + reg[int(i), int(j)] = v + X_regressor = to_tensor(reg) + else: + import ipdb; ipdb.set_trace() + return X_regressor + class SMPLlayer(nn.Module): - def __init__(self, model_path, gender='neutral', device=None, + def __init__(self, model_path, model_type='smpl', gender='neutral', device=None, regressor_path=None) -> None: super(SMPLlayer, self).__init__() dtype = torch.float32 self.dtype = dtype self.device = device + self.model_type = model_type # create the SMPL model if osp.isdir(model_path): model_fn = 'SMPL_{}.{ext}'.format(gender.upper(), ext='pkl') @@ -58,13 +75,18 @@ class SMPLlayer(nn.Module): parents = to_tensor(to_np(data['kintree_table'][0])).long() parents[0] = -1 self.register_buffer('parents', parents) + if self.model_type == 'smplx': + # shape + self.num_expression_coeffs = 10 + self.num_shapes = 10 + self.shapedirs = self.shapedirs[:, :, :self.num_shapes+self.num_expression_coeffs] # joints regressor if regressor_path is not None: - X_regressor = to_tensor(np.load(regressor_path)) + X_regressor = load_regressor(regressor_path) X_regressor = torch.cat((self.J_regressor, X_regressor), dim=0) - j_J_regressor = torch.zeros(24, X_regressor.shape[0], device=device) - for i in range(24): + j_J_regressor = torch.zeros(self.J_regressor.shape[0], X_regressor.shape[0], device=device) + for i in range(self.J_regressor.shape[0]): j_J_regressor[i, i] = 1 j_v_template = X_regressor @ self.v_template # @@ -79,8 +101,65 @@ class SMPLlayer(nn.Module): self.register_buffer('j_weights', j_weights) self.register_buffer('j_v_template', j_v_template) self.register_buffer('j_J_regressor', j_J_regressor) + if self.model_type == 'smplh': + # load smplh data + self.num_pca_comps = 6 + from os.path import join + for key in ['LEFT', 'RIGHT']: + left_file = join(os.path.dirname(smpl_path), 'MANO_{}.pkl'.format(key)) + with open(left_file, 'rb') as f: + data = pickle.load(f, encoding='latin1') + val = to_tensor(to_np(data['hands_mean'].reshape(1, -1)), dtype=dtype) + self.register_buffer('mHandsMean'+key[0], val) + val = to_tensor(to_np(data['hands_components'][:self.num_pca_comps, :]), dtype=dtype) + self.register_buffer('mHandsComponents'+key[0], val) + self.use_pca = True 
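
The `hands_components`/`hands_mean` buffers registered here implement the MANO PCA parameterization: each hand is optimized as `num_pca_comps = 6` coefficients and expanded to the full 45 axis-angle parameters as `mean + coeffs @ components`. A sketch with those shapes (the function name is ours):

```python
import torch

def expand_hand_pose(coeffs, components, hands_mean):
    # coeffs: (bn, 6) PCA coefficients; components: (6, 45); hands_mean: (1, 45)
    return coeffs @ components + hands_mean   # (bn, 45): 15 hand joints, axis-angle
```
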
+            self.use_flat_mean = True
+        elif self.model_type == 'smplx':
+            # hand pose
+            self.num_pca_comps = 6
+            from os.path import join
+            for key in ['Ll', 'Rr']:
+                val = to_tensor(to_np(data['hands_mean'+key[1]].reshape(1, -1)), dtype=dtype)
+                self.register_buffer('mHandsMean'+key[0], val)
+                val = to_tensor(to_np(data['hands_components'+key[1]][:self.num_pca_comps, :]), dtype=dtype)
+                self.register_buffer('mHandsComponents'+key[0], val)
+            self.use_pca = True
+            self.use_flat_mean = True
+
+    def extend_pose(self, poses):
+        if self.model_type not in ['smplh', 'smplx']:
+            return poses
+        elif self.model_type == 'smplh' and poses.shape[-1] == 156:
+            return poses
+        elif self.model_type == 'smplx' and poses.shape[-1] == 165:
+            return poses
+
+        NUM_BODYJOINTS = 22 * 3
+        if self.use_pca:
+            NUM_HANDJOINTS = self.num_pca_comps
+        else:
+            NUM_HANDJOINTS = 15 * 3
+        NUM_FACEJOINTS = 3 * 3
+        poses_lh = poses[:, NUM_BODYJOINTS:NUM_BODYJOINTS + NUM_HANDJOINTS]
+        poses_rh = poses[:, NUM_BODYJOINTS + NUM_HANDJOINTS:NUM_BODYJOINTS+NUM_HANDJOINTS*2]
+        if self.use_pca:
+            poses_lh = poses_lh @ self.mHandsComponentsL
+            poses_rh = poses_rh @ self.mHandsComponentsR
+        if self.use_flat_mean:
+            poses_lh = poses_lh + self.mHandsMeanL
+            poses_rh = poses_rh + self.mHandsMeanR
+        if self.model_type == 'smplh':
+            poses = torch.cat([poses[:, :NUM_BODYJOINTS], poses_lh, poses_rh], dim=1)
+        elif self.model_type == 'smplx':
+            # the head part has only three joints
+            # poses_head: (N, 9), jaw_pose, leye_pose, reye_pose respectively
+            poses_head = poses[:, NUM_BODYJOINTS+NUM_HANDJOINTS*2:]
+            # body, head, left hand, right hand
+            poses = torch.cat([poses[:, :NUM_BODYJOINTS], poses_head, poses_lh, poses_rh], dim=1)
+        return poses
 
-    def forward(self, poses, shapes, Rh=None, Th=None, return_verts=True, return_tensor=True, only_shape=False, **kwargs):
+    def forward(self, poses, shapes, Rh=None, Th=None, expression=None, return_verts=True, return_tensor=True, only_shape=False, **kwargs):
         """ Forward pass for SMPL model
 
         Args:
@@ -96,13 +175,23 @@ class SMPLlayer(nn.Module):
         shapes = to_tensor(shapes, dtype, device)
         Rh = to_tensor(Rh, dtype, device)
         Th = to_tensor(Th, dtype, device)
+        if expression is not None:
+            expression = to_tensor(expression, dtype, device)
+
         bn = poses.shape[0]
+        # process Rh, Th
         if Rh is None:
             Rh = torch.zeros(bn, 3, device=poses.device)
         rot = batch_rodrigues(Rh)
         transl = Th.unsqueeze(dim=1)
+        # process shapes
         if shapes.shape[0] < bn:
             shapes = shapes.expand(bn, -1)
+        if expression is not None and self.model_type == 'smplx':
+            shapes = torch.cat([shapes, expression], dim=1)
+        # process poses
+        if self.model_type == 'smplh' or self.model_type == 'smplx':
+            poses = self.extend_pose(poses)
         if return_verts:
             vertices, joints = lbs(shapes, poses, self.v_template,
                                    self.shapedirs, self.posedirs,
@@ -113,7 +202,7 @@ class SMPLlayer(nn.Module):
                                    self.j_shapedirs, self.j_posedirs,
                                    self.j_J_regressor, self.parents,
                                    self.j_weights, pose2rot=True, dtype=self.dtype, only_shape=only_shape)
-            vertices = vertices[:, 24:, :]
+            vertices = vertices[:, self.J_regressor.shape[0]:, :]
         vertices = torch.matmul(vertices, rot.transpose(1, 2)) + transl
         if not return_tensor:
             vertices = vertices.detach().cpu().numpy()
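
Before the `body_param.py` changes below, a sketch of how the new pieces compose end to end, assuming `code/` is on `sys.path` and the model files are laid out as in the Installation section:

```python
import sys; sys.path.append('code')   # assumption: run from the repository root
from smplmodel import load_model, init_params

body_model = load_model(gender='neutral', use_cuda=True, model_type='smplx')
params = init_params(nFrames=10, model_type='smplx')   # zero poses/shapes/Rh/Th/expression
joints = body_model(return_verts=False, return_tensor=False, **params)  # regressed 3D joints
```
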
diff --git a/code/smplmodel/body_param.py b/code/smplmodel/body_param.py
index d9363c1..0063e91 100644
--- a/code/smplmodel/body_param.py
+++ b/code/smplmodel/body_param.py
@@ -2,15 +2,16 @@
   @ Date: 2020-11-20 13:34:54
   @ Author: Qing Shuai
   @ LastEditors: Qing Shuai
-  @ LastEditTime: 2021-01-14 20:09:40
-  @ FilePath: /EasyMocap/code/smplmodel/body_param.py
+  @ LastEditTime: 2021-01-24 18:39:45
+  @ FilePath: /EasyMocapRelease/code/smplmodel/body_param.py
 '''
 import numpy as np
 
 def merge_params(param_list, share_shape=True):
     output = {}
-    for key in ['poses', 'shapes', 'Rh', 'Th']:
-        output[key] = np.vstack([v[key] for v in param_list])
+    for key in ['poses', 'shapes', 'Rh', 'Th', 'expression']:
+        if key in param_list[0].keys():
+            output[key] = np.vstack([v[key] for v in param_list])
     if share_shape:
         output['shapes'] = output['shapes'].mean(axis=0, keepdims=True)
     return output
@@ -19,24 +20,83 @@ def select_nf(params_all, nf):
     output = {}
     for key in ['poses', 'Rh', 'Th']:
         output[key] = params_all[key][nf:nf+1, :]
+    if 'expression' in params_all.keys():
+        output['expression'] = params_all['expression'][nf:nf+1, :]
     if params_all['shapes'].shape[0] == 1:
         output['shapes'] = params_all['shapes']
     else:
         output['shapes'] = params_all['shapes'][nf:nf+1, :]
     return output
 
-def init_params(nFrames=1):
+NUM_POSES = {'smpl': 72, 'smplh': 78, 'smplx': 66 + 12 + 9}
+NUM_EXPR = 10
+
+def init_params(nFrames=1, model_type='smpl'):
     params = {
-        'poses': np.zeros((nFrames, 72)),
+        'poses': np.zeros((nFrames, NUM_POSES[model_type])),
         'shapes': np.zeros((1, 10)),
         'Rh': np.zeros((nFrames, 3)),
         'Th': np.zeros((nFrames, 3)),
     }
+    if model_type == 'smplx':
+        params['expression'] = np.zeros((nFrames, NUM_EXPR))
     return params
 
+def check_params(body_params, model_type):
+    nFrames = body_params['poses'].shape[0]
+    if body_params['poses'].shape[1] != NUM_POSES[model_type]:
+        body_params['poses'] = np.hstack((body_params['poses'], np.zeros((nFrames, NUM_POSES[model_type] - body_params['poses'].shape[1]))))
+    if model_type == 'smplx' and 'expression' not in body_params.keys():
+        body_params['expression'] = np.zeros((nFrames, NUM_EXPR))
+    return body_params
+
 class Config:
     OPT_R = False
     OPT_T = False
     OPT_POSE = False
     OPT_SHAPE = False
-    VERBOSE = False
\ No newline at end of file
+    OPT_HAND = False
+    OPT_EXPR = False
+    VERBOSE = False
+    MODEL = 'smpl'
+
+def load_model(gender='neutral', use_cuda=True, model_type='smpl'):
+    # prepare SMPL model
+    import torch
+    if use_cuda:
+        device = torch.device('cuda')
+    else:
+        device = torch.device('cpu')
+    from .body_model import SMPLlayer
+    if model_type == 'smpl':
+        body_model = SMPLlayer('data/smplx/smpl', gender=gender, device=device,
+            regressor_path='data/smplx/J_regressor_body25.npy')
+    elif model_type == 'smplh':
+        body_model = SMPLlayer('data/smplx/smplh/SMPLH_MALE.pkl', model_type='smplh', gender=gender, device=device,
+            regressor_path='data/smplx/J_regressor_body25_smplh.txt')
+    elif model_type == 'smplx':
+        body_model = SMPLlayer('data/smplx/smplx/SMPLX_{}.pkl'.format(gender.upper()), model_type='smplx', gender=gender, device=device,
+            regressor_path='data/smplx/J_regressor_body25_smplx.txt')
+    else:
+        body_model = None
+    body_model.to(device)
+    return body_model
+
+def check_keypoints(keypoints2d, WEIGHT_DEBUFF=1.2):
+    # keypoints2d: nFrames, nJoints, 3
+    #
+    # wrong feet
+    # if keypoints2d.shape[-2] > 25 + 42:
+    #     keypoints2d[..., 0, 2] = 0
+    #     keypoints2d[..., [15, 16, 17, 18], -1] = 0
+    #     keypoints2d[..., [19, 20, 21, 22, 23, 24], -1] /= 2
+    if keypoints2d.shape[-2] > 25:
+        # set the hand keypoints
+        keypoints2d[..., 25, :] = keypoints2d[..., 7, :]
+        keypoints2d[..., 46, :] = keypoints2d[..., 4, :]
+        keypoints2d[..., 25:, -1] *= WEIGHT_DEBUFF
+    # reduce the confidence of hand and face
+    MIN_CONF = 0.3
+    conf = keypoints2d[..., -1]
+    conf[conf < MIN_CONF] = 0
+    return keypoints2d
 
         valid_mask = (rend_depth > 0)[:, :, None]
         rend_rgba = np.dstack((rend_rgba, (valid_mask*255).astype(np.uint8)))
-        rend_cat = cv2.addWeighted(cv2.bitwise_and(img, 255 - rend_rgba[:, :, 3:4].repeat(3, 2)), 1, rend_rgba[:, :, :3], 1, 0)
+        if add_back:
+            rend_cat = cv2.addWeighted(
+                cv2.bitwise_and(img, 255 - rend_rgba[:, :, 3:4].repeat(3, 2)), 1,
+                cv2.bitwise_and(rend_rgba[:, :, :3], rend_rgba[:, :, 3:4].repeat(3, 2)), 1, 0)
+        else:
+            rend_cat = rend_rgba
 
         output_colors.append(rend_rgba)
         output_depths.append(rend_depth)
diff --git a/doc/evaluation.md b/doc/evaluation.md
new file mode 100644
index 0000000..8a59ea7
--- /dev/null
+++ b/doc/evaluation.md
@@ -0,0 +1,11 @@
+
+# Evaluation
+
+## Evaluation of fitting SMPL
+### Human3.6M
\ No newline at end of file
diff --git a/doc/feng/000400.jpg b/doc/feng/000400.jpg
new file mode 100644
index 0000000..10fa72d
Binary files /dev/null and b/doc/feng/000400.jpg differ
diff --git a/doc/feng/skel.gif b/doc/feng/skel.gif
new file mode 100644
index 0000000..39ff027
Binary files /dev/null and b/doc/feng/skel.gif differ
diff --git a/doc/feng/smplx.gif b/doc/feng/smplx.gif
new file mode 100644
index 0000000..4adac7a
Binary files /dev/null and b/doc/feng/smplx.gif differ
diff --git a/doc/log.md b/doc/log.md
new file mode 100644
index 0000000..fb0b330
--- /dev/null
+++ b/doc/log.md
@@ -0,0 +1,13 @@
+
+## 2021.01.24
+1. Support the SMPL+H and SMPL-X models.
+2. Upgrade `body_model.py`.
+3. Update the optimization functions.
+4. Add limb-length checking.
+5. Update the example figures.
\ No newline at end of file
diff --git a/doc/tutorial_new_task.md b/doc/tutorial_new_task.md
new file mode 100644
index 0000000..31d3967
--- /dev/null
+++ b/doc/tutorial_new_task.md
@@ -0,0 +1,18 @@
+
+# Add new tasks
+
+## 0. Prepare the data and dataset
+
+## 1. Add new loss functions
+
+## 2. Add new optimization
+
+## 3. Write your own main function
+
+## 4. Evaluation for the new tasks
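
The `add_back` switch in the renderer hunk above controls whether the rendered RGBA layer is composited back onto the input image through its alpha channel. A standalone sketch of that compositing (the helper name is ours):

```python
import cv2
import numpy as np

def overlay(img, rend_rgba):
    # img: (H, W, 3) uint8; rend_rgba: (H, W, 4) uint8, alpha from the depth mask
    alpha = rend_rgba[:, :, 3:4].repeat(3, axis=2)
    bg = cv2.bitwise_and(img, 255 - alpha)             # keep background pixels
    fg = cv2.bitwise_and(rend_rgba[:, :, :3], alpha)   # keep rendered pixels
    return cv2.addWeighted(bg, 1, fg, 1, 0)
```
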
diff --git a/scripts/preprocess/extract_video.py b/scripts/preprocess/extract_video.py
index 9870095..0e340a1 100644
--- a/scripts/preprocess/extract_video.py
+++ b/scripts/preprocess/extract_video.py
@@ -2,15 +2,17 @@
   @ Date: 2021-01-13 20:38:33
   @ Author: Qing Shuai
   @ LastEditors: Qing Shuai
-  @ LastEditTime: 2021-01-14 16:59:06
-  @ FilePath: /EasyMocapRelease/scripts/preprocess/extract_video.py
+  @ LastEditTime: 2021-01-22 20:45:37
+  @ FilePath: /EasyMocap/scripts/preprocess/extract_video.py
 '''
-import os
+import os, sys
 import cv2
 from os.path import join
 from tqdm import tqdm
 from glob import glob
 import numpy as np
+code_path = join(os.path.dirname(__file__), '..', '..', 'code')
+sys.path.append(code_path)
 
 mkdir = lambda x: os.makedirs(x, exist_ok=True)
 
@@ -18,12 +20,12 @@ def extract_video(videoname, path, start=0, end=10000, step=1):
     base = os.path.basename(videoname).replace('.mp4', '')
     if not os.path.exists(videoname):
         return base
-    video = cv2.VideoCapture(videoname)
     outpath = join(path, 'images', base)
     if os.path.exists(outpath) and len(os.listdir(outpath)) > 0:
         return base
     else:
         os.makedirs(outpath)
+    video = cv2.VideoCapture(videoname)
     totalFrames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
     for cnt in tqdm(range(totalFrames)):
         ret, frame = video.read()
@@ -36,6 +38,7 @@ def extract_video(videoname, path, start=0, end=10000, step=1):
 
 def extract_2d(openpose, image, keypoints, render):
     if not os.path.exists(keypoints):
+        os.makedirs(keypoints, exist_ok=True)
         cmd = './build/examples/openpose/openpose.bin --image_dir {} --write_json {} --display 0'.format(image, keypoints)
         if args.handface:
             cmd = cmd + ' --hand --face'
@@ -87,7 +90,7 @@ def bbox_from_openpose(keypoints, rescale=1.2, detection_thresh=0.01):
         center[1] - bbox_size[1]/2,
         center[0] + bbox_size[0]/2,
         center[1] + bbox_size[1]/2,
-        keypoints[valid, :2].mean()
+        keypoints[valid, 2].mean()
     ]
     return bbox
@@ -129,10 +132,62 @@ def convert_from_openpose(src, dst):
         annot['annots'] = annots
         save_json(annotname, annot)
 
+def detect_frame(detector, img, pid=0):
+    lDetections = detector.detect([img])[0]
+    annots = []
+    for i in range(len(lDetections)):
+        annot = {
+            'bbox': [float(d) for d in lDetections[i]['bbox']],
+            'personID': pid + i,
+            'keypoints': lDetections[i]['keypoints'].tolist(),
+            'isKeyframe': True
+        }
+        annots.append(annot)
+    return annots
+
+def extract_yolo_hrnet(image_root, annot_root):
+    imgnames = sorted(glob(join(image_root, '*.jpg')))
+    import torch
+    device = torch.device('cuda')
+    from estimator.detector import Detector
+    config = {
+        'yolov4': {
+            'ckpt_path': 'data/models/yolov4.weights',
+            'conf_thres': 0.3,
+            'box_nms_thres': 0.5 # e.g. a threshold of 0.9 means boxes with IoU below 0.9 are not suppressed
+        },
+        'hrnet':{
+            'nof_joints': 17,
+            'c': 48,
+            'checkpoint_path': 'data/models/pose_hrnet_w48_384x288.pth'
+        },
+        'detect':{
+            'MIN_PERSON_JOINTS': 10,
+            'MIN_BBOX_AREA': 5000,
+            'MIN_JOINTS_CONF': 0.3,
+            'MIN_BBOX_LEN': 150
+        }
+    }
+    detector = Detector('yolo', 'hrnet', device, config)
+    for nf, imgname in enumerate(tqdm(imgnames)):
+        annotname = join(annot_root, os.path.basename(imgname).replace('.jpg', '.json'))
+        annot = create_annot_file(annotname, imgname)
+        img0 = cv2.imread(imgname)
+        annot['annots'] = detect_frame(detector, img0, 0)
+        for i in range(len(annot['annots'])):
+            x = annot['annots'][i]
+            x['area'] = max(x['bbox'][2] - x['bbox'][0], x['bbox'][3] - x['bbox'][1])**2
+        annot['annots'].sort(key=lambda x:-x['area'])
+        # re-assign person IDs in order of bounding-box size
+        for i in range(len(annot['annots'])):
+            annot['annots'][i]['personID'] = i
+        save_json(annotname, annot)
+
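
One fix above is easy to miss: the fifth element of the OpenPose bounding box is now the mean keypoint *confidence* (`keypoints[valid, 2]`), where it previously averaged the x/y coordinates by mistake. A minimal standalone version of the corrected helper:

```python
import numpy as np

def bbox_from_keypoints(kpts, rescale=1.2, thresh=0.01):
    # kpts: (nJoints, 3); returns [x_min, y_min, x_max, y_max, mean_conf]
    valid = kpts[:, 2] > thresh                 # assumes at least one joint passes
    center = kpts[valid, :2].mean(axis=0)
    size = (kpts[valid, :2].max(axis=0) - kpts[valid, :2].min(axis=0)) * rescale
    return [*(center - size/2), *(center + size/2), kpts[valid, 2].mean()]
```
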
 if __name__ == "__main__":
     import argparse
     parser = argparse.ArgumentParser()
     parser.add_argument('path', type=str, default=None)
+    parser.add_argument('--mode', type=str, default='openpose', choices=['openpose', 'yolo-hrnet'])
     parser.add_argument('--handface', action='store_true')
     parser.add_argument('--openpose', type=str, default='/media/qing/Project/openpose')
@@ -140,24 +195,31 @@ if __name__ == "__main__":
     parser.add_argument('--no2d', action='store_true')
     parser.add_argument('--debug', action='store_true')
     args = parser.parse_args()
+    mode = args.mode
+
     if os.path.isdir(args.path):
         videos = sorted(glob(join(args.path, 'videos', '*.mp4')))
         subs = []
         for video in videos:
             basename = extract_video(video, args.path)
             subs.append(basename)
+        print('cameras: ', ' '.join(subs))
         if not args.no2d:
-            os.makedirs(join(args.path, 'openpose'), exist_ok=True)
             for sub in subs:
+                image_root = join(args.path, 'images', sub)
                 annot_root = join(args.path, 'annots', sub)
                 if os.path.exists(annot_root):
+                    print('skip ', annot_root)
                     continue
-                extract_2d(args.openpose, join(args.path, 'images', sub),
-                    join(args.path, 'openpose', sub),
-                    join(args.path, 'openpose_render', sub))
-                convert_from_openpose(
-                    src=join(args.path, 'openpose', sub),
-                    dst=annot_root
-                )
+                if mode == 'openpose':
+                    extract_2d(args.openpose, image_root,
+                        join(args.path, 'openpose', sub),
+                        join(args.path, 'openpose_render', sub))
+                    convert_from_openpose(
+                        src=join(args.path, 'openpose', sub),
+                        dst=annot_root
+                    )
+                elif mode == 'yolo-hrnet':
+                    extract_yolo_hrnet(image_root, annot_root)
     else:
         print(args.path, 'does not exist')
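
Both preprocessing backends end up writing one JSON per frame with the same `annots` list, so downstream code can stay backend-agnostic. A minimal reader for those files (field layout inferred from `convert_from_openpose` and `detect_frame` above):

```python
import json
from glob import glob
from os.path import join

def load_annots(annot_root):
    # yields (filename, annots) for every frame annotation written above
    for name in sorted(glob(join(annot_root, '*.json'))):
        with open(name) as f:
            data = json.load(f)
        yield name, data['annots']   # list of {'bbox', 'personID', 'keypoints', ...}
```
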