
A fundamental component of motion modeling with deep learning is the pose parameterization. A suitable parameterization is one that holistically encodes both the rotational and the positional components. The dual quaternion formulation proposed in this work encodes both components, enabling a rich encoding that implicitly preserves the nuances and subtle variations in the motion of different characters.


Abstract

Data-driven skeletal animation relies on the existence of a suitable learning scheme that can capture the rich context of motion. However, commonly used motion representations often fail to accurately encode the full articulation of motion, or they introduce artifacts. In this work, we address the fundamental problem of finding a robust pose representation for motion, suitable for deep skeletal animation, that can better constrain poses and faithfully capture nuances correlated with skeletal characteristics. Our representation is based on dual quaternions, mathematical abstractions with well-defined operations that simultaneously encode rotational and positional information, enabling a rich encoding centered around the root. We demonstrate that our representation overcomes common motion artifacts, and we assess its performance against other popular representations. We conduct an ablation study to evaluate the impact of various losses that can be incorporated during learning. Leveraging the fact that our representation implicitly encodes skeletal motion attributes, we train a network on a dataset comprising skeletons with different proportions, without first retargeting them to a universal skeleton, a step that causes subtle motion elements to be missed. Qualitative results demonstrate the usefulness of the parameterization in skeleton-specific synthesis.

Method

A dual quaternion \( \mathbf{\overline {q}} \) can be represented as a pair of ordinary quaternions \( \mathbf{q}_r \) and \( \mathbf{q}_d \), in the form \( \mathbf{q}_r + \mathbf{q}_d\varepsilon \), where \( \varepsilon \) is the dual unit, satisfying the relation \( \varepsilon^2= 0 \). The first quaternion, \( \mathbf{q}_r \), describes the rotation. The second quaternion, \( \mathbf{q}_d \), encodes translational information.

We establish the following notation:
\( \mathbf{q} \) quaternion
\( \mathbf{\overline{q}} \) dual quaternion
\( \mathbf{\hat q} \) unit quaternion
\( \mathbf{\overline{\hat q}} \) unit dual quaternion
\( \mathbf{q^*}\) quaternion conjugate
\( \mathbf{\overline{q}^*} \) dual quaternion conjugate
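
As an illustration of how the two parts fit together, the following minimal sketch builds a dual quaternion from a rotation quaternion and a translation vector. It assumes the common convention \( \mathbf{q}_d = \tfrac{1}{2}\,\mathbf{t}\,\mathbf{q}_r \), with \( \mathbf{t} \) written as a pure quaternion, which is consistent with the translation extraction \( 2 \mathbf{q}_d \mathbf{\hat{q}}_r^* \) used in this work. The helper names `qmul` and `to_dual_quaternion` and the `(w, x, y, z)` component order are choices made for this example, not part of the released code.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def to_dual_quaternion(q_r, t):
    """Return (q_r, q_d), assuming q_d = 0.5 * t * q_r with t a pure quaternion."""
    t_quat = (0.0, t[0], t[1], t[2])
    q_d = tuple(0.5 * c for c in qmul(t_quat, q_r))
    return q_r, q_d

# Example: rotate 90 degrees about the z axis, then translate by (1, 2, 3).
half_angle = math.pi / 4
q_r = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
q_r, q_d = to_dual_quaternion(q_r, (1.0, 2.0, 3.0))

# A unit dual quaternion has a unit real part and a dual part orthogonal to it.
real_norm_sq = sum(c * c for c in q_r)
real_dual_dot = sum(a * b for a, b in zip(q_r, q_d))
```

With this construction, `real_norm_sq` is 1 and `real_dual_dot` is 0 up to floating-point error, which are the two conditions a unit dual quaternion must satisfy.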

Dual quaternions allow for convenient mappings to and from other representations currently used in the literature (quaternions, rotation matrices, ortho6D [Zhou et al., 2018]), which makes them easy to integrate into existing architectures. They also have well-defined mathematical operations, such as addition and multiplication.
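
To make the multiplication concrete, here is a small sketch of the dual quaternion product, \( (\mathbf{a}_r + \mathbf{a}_d\varepsilon)(\mathbf{b}_r + \mathbf{b}_d\varepsilon) = \mathbf{a}_r\mathbf{b}_r + (\mathbf{a}_r\mathbf{b}_d + \mathbf{a}_d\mathbf{b}_r)\varepsilon \), which follows from \( \varepsilon^2 = 0 \). The helpers `qmul`, `dq_from` and `dq_mul` are hypothetical names for this example, and the construction \( \mathbf{q}_d = \tfrac{1}{2}\,\mathbf{t}\,\mathbf{q}_r \) is an assumed convention.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def dq_from(q_r, t):
    """Dual quaternion (q_r, q_d) for rotation q_r followed by translation t."""
    q_d = tuple(0.5 * c for c in qmul((0.0, *t), q_r))
    return q_r, q_d

def dq_mul(a, b):
    """Dual quaternion product; the epsilon^2 term vanishes."""
    a_r, a_d = a
    b_r, b_d = b
    real = qmul(a_r, b_r)
    dual = tuple(x + y for x, y in zip(qmul(a_r, b_d), qmul(a_d, b_r)))
    return real, dual

# Parent: 90 degrees about z, translated by (1, 0, 0).
half_angle = math.pi / 4
q_parent = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
parent = dq_from(q_parent, (1.0, 0.0, 0.0))

# Child: no rotation, translated by (1, 0, 0) in the parent's frame.
child = dq_from((1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0))

composed = dq_mul(parent, child)

# The same transform built directly: the child's offset rotated by the parent
# becomes (0, 1, 0), so the total translation is (1, 1, 0).
direct = dq_from(q_parent, (1.0, 1.0, 0.0))
```

Here `composed` and `direct` agree component by component up to floating-point error, showing that multiplying dual quaternions chains rigid transformations in the same way as multiplying homogeneous matrices.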

Because of the mathematical properties of dual quaternions, we can extract both the rotational and the positional components. We adopt a current coordinate system, in which the extracted positions correspond to the joint positions with respect to the root joint. We model the root displacement as a separate component. We can express the current transformation of the root joint using local homogeneous coordinates of the form: \[ \begin{equation}M_{curr,root} = \begin{bmatrix} r_{11} &r_{12}&r_{13}&0\\ r_{21} &r_{22}&r_{23}&0\\ r_{31} &r_{32}&r_{33}&0\\ 0&0&0&1 \end{bmatrix}. \end{equation} \]
Then, following the tree hierarchy and using the local homogeneous coordinates of each joint \( j \), \[ \begin{equation}M_{loc,j} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & \text{offset}_x\\ r_{21} & r_{22} & r_{23} & \text{offset}_y \\ r_{31} & r_{32} & r_{33} & \text{offset}_z\\ 0 & 0 & 0 & 1 \end{bmatrix} , \end{equation} \] where the \( \text{offset} \) is taken from the hierarchy section of the BVH file, we compute the current homogeneous representation of each joint as \[ \begin{equation} M_{curr,j} = M_{curr,(j-1)} \times M_{loc,j}, \end{equation} \] where \( j-1 \) denotes the parent of joint \( j \) in the hierarchy. Obtaining the local rotations with respect to each joint's parent in the tree hierarchy, as well as the offsets, is straightforward when animation files are used. Conversely, we can recover the local transformation of joint \( j \) using the inverse procedure: \[ \begin{equation} M_{loc,j} = M^{-1}_{curr,(j-1)}M_{curr,j}. \label{eq: curr2loc} \end{equation} \] Our formulation takes the hierarchical rig into account and overcomes problems such as error accumulation (see the figure below).
Error Accumulation
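
The conversion between local and current transformations can be sketched with plain 4x4 matrices. The three-joint chain, its rotations and its offsets below are made up for illustration, and helper names such as `homogeneous` and `rigid_inverse` are not taken from the released code. The `parents` list plays the role of the tree hierarchy, so the parent index replaces \( j-1 \).

```python
import math

def rot_z(degrees):
    """3x3 rotation about the z axis."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def homogeneous(R, offset=(0.0, 0.0, 0.0)):
    """4x4 matrix [R | offset; 0 0 0 1]."""
    return [R[i] + [offset[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(M):
    """Inverse of a rigid transform: [R^T | -R^T t]."""
    Rt = [[M[j][i] for j in range(3)] for i in range(3)]
    t = [M[i][3] for i in range(3)]
    inv_t = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return homogeneous(Rt, inv_t)

# Hypothetical chain: parents[j] is the parent index of joint j (-1 for the root).
parents = [-1, 0, 1]
local = [
    homogeneous(rot_z(90)),                    # root: rotation only, zero translation
    homogeneous(rot_z(0), (1.0, 0.0, 0.0)),    # joint 1: offset from the root
    homogeneous(rot_z(-90), (1.0, 0.0, 0.0)),  # joint 2: offset from joint 1
]

# Local to current: M_curr,j = M_curr,parent x M_loc,j
current = []
for j, p in enumerate(parents):
    current.append(local[j] if p < 0 else matmul(current[p], local[j]))

# Current to local: M_loc,j = M_curr,parent^-1 x M_curr,j
recovered = [
    current[j] if p < 0 else matmul(rigid_inverse(current[p]), current[j])
    for j, p in enumerate(parents)
]

# Joint positions relative to the root are the translation columns.
joint_positions = [[M[i][3] for i in range(3)] for M in current]
```

In this toy chain the last joint ends up at approximately (0, 2, 0) relative to the root, and `recovered` matches `local` up to floating-point error.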
We design dedicated losses that leverage the representation, namely a positional loss and an offset loss, and we assess the contribution of each loss through an ablation study. For the positional loss, we extract the translation from the current dual quaternions, while for the offset loss we extract the translation from the local dual quaternions. We extract the translation component using \[\begin{aligned} 2 \mathbf{q}_d \mathbf{\hat{q}}_r^* \end{aligned}\] where \( \mathbf{\hat{q}}_r^* \) denotes the conjugate of the unit quaternion \( \mathbf{\hat{q}}_r \).
We also normalize the predicted dual quaternions, as only unit dual quaternions represent valid rigid transformations. We do so using \[ \begin{align} \hat{\mathbf{\overline{q}}} &= \frac{\overline{\mathbf{q}}}{||\overline{\mathbf{q}}||} = \frac{\mathbf{q}_r}{||\mathbf{q}_r||} + \varepsilon \bigg[ \frac{\mathbf{q}_d}{||\mathbf{q}_r||} - \frac{\mathbf{q}_r}{||\mathbf{q}_r||} \frac{\langle\mathbf{q}_r , \mathbf{q}_d\rangle}{||\mathbf{q}_r||^2} \bigg] \nonumber \\ &= \mathbf{\hat{q}}_r + \varepsilon \bigg[ \frac{\mathbf{q}_d}{||\mathbf{q}_r||} - \mathbf{\hat{q}}_r \frac{\langle\mathbf{q}_r , \mathbf{q}_d\rangle}{||\mathbf{q}_r||^2} \bigg] \end{align} \]
We experiment with two recurrent neural networks, acRNN and QuaterNet, and compare against the following representations: quaternions, quaternions with joint positions, quaternions with a forward kinematics (FK) loss, ortho6D, and ortho6D with positions.
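
The normalization and translation extraction described above can be sketched as follows. This is a minimal illustration rather than the training code: `dq_normalize` and `dq_translation` are hypothetical names, quaternions are stored as `(w, x, y, z)` tuples, and the "raw" input is a hand-made stand-in for an unnormalized network prediction.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def qconj(q):
    """Quaternion conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def dq_normalize(q_r, q_d):
    """Project (q_r, q_d) onto the unit dual quaternions."""
    norm = math.sqrt(sum(c * c for c in q_r))
    dot = sum(a * b for a, b in zip(q_r, q_d))
    real = tuple(c / norm for c in q_r)
    # Rescale the dual part and remove its component along the real part.
    dual = tuple(d / norm - r * dot / norm ** 2 for r, d in zip(real, q_d))
    return real, dual

def dq_translation(q_r, q_d):
    """Translation 2 * q_d * conj(q_r) of a unit dual quaternion, as (x, y, z)."""
    return qmul(tuple(2.0 * c for c in q_d), qconj(q_r))[1:]

# Ground truth: 90 degrees about z, translation (1, 2, 3), with q_d = 0.5 * t * q_r.
half_angle = math.pi / 4
q_r = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
q_d = tuple(0.5 * c for c in qmul((0.0, 1.0, 2.0, 3.0), q_r))

# Stand-in for a raw prediction: scaled by 2, dual part pushed along q_r.
raw_r = tuple(2.0 * c for c in q_r)
raw_d = tuple(2.0 * d + 0.3 * r for d, r in zip(q_d, q_r))

unit_r, unit_d = dq_normalize(raw_r, raw_d)
translation = dq_translation(unit_r, unit_d)
```

For this input the normalization removes both the scale and the component of the dual part along \( \mathbf{q}_r \), so `translation` comes out as approximately (1, 2, 3).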

Overview Video & Results

Citation

@misc{Andreou:2021:DQ,
    author = {Andreou, Nefeli and Aristidou, Andreas and Chrysanthou, Yiorgos},
    title = {Pose Representations for Deep Skeletal Animation},
    eprint={2111.13907},
    year  = {2021},
    archivePrefix={arXiv}
}