Motion Estimation

Xiaoyi Jiang1, Mohammad Dawood2 and Klaus P. Schäfers2



(1)
Department of Mathematics and Computer Science, University of Münster, Münster, Germany

(2)
European Institute for Molecular Imaging, University of Münster, Münster, Germany

 



In the introduction of this book we have seen that PET image acquisition is susceptible to motion artifacts. The first step to overcome this problem is the separation of the measured data into different motion states via gating as described in Sect. 1.3. The next step is the estimation of motion between these gated reconstructions. This step is crucial as its accuracy highly influences the quality of the final motion-corrected image.

In this chapter we discuss different algorithmic paradigms for motion estimation in terms of image registration and optical flow. For both paradigms, the focus lies on the preservation of radioactivity (mass-preservation) to incorporate a priori information. For image registration, different data terms are introduced which are suitable for monomodal PET registration in combination with mass-preservation. In particular, a data term adapted to noisy PET data is proposed with the SAD measure. The preservation of mass requires diffeomorphic transformations to be a valid assumption. Thus, different regularization schemes will be introduced and discussed. In addition to the non-parametric transformation model we also present motion estimation using the popular parametric transformation model based on B-splines for image registration. For optical flow, classical local and global approaches are introduced and advanced methods are briefly discussed. In particular, mass-preserving optical flow in combination with non-quadratic penalization is addressed. The last part of this chapter compares the two algorithmic paradigms and intends to provide some guidance for finding an appropriate method for motion estimation and its configuration in practice.


2.1 Image Registration


Image registration is a versatile approach to motion estimation and provides a full range of application-specific options. We give an introduction to image registration including suitable options for the data term in Sect. 2.1.1 and the regularization functional in Sect. 2.1.2. The alternative to non-parametric image registration in terms of B-spline transformations is discussed in Sect. 2.1.3. One of the main messages of this book is the mass-preserving transformation model introduced in Sect. 2.1.4 which is tailored for gated PET. We refer to the comprehensive reviews of image registration for further reading [18, 49, 63, 66, 88, 90, 94, 95, 139].

The aim of image registration is to spatially align two corresponding images. In order to formulate this definition of image registration mathematically, we need to find a way to measure the similarity of two images. The objective is then to find a meaningful spatial transformation that, applied to one of the images, maximizes the similarity.

For motion estimation, a so-called template image 
$$\mathcal{T}:\varOmega \rightarrow \mathbb{R}$$
is registered onto a reference image 
$$\mathcal{R}:\varOmega \rightarrow \mathbb{R}$$
, where 
$$\varOmega \subset \mathbb{R}^{3}$$
is the image domain. This yields a spatial transformation 
$$\boldsymbol{y}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
representing point-to-point correspondences between 
$$\mathcal{T}$$
and 
$$\mathcal{R}$$
. To find 
$$\boldsymbol{y}$$
, the following functional has to be minimized



$$\displaystyle{ \mathrm{arg}\min _{\boldsymbol{y}}\;\mathcal{J} (\boldsymbol{y}):= \mathcal{D}(\mathcal{M}(\mathcal{T},\boldsymbol{y}),\;\mathcal{R}) +\alpha \; \mathcal{S}(\boldsymbol{y})\;. }$$

(2.1)
Here 
$$\mathcal{D}$$
denotes the distance functional measuring the dissimilarity between the transformed template image and the fixed reference image. 
$$\mathcal{M}$$
is the transformation model specifying how the transformation 
$$\boldsymbol{y}$$
should be applied to the template image 
$$\mathcal{T}$$
. 
$$\mathcal{S}$$
is the regularization functional which penalizes non-smooth transformations and thus enforces meaningful solutions.

The standard transformation model is simply given by an interpolation of 
$$\mathcal{T}$$
at the transformed grid 
$$\boldsymbol{y}$$



$$\displaystyle{ \mathcal{M}^{\mathrm{std}}(\mathcal{T},\boldsymbol{y}):= \mathcal{T} \circ \boldsymbol{ y} = \mathcal{T} (\boldsymbol{y})\;. }$$

(2.2)
We will discuss an alternative transformation model for mass-preserving image registration in Definition 2.11. A discussion of different options for the data term 
$$\mathcal{D}$$
is given in the next Sect. 2.1.1 and for the regularization term 
$$\mathcal{S}$$
in Sect. 2.1.2.


2.1.1 Data Terms


The data term measures the dissimilarity, i.e., the opposite of the similarity, of two input images. In Eq. (2.1) the transformed template image is compared to the reference image. For simplicity, and without loss of generality, we will use the original template image (without transformation) in the following definitions.

(Dis)similarity measures for monomodal registration tasks are best suited for motion correction of gated PET data. We thus restrict the discussion in the rest of this section to these monomodal measures. For a more detailed discussion of similarity measures for multimodal studies, such as mutual information (which measures the amount of shared information of two images) or normalized gradient fields (which measures deviations in the gradients of the input images), we refer to [95].

A common dissimilarity measure for monomodal image registration is 
$$\mathrm{SSD}$$
, which is introduced in the following definition. Differences between the reference and the template image are penalized quadratically.


Definition 2.1 (
$$\mathcal{D}^{\mathrm{SSD}}$$
– Sum of squared differences).

The sum of squared differences (SSD) of two images 
$$\mathcal{T}:\varOmega \rightarrow \mathbb{R}$$
and 
$$\mathcal{R}:\varOmega \rightarrow \mathbb{R}$$
on a domain 
$$\varOmega \subset \mathbb{R}^{3}$$
is defined as



$$\displaystyle{ \mathcal{D}^{\mathrm{SSD}}(\mathcal{T},\mathcal{R}):= \frac{1} {2}\;\int _{\varOmega }(\mathcal{T} (\boldsymbol{x}) -\mathcal{R}(\boldsymbol{x}))^{2}\;d\boldsymbol{x}\;. }$$

(2.3)

SSD measures the point-wise distances of image intensities. For images with locally similar intensities the measure is low. SSD has to be minimized and attains its optimal value at 0.
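On a discrete voxel grid the integral in Eq. (2.3) reduces to a weighted sum over all voxels. A minimal NumPy sketch (the function and parameter names are ours, for illustration only) could look like:

```python
import numpy as np

def ssd(T, R, voxel_volume=1.0):
    """Discrete version of Eq. (2.3): 0.5 * sum of squared differences.

    T, R : 3D intensity arrays on the same grid.
    voxel_volume : quadrature weight replacing the volume element dx.
    """
    diff = T.astype(float) - R.astype(float)
    return 0.5 * voxel_volume * np.sum(diff ** 2)

# Identical images attain the optimal value 0.
T = np.random.rand(8, 8, 8)
print(ssd(T, T))  # -> 0.0
```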

Large differences in the input images are often induced by noise which consequently leads to a high energy in the data term. L1-like distance measures are often used in optical flow techniques [19] to overcome this problem. The L1 distance of two images 
$$\mathcal{T}$$
and 
$$\mathcal{R}$$
is defined as



$$\displaystyle{ \mathrm{L}_{1}(\mathcal{T},\mathcal{R}):=\int _{\varOmega }\vert \mathcal{T} (\boldsymbol{x}) -\mathcal{R}(\boldsymbol{x})\vert \;d\boldsymbol{x}\;. }$$

(2.4)

Using this functional as a data term in Eq. (2.1) raises problems as we require differentiability of all components during optimization. As the absolute value function is not differentiable at zero, the idea is thus to create a differentiable version of the L1 distance by applying a differentiable outer function ψ, with a behavior similar to the absolute value function, to the difference in Eq. (2.4).


Definition 2.2 (
$$\mathcal{D}^{\mathrm{SAD}}$$
– Sum of absolute differences).

The (approximated) sum of absolute differences (SAD) of two images 
$$\mathcal{T}:\varOmega \rightarrow \mathbb{R}$$
and 
$$\mathcal{R}:\varOmega \rightarrow \mathbb{R}$$
on a domain 
$$\varOmega \subset \mathbb{R}^{3}$$
is defined as



$$\displaystyle{ \mathcal{D}^{\mathrm{SAD}}(\mathcal{T},\mathcal{R}):=\int _{\varOmega }\psi (\mathcal{T} (\boldsymbol{x}) -\mathcal{R}(\boldsymbol{x}))\;d\boldsymbol{x}\;, }$$

(2.5)

where 
$$\psi: \mathbb{R} \rightarrow \mathbb{R}$$
is the continuously differentiable Charbonnier penalizing function [28] with a small positive constant 
$$\beta \in \mathbb{R}^{>0}$$


$$\displaystyle{ \psi (x):= \sqrt{x^{2 } +\beta ^{2}}\;. }$$

(2.6)


Remark 2.1.

Different choices for the function ψ in Definition 2.2 are possible, cf. [19].


Remark 2.2.

The function ψ in Eq. (2.6) is always greater than zero. Consequently, 
$$\mathcal{D}^{\mathrm{SAD}}(\mathcal{T},\mathcal{T} )\neq 0$$
. This issue could be fixed by subtracting β



$$\displaystyle{ \hat{\psi }(x):= \sqrt{x^{2 } +\beta ^{2}} -\beta \;. }$$

(2.7)

For minimization of the registration functional we need to compute the derivatives of the functional and hence the first



$$\displaystyle{ \partial \psi (x) = \frac{x} {\sqrt{x^{2 } +\beta ^{2}}} }$$

(2.8)
and second derivative of ψ



$$\displaystyle{ \partial ^{2}\psi (x) = \frac{\beta ^{2}} {(x^{2} +\beta ^{2})^{3/2}} }$$

(2.9)
are given here for completeness.

The behavior of the penalizing function ψ for different values of β is shown in Fig. 2.1. It can be observed that ψ becomes more quadratic around zero for higher β values. This might prevent abrupt jumps in the first derivative between − 1 and 1 during optimization, thus stabilizing the optimization process.
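The Charbonnier function and its derivatives from Eqs. (2.6), (2.8), and (2.9) are simple to evaluate; the following sketch (our own illustrative code, with β passed as a keyword argument) reproduces the behavior discussed above:

```python
import numpy as np

def psi(x, beta=0.1):
    # Charbonnier penalty, Eq. (2.6): a smooth approximation of |x|.
    return np.sqrt(x ** 2 + beta ** 2)

def dpsi(x, beta=0.1):
    # First derivative, Eq. (2.8): smooth at zero, bounded in (-1, 1).
    return x / np.sqrt(x ** 2 + beta ** 2)

def d2psi(x, beta=0.1):
    # Second derivative, Eq. (2.9): strictly positive, so psi is convex.
    return beta ** 2 / (x ** 2 + beta ** 2) ** 1.5

# psi(0) = beta > 0 (cf. Remark 2.2), and psi(x) ~ |x| for |x| >> beta.
print(round(psi(0.0), 3))  # -> 0.1
```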




Fig. 2.1
The Charbonnier penalizing function (a) from Eq. (2.6) and its derivative (b) from Eq. (2.8) are plotted for β = 0.1 and β = 10. They both give an approximation to the absolute value function


2.1.2 Regularization


According to Hadamard [61] a problem is called well-posed if

1.

A solution exists,

 

2.

The solution is unique, and

 

3.

Small changes in the data lead only to small changes in the result.

 

In image registration, small changes in the data can lead to large changes in the results. Further, as uniqueness is generally not given, image registration is ill-posed [48]. This makes regularization inevitable and essential for finding feasible transformations.

Regularization restricts the space of possible transformations to a smaller set of reasonable functions. Regularization should be chosen depending on the application and can either be implicit or explicit. If the transformation is regularized by the properties of the space itself, as in parametric image registration, we speak of implicit regularization. For example, for a rigid 3D transformation only a total of three rotation and three translation parameters need to be estimated instead of a 3D motion vector for each voxel. In explicit regularization, a penalty is added to the registration functional to avoid non-smooth transformations. The penalty is often based on a model that is physically sound and fits the processed data.

As the regularization for parametric transformations is implicitly given we will only analyze penalties for non-linear image registration in this section. In the rest of this section we will introduce the following regularizers:



  • Diffusion regularization (Sect. 2.1.2.1),


  • Elastic regularization (Sect. 2.1.2.2), and


  • Hyperelastic regularization (Sect. 2.1.2.3).

These regularizers are specifically chosen as diffusion regularization is typically used for optical flow applications. Further, Vampire relies on hyperelastic regularization, which is a non-linear generalization of linear elastic regularization. More information about regularization (e.g., curvature or log-elasticity) and additional constraints, which are beyond the scope of this book, can be found in [3, 48, 94, 99, 107].

Hereinafter, we assume a given transformation 
$$\boldsymbol{y}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
which is composed of the spatial position 
$$\boldsymbol{x} \in \mathbb{R}^{3}$$
and a deformation (or displacement) 
$$\boldsymbol{u}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
at position 
$$\boldsymbol{x}$$
, i.e., 
$$\boldsymbol{y}(\boldsymbol{x}) = \boldsymbol{x} +\boldsymbol{ u}(\boldsymbol{x})$$
.


2.1.2.1 Diffusion Regularization


One of the simplest ways to control the behavior of the transformation is diffusion regularization [47]. The basic idea is to disallow large variations in the gradient of the motion vector between neighboring voxels, i.e., oscillations in the deformation 
$$\boldsymbol{u}$$
are penalized.

The motivation is thus simply the demand for a smooth transformation, not a physical one. An advantage of diffusion regularization over other methods is its low computational complexity, which allows fast computations and the treatment of high-dimensional data [94]. This becomes important when striving for real-time computations, which is why diffusion regularization is popular for optical flow techniques, cf. Sect. 2.2.


Definition 2.3 (
$$\mathcal{S}^{\mathrm{diff}}$$
– Diffusion regularization).

For a transformation 
$$\boldsymbol{y}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
with 
$$\boldsymbol{y}(\boldsymbol{x}) = \boldsymbol{x} +\boldsymbol{ u}(\boldsymbol{x})$$
and 
$$\boldsymbol{u}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
, the diffusion regularization energy is defined as



$$\displaystyle{ \mathcal{S}^{\mathrm{diff}}(\boldsymbol{u}):=\int _{\varOmega }\|\nabla \boldsymbol{u}(\boldsymbol{x})\|_{ 2}^{2}\;d\boldsymbol{x} =\int _{\varOmega }\sum _{ i=1}^{3}\vert \nabla \boldsymbol{u}^{i}(\boldsymbol{x})\vert ^{2}\;d\boldsymbol{x}\;, }$$

(2.10)
where the (squared) Frobenius norm is defined as 
$$\|A\|_{2}^{2}:=\mathrm{ tr}(A^{T}A)$$
for matrices 
$$A \in \mathbb{R}^{3\times 3}$$
. 
$$\boldsymbol{u}^{i}$$
denotes the i-th component of 
$$\boldsymbol{u}$$
with i ∈ { 1, 2, 3}.
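With forward differences, the diffusion energy of Eq. (2.10) can be approximated on a voxel grid. The following sketch is our own; it assumes the displacement field is stored as an array of shape (3, nx, ny, nz) with an isotropic voxel spacing h:

```python
import numpy as np

def diffusion_energy(u, h=1.0):
    # Discrete version of Eq. (2.10): sum of squared forward differences
    # of all three displacement components u^1, u^2, u^3.
    energy = 0.0
    for comp in u:                 # loop over u^1, u^2, u^3
        for axis in range(3):      # spatial derivative along each axis
            d = np.diff(comp, axis=axis) / h
            energy += np.sum(d ** 2)
    return energy * h ** 3         # quadrature weight for the volume element

# A constant (translation-like) displacement has zero diffusion energy.
u = np.ones((3, 4, 4, 4))
print(diffusion_energy(u))  # -> 0.0
```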


2.1.2.2 Elastic Regularization


As indicated in the introduction of this section, regularization methods can be motivated physically. This is the case for linear elastic regularization, which measures the forces of a transformation applied to an elastic material [17] (Lagrange frame). In the language of elasticity we would say that the stress due to strain is measured. Or, vice versa, in a more physical motivation, the deformation (strain) is a reaction to force (stress) [94] (Euler frame).


Definition 2.4 (
$$\mathcal{S}^{\mathrm{elastic}}$$
– Elastic regularization).

For a transformation 
$$\boldsymbol{y}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
with 
$$\boldsymbol{y}(\boldsymbol{x}) = \boldsymbol{x} +\boldsymbol{ u}(\boldsymbol{x})$$
and 
$$\boldsymbol{u}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
, the elastic regularization energy is defined as



$$\displaystyle\begin{array}{rcl} \mathcal{S}^{\mathrm{elastic}}(\boldsymbol{u})&:=& \int _{\varOmega }\mu \|\nabla \boldsymbol{u}(\boldsymbol{x})\|_{ 2}^{2} + (\lambda +\mu )(\nabla \,\mbox{$ \cdot$ }\,\boldsymbol{u}(\boldsymbol{x}))^{2}\;d\boldsymbol{x} \\ & =& \int _{\varOmega }\mu \left (\sum _{i=1}^{3}\vert \nabla \boldsymbol{u}^{i}(\boldsymbol{x})\vert ^{2}\right ) + (\lambda +\mu )\left (\sum _{ i=1}^{3}\boldsymbol{u}_{ i}^{i}(\boldsymbol{x})\right )^{2}\;d\boldsymbol{x}\;,{}\end{array}$$

(2.11)
where 
$$\mu \in \mathbb{R}^{>0}$$
and 
$$\lambda \in \mathbb{R}^{>0}$$
are the Lamé constants according to Ciarlet [30]. Again, 
$$\|\,\mbox{$ \cdot$ }\,\|_{2}$$
is the Frobenius norm (see Eq. (2.10)) and 
$$\boldsymbol{u}^{i}$$
denotes the i-th component of 
$$\boldsymbol{u}$$
with i ∈ { 1, 2, 3}.

To better understand the physical interpretation of the elastic energy we will take a look at some useful properties of elasticity in terms of the Poisson ratio and Young’s modulus.


Definition 2.5 (ν – Poisson ratio).

The Poisson (contraction) ratio is defined as



$$\displaystyle{ \nu = \frac{\lambda } {2(\lambda +\mu )}\;, }$$

(2.12)
for the Lamé constants 
$$\mu \in \mathbb{R}^{>0}$$
and 
$$\lambda \in \mathbb{R}^{>0}$$
in Definition 2.4.

The Poisson ratio ν measures the amount of compression for the linear elastic model. If a material is compressed in one direction, the Poisson ratio is the quotient of this compression and the expansion in the other directions. For compressible materials this quotient is small and for incompressible materials it is large. To model incompressible materials in the registration setting, the Poisson ratio thus needs to be chosen large. Note that this interpretation only holds for small displacements since we are dealing with a linear model.


Definition 2.6 (E – Young’s modulus).

Young’s modulus is defined as



$$\displaystyle{ E = \frac{\mu (3\lambda + 2\mu )} {\lambda +\mu } \;, }$$

(2.13)
for the Lamé constants 
$$\mu \in \mathbb{R}^{>0}$$
and 
$$\lambda \in \mathbb{R}^{>0}$$
in Definition 2.4.

Young’s modulus E is the quotient of stress (tension) and strain (stretch) in the same direction. The connection between the Lamé constants, Poisson ratio, and Young’s modulus is given by



$$\displaystyle\begin{array}{rcl} \lambda & =& \frac{E\nu } {(1+\nu )(1 - 2\nu )}\;,{}\end{array}$$

(2.14)




$$\displaystyle\begin{array}{rcl} \mu & =& \frac{E} {2(1+\nu )}\;,{}\end{array}$$

(2.15)




$$\displaystyle\begin{array}{rcl} \lambda > 0\mbox{ and }\mu > 0\;& \Longleftrightarrow& \;0 <\nu < \frac{1} {2}\mbox{ and }E > 0\;.{}\end{array}$$

(2.16)

The linear dependency of Young's modulus E and the Lamé constants can be seen in Eqs. (2.14) and (2.15). Together with Eq. (2.11) it becomes obvious that E simply scales the regularization functional. For practical reasons it can thus be set to E = 1 [73].


2.1.2.3 Hyperelastic Regularization


In many medical applications of image registration the user has a priori knowledge about the processed data concerning the allowed deformations. For example the invertibility of the transformation is a general requirement. More specific knowledge, like, e.g., the degree of compressibility of tissue can also be controlled. We have seen in the linear elastic case that the Poisson ratio (Definition 2.6) gives a measure for compressibility of small deformations. In the non-linear case, the determinant of the Jacobian of the transformation measures the volume change and can thus be used to control the compression behavior – even for large deformations. This is done with polyconvex hyperelastic regularization [22, 30, 40]. The regularization functional 
$$\mathcal{S}^{\mathrm{hyper}}$$
controls changes in length, area of the surface, and volume of 
$$\boldsymbol{y}$$
and guarantees thereby in particular the invertibility of the estimated transformation.


Definition 2.7 (
$$\mathcal{S}^{\mathrm{hyper}}$$
– Hyperelastic regularization).

Let 
$$\alpha _{l},\alpha _{a},\alpha _{v} \in \mathbb{R}^{>0}$$
be constants and p, q ≥ 2. Further, let 
$$\varGamma _{a}$$
and 
$$\varGamma _{v}$$
be positive and strictly convex functions, with Γ v satisfying 
$$\lim _{z\rightarrow 0^{+}}\varGamma _{v}(z) =\lim _{z\rightarrow \infty }\varGamma _{v}(z) = \infty $$
. The hyperelastic regularization energy caused by the transformation 
$$\boldsymbol{y}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
with 
$$\boldsymbol{y}(\boldsymbol{x}) = \boldsymbol{x} +\boldsymbol{ u}(\boldsymbol{x})$$
and 
$$\boldsymbol{u}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
is defined as



$$\displaystyle{ \mathcal{S}^{\mathrm{hyper}}(\boldsymbol{y}) =\alpha _{ l}\,\mbox{$ \cdot$ }\,\mathcal{S}^{\mathrm{length}}(\boldsymbol{y}) +\alpha _{ a}\,\mbox{$ \cdot$ }\,\mathcal{S}^{\mathrm{area}}(\boldsymbol{y}) +\alpha _{ v}\,\mbox{$ \cdot$ }\,\mathcal{S}^{\mathrm{vol}}(\boldsymbol{y})\;. }$$

(2.17)
The three summands individually control changes in length, area of the surface, and volume



$$\displaystyle\begin{array}{rcl} \mathcal{S}^{\mathrm{length}}(\boldsymbol{y})&:=& \int _{\varOmega }\|\nabla (\boldsymbol{y}(\boldsymbol{x}) -\boldsymbol{x})\|_{ 2}^{p}\;d\boldsymbol{x} =\int _{\varOmega }\|\nabla \boldsymbol{u}(\boldsymbol{x})\|_{ 2}^{p}\;d\boldsymbol{x}{}\end{array}$$

(2.18)




$$\displaystyle\begin{array}{rcl} \mathcal{S}^{\mathrm{area}}(\boldsymbol{y})&:=& \int _{\varOmega }\varGamma _{ a}(\|\mathrm{Cof}(\nabla \boldsymbol{y}(\boldsymbol{x}))\|_{2}^{q})\;d\boldsymbol{x}{}\end{array}$$

(2.19)




$$\displaystyle\begin{array}{rcl} \mathcal{S}^{\mathrm{vol}}(\boldsymbol{y})&:=& \int _{\varOmega }\varGamma _{ v}(\det (\nabla \boldsymbol{y}(\boldsymbol{x})))\;d\boldsymbol{x}\;,{}\end{array}$$

(2.20)
where 
$$\|\,\mbox{$ \cdot$ }\,\|_{2}$$
is the Frobenius norm (see Eq. (2.10)) and 
$$\mathrm{Cof}(\,\mbox{$ \cdot$ }\,)$$
denotes the cofactor matrix.


Remark 2.3.

A typical choice of p and q in Definition 2.7 is 
$$p = q = 2$$
[51].

In the formulation in Eq. (2.1), the positive real number α balances between minimizing the data driven energy term (maximizing the image similarity) and retaining smooth and realistic transformations, which is controlled by the regularization energy. As each term of the hyperelastic regularizer has an individual weighting factor, α can be set to 1 and only α l , α a , and α v need to be determined.

For α v  > 0, the conditions for Γ v claimed above ensure 
$$\det (\nabla \boldsymbol{y}) > 0$$
, i.e., 
$$\boldsymbol{y}$$
is a diffeomorphism [40]. This allows us to omit the absolute value bars later in Eq. (2.45) of the mass-preserving transformation model. Hyperelastic regularization has the ability to model tissue characteristics like compressibility, improves the robustness against noise, and thus enforces realistic cardiac and respiratory motion estimates.


2.1.2.4 Relations


To deepen the understanding of the regularization energies described in this section, we highlight some similarities and differences. This helps us to understand the relations between the different regularization variants. As we will see in Sect. 2.1.4, controlling volume changes plays an important role in connection with the mass-preserving motion estimation approach Vampire. Accordingly, this aspect is highlighted in particular.


Diffusion ↔ Elastic

As we have seen, the elastic energy consists of the two terms 
$$\|\nabla \boldsymbol{u}\|_{2}^{2}$$
and 
$$(\nabla \,\mbox{$ \cdot$ }\,\boldsymbol{u})^{2}$$
. The interpretation of the latter is revealed as a measure of compressibility, i.e., of volume changes. The first term is the known diffusion regularization term, which is hence integrated into the elastic energy.


Diffusion ↔ Hyperelastic

Diffusion regularization is included into hyperelastic regularization. The length term 
$$\mathcal{S}^{\mathrm{length}}$$
in Eq. 2.17 equals the diffusion regularization term. The hyperelastic regularization energy has two additional energy terms – 
$$\mathcal{S}^{\mathrm{area}}$$
measuring changes in area and 
$$\mathcal{S}^{\mathrm{vol}}$$
measuring changes in volume.


Elastic ↔ Hyperelastic

The Jacobian determinant 
$$\det (\nabla \boldsymbol{y})$$
represents the exact volume change due to a transformation 
$$\boldsymbol{y}$$
. Consequently, a value of one characterizes incompressibility. Let us recall that the Poisson ratio ν in the linear elastic model (Definition 2.5) also controls the amount of incompressibility. Large values of ν (incompressible) result from large values of λ. For large values of λ compared to μ, the elastic energy is dominated by 
$$(\nabla \,\mbox{$ \cdot$ }\,\boldsymbol{u})^{2}$$
. The Jacobian determinant can be approximated by 
$$\nabla \,\mbox{$ \cdot$ }\,\boldsymbol{u}$$
if the 
$$\vert \partial _{j}\boldsymbol{u}^{i}\vert $$
are sufficiently small:



$$\displaystyle{ \det (\nabla \boldsymbol{y}) \approx 1 + \nabla \,\mbox{$ \cdot$ }\,\boldsymbol{u}\;. }$$

(2.21)
This shows the approximate behavior of linear elastic regularization. To conclude, hyperelastic regularization holds for large deformations whereas linear elastic regularization is restricted to small deformations. Further, the area energy term in hyperelastic regularization, which is missing in elastic regularization, serves as a link between the length and volume term [22].
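The approximation in Eq. (2.21) is easy to verify at a single point: for a small displacement gradient ∇u, the exact determinant det(I + ∇u) and 1 + ∇·u (the trace of ∇u) agree up to second-order terms. An illustrative check:

```python
import numpy as np

rng = np.random.default_rng(0)
Ju = 1e-4 * rng.standard_normal((3, 3))   # small displacement gradient grad u

exact = np.linalg.det(np.eye(3) + Ju)     # det(grad y) with grad y = I + grad u
approx = 1.0 + np.trace(Ju)               # Eq. (2.21): 1 + div(u)

# The error is of order ||grad u||^2 and hence negligible here.
print(abs(exact - approx) < 1e-6)  # -> True
```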


2.1.3 B-Spline Transformation


So far only non-parametric transformations were discussed. However, in some cases the search space for the desired transformation can be restricted by using a suitable parametric transformation model. As an example, for many brain alignment tasks the transformation can be assumed to be rigid. This can increase the robustness of the registration task. The most prominent parametric transformations are scaling, rigid, affine, and spline-based transformations [95]. For non-linear transformations, as required for PET motion estimation, the parametric B-spline transformation is of particular interest.

In parametric image registration the transformation is given in terms of a parametric function 
$$\boldsymbol{y}_{p}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
. 
$$\boldsymbol{y}_{p}$$
is defined by a parameter vector p. The task is to find the parameter set p minimizing the distance between 
$$\mathcal{R}$$
and the transformed input image 
$$\mathcal{M}(\mathcal{T},\boldsymbol{y}_{p})$$



$$\displaystyle{ \mathcal{J} (p):= \mathcal{D}(\mathcal{M}(\mathcal{T},\boldsymbol{y}_{p}),\;\mathcal{R})\;\mathop{ =}\limits^{!}\min \;, }$$

(2.22)
where 
$$\mathcal{D}$$
is a data term according to Sect. 2.1.1 and 
$$\mathcal{M}$$
is a transformation model according to Eq. (2.2) or (2.46).


Definition 2.8 (Spline transformation).

Given a number 
$$n = (n^{1},n^{2},n^{3}) \in \mathbb{N}^{3}$$
of spline coefficients and a spline basis function 
$$b: \mathbb{R} \rightarrow \mathbb{R}$$
, the spline transformation or free-form transformation 
$$\boldsymbol{y}_{p}(\boldsymbol{x}) = (\boldsymbol{y}_{p}^{1}(\boldsymbol{x}),\boldsymbol{y}_{p}^{2}(\boldsymbol{x}),\boldsymbol{y}_{p}^{3}(\boldsymbol{x}))$$
for a point 
$$\boldsymbol{x} = (x,y,z)^{T} \in \varOmega \subset \mathbb{R}^{3}$$
is defined as



$$\displaystyle\begin{array}{rcl} \boldsymbol{y}_{p}^{1}(\boldsymbol{x})& =& x +\sum _{ i=1}^{n^{1} }\sum _{j=1}^{n^{2} }\sum _{k=1}^{n^{3} }p_{i,j,k}^{1}b_{ i}(x)b_{j}(y)b_{k}(z)\;,{}\end{array}$$

(2.23)




$$\displaystyle\begin{array}{rcl} \boldsymbol{y}_{p}^{2}(\boldsymbol{x})& =& y +\sum _{ i=1}^{n^{1} }\sum _{j=1}^{n^{2} }\sum _{k=1}^{n^{3} }p_{i,j,k}^{2}b_{ i}(x)b_{j}(y)b_{k}(z)\;,{}\end{array}$$

(2.24)




$$\displaystyle\begin{array}{rcl} \boldsymbol{y}_{p}^{3}(\boldsymbol{x})& =& z +\sum _{ i=1}^{n^{1} }\sum _{j=1}^{n^{2} }\sum _{k=1}^{n^{3} }p_{i,j,k}^{3}b_{ i}(x)b_{j}(y)b_{k}(z)\;,{}\end{array}$$

(2.25)
where



$$\displaystyle\begin{array}{rcl} p& =& \left \{(p_{i,j,k}^{1},p_{ i,j,k}^{2},p_{ i,j,k}^{3})^{T} \in \mathbb{R}^{3}\;\vert \right. \\ & & \left.i \in \{ 1,\ldots,n^{1}\},j \in \{ 1,\ldots,n^{2}\},k \in \{ 1,\ldots,n^{3}\}\right \}{}\end{array}$$

(2.26)
are the spline coefficients and 
$$b_{i}(x) = b(x - i)$$
for i ∈ { 1, …, n 1}, 
$$b_{j}(y) = b(y - j)$$
for j ∈ { 1, …, n 2}, 
$$b_{k}(z) = b(z - k)$$
for k ∈ { 1, …, n 3}. Note that this notation is only valid for 
$$\varOmega =\{ (x,y,z)^{T} \in \mathbb{R}^{3}\;\vert \;0 \leq x < n^{1} + 1,0 \leq y < n^{2} + 1,0 \leq z < n^{3} + 1\}$$
.

A possible choice for the basis function 
$$b: \mathbb{R} \rightarrow \mathbb{R}$$
, often called mother spline (cf. Fig. 2.2), is



$$\displaystyle{ b(x) = \left \{\begin{array}{@{}l@{\quad }l@{}} (x + 2)^{3}\;, \quad &\quad - 2 \leq x < -1\;, \\ -x^{3} - 2(x + 1)^{3} + 6(x + 1)\;,\quad &\quad - 1 \leq x < 0\;, \\ x^{3} + 2(x - 1)^{3} - 6(x - 1)\;, \quad &\quad 0 \leq x < 1\;, \\ (-x + 2)^{3}\;, \quad &\quad 1 \leq x < 2\;, \\ 0\;, \quad &\quad else\;. \end{array} \right. }$$

(2.27)




Fig. 2.2
The mother spline b defined in Eq. (2.27) is shown in (a). b is continuously differentiable and its derivative is plotted in (b)
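A direct transcription of the mother spline from Eq. (2.27) (our own illustrative code):

```python
def b(x):
    # Mother spline of Eq. (2.27): a continuously differentiable piecewise
    # cubic with support [-2, 2).
    if -2 <= x < -1:
        return (x + 2) ** 3
    if -1 <= x < 0:
        return -x ** 3 - 2 * (x + 1) ** 3 + 6 * (x + 1)
    if 0 <= x < 1:
        return x ** 3 + 2 * (x - 1) ** 3 - 6 * (x - 1)
    if 1 <= x < 2:
        return (-x + 2) ** 3
    return 0.0

# The pieces join continuously, e.g. both middle branches give 4 at x = 0.
print(b(0.0))  # -> 4.0
```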

Parametric transformations are implicitly regularized by the reduced number of parameters compared to non-parametric transformations, as discussed in Sect. 2.1.2. Hence, regularization of, e.g., rigid or affine transformations is usually not necessary. However, regularization becomes important for spline transformations due to their non-linearity [48]. Possible regularization variants for spline transformations are discussed in the following. For parametric transformations the functional in Eq. (2.22) becomes



$$\displaystyle{ \mathcal{J} (p):= \mathcal{D}(\mathcal{M}(\mathcal{T},\boldsymbol{y}_{p}),\;\mathcal{R}) +\alpha \, \mbox{$ \cdot$ }\,\mathcal{S}_{M}(p)\;\mathop{ =}\limits^{!}\min \;, }$$

(2.28)
for a regularization term 
$$\mathcal{S}_{M}$$
and a scalar weighting factor 
$$\alpha \in \mathbb{R}^{>0}$$
balancing between the data and the regularization term.

2.1.3.1 Hyperelastic Regularization


Hyperelastic regularization, introduced for non-parametric transformations in Definition 2.7, can also be applied to the B-spline transformation model. The transformation 
$$\boldsymbol{y}_{p}(\boldsymbol{x}) = \boldsymbol{x} +\boldsymbol{ u}_{p}(\boldsymbol{x})$$
is written in the discrete setting of [95] as



$$\displaystyle{ \boldsymbol{y} = \boldsymbol{x} + Q\,\mbox{$ \cdot$ }\,p\;. }$$

(2.29)
The deformation 
$$\boldsymbol{u}$$
is represented by a projection of the spline coefficients, given by the parameter vector p, into the image space by the projection matrix Q



$$\displaystyle{ \boldsymbol{u} = Q\,\mbox{$ \cdot$ }\,p\;. }$$

(2.30)
For the hyperelastic regularization the transformation 
$$\boldsymbol{y}$$
is first determined according to Eq. (2.29) and is then regularized as described in Eq. (2.17). The regularization functional in Eq. (2.28) is then defined as



$$\displaystyle{ \mathcal{S}_{M}(p):= \mathcal{S}^{\mathrm{hyper}}(\boldsymbol{x} + Q\,\mbox{$ \cdot$ }\,p)\;. }$$

(2.31)


2.1.3.2 Coefficient-Based Regularization


The regularization term 
$$\mathcal{S}_{M}(p)$$
in Eq. (2.28) is based on the coefficient vector p in the case of coefficient-based regularization and is a weighted norm of the form



$$\displaystyle{ \mathcal{S}_{M}(p):=\| p\|_{M}^{2}:= p^{T}Mp\;. }$$

(2.32)
The matrix M determines the regularization type. Instead of regularizing the deformation 
$$\boldsymbol{u}_{p}(\boldsymbol{x})$$
for each spatial position 
$$\boldsymbol{x} \in \mathbb{R}^{3}$$
as in the non-parametric case (cf. Sect. 2.1.2) we now regularize directly on p. We speak of coefficient-based regularization in image space if the regularization on p considers the projection matrix Q and thus regularizes Q ⋅ p. If only p is taken into account we speak of coefficient-based regularization in coefficient space.

With Tikhonov regularization we discuss one option for coefficient-based regularization in coefficient space. Tikhonov regularization penalizes gradients in the spline coefficients and is the coefficient-based analog of the diffusion regularization introduced in Definition 2.3.


Definition 2.9 (Tikhonov regularization (coefficient space)).

Tikhonov regularization in coefficient space is defined by the choice of



$$\displaystyle{ M^{\mathrm{tc}} = D^{T}D }$$

(2.33)
in Eq. (2.32). The superscript tc denotes (t)ikhonov in (c)oefficient space. D is defined by



$$\displaystyle{ D = I_{3}\otimes \left [\begin{array}{*{10}c} D^{1} \\ D^{2} \\ D^{3} \end{array} \right ]\;. }$$

(2.34)
Here $$I_{3}$$ is the 3 × 3 identity matrix and ⊗ denotes the Kronecker product [16]. The 1D derivative operators are defined by



$$\displaystyle\begin{array}{rcl} D^{1}& =& I_{ n^{3}} \otimes I_{n^{2}} \otimes \mathrm{ d}_{1}^{1}\;,{}\end{array}$$

(2.35)




$$\displaystyle\begin{array}{rcl} D^{2}& =& I_{ n^{3}} \otimes \mathrm{ d}_{1}^{2} \otimes I_{ n^{1}}\;,{}\end{array}$$

(2.36)




$$\displaystyle\begin{array}{rcl} D^{3}& =& \mathrm{d}_{ 1}^{3} \otimes I_{ n^{2}} \otimes I_{n^{1}}\;.{}\end{array}$$

(2.37)
The 
$$I_{n^{i}}$$
denote identity matrices of size $$n^{i} \times n^{i}$$, i ∈ { 1, 2, 3}, with the $$n^{i}$$ defined according to Definition 2.8. The discrete 1D first derivative operators for each dimension are defined by



$$\displaystyle{ \mathrm{d}_{1}^{i} = \left [\begin{array}{*{10}c} -1&1& &0\\ & \ddots & \ddots & \\ 0 & &-1&1 \end{array} \right ] \in \mathbb{R}^{n^{i}-1,n^{i} },\;i \in \{ 1,2,3\}\;. }$$

(2.38)
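The Kronecker construction of D in Eqs. (2.34)–(2.38) translates directly into sparse-matrix code. The following sketch, with arbitrarily small coefficient counts chosen for illustration, builds $$M^{\mathrm{tc}} = D^{T}D$$ and checks that constant coefficient fields, which have zero gradient, are not penalized:

```python
import numpy as np
from scipy.sparse import diags, eye, kron, vstack, block_diag

def d1(n):
    """Discrete 1D forward-difference operator of Eq. (2.38), size (n-1, n)."""
    return diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1], shape=(n - 1, n))

n1 = n2 = n3 = 4                             # hypothetical coefficient counts

D1 = kron(eye(n3), kron(eye(n2), d1(n1)))    # Eq. (2.35)
D2 = kron(eye(n3), kron(d1(n2), eye(n1)))    # Eq. (2.36)
D3 = kron(d1(n3), kron(eye(n2), eye(n1)))    # Eq. (2.37)

Dstack = vstack([D1, D2, D3])
D = block_diag([Dstack] * 3)                 # I_3 kron [D1; D2; D3], Eq. (2.34)
Mtc = D.T @ D                                # Eq. (2.33)

p = np.ones(3 * n1 * n2 * n3)                # constant coefficient field
energy = p @ (Mtc @ p)                       # S_M(p) = p^T M p, Eq. (2.32)
print(energy)                                # 0.0: constants are not penalized
```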

The analog of the above definition in image space is given in the following Definition 2.10. As the actual displacements, rather than the corresponding coefficients, are processed, this version is even more closely related to the diffusion regularization energy in Definition 2.3.


Definition 2.10 (Tikhonov regularization (image space)).

Let the number of spline coefficients be 
$$n = (n^{1},n^{2},n^{3}) \in \mathbb{N}^{3}$$
and 
$$b_{i}(x) = b(x - i)$$
for $$i \in \{ 1,\ldots,n^{d}\}$$, d ∈ { 1, 2, 3}, be the translates of the spline basis function b defined in Eq. (2.27). The Tikhonov regularization matrix in image space is defined as



$$\displaystyle{ M^{\mathrm{ti}} = I_{ 3} \otimes T^{3} \otimes T^{2} \otimes T^{1}\;. }$$

(2.39)
The matrices 
$$T^{d} \in \mathbb{R}^{n^{d},n^{d} }$$
are defined by



$$\displaystyle{ T_{i,j}^{d} =\int _{\varOmega ^{ d}}\partial b_{i}(x)\,\mbox{$ \cdot$ }\,\partial b_{j}(x)dx\;,\;d \in \{ 1,2,3\}\;, }$$

(2.40)
where $$i,j \in \{ 1,\ldots,n^{d}\}$$, and
$$\varOmega ^{d} \subset \mathbb{R}$$
is the 1D subspace of Ω in dimension d ∈ { 1, 2, 3}.

The concrete entries of the matrix $$T^{d}$$ (including boundary treatment) are





$$\displaystyle{ T^{d}=\left [\begin{array}{*{10}c} 15.1 &-5.0&-7.2&-0.3& & & & & 0\\ -5.0 & 23.9 &-4.5 &-7.2 &-0.3 & & & & \\ -7.2&-4.5& 24.0 &-4.5&-7.2&-0.3& & &\\ -0.3 &-7.2 &-4.5 & 24.0 &-4.5 &-7.2 &-0.3 && \\ & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots &\\ & &-0.3 &-7.2 &-4.5 & 24.0 &-4.5 &-7.2&-0.3 \\ & & &-0.3&-7.2&-4.5& 24.0 &-4.5&-7.2\\ & & & &-0.3 &-7.2 &-4.5 & 23.9 &-5.0 \\ 0 & & & & &-0.3&-7.2&-5.0& 15.1 \end{array} \right ] \in \mathbb{R}^{n^{d},n^{d} } }$$

(2.41)

for d ∈ { 1, 2, 3}. The entries of the matrix $$T^{d}$$ are obtained by calculating the integral in Eq. (2.40) for each pair of indices $$i,j \in \{ 1,\ldots,n^{d}\}$$, d ∈ { 1, 2, 3}.
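The interior entries of Eq. (2.41) can be verified numerically. For the unit-spaced cubic B-spline, the integrals in Eq. (2.40) evaluate to 2/3, −1/8, −1/5, and −1/120 for coefficient offsets 0 to 3; the interior values in Eq. (2.41) equal these numbers times 36, a factor we attribute to the grid scaling of the basis functions (an assumption, since the scaling is not stated explicitly; the boundary entries 15.1 and 23.9 additionally reflect the boundary treatment and are not reproduced here):

```python
import numpy as np

def cubic_bspline_deriv(x):
    """Derivative of the cubic B-spline with support [-2, 2]."""
    ax, s = np.abs(x), np.sign(x)
    out = np.zeros_like(ax, dtype=float)
    m1 = ax < 1
    m2 = (ax >= 1) & (ax < 2)
    out[m1] = s[m1] * (1.5 * ax[m1]**2 - 2 * ax[m1])
    out[m2] = s[m2] * (-(2 - ax[m2])**2 / 2)
    return out

# Riemann sum of Eq. (2.40) for coefficient offsets 0..3 (unit spacing).
x = np.linspace(-3.0, 3.0, 600001)
dx = x[1] - x[0]
vals = [np.sum(cubic_bspline_deriv(x) * cubic_bspline_deriv(x - k)) * dx
        for k in range(4)]

print(np.round(vals, 4))                 # approx. 2/3, -1/8, -1/5, -1/120
print(np.round(36 * np.array(vals), 1))  # matches 24.0, -4.5, -7.2, -0.3
```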


2.1.4 Mass-Preserving Image Registration


The standard transformation model for image registration in Eq. (2.2) does not guarantee the preservation of mass, i.e., in general



$$\displaystyle{ \int _{\varOmega }\mathcal{T} (\boldsymbol{x})\;d\boldsymbol{x}\neq \int _{\varOmega }\mathcal{T} (\boldsymbol{y}(\boldsymbol{x}))\;d\boldsymbol{x}\;. }$$

(2.42)
Further, the distance functional 
$$\mathcal{D}$$
typically entails the assumption of similar intensities at corresponding points. This assumption is not valid for the standard transformation model 
$$\mathcal{M}^{\mathrm{std}}$$
in the case of dual gated PET, as discussed in connection with Fig. 1.​11. Consequently, the standard transformation model must be modified to account for this.

The requirement for the desired transformation model 
$$\mathcal{M}^{\mathrm{MP}}$$
is the preservation of mass



$$\displaystyle{ \int _{\varOmega }\mathcal{T} (\boldsymbol{x})\;d\boldsymbol{x} =\int _{\varOmega }\mathcal{M}^{\mathrm{MP}}(\mathcal{T},\boldsymbol{y}(\boldsymbol{x}))\;d\boldsymbol{x}\;. }$$

(2.43)
From the integration by substitution theorem for multiple variables we know that the following equation holds



$$\displaystyle{ \int _{\boldsymbol{y}(\varOmega )}\mathcal{T} (\boldsymbol{x})\;d\boldsymbol{x} =\int _{\varOmega }\mathcal{T} (\boldsymbol{y}(\boldsymbol{x}))\;\vert \det (\nabla \boldsymbol{y}(\boldsymbol{x}))\vert \;d\boldsymbol{x}\;. }$$

(2.44)
With respect to PET, Eq. (2.44) already guarantees the same total amount of radioactivity before and after applying the transformation 
$$\boldsymbol{y}$$
to 
$$\mathcal{T}$$
. The Jacobian determinant 
$$\vert \det (\nabla \boldsymbol{y}(\boldsymbol{x}))\vert $$
accounts for the volume change induced by the transformation 
$$\boldsymbol{y}$$
, representing the mass-preserving component. As 
$$\boldsymbol{y}$$
should reflect cardiac and respiratory motion, transformations that are not bijective are anatomically meaningless and therefore have to be excluded. For example, the hyperelastic regularization functional in Sect. 2.1.2.3 guarantees 
$$\boldsymbol{y}$$
to be diffeomorphic and orientation preserving, which allows us to drop the absolute value bars



$$\displaystyle{ \int _{\boldsymbol{y}(\varOmega )}\mathcal{T} (\boldsymbol{x})d\boldsymbol{x} =\int _{\varOmega }\mathcal{T} (\boldsymbol{y}(\boldsymbol{x}))\;\det (\nabla \boldsymbol{y}(\boldsymbol{x}))\;d\boldsymbol{x}\;. }$$

(2.45)

Based on the above considerations, we derived Vampire (Variational Algorithm for Mass-Preserving Image REgistration) in our previous work [51]. The source code of Vampire can be downloaded at [52].


Definition 2.11 (
$$\mathcal{M}^{\mathrm{MP}}$$
– Mass-preserving transformation model).

For an image 
$$\mathcal{T}:\varOmega \rightarrow \mathbb{R}$$
on the domain 
$$\varOmega \subset \mathbb{R}^{3}$$
and a transformation 
$$\boldsymbol{y}: \mathbb{R}^{3} \rightarrow \mathbb{R}^{3}$$
the mass-preserving transformation model is defined as



$$\displaystyle{ \mathcal{M}^{\mathrm{MP}}(\mathcal{T},\boldsymbol{y}):= (\mathcal{T} \circ \boldsymbol{ y})\,\mbox{$ \cdot$ }\,\det (\nabla \boldsymbol{y}) = \mathcal{T} (\boldsymbol{y})\,\mbox{$ \cdot$ }\,\det (\nabla \boldsymbol{y})\;. }$$

(2.46)

In the mass-preserving transformation model of Vampire the template image 
$$\mathcal{T}$$
is transformed by interpolation on the deformed grid 
$$\boldsymbol{y}$$
with an additional multiplication by the volume change. The multiplication by the Jacobian is a physiologically realistic model for density-based images [128, 138]. It guarantees similar intensities at corresponding points after transformation while simultaneously preserving the total amount of radioactivity.
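A quick 1D analog illustrates why the Jacobian factor matters. The activity profile and the deformation below are arbitrary toy choices; the point is that T(y(x)) alone changes the total mass, while the mass-preserving model T(y(x)) ⋅ det(∇y(x)) of Eq. (2.46) preserves it:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)
dx = x[1] - x[0]
T = np.exp(-(x - 0.5)**2 / (2 * 0.05**2))     # toy activity profile

# A smooth, orientation-preserving map of [0, 1] onto itself.
y = x + 0.05 * np.sin(2 * np.pi * x)          # y(0) = 0, y(1) = 1
dy = 1 + 0.1 * np.pi * np.cos(2 * np.pi * x)  # analytic Jacobian, y'(x) > 0

T_std = np.interp(y, x, T)                    # standard model: T(y(x))
T_mp = T_std * dy                             # Eq. (2.46): T(y(x)) * y'(x)

print(np.sum(T) * dx, np.sum(T_std) * dx, np.sum(T_mp) * dx)
# the first and third totals agree; the second one does not
```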

A simple 2D example for the mass-preserving transformation model is shown in Fig. 2.3 for illustration. We will use the same notation as in the 3D case for 
$$\mathcal{T}$$
, 
$$\mathcal{R}$$
, Ω, 
$$\boldsymbol{y}$$
, and 
$$\boldsymbol{x}$$
. Thus, we redefine 
$$\varOmega \subset \mathbb{R}^{2}$$
, 
$$\boldsymbol{y}: \mathbb{R}^{2} \rightarrow \mathbb{R}^{2}$$
, and 
$$\boldsymbol{x} = (x,y)^{T} \in \mathbb{R}^{2}$$
for the 2D example. 
$$\mathcal{T}:\varOmega \rightarrow \mathbb{R}$$
and 
$$\mathcal{R}:\varOmega \rightarrow \mathbb{R}$$
are defined in the same way but for the new domain Ω.




Fig. 2.3
2D example for the mass-preserving transformation model. The template image 
$$\mathcal{T}$$
in (a) is registered to the reference image 
$$\mathcal{R}$$
in (b) with Vampire resulting in (c). The average endpoint error e between the estimated transformation 
$$\boldsymbol{y}$$
and the ground-truth transformation 
$$\boldsymbol{y}_{\mathrm{GT}}$$
is visualized in (d). The same plot restricted to a mask where 
$$\mathcal{T}$$
and 
$$\mathcal{R}$$
have an intensity above 1 is shown in (e). The color map for both error plots is given in-between. Line profiles are shown in (f)

The images of the example are constructed to be similar to SA views of the heart to a certain degree, cf. Fig. 1.​14b. Further, they have the same total mass (by construction) and thus exactly fulfill the conditions for mass-preservation. The template and reference image represent smooth signals based on Gaussian distributions. The template image is created by subtracting two Gaussian distributions with different standard deviations 
$$\sigma _{1},\sigma _{2} \in \mathbb{R}^{>0}$$
, σ1 > σ2,



$$\displaystyle{ \mathcal{T} (\boldsymbol{x}) = g(\boldsymbol{x},\sigma _{1}) - g(\boldsymbol{x},\sigma _{2})\;, }$$

(2.47)
with



$$\displaystyle{ g(\boldsymbol{x},\sigma ):= \frac{1} {2\pi \sigma ^{2}}\,\mbox{$ \cdot$ }\,e^{-\frac{x^{2}+y^{2}} {2\sigma ^{2}} }\;. }$$

(2.48)
The intensities are finally scaled between 0 and 255. The reference image 
$$\mathcal{R}$$
is created by applying a known transformation 
$$\boldsymbol{y}_{\mathrm{GT}}$$
to 
$$\mathcal{T}$$
according to the mass-preserving transformation model 
$$\mathcal{M}^{\mathrm{MP}}(\mathcal{T},\boldsymbol{y}_{\mathrm{GT}})$$
in Eq. (2.46). Hence, 
$$\mathcal{T}$$
and 
$$\mathcal{R}$$
have the same global mass. The ground-truth transformation is defined by