openvins_linux/docs/update.dox

/**


@page update Update Derivations
@tableofcontents

@section basics Minimum Mean Square Error (MMSE) Estimation

Consider the following static state estimation problem:
Given a prior distribution (probability density function or pdf) for a Gaussian random vector
\f$ \mathbf x\sim \mathcal N (\hat{\mathbf x}^\ominus, \mathbf P_{xx}^\ominus)\f$ with dimension of \f$n\f$ and a new \f$m\f$ dimentional measurement \f$\mathbf{z}_m = \mathbf z + \mathbf n = \mathbf h(\mathbf x) + \mathbf n \f$ corrupted by zero-mean white Gaussian noise independent of state, \f$ \mathbf n \sim \mathcal N(\mathbf 0, \mathbf R)\f$,
we want to compute the first two (central) moments of the posterior pdf \f$ p(\mathbf x|\mathbf z_m)\f$.
Generally (given a nonlinear measurement model), we approximate the posterior pdf as:
\f$ p(\mathbf x|\mathbf z_m) \simeq \mathcal N (\hat{\mathbf x}^\oplus, \mathbf P_{xx}^\oplus)\f$.
By design, this is the (approximate) solution to the MMSE estimation problem [[Kay 1993]](http://users.isr.ist.utl.pt/~pjcro/temp/Fundamentals%20Of%20Statistical%20Signal%20Processing--Estimation%20Theory-Kay.pdf) @cite Kay1993.


@section conditional-pdf Conditional Probability Distribution

To this end, we employ the Bayes Rule:

\f{align*}{
    p(\mathbf{x} | \mathbf{z}_m) = \frac{p(\mathbf{x},\mathbf{z}_m)}{p(\mathbf{z}_m)}
\f}

In general, this conditional pdf cannot be computed analytically without imposing simplifying assumptions.
For the problem at hand, we first approximate (if indeed) \f$ p(\mathbf{z}_m) \simeq \mathcal N (\hat{\mathbf z}, \mathbf P_{zz}) \f$,
and then have the following joint Gaussian pdf (noting that joint of Gaussian pdfs is Gaussian):

\f{align*}{
p(\mathbf{x},\mathbf{z}_m) = \mathcal N \left( \begin{bmatrix} \hat{\mathbf x}^\ominus \\ \mathbf z \end{bmatrix}, \begin{bmatrix}\mathbf P_{xx} & \mathbf P_{xz} \\ \mathbf P_{zx} & \mathbf P_{zz} \end{bmatrix} \right) =: \mathcal N(\hat{\mathbf y}, \mathbf P_{yy})
\f}

Substitution of these two Gaussians into the first equation yields the following conditional Gaussian pdf:

\f{align*}{
    p(\mathbf{x} | \mathbf{z}_m)
    &\simeq \frac{\mathcal N(\hat{\mathbf y}, \mathbf P_{yy})}{\mathcal N (\hat{\mathbf z}, \mathbf P_{zz}) }\\
    &= \frac{\frac{1}{\sqrt{(2\pi)^{n+m}|{\mathbf{P}_{yy}}|}}e^{-\frac{1}{2}(\mathbf{y}-\hat{\mathbf{y}})^\top\mathbf{P}_{yy}^{-1}(\mathbf{y}-\hat{\mathbf{y}})}}{ \frac{1}{\sqrt{(2\pi)^{m}|{\mathbf{P}_{zz}}|}}e^{-\frac{1}{2}(\mathbf{z}_m-\hat{\mathbf{z}})^\top\mathbf{P}_{zz}^{-1}(\mathbf{z}_m-\hat{\mathbf{z}})}}\\
    &= \frac{1}{\sqrt{(2\pi)^{n}|{\mathbf{P}_{yy}}|/|{\mathbf{P}_{zz}}|}}
    e^{    {-\frac{1}{2}\left[
    (\mathbf{y}-\hat{\mathbf{y}})^\top\mathbf{P}_{yy}^{-1}(\mathbf{y}-\hat{\mathbf{y}})
    -
    (\mathbf{z}_m-\hat{\mathbf{z}})^\top\mathbf{P}_{zz}^{-1}(\mathbf{z}_m-\hat{\mathbf{z}})
    \right]}
    }\\
    &=: \mathcal N (\hat{\mathbf x}^\oplus, \mathbf P_{xx}^\oplus)
\f}


We now derive the conditional mean and covariance can be computed as follows:
First we simplify the denominator term \f$|{\mathbf{P}_{yy}}|/|{\mathbf{P}_{zz}}|\f$ in order to find the conditional covariance.

\f{align*}{
    |{\mathbf{P}_{yy}}| = \Big|{\begin{bmatrix}
    \mathbf{P}_{xx} & \mathbf{P}_{xz} \\
    \mathbf{P}_{zx} & \mathbf{P}_{zz}
    \end{bmatrix}}\Big|
    = \Big|{\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}}\Big|\ \Big|{\mathbf{P}_{zz}}\Big|
\f}

where we assumed \f$\mathbf{P}_{zz}\f$ is invertible
and employed the determinant property of [Schur complement](https://en.wikipedia.org/wiki/Schur_complement).
Thus, we have:

\f{align*}{
    \frac{|{\mathbf{P}_{yy}}|}{|{\mathbf{P}_{zz}}|}
    = \frac{\Big|{\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}}\Big|\Big|{\mathbf{P}_{zz}}\Big|}{|{\mathbf{P}_{zz}}|}
    = \Big|{\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}}\Big|
\f}


Next, by defining the error states \f$\mathbf{r}_x=\mathbf{x}-\hat{\mathbf{x}}^\ominus\f$,\f$\mathbf{r}_z=\mathbf{z}_m-\hat{\mathbf{z}}\f$,\f$\mathbf{r}_y=\mathbf{y}-\hat{\mathbf{y}}\f$,
and using the [matrix inersion lemma](https://en.wikipedia.org/wiki/Woodbury_matrix_identity),
we rewrite the exponential term as follows:

\f{align*}{
    &(\mathbf{y}-\hat{\mathbf{y}})^\top\mathbf{P}_{yy}^{-1}(\mathbf{y}-\hat{\mathbf{y}})
    -
    (\mathbf{z}_m-\hat{\mathbf{z}})^\top\mathbf{P}_{zz}^{-1}(\mathbf{z}_m-\hat{\mathbf{z}}) \\[5px]
    &= \mathbf{r}_y^\top\mathbf{P}_{yy}^{-1}\mathbf{r}_y
    -
    \mathbf{r}_z^\top\mathbf{P}_{zz}^{-1}\mathbf{r}_z \\[5px]
    &=
    \begin{bmatrix}
    \mathbf{r}_x \\
    \mathbf{r}_z
    \end{bmatrix}^\top
    \begin{bmatrix}
    \mathbf{P}_{xx} & \mathbf{P}_{xz} \\
    \mathbf{P}_{zx} & \mathbf{P}_{zz}
    \end{bmatrix}^{-1}
    \begin{bmatrix}
    \mathbf{r}_x \\
    \mathbf{r}_z
    \end{bmatrix}
    -
    \mathbf{r}_z^\top\mathbf{P}_{zz}^{-1}\mathbf{r}_z \\[5px]
    &=
    \begin{bmatrix}
    \mathbf{r}_x \\
    \mathbf{r}_z
    \end{bmatrix}^\top
    \begin{bmatrix}
    \mathbf{Q} & -\mathbf{Q}\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1} \\
    -\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}\mathbf{Q} & \mathbf{P}_{zz}^{-1}+\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}\mathbf{Q}\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}
    \end{bmatrix}
    \begin{bmatrix}
    \mathbf{r}_x \\
    \mathbf{r}_z
    \end{bmatrix}
    -
    \mathbf{r}_z^\top\mathbf{P}_{zz}^{-1}\mathbf{r}_z \\[5px]
    &\hspace{8cm} \mathrm{where} ~ \mathbf{Q}= (\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx})^{-1} \nonumber\\[5px]
    &=
    \mathbf{r}_x^\top\mathbf{Q}\mathbf{r}_x
    -\mathbf{r}_x^\top\mathbf{Q}\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z
    -\mathbf{r}_z^\top\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}\mathbf{Q}\mathbf{r}_x \nonumber\\
    &\hspace{4.6cm}+\mathbf{r}_z^\top(
    \textcolor{red}{\mathbf{P}_{zz}^{-1}}+\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}\mathbf{Q}\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}
    )\mathbf{r}_z
    -\textcolor{red}{\mathbf{r}_z^\top\mathbf{P}_{zz}^{-1}\mathbf{r}_z} \\[5px]
    &=
    \mathbf{r}_x^\top\mathbf{Q}\mathbf{r}_x
    -\mathbf{r}_x^\top\mathbf{Q}[\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_x]
    -[\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z]^\top\mathbf{Q}\mathbf{r}_x
    +[\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z]^\top\mathbf{Q}[\mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z] \\[5px]
    &=
    (\mathbf{r}_x - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z)^\top
    \mathbf{Q}
    (\mathbf{r}_x - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z) \\[5px]
    &=
    (\mathbf{r}_x - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z)^\top
    (\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx})^{-1}
    (\mathbf{r}_x - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z)
\f}


where \f$(\mathbf{P}_{zz}^{-1})^\top = \mathbf{P}_{zz}^{-1}\f$ since covariance matrices are symmetric.
Up to this point, we can now construct the conditional Gaussian pdf as follows:


\f{align*}{
    p(\mathbf{x}_k | \mathbf{z}_m)
    &= \frac{1}{\sqrt{(2\pi)^{n}|{\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}}}|} \times \nonumber\\
    &\hspace{0.5cm}\mathrm{exp}\left(
    {-\frac{1}{2}\left[
    (\mathbf{r}_x - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z)^\top
    (\mathbf{P}_{xx} - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx})^{-1}
    (\mathbf{r}_x - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{r}_z)
    \right]}
    \right)
\f}

which results in the following conditional mean and covariance we were seeking:

\f{align*}{
    \hat{\mathbf{x}}^\oplus &= \hat{\mathbf{x}}^\ominus + \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}(\mathbf{z}_m - \hat{\mathbf{z}}) \\
    \mathbf{P}_{xx}^\oplus &= \mathbf{P}_{xx}^\ominus - \mathbf{P}_{xz}\mathbf{P}_{zz}^{-1}\mathbf{P}_{zx}
\f}

These are the fundamental equations for (linear) state estimation.


@section linear-meas Linear Measurement Update

As a special case, we consider a simple linear measurement model to illustrate the linear MMSE estimator:

\f{align*}{
    \mathbf{z}_{m,k} &=  \mathbf{H}_k \mathbf{x}_k + \mathbf{n}_k \\
    \hat{\mathbf{z}}_k &:= \mathbb{E}[\mathbf{z}_{m,k}] = \mathbb{E}[\mathbf{H}_k\mathbf{x}_k + \mathbf{n}_k] = \mathbf{H}_k {\hat{\mathbf{x}}_k^\ominus}
\f}

With this, we can derive the covariance and cross-correlation matrices as follows:

\f{align*}{
    \mathbf{P}_{zz}
    &= \mathbb{E}\left[
    (\mathbf{z}_{m,k} - \hat{\mathbf{z}}_k)(\mathbf{z}_{m,k} - \hat{\mathbf{z}}_k)^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    (\mathbf{H}_k\mathbf{x}_k + \mathbf{n}_k - \mathbf{H}_k{\hat{\mathbf{x}}_k^\ominus})
    (\mathbf{H}_k\mathbf{x}_k + \mathbf{n}_k - \mathbf{H}_k{\hat{\mathbf{x}}_k^\ominus})^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    (\mathbf{H}_k(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus}) + \mathbf{n}_k)
    (\mathbf{H}_k(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus}) + \mathbf{n}_k)^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    \mathbf{H}_k(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})^\top\mathbf{H}_k^\top
    +
    \textcolor{red}{\mathbf{H}_k(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})
    \mathbf{n}_k^\top} \right.\nonumber\\
    &\hspace{4cm}+
    \textcolor{red}{\mathbf{n}_k
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})^\top\mathbf{H}_k^\top}
    +
    \left.
    \mathbf{n}_k
    \mathbf{n}_k^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    \mathbf{H}_k(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})^\top\mathbf{H}_k^\top
    +
    \mathbf{n}_k
    \mathbf{n}_k^\top
    \right] \\[5px]
    &=
    \mathbf{H}_k~\mathbb{E}\left[(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})^\top\right]\mathbf{H}_k^\top
    +
    ~\mathbb{E}\left[\mathbf{n}_k\mathbf{n}_k^\top\right] \\[5px]
    &=
    \mathbf{H}_k\mathbf{P}_{xx}^\ominus\mathbf{H}_k^\top + \mathbf{R}_k
\f}


where \f$\mathbf{R}_k\f$ is the *discrete* measurement noise matrix, \f$\mathbf{H}_k\f$ is the measurement Jacobian mapping the state into the measurement domain, and \f$\mathbf{P}_{xx}^\ominus\f$ is the current state covariance.

\f{align*}{
    \mathbf{P}_{xz}
    &= \mathbb{E}\left[
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})(\mathbf{z}_{m,k} - \hat{\mathbf{z}}_k)^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})
    (\mathbf{H}_k\mathbf{x}_k + \mathbf{n}_k - \mathbf{H}_k{\hat{\mathbf{x}}_k^\ominus})^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})
    (\mathbf{H}_k(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus}) + \mathbf{n}_k)^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})^\top\mathbf{H}_k^\top
    +(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})\mathbf{n}_k^\top
    \right] \\[5px]
    &= \mathbb{E}\left[
    (\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})^\top\right]\mathbf{H}_k^\top
    +
    \textcolor{red}{\mathbb{E}\left[(\mathbf{x}_k - {\hat{\mathbf{x}}_k^\ominus})\mathbf{n}_k^\top
    \right]} \\[5px]
    &= \mathbf{P}_{xx}^\ominus\mathbf{H}_k^\top
\f}

where we have employed the fact that the noise is independent of the state.
Substitution of these quantities into the fundamental equation
leads to the following update equations:


\f{align*}{
    \hat{\mathbf{x}}_k^\oplus
    &= {\hat{\mathbf{x}}_k^\ominus} + \mathbf{P}_{k}^\ominus\mathbf{H}_k^\top (\mathbf{H}_k\mathbf{P}_{k}^\ominus\mathbf{H}_k^\top + \mathbf{R}_{k})^{-1}(\mathbf{z}_{m,k} - \hat{\mathbf{z}}_k) \\[5px]
    &= {\hat{\mathbf{x}}_k^\ominus} + \mathbf{P}_{k}^\ominus\mathbf{H}_k^\top (\mathbf{H}_k\mathbf{P}_{k}^\ominus\mathbf{H}_k^\top + \mathbf{R}_{k})^{-1}(\mathbf{z}_{m,k} - \mathbf{H}_k{\hat{\mathbf{x}}_k^\ominus}) \\[5px]
    &= {\hat{\mathbf{x}}_k^\ominus} + \mathbf{K}\mathbf{r}_z \\[5px]
    \mathbf{P}_{xx}^\oplus
    &= \mathbf{P}_{k}^\ominus - \mathbf{P}_{k}^\ominus\mathbf{H}_k^\top (\mathbf{H}_k\mathbf{P}_{k}^\ominus\mathbf{H}_k^\top + \mathbf{R}_{k})^{-1} (\mathbf{P}_{k}^\ominus\mathbf{H}_k^\top)^\top \\[5px]
    &= \mathbf{P}_{k}^\ominus - \mathbf{P}_{k}^\ominus\mathbf{H}_k^\top (\mathbf{H}_k\mathbf{P}_{k}^\ominus\mathbf{H}_k^\top + \mathbf{R}_{k})^{-1} \mathbf{H}_k{\mathbf{P}_{k}^\ominus}
\f}

These are essentially the Kalman filter (or linear MMSE) update equations.


@section update-examples Update Equations and Derivations

-   @subpage  update-featinit --- 3D feature triangulation derivations for getting a feature linearization point
-   @subpage update-feat --- Measurement equations and derivation for 3D feature point
-   @subpage update-delay --- How to perform delayed initialization
-   @subpage update-null --- MSCKF nullspace projection
-   @subpage update-compress --- MSCKF measurement compression
-   @subpage update-zerovelocity --- Zero velocity stationary update


*/