The multivariate gaussian is a crucial building block in machine learning, robotics, and mathematical finance. Here is just a small sample of its applications.
- Kalman Filters
- Gaussian Mixture Models
- Linear Discriminant Analysis
- Linear Quadratic Regulators
- Black Scholes Option Pricing
However, the probability density formula of a multivariate gaussian can be pretty intimidating:

$$
p(x) = \frac{1}{\sqrt{(2\pi)^n \det \Sigma}} \exp\!\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)
$$
While this expression may work for a mathematician, it can be a little overwhelming for other audiences. In this post I'll provide an alternate rigorous, programmatic definition of the multivariate gaussian in Python. We'll then be able to use our mental Python interpreter to get intuition about the behavior of any multivariate gaussian.
I will assume familiarity with the following concepts.
- the single dimensional gaussian, aka the normal random variable
- expectation and covariance of random variables
- vector and matrix operations, such as multiplication, and transpose
- basic Python and numpy notation
We’ll also need the concept of simulation. I’ll say that a function f simulates a random variable F if there is no way to determine, given a list of outputs, whether the list was generated by repeated function calls to f or whether it came from independent realizations of F.
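As a concrete sketch of this idea (the function names `f` and `g` are just illustrative), here are two simulators in the above sense: one for a fair coin flip, and one for a normal variable with mean 3 and standard deviation 2.

```python
import numpy as np

def f():
    """Simulate a fair coin flip: a list of outputs is statistically
    indistinguishable from independent Bernoulli(1/2) realizations."""
    return int(np.random.randint(0, 2))

def g():
    """Simulate a normal variable with mean 3 and standard deviation 2
    by shifting and scaling a standard normal."""
    return 3.0 + 2.0 * float(np.random.normal())

flips = [f() for _ in range(1000)]
```

Nothing about `g`'s internals is observable from its outputs; any other function producing the same distribution would simulate the same random variable.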
Now we're ready to give a computational definition of an $n$-dimensional multivariate gaussian.
Examine this simple program.
```python
import numpy as np

n = ...  # Fill this in with a positive integer dimension, say n = 3

def z():
    """Return an n-dimensional column vector of independent
    standard random normals."""
    return np.random.normal(size=(n, 1))

def x(mu, M):
    """Return a random n-dimensional vector according to our special formula.

    mu -- an n-dimensional column vector
    M  -- an nxn matrix
    """
    return mu + np.matmul(M, z())
```
Definition. $X$ is an $n$-dimensional multivariate gaussian if and only if its distribution can be simulated by repeatedly calling the above function `x(mu, M)` with some constant inputs `mu` and `M`.
Now let's break down what `x(mu, M)` does.
- Fill an n-dimensional vector with independent standard random normals by calling `z()`
- Then multiply the result by the matrix `M`
- Then shift the result by `mu`
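To make the steps concrete, here is a minimal run of the simulator in three dimensions (the particular values of `mu` and `M` are arbitrary examples):

```python
import numpy as np

n = 3

def z():
    """n-dimensional column vector of independent standard normals."""
    return np.random.normal(size=(n, 1))

def x(mu, M):
    """Shift and scale a standard normal vector: mu + M z()."""
    return mu + np.matmul(M, z())

mu = np.array([[1.0], [2.0], [3.0]])  # example mean vector
M = np.diag([1.0, 0.5, 2.0])          # example n x n matrix
sample = x(mu, M)
print(sample.shape)  # (3, 1)
```

Every call returns a fresh column vector; the constants `mu` and `M` fully determine the distribution of the outputs.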
This computational viewpoint is especially useful if `M` has a geometrical interpretation. For example, suppose we are working in two dimensions ($n = 2$). If we graph the two columns of `M`, we can get a picture of how it warps the plane.
Successive calls to `z()` will generate a cloud of samples around the origin, with the cloud getting denser as we get closer to the origin. If we draw a circle around this cloud and multiply all points in the plane by `M`, we can see that we get a cloud in an elliptical shape.
All that remains is to shift this cloud of points by `mu`, and we have a visualization for the two dimensional multivariate gaussian simulated by `x(mu, M)`.
This ellipsoid representation, by the way, is exactly what we use in the self-driving car industry to represent the uncertainty estimates for the positions of other cars.
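We can check the circle-to-ellipse warping numerically: points on the unit circle, pushed through `M`, land exactly on the ellipse defined by $p^T (M M^T)^{-1} p = 1$. The matrix `M` below is an arbitrary invertible example.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [0.0, 1.0]])  # example 2x2 warping matrix

# Points on the unit circle, one per column.
theta = np.linspace(0, 2 * np.pi, 100)
circle = np.stack([np.cos(theta), np.sin(theta)])  # shape (2, 100)

# Warp the circle through M.
ellipse = M @ circle

# Each warped point p satisfies p^T (M M^T)^{-1} p = 1.
S_inv = np.linalg.inv(M @ M.T)
vals = np.sum(ellipse * (S_inv @ ellipse), axis=0)
print(np.allclose(vals, 1.0))  # True
```

The algebra behind the check: for $p = Mc$ with $c$ on the unit circle, $p^T (M M^T)^{-1} p = c^T c = 1$.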
Suppose we know that $X$ is simulated by `x(mu, M)` (and therefore it is, by our definition, multivariate gaussian) but we want to convert to the popular mean and covariance parameterization. It's a simple matter of computing statistics.
First let’s compute the mean.
E[x(mu, M)] = E[mu + M z()]
= mu + M E[z()] (factoring out constants)
= mu + M 0 (since E[z()] = 0)
= mu
Unsurprisingly, the mean is `mu`. We could have guessed that by looking at the illustrations above.
Now let's compute the covariance. But first, a clarification about notation. Consider that `np.matmul(z(), z().T)` calls `z` twice and therefore generates two independent samples of `z`. How do we declare that we want the same sample of `z` both times? We can use an immediately invoked function: `(lambda Z: np.matmul(Z, Z.T))(z())`. (You might have seen this elsewhere called the IIFE pattern.) Okay, we're ready to compute the covariance.
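To see why the distinction matters, compare the two expressions directly (a small sketch; `n = 3` is arbitrary):

```python
import numpy as np

n = 3

def z():
    """n-dimensional column vector of independent standard normals."""
    return np.random.normal(size=(n, 1))

# Two calls to z(): the two factors are independent samples.
independent = np.matmul(z(), z().T)

# The IIFE binds one sample to Z and reuses it for both factors.
shared = (lambda Z: np.matmul(Z, Z.T))(z())
```

The shared-sample outer product is always symmetric with non-negative diagonal entries (each diagonal entry is a square), while the independent version generally is not even symmetric.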
Cov[x(mu, M)] = E[(lambda X: (X - mu) (X - mu).T)(x(mu, M))]
= E[(lambda Z: (M Z) (M Z).T)(z())] (substituting x(mu, M) = mu + M z())
= E[(lambda Z: M Z Z.T M.T)(z())]
= M E[(lambda Z: Z Z.T)(z())] M.T (factoring out constants)
= M I M.T (E[(lambda Z: Z Z.T)(z())] = I: diagonal terms are variances and cross terms are covariances of standard independent normals)
= M M.T
So the covariance is `M M.T`, and we get this result.
Result 1. If $X$ is a gaussian simulated by `x(mu, M)`, it has mean `mu` and covariance `M M.T`.
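Result 1 is easy to spot-check with a Monte Carlo experiment (the particular `mu` and `M` are arbitrary examples, and the tolerances are loose because the estimates are noisy):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
mu = np.array([[1.0], [-2.0]])
M = np.array([[2.0, 0.0],
              [1.0, 1.0]])

# Draw many samples of x(mu, M) at once: each column is one sample.
num = 200_000
Z = rng.standard_normal((n, num))
X = mu + M @ Z

sample_mean = X.mean(axis=1, keepdims=True)
centered = X - sample_mean
sample_cov = centered @ centered.T / num

# Empirically: mean is close to mu, covariance is close to M M.T.
assert np.allclose(sample_mean, mu, atol=0.05)
assert np.allclose(sample_cov, M @ M.T, atol=0.1)
```

Vectorizing the samples into a `(n, num)` array is equivalent to calling `x(mu, M)` repeatedly, just faster.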
It's actually more likely you'll need to go backwards. For example, you may encounter a multivariate gaussian in the wild with the $\mu$, $\Sigma$ parameterization and want to visualize it as an ellipsoid using the `mu`, `M` parameterization. For that we usually use the eigendecomposition $\Sigma = Q \Lambda Q^T$.
This decomposition is extremely important in linear algebra, and I would encourage you to read up on it later. If we start with any covariance matrix $\Sigma$, the eigendecomposition $\Sigma = Q \Lambda Q^T$ has the following properties.
- $\Lambda$ is a diagonal matrix with positive entries
- $Q$ is a high dimensional rotation matrix (all columns are mutually orthogonal unit vectors)
Defining $\sqrt{\Lambda}$ as the diagonal matrix whose entries are the square roots of $\Lambda$'s entries, we can see from Result 1 that calling `x(mu, M)` with $M = Q\sqrt{\Lambda}$ will simulate a multivariate gaussian with mean $\mu$ and covariance $M M^T = Q\sqrt{\Lambda}\sqrt{\Lambda}^T Q^T = Q \Lambda Q^T = \Sigma$. So we have the following result.
Result 2. A multivariate gaussian with mean $\mu$ and covariance $\Sigma$ is simulated by `x(mu, M)` with $M = Q\sqrt{\Lambda}$, where $\Sigma = Q \Lambda Q^T$ is the eigendecomposition of $\Sigma$.
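A quick numerical check of Result 2, using `np.linalg.eigh` (numpy's symmetric eigensolver) on an example covariance matrix:

```python
import numpy as np

Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])  # example positive definite covariance

# Eigendecomposition: Sigma = Q Lambda Q.T
eigen_vals, Q = np.linalg.eigh(Sigma)
sqrt_Lambda = np.diag(np.sqrt(eigen_vals))

# The simulator matrix from Result 2.
M = Q @ sqrt_Lambda

# M M.T recovers Sigma, so x(mu, M) has covariance Sigma.
print(np.allclose(M @ M.T, Sigma))  # True

# Q is a rotation: its columns are orthonormal.
print(np.allclose(Q.T @ Q, np.eye(2)))  # True
```

`eigh` is preferable to `eig` here because it exploits symmetry and returns real eigenvalues and orthonormal eigenvectors.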
Let's break down the action of $Q\sqrt{\Lambda}$ into two steps: the action of $\sqrt{\Lambda}$ and then the action of $Q$.
Remember $\sqrt{\Lambda}$ is a diagonal matrix, so it acts in an extremely simple way: it stretches the $i$-th axis by the scaling factor $\sqrt{\lambda_i}$ given in its $i$-th diagonal entry.
The matrix $Q$ is a rotation matrix whose action is also easily interpretable by inspecting its columns using column-oriented multiplication.
So using our mental Python interpreter as before, we can visualize the samples of `x(mu, M)` with $M = Q\sqrt{\Lambda}$ by imagining a cloud of samples contained in a sphere, then warping the sphere into an ellipsoid whose axis directions are the columns of $Q$ and whose axis lengths are given by the $\sqrt{\lambda_i}$, and then shifting this whole ellipsoid by $\mu$.
If you are fortunate enough to be working at a robotics company, you can do a code search for your gaussian visualization routines and I bet you will find something like the following.
```python
def vis_gaussian(mean, covar):
    """Visualize a 3d gaussian"""
    eigen_vals, eigen_vects = np.linalg.eig(covar)
    plot_ellipsoid(
        center=mean,
        major_axes=eigen_vects,
        axes_lengths=np.sqrt(eigen_vals))
```
Now I hope whenever you see $\mathcal{N}(\mu, \Sigma)$ you'll find it helpful to mentally replace it with `x(mu, M)`, where $M = Q\sqrt{\Lambda}$, and simulate it with your mental Python interpreter.