An Application of Numerical Solutions to Maximum Likelihood Estimation in GraphSLAM

Shiva Chandrachary
Jan 11, 2021 · 5 min read


In the previous post, we looked at a robot taking repeated measurements of the same feature in the environment. That example demonstrated the fundamentals of maximum likelihood estimation (MLE), but it was very limited since it estimated only one parameter, z1. Below, we will walk through a more complicated one-dimensional estimation problem.

Motion and Measurement Example

The robot starts at an arbitrary location that will be labeled 0, and then proceeds to measure a feature in front of it — the sensor reads that the feature is 7 meters away. The resultant graph is shown in the image below.

After taking its first measurement, the following Gaussian distribution describes the most likely location of the feature.
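In symbols (writing the measurement noise variance as σ², a symbol introduced here for concreteness), the likelihood of the feature's position z1 is a Gaussian centered 7 meters from the robot:

p(z1 | x0) ∝ exp( -(z1 - x0 - 7)² / (2σ²) )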

Recall that since we constrained the robot's initial location to 0, x0 can actually be removed from the equation.

Next, the robot moves forward by what it records to be 10 meters and takes another measurement of the same feature. This time, the feature is read to be 4 meters behind the robot. The resultant graph looks like so:
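Spelled out, the graph now encodes three constraints (with x0 = 0 as before): the first measurement puts the feature 7 meters from the start, the motion puts the robot 10 meters from the start, and the second measurement puts the feature 4 meters behind the robot's new pose:

z1 - x0 = 7 (first measurement)
x1 - x0 = 10 (motion)
x1 - z1 = 4 (second measurement)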

Sum of Constraints

Now, the task at hand is to minimize the sum of all constraints:
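Written out as squared errors (with the variance weights dropped for now, since the first calculation below assumes equal variances), the function to minimize is:

J(z1, x1) = (z1 - 7)² + (x1 - 10)² + (x1 - z1 - 4)²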

To do this, you will need to take the first derivative of the function and set it equal to zero. Seems easy, but wait: there are two variables! You'll have to take the derivative with respect to each, and then solve the system of equations to calculate the values of the variables.

For this calculation, we will assume that the measurements and motion have equal variance. As you may have figured out, you need to take the derivative of the error function with respect to each of the two variables, z1 and x1, and then perform variable elimination to calculate their most likely values. Solving the system of equations gives z1 = 6.67 and x1 = 10.33. This process will only get more complicated and tedious as the graph grows.
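Concretely, setting both partial derivatives of J to zero gives the linear system

2z1 - x1 = 3
2x1 - z1 = 14

whose solution is z1 = 20/3 ≈ 6.67 and x1 = 31/3 ≈ 10.33. As a sanity check, a few lines of Python (a sketch using the sympy symbolic math library) reproduce these values:

```python
import sympy as sp

z1, x1 = sp.symbols("z1 x1")

# Sum of squared constraint errors (equal variances assumed).
J = (z1 - 7)**2 + (x1 - 10)**2 + (x1 - z1 - 4)**2

# Set both partial derivatives to zero and solve the linear system.
solution = sp.solve([sp.diff(J, z1), sp.diff(J, x1)], [z1, x1])
print(solution)  # {x1: 31/3, z1: 20/3}, i.e. about 10.33 and 6.67
```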

Optimization with Non-Trivial Variances

To make matters a little bit more complicated, let's actually take into consideration the variances of each measurement and motion. It turns out that our robot has the fanciest wheels on the market: they're solid rubber (they won't deflate at different rates) and paired with the most expensive encoders. But it looks like the funds ran dry after the purchase of the wheels, because the sensor is of very poor quality.

Let’s redo our math with the following new information:

  • Motion variance: 0.02
  • Measurement variance: 0.1

After taking the variances into account, the estimated values are z1 = 6.54 and x1 = 10.09.
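The derivation mirrors the equal-variance case, except that each squared error is divided by its variance, which changes the second equation of the linear system to 6x1 - z1 = 54 and yields z1 = 72/11 ≈ 6.545 and x1 = 111/11 ≈ 10.09. The same sympy sketch, adjusted for the weights, reproduces these numbers:

```python
import sympy as sp

z1, x1 = sp.symbols("z1 x1")

# Weight each squared error by the inverse of its variance.
J = ((z1 - 7)**2 / sp.Rational(1, 10)        # first measurement, variance 0.1
     + (x1 - 10)**2 / sp.Rational(1, 50)     # motion, variance 0.02
     + (x1 - z1 - 4)**2 / sp.Rational(1, 10))  # second measurement, variance 0.1

solution = sp.solve([sp.diff(J, z1), sp.diff(J, x1)], [z1, x1])
print(solution)  # {x1: 111/11, z1: 72/11}, i.e. about 10.09 and 6.54
```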

That seemed to be a fair bit more work than the first example! And at this point we have just three constraints; imagine how difficult this process would be if we had collected measurement and motion data over a period of half an hour, as may happen when mapping a real-life environment. The calculations would be tedious, even for a computer!

Solving the system analytically has the advantage of finding the exact answer. However, doing so can be very computationally intensive, especially as you move into multi-dimensional problems with complex probability distributions. In this example the steps were easy to perform, but it takes only a short stretch of the imagination to see how complicated they become in larger problems.

“Well, what is the alternative?” you may ask. Finding the maximum likelihood solution can be done in two ways: analytically and numerically. Solving the problem numerically allows a solution to be found rather quickly; however, its accuracy may be suboptimal.

Numerical Solution to MLE

The method that you applied in the previous two examples was very effective at finding a solution quickly — but that is not always the case. In more complicated problems, finding the analytical solution may involve lengthy computations.

Luckily there is an alternative — numerical solutions to maximum likelihood problems can be found in a fraction of the time. We will explore what a numerical solution to the previous example would look like.


The graph of the error function from the previous example is seen below. In this example, it is very easy to see where the global minimum is located. However, in more complicated examples with multiple dimensions, this is not as trivial.

This MLE can be solved numerically by applying an optimization algorithm. The goal of an optimization algorithm is to speedily find the optimal solution, in this case the minimum of the error function. There are several different algorithms that can tackle this problem; in SLAM, gradient descent, Levenberg-Marquardt, and conjugate gradient are quite common.
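To see what this looks like in practice, here is the weighted error function from our example handed to a general-purpose numerical optimizer; the choice of scipy's minimize is illustrative, not the specific solver a SLAM system would use:

```python
import numpy as np
from scipy.optimize import minimize

def error(params):
    z1, x1 = params
    # Weighted sum of squared constraint errors from the example above.
    return ((z1 - 7)**2 / 0.1
            + (x1 - 10)**2 / 0.02
            + (x1 - z1 - 4)**2 / 0.1)

# Note: x0 here is scipy's name for the initial guess,
# not the robot's starting pose from the example.
result = minimize(error, x0=np.array([0.0, 0.0]))
print(result.x)  # approximately [6.55, 10.09]
```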

Quick Refresher on Gradient Descent

Recall that the gradient of a function is a vector that points in the direction of the greatest rate of increase; at an extremum, the gradient is equal to zero.

In gradient descent, you make an initial guess and then adjust it incrementally in the direction opposite the gradient. Eventually, you should reach a minimum of the function.

This algorithm does have a shortcoming: in complex distributions, the initial guess can change the end result significantly. Depending on the initial guess, the algorithm may converge on different local minima. The algorithm has no way to determine where the global minimum is; it very naively moves down the steepest slope, and when it reaches a local minimum, it considers its task complete. One solution to this problem is to use stochastic gradient descent (SGD), an iterative variant of gradient descent that works on random subsamples of the data.
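To make the basic version concrete, here is a minimal batch gradient descent sketch on the weighted error function from our example; the learning rate, iteration count, and starting guess are illustrative choices, not values from the original lesson:

```python
import numpy as np

def gradient(z1, x1):
    # Partial derivatives of the weighted error function from our example.
    dz1 = 2 * (z1 - 7) / 0.1 - 2 * (x1 - z1 - 4) / 0.1
    dx1 = 2 * (x1 - 10) / 0.02 + 2 * (x1 - z1 - 4) / 0.1
    return np.array([dz1, dx1])

estimate = np.array([0.0, 0.0])  # initial guess for (z1, x1)
learning_rate = 0.005

for _ in range(1000):
    # Take a small step in the direction opposite the gradient.
    estimate -= learning_rate * gradient(*estimate)

print(estimate)  # converges to approximately [6.55, 10.09]
```

Starting the loop from a different initial guess shows the failure mode described above: on this convex error function every start converges to the same answer, but on a multimodal error surface the end point depends on where you begin.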

Source: Udacity’s Self-Driving Nanodegree program
