Kalman Filters. Lots of you have probably heard something or other about Kalman filters and filtering; however, from what I’ve been able to discern, it’s all been rather incomplete. As such, I thought it’d be useful to give a brief-ish overview of what exactly Kalman filters are, how they work, and how they’re useful in robotics.

Firstly, what *is* a Kalman filter? To explain that, I’ll have to go over some basic terms first. In control theory, “State” is a general way of describing everything you need to know about an object for the purposes of a problem. If you’re talking about a car on a freeway, its ‘state’ may include such ‘state variables’ as position, velocity, and acceleration, and probably some things for turning as well. If color were relevant to driving down a highway, you’d have it there too; but as is, it can’t affect anything, so it’s not included. If we’re trying to represent MVRT’s robot from 2016, we’d probably need to account for position, velocity, and acceleration (both linear and rotational), flywheel speed and acceleration (flywheel position doesn’t really matter, if you think about it), whether the claw is open or not, angler angle, velocity, and acceleration, and whether the punch is open or not. If we’d had a sensor to detect whether the ball is in the robot, we’d probably want that as a variable too.

Now, this is obviously a lot to handle in one control loop; as such, we divide it into individual “Systems”. A system is essentially a closed group of interrelated variables for which all external inputs are easily modeled. For example, you could take the drivetrain and care *only* about linear and rotational position/velocity/acceleration, modelling friction as needed, and basically just ignore whatever the angler and grabber are doing. Now, on to Kalman.
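
As a purely illustrative sketch, here’s what a stripped-down, drivetrain-only “system” might look like in Python. The variable names, the one-second timestep, and the constant-acceleration kinematics are my own assumptions, not anything from the actual robot code:

```python
# Hypothetical drivetrain-only "system": ignore the angler/claw entirely and
# track just position, velocity, and acceleration along one axis.
state = {"pos": 5.0, "vel": 3.0, "acc": 0.0}

def step(s, dt=1.0):
    """Advance the state one timestep using basic constant-acceleration
    kinematics. This is the "model" part of a system."""
    return {
        "pos": s["pos"] + s["vel"] * dt + 0.5 * s["acc"] * dt**2,
        "vel": s["vel"] + s["acc"] * dt,
        "acc": s["acc"],  # assume acceleration holds steady between steps
    }

print(step(state))  # {'pos': 8.0, 'vel': 3.0, 'acc': 0.0}
```

The point is just that a “system” is a self-contained bundle of variables plus a rule for how they evolve; everything outside that bundle gets ignored or modeled away.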

What Kalman *is* is essentially a way of stabilizing signal measurements over time. By measurement, I mean exactly that: when you have an encoder on a wheel, you’re taking measurements off of it every so-and-so period of time. Now, as any of you who have worked with sensors before will know, sensors are *noisy*. This means that, at any specific point in time, the sensor will almost certainly *not* give an accurate measurement. “Noise” is a term we use for *unpredictable* errors in measurement; oftentimes, we’ll state it as a ‘Noise Function of Time’, where plugging any time into the function gives you the noise (measurement minus actual value, known through some other method or purely theoretical) at that moment. Now, we can’t actually predict anything with this function (that’s the point of noise), but we can use it to represent past noise mathematically. What some people will do is use a moving average, wherein you take, say, the last 5 measurements and average them (possibly weighting recent ones more heavily). There are a lot of problems with this, however; it doesn’t tend to deal well with volatile variables, or with cases where some other part of the state suddenly changes the behavior of a measured variable. An example of this would be turning: you could be driving straight for a while, but when you turn, the average would take a while to catch up, and you could screw up your control loop. What the Kalman Filter is, essentially, is an ‘adjustable weighted moving average’.
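
You can see the lag problem directly with a quick sketch. This is the plain moving average described above, *not* a Kalman filter; the window size and the “suddenly turning” signal are made up for illustration:

```python
from collections import deque

def moving_average_filter(measurements, window=5):
    """Average the last `window` measurements -- the simple approach
    described above, NOT a Kalman filter."""
    buf = deque(maxlen=window)  # automatically drops the oldest value
    out = []
    for m in measurements:
        buf.append(m)
        out.append(sum(buf) / len(buf))
    return out

# Driving straight (reading 0), then suddenly "turning" (reading 10):
# the averaged signal lags the true change for several timesteps.
signal = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]
print(moving_average_filter(signal))
# [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

Notice it takes the full window length to catch up to the new value, which is exactly the kind of delay that can wreck a control loop.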

There are four important parts to the Kalman Filter: the last estimate of state, from which you calculate the current *prediction* of state; a current measurement of state; and a series of ‘covariance matrices’. Each timestep, the Kalman Filter produces an estimate of state, and more or less assumes that’s correct. From this estimate, you can more or less figure out where you’re going to be in one timestep: if you’re at a position x=5, moving at a velocity of 3 meters per second, the next second you’ll probably be at x=8 (assuming no acceleration occurred in that timeframe). With more variables this gets more complicated, but that’s more of an implementation problem than a theoretical one. Next, you’ll be getting a measurement of your state at each timestep; assuming your state contains several separate variables, this’ll actually be several individual measurements, like from an encoder, accelerometer, and gyroscope, which we’ll combine into what we’ll call a single “Vector Measurement of State”. A vector is sort of like a 1-D array, but in general just know that a Vector Measurement of State is basically a collection of all the individual measurements you need to track your state. Each time you get a measurement, the Kalman Filter has to make an estimate of the current state value. You have two values which are used to do this: a prediction of where you *should* be, and a measurement of where you *think* you are. What the Kalman Filter does is assign a *weight* to both of these and take the average based on that. These weights are calculated using the rules of probability: if the Predictions have been more random in the past, then you trust them less, with the same thing going for the Measurements.
These ‘weights’ are stored in a series of Covariance Matrices, which are a bit too complicated to explain here, but essentially just hold a bunch of numbers describing how ‘random’ various variables are with each other for the Estimates, Predictions, and Measurements respectively. What’s so special about this? Well, it’s probabilistic, learning from its own errors as time goes on, meaning the longer you run it, the more accurate it’ll get. Secondly, it’s also Recursive, meaning that you only need two points of data at any one time to make an estimate: the current state and the previous one. You don’t need a list of *all* previous data; the effect of that history is accumulated and stored in the Covariance Matrices.
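
The predict/weigh/update cycle is easiest to see in one dimension, where the covariance “matrices” collapse down to single variance numbers. This is a minimal sketch, not a full implementation; the process-noise and measurement-noise values (`q` and `r`) and the wall-distance readings are made up for illustration:

```python
def kalman_1d(z, x_est, p_est, q=0.01, r=1.0):
    """One timestep of a 1-D Kalman filter (no motion/control input).

    x_est, p_est : previous state estimate and its variance
    z            : new measurement
    q, r         : process and measurement noise variances
                   (the 1-D stand-ins for the covariance matrices)
    """
    # Predict: with no motion model, the prediction is just the old
    # estimate, but our uncertainty grows by the process noise.
    x_pred = x_est
    p_pred = p_est + q

    # Update: the Kalman gain k is the "adjustable weight". If the
    # measurement is noisy (big r), k is small and we lean on the
    # prediction; if the prediction is shaky (big p), we lean on z.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1 - k) * p_pred
    return x_new, p_new

# Noisy readings of a wall that's really 5.0 units away:
x, p = 0.0, 100.0  # start with a bad guess and huge uncertainty
for z in [5.3, 4.8, 5.1, 4.9, 5.2, 5.0]:
    x, p = kalman_1d(z, x, p)
print(round(x, 2))  # settles near 5.0, and p shrinks each step
```

Note that only `x` and `p` carry over between timesteps: that’s the recursiveness in action, with `p` playing the role of the accumulated covariance.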

There are some problems, however. Firstly, it works best if noise is distributed according to a “Gaussian” model, as opposed to something like uniform noise. Uniform noise means that the noise is equally likely to be any value in the readable range, whereas Gaussian noise means it’s ‘weighted’ towards a central value that’s equal to the average of all individual noise instances. This is more likely than you think; if you roll ten dice and sum them, the result will be pretty Gaussian, and the more dice you roll, the more Gaussian it gets (this is the Central Limit Theorem at work). Secondly, it’s only intended to operate on Linear systems, meaning the next state is assumed to be a linear function of the current state (no squares, sines, or anything fancier); real systems are usually *not* like this, and as such, the regular Kalman filter isn’t quite adequate. As such, we have two additional types of Kalman Filters: the Extended Kalman Filter, which operates on mildly nonlinear systems, and the Unscented Kalman Filter, which operates on *really* nonlinear systems. I don’t understand Unscented Kalman Filters at all, so I won’t explain them, but what the Extended Kalman Filter does is essentially estimate a ‘linearization’ of the nonlinear system at each timestep using a matrix of the derivatives (a Jacobian) of the state and measurement functions, something I won’t explain here.
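
The dice claim is easy to check for yourself. Each individual die is uniform noise, but their *sum* piles up around the mean, which is the Central Limit Theorem doing its thing (the seed and sample count here are arbitrary):

```python
import random
from collections import Counter

random.seed(0)  # arbitrary seed, just so the run is repeatable

# Sum of 10 dice, repeated 10,000 times. Each die is uniform on 1..6,
# but the sums cluster around the mean total of 35.
sums = [sum(random.randint(1, 6) for _ in range(10)) for _ in range(10_000)]
counts = Counter(sums)

# Totals near the mean are far more common than totals in the tails:
print(counts[35] > counts[15])  # True
```

A histogram of `counts` would show the familiar bell shape, and adding more dice per roll makes it smoother still.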

And what is this all used for? Quite a lot, actually. Two main purposes exist: Sensor Fusion and State Estimation. The latter is the ‘intended’ purpose of the algorithm: whenever you want to figure out what some system is *actually* doing, you slap on a Kalman Filter and it removes most of the noise that would normally perturb it. Sensor Fusion, also known as Data Fusion, is more accidental; essentially, due to how the Kalman Filter is set up, you’re able to use multiple sensors to measure the same value. This ends up being quite useful: you can have, say, an ultrasonic sensor, an infrared sensor, and a vision sensor all tracking your distance from a wall, and the Kalman filter will automatically sort out how accurate each sensor is and give you as accurate a weighted average as it can generate. These two properties allow the Kalman Filter to appear basically, well, everywhere. Radios, satellite communications, GPS, Wi-Fi: anytime you’re transmitting *anything* over open space, you almost *need* a Kalman Filter. Self-driving cars, and even normal cars, use it to keep you on course and tell how the car is functioning. Real-world robots use Kalman filters for basically *everything*; they approximate the human brain’s ability to filter out irrelevant data and move the body on a non-chaotic path. Look at any engineering system that interacts with the world at large, and there’s a pretty good chance Kalman will be there.
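
The core of sensor fusion is just that “sort out how accurate each sensor is” step: an inverse-variance weighted average, which is the same weighting a Kalman update applies when several sensors measure one state. The sensor names, readings, and variances below are entirely made up for illustration:

```python
def fuse(measurements):
    """Inverse-variance weighted average of (value, variance) pairs.
    Noisier sensors (bigger variance) count for less, automatically."""
    num = sum(z / var for z, var in measurements)
    den = sum(1.0 / var for z, var in measurements)
    return num / den

# Hypothetical ultrasonic (noisy), infrared (medium), and vision
# (precise) sensors, all measuring distance to the same wall:
readings = [(5.4, 4.0), (5.1, 1.0), (4.95, 0.25)]
print(round(fuse(readings), 2))  # 5.0 -- dominated by the vision sensor
```

In a full Kalman Filter those variances live in the measurement covariance matrix and get updated over time, so a sensor that degrades mid-run automatically loses influence.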