Visual-Inertial-Kinematic Odometry for Legged Robots (VILENS)

This blog post provides an overview of our recent ICRA 2020 paper Preintegrated Velocity Bias Estimation to Overcome Contact Nonlinearities in Legged Robot Odometry:

  • D. Wisth, M. Camurri, and M. Fallon, “Preintegrated Velocity Bias Estimation to Overcome Contact Nonlinearities in Legged Robot Odometry,” in IEEE Intl. Conf. on Robotics and Automation (ICRA), May 2020.

This is one paper in a series of works on state estimation described here.


Many algorithms for mobile robotics rely on one crucial piece of information – Where is the robot?

Mobile robots rely on having an accurate location estimate for control, motion planning, navigation, and many other tasks. For example, when legged robots attempt to walk down stairs, very precise foot placement is required – an error of just a few centimetres can send the robot tumbling down to the bottom!



The process of using sensor inputs to determine the robot’s location is known as state estimation. Legged robots have relatively strict state estimation requirements, since these platforms often need high-frequency control for balance and precise footstep placement when walking on uneven terrain. Typically, these requirements are met by providing two estimates:

  • A high-frequency (100 Hz to 1 kHz), locally accurate state estimate for control.
  • A lower-frequency (1-30 Hz), (more) globally accurate state estimate for motion planning and higher-level tasks.

Background and Motivation

[Animation: ANYmal ground deformation and slippage]

Traditionally, legged robots rely on kinematic and inertial inputs, since these can provide the high-frequency updates required for control. Kinematic sensing takes advantage of the observation that when a robot’s foot is in firm, rigid contact with the ground, the velocity of the body relative to the foot is the opposite of the velocity of the foot relative to the body (similar to wheel odometry for wheeled vehicles).
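The rigid-contact relation can be written out for a single stance leg. The sketch below is only an illustration and not the VILENS implementation: it assumes a hypothetical planar two-link leg (the link lengths `L1`, `L2` and all function names are invented for this example), computes the foot velocity in the base frame from the joint angles and rates, and negates it to obtain the base velocity. An optional body angular rate `omega` is included, because a stationary foot implies v_base + omega × p_foot + v_foot = 0.

```python
import math

# Hypothetical planar two-link leg (side view: x forward, y up).
# Link lengths in metres are illustrative values, not ANYmal's.
L1, L2 = 0.25, 0.25


def foot_position(q):
    """Foot position in the base frame for joint angles q = (hip, knee) in radians."""
    q1, q2 = q
    x = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    y = -L1 * math.cos(q1) - L2 * math.cos(q1 + q2)
    return (x, y)


def foot_velocity(q, dq):
    """Foot velocity relative to the base: the leg Jacobian times the joint rates."""
    q1, q2 = q
    dq1, dq2 = dq
    vx = L1 * math.cos(q1) * dq1 + L2 * math.cos(q1 + q2) * (dq1 + dq2)
    vy = L1 * math.sin(q1) * dq1 + L2 * math.sin(q1 + q2) * (dq1 + dq2)
    return (vx, vy)


def base_velocity_from_stance_leg(q, dq, omega=0.0):
    """Base velocity (in the base frame), assuming the foot is stationary on the ground.

    omega is the body's angular rate about the out-of-plane axis (rad/s),
    for example from a gyroscope.
    """
    px, py = foot_position(q)
    vx, vy = foot_velocity(q, dq)
    # Planar cross product: omega x p = (-omega * py, omega * px).
    return (-(vx - omega * py), -(vy + omega * px))


# Straight leg with the hip swinging backwards at 1 rad/s: the body moves forwards.
print(base_velocity_from_stance_leg(q=(0.0, 0.0), dq=(-1.0, 0.0)))
```

If the foot slips or the ground deforms, the stationary-foot assumption inside `base_velocity_from_stance_leg` no longer holds, and the returned velocity is wrong by exactly the unmodelled foot motion.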

However, in practice, the rigid contact assumption of footholds often breaks down. For example, in many scenarios there are unmodelled effects such as slippage, ground deformation, and deformation of the robot’s mechanical structure itself (common with legged robots due to the high impact forces when walking or running). The animation below shows that these unmodelled effects can introduce spurious velocities into the estimation, leading to poor results.

[Animation: ANYmal foot contact]

These effects are very challenging to model and depend on the robot hardware, gait style, coefficient of friction, and other complex terrain properties (e.g. deformation and elasticity). The resultant state estimate drift in two kinematic-inertial estimators running on the ANYmal robot is shown below. During this experiment, the robot walked over concrete, grass, and gravel with several different gait styles. This results in several “distinct” regions where the drift is approximately linear over time. We have observed this type of state estimator drift over hundreds of hours of testing with our ANYmal robot in countless different scenarios.


Given the challenge of modelling these effects and estimating the relevant parameters (e.g. the deformation properties of new terrain), we have chosen to estimate the kinematic-inertial (KI) bias online, similar to the popular approach of estimating IMU biases online. To make this KI bias observable, we have integrated our bias estimation into the VILENS estimation framework from our previous work [see link below].

Algorithm Description

The aim of this algorithm is to find the optimal state of the robot given the input measurements. The state consists of:

  • Robot position and orientation (pose)
  • Robot linear velocity
  • IMU biases (accelerometer and gyroscope)
  • Kinematic velocity bias (linear and angular).
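As a rough picture of the state listed above, it could be held in a container like the one below. This is a sketch only: the class name, field names, and the choice of a unit quaternion for orientation are assumptions made for illustration, not the data structures used in VILENS.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]
ZERO3: Vec3 = (0.0, 0.0, 0.0)


@dataclass
class RobotState:
    """One estimated state. Names and representations are illustrative assumptions."""

    position: Vec3 = ZERO3                    # metres
    # Orientation as a unit quaternion (w, x, y, z); identity by default.
    orientation: Tuple[float, float, float, float] = (1.0, 0.0, 0.0, 0.0)
    linear_velocity: Vec3 = ZERO3             # m/s
    accel_bias: Vec3 = ZERO3                  # IMU accelerometer bias, m/s^2
    gyro_bias: Vec3 = ZERO3                   # IMU gyroscope bias, rad/s
    velocity_bias_linear: Vec3 = ZERO3        # kinematic velocity bias, m/s
    velocity_bias_angular: Vec3 = ZERO3       # kinematic velocity bias, rad/s


state = RobotState(linear_velocity=(0.3, 0.0, 0.0))
print(state)
```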

We use a factor graph framework (in particular, iSAM2) to combine inputs from kinematics, IMU, and stereo camera measurements. Below is a summary of the different factors used in this system.


Prior Factors (PURPLE): We use prior factors to set the initial state of the system.

IMU Factors (ORANGE): We use preintegrated IMU factors to connect successive states.

Stereo Visual Factors (GREEN): We jointly optimise a sliding window of robot poses and visual landmark locations. We detect features using the FAST corner detector, track them between frames using the KLT feature tracker, and perform optimisation using a stereo projection error term.
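To make the stereo projection error term concrete, here is a minimal sketch under standard assumptions: a rectified stereo pair, a pinhole camera model, and a landmark already expressed in the left camera frame. The function names and the parameters (`fx`, `fy`, `cx`, `cy`, `baseline`) are chosen for this example and are not taken from the VILENS code.

```python
def stereo_project(point_cam, fx, fy, cx, cy, baseline):
    """Project a 3D point (left camera frame, metres) into a rectified stereo pair.

    Returns (u_left, u_right, v) in pixels. Both images share the row v
    because the pair is assumed to be rectified.
    """
    X, Y, Z = point_cam
    if Z <= 0.0:
        raise ValueError("point must be in front of the camera")
    u_left = fx * X / Z + cx
    u_right = fx * (X - baseline) / Z + cx
    v = fy * Y / Z + cy
    return (u_left, u_right, v)


def stereo_projection_error(point_cam, measurement, fx, fy, cx, cy, baseline):
    """Residual between the predicted and the measured (u_left, u_right, v)."""
    predicted = stereo_project(point_cam, fx, fy, cx, cy, baseline)
    return tuple(p - m for p, m in zip(predicted, measurement))


# Hypothetical intrinsics and a landmark 2 m in front of the camera.
intrinsics = dict(fx=400.0, fy=400.0, cx=320.0, cy=240.0, baseline=0.1)
landmark = (0.5, 0.2, 2.0)
measurement = (421.0, 400.5, 279.0)  # a tracked feature, in pixels
print(stereo_projection_error(landmark, measurement, **intrinsics))
```

In an optimisation, this residual would be evaluated for every landmark observation in the sliding window, with the landmark first transformed into the camera frame using the current pose estimate.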

Preintegrated Velocity Factors with Bias Estimation (RED): In the ideal case, under the rigid contact assumption, the base velocity is simply the opposite of the foot velocity in the base frame. However, to account for the complex dynamics of ground contact described above (e.g. deformation or slippage), we augment the velocity with a slowly varying bias term. This allows us to use the kinematic odometry without introducing a biased input into the optimisation.
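The effect of a velocity bias can be illustrated with a one-dimensional toy example. This is not the preintegrated factor from the paper (which also has to handle rotation, noise propagation, and the relinearisation of the bias); it only assumes the measurement model "measured velocity = true velocity + bias" and uses made-up numbers. It shows why a constant bias produces position drift that grows linearly with time, and why subtracting an estimate of the bias before integrating removes that drift.

```python
def integrate_position(velocities, dt, bias=0.0):
    """Accumulate bias-corrected velocity samples into a position change (metres)."""
    return sum((v - bias) * dt for v in velocities)


# Hypothetical numbers: 10 s of walking at 0.3 m/s, sampled at 100 Hz,
# with a constant 0.02 m/s bias on the leg-odometry velocity.
dt, true_velocity, bias = 0.01, 0.3, 0.02
measured = [true_velocity + bias] * 1000

drift_5s = integrate_position(measured[:500], dt) - true_velocity * 5.0
drift_10s = integrate_position(measured, dt) - true_velocity * 10.0
corrected_error = integrate_position(measured, dt, bias=bias) - true_velocity * 10.0

print(f"uncorrected drift after 5 s:  {drift_5s:.3f} m")
print(f"uncorrected drift after 10 s: {drift_10s:.3f} m")
print(f"error with bias removed:      {abs(corrected_error):.3f} m")
```

The uncorrected drift doubles between 5 s and 10 s (roughly 0.1 m to 0.2 m here), which mirrors the approximately linear drift described earlier. In the real system the bias is not known in advance: it is a state variable, and the other sensors constrain it.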


We tested our proposed approach on a variety of terrain types, with experiments totalling over 50 minutes and 400 m travelled. There were two main experimental locations: the Fire Service College (FSC) in the UK, and the Swiss Military Rescue Center (SMR) in Switzerland. The results are demonstrated in the video and table below. We compare the Relative Pose Error (RPE) of four different algorithms:

  • TSIF – The robot’s in-built kinematic-inertial estimator.
  • V-VI – VILENS with only visual and inertial inputs.
  • V-RP – VILENS with visual, inertial and kinematic inputs – adding the kinematics as a relative pose constraint, without accounting for the bias.
  • V-VB – VILENS with visual, inertial, and kinematic inputs – using the proposed preintegrated velocity factors with bias estimation.
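For readers unfamiliar with the metric, RPE compares the motion between pairs of poses in the estimated trajectory with the motion between the same pairs in the ground truth, so a constant offset between the two trajectories does not count as error. The sketch below is a simplified planar version with poses given as (x, y, yaw); the actual evaluation uses full 3D poses, and the window size `delta` used here is a hypothetical parameter, not the value from the paper.

```python
import math


def wrap_angle(a):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def relative_motion(a, b):
    """Motion from pose a to pose b, expressed in a's frame. Poses are (x, y, yaw)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    c, s = math.cos(a[2]), math.sin(a[2])
    return (c * dx + s * dy, -s * dx + c * dy, wrap_angle(b[2] - a[2]))


def relative_pose_error(estimate, ground_truth, delta=1):
    """Return (translational RMSE in metres, rotational RMSE in radians)."""
    if len(estimate) != len(ground_truth) or len(estimate) <= delta:
        raise ValueError("trajectories must have equal length greater than delta")
    sq_trans, sq_rot = [], []
    for i in range(len(estimate) - delta):
        est_rel = relative_motion(estimate[i], estimate[i + delta])
        gt_rel = relative_motion(ground_truth[i], ground_truth[i + delta])
        ex, ey, eyaw = relative_motion(gt_rel, est_rel)  # inverse(gt_rel) * est_rel
        sq_trans.append(ex * ex + ey * ey)
        sq_rot.append(eyaw * eyaw)
    n = len(sq_trans)
    return (math.sqrt(sum(sq_trans) / n), math.sqrt(sum(sq_rot) / n))


# Toy example: the estimate overshoots every 1 m step by 10 %.
ground_truth = [(float(i), 0.0, 0.0) for i in range(4)]
estimate = [(1.1 * i, 0.0, 0.0) for i in range(4)]
print(relative_pose_error(estimate, ground_truth))
```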


Overall, we achieve a significant reduction in RPE compared to the baseline solutions. Notably, adding leg odometry without bias estimation (V-RP) reduces accuracy compared to visual-inertial odometry alone (V-VI), which shows the importance of bias estimation.

More Information

For more information on our state estimation framework (VILENS) please see our website for a link to relevant publications.