📷 Inspection Robots @ CERN

At the CERN Robotics Team I was posed with the challenge of solving odometry for an inspection robot traversing the underground tunnels of the LHC and SPS accelerators, i.e. how does a robot localize itself using only on-board sensor information and no GPS?

How to autonomously traverse the tunnels that house the world’s largest machine

CERN, the European Organization for Nuclear Research, is a fascinating place; theoretical and experimental physicists, engineers, designers, construction and administration staff all work toward a unique mission of not only getting to the bottom of our universe’s origin and what keeps our world together, but also providing the scientific and medical community with vital knowledge with the help of some of the largest man-made machines on our planet.

ATLAS Experiment. Image courtesy: CERN Document Server

CERN began a side project and funded the development of an inspection robot for the Large Hadron Collider (LHC), namely the Train Inspection Monorail (TIM). Hanging from a rail above the tube and covering the entire 27 km of the tunnel, it safely monitors the radiation and oxygen levels, communication bandwidth, and temperature while performing visual inspection tasks with cameras and laser scanners. With the success of TIM in keeping CERN personnel safe by automating these tasks, and the subsequent upgrades to TIM2 and TIM3, several parallel projects have been initiated, one of which is the versatile CERNBot pictured below. Its main tasks include tele-operated inspection, (dis-)assembly, and leak detection, to name just a few.

Since the infrastructure does not permit the installation of another monorail train above the Super Proton Synchrotron (SPS) — the 7 km accelerator built in 1976 providing beams for the LHC — this robot comes in handy. One of the biggest challenges for the CERNBot, however, as for any other unmanned device, is self-localization, i.e. pinpointing to the user its geographical location on a map. Logically, if a robot is required to handle complex tasks, it first needs to be aware of where it is. For humans this might be a trivial task, as long as we have a functioning brain and a smartphone with a GPS connection. For inspection robots navigating around 50 stories underground, we need to look toward alternatives.

Localization by Vision

Google has recently worked on a mobile VPS (Visual Positioning System) within their ARCore platform that enables you to find certain products inside warehouses and big-box stores solely through your phone’s camera and accelerometer. You hold up your phone, point the camera around as the system recognizes where you are, and the navigation begins. Likewise, the autonomous navigation industry is a major stakeholder in the robustness of this technology.

Typically, these systems rely on analyzing the incoming camera stream by picking distinctive features from the image, such as sharp edges and corners. Now, imagine a small blue box on a white background like in the picture below. If one of its corners is picked as a distinctive feature, we can track it over multiple frames. In theory, you only need to find the same box corner twice in consecutive images to be able to estimate where this point lies in the 3D world relative to our camera.
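As a rough illustration (not the pipeline running on the robot), here is how a classical feature-based front end might detect and match such corner features between two consecutive frames with OpenCV; the file names and parameters are placeholders:

```python
import cv2

# Two consecutive grayscale frames (hypothetical file names)
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Detect corner-like features and compute binary descriptors
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force match descriptors between the two frames (Hamming distance)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Each match pairs a pixel in frame 1 with the same physical point in frame 2
for m in matches[:10]:
    print(kp1[m.queryIdx].pt, "->", kp2[m.trainIdx].pt)
```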

This concept is called triangulation. It is commonly used by seismologists to locate earthquakes or by spacecraft engineers to determine a rocket’s location. Mount a camera on a robot or car, capture two images showing the same scene from slightly different perspectives, triangulate 10'000 times and, voilà, you receive a 3D model of the scene with 10'000 points. Do this with a video stream and your possibilities are endless.
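To make this concrete, here is a hedged sketch of two-view triangulation with OpenCV; the camera intrinsics, the 10 cm baseline, and the pixel coordinates are made-up values, not the robot’s actual calibration:

```python
import numpy as np
import cv2

# Hypothetical pinhole intrinsics (focal length and principal point)
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Two camera poses: the first at the origin, the second shifted 10 cm to the right
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# The same box corner observed in both images (pixel coordinates, 2xN arrays)
pts1 = np.array([[350.0], [200.0]])
pts2 = np.array([[330.0], [200.0]])

# Triangulate: the result is homogeneous, divide by the last row to get X, Y, Z
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).ravel()
print("3D point relative to the first camera:", X)  # roughly 3.5 m ahead
```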

The research community has come up with myriad methods to approach this problem; however, the classical feature-based approach is subject to large amounts of error and simply does not provide accurate results at CERN. Its tunnels are a real challenge for computer vision algorithms: repetitive features, wide spaces, smooth and textureless surfaces, as well as irregular illumination.

Large Hadron Collider, 2017. Image courtesy: CERN Document Server

We therefore needed to figure out a way to minimize our dependency on features and stumbled upon a localization method from the Technical University of Munich and Intel Labs called Direct Sparse Odometry (DSO). Instead of first extracting corners and edges, describing them, and brute-force matching them to those seen in the next frames, as most algorithms do, the authors propose circumventing this whole process and directly analyzing patches of pixels while minimizing a joint photometric energy function. In other words, DSO saves us an incredible amount of computing power and allows us to run it live on board the CERNBot.
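For intuition only, here is a minimal sketch of a direct photometric residual in the spirit of DSO; this is a simplification, not the authors’ implementation, and the affine brightness parameters a and b as well as the patch handling are assumptions:

```python
import numpy as np

def photometric_residual(I_ref, I_cur, patch_ref, patch_cur, a, b):
    """Residuals over one pixel patch, comparing raw intensities directly
    instead of matched feature descriptors. (a, b) model an affine brightness
    change between frames; patch_cur is where the reference patch projects
    into the current image for a candidate camera pose and point depth."""
    r = []
    for (xr, yr), (xc, yc) in zip(patch_ref, patch_cur):
        r.append(float(I_cur[yc, xc]) - (np.exp(a) * float(I_ref[yr, xr]) + b))
    return np.array(r)

# A direct method sums such residuals over many patches and frames and jointly
# minimizes the resulting energy with respect to camera poses, point depths,
# and brightness parameters, with no descriptor-matching step in between.
```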

Scale is Key

Okay, that’s fine and all, but how does having a 3D model of the environment help the robot tell us where it is on a world map? This is where we attempt to infer the absolute location of the robot. We now know the relative location of all objects in the scene to each other and to the camera; however, we do not know whether 10 cm in our virtual reconstruction correspond to 10 cm in the real world.

Humans are good at estimating how large objects in images are: when we see a pencil in a photo we know it should be around 15 cm long, and if we see the sun in an image, we know it is more than a million kilometers in diameter and therefore very far away. Computers, by contrast, cannot infer scale from a single image.

Usually, this calls for an “initialization phase” each time we start localization, during which we infer this scale. However, this can be highly error-prone with a single camera and no additional sensors (remember, ARCore uses multiple sensors including your camera, GPS, and IMUs). Instead, we can install a 2D laser scanner next to the camera, which scans points on a plane within a specific scan area and tells us with high accuracy how far away these points are (as seen below).
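One simple way to recover the metric scale, sketched below under the assumption that laser ranges and visual depths have already been associated point by point, is to take a robust ratio between the two; the numbers are purely illustrative:

```python
import numpy as np

def estimate_scale(laser_ranges, visual_depths):
    """Metric scale factor from laser ranges and up-to-scale visual depths of
    the same points (assumed already associated and expressed in the same
    frame; in practice the laser-camera extrinsic calibration does that work)."""
    ratios = np.asarray(laser_ranges) / np.asarray(visual_depths)
    # The median is robust against a few bad associations
    return float(np.median(ratios))

# Vision says "2.0 units" where the laser measures 3.0 m, and so on
scale = estimate_scale([3.0, 4.5, 6.1], [2.0, 3.0, 4.05])
print(scale)  # ~1.5, so one visual unit corresponds to roughly 1.5 m
```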

A “box” robot equipped with a 2D laser scanner

If we fuse information from the camera and the laser, the scale, and therefore the position and path of the robot, can be determined well. Going a step further, the wheels of the CERNBot give us another source of information about how far we have driven. In the end we retrieve a virtual, correctly scaled map of the robot’s trajectory and its immediate environment.
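As a toy illustration of this fusion, the sketch below applies the estimated scale to the visual trajectory and cross-checks the travelled distance against hypothetical wheel-encoder readings; all parameters (tick counts, wheel radius) are made up:

```python
import numpy as np

def scale_trajectory(visual_positions, scale):
    """Apply the estimated metric scale to an up-to-scale visual trajectory."""
    return np.asarray(visual_positions, dtype=float) * scale

def wheel_distance(ticks, ticks_per_rev, wheel_radius):
    """Distance implied by wheel-encoder ticks (hypothetical encoder model)."""
    return 2.0 * np.pi * wheel_radius * (ticks / ticks_per_rev)

# Cross-check: the scaled visual path length should roughly agree with the wheels
traj = scale_trajectory([[0.0, 0.0, 0.0], [0.4, 0.0, 0.0], [0.8, 0.0, 0.1]], scale=1.5)
visual_length = float(np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1)))
print(f"visual: {visual_length:.2f} m, wheels: {wheel_distance(5000, 2048, 0.08):.2f} m")
```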

CERNBot in action: Autonomously driving through SPS while odometry is running. (Speed ~3x)

This localization technique is currently being tested and optimized for deployment in CERN’s facilities. In the future, this system might be combined with a trained neural network that recognizes scenes with the help of specific alphanumeric markings on the sides of the tunnels. Either way, intelligent inspection robots will help keep employees safe and further facilitate the mission of discovering the origin of the universe.

A big thank you for this amazing experience to the EN-SMM-MRO (formerly EN-STI-ECE) section in the Engineering Department at CERN with Mario di Castro and Dr. Alessandro Masi.

CERN Robotics Team, January 2018. Top row (left to right): Antonio Strano, Simone Gargiulo, Dr. Giordano Lilli, Leanne Attard, Artūrs Ivanovs, Pawel Ptasznik, Alessandro Mosca, Jorge Camarero Vera, Laura Baiguera Tambutti, Giacomo Lunghi; Bottom row (left to right): Dr. Luca Buonocore, David Blanco Mulero, me, Carlos Veiga Almagro; Not pictured: Mario di Castro, Santiago Solis, Mengping Zheng
This post is a repost of my article on Medium.