Localization requires a map, and mapping the surroundings requires localization: a classic chicken-and-egg problem. SLAM (simultaneous localization and mapping) technology solves the localization and mapping problems together.
With the help of sensors, computers can observe the physical world. SLAM technology examines and converts this sensor data so that machines can understand and interpret it in the form of 3D visual points. Before SLAM, machines struggled both to localize themselves and to understand the environment they were operating in. Both capabilities are basic requirements for robots, unmanned aerial vehicles, augmented reality and autonomous vehicles.
BIS Research expects the SLAM technology market to grow from $50 million in 2017 to $8.23 billion by 2027.
The first SLAM techniques were published as research papers in the 1980s and were originally developed for robot navigation in unknown environments. They required expensive sensors such as LIDAR, SONAR and stereo cameras. Around 2001, breakthroughs in computational photography showed that SLAM can succeed with just a single camera. After 2012, scholars made a further breakthrough using the nonlinear least squares method, which reduced the process error and led to a graph-based approach that was very intuitive. After that, many excellent visual SLAM schemes and three-dimensional LiDAR-based SLAM schemes were produced.
Every big tech company with a stake in computers or sensors is trying to make a viable product out of SLAM. Apple’s ARKit, released in 2017, is the most prominent example: it combines device motion with the camera to provide accurate tracking and plane detection. These features let apps see the world, place virtual objects on horizontal and vertical surfaces, and recognize images and objects. Google Tango is another example; it can create a mesh of the environment. The device can tell you where the floor is and identify walls and objects, so that everything around you becomes an element to interact with. Google discontinued Tango in favor of its more versatile ARCore.
Although GPS already provides good positioning for navigation and driving, its accuracy is not as fine as most people assume. With GPS satellite information alone, the position error is below 2 meters only 95% of the time. On narrow two-way roads, an error of that size is unacceptable and dangerous. A completely self-driving car needs SLAM to locate itself accurately while travelling at speed. The car measures the environment using a roof-mounted LIDAR, which creates a 3D map of its surroundings at high frequency. These measurements are used to augment the pre-existing GPS maps. The readings produce a massive amount of data, and the car has to extract meaning from this data to make driving decisions. The software on the car accurately maps the environment and the position of the car, which is essential for safe operation.
While it’s easy to state that algorithms can make it work when fed with sensor data, the reality is often harder. To start with, probabilistic SLAM, as the name suggests, uses probability theory. To represent a navigation system accurately, the algorithm has to model the relationship between the environment, the device and the measurement states. SLAM contains two steps: the prediction step and the measurement step. One of the common estimation methods for SLAM is the Kalman filter. In this method, the prediction step estimates the current position from the previous positions and the current input. The measurement correction step then makes the final estimate of the current state based on the predicted state, the current and historic observations, and their uncertainty. The result is an estimate of the posterior probability function, which is used to merge data between image frames, predict the position and generate the map. EKF SLAM and FastSLAM are example algorithms that put the theory into action.
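The two steps can be illustrated with a minimal one-dimensional Kalman filter. This is only a sketch under simplifying assumptions: the device moves along a line, the state is its position alone (a full SLAM filter would also estimate landmark positions), and all noise is Gaussian. The names `Belief`, `predict`, `correct` and `run_filter` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Belief:
    mean: float      # estimated position in meters
    variance: float  # uncertainty of the estimate in m^2


def predict(belief: Belief, control: float, process_var: float) -> Belief:
    """Prediction step: shift the estimate by the control input.

    Motion is noisy, so the uncertainty grows by process_var.
    """
    return Belief(belief.mean + control, belief.variance + process_var)


def correct(belief: Belief, measurement: float, measurement_var: float) -> Belief:
    """Measurement step: blend prediction and observation.

    The Kalman gain weights the observation by how uncertain the
    prediction is relative to the sensor.
    """
    gain = belief.variance / (belief.variance + measurement_var)
    mean = belief.mean + gain * (measurement - belief.mean)
    variance = (1.0 - gain) * belief.variance  # a correction never adds uncertainty
    return Belief(mean, variance)


def run_filter(initial: Belief,
               controls: Sequence[float],
               measurements: Sequence[float],
               process_var: float,
               measurement_var: float) -> List[Belief]:
    """Alternate prediction and correction over a sequence of time steps."""
    belief = initial
    history = []
    for control, measurement in zip(controls, measurements):
        belief = predict(belief, control, process_var)
        belief = correct(belief, measurement, measurement_var)
        history.append(belief)
    return history
```

Starting from `Belief(0.0, 1.0)`, a commanded move of 1 m with a process variance of 1 gives a prediction of 1 m with variance 2; a measurement of 2 m with variance 2 then pulls the estimate to 1.5 m and reduces the variance to 1. The extended Kalman filter used in EKF SLAM follows the same predict/correct pattern but linearizes nonlinear motion and sensor models.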
Over time, case-specific algorithms have been developed for specific applications. Examples are laser SLAM for LIDAR-based applications; visual SLAM for camera-driven applications; and Graph-SLAM, developed for longer frames of reference when navigating unknown areas. RGBD-SLAM is another case-specific version of visual SLAM, for devices like Microsoft Kinect that host time-of-flight cameras.
The data consists of traditional information from RGB(D) cameras: the pixel color as Red (R), Green (G) and Blue (B), plus additional information on the depth of each pixel (D), extracted from infrared projectors that do ‘time of flight’ calculations. With depth information, 3D motion capture becomes possible. The RGBD-SLAM algorithm turns the data into a point cloud, and with this method a highly detailed map of the (indoor) environment can be produced. One downside of time-of-flight cameras is the error in measuring depth. An alternative that overcomes this lower depth accuracy is LiDAR, which returns more accurate depth.
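How a single RGB-D frame becomes a colored point cloud can be sketched with the standard pinhole camera model. The sketch assumes the depth image is already aligned with the color image and expressed in meters; the function name `depth_to_point_cloud` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are illustrative assumptions, not values from any particular camera. A complete RGBD-SLAM system would additionally register the clouds of successive frames against each other.

```python
from typing import List, Sequence, Tuple

# x, y, z in meters in the camera frame, followed by r, g, b
ColoredPoint = Tuple[float, float, float, int, int, int]


def depth_to_point_cloud(depth: Sequence[Sequence[float]],
                         colors: Sequence[Sequence[Tuple[int, int, int]]],
                         fx: float, fy: float,
                         cx: float, cy: float) -> List[ColoredPoint]:
    """Back-project every valid depth pixel into 3D.

    fx, fy are focal lengths in pixels and (cx, cy) is the principal
    point; these are assumed to come from camera calibration.
    """
    points: List[ColoredPoint] = []
    for v, row in enumerate(depth):          # v = pixel row
        for u, z in enumerate(row):          # u = pixel column
            if z <= 0.0:
                continue                     # no valid depth return for this pixel
            x = (u - cx) * z / fx            # pinhole model, horizontal axis
            y = (v - cy) * z / fy            # pinhole model, vertical axis
            r, g, b = colors[v][u]
            points.append((x, y, z, r, g, b))
    return points
```

Pixels with zero depth are skipped because depth sensors commonly report zero where no measurement was obtained; a real pipeline would also filter noisy depth values before mapping.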
For self-driving cars equipped with LIDAR, SLAM techniques are applied to 3D point clouds. LIDAR has a high data rate, which makes the processing computationally challenging. The key idea in 3D SLAM is to divide the problem of simultaneously optimizing many variables between two algorithms. One algorithm performs odometry at a high frequency but low fidelity to estimate the position. The other algorithm performs precise matching and registration of the point cloud. Both algorithms extract feature points located on sharp edges and planar surfaces and match these feature points to edge line segments and planar surface patches, respectively. In the odometry algorithm, correspondences between feature points are found in a way that ensures fast computation. In the mapping algorithm, the correspondences are determined by examining the geometric distributions of local point clusters.
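The extraction of edge and planar feature points can be sketched with a local smoothness score computed along one LIDAR scan line. This is one common way to do it and not necessarily the exact method the description above has in mind; the function names, the window size and both thresholds are illustrative assumptions that would need tuning for a real sensor.

```python
import math
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def smoothness(scan: Sequence[Point3], i: int, half_window: int) -> float:
    """Score how much point i deviates from its neighbors on the scan line.

    On a flat surface the offsets to neighbors on both sides cancel out
    (score near 0); at a sharp corner they do not (large score).
    """
    px, py, pz = scan[i]
    dx = dy = dz = 0.0
    for j in range(i - half_window, i + half_window + 1):
        if j == i:
            continue
        qx, qy, qz = scan[j]
        dx += px - qx
        dy += py - qy
        dz += pz - qz
    distance = math.sqrt(px * px + py * py + pz * pz)  # range from the sensor
    # Normalize by neighbor count and range so near and far points compare fairly.
    return math.sqrt(dx * dx + dy * dy + dz * dz) / (2 * half_window * distance)


def classify_features(scan: Sequence[Point3],
                      half_window: int = 2,
                      edge_threshold: float = 0.1,     # assumed value
                      plane_threshold: float = 0.01    # assumed value
                      ) -> Tuple[List[int], List[int]]:
    """Return the indices of edge points and planar points in one scan line."""
    edges: List[int] = []
    planes: List[int] = []
    for i in range(half_window, len(scan) - half_window):
        score = smoothness(scan, i, half_window)
        if score > edge_threshold:
            edges.append(i)
        elif score < plane_threshold:
            planes.append(i)
    return edges, planes
```

For a scan line that runs along one wall and turns a right-angle corner onto a second wall, the corner point receives a high score and is classified as an edge, while points in the middle of either wall score zero and are classified as planar. Points directly next to the corner fall between the two thresholds and are left unclassified.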
Visual SLAM is arguably the best method for precisely mapping the physical world. Even though the technology is still at a nascent stage, accurately projecting virtual images onto the physical world opens the door to plenty of innovative applications. Apart from virtual object projection, visual SLAM systems are also used in a wide variety of field robots. For example, rovers and landers for exploring Mars use visual SLAM systems to navigate autonomously. Field robots in agriculture, as well as drones, can use the same technology to travel around crop fields independently. Autonomous vehicles without LIDAR, like Tesla’s, could potentially use visual SLAM systems for mapping and understanding the world around them. The gaming industry depends on SLAM development to invent the future of controllers.
Another major potential opportunity for SLAM systems is to replace or enhance GPS tracking and navigation. GPS isn’t useful indoors, or in big cities where large buildings obstruct the sky and degrade GPS accuracy to a few meters. Visual SLAM and LIDAR SLAM systems tackle each of these problems without needing satellite information and provide very accurate measurements of the physical world.
Sensing the location of a device, as well as its environment, without a known map is incredibly difficult. SLAM is a logical approach to solving that problem. Improvements in automating robots and cars are only possible when progress is made in SLAM technology, and that progress will also create new use cases. The navigation applications are diverse: on foot, on phones, for drones, for cars, in real estate, retail, gaming, etc.
SLAM is a framework for temporal modeling of surroundings by making probabilistic inferences from the measurements of sensors like cameras, LIDAR and IMUs. It is processing-intensive and hard. Because it combines multiple sensors, a lot of planning and care is needed to make it accurate. The generated data can be too large to process and store quickly. Intelligent edge processing technologies will be key to overcoming the processing and storage hurdles associated with visual SLAM or hybrid / sensor-fusion SLAM. Such edge AI sensor software will let applications run with lower hardware requirements and less energy (hence improving battery life), enabling real-time mapping for larger markets.