# Draft:SLAM for Decision Making

SLAM (Simultaneous Localization and Mapping) is a problem in Robotics/Autonomous Vehicles, that requires to create a map of an unknown environment by an agent placed at unknown location in the environment and keep on updating the agent's location within the map simultaneously. This map information is used to navigate the agent in the environment by path planning algorithms and obstacle avoidance. It is also used in navigating self-driving cars, autonomous robots, unmanned vehicles, etc. SLAM algorithm was first popularized by Standford Team's autonomous cars, STANLEY and JUNIOR, at DARPA Grand Challenge 2005, which used SLAM for navigation.

## How SLAM works?

At any time instant $k$ , $x_{k}$ is the state vector consists of orientation and location of the agent, $u_{k}$ is control vector applied at time $k-1$ on the agent to reach the state $x_{k}$ at time $k$ , $m_{i}$ is landmark vector of $i^{th}$ location and $z_{ik}$ is the observations taken by the agent at the time $k$ of the $i^{th}$ location. Since the quantities are probabilistic, so we have to compute joint posterior $P(x_{k},m|Z_{0:k},U_{0:k},x_{0})$ at time $k$ . Here, $Z_{0:k}$ is set of observations and $U_{0:k}$ is set of control inputs. Basic representation of autonomous agent states, control, observation and landmarks in a flow chart with respect to time t.

The SLAM algorithm is implemented in two recursive steps, i.e.,

#### Time-Update

               $P(x_{k},m|Z_{0:k-1},U_{0:k},x_{0})=\int P(x_{k}|x_{k-1},u_{k})\times P(x_{k-1},m|Z_{0:k-1},U_{0:k-1},U_{0:k-1},x_{0})dx_{k-1}$ ... (1)


#### Measurement Update

               $P(x_{k},m|Z_{0:k},U_{0:k},x_{0})={\frac {P(z_{k}|x_{k},m)P(x_{k},m|Z_{0:k-1},U_{0:k},x_{0})}{P(z_{k}|Z_{0:k-1},U_{0:k})}}$ ... (2)


The above mentioned equations calculates joint posterior. The localization problem is solved by calculating probability distribution $P(x_{k}|Z_{0:k},U_{0:k},m)$ and mapping problem is solved by calculating conditional density $P(m|X_{0:k},Z_{0:k},U_{0:k})$ .

## Contributions of SLAM

SLAM has modified the world in a drastic manner, it has opened possibilities to reach complex and enclosed spaces that were previously unreachable. Its effects disrupted the geospatial and surveying industries bringing up new technologies. Also, the fact that it answers major localization problems in autonomous vehicles, with increased robustness, yields a more stable and accurate state estimation for both ground and aerial navigation solutions. SLAM algorithms have also been used for localization and navigation by astronauts on the surface of Mars. In our everyday life, ARCore(Android) and ARKit(IOS) in the smartphones detect the feature points and find the change in location, also the visual information, and is combined with the IMU to estimate the position and the orientation. SLAM is a core for the upcoming technologies in Augmented Realities and has a lot of future scopes.

## Importance of SLAM

File:Annotation 2020-10-06 143959.png
(a) Real-world scenario (b) the map in RVIZ and (c) trajectory with obstacles using Active SLAM.

SLAM, when combined with path planning problems in an unknown complex environment, becomes an Active SLAM problem. In a paper active SLAM based on deep reinforcement learning is used for path planning under unknown environments. For obstacle avoidance path, Dueling DQN is used to navigate the robot, and FastSLAM is used to map and locate the robot in the unknown environment. Here, the robot learns and adapts to navigate in the environment based on Deep Reinforcement Learning Algorithms (DRL) which facilitates the robot to interact with its environment, avoiding both static and dynamic obstacles and reaching the goal by generating an optimal path using a 2D map created by FastSLAM for online navigation.

We all know SLAM is used to construct maps of unknown environments. Another example of active SLAM is described in a paper, Monte Carlo uncertainty maps-based for mobile robot autonomous SLAM navigation, where SLAM is integrated with Monte Carlo experiment to create an uncertainty map of an environment where SLAM is implemented on EKF (Extended Kalman Filter) to extract contours from the surrounding and navigate the robot to high uncertainty region of the environment. Based on the Gaussian method and Monte Carlo experiments the features are extracted and the uncertainty of the map is reconstructed. Monte Carlo's method is better than gridding the map, generated by SLAM, as it reduces the computational cost for navigation for real-time operation.

As mentioned in paper, robot can perform multiple task using decision making while the whole map or a part of map is unavailable. This increases the efficiency of the robot and saves time as the robot simultaneously maps and performs tasks. The proposed approach uses probabilistic RRT in combination with online POMDP to direct the belief and navigate in a dynamic environment.

## Variants of SLAM

### Extended Kalman Filter (EKF) SLAM

It is one of the early approaches to SLAM that estimates the state of the system based on the robot pose at every time step $k$ and the landmark location, where the navigation environment is represented in a discrete-time state-space vector. The initial state of robot represented as ${\bf {x}}_{k}^{R}$ , landmark observations ${\bf {Z}}_{k}$ and the control inputs ${\bf {U}}_{k}$ and the joint posterior density of the landmarks and state of robot is given by the probability distribution as:

                $P({\bf {x}}_{k}^{R},{\bf {m}}\vert {\bf {Z}}_{k},{\bf {U}}_{k},{\bf {x}}_{0}^{R})$ The state of the whole system is given as: ${\bf {x}}_{k}=\{{\bf {x}}_{k}^{R},{\bf {m}}\}$ SLAM in general is described as a Bayesian estimation problem, with a goal to make a belief or a probabilistic view that the robot has an idea of the landmark positions, and the pose is as close to the real distribution of the system. This belief is given as:

                $Bel^{-}({\bf {x}}_{k})=\int _{S}P({\bf {x}}_{k}\vert {\bf {x}}_{k-1},{\bf {U}}_{k},{\bf {x}}_{0})\times Bel^{+}({\bf {x}}_{k-1}){\bf {dx}}_{k-1}$ This approach utilizes the Extended Kalman Filter, with a Gaussian noise assumption. Taylor series expansion on the state-vector is used to linearize the motion and sensors . EKF follows the filter cycle with the following steps: State prediction, measurement prediction, measurement, data association, and update. In the state prediction, the state space of the robot is updated based on the robot's motion by deriving the Jacobian of motion. Next, in the prediction step where the next information is predicted based on the existing information and a correction step is performed where the known data is associated and the previously unobserved landmarks are initialized. Now an expected observation is made based on the current estimate a Jacobian is calculated for the observation. The measurement update in a single step requires only one full belief update. This cycle continues until the loop closure is made, where loop closing reduces the uncertainty in robot and landmark estimates. 

But the drawback of EKF is the ability to work with uncertainty under Non-Gaussian implication (or non-linearities) introduced FastSLAM. 

### FastSLAM The yellow line shows the estimated path of the robot, and the blue line is the GPS ground truth, and the yellow dots represent the estimated positions of the landmarks.

Unlike a simple EFK SLAM, FastSLAM estimates the posterior using a modified particle filter. This approach uses K independent Kalman filters to estimate the K landmark positions. Using Rao-Blackwellization to exploit dependencies between variables the equation (1) can be represented such that it makes the SLAM posterior into low-dimensional estimation problems.

                  $P({\bf {x}}_{0}^{t},{\bf {m}}\vert {\bf {Z}}_{k},{\bf {U}}_{k})=P({\bf {x}}_{0}^{t}\vert {\bf {Z}}_{k},{\bf {U}}_{k})\prod _{i=1}^{K}P({\bf {m}}_{i}\vert {\bf {x}}_{0}^{t},{\bf {Z}}_{1}^{t})$ Each particle maintains a K individual EKF's. The key steps of FastSLAM 1.0 involve, extending the path posterior by sampling a new pose for each sample by taking the old generation of the sample and drawing a new pose using the odometry information we get ${\bf {x}}_{t}^{k}$ . Now we compute a particle weight which is a result obtained from the sampling principle, the weight helps understand the uncertainty of the landmark. The weight is given as:

                  ${\bf {w}}^{[k]}={target({\bf {x}}^{[k]}) \over proposal({\bf {x}}^{[k]})}$ The last step is to update the belief of observed landmarks which is similar to the EFK update rule. There is a data-association problem involved where the system has more than one possible association for the observation and the landmark. In such cases, the multi-modal belief is considered and data is associated with the most probable match, if the probability for an assignment is too low then a new landmark is generated. This is a huge advantage of FastSLAM over EKF. FastSLAM 2.0 uses measurements along with the motion model shown below and hence this results in more peaked proposal distribution as the sensor improves the accuracy and robustness of the algorithm.

                   $x_{t}^{[k]}\sim p({\bf {x}}_{t}\vert {\bf {x}}_{1:t-1}^{[k]},U_{1:t},Z_{t-1})$ ### Graph SLAM 

It is the second major category of SLAM, follows the optimization approach instead of a filter-based approach. Graph-based SLAM uses a graph construction to represent the problem where every node represents a pose of the robot while mapping and the edges represent the spatial constraints between 2 poses. The aim of this approach is to find the node configuration with minimum error introduced by constraints. An edge is constructed when the robot is displaced from $x_{i}$ to $x_{i+1}$ or the robot observes the same part of the environment from $x_{i}$ and $x_{j}$ . It uses matrices to build the problem where the affine and projective transformations are captured, it is represented as an information matrix ${\bf {\Omega }}_{ij}$ . There is a function ${\bf {e}}({\bf {x}}_{i},{\bf {x}}_{j},{\bf {z}}_{ij})$ that computes the difference between the observation ${\hat {\bf {z}}}_{ij}$ and the real observation ${\bf {z}}_{ij}$ , given by:

                   ${\bf {e}}_{ij}({\bf {x}}_{i},{\bf {x}}_{j})={\bf {z}}_{ij}-{\hat {\bf {z}}}_{ij}({\bf {x}}_{i},{\bf {x}}_{j})$ To find the minimum cost, the following function is used to find the optimal value for the vehicle state and the landmark.

                   ${\bf {F}}({\bf {x}})=\sum _{\langle i,j\rangle }{{\bf {e}}_{ij}^{T}{\bf {\Omega }}_{ij}{\bf {e}}_{ij}}$ ### LSD SLAM or Large-Scale Direct SLAM 

Type of SLAM, where not just the local motion of the camera is tracked but the method also builds consistent large-scale maps of the surrounding environment. The algorithm consists of 3 components: Tracking is where camera images are used to estimate the body pose with respect to the keyframe using the pose of the previous frame. Depth-map estimation where the depth is found by a filtering-based estimation of semi-dense depth maps. The last phase is Map optimization where the map is not refined anymore and loop closures are detected by scale-aware and image alignments. LSD-SLAM can not just be implied on monocular but also on the stereo and omnidirectional or wide field-of-view cameras.

In this algorithm, the maps are represented as a pose graph of keyframes, where each frame has the camera image Ii, inverse depth map Di, and the variance of the inverse depth Vi. Initializing from the first keyframe Ii the relative 3D pose ξij is estimated by minimizing the variance-normalized photometric error. To estimate the depth map, when the camera moves too far from the existing map a new keyframe is created from the recent tracked image and a weighted combination of relative distance and angle to the current keyframe.

dist(ξji)=ξjiT W ξji. Tracked frames that are not keyframes are used for further refinement of the depth map, by a large number of comparisons on the stereo image regions with high accuracy to add new pixels to the depth map. This algorithm uses direct image alignment for monocular SLAM to avoid scale-drift errors over long trajectories. An important limitation of the approach of image alignment is the need for high accuracy with the initialization, for finding loop closure constraints in case of large loop closures. This can be resolved by increasing the convergence radius using Efficient Second-Order Minimization (ESM) or Coarse-To-Fine Approach. As the robot moves in a counter-clockwise direction, based on the robot's pose visual odometry is created. The left image represents the position before loop closure and the right image shows the position after the loop closure using V-SLAM.

### Visual SLAM (V-SLAM)

Localizing a robot in an unknown environment while creating a map of the environment using cameras is Visual SLAM (V-SLAM). As the robot traverses the environment it generates a trajectory based on camera poses (the camera is mounted on the agent) which observes the landmarks captured by the camera. The landmark position and camera pose are estimated using Extended Kalman Filters (EKF) or Bundle Adjustment variant. V-SLAM creates dense local maps which are constantly updated and after loop closure a 3D global map is created which is accurate than a stereo reconstruction. As the robot moves, the dense map is calculated based on the robot's movement and dense disparity images which are corrected by the Kalman filter. When it comes to high speed moving robot.

### Other

There are other important and upcoming SLAM variants like L-SLAM (Low dimensional SLAM), ORB-SLAM, CT-SLAM, CoSLAM (Collaborative Visual SLAM), SeqSLAM (Sequential SLAM).

## Applications of SLAM

SLAM has been used in mapping, sensing, kinematic modeling, audio-visual SLAM, collaborative SLAM, moving objects, loop closure, exploration, etc. The major applications that use SLAM are the robots (both manned and autonomous) that are used in unknown terrains like planet surfaces, underseas, underground. SLAM can be used in both indoor and outdoor applications like a home vacuum cleaner to an underground exploration of mines.

Considering the example of Vacuum Cleaner Robots, a robot without SLAM will randomly traverse in the room which may end up not cleaning the entire room and consuming high power and low efficiency. A robot with SLAM localizes itself in the room and moves accordingly based on the sensors. It also maps the room using a camera or LiDAR sensor to avoid moving blindly in the room by maintaining its power consumption cost and cleaning the room completely.

In paper of V-SLAM, an autonomous car was traversed in rectangle counter-clockwise which generates visual odometry of the car's pose. At the loop closure, using Monte Carlo simulation and Kalman filters, the drift was predicted and removed using sparse mapping of the V-SLAM algorithm. V-SLAM is also used in satellite positioning and navigation service, deep space exploration, indoor localization, etc. SLAM is also used in navigating a fleet of robots in a warehouse to perform a certain task, parking an autonomous car at a parking spot or navigating a delivery drone in an unknown environment.

## Open Research in SLAM

Based on the research done so far, SLAM is always used for navigating the robot and generating map of an unknown environment while algorithms from Monte Carlo Simulation, Deep Reinforcement Learning, online POMDP, etc are used to create dense map of the unknown environment and generate a path for the robot to traverse in uncertain environments. The paper describes about the path planning technique in belief space with Pose SLAM under uncertain environments. The maps generated by Pose SLAM can be directly used as belief roadmaps to plan path in the areas with low uncertainty. Similarly as described in paper, that creating map and localizing robot is solved by SLAM but optimization of path and creating detailed and informative environment in dynamic conditions using SLAM is still an ongoing research.

The algorithm that is based on semantic-based mapping or enhancing the accuracy of the semantic map is Semantic SLAM. Semantic SLAM opened the research to estimate poses, detect loop closures and build 3D maps more efficiently than the traditional SLAM methods. Blending Deep Learning and Semantic SLAM is a hot topic in SLAM that focuses on creating a detailed map of the environment that is robust to illumination, dynamics of the world, and fast-moving agents, and also generating optimal path planning for a dynamic environment.

As suggested by the paper, there are possibilities with simultaneous planning and performing multiple tasks using various types of sensors to get a better model of the environment, also there is more scope to improve the accuracy by using various other planning algorithms instead of RRTs in combination with online POMDP.

AR SLAM is another rising topic where a device collects reference points in the form of pictures from the environment. Google's AR platform uses advanced SLAM to distinguish between floors and walls. So in AR, a virtual image is blended with real-world information and SLAM provides information of the device and world, which is a MAXST's Visual SLAM problem.