A* Pathfinding and research

Custom engine -- C++

▚ Project summary:


In this research and prototype, I first researched and tried to understand the A* pathfinding algorithm and implemented fully working A* pathfinding inside a prototype project

This page is mainly here to share my research and conclusions.

Page contents:
·Understanding the A* algorithm.
·Pseudocode
·C++ Implementation
·In Project A* implementation

▌Specifications:

Project type: Buas school research project
Released: November 2020
Platform: PC, Windows
Time spent on project: 2 weeks of research and prototyping
Engine and Tools: Custom engine - C++

▌My contributions:

○ Fully documented research for understanding A*
○ Implemented core A* into our prototype project

▌Project goals:

○ Learn more about and understand the A* algorithm
○ Be able to use and explain the A* algorithm into our own prototype project
○ Document all my research

▚ Understanding the A* algorithm


After researching different pathfinding algorithms, I read that A* would be the most performance friendly pathfinding algorithm.

In A*, there are 3 important variables used. F, which is the total cost of the tile. G, which is the distance between the current tile and the start tile. H, which is the heuristic, which is the estimated distance from the current node to the end node.

The most important part of the A* algorithm is that we determine the total cost of a tile (f = G + H), so we can look at all our tiles, and determine if a specific tile is the best tile to move forward to.

Rather than running through every tile,
we can pick the one that have the best change of getting us to our goal.

A*

Dijkstra

Linked on Wikipedia, a great example is given to illustrate the difference between Dijkstra’s algorithm and the A* algorithm speed wise. Dijkstra’s algorithm does not use the calculated F value to determine the most likely fastest tile


▚ Technical understanding


Now to move on to more technical understanding,
I mostly understood how A* worked after watching Sebastian Lague his videos. One of the important things he explained, is that moving diagonal
should “cost"  more then moving straight.

To be able to use integers, or not use decimals, we use 14 for diagonal steps and 10 for straight steps.

Costs example:


In the next example, A is the start tile and B is the target tile.

Images from Sebastian Lague!

On the top left corner of each possible tile, the G cost is shown.
(the g cost shows how far away we are from our start point).
On the top right corner of each possible tile, the H cost is shown.
(The h cost is the distance to the target tile). The bigger, in the middle-positioned number is the F cost (which is the G cost + the H cost).

The tile where the mouse is positioned, has an G cost of 14 (1 diagonal step away from point A). And has a H cost of 28 (2 diagonal steps away from B). And of course, the F cost will be the G cost (14) + the H cost (28). Which results in a value of 42. The surrounding tile with the lowest F cost, is then used to check first when trying to find a path.

One of the things I learned while watching this example is, that even
though the tile where the mouse is positioned has the lowest F cost.

It does not necessarily mean it’s locked as the next best step. The algorithm just knows to look at the lowest F cost tile first, because it’s likely the best tile to pick for the shortest path.

As you can see on the image on the right, by picking the tiles with the
lowest F cost. A path has been found, while only having to look and determine the values of the neighbors of 3 tiles.



Of course pathfinding is made to find a path when there are obstacles.

Step 1 remains searching for the lowest F cost tile. In this example on
the right, that would still be the top left cornered tile, with a F cost value of 42.

Now step 2 would be searching for the next tile with the lowest F cost. Now there are already 3 tiles with a F cost of 48. If this happens, you check the surrounding tiles wit the same F cost value, in order of lowest H cost (closest to the target).

Step 3: Make sure to check the last tile with the lowest possible F cost (48).

Step 4: now that all the tiles with a F cost of 48 have been checked, it’s time to pick the lowest F cost neighbor tile to try to get closer to the target. This would be the left upper tile with an F cost of 54 (and after that the more downwards positioned tile with F cost 54 (but higher G H cost)).

Step 5: These values are calculated by already trying to follow a path. However values do get updated once we try a tile that is closer to the starting tile but were initially not an optimal value. This results in F cost 68 turning into a F cost of 60 on mouse positioned tile.

Step 6: following these steps, while checking and calculating all the values, you will get the shortest path towards your target when possible.


▚ A* Pseudo Code


Following this link, I had an excellent example of some pseudo code that helped me understand the basic A star algorithm.

// A* (star) Pathfinding
// Initialize both open and closed list
let the openList equal empty list of nodes
let the closedList equal empty list of nodes
// Add the start node
put the startNode on the openList (leave it's f at zero)
// Loop until you find the end
while the openList is not empty
  // Get the current node
  let the currentNode equal the node with the least f value
  remove the currentNode from the openList
  add the currentNode to the closedList
  // Found the goal
  if currentNode is the goal
    Congratz! You've found the end! Backtrack to get path
  // Generate children
  let the children of the currentNode equal the adjacent nodes
  
  for each child in the children
    // Child is on the closedList
    if child is in the closedList
      continue to beginning of for loop
    // Create the f, g, and h values
    child.g = currentNode.g + distance between child and current
    child.h = distance from child to end
    child.f = child.g + child.h
    // Child is already in openList
    if child.position is in the openList's nodes positions
      if the child.g is higher than the openList node's g
        continue to beginning of for loop
    // Add the child to the openList
    add the child to the openList

▚ C++ implementation


First, I define my Start and TargetTile.
I create a vector of Tile pointers, which I call openSet.
I also use an unordered_set of Tile pointers, which I call closedSet.

I use an unordered_set because the order is not important while storing the “closed”/ already evaluated tiles.

I push back the already known start tile in the Openset, as that will be the tile which has to be checked first.

I then start a while loop, with a condition to check if the openSet size is higher then 0.

As described in the previous slide the first for loop and if statement
checks if there is a tile that is currently in the openset with a lower F cost, or if the F cost is the same, a lower H cost.

If so, currenTile is inmidiatly set to the tile with the better/ lower F or H cost.

This tile is the removed from the open set and placed in the closed
set. Because there was no tile surrounding it that had better variables.

If the currentTile = TargetTile, a retrace function is called (which is explained more about a bit later). Then the function is done, and return is called. No new neighbors are needed to be checked and the while loop exits.

The next part is responsible for checking neighbors of the currentTile and adding them to the openset when they have a lower G cost and
are not already in the openset. It also checks if the tile is passable, which I use to define a obstacle or not while trying to find a path.

One of the interesting approaches I learned while rewriting this
following a tutorial that was using C# and unity, was that Sebastian Lague sets a reference, which he called parent to the last checked fastest tile.

He does that because that’s all the info we have before adding the checked neighbor tile to the openset. Using this reference, The retrace function can go from parent reference to parent reference and calculate a full path.



The GetDistance function is used to receive a correct difference or stepping value, by calculating it using the difference on both the X and Y axis from 2 tiles.

This system used the value 14 for diagonal steps (which takes longer) and 10 for straight steps (which should be preferred).

Now in a bit more detail, using the parent tile reference, a path can be found after finding the target. The RetracePath function does exactly that.

I first clear a vector of Tile pointers. I then start at the end because all our tiles store a reference to the previous fastest tile. I use a while loop to place all these tiles, following the references, in a vector.
This however results in a reversed path.

We can use the std::reverse function to reverse our vector. And our path of tiles is finished!


▚ In project A* implementation


I implemented some test movement for the enemies. 
When assigned a vector of tiles (the path), They move from the start index to the next index with a smooth lerp. When arrived, the start index is increased by 1, until the final target tile is reached.

In the first version, I used the FindPath function with the purple tower as begin tile, and the red tower as target tile. 

This resulted in the well known typical A* behavior most of us know, The enemies follow the generated path and take the shortest path possible.


This really works well, and I am quite happy with the working algorithm inside our project. I learned a lot while implementing a algorithm like A*.