Posted May 27, 2024 by serfofcinder
If you read my first dev log, you’ll know that one of the big inspirations I had was this GDC talk about Doom’s combat. They talked about several AI techniques they used to make their enemies so engaging to rip and tear apart.
Since Doom is one of my inspirations, I wanted to recreate some of these features in this game. However, a lot changes when you go from 3D down to 2D. There’s a lot less room to dodge, and there’s a lot less ways for enemies to miss you with their projectiles. There also isn’t really a concept of cover. Just platforms that projectiles can’t pass through. And I decided pretty early on that enemies becoming hostile toward each other would be out of scope.
So I was left with these goals for my enemy AI:
I implemented the enemy AI and achieved these goals with three main components:
Some games can get away with not having a real pathing system for their enemy AI. Like Mario. Those koopas and goombas can just walk forward until they detect an edge, then turn around. Or just go off the edge. I’m pretty sure Hollow Knight didn’t need a fully fledged pathing system either. Flying enemies will either just fly in a pattern and bounce off of walls and floors alike, or make swooping patterns at you ignoring any level geometry in the way. Ground-based enemies will stick to the platform they spawn on.
But this game does need a fully-fledged pathing system. That’s because I wanted an aggressive melee pursuer enemy, like Doom’s hell knight. It needed to be able to pursue you across different platforms, and jump or drop between them as needed.
A cursory search of GitHub and the Unity Asset Store surprisingly showed nothing like what I was looking for. So I got excited that I had an excuse to implement A*, because I’m that guy. I’m not going to explain what A* is in this dev log. If you don’t know and you’re curious, here’s the wikipedia. A* is a graph algorithm, so it needed a graph to operate on. I developed a data structure I called a sidescrolling nav mesh, which acted as that graph. I built a utility to bake that nav mesh ahead of time so I wouldn’t have to do any fancy searches to find where all an enemy could jump to at runtime. The way it worked was:
1. Check all of the nav vertices. For a nav mesh, I could set the position and dimensions of the navigable area, and the distance between nav vertices, and it would place all of the vertices in a 2D grid based on those specs. The first task was to figure out the terrain type of each vertice. The possible terrain types were:
I originally planned to have some enemies that could climb on walls or hang from ceilings, but cut that because I didn’t have the resources to animate those states. So I eventually cut the OnWall and OnCeiling terrain types as well, leaving us with only OnFloor, InAir, and InsideGeometry. The bake process decided which type each vertex was based on physics collision checks and raycasts in cardinal directions.
2. Connect adjacent vertices with edges. There were a few edge types as well:
WallClimb and CeilingClimb were cut just like the OnWall and OnCeiling terrain types. With the edge types left, the bake process connected all adjacent OnFloor vertices with FloorTraversal edges, and all adjacent or diagonal non-InsideGeometry vertices with Flying edges.
3. Find where there could be a valid jump or drop, and connect those vertices with the appropriate edges. I set a maximum jump size and a maximum drop size, each as Vector2 to specify a maximum vertical and horizontal distance. Then the bake process scanned all of the vertices, and found all OnFloor vertices that had adjacent non-floor vertices. For each of those, let’s call them Ledge Vertices, it checked to the non-floor side and down, up to a maximum distance in each direction based on the maximum jump and drop parameters. If there was an unobstructed drop to another floor vertex, and it was within the drop parameters, it added a drop edge from the higher floor vertex to the lower one. If it was within the jump parameters, it added a jump edge the opposite direction.
All of this nav mesh data was saved into one asset file using a ScriptableObject. I made a NavMesh MonoBehavior component that would reference one of these ScriptableObjects, use that data to calculate pathing in the game, and draw some helpful gizmos in the editor. I wrote a custom inspector to add a big “Bake” button to make it nice and easy.
Now that I had this navigation graph, enemies could run A* to calculate their path to their desired destination. I wrote a SidescrollingNavMeshAgent component to put on enemies that specified their maximum jump and drop size, what vertex and edge types they could navigate, and other standard movement parameters like movement speed, acceleration, and deceleration. Those parameters would all be used as filters on what vertices and edges they could use in the graph while calculating their path. Once they had their path, the difficult part was actually following it.
There were several touches necessary to make movement feel as non-awkward as possible. The first was combining FloorTraversal edges into one in the path. This allowed me to simplify the ground traversal logic to:
With that logic, but without combining floor edges, enemies would have taken one step, stopped, taken another step, stopped, and so on all the way across the floor. I had a bug for a little bit that actually made them do that.
The next touch was turning Flying edges into a spline instead of just a series of straight lines. I had a desired start position (the enemy’s current position), desired end position (the next vertex in the path), desired starting velocity (the enemy’s current velocity), and desired end velocity (a bit more complicated), and I could create a cubic spline based on those parameters for the flying enemy to move along. Once it reached the vertex, it would move on to the next edge in the path, and calculate a new cubic spline.
The spline method I used to decide the desired end velocity was the Catmull-Rom spline. Basically, for the normal at each point on the spline, you use the direction between the points before and after it.
The most mathematically complex one was jump and drop trajectories. You know I’ve already done a lot of math on jumping for this project, so I’m up for the challenge. On Awake, I calculated the jump gravity and maximum initial jump speed the same way I did for the character controller. We know the horizontal distance of the jump, and the horizontal movement of the enemy, so we can use that to calculate the time the jump should take. But either the horizontal distance or the vertical distance can be the limiting factor on this jump’s time, so we need to calculate the time required for the vertical distance as well and use the higher number for the jump duration. If vertical distance is the limiting factor, I assume the enemy will jump only exactly as high as required to reach their goal, and not a centimeter more. Either that, or they will not jump upward at all at the start of a drop. Either way, either the starting or ending vertical velocity is 0. Without loss of generality, we’ll treat starting velocity as 0, since that simplifies calculating duration. If we treated ending velocity as 0 instead, we would just be switching starting and ending velocity, without actually changing the duration. Since we know gravity and the initial velocity, we know the vertical position as a function of gravity and time:
You may notice that I switched the sign of y in there. Y should actually be negative, since this is for the case where we’re dropping. I switched the sign of y so that it’s just the distance we’re jumping or dropping, so we don’t need to worry about it being a signed distance.
Now that we have the minimum time required for the x movement, and the minimum time required for the y movement, we take the maximum of the two as the jump duration. Now we have the following parameters for the jump:
We can use those to calculate a trajectory. Let’s do horizontal velocity first, since that one’s easy.
And, more importantly, the function for horizontal position:
Now to calculate initial vertical velocity. We know that:
We actually know everything in this equation except for initial velocity (t is the duration, y is the vertical distance, signed this time), so we can just solve for it:
And, once we calculate that, we can plug that back into the function for vertical position:
Now we have both x and y as functions of t, and we know how high t goes, so we can easily update the position based on that every fixed update to create a realistic-looking jump arc.
Now here’s another tricky part. Enemies need to update their destination and pathing periodically if they’re trying to follow you, because you’re going to be moving pretty quickly. When they do update their path, they need to be able to do so without it being jarring. So, when the nav mesh agent gets its destination updated, I check the edge its currently navigating. If it’s not a jump or drop, I’ll clear the path, calculate a new path from their current position, and calculate the trajectory for its new first edge in its new path. But if it is a jump or a drop, it would be really weird for them to interrupt their jump to suddenly change course. So I clear the path after that jump or drop edge, and calculate a new path starting from the end point of that edge.
Enemies won’t necessarily spawn in a position they are able to navigate on at all. If that happens, I want them to handle it gracefully by falling to the floor below, and then be able to navigate. So, on enable, I check their current nearest vertex. If it is not a type they can navigate, I start scanning downward from there until I find one they can navigate. Then, I give them a Drop edge from their spawn position to that vertex beneath them to start off.
Enemies can be staggered or stunned by your attacks. What happens if they get staggered or stunned mid-jump? Well, this actually has the same solution as the problem of spawning in the air. I disable the nav mesh agent when they are stunned or staggered, and re-enable when they recover. And since that downward scan and drop is done on enable instead of just on Awake, they will also drop smoothly after being staggered mid-jump.
If they are interrupted by a stun or stagger while running on the floor or flying through the air, they simply stop their movement. When the nav mesh agent is re-enabled, they will once again find their nearest nav vertex, and start once again calculating their path to their destination from there.
Bahvior trees are a classic AI technique that allow you to write lots of small, modular behaviors or tasks, and then combine them into a complex decision tree to create complex behaviors and decisions. I considered using state machines instead, but behavior trees have the advantage that behaviors in a behavior tree don’t need any awareness of each other, unlike states in a state machine, which makes behavior trees more reusable.
A behavior tree is defined by many nodes, which each contain their own logic for having the enemy execute certain actions. There are two broad categories of nodes: decorator nodes, and task nodes. Decorator nodes have one or more child nodes, and they contain their own logic to execute and alter the result of those child nodes. Task nodes perform a concrete action. Every node type has an execute function which is called every frame when that behavior is in progress, and a current status, which is generally returned by the execute function. The status can be either In Progress, Success, or Failure. Once you get used to thinking with behavior trees, you can combine together many nodes to create complex enemy behaviors with ease.
Another important component for making a behavior tree work is the blackboard system. Behavior tree nodes need a mechanism for communicating with each other. Specifically, we want some nodes to be able to post data to a shared data structure, and other nodes be able to read that data. That shared data structure is generally called a Blackboard. As an example of why that is useful, I just have one Navigate task node. It reads the navigation destination from the blackboard. It doesn’t need to be concerned with how that destination was decided. It just reads the location and navigates there. There are a few different task nodes that can be put right before the navigation node in a sequence that will set that destination in the blackboard. One just sets it to the player’s location, to make the enemy chase the player. One chooses the best option from a list of waypoints and sets that as the destination. One figures out what section of floor the enemy is on and sets the destination to the far edge of that section of floor. And there are a few others.
I implemented the blackboard as its own MonoBehavior component, and passed that component into every task node in the constructor. That’s called dependency injection, because for some reason, software engineers have decided that a technique as basic as passing a dependency to an object in its constructor deserves a fancy name. The Blackboard component has the functions:
They all take a string as the key. The Get functions return the specified type, and the Set functions take an argument of the specified type. They all read or write the relevant data to a C# Dictionary of strings to generic objects.
There are several walkthroughs online already for implementing behavior trees in Unity. I can’t find the one I first referenced anymore. But there’s also this one: https://medium.com/geekculture/how-to-create-a-simple-behaviour-tree-in-unity-c-3964c84c060e. Don’t use this one. It’s wrong. Their sample code for a sequence node executes all of the children concurrently, not in sequence. That’s such a basic mistake. Their selector node does the same. Their blackboard system technically works, but it’s messy and duplicates a lot of data, wasting memory. And then to act like their system works, they just put the enemy’s entire logic in one node, completely defeating the purpose of creating a behavior tree system in the first place.
So, the basic decorator nodes I created were:
And there were a few others that I’ll get to later.
The task nodes I added were:
Those are all of the standard task nodes that were shared across multiple enemies. There were a few others that were specifically tied into one enemy’s specific behaviors.
I created the framework for behavior trees and all of these behavior nodes for it, but I stopped short of creating a custom GUI editor for them, so I constructed each enemy’s behavior tree manually in code. It would have been a little bit easier with a GUI editor, but honestly not much. I think with twice as many enemy types, it might have been worth the work to make a custom editor. As it was, I made one base class to manage basic behavior tree logic:
That’s the whole class. It has a Behavior tree, which is created by an abstract function on Start. On Update, it just calls Update on the behavior tree. On Disable, it just calls Reset on the behavior tree. The behavior tree class is even simpler:
It literally just has a root node, and it forwards Update calls to the root node’s Evaluate function, and Reset calls to the root node’s Reset function. Reset is something I didn’t talk about much. Many of the nodes have internal state that they need to keep track of, and Reset, well, resets that. SequenceNode keeps track of which child its on; Reset resets it to the first child. Reset also resets all of its children. WaitForSecondsTaskNode keeps track of how much time is remaining to wait. Reset resets that to its total wait time. And on the list goes. You can imagine what the Reset function would do for most of these nodes. Not only is Reset called on the whole tree when the AI director is disabled; Reset is also called on individual nodes or branches when they complete in either success or failure. That is generally up to the node’s parent.
But anyway, for each enemy, I just extended that BehaviorTreeAIDirector class, and overrode that CreateBehaviorTree function. And I’ll talk about some of those specific behavior trees in the next section. But first, there is one more critical component.
You may have gathered that I found the token pool system from Doom 2016 fascinating. So I used the same approach to ensure only a certain number of enemies can be attacking at once. I actually have two different token pools in my game: a melee pool and a ranged pool. The token pool itself is pretty straightforward. It keeps track of how many tokens it has. I put a difficulty option in my game for “Enemy Aggression”. Each token pool can specify how many tokens it has based on that option. I have settled on 1 for low aggression, 2 for medium aggression, and 3 for high aggression for both pools. When an enemy wants to do an attack, it requests a token from the appropriate pool. If the pool has more than 0 tokens, it reduces its token count by 1, and returns true to the enemy that requested it. If it is at 0 tokens, it returns false to the enemy that requested it. If the enemy got true from its request, it continues with the attack. Once it finishes the attack, it releases the token back to the token pool, and the token pool adds 1 back to its token count.
To facilitate enemies claiming and releasing tokens, I implemented a new behavior tree decorator node: ExecuteWithAttackTokenNode. This node takes the token pool object (melee or ranged), the time limit for successfully acquiring an attack token, and the child node to execute once it does successfully acquire a token. If it gets false from the token pool, it returns in progress. If the time limit is reached and the token pool still has only returned false, then the node fails. If the token pool returns true, then it will go on to execute the child node and return its result. Once the child node results in either success or failure, it will reset itself. Its Reset function will release the token back to the token pool if it has claimed one. Now I can create a sequence like so:
That is a full sequence to navigate to the player and attack them. And I can put that whole sequence as the child of an ExecuteWithAttackTokenNode, and the enemy will acquire a token before it starts that sequence, and release the token when it finishes or is interrupted.
What token pools are to attacks, waypoint pools are to navigation. I want to be able to set points of interest to enemies in each arena. Ranged enemies will choose from the pool of waypoints to decide which spot is the best to stand in while shooting. Support enemies will choose from the pool of waypoints to decide which spot is the best to stand in to get away from you. But I don’t want all of the enemies going to the same spot. I want them to spread out. Enemies all going to the same spot leads to a tedious game of kiting the enemies around while slowly picking them off one by one. Enemies all going to different spots creates a much more frenetic feel, even though only two of them can be attacking at any given time.
So, when an enemy chooses a waypoint to navigate to, they claim that waypoint and remove it from the available pool. Once they finish what they want to do at that waypoint, they release it back into the waypoint pool. I created two waypoint pools for each arena: a ground waypoint pool for all the ground-based enemies, and an air waypoint pool for all the air-based enemies.
And you guessed it: to facilitate enemies claiming and releasing waypoints, I created a new behavior tree decorator node: ExecuteWithWaypointNode. This node takes the waypoint pool object (ground or air), the time limit for successfully acquiring a waypoint, a utility function for which waypoint is best, a child node to execute after successfully claiming a waypoint, and a fallback child to execute if it doesn’t succeed in claiming a waypoint. Until it claims a waypoint, it will evaluate all of the available waypoints in the pool using its utility function.
I wrote utility functions to return a negative number if a given position is absolutely unacceptable, and positive numbers for acceptable positions, with higher numbers meaning more desirable positions. If the utility function returns a negative, the enemy will not claim that waypoint even if there are no other waypoints available. For example, shooting enemies will not claim a waypoint that does not have line of sight to the player character no matter what.
If it finds a suitable waypoint, it will claim the best one, write that to the blackboard as the navigation destination, and then execute the child node, returning its result. If it does not, it will return in progress. Once the time limit has run out, if it still has not found a suitable waypoint, it will begin executing its fallback child node and returning its result.
Once whichever child it is executing finishes, it will reset itself. Its Reset function releases any waypoint it had claimed back into the waypoint pool, and resets both children.
This node also allows me to specify if they should not use the same waypoint twice in a row. If so, on a successful child execution, it will keep track of the waypoint it used for that execution, and not reset that when the Reset function is called. The next time the node is executed, it will exclude that node from its considerations for which node to use. That’s useful because I don’t want a ranged enemy to just stand in one spot and shoot at you repeatedly. I want them to reach a spot, shoot, then start moving again.
You know the AlwaysSucceed and AlwaysFail task nodes I created for behavior trees? Those were kind of weird, right? Well, they come in handy with the ExecuteWithWaypointNode’s fallback child node, if I don’t want the enemy to do anything as a fallback without a waypoint. They give me just a placeholder node to effectively not have a fallback, while also not breaking the code or having to write any special cases.
Now I can create a sequence like so:
That is a full sequence to navigate somewhere and do a ranged attack. I can put that as the main child of an ExecuteWithWaypointNode, with the melee sequence from above as the fallback child, and that will make the enemy find a waypoint that has line of sight and an appropriate distance to the player character, navigate there, and shoot at the player character, all while communicating to other enemies not to navigate to that same spot. If there is no good spot for the enemy, then they will navigate up to the player and do a melee attack instead.
Three of my main goals for the enemy AI were:
The token pool system and waypoint pool systems worked together to achieve the first goal. The utility functions in the waypoint pool system helped me achieve the second goal. And the behavior tree framework gave me the tools to achieve the third goal. I designed nine different enemies, each with their own unique behaviors. Here are just a few of them:
The race makes the enemy patrol repeatedly while simultaneously watching for the player right in front of them, and move on in the sequence once it sees the player right in front of them, stopping navigation and attacking
Other enemies I won’t be describing the whole behavior trees of:
The behavior trees I spelled out here are actually simplified versions of the actual ones. I left out the boring stuff like finding the player character at the start, of course. But also, most of the navigation sequences are actually on a timeout and a loop. That’s because the player moves around a lot, and most decisions of where to navigate to are based at least in part on the player’s position. So, navigation destinations need to be updated frequently, more frequently than the enemy can reasonably finish navigating somewhere. Figuring out the exact right structure for these trees was a process of tweaking and testing, and tweaking some more.
I’m pretty satisfied with the AI systems I’ve developed for this game. They’re not perfect. Navigation still has occasional bugs. An editor GUI for the behavior trees would be great. There are also great pre-made behavior tree systems you can get off the asset store. I’ve heard good things about Playmaker and Behavior Designer. I can’t say what it’s like to actually use those tools, but unless you’re just really excited to implement your own behavior trees like I was, it would probably be easier to go with one of them. I know they at least have editor GUIs.
Now, I don’t know of a pre-made side scrolling AI navigation system like the one I made. I feel like surely it must be out there, but I didn’t find it in my search.
My general advice is to use pre-made tech if you can, unless you’re just really excited to build your own, like I was. And if you are building your own enemy AI systems, I hope this was useful to you. I'll sign off with another gif of my enemies being cool.