For my final AI project, I thought about all of the examples I could think of where an AI seeks or chases the player, and decided I was curious about AI that flees from the player instead. Ideally, this would be a system where agents did everything a player has to do in a stealth game-- staying away from the thing chasing them, hiding behind obstacles, and slowing down based on how close the aggressor was, "to make less noise". I did not fully pull this off, but I learned a lot in my two main case studies.
BEHAVIOR TREES?
In my initial forays into research, I imagined I would be basing my program on a behavior tree. A behavior tree is a hierarchical node graph in "tree" form where all of the final "leaf" nodes-- the nodes that have no child nodes-- are the actual behaviors the agent performs. The tree traverses down its branches and goes through each node in order-- usually left to right or top to bottom on a graphical representation of the tree, depending on the orientation of the graph.
To better explain how this works, here is a summary of some of the node types seen in a behavior tree:
Composite: has one or more children, and can process some or all of them, in an order that may depend on what information is fed into them.
Composite Subtype: Sequence: Runs a list of behaviors in order. Returns "failure" if the sequence does not get through every child node. This can be a list of behaviors or a list of things to check to determine whether a specific behavior is performable. In more complex iterations, there are decorator nodes or further subtrees under these composite nodes before the eventual leaf nodes.
Composite Subtype: Selector: Instead of running children until one fails or all succeed, it chooses among its child nodes depending on other factors such as unit health or environment conditions. Children are sorted from most to least favorable so the "cheapest", or most straightforward, method can be taken. It can still return a fail state if conditions prevent the intended outcome. (These are also called Fallback nodes.)
Decorator: has only one child node. It either modifies the child node's feedback or uses information coming from the child to trigger something else.
Subtype: Inverter: The child node checks something, and the inverter node returns the opposite result (usually when the child returns a boolean).
Subtype: Succeeder: Returns success regardless of what the child actually returns. There is no "Failer" subtype because, in practice, that's just an Inverter stacked on top of a Succeeder. Sometimes, a step doesn't need to succeed for the sequence to succeed (for example, you should check whether a door is open before you walk through it, but a door that was smashed to pieces may not count as "open" in-code, so it would be better to just walk through the rubble and carry on with the rest of your behaviors).
Subtype: Repeater: Repeats the child node to maintain a loop (like an Update() function) or for a set number of times (example: "check that the oven is really turned off three times before you leave the house for work").
Subtype: Repeat Until Fail: Repeater that repeats until the leaf node returns a failure. (example: walk up the stairs until you cannot find a higher stair to step on.)
Leaf: The lowest level node, and the actual behavior that will be shown. Some call info from another behavior tree if agents are coordinating!
Subtype: Init: starts a behavior such as a pathfinding activity
Subtype: Process: called every tick while the behavior is taking place. In some cases, it can continue while the behavior tree calls up other behaviors. (Running while shooting would be an example of this.)
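To make all of this concrete, here is a minimal sketch of what these node types might look like in code. This is illustrative C# of my own, not pulled from any particular engine or library, and the names (NodeStatus, Sequence, Selector, Inverter, Leaf) are just placeholders:

    // A node "ticks" and reports Success, Failure, or Running (still in progress).
    public enum NodeStatus { Success, Failure, Running }

    public abstract class Node
    {
        public abstract NodeStatus Tick();
    }

    // Composite: Sequence -- ticks children in order and fails as soon as one fails.
    public class Sequence : Node
    {
        private readonly Node[] children;
        public Sequence(params Node[] children) { this.children = children; }

        public override NodeStatus Tick()
        {
            foreach (Node child in children)
            {
                NodeStatus result = child.Tick();
                if (result != NodeStatus.Success) return result; // Failure or Running stops the walk
            }
            return NodeStatus.Success;
        }
    }

    // Composite: Selector (Fallback) -- tries children from most to least favorable
    // and succeeds on the first one that succeeds.
    public class Selector : Node
    {
        private readonly Node[] children;
        public Selector(params Node[] children) { this.children = children; }

        public override NodeStatus Tick()
        {
            foreach (Node child in children)
            {
                NodeStatus result = child.Tick();
                if (result != NodeStatus.Failure) return result; // Success or Running stops the walk
            }
            return NodeStatus.Failure;
        }
    }

    // Decorator: Inverter -- flips the child's Success/Failure.
    public class Inverter : Node
    {
        private readonly Node child;
        public Inverter(Node child) { this.child = child; }

        public override NodeStatus Tick()
        {
            NodeStatus result = child.Tick();
            if (result == NodeStatus.Success) return NodeStatus.Failure;
            if (result == NodeStatus.Failure) return NodeStatus.Success;
            return result; // Running passes through unchanged
        }
    }

    // Leaf: wraps the actual behavior, supplied here as a function.
    public class Leaf : Node
    {
        private readonly System.Func<NodeStatus> behavior;
        public Leaf(System.Func<NodeStatus> behavior) { this.behavior = behavior; }
        public override NodeStatus Tick() { return behavior(); }
    }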
The two main reasons to use a behavior tree over a finite state machine or another, older architecture for managing agent behavior are modularity and reactivity.
It is more modular in that reordering or changing a behavior tree is very simple in comparison. Branches can be moved around, taken away, or added easily, simply by changing where the parent node for each branch is connected. In comparison, adding a behavior to a finite state machine means creating the behavior plus any number of transitions to it and the related behaviors. It can take a long time to make those changes and even longer to debug them.
It is more reactive in that information passed along the behavior tree can be used to change how behaviors in later leaf nodes are implemented. Some behavior trees even have whole branches locked away, or re-ordered, based on conditions being met at runtime.
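As a rough illustration of that kind of runtime locking, a decorator can simply refuse to tick its child branch until some flag in the game state flips. This reuses the hypothetical Node and NodeStatus types from the sketch above and is my own guess at a structure, not anything from a shipped game:

    // Decorator that keeps a whole branch "locked" until a condition becomes true,
    // e.g. "only start searching lockers after the player has hidden a few times".
    public class ConditionGate : Node
    {
        private readonly System.Func<bool> condition;
        private readonly Node child;

        public ConditionGate(System.Func<bool> condition, Node child)
        {
            this.condition = condition;
            this.child = child;
        }

        public override NodeStatus Tick()
        {
            // While the condition is false, the branch behaves as if it were not there.
            if (!condition()) return NodeStatus.Failure;
            return child.Tick();
        }
    }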
IF I AM HAVING "ENEMY" UNITS HIDE FROM THE PLAYER UNIT, here are some checks I should consider when creating my agents' behavior with a behavior tree (see the code sketch after this list):
Can I see the player?
  If yes (we need to hide, so):
    Can I see something to hide behind?
      Go to it and stay on the other side of it from the player.
      If not, can I run?
    Do I think the player can hear me?
      Compare distance to the player with how fast the unit is moving.
      If no: keep moving quickly to get further away.
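Using the hypothetical node classes sketched earlier, that decision list might assemble into a tree roughly like this. The leaf behaviors (canSeePlayer, goToCover, and so on) are placeholders standing in for real sensing and steering code, and the "player not visible" branch is my own assumption about what the agent should default to:

    public static class HideTreeBuilder
    {
        public static Node Build()
        {
            // Placeholder leaves -- each would wrap real sensing/steering code.
            Node canSeePlayer    = new Leaf(() => NodeStatus.Success);  // vision check
            Node goToCover       = new Leaf(() => NodeStatus.Running);  // path to the far side of an obstacle
            Node playerCanHearMe = new Leaf(() => NodeStatus.Failure);  // compare distance to player with current speed
            Node slowDown        = new Leaf(() => NodeStatus.Success);  // "make less noise"
            Node keepFleeing     = new Leaf(() => NodeStatus.Success);  // keep moving quickly away

            return new Selector(
                // Branch 1: the player is visible, so get behind cover (or just run) and mind the noise.
                new Sequence(
                    canSeePlayer,
                    new Selector(goToCover, keepFleeing),          // no cover in sight? run instead
                    new Selector(
                        new Sequence(playerCanHearMe, slowDown),   // too loud? slow down
                        keepFleeing)),                             // otherwise keep moving quickly
                // Branch 2 (assumed): the player is not visible, keep putting distance between us.
                keepFleeing);
        }
    }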
My First Case Study: ALIEN: ISOLATION
While I could not think of a game I had seen where agents fled the player longer than it took to fire a long-range attack, I knew that the alien in Alien: Isolation was known for being out of sight for most of the game and also for having terrifying, complex behaviors. In my research, I found that Alien: Isolation uses behavior trees for its AI agents. The Xenomorph's behavior tree is a terror to behold on its own, with 32 main branches. Some of these branches are locked off until certain conditions are met. For instance, the alien agent does not bother checking in lockers or under desks for the player until the player has successfully hidden from the alien a few times. This saves the alien time and data, but more importantly, this makes it appear as if the alien is learning.
The data in the game is also organized such that the alien agent does not "know" the player's exact location, but at times it will move towards the player from the other side of the map anyway. This is due to an agent called the AI Director, an omniscient AI that nudges agents in the right direction to keep the game moving. Without letting the alien know exactly where the player is, it overrides the regular behavior patterns and tells the alien to move towards another point on the map. The alien performs this behavior, not "knowing" that it is being pointed in the direction of the player so that the sound of it stomping on the ceiling overhead can send them into cardiac arrest.
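The articles I read do not show the Director's code, but the idea is simple enough to sketch: an outside system periodically hands the agent a destination somewhere near the player without ever handing over the player's actual position. Everything below is my own guess at a structure (AgentMover and all of the numbers are made up), not the game's implementation:

    using UnityEngine;

    // Minimal stand-in for whatever movement component the agent uses; the real
    // system would path toward "goal" on its own.
    public class AgentMover : MonoBehaviour
    {
        public Vector3 goal;
    }

    // Hypothetical "director": every so often it hands the agent a destination in the
    // player's general neighborhood, without revealing the player's exact position.
    public class Director : MonoBehaviour
    {
        public Transform player;
        public AgentMover agent;
        public float nudgeInterval = 30f;   // seconds between nudges (made-up value)
        public float vaguenessRadius = 15f; // how fuzzy the hint is (made-up value)

        private float timer;

        void Update()
        {
            timer += Time.deltaTime;
            if (timer < nudgeInterval) return;
            timer = 0f;

            // Pick a point somewhere near the player, not the player itself.
            Vector2 offset = Random.insideUnitCircle * vaguenessRadius;
            agent.goal = player.position + new Vector3(offset.x, offset.y, 0f);
        }
    }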
The other valuable thing I learned about in this case study was the way the alien's "senses" are handled. It has four different vision cones: a large one pointing from its head, as expected, which causes the alien to notice the player differently depending on how far away they are and how long they remain in the area; a smaller one inside that, close enough that the player would definitely be seen; two cones to either side representing the alien's peripheral vision; and finally, a small vision sphere around the alien's torso so the player cannot hide from the alien by standing in its shadow. The alien also gets alerted if the player character touches part of its body (and usually eviscerates the player character on the spot). The alien's hearing is not quite as complex as its sight, but as expected, the player firing a weapon makes a sound loud enough to alert the alien from anywhere on the map, and the player moving quickly makes louder sounds than the player inching along or standing still. Reading about how this was put together, especially the different vision cones, helped me think about how to set up my hiding AI.
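I did not end up implementing anything that elaborate, but a single vision cone boils down to a distance check plus an angle check, and the layered senses are just several of these with different ranges and angles. Here is a rough Unity-flavored sketch of that idea (the field values are made up, and this is not the game's code):

    using UnityEngine;

    // A single vision cone: the target is "seen" if it is close enough and within
    // half the cone angle of the viewer's facing direction. Several of these with
    // different ranges and angles can approximate layered senses.
    [System.Serializable]
    public class VisionCone
    {
        public float range = 10f;   // how far the cone reaches
        public float angle = 60f;   // full cone angle, in degrees

        public bool CanSee(Transform viewer, Transform target)
        {
            Vector3 toTarget = target.position - viewer.position;
            if (toTarget.magnitude > range) return false;

            // Angle between where the viewer is facing and the direction to the target.
            float offAxis = Vector3.Angle(viewer.forward, toTarget);
            return offAxis <= angle * 0.5f;
        }
    }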
My Second Case Study: OpenAI's Hide and Seek Simulations
While looking for other AI that had agents hide from a player, I found the video Two Minute Papers did on OpenAI's research into reinforcement learning. OpenAI created a simulation in which two teams of two agents played Hide-And-Seek: one team "hides" from a team attempting to "seek" them (that is, to get the hiding agents inside their vision cones before a timer runs out). The level the simulations were run in had walls and various obstacles that the agents could move around or even lock in place so that the other team could not manipulate them. However, while the agents had access to these abilities, they were given no instruction at the start to use those behaviors in any way. Their only in-code motivation was to win the game of Hide-And-Seek. Since this was a learning AI, data from past trials of the simulation was kept for the agents to use in their calculations.
Over millions of simulations, patterns emerged. The "Hiding" team first began stopping up holes in walls with blocks to hide behind. Then, the "Seek" team started to move a ramp-shaped object to the wall to go over the cube-shaped blocks and get behind the barrier. The "Hiding" team started to pull the ramp object behind their wall before blocking it off with the cubes. Different strategies began to emerge, as did counteractions to those strategies.
Eventually, however, the agents began to exploit bugs in the program in which they operated. Because the agents were propelled by exerting a force on the ground below them, the "Seeking" team began using a ramp to climb onto cube blocks; walking around on top of a block would move it across the floor, a behavior the researchers refer to in their reports as "block surfing." "Hiding" team agents figured out how to throw the ramp object through the wall where it met at a corner, so their barriers could not be climbed into. Agents on either side found ways to run off the stage or fling themselves over barriers by exploiting the simulation's physics.
In considering all these findings for my project goal, I decided that while keeping the motivation of my agents simple was a good idea, letting them learn and possibly begin to exploit game physics was a potential debugging nightmare. OpenAI's research is open source, so it is possible to look at their source code, which could be a resource for anyone looking into reinforcement learning with their AI agents.
MY DEMONSTRATION
... Is not much to look at, admittedly. There is not much visible feedback, which is important for good AI. The prototype was developed in Unity. The agents (red squares) have two colliders-- one representing a vision cone and one representing a range of hearing. The player-controlled agent (green circle) is faster than the agents, and the agents flee by having a force applied to them in the direction away from the player, scaled by how close the player is. In retrospect, doing this with raycasting rather than colliders would have worked more smoothly.
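For reference, the core of that fleeing behavior amounts to only a few lines. This is a cleaned-up sketch of the idea rather than my exact project code, and the numbers are placeholders:

    using UnityEngine;

    // Pushes the agent directly away from the player, harder the closer the player gets.
    public class FleeAgent : MonoBehaviour
    {
        public Transform player;
        public float panicRadius = 8f;   // a player farther away than this is ignored
        public float maxForce = 20f;

        private Rigidbody2D body;

        void Start()
        {
            body = GetComponent<Rigidbody2D>();
        }

        void FixedUpdate()
        {
            Vector2 away = (Vector2)(transform.position - player.position);
            float distance = away.magnitude;
            if (distance > panicRadius || distance < 0.001f) return;

            // The force scales up as the player closes the distance.
            float urgency = 1f - (distance / panicRadius);
            body.AddForce(away.normalized * maxForce * urgency);
        }
    }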
I also envisioned obstacles for agents to hide behind, which would have required raycasting-- layer masks for object types such as barriers can be used to stop a ray that is otherwise looking for something else. However, in my fervor to find out more about OpenAI's Hide and Seek simulation, I ran out of time to implement this function correctly, so I left the project as it is.
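Had I gotten that far, the line-of-sight test would probably have looked something like this: cast a ray from the agent toward the player, and if something on an obstacle layer is hit first, the agent counts as hidden. The "Obstacles" layer name is an assumption for the sake of the example:

    using UnityEngine;

    public static class LineOfSight
    {
        // Returns true if nothing on the "Obstacles" layer sits between agent and player.
        public static bool CanSeePlayer(Vector2 agentPos, Vector2 playerPos)
        {
            Vector2 toPlayer = playerPos - agentPos;
            int obstacleMask = LayerMask.GetMask("Obstacles");   // assumed layer name

            RaycastHit2D hit = Physics2D.Raycast(agentPos, toPlayer.normalized, toPlayer.magnitude, obstacleMask);
            return hit.collider == null;   // no obstacle in the way means the view is clear
        }
    }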
SOURCES
Ian Millington's AI for Games (2nd ed.)
https://www.gamedeveloper.com/programming/behavior-trees-for-ai-how-they-work
https://towardsdatascience.com/designing-ai-agents-behaviors-with-behavior-trees-b28aa1c3cf8a
https://openai.com/blog/emergent-tool-use/
https://www.vox.com/future-perfect/2019/9/20/20872672/ai-learn-play-hide-and-seek
https://www.aiandgames.com/2020/05/20/revisiting-alien-isolation/
https://www.gamedeveloper.com/design/the-perfect-organism-the-ai-of-alien-isolation