7  Game Search in AI

7.1 Introduction to Multiagent Systems

In this chapter we examine multiagent systems, in which multiple autonomous agents interact within a shared environment. Each agent pursues its own objectives, possibly learning over time, while taking the actions of the other agents into account.

7.2 Understanding Games in AI

7.2.1 Why Games Matter in AI Research

Games provide an ideal framework for studying intelligent decision-making for several reasons:

  • Clear objectives: Success or failure is precisely measurable
  • Controlled environments: Well-defined rules and boundaries
  • Combinatorial complexity: Requires sophisticated knowledge representation and search strategies
  • Learning opportunities: Provides feedback loops for improving strategies

7.2.2 Game Classification Framework

Games can be categorized along multiple dimensions:

7.2.2.1 By Information Availability

  • Perfect information games: All game states and actions are fully visible to all players (e.g., chess, tic-tac-toe)
  • Imperfect information games: Some information is hidden from players (e.g., poker, battleship)

7.2.2.2 By Outcome Determination

  • Deterministic games: Actions have predictable outcomes (e.g., chess)
  • Stochastic games: Outcomes involve randomness (e.g., backgammon, poker)

Examples of game types and their classifications:

  Game          Information   Outcome
  Chess         Perfect       Deterministic
  Tic-tac-toe   Perfect       Deterministic
  Backgammon    Perfect       Stochastic
  Poker         Imperfect     Stochastic
  Battleship    Imperfect     Deterministic

7.2.2.3 Additional Classification Dimensions

  • Number of participants: Single-player, two-player, or n-player games
  • Turn structure: Sequential vs. simultaneous move games
  • Environment type: Discrete vs. continuous state spaces
  • Sum of outcomes: Zero-sum (one player’s gain equals another’s loss) vs. non-zero-sum games

7.3 Formalizing Games as Search Problems

Games can be transformed into structured search problems with these components:

  • State space: The set of all possible game configurations
  • Initial state: The starting configuration of the game
  • Successor function: Rules defining legal moves and resulting states
  • Terminal test: Determines when the game ends
  • Utility function: Evaluates outcomes from a player’s perspective
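
As a concrete illustration, here is a minimal sketch of these five components for tic-tac-toe in Python. The encoding (a 9-cell tuple plus the player to move) and the helper names are illustrative choices, not the only possible ones:

# A minimal sketch of the five search-problem components for tic-tac-toe.
# A state is a 9-cell board (None, 'X', or 'O') plus the player to move.

WIN_LINES = [(0,1,2), (3,4,5), (6,7,8),    # rows
             (0,3,6), (1,4,7), (2,5,8),    # columns
             (0,4,8), (2,4,6)]             # diagonals

def initial_state():                       # initial state
    return (None,) * 9, 'X'                # empty board, X moves first

def actions(state):                        # legal moves from a state
    board, _ = state
    return [i for i in range(9) if board[i] is None]

def result(state, a):                      # successor function
    board, player = state
    new_board = board[:a] + (player,) + board[a+1:]
    return new_board, ('O' if player == 'X' else 'X')

def winner(board):
    for i, j, k in WIN_LINES:
        if board[i] is not None and board[i] == board[j] == board[k]:
            return board[i]
    return None

def is_terminal(state):                    # terminal test
    board, _ = state
    return winner(board) is not None or all(c is not None for c in board)

def utility(state):                        # utility from X's (MAX's) perspective
    w = winner(state[0])
    return 1 if w == 'X' else -1 if w == 'O' else 0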

For perfect information games, we can model the problem as a search tree where:

  • Nodes: Represent game states
  • Branches: Represent moves (actions) leading to new states
  • Players: Two types alternate making decisions:
    • MAX: Player trying to maximize utility (often “us”)
    • MIN: Player trying to minimize utility (the opponent)

The tree structure alternates between levels where MAX decides and levels where MIN decides, with the goal of finding a path that maximizes utility assuming the opponent plays optimally.

7.4 The Minimax Algorithm

7.4.1 Core Concept

Minimax is a recursive decision algorithm designed for two-player zero-sum games with perfect information. It works by:

  1. Building a game tree to a certain depth
  2. Evaluating terminal positions
  3. Propagating values upward through the tree
  4. Selecting moves that maximize the minimum guaranteed outcome

The algorithm assumes both players play optimally: MAX tries to maximize the score while MIN tries to minimize it.

7.4.2 Mathematical Definition

Given a game tree, the Minimax value of a node n is defined as:

\text{Minimax}(n) = \begin{cases} \text{Utility}(n), & \text{if } n \text{ is a terminal state} \\ \max_{a \in \text{Actions}(n)} \text{Minimax}(\text{Result}(n, a)), & \text{if } n \text{ is a MAX node} \\ \min_{a \in \text{Actions}(n)} \text{Minimax}(\text{Result}(n, a)), & \text{if } n \text{ is a MIN node} \end{cases}

Where:

  • Actions(n): Returns all legal moves from state n
  • Result(n, a): Returns the state obtained by applying action a in state n
  • Utility(n): Evaluates the score of a terminal state
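
For example, consider a MAX root with two MIN children, the first leading to terminal utilities \{3, 5\} and the second to \{2, 9\}:

\text{Minimax}(\text{root}) = \max\big(\min(3, 5), \min(2, 9)\big) = \max(3, 2) = 3

MAX should therefore choose the first move, guaranteeing a payoff of at least 3 against an optimal opponent.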

7.4.3 Implementation

7.4.3.1 General Algorithm

def minimax(state, depth, maximizingPlayer):
    # Evaluate terminal states, or estimate non-terminal states once the
    # depth limit is reached (see Section 7.4.3.2 and the note below).
    if is_terminal(state) or depth == 0:
        return utility(state)
    
    if maximizingPlayer:
        # MAX node: take the best value over all legal actions.
        value = float('-inf')
        for a in actions(state):
            value = max(value, minimax(result(state, a), depth - 1, False))
        return value
    else:
        # MIN node: take the worst value (for MAX) over all legal actions.
        value = float('inf')
        for a in actions(state):
            value = min(value, minimax(result(state, a), depth - 1, True))
        return value
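
At the root we want the move itself, not just its value. A small sketch of such a wrapper, assuming the same helper names used above:

def best_move(state, depth):
    # Choose the root action whose successor has the highest minimax
    # value; after MAX moves it is MIN's turn, hence the False flag.
    return max(actions(state),
               key=lambda a: minimax(result(state, a), depth - 1, False))

With the tic-tac-toe encoding sketched in Section 7.3, best_move(initial_state(), 9) searches the full game tree and returns an optimal opening move for X.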

7.4.3.2 Detailed Pseudocode with Path Tracking

MIN-MAX(Pos, Depth, Player)
    IF deep_enough(Pos, Depth) THEN  // Terminal state or depth limit reached
        return Value = static(Pos, Player), Path = NIL
    ELSE
        Successors = gen_move(Pos, Player)  // Generate the legal moves
        
        IF Successors is empty THEN  // No moves available: evaluate in place
            return Value = static(Pos, Player), Path = NIL
        ELSE  // Evaluate each possible move
            Best_value = minimum possible value of static()
            FOR Succ in Successors
                Succ_res = MIN-MAX(Succ, Depth + 1, opponent(Player))
                New_value = -value(Succ_res)  // Negate: opponent's perspective
                IF New_value > Best_value THEN  // Found a better move
                    Best_value = New_value
                    Best_path = Succ & path(Succ_res)
            return Value = Best_value, Path = Best_path

Where:

  • deep_enough(Pos, Depth): Returns TRUE if search should stop (depth limit or terminal state)
  • gen_move(Pos, Player): Generates possible moves from the current position
  • static(Pos, Player): Evaluates position quality from Player’s perspective
  • opponent(Player): Returns the opposing player
  • value(Res), path(Res): Extract the value and path components of a result returned by MIN-MAX

Note that this pseudocode is the negamax formulation of minimax: every level maximizes the negation of the value returned for the opponent, so a single maximization loop serves both players. In practice, we use limited-depth search with a static position evaluation function to handle complex games, estimating the value of non-terminal states once the depth limit is reached.
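
Rendered in Python, the pseudocode reads as below. This is a sketch assuming the helpers just listed (deep_enough, gen_move, static, opponent), with the value and path returned as a tuple:

def min_max(pos, depth, player):
    # Direct transcription of the pseudocode above, in negamax form:
    # each level maximizes the negated value achieved by the opponent.
    if deep_enough(pos, depth):           # terminal state or depth limit
        return static(pos, player), []
    successors = gen_move(pos, player)
    if not successors:                    # no moves available: evaluate in place
        return static(pos, player), []
    best_value, best_path = float('-inf'), []
    for succ in successors:
        value, path = min_max(succ, depth + 1, opponent(player))
        value = -value                    # flip to the current player's perspective
        if value > best_value:            # found a better move
            best_value, best_path = value, [succ] + path
    return best_value, best_path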

7.4.4 Properties and Limitations

7.4.4.1 Theoretical Properties

  • Completeness: Complete if the game tree is finite
  • Time complexity: O(b^m), where b is the branching factor and m is the maximum search depth
  • Space complexity: O(b \times m) using depth-first traversal
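
To see why the time bound is prohibitive, consider chess: with a branching factor of roughly b ≈ 35 and games lasting around m ≈ 80 plies, the full tree has on the order of 35^80 ≈ 10^123 nodes, far beyond any exhaustive search.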

7.4.4.2 Practical Challenges

  1. Non-quiescent states problem
    Issue: The search cutoff can land in a volatile position (e.g., in the middle of a piece exchange), so the static evaluation is misleading
    Solution: Quiescence search: keep searching unstable positions beyond the normal depth limit until the position is quiet (see the sketch after this list)

  2. Horizon effect
    Issue: The algorithm may prefer moves that merely push an unavoidable negative outcome just beyond the search depth
    Impact: Can lead to locally optimal but globally poor decisions

  3. Exponential complexity
    Issue: Pure minimax becomes impractical for games with large branching factors
    Solution: Apply pruning techniques to reduce the search space
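
A minimal negamax-style sketch of quiescence search follows; evaluate (a side-to-move static evaluator) and noisy_moves (a generator of volatile moves such as captures) are assumed, illustrative helpers rather than a fixed API:

def quiescence(state, alpha, beta):
    # Keep searching volatile moves past the nominal depth limit so the
    # static evaluation is only ever taken in a "quiet" position.
    stand_pat = evaluate(state)           # assumed: static score for the side to move
    if stand_pat >= beta:
        return beta                       # already good enough: cut off
    alpha = max(alpha, stand_pat)
    for a in noisy_moves(state):          # assumed: captures and other volatile moves
        score = -quiescence(result(state, a), -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha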

7.5 Alpha-Beta Pruning

Alpha-Beta pruning is an optimization technique that dramatically reduces the number of nodes evaluated by the minimax algorithm without affecting the final result.

7.5.1 Key Insight

The key insight is that once one reply shows a position to be worse than an option examined earlier in the tree, the remaining moves from that position cannot affect the final decision, so they need not be analyzed at all.

7.5.2 How It Works

  • \alpha (alpha): The best (highest) value MAX can guarantee so far along the path to the root, a lower bound on the final value
  • \beta (beta): The best (lowest) value MIN can guarantee so far along the path to the root, an upper bound on the final value

The algorithm prunes branches when:

  • At a MAX node: Current value \geq \beta (MIN will never allow this path)
  • At a MIN node: Current value \leq \alpha (MAX has a better option elsewhere)
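
Folding these two cutoff tests into the minimax implementation from Section 7.4.3 gives the sketch below, with the same assumed helpers:

def alphabeta(state, depth, alpha, beta, maximizingPlayer):
    # Identical to minimax except for the two cutoff tests below.
    if is_terminal(state) or depth == 0:
        return utility(state)

    if maximizingPlayer:
        value = float('-inf')
        for a in actions(state):
            value = max(value, alphabeta(result(state, a),
                                         depth - 1, alpha, beta, False))
            if value >= beta:            # MIN will never allow this path
                return value             # beta cutoff: prune remaining moves
            alpha = max(alpha, value)
        return value
    else:
        value = float('inf')
        for a in actions(state):
            value = min(value, alphabeta(result(state, a),
                                         depth - 1, alpha, beta, True))
            if value <= alpha:           # MAX has a better option elsewhere
                return value             # alpha cutoff: prune remaining moves
            beta = min(beta, value)
        return value

The root call is alphabeta(state, depth, float('-inf'), float('inf'), True), and it returns exactly the same value as plain minimax while visiting far fewer nodes.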

7.5.3 Efficiency Gains

The effectiveness of alpha-beta pruning depends on move ordering:

  • With perfect move ordering: O(b^{m/2}) nodes evaluated vs. O(b^m) for plain minimax
  • For the same computational budget, this effectively doubles the searchable depth
  • The effective branching factor becomes \sqrt{b} instead of b
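
Perfect ordering is unattainable in practice, but even a cheap heuristic that tries promising moves first brings the search closer to the O(b^{m/2}) bound. A sketch, where score_move is an assumed domain-specific heuristic (e.g., captures before quiet moves in chess):

def ordered_actions(state):
    # Sort moves best-first so alpha-beta cutoffs trigger earlier.
    # score_move is an assumed cheap heuristic; any ordering
    # preserves correctness, a good one just prunes more.
    return sorted(actions(state), key=score_move, reverse=True)

Replacing actions(state) with ordered_actions(state) inside alphabeta leaves the returned value unchanged while pruning more aggressively.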

With these optimizations, AI systems built around minimax with alpha-beta pruning can play games such as chess, checkers, and Othello at a high level.