Artificial Intelligence

Wumpus World — AI Agents & Reinforcement Learning

Classic Wumpus World simulation implementing four AI agents: a random agent, a human-interactive agent, a rational agent (logical inference on a 10×10 grid), and a Q-learning reinforcement learning agent.

2024
Completed (2024)
1 member

Technologies Used

Python, Q-Learning, Reinforcement Learning, Propositional Logic, AI Agents

Implementation of the classic Wumpus World problem from AI textbooks, featuring four distinct agent types. Built entirely in Python.

🤖 Agent Implementations

1. Random Agent

  • Basic agent selecting actions uniformly at random
  • Performance baseline for other agents

2. Human Agent

  • Interactive mode allowing manual control for debugging

3. Rational Agent

  • Logical inference-based agent on a 10×10 grid
  • Maintains a knowledge base of visited cells and breeze/stench perceptions
  • Applies propositional logic to infer safe and unsafe cells
  • Avoids pits and the Wumpus through logical deduction
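The inference step described above can be illustrated with a minimal sketch. It is not the project's actual code: the names `neighbors` and `infer_safe_cells`, and the choice to store percepts as a dictionary keyed by cell, are assumptions made for illustration. It applies only the simplest propositional rules: no breeze in a visited cell means no pit in any neighbor, and no stench means no Wumpus in any neighbor.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]
GRID_SIZE = 10  # the rational agent operates on a 10x10 grid


def neighbors(cell: Cell, size: int = GRID_SIZE) -> Set[Cell]:
    """Return the orthogonally adjacent cells that lie inside the grid."""
    x, y = cell
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return {(cx, cy) for cx, cy in candidates if 0 <= cx < size and 0 <= cy < size}


def infer_safe_cells(percepts: Dict[Cell, Tuple[bool, bool]]) -> Set[Cell]:
    """Infer provably safe cells from (breeze, stench) percepts at visited cells.

    Hypothetical helper: a cell is reported safe only if it is known to hold
    neither a pit nor the Wumpus.
    """
    # Visited cells are safe by assumption: the agent survived entering them.
    no_pit: Set[Cell] = set(percepts)
    no_wumpus: Set[Cell] = set(percepts)
    for cell, (breeze, stench) in percepts.items():
        if not breeze:
            no_pit |= neighbors(cell)      # no breeze => no adjacent pit
        if not stench:
            no_wumpus |= neighbors(cell)   # no stench => no adjacent Wumpus
    return no_pit & no_wumpus


# The agent felt nothing at (0, 0) and a breeze at (1, 0).
visited = {(0, 0): (False, False), (1, 0): (True, False)}
safe = infer_safe_cells(visited)
```

Here `(0, 1)` is proven safe because `(0, 0)` had no breeze or stench, while `(2, 0)` and `(1, 1)` stay on the unknown frontier because the breeze at `(1, 0)` means either could hold a pit.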

4. Learning Agent (Q-Learning)

  • Model-free reinforcement learning (Q-learning)
  • State space: agent position + perception vector
  • Action space: move (4 directions), shoot arrow, grab gold, climb
  • Training: 50 episodes with epsilon-greedy exploration
  • Policy stored as Q-table for exploitation at inference time
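As a rough sketch of how such an agent could be wired together, the following shows epsilon-greedy action selection and the standard tabular Q-learning update. The action names, the learning rate `alpha`, and the discount factor `gamma` are assumptions chosen for illustration; the project may use different values and identifiers.

```python
import random
from collections import defaultdict
from typing import DefaultDict, Hashable, List, Tuple

# Assumed action labels: 4 moves, shoot arrow, grab gold, climb.
ACTIONS: List[str] = ["up", "down", "left", "right", "shoot", "grab", "climb"]

QTable = DefaultDict[Tuple[Hashable, str], float]


def choose_action(q: QTable, state: Hashable, epsilon: float, rng: random.Random) -> str:
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q: QTable, state: Hashable, action: str, reward: float,
             next_state: Hashable, done: bool,
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Tabular Q-learning update:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    # A terminal transition has no future value to bootstrap from.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


q: QTable = defaultdict(float)          # unseen (state, action) pairs start at 0.0
q_update(q, "s0", "grab", reward=100.0, next_state="s1", done=True)
greedy = choose_action(q, "s0", epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the agent always exploits, which corresponds to using the stored Q-table at inference time.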

🏗️ Architecture

| File | Role |
|------|------|
| wumpusworld.py | Environment definition: grid, wumpus, pits, gold placement |
| wumpus.py | Environment rules and perception generation |
| agent.py | Agent base class and 4 implementations |
| utils.py | Helper functions: grid display, performance metrics |
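To make the perception-generation role concrete, here is a small sketch under the textbook Wumpus World rules (breeze next to a pit, stench next to the Wumpus, glitter on the gold cell). The `Percept` type and the `perceive` function are hypothetical names, not taken from `wumpus.py`.

```python
from typing import NamedTuple, Set, Tuple

Cell = Tuple[int, int]


class Percept(NamedTuple):
    breeze: bool
    stench: bool
    glitter: bool


def adjacent(a: Cell, b: Cell) -> bool:
    """True when two cells share an edge (Manhattan distance of 1)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def perceive(agent: Cell, pits: Set[Cell], wumpus: Cell, gold: Cell) -> Percept:
    """Build the percept the agent receives at its current cell."""
    return Percept(
        breeze=any(adjacent(agent, pit) for pit in pits),
        stench=adjacent(agent, wumpus),
        glitter=(agent == gold),
    )


percept = perceive(agent=(1, 0), pits={(2, 0)}, wumpus=(5, 5), gold=(1, 0))
```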

Challenges

  • Designing a knowledge base that correctly infers safe cells from limited perceptions
  • Defining an effective state representation for Q-learning in a partially observable environment
  • Balancing exploration vs. exploitation during training with only 50 episodes

Solutions

  • Used propositional logic with frontier-based cell safety inference for the rational agent
  • Represented state as (position, breeze, stench, glitter) tuple for compact Q-table
  • Epsilon-greedy strategy with decaying epsilon over training episodes
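The state tuple and the decaying exploration rate could look like the sketch below. The linear schedule and the start and end values (`1.0` and `0.05`) are assumptions; the source only states that epsilon decays over the training episodes, not how.

```python
from typing import Tuple

State = Tuple[Tuple[int, int], bool, bool, bool]


def encode_state(position: Tuple[int, int], breeze: bool,
                 stench: bool, glitter: bool) -> State:
    """Pack position and percepts into a hashable Q-table key."""
    return (position, breeze, stench, glitter)


def decayed_epsilon(episode: int, n_episodes: int = 50,
                    eps_start: float = 1.0, eps_end: float = 0.05) -> float:
    """Linearly anneal epsilon from eps_start (episode 0) to eps_end (last episode)."""
    if n_episodes <= 1:
        return eps_end
    fraction = min(episode, n_episodes - 1) / (n_episodes - 1)
    return eps_start + fraction * (eps_end - eps_start)


state = encode_state((3, 4), breeze=True, stench=False, glitter=False)
schedule = [decayed_epsilon(ep) for ep in range(50)]
```

If the learning agent also runs on a 10×10 grid, this representation yields at most 100 × 2³ = 800 distinct states, which keeps the Q-table small enough to be partly filled in even with a short training budget.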

Outcomes

  • Four functional agent implementations with measurable performance differences
  • Q-learning agent outperforming random baseline after 50 training episodes
  • Complete environment simulation with all Wumpus World rules
  • Published on GitHub: github.com/AyGoub/Projet-Ia-Wampus