An interactive Python application that implements the Value Iteration algorithm to solve a Markov Decision Process (MDP) in which a robot navigates a 1D hallway to reach a charging station.
Features
Value Iteration Algorithm: Solves the MDP using the Bellman optimality equation
Interactive GUI: Built with Tkinter for real-time visualization
Convergence Analysis: Live plot showing algorithm convergence over iterations
Q-Value Display: Shows state-action values to help explain the optimal policy
Simulation Mode: Test the optimal policy with animated robot movements
Parameter Tuning: Adjust the discount factor and convergence threshold in real time
Export Results: Save results to JSON for further analysis
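For reference, the Bellman optimality equation that Value Iteration applies can be written as below. This is the standard textbook form, not taken from the application's code. The symbols are notational assumptions: P is the transition probability, R the reward, gamma the discount factor, and theta the convergence threshold.

```latex
% Bellman optimality equation for the optimal state value
V^*(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma V^*(s') \bigr]

% Value Iteration turns it into an update rule, repeated until the
% largest change across states falls below the threshold \theta
V_{k+1}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma V_k(s') \bigr],
\qquad \text{stop when } \max_{s} \lvert V_{k+1}(s) - V_k(s) \rvert < \theta
```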
Problem Description
A robot navigates a 1D hallway with 4 positions [0, 1, 2, 3]:
Goal: Reach the charging station at position 3
Actions: Move LEFT or RIGHT
Transitions: Each action succeeds with probability 0.8; with probability 0.2 the robot stays in place (noisy environment)
Rewards: +10 for reaching the goal, -1 for each move
Terminal State: Position 3 (charging station)
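The problem above is small enough to solve in a few lines. The following is a minimal sketch, not the application's actual code. It rests on several assumptions the description leaves open: moving LEFT at position 0 keeps the robot in place, any step that does not end at the goal costs -1 (including a failed move), a step that ends at position 3 earns +10 instead of -1, and the defaults gamma = 0.9 and theta = 1e-6 are illustrative. All function and constant names are hypothetical.

```python
N_STATES = 4                          # positions 0..3
GOAL = 3                              # charging station (terminal state)
ACTIONS = {"LEFT": -1, "RIGHT": 1}    # action name -> position change
P_SUCCESS = 0.8                       # probability the move succeeds


def transitions(state, action):
    """Return (probability, next_state, reward) outcomes for one action."""
    # Assumption: walls clamp the robot to the hallway.
    target = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    outcomes = []
    for prob, nxt in ((P_SUCCESS, target), (1 - P_SUCCESS, state)):
        # Assumption: +10 on entering the goal, -1 for every other step.
        reward = 10.0 if nxt == GOAL else -1.0
        outcomes.append((prob, nxt, reward))
    return outcomes


def q_value(V, state, action, gamma):
    """Expected return of taking `action` in `state`, then following V."""
    return sum(p * (r + gamma * V[nxt])
               for p, nxt, r in transitions(state, action))


def value_iteration(gamma=0.9, theta=1e-6):
    """Run in-place value iteration; return (values, greedy policy)."""
    V = [0.0] * N_STATES              # the terminal state stays at 0
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue
            best = max(q_value(V, s, a, gamma) for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best               # in-place update
        if delta < theta:
            break
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(V, s, a, gamma))
        for s in range(N_STATES) if s != GOAL
    }
    return V, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(values)
    print(policy)
```

Under these assumptions the greedy policy selects RIGHT in every non-terminal state, and the state values increase toward the goal. The sketch updates values in place rather than from a separate copy of the previous sweep; either variant converges here.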