Project Details

An interactive Python application that implements the Value Iteration algorithm to solve a Markov Decision Process (MDP) in which a robot navigates a 1D hallway to reach a charging station.

Features

Value Iteration Algorithm: Solves the MDP using the Bellman optimality equation

Interactive GUI: Built with Tkinter for real-time visualization

Convergence Analysis: Live plot showing algorithm convergence over iterations

Q-Value Display: Shows state-action values to help explain why the optimal policy picks each action

Simulation Mode: Test the optimal policy with animated robot movements

Parameter Tuning: Adjust the discount factor and convergence threshold in real time

Export Results: Save results to JSON for further analysis
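
For reference, the Bellman optimality update behind the Value Iteration and Q-Value features can be written in standard textbook notation as below. The symbols are assumptions used for illustration and are not taken from the project's code: \gamma stands for the discount factor, \theta for the convergence threshold, P for the transition probabilities, and R for the reward.

```latex
\begin{align}
Q_k(s, a) &= \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma\, V_k(s') \bigr] \\
V_{k+1}(s) &= \max_{a} Q_k(s, a) \\
\text{stop when}\quad & \max_{s} \bigl| V_{k+1}(s) - V_k(s) \bigr| < \theta
\end{align}
```

The quantity compared against \theta in the last line is the kind of per-iteration change that a convergence plot would typically show.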

Problem Description

A robot navigates a 1D hallway with 4 positions [0, 1, 2, 3]:

Goal: Reach the charging station at position 3

Actions: Move LEFT or RIGHT

Transitions: 80% success rate, 20% stay in place (noisy environment)

Rewards: +10 for reaching the goal, -1 for each move

Terminal State: Position 3 (charging station)
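
The problem description above can be sketched as a small, GUI-free value iteration in Python. This is a minimal illustration, not the application's actual code. Several details are assumptions: the discount factor `GAMMA = 0.9`, the threshold `THETA = 1e-6`, that a move into a wall leaves the robot in place, and that a step pays +10 when it lands on the goal and -1 otherwise. All function and constant names are hypothetical.

```python
# Minimal value-iteration sketch for the 4-position hallway.
# Everything marked "Assumption" is an illustrative choice, not the project's code.

N_STATES = 4                         # positions 0..3
GOAL = 3                             # charging station (terminal state)
ACTIONS = {"LEFT": -1, "RIGHT": +1}
P_SUCCESS = 0.8                      # 80% move succeeds, 20% stay in place
GAMMA = 0.9                          # Assumption: discount factor
THETA = 1e-6                         # Assumption: convergence threshold


def transitions(state, action):
    """Return (probability, next_state, reward) tuples for one action."""
    # Assumption: moving into a wall leaves the robot where it is.
    target = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    outcomes = []
    for prob, nxt in ((P_SUCCESS, target), (1.0 - P_SUCCESS, state)):
        # Assumption: +10 when the step lands on the goal, otherwise -1.
        reward = 10.0 if nxt == GOAL else -1.0
        outcomes.append((prob, nxt, reward))
    return outcomes


def q_value(values, state, action):
    """Expected return of taking `action` in `state` under `values`."""
    return sum(p * (r + GAMMA * values[s2])
               for p, s2, r in transitions(state, action))


def value_iteration():
    """Run in-place Bellman optimality updates until the change is below THETA."""
    values = [0.0] * N_STATES        # the terminal state keeps value 0
    deltas = []                      # largest change per sweep
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue
            best = max(q_value(values, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        deltas.append(delta)
        if delta < THETA:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(ACTIONS, key=lambda a: q_value(values, s, a))
              for s in range(N_STATES) if s != GOAL}
    return values, policy, deltas


if __name__ == "__main__":
    values, policy, deltas = value_iteration()
    print("values:", [round(v, 3) for v in values])
    print("policy:", policy)
    print("sweeps:", len(deltas))
```

Under these assumptions the greedy policy is RIGHT in every non-terminal position, and the state values increase toward the charging station. Changing `GAMMA` or `THETA` mirrors what the parameter tuning controls would do.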
