Shortest Path in Maze using Reinforcement Learning

Modelled a 2D maze as an MDP with appropriate states, actions, rewards and transition probabilities. Implemented Howard’s Policy Iteration, Value Iteration and Linear Programming to find the optimal policy, which minimizes the number of steps between two given points while satisfying the given constraints.
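
As a rough illustration of the value-iteration part only, here is a minimal sketch rather than the project's actual code. The maze layout, the names `MAZE`, `START`, `GOAL`, `step`, `value_iteration` and `greedy_path`, the reward of -1 per move, the discount factor of 0.9 and the deterministic transitions are all assumptions made for this example.

```python
import numpy as np

# Hypothetical 4x4 maze (assumption): 0 = free cell, 1 = wall.
MAZE = np.array([
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
])
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Assumed deterministic transition: move if the target cell is free, else stay."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < MAZE.shape[0] and 0 <= c < MAZE.shape[1] and MAZE[r, c] == 0:
        return (r, c)
    return state


def value_iteration(gamma=0.9, tol=1e-9):
    """In-place value iteration with a reward of -1 per move and a terminal goal."""
    V = np.zeros(MAZE.shape)
    while True:
        delta = 0.0
        for r in range(MAZE.shape[0]):
            for c in range(MAZE.shape[1]):
                if MAZE[r, c] == 1 or (r, c) == GOAL:
                    continue  # walls are not states; the goal keeps value 0
                best = max(-1.0 + gamma * V[step((r, c), a)] for a in ACTIONS)
                delta = max(delta, abs(best - V[r, c]))
                V[r, c] = best
        if delta < tol:
            return V


def greedy_path(V, gamma=0.9, max_steps=100):
    """Follow the greedy policy induced by V from START until GOAL is reached."""
    path = [START]
    while path[-1] != GOAL and len(path) <= max_steps:
        s = path[-1]
        a = max(ACTIONS, key=lambda a: -1.0 + gamma * V[step(s, a)])
        path.append(step(s, a))
    return path


if __name__ == "__main__":
    print(greedy_path(value_iteration()))
```

Because every move costs the same, maximizing the discounted return is equivalent to minimizing the number of steps, so the greedy path on this layout takes six moves from `START` to `GOAL`. Stochastic transitions would replace the single successor in `step` with an expectation over next states.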