kunalp117 / inverse-reinforcement-learning-fundamentals
Implementation of the feasibility-based results from Andrew Ng's paper on recovering an MDP's rewards given its optimal policy. Shows how regularization can reduce the size of the feasible set by 30% (and increase precision).
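As a rough illustration of what "feasible" means here, the sketch below checks whether a candidate reward vector makes a given policy optimal in a small finite MDP. It is not taken from this repository: the function names `policy_value` and `is_feasible`, the two-state MDP, and the discount factor of 0.9 are all assumptions made for the example. It tests optimality by comparing each action's one-step lookahead value against the policy's own value, which should be equivalent to the usual matrix form of the feasibility condition.

```python
# Minimal sketch (hypothetical names, toy MDP): a reward vector R is "feasible"
# for a policy if that policy is optimal under R.

def policy_value(P, R, policy, gamma, iters=2000):
    """Approximate V^pi by repeated Bellman backups.

    P[a][s][t] is the probability of moving from state s to t under action a,
    R[s] is the reward for being in state s, policy[s] is the action taken in s.
    """
    n = len(R)
    V = [0.0] * n
    for _ in range(iters):
        V = [
            R[s] + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
            for s in range(n)
        ]
    return V


def is_feasible(P, R, policy, gamma, tol=1e-9):
    """Return True if no action improves on the policy's value in any state."""
    n = len(R)
    V = policy_value(P, R, policy, gamma)
    for s in range(n):
        for a in range(len(P)):
            q = R[s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
            if q > V[s] + tol:
                return False
    return True


# Toy MDP (assumed for illustration): action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
policy = [1, 0]  # move from state 0 to state 1, then stay in state 1

reward_on_state_1 = is_feasible(P, [0.0, 1.0], policy, 0.9)
reward_on_state_0 = is_feasible(P, [1.0, 0.0], policy, 0.9)
zero_reward = is_feasible(P, [0.0, 0.0], policy, 0.9)
```

In this toy case a reward placed on state 1 is feasible, a reward placed on state 0 is not, and the all-zero reward is also feasible. That last, degenerate case is one reason a feasibility check alone leaves too many candidate rewards, and why an additional criterion such as regularization is useful for narrowing the set.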