Notes on All Things Machine Learning and Mathematics

MDPs, value and policy iteration