We introduce \textit{Policy Guided Monte Carlo} (PGMC), a computational paradigm that uses reinforcement learning to improve Markov chain Monte Carlo (MCMC) sampling. The methodology is generally applicable and unbiased, and it opens a new path toward the automated discovery of efficient MCMC samplers. After developing the general theory, we demonstrate some of PGMC's prospects on an Ising model on the kagome lattice, including in its computationally challenging spin ice regime. There, we show that PGMC automatically learns efficient MCMC updates without a priori knowledge of the physics at hand.