I work in supply chain optimization, where reinforcement learning has been an important technique for decades. Supply chain problems are naturally modeled as Markov decision processes (MDPs), and as the state space of an MDP grows, simulation-based reinforcement learning becomes one of the most versatile techniques for approximating optimal policies.
I see some sort of reinforcement learning as the most promising technique for overcoming the dramatically named "curse of dimensionality" in the state—the single biggest roadblock to optimizing more complex supply chain models.
In fact, the study of MDPs and their solutions stems from operations research, and I think studying problems in that context gives you a powerful way of understanding how reinforcement learning algorithms work. Basic inventory control problems are very intuitive, and there's a natural progression from exact dynamic programming methods (value iteration and policy iteration) to the various reinforcement learning algorithms that really helps build intuition for how RL works.
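To make that progression concrete, here is a minimal sketch of value iteration on a toy single-item inventory problem. Everything in it (Poisson demand, the capacity, the holding, stockout, and per-unit ordering costs, the discount factor) is an illustrative assumption rather than a model from this post, but it shows the exact dynamic programming baseline that RL methods then approximate once the state space is too large to enumerate.

```python
# Value iteration on a toy single-item inventory MDP.
# All parameters (capacity, costs, Poisson demand) are illustrative
# assumptions, not taken from the text above.
import numpy as np
from scipy.stats import poisson

CAPACITY = 20        # maximum on-hand inventory (states 0..CAPACITY)
HOLDING_COST = 1.0   # cost per unit carried to the next period
STOCKOUT_COST = 5.0  # penalty per unit of unmet demand
ORDER_COST = 2.0     # cost per unit ordered
DEMAND_MEAN = 4.0    # Poisson demand rate per period
GAMMA = 0.95         # discount factor

# Truncate the demand distribution and renormalize so probabilities sum to 1.
demand = np.arange(0, 4 * int(DEMAND_MEAN) + 1)
demand_prob = poisson.pmf(demand, DEMAND_MEAN)
demand_prob /= demand_prob.sum()

def step(state, order):
    """Expected one-period cost and next-state distribution for (state, order)."""
    stock = min(state + order, CAPACITY)
    cost = ORDER_COST * order
    next_probs = np.zeros(CAPACITY + 1)
    for d, p in zip(demand, demand_prob):
        sold = min(stock, d)
        cost += p * (HOLDING_COST * (stock - sold) + STOCKOUT_COST * (d - sold))
        next_probs[stock - sold] += p
    return cost, next_probs

def q_value(s, a, V):
    """Expected cost of taking action a in state s, then following V."""
    cost, probs = step(s, a)
    return cost + GAMMA * probs @ V

# Value iteration: repeatedly apply the Bellman optimality operator.
V = np.zeros(CAPACITY + 1)
for _ in range(1000):
    V_new = np.array([
        min(q_value(s, a, V) for a in range(CAPACITY - s + 1))
        for s in range(CAPACITY + 1)
    ])
    converged = np.max(np.abs(V_new - V)) < 1e-6
    V = V_new
    if converged:
        break

# Greedy policy with respect to V; for this cost structure it typically
# comes out as a base-stock ("order-up-to") policy.
policy = [min(range(CAPACITY - s + 1), key=lambda a: q_value(s, a, V))
          for s in range(CAPACITY + 1)]
print(policy)
```

The step from this to tabular RL is essentially replacing the exact expectation over demand with sampled transitions and the full sweep over states with updates only at visited states, which is why small inventory problems are such a useful bridge for building intuition.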