In reinforcement learning, if state transition is independent of the action taken, is it still an MDP problem?

I have an environment in which the state transition satisfies the Markov property. However, the transition from S_{t} to S_{t+1} does not depend on the action taken at S_{t}. Can I still formulate the problem as an MDP? If so, how do the Bellman equations change?
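
To make the question concrete, here is my rough understanding of how the Bellman optimality equation would specialize (please correct me if this is wrong). The standard form is

$$Q^*(s,a) = R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s'),$$

and with action-independent transitions, i.e. P(s' \mid s, a) = P(s' \mid s), I would expect the lookahead term to be the same for every action:

$$Q^*(s,a) = R(s,a) + \gamma \sum_{s'} P(s' \mid s)\, V^*(s').$$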

I tried solving the problem with DRL methods and they converge, so my guess is that it should work. However, I couldn't find a similar problem through search.
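
In case it helps, here is a toy sketch of the kind of environment I mean (the class and parameter names are just illustrative, not from any library): the action affects the reward but not the transition.

```python
import numpy as np

# Toy sketch: the next state is drawn from a fixed Markov chain P(s' | s)
# regardless of the action, while the reward still depends on (state, action).
class ActionIndependentEnv:
    def __init__(self, n_states=5, n_actions=3, seed=0):
        self.rng = np.random.default_rng(seed)
        # Row-stochastic transition matrix P(s' | s) -- note: no action axis.
        self.P = self.rng.dirichlet(np.ones(n_states), size=n_states)
        # Reward table R(s, a): the action only influences the reward.
        self.R = self.rng.normal(size=(n_states, n_actions))
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        reward = self.R[self.state, action]
        # The transition ignores the action entirely.
        self.state = int(self.rng.choice(self.n_states, p=self.P[self.state]))
        return self.state, reward
```

If this is still a valid MDP, I would expect standard DRL algorithms to treat it as a degenerate case, which matches what I observed.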
