I have an environment in which the state transitions satisfy the Markov property. However, the transition from $S_t$ to $S_{t+1}$ does not depend on the action taken at $S_t$. Can I still formulate the problem as an MDP? If yes, how do the Bellman equations change?
I tried to solve the problem using DRL methods and they converge, so my guess is that it should work. However, I couldn't find a similar problem through search.
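To make the setting concrete, here is a minimal toy sketch (my own illustrative example, not from any library) of what I mean: the next state follows a fixed Markov chain that ignores the action, while the reward still depends on the chosen action, so the policy still matters. In this case the Bellman backup seems to simplify, since the expectation over next states no longer depends on $a$:

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 3, 2

# Fixed transition matrix P[s, s'] -- note there is no action dimension.
P = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
])

# Reward depends on (state, action), so choosing actions is still meaningful.
R = rng.uniform(0.0, 1.0, size=(N_STATES, N_ACTIONS))

def step(s, a):
    """One environment step: the transition ignores a, the reward does not."""
    s_next = rng.choice(N_STATES, p=P[s])
    return s_next, R[s, a]

# With action-independent transitions, the Bellman optimality backup appears
# to reduce to maximizing the immediate reward term only:
#   V(s) = max_a R(s, a) + gamma * sum_{s'} P[s, s'] V(s')
gamma = 0.9
V = np.zeros(N_STATES)
for _ in range(500):
    V = R.max(axis=1) + gamma * P @ V

print(V)
```

Running this, value iteration converges to a fixed point, which matches my observation that DRL methods also converge on this kind of environment.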