Training an RL model with TensorFlow over the entire output vector
I’m training a deep RL model with TensorFlow, but my model doesn’t have a single correct action. The output of the network is a vector [x1, x2], and both are actions that need to be optimized.
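To make the setup concrete, here is a minimal sketch of one possible reading of it, not a confirmed description of the model. It assumes x1 and x2 are continuous actions, that the network output is treated as the mean of an independent Gaussian per component, and that the standard deviation is fixed. Under those assumptions the log-probability of the whole vector is the sum of the per-component log-probabilities, so a single policy-gradient loss covers both actions. The function names, the fixed standard deviation, and the numbers are all hypothetical, and the sketch is plain Python rather than TensorFlow so the arithmetic is easy to follow.

```python
import math

def gaussian_log_prob(x, mean, std):
    """Log-density of a scalar x under a normal distribution N(mean, std^2)."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def joint_log_prob(action, means, stds):
    """Log-probability of the whole action vector.

    Assumes the components are independent, so the joint log-probability
    is the sum of the per-component log-probabilities.
    """
    return sum(gaussian_log_prob(a, m, s) for a, m, s in zip(action, means, stds))

def policy_gradient_loss(action, means, stds, advantage):
    """REINFORCE-style surrogate loss for one sample: -advantage * log pi(action)."""
    return -advantage * joint_log_prob(action, means, stds)

# Hypothetical values: `means` stands in for the network output [x1, x2],
# and the standard deviations are assumed fixed at 1.0.
means = [0.5, -1.0]
stds = [1.0, 1.0]
action = [0.5, -1.0]  # the action that was actually taken (here equal to the means)
loss = policy_gradient_loss(action, means, stds, advantage=2.0)
```

In TensorFlow the same idea would amount to summing the per-component log-probabilities over the action axis (for example with `tf.reduce_sum`) before multiplying by the advantage. If the two actions are discrete instead, the Gaussian would be replaced by one categorical distribution per action.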