PPO stable baselines 3

I am using a custom environment and a custom model for that environment. The goal is to train this custom model with reinforcement learning. I have defined my action space like this: self.action_space = gym.spaces.Box(low=-1, high=1, shape=(num_hands, ), dtype=np.float32). The custom model has a layer that normalizes its output, so the output looks like this: tensor([[-0.2728, 0.6061, 0.1210]]). However, when I print the action inside the environment, it prints as [ 0.08697755 -0.01496916 -0.0330451 ]. What am I doing wrong? I want the prediction from the custom model to be used as the action, without any clipping or rescaling.
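
One possible explanation, offered as an assumption rather than something the post confirms: for a Box action space, PPO in Stable Baselines3 uses a Gaussian policy, so during training the environment receives a sample drawn around the policy mean rather than the mean itself, and the sample is then clipped to the bounds of the action space. The sketch below is plain Python, not Stable Baselines3 code. The helper name sample_action, the standard deviation of 1.0, and the fixed seed are all hypothetical choices made only to illustrate the effect.

```python
import random


def sample_action(mean, std=1.0, deterministic=False, low=-1.0, high=1.0, rng=None):
    """Hypothetical helper: mimic a diagonal Gaussian policy followed by clipping.

    `mean` plays the role of the custom model's output. This is an
    illustration only, not the Stable Baselines3 implementation.
    """
    if rng is None:
        rng = random.Random()
    if deterministic:
        # Use the mean directly, with no exploration noise.
        raw = list(mean)
    else:
        # Add independent Gaussian noise to each action dimension.
        raw = [m + std * rng.gauss(0.0, 1.0) for m in mean]
    # Clip every component to the Box bounds [low, high].
    return [min(max(a, low), high) for a in raw]


# Values taken from the tensor printed in the question.
model_output = [-0.2728, 0.6061, 0.1210]

rng = random.Random(0)  # assumed seed, only for repeatability
stochastic = sample_action(model_output, rng=rng)
deterministic = sample_action(model_output, deterministic=True)

print("model output: ", model_output)
print("stochastic:   ", stochastic)
print("deterministic:", deterministic)
```

If this is what is happening, the deterministic action matches the model output while the stochastic one does not. In Stable Baselines3, model.predict(obs, deterministic=True) requests the non-sampled action at inference time, although that does not change what the environment sees during training.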