OK, so I’m pulling my hair out on this one.
I created a custom Gym environment to train an RL agent using CleanRL's implementation. I can’t figure out why the obs shape returned by reset doesn’t match what is expected.
The offending line is this one, from CleanRL on GitHub:
obs, _ = envs.reset(seed=args.seed)
envs.single_observation_space
Reference Gym
Box(-inf, inf, (11,), <class 'numpy.float32'>)
Custom Gym
Box(-1.0, 1.0, (998,), <class 'numpy.float32'>)
In the reset function of my custom environment I get:
observation = self.get_obs()
# observation.shape => (1,998)
What else should I be looking at? From everything I can tell, my observation shape is correct. Obviously it is not, but I can’t grok why.
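For anyone comparing the numbers above, here is a minimal NumPy-only sketch of the check I think matters: whether the array returned by reset has exactly the shape declared in the observation space. The names `space_shape`, `observation`, and `matches_space` are hypothetical stand-ins rather than anything from CleanRL or my environment, and the stacking step assumes the vector env combines per-env observations along a new first axis.

```python
import numpy as np

# Hypothetical stand-ins for the values quoted above:
# the declared space shape and what get_obs() returns.
space_shape = (998,)
observation = np.zeros((1, 998), dtype=np.float32)


def matches_space(obs, shape):
    """Return True when obs has exactly the declared per-env shape."""
    return obs.shape == shape


# (1, 998) is not the same shape as (998,), even though the size is equal.
print(matches_space(observation, space_shape))

# Dropping the extra leading axis makes the shapes agree.
flat = observation.reshape(space_shape)
print(matches_space(flat, space_shape))

# Assumption: a vector env stacks per-env observations along a new first axis,
# so an extra axis in each observation carries through to the batch.
num_envs = 4
batched_extra_axis = np.stack([observation] * num_envs)
batched_flat = np.stack([flat] * num_envs)
print(batched_extra_axis.shape, batched_flat.shape)
```

If that assumption holds, the batch comes out as (num_envs, 1, 998) instead of the (num_envs, 998) that a space of shape (998,) implies, which may be where the mismatch comes from.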