Hangman RL with DQN
I am trying to train a Hangman agent with the DQN algorithm from the “stable_baselines3” library, but at some point during training the model starts guessing the same letter over and over, and I cannot understand why. The environment is custom and I wrote it myself, so it may contain errors. Could you help me find the cause, or at least tell me whether the environment is written correctly?
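
To make the setup concrete, here is a minimal sketch of the kind of Hangman environment being described. It is not the actual environment from this post: the class name `HangmanEnvSketch`, the word list, the `max_wrong` limit, and all reward values are hypothetical choices made for illustration. It mirrors the Gymnasium-style `reset`/`step` return shapes without importing the library. Two details in it are worth comparing against a real environment, because their absence is one plausible cause of an agent repeating a single letter: the observation exposes which letters have already been guessed, and a repeated guess is penalized and counts toward the end of the episode.

```python
import random
import string


class HangmanEnvSketch:
    """Hypothetical Gymnasium-style Hangman environment (illustration only)."""

    def __init__(self, words, max_wrong=6, seed=None):
        self.words = list(words)
        self.max_wrong = max_wrong          # assumed limit on bad guesses
        self.rng = random.Random(seed)
        self.alphabet = string.ascii_lowercase

    def _obs(self):
        # 26-element mask of letters already guessed, plus the masked word,
        # so the agent can tell a fresh guess from a repeated one.
        guessed_mask = [int(c in self.guessed) for c in self.alphabet]
        masked = "".join(c if c in self.guessed else "_" for c in self.word)
        return {"guessed": guessed_mask, "masked": masked}

    def reset(self):
        self.word = self.rng.choice(self.words)
        self.guessed = set()
        self.wrong = 0
        return self._obs(), {}

    def step(self, action):
        letter = self.alphabet[action]      # action is an index in 0..25
        if letter in self.guessed:
            # Repeated guess: assumed penalty, and it counts as a bad guess
            # so that spamming one letter always ends the episode.
            self.wrong += 1
            reward = -1.0
        elif letter in self.word:
            self.guessed.add(letter)
            reward = 1.0                    # assumed reward for a correct letter
        else:
            self.guessed.add(letter)
            self.wrong += 1
            reward = -0.5                   # assumed penalty for a wrong letter
        won = all(c in self.guessed for c in self.word)
        lost = self.wrong >= self.max_wrong
        terminated = won or lost
        truncated = False
        return self._obs(), reward, terminated, truncated, {"won": won}
```

This sketch cannot be passed to “stable_baselines3” as written: a real environment would subclass `gymnasium.Env`, declare `action_space` and `observation_space`, and encode the masked word as a numeric array rather than a string.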