Is Adam more robust to learning rate schedulers than RMSProp?
I have been running experiments with learning rate schedulers and, interestingly, Adam improves slightly over a fixed initial learning rate. The same is not true of RMSProp, which underperforms its fixed-learning-rate baseline under the same schedulers. The context is reinforcement learning on Atari tasks, and my Adam results are in line with previous work. Is there a reason for this difference?
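For concreteness, here is a minimal sketch (plain Python on a toy quadratic, not my Atari setup) of the update-rule difference I am asking about: Adam is essentially RMSProp's second-moment scaling plus a first-moment EMA (momentum) with bias correction, so I would naively expect the two to react similarly to a learning rate schedule.

```python
import math

def run(optimizer, steps=200):
    """Minimize f(x) = x^2 under a toy step-decay schedule."""
    x, m, v = 5.0, 0.0, 0.0
    for t in range(1, steps + 1):
        lr = 0.1 if t <= 100 else 0.01  # step scheduler: decay at step 100
        g = 2.0 * x                     # gradient of x^2
        x, m, v = optimizer(x, g, m, v, t, lr)
    return x

def rmsprop(x, g, m, v, t, lr, beta2=0.99, eps=1e-8):
    # RMSProp: divide the raw gradient by an EMA of its magnitude,
    # so each step reacts immediately to the current gradient and lr
    v = beta2 * v + (1 - beta2) * g**2
    return x - lr * g / (math.sqrt(v) + eps), m, v

def adam(x, g, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam: same second-moment scaling, plus a first-moment EMA
    # (momentum) and bias correction; the update direction is a
    # smoothed average rather than the instantaneous gradient
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    return x - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

x_rms = run(rmsprop)
x_adam = run(adam)
print(x_rms, x_adam)
```

On this toy problem both converge, so the gap I see on Atari presumably comes from how the momentum term and bias correction interact with the schedule (and with RL's non-stationary gradients), which is what I am hoping someone can explain.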