I deployed some code changes to our production app on AWS ECS this Tuesday, and it immediately started causing timeouts and slowness. I quickly rolled back to the previous commit, but the issues continued.
We discovered that the DynamoDB table we use for sessions had reached its provisioned write limit. After raising the limit from 50 to 100, the application started working again until it hit the limit once more. We then switched the table to On-Demand and it’s been fine. The writes/sec are now spiking to 300.
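For context on those numbers, here is a minimal sketch of the capacity arithmetic as I understand it. It assumes standard (non-transactional) writes, where one write capacity unit covers one write per second of an item up to 1 KB and larger items round up to the next KB. The function names and the item sizes are hypothetical (I haven't measured our session items), and it ignores burst and adaptive capacity.

```python
import math


def wcu_per_write(item_size_bytes: int) -> int:
    """WCUs consumed by one standard write: 1 per 1 KB, rounded up."""
    return max(1, math.ceil(item_size_bytes / 1024))


def required_wcu(writes_per_sec: float, item_size_bytes: int) -> int:
    """Sustained WCUs needed for a given write rate and item size."""
    return math.ceil(writes_per_sec * wcu_per_write(item_size_bytes))


def exceeds_provisioned(writes_per_sec: float, item_size_bytes: int,
                        provisioned_wcu: int) -> bool:
    """True if the sustained demand is above the provisioned limit.

    Simplification: burst capacity is ignored, so short spikes may
    still succeed in practice.
    """
    return required_wcu(writes_per_sec, item_size_bytes) > provisioned_wcu


if __name__ == "__main__":
    # Hypothetical: 300 writes/sec of 1 KB session items vs. a limit of 100.
    print(required_wcu(300, 1024))
    print(exceeds_provisioned(300, 1024, 100))
    # Hypothetical: only 40 writes/sec, but session items grew to ~3 KB.
    print(required_wcu(40, 3000))
```

The second case is why I'm wondering about item size as well as request count: a larger session payload would raise consumed capacity even with the same number of users.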
The application is for internal business use only, so we didn’t suddenly get a large jump in traffic, and the problem started right after the deployment.
I am lost for ideas and was wondering if anyone else has come across anything similar or has ideas for things to check.