I have a large cluster of application processes (>5000) spread across multiple EC2 servers. When I restart them or deploy new code (which requires a restart), they all try to reconnect to the database at the same time. This causes the CPU on my RDS MySQL database to spike, and the server usually locks up. My current solution is to add sleeps between individual instance restarts, but this makes my deploys take a very long time.
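To make the trade-off concrete, here is a minimal sketch of the arithmetic behind a sleep-staggered restart. The function names, the batch size of 50, and the 1-second pause are hypothetical values chosen for illustration, not figures from the actual deployment.

```python
import math


def restart_offsets(n_processes, batch_size, pause_seconds):
    """Return the start offset (in seconds) of each process in a staggered restart.

    Processes are restarted in batches of `batch_size`, with `pause_seconds`
    of sleep between consecutive batches. All values are hypothetical.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [(i // batch_size) * pause_seconds for i in range(n_processes)]


def total_stagger_seconds(n_processes, batch_size, pause_seconds):
    """Return the total time spent sleeping between batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = math.ceil(n_processes / batch_size)
    # There is one pause between each pair of consecutive batches.
    return max(batches - 1, 0) * pause_seconds


# Assumed numbers: 5000 processes, 1 second of sleep between restarts.
one_by_one = total_stagger_seconds(5000, 1, 1.0)   # one process per batch
batched = total_stagger_seconds(5000, 50, 1.0)     # 50 processes per batch
print(one_by_one, batched)
```

Under these assumed numbers, restarting one process at a time spends well over an hour sleeping, while batches of 50 spend under two minutes but send 50 simultaneous reconnects to the database per batch. The batch size is the knob that trades deploy time against connection spikes.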
I would like to be able to restart everything as quickly as possible. One thought I had was to set up a proxy local to each application server that would hold the database connections open during the application restart. I haven't yet found a proxy that is reliable enough for production and doesn't add a lot of overhead, and I'm not sure whether this is the right path to continue down.
Does anyone have suggestions for how to support a cluster like this?