Kubernetes, Rails on Puma and Nginx Ingress Controller

Our infrastructure runs on Kubernetes with the following setup:

  • NGINX Ingress Controller as the gateway
  • Rails application running on the Puma application server
  • Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler scales the application pods based on CPU usage. We aim to keep average CPU utilization at 60%: when it rises above that level the application is scaled up, and when it falls below it the application is scaled down.
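For illustration, an autoscaler with a 60% CPU target could look roughly like the manifest below. This is only a sketch, not our exact configuration: the name rails-app, the target Deployment and the replica bounds are assumptions made for the example. Only the 60% target comes from the setup described above.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: rails-app              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rails-app            # hypothetical Deployment running Puma
  minReplicas: 2               # assumed lower bound
  maxReplicas: 10              # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale to keep average CPU around 60%
```

Utilization is measured against the CPU requests of the containers, so the pods need CPU requests defined for a target like this to work.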

The problem was that, when the application was scaled down, the NGINX Ingress Controller could still proxy a connection to a pod that was being removed due to low CPU usage. NGINX then logged one of the following errors:

  • connect() failed (111: Connection refused) while connecting to upstream
  • connect() failed (113: Host is unreachable) while connecting to upstream
  • upstream timed out (110: Operation timed out) while connecting to upstream

This affected around 1% of the traffic and was worst during traffic spikes, because they triggered a lot of scaling up and down.

The solution I found was to add a preStop hook to the Kubernetes pod lifecycle with the command sh -c "sleep 5". The hook delays the shutdown signal to Puma by 5 seconds. During that time the pod is already marked as terminating and is no longer treated as a ready endpoint, so the NGINX Ingress Controller stops proxying new requests to it, and the requests already in flight have those 5 seconds to finish before the application pod shuts down.
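As a rough sketch, the hook goes under the container's lifecycle section in the pod template. The Deployment name, labels, container name and image below are placeholders for the example. Only the preStop command is the one described above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rails-app                        # hypothetical name
spec:
  selector:
    matchLabels:
      app: rails-app                     # hypothetical label
  template:
    metadata:
      labels:
        app: rails-app
    spec:
      # Kubernetes default, shown for clarity: it has to cover the
      # 5 second sleep plus the time Puma needs to shut down.
      terminationGracePeriodSeconds: 30
      containers:
        - name: rails                    # hypothetical container name
          image: example/rails-app:latest   # placeholder image
          lifecycle:
            preStop:
              exec:
                # Runs before the container receives SIGTERM
                command: ["sh", "-c", "sleep 5"]
```

The image needs to contain sh and sleep for this to work. The sleep counts against the termination grace period, so the delay should stay well below it.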

Below you can see the change in errors after applying this fix. The blue bars are all the failed requests caused by the errors described above.

Graph showing the number of failed requests caused by the above-mentioned errors