This article assumes you already know how to set up and configure HAProxy: you know what terms such as frontend and backend mean. You should also know that traffic enters HAProxy through a frontend block or a listen block and that it is passed on to one or more backend servers based on Access Control Lists (ACLs).
Also important: when we say HAProxy, we’re talking about version 1.8.x!
One more thing
Timeouts are not the problem!
Remember that quote well! If you’re unsure what it means, it’s simple: when you hit a timeout, your first reaction should not be to increase it. Investigate why something takes so long that a timeout is hit somewhere, and fix the root cause. It could also be that you need to push long-running jobs to the background using queueing mechanisms or other techniques. If you do not take this to heart, the problem will come back and bite you really, really hard, especially if you want the application to scale properly.
A very basic config file
Let’s have a look at an extremely basic HAProxy config with a single frontend passing data on to a single backend:
global
    user haproxy
    group haproxy
    pidfile /var/run/haproxy-tep.pid
    stats socket /var/run/haproxy.stats
    maxconn 20480

defaults
    retries 3
    option redispatch
    timeout client 30s
    timeout connect 4s
    timeout server 30s

frontend www_frontend
    bind :80
    mode http
    default_backend www_backend

backend www_backend
    mode http
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s
You cannot go much more minimal without HAProxy spewing out warnings about missing timeouts etc. What this config does should be obvious:
- it accepts http traffic on port 80 for any IP address with a maximum of 20480 connections
- it will forward this traffic to an Apache 2.4 server running on a host with IP 192.168.0.1
- HAProxy will check if the Apache server is available every second and will consider it dead after 2 consecutive failed checks.
You will instantly notice three timeouts in this config that HAProxy will nag about when you do not set them:
timeout client <timeout>
timeout connect <timeout>
timeout server <timeout>
Let’s tackle these basic ones one by one.
The three basic HAProxy timeouts
timeout client <timeout>
Set the maximum inactivity time on the client side.
That’s what the manual says and that’s exactly what it is: when the client is expected to acknowledge or send data, this timeout is applied. In our example it was set to 30 seconds, so when the client doesn’t start sending or accepting (receiving) data within 30 seconds, the connection is closed.
timeout connect <timeout>
Set the maximum time to wait for a connection attempt to a server to succeed.
That’s quite important: it applies to the server, not the client! And it obviously only applies to the connection phase, not the transfer of data or anything else. With servers located in the same network, the connection time will be a few milliseconds. With a more complex topology, say cross-cloud connectivity, which is what DeltaBlue is an expert at, you need to allow a bit more time for this. Always stay within reasonable parameters, though. If you need to go higher than 4 seconds, you really have a different problem altogether (remember “Timeouts are not the problem!”).
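As a rough sketch of how this could look with both local and remote servers: keep a tight default and raise the connect timeout only for the backend that needs it. The backend name www_backend_remote, the server name and the address 203.0.113.10 are made up for illustration and do not come from the configuration above.

```haproxy
defaults
    mode http
    timeout client  30s
    timeout connect 1s      # tight default for servers on the local network
    timeout server  30s

# Hypothetical backend whose server lives in another cloud
backend www_backend_remote
    # Allow more time for the TCP handshake here, but stay at or below
    # the 4s ceiling discussed above
    timeout connect 4s
    server apache24_remote 203.0.113.10:8080 check fall 2 inter 1s
```

A value set in a backend block overrides the one from defaults for that backend only, so the local servers keep failing fast.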
timeout server <timeout>
Set the maximum inactivity time on the server side.
The exact same as the first timeout we looked at, but on the server side: when the server is expected to acknowledge or send data, this timeout is applied. In our example, this applies to the Apache 2.4 server that could be running a PHP application with its own timeouts. If that PHP application does not start sending HTTP headers (our frontend is running in HTTP mode) within 30 seconds, the client will receive a 504 Gateway Timeout error from HAProxy. So this timeout is all about the server’s processing time for the given request. Anything higher than 30 seconds should really be considered way too slow and again: you have a different problem (hint: it’s not the timeout).
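To find out which phase is actually slow before touching any timeout, HAProxy's HTTP log format can help. The fragment below is a minimal sketch; it assumes a local syslog daemon listening on /dev/log and the facility local0, neither of which appears in the configuration above.

```haproxy
global
    # Assumption: a local syslog daemon accepts messages on /dev/log
    log /dev/log local0

defaults
    mode http
    # Send logs to the target declared in the global section
    log global
    # Log each HTTP request with per-phase timers and a termination state
    option httplog
    timeout client  30s
    timeout connect 4s
    timeout server  30s
```

With option httplog on 1.8, each log line carries a timer field of the form TR/Tw/Tc/Tr/Ta (time to receive the request, time spent in the queue, time to connect to the server, server response time, total active time) plus a termination state. A state of sH, for instance, means the server timeout struck while waiting for response headers, cR means the client timed out before sending a complete request, and sQ means the request expired in the queue.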
HAProxy’s other timeouts that you really need
We’ll expand our basic config file a bit to look like this:
global
    user haproxy
    group haproxy
    pidfile /var/run/haproxy-tep.pid
    stats socket /var/run/haproxy.stats
    maxconn 20480

defaults
    retries 3
    option redispatch
    timeout client 30s
    timeout connect 4s
    timeout server 30s

    # Newly added timeouts
    timeout http-request 10s
    timeout http-keep-alive 2s
    timeout queue 5s
    timeout tunnel 2m
    timeout client-fin 1s
    timeout server-fin 1s

frontend www_frontend
    bind :80
    mode http
    default_backend www_backend

backend www_backend
    mode http
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s
So the next batch we will be looking at are these:
timeout http-request <timeout>
timeout http-keep-alive <timeout>
timeout queue <timeout>
timeout tunnel <timeout>
timeout client-fin <timeout>
timeout server-fin <timeout>
timeout http-request <timeout>
Set the maximum allowed time to wait for a complete HTTP request
A very easy and therefore popular attack is a Denial of Service (DoS) attack. A lot of timeout settings can help mitigate these and so can this one. When concerned about security, you will no doubt have heard about Slowloris. Not the animal, but the attack (named after the animal) 😉 This attack opens as many connections as possible and keeps them open in order to consume all available sockets, thereby denying other people access and effectively ‘closing down’ the host.
In HAProxy, use this parameter to limit the time frame in which a complete HTTP request can be sent, rendering attacks such as Slowloris largely ineffective. By separating this from the timeout client, you can do more fine-grained tweaking in complex setups.
As the article title suggested, we will be tuning for performance and security. This one actually does both, as it also keeps HAProxy clear of processing (too much) garbage so it can focus its resources on useful things.
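A possible way to apply this per frontend is sketched below. The 5s value is only an illustration. Per the HAProxy documentation, this timeout covers only the request headers by default; adding option http-buffer-request makes HAProxy wait for the request body as well (up to one buffer) and extends the timeout to cover it.

```haproxy
frontend www_frontend
    bind :80
    mode http
    # Give clients at most 5s to send the complete request headers
    timeout http-request 5s
    # Also wait for the request body (up to one buffer) before forwarding,
    # so that slow request bodies fall under the same timeout
    option http-buffer-request
    default_backend www_backend
```

When the timeout strikes, HAProxy answers with a 408 Request Timeout and closes the connection.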
timeout http-keep-alive <timeout>
Set the maximum allowed time to wait for a new HTTP request to appear
HTTP Keep-Alive, also referred to as a persistent connection, lets browsers reuse connections over HTTP/1.1 and so offers the end user faster page loads (HTTP/2 always uses a single connection per client).
Once the server has sent a response, this timeout kicks in; if a new request arrives within this time frame, the connection is reused. As soon as a new request comes in, the timeout http-request will take over!
If you do not set timeout http-keep-alive, the timeout http-request value will be used.
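The interplay between the two timeouts can be sketched as follows, using the values from the expanded configuration. The option http-keep-alive line is added here only to make the connection mode explicit.

```haproxy
defaults
    mode http
    # Keep idle client connections open between requests
    option http-keep-alive
    # A request that has started arriving must be complete within 10s
    timeout http-request    10s
    # Between a response and the next request, wait at most 2s
    timeout http-keep-alive 2s
```

So an idle connection is dropped after 2 seconds without a new request, while a request whose first byte has arrived gets the full 10 seconds to complete.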
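timeout queue <timeout>
Set the maximum time to wait in the queue for a connection slot to be free
So that’s quite clear: when a server has reached its maximum number of connections (the maxconn limit on its server line), further requests are queued for at most this amount of time. Note that the global maxconn of 20480 in this example caps the connections HAProxy accepts as a whole; it is not the limit that fills this queue. To keep performance optimal, you should set this timeout to prevent clients from being queued indefinitely.
If you do not set it, timeout connect will be used instead.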
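A minimal sketch of a backend where the queue can actually fill up, going by the HAProxy documentation's description that requests are queued once a server reaches its own maxconn. The per-server limit of 100 is an illustrative value and not part of the configuration above.

```haproxy
backend www_backend
    mode http
    # Requests that do not get a connection slot wait here for at most 5s
    timeout queue 5s
    # maxconn 100 is an assumed per-server limit, chosen for illustration
    server apache24_1 192.168.0.1:8080 maxconn 100 check fall 2 inter 1s
```

When a request has waited in the queue for the full 5 seconds, HAProxy drops it and returns a 503 to the client.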
timeout tunnel <timeout>
Set the maximum inactivity time on the client and server side for tunnels.
As our config only handles HTTP, this setting will be used when upgrading a connection to, say, a WebSocket.
Tunnels are usually long-lived connections, so keep timeouts higher but still reasonable. Also be sure to set the timeout client-fin parameter!
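One way to confine the long tunnel timeout to WebSocket traffic is sketched here. The ACL name is_websocket, the backend ws_backend, the server ws_app_1 with its address, and the 1h value are all hypothetical.

```haproxy
frontend www_frontend
    bind :80
    mode http
    timeout client     30s
    # Clean up half-closed client connections quickly, tunnels included
    timeout client-fin 1s
    # Detect WebSocket upgrade requests by their Upgrade header
    acl is_websocket hdr(Upgrade) -i websocket
    use_backend ws_backend if is_websocket
    default_backend www_backend

# Hypothetical backend dedicated to WebSocket traffic
backend ws_backend
    mode http
    timeout connect 4s
    timeout server  30s   # applies until the upgrade has completed
    timeout tunnel  1h    # takes over once the connection is a tunnel
    server ws_app_1 192.168.0.2:8081 check fall 2 inter 1s
```

Regular HTTP requests keep the short timeouts, and only upgraded connections are allowed to sit idle for longer.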
timeout client-fin <timeout>
Set the inactivity timeout on the client side for half-closed connections.
This timeout starts ticking when the client disappears suddenly while it was still expected to acknowledge or send data. This can happen for various reasons: networking issues, buggy clients, …
In order for these semi-closed connections to be cleaned up swiftly, you should keep this timeout short so that you do not end up with a huge list of FIN_WAIT connections flooding the server. When the client is gone, it’s gone. It’ll reconnect when it needs to.
timeout server-fin <timeout>
Set the inactivity timeout on the server side for half-closed connections.
Exactly the same as the client side version, but for the server side. In cloud environments where you would have several servers per backend block, closing these wonky connections swiftly will make HAProxy switch to a ‘working’ server faster to keep ‘downtime’ to an absolute minimum.
How about hosting some legacy applications?
Ok, let’s tackle the inevitable question:
“But I have some old applications that I just can’t migrate to a modern framework because no-one wants to invest in them anymore yet they’re still being actively used! I need higher timeouts!”
Sad but true, there are lots of those out there, but HAProxy can solve all of this legacy stuff for you if you harness its power properly.
Don’t be tempted to mindlessly raise timeouts as they will get exploited at some point in time!
So we’ll do it as well as we can (can’t say properly, as that would mean adjusting the application itself, which we weren’t going to do) and adjust our configuration to allow an admin area with higher timeouts:
global
    user haproxy
    group haproxy
    pidfile /var/run/haproxy-tep.pid
    stats socket /var/run/haproxy.stats
    maxconn 20480

defaults
    retries 3
    option redispatch
    timeout client 30s
    timeout connect 4s

    # Newly added timeouts
    timeout http-request 10s
    timeout http-keep-alive 2s
    timeout queue 5s
    timeout tunnel 2m
    timeout client-fin 1s
    timeout server-fin 1s

frontend www_frontend
    bind :80
    mode http
    acl is_path_admin path_beg /admin
    use_backend www_backend_slow_pool if is_path_admin
    default_backend www_backend

backend www_backend
    mode http
    timeout server 30s
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s

backend www_backend_slow_pool
    mode http
    timeout server 3600s
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s
So what happened?
- The timeout server declaration has been moved to the backend blocks.
- An ACL was added that detects if the requested path begins with /admin.
- A use_backend statement was added to route traffic to a slow pool backend if the path ACL matches.
- A slow pool backend was added with a (ridiculously) high timeout to prevent HAProxy from throwing a 504 Gateway Timeout because of slow server responses.
These timeouts will obviously only affect HAProxy: the server behind the slow pool backend must be set up with its own suitably high timeouts on various levels so that it won’t time out for the admin area. If that happened, HAProxy would return a 503 Service Unavailable.
Preventing high timeout abuse/exploitation
You can go further and set up a separate instance that will only handle the slow requests and give it, say, an admin.example.com domain. You can add an ACL to route all traffic for that domain to the slow pool and even add some IP locking, basic auth protection etc. to prevent abuse of the high timeouts, because that is a valid concern indeed.
Offloading this traffic to a separate instance will also make sure regular users are not impacted due to lots of slow processes on the server eating up the connection or worker pool.
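The ACL part of this idea could look roughly like the sketch below. The user list admin_users, its user and password, and the address range 203.0.113.0/24 are placeholders invented for illustration; the backend names are the ones from the configuration above. Whether this lives on a separate instance or on the same one is a deployment choice the fragment does not cover.

```haproxy
# Hypothetical user list for basic auth; replace the password before use
userlist admin_users
    user admin insecure-password change-me

frontend www_frontend
    bind :80
    mode http
    acl is_admin_host    hdr(host) -i admin.example.com
    # Assumption: 203.0.113.0/24 stands in for your office or VPN range
    acl is_trusted_ip    src 203.0.113.0/24
    acl is_authenticated http_auth(admin_users)

    # IP locking: reject admin requests coming from anywhere else
    http-request deny if is_admin_host !is_trusted_ip
    # Basic auth: challenge admin requests that lack valid credentials
    http-request auth realm Admin if is_admin_host !is_authenticated

    use_backend www_backend_slow_pool if is_admin_host
    default_backend www_backend
```

Only requests that pass both checks reach the backend with the one-hour server timeout. Note that basic auth over plain HTTP sends credentials unencrypted, so in practice this would sit behind a TLS-enabled bind.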