Preface
This article assumes you already know how to set up and configure HAProxy: you know what the terms proxies, listen block, frontend and backend mean. You should also know that traffic enters HAProxy through a frontend block or a listen block and that it is passed on to one or more backend servers based on Access Control Lists (ACLs).
Note that when we say HAProxy, we’re talking about version 1.8.x!
One more thing
Timeouts are not the problem!
Remember that quote well! If you’re unsure as to what it means, it’s simple: when hitting a timeout, your first reaction should not be to increase the timeout. You should investigate why something is taking so long that you hit a timeout somewhere and fix the root cause. It could also be that you need to push long running jobs to the background using queueing mechanisms or other techniques. If you do not take this to heart, the problem will come back and bite you really, really hard, especially if you want the application to scale properly.
A very basic config file
Let’s have a look at an extremely basic HAProxy config with a single frontend passing data on to a single backend:
global
    user haproxy
    group haproxy
    pidfile /var/run/haproxy-tep.pid
    stats socket /var/run/haproxy.stats
    maxconn 20480

defaults
    retries 3
    option redispatch
    timeout client 30s
    timeout connect 4s
    timeout server 30s

frontend www_frontend
    bind :80
    mode http
    default_backend www_backend

backend www_backend
    mode http
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s
You cannot go much more minimal without HAProxy spewing out warnings about missing timeouts etc. What this config does should be obvious:
- it accepts http traffic on port 80 for any IP address with a maximum of 20480 connections.
- it will forward this traffic to an Apache 2.4 server running on a host with IP 192.168.0.1 and port 8080.
- HAProxy will check if the Apache server is available every second and will consider it dead after 2 consecutive failed checks.
You will instantly notice three timeouts in this config that HAProxy will nag about when you do not set them:
timeout client <timeout>
timeout connect <timeout>
timeout server <timeout>
Let’s tackle these basic ones one by one.
The three basic HAProxy timeouts
timeout client
Set the maximum inactivity time on the client side.
That’s what the manual says and that’s exactly what it is: when the client is expected to acknowledge or send data, this timeout is applied. In our example it was set to 30 seconds, so when the client doesn’t start sending or accepting (receiving) data within 30 seconds, the connection is closed.
timeout connect
Set the maximum time to wait for a connection attempt to a server to succeed.
That’s quite important: it applies to the server, not the client! And it obviously only applies to the connection phase, not the transfer of data or anything else. With servers located in the same network, the connection time will be a few milliseconds. With a more complex topology, say the cross-cloud connectivity that DeltaBlue specializes in, you need to allow a bit more time for this. Always stay within reasonable parameters, though. If you need to go higher than 4 seconds, you really have a different problem altogether (remember “Timeouts are not the problem!“).
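Note that timeout connect interacts with the retries setting from our defaults section: on a failed connection attempt, HAProxy retries up to retries times, with (per the HAProxy docs) a turn-around delay of min(timeout connect, 1 second) between attempts. A rough sketch of the worst-case connect budget for our example config (4s connect timeout, 3 retries):

```python
# Rough worst-case time before HAProxy gives up connecting to a server,
# assuming every attempt exhausts "timeout connect" and a turn-around delay
# of min(timeout connect, 1s) is applied before each retry.

def worst_case_connect_time(timeout_connect_s: float, retries: int) -> float:
    attempts = retries + 1                    # the initial attempt plus retries
    turnaround = min(timeout_connect_s, 1.0)  # delay before each retry
    return attempts * timeout_connect_s + retries * turnaround

# Our example config: timeout connect 4s, retries 3
print(worst_case_connect_time(4.0, 3))  # 19.0 seconds
```

So a seemingly modest 4-second connect timeout can already add up to almost 20 seconds before a client sees an error, which is another reason not to raise it casually.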
timeout server
Set the maximum inactivity time on the server side.
The exact same as the first timeout we looked at, but on the server side: when the server is expected to acknowledge or send data, this timeout is applied. In our example, this applies to the Apache 2.4 server that could be running a PHP application with its own timeouts. If that PHP application does not start sending HTTP headers (our frontend is running in HTTP mode) within 30 seconds, the client will receive a 504 Gateway Timeout error from HAProxy.
So this timeout is all about the server’s processing time for the given request. Anything higher than 30 seconds should really be considered way too slow and again: you have a different problem (hint: it’s not the timeout).
HAProxy’s other timeouts that you really need
We’ll expand our basic config file a bit to look like this:
global
    user haproxy
    group haproxy
    pidfile /var/run/haproxy-tep.pid
    stats socket /var/run/haproxy.stats
    maxconn 20480

defaults
    retries 3
    option redispatch
    timeout client 30s
    timeout connect 4s
    timeout server 30s
    # Newly added timeouts
    timeout http-request 10s
    timeout http-keep-alive 2s
    timeout queue 5s
    timeout tunnel 2m
    timeout client-fin 1s
    timeout server-fin 1s

frontend www_frontend
    bind :80
    mode http
    default_backend www_backend

backend www_backend
    mode http
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s
So the next batch we will be looking at are these:
timeout http-request <timeout>
timeout http-keep-alive <timeout>
timeout queue <timeout>
timeout tunnel <timeout>
timeout client-fin <timeout>
timeout server-fin <timeout>
timeout http-request
Set the maximum allowed time to wait for a complete HTTP request
A very easy and therefore popular attack is a Denial of Service (DoS) attack. A lot of timeout settings can help mitigate these, and so can this one. When concerned about security, you will no doubt have heard about Slowloris. Not the animal, but the attack (named after the animal) ;-) This attack opens as many connections as possible and keeps them open in order to consume all available sockets, thereby denying other people access and effectively ‘closing down’ the host.
In HAProxy, use this parameter to limit the time frame in which a complete HTTP request must be sent, rendering attacks such as Slowloris largely ineffective. By separating this from timeout client, you can do more fine-grained tweaking in complex setups.
As the article title suggested, we will be tuning for both performance and security. This parameter actually does both, as it also keeps HAProxy from processing (too much) garbage so it can direct its resources to useful things.
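Since timeout http-request can also be set per frontend, you could be stricter on a public-facing frontend than in your defaults. A sketch (the 5-second value is illustrative, not a recommendation for every workload):

```
frontend www_frontend
    bind :80
    mode http
    # A complete request (request line + all headers) must arrive within
    # 5 seconds, or HAProxy drops the connection with a 408 Request Timeout.
    timeout http-request 5s
    default_backend www_backend
```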
timeout http-keep-alive
Set the maximum allowed time to wait for a new HTTP request to appear
HTTP Keep-Alive is also referred to as a persistent connection allowing browsers to work more efficiently with connections and offering a faster end user experience in page loading using HTTP/1.1 (HTTP/2 always uses a single connection per client).
Say you have an HTML page loading CSS, JavaScript, images and other assets, using a persistent connection will be much faster, as a single connection can be reused to send the data. The overhead of recreating a connection for each asset is gone.
When the server sends a response, this timeout kicks in, and when a new request is received within this time frame, the connection is reused. As soon as a new request comes in, timeout http-request will take over!
If you do not set timeout http-keep-alive, the timeout http-request value will be used.
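The interplay between the two timeouts can be sketched with the values from our example config:

```
defaults
    # After a response is sent: wait at most 2s for the next request to start
    timeout http-keep-alive 2s
    # Once a request starts arriving: it must be complete within 10s
    timeout http-request 10s
```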
timeout queue
Set the maximum time to wait in the queue for a connection slot to be free
So that’s quite clear: when the maximum connections (of 20480 in this example) are reached, the requests will be queued for this amount of time. To keep performance optimal, you should set this timeout to prevent clients from being queued indefinitely.
If you do not set it, timeout connect will be used instead.
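This timeout becomes especially relevant when you cap concurrent connections per server with the maxconn server parameter, since excess requests then wait in the queue. A sketch (the maxconn value of 100 is illustrative):

```
backend www_backend
    mode http
    # Queued requests wait at most 5 seconds for a free connection slot;
    # after that, the client receives a 503.
    timeout queue 5s
    # At most 100 concurrent connections to this server; the rest queue up
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s maxconn 100
```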
timeout tunnel
Set the maximum inactivity time on the client and server side for tunnels.
As our config only handles HTTP, this setting will be used when upgrading a connection to, say, a WebSocket. Tunnels are usually long-lived connections, so keep timeouts higher but still reasonable.
Also be sure to set the timeout client-fin parameter!
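A dedicated WebSocket backend could, for example, combine a longer tunnel timeout with a short client-fin timeout. A sketch (the /ws path, the backend name and port 8081 are made up for illustration):

```
frontend www_frontend
    bind :80
    mode http
    acl is_websocket path_beg /ws
    use_backend ws_backend if is_websocket
    default_backend www_backend

backend ws_backend
    mode http
    # Once the connection is upgraded, only the tunnel timeout applies
    timeout tunnel 2m
    # Clean up half-closed client connections quickly
    timeout client-fin 1s
    server ws_1 192.168.0.1:8081 check
```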
timeout client-fin
Set the inactivity timeout on the client side for half-closed connections.
This timeout starts ticking when the client disappears suddenly while it was still expected to acknowledge or send data. This can happen for various reasons: networking issues, buggy clients, …
In order for these half-closed connections to be cleaned up swiftly, you should keep this timeout short so that you do not end up with a huge list of FIN_WAIT connections flooding the server. When the client is gone, it’s gone. It’ll reconnect when it needs to.
timeout server-fin
Set the inactivity timeout on the server side for half-closed connections.
Exactly the same as the client-side version, but for the server side. In cloud environments where you would have several servers per backend block, closing these wonky connections swiftly will make HAProxy switch to a ‘working’ server faster to keep ‘downtime’ to an absolute minimum.
How about hosting some legacy applications?
Ok, let’s tackle the inevitable question:
“But I have some old applications that I just can’t migrate to a modern framework because no-one wants to invest in them anymore yet they’re still being actively used! I need higher timeouts!”
Sad but true, there are lots of those out there, but HAProxy can solve all of this legacy stuff for you if you harness its power properly.
Don’t be tempted to mindlessly raise timeouts as they will get exploited at some point in time!
So we’ll do it as well as we can (we can’t say properly, as that would mean adjusting the application itself, which we weren’t going to do) and adjust our configuration to allow an admin area with higher timeouts:
global
    user haproxy
    group haproxy
    pidfile /var/run/haproxy-tep.pid
    stats socket /var/run/haproxy.stats
    maxconn 20480

defaults
    retries 3
    option redispatch
    timeout client 30s
    timeout connect 4s
    # Newly added timeouts
    timeout http-request 10s
    timeout http-keep-alive 2s
    timeout queue 5s
    timeout tunnel 2m
    timeout client-fin 1s
    timeout server-fin 1s

frontend www_frontend
    bind :80
    mode http
    acl is_path_admin path_beg /admin
    use_backend www_backend_slow_pool if is_path_admin
    default_backend www_backend

backend www_backend
    mode http
    timeout server 30s
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s

backend www_backend_slow_pool
    mode http
    timeout server 3600s
    server apache24_1 192.168.0.1:8080 check fall 2 inter 1s
So what happened?
- The timeout server declaration has been moved to the backend blocks.
- An ACL was added that detects if the requested path begins with /admin.
- A use_backend statement was added to route traffic to a slow pool backend if the path ACL matches.
- A slow pool backend was added with a (ridiculously) high timeout to prevent HAProxy from throwing a 504 Gateway Timeout because of slow server responses.
These timeouts will obviously only affect HAProxy: the server behind the slow pool backend must be set up with its own proper high timeouts on various levels so that it won’t time out for the admin area. If that were to happen, the server would abort the connection mid-request and HAProxy would return a 502 Bad Gateway.
Preventing high timeout abuse/exploitation
You can go further and set up a separate instance that will only handle the slow requests and give it, say, an admin.example.com domain. You can add an ACL to route all traffic for that domain to the slow pool and even add some IP locking, basic auth protection etc. to prevent abuse of the high timeouts, because that is a valid concern indeed.
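Such a setup could look roughly like this (the admin.example.com domain, the 10.0.0.0/8 source network and the userlist credentials are all placeholders for illustration):

```
userlist admin_users
    # Placeholder credentials; use a hashed password in production
    user admin insecure-password changeme

frontend admin_frontend
    bind :80
    mode http
    # Only accept the admin domain on this instance
    acl is_admin_host hdr(host) -i admin.example.com
    # Example IP lock: an internal network only
    acl network_allowed src 10.0.0.0/8
    http-request deny if !is_admin_host or !network_allowed
    # Require basic auth on top of the IP lock
    acl auth_ok http_auth(admin_users)
    http-request auth realm Admin if !auth_ok
    default_backend www_backend_slow_pool
```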
Offloading this traffic to a separate instance will also make sure regular users are not impacted due to lots of slow processes on the server eating up the connection or worker pool.