Basic Varnish configuration with HTTP caching and stale-while-revalidate

In high-demand environments we can reach a point where the number of PHP (or CGI) requests we want to serve through Apache httpd is higher than our servers can handle. The simplest fix is to add more servers, lowering the load on each one (the requests are spread across more machines). But the simplest way isn't necessarily the most efficient one. Instead of distributing the load, can we make our servers handle more requests?

Of course. We can speed up PHP (or CGI in general) processing with FastCGI. We can also make our HTTP server faster by swapping it for a lighter one, nginx for instance. Or we can approach the problem from another angle, which is the one we will discuss here: keeping a cache where we store content instead of generating it on every request, saving CPU time and speeding up responses. We will do that using Varnish.

Maintaining a cache is a delicate matter because there are many things to watch out for. You shouldn't cache a page if cookies are involved, for instance, or if the HTTP request is a POST. But all of this is application-specific: developers should be able to say what is safe to cache and what is not, and sysadmins should carry those decisions to the servers. So we will assume we start from scratch, with nothing in the Varnish cache, and begin with one particular URL that we know is safe to cache. That is what we will do here: cache just one URL.
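As an illustration of that kind of rule (just a sketch, not something we will use in this article; we build our own rules step by step below), a vcl_recv along these lines would bypass the cache for POST requests and for any request carrying a cookie:

sub vcl_recv {
    # Never cache POST requests or requests that carry cookies.
    if (req.request == "POST" || req.http.Cookie) {
        return(pass);
    }
}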

For our tests we will use a simple PHP file. It takes 10 seconds to return its result, and it sends a header that makes the response expire after 5 seconds. We will name it sleep.php.
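Something as simple as this will do (a minimal sketch: it just sleeps for 10 seconds and sends the 5-second Cache-control header we will see in the responses below):

<?php
// Simulate an expensive page: the response takes 10 seconds to build
// and is only considered fresh for 5 seconds.
header("Cache-control: max-age=5, must-revalidate");
header("Content-Type: text/html");
sleep(10);
echo "slept 10 seconds\n";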

If we query it, we can check that it does take 10 seconds to return:

$ curl http://localhost/sleep.php -w %{time_total}
10,001

The first thing to do is to install Varnish with our package manager (apt-get install varnish, yum install varnish, whatever applies). After that we want Varnish listening on port 80 instead of Apache, so we move Apache to port 8080, for instance (the Listen directive), and set Varnish to 80 (the VARNISH_LISTEN_PORT variable, usually in /etc/default/varnish or /etc/sysconfig/varnish, depending on your distro). We also need to tell Varnish which servers it has behind it to forward requests to (the backend servers). For that we create the file /etc/varnish/default.vcl with the following contents:


backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
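For reference, the port changes mentioned above look roughly like this (paths and variable names vary by distribution, so treat this as orientation only):

# Apache: listen on 8080 instead of 80
# (on Debian/Ubuntu the Listen directive lives in /etc/apache2/ports.conf)
Listen 8080

# Varnish: listen on 80
# (/etc/default/varnish on Debian/Ubuntu, /etc/sysconfig/varnish on Red Hat;
#  some default files set the port directly in DAEMON_OPTS with "-a :80")
VARNISH_LISTEN_PORT=80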

With all this in place we restart Apache and Varnish, and then check that both are answering.
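On a Debian/Ubuntu system the restart would be something like this (init script names may differ on other distros):

$ sudo /etc/init.d/apache2 restart
$ sudo /etc/init.d/varnish restart

And the check, against both the Varnish port (80) and the Apache one (8080):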


$ curl http://localhost/sleep.php -IXGET
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Fri, 30 Nov 2012 13:56:33 GMT
X-Varnish: 1538615861
Age: 0
Via: 1.1 varnish
Connection: keep-alive

$ curl http://localhost:8080/sleep.php -IXGET
HTTP/1.1 200 OK
Date: Fri, 30 Nov 2012 13:56:59 GMT
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Length: 0
Content-Type: text/html

We can see different headers in each response. When we query Varnish there are "Via: 1.1 varnish" and "Age: 0", among others that Apache doesn't show. With everything like this, we have our baseline.

The default behaviour is to cache everything:

$ curl http://localhost/sleep.php -w %{time_total}
10,002
$ curl http://localhost/sleep.php -w %{time_total}
0,001

But we don't want to cache everything, just one particular URL, staying away from cookies and the like. So we will first change sub vcl_recv to cache nothing at all, adding this to the file /etc/varnish/default.vcl:

sub vcl_recv {
    return(pass);
}

We check it:

$ curl http://localhost/sleep.php -w %{time_total}
10,002
$ curl http://localhost/sleep.php -w %{time_total}
10,001

Now we cache just sleep.php, changing vcl_recv in default.vcl to:

sub vcl_recv {
    if (req.url == "/sleep.php") {
        return(lookup);
    } else {
        return(pass);
    }
}

We can check it (copying the file to sleep2.php gives us an identical but uncached URL to compare against):

$ cp /var/www/sleep.php /var/www/sleep2.php
$ curl http://localhost/sleep.php -w %{time_total}
10,002
$ curl http://localhost/sleep.php -w %{time_total}
0,001
$ curl http://localhost/sleep2.php -w %{time_total}
10,002
$ curl http://localhost/sleep2.php -w %{time_total}
10,001

We also check that the "Age:" header keeps increasing and that, when it reaches 5 (the max-age we set), the request takes 10 seconds again:

$ curl http://localhost/sleep.php -IXGET -w %{time_total}
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Mon, 03 Dec 2012 10:53:54 GMT
X-Varnish: 500945303
Age: 0
Via: 1.1 varnish
Connection: keep-alive

10,002
$ curl http://localhost/sleep.php -IXGET -w %{time_total}
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Mon, 03 Dec 2012 10:53:56 GMT
X-Varnish: 500945305 500945303
Age: 2
Via: 1.1 varnish
Connection: keep-alive

0,001
$ curl http://localhost/sleep.php -IXGET -w %{time_total}
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Mon, 03 Dec 2012 10:53:59 GMT
X-Varnish: 500945309 500945303
Age: 5
Via: 1.1 varnish
Connection: keep-alive

0,001
$ curl http://localhost/sleep.php -IXGET -w %{time_total}
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Mon, 03 Dec 2012 10:54:09 GMT
X-Varnish: 500945310
Age: 0
Via: 1.1 varnish
Connection: keep-alive

10,002

We can see that when the content expires, Varnish asks for it again and the request takes 10 seconds. But what happens during that time? Do the rest of the requests have to wait too? No, they don't. There is a 10-second grace period, and during it Varnish keeps serving the old (stale) content. We can check it by running two curls at the same time (for instance, the loop below in two terminals): one of them will stall while the other keeps getting fast responses, with an "Age" header above the 5 seconds we set:

$ while :;do curl http://localhost/sleep.php -IXGET;sleep 1;done
(...)
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Mon, 03 Dec 2012 11:16:29 GMT
X-Varnish: 500952300 500952287
Age: 8
Via: 1.1 varnish
Connection: keep-alive

We can also check it with siege, with two concurrent users: for a while we will see responses from only one of the threads, while the other is stopped, waiting for the content:

$ siege -t 30s -c 2 -d 1 localhost/sleep.php

If we think 10 seconds is too short, we can change it with the beresp.grace variable, inside sub vcl_fetch in the default.vcl file. We can set one minute, for instance:

sub vcl_fetch {
    set beresp.grace = 60s;
}

What if the backend server is down? Will Varnish keep serving stale content? Not as we have it right now: Varnish has no way of knowing whether a backend server is healthy, so it considers every server healthy. If the server is down and the content has expired, it returns a 503 error:

$ sudo /etc/init.d/apache2 stop
[sudo] password:
* Stopping web server apache2 apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
... waiting [OK]
$ sudo /etc/init.d/apache2 status
Apache2 is NOT running.
$ while :;do curl http://localhost/sleep.php -IXGET;sleep 1;done
(...)
HTTP/1.1 200 OK
Server: Apache/2.2.22 (Ubuntu)
X-Powered-By: PHP/5.3.10-1ubuntu3.4
Cache-control: max-age=5, must-revalidate
Vary: Accept-Encoding
Content-Type: text/html
Transfer-Encoding: chunked
Date: Fri, 30 Nov 2012 14:19:15 GMT
X-Varnish: 1538616905 1538616860
Age: 5
Via: 1.1 varnish
Connection: keep-alive

HTTP/1.1 503 Service Unavailable
Server: Varnish
Content-Type: text/html; charset=utf-8
Retry-After: 5
Content-Length: 419
Accept-Ranges: bytes
Date: Fri, 30 Nov 2012 14:19:15 GMT
X-Varnish: 1538616906
Age: 0
Via: 1.1 varnish
Connection: close

To make the grace period apply in this situation too, we just need to tell Varnish how to check whether Apache is up or down (healthy), by setting a "probe" in the backend definition:

backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .timeout = 100ms;
        .interval = 1s;
        .window = 10;
        .threshold = 8;
    }
}

This way Varnish keeps serving stale content while the backend is down, and it will keep doing so until the backend comes back up and Varnish can fetch the content again.

Testing with siege and curl, we can see there is always one request that gets "sacrificed". The first time Varnish finds an expired object, it asks the backend for it and waits for the answer; meanwhile, the rest of the requests get the stale content, but that first one is stuck. The same thing happens when the server is down. There is a lot of literature on trying to avoid this, and you can read plenty about it, but the bottom line is that there is no way around it: one request has to be sacrificed.

So far we have covered two scenarios where we keep serving stale content:
– There is no backend server available, so we serve stale content.
– There are backends available and one request has asked for fresh content. While that content is coming from the backend, Varnish keeps serving stale content to the rest of the requests.

What if we want these two scenarios to have different timeouts? For instance, we might want to stop serving stale content after a certain time (it could be minutes), after which we wait for the backend's answer and force the content to be fresh. But at the same time we might want to keep serving stale content while the servers are down (when there is no way to get fresh content), because that is usually better than serving a 503 error page. This can be configured in sub vcl_recv and sub vcl_fetch in the default.vcl file, like this:

sub vcl_recv {
    if (req.backend.healthy) {
        set req.grace = 30s;
    } else {
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    set beresp.grace = 1h;
}

So our complete default.vcl file will have the following contents:

$ cat /etc/varnish/default.vcl
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .timeout = 100ms;
        .interval = 1s;
        .window = 10;
        .threshold = 8;
    }
}

sub vcl_recv {
    if (req.backend.healthy) {
        set req.grace = 30s;
    } else {
        set req.grace = 1h;
    }
    if (req.url == "/sleep.php") {
        return(lookup);
    } else {
        return(pass);
    }
}

sub vcl_fetch {
    set beresp.grace = 1h;
}