In this article I will be talking about caching in the context of the HTTP specs and REST (which is heavily linked with HTTP semantics). First I will cover the specifics covered in HTTP 1.0, then the changes made in HTTP 1.1, then conclude with how we can clean up some of the examples from the previous sections.
Table of Contents
Requirements
Tools you will need to follow along.
- cURL - A command line tool that I use as a REST client. You may use a different client, but any other tools usages will be outside the scope of this article.
- Maven - To build and run the app with the jetty-maven-plugin
You download or clone the app from Github
git clone https://github.com/psamsotha/jersey-cache-example.git
Or download it from the Github project site
Caching in HTTP 1.0
In HTTP 1.0 the server sends an Expires: <expiration-data>
header with the response. So a response might look something like
HTTP/1.0 200 OK
Date: Sun, 18 Oct 2015, 09:15:25 GMT
Content-Type: text/plain
Content-Length: 100
Expires: Sun, 18 Oct 2015, 10:15:25 GMT
<body content>
Here the expiration data has been set to one hour from the time the response was sent out.
Expires example
In JAX-RS, we would simply to something like
@GET
@Produces("text/plain")
public Response get() {
long current = new Date().getTime();
Date expires = new Date(current + (10 * 1_000));
return Response.ok("Some Data").expires(expires).build();
}
There’s nothing really interesting goind on here. We are calling expires
on the ResponseBuilder
which sets the Exires
to ten seconds from the time of request. If we make the cURL request, we can see the header.
C:\>curl -v http://localhost:8080/api/expires
> GET /api/expires HTTP/1.1
> User-Agent: curl/7.40.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sun, 18 Oct 2015 03:34:50 GMT
< Expires: Sun, 18 Oct 2015 03:35:00 GMT
< Content-Type: text/plain
< Content-Length: 9
< Server: Jetty(9.2.6.v20141205)
<
Some Data
cURL doesn’t support any caching features, so trying to make another call won’t respect the Expires
header, and another request will be made.
Caching in HTTP 1.1
Then came HTTP 1.1, which not only create new temporal semantics, but also added revalidation semantics to caching. Take for example the caching the Expires
example above. Say the expiration time is 10 seconds. A request made after that ten seconds will always get back new data from the server. With revalidation the server sends a condition request, saying “if the data has not been modified since [date]” and/or “if data still matches the data in my cache”, don’t send back the data. This helps in bandwidth usage, as the data is not always sent back, as would be the case with the old Expires semantics. Here’s a flow chart of basics of the process
Temporal caching
When I say “temporal”, I am talking about time. As discussed, the old spec used the Expires
header to set the expiration date on the data. The problem with this is that it is an absolute time that is dependent on system clock being set correctly. To fix this problem, the Cache-Control
header was introduced. Instead of an absolute date, it uses a relative time in the form of max-age
. For example a response might look like
HTTP/1.1 200 OK
Date: Sun, 18 Oct 2015 03:47:18 GMT
Cache-Control: max-age=10
Content-Type: text/plain
Content-Length: 9
Server: Jetty(9.2.6.v20141205)
Some Data
Here the Cache-Control
header is saying that the data is only good in the cache for 10 second, after which a supporting client should get new data.
Cache-Control Example
Here’s how we could code it in JAX-RS
@GET
@Produces("text/plain")
public Response get() {
CacheControl cc = new CacheControl();
cc.setMaxAge(10);
return Response.ok("Some Data").cacheControl(cc).build();
}
This example is from CacheControlResource
in the Github project. We are setting the max-age
to 10 seconds. So let’s test it out. From the command line, cd
to the project root, then run the project with the Maven Jetty plugin (You will need to have Maven installed, or you can deploy the project to any servlet container).
cd <to> jersey-cache-example
mvn jetty:run
You should see the embedded Jetty server start. Wait until you see [INFO] Started Jetty Server
. That’s when you know the server startup is complete.
Instead of testing with cURL, we will use the browser. As I mentioned above cURL doesn’t offer any caching support. For testing with the browser, I often make use of some developer tool. In Firefox, there should be able to go to Tools
→Web Developer
→Web Console
to see the below console, or with Chrome there is also a dev tool installed. Just hit F12 to open it.
This what we are going to do. First make a request, and see the request being sent to the server. Then make another immediate request, and see that the browser doesn’t make a request, but give’s use the cache value. And lastly, wait 10 seconds and make another request, and see the request being sent to the server.
When you first start up Firefox Web Console, you will see a window like below. Hit the Network tab. Which will show networks calls.
Now in the URL bar, type in the URL http://localhost:8080/api/cc
and hit Enter. You will can tell that a request is made, from the green dot on the left, as well as the request time in milliseconds on the right
Most likely 10 seconds has passed. So highlight the URL bar then make a request again by hitting Enter, but make a second request but highlighting and hitting enter again. This time what you should have seen if first the same screen as above, as a request is made, but the second time you hit enter to make a request, you should see
This time there is no green dot, and no time elapsed for the request. This is because the browser respects the Cache-Control
and knows the data is only good for the max-age
of 10 seconds. At any time you can select the request table row to see all the request and response headers. You should see the header for the cache control.
Cache-Control: no-transform, max-age=10
Revalidation caching
As previously mentioned, HTTP 1.1 also added the concept of revalidation. If there is a Cache-Control
header, then after the time expires, the supporting client should revalidate the date and check to see if it has not changed.
Temporal Revalidation
There are two client request headers that could be involved in the revalidation process. The first is the If-Modified-Since: [date]
header. How it works is that the server should send a Last-Modified: [date]
response header along with the data. The server should have some way to maintaining this “last modified” date, for example a database field, that changes whenever the data is modified.
One subsequent requests, the client should send the If-Modified-Since: [date]
, with the [date]
being the same date in the Last-Modified
response header from the server. The server should check if the data has been modified since that date. If it hasn’t then the sever simply send back a 304 Not Modified
, other wise send the new data with a 200 OK
Temporal Revalidation Example
Let go through an example of using the If-Modified-Since
and Last-Modified
example. The code can be found in the LastModifiedResource
in the Github project.
@Singleton
@Path("last-modified")
public class LastModifiedResource {
private String data = "Some Data";
private Date lastModified = new Date();
@GET
@Produces("text/plain")
public Response get(@Context Request request) {
ResponseBuilder builder = request.evaluatePreconditions(lastModified);
if (builder != null) {
return builder.build();
}
return Response.ok(data).lastModified(lastModified).build();
}
@POST
@Consumes("text/plain")
public Response post(String data) {
this.data = data;
lastModified = new Date();
return Response.noContent().build();
}
}
Here we are creating a singleton resource (which is not thread safe in this example; bare with me). Here is what we want to accomplish in this example.
- The client sends a GET that will produce the
data
, and send with it aLast-Modified
date being thelastModfied
field. - We will make a GET request to the same endpoint, but this time adding the
If-Modified-Since
header, with the date value being the one the server sent us in theLast-Modified
header. When we send this request, we will get back a304 Not Modified
response, with no data - Then we will make a POST request, changing the data, which will also cause the
lastModfied
field to change. - Finally we will make the same request as in step two. But this time the revalidation will fail, as the dates are different. This time we will get the data back with a
200 OK
along with a new date in theLast-Modified
response header.
In the example you will notice the uses of the javax.ws.rs.core.Request
context object. How it works is that it already has reference to the If-Modified-Since
header sent by the client. The evaluatePreconditions
check to see if they are the same dates. If they are, the method will return a ResponseBuilder
set up to return a 304 Not Modified
, so we don’t need to do anything but build and return it. If the dates are different, then it fail the revalidation and return null.
Here is the request sequence in cURL (listed as in the steps above)
Step 1 (initial GET)
C:\>curl -v http://localhost:8080/api/last-modified
> GET /api/last-modified HTTP/1.1
> ...
< HTTP/1.1 200 OK
< Date: Sun, 18 Oct 2015 05:44:48 GMT
< Last-Modified: Sun, 18 Oct 2015 05:44:48 GMT
< Content-Type: text/plain
<
Some Data
Step 2 (GET with If-Modified-Since)
C:\>curl -v http://localhost:8080/api/last-modified \
-H "If-Modified-Since: Sun, 18 Oct 2015 05:44:48 GMT"
> GET /api/last-modified HTTP/1.1
> If-Modified-Since: Sun, 18 Oct 2015 05:44:48 GMT
> ...
< HTTP/1.1 304 Not Modified
< Date: Sun, 18 Oct 2015 05:45:28 GMT
Step 3 (POST to change data)
C:\>curl -v -X POST -d "new data" \
-H "Content-Type:text/plain" \
http://localhost:8080/api/last-modified
> POST /api/last-modified HTTP/1.1
> Content-Type:text/plain
> Content-Length: 8
> ...
< HTTP/1.1 204 No Content
< Date: Sun, 18 Oct 2015 05:45:45 GMT
Step 4 (Same GET as in Step 2)
C:\>curl -v http://localhost:8080/api/last-modified
-H "If-Modified-Since: Sun, 18 Oct 2015 05:44:48 GMT"
> GET /api/last-modified HTTP/1.1
> If-Modified-Since: Sun, 18 Oct 2015 05:44:48 GMT
> ...
< HTTP/1.1 200 OK
< Date: Sun, 18 Oct 2015 05:46:01 GMT
< Last-Modified: Sun, 18 Oct 2015 05:45:45 GMT
< Content-Type: text/plain
< Content-Length: 8
<
new data
Hope you can easily follow along. The important point to notice is that in step two, we are getting no content back and a 304 as the date is the same as on the server. But in the last request, the date change because of the previous POST, so now we get back new data with a different date.
Etag Revalidation
The second form of validation is validating the content has not changed. To do that the server should send something to the client the represents the current state of the content being sent in the request. This is handled with the Etag
header. In most cases, this will be a hash of the data. The server sends the Etag: <hash>
header with the response. When the client makes a request for the same resource, it should send a If-None-Match: <hash>
to the server. If the value does not match that of the server’s, then the resource has been changed since the client last received it, and the server should send the new data with the new Etag
. Otherwise if the data is the same, the server should send a 304 Not Modified
.
Etag Revalidation Example
Below is an example implementation. It is very similar to the one above, except we are not checking the time the resource was last modified, but if is was actually modified
@Singleton
@Path("etag")
public class EtagResource {
private String data = "Some Data";
@GET
@Produces("text/plain")
public Response get(@Context Request request) throws Exception {
MessageDigest digest = MessageDigest.getInstance("MD5");
byte[] hash = digest.digest(data.getBytes(StandardCharsets.UTF_8));
String hex = DatatypeConverter.printHexBinary(hash);
EntityTag etag = new EntityTag(hex);
ResponseBuilder builder = request.evaluatePreconditions(etag);
if (builder != null) {
return builder.build();
}
return Response.ok(data).tag(etag).build();
}
@POST
@Consumes("text/plain")
public Response post(String data) {
this.data = data;
return Response.noContent().build();
}
}
You can see the same use of the Request
context object, and a similar usage of the evaluatePreconditions
. Only this time we are using the overloaded method that accepts and EntityTag
. We construct the EntityTag
from the hash of the data
. In the initial request, the revalidation will not pass as there is no header from the client, so we just return the normal response, setting the Etag
response header, by calling tag(EntityTag)
on the ResponseBuilder
. In subsequent requests, if the EntityTag
matches, the the ResponseBuilder
will not be null, and we create the pre-configured ResponseBuilder
that is set to send a 304 Not Modified
.
Below I’ll post the same steps as I did with the above example and you will see similar results
- Send initial GET to get the data for the first time. We will get a response with the hash in the
Etag
header. - We will make the same request, except this time we will add the
If-None-Match
header, with the etag value. We should get back a304 Not Modified
with no data. - We will send a POST request to change the data.
- We will make the exact same request as in step 2, but this time we will get the new data, with the new
Etag
.
Step 1 (initial GET)
C:\>curl -v http://localhost:8080/api/etag
> GET /api/etag HTTP/1.1
> ...
< HTTP/1.1 200 OK
< Date: Sun, 18 Oct 2015 07:57:29 GMT
< ETag: "30057E5031BCF44D47B005A1F1700F7B"
< Content-Type: text/plain
< Content-Length: 9
<
Some Data
Step 2 (GET with If-None-Match))
C:\>curl -v http://localhost:8080/api/etag \
-H "If-None-Match: \"30057E5031BCF44D47B005A1F1700F7B\""
> GET /api/etag HTTP/1.1
> If-None-Match: "30057E5031BCF44D47B005A1F1700F7B"
> ...
< HTTP/1.1 304 Not Modified
< Date: Sun, 18 Oct 2015 07:58:22 GMT
< ETag: "30057E5031BCF44D47B005A1F1700F7B"
Step 3 (change data with POST)
C:\>curl -v -X POST -d "new data" \
-H "Content-Type: text/plain" \
http://localhost:8080/api/etag
> POST /api/etag HTTP/1.1
> Content-Type: text/plain
> Content-Length: 8
> ...
< HTTP/1.1 204 No Content
< Date: Sun, 18 Oct 2015 07:58:57 GMT
< Server: Jetty(9.2.6.v20141205)
Step 4 (set GET from step 2)
C:\>curl -v http://localhost:8080/api/etag
-H "If-None-Match: \"30057E5031BCF44D47B005A1F1700F7B\""
> GET /api/etag HTTP/1.1
> If-None-Match: "30057E5031BCF44D47B005A1F1700F7B"
> ...
< HTTP/1.1 200 OK
< Date: Sun, 18 Oct 2015 07:59:01 GMT
< ETag: "E83CA39A795E57283EC1D12EDA0FECD3"
< Content-Type: text/plain
< Content-Length: 8
<
new data
You can see that in the 2nd step, we are not getting content back with a 304, but in the 4th request, we are getting the new data with a 200 and the new etag.
Staying DRY Using Filters and Interceptors
If you’ve been following along up to this point, you will have noticed that there is a lot of boilerplate that goes into producing the functionality for handling HTTP caching. With more resource methods, we will need to repeat similar code over and over. With Jersey (and JAX-RS), when we are faced with this, it is often an opportunity to handle this repetition inside of a filter or interceptor. If you are unfamiliar with these, I suggest taking some time to review Filters and Interceptors from the Jersey user guide.
What I will do here is not try to implement this functionality myself, but I will direct your attention to RESTeasy’s implementation. For those that don’t know, RESTeasy is another implementation of JAX-RS. From this implementation you can get an idea of what it takes to implement this.
What I did was port the functionality over to Jersey for my example. Some of the classes used in the implementation were RESTeasy specific, so I had to change some of it.
You will find all classes in the com.underdog.jersey.cache.example.dry
in my Github example project. All of the classes in the package are either complete duplicates or just altered a bit to port to Jersey. Below is a list of the classes, with a link to the original source code from the RESTeasy project. I will then describe them.
@Cache
CacheControlFilter
CacheControlFeature
ServerCache
SimpleServerCache
ServerCacheHitFilter
ServerCacheInterceptor
ServerCacheFeature
@Cache
- Is used as metadata for resource method or resource class. For example
@GET
@Cache(maxAge=10)
public Responge get(){}
The CacheControlFilter
uses this meta data to send the Cache-Control
header to the client
CacheControlFilter
- It uses the @Cache
metadata to create a CacheControl
object to send with the response header.
CacheControlFeature
is a DynamicFeature
that is registered with the application. This is what registers the CacheControlFilter
for each endpoint.
ServerCache
- This is an interface for the internal caching operations. Implementation will have an internal cache and the contract has methods to save to the cache, get from the cache, and remove from the cache
SimpleServerCache
- This is deprecated implementation of the ServerCache
. I only use it for this example so I don’t have to introduce other dependencies. The preferred implementation is the InfinispanCache
, which uses Infinispan as the underlying cache, as opposed to the SimpleServerCache
which simply uses a ConcurrentHashMap
Now get to where all the action lies.
ServerCacheHitFilter
The basic algorithm goes like this
- If it is not a GET request (e.g. PUT or POST), then remove the URI from the cache. (Note that cache keys are the URI and MediaTypes)
- If it is a GET request then
handleGET
- Lookup cache from URI and MediaType key
- If entry is null, end filter, go to resource method
- If the entry is expired, remove from cache and end filter, go to resource method.
- Else if entry not null
- Create
EntityTag
andRequest.evaluatePreconditions()
- If
ResponseBuilder
is returned, then etag from client matches. Return304 Not Modified
- If
ResponseBuilder
is null, that means that the etag from the client is different from the one in the cache. So we just send an OK response with the entity in the cache.
- Create
ServerCacheInterceptor
- This is the component that adds to the cache.
- If not GET, proceed and return.
- If there is no
Cache-Control
header set (from theCacheControlFilter
), proceed. The component will only handle caching for endpoints that send outCache-Control
- If all is well, the interceptor sets the
OutputStream
for the writer to aByteArrayOutputStream
. The reason for this is that the interceptor works in an AOP “around-invoke” style. It handles some “before” stuff, callsproceed
so theMessageBodyWriter
can do its serialization, then it handle the “after” stuff. So what happens is the afterproceed
is called, theMessageBodyWriter
has written the data to theByteArrayOutputStream
. Not the interceptor can get thebyte[]
and use it to create the hash for the etag. - With the etag created, the interceptor saves the
byte[]
entity data along with the etag into the cache.
Now the process is complete. To register this functionality, we use the last component
ServerCacheFeature
- The original ServerCacheFeature
(linked above) registers the InfinispanCache
as the default cache, so I change it up to register the SimpleServerCache
instead. The feature also registers the ServerCacheHitFilter
, the ServerCacheInterceptor
, and the CacheControlFilter
.
All that’s needed to make everything work, is to annotate a method with @Cache
, and to register the ServerCacheFeature
. You can see in the DryResource
from the example project, where the @Cache
annotation is used, and see the AppConfig
where the ServerCacheInterceptor
is registered.
To test this functionality out just start the server mvn jetty:run
, then follow the four cURL steps above in the Etag section. So will see the same results. The URL will be different though. Use http://localhost:8080/api/dry
Support Me
If you found any information in this post useful, please show your support and like it, share it, tweet it, pin it, and/or plus one it. Much thanks!