When caching is not enough: “Double Buffered” Remote Calls

One of the challenges of running WordPress at scale is dealing with API calls to (insert_external_service_here). wp_remote_get (or curl) is probably your go-to method for calling APIs, and it is fine for a low-traffic site. On a site that gets millions of pageviews, it is just not going to cut it. You will inevitably run into race conditions.

In case you don’t know, a race condition here is when person1 is waiting on the server to finish the API call, then person2 makes another call, then person3 … then person X, while person1 is still waiting. If the API server is slow, there could be a queue of thousands of requests waiting, and at that point your server has crashed for sure. (See Wikipedia for more on race conditions.)

Another reason for not using wp_remote_get on every request is API rate limiting. Some services do not allow more than X calls per second/minute/day. If you make a call for every visit, you will surely reach that limit extremely fast!

Simple Solution: Caching.

By caching your call the “traditional” way, you’re way ahead of where you started. The API call will only happen once every 5 minutes, and people will not have to wait for the results because you already have them stored! This is just perfect for medium-traffic sites and fast-responding APIs.
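As a minimal sketch of that “traditional” approach, assuming WordPress transients as the cache: the function name, the cache key, and the URL below are hypothetical placeholders, not anything from a real service.

```php
<?php
/**
 * Fetch remote data, cached in a transient for 5 minutes.
 * Function name, cache key, and URL are hypothetical.
 */
function my_get_remote_data_cached() {
    $cached = get_transient( 'my_api_data' );
    if ( false !== $cached ) {
        return $cached; // Cache hit: no remote call needed.
    }

    // Cache miss: this visitor has to wait for the API.
    $response = wp_remote_get( 'https://api.example.com/data' );
    if ( is_wp_error( $response ) ) {
        return false;
    }

    $data = wp_remote_retrieve_body( $response );
    set_transient( 'my_api_data', $data, 5 * MINUTE_IN_SECONDS );

    return $data;
}
```

Every visitor who arrives while the transient is empty falls through to wp_remote_get, which is exactly the weakness discussed next.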

The problem with this approach is that at the 5 minute mark someone still needs to wait for the API to respond. If the response is slow, you could run into a race condition again, because the cache has expired. It goes like so: person1 hits the expired cache (past 5 minutes) and calls the API; person2 calls the API too because the cache is still empty; person3 does the same … person100 does the same. When person1’s call finishes, the cache is set again for the next 5 minutes, so person101 gets a cached result and everyone is happy from there on. In the meantime, persons 2-100 are each still waiting on their own slow response. This kind of pile-up is often called a cache stampede. We have somewhat mitigated the problem, but not completely solved it. If the traffic is really high and the API is really slow, your server could still crash.

If you have that kind of traffic, you are playing with the big boys. Lazy caching is not going to be enough.

Complex Solution: Double Caching

Instead of just caching the result, you can double cache it. We do the same thing as above, but twice, in a way where cache1 lives for 5 minutes and cache2 lives for 10 minutes. When cache1 expires, the person who notices makes the API call and sets a switch so that everyone else uses cache2 in the meantime. Now only person1 is slow.
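Here is a rough sketch of that idea, again assuming transients. The “switch” is modeled as a third, short-lived transient. All names, keys, the URL, and the 30 second switch lifetime are assumptions made for illustration.

```php
<?php
/**
 * Double-cached remote call.
 * Function name, cache keys, URL, and timings are hypothetical.
 */
function my_get_remote_data_double() {
    $fresh = get_transient( 'my_api_fresh' ); // cache1: lives 5 minutes.
    if ( false !== $fresh ) {
        return $fresh;
    }

    $backup = get_transient( 'my_api_backup' ); // cache2: lives 10 minutes.

    // The switch: if someone is already refreshing, serve the backup.
    if ( false !== $backup && get_transient( 'my_api_refreshing' ) ) {
        return $backup;
    }

    // Flip the switch. It expires on its own after 30 seconds,
    // in case this request dies before it can clean up.
    set_transient( 'my_api_refreshing', 1, 30 );

    $response = wp_remote_get( 'https://api.example.com/data' );
    if ( is_wp_error( $response ) ) {
        delete_transient( 'my_api_refreshing' );
        return $backup; // May be false if there is no backup yet.
    }

    $data = wp_remote_retrieve_body( $response );
    set_transient( 'my_api_fresh', $data, 5 * MINUTE_IN_SECONDS );
    set_transient( 'my_api_backup', $data, 10 * MINUTE_IN_SECONDS );
    delete_transient( 'my_api_refreshing' );

    return $data;
}
```

Checking the switch and then setting it is not atomic, so under heavy load a handful of visitors may still slip through together. With a persistent object cache, wp_cache_add (which only succeeds if the key does not exist yet) is a tighter way to build the switch.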

This is pretty great as is, but we can do better. Let’s say the API starts to malfunction for whatever reason. In that scenario, you’ll have good data for at most 5 more minutes, until the next time you make the call … not so awesome. Add a data consistency check (which you SHOULD have) so that a bad response never overwrites good cached data, and now we are in business.
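What counts as “consistent” depends entirely on your API. As one possible sketch, assume the service answers with HTTP 200 and a JSON body containing a non-empty "items" array; both the function name and that response shape are made up for illustration.

```php
<?php
/**
 * Hypothetical consistency check for an API response.
 * Assumes a JSON body with a non-empty "items" array (an invented shape).
 *
 * @param array|WP_Error $response Return value of wp_remote_get().
 * @return array|false Decoded data when it looks sane, false otherwise.
 */
function my_validate_api_response( $response ) {
    if ( is_wp_error( $response ) ) {
        return false; // Network failure, timeout, etc.
    }

    if ( 200 !== (int) wp_remote_retrieve_response_code( $response ) ) {
        return false; // The API answered, but with an error status.
    }

    $data = json_decode( wp_remote_retrieve_body( $response ), true );
    if ( ! is_array( $data ) || empty( $data['items'] ) ) {
        return false; // Not JSON, or missing the part we care about.
    }

    return $data;
}
```

In the refresh step, only write to the caches when this returns something other than false; otherwise keep serving the backup.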

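The only caveat with that approach is that your backup cache can expire too, for example if the API stays broken for a while or nobody visits the page. A neat alternative is to save the backup in an option, which is stored in the database and does not expire.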

“Double Buffer”
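A sketch of the double buffer, under the same assumptions as before: a short-lived transient in front, an option as the permanent backup behind it. The function name, keys, URL, and the very loose “is it a non-empty JSON array” check are all placeholders you would replace with your own.

```php
<?php
/**
 * "Double buffered" remote call: fresh copy in a transient,
 * backup copy in an option. All names and the URL are hypothetical.
 */
function my_get_remote_data_buffered() {
    $fresh = get_transient( 'my_api_fresh' );
    if ( false !== $fresh ) {
        return $fresh;
    }

    // The backup lives in the options table, so it does not expire.
    $backup = get_option( 'my_api_backup', false );

    // The switch: someone else is refreshing, so serve the backup.
    if ( false !== $backup && get_transient( 'my_api_refreshing' ) ) {
        return $backup;
    }
    set_transient( 'my_api_refreshing', 1, 30 );

    $response = wp_remote_get( 'https://api.example.com/data' );
    $data     = is_wp_error( $response )
        ? null
        : json_decode( wp_remote_retrieve_body( $response ), true );

    // Placeholder consistency check: expect a non-empty JSON array.
    if ( ! is_array( $data ) || empty( $data ) ) {
        // Keep the old data. The switch stays on for its 30 seconds,
        // so the failing API is not retried on every single request.
        error_log( 'my_api: bad response, serving backup' );
        return $backup;
    }

    set_transient( 'my_api_fresh', $data, 5 * MINUTE_IN_SECONDS );
    update_option( 'my_api_backup', $data, false ); // false: do not autoload.
    delete_transient( 'my_api_refreshing' );

    return $data;
}
```

The error_log call is only a stand-in for whatever alerting you prefer.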

Boom! Now even if the API fails, or nobody accesses the page and the cache dies, your site still has content. All you have to do now is find a way to alert yourself to the problem.

Overkill? yes … Works? … Really well!

Note: WordPress options have limited size. If the data you are storing is too big, you may want to consider WP Large Options.
