One of the challenges of running WordPress at scale is dealing with API calls to (insert_external_service_here). Using wp_remote_get (or curl) is probably your go-to method for calling an API, and it is a fine choice for a low-traffic site. On a site that gets millions of pageviews, it is just not going to cut it. You will inevitably run into race conditions.
In case you don’t know, the kind of race condition I mean here is a pile-up: person1 is waiting on the server to finish the API call, then person2 makes another call, then person3 … then person X, and person1 is still waiting. If the API server is slow, there could be a queue of thousands of requests waiting, and at that point your server has almost certainly crashed. See Wikipedia for the general definition.
Another reason for not using wp_remote_get on every request is API rate limiting. Some services do not allow more than X calls per second/minute/day. If you make a call for every visit, you will reach that limit extremely fast!
Simple Solution: Caching.
```php
$result = wp_cache_get( 'my_api_call', 'api_calls' );

if ( false === $result ) {
    $result = wp_remote_get( $API_url );
    wp_cache_set( 'my_api_call', $result, 'api_calls', 300 ); // cache for 5 minutes
}

// do something with $result
```
By caching your call the “traditional” way, you are already far ahead of where you started. The API call will only happen once every 5 minutes, and people will not have to wait for the results because you already have them stored! This is just perfect for medium-traffic sites and fast-responding APIs.
The problem with this approach is that at the 5 minute mark somebody still has to wait for the API to respond. If the response is slow, you can run into the same pile-up again while the cache is empty. It goes like this: person1 hits the expired cache (past 5 minutes) and calls the API; person2 calls the API too because the cache is still empty; person3 does the same, and so on up to person100. Then person1’s call finishes and the cache is set for the next 5 minutes, so person101 gets a cached result and everyone is happy from there on. In the meantime, persons 2 to 100 are still waiting on their own slow responses. We have somewhat mitigated the problem, but not completely solved it. If the traffic is really high and the API is really slow, your server could still crash.
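To put a number on that, here is a quick back-of-the-envelope sketch. The traffic figure is a made-up assumption, api_calls_per_day() is a hypothetical helper (not a WordPress function), and the math assumes exactly one API call per cache lifetime.

```php
<?php
/**
 * Upper bound on API calls per day when one call is made per cache lifetime.
 * Hypothetical helper for illustration only.
 */
function api_calls_per_day( int $ttl_seconds ): int {
    $seconds_per_day = 24 * 60 * 60; // 86400
    return (int) ceil( $seconds_per_day / $ttl_seconds );
}

$pageviews_per_day = 2000000; // assumed traffic, not a measured figure

echo 'Uncached: ' . $pageviews_per_day . " calls/day\n";
echo 'Cached for 300s: ' . api_calls_per_day( 300 ) . " calls/day\n"; // 288
```

With a 5 minute lifetime that is 288 calls per day no matter how many pageviews you get, which is usually well within what a rate-limited service allows.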
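A rough way to size that pile-up: every request that arrives while the first API call is still in flight finds an empty cache and makes its own call. The sketch below assumes evenly spaced traffic, and stampede_size() is a hypothetical helper, so treat the result as an estimate rather than a measurement.

```php
<?php
/**
 * Rough count of requests that miss the cache while the first refresh is in flight.
 * Hypothetical helper; assumes evenly spaced traffic.
 */
function stampede_size( float $requests_per_second, float $api_latency_seconds ): int {
    // Every request arriving before the first call returns sees an empty cache.
    return (int) ceil( $requests_per_second * $api_latency_seconds );
}

echo stampede_size( 50.0, 2.0 ) . "\n"; // 100 requests hit the API
echo stampede_size( 50.0, 0.5 ) . "\n"; // 25 requests hit the API
```

At 50 requests per second and a 2 second API response you get the 100 people from the example above, all calling the API at once every 5 minutes.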
If you have that kind of traffic, you are playing with the big boys. Lazy caching is not going to be enough.
Complex Solution: Double Caching
Instead of just caching the result, you can double cache it. We do the same thing as above, but twice, in a way where cache1 lives for 5 minutes and cache2 lives for 10 minutes. When cache1 expires, the person who notices makes the API call and sets a switch so that everyone else uses cache2 in the meantime. Now only person1 is slow.
```php
$switch = wp_cache_get( 'switch', 'api_calls' );

if ( false === $switch ) {
    $result = wp_cache_get( 'my_api_call', 'api_calls' );

    if ( false === $result ) {
        // Set switch BEFORE making the external call.
        // Time doesn't matter here, just make sure it is longer than PHP's execution time
        wp_cache_set( 'switch', true, 'api_calls', 300 );

        $result = wp_remote_get( $API_url );

        wp_cache_set( 'my_api_call', $result, 'api_calls', 300 ); // cache for 5 minutes
        // cache longer than 5mins so they don't invalidate at the same time
        wp_cache_set( 'my_api_backup_call', $result, 'api_calls', 600 );

        wp_cache_delete( 'switch', 'api_calls' ); // Call is done, remove the switch
    }
} else {
    $result = wp_cache_get( 'my_api_backup_call', 'api_calls' );
}

// do something with $result
```
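One small gap remains in the switch: between person1 reading the missing switch and setting it, person2 can slip through and call the API too. A possible variation (my swap-in, not part of the snippet above) is to take the switch with wp_cache_add(), which only writes when the key does not exist yet and returns false otherwise. The sketch below assumes a persistent object cache is installed. The helper name get_api_result_with_lock(), the Demo_Cache class and the 60 second lock time are hypothetical. The stand-in functions at the top exist only so the sketch runs outside WordPress.

```php
<?php
// Stand-ins so the sketch runs outside WordPress. Inside WordPress the real
// wp_cache_* functions are already defined and this block is skipped.
if ( ! function_exists( 'wp_cache_get' ) ) {
    final class Demo_Cache {
        public static $data = array();
    }
    function wp_cache_get( $key, $group = '' ) {
        return Demo_Cache::$data[ $group ][ $key ] ?? false;
    }
    function wp_cache_set( $key, $data, $group = '', $expire = 0 ) {
        Demo_Cache::$data[ $group ][ $key ] = $data; // expiry is ignored in this stand-in
        return true;
    }
    function wp_cache_add( $key, $data, $group = '', $expire = 0 ) {
        if ( isset( Demo_Cache::$data[ $group ][ $key ] ) ) {
            return false; // key already exists, nothing written
        }
        return wp_cache_set( $key, $data, $group, $expire );
    }
    function wp_cache_delete( $key, $group = '' ) {
        unset( Demo_Cache::$data[ $group ][ $key ] );
        return true;
    }
}

/**
 * Hypothetical helper: refresh the cache behind a switch taken with wp_cache_add().
 * $fetch is any callable that performs the slow API call.
 */
function get_api_result_with_lock( callable $fetch ) {
    $result = wp_cache_get( 'my_api_call', 'api_calls' );
    if ( false !== $result ) {
        return $result; // fresh cache hit
    }

    // wp_cache_add() returns false when the switch already exists,
    // so only one request gets to make the external call.
    // 60 seconds is an assumed value, chosen to outlast PHP's execution time.
    if ( ! wp_cache_add( 'switch', true, 'api_calls', 60 ) ) {
        return wp_cache_get( 'my_api_backup_call', 'api_calls' ); // someone else is refreshing
    }

    $result = $fetch();
    wp_cache_set( 'my_api_call', $result, 'api_calls', 300 );        // cache for 5 minutes
    wp_cache_set( 'my_api_backup_call', $result, 'api_calls', 600 ); // backup lives longer
    wp_cache_delete( 'switch', 'api_calls' );                        // call is done, remove the switch

    return $result;
}
```

In WordPress you would pass something like function () use ( $API_url ) { return wp_remote_get( $API_url ); } as $fetch. How atomic the add really is depends on your object cache backend, so check that before relying on it.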
This is pretty great as is, but we can do better. Let’s say the API starts to malfunction for whatever reason. In that scenario, your good data only lasts until the next call, at which point the bad response overwrites both caches … not so awesome. Add a data consistency check (which you SHOULD have), and now we are in business:
```php
$switch = wp_cache_get( 'switch', 'api_calls' );

if ( false === $switch ) {
    $result = wp_cache_get( 'my_api_call', 'api_calls' );

    if ( false === $result ) {
        // Set switch BEFORE making the external call.
        // Time doesn't matter here, just make sure it is longer than PHP's execution time
        wp_cache_set( 'switch', true, 'api_calls', 300 );

        $result = wp_remote_get( $API_url );

        if ( is_your_data_valid( $result ) ) {
            // the result from the call is as expected
            $save_data = $result;
        } else {
            // Use the last good result and set backup
            $save_data = wp_cache_get( 'my_api_backup_call', 'api_calls' );
            // Something went wrong in the api call.... notify somebody? do something?
        }

        wp_cache_set( 'my_api_call', $save_data, 'api_calls', 300 ); // cache for 5 minutes
        // cache longer than 5mins so they don't invalidate at the same time
        wp_cache_set( 'my_api_backup_call', $save_data, 'api_calls', 2000 );

        wp_cache_delete( 'switch', 'api_calls' ); // Call is done, remove the switch
        $result = $save_data;
    }
} else {
    $result = wp_cache_get( 'my_api_backup_call', 'api_calls' );
}

// do something with $result
```
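What is_your_data_valid() checks is up to you and your API. As one possible sketch, the version below accepts only a 200 response whose body decodes to a non-empty JSON array or object. The JSON requirement is an assumption about your API. It reads the response array directly so it runs outside WordPress. Inside WordPress you would more likely lean on is_wp_error(), wp_remote_retrieve_response_code() and wp_remote_retrieve_body().

```php
<?php
/**
 * One possible is_your_data_valid(): accepts only a 200 response whose body
 * is non-empty JSON. The JSON requirement is an assumption about your API;
 * adjust it to the payload you expect.
 */
function is_your_data_valid( $response ): bool {
    // wp_remote_get() returns an array on success and a WP_Error object on failure.
    if ( ! is_array( $response ) ) {
        return false;
    }

    // The HTTP status code lives under ['response']['code'].
    $code = $response['response']['code'] ?? 0;
    if ( 200 !== (int) $code ) {
        return false;
    }

    // json_decode() returns null when the body is not valid JSON.
    $data = json_decode( $response['body'] ?? '', true );

    return is_array( $data ) && ! empty( $data );
}
```

The stricter this check is, the less likely a half-broken response ends up in your cache, so it is worth checking for the specific keys you actually use.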
The only caveat with that approach is that your backup cache can expire too. A neat alternative is to save the backup in an option:
“Double Buffer”
```php
$switch = wp_cache_get( 'switch', 'api_calls' );

if ( false === $switch ) {
    $result = wp_cache_get( 'my_api_call', 'api_calls' );

    if ( false === $result ) {
        // Set switch BEFORE making the external call.
        // Time doesn't matter here, just make sure it is longer than PHP's execution time
        wp_cache_set( 'switch', true, 'api_calls', 300 );

        $result = wp_remote_get( $API_url );

        if ( is_your_data_valid( $result ) ) {
            // the result from the call is as expected, store results
            wp_cache_set( 'my_api_call', $result, 'api_calls', 300 ); // cache for 5 minutes
            update_option( 'my_api_backup_call', $result ); // options don't invalidate, they just store stuff
            wp_cache_delete( 'switch', 'api_calls' ); // Call is done, remove the switch
        } else {
            // The result from the call is NOT ok, deal with it somehow.
            // In the meantime do not clear the switch OR update the option.
            // Because the switch is on, it'll eventually invalidate, allowing another call.
            // If that other call is good, we return to regular use; if it is bad,
            // the switch will be set again, using the backup option.
            $result = get_option( 'my_api_backup_call' ); // this visitor gets the backup too
        }
    }
} else {
    $result = get_option( 'my_api_backup_call' );
}

// do something with $result
```
Boom! Now even if the API fails, or nobody accesses the page and the cache dies, your site still has content. All you have to do now is find a way to be alerted of the problem.
Overkill? yes … Works? … Really well!
Note: WordPress options have limited size. If the data you are storing is too big, you may want to consider WP Large Options.
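If you are not sure whether your data is “too big”, a rough pre-flight check can help. This is only a sketch: fits_in_option() is a hypothetical helper, and the 1 MB default is an assumption (it matches Memcached’s default item size, which only matters if your options pass through such an object cache). Use whatever limit applies to your setup.

```php
<?php
/**
 * Hypothetical helper: does the serialized value fit under an assumed size limit?
 * The default limit is an assumption, not a WordPress constant.
 */
function fits_in_option( $value, int $max_bytes = 1048576 ): bool {
    // Arrays and objects are serialized before they are stored as an option.
    // For plain strings this overestimates the size by a few bytes.
    return strlen( serialize( $value ) ) <= $max_bytes;
}

var_dump( fits_in_option( array( 'status' => 'ok' ) ) );  // bool(true)
var_dump( fits_in_option( str_repeat( 'x', 2000000 ) ) ); // bool(false)
```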