Write a Simple Caching Proxy Server with PHP and Memcached

I recently had an issue where a plugin was trying to download remote data and cache it to a directory, but the directory wasn’t writable, so the 500 KB file was downloaded over and over again. To fix the issue globally and make sure I don’t spam the remote host, I built a quick caching proxy server for externally requested files using PHP and Memcached. You will need a running Memcached server and the php-pecl-memcached extension installed (the script uses the Memcached class, which the older php-pecl-memcache extension does not provide). It’s a hack, and it needs a separate server to act as the proxy, but it works best in a largish environment anyway, where you want to throttle external requests: responses are faster internally and you won’t flood the remote host.

On the web server, trick it into thinking the remote server is local by editing /etc/hosts: map the hostname of the server you are requesting data from – say www.remote.com – to the internal IP address of the Apache server you will use to proxy the requests. This way an outbound request actually goes to your proxy server instead of the final destination on the Internet.
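
As a concrete illustration, the /etc/hosts entry on the web server might look like the line below. The address 10.0.0.5 is a made-up placeholder for the proxy’s internal IP; substitute your own.

```
10.0.0.5    www.remote.com
```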

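Next, on the Apache server you are using as a proxy, create a VirtualHost that answers for the hostname of the remote server – only your servers will route here, since the /etc/hosts entry overrides public DNS for them. Do not add that /etc/hosts override on the proxy server itself, or it will end up requesting the file from itself instead of the real remote host. You can cover additional hosts by adding ServerAlias blah.whatever.com lines.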

<VirtualHost *:80>
        ServerName www.remote.com
        DocumentRoot /var/www/cacheproxy
        <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule . /cacheproxy.php [L,QSA]
        </IfModule>
</VirtualHost>

Finally, create a file at /var/www/cacheproxy/cacheproxy.php with the following. It assumes your Memcached server is on localhost, port 11211, and caches each response for 6 hours. Adjust as you wish.

<?php
class CacheProxy {

	private static $memcached = null;
	private static $memcached_host = 'localhost';
	private static $memcached_port = 11211;

	private static $cache_ttl = 21600; // 6 hours, in seconds (6 * 60 * 60)

	private function __construct(){}

	public static function proxy(){
		$host = $_SERVER['HTTP_HOST'];		// We think we *are* this host, neato
		$path = $_SERVER['REQUEST_URI'];	// Grab the URI
		if ( empty( $path ) || '/' === $path ) return false; 	// Don't request the homepage (its URI is "/", never empty)
		$url = "http://$host$path";
		$data = self::get_memcached()->get( $url ); // check if the path is in the cache
		if ( false === $data ){
			// Load and cache data
			$data = self::get_url( $url );
			self::get_memcached()->set( $url, ( ( false !== $data )? $data : '' ), self::$cache_ttl );
		}
		echo $data;
	}

	/* fetch remote data */
	private static function get_url( $url ) {
		$ch = curl_init( $url );
		curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
		curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, 5 );
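		// Pass the client's user agent through; fall back to none if the client didn't send one
		curl_setopt( $ch, CURLOPT_USERAGENT, isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '' );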
		$file = curl_exec( $ch );
		curl_close( $ch );
		return $file;
	}

	/* get a memcached instance */
	private static function get_memcached(){
		if ( empty( self::$memcached ) ){
			self::$memcached = new Memcached();
			self::$memcached->addServer( self::$memcached_host, self::$memcached_port );
		}
		return self::$memcached;
	}
}

// run the proxy for this request
CacheProxy::proxy();

You should now be able to proxy HTTP requests to outbound servers through this script, caching the results for performance.
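
One thing the script glosses over is the cache key: it uses the raw URL. Memcached keys are limited to 250 bytes and may not contain whitespace or control characters, so a URL with a long query string can fail to cache. Here is a minimal sketch of one way around that, assuming you are happy to hash the URL. Note that cacheproxy_key() and the "cacheproxy:" prefix are hypothetical names of mine, not part of the script above.

```php
<?php
// Hypothetical helper (not part of the original script):
// build a Memcached-safe cache key from a URL.
function cacheproxy_key( $url ) {
	// sha1() always returns 40 hex characters, so the key stays well under
	// the 250-byte limit and never contains whitespace or control characters.
	// The prefix is an arbitrary namespace chosen for this example.
	return 'cacheproxy:' . sha1( $url );
}

// Example: a URL far longer than 250 bytes still yields a short key
$long_url = 'http://www.remote.com/data.json?x=' . str_repeat( 'a', 300 );
echo cacheproxy_key( $long_url ), "\n";
```

To use it, pass cacheproxy_key( $url ) instead of $url to the get() and set() calls in proxy().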
