Mar 18 2009
Configuring a Basic Reverse Proxy in Squid on Windows (Website Accelerator)
I do a lot of web development in Visual Studio 2005 (and 2008) on a Vista workstation joined to a domain. Up until recently I've been working on a very large set of RESTful APIs written with a special in-house library that uses metaprogramming to create RESTful APIs. It's a great library, but it's not compatible with IIS7, and debugging the PHP requests through the production-compiled staging server (aka trawling through many, many large log files) has become tedious and difficult.
I thought, wouldn't it be useful if the guys could use the APIs hosted on my machine by the ASP.NET Development Webserver, so that I could debug them directly? But alas, this lightweight web server does not allow remote connections, making this impossible.
Or is it?
I have been wanting to learn more about reverse proxies, as I know from buzz in the industry that they can be godsends. I thought, that's what I need! A reverse proxy to forward external web requests as "internal requests", tricking the ASP.NET Development Webserver into thinking that the request is local when in fact it isn't. After playing with a number of options, it turned out that what I had initially avoided for fear of it being too difficult was actually the best and easiest. Squid! For those not familiar with the concept of reverse proxies, I thought I'd paste this snippet from the Squid wiki:
What is the Reverse Proxy (httpd-accelerator) mode?
Occasionally people have trouble understanding accelerators and proxy caches, usually resulting from mixed up interpretations of "incoming" and "outgoing" data. I think in terms of requests (i.e., an outgoing request is from the local site out to the big bad Internet). The data received in reply is incoming, of course. Others think in the opposite sense of "a request for incoming data".
An accelerator caches incoming requests for outgoing data (i.e., that which you publish to the world). It takes load away from your HTTP server and internal network. You move the server away from port 80 (or whatever your published port is), and substitute the accelerator, which then pulls the HTTP data from the "real" HTTP server (only the accelerator needs to know where the real server is). The outside world sees no difference (apart from an increase in speed, with luck).
Quite apart from taking the load of a site's normal web server, accelerators can also sit outside firewalls or other network bottlenecks and talk to HTTP servers inside, reducing traffic across the bottleneck and simplifying the configuration. Two or more accelerators communicating via ICP can increase the speed and resilience of a web service to any single failure.
The Squid redirector can make one accelerator act as a single front-end for multiple servers. If you need to move parts of your filesystem from one server to another, or if separately administered HTTP servers should logically appear under a single URL hierarchy, the accelerator makes the right thing happen.
Start by obtaining a binary release of Squid. I'll be using the latest stable release, standard 2.7.STABLE4. Squid does not require installation as such; simply unzip it where you wish. To keep it simple, I'll install Squid directly in C:\squid, as the standard Squid configuration expects it to be installed here - it's easy to change though!
We'll start by installing Squid as a service, before doing the actual configuration. Open a command prompt and go to C:\squid\sbin. Now run "squid -i -n Squid". This will install Squid as a service under the name "Squid".
C:\squid\sbin>squid -i -n Squid
Registry stored HKLM\SOFTWARE\GNU\Squid\2.6\Squid\ConfigFile value c:/squid/etc/squid.conf
Squid Cache version 2.7.STABLE6 for i686-pc-winnt
installed successfully as Squid Windows System Service.
To run, start it from the Services Applet of Control Panel.
Don't forget to edit squid.conf before starting it.
Before we start Squid, we have to configure it. Go to C:\squid\etc, make a copy of squid.conf.default and call it squid.conf. Do the same for mime.conf.default, calling the copy mime.conf (we won't edit this one, but it's needed). There are hundreds of configuration options, all very well documented. I won't go over all of them, so simply skip past the entire contents of the squid.conf file; we'll add only the configuration options that we need at the bottom.
http_port 8880 accel defaultsite=your.dns.address
cache_peer localhost parent 9977 0 no-query originserver name=myAccel
acl our_sites dstdomain your.dns.address
http_access allow our_sites
cache_peer_access myAccel allow our_sites
cache_peer_access myAccel deny all
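For readers new to squid.conf, here is the same fragment again with a comment above each directive, as a rough guide rather than an authoritative reference. The comments are my reading of the options; in particular, the assumption that 9977 is the port the development web server listens on is inferred from the setup described above and is not stated anywhere.

```
# Listen on port 8880 in accelerator (reverse proxy) mode.
# defaultsite is the host name Squid assumes when a request carries no Host header.
http_port 8880 accel defaultsite=your.dns.address

# The backend web server: localhost, HTTP port 9977 (assumed to be the
# development web server's port), ICP port 0 (unused).
# no-query: send no ICP queries to this peer.
# originserver: treat the peer as a real web server, not another proxy.
# name=myAccel: a label used to refer to this peer in the lines below.
cache_peer localhost parent 9977 0 no-query originserver name=myAccel

# An ACL matching requests whose destination domain is our published name.
acl our_sites dstdomain your.dns.address

# Let clients request our site through Squid.
http_access allow our_sites

# Send only requests for our site to the backend peer.
cache_peer_access myAccel allow our_sites
cache_peer_access myAccel deny all
```

One thing to watch for: http_access rules are checked in order and the first match wins. If the copied default file already has an "http_access deny all" line further up, and external requests come back as denied, moving the "http_access allow our_sites" line above it is the likely fix.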
This particular reverse proxy co-exists on the same machine that hosts the website. For better load balancing, you should instead change the first line to http_port 80 and, in the cache_peer line, change localhost to the IP of the web server and parent 9977 to parent 80 as well. This way, when the Squid server gets a request on port 80 (the default HTTP port), it will properly reverse-proxy to the default web server port on the web server machine. The other options are handy to know, so have a read, but so long as your firewall has the relevant ports open, the config file as it stands is the minimum you need to get things going.
Another handy thing to know (as your site grows) is that the reverse-proxy capabilities of Squid are quite advanced. To load balance requests among a set of backend servers, allow requests to be forwarded to more than one cache_peer and use one of the load balancing options in the cache_peer lines. For example, the round-robin option:
cache_peer ip.of.server1 parent 80 0 no-query originserver round-robin
cache_peer ip.of.server2 parent 80 0 no-query originserver round-robin
This makes it very easy to spread the load of your websites and services. Wikimedia was quoted a while ago as saying that their front-end Squid servers have a 75% hit rate, effectively quadrupling the load capacity of their Apache servers. No doubt their configuration is quite different to this example, but it gives you an idea of just how much benefit a Squid-based reverse proxy can bring.
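Putting the round-robin peers together with the accelerator settings from earlier, a complete load-balanced fragment might look something like the sketch below. I have not run this exact configuration. The labels name=server1 and name=server2 are hypothetical names introduced here so that each peer can be referenced in its own cache_peer_access lines, and ip.of.server1 and ip.of.server2 remain placeholders for your real backend addresses.

```
# Squid answers on the default HTTP port in accelerator mode.
http_port 80 accel defaultsite=your.dns.address

# Two backend web servers on port 80; round-robin alternates requests between them.
# name=server1 and name=server2 are hypothetical labels, pick your own.
cache_peer ip.of.server1 parent 80 0 no-query originserver round-robin name=server1
cache_peer ip.of.server2 parent 80 0 no-query originserver round-robin name=server2

# Same access rules as the single-server setup, repeated for each peer.
acl our_sites dstdomain your.dns.address
http_access allow our_sites
cache_peer_access server1 allow our_sites
cache_peer_access server1 deny all
cache_peer_access server2 allow our_sites
cache_peer_access server2 deny all
```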
Before you run off and start the Squid service, first run "squid -z". This creates the cache swap directories, and because Squid has to read squid.conf to do so, it will also report any configuration errors in your conf file. If all went well, it should look something like:
C:\squid\sbin>squid -z
2009/03/18 11:35:06| Creating Swap Directories
Since it's working great, just execute "net start Squid":
C:\squid\sbin>net start Squid
The Squid service is starting.
The Squid service was started successfully.
BAM! Your kung fu is strong. Simply test the external URL and Squid should forward the request and pass the response back to you.
I know this has worked wonders for me, and I hope you have the same positive experience.