
Squid

Contents

1. Introduction

2. Useful resources

3. Download

4. Unpacking and compile

5. Configuration

6. Precautions before starting squid

7. Starting squid

8. Using squid

9. Configuration details

10. Telenet-settings

11. Squid and Outlook Express/Hotmail

12. Automatic browser-configuration





1. Introduction

This document is not intended as a "total" guide to administering and setting up squid, nor do I claim it is free of errors. These are just the experiences I had when setting up squid.
First of all, what is squid?

I quote Duane Wessels on http://www.squid-cache.org
squid is a high-performance proxy caching server for web clients, supporting FTP, gopher, and HTTP data objects. Unlike traditional caching software, squid handles all requests in a single, non-blocking, I/O-driven process.
squid keeps meta data and especially hot objects cached in RAM, caches DNS lookups, supports non-blocking DNS lookups, and implements negative caching of failed requests.
squid supports SSL, extensive access controls, and full request logging. By using the lightweight Internet Cache Protocol, squid caches can be arranged in a hierarchy or mesh for additional bandwidth savings.
squid consists of a main server program squid, a Domain Name System lookup program dnsserver, some optional programs for rewriting requests and performing authentication, and some management and client tools. When squid starts up, it spawns a configurable number of dnsserver processes, each of which can perform a single, blocking Domain Name System (DNS) lookup. This reduces the amount of time the cache waits for DNS lookups.
I wanted to use squid because even with an ADSL connection (where speed is quite high), anything that speeds up my connection comes in handy. Even more important, objects (like gifs, jpgs or other stuff) that I've already downloaded once aren't downloaded again. This cuts down your total download volume.

2. Useful resources

A lot of this material is "collected" from various other resources.
For a detailed explanation of how squid works I would strongly suggest you visit these pages and read through them. Pick out the things you need and you'll see that putting it all together isn't as hard as it could seem at first sight.

squid home-page - http://www.squid-cache.org/
squid configuration manual - http://squid.visolve.com/squidconf.html

There's also a mailinglist, squid-users@squid-cache.org. Subscribe by sending an empty mail to squid-users-subscribe@squid-cache.org

3. Download

Of course, before you can set up a proxy server you need to download the package.
There are some binaries (RPM) available, but I recommend you download the complete package and compile it yourself. It works fine with the default compile options, and with the source code you have more control over some extra compile features.
You can download the package from http://www.squid-cache.org/Versions/v2/2.4/

4. Unpacking and compile

When the download is finished, copy the file to the place where you usually keep your sources (in my case it's /usr/local/src ... but yours can be different). To unpack it, issue this command
tar zxvf squid-<package-name>.tar.gz
For the moment, I assume you downloaded squid-2.4.STABLE1-src.tar.gz, but what follows applies to every version; you only need to adjust the package name. If the .tar.gz file wasn't corrupt, there should now be a directory squid-2.4.STABLE1. Navigate to this directory. If you're a dedicated system administrator, the first thing you would AND should do is read the INSTALL and README documents. Take your time, because they contain some useful information.

Now we can start the compilation process. Compilation is very easy. Just make sure you've got a working compiler on your system.
The next thing is running the configure script. If you want to install the package into a specific directory you can use this
./configure --prefix=/usr/people/sys-admin/squid
The --prefix option specifies the install directory.
When no errors are reported you can issue the make commands. In case you receive any errors, please refer to http://www.squid-cache.org/
make
make install
That's all there is to compiling.
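If you rebuild squid more than once, the steps above can be collected in a small shell function. This is only a sketch under my own assumptions: the function names are made up, it expects the tarball to follow the squid-<version>-src.tar.gz naming, and it falls back to /usr/local/squid when no prefix is given.

```shell
#!/bin/sh
# Derive the source directory name from the tarball name,
# e.g. squid-2.4.STABLE1-src.tar.gz -> squid-2.4.STABLE1
# (assumes the squid-<version>-src.tar.gz naming used above)
src_dir_for() {
    basename "$1" | sed -e 's/\.tar\.gz$//' -e 's/-src$//'
}

# Unpack, configure, build and install; stop at the first failing step.
# The body runs in a subshell, so the cd does not affect your own shell.
build_squid() (
    tarball=$1
    prefix=${2:-/usr/local/squid}
    tar zxf "$tarball" &&
    cd "$(src_dir_for "$tarball")" &&
    ./configure --prefix="$prefix" &&
    make &&
    make install
)
```

From the directory that holds the tarball you would then call something like build_squid squid-2.4.STABLE1-src.tar.gz /usr/people/sys-admin/squid.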

5. Configuration

Out of the box, the squid package is installed in such a way that it's directly up and running. But of course, for good performance it's necessary to adjust some minor settings. I'll cover the most important ones here.

All settings are bundled in one file, squid.conf. You can find this file in /usr/local/squid/etc/squid.conf


http_port
The first thing you need to deal with is the port squid listens on. By default, squid runs on port 3128. You can freely choose which port you want to use, but for me port 3128 is fine.
http_port 3128

icp_port
The icp_port is the port number where Squid sends and receives ICP queries to and from neighbor caches. By default Squid uses 3130. Because ICP isn't used in my configuration, I disable the ICP port by setting the port number to 0.
icp_port 0

ACL
Next, you need to create an ACL for your local-domain. For a complete explanation of what an ACL is, I quote Duane Wessels on http://www.squid-cache.org
The primary use of the acl system is to implement simple access control: to stop other people using your cache infrastructure. (There are other uses of acls, described later in this chapter; in the meantime we are going to discuss only the access control function of acls.) Most people implement only very basic access control, denying access to people that are not on their network. squid's access system is incredibly flexible, but 99% of administrators only use the most basic elements. In this chapter some examples of the less common uses of acls are covered: hopefully you will discover some squid feature which suits your organization - and which you didn't think was part of squid before.
There's already a list of default ACLs. In the squid.conf file there's room for your custom-made ACLs. Insert your line there in this form

acl <mydomain> src <mysubnet>
A good example could be

acl cudeso_domain src 192.168.1.0/255.255.255.0

http_access
Now you need to allow access for your domain. This can be arranged by adding

http_access allow <myacl>
Following the previous example, this would be
http_access allow cudeso_domain
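To get a feel for which clients an address/netmask pair like 192.168.1.0/255.255.255.0 covers, here is a small shell sketch. It is my own illustration of the usual "AND with the netmask and compare" rule, not squid's code, and the function names are made up.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on the dots
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds when address $1 lies in network $2 with netmask $3.
in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

if in_subnet 192.168.1.17 192.168.1.0 255.255.255.0; then
    echo "192.168.1.17 would match cudeso_domain"
fi
```

A client at 192.168.2.17 fails the same check, so it would not be allowed by the http_access line above.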

cache_peer
You could use your squid proxy as the only proxy between you and the internet. But when your ISP already provides its own proxy, you can use that one as the 'parent' for your proxy server. To get this working you'll need to add a line
cache_peer <ISP-proxy> parent <http_port> <icp_port>

cache_peer proxy.myisp.com parent 8080 8080
cache_peer proxy.telenet.be parent 8080 8080 no-query default
The first one is for 'regular' proxies. The second one is the one you should use when you're connected with Telenet (cable-provider in Belgium).


log-dirs
By default, squid stores all its data in /usr/local/... I'm not very fond of this. It makes perfect sense to collect all data (objects and log files) in one location. Most other daemons use the /var/ directory, and I can't come up with one good reason why squid shouldn't use it as well.
The directories are set with cache_dir, cache_access_log, cache_log and cache_store_log.
To get all the data collected in the /var/ dir, you need to adjust the settings so that they look like this:
cache_dir ufs /var/spool/squid 100 16 256
cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
cache_store_log /var/log/squid/store.log
A special note on cache_dir: it takes several arguments.
The first one (in this case ufs) selects the kind of storage system to use. Almost everyone will want to use "ufs" as the type. If you are using Async I/O (--enable-async-io) on GNU/Linux or Solaris, then you may want to try "asyncufs" as the type. Async IO support may be buggy, however, so beware.
Next, you specify the directory where the cache swap files will be stored (in this case /var/spool/squid). If you want to use an entire disk for caching, then this can be the mount-point directory. The directory must exist and be writable by the squid process. squid will NOT create this directory for you.
The following argument is the amount of disk space squid may use in this directory (here I've used 100). The amount is entered in megabytes.
After the size, you enter the number of first-level subdirectories that will be created under the directory (here 16) and the number of second-level subdirectories that will be created under each first-level directory (here 256).
cache_access_log logs the client request activity.
cache_log holds general information about the cache's behavior.
cache_store_log logs the activities of the storage manager.
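To make the cache_dir arguments a bit more tangible, this little sketch splits such a line into its fields and works out how many second-level directories squid will end up with. The function name is made up for the illustration; it only looks at the text of the line and doesn't touch squid itself.

```shell
#!/bin/sh
# Split a cache_dir line into its fields and report them.
describe_cache_dir() {
    set -- $1            # split the line on whitespace
    [ "$1" = "cache_dir" ] || return 1
    echo "type=$2 dir=$3 size_mb=$4 l1=$5 l2=$6 leaf_dirs=$(( $5 * $6 ))"
}

describe_cache_dir "cache_dir ufs /var/spool/squid 100 16 256"
```

With the values used here, 16 first-level directories of 256 subdirectories each give 4096 directories to spread the cached objects over.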


pid-file
As with the log files, squid stores the pid-file in /usr/local/squid/.... I think you're much better off storing it in /var/run/. So, change the setting pid_filename so that it looks like this
pid_filename /var/run/squid.pid

icp_access
Allow ICP-queries from everybody.
icp_access allow all

cache_mgr
When you're on a LAN you can supply an e-mail address where users can send their remarks or problems regarding squid. This is done with the cache_mgr setting
cache_mgr squid-admin@mydomain.com

visible_hostname
This setting allows you to present a usable hostname in error messages.
visible_hostname proxy.myhost.com

cache_effective_user and cache_effective_group
By default, squid runs as user and group nobody. This is fine when you have nothing else running as this user. However, when there are other services running as 'nobody', things start to get complicated. For this reason I prefer running squid as user squid in the group squid. The effective user and group for squid are set by cache_effective_user and cache_effective_group. So, following my settings, this would look like
cache_effective_user squid
cache_effective_group squid

logging
squid has a built-in feature for rotating its logs. Although this works fine, I prefer to let it be handled by the logrotate of GNU/Linux. You can disable the rotation in squid by changing
logfile_rotate 0

hierarchy_stoplist
This setting lists words which, when found in a URL, make Squid handle the request directly instead of asking neighbor caches.
hierarchy_stoplist cgi-bin ?

no_cache
With the no_cache setting you can tell Squid not to cache the responses to requests that match an ACL. Here that applies to anything with cgi-bin or a ? in the URL path.
acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY

forwarded_for
This setting lets you hide your internal IP address when visiting a site: squid then no longer puts the client's address in the X-Forwarded-For header it sends along.
forwarded_for off
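Because squid.conf is mostly comments, it can be hard to see which settings are actually active after all these changes. A minimal sketch that strips the comments and empty lines (the function name is my own, and it assumes no setting needs a literal # in its value):

```shell
#!/bin/sh
# Print only the active directives of a squid.conf-style file:
# drop comments, trailing whitespace and empty lines.
active_directives() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}
```

Running active_directives /usr/local/squid/etc/squid.conf should then list lines such as http_port 3128 and icp_port 0 without the surrounding documentation.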

6. Precautions before starting squid

Before you can start squid, you need to arrange some other things. First of all, you need to create both a user and a group 'squid'. Make sure it's not possible to log in as user squid (e.g. change /etc/shadow by placing an * before the password). Something like this:
squid:*!!:52654:0:49899:7:::
Now you need to create the cache and log directories. Create both /var/spool/squid and /var/log/squid. Make sure the user and group of these directories is squid and that this user has write permission on both directories.
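A quick way to verify the directories before the first start is a small check like the one below. It is only a sketch with a made-up function name; run it as the squid user, since root can write almost anywhere and would hide permission problems.

```shell
#!/bin/sh
# Report every directory that is missing or not writable by the
# current user; the return status is non-zero if anything is wrong.
check_dirs() {
    status=0
    for d in "$@"; do
        if [ ! -d "$d" ]; then
            echo "missing: $d"
            status=1
        elif [ ! -w "$d" ]; then
            echo "not writable: $d"
            status=1
        fi
    done
    return $status
}
```

For the layout used here you would call check_dirs /var/spool/squid /var/log/squid and expect no output at all.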

The settings of squid are in /usr/local/squid/etc. I think this is quite strange because the settings of all other services are somewhere in /etc. I suggest you make a symlink in /etc towards the squid etc directory. You can do this with
ln -s /usr/local/squid/etc /etc/squid
On my system, all binaries reside in /usr/sbin. To make squid compatible with the other binaries (the binary of squid is in /usr/local/squid/bin/) I suggest you also make a symlink in /usr/sbin towards the squid-binary with
ln -s /usr/local/squid/bin/squid /usr/sbin/squid
The next thing is to make sure your logs are rotated properly. As I've mentioned before, I disabled the built-in rotate function of squid (by setting logfile_rotate to 0). I let the logrotate service of GNU/Linux do the trick. All you need to do is place a config file in your /etc/logrotate.d directory. For the syntax of this config file please refer to the logrotate man page. A good example would be
Example of /etc/logrotate.d/squid

/var/log/squid/access.log {
  daily
  rotate 4
  copytruncate
  compress
  notifempty
  missingok
}

/var/log/squid/cache.log {
  daily
  rotate 4
  copytruncate
  compress
  notifempty
  missingok
}

/var/log/squid/store.log {
  daily
  rotate 4
  copytruncate
  compress
  notifempty
  missingok
  # This script asks squid to rotate its logs on its own.
  # Restarting squid is a long process and it is not worth
  # doing it just to rotate logs
  postrotate
  /usr/sbin/squid -k rotate
  endscript
}
As you can see, I use the /usr/sbin/squid -k rotate command to let squid rotate its logs. You can issue this command whenever you feel the need to.

The last thing you need to do before you can get squid up and running is create the swap directories. This can be done with a single command
/usr/local/squid/bin/squid -z

7. Starting squid

You can start squid by simply typing
/usr/sbin/squid
The squid process will start and run in the background (check this with ps -aux). I find this way of starting a service rather inconvenient, so I'm using the start-up script that comes with the RedHat binary (with a few minor adjustments). This script needs to be placed in /etc/rc.d/init.d/ and should be executable (chmod u+x squid). The advantage of this script is that you can start, stop or restart the squid process in an easy way, and furthermore you can have it come up at boot time (to do so, run the setup tool and under system-services place a * before the squid service).
Example of /etc/rc.d/init.d/squid

 #!/bin/bash
 # squid This shell script takes care of starting and stopping
 # Squid Internet Object Cache
 #
 # chkconfig: - 90 25
 # description: Squid - Internet Object Cache. Internet object caching is \
 # a way to store requested Internet objects (i.e., data available \
 # via the HTTP, FTP, and gopher protocols) on a system closer to the \
 # requesting site than to the source. Web browsers can then use the \
 # local Squid cache as a proxy HTTP server, reducing access time as \
 # well as bandwidth consumption.
 # pidfile: /var/run/squid.pid
 # config: /etc/squid/squid.conf

 PATH=/usr/bin:/sbin:/bin:/usr/sbin
 export PATH

 # Source function library.
 . /etc/rc.d/init.d/functions

 # Source networking configuration.
 . /etc/sysconfig/network

 # Check that networking is up.
 [ ${NETWORKING} = "no" ] && exit 0

 # check if the squid conf file is present
 [ -f /etc/squid/squid.conf ] || exit 0

 # determine the name of the squid binary
 [ -f /usr/sbin/squid ] && SQUID=squid
 [ -z "$SQUID" ] && exit 0

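 # determine which one is the cache_swap directory
 # (with a storage type such as "ufs" on the cache_dir line, the
 # directory is the second word after the cache_dir keyword)
 CACHE_SWAP=`sed -e 's/#.*//g' /etc/squid/squid.conf | \
   grep cache_dir | sed -e 's/cache_dir//' | \
   awk '{ print $2 }'`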
 [ -z "$CACHE_SWAP" ] && CACHE_SWAP=/var/spool/squid

 # default squid options
 # -D disables initial dns checks. If you will most likely not have an
 # internet connection when you start squid, keep this option set
 SQUID_OPTS="-D"

 RETVAL=0
 case "$1" in
 start)
   echo -n "Starting $SQUID: "
   for adir in $CACHE_SWAP; do
    if [ ! -d $adir/00 ]; then
     echo -n "init_cache_dir $adir... "
     $SQUID -z -F 2>/dev/null
    fi
   done
   $SQUID $SQUID_OPTS &
   RETVAL=$?
   echo $SQUID
   [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$SQUID
   ;;

 stop)
   echo -n "Stopping $SQUID: "
   $SQUID -k shutdown &
   RETVAL=$?
   if [ $RETVAL -eq 0 ] ; then
    rm -f /var/lock/subsys/$SQUID
    while : ; do
     [ -f /var/run/squid.pid ] || break
     sleep 2 && echo -n "."
    done
    echo "done"
   else
    echo
   fi
   ;;

 reload)
   $SQUID $SQUID_OPTS -k reconfigure
   exit $?
   ;;

 restart)
   $0 stop
   $0 start
   ;;

 status)
   status $SQUID
   $SQUID -k check
   exit $?
   ;;

 probe)
   exit 0;
   ;;

 *)
   echo "Usage: $0 {start|stop|status|reload|restart}"
   exit 1
  esac
 exit $RETVAL
That's about all for starting squid. You can have the current status displayed by issuing
/etc/rc.d/init.d/squid status
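The part of such a script that looks up the cache directory in squid.conf is easy to try on its own. The sketch below is my own variant (the function name is made up): it ignores commented-out lines and prints the third word of every cache_dir line, which is the directory when a storage type such as ufs is given.

```shell
#!/bin/sh
# Print the directory of every active cache_dir line in a config file.
# Assumes the form "cache_dir <type> <directory> <size> <l1> <l2>".
cache_dirs_from_conf() {
    sed -e 's/#.*//' "$1" | awk '$1 == "cache_dir" { print $3 }'
}
```

Called as cache_dirs_from_conf /etc/squid/squid.conf with the settings from this document, it should print /var/spool/squid.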

8. Using squid

I use squid on my Windows-clients for both Internet Explorer and Opera.
In Internet Explorer you can adjust the settings under the menu Extra, Internet-Options, Connections, Lan-settings
For Opera it's under File, Preferences, Connections, Proxy Servers
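To check from a command line that requests really go through the cache, you can look at the X-Cache header that squid adds to its replies. The sketch below assumes curl is installed and uses proxy.myhost.com:3128 as a stand-in for your own proxy; the function name is made up.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print the cache result
# (HIT or MISS) from the first X-Cache header, if there is one.
cache_result() {
    tr -d '\r' | awk 'tolower($1) == "x-cache:" { print $2; exit }'
}

# Hypothetical usage, fetching only the headers through the proxy:
# curl -s -o /dev/null -D - -x http://proxy.myhost.com:3128 \
#     http://www.squid-cache.org/ | cache_result
```

Fetching the same cacheable page twice should normally give MISS the first time and HIT the second time.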

9. Configuration details

There's an excellent paper that covers most of the settings in squid.conf. When you want to tailor squid to your needs I advise you to read through this document. You can find it at http://squid.visolve.com/squid24s1/contents.htm.

Some other details I want to mention regarding the configuration (although some argue this is more setup-related).

First off, if you have set the default umask too restrictively, squid will function but you'll get all kinds of strange error messages in your logs. If you have placed squid in /usr/local/squid, make sure that these permissions are set
  • /usr/local/squid/libexec/unlinkd should be readable and executable by the squid user (unlinkd removes old objects from the cache).
  • /usr/local/squid/etc/mime.conf should be readable by the squid user.
  • Make sure the squid cache is readable, writable and executable by the squid user and group. Otherwise issue a chmod -R g+rwx /path/to/squidcache.

10. Telenet-settings

Telenet is one of the cable providers in Belgium. If you browse the internet through Telenet, you have to use the Telenet proxy. To let Squid work with Telenet, you need to adjust some minor settings. When you have configured all the settings as mentioned above, these are the few things you need to add:
cache_peer proxy.telenet.be parent 8080 8080 no-query default
prefer_direct off
The first directive tells squid to use proxy.telenet.be as the parent. The second tells squid to go through the parent rather than contact a website directly (because direct connections will not work with Telenet).

The following settings should be added where you define the ACLs.
acl telenet dstdomain .telenet.be
acl pandora dstdomain .pandora.be
always_direct allow telenet
always_direct allow pandora
Another setting I use
append_domain .telenet.be

11. Squid and Outlook Express/Hotmail

I had quite some problems with Squid when I wanted to use Outlook Express to collect the mail from my Hotmail accounts. When you review the Squid mailing list you'll see there has been a lot of traffic regarding this topic. The solution that worked for me was this:
acl cudeso_lan src 192.168.1.0/255.255.255.0
http_access allow cudeso_lan
acl extern_cudeso dstdomain www.cudeso.be
never_direct deny extern_cudeso
acl local-servers dstdomain cudeso.be
acl all src 0.0.0.0/0.0.0.0
never_direct deny local-servers
never_direct allow all

12. Automatic browser-configuration

When you're getting tired of reconfiguring all your browsers whenever something has changed, you can use a central 'automatic-configuration' script. First of all, there needs to be a webserver running on your proxy server, because the configuration file will be served through normal HTTP requests. When you're using Apache, open up the mime.types file in your favorite editor (for the right location check your server config file). Add this line:
application/x-ns-proxy-autoconfig pac
When this is done, restart the webserver.

You can freely choose the name of the configuration file, as long as it ends with .pac. This is my example :
function FindProxyForURL(url, host)  {
  if (isInNet(host, "192.168.1.0", "255.255.255.0"))
    {
     return "DIRECT";
    }
  else if (isInNet(host, "192.168.6.0", "255.255.254.0"))
    {
     return "DIRECT";
    }
  else
    {
     return "PROXY proxy.mylan.com:3128;SOCKS proxy.mylan.com:3128";
    }
}
This will return DIRECT for all sites that are on our local LAN and will return the proxy directives for external sites. An in-depth description of the configuration can be found at http://wp.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html
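Browsers tend to ignore the script when the webserver sends it with the wrong MIME type, so it's worth checking the Content-Type header once the file is in place. A small sketch, assuming curl is available and that the file is published as proxy.pac on proxy.mylan.com (both the file name and the function name are just examples):

```shell
#!/bin/sh
# Read HTTP response headers on stdin; succeed only when the
# Content-Type is the proxy auto-config MIME type.
pac_type_ok() {
    tr -d '\r' |
    awk -F': *' 'tolower($1) == "content-type" { print tolower($2) }' |
    grep -q '^application/x-ns-proxy-autoconfig'
}

# Hypothetical usage:
# curl -sI http://proxy.mylan.com/proxy.pac | pac_type_ok && echo "MIME type OK"
```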

If you're using Internet Explorer you can take advantage of the new WPAD feature, which lets the browser find its configuration automatically without you having to enter the location of a configuration script. You can find more details at http://proxy.nsysu.edu.tw/FAQ/FAQ-5.html. You need to follow these steps:
  • Make sure you have a working auto-proxy-configure script
  • Copy this file to a webserver and give it the name wpad.dat
  • Now add / edit an entry in the DNS so that wpad.yourdomain.com points to this webserver
  • Add the setting application/x-ns-proxy-autoconfig dat to the MIME definition file of your webserver. When you've changed this, restart the webserver so that the settings can take effect.
  • Open up Internet Explorer, check that only 'Automatically Detect Settings' is checked and restart your browser.
Your webserver should be listening on port 80 for this to work. If this is not the case (e.g. it's listening on port 8080), open up the config file and add these two lines:
Listen 80
Listen 8080
Copyleft 2002-2007 - cudeso.be - webmaster@cudeso.be