I am a Linux system administrator with some experience. I started using Linux in 2001 and never looked back.
What particularly drew me into the community was the idea that people all over the world were contributing to an open project and building a solution that could potentially be applied anywhere there was a role for computing. I liked the fact that this was a co-operative methodology and not a competitive one...
Anyway, enough about me. The reason I really decided to post here is that I wanted people who would understand the humour to check out my friend's site, where he has saved the PENIX man pages from obliteration so people can still laugh at them for years to come.
Hope you enjoy it.
If this post survives moderation then I promise I will put some useful information here one day :)... as promised:
Unfortunately this is a little dated, as I have not had an opportunity to play with the new stable squid builds much, but I feel that there is a great deal of good information in the text below. Much of this I found in Solaris forums, and those are now dead: every old reference link I have for this stuff just redirects to Oracle...
Unfortunately, Squid is probably one of the least documented applications out there – the documentation that exists is vague, and doesn’t go into much detail all at once. I will stress that the options may not be very relevant anymore. I did end up experimenting with different caching algorithms and added diskd. Also, ext4 will probably be a safe bet for a file system, although I have not benchmarked jfs, btrfs or zfs for a while (unless building on FreeBSD or OpenSolaris, dare I mention them in a Linux user group, I don't think zfs is a solid option).
In debugging squid, there is approximately a 24 hour period after a modification before you really see whether what you have changed has fixed the problem. This is compounded if you add file-system benchmarking into the mix – the cache must refill before you get a decent picture of what is happening.
Playing the Optimization Game
Probably the most important thing to note when deploying squid is that in 99% of cases you will have many thousands – if not millions – of very small files. Because of this, you need to choose a file-system that deals very well with reading/writing many small files concurrently. Enter ReiserFS (check the Phoronix forums for more recent benchmarks; as I said earlier, this is dated and ext4 may be a safer bet).
Having tried both XFS (very poor performance over time) and ext3 (better performance, but still lags a lot under load), I switched over to ReiserFS, and have found that it lives up very well to its reputation of being good with many small files and many reads/writes per second.
I highly recommend that your machine is set up with a separate pair of squid disks, or worst case, on a separate partition on the host OS drives, utilizing a decently fast RAID level (think RAID10 here, don’t bother going near RAID5, you’ll get major I/O lag on writes). I’d recommend going for FAST disks (stay away from IDE here, or you’ll be in a world of pain). (RAID0 is what I used ... cheaper...)
On Debian, you should have ReiserFS support already. On CentOS, you’ll need to enable the centosplus repo by setting enabled=1 in /etc/yum.repos.d/CentOS-Base.repo (on, or around line 59), then yum install reiserfs-utils. (Dated... I ended up using Ubuntu LTS, but it is not really important.)
Then it’s a case of
Where XX is the partition you are going to use for squid – in our case:
Then add your partition to /etc/fstab:
/dev/sdb1 /var/spool/squid reiserfs defaults,notail,noatime 1 2
Note the notail,noatime – these are both important, and will give you a performance boost. For more details about ReiserFS mount options, see here
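If you want to double-check an fstab entry before rebooting, here is a minimal Python sketch that splits the line into its six fields and reports which of the two options mentioned above are missing. The function names (parse_fstab_line, missing_options) are my own, purely for illustration, and the sketch assumes a plain whitespace-separated entry with no comments.

```python
def parse_fstab_line(line):
    """Split one /etc/fstab entry into its six whitespace-separated fields."""
    device, mountpoint, fstype, options, dump, passno = line.split()
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options.split(","),
        "dump": int(dump),
        "passno": int(passno),
    }


def missing_options(entry, wanted=("notail", "noatime")):
    """Return the wanted mount options that the entry does not set."""
    return [opt for opt in wanted if opt not in entry["options"]]


# The entry from this post; an empty list means both options are present.
entry = parse_fstab_line(
    "/dev/sdb1 /var/spool/squid reiserfs defaults,notail,noatime 1 2"
)
print(entry["mountpoint"], entry["fstype"], missing_options(entry))
```

The same check could be pointed at /proc/mounts after mounting, since that file uses the same field layout.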
If you want to make sure your proxy will handle a decent load, like if you are looking after a University or an ISP (OK, I know: cluster the stuff and have lots of room to manoeuvre), then you must compile squid from source, optimized for your hardware, or you will look stupid by the time it is lunch the next day.
I’m not usually a great fan of compiling from source when it comes to multi-system implementations; they make life hard when it comes to system administration, and to be honest, I’m a big fan-boy of packages for ease-of-use, and lack of headaches they cause. This optimization how-to wouldn’t benefit from a ‘now simply install a package’, would it? =)
This is for Squid-2.6.STABLE18 here – it’s the latest STABLE branch. I took a look at the Squid-3.0 release a while ago, and found a lot of bugs (after all, it is in beta), so I’m sticking with 2.6 for now. You can find a full list of versions available here, but I warn you that this how-to is probably only good for 2.6, so YMMV if you choose another version.
Grab the source and extract it. You’ll need the relevant development tools installed – gcc, g++, etc. The following CHOST and CFLAGS will vary based on your processor and platform. The one you will need to change is -march=, and of course, if you’re on a 32bit platform, use CHOST="i686-pc-linux-gnu".
I find the Gentoo-Wiki Safe CFLAGS page to be an excellent reference for quickly finding which -march= definition to use based off processor type.
In our case, we’re running 64bit Intel Quadcore chips, so compile with the following options:
CHOST="x86_64-pc-linux-gnu" \
CFLAGS="-DNUMTHREADS=120 -march=nocona -O3 -pipe -fomit-frame-pointer -funroll-loops -ffast-math -fno-exceptions" \
./configure \
  --prefix=/usr \
  --enable-async-io \
  --enable-icmp \
  --enable-useragent-log \
  --enable-snmp \
  --enable-cache-digests \
  --enable-follow-x-forwarded-for \
  --enable-storeio="aufs" \
  --enable-removal-policies="heap,lru" \
  --with-maxfd=16384 \
  --enable-poll \
  --disable-ident-lookups \
  --enable-truncate \
  --exec-prefix=/usr \
  --bindir=/usr/sbin \
  --libexecdir=/usr/lib/squid
Note the -DNUMTHREADS=120; this is probably an under-estimate for our setup, as you can easily run with 30 on a 500 MHz machine. This define controls the number of threads squid is able to run when using asynchronous I/O. I’ve been quite conservative with this value, as I don’t want Squid to block, or utilize too much CPU. The rest of the CFLAGS heavily optimize the resulting binaries.
I recommend building with the ./configure line as above. Obviously, if you change it, YMMV!
Here’s a rundown of what those options do:
--enable-async-io: enables asynchronous I/O – this is really important, as it stops squid from blocking on disk reads/writes
--enable-icmp: optional, squid uses this to determine the closest cache-peer, and then utilizes the most responsive one based off the ping time. Disable this if you don’t have cache peers.
--enable-useragent-log: causes squid to print the useragent in log entries – useful when you’re using lynx to debug squid speed.
--enable-snmp: To graph all of the squid boxes with cacti/zabbix/zenoss, you’ll want this enabled so that you can send SNMP requests to squid and graph the output.
--enable-cache-digests: required if you want to use cache peering (must have that)
--enable-follow-x-forwarded-for: For multi-level proxying, where requests pass through load balancers before they reach squid. To stop squid from seeing every request as coming from the load balancers, we enable this so squid reads the X-Forwarded-For header and picks up the real IP of the client that’s making the request.
--enable-storeio="aufs": YMMV if you are utilizing an alternate storage I/O method. AUFS is asynchronous, and has significant performance gains over UFS or diskd.
--enable-removal-policies="heap,lru": heap removal policies outperform the LRU policy, and we personally utilize “heap LFUDA”, if you want to use LRU, YMMV.
--with-maxfd=16384: File descriptors can play hell with squid. I’ve set this high to stop squid from either being killed or blocking when it’s under load. The default squid maxfd is (I believe) 4096, and I’ve seen squid hit this numerous times.
--enable-poll: Enables poll() over select(), as this increases performance.
--disable-ident-lookups: Stops squid from performing an ident lookup for every connection. This also removes a possible DoS vulnerability, whereby a malicious user could take down your squid server by opening thousands of connections.
--enable-truncate: Forces squid to use truncate() instead of unlink() when removing cache files. The squid docs claim that this can cause problems when used with async I/O, but so far I haven’t seen this be the case. The trade-off is that truncate() ends up using more inodes on disk.
Go! Go! Gadget Makefile
After your ./configure has finished running, and if there aren’t any errors, it’s time to make. This will take some time, depending on the spec of your machine, but once it’s finished (without errors), you’ll want to make install.
This bit is optional, but doesn’t hurt:
strip /usr/sbin/squid /usr/lib/squid/*
This will remove the symbols from the squid binaries, and give them a slightly smaller memory footprint.
/etc/squid.conf
Now, let’s move on to getting the squid.conf options right…
I’m not going to go into every config option here; if you don’t understand one, I recommend you check out the Configuration Manual, which contains pretty much every option and a description of how to use it.
This would be my recommended squid.conf contents:
NOTE! I’ve stripped out the obvious configuration options that are required, such as http_port IP:PORT, as they are outside the scope of this blog entry.
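hosts_file /etc/hosts
dns_nameservers x.x.x.x x.x.x.x
cache_replacement_policy heap LFUDA
cache_swap_low 90
cache_swap_high 95
maximum_object_size_in_memory 50 KB
cache_dir aufs /var/spool/squid 40000 16 256
cache_mem 100 MB
memory_pools off
maximum_object_size 50 MB
quick_abort_min 0 KB
quick_abort_max 0 KB
log_icp_queries off
client_db off
buffered_logs on
half_closed_clients off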
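Before settling on the cache_dir and cache_mem figures, it can help to run the numbers. The rough Python sketch below takes the cache_dir size and cache_mem, applies the 90%/95% watermarks and this post's rule of thumb of roughly 100 MB of RAM per GB of cache storage, and prints where pruning would start and how much RAM to budget. The function name cache_sizing and the choice of 1 GB = 1024 MB are my own assumptions; the 100 MB-per-GB figure is only the estimate used in this post, so adjust index_mb_per_gb if your own measurements differ.

```python
def cache_sizing(cache_dir_mb, cache_mem_mb,
                 swap_low_pct=90, swap_high_pct=95,
                 index_mb_per_gb=100):
    """Estimate prune thresholds and RAM needs for one cache_dir.

    index_mb_per_gb is the post's rule of thumb, not a measured value.
    """
    cache_dir_gb = cache_dir_mb / 1024  # assumption: 1 GB = 1024 MB
    return {
        "prune_starts_mb": cache_dir_mb * swap_low_pct / 100,
        "aggressive_prune_mb": cache_dir_mb * swap_high_pct / 100,
        "estimated_ram_mb": cache_mem_mb + cache_dir_gb * index_mb_per_gb,
    }


def fits_in_ram(plan, available_ram_mb):
    """True if the estimated RAM use stays below what the box has."""
    return plan["estimated_ram_mb"] < available_ram_mb


# Values from the listing: a 40000 MB cache_dir and cache_mem of 100 MB.
plan = cache_sizing(cache_dir_mb=40000, cache_mem_mb=100)
for name, value in plan.items():
    print(name, value)
print("fits in 8 GB:", fits_in_ram(plan, 8 * 1024))
```

Passing swap_low_pct=94 shows how much narrower the gap between the two watermarks becomes on a large cache.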
Okay, so what does all that do?
hosts_file /etc/hosts: Forces squid to look in /etc/hosts for any hosts file entries; don’t ask me why, but it isn’t good at figuring out that this is the default place on every Linux distribution.
dns_nameservers x.x.x.x x.x.x.x: Important! Squid will stall connections while attempting to do DNS lookups. Somehow, specifying DNS name-servers within the squid.conf stops this from happening (and yes, they must be valid name-servers).
cache_replacement_policy heap LFUDA: You may not want to use the LFUDA replacement policy. If not, I recommend you stick with a variant on heap, as there are massive performance gains over LRU. Details of the other policies are here
cache_swap_low 90: Low water mark before squid starts purging stuff from its cache – this is in percent. If you have a 10 GB cache storage limit, squid will begin to prune at 9 GB used.
cache_swap_high 95: The high water mark. Squid will aggressively prune old cache files utilizing the replacement policy defined above. This would take place at 9.5 GB in our above example. If you have a huge cache, it’s worth noting that your percentages would be better served closer together, i.e. a 100 GB cache is 90 GB/95 GB – a 5 GB difference. In this case, it would be better to have a 94%/95% watermark setup.
maximum_object_size_in_memory 50 KB: Unless you want to serve larger files super fast from memory, I recommend keeping this low – mainly to keep memory usage under control. Large files monopolizing vital RAM, while giving you a better byte hit-rate, will sacrifice your request hit-rate, as smaller files will keep getting swapped in and out.
cache_dir aufs /var/spool/squid X X X: I highly recommend NOT changing from AUFS. All the other storage methods in my benchmarking have been a lot slower performance wise. Obviously, replace the 3 X’s here with your storage limits.
cache_mem 100 MB: Keep this set low-ish. This represents the maximum amount of RAM that squid will utilize to keep cached objects in memory. Remember, squid requires about 100 MB of RAM per GB of cache storage. If you have a 10 GB cache, squid will use ~1 GB just to handle that. Make sure that cache_mem + (storage size limit in GB * 100 MB) is less than your available RAM, or your squid will start to swap.
memory_pools off: This stops squid from holding onto ram that it is no longer actively using.
maximum_object_size 50 MB: Tweak this to suit the approximate maximum object size you’re going to serve from cache. I’d recommend not putting this up too high though.
quick_abort_min 0 KB: This feature is useful in some cases, but not in an optimized squid case. What quick_abort does, in layman's terms, is evaluate how much data is left to be transferred if a client cancels a transfer. If that amount is within the quick_abort range, squid will continue downloading the file and then swap it out to cache. Sounds good, right? Hell no. If a client makes multiple requests, you can end up with squid finishing off multiple fetches for the same file. This bogs everything down, and causes your squid to be slow. 0 KB disables this feature.
quick_abort_max 0 KB: See quick_abort_min
log_icp_queries off: If you’re using cache_peers, you probably don’t need to know every time squid goes and talks to one of its peer-caches. This is needless logging in most cases, and is just an extra I/O thread that could be used elsewhere.
client_db off: If enabled, squid will keep statistics on each client. This can become a memory hog after a while, so it’s best to keep it disabled.
buffered_logs on: Buffers the write-out to log files. This can increase performance slightly. YMMV.
half_closed_clients off: Sends a connection-close to clients that leave a half open connection to the squid server.
Tweak my /proc baby, yeah!
Okay, so Squid is optimized; what about the TCP stack? By default, a pristine installation is ‘optimized’ for any-use. By that, I mean it has a set of default kernel-level configuration settings that really don’t play ball well with network/disk intensive applications. We need to make a few modifications.
First thing is to ‘modprobe ip_conntrack’, and add this module to either /etc/modules (Debian) or /etc/modprobe.conf (RHEL/CentOS). This will stop squid from spitting out the terribly useful message
parseHttpRequest: NF getsockopt(SO_ORIGINAL_DST) failed: (92) Protocol not available
With that done, let’s make some sysctl modifications…
Add the following lines to the end of your /etc/sysctl.conf:
fs.file-max = 65535
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.tcp_mem = 4096 4096 4096
net.ipv4.tcp_low_latency = 1
net.core.netdev_max_backlog = 4000
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_max_syn_backlog = 16384
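To see whether the running kernel actually picked these values up, here is a small Python sketch. It relies on the fact that a sysctl key maps onto a path under /proc/sys with the dots turned into slashes, parses sysctl.conf-style text, and compares each wanted value with the live one. The helper names (sysctl_path, parse_sysctl_conf, read_live, pages_to_mib) are invented for this example, and only three of the settings are shown to keep it short.

```python
import os


def sysctl_path(key, root="/proc/sys"):
    """Map a key such as net.ipv4.tcp_rmem to its file under /proc/sys."""
    return os.path.join(root, *key.split("."))


def parse_sysctl_conf(text):
    """Parse 'key = value' lines; blank lines and #/; comments are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.split()
    return settings


def read_live(key):
    """Return the running kernel's value as a list of fields, or None."""
    try:
        with open(sysctl_path(key)) as fh:
            return fh.read().split()
    except OSError:
        return None


def pages_to_mib(pages, page_size=None):
    """Convert a page count (the unit tcp_mem uses) to MiB."""
    page_size = page_size or os.sysconf("SC_PAGE_SIZE")
    return pages * page_size / (1024 * 1024)


wanted = parse_sysctl_conf("""
fs.file-max = 65535
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.ip_local_port_range = 1024 65000
""")
for key, value in wanted.items():
    live = read_live(key)
    print(key, "wanted", value, "live", live,
          "ok" if live == value else "differs")
```

One thing worth checking with pages_to_mib: net.ipv4.tcp_mem is counted in memory pages rather than bytes, so pages_to_mib(4096, page_size=4096) gives 16.0, meaning 4096 pages is only about 16 MiB on a system with 4 KiB pages.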
There are also a couple of TCP related parameters that might need to be tuned as well:
ip_local_port_range: increase
tcp_max_syn_backlog: increase
tcp_fin_timeout: decrease
(well, all of them need tuning if you are running with a high request rate like 100+)
I’ll let you google for the meaning of those changes, they’re documented almost everywhere; I’m merely telling you which ones are worth changing. Note that with the file-max entry, you’ll also want to modify /etc/security/limits.conf and add:
With that done, your best bet is to reboot, and let the box pick up the changes that way. I’ve had some funky issues with squid + file-descriptor changes on the fly.
When the box is back up, start up squid, and have fun. You’re optimized. =)
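Once the box is back up, a quick sanity check on the file-descriptor side doesn't hurt. This is a minimal Python sketch using the standard resource module to read the per-process open-file limit and compare it against the 16384 descriptors squid was built for with --with-maxfd. The helper names are mine, and the result only describes the user and shell you run it from, so run it as the account squid runs under.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def enough_descriptors(wanted, soft, unlimited=resource.RLIM_INFINITY):
    """True if the soft limit allows at least `wanted` open files."""
    return soft == unlimited or soft >= wanted


soft, hard = fd_limits()
print("soft limit:", soft)
print("hard limit:", hard)
print("enough for --with-maxfd=16384:", enough_descriptors(16384, soft))
```

If the last line prints False, the limits.conf change probably has not taken effect for that user yet.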
Optional Features:
--disable-FEATURE  do not include FEATURE (same as --enable-FEATURE=no)
--enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
--enable-maintainer-mode  enable make rules and dependencies not useful (and sometimes confusing) to the casual installer
--disable-dependency-tracking  speeds up one-time build
--enable-dependency-tracking  do not reject slow dependency extractors
--enable-dlmalloc=LIB  Compile & use the malloc package by Doug Lea
--enable-gnuregex  Compile GNUregex. Unless you have reason to use this option, you should not enable it. This library file is usually only required on Windows and very old Unix boxes which do not have their own regex library built in.
--enable-mempool-debug  Include MemPool debug verifications
--enable-xmalloc-statistics  Show malloc statistics in status page
--disable-carp  Disable CARP support
--enable-async-io=N_THREADS  [...] --with-pthreads --enable-storeio=ufs,aufs
--enable-storeio="list of modules"  Build support for the list of store I/O modules. The default is only to build the "ufs" module. See src/fs for a list of available modules, or Programmers Guide section <not yet written> for details on how to build your custom store module
--enable-heap-replacement  Backwards compatibility option. Please use the new --enable-removal-policies directive instead.
--enable-removal-policies="list of policies"  Build support for the list of removal policies. The default is only to build the "lru" module. See src/repl for a list of available modules, or Programmers Guide section 9.9 for details on how to build your custom policy
--enable-icmp  Enable ICMP pinging
--enable-delay-pools  Enable delay pools to limit bandwidth usage
--enable-useragent-log  Enable logging of User-Agent header
--enable-referer-log  Enable logging of Referer header
--disable-wccp  Disable Web Cache Coordination V1 Protocol
--disable-wccpv2  Disable Web Cache Coordination V2 Protocol
--enable-kill-parent-hack  Kill parent on shutdown
--enable-forward-log  Enable experimental forward_log directive
--enable-multicast-miss  Enable experimental multicast notification of cachemisses
--enable-snmp  Enable SNMP monitoring
--enable-cachemgr-hostname=hostname  Make cachemgr.cgi default to this host
--enable-arp-acl  Enable use of ARP ACL lists (ether address)
--enable-htcp  Enable HTCP protocol
--enable-ssl  Enable ssl gatewaying support using OpenSSL
--enable-forw-via-db  Enable Forw/Via database
--enable-cache-digests  Use Cache Digests see http://www.squid-cache.org/FAQ/FAQ-16.html
--enable-default-err-language=lang  Select default language for Error pages (see errors directory)
--enable-err-languages="lang1 lang2.."  Select languages to be installed. (All will be installed by default)
--enable-coss-aio-ops  Enable COSS I/O with Posix AIO (default is aufs I/O)
--enable-select  Force the use of select support. Normally configure automatically selects a better alternative if available.
--disable-select  Disable select support, causing configure to fail if a better alternative is not available
--enable-select-simple  Force the use of select support (POSIX). Useful if your system only supports the bare minimum POSIX select requirements without fds_bits.
--enable-poll  Force the use of poll even if automatic checks indicate poll may be broken on your platform.
--disable-poll  Disable the use of poll.
--enable-epoll  Force the use of epoll even if automatic checks indicate epoll may not be supported.
--disable-epoll  Disable the use of epoll.
--enable-kqueue  Force the use of kqueue even if automatic checks indicate kqueue may not be supported.
--disable-kqueue  Disable kqueue support.
--disable-http-violations  This allows you to remove code which is known to violate the HTTP protocol specification.
--enable-ipf-transparent  Enable Transparent Proxy support for systems using IP-Filter network address redirection.
--enable-pf-transparent  Enable Transparent Proxy support for systems using PF network address redirection.
--enable-linux-netfilter  Enable Transparent Proxy support for Linux 2.4 and later
--enable-large-cache-files  Enable support for large cache files (>2GB). WARNING: on-disk cache format is changed by this option
--enable-linux-tproxy  Enable real Transparent Proxy support for Netfilter TPROXY.
--enable-leakfinder  Enable Leak Finding code. Enabling this alone does nothing; you also have to modify the source code to use the leak finding functions. Probably Useful for hackers only.
--disable-ident-lookups  This allows you to remove code that performs Ident (RFC 931) lookups.
--disable-internal-dns  This prevents Squid from directly sending and receiving DNS messages, and instead enables the old external 'dnsserver' processes.
--enable-truncate  This uses truncate() instead of unlink() when removing cache files. Truncate gives a little performance improvement, but may cause problems when used with async I/O. Truncate uses more filesystem inodes than unlink.
--enable-default-hostsfile=path  Select default location for hosts file. See hosts_file directive in squid.conf for details
--enable-win32-service  Compile Squid as a WIN32 Service. Works only on Windows NT and Windows 2000 Platforms.
--enable-auth="list of auth scheme modules"  Build support for the list of authentication schemes. The default is to build support for the Basic scheme. See src/auth for a list of available modules, or Programmers Guide section authentication schemes for details on how to build your custom auth scheme module
--enable-basic-auth-helpers="list of helpers"  This option selects which basic scheme proxy_auth helpers to build and install as part of the normal build process. For a list of available helpers see the helpers/basic_auth directory.
--enable-ntlm-auth-helpers="list of helpers"  This option selects which proxy_auth ntlm helpers to build and install as part of the normal build process. For a list of available helpers see the helpers/ntlm_auth directory.
--enable-digest-auth-helpers="list of helpers"  This option selects which digest scheme proxy_auth helpers to build and install as part of the normal build process. For a list of available helpers see the helpers/digest_auth directory.
--enable-negotiate-auth-helpers="list of helpers"  This option selects which negotiate scheme authentication helpers to build and install as part of the normal build process. For a list of available helpers see the helpers/negotiate_auth directory.
--enable-ntlm-fail-open  Enable NTLM fail open, where a helper that fails one of the Authentication steps can allow squid to still authenticate the user.
--enable-external-acl-helpers="list of helpers"  This option selects which external_acl helpers to build and install as part of the normal build process. For a list of available helpers see the [...]
[...]  Enable automatic call backtrace on fatal errors
--enable-x-accelerator-vary  Enable support for the X-Accelerator-Vary HTTP header. Can be used to indicate variance within an accelerator setup. Typically used together with other code that adds custom HTTP headers to the requests.
--enable-follow-x-forwarded-for  Enable support for following the X-Forwarded-For HTTP header to try to find the IP address of the original or indirect client when a request has been forwarded through other proxies.
Optional Packages:
--with-PACKAGE[=ARG]  use PACKAGE [ARG=yes]
--without-PACKAGE  do not use PACKAGE (same as --with-PACKAGE=no)
--with-valgrind-debug  Include debug instrumentation for use with valgrind
--with-aufs-threads=N_THREADS  Tune the number of worker threads for the aufs object store.
--with-pthreads  Use POSIX Threads
--with-aio  Use POSIX AIO
--with-dl  Use dynamic linking
--with-openssl=prefix  Compile with the OpenSSL libraries. The path to the OpenSSL development libraries and headers installation can be specified if outside of the system standard directories
--with-coss-membuf-size  COSS membuf size (default 1048576 bytes)
--with-large-files  Enable support for large files (logs etc).
--with-build-environment=model  The build environment to use. Normally one of
    POSIX_V6_ILP32_OFF32  32 bits
    POSIX_V6_ILP32_OFFBIG  32 bits with large file support
    POSIX_V6_LP64_OFF64  64 bits
    POSIX_V6_LPBIG_OFFBIG  large pointers and files
    XBS5_ILP32_OFF32  32 bits (legacy)
    XBS5_ILP32_OFFBIG  32 bits with large file support (legacy)
    XBS5_LP64_OFF64  64 bits (legacy)
    XBS5_LPBIG_OFFBIG  large pointers and files (legacy)
    default  The default for your OS
--with-maxfd=N  Override maximum number of filedescriptors. Useful if you build as another user who is not privileged to use the number of filedescriptors you want the resulting binary to support
Some influential environment variables:
CC  C compiler command
CFLAGS  C compiler flags
LDFLAGS  linker flags, e.g. -L<lib dir> if you have libraries in a nonstandard directory <lib dir>
CPPFLAGS  C/C++ preprocessor flags, e.g. -I<include dir> if you have headers in a nonstandard directory <include dir>
CPP  C preprocessor