Cache Packages Locally with Apt-cacher






Save time and bandwidth when updating multiple Ubuntu machines by keeping a local package cache.

If you manage multiple Ubuntu machines, you probably wish there were some way to download new packages only once and install them on every machine. Better still, it would be good if it worked totally transparently so you could just use the regular package-management tools in Ubuntu and not care about what happens behind the scenes.

Apt-cacher allows you to do exactly that. With Apt-cacher running on one machine on your network, you can configure all your other machines to fetch packages through it.

apt generally uses HTTP to fetch packages from package servers, so it's fairly easy to use a normal HTTP proxy like Squid [Hack #98] to cache packages locally. However, Squid is designed to cache lots of small items, while software packages are usually a few large items. You may find that Squid drops large packages from its cache, and those are the very packages most worth keeping for reuse. To make apt use a proxy, you can set the option permanently in the config file (see man apt.conf for details) or just export the http_proxy environment variable by running a command like export http_proxy=http://proxy.example.com:8080 before running apt.
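
As a minimal sketch of the permanent option, the setting apt reads is Acquire::http::Proxy. The helper below writes that one line to whatever file you pass it. The function name write_apt_proxy, the proxy address, and the file names are assumptions made for illustration, not anything shipped with apt.

```shell
#!/bin/sh
# write_apt_proxy is a hypothetical helper, not part of apt.
# It writes a single Acquire::http::Proxy line to the file given as $2.
write_apt_proxy() {
    proxy_url=$1    # e.g. http://proxy.example.com:8080 (assumed address)
    conf_file=$2    # destination file for the setting
    printf 'Acquire::http::Proxy "%s";\n' "$proxy_url" > "$conf_file"
}
```

For example, you might run write_apt_proxy http://proxy.example.com:8080 /tmp/01proxy, check the result, and then copy it into place with sudo cp /tmp/01proxy /etc/apt/apt.conf.d/01proxy. The file name 01proxy is arbitrary.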


Apt-cacher is different from many other caching systems because rather than being a standalone program, it runs as a CGI script under Apache. That has a number of advantages: because it doesn't need its own protocol-handling code, it is small and simple, and therefore more robust. It is also very flexible, because you can use Apache's built-in access-control mechanism if you want to let only certain machines use your cache.

Apt-cacher itself needs to be set up on only one machine, the one you decide to use as your local cache. Every other computer on your local network then needs one setting changed so that it directs all package requests to your cache machine rather than directly to the package server.

Apt-cacher works by intercepting requests for packages and fetching them on behalf of local machines, while simultaneously storing them on disk in case other machines later ask for the same package. Once set up, there is no need to do anything differently to install packages: just install a package on one machine with apt or Synaptic, and it comes off the Internet; then when you install it on other machines later, it comes from the local cache. Easy!

Installing Apt-cacher

Getting Apt-cacher working involves two parts: setting up the cache server itself and then telling your local machines to use it.

Server setup

First, select a machine to use as your cache server. Apt-cacher puts very little load on the system, so you can safely run it on just about any machine you have available, even one that's normally used as a workstation. Probably the most critical things are to make sure your cache server has a fixed IP address, so other computers on your network can find it, and that there is plenty of disk space, because the cache itself can become quite large. Disk usage depends on how many packages you have cached, so the greater variety of software you run, the more space you will need. A few hundred megabytes is common, while large caches may need several gigabytes.
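
To keep an eye on how much space the cache is using, a small helper around du is usually enough. This is only a sketch: cache_size_kb is a hypothetical function name, and the directory in the usage note is an assumption about where the package keeps its cache, so check the cache directory setting in your own configuration.

```shell
#!/bin/sh
# cache_size_kb is a hypothetical helper: print the size of a directory in kilobytes.
cache_size_kb() {
    # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
    du -sk "$1" | cut -f1
}
```

Assuming the cache lives in /var/cache/apt-cacher, you could run cache_size_kb /var/cache/apt-cacher from time to time, or from a cron job, to see whether the disk is filling up.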

On the machine nominated to be your cache, start by installing the apt-cacher package:

$ sudo apt-get install apt-cacher apache
               

This will also install Apache plus a couple of other packages, unless they were already in place. If you already have Apache installed, you should restart it:

$ sudo /etc/init.d/apache restart
               

And you're done. (If you installed Apache after you installed Apt-cacher, you should run the command sudo dpkg-reconfigure apt-cacher.) You can test that the installation worked properly by opening a web browser and going to the address http://<cache.example.com>/apt-cacher, replacing <cache.example.com> with the hostname or IP address of your cache server. If all went well, you'll see an information page generated by Apt-cacher, as shown in the figure below.

[Figure: Apt-cacher default screen]


Client setup

Client machines don't need to have anything installed to use Apt-cacher; they just need to have their list of package sources modified so that they send their package requests to the cache server.

The list of package sources is stored in a file called /etc/apt/sources.list (see "Modify the List of Package Repositories" [Hack #60] for information on modifying this file). Here is an HTTP entry for the dapper repository:

deb http://archive.ubuntu.com/ubuntu/ dapper main restricted

Each HTTP entry needs to have the address of your cache server prepended, so the previous entry becomes something like this:

deb http://cache.example.com/apt-cacher/archive.ubuntu.com/ubuntu/ dapper main restricted

Once you've modified all your entries, run:

$ sudo apt-get update
               

to tell your machine to update its package list, and you're set. Any packages you install from now on will come via the cache server.
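
If you have many clients or many entries, the sources.list edit described above can be scripted. The following is a rough sketch rather than a polished tool: rewrite_sources is a hypothetical helper, CACHE_HOST is an assumed hostname you should replace with your own cache server, and the pattern handles only plain deb and deb-src lines that use http://.

```shell
#!/bin/sh
# Hypothetical helper: route HTTP "deb"/"deb-src" entries through the cache.
# CACHE_HOST is an assumed hostname; substitute your own cache server.
CACHE_HOST="${CACHE_HOST:-cache.example.com}"

rewrite_sources() {
    # Reads sources.list content on stdin and writes the rewritten list to stdout.
    # Lines that already contain /apt-cacher/ are skipped, so re-running is harmless.
    # Commented-out entries are left untouched because they do not start with "deb".
    sed -E '/\/apt-cacher\//! s#^([[:space:]]*deb(-src)?[[:space:]]+)http://#\1http://'"$CACHE_HOST"'/apt-cacher/#'
}
```

A cautious way to use it is to write the result to a scratch file first, for example rewrite_sources < /etc/apt/sources.list > /tmp/sources.list.new, compare the two files, back up the original, and only then copy the new file into place with sudo.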

Configuration Options

At this point, you've created a working installation of Apt-cacher without touching a single config setting on the cache server. However, Apt-cacher has a number of options you can set by editing the file /etc/apt-cacher/apt-cacher.conf. You don't need to restart anything after editing this file; all changes are immediate.

The config file is very well commented, so for all the details, just read the file itself. Options include the ability to restrict use of your cache to specific IPv4 or IPv6 address ranges, restrict access to only approved software repositories, and pass requests through an upstream proxy server.
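
Because the file is mostly comments, it can help to list only the settings that are actually in effect. Here is a small sketch, assuming comment lines begin with #; active_settings is a hypothetical helper name, not something Apt-cacher provides.

```shell
#!/bin/sh
# active_settings is a hypothetical helper: print the lines of a config file
# that are neither blank nor comments (lines starting with #).
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

Running active_settings /etc/apt-cacher/apt-cacher.conf before and after an edit gives you a quick way to confirm what you changed.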

Traffic Reports

Apt-cacher can be configured to generate daily traffic reports. Generating reports is extremely fast even with a high-traffic cache and happens only once per day, so this option can safely be turned on without any impact on performance. To access the report, just point your browser at http://<cache.example.com>/apt-cacher/report, and you should see something like the figure below.

[Figure: Apt-cacher traffic report]


The actual report is generated by a script at /usr/share/apt-cacher/apt-cacher-report.pl, which is run by cron and writes an HTML report to /var/log/apt-cacher/report.html.

Hacking the Hack

What if you want the report emailed to you each day? Because the report is in HTML format, it will need to be sent as an attachment. To do so, you can use the extremely versatile Mutt mail client. First, install the mutt package:

$ sudo apt-get install mutt
            

Then, edit /etc/cron.daily/apt-cacher and add the following line to the end:

echo "Apt-cacher report `date`" | mutt -s "Apt-cacher report" \
  -a /var/log/apt-cacher/report.html -- user@example.com
            

(Be sure to replace user@example.com with your email address.) Run this whole command manually just to check that it works and to force Mutt to create a mail directory for the user if it doesn't already exist. From now on, you'll receive daily updates on the efficiency of your Apt-cacher cache.


