Details of compression=gzip

Steve Bergman sbergman27 at gmail.com
Sat Aug 20 10:47:10 PDT 2011


This is not really a question, but a "data point" for the archive, for posterity, for anyone interested in using compression=gzip.

I have been using it for a short time now (a couple of days), with a mix of about 80 users running Gnome desktops doing all the usual browsing, email, etc., plus a character-based POS and a character-based accounting package. The character-based applications generate by far the most jobs, but those jobs are much smaller; they compress very well, but their total uncompressed byte count is minuscule compared to the PS and PDF streams. I just ran a report from the access_log files comparing the reported local OKC sizes to the reported Dallas sizes for yesterday. Overall, it comes to 161MB vs 119MB inbound, about a 26% improvement. Pretty handy.
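For anyone wanting to run a similar report, something along these lines works against each server's access_log; note the field positions assume the default common-log-style layout CUPS writes (host, user, timestamp, request, status, bytes), so verify them against your own logs before trusting the totals:

```shell
# Sketch (field positions are an assumption about the default CUPS
# access_log layout): sum the bytes field for POST requests, i.e.
# the job data submitted to the server.
sum_bytes() {
    # $6 is the quoted method ("POST), $10 is the bytes column in:
    # host - user [date tz] "POST /uri HTTP/1.1" status bytes ...
    awk '$6 == "\"POST" { total += $10 } END { print total + 0 }' "$1"
}

# Demo against a sample log; point it at /var/log/cups/access_log
# on each server to compare the two sides.
cat > sample_access_log <<'EOF'
localhost - - [19/Aug/2011:09:00:01 -0500] "POST /printers/pos1 HTTP/1.1" 200 1048576 Print-Job successful-ok
localhost - - [19/Aug/2011:09:05:42 -0500] "POST /printers/desk HTTP/1.1" 200 2097152 Print-Job successful-ok
EOF
sum_bytes sample_access_log
```

Run it on the local (OKC) and remote (Dallas) logs for the same day and compare the two totals.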

But that doesn't really tell the whole tale. Some of the large PDFs compress by 20%-30%; some not at all. The PS streams tend to be more compressible, but often not by any more than some of the PDFs. But there is the occasional (PS) job that one of my users queues which is 177MB uncompressed, but only 2.2MB compressed. This is, admittedly, an exceptional situation, but looking back through the history it happens around once a week, and is enough to keep our WAN link busy for 20 minutes straight. (One can only wonder why I hear about an added 3 second pause due to an SNMP timeout almost immediately, but nobody has ever mentioned this job that consistently takes 20 minutes to even start printing!) This is particularly significant since the users are running Gnome desktops over Nomachine NX, so latency is very much a concern.

The good news is that, for the most part, Linux GUI desktop software, at least in this case, sends out fairly compact print data on balance. The bad news is that the pathological cases truly are pathological, and you might never hear about them.

Also note that you can't just put "compression=gzip" in the DeviceURI and forget it. I did that, and then learned that compression is only supported when the sending queue is raw. So you might have to change your architecture a bit for it to work.
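Concretely, the sending side ends up looking something like this (hostnames and queue names are placeholders, not my real setup):

```shell
# Hypothetical sketch: a raw queue on the sending server pointing at
# the receiving CUPS server, with gzip compression on the wire.
lpadmin -p pos1 \
        -v 'ipp://dallas.example.com/printers/pos1?compression=gzip' \
        -E
# No -m/-P option: with no local PPD the queue stays raw, so cupsd
# forwards the job data untouched, and compression is honored. All
# filtering then happens on the receiving server's queue.
```

The corresponding queue on the receiving server keeps the PPD/driver, which is the architecture change: filtering moves from the sending box to the receiving one.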

Also worth noting, though not directly related to CUPS: if you do change your architecture, the receiving server needs to be at least CUPS 1.2 (which, e.g., the current CentOS/RHEL 4.9 is not), and it must have a Ghostscript which doesn't go all pathological and use up all system memory and swap when CUPS is sent a PDF. The current CentOS 4.9 Ghostscript *does* go all pathological when a user sends a PDF. I found that compiling a later (8.64) version of Ghostscript fixed that.
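By "compiling a later version" I mean a plain local build along these lines (the tarball comes from the Ghostscript release archive; the prefix and lack of extra configure flags are just my choices, adjust to taste):

```shell
# Sketch of a side-by-side Ghostscript 8.64 build; assumes the
# release tarball is already downloaded into the current directory.
tar xzf ghostscript-8.64.tar.gz
cd ghostscript-8.64
./configure --prefix=/usr/local   # keeps the distro gs untouched
make
make install                      # installs /usr/local/bin/gs
```

Make sure the CUPS filters pick up the new gs (e.g. via PATH ordering) rather than the distro's /usr/bin/gs.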

-Steve Bergman



