Excessive number of processes for the same printer

Michael Sweet mike at easysw.com
Fri Aug 5 11:51:02 PDT 2005

Adam Hooper wrote:
> We have a document processing program written in Python which
> outputs PDFs to about 30 potential printers (mostly Windows printers
> connected with Samba, but also some LPD and JetDirect printers; all
> are LaserJets). The Python program generates PDFs and ships them off
> to CUPS using "lpr" on the command line -- usually each Python script
> invocation ships all generated PDFs to the same printer.
> We have three serious problems with this setup:
> 1. When several hundred PDFs are directed to the same printer (in our
> most recent case, a Windows shared printer), we end up with 5
> processes per job (as described by "ps ax"):
>  - a "/usr/bin/perl -w /usr/lib/cups/filter/pdftops"
>  - a process named after the printer (e.g., "pc26_hp1200")
>  - "/usr/bin/perl /usr/lib/cups/filter/foomatic-rip"
>  - a process named after the printer's URL (e.g., "smb://domain.com/pc26/hp1200")
>  - "/usr/bin/pdftops"

Well, that's what you get when you use non-standard CUPS drivers
and filters...

Looks like you have a Perl script in place of the standard pdftops,
and then you are using foomatic (which is unnecessary for 99% of
all laser printers - stick with the standard vendor PPD).  If you
use the original pdftops filter and the vendor PPD, you will end up
with 3 processes instead of 5...

> ...
> The closest solution I can find is the FilterLimit directive in
> cupsd.conf... which leads to our second problem.

That is the best way to control the number of active print filters.
For a PostScript printer, budget a cost of 100 for each print queue
you want printing at the same time, and set FilterLimit to the total.
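
To make that sizing rule concrete, here is a minimal Python sketch that
turns a number of concurrently printing queues into a cupsd.conf line.
The constant and the helper name are illustrative assumptions, not part
of CUPS; only the FilterLimit directive and the figure of 100 per queue
come from the discussion above.

```python
# Sketch only: derive a FilterLimit value from the "100 per queue" rule of
# thumb for PostScript printers. COST_PER_QUEUE and the function name are
# hypothetical helpers, not CUPS identifiers.
COST_PER_QUEUE = 100


def filter_limit_directive(concurrent_queues: int) -> str:
    """Return a cupsd.conf line allowing that many queues to filter at once."""
    if concurrent_queues < 1:
        raise ValueError("need at least one concurrently printing queue")
    return f"FilterLimit {COST_PER_QUEUE * concurrent_queues}"


if __name__ == "__main__":
    # Example: let two queues run their filters at the same time.
    print(filter_limit_directive(2))
```

The resulting line would go into cupsd.conf, and the scheduler would need
a restart to pick it up.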

You can also use FilterNice to give higher priority to other
processes on the server...

> 2. When a printer is unavailable (for instance, a Windows machine
> is turned off), all those foomatic processes continue to run until
> the printer becomes available again. The delay could easily be
> several months, or even forever.

There is nothing you can do about that, short of modifying the
smbspool source (it is part of the Samba source code) so that it
doesn't retry indefinitely.

> Is it a good idea to perhaps run a cronjob which inspects queued
> jobs and emails us if a printer has been unavailable for a
> prolonged period?

That would be another thing you could do.  At the very least it
would alleviate the congestion.
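
A cron-driven check along those lines might look roughly like the Python
sketch below. It assumes you already have each queued job's printer name,
job ID, and submission time; how you collect them (for example by parsing
"lpstat -o" output) is deliberately left out, since that output's date
format depends on the locale. The function names and the one-week
threshold are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def stale_printers(jobs, now, max_age=timedelta(days=7)):
    """Map each printer to the job IDs that have waited longer than max_age.

    `jobs` is an iterable of (printer, job_id, submitted) tuples, where
    `submitted` is a datetime. Gathering these tuples is up to the caller.
    """
    stale = defaultdict(list)
    for printer, job_id, submitted in jobs:
        if now - submitted > max_age:
            stale[printer].append(job_id)
    return dict(stale)


def format_report(stale):
    """Build a plain-text report, one line per printer with stuck jobs."""
    lines = []
    for printer in sorted(stale):
        ids = ", ".join(str(j) for j in sorted(stale[printer]))
        lines.append(f"{printer}: stuck jobs {ids}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data purely for illustration.
    now = datetime(2005, 8, 5, 12, 0)
    jobs = [
        ("pc26_hp1200", 101, datetime(2005, 7, 20, 9, 0)),
        ("pc26_hp1200", 102, datetime(2005, 8, 5, 11, 0)),
    ]
    report = format_report(stale_printers(jobs, now))
    if report:
        print(report)
```

Since cron normally mails a job's output to the crontab owner, printing
the report only when something is stale is enough to get notified without
any extra mail-sending code.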

> Is there a way to configure CUPS to automatically kill jobs which
> haven't printed in a week?
>
> Or even better, can CUPS automatically re-queue those jobs somehow,
> so we don't have so many processes running permanently on our print
> server?

At present there is no way for a printer queue to act that way.

Future versions of CUPS may offer configurable timeouts/retries,
and of course the Samba folks may come up with some interesting
changes to smbspool as well...

> ...
> 3. When we do a large number of near-simultaneous prints, jobs
> don't come out of the Windows printer in the order they were
> queued by our Python script. I'm (wildly) guessing this is because
> the foomatic-rip processes are finishing in a different order than
> they started, which again makes me wish they'd run in series. Is my
> diagnosis correct? Has anybody had a similar issue? And if so, was
> it solved? How?

This should never happen if you have only one print queue on the
Linux system mapped to any given Windows printer.  If you have
multiple queues mapped to the same printer and are queuing to
multiple queues at the same time, then the first job to connect
to the printer gets to print first and we cannot guarantee the
order of print jobs.
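
On the submitting side, the single-queue approach could be sketched in
Python as below: every file goes to one named queue, one "lpr" call after
another, in list order. The function names and the injectable `run`
parameter are assumptions made for illustration; this is not the original
poster's script.

```python
import subprocess


def lpr_command(queue, pdf_path):
    """Build the lpr invocation that sends one file to one named queue."""
    return ["lpr", "-P", queue, pdf_path]


def submit_in_order(queue, pdf_paths, run=subprocess.run):
    """Submit files one at a time, in list order, to a single queue.

    `run` is injectable so the function can be tried without lpr installed;
    by default each lpr call must succeed before the next file is sent.
    """
    for path in pdf_paths:
        run(lpr_command(queue, path), check=True)
```

Calling submit_in_order("pc26_hp1200", ["a.pdf", "b.pdf"]) would queue
a.pdf before b.pdf on that one queue; the queue and file names here are
placeholders.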

Michael Sweet, Easy Software Products           mike at easysw dot com
Internet Printing and Document Software          http://www.easysw.com
