[cups.development] [RFE] STR #3591: Socket backend will loop "forever"

Helge Blischke h.blischke at acm.org
Fri May 28 01:54:54 PDT 2010


Ryan Yagatich wrote:

> 
> DO NOT REPLY TO THIS MESSAGE.  INSTEAD, POST ANY RESPONSES TO THE LINK
> BELOW.
> 
> [STR New]
> 
> First, we use RHEL as our base - so we're using the latest available
> release for version 5.5. However, this behavior appears to exist in
> version 1.4.3 as well (although not confirmed). I have filed this bug
> against SW version 1.3.8; my apologies if that is the wrong place for
> it.
> 
> In our environment, we've identified a particular issue with regards to
> printing to HP printers, or any printer that utilizes the socket backend.
> We have found that when printing from our application in the US to a
> printer located on a network in Asia or similar, we are plagued with
> situations where a printer will frequently become non-responsive. The
> socket daemon attempts to connect to the dead printer, and will continue
> to do so until the printer comes back online.
> 
> Human behavior, however, is at the root of this bug. It turns out that when
> the 5:00 PM bell rings in Asia, they turn off their printers and print
> servers and go home. When this happens, socket.c identifies that the host
> is down, and then loops with errors like this in error_log:
> 
> W [26/May/2010:15:48:21 +0000] [Job 2329127] recoverable: Network host
> '172.16.56.27' is busy; will retry in 30 seconds...
> 
> This loop continues until the printer becomes responsive again. There are
> situations where we want to keep retrying the job (e.g., the printer is
> only down for a few minutes and comes back online on its own), but
> sometimes we just want to give up, in particular when an entire office is
> closed for the weekend (or longer) and CPU utilization goes up from this
> loop.
> 
> Therefore, I'd recommend an enhancement which adds two configuration
> settings for this:
> 
> 1) Connection backoff time:
>    - Rather than delaying for the hard-coded 30 seconds, allow for a
>      user-controlled delay period - e.g. 90 seconds - which will
>      help with some of the high CPU usage we're getting on the servers
> 2) Connection give-up time:
>    - After a certain number of retries on a recoverable failure, just
>      give up and assume the printer won't come back again and disable
>      the printer
> 
> The relevant loop in backend/socket.c appears to start at line 276 of this
> version:
>  * "$Id: socket.c 8896 2009-11-20 01:27:57Z mike $"
> 
> In our environment, we have a little over 300 printers configured in our
> US application stack, and that number is expected to double by the end of
> the year. Even with this quad Intel Xeon 3.0GHz server with 4GB of RAM, we
> get hit by high-usage CPU sourced from this module, and we're forced to
> take invasive action by killing the socket process via a cron job.
> Although this generally works OK, it would still be ideal if the daemon
> itself could identify and recover from this situation normally.
> 
> Link: http://www.cups.org/str.php?L3591
> Version:  -feature

Since CUPS 1.3.x, the socket backend offers a URI option to specify a
connection timeout (which defaults to 7 * 24 * 60 * 60 seconds, i.e. one
week). Modify the device URI as follows:

socket://ip_or_hostname:9100/?contimeout=seconds

where seconds is the timeout value in seconds.

Then, after the timeout, the backend complains with
"ERROR: printer not responding" and exits with CUPS_BACKEND_FAILED.
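
As a rough illustration of the URI form above, here is a minimal Python
sketch that builds such a device URI. The helper name socket_device_uri
and the constant DEFAULT_CONTIMEOUT are made up for this example and are
not part of CUPS; only the contimeout option and the default of
7 * 24 * 60 * 60 seconds come from the description above.

```python
from urllib.parse import urlencode

# Default quoted above: 7 * 24 * 60 * 60 seconds, i.e. one week.
# The constant name is hypothetical, chosen for this sketch only.
DEFAULT_CONTIMEOUT = 7 * 24 * 60 * 60


def socket_device_uri(host, port=9100, contimeout=None):
    """Build a socket:// device URI, optionally with a contimeout option.

    Hypothetical helper, not part of CUPS. contimeout is in seconds.
    """
    uri = f"socket://{host}:{port}/"
    if contimeout is not None:
        if contimeout <= 0:
            raise ValueError("contimeout must be a positive number of seconds")
        # Append the option in the ?name=value form shown above.
        uri += "?" + urlencode({"contimeout": int(contimeout)})
    return uri


if __name__ == "__main__":
    # Give up after one hour instead of the one-week default.
    print(socket_device_uri("172.16.56.27", contimeout=3600))
```

The resulting string would then be set as the queue's device URI, for
instance with lpadmin -p <queue> -v <uri>, where <queue> is a placeholder
for the printer name.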

Helge




