[cups.bugs] [MOD] STR #2577: Error policy not honoured when lpd backend fails to resolve hostname

Wed Oct 31 05:16:12 PDT 2007

Hello,

On Oct 30 12:57 Michael Sweet wrote (shortened):
> Tim Waugh wrote:
> > On Tue, 2007-10-30 at 08:01 -0700, Michael Sweet wrote:
> >> [STR Closed w/o Resolution]
> >>
> >> Failure to lookup a hostname is treated as a hard error - printing again
> >> will not fix the problem.
> > 
> > Disagree.  Hostname resolution failure may well be a transient error.
> 
> But starting on the next job won't fix it and will lead to the print
> jobs getting out-of-order.
> 
> The only time DNS lookups should fail in a transient way would be if
> the DNS server was taken down for more than a few minutes or you use
> multicast-DNS and the device is not turned on.  For the DNS server
> case, this should be a seldom-seen situation; for the mDNS printer,
> implementations are supposed to retry the lookups indefinitely...
> 
> I suppose if we get a timeout error, we can add retries in each of
> the network backends, however hard errors (not found, etc.) can't
> be solved by retrying the job or lookup.

I fully agree that the backends should be more robust against
possibly transient errors.

I would like to suggest that whenever there is a chance that an
error condition goes away, the backend should retry indefinitely
(with a reasonable sleep time between retries).

I think a continuously retrying backend (with a sleep time and
an appropriate INFO message for each retry) is no worse than
disabling the whole queue.
Actually I think it is usually much better because the user
is continuously informed about what is going on (e.g. "123rd retry
to connect to host doesnotexist.nowhere") and if the user
doesn't like it, he can cancel his print job.
In contrast, only the admin can re-enable a queue.
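
To illustrate the idea, here is a minimal sketch of such a retry
loop in Python. It is not taken from any CUPS backend. The function
name resolve_with_retry and its parameters are made up for this
example; the only CUPS convention it relies on is that a backend
reports status by writing "INFO: ..." lines to stderr.

```python
import socket
import sys
import time


def resolve_with_retry(host, port, sleep_seconds=30, max_retries=None,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Retry a hostname lookup until it succeeds (hypothetical helper).

    max_retries=None means retry forever; the user can still cancel
    the print job. resolver and sleep are injectable for testing.
    """
    attempt = 0
    while True:
        try:
            return resolver(host, port)
        except socket.gaierror as err:
            attempt += 1
            if max_retries is not None and attempt > max_retries:
                # Give up and let the caller decide what happens next.
                raise
            # Tell the user what is going on before sleeping.
            print(f"INFO: retry {attempt} to look up host {host} "
                  f"in {sleep_seconds} seconds ({err})",
                  file=sys.stderr, flush=True)
            sleep(sleep_seconds)
```

With the defaults the loop never gives up, so the queue stays
enabled and the job simply waits until the lookup works or the
user cancels it.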

For example, we get customer complaints when the usb backend
disables a queue (e.g. because the printer was not connected,
which can happen with laptops, or was simply not switched on), see
https://bugzilla.novell.com/show_bug.cgi?id=337794

I think having another default CUPS ErrorPolicy is not a solution
because it is only another kind of workaround for a backend
that has given up too early.

I think the real solution is to make the backends more robust
against possibly transient errors - preferably with additional
options via DeviceURI so that expert users can specify for each
queue how many retries are done and what the sleep time is,
like the options of the "beh" backend error wrapper, see
http://www.linux-foundation.org/en/OpenPrinting/Database/BackendErrorHandler
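
As a rough sketch of what such per-queue options could look like,
the following Python snippet reads them from the query part of a
device URI. The option names "retries" and "sleep" and the function
retry_options are invented for this example; they are not existing
CUPS options and not the syntax that "beh" uses.

```python
from urllib.parse import urlsplit, parse_qs


def retry_options(device_uri, default_retries=None, default_sleep=30):
    """Read hypothetical 'retries' and 'sleep' options from a device URI.

    Returns (retries, sleep_seconds); retries=None means retry forever.
    """
    query = parse_qs(urlsplit(device_uri).query)
    retries = default_retries
    if "retries" in query:
        retries = int(query["retries"][-1])
    sleep_seconds = default_sleep
    if "sleep" in query:
        sleep_seconds = int(query["sleep"][-1])
    return retries, sleep_seconds
```

A queue set up with, say, lpd://printer.example/queue?retries=5&sleep=10
would then retry five times with ten seconds in between, while a
URI without these options keeps the defaults.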

Kind Regards
Johannes Meixner
-- 
SUSE LINUX Products GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
AG Nuernberg, HRB 16746, GF: Markus Rex