[cups-devel] [UNKN] STR #4588: No automatic retry or failover when multiple SRV records available under DNS-SD

Suffield Academy noreply at cups.org
Sun Feb 22 18:09:33 PST 2015



[STR New]

Summary:

When configured with a printer advertised via DNS-SD that lists multiple
SRV records, CUPS does not automatically retry with the other SRV hosts
if the first one fails or cannot be reached.

Full report:

We use a wide-area Bonjour domain to advertise printers via DNS-SD in an
all-Apple environment.  We recently got two printers for a high-volume area
and wanted to use a single name for both, using the built-in capabilities
of SRV records to either load-balance or fail-over connections to the
printers.  Here's an example of how it's configured:

# dig -t ptr _pdl-datastream._tcp.zeroconf.example.org
;; trimmed for brevity
_pdl-datastream._tcp.zeroconf.example.org. 300 IN PTR Failover._pdl-datastream._tcp.zeroconf.example.org.

# dig -t any Failover._pdl-datastream._tcp.zeroconf.example.org.
;; trimmed for brevity
Failover._pdl-datastream._tcp.zeroconf.example.org. 300 IN SRV 1 0 9100 failover-one.example.org.
Failover._pdl-datastream._tcp.zeroconf.example.org. 300 IN SRV 11 0 9100 failover-two.example.org.
Failover._pdl-datastream._tcp.zeroconf.example.org. 300 IN TXT "txtvers=1" "...etc..."

;; ADDITIONAL SECTION:
failover-one.example.org. 3600 IN A 192.0.2.1
failover-two.example.org. 3600 IN A 192.0.2.2


The "dns-sd" utility from OS X also confirms a proper registration and
lists both records.

The intent is to use "failover-one" as the primary host, and if it's
unavailable, switch to "failover-two".


Per RFC 6763 (DNS-Based Service Discovery), section 5:

  "In the event that more than one SRV is returned, clients MUST
   correctly interpret the priority and weight fields -- i.e., lower-
   numbered priority servers should be used in preference to higher-
   numbered priority servers, and servers with equal priority should be
   selected randomly in proportion to their relative weights."
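
As a rough illustration of the selection rule quoted above, here is a
minimal C sketch of ordering SRV records by priority and then choosing
among equal-priority records in proportion to weight (the running-sum
method described in RFC 2782).  The type srv_record_t and the functions
srv_sort and srv_pick are hypothetical names made up for this example;
they are not part of CUPS or dns_sd.h.  The caller supplies the random
number so the behavior is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type for illustration; not a CUPS or dns_sd.h type. */
typedef struct {
  uint16_t priority;   /* lower value is preferred */
  uint16_t weight;     /* relative share among records of equal priority */
  uint16_t port;
  const char *target;
} srv_record_t;

/* qsort comparator: ascending priority; within one priority, records
 * with weight 0 come first, as RFC 2782 suggests. */
static int srv_compare(const void *a, const void *b)
{
  const srv_record_t *ra = a, *rb = b;

  if (ra->priority != rb->priority)
    return (ra->priority < rb->priority) ? -1 : 1;
  if ((ra->weight == 0) != (rb->weight == 0))
    return (ra->weight == 0) ? -1 : 1;
  return 0;
}

/* Sort so that the most preferred priority group is at the front. */
static void srv_sort(srv_record_t *recs, size_t n)
{
  qsort(recs, n, sizeof(recs[0]), srv_compare);
}

/* Pick an index from the leading priority group of a sorted array.
 * `roll` is a caller-supplied random number; it is reduced to the
 * range 0..(sum of weights), and the first record whose running
 * weight sum reaches it is chosen. */
static size_t srv_pick(const srv_record_t *recs, size_t n, uint32_t roll)
{
  size_t i, group = 1;
  uint32_t total = 0, running = 0;

  assert(n > 0);
  while (group < n && recs[group].priority == recs[0].priority)
    group++;
  for (i = 0; i < group; i++)
    total += recs[i].weight;
  roll %= total + 1;
  for (i = 0; i < group; i++) {
    running += recs[i].weight;
    if (running >= roll)
      return i;
  }
  return group - 1;
}
```

With the two records from our zone (priorities 1 and 11, both weight 0),
srv_sort puts failover-one first and srv_pick always returns index 0, so
failover-two would only be used once failover-one has been ruled out.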


When printing to the service, CUPS correctly selects the preferred SRV host
to connect to.  Unfortunately, if that host is unavailable, it does not try
any of the other hosts; instead it keeps retrying the original hostname and
failing.

Subsequent print attempts do result in trying a different host.  However,
this may be a side effect of the OS's DNS-SD implementation (it appears to
de-register unreachable hosts) rather than of CUPS.

We were expecting behavior more similar to CUPS classes, where the software
fast-fails to the next supported device in the class when the first host
cannot be reached.  Ideally, the print job should find a suitable host that
answers within a matter of seconds, rather than the user having to wait for
a full host-unreachable timeout and then manually re-submit the job.
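
To make the expected behavior concrete, here is a minimal C sketch of a
fast-fail loop: each candidate host is tried in preference order with a
short timeout, and the first one that answers is used.  Every name here
(candidate_t, connect_fn, connect_first_available, stub_connect) is
hypothetical and invented for this example; this is not how CUPS is
structured internally.  The stub stands in for a real connection attempt
so the logic can be exercised without a network.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; these names do not exist in CUPS. */
typedef struct {
  const char *host;
  int port;
} candidate_t;

/* Returns 0 if the host answered within timeout_ms, non-zero otherwise. */
typedef int (*connect_fn)(const char *host, int port, int timeout_ms, void *ctx);

/* Try each candidate in order with a short per-host timeout.
 * Returns the index of the first host that answers, or -1 if none do. */
static int connect_first_available(const candidate_t *hosts, size_t n,
                                   int timeout_ms, connect_fn try_connect,
                                   void *ctx)
{
  size_t i;

  assert(try_connect != NULL);
  for (i = 0; i < n; i++) {
    if (try_connect(hosts[i].host, hosts[i].port, timeout_ms, ctx) == 0)
      return (int)i;
  }
  return -1;
}

/* Stand-in for a real connection attempt: ctx is a string listing the
 * hosts that are "down"; any host named in it fails, all others succeed. */
static int stub_connect(const char *host, int port, int timeout_ms, void *ctx)
{
  (void)port;
  (void)timeout_ms;
  return strstr((const char *)ctx, host) != NULL ? 1 : 0;
}
```

With failover-one marked as down, connect_first_available moves on to
failover-two after one short timeout instead of retrying the first host
indefinitely.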

The issue was discovered on a Mavericks (OS X 10.9.5) machine.  Upon
examination of the 2.1 git sources, it looks like the actual lookup of the
hostname for the printer occurs in cups/http-support.c:

  if (DNSServiceResolve(&domainref,
    kDNSServiceFlagsShareConnection,
    myinterface, hostname, regtype, domain,
    http_resolve_cb,
    &uribuf) == kDNSServiceErr_NoError)

Per Apple's developer docs for DNSServiceResolve:

  "Warning: DNSServiceResolve is appropriate for getting information
   about a service that has a single SRV record and a single TXT
   record (which may be empty). To resolve services that have multiple
   SRV or TXT records, you should use DNSServiceQueryRecord. You should
   also use DNSServiceQueryRecord to monitor TXT record content instead
   of DNSServiceResolve."

While single SRV and TXT records are certainly the most common case,
multiple records per instance are allowed by the spec, clients are required
by the RFC to handle them, and they would serve a useful purpose (redundant
printer instances without needing to explicitly add classes to each
client).  So it appears that the choice of library call may be partially to
blame, and that the problem likely affects all shipping versions.
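
For reference, a DNSServiceQueryRecord query for kDNSServiceType_SRV hands
its callback one raw record at a time (rdlen and rdata), so the caller has
to decode each SRV record itself.  Below is a minimal C sketch of that
decoding step, following the SRV RDATA layout from RFC 2782 (priority,
weight, port, then the target name as length-prefixed labels).  The names
srv_rdata_t and srv_parse_rdata are hypothetical, and the sketch assumes
the target name arrives without DNS name compression; compressed or
malformed data is simply rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of one SRV record; not a dns_sd.h type. */
typedef struct {
  uint16_t priority, weight, port;
  char target[256];    /* dotted name with trailing dot, NUL-terminated */
} srv_rdata_t;

/* Decode SRV RDATA: three big-endian 16-bit fields followed by the
 * target as length-prefixed labels ending in a zero-length root label.
 * Returns 0 on success, -1 if the data is truncated, uses name
 * compression, or does not fit in the target buffer. */
static int srv_parse_rdata(const uint8_t *rdata, size_t rdlen, srv_rdata_t *out)
{
  size_t pos = 6, len = 0;

  assert(rdata != NULL && out != NULL);
  if (rdlen < 7)
    return -1;
  out->priority = (uint16_t)((rdata[0] << 8) | rdata[1]);
  out->weight   = (uint16_t)((rdata[2] << 8) | rdata[3]);
  out->port     = (uint16_t)((rdata[4] << 8) | rdata[5]);

  for (;;) {
    uint8_t label;

    if (pos >= rdlen)
      return -1;                 /* ran out of data before the root label */
    label = rdata[pos++];
    if (label == 0)
      break;                     /* root label: end of name */
    if (label > 63 || pos + label > rdlen ||
        len + label + 1 >= sizeof(out->target))
      return -1;                 /* compression pointer, truncation, or overflow */
    memcpy(out->target + len, rdata + pos, label);
    len += label;
    out->target[len++] = '.';
    pos += label;
  }
  out->target[len] = '\0';
  return 0;
}
```

A callback built this way could collect every SRV record for the instance
and keep the full list of candidate hosts, rather than the single host
that DNSServiceResolve reports.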

There would likely need to be additional logic (similar to the "queued on a
class" behavior in scheduler/job.c) so that CUPS retries different hosts
rather than using the same DNS lookup for all attempts.  I'm not
well-versed in the plumbing of CUPS (nor am I a C programmer), so I don't
feel qualified to try to create a patch to do so.

Please let me know if we can provide any additional information or testing.

Link: https://www.cups.org/str.php?L4588
Version: 1.7.2



