[cups] CUPS takes 100% CPU/file descriptor leak
Joel Lord
jlord at advancedinfomanagement.com
Wed Jul 26 12:02:55 PDT 2017
Yeah, I know, yet another of these.
Running CUPS 2.1.3 on Ubuntu 16. Seeing this particular problem on a
server that is fed by other CUPS servers, and which then speaks to
printers. That is, the only inbound print jobs to this server are
coming from CUPS elsewhere. I don't know if this matters and am not in
a position to find out, sadly.
At the end of *receiving* a print job, it doesn't appear to quite
finish. The job gets turned around and prints successfully, but the
socket from the upstream server is never closed out. Digging in with
strace I'm seeing epoll_wait return a bunch of results, then a whole
cycle of recvfrom and poll for each fd epoll_wait returns. Turning on
debug logging, I'm seeing "Read: status=100" many, many times/second.
Turning on debug2 logging, I see:
d [26/Jul/2017:13:25:41 -0400] [Client 1007] cupsdReadClient: error=0,
used=0, state=HTTP_STATE_POST_RECV, data_encoding=HTTP_ENCODING_LENGTH,
data_remaining=0, request=(nil)(), file=19
for each fd.
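
One plausible reading of the strace output above is ordinary level-triggered epoll behaviour: while unread data (or an unhandled EOF) sits on a socket, every epoll_wait call reports that fd again, so a loop that neither consumes the data nor closes the fd spins. The sketch below is not CUPS code. It uses a local socketpair() as a stand-in for the upstream connection, and the helper name count_ready_reports is hypothetical.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo helper (not part of CUPS): leave one unread byte on a
 * socket and count how many of `rounds` zero-timeout epoll_wait() calls
 * report it readable. If consume_after_first is non-zero, the byte is read
 * after the first round, which should silence later rounds. */
static int count_ready_reports(int rounds, int consume_after_first)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    int ep = epoll_create1(0);
    assert(ep >= 0);

    /* Level-triggered is the default: no EPOLLET flag. */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = sv[0] };
    rc = epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev);
    assert(rc == 0);

    /* The "stray byte" that nobody reads. */
    ssize_t w = write(sv[1], "x", 1);
    assert(w == 1);

    int ready = 0;
    for (int i = 0; i < rounds; i++) {
        struct epoll_event out;
        if (epoll_wait(ep, &out, 1, 0) == 1)
            ready++;
        if (consume_after_first && i == 0) {
            char c;
            ssize_t r = read(sv[0], &c, 1);
            assert(r == 1);
        }
    }

    close(ep);
    close(sv[0]);
    close(sv[1]);
    return ready;
}
```

Called as count_ready_reports(5, 0), the fd is reported ready on every round; with count_ready_reports(5, 1) it is reported only once, because the byte is consumed after the first round.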
So I found in the code for cupsdReadClient() that if httpGetReady()
returns 0 and there is nothing further in the receive buffer (as
determined with recv() using MSG_PEEK), the connection is closed and
cleaned up. But that recv() is coming back with at least 1 byte still to
receive, so the connection is never treated as done. The job has already
been passed along to the printer by this point and is on paper, so that
byte (or more) really can't be very important, can it? Or is it possible
I'm hitting a bug in the recvfrom system call such that it's reporting
data in the buffer when there actually isn't any?
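
To make the distinction concrete, here is a minimal sketch of the kind of check described above: a one-byte recv() with MSG_PEEK returns 0 when the peer has closed and nothing is left to read, and a positive count when data is still pending. This is not the CUPS source. The helpers peek_state and demo_peek are hypothetical, and a socketpair() stands in for the upstream connection.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not CUPS code): classify a socket with a
 * non-blocking one-byte peek.
 *   1  -> at least one byte is still waiting to be read
 *   0  -> peer has closed and the buffer is empty (safe to close)
 *  -1  -> error, or no data yet while the peer is still open */
static int peek_state(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n > 0)
        return 1;
    if (n == 0)
        return 0;
    return -1;
}

/* Demo: the "upstream" end optionally sends one stray byte, then closes.
 * Returns what peek_state() reports on the receiving end. */
static int demo_peek(int leave_stray_byte)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    if (leave_stray_byte) {
        ssize_t w = write(sv[1], "\n", 1);
        assert(w == 1);
    }
    close(sv[1]);               /* peer goes away */

    int state = peek_state(sv[0]);
    close(sv[0]);
    return state;
}
```

With a stray byte left behind, the peek reports pending data even though the peer is gone, which matches the symptom: the "nothing left, close it" branch is never taken until something actually reads that byte.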
The problem is that since it never goes into the "clean up the connection"
code, we're leaking file descriptors with every print job: they just
accumulate in the CLOSE_WAIT state, waiting on cupsd to actually
call close() on them. Eating a CPU isn't a problem; I've got more than
one. But over the course of a week I run out of file descriptors and
the whole thing just falls over. A daily cron job to restart cupsd gets
the job done, but that isn't really a solution to the underlying problem.
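
For watching the leak build up between restarts, a rough sketch like the following counts sockets in CLOSE_WAIT by reading a file in the Linux /proc/net/tcp format, where the fourth column is the hex TCP state and 08 is CLOSE_WAIT. The function names are hypothetical, and the count is system-wide rather than limited to cupsd, so treat it as a coarse indicator only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if a line in /proc/net/tcp format
 * describes a socket in CLOSE_WAIT (state 0x08), 0 otherwise.
 * The header line and malformed lines yield 0. */
static int line_is_close_wait(const char *line)
{
    unsigned int state = 0;

    /* Fields: "sl: local_addr:port remote_addr:port st ..." (all hex). */
    int got = sscanf(line,
                     " %*d: %*[0-9A-Fa-f]:%*[0-9A-Fa-f]"
                     " %*[0-9A-Fa-f]:%*[0-9A-Fa-f] %x",
                     &state);

    return got == 1 && state == 0x08;
}

/* Hypothetical helper: count CLOSE_WAIT entries in the given file,
 * e.g. "/proc/net/tcp". Returns -1 if the file cannot be opened. */
static int count_close_wait(const char *path)
{
    FILE *fp = fopen(path, "r");
    char  line[512];
    int   count = 0;

    if (!fp)
        return -1;

    while (fgets(line, sizeof(line), fp))
        count += line_is_close_wait(line);

    fclose(fp);
    return count;
}
```

Logging count_close_wait("/proc/net/tcp") periodically would show whether the number climbs by roughly one per print job, as the description above suggests it should.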
Any ideas out there?
--
Joel Lord
Sr. Systems Architect
Advanced Information Management