[cups] CUPS takes 100% CPU/file descriptor leak

Joel Lord jlord at advancedinfomanagement.com
Wed Jul 26 12:48:44 PDT 2017


Hmmm... I'd checked the changelogs, but that might have been before I 
started cranking debug logging.  Issue 4901, fixed in 2.2.1, MIGHT be it.

More investigating to do...


On 7/26/17 3:39 PM, Michael Sweet wrote:
> Joel,
>
> The first thing I'd check is whether the problem happens with newer versions of CUPS - I believe we addressed an issue like this in 2.2.0...
>
>
>> On Jul 26, 2017, at 3:02 PM, Joel Lord <jlord at advancedinfomanagement.com> wrote:
>>
>> Yeah, I know, yet another of these.
>>
>> Running CUPS 2.1.3 on Ubuntu 16.  Seeing this particular problem on a server that is fed by other CUPS servers, and which then speaks to printers.  That is, the only inbound print jobs to this server are coming from CUPS elsewhere.  I don't know if this matters and am not in a position to find out, sadly.
>>
>> At the end of *receiving* a print job, it doesn't appear to quite finish.  The job gets turned around and prints successfully, but the socket from the upstream server is never closed out.  Digging in with strace I'm seeing epoll_wait return a bunch of results, then a whole cycle of recvfrom and poll for each fd epoll_wait returns.  Turning on debug logging, I'm seeing "Read: status=100" many, many times/second.  Turning on debug2 logging, I see:
>>
>> d [26/Jul/2017:13:25:41 -0400] [Client 1007] cupsdReadClient: error=0, used=0, state=HTTP_STATE_POST_RECV, data_encoding=HTTP_ENCODING_LENGTH, data_remaining=0, request=(nil)(), file=19
>>
>> for each fd.
>>
>> So I found in the code for cupsdReadClient() that if httpGetReady returns 0 and if there is nothing further in the receive buffer (as determined with recv using MSG_PEEK) the connection is closed and it cleans up the connection.  But that recv is coming back with at least 1 byte still to receive, so it's not done.  But the job has already been passed along to the printer by this point and is on paper, so that byte (or more) really can't be very important, can it?  Or is it possible I'm hitting a bug in the recvfrom system call such that it's reporting data in the buffer when there actually isn't?
>>
>> Problem is that since it's not going in to the "clean up the connection" code, we're leaking file descriptors with every print job as they're just accumulating in the CLOSE_WAIT state waiting on cupsd to actually call close() on them.  Eating a CPU isn't a problem, I've got more than one.  But over the course of a week I run out of file descriptors and the whole thing just falls over. A daily cron job to restart cupsd gets the job done, but that isn't really a solution to the underlying problem.
>>
>> Any ideas out there?
>>
>> -- 
>> Joel Lord
>> Sr. Systems Architect
>> Advanced Information Management
>>
>> _______________________________________________
>> cups mailing list
>> cups at cups.org
>> https://lists.cups.org/mailman/listinfo/cups
> _________________________________________________________
> Michael Sweet, Senior Printing System Engineer
>
> _______________________________________________
> cups mailing list
> cups at cups.org
> https://lists.cups.org/mailman/listinfo/cups

-- 
Joel Lord
Sr. Systems Architect
Advanced Information Management



More information about the cups mailing list