[cups.bugs] [HIGH] STR #1610: Certain cups-polld actions can cause CUPS to core

kcj.cups.umich kcj.cups at umich.edu
Wed Apr 26 13:03:07 PDT 2006


[STR New]

We run a CUPS environment with ~11 servers.  Each has its own list of
printers/classes. We use one server as a hub for polling purposes.  It
polls/relays information from/to the other servers.

We started having problems where periodically we would have cups core on
all servers.  We ended up identifying that whenever we removed a printer
or class from one of the spokes, we had a chance of crashing all other
servers as they were updated.  This could also happen if one of our
servers fell off the network, as all the other servers updated themselves
to remove that server's printers.

An in-house developer was able to track this problem down and generate a
patch for us.  A message from the developer explaining the problem is
attached below. 

Now we have identified a new problem.  Instead of crashing all servers,
only the server that has the printer deleted crashes.  The message below
talks about this and ends with a description and our patch for the initial
problem.

We thought we would post an STR for 2 reasons. First, we no longer have
access to internal developer resources and were wondering if this second
issue could be addressed. Second, we wanted to provide feedback regarding
the first issue we identified.  If there is any further information you
need, please let us know.


------developer's note-----

 We are currently running cups-1.1.23
 
 Using gdb to analyze the core file
 after cupsd crashes with a segmentation fault,
 a stack traceback looks like:
 
 #0  SendBrowseList () at dirsvc.c:613
 613     dirsvc.c: No such file or directory.
         in dirsvc.c
 (gdb) bt
 #0  SendBrowseList () at dirsvc.c:613
 #1  0x0805e74f in main (argc=1, argv=0xbffff1e4) at main.c:705
 #2  0x401bd8be in __libc_start_main (main=0x805de40 <main>, argc=1,
     ubp_av=0xbffff1e4, init=0x807ba40 <__libc_csu_init>,
     fini=0x807ba70 <__libc_csu_fini>, rtld_fini=0x40015060 <_rtld_local>,
     stack_end=0x44081bdc) at ../sysdeps/generic/libc-start.c:152
 (gdb)
 
 
 The code in SendBrowseList looks like:
 
          /*
           * Loop through all of the printers and send local updates as needed...
           */
 
           for (p = Printers; p != NULL; p = np)
           {
            /*
             * Save the next printer pointer...
             */
 
             np = p->next;
 
            /*
             * If this is a remote queue, see if it needs to be timed out...
             */
 
             if (p->type & CUPS_PRINTER_REMOTE)    /* <-- dirsvc.c line 613 */
             {
               if (p->browse_time < to)
               {
                 LogMessage(L_INFO, "Remote destination \"%s\" has timed out; deleting it...",
                            p->name);
                 DeletePrinter(p, 1);
               }
             }
           }
 
 
 So we are traversing the linked list of printers and printer classes,
 with "p" pointing at the printer we are going to delete and "np"
 pointing at the next printer or printer class entry in the list.
 
 The problem occurs when DeletePrinter, in the process of deleting
 the printer, calls DeletePrinterFromClasses to remove the printer
 from all classes that it belongs to.  DeletePrinterFromClasses removes
 the printer from all classes, and then deletes all empty printer
 classes.  If one of those empty classes is the entry on the linked
 list that "np" is pointing to in SendBrowseList, then when
 DeletePrinter returns to SendBrowseList, np is pointing to freed
 (unallocated) memory.  On the next pass through the loop, p takes
 that stale value and the p->type test at line 613 faults.
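
The stale "np" described above can be illustrated with a small stand-alone sketch. This is not the CUPS code and not a proposed patch: the node_t type, the drags_next flag, and every function name below are hypothetical stand-ins, and the only assumption carried over from the note is that deleting one entry may also free a neighbouring entry. One way to stay safe under that assumption is to discard any saved pointer and restart from the head of the list after each deletion:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the scheduler's printer list. */
typedef struct node_s
{
  struct node_s *next;
  int           remote;      /* stands in for CUPS_PRINTER_REMOTE */
  int           expired;     /* stands in for browse_time < to */
  int           drags_next;  /* deleting this node also frees its successor */
} node_t;

static node_t *list_head = NULL;

/* Append a node to the end of the list. */
static node_t *add_node(int remote, int expired, int drags_next)
{
  node_t *n     = calloc(1, sizeof(node_t));
  node_t **tail = &list_head;

  assert(n != NULL);
  n->remote     = remote;
  n->expired    = expired;
  n->drags_next = drags_next;

  while (*tail != NULL)
    tail = &(*tail)->next;
  *tail = n;
  return n;
}

/* Unlink one node from the list and free it. */
static void unlink_and_free(node_t *n)
{
  node_t **link = &list_head;

  while (*link != NULL && *link != n)
    link = &(*link)->next;
  if (*link == n)
  {
    *link = n->next;
    free(n);
  }
}

/* Mimics a delete that can also remove a neighbouring entry, like a
 * printer deletion that empties a class and deletes it too. */
static void delete_node(node_t *n)
{
  if (n->drags_next && n->next != NULL)
    unlink_and_free(n->next);
  unlink_and_free(n);
}

/* Delete every expired remote node; returns the number of delete calls.
 * No "next" pointer is kept across delete_node(), because the node it
 * refers to may already have been freed. */
static int expire_remote_nodes(void)
{
  int    deleted = 0;
  node_t *p      = list_head;

  while (p != NULL)
  {
    if (p->remote && p->expired)
    {
      delete_node(p);
      deleted++;
      p = list_head;         /* restart: any saved pointer may be stale */
    }
    else
      p = p->next;
  }
  return deleted;
}

/* Count the nodes currently on the list. */
static int list_length(void)
{
  int    count = 0;
  node_t *p;

  for (p = list_head; p != NULL; p = p->next)
    count++;
  return count;
}
```

Restarting costs extra passes over the list when many entries expire at once, but each deletion shrinks the list, so the loop always terminates. Whether that trade-off suits cupsd is a question for the maintainers.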
 
 
 Previously, I had fixed a similar problem in DeletePrinter and
 DeletePrinterFromClasses.  Prior to deleting the printer,
 DeletePrinter calls DeletePrinterFromClasses to remove the printer
 from all classes it is in, and then DeletePrinterFromClasses deletes
 all empty printer classes by calling DeletePrinter.  DeletePrinter
 would then call DeletePrinterFromClasses again, which would call
 DeletePrinter.  This loop would go on until the stack overflowed.
 
 So I added this patch which says that if the current printer is a printer
 class then don't call DeletePrinterFromClasses:
 
 12:bringingupbaby/scheduler: diff printers.c printers.c~
 629c629
 <   if (! (p->type & (CUPS_PRINTER_CLASS | CUPS_PRINTER_IMPLICIT)) )
 ---
 >   if (!(p->type & CUPS_PRINTER_IMPLICIT))
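
The effect of that guard can be sketched in a small stand-alone model. Everything here is hypothetical (item_t, KIND_CLASS, delete_item, remove_from_classes are illustrative names, not CUPS identifiers), and the mechanism is an assumption based on the description above: the sweep for empty classes runs while the class being deleted is still on the list, so without the guard the sweep would find that class again and recurse.

```c
#include <assert.h>

#define KIND_PRINTER 0
#define KIND_CLASS   1  /* stands in for CUPS_PRINTER_CLASS | CUPS_PRINTER_IMPLICIT */
#define NUM_ITEMS    2

/* Hypothetical, simplified model of a printer or class entry. */
typedef struct
{
  int kind;       /* KIND_PRINTER or KIND_CLASS */
  int alive;      /* 1 while the entry is still on the list */
  int member_of;  /* index of the class a printer belongs to, or -1 */
} item_t;

static item_t items[NUM_ITEMS] =
{
  { KIND_CLASS,   1, -1 },  /* item 0: a class with one member */
  { KIND_PRINTER, 1,  0 }   /* item 1: a printer in class 0 */
};

static int delete_calls = 0;  /* counts calls, to show the recursion ends */

static void delete_item(int idx);

/* Return 1 if no live entry is a member of class cls. */
static int class_is_empty(int cls)
{
  int i;

  for (i = 0; i < NUM_ITEMS; i++)
    if (items[i].alive && items[i].member_of == cls)
      return 0;
  return 1;
}

/* Drop idx from its class, then delete every class that is now empty. */
static void remove_from_classes(int idx)
{
  int i;

  items[idx].member_of = -1;

  for (i = 0; i < NUM_ITEMS; i++)
    if (items[i].alive && items[i].kind == KIND_CLASS && class_is_empty(i))
      delete_item(i);
}

/* Delete an entry.  The guard mirrors the idea of the patch: a class is
 * never passed to remove_from_classes(), so deleting an empty class
 * cannot trigger another sweep that finds the same class again. */
static void delete_item(int idx)
{
  delete_calls++;

  if (items[idx].kind != KIND_CLASS)
    remove_from_classes(idx);

  items[idx].alive = 0;
}
```

In this model, deleting the printer (item 1) makes exactly two delete_item calls: one for the printer and one for the class it emptied. Removing the kind check makes delete_item(0) re-enter itself through the sweep until the stack is exhausted.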

Link: http://www.cups.org/str.php?L1610
Version: 1.1.23




