cups-polld - dies from incorrect use of FD?

angelb angelb at bugarin.us
Thu Jul 13 16:35:03 PDT 2006


Hello all.

The following are my observations and troubleshooting effort in an
attempt to narrow down the cause of cups-polld process crashing.

While cups-polld was running in an AIX 5.2 system, I ran a dbx against
the PID and was able to capture some debug info when cups-polld died:

ROOT @ SAPTI02 # dbx -a 655608
Waiting to attach to process 655608 ...
Successfully attached to cups-polld.
warning: Directory containing cups-polld could not be determined.
Apply 'use' command to initialize source path.

Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

stopped in raise.nsleep [/usr/lib/libc.a] at 0xd01f6d14
0xd01f6d14 (nsleep+0xbc) 80410014        lwz   r2,0x14(r1)
(dbx) cont

Broken pipe in write.write [/usr/lib/libc.a] at 0xd01f66a8
0xd01f66a8 (write+0x128) 80410014        lwz   r2,0x14(r1)
(dbx) detach

Notice, the broken pipe message above.

On another instance, I was able to capture a better output:
...
(dbx) where
raise.nsleep(??, ??) at 0xd01f6d14
sleep.sleep(??) at 0xd0201c74
poll_server() at 0x10000b00
main() at 0x100005dc
(dbx)

(dbx) cont

Terminated in write.write [/usr/lib/libc.a] at 0xd01f66a8
0xd01f66a8 (write+0x128) 80410014        lwz   r2,0x14(r1)
(dbx) where
write.write(??, ??, ??) at 0xd01f66a8
flsbuf._xwrite(??, ??, ??, ??) at 0xd01f5550 fwrite.fwrite_unlocked(??, ??, ??, ??) at 0xd020b1e0 fwrite.fwrite(??, ??, ??, ??) at 0xd020b3f8 doprnt._doprnt(??, ??, ??) at 0xd0203fa4 printf.fprintf(0xf0148590, 0x10001388, 0x2ff22794, 0x2ff220d0, 0xd0366a58, 0x250 0, 0x49424d, 0x80000000) at 0xd0203d68
poll_server() at 0x10000aac
main() at 0x100005dc

According to the above, looks like cups-polld died while attempting
to write output.

Wanting to see where it's writing to, I ran lsof to determine its
fd:

ROOT @ SAPTI02 # lsof -p 467002
COMMAND      PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
cups-poll 467002 cups  cwd   VDIR               10,4    12288    2 / (/dev/hd4)
cups-poll 467002 cups    0r  VCHR                2,2      0t0  275 /dev/null
cups-poll 467002 cups    1w  VCHR                2,2      0t0  275 /dev/null
cups-poll 467002 cups    2w  FIFO 0xf10000e341c8bd30        0
cups-poll 467002 cups    3u  VCHR                2,2      0t0  275 /dev/null
cups-poll 467002 cups    4u  IPv4 0xf10000f3015cf100      0t0  UDP *:50081
cups-poll 467002 cups    5u  IPv4 0xf10000f301323358      0t0  TCP sapti02:54254
->occam503:631 (CLOSE_WAIT)

Notice that stderr is being piped. I would suspect this is the cause
of cups-polld being killed as the FD is not being handled correctly
when fprintf is called.

The scheduler, cupsd, file descriptors are also weird, imho. See the
following:
ROOT @ SAPTI02 # lsof -p 782442
COMMAND    PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
cupsd   782442 root  cwd   VDIR               10,4    12288    2 / (/dev/hd4)
cupsd   782442 root    0u  IPv4 0xf10000f300d1a358      0t0  TCP *:631 (LISTEN)
cupsd   782442 root    1u  VREG               10,6   248940  390 /var (/dev/hd9v
ar)
cupsd   782442 root    2u  IPv4 0xf10000f3015ca400      0t0  UDP *:631
cupsd   782442 root    4r  FIFO 0xf10000e31d5d1b50        0
cupsd   782442 root    5w  FIFO 0xf10000e31d5d1b50        0

Notice that keyboard input is listening on TCP port and stderr is
being directed to UDP port. Myself and also another user had already
posted about cupsd crashing and it's possible FD handling maybe the
cause.

I ran a seperate instance of cups-polld and the following is what I
think FD should be, or equivalent:

echo "./cups-polld occam503 631 900 631 >/tmp/angel.cups.out 2>&1" | at now

ROOT @ SAPTI02 # lsof -p 692454
COMMAND      PID USER   FD   TYPE             DEVICE SIZE/OFF   NODE NAME
cups-poll 692454 root  cwd   VDIR               10,9      256 103988 /opt (/dev/
hd10opt)
cups-poll 692454 root    0r  VREG               10,6     1962    593 /var (/dev/
hd9var)
cups-poll 692454 root    1w  VREG               10,7       79     17 /tmp (/dev/
hd3)
cups-poll 692454 root    2w  VREG               10,7       79     17 /tmp (/dev/
hd3)
cups-poll 692454 root    3u  IPv4 0xf10000f301e1af00      0t0    UDP *:*
cups-poll 692454 root    4u  IPv4 0xf10000f30206ea58      0t0    TCP sapti02:*->
occam503:631

At any rate, I think stdin, stdout, and stderr should either go to
syslog or error_log since cupsd and cups-polld are non-interactive
processes. Doing it this way may just fix the problem... :)

Thanks,
Angel




More information about the cups mailing list