socket timeout problem
apples at shaw.ca
Thu Apr 7 15:01:23 PDT 2005
> Derek wrote:
> > We have 2 different platforms (SUSE Linux Enterprise and FreeBSD) with the same version of CUPS (1.1.23), and there is a strange timeout problem that is causing data to be lost. The timeout is different for each platform and we're not sure where or how to change it. The SUSE timeout is ~150s and the FreeBSD timeout is ~600s.
> > We are using IBM 4247 forms printers. If they are left in an offline state, they continue to receive print jobs into an internal buffer, which means that CUPS will continue to send print jobs until the buffer is full. At that point the printer sends a TCP window size of zero, causing CUPS to display the message "network host busy". When the printer is set back to online, it sends a TCP window size message to the CUPS server indicating that the window size is no longer zero, to let the server know it can continue sending. This is where our problem begins and my knowledge of TCP/IP communications gets limited.
> > If the printer is left offline too long, the socket/port that sent the job is no longer listening, so the window size message from the printer gets ignored and, as far as I can tell, the server just responds with a TCP reset packet. This causes the printer to dump part of the data from whatever jobs are in its buffer.
> > We've used packet traces and netstat to confirm some of these results. Netstat shows us that once the server sends a print job, the socket/port goes into a FIN_WAIT2 state, where it stays until it either gets a FIN ACK from the printer or it times out and dies, thus causing our problem. On SUSE that timeout appears to be about 150 seconds and on FreeBSD it is about 600 seconds. We're not sure if the timeout is an OS or CUPS function. We've already been in contact with IBM engineers and we've been told that the problem is server related and that a print process should stay in FIN_WAIT2 indefinitely.
> > Any help, thoughts, suggestions would be appreciated.
> You didn't tell us what backend you use to drive the printer. If you use the socket backend, try the hpnpf backend instead and tell us if it works.
> Helge Blischke
> SRZ Berlin | Firmengruppe besscom
Thank you for the suggestion. Yesterday, after more digging, we found and fixed our problem. It turns out that with SUSE the default TCP FIN timeout is 60 seconds and with FreeBSD it is 600 seconds. We were able to modify this value by using the following command (SUSE):
sysctl -w net.ipv4.tcp_fin_timeout=36000
The value we used may not be optimal; that will have to be determined with further testing. However, this setting was the source of the problem we were experiencing, and we just want to pass along the solution in case anyone else runs into the same thing.
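
For anyone who wants to check their own Linux system before changing anything, here is a minimal sketch. It assumes a Linux host that exposes the setting under /proc and reads /etc/sysctl.conf at boot. The helper names current_fin_timeout and sysctl_conf_line are made up for illustration and are not part of CUPS or sysctl. It only reads the current value and prints the line one would add to make the change persistent, since a value set with "sysctl -w" is lost on reboot.

```shell
#!/bin/sh
# Illustrative helpers only; the function names are hypothetical.

# Print the current tcp_fin_timeout in seconds, or "unknown" if the
# Linux-specific /proc entry is not readable (e.g. on FreeBSD).
current_fin_timeout() {
    path=/proc/sys/net/ipv4/tcp_fin_timeout
    if [ -r "$path" ]; then
        cat "$path"
    else
        echo unknown
    fi
}

# Print the line that would persist a given timeout (in seconds)
# if it were added to /etc/sysctl.conf. Nothing is written here.
sysctl_conf_line() {
    echo "net.ipv4.tcp_fin_timeout = $1"
}

echo "current tcp_fin_timeout: $(current_fin_timeout)"
echo "line for /etc/sysctl.conf: $(sysctl_conf_line 36000)"
```

The 36000 above is simply the value from our command, not a recommendation. The script makes no changes; applying the setting still requires root and the sysctl command shown earlier.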