LKML Archive on lore.kernel.org
* poll() blocked / packets not received ?
From: Nicolas Cannasse @ 2008-10-20  8:25 UTC
  To: linux-kernel

Hello,

We have an application that uses pthreads and (blocking) sockets.

When the application runs with one single thread in separate processes 
(using fork()) we don't get any problem.

However when it's multithreaded, we sometimes get stuck while poll()ing 
a socket (with events set to POLLIN). Even after the other side of the 
connection has closed its side of the connection, we are still stuck 
here. Adding a timeout only makes the poll() exit with 0, so we loop.

In case we don't loop the next operation is a recv() which will block as 
well (which is consistent).

It seems like nothing is received on the socket any longer, but it's 
difficult to verify with tcpdump since our server outputs something like 
15MB at peak time with 150 hits per second.

We have Shorewall installed and enabled, but what seems strange is that 
the problem depends on multithreading. It also occurs much more often on 
the 4-core machines than on the 2-core ones (both with Hyperthreading 
activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.

Any tip on how we could fix that or investigate further would be 
appreciated. After one month of debugging we're really out of solutions now.

Best,
Nicolas


* Re: poll() blocked / packets not received ?
From: swivel @ 2008-10-20 10:15 UTC
  To: Nicolas Cannasse; +Cc: linux-kernel

On Mon, Oct 20, 2008 at 10:25:10AM +0200, Nicolas Cannasse wrote:
> Hello,
> 
> We have an application that uses pthreads and (blocking) sockets.
> 
> When the application runs with one single thread in separate processes 
> (using fork()) we don't get any problem.
> 
> However when it's multithreaded, we sometimes get stuck while poll()ing 
> a socket (with events set to POLLIN). Even after the other side of the 
> connection has closed its side of the connection, we are still stuck 
> here. Adding a timeout only makes the poll() exit with 0, so we loop.
> 
> In case we don't loop the next operation is a recv() which will block as 
> well (which is consistent).
> 
> It seems like nothing is received on the socket any longer, but it's 
> difficult to verify with tcpdump since our server outputs something like 
> 15MB at peak time with 150 hits per second.
> 
> We have Shorewall installed and enabled, but what seems strange is that 
> the problem depends on multithreading. It also occurs much more often on 
> the 4-core machines than on the 2-core ones (both with Hyperthreading 
> activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.
> 
> Any tip on how we could fix that or investigate further would be 
> appreciated. After one month of debugging we're really out of solutions now.
> 
> Best,
> Nicolas

Your usage pattern is a very common one, I highly doubt you are experiencing
a kernel bug here or many people (including myself) would be complaining.

Shorewall sounds like it might be suspect. Are FINs not coming in when the
remote closes?  You can look at the output of netstat to see what state the
TCP connection is in. Is it still ESTABLISHED?

Have you tried just disabling the firewall to see if the problem
disappears?

Regards,
Vito Caputo


* Re: poll() blocked / packets not received ?
From: Nicolas Cannasse @ 2008-10-20 10:46 UTC
  To: swivel; +Cc: linux-kernel

>> We have Shorewall installed and enabled, but what seems strange is that 
>> the problem depends on multithreading. It also occurs much more often on 
>> the 4-core machines than on the 2-core ones (both with Hyperthreading 
>> activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.
>>
>> Any tip on how we could fix that or investigate further would be 
>> appreciated. After one month of debugging we're really out of solutions now.
>>
>> Best,
>> Nicolas
> 
> Your usage pattern is a very common one, I highly doubt you are experiencing
> a kernel bug here or many people (including myself) would be complaining.
> 
> Shorewall sounds like it might be suspect, are FIN's not coming in when the
> remote closes?  You can look in the output of netstat to see what state the
> TCP is in, still ESTABLISHED?

Yes, it's still ESTABLISHED, but we can't see the corresponding 
connection on the other machine while running netstat. I'm not a TCP 
expert, so I'm not sure in which case this can occur.

I agree with your comment in general, except that we have been running 
the same application in single-thread environment for years without 
running into this very specific problem.

The only logs we get in the dmesg are the following :

Either (a few every day) :

[10742708.006350] TCP: Treason uncloaked! Peer 213.209.177.218:32924/80 
shrinks window 4049064122:4049064123. Repaired.

Or (more often) :

[10755036.856217] Shorewall:net2all:DROP:IN=eth0 OUT= 
MAC=00:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:00 SRC=60.238.83.204 
DST=XX.XX.XX.43 LEN=404 TOS=0x00 PREC=0x00 TTL=114 ID=12366 PROTO=UDP 
SPT=1057 DPT=1434 LEN=384

Neither the SRC nor the DST IP corresponds to the connections that are 
stalled, since those occur on the local network.

Best,
Nicolas


* Re: poll() blocked / packets not received ?
From: swivel @ 2008-10-20 11:39 UTC
  To: Nicolas Cannasse; +Cc: linux-kernel

On Mon, Oct 20, 2008 at 12:46:56PM +0200, Nicolas Cannasse wrote:
> >>We have Shorewall installed and enabled, but what seems strange is that 
> >>the problem depends on multithreading. It also occurs much more often on 
> >>the 4-core machines than on the 2-core ones (both with Hyperthreading 
> >>activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by 
> >>Ubuntu.
> >>
> >>Any tip on how we could fix that or investigate further would be 
> >>appreciated. After one month of debugging we're really out of solutions 
> >>now.
> >>
> >>Best,
> >>Nicolas
> >
> >Your usage pattern is a very common one, I highly doubt you are 
> >experiencing
> >a kernel bug here or many people (including myself) would be complaining.
> >
> >Shorewall sounds like it might be suspect, are FIN's not coming in when the
> >remote closes?  You can look in the output of netstat to see what state the
> >TCP is in, still ESTABLISHED?
> 
> Yes, it's still ESTABLISHED, but we can't see the corresponding 
> connection on the other machine while running netstat. I'm not a TCP 
> expert, so I'm not sure in which case this can occur.

If the end that's blocking still has the TCP in ESTABLISHED state, and
the other end doesn't have the TCP at all... you've already identified
why the one end is still ESTABLISHED.  ESTABLISHED state won't be left
until the FIN is received from the other end, then entering CLOSE_WAIT
state.

When the other end of the TCP is _gone_ that leads me to believe a FIN
will not be coming, hence the indefinite ESTABLISHED state.  Why it's
gone is a different question, maybe your problem is at the other end?
The end initiating a shutdown has to enter FIN_WAIT_1 then FIN_WAIT_2,
these transitions require the other side to leave ESTABLISHED (receive a
FIN then ACK) at the very least to proceed.

> 
> I agree with your comment in general, except that we have been running 
> the same application in single-thread environment for years without 
> running into this very specific problem.
> 

Perhaps when you run in multicore/threaded you are stressing the network
stacks at both ends more, including everything in-between?  The
threading vs. single process relationship is probably not causal, but
just coincidental.

What is the protocol?  Are there any timeouts to take care of these
situations?  Do you schedule an alarm or use SO_RCVTIMEO to shutdown
dead connections and free up consumed threads?

Because TCP is reliable, a read can block indefinitely; you can employ TCP
keepalive to change indefinite to quite a long time.
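
A minimal sketch of enabling keepalive on a TCP socket, assuming Linux
(TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options).
The function name enable_keepalive and the values in the note below are
illustrative and do not come from this thread:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable TCP keepalive on fd so that a vanished peer is eventually
 * reported as an error instead of leaving the socket ESTABLISHED forever.
 *   idle_s:  seconds of silence before the first probe is sent
 *   intvl_s: seconds between unanswered probes
 *   count:   unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_keepalive(int fd, int idle_s, int intvl_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* The three options below are Linux-specific per-socket overrides. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 60, 10, 5), a dead peer would be noticed after
roughly 60 + 5 * 10 = 110 seconds of silence, instead of after the Linux
default of two hours of idle time before the first probe.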

Regards,
Vito Caputo


* Re: poll() blocked / packets not received ?
From: Nicolas Cannasse @ 2008-10-20 12:13 UTC
  To: swivel; +Cc: linux-kernel

swivel@shells.gnugeneration.com wrote:
> When the other end of the TCP is _gone_ that leads me to believe a FIN
> will not be coming, hence the indefinite ESTABLISHED state.  Why it's
> gone is a different question, maybe your problem is at the other end?
> The end initiating a shutdown has to enter FIN_WAIT_1 then FIN_WAIT_2,
> these transitions require the other side to leave ESTABLISHED (receive a
> FIN then ACK) at the very least to proceed.
> 
>> I agree with your comment in general, except that we have been running 
>> the same application in single-thread environment for years without 
>> running into this very specific problem.
>>
> 
> Perhaps when you run in multicore/threaded you are stressing the network
> stacks at both ends more, including everything in-between?  The
> threading vs. single process relationship is probably not causal, but
> just coincidental.

Not sure why this should happen, since it's the same servers. The only 
thing that changes is part of the software that we are using to handle our 
server requests. It's either embedded in Apache 1.3 with fork() or a 
standalone multithreaded server which acts as an Apache backend.

So the only difference for networking is that we have additional 
Apache<->MT-Server communications, but they should be on 127.0.0.1 so I 
think they are purely software and not hardware-related.

> What is the protocol?  Are there any timeouts to take care of these
> situations?  Do you schedule an alarm or use SO_RCVTIMEO to shutdown
> dead connections and free up consumed threads?

The protocol is MySQL. Since we had the problem with libmysqlclient, we 
reimplemented it again from scratch to make sure that it was not 
software-related.

What happens at the protocol-level is the following :

a) we connect to the server
b) we make several requests and get answers back
c) at some (random+rare) point - always after making a request - we're 
stuck while waiting for the answer.

Sadly, this can happen inside a transaction while we hold the lock on 
some shared resource. This will lock the whole website until we run out 
of file descriptors due to accept()ed pending connections. In that case we 
get an exception and the server (the multithreaded one, not MySQL) 
restarts, which releases the lock.

In some other cases when we don't hold a lock, the thread remains 
blocked in poll() as I described it. After a timeout (I think it's 28800 
seconds) the MySQL server closes the connection. The client - which is 
waiting in poll() - does not have any timeout activated (it's relying on 
the mysql server). But it doesn't notice that the socket has been closed 
either.

We investigated signals extensively, since poll() can also be interrupted 
by Garbage Collector and child process signals, but we correctly handle 
EINTR everywhere it's needed. So unless there's a possibility that 
interrupting poll() with a signal might somehow consume the data, this 
is not the problem here.

> TCP being reliable can block indefinitely, you can employ TCP keepalive
> to change indefinite to quite a long time.

Sure. We could also use a client timeout, but we don't want to hold the 
lock longer than required, and we can't tell the difference between a 
given request that takes too long to complete and a lost 
connection.

Hope we can somehow understand what's going on.
Thanks for the answers so far,

Best,
Nicolas


* Re: poll() blocked / packets not received ?
From: Nicolas Cannasse @ 2008-10-20 12:39 UTC
  To: swivel; +Cc: linux-kernel

> TCP being reliable can block indefinitely, you can employ TCP keepalive
> to change indefinite to quite a long time.

Ok, funny thing is that we just found what is occurring...

We had a process that was on a regular basis doing the following :

conntrack -F

This was done in order to prevent the table from growing too big, because we 
were reaching the maximum size as reported by :

/proc/sys/net/ipv4/netfilter/ip_conntrack_max
   and
/proc/sys/net/ipv4/netfilter/ip_conntrack_count

Seems like when there are active connections, this will break netfilter 
and stop delivering packets to the socket.
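
A hedged sketch of an alternative to flushing: check how full the table is
and raise an alert (or raise ip_conntrack_max) before it fills up. It
assumes each of the two /proc files above holds a single integer; the
helper names are hypothetical, and newer kernels expose the same counters
under nf_conntrack names instead:

```c
#include <stdio.h>

/* Read a single non-negative integer from a file such as
 * /proc/sys/net/ipv4/netfilter/ip_conntrack_count.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Percentage of the connection tracking table currently in use,
 * or -1 if either file could not be read. */
int conntrack_usage_percent(const char *count_path, const char *max_path)
{
    long count = read_long(count_path);
    long max = read_long(max_path);

    if (count < 0 || max <= 0)
        return -1;
    return (int)(count * 100 / max);
}
```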

At least I will have a nice sleep tonight.

Best,
Nicolas


* RE: poll() blocked / packets not received ?
From: David Schwartz @ 2008-10-20 15:53 UTC
  To: swivel, ncannasse; +Cc: linux-kernel


Nick Cannasse wrote:

> Ok, funny thing is that we just found what is occurring...
>
> We had a process that was on a regular basis doing the following :
>
> conntrack -F
>
> This was done in order to prevent the table from growing too big, because we
> were reaching the maximum size as reported by :
>
> /proc/sys/net/ipv4/netfilter/ip_conntrack_max
>    and
> /proc/sys/net/ipv4/netfilter/ip_conntrack_count
>
> Seems like when there are active connections, this will break netfilter
> and stop delivering packets to the socket.
>
> At least I will have a nice sleep tonight.

Note that this solved your symptom, not your problem. You actually have two
problems:

1) You rely on TCP to detect a lost connection even by a side that will
never transmit any data. TCP simply does not do this. If you are not trying
to send data, you are not assured that a lost connection will be detected.
(You either need a timeout, or you need to send or dribble some data,
depending on the protocol.)

2) You hold a lock on a shared resource while you wait for a reply over a
network. If this is a low-level "block and wait indefinitely" lock, this
will cause many threads to line up behind a slow/stuck thread. The right fix
depends on your circumstances, but you need to use a synchronization
primitive that is suitable. (You need to be able to use multiple connections
or defer operations without holding a thread.)

With both of these bugs, you are vulnerable to precisely the scenario you
observed. The TCP connection close packets were lost (in this case due to
premature expiration of the connection tracking, but other things can do
it, such as the server rebooting), TCP could not detect the lost connection
because you never sent any data, so one thread blocked forever, and other
threads got in line behind it.

DS




* Re: poll() blocked / packets not received ?
From: Nicolas Cannasse @ 2008-10-20 17:24 UTC
  To: davids; +Cc: swivel, linux-kernel

David Schwartz wrote:
>> At least I will have a nice sleep tonight.
> 
> Note that this solved your symptom, not your problem. You actually have two
> problems:
> 
> 1) You rely on TCP to detect a lost connection even by a side that will
> never transmit any data. TCP simply does not do this. If you are not trying
> to send data, you are not assured that a lost connection will be detected.
> (You either need a timeout, or you need to send or dribble some data,
> depending on the protocol.)
> 
> 2) You hold a lock on a shared resource while you wait for a reply over a
> network. If this is a low-level "block and wait indefinitely" lock, this
> will cause many threads to line up behind a slow/stuck thread. The right fix
> depends on your circumstances, but you need to use a synchronization
> primitive that is suitable. (You need to be able to use multiple connections
> or defer operations without holding a thread.)

I agree with both points, but I can't modify the MySQL protocol to 
implement that.

For (1) I can't add the timeout since I have no way to differentiate 
between a lost connection and a request that takes time to execute. I'll 
maybe check whether the protocol allows pings while waiting for the request 
result, but I'm not sure it does.

For (2) the shared resource is on the database side, not on the server 
side. It's the transaction that has some rows locked. I have no 
solution for that.

Best,
Nicolas


* RE: poll() blocked / packets not received ?
From: David Schwartz @ 2008-10-20 23:21 UTC
  To: ncannasse; +Cc: swivel, linux-kernel


> I agree with both points, but I can't modify the MySQL protocol to
> implement that.

> For (1) I can't add the timeout since I have no way to differentiate
> between a lost connection and a request that takes time to execute. I'll
> maybe check whether the protocol allows pings while waiting for the request
> result, but I'm not sure it does.

Sure you can. For example, you can run a proxy on both the server and the
client, with the two proxies speaking a protocol that carries the MySQL
protocol. The protocol between the server and the client can include two
types of messages, one being regular data (which the proxies pass to the
server and client software) and one being a ping (which the proxies use
internally to decide when to drop their connections). Each proxy can 'ping'
the other as often as required and drop both connections if the ping fails
to go through. This will ensure that your program detects a connection loss
rapidly.
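
One way to picture the framing such a proxy pair could use is a small
header carrying a message type and a payload length. This is only a sketch
of the idea; the 5-byte layout, the FRAME_DATA and FRAME_PING names and the
helper functions are assumptions for illustration, not an existing
protocol:

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message types exchanged between the two proxies. */
enum frame_type { FRAME_DATA = 0, FRAME_PING = 1 };

#define FRAME_HEADER_LEN 5 /* 1 byte type + 4 bytes big-endian length */

/* Write a frame (header plus payload) into out.
 * Returns the total number of bytes written, or 0 if out is too small. */
size_t frame_encode(uint8_t *out, size_t out_cap, enum frame_type type,
                    const void *payload, uint32_t len)
{
    uint32_t be_len = htonl(len);

    if (out_cap < FRAME_HEADER_LEN || len > out_cap - FRAME_HEADER_LEN)
        return 0;
    out[0] = (uint8_t)type;
    memcpy(out + 1, &be_len, sizeof(be_len));
    if (len > 0)
        memcpy(out + FRAME_HEADER_LEN, payload, len);
    return FRAME_HEADER_LEN + (size_t)len;
}

/* Parse the FRAME_HEADER_LEN bytes at in.
 * Returns 0 and fills type and len, or -1 if the type byte is unknown. */
int frame_decode_header(const uint8_t *in, enum frame_type *type,
                        uint32_t *len)
{
    uint32_t be_len;

    if (in[0] != FRAME_DATA && in[0] != FRAME_PING)
        return -1;
    *type = (enum frame_type)in[0];
    memcpy(&be_len, in + 1, sizeof(be_len));
    *len = ntohl(be_len);
    return 0;
}
```

A proxy would forward the payload of FRAME_DATA frames to the local MySQL
client or server untouched, and would treat a missing FRAME_PING within
some chosen interval as a lost connection.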

There are many other possible solutions.

> For (2) the shared resource is on the database side, not on the server
> side. It's the transaction that has some rows locked. I have no
> solution for that.

That doesn't fit your problem description. Presumably the server detected
the loss of the connection and so would have released any resources it was
holding that were associated with it. The problem in this case was that the
client couldn't detect the loss of the connection.

> Best,
> Nicolas

Good luck.

DS




* Re: poll() blocked / packets not received ?
From: Willy Tarreau @ 2008-10-21  5:12 UTC
  To: Nicolas Cannasse; +Cc: davids, swivel, linux-kernel

On Mon, Oct 20, 2008 at 07:24:14PM +0200, Nicolas Cannasse wrote:
> For (1) I can't add the timeout since I have no way to differentiate 
> between a lost connection and a request that takes time to execute.

Not only can you, but you *must*. Any service assuming an infinite timeout
is doomed to fail. If you know that one request can take as long as one
minute for instance, then use a 2-minute timeout. The day a failed firewall
between client and server strands every request, you'll be happy that they
are cleaned up automatically and that you don't have to come in and restart
the service to flush them.

There's a huge difference between using a very large timeout and none at
all.
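
A minimal sketch of such a generous but finite timeout using SO_RCVTIMEO,
so that a blocking recv() gives up instead of waiting forever. The helper
names and the -2 return code are illustrative assumptions:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Make blocking receives on fd fail after timeout_ms milliseconds.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_recv_timeout_ms(int fd, long timeout_ms)
{
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* recv() wrapper that separates the possible outcomes:
 *   > 0  number of bytes received
 *     0  the peer closed the connection
 *    -1  error (see errno)
 *    -2  the receive timeout expired with no data
 * A signal interruption restarts the call, which also restarts the
 * timeout period. */
ssize_t recv_or_timeout(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = recv(fd, buf, len, 0);
    } while (n < 0 && errno == EINTR);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

With a request that may take up to one minute, set_recv_timeout_ms(fd,
120 * 1000) gives the 2-minute limit from the example above. When waiting
in poll() instead of recv(), the timeout argument of poll() plays the same
role.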

Willy

