FROM : Daniel Hazelbaker
DATE : Wed Jul 26 17:32:15 2006
Okay, got more information on this.
I ran a sample of the program while it was hung, and both of its
threads are "stuck". One spent 100% of the sample in select, the
other 100% of the sample in sendto (10-second sample). I accidentally
killed the client process (the stuck one) and the server never
noticed. When I check netstat on the server machine, it still
shows the socket ESTABLISHED with a Recv-Q _AND_ a Send-Q of 66608.
Attaching a debugger to the server shows the thread handling that
connection is also "stuck" in sendto.
Apparently, I am somehow managing to get deadlocked. Both processes
are in blocking mode waiting to write data, so neither can read data
from the "full" socket. I'm not sure how this happens in the first
place. Obviously, though, I need to make my code non-blocking, and
hopefully that will clear it all up.
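
For reference, here is a minimal sketch of the non-blocking approach
described above. It is not the code from the program in question. The
helper names set_nonblocking and send_all_draining are hypothetical,
and it assumes a connected stream socket. The idea is to never sit in
a blocking sendto: wait in select for either direction to become
ready, and keep reading while sending so the peer's pending writes can
drain.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: send all of `out` on a non-blocking socket while
 * draining incoming bytes into `in`, so both peers keep making progress.
 * Returns 0 once everything is sent, -1 on error or if the peer closed.
 * *in_len receives the number of bytes read along the way.
 *
 * Limitation of this sketch: once `in` is full it stops reading, so a real
 * program would process or grow the buffer instead.
 */
static int send_all_draining(int fd, const char *out, size_t out_len,
                             char *in, size_t in_cap, size_t *in_len)
{
    size_t sent = 0;
    *in_len = 0;

    while (sent < out_len) {
        fd_set rfds, wfds;
        FD_ZERO(&rfds);
        FD_ZERO(&wfds);
        if (*in_len < in_cap)
            FD_SET(fd, &rfds);      /* only ask to read if there is room */
        FD_SET(fd, &wfds);

        /* Block until the socket is readable or writable. */
        if (select(fd + 1, &rfds, &wfds, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        if (FD_ISSET(fd, &rfds)) {
            ssize_t n = recv(fd, in + *in_len, in_cap - *in_len, 0);
            if (n > 0)
                *in_len += (size_t)n;
            else if (n == 0)
                return -1;          /* peer closed the connection */
            else if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
                return -1;
        }

        if (FD_ISSET(fd, &wfds)) {
            ssize_t n = send(fd, out + sent, out_len - sent, 0);
            if (n > 0)
                sent += (size_t)n;  /* partial writes are normal here */
            else if (n == -1 && errno != EAGAIN && errno != EWOULDBLOCK &&
                     errno != EINTR)
                return -1;
        }
    }
    return 0;
}
```

Calling set_nonblocking once after connect or accept, and then using
a loop like this for every write, means a full send buffer shows up as
EAGAIN instead of a hang, and data the peer has already queued still
gets read.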
Daniel
On Jul 25, 2006, at 8:28 AM, Daniel Hazelbaker wrote:
> Greetings,
>
> Got an odd problem. I have a program that runs fine on every
> other machine (so far) except a single machine. There is nothing
> special about this machine, we have others with the same hardware/
> software and they work fine. It is a command-line program that
> runs in the background normally via fork() (although I have seen
> this behavior when running it in the foreground). Anyway, the
> program seems to "stop" running its timer (runs every second to
> process a bit of work). The whole application does not die as it
> has an active socket connected to a server that sends a "ping" to
> it every minute to make sure it is still alive. The ping is being
> responded to, but that is all. No other activity is happening.
>
> That is until I attach the debugger. I did it just now by gdb
> <app> <pid> and then quit gdb and the application started running
> fine again. Has anybody seen this kind of behavior before, or know
> what signal gdb might be generating in the program to cause it to
> start running normally again? I will try to post more information
> later as I dig into this and put some logging in (I have to do
> special logging; if I just log every time the timer fires, it will be
> a massive log to the console, as it runs fine for hours on end), but
> for the moment I can't imagine what gdb could be doing to cause the
> problem to go away.
>
> Daniel
> _______________________________________________
> MacOSX-dev mailing list
> <email_removed>
> http://www.omnigroup.com/mailman/listinfo/macosx-dev
>
Related mails

| Author | Date |
|---|---|
| Daniel Hazelbaker | Jul 25, 17:28 |
| Daniel Hazelbaker | Jul 26, 17:32 |
| Ben Dale | Jul 27, 03:14 |





