Re: dropping incoming DO error
FROM : Chris Kane
DATE : Fri Nov 02 19:18:10 2007

On Nov 1, 2007, at 8:45 PM, Steve Gehrman wrote:
> Anyone know what causes this?  Anyone know how I can figure out 
> where it's coming from?
>
> *** -[NSMachPort handlePortMessage:]: dropping incoming DO message 
> because the connection or ports are invalid
>
> -steve



I can take a stab at this ...

Once one side of a DO connection sends a message to the other side, a 
race begins.  Or, three races.  While the message is "in-
flight" (stored in the kernel for this NSMachPort case), the process 
which is going to receive it might decide, for whatever reason, to 
shut down the connection.  Thus, when the message is pulled from the 
kernel, the connection is already invalid and data structures torn 
down, and there's nothing to be done with the message.

The second/third races are more interesting and likely.  While the 
message is in-flight, the sending process might decide to invalidate 
the connection, or terminate.  Either of those will invalidate the 
sender's Mach ports in the message.  Plus, port death notification 
messages get generated by the kernel and sent to the process with the 
other end of the connection (since it is interested).

When the receiving process gets those port death notifications it is 
going to act on them by invalidating the connection and tearing down 
data structures.  Then the message might be pulled from the kernel. 
So this is kind of a variation on the first race but still 
interesting in its own right.

Or, the message that was sent might be pulled from the kernel first 
before those port death notifications, but still all is not well. 
When a port goes invalid, the kernel scribbles MACH_PORT_DEAD over 
the previous port identifier for that port in all messages still 
waiting in the kernel.  When the receiving process pulls the message 
from the kernel, it sees MACH_PORT_DEAD rather than the sender's port 
identifier.  Well, the sender's port is part of the information that 
the lowest layers use to figure out which connection the message is 
destined for, since of course a process can have many connections to 
other processes.  The received message lacks the information needed 
to do the mapping, so it must be dropped (and likely, connection 
invalidation is imminent in any case).  And of course one could be 
using multiple threads and more than one thread could be doing some 
of these steps.
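
To make the MACH_PORT_DEAD case concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not actual Mach or DO code: the ToyKernel and ToyReceiver classes, the port number 1234, and the "connection-A" name are all made up for illustration, and the sentinel merely stands in for the real MACH_PORT_DEAD constant.

```python
# Toy model only: none of these names are real Mach or Foundation APIs.
MACH_PORT_DEAD = 0xFFFFFFFF  # stand-in sentinel for the real constant


class ToyKernel:
    """Holds in-flight messages until the receiver pulls them."""

    def __init__(self):
        self.queue = []

    def send(self, sender_port, payload):
        self.queue.append({"sender_port": sender_port, "payload": payload})

    def port_died(self, port):
        # Overwrite the dead port's identifier in every queued message.
        for msg in self.queue:
            if msg["sender_port"] == port:
                msg["sender_port"] = MACH_PORT_DEAD

    def receive(self):
        return self.queue.pop(0)


class ToyReceiver:
    """Maps a sender port to the connection the message belongs to."""

    def __init__(self):
        self.connections = {}  # sender port -> connection name

    def handle(self, msg):
        conn = self.connections.get(msg["sender_port"])
        if conn is None:
            # No way to tell which connection this was meant for.
            return "dropped"
        return f"delivered to {conn}"


def demo(sender_dies_before_receive):
    kernel = ToyKernel()
    receiver = ToyReceiver()
    receiver.connections[1234] = "connection-A"  # hypothetical port number
    kernel.send(1234, "goodbye")
    if sender_dies_before_receive:
        kernel.port_died(1234)
    return receiver.handle(kernel.receive())


if __name__ == "__main__":
    print(demo(sender_dies_before_receive=False))
    print(demo(sender_dies_before_receive=True))
```

The only difference between the two runs is whether the sender's port dies while the message is still queued; the receiver's lookup code is identical in both.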

All these races can potentially apply to all versions of Mac OS X, if 
the timing is "right".

So one can imagine how this might occur pretty easily in the 
termination case.  A sender sends, say, a oneway DO message to 
another process and immediately quits.  Maybe it is sending a 
"goodbye" message.  There is then a race between the receipt of the 
in-flight message on the other side and the death of those ports in 
the sender; the sender is inadvertently (probably!) screwing its own 
last message by terminating "too quickly".  And the insidious thing 
is that this can work sometimes, and not others, depending on the 
timing.  Or it can work in one OS release, and not another, because, 
say, performance of something during shutdown improves (or 
the kernel gets faster) and the sender now goes away a little 
quicker than it used to.  Or it might not happen on a slower machine 
but does on a faster (or multicore) machine.  Or whatever.
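
The "goodbye" scenario can be laid out as an enumeration of event orderings. The Python sketch below is a simplified model under stated assumptions, not a description of the real implementation: it treats the exchange as four atomic events (the names "send", "exit", "notify", and "receive" are invented here), assumes the sender sends before it exits, and assumes the port death notification is processed only after the exit.

```python
from itertools import permutations

# Hypothetical event names for this model:
#   send    - sender queues the oneway message in the kernel
#   exit    - sender terminates, so its ports die
#   notify  - receiver processes the port death notification
#   receive - receiver pulls the message from the kernel
EVENTS = ("send", "exit", "notify", "receive")


def is_possible(order):
    """Keep only orderings consistent with the scenario's causality."""
    pos = {event: i for i, event in enumerate(order)}
    return pos["send"] < pos["exit"] < pos["notify"] and pos["send"] < pos["receive"]


def outcome(order):
    pos = {event: i for i, event in enumerate(order)}
    if pos["receive"] < pos["exit"]:
        return "delivered"
    if pos["notify"] < pos["receive"]:
        return "dropped: connection already invalidated"
    return "dropped: sender port reads as dead"


def enumerate_outcomes():
    return {
        order: outcome(order)
        for order in permutations(EVENTS)
        if is_possible(order)
    }


if __name__ == "__main__":
    for order, result in enumerate_outcomes().items():
        print(" -> ".join(order), ":", result)
```

In this model only one of the three possible orderings delivers the message, and which one occurs depends purely on how quickly the sender goes away relative to the receiver picking up the message.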


As for figuring out where it is coming from ...
So the thing to look for I suppose is where you have processes 
terminating, or manual connection invalidations going on.  Do a 
thought experiment where you imagine messages that are sent back and 
forth remain in-flight for an hour, and think about whether you're 
doing something that might cause this race to crop up.  (Assuming you 
saw this message with one of your own apps.)


Chris Kane
Cocoa Frameworks, Apple
