On Tue, Feb 23, 2010 at 07:02:16AM -0800, Stephens, Allan wrote:
Not at all, I'm saying quite the opposite, that TIPC should rely on its own
state, but in its current implementation:
1) It doesn't (i.e. theres no check in the send path for messages that the
internal mode is TIPC_NET_MODE if the destination address is not for the local z.c.n
tuple).
2) It couldn't rely on the internal state if it did check (i.e tipc_net_start
sets tipc_mode to TIPC_NET_MODE prior to initalizing the data structures
required to support sending messages off node). So een if we did do a check for
being in NET mode prior to sending a message, it would be useless because theres
a window of time where the implementation says its ok to send, but its really
not (thats what caused the above oops).
Well, yes. Sorry for not being clear. We only need to check that we're in net
mode if we're not sending to the local node.
My initial patch checked for this:
+ if ((tipc_mode != TIPC_NET_MODE) &&
+ (dest->addr.name.domain != tipc_own_addr))
+ return -EHOSTUNREACH;
+
If we're not in NET mode but the destination is not tipc_own_addr (initialized
to <0.0.0>) we can still send.
Yes, it works fine, as long as user space applications all do the right things,
No, you got that exactly right :).
As you said, its perfectly legal, and one of the mandates of the kernel is to
prevent unprivlidged user space applications from taking down the entire system
when they do something stupid. I completely agree with you that user space is
doing bad things here, but bad things need to result in errors and broken
applications, not crashed systems.
I agree that you patch fixes the exact problem that I reported here, but theres
more to it than that. A quick grep of the tipc stack reveals the following
symbols:
tipc_bearers
media_list
tipc_local_nodes
bcbearer
bclink
tipc_net.zones
All of these symbols:
1) Are allocated dynamically in tipc_net_start, _after_ tipc_mode is set to
TIPC_NET_MODE
2) dereferenced without NULL pointer checks in either the send path or in the
netlink configuration path, both of which are reachable from user space.
So your patch fixes the last item on your list, but what about the others? In
fact, I'll bet I can very quickly change the application to trip over a null
tipc_local_nodes dereference by changing the destination address to be something
within zone 0, cluster 0.
I really think the proper solution needs to be modifying the send and control
paths to gate on the internal state, and modify the init/shutdown code to
change state properly.
I'll let you know what I come up with
Neil
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html