Do what David Stevens suggests: Add a per socket option
Subject: Multicast: Filter Multicast traffic per socket mc_list
If two processes open the same port as a multicast socket and then
join two different multicast groups then traffic for both multicast groups
is forwarded to either process. This means that applications will get surprising
data that they did not ask for. Applications will have to filter these out in
order to work correctly if multiple apps run on the same system.
These are pretty strange semantics but they have been around since the
beginning of multicast support on Unix systems. Most of the other operating
systems supporting Multicast have since changed to only supplying multicast
traffic to a socket that was selected through multicast join operations.
This patch does change Linux to behave in the same way. But there may be
applications that rely on the old behavior. Therefore we provide a means
to switch back to the old behavior using a new multicast socket option
IP_MULTICAST_ALL
If set then all multicast traffic to the port is forwarded to the socket
(additional constraints are the SSM inclusion and exclusion lists!).
If not set (default) then only traffic for multicast groups that were
joined by the socket is received.
Signed-off-by: Christoph Lameter <cl@linux.com>
---
include/linux/in.h | 1 +
include/net/inet_sock.h | 3 ++-
net/ipv4/igmp.c | 4 ++--
net/ipv4/ip_sockglue.c | 11 +++++++++++
4 files changed, 16 insertions(+), 3 deletions(-)
Index: linux-2.6/include/net/inet_sock.h
===================================================================
--- linux-2.6.orig/include/net/inet_sock.h 2009-04-16 08:59:20.000000000 -0500
+++ linux-2.6/include/net/inet_sock.h 2009-04-16 09:04:47.000000000 -0500
@@ -130,7 +130,8 @@ struct inet_sock {
freebind:1,
hdrincl:1,
mc_loop:1,
- transparent:1;
+ transparent:1,
+ mc_all:1;
int mc_index;
__be32 mc_addr;
struct ...
This isn't what I suggested-- you have the default backwards. It must default
to current behavior, or it's pointless.
The text you have with it is overstated, too. Of course applications using
your model can still receive unexpected data-- it does not reserve the
port or multicast address to just your sender or to multicast traffic
alone.
My suggestion is to do nothing. :-) But if that's too difficult, an
alternative would be a socket option that delivers traffic for joined
groups only and defaults off. In fact, it'd probably be most useful if it
also prevents unicast traffic for sockets using that port, too. None of
these things have the magic effect of preventing unwanted data delivery,
but it'd allow you to receive multiple, specific groups on a single socket
with just the joins to indicate which.
+-DLS
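With the option name from the patch and the default David asks for (option set, meaning the existing behavior), an application that wants only its joined groups would have to clear the option on its own socket. A minimal sketch of that call, assuming the system headers define IP_MULTICAST_ALL (the fallback value 49 is the one used by the Linux headers); the helper name set_multicast_all is hypothetical:

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Fallback for older headers; 49 is the value used by the Linux headers. */
#ifndef IP_MULTICAST_ALL
#define IP_MULTICAST_ALL 49
#endif

/*
 * Hypothetical helper: set or clear IP_MULTICAST_ALL on a UDP socket.
 *   on = 1: deliver multicast for any group arriving on the bound port
 *           (the long-standing behavior discussed in this thread).
 *   on = 0: deliver only traffic for groups this socket has joined.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
int set_multicast_all(int fd, int on)
{
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_ALL, &on, sizeof(on));
}
```

An application would call set_multicast_all(fd, 0) right after creating the socket and before joining its groups. Source filters and unicast traffic to the same port are unaffected by this option.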
--
If it would default to the current behavior then it would be incompatible with the behavior of other operating systems and the surprising behavior of the Linux multicast stack would continue to exist. The unusual behavior The application will no longer receive traffic from multicast groups that it did not subscribe to. Yes unicast can still result in unexpected traffic. --
From: Christoph Lameter <cl@linux.com> Umm, no. We don't break existing applications "by default". You're being entirely selfish here, you want your application to work without having to specify the socket option to get the new behavior. Well guess what? Under Linux you will have to! --
I think your comment is reversed here, isn't it? The default you have below is that mc_all is set, which defaults you to the existing behavior, rather than the new behavior introduced by this patch. Ack to the patch though. Acked-by: Neil Horman <nhorman@tuxdriver.com> Neil --
I'm sorry, I misread it (confused the definition of a bitfield with its default value). As Dave noted, the default needs to be the current behavior, not your new behavior. Until that's changed, I rescind my Ack --
Well guess then we need the global proc setting after all. With the current misbehavior as a default applications need to be rebuilt and source code that is running on multiple OSes now would have to be customized to special case for Linux. So add a global proc setting to determine the initial setting of IP_MULTICAST_ALL? --
The current behavior, as either your or Vlad's RFC quotes pointed
out as easily as the history to go with it, is exactly the expected
behavior for decades. I think it is not misbehavior so much as your
misconception,
No, actually. If you write it for the current behavior, it'll work
fine on an OS like Solaris that has departed from the original socket
behavior. If you're sloppy and don't handle unexpected traffic, it'll be
wrong on both-- you just won't know it until someone runs something with
IP_MULTICAST_ALL?
This breaks unknown existing applications that are correctly
written. I think it's clearly wrong to change the behavior of someone
else's socket to match your idea of how it should've been done 25 years
too late. An option that enables new behavior for your own socket, which
must be a new app, is fine. Adding a socket option as part of a port
is no great hurdle, and I'm guessing you aren't trying to run a Solaris
binary on Linux. So what's the problem?
+-DLS
--
Guess it's the obvious: Software should run on multiple OSes without too much special casing. Linux is the only special case that I am aware of that misbehaves. Adding a socket is no easy thing given the architecture of the software (and of other software) that did not consider that Linux is faithfully replicating bugs from 25 years ago that no longer exist in other OSes. Cannot imagine there to be too much software out there that relies on this strange behavior. Otherwise the software would not work on various other platforms. Can you give us a list of products that verifiably rely on the current behavior? --
All flavors of UNIX did it this way originally. I never tried
it on Windows. I heard years ago when Solaris changed their behavior
and it's been reported in this thread that current BSD does, too.
But, again, this is not in the least misbehavior. It simply doesn't
follow your model of how you thought it behaved. Linux does exactly
what Steve Deering wanted multicasting to do when he wrote the RFC
for it. It adds an address on the interface, and the binding determines
whether it's delivered to a particular socket or not. That is the
"ANY" in INADDR_ANY, just like unicasting. If you want particular
addresses only, the bind system call does that already. It makes
I don't have any say in what other OSes do, but I'd call it a bug
I don't know the extent of your survey, but Linux legacy is the
problem with changing the default behavior for sockets other than your
app. You don't need any special code at all-- write them all to assume
they may receive packets not for them, because they are broken if they
I don't do app surveys any more than you do OS surveys. But
I don't want to change the semantics of multicast sockets and you do.
Can you guarantee nothing will break from this change?
+-DLS
--
From: Christoph Lameter <cl@linux.com> Christoph just drop this, we're not creating a system-wide default selection that backs away from 15+ years of precedence. Maybe Solaris has so few users that it's OK for them to go down that path, but for us it's unacceptable to do things like this. Fix your application. And as David noted, it will be not only more robust, but also still work on those "other systems." So even your "works on all systems" argument is groundless. If you make it work under Linux it will in fact work on all systems, and be more robust in the case of other applications using the same multicast address and port. --
What seems to be happening though, is that there is an expectation that this behavior would change with advent of IGMPv3, which adds the additional filtering text. Now, we could point out that there is no normative text that requires this filtering on groups, only on sources, but the expectation I'd have to reluctantly agree here. Any application that expects original multicast behavior will be broken by a system-wide change. I think existing I wonder how BSD and Solaris got away with it? They both filter on multicast groups and source addresses. This is not meant as rhetorical or provocative, just genuinely wondering. -vlad --
From: Vlad Yasevich <vladislav.yasevich@hp.com> Smaller user base. --
I have no such expectation. :-) The additional filters are (already)
applied per-socket, but existing apps not using source filters behave as
they did before IGMPv3. That's what I'd expect.
The RFC you quoted for SSM applies to only the SSM address space,
mentions this behavior explicitly as the norm for outside of that space,
and Linux doesn't support that RFC. If it did, it would include an
I think in practice, it doesn't come up much. That's why people
seem so surprised to learn it works this way, and not the way they
thought it did after using it, sometimes for years. But the documentation
doesn't say a join limits what you receive on a socket, or that it
has to be the same socket you're doing I/O on; people simply assume it.
+-DLS
--
On Thu, 16 Apr 2009 15:22:49 -0700 You could always use packet/socket filter to keep the packets from coming out to user space. --
Yes, after reading more of SSM spec, it definitely only applies to SSM
addresses that we don't support yet. Just to clear this one item up,
I think the expectation comes from the IGMPv3 spec:
Filtering of packets based upon a socket's multicast reception
state is a new feature of this service interface. The previous
service interface [RFC1112] described no filtering based upon
multicast join state; rather, a join on a socket simply caused the
host to join a group on the given interface, and packets destined
for that group could be delivered to all sockets whether they had
joined or not.
It could be inferred from this rather vague text that in addition to source
filtering, group filters should be done. Thus the expectation that we've
been dealing with.
That's the last I'll mention this, since most salient points have been
agreed on.
Thanks
--
From: Christoph Lameter <cl@linux.com> No Christoph, do this right. Linux by default will behave the way it has for 15+ years. And if an application wants new behavior, you have to ask for it. End of story. --
This is not right. All other OSes filter multicast traffic according to the multicast groups subscribed to (and that includes the evil one). There is no requirement of asking for "new" behavior. Why should multicast applications have to add special code to request something that comes by default on other platforms? The old behavior does not seem to be usable anyways and it certainly looks buggy if multicast packets are duplicated by the kernel and sent to applications that never have asked for it. An OS should do the sane thing by default and not only if someone asks for it. --
I need the current behaviour to not change, as it would break some people I support. DaveM is making the right decision here, and I fully support this. And I'm one of those people working on low latency and hoping messaging clients get better in their multicast usage..just that this is not one of those ways. Ideally, you could tweak OS environment configuration setting, if you don't want per socket. But it cannot be the default. thanks, Nivedita --
People or applications? There are applications that only run on Linux and fail on other OS? How does this work? Special casing depending on the OS Would you support an additional OS config variable that would set the default for socket operations? Then we could have a per socket option that would allow overriding the OS config variable? --
That would be my choice personally, because it would be easier than scripting some solution to modify potentially hundreds of sockets on a system... Does that sound acceptable? thanks, Nivedita --
From: Christoph Lameter <cl@linux.com> Christoph I just want to let you know that I'm totally ignoring everything further you say on this issue, because you're way out of line and totally ignoring the real issues here. What's next? Tomorrow, if you think Linux's open() system call behavior doesn't suit your needs, I want you to send a sysctl patch to Al Viro that changes the system wide behavior and we'll see how far you get with that. The fact is, you cannot just say "oops we didn't mean to do that" when something has behaved a certain way, visible to users, for more than 15 years. And the fact is, WE DID MEAN to do things this way. As David Stevens explained, the original creator of multicasting, the original BSD code, and the RFCs, INTENDED this behavior from the very beginning. You want to ignore all of this, as if none of it matters and that what you want to achieve is so much more important. --
I am not ignoring it. It seems just that other OSes have moved from this and we are one of the last holdouts. It's not only Solaris but also BSD and Windoze. Best to have a solution that is consistent across multiple OSes. --
Linux is not Solaris. I think Solaris is wrong to change the
behavior from the original BSD behavior, but it should be no surprise
that there are other differences in the API's, too. It's not difficult
to write code that works as intended on both, and the case Solaris is
trying to avoid is not really avoided since you can still receive
unicast traffic, or totally unrelated multicast traffic on the shared
port and multicast address space. If the app doesn't use the port to
distinguish it, it simply should bind the multicast address it wants,
use PKTINFO, SO_BINDTODEVICE or the like as well. In your case, multiple
sockets or filtering based on the "to" address are possibilities that
work on Solaris too, and fix more unintended traffic problems than
just a different group.
A per-socket option is a more trivial way to do this, but
turning it on for sockets that want the existing, intended and
long-standing behavior is obviously wrong.
+-DLS
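The "to"-address filtering mentioned above can be sketched with IP_PKTINFO. This assumes the application has enabled IP_PKTINFO on the socket with setsockopt() and passes a control buffer to recvmsg(); the helper names pktinfo_dst and wanted_datagram are hypothetical:

```c
#define _GNU_SOURCE             /* for struct in_pktinfo on glibc */
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: extract the destination ("to") address from the
 * ancillary data that recvmsg() fills in when IP_PKTINFO is enabled.
 * Returns 1 and stores the address in *dst, or 0 if the message carries
 * no IP_PKTINFO control message.
 */
int pktinfo_dst(struct msghdr *msg, struct in_addr *dst)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo pi;

            /* Copy out rather than cast: CMSG_DATA may be unaligned. */
            memcpy(&pi, CMSG_DATA(c), sizeof(pi));
            *dst = pi.ipi_addr;     /* destination from the IP header */
            return 1;
        }
    }
    return 0;
}

/* Keep a datagram only if it was addressed to the group the app wants. */
int wanted_datagram(struct msghdr *msg, struct in_addr group)
{
    struct in_addr dst;

    return pktinfo_dst(msg, &dst) && dst.s_addr == group.s_addr;
}
```

A receive loop would call wanted_datagram() on each message returned by recvmsg() and drop the ones that fail. This catches unicast and unrelated groups on the same port, but not unrelated senders using the same group and port, which the application still has to handle.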
--
By that you mean unrelated multicast traffic destined to the same multicast address and port? --
Yes. If neither the port nor the multicast address are
registered then anyone on your network can use them for anything.
Even if they are registered, someone may still use it; sending
requires no special privilege, and neither does joining groups or
binding to ports above 1024. Anyone on your network, or within
your multicast routing domain, may reuse both (even if they
intend it for a different machine) and your app will receive
them.
I think generally the best approach is to bind to the
particular multicast address and use SO_BINDTODEVICE if it
matters to the app. But the app still has to handle receiving
data from a different source or totally unrelated data;
it certainly can receive those, because anyone can send those.
I can see the value of a per-socket, default-off option
in the case where you want multiple groups on a single socket,
and I encourage you to submit that as a patch. It reduces the
work the receiver has to do, but doesn't eliminate it. The
way I'd do that is to use multiple sockets, one bound to each
group, but ok. As long as it doesn't change the existing
behavior out from under existing, unknown apps.
+-DLS
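The one-socket-per-group approach described above could look roughly like this. It is a sketch only: the helper name open_group_socket is hypothetical, the interface is left to the kernel's choice (INADDR_ANY), and SO_BINDTODEVICE is omitted:

```c
#define _GNU_SOURCE             /* for struct ip_mreq on glibc */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a UDP socket bound to one multicast group
 * address (rather than INADDR_ANY) and joined to that same group, so
 * that only datagrams sent to that group and port reach the socket.
 * Returns the socket descriptor, or -1 on any failure.
 */
int open_group_socket(const char *group, uint16_t port)
{
    struct sockaddr_in sa;
    struct ip_mreq mreq;
    int fd, one = 1;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, group, &sa.sin_addr) != 1)
        return -1;                              /* not an IPv4 address */
    if (!IN_MULTICAST(ntohl(sa.sin_addr.s_addr)))
        return -1;                              /* not in 224.0.0.0/4 */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&mreq, 0, sizeof(mreq));
    mreq.imr_multiaddr = sa.sin_addr;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);  /* kernel picks */

    /* SO_REUSEADDR lets several processes share the group and port. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0 ||
        bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                   &mreq, sizeof(mreq)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

An application interested in several groups would call this once per group and poll the resulting descriptors. As noted above, it still has to cope with unrelated senders on the same group and port.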
--
I don't think this change is needed. ipv4_is_lbcast() checks if the address is 255.255.255.255. That address is already !ipv4_is_multicast(). You might need to set inet->mc_all to 1 in inet_create() since I am not sure if we want to change the default behavior. The knowledge that some apps have a very "unique" way of doing multicast makes me a little hesitant. -vlad --
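Both points can be illustrated in user space. The following is a simplified model, not the kernel code: is_multicast and is_lbcast are stand-ins for the kernel predicates named above, mc_allow is a hypothetical name, and source filters are left out:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses are in network byte order, as in the kernel. */
int is_multicast(uint32_t addr)         /* 224.0.0.0/4 */
{
    return (addr & htonl(0xf0000000)) == htonl(0xe0000000);
}

int is_lbcast(uint32_t addr)            /* 255.255.255.255 */
{
    return addr == htonl(INADDR_BROADCAST);
}

/*
 * Simplified per-socket delivery decision:
 *   - non-multicast destinations (including limited broadcast) pass,
 *   - groups the socket has joined pass,
 *   - any other group passes only if mc_all is set.
 */
int mc_allow(uint32_t dst, const uint32_t *joined, size_t n, int mc_all)
{
    size_t i;

    if (!is_multicast(dst))
        return 1;
    for (i = 0; i < n; i++)
        if (joined[i] == dst)
            return 1;
    return mc_all;
}
```

Because 255.255.255.255 already fails is_multicast(), a separate limited-broadcast test before the group check adds nothing. With mc_all initialised to 1, mc_allow() returns 1 for every destination, which is the existing behavior.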
Those "unique" applications would only be able to run on Linux. Applications are mostly written for multiple Unix variants. Since the other Unix variants have changed their default behavior it is reasonable to also change the default under Linux. --
