I am trying to wrap my head around DSA and I need some help.
Assume the example from Lennert:

	+-----------+       +-----------+
	|           | RGMII |           |
	|           +-------+           +------ 1000baseT MDI ("WAN")
	|           |       |  6-port   +------ 1000baseT MDI ("LAN1")
	|    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
	|           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
	|           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
	|           |       |           |
	+-----------+       +-----------+

If I understand this correctly I get at least 5 virtual I/Fs
corresponding to WAN, LAN1-4, but how is the RGMII I/F modelled?
I guess I will have one "real" ethX I/F which maps to RGMII but do I get
one virtual I/F too?

What use are these virtual I/Fs? Just to read status from the
corresponding ports? Can one TX and RX network pkgs over these I/Fs too?

Now I want to add STP/RSTP to the switch. How would one do that?

Jocke
--
The RGMII interface is just the interface that your "real" network
driver exports. In the case of the Kirkwood 6281 A0 Reference Design
(which I developed this code on), that would be eth0. After the DSA
driver is instantiated, you don't send or receive over eth0 directly

You get a virtual interface for each of the ports on the switch (that
are not CPU or inter-switch ports), i.e. all ports on the right of the
diagram -- wan, lan1, lan2, lan3, lan4. These interfaces are created

That's one of the purposes, yes. There's a polling routine that
periodically checks the status of each of the ports on the switch (via
the MII management interface) and feeds back that status to the virtual
interfaces. I.e. if you plug a cable into lan3, you'll see a syslog
message about the link on the virtual interface lan3 having come up,

First, you'll want the hardware bridging patches that I posted to
netdev@ a while back, e.g.:

	http://patchwork.ozlabs.org/patch/16578/

They aren't upstream-mergeable in their current form, but they do the
job. These will propagate brctl addif/delif calls into the switch chip,
so that switching between those ports will be done in hardware.

Now if all you want is regular STP, with that patch you'll be done --
the ->bridge_set_stp_state() hook propagates the spanning tree state of
each of the DSA virtual interfaces into the switch chip automatically.

If you want to use a userspace STP implementation, you'll just have to
make sure that STP state (listening/learning/blocking/forwarding/etc) is
correctly propagated to the switch chip similarly to how it's done in
the patch.

(Ideally, these patches should be reworked to receive bridge
configuration and port status changes via netlink. Unfortunately, I was
asked to return all my Marvell hardware when I left Marvell, so someone
else will have to do this work.)
--
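As a rough illustration of what "propagating STP state into the switch chip" amounts to, here is a minimal Python sketch. The bridge-side numbers follow the Linux BR_STATE_* constants; the chip-side encoding, the FakeSwitch class and the propagate_stp_state() helper are made up for illustration and are not taken from the patch or from any datasheet.

```python
from enum import IntEnum


class BridgeState(IntEnum):
    # Values as in the Linux bridge's BR_STATE_* constants.
    DISABLED = 0
    LISTENING = 1
    LEARNING = 2
    FORWARDING = 3
    BLOCKING = 4


class ChipPortState(IntEnum):
    # Hypothetical chip-side encoding, for illustration only.
    DISABLED = 0
    BLOCKING = 1
    LEARNING = 2
    FORWARDING = 3


# Assumption: the chip has no separate "listening" state, so it is
# treated the same as blocking.
STP_TO_CHIP = {
    BridgeState.DISABLED: ChipPortState.DISABLED,
    BridgeState.LISTENING: ChipPortState.BLOCKING,
    BridgeState.BLOCKING: ChipPortState.BLOCKING,
    BridgeState.LEARNING: ChipPortState.LEARNING,
    BridgeState.FORWARDING: ChipPortState.FORWARDING,
}


class FakeSwitch:
    """Stand-in for the switch chip: remembers the last state per port."""

    def __init__(self):
        self.port_state = {}

    def set_port_state(self, port, state):
        # A real driver would write a port register over the MII
        # management interface here.
        self.port_state[port] = state


def propagate_stp_state(switch, port, bridge_state):
    """Translate a bridge STP state and push it to the (fake) switch port."""
    chip_state = STP_TO_CHIP[BridgeState(bridge_state)]
    switch.set_port_state(port, chip_state)
    return chip_state
```

A userspace STP implementation would need an equivalent of propagate_stp_state() called on every state change of a DSA port.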
hmm, but how do I send normal pkgs from the CPU to the switch then?
I envision I would get some interface in the CPU I can set an IP address
on and use as a normal I/F which would be switched by the HW switch to

TX:ing pkgs on such virtual I/F would go directly to the port, bypassing
normal switching? What about RX? What decides which pkg to route through
the switch and

I see, will have to study this a bit closer. One question though, does
this disable MAC learning in the linux bridge?

Do you have any idea how to do DSA on a Broadcom switch? The control
plane is attached via PCI and has a big
--
Yes, these are the DSA/slave interfaces created by net/dsa/slave.c.
You are free to attach IP addresses to the wan/lanX interfaces, and

By default, that is, until you enable bridging on some subset of the
ports, all ports have their own address database, and all received
packets are passed directly up to the CPU, where the DSA code will

I have no idea. When I originally submitted the DSA code for merging,
I contacted Broadcom people about adding support for Broadcom switch
chips to it, but I never heard back from them.
--
An ethernet broadcast pkg flooded onto all ports. A normal ethernet
host DST address would be looked up by

ah, so until I enable bridging, each port is viewed as a separate
network I/F? Once I create a linux bridge device and add the virtual
I/Fs, one enables the bridge function.

One drawback with that is that you kill the bridge when you reboot

Doesn't the HW switch handle all MAC learning? Why duplicate this in the
SW bridge?

OK. With DSA, how does one configure VLANs, policing and parameters in
the HW switch that don't map or exist in the linux bridge?

Jocke
--
This statement assumes that all ports have been configured into a
bridge, which is not the default case. (And why would it be? Having each
port in the same VLAN/subnet is only one of the many possible ways of
configuring your switch ports -- and regular (non-DSA) Linux network
interfaces aren't bridged together by default either.) I.e. after boot,

In current upstream kernels, if you in fact bridge all switch ports
together using Linux bridging, this address lookup will be done by the

That the DSA interfaces will behave just like non-DSA Linux network

Yes. The original DSA commit message says as much:

	The switch driver presents each port on the switch as a separate

Yes and no. Right now there is no hardware switch offload code in the
upstream kernel, so all bridging will still be done in software. You
will need something along the lines of the patch I pointed you to to

With the hardware bridging patch, hardware bridging will continue if
you don't break down your br0 interface before rebooting. (Of course,
your board might still have a hardware reset line that resets the

Imagine the case where you bridge lan1, lan2 (both on the switch chip)
into br0, together with wlan0 (which is not on the switch chip).
Now a packet is sent out of br0. Should it be sent to wlan0 or to the
switch chip? How will you make this decision without an address database
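
To make the br0 example concrete, here is a minimal Python sketch of a learning address database for a bridge with lan1, lan2 and wlan0 as members. The class and method names are hypothetical and do not correspond to the kernel's bridge code; it only shows why the software bridge has to see source addresses before it can pick an egress port.

```python
class LearningBridge:
    """Toy software bridge: learns source MACs, forwards by destination MAC."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.fdb = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Handle a frame arriving on in_port; return the egress ports."""
        # Learning: remember which port the sender lives behind.
        self.fdb[src_mac] = in_port
        return self._egress(dst_mac, exclude=in_port)

    def send_local(self, dst_mac):
        """Handle a frame originated by the host on br0 itself."""
        return self._egress(dst_mac, exclude=None)

    def _egress(self, dst_mac, exclude):
        out = self.fdb.get(dst_mac)
        if out is None:
            # Unknown destination: flood to every port except the ingress one.
            return [p for p in self.ports if p != exclude]
        # Known destination: one port, or nothing if it is on the ingress side.
        return [] if out == exclude else [out]
```

Before any frame from a given station has been seen, send_local() has to flood to all three ports; once that station has sent a single frame in through wlan0, the same call returns only wlan0.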

The idea is to use existing kernel interfaces for this as much as
possible. So e.g. if you do:

	vconfig add lan1 123
	vconfig add lan2 123
	brctl addbr br123
	brctl addif br123 lan1.123
	brctl addif br123 lan2.123

Then the DSA code (or some userspace netlink listener helper, or some
combination of both) should ideally also detect that VLAN 123 on
interfaces lan1 and lan2 are to be bridged together, and program the
switch chip accordingly. I think all VLAN configurations that at least
the Marvell hardware supports can be expressed this way.
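
A minimal Python sketch of the detection step described above: given which interfaces are members of which bridge, work out which switch ports share a VLAN. The function name, the input format and the "<port>.<vid>" parsing are assumptions made for illustration; this is not how the DSA code or any existing helper does it.

```python
from collections import defaultdict


def vlan_groups(bridges, switch_ports):
    """Return {bridge: {vid: [switch ports]}} for VLAN sub-interfaces.

    `bridges` maps a bridge name to its member interface names.
    `switch_ports` is the set of interfaces that live on the switch chip.
    Sub-interfaces are assumed to be named "<port>.<vid>", e.g. lan1.123.
    """
    groups = {}
    for bridge, members in bridges.items():
        by_vid = defaultdict(set)
        for ifname in members:
            port, sep, vid = ifname.partition(".")
            # Only VLAN sub-interfaces of switch ports are of interest.
            if sep and vid.isdigit() and port in switch_ports:
                by_vid[int(vid)].add(port)
        if by_vid:
            groups[bridge] = {v: sorted(p) for v, p in by_vid.items()}
    return groups
```

For the br123 example this yields VLAN 123 with members lan1 and lan2, which is the information that would then have to be programmed into the switch chip.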

To configure things like ingress/egress rate limiting and ...
--
Yes, I am getting there mentally. I just have a hard time letting go of
viewing the HW switch as an external entity :)

hmm, one will have to recreate the exact config in several steps (create br0, add each

True, in this case you need it, but for only HW switch I/Fs you don't
need it and there can be several hundreds of MAC addresses passing
through the HW switch. It would be nice if one didn't need to pass
all those up to the SW bridge, especially if you have a small embedded

Yes, but I imagine that this breaks down when you want to do something a bit more
advanced. For example I don't think linux VLANs support "shared VLAN learning" (SVL)
and to configure a HW switch to do SVL one would first have to impl.
that in Linux VLAN and then add the DSA code to get the config to the switch.
Not sure how one would express whether VLAN tags should be stripped off or not when
egressing the HW switch's physical port.
Furthermore, suppose one has a big HW switch, 48 ports, and lots of VLANs in that
HW switch one would have to create a lot of virtual I/Fs and VLANs in linux

Yes, there are aspects of a HW switch that don't map into DSA currently.
Perhaps one should add some framework to support this?

Jocke
--
I think you overestimate the effect that address learning will have on
the host CPU. It only needs to happen for the first packet for every new
MAC address, and address flooding attacks are something you'll need to
address in either case.

If you're really worried about this scenario, then just configure your
boot loader to bridge all switch ports together, and don't load the DSA
driver. The switch will then appear as a single interface, 'eth0' (or
whatever your SoC calls it), over which you can talk directly without
any form of tagging. You won't be able to use any advanced features,

Yes. But that's really the best way to do it, in my humble opinion.
If you don't go the host networking stack integration route, you end up
with something like the vendor drivers. Which work fine for most
scenarios.. until you want to do something like talking TCP/IP using the
host TCP stack over some of the switch ports, at which point the

If you transmit a packet onto 'lan', it will be sent to the switch chip
with an "untagged" DSA tag. If you transmit a packet onto 'lan.123', it
will be sent to the switch chip with a "tagged" DSA tag. See

Where the 'resource waste' is on the order of a couple of tens or
hundreds of kilobytes of RAM. If this is a problem for your host

Sounds good.
--
I will buy that for the moment. I can't see a better way either if you
truly want to integrate a HW switch into linux. I just wish

Ah, now I get it, thanks. However, how does this work for LAN to LAN
pkgs? LAN1 and LAN2 could be in the same VLAN but one is implicit (port)
VLAN and the

That is not a very good argument, this is how bloat builds.

Any idea what such a framework should look like? What transport
mechanism is suitable to talk to a user space daemon?
--
Most people deal with this by running a userland STP daemon that uses
raw sockets to inject manually (i.e. in userspace) DSA-tagged packets
onto the eth0 (or whatever) interface. This "works" (for some
definitions of 'works') for UDP apps such as a DHCP server as well --

If you tell the HW switch to forward these packets, they will never
appear at the CPU interface, so the DSA tagging/untagging doesn't enter

Tell the switch that the vlan is native on one of the ports but not on
the other. It's been a while since I looked at the chip docs but there

If you have a better way of getting all the features while spending
fewer resources, please step forward with your ideas. The current design
is the best I could come up with, but I'm sure it's not optimal in its

Have a look at netlink.
--
"tell the HW switch"? Doesn't DSA do that already? If not, what is the
point of DSA then if it doesn't use the native forwarding

The current DSA impl. does not support this? There should be some

I don't, I am not that familiar with the inner workings of Linux

I was afraid you would say that, I have no experience with netlink :)
--
The point is and always was to provide a framework for proper
integration of hardware switch chips into the Linux kernel. This
framework doesn't become useless just because it doesn't already support
every single

Have you even tried the DSA code?
--
Right, sorry if I sounded a bit harsh. So DSA currently does a very
minimal config of the HW switch to get things going. If you want to do
something more fancy one has to add a control plane to DSA which would
possibly talk

Not yet and I don't have any MV HW either :(
--
Yes and no -- yes in the sense that if you want to use more
functionality of the switch chip, you'll have to add some code that
extracts that info from the Linux network interface config and turns it
into commands for the switch chip, and no in the sense that I'm not sure
yet what the best way to implement this would be. (Doing it all in
userspace is one option.)
--
