Hi,
Today something strange happened on one of my Soekris 5501 boxes,
it runs OpenBSD 4.1-stable. The box is connected with a cross-over cable
to another machine via the vr1 interface (the box has 4 vr interfaces).

Problem: After having rebooted the machine at the other end of the cable
multiple times, the Soekris box suddenly stopped receiving packets on
the vr1 interface.

After playing around with ping and tcpdump on both sides I found out
that the vr1 interface allowed me to send packets, but incoming packets
did not show up in tcpdump, even though the LED on the interface was
flickering.

I changed the cable, connected the vr1 of the Soekris to another
machine, then to a switch port with lots of broadcast traffic, etc.;
nothing helped, ingress traffic on vr1 did not show up in tcpdump.

Solution: Finally, immediately after doing "ifconfig vr1 down &&
ifconfig vr1 up" everything worked again as normal.

The link on vr1 is currently only used to do SSH between the two
machines, so this is really a low traffic link. On the other hand,
vr0,vr2,vr3 are heavily used (BGP sessions etc., the Soekris is at the
border of my AS). btw: the machine at the other end of the cable is a
Soekris 5501 as well. Had to reboot it to perform a BIOS upgrade.

No suspicious output in dmesg.
Any ideas on how I could further track down the problem?
Thanks,
Christian

The vr interfaces in dmesg:
vr0 at pci0 dev 6 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 11,
address 00:00:24:c8:de:68
ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI
0x004063, model 0x0034
vr1 at pci0 dev 7 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 5,
address 00:00:24:c8:de:69
ukphy1 at vr1 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI
0x004063, model 0x0034
vr2 at pci0 dev 8 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 9,
address 00:00:24:c8:de:6a
ukphy2 at vr2 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI
0x004063, model 0x0034
vr3...
Not sure if related, but something similar has been fixed in
4.2-current already.

http://lists.freebsd.org/pipermail/freebsd-current/2007-August/076486.html
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/if_vr.c#rev1.70

I think you should be able to safely apply the 1.69 to 1.70 diff to
your source tree if this is a concern. The diff is for vr_attach()
only, so if your system is already up and running and you never reboot
it, then you probably shouldn't bother until your next upgrade.

Cheers,
Constantine.
This was also the first thing that came into my mind, however, I don't
think it is related. VR_STICKHW is only written erroneously during
attach, and since my machine has now been running for several weeks
without any problem, I doubt that the observed stall has something to
do with this.

Opinions from the vr maintainers? Anything I can do to debug the problem
when it occurs next time?
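
One thing that could be tried in the meantime, as a rough sketch only: run a small check from cron that pings the peer, compares the vr1 packet counters before and after, and saves some state before applying the down/up workaround. The peer address (192.0.2.1), the log path, and the function names below are made up for illustration. The awk parsing assumes that `netstat -n -I vr1` prints a `<Link>` row with Ipkts in column 5 and Opkts in column 7, which should be verified on the box first.

```shell
#!/bin/sh
# Sketch: detect "TX works, RX dead" on an interface and capture state
# before bouncing it. IF, PEER and the log path are placeholders.

IF=${IF:-vr1}
PEER=${PEER:-192.0.2.1}   # hypothetical address of the box on the other end

# Read `netstat -n -I <if>` output on stdin and print "Ipkts Opkts".
# Assumes the <Link> row has Ipkts in column 5 and Opkts in column 7.
counters() {
    awk '$3 ~ /^<Link>/ { print $5, $7; exit }'
}

# Args: in_before out_before in_after out_after.
# Succeeds if packets went out but none came in between the samples.
rx_stalled() {
    [ "$4" -gt "$2" ] && [ "$3" -eq "$1" ]
}

# Save some diagnostics, then apply the down/up workaround.
capture_and_bounce() {
    log=/var/tmp/$IF-stall.$(date +%Y%m%d%H%M%S)
    {
        ifconfig "$IF"
        netstat -n -I "$IF"
        vmstat -i
        dmesg | tail -n 50
    } > "$log" 2>&1
    ifconfig "$IF" down && ifconfig "$IF" up
}

# Sample counters, generate traffic towards the peer, sample again.
check_once() {
    before=$(netstat -n -I "$IF" | counters)
    ping -c 3 "$PEER" > /dev/null 2>&1
    after=$(netstat -n -I "$IF" | counters)
    # $before and $after are left unquoted on purpose: two words each.
    if rx_stalled $before $after; then
        capture_and_bounce
    fi
}

# Only act when asked to, so the functions can be sourced safely.
if [ "${1:-}" = "check" ]; then
    check_once
fi
```

Called as `sh vrcheck.sh check` (the file name is arbitrary), it does nothing unless the output counter moved while the input counter stayed put, so an idle link alone should not trigger a bounce. The saved files would at least show the interface flags and interrupt counts at the time of the stall.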
For what it's worth, I experienced the same problem caused by attaching and
detaching a (short) crossover cable multiple times on a vr interface in
soekris net5501 running 4.1-stable. As it was on a production firewall I
didn't troubleshoot much, tcpdump didn't show any incoming traffic on that
interface - then I went for a quick reboot that obviously fixed things. Let
me see if I can replicate it in the lab.

Mitja
for the record, i have a via rhine2 and i never had trouble. (if they are not
related i'm sorry to bother you, but it might help debugging)

vr0 at pci0 dev 18 function 0 "VIA RhineII-2" rev 0x51: irq 10, address
00:40:63:c9:5c:05

this is the one on the via c3 533MHz board.

alwin
