Re: sk98lin for 2.6.23-rc1

To: Adrian Bunk <bunk@...>
Cc: Bill Davidsen <davidsen@...>, Stephen Hemminger <shemminger@...>, Kyle Rose <krose@...>, James Corey <ploversegg@...>, Rob Sims <lkml-z@...>, <linux-kernel@...>, Jeff Garzik <jeff@...>, <netdev@...>
Date: Tuesday, September 11, 2007 - 6:37 pm

On Tue, Sep 11, 2007 at 05:03:57PM +0200, Adrian Bunk wrote:

Not only that. You have to place the switch in its historical context.
Stephen, please correct me if I'm wrong, but sk98lin has been randomly
working for a very long time. That is not 100% the driver's fault, because
it has had to work around a lot of chip bugs. The fact that this driver
supports *all* chips in the family makes it harder to identify whether
problems are caused by the hardware or by the driver, because it is
bloated with tons of if/else.

I've personally encountered random data corruption on the receive path
with PCI-E hardware with sk98lin, as well as random TX stops. Sometimes
it would take one terabyte of data, sometimes just a few hundred
megs. On other hardware (skge now), UDP would simply stop being sent
and some TCP traffic was necessary to restart UDP! One guy at Marvell
once asked me for more information, but it was not easy to provide
much more, given the randomness of the problems!

Stephen has done an excellent (and thankless) job at restarting from
scratch, and the idea to separate the two chips was a good one IMHO.
The problem is that he might have thought that most of the bugs were
in the driver, while most of them are in the hardware, and this requires
a lot of workarounds, which do not always work the same for everybody
(I remember having tried to disable flow control with sk98lin because
it helped with sky2).

In parallel, sk98lin has improved on the vendor's site. v8 exhibited
all the problems I explained above, but v10 has fixed a lot of them,
making the new sk98lin more reliable. Meanwhile, sky2 and skge have
gained wider acceptance and testing. The nastiest hardware bugs will
slowly surface, and a good number of driver bugs have been detected too
(and that's expected from any new driver).

It is possible that after 2 or 3 patches, a lot of the remaining
problems will suddenly vanish. But it's also possible that the driver
will still not work for 1% of people for 1 or 2 years because of some
obscure hardware combinations which trigger some obscure hardware bugs.


Desktop users generally have no problem experimenting with multiple kernels
or drivers. They can report feedback too, but they're usually not very
good at downloading alternative drivers and patching their kernel with those.

Server users cannot experiment for a long time. After 2 or 3 losses of
service, they *have* to provide a definitive solution. For some of them,
when sky2 fails, that may very well mean switching over to sk98lin.
Downloading from the vendor's site and patching is not a problem for those
users, but it makes updating the kernel for security fixes troublesome, so
the old driver must be shipped with the kernel.

However, I remember something which might constitute a solution. In 2.4,
there's a small bug in the kbuild process on alpha. One question is always
asked during make oldconfig. Its saved value is ignored because of the way
it is computed. I don't know if we could do this with 2.6 kbuild. It would
then be nice to always reset sk98lin to unset if it was set to "Y" or "M",
so that at each build, the user has to explicitly state he wants it. That is
annoying enough to make people give the other driver a try once in a while,
without causing too much trouble to people who really have no other choice
right now.
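
For illustration only, here is a minimal sketch of that "forget the saved
answer" idea, done outside kbuild as a pre-processing step on a saved
.config. It assumes the option is called CONFIG_SK98LIN and relies on
make oldconfig prompting for any symbol that has no entry at all in the
file. The helper name forget_option is made up for this example, and this
is not something kbuild does by itself.

```python
import re


def forget_option(config_text: str, symbol: str = "CONFIG_SK98LIN") -> str:
    """Drop any 'SYMBOL=y' or 'SYMBOL=m' line from a .config text.

    A symbol with no entry at all looks new to 'make oldconfig', so the
    question gets asked again. A '# SYMBOL is not set' line is kept on
    purpose: people who already said no are not bothered.
    """
    # Assumption: the symbol name is CONFIG_SK98LIN; adjust if it differs.
    pattern = re.compile(r"^" + re.escape(symbol) + r"=[ym]\s*$")
    kept = [line for line in config_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = "CONFIG_SKGE=m\nCONFIG_SK98LIN=m\nCONFIG_SKY2=m\n"
    print(forget_option(sample), end="")
```

Running such a filter on .config before make oldconfig would make the
sk98lin question come back at every build, while every other saved answer
is left alone.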

What we need with this driver is people being fed up with it, not them
being unable to use it as a last resort. Also, given that it has improved
over the last years (probably due to competition pressure from sky2/skge),
users will understand even less why there is such an incentive to remove it.

Another trick for obsolete drivers would be to simply remove them from
the usual build system, but keep them available for explicit build.
E.g.: make modules will not build them, but make obsolete-modules would.


No system config should be edited to switch back to the alternative,
otherwise it remains in its working state.


Desktop users are curious and have plenty of time to kill. Server users
are frightened and lazy. So I think that annoying the user slightly is
a good solution (eg: make obsolete-modules).


After having been happy with eepro100 for years, I discovered many problems
with its VLAN support in 2.4 (MTU, ...) for which e100 was a solution. It
was a good reason to switch. But the old e100 driver took ages to load (half
of the machine boot time), which was not satisfying. So having a new driver
load faster is another good reason to switch.


Hmmm we already read this paragraph above :-)


... and as such are both smaller than sk98lin which supports both.

Cheers,
Willy
