It's weird. Hitachi's app note TN-PSC-337B/E, dated Dec 10, 1998, shows
example lot codes for unfixed chips ("8A3") and fixed ones ("9A3 R"). I
don't really remember the details, but I think the first digit is the
last digit of the year (8 = 1998, 9 = 1999, 0 = 2000) and the last digit
is the quarter, so 0M1 would mean Q1 2000. I personally have (different)
cards with chips marked "9H1 R" and "0C1 R". I remember a prototype card
with something like a 7** lot code (faulty, without the "R"), though I
can't look up the code anymore. I'd never expect a faulty chip dated
2000.
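To make the guess about the lot codes concrete, here is a small sketch of a decoder. It only encodes the reading described above (first character = year digit, third character = quarter, optional " R" suffix = fixed chip), which is itself a recollection and not something taken from the app note. The names lot_info and decode_lot are made up for illustration, and the rule that digits 7..9 mean 199x while 0..6 mean 200x is purely an assumption chosen to fit the codes mentioned in this mail.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

struct lot_info {
	int year;	/* full year, guessed from a single digit */
	int quarter;	/* 1..4 */
	bool fixed;	/* true if the code carries the " R" suffix */
};

/*
 * Hypothetical decoder for codes such as "8A3" or "9A3 R".
 * Returns false if the string does not fit the guessed layout.
 */
static bool decode_lot(const char *code, struct lot_info *out)
{
	int y, q;

	assert(code != NULL && out != NULL);

	if (strlen(code) < 3 ||
	    !isdigit((unsigned char)code[0]) ||
	    !isdigit((unsigned char)code[2]))
		return false;

	y = code[0] - '0';
	q = code[2] - '0';
	if (q < 1 || q > 4)
		return false;

	/* ASSUMPTION: 7..9 -> 1997..1999, 0..6 -> 2000..2006 */
	out->year = (y >= 7 ? 1990 : 2000) + y;
	out->quarter = q;
	/* The meaning of the middle letter is unknown, so it is ignored. */
	out->fixed = strcmp(code + 3, " R") == 0;
	return true;
}
```

Under these assumptions "9A3 R" decodes to Q3 1999 (fixed) and "0M1" to Q1 2000 (no "R").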
BTW their (now Renesas) errata is at
http://www.renesas.eu/products/assp/for_information_and_communication_equipment/com_co...
(I have the datasheet / prog manual as well). TN-PSC-337B/E seems to
indicate that the bug is present in chips made till March 31, 1999.
Your card has an "SFL33" chip while my cards have "FL33". I have a card
with "SFL33" too, but it's dated 2005 and it's a newer chip, missing the
"R" and the Hitachi logo because of the Hitachi -> Renesas transition. I
don't know what the "S" means. The datasheet (1998) only lists "FL33"
(25 Mb/s max transfer rate) and "AFL33" (30 Mb/s).
Actually, the SCA-II never clears EOM. sca_tx_done() does, after it sees
the "ownership" bit set by SCA-II. Then it does netif_wake_queue().
It seems it happens this way:
- sca_xmit() fills the whole ring (leaving one descriptor empty, as
  designed, for EDA to work)
- the chip transmits something and raises an IRQ -> sca_tx_done()
- sca_tx_done() can't see any processed descriptor and only wakes the
  queue. Perhaps we should only wake the queue if at least one
  descriptor has been processed, though sca_tx_done() should never be
  called otherwise.
- sca_xmit() is called again with a full ring, thus BUG().
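The sequence above can be reproduced with a rough userspace model. This is a sketch, not the driver code: struct ring, model_xmit(), model_tx_done() and early_irq_is_safe() are names invented here, the ring size and the ST_EOM / ST_OWN values are arbitrary stand-ins, and the "stopped" flag merely imitates netif_stop_queue() / netif_wake_queue(). The "guarded" argument selects the variant that wakes the queue only if at least one descriptor has been processed.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4	/* arbitrary; stands in for card->tx_ring_buffers */
#define ST_EOM 0x80	/* model value: descriptor queued by xmit */
#define ST_OWN 0x01	/* model value: chip is done with the descriptor */

struct ring {
	unsigned char stat[RING_SIZE];
	unsigned txin;		/* next descriptor xmit will fill */
	unsigned txlast;	/* oldest descriptor not yet freed */
	bool stopped;		/* imitates the stopped/woken TX queue */
};

/* Models sca_xmit(); returns false where the driver would hit BUG(). */
static bool model_xmit(struct ring *r)
{
	if (r->stat[(r->txin + 1) % RING_SIZE])
		return false;		/* called with a full ring */
	r->stat[r->txin] = ST_EOM;
	r->txin = (r->txin + 1) % RING_SIZE;
	if (r->stat[(r->txin + 1) % RING_SIZE])
		r->stopped = true;	/* keep one descriptor empty */
	return true;
}

/* Models sca_tx_done(); returns the number of descriptors freed. */
static unsigned model_tx_done(struct ring *r, bool guarded)
{
	unsigned freed = 0;

	while (r->stat[r->txlast] & ST_OWN) {
		r->stat[r->txlast] = 0;	/* free descriptor */
		r->txlast = (r->txlast + 1) % RING_SIZE;
		freed++;
	}
	if (!guarded || freed)
		r->stopped = false;	/* wake the queue */
	return freed;
}

/*
 * Fill the ring, deliver an IRQ before any descriptor carries ST_OWN,
 * then let the stack transmit if the queue is awake.
 * Returns false if xmit was entered with a full ring.
 */
static bool early_irq_is_safe(bool guarded)
{
	struct ring r = { 0 };

	while (!r.stopped) {
		bool ok = model_xmit(&r);

		assert(ok);
		(void)ok;
	}
	model_tx_done(&r, guarded);	/* nothing to free yet */
	if (!r.stopped)
		return model_xmit(&r);	/* stack retries on a full ring */
	return true;			/* queue still stopped, no xmit */
}
```

In this model early_irq_is_safe(false) returns false (the unconditional wake lets xmit run on a full ring), while early_irq_is_safe(true) returns true because the queue stays stopped until a later completion actually frees a descriptor.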
I wonder if the following helps (untested):
--- a/drivers/net/wan/hd64572.c
+++ b/drivers/net/wan/hd64572.c
@@ -293,6 +293,7 @@ static inline void sca_tx_done(port_t *port)
 	struct net_device *dev = port->netdev;
 	card_t* card = port->card;
 	u8 stat;
+	int wake = 0;

 	spin_lock(&port->lock);

@@ -316,10 +317,12 @@ static inline void sca_tx_done(port_t *port)
 			dev->stats.tx_bytes += readw(&desc->len);
 		}
 		writeb(0, &desc->stat); /* Free descriptor */
+		wake = 1;
 		port->txlast = (port->txlast + 1) % card->tx_ring_buffers;
 	}

-	netif_wake_queue(dev);
+	if (wake)
+		netif_wake_queue(dev);
 	spin_unlock(&port->lock);
 }

Perhaps the chip sets the bit in the ISR0 register before ST_TX_OWNRSHP
is written to device RAM. With this patch sca_tx_done() should be called
again shortly, in the worst case after the next packet is transmitted.

This could send corrupted data; we don't want to overwrite buffers being
transmitted (or queued for TX).
Anyway, I think it has nothing to do with the "non-R" bug. That one
corrupts the CDA register, rendering any ring operation impossible and
probably corrupting system RAM (in my experience a single channel at
up to 2 Mb/s doesn't trigger it, while two channels trigger it several
times a day). IOW, trying to use two channels with a buggy chip is
pointless. OTOH I'm not sure your chip is buggy; perhaps SFL33 chips
were always fixed and thus not marked with "R"?
--
Krzysztof Halasa