Re: Gianfar driver failing on MPC8641D based board

Previous thread: [PATCH] obsolete config in kernel source (HSO_AUTOPM) by Christoph Egger on Friday, February 5, 2010 - 6:39 am. (1 message)

Next thread: Re: [PATCH] obsolete config in kernel source (HSO_AUTOPM) by Paulius Zaleckas on Friday, February 5, 2010 - 7:38 am. (2 messages)
From: Martyn Welch
Date: Friday, February 5, 2010 - 7:00 am

I have recently attempted to boot an 8641D based board from an NFS root.
The boot process grinds to a halt not long after the first access of the
NFS root and I receive multiple "nfs: server 192.168.0.1 not responding,
still trying" messages. Wireshark suggests that there is no further
traffic from this board at this point on. The NFS server seems to
eventually try sending duplicate packets it's already sent, which
results in "nfs: server 192.168.0.1 OK" messages, but the "not
responding" messages resume with no further traffic from the board.

I am able to boot to a ramdisk fine and the network seems to work -
though I haven't really pushed the interface from it.

I have attempted to git bisect, though I wasn't able to get much further
than discovering the problem was introduced in the 2.6.33 merge window -
at which point the gianfar network driver fails to compile (I have tried
to git bisect skip many, many times to no avail).

NFS booting fails for this board on todays linux-next, the master branch
of Kumar's PPC tree and the head of the main tree. I have also been able
to NFS boot from a random x86 based board that I have, using the head of
the main tree and the linux-next tree.

Copying the gianfar drivers from 2.6.32 into the head of the main tree
restores the correct behaviour and I'm able to NFS boot. I have heard
from others that the latest drivers work on 83xx and 85xx based boards,
but it seems to be broken on at least the 8641D.

I can see there has been a fair amount of work done on the gianfar
driver, I assume that this is a bug introduced by the multiple queue
support, but I'm way out of my depth on this.

I'm also off for the next week - so if I'm quiet, it'll be because of that.

Martyn

-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                    ...
From: Martyn Welch
Date: Thursday, February 25, 2010 - 3:31 am

I have just compiled 2.6.33 for the Freescale MPC8641_HPCN demo board
and am having still experiencing the problems outlined in my previous
email, though I have noticed that I tend to be able to boot from cold,
but my boot fails on reboot. Hitting the reset button doesn't help, I
need to actually power the machine on and off again for it to work.

As before, I'm way out of my depth in this, any one have any ideas?
Below is a dump of the failed boot process:

U-Boot 2009.01-00181-gc1b7c70 (Jan 30 2009 - 11:17:31)

Freescale PowerPC
CPU:
    Core: E600 Core 0, Version: 0.2, (0x80040202)
    System: Unknown, Version: 2.0, (0x80900120)
    Clocks: CPU:1000 MHz, MPX: 400 MHz, DDR: 200 MHz, LBC:  25 MHz
    L2: Enabled
Board: MPC8641HPCN, System ID: 0x10, System Version: 0x10, FPGA Version:
0x22
I2C:   ready
DRAM:      DDR:  1 GB
FLASH:  8 MB
Invalid ID (ff ff ff ff)
               Scanning PCI bus 01
    PCI-EXPRESS 1 on bus 00 - 02
    PCI-EXPRESS 2 on bus 03 - 03
Video: No radeon video card found!
In:    serial
Out:   serial
Err:   serial
SCSI:  AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl IDE mode
flags: ncq ilck pm led clo pmp pio slum part
scanning bus for devices...
Net:   eTSEC1, eTSEC2, eTSEC3, eTSEC4
=>  tftp 4000000 hpcn/uImage-torvalds-linux-2.6
Speed: 1000, full duplex
Using eTSEC1 device
TFTP from server 192.168.0.1; our IP address is 192.168.0.30
Filename 'hpcn/uImage-torvalds-linux-2.6'.
Load address: 0x4000000
Loading: #################################################################
         #################################################################
         #######################################################
done
Bytes transferred = 2709050 (29563a hex)
=> tftp 5000000 hpcn/mpc8641_hpcn-torvalds-linux-2.6.dtb
Speed: 1000, full duplex
Using eTSEC1 device
TFTP from server 192.168.0.1; our IP address is 192.168.0.30
Filename 'hpcn/mpc8641_hpcn-torvalds-linux-2.6.dtb'.
Load address: 0x5000000
Loading: #
done
Bytes transferred = 11523 ...
From: Martyn Welch
Date: Thursday, February 25, 2010 - 9:46 am

Further testing has shown that this isn't restricted to warm reboots, it
happens from cold as well. In addition, the exact timing of the failure
seems to vary, some boots have got further before failing.

Martyn

-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                        |   M2 3AB  VAT:GB 927559189

--

From: Anton Vorontsov
Date: Thursday, February 25, 2010 - 9:51 am

On Thu, Feb 25, 2010 at 04:46:54PM +0000, Martyn Welch wrote:

Unfortunately I don't have any 8641 boards near me, so I can't
debug this myself. Though, I tested gianfar on MPC8568E-MDS with
2.6.33 kernel, and it seems to work just fine.

I see you use SMP. Can you try to turn it off? If that will fix
the issue, then it'll be a good data point.

Meanwhile, I'll try SMP kernel on MPC8568 (UP), and let you
know the results.

Thanks,

-- 
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2
--

From: Anton Vorontsov
Date: Thursday, February 25, 2010 - 10:49 am

Nope, no luck. Can't trigger the issue. :-/
Tested with NFS boot, TCP and UDP netperf tests.

-- 
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2
--

From: Paul Gortmaker
Date: Thursday, February 25, 2010 - 5:53 pm

On Thu, Feb 25, 2010 at 12:49 PM, Anton Vorontsov

I was able to reproduce it on an 8641D and bisected it down to this:

-----------
commit a3bc1f11e9b867a4f49505ecac486a33af248b2e
Author: Anton Vorontsov <avorontsov@ru.mvista.com>
Date:   Tue Nov 10 14:11:10 2009 +0000

    gianfar: Revive SKB recycling

    Before calling gfar_clean_tx_ring() the driver grabs an irqsave
    spinlock, and then tries to recycle skbs. But since
    skb_recycle_check() returns 0 with IRQs disabled, we'll never
    recycle any skbs.

    It appears that gfar_clean_tx_ring() and gfar_start_xmit() are
    mostly idependent and can work in parallel, except when they
    modify num_txbdfree.

    So we can drop the lock from most sections and thus fix the skb
    recycling.
-----------

...which probably explains why you weren't seeing it on non-SMP.
I'd imagine it would show up on any of the e500mc boards too.

I'd done a rev-list on gianfar.[ch] from 32 to 33-rc1, and then
cherry-picked those onto a 32 baseline to reduce the scale of
the bisection, but I don't think that should impact the final
result I got in any meaningful way.

Paul.
--

From: Anton Vorontsov
Date: Thursday, February 25, 2010 - 8:14 pm

On Thu, Feb 25, 2010 at 07:53:30PM -0500, Paul Gortmaker wrote:

Thanks for the bisect. I have a guess why tx hangs in
SMP case. Could anyone try the patch down below?


Yeah.. Pity, I don't have SMP boards anymore. I'll try
to get one though.


diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 8bd3c9f..3ff3bd0 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2614,6 +2614,8 @@ static int gfar_poll(struct napi_struct *napi, int budget)
 			tx_queue = priv->tx_queue[rx_queue->qindex];
 
 			tx_cleaned += gfar_clean_tx_ring(tx_queue);
+			if (!tx_cleaned && !tx_queue->num_txbdfree)
+				tx_cleaned += 1; /* don't complete napi */
 			rx_cleaned_per_queue = gfar_clean_rx_ring(rx_queue,
 							budget_per_queue);
 			rx_cleaned += rx_cleaned_per_queue;
--

From: Kumar Gopalpet-B05799
Date: Thursday, February 25, 2010 - 9:58 pm

Anton, 

There is also one more issue that I have been observing with the patch
"gianfar: Revive SKB recycling".
The issue is when I do a IPV4 forwarding test scenario with
bidirectional flows (SMP environment). I am using Spirent smart bits
(smartflow) for automation testing and I frequently observe smart flow
reporting "Rx packet counte greater than Tx packet count. Duplicate
packets might have been received".

To just get over the issue I have removed this patch and I didn't see
the issue.

To a certain extent I could get over the problem by using atomic_t for
num_txbdfree (atomic_add and atomic_dec instructions for updating the
num_txbdfree) and completely removing the spin_locks in the tx routines.

Also, I feel we might want to make some more changes to the
gfar_clean_tx_ring( ) and gfar_start_xmit() routines so that they can
operate parallely. 

I am really sorry for not posting it a bit earlier as I am caught up
with some urgent issues.

--

Thanks
Sandeep
--

From: Martyn Welch
Date: Friday, February 26, 2010 - 5:06 am

-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                        |   M2 3AB  VAT:GB 927559189

--

From: Anton Vorontsov
Date: Friday, February 26, 2010 - 7:35 am

Hm.. I found a p2020 board and I was able to reproduce the issue.
The patch down below fixed it completely for me... hm.


-- 
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2
--

From: Paul Gortmaker
Date: Friday, February 26, 2010 - 7:52 am

Interesting. I just tested the patch on the sbc8641d, and it
still has the issue with your patch applied.  I'm using NFSroot
just like Martyn was and it still appears bound up on that
gianfar tx lock.  I'll see if I can get a SysRq backtrace in
case that will help you see how it manages to get there...

Paul.

----

nfs: server not responding, still trying 

[repeated ~15 times, then...]
                      
INFO: task rc.sysinit:837 blocked for more than 120 seconds.                    
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.       
rc.sysinit    D 0fef73f4     0   837    836 0x00000000                          
Call Trace:                                                                     
[dfb7d9b0] [c000a144] __switch_to+0x8c/0xf8                                     
[dfb7d9d0] [c03443dc] schedule+0x380/0x954                                      
[dfb7da50] [c0344a0c] io_schedule+0x5c/0x90                                     
[dfb7da70] [c0074b0c] sync_page+0x4c/0x74                                       
[dfb7da80] [c0344f44] __wait_on_bit_lock+0xb0/0x148                             
[dfb7dab0] [c0074a8c] __lock_page+0x94/0xa4                                     
[dfb7dae0] [c0074d5c] find_lock_page+0x8c/0xa4                                  
[dfb7db00] [c0075674] filemap_fault+0x1ec/0x4fc                                 
[dfb7db40] [c008d548] __do_fault+0x98/0x53c                                     
[dfb7dba0] [c0018478] do_page_fault+0x2d0/0x500                                 
[dfb7dc50] [c00149d4] handle_page_fault+0xc/0x80                                
--- Exception: 301 at __clear_user+0x14/0x7c                                    
    LR = load_elf_binary+0x670/0x1270                                           
[dfb7dd10] [c00f6ca0] load_elf_binary+0x620/0x1270 (unreliable)                 
[dfb7dd90] [c00b1f78] search_binary_handler+0x17c/0x394                         
[dfb7dde0] [c00f4f50] load_script+0x274/0x288            ...
From: Martyn Welch
Date: Friday, February 26, 2010 - 8:18 am

I've got a p2020ds here as well, so I'll give NFSroot on that a try with
your patch.

Martyn



-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                        |   M2 3AB  VAT:GB 927559189

--

From: Martyn Welch
Date: Friday, February 26, 2010 - 8:34 am

Out of 10 boot attempts, 7 failed.

Martyn

-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                        |   M2 3AB  VAT:GB 927559189

--

From: Anton Vorontsov
Date: Friday, February 26, 2010 - 9:10 am

On Fri, Feb 26, 2010 at 03:34:07PM +0000, Martyn Welch wrote:

OK, I see why. With ip=on (dhcp boot) it's much harder to trigger
it. With static ip config can I see the same.
--

From: Paul Gortmaker
Date: Friday, February 26, 2010 - 9:27 am

I'd kind of expected to see us stuck in gianfar on that lock, but
the SysRQ-T doesn't show us hung up anywhere in gianfar itself.
[This was on a base 2.6.33, with just a small sysrq fix patch]

Paul.

----------

SysRq : Changing Loglevel                                            
Loglevel set to 9                                                               
nfs: server not responding, still trying                          
SysRq : Show State                                                              
  task                PC stack   pid father                                     
init          D 0ff1c380     0     1      0 0x00000000                          
Call Trace:                                                                     
[df841a30] [c0009fc4] __switch_to+0x8c/0xf8                                     
[df841a50] [c0350160] schedule+0x354/0x92c                                      
[df841ae0] [c0331394] rpc_wait_bit_killable+0x2c/0x54                           
[df841af0] [c0350eb0] __wait_on_bit+0x9c/0x108                                  
[df841b10] [c0350fc0] out_of_line_wait_on_bit+0xa4/0xb4                         
[df841b40] [c0331cf0] __rpc_execute+0x16c/0x398                                 
[df841b90] [c0329abc] rpc_run_task+0x48/0x9c                                    
[df841ba0] [c0329c40] rpc_call_sync+0x54/0x88                                   
[df841bd0] [c015e780] nfs_proc_lookup+0x94/0xe8                                 
[df841c20] [c014eb60] nfs_lookup+0x12c/0x230                                    
[df841d50] [c00b9680] do_lookup+0x118/0x288                                     
[df841d80] [c00bb904] link_path_walk+0x194/0x1118                               
[df841df0] [c00bcb08] path_walk+0x8c/0x168                                      
[df841e20] [c00bcd6c] do_path_lookup+0x74/0x7c                                  
[df841e40] [c00be148] do_filp_open+0x5d4/0xba4                                  
[df841f10] [c00abe94] ...
From: Anton Vorontsov
Date: Friday, February 26, 2010 - 2:38 pm

Yeah, I don't think this is gianfar-related. It must be something
else triggered by the fact that gianfar no longer sends stuff.

OK, I think I found what's happening in gianfar.

Some background...

start_xmit() prepares new skb for transmitting, generally it does
three things:

1. sets up all BDs (marks them ready to send), except the first one.
2. stores skb into tx_queue->tx_skbuff so that clean_tx_ring()
   would cleanup it later.
3. sets up the first BD, i.e. marks it ready.

Here is what clean_tx_ring() does:

1. reads skbs from tx_queue->tx_skbuff
2. Checks if the *last* BD is ready. If it's still ready [to send]
   then it it isn't transmitted, so clean_tx_ring() returns.
   Otherwise it actually cleanups BDs. All is OK.

Now, if there is just one BD, code flow:

- start_xmit(): stores skb into tx_skbuff. Note that the first BD
  (which is also the last one) isn't marked as ready, yet.
- clean_tx_ring(): sees that skb is not null, *and* its lstatus
  says that it is NOT ready (like if BD was sent), so it cleans
  it up (bad!)
- start_xmit(): marks BD as ready [to send], but it's too late.

We can fix this simply by reordering lstatus/tx_skbuff writes.

It works flawlessly on my p2020, please try it.

Thanks!


diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 8bd3c9f..cccb409 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2021,7 +2021,6 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	}
 
 	/* setup the TxBD length and buffer pointer for the first BD */
-	tx_queue->tx_skbuff[tx_queue->skb_curtx] = skb;
 	txbdp_start->bufPtr = dma_map_single(&priv->ofdev->dev, skb->data,
 			skb_headlen(skb), DMA_TO_DEVICE);
 
@@ -2053,6 +2052,10 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	txbdp_start->lstatus = lstatus;
 
+	eieio(); /* force lstatus write before tx_skbuff */
+
+	tx_queue->tx_skbuff[tx_queue->skb_curtx] = skb;
+
 	/* Update the current skb ...
From: Paul Gortmaker
Date: Friday, February 26, 2010 - 3:12 pm

I've skipped right to the test part (I'll think about the description
more later) and it passed 5 out of 5 boot tests on NFSroot sbc8641d.
Looks like you've got a solution.


--

From: Kumar Gopalpet-B05799
Date: Friday, February 26, 2010 - 10:35 pm

Anton,

Understood, and thanks for the explanation. Am I correct in saying that
this is
due to the out-of-order execution capability on powerpc ?

I have one more question, why don't we use use atomic_t for num_txbdfree
and
completely  do away with spin_locks in gfar_clean_tx_ring() and
gfar_start_xmit().
In an non-SMP, scenario I would feel there is absolutely no requirement
of spin_locks
and in case of SMP atomic operation would be much more safer on powerpc
rather than spin_locks.

What is your suggestion ?


--

Thanks
--

From: Anton Vorontsov
Date: Tuesday, March 2, 2010 - 7:02 am

Hi!

On Sat, Feb 27, 2010 at 11:05:32AM +0530, Kumar Gopalpet-B05799 wrote:

Nope, that was just a logic issue in the driver. 

Though, with the patch, the eieio() is needed so that compiler (or CPU)

I think that's a good idea.

However, in start_xmit() we'll have to keep the spinlock anyway
since it also protects from gfar_error(), which can modify
regs->tstat.

Thanks!

-- 
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2
--

From: Martyn Welch
Date: Monday, March 1, 2010 - 6:07 am

I can confirm 10/10 successful boots on p2020ds and mpc8641_hpcn.

Martyn


-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                        |   M2 3AB  VAT:GB 927559189

--

From: Martyn Welch
Date: Friday, February 26, 2010 - 4:51 am

I removed the second core from the dts file rather than truly disabling
SMP in the kernel config. Doing this allowed the board to boot reliably.

Martyn

-- 
Martyn Welch (Principal Software Engineer)   |   Registered in England and
GE Intelligent Platforms                     |   Wales (3828642) at 100
T +44(0)127322748                            |   Barbirolli Square, Manchester,
E martyn.welch@ge.com                        |   M2 3AB  VAT:GB 927559189

--

From: Kumar Gala
Date: Thursday, February 25, 2010 - 11:27 am

what mechanism do you use for warm resets?

- k
--

Previous thread: [PATCH] obsolete config in kernel source (HSO_AUTOPM) by Christoph Egger on Friday, February 5, 2010 - 6:39 am. (1 message)

Next thread: Re: [PATCH] obsolete config in kernel source (HSO_AUTOPM) by Paulius Zaleckas on Friday, February 5, 2010 - 7:38 am. (2 messages)