Re: USB transfer_buffer allocations on 64bit systems

Previous thread: none

Next thread: [PATCH] macb: allow reception of large (>1518 bytes) frames by Peter Korsgaard on Wednesday, April 7, 2010 - 7:59 am. (2 messages)
From: Alan Stern
Date: Wednesday, April 7, 2010 - 7:59 am

That seems like the right answer.  You are correct that an EHCI 
controller capable only of 32-bit memory accesses would not be able to 

That is not a good fix.  usb_buffer_alloc() provides coherent memory, 
which is not what we want.  I believe the correct fix is to specify the 
GFP_DMA32 flag in the kzalloc() call.

Of course, some EHCI hardware _is_ capable of using 64-bit addresses.  
But not all, and other controller types aren't.  In principle we could
create a new allocation routine, which would take a pointer to the USB
bus as an additional argument and use it to decide whether the memory 
needs to lie below 4 GB.  I'm not sure adding this extra complexity 

Practically every USB driver, I should think.  And maybe a lot of 

Good work -- it certainly would have taken me a long time to figure 
this out.

Alan Stern

--

From: Daniel Mack
Date: Wednesday, April 7, 2010 - 8:11 am

Well, I thought this is exactly what the usb_buffer_alloc() abstraction
functions are there for. We already pass a pointer to a struct
usb_device, so the routine knows which host controller it operates on.
So we can safely dispatch all the magic inside this function, no?

If not, I would rather introduce a new function than adding GFP_ flags
to all existing drivers.

I vote for a clean solution, a fixup of existing implementations and
a clear note about how to allocate buffers for USB drivers. I believe
faulty allocations of this kind can explain quite a lot of problems on

I found many such problems in drivers/media/{dvb,video},
drivers/usb/serial, sound/usb and even in the USB core. If we agree on
how to fix that nicely, I can take some of them.

Daniel
--

From: Greg KH
Date: Wednesday, April 7, 2010 - 8:31 am

Hm, yeah, I thought that is what it was for too.  If not, why can't we


Yeah, I really don't want to have to change every driver in different
ways just depending on if someone thinks it is going to need to run on
this wierd hardware.

Alan, any objection to just using usb_buffer_alloc() for every driver?
Or is that too much overhead?

thanks,

greg k-h
--

From: Daniel Mack
Date: Wednesday, April 7, 2010 - 8:35 am

FWIW, most drivers I've seen in the past hours use a wild mix of
kmalloc(), kzalloc(), kcalloc() and usb_buffer_alloc(). That should
really be unified.

Daniel

--

From: Greg KH
Date: Wednesday, April 7, 2010 - 8:51 am

Yes, if it is necessary that we have to handle this type of crappy
hardware, then it all needs to be unified.  Or at least unified into 2
types of calls, one that needs the bounce buffer fun (what
usb_buffer_alloc() does today), and one that doesn't (usb_kzalloc()
perhaps?)

thanks,

greg k-h
--

From: Oliver Neukum
Date: Wednesday, April 7, 2010 - 11:09 pm

kmalloc() & friends != usb_buffer_alloc(). They do different things.
It makes no sense to unify them. If you really need an ordinary
buffer DMA will surely work on, this needs a third primitive.

	Regards
		Oliver
--

From: Daniel Mack
Date: Thursday, April 8, 2010 - 4:07 am

I know. I just believe that many developers used usb_buffer_alloc() even
though they don't really need coherent DMA memory. The function's name

I think it will help a lot to rename usb_buffer_alloc() in the first
place and then reconsider where coherent memory is really needed.

Daniel
--

From: Oliver Neukum
Date: Wednesday, April 7, 2010 - 9:54 am

What about XHCI? Do you really want to limit it to 32bits?

	Regards
		Oliver
--

From: Daniel Mack
Date: Wednesday, April 7, 2010 - 10:00 am

No. Once we have the abstraction functions, we can well decide what to
do in there, depending on the actual controller we're running on.

Daniel

--

From: Robert Hancock
Date: Wednesday, April 7, 2010 - 4:55 pm

AFAIK, the driver shouldn't have to worry about this at all. When the 
buffer gets DMA-mapped for the controller, the DMA mapping code should 
see that the device has a 32-bit DMA mask and either bounce or IOMMU-map 
the memory so that it appears below 4GB.

Not to say that something might not be sabotaging this somehow, but this 

--

From: Andi Kleen
Date: Monday, April 12, 2010 - 4:53 am

Yes today it's faster at least.


Then something must be broken in Pedro's system and likely other drivers
will also not work. I don't think it should be worked around in the USB layer.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Pedro Ribeiro
Date: Monday, April 12, 2010 - 5:11 am

I'm not putting into question whether something is broken in my
system, but if it is, it must be the ICH9 platform, because I was able
to reproduce it in another laptop.

My laptop is a Lenovo T400 and I was able to reproduce it in a Acer
Aspire 59xx (I don't remember the exact model, but it is one of the
new ones with 15.6 inch - i think they all use the same base). And the
common thing between them is the ICH9 platform.

The only which solved this problem was the first patch sent to me by
Daniel Mack. I've been using it for days straight and it works fine.

Pedro
--

From: Andi Kleen
Date: Monday, April 12, 2010 - 5:12 am

There are lots of systems around with ICH9 that work fine.

Can you send a full boot log?

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Daniel Mack
Date: Monday, April 12, 2010 - 5:32 am

FWIW, the fix that made it work for Pedro was to use usb_buffer_alloc()
for the transfer_buffer of the audio module.

Another detail I can't explain is that on his machine, the kernel oopses
when kmalloc() with GFP_DMA32 is used. The patch to try this also only

He just did. I put it online here:

  http://caiaq.de/download/tmp/pedro-dmesg

Daniel
--

From: Andi Kleen
Date: Monday, April 12, 2010 - 5:47 am

The system seems to set up the soft iotlb correctly.

[    0.468472] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    0.468539] Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
[    0.468610] software IO TLB at phys 0x20000000 - 0x24000000

Also if that was wrong a lot more things would go wrong.

I would suspect the driver. Are you sure:

- it sets up it's dma_masks correctly?
- it uses pci_map_single/sg correctly for all transferred data?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Daniel Mack
Date: Monday, April 12, 2010 - 5:54 am

Pedro sent me an image:

  http://caiaq.de/download/tmp/GFP_DMA32.JPG


Well, the sound driver itself doesn't care for any of those things, just
like any other USB driver doesn't. The USB core itself of the host
controller driver should do, and as far as I can see, it does that, yes.

Daniel

--

From: Andi Kleen
Date: Monday, April 12, 2010 - 8:43 am

Ah you can't use GFP_DMA32 with kmalloc, only GFP_DMA.

Actually there should be a WARN_ON for this when slab debugging 
is enabled.

Slab needs separate caches for dma, and it only has them for GFP_DMA,

Hmm, still things must go wrong somewhere. Perhaps need some instrumentation
to see if all the transfer buffers really hit the PCI mapping functions.

It might be interesting to test if the device works with enabled
IOMMU. That would trigger any failures to properly map the buffers
earlier.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--

Previous thread: none

Next thread: [PATCH] macb: allow reception of large (>1518 bytes) frames by Peter Korsgaard on Wednesday, April 7, 2010 - 7:59 am. (2 messages)