Re: [patch 2/4] sdio: set the functions' block size

Previous thread: none

Next thread: [PATCH -mm] Introduce U16_MAX and U32_MAX by Satyam Sharma on Tuesday, July 31, 2007 - 9:26 am. (4 messages)
From: David Vrabel
Date: Tuesday, July 31, 2007 - 8:36 am

These three patches enhance the support for the SDIO IO_RW_EXTENDED command.
The block size of functions is managed and the I/O ops (sdio_readsb() etc) are
extended to handle arbitrary lengths of data (by using multiple commands).

I've not yet had a chance to test this stuff as I don't (yet) have the time to
write a Bluetooth Type-A driver so these are posted as an example of the sort
of API I'd expect.

David
-- 
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ 
-

From: David Vrabel
Date: Tuesday, July 31, 2007 - 8:36 am

Signed-off-by: David Vrabel <david.vrabel@csr.com>

---
commit 51755c3d59be1ba778bef45888f9f5e341dc4af4
tree c7bbb562b2d801197eefb619a17c94467c1299cd
parent 1cf0b6019aa3916197eecafe058bd2f3d700d24a
author David Vrabel <david.vrabel@csr.com> Mon, 30 Jul 2007 19:20:59 +0100
committer David Vrabel <dvrabel@cantab.net> Mon, 30 Jul 2007 19:28:30 +0100

 drivers/mmc/core/sdio.c     |    4 ++--
 drivers/mmc/core/sdio_cis.c |    2 +-
 include/linux/mmc/sdio.h    |   16 +++++++++-------
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/mmc/core/sdio.c b/drivers/mmc/core/sdio.c
index 6589fd6..08f579c 100644
--- a/drivers/mmc/core/sdio.c
+++ b/drivers/mmc/core/sdio.c
@@ -30,7 +30,7 @@ static int sdio_read_fbr(struct sdio_func *func)
 	unsigned char data;
 
 	ret = mmc_io_rw_direct(func->card, 0, 0,
-		func->num * 0x100 + SDIO_FBR_STD_IF, 0, &data);
+		SDIO_FBR_STD_IF(func->num), 0, &data);
 	if (ret != MMC_ERR_NONE)
 		goto out;
 
@@ -38,7 +38,7 @@ static int sdio_read_fbr(struct sdio_func *func)
 
 	if (data == 0x0f) {
 		ret = mmc_io_rw_direct(func->card, 0, 0,
-			func->num * 0x100 + SDIO_FBR_STD_IF_EXT, 0, &data);
+			SDIO_FBR_STD_IF_EXT(func->num), 0, &data);
 		if (ret != MMC_ERR_NONE)
 			goto out;
 	}
diff --git a/drivers/mmc/core/sdio_cis.c b/drivers/mmc/core/sdio_cis.c
index b0c6697..8aa4666 100644
--- a/drivers/mmc/core/sdio_cis.c
+++ b/drivers/mmc/core/sdio_cis.c
@@ -215,7 +215,7 @@ int sdio_read_cis(struct sdio_func *func, unsigned int fn)
 	for (i = 0; i < 3; i++) {
 		unsigned char x;
 		ret = mmc_io_rw_direct(func->card, 0, 0,
-				       fn * 0x100 + SDIO_FBR_CIS + i, 0, &x);
+				       SDIO_FBR_CIS(fn) + i, 0, &x);
 		if (ret != MMC_ERR_NONE)
 			return -EIO;
 		ptr |= x << (i * 8);
diff --git a/include/linux/mmc/sdio.h b/include/linux/mmc/sdio.h
index 145ce23..e2bda4b 100644
--- a/include/linux/mmc/sdio.h
+++ b/include/linux/mmc/sdio.h
@@ -132,26 +132,28 @@
  * Function Basic Registers (FBR)
  */
 
-#define ...
From: Pierre Ossman
Date: Saturday, August 4, 2007 - 6:26 am

On Tue, 31 Jul 2007 16:36:31 +0100

I am a bit sceptical about these macros. Most i/o code in the kernel is
in the form of "<base address> + <register offset>". For one thing,
that model keeps the register defines clean and means people don't have
to go reaching for specs all the time.

Would you be content with replacing "func->num * 0x100" with a macro so
that the code becomes something like:

	SDIO_FBR_BASE(func->num) + SDIO_FBR_STD_IF

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Monday, August 6, 2007 - 3:14 am

I really don't follow you objection to this.  If one is maintaining the
SDIO core then I would expect some familiarity with the spec and an
understanding the FBRs are per-function but contained in the same
CCCR/F0 register space.


I think this is less readable than SDIO_FBR_STD_IF(func->num).

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Monday, August 6, 2007 - 7:58 am

On Mon, 06 Aug 2007 11:14:03 +0100

If we can reduce that barrier, then I think we should. People can't be
expected to keep everything fresh in memory all the time. And we won't

In some sense, but there are also several identical FBR chunks on the
card. So by most definitions of a base address, the start of each chunk

It's subjective. But the longer version is more understandable for
someone who doesn't have the details of the SDIO protocol fresh in his
mind.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Tuesday, July 31, 2007 - 8:36 am

Prior to calling the driver's probe(), set the functions' block size to the
largest that's supported by both the card and the driver.

Signed-off-by: David Vrabel <david.vrabel@csr.com>

---
commit 6d367fd822cbb2b8089ab7ef83f706f1984ab25b
tree 8c9cc84b4c8c1c8f959abe540aa02f14aa95c51d
parent 51755c3d59be1ba778bef45888f9f5e341dc4af4
author David Vrabel <david.vrabel@csr.com> Mon, 30 Jul 2007 19:58:54 +0100
committer David Vrabel <dvrabel@cantab.net> Mon, 30 Jul 2007 19:58:54 +0100

 drivers/mmc/core/sdio_bus.c   |   13 +++++++++++++
 drivers/mmc/core/sdio_cis.c   |    8 ++++----
 drivers/mmc/core/sdio_io.c    |   23 +++++++++++++++++++++++
 include/linux/mmc/sdio_func.h |    7 +++++++
 4 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/core/sdio_bus.c b/drivers/mmc/core/sdio_bus.c
index a3ac1aa..f4d40c1 100644
--- a/drivers/mmc/core/sdio_bus.c
+++ b/drivers/mmc/core/sdio_bus.c
@@ -127,11 +127,24 @@ static int sdio_bus_probe(struct device *dev)
 	struct sdio_driver *drv = to_sdio_driver(dev->driver);
 	struct sdio_func *func = dev_to_sdio_func(dev);
 	const struct sdio_device_id *id;
+	unsigned short block_size;
+	int ret;
 
 	id = sdio_match_device(func, drv);
 	if (!id)
 		return -ENODEV;
 
+	/*
+	 * Set the function's block size to the largest supported by
+	 * both the function and the driver.
+	 */
+	block_size = min(drv->max_block_size, func->max_block_size);
+	sdio_claim_host(func);
+	ret = sdio_set_block_size(func, block_size);
+	sdio_release_host(func);
+	if (ret)
+		return ret;
+
 	return drv->probe(func, id);
 }
 
diff --git a/drivers/mmc/core/sdio_cis.c b/drivers/mmc/core/sdio_cis.c
index 8aa4666..c16c536 100644
--- a/drivers/mmc/core/sdio_cis.c
+++ b/drivers/mmc/core/sdio_cis.c
@@ -93,9 +93,9 @@ static int cistpl_funce(struct sdio_func *func, unsigned fn,
 		if (size < 0x04 || buf[0] != 0)
 			goto bad;
 		/* TPLFE_FN0_BLK_SIZE */
-		val = buf[1] | (buf[2] << 8);
+		func->max_block_size = buf[1] | (buf[2] << 8);
 ...
From: Pierre Ossman
Date: Saturday, August 4, 2007 - 6:30 am

On Tue, 31 Jul 2007 16:36:32 +0100


I guess you aren't working against the latest tree as this code is


You also need to check that the host supports that block size.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Monday, August 6, 2007 - 3:04 am

For efficient use of the bus it is useful to pad transfers to a multiple
of the block size.  I reckoned it was potentially beneficial for drivers
to use smaller blocks so less padding was required.  However, after
looking at some figures the per block overhead (28 clocks for a write)
really does limit the transfer sizes where small block sizes is beneficial.

I'll remove this.  Drivers can still override the block size by calling
sdio_set_block_size() directly.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: David Vrabel
Date: Tuesday, July 31, 2007 - 8:36 am

Extend sdio_readsb(), sdio_writesb(), sdio_memcpy_fromio(), and
sdio_memcpy_toio() to handle any length of buffer by splitting the transfer
into several IO_RW_EXTENDED commands. Typically, a transfer would be split
into a single block mode transfer followed by a byte mode transfer for the
remainder.

For this to work the block size must be limited to the maximum size of a
byte mode transfer (512 bytes).  This limitation could be revisited if
there are any cards out there that require a block size > 512.

Signed-off-by: David Vrabel <david.vrabel@csr.com>

---
commit 1c701a6566f373ede250b51c99216227fd881535
tree 2bbc5005de65bbdc497c8603e8a68486465065dc
parent 6d367fd822cbb2b8089ab7ef83f706f1984ab25b
author David Vrabel <david.vrabel@csr.com> Tue, 31 Jul 2007 11:08:49 +0100
committer David Vrabel <dvrabel@cantab.net> Tue, 31 Jul 2007 11:44:44 +0100

 drivers/mmc/core/sdio_io.c  |   63 ++++++++++++++++++++++++++++++++++---------
 drivers/mmc/core/sdio_ops.c |   17 +++++++-----
 drivers/mmc/core/sdio_ops.h |    2 +
 3 files changed, 61 insertions(+), 21 deletions(-)

diff --git a/drivers/mmc/core/sdio_io.c b/drivers/mmc/core/sdio_io.c
index e74e605..8fb1880 100644
--- a/drivers/mmc/core/sdio_io.c
+++ b/drivers/mmc/core/sdio_io.c
@@ -156,6 +156,13 @@ int sdio_set_block_size(struct sdio_func *func, unsigned short blksz)
 {
 	int ret;
 
+	/* The standard permits block sizes up to 2048 bytes, but
+	 * sdio_readsb() etc. can only work with block size <= 512. */
+	if (blksz > 512) {
+		blksz = 512;
+		dev_warn(&func->dev, "block size limited to 512 bytes\n");
+	}
+
 	ret = mmc_io_rw_direct(func->card, 1, 0, SDIO_FBR_BLKSIZE(func->num),
 			       blksz & 0xff, NULL);
 	if (ret)
@@ -228,6 +235,39 @@ void sdio_writeb(struct sdio_func *func, unsigned char b, unsigned int addr,
 
 EXPORT_SYMBOL_GPL(sdio_writeb);
 
+/* Split an arbitrarily sized data transfer into several
+ * IO_RW_EXTENDED commands. */
+static int sdio_io_rw_ext_helper(struct sdio_func *func, int ...
From: Pierre Ossman
Date: Saturday, August 4, 2007 - 6:35 am

On Tue, 31 Jul 2007 16:36:33 +0100


You need to check how many blocks the host supports in one go. Also,

Until this function is made complete, I'd like some kind of test that
blksz <= 512 when blocks == 1.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: Pierre Ossman
Date: Saturday, August 4, 2007 - 6:23 am

On Tue, 31 Jul 2007 16:36:30 +0100

Thanks. These are some nice improvements. I do have one suggestion
though:

Could we design it so that sdio_io_rw_ext_helper() sets the block size
itself? That way most drivers wouldn't have to care about that detail
and the core would be free to choose optimal values.

I suspect that some transactions might require a certain block size.
But we could satisfy that by stating that any transfer small enough to
fit into one block will not be split up.

(PS. You might want to add [PATCH x/3] to your subjects so that the
order of the patches is crystal clear)

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Monday, August 6, 2007 - 3:31 am

I would expect the block size to be set once per card, and never be
changed and thus it's not logically a per-transfer operation.  We
certainly wouldn't want to change the block size willy-nilly as it's an
expensive operation.

The patch I've presented does put the selection of the block size in the

I consider it unlikely that any card would want to do anything other
than always use the largest possible block size.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Monday, August 6, 2007 - 8:12 am

On Mon, 06 Aug 2007 11:31:19 +0100

Indeed. It would of course be optimized so that it doesn't change the
size needlessly.

The beauty is that drivers wouldn't have to care. Things just

I have a counter example. I have here a Marvell wifi card which needs a
firmware upload. And it seems to be rather picky about parameters
during that upload.

I'm still experimenting with a clean way to do things for this card.
I'll get back to you. :)

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Monday, August 6, 2007 - 8:32 am

Drivers may care about the block size though so you can't have it
changing behind their backs. e.g., they may need to pad data to a

sdio_set_block_size(func, 64); /* ew, this card is fussy */

download_firmware();

sdio_set_block_size(func, func->max_blksize); /* Ahhh, back to the
card's default */

If a card is fussy about the block size I don't think there's anything
cleaner than specifying directly in the driver.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Monday, August 6, 2007 - 11:06 am

On Mon, 06 Aug 2007 16:32:40 +0100

Well, we have to assume that they aren't just padding to pass the time.

I can see a couple of reasons:

1. They are padding because it made their code easier by allowing just
one transfer.

This is what I believe is the common case, and one that will go away if
we allow the core free access to the block size. All the complexity is
in the core and the drivers don't even have to bother with setting a
block size.

2. They must write an exact number of bytes to the card before it
activates.

As far as the core is concerned, padding and "real" data are the same
thing, so this poses no restriction on how the core can fiddle with
block sizes and transfers.

3. The entire transfer must be in one transaction.

Now this is a problem as the core might prefer to do two transfers
(probably one block and one byte).

As I believe this will be a rare case, we should try to make sure we
can handle this in a way that can keep less fussy transactions simple.
So I propose changing your sdio_set_block_size() API to
sdio_force_block_size().

That way the driver can lock the block size when it has particular
needs, yet keep it under the (assumingly more optimal) control of the
core at other times.

One could have a calling convention such as specifying a block size of
zero to turn off the forced block size.

One could also use this as a less complex way of informing the drivers
of host restrictions. If the block size specified isn't possible, we
can return an error code stating such. Without that, every driver that
needs this would need to duplicate code that computes possible block
size and would make our life a pain when we want to add new host

Well, the driver has to specify the information somehow. As to how,
there are a number of options. My proposed sdio_force_block_size() will
work here as well.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          ...
From: Pierre Ossman
Date: Monday, August 6, 2007 - 1:01 pm

This is essentially what I mean.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
From: David Vrabel
Date: Tuesday, August 7, 2007 - 5:51 am

It does not behave very well, see the specific comments at the end.

When considering optimizing SDIO it is important to always keep in mind
that a command is very expensive.  They're some 100-150 clocks long and
there's also overheads in interrupt handling/scheduling, this can add to
up to 10s of us.  Therefore, when selecting an optimum block size one
should aim to reduce the number of commands before attempting to
maximize the block size.

There are two cases to consider.

1. Card's max block_size <= 512.

In this case the optimum block size is /always/ the card's maximum.
This means we can do all transfers in two commands (and if the driver is
aware of the block size it may choose to pad transfers so only one
command is done).

2. Card's max block size > 512.

We can't handle arbitrary sized transfers with a block size > 512 (since
512 is maximum length of a byte mode transfer), so if we wish to use a
block size of 1024 (say) some transfers will require three commands (a
block mode, and two byte mode transfers).  For a block size of 2048 (the
maximum possible) some transfer may require 5 commands!

We could attempt to set the block size as follows:

transfer size % 1024 == 0 => block size = 1024
transfer size % 2048 == 0 => block size = 2048
otherwise                 => block size = 512

However, without knowing what sizes of transfers the driver is going to
use we cannot intelligently know when it's best to switch the block
size.  A block size change is expensive (two commands).

The change in block size from 512 to 2048 improves performance (for an
individual command) by 2.0% [1].  I don't believe this performance
benefit is worth the possible overhead of frequent block size changes
nor the cost of potentially more than 2 commands per transfer.

  [1] The real performance benefit will be much less due to per command
  overhead and other non IO_RW_EXTENDED traffic on the bus.

In conclusion, the optimum block size is based solely on the card and
host ...
From: David Vrabel
Date: Tuesday, August 7, 2007 - 5:53 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: David Vrabel
Date: Tuesday, August 7, 2007 - 5:54 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 6:38 am

On Tue, 07 Aug 2007 13:54:32 +0100

This is probably better done in the sdio_enable_func().

We also need to check that the card actually supports block io.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Tuesday, August 7, 2007 - 10:20 am

I don't think so. A driver might enable/disable a function multiple
times but there's no need to set the block size every time.

It may be best to move setting the block size back to before the probe
so a driver can be sure the block size is something sensible.  Consider
a failed probe that called sdio_set_block_size() -- this will currently
mess up drivers probed later.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 10:54 am

On Tue, 07 Aug 2007 18:20:26 +0100

Why would it want to do that?

Anyway, as long as cur_blksz is preserved, sdio_set_block_size() can
easily filter out redundant calls. No need to compromise in the rest of
the code for that.

In the patch I sent, the block size is set the first time is needed.

Right, or remove the lock in the variant I proposed.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Wednesday, August 8, 2007 - 2:46 am

A function enable/disable cycle should act as a per-function reset.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Wednesday, August 8, 2007 - 3:06 am

On Wed, 08 Aug 2007 10:46:24 +0100

It could be argued that it would reset the other per-function
information as well. :)

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Wednesday, August 8, 2007 - 3:19 am

I do not think that that's what the SDIO specification requires, nor is
it what hardware I am familiar with does.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: David Vrabel
Date: Tuesday, August 7, 2007 - 5:55 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 6:42 am

On Tue, 07 Aug 2007 13:55:12 +0100

I'm not sure about not returning an error here. What if a driver calls

We need to check support for block transfers here as well.


|| blksz > 512

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 1:17 pm

Here's my proposed compromise.

If a driver uses sdio_force_block_size() extensively, it will be like
your original version with sdio_set_block_size(). If it doesn't,
however, then that is a way to indicate that the driver has no specific
requirements (meaning we are free to change things in the future).

You'll also notice that calling sdio_force_block_size() is free. The
actual change comes when it is first used, meaning that drivers don't
have to be as careful when calling it.

The reset of custom/forced block size is moved to the probe function as
you suggested.

diff --git a/drivers/mmc/core/sdio_bus.c b/drivers/mmc/core/sdio_bus.c
index ca1d4b2..6ead36a 100644
--- a/drivers/mmc/core/sdio_bus.c
+++ b/drivers/mmc/core/sdio_bus.c
@@ -133,6 +133,8 @@ static int sdio_bus_probe(struct device *dev)
 	if (!id)
 		return -ENODEV;
 
+	func->forced_blksz = 0;
+
 	return drv->probe(func, id);
 }
 
diff --git a/drivers/mmc/core/sdio_cis.c b/drivers/mmc/core/sdio_cis.c
index fffbc5a..20fdd1d 100644
--- a/drivers/mmc/core/sdio_cis.c
+++ b/drivers/mmc/core/sdio_cis.c
@@ -145,9 +145,9 @@ static int cistpl_funce_func(struct sdio_func *func,
 	printk(KERN_DEBUG "%s: TPLFE_CSA_PROPERTY = 0x%02x\n",
 	       sdio_func_id(func), val);
 	/* TPLFE_MAX_BLK_SIZE */
-	func->blksize = buf[12] | (buf[13] << 8);
+	func->max_blksz = buf[12] | (buf[13] << 8);
 	printk(KERN_DEBUG "%s: TPLFE_MAX_BLK_SIZE = %u\n",
-	       sdio_func_id(func), (unsigned)func->blksize);
+	       sdio_func_id(func), (unsigned)func->max_blksz);
 	/* TPLFE_OCR */
 	val = buf[14] | (buf[15] << 8) | (buf[16] << 16) | (buf[17] << 24);
 	printk(KERN_DEBUG "%s: TPLFE_OCR = 0x%08x\n",
diff --git a/drivers/mmc/core/sdio_io.c b/drivers/mmc/core/sdio_io.c
index 7f08ba5..46ada64 100644
--- a/drivers/mmc/core/sdio_io.c
+++ b/drivers/mmc/core/sdio_io.c
@@ -202,6 +202,115 @@ void sdio_writeb(struct sdio_func *func, unsigned char b, unsigned int addr,
 
 EXPORT_SYMBOL_GPL(sdio_writeb);
 
+/*
+ * Internal function. Sets ...
From: David Vrabel
Date: Wednesday, August 8, 2007 - 3:19 am

We need to know the block size in use /before/ the start of the transfer
as we would like drivers to be able to perform transfers with single
commands as this can result in significantly better performance[1].  The
drivers for our (CSR's) WiFi chips should do this.  This isn't some (as
you suggested in a previous post) 'rare' requirement.

Therefore, we should not change the block size here.

We can also support block sizes > 512 and have this function do the
correct thing (as you'll see when I post my final patches).

David

[1] Consider a chip with a block size of 64 that regularly does short
transfers of between 64 - 128 bytes.  Without padding, this would
require two commands per transfer instead of just one, cutting
performance by 50%!  This scenario could be a WiFi chip doing VoIP.
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Wednesday, August 8, 2007 - 3:37 am

On Wed, 08 Aug 2007 11:19:33 +0100

Well, there are more ways that can be achieved.

First, the driver could lock down the block size using
sdio_force_block_size(). Then it knows what it gets.

Second, we could try to make it possible for the driver to indicate
"feel free to pad this transfer". Then we could also remove the need
for drivers to mess with buffers and keep such stuff in the core. We
could even magically remove a memcpy() by setting up two sg entries,

Provided block size < 128. Otherwise it'll jump straight to the byte
transfer.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Wednesday, August 8, 2007 - 6:22 am

Setting the block size in io_rw_ext_helper() has several drawbacks, namely:

1. Reduces the flexibility of drivers to manage what commands are performed.
2. A performance penalty on the first transfer.
3. Greater code complexity.
4. Non-intuitive location for card initialization code.

Sure we could just through hoops and add (much) extra complexity to the
core, to improve item 1 but 2, 3 and 4 are insoluble. Given that setting
block size before the first command has /zero/ benefits[1], why bother?

Your insistence on this stupid idea baffles me, particularly in the
light of your other useful comments.

I would also like to advise that until a larger number of function
drivers become available that the core is kept as simple and as flexible
as possible.  Without knowing how different cards operate it is
difficult to know what's common behaviour and what's card specific.

My latest (and hopefully final) patch set follows.

David

[1] dynamic block sizing was (potentially) a useful benefit but I have
comprehensibly shown it's not beneficial.  The idea that is somehow
necessary to permit the core to change it's block size selection
algorithm is bogus.
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: David Vrabel
Date: Wednesday, August 8, 2007 - 6:23 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: David Vrabel
Date: Wednesday, August 8, 2007 - 6:23 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: David Vrabel
Date: Wednesday, August 8, 2007 - 6:24 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: Pierre Ossman
Date: Wednesday, August 8, 2007 - 9:55 am

On Wed, 08 Aug 2007 14:22:20 +0100

3 and 4 are subjective, and for 2, the performance penalty has to come
some place. The upside, as explained, is that drivers aren't penalized
for setting the block size and then failing to use it. Something that
gives us the flexibility to write cleaner code.

As for benefits, there are other issues. We might want to deal with

Agreed, we're doing a bit more guessing than one would like. But I
don't agree in keeping the core simple and flexible. Rather the
opposite. Any guarantee that we give drivers now and need to remove in
the future needs to be tested with all drivers that rely on it (a
difficult task as it often takes some effort to find people with
hardware and the ability to test patches).

As such, the guarantees should be kept at a minimum. Code is easily
changed, API is not.

IMO, the best way to achieve this is with an API that describes _what_
the drivers want to do, not _how_. Hence my insistence on allowing
drivers to specify the difference between "I want this data
transferred" and "I want this data transferred with an exact block size

These looks good. The only part I'm really attached to is the blksz ==
0 case, which you have.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Tuesday, August 7, 2007 - 5:55 am

-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 6:43 am

On Tue, 07 Aug 2007 13:55:59 +0100

I'm not sure about this. I've seen several hosts which lack a
mechanical detect switch.

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: David Vrabel
Date: Tuesday, August 7, 2007 - 7:46 am

If host is correctly designed it will have identical pull-ups on all the
DAT lines, and therefore this internal pull-up should be disconnected.

However, given that disabling the pull-up may cause 4 bit transfers to
fail on hardware that omits a host-side pull-up on DAT3 I agree that the
resistor should remain enabled.

Is there an example driver for a host that uses pin 1/DAT3 sensing for
card detection?  I'm curious about how removal detections work.

App Note on card detection goes into the use of this resistor in more
detail but I don't believe this document is publicly available :(.

David
-- 
David Vrabel, Software Engineer, Drivers group  Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ


.
-

From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 8:08 am

On Tue, 07 Aug 2007 15:46:36 +0100

Both wbsd and sdhci have experienced hardware with this. In sdhci this
is hidden completely in hardware, but wbsd has to do some funky logic

SD is a wonderful world. :/

-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

From: Pierre Ossman
Date: Tuesday, August 7, 2007 - 6:33 am

On Tue, 07 Aug 2007 13:51:39 +0100

Well, I can hardly argue with that thorough analysis. :)

I still like the idea of having the control in the core as much as
possible though. It allows people to experiment and figure out new and
clever ways of solving this in a way that can benefit all drivers.

So could we do most of what you propose, where we set the block size to
maximum, yet <= 512. Drivers can tweak this further by calling
sdio_force_block_size() (as need e.g. by my marvell card), and keep the
convention of 0 meaning "back to core default"? That way we can modify


Quite right. I had no doubts that thing had room for improvement. But I
agree that we might as well select a fixed block size and stick with it
for the general case.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org
-

Previous thread: none

Next thread: [PATCH -mm] Introduce U16_MAX and U32_MAX by Satyam Sharma on Tuesday, July 31, 2007 - 9:26 am. (4 messages)