The capacity printk'd in bytes is divided by 1000000,
whereas 1048576 would be more consistent with the rest
of the OS and disk-related utilities ('df' etc.).
This change replaces the (sz - (sz/625 - 974))/1950
calculation with a simple right shift by 11 bits
(/2048).
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
---
drivers/scsi/sd.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e5e7d78..e6fd6fd 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1441,10 +1441,8 @@ got_data:
sector_t mb = sz;
blk_queue_hardsect_size(queue, hard_sector);
- /* avoid 64-bit division on 32-bit platforms */
- sector_div(sz, 625);
- mb -= sz - 974;
- sector_div(mb, 1950);
+ /* Convert to megabytes (/2048) */
+ mb = sz >> 11;
sd_printk(KERN_NOTICE, sdkp,
"%llu %d-byte hardware sectors (%llu MB)\n",
--
1.5.6.5
--
Simon Arlott
--
linux-scsi is the correct list for this, cc's added No, this is wrong. By mandated standards the manufacturers are allowed to calculate MB by dividing by 10^6. This is a fiddle to allow them to make their drives look slightly bigger. However, we want the printed information to match that written on the drive, so in this printk, we use the manufacturer standard for calculation (and then do everything else in bytes so we don't have to bother with it ever again). James --
I was looking at this code recently because it looks really bizarre when you create a half-petabyte filesystem: $ sudo insmod drivers/ata/ata_ram.ko capacity=1099511627776 preallocate=0 [12095.028093] ata7.00: 1099511627776 sectors, multi 0: LBA48 NCQ (depth 31/32) [12095.028093] ata7.00: configured for UDMA/133 [12095.041915] scsi 7:0:0:0: Direct-Access ATA Linux RAM Drive 0.01 PQ: 0 ANSI: 5 [12095.041915] sd 7:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16). [12095.041915] sd 7:0:0:0: [sdc] 1099511627776 512-byte hardware sectors (562949953 MB) [12095.041915] sd 7:0:0:0: [sdc] Write Protect is off [12095.041915] sd 7:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA 1. Avoiding 64-bit divisions is _so_ last decade. We have linux/math64.h, we should use it. 2. We should report in GB or TB when appropriate. The exact definition of 'appropriate' is going to vary from person to person. Might I suggest that we should report between two and four significant digits. eg 9543 MB is ok, 10543 MB should be 10 GB. 3. I hate myself for saying this ... but maybe we should be using the horrific MiB/GiB/TiB instead of MB/GB/TB. 4. I've been far too busy to write said patch. Simon, would you mind doing the honours? -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
On Sat, 30 Aug 2008 11:45:16 -0600
I've been looking at doing something like this for the mmc_block
driver, so a generic helper is welcome. :)
--=20
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
Does MMC have the 10^3 requirement like SCSI/ATA disks do, or can you use 2^10 like standard computer stuff for capacities? James --
On Sat, 30 Aug 2008 16:45:49 -0500
Some do, some don't. I prefer the 2^10 and MiB variants as it avoids
ambiguity.
Rgds
--=20
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
If there's a generic function to do this, there could be a Kconfig option to select the variant (10^3 MB, 2^10 MiB, 2^10 MB ;)), as long as the byte value is available from the same message. -- Simon Arlott --
I think the most compelling reason for ensuring they're the same is that you can plug these devices into both an MMC slot directly and a card-reader and have them show up over USB. You don't want the reported capacity to change when you do that. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
It's unlikely to match what's on the drive, "1000204886016" isn't 1TB I've gone with five digits, it switches to GB at ~98GB, and to TB Somehow this stuff got into net-tools (ifconfig) too, so I have a Sure, patch will follow this email... it can only go as far as 8192EB and then there's not enough space to store more than 2^64 512-byte sectors. (And if you only modify drivers/scsi/sd.c, the kernel make system won't recompile sd.o!) -- Simon Arlott --
The capacity printk'd in bytes is divided by 1000000,
whereas 1048576 would be more consistent with the rest
of the OS and disk-related utilities ('df' etc.).
This change replaces the (sz - (sz/625 - 974))/1950
calculation with a simple right shift to output with
five significant digits the capacity in KB, MB, GB, TB,
PB, or EB. Anything beyond this becomes too large...
---
drivers/scsi/sd.c | 62 ++++++++++++++++++++++++++++++++++------------------
1 files changed, 40 insertions(+), 22 deletions(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e5e7d78..c47a071 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1290,6 +1290,8 @@ sd_read_capacity(struct scsi_disk *sdkp, unsigned char *buffer)
struct scsi_sense_hdr sshdr;
int sense_valid = 0;
struct scsi_device *sdp = sdkp->device;
+ unsigned long long orig_capacity, sz;
+ char *units;
repeat:
retries = 3;
@@ -1429,30 +1431,16 @@ got_data:
*/
sector_size = 512;
}
- {
- /*
- * The msdos fs needs to know the hardware sector size
- * So I have created this table. See ll_rw_blk.c
- * Jacques Gelinas (Jacques@solucorp.qc.ca)
- */
- int hard_sector = sector_size;
- sector_t sz = (sdkp->capacity/2) * (hard_sector/256);
- struct request_queue *queue = sdp->request_queue;
- sector_t mb = sz;
-
- blk_queue_hardsect_size(queue, hard_sector);
- /* avoid 64-bit division on 32-bit platforms */
- sector_div(sz, 625);
- mb -= sz - 974;
- sector_div(mb, 1950);
- sd_printk(KERN_NOTICE, sdkp,
- "%llu %d-byte hardware sectors (%llu MB)\n",
- (unsigned long long)sdkp->capacity,
- hard_sector, (unsigned long long)mb);
- }
+ /*
+ * The msdos fs needs to know the hardware sector size
+ * So I have created this table. See ll_rw_blk.c
+ * Jacques Gelinas (Jacques@solucorp.qc.ca)
+ */
+ blk_queue_hardsect_size(sdp->request_queue, sector_size);
/* Rescale capacity to 512-byte units */
+ orig_capacity = sdkp->capacity;
if (sector_size == 4096)
...Well, still needs to be dividing by 1000 not 1024 for SCSI and ATA. However, I'm afraid it needs to be a bit more sophisticated: for instance, under these calculations, a 1.75TB disk will show up as 1TB. Thus, I think we need to print the capacity to 3 significant figures to cope with this case. James --
Do you have an objection to my original suggestion of 1750GB in that case? It saves faffing around with fractions and it's unlikely to confuse the user. BTW, I do appreciate Simon's point about df showing a different number. How about we print: sd 7:0:0:0: [sdc] 1099511627776 512-byte hardware sectors (563TB / 512TiB) (or perhaps a more realistic number ...) sd 7:0:0:0: [sdc] 976562500000 512-byte hardware sectors (500TB / 455TiB) It's perhaps a more gentle way of informing our users that they may not have quite as much capacity as they thought they had. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
Hi Matthew,
Great idea! As a user/admin I would find this the best one of all.
1. All variants given.
2. Correct scientific units used.
I would ACK that one, if you provide a small helper for that printout
somehwere in block/{genhd,blk-core}.c
Best Regards
Ingo Oeser
--
I oppose all the "iB" forms. Has anyone tried pronouncing these? Pee-bee-bytes? Alternatively we could just remove the part in brackets from the message... -- Simon Arlott --
OK, uncle. We're wasting far more time on this email thread than it would take to code the damn thing. So, here it is as a generic helper: both forms of calculation correctly to 3sf. James --
This patch adds the ability to print sizes in either units of 10^3 (SI)
or 2^10 (Binary) units. It rounds up to three significant figures and
can be used for either memory or storage capacities.
Oh, and I'm fully aware that 64 bits is only 16EiB ... the Zetta and
Yotta units are added for future proofing against the day we have 128
bit computers ...
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
include/linux/string_helpers.h | 10 ++++++
lib/Makefile | 3 +-
lib/string_helpers.c | 62 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 74 insertions(+), 1 deletions(-)
create mode 100644 include/linux/string_helpers.h
create mode 100644 lib/string_helpers.c
diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
new file mode 100644
index 0000000..6b827fb
--- /dev/null
+++ b/include/linux/string_helpers.h
@@ -0,0 +1,10 @@
+#include <linux/types.h>
+
+enum string_size_units {
+ STRING_UNITS_10,
+ STRING_UNITS_2,
+};
+
+int string_get_size(u64 size, enum string_size_units units,
+ char *buf, int len);
+
diff --git a/lib/Makefile b/lib/Makefile
index 3b1f94b..44001af 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -19,7 +19,8 @@ lib-$(CONFIG_SMP) += cpumask.o
lib-y += kobject.o kref.o klist.o
obj-y += bcd.o div64.o sort.o parser.o halfmd4.o debug_locks.o random32.o \
- bust_spinlocks.o hexdump.o kasprintf.o bitmap.o scatterlist.o
+ bust_spinlocks.o hexdump.o kasprintf.o bitmap.o scatterlist.o \
+ string_helpers.o
ifeq ($(CONFIG_DEBUG_KOBJECT),y)
CFLAGS_kobject.o += -DDEBUG
diff --git a/lib/string_helpers.c b/lib/string_helpers.c
new file mode 100644
index 0000000..3675eaa
--- /dev/null
+++ b/lib/string_helpers.c
@@ -0,0 +1,62 @@
+/*
+ * Helpers for formatting and printing strings
+ *
+ * Copyright 31 August 2008 James Bottomley
+ */
+#include <linux/kernel.h>
+#include <linux/math64.h>
+#include <linux/module.h>
+#include ...I think this should have been "KiB". I'd prefer an option to display these without the "i"... Oh and you need to store the original capacity before it's re-scaled to 512 bytes. Outputting sdkp->capacity at that stage it'll be a count of 512-byte sectors. -- Simon Arlott --
If anyone ever finds a use for that, possibly ... but with the i is the
Yes, just trying to avoid a multiplication. I should have just used ffz
instead.
James
---
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e5e7d78..ef1c06c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -47,6 +47,7 @@
#include <linux/blkpg.h>
#include <linux/delay.h>
#include <linux/mutex.h>
+#include <linux/string_helpers.h>
#include <asm/uaccess.h>
#include <scsi/scsi.h>
@@ -1429,27 +1430,21 @@ got_data:
*/
sector_size = 512;
}
+ blk_queue_hardsect_size(sdp->request_queue, sector_size);
+
{
- /*
- * The msdos fs needs to know the hardware sector size
- * So I have created this table. See ll_rw_blk.c
- * Jacques Gelinas (Jacques@solucorp.qc.ca)
- */
- int hard_sector = sector_size;
- sector_t sz = (sdkp->capacity/2) * (hard_sector/256);
- struct request_queue *queue = sdp->request_queue;
- sector_t mb = sz;
+ char cap_str_2[10], cap_str_10[10];
+ u64 sz = sdkp->capacity << ffz(~sector_size);
- blk_queue_hardsect_size(queue, hard_sector);
- /* avoid 64-bit division on 32-bit platforms */
- sector_div(sz, 625);
- mb -= sz - 974;
- sector_div(mb, 1950);
+ string_get_size(sz, STRING_UNITS_2, cap_str_2,
+ sizeof(cap_str_2));
+ string_get_size(sz, STRING_UNITS_10, cap_str_10,
+ sizeof(cap_str_10));
sd_printk(KERN_NOTICE, sdkp,
- "%llu %d-byte hardware sectors (%llu MB)\n",
+ "%llu %d-byte hardware sectors: (%s/%s)\n",
(unsigned long long)sdkp->capacity,
- hard_sector, (unsigned long long)mb);
+ sector_size, cap_str_10, cap_str_2);
}
/* Rescale capacity to 512-byte units */
--
Typo for KiB? Should this be another %p extension instead? %pB and %piB perhaps? That would seem easier for the callers than futzing around with managing their own string buffers. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
From: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
---
diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 62a4c91..dad8edb 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -29,6 +29,7 @@
#include <linux/blkdev.h>
#include <linux/mutex.h>
#include <linux/scatterlist.h>
+#include <linux/string_helpers.h>
=20
#include <linux/mmc/card.h>
#include <linux/mmc/host.h>
@@ -496,6 +497,8 @@ static int mmc_blk_probe(struct mmc_card *card)
struct mmc_blk_data *md;
int err;
=20
+ char cap_str[10];
+
/*
* Check that the card supports the command class(es) we need.
*/
@@ -510,10 +513,11 @@ static int mmc_blk_probe(struct mmc_card *card)
if (err)
goto out;
=20
- printk(KERN_INFO "%s: %s %s %lluKiB %s\n",
+ string_get_size(get_capacity(md->disk) << 9, STRING_UNITS_2,
+ cap_str, sizeof(cap_str));
+ printk(KERN_INFO "%s: %s %s %s %s\n",
md->disk->disk_name, mmc_card_id(card), mmc_card_name(card),
- (unsigned long long)(get_capacity(md->disk) >> 1),
- md->read_only ? "(ro)" : "");
+ cap_str, md->read_only ? "(ro)" : "");
=20
mmc_set_drvdata(card, md);
add_disk(md->disk);
--=20
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
Did you want me to take this? It seems to depend on pieces of your mmc tree, so I'd have to carry it in my post-merge tree. James --
On Fri, 05 Sep 2008 15:09:00 -0500
Oh? There shouldn't be any changes around those chunks. What kind of
problems are you seeing?
Rgds
--=20
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
Oh ... your inline attached patch is actually whitespace broken (it has spaces for tabs), which is why it doesn't apply. Sorry, that's the first thing I usually check, I just skipped it in your case. Could you resend it as an attachment? Thanks, James --
On Fri, 05 Sep 2008 16:03:54 -0500
What the... It seems someone broke git-diff (or less). This one is
properly undamaged:
diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 62a4c91..dad8edb 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -29,6 +29,7 @@
#include <linux/blkdev.h>
#include <linux/mutex.h>
#include <linux/scatterlist.h>
+#include <linux/string_helpers.h>
=20
#include <linux/mmc/card.h>
#include <linux/mmc/host.h>
@@ -496,6 +497,8 @@ static int mmc_blk_probe(struct mmc_card *card)
struct mmc_blk_data *md;
int err;
=20
+ char cap_str[10];
+
/*
* Check that the card supports the command class(es) we need.
*/
@@ -510,10 +513,11 @@ static int mmc_blk_probe(struct mmc_card *card)
if (err)
goto out;
=20
- printk(KERN_INFO "%s: %s %s %lluKiB %s\n",
+ string_get_size(get_capacity(md->disk) << 9, STRING_UNITS_2,
+ cap_str, sizeof(cap_str));
+ printk(KERN_INFO "%s: %s %s %s %s\n",
md->disk->disk_name, mmc_card_id(card), mmc_card_name(card),
- (unsigned long long)(get_capacity(md->disk) >> 1),
- md->read_only ? "(ro)" : "");
+ cap_str, md->read_only ? "(ro)" : "");
=20
mmc_set_drvdata(card, md);
add_disk(md->disk);
--=20
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
Can't print a u64 - we don't know what type it has. It must be cast to --
Yes ... I thought about that. Two things I don't like about the
approach are:
1. It adds to the confetti of letters vsnprintf already accepts ...
it's becoming like ls, and we're soon going to run out of
alphabet
2. I'd probably have to plumb in precision and things, which make
the code a bit nastier.
As long as I have everything correctly const'd, the optimisations
I thought there was a patch from Matthew to move u64 to unsigned long
long on all architectures, thus obviating this annoying conversion ...
James
--
It broke, and it needs vast numbers of conversions in the arch code. --
James
---
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e5e7d78..a9f919e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -47,6 +47,7 @@
#include <linux/blkpg.h>
#include <linux/delay.h>
#include <linux/mutex.h>
+#include <linux/string_helpers.h>
#include <asm/uaccess.h>
#include <scsi/scsi.h>
@@ -1429,29 +1430,8 @@ got_data:
*/
sector_size = 512;
}
- {
- /*
- * The msdos fs needs to know the hardware sector size
- * So I have created this table. See ll_rw_blk.c
- * Jacques Gelinas (Jacques@solucorp.qc.ca)
- */
- int hard_sector = sector_size;
- sector_t sz = (sdkp->capacity/2) * (hard_sector/256);
- struct request_queue *queue = sdp->request_queue;
- sector_t mb = sz;
-
- blk_queue_hardsect_size(queue, hard_sector);
- /* avoid 64-bit division on 32-bit platforms */
- sector_div(sz, 625);
- mb -= sz - 974;
- sector_div(mb, 1950);
-
- sd_printk(KERN_NOTICE, sdkp,
- "%llu %d-byte hardware sectors (%llu MB)\n",
- (unsigned long long)sdkp->capacity,
- hard_sector, (unsigned long long)mb);
- }
-
+ blk_queue_hardsect_size(sdp->request_queue, sector_size);
+
/* Rescale capacity to 512-byte units */
if (sector_size == 4096)
sdkp->capacity <<= 3;
@@ -1463,6 +1443,21 @@ got_data:
sdkp->capacity >>= 1;
sdkp->device->sector_size = sector_size;
+ {
+ char cap_str_2[10], cap_str_10[10];
+ u64 sz = sdkp->capacity << 9;
+
+ string_get_size(sz, STRING_UNITS_2, cap_str_2,
+ sizeof(cap_str_2));
+ string_get_size(sz, STRING_UNITS_10, cap_str_10,
+ sizeof(cap_str_10));
+
+ sd_printk(KERN_NOTICE, sdkp,
+ "%llu %d-byte hardware sectors: (%s/%s)\n",
+ (unsigned long long)sdkp->capacity,
+ sector_size, cap_str_10, cap_str_2);
+ }
+
}
/* called with buffer of length 512 */
--
Actually it'll show up as 1629GB, as my patch shows up to 5 digits (not five significant digits, which would require outputting a non-integer value). Isn't outputting "1.75" unnecessarily complicated, and "1750" would work better? -- Simon Arlott --
Hm. I bought a 500GB drive last year: sd 1:0:0:0: [sda] 976773168 512-byte hardware sectors (500108 MB) 512 * 976773168 500107862016 512 * 976773168 / (1024 * 1024 * 1024) 465.76174163818359375000 If we report it as 465GB, we will get questions. Even pretending it's 1024 * 1000 * 1000 doesn't work: 512 * 976773168 / (1000 * 1000 * 1024) 488.38658400000000000000 I think we have to stick with dividing by multiples of 1000. It's what the drive manufacturers do (and I do understand their reasons for doing Yes. Reasonable minds can certainly disagree on this one. I respectfully submit that reporting a 97415MB capacity is less useful than reporting a 97GB capacity. If you look at drive advertisements, they sell 1TB, 1.5TB, 80GB, 750GB, 360GB, ... we should be trying to match that. I'm a little dubious about trying to match the 1.5TB; I think 1500GB is close I understand why networking tools are particularly cautious about this. The line rate (eg 1Gbps) is 1000,000,000 bps, but the amount of traffic reported might well use either binary SI or decimal SI. Reporting the wrong one makes a 7% difference at the GB/GiB level, and that's the kind I think it'll be a while before we get drives of that capacity. ATA is limited to 48 bits for the number of sectors, and while you can increase the sector size (4k is currently planned), that still doesn't bring you close. I suppose you could get ata_ram to have multiple drives and raid-0 them together, but you'd still need to allocate 2^13 of them. scsi_debug can probably go to the full 2^64 sectors. I haven't looked That sounds odd to me; what command are you using to rebuild? -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
I disagree. The difference between advertised and actual capacity is only going to get worse when drive capacity increases further. e.g. "4TB" will only be 3.6TB. Who is going to be asking these questions about the kernel output, but not about whatever else reports 465GB, from 'df' to a GUI showing disk capacity? This actually makes the kernel more consistent with everything This really depends on whether or not you're going for matching advertised capacity. I think the extra digit avoids losing precision too early. If you're intending to divide by 1000 you may as well determined what the advertised capacity would be and handle .5 xB (or even .25, .75, .1 It's not actually a useful value... you'd need to use the byte value, which The sector size can't currently be increased beyond 512 to get around this because it's scaled to 512 to store capacity. (Which my patch then uses to Maybe I was imagining it then... it appeared to be doing that at times while I found all the possible ways to mess up calculating 100000xB and having "0 EB" devices each time. -- Simon Arlott --
Since when did techies start paying attention to marketing statements ? We should be doing what's natural and *consistent*, which is typically dealing with power-of-2. Saying it's one thing at one level, and when the natural use (how many 512 byte sectors get added up later) changes that number in a different level, you've created even more confusion. There's no consistency. As far as user concern - they've seen this discrepancy in the PC/Windows world for years now... Why should we be taking on the task to solve or answer it now ? Throw in different overheads for filesystem metadata loss, volume manager metadata, raid level loss, etc - you'll never be able to explain it all to the user. And personally, I'd rather have natural numbers so that if I do understand the uses, I can do calculations without doing number-base conversions. -- james s --
