[PATCH] 24-bit types: typedef and functions for accessing 3-byte arrays as integers

Previous thread: Block: Trouble with kobject initialisation for blk_cmd_filter by Elias Oltmanns on Friday, September 5, 2008 - 9:48 am. (10 messages)

Next thread: 2.6.27-rc5-git7 iwl3945: driver still has receiver lockups and is worse than before by jmerkey on Friday, September 5, 2008 - 9:29 am. (1 message)
From: Chris Leech
Date: Friday, September 5, 2008 - 9:57 am

Both iSCSI and Fibre Channel make use of 24-bit big-endian values in
frame headers.  This patch defines __be24 and __le24 typedefs for a
structure wrapped around a 3-byte array, and macros to convert back and
forth to a 32-bit integer.

The undefs in iscsi_proto.h are because of the different calling
convention for the existing hton24 macro in the iSCSI code.  iSCSI will
be converted in a subsequent patch.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
---

 include/linux/byteorder.h         |   56 ++++++++++++++++++++++++++++++++++++
 include/linux/byteorder/generic.h |   57 +++++++++++++++++++++++++++++++++++++
 include/linux/types.h             |    2 +
 include/scsi/iscsi_proto.h        |    2 +
 4 files changed, 117 insertions(+), 0 deletions(-)


diff --git a/include/linux/byteorder.h b/include/linux/byteorder.h
index 29f002d..e41c17b 100644
--- a/include/linux/byteorder.h
+++ b/include/linux/byteorder.h
@@ -62,6 +62,54 @@
 # define __cpu_to_le64(x) ((__force __le64)__swab64(x))
 #endif
 
+/**
+ * __le24_to_cpu - read a 3-byte array as a 24-bit little-endian integer
+ * @x: __le24, a structure wrapper around a 3-byte array
+ *
+ * Evaluates to a __u32 integer type
+ */
+#define __le24_to_cpu(x) \
+({ \
+	__le24 _x = (x); \
+	(__u32) ((_x.b[2] << 16) | (_x.b[1] << 8) | (_x.b[0])); \
+})
+
+/**
+ * __cpu_to_le24 - store a value in a 3-byte array in little-endian format
+ * @x: __u32, there is no checking to ensure only the lower 24 bits are set
+ *
+ * Evaluates to an __le24 compound literal
+ */
+#define __cpu_to_le24(x) \
+({ \
+	__u32 _x = (x); \
+	(__le24) { .b = { _x & 0xff, (_x >> 8) & 0xff, (_x >> 16) & 0xff } }; \
+})
+
+/**
+ * __be24_to_cpu - read a 3-byte array as a 24-bit big-endian integer
+ * @x: __be24, a structure wrapper around a 3-byte array
+ *
+ * Evaluates to a __u32 integer type
+ */
+#define __be24_to_cpu(x) \
+({ \
+	__be24 _x = (x); \
+	(__u32) ((_x.b[0] << 16) | (_x.b[1] << 8) | (_x.b[2])); \
+})
+
+/**
+ ...
From: Chris Leech
Date: Friday, September 5, 2008 - 9:57 am

The calling convention of hton24 is different now, it takes a single u32
as an argument and evaluates to a __be24 lvalue.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
---

 drivers/scsi/iscsi_tcp.c   |    8 ++++----
 drivers/scsi/libiscsi.c    |    6 +++---
 include/scsi/iscsi_proto.h |   46 ++++++++++++++++++--------------------------
 3 files changed, 26 insertions(+), 34 deletions(-)


diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 2a2f009..c9d6e96 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -608,11 +608,11 @@ iscsi_solicit_data_init(struct iscsi_conn *conn, struct iscsi_task *task,
 	hdr->exp_statsn = r2t->exp_statsn;
 	hdr->offset = cpu_to_be32(r2t->data_offset);
 	if (r2t->data_length > conn->max_xmit_dlength) {
-		hton24(hdr->dlength, conn->max_xmit_dlength);
+		hdr->dlength = hton24(conn->max_xmit_dlength);
 		r2t->data_count = conn->max_xmit_dlength;
 		hdr->flags = 0;
 	} else {
-		hton24(hdr->dlength, r2t->data_length);
+		hdr->dlength = hton24(r2t->data_length);
 		r2t->data_count = r2t->data_length;
 		hdr->flags = ISCSI_FLAG_CMD_FINAL;
 	}
@@ -1311,10 +1311,10 @@ iscsi_solicit_data_cont(struct iscsi_conn *conn, struct iscsi_task *task,
 	new_offset = r2t->data_offset + r2t->sent;
 	hdr->offset = cpu_to_be32(new_offset);
 	if (left > conn->max_xmit_dlength) {
-		hton24(hdr->dlength, conn->max_xmit_dlength);
+		hdr->dlength = hton24(conn->max_xmit_dlength);
 		r2t->data_count = conn->max_xmit_dlength;
 	} else {
-		hton24(hdr->dlength, left);
+		hdr->dlength = hton24(left);
 		r2t->data_count = left;
 		hdr->flags = ISCSI_FLAG_CMD_FINAL;
 	}
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 299e075..8432318 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -105,12 +105,12 @@ void iscsi_prep_unsolicit_data_pdu(struct iscsi_task *task,
 	hdr->offset = cpu_to_be32(task->unsol_offset);
 
 	if (task->unsol_count > conn->max_xmit_dlength) ...
From: Mike Christie
Date: Friday, September 5, 2008 - 10:03 am

Looks fine by me. Thanks.

Acked-by: Mike Christie <michaelc@cs.wisc.edu>
--

From: Chris Leech
Date: Friday, September 5, 2008 - 9:57 am

This converts the Open-FCoE tree to use the common 24-bit types

Signed-off-by: Chris Leech <christopher.leech@intel.com>
---

 drivers/scsi/fcoe/fcoe_dev.c  |    6 +++---
 drivers/scsi/libfc/fc_exch.c  |   24 ++++++++++++------------
 drivers/scsi/libfc/fc_fcp.c   |    2 +-
 drivers/scsi/libfc/fc_lport.c |    8 ++++----
 drivers/scsi/libfc/fc_ns.c    |   14 +++++++-------
 drivers/scsi/libfc/fc_rport.c |    4 ++--
 include/scsi/fc/fc_els.h      |   18 +++++++++---------
 include/scsi/fc/fc_fcoe.h     |    6 ++----
 include/scsi/fc/fc_fs.h       |    8 ++++----
 include/scsi/fc/fc_gs.h       |    2 +-
 include/scsi/fc/fc_ns.h       |    6 +++---
 include/scsi/libfc/libfc.h    |   13 -------------
 12 files changed, 48 insertions(+), 63 deletions(-)


diff --git a/drivers/scsi/fcoe/fcoe_dev.c b/drivers/scsi/fcoe/fcoe_dev.c
index d5a354f..d5fc479 100644
--- a/drivers/scsi/fcoe/fcoe_dev.c
+++ b/drivers/scsi/fcoe/fcoe_dev.c
@@ -233,7 +233,7 @@ int fcoe_xmit(struct fc_lport *lp, struct fc_frame *fp)
 			 * MAC in this case would have been set by receving the
 			 * FLOGI.
 			 */
-			fc_fcoe_set_mac(fc->data_src_addr, fh->fh_s_id);
+			fc_fcoe_set_mac(fc->data_src_addr, &fh->fh_s_id);
 			fc->flogi_progress = 0;
 		}
 	}
@@ -311,7 +311,7 @@ int fcoe_xmit(struct fc_lport *lp, struct fc_frame *fp)
 	skb->ip_summed = CHECKSUM_NONE;
 	eh = (struct ethhdr *)skb_push(skb, hlen + sizeof(struct ethhdr));
 	if (fc->address_mode == FCOE_FCOUI_ADDR_MODE)
-		fc_fcoe_set_mac(eh->h_dest, fh->fh_d_id);
+		fc_fcoe_set_mac(eh->h_dest, &fh->fh_d_id);
 	else
 		/* insert GW address */
 		memcpy(eh->h_dest, fc->dest_addr, ETH_ALEN);
@@ -516,7 +516,7 @@ static void fcoe_recv_flogi(struct fcoe_softc *fc, struct fc_frame *fp, u8 *sa)
 		if (compare_ether_addr(fc->data_src_addr, (u8[6]) { 0 }))
 			dev_unicast_delete(fc->real_dev, fc->data_src_addr,
 					   ETH_ALEN);
-		fc_fcoe_set_mac(fc->data_src_addr, fh->fh_d_id);
+		fc_fcoe_set_mac(fc->data_src_addr, &fh->fh_d_id);
 ...
From: Boaz Harrosh
Date: Sunday, September 7, 2008 - 2:36 am

I like what this patch wants to accomplish, but I disagree with the
implementation.

First why is the double definition, one in include/linux/byteorder.h
and one in include/linux/byteorder/generic.h ?

Second and most important, in both these files all routines are inline's
not MACROs, and rightly so. There is no place for a macro and the MACRO
works bad for these. One - the definition of a local variable in a {} scope
in the middle of anywhere. Second - type safety.

I CC: Harvey Harrison which did lots of work in these area's.

For me this patch is totally unacceptable.

Thanks for working on this

--

From: Chris Leech
Date: Sunday, September 7, 2008 - 8:56 am

Because there are currently two byteorder/swab implementations in the
kernel.  As you said Harvey Harrison has done a lot of work in this
area, but right now only the arm and avr32 architectures make use of
the new linux/byteorder.h.  I could put the 24-bit support in it's own

I understand the benefits of inline functions, but I disagree that a
macro is a bad choice in this case.  An inline implementation of
cpu_to_be24/hton24 would require a different interface than the rest
of the byteorder functions, looking more like the existing iSCSI
hton24 macro by taking a pointer to a memory location to store the
result in.

By using a macro that expands to a compound literal, I was able to
maintain the same calling convention of X = cpu_to_be24(Y);
Attempting to do so with an inline results in a function definition
that returns a structure by value, and even when inlined generates

That was done in order to only evaluate the macro operand only once,
making it safe to call with an expression that may have side effects.
Macros that create scoped variables are all over the place, like min,
max and container_of.  I consider it necessary in creating a good
macro implementation for this functionality, and my motivation for

Actually, the assignment to a locally scoped variable gives just as
much type checking as an inline would :-)

--

From: Boaz Harrosh
Date: Sunday, September 7, 2008 - 10:21 am

I wanted to see what you're saying and tried this test code below:

<test.c>
#include "stdio.h"

typedef unsigned char __u8;
typedef unsigned int __u32;

typedef struct { __u8 b[3]; } __be24, __le24;

#define __be24_to_cpu(x) \
({ \
	__be24 _x = (x); \
	(__u32) ((_x.b[0] << 16) | (_x.b[1] << 8) | (_x.b[2])); \
})

static inline __u32 be24_to_cpu(__be24 _x)
{
	return (__u32) ((_x.b[0] << 16) | (_x.b[1] << 8) | (_x.b[2]));
}

#define __cpu_to_be24(x) \
({ \
	__u32 _x = (x); \
	(__be24) { .b = { (_x >> 16) & 0xff, (_x >> 8) & 0xff, _x & 0xff } }; \
})

static inline __be24 cpu_to_be24(__u32 _x)
{
	__be24 be = {
		.b = { (_x >> 16) & 0xff, (_x >> 8) & 0xff, _x & 0xff } 
	};
	return be;
}

int test1(__u32 r)
{
	union {
		__be24 be17;
		__u32 be_as_u;
	} be = {.be_as_u = 0};

	be.be17 = cpu_to_be24(r);
	__u32 cpu = be24_to_cpu(be.be17) + 1;

	printf("cpu=%x be=%x\n",cpu, be.be_as_u);
	return cpu;
}

int test2(__u32 r)
{
	union {
		__be24 be17;
		__u32 be_as_u;
	} be = {.be_as_u = 0};

	be.be17 = __cpu_to_be24(r);
	__u32 cpu = __be24_to_cpu(be.be17) + 1;

	printf("cpu=%x be=%x\n",cpu, be.be_as_u);
	return cpu;
}

int main()
{
	__u32 r = rand();
	test1(r);
	test2(r);

	return 0;
}
</test.c>

I compile it like this:
$ gcc -O1 test.c -o test
if I
$ gdb test
gdb> disass test1
gdb> disass test2

I get the exact same assembly.

What am I doing wrong ?

$ gcc --version
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)

Boaz
--

From: Chris Leech
Date: Sunday, September 7, 2008 - 10:52 am

Interesting.  My earlier comparisons were done in the kernel and
comparing the resulting libicssi.o and iscsi_tcp.o.  Let me look into
this more.

Chris
--

From: Chris Leech
Date: Tuesday, September 9, 2008 - 3:59 pm

Both iSCSI and Fibre Channel make use of 24-bit big-endian values in
frame headers.  This patch defines __be24 and __le24 typedefs for a
structure wrapped around a 3-byte array, and functions to convert back and
forth to a 32-bit integer.

The undefs in iscsi_proto.h are because of the different calling
convention for the existing hton24 macro in the iSCSI code.  iSCSI will
be converted in a subsequent patch.

Changes from last posting:

Switched from preprocessor macros to inline functions.  The generated assembly
is the same with gcc 4.1.2 as long as the function is actually inlined.  I
applied the __always_inline attribute to all of these, after seeing that with
one of my test kernel configurations they were not being inlined without it
and the generated instructions in the iSCSI code could be considered a
regression from the existing macro.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
---

 include/linux/byteorder.h         |   44 ++++++++++++++++++++++++++++++++++++
 include/linux/byteorder/generic.h |   45 +++++++++++++++++++++++++++++++++++++
 include/linux/types.h             |    2 ++
 include/scsi/iscsi_proto.h        |    2 ++
 4 files changed, 93 insertions(+), 0 deletions(-)


diff --git a/include/linux/byteorder.h b/include/linux/byteorder.h
index 29f002d..b48b88f 100644
--- a/include/linux/byteorder.h
+++ b/include/linux/byteorder.h
@@ -62,6 +62,42 @@
 # define __cpu_to_le64(x) ((__force __le64)__swab64(x))
 #endif
 
+/**
+ * __le24_to_cpu - read a 3-byte array as a 24-bit little-endian integer
+ * @x: __le24, a structure wrapper around a 3-byte array
+ */
+static __always_inline __u32 __le24_to_cpu(const __le24 x)
+{
+	return (__u32) ((x.b[2] << 16) | (x.b[1] << 8) | (x.b[0]));
+}
+
+/**
+ * __cpu_to_le24 - store a value in a 3-byte array in little-endian format
+ * @x: __u32, there is no checking to ensure only the lower 24 bits are set
+ */
+static __always_inline __le24 __cpu_to_le24(const __u32 x)
+{
+	return (__le24) { { x & ...
From: Boaz Harrosh
Date: Wednesday, September 10, 2008 - 5:57 am

<snip>

Thanks! Personally I like this much better.

Sorry to be nit picking but again I hate that double implementation thing.
From what I could see, and considering a few of the ARCHs you mentioned,
it looks to me like <linux/swab.h> is common to the two implementations,
(old and new). Perhaps put the "__xxx" implementations in swab.h and
only the xxx => __xxx switching in the double byteorder headers, which
fits better with the surrounding style of these headers. 

Thanks
Boaz
--

From: Boaz Harrosh
Date: Sunday, September 7, 2008 - 2:57 am

<snip>

Chris please don't cross-post to a mailing list that is members only.
I press reply-all when on linux-scsi or linux-kernel, and then I get
accused of pocking where I don't belong. So just don't do that.

And my advise to the devel@open-fcoe.org mailing list. Drop the
members only. It is useless with it. I know because I made the same
mistake with my osd-dev@open-osd.org. Believe me the span-filter
at mailman is fine and the span that will get through is not that bad.

Boaz
--

From: Christoph Hellwig
Date: Wednesday, September 10, 2008 - 7:07 am

JFS already defines an __le24, see fs/jfs/endian24.h.  Please try to
cover it, too or at least make sure you don't break it.

--

From: Dave Kleikamp
Date: Wednesday, September 10, 2008 - 8:40 am

Chris,
This patch takes care of jfs.  Please add it to your patchset.

Thanks,
Shaggy

24-bit types: Convert jfs to use the common 24-bit types

This patch cleans up some of the ugliness in the jfs headers and
uses the common 24-bit types instead of its private definitions.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>

diff --git a/fs/jfs/endian24.h b/fs/jfs/endian24.h
deleted file mode 100644
index fa92f7f..0000000
--- a/fs/jfs/endian24.h
+++ /dev/null
@@ -1,49 +0,0 @@
-/*
- *   Copyright (C) International Business Machines Corp., 2001
- *
- *   This program is free software;  you can redistribute it and/or modify
- *   it under the terms of the GNU General Public License as published by
- *   the Free Software Foundation; either version 2 of the License, or
- *   (at your option) any later version.
- *
- *   This program is distributed in the hope that it will be useful,
- *   but WITHOUT ANY WARRANTY;  without even the implied warranty of
- *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
- *   the GNU General Public License for more details.
- *
- *   You should have received a copy of the GNU General Public License
- *   along with this program;  if not, write to the Free Software
- *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
- */
-#ifndef _H_ENDIAN24
-#define	_H_ENDIAN24
-
-/*
- *	endian24.h:
- *
- * Endian conversion for 24-byte data
- *
- */
-#define __swab24(x) \
-({ \
-	__u32 __x = (x); \
-	((__u32)( \
-		((__x & (__u32)0x000000ffUL) << 16) | \
-		 (__x & (__u32)0x0000ff00UL)	    | \
-		((__x & (__u32)0x00ff0000UL) >> 16) )); \
-})
-
-#if (defined(__KERNEL__) && defined(__LITTLE_ENDIAN)) || (defined(__BYTE_ORDER) && (__BYTE_ORDER == __LITTLE_ENDIAN))
-	#define __cpu_to_le24(x) ((__u32)(x))
-	#define __le24_to_cpu(x) ((__u32)(x))
-#else
-	#define __cpu_to_le24(x) __swab24(x)
-	#define __le24_to_cpu(x) __swab24(x)
-#endif
-
-#ifdef __KERNEL__
-	#define cpu_to_le24 ...
From: Boaz Harrosh
Date: Wednesday, September 10, 2008 - 9:11 am

Why is the difference from below definition. That is the

Is this stuff on-the-wire?
Do you need a:
and:

Note that before the :24 bit-field was kept packed but now
with the use of struct at the __le24 definition it might
choose to pad them.

Chris you might want to change the definitions at linux/types.h
to:

typedef struct { __u8 b[3]; } __be24, __le24 __packed;

With gcc it will not help with the proceeding fields, and the
containing struct will need it's own "__packed" declaration
but it will keep it packed with previous fields.

Just my $0.017
Boaz
--

From: Boaz Harrosh
Date: Wednesday, September 10, 2008 - 9:25 am

Can this be fixed please ?
--

From: Dave Kleikamp
Date: Wednesday, September 10, 2008 - 12:20 pm

Answered elsewhere, but this is host-endian.  I plan to kill this


I'm not convinced that this is needed.  Does the compiler do any padding

Maybe, but I can't get the compiler to add any padding playing around
with variants of these structures.  I've tested a simple program on both

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center

--

From: Boaz Harrosh
Date: Thursday, September 11, 2008 - 1:30 am

You have an "__le32 addr2" followed, it might want too in rare cases.
But for me this is also a a declaration issue and a readability issue.

I'm telling the compiler, don't mess with my structure, this needs
to be constant whatever the compiler is. In C the compiler is even
allowed to change fields order if it wants to. Why the guess work,
__packed and be done with it.

And it's also a readability issue. Look above lxd_t vs pxd_t. I can't
know that one is in memory and one is on disk. I have to ask questions
and make wrong remarks. But if the later had a __packed on it, then
there are no more questions.

__packed for me is a statement for both the programmer and compiler
that says: "This stuff will be seen externally of the machine. It must

Boaz
--

From: Chris Leech
Date: Wednesday, September 10, 2008 - 6:51 pm

I haven't seen padding added simply because of a nested structure
boundary, but I'm not up on all the ABIs for the different
architectures.  Obviously a containing structure would want to have
the 24-bit type adjacent to an 8-bit type, or have it's own packed
attribute if needed.

It shouldn't hurt, in this case the members shouldn't be expected to
have more than byte alignment anyway, but I can't see how it would
help.

If there's a particular arch that might be a problem I'm happy to look
into it, but I don't want to start throwing packed attributes around
just in case.

Chris
--

From: Chris Leech
Date: Wednesday, September 10, 2008 - 9:25 am

Shouldn't len here be changed to a __le24?  I think this just changed
the size of lxd_t by a byte.

- Chris
--

From: Chris Leech
Date: Wednesday, September 10, 2008 - 10:45 am

Never mind, I see that it's a host order field.  And presently surprised
to see that gcc combines the 24-bit bitfield with the following u8.

Chris
--

From: Boaz Harrosh
Date: Wednesday, September 10, 2008 - 11:04 am

It does because these are all bytes. on x86. But this is not guarantied
for all ARCHs and machine-word-sizes. In any way this should be consistent
with the rest of the file.

Please see my other reply about packing of structures. If this is
on-the-wire then there are problems.

Boaz
--

From: Dave Kleikamp
Date: Wednesday, September 10, 2008 - 11:23 am

Right.  I just made a note to throw away this ridiculous structure and
replace it with something sane.  It's only used in-memory and there's no
reason not to have an unsigned long long for offset, and an int for len.
This patch didn't seem to be the right place to fix that though.

-- 
David Kleikamp
IBM Linux Technology Center

--

Previous thread: Block: Trouble with kobject initialisation for blk_cmd_filter by Elias Oltmanns on Friday, September 5, 2008 - 9:48 am. (10 messages)

Next thread: 2.6.27-rc5-git7 iwl3945: driver still has receiver lockups and is worse than before by jmerkey on Friday, September 5, 2008 - 9:29 am. (1 message)