Both iSCSI and Fibre Channel make use of 24-bit big-endian values in
frame headers. This patch defines __be24 and __le24 typedefs for a
structure wrapped around a 3-byte array, and macros to convert back and
forth to a 32-bit integer.
The undefs in iscsi_proto.h are because of the different calling
convention for the existing hton24 macro in the iSCSI code. iSCSI will
be converted in a subsequent patch.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
---
include/linux/byteorder.h | 56 ++++++++++++++++++++++++++++++++++++
include/linux/byteorder/generic.h | 57 +++++++++++++++++++++++++++++++++++++
include/linux/types.h | 2 +
include/scsi/iscsi_proto.h | 2 +
4 files changed, 117 insertions(+), 0 deletions(-)
diff --git a/include/linux/byteorder.h b/include/linux/byteorder.h
index 29f002d..e41c17b 100644
--- a/include/linux/byteorder.h
+++ b/include/linux/byteorder.h
@@ -62,6 +62,54 @@
# define __cpu_to_le64(x) ((__force __le64)__swab64(x))
#endif
+/**
+ * __le24_to_cpu - read a 3-byte array as a 24-bit little-endian integer
+ * @x: __le24, a structure wrapper around a 3-byte array
+ *
+ * Evaluates to a __u32 integer type
+ */
+#define __le24_to_cpu(x) \
+({ \
+ __le24 _x = (x); \
+ (__u32) ((_x.b[2] << 16) | (_x.b[1] << 8) | (_x.b[0])); \
+})
+
+/**
+ * __cpu_to_le24 - store a value in a 3-byte array in little-endian format
+ * @x: __u32, there is no checking to ensure only the lower 24 bits are set
+ *
+ * Evaluates to an __le24 compound literal
+ */
+#define __cpu_to_le24(x) \
+({ \
+ __u32 _x = (x); \
+ (__le24) { .b = { _x & 0xff, (_x >> 8) & 0xff, (_x >> 16) & 0xff } }; \
+})
+
+/**
+ * __be24_to_cpu - read a 3-byte array as a 24-bit big-endian integer
+ * @x: __be24, a structure wrapper around a 3-byte array
+ *
+ * Evaluates to a __u32 integer type
+ */
+#define __be24_to_cpu(x) \
+({ \
+ __be24 _x = (x); \
+ (__u32) ((_x.b[0] << 16) | (_x.b[1] << 8) | (_x.b[2])); \
+})
+
+/**
+ ...The calling convention of hton24 is different now, it takes a single u32
as an argument and evaluates to a __be24 lvalue.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
---
drivers/scsi/iscsi_tcp.c | 8 ++++----
drivers/scsi/libiscsi.c | 6 +++---
include/scsi/iscsi_proto.h | 46 ++++++++++++++++++--------------------------
3 files changed, 26 insertions(+), 34 deletions(-)
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 2a2f009..c9d6e96 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -608,11 +608,11 @@ iscsi_solicit_data_init(struct iscsi_conn *conn, struct iscsi_task *task,
hdr->exp_statsn = r2t->exp_statsn;
hdr->offset = cpu_to_be32(r2t->data_offset);
if (r2t->data_length > conn->max_xmit_dlength) {
- hton24(hdr->dlength, conn->max_xmit_dlength);
+ hdr->dlength = hton24(conn->max_xmit_dlength);
r2t->data_count = conn->max_xmit_dlength;
hdr->flags = 0;
} else {
- hton24(hdr->dlength, r2t->data_length);
+ hdr->dlength = hton24(r2t->data_length);
r2t->data_count = r2t->data_length;
hdr->flags = ISCSI_FLAG_CMD_FINAL;
}
@@ -1311,10 +1311,10 @@ iscsi_solicit_data_cont(struct iscsi_conn *conn, struct iscsi_task *task,
new_offset = r2t->data_offset + r2t->sent;
hdr->offset = cpu_to_be32(new_offset);
if (left > conn->max_xmit_dlength) {
- hton24(hdr->dlength, conn->max_xmit_dlength);
+ hdr->dlength = hton24(conn->max_xmit_dlength);
r2t->data_count = conn->max_xmit_dlength;
} else {
- hton24(hdr->dlength, left);
+ hdr->dlength = hton24(left);
r2t->data_count = left;
hdr->flags = ISCSI_FLAG_CMD_FINAL;
}
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 299e075..8432318 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -105,12 +105,12 @@ void iscsi_prep_unsolicit_data_pdu(struct iscsi_task *task,
hdr->offset = cpu_to_be32(task->unsol_offset);
if (task->unsol_count > conn->max_xmit_dlength) ...Looks fine by me. Thanks. Acked-by: Mike Christie <michaelc@cs.wisc.edu> --
This converts the Open-FCoE tree to use the common 24-bit types
Signed-off-by: Chris Leech <christopher.leech@intel.com>
---
drivers/scsi/fcoe/fcoe_dev.c | 6 +++---
drivers/scsi/libfc/fc_exch.c | 24 ++++++++++++------------
drivers/scsi/libfc/fc_fcp.c | 2 +-
drivers/scsi/libfc/fc_lport.c | 8 ++++----
drivers/scsi/libfc/fc_ns.c | 14 +++++++-------
drivers/scsi/libfc/fc_rport.c | 4 ++--
include/scsi/fc/fc_els.h | 18 +++++++++---------
include/scsi/fc/fc_fcoe.h | 6 ++----
include/scsi/fc/fc_fs.h | 8 ++++----
include/scsi/fc/fc_gs.h | 2 +-
include/scsi/fc/fc_ns.h | 6 +++---
include/scsi/libfc/libfc.h | 13 -------------
12 files changed, 48 insertions(+), 63 deletions(-)
diff --git a/drivers/scsi/fcoe/fcoe_dev.c b/drivers/scsi/fcoe/fcoe_dev.c
index d5a354f..d5fc479 100644
--- a/drivers/scsi/fcoe/fcoe_dev.c
+++ b/drivers/scsi/fcoe/fcoe_dev.c
@@ -233,7 +233,7 @@ int fcoe_xmit(struct fc_lport *lp, struct fc_frame *fp)
* MAC in this case would have been set by receving the
* FLOGI.
*/
- fc_fcoe_set_mac(fc->data_src_addr, fh->fh_s_id);
+ fc_fcoe_set_mac(fc->data_src_addr, &fh->fh_s_id);
fc->flogi_progress = 0;
}
}
@@ -311,7 +311,7 @@ int fcoe_xmit(struct fc_lport *lp, struct fc_frame *fp)
skb->ip_summed = CHECKSUM_NONE;
eh = (struct ethhdr *)skb_push(skb, hlen + sizeof(struct ethhdr));
if (fc->address_mode == FCOE_FCOUI_ADDR_MODE)
- fc_fcoe_set_mac(eh->h_dest, fh->fh_d_id);
+ fc_fcoe_set_mac(eh->h_dest, &fh->fh_d_id);
else
/* insert GW address */
memcpy(eh->h_dest, fc->dest_addr, ETH_ALEN);
@@ -516,7 +516,7 @@ static void fcoe_recv_flogi(struct fcoe_softc *fc, struct fc_frame *fp, u8 *sa)
if (compare_ether_addr(fc->data_src_addr, (u8[6]) { 0 }))
dev_unicast_delete(fc->real_dev, fc->data_src_addr,
ETH_ALEN);
- fc_fcoe_set_mac(fc->data_src_addr, fh->fh_d_id);
+ fc_fcoe_set_mac(fc->data_src_addr, &fh->fh_d_id);
...I like what this patch wants to accomplish, but I disagree with the
implementation.
First why is the double definition, one in include/linux/byteorder.h
and one in include/linux/byteorder/generic.h ?
Second and most important, in both these files all routines are inline's
not MACROs, and rightly so. There is no place for a macro and the MACRO
works bad for these. One - the definition of a local variable in a {} scope
in the middle of anywhere. Second - type safety.
I CC: Harvey Harrison which did lots of work in these area's.
For me this patch is totally unacceptable.
Thanks for working on this
--
Because there are currently two byteorder/swab implementations in the kernel. As you said Harvey Harrison has done a lot of work in this area, but right now only the arm and avr32 architectures make use of the new linux/byteorder.h. I could put the 24-bit support in it's own I understand the benefits of inline functions, but I disagree that a macro is a bad choice in this case. An inline implementation of cpu_to_be24/hton24 would require a different interface than the rest of the byteorder functions, looking more like the existing iSCSI hton24 macro by taking a pointer to a memory location to store the result in. By using a macro that expands to a compound literal, I was able to maintain the same calling convention of X = cpu_to_be24(Y); Attempting to do so with an inline results in a function definition that returns a structure by value, and even when inlined generates That was done in order to only evaluate the macro operand only once, making it safe to call with an expression that may have side effects. Macros that create scoped variables are all over the place, like min, max and container_of. I consider it necessary in creating a good macro implementation for this functionality, and my motivation for Actually, the assignment to a locally scoped variable gives just as much type checking as an inline would :-) --
I wanted to see what you're saying and tried this test code below:
<test.c>
#include "stdio.h"
typedef unsigned char __u8;
typedef unsigned int __u32;
typedef struct { __u8 b[3]; } __be24, __le24;
#define __be24_to_cpu(x) \
({ \
__be24 _x = (x); \
(__u32) ((_x.b[0] << 16) | (_x.b[1] << 8) | (_x.b[2])); \
})
static inline __u32 be24_to_cpu(__be24 _x)
{
return (__u32) ((_x.b[0] << 16) | (_x.b[1] << 8) | (_x.b[2]));
}
#define __cpu_to_be24(x) \
({ \
__u32 _x = (x); \
(__be24) { .b = { (_x >> 16) & 0xff, (_x >> 8) & 0xff, _x & 0xff } }; \
})
static inline __be24 cpu_to_be24(__u32 _x)
{
__be24 be = {
.b = { (_x >> 16) & 0xff, (_x >> 8) & 0xff, _x & 0xff }
};
return be;
}
int test1(__u32 r)
{
union {
__be24 be17;
__u32 be_as_u;
} be = {.be_as_u = 0};
be.be17 = cpu_to_be24(r);
__u32 cpu = be24_to_cpu(be.be17) + 1;
printf("cpu=%x be=%x\n",cpu, be.be_as_u);
return cpu;
}
int test2(__u32 r)
{
union {
__be24 be17;
__u32 be_as_u;
} be = {.be_as_u = 0};
be.be17 = __cpu_to_be24(r);
__u32 cpu = __be24_to_cpu(be.be17) + 1;
printf("cpu=%x be=%x\n",cpu, be.be_as_u);
return cpu;
}
int main()
{
__u32 r = rand();
test1(r);
test2(r);
return 0;
}
</test.c>
I compile it like this:
$ gcc -O1 test.c -o test
if I
$ gdb test
gdb> disass test1
gdb> disass test2
I get the exact same assembly.
What am I doing wrong ?
$ gcc --version
gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)
Boaz
--
Interesting. My earlier comparisons were done in the kernel and comparing the resulting libicssi.o and iscsi_tcp.o. Let me look into this more. Chris --
Both iSCSI and Fibre Channel make use of 24-bit big-endian values in
frame headers. This patch defines __be24 and __le24 typedefs for a
structure wrapped around a 3-byte array, and functions to convert back and
forth to a 32-bit integer.
The undefs in iscsi_proto.h are because of the different calling
convention for the existing hton24 macro in the iSCSI code. iSCSI will
be converted in a subsequent patch.
Changes from last posting:
Switched from preprocessor macros to inline functions. The generated assembly
is the same with gcc 4.1.2 as long as the function is actually inlined. I
applied the __always_inline attribute to all of these, after seeing that with
one of my test kernel configurations they were not being inlined without it
and the generated instructions in the iSCSI code could be considered a
regression from the existing macro.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
---
include/linux/byteorder.h | 44 ++++++++++++++++++++++++++++++++++++
include/linux/byteorder/generic.h | 45 +++++++++++++++++++++++++++++++++++++
include/linux/types.h | 2 ++
include/scsi/iscsi_proto.h | 2 ++
4 files changed, 93 insertions(+), 0 deletions(-)
diff --git a/include/linux/byteorder.h b/include/linux/byteorder.h
index 29f002d..b48b88f 100644
--- a/include/linux/byteorder.h
+++ b/include/linux/byteorder.h
@@ -62,6 +62,42 @@
# define __cpu_to_le64(x) ((__force __le64)__swab64(x))
#endif
+/**
+ * __le24_to_cpu - read a 3-byte array as a 24-bit little-endian integer
+ * @x: __le24, a structure wrapper around a 3-byte array
+ */
+static __always_inline __u32 __le24_to_cpu(const __le24 x)
+{
+ return (__u32) ((x.b[2] << 16) | (x.b[1] << 8) | (x.b[0]));
+}
+
+/**
+ * __cpu_to_le24 - store a value in a 3-byte array in little-endian format
+ * @x: __u32, there is no checking to ensure only the lower 24 bits are set
+ */
+static __always_inline __le24 __cpu_to_le24(const __u32 x)
+{
+ return (__le24) { { x & ...<snip> Thanks! Personally I like this much better. Sorry to be nit picking but again I hate that double implementation thing. From what I could see, and considering a few of the ARCHs you mentioned, it looks to me like <linux/swab.h> is common to the two implementations, (old and new). Perhaps put the "__xxx" implementations in swab.h and only the xxx => __xxx switching in the double byteorder headers, which fits better with the surrounding style of these headers. Thanks Boaz --
<snip> Chris please don't cross-post to a mailing list that is members only. I press reply-all when on linux-scsi or linux-kernel, and then I get accused of pocking where I don't belong. So just don't do that. And my advise to the devel@open-fcoe.org mailing list. Drop the members only. It is useless with it. I know because I made the same mistake with my osd-dev@open-osd.org. Believe me the span-filter at mailman is fine and the span that will get through is not that bad. Boaz --
JFS already defines an __le24, see fs/jfs/endian24.h. Please try to cover it, too or at least make sure you don't break it. --
Chris,
This patch takes care of jfs. Please add it to your patchset.
Thanks,
Shaggy
24-bit types: Convert jfs to use the common 24-bit types
This patch cleans up some of the ugliness in the jfs headers and
uses the common 24-bit types instead of its private definitions.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
diff --git a/fs/jfs/endian24.h b/fs/jfs/endian24.h
deleted file mode 100644
index fa92f7f..0000000
--- a/fs/jfs/endian24.h
+++ /dev/null
@@ -1,49 +0,0 @@
-/*
- * Copyright (C) International Business Machines Corp., 2001
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
- * the GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
- */
-#ifndef _H_ENDIAN24
-#define _H_ENDIAN24
-
-/*
- * endian24.h:
- *
- * Endian conversion for 24-byte data
- *
- */
-#define __swab24(x) \
-({ \
- __u32 __x = (x); \
- ((__u32)( \
- ((__x & (__u32)0x000000ffUL) << 16) | \
- (__x & (__u32)0x0000ff00UL) | \
- ((__x & (__u32)0x00ff0000UL) >> 16) )); \
-})
-
-#if (defined(__KERNEL__) && defined(__LITTLE_ENDIAN)) || (defined(__BYTE_ORDER) && (__BYTE_ORDER == __LITTLE_ENDIAN))
- #define __cpu_to_le24(x) ((__u32)(x))
- #define __le24_to_cpu(x) ((__u32)(x))
-#else
- #define __cpu_to_le24(x) __swab24(x)
- #define __le24_to_cpu(x) __swab24(x)
-#endif
-
-#ifdef __KERNEL__
- #define cpu_to_le24 ...Why is the difference from below definition. That is the
Is this stuff on-the-wire?
Do you need a:
and:
Note that before the :24 bit-field was kept packed but now
with the use of struct at the __le24 definition it might
choose to pad them.
Chris you might want to change the definitions at linux/types.h
to:
typedef struct { __u8 b[3]; } __be24, __le24 __packed;
With gcc it will not help with the proceeding fields, and the
containing struct will need it's own "__packed" declaration
but it will keep it packed with previous fields.
Just my $0.017
Boaz
--
Answered elsewhere, but this is host-endian. I plan to kill this I'm not convinced that this is needed. Does the compiler do any padding Maybe, but I can't get the compiler to add any padding playing around with variants of these structures. I've tested a simple program on both Shaggy -- David Kleikamp IBM Linux Technology Center --
You have an "__le32 addr2" followed, it might want too in rare cases. But for me this is also a a declaration issue and a readability issue. I'm telling the compiler, don't mess with my structure, this needs to be constant whatever the compiler is. In C the compiler is even allowed to change fields order if it wants to. Why the guess work, __packed and be done with it. And it's also a readability issue. Look above lxd_t vs pxd_t. I can't know that one is in memory and one is on disk. I have to ask questions and make wrong remarks. But if the later had a __packed on it, then there are no more questions. __packed for me is a statement for both the programmer and compiler that says: "This stuff will be seen externally of the machine. It must Boaz --
I haven't seen padding added simply because of a nested structure boundary, but I'm not up on all the ABIs for the different architectures. Obviously a containing structure would want to have the 24-bit type adjacent to an 8-bit type, or have it's own packed attribute if needed. It shouldn't hurt, in this case the members shouldn't be expected to have more than byte alignment anyway, but I can't see how it would help. If there's a particular arch that might be a problem I'm happy to look into it, but I don't want to start throwing packed attributes around just in case. Chris --
Shouldn't len here be changed to a __le24? I think this just changed the size of lxd_t by a byte. - Chris --
Never mind, I see that it's a host order field. And presently surprised to see that gcc combines the 24-bit bitfield with the following u8. Chris --
It does because these are all bytes. on x86. But this is not guarantied for all ARCHs and machine-word-sizes. In any way this should be consistent with the rest of the file. Please see my other reply about packing of structures. If this is on-the-wire then there are problems. Boaz --
Right. I just made a note to throw away this ridiculous structure and replace it with something sane. It's only used in-memory and there's no reason not to have an unsigned long long for offset, and an int for len. This patch didn't seem to be the right place to fix that though. -- David Kleikamp IBM Linux Technology Center --
