Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- drivers/media/dvb/frontends/or51132.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/media/dvb/frontends/or51132.c b/drivers/media/dvb/frontends/or51132.c index c7b5785..5ed3254 100644 --- a/drivers/media/dvb/frontends/or51132.c +++ b/drivers/media/dvb/frontends/or51132.c @@ -126,7 +126,7 @@ static int or51132_readreg(struct or51132_state *state, u8 reg) reg, err); return -EREMOTEIO; } - return le16_to_cpup((u16*)buf); + return buf[0] | (buf[1] << 8); } static int or51132_load_firmware (struct dvb_frontend* fe, const struct firmware *fw) @@ -140,9 +140,9 @@ static int or51132_load_firmware (struct dvb_frontend* fe, const struct firmware dprintk("Firmware is %Zd bytes\n",fw->size); /* Get size of firmware A and B */ - firmwareAsize = le32_to_cpu(*((u32*)fw->data)); + firmwareAsize = le32_to_cpu(*((__le32*)fw->data)); dprintk("FirmwareA is %i bytes\n",firmwareAsize); - firmwareBsize = le32_to_cpu(*((u32*)(fw->data+4))); + firmwareBsize = le32_to_cpu(*((__le32*)(fw->data+4))); dprintk("FirmwareB is %i bytes\n",firmwareBsize); /* Upload firmware */ -- 1.5.3.GIT --
firmwareBsize = le32_to_cpup((__le32 *)(fw->data + 4)); Cheers, Harvey --
... and the point of that would be? FWIW, I really don't like the ...p() forms - they are hard to distinguish from normal ones visually and any possible performance benefit is too small for most of the uses. IOW, it's mostly redundant API. --
I sent a patchset getting rid of the p variants earlier today, Dave Miller made a good point that some arches have well optimized versions of these as they have specific machine instructions they can use. Agreed that I don't much like them either, was thinking of adding a u32 get_le32(__le32 *ptr) type api and get rid of the le32_to_cpup api. Cheers, Harvey --
Thus the "mostly" part - on hot paths the microoptimization is worth the trouble. None of the places in question is on such paths... --
Perhaps better code generation? I didn't look what buf[0] | (buf[1] << 8) generates, but on x86, get_unaligned_le16() should just boil down to *(unsigned short *)buf which is certainly going to mostly generate better code (smaller, faster) than at least the most likely and straigthforward of the former. (Just checked. Gcc is not smart enough to make those two loads plus a shift be one word load. At least not my version) Linus --
FWIW, I wonder how they really compare on misaligned and whether it would make sense for gcc to try and generate a single load on targets that are known to allow that... Hell knows; I still find explicit variant more readable in this case and AFAICS it's not a critical path... --
It would almost certainly help on x86. The cost of an unaligned integer access that doesn't cross a cache-fetch boundary (8 bytes on older CPU's, 16 or 32 bytes on newer ones) is zero, last I saw. IOW, there are misaligned cases that have a higher cost, but they are pretty rare, and especially so with small data and modern CPU's. So no disadvantage for 95% of all cases, and the advantage of doing just a single instruction, rather than four (2 zero-extending loads, a shift and an add/or, with data dependencies on most of them). Linus --
