On Mon, 2008-02-04 at 10:29 -0800, Linus Torvalds wrote:

The iSER spec (RFC 5046) says the following about direct data placement in the TCP case:

" Out-of-order TCP segments in the Traditional iSCSI model have to be stored and reassembled before the iSCSI protocol layer within an end node can place the data in the iSCSI buffers. This reassembly is required because not every TCP segment is likely to contain an iSCSI header to enable its placement, and TCP itself does not have a built-in mechanism for signaling Upper Level Protocol (ULP) message boundaries to aid placement of out-of-order segments. This TCP reassembly at high network speeds is quite counter-productive for the following reasons: wasted memory bandwidth in data copying, the need for reassembly memory, wasted CPU cycles in data copying, and the general store-and-forward latency from an application perspective."

While this does not have anything to do directly with the kernel vs. user discussion for a target mode storage engine, the scaling and latency case is easy enough to make if we are talking about scaling TCP for 10 Gb/sec storage fabrics.

number of years, I would be inclined to believe this is true for software and hardware data-path cases.

The benefits of moving various control state machines for something like, say, traditional iSCSI to userspace have always been debatable. The most obvious candidates for userspace are things like authentication, especially if something more complex than CHAP is used. However, I have thought that recovery from failures of the communication path (iSCSI connections) or of entire nexuses (iSCSI sessions) is very problematic if it means potentially having to push I/O state down to userspace. Protocol-specific and/or fabric-specific state machines (CSM-E and CSM-I from connection recovery in iSCSI and iSER are the obvious ones) are the best candidates for residing in kernel space.

Most of the SCSI OS storage subsystems that I have worked with in the context of iSCSI have used 256 * 512 byte sector requests, with the default traditional iSCSI PDU data payload (MRDSL) being 64k to hit the sweet spot with crc32c checksum calculations. I am assuming this is going to be the case for other fabrics as well.

--nab
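
To put rough numbers on the sizes mentioned above, here is a minimal C sketch. It assumes the 512 byte sector size, the 256 sector request, and the 64k MRDSL from the text; the names `crc32c` and `pdus_for_request` are hypothetical helpers for illustration and are not taken from any existing iSCSI stack. The CRC is a plain bitwise CRC32C (Castagnoli polynomial); a real target would more likely use a table-driven or hardware-assisted implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u           /* bytes per sector, as in the text */
#define REQ_SECTORS 256u           /* sectors per request, as in the text */
#define MRDSL       (64u * 1024u)  /* assumed 64k PDU data payload limit */

/*
 * Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78,
 * initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.
 * The standard check value for the ASCII string "123456789" is 0xE3069283.
 */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, else zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Number of data PDUs needed to carry req_bytes with at most mrdsl bytes each. */
static unsigned int pdus_for_request(size_t req_bytes, size_t mrdsl)
{
    assert(mrdsl > 0);
    return (unsigned int)((req_bytes + mrdsl - 1) / mrdsl);
}
```

With these values, `pdus_for_request(REQ_SECTORS * SECTOR_SIZE, MRDSL)` returns 2: a 256 * 512 = 128k request is carried as two 64k data PDUs, and a data digest would be computed with `crc32c()` over each 64k payload.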
