DragonFly: I/O Consolidation and Direct-to-DMA Plans

Submitted by njc
on December 28, 2004 - 1:10pm

Matt Dillon provides an interesting and detailed explanation of future development plans for DragonFly's I/O subsystem. Originally inspired by the PIPE code improvements of FreeBSD's Alan Cox, and demonstrated in DragonFly's unique XIO and MSFBUF APIs, the goal of this work is to avoid KVA mappings for I/O requests and the resulting overhead of interprocessor interrupts on SMP systems. In theory, this means higher performance: I/O becomes more efficient, and any subsystem layer can transfer data to busdma with zero memory-to-memory copies. Matt expands:

"What we are going to do is extend the msf_buf abstraction to cover these needs and provide a set of API calls that allows upper layers to supply data in any form and lower level layers to request data in any form, including with address restrictions. msf_buf's already have a page-list (XIO) and KVA mapping abstraction. We are going to add a bounce-buffer abstraction and then work on a bunch of new API calls for msf_bufs to cover the needs of various subsystems."

There appears to be a lot of interesting work going on in DragonFly. Read on for the entirety of Matt's post.


From: Matthew Dillon [email blocked]
To: dragonfly-kernel
Subject: I/O consolidation and direct-to-DMA plans
Date: 2004-12-20 20:10:54

Hiten and I have come up with a roadmap for the I/O path cleanup and direct-to-dma plans (avoiding having to map KVA buffers).

We've looked at all existing structures... msf_buf's, sfbufs, struct buf, XIO's, UIO's, and BUSDMA (bounce_page and bus_dmamap). Each of these structures represents a piece of the larger 'integrated I/O' puzzle.

The primary issue is that we need a structure which covers both ends of the equation... we need the capability in higher layers to specify either a KVA mapped buffer or a page list, and we need the capability in lower layers to require either a KVA mapped buffer or a page list, depending on the requirements of the layer.

For example, in some instances CAM may supply a buffer pointer to simple SCSI request structures while in others CAM might want to pass data from a struct BUF or UIO that it has a page list for. In some cases a driver will have access to a DMA engine and require a page list, while in others a driver might need a mapped buffer, or might need to create bounce pages.

What we are going to do is extend the msf_buf abstraction to cover these needs and provide a set of API calls that allows upper layers to supply data in any form and lower level layers to request data in any form, including with address restrictions.

msf_buf's already have a page-list (XIO) and KVA mapping abstraction. We are going to add a bounce-buffer abstraction and then work on a bunch of new API calls for msf_bufs to cover the needs of various subsystems. As a starter we'll have these functions:

    msf_init(struct msf_buf *msf)

        Initialize an msf_buf for use (i.e. zero its fields).

    struct msf_buf *
    msf_create_from_buf(struct msf_buf *opt_msf, void *buf, size_t bytes)

        Populate an existing msf_buf or allocate a new one and install the supplied KVA buffer pointer and size.

    struct msf_buf *
    msf_create_from_xio(struct msf_buf *opt_msf, struct xio *xio)

        Populate an existing msf_buf or allocate a new one and install the supplied XIO (page list).

    int
    msf_require_buf(struct msf_buf *msf)

        Require that an msf_buf have a mapped KVA buffer. If the msf_buf already has a mapped KVA buffer this is a NOP. If the msf_buf contains a page list a KVA buffer will be allocated and mapped based on the page list.

    int
    msf_require_xio(struct msf_buf *msf)

        Require that an msf_buf have a page list. If the msf_buf already has a page list this is a NOP. Otherwise a page list is constructed from the msf_buf's KVA buffer.

    void
    msf_release(struct msf_buf *msf)

        Release an msf_buf, freeing any resources that were created as side effects to the above API calls and zeroing out any resources that were originally supplied.

The plan is to start embedding msf_buf's in various system layers (struct buf, busdma, etc...) as independent entities to begin with. As more of this work is accomplished the various layers using msf_buf's will start to become adjacent to each other and we will be able to then have one layer pass its msf_buf directly to another without having to re-create this. Along the way all the various disparate I/O related structures will be consolidated.

Eventually this will allow e.g. the buffer cache to pass an msf_buf all the way down to BUSDMA without having to KVA map the buffer, thus achieving our goal.

-Matt
Matthew Dillon
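
To make the proposed calls a little more concrete, here is a rough userspace sketch of how the supply/require pattern might behave. It is not DragonFly code: the field layouts of struct msf_buf and struct xio, the MSF_* flags, the SK_* constants, and the msf_get() and msf_demo() helpers are all assumptions made for illustration, and the "mapping" step is faked with a copy because a user program cannot map pages into KVA. Only the six function names and their signatures come from Matt's post (msf_init is given a void return type here, since the post does not state one).

```c
#include <stdlib.h>
#include <string.h>

#define SK_PAGE_SIZE 4096u  /* assumed page size for this sketch */
#define SK_MAX_PAGES 8u     /* arbitrary cap on the page list */
#define SK_NPAGES(b) (((b) + SK_PAGE_SIZE - 1) / SK_PAGE_SIZE)

struct xio {  /* simplified stand-in; the real struct xio differs */
    void  *pages[SK_MAX_PAGES];
    size_t npages;
    size_t bytes;
};

#define MSF_OWN_KVA 0x1  /* kva was created by msf_require_buf() */
#define MSF_OWN_XIO 0x2  /* xio was created by msf_require_xio() */
#define MSF_ALLOCED 0x4  /* the msf_buf itself came from malloc() */

struct msf_buf {
    void       *kva;    /* "mapped" buffer, or NULL */
    struct xio *xio;    /* page list, or NULL */
    size_t      bytes;
    int         flags;
};

void msf_init(struct msf_buf *msf) { memset(msf, 0, sizeof(*msf)); }

static struct msf_buf *msf_get(struct msf_buf *opt_msf)  /* reuse or allocate */
{
    struct msf_buf *msf = opt_msf ? opt_msf : malloc(sizeof(*msf));
    if (msf != NULL) { msf_init(msf); msf->flags = opt_msf ? 0 : MSF_ALLOCED; }
    return msf;
}

struct msf_buf *msf_create_from_buf(struct msf_buf *opt_msf, void *buf, size_t bytes)
{
    struct msf_buf *msf = msf_get(opt_msf);
    if (msf != NULL) { msf->kva = buf; msf->bytes = bytes; }
    return msf;
}

struct msf_buf *msf_create_from_xio(struct msf_buf *opt_msf, struct xio *xio)
{
    struct msf_buf *msf = msf_get(opt_msf);
    if (msf != NULL) { msf->xio = xio; msf->bytes = xio->bytes; }
    return msf;
}

int msf_require_buf(struct msf_buf *msf)
{
    if (msf->kva != NULL) return 0;  /* already has a buffer: NOP */
    if (msf->xio == NULL || msf->bytes == 0) return -1;
    if (SK_NPAGES(msf->bytes) > msf->xio->npages) return -1;
    char *buf = malloc(msf->bytes);
    if (buf == NULL) return -1;
    /* Stand-in for mapping pages into KVA; a kernel would map, not copy. */
    for (size_t off = 0, i = 0; off < msf->bytes; off += SK_PAGE_SIZE, i++) {
        size_t left = msf->bytes - off;
        memcpy(buf + off, msf->xio->pages[i], left < SK_PAGE_SIZE ? left : SK_PAGE_SIZE);
    }
    msf->kva = buf;
    msf->flags |= MSF_OWN_KVA;
    return 0;
}

int msf_require_xio(struct msf_buf *msf)
{
    if (msf->xio != NULL) return 0;  /* already has a page list: NOP */
    if (msf->kva == NULL || SK_NPAGES(msf->bytes) > SK_MAX_PAGES) return -1;
    struct xio *xio = calloc(1, sizeof(*xio));
    if (xio == NULL) return -1;
    xio->npages = SK_NPAGES(msf->bytes);
    xio->bytes = msf->bytes;
    for (size_t i = 0; i < xio->npages; i++)  /* assumes kva is page-aligned */
        xio->pages[i] = (char *)msf->kva + i * SK_PAGE_SIZE;
    msf->xio = xio;
    msf->flags |= MSF_OWN_XIO;
    return 0;
}

void msf_release(struct msf_buf *msf)
{
    int flags = msf->flags;
    if (flags & MSF_OWN_KVA) free(msf->kva);  /* side effects are freed */
    if (flags & MSF_OWN_XIO) free(msf->xio);
    msf_init(msf);                            /* supplied fields are zeroed */
    if (flags & MSF_ALLOCED) free(msf);
}

/* Usage: an upper layer supplies a buffer, a lower layer asks for pages. */
int msf_demo(void)
{
    static char data[2 * SK_PAGE_SIZE + 100];
    struct msf_buf msf;
    if (msf_create_from_buf(&msf, data, sizeof(data)) == NULL) return 1;
    if (msf_require_xio(&msf) != 0 || msf.xio->npages != 3) return 2;
    msf_release(&msf);
    return (msf.kva == NULL && msf.xio == NULL) ? 0 : 3;
}
```

The ownership flags are one possible way to honor the msf_release() description, which distinguishes resources created as side effects (freed) from resources the caller supplied (only zeroed). The bounce-buffer abstraction and address restrictions mentioned in the post are left out of this sketch.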

Very cool

on
December 29, 2004 - 1:15am

I look forward to seeing these improvements integrated back into FreeBSD.

why would you think these cha

Anonymous (not verified)
on
December 29, 2004 - 8:15am

why would you think these changes would go back into FBSD?

FreeBSD people aren't stupid,

Anonymous (not verified)
on
December 29, 2004 - 9:12am

FreeBSD people aren't stupid. If this gives a nice performance boost, or an advantage in code simplicity, why wouldn't they integrate it? Net-, Free-, and OpenBSD have always been integrating each other's improvements, so I don't see why they wouldn't do so with DragonFly BSD's improvements.

why

Hagge (not verified)
on
December 29, 2004 - 3:18pm

Probably because they weren't interested in the first place, were they?

Also, they have already done it their own way. Matt started his DragonFly project simply because he didn't want 4.x to go the way 5.x did, so...

I might have understood things the wrong way, though, but anyway, I doubt they will use it. DragonFly will be one OS and FreeBSD another.

Actually, we've been planning

Anonymous (not verified)
on
December 30, 2004 - 12:01am

Actually, we've been planning this on FreeBSD for a long time now. We've already started going along this path, but it requires changes to a lot of device drivers, and so it requires some careful planning and then a lot of mechanical changes. The actual kernel support code for it is relatively simple, and already mostly available in the form of sfbufs.

Why wait?

Anonymous (not verified)
on
December 29, 2004 - 4:40pm

Why wait for FreeBSD? Just run DragonFly and you can have this and much much more today!

RIGHT! WHY NOT RUN LINUX ? :P

Anonymous (not verified)
on
December 29, 2004 - 8:30pm

RIGHT! WHY NOT RUN LINUX ? :P

Anyway, isn't DragonFly BSD aimed at developers only for now, until they manage to get things more ready to use in a production environment?

Dragonfly

Anonymous (not verified)
on
December 30, 2004 - 12:52am

One of the reasons that I like Linux is I know how it works. The developers are really good at disseminating information. Dragonfly seems to be adopting this attitude and culture, and it's really nice. I may move to Dragonfly if this keeps up.

On another note, I never knew there was an Alan Cox for every OS! That rocks! I'd like to see the Windows and MacOS Alan Cox battle in a steel cage match in the Closed Source tourney, and the *BSD and Linux Alan Cox's in the Free and Open tourney. Then we could have a grand championship and REALLY know which is best!

Who needs to compete on technical merit when you can compete in PAIN!

notes ...

Anonymous (not verified)
on
December 30, 2004 - 4:38am

It may not be simple to port to FreeBSD: as DragonFly's internals start to differ significantly from FreeBSD's, porting becomes a much more difficult affair, especially if the mechanisms are dependent upon DFly architectural specifics.

Further to this, DFly needs a clear mission statement, such as the following, and the DFly team need to evangelise it so that everyone understands what it offers as a "key differentiator" (engineers need to do a bit of marketing!). If any of the DFly team are reading this, then I would urge you to come up with a single statement message that makes it clear what DFly offers as a core competency.

FreeBSD - best BSD unix for server/midrange needs
NetBSD - best BSD unix for portability and embedded systems
OpenBSD - best BSD unix for security and embedded networking
DragonFly - best BSD unix for SMP and multicore
Linux - best unix for desktop and compatibility

That's my guess from looking at DFly's architectural goals, and a good mission to aim for, since the next 5-10 years of microprocessor development are clearly aimed at SMP and multicore. If DFly aims specifically at innovation in its SMP approach, then it may be the performance winner in an SMP world. It may surpass Linux as well, simply because all of the existing architectures take a fundamentally different approach to synchronisation.

I read this article at OSNews as well, and a comment about "isn't unix old" -- that's a stupid comment -- well, you could also say that "isn't a car old" as well, since the basic architecture of a car goes back to the late 1800's. The point isn't that unix is old, it's about whether its design architecture is still relevant for the problem it's solving. The answer to that seems to be yes, but changes in the problem space (i.e. the rise of multi-core) require innovations and changes in the unix architecture.
