On Wednesday 04 June 2008 05:07, Linus Torvalds wrote:

OK, I'm still not quite sure where this has ended up. I guess you are happy with x86 semantics as they are now. That is, all IO accesses are strongly ordered WRT one another and WRT cacheable memory (which includes keeping them within spinlocks), *unless* one asks for WC memory, in which case that memory is quite weakly ordered (and is not even ordered by a regular IO readl, at least according to the AMD spec). So for WC memory, one still needs to use mb/rmb/wmb.

So that still doesn't tell us what *minimum* level of ordering we should provide in the cross platform readl/writel API. Some relatively sane suggestions would be:

- as strong as x86. Guaranteed not to break drivers that work on x86, but slower on some archs. To me, this is most pleasing. It is much, much easier to notice that something is going a little slower and to work out how to use weaker ordering there than it is to debug some once-in-a-blue-moon breakage caused by just the right combination of architecture, driver, etc. It totally frees the driver writer from thinking about barriers, provided they get the locking right.

- ordered WRT other IO accessors, constrained within spinlocks, but not ordered WRT cacheable memory. This is what powerpc does now. It's a little faster for them, and probably covers the vast majority of drivers, but there are real possibilities to get it wrong (trivial example: using bit locks or mutexes or any kind of open coded locking or lockless synchronisation can break).

- (less sane) same as above, but not ordered WRT spinlocks. This is what ia64 (sn2) does. From a purist POV, it is a little less arbitrary than powerpc, but in practice it will break a lot more drivers than powerpc.

I was kind of joking about taking control of this issue :) But seriously, it needs a decision to be made. I vote for #1.
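To make the difference between #1 and the weaker options concrete, here is a rough userspace sketch of the usual "fill a descriptor in cacheable memory, then ring the doorbell" pattern. Everything in it is a stand-in and an assumption for illustration: `wmb_standin()` and `writel_standin()` are not the kernel's wmb()/writel(), and the ring, the descriptor layout and the doorbell register are hypothetical. Under #1 the explicit barrier would be implied by writel itself; under the weaker options, as described above, the driver has to remember it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 4u

/* Hypothetical DMA descriptor living in ordinary cacheable memory. */
struct desc {
    uint32_t addr;
    uint32_t len;
};

static struct desc ring[RING_SIZE];
static volatile uint32_t doorbell_reg;  /* stands in for an MMIO register */

/* Stand-in for wmb(): a full fence, stronger than needed but safe. */
static inline void wmb_standin(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Stand-in for a weakly ordered writel(): a volatile store that implies
 * no ordering against earlier cacheable stores. */
static inline void writel_standin(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
}

/* Publish one descriptor, then tell the (imaginary) device about it. */
static void ring_post(uint32_t slot, uint32_t addr, uint32_t len)
{
    assert(slot < RING_SIZE);

    /* Cacheable stores: the device must see these before the doorbell. */
    ring[slot].addr = addr;
    ring[slot].len = len;

    /* Needed when the IO accessor is not ordered WRT cacheable memory.
     * With x86-strength accessors this line would be redundant. */
    wmb_standin();

    writel_standin(slot + 1, &doorbell_reg);
}
```

A driver that works on x86 without the barrier line is exactly the kind that would break only on "just the right architecture".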
My rationale: I'm still finding relatively major bugs in the mm subsystem due to memory ordering problems (well, I've found maybe 4 or 5 in the last couple of years). This is apparently one of the most well reviewed and tested bits of code in the kernel, by people who know all about memory ordering. Not to mention that mm/ does not have to worry about IO ordering at all. Then apparently drivers are the least reviewed and tested. Connect the dots.

Now that doesn't leave weaker ordering architectures lumped with "slow old x86 semantics". Think of it as giving them the benefit of sharing x86 development and testing :) We can then formalise the relaxed __ accessors to be more complete (ie. +/- byteswapping).

I'd also propose to add io_rmb/io_wmb/io_mb that order io/io access, to help architectures like sn2 where the io/cacheable barrier is pretty expensive.

Any comments?
