On Fri, 21 Mar 2008 09:54:05 -0700 Greg KH <greg@kroah.com> wrote:Well. The kernel has traditionally assumed that console writes are atomic. But we now have complex sleepy drivers acting as consoles. Presumably this means that large amounts of device driver code, page allocator code, etc cannot have printks in them without going recursive. Except printk itself internally handles that, due to its need to be able to handle printk-from-interrupt-when-this-cpu-is-already-running-printk. The typical fix is for these console drivers to just assume that they cannot sleep: pass GFP_ATOMIC down into the device driver code. But I bet the device driver code was designed assuming that it could sleep, oops-bad-we-lose. And it's not just sleep-in-spinlock. If any of that device driver code uses alloc_pages(GFP_KERNEL) then it can deadlock if we do a printk from within the page allocator (and hence a lot of the block and storage layer). Because in those code paths we must use GFP_NOFS or GFP_NOIO to allocate memory. So I think the right fix here is to switch those drivers to being unconditionally atomic: don't schedule, don't take mutexes, don't use __GFP_WAIT allocations. They could of course be switched to using kmalloc(GFP_ATOMIC)+memcpy()+schedule_task(). That's rather slow, but this is not a performance-sensitive area. But more seriously, this could lead to messages getting lost from a dying machine. One possibility would be to do current->called_for_console_output=1 and then test that in various places. But a) ugh and b) that's only useful for memory allocations - it doesn't help if sleeping locks need to be taken. Another possibility might be: if (current->called_for_console_output == false) { mutex_lock(lock); } else { if (!mutex_trylock(lock)) return -EAGAIN; } and then teach the console-calling code to requeue the message for later. But that's hard, because the straightforward implementation would result in the output being queued for _all_ the currently active consoles, but some of them might already have displayed this output - there's only one log_buf. An interesting problem ;) --
| Jeremy Fitzhardinge | Re: [RFC 00/15] x86_64: Optimize percpu accesses |
| jmerkey | [ANNOUNCE] mdb: Merkey's Linux Kernel Debugger 2.6.27-rc4 released |
| Greg Kroah-Hartman | [PATCH 021/196] ISDN: Convert from class_device to device for ISDN capi |
| Ingo Molnar | Re: [PATCH 00/23] per device dirty throttling -v8 |
git: | |
| Linus Torvalds | Re: VCS comparison table |
| Peter Stahlir | Git as a filesystem |
| Johannes Schindelin | Re: git on MacOSX and files with decomposed utf-8 file names |
| Bill Lear | Meaning of "fatal: protocol error: bad line length character"? |
| Mayuresh Kathe | Re: What is our ultimate goal?? |
| Richard Stallman | Real men don't attack straw men |
| bofh | Re: web development on OpenBSD |
| Kevin | uvm_mapent_alloc: out of static map entries on 4.3 i386 |
| Mark Lord | Re: 2.6.25-rc8: FTP transfer errors |
| Evgeniy Polyakov | Re: [BUG] New Kernel Bugs |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Andi Kleen | [PATCH RFC] [1/9] Core module symbol namespaces code and intro. |
