From: Arjan van de Ven <arjan@linux.intel.com>
Subject: [PATCH] fastboot: hold the BKL over the async init call sequence
Regular init calls are called with the BKL held; make sure
the async init calls are also called with the BKL held.
While this reduces parallelism a little, it does provide
lock-for-lock compatibility. The hit to parallelism isn't too
bad, most of the init calls are done immediately or actually
block for their delays.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
init/main.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/init/main.c b/init/main.c
index bf79b83..dcb2c32 100644
--- a/init/main.c
+++ b/init/main.c
@@ -746,8 +746,14 @@ static void __init do_async_initcalls(struct work_struct *dummy)
{
initcall_t *call;
+ /*
+ * For compatibility with normal init calls... take the BKL
+ * not pretty, not desirable, but compatibility first
+ */
+ lock_kernel();
for (call = __async_initcall_start; call < __async_initcall_end; call++)
do_one_initcall(*call);
+ unlock_kernel();
}
static struct workqueue_struct *async_init_wq;
--
1.5.5.1
--
From: Arjan van de Ven <arjan@linux.intel.com>
Subject: [PATCH] fastboot: sync the async execution before late_initcall and move level 6s (sync) first
Rene Herman points out several cases where it's basically needed to have
all level 6/6a/6s calls done before the level 7 (late_initcall) code
runs. This patch adds a sync point in the transition from the 6's to the
7's.
Second, this patch makes sure that level 6s (sync) happens before the
async code starts, and puts a user in driver/pci in this category that
needs to happen before device init.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
drivers/pci/pci.c | 2 +-
include/asm-generic/vmlinux.lds.h | 3 ++-
init/main.c | 8 +++++++-
3 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 44a46c9..d75295d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1889,7 +1889,7 @@ static int __devinit pci_setup(char *str)
}
early_param("pci", pci_setup);
-device_initcall(pci_init);
+device_initcall_sync(pci_init);
EXPORT_SYMBOL(pci_reenable_device);
EXPORT_SYMBOL(pci_enable_device_io);
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 39c1afc..514dbdf 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -372,11 +372,12 @@
*(.initcall5.init) \
*(.initcall5s.init) \
*(.initcallrootfs.init) \
+ *(.initcall6s.init) \
__async_initcall_start = .; \
*(.initcall6a.init) \
__async_initcall_end = .; \
*(.initcall6.init) \
- *(.initcall6s.init) \
+ __device_initcall_end = .; \
*(.initcall7.init) \
*(.initcall7s.init)
diff --git a/init/main.c b/init/main.c
index dcb2c32..5c9e90e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -741,6 +741,7 @@ int do_one_initcall(initcall_t fn)
extern initcall_t __initcall_start[], __initcall_end[];
...
Incidentally, this fixed a USB-related boot hang I found today on one of my test systems running tip/master (which had patches 1-2-3 already), which I was about to report. Good spotting Rene! I've applied patches 4/5 to tip/fastboot. Ingo --
Did this impact the boot time improvements at all? Daniel --
The USB HCD initcalls take so little time to complete (100ms each) that ensuring they have finished makes no difference. USB devices get detected after those initcalls finish in parallel with the rest of the boot process (and there's about 1.5s before the first USB device driver initcall runs). -- Simon Arlott --
On Sun, 20 Jul 2008 14:14:59 -0700 not much in my measurement; but on my system .. level 7 is mostly empty so there's not much difference -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
Isn't this a bit confusing? All the other sync levels are directly after their respective levels. I can see why you want another level now, but shouldn't that mean late_initcall now wants to be 8, device_initcall 7 and your new 6s just 6 (device_core_initcall or something...)? Rene. --
Yeah, it is... but nobody is using them. I'll make a note to clean this up (by removing the unused ones). -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
After this patch, there are now 2 flush_workqueue(async_init_wq) calls in do_initcalls. Should the other one remain as well? Rene. --
On Tue, 29 Jul 2008 23:12:11 +0200 yes because if you don't have any level 7's then you won't hit this condition... you need the second one. flush_workqueue is cheap for the nothing-in-there case. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
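A toy userspace model of the two flush points discussed above may make the argument easier to follow. This is only a sketch, not the code in init/main.c: the names run_initcalls, enum level and struct boot_result are made up for illustration, and "flushing" is reduced to clearing a flag. It assumes, as described in the thread, that the in-loop flush is only reached when the loop actually visits a level 7 entry.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the initcall sections in vmlinux.lds.h. */
enum level { LVL_EARLY, LVL_SYNC_6S, LVL_ASYNC_6A, LVL_DEVICE_6, LVL_LATE_7 };

struct boot_result {
	int flushes;               /* how many flush points were executed */
	int async_pending_at_exit; /* 1 if async work was never waited for */
};

/*
 * Walk the initcall table in link order. Async (6a) entries are only
 * handed to a worker; the in-loop flush is hit the first time a
 * level 7 entry is reached. final_flush models the unconditional
 * flush after the loop.
 */
struct boot_result run_initcalls(const enum level *calls, size_t n,
				 int final_flush)
{
	struct boot_result r = { 0, 0 };
	int pending = 0;	/* async work queued but not yet flushed */
	int synced = 0;		/* in-loop flush already done */
	size_t i;

	for (i = 0; i < n; i++) {
		if (calls[i] == LVL_ASYNC_6A) {
			pending = 1;	/* queued, runs in the background */
			continue;
		}
		if (calls[i] == LVL_LATE_7 && !synced) {
			r.flushes++;	/* sync point before level 7 */
			pending = 0;
			synced = 1;
		}
		/* a synchronous initcall would run here */
	}

	if (final_flush) {
		r.flushes++;	/* cheap when nothing is queued */
		pending = 0;
	}

	r.async_pending_at_exit = pending;
	return r;
}
```

With a table that has no level 7 entries and final_flush set to 0, the model returns with async work still pending, which is the case the second flush is there to cover.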
Ah, yes. For what it's worth by the way, I'm running that which is available from your fastboot repo (12 patches currently) on top of 2.6.26. Not seen any trouble. Nor improvements that I've noticed but this is a rather minimal and fast booting kernel/system anyway. Rene. --
It doesn't appear to be possible to init multiple PCI devices at once... I haven't looked into what is doing it exactly but presumably there's a lock being held over the whole device probe process. The speedup from usb seems to be primarily from initialising devices in the background... perhaps there's some way to do that without doing hcd init from a second thread? I get a really slow booting system if I enable the SAS controller... it requires 14 seconds to initialise itself, even with no drives attached (LSI 1068E). -- Simon Arlott --
You could provide useful details by booting a kernel with CONFIG_USB_DEBUG enabled. The USB stack _already_ initializes USB devices (i.e., not host controllers) in a separate thread. Alan Stern --
With fastboot:
162 ohci_hcd_mod_init+0x0/0xa6
167 pcie_portdrv_init+0x0/0x4d
182 saa7134_init+0x0/0x4a
205 ehci_hcd_init+0x0/0x8b
299 snd_usb_audio_init+0x0/0x3d
557 e1000_init_module+0x0/0x88
1227 amd74xx_ide_init+0x0/0x1b
2306 nv_init+0x0/0x1b
Without fastboot:
103 ehci_hcd_init+0x0/0x8b
113 raid5_init+0x0/0x3e
127 pci_iommu_init+0x0/0x17
148 ohci_hcd_mod_init+0x0/0xa4
183 saa7134_init+0x0/0x4a
297 snd_usb_audio_init+0x0/0x3d
557 e1000_init_module+0x0/0x88
1227 amd74xx_ide_init+0x0/0x1b
2303 nv_init+0x0/0x1b
2859 usblp_init+0x0/0x1b
Boot log attached.
It looks like usb device driver init requires it to immediately block and wait for all devices to have completed init - so regardless of where we put the usb/ directory in the initcall order, it will always wait a while because the drivers will be immediately after the hcd init... perhaps if we moved the hcd init to before net/, sata/, ide/ etc. (all the things that take time themselves) but left usb device drivers to the end it would actually get the benefit of that separate thread.
I don't have the time to rework how usb/ is linked in order to try that, but moving usb/ to before net/ and sata/ while making all the usb device drivers that get used be late initcalls could be used to test it.
-- Simon Arlott
The timings in the boot log agree with your "Without fastboot:"
figures, so I assume that's the log you attached. The only timings at
issue here are ehci_hcd_init and ohci_hcd_mod_init. It's not clear
that the with-fastboot and without-fastboot values are directly
comparable; during startup there's a lot of activity, and interrupt
Which USB device driver init are you talking about? Your log includes
usblp_init, usb_stor_init, usb_usual_init, hid_init (for a USB mouse),
and snd_usb_audio_init. Each one completed before the next one
started; none of them blocked waiting for any devices (other than their
No, you're wrong. To prove it, try patching the start of hub_events()
in drivers/usb/core/hub.c like this:
* Not the most efficient, but avoids deadlocks.
*/
while (1) {
+ ssleep(5);
/* Grab the first entry at the beginning of the list */
spin_lock_irq(&hub_event_lock);
They are already running in a separate thread. Of course, the
different threads will contend for CPU resources -- just putting things
into multiple threads doesn't mean they will necessarily run
You seem to have a completely mixed-up idea of how the USB stack works.
All the device drivers you're worried about are initialized within the
khubd kernel thread.
Alan Stern
--
I wasn't suggesting comparing ehci_hcd_init/ohci_hcd_mod_init times, with fastboot I'm assuming it may manage to take a lock those need in
Yes - and that's the point. The initcall process when it reaches usb/ is this:
1. ehci_hcd_init
2. ohci_hcd_mod_init
3. usblp_init
There is nothing else to run between 1-2 and 3, so there is no opportunity to initialise devices in the background and usblp_init blocks for a while. If ehci_hcd_init and ohci_hcd_mod_init were moved up to before sata/ide/net but usblp_init was kept after these then the devices should be ready by the
I'll try this tonight, but all I'd expect to see is the background thread
There's a lot of delays going on during sata/ide/net init, waiting for
The usb device driver initcalls aren't initialised by khubd and they're definitely blocking on the khubd device initialisation [for that device, as you said earlier] being completed.
-- Simon Arlott --
It sounds like you have usblp compiled into the kernel instead of building it as a module. Do things change for the better if you make If it were a module then it would block in a separate thread and It's not entirely clear why usblp is blocking at all. Probably because it is waiting on the device semaphores for devices that are currently being probed -- the driver core won't allow two threads to probe the Well, yes, of course. My point is that the drivers (those in modules, anyway) won't initialize _immediately_ after the HCD init, as you Um. The initcalls for USB device drivers built as modules are made by the modprobe thread, which is started in response to a uevent created by khub. So while they aren't initialized directly by khubd, they are initialized indirectly by it. On the other hand, the initcalls for drivers compiled into the kernel Driver probing is always going to be a bottleneck, even for non-matching device/driver pairs. That's because the driver core acquires the device semaphore even before calling the bus's match routine. When a new driver is registered, the driver core just goes down the list of all devices registered on the bus, locking each one in turn and trying to match & probe it. Alan Stern --
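The locking behaviour described above can be put into a small timing model. This is a rough sketch under stated assumptions, not driver-core code: driver_register_wait_ms and the busy_until_ms array are hypothetical, matching and probing are assumed to cost nothing, and each entry stands for the time at which some other thread (for example the hub thread) lets go of that device.

```c
#include <stddef.h>

/*
 * Toy model: busy_until_ms[i] is the (hypothetical) time at which
 * another thread releases device i's lock. A registering driver
 * visits the devices in list order and must take each lock in turn,
 * so it blocks until the current holder is done.
 * Matching and probing are assumed to cost nothing.
 */
long driver_register_wait_ms(const long *busy_until_ms, size_t n, long now_ms)
{
	long t = now_ms;
	size_t i;

	for (i = 0; i < n; i++) {
		if (busy_until_ms[i] > t)
			t = busy_until_ms[i];	/* blocked on this device */
	}
	return t - now_ms;	/* total time spent waiting */
}
```

In this model a driver registered at 100 ms against devices that stay busy until 0, 1500 and 900 ms waits 1400 ms, while a second registration once the devices are idle does not wait at all.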
No, a PCI device init in the main initcall process could prevent the Only if I make ALL usb device drivers modules, otherwise the first one has to wait for all devices to finish initialising. (I get the same delay on Ok - so there could be some big improvements to be had by making the hcd So could HCD init be moved higher in the initcall order so that the devices have been initialised enough by khubd that device driver initcalls take much less time? I'm not really sure how to do that, except by doing usb/core/ usb/host/ and then usb/class/ usb/storage/ etc.... -- Simon Arlott --
Arjan also needed a pre device_initcall() level for PCI core init now that the async device initcalls weren't governed just by link order anymore. He reused the device_initcall_sync() level, moving it to before device_initcall() itself (it used to be just behind). Your above notion sounds like another good reason for inserting a real new level just before device_initcall(); if you move any of the device init to late_initcall(), late_initcall() loses too much of its utility I'd feel (see start of this thread with various late_initcalls wanting to assume stuff). Rene. --
Well, the late_initcall idea was just for testing, to make the device driver parts later while moving usb/ up in link order. -- Simon Arlott --
Okay, that explains things. But this means that changes which are helpful on your system might not be so helpful (or might even be Maybe. Perhaps a better approach would be to make the device driver initcalls before there are any devices for their probe routines to Try doing this instead: Put a 5-second delay at the start of hub_thread(). That will give the in-kernel device drivers time to initialize before any USB devices -- other than root hubs -- are discovered. That should speed up the init timings. For everyone else, it will slow down USB device discovery by 5 seconds. I don't know how important that is, considering how much other stuff is going on at boot time. You may find that 5 seconds is more than you need; perhaps 1 second will be enough. Alan Stern --
What about this? The Makefiles become a bit messy, but by moving things
around I get the desired effect without splitting their initcalls.
[ 7.941890] Write protecting the kernel read-only data: 5656k
[ 5.437709] Write protecting the kernel read-only data: 5656k
2.5s faster, which is almost half the boot time.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
---
drivers/Makefile | 13 ++++++++++---
drivers/usb/Makefile | 30 ++++++++++++++++++++----------
drivers/usb/core/Kconfig | 17 +++++++++++++++++
3 files changed, 47 insertions(+), 13 deletions(-)
diff --git a/drivers/Makefile b/drivers/Makefile
index a280ab3..d68151f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -33,7 +33,12 @@ obj-$(CONFIG_FB_INTEL) += video/intelfb/
obj-y += serial/
obj-$(CONFIG_PARPORT) += parport/
-obj-y += base/ block/ misc/ mfd/ net/ media/
+obj-y += base/ block/
+ifdef CONFIG_USB_FASTBOOT
+ obj-$(CONFIG_USB) += usb/
+ obj-$(CONFIG_PCI) += usb/
+endif
+obj-y += net/ media/
obj-$(CONFIG_NUBUS) += nubus/
obj-$(CONFIG_ATM) += atm/
obj-y += macintosh/
@@ -56,8 +61,10 @@ obj-$(CONFIG_MAC) += macintosh/
obj-$(CONFIG_ATA_OVER_ETH) += block/aoe/
obj-$(CONFIG_PARIDE) += block/paride/
obj-$(CONFIG_TC) += tc/
-obj-$(CONFIG_USB) += usb/
-obj-$(CONFIG_PCI) += usb/
+ifndef CONFIG_USB_FASTBOOT
+ obj-$(CONFIG_USB) += usb/
+ obj-$(CONFIG_PCI) += usb/
+endif
obj-$(CONFIG_USB_GADGET) += usb/gadget/
obj-$(CONFIG_SERIO) += input/serio/
obj-$(CONFIG_GAMEPORT) += input/gameport/
diff --git a/drivers/usb/Makefile b/drivers/usb/Makefile
index a419c42..35766a1 100644
--- a/drivers/usb/Makefile
+++ b/drivers/usb/Makefile
@@ -8,16 +8,22 @@ obj-$(CONFIG_USB) += core/
obj-$(CONFIG_USB_MON) += mon/
-obj-$(CONFIG_PCI) += host/
-obj-$(CONFIG_USB_EHCI_HCD) += host/
-obj-$(CONFIG_USB_ISP116X_HCD) += host/
-obj-$(CONFIG_USB_OHCI_HCD) += host/
-obj-$(CONFIG_USB_UHCI_HCD) += ...
Wouldn't it be much simpler, and less objectionable, to do what I suggested earlier? That is, add a 5-second delay at the start of hub_thread() in drivers/usb/core/hub.c. No messing with Makefiles, no changes to the initcall scheduling. Alan Stern --
Aside from 5 seconds being too long, and anything less being a race between hub_thread() and driver initcalls, it doesn't improve anything because it'll still have to wait for the devices to finish initialising in userspace instead. -- Simon Arlott --
And what is your boot time if the usb drivers are modules? I'd much prefer that requirement to get a faster boot than to go around mucking with the link-time ordering... thanks, greg k-h --
Why is 5 seconds too long? Too long for what? What you're doing is already a race between hub_thread() and driver initcalls. My suggestion is no worse. "it'll still have to wait..." If by "it" you mean the initcall thread, you're wrong. If by "it" you mean the user, you still aren't necessarily correct; the user can do plenty of other things while waiting for USB devices to initialize. I suppose you could make the hub_thread delay time a module parameter for usbcore, defaulting to 0. Then it could be set by just the people who want to use it -- many (most?) people keep their drivers in modules, and it wouldn't do them any good. Alan Stern --
On Wed, 6 Aug 2008 15:49:27 -0400 (EDT) because I'm done booting to full UI in that time. it's forever. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
This doesn't answer my question. Why does the fact that you're already in the UI after 5 seconds mean that 5 seconds is too long to delay the start of khubd? Besides, if your machine can boot to the UI in under 5 seconds then you have no need to worry about improving boot-up speed anyway. Alan Stern --
On Wed, 6 Aug 2008 16:09:21 -0400 (EDT) because it means that while the rest of boot is done and the system idle, the mouse and keyboard aren't going to work. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
Ah, okay. Then if the delay time were set up as a module parameter, you'd be one of the people who leaves the parameter set at its default value of 0. Alan Stern --
No, by adding a 5 second delay you're intending for the device driver initcalls to complete within that 5 seconds. If they take too long then the last one blocks everything (I realise that's ridiculous, these initcalls take <1ms when there are no devices yet). The best way to do is to make the driver initcalls Assuming userspace doesn't wait for all devices to settle and appear in /dev etc. It really needs to have hcd initcalls done very early so that device init has the rest of the (kernel and userspace) boot process to complete in the background. This is negated by having device drivers initialised immediately afterwards. Re-ordering initcalls and doing more of the init process asynchronously is likely to expose bugs and cause inconsistent device order on some systems, so if the makefile mess could be reduced then it can be a Kconfig option. How many people have *all* their USB components (hcd, drivers) as modules? What do they do with their USB keyboards in the period between init and module load? If even one device driver and the hcd is compiled in, they'd need to wait for every USB device to finish init before the usbhid probe could complete. -- Simon Arlott --
Whatever that involves... /dev never truly settles; it's always All the major distributions do, as far as I know. (I haven't actually Probably nothing. What do you do on your keyboard while waiting for What if usbhid is the one device driver that is compiled in? :-) Alan Stern --
Right - so then all that's needed is to move usb/ before most other things that take a while to init, and it has some time to initialise usb devices in the I'm not an expert on what can be done with the Makefile process, perhaps there's an easier way to move things around based on a config option. If host init is moved before device init for everyone, wouldn't there be too many side effects? I've not tried to use usb-storage as root but presumably that works. If that already uses a hack of making the kernel wait X seconds before trying to mount Lack of a keyboard makes it impossible to do anything if the module fails to load... as I understand it when the HCD init runs, any BIOS emulation stops? That was the situation I was thinking of - surely that would be compiled in if the HCDs were? -- Simon Arlott --
Not exactly -- you should move most of usb/ forward so that it comes before usb/host/. Or alternatively move usb/host/ later so that it Doesn't the host init _already_ come before device init? If host init were moved _after_ device init, I don't think there would be a lot of I noticed in your logs earlier that some of your USB drivers were compiled in and others weren't. Maybe if _all_ of them were compiled in you wouldn't experience as much contention and the init delays would be shorter. Alan Stern --
Well, putting usb/ before net/ etc. requires that usb/host/ is the usb/stuff/ or Host init is before device init, as that's the Makefile link order. Any device init causes it to wait for *all* devices, so swapping them around means devices are going to appear at any time after that - there's no device initcall to make it block. Presumably it would be possible to have a late_initcall (which would be early in that list if usb was earlier) that could ensure khubd had finished [its current queue] before continuing - as if there was a device driver initcall? If someone currently has HCD init compiled in but nothing else, then the boot process would block unnecessarily... the initcall would need to be disabled in that case to maintain existing behaviour - which is why it probably needs to be a config option, which requires some mess in the Makefile or a new I don't know where you got that from - they're definitely all compiled in. Whichever is first to have its initcall run is blocked, and the logs may have been from a test of that. -- Simon Arlott --
I don't understand. I never said you should move usb/. I said you should move usb/host/ "wait" is the wrong word. Each device init (more accurately, each device driver init) probes all the USB devices. But it doesn't have to wait if nothing else is probing those devices concurrently and if no hub activity is going on. This means that you won't save much time by running multiple USB device driver initialization threads. Better to do all those inits in a single thread. The ideal situation is to have each device driver initcall run when there are _no_ USB devices to be probed -- i.e., before the host Again, I don't understand. What's wrong with simply reordering As David Brownell pointed out, I was wrong about this. The BIOS gets kicked off even before the host driver starts running, as part of PCI I must have misinterpreted one of the earlier messages in this thread. Alan Stern --
That's done much earlier ... as part of PCI quirk handling. See drivers/usb/host/pci-quirks.c ... it's not been part of the HCDs for about three years now. I don't recall the details right now, but letting the BIOS hang on to that hardware was causing a lot of weird problems that went away when we made that change. If a module fails to load, fix the bug in that module. :) - Dave --
Hi,
I'm using a USB drive to boot from. I also use a USB mouse and USB keyboard.
Indeed, I have to put a delay in the init script in the initrd to give the initialization time to finish. I found by testing that this delay should be >5 sec.
I have tried compiling all the USB modules into the kernel, but this doesn't help much; after following this discussion, it might be related.
With 2.6.20 I can use the keyboard. With the same init script in 2.6.24 and 2.6.26 it is not working, but I really couldn't find time to investigate because, as Alan says, there is really not much to do with the keyboard and mouse during the boot process unless, as I do, you unlock encrypted partitions.
I don't know if this information is useful to you, but I hope so.
I could also look at why the keyboard is not working at boot. Let me know.
regards
--
This is controlled by a module parameter. Read the description of delay_use near the top of drivers/usb/storage/usb.c. Alan Stern --
If you want to do such stuff, don't be conditional. Just change it so USB is *always* earlier. We really don't need to make boot sequences become even harder to understand/predict. --
Interesting... I haven't been able to get stable improvements, with all USB kernel modules built-in, just a USB mouse on OHCI and an otherwise switched-off USB HD on EHCI and usb-storage, but I do get the idea that with your patch, I "get lucky" more often. I have knfsd as a module and it loads through the exportfs trigger during bootup and outputs the last message to my dmesg: 4 boots without your patch: 6.12, 6.41, 6.38, 6.34 seconds 4 boots with : 5.39, 6.33, 5.37, 6.34 Booting with the external HD switched on adds a tiny bit to the actual kernel startup time -- completely repeatable 2.62 seconds until freeing init code with the HD off versus 2.73 with it on -- but doesn't seem to change the picture otherwise... Would those results make sense, you feel? I'm not looking forward to putting an actual statistical analysis on it ;-) Arjan: your fastboot repo by the way doesn't pull cleanly into current upstream anymore. Rene. --
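Short of a real statistical analysis, the mean and range of the two sets of boot times already say something. The helper below is a minimal sketch for that arithmetic; struct summary and summarize are illustrative names and not part of any patch in this thread.

```c
#include <stddef.h>

struct summary {
	double mean;
	double min;
	double max;
};

/* Mean, minimum and maximum of n samples; all zero when n == 0. */
struct summary summarize(const double *v, size_t n)
{
	struct summary s = { 0.0, 0.0, 0.0 };
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return s;

	s.min = s.max = v[0];
	for (i = 0; i < n; i++) {
		sum += v[i];
		if (v[i] < s.min)
			s.min = v[i];
		if (v[i] > s.max)
			s.max = v[i];
	}
	s.mean = sum / (double)n;
	return s;
}
```

Fed the figures quoted above, this gives a mean of about 6.31 s (range 6.12 to 6.41) without the patch and about 5.86 s (range 5.37 to 6.34) with it. Four samples per case are too few to draw firm conclusions from, but the wider range with the patch fits the "get lucky" impression.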
It does depend on what else is running... I have e1000 (500ms) and sata_nv (2200ms) so there's plenty of additional time for usb devices
usb-storage by default delays 2 seconds before trying to use the device, so it won't add much time itself if there's already a driver initcall
I suggest adding initcall_debug=1, it'll show how long the usb and other initcalls take to run. You should see the first usb device driver initcall take a second or two to run (without the patch). Maybe I'm just badly affected by having HCDs with 10 ports (of which only 4 are physically usable...).
dmesg|grep initcall\ |sed -e 's/.*initcall \([^ ]\+\( \(\[[^]]\+\]\)\)\?\) returned [^ ]\+ after \([^ ]\+\) secs/\4 \1/'\
|sort -n
-- Simon Arlott --
Thanks, but that specific expression doesn't work for me, since there's not a single one that takes ' secs', only ' msecs'. 685 isn't nice but better than 2000... Rene. --
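For logs that report ' msecs', the same extraction can be sketched in C instead of sed. This is an illustration only: parse_initcall_line is a made-up helper, and the line layout ("initcall NAME returned RC after N msecs", optionally preceded by a timestamp) is assumed from the messages quoted in this thread rather than taken from any particular kernel version.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one initcall_debug line of the assumed form
 *   "[ts] initcall NAME returned RC after N msecs"
 * NAME may contain a trailing " [module]" part, as the sed
 * expression allows. Returns 1 and fills name/msecs on success,
 * 0 if the line does not match.
 */
int parse_initcall_line(const char *line, char *name, size_t name_len,
			long *msecs)
{
	const char *start = strstr(line, "initcall ");
	const char *ret;
	char unit[16];
	long value;
	size_t len;
	int rc;

	if (!start)
		return 0;
	start += strlen("initcall ");

	ret = strstr(start, " returned ");
	if (!ret)
		return 0;

	len = (size_t)(ret - start);
	if (len == 0 || len >= name_len)
		return 0;

	/* the unit is read as a word so that it can be checked */
	if (sscanf(ret, " returned %d after %ld %15s", &rc, &value, unit) != 3)
		return 0;
	if (strcmp(unit, "msecs") != 0)
		return 0;

	memcpy(name, start, len);
	name[len] = '\0';
	*msecs = value;
	return 1;
}
```

Lines that only announce a call, or that use a different unit, are rejected, so the caller can collect the accepted (msecs, name) pairs and sort them numerically as the shell pipeline does.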
Why? It sounds like a trivial solution for you is to actually use modules. Why go through a lot of extra work to solve something in a different way that is already solved for you? Who is imposing the "no modules allowed" rule on you, and why was it made? thanks, greg k-h --
Because it looks like doing HCD init early enough is a simple way to speed up boot time if there are any compiled-in usb device drivers, without running the HCD init itself from a separate thread. Arjan, are you able to test this? No one is imposing anything on me. I don't yet have a USB keyboard so I could compile all the drivers as modules without any problems. There's some strange animosity to compiling things that one knows will always be needed into the kernel... like the V4L DVB code that used to fail to work properly, perhaps because everyone uses modules (4abdcf933f647763592db6bef001d1fae61a5527). It also leads to people unloading/loading the module to work around bugs that could otherwise be fixed (c278850206fd9df0bb62a72ca0b277fe20c5a452). -- Simon Arlott --
I've tested this with fastboot disabled, the hcd initcalls changed back to module_init, all my drivers changed to late_initcall to force them to be later while usb/ is before net/ in the drivers/ Makefile. -- Simon Arlott
Sorry for the massively late response. Don't know how I missed this thread earlier, but I can answer that from my standpoint. At Sony we have a "no modules if possible" policy to reduce kernel footprint. The size difference for a kernel compiled with modules vs one with external modules is about 10% IIRC. -- Tim ============================= Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America ============================= --
