linux-kernel mailing list

From / Subject / Date
Lennart Sorensen
jsm driver fix for linuxpps support
The jsm driver doesn't currently use the uart_handle_*_change helper functions, which are the obvious place for things like linuxpps to tie into (which it now does of course), and as a result the jsm driver cannot be used with linuxpps or anything else that ties into the serial_core helper functions. This patch adds calls to these helper functions whenever the value they manage changes. The actual storage of the state is not modified since the jsm driver caches the current settings (The 8250 ...
Mar 13, 3:32 pm 2007
Lennart Sorensen
Small fixes for jsm driver
The jsm driver fails when you try to use the TIOCSSERIAL ioctl. The reason is that the driver never sets uart_port.uartclk, causing the data received using TIOCGSERIAL to not match the internal state of the driver. This patch fixes this problem by setting the uartclk to the value used by the serial_core (16 times the baud base). Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca> --- a/drivers/serial/jsm/jsm_tty.c 2007-03-13 15:53:39.000000000 -0400 +++ ...
Mar 13, 3:29 pm 2007
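The relationship described in the jsm fix above (uartclk is 16 times the baud base) can be sketched in a few lines of C. This is a minimal user-space illustration, not the actual patch: the helper names and the 115200 example value are assumptions, and only the factor of 16 comes from the message.

```c
/* Hypothetical helper: derive the clock a driver would store in
 * uart_port.uartclk from the baud base it supports (factor 16, as
 * stated in the patch description). */
static unsigned int uartclk_from_baud_base(unsigned int baud_base)
{
    return baud_base * 16u;
}

/* Hypothetical inverse: the baud base that corresponds to a given
 * uartclk, i.e. the value TIOCGSERIAL is expected to report. */
static unsigned int baud_base_from_uartclk(unsigned int uartclk)
{
    return uartclk / 16u;
}
```

For example, a baud base of 115200 corresponds to a uartclk of 1843200. If uartclk is left at 0, the reported baud base is 0 as well and no longer matches the driver's internal state, which is the mismatch the message describes.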
Stephen Hemminger
[ANNOUNCE] iproute2 2.6.20-070313
This is an experimental update to the iproute2 command set. The version number includes the kernel version to denote what features are supported. The same source should build on older systems, but obviously the newer kernel features won't be available. As much as possible, this package tries to be source compatible across releases. It can be downloaded from: http://developer.osdl.org/dev/iproute2/download/iproute2-2.6.20-070313.tar.gz Repository: ...
Mar 13, 3:15 pm 2007
Jiri Slaby
FF layer restrictions [Was: [PATCH 1/1] Input: add sensa ...
Why did you remove all Cced people? Anyway, I filtered some of them out. I can do that, but in that case, I need to know how people (especially the input ones) want me to do it... regards, -- http://www.fi.muni.cz/~xslaby/ Jiri Slaby faculty of informatics, masaryk university, brno, cz e-mail: jirislaby gmail com, gpg pubkey fingerprint: B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Mar 13, 3:16 pm 2007
Nish Aravamudan
Re: Linux 2.6.20.3
err, duh -- this is a Sun Ultra 60, Debian testing install. Thanks, Nish
Mar 13, 2:58 pm 2007
David Miller
Re: Linux 2.6.20.3
From: "Nish Aravamudan" <nish.aravamudan@gmail.com> Figure out if 2.6.20.2 does it too, then please try to git bisect it down further. I took a quick look and the two sparc64 commits between 2.6.20.1 and 2.6.20.2 are benign, a fix for E450 interrupts and a kenvctrld fix which is for a driver for hardware your ultra60 doesn't have. :) There is a decent amount of raid and nfs fixes in here, do you use either? Another commit that might be relevant is: commit ...
Mar 13, 3:20 pm 2007
Nish Aravamudan
Re: Linux 2.6.20.3
Building 2.6.20.2 right now, will let you know. Thanks, Nish
Mar 13, 3:51 pm 2007
Chris Wright
Re: [RFC/PATCH 00/59] Make common x86 arch area for i386 ...
Thanks for taking this on. I'm sure Andi has a bunch of ideas on this. What about an asm-x86/ dir? The asm/ symlink would still point to the relevant arch, but the file there could simply be #include <asm-x86/file.h>? thanks, -chris
Mar 13, 2:45 pm 2007
Linus Torvalds
Re: [RFC/PATCH 00/59] Make common x86 arch area for i386 ...
I don't mind the patches, but I'd be a lot happier if it also was a stated intention to actually make it be buildable as "x86", the same way that the separate 32-bit and 64-bit POWER architectures were merged into just one architecture that could be built either way. For the POWER merge, we had (and probably still have) legacy platforms that could only be built the old way (ie if you needed to build for certain legacy 32-bit targets, you still needed to use the "ppc" architecture, ...
Mar 13, 2:39 pm 2007
Linus Torvalds
Re: [RFC/PATCH 06/59] mv kernel/acpi/processor.c
Please use git diff -M for things like this. In fact, even if you weren't a git user, I'd ask you to *become* one just because I think that it's a *lot* more productive if people actually see renames as renames, and will see what - if anything - changed when renaming. The "-M" flag isn't the default, simply because it generates patches that cannot be applied with regular "patch", but for something like this, I think it's practically imperative. The old kind of "remove file" ...
Mar 13, 2:32 pm 2007
Pavel Machek
Re: Suspend to RAM fault in VT when resuming
It is possible that video is not initialized at that point, and that hardware goes seriously unhappy when you access non-existing vga. Does it resume ok when you completely disable video support? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Mar 13, 2:24 pm 2007
Tim Gardner
Suspend to RAM fault in VT when resuming
Pavel, I've chased one of the 'Suspend to RAM' resume problems to a specific line in drivers/char/vt.c, see attached 2.6.21-rc3 diff with TRACE_RESUME() instrumentation. The macro scr_writew resolves to '*addr = val', which appears to be causing the problem. I've verified that the pointer is not NULL, but don't know if it's really valid. It's pretty tough to tell what is happening, but on a Dell XPS it just hangs. A Dell Precision blinks the keyboard lights. Since I don't know anything about ...
Mar 13, 1:30 pm 2007
Kandan Venkataraman
[PATCH] Loop device - Tracking page writes made to a loo ...
All comments have been taken care of. Description: A file_operations structure variable called loop_fops is initialised with the default block device file operations (def_blk_fops). The mmap operation is overridden with a new function called loop_file_mmap. A vm_operations structure variable called loop_file_vm_ops is initialised with the default operations for a disk file. The page_mkwrite operation in this variable is initialised to a new function called loop_track_pgwrites. In the ...
Mar 13, 1:21 pm 2007
Christoph Hellwig
Re: [PATCH] Loop device - Tracking page writes made to a ...
NACK. Block device drivers should never ever play around with file operations themselves. If you want functionality like the one you have, please don't overload the loop driver, but start a new (character) driver doing specifically what you want. And even then I'm not sure we'd want functionality like this in the mainline tree, but at least we can have an open discussion if it's done properly.
Mar 13, 1:32 pm 2007
Con Kolivas
Re: [ck] Re: [PATCH] [RSDL-0.30] sched: rsdl improve lat ...
I guess the "other minor things" will matter then :) I'll keep working. At least you know what I'm working on now. -- -ck
Mar 13, 1:27 pm 2007
Al Boldi
Re: [PATCH] [RSDL-0.30] sched: rsdl improve latencies wi ...
Applied against v0.30 mainline. It only works on prio +16 to +19. Thanks! -- Al
Mar 13, 12:54 pm 2007
guillaume.devaux
Re: syba tech multi-I/O pci-card
Hello, I found this thread: http://lkml.org/lkml/fancy/2004/8/28/229 It was about a driver for a serial PCI card. Can anyone send me the SYBAMIO.EXE file? Thanks
Mar 13, 12:22 pm 2007
Zoltan Menyhart
copy_one_pte()
I had a look at copy_one_pte(). I cannot see any ioproc_update_page() call, not even for the COW pages. Is it intentional? We can live with a COW page for a considerably long time. How could the IO-PROC. know that a process-ID / user virt. addr. pair refers to the same page? The comment above ioproc_update_page() says that every time when a PTE is created / modified... Thanks, Zoltan Menyhart
Mar 13, 12:15 pm 2007
Christoph Hellwig
Re: copy_one_pte()
There is no such thing as ioproc_update_page in any mainline tree. You must be looking at some vendor tree with braindead patches applied.
Mar 13, 12:18 pm 2007
Avi Kivity
Re: considering kevent - the kernel development process
I don't know about CGL, but a unified async notification mechanism _is_ needed. In ~2002, I needed one, and what I ended up with was a Red Hat 2.4 kernel with IO_CMD_POLL hacked in. Five years later, mainline still can't do that. Whether it's syslets, threadlets, io_getevents with networking support, kevents, epoll with kaio support, queued signals, or something else, users don't care, but they need _something_. Please don't ignore the users just because they quote the wrong text. ...
Mar 13, 1:13 pm 2007
Johann Borck
considering kevent - the kernel development process
if the next section seems obvious, please skip it ------------------------------------------------------------------------- There is a need for a generic event handling mechanism in Linux - maybe it is enough to cite the current CGL Performance Requirements Definition Version 4.0 at http://developer.osdl.org/dev/cgl/cgl40/cgl40-performance.pdf , page 13: PRF.5.0 Efficient Low-Level Asynchronous Events Priority P1 Description: OSDL CGL specifies that carrier grade Linux ...
Mar 13, 12:03 pm 2007
Christoph Hellwig
Re: considering kevent - the kernel development process
Please cut your cc lines next time, thanks. It's not obvious, but utter bullshit. Please go away if you think starting a post with quotes from CGL specs is even remotely close to a good idea.
Mar 13, 12:06 pm 2007
Andreas Schwab
Re: x86_64 system lockup from userspace using setitimer()
I can also reproduce it on ia64 with 2.6.20. 2.6.16.42 is ok. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Mar 13, 12:19 pm 2007
Johannes Bauer
x86_64 system lockup from userspace using setitimer()
Dear Community, I think I've encountered a bug with the Linux kernel which results in a complete system lockup and which can be started without root privileges. It's reproducible with 2.6.20.1 and 2.6.20.2 and only x86_64 seems affected. Here's the code which triggers the bug (originally found by me using an only partly initialized "struct itimerval" structure - hence the strange values in it_interval): -----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<-----8<----- #include ...
Mar 13, 11:55 am 2007
Chuck Ebbert Mar 13, 1:02 pm 2007
Thomas Gleixner
Re: x86_64 system lockup from userspace using setitimer()
No. The possible DoS is only when high res timers are enabled, which is not the case in 2.6.20. Looking at the values 140735669863712 = 0x7FFF 939C 0520 We convert seconds to nanoseconds: 140735669863712 * 1e9 = 0x1DCD 4BC3 6B82 914B 4000 The seconds value is limited to LONG_MAX, but on a 64 bit machine, the 140735669863712 is inside LONG_MAX and we have a multiplication overflow. I'm not sure how this results in a DoS, but I will look into this tomorrow morning, when I'm more ...
Mar 13, 1:33 pm 2007
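The overflow described in the reply above can be checked with a small user-space sketch. This is only an illustration of the arithmetic, not the kernel's conversion code: the function name is hypothetical, and int64_t stands in for a 64-bit long.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical helper: convert seconds to nanoseconds, refusing values
 * whose product would not fit in a signed 64-bit integer.
 * Returns true and stores the result in *ns on success; returns false
 * (leaving *ns untouched) if sec is negative or the product would overflow. */
static bool sec_to_ns_checked(int64_t sec, int64_t *ns)
{
    if (sec < 0 || sec > INT64_MAX / NSEC_PER_SEC)
        return false;
    *ns = sec * NSEC_PER_SEC;
    return true;
}
```

With a 64-bit limit, the largest seconds value that survives the multiplication is 9223372036 (roughly 292 years). The value from the report, 140735669863712, is far above that, so a plain multiplication wraps around even though the seconds value itself is below LONG_MAX.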
Greg KH
Linux 2.6.20.3
We (the -stable team) are announcing the release of the 2.6.20.3 kernel. It contains a number of bugfixes and all 2.6.20 users are recommended to upgrade. The diffstat and short summary of the fixes are below. I'll also be replying to this message with a copy of the patch between 2.6.20.2 and 2.6.20.3. The updated 2.6.20.y git tree can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.20.y.git and can be browsed at the normal kernel.org git web browser: ...
Mar 13, 11:33 am 2007
Greg KH
Re: Linux 2.6.20.3
diff --git a/Makefile b/Makefile index d165e80..7d2f304 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 6 SUBLEVEL = 20 -EXTRAVERSION = .2 +EXTRAVERSION = .3 NAME = Homicidal Dwarf Hamster # *DOCUMENTATION* diff --git a/arch/sparc/kernel/of_device.c b/arch/sparc/kernel/of_device.c index dab6169..798b140 100644 --- a/arch/sparc/kernel/of_device.c +++ b/arch/sparc/kernel/of_device.c @@ -495,7 +495,7 @@ static void __init build_device_resources(struct ...
Mar 13, 11:34 am 2007
Nish Aravamudan
Re: Linux 2.6.20.3
Compared to 2.6.20.1 (will try 2.6.20.2 as well), I now get: [ 199.361347] BUG: soft lockup detected on CPU#2! smp_percpu_timer_interrupt+0xd4/0x180 tl0_irq14+0x1c/0x20 journal_add_journal_head+0x2c/0x1e0 journal_write_metadata_buffer+0x480/0x500 journal_commit_transaction+0xc38/0x1040 kjournald+0xc0/0x1e0 kthread+0xb0/0xc0 kernel_thread+0x38/0x60 keventd_create_kthread+0x20/0xa0 shortly after the serial console prompts for login. Thanks, Nish
Mar 13, 2:57 pm 2007
Paulo Marques
Re: /proc/kallsyms race vs module unload
The only use for the "struct module *" is to display the name of the module. This can be solved by adding a "char mod_name[MODULE_NAME_LEN];" field to "kallsym_iter" and copy the name of the module over, while still holding module_mutex. It would be slightly slower, but safer. We can even change the function's interface, so that it doesn't return a "struct module *" at all, since AFAICS kallsyms is the only user of that function. It will still produce strange artifacts, though. If ...
Mar 13, 11:49 am 2007
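The idea in the reply above (copy the module name into the iterator while the lock is still held, instead of keeping a pointer to the module) can be sketched in user-space C. All names here are hypothetical stand-ins, the buffer size is an arbitrary assumption rather than the kernel's MODULE_NAME_LEN, and the locking itself is only indicated in comments.

```c
#include <stdio.h>

#define MOD_NAME_LEN 56 /* assumed size, for illustration only */

/* Stand-in for the module object that may be freed on unload. */
struct module_stub {
    char name[MOD_NAME_LEN];
};

/* Stand-in for the iterator: it owns a private copy of the name
 * and keeps no pointer to the module. */
struct iter_stub {
    char mod_name[MOD_NAME_LEN];
};

/* Must be called while the lock protecting 'mod' is held. Once this
 * returns, the iterator no longer depends on 'mod' staying alive. */
static void iter_set_owner(struct iter_stub *it, const struct module_stub *mod)
{
    if (mod)
        snprintf(it->mod_name, sizeof(it->mod_name), "%s", mod->name);
    else
        it->mod_name[0] = '\0'; /* symbol does not belong to a module */
}
```

The copy costs a few bytes per iteration step, which is the "slightly slower, but safer" trade-off mentioned in the message: printing the name later cannot touch freed memory.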
Alexey Dobriyan
/proc/kallsyms race vs module unload
Steps to reproduce: while true; do modprobe xfs; rmmod xfs; done vs while true; do cat /proc/kallsyms >/dev/null; done [where xfs could be any module, I haven't tried] BUG: unable to handle kernel paging request at virtual address e19f808c printing eip: c01dc361 *pde = 1ff5f067 *pte = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: CPU: 0 EIP: 0060:[<c01dc361>] Not tainted VLI EFLAGS: 00010297 (2.6.21-rc3-8b9909ded6922c33c221b105b26917780cfa497d #2) EIP is at ...
Mar 13, 11:18 am 2007
Alexey Dobriyan
Re: /proc/kallsyms race vs module unload
iter->owner = module_get_kallsym(iter->pos - kallsyms_num_syms, &iter->value, &iter->type, iter->name, sizeof(iter->name)); if (iter->owner == NULL) return 0; /* Label it "global" if it is exported, "local" if not exported. */ iter->type = is_exported(iter->name, iter->owner) ^^^^^^^^^^^ -
Mar 13, 4:07 pm 2007
Marc St-Jean
Re: [PATCH] drivers: PMC MSP71xx LED driver
Hi Florian, Thanks for pointing out the API. We were aware of the API which appeared when updating to 2.6.16 or 17. At some point if we require a userland API we will need to look at how to integrate support for it. Currently our driver has a different goal which is to support concurrent control of the LEDs (over TWI and GPIO) from kernel and non-kernel code. The non-kernel code here isn't userland code but an RTOS running on a I wasn't aware of this work, we'll need to look into it. ...
Mar 13, 10:42 am 2007
Folkert van Heusden
[2.6.20] BUG: workqueue leaked lock
... [ 1756.728209] BUG: workqueue leaked lock or atomic: nfsd4/0x00000000/3577 [ 1756.728271] last function: laundromat_main+0x0/0x69 [nfsd] [ 1756.728392] 2 locks held by nfsd4/3577: [ 1756.728435] #0: (client_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa [ 1756.728679] #1: (&inode->i_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa [ 1756.728923] [<c1003d57>] show_trace_log_lvl+0x1a/0x30 [ 1756.729015] [<c1003d7f>] show_trace+0x12/0x14 [ 1756.729103] [<c1003e79>] ...
Mar 13, 9:50 am 2007
Daniel Walker
Re: Stolen and degraded time and schedulers
The frequency tracking you mention is done to some extent inside the timekeeping adjustment functions, but I'm not sure it's totally accurate for non-timekeeping, and it also tracks things like interrupt latency. Tracking frequency changes where it's important to get it right shouldn't be done I think .. The sched_clock interface is basically a stripped down clocksource.. Are there other architectures which have this per-cpu clock frequency changing issue? I worked with several other ...
Mar 13, 2:27 pm 2007
Jeremy Fitzhardinge
Re: Stolen and degraded time and schedulers
I'm not sure I follow you here. Clocksources have the means to adjust the rate of time progression, mostly to warp the time for things like ntp. The stability or otherwise of the tsc is irrelevant. If you had a clocksource which was explicitly using the rate at which a CPU does work as a timebase, then using the same warping mechanism would Yes, that works. But a clocksource is strictly about measuring the progression of real time, and so doesn't generally measure how much work Well, ...
Mar 13, 2:59 pm 2007
Jeremy Fitzhardinge
Re: Stolen and degraded time and schedulers
Yes, you could imagine adding it as a clocksource variant, by allowing a clocksource to set a particular timebase: enum clocksource_timebase { CLOCK_TIMEBASE_REALTIME, CLOCK_TIMEBASE_CPU_WORK, ... }; struct clocksource { enum clocksource_timebase timebase; ... } Most of the existing clocksource infrastructure would only operate on CLOCK_TIMEBASE_REALTIME clocksources, so I'm not sure how much overlap there would be here. In the case of dealing with cpufreq, there's a certain ...
Mar 13, 1:32 pm 2007
Jeremy Fitzhardinge
Stolen and degraded time and schedulers
The current Linux scheduler makes one big assumption: that 1ms of CPU time is the same as any other 1ms of CPU time, and that therefore a process makes the same amount of progress regardless of which particular ms of time it gets. This assumption is wrong now, and will become more wrong as virtualization gets more widely used. It's wrong now, because it fails to take into account several kinds of missing time: 1. interrupts - time spent in an ISR is accounted to the current ...
Mar 13, 9:31 am 2007
john stultz
Re: Stolen and degraded time and schedulers
My gut reaction would be to avoid using clocksources for now. While there is some thought going into how to expand clocksources for other uses (Daniel is working on this, for example), the design for clocksources has been very focused on its utility to timekeeping, so I'm hesitant to try to complicate the clocksources in order to multiplex functionality until what is really needed is well understood. I suspect the best approach would be to see how the sched_clock interface can be reworked/used for ...
Mar 13, 1:12 pm 2007
Atsushi Nemoto
[PATCH 2.6.21-rc3] serial: Allocate minor device numbers ...
The serial_txx9 driver has abused device numbers (major 4, minor 128) if CONFIG_SERIAL_TXX9_STDSERIAL was not set. This patch makes the driver allocate minor numbers from major 204 (Low-density serial ports). I suppose a typical user of this driver sets CONFIG_SERIAL_TXX9_STDSERIAL to Y (i.e. uses "ttyS0"), so this patch would not cause a big compatibility issue. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> --- diff --git a/Documentation/devices.txt b/Documentation/devices.txt index ...
Mar 13, 9:09 am 2007
Dan Aloni
thread stacks and strict vm overcommit accounting
Hello, This question is relevant to 2.6.20. I noticed that if the RSS for the stack size is say, 8MB, running a single-threaded process doesn't incur an increase of 8MB to Committed_AS (/proc/meminfo). However, on multi-threaded apps linked with pthread (on Debian Etch with 2.6.20 vanilla x86_64), every thread will incur the specified maximum stack size RSS (assuming that you use the default attr). In other words, it appears that vm accounting works differently in that case. Is ...
Mar 13, 9:33 am 2007
Tomáš Janoušek
Re: [5/6] 2.6.21-rc3: known regressions
Hi, Ok, this was bullshit. Nohz and hrtimers turned off really solve the issue with having to press keys. Seems like the yesterday's check was for the other issue and I just pressed the keys automatically, remembering that I had to. Sorry, -- Tomáš Janoušek, a.k.a. Liskni_si, http://work.lisk.in/
Mar 13, 3:43 pm 2007
thomas
Re: [5/6] 2.6.21-rc3: known regressions
Can you please try to compile without nohz and without hrtimers and try it again? This is maybe the same error I encounter. See also: http://www.ussg.iu.edu/hypermail/linux/kernel/0703.1/1506.html with kind regards thomas
Mar 13, 8:51 am 2007
Tomáš Janoušek
Re: [5/6] 2.6.21-rc3: known regressions
Hi, A colleague told me to try this yesterday and if I remember correctly, it did not help. I may try it again because I'm not sure whether it wasn't some other thing that did not work, but I'm absolutely sure that no matter what these settings are, the machine does not resume or hangs soon after resume (in the case of s2disk). And this is since 2.6.20. 2.6.21 just added the 'wait until keypress'. Regards, -- Tomáš Janoušek, a.k.a. Liskni_si, http://work.lisk.in/
Mar 13, 8:56 am 2007
Herbert Poetzl
Re: [Fwd: DELIVERY FAILURE: 554 Service unavailable; Cl ...
no, you have to add the binaries too, as they are mapped read only/executeable and shared between guests .. same when I find the time, I can actually set up 100 guests with and without sharing and check the difference on mail server here: binaries and libraries, no shared files (as they are harder to figure) sum up to 28904 while the total RSS used inside the guest is at 102256 so we have roughly 1/3rd shared here between guests we should consider it now, and implement it ...
Mar 13, 7:24 am 2007
Alexey Dobriyan
[PATCH -mm] proc: remove pathetic ->deleted WARN_ON
WARN_ON(de && de->deleted); is sooo unreliable. Why? proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find proc entry] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find proc entry] proc_get_inode ============== WARN_ON(de && de->deleted); ... if (!atomic_read(&de->count)) free_proc_entry(de); else de->deleted = 1; So, if you have some strange oops [1], and doesn't ...
Mar 13, 7:31 am 2007
Avi Kivity
[GIT PULL] kvm updates for 2.6.21-rc3
Linus, The following kvm stability and correctness fixes since commit 8b9909ded6922c33c221b105b26917780cfa497d: Linus Torvalds (1): Merge branch 'merge' of master.kernel.org:/.../paulus/powerpc are found in the 'linus' branch of the git repository at: git://kvm.qumranet.com/home/avi/kvm.git Avi Kivity (4): KVM: Unset kvm_arch_ops if arch module loading failed KVM: Fix guest sysenter on vmx KVM: MMU: Fix guest writes to nonpae pde KVM: MMU: Fix host ...
Mar 13, 7:18 am 2007
Glauber de Oliveira ...
Re: [PATCH] Remove duplicated code for reading control r ...
thanks. Attached now -- Glauber de Oliveira Costa Red Hat Inc. "Free as in Freedom"
Mar 13, 5:52 am 2007
Ralf Baechle
[SOUND] ice1712: build fixes
CC [M] sound/pci/ice1712/ice1712.o sound/pci/ice1712/ice1712.c:290: error: snd_ice1712_mixer_digmix_route_ac97 causes a section type conflict sound/pci/ice1712/ice1712.c:1630: error: snd_ice1712_eeprom causes a section type conflict sound/pci/ice1712/ice1712.c:1894: error: snd_ice1712_pro_internal_clock causes a section type conflict sound/pci/ice1712/ice1712.c:1965: error: snd_ice1712_pro_internal_clock_default causes a section type conflict sound/pci/ice1712/ice1712.c:2004: error: ...
Mar 13, 5:48 am 2007
Tejun Heo
Re: [3/6] 2.6.21-rc2: known regressions
Can you apply the attached patch and report what the kernel says with ACPI turned on? -- tejun
Mar 13, 5:41 am 2007
Mathieu Bérard
Re: [3/6] 2.6.21-rc2: known regressions
Hi, I got this: [ 13.523816] SCSI subsystem initialized [ 13.528914] ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19 [ 14.529383] ahci 0000:00:1f.2: AHCI 0001.0000 32 slots 4 ports 1.5 Gbps 0x5 impl IDE mode [ 14.529439] ahci 0000:00:1f.2: flags: 64bit ncq pm led slum part [ 14.529565] ata1: SATA max UDMA/133 cmd 0xf8824d00 ctl 0x00000000 bmdma 0x00000000 irq 19 [ 14.529683] ata2: SATA max UDMA/133 cmd 0xf8824d80 ctl 0x00000000 bmdma 0x00000000 irq 19 [ ...
Mar 13, 1:56 pm 2007
Glauber de Oliveira ...
[PATCH] Remove duplicated code for reading control registers
Tiny cleanup: In x86_64, the same functions for reading cr3 and writing cr{3,4} are defined in tlbflush.h and system.h, with just a name change. The only difference is the clobbering of memory, which seems a safe, and even needed change for the write_cr4. This patch removes the duplicate. write_cr3() is moved to system.h for consistency. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> -- Glauber de Oliveira Costa Red Hat Inc. "Free as in Freedom"
Mar 13, 5:30 am 2007
Tejun Heo
Re: IDE disk runs just in DMA/33 with 2.6.20.2 on nVidia ...
Cable detection on CK804 is basically broken at the moment. It should be fixed with Alan's acpi cable detection magic soon (maybe 2.6.21). Thanks. -- tejun
Mar 13, 5:19 am 2007
Alan Cox
Re: IDE disk runs just in DMA/33 with 2.6.20.2 on nVidia ...
I have the pieces that fix all this, but not yet integrated into the main tree. With 2.6.21rc3, if you have 80 wire cables, you may find that simply hacking the driver to set ATA_CBL_PATA80 is sufficient, but maybe not. Alan
Mar 13, 6:40 am 2007
l.genoni
IDE disk runs just in DMA/33 with 2.6.20.2 on nVidia CK8 ...
Hi, I reported this also for the 2.6.20 kernel. The new libata with the nVidia CK804 controller initializes the disk in DMA/33, while with 2.6.19.5 and previous kernels the disk is correctly initialized in DMA/100. The cable is OK, and with older kernels the disk runs without trouble. The system has two SATA disks on nVidia CK804 controllers, and then a disk as primary master and a DVD writer (DMA/33) as secondary master. Here is lspci -vxxx 00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2) ...
Mar 13, 5:13 am 2007
Mockern
Question: kthread zombie process
Hi, I use kthread in my driver. The problem is that I can't kill the process with the kill_proc function. After rmmod my_driver, the ps command still shows SW my_driver. How can I remove this process? Thank you
Mar 13, 5:06 am 2007
linux-os (Dick Johnson)
Re: Question: kthread zombie process
You need to follow the procedures used by other kernel drivers that use kernel threads. The kernel thread must be properly run-down during the module removal process. An easy to read sample is in ../drivers/wireless/airo.c. Cheers, Dick Johnson Penguin : Linux version 2.6.16.24 on an i686 machine (5592.64 BogoMips). New book: http://www.AbominableFirebug.com/ ...
Mar 13, 5:33 am 2007
Mel Gorman
[PATCH] Fix corruption of memmap on IA64 SPARSEMEM when ...
There are problems in the use of SPARSEMEM and pageblock flags that cause problems on ia64. The first part of the problem is that units are incorrect in SECTION_BLOCKFLAGS_BITS computation. This results in a map_section's section_mem_map being treated as part of a bitmap which isn't good. This was evident with an invalid virtual address when mem_init attempted to free bootmem pages while relinquishing control from the bootmem allocator. The second part of the problem occurs because the ...
Mar 13, 3:42 am 2007
Pierre.Peiffer
[PATCH 2.6.21-rc3-mm2 2/4] Make futex_wait() use an hrti ...
This patch modifies futex_wait() to use an hrtimer + schedule() in place of schedule_timeout(). schedule_timeout() is tick based, therefore the timeout granularity is the tick (1 ms, 4 ms or 10 ms depending on HZ). By using a high resolution timer for timeout wakeup, we can attain a much finer timeout granularity (in the microsecond range). This parallels what is already done for futex_lock_pi(). The timeout passed to the syscall is no longer converted to jiffies and is therefore passed ...
Mar 13, 2:52 am 2007
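The granularity argument in the patch description above can be illustrated with a little arithmetic. The sketch below is a simplified user-space model, not the kernel's jiffies conversion: the function names are hypothetical, and it only assumes that a tick-based timeout cannot expire before the next tick boundary.

```c
/* Length of one timer tick in microseconds for a given HZ value
 * (HZ = 1000, 250 and 100 give 1 ms, 4 ms and 10 ms ticks). */
static unsigned long tick_usec(unsigned int hz)
{
    return 1000000ul / hz;
}

/* Simplified model of a tick-based timeout: the requested time is
 * rounded up to a whole number of ticks. */
static unsigned long tick_rounded_timeout_usec(unsigned long timeout_usec,
                                               unsigned int hz)
{
    unsigned long tick = tick_usec(hz);
    unsigned long ticks = (timeout_usec + tick - 1) / tick;

    return ticks * tick;
}
```

Under this model a 150 microsecond timeout becomes 4 ms at HZ = 250 and 10 ms at HZ = 100, which is the coarseness the hrtimer-based wakeup is meant to avoid.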
Pierre.Peiffer
[PATCH 2.6.21-rc3-mm2 4/4] sys_futex64 : allows 64bit futexes
This last patch is an adaptation of the sys_futex64 syscall provided in -rt patch (originally written by Ingo). It allows the use of 64bit futex. I have re-worked most of the code to avoid the duplication of the code. It does not provide the functionality for all architectures (only for x64 for now). Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net> --- include/asm-x86_64/futex.h | 113 ++++++++++++++++++++ include/asm-x86_64/unistd.h | 4 include/linux/futex.h | 7 ...
Mar 13, 2:52 am 2007
Pierre.Peiffer
[PATCH 2.6.21-rc3-mm2 3/4] futex_requeue_pi optimization
This patch provides the futex_requeue_pi functionality. This provides an optimization, already used for (normal) futexes, to be used for PI-futexes. This optimization is currently used by the glibc in pthread_cond_broadcast, when using "normal" mutexes. With futex_requeue_pi, it can be used with PRIO_INHERIT mutexes too. Signed-off-by: Pierre Peiffer <pierre.peiffer@bull.net> --- include/linux/futex.h | 8 kernel/futex.c | 557 ...
Mar 13, 2:52 am 2007
Jiri Kosina
Re: Question re hiddev
Hi Robert, not only this piece of hardware, but many others - for example almost all Bluetooth keyboards and mice are capable of working both in HID and HCI modes, etc. The layer introduced in 2.6.20 gives them the possibility of I don't think so. Firstly, it will take some time until the HID layer is converted to bus, as I have other things pending. Secondly, the iowarrior driver will still be needed to handle the HW-specific reports that won't be handled by the generic HID layer, ...
Mar 13, 2:54 am 2007
Joerg Roedel
[PATCH] x86_64: fix cpu MHz reporting on constant_tsc cpus
From: Mark Langsdorf <mark.langsdorf@amd.com> From: Joerg Roedel <joerg.roedel@amd.com> This patch fixes the reporting of cpu_mhz in /proc/cpuinfo on CPUs with a constant TSC rate and a kernel with disabled cpufreq. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> -- Joerg Roedel Operating System Research Center AMD Saxony LLC & Co. KG
Mar 13, 3:00 am 2007
Pierre.Peiffer
[PATCH 2.6.21-rc3-mm2 0/4] Futexes functionalities and i ...
Hi Andrew, This is a re-send of a series of patches concerning futexes (a short description follows). Could you consider them for inclusion in the -mm tree? All of them have already been discussed in January and have already been included in -rt for a while. I think that we agreed to potentially include them in the -mm tree. Ulrich is especially interested in sys_futex64. There are: * futex uses prio list : allows RT-threads to be woken in priority order instead of FIFO ...
Mar 13, 2:52 am 2007
Pierre.Peiffer
[PATCH 2.6.21-rc3-mm2 1/4] futex priority based wakeup
Today, all threads waiting for a given futex are woken in FIFO order (first waiter woken first) instead of priority order. This patch makes use of plist (priority ordered lists) instead of a simple list in futex_hash_bucket. All non-RT threads are stored with priority MAX_RT_PRIO, causing them to be woken last, in FIFO order (RT-threads are woken first, in priority order). Signed-off-by: S
Mar 13, 2:52 am 2007
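The wakeup order described in the patch above can be modelled with a small sorted list. This is a user-space sketch under stated assumptions, not the kernel's plist implementation: the struct and function names are hypothetical, a lower numeric value means higher priority, and non-RT waiters all use MAX_RT_PRIO so they queue behind every RT waiter in FIFO order.

```c
#include <stddef.h>

#define MAX_RT_PRIO 100 /* priority assigned to every non-RT waiter */

/* Hypothetical waiter record; 'id' only identifies the thread. */
struct waiter {
    int prio;            /* lower value = woken earlier */
    int id;
    struct waiter *next;
};

/* Insert 'w' so the list stays sorted by ascending prio. A new waiter
 * goes after existing waiters of the same prio, which preserves FIFO
 * order within one priority level. */
static void waiter_add(struct waiter **head, struct waiter *w)
{
    while (*head && (*head)->prio <= w->prio)
        head = &(*head)->next;
    w->next = *head;
    *head = w;
}

/* Remove and return the next waiter to wake, or NULL if none. */
static struct waiter *waiter_pop(struct waiter **head)
{
    struct waiter *w = *head;

    if (w) {
        *head = w->next;
        w->next = NULL;
    }
    return w;
}
```

If two non-RT threads (ids 1 and 3) and two RT threads with priorities 10 and 5 (ids 2 and 4) are queued in id order, they are popped as 4, 2, 1, 3: RT threads first by priority, then the non-RT threads in arrival order.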
ST
Re: Gigabyte GA-M57SLI-S4 (the linuxbios compatible vers ...
Thanks for your answer. Since this board has linuxbios support and I plan to put it on this board, there has been a post which tells the right boot parameters: apic=debug acpi_dbg_level=0xffffffff pci=noacpi,routeirq snd-hda-intel.enable_msi=1 I guess the "pci=noacpi,routeirq" makes the difference, but I haven't tried yet myself. ST
Mar 13, 2:57 am 2007
Maximus
Porting V4L2 drivers to 2.6.20
Hey, I am porting V4L2 drivers from 2.6.13 to 2.6.20. The driver is using a structure 'video_device' which exists in include/linux/videodev.h. However, the Linux kernel in 2.6.20 does not have that structure. Has the architecture changed between 2.6.13 and 2.6.20 for V4L2? Regards, Jo
Mar 13, 3:00 am 2007
Laurent Pinchart
Re: Porting V4L2 drivers to 2.6.20
The structure has been moved to include/media/v4l2-dev.h Best regards, Laurent Pinchart
Mar 13, 3:08 am 2007
Serge Belyshev
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Mike Galbraith <efault@gmx.de> writes: Whaa? make -j8 on mainline makes my desktop box completely useless. Please reconsider your statement.
Mar 13, 4:41 am 2007
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I'll do no such thing, and don't appreciate the insinuation. -Mike
Mar 13, 4:46 am 2007
Con Kolivas
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I only find a slowdown, no choppiness, no audio stutter (it would be extremely hard to make audio stutter in this design without i/o starvation or something along those lines). The number difference in cpu percentage I've already given in the previous email. The graphics driver does feature in this test Kernel compiles seem similar till the jobs get above about 3 where rsdl gets slower but still smooth. Audio is basically unaffected either way. Don't forget all the rest of the cases ...
Mar 13, 3:06 am 2007
John Stoffel
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
>>>>> "Serge" == Serge Belyshev <belyshev@depni.sinp.msu.ru> writes: Serge> Mike Galbraith <efault@gmx.de> writes: Serge> Whaa? make -j8 on mainline makes my desktop box completely Serge> useless. I tried a make -j5 on my Dual processor PIII Xeon box. It was pretty slow. Firefox was ok scrolling with keyboard, but the scrollbar was jerky. I also had MP3s playing at the same time and it worked just fine. No drops that I heard. This is 2.6.21-rc3 patched with RSDL. To me, that ...
Mar 13, 8:36 am 2007
Ingo Molnar
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
ok. So nice levels had nothing to do with it - it's some other regression somewhere. How does the vanilla scheduler cope with the exactly same workload? I.e. could you describe the 'delta' difference in behavior - because the delta is what we are interested in mostly, the 'absolute' behavior alone is not sufficient. Something like: - on scheduler foo, under this workload, the CPU hogs steal 70% CPU time and the resulting desktop experience is 'choppy': mouse pointer is laggy ...
Mar 13, 2:39 am 2007
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
My first test run with lame at nice 0 was truly horrid, but has _not_ repeated, so disregard that as an anomaly. For the most part, it is as you say, things just get slower with load, any load. I definitely am seeing lurchiness which is not present in mainline. No audio problems It seems to be a plain linear slowdown. The lurchiness I'm experiencing varies in intensity, and is impossible to quantify. I see neither Absolutely, all test results count. -Mike -
Mar 13, 4:23 am 2007
Jan Beulich
__HAVE_ARCH_PTEP_TEST_AND_CLEAR_{DIRTY,YOUNG} on i386
Isn't defining these on i386 of at most historical value? The only consumer is include/asm-generic/pgtable.h (ptep_clear_flush_{dirty,young}), and those are already suppressed by __HAVE_ARCH_PTEP_CLEAR_{DIRTY,YOUNG}_FLUSH being defined in include/asm-i386/pgtable.h. Or is there a particular need to detect if other uses appear (in which case the comment accompanying their definitions is pretty misleading)? Thanks, Jan -
Mar 13, 2:45 am 2007
Roland McGrath
[PATCH 2/2] Remove OPEN_MAX
The OPEN_MAX macro in limits.h should not be there. It claims to be the limit on file descriptors in a process, but its value is wrong for that. There is no constant value, but a variable resource limit (RLIMIT_NOFILE). Nothing in the kernel uses OPEN_MAX except things that are wrong to do so. I've submitted other patches to remove those uses. The proper thing to do according to POSIX is not to define OPEN_MAX at all. The sysconf (_SC_OPEN_MAX) implementation works by calling ...
Mar 13, 1:40 am 2007
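The point in the OPEN_MAX entry above, that the descriptor limit is a variable resource limit (RLIMIT_NOFILE) rather than a compile-time constant, can be sketched as a run-time query. This is a minimal user-space sketch using the POSIX getrlimit() call; the helper name fd_soft_limit() is hypothetical and is not part of the patch.

```c
#include <limits.h>
#include <sys/resource.h>

/* Return the current (soft) per-process limit on open file descriptors,
 * queried at run time instead of being read from an OPEN_MAX macro.
 * Returns -1 if the limit cannot be read, and LONG_MAX if the limit is
 * unlimited or too large to fit in a long. */
static long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
        return LONG_MAX;
    return (long)rl.rlim_cur;
}
```

Because the value can differ per process and can change through setrlimit(), code that sizes a table from it would need to query it when the table is created rather than bake in a constant.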
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
It's no big deal, Con and I just seem to be oil and water. He'll have to be oil, because water is already taken. *evaporate* :) -
Mar 13, 1:22 am 2007
Benjamin LaHaise
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
This is a bad idea. From linux/fs.h: #undef NR_OPEN #define NR_OPEN (1024*1024) /* Absolute upper limit on fd num */ There isn't anything I can see guaranteeing that net/scm.h is included before fs.h. This affects networking and should really be Cc'd to netdev@vger.kernel.org, which will raise the issue that if SCM_MAX_FD is raised, the resulting simple kmalloc() must be changed. That said, I doubt SCM_MAX_FD really needs to be raised, as applications using many file ...
Mar 13, 7:17 am 2007
Roland McGrath
[PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
The OPEN_MAX constant is an arbitrary number with no useful relation to anything. Nothing should be using it. This patch changes SCM_MAX_FD to use NR_OPEN instead of OPEN_MAX. This increases the size of the struct scm_fp_list type fourfold, to make it big enough to contain as many file descriptors as could be asked of it. This size increase may not be very worthwhile, but at any rate if an arbitrary limit unrelated to anything else is being defined it should be done explicitly here ...
Mar 13, 1:39 am 2007
Roland McGrath
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
Ok. My only agenda is to get rid of OPEN_MAX. I then propose the following instead. Thanks, Roland --- [PATCH] avoid OPEN_MAX in SCM_MAX_FD The OPEN_MAX constant is an arbitrary number with no useful relation to anything. Nothing should be using it. SCM_MAX_FD is just an arbitrary constant and it should be clear that its value is chosen in net/scm.h and not actually derived from anything else meaningful in the system. Signed-off-by: Roland McGrath <roland@redhat.com> --- ...
Mar 13, 1:02 pm 2007
Linus Torvalds
Re: [PATCH 1/2] avoid OPEN_MAX in SCM_MAX_FD
I'd actually prefer this as part of the "remove OPEN_MAX" patch. It's certainly nice to have small independent patches in a series, but two one-liners that really aren't all that independent either in practice or in goals doesn't make much sense to me. Much better to just be up-front about things and say: "remove OPEN_MAX, and to do so, just rewrite that other arbitrary constant to not need it any more". That said, it actually worries me that you should call "_SC_OPEN_MAX". I think ...
Mar 13, 2:28 pm 2007
Roland McGrath
[PATCH] Remove CHILD_MAX
The CHILD_MAX macro in limits.h should not be there. It claims to be the limit on processes a user can own, but its value is wrong for that. There is no constant value, but a variable resource limit (RLIMIT_NPROC). Nothing in the kernel uses CHILD_MAX. The proper thing to do according to POSIX is not to define CHILD_MAX at all. The sysconf (_SC_CHILD_MAX) implementation works by calling getrlimit. Signed-off-by: Roland McGrath <roland@redhat.com> --- include/linux/limits.h | 1 - 1 ...
Mar 13, 1:42 am 2007
Gene Heskett
Re: New thread RDSL, post-2.6.20 kernels and amanda (tar ...
And amanda/tar worked normally for 2.6.20.2 plain. Next up, 2.6.21-rc1 if it will build here. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Politics: A strife of interests masquerading as a contest of principles. The conduct of public affairs for private advantage. -- Ambrose Bierce -
Mar 13, 11:36 am 2007
Gene Heskett
New thread RDSL, post-2.6.20 kernels and amanda (tar) mi ...
Greetings; Someone suggested a fresh thread for this. I now have my scripts more or less under control, and I can report that kernel-2.6.20.1 with no other patches does not exhibit the undesirable behaviour where tar thinks its all new, even when told to do a level 2 on a directory tree that hasn't been touched in months to update anything. Next up, 2.6.20.2, plain and with the latest RDSL-0.30 patch. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ...
Mar 13, 1:28 am 2007
JanuGerman
LSM Stacking
Hi All, Within the security folder in the kernel tree, the 2.6.20 linux kernel distribution is shipped with a file root_plug.c (written by Greg Kroah-Hartman), which is a classic introduction to Linux Security Modules (LSM). The folder also contains the folder of SELinux. My question is whether the root_plug.c security module is stacked with the SELinux security module or not. If root_plug.c is stacked, where can I find the code which handles the stacking of SELinux and root_plug.c ...
Mar 13, 12:44 am 2007
Chris Wright
Re: LSM Stacking
Look at rootplug_init where it does mod_reg_security. If you have SELinux builtin, and subsequently register rootplug, it will stack under SELinux. Typically this Check the linux-security-module archives. This issue has been discussed quite extensively numerous times there. thanks, -chris -
Mar 13, 2:17 pm 2007
Randy.Dunlap
Re: [PATCH 1/2] Fix some coding-style errors in autofs
Please do a complete job on the 'for' line by eliminating the space before each semi-colon. -- ~Randy -
Mar 12, 10:02 pm 2007
sukadev
Re: [PATCH 1/2] Fix some coding-style errors in autofs
Randy.Dunlap [rdunlap@xenotime.net] wrote: | On Mon, 12 Mar 2007 sukadev@us.ibm.com wrote: | | Please do a complete job on the 'for' line by eliminating the | space before each semi-colon. | | -- | ~Randy Ok. Here is the updated patch. The Patch 2/2 in this set should still apply cleanly on top of this. --- From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 1/2] Fix some coding-style errors in autofs Fix coding style errors (extra spaces, long lines) in autofs and ...
Mar 13, 10:52 am 2007
sukadev
[PATCH 1/2] Fix some coding-style errors in autofs
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 1/2] Fix some coding-style errors in autofs Fix coding style errors (extra spaces, long lines) in autofs and autofs4 files being modified for container/pidspace issues. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: containers@lists.osdl.org Cc: Eric W. Biederman <ebiederm@xmission.com> --- ...
Mar 12, 9:50 pm 2007
sukadev
[PATCH 1/5] statically initialize struct pid for swapper
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 1/5] statically initialize struct pid for swapper Statically initialize a struct pid for the swapper process (pid_t == 0) and attach it to init_task. This is needed so task_pid(), task_pgrp() and task_session() interfaces work on the swapper process also. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: ...
Mar 12, 9:42 pm 2007
sukadev
[PATCH 3/5] Use struct pid parameter in copy_process()
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 3/5] Use struct pid parameter in copy_process() Modify copy_process() to take a struct pid * parameter instead of a pid_t. This simplifies the code a bit and also avoids having to call find_pid() to convert the pid_t to a struct pid. Changelog: - Fixed Badari Pulavarty's comments and passed in &init_struct_pid from fork_idle(). - Fixed Eric Biederman's comments and simplified this patch and used a new patch to remove ...
Mar 12, 9:44 pm 2007
sukadev
[PATCH 4/5] Remove the likely(pid) check in copy_process
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 4/5] Remove the likely(pid) check in copy_process Now that we pass in a struct pid parameter to copy_process() and even the swapper (pid_t == 0) has a valid struct pid, we no longer need this check. Changelog: Per Eric Biederman's comments, moved this out to a separate patch for easier review. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen ...
Mar 12, 9:44 pm 2007
Rafael J. Wysocki
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you please unset them and retest? Thanks, Rafael -
Mar 13, 2:22 am 2007
Dave Jones
Re: 2.6.21rc suspend to ram regression on Lenovo X60
On Tue, Mar 13, 2007 at 10:22:53AM +0100, Rafael J. Wysocki wrote: > On Tuesday, 13 March 2007 05:08, Dave Jones wrote: > > I spent considerable time over the last day or so bisecting to > > find out why an X60 stopped resuming somewhen between 2.6.20 and current -git. > > (Total lockup, black screen of death). > > Do you have CONFIG_TICK_ONESHOT or CONFIG_NO_HZ set? If you do, could you > please unset them and retest? I did try with NO_HZ unset, made no difference, I don't recall ...
Mar 13, 6:22 am 2007
Matt Mackall
Re: 2.6.21rc suspend to ram regression on Lenovo X60
If you've got a tree that looks like: --a-b-c-d-e-f-g-h-> \ / i-j-k-l-m-n where h is bad but both g and n are good, you can try testing the merge of g+k, etc. Which will find half the problem. Then you can do the same on the other side. Tedious. The best way to debug resume issues directly seems to be to do a fake suspend, possibly with filtering out particular ...
Mar 13, 7:41 am 2007
Dave Jones
2.6.21rc suspend to ram regression on Lenovo X60
I spent considerable time over the last day or so bisecting to find out why an X60 stopped resuming somewhen between 2.6.20 and current -git. (Total lockup, black screen of death). The bisect log looked like this. git-bisect start # bad: [c8f71b01a50597e298dc3214a2f2be7b8d31170c] Linux 2.6.21-rc1 git-bisect bad c8f71b01a50597e298dc3214a2f2be7b8d31170c # good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20 git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7 # bad: ...
Mar 12, 9:08 pm 2007
Eric W. Biederman
Re: 2.6.21rc suspend to ram regression on Lenovo X60
Ok. This is weird. It looks like you marked the merge bad but its individual commits as good.... Which would indicate a problem on one of the branches it was merged with, or a problem that only shows up when both groups of changes Thanks. Of those msi patches you have identified I don't see anything really obvious. And you actually marked them as good in your bisect so I don't expect it is core problem. We do have a known e1000 regression, with msi and suspend/resume. So it is ...
Mar 13, 1:11 am 2007
sukadev
[PATCH 5/5] Use task_pgrp() task_session() in copy_process()
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 5/5] Use task_pgrp() task_session() in copy_process(). Use task_pgrp() and task_session() in copy_process(), and avoid find_pid() call when attaching the task to its process group and session. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: containers@lists.osdl.org Acked-by: Eric W. Biederman ...
Mar 12, 9:45 pm 2007
sukadev
[PATCH 2/5] Explicitly set pgid and sid of init process
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 2/5] Explicitly set pgid and sid of init process Explicitly set pgid and sid of init process to 1. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: <containers@lists.osdl.org> Acked-by: Eric W. Biederman ...
Mar 12, 9:43 pm 2007
Jeff Garzik
Re: Fwd: libata extension
SAT (aka ATA passthru) defines how to do soft-reset. SG_IO supports the ATA_12 and ATA_16 commands which permit soft-reset and similar tasks. libata supports this interface, but does not yet support soft-reset and similar non-command-oriented tasks. This would be the best area to add such features, though. Jeff -
Mar 13, 4:23 am 2007
Vitaliyi
Re: Fwd: libata extension
> Why is the access to Control register needed? Reading/writing service area, uploading, downloading modules, working with flash etc. -
Mar 12, 7:36 pm 2007
albcamus
RE: Question: removal of syscall macros?
I have the same problem as yours. Do you have any idea to use ATI firegl driver in recent kernels ? Thanks in advance. Regards, albcamus -
Mar 12, 8:06 pm 2007
Nick Piggin
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Depends on the tlb flush implementation. The generic one doesn't look like it is all that smart about optimising the fullmm case. It does skip some Still, it is something we could try. -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com -
Mar 13, 5:18 am 2007
Nick Piggin
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Page allocator still requires interrupts to be disabled, which this doesn't. Considering there isn't much else that frees known zeroed pages, I wonder if it is worthwhile. Last time the zeroidle discussion came up was IIRC not actually real performance gain, just cooking the 1024 CPU threaded pagefault numbers ;) -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com -
Mar 13, 1:03 am 2007
Paul Mundt
Re: [QUICKLIST 1/4] Generic quicklist implementation
This doesn't work, and so CONFIG_QUICKLIST is always set. The NR_QUICK thing seems a bit backwards anyways, perhaps it would make more sense to have architectures set CONFIG_GENERIC_QUICKLIST in the same way that the other GENERIC_xxx bits are defined, and then set NR_QUICK based off of that. It's obviously going to be 2 or 1 for most people, and x86 seems to be the only one that needs 2. How about this? -- diff --git a/mm/Kconfig b/mm/Kconfig index 7942b33..2f20860 100644 --- ...
Mar 13, 2:05 am 2007
Christoph Lameter
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Ok, then what did I do wrong 3 years ago with the prezeroing patchsets? -
Mar 13, 4:20 am 2007
Nick Piggin
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Well I guess that would be the case if you had just unmapped a 4MB chunk that was pretty dense with pages. My malloc seems to allocate and free in blocks of 128K, so that's only going to give us 3% of the last level pte being cache hot when it gets freed. Not sure what common mmap(file) access patterns look like. The majority of programs I run have a smattering of llpt pages pretty sparsely populated, covering text, libraries, heap, stack, vdso. We don't actually have to zap_pte_range ...
Mar 13, 4:30 am 2007
Jeremy Fitzhardinge
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Can pagetable pages be shared between mms? (Kernel pmds in PAE excepted.) J -
Mar 13, 1:17 pm 2007
Andrew Morton Mar 13, 5:30 am 2007
Andrew Morton
Re: [QUICKLIST 0/4] Arch independent quicklists V2
I'm trying to remember why we ever would have needed to zero out the pagetable pages if we're taking down the whole mm? Maybe it's because "oh, the arch wants to put this page into a quicklist to recycle it", which is all rather circular. It would be interesting to look at a) leave the page full of random garbage if we're releasing the whole mm and b) return it straight to the page allocator. -
Mar 13, 5:47 am 2007
Peter Chubb
Re: [QUICKLIST 0/4] Arch independent quicklists V2
>>>>> "Jeremy" == Jeremy Fitzhardinge <jeremy@goop.org> writes: Jeremy> And do the same in pte pages for actual mapped pages? Or do Jeremy> you think they would be too densely populated for it to be Jeremy> worthwhile? We've been doing some measurements on how densely clumped ptes are. On 32-bit platforms, they're pretty dense. On IA64, quite a bit sparser, depending on the workload of course. I think that's mostly because of the larger pagesize on IA64 -- with 64k pages, you don't need ...
Mar 13, 2:46 pm 2007
Nick Piggin
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Well we have the 'fullmm' case, which avoids all the locked pte operations (for those architectures where hardware pt walking requires atomicity). However we still have to visit those to-be-unmapped parts of the page table, to find the pages and free them. So we still at least need to bring it into cache for the read... at which point, the store probably isn't a big burden. -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com -
Mar 13, 5:01 am 2007
Christoph Lameter
[QUICKLIST 2/4] Quicklist support for i386
i386: Convert to quicklists Implement the i386 management of pgd and pmds using quicklists. The i386 management of page table pages currently uses page sized slabs. The page state is therefore mainly determined by the slab code. However, i386 also uses its own fields in the page struct to mark special pages and to build a list of pgds using the ->private and ->index field (yuck!). This has been finely tuned to work right with SLAB but SLUB needs more control over the page struct. Currently ...
Mar 13, 12:13 am 2007
Matt Mackall
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Because we'd need one link per mm that a page is mapped in? -- Mathematics is the supreme nostalgia of our time. -
Mar 13, 1:03 pm 2007
Christoph Lameter
[QUICKLIST 4/4] Quicklist support for sparc64
From: David Miller <davem@davemloft.net> [QUICKLIST]: Add sparc64 quicklist support. I ported this to sparc64 as per the patch below, tested on UP SunBlade1500 and 24 cpu Niagara T1000. Signed-off-by: David S. Miller <davem@davemloft.net> --- arch/sparc64/Kconfig | 4 ++++ arch/sparc64/mm/init.c | 24 ------------------------ arch/sparc64/mm/tsb.c | 2 +- include/asm-sparc64/pgalloc.h | 26 ++++++++++++++------------ 4 files changed, 19 insertions(+), ...
Mar 13, 12:13 am 2007
Nick Piggin
Re: [QUICKLIST 0/4] Arch independent quicklists V2
On a Pentium 4? ;) Sure, that is a minor detail, considering that you'll usually be allocating The thing is, pagetable pages are the one really good exception to the rule that we should keep cache hot and initialise-on-demand. They typically are fairly sparsely populated and sparsely accessed. Even for last level page tables, I think it is reasonable to assume they will usually be pretty cold. And you want to allocate cache cold pages as well, for the same reasons (you want to keep your ...
Mar 13, 4:06 am 2007
Andrew Morton
Re: [QUICKLIST 0/4] Arch independent quicklists V2
If you want a zeroed page for pagecache and someone has just stuffed a known-zero, cache-hot page into the pagetable quicklists, you have good reason to be upset. In fact, if you want a _non_-zeroed page and someone has just stuffed a known-zero, cache-hot page into the pagetable quicklists, you still have reason to be upset. You *want* that cache-hot page. Generally, all these little private lists of pages (such as the ones which slab had/has) are a bad deal. Cache effects preponderate ...
Mar 13, 4:52 am 2007
Christoph Lameter
[QUICKLIST 1/4] Generic quicklist implementation
Abstract quicklist from the IA64 implementation Extract the quicklist implementation for IA64, clean it up and generalize it to allow multiple quicklists and support for constructors and destructors. Signed-off-by: Christoph Lameter <clameter@sgi.com> --- arch/ia64/Kconfig | 4 ++ arch/ia64/mm/contig.c | 2 - arch/ia64/mm/discontig.c | 2 - arch/ia64/mm/init.c | 51 --------------------------- include/asm-ia64/pgalloc.h | 82 ...
Mar 13, 12:13 am 2007
Andrew Morton
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Unsurprised. Were non-temporal stores tried? -
Mar 13, 5:27 am 2007
Christoph Lameter
[QUICKLIST 0/4] Arch independent quicklists V2
V1->V2 - Add sparc64 patch - Single i386 and x86_64 patch - Update attribution - Update justification - Update approvals - Earlier discussion of V1 was at http://marc.info/?l=linux-kernel&m=117357922219342&w=2 This patchset introduces an arch independent framework to handle lists of recently used page table pages. It is necessary for x86_64 and i386 to avoid the special casing of SLUB because these two platforms use fields in the page_struct (page->index and page->private) that SLUB ...
Mar 13, 12:13 am 2007
Christoph Lameter
[QUICKLIST 3/4] Quicklist support for x86_64
Convert x86_64 to using quicklists This adds caching of pgds and puds, pmds, pte. That way we can avoid costly zeroing and initialization of special mappings in the pgd. A second quicklist is useful to separate out PGD handling. We can carry the initialized pgds over to the next process needing them. Also clean up the pgd_list handling to use regular list macros. There is no need anymore to avoid the lru field. Move the add/removal of the pgds to the pgdlist into the constructor / ...
Mar 13, 12:13 am 2007
Jeremy Fitzhardinge
Re: [QUICKLIST 0/4] Arch independent quicklists V2
And do the same in pte pages for actual mapped pages? Or do you think they would be too densely populated for it to be worthwhile? J -
Mar 13, 2:36 pm 2007
Jeremy Fitzhardinge
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Why not try to find a place to stash a linklist pointer and link them all together? Saves the pulldown pagetable walk altogether. J -
Mar 13, 10:30 am 2007
Andrew Morton
Re: [QUICKLIST 0/4] Arch independent quicklists V2
I suspect there are some tlb operations which could be skipped in that case It means all that data has to be written back. Yes, I expect it'll prove to be less costly than the initial load. -
Mar 13, 6:11 am 2007
Matt Mackall
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Ahh, I think the issue is that we have to walk the page tables to drop the reference count of the _actual pages_ they point to. The page tables themselves could all be put on a list or two lists (one for PMDs, one for everything else), but that wouldn't really be a win over just walking the tree, especially given the extra list maintenance. Because the fan-out is large, the bulk of the work is bringing the last layer of the tree into cache to find all the pages in the address space. And ...
Mar 13, 1:21 pm 2007
Andrew Morton
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Well if they're zero then perhaps they should be released to the page allocator to satisfy the next __GFP_ZERO request. If that request is for a pagetable page, we break even (except we get to remove special-case code). If that __GFP_ZERO allocation was for some application other than for a pagetable, we win. iow, can we just nuke 'em? (Will require some work in the page allocator) (That work will open the path to using the idle thread to prezero pages) -
Mar 13, 1:53 am 2007
David Miller
Re: [QUICKLIST 0/4] Arch independent quicklists V2
From: Matt Mackall <mpm@selenic.com> That's right. And I will note that historically we used to be much worse in this area, as we used to walk the page table tree twice on address space teardown (once to hit the PTE entries, once to free the page tables). Happily it is a one-pass algorithm now. But, within active VMA ranges, we do have to walk all the bits at least one time. -
Mar 13, 2:07 pm 2007
Paul Mackerras
Re: [QUICKLIST 0/4] Arch independent quicklists V2
I don't see much point to them. For powerpc, I would rather grab an My recollection was that it wasn't a win, but it was a long time ago... Paul. -
Mar 13, 4:58 pm 2007
Matt Mackall
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Well you -could- do this: - reuse a long in struct page as a used map that divides the page up into 32 or 64 segments - every time you set a PTE, set the corresponding bit in the mask - when we zap, only visit the regions set in the mask Thus, you avoid visiting most of a PMD page in the sparse case, assuming PTEs aren't evenly spread across the PMD. This might not even be too horrible as the appropriate struct page should be in cache with the appropriate bits of the mm already ...
Mar 13, 2:14 pm 2007
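The used-map idea in Matt Mackall's message above (one bit per segment of a page-table page, set whenever an entry is written, consulted at zap time) can be sketched in user space as follows. Everything here is a hypothetical illustration under stated assumptions: struct table_page, table_set() and table_zap() are made-up names, the 512-entry page and 64 segments are assumed sizes, and the used_map field merely stands in for the spare long in struct page that the message proposes reusing.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES_PER_PAGE    512  /* assumption: 4 KiB page of 8-byte entries */
#define SEGMENTS            64   /* one bit per segment in a 64-bit map */
#define ENTRIES_PER_SEGMENT (ENTRIES_PER_PAGE / SEGMENTS)

/* Hypothetical stand-in for a page-table page plus its used map. */
struct table_page {
    uint64_t entry[ENTRIES_PER_PAGE];
    uint64_t used_map;           /* bit n set => segment n may hold entries */
};

/* Write an entry and mark the segment that contains it as used. */
static void table_set(struct table_page *t, size_t idx, uint64_t val)
{
    t->entry[idx] = val;
    t->used_map |= UINT64_C(1) << (idx / ENTRIES_PER_SEGMENT);
}

/* Clear the page, visiting only segments whose bit is set.
 * Returns the number of entries actually touched. */
static size_t table_zap(struct table_page *t)
{
    size_t touched = 0;

    for (size_t seg = 0; seg < SEGMENTS; seg++) {
        if (!(t->used_map & (UINT64_C(1) << seg)))
            continue;            /* segment never written: skip it */
        for (size_t i = 0; i < ENTRIES_PER_SEGMENT; i++) {
            t->entry[seg * ENTRIES_PER_SEGMENT + i] = 0;
            touched++;
        }
    }
    t->used_map = 0;
    return touched;
}
```

In this sketch a page with two entries in different segments is zapped by touching 16 slots instead of 512, which is the saving the message hopes for in the sparse case; a densely populated page would see no benefit and pay for the extra bit update on every write.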
Christoph Lameter
Re: [QUICKLIST 0/4] Arch independent quicklists V2
Nope that won't work. 1. We need to support other states of pages other than zeroed. 2. Prezeroing does not make much sense if a large portion of the page is being used. Performance is better if the whole page is zeroed directly before use. Prezeroing only makes sense for sparse I already tried that 3 years ago and there was *no* benefit for usual users of the page allocator. The advantage exists only if a small portion of the page is used. F.e. For one cacheline there was a 4x ...
Mar 13, 4:17 am 2007
David Miller
Re: [QUICKLIST 0/4] Arch independent quicklists V2
From: Matt Mackall <mpm@selenic.com> Yes, I've even had that idea before. You can even hide it behind pmd_none() et al., the generic VM doesn't even have to know that the page table macros are doing this optimization. -
Mar 13, 2:48 pm 2007
Andrew Morton
Re: [QUICKLIST 0/4] Arch independent quicklists V2
eh? I'd have thought that a pte page which has just gone through zap_pte_range() will very often have a _lot_ of hot cachelines, and that's a common case. Yeah, prezeroing in idle is probably pointless. But I'm not aware of anyone having tried it properly... -
Mar 13, 5:15 am 2007
Andi Kleen
Re: [PATCH] Introduce load_TLS to the "for" loop.
Are you sure? Normally it doesn't unroll without -funroll-loops which the kernel does normally not set. Especially not with -Os builds. -Andi -
Mar 13, 6:50 am 2007
Rusty Russell
[PATCH] Introduce load_TLS to the "for" loop.
GCC (4.1 at least) unrolls it anyway, but I can't believe this code was ever justifiable. (I've also submitted a patch which cleans up i386, which is even uglier). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> diff -r de5618b5e562 include/asm-x86_64/desc.h --- a/include/asm-x86_64/desc.h Tue Mar 13 11:41:55 2007 +1100 +++ b/include/asm-x86_64/desc.h Tue Mar 13 16:09:56 2007 +1100 @@ -135,16 +135,13 @@ static inline void set_ldt_desc(unsigned (info)->useable == 0 && \ ...
Mar 12, 11:39 pm 2007
Andi Kleen
Re: [PATCH] Introduce load_TLS to the "for" loop.
It's in the middle of the context switch. -Andi -
Mar 13, 1:55 pm 2007
Jeremy Fitzhardinge
Re: [PATCH] Introduce load_TLS to the "for" loop.
Does it matter either way in this case? J -
Mar 13, 10:31 am 2007
Rusty Russell
[PATCH] Remove unused set_seg_base
The set_seg_base function isn't used anywhere (2.6.21-rc3-git1) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> diff -r 0798f7cfc709 include/asm-x86_64/desc.h --- a/include/asm-x86_64/desc.h Mon Mar 12 16:56:18 2007 +1100 +++ b/include/asm-x86_64/desc.h Tue Mar 13 11:39:16 2007 +1100 @@ -107,16 +107,6 @@ static inline void set_ldt_desc(unsigned DESC_LDT, size * 8 - 1); } -static inline void set_seg_base(unsigned cpu, int entry, void *base) -{ - struct desc_struct *d = ...
Mar 12, 11:38 pm 2007
Ben Dooks
Re: Need help on mach-ep93xx
subscribe to the linux-arm-kernel list and ask the question there, you'll find more ARM people there. -- Ben (ben@fluff.org, http://www.fluff.org/) 'a smiley only costs 4 bytes' -
Mar 13, 4:24 am 2007
Maxin John
Need help on mach-ep93xx
Hi, I have one question about mach-ep93xx. In the EP93xx IRQ handling part in core.c, the 2.6.19.2 kernel and newer kernels configure the 16 interrupts of ports A & B together. The code does not use the interrupt capability of port F, which can provide 3 interrupts. Why is port F not configured for interrupts? Thanks in advance, Maxin B. John -
Mar 12, 10:24 pm 2007
Bartlomiej Zolnierki ...
Re: [Bugme-new] [Bug 8187] New: 2.6.20 "PCI: Quirks" pat ...
this should be fixed in 2.6.21-rc3, commit ed8ccee0918ad063a4741c0656fda783e02df627 Bart -
Mar 13, 4:18 am 2007
Greg KH Mar 12, 10:59 pm 2007
Andrew Morton
Re: [Bugme-new] [Bug 8187] New: 2.6.20 "PCI: Quirks" pat ...
argh. Would we break more machines than we fix if we just revert that? -
Mar 12, 11:19 pm 2007
sukadev
[PATCH 2/2] Replace pid_t in autofs with struct pid reference
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH 2/2] Replace pid_t in autofs with struct pid reference. Make autofs container-friendly by caching struct pid reference rather than pid_t and using pid_nr() to retrieve a task's pid_t. ChangeLog: - Fix Eric Biederman's comments - Use find_get_pid() to hold a reference to oz_pgrp and release while unmounting; separate out changes to autofs and autofs4. - Fix Cedric's comments: retain old prototype of parse_options() ...
Mar 12, 9:51 pm 2007
sukadev
[PATCH] Kill unused sesssion and group values in rocket driver
From: Sukadev Bhattiprolu <sukadev@us.ibm.com> Subject: [PATCH] Kill unused sesssion and group values in rocket driver The process_session() and process_group() values are not really used by the driver. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: containers@lists.osdl.org Cc: Eric W. Biederman <ebiederm@xmission.com> --- drivers/char/rocket.c | 3 --- ...
Mar 12, 9:49 pm 2007
young dave
Fwd: PROBLEM: 2.6.20-1 not working on ibook g4 (BUG/Oops)
---------- Forwarded message ---------- Hi, I have tested on my mac mini g4. The 2.6.21-rc2 will cause oops like the above post. And for the new 2.6.21-rc3-git7 , the kernel load ok, penguin pixmap appears, but then it stopped, there's no error messages also. Regards dave -
Mar 12, 6:54 pm 2007
Jeremy Fitzhardinge
[PATCH] i386: Simplify smp_call_function*() by using com ...
Subject: Simplify smp_call_function*() by using common implementation smp_call_function and smp_call_function_single are almost complete duplicates of the same logic. This patch combines them by implementing them in terms of the more general smp_call_function_mask(). Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Stephane Eranian <eranian@hpl.hp.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Andi Kleen <ak@suse.de> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Cc: Ingo Molnar ...
Mar 12, 6:12 pm 2007
Rusty Russell
Re: CONFIG_REORDER Kconfig help strange sentence.
OK, well here is a patch for the moment. == Clarify CONFIG_REORDER explanation if (1 && X) => if (X). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> diff -r de5618b5e562 arch/x86_64/Kconfig --- a/arch/x86_64/Kconfig Tue Mar 13 11:41:55 2007 +1100 +++ b/arch/x86_64/Kconfig Tue Mar 13 17:27:05 2007 +1100 @@ -632,8 +632,8 @@ config REORDER default n help This option enables the toolchain to reorder functions for a more - optimal TLB usage. If you have ...
Mar 12, 11:37 pm 2007
Andi Kleen
Re: [PATCH] Fix vmi time header bug
I don't think the patch should make any difference, so that's not needed. -Andi -
Mar 13, 5:45 am 2007
Andrew Morton
Re: [PATCH] Fix vmi time header bug
Correctly matching the section annotation on declarations and definitions is needed by at least ARM. We should ensure that we do this on all future patches and we should also apply this patch if only for this reason. (The ARM thing is a pain, because the compiler cannot check that the definition and declaration match. However something like sparse could do so). -
Mar 13, 6:59 am 2007
Jeremy Fitzhardinge
Re: [PATCH] Fix vmi time header bug
It's also useful documentation. Knowing whether a function is __init is an important part of its interface. J -
Mar 13, 10:32 am 2007
Linus Torvalds
Re: [PATCH] Fix vmi time header bug
Well, I guess sparse could do it, but the fact is, this is just a gcc bug. It would be much better if *gcc* just checked the function attributes it cared about. Anybody want to send a bug-report to the gcc lists? Here's a trivial test-case. #define section(x) __attribute__((__section__(x))) extern int section(".text.one") test_function(int); int section(".text.two") test_function(int arg) { return arg+1; } and the bug is that gcc doesn't warn about the section ...
Mar 13, 10:53 am 2007
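For contrast with the mismatched test case quoted above, here is a minimal sketch of the matched form this thread asks for, with the same section attribute on both the declaration and the definition. It assumes GCC or Clang on an ELF target; the `section()` macro and the `.text.one` name are taken from the quoted test case, and the choice of a single section is only for illustration.

```c
/* Sketch only: assumes GCC or Clang on an ELF target (e.g. Linux). */
#define section(x) __attribute__((__section__(x)))

/* Declaration, as it would appear in a header file. */
extern int section(".text.one") test_function(int arg);

/* Definition: carries the same section name as the declaration. */
int section(".text.one") test_function(int arg)
{
	return arg + 1;
}
```

Keeping the attribute in both places also documents the placement at the point where callers look it up, which is the argument made for `__init` elsewhere in this thread.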
Zachary Amsden
Re: [PATCH] Fix vmi time header bug
User build was smoking this: make O=build -j16 This and non-repeatable results make me suspect some kind of build dependency problem, or perhaps a make bug. Still, please apply, as it doesn't hurt. Zach -
Mar 12, 11:46 pm 2007
Zachary Amsden
Re: [PATCH] Fix vmi time header bug
According to the report I have. Perhaps a bogus section qualifier does more damage than an omitted one. I'll get gcc / linker version, but this could be a combination of user error, a strange toolchain, and Yes, I was surprised by this as well, and I'm still skeptical about this being the real cause. Still, this reportedly fixed the problem, and is certainly not a bad thing. Zach -
Mar 12, 11:21 pm 2007
Andrew Morton
Re: [PATCH] Fix vmi time header bug
Really truly? I think we have a _lot_ of declarations which omit the section qualifier altogether. How come they don't all break too? (ARM (at least) in fact does require the section tagging on the declaration as well as the definition, but we've thus far only fixed that in a couple of places which were causing breakage). -
Mar 12, 11:31 pm 2007
Mathieu Desnoyers
Re: Djprobes questions
Oh, I see. You can do it because you don't support fully preemptible kernels. It makes sense in that scenario. I could not use this scheme Great, continue your good work! Regards, -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 -
Mar 13, 10:24 am 2007
Masami Hiramatsu
Re: Djprobes questions
Hi Mathieu, Hmm, djprobe already might not wait for each CPU to hit the probe point. It just waits for scheduler synchronization instead. After that, it issues cpuid for cache serialization before executing cross-modified code. The most difficult point of djprobe is that it has to replace "live" instructions. So we must check other processors not to run Same idea was already discussed. It might work on normal kernel, but, unfortunately, it doesn't work on preemptive ...
Mar 12, 11:07 pm 2007
Mark Lord
Re: Asus P5B-VM motherboard: cd drive malfunctions if i ...
That's nice. But the P5B-VM board does not have any such jumper for USB, nor does it have any obvious combination of BIOS-setup options to accomplish it. -
Mar 13, 9:23 am 2007
Lennart Sorensen
Re: Asus P5B-VM motherboard: cd drive malfunctions if i ...
Most Asus boards have jumpers for the USB ports to select between +5V and +5VSB (stand by power). The reason to provide standby power is so that keyboards with power buttons can remain powered so that you can turn the system on using the usb keyboard. If you want to power off the ports entirely, jumper them to the +5V line instead which only has power when the system is on. -- Len Sorensen -
Mar 13, 8:32 am 2007
Lennart Sorensen
Re: Asus P5B-VM motherboard: cd drive malfunctions if i ...
Well it could only be done by hardware. The P5B has those jumpers. I figured the P5B-VM while a budget micro board would still have those. I guess not. Without jumper settings for it there is nothing you can do about it. A quick look through the manual certainly only mentions standby power for the keyboard connector and not for the USB ports. -- Len Sorensen -
Mar 13, 11:54 am 2007
Alan Cox
Re: Asus P5B-VM motherboard: cd drive malfunctions if i ...
It ought to be rock solid, perhaps you can send me a detailed bug report. In fact it actually doesn't do very much at all as the controller is The two drive the devices identically although if there was a problem I guess the libata one would recover better. Would appreciate bug reports anyway so I can do more detailed analysis. Alan -
Mar 12, 5:04 pm 2007
Arnaud Giersch
Re: [PATCH] cleanfile: a script to clean up stealth whitespace
What about checking for a newline at end of file? Something like: # Add a newline at end of file, if needed. seek(FILE, -1, 2); if (read(FILE, $last_char, 1) == 1 && $last_char ne "\n") { seek(FILE, 0, 2); print FILE "\n"; [...] Regards, Arnaud Giersch -
Mar 13, 7:16 am 2007
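The Perl fragment above checks the last byte of the file and appends a newline when it is missing. The same idea can be sketched in C with standard I/O; the function name `ensure_trailing_newline` and its return convention are assumptions made for this illustration, not part of the cleanfile script.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns 1 if a newline was appended,
 * 0 if the file needed no change, -1 on error.
 */
int ensure_trailing_newline(const char *path)
{
	FILE *fp = fopen(path, "r+b");
	int last, rc = 0;

	if (!fp)
		return -1;
	/* Seek to the last byte; this fails on an empty file, which needs no fix. */
	if (fseek(fp, -1L, SEEK_END) == 0) {
		last = fgetc(fp);
		if (last != EOF && last != '\n') {
			/* A seek is required between reading and writing an update stream. */
			if (fseek(fp, 0L, SEEK_END) != 0 || fputc('\n', fp) == EOF)
				rc = -1;
			else
				rc = 1;
		}
	}
	if (fclose(fp) != 0)
		rc = -1;
	return rc;
}
```

Calling it twice on the same file should report 1 and then 0, since the second call finds the newline already in place.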
Andrew Morton
Re: [PATCH] cleanfile: a script to clean up stealth whitespace
Fair enough. It'd be nice to have a clean-up-a-patch version of this. So it does all these things, except it only changes lines which start with ^+. -
Mar 12, 11:14 pm 2007
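A patch-only variant, as requested above, would touch nothing but lines that start with `+`. The sketch below shows that filter for one cleanup, stripping trailing spaces and tabs; `clean_patch_line` is a hypothetical name, and the other cleanups the script performs are not covered here. Note that `+++` file header lines also begin with `+` and are treated the same way in this sketch.

```c
#include <string.h>

/*
 * Hypothetical helper: strips trailing spaces and tabs from an added
 * line ("+..."), editing the buffer in place. Lines that do not start
 * with '+' are left alone. Returns the number of bytes removed.
 */
size_t clean_patch_line(char *line)
{
	size_t len = strlen(line);
	size_t end, removed;
	int has_nl;

	if (line[0] != '+')
		return 0;
	has_nl = len > 0 && line[len - 1] == '\n';
	end = has_nl ? len - 1 : len;
	removed = end;
	/* Walk back over trailing blanks, but never past the leading '+'. */
	while (end > 1 && (line[end - 1] == ' ' || line[end - 1] == '\t'))
		end--;
	removed -= end;
	if (has_nl)
		line[end++] = '\n';
	line[end] = '\0';
	return removed;
}
```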
H. Peter Anvin
Re: [PATCH] cleanfile: a script to clean up stealth whitespace
Correction: for a context/unified diff it can be done by observing that there is no context left at the end of the file. It won't work if the file already has empty space at the end of it, but that's probably good enough. I'll cook something up. -hpa -
Mar 12, 10:37 pm 2007
H. Peter Anvin
Re: [PATCH] cleanfile: a script to clean up stealth whitespace
It can do everything except kill empty lines at the end of the file; a patch simply doesn't contain enough information to know if blank lines are inserted at the end of a file as opposed in the middle of the file. It can, of course, be done if the unpatched material is available, probably by applying the patch and seeing what happens. Let me know if you still want it; I'll whip it up. -hpa -
Mar 12, 10:33 pm 2007
Florian Fainelli
Re: [PATCH] drivers: PMC MSP71xx LED driver
Hi Marc, Your patch does not seem to use the Linux LED API (include/linux/leds.h), which is sometimes pretty unknown, but dramatically ease your work. Maybe it is a good idea converting it to this API if you find it relevant. Also consider the ongoing LED-GPIO API which is being written by ARM people: http://marc.theaimsgroup.com/?l=linux-kernel&m=110873454720555&w=2 My 2 cents -- Cordialement, Florian ...
Mar 13, 1:33 am 2007
Alexey Dobriyan
Re: module.h and moduleparam.h: more header file pedantry
Regardless of what you do: cross-compile test! After aforementioned removal and adding "struct kernel_param;" + akmk arm-assabet -k CHK include/linux/version.h make[2]: `include/asm-arm/mach-types.h' is up to date. Using /home/linux/linux-irq-flags-t as source for kernel GEN /home/linux/build/arm-assabet/Makefile CHK include/linux/utsrelease.h CHK include/linux/compile.h CC arch/arm/nwfpe/fpmodule.o arch/arm/nwfpe/fpmodule.c:179: error: syntax error ...
Mar 13, 3:05 pm 2007
Robert P. J. Day
Re: module.h and moduleparam.h: more header file pedantry
oh, i've already been by that and figured out what's going on. i'm going to summarize this on the KJ wiki. it's really quite the mess. rday -- ======================================================================== Robert P. J. Day Linux Consulting, Training and Annoying Kernel Pedantry Waterloo, Ontario, CANADA http://fsdev.net/wiki/index.php?title=Main_Page ======================================================================== -
Mar 13, 3:04 pm 2007
Herbert Poetzl
Re: Summary of resource management discussion
nope, no numbers for that, but I appreciate some testing and probably can do some testing in this regard too (although I want to get some testing done for the resource host only for now best, -
Mar 13, 4:50 pm 2007
Herbert Poetzl
Re: Summary of resource management discussion
what about identifying different resource categories and handling them according to the typical usage pattern? like the following: - cpu and scheduler related accounting/limits - memory related accounting/limits - network related accounting/limits - generic/file system related accounting/limits I don't worry too much about having the generic/file stuff attached to the nsproxy, but the cpu/sched stuff might be better off being directly reachable from the task that is what ...
Mar 13, 9:24 am 2007
Srivatsa Vaddagiri
Re: Summary of resource management discussion
I think we should experiment with both combinations (a direct pointer to cpu_limit structure from task_struct and an indirect pointer), get some numbers and then decide. Or do you have results already with Interesting. What about the /dev/cpuset view? Is that the same for all containers, or do you restrict that view to the container's cpuset only? -- Regards, vatsa -
Mar 13, 10:58 am 2007
Soeren Sonnenburg
Re: s2ram still broken with CONFIG_NO_HZ / HPET (macbook pro)
Well it is there: ATA: abnormal status 0x80 on port 0x000140df ata1.00: configured for UDMA/33 ata3.01: NODEV after polling detection ata3.01: revalidation failed (errno=-2) ata3: failed to recover some devices, retrying in 5 secs ATA: abnormal status 0x7F on port 0x000140df ATA: abnormal status 0x7F on port 0x000140df ata3.01: configured for UDMA/133 SCSI device sda: 234441648 512-byte hdwr sectors (120034 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 SCSI device sda: write ...
Mar 13, 2:53 am 2007
Andy Gospodarek
Re: [patch 4/4] [TULIP] Rev tulip version
It's good to keep this type of information in drivers. I've been thinking lately that it would be nice to even expand it a little bit (maybe include the commit sum) so it's easier to help those who aren't running the latest upstream kernels on their boxes.... -
Mar 13, 10:07 am 2007
Frank van Maarseveen
Re: 2.6.20*: PATA DMA timeout, hangs (2)
2.6.20.2 rejects this patch and I don't see a way to apply it by hand: ide_set_dma() isn't there, nothing seems to match. -- Frank -
Mar 13, 2:19 am 2007
Bartlomiej Zolnierki ...
Re: 2.6.20*: PATA DMA timeout, hangs (2)
The patch is for 2.6.21-rc3, sorry for not making it clear. Bart -
Mar 13, 4:04 am 2007
Alan Cox
Re: sys_write() racy for multi-threaded append?
Not normally because POSIX sensibly invented pread/pwrite. Forgot There is almost no descriptor this is true for. Any file I/O can and will end up short on disk full or resource limit exceeded or quota exceeded or NFS server exploded or ... And on the device side about the only thing with the vaguest guarantees Easy enough to do and gcov plus dejagnu or similar tools will let you Competent QA and testing people test all the returns in the manual as well as all the returns they can ...
Mar 12, 7:24 pm 2007
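The pread/pwrite point above is that these calls take an explicit offset, so threads sharing a descriptor do not race on the shared file position. A minimal sketch, assuming fixed-size records; `RECORD_SIZE`, `open_record_file` and `write_record_at` are names invented for this illustration, and a short write is simply reported as failure here.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 16	/* assumed fixed record length for the sketch */

/* Hypothetical helper: opens (and truncates) a file for record I/O. */
int open_record_file(const char *path)
{
	return open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
}

/*
 * Hypothetical helper: writes one record into slot `index`.
 * pwrite() takes the offset as an argument and leaves the
 * descriptor's file position untouched.
 */
int write_record_at(int fd, unsigned int index, const char record[RECORD_SIZE])
{
	off_t off = (off_t)index * RECORD_SIZE;
	ssize_t n = pwrite(fd, record, RECORD_SIZE, off);

	return n == RECORD_SIZE ? 0 : -1;
}
```

This only covers writers that know their target offset in advance; it is not a substitute for append semantics, which is what the thread subject is about.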
Christoph Hellwig
Re: sys_write() racy for multi-threaded append?
Michael, please stop spreading this utter bullshit _now_. You're so full of half-knowledge that it's not funny anymore, and you try to insult people knowing a few magnitudes more than you left and right. Can you please select a different mailinglist for your trolling? -
Mar 13, 12:09 pm 2007
David Miller
Re: sys_write() racy for multi-threaded append?
From: "Michael K. Edwards" <medwards.linux@gmail.com> You're not even safe over standard output, simply run the program over ssh and you suddenly have socket semantics to deal with. In the early days the fun game to play was to run programs over rsh to see in what amusing way they would explode. ssh has replaced rsh in this game, but the bugs have largely stayed the same. Even early versions of tar used to explode on TCP half-closes and whatnot. In short, if you don't handle short ...
Mar 13, 12:42 am 2007
Michael K. Edwards
Re: sys_write() racy for multi-threaded append?
OK, I laughed out loud at this. But I think you're missing my point, which is that there's a time to be hard-core about code quality and there's a time to be hard-core about _product_ quality. Face it, all products containing software more or less suck. This is because most programmers write crap code most of the time. The only way to cope with this, outside the confines of the European defense industry and other niches insulated from economic reality, is to make the production environment ...
Mar 12, 5:46 pm 2007
David M. Lloyd
Re: sys_write() racy for multi-threaded append?
You don't even need special tools: just change your code that says: foo = write(fd, mybuf, mycount); to say (for example): foo = write(fd, mybuf, mycount / randomly_either_1_or_2); Why would this need kernel support? The average developer doesn't really need to verify that the *kernel* works. They just need to test their own code paths - and in this case, they can see that foo is less than mycount (sometimes). The code paths don't care that it was not the kernel that caused ...
Mar 13, 7:00 am 2007
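The trick described above, deliberately asking write() for less than the full count, can be combined with the usual retry loop to exercise the short-write path in user space. The sketch below is one way to do that; `flaky_write` and `write_all` are hypothetical names, and treating a zero-byte result as an error is a simplification chosen to keep the loop from spinning.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Test shim: randomly requests the full count or only half of it. */
static ssize_t flaky_write(int fd, const void *buf, size_t count)
{
	size_t divisor = (size_t)(rand() % 2) + 1;	/* randomly either 1 or 2 */
	size_t n = count / divisor;

	if (n == 0)
		n = count;	/* never turn a 1-byte request into a 0-byte one */
	return write(fd, buf, n);
}

/* Keeps writing until all `count` bytes are out or a real error occurs. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	size_t left = count;

	while (left > 0) {
		ssize_t n = flaky_write(fd, p, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before writing: retry */
			return -1;
		}
		if (n == 0)
			return -1;	/* no progress: give up rather than spin */
		p += n;
		left -= (size_t)n;
	}
	return (ssize_t)count;
}
```

In production code the call inside the loop would be plain `write()`; the shim exists only so that tests see short writes regardless of what the kernel does.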
Michael K. Edwards
Re: sys_write() racy for multi-threaded append?
Clearly f_pos atomicity has been handled differently in the not-so-distant past: http://www.mail-archive.com/linux-fsdevel@vger.kernel.org/msg01628.html And equally clearly the current generic_file_llseek semantics are erroneous for large offsets, and we shouldn't be taking the inode mutex in any case other than SEEK_END: http://marc.theaimsgroup.com/?l=linux-fsdevel&m=100584441922835&w=2 read_write.c is a perfect example of the relative amateurishness of parts of the Linux kernel. It ...
Mar 13, 10:59 am 2007
Michael K. Edwards
Re: sys_write() racy for multi-threaded append?
Thank you Christoph for that informative response to my comments. I take it that you consider read_write.c to be code of the highest quality and maintainability. If you have something specific in mind when you write "utter bullshit" and "half-knowledge", I'd love to hear it. Now, for those who still care to respond as if improving the kernel were a goal that you and I can share, a question: When generic_file_llseek needs the inode in order to retrieve the current file size, it goes ...
Mar 13, 4:40 pm 2007
Michael K. Edwards
Re: sys_write() racy for multi-threaded append?
pread/pwrite address a minuscule fraction of lseek+read(v)/write(v) use cases -- a fraction that someone cared about strongly enough to get into X/Open CAE Spec Issue 5 Version 2 (1997), from which it propagated into UNIX98 and thence into POSIX.2 2001. The fact that no one has bothered to implement preadv/pwritev in the decade since pread/pwrite entered the Single UNIX standard reflects the rarity with which they appear in general code. Life is too short to spend it rewriting application ...
Mar 13, 12:25 am 2007
Michael K. Edwards
Re: sys_write() racy for multi-threaded append?
I'm intimately familiar with this one. Easily worked around by piping the output through cat or tee. Not that one should ever write code for a *nix box that can't cope with stdout being a socket or tty; but sometimes the quickest way to glue existing code into a pipeline is to pass /dev/stdout in place of a filename, and there's no real value in reworking legacy code to handle short writes when you can just drop in Right in one. You're writing a program for a controlled ...
Mar 13, 9:24 am 2007
Alan Cox
Re: sys_write() racy for multi-threaded append?
One way to do this is to use kprobes which will do exactly what you want Not easily with gdbstubs as you've got to talk to something to decide how to log the data and proceed. If you stick it kernel side it's a lot of ugly new code and easier to port kprobes over, if you do it remotely as gdbstubs intends it adds latencies and screws all your timings. gdbstubs is also not terribly SMP aware and for low level work it's sometimes easier to have one gdb per processor if you can get your ...
Mar 13, 6:15 am 2007
Felipe Alfaro Solana
Re: [ck] Re: RSDL v0.30 cpu scheduler for mainline kernels
I guess Con was kidding. A 24-CPU system can be anything but lousy hardware. -
Mar 12, 10:03 pm 2007
Con Kolivas
Re: RSDL v0.30 cpu scheduler for mainline kernels
Very nice. Thanks for the feedback and I'm sorry you have to work with such lousy hardware. -- -ck -
Mar 12, 8:05 pm 2007
Chris Friesen
Re: [ck] RSDL v0.30 cpu scheduler for mainline kernels
If the app has root privileges to set RT policy, then it could also set deeply negative nice values as well. Doesn't really help the regular user with no privileges. Chris -
Mar 13, 10:45 am 2007
Lee Revell
Re: [ck] RSDL v0.30 cpu scheduler for mainline kernels
Sounds like Wengophone is broken. It should be using RT threads for time critical work, as JACK and Ardour2 are doing. No scheduler can help if the apps are not correctly written... Lee -
Mar 13, 8:53 am 2007
Willy Tarreau
Re: RSDL v0.30 cpu scheduler for mainline kernels
BTW, I don't know if you say this as a joke, but those are not necessarily lousy hardware. Sun does lousy hardware when they put Sparcs in PCs (ultra5, ultra10, blade100). But their servers generally are nice with large memory busses and very scalable SMP architectures. Regards, Willy -
Mar 12, 9:32 pm 2007
michael chang
Re: [ck] Re: RSDL v0.30 cpu scheduler for mainline kernels
Considering that Con's general mass target for his -ck patchset is single-cored x86/PC-compatible system such as a Celeron or a single-core Athlon with 32-256 MB RAM, (disclaimer: this is an assumption) I think he is being rather sarcastic. The only other way to interpret this is that those machines are totally different from what Con uses for development, and as such, maybe Con'd find them "lousy" because if something DID go wrong, it would be less obvious he'd figure it out right away. But I ...
Mar 13, 6:10 am 2007
David Miller
Re: RSDL v0.30 cpu scheduler for mainline kernels
From: Willy Tarreau <w@1wt.eu> He was definitely being sarcastic, relax :-) -
Mar 12, 10:29 pm 2007
Lee Revell
Re: [ck] RSDL v0.30 cpu scheduler for mainline kernels
Well this was supposed to be solved by RLIMIT_NICE and RLIMIT_RTPRIO which went into mainline about a year ago but distros have been very slow to pick up the new PAM, glibc and bash packages. We don't have a clear picture yet of what defaults the distros will ship. Lee -
Mar 13, 1:02 pm 2007
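For readers unfamiliar with RLIMIT_NICE as mentioned above: an unprivileged process can query it with getrlimit() to see how far it may lower its nice value. The sketch below assumes Linux with glibc and uses the mapping documented in getrlimit(2), where the nice ceiling is `20 - rlim_cur`; the function name `lowest_allowed_nice` is invented for this example.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: stores in *out the lowest nice value the calling
 * process may set under its current RLIMIT_NICE soft limit.
 * Returns 0 on success, -1 if the limit cannot be read.
 */
int lowest_allowed_nice(int *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NICE, &rl) != 0)
		return -1;
	/* Limit values of 40 or more (or unlimited) allow the full range down to -20. */
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= 40)
		*out = -20;
	else
		*out = 20 - (int)rl.rlim_cur;
	return 0;
}
```

With the common default soft limit of 0 this yields 20, meaning the process cannot raise its own priority at all.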
Ash Milsted
Re: [ck] RSDL v0.30 cpu scheduler for mainline kernels
On Mon, 12 Mar 2007 10:58:11 +1100 Here's my experience with RSDL 0.30 on 2.6.21-rc3-git6 under my normal usage scenarios... Plain desktop use (web browsing, music, etc): no noticeable change Desktop use during kernel compile (no -j): The compile impacts desktop use more with RSDL, but this is easily solved by nice-ing the compile (default nice of 10 seems enough). The kind of impact I am talking about is e.g. occasional delays in scrolling in the browser etc. Desktop use whilst talking ...
Mar 13, 8:35 am 2007
Con Kolivas
Re: [ck] RSDL v0.30 cpu scheduler for mainline kernels
Well the change to -nice values was to minimise the harm they do to everything else. They will still get the lowest latencies and the most cpu but no more than previously. The difference will be to more niced tasks. I'm not sure just how much cpu you require for wengophone because at -5 it would be getting a fair chunk of cpu with the RSDL 0.30. Anyway, by sheer coincidence I just emailed out that patch I was planning so feel free to try it. -- -ck -
Mar 13, 8:46 am 2007
Russell King
Re: /sys/devices/system/cpu/cpuX/online are missing
Welcome to why cleanups are bad news. ;( Yes, ARM also needs to be fixed and I'd ask that in future people doing cleanups in core code take a little more time to review the code before submitting patches *AND* give heads-up to *EVERYONE* who might be affected by the change. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
Mar 13, 2:40 am 2007
Heiko Carstens
Re: /sys/devices/system/cpu/cpuX/online are missing
I was referring to arch/ppc not arch/powerpc. But it seems that arch/ppc doesn't support cpu hotplug anyway. So I guess it's indeed just a missing config option. Grepping a bit further shows that arm suffered by the change that inverted the logic if the 'online' attribute for cpus should appear. Since arm supports cpu hotplug but the patch left arm out, it doesn't work there anymore (cc'ing arm people: changeset 72486f1f8f0a2bc828b9d30cf4690cf2dd6807fc is most probably disabling cpu hotplug ...
Mar 13, 2:03 am 2007
Andreas Schwab
Re: /sys/devices/system/cpu/cpuX/online are missing
See arch/powerpc/kernel/sysfs.c:topology_init. I don't think there is anything to do here. You probably don't have CONFIG_HOTPLUG_CPU enabled. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." -
Mar 12, 5:39 pm 2007
Andreas Schwab
Re: /sys/devices/system/cpu/cpuX/online are missing
I think if there is no hotplug support then the file should not be created in the first place. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." -
Mar 13, 2:44 am 2007
Russell King
Re: /sys/devices/system/cpu/cpuX/online are missing
Right, here's the ARM fix which is now in the ARM tree: # Base git commit: 8b9909ded6922c33c221b105b26917780cfa497d # (Merge branch 'merge' of master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc) # # Author: Russell King (Tue Mar 13 09:54:21 GMT 2007) # Committer: Russell King (Tue Mar 13 09:54:21 GMT 2007) # # [ARM] Fix breakage caused by 72486f1f8f0a2bc828b9d30cf4690cf2dd6807fc # # 72486f1f8f0a2bc828b9d30cf4690cf2dd6807fc inverted the sense for # enabling hotplug ...
Mar 13, 2:56 am 2007
Heiko Carstens
Re: /sys/devices/system/cpu/cpuX/online are missing
Should have cc'ed Suresh Siddha who caused the breakage ;) -
Mar 13, 2:11 am 2007
Eric W. Biederman
Re: [PATCH] Initialise SAK member for each virtual conso ...
Ugh. I missed this one, when I fixed this. Sorry I thought the loop in con_init() covered all of the consoles and we initialized all of them at boot time :( Signed-off-by: Bernhard Walle <bwalle@suse.de> Acked-by: Eric Biederman <ebiederm@xmission.com> --- drivers/char/vt.c | 1 + 1 file changed, 1 insertion(+) Index: linux-2.6.21-rc3/drivers/char/vt.c =================================================================== --- linux-2.6.21-rc3.orig/drivers/char/vt.c +++ ...
Mar 13, 10:16 am 2007
Jeremy Fitzhardinge
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
There's a distinction between giving it more cpu and giving it higher priority: the important part about having high priority is getting low This really seems like the wrong approach to me. The implication here and in other mails is that fairness is an inherently good thing which should obviously take preference over any other property. It's a nice simple stance, and it's relatively easy to code up and test to see that it's working, but it doesn't really give people what they want. The old ...
Mar 13, 10:59 am 2007
Ingo Molnar
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
i have to agree with Mike that this is a material regression that cannot be talked around. Con, we want RSDL to /improve/ interactivity. Having new scheduler interactivity logic that behaves /worse/ in the presence of CPU hogs, which CPU hogs are even reniced to +5, than the current interactivity code, is i think a non-starter. Could you try to fix this, please? Good interactivity in the presence of CPU hogs (be them default nice level or nice +5) is _the_ most important scheduler ...
Mar 13, 1:18 am 2007
Con Kolivas
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
And again, with X in its current implementation it is NOT like two nice 0 tasks at all; it is like one nice 0 task. This is being fixed in the X design -- -ck -
Mar 12, 11:16 pm 2007
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Hey, you specifically asked me to not choose 5 :) (I mentioned 5 earlier in the thread anyway, so no sense in repeating myself) -Mike -
Mar 12, 11:17 pm 2007
Jeremy Fitzhardinge
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Hm, well. The general preference has been for the kernel to do a good-enough job on getting the common cases right without tuning, and then only add knobs for the really tricky cases it can't do well. But the impression I'm getting here is that you often get sucky behaviours Well, it doesn't have to. It could give good low latency with short timeslices to things which appear to be interactive. If the interactive program doesn't make good use of its low latency, then it will suck. For ...
Mar 13, 1:10 pm 2007
Xavier Bestel
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
I don't see why. nice uses +10 by default on all linux distro, and even on Solaris and HP/UX. So I suspect that if Mike just used "nice lame" instead of "nice +5 lame", he would have got what he wanted. Xav -
Mar 13, 3:24 am 2007
Bill Huey
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Hello Ingo, After talking to Con over IRC (and if I can summarize it), he's wondering if properly nicing those tasks, as previously mentioned in user emails, would solve this potential user reported regression or is something additional needed. It seems like folks are happy with the results once the nice tweaking is done. This is a huge behavior change after all to the scheduler (just thinking out loud). bill -
Mar 13, 3:50 am 2007
Thibaut VARENE
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
If Con actually implements SCHED_IDLEPRIO in RSDL, life is good even Exactly. Driving us again toward the fact that different workloads might benefit from different schedulers (eg: RSDL is cool for server loads, previous staircase did an excellent job on desktop, etc) and thus that having a choice of schedulers might be something that would It certainly is. "Negative" feedback can be a good thing too, as it helps improving it anyway. It's nonetheless true that it's practically impossible to ...
Mar 12, 5:09 pm 2007
David Schwartz
RE: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I don't know what else you can do when the argument is that behavior that is wrong is what you actually want. The regression is not that the scheduler doesn't do what it was asked to do or even that it isn't more faithful to what it was told to do than the scheduler it replaces. The regression is that the scheduler didn't do what Mike wanted it to do, even though he didn't ask it to do that. I would argue this is progression, not regression. The new scheduler is fairer than the old one and ...
Mar 13, 8:15 am 2007
Mike Galbraith
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
Within reason, yes. Defining "reason" is difficult. As we speak, this is possible to a much greater degree than with RSDL. Before anybody pipes in, yes, I'm very much aware of the down side of the interactivity Virtual or physical cores has nothing to do with the interactivity regression I noticed. Two nice 0 tasks which combined used 50% of my box can no longer share that box with two nice 5 tasks and receive the 50% they need to perform. That's it. From there, we wandered off into ...
Mar 12, 11:08 pm 2007
Con Kolivas
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Yet looking at the mainline scheduler code, nice 5 tasks are also supposed to get 75% cpu compared to nice 0 tasks, however I cannot seem to get 75% cpu with a fully cpu bound task in the presence of an interactive task. To me that means mainline is not living up to my expectations. What you're saying is your expectations are based on a false cpu expectation from nice 5. You can spin it both ways. It seems to me the only one that lives up to a defined expectation is to be fair. Anything ...
Mar 12, 10:53 pm 2007
Bill Huey
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
This is way beyond what SCHED_OTHER should do. It can't predict the universe. Much of the interactivity estimator borders on magic. It just happens to We can do MUCH better in the long run with something like Con's scheduler. His approach shouldn't be dismissed because it's running into a relatively few minor snags, largely the fault of scheduling opaque applications. It's precise enough that it can also be loosened up a bit with additional control terms (previous email). It might be good ...
Mar 13, 1:35 pm 2007
Ingo Molnar
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
i think Mike's testcase was even simpler than that: two plain CPU hogs on nice +5 stole much more CPU time with Con's new interactivity code than they did with the current interactivity code. I'd agree with Mike that a phenomenon like that needs to be fixed. /less/ interactivity we can do easily in the current scheduler: just remove various bits here and there. The RSDL promise is that it gives us /more/ interactivity (with 'interactivity designed in', etc.), which in yeah. It's a ...
Mar 13, 1:09 am 2007
Valdis.Kletnieks
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Fork bombs are the reason that 'ulimit -u' exists. I don't see this scheduler as being significantly more DoS'able via that route than previous schedulers.
Mar 13, 10:21 am 2007
Rodney Gordon II
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
Also, just to chime in, I am doing a large project converting over 250GB of FLAC audio to MP3 via lame for my archive conversion. I am using 2.6.20.2-rsdl0.30, and I have 2 processes of flac decoding/lame encoding running simultaneously from a perl script I hacked up on my P-D 830. These processes are both nice'd to 19. I have almost no degradation in latency in my usage of X (which is at nice 0), if that matters at all. Please try what Con is suggesting by adjusting your nice level, ...
Mar 12, 11:08 pm 2007
Con Kolivas
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
It seems Mike has chosen to go silent so I'll guess on his part. nice on my debian etch seems to choose nice +10 without arguments contrary to a previous discussion that said 4 was the default. However 4 is a good value to use as a base of sorts. What I propose is as a proportion of nice 0: nice 4 = 1/2, nice 8 = 1/4, nice 12 = 1/8, nice 16 = 1/16, nice 20 = 1/32 (of course nice 20 doesn't exist), and we can do the opposite in the other direction: nice -4 = 2, nice -8 = 4, nice -12 = 8, nice -16 = 16, nice ...
Mar 13, 2:31 am 2007
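The proportions Con lists above halve the CPU share for every 4 nice levels. A minimal Python sketch of that mapping follows; the function name `nice_share` and the smooth interpolation between multiples of 4 are assumptions for illustration, not something stated in the email or taken from the RSDL patch.

```python
def nice_share(nice, step=4):
    """CPU proportion relative to nice 0, halving every `step` nice levels.

    Only multiples of 4 appear in the proposal; values in between are
    interpolated here as an assumption.
    """
    return 2.0 ** (-nice / step)


if __name__ == "__main__":
    for level in (-16, -12, -8, -4, 0, 4, 8, 12, 16):
        print("nice %+3d -> %g of nice 0" % (level, nice_share(level)))
```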
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Sure. If a user wants to do anything interactive, they can indeed nice It's not "offensive" to me, it is a behavioral regression. The situation as we speak is that you can run cpu intensive tasks while watching eye-candy. With RSDL, you can't, you feel the non-interactive load instantly. Doesn't the fact that you're asking me to lower my I'm not trying to be pig-headed. I'm of the opinion that fairness is great... until you strictly enforce it wrt interactive tasks. -Mike -
Mar 12, 10:10 pm 2007
Ingo Molnar
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I'd say let's keep nice levels out of this completely for now - while they should work _too_, it's easy because the scheduler has the 'nice' information. It's the basic behavior of CPU hogs that matters most. So the question is: if all tasks are on the same nice level, how does, in Mike's test scenario, RSDL behave relative to the current interactivity code? Ingo -
Mar 13, 2:29 am 2007
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I just retested with the encoders at nice 0, and the x/gforce combo is terrible. Funny thing though, x/gforce isn't as badly affected with a kernel build. Any build is quite noticeable, but even at -j8, the effect doesn't seem to be (very brief test warning applies) as bad as with only the two encoders running. That seems quite odd. -Mike -
Mar 13, 2:33 am 2007
Bill Huey
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
SGI machines had an interactive term in their scheduler as well as a traditional nice priority. It might be useful for Con to possibly consider this as an extension for problematic (badly hacked) processes like X. Nice as a control mechanism is rather coarse, yet overly strict because of the sophistication of his scheduler. Having an additional term (control knob) would be nice for a scheduler that is built upon (correct me if I'm wrong Con): 1) has rudimentary bandwidth control for a group ...
Mar 13, 1:27 pm 2007
Matt Mackall
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Is gforce calling sched_yield? Can you try testing with some simpler loads, like these:

memload:

    #!/usr/bin/python
    a = "a" * 16 * 1024 * 1024
    while 1:
        b = a[1:] + "b"
        a = b[1:] + "c"

execload:

    #!/bin/sh
    exec ./execload

forkload:

    #!/bin/sh
    ./forkload&

pipeload:

    #!/usr/bin/python
    import os
    pi, po = os.pipe()
    if os.fork():
        while 1:
            os.write(po, "A" * 4096)
    else:
        while 1:
            os.read(pi, 4096)

-- Mathematics is the supreme nostalgia of our time. -
Mar 13, 7:17 am 2007
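The pipeload script in the message above is written for Python 2 and runs forever. As a rough sketch only, here is a bounded Python 3 rendering of the same idea (one process writing 4096-byte chunks into a pipe, a forked child draining it). The function name `pipeload`, the `iterations` limit and the cleanup are additions for illustration and are not part of Matt's original test load.

```python
import os

CHUNK = 4096  # same chunk size as the original script


def pipeload(iterations):
    """Write `iterations` chunks through a pipe to a forked reader.

    Returns the number of bytes the parent wrote.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: close the unused write end so EOF is seen, then drain.
        os.close(write_fd)
        while os.read(read_fd, CHUNK):
            pass
        os._exit(0)
    # Parent: close the unused read end and push data at the child.
    os.close(read_fd)
    written = 0
    for _ in range(iterations):
        written += os.write(write_fd, b"A" * CHUNK)
    os.close(write_fd)
    os.waitpid(pid, 0)
    return written


if __name__ == "__main__":
    print(pipeload(1024))
```

To approximate the original endless load, the loop bound can simply be made very large.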
Mike Galbraith
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
Shrug. I don't live then, I live now. I have expressed my concerns, and will now switch from talk back to listen mode. -Mike -
Mar 12, 11:30 pm 2007
Sanjoy Mahajan
Re: [ck] Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for ...
tcsh, and probably csh, has a builtin 'nice' with default +4. So tcsh% nice ps -l will show a process with nice +4. If you tell it not to use the builtin, tcsh% \nice ps -l then it uses /usr/bin/nice and you get +10. bash doesn't have a nice builtin, so it always uses /usr/bin/nice and you get +10 by default. -Sanjoy `Not all those who wander are lost.' (J.R.R. Tolkien) -
Mar 13, 4:19 pm 2007
David Lang
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
this can solve the specific problem (and since 'nice' is the natural way to tell the kernel this, it's not even a one-shot solution). however Linus is right, the real underlying problem is where the user is waiting on a server. if this issue could be solved then a lot of things would benefit. Con, as a quick hack (probably a bad idea as I'm not a scheduling expert), if a program blocks on another program (via a pipe or socket) could you easily give the rest of the first program's ...
Mar 12, 11:00 pm 2007
Kyle Moffett
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Maybe extend UNIX sockets to add another passable object type vis-a-vis SCM_RIGHTS, except in this case "SCM_CPUTIME". You call SCM_CPUTIME with a time value in monotonic real-time nanoseconds (duration) and a value out of 100 indicating what percentage of your timeslices to give to the process (for the specified duration). The receiving process would be informed of the estimated total number of nanoseconds of timeslice that it will be given based on the priority of the ...
Mar 12, 9:17 pm 2007
Con Kolivas
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
If everything is run at nice 0? (this was not the test case but that's what you've asked for). We have: X + GForce contribute a load of 1; lame x 2 threads contribute a load of 2. In RSDL, X + GForce will get 33% cpu and lame will get 66% cpu. In mainline, X + GForce gets a fluctuating percentage somewhere between 35~45% as far as I can see on UP, and lame gets the rest. The only way to get the same behaviour on RSDL without hacking an interactivity estimator, priority boost cpu ...
Mar 13, 2:41 am 2007
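The RSDL figures in the message above follow from dividing the CPU in proportion to load when every task is at nice 0. A small Python sketch of that arithmetic, with `cpu_shares` as a made-up helper name (it models only the strict-fairness split, not the scheduler itself):

```python
def cpu_shares(loads):
    """Split the CPU strictly in proportion to each group's load."""
    total = sum(loads.values())
    return {name: load / total for name, load in loads.items()}


# Loads as given in the message: X + GForce = 1, two lame threads = 2.
shares = cpu_shares({"X + GForce": 1, "lame": 2})

if __name__ == "__main__":
    for name, share in shares.items():
        print("%-10s %.0f%%" % (name, share * 100))
```

With these inputs X + GForce ends up with one third of the CPU and lame with two thirds, which is the 33% / 66% split quoted for RSDL.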
Lee Revell
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I'm not an expert in this area by any means but after reading this thread the OSX solution of simply telling the kernel "I'm the GUI, schedule me accordingly" looks increasingly attractive. Why make the kernel guess when we can just be explicit? Does anyone know of a UNIX-like system that has managed to solve this problem without hooking the GUI into the scheduler? Lee -
Mar 12, 7:23 pm 2007
David Schwartz
RE: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
I agree. Tasks that voluntarily relinquish their timeslices should get lower Yes, that is the implication. The alternative to fairness is arbitrary I don't think it makes sense for the scheduler to look for some hint that the user would prefer a task to get more CPU and try to give it more. That's Then you will always get cases where the scheduler does not do what the user wants because the scheduler does not *know* what the user wants. You always have to tell a computer what you want it ...
Mar 13, 12:58 pm 2007
Mike Galbraith
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
(One more comment before I go. You can then have the last word this time, promise :) Because the interactivity logic, which was put there to do precisely Talk about spin, you turn an example of the current scheduler working properly into a negative attribute, and attempt to discredit me with it. The floor is yours. No reply will be forthcoming. -Mike -
Mar 13, 12:53 am 2007
Con Kolivas
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
Well I guess you must have missed where I asked him if he would be happy if I changed +5 metrics to do whatever he wanted and he refused to answer me. That would easily fit within that scheme. Any percentage of nice value he chose. I suggest 50% of nice 0. Heck I can even increase it if he likes. All I asked I have been civil. Only one email crossed the line on my part and I apologise. -- -ck -
Mar 13, 2:21 am 2007
Ingo Molnar
Re: [PATCH][RSDL-mm 0/7] RSDL cpu scheduler for 2.6.21-rc3-mm2
i'm sorry, but your argument seems to be negated. We of course have no problem with interactive tasks stealing CPU time from CPU hogs. The situation Mike found is _the other direction_: that /CPU hogs/ stole from interactive tasks. That's bad and needs to be fixed. Please? Ingo -
Mar 13, 1:22 am 2007
Christoph Lameter
Re: [QUICKLIST 0/6] Arch independent quicklists V1
It used to be the case that initializing objects was better in the past. Today it is better to initialize the objects immediately before they are used. That will move them into the cpu caches and keep them there. Initializing them earlier may cause the cachelines of the object to be evicted from the cpu cache and then those have to be refetched. The benefit of this approach diminishes the larger objects get and the sparser the access to the cachelines of the object. In the case of page ...
Mar 12, 6:38 pm 2007
David Miller
Re: [QUICKLIST 0/6] Arch independent quicklists V1
From: David Miller <davem@davemloft.net> I want to quantify this with the fact that all the cache false sharing issues are irrelevant in this test because the L2 cache is shared between all of the cpu threads on Niagara. It was fast just because the quicklists were lighter weight than the SLAB stuff. -
Mar 12, 7:32 pm 2007
Paul Mackerras
Re: [QUICKLIST 0/6] Arch independent quicklists V1
Did you see any performance improvement? We used to have quicklists on ppc, but I remain to be convinced that they actually help. Also, I didn't understand why we have to do quicklists to take advantage of the fact that the pages are in a pristine state when they are freed. I thought the whole point of the slab allocator was to be able to take advantage of that... Paul. -
Mar 12, 5:37 pm 2007
David Miller
Re: [QUICKLIST 0/6] Arch independent quicklists V1
From: Paul Mackerras <paulus@samba.org> It shaved about 3 or 4 seconds consistently off of my kernel build on Niagara, which usually clocks in just over 4 minutes. He just wants to side-step the issue in SLUB, which arguably is an attempt to simplify SLUB at the expense of functionality. I don't agree with that, but I'm merely preemptively testing his patches and porting them to sparc64 so it does not break when/if his code is merged in. After being bitten by stuff like this in the ...
Mar 12, 7:26 pm 2007
Nish Aravamudan
Re: Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)
buried indeed: "Special Parameters: ... $ Expands to the process ID of the shell. In a () subshell, it expands to the process ID of the current shell, not the subshell. " Thanks, Nish -
Mar 12, 8:01 pm 2007
Willy Tarreau
Re: Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)
Yes, there is a risk of wrapping, but it is very small. You can add the command line arguments to the file name if you want, like this:

    #!/bin/sh
    exec strace -f -o "output.$$.${*//\//_}" /bin/real.tar $@

It will name the output file "output.<pid>.<args>", replacing slashes with underscores. This is very dirty but can help. Cheers, Willy -
Mar 12, 9:45 pm 2007
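To make the naming scheme in the message above concrete, here is a small Python sketch of what the `output.$$.${*//\//_}` expansion produces: the PID, then the arguments joined by spaces with every slash replaced by an underscore. The helper name `strace_output_name` and the sample arguments are hypothetical, and the sketch only mimics the string construction; it does not run strace.

```python
import os


def strace_output_name(args, pid=None):
    """Build "output.<pid>.<args>" with slashes replaced by underscores."""
    if pid is None:
        pid = os.getpid()  # plays the role of $$ in the shell wrapper
    joined = " ".join(arg.replace("/", "_") for arg in args)
    return "output.%d.%s" % (pid, joined)


if __name__ == "__main__":
    print(strace_output_name(["-cf", "/tmp/backup.tar", "/var"]))
```

Because the PID is part of the name, two invocations with identical arguments still get distinct files unless the PID has wrapped around, which is the small risk mentioned above.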
Gene Heskett
Re: Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)
In my case, Doug, it will get invoked 64 times, amanda does a dummy run to get an estimate, calculates what to do based on that output which is 32 runs, 1 per disklist entry and I have 32, and then reruns tar with the appropriate level options against each individual disklist entry. But I'm puzzled a bit, what does the double $$ do?, or is it buried someplace in the bash manpage? It's not something I've stumbled over yet. -- Cheers, Gene "There are four boxes to be used in defense of ...
Mar 12, 7:39 pm 2007
Gene Heskett
Re: Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)
-- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Whatever doesn't succeed in two months and a half in California will never succeed. -- Rev. Henry Durant, founder of the University of California -
Mar 12, 10:48 pm 2007
Douglas McNaught
Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)
You beat me to it. :) I've done that before; it's a great suggestion. Except that if you expect 'tar' to be invoked multiple times in a run, you should probably use 'output.$$' for the output filename so things don't get clobbered. -Doug -
Mar 12, 6:32 pm 2007
Nick Piggin
Re: RSDL-mm 0.28
-- SUSE Labs, Novell Inc. -
Mar 13, 12:22 am 2007
Gene Heskett
Re: Stracing Amanda (was: RSDL for 2.6.21-rc3- 0.29)
Well, that's clear enough, but what of the double $$ case? Would this then make a PID unique to each invocation until it finally wraps a 16 bit value, or will the kernel re-use them because they won't all be running simultaneously, but limited by the number of unique 'spindle' numbers on the system, this to prevent as best as it can, the thrashing of a drive by having tar working on 2 separate (or more) partitions at the same time. In my case 2 are possible, as /var is on a separate ...
Mar 12, 9:04 pm 2007
Ralf Baechle
Re: [SOUND] hda_intel: build fix
Well, that works. Andrew, I'm going to post an updated patch in separate email. Ralf -
Mar 13, 5:42 am 2007
David Miller
Re: [PATCH] Delete superfluous source file "net/wanroute ...
From: "Robert P. J. Day" <rpjday@mindspring.com> Applied, thanks Robert. This thing isn't even built in 2.4.x :-) Although there is some ancient reference to the build module in Documentation/networking/wan-router.txt, a heavily out of date document. -
Mar 12, 5:07 pm 2007
Patrick McHardy
Re: [stable] [patch 12/20] nfnetlink_log: fix reference ...
Sorry, I must have messed up something. I've fixed up the original patch, this one should apply on top of the stable queue with the broken patch removed.
Mar 13, 8:45 am 2007
Trent Piepho
Re: [RFC PATCH 1/3] Add ability to keep track of callers ...
Ok, I did that before, I'll change it back. Note that the reference counting isn't perfect when it comes to catching mistakes. The fundamental problem is that when a module is loaded and linked, all the modules that it used symbols from gain a "use". To be symmetric, when a module is unloaded all the modules it used symbols from should lose a "use". Except, there is no record of what modules gained a "use" at link time. Suppose module 1 uses a symbol from module 2. At link time, ...
Mar 12, 11:33 pm 2007
Jan Kara
Re: do_generic_mapping_read performance issue
I don't think you'll get more feedback now. Just write the patch, benchmark it with your workload and also some others (basically, there should be no difference) and then submit the patch together with the performance numbers and some rationale (if it's your first submission, see Documentation/SubmittingPatches). Give also CC to Andrew <akpm@linux-foundation.org> and ask him to put it into -mm kernels for testing. Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs -
Mar 13, 11:55 am 2007
Ashif Harji
Re: do_generic_mapping_read performance issue
I would like to find out if the change I am suggesting will cause any other problems. Is there someone else I should contact before submitting a patch? Or should I just wait another day or two for comments? ashif. -
Mar 13, 11:43 am 2007
Florian Lohoff
Boot fails with 2.6.20-rc3 / git-current Was: 2.6.20-rc3 ...
I am getting the exact same line on bootup on my Fujitsu Siemens Lifebook E8110. After that line the boot halts with current git: Last lines on current git were: IPMI System Interface driver Clocksource tsc unstable ( delta = -160251929ns ) Now running: Linux laptop 2.6.21-rc2 #1 SMP PREEMPT Thu Mar 1 07:41:52 CET 2007 i686 GNU/Linux which comes up with the same line when initializing IDE but continues to run: [ 9.912917] ICH7: IDE controller at PCI slot 0000:00:1f.1 [ ...
Mar 13, 8:04 am 2007
Jarek Poplawski
Re: Removal of multipath cached (was Re: [PATCH] [REVISE ...
Plus official way: Documentation/feature-remove-schedule.txt in the next rc-git. Jarek P. -
Mar 13, 12:05 am 2007
Andrew Morton
Re: Removal of multipath cached (was Re: [PATCH] [REVISE ...
Good stuff. I suggest you put a big printk explaining the above into 2.6.21. -
Mar 12, 11:22 pm 2007
Con Kolivas
Re: 2.6.21-rc3-mm1 RSDL results
Nothing ever looks like it stays running for very long. That would be enough to account for this sort of top picture. What HZ are you running? Do you usually run two makes at different nice levels? Thanks. -- -ck -
Mar 13, 1:26 pm 2007
Mark Lord
Re: 2.6.21-rc3-mm1 RSDL results
Retesting today with 2.6.21-rc3-git7 + 2.6.21-rc3-sched-rsdl-0.30.patch. Still not pleasant to use the GUI with a kernel build (-j1 or -j2) happening unless the build is manually "nice'd". Also, accounting looks weird in top(1). Mmm.. I wonder where all of that 100% CPU went to.. the busiest tasks are only showing up as 4.0% and 1.7% (when in fact they are using near 100%). Cheers -
Mar 13, 11:21 am 2007
Mark Lord
Re: 2.6.21-rc3-mm1 RSDL results
Sorry, I just don't buy that one. This was a 2-second sampling interval in top. top(1) is a program that has to work, so if this scheduler breaks it like this, This was HZ=1000, with NO_HZ. And, no, not normally different nice levels. Here I was just trying to keep the machine usable while building a couple of things. Keep at it. Someday this might be good enough for mainline, but right now the stock scheduler beats it for my desktop (notebook) loads. Cheers -
Mar 13, 3:06 pm 2007
Samuel Ortiz
Re: irda rmmod lockdep trace.
I considered that solution as well, and thought that it would then prevent the hashbin locks from being ever validated by lockdep. OTOH, the hashbin code is not likely to change anytime soon and is currently validated. Also, you will eventually push this code upstream, so I'd rather go for that fix ;-) Thanks for the comment. Cheers, Samuel. -
Mar 13, 9:18 am 2007
Serge E. Hallyn
Re: [RFC] [Patch 1/1] IBAC Patch
(Could you resend the IBAC patch to the disec list as well?) Is IBAC basically a 'demo' lsm? It only hooks bprm_security, so you can't execute anything with a bad hash, right? So really if you were to also hook mmap, this would completely do away with the need for digsig? IBAC is automatically able to use any integrity measurement modules you write, whether they use gpg like digsig does right now, or use the tpm? It also shows how simple it would be to hook selinux, though I guess there ...
Mar 13, 8:31 am 2007
Hugh Dickins
Re: Question about memory mapping mechanism
Hi Martin, sorry for joining so late. Most of your anxieties are unfounded: the pages your driver allocates with __get_free_pages are not put on any LRU (until they're freed), kernel pages are never swapped out, and making them visible to user-space does not put them in any more danger of being swapped out; but yes, if the refcounting goes wrong, that will cause premature freeing, "Bad page" messages, trouble. (Of course, filesystem pagecache pages are liable to be swapped out, and ...
Mar 13, 1:49 pm 2007
Eric W. Biederman
Re: Possible "struct pid" leak from tty_io.c
This patch can't be right. Not the way proc_clear_tty is called once for each process in the session, plus we aren't clearing tty->session and tty->pgrp here. If the above patch works it's a fluke. Still it is the right general area of the code. I've just started looking at this; it is going to take me a bit to come up to speed on this code again and see what silly thing is missing. Eric -
Mar 13, 12:31 pm 2007
Hugh Dickins
Re: 2.6.21-rc suspend regression: sysfs deadlock
Yep, it works fine with your patch in and my silly reverted, thanks. But (I was about to say, even before seeing Cornelia's reply, honest!) I think you do need to check (audit the source? or is some runtime check possible?) for other such "suicidal" sysfs files, which seemed to (sysfs-ignorant) me to pose the real problem. Hugh -
Mar 13, 12:00 pm 2007
Dmitry Torokhov
Re: 2.6.21-rc suspend regression: sysfs deadlock
I think we could rely on subsystems maintainers to let us know if there are potential problems. For example I can tell that neither input, serio nor gameport subsystems use sysfs to destroy their devices (action on sysfs may cause some other device to be destroyed but that should be ok, only self-destruction is not allowed, right?) -- Dmitry -
Mar 13, 2:08 pm 2007
Hugh Dickins
Re: 2.6.21-rc suspend regression: sysfs deadlock
Indeed, and faint-hearted Hugh wasn't intending to do so: but stout-hearted Alan will need to, won't he, before his patch can go in? -
Mar 13, 1:55 pm 2007
Alan Stern
Re: 2.6.21-rc suspend regression: sysfs deadlock
A runtime check wouldn't detect anything until someone tried to use the file -- at which point the process would deadlock anyway. On the other hand, a quick survey of the kernel source shows that DEVICE_ATTR is used over 1500 times. Auditing all of them is not a job for the faint-of-heart! Alan Stern -
Mar 13, 1:09 pm 2007
Alan Stern
Re: 2.6.21-rc suspend regression: sysfs deadlock
Hugh, there has been a long discussion among several people concerning this issue. See for example this thread: http://marc.info/?t=117335935200001&r=1&w=2 and also: http://marc.info/?l=linux-kernel&m=117355959020831&w=2 The consensus is that we would be better off keeping Oliver's original patch without your silly change, and instead fixing the particular method call that deadlocked. Can you please try out the patch below with everything else as it was before? It should solve ...
Mar 13, 8:00 am 2007
Linus Torvalds
Re: 2.6.21-rc suspend regression: sysfs deadlock
Could we please make this easier to use by having some common sysfs helper routine for this kind of "delayed_store()" functionality. I'm not a huge fan of delayed work at all, but if we have to have it, at least make it one generic function rather than having multiple functions all doing their own workqueue logic for it. Linus -
Mar 13, 2:20 pm 2007
Alan Stern
Re: 2.6.21-rc suspend regression: sysfs deadlock
Allow me to point out that the original patch is Oliver's (although I helped), and it doesn't need to go in -- it needs not to be removed. Furthermore, I have better things to do with the next month of my time than auditing hundreds of routines I don't understand for behavior I probably won't be able to recognize. (Although at 50 a day... hmmm, maybe.) This sounds more like a job for kernel-janitors! Very good points. USB doesn't do anything like that either. And right, it's ...
Mar 13, 2:20 pm 2007
Cornelia Huck
Re: 2.6.21-rc suspend regression: sysfs deadlock
On Tue, 13 Mar 2007 11:00:21 -0400 (EDT), Another call that deadlocked with Oliver's patch is ungroup for s390 ccwgroup devices. It can be made to work again with a similar patch.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
---
 drivers/s390/cio/ccwgroup.c | 35 +++++++++++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 4 deletions(-)

--- linux-2.6.orig/drivers/s390/cio/ccwgroup.c
+++ linux-2.6/drivers/s390/cio/ccwgroup.c
@@ -67,22 +67,49 @@ ...
Mar 13, 11:42 am 2007
Paul Mackerras
Re: _proxy_pda still makes linking modules fail
There is a fundamental problem with using __thread, which is that gcc assumes that the addresses of __thread variables are constant within one thread, and that therefore it can cache the result of address calculations. However, with preempt, threads in the kernel can't rely on staying on one cpu, and therefore the addresses of per-cpu variables can change. There appears to be no way to tell gcc to drop all cached __thread variable address calculations at a given point (e.g. when enabling or ...
Mar 13, 2:04 am 2007
Paul Mackerras
Re: _proxy_pda still makes linking modules fail
Yes. It can, and sometimes does. There's no way (that I know of) to tell gcc "all my __thread variables might have moved to a different There it's easier to make gcc do what we want, because we can use a barrier or a volatile. The difference is that smp_processor_id() is ultimately the value of something, not the address of something. We can tell gcc "values might have changed" but have no way to say "addresses might have changed". Paul. -
Mar 13, 4:50 pm 2007
Rusty Russell
Re: _proxy_pda still makes linking modules fail
No, I don't think I will. The PDA concept has gone too far in x86-64 to be undone. In particular, it's been put in GCC 4.1 for CONFIG_CC_STACKPROTECTOR, which assumes %gs:40 will give the stack canary. For the record: the PDA should never have existed, that's what percpu vars were supposed to be for. Something went wrong here 8( %gs is best set to the offset of the local cpu's area from the "master" per-cpu area, not set to the local cpu area's address. In the former case, booting with ...
Mar 12, 11:23 pm 2007
Jeremy Fitzhardinge
Re: _proxy_pda still makes linking modules fail
Doesn't that fall under the general class of "you have to be pinned to a particular cpu in order to meaningfully use per-cpu variables"? Or do you mean that if you have:

    preempt_disable();
    use_my_percpu++;
    preempt_enable();
    // switch cpus
    preempt_disable();
    use_my_percpu++;
    preempt_enable();

then it will still use the old pointer to use_my_percpu? In principle gcc could CSE the value of smp_processor_id() across a cpu change in the same way. J -
Mar 13, 8:31 am 2007
Andi Kleen
Re: _proxy_pda still makes linking modules fail
Then swapgs wouldn't work anymore (there is no swapfs) -
Mar 13, 8:57 am 2007
Jiri Slaby
Re: [PATCH 1/1] Input: add sensable phantom driver
Cc: Anssi Hannula <anssi.hannula@gmail.com> Hmm, after going through the input and ff layer, I figured out that it's possible to pass only a direction to a constant effect. I think there is no possibility to pass a 3D force (or vector) to the ff layer, is there? thanks, -- http://www.fi.muni.cz/~xslaby/ Jiri Slaby faculty of informatics, masaryk university, brno, cz e-mail: jirislaby gmail com, gpg pubkey fingerprint: B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E -
Mar 13, 9:19 am 2007
Timothy Shimmin
Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr
Hi, It looks like you might need: for (i--; i >= 0; i--) (or: for (j = 0; j < i; j++) etc.) Because if the initial alloc_page loop goes to completion then: i == pagecount and if alloc_page loop terminates early then bp->b_pages[i] == NULL So we have gone 1 too far in both cases and need to start free'ing back one. Unless I missed something. --Tim -
Mar 12, 5:08 pm 2007
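Tim's point above is that after the allocation loop stops at index i, slot i never holds a page (either i == pagecount, or the allocation at i failed), so cleanup must cover only indices 0..i-1. A language-neutral sketch of that rollback pattern in Python; `allocate_all`, `alloc_page` and `free_page` are stand-in names for illustration and are not the XFS functions.

```python
def allocate_all(page_count, alloc_page, free_page):
    """Allocate page_count pages, or free what was obtained and return None."""
    pages = [None] * page_count
    for i in range(page_count):
        pages[i] = alloc_page()
        if pages[i] is None:
            # Slot i is empty, so roll back only indices 0..i-1,
            # matching "for (j = 0; j < i; j++)" in the message.
            for j in range(i):
                free_page(pages[j])
            return None
    return pages


if __name__ == "__main__":
    freed = []
    supply = iter(["p0", "p1", None])  # third allocation fails
    print(allocate_all(3, lambda: next(supply), freed.append), freed)
```

Starting the cleanup at index i instead would hand an empty slot to the free routine, which is the off-by-one being flagged.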
Arkadiusz Miskiewicz
Re: [5/6] 2.6.21-rc3: known regressions
It's fixed in git tree. Commit ff24ba74b6d3befbfbafa142582211b5a6095d45 -- Arkadiusz Miśkiewicz PLD/Linux Team arekm / maven.pl http://ftp.pld-linux.org/ -
Mar 13, 2:46 pm 2007
Pavel Machek
Re: [1/6] 2.6.21-rc3: known regressions
Ahha, now I see where the confusion comes from. No, the reader is not a serial device, it is the reader built into the x60. The USB serial device (siemens sx1) has a separate problem. Device is 15:00.2 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 18) root@amd:~# ls -al /dev/mmc brw-r--r-- 1 root root 251, 0 Nov 5 16:57 /dev/mmc ... ...anything else I should try? Card is obviously detected, but I can't access it.. Uhuh. User error, let's close the ...
Mar 13, 11:11 am 2007
Pavel Machek Mar 13, 11:13 am 2007
Eric W. Biederman
Re: Linux v2.6.21-rc3
Oops I missed that. Eric -
Mar 13, 1:04 pm 2007
Mws
Re: [1/6] 2.6.21-rc3: known regressions
hi, i don't know if you ever used linux on embedded devices like set-top-boxes. you have a mostly fixed device infrastructure on those devices. even if you call it a "kind of savage", using udev there instead of fixed major device numbers is crap. best regards marcel -
Mar 13, 12:12 pm 2007
Pierre Ossman
Re: [1/6] 2.6.21-rc3: known regressions
First I heard of this. The error report is a bit thin so Pavel will need to elaborate a bit more. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org -
Mar 13, 6:08 am 2007
Lukas Hejtmanek
Re: [5/6] 2.6.21-rc3: known regressions
seems to be fixed in 2.6.21-rc3 -- Lukáš Hejtmánek -
Mar 13, 6:29 am 2007
Pierre Ossman
Re: [1/6] 2.6.21-rc3: known regressions
Indeed I would. -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org -
Mar 13, 1:31 pm 2007
Thomas Gleixner
Re: [6/6] 2.6.21-rc3: known regressions
That's not a regression. That's an informational message, when the TSC watchdog detects that the TSC is unreliable. tglx -
Mar 13, 1:46 pm 2007
Adrian Bunk
Re: [1/6] 2.6.21-rc3: known regressions
Those whose Linux installation predates the devfs hype and postdates the devfs hype and predates the udev hype and will postdate the udev hype and predates the next hype cu Adri "static /dev" an -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed -
Mar 13, 12:15 pm 2007
Adrian Bunk
[5/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20. If you find your name in the Cc header, you are either submitter of one of the bugs, maintainer of an affected subsystem or driver, a patch of yours caused a breakage or I'm considering you in any other way possibly involved with one or more of these issues. Due to the huge amount of recipients, please trim the Cc when answering. Subject : resume: slab error in verify_redzone_free(): cache `size-512': ...
Mar 13, 5:50 am 2007
Eric W. Biederman
Re: Linux v2.6.21-rc3
Here is a quick summary of the regressions I am looking at. - Currently we appear to have a pid leak in tty_io.c http://lkml.org/lkml/2007/3/8/222 - There is a missing init_WORK in vt.c that causes an oops when we attempt to use SAK. http://lkml.org/lkml/2007/3/11/148 - We have a network ABI regression caused by the latest sysfs changes to net-sysfs.c In particular we now cannot rename network devices if our destination name happens to be the name of a sysfs file that the ...
Mar 13, 12:26 pm 2007
Thomas Gleixner
Re: [6/6] 2.6.21-rc3: known regressions
Linus merged the original patch, which solved the real problem. He just gave me a lesson how to do it right next time. tglx -
Mar 13, 1:05 pm 2007
Oliver Neukum
Re: [1/6] 2.6.21-rc3: known regressions
The device is a USB serial device. USB serial was known to have issues in the version where this happened. As far as I know the bug has not been replicated after these bugs were fixed. Regards Oliver -
Mar 13, 6:36 am 2007
Pavel Machek
Re: [1/6] 2.6.21-rc3: known regressions
That's okay, but if one of those savages got major for you, would you be willing to use it? :-). Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
Mar 13, 1:05 pm 2007
Greg KH
Re: Linux v2.6.21-rc3
I do not think this should be reverted, as the odds that someone will rename their network device to be "irq" or something else that is in the pci device's directory is pretty slim. It also only shows up if CONFIG_SYSFS_DEPRECATED is disabled, not the common option. But I am still working on it, I sent you and Kay a patch that, while it I think these are already in Linus's tree right now, right? thanks, greg k-h -
Mar 13, 12:40 pm 2007
Adrian Bunk
[2/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20. If you find your name in the Cc header, you are either submitter of one of the bugs, maintainer of an affected subsystem or driver, a patch of yours caused a breakage or I'm considering you in any other way possibly involved with one or more of these issues. Due to the huge amount of recipients, please trim the Cc when answering. Subject : ipv6 crash References : http://lkml.org/lkml/2007/3/10/2 Submitter : ...
Mar 13, 5:50 am 2007
Adrian Bunk
[4/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20. If you find your name in the Cc header, you are either submitter of one of the bugs, maintainer of an affected subsystem or driver, a patch of yours caused a breakage or I'm considering you in any other way possibly involved with one or more of these issues. Due to the huge amount of recipients, please trim the Cc when answering. Subject : ThinkPad X60: resume no longer works (PCI related?) References : ...
Mar 13, 5:50 am 2007
Cornelia Huck
Re: [2/6] 2.6.21-rc3: known regressions
On Tue, 13 Mar 2007 13:50:03 +0100, Does this still happen with -rc3? I'd have thought Mark's patch in 0de1517e23c2e28d58a6344b97a120596ea200bb fixed that... -
Mar 13, 6:30 am 2007
Mark Lord
Re: [2/6] 2.6.21-rc3: known regressions
Pavel? Could you retest this now on a ThinkPad X60 ? ??? -
Mar 13, 6:35 am 2007
Adrian Bunk
[3/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20. If you find your name in the Cc header, you are either submitter of one of the bugs, maintainer of an affected subsystem or driver, a patch of yours caused a breakage or I'm considering you in any other way possibly involved with one or more of these issues. Due to the huge number of recipients, please trim the Cc when answering. Subject : AMD Elan: Crash after "Allocating PCI resources" References : ...
Mar 13, 5:50 am 2007
Linus Torvalds
Re: Linux v2.6.21-rc3
Yes. I just wanted some more testing of it, and while I didn't hear much, at least Auke added his ack, and the old state was clearly broken, so they got applied yesterday. Linus -
Mar 13, 12:48 pm 2007
Adrian Bunk
[6/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20. If you find your name in the Cc header, you are either submitter of one of the bugs, maintainer of an affected subsystem or driver, a patch of yours caused a breakage or I'm considering you in any other way possibly involved with one or more of these issues. Due to the huge number of recipients, please trim the Cc when answering. Subject : Dynticks and High resolution Timer hanging the system References : ...
Mar 13, 5:50 am 2007
Adrian Bunk
[1/6] 2.6.21-rc3: known regressions
This email lists some known regressions in Linus' tree compared to 2.6.20. If you find your name in the Cc header, you are either submitter of one of the bugs, maintainer of an affected subsystem or driver, a patch of yours caused a breakage or I'm considering you in any other way possibly involved with one or more of these issues. Due to the huge number of recipients, please trim the Cc when answering. Subject : crashes in KDE References : ...
Mar 13, 5:49 am 2007
Pavel Machek Mar 13, 11:14 am 2007
Andi Kleen
Re: [3/6] 2.6.21-rc3: known regressions
It uses RDTSC when it shouldn't. Already got a fix for that. -Andi -
Mar 13, 8:13 am 2007
Fabio Comolli
Re: [3/6] 2.6.21-rc3: known regressions
This regression is still present in 2.6.21-rc3-g8b9909de (pulled from Linus' tree less than one hour ago). Fabio -
Mar 13, 1:12 pm 2007
Takashi Iwai
Re: [1/6] 2.6.21-rc3: known regressions
At Tue, 13 Mar 2007 13:49:57 +0100, Already fixed. The patch is in ALSA HG tree, but not synced to git... Jaroslav, could you prepare a push request ASAP, please? thanks, Takashi -
Mar 13, 6:40 am 2007
Pierre Ossman
Re: [1/6] 2.6.21-rc3: known regressions
What kind of savages do not use udev these days?! ;) I don't have the time and energy to jump through all the hoops required to get an official number right now. Most users use udev and those that don't can use the "major" parameter for mmc_block. Rgds -- -- Pierre Ossman Linux kernel, MMC maintainer http://www.kernel.org PulseAudio, core developer http://pulseaudio.org rdesktop, core developer http://www.rdesktop.org -
Mar 13, 12:07 pm 2007
Alan Cox
Re: [3/6] 2.6.21-rc3: known regressions
Some cases should be fixed now but probably not all (eg the Nvidia one) -
Mar 13, 7:03 am 2007
David Chinner
Re: [patch 3/8] per backing_dev dirty and writeback page ...
Only if the queue depth is not bound. Queue depths are bound and so the distance we can go over the threshold is limited. This is the fundamental principle on which the throttling is based..... Hence, if the queue is not full, then we will have either written dirty pages to it (i.e wbc->nr_write != write_chunk so we will throttle or continue normally if write_chunk was written) or we have no more dirty pages left. Having no dirty pages left on the bdi and it not being congested means we ...
Mar 13, 3:12 pm 2007
Miklos Szeredi
Re: [patch 3/8] per backing_dev dirty and writeback page ...
What this loop is doing is putting write requests in the request queue, and in so doing transforming page state from dirty to Because the lower filesystem writes back one request, but then gets stuck in balance_dirty_pages before returning. So the write request is never completed. The problem is that balance_dirty_pages is waiting for the condition that the global number of dirty+writeback pages goes below the threshold. But this condition can only be satisfied if This is the fuse ...
Mar 13, 1:21 am 2007
Kirill Korotaev
Re: [Devel] Re: [RFC][PATCH 2/7] RSS controller core
I guess by "paged it in" you essentially mean "mapped the page into address space for the *first* time"? i.e. how many times the same page mapped into 2 address spaces in the same container should be accounted for? We believe ONE. It is better due to: - it allows better estimate how much RAM container uses. - if one container mapped a single page 10,000 times, it doesn't mean it is worse than a container which mapped only 200 pages and that it should be killed in case of ...
Mar 13, 3:19 am 2007
Srivatsa Vaddagiri
Re: [RFC][PATCH 2/7] RSS controller core
I have a problem doing a group-reply in mutt to Herbert's mails. His email id gets dropped from the To or Cc list. Is that his email setting? Don't know. -- Regards, vatsa -
Mar 12, 7:24 pm 2007
Kirill Korotaev
Re: [RFC][PATCH 2/7] RSS controller core
great. I agree with that. Just curious why current vserver code kills arbitrary depends on whether you will include beancounter 0 usages or not :) Kirill -
Mar 13, 8:10 am 2007
Dave Hansen
Re: [RFC][PATCH 2/7] RSS controller core
My first worry was that this approach is unfair to the poor bastard that happened to get started up first. If we have a bunch of containerized web servers, the poor guy who starts Apache first will pay the price for keeping it in memory for everybody else. That said, I think this is naturally worked around. The guy charged unfairly will get reclaim started on himself sooner. This will tend to page out those pages that he was being unfairly charged for. Hopefully, they will eventually get ...
Mar 13, 10:26 am 2007
Kirill Korotaev
Re: [Devel] Re: [RFC][PATCH 1/7] Resource counters
fully agree with it. We need to get a working version first. FYI, in OVZ we recently added such optimizations: reserves like in TCP/IP, e.g. for kmemsize, numfile these reserves are done on task-basis for fast charges/uncharges w/o involving lock operations. On task exit reserves are returned back to the beancounter. As it demonstrated atomic counters can be replaced with task-reserves on the next step. Thanks, Kirill -
Mar 13, 2:49 am 2007
Herbert Poetzl
Re: [RFC][PATCH 2/7] RSS controller core
my mail client is not involved in receiving the emails, so the email I replied to did already miss you in the cc (i.e. I doubt that mutt would hide you from the cc, if it would be present in the mailbox :) maybe one of the mailing lists is removing recipients according to some strange scheme? here are the full headers for the email I replied to: -8<------------------------------------------------------------------------
Mar 13, 9:06 am 2007
Srivatsa Vaddagiri
Re: [RFC][PATCH 1/7] Resource counters
If I am not mistaken, you shouldn't loop in normal cases, which means it boils down to an atomic_read() + atomic_cmpxchg() -- Regards, vatsa -
Mar 13, 9:07 am 2007
Pavel Emelianov
Re: [RFC][PATCH 2/7] RSS controller core
Does the page stay mapped for the container or not? If yes then what's the use of limits? Container mapped pages more than the limit is but all the pages are -
Mar 13, 12:17 am 2007
Herbert Poetzl
Re: [Devel] Re: [RFC][PATCH 2/7] RSS controller core
sounds good to me, just not sure it provides what we okay, let me ask a few naive questions about this scheme: how does this work for a _file_ which is shared between two guests (e.g. an executable like bash, hardlinked between guests) when both guests are in a different zone-based container? + assumed that the file is read in the first time, will it be accounted to the first guest doing so? + assumed it is accessed in the second guest, will it cause any additional ...
Mar 13, 7:59 am 2007
Dave Hansen
Re: [RFC][PATCH 2/7] RSS controller core
I was just reading through the (comprehensive) thread about this from last week, so forgive me if I missed some of it. The idea is really tempting, precisely because I don't think anyone really wants to have to screw with the reclaim logic. I'm just brain-dumping here, hoping that somebody has already thought through some of this stuff. It's not a bitch-fest, I promise. :) How do we determine what is shared, and goes into the shared zones? Once we've allocated a page, it's too late ...
Mar 13, 10:05 am 2007
Andrew Morton
Re: [Devel] Re: [RFC][PATCH 2/7] RSS controller core
Not really - I mean "first allocated the page". ie: major fault(), read(), I'm not sure that we need to account for pages at all, nor care about rss. If we use a physical zone-based containment scheme: fake-numa, variable-sized zones, etc then it all becomes moot. You set up a container which has 1.5GB of physical memory then toss processes into it. As that process set increases in size it will toss out stray pages which shouldn't be there, then it will start reclaiming and swapping out ...
Mar 13, 4:48 am 2007
Herbert Poetzl
Re: [RFC][PATCH 2/7] RSS controller core
sounds weird, but makes sense if you look at the full picture just because the guest is over its page limit doesn't mean that you actually want the system to swap stuff out, what you really want to happen is the following: - somehow mark those pages as 'gone' for the guest - penalize the guest (and only the guest) for the 'virtual' swap/page operation - penalize the guest again for paging in the page - drop/swap/page out those pages when the host system you tell me? or is that ...
Mar 13, 8:05 am 2007
Andrew Morton
Re: [RFC][PATCH 2/7] RSS controller core
nooooooo. What you're saying there amounts to text replication. There is no proposal here to create duplicated copies of pagecache pages: the VM just doesn't support that (Nick has some protopatches which do this as a possible NUMA optimisation). So these mmapped pages will continue to be shared across all guests. The problem boils down to "which guest(s) get charged for each shared page". A simple and obvious and easy-to-implement answer is "the guest which paged it in". I think we ...
Mar 12, 11:04 pm 2007
Kirill Korotaev
Re: [RFC][PATCH 2/7] RSS controller core
I'm talking not about the finesse of the code, but rather about the lack of isolation, i.e. one VE can affect others. Kirill -
Mar 13, 8:54 am 2007
Herbert Poetzl
Re: [RFC][PATCH 2/7] RSS controller core
because it obviously lacks the finesse of OpenVZ code :) seriously, handling the OOM kills inside a container has never been a real world issue, as once you are really out of memory (and OOM starts killing) you usually have lost the game anyways (i.e. a guest restart or similar is required to get your services up and running again) and OOM killer decisions are not perfect in mainline either, but, you've probably seen the FIXME and TODO entries in the code showing that this so that is an ...
Mar 13, 8:11 am 2007
Eric W. Biederman
Re: [RFC][PATCH 1/7] Resource counters
I think as far as having this discussion if you can remove that race people will be more willing to talk about what vserver does. That said anything that uses locks or atomic operations (finer grained locks) because of the cache line ping pong is going to have scaling issues on large boxes. So in that sense anything short of per cpu variables sucks at scale. That said I would much rather get a simple correct version without the complexity of per cpu counters, before we optimize the ...
Mar 13, 2:09 am 2007
Herbert Poetzl
Re: [RFC][PATCH 1/7] Resource counters
well, shouldn't be a big deal to brush that patch up right, but atomic ops have much less impact on most actually I thought about per cpu counters quite a lot, and we (Linux-VServer) use them for accounting, but please tell me how you use per cpu structures for implementing limits TIA, -
Mar 13, 8:21 am 2007
Nick Piggin
Re: [RFC][PATCH 4/7] RSS accounting hooks over the code
Let's be practical here, what you're asking is basically impossible. Unless by deterministic you mean that it never enters a non-trivial syscall, in which case, you just want to know about maximum It is basically handwaving anyway. The only approach I've seen with a sane (not perfect, but good) way of accounting memory use is this one. If you care to define "proper", then we could discuss that. -- SUSE Labs, Novell Inc. Send instant messages to your online friends ...
Mar 13, 3:25 am 2007
Pavel Emelianov
Re: [RFC][PATCH 1/7] Resource counters
Right. But atomic_add_unless() is slower as it is Did you ever look at how get_empty_filp() works? I agree, that this is not a "strict" limit, but it limits the usage with some "precision". /* off-the-topic */ Herbert, you've lost Balbir again: In this sub-thread some letters up Eric wrote a letter with Balbir in Cc:. The next reply from you doesn't include him. -
Mar 13, 8:41 am 2007
Eric W. Biederman
Re: [RFC][PATCH 4/7] RSS accounting hooks over the code
I will tell you what I want. I want a shared page cache that has nothing to do with RSS limits. I want an RSS limit that once I know I can run a deterministic application with a fixed set of inputs in I want to know it will always run. First touch page ownership does not give me anything useful for knowing if I can run my application or not. Because of page sharing my application might run inside the rss limit only because I got lucky and happened to share a lot of pages with ...
Mar 13, 2:58 am 2007
Balbir Singh
Re: [RFC][PATCH 2/7] RSS controller core
That's good to know, but my mailer shows Andrew Morton <akpm@linux-foundation.org> to Pavel Emelianov <xemul@sw.ru> cc Paul Menage <menage@google.com>, Srivatsa Vaddagiri <vatsa@in.ibm.com>, Balbir Singh <balbir@in.ibm.com> (see I am <<HERE>>), devel@openvz.org, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, containers@lists.osdl.org, Kirill Korotaev <dev@sw.ru> date Mar 7, 2007 3:30 AM subject Re: [RFC][PATCH 2/7] RSS controller ...
Mar 12, 6:57 pm 2007
Alan Cox
Re: [RFC][PATCH 2/7] RSS controller core
"Is it useful for me as a bad guy to make it happen ?" Alan -
Mar 13, 12:09 pm 2007
Dave Hansen
Re: [RFC][PATCH 2/7] RSS controller core
A very fine question. ;) To exploit this, you'd need to: 1. access common data with another user 2. be patient enough to wait 3. determine when one of those users had actually pulled a page in from disk, which sys_mincore() can do, right? I guess that might be a decent reason to not charge the guy who brings the page in for the page's entire lifetime. So, unless we can change page ownership after it has been allocated, anyone accessing shared data can get around resource ...
Mar 13, 1:28 pm 2007
Kirill Korotaev
Re: [Devel] Re: [RFC][PATCH 2/7] RSS controller core
I would be happy to propose OVZ approach then, where a page is tracked with page_beancounter data structure, which ties together a page with beancounters which use it like this: page -> page_beancounter -> list of beancounters which have the page mapped This gives a number of advantages: - the page is accounted to all the VEs which actually use it. - allows almost accurate tracking of page fractions used by VEs depending on how many VEs mapped the page. - allows to track dirty pages, i.e. ...
Mar 13, 8:43 am 2007
Pavel Emelianov
Re: [RFC][PATCH 3/7] Data structures changes for RSS acc ...
That's exactly what we agreed on during our discussions: When a page gets touched it is charged to this container. When the page gets touched again by a new container it is NOT charged to the new container, but keeps holding the old one till it (the page) is completely freed. Nobody worried about the fact that a single page can hold a container for good. OpenVZ beancounters work the other way (and we proposed this solution when we first sent the patches). We keep track of We can migrate page to another ...
Mar 13, 12:10 am 2007
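The first-touch rule described in the entry above can be sketched in a few lines of C. This is only an illustration of the accounting rule as stated in the excerpt, not code from the patch set: the type and function names (container_sketch, page_sketch, page_touch, page_release) are hypothetical, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container: only tracks how many pages are charged to it. */
struct container_sketch {
    long rss_pages;
};

/* Hypothetical page: remembers the container that touched it first. */
struct page_sketch {
    struct container_sketch *owner;
};

/* First touch charges the page; later touches by anyone charge nothing. */
static void page_touch(struct page_sketch *pg, struct container_sketch *c)
{
    assert(pg != NULL && c != NULL);
    if (pg->owner == NULL) {
        pg->owner = c;
        c->rss_pages++;
    }
}

/* The charge is dropped only when the page is completely freed. */
static void page_release(struct page_sketch *pg)
{
    assert(pg != NULL);
    if (pg->owner != NULL) {
        pg->owner->rss_pages--;
        pg->owner = NULL;
    }
}
```

Under this rule a second container that maps the same page is never charged, which is the behaviour the later replies in the thread argue about.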
Eric W. Biederman
Re: [Devel] Re: [RFC][PATCH 2/7] RSS controller core
No the fact that a page mapped into 2 separate mm_structs in two separate accounting domains is counted only once. This is very likely to happen with things like glibc if you have a read-only shared copy of your distro. There appears to be no technical reason for such a restriction. A page should not be owned. Going further unless the limits are draconian I don't expect users to hit the rss limits often or frequently. So in 99% of all cases page reclaim should continue to be global. ...
Mar 13, 2:26 am 2007
Pavel Emelianov
Re: [RFC][PATCH 2/7] RSS controller core
Yeah! And slow down the container which caused global limit hit (w/o hitting its own limit!) by swapping In OpenVZ we account resources in host system as well. -
Mar 13, 8:32 am 2007
Pavel Emelianov
Re: [RFC][PATCH 1/7] Resource counters
BTW atomic_add_unless() is essentially a loop!!! Just like spin_lock() is, so why is one better than another? spin_lock() can go to schedule() on preemptive kernels -
Mar 13, 2:27 am 2007
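For readers following the "it is essentially a loop" argument in the entry above, here is a userspace sketch of a limit-checked charge built on a compare-and-exchange retry loop. It uses C11 <stdatomic.h> rather than the kernel's atomic_t API, and the names res_counter_sketch and try_charge are hypothetical; it is not the code under discussion in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical resource counter: current usage plus a fixed limit. */
struct res_counter_sketch {
    atomic_long usage;
    long limit;
};

/*
 * Try to charge val units. Returns false and leaves usage unchanged
 * if the charge would push usage above the limit.
 */
static bool try_charge(struct res_counter_sketch *rc, long val)
{
    long old;

    assert(val >= 0);
    old = atomic_load(&rc->usage);
    do {
        if (old > rc->limit - val)
            return false;
        /* On failure, old is reloaded with the current usage and we retry. */
    } while (!atomic_compare_exchange_weak(&rc->usage, &old, old + val));
    return true;
}
```

Without contention the loop body runs once, i.e. one atomic read plus one compare-and-exchange; it only repeats when another thread changes usage between the two.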
Eric W. Biederman
Re: [RFC][PATCH 4/7] RSS accounting hooks over the code
Not per process I want this on a group of processes, and yes that is all I want just. I just want accounting of the maximum RSS of No. I don't want the meaning of my rss limit to be affected by what other processes are doing. We have constraints of how many resources the box actually has. But I don't want accounting so sloppy that processes outside my group of processes can artificially I will agree that this patchset is probably in the right general ballpark. But the fact that pages ...
Mar 13, 9:01 am 2007
Eric W. Biederman
Re: [RFC][PATCH 4/7] RSS accounting hooks over the code
Which is a questionable assumption. Worst case we are talking a list several thousand entries long, and generally if you are used by the same container you will hit one of your processes long before you traverse the whole list. So at least the average case performance should be good. It is only in the case when a page is shared between multiple containers when this matters. Eric -
Mar 13, 2:43 am 2007
Herbert Poetzl
Re: [RFC][PATCH 1/7] Resource counters
fine, nobody actually uses atomic_add_unless(), or am I missing something? using two locks will be slower than using a single lock, adding a loop which counts from 0 to 100 will I can happily add him to every email I reply to, but he definitely isn't removed by my mailer (as I already stated, it might be the mailing list which does this), fact is, the email arrives here without him in the cc, so a reply does not contain it either ... best, Herbert -
Mar 13, 9:32 am 2007
Jeremy Fitzhardinge
Re: [PATCH 8/8] Convert PDA into the percpu section
Why testing with qemu is not enough. diff -r 8dcd1dc9b298 arch/i386/kernel/cpu/common.c --- a/arch/i386/kernel/cpu/common.c Tue Mar 13 00:33:37 2007 -0700 +++ b/arch/i386/kernel/cpu/common.c Tue Mar 13 08:33:42 2007 -0700 @@ -627,7 +627,7 @@ __cpuinit void init_gdt(int cpu, struct pack_descriptor((u32 *)&gdt[GDT_ENTRY_PERCPU].a, (u32 *)&gdt[GDT_ENTRY_PERCPU].b, __per_cpu_offset[cpu], 0xFFFFF, - 0x80 | DESCTYPE_S | 0x2, 0); /* present read-write data segment */ + 0x80 | ...
Mar 13, 10:15 am 2007
Jeremy Fitzhardinge
Re: [PATCH 0/8] x86 boot, pda and gdt cleanups
Hi Rusty, This is my rough hacking patch I needed to get things into a Xen-shape state. There are a few things here: * init_gdt should always use write_gdt_entry when touching the gdt; if it doesn't and it ends up touching an already-installed gdt under Xen, it will get a write fault. This happens because init_gdt ends up getting called twice in SMP (see below). * init_gdt should always be called before bringing up the cpu, rather than by the cpu itself ...
Mar 13, 1:48 pm 2007
David Miller
Re: [git patches] net driver fixes
From: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Applied, thanks Geert. -
Mar 12, 5:02 pm 2007
Eric Piel
[PATCH 0/2] wistron_btns: More keymaps
Hello, As a sequel to my patch "Wistron button support for TravelMate 610" of last week, here is a bigger addition of keymaps for the wistron_btns. Patch 1 adds all the database of acerhk which fits this driver (about 25 more laptops). Patch 2 adds a generic map that should fit most users but has the disadvantage of not being automatic. Dmitry, I've tried to make them against your tree. Still, if they don't apply cleanly, just tell me and I'll try harder! See you, Eric -
Mar 13, 4:01 pm 2007
Eric Piel
[PATCH 1/2] wistron_btns: Add acerhk laptop database
This patch adds all the "tm_new" laptops information that is in acerhk to wistron_btns. That's about 25 more laptops. Obviously, I couldn't try them all. I've just tried the Aspire 3020. For this reason, I've also added a printk which asks the users of those laptops to confirm to me that it works (or not). Surprisingly, the dmi information could be found on google for a majority of the laptops, so it might not work so badly. The information about which laptop has which led is also imported, ...
Mar 13, 4:05 pm 2007
Eric Piel
[PATCH 2/2] wistron_btns: Generic keymap
This patch adds a generic map. That is, a keymap that should output the correct keycodes for most laptops. This is simply based on the observation of all those keymaps already gathered, as most of the wistron codes are always mapped to the same keycode. Hopefully, this way users who have a non-supported laptop will have a quick and dirty way to use the multimedia keys. Eric
Mar 13, 4:07 pm 2007
Con Kolivas
RSDL development plans
The rsdl patches queued up so far are stable and boot fine and are reasonably performant on many architectures so I'm quite happy for them to get a run in -mm. The changes planned will (as you may have seen on this email thread) decrease average latencies across all nice levels, and make differential nice levels run better together. This will allow -nice to be used without significant latency harm to not niced tasks (as there is presently in rsdl and mainline). The change required on top ...
Mar 13, 4:08 pm 2007
Con Kolivas
Re: [PATCH] [RSDL-0.30] sched: rsdl improve latencies wi ...
A few other minor things would need to be updated before this patch is in a good enough shape to join the rsdl patches. This one will be good for testing though. -- -ck -
Mar 13, 9:08 am 2007
Con Kolivas
[PATCH] [RSDL-0.30] sched: rsdl improve latencies with d ...
Can you try the attached patch please Al and Mike? It "dithers" the priority bitmap which tends to fluctuate the latency a lot more but in a cyclical fashion. This tends to make the max latency bound to a smaller value and should make it possible to run -nice tasks without killing the latency of the non niced tasks. Eg you could possibly run X nice -10 at a guess like we used to in 2.4 days. It's not essential of course, but is a workaround for Mike's testcase. Thanks. --- Modify the ...
Mar 13, 8:31 am 2007
Con Kolivas
Re: [PATCH] [RSDL-0.30] sched: rsdl improve latencies wi ...
Bah with a bit more sleep under my belt it became clear that I forgot to update the expired array in any proper way so this change almost breaks stuff at the moment in the shape it's in. Please disregard this change for now apart from interest in how I'm tackling the nice issue. -- -ck -
Mar 13, 1:58 pm 2007
Con Kolivas
[PATCH] [RSDL-0.30] sched: rsdl improve latencies with d ...
Oops, one tiny fix. This is a respin of the patch, sorry. --- Modify the priority bitmaps of different nice levels to be dithered minimising the latency likely when different nice levels are used. This allows low cpu using relatively niced tasks to still get low latency in the presence of less niced tasks. Fix the accounting on -nice levels to not be scaled by HZ. Signed-off-by: Con Kolivas <kernel@kolivas.org> --- kernel/sched.c | 73 ...
Mar 13, 9:03 am 2007
Michal Piotrowski
Re: 2.6.21-rc2-git3 oops snd_intel8x0_interrupt+0x104/0x ...
Hi Takashi, Yes, it does. Regards, Michal -- Michal K. K. Piotrowski LTG - Linux Testers Group (PL) (http://www.stardust.webpages.pl/ltg/) LTG - Linux Testers Group (EN) (http://www.stardust.webpages.pl/linux_testers_group_en/) -
Mar 13, 11:24 am 2007
Lukas Hejtmanek
Re: Suspend to disk bug 2.6.21-rc2-git1
2.6.21-rc3-git with platform (previous method was reboot) method works OK. -- Lukáš Hejtmánek -
Mar 13, 5:45 am 2007
Alan Stern
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
Yes. In fact, let's be safe and unplug _both_ the mouse and the keyboard. Okay, no rush. Alan Stern -
Mar 13, 9:30 am 2007
Jiri Slaby
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
There weren't any changes in either HW config or modules, just reverted, compiled, installed, rebooted. Mouse is HID user too, so I So, do you mean rmmod uhci_hcd, unplug the keyboard, modprobe uhci_hcd, start usbmon, plug the keyboard, press numlock, stop usbmon, post it? I'm away from the box till Sat, anyway. regards, -- http://www.fi.muni.cz/~xslaby/ Jiri Slaby faculty of informatics, masaryk university, brno, cz e-mail: jirislaby gmail com, gpg pubkey ...
Mar 13, 9:13 am 2007
Alan Stern
Re: Keyboard stops working after *lock [Was: 2.6.21-rc2-mm1]
I don't see anything in the UHCI snapshots to explain the difference in behavior. One thing that stands out is the other, low-speed device (a mouse?) -- in the bad kernel dump its driver was running and in the good kernel dump its driver wasn't. But that shouldn't have affected the result. In fact, nothing in your data was significant. It could be that the problem occurs earlier, at the time when the keyboard is first plugged in. Can you get another pair of usbmon logs, starting from ...
Mar 13, 9:01 am 2007
Jeremy Fitzhardinge
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
Well, it's a matter of degree. If your busy vcpu is sharing a real cpu with 9 other busy vcpus, then you'll only see about 10% of the real cpu cycles. If we had legitimate loads in which 90% of the cpu were taken up by interrupt processing, then I think we definitely would try to account for that time properly. SMM interrupts, in which the CPU is stolen from the kernel entirely, is a better analogy. But again, they're expected to be rare and insignificant. In a virtual machine, on the ...
Mar 13, 8:26 am 2007
Rik van Riel
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
Interrupts tend to be reasonably short though. Steal time can be several hypervisor/host time slices long. As an aside, normal interrupts *are* accounted for separately in /proc/stat, so why not steal time too? -- Politics is the struggle between those who want to make their country the best in the world, and those who believe it already is. Each group calls the other unpatriotic. -
Mar 13, 9:37 am 2007
Andi Kleen
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
But that is just what an interrupt is. -Andi -
Mar 13, 9:16 am 2007
Andi Kleen
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
It depends -- under heavy network load you can spend a long time just processing interrupts. -Andi -
Mar 13, 1:56 pm 2007
Andi Kleen
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
I think it's better to remove this completely and not allow paravirt to hook into sched_clock. After all a hypervisor stealing time is no different from interrupts stealing time and we don't try to handle that either. I will remove the custom hook. -Andi -
Mar 13, 7:01 am 2007
Chris Wright
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
Exactly. Normal interrupts we can handle. Having CPU completely disappear for unknown time periods we can't, and will need to. thanks, -chris -
Mar 13, 9:07 am 2007
Jeremy Fitzhardinge
Re: [PATCH 2/9] Sched clock paravirt op fix.patch
Well, in that case you probably don't want to charge them to the process which happens to be running at the time. J -
Mar 13, 2:05 pm 2007
David Chinner
Re: [RFC] Heads up on sys_fallocate()
ISTR having had this discussion before ;) About guided preallocation for defrag: http://marc.info/?t=116247859500001&r=1&w=2 e.g.: The sorts of policies we need for effective use of preallocation: http://marc.info/?l=linux-fsdevel&m=116184475308164&w=2 http://marc.info/?l=linux-fsdevel&m=116278169519095&w=2 Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group -
Mar 13, 4:46 pm 2007
Kirill Korotaev
Re: [PATCH 1/2] rcfs core patch
you know better than I that stable branch doesn't differ much, especially in security (because it lacks these controls at all). BTW, killing arbitrary task in case of RSS limit hit It's a pity, but it took me only 5 minutes of looking into the code, Forget about Vserver and OpenVZ. It is not a war. We are looking for something working, new and robust. I'm just trying to show you that non-intrusive and pretty small accounting/limiting code like in Vserver simply doesn't work. The problem of ...
Mar 13, 1:28 am 2007
Srivatsa Vaddagiri
Re: [ckrm-tech] [PATCH 1/2] rcfs core patch
Are you referring to these issues in the general Paul Menage's container code or in the RSS-control code posted by Pavel? -- Regards, vatsa -
Mar 13, 7:11 am 2007
Herbert Poetzl
Re: [PATCH 1/2] rcfs core patch
first, fix your mail client to get the quoting right, it is quite unreadable the way it is (not the first see how readable and easily understandable the code is? it takes me several hours to read OpenVZ code, and that's simply doesn't work? because you didn't try to make it work? as we are discussing RSS limits, there are actually three different (existing) approaches we have talked about: - 'the 'perfect RAM counter' each page is accounted exactly once, when used in a ...
Mar 13, 6:55 am 2007
Herbert Poetzl
Re: [ckrm-tech] [PATCH 1/2] rcfs core patch
the container code has quite a number of locks, including the container(_manage)_lock/unlock, but those should not hurt that much, as restructuring is not on the hot path (as far as I can tell) but I was more referring to the charge/uncharge(_locked) inlines which will be exercised quite often ... and btw, why not use a 'generic' accounting macro/inline to do the accounting for all the different resources like files, rss, sockets (instead of duplicating the code over and over ...
Mar 13, 8:52 am 2007
Paul Menage
Re: [ckrm-tech] [PATCH 0/2] resource control file system ...
That's assuming that you're using network namespace virtualization, with each group of tasks in a separate namespace. What if you don't want the virtualization overhead, just the accounting? Paul -
Mar 12, 7:25 pm 2007
Herbert Poetzl
Re: [ckrm-tech] [PATCH 0/2] resource control file system ...
there should be no 'virtualization' overhead, and what do you want to account for, if not by a group of tasks? maybe I'm missing the grouping condition here, but I assume you assign tasks to the accounting containers note: network isolation is not supposed to add overhead compared to the host system (at least not measurable overhead) best, -
Mar 13, 8:57 am 2007
Srivatsa Vaddagiri
Re: [ckrm-tech] [PATCH 0/2] resource control file system ...
Considering the example Sam quoted, doesn't it make sense to split resource classes (some of them at least) independent of each other? That would also argue for providing multiple hierarchy feature in Paul's patches. Given that and the mail Serge sent on why nsproxy optimization is useful given numbers, can you reconsider your earlier proposals as below: - pid_ns and resource parameters should be in a single struct (point 1c, 7b in [1]) - pointers to resource controlling objects ...
Mar 12, 7:22 pm 2007
Milton Miller
Re: [patch 00/12] Syslets, Threadlets, generic AIO support, v5
I too went and downloaded patches-v5 for review. First off, one problem I noticed in sys_async_wait: + ah->events_left = min_wait_events - (kernel_ring_idx - user_ring_idx); This completely misses the wraparound case of kernel_ring_idx < user_ring_idx. I wonder if this is causing some of the benchmark problems? In addition, the entries in the table are not function pointers, they are the actual code targets. So we need an arch helper to invoke the system call. Here is another ...
Mar 13, 12:05 am 2007
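The wraparound case Milton Miller points out in sys_async_wait can be illustrated with a small userspace sketch. This is not the syslet code: ring_pending(), events_left(), the ring_size parameter, and the clamping of the result at zero are all assumptions made for illustration, and equal indices are taken to mean an empty ring.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of completed events between the user's
 * read index and the kernel's write index in a ring of ring_size slots.
 * Both indices are assumed to lie in [0, ring_size), and equal indices
 * are assumed to mean "nothing pending".
 */
static unsigned long ring_pending(unsigned long kernel_idx,
                                  unsigned long user_idx,
                                  unsigned long ring_size)
{
    assert(ring_size > 0);
    assert(kernel_idx < ring_size && user_idx < ring_size);

    if (kernel_idx >= user_idx)
        return kernel_idx - user_idx;

    /* Wrapped case: the kernel index has gone past the end of the ring. */
    return ring_size - user_idx + kernel_idx;
}

/*
 * Hypothetical helper: how many more events the caller still has to
 * wait for. Clamping at zero is an assumption of this sketch.
 */
static long events_left(long min_wait_events,
                        unsigned long kernel_idx,
                        unsigned long user_idx,
                        unsigned long ring_size)
{
    long pending = (long)ring_pending(kernel_idx, user_idx, ring_size);
    long left = min_wait_events - pending;

    return left > 0 ? left : 0;
}
```

With a plain kernel_idx - user_idx, the wrapped case would yield a negative (or, for unsigned types, very large) difference; the sketch handles it by adding the ring size back in.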
Oliver Neukum
Re: [PATCH] usb-serial regression fix
No, any open port has taken a reference in serial_open(): serial = usb_serial_get_by_index(tty->index); destroy_serial() can be called only when the refcount goes to zero. How can there be open ports? Regards Oliver -
Mar 13, 6:50 am 2007
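Oliver Neukum's argument above (every open port holds a reference, so the destructor can only run once the count reaches zero) follows the usual reference-counting pattern. A minimal userspace sketch of that pattern follows; it is not the usb-serial code, and struct serial_obj, serial_init(), serial_get() and serial_put() are hypothetical names. The kernel itself uses its own kref helpers rather than C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a refcounted device structure. */
struct serial_obj {
    atomic_int refcount;
    int destroyed;      /* set when the last reference is dropped */
};

static void serial_init(struct serial_obj *s)
{
    atomic_init(&s->refcount, 1);   /* the creator holds one reference */
    s->destroyed = 0;
}

/* Taken by every opener, analogous to the lookup done at open time. */
static void serial_get(struct serial_obj *s)
{
    int old = atomic_fetch_add(&s->refcount, 1);

    assert(old > 0);    /* must not revive an object already at zero */
    (void)old;
}

/* Dropped at close or disconnect; the last put runs the destructor. */
static void serial_put(struct serial_obj *s)
{
    int old = atomic_fetch_sub(&s->refcount, 1);

    assert(old > 0);
    if (old == 1)
        s->destroyed = 1;   /* stands in for the real teardown */
}
```

Under this scheme, teardown cannot run while any opener still holds its reference, which is the basis of the question "how can there be open ports?" in the destructor.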
Mark Lord
Re: [PATCH] usb-serial regression fix
(1) open up a ckermit session on /dev/usb_serial_port_0. (2) suspend the machine (to RAM). (3) the suspend logic "removes" all USB devices. Cheers -
Mar 13, 6:39 am 2007
Oliver Neukum
Re: [PATCH] usb-serial regression fix
While we are at it, is there a reason we call return_serial() late in the sequence? Making sure the device is not reopened should be the first measure. Are we protected by hanging up the tty? Secondly, what is this code supposed to do: for (i = 0; i < serial->num_ports; ++i) serial->port[i]->open_count = 0; If we get to destroy_serial(), how can ports still be open? Regards Oliver -
Mar 13, 3:14 am 2007
Greg KH
Re: [PATCH] usb-serial regression fix
But shouldn't you null it out somewhere? It will be an "empty" pointer at some point in time... Mark, does this solve your oops (after you revert your previous patch)? thanks, greg k-h -
Mar 12, 5:18 pm 2007
Mark Lord
Re: [PATCH] usb-serial regression fix
Okay, so.. Jim, could you spell it out for us now? I'm confused. Based on the current 2.6.21-rc3-git*, tell us *exactly* what (if any) needs to be reverted, and *exactly* which (if any) of my suggested patches to apply ? Thanks -
Mar 13, 6:55 am 2007
Jim Radford
Re: [PATCH] usb-serial regression fix
It gets free'd through device_unregister for (i = 0; i < num_ports; ++i) { ... port->dev.release = &port_release; ... retval = device_register(&port->dev); which means that until all the drivers get converted to use ->port_probe() and ->port_remove() (which gets called by device_unregister) and stop using the ->port[] array in ->shutdown() So, this patch should be reverted for now. -Jim -
Mar 13, 2:14 am 2007
Mathieu Bérard
Re: [3/6] 2.6.21-rc2: known regressions
Hi, libata.noacpi=1 worked. The drive is up and running with NCQ on. Here is the PATA/SATA related part of my DSDT table with the _GTF methods: Device (PATA) { Name (_ADR, 0x001F0001) OperationRegion (PACS, PCI_Config, 0x40, 0xC0) Field (PACS, DWordAcc, NoLock, Preserve) { PRIT, 16, Offset (0x04), PSIT, 4, Offset (0x08), SYNC, 4, Offset (0x0A), SDT0, 2, , 2, SDT1, 2, ...
Mar 13, 5:31 am 2007
Jim Radford
Re: [PATCH] usb-serial regression fix
The simple change to not NULL the ->port[] array after unregister was not quite enough, as ->shutdown() would, as hinted at by Mark, then access kfree'd memory. The best thing to do is to just revert d9a7ecacac5f8274d2afce09aadcf37bdb42b93a and work out a better fix for the next release. My patches to fix ftdi_sio no longer require this patch, so it is pointless at this late stage if it breaks *anything*, and it does. This patch reverts d9a7ecacac5f8274d2afce09aadcf37bdb42b93a since ...
Mar 13, 8:30 am 2007
Jim Radford
Re: [PATCH] usb-serial regression fix
Not as far as I can see. The serial structure that ->port[i] is in gets kfree()ed soon after, in the same function, and nothing in between, other than ->shutdown(), uses ->port[]. I assume it was someone being overly cautious. -Jim -
Mar 12, 5:41 pm 2007
Mark Lord
Re: [PATCH] usb-serial regression fix
So where does the memory get freed -- the structure pointed at by the serial->port[i] thingie ? It's not a leak, is it? ??? -
Mar 12, 6:55 pm 2007
Mark Lord
Re: [PATCH] usb-serial regression fix
Patch applied, tested, works for me. Signed-Off: Jim Radford <radford@blackbean.org> Acked-by: Mark Lord <mlord@pobox.com> --- b/drivers/usb/serial/usb-serial.c +++ a/drivers/usb/serial/usb-serial.c @@ -137,6 +135,11 @@ dbg("%s - %s", __FUNCTION__, serial->type->description); + serial->type->shutdown(serial); + + /* return the minor range that this device had */ + return_serial(serial); + for (i = 0; i < serial->num_ports; ++i) serial->port[i]->open_count = 0; @@ ...
Mar 13, 9:35 am 2007
Srivatsa Vaddagiri
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [ ...
Hmm ..that needs to be documented as well then! I can easily count more than dozen places where kernel_thread() is being used. I agree though that kthread is a much cleaner interface to create/destroy threads. -- Regards, vatsa -
Mar 13, 2:58 am 2007
Srivatsa Vaddagiri
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [ ...
This needn't guarantee that the thread will see the stop request before it exits the kthread_should_stop_freeze() function. There will always be races .. So the only safe way for a thread to know whether it is time to exit is: while (!kthread_should_stop_freeze()) { if (!cpu_online(home_cpu)) goto wait_to_die; ... } wait_to_die: while (!kthread_should_stop()) { /* sleep */ } -- Regards, vatsa -
Mar 12, 10:42 pm 2007
Srivatsa Vaddagiri
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [ ...
Document as well in the kernel_thread() API, as I notice people still I noticed that in the Powerpc code (at least for rtas kernel thread) here: http://lkml.org/lkml/2007/1/9/61 That was not a serious problem perhaps because process freezer was mostly used in software suspend and only those platforms supporting software suspend had to worry about it. But now we intend to use process freezer for CPU hotplug as well, so all platforms wanting to support CPU hotplug better support process ...
Mar 12, 8:14 pm 2007
Christoph Hellwig
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [ ...
Well, it takes a lot of time to convert all the existing users. But I try to make sure to flame^H^H^H^H^Hcorrect everyone who tries to sneak a new user of kernel_thread in. -
Mar 13, 3:20 am 2007
Christoph Hellwig Mar 13, 2:16 am 2007
Paul E. McKenney
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [ ...
Another problem is people doing "kthread_should_stop()" and forgetting about freezing. Then we continue ending up with situations where we are intermittently unable to freeze. In the spirit of "Rusty Scale" interface design, how do we make it difficult for people to misuse this interface? Thanx, Paul -
Mar 12, 5:58 pm 2007
Gautham R Shenoy
Re: [PATCH] kthread_should_stop_check_freeze (was: Re: [ ...
I would prefer to have try_to_freeze() followed by the kthread_stop_info.k check. Something like if (try_to_freeze()) /*some barrier ensuring all writes are completed */ if (kthread_stop_info.k == current) return 1; return 0; This would be helpful in situations (at least for cpu-hotplug) where we want to stop a frozen thread immediately after thawing it. Something like CPU_DEAD: thaw_process(p); kthread_stop(p); p = NULL; Is there a problem with this line of thinking ...
Mar 12, 10:27 pm 2007
Benjamin Herrenschmidt
Re: Make sure we populate the initroot filesystem late enough
Have you tried, instead, to apply 38f3323037de22bb0089d08be27be01196e7148b ? (That is revert 39d61db0edb34d60b83c5e0d62d0e906578cc707). I suspect this is the proper fix... -
Mar 13, 12:03 am 2007
Kumar Gala
Re: Make sure we populate the initroot filesystem late enough
Have you tried 2.6.20.2? There was a significant bug in get_order() that was deemed to be causing these issues. - k -
Mar 12, 8:03 pm 2007
Andrea Arcangeli
Re: SMP performance degradation with sysbench
When you said idle I thought idle and not waiting for I/O. Waiting for I/O would be hardly a kernel issue ;). If they're not waiting for I/O and they're not scheduling in userland with nanosleep/pause, the cpu shouldn't go idle. Even if they're calling sched_yield in a loop the cpu should account for zero idle time as far as I can tell. -
Mar 13, 3:31 am 2007
Andrea Arcangeli
Re: SMP performance degradation with sysbench
So it again makes little sense to me that this is idle time, unless some userland mutex has a usleep in the slow path which would be very wrong, in the worst case they should yield() (yield can still waste lots of cpu if two tasks in the slow paths call it while the holder is not scheduled, but at least it wouldn't be idle time). Idle time is suspicious for a kernel issue in the scheduler or some userland inefficiency (the latter sounds more likely). -
Mar 13, 3:57 am 2007
Andrea Arcangeli
Re: SMP performance degradation with sysbench
The initial assumption was that there was zero idle time with threads = cpus and the idle time showed up only when the number of threads increased to double the number of cpus. If the idle time wouldn't It'd be interesting to see the sysrq+t after the idle time My wild guess is that they're allocating memory after taking futexes. If they do, something like this will happen: taskA taskB taskC user lock mmap_sem lock mmap sem -> schedule user lock -> ...
Mar 13, 4:42 am 2007
Nick Piggin
Re: SMP performance degradation with sysbench
Well that doesn't help in this case. I tested and the mmap_sem contention The idea is a good one, and I was half way through implementing similar myself at one point (some java apps hit this badly). It is just horribly sad that futexes are supposed to implement a _scalable_ thread synchronisation mechanism, whilst fundamentally relying on an mm-wide lock to operate. I don't like your interface, but then again, the futex interface isn't exactly pretty anyway. You should resubmit the ...
Mar 13, 4:56 am 2007
Nick Piggin
Re: SMP performance degradation with sysbench
Well it wasn't iowait time. From Anton's analysis, I would probably say it was time waiting for either the glibc malloc mutex or MySQL heap mutex. -- SUSE Labs, Novell Inc. -
Mar 13, 3:37 am 2007
Eric Dumazet
Re: SMP performance degradation with sysbench
glibc malloc uses arenas, and trylock() only. It should not block because if an arena is already locked, thread automatically chooses another arena, and might create a new one if necessary. But yes, mmap_sem contention is a big problem, because it's also taken by futex code (unfortunately) -
Mar 13, 5:02 am 2007
Eric Dumazet
Re: SMP performance degradation with sysbench
I cooked a patch some time ago to speedup threaded apps and got no feedback. http://lkml.org/lkml/2006/8/9/26 Maybe we have to wait for 32 core cpu before thinking of cache line bouncings... -
Mar 13, 4:40 am 2007
Jakub Jelinek
Re: SMP performance degradation with sysbench
Well, only when allocating it uses trylock, free uses normal lock. glibc malloc will by default use the same arena for all threads, only when it sees contention during allocation it gives different threads different arenas. So, e.g. if mysql did all allocations while holding some global heap lock (thus glibc wouldn't see any contention on allocation), but freeing would be done outside of application's critical section, you would see contention on main arena's lock in the free path. Calling ...
Mar 13, 5:27 am 2007
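The allocation-path behaviour discussed by Eric Dumazet and Jakub Jelinek above (try the current arena without blocking, and move to another one only on contention) can be sketched in userspace as follows. This is an illustration only, not glibc's implementation: the fixed NARENAS pool, struct arena, and the arena_* functions are hypothetical, and a C11 atomic_flag stands in for the real mutex. The free path, which Jakub notes takes a normal blocking lock, is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>

#define NARENAS 4   /* hypothetical fixed pool; real allocators grow it */

struct arena {
    atomic_flag busy;   /* set while some thread holds the arena */
};

static struct arena arenas[NARENAS] = {
    { ATOMIC_FLAG_INIT }, { ATOMIC_FLAG_INIT },
    { ATOMIC_FLAG_INIT }, { ATOMIC_FLAG_INIT },
};

/* Non-blocking lock attempt: returns 1 on success, 0 if already held. */
static int arena_trylock(struct arena *a)
{
    return !atomic_flag_test_and_set_explicit(&a->busy,
                                              memory_order_acquire);
}

static void arena_unlock(struct arena *a)
{
    atomic_flag_clear_explicit(&a->busy, memory_order_release);
}

/*
 * Try the preferred arena first and fall back to the others only when
 * it is contended. Returns the index of the locked arena, or -1 when
 * every arena is busy (where a real allocator might create a new one).
 */
static int arena_acquire(int preferred)
{
    assert(preferred >= 0 && preferred < NARENAS);

    for (int i = 0; i < NARENAS; i++) {
        int idx = (preferred + i) % NARENAS;

        if (arena_trylock(&arenas[idx]))
            return idx;
    }
    return -1;
}
```

As long as allocations never collide, every caller keeps landing on the same preferred arena, which matches the point that contention is only detected, and spread out, on the allocation side.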
Andrea Arcangeli
Re: SMP performance degradation with sysbench
btw, regardless of what glibc is doing, still the cpu shouldn't go idle IMHO. Even if we're overscheduling and thrashing over the mmap_sem with threads (no idea if other OS schedules the task away when they find the other cpu in the mmap critical section), or if we're overscheduling with futex locking, the cpu usage should remain 100% system time in the worst case. The only explanation for going idle legitimately could be on HT cpus where HT may hurt more than help but on real multicore it ...
Mar 13, 2:45 am 2007
Nick Piggin
Re: SMP performance degradation with sysbench
Hi Anton, Very cool. Yeah I had come to the conclusion that it wasn't a kernel issue, and basically was afraid to look into userspace ;) That bogus setscheduler thing must surely have never worked, though. I wonder if FreeBSD avoids the scalability issue because it is using SCHED_RR there, or because it has a decent threaded malloc implementation. -- SUSE Labs, Novell Inc. -
Mar 12, 10:11 pm 2007
Nick Piggin
Re: SMP performance degradation with sysbench
Well ignoring the HT issue, I was seeing lots of idle time simply because userspace could not keep up enough load to the scheduler. There simply were fewer runnable tasks than CPU cores. But it wasn't a case of all CPUs going idle, just most of them ;) -- SUSE Labs, Novell Inc. -
Mar 13, 3:06 am 2007
Eric Dumazet
Re: SMP performance degradation with sysbench
Hi Anton, thanks for the report. glibc has certainly many scalability problems. One of the known problems is its (ab)use of mmap() to allocate one (yes : one !) page every time you fopen() a file. And then a munmap() at fclose() time. mmap()/munmap() should be avoided as hell in multithreaded programs. -
Mar 12, 11:00 pm 2007
Nick Piggin
Re: SMP performance degradation with sysbench
Well I think more threads ~= more probability that this guy is going to be preempted while holding the mutex? This might be why FreeBSD works much better, because it looks like MySQL actually will set RT scheduling for those processes that take critical I would agree that it points to MySQL scalability issues, however the fact that such large gains come from tcmalloc is still interesting. -- SUSE Labs, Novell Inc. ...
Mar 13, 5:08 am 2007
Nick Piggin
Re: SMP performance degradation with sysbench
They'll be sleeping in futex_wait in the kernel, I think. One thread will hold the critical mutex, some will be off doing their own thing, but importantly there will be many sleeping for the mutex to become That is what I first suspected, because the dropoff appeared to happen exactly after we saturated the CPU count: it seems like a scheduler artifact. However, I tested with a bigger system and actually the idle time comes before we saturate all CPUs. Also, increasing the ...
Mar 13, 4:12 am 2007
Willy Tarreau
Re: [PATCH] tcp_cubic: use 32 bit math
Hi Stephen, Well, I have cleaned it a little bit, there were more comments and ifdefs than code ! I've appended it to the end of this mail. I have changed it a bit, because I noticed that integer divide precision was so coarse that there were other possibilities to play with the bits. I have experimented with combinations of several methods : - replace integer divides with multiplies/shifts where possible. - compensation for divide imprecisions by adding/removing small values ...
Mar 13, 1:50 pm 2007
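One of the methods Willy Tarreau lists above, replacing integer divides with multiplies and shifts, can be shown with a generic example. This is not taken from his tcp_cubic patch; div10() and div10_selfcheck() are hypothetical names, and the divisor 10 is chosen only because its reciprocal constant is well known.

```c
#include <assert.h>
#include <stdint.h>

/*
 * x / 10 for any 32-bit x without a divide instruction.
 * 0xCCCCCCCD is ceil(2^35 / 10); multiplying in 64 bits and shifting
 * right by 35 yields the exact quotient over the whole 32-bit range.
 */
static uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Spot-check the shortcut against the real divide on edge values. */
static void div10_selfcheck(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 9u, 10u, 11u, 99u, 100u, 65535u, 65536u,
        2147483647u, 2147483648u, 4294967289u, 4294967290u, 4294967295u,
    };

    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(div10(samples[i]) == samples[i] / 10u);
}
```

The multiply-and-shift form trades a division for a wider multiplication, which is typically cheaper on CPUs where integer divide is slow; for other divisors, the constant and shift have to be derived and verified separately.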
Nick Piggin
Re: [patch 4/6] mm: merge populate and nopage into fault ...
Yes, I believe that is the case, however I wonder if that is going to be a problem for you to distinguish between write faults for clean writable Yeah they are also frustratingly similar to filemap_nopage and shmem_nopage, Is there a big clash? I don't think I did a great deal to fremap.c (mainly just removing stuff)... -
Mar 12, 6:19 pm 2007
Lennart Sorensen
Re: [PATCH 1/1] LinuxPPS: Pulse per Second support for Linux
I have tried out 3.0.0-rc2 which seems to work pretty well so far (when combined with the patches to the jsm driver I just posted). It took some work to get ntp's refclock_nmea to work though, since the patch that is linked to from the linuxpps page seems out of date. Here is the patch that seems to be working for me, although I am still testing it. Given you know the linuxpps code better perhaps you can see if it looks sane to you. --- ntpd/refclock_nmea.c.ori 2007-03-13 18:38:01.000000000 ...
Mar 13, 3:48 pm 2007
Rodolfo Giometti
[PATCH 1/1] LinuxPPS: Pulse per Second support for Linux
Hello, here is my new patch for PPS support in Linux. I tried to follow your suggestions as much as possible! Please let me know if this new version could be more acceptable. Thanks in advance, Rodolfo -- GNU/Linux Solutions e-mail: giometti@enneenne.com Linux Device Driver giometti@gnudd.com Embedded Systems giometti@linux.it UNIX programming phone: +39 349 2432127
Mar 13, 2:38 pm 2007
David Howells
Re: Move to unshared VMAs in NOMMU mode?
I'd like to have the drivers and filesystems need to know as little as possible about whether they're working in MMU-mode or NOMMU-mode - for the most part such knowledge should be unnecessary. Additionally, I'd rather not put special case code in the generic parts of the NOMMU code. David -
Mar 13, 3:14 am 2007
Alan Cox
Re: [RFC] hwbkpt: Hardware breakpoints (was Kwatch)
On Tue, 13 Mar 2007 01:00:50 -0700 (PDT) No. We merge the CPUID information to get a shared set of capability bits. Generic PC systems with a mix of PII and PIII are possible. The voyager architecture can have even more peculiar combinations of processor modules installed. -
Mar 13, 6:07 am 2007
Alan Stern
Re: [RFC] hwbkpt: Hardware breakpoints (was Kwatch)
Yes, the code could be reworked by moving some of the data from the CPU hw-breakpoint info into the thread's info. I'll see how much simpler it It isn't quite that easy. Even though the number of user breakpoints may not have changed, their identities may have. So the unlikely case has to encompass two possibilities: the number of installable user breakpoints This shouldn't be necessary. So long as DR_GLOBAL_ENABLE always belongs to the kernel's part of DR7 and DR_LOCAL_ENABLE always ...
Mar 13, 11:56 am 2007
Roland McGrath
Re: [RFC] hwbkpt: Hardware breakpoints (was Kwatch)
I don't know, but it seems unlikely. AFAIK all CPUs are presumed to have Indeed, it is for this sort of thing. Still, it feels like a bit too much is going on in switch_to_thread_hw_breakpoint for the common case. It seems to me it ought to be something as simple as this: if (unlikely((thbi->want_dr7 &~ chbi->kdr7) != thbi->active_tdr7)) { /* Need to make some installed or uninstalled callbacks. */ if (thbi->active_tdr7 & chbi->kdr7) uninstalled callbacks; else installed ...
Mar 13, 1:00 am 2007
Geert Uytterhoeven
const and __initdata (was: Re: [Linux-fbdev-devel] [PATC ...
By modifying scripts/pnmtologo.c to add `const' to the declaration of the linux_logo structs, I could keep the const. However, this introduced a new conflict w.r.t. `const' and `__initdata': - gcc 3.4.6 20060404 (Red Hat 3.4.6-3) for ia32 needs xxx_data[] and xxx_clut[] to be const too, else it complains about a section type conflict - ppu-gcc 4.1.0 20060304 (Red Hat 4.1.0-3) for ppc64 needs xxx_data[] and xxx_clut[] to be non-const, else it complains about a section type ...
Mar 13, 7:00 am 2007