Re: [patch 125/311] Cpuset hardwall flag: switch cpusets to use the bulk cgroup_add_files() API

To: Andrew Morton <akpm@...>, <menage@...>
Cc: <torvalds@...>, <lizf@...>, Hidetoshi Seto <seto.hidetoshi@...>, Hiroyuki KAMEZAWA <kamezawa.hiroyu@...>, Dimitri Sivanich <sivanich@...>, <linux-kernel@...>
Date: Tuesday, May 6, 2008 - 8:20 pm

Dimitri Sivanich, a colleague of mine, just reported to me an easily
reproduced BUG in Linus's current git tree, anytime one reads or writes
the new per-cpuset file "sched_relax_domain_level".  The guilty task
gets a SEGV and the kernel prints (if the command was called 'cat'
and its pid was 16766 ;):

    kernel BUG at kernel/cpuset.c:1448!
    cat[16766]: bugcheck! 0 [3]

The BUG comes from cpuset code that wasn't expecting that read or write
request at that point in the code.

The basic problem is that Seto-san's "sched_relax_domain_level" patch
and Paul M's conversion to the new style *_u64 cpuset file handlers
went in at the same time, with the result that the handlers for the
per-cpuset file "sched_relax_domain_level" were only partially
converted to the new style.
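
As an illustration only, here is a small user-space model of that kind
of mismatch. It is not the kernel code. FILE_SCHED_RELAX_DOMAIN_LEVEL
is the name used in this message; every other name (the struct, the
handlers, the outcome enum, the other file types as used here) is a
hypothetical stand-in. The sketch assumes the file's table entry still
points at the old style common read routine while the case for it
exists only in the new style handler, so the old switch falls through
to its default branch, which is where the kernel would BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of the per-cpuset file types. */
enum file_type {
    FILE_CPULIST,                   /* served by the old style handler */
    FILE_CPU_EXCLUSIVE,             /* served by the new style handler */
    FILE_SCHED_RELAX_DOMAIN_LEVEL   /* the file discussed in this thread */
};

/* WOULD_BUG marks the spot where the kernel would call BUG(). */
enum outcome { HANDLED, WOULD_BUG };

typedef enum outcome (*read_fn)(enum file_type);

/* Old style common handler: has no case for the new file. */
static enum outcome common_file_read(enum file_type type)
{
    switch (type) {
    case FILE_CPULIST:
        return HANDLED;
    default:
        return WOULD_BUG;   /* unexpected file type */
    }
}

/* New style handler: this is where the case for the new file landed. */
static enum outcome read_u64(enum file_type type)
{
    switch (type) {
    case FILE_CPU_EXCLUSIVE:
    case FILE_SCHED_RELAX_DOMAIN_LEVEL:
        return HANDLED;
    default:
        return WOULD_BUG;
    }
}

/* One entry per control file: its name, type and read handler. */
struct file_entry {
    const char *name;
    enum file_type type;
    read_fn read;
};

/* Partially converted: still wired to the old style handler. */
const struct file_entry mangled_entry = {
    "sched_relax_domain_level", FILE_SCHED_RELAX_DOMAIN_LEVEL,
    common_file_read
};

/* Fully converted: wired to the handler that has the case. */
const struct file_entry fixed_entry = {
    "sched_relax_domain_level", FILE_SCHED_RELAX_DOMAIN_LEVEL,
    read_u64
};

/* Dispatch a read through whatever handler the entry registered. */
enum outcome read_file(const struct file_entry *f)
{
    assert(f != NULL && f->read != NULL);
    return f->read(f->type);
}
```

In this model, read_file(&mangled_entry) ends up in the default branch
of the old style switch, while read_file(&fixed_entry) is handled. The
write path would be analogous.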

The following provides more details, and presents a couple of questions
for Andrew or Paul Menage, at the end.

===

On April 29, Paul Menage observed that the cpuset patch for
'sched_relax_domain_level' got mangled -- it ended up using the
old style common file read/write routines, but with the cases
to handle it added to Paul M's new style *_u64 handlers.

Paul M proposed the following untested patch:

Andrew replied:

I definitely agree with the above observations of Paul M.  I suspect
that the patch might be missing the lines needed to -remove- the
FILE_SCHED_RELAX_DOMAIN_LEVEL cases from the old style
cpuset_common_file_read and cpuset_common_file_write switches.

The kernel now at the top of Linus's git tree hits a BUG()
immediately, anytime you try to read or write these new
per-cpuset files "sched_relax_domain_level".

I tried looking in 2.6.25-rc1-mm1-mmotm (as of an hour ago),
and it -looks- like the fix is in the linux-next.patch there.

However:

 1) I can't get 2.6.25-rc1-mm1-mmotm to apply even close to
    either of 2.6.25 or 2.6.25-rc1.  Blows up on the first
    patch.

	==> akpm - what does today's 2.6.25-rc1-mm1-mmotm
	    apply to?

 2) I didn't see any replies from Paul M in response to
    Andrew's above request to "send us any needed fixup later
    in the week".

	==> Paul M or akpm - Is this fixup in the pipeline?

    I guess it is, from my reading of the linux-next.patch
    in 2.6.25-rc1-mm1-mmotm, but I'm not confident I'm
    reading that patch right.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
--