Given the patches being busily submitted by trivial patch submitters to make use kmem_cache_zalloc(), et. al, I believe we should remove the unlikely() tests around the (gfp_flags & __GFP_ZERO) tests, such as: - if (unlikely((flags & __GFP_ZERO) && objp)) + if ((flags & __GFP_ZERO) && objp) memset(objp, 0, obj_size(cachep)); Agreed? If so, I'll send a patch... - Ted --
I support it. Recently Steven tried to gather the information. http://thread.gmane.org/gmane.linux.kernel/1072767 -- Kind regards, Minchan Kim --
Hi, I guess the rationale here is that if you're going to take the hit of memset() you can take the hit of unlikely() as well. We're optimizing for hot call-sites that allocate a small amount of memory and initialize everything themselves. That said, I don't think the unlikely() annotation matters much either way and am for removing it unless people object to that. That would be interesting, sure. --
Note, you could do it yourself too. Just enable:
Kernel Hacking -> Tracers -> Branch Profiling
(Trace likely/unlikely profiler)
CONFIG_PROFILE_ANNOTATED_BRANCHES
Then search /debug/tracing/trace_stats/branch_annotated.
(hmm, the help in Kconfig is wrong, I need to fix that)
Anyway, here's my box. I just started it an hour ago, and have not been
doing too much on it yet. But here's what I got (using SLUB)
correct incorrect % Function File Line
------- --------- - -------- ---- ----
6890998 2784830 28 slab_alloc slub.c 1719
That's incorrect 28% of the time.
-- Steve
--
Thanks! AFAICT, that number is high enough to justify removing the unlikely() annotations, no? --
Personally, I think anything that is incorrect more that 5% of the time should not have any annotation. My rule is to use the annotation when a branch goes one way 95% or more. With the exception of times when we want a particular path to be the faster path, because we know its in a more critical position (as there are cases in the scheduler and the tracing infrastructure itself). But here, I think removing it is the right decision. -- Steve --
I suspect for many slab caches, all of the slab allocations for a given slab cache type will have the GFP_ZERO flag passed. So maybe it would be more efficient to zap the entire page when it is pressed into service for a particular slab cache, so we can avoid needing to use memset on a per-object basis? - Ted --
Hi Ted,
We'd then need to do memset() in kmem_cache_free() because callers are
not required to clean up after them. In general, we don't want to do
that because object cachelines are less likely to be touched after
free than they are after allocation (you usually use the memory
immediately after you allocate).
Pekka
--
Sounds good to me. We might consider dropping this flag and making the decision statically (ie alloc vs zalloc), at least for slab objects. -- Mathematics is the supreme nostalgia of our time. --
