Linux: Slab Defragmentation

Submitted by Jeremy
on September 4, 2007 - 10:59am

"Slab defragmentation is mainly an issue if Linux is used as a fileserver and large amounts of dentries, inodes and buffer heads accumulate," Christoph Lameter explained when posting the fifth version of his patchset. He continued, "in some load situations the slabs become very sparsely populated so that a lot of memory is wasted by slabs that only contain one or a few objects. In extreme cases the performance of a machine will become sluggish since we are continually running reclaim. Slab defragmentation adds the capability to recover wasted memory." Christoph noted that the patch is difficult to validate and measure because, "activities are only performed when special load situations are encountered." He then pointed to updatedb as something that typically triggers slab defragmentation on his systems:

"Updatedb scans all files on the system which causes a high inode and dentry use. After updatedb is complete we need to go back to the regular use patterns (typical on my machine: kernel compiles). Those need the memory now for different purposes. The inodes and dentries used for updatedb will gradually be aged by the dentry/inode reclaim algorithm which will free up the dentries and inode entries randomly through the slabs that were allocated. As a result the slabs will become sparsely populated. If they become empty then they can be freed but a lot of them will remain sparsely populated. That is where slab defrag comes in: It removes the slabs with just a few entries reclaiming more memory for other uses."
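
The effect Christoph describes can be illustrated with a toy Python simulation. This is a sketch, not code from the patchset: the function name, the slab count, the objects-per-slab figure and the survival fraction are all made-up assumptions. It fills every slab, frees objects at random positions the way aging reclaim would, and compares the number of slabs still pinned by at least one live object with the number that would be needed if the survivors were packed densely.

```python
import random


def simulate_slab_aging(num_slabs=1000, objs_per_slab=20,
                        survive_fraction=0.1, seed=0):
    """Toy model: all slots start allocated, then most are freed at random.

    All parameters are illustrative assumptions, not kernel values.
    """
    rng = random.Random(seed)
    total_slots = num_slabs * objs_per_slab

    # Pick the slots whose objects survive reclaim, uniformly at random.
    survivors = rng.sample(range(total_slots),
                           int(total_slots * survive_fraction))

    # Count live objects per slab; slot i belongs to slab i // objs_per_slab.
    live_per_slab = [0] * num_slabs
    for slot in survivors:
        live_per_slab[slot // objs_per_slab] += 1

    empty = sum(1 for n in live_per_slab if n == 0)
    partial = num_slabs - empty           # slabs that cannot be freed yet
    live = len(survivors)
    packed = -(-live // objs_per_slab)    # ceiling division: ideal slab count

    return {"empty": empty, "partial": partial,
            "live": live, "packed": packed}


if __name__ == "__main__":
    stats = simulate_slab_aging()
    print("slabs freed because they became empty:", stats["empty"])
    print("slabs still held by a few objects:   ", stats["partial"])
    print("slabs needed if densely packed:      ", stats["packed"])
```

In this model only the completely empty slabs can be returned to the page allocator. The gap between the partially filled slabs and the densely packed count is the memory that defragmentation aims to recover.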


From:	Christoph Lameter [email blocked]
Subject: [RFC 00/26] Slab defragmentation V5
Date:	Fri, 31 Aug 2007 18:41:07 -0700

Slab defragmentation is mainly an issue if Linux is used as a fileserver
and large amounts of dentries, inodes and buffer heads accumulate. In some
load situations the slabs become very sparsely populated so that a lot of
memory is wasted by slabs that only contain one or a few objects. In
extreme cases the performance of a machine will become sluggish since
we are continually running reclaim. Slab defragmentation adds the
capability to recover wasted memory.

For lumpy reclaim slab defragmentation can be used to enhance the
ability to recover larger contiguous areas of memory. Lumpy reclaim currently
cannot do anything if a slab page is encountered. With slab defragmentation
that slab page can be removed and a large contiguous page freed. It may
be possible to have slab pages also part of ZONE_MOVABLE (Mel's defrag
scheme in 2.6.23) or the MOVABLE areas (antifrag patches in mm).

The trouble with this patchset is that it is difficult to validate.
Activities are only performed when special load situations are encountered.
Are there any tests that could give meaningful information about
the effectiveness of these measures? I have run various tests here
creating and deleting files and building kernels under low memory situations
to trigger these reclaim mechanisms but how does one measure their
effectiveness?
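
One rough way to approach the question, offered as an illustration and not taken from the patchset, is to compare active objects with allocated objects per cache in /proc/slabinfo before and after a workload such as updatedb. The Python sketch below parses text in the slabinfo 2.1 layout. The sample numbers are invented for the example, and whether this ratio is a good measure of defragmentation effectiveness is an assumption.

```python
# Invented sample in the /proc/slabinfo (version 2.1) layout.
# The numbers are hypothetical and do not come from a real machine.
SAMPLE = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
dentry              12000  60000    192   21    1 : tunables    0    0    0 : slabdata   2858   2858      0
inode_cache          9000  10000    560    7    1 : tunables    0    0    0 : slabdata   1429   1429      0
buffer_head             0      0    104   39    1 : tunables    0    0    0 : slabdata      0      0      0
"""


def slab_utilization(text):
    """Return {cache_name: active_objs / num_objs} for non-empty caches."""
    result = {}
    for line in text.splitlines():
        # Skip the version line, the column header and blank lines.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        name = fields[0]
        active_objs = int(fields[1])
        num_objs = int(fields[2])
        if num_objs == 0:
            continue  # nothing allocated, so the ratio is undefined
        result[name] = active_objs / num_objs
    return result


if __name__ == "__main__":
    for name, ratio in sorted(slab_utilization(SAMPLE).items()):
        print(f"{name:<16} {ratio:6.1%} of allocated objects in use")
```

On a real system the same function could be fed the contents of /proc/slabinfo, which usually requires root to read. A low ratio for dentry or inode caches after updatedb would point to the sparsely populated slabs described in this mail.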

The patchset is also available via git

git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git defrag


We currently support the following types of reclaim:

1. dentry cache
2. inode cache (with a generic interface to allow easy setup of more
   filesystems than the currently supported ext2/3/4, reiserfs, XFS
   and proc)
3. buffer_head

One typical mechanism that triggers slab defragmentation on my systems
is the daily run of

	updatedb

Updatedb scans all files on the system which causes a high inode and dentry
use. After updatedb is complete we need to go back to the regular use
patterns (typical on my machine: kernel compiles). Those need the memory now
for different purposes. The inodes and dentries used for updatedb will
gradually be aged by the dentry/inode reclaim algorithm which will free
up the dentries and inode entries randomly through the slabs that were
allocated. As a result the slabs will become sparsely populated. If they
become empty then they can be freed but a lot of them will remain sparsely
populated. That is where slab defrag comes in: It removes the slabs with
just a few entries reclaiming more memory for other uses.

V4->V5:
- Support lumpy reclaim for slabs
- Support reclaim via slab_shrink()
- Add constructors to ensure a consistent object state at all times.

V3->V4:
- Optimize scan for slabs that need defragmentation
- Add /sys/slab/*/defrag_ratio to allow setting defrag limits
  per slab.
- Add support for buffer heads.
- Describe how the cleanup after the daily updatedb can be
  improved by slab defragmentation.

V2->V3:
- Support directory reclaim
- Add infrastructure to trigger defragmentation after slab shrinking if we
  have slabs with a high degree of fragmentation.

V1->V2:
- Clean up control flow using a state variable. Simplify API. Back to 2
  functions that now take arrays of objects.
- Inode defrag support for a set of filesystems
- Fix up dentry defrag support to work on negative dentries by adding
  a new dentry flag that indicates that a dentry is not in the process
  of being freed or allocated.


Ah yes.

on
September 4, 2007 - 1:35pm

Behold the power of updatedb, everyone's favorite workload to love to hate.

--
Program Intellivision and play Space Patrol!

rlocate is much better

Dâniel Fraga (not verified)
on
September 4, 2007 - 5:34pm

Instead of using updatedb, use rlocate (http://rlocate.sourceforge.net/), which is much better and much more intelligent. Software that scans all the files every day is pretty stupid.

uhh... rlocate still

Anonymous (not verified)
on
September 4, 2007 - 11:42pm

uhh... rlocate still recommends running updatedb once per day, so I don't see your point at all. All it does is maintain a diff of the database that's always up to date. IMO, that's overkill for the intended purpose. In general, if a file has been changed in the last 24 hours, I know where it is.

Anyway, what's wrong with scanning all files once per day? On development machines, where this sort of functionality is really useful, no one uses the computer in the middle of the night.

Wrong.

Anonymous (not verified)
on
September 5, 2007 - 6:52am

My development boxes are regularly used in the middle of the night. And during the day. At all hours, occasionally. The updatedb is set to run at 6:25 and I still sometimes manage to be doing something on the boxes at that time...

But...

Anonymous (not verified)
on
September 5, 2007 - 2:46pm

"uhh... rlocate still recommends running updatedb once per day, so I don't see your point at all"

Ah, but rlocate's updatedb takes a few seconds, except the first time and every 10th time.

Ok...

on
September 5, 2007 - 5:39pm

When it does end up doing a full scan ("every 10th time"), it's still a crappy workload.

Sure, the "scan everything" approach isn't typical of a wide range of workloads, but it handily demonstrates poor behavior on the part of the Linux kernel, and it continues to be everyone's favorite whipping boy for one reason or another.

Just imagine if we had to run system-wide virus scans regularly like people on some other OSes are forced to. Ick. Then you're not just scanning all the metadata, you're also touching the data itself. Eew?

--
Program Intellivision and play Space Patrol!

I have to run in

Dâniel Fraga (not verified)
on
September 11, 2007 - 9:14pm

I have to run in cron:


/usr/local/bin/rlocate -u


every day, but it takes no more than 1 minute. I ran it just now and it took 1 second. So what's the point?

Just an example

Anonymous (not verified)
on
September 5, 2007 - 3:22am

Come on, it was just an example. Put your_extensive_hd_program in place of updatedb. You missed the point.

Too bad it's Linux_2.6-only.

Anonymous (not verified)
on
September 5, 2007 - 7:14am

Too bad it's Linux_2.6-only. Updatedb is portable and userland. So I'm not sure it's "better" and "more intelligent".

That's beside the point.

Flewellyn (not verified)
on
September 5, 2007 - 7:53pm

The point is that updatedb is an example of a workload that scans a large amount of data exactly once. It's a massive cache-trasher, in other words. It's probably the only program in common use by everyday desktop users that behaves this way, as far as we know, for now. But that may not remain so. And besides, that says nothing about the server side of things, where such cache-trashing workloads may be more common.

So if slab defrag can help such cases, I say, it's a good thing!
