Hi,
I'm seeing a serious memory leak whenever I use my /home partition.
It's an encrypted partition using dm-crypt. Simply running
'stress -d 5' is enough to exhaust the memory in a few minutes.

When I stop 'stress' the memory isn't returned.
This doesn't seem to happen when I run stress on a normal (no dm and
no encryption) partition.

Here's the output of 'free -m' before and after 'stress -d 1 -t 5'
Before:
             total       used       free     shared    buffers     cached
Mem:           751        584        167          0         14        235
-/+ buffers/cache:        334        416
Swap:          980          0        980

After:
             total       used       free     shared    buffers     cached
Mem:           751        619        131          0         12        169
-/+ buffers/cache:        437        313
Swap:          980          0        980

That's 100 MB of RAM gone in 5 seconds.
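
For reference, the figure can be checked from the 'Mem:' rows above. The sketch below is a minimal illustration in C; the helper names app_used_mb and leak_estimate_mb are hypothetical and not part of any tool. It subtracts buffers and page cache from 'used', which is what the '-/+ buffers/cache' row reports, and compares the two snapshots.

```c
#include <assert.h>

/*
 * Memory held by something other than buffers and page cache, in MB.
 * Mirrors the "used" column of the "-/+ buffers/cache" row of free(1).
 * Hypothetical helper, for illustration only.
 */
long app_used_mb(long used, long buffers, long cached)
{
	assert(used >= buffers + cached);
	return used - buffers - cached;
}

/* Growth between the two 'free -m' snapshots quoted in the report. */
long leak_estimate_mb(void)
{
	long before = app_used_mb(584, 14, 235);
	long after = app_used_mb(619, 12, 169);

	return after - before;
}
```

With these inputs leak_estimate_mb() gives 103, which matches the rise from 334 to 437 in the '-/+ buffers/cache' rows (free rounds each column separately, so the absolute values are off by one).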
I'm using a standard x86 centrino laptop.
The log files don't reveal anything suspicious.

git bisect tells me it started in one of these commits:
d24517d793f21edab1a411da95f2c45cb88a84aa,
5bb23a688b2de23d7765a1dd439d89c038378978 and
9cc54d40b8ca01fcefc9151044b6996565061d90.

The bug is still present in the latest version of Linus' tree
(65a6ec0d72a07f16719e9b7a96e1c4bae044b591).

Let me know if I can provide more information.
Regards,
Kristof
Same setup here - I can confirm the leak. It happened here while compiling the
latest kernel while running the previous build. Over 1GB "vanished" :D ... Now
I'm back to the default kernel.

I'll apply the patch now and report.
Greets
Jan-Simon
Please try with this patch from Neil.
diff .prev/drivers/md/dm-crypt.c ./drivers/md/dm-crypt.c
--- .prev/drivers/md/dm-crypt.c 2007-10-15 17:18:20.000000000 +1000
+++ ./drivers/md/dm-crypt.c 2007-10-15 17:21:43.000000000 +1000
@@ -444,32 +444,12 @@ static struct bio *crypt_alloc_buffer(st
 }
 
 static void crypt_free_buffer_pages(struct crypt_config *cc,
-				    struct bio *clone, unsigned int bytes)
+				    struct bio *clone)
 {
-	unsigned int i, start, end;
+	unsigned int i;
 	struct bio_vec *bv;
 
-	/*
-	 * This is ugly, but Jens Axboe thinks that using bi_idx in the
-	 * endio function is too dangerous at the moment, so I calculate the
-	 * correct position using bi_vcnt and bi_size.
-	 * The bv_offset and bv_len fields might already be modified but we
-	 * know that we always allocated whole pages.
-	 * A fix to the bi_idx issue in the kernel is in the works, so
-	 * we will hopefully be able to revert to the cleaner solution soon.
-	 */
-	i = clone->bi_vcnt - 1;
-	bv = bio_iovec_idx(clone, i);
-	end = (i << PAGE_SHIFT) + (bv->bv_offset + bv->bv_len) - clone->bi_size;
-	start = end - bytes;
-
-	start >>= PAGE_SHIFT;
-	if (!clone->bi_size)
-		end = clone->bi_vcnt;
-	else
-		end >>= PAGE_SHIFT;
-
-	for (i = start; i < end; i++) {
+	for (i = 0; i < clone->bi_vcnt; i++) {
 		bv = bio_iovec_idx(clone, i);
 		BUG_ON(!bv->bv_page);
 		mempool_free(bv->bv_page, cc->page_pool);
@@ -539,7 +519,7 @@ static void crypt_endio(struct bio *clon
 	 * free the processed pages
 	 */
 	if (!read_io) {
-		crypt_free_buffer_pages(cc, clone, clone->bi_size);
+		crypt_free_buffer_pages(cc, clone);
 		goto out;
 	}
 
@@ -628,7 +608,7 @@ static void process_write(struct dm_cryp
 		ctx.idx_out = 0;
 
 		if (unlikely(crypt_convert(cc, &ctx) < 0)) {
-			crypt_free_buffer_pages(cc, clone, clone->bi_size);
+			crypt_free_buffer_pages(cc, clone);
 			bio_put(clone);
 			crypt_dec_pending(io, -EIO);
 			return;
...
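
To see why the removed index arithmetic can leave pages behind, here is a small user-space model of it. It is only a sketch under stated assumptions: struct model_vec, struct model_bio and the function names are hypothetical stand-ins rather than kernel definitions, a 4 KiB page size is assumed, and every vector is assumed to cover one whole page. It counts how many pages the old loop (for i = start; i < end) would visit when called as crypt_free_buffer_pages(cc, clone, clone->bi_size), against the patched loop that walks all bi_vcnt entries.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)
#define MODEL_MAX_VECS   16

/* Hypothetical stand-ins for the bio_vec/bio fields the patch touches. */
struct model_vec {
	unsigned int bv_len;
	unsigned int bv_offset;
};

struct model_bio {
	unsigned int bi_vcnt;	/* number of vectors (pages) */
	unsigned int bi_size;	/* byte count as seen by the caller */
	struct model_vec vecs[MODEL_MAX_VECS];
};

/* Build a clone made of whole pages with a chosen bi_size. */
struct model_bio model_bio_full_pages(unsigned int pages, unsigned int bi_size)
{
	struct model_bio bio = { 0 };
	unsigned int i;

	assert(pages >= 1 && pages <= MODEL_MAX_VECS);
	bio.bi_vcnt = pages;
	bio.bi_size = bi_size;
	for (i = 0; i < pages; i++) {
		bio.vecs[i].bv_offset = 0;
		bio.vecs[i].bv_len = MODEL_PAGE_SIZE;
	}
	return bio;
}

/* Pages the pre-patch loop would visit: same arithmetic as the removed lines. */
unsigned int old_pages_visited(const struct model_bio *clone, unsigned int bytes)
{
	unsigned int i, start, end;
	const struct model_vec *bv;

	i = clone->bi_vcnt - 1;
	bv = &clone->vecs[i];
	end = (i << MODEL_PAGE_SHIFT) + (bv->bv_offset + bv->bv_len) - clone->bi_size;
	start = end - bytes;	/* wraps around when bytes > end */

	start >>= MODEL_PAGE_SHIFT;
	if (!clone->bi_size)
		end = clone->bi_vcnt;
	else
		end >>= MODEL_PAGE_SHIFT;

	return end > start ? end - start : 0;
}

/* Pages the patched loop visits: every vector in the clone. */
unsigned int new_pages_visited(const struct model_bio *clone)
{
	return clone->bi_vcnt;
}
```

For a clone of four whole pages, old_pages_visited(&bio, bio.bi_size) is 0 both when bi_size is 0 and when bi_size still holds the full length, while new_pages_visited(&bio) is 4. Which of those states the clone is really in at completion time is not stated in this thread; the model only shows that passing clone->bi_size as the byte count frees nothing in either case.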
Tried the patch. Now it's usable - as far as I can say. Compiled a fresh kernel
to "stress" the system (make -j3).

             total       used       free     shared    buffers     cached
Mem:       2053260     691856    1361404          0       5884     324868
-/+ buffers/cache:     361104    1692156

             total       used       free     shared    buffers     cached
Mem:       2053260    2006588      46672          0         32    1492360
-/+ buffers/cache:     514196    1539064
Swap:      2104472         96    2104376

Greets
Jan-Simon
Same here. As far as I can tell it's fixed.

Thanks,
Kristof
Great, thanks for testing. Current git has the fix as of last night
(CET).

--
Jens Axboe
