Re: [PATCH 06/10] Block discard support

Previous thread: Re: 2.6.35 - INFO: kernel/signal.c:660 invoked rcu_dereference_check() without protection! by Paul E. McKenney on Monday, August 9, 2010 - 10:13 am. (2 messages)

Next thread: Re: 2.6.35 - INFO: kernel/exit.c:1387 invoked rcu_dereference_check() without protection! by Paul E. McKenney on Monday, August 9, 2010 - 10:27 am. (3 messages)
From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

The zram module creates RAM based block devices named /dev/zram<id>
(<id> = 0, 1, ...). Pages written to these disks are compressed and stored
in memory itself.

One of the major changes done is the replacement of ioctls with sysfs
interface. One of the advantages of this approach is we no longer depend on the
userspace tool (rzscontrol) which was used to set various parameters and check
statistics. Maintaining updated version of rzscontrol as changes were done to
ioctls, statistics exported etc. was a major pain.

Another significant change is the introduction of percpu stats and compression
buffers. Earlier, we had per-device buffers protected by a mutex. This was a
major bottleneck on multi-core systems. With these changes, benchmarks with
fio[1] showed a speedup of about 20% for write performance on dual-core
system (see patch 4/10 description for details).


For easier testing, a single patch against 2.6.35-git8 has been uploaded at:
http://compcache.googlecode.com/hg/sub-projects/mainline/zram_2.6.36-rc0.patch

[1] fio I/O benchmarking tool: http://freshmeat.net/projects/fio/

Nitin Gupta (10):
  Replace ioctls with sysfs interface
  Remove need for explicit device initialization
  Use percpu stats
  Use percpu buffers
  Reduce per table entry overhead by 4 bytes
  Block discard support
  Increase compressed page size threshold
  Some cleanups
  Update zram documentation
  Document sysfs entries

 Documentation/ABI/testing/sysfs-block-zram |   99 ++++
 drivers/staging/zram/Kconfig               |   12 -
 drivers/staging/zram/Makefile              |    2 +-
 drivers/staging/zram/xvmalloc.c            |   22 +-
 drivers/staging/zram/zram.txt              |   58 ++-
 drivers/staging/zram/zram_drv.c            |  670 ++++++++++++++--------------
 drivers/staging/zram/zram_drv.h            |  138 ++----
 drivers/staging/zram/zram_ioctl.h          |   41 --
 drivers/staging/zram/zram_sysfs.c          |  253 +++++++++++
 9 files changed, 760 insertions(+), 535 ...
From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Creates per-device sysfs nodes in /sys/block/zram<id>/
Currently following stats are exported:
 - disksize
 - num_reads
 - num_writes
 - invalid_io
 - zero_pages
 - orig_data_size
 - compr_data_size
 - mem_used_total

By default, disksize is set to 0. So, to start using
a zram device, fist write a disksize value and then
initialize device by writing any positive value to
initstate. For example:

        # initialize /dev/zram0 with 50MB disksize
        echo 50*1024*1024 | bc > /sys/block/zram0/disksize
        echo 1 > /sys/block/zram0/initstate

When done using a disk, issue reset to free its memory
by writing any positive value to reset node:

        echo 1 > /sys/block/zram0/reset

This change also obviates the need for 'rzscontrol' utility.

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/Kconfig      |   12 --
 drivers/staging/zram/Makefile     |    2 +-
 drivers/staging/zram/zram_drv.c   |  186 ++++++++--------------------
 drivers/staging/zram/zram_drv.h   |   55 ++-------
 drivers/staging/zram/zram_ioctl.h |   41 ------
 drivers/staging/zram/zram_sysfs.c |  242 +++++++++++++++++++++++++++++++++++++
 6 files changed, 306 insertions(+), 232 deletions(-)
 delete mode 100644 drivers/staging/zram/zram_ioctl.h
 create mode 100644 drivers/staging/zram/zram_sysfs.c

diff --git a/drivers/staging/zram/Kconfig b/drivers/staging/zram/Kconfig
index 4654ae2..da079f8 100644
--- a/drivers/staging/zram/Kconfig
+++ b/drivers/staging/zram/Kconfig
@@ -15,15 +15,3 @@ config ZRAM
 
 	  See zram.txt for more information.
 	  Project home: http://compcache.googlecode.com/
-
-config ZRAM_STATS
-	bool "Enable statistics for compressed RAM disks"
-	depends on ZRAM
-	default y
-	help
-	  Enable statistics collection for compressed RAM devices. Statistics
-	  are exported through ioctl interface, so you have to use zramconfig
-	  program to get them. This adds only a minimal overhead.
-
-	  If unsure, say Y.
-
diff --git ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 11:34 am

Looks good to me (but I'm not a sysfs guy).


These could probably use atomic_inc(), atomic64_inc(), and friends, no?
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 8:06 pm

Yes, I think we could use them. Anyways, they are replaced by percpu stats in
patch 3, so probably this can be left as-is.

Thanks,
Nitin
--

From: Dave Hansen
Date: Tuesday, August 31, 2010 - 4:06 pm

Maybe I'm just a weirdo, but I don't really use modules much.  That
effectively means that I'm stuck at boot with one zram device.

Making it a read-only module param also means that someone can't add a
second at runtime while the first is still in use.

It doesn't seem to be used very pervasively, but there is a
module_param_cb() function so you can register callbacks when the param
gets updated.  Might come in handy for this.

-- Dave

--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Currently, the user has to explicitly write a positive value to
initstate sysfs node before the device can be used. This event
triggers allocation of per-device metadata like memory pool,
table array and so on.

We do not pre-initialize all zram devices since the 'table' array,
mapping disk blocks to compressed chunks, takes considerable amount
of memory (8 bytes per page). So, pre-initializing all devices will
be quite wasteful if only few or none of the devices are actually
used.

This explicit device initialization from user is an odd requirement and
can be easily avoided. We now initialize the device when first write is
done to the device.

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram_drv.c   |   35 +++++++++++++++++++++++------------
 drivers/staging/zram/zram_drv.h   |    2 ++
 drivers/staging/zram/zram_sysfs.c |   26 ++++----------------------
 3 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 3f698a5..c5f84ee 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -207,9 +207,15 @@ static int zram_read(struct zram *zram, struct bio *bio)
 	u32 index;
 	struct bio_vec *bvec;
 
-	zram_stat64_inc(zram, &zram->stats.num_reads);
+	if (unlikely(!zram->init_done)) {
+		set_bit(BIO_UPTODATE, &bio->bi_flags);
+		bio_endio(bio, 0);
+		return 0;
+	}
 
+	zram_stat64_inc(zram, &zram->stats.num_reads);
 	index = bio->bi_sector >> SECTORS_PER_PAGE_SHIFT;
+
 	bio_for_each_segment(bvec, bio, i) {
 		int ret;
 		size_t clen;
@@ -275,16 +281,20 @@ out:
 
 static int zram_write(struct zram *zram, struct bio *bio)
 {
-	int i;
+	int i, ret;
 	u32 index;
 	struct bio_vec *bvec;
 
-	zram_stat64_inc(zram, &zram->stats.num_writes);
+	if (unlikely(!zram->init_done)) {
+		ret = zram_init_device(zram);
+		if (ret)
+			goto out;
+	}
 
+	zram_stat64_inc(zram, &zram->stats.num_writes);
 	index = bio->bi_sector >> ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 11:36 am

AFAICT, most hardware block device drivers do things like this in the
probe function. Why can't we do that for zram as well and drop the
->init_done and ->init_lock parts?
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 8:38 pm

I think probe is only for PCI devices? Maybe should hook into open function
in struct block_device_operations. That way, we can also drop init_done and
init_lock parts.

Thanks,
Nitin
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Also remove references to removed stats (ex: good_comress).

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram_drv.c   |   77 ++++++++++++++++--------------------
 drivers/staging/zram/zram_drv.h   |   33 +++++++++-------
 drivers/staging/zram/zram_sysfs.c |   46 +++++++++++++++-------
 3 files changed, 84 insertions(+), 72 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index c5f84ee..f111b86 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -38,33 +38,27 @@ struct zram *devices;
 /* Module params (documentation at end) */
 unsigned int num_devices;
 
-static void zram_stat_inc(u32 *v)
+static void zram_add_stat(struct zram *zram,
+			enum zram_stats_index idx, s64 val)
 {
-	*v = *v + 1;
+	struct zram_stats_cpu *stats;
+
+	preempt_disable();
+	stats = __this_cpu_ptr(zram->stats);
+	u64_stats_update_begin(&stats->syncp);
+	stats->count[idx] += val;
+	u64_stats_update_end(&stats->syncp);
+	preempt_enable();
 }
 
-static void zram_stat_dec(u32 *v)
+static void zram_inc_stat(struct zram *zram, enum zram_stats_index idx)
 {
-	*v = *v - 1;
+	zram_add_stat(zram, idx, 1);
 }
 
-static void zram_stat64_add(struct zram *zram, u64 *v, u64 inc)
+static void zram_dec_stat(struct zram *zram, enum zram_stats_index idx)
 {
-	spin_lock(&zram->stat64_lock);
-	*v = *v + inc;
-	spin_unlock(&zram->stat64_lock);
-}
-
-static void zram_stat64_sub(struct zram *zram, u64 *v, u64 dec)
-{
-	spin_lock(&zram->stat64_lock);
-	*v = *v - dec;
-	spin_unlock(&zram->stat64_lock);
-}
-
-static void zram_stat64_inc(struct zram *zram, u64 *v)
-{
-	zram_stat64_add(zram, v, 1);
+	zram_add_stat(zram, idx, -1);
 }
 
 static int zram_test_flag(struct zram *zram, u32 index,
@@ -131,7 +125,7 @@ static void zram_set_disksize(struct zram *zram, size_t totalram_bytes)
 
 static void zram_free_page(struct zram *zram, size_t index)
 {
-	u32 clen;
+	int clen;
 	void *obj;
 
 	struct ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 11:44 am

Acked-by: Pekka Enberg <penberg@kernel.org>
--

From: Andrew Morton
Date: Monday, August 9, 2010 - 9:34 pm

That reimplements include/linux/percpu_counter.h, poorly.

Please see the June discussion "[PATCH v2 1/2] tmpfs: Quick token
library to allow scalable retrieval of tokens from token jar" for some
discussion.

--

From: Nitin Gupta
Date: Wednesday, August 11, 2010 - 9:39 am

I read the discussion you pointed out but still fail to see how percpu_counters,
with all their overhead, are better than simple pcpu variable used in current
version. What is the advantage?

Thanks,
Nitin

--

From: Andrew Morton
Date: Wednesday, August 11, 2010 - 10:18 am

What overhead?  Send numbers.  Then extrapolate those numbers to a

Firstly, they'd have saved all the time you spent duplicating them.

Secondly, getting additional users of the standard facility results in
more testing and perhaps enhancement of that facility, thus benefiting
other users too.

Thirdly, using the standard facility permits your code to leverage
enhancements which others add. 

Fourthly, they would result in a smaller kernel.

You didn't really need me to teach you the benefits of code reuse did
you?


Please do not merge this code unless there is a good reason to do so
and it has been shown that the standard facility cannot be suitably
fixed or enhanced to address the deficiency.


--

From: Christoph Lameter
Date: Monday, August 30, 2010 - 9:20 am

Maybe do

#define zram_add_stat(zram, index, val)
		this_cpu_add(zram->stats->count[index], val)

instead? It creates an add in a single "atomic" per cpu instruction and
deals with the fallback scenarios for processors that cannot handle 64
bit adds.

--

From: Nitin Gupta
Date: Tuesday, August 31, 2010 - 1:31 pm

Yes, this_cpu_add() seems sufficient. I can't recall why I used u64_stats_*
but if it's not required for atomic access to 64-bit then why was it added to
the mainline in the first place?

Thanks,
Nitin
--

From: Eric Dumazet
Date: Tuesday, August 31, 2010 - 2:28 pm

Because we wanted to have fast 64bit counters, even on 32bit arches, and
this has litle to do with 'atomic' on one entity, but a group of
counters. (check drivers/net/loopback.c, lines 91-94). No lock prefix
used in fast path.

We also wanted readers to read correct values, not a value being changed
by a writer, with inconsistent 32bit halves. SNMP applications want
monotonically increasing counters.

this_cpu_add()/this_cpu_read() doesnt fit.

Even for single counter, this_cpu_read(64bit) is not using an RMW
(cmpxchg8) instruction, so you can get very strange results when low
order 32bit wraps.



--

From: Christoph Lameter
Date: Tuesday, August 31, 2010 - 2:35 pm

How about fixing it so that everyone benefits?

--

From: Eric Dumazet
Date: Tuesday, August 31, 2010 - 2:41 pm

IMHO, this_cpu_read() is fine as is : a _read_ operation.

Dont pretend it can be used in every context, its not true.



--

From: Christoph Lameter
Date: Wednesday, September 1, 2010 - 1:05 pm

The problem only exists on 32 bit platforms using 64 bit counters. If you
would provide this functionality for the fallback case of 64 bit counters
(here x86) in 32 bit arch code then you could use the this_cpu_*
operations in all context without your special code being replicated in
ohter places.

The additional advantage would be that for the 64bit case you would have
much faster and more compact code.


--

From: Eric Dumazet
Date: Wednesday, September 1, 2010 - 1:38 pm

My implementation is portable and use existing infrastructure, at the
time it was coded. BTW, its fast on 64bit too. As fast as previous
implementation. No extra code added. Please double check.

If you believe you can do better, please do so.

Of course, we added 64bit network stats to all 32bit arches only because
cost was acceptable. (I say all 32bit arches, because you seem to think
only x86 was the target)

Using this_cpu_{add|res}() fallback using atomic ops or spinlocks would
be slower than actual implemenation (smp_wmb() (nops on x86) and
increments).



--

From: Christoph Lameter
Date: Wednesday, September 1, 2010 - 5:34 pm

Well if you use the this_cpu_add() on 64 bit you have a single instruction


Ok then add your 64 bit on 32 bit implementation as a default for all 32
bit configurations. The point is to not have special implementations for
particular counters. It would be great if the logic you developed for the
network counters could be used in general for all who have issues with 64

That would be bad. But then I do not see anyone around that wants
to implement such fallback behavior.

The current generic fallback for 64 ops (if the arch does not provide it
which 32 bit arches generally do not) is to switch off preempt and then
perform the 64 bit op. There is also a irqsafe version as well. Have a
look at inclue/linux/percpu.h.



--

From: Anton Blanchard
Date: Tuesday, August 31, 2010 - 8:41 pm

Sorry, was having other zram issues on ppc64 that prevented me from testing
this. Stupid bug, removed the &.

Anton

zram: Free percpu data on module exit.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/drivers/staging/zram/zram_drv.c
===================================================================
--- powerpc.git.orig/drivers/staging/zram/zram_drv.c	2010-08-31 15:15:59.344290847 +1000
+++ powerpc.git/drivers/staging/zram/zram_drv.c	2010-09-01 12:35:02.964893575 +1000
@@ -483,8 +483,7 @@ void zram_reset_device(struct zram *zram
 	xv_destroy_pool(zram->mem_pool);
 	zram->mem_pool = NULL;
 
-	/* Reset stats */
-	memset(&zram->stats, 0, sizeof(zram->stats));
+	free_percpu(zram->stats);
 
 	zram->disksize = zram_default_disksize();
 	mutex_unlock(&zram->init_lock);
--

From: Anton Blanchard
Date: Tuesday, August 31, 2010 - 8:51 pm

I'm getting an oops when running mkfs on zram:

NIP [d0000000030e0340] .zram_inc_stat+0x58/0x84 [zram]
[c00000006d58f720] [d0000000030e091c] .zram_make_request+0xa8/0x6a0 [zram]
[c00000006d58f840] [c00000000035795c] .generic_make_request+0x390/0x434
[c00000006d58f950] [c000000000357b14] .submit_bio+0x114/0x140
[c00000006d58fa20] [c000000000361778] .blkdev_issue_discard+0x1ac/0x250
[c00000006d58fb10] [c000000000361f68] .blkdev_ioctl+0x358/0x7fc
[c00000006d58fbd0] [c0000000001c1c1c] .block_ioctl+0x6c/0x90
[c00000006d58fc70] [c0000000001984c4] .do_vfs_ioctl+0x660/0x6d4
[c00000006d58fd70] [c0000000001985a0] .SyS_ioctl+0x68/0xb0

Since disksize no longer starts as 0 it looks like we can call
zram_make_request before the device has been initialised. The patch below
fixes the immediate problem but this would go away if we move the
initialisation function elsewhere (as suggested in another thread).

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/drivers/staging/zram/zram_drv.c
===================================================================
--- powerpc.git.orig/drivers/staging/zram/zram_drv.c	2010-09-01 12:35:14.286515175 +1000
+++ powerpc.git/drivers/staging/zram/zram_drv.c	2010-09-01 12:35:24.167930504 +1000
@@ -441,6 +441,12 @@ static int zram_make_request(struct requ
 	int ret = 0;
 	struct zram *zram = queue->queuedata;
 
+	if (unlikely(!zram->init_done)) {
+		set_bit(BIO_UPTODATE, &bio->bi_flags);
+		bio_endio(bio, 0);
+		return 0;
+	}
+
 	if (unlikely(!valid_io_request(zram, bio))) {
 		zram_inc_stat(zram, ZRAM_STAT_INVALID_IO);
 		bio_io_error(bio);
--

From: Andrew Morton
Date: Friday, September 17, 2010 - 1:59 pm

On Wed, 1 Sep 2010 13:51:35 +1000

So... what happened with this and your other bugfix in
zram_reset_device()?
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Each zram device maintains an array (table) that maps
index within the device to the location of corresponding
compressed chunk. Currently we store 'struct page' pointer,
offset with page and various flags separately which takes
12 bytes per table entry. Now all these are encoded in a
single 'phys_add_t' value which results in savings of 4 bytes
per entry (except on PAE systems).

Unfortunately, cleanups related to some variable renames
were mixed in this patch. So, please bear some additional
noise.

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram_drv.c |  256 ++++++++++++++++++++------------------
 drivers/staging/zram/zram_drv.h |   24 +---
 2 files changed, 140 insertions(+), 140 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index c16e09a..efe9c93 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -42,6 +42,13 @@ struct zram *devices;
 /* Module params (documentation at end) */
 unsigned int num_devices;
 
+/*
+ * We do not allocate any memory for zero-filled pages.
+ * Rather, we simply mark them in corresponding table
+ * entry by setting this bit.
+ */
+#define ZRAM_ZERO_PAGE_MARK_BIT		(1 << 0)
+
 static void zram_add_stat(struct zram *zram,
 			enum zram_stats_index idx, s64 val)
 {
@@ -65,37 +72,62 @@ static void zram_dec_stat(struct zram *zram, enum zram_stats_index idx)
 	zram_add_stat(zram, idx, -1);
 }
 
-static int zram_test_flag(struct zram *zram, u32 index,
-			enum zram_pageflags flag)
+static int page_zero_filled(void *ptr)
 {
-	return zram->table[index].flags & BIT(flag);
+	unsigned int pos;
+	unsigned long *page;
+
+	page = (unsigned long *)ptr;
+
+	for (pos = 0; pos != PAGE_SIZE / sizeof(*page); pos++) {
+		if (page[pos])
+			return 0;
+	}
+
+	return 1;
 }
 
-static void zram_set_flag(struct zram *zram, u32 index,
-			enum zram_pageflags flag)
+static int zram_is_zero_page(struct zram *zram, u32 index)
 ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 11:59 am

The noise makes this patch pretty difficult to review properly. Care
to spilt the patch into two pieces?
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 9:55 pm

Ok, I will split them as separate patches.

Thanks for all the reviews and Acks.
Nitin

--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

The 'discard' bio discard request provides information to
zram disks regarding blocks which are no longer in use by
filesystem. This allows freeing memory allocated for such
blocks.

When zram devices are used as swap disks, we already have
a callback (block_device_operations->swap_slot_free_notify).
So, the discard support is useful only when used as generic
(non-swap) disk.

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram_drv.c   |   25 +++++++++++++++++++++++++
 drivers/staging/zram/zram_sysfs.c |   11 +++++++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index efe9c93..0f9785f 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -420,6 +420,20 @@ out:
 	return 0;
 }
 
+static void zram_discard(struct zram *zram, struct bio *bio)
+{
+	size_t bytes = bio->bi_size;
+	sector_t sector = bio->bi_sector;
+
+	while (bytes >= PAGE_SIZE) {
+		zram_free_page(zram, sector >> SECTORS_PER_PAGE_SHIFT);
+		sector += PAGE_SIZE >> SECTOR_SHIFT;
+		bytes -= PAGE_SIZE;
+	}
+
+	bio_endio(bio, 0);
+}
+
 /*
  * Check if request is within bounds and page aligned.
  */
@@ -451,6 +465,12 @@ static int zram_make_request(struct request_queue *queue, struct bio *bio)
 		return 0;
 	}
 
+	if (unlikely(bio_rw_flagged(bio, BIO_RW_DISCARD))) {
+		zram_inc_stat(zram, ZRAM_STAT_DISCARD);
+		zram_discard(zram, bio);
+		return 0;
+	}
+
 	switch (bio_data_dir(bio)) {
 	case READ:
 		ret = zram_read(zram, bio);
@@ -606,6 +626,11 @@ static int create_device(struct zram *zram, int device_id)
 	blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
 	blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
 
+	zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
+	zram->disk->queue->limits.max_discard_sectors = UINT_MAX;
+	zram->disk->queue->limits.discard_zeroes_data = 1;
+	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, zram->queue);
+
 ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 12:03 pm

From: Jens Axboe
Date: Monday, August 9, 2010 - 7:23 pm

So freeing the page here will guarantee zeroed return on read?
And since you set PAGE_SIZE as the discard granularity, the above loop
could be coded more readable with the knowledge that ->bi_size is always
a multiple of the page size.

-- 
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 9:54 pm

For reads on freed/unwritten sectors, it simply returns success and
does not touch the bio page. Is it better to zero the page in such

Ok, I will cleanup it up.

Thanks for comments.
Nitin
--

From: Jens Axboe
Date: Tuesday, August 10, 2010 - 8:54 am

Well, you told the kernel that you return zeroes on discarded ranges:

        zram->disk->queue->limits.discard_zeroes_data = 1;

So yes, if you intend to keep that, then you need to zero the
incoming pages that have been explicitly trimmed by a discard.

-- 
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Compression takes much more time than decompression. So, its quite
wasteful in terms of both CPU cycles and memory usage to have a very
low compressed page size threshold and thereby storing such not-so-well
compressible pages as-is (uncompressed). So, increasing it from
PAGE_SIZE/2 to PAGE_SIZE/8*7. A low threshold was useful when we had
"backing swap" support where we could forward such pages to the backing
device (applicable only when zram was used as swap disk).

It is not yet configurable through sysfs but may be exported in future,
along with threshold for average compression ratio.

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram_drv.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.h b/drivers/staging/zram/zram_drv.h
index 65e512d..bcc51ea 100644
--- a/drivers/staging/zram/zram_drv.h
+++ b/drivers/staging/zram/zram_drv.h
@@ -47,7 +47,7 @@ static const unsigned default_disksize_perc_ram = 25;
  * Pages that compress to size greater than this are stored
  * uncompressed in memory.
  */
-static const unsigned max_zpage_size = PAGE_SIZE / 4 * 3;
+static const unsigned max_zpage_size = PAGE_SIZE / 8 * 7;
 
 /*
  * NOTE: max_zpage_size must be less than or equal to:
-- 
1.7.2.1

--

From: Pekka Enberg
Date: Monday, August 9, 2010 - 11:32 am

Hi Nitin,


The description makes sense but lacks any real data. What kind of
workloads did you test this with? Where does it help most? How much?

                        Pekka
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

- xvmalloc: Remove unnecessary stat_{inc,dec} and increment
   pages_stored directly
 - xvmalloc: Initialize pointers with NULL instead of 0
 - zram: Remove verbose message when use sets insane disksize
 - zram: Mark some messages as pr_debug
 - zram: Refine some comments

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/xvmalloc.c |   22 ++++-------------
 drivers/staging/zram/zram_drv.c |   49 +++++++++++++-------------------------
 drivers/staging/zram/zram_drv.h |   26 ++++----------------
 3 files changed, 28 insertions(+), 69 deletions(-)

diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
index 3fdbb8a..6268f65 100644
--- a/drivers/staging/zram/xvmalloc.c
+++ b/drivers/staging/zram/xvmalloc.c
@@ -20,16 +20,6 @@
 #include "xvmalloc.h"
 #include "xvmalloc_int.h"
 
-static void stat_inc(u64 *value)
-{
-	*value = *value + 1;
-}
-
-static void stat_dec(u64 *value)
-{
-	*value = *value - 1;
-}
-
 static int test_flag(struct block_header *block, enum blockflags flag)
 {
 	return block->prev & BIT(flag);
@@ -187,7 +177,7 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
 	slindex = get_index_for_insert(block->size);
 	flindex = slindex / BITS_PER_LONG;
 
-	block->link.prev_page = 0;
+	block->link.prev_page = NULL;
 	block->link.prev_offset = 0;
 	block->link.next_page = pool->freelist[slindex].page;
 	block->link.next_offset = pool->freelist[slindex].offset;
@@ -217,7 +207,7 @@ static void remove_block_head(struct xv_pool *pool,
 
 	pool->freelist[slindex].page = block->link.next_page;
 	pool->freelist[slindex].offset = block->link.next_offset;
-	block->link.prev_page = 0;
+	block->link.prev_page = NULL;
 	block->link.prev_offset = 0;
 
 	if (!pool->freelist[slindex].page) {
@@ -232,7 +222,7 @@ static void remove_block_head(struct xv_pool *pool,
 		 */
 		tmpblock = get_ptr_atomic(pool->freelist[slindex].page,
 				pool->freelist[slindex].offset, ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 12:02 pm

Acked-by: Pekka Enberg <penberg@kernel.org>
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 Documentation/ABI/testing/sysfs-block-zram |   99 ++++++++++++++++++++++++++++
 1 files changed, 99 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-block-zram

diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram
new file mode 100644
index 0000000..c8b3b48
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-block-zram
@@ -0,0 +1,99 @@
+What:		/sys/block/zram<id>/disksize
+Date:		August 2010
+Contact:	Nitin Gupta <ngupta@vflare.org>
+Description:
+		The disksize file is read-write and specifies the disk size
+		which represents the limit on the *uncompressed* worth of data
+		that can be stored in this disk.
+
+What:		/sys/block/zram<id>/initstate
+Date:		August 2010
+Contact:	Nitin Gupta <ngupta@vflare.org>
+Description:
+		The disksize file is read-only and shows the initialization
+		state of the device.
+
+What:		/sys/block/zram<id>/reset
+Date:		August 2010
+Contact:	Nitin Gupta <ngupta@vflare.org>
+Description:
+		The disksize file is write-only and allows resetting the
+		device. The reset operation frees all the memory assocaited
+		with this device.
+
+What:		/sys/block/zram<id>/num_reads
+Date:		August 2010
+Contact:	Nitin Gupta <ngupta@vflare.org>
+Description:
+		The num_reads file is read-only and specifies the number of
+		reads (failed or successful) done on this device.
+
+What:		/sys/block/zram<id>/num_writes
+Date:		August 2010
+Contact:	Nitin Gupta <ngupta@vflare.org>
+Description:
+		The num_writes file is read-only and specifies the number of
+		writes (failed or successful) done on this device.
+
+What:		/sys/block/zram<id>/invalid_io
+Date:		August 2010
+Contact:	Nitin Gupta <ngupta@vflare.org>
+Description:
+		The invalid_io file is read-only and specifies the number of
+		non-page-size-aligned I/O requests issued to this device.
+
+What:		/sys/block/zram<id>/notify_free
+Date:		August ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 12:02 pm

Acked-by: Pekka Enberg <penberg@kernel.org>
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

Update zram documentation to reflect transition form
ioctl to sysfs interface.

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram.txt |   58 +++++++++++++++++++++++++---------------
 1 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/zram/zram.txt b/drivers/staging/zram/zram.txt
index 520edc1..5f75d29 100644
--- a/drivers/staging/zram/zram.txt
+++ b/drivers/staging/zram/zram.txt
@@ -5,33 +5,35 @@ Project home: http://compcache.googlecode.com/
 
 * Introduction
 
-The zram module creates RAM based block devices: /dev/ramX (X = 0, 1, ...).
-Pages written to these disks are compressed and stored in memory itself.
-These disks allow very fast I/O and compression provides good amounts of
-memory savings.
+The zram module creates RAM based block devices named /dev/zram<id>
+(<id> = 0, 1, ...). Pages written to these disks are compressed and stored
+in memory itself. These disks allow very fast I/O and compression provides
+good amounts of memory savings. Some of the usecases include /tmp storage,
+use as swap disks, various caches under /var and maybe many more :)
 
-See project home for use cases, performance numbers and a lot more.
-
-Individual zram devices are configured and initialized using zramconfig
-userspace utility as shown in examples below. See zramconfig man page for
-more details.
+Statistics for individual zram devices are exported through sysfs nodes at
+/sys/block/zram<id>/
 
 * Usage
 
 Following shows a typical sequence of steps for using zram.
 
-1) Load Modules:
+1) Load Module:
 	modprobe zram num_devices=4
-	This creates 4 (uninitialized) devices: /dev/zram{0,1,2,3}
+	This creates 4 devices: /dev/zram{0,1,2,3}
 	(num_devices parameter is optional. Default: 1)
 
-2) Initialize:
-	Use zramconfig utility to configure and initialize individual
-	zram devices. For example:
-	zramconfig /dev/zram0 --init # uses default value of disksize_kb
-	zramconfig /dev/zram1 --disksize_kb=102400 # 100MB ...
From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:26 am

This also removes the (mutex) lock which was used to
protect these buffers.

Tests with fio on dual core showed improvement of about
20% for write speed.

fio job description:

[global]
bs=4k
ioengine=libaio
iodepth=8
size=1g
direct=1
runtime=60
directory=/mnt/zdisk
verify=meta
verify=0x1234567890abcdef
numjobs=2

[seq-write]
rw=write

Speedup was negligible with iodepth of 4 or less. Maybe
gains will be more visible with higher number of cores.

Results with above job file:
Before:
  WRITE: io=2,048MB, aggrb=70,725KB/s
  WRITE: io=2,048MB, aggrb=71,938KB/s
  WRITE: io=2,048MB, aggrb=70,461KB/s

After:
  WRITE: io=2,048MB, aggrb=86,691KB/s
  WRITE: io=2,048MB, aggrb=87,025KB/s
  WRITE: io=2,048MB, aggrb=88,700KB/s

Speedup: ~20%

Read performance remained almost the same since the read
path was lockless earlier also (LZO decompressor does not
require any additional buffer).

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
---
 drivers/staging/zram/zram_drv.c |  136 +++++++++++++++++++++++++--------------
 drivers/staging/zram/zram_drv.h |    4 -
 2 files changed, 88 insertions(+), 52 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index f111b86..c16e09a 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -17,13 +17,14 @@
 
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/bio.h>
 #include <linux/bitops.h>
 #include <linux/blkdev.h>
 #include <linux/buffer_head.h>
+#include <linux/cpu.h>
 #include <linux/device.h>
 #include <linux/genhd.h>
 #include <linux/highmem.h>
+#include <linux/notifier.h>
 #include <linux/slab.h>
 #include <linux/lzo.h>
 #include <linux/string.h>
@@ -31,6 +32,9 @@
 
 #include "zram_drv.h"
 
+static DEFINE_PER_CPU(unsigned char *, compress_buffer);
+static DEFINE_PER_CPU(unsigned char *, compress_workmem);
+
 /* Globals */
 static int zram_major;
 struct zram *devices;
@@ -290,10 +294,10 @@ static int ...
From: Pekka Enberg
Date: Monday, August 9, 2010 - 11:57 am

The per-CPU buffer thing with this preempt_disable() trickery looks
overkill to me. Most block device drivers seem to use mempool_alloc()
for this sort of thing. Is there some reason you can't use that here?
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 9:47 pm

Other block drivers are allocating relatively small structs using
mempool_alloc(). However, in case of zram, these buffers are quite
large (compress_workmem is 64K!). So, allocating them on every write
would probably be much slower than using a pre-allocated per-cpu buffer.

Thanks,
Nitin
--

From: Pekka Enberg
Date: Monday, August 9, 2010 - 10:05 pm

Hi Nitin,

The mempool API is precisely for that - using pre-allocated buffers 
instead of allocating every time. The preempt_disable() games make the 
code complex and have the downside of higher scheduling latencies so why 
not give mempools a try?

             Pekka
--

From: Nitin Gupta
Date: Monday, August 9, 2010 - 10:32 pm

Hi,


mempool_alloc() first calls alloc_fn with ~(__GFP_WAIT | __GFP_IO)
and *then* falls down to pre-allocated buffers. So, it will always
be slower than directly using pre-allocated buffers as is done
currently.

One trick we can use is to have alloc_fn such that it always returns
failure with ~__GFP_WAIT and do actual allocation otherwise. But still
it seems like unnecessary cost.

Thanks,
Nitin
 


--

From: Pekka Enberg
Date: Tuesday, August 10, 2010 - 12:36 am

Hi Nitin,

We can always extend the mempool API with mempool_prealloc() function if 
that turns out to be a problem. The per-CPU buffer with 
preempt_disable() trickery isn't really the proper thing to do here. It 
doesn't make much sense to disable preemption for compression that's 
purely CPU bound.

                     Pekka
--

From: Greg KH
Date: Tuesday, August 31, 2010 - 3:37 pm

I applied the first 2 and last 2 of these patches and they will show up
in the linux-next tree tomorrow.  I stopped there due to Andrew's
complaints about the per-cpu variable stuff.  Please resolve this and
redo the patch set and I will be glad to apply them.

thanks,

greg k-h
--

From: Anton Blanchard
Date: Tuesday, August 31, 2010 - 8:32 pm

Hi Nitin,

I gave zram a try on a ppc64 box with a 64kB PAGE_SIZE. It looks like the
xvmalloc allocator fails when we add in a large enough block (in this case
65532 bytes).

flindex ends up as 127 which is larger than BITS_PER_LONG. We continually call
grow_block inside find_block and fail:

zram: Error allocating memory for compressed page: 0, size=467

Anton
--

From: David Rientjes
Date: Thursday, September 9, 2010 - 12:07 pm

I'm curious why there are no killable processes on the system; it seems 
like the triggering task here, cat, would at least be killable itself.  
Could you post the tasklist dump that preceeds this (or, if you've 
disabled it try echo 1 > /proc/sys/vm/oom_dump_tasks first)?

It's possible that if you have enough swap that none of the eligible tasks 
actually have non-zero badness scores either because they are being run as 
root or because the amount of RAM or swap is sufficiently high such that 
(task's rss + swap) / (total rss + swap) is never non-zero.  And, since 
root tasks have a 3% bonus, it's possible these are all root tasks and no 
single task uses more than 3% of rss and swap.

While this may not be the issue in your case, and can be confirmed with 
the tasklist dump if you can get it, we need to protect against these 
situations where eligible tasks may not be killed.

Andrew, I'd like to propose this patch for 2.6.36-rc-series since the 
worst case is that the machine will panic if there are an exceptionally 
large number of tasks, each with little memory usage at the time of oom.



oom: always return a badness score of non-zero for eligible tasks

A task's badness score is roughly a proportion of its rss and swap
compared to the system's capacity.  The scale ranges from 0 to 1000 with
the highest score chosen for kill.  Thus, this scale operates on a
resolution of 0.1% of RAM + swap.  Admin tasks are also given a 3% bonus,
so the badness score of an admin task using 3% of memory, for example,
would still be 0.

It's possible that an exceptionally large number of tasks will combine to 
exhaust all resources but never have a single task that uses more than
0.1% of RAM and swap (or 3.0% for admin tasks).

This patch ensures that the badness score of any eligible task is never 0
so the machine doesn't unnecessarily panic because it cannot find a task
to kill.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/oom_kill.c |    9 +++++++--
 1 files ...
From: Dave Hansen
Date: Thursday, September 9, 2010 - 12:48 pm

That was one odd part here.  I didn't disable the tasklist dump, and

It's a 64GB machine with ~30GB of swap and very little RSS.  Your
hypothesis seems correct.  Just grepping through /proc/[0-9]*/oom_score
shows nothing other than 0's.

Trying this again, I just hung the system instead of OOM'ing straight
away like last time.

Your patch makes a lot of sense to me in any case where there aren't
large-RSS tasks around using memory.  That definitely applies here
because of the amount in the compcache store and might also apply with
ramfs and hugetlbfs.

-- Dave

--

From: David Rientjes
Date: Thursday, September 9, 2010 - 2:00 pm

Hmm, could you very that /proc/sys/vm/oom_dump_tasks is set?  Perhaps it's 
getting cleared by something else before you use zram.  The sysctl should 

Presumably you're not using a large amount of swap, either, or that would 

with the patch, you should still be calling the oom killer and instead of 
panicking it will go on a serial killing spree because everything that it 
wasn't judging as a candidate before (oom_score of 0) now is if it's truly 
killable (oom_score of 1).  The patch is definitely needed for correctness 
since an oom_score of 0 implies the task is unkillable.

We're apparently hanging in the exit path for the oom killed task or 
something is constantly respawning threads that repeatedly get killed.  It 
appears as though nothing is actually a worthwhile target for the oom 

Agreed, we'll need to address hugepages specifically because they don't 
get accounted for in rss but do free memory when the task is killed.
--

From: Dave Hansen
Date: Thursday, September 9, 2010 - 2:10 pm

Nope.  There's very little happening on the system except for me toying

I'll give the patch a shot and see if I get any better behavior.  But, I
really do think the root cause here is compcache exhausting the system
when you feed incompressible pages into it.  We can kill all the tasks
we want, but I think it'll continue to gobble memory up as fast as we
free it.

We either need to put some upper bounds on the amount of memory that
compcache uses for its backing store, or reintroduce the code that lets

They do sometimes.  But, if they're preallocated, or stuck in a linked
file on the filesystem, killing the task doesn't do any good.

-- Dave

--

From: David Rientjes
Date: Thursday, September 9, 2010 - 2:40 pm

Ok, I assume you aren't getting the typical "cat invoked oom-killer..." 
message, the memory state dump, etc., either, so there's something strange 
with your log level such that nothing under KERN_WARNING is getting 
through or you can't access the actual kernel log due to the panic.  I can 

That certainly seems to be the case and is the true topic of this thread, 
so I don't want to hijack it any further since it's outside the scope of 
the oom killer :)

But I'm still curious as to why the machine is hanging and not eventually 
panicking when we run out of killable tasks.  It seems as though something 
is hanging in the exit path, meaning memory reserves aren't even safe from 
compcache, or there's something wrong in the oom killer retry logic, or 
you're simply forking more tasks, perhaps as a response to threads getting 
killed by the kernel, than we can kill.

We'd certainly prefer to panic the machine if no work is getting done than 
simply killing everything that gets forked.  The problem before was that 
we panicked too early before we killed anything and now we don't know when 

Indeed you're right, I meant s/hugepages/transparent hugepages/, sorry.  
It appears as though they get included in the rss of the allocating task, 
though, via MM_ANONPAGES, so this is already represented in the task's 
badness score.

Thanks for trying the patch out, Dave, I hope we can add your Tested-by 
line and it can get pushed to the rc-series.
--

From: Nitin Gupta
Date: Sunday, October 3, 2010 - 11:41 am

Hi Dave,

Sorry for late reply. Since last month I couldn't get any chance to
work on this project.



Ability to write out zram (compressed) memory to a backing disk seems
really useful. However considering lkml reviews, I had to drop this
feature. Anyways, I guess I will try to push this feature again.

Also, please do not use linux-next/mainline version of compcache. Instead
just use version in the project repository here:
hg clone https://compcache.googlecode.com/hg/ compcache 

This is updated much more frequently and has many more bug fixes over
the mainline. It will also be easier to fix bugs/add features much more
quickly in this repo rather than sending them to lkml which can take
long time.

Thanks,
Nitin
--

From: Dave Hansen
Date: Sunday, October 3, 2010 - 12:27 pm

I'd argue that zram is pretty useless without some ability to write to a
backing store, unless you *really* know what is going to be stored in it
and you trust the user.  Otherwise, it's just too easy to OOM the
system.

I've been investigating backing the xvmalloc space with a tmpfs file.
Instead of keeping page/offset pairs, you just keep a linear address
inside the tmpfile file.  There's an extra step needed to look up and
lock the page cache page into place each time you go into the xvmalloc
store, but it does seem to basically work.  The patches are really rough
and not quite functional, but I'm happy to share if you want to see them

That looks like just a clone of the code needed to build the module.  

Kernel developers are pretty used to _some_ kernel tree being the
authoritative source.  Also, having it in a kernel tree makes it
possible to get testing in places like linux-next, and it makes it
easier for people to make patches or kernel trees on top of your work. 

There's not really a point to the code being in -staging if it isn't
somewhat up-to-date or people can't generate patches to it.  It sounds
to me like we need to take it out of -staging.

-- Dave

--

From: Nitin Gupta
Date: Sunday, October 3, 2010 - 12:40 pm

I will try sending patches to sync mainline and hg code (along with
some changes in pipeline), or maybe just take it out of -staging and
send fresh patch series.

Thanks,
Nitin
--

From: Ed Tomlinson
Date: Monday, October 4, 2010 - 4:08 am

Or move it to a git tree.  Then generating patches becomes tivial for all of
us and keeping staging upto date becomes easier for you.

--

From: Greg KH
Date: Tuesday, October 5, 2010 - 4:43 pm

What?  No, the reason we put this into the kernel was so that _everyone_
could work on it, including the original developers.  Going off and
doing development somewhere else just isn't ok.  Should I just delete
this driver from the staging tree as you don't seem to want to work with

Yes, developing in your own sandbox can always be faster, but there is
no feedback loop.

sad,

greg k-h
--

From: Nitin Gupta
Date: Tuesday, October 5, 2010 - 7:29 pm

Getting it out of -staging wasn't my intent. Community is the reason

I was finding it real hard to find time to properly discuss each patch
over LKML, so I thought of shifting focus to local project repository
and then later go through proper reviews.

Thanks,
Nitin
--

From: Greg KH
Date: Tuesday, October 5, 2010 - 7:36 pm

So, should I delete the version in staging, or are you going to send
patches to sync it up with your development version?

thanks,

greg k-h
--

From: Nitin Gupta
Date: Tuesday, October 5, 2010 - 9:30 pm

Deleting it from staging would not help much. Much more helpful would
be to sync at least the mainline and linux-next version of the driver
so it's easier to develop against these kernel trees.  Initially, I
thought -staging means that any reviewed change can quickly make it
to *both* linux-next and more importantly -staging in mainline. Working/
Testing against mainline is much smoother than against linux-next.

Thanks,
Nitin


--

From: Pekka Enberg
Date: Wednesday, October 6, 2010 - 12:38 am

Hi,


We can't push the patches immediately to mainline because we need to
respect the merge window. You shouldn't need to rely on linux-next for
testing, though, but work directly against Greg's staging tree. Greg,
where's the official tree at, btw? The tree at

  http://kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/

seems empty.

                        Pekka
--

From: Greg KH
Date: Wednesday, October 6, 2010 - 7:03 am

Oops, I need to update the MAINTAINERS file, the proper place for the
staging tree is now in git, at

git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-next-2.6.git

which feeds directly into the linux-next tree.

hope this helps,

greg k-h
--

From: Pekka Enberg
Date: Wednesday, October 6, 2010 - 7:16 am

Excellent! Nitin, can you develop and test zram against this tree?

             Pekka
--

From: Nitin Gupta
Date: Wednesday, October 6, 2010 - 7:53 am

This seems like the ideal tree to develop against.

Thanks!
Nitin
--

From: Greg KH
Date: Wednesday, October 6, 2010 - 7:02 am

It makes it quickly to linux-next, right?

Nothing goes "quickly" to mainline, other than the normal development

Sure it's "easier", but that's not how kernel development is done,
sorry.  See Documentation/development-process/ for details.

thanks,

greg k-h
--

Previous thread: Re: 2.6.35 - INFO: kernel/signal.c:660 invoked rcu_dereference_check() without protection! by Paul E. McKenney on Monday, August 9, 2010 - 10:13 am. (2 messages)

Next thread: Re: 2.6.35 - INFO: kernel/exit.c:1387 invoked rcu_dereference_check() without protection! by Paul E. McKenney on Monday, August 9, 2010 - 10:27 am. (3 messages)