Re: [linux-pm] [PATCH 0/8] Suspend block api (version 8)

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

This patch series adds a suspend-block api that provides the same
functionality as the android wakelock api. This version adds a
delay before suspending again if no suspend blockers were used
during the last suspend attempt.

--
Arve Hjønnevåg <arve@android.com>


--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

Power management features present in the current mainline kernel are
insufficient to get maximum possible energy savings on some platforms,
such as Android.  The problem is that to save maximum amount of energy
all system hardware components need to be in the lowest-power states
available for as long as reasonably possible, but at the same time the
system must always respond to certain events, regardless of the
current state of the hardware.

The first goal can be achieved either by using device runtime PM and
cpuidle to put all hardware into low-power states, transparently from
the user space point of view, or by suspending the whole system.
However, system suspend, in its current form, does not guarantee that
the events of interest will always be responded to, since wakeup
events (events that wake the CPU from idle and the system from
suspend) that occur right after initiating suspend will not be
processed until another possibly unrelated event wakes the system up
again.

On hardware where idle can enter the same power state as suspend, idle
combined with runtime PM can be used, but periodic wakeups increase
the average power consumption. Suspending the system also reduces the
harm caused by apps that never go idle. There also are systems where
some devices cannot be put into low-power states without suspending
the entire system (or the low-power states available to them without
suspending the entire system are substantially shallower than the
low-power states they are put into when the entire system is
suspended), so the system has to be suspended as a whole to achieve
the maximum energy savings.

To allow Android and similar platforms to save more energy than they
currently can save using the mainline kernel, introduce a mechanism by
which the system is automatically suspended (i.e. put into a
system-wide sleep state) whenever it's not doing work that's
immediately useful to the user, called opportunistic suspend.

For this purpose introduce the suspend blockers ...
--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

If a suspend_blocker is active, suspend will fail anyway. Since
try_to_freeze_tasks can take up to 20 seconds to complete or fail, aborting
as soon as someone blocks suspend (e.g. from an interrupt handler) improves
the worst case wakeup latency.

On an older kernel where task freezing could fail for processes attached
to a debugger, this fixed a problem where the device sometimes hung for
20 seconds before the screen turned on.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
---
 kernel/power/process.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/power/process.c b/kernel/power/process.c
index 71ae290..27d26d3 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -38,6 +38,7 @@ static int try_to_freeze_tasks(bool sig_only)
 	struct timeval start, end;
 	u64 elapsed_csecs64;
 	unsigned int elapsed_csecs;
+	bool wakeup = false;
 
 	do_gettimeofday(&start);
 
@@ -63,6 +64,10 @@ static int try_to_freeze_tasks(bool sig_only)
 				todo++;
 		} while_each_thread(g, p);
 		read_unlock(&tasklist_lock);
+		if (todo && suspend_is_blocked()) {
+			wakeup = true;
+			break;
+		}
 		if (!todo || time_after(jiffies, end_time))
 			break;
 
@@ -85,13 +90,15 @@ static int try_to_freeze_tasks(bool sig_only)
 		 * but it cleans up leftover PF_FREEZE requests.
 		 */
 		printk("\n");
-		printk(KERN_ERR "Freezing of tasks failed after %d.%02d seconds "
+		printk(KERN_ERR "Freezing of tasks %s after %d.%02d seconds "
 				"(%d tasks refusing to freeze):\n",
+				wakeup ? "aborted" : "failed",
 				elapsed_csecs / 100, elapsed_csecs % 100, todo);
 		read_lock(&tasklist_lock);
 		do_each_thread(g, p) {
 			task_lock(p);
-			if (freezing(p) && !freezer_should_skip(p))
+			if (freezing(p) && !freezer_should_skip(p)
+					&& elapsed_csecs > 100)
 				sched_show_task(p);
 			cancel_freezing(p);
 			task_unlock(p);
-- 
1.6.5.1

--
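
The early-abort check in the patch above can be pictured with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: toy_freeze_tasks(), todo_per_poll and block_at_poll are hypothetical
names, and a bounded number of polls stands in for the 20-second timeout.

```c
#include <stddef.h>

/* Hypothetical outcomes of the toy freeze loop (not kernel identifiers). */
enum toy_freeze_result { TOY_FREEZE_OK, TOY_FREEZE_TIMEOUT, TOY_FREEZE_ABORTED };

/*
 * todo_per_poll[i] is the number of tasks still unfrozen at poll i.
 * block_at_poll is the first poll at which some suspend blocker is active,
 * or -1 if suspend is never blocked. max_polls models the timeout.
 */
static enum toy_freeze_result
toy_freeze_tasks(const int *todo_per_poll, size_t max_polls, int block_at_poll)
{
	for (size_t poll = 0; poll < max_polls; poll++) {
		int todo = todo_per_poll[poll];
		int blocked = block_at_poll >= 0 && poll >= (size_t)block_at_poll;

		/* Same order as the patch: give up early if suspend is blocked. */
		if (todo && blocked)
			return TOY_FREEZE_ABORTED;
		if (!todo)
			return TOY_FREEZE_OK;
	}
	/* Tasks were still unfrozen when the poll budget ran out. */
	return TOY_FREEZE_TIMEOUT;
}
```

In this model a blocker that becomes active while tasks are still unfrozen
ends the loop on the next poll instead of waiting out the full budget, which
is the latency improvement the changelog describes.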

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

Allow work to be queued that will block suspend while it is pending
or executing. To get the same functionality in the calling code often
requires a separate suspend_blocker for pending and executing work, or
additional state and locking. This implementation does add additional
state and locking, but this can be removed later if we add support for
suspend blocking work to the core workqueue code.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
---
 include/linux/suspend_blocker.h      |   67 +++++++++++++++++++++
 kernel/power/opportunistic_suspend.c |  109 ++++++++++++++++++++++++++++++++++
 2 files changed, 176 insertions(+), 0 deletions(-)

diff --git a/include/linux/suspend_blocker.h b/include/linux/suspend_blocker.h
index 256af15..bb90b45 100755
--- a/include/linux/suspend_blocker.h
+++ b/include/linux/suspend_blocker.h
@@ -18,6 +18,7 @@
 
 #include <linux/list.h>
 #include <linux/ktime.h>
+#include <linux/workqueue.h>
 
 /**
  * struct suspend_blocker_stats - statistics for a suspend blocker
@@ -62,6 +63,38 @@ struct suspend_blocker {
 #endif
 };
 
+/**
+ * struct suspend_blocking_work - the basic suspend_blocking_work structure
+ * @work:		Standard work struct.
+ * @suspend_blocker:	Suspend blocker.
+ * @func:		Callback.
+ * @lock:		Spinlock protecting pending and running state.
+ * @active:		Number of cpu workqueues where work is pending or
+ *			callback is running.
+ *
+ * When suspend blocking work is pending or its callback is running it prevents
+ * the system from entering opportunistic suspend.
+ *
+ * The suspend_blocking_work structure must be initialized by
+ * suspend_blocking_work_init().
+ */
+
+struct suspend_blocking_work {
+	struct work_struct work;
+#ifdef CONFIG_OPPORTUNISTIC_SUSPEND
+	struct suspend_blocker suspend_blocker;
+	work_func_t func;
+	spinlock_t lock;
+	int active;
+#endif
+};
+
+static inline struct suspend_blocking_work ...
--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

When connecting usb or the charger the device would often go back to sleep
before the charge led and screen turned on.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 drivers/power/power_supply_core.c |    9 ++++++---
 include/linux/power_supply.h      |    3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/power/power_supply_core.c b/drivers/power/power_supply_core.c
index cce75b4..577a131 100644
--- a/drivers/power/power_supply_core.c
+++ b/drivers/power/power_supply_core.c
@@ -39,7 +39,7 @@ static int __power_supply_changed_work(struct device *dev, void *data)
 static void power_supply_changed_work(struct work_struct *work)
 {
 	struct power_supply *psy = container_of(work, struct power_supply,
-						changed_work);
+						changed_work.work);
 
 	dev_dbg(psy->dev, "%s\n", __func__);
 
@@ -55,7 +55,7 @@ void power_supply_changed(struct power_supply *psy)
 {
 	dev_dbg(psy->dev, "%s\n", __func__);
 
-	schedule_work(&psy->changed_work);
+	schedule_suspend_blocking_work(&psy->changed_work);
 }
 EXPORT_SYMBOL_GPL(power_supply_changed);
 
@@ -155,7 +155,8 @@ int power_supply_register(struct device *parent, struct power_supply *psy)
 		goto dev_create_failed;
 	}
 
-	INIT_WORK(&psy->changed_work, power_supply_changed_work);
+	suspend_blocking_work_init(&psy->changed_work,
+				   power_supply_changed_work, "power-supply");
 
 	rc = power_supply_create_attrs(psy);
 	if (rc)
@@ -172,6 +173,7 @@ int power_supply_register(struct device *parent, struct power_supply *psy)
 create_triggers_failed:
 	power_supply_remove_attrs(psy);
 create_attrs_failed:
+	suspend_blocking_work_destroy(&psy->changed_work);
 	device_unregister(psy->dev);
 dev_create_failed:
 success:
@@ -184,6 +186,7 @@ void power_supply_unregister(struct power_supply *psy)
 	flush_scheduled_work();
 	power_supply_remove_triggers(psy);
 	power_supply_remove_attrs(psy);
+	suspend_blocking_work_destroy(&psy->changed_work);
 	device_unregister(psy->dev);
 }
 ...
--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

Add an ioctl, EVIOCSSUSPENDBLOCK, to enable a suspend_blocker that will block
suspend while the event queue is not empty. This allows userspace code to
process input events while the device appears to be asleep.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 drivers/input/evdev.c |   22 ++++++++++++++++++++++
 include/linux/input.h |    3 +++
 2 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 2ee6c7a..bff2247 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -20,6 +20,7 @@
 #include <linux/input.h>
 #include <linux/major.h>
 #include <linux/device.h>
+#include <linux/suspend.h>
 #include "input-compat.h"
 
 struct evdev {
@@ -43,6 +44,8 @@ struct evdev_client {
 	struct fasync_struct *fasync;
 	struct evdev *evdev;
 	struct list_head node;
+	struct suspend_blocker suspend_blocker;
+	bool use_suspend_blocker;
 };
 
 static struct evdev *evdev_table[EVDEV_MINORS];
@@ -55,6 +58,8 @@ static void evdev_pass_event(struct evdev_client *client,
 	 * Interrupts are disabled, just acquire the lock
 	 */
 	spin_lock(&client->buffer_lock);
+	if (client->use_suspend_blocker)
+		suspend_block(&client->suspend_blocker);
 	client->buffer[client->head++] = *event;
 	client->head &= EVDEV_BUFFER_SIZE - 1;
 	spin_unlock(&client->buffer_lock);
@@ -234,6 +239,8 @@ static int evdev_release(struct inode *inode, struct file *file)
 	mutex_unlock(&evdev->mutex);
 
 	evdev_detach_client(evdev, client);
+	if (client->use_suspend_blocker)
+		suspend_blocker_unregister(&client->suspend_blocker);
 	kfree(client);
 
 	evdev_close_device(evdev);
@@ -335,6 +342,8 @@ static int evdev_fetch_next_event(struct evdev_client *client,
 	if (have_event) {
 		*event = client->buffer[client->tail++];
 		client->tail &= EVDEV_BUFFER_SIZE - 1;
+		if (client->use_suspend_blocker && client->head == client->tail)
+			suspend_unblock(&client->suspend_blocker);
 	}
 
 	spin_unlock_irq(&client->buffer_lock);
@@ ...
--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

Report suspend block stats in /sys/kernel/debug/suspend_blockers.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 include/linux/suspend_blocker.h      |   27 +++++-
 kernel/power/Kconfig                 |    8 ++
 kernel/power/opportunistic_suspend.c |  195 +++++++++++++++++++++++++++++++++-
 kernel/power/power.h                 |    5 +
 kernel/power/suspend.c               |    4 +-
 5 files changed, 235 insertions(+), 4 deletions(-)

diff --git a/include/linux/suspend_blocker.h b/include/linux/suspend_blocker.h
index 8788302..256af15 100755
--- a/include/linux/suspend_blocker.h
+++ b/include/linux/suspend_blocker.h
@@ -17,11 +17,35 @@
 #define _LINUX_SUSPEND_BLOCKER_H
 
 #include <linux/list.h>
+#include <linux/ktime.h>
+
+/**
+ * struct suspend_blocker_stats - statistics for a suspend blocker
+ *
+ * @count: Number of times this blocker has been deactivated.
+ * @wakeup_count: Number of times this blocker was the first to block suspend
+ *	after resume.
+ * @total_time: Total time this suspend blocker has prevented suspend.
+ * @prevent_suspend_time: Time this suspend blocker has prevented suspend while
+ *	user-space requested suspend.
+ * @max_time: Max time this suspend blocker has been continuously active.
+ * @last_time: Monotonic clock when the active state last changed.
+ */
+struct suspend_blocker_stats {
+#ifdef CONFIG_SUSPEND_BLOCKER_STATS
+	unsigned int count;
+	unsigned int wakeup_count;
+	ktime_t total_time;
+	ktime_t prevent_suspend_time;
+	ktime_t max_time;
+	ktime_t last_time;
+#endif
+};
 
 /**
  * struct suspend_blocker - the basic suspend_blocker structure
  * @link: List entry for active or inactive list.
- * @flags: Tracks initialized and active state.
+ * @flags: Tracks initialized and active state and statistics.
  * @name: Suspend blocker name used for debugging.
  *
  * When a suspend_blocker is active it prevents the system from entering
@@ -34,6 +58,7 @@ struct suspend_blocker {
 	struct list_head link;
 	int flags;
 ...
--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

Report active and inactive suspend blockers in
/sys/kernel/debug/suspend_blockers.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 kernel/power/opportunistic_suspend.c |   43 +++++++++++++++++++++++++++++++++-
 1 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/kernel/power/opportunistic_suspend.c b/kernel/power/opportunistic_suspend.c
index cc90b60..6b5eb1d 100644
--- a/kernel/power/opportunistic_suspend.c
+++ b/kernel/power/opportunistic_suspend.c
@@ -17,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/rtc.h>
 #include <linux/suspend.h>
+#include <linux/debugfs.h>
 
 #include "power.h"
 
@@ -151,7 +152,8 @@ EXPORT_SYMBOL(suspend_blocker_register);
 /**
  * suspend_blocker_init - Initialize a suspend blocker's name and register it.
  * @blocker: Suspend blocker to initialize.
- * @name:    The name of the suspend blocker to show in debug messages.
+ * @name:    The name of the suspend blocker to show in debug messages and
+ *	     /sys/kernel/debug/suspend_blockers.
  *
  * The suspend blocker struct and name must not be freed before calling
  * suspend_blocker_unregister().
@@ -296,3 +298,42 @@ void __init opportunistic_suspend_init(void)
 	suspend_block(&main_suspend_blocker);
 	suspend_blocker_register(&unknown_wakeup);
 }
+
+static struct dentry *suspend_blocker_stats_dentry;
+
+static int suspend_blocker_stats_show(struct seq_file *m, void *unused)
+{
+	unsigned long irqflags;
+	struct suspend_blocker *blocker;
+
+	seq_puts(m, "name\tactive\n");
+	spin_lock_irqsave(&list_lock, irqflags);
+	list_for_each_entry(blocker, &inactive_blockers, link)
+		seq_printf(m, "\"%s\"\t0\n", blocker->name);
+	list_for_each_entry(blocker, &active_blockers, link)
+		seq_printf(m, "\"%s\"\t1\n", blocker->name);
+	spin_unlock_irqrestore(&list_lock, irqflags);
+	return 0;
+}
+
+static int suspend_blocker_stats_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, suspend_blocker_stats_show, NULL);
+}
+
+static const struct ...
--

From: Arve Hjønnevåg
Date: Friday, May 21, 2010 - 3:46 pm

Add a misc device, "suspend_blocker", that allows user-space processes
to block automatic suspend.

Opening this device creates a suspend blocker that can be used by the
opener to prevent automatic suspend from occurring.  There are ioctls
provided for blocking and unblocking suspend and for giving the
suspend blocker a meaningful name.  Closing the device special file
causes the suspend blocker to be destroyed.

For example, when select or poll indicates that input events are
available, this interface can be used by user space to block suspend
before it reads those events. This allows the input driver to release
its suspend blocker as soon as the event queue is empty. If user space
could not use a suspend blocker here the input driver would need to
delay the release of its suspend blocker until it knows (or assumes)
that user space has finished processing the events.

By careful use of suspend blockers in drivers and user space system
code, one can arrange for the system to stay awake for extremely short
periods of time in reaction to events, rapidly returning to a fully
suspended state.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 Documentation/ioctl/ioctl-number.txt          |    3 +-
 Documentation/power/opportunistic-suspend.txt |   27 +++++
 include/linux/suspend_ioctls.h                |    4 +
 kernel/power/Kconfig                          |    7 ++
 kernel/power/Makefile                         |    1 +
 kernel/power/user_suspend_blocker.c           |  143 +++++++++++++++++++++++++
 6 files changed, 184 insertions(+), 1 deletions(-)
 create mode 100644 kernel/power/user_suspend_blocker.c

diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index dd5806f..e2458f7 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -254,7 +254,8 @@ Code  Seq#(hex)	Include File		Comments
 'q'	80-FF	linux/telephony.h	Internet PhoneJACK, Internet LineJACK
 ...
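
The hand-off described in the changelog above (the driver blocks suspend
while its queue is non-empty, user space blocks suspend before it reads) can
be modelled in plain C. This is a toy sketch only: every name in it is
hypothetical, the blockers are simple flags, and none of the real ioctls
appear because their definitions are not shown in this excerpt.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a suspend blocker: just an "active" flag. */
struct toy_blocker { bool active; };

static struct toy_blocker input_queue_blocker; /* held by the "driver" */
static struct toy_blocker reader_blocker;      /* held by "user space" */
static int queued_events;
static bool blocked_while_processing;          /* recorded for inspection */

static bool toy_suspend_is_blocked(void)
{
	return input_queue_blocker.active || reader_blocker.active;
}

/* Driver side: an event arrives, so block suspend while the queue is non-empty. */
static void driver_queue_event(void)
{
	input_queue_blocker.active = true;
	queued_events++;
}

/* Driver side: hand out one event and unblock as soon as the queue drains. */
static bool driver_fetch_event(void)
{
	if (queued_events == 0)
		return false;
	if (--queued_events == 0)
		input_queue_blocker.active = false;
	return true;
}

/* User side: block first, drain the queue, process, then unblock. */
static int reader_handle_events(void)
{
	int handled = 0;

	reader_blocker.active = true;   /* taken before the first read */
	while (driver_fetch_event())
		handled++;
	/* The driver's blocker is released by now; only ours keeps us awake. */
	blocked_while_processing = toy_suspend_is_blocked();
	reader_blocker.active = false;  /* done processing */
	return handled;
}
```

At no point between the arrival of an event and the end of processing are
both flags clear, so the driver can drop its blocker the moment the queue is
empty without guessing how long user space will need.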
From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 1:43 am

Urgh, please let the open() be BLOCK, the close() be UNBLOCK, and keep
the SET_NAME thing if you really care.

--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:47 am

That would be very inefficient.

-- 
Arve Hjønnevåg
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 3:50 am

How so? Anyway, since you admitted this thing isn't needed at all, I say
we remove it altogether.
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 4:13 pm

I also said it was useful. I don't think we should drop it just
because we can work around its absence.

-- 
Arve Hjønnevåg
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 3:51 am

On Wed, 26 May 2010 03:47:27 -0700

Also I think it is intended to enforce named suspend blockers. (For
debugging/accounting purposes).

Cheers,
Flo
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 4:06 am

I don't think the code as proposed mandates you SET_NAME, and I didn't
propose killing that off, you can still SET_NAME after you open() and
acquire the thing.

Anyway, the whole point is moot since it's simply not needed at all.
--

From: Rafael J. Wysocki
Date: Wednesday, May 26, 2010 - 2:57 pm

SET_NAME wouldn't serve any purpose in that case.

This whole thing is related to the statistics part, which Arve says is
essential to him.  He wants to collect statistics for each suspend blocker
activated and deactivated so that he can tell who's causing problems by
blocking suspend too often.  The name also is a part of this.

In fact, without the statistics part the whole thing might be implemented
as a single reference counter such that suspend would happen when it went down
to zero.

Thanks,
Rafael
--
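
Rafael's remark that, without statistics, the mechanism reduces to a single
reference counter can be sketched as follows. This is an illustration under
assumptions, not code from the series: the function names are hypothetical,
and a real kernel version would need atomic operations or locking, which
this single-threaded model leaves out.

```c
#include <stdbool.h>

/* Hypothetical single-counter model of suspend blocking. */
static int suspend_block_count;

/* Take one reference; suspend stays blocked while the count is non-zero. */
static void toy_block_suspend(void)
{
	suspend_block_count++;
}

/*
 * Drop one reference. Returns true when the count reaches zero, which is
 * the point at which opportunistic suspend could be started.
 */
static bool toy_unblock_suspend(void)
{
	if (suspend_block_count > 0)
		suspend_block_count--;
	return suspend_block_count == 0;
}

static bool toy_suspend_allowed(void)
{
	return suspend_block_count == 0;
}
```

What the counter cannot tell you is who holds the references, which is why
the named, per-blocker statistics are tied to the more elaborate design.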

From: Alan Cox
Date: Wednesday, May 26, 2010 - 3:14 pm

If he wants to collect stats about misbehaving code then presumably he
needs reliable stats. Arbitrary user set names are not reliable and
judging by app vendor behaviour in the proprietary space with other such
examples they will actively name their blockers in a manner specifically
intended to hide the cause so as to 'reduce support costs'.

It's a waste of memory (especially if I create a million of them with a
long name for fun).

It's all an economic system, proprietary app vendors are in it to make
money, some will therefore game the system and the rest will be forced to
follow to keep their playing field fair.

Alan
--

From: Brian Swetland
Date: Wednesday, May 26, 2010 - 3:18 pm

You are limited to one per open fd of the device, and a max name size
which could be further shrunk to something pretty small (32?) if
desired.  The device node interface came about after discussions last
year and concerns that userspace could create an unbounded number of

Untrusted (non-system) code can't directly access the device node from
userspace in the Android world -- so directly created suspend blockers
from userspace are only created by a couple system processes (3-4
typically).  Applications are sandboxed by UID and there is (much
more) per-application accounting in the userspace application manager
process (other resource consumption such as sensors, CPU, etc is
tracked here as well).

For suspend blockers created by drivers and by trusted userspace
processes, having a meaningful name significantly helps statistics
gathering.

Brian
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 4:00 pm

Surely you only need one block per task (or better yet one expression of
latency per task). If not can you explain why a setup which has a per
task expression of latency needs (and a 'hard limit' set which you can't
then bump back down except as root) isn't enough ?

I'd really like this lot to also fix the hard real time and power
management problem we have today and to try and fix the "suspend is
special and different" mentality in the kernel, which is getting less and

Great - but the world is not Android and even if they can't access it

By drivers I agree but in the driver case the cost is minimal because
there should not be many and it is bounded clearly. Again I really think
'suspend blocking' is the wrong expression.

A driver needs to express

'Don't go below this latency'

and

'Don't go below this state'

This is more generic and helps our power management do the right thing on
all boxes. For example a serial port can meaningfully say 'I want X
latency worst case' based upon the fact the fifo is 64 bytes and the user
space just asked for 115,200 baud.

The don't go below for states out of which the device must wake but
cannot. Eg if your device is being told by user space to set wake on lan
and can only wake on lan from a higher state than 'off' it needs to say
so.

Alan
--
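
The serial-port example in the message above can be worked through
numerically. The sketch below assumes 8N1 framing (10 bits on the wire per
byte), which the message does not state; fifo_fill_time_us() is a
hypothetical helper, not an existing kernel function.

```c
#include <stdint.h>

/*
 * Microseconds until a receive FIFO of fifo_bytes fills at the given baud
 * rate, with bits_per_char bits on the wire per byte (10 for 8N1).
 * Returns 0 if baud is 0, to avoid dividing by zero.
 */
static uint64_t fifo_fill_time_us(uint32_t fifo_bytes, uint32_t baud,
				  uint32_t bits_per_char)
{
	if (baud == 0)
		return 0;
	return (uint64_t)fifo_bytes * bits_per_char * 1000000ULL / baud;
}
```

Under that assumption, fifo_fill_time_us(64, 115200, 10) gives 5555, so a
64-byte FIFO at 115,200 baud fills in about 5.5 ms. That figure is the kind
of worst-case latency bound a driver could state.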

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 4:00 pm

No, many suspend blockers protect data, not tasks. We block suspend
when there are unprocessed input events in the kernel queue.
User-space then blocks suspend before reading those events, and it
blocks suspend using a different suspend blocker when its queue of



-- 
Arve Hjønnevåg
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 4:52 pm

That sounds to me like higher level user space implementation plus a
kernel driver imposing a power level limit.

The fact your suspend blockers as a user construct are tied to data isn't
necessarily what the kernel needs to provide.

That aside we could equally provide latency constraints using a file
handle based semantic like your suspend blockers. That would make it
easier to do suspend blockers in those terms and provide an interface
that can be used for more elegant expression of desire.

I suggested setpidle() to keep resources tied and bounded to tasks and
to make SMP work nicely. I have no problem with the latency constraints
being floating file handles, although some thought would be needed as to
how that is then mapped to SMP.

The problem with an arbitrary mapping is that if I have an 8 processor
system I might want to use the latency info to stuff tasks onto different
cores so that most of them are in a deeper idle than the others. I can't
do that unless I know who the latency constraint applies to. But
hopefully someone in scheduler land has bright ideas - eg could we keep a
per task variable and adjust it when people open/close blockers with the
assumption being anyone who has open a given constraint is bound by it
and nobody else is.

Might need it to be a constraintfs so you can open other people's
constraints but all the framework for that is there in the kernel too.

Alan
--

From: Rafael J. Wysocki
Date: Wednesday, May 26, 2010 - 3:45 pm

Good point.

Rafael
--

From: Rafael J. Wysocki
Date: Sunday, May 23, 2010 - 5:46 pm

Patches [1-6/8] applied to suspend-2.6/linux-next

Thanks,
Rafael
--

From: Felipe Balbi
Date: Sunday, May 23, 2010 - 9:32 pm

funny thing is that even without sorting out the concerns plenty of 
developers had on the other thread, this series is still taken. What's 
the point in discussing/reviewing the patches then?

-- 
balbi

DefectiveByDesign.org
--

From: Rafael J. Wysocki
Date: Monday, May 24, 2010 - 11:49 am

I don't think the concerns you're referring to can be sorted out.  Some people
just don't like the whole idea and I don't think there's any way we can improve
the patches to make them happy.  The only "solution" they would be satisfied
with would simply be rejecting the feature altogether, although there are no
practically viable alternatives known to me.

OTOH I do think there are quite a few reasons to take the patchset, so I'm
going to push it to Linus as I said in one of my replies to Kevin.  If Linus
decides not to pull it, so be it.

Thanks,
Rafael
--

From: Kevin Hilman
Date: Monday, May 24, 2010 - 3:51 pm

I'm not sure who the "some people" you're referring to are, but I'll
assume I'm included in that group.

I don't think this is a fair characterization of the objections, nor
do I think "rejecting the feature altogether" is the only satisfactory
answer.  Speaking for myself, I find the idea of being able to suspend
while idle a valid objective, and certainly see the usefulness of it
for embedded systems.  I'm also an owner and user of an Android phone,
so I am certainly not out just to make life difficult for Android.

The primary objection is not the end goal, but rather the
implementation.  In particular, the problematic redefinition of what it
means to be idle, or "not doing work that's immediately useful to the
user" to use the phrase from the changelog (where "useful" is still
not defined.)

This (re)definition completely bypasses all current idle
infrastructure based on timers, scheduler, etc. and makes "usefulness"
defined in terms of who holds suspend blockers.  This of course will
lead to a scattering of suspend blockers into any drivers/subsystems
considered "useful", which by looking through current Android kernels
is many of them.

Kevin
--

From: Rafael J. Wysocki
Date: Monday, May 24, 2010 - 4:38 pm

So, in fact, you don't like the _idea_, because the _idea_ is to use suspend
blockers instead of trying to define what "idle" means.

I don't think it's generally possible to define "idle" to match every possible
criteria one can imagine, so your request to do that simply cannot be

That's because the point is not to suspend when the system is "idle", because
that would mean "suspend transparently from the applications' point of view",
which is what the Android people _don't_ _want_ _to_ _do_, because in that
case their battery life would go to the toilet.  The idea is to suspend even

That depends on the maintainers of these subsystems, who still have the power
to reject requested changes.

As I said before, I don't think there's a way to resolve this so that everyone
is happy and in my opinion there are reasons to merge the feature.

Also I don't think we can make any progress discussing it.  We've already
discussed it for a month or so without any real progress and I don't see how
that's going to change now.

Thanks,
Rafael
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 1:47 am

So as a scheduler maintainer I'm going to merge a patch that does a
suspend_blocker when the runqueue's aren't empty... how about that?
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 2:41 am

I don't know if you are serious, since all the runqueues are never
empty while suspending, this would disable opportunistic suspend
altogether.

-- 
Arve Hjønnevåg
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 2:45 am

So why again was this such a great scheme? Go fix your userspace to not
run when not needed.
--

From: Brian Swetland
Date: Wednesday, May 26, 2010 - 2:49 am

Thanks for your constructive feedback.

Brian
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 3:02 am

On Wed, 26 May 2010 11:45:06 +0200

Hi Peter!

This was already mentioned in one of these threads. 

The summary is: The device this kernel is running on doesn't want to
(or can't) rely on userspace to save power. This is because it is an open
system, without an app-store or the like. Everyone can run what he
wants.

So anything relying on (all) userspace solves a different problem.

Cheers,
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 3:08 am

So what stops an application from grabbing a suspend blocker?
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 3:19 am

On Wed, 26 May 2010 12:08:04 +0200

Well, I don't own any android devices, but if I read this all
correctly, an app can request the permission to grab a suspend blocker
at installation time. ("This application is requesting permission to
keep the device from sleeping, thus possibly reducing your battery
time. Are you sure you want to continue? [Yes,No]")

Every app grabbing a suspend blocker shows
up in a "these programs stop suspend" kind of battery-app and is thus
well accounted for. _And the user knows who to blame_.

Maybe this is implemented via fs-permissions? Anyway, I'm sure
that the access control uses a well established method. :)

Cheers,
Flo
--

From: Vitaly Wool
Date: Wednesday, May 26, 2010 - 4:18 am

I don't see this as a valid point. Everyone can run a different kernel
where nothing will just work. Are you aiming protection against that
as well?

~Vitaly
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 4:37 am

On Wed, 26 May 2010 13:18:51 +0200

This is not "protection". This is functioning properly in a real world
scenario. Why would the user change the kernel, if the device would be
buggy after that? (Except maybe he is a geek)

Cheers,
Flo
--

From: Vitaly Wool
Date: Wednesday, May 26, 2010 - 5:01 am

Hmm... Why would the user continue to use the program if it slows down
his device and sucks the battery as a vampire (Except maybe he's a
moron)? ;)

~Vitaly
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 5:24 am

On Wed, 26 May 2010 14:01:49 +0200

Because he is using a robust kernel that provides suspend blockers and
is preventing the vampire from sucking power? 

Most users don't even grasp the simple concept of different "programs".
They just have a device and click here and there and are happy. 

Really, what are you getting at? Do you deny that there are programs,
that prevent a device from sleeping? (Just think of the bouncing
cows app)

And if you have two kernels, one with which your device is dead after 1
hour and one with which your device is dead after 10 hours. Which would
you prefer? I mean really... this is ridiculous. 

Cheers,
Flo

--

From: Felipe Balbi
Date: Wednesday, May 26, 2010 - 5:29 am

hi,


What I find ridiculous is the assumption that the kernel should provide good
power management even for badly written applications. They should work,
of course, but there's no assumption that the kernel should cope with
those applications and provide good battery usage in those cases.

You can install and run anything on the device, and they will work as
they should (they will be scheduled and will be processed) but you can't
expect the kernel to prevent that application from waking up the CPU
every 10 ms simply because someone didn't think straight while writing
the app.

-- 
balbi

DefectiveByDesign.org
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 5:33 am

On Wed, 26 May 2010 15:29:32 +0300

But then someone at the user side has to know what he is doing. 

I fear, if you target mass market without central distribution
channels, you can not assume that much.

Cheers,
Flo
--

From: Felipe Balbi
Date: Wednesday, May 26, 2010 - 5:35 am

Hi,


and that's enough to push hacks into the kernel ? I don't think so. Do 

-- 
balbi

DefectiveByDesign.org
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 5:54 am

On Wed, 26 May 2010 15:35:32 +0300

It really comes down to a policy decision by the distribution maker.
And I don't think kernel upstream should be the one to force one way or
the other. So merging this patch set will allow android to continue
their work _on mainline_ while everybody else can continue as before.

All points about the impact on the kernel have already been raised. So
you should be happy there. 

Nonetheless, I really think the kernel needs to allow for the android
way of power saving. It misses out on a big feature and a big user-base
if not.

Also I expect there to be synergies between android development and
mainline kernel development _only_ if android development can use
mainline kernel.

And as for the quality of the "hack": I think you find this ugly, just
because you don't like the concept of degrading user space guarantees on
timers and stuff. 

But look at it this way: Suspend blockers are a way for the kernel
to make user space programs accountable for using the resource "power".
If a user space program needs the "traditional" guarantees for
functioning properly, it needs to take a suspend blocker. But _THEN_ it
better be well behaved. This is a kind of contract between userspace
and kernelspace.

On the other hand, if I don't need these traditional guarantees on
timers and stuff, I don't have to know device specific things about
power consumption. I can just use whatever facilities the programming
language provides without needing to worry about low level details.

This is a _big_ plus for attracting 3rd party programs. (And of course
the thing you don't like). 

Cheers,
Flo




--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 6:06 am

That's exactly what we always do. If we were not to do so, the kernel

I really think we should not do so. Let them help in fixing the real
issue instead of creating a new class of userspace that is more

How is userspace without suspend blockers not accountable? We can easily
account runtime and in fact have several ways to do so.
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 6:19 am

You would do better to concentrate on technical issues than the
assignment of malicious intent to other parties.

Alan
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 6:39 am

On Wed, 26 May 2010 14:19:42 +0100

This was nothing of the kind! He explicitly said this:

On Wed, 26 May 2010 15:29:32 +0300

And I responded that if the kernel would do this, then that would
be a "_big_ plus for attracting 3rd party programs". 

I had no intent of attacking anyone or putting words in someone's mouth.
Sorry if this was unclearly written.

Cheers,
Flo
--

From: Felipe Contreras
Date: Thursday, May 27, 2010 - 1:58 am

Let's get rid of hypothetical uses in the future: suspend blockers are
_only_ used by Android user-space. Nobody else has expressed any

That's like saying "there can only be synergies between linux real
time and mainline _only_ if RT development can use mainline".

I can give you my experience at Nokia... can you use mainline on any
of the Maemo devices? No. You have to patch the kernel heavily, to be
able to kind-of run the official user-space, or you have to use a
different user-space.

Does that prevent synergies? No. As Brian Swetland and Daniel Walker
already expressed before; you can run mainline kernel with debian on
Android phones.

It would be nice to run Android user-space, or parts of it on mainline
kernels, but if it's not possible, that's a deficiency on Android's
design; Maemo/Moblin/Meego are good players in the linux ecosystem so
you can re-use parts of the system on typical desktops (in fact many
are coming from there), and there are community distributions re-using
those parts and running just fine on mainline kernels.

Sure, it would be easier for Android developers if all their crap was
in the mainline, but even then there are no guarantees of anything.
Just like any other linux phone, you'll probably need to add patches
for 3D drivers, DSP, or other hardware acceleration, missing
board-specific patches, and a bunch of hacks.

So if you have to add all those patches anyway, what's the problem of
having to add the suspend block patches?

Why do some Android developers think they can be the exception and
have patches merged in the core of linux _only_ for their specific
user-space, and their specific drivers?

If you separate suspend blockers from Android, and judge them on their
technical merit, I don't see a single person saying this is a good
idea, we'll switch all our user-space to use them.

-- 
Felipe Contreras
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 5:41 am

Provide the developers and users with tools. 

Notify the users that their phone is using power at an unadvised rate
due to proglet $foo.

Also, if you can integrate into the development environment and provide
developers instant feedback on suckage of their app they can react and
fix before letting users run into the issue.


--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 6:03 am

On Wed, 26 May 2010 14:41:29 +0200

Yeah. And I personally agree with you there. But this is a policy
decision that should not prevent android from doing it differently.
The kernel can not win if it does not try to integrate any use of it.
After all, we are a free community and if someone wants to use it their
way, why not allow for it? (As long as it does not directly impact other
uses)

The best solution wins, but not by decision of some kernel
development gatekeepers, but because it is superior. There are no clear
markings of the better solution. Time will tell.

Cheers,
Flo


--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 6:07 am

If we'd integrate every patch that came to lkml, you'd run away
screaming.

We most certainly do not want to integrate _any_ use.
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 6:30 am

On Wed, 26 May 2010 15:07:27 +0200

We most certainly do want to integrate any use that is not harmful to
others.

I don't buy the argument that this is harmful. 

Cheers,
Flo
--

From: Vitaly Wool
Date: Wednesday, May 26, 2010 - 5:55 am

You almost always need to "hack" the mainline software for a
production system. So do it here as well. Make sure the hack is well
isolated and local. You can even submit it to the mainline, better as
a configuration option, _unless_ it is a *framework* that provokes
writing code in an ugly and unsafe way.

~Vitaly
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 6:19 am

On Wed, 26 May 2010 14:55:31 +0200

I don't think that the in-kernel suspend block is a bad idea. 

You could probably use the suspend-blockers unconditionally in the
suspend framework to indicate if a suspend is possible or not.
Regardless of opportunistic suspend or not. This way, you don't have to
try-and-fail on a suspend request, which potentially makes suspending
more robust and allows for a "suspend as soon as
possible" semantic (which is probably a good idea, if you have to grab
your laptop in a hurry to get away).
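
To make the idea above concrete, here is a minimal user-space sketch in C of
a suspend path that checks for active blockers before it tries to suspend.
The names (blocker_count, suspend_block, suspend_unblock, suspend_possible)
are hypothetical and are not the API from the patch set; a real
implementation would also need locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: number of suspend blockers currently held. */
int blocker_count;

/* Take a blocker: suspend must not proceed while it is held. */
void suspend_block(void)
{
	blocker_count++;
}

/* Release a previously taken blocker. */
void suspend_unblock(void)
{
	assert(blocker_count > 0);	/* an unbalanced release is a bug */
	blocker_count--;
}

/* Suspend is possible only when nobody holds a blocker. */
bool suspend_possible(void)
{
	return blocker_count == 0;
}
```

With something like this, a suspend request could check suspend_possible()
first instead of starting the suspend sequence and failing halfway.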

Cheers,
Flo
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 6:16 am

The problem you have is that this is policy. If I have the device wired
to a big screen and I want cows bouncing on it I'll be most upset if
instead it suspends. What you are essentially arguing for is for the
kernel to disobey the userspace. It's as ridiculous (albeit usually less
damaging) as a file system saying "Ooh that's a rude file name, the app
can't have meant it, I'll put your document somewhere else"

The whole API feels wrong to me. It's breaking rule #1 of technology "You
cannot solve a social problem with technology". In this case you have a
social/economic problem which is crap code. You solve it with an
economics solution - creative incentives not to produce crap code like
boxes that keep popping up saying "App XYZ is using all your battery" and
red-amber-green powermeter scores in app stores.

That said if you want technical mitigation I think it makes more sense
if you look at it from a different starting point. The starting point
being this: We have idling logic in the kernel and improving this helps
everyone. What is needed to improve the existing logic ?

- You don't know which processes should be ignored for the purpose of
  suspend (except for kernel threads) and there is no way to set this

- You don't know whether a move from a deep idle to a 'suspend' (which is
  just a really deep idle in truth anyway) might break wakeups
  requirements because a device has wake dependencies due to hardware
  design (eg a port that has no electronics to kick the box out of
  suspend into running). This is a problem we have already. [1]

That maps onto two existing ideas

Sandboxing/Resource Limits: handling apps that can't be trusted. So the
phone runs the appstore code via something like

		setpidle(getpid(), something);
		exec()

where 'something' is a value with meaning to both user space and to the
existing idling logic in the kernel that basically says to what extent it
is permitted to block idling/suspend. That also seems to tie into some of
the realtime ...
--

From: Thomas Gleixner
Date: Wednesday, May 26, 2010 - 6:46 am

Alan,


I completely agree. 

We have already proven that the social pressure on crappy applications
works. When NOHZ was merged into the kernel we got no effect at all
because a big percentage of user space applications just used timers
at will and without any thoughts, also it unveiled busy polling and
other horrible coding constructs. So what happened ? Arjan created
powertop which lets the user analyse the worst offenders in his
system. As a result the offending apps got fixed rapidly simply
because no maintainer wanted to be on top of the powertop sh*tlist.

In the mobile app space it's basically the same problem. Users can
influence the app writers simply by voting and setting up public lists
of apps which are crappy or excellent. All it needs is a nice powertop
tool for the phone which allows the users to identify the crap on
their phones. That provides much more incentive - especially for
commercial players - to fix their crappy code.

Adding that sys_try_to_fix_crappy_userspace_code() API to the kernel
is just counter productive as it signals to the app provider: Go
ahead, keep on coding crap!

That's not a solution, that's just capitulation. 

It's absurd that some folks believe that giving up the most efficient
tool to apply pressure to crappy app providers is a good idea.

Thanks,

	tglx
--

From: Felipe Balbi
Date: Wednesday, May 26, 2010 - 8:33 am

Hi,


I couldn't agree more with both of you. I also have stated that a
powertop application with a fancy UI would do the job. Also building
some sort of power estimations on the SDK would allow the developer to
have fast feedback about potential power consumption caused by his app
on the device.

On top of that, the app stores can use the same power estimation
"technology" to rate apps automatically and even reject apps that are
waaaay too badly written.

I also feel that the kernel shouldn't have to deal with, fix, or hide bad behavior
from apps.

-- 
balbi
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 8:11 am

On Wed, 26 May 2010 14:16:12 +0100

I'm not saying that your argument is not valid. But why don't you look
at suspend blockers as a contract between userspace and kernelspace? An
Opt-In to the current guarantees the kernel provides in the non-suspend
case.

<<If you want to use the rare resource "power" you have to take a
suspend blocker. By this you assert that you are a well written
application. If you are not well written, you will get the worst of our
red-amber-green powermeter scores we have.>>

On the other hand, applications can say, they don't need that much
power and userspace guarantees and not take a suspend blocker.

This is an option which they currently don't have.

I don't think opportunistic suspend is a policy decision by the kernel.
It is something new. Something which currently only the android
userspace implements / supports. If you don't want to suspend while
looking at the bouncing-cow, you have to take a suspend blocker and
make yourself a user-visible power-eater, or don't do 

echo "opportunistic" > /sys/power/policy 

in the first place.

This "optionally being badly written, who cares?" is a new feature the
kernel can provide to applications. 

That said, your proposed alternative implementation scheme looks like

How does this address the loss of wakeup events while using suspend? 
(For example the 2 issues formulated by Alan Stern in [1])

cheers,
Flo

[1]http://lkml.org/lkml/2010/5/21/458

p.s.: 
dmk@schatten /usr/src/linux $ grep -r "setpidle" .
dmk@schatten /usr/src/linux $ 
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 8:12 am

By not suspending obviously.

--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 8:15 am

That's backwards.

--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 8:40 am

On Wed, 26 May 2010 17:15:47 +0200

I think that's the point of it. 

--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 8:45 am

Apparently, and you're not accepting that we're telling you we think it's
a singularly bad idea. Alan seems to have the skill to clearly explain
why, I suggest you re-read his emails again.

--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 8:47 am

On Wed, 26 May 2010 17:45:00 +0200

I'm sorry if I offend you. I indeed read Alan's emails. It's just they
have more content than yours. So it takes longer. 

Cheers,
Flo
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 8:49 am

On Wed, 26 May 2010 17:47:35 +0200

p.s.: also they encourage me to think more before answering. 
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 8:16 am

How about we don't merge that junk and don't give you the opportunity to
do silly things like that? :-)

--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 8:45 am

It is a contract - but not the right one. You are removing autonomy from
the kernel when only the kernel can measure the full picture and when the


Disagree. It's an arbitrary and misleading divide that happens to reflect
a specific vendor's current phones. Worse yet it may not reflect their own
future products. It assumes for example that there is some power level
that is 'suspend' that is singular, application understood and can be
effectively user space managed. Big assumptions and not ones that seem to
be sensible.

It also breaks another rule - when the hardware changes your application
blocker policies will be wrong. Do you want multiple hand optimised
copies of each app ? Take a look at what happened to CPU designs where
the assumption was you'd recompile the app for each CPU version to get
any useful performance.

If you are instead expressing it as "must be able to respond in time X"
and "must be able to wake up from an event on an active device" then your
interface is generic and hardware independent.

If "bouncing cows" says 'need to wake up every 0.3 seconds' you want the
kernel to decide how best to do that. It will vary by hardware. On today's
desktop PC that's probably a low power state limit. On some current
embedded hardware it might be a special deep sleep mode. On one or two
devices it might be 'suspend'. It might also be that the properties have
been set to 2 seconds already so it gets told it can't have 0.3.

The app cannot be allowed to know platform specific stuff or your
portability comes apart and you end up with a disaster area where each
app only comes on a subset of devices. Express the requirement right and
you get a simple clean interface that continues to work.


That's a very big hammer and it doesn't express what I actually want,

But you can do this properly by having a per process idle requirement,
and that can encompass things like hard real time as well (or even
gaming). The suspend blockers break all the global policy, don't solve
real ...
--

From: Thomas Gleixner
Date: Wednesday, May 26, 2010 - 10:22 am

Florian,


Wrong. A well coded power aware application is very well able to
express that in various ways already today. Admittedly it's far from
perfect, but that can be fixed by adding interfaces which allow the
power aware coder to express the requirements of his application
actively, not by avoiding it.

suspend blockers are completely backwards as they basically disable
the kernel's ability to do resource management.

Also they enforce a black and white scheme (suspend or run) on the
kernel which is stupid, as there are more options to efficiently save
power than those two. While current android devices might not provide
them, later hardware will have it and any atom based device has them
already.

So what the kernel needs to know to make better decisions are things
like:

  - how much slack can timers have (existing interface)
  - how much delay of wakeups is tolerated (missing interface)

and probably some others. That information would allow the kernel to
make better decisions versus power states, grouping timers, race to
idle and other things which influence the power consumption based on

It's a misfeature which the kernel should not provide at all. It sends
out the completely wrong message: Hey, we can deal with your crappy
code, keep on coding that way.

While this seems to sound cool to certain people in the mobile space,
it's completely backwards and will backfire in no time. 

The power efficiency of a mobile device is depending on a sane overall
software stack and not on the ability to mitigate crappy software in
some obscure way which is prone to malfunction and disappoint users.

Thanks,

	tglx

--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 11:02 am

Even if you believe the kernel should be containing junk the model that
works and is used for everything else is resource management. Not giving
various tasks the ability to override rules, otherwise you end up needing
suspend blocker blockers next week.

A model based on the idea that a task can set its desired wakeup
behaviour *subject to hard limits* (ie soft/hard process wakeup) works
both for the sane system where its elegantly managing hard RT, and for
the crud where you sandbox it to stop it making a nasty mess.
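
A rough sketch of that soft/hard model in C, purely as an illustration: the
struct and function names are hypothetical, the units (milliseconds) are an
assumption, and nothing like this exists in the kernel today. The task asks
for a wakeup latency, and the request is clamped so it can never be tighter
than the hard limit set for it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task wakeup limits, in milliseconds. */
struct wakeup_limit {
	unsigned int soft_ms;	/* latency the task currently gets */
	unsigned int hard_ms;	/* tightest latency the task may ask for */
};

/*
 * Record a requested wakeup latency. Requests tighter than the hard
 * limit are clamped to it. Returns the latency actually granted.
 */
unsigned int set_wakeup_latency(struct wakeup_limit *lim,
				unsigned int requested_ms)
{
	assert(lim != NULL);
	lim->soft_ms = requested_ms < lim->hard_ms ? lim->hard_ms
						     : requested_ms;
	return lim->soft_ms;
}
```

A sandboxed app with hard_ms set to 2000 that asks for 300 gets 2000, while
a hard RT task with hard_ms set to 0 gets whatever it asks for.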

Do we even need a syscall or will adding RLIMIT_WAKEUP or similar do the
trick ?

Alan
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 12:56 pm

On Wed, 26 May 2010 19:02:04 +0100

Your approach definitely sounds better than the current solution. 
What about mapping suspend blocker functionality later on, when this
interface exists, on to this new approach and deprecating it?

Cheers,
Flo
--

From: Vitaly Wool
Date: Wednesday, May 26, 2010 - 1:03 pm

What about coming back after some while with the appropriate solution
when it's ready instead of stubbornly pushing crap?

~Vitaly
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 10:11 pm

On Wed, 26 May 2010 22:03:37 +0200

Because quite frankly, for a good part of linux users, suspend blockers
are already in the kernel. It's just a historical mistake that they are
not in the linux kernels hosted on kernel.org. 
So why don't we do what we always do? Improve existing interfaces step
by step? 

Top Down approaches fail from time to time. Also, it is not clear that
the proposed interface works for the use cases. This has to be proven
by providing an implementation. 

Cheers,
Flo
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 6:35 am

No, it's not a historical mistake. It's a technical decision _NOT_ to
merge crap. If we would accept every crappy patch which gets shipped
in large quantities as a defacto part of the kernel we would have a

Exactly, that's what we are going to do. We improve and extend
existing interfaces step by step, but not by creating a horrible and
unmaintainable mess in the first place which we can never get rid of

Nobody prevents you from sitting down and starting with a proof of concept
implementation.

Thanks,

	tglx
--

From: Florian Mickler
Date: Friday, May 28, 2010 - 12:25 am

On Thu, 27 May 2010 15:35:18 +0200 (CEST)

Ok to your two paragraphs. I can understand this. 

Nonetheless, I'm convinced that there has to be some solution in
mainline to allow for what android does. But perhaps it needs more

Hmm... *scratch*... *lookaround* .. who?

--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 6:26 am

We definitely will need them when we want to optimize the kernel
resource management on a QoS based scheme, which is the only sensible

Right, the base system can set sensible defaults for "verified" apps,
which will work most of the time except for those which have special
requirements and need a skilled coder anyway. And for the sandbox crud
the sensible default can be "very long time" and allow the kernel to

That might be a good starting point.

Thanks,

	tglx
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 12:54 pm

On Wed, 26 May 2010 19:22:16 +0200 (CEST)

Yeah, but a user can't say: "I don't
know programming, but I had this idea. Here try it out." 
to his friend. Because his friend's phone then will crap out.

This is a negative. The phone just doesn't work well. 

A knowledgeable programmer is able to do extra work to enable specific
guarantees. A dummy-throw-away-programmer doesn't want to do


This is true. Nonetheless, in my opinion, implementing the "backwards
approach" in any form (suspend blockers or some other sort of "sane"
interface) is necessary in the long run.  I also believe that Alan's
approach is the more flexible one. But I'm not in a position to judge
on this.

If it really is better and superior, I think userland will switch
over to it, as soon as it is possible to do it. The impact to the
drivers code is needed anyway. What does the kernel lose by implementing
suspend blockers, and later a more fine-grained approach and mapping the
userspace part of suspend blockers on that fine-grained approach later

I liked this idea of Arjan, to give some applications infinite timer
slack. But I don't know if it can be made to work in a "dummy proof" way.
(I.e. it is too complicated to get it right in general, except for the

Doesn't solve the segregation problem and is probably overkill for most
applications. I see this as an orthogonal thing (i.e. not

I can't really say anything against that. Or anything in favor of it.  
Except that this is probably a really hard decision for Linus to

Maybe. It is something which seems to not come from the traditional
"linux distribution" model of software deployment, where you have a
party that codes and another party that packages that code. 

It instead is more targeted at the decentralised 3rd-party-app

True. But I wouldn't say that it is the linux kernel who should force
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 3:09 pm

I think it's both. It's the point of it, and that itself is a defect. Its

The Linux approach is to do the job right. That means getting the
interface right and so it works across all the interested parties (or as

The key question that matters for suspend management is 'what wakeup

This is the Linux way of doing things. It's like the GPL and being
shouted at by Linus. They are things you accept when you choose to take
part. Google chose to use Linux, if they want a feature upstream then the
way you get it there is to figure out how to solve the real problem and
make *everyone* (within reason) happy.

We now have suggestions how to do the job properly so the right thing is
probably to go and explore those suggestions not merge crap.

Merging crap won't help anyway. The rest of the kernel community can
still simply stonewall those interfaces, and a volunteer community has
ways of dealing with abuse of process, notably by simply not getting
around to, or implementing things directly contrary to the crap bits.

So it's not even in the interest of people to play political games. Even
if you get away with it in the short term the people who rely on the junk
will end up out on a limb and holding the baby when the crap hits the fan
(see reiserfs)

Alan
--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 10:14 pm

On Wed, 26 May 2010 23:09:43 +0100

I'm not interested in "abusing processes". I just think, this is in
limbo for too long already.
Just decide something. One way or the other. The world will continue.

Cheers,
Flo
--

From: Vitaly Wool
Date: Thursday, May 27, 2010 - 12:43 am

Oh man, you rule the world eh? :)

~Vitaly
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 6:37 am

Trying to solve that inside the kernel is the patently wrong
approach. The only way to give Joe Clicker the ability to "code" his
idea is to give him a restricted sandbox environment which takes care
of the extra work and is created by knowledgeable programmers with the
Joe Clickers in mind.

Every Joe Clicker can "code" websites and does this happily in his
sandbox which consists of a web server and a web application
framework. There is no single line of code in the kernel to make this
work.

As I said before we need new interfaces and new technologies to let
the kernel do better power management, but definitely not a big hammer
approach which is actively preventing the kernel from making smarter
decisions.

The correct approach is QoS based and everything else is just a

I know that this is the point of the approach, but that does not make
it less wrong. Alan, others and I have already explained at great length
why it is the wrong approach, but you refuse to listen.

You remind me of my 14 year old son, who argues in circles to convince
me that I must allow him something which is wrong. And if he would
think about it w/o his primary goal of getting permission in mind he

The kernel loses the ability to remove suspend blockers once the
interface is exposed to user space. That's the whole point. We would
have to carry it on for a long time and try to work around it when
implementing a technically correct approach.

And we have never seen crap move to a better interface. It will stay

A mobile device can implement sensible defaults for the careless

It solves the segregation problem quite nicely, as again it can be

It's not orthogonal, it's essential to do QoS based power management,
which is the only sensible technical solution to do, as it allows the
kernel to optimize in various areas while at the same time
guaranteeing the response time to applications which require them.

Blockers are not orthogonal at all, as they actively prevent clever

The kernel does not make ...
--

From: Kevin Hilman
Date: Wednesday, May 26, 2010 - 8:19 am

Completely agree with this.

I used the static/dynamic names out of habit, but since on most
embedded devices, there is really no difference in hardware power
state, I agree that the difference is only a matter of wakeup latency.

Kevin

--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:30 pm

We never suspend when the screen is on. If the screen is off, I would

No I'm not. User-space asked the kernel to suspend when possible.
Suspend is an existing kernel feature. All opportunistic suspend adds
is the ability to use it safely when there are wakeup events you cannot

Our actual starting point is this: Some systems can only enter their
lowest power state from suspend. So we added an api that allowed us to
use suspend without losing wakeup events. Since suspending also
improves our battery life on platforms that enter the same power state
from idle and suspend and we already have a way to safely use suspend,

Sandboxing is problematic on Android since there are a lot of cross
process dependencies. When a call comes in I don't know where the name
and picture to display comes from. With suspend blockers we block
suspend when we get notified that we have an incoming call and we can


What about platforms that currently cannot enter low power states from



-- 
Arve Hjønnevåg
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 4:39 pm

On Wed, 26 May 2010 15:30:58 -0700

This is policy and platform specific. The OLPC can suspend with the
screen on. Should I write my app to know about this once for Android and
once for OLPC (and no doubt once for Apple). In the OLPC case cows could
indeed suspend to RAM between frames if the wakeup time was suitable.

My app simply should not have to know this sort of crap, that's the whole

Don't get me wrong - opportunistic suspend isn't the problem. Suspend
blockers are - or more precisely - the manner in which they express
intent from userspace. Opportunistic suspend is wonderful stuff for all
sorts of things and if it persuades people like netbook manufacturers to
think harder, and Linux driver authors to optimise their suspend/resume

Opportunistic suspend isn't special. Suspend is just a very deep idle. In
fact some of the low power states on processors look little different to
suspend - the OS executes a whole pile of CPU state saving and cache
flushing. It might be a hardware state machine, it might be buried in
firmware or it might be quite explicit (eg mrst). So we already have
differing degrees of doing additional work in different states.

User triggered suspend is a bit special in that the user is usually right
in that case to override the power management policy.

Note I'm not suggesting we run off and restructure all our power
management code to take this view right now. I'm suggesting we need a
clean 'opportunistic suspend is not special' view by user space. How the
kernel handles this is addressable later without app breakage, but only

But you can express user suspend blocking in this manner. Your call
incoming code would say 'I want good latency'. As someone needs good
latency the box won't suspend. If your approach is to start with an
initial 'anyone can set a low latency we don't care' then anyone can
block suspends.

Equally your call handling example is about latency not about suspend.
You want the phone to stay on, to fetch a picture and display ...
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 5:49 pm

Are you still talking about Linux suspend? If you can enter S3 from
idle and still wake up for the next timer or interrupt, then do that.

Most apps do not have to know about this with opportunistic suspend
either. If the user is interacting with the device, we don't suspend.
If your app needs to run when the user is not interacting with the

Suspend as it is currently implemented in Linux is special. Regular
timers stop, and only specially marked wakeup events can bring the

On a phone this is not the case. The user manually can toggle the
screen on and off, and we may or may not enter suspend when the screen
is off. If we forced suspend when the user turned the screen off, we

I don't think a latency api is the right way to express this since the
only latency values we would use is minimal-latency and any-latency.
What we are expressing is that this work needs to happen now, even if

We have two main modes of operation. The user is interacting with the
device and tasks and timers should run as requested. And, the user is
not interacting with the device and the device should enter (and stay
in) low power states regardless of running tasks and timers. Since
some events (e.g. incoming phone call, alarm clock) may cause the
user to start interacting with the device, they need special
treatment. A per thread latency api does not work for us. A global
latency api could work, but since we would only use minimal latency or
any latency this seems like overkill. Also, with a global latency api,
how do I know if the requested latency is meant to improve the
experience while the user is interacting with the app, or if it is meant

I don't think you understood what I asked. Currently most x86 systems
can enter much lower power states from suspend than they can from
idle. Are you suggesting that we remove suspend support from Linux and
try to enter the same power states on x86 from idle that we now enter



-- 
Arve Hjønnevåg
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 7:29 am

It does not matter whether it's S3 or any other power saving
state / mechanism in a particular device.

What matters is that the kernel needs to know about the QoS
requirements of the applications which are active to make optimal
decisions for power management. 

If we can go into any given power state from idle then the decision
which power state to select needs to be made on the requirements of an
application to react on the next event (timer, interrupt, ...).

So yes, we could go into S3 (with some effort) and arm the wakeup
source which will bring us back when the requirements of the apps are

You can express that in QoS requirements as well. If you say "max
wakeup latency 100ms" then the kernel will select a power state which
will meet this requirement. So it can decide whether to go into full
suspend on a given system or select some other better suiting power
saving mechanism. But this allows us to express this in a completely
hardware independent way. If the hardware does not provide a suitable

That's a matter of the current implementation. We can change that and
improve it to do better resource management. And this requirement is
not restricted to the mobile phone space, it's true for laptops,

Your mind is stuck in a black and white decision scheme, which you
implemented with the suspend blockers big hammer approach.

There is a wide variety between minimal and any latency. It depends on
the task at hand. An interactive application will want a latency which
is in the range of acceptable user experience, but that's not
necessarily the minimal latency which the system can guarantee and
provide. A background task can say "I'm fine with 100ms" which allows
the kernel to aggregate wakeups in a clever way.
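
A minimal sketch of the selection described above, assuming a hypothetical
table of power states sorted from shallowest to deepest. The names
power_state, effective_latency_ms and pick_state are invented for
illustration and are not an existing kernel API.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical description of one power state (not a real kernel type). */
struct power_state {
	const char *name;
	unsigned int exit_latency_ms;	/* time needed to become responsive again */
};

/* Tightest wakeup latency requested by any task; UINT_MAX means "any latency". */
unsigned int effective_latency_ms(const unsigned int *requests, size_t n)
{
	unsigned int limit = UINT_MAX;

	for (size_t i = 0; i < n; i++)
		if (requests[i] < limit)
			limit = requests[i];
	return limit;
}

/*
 * Return the deepest state whose exit latency still meets the limit,
 * or NULL if none does. The states array must be sorted shallow to deep.
 */
const struct power_state *pick_state(const struct power_state *states,
				     size_t n, unsigned int limit_ms)
{
	const struct power_state *best = NULL;

	assert(states != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (states[i].exit_latency_ms <= limit_ms)
			best = &states[i];
	return best;
}
```

With made-up states of 1 ms (idle), 20 ms (deep idle) and 500 ms (suspend),
a task asking for 100 ms limits the choice to deep idle, while full suspend
only becomes eligible once every task accepts any latency.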

Even if Android decides that min and any are the only two which
matter, then this approach will work for android, but lets us use the
same mechanism and technology in other areas where people are

Again. You do not need a global latency API. A per thread and ...
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 7:06 am

I don't entirely see how this works. In order to deal with poorly 
written applications, it's necessary to (optionally, based on some 
policy) ignore them when it comes to the scheduler. The problem is how 
to implement the optional nature of this in a race-free manner. This is 
obviously a pathological case, but imagine an application that does 
something along the following lines:

int input = open ("/dev/input", O_RDONLY|O_NONBLOCK);
char foo;

while (1) {
	suspend_block();
	if (read(input, &foo, 1) > 0) {
		(do something)
		suspend_unblock();
	} else {
		suspend_unblock();
		(draw bouncing cows and clouds and tractor beams briefly)
	}
}

Now, if the user is playing this game, you want it to be scheduled. If 
the user has put down their phone and the screen lock has kicked in, you 
don't want it to be scheduled. So we could imagine some sort of cgroup 
that contains untrusted tasks - when the session is active we set a flag 
one way which indicates to the scheduler that tasks in TASK_RUNNING 
should be scheduled, and when the session is idle we set the flag the 
other way and all processes in that cgroup get shifted to 
TASK_INTERRUPTIBLE or something.

Except that doesn't work. If the session goes idle in the middle of the 
app drawing a frame, we'll stop the process and the task will never call 
read(). So the user hits a key, we wake up, nothing shifts from 
TASK_INTERRUPTIBLE into TASK_RUNNING, the key never gets read, we go 
back to sleep. The event never gets delivered.

Now let's try this in the Android world. The user hits a key and the 
system wakes up. The input layer takes a suspend block. The application 
now draws all the cows it wants to, takes its own suspend block and 
reads the input device. This empties the queue and the kernel-level 
suspend block is released. The application then processes the event 
before releasing the suspend block. The event has been delivered and 
handled.
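
The handoff described above can be modelled as a small state machine. This
is only a toy sketch: pm_model and the functions below are hypothetical
names, not the suspend blocker API from the patch series.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the handoff; every name here is hypothetical. */
struct pm_model {
	bool input_block;	/* held by the input layer while events are queued */
	bool app_block;		/* held by the application */
	int queued;		/* unread input events */
};

/* Suspend is allowed only when nobody holds a block. */
bool can_suspend(const struct pm_model *m)
{
	return !m->input_block && !m->app_block;
}

/* Key press: the input layer queues the event and takes its block. */
void key_press(struct pm_model *m)
{
	m->queued++;
	m->input_block = true;
}

void app_suspend_block(struct pm_model *m)   { m->app_block = true; }
void app_suspend_unblock(struct pm_model *m) { m->app_block = false; }

/* Read one event; the input layer drops its block once the queue is empty. */
bool app_read(struct pm_model *m)
{
	assert(m->queued >= 0);
	if (m->queued == 0)
		return false;
	if (--m->queued == 0)
		m->input_block = false;
	return true;
}
```

In this model can_suspend() stays false from key_press() until
app_suspend_unblock(), because the application takes its own block before
the read releases the input layer's block.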

You can't express that with resource limits or QoS constraints. ...
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 7:28 am

What's wrong with simply making the phone beep loudly and displaying:
bouncing cows is preventing your phone from sleeping!


--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 7:35 am

Well, primarily that it's possible to design an implementation where it 
*doesn't* prevent your phone from sleeping, but also because a given
application may justifiably be preventing your phone from sleeping for a 
short while. What threshold do you use to determine the difference?

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 7:41 am

Whatever you want, why would the kernel care?

You can create a whole resource management layer in userspace, with
different privilege/trust levels. Trusted apps may wake more than
untrusted apps. Who cares.

The thing is, you can easily detect what keeps your cpu from idling.
What you do about it is a pure userspace solution.

You can use the QoS stuff to give hints, like don't wake me more than 5
times a minute, if with those hints an app still doesn't meet whatever
criteria are suitable for the current mode, yell at it. Or adjust its
QoS parameters for it.
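
A userspace policy check of that kind could look roughly like this. It is a
sketch under assumptions: the function names and the idea of keeping wakeup
timestamps in an array are invented here, and nothing below is an existing
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count wakeups that happened within the last window_s seconds.
 * wake_times_s holds timestamps in seconds; all names are hypothetical.
 */
size_t recent_wakeups(const long *wake_times_s, size_t n,
		      long now_s, long window_s)
{
	size_t count = 0;

	assert(window_s > 0);
	for (size_t i = 0; i < n; i++)
		if (wake_times_s[i] <= now_s &&
		    now_s - wake_times_s[i] < window_s)
			count++;
	return count;
}

/* True if the app exceeded its hint of max_per_minute wakeups. */
bool over_wake_budget(const long *wake_times_s, size_t n,
		      long now_s, size_t max_per_minute)
{
	return recent_wakeups(wake_times_s, n, now_s, 60) > max_per_minute;
}
```

What the policy layer does when over_wake_budget() returns true (warn,
adjust the QoS parameters, or something harsher) stays a userspace decision.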

Heck, for all I care, simply SIGKILL the thing and report it once the
user starts looking at his screen again.
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 7:43 am

Provide incentive for Joe Clicker to improve his app, instead of cope
with the shit he created.
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 8:10 am

That isn't helpful. But if you feel like that I suggest you run with your
memory management protection disabled, it's really only there to deal with
crap code and it's giving the wrong incentives. Come to think of it
you might want to remove your seatbelts and any safety catches or airbags
- it only encourages carelessness.

The reality is you need a sane, generic, race free way to express your
requirements (eg for hard RT) and ditto for constraining the expression
(for 'crapplications')

Arguing that you don't need to do this isn't useful. Android has
demonstrated a need to do this. RT has a need to do some of this.
Virtualisation wants elements of this etc.

The question is how you do it right.
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 8:07 am

Sure, I fully agree with the task and per device QoS stuff. I'm just
saying that it's good to inform the user that some app is severely
mis-behaving.
--

From: Florian Mickler
Date: Thursday, May 27, 2010 - 9:28 am

On Thu, 27 May 2010 16:10:54 +0100

And the thing is, even a well written app can be a 'crapplication'
depending on the context and mood of the user.

cheers,
Flo
--

From: Rafael J. Wysocki
Date: Thursday, May 27, 2010 - 2:17 pm

I violently agree, thanks Alan!

Rafael
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 8:05 am

I would hope not, because I'd rather prefer my app that used the screen
to get the chance to save important data on what it was doing
irrespective of the screen blank:  "I have an elegant proof for this
problem but my battery has gone flat"


I don't see why not. You just have to think about the problem from the
right end. Start from "normality is well behaved applications" and
progress to "but I can constrain bogus ones". 

So what are the resource constraints/QoS constraints for your example:

[Simplistically]

1. App says 'I want to wakeup from events for me within 1 second'
	(Because I like drawing cows at about that rate)
2. App opens driver for buttons
3. App opens driver for screen

   Driver for buttons goes 'humm, well I can trigger wakeup from all
   power states so I need no restrictions'. Screen will vary by device a
   lot.


(I'll come back to the screen a bit more in a moment)

So let's consider the same binary

App runs on OLPC like h/w

The pm code goes 'well I can suspend/resume in a second, that's cool'
The screen code goes 'Hey I've got OLPC like video so that's ok'
The button driver can wake the system from suspend and queue an event


App runs on Android like h/w

The pm code goes 'well I can suspend/resume in a second, that's cool'
The screen code goes 'Gee the screen goes blank if I go below level X' so
	I'll set a limit
The button driver can wake the system from suspend and queue an event


App runs on Android like h/w but not trusted

The pm code goes 'well tough, you can't do that, I'll refuse you'
	(Maybe user space wrapped by Android with a 'Cows wants to eat
	your phone alive' [Refuse] [This Time Only] [Always] UI.
	User hits refuse and Android duly assigns the code no guarantee
	and a hard limit of no guarantee.)

The screen code goes 'tough'
The button driver can wake the system etc

Cows will get suspended for longer than one second whether it likes it or
not

App runs on a desktop PC

The pm code goes 'well I can't do ...
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 8:05 am

Very well put again.

I bet the next example is a proglet that does: while(1); without device
interaction :-).
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 9:07 am

Perhaps set after callbacks are made. But given that the approach 

It's still racy. Going back to my example without any of the suspend 
blocking code, but using a network socket rather than an input device:

int input = socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, 0);
char foo;
struct sockaddr addr;
connect (input, &addr, sizeof(addr));
while (1) {
       if (read(input, &foo, 1) > 0) {
               (do something)
       } else {
               (draw bouncing cows and clouds and tractor beams briefly)
       }
}

A network packet arrives while we're drawing. Before we finish drawing, 
the policy timeout expires and the screen turns off. The app's drawing 
is blocked and so never gets to the point of reading the socket. The 
wakeup event has been lost.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 9:41 am

On Thu, 27 May 2010 17:07:14 +0100


Which is correct for a badly behaved application. You said you wanted to
constrain it. You've done so. Now I am not sure why such a "timeout"
would expire in the example as the task is clearly busy when drawing, or
is talking to someone else who is in turn busy. Someone somewhere is
actually drawing be it a driver or app code.

For a well behaved application you are drawing so you are running
drawing stuff so why would you suspend. The app has said it has a
latency constraint that suspend cannot meet, or has a device open that
cannot meet the constraints in suspend.

You also have the socket open so you can meaningfully extract resource
constraint information from that fact.

See it's not the read() that matters, it's the connect and the close. 

If your policy for a well behaved application is 'thou shalt not
suspend in a way that breaks its networking' then for a well behaving app
once I connect the socket we cannot suspend that app until such point as
the app closes the socket. At any other point we will break the
connection. Whether that is desirable is a policy question and you get to
pick how much you choose to trust an app and how you interpret the
information in your cpufreq and suspend drivers.

If you have wake-on-lan then the network stack might be smarter and
choose to express itself as

	'the constraint is C6 unless the input queue is empty in which
	 case suspend is ok as I have WoL and my network routing is such
	 that I can prove that interface will be used'

In truth I doubt much hardware can make such an inference but some phones
probably can. On the other hand for /dev/input/foo you can make the
inference very nicely thank you.

Again wake on lan information does not belong in the application !



--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 9:52 am

Sorry, using cgroups and scheduler tricks as a race-free replacement for 

The timeout would be at the userspace platform level. If I haven't 
touched the app for 30 seconds (and if the app hasn't taken any form of 
suspend block), the screen should turn off. In the current Android 
implementation that will then (in the absence of any kernel-level 
suspend blockers) result in the system transitioning into a fully 

Not at all. The fact that the application hasn't taken any sort of 
suspend block means that the application has indicated that it's happy 
with no longer being scheduled when the screen is shut off, *providing 

Again, that's not the desired outcome. The desired outcome is that when 
the screen shuts off, the application no longer gets scheduled until a 

This is still racy. Going back to this:

int input = socket(AF_INET, SOCK_STREAM|SOCK_NONBLOCK, 0);
char foo;
struct sockaddr addr;
connect (input, &addr, sizeof(addr));
while (1) {
       if (read(input, &foo, 1) > 0) {
               (do something)
       } else {
		* SUSPEND OCCURS HERE *
               (draw bouncing cows and clouds and tractor beams briefly)
       }
}

A wakeup event now arrives. We use kernel level suspend blockers to 
prevent the system from going back to sleep until userspace has read the 
packet. The application finishes drawing its cows, reads the packet 
(thus releasing the kernel-level suspend block) and then immediately
reaches the end of its timeslice. At this point the application has not 
had an opportunity to indicate in any way whether or not the packet has 
altered its constraints in any way. What stops us from immediately 
suspending again?

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 11:02 am

> network packet arrives. The difference between these scenarios is large.

Your application seems to change desired outcome every email. Sorry but
you need to explicitly explain and define a policy in full that we can

Fine but then the packet will arrive and we will wake back up process the
packet and wake the task. See suspend is just like a deep sleep. If we
went to sleep there and the packet arrival doesn't rewake the box the

If my app level constraint before the packet is 'don't suspend' then my
constraint on receipt is 'don't suspend' so I won't suspend. If my
constraint is then lowered and I suspend I will suspend *after* the
lowering.

If my constraint is tightened then the decision to tighten is run under
the previous constraint. Which is fine, because if I have a case where I
must tighten my constraint within the tight constraint time I've screwed
up my app and need to fix it.

In reality almost all your userspace is going to look like 'trusted app'
or 'untrusted app' in droidthink and won't be transitioning in user space
(but may well be adding/losing kernel constraints)

This is good because it's another thing app authors don't have to care
about. It's good because apps can be run trusted/untrusted without
recompiling.

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:12 am

Yes, there's no problem so far. The question is how you guarantee that 

If the app level constraint before the packet was "don't suspend", we 
wouldn't have suspended. Here's a description of the desired behaviour 
of the application as you requested:

The application is a network monitoring app that renders server state 
via animated bouncing cows. The desired behaviour is that the 
application will cease to be scheduled if the session becomes idle 
(where idle is defined as the system having received no user input for 
30 seconds) but that push notifications from the server still be 
received in order to allow the application to present the user with 
critical alerts.

Under Android:

User puts down phone. 30 seconds later the screen turns off and releases 
the last user-level suspend block. The phone enters suspend and the 
application is suspended. A network packet is received, causing the 
network driver to take a suspend block. The application finishes the 
frame it was drawing, takes its own suspend block and reads the network 
packet. In doing so the network driver releases its suspend block, but 
since userspace is holding one the phone stays awake. The application 
then handles the event as necessary, releases its suspend block and the 
phone goes to sleep again.

I don't see how this behaviour can be emulated in your model.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 11:48 am

Because the application has said that it wants QoS guarantees. It wants
to know that if the events it can receive occur it will wake up within
the timescale.

So it receives the event. It's now running, not idle so we won't suspend
it.

At some point in time it will become idle again _or_ the CPU limit will
get it.


If the app is not trusted then we might suspend it before it handles the
packet. That is fine, it's not trusted and we will do that when we decide
it's in the system's interest to suspend it anyway.

You still don't lose the event because on resume the task will do its

This is a bit confusing - does the screen come back on for such events,
what constraints is the server operating under ? How does your code look
- it's hard to imagine the examples you've given as being workable given
they would block on network packet wait when a critical event occurs.

User puts down phone. 30 seconds later the X server decides to turn the
screen off and closes the device. This probably releases the constraint
held via the display driver not to suspend. Any further draw requests will
block.

System looks at the other tasks and sees they are idle and can sink to a
low power state. Cows is either blocked on a packet receive or could even
be blocked on writing to the display (or both if it's a realistic example
and using poll)

Everyone is idle, we can sleep

The kernel looks at the constraints it has
	- must not sink to a state below which network receive of packets
	  fails
	- must not sink below a state where whatever is needed for the
	  critical alert code etc to do its stuff
	- must not sink to a state which takes more than [constraint]
	  seconds to get back out of

It picks 'opportunistic suspend'
It goes to sleep

A packet arrives
It wakes the hardware
We are busy, we do not wish to suspend
It processes the packet
It wakes the user app
It starts processing the packet
[We are busy, we do not wish to suspend]

Presumably your display server listens to waking ...
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:56 am

It's code that's broadly identical to what I posted. The screen will 

Even if it's using poll, it could block purely on the display if X turns 

If it's blocked on the write then it only starts processing the packet 
again if the screen wakes up. You need to power up every piece of 
hardware that an application's blocked on, just in case they need to 
complete that read or write in order to get back to the event loop where 
they have the opportunity to read the network packet.

So, yes, I think this can work in that case. But it doesn't work in 
others - you won't idle applications that aren't accessing hardware 
drivers.

As an aside, I think this is a good idea in any case since a fringe 
benefit is the removal of the requirement to use the process freezer in 

The problem is determining how to constrain it to go idle, where "idle" 
is defined as "Doesn't wake up until a wakeup event is received". It's 
acceptable for something to use as much CPU as it wants when the user is 
actively interacting with the device, but in most cases processes 
shouldn't be permitted to use any CPU when the session is idle. The 
question is how to implement something that allows a CPU-guzzling 
application to be idled without impairing its ability to process wakeup 
events.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 12:25 pm

From your literal description:

setpriority, signals, process groups.

	kill(-desktopgroup, SIGSTOP);
	kill(-desktopgroup, SIGCONT);
	kill(pid_i_am_crit_eventing, SIGCONT);

or SIGTSTP might be friendlier as a well behaved smart app can catch it,
fire it into the event loop and elegantly save and sleep.

Some window managers played with doing setpriority for focussed windows.
OLPC does the same thing for OOM targets via /proc/<pid>/oom_adj

The scheduler can happily do this, the power management will also
recognize STOPPED processes as no impediment to suspend.
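
For reference, a self-contained sketch of the process group variant. The
child here is just a stand-in for a desktop application, and the function
name stop_and_continue_group is made up for this example; the calls
themselves are standard POSIX.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child into its own process group, stop the group, then continue it.
 * Returns 0 if both state changes were observed, -1 otherwise.
 */
int stop_and_continue_group(void)
{
	int status, ok = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		setpgid(0, 0);		/* child: become group leader */
		for (;;)
			pause();	/* stand-in for a desktop app */
	}
	assert(pid > 0);		/* parent only from here on */
	setpgid(pid, pid);		/* also set it here to avoid a startup race */

	/* A negative pid addresses the whole process group. */
	if (kill(-pid, SIGSTOP) == 0 &&
	    waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status) &&
	    kill(-pid, SIGCONT) == 0 &&
	    waitpid(pid, &status, WCONTINUED) == pid && WIFCONTINUED(status))
		ok = 1;

	kill(pid, SIGKILL);		/* clean up the demo child */
	waitpid(pid, &status, 0);
	return ok ? 0 : -1;
}
```

Note that SIGSTOP cannot be caught or blocked, which is why SIGTSTP is the
friendlier choice for applications that want to save state first.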




--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 12:29 pm

But wakeup events won't be delivered to STOPped processes, and there's 
also the race of an application being in the middle of handling a wakeup 
event when you send it the signal.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 12:53 pm

On Thu, 27 May 2010 20:29:26 +0100

Try the following

	cat <pipe
	kill -STOP catpid

	echo "wombats are cool" > pipe
	kill -CONT catpid

it will echo "wombats are cool"
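
The same experiment as a C sketch, assuming a forked child that plays the
role of cat. The function name stopped_reader_demo and the 200 ms probe are
choices made for this example only.

```c
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if data written while the reader was stopped is echoed only
 * after SIGCONT, 0 if not, -1 on setup errors.
 */
int stopped_reader_demo(void)
{
	static const char msg[] = "wombats are cool";
	int to_child[2], from_child[2], status, early;
	char buf[32];
	ssize_t got;
	pid_t pid;

	if (pipe(to_child) < 0 || pipe(from_child) < 0)
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* child: copy input to output, like "cat <pipe" */
		ssize_t n;

		close(to_child[1]);
		close(from_child[0]);
		while ((n = read(to_child[0], buf, sizeof(buf))) > 0)
			if (write(from_child[1], buf, (size_t)n) != n)
				_exit(1);
		_exit(0);
	}
	assert(pid > 0);
	close(to_child[0]);
	close(from_child[1]);

	kill(pid, SIGSTOP);
	waitpid(pid, &status, WUNTRACED);	/* wait until really stopped */
	if (write(to_child[1], msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		return -1;

	/* While the reader is stopped nothing is echoed back. */
	struct pollfd p = { .fd = from_child[0], .events = POLLIN };
	early = poll(&p, 1, 200);

	kill(pid, SIGCONT);			/* the queued data is now delivered */
	got = read(from_child[0], buf, sizeof(buf));

	close(to_child[1]);			/* EOF lets the child exit */
	waitpid(pid, &status, 0);
	close(from_child[0]);
	return early == 0 && got == (ssize_t)sizeof(msg) &&
	       memcmp(buf, msg, sizeof(msg)) == 0;
}
```

The data sits in the pipe while the reader is stopped and is consumed once
the reader is continued, so the event is queued rather than dropped.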


sigmask()

--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 1:11 pm

Not lost, but not delivered. So you need your policy agent to send 
SIGCONT when you receive any wakeup event, which either means proxying 
all your network traffic through your policy agent or having some 
mechanism for alerting the policy agent whenever you leave the deep idle 

Doesn't help - I may be hit by the signal between the poll() unblocking 
and me having the opportunity to call sigmask().

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 1:53 pm

ppoll(). This is all existing solved stuff.
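
A minimal sketch of the ppoll() pattern, assuming SIGTSTP is the stop
request (SIGSTOP itself cannot be blocked). The helper names block_tstp,
wait_readable and read_event are hypothetical; ppoll() is Linux-specific.

```c
#define _GNU_SOURCE		/* ppoll() is Linux-specific */
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Block SIGTSTP for normal execution; the previous mask is saved in *old. */
int block_tstp(sigset_t *old)
{
	sigset_t set;

	sigemptyset(&set);
	sigaddset(&set, SIGTSTP);
	return sigprocmask(SIG_BLOCK, &set, old);
}

/*
 * Wait until fd is readable. wait_mask is installed atomically for the
 * duration of the wait only, so a SIGTSTP that is unblocked in wait_mask
 * can stop the task while it sleeps but not while it handles an event.
 * Returns what ppoll() returns: 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, const sigset_t *wait_mask, int timeout_ms)
{
	struct pollfd p = { .fd = fd, .events = POLLIN };
	struct timespec ts = {
		.tv_sec = timeout_ms / 1000,
		.tv_nsec = (timeout_ms % 1000) * 1000000L,
	};

	assert(timeout_ms >= 0);
	return ppoll(&p, 1, &ts, wait_mask);
}

/* Read one byte once it arrives; SIGTSTP stays blocked while it is handled. */
ssize_t read_event(int fd, const sigset_t *wait_mask, char *out)
{
	if (wait_readable(fd, wait_mask, 1000) != 1)
		return -1;
	return read(fd, out, 1);
}
```

Because the mask switch and the wait happen in one system call, there is no
window between the wait returning and the signal being blocked again.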

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 2:08 pm

I thought it was pretty obvious that wakeup events had to actually be 

Isn't that the inverse of what we want? The application should default 
to being SIGSTOPpable except in the case of it being in the process of 
having a specific event delivered.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Zygo Blaxell
Date: Thursday, May 27, 2010 - 12:32 pm

That's what fcntl(fd, F_SETFL, O_NONBLOCK) is for.
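
Spelled out a little more carefully: passing O_NONBLOCK alone to F_SETFL
replaces the other settable status flags, so the usual idiom reads the
current flags first. A sketch, with set_nonblocking and try_read_byte as
made-up helper names:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Add O_NONBLOCK to the existing status flags; 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Try to read one byte without blocking.
 * Returns 1 if a byte was read, 0 if nothing is pending (or EOF),
 * -1 on any other error.
 */
int try_read_byte(int fd, char *out)
{
	ssize_t n;

	assert(out != NULL);
	n = read(fd, out, 1);
	if (n == 1)
		return 1;
	if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
		return -1;
	return 0;
}
```

With the descriptor in this mode the event loop can poll the device and go
back to its other work instead of sleeping inside read().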

--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 8:32 am

Thanks for providing this example:

  1) It proves that suspend blockers are solely designed to encourage
     people to code crap.

  2) It exposes the main conceptual problem of the approach:

     The input layer in the kernel magically takes a suspend blocker
     and releases it in an equally magic way just to allow the crappy
     application to reach the point where it takes its own suspend
     blocker and can react on the user input.
     
     And you need to do that, because the user applications suspend
     blocker magic is racy as hell. To work around that you sprinkle
     your suspend blocker magic all over the kernel instead of telling
     people how to solve the problem correctly.

     And what are you achieving with this versus power saving ?

     	 Exactly _NOTHING_ !

     Simply because you move the cow drawing CPU time from the point
     where the device wants to go into suspend to the point where the
     user hits a key again. You even delay the reaction of your app to
     the user input by the time it needs to finish drawing cows.
 
     So you need to come up with a way better example why you need

Wrong. If your application is interactive then you set the QoS
requirement once to interactive and be done.

So the correct point to make a power state decision is when the app
waits for a key press. At this point the kernel can take several
paths:

      1) Keep the system alive because the input device is in active
       	 state and a key press is expected

      2) Go into suspend because the input device is deactivated after
      	 the screen lock kicked in.

This behaves exactly the same way in terms of power consumption as
your blocker example just without all the mess you are trying to
create.

And it allows the kernel to use intermediate power saving methods
(between idle and suspend) which might be available on some hardware.

Thanks,

	tglx
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 8:52 am

No. Suspend blockers are designed to ensure that suspend isn't racy with 
respect to wakeup events. The bit that mitigates badly written 
applications is the bit where the scheduler doesn't run any more.

If you're happy with a single badly written application being able to 
cripple your power management story, you don't need opportunistic 
suspend. But you still have complications when it comes to deciding to 

What /is/ the correct way to solve this problem when entering explicit 
suspend states? How do you guarantee that an event has been delivered to 
userspace before transitioning into suspend? Now, this is a less 
interesting problem if you're not using opportunistic suspend, but it's 


That's no good. If the input device has been deactivated, how does the 

And means that wakeup events don't get delivered. That's a shortcoming.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 9:16 am

How does having applications taking blockers fix that - it makes it



If the input device is letting itself get de-activated in a way that can
lose events the input device driver is buggy. It's nobody else's business
how it does its job, and certainly *not* the application's.

That's a kernel internal issue.

You know the resource constraint exists because the driver knows it is
open
Your QoS guarantees tell you what your desired latency of response at the
point you can become ready is.

That's all you need to do it right.

In kernel yes your device driver probably does need to say things like
'Don't go below C6 for a moment' just as a high speed serial port might
want to say 'Nothing over 10mS please'

I can't speak for Thomas, but I'm certainly not arguing that you don't
need something that looks more like the blocker side of the logic *in
kernel*, because there is stuff that you want to express which isn't tied
to the task.

So you need

	Userspace -> QoS guarantee expression, implied resource
			expression via device use. *NO* knowledge of
			device or platform in the application

	Kernel space 

		Drivers -> Explicit guarantee expression not bound to
			tasks. Driver encapsulates the variety in the
			device hardware and expresses it in a uniform
			manner to the idling/suspend logic

		CPU Freq -> Encapsulates the variety in the CPU and core
			power functionality of devices, makes policy
			based upon the uniform expression from the drivers
			and tasks

All the autonomy is now in the right places, and we have requisite variety
to actually manage the situation.

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 9:19 am

Sure, if you're not using opportunistic suspend then I don't think 
there's any real need for the userspace side of this. The question is 
how to implement something with the useful properties of opportunistic 
suspend without implementing something pretty much equivalent to
the userspace suspend blockers. I've sent another mail expressing why I 
don't think your proposed QoS style behaviour provides that.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 10:04 am

Opportunistic suspend is just a deep idle state, nothing else. If the
overall QoS requirements allow to enter that deep idle state then the
kernel goes there. Same decision as for all other idle states. You
don't need any user space blocker for this decision, just sensible QoS
information.

Stop thinking about suspend as a special mechanism. It's not - except
for s2disk, which is an entirely different beast.

Thanks,

	tglx
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:07 am

No. The useful property of opportunistic suspend is that nothing gets 

On PCs, suspend has more in common with s2disk than it does C states.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:13 am

I think what Alan and Thomas, and certainly I, are saying is that you can get to
the same state without suspend.

Either you suspend (forcefully don't schedule stuff), or you end up
blocking all tasks on QoS/resource limits and end up with an idle system
that goes into a deep idle state (aka suspend).

So why isn't blocking every task on a QoS/resource good enough for you?
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:16 am

Because you may then block them in such a way that they never handle an 
event that should wake them.
 
-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:20 am

*blink*, do explain?

Suppose X (or whatever windowing system) will block all clients that try
to draw when you switch off your screen.

How would we not wake them when we do turn the screen back on and start
servicing the pending requests again?

Pretty much the same for everything else, input events, WoL etc..


--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:25 am

How (and why) does the WoL (which may be *any* packet, not just a magic 
one) turn the screen back on?

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:28 am

Why would you care about the screen for a network event?
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:32 am

Because the application that needs to handle the network packet is 
currently blocked trying to draw something to the screen.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:35 am

Then that's an application bug right there, isn't it?

It should have listened to the window server telling its clients it was
going to go away. Drawing after you get that is your own damn fault ;-)
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:41 am

How long do you wait for applications to respond that they've stopped 
drawing? What if the application is heavily in swap at the time?

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:46 am

Since we're talking about a purely idle driven power saving, we wait
until the cpu is idle.

Note that it doesn't need to broadcast this, it could opt to reply with
that message on the first drawing attempt after it goes away and block
on the second.
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:52 am

If that's what you're aiming for then you don't need to block 
applications on hardware access because they should all already have 

That's more interesting, but you're changing semantics quite heavily at 
this point.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:56 am

Correct, a well behaved app would have. I thought we all agreed that

So?
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:59 am

Ok. So the existing badly-behaved application ignores your request and 
then gets blocked. And now it no longer responds to wakeup events. So 
you penalise well-behaved applications without providing any benefits to 
badly-behaved ones.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 11:06 am

Uhm, how again is blocking a badly behaved app causing harm to the well
behaved one?

The well behaved one didn't get blocked and is still happily waiting (on
its own accord, in sys_poll() or something) for something to happen, if
it would get an event it'd be placed on the runqueue and do its thing.
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:17 am

It's blocked on the screen being turned off. It's supposed to be reading 
a network packet. How does it ever get to reading the network packet?

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 11:22 am

It's blocked because it's a buggy app, who cares about misbehaviour in a
buggy app?

If it were a proper app it wouldn't have gotten blocked and would've
been able to receive the network packet.

I thought we'd already been over this?


--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:31 am

So why bother blocking? Just kill the app and tell the user. If you want 
to support suboptimal apps then blocking isn't sufficient. If you don't 
want to then blocking isn't necessary.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 12:06 pm

On Thu, 27 May 2010 19:17:58 +0100

That's a stupid argument. If you write broken code then it doesn't work.
You know if I do

	ls < unopenedfifo

it blocks too.

There is a difference between dealing with apps that overconsume
resources and arbitrarily broken code (which your suspend blocker case
doesn't fix either but makes worse).

Can we stick to sane stuff ?
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 2:03 pm

On Thu, 27 May 2010 18:59:20 +0100

I don't see how you put the first two sentences together and get the
final one.

When you beat up badly behaved apps that doesn't penalise well behaved
ones.

Forcing "well behaved apps" to make hundreds of extra calls to a complex
blocker interface that also requires tons of kernel code and requires the
application know platform policy and be recompiled if it changes - now
that is punishing well behaved apps.

A well behaved app should just work using standard existing APIs because
that is how all the standard current well behaved apps are written [1].

Alan
--
[1] I'm dying to see the suspend blocker patch for evolution ;)
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 2:06 pm

If you're going to block an app on drawing then you either need to 
reenable drawing on wakeup or you need to have an interface for alerting 
the app to the fact that drawing is about to block and it should get 
back to its event loop. The first is suboptimal, the second penalises 
well behaved apps.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 11:12 am

Very realistic scenario on a mobile phone. 

--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:18 am

If the aim is to produce a solution that isn't special-cased to specific 
devices, thinking about how long an application may take to respond is 
entirely relevant.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 2:37 pm

On Thu, 27 May 2010 18:25:10 +0100

Well on my laptop today it works like this

A WoL packet arrives
The CPU resumes
Deep process, chipset and laptop BIOS magic happens
The kernel gets called
The kernel lets interested people know a resume occurred
The X server sees this
X reconfigures the display
X redraws the display (either by sending everyone expose events or by
keeping the bits, not sure how it works this week as it has changed over
time)

My desktop re-appears


Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 2:36 pm

No it doesn't. The kernel continues executing anything that was on the 
runqueue before the scheduler stopped. If you're using idle-based 
suspend then there's nothing on the runqueue - the application that 
should be scheduled because of the event is blocked on writing to the 
screen.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 2:56 pm

On Thu, 27 May 2010 22:36:35 +0100

Would you like to come and watch my laptop resume ? With printk's if you
want. You appear at this point to be arguing that bumble bees can't

IFF its your bogus example
IFF you don't have any task waiting for resume notifications (ie its not
X)

So take the PC desktop case and for simplicity sake lets assume the X
application in question has either filled the socket (unlikely) or is mid
query request so blocked on the socket.

The important line then is

'The kernel lets interested people know a resume occurred'

Interested people include X
X therefore ends up on the runqueue
X gets the display back in order
X completes processing the outstanding X request and replies
The application continues

If I was blocked on say serial output then the resume is going to wake
the serial driver, which will transmit the queue, which will wake the app.

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 3:08 pm

The kernel performs no explicit notification to userspace. With legacy 
graphics setups you'll get a VT switch, but X is entirely unaware that 
that's due to suspend and that's going away in any case. On a typical 
setup it's not even the kernel that does the VT switch to and from X - 
that's handled by a script that happens to be on the runqueue. So yeah, 
things kind of work as you suggest right now - but only by accident, not 
design. What you're describing requires a new interface to inform 
interested bits of userspace whenever you transition from your deep idle 
state into a less deep one.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 3:32 pm

> The kernel performs no explicit notification to userspace. With legacy 

For a PC ACPI using type device /proc/acpi/events which wakes acpid which
wakes gnome-power-manager which wakes half the universe

Do we need a better, more generic version of the events files? Maybe, but
that's a rather different kettle of fish to suspend blockers.

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 3:35 pm

It's a requirement for any reasonable alternative approach.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 10:30 am

Nothing gets scheduled in a deep idle state either - it's idle. We leave
the idle state to schedule anything.

I believe the constraint is

- Do not auto-enter a state for which you cannot maintain the devices in
  use "properly".

On a current PC that generally means 'not suspend', on a lot of embedded
boards (including Android phones) it includes an opportunistic 'suspend'

Today's PCs are a special case. More to the point I don't think anyone is
expecting opportunistic suspend to be useful on _today's_ x86 systems.

Even on today's PCs your assumption is questionable for virtual machines
where a VM suspend is a lot faster and rather useful.

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:26 am

Certainly, if you can force the system to be idle then you don't need 
opportunistic suspend. But you haven't shown how to do that without it 
being racy.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Felipe Balbi
Date: Thursday, May 27, 2010 - 10:18 am

Hi,


agree completely with you. Adding virtual differences between power 
states is a bad idea and causes unnecessary complication to the system. 
If we have a generic way of describing desired latencies (irq, wakeup, 
throughput, whatever), then the kernel should decide what's the best 
power state for the current situation.

-- 
balbi

DefectiveByDesign.org
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 10:00 am

I'm not opposed, but yes it needs to be expressed in quantifiable
terms, i.e. wakeup latency. That's just contributing to the global QoS
state of affairs even if it is not tied to a particular task. 

And that allows the driver to be intelligent about it. The serial port
at 9600 has definitely different requirements than at 115200.

But that's quite a different concept than the big hammer approach of
the blockers.

Thanks,

	tglx

--

From: Zygo Blaxell
Date: Thursday, May 27, 2010 - 11:35 am

I have a pile of use cases where I want to turn off "implied resource
expression via device use."  There are two orthogonal variables to
consider:

1.  I'm drawing cows on the screen (or asking another process to do so
on my behalf).

2.  I care whether anyone can actually see the cows, and I'm willing
(or not) to burn power to make them visible.

Quite often, I'm drawing cows but I don't care about cow visibility,
so I would tell PM to turn the display off when the PM framework is
looking for ways to conserve power; however, if the animated cow is part
of an alarm clock application, then I want the display on, powering it
up if it was previously turned off.

A real-world example of this is a backup process on a file server.
I'd like to tell the kernel that the backup process's CPU usage and
disk I/O is *not* implied resource expression, and if there's no other
processes using the CPU or disks, the kernel can just power down the
drives or idle the CPU on a whim.  The backup process can hang until
some other process comes along to wake the drives and CPU up again,
and then the backups will run during the idle time while the drive
is waiting for new requests from other processes.  Obviously if the
backup process is trying to write dirty pages to a powered-down drive
there will be problems (memory starvation and lost data come to mind),
so I'd make sure I don't do that.

I'd also like to change my mind about these sorts of things on the fly,
without requiring hooks in the backup process itself.  I'm thinking
of a syscall with PID, FD, mode bits (read/write?  iowait/runnable?),
and policy (whether usage implies expression).

I can express mostly the same things if "policy" was "maximum latency,"
but not all.  Consider how you'd have to specify latencies to get hard
disks that spin down when idle, spin up immediately if read requests are
issued, but wait several minutes to spin up if write requests are issued.
I can't specify that with a single latency value since it would ...
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 9:45 am

Wrong. Setting the QoS requirements of the badly written app to any
latency will allow the kernel to suspend even if the crappy app is
active.

And again. I'm opposing the general chant that fixing crappy
applications in the kernel is a good thing. It's the worst decision we

Holy crap. If an event happens _before_ we go into an idle state - and
I see suspend as a deeper idle state - then we do not go there at all.

The whole notion of treating suspend to RAM any different than a plain
idle C-State is wrong. It's not different at all. You just use a
different mechanism which has longer takedown and wakeup latencies and
requires to shut down stuff and setup extra wakeup sources.

And there is the whole problem. Switching from normal event delivery
to those special wakeup sources. That needs to be engineered in any
case carefully and it does not matter whether you add suspend blockers

So what's the f*cking point ? You draw exactly the same amount of

That's utter nonsense. If we have a problem with missed wakeups then
it needs to be fixed and not papered over with suspend blocker magic.

I'm starting to get really grumpy about the chant that suspend
blockers are the only way to fix missed wakeups. That might be the
only way you can think of with your pink android glasses on, but again
this is not rocket science even if it does not fit into the current
way the kernel handles the whole suspend mechanism.

So if we really sit back and look at suspend as another idle state,
then we have first off the same requirements for entering it as we
have for any other idle state:

     No running tasks (and we can solve the don't care task problem
     nicely with QoS)

Aside of that we need to bring devices into a quiescent state and
setup the wakeup sources. That switch over needs to be done with and
without suspend blockers in a careful way for each SoC
implementation. 

If the interrupt happens _BEFORE_ we switch over to the quiescent
state, then we need to backout. If it ...
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 9:59 am

You still need the in-kernel suspend blockers if you want to guarantee 
that you can't lose wakeup events. But yes, if you're not concerned with 
handling badly behaved applications then I believe that you can lose 

My question was about explicit suspend states, not implicitly handling 
an identical state based on scheduler constraints. Suspend-as-a-C-state 
isn't usable on x86 - you have to explicitly trigger it based on some 
policy. And if you want to be able to do that without risking the loss 


There are various platforms where we cannot treat suspend as an idle 
state. Any solution that requires that doesn't actually solve the 
problem. Yes, this is *trivial* if you can depend on the scheduler. But 
we can't, and so it's difficult.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 10:15 am

No, we do not. We need correctly implemented drivers and a safe

And why not ? Just because suspend is not implemented as an ACPI
C-state ? 

Nonsense, if we want to push the system into suspend from the idle
state we can do that. It's just not implemented and we've never tried
to do it as it requires a non trivial amount of work, but I have done
it on an ARM two years ago as a proof of concept and it works like a

Crap. Stop beating on those lost wakeup events. If we lose them then
the drivers are broken and do not handle the switch over correctly. Or
the suspend mechanism is broken as it does not evaluate the system
state correctly. Blockers are just papering over that w/o tackling the

Stop handwaving. Which platforms prevent us to go into suspend from
idle ? Please point me to the relevant documentation which says so.

Just because we have not tried to implement it does not mean that we
cannot implement it.

Thanks,

	tglx


--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:23 am

What is a "Correctly implemented driver" in this case? One that receives 
a wakeup event and then prevents suspend being entered until userspace 
has acknowledged that event? Because that's what an in-kernel suspend 

ACPI provides no guarantees about what level of hardware functionality 
remains during S3. You don't have any useful ability to determine which 
events will generate wakeups. And from a purely practical point of view, 
since the latency is in the range of seconds, you'll never have a low 

Ger;kljaserf;kljf;kljer;klj. Suspend blockers are the mechanism for the 
driver to indicate whether the wakeup event has been handled. That's 
what they're there for. The in-kernel ones don't paper over anything.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:26 am

If all of userspace is blocked on devices, WTH is keeping us from
hitting it?
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 10:49 am

Kernel side maybe - but even then its a subset of expressing
latency/lowest level requirements. That bit isn't really too contentious.

So PCs with current ACPI don't get opportunistic suspend capability. It

Semantically the in kernel blockers and the in kernel expression of
device driven constraints are the same thing except that instead of 
yes/no you replace the boolean with information.


So we go from

	block_suspend() / unblock_suspend()

to
	add_pm_constraint(latency, level) 
	remove_pm_constraint(latency, level);


And if Android chooses to interpret that in its policy code as

	if (latency > MAGIC)
		suspend_is_cool();
	else
		suspend_isnt_cool();

that's now isolated in droidspace policy

Alan


--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:50 am

Actually, the reverse - there's no terribly good way to make PCs work 
with scheduler-based suspend, but there's no reason why they wouldn't 

In some cases, not all. It may be a latency constraint (in which case 
pm_qos is an appropriate mechanism), but instead it may be something 
like "A key was pressed but never read" or "A network packet was 
received but not delivered". These don't fit into the pm_qos model, but 
it's state that you have to track.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 11:17 am

I never mentioned pm_qos, just latency *and* knowing what suspend states
are acceptable. You need both.

Alan
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:20 am

Not at all. The entire point of opportunistic suspend is that I don't 
care whether something is currently in TASK_RUNNABLE or has a timer that's due to expire 
in 100msec - based on policy (through not having any held suspend 
blockers), I'll go to sleep. That's easily possible on PCs.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 12:09 pm

Yes I appreciate what suspend blockers are trying to do. Now how does
that connect with my first sentence ?
--

From: Rafael J. Wysocki
Date: Thursday, May 27, 2010 - 2:55 pm

I guess what Matthew wanted to say was that you couldn't use ACPI S3 as
a very deep CPU idle state, because of the way wakeup sources are set up
for it, while you could use it for aggressive power management with suspend
blockers as proposed by Arve.

Rafael
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 3:20 pm

On Thu, 27 May 2010 23:55:13 +0200

Which is a nonsense. Because the entire Gnome desktop and KDE, and
OpenOffice and Firefox and friends would need fitting out with
suspend blockers.

x86 hardware is moving to fix these problems (at least on handheld
devices initially). Look up the C6 power idle, and S0i1 and S0i3
standby states. I reckon the laptop folks can probably get the hardware
fixed well before anyone can convert the entire PC desktop to include
blockers.

Alan
--

From: Rafael J. Wysocki
Date: Thursday, May 27, 2010 - 4:50 pm

To clarify, I'm not suggesting to spread suspend blockers all over the
universe.

Rafael
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 11:18 am

How does that solve the problems you mentioned above ? Wakeup
guarantees, latencies ...

It's not a proof of the technical correctness of the approach if it
can provide a useless functionality.
 
Thanks,

	tglx
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:23 am

Latency doesn't matter because we don't care when the next timer is due 
to expire. Wakeup guarantees can be provided via the suspend blocker 
implementation.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 12:59 pm

On Thu, 27 May 2010 19:23:03 +0100

In your specific current implementation. It matters a hell of a lot in
most cases.

Alan
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 10:59 am

Right, it does not as of today. So we cannot use that on x86
hardware. Fine. That does not prevent us to implement it for
architectures which can do it. And if x86 comes to the point where it
can handle it as well we're going to use it. Where is the problem ? If
x86 cannot guarantee the wakeup sources it's not going to be used for
such devices. The kernel just does not provide the service for it, so

So the driver says: events have been handled. Neat.

Now if that crappy app does not use the user space blockers or is not
allowed to use them, then what are you guaranteeing ? All you
guarantee is that the application has emptied the input queue. Nothing
else. And that's the same as setting the QoS guarantee to NONE.

An application which uses the blocker is just holding off the system
from going into deep idle. Nothing which cannot be done today.

So the only thing you are imposing to app writers is to use an
interface which solves nothing and does not save you any power at
all. 

If the application drops the blocker after processing the event and
before it goes back to the read event then you just postpone the CPU
usage and therefore power consumption to a later point in time instead
of going back to the blocking read right away. 

Again what is it saving ?  NOTHING!  And for nothing you want to mess
up the kernel big time ?

Runnable tasks and QoS guarantees are the indicators whether you can
go to opportunistic suspend or not. Everything else is just window
dressing.

Thanks,

	tglx
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 11:26 am

We were talking about PCs. Suspend-as-c-state is already implemented for 


As I keep saying, this is all much less interesting if you don't care 
about handling suboptimal applications. If you do care about them then 
the Android approach works. Nobody has demonstrated a scheduler-based 
one that does.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 11:53 am

Ah, now we talk about PCs. And all of a sudden the problem of the
inability to determine wakeup sources is no longer relevant ? So
how do you guarantee that we don't miss one if we can't figure out

Demonstrated ? Care to explain to me how it makes a difference:

while (1) {
  block();
  read();
  process_event();
  unblock();
		/* ---> suspend */
		/* <--- resume */
  do_crap();	/* 1000000 cycles */
}

vs.

while (1) {
  read();
		/* ---> suspend */
		/* <--- resume */
  process_event();
  do_crap();	/* 1000000 cycles */
}

You spend the damned 1000000 cycles in any case just at a different
point in time. So if you are so convinced and have fully understood
all the implications, please enlighten me why do_crap() costs less
power with the blockers approach.

And you are also stubbornly refusing to answer my analysis about the
effect on apps which do not use the blocker or are not allowed to.

1) The kernel blocker does not guarantee that the lousy app has
   processed the event. It just guarantees that the lousy app has
   emptied the input queue. So what's the point of the kernel blocker
   in that case ?

2) What's the difference on setting that app to QoS(NONE) and let the
   kernel happily ignore it.

Come up with real explanations and numbers and not just the "it has
been demonstrated" chant which is not holding water if you look at the

That does not make the android approach any better. They should have
talked to us upfront and not after the fact. Just because they decided
to do that in their google basement w/o talking to people who care is
not proving that it's a good solution and even less a reason to merge
it as is.

The kernel history is full of examples where crappy solutions got
rejected and kept out of the kernel for a long time even if there was
a need for them in the application field and they got shipped in
quantities with out of tree patches (NOHZ, high resolution timers,
...). At some point people stopped arguing for crappy solutions and
sat down and got it right. The ...
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 12:06 pm

A wakeup event is defined as one that wakes the system - if a system 
can't be woken by a specific event then it's impossible to lose it, 


Yes, I think you're right here. You need the userspace component as well 

What sets that flag, and how does it interact with an application that 


Yes, and I'd agree with this if anyone seemed to have any idea how to do 
it right. But despite all the heat and noise in this thread, all we've 
got is an expression that it should be handled by the scheduler (a 
viewpoint I agree with) without any explanation of how to switch 
policies in such a way that applications idle without losing wakeup 
events.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Thomas Gleixner
Date: Thursday, May 27, 2010 - 1:23 pm

Which I consider in the same range as the application which does:


  QoS(NONE) would be default policy for untrusted apps in the same way
  as you'd refuse the usage of suspend blockers for untrusted apps.
  
  while (1); is definitely not an application which should be granted
  trusted rights, right ?

  block_suspend(); while(1);
  
  is the same as:

  QoS(minimal latency); while(1);

  So if you really go to trust a "while (1);" application you better
  make sure that this app has the appropriate QoS(NONE) or QoS(10s)

  Numbers, yes. But I really give a sh*t about numbers w/o a detailed
  explanation of the scenario which created those numbers. And if the
  numbers boil down to: we handle the untrusted app which does "while

Why in the world should they lose wakeup events ? If an app goes idle,
it goes idle because it is waiting for an event. There is no other
sane reason except for those apps which are untrusted and force
idled. And for those you agreed already the suspend blockers don't
solve anything as they are either not implemented or the app is not
trusted to use them.

So we are back to round one of the whole discussion:

   Is it worth to create an utter mess in the kernel just to handle a
   subset of crappy coded applications ?

And the answer stays the same: No, not at all.

Thanks,

	tglx
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 1:38 pm

The problem is that, right now, if a wakeup event is received between 
the point where userspace decides to start a suspend and userspace 

Not at all. Depending on what it reads, it may follow some other path 
where it sleeps. But, as I keep saying, if you don't want to support 

No, but if that while (1) is draw_cows() then the user may want this to 
run while their session is active and stop running while their session 
is idle. So you only want it to be QoS(NONE) in the idle session case. 

The tested case was a stock Android install with opportunistic suspend 
enabled and one that just used runtime idle. The lowest power state 

You need suspend blockers to avoid losing wakeups in the explicit 
suspend case even if you don't want to implement opportunistic suspend.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Alan Cox
Date: Thursday, May 27, 2010 - 1:03 pm

"You" were talking about PCs. Some of us are interested in the making
Linux do the right thing not adding platform specific hacks all over the


I don't believe the Android one does either. It maybe handles a subset in

I would point you at the web, cgi scripts and the huge Linux server farms
fielding billions of hits per second on crap cgi scripts.

That doesn't mean the Android one is the right approach. Nobody has
explained to me how you don't get synchronization effects in Android or
indeed answered several of the questions pointing out holes in the
Android model. The fact we are at rev 8 says something too - that the
Android 'proof' isn't old or tested either !

Alan
--

From: Pavel Machek
Date: Monday, June 21, 2010 - 8:57 am

Hi!


We'll need ACPI extensions, then (or very conservative

I did 'sleepy linux' prototype on PC, and yes I was able to get to
'once in 5 seconds' wakeup rate... good enough...
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Florian Mickler
Date: Thursday, May 27, 2010 - 10:21 am

On Thu, 27 May 2010 18:45:25 +0200 (CEST)

Ok, I just don't know the answer: How is it just another idle state if
the userspace gets frozen? Doesn't that bork the whole transition and
you need a userspace<->kernel synchronisation point to not lose events?

Cheers,
Flo
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:25 am

There is no userspace to freeze when the runqueues are empty.

And as explained, you won't lose events if all the devices do a proper
state transition. To quote:


--

From: Florian Mickler
Date: Thursday, May 27, 2010 - 10:42 am

On Thu, 27 May 2010 19:25:27 +0200

If in the imaginary situation where userspace can acquire certain
wakeup-constraints and lose certain wakeup-constraints, then it could

I believe the problem is userspace being frozen at an inopportune time.
So the wakeup event is processed (kernel-side) but userspace didn't
have time to reacquire the correct wakeup-constraint to process the
event.

I.e. the wakeup will be effectively ignored.


Cheers,
Flo
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 10:52 am

No. Wakeup constraints simply delay wakeups, not lose them.

If there's something runnable, we run it.

What could happen is that you try to program a timer every 5ms, but then
QoS won't let you and errors the timer_create() or something. But then

How so, event happens on hardware level, IRQ gets raised, CPU wakes up,
handler gets run, handler generates a task wakeup, runqueue isn't empty,
we run stuff.

I'm not quite sure how to lose something there.
--

From: Matthew Garrett
Date: Thursday, May 27, 2010 - 10:54 am

If you're using idle-based suspend without any forced idling or blocking 
of applications then you don't lose wakeups. People keep conflating 
separate issues.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Peter Zijlstra
Date: Thursday, May 27, 2010 - 11:02 am

I still don't see how blocking applications will cause missed wakeups in
anything but a buggy application at worst, and even those will
eventually get the event when they unblock.

What seems to be the confusion?
--

From: Vitaly Wool
Date: Thursday, May 27, 2010 - 12:42 am

That's /wrong/. What if you have an active download ongoing when the
screen is off? This ugly simplistic approach is one of the worst
things in Android.

~Vitaly
--

From: Arve Hjønnevåg
Date: Thursday, May 27, 2010 - 1:05 am

On android we have code that blocks suspend while downloading. On
non-android systems I have used if the download has not finished by
the time the auto-sleep timeout kicks in, the system will suspend and
the download halts.

-- 
Arve Hjønnevåg
--

From: Ben Gamari
Date: Thursday, May 27, 2010 - 7:09 pm

Suspend blockers are only a flawed and indirect way to keep the vampire

He's getting at the fact that there are much better ways to deal with
this problem. The issue here is that we seem to be expected to swallow
whatever Google throws at us, regardless of the quality of the
solution. It seems like the best argument we have for merging is "we
couldn't think of anything better and we need it yesterday." This might be
a good enough reason for shipping, but it certainly doesn't satisfy the

It is absolutely not. If you want to keep power usage down, then
implement real resource management in the scheduler. Suspend blockers
are nothing but a clunky and ineffective means of resource allocation.
As has been pointed out in this thread, there are much better ways of
dealing with this problem.

- Ben
--

From: Florian Mickler
Date: Friday, May 28, 2010 - 12:03 am

On Thu, 27 May 2010 22:09:37 -0400

I don't disagree on the quality. But I don't think it is because of the
patches, but because of how the kernel is architected in that area
(suspend not being an idle state).

Look, probably suspend needs to be integrated into the idle states and
used from there. I could imagine a cost-specification for idle states:

c3
	cost-to-transition-to-this-state: X 
	powersavings-per-time: Y
	expected time we stay in this state: relatively short, there is a
	timer scheduled
	suspend-blockers: ignored

suspend 
	cost-to-transition-to-this-state: depends, how much drivers to
	suspend, how much processes to freeze, ...
	powersavings-per-time: Y
	expected time we stay in this state: long, independent of
	scheduled timers
	suspend-blockers: need not be activated

Now, a governor could compute if it is ok, to enter suspend or only
wait for idle-c3. And maybe it would never transition from idle-c3 to
suspend but only from c1. because the cost to enter suspend would mean
it just has to go to c1 anyway.


I think this has to be independent of the scheduler, because as soon
as the user interacts with the phone, everything needs to be scheduled.
even the stuff that doesn't directly interact with the user.
as soon as _nothing_ interacts with the user, the phone does schedule
_nothing_ anymore.

Cheers,
Flo
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:06 am

I was not talking about our user-space code. Suspend has to be called
by a running thread, so at least one runqueue is not empty.

-- 
Arve Hjønnevåg
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 3:09 am

But why would you need to suspend if the machine is fully idle?

Is it because you're using broken hardware that has lower power
consumption in suspend state than in idle?

Couldn't you make the runtime-pm smarter and utilize the suspend states?
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:25 am

I don't think runtime-pm is relevant here. We don't use suspend to
power down devices that are not in use, we use suspend to enter system
power states that we cannot enter from idle, and on systems where the
same power state can be used from idle and suspend, we use suspend so
we can stay in the low power state for minutes to hours instead of
milliseconds to seconds.

-- 
Arve Hjønnevåg
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 3:32 am

So don't you think working on making it possible for systems to be idle
_that_ long would improve things for everybody, as opposed to this
auto-suspend, which only improves matters for those that (can) use it?


--

From: Brian Swetland
Date: Wednesday, May 26, 2010 - 3:40 am

As we've stated a number of times in the several weeks of discussion
(this time around) of this patchset, we are all in favor of improving
runtime pm, finding and resolving issues that prevent idle, and
heading toward ever lower power states in idle -- after all, this
benefits our battery life in the cases when the system is not
suspended as well as moving us closer to a future where the power
savings between actively entering suspend and not doing so approach
zero.  Aggressively entering the lowest possible power state at all
times is our goal here.

At the moment, the power savings from opportunistic suspend do
directly lead to improved battery life, and there are some advantages
to this model in the face of a non-optimal userspace (as we encounter
in a world where there are no restrictions on what applications users
may install and run).

Brian
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:40 am

I'm not preventing anyone from working on improving this. Currently
both the kernel and our user-space code poll way too much. I don't
think it is reasonable to demand that no one should run any user-space
code with periodic timers when we have not even fixed the kernel to
not do this.

-- 
Arve Hjønnevåg
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 3:49 am

All I'm saying is that merging a stop-gap measure will decrease the
urgency and thus the time spent fixing the actual issues while adding
the burden of maintaining this stop-gap measure.
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:53 am

Fixing the actual issue means fixing all user-space code, and
replacing most x86 hardware. I don't think keeping this feature out of
the kernel will significantly accelerate this.

-- 
Arve Hjønnevåg
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 4:12 am

I don't think x86 is relevant anyway, it doesn't suspend/resume anywhere
near fast enough for this to be usable.

My laptop still takes several seconds to suspend (Lenovo T500), and
resume (aside from some userspace bustage) takes the same amount of
time. That is quick enough for manual suspend, but I'd hate it to try
and auto-suspend.

Getting longer idle periods however is something that everybody benefits
from. On x86 we're nowhere close to hitting the max idle time of the
platform, you get _tons_ of wakeups on current 'desktop' software.

But x86 being a PITA shouldn't stop people from working on this. There are
plenty of other architectures out there; I remember fixing a NO_HZ bug with
davem on sparc64 because his Niagara had cores idling for very long times
indeed.

So yes, I do think merging this will delay the effort in fixing
userspace, simply because all the mobile/embedded folks don't care about
it anymore.
--

From: Alan Cox
Date: Wednesday, May 26, 2010 - 5:35 am

This is an area where machines are improving and where the ability to
do stuff like autosuspend, the technology like the OLPC screen and so on
create an incentive for the BIOS and platform people to improve their

The mobile space probably doesn't care too much about many of the large
bloated desktop apps anyway and traditional embedded generally has a very
small fixed application set where they optimise both halves of the system
together.

Alan
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 5:53 am

But do you think it's a sensible thing to do? Explicitly not running
runnable tasks just sounds wrong. Also, at the extreme end, super fast
suspend is basically an efficient idle mode.

Why would the code holding suspend blockers be any more or less
important than any other piece of runnable code?

In fact, having runnable but non suspend blocking tasks around will
delay the completion of the suspend blocker, so will we start removing
those?

This whole thing introduces an artificial segregation of code. My 'cool'
code is more important than your 'uncool' code. Without a clear

Sure, but at least we share the kernel. It was said that the kernel
generates too many wakeups (and iwlagn certainly is the topmost waker
on my laptop). Improvements to the kernel will benefit all, regardless
of whatever userspace we run.


--

From: Zygo Blaxell
Date: Wednesday, May 26, 2010 - 1:18 pm

With my userspace developer hat on, I'd *kill* for a way to tell
the kernel that there are more important things for the system to
be doing than executing my runnable task.  In some cases, the set of
"more important things" might include running other tasks,
but it also might include conserving power.  I'd like to have my program
tell the kernel things like "wake me up in 0.1 seconds, plus or minus
a year if you have something better to do."
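
A request like "0.1 seconds, plus or minus a year" can be read as a timer
with a tolerance window. Below is a minimal sketch, assuming each request
is a (target, slack) pair in seconds measured from now, of how such
requests could be merged into as few wakeups as possible. The function
name coalesce_wakeups is hypothetical and does not correspond to any
kernel interface; the greedy rule (sort windows by their latest allowed
time, wake at that time only if no earlier wakeup already falls inside the
window) is the standard interval-covering technique.

```python
from typing import List, Sequence, Tuple


def coalesce_wakeups(requests: Sequence[Tuple[float, float]]) -> List[float]:
    """Return the fewest wakeup times that satisfy every (target, slack) request.

    A request is satisfied by any wakeup inside [target - slack, target + slack],
    with the lower bound clipped to 0 (we cannot wake up in the past).
    """
    windows = sorted(
        ((max(0.0, target - slack), target + slack) for target, slack in requests),
        key=lambda window: window[1],  # process windows by latest allowed time
    )
    wakeups: List[float] = []
    for earliest, latest in windows:
        # The last chosen wakeup is never later than `latest`, so it already
        # serves this request whenever it is not before `earliest`.
        if not wakeups or wakeups[-1] < earliest:
            wakeups.append(latest)
    return wakeups
```

For example, a strict timer at 0.1 s, a very tolerant timer at 0.1 s with
1000 s of slack, and two timers near 5 s with 0.5 s of slack each need
only two wakeups: the tolerant timer rides along with one of the others.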

With my sysadmin hat on (which is nearly identical to my phone owner hat,
BTW), I'd like whatever syscall implements those features to take a PID
argument, so I can impose my importance decisions on other processes.
I'd also like to set the relative importance of keeping the CPU idle on
the same scale, so that I could raise or lower the importance of keeping

It's impossible in the general case for an application to know whether
it's important or not, so it's also impossible for the kernel to derive
this information from the application's behavior--and impossible, in the
general case, to decide whether the application is more important than the
battery or some other scarce resource the kernel might also be managing
(e.g. if the machine is running hot, heat dissipation might be scarce,
and we'd want to be idle then too).  This is similar to niceness and
SCHED_RR/FIFO:  there's no way for the kernel to automatically assign
those values either, they have to be specified by a user or administrator.
Of course, programs are free--within limits--to specify these values
about themselves.

Consider a traditional Unix program like "sort".  Seriously, how is "sort"
supposed to know that it's the most important application on the system
(because I need my contacts list alphabetized *now*), or the least
(because the screensaver needs to know which is the oldest graphics
hack in the list)?

"sort" gets invoked from a shell, cooperating with other processes to do
its work.  It knows very little about the context in which it is ...
--

From: Arve Hjønnevåg
Date: Wednesday, May 26, 2010 - 3:52 pm

Why? If your suspend is currently set to sleep after 30 minutes of
inactivity, you can still have the same setting with opportunistic
suspend. With opportunistic suspend you can have an alarm set to run a
task at a specific time without risking that this task does not run at
that time just because your inactivity timer expired at the same time

To me this is not a good argument for not merging the code. If people
stop caring about the problem if this feature is merged that means it
solved a problem for them. You want to prevent merging a feature
_because_ it solves a problem.

-- 
Arve Hjønnevåg
--

From: Vitaly Wool
Date: Wednesday, May 26, 2010 - 4:23 am

But if this feature gets merged, I bet you'll find another 100 reasons
to not fix the actual issue. I wouldn't say so if you hadn't provided
the irrelevant points already, like "replacing x86 hardware". You're
trying to merge the approach which makes the bad way of handling things
the easiest way. This shouldn't get in as it is IMO.

~Vitaly
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 1:45 am

So you're going to merge this junk?


--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 2:40 am

On Wed, 26 May 2010 10:45:33 +0200

Yes. By now, everyone reading the posts should know all points.
Rafael obviously was part of this discussion and came to the decision
to merge it.

My take on the discussion:
_IF_ you want to suspend aggressively, I don't see another
way.

The thing is, this is a paradigm change. Suspend is no longer
controlled by userspace. In order to let userspace control/work with
this scheme, it needs to know when a suspend will be successful or
poll:

1. kernel sees suspend may be possible on its side of things

2. kernel sends a message to userspace that it may be
possible to suspend, but it may well be that by the time
userspace suspends it is not possible anymore

3. userspace decides to suspend. 

<- system suspends...  or not ..-> 

4. userspace retries ... retries ... retries ... 

And then you have the whole can of worms and races.



Or you have the suspend-blocker scheme:

1. kernel sees suspend is possible.
2. kernel suspends.
3. bingo.
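
The second scheme can be sketched as a toy model in which the release of
the last blocker is itself the event that triggers suspend, so nobody has
to guess and retry. This is an illustration only: the SuspendGate class
and its method names are hypothetical and are not the API of the patch
series under discussion.

```python
class SuspendGate:
    """Toy model: suspend happens exactly when the last blocker is released."""

    def __init__(self) -> None:
        self._active = set()    # names of currently held blockers
        self.suspend_count = 0  # how many times the model "suspended"

    def block(self, name: str) -> None:
        """Hold a blocker; suspend is impossible while any blocker is held."""
        self._active.add(name)

    def unblock(self, name: str) -> bool:
        """Release a blocker; return True if this release triggered suspend."""
        self._active.discard(name)
        if self._active:
            return False
        self.suspend_count += 1  # stands in for actually entering suspend
        return True
```

Holding two blockers and releasing them one after the other suspends once,
on the second release; the first release returns False.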

Cheers,
Flo
--

From: Peter Zijlstra
Date: Wednesday, May 26, 2010 - 2:54 am

I don't see any races, nor retry loops.

There is always the race of an event arriving whilst in the process of
suspending, that is not solved by either the kernel or the user part of
suspend-blockers. The only thing is not to lose the event.

You simply have to deal with that: the suspend gets canceled, you
deal with the event, and suspend again. How does making that 'retry', as
you call it, happen from a kernel thread or from a userspace thread make
any difference?


--

From: Florian Mickler
Date: Wednesday, May 26, 2010 - 4:35 am

On Wed, 26 May 2010 11:54:37 +0200

What about the worms?  :)

You have a point there. But what follows?

You either need to let userspace know that the kernel is now able to
suspend or you let the kernel know that userspace is now able to
suspend.
Else you cannot make a well-informed suspend-decision and have to
guess and retry.

Why not look at blocking and unblocking as these events you want
to have? Without wiggle room and retrying.

And not having to route through userspace simplifies the auto-suspend
scheme further.

Cheers,
Flo
--
