Accessing the VPD area can take a long time. There are comments in the
SysKonnect vendor driver that it can take up to 25ms. The existing vpd
access code fails consistently on my hardware.
Change the access routines to:
* use a mutex rather than spinning with IRQ's disabled and lock held
* have a longer timeout
* call schedule while spinning to provide some responsivness
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
--- a/drivers/pci/access.c 2008-09-02 10:42:12.000000000 -0700
+++ b/drivers/pci/access.c 2008-09-03 08:47:49.000000000 -0700
@@ -133,7 +133,7 @@ PCI_USER_WRITE_CONFIG(dword, u32)
struct pci_vpd_pci22 {
struct pci_vpd base;
- spinlock_t lock; /* controls access to hardware and the flags */
+ struct mutex lock;
u8 cap;
bool busy;
bool flag; /* value of F bit to wait for */
@@ -144,29 +144,30 @@ static int pci_vpd_pci22_wait(struct pci
{
struct pci_vpd_pci22 *vpd =
container_of(dev->vpd, struct pci_vpd_pci22, base);
- u16 flag, status;
- int wait;
+ u16 flag = vpd->flag ? PCI_VPD_ADDR_F : 0;
+ unsigned long timeout = jiffies + (vpd->flag ? HZ/50 : HZ/10);
+ u16 status;
int ret;
if (!vpd->busy)
return 0;
- flag = vpd->flag ? PCI_VPD_ADDR_F : 0;
- wait = vpd->flag ? 10 : 1000; /* read: 100 us; write: 10 ms */
- for (;;) {
- ret = pci_user_read_config_word(dev,
- vpd->cap + PCI_VPD_ADDR,
- &status);
- if (ret < 0)
- return ret;
+ while ( (ret = pci_user_read_config_word(dev,
+ vpd->cap + PCI_VPD_ADDR,
+ &status)) == 0) {
if ((status & PCI_VPD_ADDR_F) == flag) {
vpd->busy = false;
- return 0;
+ break;
}
- if (wait-- == 0)
+
+ if (time_after(jiffies, timeout))
return -ETIMEDOUT;
- udelay(10);
+ if (signal_pending(current))
+ return -EINTR;
+ schedule();
}
+
+ return ret;
}
static int pci_vpd_pci22_read(struct pci_dev *dev, int pos, int size,
@@ -183,7 +184,7 @@ static int pci_vpd_pci22_read(struct pci
if (size == 0)
return ...Wow, that's slow. If you were to try to read all 32k, it'd take more This should be: + if (mutex_lock_interruptible(&vpd->lock)) And the same here, of course. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
[...] This is fine for the sysfs case, but not if this is called during device probe - we don't want signals to modprobe to break device initialisation, do we? Ben. -- Ben Hutchings, Senior Software Engineer, Solarflare Communications Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. --
Probably only fatal signals -- in which case the if (signal_pending) check should be a fatal_signal_pending() and mutex_lock_interruptible() should be mutex_lock_killable(). OTOH, who's signalling modprobe to do anything other than die? -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
On Thu, 4 Sep 2008 15:19:46 +0100 It's bad but not that bad more details are: MIN MAX ------------------------------------------------------------------- write 1.8 ms 3.6 ms internal write cyles 0.7 ms 7.0 ms ------------------------------------------------------------------- over all program time 2.5 ms 10.6 ms read 1.3 ms 2.6 ms ------------------------------------------------------------------- over all 3.8 ms 13.2 ms Usable VPD is limited to 2K so worst case read is 27 seconds. Note: there doesn't appear to be an standard for VPD size register in Why not, it makes sense to allow killing a stuck modprobe. --
In some circumstances, you want to wait for an event to happen. let's
assume that it's a hardware event, so you can't just add a notifier of
some kind, you have to poll. Here's an example:
If there's no other task ready to run, schedule() could return in much
less than 10 microseconds (actually, it could return in much less than
10 microseconds even if another task does run, but let's ignore that case).
If schedule() returned whether or not it had scheduled another task, we
could do something like:
if (!schedule())
udelay(10);
Please consider this patch:
It can be useful to know whether a call to schedule() ran another task
or not. For example, if you're giving up the CPU while waiting for an
event to happen, you might want to delay if no other process wants the
CPU to ensure you're not just bashing away at a device that is unlikely
to have finished yet.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5270d44..39c6ef9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -329,7 +329,7 @@ extern signed long schedule_timeout(signed long timeout);
extern signed long schedule_timeout_interruptible(signed long timeout);
extern signed long schedule_timeout_killable(signed long timeout);
extern signed long schedule_timeout_uninterruptible(signed long timeout);
-asmlinkage void schedule(void);
+asmlinkage int schedule(void);
struct nsproxy;
struct user_namespace;
diff --git a/kernel/sched.c b/kernel/sched.c
index 04160d2..ba3ab9a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4337,8 +4337,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev)
/*
* schedule() is the main scheduler function.
+ * It returns 1 if we scheduled a new task and 0 if no other task was chosen.
*/
-asmlinkage void __sched schedule(void)
+asmlinkage int __sched schedule(void)
{
struct task_struct *prev, *next;
unsigned long *switch_count;
@@ -4411,6 +4412,7 @@ ...hm, i'm not really sure - this really just seems to be a higher prio variant of yield() combined with some weird code. Do we really want to promote such arguably broken behavior? If there's any chance of any polling to take a material amount of CPU time it should be event driven to begin with. Ingo --
Oh, I'm not concerned about CPU utilisation, I'm concerned about PCI bus utilisation. Perhaps I'd like a yield_timeout() function instead where I say that I'd like to not run for at least 10 microseconds? Can we do that, or are we still jiffie-based there? -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
On Thu, 4 Sep 2008 10:21:11 -0600 use schedule_hrtimerout() for this (hopefully will be in 2.6.28); see this weeks LWN for an article describing it --
OK, so something like:
struct timespec ts = { 0, 10 * 1000 };
set_task_state(TASK_INTERRUPTIBLE);
schedule_hrtimeout(&ts, HRTIMER_MODE_REL);
if (fatal_signal_pending())
return -EINTR;
should do the trick.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
--
On Thu, 4 Sep 2008 11:48:45 -0600 Never mind, I changed it to just yield() in revision. --
Gah, not another yield in the network code we have to figure out wtf its meant to do. --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
| < |
