Sometimes, application responses become bad under heavy memory load.
Applications take a bit time to reclaim memory.
The statistics, how long memory reclaim takes, will be useful to
measure memory usage.This patch adds accounting memory reclaim to per-task-delay-accounting
for accounting the time of try_to_free_pages().<i.e>
- When System is under low memory load,
memory reclaim may not occur.$ free
total used free shared buffers cached
Mem: 8197800 1577300 6620500 0 4808 1516724
-/+ buffers/cache: 55768 8142032
Swap: 16386292 0 16386292$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 5069748 10612 3014060 0 0 0 0 3 26 0 0 100 0
0 0 0 5069748 10612 3014060 0 0 0 0 4 22 0 0 100 0
0 0 0 5069748 10612 3014060 0 0 0 0 3 18 0 0 100 0Measure the time of tar command.
$ ls -s test.dat
1501472 test.dat$ time tar cvf test.tar test.dat
real 0m13.388s
user 0m0.116s
sys 0m5.304s$ ./delayget -d -p <pid>
CPU count real total virtual total delay total
428 5528345500 5477116080 62749891
IO count delay total
338 8078977189
SWAP count delay total
0 0
RECLAIM count delay total
0 0- When system is under heavy memory load
memory reclaim may occur.$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 7159032 49724 1812 3012 0 0 0 0 3 24 0 0 100 0
0 0 7159032 49724 1812 3012 0 0 0 0 4 24 0 0 10...
On Tue, 3 Jun 2008 19:38:25 -0700
Could you move this to vmscan.c do_try_to_free_pages().
Then, memory reclaim from memory resource controller can be catched as well.Thanks,
-Kame--
Why don't you cc'ed delayacct developer nor linux-mm?
--
On Wed, 04 Jun 2008 12:15:48 +0900
Oops. I forgot to add delayacct developer to CC.
I'll do it.But are you sure about linux-mm?
--
Shouldn't we consider memcgroup reclaim?
--
thanks for pointing this.
Unfortunately, we're not so familiar to memcgroup.
Will look into this.
Could you tell us pointers to memcgroup?Thanks,
Hiroshi Shimamoto
--
Documentation/controllers/memory.txt is good guide :)
<abstract>
try_to_free_pages(): global reclaim entry point
try_to_free_mem_cgroup_pages(): memcgroup reclaim entry point
do_try_to_free_pages(): common layerUnfortunately, I don't know your requirement and
memcgroup reclaim shold be mesured.
I hope you explain your purpose and benefit more.Thanks.
--
thanks, and playing with this is good too.
I see the point that, on memcgroup enabled system, there are
memcgroup memory reclaim points and try_to_free_mem_cgroup_pages()
is called when the page charge reach the limit.
So if we want to account delay for memory reclaim, we should
account at both of try_to_free_pages() and try_to_free_mem_cgroup_pages().Ah, sure.
We have an issue that a process which accesses a DB sometime slows down.
We want to know what causes it, without the reproducer.
We have only the report of the problem, memory usage, vmstat...
And, I don't know the memory reclaim really causes the problem.If we can see what kind of delay is the matter in the test bed, it will
help us to take a next action.Thanks,
Hiroshi Shimamoto--
Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
---
fs/proc/array.c | 3 ++-
include/linux/delayacct.h | 10 ++++++++++
kernel/delayacct.c | 11 +++++++++++
3 files changed, 23 insertions(+), 1 deletions(-)diff --git a/fs/proc/array.c b/fs/proc/array.c
index 9e3b8c3..a3e6e86 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -498,7 +498,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,seq_printf(m, "%d (%s) %c %d %d %d %d %d %u %lu \
%lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \
-%lu %lu %lu %lu %lu %lu %lu %lu %d %d %u %u %llu %lu %ld\n",
+%lu %lu %lu %lu %lu %lu %lu %lu %d %d %u %u %llu %llu %lu %ld\n",
pid_nr_ns(pid, ns),
tcomm,
state,
@@ -544,6 +544,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
task->rt_priority,
task->policy,
(unsigned long long)delayacct_blkio_ticks(task),
+ (unsigned long long)delayacct_freepages_ticks(task),
cputime_to_clock_t(gtime),
cputime_to_clock_t(cgtime));
if (mm)
diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h
index f352f06..226b754 100644
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -41,6 +41,7 @@ extern int __delayacct_add_tsk(struct taskstats *, struct task_struct *);
extern __u64 __delayacct_blkio_ticks(struct task_struct *);
extern void __delayacct_freepages_start(void);
extern void __delayacct_freepages_end(void);
+extern __u64 __delayacct_freepages_ticks(struct task_struct *);static inline int delayacct_is_task_waiting_on_io(struct task_struct *p)
{
@@ -121,6 +122,13 @@ static inline void delayacct_freepages_end(void)
__delayacct_freepages_end();
}+static inline __u64 delayacct_freepages_ticks(struct task_struct *tsk)
+{
+ if (tsk->delays)
+ return __delayacct_freepages_ticks(tsk);
+ return 0;
+}
+
#else
static inline void delayacct_set_flag(int flag)
{}
@@ -147,6 +155,8 @@ static inline void ...
userland program oftern below access.
thus, this patch break userland compatibility.----------------------------------------
#!/usr/bin/perl$stat = `cat /proc/$pid/stat`;
split
@stat_list = split(/ / , $stat);
print stat_list[$index];
^
|
use array index number--
On Wed, 04 Jun 2008 12:07:29 +0900
Any application which breaks if we add more things to /proc/pid/stat
just needs to be fixed.But we can't add new fields in the middle of the line! We can only add
new items to the end.But I think it would be good if we just weren't to make this change.
Do we really want to keep reporting the same information in two
separate ways?If so, a new /proc/pid/delays would be a better way, but it might be a
bit late for doing that now, given that we already put
delayacct_blkio_ticks into /proc/pid/stat.--
On Wed, 4 Jun 2008 15:41:24 -0700
I agree. There is no strong reason with two method.
I'll drop this one from patch series next time.--
It's my fault. I suggested him to group delayacct ticks.
I know this breakage, but was not sure what it affects.
I thought that if it'll be a problem we get a claim such you did.Thanks,
Hiroshi Shimamoto
--
On Wed, 04 Jun 2008 12:07:29 +0900
Certainly, it makes sense.
I thought this measurement value should be put
beside other delay account.
--
Yeah, I'm looking forward to your v2 :)
Thanks!
--
Update document and
make getdelays.c show delay accounting of memory reclaim.For making a distinction between "swapping in pages" and "memory reclaim"
in getdelays.c, MEM is changed to SWAP.Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
---
Documentation/accounting/delay-accounting.txt | 11 ++++++++---
Documentation/accounting/getdelays.c | 8 ++++++--
2 files changed, 14 insertions(+), 5 deletions(-)diff --git a/Documentation/accounting/delay-accounting.txt b/Documentation/accounting/delay-accounting.txt
index 1443cd7..8a12f07 100644
--- a/Documentation/accounting/delay-accounting.txt
+++ b/Documentation/accounting/delay-accounting.txt
@@ -11,6 +11,7 @@ the delays experienced by a task while
a) waiting for a CPU (while being runnable)
b) completion of synchronous block I/O initiated by the task
c) swapping in pages
+d) memory reclaimand makes these statistics available to userspace through
the taskstats interface.
@@ -41,7 +42,7 @@ this structure. See
include/linux/taskstats.h
for a description of the fields pertaining to delay accounting.
It will generally be in the form of counters returning the cumulative
-delay seen for cpu, sync block I/O, swapin etc.
+delay seen for cpu, sync block I/O, swapin, memory reclaim etc.Taking the difference of two successive readings of a given
counter (say cpu_delay_total) for a task will give the delay
@@ -94,7 +95,9 @@ CPU count real total virtual total delay total
7876 92005750 100000000 24001500
IO count delay total
0 0
-MEM count delay total
+SWAP count delay total
+ 0 0
+RECLAIM count delay total
0 0Get delays seen in executing a given simple command
@@ -108,5 +111,7 @@ CPU count real total virtual total delay total
6 4000250 4000000 0
IO count delay total
0 0
-MEM count delay total
+SWAP count delay total
+ 0 0
+RECLAIM count delay total
0 0
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c
index 4012...
Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
---
include/linux/taskstats.h | 4 ++++
kernel/delayacct.c | 3 +++
2 files changed, 7 insertions(+), 0 deletions(-)diff --git a/include/linux/taskstats.h b/include/linux/taskstats.h
index 5d69c07..87aae21 100644
--- a/include/linux/taskstats.h
+++ b/include/linux/taskstats.h
@@ -81,6 +81,10 @@ struct taskstats {
__u64 swapin_count;
__u64 swapin_delay_total;+ /* Delay waiting for memory reclaim */
+ __u64 freepages_count;
+ __u64 freepages_delay_total;
+
/* cpu "wall-clock" running time
* On some architectures, value will adjust for cpu time stolen
* from the kernel in involuntary waits due to virtualization.
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 84b6782..b3179da 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -145,8 +145,11 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp;
tmp = d->swapin_delay_total + tsk->delays->swapin_delay;
d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp;
+ tmp = d->freepages_delay_total + tsk->delays->freepages_delay;
+ d->freepages_delay_total = (tmp < d->freepages_delay_total) ? 0 : tmp;
d->blkio_count += tsk->delays->blkio_count;
d->swapin_count += tsk->delays->swapin_count;
+ d->freepages_count += tsk->delays->freepages_count;
spin_unlock_irqrestore(&tsk->delays->lock, flags);done:
--
1.5.0.6
--
Please write patch desctiption :)
--
Add members for memory reclaim delay to taskstats,
and accumulate them in __delayacct_add_tsk() .Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
---
On Wed, 04 Jun 2008 11:59:04 +0900Thank you for your comment.
Here is update version.include/linux/taskstats.h | 4 ++++
kernel/delayacct.c | 3 +++
2 files changed, 7 insertions(+), 0 deletions(-)diff --git a/include/linux/taskstats.h b/include/linux/taskstats.h
index 5d69c07..87aae21 100644
--- a/include/linux/taskstats.h
+++ b/include/linux/taskstats.h
@@ -81,6 +81,10 @@ struct taskstats {
__u64 swapin_count;
__u64 swapin_delay_total;+ /* Delay waiting for memory reclaim */
+ __u64 freepages_count;
+ __u64 freepages_delay_total;
+
/* cpu "wall-clock" running time
* On some architectures, value will adjust for cpu time stolen
* from the kernel in involuntary waits due to virtualization.
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 84b6782..b3179da 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -145,8 +145,11 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp;
tmp = d->swapin_delay_total + tsk->delays->swapin_delay;
d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp;
+ tmp = d->freepages_delay_total + tsk->delays->freepages_delay;
+ d->freepages_delay_total = (tmp < d->freepages_delay_total) ? 0 : tmp;
d->blkio_count += tsk->delays->blkio_count;
d->swapin_count += tsk->delays->swapin_count;
+ d->freepages_count += tsk->delays->freepages_count;
spin_unlock_irqrestore(&tsk->delays->lock, flags);done:
--
1.5.0.6
--
| Tarkan Erimer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Greg Kroah-Hartman | [PATCH 004/196] Chinese: add translation of SubmittingPatches |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
| Eric Sandeen | Re: [RFC] Heads up on sys_fallocate() |
git: | |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| David Miller | [GIT]: Networking |
| Antonio Almeida | HTB accuracy for high speed |
