Cc: Zhang, Yanmin <yanmin_zhang@...>, Andi Kleen <andi@...>, Matthew Wilcox <matthew@...>, LKML <linux-kernel@...>, Alexander Viro <viro@...>, Andrew Morton <akpm@...>, Thomas Gleixner <tglx@...>, H. Peter Anvin <hpa@...>, Alan Cox <alan@...>
Ok. tty write handling. Nasty. But not as nasty as the open/close code,
perhaps, and maybe we'll get it fixed some day.
In fact, I thought we had fixed most of this already, but hey, I was
clearly wrong. I assume Alan looks at it occasionally and groans. Alan?
Ok, signals being the top one, but that tty code is pretty high again.
No, it's not rmap contention. Your profile hits are just on the actual
calculations, and it's all data-dependent arithmetic and loads. Some cache
misses on the page tables, clearly, but it looks like a lot of it is even
just the plain arithmetic (the imul followed by a data-dependent 'lea'
instruction).
Some of it is that "page_to_pfn(page)", which involves a nasty division
(divide by sizeof(struct page)). It gets turned into that shift and
multiply, but it's still quite expensive with big constants etc.
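To make the "shift and multiply" concrete, here is a rough userspace sketch, not the kernel's actual code. The 56-byte fake_page struct, the fake_mem_map array and all helper names are made up for illustration; the real sizeof(struct page) depends on the config. Because a pointer difference is always an exact multiple of the element size, the divide can be done as a right shift by the power-of-two part followed by a multiply with the modular inverse of the odd part, and for 56 that inverse is a big 64-bit constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; 56 bytes is an assumption. */
struct fake_page { unsigned char pad[56]; };

/* Hypothetical flat array playing the role of mem_map. */
static struct fake_page fake_mem_map[1024];

/* Multiplicative inverse of an odd number modulo 2^64 (Newton iteration).
 * x = a is already correct to 3 bits because a*a == 1 (mod 8) for odd a;
 * each step doubles the number of correct bits, so 5 steps cover 64. */
static uint64_t inverse_odd(uint64_t a)
{
    uint64_t x = a;
    for (int i = 0; i < 5; i++)
        x *= 2 - a * x;
    return x;
}

/* The straightforward form: byte offset divided by the element size. */
static uint64_t pfn_by_division(const struct fake_page *base,
                                const struct fake_page *page)
{
    uint64_t bytes = (uint64_t)((uintptr_t)page - (uintptr_t)base);
    return bytes / sizeof(struct fake_page);
}

/* The same result as a shift plus a multiply. Only valid because the
 * byte offset is an exact multiple of the element size. */
static uint64_t pfn_by_shift_multiply(const struct fake_page *base,
                                      const struct fake_page *page)
{
    uint64_t bytes = (uint64_t)((uintptr_t)page - (uintptr_t)base);
    uint64_t odd = sizeof(struct fake_page);
    unsigned int shift = 0;

    /* Split the size into 2^shift * odd (56 = 8 * 7). */
    while ((odd & 1) == 0) {
        odd >>= 1;
        shift++;
    }
    return (bytes >> shift) * inverse_odd(odd);
}
```

A compiler does the split and the inverse at build time, so what is left at run time is the shift and an imul with a 64-bit immediate.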
Linus