Now that various servers support processing larger than 64K file
writes, it would be helpful to have pagevecs that are larger than 14
pages (PAGEVEC_SIZE).so an array of more pages could be passed to
kernel_sendmsg (thus allowing writes larger than 56K, depending on
wsize - the server limit is currently 8MB)pagevec_lookup_tag (and the inline function pagevec_space which I
don't use) is the main place that cifs needs a pagevec instead of
simply an array of pages and PAGEVEC_SIZE does not seem to matter
there (it works if I malloc a structure with more pages and request
more than 14 pages out of pagevec_lookup_tag). Is there another
better alternative other than defining a larger pagevec (pagevec2 or
cifs_pagevec)? Should I simply kmalloc something larger than a
pagevec and cast a (struct pagevec *) to it?static int cifs_writepages(struct address_space *mapping,
struct writeback_control *wbc)
{
@@ -1209,7 +1216,7 @@ static int cifs_writepages(struct address_space *mapping,
__u64 offset = 0;
struct cifsFileInfo *open_file;
struct page *page;
- struct pagevec pvec;
+ struct pagevec2 pvec;
int rc = 0;
int scanned = 0;
int xid;
@@ -1229,7 +1236,7 @@ static int cifs_writepages(struct address_space *mapping,
if (!experimEnabled)
return generic_writepages(mapping, wbc);- iov = kmalloc(32 * sizeof(struct kvec), GFP_KERNEL);
+ iov = kmalloc(maximum_pages_to_write_based_on_wsize *
sizeof(struct kvec), GFP_KERNEL);
if (iov == NULL)
return generic_writepages(mapping, wbc);@@ -1261,7 +1268,7 @@ retry:
while (!done && (index <= end) &&
(nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
PAGECACHE_TAG_DIRTY,
- min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1))) {
+ min(end - index,
...
We could make pagevecs variable-sized easily enough - add a `struct
page **' to it. But it's more overhead and this is the first time I've
seen demand for it?--
Hi.
kernel_sendmsg() requires iovecs, so each page will have to be mapped,
which can end up having too many slots in kmap table. I can reproduce
deadlock we talked about in POHMELFS thread with 100 highmem pages per
transaction on 32bit x86, so increasing pagevec noticebly can deadlock
CIFS too.It could be great if CIFS moved to kernel_sendpage() instead, Steve?
--
Evgeniy Polyakov
--
Sounds interesting - looks like the only user at the moment is NFS's
SunRPC right (so no other examples to look at)?How much better is this than send_msg?
For the write path (coming from writepage and writepages) - we have a
small header followed by a list of sequential pages (the servers
either support 4 pages (old windows), 15 pages (some windows and
NetApp etc. filers), or 31 pages (older Samba and other Windows) or
2048 pages (current Samba supports up to 8MB writes, and this may be
very useful now that Samba can call splice/receivefile and not have to
do the extra copy)--
Thanks,Steve
--
It depends. In usual life with not that big requests and 64bit arch (or
low mem pages on 32 bit x86) difference is marginal, but with high-pageskernel_sendpage() has only single page argument, but that should not be
a problem, since stack will combine multiple pages into single skb if
needed (if it can, especially for stuff like TSO/GSO), so just loop for
whatever you fetched pages via find_get_pages_tag() and send them
one-by-one, network stack will coalesce them by itself.But note that return of kernel_sendpage() does not guarantee that page
has been sent, one can free it (drop refcnt, page will not be freed,
since network stack increases counter), but can not write there until
reply from the server is returned, so just lock page and mark it is
being under writeback and then unlock and end_page_writeback() after
CIFSSMBWrite2() just like it is doen right now.Btw, should'n end_page_writeback() be called with locked page in CIFS?
Iirc in POHMELFS under huge load I was able to lose writeback clearing
probably because of that, but its subtle and can be not the issue is
CIFS.diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 8636cec..ba23c65 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1391,8 +1391,8 @@ retry:
if (rc)
SetPageError(page);
kunmap(page);
- unlock_page(page);
end_page_writeback(page);
+ unlock_page(page);
page_cache_release(page);
}
if ((wbc->nr_to_write -= n_iov) <= 0)--
Evgeniy Polyakov
--
| Sunil Naidu | Re: Linux 2.6.20-rc6 |
| Alan Cox | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Chris Snook | Re: init's children list is long and slows reaping children. |
| Greg Kroah-Hartman | [PATCH 001/196] Chinese: Add the known_regression URI to the HOWTO |
git: | |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| David Miller | [GIT]: Networking |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| Eric W. Biederman | Re: [PATCH 10/11] avoid kobject name conflict with different namespaces |
