Christoph Lameter wrote:

Thanks, Christoph, for doing this.

I believe we can restore the pre-2.6.25 performance level with little change.

[The problem is that in 2.6.25, UDP memory accounting forced us to add a callback to sock_def_write_space() at skb TX completion time. This function then wakes up all threads blocked in the recvfrom() syscall. Once awakened, the threads block again because no frame was received.]

Davide Libenzi added an opaque 'key' argument to wakeups so that eventpoll can avoid unnecessary wakeups. This infrastructure could be used on other paths. (The most important one is this one, receivers, because writers are rarely blocked on a full send buffer.)

commit 37e5540b3c9d838eb20f2ca8ea2eb8072271e403
Author: Davide Libenzi <davidel@xmailserver.org>
Date:   Tue Mar 31 15:24:21 2009 -0700

    epoll keyed wakeups: make sockets use keyed wakeups

    Add support for event-aware wakeups to the sockets code. Events are
    delivered to the wakeup target, so that epoll can avoid spurious wakeups
    for non-interesting events.

commit 2dfa4eeab0fc7e8633974f2770945311b31eedf6

    epoll keyed wakeups: teach epoll about hints coming with the wakeup key

    Use the events hint now sent by some devices, to avoid unnecessary wakeups
    for events that are of no interest for the caller. This code handles both
    devices that are sending keyed events, and the ones that are not (and even
    the ones that sometimes send events, and sometimes don't).

We can add support for these keys to the regular socket code, so that a process waiting on receive won't be scheduled because a TX completion occurred.
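As a rough illustration of the idea, here is a small userspace model (not kernel code). The names struct waiter, wake_always(), wake_if_interested() and wake_up_keyed() are made up for this sketch, and the poll(2) event bits stand in for whatever the kernel would pass as the key. It only shows how a callback that inspects the key can leave a receiver asleep when the wakeup carries nothing but a write-space event:

```c
#include <assert.h>
#include <stddef.h>
#include <poll.h>

/* Userspace model only: mimics the shape of a wait queue entry. */
struct waiter;
typedef int (*wake_fn)(struct waiter *w, void *key);

struct waiter {
    wake_fn func;          /* callback invoked by wake_up_keyed() */
    unsigned long events;  /* events this waiter is interested in */
    int woken;             /* how many times it was actually woken */
    struct waiter *next;   /* next entry in the queue */
};

/* Models a callback that ignores the key: every wakeup wakes the waiter. */
static int wake_always(struct waiter *w, void *key)
{
    (void)key;
    w->woken++;
    return 1;
}

/*
 * Models a keyed callback: if the waker supplied a key and none of its
 * event bits match the waiter's interest, skip the wakeup. A NULL key
 * means "no hint", so the waiter is woken as before.
 */
static int wake_if_interested(struct waiter *w, void *key)
{
    if (key && !((unsigned long)key & w->events))
        return 0;
    w->woken++;
    return 1;
}

/* Walk the queue, hand the key to each callback, return how many woke. */
static int wake_up_keyed(struct waiter *head, void *key)
{
    int n = 0;

    for (struct waiter *w = head; w; w = w->next) {
        assert(w->func != NULL);
        n += w->func(w, key);
    }
    return n;
}
```

In this model, a waiter registered with wake_if_interested() and events = POLLIN stays asleep when wake_up_keyed() is called with a POLLOUT key (the TX completion case), but is still woken by a POLLIN key or by a NULL key from a waker that sends no hint.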
The standard way is to use autoremove_wake_function():

    int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key)
    {
            int ret = default_wake_function(wait, mode, sync, key);

            if (ret)
                    list_del_init(&wait->task_list);
            return ret;
    }

    /* this function ignores "key" argument */
    int default_wake_function(wait_queue_t *curr, unsigned mode, int sync, void *key)
    {
            return try_to_wake_up(curr->private, mode, sync);
    }

The new 'keyed' events can do better:

    static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *key)
    {
            int pwake = 0;
            unsigned long flags;
            struct epitem *epi = ep_item_from_wait(wait);
            struct eventpoll *ep = epi->ep;

            spin_lock_irqsave(&ep->lock, flags);
            ...
            /*
             * Check the events coming with the callback. At this stage, not
             * every device reports the events in the "key" parameter of the
             * callback. We need to be able to handle both cases here, hence the
             * test for "key" != NULL before the event match test.
             */
            if (key && !((unsigned long) key & epi->event.events))
                    goto out_unlock;
    }

I'll try to cook up a patch in the coming days, unless someone beats me to it :)

Thanks
