On Mon, 2007-08-10 at 10:22 -0400, Jeff Garzik wrote:

If you can get the scheduling/dequeuing to run on one CPU (as we do today) it should work. Alternatively, you can totally bypass the qdisc subsystem and go direct to the hardware for devices that are capable; that would work but would require huge changes. My fear is that there are mini-scheduler pieces running on multiple CPUs, which is what I understood was being described.

Sounds like strict prio scheduling to me, which says "if low prio starves, so be it".

Does putting things in the same core help? But overall I agree with your views.

I think I see the receive path with a lot of clarity; I am still foggy on the transmit path, mostly because of the qos/scheduling issues.

In fact, even with the status quo there's a case that can be made to not bind to interrupts. In my recent experience with batching, due to the nature of my test app, if I let the interrupts float across multiple CPUs I benefit. My app runs/binds a thread per CPU and so benefits from having more juice to send more packets per unit of time, something I wouldn't get if I was always running on one CPU. But when I do this, I found that just because I have bound a thread to cpu3 doesn't mean that thread will always run on cpu3. If netif_wakeup happens on cpu1, the scheduler will put the thread on cpu1 if it is to be run. It made sense to do that; it just took me a while to digest.

There would be cache benefits if you can free the packet on the same CPU it was allocated on, so the idea of skb affinity is useful at a minimum in that sense, if you can pull it off. Assuming the hardware is capable, even if you just tagged the skb on xmit to say which CPU it was sent out on, and made sure that's where it is freed, that would be a good start.

Note: The majority of the packet processing overhead is _still_ the memory subsystem latency; in my tests with batched pktgen, improving the xmit subsystem meant the overhead of allocating and freeing the packets went to something > 80%.
So IMO something along the lines of parallelizing the alloc/free of skbs, split across more CPUs than those where xmit/receive run, would see more performance improvement.

cheers,
jamal
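
Editor's sketch: the skb-affinity idea in the message above (tag a packet with the CPU that allocated it, and make sure it is freed there) can be illustrated with a toy user-space model. This is not kernel code and not the author's implementation; `Packet`, `PerCpuPool`, `alloc_cpu`, and the per-CPU return queues are all hypothetical names invented for illustration, and CPUs are simulated as plain integers.

```python
from collections import deque


class Packet:
    """Toy stand-in for an skb; alloc_cpu is a hypothetical affinity tag."""

    def __init__(self, alloc_cpu):
        self.alloc_cpu = alloc_cpu


class PerCpuPool:
    """Toy model: each simulated CPU owns a free list of packets."""

    def __init__(self, ncpus):
        # Packets ready for reuse, touched only by the owning CPU.
        self.free_lists = [deque() for _ in range(ncpus)]
        # Packets freed on another CPU, waiting to go back to their owner.
        self.return_queues = [deque() for _ in range(ncpus)]
        self.remote_frees = 0

    def alloc(self, cpu):
        # First reclaim anything other CPUs handed back to this CPU.
        self.free_lists[cpu].extend(self.return_queues[cpu])
        self.return_queues[cpu].clear()
        if self.free_lists[cpu]:
            return self.free_lists[cpu].pop()
        return Packet(alloc_cpu=cpu)

    def free(self, pkt, cpu):
        if cpu == pkt.alloc_cpu:
            # Same CPU that allocated it: the cache-friendly case.
            self.free_lists[cpu].append(pkt)
        else:
            # Completion ran elsewhere: route the packet back to its owner
            # instead of freeing it on the wrong CPU.
            self.remote_frees += 1
            self.return_queues[pkt.alloc_cpu].append(pkt)


pool = PerCpuPool(ncpus=4)
pkt = pool.alloc(cpu=3)      # allocated (and tagged) on cpu3
pool.free(pkt, cpu=1)        # completion happened to run on cpu1
reused = pool.alloc(cpu=3)   # cpu3 gets its own packet back
```

The design choice here is to defer the free to the owning CPU via a return queue; a real implementation would also need to weigh the cost of that cross-CPU handoff, which this model does not measure.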
