linux-netdev mailing list

From | Subject | Date
Volker Armin Hemmann
build error with 2.6.27.6+reiser4+ehci-hub patch. ERROR: ...
Hi, with my old config from 2.6.27.5 (which has the same patches) I get this error: make all modules_install install <snip> OBJCOPY arch/x86/boot/setup.bin OBJCOPY arch/x86/boot/vmlinux.bin HOSTCC arch/x86/boot/tools/build BUILD arch/x86/boot/bzImage Root device is (9, 1) Setup is 11436 bytes (padded to 11776 bytes). System is 2180 kB CRC 54816509 Kernel: arch/x86/boot/bzImage is ready (#1) Building modules, stage 2. MODPOST 136 modules ERROR: "mii_ethtool_gset" ...
Nov 13, 4:45 pm 2008
Volker Armin Hemmann
Re: build error with 2.6.27.6+reiser4+ehci-hub patch. ER ...
and here is the config: # # Automatically generated make config: don't edit # Linux kernel version: 2.6.27.6 # Fri Nov 14 00:29:54 2008 # CONFIG_64BIT=y # CONFIG_X86_32 is not set CONFIG_X86_64=y CONFIG_X86=y CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig" # CONFIG_GENERIC_LOCKBREAK is not ...
Nov 13, 4:49 pm 2008
Alexey Dobriyan
Re: [PATCH v3] net: #ifdef inet_bind_bucket::ib_net
That's not mass usage. Mass usage is, say, s/dev_net/read_pnet/. You too. --
Nov 13, 4:21 pm 2008
Alexey Dobriyan
Re: [PATCH v3] net: #ifdef inet_bind_bucket::ib_net
It also makes no sense to expose write_pnet() for one(!) user and simultaneously hide read_pnet() under ib_net() as the committed patches do. Something is wrong with read_pnet(), as nobody suggested mass use of it or sent a patch doing it. #ifdef CONFIG_NET_NS ib->ib_net = net; #endif It's _obvious_ from this code that it's a C assignment or a nop. It's also obvious which config option it depends on. write_pnet(&ib->ib_net, net); What is & operator ...
Nov 13, 3:53 pm 2008
David Miller
Re: [PATCH v3] net: #ifdef inet_bind_bucket::ib_net
From: "Alexey Dobriyan" <adobriyan@gmail.com> '&' is necessary for the assignment. --
Nov 13, 4:03 pm 2008
Eric Dumazet
Re: [PATCH v3] net: #ifdef inet_bind_bucket::ib_net
You obviously didn't read my patches. How do you want to implement a C function that can write to ib->ib_net? Yes, I prefer a function over a macro, as most kernel developers do, for obvious reasons. The only sane way is: static inline void write_pnet(struct net **pnet, struct net *net) { *pnet = net; } and call write_pnet(&ib->ib_net, net); Strange, this is what I did. Take the time to read the patch, please. Thank you --
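The disagreement in this thread is easier to follow with both variants side by side. The following is a minimal userspace sketch, not the committed kernel code: `struct net`, `init_net` and the reduced `struct inet_bind_bucket` are stand-ins, and the macro fallback for the disabled case is an assumption made for illustration. It shows why the `&` is needed (the helper must write through a pointer to the field) and how the call site can stay free of `#ifdef`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: namespaces enabled. Remove this line to get the no-op variant. */
#define CONFIG_NET_NS 1

struct net { int id; };            /* stand-in for the kernel's struct net */
struct net init_net = { 0 };       /* stand-in for the initial namespace */

struct inet_bind_bucket {          /* heavily reduced for the sketch */
#ifdef CONFIG_NET_NS
    struct net *ib_net;            /* field only exists with CONFIG_NET_NS */
#endif
    unsigned short port;
};

#ifdef CONFIG_NET_NS
/* A real function: it needs the address of the field to assign to it. */
static inline void write_pnet(struct net **pnet, struct net *net)
{
    *pnet = net;
}

static inline struct net *read_pnet(struct net *const *pnet)
{
    return *pnet;
}
#else
/* Macros that never evaluate pnet, so the missing field is not touched. */
#define write_pnet(pnet, net) do { (void)(net); } while (0)
#define read_pnet(pnet)       (&init_net)
#endif

/* Reader-side accessor, so callers never spell out the field name. */
#define ib_net(ib) read_pnet(&(ib)->ib_net)
```

With this shape a caller writes `write_pnet(&ib->ib_net, net);` unconditionally, and the configuration decides whether that is an assignment or a no-op.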
Nov 13, 4:09 pm 2008
Michael Kerrisk
Re: [PATCH] reintroduce accept4
Andrew, There were a couple of crufty printf()'s in the preceding version that I didn't notice until just after I hit send. Paste this version into the changelog instead. Cheers, Michael /* test_accept4.c Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk <mtk.manpages@gmail.com> Licensed under the GNU GPLv2 or later. */ #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/socket.h> #include <netinet/in.h> #include ...
Nov 13, 3:14 pm 2008
Michael Kerrisk
Re: [PATCH] reintroduce accept4
Here's my test program. Works on x86-32. Should work on x86-64, but I don't have a system to hand to test with. It tests accept4() with each of the four possible combinations of SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies that the appropriate flags are set on the file descriptor/open file description returned by accept4(). I tested Ulrich's patch in this thread by applying against 2.6.28-rc2, and it passes according to my test program. /* test_accept4.c ...
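The check described above can be sketched roughly as follows. This is not Michael's test_accept4.c (which is truncated here), only a small illustration under stated assumptions: it uses an AF_UNIX socket with a made-up abstract address instead of TCP, it invokes the system call through syscall() and assumes SYS_accept4 is defined for the build architecture, and the function name and return encoding are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Accept one local connection with accept4(flags) and report which flags
 * ended up on the accepted descriptor.
 * Returns a bitmask (bit 0: FD_CLOEXEC set, bit 1: O_NONBLOCK set),
 * or -1 if any step failed.
 */
int accept4_flag_check(int flags)
{
    struct sockaddr_un addr;
    socklen_t len;
    int lfd, cfd, afd, fd_flags, fl_flags, result = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* Abstract address (leading NUL byte); the name itself is made up. */
    snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
             "accept4-sketch-%ld-%d", (long)getpid(), flags);
    len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 +
                      strlen(addr.sun_path + 1));

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    cfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto out;
    if (bind(lfd, (struct sockaddr *)&addr, len) < 0 || listen(lfd, 1) < 0)
        goto out;
    if (connect(cfd, (struct sockaddr *)&addr, len) < 0)
        goto out;

    /* Direct system call; assumes SYS_accept4 exists on this architecture. */
    afd = (int)syscall(SYS_accept4, (long)lfd, NULL, NULL, (long)flags);
    if (afd < 0)
        goto out;

    fd_flags = fcntl(afd, F_GETFD);   /* file descriptor flags */
    fl_flags = fcntl(afd, F_GETFL);   /* open file description flags */
    if (fd_flags >= 0 && fl_flags >= 0)
        result = ((fd_flags & FD_CLOEXEC) ? 1 : 0) |
                 ((fl_flags & O_NONBLOCK) ? 2 : 0);
    close(afd);
out:
    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);
    return result;
}
```

Calling it once for each of 0, SOCK_CLOEXEC, SOCK_NONBLOCK and SOCK_CLOEXEC | SOCK_NONBLOCK covers the four combinations mentioned in the mail.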
Nov 13, 3:11 pm 2008
Andrew Morton
Re: [PATCH] reintroduce accept4
On Fri, 14 Nov 2008 09:28:39 +1100 Here's the latest version, for review-n-test enjoyment: From: Ulrich Drepper <drepper@redhat.com> Introduce a new accept4() system call. The addition of this system call matches analogous changes in 2.6.27 (dup3(), eventfd2(), signalfd4(), inotify_init1(), epoll_create1(), pipe2()) which added new system calls that differed from analogous traditional system calls in adding a flags argument that can be used to access additional functionality. The ...
Nov 13, 3:57 pm 2008
Michael Kerrisk
Re: [PATCH] reintroduce accept4
... Each of these new system calls (accept4(), dup3(), eventfd2(), signalfd4(), inotify_init1(), epoll_create1(), pipe2()) has a name that's based on the number of arguments it has. This follows a convention that was used in a few traditional Unix system calls, e.g., wait3(), wait4(), dup2(). However, it's probably a mistake since: a) The glibc interfaces can have different numbers of arguments from the system call b) In the future, we might use the new bits in the flags argument ...
Nov 13, 3:02 pm 2008
Paul Mackerras
Re: [PATCH] reintroduce accept4
Ulrich's patch only updated x86. If you're going to send it to Linus, please give us other architecture maintainers a chance to get patches to you to wire it up on our architectures, and then send Linus the combined patch. Paul. --
Nov 13, 3:25 pm 2008
Paul Mackerras
Re: [PATCH] reintroduce accept4
Actually, any architecture that uses sys_socketcall should be OK already, by the looks, and that includes powerpc. Paul. --
Nov 13, 3:28 pm 2008
Michael Kerrisk
Re: [PATCH] reintroduce accept4
Andrew, I see you accepted this patch into -mm. I've finally got to looking at and testing this, so: Tested-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> In my tests, everything looks fine. I'll forward my test program in a follow-up mail. I think Ulrich wanted to try to see this patch in for 2.6.28; it's past the merge window of course, so it's up to you, but I have no problem with that. The API is the one that Ulrich initially ...
Nov 13, 2:51 pm 2008
Andrew Morton
Re: [PATCH] reintroduce accept4
On Thu, 13 Nov 2008 16:51:56 -0500 That's easy - I'll send it to Linus and let him decide ;) Realistically, this isn't likely to get much third-party testing in -rc anyway. Our best defence at this time is careful review and developer runtime testing, which you've done, thanks. If it's buggy, we can live with that - fix it later, backport the fixes. It's security holes (including DoS ones) which we need to be I replaced the existing changelog with the above (plus some ...
Nov 13, 3:05 pm 2008
Krzysztof Halasa
[Pull request] WAN aka generic HDLC (next)
Hi Jeff, Can you pull my WAN tree into "next", please? The following changes since Linux 2.6.28-rc4 are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6.git for-jeff me (15): WAN: split hd6457x.c into hd64570.c and hd64572.c WAN: remove SCA II support from SCA drivers WAN: remove SCA support from SCA-II drivers WAN: convert HD64572-based drivers to NAPI. WAN: TX-done handler now uses the ownership bit in ...
Nov 13, 2:48 pm 2008
Anton Vorontsov
[PATCH] net/ucc_geth: Fix oops in uec_get_ethtool_stats()
p_{tx,rx}_fw_statistics_pram are special: they're available only when a device is open. If the device is closed, we should just fill the data with zeroes. Fixes the following oops: root@b1:~# ifconfig eth1 down root@b1:~# ethtool -S eth1 Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc01e1dcc Oops: Kernel access of bad area, sig: 11 [#1] [...] NIP [c01e1dcc] uec_get_ethtool_stats+0x98/0x124 LR [c0287cc8] ...
Nov 13, 11:26 am 2008
Bjorn Helgaas
[patch] igb: use dev_printk instead of printk
Use dev_printk() instead of printk() to give a little more context and use consistent format. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c index 1f397cd..2799518 100644 --- a/drivers/net/igb/igb_main.c +++ b/drivers/net/igb/igb_main.c @@ -1019,10 +1019,9 @@ static int __devinit igb_probe(struct pci_dev *pdev, state &= ~PCIE_LINK_STATE_L0S; pci_write_config_word(us_dev, pos + PCI_EXP_LNKCTL, ...
Nov 13, 9:20 am 2008
Jarek Poplawski
[PATCH] pkt_sched: Remove qdisc->ops->requeue() etc.
After implementing qdisc->ops->peek() and changing sch_netem into classless qdisc there are no more qdisc->ops->requeue() users. This patch removes this method with its wrappers (qdisc_requeue()), and also unused qdisc->requeue structure. There are a few minor fixes of warnings (htb_enqueue()) and comments btw. The idea to kill ->requeue() and a similar patch were first developed by David S. Miller. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> --- include/net/sch_generic.h | 17 ...
Nov 13, 4:47 am 2008
David Stevens
Re: [PATCH] [TPROXY] implemented IP_RECVORIGDSTADDR sock ...
I know it's not part of your patch, but what about turning that into an array of function pointers and a loop, as code cleanup? +-DLS --
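The cleanup suggested here could look something like the sketch below. It is only an illustration of the table-plus-loop idea: the handler names, the `struct msg_ctx` type and the convention that table index i serves flag bit i are all hypothetical and not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context: records which handlers ran, one bit per handler. */
struct msg_ctx { unsigned int emitted; };

typedef void (*cmsg_handler)(struct msg_ctx *ctx);

/* Hypothetical handlers, each standing in for one ancillary message type. */
static void recv_pktinfo(struct msg_ctx *c) { c->emitted |= 1u << 0; }
static void recv_ttl(struct msg_ctx *c)     { c->emitted |= 1u << 1; }
static void recv_tos(struct msg_ctx *c)     { c->emitted |= 1u << 2; }
static void recv_origdst(struct msg_ctx *c) { c->emitted |= 1u << 3; }

/* Table order mirrors flag bit order: entry i handles bit i. */
static const cmsg_handler handlers[] = {
    recv_pktinfo, recv_ttl, recv_tos, recv_origdst,
};

/* Run the handler for every set bit; returns how many handlers were called. */
int dispatch_cmsgs(unsigned int flags, struct msg_ctx *ctx)
{
    int calls = 0;
    size_t i;

    /* Stop early once no flag bits remain, like the if-chain would. */
    for (i = 0; i < sizeof(handlers) / sizeof(handlers[0]) && flags;
         i++, flags >>= 1) {
        if (flags & 1u) {
            handlers[i](ctx);
            calls++;
        }
    }
    return calls;
}
```

Adding a new ancillary message then means appending one table entry rather than another if-block.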
Nov 13, 1:21 pm 2008
Balazs Scheidler
Re: [PATCH] [TPROXY] implemented IP_RECVORIGDSTADDR sock ...
IP_PKTINFO does not have a port number field which may be changed with tproxy redirections. -- Bazsi --
Nov 13, 4:37 am 2008
Balazs Scheidler
[PATCH] [TPROXY] implemented IP_RECVORIGDSTADDR socket option
In case UDP traffic is redirected to a local UDP socket, the originally addressed destination address/port cannot be recovered with the in-kernel tproxy. This patch adds an IP_RECVORIGDSTADDR sockopt that enables a IP_ORIGDSTADDR ancillary message in recvmsg(). This ancillary message contains the original destination address/port of the packet being received. Please apply. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> --- include/linux/in.h | 4 ++++ ...
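From the receiving side, picking the address out of the ancillary data might look like the sketch below. It assumes the option ends up with the value found in current Linux headers (20, used as a fallback define here), that the message arrives at level IPPROTO_IP, and that its payload is a `struct sockaddr_in`; the function name `find_orig_dst` is made up.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20   /* fallback: value in current Linux headers */
#endif

/*
 * Scan the control messages of a received msghdr for IP_ORIGDSTADDR and
 * copy the original destination address/port into *out.
 * Returns 0 if found, -1 otherwise.
 */
int find_orig_dst(struct msghdr *msg, struct sockaddr_in *out)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == IPPROTO_IP &&
            c->cmsg_type == IP_ORIGDSTADDR &&
            c->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 0;
        }
    }
    return -1;
}
```

A caller would first enable the option on the UDP socket with setsockopt() using IP_RECVORIGDSTADDR, then pass the msghdr filled in by recvmsg() to this helper.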
Nov 13, 3:37 am 2008
Rémi
Re: [PATCH] [TPROXY] implemented IP_RECVORIGDSTADDR sock ...
Does this not duplicate the IP_PKTINFO functionality? -- Rémi Denis-Courmont Maemo Software, Nokia Devices R&D --
Nov 13, 4:09 am 2008
Petr Tesarik
[PATCH] remove an unnecessary field in struct tcp_skb_cb
The urg_ptr field is not used anywhere and is merely confusing. Signed-off-by: Petr Tesarik <ptesarik@suse.cz> -- include/net/tcp.h | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 438014d..8f26b28 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -590,7 +590,6 @@ struct tcp_skb_cb { #define TCPCB_EVER_RETRANS 0x80 /* Ever retransmitted frame */ #define ...
Nov 13, 3:21 am 2008
Andi Kleen
Re: [PATCH] remove an unnecessary field in struct tcp_skb_cb
I think you're confusing it with tcphdr's field of the same name. Petr's patch is correct as far as I can see. -Andi -- ak@linux.intel.com --
Nov 13, 12:57 pm 2008
Saikiran Madugula
Re: [PATCH] remove an unnecessary field in struct tcp_skb_cb
grep -r "urg_ptr" on my linux tree says otherwise. --
Nov 13, 11:10 am 2008
David Miller
Re: 9p: restrict RDMA usage
I've applied this to net-2.6 thanks Randy. I didn't get it earlier because I thought perhaps the Infiniband folks might want to have picked this one up. --
Nov 13, 12:35 am 2008
David Miller
Re: [PATCH 3/4 net-next] netdevice: safe convert to netd ...
From: Wang Chen <wangchen@cn.fujitsu.com> Applied. --
Nov 13, 1:01 am 2008
Wang Chen
[PATCH 3/4 net-next] netdevice: safe convert to netdev_p ...
We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, so netdev_priv() is obviously more flexible than netdev->priv. But we can't kill netdev->priv, because so many drivers reference it directly. This patch is a safe conversion of netdev->priv to netdev_priv(netdev), since all of these netdev->priv uses are read-only. But it is too big to be sent in one mail. I split it into 4 parts and make every ...
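Point 2 is perhaps clearer with a small model. The sketch below is a userspace approximation, not the kernel definition: `struct net_device` is drastically reduced, the 32-byte alignment is an assumption, and `alloc_netdev_sketch()` and `struct my_priv` are hypothetical names. It shows the private area living directly behind the device structure, so that an accessor can compute its address instead of reading a stored pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NETDEV_ALIGN 32   /* assumption for this sketch */
#define ALIGN_UP(x, a) (((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Drastically reduced stand-in for the kernel structure. */
struct net_device {
    char name[16];
    unsigned int mtu;
};

/* Hypothetical driver-private data. */
struct my_priv {
    int irq;
};

/* Allocate the device and its private area as one zeroed block. */
struct net_device *alloc_netdev_sketch(size_t sizeof_priv)
{
    size_t total = ALIGN_UP(sizeof(struct net_device), NETDEV_ALIGN) +
                   sizeof_priv;

    return calloc(1, total);
}

/* The private area starts at the first aligned offset after the device. */
static inline void *netdev_priv(struct net_device *dev)
{
    return (char *)dev + ALIGN_UP(sizeof(struct net_device), NETDEV_ALIGN);
}
```

Because the offset is computed, nothing has to be stored or kept in sync, which is the flexibility the changelog refers to.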
Nov 12, 7:43 pm 2008
Wang Chen
[PATCH 4/4 net-next] netdevice: safe convert to netdev_p ...
We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, so netdev_priv() is obviously more flexible than netdev->priv. But we can't kill netdev->priv, because so many drivers reference it directly. This patch is a safe conversion of netdev->priv to netdev_priv(netdev), since all of these netdev->priv uses are read-only. But it is too big to be sent in one mail. I split it into 4 parts and make every ...
Nov 12, 7:43 pm 2008
David Miller
Re: [PATCH 4/4 net-next] netdevice: safe convert to netd ...
From: Wang Chen <wangchen@cn.fujitsu.com> Applied. --
Nov 13, 1:01 am 2008
Wang Chen
[PATCH 2/4 net-next] netdevice: safe convert to netdev_p ...
We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, so netdev_priv() is obviously more flexible than netdev->priv. But we can't kill netdev->priv, because so many drivers reference it directly. This patch is a safe conversion of netdev->priv to netdev_priv(netdev), since all of these netdev->priv uses are read-only. But it is too big to be sent in one mail. I split it into 4 parts and make every ...
Nov 12, 7:43 pm 2008
David Miller
Re: [PATCH 2/4 net-next] netdevice: safe convert to netd ...
From: Wang Chen <wangchen@cn.fujitsu.com> Applied. --
Nov 13, 1:01 am 2008
Wang Chen
[PATCH 1/4 net-next] netdevice: safe convert to netdev_p ...
We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, so netdev_priv() is obviously more flexible than netdev->priv. But we can't kill netdev->priv, because so many drivers reference it directly. This patch is a safe conversion of netdev->priv to netdev_priv(netdev), since all of these netdev->priv uses are read-only. But it is too big to be sent in one mail. I split it into 4 parts and make every ...
Nov 12, 7:43 pm 2008
David Miller
Re: [PATCH 1/4 net-next] netdevice: safe convert to netd ...
From: Wang Chen <wangchen@cn.fujitsu.com> Applied. --
Nov 13, 1:01 am 2008
Stephen Hemminger
Re: /net
On Thu, 13 Nov 2008 02:25:33 +0100 Sure patches accepted for review anytime :-) --
Nov 12, 7:24 pm 2008
David Miller
[GIT]: Networking
Mostly driver fixes, as usual: 1) NIU driver doesn't work properly on 32-bit 2) BNX2 driver passes wrong object down into interrupt handler from ->poll_controller() and also doesn't check all RX queues. Fix from Neil Horman. 3) put_cmsg_compat() uses SO_TIMESTAMP* instead of SCM_TIMESTAMP* They are the same values, but it makes the code super confusing to read. 4) EEPROM handling and multiqueue fixes to CXGB3 from Divy Le Ray. 5) HTCP congestion control module does not ...
Nov 12, 5:31 pm 2008
David Miller
Re: [PATCH net-next 2/5] bnx2: Restrict WoL support.
From: "Michael Chan" <mchan@broadcom.com> Applied. --
Nov 12, 5:01 pm 2008
David Miller
Re: [PATCH net-next 3/5] bnx2: Set rx buffer water marks ...
From: "Michael Chan" <mchan@broadcom.com> Applied. --
Nov 12, 5:02 pm 2008
David Miller
Re: [PATCH] bnx2: bnx2_alloc_rx_mem() should use kmalloc ...
From: Eric Dumazet <dada1@cosmosbay.com> Eric, I've seen (and used) this idiom enough times that I think it deserves a generic construct somewhere instead of replicating this sequence over and over again. Don't you agree? :-) --
Nov 12, 11:49 pm 2008
David Miller
Re: [PATCH net-next 5/5] bnx2: Update version to 1.8.2.
From: "Michael Chan" <mchan@broadcom.com> Also applied, thanks Michael. --
Nov 12, 5:03 pm 2008
David Miller
Re: [PATCH net-next 1/5] bnx2: Add PCI ID for 5716S.
From: "Michael Chan" <mchan@broadcom.com> Applied. --
Nov 12, 5:01 pm 2008
Eric Dumazet
[PATCH] bnx2: bnx2_alloc_rx_mem() should use kmalloc() ...
Hi all # grep bnx2 /proc/vmallocinfo 0xf8218000-0xf821a000 8192 bnx2_alloc_rx_mem+0x33/0x310 pages=1 vmalloc 0xf821b000-0xf821d000 8192 bnx2_alloc_rx_mem+0x33/0x310 pages=1 vmalloc 0xf8220000-0xf8234000 81920 bnx2_init_board+0x104/0xae0 phys=f6000000 ioremap 0xf8240000-0xf8254000 81920 bnx2_init_board+0x104/0xae0 phys=fa000000 ioremap Any chance bnx2_alloc_rx_mem doesn't use vmalloc() to allocate less than a page of memory? Thank you [PATCH] bnx2: bnx2_alloc_rx_mem() should ...
Nov 12, 10:49 pm 2008
Jan Engelhardt
Re: [tproxy] udp + tproxy
Mh, perhaps getsockname() could do something - well, at least if used with accept() and as such, mostly for TCP only. --
Nov 13, 12:17 am 2008
Balazs Scheidler
Re: [tproxy] udp + tproxy
Well, here's my other mail on the subject on the tproxy list, it basically details that we do have an implementation of accept() for UDP which we use in production. But I'm not sure that could be integrated to mainline. -<- quote ->- Well, for supporting UDP with tproxy4 we use a different approach: udp_accept(), you can find it in the BalaBit kernel patch tree at http://www.balabit.com/downloads/files/kernel-patches/ The way udp_accept() works is as follows: * the userspace proxy ...
Nov 13, 12:25 am 2008
Henrik Nordstrom
Re: [tproxy] udp + tproxy
I second this, but also think that it will see some initial resistance. I guess the main complaint (assuming the code is in good shape) will be that UDP does not have any sender verification, which means it's very easy to flood the kernel with UDP "connection requests". But on the other hand there are also no SYN_RECV or FIN_WAIT/TIME_WAIT states which may hold things up beyond CPU processing speed, so this is not nearly as big a problem to deal with as in the TCP case. Regards Henrik
Nov 13, 1:44 am 2008
David Miller
Re: [PATCH] Remove unused parameter of xfrm_gen_index()
From: Herbert Xu <herbert@gondor.apana.org.au> Applied, thanks everyone. --
Nov 13, 12:28 am 2008
Herbert Xu
Re: [PATCH] Remove unused parameter of xfrm_gen_index()
Looks good to me. Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --
Nov 12, 7:01 pm 2008
David Miller
Re: [PATCH] net: ifdef struct sock::sk_async_wait_queue
From: Alexey Dobriyan <adobriyan@gmail.com> Applied, thanks Alexey. --
Nov 13, 12:25 am 2008
David Miller
Re: [PATCH] net: shy netns_ok check
From: Alexey Dobriyan <adobriyan@gmail.com> Applied, thanks Alexey. --
Nov 13, 12:24 am 2008
David Miller
Re: [PATCH] ipv6: routing header fixes
From: Brian Haley <brian.haley@hp.com> Looks good, applied, thanks Brian. --
Nov 12, 11:59 pm 2008
Jarek Poplawski
Re: Linux 2.6.27.5 / SFQ/HTB scheduling problems
On 11-11-2008 22:47, Sami Farin wrote: Could you check for TSO (and maybe turn this off) with ethtool? Thanks, Jarek P. --
Nov 13, 12:15 am 2008
Sami Farin
Re: Linux 2.6.27.5 / SFQ/HTB scheduling problems
What later options? MTU is 1472. Oh, I had old ethtool.. These with v 6: # ethtool -k eth0 Offload parameters for eth0: rx-checksumming: off tx-checksumming: off scatter-gather: off tcp segmentation offload: off udp fragmentation offload: off generic segmentation offload: on Wow. I turned gso off and now it works just like before. No packets over size of mtu anymore, either. State Recv-Q Send-Q Local Address:Port Peer Address:Port ESTAB ...
Nov 13, 7:29 am 2008
Sami Farin
Re: Linux 2.6.27.5 / SFQ/HTB scheduling problems
It was off: # ethtool -k eth0 Offload parameters for eth0: rx-checksumming: off tx-checksumming: off scatter-gather: off tcp segmentation offload: off Sometimes I am also sending over 15000 byte packets according to tcpdump. What I'd like to know is: 1) How can I get tcpdump to show the actual packets I am sending, without needing to buy another computer for sniffing the packets? 2) What does it really mean when 15000 byte packet is sent? Can other packets be queued normally "in ...
Nov 13, 4:48 am 2008
Jarek Poplawski
Re: Linux 2.6.27.5 / SFQ/HTB scheduling problems
Since a driver and a hardware are later in the queue I doubt you can do this with tcpdump easily without some loop or something... Maybe Packets could be coalesced e.g. by TCP if TSO/GSO is on, but an actual packet sent to the wire should be not more than mtu (15000 if "jumbo" frames etc. are supported). Since you have warnings about this it looks like some of your configs could be different between kernel versions. Your stats from the first message show some differences: class htb ...
Nov 13, 6:59 am 2008
Eric Dumazet
Re: [RFC PATCH 00/13] hardware time stamping + igb examp ...
If the NIC is going to receive 100.000 frames per second, as Andi mentioned earlier, my guess is you don't want to do sophisticated computation in the NIC rx handler, but just store the raw data delivered by the NIC. Then, later, for the happy few^Wmany applications that need to get hwstamp, perform the computation if needed? I hope the tcp stack won't need hwstamp before 2013 or so ;) --
Nov 12, 11:29 pm 2008
Ohly, Patrick
RE: [RFC PATCH 00/13] hardware time stamping + igb examp ...
For reasons that have been mentioned already here (some hardware can time stamp every packet, new use cases) I think it would be important to have the hwtstamp information right in the skb. I can change the patch series so that it uses one additional ktime_t hwtstamp field; give I'm not a friend of a config option because it was suggested that hardware tstamps should be off on *standard* kernels. That's of little use for users of unmodified distributions who want to run PTP. If the feature is ...
Nov 13, 8:53 am 2008
Ohly, Patrick
RE: [RFC PATCH 00/13] hardware time stamping + igb examp ...
The sys-timestamp is normally not generated. The offset scheme would add a call to gettimeofdayns() even if there is no other use for the value. This might be acceptable; the bigger problem IMHO is that without tracking system time in the hardware, hardware and system time will quickly (~ a few days with the hardware I was looking at, if I remember correctly) diverge more than can be stored in the 32 bit offset. I'd prefer to spend 64 bits and be done without the need for further encoding ...
Nov 13, 9:05 am 2008
Oliver Hartkopp
Re: [RFC PATCH 00/13] hardware time stamping + igb examp ...
Patrick, one question about a new crazy idea: If we tend to add new space in the skb, won't 4 bytes be enough then? A 32 bit value at nsec resolution gives a range of 4.294967296 seconds or +/- 2.147483648 seconds. If we make a 'fully qualified' 64 bit sys-timestamp available anyway, the new 32 bit value could be used as an offset (or it could be given to the userspace directly) to calculate the hw timestamp within the sys-timestamp context, right? Regards, Oliver --
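The arithmetic behind this idea can be written down as a short sketch. The helper names are hypothetical and timestamps are modelled as plain 64-bit nanosecond counts; the point is only that a signed 32-bit nanosecond offset covers about +/- 2.147 seconds around the system timestamp, so encoding fails once hardware and system time drift further apart than that.

```c
#include <assert.h>
#include <stdint.h>

/* Total span of a 32-bit nanosecond value: 2^32 ns = 4.294967296 s. */
#define OFFSET_SPAN_NS (UINT64_C(1) << 32)

/*
 * Encode the hardware time as a signed 32-bit offset from the system time.
 * Returns 0 on success, -1 if the two clocks are too far apart to fit.
 */
int encode_hw_offset(int64_t sys_ns, int64_t hw_ns, int32_t *offset)
{
    int64_t diff = hw_ns - sys_ns;

    if (diff < INT32_MIN || diff > INT32_MAX)
        return -1;
    *offset = (int32_t)diff;
    return 0;
}

/* Recover the hardware time from the system time and the stored offset. */
int64_t decode_hw_time(int64_t sys_ns, int32_t offset)
{
    return sys_ns + offset;
}
```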
Nov 12, 11:15 pm 2008
David Miller
Re: [PATCH] bnx2: fix poll_controller method so that pro ...
From: Neil Horman <nhorman@tuxdriver.com> This looks good, applied, thanks Neil! --
Nov 12, 5:23 pm 2008
Neil Horman Nov 12, 6:09 pm 2008
David Miller
Re: NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter
From: Jesper Dangaard Brouer <jdb@comx.dk> That unfortunately (can be) the cost of SMP :-/ With multi-flow tests, Robert Olsson is getting 4.2 mpps rates with NIU and pktgen. That's what this card is designed for, good multi-flow workload performance, rather than striving for maximum Yes, people on lkml are trying to figure out what is causing that regression on x86. --
Nov 13, 3:15 pm 2008
David Miller
Re: NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter
From: Jesper Dangaard Brouer <jdb@comx.dk> Thanks a lot for making this test Jesper, even though the bug Same signature, counters advancing yet no mark bits are set. Now if we can fix that MSIX BUG() and start analyzing your pps performance with oprofile, we'll be in good shape :) --
Nov 13, 3:19 pm 2008
Jesper Dangaard Brouer
Re: NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter
Both applying test#1 and test#2. After applying test#2, I cannot get it to do a TX transmit timed out. And everything seems to work... which after the known bug fix was kind of the expected behaviour... Although I'm not happy about the new perf numbers, as on an SMP system I can now only route approx 290 kpps; remember I could route 319 kpps using a single CPU nosmp kernel. (even more annoying is that oprofile is broken) -- Med venlig hilsen / Best regards Jesper Brouer ComX ...
Nov 13, 3:29 am 2008
Jesper Dangaard Brouer
Re: NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter
------------[ cut here ]------------ WARNING: at net/sched/sch_generic.c:226 dev_watchdog+0x21e/0x230() NETDEV WATCHDOG: eth2 (niu): transmit timed out Modules linked in: niu ipmi_si hpwdt serio_raw bnx2 zlib_inflate rng_core ipmi_msghandler hpilo ehci_hcd uhci_hcd sr_mod cdrom Pid: 0, comm: swapper Not tainted 2.6.28-rc4-davem #17 Call Trace: [<c0125823>] warn_slowpath+0x63/0x80 [<c011f03e>] ? __enqueue_entity+0x8e/0xb0 [<c010888c>] ? native_sched_clock+0x1c/0x80 [<c01453c4>] ? ...
Nov 13, 2:10 am 2008
Jesper Dangaard Brouer
Re: NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter
Another bug... while unloading the niu module. During my testing I'm unloading/loading the niu module, I usually take down the interfaces _before_ unloading the module, but I forgot one time, and got the following BUG in the kern log. niu: niu_put_parent: port[3] niu 0000:0b:00.3: PCI INT D disabled niu: niu_put_parent: port[2] niu 0000:0b:00.2: PCI INT C disabled niu: niu_put_parent: port[1] niu 0000:0b:00.1: PCI INT B disabled ------------[ cut here ]------------ kernel BUG at ...
Nov 13, 1:50 am 2008
David Miller
Re: NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter
From: Jesper Dangaard Brouer <jdb@comx.dk> Weird. When the module is unloaded, unregister_netdev() will do a dev_close() which will invoke dev->stop() which is niu_close(). And niu_close() will call free_irq() on every MSI interrupt registered in niu_open(). So I can't see how this can happen but obviously it is happening. I suspect that something might be changing np->num_ldg, but anyways the following debugging patch should provide some clues. Please reproduce this and send the logs ...
Nov 13, 3:08 pm 2008
Eric Dumazet
Re: [PATCH 1/3] rcu: Introduce hlist_nulls variant of hlist
For example, if a process holds a lock, it doesn't need rcu version. OK, maybe I should add a Documentation/RCU/rculist_nulls.txt file with appropriate examples and documentation. (Say the lookup/insert algorithms, with standard hlist and memory barriers, and with hlist_nulls without those two memory barriers. (These two memory barriers can be found in commits : c37ccc0d4e2a4ee52f1a40cff1be0049f2104bba : udp: add a missing smp_wmb() in udp_lib_get_port() Corey Minyard spotted a ...
Nov 13, 6:44 am 2008
Eric Dumazet
[PATCH 4/3] rcu: documents rculist_nulls
[PATCH 4/3] rcu: documents rculist_nulls Adds Documentation/RCU/rculist_nulls.txt file to describe how 'nulls' end-of-list can help in some RCU algos. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> --- Documentation/RCU/rculist_nulls.txt | 167 ++++++++++++++++++++++++++ 1 files changed, 167 insertions(+)
Nov 13, 9:02 am 2008
Christoph Lameter Nov 13, 7:27 am 2008
Eric Dumazet
[PATCH 0/3] net: RCU lookups for UDP, DCCP and TCP protocol
Hi all Here is a series of three patches (based on net-next-2.6), to continue work with RCU on UDP/TCP/DCCP stacks Many thanks for all useful reviews and comments, especially from Paul and Corey. 1) Introduce hlist_nulls variant of hlist hlist uses NULL value to finish a chain. hlist_nulls variant uses the low order bit set to 1 to signal an end marker. This allows storing many different end markers, so that some RCU lockless algos (used in TCP/UDP stack for example) can save ...
Nov 13, 6:13 am 2008
Eric Dumazet
Re: [PATCH 3/3] net: Convert TCP & DCCP hash tables to u ...
The atomic_inc_not_zero() is not related to SLAB_DESTROY_BY_RCU but classic RCU lookup. A writer can delete the item right before we try to use it. Next step is necessary in case the deleted item was re-allocated and inserted in a hash chain (this one or another one, it doesn't matter). In this case, previous atomic_inc_not_zero test will succeed. So we must check again the item we selected (and refcounted) is the one we were searching. So yes, this bit should be documented, since ...
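The two steps described here can be sketched in userspace with C11 atomics. This is an illustration only, not the patch code: `struct item`, `inc_not_zero()` and `get_if_match()` are hypothetical names, there is no real RCU read-side section, and a real lookup would restart the chain walk where this sketch simply returns NULL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object whose memory may be recycled for another key. */
struct item {
    atomic_int refcnt;
    int key;
};

/* Step 1: take a reference unless the count has already dropped to zero. */
static bool inc_not_zero(atomic_int *cnt)
{
    int old = atomic_load(cnt);

    while (old != 0) {
        /* On failure, old is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(cnt, &old, old + 1))
            return true;
    }
    return false;
}

static void put_item(struct item *it)
{
    atomic_fetch_sub(&it->refcnt, 1);
}

/*
 * Step 2: after the reference is held, check the key again, because the
 * object may have been freed and reused for something else in between.
 * Returns the item with a reference held, or NULL if it must not be used.
 */
struct item *get_if_match(struct item *candidate, int wanted_key)
{
    if (!inc_not_zero(&candidate->refcnt))
        return NULL;               /* object is on its way out */
    if (candidate->key != wanted_key) {
        put_item(candidate);       /* recycled for a different key */
        return NULL;
    }
    return candidate;
}
```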
Nov 13, 6:51 am 2008
Christoph Lameter
Re: [PATCH 3/3] net: Convert TCP & DCCP hash tables to u ...
It is used for the anonymous vmas. That is the purpose that Hugh introduced it for, since he saw a regression when using straight RCU freeing. See mm/rmap.c. SLAB_DESTROY_BY_RCU is a pretty strange way of using RCU and slab so it should always be documented in detail. --
Nov 13, 7:08 am 2008
Peter Zijlstra
Re: [PATCH 3/3] net: Convert TCP & DCCP hash tables to u ...
This is the validation step that verifies the race opened by using SLAB_DESTROY_BY_RCU, right? Does it make sense to add a little comment to these validation steps to --
Nov 13, 6:34 am 2008
Eric Dumazet
[PATCH 2/3] udp: Use hlist_nulls in UDP RCU code
This is a straightforward patch, using hlist_nulls infrastructure. RCUification already done on UDP two weeks ago. Using hlist_nulls permits us to avoid some memory barriers, both at lookup time and delete time. Patch is large because it adds new macros to include/net/sock.h. These macros will be used by TCP & DCCP in next patch. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> --- include/linux/rculist.h | 17 ----------- include/net/sock.h | 57 ...
Nov 13, 6:15 am 2008
Andi Kleen
Re: [PATCH 0/3] net: RCU lookups for UDP, DCCP and TCP p ...
Do you have any numbers that demonstrate the read memory barriers being a performance problem? At least on x86 they should be very cheap because they're normally nops. -Andi -- ak@linux.intel.com --
Nov 13, 10:20 am 2008
Peter Zijlstra
Re: [PATCH 3/3] net: Convert TCP & DCCP hash tables to u ...
We have one user, anon_vma, and one thing that is very nearly identical: the lockless pagecache. See page_cache_get_speculative() and find_get_page(). The pagecache gets away with this due to the simple fact that the page frames are never freed. Hmm, I once wrote a comment to go with SLAB_DESTROY_BY_RCU, which seems to have gotten lost... /me goes dig. Found it: http://lkml.org/lkml/2008/4/2/143 I guess I'd better re-submit that.. --
Nov 13, 7:22 am 2008
Eric Dumazet
[PATCH 3/3] net: Convert TCP & DCCP hash tables to use R ...
RCU was added to UDP lookups, using a fast infrastructure : - sockets kmem_cache use SLAB_DESTROY_BY_RCU and don't pay the price of call_rcu() at freeing time. - hlist_nulls permits using fewer memory barriers. This patch uses the same infrastructure for TCP/DCCP established and timewait sockets. Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications using short lived TCP connections. A followup patch, converting rwlocks to spinlocks will even speed up this ...
Nov 13, 6:15 am 2008
Peter Zijlstra
Re: [PATCH 1/3] rcu: Introduce hlist_nulls variant of hlist
So by not using some memory barriers (would be nice to have it illustrated which ones), we can race and end up on the wrong chain, in case that happens we detect this by using this per-chain terminator and try again. It would be really good to have it explained in the rculist_nulls.h comments what memory barriers are missing, what races they open, and how this special terminator trick closes that race. I'm sure most of us understand it now, but will we still in a few months? - how ...
Nov 13, 6:29 am 2008
Eric Dumazet
[PATCH 1/3] rcu: Introduce hlist_nulls variant of hlist
hlist uses NULL value to finish a chain. hlist_nulls variant uses the low order bit set to 1 to signal an end-of-list marker. This allows storing many different end markers, so that some RCU lockless algos (used in TCP/UDP stack for example) can save some memory barriers in fast paths. Two new files are added : include/linux/list_nulls.h - mimics hlist part of include/linux/list.h, derived to hlist_nulls variant include/linux/rculist_nulls.h - mimics hlist part of ...
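The end-marker encoding described above can be modelled in a few lines. This is a single-threaded sketch with hypothetical names (`struct nulls_node`, `make_nulls()`, `nulls_lookup()`), not the contents of the new headers: it only shows how a value can be packed into the terminating "pointer" with its low bit set, and how a lookup can report which marker it ran into.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chain node; real nodes are aligned, so a valid pointer
 * always has its low bit clear. */
struct nulls_node {
    struct nulls_node *next;
    int key;
};

/* Build an end marker: low bit set, the value in the remaining bits. */
static inline struct nulls_node *make_nulls(uintptr_t value)
{
    return (struct nulls_node *)((value << 1) | 1u);
}

static inline int is_a_nulls(const struct nulls_node *p)
{
    return (int)((uintptr_t)p & 1u);
}

static inline uintptr_t get_nulls_value(const struct nulls_node *p)
{
    return (uintptr_t)p >> 1;
}

/*
 * Walk a chain looking for key. On a miss, *end_value receives the value
 * stored in the end marker that terminated the walk.
 */
struct nulls_node *nulls_lookup(struct nulls_node *first, int key,
                                uintptr_t *end_value)
{
    struct nulls_node *pos;

    for (pos = first; !is_a_nulls(pos); pos = pos->next) {
        if (pos->key == key)
            return pos;
    }
    *end_value = get_nulls_value(pos);
    return NULL;
}
```

If each chain stores, say, its own bucket number in the marker, a lockless reader that misses can compare the reported value with the bucket it started in; a mismatch suggests the walk was moved onto another chain and should be retried.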
Nov 13, 6:14 am 2008