linux-netdev mailing list

From | Subject | Date
Michael S. Tsirkin
possible recursive locking in udp4_lib_rcv
Hi! I noticed the following warnings when running on 2.6.27-rc2 with lockdep checker enabled: [ 2912.004106] Initializing XFRM netlink socket [ 2922.009629] [ 2922.009632] ============================================= [ 2922.009640] [ INFO: possible recursive locking detected ] [ 2922.009643] 2.6.27-rc2-mst-suspend #32 [ 2922.009645] --------------------------------------------- [ 2922.009648] Xorg/5958 is trying to acquire lock: [ 2922.009650] (slock-AF_INET/1){-+..}, at: [<c02f105a>] __...
Aug 7, 7:46 pm 2008
Adam Langley
Code query: tcp_request_sock deallocation
I'm hoping someone can explain part of the lifetime of a request_sock for me. In tcp_minisocks.c:tcp_check_req (near the bottom) we have the code path for upgrading a request_sock to a full sock in the case that we get a valid ack: child = inet_csk(sk)->icsk_af_ops->syn_recv_sock(sk, skb, req, NULL); if (child == NULL) goto listen_overflow; // MD5 stuff omitted inet_csk_reqsk_queue_unlink(sk, req, prev); inet_csk_reqsk_queue_removed(sk, req); inet_csk_reqsk_queue_add(sk,...
Aug 7, 7:07 pm 2008
Jussi Kivilinna
[PATCH] sch_prio: Use return value from inner qdisc requeue
Use return value from inner qdisc requeue when value returned isn't NET_XMIT_SUCCESS, instead of always returning NET_XMIT_DROP. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> --- net/sched/sch_prio.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index eac1976..7cdc3e6 100644 --- a/net/sched/sch_prio.c +++ b/net/sched/sch_prio.c @@ -117,7 +117,7 @@ prio_requeue(struct sk_buff *skb, struct Qdisc* sch) ...
Aug 7, 6:10 pm 2008
Stephen Hemminger
qdisc and down links (regression)
Before the multiqueue changes in 2.6.27-rc it was possible to set up queueing disciplines before the link came up (carrier active). This no longer works. If the link is down, the qdisc is the noop_qdisc and any configuration changes don't seem to be shown. This will probably break ISPs (and other users whose links aren't always on) that use traffic shaping. For example, try wondershaper or a similar script before boot; all the tc commands will fail. --
Aug 7, 5:30 pm 2008
David Miller
Re: qdisc and down links (regression)
From: Stephen Hemminger <stephen.hemminger@vyatta.com> I'll see why this happens, it wasn't an intentional change. Before the link comes up, we don't activate the qdisc. We just remember it in ->qdisc_sleeping. I bet if you bring the link up the configuration will become visible, or that is the area where the unintentional error is occurring. --
Aug 7, 6:47 pm 2008
Randy Dunlap
[PATCH -next] bnx2x: fix logical op
From: Randy Dunlap <randy.dunlap@oracle.com> cc: eilong@broadcom.com Fix dubious logical operation that was found by sparse: linux-next-20080807/drivers/net/bnx2x_main.c:7205:27: warning: dubious: !x & y Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> --- drivers/net/bnx2x_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-next-20080807.orig/drivers/net/bnx2x_main.c +++ linux-next-20080807/drivers/net/bnx2x_main.c @@ -7202,7 +7202,7 @@ static voi...
Aug 7, 4:53 pm 2008
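Editor's note on the sparse warning above: in C, unary `!` binds tighter than binary `&`, so `!x & y` parses as `(!x) & y`, which is rarely what a flag test means. The patch body is truncated here, so the sketch below only illustrates the two readings; `c_not`, `as_written`, and `probably_intended` are hypothetical names, and the assumption that `!(x & y)` was the intended form is a guess, not something the listing confirms. Python's own `not` has lower precedence than `&`, hence the explicit helper.

```python
def c_not(value):
    """Emulate C's logical NOT: 1 if value is zero, otherwise 0."""
    return 1 if value == 0 else 0


def as_written(x, y):
    """C expression `!x & y`, which the compiler reads as `(!x) & y`."""
    return c_not(x) & y


def probably_intended(x, y):
    """C expression `!(x & y)`: true when no bit of y is set in x (assumed intent)."""
    return c_not(x & y)


if __name__ == "__main__":
    flags, mask = 0x4, 0x1
    # The two readings disagree whenever x is non-zero but shares no bit with y.
    print(as_written(flags, mask), probably_intended(flags, mask))
```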
Alexey Dobriyan
Re: OOPS, ip -f inet6 route get fec0::1, linux-2.6.26, ip6_r...
0xffffffff80424dd3 is in rt6_fill_node (net/ipv6/route.c:2191). 2186 } else 2187 #endif 2188 NLA_PUT_U32(skb, RTA_IIF, iif); 2189 } else if (dst) { 2190 struct in6_addr saddr_buf; 2191 ====> if (ipv6_dev_get_saddr(ip6_dst_idev(&rt->u.dst)->dev, ^^^^^^^^^^^^^^^^^^^^^^^^ NULL 2192 dst, 0, &saddr_buf) == 0) 2193 ...
Aug 7, 4:37 pm 2008
Tom Herbert
SO_REUSEPORT?
Hello, We are looking at ways to scale TCP listeners. I think what we would like is the ability to listen on a port from multiple threads (sockets bound to same port, INADDR_ANY, and no interface binding), which is what SO_REUSEPORT would seem to allow. Has this ever been implemented for Linux or is there a good reason not to have it? Thanks, Tom --
Aug 7, 12:57 pm 2008
Rémi Denis-Courmont
Re: SO_REUSEPORT?
On Linux, SO_REUSEADDR provides most of what SO_REUSEPORT provides on BSD. In any case, there is absolutely no point in creating multiple TCP listeners. Multiple threads can accept() on the same listener - at the same time. -- Rémi Denis-Courmont http://www.remlab.net/ --
Aug 7, 1:09 pm 2008
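Editor's note: the pattern described in the reply above, several threads blocking in accept() on one shared listening socket, can be sketched as follows. This is a minimal illustration over loopback, not code from the thread; `serve_with_shared_listener` and its parameters are hypothetical names, and the slot-reservation counter exists only so the demo terminates cleanly.

```python
import socket
import threading


def serve_with_shared_listener(num_threads, num_clients):
    """Accept num_clients connections with num_threads threads that all
    call accept() on the same listening socket. Returns a dict mapping
    worker index to the number of connections that worker accepted."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
    listener.listen(num_clients)
    port = listener.getsockname()[1]

    accepted = {}
    lock = threading.Lock()
    remaining = [num_clients]  # connections not yet claimed by a worker

    def worker(idx):
        while True:
            with lock:
                if remaining[0] == 0:
                    return
                remaining[0] -= 1  # reserve one connection, then go accept it
            conn, _addr = listener.accept()
            conn.close()
            with lock:
                accepted[idx] = accepted.get(idx, 0) + 1

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in workers:
        t.start()
    clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(num_clients)]
    for t in workers:
        t.join()
    for c in clients:
        c.close()
    listener.close()
    return accepted


if __name__ == "__main__":
    print(serve_with_shared_listener(num_threads=4, num_clients=20))
```

Every connection is accepted exactly once, but how the connections spread across the workers is left to the kernel and the scheduler.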
Tom Herbert
Re: SO_REUSEPORT?
We've been doing that, but then on wakeup it would seem that we're at the mercy of scheduling-- basically whichever thread wakes up first will get to process the accept queue first. This seems to bias towards threads running on the same CPU on which the wakeup is called, and so this method doesn't give us the even distribution of new connections across the threads that we'd like. Tom --
Aug 7, 1:58 pm 2008
Rick Jones
Re: SO_REUSEPORT?
How would the presence of multiple TCP LISTEN endpoints change that? You'd then be at the mercy of whatever "scheduling" there was inside the stack. If you want to balance the threads, perhaps a dispatch thread, or a virtual one - each thread knows how many connections it is servicing, let them know how many the other threads are servicing, and if a thread has N more connections than the other threads have it not go into accept() that time around. Might need some tweaking to handle patho...
Aug 7, 2:17 pm 2008
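Editor's note: the balancing idea in the message above (each thread knows its own connection count and the others', and sits out a round of accept() when it is too far ahead) could look roughly like this. The message is truncated, so the exact rule is an assumption: here a thread skips accept() when it has at least `n` more connections than the busiest other thread. `should_enter_accept` is a hypothetical name.

```python
def should_enter_accept(my_count, other_counts, n):
    """Decide whether this thread should call accept() this time around.

    my_count     -- connections this thread is currently servicing
    other_counts -- connection counts reported by the other threads
    n            -- how far ahead a thread may get before it backs off
    """
    if not other_counts:
        return True  # nobody to balance against
    # Assumed rule: back off once we lead every other thread by n or more.
    return my_count - max(other_counts) < n


if __name__ == "__main__":
    print(should_enter_accept(10, [3, 4], 5))  # far ahead: skip this round
    print(should_enter_accept(6, [3, 4], 5))   # close enough: keep accepting
```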
Stephen Hemminger
Re: SO_REUSEPORT?
On Thu, 07 Aug 2008 11:17:55 -0700 I suspect thread balancing would actually hurt performance! You would be better off to have a couple of "hot" threads that are doing all the work and stay in cache. If you push the work around to all the threads, you have worst case cache behaviour. --
Aug 7, 3:03 pm 2008
Tom Herbert
Re: SO_REUSEPORT?
On Thu, Aug 7, 2008 at 12:03 PM, Stephen Hemminger I'm not sure that's applicable for us since the server application and networking will max out all the CPUs on host anyway; one way or another we need to dispatch the work of incoming connections to threads on different CPUs. If we do this in user space and do all accepts in one thread, the CPU of that thread becomes the bottleneck (we're accepting about 40,000 connections per second). If we have multiple accept threads running on different CPUs...
Aug 7, 3:43 pm 2008
Rick Jones
Re: SO_REUSEPORT?
Well, if you _really_ want the load spread, you may need to use a multiqueue (at least inbound if not also later outbound) interface, "know" how the NIC will hash and then have N distinct port numbers each assigned to a LISTEN endpoint. The old song and dance about making an N CPU system look as much like N single-CPU systems and all that... Unless there are NICs you can "tell" where to send the interrupts, which IMO is preferable - I have a preference for the application/scheduler telling...
Aug 7, 4:14 pm 2008
Tom Herbert
Re: SO_REUSEPORT?
Yep that's what I really want, except for the fact that I can only use a single port for the server-- all flows could be nicely distributed by the NIC multiqueue, but I still have the problem of how to ensure that the accepting thread for a connection is run on the same CPU as NICs are already doing steering based on tuple hash (RSS), and I think some will allow specifying the CPU for interrupt based on RX flow. Maybe this would address the issues of Inbound Packet Scheduling? Thanks for the p...
Aug 7, 7:05 pm 2008
Rick Jones
Re: SO_REUSEPORT?
All IPS in HP-UX 10.20 did was hash the IP/port numbers and queue based on that - this at the handoff between driver and netisr. The problem was if you had a thread of execution servicing more than one connection, you would start whipsawing across the processors based on the remote addressing. There are IIRC indeed some NICs where you can give them a finite number of tuples and say where each tuple should go. I'm sure those vendors if watching can speak-up :) That sort of functionality...
Aug 7, 7:28 pm 2008
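Editor's note: the hash-and-queue steering described above, and the "whipsawing" it causes when one thread services several connections, can be illustrated with a toy model. `zlib.crc32` is only a stand-in and is not the hash that HP-UX IPS or any NIC uses; `pick_cpu` and `cpus_touched` are hypothetical names.

```python
import zlib


def pick_cpu(src_ip, src_port, dst_ip, dst_port, num_cpus):
    """Map a connection's address/port tuple to a CPU index by hashing it."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_cpus


def cpus_touched(connections, num_cpus):
    """Set of CPUs that would process packets for one thread's connections."""
    return {pick_cpu(*conn, num_cpus) for conn in connections}


if __name__ == "__main__":
    # One server thread servicing several remote peers on the same local port.
    conns = [(f"10.0.0.{i}", 40000 + i, "192.168.1.1", 80) for i in range(8)]
    # More than one CPU in the result means the thread's traffic is spread
    # across processors purely because of the remote addressing.
    print(sorted(cpus_touched(conns, num_cpus=4)))
```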
Atsushi Nemoto
[PATCH] ne: Use CONFIG_MACH_TX49XX
After some cleanups in arch/mips area, now MACH_TX49XX is selected for both TOSHIBA_RBTX4927 and TOSHIBA_RBTX4938. Fold these two conditions to one. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> --- drivers/net/Kconfig | 2 +- drivers/net/ne.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig index 8a03875..b6c1717 100644 --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -1171,7 +1171,7 @@ config ETH1...
Aug 7, 11:55 am 2008
Ben Dooks
AX88796: Fix locking in ethtool support
Fix a pair of nasty locking problems in the ax88796 driver spotted by a sparse check: warning: context imbalance in 'ax_get_settings' - wrong count at exit warning: context imbalance in 'ax_set_settings' - wrong count at exit Signed-off-by: Ben Dooks <ben-linux@fluff.org> Index: linux-2.6.27-rc2-quilt1/drivers/net/ax88796.c =================================================================== --- linux-2.6.27-rc2-quilt1.orig/drivers/net/ax88796.c 2008-08-07 17:15:01.000000000 +0100 +++ lin...
Aug 7, 12:21 pm 2008
Marc Haber
Need help with MCS7830 driver and 802.1q VLAN Tagging
Hi, I am having difficulties with the MCS7830 driver in recent Linux versions (last version I tried was 2.6.26.1, didn't try 2.6.26.2 yet). It looks like the MCS7830 has an issue which prevents VLAN tagged ethernet frames from being transmitted and/or received when the frame size is near the ethernet MTU. This behavior was known around the millennium for a lot of ethernet drivers (including tulip). The host OS does not know that the frame is destined to go out via a VLAN tagged interface...
Aug 7, 11:36 am 2008
Ben Hutchings
Re: Need help with MCS7830 driver and 802.1q VLAN Tagging
[...] This could be a general problem with USB Ethernet drivers, as usbnet.c doesn't seem to take account of VLAN header overhead. Does the following help? Ben. diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index 8463efb..d24d22e 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -42,6 +42,7 @@ #include <linux/mii.h> #include <linux/usb.h> #include <linux/usb/usbnet.h> +#include <linux/if_vlan.h> #define DRIVER_VERS...
Aug 7, 1:06 pm 2008
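Editor's note: the size arithmetic behind the VLAN problem discussed above can be worked through in a few lines. The constants mirror the kernel's ETH_HLEN, ETH_DATA_LEN, and VLAN_HLEN values; `max_frame_len` and `frame_fits` are hypothetical helpers, and frame lengths here exclude the trailing FCS. This is a sketch of the overhead calculation, not of what the truncated usbnet patch does.

```python
ETH_HLEN = 14        # destination MAC + source MAC + EtherType
ETH_DATA_LEN = 1500  # standard Ethernet payload size (the usual MTU)
VLAN_HLEN = 4        # 802.1Q tag inserted after the source MAC


def max_frame_len(mtu=ETH_DATA_LEN, vlan_tagged=False):
    """Largest frame (without FCS) a device must handle for a given MTU."""
    return mtu + ETH_HLEN + (VLAN_HLEN if vlan_tagged else 0)


def frame_fits(payload_len, hw_limit, vlan_tagged):
    """True if a frame carrying payload_len bytes fits within hw_limit bytes."""
    return payload_len + ETH_HLEN + (VLAN_HLEN if vlan_tagged else 0) <= hw_limit


if __name__ == "__main__":
    limit = max_frame_len()                              # sized without the tag
    print(frame_fits(1500, limit, vlan_tagged=False))    # full-size untagged frame
    print(frame_fits(1500, limit, vlan_tagged=True))     # same payload, tagged
    print(frame_fits(1496, limit, vlan_tagged=True))     # payload reduced by 4
```

A driver that sizes its buffers for the untagged case has no room for the 4-byte tag on full-size frames, which matches the symptom of failures only near the MTU.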
Arjan van de Ven Aug 7, 11:41 am 2008
John W. Linville
pull request: wireless-2.6#ath9k 2008-08-07
Dave, I know we all want to see ath9k merged ASAP. Since I'll be unavailable this weekend, I have created a branch that includes Luis's revamped list.h patches, the original ath9k patches, Adrian's ath9k gcc bug work-around patch, and another cleanup series from Sujith on top. Hopefully no further problems will reveal themselves with the list.h stuff. Please pull at your leisure, or if a problem arises I'll work to resolve it on Monday. Thanks! John --- Individual patches are availa...
Aug 7, 10:38 am 2008
Julius Volz
[PATCHv3 0/2] IPVS: Add Generic Netlink configuration interf...
This is the third iteration of the IPVS Netlink interface, this time with only a small fix for a typo found by Thomas Graf. If there are no further major issues, can this be applied? The two patches add a Generic Netlink interface to IPVS while keeping the old get/setsockopt interface for userspace backwards compatibility. The motivation for this is to have a more extensible interface for future changes, such as the planned IPv6 support. An ipvsadm that already uses the new interface is available...
Aug 7, 10:43 am 2008
Julius Volz
[PATCHv3 2/2] IPVS: Add genetlink interface implementation
Add the implementation of the new Generic Netlink interface to IPVS and keep the old set/getsockopt interface for userspace backwards compatibility. Signed-off-by: Julius Volz <juliusv@google.com> 1 files changed, 880 insertions(+), 0 deletions(-) diff --git a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c index 9a5ace0..b4c5cc3 100644 --- a/net/ipv4/ipvs/ip_vs_ctl.c +++ b/net/ipv4/ipvs/ip_vs_ctl.c @@ -37,6 +37,7 @@ #include <net/ip.h> #include <net/route.h> #i...
Aug 7, 10:43 am 2008
Julius Volz
[PATCHv3 1/2] IPVS: Add genetlink interface definitions to i...
Add IPVS Generic Netlink interface definitions to include/linux/ip_vs.h. Signed-off-by: Julius Volz <juliusv@google.com> 1 files changed, 160 insertions(+), 0 deletions(-) diff --git a/include/linux/ip_vs.h b/include/linux/ip_vs.h index ec6eb49..0f434a2 100644 --- a/include/linux/ip_vs.h +++ b/include/linux/ip_vs.h @@ -242,4 +242,164 @@ struct ip_vs_daemon_user { int syncid; }; +/* + * + * IPVS Generic Netlink interface definitions + * + */ + +/* Generic Netlink family i...
Aug 7, 10:43 am 2008
Denys Fedoryshchenko
[PATCH] reset optind and fix memory leak in m_ipt/iptables t...
According to the iptables sources, optind must be reset to zero before an iptables target is reused; this is important in batch processing. Also, if we are using an iptables target, we allocate memory for two temporary variables and inside the iptables_target structure, which must be freed when the job is done. Also, tflags and used in the structure must be reset to zero for the next reuse of the structure. Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb>...
Aug 7, 10:33 am 2008
jamal
Re: [PATCH] reset optind and fix memory leak in m_ipt/iptabl...
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> cheers, jamal --
Aug 7, 11:13 am 2008
John W. Linville
pull request: wireless-2.6 2008-08-07
Dave, A few more patches for 2.6.27. Also included is an iwlwifi cleanup that the Intel guys really wanted -- obviously I'm too soft...well, it seems harmless enough. I'll be out of town through the weekend celebrating my 10th wedding anniversary -- woohoo! I'll be away from email until Monday or so -- FYI... Still, let me know if there are problems! Thanks, John --- Individual patches are available here:
http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/mast...
Aug 7, 10:25 am 2008
jamal
[PATCH] wext: Send name on events
iproute2 doesn't like it, so I was getting irritated by this on my laptop. cheers, jamal
Aug 7, 9:25 am 2008
roel kluin
[PATCH?] atl1e: WAKE_MCAST 2x. 1st WAKE_UCAST?
untested, is it right? --- WAKE_MCAST bit tested twice, test WAKE_UCAST first. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> diff --git a/drivers/net/atl1e/atl1e_ethtool.c b/drivers/net/atl1e/atl1e_ethtool.c index cdc3b85..619c658 100644 --- a/drivers/net/atl1e/atl1e_ethtool.c +++ b/drivers/net/atl1e/atl1e_ethtool.c @@ -355,7 +355,7 @@ static int atl1e_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol) struct atl1e_adapter *adapter = netdev_priv(netdev); if (wol->...
Aug 7, 12:24 pm 2008
jamal
[PATCH]net-sched: Fix actions referencing
Noticed while exercising Denys' iproute changes. cheers, jamal
Aug 7, 5:42 am 2008
Jeff Garzik
[git patches] net driver updates
Rough summary: * fixes * additions of the variety where only a few lines are added, to support new hardware (forcedeth, netxen) * a WAN update that got missed, and IMO should go in for 2.6.27. This is addressing the complaint about too much stuff still using syncppp, so things get switched more to generic HDLC. MUCH more typesafe, and links directly with struct net_device, rather than through opaque void pointers and syncppp-specific structs. * bonding update with core bits prev...
Aug 7, 4:50 am 2008
Martin Michlmayr
Re: [git patches] net driver updates
Jeff, it seems you missed this patch: From: Mikael Pettersson <mikpe@it.uu.se> The arm ixp4xx_eth driver doesn't compile in 2.6.27-rc1: CC [M] drivers/net/arm/ixp4xx_eth.o drivers/net/arm/ixp4xx_eth.c: In function 'eth_poll': drivers/net/arm/ixp4xx_eth.c:554: warning: passing argument 1 of 'dma_mapping_error' makes pointer from integer without a cast drivers/net/arm/ixp4xx_eth.c:554: error: too few arguments to function 'dma_mapping_error' drivers/net/arm/ixp4xx_eth.c: In function 'eth...
Aug 7, 6:09 am 2008
David Miller
Re: [git patches] net driver updates
From: Jeff Garzik <jeff@garzik.org> I've pulled, but the WAN bits were borderline. If you couldn't pull that WAN stuff in on time, that's just the way it goes sometimes. --
Aug 7, 5:19 am 2008
Aleksey Senin
Is this necessary export ndisc_send_ns
Hello, all! I'm working on IPv6 support for the InfiniBand stack, and some modules need to perform network discovery. Is there a way to do it without exporting the symbol ndisc_send_ns? --
Aug 7, 4:01 am 2008
Jeff Garzik Aug 7, 1:56 am 2008
Gui Jianfeng
[PATCH] Fix kernel panic when calling tcp_v(4/6)_md5_do_lookup
If the following packet flow happens, the kernel will panic. MachineA MachineB SYN ----------------------> SYN+ACK <---------------------- ACK(bad seq) ----------------------> When a bad seq ACK is received, tcp_v4_md5_do_lookup(skb->sk, ip_hdr(skb)->daddr)) is finally called by tcp_v4_reqsk_send_ack(), but the first parameter(skb->sk) is NULL at that moment, so a kernel panic happens. This patch fixes this bug. Below is the OOPS output: [ 302.812793] IP: ...
Aug 7, 1:12 am 2008
David Miller
Re: [PATCH] Fix kernel panic when calling tcp_v(4/6)_md5_do_...
From: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Applied, thanks. --
Aug 7, 1:44 am 2008
David Miller
Re: [PATCH] Fix kernel panic when calling tcp_v(4/6)_md5_do_...
From: David Miller <davem@davemloft.net> Actually I'm reverting, this was not properly build tested: net/dccp/ipv4.c:555: warning: initialization from incompatible pointer type net/dccp/ipv6.c:358: warning: initialization from incompatible pointer type --
Aug 7, 1:47 am 2008
Gui Jianfeng
[PATCH] Fix kernel panic when calling tcp_v(4/6)_md5_do_lookup
If the following packet flow happens, the kernel will panic. MachineA MachineB SYN ----------------------> SYN+ACK <---------------------- ACK(bad seq) ----------------------> When a bad seq ACK is received, tcp_v4_md5_do_lookup(skb->sk, ip_hdr(skb)->daddr)) is finally called by tcp_v4_reqsk_send_ack(), but the first parameter(skb->sk) is NULL at that moment, so a kernel panic happens. This patch fixes this bug. The OOPS output is as follows: [ 302.812793] IP:...
Aug 7, 2:44 am 2008
David Miller
Re: [PATCH] Fix kernel panic when calling tcp_v(4/6)_md5_do_...
From: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Applied, let's see if this one builds properly :-) --
Aug 7, 2:51 am 2008
Tobias Koeck
meant hlist_* - Re: hash_list and rcu locking question
I mean the rcu function for the hlist_{head,list} .... Greetings t. --
Aug 6, 9:49 pm 2008
Evgeniy Polyakov
Re: meant hlist_* - Re: hash_list and rcu locking question
include/linux/rculist.h : hlist_for_each_entry_rcu and friends. Does it contain the needed calls? -- Evgeniy Polyakov --
Aug 6, 11:53 pm 2008
Stephen Hemminger
sfq dump broken in 2.6.27-rc1
With only one sfq (on eth0), I am getting multiple results from 'tc qdisc ls' # tc qdisc ls qdisc sfq 8001: dev eth0 root limit 127p quantum 1514b qdisc sfq 8001: dev eth0 root limit 127p quantum 1514b qdisc sfq 8001: dev eth0 root limit 127p quantum 1514b qdisc sfq 8001: dev eth0 root limit 127p quantum 1514b qdisc sfq 8001: dev eth0 root limit 127p quantum 1514b Still bisecting, since there is no obvious reason for the sudden borkage. --
Aug 6, 9:02 pm 2008
David Miller
Re: sfq dump broken in 2.6.27-rc1
From: Stephen Hemminger <stephen.hemminger@vyatta.com> Don't bother bisecting, it will take longer than the obvious set of debug printk's you could add to net/sched/sch_api.c:tc_dump_qdisc(). Please use a "mindful" approach to debugging this instead of a "mindless" one like bisect :-) --
Aug 6, 9:13 pm 2008
Stephen Hemminger
Re: sfq dump broken in 2.6.27-rc1
On Wed, 06 Aug 2008 18:13:27 -0700 (PDT) What ever happened to "you broke it, you fix it?" --
Aug 6, 11:20 pm 2008
David Miller
Re: sfq dump broken in 2.6.27-rc1
From: Stephen Hemminger <stephen.hemminger@vyatta.com> What ever happened to debugging things by reading the code and thinking? By all means, when I'm dealing with a user who doesn't know the kernel or networking in particular very well, going the bisect route is often the way. But for someone as skilled and knowledgeable as you? Give me a break :-) --
Aug 6, 11:22 pm 2008
Stephen Hemminger
[PATCH] net: trap attempts to modify noop qdisc
Since noop qdisc is a singleton, it shouldn't end up with any other qdiscs on its list, and it shouldn't be deleted. Dave, this should help you find the bug. Any change to root qdisc causes this to trigger. I.e., doing: tc qdisc add dev eth0 root pfifo causes pfifo to end up on the noop_qdisc->list, which later causes problems. diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index 4840aff..57b778f 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -792,8 +792,10 @@ qdi...
Aug 7, 2:08 am 2008
David Miller
Re: [PATCH] net: trap attempts to modify noop qdisc
From: Stephen Hemminger <stephen.hemminger@vyatta.com> Thanks. --
Aug 7, 2:11 am 2008
Stephen Hemminger
Re: [PATCH] net: trap attempts to modify noop qdisc
On Wed, 06 Aug 2008 23:11:59 -0700 (PDT) I think the root of your problem (bad pun) is that the new code is assuming that changes to the root are done with parent handle of 0, but the API is for the parent handle to be TC_H_ROOT (0xFFFFFFFFU). --
Aug 7, 2:15 am 2008
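Editor's note: the distinction drawn in the last message, a parent handle of 0 versus TC_H_ROOT (0xFFFFFFFF), is easier to see with the handle encoding written out. Traffic-control handles are 32 bits, a 16-bit major and a 16-bit minor, both written in hex by `tc`. The constants below match the values in the kernel's pkt_sched.h; `parse_handle` and `targets_root` are hypothetical helpers, and this is a sketch of the encoding, not of the sch_api.c logic.

```python
TC_H_MAJ_MASK = 0xFFFF0000
TC_H_MIN_MASK = 0x0000FFFF
TC_H_UNSPEC = 0x00000000
TC_H_ROOT = 0xFFFFFFFF


def parse_handle(text):
    """Convert a tc-style 'major:minor' handle (hex fields) to its 32-bit value."""
    major, _sep, minor = text.partition(":")
    maj = int(major, 16) & 0xFFFF
    mnr = (int(minor, 16) if minor else 0) & 0xFFFF
    return (maj << 16) | mnr


def targets_root(parent):
    """True only when the request names the root explicitly via TC_H_ROOT."""
    return parent == TC_H_ROOT


if __name__ == "__main__":
    print(hex(parse_handle("8001:")))  # the sfq handle from the dump earlier in the thread
    print(targets_root(TC_H_ROOT), targets_root(TC_H_UNSPEC))
```

Code that tests for a parent of 0 when it means "root" would treat an unspecified parent and the root as the same thing, which is the mismatch the message points at.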