Re: [Bugme-new] [Bug 11144] New: dhcp doesn't work with iwl4965

Previous thread: Huge packet loss on Intel 80003ES2LAN by Vlad Seliverstov on Tuesday, July 22, 2008 - 3:34 am. (2 messages)

Next thread: [bug] hard hang with CONFIG_USB_USBNET=y by Ingo Molnar on Tuesday, July 22, 2008 - 6:07 am. (1 message)
From: Andrew Morton
Date: Tuesday, July 22, 2008 - 3:48 am

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).



A whole pile of networking patches went into mainline about 12 hours
ago, one of which might have fixed this.  Can you please test
2.6.26-git10 once it has appeared and let us know the result?


Thanks.
--

From: Patrick McHardy
Date: Tuesday, July 22, 2008 - 3:52 am

This implies you're running shaping on your WLAN device. Please
post the rules you're using.


--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 12:58 am

The problem still occurs with 2.6.26-git10. But if I use a static
address instead of DHCP, the connection works. DHCP also works with my
ethernet card (a Marvell Yukon, which uses the sky2 driver). But DHCP
doesn't work with my wireless connection using the iwl4965 driver. As I
said, my router offers a correct IP address several times, but my
computer does not accept it. I don't use traffic shaping (see my iptables
rules). Furthermore, the problem still occurs if I disable iptables.
Does anybody have an idea how to solve the problem?

Thanks for your help,
François

P.S.: Why don't you want to use the bugzilla interface?
--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 12:59 am

From: Andrew Morton
Date: Wednesday, July 23, 2008 - 1:05 am

Depends on the subsystem.  ACPI is 100% bugzilla-based, net people 
prefer email, others are in-between.

Plus for recently-occurring bugs it's best to knock them over via email
without getting into bugzilla bureaucracy.  bugzilla is more
appropriate to longer-term bugs.

And when the bug affects multiple subsystems, multiple
developers and we don't even know which subsystem introduced it, it is
much better to perform the diagnosis and repair via email.


I make a judgement call on each report.  Been doing it for a while,
don't get too many complaints ;)
--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 2:17 am

OK, thanks for the explanation. But do you have any more ideas about the
problem?
--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 3:12 am

I tested myself using dhcpcd 3.2.3 on an ethernet device and it
works fine, so this appears to be driver related.
--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 4:29 am

I also noticed that DHCP works without problems with my ethernet card
(using the sky2 driver), so there seems to be a problem or conflict with
the iwl4965 driver. However, if I use dhclient instead of dhcpcd, I can
get an IP address with DHCP and the DNS servers are correctly written in
/etc/resolv.conf. Unfortunately, name resolution doesn't work, even
though I can ping the DNS server.

 
--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 4:31 am

Interesting. Could you post straces of both commands please?


--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 4:58 am

Patrick McHardy wrote:
--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 5:36 am

Patrick McHardy wrote:
--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 5:44 am

Thanks, I couldn't spot anything in these traces. I'll try to
reproduce this with iwl3945.
--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 5:51 am

Well, no luck, the current -git kernel doesn't boot on my notebook
and I don't have time for a bisection currently. Could someone
else try to verify/debug this?
--

From: François Valenduc
Date: Wednesday, July 23, 2008 - 7:57 am

I did the bisection again, this time on the whole tree and the first bad 
commit is again the one I mentioned previously:

175f9c1bba9b825d22b142d183c9e175488b260c is first bad commit
commit 175f9c1bba9b825d22b142d183c9e175488b260c
Author: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Date:   Sun Jul 20 00:08:47 2008 -0700

    net_sched: Add size table for qdiscs

    Add size table functions for qdiscs and calculate packet size in
    qdisc_enqueue().

    Based on patch by Patrick McHardy
     http://marc.info/?l=linux-netdev&m=115201979221729&w=2

This time, I didn't encounter kernels which didn't compile. So, I didn't 
use git-reset or git-bisect skip.

François

--
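
The bisection reported above is, at its core, a binary search for the
first bad commit. The following is a minimal sketch in C of that idea
only. It assumes a linear history indexed 0..n-1 in which every commit
after the first bad one is also bad and the last commit is known bad.
All names (sketch_first_bad, sketch_bad_from_7) are hypothetical, and
this is not how git itself implements bisection.

```c
#include <assert.h>
#include <stddef.h>

/* Predicate: nonzero if the kernel built from this commit shows the bug. */
typedef int (*sketch_is_bad_fn)(size_t index);

/*
 * Return the index of the first bad commit in [0, n).
 * Assumes commit n-1 is bad and that "bad" never turns back into "good".
 */
size_t sketch_first_bad(size_t n, sketch_is_bad_fn is_bad)
{
    size_t lo, hi;

    assert(n > 0 && is_bad != NULL);
    lo = 0;
    hi = n - 1;                 /* invariant: first bad commit is in [lo, hi] */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (is_bad(mid))
            hi = mid;           /* first bad commit is mid or earlier */
        else
            lo = mid + 1;       /* first bad commit is after mid */
    }
    return lo;
}

/* Example predicate for illustration: the bug appears at index 7. */
int sketch_bad_from_7(size_t index)
{
    return index >= 7;
}
```

Each test halves the remaining range, which is why a bisection over
thousands of commits needs only a dozen or so builds.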

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 8:18 am

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 8:19 am

Patrick McHardy wrote:
> Fran
From: François Valenduc
Date: Wednesday, July 23, 2008 - 8:42 am

I tested your last patch. Unfortunately, I get the following compile error:

In file included from net/mac80211/main.c:11:
include/net/mac80211.h: In function ‘IEEE80211_SKB_CB’:
include/net/mac80211.h:347: error: size of array ‘type name’ is negative


François
--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 8:52 am

I was afraid that might happen. This means skb->cb is not large
enough to hold both the qdisc and the ieee80211 structs.

Just for testing, changing (include/net/mac80211.h):

#define IEEE80211_TX_INFO_DRIVER_DATA_SIZE \
         (sizeof(((struct sk_buff *)0)->cb) - 8)

to

#define IEEE80211_TX_INFO_DRIVER_DATA_SIZE \
         (sizeof(((struct sk_buff *)0)->cb) - 12)

might help to get it to compile. If that doesn't work, try -16.
--
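
A "size of array is negative" error is typical of a compile-time size
check built on a negative array length. The sketch below shows that
trick and the arithmetic behind the -8, -12 and -16 variants of the
macro. It assumes a 48-byte cb, as the "48 bytes" remark later in this
thread suggests. Every SKETCH_/sketch_ name is hypothetical, and the
sizes passed to sketch_fits_in_cb() are illustrative, not the real
kernel struct sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size of skb->cb, in bytes. */
#define SKETCH_CB_SIZE 48

/*
 * Compile-time check: when cond is true the array type has length -1,
 * so the compiler rejects it with a "negative array size" error.
 */
#define SKETCH_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Same shape as IEEE80211_TX_INFO_DRIVER_DATA_SIZE: cb size minus a reserve. */
size_t sketch_driver_data_size(size_t reserved)
{
    assert(reserved <= SKETCH_CB_SIZE);
    return SKETCH_CB_SIZE - reserved;
}

/* Nonzero when qdisc-private bytes plus the wireless TX info fit in cb. */
int sketch_fits_in_cb(size_t qdisc_bytes, size_t tx_info_bytes)
{
    return qdisc_bytes + tx_info_bytes <= SKETCH_CB_SIZE;
}

/* A check that holds, so this function compiles. */
void sketch_compile_time_check(void)
{
    SKETCH_BUILD_BUG_ON(sizeof(char[SKETCH_CB_SIZE]) > SKETCH_CB_SIZE);
}
```

Shrinking the driver data area only helps if it is the largest part of
the TX info; if another part is just as large, the total does not
shrink and the check keeps failing.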

From: François Valenduc
Date: Wednesday, July 23, 2008 - 8:58 am

That didn't work, with either -12 or -16.

--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 8:59 am

I'll give it a try myself, please wait a few minutes.
--

From: Patrick McHardy
Date: Wednesday, July 23, 2008 - 9:25 am

We can't fit them into the cb together, I don't see a way to
shrink ieee80211_tx_info.

Maybe one of the wireless folks can suggest something? Is it
really necessary to pass the full struct ieee80211_tx_info
through the qdisc layer, or could the struct be split? It
needs to find a way to co-exist peacefully with qdiscs'
skb->cb usage.

--

From: David Miller
Date: Wednesday, July 23, 2008 - 2:21 pm

From: Patrick McHardy <kaber@trash.net>

This is another area that got mangled up in the ->select_queue()
conversion of the WME bits, but in another aspect this problem
existed beforehand as well.

Specifically, the code in net/mac80211/rx.c that requeues RX packets
for transmit resends them back out the wireless device by setting a
bit in the SKB CB and then calling dev_queue_xmit().

That's completely illegal :-)

There's a ton of stuff in that structure, I can't see how to
make it smaller either.  Maybe some bits only matter through
the layers of the TX mac80211 stack and thus can be passed
as parameters during such processing?

--

From: Patrick McHardy
Date: Thursday, July 24, 2008 - 1:58 am

It seems it's doing even more illegal things that were also
present previously. The ieee80211_master_start_xmit function
expects to get a valid IEEE80211_SKB_CB, which means it
expects it to survive through the entire qdisc layer. I'm
not sure how packets get to the master device from the
subifs though, so I might be wrong.


--
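
The conflict discussed in the messages above can be pictured as two
layers writing to the same scratch buffer. Below is a minimal sketch in
C, assuming a 48-byte cb and that both layers store their data at the
start of it. The struct and function names are hypothetical stand-ins,
not the real sk_buff, qdisc or mac80211 definitions.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_CB_SIZE 48

/* Stand-in for a packet carrying a per-layer control buffer. */
struct sketch_skb {
    unsigned char cb[SKETCH_CB_SIZE];
};

/* Each layer's private view of the same cb bytes. */
struct sketch_wifi_cb  { unsigned int flags; };
struct sketch_qdisc_cb { unsigned int pkt_len; };

/* Wireless layer stores its flags at the start of cb. */
void sketch_wifi_set_flags(struct sketch_skb *skb, unsigned int flags)
{
    struct sketch_wifi_cb w = { flags };

    assert(skb != NULL);
    memcpy(skb->cb, &w, sizeof(w));
}

/* Wireless layer reads back whatever is at the start of cb. */
unsigned int sketch_wifi_get_flags(const struct sketch_skb *skb)
{
    struct sketch_wifi_cb w;

    assert(skb != NULL);
    memcpy(&w, skb->cb, sizeof(w));
    return w.flags;
}

/* Queueing layer records the packet length in the same bytes. */
void sketch_qdisc_enqueue(struct sketch_skb *skb, unsigned int len)
{
    struct sketch_qdisc_cb q = { len };

    assert(skb != NULL);
    memcpy(skb->cb, &q, sizeof(q));
}
```

If the wireless flags are set before the packet is enqueued, the value
read back afterwards is the packet length, not the flags. That is the
sense in which data in cb cannot be expected to survive a layer that
uses cb for itself.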

From: Tomas Winkler
Date: Thursday, July 24, 2008 - 3:17 am

Isn't this time to make 802.11 native?
Tomas.
--

From: Patrick McHardy
Date: Thursday, July 24, 2008 - 3:19 am

You mean by making these things real skb members? I hope you're
not talking about the entire 48 bytes? :)

--

From: Tomas Winkler
Date: Thursday, July 24, 2008 - 4:35 am

I mean elevating 802.11 header to 802.3 level. Not doing 802.3 ->
802.11 translation where not needed.
Tomas
--

From: Johannes Berg
Date: Thursday, July 24, 2008 - 4:45 am

We've discussed that over and over and over and over again, and it's not
going to fly since 802.11 throughout assumes that you have a two-address
data service to the upper layers.

Besides, making it 802.11 native doesn't help with the control
information at all.

johannes
--

From: François Valenduc
Date: Wednesday, July 30, 2008 - 11:04 am

It seems the patch you made available at
http://lkml.org/lkml/2008/7/29/153 solves my problem. DHCP works again
if I apply it on kernel 2.6.27-rc1.

François
--
