thousands of classes, e1000 TX unit hang

Previous thread: [PATCH net-2.6] bridge: fix compile warning in net/bridge/br_netfilter.c by Rami Rosen on Tuesday, August 5, 2008 - 2:45 am. (2 messages)

Next thread: Re: SYSFS support together with NET_NS status? by Daniel Lezcano on Tuesday, August 5, 2008 - 4:01 am. (1 message)
To: <netdev@...>
Date: Tuesday, August 5, 2008 - 3:47 am

I wrote a script that looks something like this (to simulate SFQ with the
flow classifier):

($2 is the ppp interface)
echo "qdisc del dev $2 root ">>${TEMP}
echo "qdisc add dev $2 root handle 1: htb ">>${TEMP}
echo "filter add dev $2 protocol ip pref 16 parent 1: u32 \
match ip dst 0.0.0.0/0 police rate 8kbit burst 2048kb \
peakrate 1024Kbit mtu 10000 \
conform-exceed continue/ok">>${TEMP}

echo "filter add dev $2 protocol ip pref 32 parent 1: handle 1 \
flow hash keys nfct divisor 128 baseclass 1:2">>${TEMP}

echo "class add dev $2 parent 1: classid 1:1 htb \
rate ${rate}bit ceil ${rate}Kbit quantum 1514">>${TEMP}

# Loop to add the leaf classes (slots 2..maxslot)
maxslot=130
for slot in $(seq 2 $maxslot); do
    echo "class add dev $2 parent 1:1 classid 1:$slot htb \
rate 8Kbit ceil 256Kbit quantum 1514">>${TEMP}
    echo "qdisc add dev $2 handle $slot: parent 1:$slot bfifo limit 3000">>${TEMP}
done

After adding around 400-450 (ppp) interfaces, the server starts to "crack".
There is definitely packet loss on eth0 (although there are no filters or
shapers on it). Even deleting all the classes becomes a challenge. After
deleting all root handles on the ppp interfaces, it becomes OK again.

Traffic through the host is 15-20 Mbit/s at that moment. It is a single-CPU
Xeon 3.0 GHz on an SE7520 server motherboard with 1 GB of RAM (at the time of
testing more than 512 MB was free).

The kernel is 2.6.26.1-vanilla.
Is there any other information I should add?

Error message appearing in dmesg:
[149650.006939] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
[149650.006943] Tx Queue <0>
[149650.006944] TDH <a3>
[149650.006945] TDT <a3>
[149650.006947] next_to_use <a3>
[149650.006948] next_to_clean <f8>
[149650.006949] buffer_info[next_to_clean]
[149650.006951] time_stamp <8e69a7c>
[149650.006952] next_to_watch <f8>
[149650.006953] jiffies <8e6a111>
[149650.006...

To: Denys Fedoryshchenko <denys@...>, <netdev@...>
Date: Tuesday, August 5, 2008 - 9:13 pm

<snip>

Just to clarify the e1000 message: this is a "false hang", as indicated
by .status = 1 and TDH==TDT with both still moving, which means the
adapter is still transmitting.

In this case it appears that the system took longer than two seconds to
allow the e1000 driver to clean up packets that it transmitted, in fact

So I don't think this is an e1000 problem, and the rest of your thread
appears to confirm that: the system gets too busy traversing your tc
filters and can't get to the e1000 driver's packet cleanup.

Jesse
--
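
The TDH==TDT part of the check described above can be sketched as a small
filter over dmesg output. This is only an illustration: the function name
check_tx_hang is made up here, it looks at nothing but the TDH and TDT lines
of each dump, and equal values alone do not prove a false hang (the .status
field and whether both indexes keep moving matter too).

```shell
#!/bin/sh
# check_tx_hang (hypothetical helper): read e1000 "Detected Tx Unit Hang"
# dumps on stdin and report, for each dump, whether TDH and TDT match.
check_tx_hang() {
    awk '
        # Dump lines look like: "[149650.006944] TDH <a3>"
        NF >= 2 && $(NF-1) == "TDH" { tdh = $NF; gsub(/[<>]/, "", tdh) }
        NF >= 2 && $(NF-1) == "TDT" {
            tdt = $NF; gsub(/[<>]/, "", tdt)
            # Force a string comparison: the values are hex digits.
            verdict = ((tdh "") == (tdt "")) ? "equal" : "differ"
            print "TDH=" tdh " TDT=" tdt " " verdict
        }
    '
}
```

Typical use would be `dmesg | check_tx_hang`, which prints one line per dump
found in the log.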

To: <netdev@...>
Date: Tuesday, August 5, 2008 - 4:06 am

A little bit more info:

This is oprofile output from another machine (which doesn't suffer as much,
but I also notice drops on eth0 there after adding around 100 interfaces). On
the first machine the clocksource is TSC; on the machine where I collected
these stats it is acpi_pm.

CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory) count 100000
GLOBAL_POWER_E...|
samples| %|
------------------
973464 75.7644 vmlinux
97703 7.6042 libc-2.6.1.so
36166 2.8148 cls_fw
18290 1.4235 nf_conntrack
17946 1.3967 busybox
GLOBAL_POWER_E...|

CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not
stopped) with a unit mask of 0x01 (mandatory) count 100000
samples % symbol name
245545 23.1963 acpi_pm_read
143863 13.5905 __copy_to_user_ll
121269 11.4561 ioread16
58609 5.5367 gen_kill_estimator
40153 3.7932 ioread32
33923 3.2047 ioread8
16491 1.5579 arch_task_cache_init
16067 1.5178 sysenter_past_esp
11604 1.0962 find_get_page
10631 1.0043 est_timer
9038 0.8538 get_page_from_freelist
8681 0.8201 sk_run_filter
8077 0.7630 irq_entries_start
7711 0.7284 schedule
6451 0.6094 copy_to_user

--

To: <netdev@...>
Date: Tuesday, August 5, 2008 - 6:05 am

I found that the packet loss happens when I am deleting/adding classes.
I attach the oprofile result as a file.

To: Denys Fedoryshchenko <denys@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 7:04 am

...

Deleting of estimators (gen_kill_estimator) isn't optimized for
a large number of them, and it's a known issue. Adding of classes
shouldn't be such a problem, but maybe you could try to do this
before adding filters directing to those classes.

Since you can control rate with htb, I'm not sure you really need
policing: at least you could try if removing this changes anything.
And I'm not sure: do these tx hangs happen only when classes are
added/deleted or otherwise too?

Jarek P.
--
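
The ordering suggested above (classes first, filters last) could look roughly
like the sketch below. It reuses the tc commands from the original script but
prints them in a different order. The function name emit_shaper and its
arguments are invented for illustration, the root class rate is passed in
with its unit as a plain string, and the output is assumed to be fed to tc in
batch mode as the original script appears to do. None of this has been tested
against the reported problem.

```shell
#!/bin/sh
# emit_shaper IFACE RATE [FIRST_SLOT LAST_SLOT]   (hypothetical helper)
# Prints tc batch lines: qdisc, then all classes, then the filters.
emit_shaper() {
    dev=$1 rate=$2 first=${3:-2} last=${4:-130}

    echo "qdisc add dev $dev root handle 1: htb"
    echo "class add dev $dev parent 1: classid 1:1 htb rate $rate ceil $rate quantum 1514"

    # 1. Leaf classes and their bfifo queues.
    # Slot numbers are written exactly as in the original loop; note that
    # tc reads class IDs as hexadecimal, so "1:10" means class 0x10.
    slot=$first
    while [ "$slot" -le "$last" ]; do
        echo "class add dev $dev parent 1:1 classid 1:$slot htb rate 8Kbit ceil 256Kbit quantum 1514"
        echo "qdisc add dev $dev handle $slot: parent 1:$slot bfifo limit 3000"
        slot=$((slot + 1))
    done

    # 2. Filters last, once the classes they can select already exist.
    echo "filter add dev $dev protocol ip pref 16 parent 1: u32 match ip dst 0.0.0.0/0 police rate 8kbit burst 2048kb peakrate 1024Kbit mtu 10000 conform-exceed continue/ok"
    echo "filter add dev $dev protocol ip pref 32 parent 1: handle 1 flow hash keys nfct divisor 128 baseclass 1:2"
}
```

For example, `emit_shaper ppp0 2048Kbit >> "$TEMP"` would append the commands
for one interface to the batch file.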

To: Jarek Poplawski <jarkao2@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 7:13 am

The policer creates the burst for me.
For example, the first 2 Mbyte (+rate*time if more precision is needed) will
pass at high speed (1 Mbit); then, if the flow is still using maximum
bandwidth, it will be throttled to the HTB rate. When I tried to play with
the cburst/burst values in HTB I was not able to achieve the same results. I
can do the same with TBF and its peakrate/burst, but not with HTB.
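
As a rough check of how long such a burst lasts, the arithmetic can be
sketched in shell. This assumes an idealized token bucket (the bucket starts
full, drains at the peak rate and refills at the policer rate) and assumes
tc's units are 1024 bytes for "kb" and 1000 bit/s for "kbit"; the helper name
burst_ms is made up here.

```shell
#!/bin/sh
# burst_ms BURST_BYTES PEAK_BPS RATE_BPS   (hypothetical helper)
# Milliseconds until a full bucket is empty when traffic arrives at the
# peak rate while tokens refill at the policer rate.
burst_ms() {
    echo $(( $1 * 8 * 1000 / ($2 - $3) ))
}

# burst 2048kb, peakrate 1024Kbit, rate 8kbit from the filter above
burst_ms $((2048 * 1024)) 1024000 8000   # prints 16513
```

So under these assumptions a flow running flat out would get about 16.5
seconds at the peak rate before being held to the lower rate.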

It happens when the root qdisc (which holds around 130 child classes) is
deleted. Probably gen_kill_estimator takes all the resources while I am
deleting the root class.
I did a test on a machine with 150 ppp interfaces (Pentium 4 3.2 GHz), just
deleting the root qdisc, and I got huge packet loss. When I am only adding
classes, there is no significant packet loss.
Surely it is not right that deleting a qdisc on one ppp interface causes
packet loss on the whole system? Is there a possible workaround until
gen_kill_estimator is rewritten?

Of course I can try to avoid "mass deleting" classes, but I think many people
will hit this bug, especially newbies who implement a "many class" setup.

--

To: Denys Fedoryshchenko <denys@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 8:23 am

Very interesting. Anyway tbf doesn't use gen estimators, so you could

Actually, gen_kill_estimator was rewritten already, but for some
reason it wasn't merged. Maybe there aren't so many users with such a
number of classes, or they don't delete them; anyway, this subject isn't
reported often to the list (I remember once). A workaround could
probably be deleting individual classes (and filters), to give away the
lock and soft interrupts for a while, before deleting the root, but I
didn't test this. BTW, you are using quite long queues (3000), so it
would be interesting to make them shorter and check whether they add to
the problem (with retransmits).

Jarek P.
--
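
The untested workaround described above (remove filters and individual
classes first, the root qdisc last) might be sketched as follows. The
function name emit_teardown and its arguments are invented for illustration;
the slot numbers are printed the same way the original setup script wrote
them, and whether this actually reduces the stall is not confirmed anywhere
in the thread.

```shell
#!/bin/sh
# emit_teardown IFACE [FIRST_SLOT LAST_SLOT]   (hypothetical helper)
# Prints tc commands (without the leading "tc") in teardown order.
emit_teardown() {
    dev=$1 first=${2:-2} last=${3:-130}

    # Filters go first so that no filter still points at a class.
    echo "filter del dev $dev parent 1:"

    # Then the leaf classes, one command per class.
    slot=$first
    while [ "$slot" -le "$last" ]; do
        echo "class del dev $dev classid 1:$slot"
        slot=$((slot + 1))
    done

    # Finally the inner class and the (now nearly empty) root qdisc.
    echo "class del dev $dev classid 1:1"
    echo "qdisc del dev $dev root"
}
```

Each printed line is meant to be run as its own tc invocation, for example
from a `while read -r cmd; do tc $cmd; done` loop, optionally with a short
pause between commands.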

To: Jarek Poplawski <jarkao2@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 10:07 am

By the way, even if I optimize my scripts, when a ppp interface goes down
and disappears all its classes will still be deleted, and the system locks
up for a short time (with all the related TX hangs, packet loss, etc.).
--

To: Denys Fedoryshchenko <denys@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 12:48 pm

Are you sure you can't let pppd run this script when the link goes
down?

Jarek P.
--

To: Jarek Poplawski <jarkao2@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 1:18 pm

I probably can, via /etc/ppp/ip-down. I will try it.

--

To: Jarek Poplawski <jarkao2@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 9:02 am

[Empty message]
To: Denys Fedoryshchenko <denys@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 5:14 pm

> > BTW, you are using quite long queues (3000), so there

BTW, sorry for my gibberish about long queues. I missed this "b"fifo.

Jarek P.
--

To: Denys Fedoryshchenko <denys@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 12:41 pm

If such config works for you then why bother? I've some doubts, but

Alas you are the second:
http://www.mail-archive.com/netdev@vger.kernel.org/msg60101.html

If you think this gen_kill_estimator() fix will solve your problems
you can try to upgrade this patch and resend as yours or ours, it's
not copyrighted. Of course, no guarantee it'll be accepted this time.

Jarek P.
--

To: Jarek Poplawski <jarkao2@...>
Cc: <netdev@...>
Date: Tuesday, August 5, 2008 - 12:48 pm

TBF just makes a linear fifo: if a user with a 256 Kbit/s limit starts
downloading a large file, his bandwidth and his other apps will suffer.

If I use flow, SFQ, or rules like the ones I showed, each flow is balanced
into a small fifo and gets a fair share of the bandwidth. A flow with a huge
load will just have more packets dropped :-)
This means that even if a customer uses a flat 256 Kbit/s, his browsing is
still amazingly fast... (this has been checked).
--
