OK thanks David, I reproduced the problem on latest net-next-2.6 tree too. I wonder why nobody hit this before. [352468.556484] ------------[ cut here ]------------ [352468.556511] WARNING: at net/netfilter/nf_conntrack_extend.c:82 __nf_ct_ext_add+0x1c2/0x1e0 [nf_conntrack]() [352468.556559] Hardware name: ProLiant BL460c G1 [352468.556582] Modules linked in: nf_defrag_ipv4 nf_conntrack_netlink nf_conntrack sch_hfsc sch_sfq ipmi_devintf ipmi_si ipmi_msghandler hpilo bonding [last unloaded: nf_conntrack_ipv4] [352468.556675] Pid: 18852, comm: conntrack Tainted: G W 2.6.33-rc5-02754-g0ea034c-dirty #545 [352468.556721] Call Trace: [352468.556742] [<c054d45f>] ? printk+0x1d/0x26 [352468.556767] [<c023bbc2>] warn_slowpath_common+0x72/0xa0 [352468.556795] [<fee75e42>] ? __nf_ct_ext_add+0x1c2/0x1e0 [nf_conntrack] [352468.556825] [<fee75e42>] ? __nf_ct_ext_add+0x1c2/0x1e0 [nf_conntrack] [352468.556854] [<c023bc0a>] warn_slowpath_null+0x1a/0x20 [352468.556882] [<fee75e42>] __nf_ct_ext_add+0x1c2/0x1e0 [nf_conntrack] [352468.556911] [<fee70dcc>] ? nf_conntrack_alloc+0x10c/0x1a0 [nf_conntrack] [352468.556940] [<feecaf59>] ctnetlink_create_conntrack+0x339/0x360 [nf_conntrack_netlink] [352468.556987] [<feeca26b>] ? ctnetlink_parse_tuple+0x14b/0x1c0 [nf_conntrack_netlink] [352468.557039] [<fee6fd60>] ? __nf_conntrack_find+0x70/0x100 [nf_conntrack] [352468.557068] [<feecb090>] ctnetlink_new_conntrack+0x110/0x680 [nf_conntrack_netlink] [352468.557113] [<c04d93b5>] nfnetlink_rcv_msg+0x125/0x180 [352468.557140] [<c054ec57>] ? __mutex_lock_slowpath+0x197/0x230 [352468.557167] [<c04d9290>] ? nfnetlink_rcv_msg+0x0/0x180 [352468.557194] [<c04d5896>] netlink_rcv_skb+0x96/0xc0 [352468.557219] [<c04d927c>] nfnetlink_rcv+0x1c/0x30 [352468.557245] [<c04d5545>] netlink_unicast+0x255/0x2a0 [352468.557274] [<c04d5d3f>] netlink_sendmsg+0x1af/0x2b0 [352468.557300] [<c04a86ec>] sock_sendmsg+0xac/0xe0 [352468.559358] [<c029d042>] ? find_get_page+0x22/0xd0 [352468.559385] ...
Hmm, my config had not NETFILTER_DEBUG enabled, that's why I didn't hit I think that we should explicitly report if the user unsets IPS_CONFIRMED. Please, don't change this. Apart from that, the patch seems fine to me. Thanks! --
Problem is we now (I mean after my patch) enter ctnetlink_change_status() with ct->status being null (or at least, IPS_CONFIRMED not set) --
Hi Eric, The conntrack patch works successfully. Thanks for your help! David -----Message d'origine----- De : Eric Dumazet [mailto:eric.dumazet@gmail.com] Envoyé : mardi 16 février 2010 14:45 À : Pablo Neira Ayuso Cc : Ramblewski David; netfilter-devel@vger.kernel.org; netdev Objet : Re: kernel stack trace using conntrack Problem is we now (I mean after my patch) enter ctnetlink_change_status() with ct->status being null (or at least, IPS_CONFIRMED not set) Ce message et les pièces jointes sont confidentiels et réservés à l'usage exclusif de ses destinataires. Il peut également être protégé par le secret professionnel. Si vous recevez ce message par erreur, merci d'en avertir immédiatement l'expéditeur et de le détruire. L'intégrité du message ne pouvant être assurée sur Internet, la responsabilité du groupe Atos Origin ne pourra être recherchée quant au contenu de ce message. Bien que les meilleurs efforts soient faits pour maintenir cette transmission exempte de tout virus, l'expéditeur ne donne aucune garantie à cet égard et sa responsabilité ne saurait être recherchée pour tout dommage résultant d'un virus transmis. This e-mail and the documents attached are confidential and intended solely for the addressee; it may also be privileged. If you receive this e-mail in error, please notify the sender immediately and destroy it. As its integrity cannot be secured on the Internet, the Atos Origin group liability cannot be triggered for the message content. Although the sender endeavours to maintain a computer virus-free network, the sender does not warrant that this transmission is virus-free and will not be liable for any damages resulting from any virus transmitted.
Pablo, please let me know whether you want me to apply this. --
ctnetlink_change_helper() also calls nf_ct_ext_add() for conntracks that are confirmed (in case of a helper update for an existing conntrack). That would also trigger the assertion. If we want to support helper assignation via ctnetlink for existing conntracks, we will need to add locking to the conntrack extension infrastructure to avoid races. I don't see a clear solution for this yet. --
I see, this is indeed a problem. Since the helper is known at the first event, we could restrict this to only allow manual assignment for newly created conntracks. Most helpers probably can't properly cope with connections not seen from the beginning anyways. --
Indeed, changing the helper in the middle of the road doesn't make too much sense to me either. I can send you a patch for this along today, I'll find some spare time to do it. --
I have slightly tested the following patch here. I think it should fix the problem. We can revisit ctnetlink_change_helper() later, I think there's some code there that can be refactorized. Let me know if you're OK with it.
I sucessfuly tested your patch Pablo, thanks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> --
