* Peter Zijlstra (peterz@infradead.org) wrote:

Exactly. I used the implementation of rcu_assign_pointer as a hint that
we did not need barriers when setting the pointer to NULL, and thus we
should not need the read barrier when reading the NULL pointer, because
it references no data.

#define rcu_assign_pointer(p, v) \
        ({ \
                if (!__builtin_constant_p(v) || \
                    ((v) != NULL)) \
                        smp_wmb(); \
                (p) = (v); \
        })

#define rcu_dereference(p)     ({ \
                                typeof(p) _________p1 = ACCESS_ONCE(p); \
                                smp_read_barrier_depends(); \
                                (_________p1); \
                                })

But I think you are right, since we are already in unlikely code, using
rcu_dereference as you do is better than my use of read barrier depends.
It should not change anything in the assembly result except on alpha,
where the read_barrier_depends() is not a nop.

I wonder if there would be a way to add this kind of NULL pointer case
check without overhead in rcu_dereference() on alpha. I guess not, since
the pointer is almost never known at compile-time. And I guess Paul must
already have thought about it. The only case where we could add this
test is when we know that we have a if (ptr != NULL) test following the
rcu_dereference(); we could then assume the compiler will merge the two
branches since they depend on the same condition.

On x86_64:

 820:   bf 01 00 00 00          mov    $0x1,%edi
 825:   e8 00 00 00 00          callq  82a <thread_return+0x136>
 82a:   48 8b 1d 00 00 00 00    mov    0x0(%rip),%rbx        # 831 <thread_return+0x13d>
 831:   48 85 db                test   %rbx,%rbx
 834:   75 21                   jne    857 <thread_return+0x163>
 836:   eb 27                   jmp    85f <thread_return+0x16b>
 838:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
 83f:   00
 840:   48 8b 95 68 ff ff ff    mov    -0x98(%rbp),%rdx
 847:   48 8b b5 60 ff ff ff    mov    -0xa0(%rbp),%rsi
 84e:   4c 89 e7                mov    %r12,%rdi
 851:   48 83 c3 08             add    $0x8,%rbx
 855:   ff d0                   callq  *%rax
 857:   48 8b 03                mov    (%rbx),%rax
 85a:   48 85 c0                test   %rax,%rax
 85d:   75 e1                   jne    840 <thread_return+0x14c>
 85f:   bf 01 00 00 00          mov    $0x1,%edi
 864:

for 68 bytes.

My original implementation was 77 bytes, so yes, we have a win.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
