Re: [PATCH] Packet socket: mmapped IO: PACKET_TX_RING

Previous thread: [RFC PATCH 00/13] hardware time stamping + igb example implementation by Patrick Ohly on Tuesday, November 11, 2008 - 7:44 am. (40 messages)

Next thread: Linux 2.6.27.5 / SFQ/HTB scheduling problems by Sami Farin on Tuesday, November 11, 2008 - 8:17 am. (7 messages)
From: Johann Baudy
Date: Tuesday, November 11, 2008 - 7:59 am

OK, so under this assumption we can skip the block below:
                for (i = 0; i < req->tp_block_nr; i++) {
                        void *ptr = pg_vec[i];
                        int k;

                        /* mark every frame in this block as owned by the kernel */
                        for (k = 0; k < rb->frames_per_block; k++) {
                                __packet_set_status(po, ptr, TP_STATUS_KERNEL);
                                ptr += req->tp_frame_size;
                        }
                }

 Yes, we can also add a check for po->mapped in

 I'm not sure I understand:

 Let me sum up:
 * when setting or resizing the ring, we have to return an error if:
 - someone has mapped the socket (1.a)
 - sending is ongoing - tx_pending_skb != 0 (1.b)
 - a send() procedure is ongoing (1.c)

 * when closing the socket, we have to return an error if:
 - sending is ongoing - tx_pending_skb != 0 (2.a)
 - a send() procedure is ongoing (2.b)

 * during the do-while loop of send():
 - protect the ring against simultaneous send() calls (3.a)
 - ensure that the ring is not resized (3.b) (linked with 1.c)
   Our main issue is the case where tx_pending_skb == 0 and Thread A
   starts to send a packet while Thread B unmaps and resizes the ring.
   Thread B can potentially resize the ring up until tx_pending_skb is
   incremented in Thread A.
 - the loop can continue even if the ring is not mapped (3.c); this makes
   no sense, but it is not critical

 * when executing the destructor callback:
 - we must ensure that ring structure has not changed since send() (4.a)

 (1.a) can be done with a po->mapped check in packet_setsockopt()
 (1.b) and (2.a) can be done as you said: a tx_pending_skb != 0 check
 when sizing the buffer in packet_set_ring()
 (3.a) by a mutex M taken outside of the do-while loop
 (1.c), (2.b) and (3.b) can be handled with the same mutex M and
 mutex_trylock() in packet_set_ring(). An error is returned if we can't
 take the mutex. The mutex is released at the end of the
 packet_set_ring() call.
 (4.a) is ensured by the tx_pending_skb increment, (3.b), (1.b) and (2.a)
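
 To make the scheme above concrete, here is a minimal user-space sketch of it, not kernel code. It assumes a pthread mutex standing in for mutex M and C11 atomics standing in for tx_pending_skb and po->mapped. The names ring_state, ring_resize(), ring_send_begin(), ring_queue_frame(), ring_send_end(), ring_frame_done() and ring_demo() are hypothetical and exist only for illustration; whether a sleeping mutex is usable on the real code paths is a separate question.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the per-socket TX ring state. */
struct ring_state {
    pthread_mutex_t tx_lock;    /* "mutex M": held around the send() loop */
    atomic_int tx_pending_skb;  /* frames queued, destructor not yet run  */
    atomic_int mapped;          /* number of active mappings of the ring  */
    unsigned int frame_nr;      /* current ring size, in frames           */
};

/* (1.a), (1.b), (1.c): refuse to resize while mapped, pending or sending. */
static int ring_resize(struct ring_state *rs, unsigned int frame_nr)
{
    int err = 0;

    if (pthread_mutex_trylock(&rs->tx_lock) != 0)
        return -EBUSY;                        /* a send() loop holds M */
    if (atomic_load(&rs->mapped) != 0)
        err = -EBUSY;                         /* someone mapped the ring */
    else if (atomic_load(&rs->tx_pending_skb) != 0)
        err = -EBUSY;                         /* frames still in flight */
    else
        rs->frame_nr = frame_nr;
    pthread_mutex_unlock(&rs->tx_lock);       /* released at end of call */
    return err;
}

/* (3.a), (3.b): M is taken outside the send loop, so no resize can start. */
static void ring_send_begin(struct ring_state *rs)
{
    pthread_mutex_lock(&rs->tx_lock);
}

/* Inside the loop: account for each frame before M is dropped. */
static void ring_queue_frame(struct ring_state *rs)
{
    atomic_fetch_add(&rs->tx_pending_skb, 1);
}

static void ring_send_end(struct ring_state *rs)
{
    pthread_mutex_unlock(&rs->tx_lock);
}

/* Stand-in for the destructor callback (4.a): the frame is done. */
static void ring_frame_done(struct ring_state *rs)
{
    atomic_fetch_sub(&rs->tx_pending_skb, 1);
}

/* Single-threaded walk through the cases; returns 0 if all behave as expected. */
static int ring_demo(void)
{
    struct ring_state rs = { .tx_lock = PTHREAD_MUTEX_INITIALIZER, .frame_nr = 8 };

    atomic_init(&rs.tx_pending_skb, 0);
    atomic_init(&rs.mapped, 0);

    if (ring_resize(&rs, 16) != 0)            /* idle: resize allowed */
        return -1;
    ring_send_begin(&rs);
    if (ring_resize(&rs, 32) != -EBUSY)       /* (1.c): send in progress */
        return -1;
    ring_queue_frame(&rs);
    ring_send_end(&rs);
    if (ring_resize(&rs, 32) != -EBUSY)       /* (1.b): frame still pending */
        return -1;
    ring_frame_done(&rs);
    if (ring_resize(&rs, 32) != 0)            /* idle again: allowed */
        return -1;
    assert(rs.frame_nr == 32);
    return 0;
}
```

 Because the pending counter is incremented while M is still held, a resize that wins the trylock after the send loop ends always sees the in-flight frame, which is the window described under (3.b).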

 Do I miss ...
From: Lovich, Vitali
Date: Tuesday, November 11, 2008 - 12:05 pm

I hope I covered everything above.  I'm not sure whether send, setsockopt, etc. run in interrupt context or not.  My guess is that they do, and thus mutexes are not allowed, only spinlocks.  However, there's also no need for mutexes or spinlocks, because I believe the above atomic algorithm should work.  I should be able to post a patch by the end of the day.

Thanks,
Vitali
--