Re: [PATCH 1/2] skb: expose and constify hash primitives

Previous thread: none

Next thread: [PATCH] sungem: missing net_device_ops by Stephen Hemminger on Thursday, March 19, 2009 - 11:53 pm. (2 messages)
From: Stephen Hemminger
Date: Thursday, March 19, 2009 - 11:34 pm

Some minor changes to queue hashing:
 1. Use const on accessor functions
 2. Export skb_tx_hash for use in drivers (see ixgbe)

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 include/linux/skbuff.h |    9 ++++++---
 net/core/dev.c         |    3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

--- a/include/linux/skbuff.h	2009-03-19 22:57:01.041090287 -0700
+++ b/include/linux/skbuff.h	2009-03-19 22:59:00.951841088 -0700
@@ -1969,7 +1969,7 @@ static inline void skb_set_queue_mapping
 	skb->queue_mapping = queue_mapping;
 }
 
-static inline u16 skb_get_queue_mapping(struct sk_buff *skb)
+static inline u16 skb_get_queue_mapping(const struct sk_buff *skb)
 {
 	return skb->queue_mapping;
 }
@@ -1984,16 +1984,19 @@ static inline void skb_record_rx_queue(s
 	skb->queue_mapping = rx_queue + 1;
 }
 
-static inline u16 skb_get_rx_queue(struct sk_buff *skb)
+static inline u16 skb_get_rx_queue(const struct sk_buff *skb)
 {
 	return skb->queue_mapping - 1;
 }
 
-static inline bool skb_rx_queue_recorded(struct sk_buff *skb)
+static inline bool skb_rx_queue_recorded(const struct sk_buff *skb)
 {
 	return (skb->queue_mapping != 0);
 }
 
+extern u16 skb_tx_hash(const struct net_device *dev,
+		       const struct sk_buff *skb);
+
 #ifdef CONFIG_XFRM
 static inline struct sec_path *skb_sec_path(struct sk_buff *skb)
 {
--- a/net/core/dev.c	2009-03-19 22:57:01.031089784 -0700
+++ b/net/core/dev.c	2009-03-19 22:58:16.794089571 -0700
@@ -1725,7 +1725,7 @@ out_kfree_skb:
 
 static u32 skb_tx_hashrnd;
 
-static u16 skb_tx_hash(struct net_device *dev, struct sk_buff *skb)
+u16 skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb)
 {
 	u32 hash;
 
@@ -1740,6 +1740,7 @@ static u16 skb_tx_hash(struct net_device
 
 	return (u16) (((u64) hash * dev->real_num_tx_queues) >> 32);
 }
+EXPORT_SYMBOL(skb_tx_hash);
 
 static struct netdev_queue *dev_pick_tx(struct net_device *dev,
 					struct sk_buff *skb)
--

From: Stephen Hemminger
Date: Thursday, March 19, 2009 - 11:36 pm

Convert ixgbe to use net_device_ops properly.
Rather than changing the select_queue function pointer
just check the flag.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 drivers/net/ixgbe/ixgbe_dcb_nl.c |    8 --------
 drivers/net/ixgbe/ixgbe_main.c   |   11 +++++++++++
 2 files changed, 11 insertions(+), 8 deletions(-)

--- a/drivers/net/ixgbe/ixgbe_dcb_nl.c	2009-03-19 22:57:01.053842791 -0700
+++ b/drivers/net/ixgbe/ixgbe_dcb_nl.c	2009-03-19 23:07:36.372656027 -0700
@@ -102,12 +102,6 @@ static u8 ixgbe_dcbnl_get_state(struct n
 	return !!(adapter->flags & IXGBE_FLAG_DCB_ENABLED);
 }
 
-static u16 ixgbe_dcb_select_queue(struct net_device *dev, struct sk_buff *skb)
-{
-	/* All traffic should default to class 0 */
-	return 0;
-}
-
 static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 {
 	u8 err = 0;
@@ -135,7 +129,6 @@ static u8 ixgbe_dcbnl_set_state(struct n
 		kfree(adapter->rx_ring);
 		adapter->tx_ring = NULL;
 		adapter->rx_ring = NULL;
-		netdev->select_queue = &ixgbe_dcb_select_queue;
 
 		adapter->flags &= ~IXGBE_FLAG_RSS_ENABLED;
 		adapter->flags |= IXGBE_FLAG_DCB_ENABLED;
@@ -154,7 +147,6 @@ static u8 ixgbe_dcbnl_set_state(struct n
 			kfree(adapter->rx_ring);
 			adapter->tx_ring = NULL;
 			adapter->rx_ring = NULL;
-			netdev->select_queue = NULL;
 
 			adapter->flags &= ~IXGBE_FLAG_DCB_ENABLED;
 			adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
--- a/drivers/net/ixgbe/ixgbe_main.c	2009-03-19 23:05:56.954718897 -0700
+++ b/drivers/net/ixgbe/ixgbe_main.c	2009-03-19 23:07:33.611651593 -0700
@@ -4272,6 +4272,16 @@ static int ixgbe_maybe_stop_tx(struct ne
 	return __ixgbe_maybe_stop_tx(netdev, tx_ring, size);
 }
 
+static u16 ixgbe_select_queue(struct net_device *dev, struct sk_buff *skb)
+{
+	struct ixgbe_adapter *adapter = netdev_priv(dev);
+
+	if (adapter->flags & IXGBE_FLAG_DCB_ENABLED)
+		return 0;  /* All traffic should default to class 0 */
+
+	return skb_tx_hash(dev, skb);
+}
+
 static int ...
From: Waskiewicz Jr, Peter P
Date: Friday, March 20, 2009 - 12:23 am

Thanks Stephen.  I was looking at reassigning a DCB netdev_ops struct when 
DCB is enabled, and then having a default netdev_ops struct when it's not 
enabled.  I agree the check is cleaner the way you have it below, but it's 
another conditional check in the Tx hotpath, which we have too many of in 
the first place.

On a related side note, why is the netdev_ops member of net_device 
declared const?

Cheers,
-PJ Waskiewicz
--

From: Stephen Hemminger
Date: Friday, March 20, 2009 - 9:24 am

On Fri, 20 Mar 2009 00:23:39 -0700 (Pacific Daylight Time)

Changing number of tx queues is actually the fastest, since then

The purpose of having an ops structure is two fold. First, the ops
can be in read-only section (if driver wants) to avoid cache issues.
More importantly only one instance is necessary when there are multiple
--

From: Waskiewicz Jr, Peter P
Date: Friday, March 20, 2009 - 10:14 am

Ah, gotcha.  Clears up my questions.

Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
--

From: Jeff Kirsher
Date: Friday, March 20, 2009 - 5:24 pm

On Fri, Mar 20, 2009 at 10:14 AM, Waskiewicz Jr, Peter P

Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Dave - feel free to apply this patch when you take the first patch in
this series, I am not adding this to me tree.

-- 
Cheers,
Jeff
--

From: David Miller
Date: Saturday, March 21, 2009 - 1:40 pm

From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>

Applied.
--

From: Stephen Hemminger
Date: Friday, March 20, 2009 - 9:12 am

Convert ixgbe to use net_device_ops properly.
Rather than changing the select_queue function pointer
just change number of available transmit queues.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 drivers/net/ixgbe/ixgbe_dcb_nl.c |   48 +++++++++++++++++----------------------
 1 file changed, 21 insertions(+), 27 deletions(-)

--- a/drivers/net/ixgbe/ixgbe_dcb_nl.c	2009-03-20 09:01:19.643651162 -0700
+++ b/drivers/net/ixgbe/ixgbe_dcb_nl.c	2009-03-20 09:11:09.645652169 -0700
@@ -102,12 +102,6 @@ static u8 ixgbe_dcbnl_get_state(struct n
 	return !!(adapter->flags & IXGBE_FLAG_DCB_ENABLED);
 }
 
-static u16 ixgbe_dcb_select_queue(struct net_device *dev, struct sk_buff *skb)
-{
-	/* All traffic should default to class 0 */
-	return 0;
-}
-
 static u8 ixgbe_dcbnl_set_state(struct net_device *netdev, u8 state)
 {
 	u8 err = 0;
@@ -135,7 +129,7 @@ static u8 ixgbe_dcbnl_set_state(struct n
 		kfree(adapter->rx_ring);
 		adapter->tx_ring = NULL;
 		adapter->rx_ring = NULL;
-		netdev->select_queue = &ixgbe_dcb_select_queue;
+		netdev->real_num_tx_queues = 1;
 
 		adapter->flags &= ~IXGBE_FLAG_RSS_ENABLED;
 		adapter->flags |= IXGBE_FLAG_DCB_ENABLED;
@@ -147,6 +141,7 @@ static u8 ixgbe_dcbnl_set_state(struct n
 		if (adapter->flags & IXGBE_FLAG_DCB_ENABLED) {
 			if (netif_running(netdev))
 				netdev->netdev_ops->ndo_stop(netdev);
+
 			ixgbe_reset_interrupt_capability(adapter);
 			ixgbe_napi_del_all(adapter);
 			INIT_LIST_HEAD(&netdev->napi_list);
@@ -154,7 +149,7 @@ static u8 ixgbe_dcbnl_set_state(struct n
 			kfree(adapter->rx_ring);
 			adapter->tx_ring = NULL;
 			adapter->rx_ring = NULL;
-			netdev->select_queue = NULL;
+			netdev->real_num_tx_queues = MAX_TX_QUEUES;
 
 			adapter->flags &= ~IXGBE_FLAG_DCB_ENABLED;
 			adapter->flags |= IXGBE_FLAG_RSS_ENABLED;
--

From: Waskiewicz Jr, Peter P
Date: Friday, March 20, 2009 - 8:48 pm

NAK.  The point of dcb_select_queue() isn't because DCB mode only uses 1 
Tx queue.  DCB has 8 priorities, and allocates 8 Tx queues, one for each 
priority.  The DCB spec says that any traffic not being filtered by some 
kind of mechanism needs to go through priority 0, or queue 0.  So 
select_queue is meant to tag all traffic to queue 0, then have the 
attached qdisc and tc filters get the majority of the traffic into the 
different priority queues.

If we did not push the unfiltered traffic into queue 0, then skb_tx_hash() 
would put traffic randomly into queues with higher priority, which is not 
what we want.

I'd prefer your original patch to fix this up, where you check if DCB is 
enabled, and return 0.

-PJ Waskiewicz
--

From: Stephen Hemminger
Date: Friday, March 20, 2009 - 9:45 pm

On Fri, 20 Mar 2009 20:48:46 -0700 (Pacific Daylight Time)

The default select queue function in the kernel is:

static struct netdev_queue *dev_pick_tx(struct net_device *dev,
					struct sk_buff *skb)
{
	const struct net_device_ops *ops = dev->netdev_ops;
	u16 queue_index = 0;

	if (ops->ndo_select_queue)
		queue_index = ops->ndo_select_queue(dev, skb);
	else if (dev->real_num_tx_queues > 1)
		queue_index = skb_tx_hash(dev, skb);

	skb_set_queue_mapping(skb, queue_index);
	return netdev_get_tx_queue(dev, queue_index);
}

So if driver (re)sets real_num_tx_queues to 1 then queue_index will always
0 and all traffic will go to one queue. This is the same as having your
own select_queue function.

--

From: Waskiewicz Jr, Peter P
Date: Friday, March 20, 2009 - 11:21 pm

I see your point, but it is a hack in my opinion.  The device will have 8 
real Tx queues, not 1.  I'd much rather go with the original proposal, 
since if the code in dev_pick_tx() changed, it could silently break ixgbe.

-PJ Waskiewicz
--

From: David Miller
Date: Saturday, March 21, 2009 - 12:33 am

From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>

It can't, if you only advertise one transmit queue the kernel
can never ever choose anything other than queue zero.  It's
impossible.

Stephen's right, you guys don't need your select queue override.

And if you recall I suspected this from the very beginning.

You guys never ever think out of the box, ever...  if it's
not straightforward, you guys won't got for it.  That makes
it very frustrating to get anything done.


--

From: Waskiewicz Jr, Peter P
Date: Saturday, March 21, 2009 - 12:43 am

This patch will break DCB in ixgbe.  We need all 8 queues, because the 
user will be assigning tc filters to the sch_multiq qdisc to get traffic 
into priority queues.  If we take Stephen's patch and tell the stack we 
have 1 real_num_tx_queues, then we get 1 band in sch_multiq, which makes 
it impossible to assign traffic to priorities 1 through 8:

static int multiq_tune(struct Qdisc *sch, struct nlattr *opt)
{
        struct multiq_sched_data *q = qdisc_priv(sch);
        struct tc_multiq_qopt *qopt;
        int i;

        if (!netif_is_multiqueue(qdisc_dev(sch)))
                return -EOPNOTSUPP;
        if (nla_len(opt) < sizeof(*qopt))
                return -EINVAL;

        qopt = nla_data(opt);

        qopt->bands = qdisc_dev(sch)->real_num_tx_queues;

This is not what we want, rather, we want all 8 Tx queues that we expose.  
The only reason we override the select_queue is to catch the unfiltered 
traffic and send it to queue 0.

Cheers,
-PJ Waskiewicz
--

From: Stephen Hemminger
Date: Saturday, March 21, 2009 - 12:39 pm

On Sat, 21 Mar 2009 00:43:40 -0700 (Pacific Daylight Time)

How does it make sense to say you have 8 bands, but only use one.
--

From: Waskiewicz Jr, Peter P
Date: Saturday, March 21, 2009 - 6:48 pm

That's not how DCB works.  We use sch_multiq to identify the traffic we 
want to put into the 8 bands.  So in other words, the user will add tc 
filters to move the traffic around.  We override select queue to filter 
the rest of the traffic into a single queue, so we don't randomly put 
traffic into the other hardware priority queues.

Either way, thanks for cleaning this up Stephen.  It was something I 
needed to do, and haven't done yet.  So I very much appreciate it.

Cheers,
-PJ Waskiewicz
--

From: David Miller
Date: Saturday, March 21, 2009 - 7:00 pm

From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>

It's escaping me why the multiq rules can't handle this?

In the end, it's a decision of where the logic lives.  Currently
the default handling logic is in the ->select_queue() override,
and I'm still not at all convinced it has to be there.
--

From: David Miller
Date: Saturday, March 21, 2009 - 1:39 pm

From: Stephen Hemminger <shemminger@vyatta.com>

Applied.
--

Previous thread: none

Next thread: [PATCH] sungem: missing net_device_ops by Stephen Hemminger on Thursday, March 19, 2009 - 11:53 pm. (2 messages)