linux-netdev mailing list

From | Subject | Date
David Miller
Re: [PATCH] net/ipv4, linux-2.6.30.4
From: Daniel Slot <slot.daniel@gmail.com> We already have code in the stack which tries to detect packet reordering with a high level of sophistication. --
Aug 12, 2:55 pm 2009
Gregory Haskins
AlacrityVM numbers updated for 31-rc4
I re-ran the numbers on 10GE against the actual alacrityvm v0.1 release available in git on kernel.org. I tried to include the newly announced "vhost" driver (Michael Tsirkin) for virtio acceleration, but ran into issues getting the patches to apply. For now, this includes native, virtio-u (virtio-userspace), and venet all running on 31-rc4. If I can resolve the issue with Michael's patches, I will add "virtio-k" (virtio-kernel) to the mix as well. For now, here are the results for ...
Aug 12, 1:43 pm 2009
Javier Guerra
Re: AlacrityVM numbers updated for 31-rc4
On Wed, Aug 12, 2009 at 3:43 PM, Gregory pseudo-3D charts are just wrong (http://www.chuckchakrapani.com/articles/PDF/94070347.pdf) -- Javier --
Aug 12, 2:33 pm 2009
Michael S. Zick
Re: AlacrityVM numbers updated for 31-rc4
Nice quote. Even 15 years after it was published, those first two graphs are prime "bad examples". The second two could stand some improvement also - like put the legend and numbers *on* each solid bar. @J.G. - I think you are fighting an uphill battle here against human nature. People tend to go with their idea of "pretty". Mike --
Aug 12, 3:05 pm 2009
Anthony Liguori
Re: AlacrityVM numbers updated for 31-rc4
Just FYI, the numbers quoted are wrong for virtio-u. Greg's machine didn't have high res timers enabled in the kernel. He'll post newer numbers later but they're much better than these (venet is still ahead though). Regards, Anthony Liguori --
Aug 12, 4:05 pm 2009
Gregory Haskins
Re: AlacrityVM numbers updated for 31-rc4
Anthony is correct. The new numbers after fixing the HRT clock issue are: virtio-u: 2670Mb/s, 266us rtt (3764 tps udp-rr) I will update the charts later tonight. Sorry for the confusion. -Greg
Aug 12, 4:21 pm 2009
Dan Smith
[PATCH] c/r: Add AF_UNIX support (v10)
This patch adds basic checkpoint/restart support for AF_UNIX sockets. It has been tested with a single and multiple processes, and with data inflight at the time of checkpoint. It supports socketpair()s, path-based, and abstract sockets. Changes in v10: - Moved header structure definitions back to checkpoint_hdr.h - Moved AF_UNIX checkpoint/restart code to net/unix/checkpoint.c - Make sock_unix_*() functions only compile if CONFIG_UNIX=y - Add TODO for CONFIG_UNIX=m case Changes ...
Aug 12, 1:02 pm 2009
Daniel Slot
[PATCH] net/ipv4, linux-2.6.30.4
RFC 4653 specifies Non-Congestion Robustness (NCR) for TCP. In the absence of explicit congestion notification from the network, TCP uses loss as an indication of congestion. One of the ways TCP detects loss is using the arrival of three duplicate acknowledgments. However, this heuristic is not always correct, notably in the case when network paths reorder segments (for whatever reason), resulting in degraded performance. TCP-NCR is designed to mitigate this degraded performance by increasing ...
Aug 12, 11:59 am 2009
Daniel Slot
[PATCH] net/ipv4, linux-2.6.30.4
RFC 4653 specifies Non-Congestion Robustness (NCR) for TCP. In the absence of explicit congestion notification from the network, TCP uses loss as an indication of congestion. One of the ways TCP detects loss is using the arrival of three duplicate acknowledgments. However, this heuristic is not always correct, notably in the case when network paths reorder segments (for whatever reason), resulting in degraded performance. TCP-NCR is designed to mitigate this degraded performance by increasing ...
Aug 12, 11:50 am 2009
Stephen Hemminger
Re: [PATCH] net/ipv4, linux-2.6.30.4
On Wed, 12 Aug 2009 20:50:59 +0200 Patch has funny indentation and awkward naming for socket elements. Your style needs to match existing code. What is the usage model for this? I expect that some user who wants to enable this would be stuck somewhere with a lossy network and would want to enable it. Or is it something only researchers will want to play with? It would be easier to use a sysctl value for this because otherwise each application has to be changed to select the socket ...
Aug 12, 12:02 pm 2009
Daniel Slot
Re: [PATCH] net/ipv4, linux-2.6.30.4
I already tried to adapt the style to the existing code. It would be nice if you could give me a hint about what is awkward. I'm quite new to kernel hacking, and sorry for any obvious errors. The usage is, IMHO, interesting in research domains. As it is an implementation of an IETF RFC, there might be some researchers who can use it. It is part of my master's thesis and I use it for measurements and comparisons. I tried to avoid a sysctl value for this. I didn't want this algorithm to be used by ...
Aug 12, 12:27 pm 2009
Eilon Greenstein
[net-next 33/36] bnx2x: Beautify bnx2x_dump.h
Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_dump.h | 890 +++++++++++++++++++++++----------------------- 1 files changed, 449 insertions(+), 441 deletions(-) diff --git a/drivers/net/bnx2x_dump.h b/drivers/net/bnx2x_dump.h index 78c6b03..3bb9a91 100644 --- a/drivers/net/bnx2x_dump.h +++ b/drivers/net/bnx2x_dump.h @@ -13,31 +13,35 @@ * The signature is time stamp, diag version and grc_dump version ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 34/36] bnx2x: Removing unused definitions
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_reg.h | 811 ----------------------------------------------- 1 files changed, 0 insertions(+), 811 deletions(-) diff --git a/drivers/net/bnx2x_reg.h b/drivers/net/bnx2x_reg.h index 1e6f5aa..95ebf3f 100644 --- a/drivers/net/bnx2x_reg.h +++ b/drivers/net/bnx2x_reg.h @@ -190,12 +190,6 @@ _(0..15) stands for the connection type (one of 16). */ #define ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 32/36] bnx2x: Re-factor the initialization code
Moving the code to a more logical place and beautifying it. No real change in behavior. Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x.h | 28 +++- drivers/net/bnx2x_init.h | 320 ++++++-------------------------- drivers/net/bnx2x_init_ops.h | 419 ++++++++++++++++++++++++------------------ drivers/net/bnx2x_main.c | 79 ++++++-- drivers/net/bnx2x_reg.h | 113 +++++++++++ 5 ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 35/36] bnx2x: Whitespaces and comments
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x.h | 38 ++++++++------- drivers/net/bnx2x_link.c | 115 +++++++++++++++++++++++----------------------- drivers/net/bnx2x_main.c | 80 ++++++++++++++++---------------- drivers/net/bnx2x_reg.h | 3 +- 4 files changed, 121 insertions(+), 115 deletions(-) diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h index 97bc5e0..bbf8422 100644 --- a/drivers/net/bnx2x.h +++ b/drivers/net/bnx2x.h @@ -314,9 ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 29/36] bnx2x: Using macro for phy address
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 108 ++++++++++++--------------------------------- drivers/net/bnx2x_link.h | 12 +++-- drivers/net/bnx2x_main.c | 8 +--- 3 files changed, 39 insertions(+), 89 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index 74f4d10..c2b0010 100644 --- a/drivers/net/bnx2x_link.c +++ b/drivers/net/bnx2x_link.c @@ -1547,10 +1547,7 @@ static u8 bnx2x_ext_phy_resove_fc(struct ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 23/36] bnx2x: Remove the init_dmae field from bp
Moved the dmae_command from the heap to the stack. This will save 56 bytes per bnx2x structure. As a side benefit, we can also reduce the time the dmae_mutex is held. This is because we do not need to hold this mutex when setting up the dmae command. The memory where the dmae command is stored is not a shared resource and does not need to be protected. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x.h | ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 36/36] bnx2x: update version to 1.52.1
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index 54e3ef9..037d862 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -56,8 +56,8 @@ #include "bnx2x_init_ops.h" #include "bnx2x_dump.h" -#define DRV_MODULE_VERSION "1.48.114-1" -#define DRV_MODULE_RELDATE "2009/07/29" +#define ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 28/36] bnx2x: Re-arrange the link structures f ...
Change ieee_fc to u16 instead of u32 and re-arrange the link parameters structures Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 6 +++--- drivers/net/bnx2x_link.h | 31 +++++++++++++++++++------------ 2 files changed, 22 insertions(+), 15 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index c163c42..74f4d10 100644 --- a/drivers/net/bnx2x_link.c +++ ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 31/36] bnx2x: Using PCI_DEVICE macro
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 9 +++------ 1 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index a409767..23b9c7d 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -138,12 +138,9 @@ static struct { static const struct pci_device_id bnx2x_pci_tbl[] = { - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_NX2_57710, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 27/36] bnx2x: Missing smp_wmb for statistics s ...
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index 656ee97..bcf8362 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -4411,6 +4411,9 @@ static void bnx2x_stats_handle(struct bnx2x *bp, enum bnx2x_stats_event event) ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 30/36] bnx2x: Adding explicit casting
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index bdf7228..a409767 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -2926,7 +2926,7 @@ static inline void bnx2x_attn_int_deasserted0(struct bnx2x *bp, u32 attn) REG_WR(bp, reg_offset, val); BNX2X_ERR("FATAL HW block attention set0 0x%x\n", - ...
Aug 12, 11:24 am 2009
Eilon Greenstein
[net-next 26/36] bnx2x: Remove SGMII configuration when ...
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 13 ++++++++++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index b81a057..c163c42 100644 --- a/drivers/net/bnx2x_link.c +++ b/drivers/net/bnx2x_link.c @@ -1276,14 +1276,14 @@ static void bnx2x_program_serdes(struct link_params *params, struct bnx2x *bp = params->bp; u16 ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 25/36] bnx2x: Keep only one HW path active
Disable bmac access while working with emac and keep the single lane SerDes in reset while working with 4 lanes XGXS Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 7 ++++--- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index dc3b69e..b81a057 100644 --- a/drivers/net/bnx2x_link.c +++ b/drivers/net/bnx2x_link.c @@ -397,7 +397,8 @@ ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 24/36] bnx2x: Check unzip return code
Without this check, when running out of memory, we will see PSOD's in bnx2x_init_fill() when doing a memset(). This is because at that time, bp->gunzip_buf is not pointing to a valid allocated space. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index cfcc4ee..656ee97 100644 --- ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 22/36] bnx2x: Updating regdump_len at drvinfo
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 69 ++++++++++++++++++++++------------------------ 1 files changed, 33 insertions(+), 36 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index 347036f..dd9a77f 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -8946,50 +8946,15 @@ static int bnx2x_set_settings(struct net_device *dev, struct ethtool_cmd *cmd) return 0; } -#define ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 13/36] bnx2x: Removing old PHY FW upgrade code
This code should not have resided in the driver. Now that we have a new interface, this logic can reside in the application that wishes to upgrade the PHY FW Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 427 ---------------------------------------------- drivers/net/bnx2x_link.h | 2 - 2 files changed, 0 insertions(+), 429 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index 98e3e8f..dc3b69e 100644 --- ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 19/36] bnx2x: Stop loading if error condition ...
Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x.h | 1 + drivers/net/bnx2x_main.c | 8 ++++++++ 2 files changed, 9 insertions(+), 0 deletions(-) diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h index 004f4a8..633acca 100644 --- a/drivers/net/bnx2x.h +++ b/drivers/net/bnx2x.h @@ -89,6 +89,7 @@ } while (0) #else #define bnx2x_panic() do { \ + bp->panic = 1; \ BNX2X_ERR("driver assert\n"); ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 18/36] bnx2x: Calling pci_set_drvdata earlier
In case of error, bnx2x_init_dev calls pci_set_drvdata(pdev, NULL) Signed-off-by: Yitchak Gertner <gertner@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index e9e4349..9eea52d 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -11884,14 +11884,14 @@ static int __devinit bnx2x_init_one(struct ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 21/36] bnx2x: Move printing of version from pr ...
Move printing of version from probe to the init function. Rather than checking whether this is the first module probe call in order to print the driver version only once, the statement is moved to the module's init function, which is called only once. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/bnx2x_main.c ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 20/36] bnx2x: Combine get_pcie_width and get_p ...
The functions bnx2x_get_pcie_width() and bnx2x_get_pcie_speed() were combined into bnx2x_get_pcie_width_speed() so that there is only 1 PCI read to PCICFG_OFFSET + PCICFG_LINK_CONTROL rather than 2 reads. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 34 ++++++++++++++++------------------ 1 files changed, 16 insertions(+), 18 deletions(-) diff --git a/drivers/net/bnx2x_main.c ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 17/36] bnx2x: Configurable pause scheme
When a given ring is running out of space, the FW can send pause towards the network. When working with multi-queues, when one queue is getting out of space it can block all other queues. The preferred scheme is to send pause frames only when running out of the shared internal chip buffers and if a given queue cannot place a packet on the host, it will drop it. Since some users might want to work in drop-less mode, allowing changing the behavior as a module parameter. Signed-off-by: Eilon ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 16/36] bnx2x: Adding Likely directive
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index 594168a..c4427ef 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -1612,7 +1612,8 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget) skb = new_skb; - } else if ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 15/36] bnx2x: Prefetch the page containing the ...
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index e6fde5e..594168a 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -1497,6 +1497,13 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget) bd_prod = RX_BD(bd_prod); bd_cons = ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 14/36] bnx2x: Reporting host statistics to man ...
This is required for NCSI statistics Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x.h | 1 + drivers/net/bnx2x_main.c | 228 +++++++++++++++++++++++++++++++++++----------- 2 files changed, 176 insertions(+), 53 deletions(-) diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h index 903c89d..1d0b727 100644 --- a/drivers/net/bnx2x.h +++ b/drivers/net/bnx2x.h @@ -777,6 +777,7 @@ struct bnx2x_slowpath { struct nig_stats nig_stats; struct ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 12/36] bnx2x: Supporting PHY FW upgrade
There are 3 operations that the driver needs to support to allow applications to access the PHY FW (on top of the MDC/MDIO access). Since those are essentially nvram access commands, adding them to the ethtool -E interface. Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 21 ++++++------- drivers/net/bnx2x_link.h | 4 ++ drivers/net/bnx2x_main.c | 72 ++++++++++++++++++++++++++++++++++----------- 3 files changed, 67 insertions(+), 30 ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 11/36] bnx2x: MDC/MDIO CL45 IOCTLs
As suggested by Ben Hutchings <bhutchings@solarflare.com>, using the MDC/MDIO IOCTL Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/Kconfig | 1 + drivers/net/bnx2x.h | 3 + drivers/net/bnx2x_main.c | 121 ++++++++++++++++++++++++++++++++-------------- 3 files changed, 88 insertions(+), 37 deletions(-) diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig index 9948fa2..29935a9 100644 --- a/drivers/net/Kconfig +++ b/drivers/net/Kconfig @@ -2722,6 ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 10/36] bnx2x: Adding XAUI CL73 autoneg support
Adding CL73 support to the built in PHY in the 5771x device. Also supporting fallbacks to CL73 if the link partner does not respond. Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 197 +++++++++++++++++++++++++++++++++++++++------- drivers/net/bnx2x_reg.h | 13 +++ 2 files changed, 180 insertions(+), 30 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 04/36] bnx2x: Supporting Device Control Channel
In multi-function mode, the FW can receive special management control commands to set the Min/Max BW and the function link state. Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x.h | 6 + drivers/net/bnx2x_hsi.h | 34 +++++- drivers/net/bnx2x_main.c | 325 +++++++++++++++++++++++++++++++++------------- 3 files changed, 268 insertions(+), 97 deletions(-) diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h index 16ccba8..5864ae2 100644 --- ...
Aug 12, 11:22 am 2009
Eilon Greenstein
[net-next 06/36] bnx2x: BCM8481 LED4 instead of LASI
The BCM8481 does not generate LASI interrupt for 10M, 100M and 1G link, so we are using LED4 output as the interrupt input to the 57711. This requires some adaptation in the link interrupt routines Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 498 ++++++++++++++++++++++++++++++++++++++++------ drivers/net/bnx2x_reg.h | 32 +++ 2 files changed, 468 insertions(+), 62 deletions(-) diff --git ...
Aug 12, 11:22 am 2009
Eilon Greenstein
[net-next 09/36] bnx2x: BCM8727 FW load
The BCM8727 is a dual port PHY. The FW must be loaded in a given order on all designs - including those which swapped the ports (calling port number zero the second port) Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 16 +++++++++------- 1 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index aee9fff..db4f3c0 100644 --- ...
Aug 12, 11:23 am 2009
Eilon Greenstein
[net-next 07/36] bnx2x: Reading the FW version of the BC ...
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 118 ++++++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 113 insertions(+), 5 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index 9e1f19a..c925249 100644 --- a/drivers/net/bnx2x_link.c +++ b/drivers/net/bnx2x_link.c @@ -2046,6 +2046,111 @@ static void bnx2x_save_bcm_spirom_ver(struct bnx2x *bp, u8 port, ...
Aug 12, 11:22 am 2009
Eilon Greenstein
[net-next 08/36] bnx2x: get_ext_phy_fw_version returns N ...
To avoid confusion, if the PHY does not have a FW (and so, no FW version) make sure that the string is NULL. Signed-off-by: Yaniv Rosner <yanivr@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_link.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/bnx2x_link.c b/drivers/net/bnx2x_link.c index c925249..aee9fff 100644 --- a/drivers/net/bnx2x_link.c +++ b/drivers/net/bnx2x_link.c @@ -5393,7 +5393,7 @@ u8 ...
Aug 12, 11:22 am 2009
Eilon Greenstein
[net-next 05/36] bnx2x: Advertize flow control normally ...
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> --- drivers/net/bnx2x_main.c | 4 +--- 1 files changed, 1 insertions(+), 3 deletions(-) diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c index 442ba61..47b687b 100644 --- a/drivers/net/bnx2x_main.c +++ b/drivers/net/bnx2x_main.c @@ -2148,9 +2148,7 @@ static u8 bnx2x_initial_phy_init(struct bnx2x *bp, int load_mode) /* Initialize link parameters structure variables */ /* It is recommended to turn off RX FC for ...
Aug 12, 11:22 am 2009
Eilon Greenstein
[net-next 02/36] bnx2x: Using the new FW
The new FW improves the packets-per-second rate. It required a lot of change in the FW, which implies many changes in the driver to support it. It is now also possible for the driver to use a separate MSI-X vector for Rx and Tx - this also adds to the complexity of this change. All things said - after this patch, practically all performance metrics show improvement. Though Vladislav Zolotarov is not signed on this patch, he did most of the job and deserves credit for that. Signed-off-by: ...
Aug 12, 11:20 am 2009
Eilon Greenstein
[net-next 00/36] bnx2x patch series
Hi Dave, Here is a patch series for the bnx2x. This patch series also replaces the FW, so it contains two big blobs - the new fw and the removal of the old one. Those patches do not contain anything but the ihex - the actual change to the driver is in patch number 2, which is small enough to fit the mailing list. For those who wish to see all the patches, including the ihex, I also updated http://linux.broadcom.com/eilong/ to contain this patch series. Please consider applying to ...
Aug 12, 11:19 am 2009
David Miller
Re: [net-next 00/36] bnx2x patch series
Come on... 36 patches? :-/ Please trickle changes in, don't send patch bombs. I think I've told you this not once, but several times. But I keep seeing these sizable patch sets. I frankly don't care that it might not mesh well with how you code up and validate changes internally, because it absolutely does NOT work well for how bugs really get found and fixed upstream. If you trickle changes in, the guilty change is obvious to spot and it gets found before you do more development that ...
Aug 12, 2:50 pm 2009
Jens Rosenboom
[PATCH] ipv6: Log the explicit address that triggered DA ...
If an interface has multiple addresses, the current message for DAD failure isn't really helpful, so this patch adds the address itself to the printk. Signed-off-by: Jens Rosenboom <jens@mcbone.net> --- diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 43b3c9f..01a4b25 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1403,8 +1403,8 @@ void addrconf_dad_failure(struct inet6_ifaddr *ifp) struct inet6_dev *idev = ifp->idev; if ...
Aug 12, 7:58 am 2009
Jens Rosenboom
[RFC] ipv6: Change %pI6 format to output compacted addresses?
Currently the output looks like 2001:0db8:0000:0000:0000:0000:0000:0001 which might be compacted to 2001:db8::1. The code to do this could be adapted from inet_ntop in glibc, which would add about 80 lines to lib/vsprintf.c. How do you guys value the tradeoff between more readable logging and increased kernel size? This was already mentioned in http://kerneltrap.org/mailarchive/linux-netdev/2008/11/25/4231684 but no one seems to have taken it up. --
Aug 12, 8:39 am 2009
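The compaction asked about in the RFC above can be sketched in userspace C. This is only an illustration of the idea, not the glibc inet_ntop code and not a proposed lib/vsprintf.c patch; the function name ipv6_compact and its signature are hypothetical. It drops leading zeros in each group and replaces the first longest run of two or more zero groups with "::".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format eight 16-bit groups as compacted IPv6 text.
 * 'out' must hold at least 40 bytes (39 characters plus the terminator).
 * Returns the number of characters written, or -1 if 'out' is too small.
 */
int ipv6_compact(const uint16_t g[8], char *out, size_t outlen)
{
    int best_start = -1, best_len = 0;  /* longest run of zero groups */
    int run_start = -1, run_len = 0;    /* run currently being scanned */
    size_t pos = 0;
    int i;

    if (outlen < 40)
        return -1;

    /* Pass 1: locate the first longest run of zero groups. */
    for (i = 0; i < 8; i++) {
        if (g[i] != 0) {
            run_len = 0;
            continue;
        }
        if (run_len == 0)
            run_start = i;
        run_len++;
        if (run_len > best_len) {
            best_start = run_start;
            best_len = run_len;
        }
    }
    /* A single zero group is printed as "0", not compacted. */
    if (best_len < 2)
        best_start = -1;

    /* Pass 2: emit groups, replacing the chosen run with "::". */
    i = 0;
    while (i < 8) {
        if (i == best_start) {
            out[pos++] = ':';
            out[pos++] = ':';
            i += best_len;
            continue;
        }
        /* No separator before the first group or directly after "::". */
        if (i > 0 && !(best_start >= 0 && i == best_start + best_len))
            out[pos++] = ':';
        pos += (size_t)snprintf(out + pos, outlen - pos, "%x", (unsigned)g[i]);
        i++;
    }
    out[pos] = '\0';
    return (int)pos;
}
```

With the groups from the example above (0x2001, 0x0db8, six more groups ending in 1), the sketch produces 2001:db8::1. The two-pass shape keeps the logic short, which is relevant to the code-size tradeoff the RFC raises.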
Dan Smith
[PATCH] c/r: Add AF_UNIX support (v9)
This patch adds basic checkpoint/restart support for AF_UNIX sockets. It has been tested with a single and multiple processes, and with data inflight at the time of checkpoint. It supports socketpair()s, path-based, and abstract sockets. Changes in v9: - Fix double-free of skb's in the list and target holding queue in the error path of sock_copy_buffers() - Adjust use of ckpt_read_string() to match new signature Changes in v8: - Fix stale dev_alloc_skb() from before the ...
Aug 12, 8:12 am 2009
Shin Hong
net/unix : possible race bug at unix_create1()
Hi. I am reporting a possible race bug in unix_create1() in net/unix/af_unix.c of Linux 2.6.30.4. Concurrent executions of unix_create1() in two different threads may result in a race condition when unix_nr_socks + 1 == 2 * get_max_files(). It is possible that neither thread passes the if-condition check if both atomic_inc() operations execute before either check. It seems that it would be better to combine the two atomic operations into one atomic_inc_and_return(). Please examine the code and ...
Aug 12, 8:00 am 2009
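The pattern described in the report above can be sketched with C11 atomics in userspace. This is an analogy under stated assumptions, not the af_unix.c code: nr_socks, create_two_step and create_one_step are hypothetical names, and the decrement on failure is assumed for the sake of a self-contained example. The kernel primitive that returns the incremented value is atomic_inc_return() (the report calls it atomic_inc_and_return()).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global socket counter. */
static atomic_int nr_socks;

/*
 * Two-step variant: the increment and the limit check are separate atomic
 * operations. If two threads both increment before either one reads, both
 * can observe limit + 1 and both fail, although one slot was still free.
 */
bool create_two_step(int limit)
{
    atomic_fetch_add(&nr_socks, 1);
    if (atomic_load(&nr_socks) > limit) {
        atomic_fetch_sub(&nr_socks, 1);   /* undo our increment */
        return false;
    }
    return true;
}

/*
 * One-step variant: each caller checks the value produced by its own
 * increment, so exactly the callers that fit under the limit succeed.
 */
bool create_one_step(int limit)
{
    int now = atomic_fetch_add(&nr_socks, 1) + 1;  /* value after our increment */

    if (now > limit) {
        atomic_fetch_sub(&nr_socks, 1);   /* undo our increment */
        return false;
    }
    return true;
}
```

In the one-step variant the check uses a value no other thread can change, which is the property the report asks for.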
Shin Hong
a question on packet_sock struct
Hi. I have a question about the packet_sock struct defined in net/packet/af_packet.c of Linux 2.6.30.4. Is it necessary to hold a packet_sock's bind_lock before accessing its ifindex field? According to the definition of the packet_sock struct, it seems that access to the ifindex field should be synchronized by the bind_lock. However, according to its usage in the code, ifindex accesses are not consistently protected by the bind_lock. Thank you Sincerely Shin Hong --
Aug 11, 11:43 pm 2009
Rusty Russell
Re: Page allocation failures in guest
Subject: virtio: net refill on out-of-memory If we run out of memory, use keventd to fill the buffer. There's a report of this happening: "Page allocation failures in guest", Message-ID: <20090713115158.0a4892b0@mjolnir.ossman.eu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -71,6 +71,9 @@ struct virtnet_info struct sk_buff_head recv; struct ...
Aug 11, 10:31 pm 2009
Avi Kivity
Re: Page allocation failures in guest
schedule_delayed_work()? -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. --
Aug 11, 10:41 pm 2009
Rusty Russell
Re: Page allocation failures in guest
Hmm, might as well, although this is v. unlikely to happen. Thanks, Rusty. diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -72,7 +72,7 @@ struct virtnet_info struct sk_buff_head send; /* Work struct for refilling if we run low on memory. */ - struct work_struct refill; + struct delayed_work refill; /* Chain pages by the private ptr. */ struct page *pages; @@ -402,19 +402,16 @@ static void ...
Aug 11, 11:56 pm 2009
Gregory Haskins
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
Only a quick review for now. Will look closer later. This seems odd. If you have the flush to act as a sync-barrier, why do you also need rcu_dereference(sock)? At first blush, it seems
Aug 11, 5:06 pm 2009
Michael S. Tsirkin
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
It inserts memory barriers on architectures that require them (currently only the Alpha), and, more importantly, documents I don't think so. sync-barrier has nothing to do with it as it comes --
Aug 12, 2:02 am 2009
Michael S. Tsirkin Aug 12, 3:52 am 2009
Gregory Haskins
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
Is it? I don't see where RCU actually enters the equation if you are not bracketing the dereference with rcu_read_lock/unlock. I'm sure you have this correct; it's just that it goes against my understanding of how to use RCU properly, so I am trying to understand what you did. I still am not seeing it. Even rcupdate.h says: /** * rcu_dereference - fetch an RCU-protected pointer in an * RCU read-side critical section. This pointer may later * be safely dereferenced. note ...
Aug 12, 6:01 am 2009
Michael S. Tsirkin
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
Here's a thesis on what rcu_dereference does (besides documentation): reader does this A: sock = n->sock B: use *sock Say writer does this: C: newsock = allocate socket D: initialize(newsock) E: n->sock = newsock F: flush On Alpha, reads could be reordered. So, on smp, command A could get data from point F, and command B - from point D (uninitialized, from cache). IOW, you get fresh pointer but stale data. Heh, if readers are lockless and writer does ...
Aug 12, 6:25 am 2009
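The ordering problem in the message above (steps A and B in the reader, D and E in the writer) can be sketched in userspace with C11 atomics. This is an analogy, not the vhost code: struct sock_like, struct net_like, publish and read_fd are hypothetical names, the release store stands in for what rcu_assign_pointer() provides in the kernel, and the acquire load stands in for rcu_dereference(). The flush in step F is not modeled.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload and holder, standing in for the socket and 'n'. */
struct sock_like {
    int fd;
};

struct net_like {
    _Atomic(struct sock_like *) sock;
};

/* Writer: step D (initialize) followed by step E (publish the pointer). */
void publish(struct net_like *n, struct sock_like *s, int fd)
{
    s->fd = fd;  /* D: initialize before publishing */
    /* E: release store, so the initialization is visible before the pointer. */
    atomic_store_explicit(&n->sock, s, memory_order_release);
}

/* Reader: step A (fetch the pointer) followed by step B (use the data). */
int read_fd(struct net_like *n)
{
    /* A: acquire load, so the later read of s->fd cannot see stale data. */
    struct sock_like *s =
        atomic_load_explicit(&n->sock, memory_order_acquire);

    return s ? s->fd : -1;  /* B: dereference only after the ordered load */
}
```

Without the ordering on the reader side, a weakly ordered CPU such as the Alpha could return the fresh pointer in step A but stale contents in step B, which is the "fresh pointer but stale data" case described above.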
Gregory Haskins
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
Yes, that is understood. Perhaps you should just use a normal barrier, however. (Or at least a comment that says "I am just using this for its More correctly: it "smells like" RCU, but it's not. ;) It's rcu-like, but you are not really using the rcu facilities. I think anyone that knows RCU and reads your code will likely be scratching their heads as well. It's probably not a big deal, as I understand your code now. Just a suggestion to help clarify it. Regards, -Greg
Aug 12, 6:41 am 2009
Michael S. Tsirkin
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
OK, I'll add some comments about that. Thanks for the review! -- MST --
Aug 12, 6:47 am 2009
Paul E. McKenney
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
If you are using call_rcu(), synchronize_rcu(), or one of the similar primitives, then you absolutely need rcu_read_lock() and rcu_read_unlock(), or one of the similar pairs of primitives. If you -don't- use rcu_read_lock(), then you are pretty much restricted to adding data, but never removing it. Make sense? ;-) Thanx, Paul --
Aug 12, 7:11 am 2009
Michael S. Tsirkin
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
Since I only access data from a workqueue, I replaced synchronize_rcu with workqueue flush. That's why I don't need rcu_read_lock. -- MST --
Aug 12, 7:15 am 2009
Paul E. McKenney
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
Well, you -do- need -something- that takes on the role of rcu_read_lock(), and in your case you in fact actually do. Your equivalent of rcu_read_lock() is the beginning of execution of a workqueue item, and the equivalent of rcu_read_unlock() is the end of execution of that same workqueue item. Implicit, but no less real. If a couple more uses like this show up, I might need to add this to Documentation/RCU. ;-) Thanx, Paul --
Aug 12, 8:26 am 2009
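The idea in this exchange, that a workqueue item acts as an implicit read-side critical section and a workqueue flush takes the place of synchronize_rcu(), can be modeled in a few lines. What follows is a deliberately single-threaded userspace model with hypothetical names (`workqueue_model`, `set_sock_model`, and so on); it shows the ordering rule only and is not the kernel workqueue API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_WORK 8

typedef void (*work_fn)(void *arg);

struct sock_model { int fd; };

struct workqueue_model {
    work_fn fn[MAX_WORK];
    void *arg[MAX_WORK];
    size_t pending;
};

struct net_model {
    struct sock_model *sock;   /* only dereferenced from inside work items */
    int last_fd_seen;
};

static int queue_work_model(struct workqueue_model *wq, work_fn fn, void *arg)
{
    assert(wq != NULL);
    if (wq->pending == MAX_WORK)
        return -1;
    wq->fn[wq->pending] = fn;
    wq->arg[wq->pending] = arg;
    wq->pending++;
    return 0;
}

/* Runs every pending item to completion: the model's synchronize_rcu(). */
static void flush_workqueue_model(struct workqueue_model *wq)
{
    for (size_t i = 0; i < wq->pending; i++)
        wq->fn[i](wq->arg[i]);   /* entry = "rcu_read_lock", return = "rcu_read_unlock" */
    wq->pending = 0;
}

/* A work item: the only place the pointer is read. */
static void handle_tx_model(void *arg)
{
    struct net_model *n = arg;
    struct sock_model *sock = n->sock;

    n->last_fd_seen = sock ? sock->fd : -1;
}

/* Writer: swap the pointer, flush, and only then free the old object. */
static void set_sock_model(struct net_model *n, struct workqueue_model *wq,
                           struct sock_model *newsock)
{
    struct sock_model *old = n->sock;

    n->sock = newsock;
    flush_workqueue_model(wq);   /* no work item can still hold 'old' after this */
    free(old);
}
```

Because readers exist only inside work items, the flush is enough to guarantee that nothing still references the old pointer when it is freed.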
Michael S. Tsirkin Aug 12, 8:51 am 2009
Paul E. McKenney
Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server
And I idly wonder if this approach could replace SRCU. Probably not for protecting the CPU-hotplug notifier chains, but worth some thought. Thanx, Paul --
Aug 12, 9:06 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
1. use a dedicated network interface with SRIOV, program mac to match that of guest (for testing, you can set promisc mode, but that is bad for performance) 2. disable tso,gso,lro with ethtool 3. add vhost=ethX -- MST --
Aug 12, 12:16 am 2009
Gregory Haskins
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Are you saying SRIOV is a requirement, and I can either program the SRIOV adapter with a mac or use promisc? Or are you saying I can use Out of curiosity, wouldn't you only need to disable LRO on the adapter, since the other two (IIUC) are transmit path and are therefore You mean via "ip link" I assume? Regards, -Greg
Aug 12, 4:56 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
SRIOV is not a requirement. And you can also use a dedicated No, that's a new flag for virtio in qemu: --
Aug 12, 5:05 am 2009
Gregory Haskins
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Makes sense. Got it. I was going to add guest-to-guest to the test matrix, but I assume that is not supported with vhost unless you have something like a VEPA enabled bridge? Ah, ok. Even better. Thanks! -Greg
Aug 12, 5:41 am 2009
Arnd Bergmann
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
If I understand it correctly, you can at least connect a veth pair to a bridge, right? Something like veth0 - veth1 - vhost - guest 1 eth0 - br0-| veth2 - veth3 - vhost - guest 2 It's a bit more complicated than it needs to be, but should work fine. Arnd <>< --
Aug 12, 5:52 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Presumably you mean on the same host? There were also some patches to enable local guest to guest for macvlan, that would be a nice software-only solution. For back to back, I just tried over veth, seems --
Aug 12, 6:04 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Heh, you don't need a bridge in this picture: guest 1 - vhost - veth0 - veth1 - vhost guest 2 --
Aug 12, 6:06 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Oh, hopefully macvlan will soon allow that. -- MST --
Aug 12, 6:42 am 2009
Arnd Bergmann
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Sure, but the setup I described is the one that I would expect to see in practice because it gives you external connectivity. Measuring two guests communicating over a veth pair is interesting for finding the bottlenecks, but of little practical relevance. Arnd <>< --
Aug 12, 6:40 am 2009
Gregory Haskins
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Yeah, this would be the config I would be interested in. Regards, -Greg
Aug 12, 6:51 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Hmm, this wouldn't be the config to use for the benchmark though: there are just too many variables. If you want both guest to guest and guest to host, create 2 nics in the guest. Here's one way to do this: -net nic,model=virtio,vlan=0 -net user,vlan=0 -net nic,vlan=1,model=virtio,vhost=veth0 -redir tcp:8022::22 -net nic,model=virtio,vlan=0 -net user,vlan=0 -net nic,vlan=1,model=virtio,vhost=veth1 -redir tcp:8023::22 In guests, for simplicity, configure eth1 and eth0 to use ...
Aug 12, 7:02 am 2009
Gregory Haskins
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
I can try to do a few variations, but what I am interested in is performance in a real-world L2 configuration. This would generally mean all hosts (virtual or physical) in the same L2 domain. If I get a chance, though, I will try to also wire them up in isolation as another data point. Regards, -Greg
Aug 12, 9:13 am 2009
Michael S. Tsirkin
Re: [PATCHv2 0/2] vhost: a kernel-level virtio server
Or patch macvlan to support guest to guest: http://markmail.org/message/sjy74g57qsvdo2wh That patch needs to be updated to support guest to guest multicast, but it seems functional enough for your purposes. -- MST --
Aug 12, 9:37 am 2009
David Miller
Re: pull request: wireless-2.6 2009-08-11
From: "John W. Linville" <linville@tuxdriver.com> Ok, I'll take care of that. --
Aug 12, 2:52 pm 2009
John W. Linville
Re: pull request: wireless-2.6 2009-08-11
Dave, When you pull this, could you also revert 57921c31 ("libertas: Read buffer overflow"). It has been shown to create a new problem. There is work towards a solution to that one, but it isn't a simple clean-up. If you would prefer, I can include the revert with another pull request or even regenerate this one. Just let me know if that is what you prefer. Thanks, John -- John W. Linville Someday the world will need a hero, and you linville@tuxdriver.com might be all we ...
Aug 12, 11:24 am 2009
Bob Dunlop
Re: [PATCH] libertas: name the network device wlan%d
Well I've been applying the equivalent of this patch privately since we started using the libertas driver. We build systems with one or two wired Ethernets and then an optional wireless module. A fixed name wlan0 is a lot easier than explaining to a user that the interface might be eth1 or eth2 depending on which model they have, or that eth1 might be wired or wireless. It also simplifies scripts and configuration file handling. I'm sure there are many other solutions for big systems but ...
Aug 12, 12:56 am 2009
Daniel Mack
Re: [PATCH] libertas: name the network device wlan%d
Yes, our story here is very similar :) --
Aug 12, 1:55 am 2009
Dan Williams
Re: [PATCH] libertas: name the network device wlan%d
I don't care either way, it's completely historical. Most of the fullmac drivers used 'eth' back when. Might want to get buy-in from the OLPC crew since they probably have the most deployed units using libertas (cc-ed Daniel Drake). Daniel, is it a problem for you guys if the libertas wifi interface name went from 'eth' -> 'wlan' ? Mesh name would be unchanged. Dan --
Aug 12, 9:31 am 2009
Stephen Hemminger
Re: [PATCH] Fix Warnings from net/netlink/genetlink.c
On Tue, 11 Aug 2009 16:57:41 -0700 Agreed, and the line numbers are off. -- --
Aug 11, 8:24 pm 2009
Stephen Rothwell
Re: [PATCH] Fix Warnings from net/netlink/genetlink.c
Hi all, In the -next tree, it looks like this: int genl_register_mc_group(struct genl_family *family, struct genl_multicast_group *grp) { int id; unsigned long *new_groups; int err; BUG_ON(grp->name[0] == '\0'); genl_lock(); /* special-case our own group */ if (grp == &notify_grp) id = GENL_ID_CTRL; else id = find_first_zero_bit(mc_groups, mc_groups_longs * BITS_PER_LONG); if (id >= mc_groups_longs * BITS_PER_LONG) { size_t nlen = ...
Aug 11, 8:50 pm 2009
Marcel Holtmann
Re: [PATCH] Fix Warnings from net/netlink/genetlink.c
it would have been nice if the patch actually indicates that it is for -next since otherwise just shutting up a compiler warning is a bad idea I prefer we add a err = 0 in the if (family->netnsok) { block instead of just globally setting it to a value. Regards Marcel --
Aug 11, 9:03 pm 2009
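The preference stated above, assigning err = 0 inside the one branch that needs it rather than initializing it at the declaration, can be shown with a small stand-alone function. Everything here is hypothetical (`family_model`, `notify_one_model`, `register_group_model`); it does not reproduce the genetlink code from -next, only the initialization pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; not the real genetlink family structure. */
struct family_model {
    bool netnsok;
    int nets;   /* number of namespaces to notify */
};

/* Hypothetical helper that succeeds for any non-negative index. */
static int notify_one_model(int net)
{
    return net < 0 ? -1 : 0;
}

static int register_group_model(const struct family_model *family)
{
    int err;   /* deliberately not initialized at the declaration */

    assert(family != NULL);
    if (family->netnsok) {
        /* Only this branch can skip every assignment (zero iterations),
         * so this is the one place that needs an explicit default. */
        err = 0;
        for (int i = 0; i < family->nets; i++) {
            err = notify_one_model(i);
            if (err)
                break;
        }
    } else {
        err = notify_one_model(0);
    }
    return err;
}
```

Keeping the declaration uninitialized means the compiler can still warn if a later change adds a path that forgets to set err, which a blanket `int err = 0;` would hide.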
Rusty Russell
Re: [PATCH] fix memory leak in virtio_net
Nope, kfree_skb() frees the frags. It needs to, otherwise we leak on every received packet! Cheers, Rusty. --
Aug 12, 5:41 am 2009
Serge E. Hallyn
Re: module loading permissions and request_module permis ...
Right, so taking a more extreme example, the request_module() in search_binary_handler... requiring CAP_SYS_MODULE there would mean you'd have to be privileged to be the first to execute say a binfmt_misc. The actual modules are to be protected by protecting /lib/modules and /sbin/modprobe themselves. So long as those are properly protected, the ability to cause a call to __request_module() at most takes up more memory. So what you say seems to make sense. -serge --
Aug 12, 4:48 pm 2009
Arnd Bergmann
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
We discussed this before, and I still think this could be directly derived from struct virtqueue, in the same way that vring_virtqueue is derived from struct virtqueue. That would make it possible for simple device drivers to use the same driver in both host and guest, similar to how Ira Snyder used virtqueues to make virtio_net run between two hosts running the same code [1]. Ideally, I guess you should be able to even make virtio_net work in the host if you do that, but that could bring ...
Aug 12, 10:03 am 2009
Ira W. Snyder
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
I have no comments about the vhost code itself, I haven't reviewed it. It might be interesting to try using a virtio-net in the host kernel to communicate with the virtio-net running in the guest kernel. The lack of a management interface is the biggest problem you will face (setting MAC addresses, negotiating features, etc. doesn't work intuitively). Getting the network interfaces talking is relatively easy. Ira --
Aug 12, 10:19 am 2009
Michael S. Tsirkin
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
I prefer keeping it simple. Much of abstraction in virtio is due to the fact that it needs to work on top of different hardware emulations: lguest,kvm, possibly others in the future. vhost is always working on I don't think so. For example, there's a callback field that gets invoked in guest when buffers are consumed. It could be overloaded to mean "buffers are available" in host but you never handle both As I pointed out earlier, most code in virtio net is asymmetrical: guest provides ...
Aug 12, 10:21 am 2009
Michael S. Tsirkin
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
That was one of the reasons I decided to move most of code out to userspace. My kernel driver only handles datapath, Tried this, but - guest memory isn't pinned, so copy_to_user to access it, errors need to be handled in a sane way - used/available roles are reversed - kick/interrupt roles are reversed So most of the code then looks like if (host) { } else { } return The only common part is walking the descriptor list, but that's like 10 lines of code. At which point ...
Aug 12, 10:31 am 2009
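For a sense of how small the shared "walk the descriptor list" part is, here is a userspace sketch. The descriptor layout and the VRING_DESC_F_NEXT flag value follow the virtio ring definition; the `_model` names and the `chain_length_model` function are hypothetical, and the loop guard is an assumption about what a careful walker would check rather than a copy of vhost or virtio_ring code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT 1   /* the buffer continues in the 'next' descriptor */

/* Same field layout as a virtio ring descriptor, with userspace types. */
struct vring_desc_model {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/*
 * Follow the chain starting at 'head' and return its total byte length.
 * Returns -1 if an index is out of range or the chain has more links than
 * the ring has entries (which would mean a loop).
 */
static int64_t chain_length_model(const struct vring_desc_model *desc,
                                  unsigned int num, unsigned int head)
{
    int64_t total = 0;
    unsigned int i = head;
    unsigned int seen = 0;

    assert(desc != NULL);
    for (;;) {
        if (i >= num || ++seen > num)
            return -1;
        total += desc[i].len;
        if (!(desc[i].flags & VRING_DESC_F_NEXT))
            return total;
        i = desc[i].next;
    }
}
```

The bounds and loop checks matter on the host side in particular, since the descriptors there are written by the guest and cannot be trusted.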
Ira W. Snyder
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
Ok, that makes sense. Let me see if I understand the concept of the driver. Here's a picture of what makes sense to me: guest system --------------------------------- | userspace applications | --------------------------------- | kernel network stack | --------------------------------- | virtio-net | --------------------------------- | transport (virtio-ring, etc.) | --------------------------------- | ...
Aug 12, 10:48 am 2009
Arnd Bergmann
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
Well, that was my point: virtio can already work on a number of abstractions, The trick is to swap the virtqueues instead. virtio-net is actually mostly symmetric in just the same way that the physical wires on a twisted pair ethernet are symmetric (I like how that analogy fits). virtio_net kicks the transmit virtqueue when it has data and it kicks the receive queue when it has empty buffers to fill, and it has callbacks when the two are done. You can do the same in both the guest and the ...
Aug 12, 10:59 am 2009
Anthony Liguori
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
Actually, vhost may not always be limited to real hardware. We may one day use vhost as the basis of a driver domain. There's quite a lot of interest in this for networking. At any rate, I'd like to see performance results before we consider trying to reuse virtio code. Regards, Anthony Liguori --
Aug 12, 12:22 pm 2009
Anthony Liguori Aug 12, 12:27 pm 2009
Paul E. McKenney
Re: [PATCH 2/2] vhost_net: a kernel-level virtio server
Much better -- a couple of documentation nits below. How about something like "Therefore the beginning of workqueue execution acts as rcu_read_lock() and the end of workqueue execution acts as rcu_read_unlock()"? It would also be good to add comments to the workqueue functions themselves saying that they act as read-side critical sections for your kind of RCU. --
Aug 12, 12:58 pm 2009
Paul Moore
Re: [RFC PATCH v2 2/2] selinux: Support for the new TUN ...
Thanks, I added both acks. -- paul moore linux @ hp --
Aug 12, 7:59 am 2009
Serge E. Hallyn Aug 12, 12:28 pm 2009
Paul Moore Aug 12, 12:43 pm 2009
Serge E. Hallyn
Re: [RFC PATCH v2 2/2] selinux: Support for the new TUN ...
IIUC it is possible for multiple processes to attach to the same tun device. Will it get confusing/incorrect to have each attach --
Aug 12, 3:14 pm 2009
Paul Moore
Re: [RFC PATCH v2 2/2] selinux: Support for the new TUN ...
I may be reading the code wrong, but in drivers/net/tun.c:tun_attach() the code checks to see if the TUN device is already in use and if it is then the attach fails with -EBUSY (check where the tun_device->tfile is examined). I believe this should ensure that only one process at a time has access to the TUN device so we shouldn't have to worry about a TUN socket getting relabeled while it is currently in use. As far as persistent TUN devices getting relabeled when a new process ...
Aug 12, 3:55 pm 2009
Serge E. Hallyn
Re: [RFC PATCH v2 2/2] selinux: Support for the new TUN ...
Ah yes, you're right - I saw the check for (ifr->ifr_flags & IFF_TUN_EXCL) in Ok, thanks. To my untrained eye the class addition looks right too, so with the trivial change: Acked-by: Serge Hallyn <serue@us.ibm.com> thanks, -serge --
Aug 12, 4:07 pm 2009
Oren Laadan
Re: [PATCH 5/5] c/r: Add AF_UNIX support (v8)
Dan, I just noticed that this message wasn't posted last night. So hitting "send" now ... sorry about that. ----- Before pulling this one, I took a quick look at this patch, and I saw that it still uses skb_morph despite the changelog and my memory... Can you please verify that this is the latest ? Also, while trying to pull it, I'd like to ask for three cosmetic changes, if it isn't too much - 1) Move 'struct ckpt_hdr_socket' et-al to checkpoint_hdr.h 2) Move everything that is ...
Aug 12, 8:29 am 2009
Dan Smith
Re: [PATCH 5/5] c/r: Add AF_UNIX support (v8)
OL> Before pulling this one, I took a quick look at this patch, and I OL> saw that it still uses skb_morph despite the changelog and my OL> memory... That's correct. We've been through several ways of allocating the skb's, so it's definitely confusing. We're back to skb_morph() because I'm pre-allocating them for lock safety when traversing the queues. The only thing that is allocated is the actual skb structure itself; the buffers are still shared like with skb_clone(). OL> 1) Move ...
Aug 12, 8:36 am 2009
Oren Laadan Aug 12, 12:19 pm 2009
Dmitry Eremin-Solenikov
Re: [PATCH 1/2] mac802154: add a software MAC 802.15.4 i ...
Currently we do all the work from special worker threads, so it's possible for this callback to sleep. The error isn't yet propagated to We were using master netdevices for several purposes: 1) ip link add link mwpanX type wpan, so that we have out-of-box support for radio additions. That's really nice thing to have. 2) for SOCK_RAW implementation that can be used to send raw packets over-the-air/receive raw packets. I think we can use af_packet for this, but I'm still not sure ...
Aug 12, 6:06 am 2009
Johannes Berg Aug 12, 6:13 am 2009
Dmitry Eremin-Solenikov
Re: [PATCH 1/2] mac802154: add a software MAC 802.15.4 i ...
Hmmm. Really weird. Then, if we want to pass data from socket layer to MAC layer, we should place data in skb->data and not in skb->cb (like Nice idea. Thanks a lot! -- With best wishes Dmitry --
Aug 12, 1:46 pm 2009
Brandeburg, Jesse
RE: Receive side performance issue with multi-10-GigE and NUMA
bill, I recently helped Jesse Barnes push a patch that addresses this kind of issue on CoreI7, the root cause was the numa_node variable was initialized based on slot on AMD systems, but needed to be set to -1 by default on systems with a uniform IOH to slot architecture. here is the commit ID: http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=commit;h=3c38d674be519109696746192943a6d524019f7f I'm not sure it is in linus' tree yet, this link is to net-next Maybe see if it ...
Aug 11, 5:02 pm 2009
Bill Fink
Re: Receive side performance issue with multi-10-GigE and NUMA
I did have NUMA enabled, and memory was configured as independent rather than interleaved. Based on all the discussions, it seemed a good possibility that the BIOS was broken. Today a colleague checked the SuperMicro site, and discovered and installed a newer version of the BIOS. Things seem better now, but not totally correct. There are now NUMA nodes 0 and 1 instead of 0 and 2, and the CPUs for node 0 are 0 through 3 while the CPUs for node 1 are 4 through 7 (previously the even CPUs ...
Aug 11, 9:30 pm 2009
Bill Fink
Re: Receive side performance issue with multi-10-GigE and NUMA
It's worth a shot. Hopefully I can get a chance to build a new kernel tomorrow to check out some of the suggestions, like this one, the setting of ACPI_DEBUG, and the new ftrace module for checking NUMA affinity of skbs. -Thanks -Bill --
Aug 11, 9:38 pm 2009
Andi Kleen
Re: Receive side performance issue with multi-10-GigE and NUMA
That might be ok, depending on how the APICs are configured. Of course you should have the same number of CPUs on the different Most likely you need the appended patch from linux-next. It should be probably in .31, but I can't see it in linus' tree only in -next. Jesse? No. -Andi commit eaf2f454cc9a76dbe1890af6269e60fe9978a3a5 Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Fri Jul 10 14:04:30 2009 -0700 x86/PCI: initialize PCI bus node numbers early ...
Aug 12, 12:21 am 2009
Jesse Barnes
Re: Receive side performance issue with multi-10-GigE and NUMA
On Wed, 12 Aug 2009 00:38:24 -0400 It's a fairly significant change so I wasn't planning on sending it to Linus for 2.6.31. If you think it *should* go into 2.6.31 (and stable for that matter), please let me know soon. Thanks, -- Jesse Barnes, Intel Open Source Technology Center --
Aug 12, 9:00 am 2009
David Miller
Re: Receive side performance issue with multi-10-GigE and NUMA
From: Bill Fink <billfink@mindspring.com> This, unfortunately, won't be comprehensive. You'd also need to kludge the NUMA node used for allocation of the skb->data buffer via the netdev_alloc_skb() calls in myri10ge_rx_done() and friends. This could possibly account for why, with your kludge, you still were only getting 56.4703 Gbps --
Aug 12, 4:29 pm 2009
Phil Sutter
Re: [PATCH] korina: Read buffer overflow
Hi, Obviously, I took the chance to mess things up again. These three patches were accidentally written on top of the linux-mips tree, right before Ralf pulled from Linus. So they do not apply cleanly to the netdev tree, and even worse the last one is completely useless since its changes have already been implemented. I will follow up to this email with an updated series of the two remaining, valid patches. Sorry for the inconvenience. Greetings, Phil --
Aug 12, 3:15 pm 2009
Phil Sutter
[PATCH 1/2] korina: fix printk formatting, add final info line
The macro DRV_NAME contains "korina", the field dev->name points to the actual interface name. So messages were formerly prefixed with 'korinaeth2:' (on my system). Signed-off-by: Phil Sutter <n0-1@freewrt.org> --- drivers/net/korina.c | 32 +++++++++++++++++--------------- 1 files changed, 17 insertions(+), 15 deletions(-) diff --git a/drivers/net/korina.c b/drivers/net/korina.c index b4cf602..6df9d25 100644 --- a/drivers/net/korina.c +++ b/drivers/net/korina.c @@ -338,7 +338,7 @@ ...
Aug 12, 3:22 pm 2009
Phil Sutter
[PATCH 2/2] korina: add error-handling to korina_alloc_ring
This also avoids a potential buffer overflow in case the very first receive descriptor fails to allocate, as an index of -1 would be used afterwards. Kudos to Roel Kluin for pointing this out and providing an initial patch. Signed-off-by: Phil Sutter <n0-1@freewrt.org> --- drivers/net/korina.c | 12 +++++++++--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/net/korina.c b/drivers/net/korina.c index 6df9d25..51ca54c 100644 --- a/drivers/net/korina.c +++ ...
Aug 12, 3:52 pm 2009
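The failure mode described in this patch, where a failed first allocation leads to an index of -1 being used afterwards, is the usual reason for unwinding a ring allocation explicitly. The sketch below is a generic userspace model, not the korina driver: `ring_model`, `alloc_ring_model` and the `fail_at` test knob are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE_MODEL 4

struct ring_model {
    void *buf[RING_SIZE_MODEL];
};

/*
 * Allocate one buffer per ring slot. 'fail_at' forces the allocation for
 * that slot to fail (use a negative value for "never") so the error path
 * can be exercised. Returns 0 on success, -1 on failure with nothing leaked.
 */
static int alloc_ring_model(struct ring_model *ring, int fail_at)
{
    int i;

    assert(ring != NULL);
    for (i = 0; i < RING_SIZE_MODEL; i++) {
        ring->buf[i] = (i == fail_at) ? NULL : malloc(64);
        if (!ring->buf[i])
            goto err_free;
    }
    return 0;

err_free:
    /* Unwind only the slots that were filled; if slot 0 failed, this loop
     * does not run at all, so index -1 is never touched. */
    while (--i >= 0) {
        free(ring->buf[i]);
        ring->buf[i] = NULL;
    }
    return -1;
}
```

Returning an error to the caller, instead of continuing with a partially filled ring, is what lets the caller avoid computing "last filled slot" from a count of zero.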
Dave Jones
Re: 8139cp dma-debug warning.
On Thu, Aug 06, 2009 at 05:57:02PM -0400, Dave Jones wrote: > I'm chasing yet another dma-debug warning where we're unmapping a different > size to what we mapped. > > > WARNING: at lib/dma-debug.c:803 check_unmap+0x1f5/0x509() (Not tainted) > > Hardware name: > > 8139cp 0000:00:03.0: DMA-API: device driver frees DMA memory with different > > size [device address=0x000000001e9f8852] [map size=1536 bytes] [unmap size=1538 > > bytes] > > Modules linked in: ipv6 dm_multipath ...
Aug 12, 10:13 am 2009
David Miller
Re: [PATCH] net: Fix spinlock use in alloc_netdev_mq()
From: Jiri Pirko <jpirko@redhat.com> Well, because of those potential late dev->type settings we can't do things this way. And I believe those in fact do happen. So I'm tossing this patch, I wouldn't have applied it to net-2.6 anyways, as it's net-next-2.6 material :-) --
Aug 12, 4:44 pm 2009
Vlad Yasevich
Re: WARNING: at net/ipv4/af_inet.c:155 inet_sock_destruc ...
BTW, I've seen the same issue in 2.6.28 and 2.6.29 while doing a bunch of NFS-over-UDP testing. I've seen the issue reported in 2.6.27 as well, but it went by ignored. It's not easy to reproduce as it seems like it requires quite a bit of traffic over multiple interfaces. I've been looking at this for a while and haven't caught the bugger. Here is the stack trace from 2.6.28: May 13 16:17:38 dl380g6-2 kernel: [ 4473.086015] ------------[ cut here ]------------ May 13 16:17:38 ...
Aug 12, 1:00 pm 2009
David Miller
Re: [PATCH] pppoe: fix race at init time
From: Cyrill Gorcunov <gorcunov@gmail.com> Still no feedback on this one, but it looks totally correct to me. So I've applied it to net-next-2.6 so that it doesn't get lost and if it turns out we need it to actually fix a user reported bug we can toss it into net-2.6 too. --
Aug 12, 4:40 pm 2009
Rémi Denis-Courmont
Re: [PATCH] Phonet: sockets list through proc_fs
I simply did not think of the sole pn_sock_seq_fops as "so many things"... But I can change it if you think it would be better. -- Rémi Denis-Courmont Nokia Devices R&D, Maemo Software, Helsinki --
Aug 12, 4:02 am 2009
David Miller
Re: [PATCH] Phonet: sockets list through proc_fs
From: "Rémi Denis-Courmont" <remi.denis-courmont@nokia.com> Maybe I'm exaggerating, but in any event the less you export from a file the cleaner it tends to be. --
Aug 12, 11:06 am 2009
Arnd Bergmann
Re: [PATCH][RFC] net/bridge: add basic VEPA support
Right, that question is still open, and don't see it as very important Not yet, but I guess it comes as a natural extension when I fix multicast/broadcast delivery from the reflective relay for VEPA. The logic that I would use there is: broadcast from a downstream port: if (bridge_mode(source_port)) { forward_to_upstream(frame); for_each_downstream(port) { /* deliver to all bridge ports except self, do not deliver to any VEPA ...
Aug 12, 6:19 am 2009
Fischer, Anna
RE: [PATCH][RFC] net/bridge: add basic VEPA support
Yes, for the basic VEPA this is not important. For MultiChannel VEPA, it would be nice if a macvlan device could operate as VEPA and as a typical VEB (VEB = traditional bridge but no learning). Basically, what we would need to be able to support is running a VEB and a VEPA simultaneously on the same uplink port (e.g. the physical device). A new component (called the S-Component) would then multiplex frames to the VEB or the VEPA based on a tagging scheme. I could see this potentially ...
Aug 12, 7:32 am 2009
Arnd Bergmann
Re: [PATCH][RFC] net/bridge: add basic VEPA support
Right, this would be a logical extension in that scenario. I would imagine that in many scenarios running a VEB also means that you want to use the advanced ebtables/iptables filtering of the bridge subsystem, but if all guests trust each other, using macvlan to bridge between them You can of course do that by adding one port of the S-component to a port of a bridge, and using another port of the S-component to create macvlan devices, or you could have multiple ports of the S-component each ...
Aug 12, 9:27 am 2009