Changes from the previous version:
- Rename proc_{get,put}_ulong to proc_{get,put}_long();
- Fix potential dead loop problems in cma code.
------------->
This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
allows users to reserve ports for third-party applications.
The reserved ports will not be used by automatic port assignments
(e.g. when calling connect() or bind() with port number 0). Explicit
port allocation behavior is unchanged.
There are still some miss behaviors with regard to proc parsing in odd
invalid cases (for "40000\0-40001" all is acknowledged but only 40000
is accepted) but they are not easy to fix without changing the current
"acknowledge how much we accepted" behavior.
Because of that and because the same issues are present in the
existing proc_dointvec code as well I don't think its worth holding
the actual feature (port reservation) after such petty error recovery
issues.
--
From: Octavian Purdila <opurdila@ixiacom.com>
As we are about to add another integer handling proc function a little
bit of cleanup is in order: add a few helper functions to improve code
readability and decrease code duplication.
In the process a bug is also fixed: if the user specifies a number
with more then 20 digits it will be interpreted as two integers
(e.g. 10000...13 will be interpreted as 100.... and 13).
Behavior for EFAULT handling was changed as well. Previous to this
patch, when an EFAULT error occurred in the middle of a write
operation, although some of the elements were set, that was not
acknowledged to the user (by shorting the write and returning the
number of bytes accepted). EFAULT is now treated just like any other
errors by acknowledging the amount of bytes accepted.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
---
Index: linux-2.6/kernel/sysctl.c
===================================================================
--- linux-2.6.orig/kernel/sysctl.c
+++ linux-2.6/kernel/sysctl.c
@@ -2040,8 +2040,148 @@ int proc_dostring(struct ctl_table *tabl
buffer, lenp, ppos);
}
+static int proc_skip_wspace(char __user **buf, size_t *size)
+{
+ char c;
+
+ while (*size) {
+ if (get_user(c, *buf))
+ return -EFAULT;
+ if (!isspace(c))
+ break;
+ (*size)--;
+ (*buf)++;
+ }
+
+ return 0;
+}
+
+static bool isanyof(char c, const char *v, unsigned len)
+{
+ int i;
+
+ if (!len)
+ return false;
+
+ for (i = 0; i < len; i++)
+ if (c == v[i])
+ break;
+ if (i == len)
+ return false;
+
+ return true;
+}
+
+#define TMPBUFLEN 22
+/**
+ * proc_get_long - reads an ASCII formated integer from a user buffer
+ *
+ * @buf - user buffer
+ * @size - size of the user buffer
+ * @val - this is where the number will be stored
+ * @neg - set to %TRUE if number is negative
+ * @perm_tr - a vector which contains the allowed ...ULONG_MAX is not 22 digits always. The fix is to not rely on simple_strtoul() I guess it's time to finally remove it. :-( Also, it's better to copy_from user stuff once. A what? --
It seems that all proc code assumes that the input buffer will Hmm, right, it should be memchr(v, c, len). Thanks! --
From: Octavian Purdila <opurdila@ixiacom.com> This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which allows users to reserve ports for third-party applications. The reserved ports will not be used by automatic port assignments (e.g. when calling connect() or bind() with port number 0). Explicit port allocation behavior is unchanged. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> --- Index: linux-2.6/Documentation/networking/ip-sysctl.txt =================================================================== --- linux-2.6.orig/Documentation/networking/ip-sysctl.txt +++ linux-2.6/Documentation/networking/ip-sysctl.txt @@ -588,6 +588,37 @@ ip_local_port_range - 2 INTEGERS (i.e. by default) range 1024-4999 is enough to issue up to 2000 connections per second to systems supporting timestamps. +ip_local_reserved_ports - list of comma separated ranges + Specify the ports which are reserved for known third-party + applications. These ports will not be used by automatic port + assignments (e.g. when calling connect() or bind() with port + number 0). Explicit port allocation behavior is unchanged. + + The format used for both input and output is a comma separated + list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and + 10). Writing to the file will clear all previously reserved + ports and update the current list with the one given in the + input. + + Note that ip_local_port_range and ip_local_reserved_ports + settings are independent and both are considered by the kernel + when determining which ports are available for automatic port + assignments. + + You can reserve ports which are not in the current + ip_local_port_range, e.g.: + + $ cat /proc/sys/net/ipv4/ip_local_port_range + 32000 61000 + $ cat ...
I think above part is wrong. Below program
--------------------
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/idr.h>
static DEFINE_IDR(idr);
static int idr_demo_init(void)
{
int next_port = 65530;
int port = 0;
int ret = -EINTR;
while (!signal_pending(current)) {
msleep(1000);
ret = idr_get_new_above(&idr, NULL, next_port, &port);
printk(KERN_INFO "idr_get_new_above() = %d\n", ret);
if (!ret) {
/* Emulate inet_is_reserved_local_port(port) = true */
printk(KERN_INFO "Port %u is reserved.\n", port);
ret = -EAGAIN;
}
if (ret == -EAGAIN) {
if (idr_pre_get(&idr, GFP_KERNEL)) {
printk(KERN_INFO "idr_pre_get() succeeded.\n");
continue;
}
printk(KERN_INFO "idr_pre_get() failed.\n");
break;
} else {
printk(KERN_INFO "next_port=%u port=%u\n",
next_port, port);
break;
}
}
if (!ret)
idr_remove(&idr, port);
idr_destroy(&idr);
return -EINVAL;
}
module_init(idr_demo_init);
MODULE_LICENSE("GPL");
--------------------
generated below output.
idr_get_new_above() = -11
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65530 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65531 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65532 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65533 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65534 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65535 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65536 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
Port 65537 is reserved.
idr_pre_get() succeeded.
idr_get_new_above() = 0
(...snipped...)
This result suggests that above loop will continue until idr_pre_get() fails
due to out of memory if all ports were reserved.
Also, if idr_get_new_above() returned 0, bind_list (which is a kmalloc()ed
pointer) is already installed into a free slot (see ...Thanks for testing! How about: + if (!ret && inet_is_reserved_local_port(port)) + ret = -EBUSY; ? So that it will break the loop and return error. --
Or use the similar trick:
int tries = 10;
...
if(!ret && inet_is_reserved_local_port(port)) {
if (tries--)
ret = -EAGAIN;
else
ret = -EBUSY;
}
Any comments?
--
Hello.
Adding Sean Hefty and Roland Dreier as drivers/infiniband/core/cma.c maintainer.
I don't like above change. Above change makes local port assignment from
"likely-succeed" (succeeds if one port is available from thousands of ports) to
"unlikely-succeed" (fail if randomly chosen port is already in use).
We should repeat for all ranges specified in /proc/sys/net/ipv4/ip_local_port_range .
cma_alloc_any_port() and cma_alloc_port() are almost identical.
Thus, I think we can call cma_alloc_port() from cma_alloc_any_port().
Sean and Roland, is below patch correct?
inet_is_reserved_local_port() is the new function proposed in this patchset.
---
drivers/infiniband/core/cma.c | 68 ++++++++++++++----------------------------
1 file changed, 23 insertions(+), 45 deletions(-)
--- linux-2.6.34-rc4.orig/drivers/infiniband/core/cma.c
+++ linux-2.6.34-rc4/drivers/infiniband/core/cma.c
@@ -79,7 +79,6 @@ static DEFINE_IDR(sdp_ps);
static DEFINE_IDR(tcp_ps);
static DEFINE_IDR(udp_ps);
static DEFINE_IDR(ipoib_ps);
-static int next_port;
struct cma_device {
struct list_head list;
@@ -1970,47 +1969,31 @@ err1:
static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
{
- struct rdma_bind_list *bind_list;
- int port, ret, low, high;
-
- bind_list = kzalloc(sizeof *bind_list, GFP_KERNEL);
- if (!bind_list)
- return -ENOMEM;
-
-retry:
- /* FIXME: add proper port randomization per like inet_csk_get_port */
- do {
- ret = idr_get_new_above(ps, bind_list, next_port, &port);
- } while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
-
- if (ret)
- goto err1;
+ static unsigned int last_used_port;
+ int low, high, remaining;
+ unsigned int rover;
inet_get_local_port_range(&low, &high);
- if (port > high) {
- if (next_port != low) {
- idr_remove(ps, port);
- next_port = low;
- goto retry;
+ remaining = (high - low) + 1;
+ rover = net_random() % remaining + low;
+ do {
+ rover++;
+ if ((rover < low) || (rover > ...It looks correct to me. I didn't test the patch series, but if I comment out the call to inet_is_reserved_local_port() in the provided below, the changes worked fine for me. Acked-by: Sean Hefty <sean.hefty@intel.com> --
Thank you for testing.
I think it is better to split this patch into
Part 1: Make cma_alloc_any_port() to use cma_alloc_port().
Part 2: Insert "!inet_is_reserved_local_port(rover) &&" line.
for future "git bisect".
Roland, will you review below patch for part 1?
--------------------
[PATCH] Infiniband: Randomize local port allocation.
Randomize local port allocation in a way sctp_get_port_local() does.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
drivers/infiniband/core/cma.c | 69 ++++++++++++++----------------------------
1 file changed, 24 insertions(+), 45 deletions(-)
--- linux-2.6.34-rc4.orig/drivers/infiniband/core/cma.c
+++ linux-2.6.34-rc4/drivers/infiniband/core/cma.c
@@ -79,7 +79,6 @@ static DEFINE_IDR(sdp_ps);
static DEFINE_IDR(tcp_ps);
static DEFINE_IDR(udp_ps);
static DEFINE_IDR(ipoib_ps);
-static int next_port;
struct cma_device {
struct list_head list;
@@ -1970,47 +1969,32 @@ err1:
static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
{
- struct rdma_bind_list *bind_list;
- int port, ret, low, high;
-
- bind_list = kzalloc(sizeof *bind_list, GFP_KERNEL);
- if (!bind_list)
- return -ENOMEM;
-
-retry:
- /* FIXME: add proper port randomization per like inet_csk_get_port */
- do {
- ret = idr_get_new_above(ps, bind_list, next_port, &port);
- } while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
-
- if (ret)
- goto err1;
+ static unsigned int last_used_port;
+ int low, high, remaining;
+ unsigned int rover;
inet_get_local_port_range(&low, &high);
- if (port > high) {
- if (next_port != low) {
- idr_remove(ps, port);
- next_port = low;
- goto retry;
+ remaining = (high - low) + 1;
+ rover = net_random() % remaining + low;
+ do {
+ rover++;
+ if ((rover < low) || (rover > high))
+ rover = low;
+ if (last_used_port != rover &&
+ !idr_find(ps, (unsigned short) rover)) {
+ int ret = cma_alloc_port(ps, id_priv, rover);
+ /*
+ * ...Right, thanks a lot for your work! So, I will rebase my patch 3/3 on top of this patch. I hope someone could take this one asap. --
Thanks for fixing this long outstanding issue. :) The latest patch looks correct and passed some simple tests that I ran against it. One comment below, Assuming that we're likely to pick a valid port on the first try, it would be --
Indeed. I moved these lines to "if (--remaining) { ... }" block.
--------------------
[PATCH] Infiniband: Randomize local port allocation.
Randomize local port allocation in a way sctp_get_port_local() does.
Update rover at the end of loop since we're likely to pick a valid port
on the first try.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
drivers/infiniband/core/cma.c | 70 +++++++++++++++---------------------------
1 file changed, 25 insertions(+), 45 deletions(-)
--- linux-2.6.34-rc4.orig/drivers/infiniband/core/cma.c
+++ linux-2.6.34-rc4/drivers/infiniband/core/cma.c
@@ -79,7 +79,6 @@ static DEFINE_IDR(sdp_ps);
static DEFINE_IDR(tcp_ps);
static DEFINE_IDR(udp_ps);
static DEFINE_IDR(ipoib_ps);
-static int next_port;
struct cma_device {
struct list_head list;
@@ -1970,47 +1969,33 @@ err1:
static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
{
- struct rdma_bind_list *bind_list;
- int port, ret, low, high;
-
- bind_list = kzalloc(sizeof *bind_list, GFP_KERNEL);
- if (!bind_list)
- return -ENOMEM;
-
-retry:
- /* FIXME: add proper port randomization per like inet_csk_get_port */
- do {
- ret = idr_get_new_above(ps, bind_list, next_port, &port);
- } while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
-
- if (ret)
- goto err1;
+ static unsigned int last_used_port;
+ int low, high, remaining;
+ unsigned int rover;
inet_get_local_port_range(&low, &high);
- if (port > high) {
- if (next_port != low) {
- idr_remove(ps, port);
- next_port = low;
- goto retry;
- }
- ret = -EADDRNOTAVAIL;
- goto err2;
+ remaining = (high - low) + 1;
+ rover = net_random() % remaining + low;
+retry:
+ if (last_used_port != rover &&
+ !idr_find(ps, (unsigned short) rover)) {
+ int ret = cma_alloc_port(ps, id_priv, rover);
+ /*
+ * Remember previously used port number in order to avoid
+ * re-using same port immediately after it is closed.
+ */
+ if ...I like this version, thanks! I'm not sure which tree to merge it through. --
Cong, merge window for 2.6.34 was already closed. You need to make your patchset towards 2.6.35 (using net-next-2.6 tree) rather than 2.6.34 (using linux-2.6 tree). Therefore, this patch being queued for 2.6.35 (through net-next-2.6 tree) should be okay for you. --
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> I don't take RDMA patches into net-next-2.6, the less I touch this stack avoiding stuff the better and Roland has been taking this stuff into his own tree for some time now. --
I left for a few days. Ok, so I will wait for this to be merged. Thanks, David and Tetsuo! --
Thanks, applied this part of the patch -- I preferred this one since the goto into the middle of a loop seemed worse than a goto out of the loop... -- Roland Dreier <rolandd@cisco.com> || For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/index.html --
> Thanks, applied this part of the patch -- I preferred this one since the err, not "part of the patch" -- I meant "this version of the patch". -- Roland Dreier <rolandd@cisco.com> || For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/index.html --
| Jesse Barnes | Re: [stable] [BUG][PATCH] cpqphp: fix kernel NULL pointer dereference |
| Greg KH | [003/136] p54usb: add Zcomax XG-705A usbid |
| Magnus Damm | [PATCH 03/07] ARM: Use shared GIC entry macros on Realview |
| Oliver Neukum | Re: [Bug #13682] The webcam stopped working when upgrading from 2.6.29 to 2.6.30 |
| Martin Schwidefsky |
