From: Benjamin Thery <benjamin.thery@bull.net>
To: netdev <netdev@vger.kernel.org>, Dave Miller <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>,
	Greg Kroah-Hartman <gregkh@suse.de>,
	Al Viro <viro@ftp.linux.org.uk>, Serge Hallyn <serue@us.ibm.com>,
	Daniel Lezcano <dlezcano@fr.ibm.com>,
	linux-kernel@vger.kernel.org, Tejun Heo <htejun@gmail.com>,
	Denis Lunev <den@openvz.org>,
	Linux Containers <containers@lists.linux-foundation.org>,
	Benjamin Thery <benjamin.thery@bull.net>
Subject: [PATCH 4/4] netns: sysfs: add netns suffix to net devices sysfs entries
Date: Wed, 22 Oct 2008 17:22:24 +0200
Message-ID: <20081022152145.128767713@theryb.frec.bull.fr>
In-Reply-To: 20081022152144.351965414@theryb.frec.bull.fr

Reminder: the goal is to be able to create network interfaces with
the same name in different network namespaces (e.g. the loopback). The
remaining issues are in sysfs.

This patch dissociates the network devices' actual names (stored in
struct net_device and seen by the ifconfig/ip tools) from the network
device names stored in sysfs.

When a network device is added in a child network namespace (!init_net),
a suffix unique to that netns is appended to the actual device name when
the device is registered in sysfs. Currently this suffix is the netns
ida ID in hexadecimal, separated from the name by a '@' character.
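
As an illustration only, the naming rule above can be reproduced in
userspace. This is a sketch, not the kernel code: sketch_sysfs_name() is a
hypothetical stand-in for netdev_fill_bus_id_name(), the sizes 16 and 20
are assumed values for IFNAMSIZ and BUS_ID_SIZE (consistent with the "four
characters left" mentioned under Issues), and treating ID 0 as init_net is
an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed sizes: IFNAMSIZ = 16, BUS_ID_SIZE = 20 (hence 4 spare bytes). */
#define SKETCH_IFNAMSIZ    16
#define SKETCH_BUS_ID_SIZE 20

/*
 * Hypothetical userspace stand-in for netdev_fill_bus_id_name().
 * "out" must hold at least SKETCH_BUS_ID_SIZE bytes.
 * netns_id == 0 plays the role of init_net (assumption of this sketch):
 * the name is copied unchanged. Any other ID is appended as "@<hex>".
 * Returns the length snprintf() wanted to write (excluding the NUL).
 */
int sketch_sysfs_name(char *out, const char *ifname, unsigned int netns_id)
{
	/* A valid interface name always fits in IFNAMSIZ with its NUL. */
	assert(strlen(ifname) < SKETCH_IFNAMSIZ);

	if (netns_id == 0)
		return snprintf(out, SKETCH_BUS_ID_SIZE, "%s", ifname);

	return snprintf(out, SKETCH_BUS_ID_SIZE, "%s@%x", ifname, netns_id);
}
```

With these assumptions, "lo" in the namespace with ID 0xe5 becomes "lo@e5",
and "lo" in the initial namespace stays "lo".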

In sysfs, we see all the network devices of all netns.

# ll /sys/devices/virtual/net/
...
drwxr-xr-x 4 root root 0 2008-10-13 14:08 lo
drwxr-xr-x 4 root root 0 2008-10-13 16:31 lo@1
...
drwxr-xr-x 4 root root 0 2008-10-13 16:31 lo@e5
...

Then, in the child network namespace we can filter the contents of 
/sys/class/net with:

* mount -t tmpfs tmpfs /sys/class/net
* and manually link the right devices from /sys/devices/virtual/net
  (ln -s ../../devices/virtual/net/lo@1 lo)

Thus, /sys/class/net appears to be fine for the applications running
in this namespace.
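
The two filtering steps above could also be done from a small helper
program instead of by hand. The following is a minimal sketch under stated
assumptions: both function names are hypothetical, the buffer sizes are
arbitrary, and the caller is assumed to have the privileges needed for
mount(2). It only uses mount(2) and symlink(2).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Build the relative symlink target "../../devices/virtual/net/<name>".
 * Returns 0 on success, -1 if the result does not fit in "out".
 */
int sketch_link_target(char *out, size_t outlen, const char *sysfs_name)
{
	int n = snprintf(out, outlen, "../../devices/virtual/net/%s",
			 sysfs_name);

	return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}

/*
 * Hide the global view of /sys/class/net behind a tmpfs, then expose one
 * device under its namespace-local name, e.g. ("lo@1", "lo").
 * Requires privileges for mount(2). Returns 0 on success, -1 on error.
 */
int sketch_filter_class_net(const char *sysfs_name, const char *ifname)
{
	char target[128];
	char linkpath[128];
	int n;

	if (sketch_link_target(target, sizeof(target), sysfs_name) < 0)
		return -1;

	n = snprintf(linkpath, sizeof(linkpath), "/sys/class/net/%s", ifname);
	if (n < 0 || (size_t)n >= sizeof(linkpath))
		return -1;

	/* Step 1: equivalent of "mount -t tmpfs tmpfs /sys/class/net". */
	if (mount("tmpfs", "/sys/class/net", "tmpfs", 0, NULL) < 0)
		return -1;

	/* Step 2: equivalent of "ln -s ../../devices/virtual/net/lo@1 lo". */
	return symlink(target, linkpath);
}
```

Calling sketch_filter_class_net("lo@1", "lo") inside the child namespace
would correspond to the manual commands listed above.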

FUSE can also be used to alter the view of /sys/class/net in the 
namespace.

Issues:
-------

* The suffix

  We only have four characters left (BUS_ID_SIZE - IFNAMSIZ) to build the
  suffix: one for the separator and three to encode the netns ID.
  Encoding the ID in hexadecimal therefore limits us to 4095 (0xFFF)
  child network namespaces running at the same time.
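
  The arithmetic behind this limit can be checked with a short userspace
  sketch. It is illustrative only: sketch_suffix_fits() is a hypothetical
  helper, and 20 is an assumed value for BUS_ID_SIZE.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed value of BUS_ID_SIZE (IFNAMSIZ + 4 spare bytes). */
#define SKETCH_BUS_ID_SIZE 20

/*
 * Returns 1 if "<ifname>@<id in hex>" plus its terminating NUL fits in
 * SKETCH_BUS_ID_SIZE bytes, 0 if it would be truncated.
 */
int sketch_suffix_fits(const char *ifname, unsigned int netns_id)
{
	/* With a NULL buffer and size 0, snprintf() only reports the length. */
	int need = snprintf(NULL, 0, "%s@%x", ifname, netns_id);

	return need >= 0 && need < SKETCH_BUS_ID_SIZE;
}
```

  A 15-character interface name (the longest that fits in IFNAMSIZ with
  its NUL) still fits with ID 0xfff, but not with ID 0x1000, where the
  sysfs name would be truncated and could collide with another entry.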

* This approach reduces isolation between network namespaces

  Anyone can see all the devices of every namespace by exploring
  /sys/devices/../.. or /sys/class/net (if it's not re-mounted as tmpfs).

* Does not work very well with CONFIG_SYSFS_DEPRECATED=y 

  The filtering of /sys/class/net with CONFIG_SYSFS_DEPRECATED=y is more
  difficult to do because in this case /sys/class/net contains 
  the actual directories (not symlinks).


Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---
 net/Kconfig          |    2 +-
 net/core/dev.c       |    4 +++-
 net/core/net-sysfs.c |   34 +++++++++++++++++++++++++++++++++-
 net/core/net-sysfs.h |    1 +
 4 files changed, 38 insertions(+), 3 deletions(-)

Index: net-next-2.6/net/Kconfig
===================================================================
--- net-next-2.6.orig/net/Kconfig
+++ net-next-2.6/net/Kconfig
@@ -27,7 +27,7 @@ menu "Networking options"
 config NET_NS
 	bool "Network namespace support"
 	default n
-	depends on EXPERIMENTAL && !SYSFS && NAMESPACES
+	depends on EXPERIMENTAL && NAMESPACES
 	help
 	  Allow user space to create what appear to be multiple instances
 	  of the network stack.
Index: net-next-2.6/net/core/dev.c
===================================================================
--- net-next-2.6.orig/net/core/dev.c
+++ net-next-2.6/net/core/dev.c
@@ -894,6 +894,7 @@ int dev_alloc_name(struct net_device *de
 int dev_change_name(struct net_device *dev, const char *newname)
 {
 	char oldname[IFNAMSIZ];
+	char devname[BUS_ID_SIZE];
 	int err = 0;
 	int ret;
 	struct net *net;
@@ -924,7 +925,8 @@ int dev_change_name(struct net_device *d
 		strlcpy(dev->name, newname, IFNAMSIZ);
 
 rollback:
-	err = device_rename(&dev->dev, dev->name);
+	netdev_fill_bus_id_name(devname, dev);
+	err = device_rename(&dev->dev, devname);
 	if (err) {
 		memcpy(dev->name, oldname, IFNAMSIZ);
 		return err;
Index: net-next-2.6/net/core/net-sysfs.c
===================================================================
--- net-next-2.6.orig/net/core/net-sysfs.c
+++ net-next-2.6/net/core/net-sysfs.c
@@ -468,6 +468,38 @@ static struct class net_class = {
 #endif
 };
 
+/* Fill device bus_id name from net device name
+ * When registering a device for a child network namespace,
+ * a suffix is added to the name stored in "struct device"
+ * bus_id.
+ *
+ * devname size must be at least BUS_ID_SIZE
+ */
+void netdev_fill_bus_id_name(char *devname, struct net_device *netdev)
+{
+#ifndef CONFIG_NET_NS
+	strlcpy(devname, netdev->name, BUS_ID_SIZE);
+#else
+	struct net *net = dev_net(netdev);
+
+	if (net_eq(net, &init_net))
+		strlcpy(devname, netdev->name, BUS_ID_SIZE);
+	else {
+		/*
+		 * To allow registration of net devices with the same name in
+		 * different namespaces, append the netns identifier to the
+		 * device name in sysfs using the 4 bytes left in bus_id
+		 * (BUS_ID_SIZE - IFNAMSIZ).
+		 *
+		 * devname is in the form: device_name@XXX
+		 * the netns identifier is an integer <= 4095 (0xFFF), thus
+		 * encodable in hexadecimal in 3 characters ("FFF").
+		 */
+		snprintf(devname, BUS_ID_SIZE, "%s@%x", netdev->name, net->id);
+	}
+#endif
+}
+
 /* Delete sysfs entries but hold kobject reference until after all
  * netdev references are gone.
  */
@@ -490,7 +522,7 @@ int netdev_register_kobject(struct net_d
 	dev->groups = groups;
 
 	BUILD_BUG_ON(BUS_ID_SIZE < IFNAMSIZ);
-	strlcpy(dev->bus_id, netdev->name, BUS_ID_SIZE);
+	netdev_fill_bus_id_name(dev->bus_id, netdev);
 
 #ifdef CONFIG_SYSFS
 	*groups++ = &netstat_group;
Index: net-next-2.6/net/core/net-sysfs.h
===================================================================
--- net-next-2.6.orig/net/core/net-sysfs.h
+++ net-next-2.6/net/core/net-sysfs.h
@@ -5,4 +5,5 @@ int netdev_kobject_init(void);
 int netdev_register_kobject(struct net_device *);
 void netdev_unregister_kobject(struct net_device *);
 void netdev_initialize_kobject(struct net_device *);
+void netdev_fill_bus_id_name(char *, struct net_device *);
 #endif

-- 
