LKML Archive on lore.kernel.org
* [PATCH 0/5 v2] tracing: Add new file system tracefs
@ 2015-01-23 15:55 Steven Rostedt
  2015-01-23 15:55 ` [PATCH 1/5 v2] tracefs: Add new tracefs file system Steven Rostedt
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Steven Rostedt @ 2015-01-23 15:55 UTC
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton


There have been complaints that tracing is tied too much to debugfs,
as there are systems that would like to perform tracing, but do
not mount debugfs for security reasons. That is because any subsystem
may use debugfs for debugging, and these interfaces are not always
tested for security.

Creating a new tracefs file system, to which the tracing directory is
now attached, lets system admins access the tracing directory without
the need to mount debugfs.

Another advantage is that debugfs does not support the system calls
for mkdir and rmdir. Tracing uses these system calls to create new
instances for sub buffers. This was done by a hack that hijacked the
dentry ops from the "instances" debugfs dentry and replaced them with
ones that could work.

Instead of using this hack, tracefs can provide a proper interface to
allow the tracing system to have a mkdir and rmdir feature.
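
From userspace, creating and removing an instance comes down to plain
mkdir(2) and rmdir(2) calls on the "instances" directory. The following is
only a minimal sketch of that usage: the helper names create_instance()
and remove_instance() are hypothetical, and the directory passed in is
whatever path the tracing directory is reachable at on the system in
question.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<instances_dir>/<name>"; return 0 on success, -1 if it does not fit. */
static int instance_path(char *buf, size_t len,
			 const char *instances_dir, const char *name)
{
	int n = snprintf(buf, len, "%s/%s", instances_dir, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: create an instance with a plain mkdir(2). */
int create_instance(const char *instances_dir, const char *name)
{
	char path[4096];

	if (instance_path(path, sizeof(path), instances_dir, name))
		return -1;
	/* mkdir(2) requires a mode argument; 0755 is an arbitrary choice here. */
	return mkdir(path, 0755);
}

/* Hypothetical helper: remove an instance with a plain rmdir(2). */
int remove_instance(const char *instances_dir, const char *name)
{
	char path[4096];

	if (instance_path(path, sizeof(path), instances_dir, name))
		return -1;
	return rmdir(path);
}
```

On a system with tracefs mounted at /sys/kernel/tracing, a call such as
create_instance("/sys/kernel/tracing/instances", "foo") would be the
equivalent of "mkdir instances/foo" from a shell. Both helpers return -1
and leave errno set on failure, as mkdir(2) and rmdir(2) do.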

To maintain backward compatibility with older tools that expect that
the tracing directory is mounted with debugfs, the tracing directory
is still created under debugfs and tracefs is automatically mounted
there.

Finally, when tracefs is enabled, a new directory called
/sys/kernel/tracing is created. This is the new location where system
admins may mount tracefs if they are not using debugfs.
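
Since the tracing directory can now live in more than one place, a tool
may want to check which candidate path is actually backed by tracefs.
One way to do that, sketched below, is to compare the f_type reported by
statfs(2) with the TRACEFS_MAGIC value (0x74726163) that patch 1 adds to
include/uapi/linux/magic.h. The helper names is_tracefs() and
first_tracefs_dir() and the DEMO_TRACEFS_MAGIC macro are hypothetical.

```c
#include <stddef.h>
#include <sys/vfs.h>

/* Same value as TRACEFS_MAGIC in patch 1; redefined so the sketch stands alone. */
#define DEMO_TRACEFS_MAGIC 0x74726163UL

/* Return 1 if path sits on a file system reporting the tracefs magic. */
int is_tracefs(const char *path)
{
	struct statfs st;

	if (statfs(path, &st) != 0)
		return 0;	/* missing path or no access: treat as "not tracefs" */
	return (unsigned long)st.f_type == DEMO_TRACEFS_MAGIC;
}

/* Return the first candidate that is on tracefs, or NULL if none is. */
const char *first_tracefs_dir(const char *const *paths, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (is_tracefs(paths[i]))
			return paths[i];
	return NULL;
}
```

A caller would typically pass the new location first and the debugfs
location second, for example "/sys/kernel/tracing" followed by
"/sys/kernel/debug/tracing" (the latter assumes debugfs is mounted at its
usual place). Checking the magic rather than mere existence matters
because /sys/kernel/tracing exists as an empty mount point even when
nothing is mounted on it.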

Changes from v1:

 o Fixed all the posting problems (included files that were missing
   and removed changes that were not supposed to be there).

 o Changed the mkdir/rmdir logic. Instead of trying to keep the inode
   mutexes locked, which caused locking issues with other locks that
   were taken, the locks are still released. But this time, they are
   released by the tracefs system which has a bit more control.
   As the mkdir/rmdir methods for the tracing facility only need the
   new name of the instance, the tracefs mkdir/rmdir copies the name
   from the dentry, releases the locks, and passes in the copy to
   the tracing methods.
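
The "copy the name, release the locks, call with the copy" idea from the
second item can be illustrated outside the kernel. The following is a
userspace analogue only, not the code from patch 5: struct demo_entry
stands in for a dentry, a pthread mutex stands in for the inode mutex,
and every name in it is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dentry: a name guarded by a lock. */
struct demo_entry {
	pthread_mutex_t lock;
	char name[64];
};

/* Callback type standing in for the tracing mkdir/rmdir methods. */
typedef int (*demo_instance_fn)(const char *name);

/* Sample callback used for illustration: just reports the name length. */
int demo_name_len(const char *name)
{
	return (int)strlen(name);
}

/*
 * Copy the name while the lock is held, drop the lock, then hand the
 * private copy to the callback, which is free to take its own locks.
 */
int demo_call_unlocked(struct demo_entry *e, demo_instance_fn fn)
{
	char *copy;
	int ret;

	pthread_mutex_lock(&e->lock);
	copy = malloc(strlen(e->name) + 1);
	if (copy)
		strcpy(copy, e->name);
	pthread_mutex_unlock(&e->lock);

	if (!copy)
		return -1;

	ret = fn(copy);		/* runs without e->lock held */
	free(copy);
	return ret;
}
```

Because the callback only ever sees the private copy, it does not matter
what happens to the original name once the lock has been dropped, which
is the property the item above relies on.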

Steven Rostedt (Red Hat) (5):
      tracefs: Add new tracefs file system
      tracing: Convert the tracing facility over to use tracefs
      tracing: Automatically mount tracefs on debugfs/tracing
      tracefs: Add directory /sys/kernel/tracing
      tracing: Have mkdir and rmdir be part of tracefs

----
 fs/Makefile                          |   1 +
 fs/tracefs/Makefile                  |   4 +
 fs/tracefs/inode.c                   | 658 +++++++++++++++++++++++++++++++++++
 include/linux/tracefs.h              |  48 +++
 include/uapi/linux/magic.h           |   2 +
 kernel/trace/ftrace.c                |  22 +-
 kernel/trace/trace.c                 | 176 +++++-----
 kernel/trace/trace.h                 |   2 +-
 kernel/trace/trace_events.c          |  32 +-
 kernel/trace/trace_functions_graph.c |   7 +-
 kernel/trace/trace_kprobe.c          |  10 +-
 kernel/trace/trace_probe.h           |   2 +-
 kernel/trace/trace_stat.c            |  10 +-
 13 files changed, 844 insertions(+), 130 deletions(-)


* [PATCH 1/5 v2] tracefs: Add new tracefs file system
  2015-01-23 15:55 [PATCH 0/5 v2] tracing: Add new file system tracefs Steven Rostedt
@ 2015-01-23 15:55 ` Steven Rostedt
  2015-01-23 15:55 ` [PATCH 2/5 v2] tracing: Convert the tracing facility over to use tracefs Steven Rostedt
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2015-01-23 15:55 UTC
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0001-tracefs-Add-new-tracefs-file-system.patch --]
[-- Type: text/plain, Size: 18320 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Add a separate file system to handle the tracing directory. Currently it
is part of debugfs, but that is starting to show its limits.

One thing is that in order to access the tracing infrastructure, you need
to mount debugfs. As that includes debugging from all sorts of sub systems
in the kernel, it is not considered advisable to mount such an all
encompassing debugging system.

Having the tracing system in its own file system gives access to the
tracing sub system without needing to include all other systems.

Another problem with tracing using the debugfs system is that the
instances use mkdir to create sub buffers. debugfs does not support mkdir
from userspace, so special hacks were used to implement it. By controlling
the file system that the tracing infrastructure uses, this can be properly
done without hacks.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/Makefile                |   1 +
 fs/tracefs/Makefile        |   4 +
 fs/tracefs/inode.c         | 560 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/tracefs.h    |  41 ++++
 include/uapi/linux/magic.h |   2 +
 5 files changed, 608 insertions(+)
 create mode 100644 fs/tracefs/Makefile
 create mode 100644 fs/tracefs/inode.c
 create mode 100644 include/linux/tracefs.h

diff --git a/fs/Makefile b/fs/Makefile
index bedff48e8fdc..d244b8d973ac 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -118,6 +118,7 @@ obj-$(CONFIG_HOSTFS)		+= hostfs/
 obj-$(CONFIG_HPPFS)		+= hppfs/
 obj-$(CONFIG_CACHEFILES)	+= cachefiles/
 obj-$(CONFIG_DEBUG_FS)		+= debugfs/
+obj-$(CONFIG_TRACING)		+= tracefs/
 obj-$(CONFIG_OCFS2_FS)		+= ocfs2/
 obj-$(CONFIG_BTRFS_FS)		+= btrfs/
 obj-$(CONFIG_GFS2_FS)           += gfs2/
diff --git a/fs/tracefs/Makefile b/fs/tracefs/Makefile
new file mode 100644
index 000000000000..82fa35b656c4
--- /dev/null
+++ b/fs/tracefs/Makefile
@@ -0,0 +1,4 @@
+tracefs-objs	:= inode.o
+
+obj-$(CONFIG_TRACING)	+= tracefs.o
+
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
new file mode 100644
index 000000000000..c31997a303c7
--- /dev/null
+++ b/fs/tracefs/inode.c
@@ -0,0 +1,560 @@
+/*
+ *  inode.c - part of tracefs, a pseudo file system for activating tracing
+ *
+ * Based on debugfs by: Greg Kroah-Hartman <greg@kroah.com>
+ *
+ *  Copyright (C) 2014 Red Hat Inc, author: Steven Rostedt <srostedt@redhat.com>
+ *
+ *	This program is free software; you can redistribute it and/or
+ *	modify it under the terms of the GNU General Public License version
+ *	2 as published by the Free Software Foundation.
+ *
+ * tracefs is the file system that is used by the tracing infrastructure.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/mount.h>
+#include <linux/namei.h>
+#include <linux/tracefs.h>
+#include <linux/fsnotify.h>
+#include <linux/seq_file.h>
+#include <linux/parser.h>
+#include <linux/magic.h>
+#include <linux/slab.h>
+
+#define TRACEFS_DEFAULT_MODE	0700
+
+static struct vfsmount *tracefs_mount;
+static int tracefs_mount_count;
+static bool tracefs_registered;
+
+static ssize_t default_read_file(struct file *file, char __user *buf,
+				 size_t count, loff_t *ppos)
+{
+	return 0;
+}
+
+static ssize_t default_write_file(struct file *file, const char __user *buf,
+				   size_t count, loff_t *ppos)
+{
+	return count;
+}
+
+static const struct file_operations tracefs_file_operations = {
+	.read =		default_read_file,
+	.write =	default_write_file,
+	.open =		simple_open,
+	.llseek =	noop_llseek,
+};
+
+static struct inode *tracefs_get_inode(struct super_block *sb, umode_t mode, dev_t dev,
+				      void *data, const struct file_operations *fops)
+
+{
+	struct inode *inode = new_inode(sb);
+
+	if (inode) {
+		inode->i_ino = get_next_ino();
+		inode->i_mode = mode;
+		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+		switch (mode & S_IFMT) {
+		default:
+			init_special_inode(inode, mode, dev);
+			break;
+		case S_IFREG:
+			inode->i_fop = fops ? fops : &tracefs_file_operations;
+			inode->i_private = data;
+			break;
+		case S_IFDIR:
+			inode->i_op = &simple_dir_inode_operations;
+			inode->i_fop = &simple_dir_operations;
+
+			/* directory inodes start off with i_nlink == 2
+			 * (for "." entry) */
+			inc_nlink(inode);
+			break;
+		}
+	}
+	return inode;
+}
+
+static int tracefs_mknod(struct inode *dir, struct dentry *dentry,
+			 umode_t mode, dev_t dev, void *data,
+			 const struct file_operations *fops)
+{
+	struct inode *inode;
+	int error = -EPERM;
+
+	if (dentry->d_inode)
+		return -EEXIST;
+
+	inode = tracefs_get_inode(dir->i_sb, mode, dev, data, fops);
+	if (inode) {
+		d_instantiate(dentry, inode);
+		dget(dentry);
+		error = 0;
+	}
+	return error;
+}
+
+static int tracefs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
+{
+	int res;
+
+	mode = (mode & (S_IRWXUGO | S_ISVTX)) | S_IFDIR;
+	res = tracefs_mknod(dir, dentry, mode, 0, NULL, NULL);
+	if (!res) {
+		inc_nlink(dir);
+		fsnotify_mkdir(dir, dentry);
+	}
+	return res;
+}
+
+static int tracefs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
+			  void *data, const struct file_operations *fops)
+{
+	int res;
+
+	mode = (mode & S_IALLUGO) | S_IFREG;
+	res = tracefs_mknod(dir, dentry, mode, 0, data, fops);
+	if (!res)
+		fsnotify_create(dir, dentry);
+	return res;
+}
+
+struct tracefs_mount_opts {
+	kuid_t uid;
+	kgid_t gid;
+	umode_t mode;
+};
+
+enum {
+	Opt_uid,
+	Opt_gid,
+	Opt_mode,
+	Opt_err
+};
+
+static const match_table_t tokens = {
+	{Opt_uid, "uid=%u"},
+	{Opt_gid, "gid=%u"},
+	{Opt_mode, "mode=%o"},
+	{Opt_err, NULL}
+};
+
+struct tracefs_fs_info {
+	struct tracefs_mount_opts mount_opts;
+};
+
+static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
+{
+	substring_t args[MAX_OPT_ARGS];
+	int option;
+	int token;
+	kuid_t uid;
+	kgid_t gid;
+	char *p;
+
+	opts->mode = TRACEFS_DEFAULT_MODE;
+
+	while ((p = strsep(&data, ",")) != NULL) {
+		if (!*p)
+			continue;
+
+		token = match_token(p, tokens, args);
+		switch (token) {
+		case Opt_uid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			uid = make_kuid(current_user_ns(), option);
+			if (!uid_valid(uid))
+				return -EINVAL;
+			opts->uid = uid;
+			break;
+		case Opt_gid:
+			if (match_int(&args[0], &option))
+				return -EINVAL;
+			gid = make_kgid(current_user_ns(), option);
+			if (!gid_valid(gid))
+				return -EINVAL;
+			opts->gid = gid;
+			break;
+		case Opt_mode:
+			if (match_octal(&args[0], &option))
+				return -EINVAL;
+			opts->mode = option & S_IALLUGO;
+			break;
+		/*
+		 * We might like to report bad mount options here;
+		 * but traditionally tracefs has ignored all mount options
+		 */
+		}
+	}
+
+	return 0;
+}
+
+static int tracefs_apply_options(struct super_block *sb)
+{
+	struct tracefs_fs_info *fsi = sb->s_fs_info;
+	struct inode *inode = sb->s_root->d_inode;
+	struct tracefs_mount_opts *opts = &fsi->mount_opts;
+
+	inode->i_mode &= ~S_IALLUGO;
+	inode->i_mode |= opts->mode;
+
+	inode->i_uid = opts->uid;
+	inode->i_gid = opts->gid;
+
+	return 0;
+}
+
+static int tracefs_remount(struct super_block *sb, int *flags, char *data)
+{
+	int err;
+	struct tracefs_fs_info *fsi = sb->s_fs_info;
+
+	sync_filesystem(sb);
+	err = tracefs_parse_options(data, &fsi->mount_opts);
+	if (err)
+		goto fail;
+
+	tracefs_apply_options(sb);
+
+fail:
+	return err;
+}
+
+static int tracefs_show_options(struct seq_file *m, struct dentry *root)
+{
+	struct tracefs_fs_info *fsi = root->d_sb->s_fs_info;
+	struct tracefs_mount_opts *opts = &fsi->mount_opts;
+
+	if (!uid_eq(opts->uid, GLOBAL_ROOT_UID))
+		seq_printf(m, ",uid=%u",
+			   from_kuid_munged(&init_user_ns, opts->uid));
+	if (!gid_eq(opts->gid, GLOBAL_ROOT_GID))
+		seq_printf(m, ",gid=%u",
+			   from_kgid_munged(&init_user_ns, opts->gid));
+	if (opts->mode != TRACEFS_DEFAULT_MODE)
+		seq_printf(m, ",mode=%o", opts->mode);
+
+	return 0;
+}
+
+static const struct super_operations tracefs_super_operations = {
+	.statfs		= simple_statfs,
+	.remount_fs	= tracefs_remount,
+	.show_options	= tracefs_show_options,
+};
+
+static int trace_fill_super(struct super_block *sb, void *data, int silent)
+{
+	static struct tree_descr trace_files[] = {{""}};
+	struct tracefs_fs_info *fsi;
+	int err;
+
+	save_mount_options(sb, data);
+
+	fsi = kzalloc(sizeof(struct tracefs_fs_info), GFP_KERNEL);
+	sb->s_fs_info = fsi;
+	if (!fsi) {
+		err = -ENOMEM;
+		goto fail;
+	}
+
+	err = tracefs_parse_options(data, &fsi->mount_opts);
+	if (err)
+		goto fail;
+
+	err = simple_fill_super(sb, TRACEFS_MAGIC, trace_files);
+	if (err)
+		goto fail;
+
+	sb->s_op = &tracefs_super_operations;
+
+	tracefs_apply_options(sb);
+
+	return 0;
+
+fail:
+	kfree(fsi);
+	sb->s_fs_info = NULL;
+	return err;
+}
+
+static struct dentry *trace_mount(struct file_system_type *fs_type,
+			int flags, const char *dev_name,
+			void *data)
+{
+	return mount_single(fs_type, flags, data, trace_fill_super);
+}
+
+static struct file_system_type trace_fs_type = {
+	.owner =	THIS_MODULE,
+	.name =		"tracefs",
+	.mount =	trace_mount,
+	.kill_sb =	kill_litter_super,
+};
+MODULE_ALIAS_FS("tracefs");
+
+static struct dentry *__create_file(const char *name, umode_t mode,
+				    struct dentry *parent, void *data,
+				    const struct file_operations *fops)
+{
+	struct dentry *dentry = NULL;
+	int error;
+
+	pr_debug("tracefs: creating file '%s'\n", name);
+
+	error = simple_pin_fs(&trace_fs_type, &tracefs_mount,
+			      &tracefs_mount_count);
+	if (error)
+		goto exit;
+
+	/* If the parent is not specified, we create it in the root.
+	 * We need the root dentry to do this, which is in the super
+	 * block. A pointer to that is in the struct vfsmount that we
+	 * have around.
+	 */
+	if (!parent)
+		parent = tracefs_mount->mnt_root;
+
+	mutex_lock(&parent->d_inode->i_mutex);
+	dentry = lookup_one_len(name, parent, strlen(name));
+	if (!IS_ERR(dentry)) {
+		switch (mode & S_IFMT) {
+		case S_IFDIR:
+			error = tracefs_mkdir(parent->d_inode, dentry, mode);
+
+			break;
+		default:
+			error = tracefs_create(parent->d_inode, dentry, mode,
+					       data, fops);
+			break;
+		}
+		dput(dentry);
+	} else
+		error = PTR_ERR(dentry);
+	mutex_unlock(&parent->d_inode->i_mutex);
+
+	if (error) {
+		dentry = NULL;
+		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+	}
+exit:
+	return dentry;
+}
+
+/**
+ * tracefs_create_file - create a file in the tracefs filesystem
+ * @name: a pointer to a string containing the name of the file to create.
+ * @mode: the permission that the file should have.
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is NULL, then the
+ *          file will be created in the root of the tracefs filesystem.
+ * @data: a pointer to something that the caller will want to get to later
+ *        on.  The inode.i_private pointer will point to this value on
+ *        the open() call.
+ * @fops: a pointer to a struct file_operations that should be used for
+ *        this file.
+ *
+ * This is the basic "create a file" function for tracefs.  It allows for a
+ * wide range of flexibility in creating a file, or a directory (if you want
+ * to create a directory, the tracefs_create_dir() function is
+ * recommended to be used instead.)
+ *
+ * This function will return a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the tracefs_remove() function when the file is
+ * to be removed (no automatic cleanup happens if your module is unloaded,
+ * you are responsible here.)  If an error occurs, %NULL will be returned.
+ *
+ * If tracefs is not enabled in the kernel, the value -%ENODEV will be
+ * returned.
+ */
+struct dentry *tracefs_create_file(const char *name, umode_t mode,
+				   struct dentry *parent, void *data,
+				   const struct file_operations *fops)
+{
+	switch (mode & S_IFMT) {
+	case S_IFREG:
+	case 0:
+		break;
+	default:
+		BUG();
+	}
+
+	return __create_file(name, mode, parent, data, fops);
+}
+
+/**
+ * tracefs_create_dir - create a directory in the tracefs filesystem
+ * @name: a pointer to a string containing the name of the directory to
+ *        create.
+ * @parent: a pointer to the parent dentry for this file.  This should be a
+ *          directory dentry if set.  If this parameter is NULL, then the
+ *          directory will be created in the root of the tracefs filesystem.
+ *
+ * This function creates a directory in tracefs with the given name.
+ *
+ * This function will return a pointer to a dentry if it succeeds.  This
+ * pointer must be passed to the tracefs_remove() function when the file is
+ * to be removed. If an error occurs, %NULL will be returned.
+ *
+ * If tracing is not enabled in the kernel, the value -%ENODEV will be
+ * returned.
+ */
+struct dentry *tracefs_create_dir(const char *name, struct dentry *parent)
+{
+	return __create_file(name, S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO,
+				   parent, NULL, NULL);
+}
+
+static inline int tracefs_positive(struct dentry *dentry)
+{
+	return dentry->d_inode && !d_unhashed(dentry);
+}
+
+static int __tracefs_remove(struct dentry *dentry, struct dentry *parent)
+{
+	int ret = 0;
+
+	if (tracefs_positive(dentry)) {
+		if (dentry->d_inode) {
+			dget(dentry);
+			switch (dentry->d_inode->i_mode & S_IFMT) {
+			case S_IFDIR:
+				ret = simple_rmdir(parent->d_inode, dentry);
+				break;
+			default:
+				simple_unlink(parent->d_inode, dentry);
+				break;
+			}
+			if (!ret)
+				d_delete(dentry);
+			dput(dentry);
+		}
+	}
+	return ret;
+}
+
+/**
+ * tracefs_remove - removes a file or directory from the tracefs filesystem
+ * @dentry: a pointer to the dentry of the file or directory to be
+ *          removed.
+ *
+ * This function removes a file or directory in tracefs that was previously
+ * created with a call to another tracefs function (like
+ * tracefs_create_file() or variants thereof.)
+ */
+void tracefs_remove(struct dentry *dentry)
+{
+	struct dentry *parent;
+	int ret;
+
+	if (IS_ERR_OR_NULL(dentry))
+		return;
+
+	parent = dentry->d_parent;
+	if (!parent || !parent->d_inode)
+		return;
+
+	mutex_lock(&parent->d_inode->i_mutex);
+	ret = __tracefs_remove(dentry, parent);
+	mutex_unlock(&parent->d_inode->i_mutex);
+	if (!ret)
+		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+}
+
+/**
+ * tracefs_remove_recursive - recursively removes a directory
+ * @dentry: a pointer to the dentry of the directory to be removed.
+ *
+ * This function recursively removes a directory tree in tracefs that
+ * was previously created with a call to another tracefs function
+ * (like tracefs_create_file() or variants thereof.)
+ */
+void tracefs_remove_recursive(struct dentry *dentry)
+{
+	struct dentry *child, *parent;
+
+	if (IS_ERR_OR_NULL(dentry))
+		return;
+
+	parent = dentry->d_parent;
+	if (!parent || !parent->d_inode)
+		return;
+
+	parent = dentry;
+ down:
+	mutex_lock(&parent->d_inode->i_mutex);
+ loop:
+	/*
+	 * The parent->d_subdirs is protected by the d_lock. Outside that
+	 * lock, the child can be unlinked and set to be freed which can
+	 * use the d_u.d_child as the rcu head and corrupt this list.
+	 */
+	spin_lock(&parent->d_lock);
+	list_for_each_entry(child, &parent->d_subdirs, d_child) {
+		if (!tracefs_positive(child))
+			continue;
+
+		/* perhaps simple_empty(child) makes more sense */
+		if (!list_empty(&child->d_subdirs)) {
+			spin_unlock(&parent->d_lock);
+			mutex_unlock(&parent->d_inode->i_mutex);
+			parent = child;
+			goto down;
+		}
+
+		spin_unlock(&parent->d_lock);
+
+		if (!__tracefs_remove(child, parent))
+			simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+
+		/*
+		 * The parent->d_lock protects against child from unlinking
+		 * from d_subdirs. When releasing the parent->d_lock we can
+		 * no longer trust that the next pointer is valid.
+		 * Restart the loop. We'll skip this one with the
+		 * tracefs_positive() check.
+		 */
+		goto loop;
+	}
+	spin_unlock(&parent->d_lock);
+
+	mutex_unlock(&parent->d_inode->i_mutex);
+	child = parent;
+	parent = parent->d_parent;
+	mutex_lock(&parent->d_inode->i_mutex);
+
+	if (child != dentry)
+		/* go up */
+		goto loop;
+
+	if (!__tracefs_remove(child, parent))
+		simple_release_fs(&tracefs_mount, &tracefs_mount_count);
+	mutex_unlock(&parent->d_inode->i_mutex);
+}
+
+/**
+ * tracefs_initialized - Tells whether tracefs has been registered
+ */
+bool tracefs_initialized(void)
+{
+	return tracefs_registered;
+}
+
+static int __init tracefs_init(void)
+{
+	int retval;
+
+	retval = register_filesystem(&trace_fs_type);
+	if (!retval)
+		tracefs_registered = true;
+
+	return retval;
+}
+core_initcall(tracefs_init);
diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
new file mode 100644
index 000000000000..23e04ce21749
--- /dev/null
+++ b/include/linux/tracefs.h
@@ -0,0 +1,41 @@
+/*
+ *  tracefs.h - a pseudo file system for activating tracing
+ *
+ * Based on debugfs by: 2004 Greg Kroah-Hartman <greg@kroah.com>
+ *
+ *  Copyright (C) 2014 Red Hat Inc, author: Steven Rostedt <srostedt@redhat.com>
+ *
+ *	This program is free software; you can redistribute it and/or
+ *	modify it under the terms of the GNU General Public License version
+ *	2 as published by the Free Software Foundation.
+ *
+ * tracefs is the file system that is used by the tracing infrastructure.
+ *
+ */
+
+#ifndef _TRACEFS_H_
+#define _TRACEFS_H_
+
+#include <linux/fs.h>
+#include <linux/seq_file.h>
+
+#include <linux/types.h>
+
+struct file_operations;
+
+#ifdef CONFIG_TRACING
+
+struct dentry *tracefs_create_file(const char *name, umode_t mode,
+				   struct dentry *parent, void *data,
+				   const struct file_operations *fops);
+
+struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
+
+void tracefs_remove(struct dentry *dentry);
+void tracefs_remove_recursive(struct dentry *dentry);
+
+bool tracefs_initialized(void);
+
+#endif /* CONFIG_TRACING */
+
+#endif
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 7d664ea85ebd..7b1425a6b370 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -58,6 +58,8 @@
 
 #define STACK_END_MAGIC		0x57AC6E9D
 
+#define TRACEFS_MAGIC          0x74726163
+
 #define V9FS_MAGIC		0x01021997
 
 #define BDEVFS_MAGIC            0x62646576
-- 
2.1.4
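
The option handling in tracefs_parse_options() above can be tried out in
userspace with a small approximation. The sketch below is not the kernel
code: it splits on commas with strchr() instead of strsep(), uses
strtoul() instead of match_token()/match_int()/match_octal(), and rejects
malformed numbers with -EINVAL, which may differ from how the kernel
helpers treat them. All demo_* names are hypothetical. What it keeps from
the patch is the overall behaviour: the mode starts at the 0700 default,
empty and unrecognised tokens are skipped, and the mode is masked to the
permission bits.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_DEFAULT_MODE	0700	/* mirrors TRACEFS_DEFAULT_MODE */
#define DEMO_ALL_MODE_BITS	07777	/* same bits as the kernel's S_IALLUGO */

struct demo_mount_opts {
	unsigned int uid;
	unsigned int gid;
	unsigned int mode;
};

/* Parse a whole string as a number in the given base; 0 on success. */
static int demo_parse_num(const char *s, int base, unsigned int *out)
{
	char *end;
	unsigned long v;

	if (!*s)
		return -EINVAL;
	v = strtoul(s, &end, base);
	if (*end != '\0')
		return -EINVAL;
	*out = (unsigned int)v;
	return 0;
}

/* uid and gid are left untouched unless given, as in the patch. */
int demo_parse_options(char *data, struct demo_mount_opts *opts)
{
	char *p = data;

	opts->mode = DEMO_DEFAULT_MODE;

	while (p) {
		char *next = strchr(p, ',');
		unsigned int v;

		if (next)
			*next++ = '\0';	/* split in place, as strsep() would */

		if (!strncmp(p, "uid=", 4)) {
			if (demo_parse_num(p + 4, 10, &v))
				return -EINVAL;
			opts->uid = v;
		} else if (!strncmp(p, "gid=", 4)) {
			if (demo_parse_num(p + 4, 10, &v))
				return -EINVAL;
			opts->gid = v;
		} else if (!strncmp(p, "mode=", 5)) {
			if (demo_parse_num(p + 5, 8, &v))
				return -EINVAL;
			opts->mode = v & DEMO_ALL_MODE_BITS;
		}
		/* empty and unrecognised tokens fall through and are ignored */
		p = next;
	}
	return 0;
}
```

For instance, the string "uid=1000,,mode=750,bogus" yields uid 1000 and
mode 0750 and leaves gid alone, while a NULL string leaves only the
default mode in place. Like the kernel function, the sketch modifies the
string it is given.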




* [PATCH 2/5 v2] tracing: Convert the tracing facility over to use tracefs
  2015-01-23 15:55 [PATCH 0/5 v2] tracing: Add new file system tracefs Steven Rostedt
  2015-01-23 15:55 ` [PATCH 1/5 v2] tracefs: Add new tracefs file system Steven Rostedt
@ 2015-01-23 15:55 ` Steven Rostedt
  2015-01-23 15:55 ` [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing Steven Rostedt
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2015-01-23 15:55 UTC
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0002-tracing-Convert-the-tracing-facility-over-to-use-tra.patch --]
[-- Type: text/plain, Size: 20220 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

debugfs was fine for the tracing facility as a quick way to get
an interface. Now that tracing has matured, it should separate itself
from debugfs so that it can be mounted separately without needing
to mount all of debugfs with it. That is, users resist using tracing
because it requires mounting debugfs. Giving tracing its own file
system lets users get the features of tracing without needing to bring
in the rest of the kernel's debug infrastructure.

Another reason for tracefs is that debugfs does not support mkdir.
Currently, to create instances, one does a mkdir in the tracing/instances
directory. This is implemented via a hack that forces debugfs to do
something it is not intended to do. By converting over to tracefs, this
hack can be removed and mkdir can be properly implemented. This patch does
not address this yet, but it lays the groundwork for that to be done.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c                | 22 ++++++-------
 kernel/trace/trace.c                 | 61 ++++++++++++++++++++++--------------
 kernel/trace/trace.h                 |  2 +-
 kernel/trace/trace_events.c          | 32 +++++++++----------
 kernel/trace/trace_functions_graph.c |  7 ++---
 kernel/trace/trace_kprobe.c          | 10 +++---
 kernel/trace/trace_probe.h           |  2 +-
 kernel/trace/trace_stat.c            | 10 +++---
 8 files changed, 79 insertions(+), 67 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 80c9d34540dd..e3596de88fc1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -18,7 +18,7 @@
 #include <linux/kallsyms.h>
 #include <linux/seq_file.h>
 #include <linux/suspend.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/hardirq.h>
 #include <linux/kthread.h>
 #include <linux/uaccess.h>
@@ -1008,7 +1008,7 @@ static struct tracer_stat function_stats __initdata = {
 	.stat_show	= function_stat_show
 };
 
-static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
+static __init void ftrace_profile_tracefs(struct dentry *d_tracer)
 {
 	struct ftrace_profile_stat *stat;
 	struct dentry *entry;
@@ -1044,15 +1044,15 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
 		}
 	}
 
-	entry = debugfs_create_file("function_profile_enabled", 0644,
+	entry = tracefs_create_file("function_profile_enabled", 0644,
 				    d_tracer, NULL, &ftrace_profile_fops);
 	if (!entry)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'function_profile_enabled' entry\n");
 }
 
 #else /* CONFIG_FUNCTION_PROFILER */
-static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
+static __init void ftrace_profile_tracefs(struct dentry *d_tracer)
 {
 }
 #endif /* CONFIG_FUNCTION_PROFILER */
@@ -4653,7 +4653,7 @@ void ftrace_destroy_filter_files(struct ftrace_ops *ops)
 	mutex_unlock(&ftrace_lock);
 }
 
-static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer)
+static __init int ftrace_init_dyn_tracefs(struct dentry *d_tracer)
 {
 
 	trace_create_file("available_filter_functions", 0444,
@@ -4961,7 +4961,7 @@ static int __init ftrace_nodyn_init(void)
 }
 core_initcall(ftrace_nodyn_init);
 
-static inline int ftrace_init_dyn_debugfs(struct dentry *d_tracer) { return 0; }
+static inline int ftrace_init_dyn_tracefs(struct dentry *d_tracer) { return 0; }
 static inline void ftrace_startup_enable(int command) { }
 static inline void ftrace_startup_all(int command) { }
 /* Keep as macros so we do not need to define the commands */
@@ -5414,7 +5414,7 @@ static const struct file_operations ftrace_pid_fops = {
 	.release	= ftrace_pid_release,
 };
 
-static __init int ftrace_init_debugfs(void)
+static __init int ftrace_init_tracefs(void)
 {
 	struct dentry *d_tracer;
 
@@ -5422,16 +5422,16 @@ static __init int ftrace_init_debugfs(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	ftrace_init_dyn_debugfs(d_tracer);
+	ftrace_init_dyn_tracefs(d_tracer);
 
 	trace_create_file("set_ftrace_pid", 0644, d_tracer,
 			    NULL, &ftrace_pid_fops);
 
-	ftrace_profile_debugfs(d_tracer);
+	ftrace_profile_tracefs(d_tracer);
 
 	return 0;
 }
-fs_initcall(ftrace_init_debugfs);
+fs_initcall(ftrace_init_tracefs);
 
 /**
  * ftrace_kill - kill ftrace
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index acd27555dc5b..fb577a2a60ea 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -20,6 +20,7 @@
 #include <linux/notifier.h>
 #include <linux/irqflags.h>
 #include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/pagemap.h>
 #include <linux/hardirq.h>
 #include <linux/linkage.h>
@@ -5814,19 +5815,31 @@ static __init int register_snapshot_cmd(void)
 static inline __init int register_snapshot_cmd(void) { return 0; }
 #endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
 
+#define TRACE_TOP_DIR_ENTRY		((struct dentry *)1)
+
 struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
 {
+	/* Top entry does not have a descriptor */
+	if (tr->dir == TRACE_TOP_DIR_ENTRY)
+		return NULL;
+
+	/* All sub buffers do */
 	if (tr->dir)
 		return tr->dir;
 
 	if (!debugfs_initialized())
 		return ERR_PTR(-ENODEV);
 
-	if (tr->flags & TRACE_ARRAY_FL_GLOBAL)
+	if (tr->flags & TRACE_ARRAY_FL_GLOBAL) {
 		tr->dir = debugfs_create_dir("tracing", NULL);
+		tr->dir = TRACE_TOP_DIR_ENTRY;
+		return NULL;
+	}
 
-	if (!tr->dir)
+	if (!tr->dir) {
 		pr_warn_once("Could not create debugfs directory 'tracing'\n");
+		return ERR_PTR(-ENOMEM);
+	}
 
 	return tr->dir;
 }
@@ -5847,10 +5860,10 @@ static struct dentry *tracing_dentry_percpu(struct trace_array *tr, int cpu)
 	if (IS_ERR(d_tracer))
 		return NULL;
 
-	tr->percpu_dir = debugfs_create_dir("per_cpu", d_tracer);
+	tr->percpu_dir = tracefs_create_dir("per_cpu", d_tracer);
 
 	WARN_ONCE(!tr->percpu_dir,
-		  "Could not create debugfs directory 'per_cpu/%d'\n", cpu);
+		  "Could not create tracefs directory 'per_cpu/%d'\n", cpu);
 
 	return tr->percpu_dir;
 }
@@ -5867,7 +5880,7 @@ trace_create_cpu_file(const char *name, umode_t mode, struct dentry *parent,
 }
 
 static void
-tracing_init_debugfs_percpu(struct trace_array *tr, long cpu)
+tracing_init_tracefs_percpu(struct trace_array *tr, long cpu)
 {
 	struct dentry *d_percpu = tracing_dentry_percpu(tr, cpu);
 	struct dentry *d_cpu;
@@ -5877,9 +5890,9 @@ tracing_init_debugfs_percpu(struct trace_array *tr, long cpu)
 		return;
 
 	snprintf(cpu_dir, 30, "cpu%ld", cpu);
-	d_cpu = debugfs_create_dir(cpu_dir, d_percpu);
+	d_cpu = tracefs_create_dir(cpu_dir, d_percpu);
 	if (!d_cpu) {
-		pr_warning("Could not create debugfs '%s' entry\n", cpu_dir);
+		pr_warning("Could not create tracefs '%s' entry\n", cpu_dir);
 		return;
 	}
 
@@ -6031,9 +6044,9 @@ struct dentry *trace_create_file(const char *name,
 {
 	struct dentry *ret;
 
-	ret = debugfs_create_file(name, mode, parent, data, fops);
+	ret = tracefs_create_file(name, mode, parent, data, fops);
 	if (!ret)
-		pr_warning("Could not create debugfs '%s' entry\n", name);
+		pr_warning("Could not create tracefs '%s' entry\n", name);
 
 	return ret;
 }
@@ -6050,9 +6063,9 @@ static struct dentry *trace_options_init_dentry(struct trace_array *tr)
 	if (IS_ERR(d_tracer))
 		return NULL;
 
-	tr->options = debugfs_create_dir("options", d_tracer);
+	tr->options = tracefs_create_dir("options", d_tracer);
 	if (!tr->options) {
-		pr_warning("Could not create debugfs directory 'options'\n");
+		pr_warning("Could not create tracefs directory 'options'\n");
 		return NULL;
 	}
 
@@ -6121,7 +6134,7 @@ destroy_trace_option_files(struct trace_option_dentry *topts)
 		return;
 
 	for (cnt = 0; topts[cnt].opt; cnt++)
-		debugfs_remove(topts[cnt].entry);
+		tracefs_remove(topts[cnt].entry);
 
 	kfree(topts);
 }
@@ -6210,7 +6223,7 @@ static const struct file_operations rb_simple_fops = {
 struct dentry *trace_instance_dir;
 
 static void
-init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer);
+init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer);
 
 static int
 allocate_trace_buffer(struct trace_array *tr, struct trace_buffer *buf, int size)
@@ -6326,17 +6339,17 @@ static int new_instance_create(const char *name)
 	if (allocate_trace_buffers(tr, trace_buf_size) < 0)
 		goto out_free_tr;
 
-	tr->dir = debugfs_create_dir(name, trace_instance_dir);
+	tr->dir = tracefs_create_dir(name, trace_instance_dir);
 	if (!tr->dir)
 		goto out_free_tr;
 
 	ret = event_trace_add_tracer(tr->dir, tr);
 	if (ret) {
-		debugfs_remove_recursive(tr->dir);
+		tracefs_remove_recursive(tr->dir);
 		goto out_free_tr;
 	}
 
-	init_tracer_debugfs(tr, tr->dir);
+	init_tracer_tracefs(tr, tr->dir);
 
 	list_add(&tr->list, &ftrace_trace_arrays);
 
@@ -6409,7 +6422,7 @@ static int instance_mkdir (struct inode *inode, struct dentry *dentry, umode_t m
 		return -ENOENT;
 
 	/*
-	 * The inode mutex is locked, but debugfs_create_dir() will also
+	 * The inode mutex is locked, but tracefs_create_dir() will also
 	 * take the mutex. As the instances directory can not be destroyed
 	 * or changed in any other way, it is safe to unlock it, and
 	 * let the dentry try. If two users try to make the same dir at
@@ -6439,7 +6452,7 @@ static int instance_rmdir(struct inode *inode, struct dentry *dentry)
 	mutex_unlock(&dentry->d_inode->i_mutex);
 
 	/*
-	 * The inode mutex is locked, but debugfs_create_dir() will also
+	 * The inode mutex is locked, but tracefs_create_dir() will also
 	 * take the mutex. As the instances directory can not be destroyed
 	 * or changed in any other way, it is safe to unlock it, and
 	 * let the dentry try. If two users try to make the same dir at
@@ -6464,7 +6477,7 @@ static const struct inode_operations instance_dir_inode_operations = {
 
 static __init void create_trace_instances(struct dentry *d_tracer)
 {
-	trace_instance_dir = debugfs_create_dir("instances", d_tracer);
+	trace_instance_dir = tracefs_create_dir("instances", d_tracer);
 	if (WARN_ON(!trace_instance_dir))
 		return;
 
@@ -6473,7 +6486,7 @@ static __init void create_trace_instances(struct dentry *d_tracer)
 }
 
 static void
-init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer)
+init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer)
 {
 	int cpu;
 
@@ -6527,11 +6540,11 @@ init_tracer_debugfs(struct trace_array *tr, struct dentry *d_tracer)
 #endif
 
 	for_each_tracing_cpu(cpu)
-		tracing_init_debugfs_percpu(tr, cpu);
+		tracing_init_tracefs_percpu(tr, cpu);
 
 }
 
-static __init int tracer_init_debugfs(void)
+static __init int tracer_init_tracefs(void)
 {
 	struct dentry *d_tracer;
 
@@ -6541,7 +6554,7 @@ static __init int tracer_init_debugfs(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	init_tracer_debugfs(&global_trace, d_tracer);
+	init_tracer_tracefs(&global_trace, d_tracer);
 
 	trace_create_file("tracing_thresh", 0644, d_tracer,
 			&global_trace, &tracing_thresh_fops);
@@ -6901,5 +6914,5 @@ __init static int clear_boot_tracer(void)
 	return 0;
 }
 
-fs_initcall(tracer_init_debugfs);
+fs_initcall(tracer_init_tracefs);
 late_initcall(clear_boot_tracer);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 0eddfeb05fee..ba1170cb4880 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -334,7 +334,7 @@ struct tracer_flags {
 
 
 /**
- * struct tracer - a specific tracer and its callbacks to interact with debugfs
+ * struct tracer - a specific tracer and its callbacks to interact with tracefs
  * @name: the name chosen to select it on the available_tracers file
  * @init: called when one switches to this tracer (echo name > current_tracer)
  * @reset: called when one switches to another tracer
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 4ff8c1394017..e3b7782f904f 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -13,7 +13,7 @@
 #include <linux/workqueue.h>
 #include <linux/spinlock.h>
 #include <linux/kthread.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
 #include <linux/ctype.h>
@@ -480,7 +480,7 @@ static void remove_subsystem(struct ftrace_subsystem_dir *dir)
 		return;
 
 	if (!--dir->nr_events) {
-		debugfs_remove_recursive(dir->entry);
+		tracefs_remove_recursive(dir->entry);
 		list_del(&dir->list);
 		__put_system_dir(dir);
 	}
@@ -499,7 +499,7 @@ static void remove_event_file_dir(struct ftrace_event_file *file)
 		}
 		spin_unlock(&dir->d_lock);
 
-		debugfs_remove_recursive(dir);
+		tracefs_remove_recursive(dir);
 	}
 
 	list_del(&file->list);
@@ -1526,7 +1526,7 @@ event_subsystem_dir(struct trace_array *tr, const char *name,
 	} else
 		__get_system(system);
 
-	dir->entry = debugfs_create_dir(name, parent);
+	dir->entry = tracefs_create_dir(name, parent);
 	if (!dir->entry) {
 		pr_warn("Failed to create system directory %s\n", name);
 		__put_system(system);
@@ -1539,12 +1539,12 @@ event_subsystem_dir(struct trace_array *tr, const char *name,
 	dir->subsystem = system;
 	file->system = dir;
 
-	entry = debugfs_create_file("filter", 0644, dir->entry, dir,
+	entry = tracefs_create_file("filter", 0644, dir->entry, dir,
 				    &ftrace_subsystem_filter_fops);
 	if (!entry) {
 		kfree(system->filter);
 		system->filter = NULL;
-		pr_warn("Could not create debugfs '%s/filter' entry\n", name);
+		pr_warn("Could not create tracefs '%s/filter' entry\n", name);
 	}
 
 	trace_create_file("enable", 0644, dir->entry, dir,
@@ -1585,9 +1585,9 @@ event_create_dir(struct dentry *parent, struct ftrace_event_file *file)
 		d_events = parent;
 
 	name = ftrace_event_name(call);
-	file->dir = debugfs_create_dir(name, d_events);
+	file->dir = tracefs_create_dir(name, d_events);
 	if (!file->dir) {
-		pr_warn("Could not create debugfs '%s' directory\n", name);
+		pr_warn("Could not create tracefs '%s' directory\n", name);
 		return -1;
 	}
 
@@ -2228,7 +2228,7 @@ static inline int register_event_cmds(void) { return 0; }
 /*
  * The top level array has already had its ftrace_event_file
  * descriptors created in order to allow for early events to
- * be recorded. This function is called after the debugfs has been
+ * be recorded. This function is called after the tracefs has been
  * initialized, and we now have to create the files associated
  * to the events.
  */
@@ -2311,16 +2311,16 @@ create_event_toplevel_files(struct dentry *parent, struct trace_array *tr)
 	struct dentry *d_events;
 	struct dentry *entry;
 
-	entry = debugfs_create_file("set_event", 0644, parent,
+	entry = tracefs_create_file("set_event", 0644, parent,
 				    tr, &ftrace_set_event_fops);
 	if (!entry) {
-		pr_warn("Could not create debugfs 'set_event' entry\n");
+		pr_warn("Could not create tracefs 'set_event' entry\n");
 		return -ENOMEM;
 	}
 
-	d_events = debugfs_create_dir("events", parent);
+	d_events = tracefs_create_dir("events", parent);
 	if (!d_events) {
-		pr_warn("Could not create debugfs 'events' directory\n");
+		pr_warn("Could not create tracefs 'events' directory\n");
 		return -ENOMEM;
 	}
 
@@ -2412,7 +2412,7 @@ int event_trace_del_tracer(struct trace_array *tr)
 
 	down_write(&trace_event_sem);
 	__trace_remove_event_dirs(tr);
-	debugfs_remove_recursive(tr->event_dir);
+	tracefs_remove_recursive(tr->event_dir);
 	up_write(&trace_event_sem);
 
 	tr->event_dir = NULL;
@@ -2493,10 +2493,10 @@ static __init int event_trace_init(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	entry = debugfs_create_file("available_events", 0444, d_tracer,
+	entry = tracefs_create_file("available_events", 0444, d_tracer,
 				    tr, &ftrace_avail_fops);
 	if (!entry)
-		pr_warn("Could not create debugfs 'available_events' entry\n");
+		pr_warn("Could not create tracefs 'available_events' entry\n");
 
 	if (trace_define_common_fields())
 		pr_warn("tracing: Failed to allocate common fields");
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 2d25ad1526bb..9cfea4c6d314 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -6,7 +6,6 @@
  * is Copyright (c) Steven Rostedt <srostedt@redhat.com>
  *
  */
-#include <linux/debugfs.h>
 #include <linux/uaccess.h>
 #include <linux/ftrace.h>
 #include <linux/slab.h>
@@ -151,7 +150,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth,
 	 * The curr_ret_stack is initialized to -1 and get increased
 	 * in this function.  So it can be less than -1 only if it was
 	 * filtered out via ftrace_graph_notrace_addr() which can be
-	 * set from set_graph_notrace file in debugfs by user.
+	 * set from set_graph_notrace file in tracefs by user.
 	 */
 	if (current->curr_ret_stack < -1)
 		return -EBUSY;
@@ -1432,7 +1431,7 @@ static const struct file_operations graph_depth_fops = {
 	.llseek		= generic_file_llseek,
 };
 
-static __init int init_graph_debugfs(void)
+static __init int init_graph_tracefs(void)
 {
 	struct dentry *d_tracer;
 
@@ -1445,7 +1444,7 @@ static __init int init_graph_debugfs(void)
 
 	return 0;
 }
-fs_initcall(init_graph_debugfs);
+fs_initcall(init_graph_tracefs);
 
 static __init int init_graph_trace(void)
 {
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index b4a00def88f5..c1c6655847c8 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1310,7 +1310,7 @@ static int unregister_kprobe_event(struct trace_kprobe *tk)
 	return ret;
 }
 
-/* Make a debugfs interface for controlling probe points */
+/* Make a tracefs interface for controlling probe points */
 static __init int init_kprobe_trace(void)
 {
 	struct dentry *d_tracer;
@@ -1323,20 +1323,20 @@ static __init int init_kprobe_trace(void)
 	if (IS_ERR(d_tracer))
 		return 0;
 
-	entry = debugfs_create_file("kprobe_events", 0644, d_tracer,
+	entry = tracefs_create_file("kprobe_events", 0644, d_tracer,
 				    NULL, &kprobe_events_ops);
 
 	/* Event list interface */
 	if (!entry)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'kprobe_events' entry\n");
 
 	/* Profile interface */
-	entry = debugfs_create_file("kprobe_profile", 0444, d_tracer,
+	entry = tracefs_create_file("kprobe_profile", 0444, d_tracer,
 				    NULL, &kprobe_profile_ops);
 
 	if (!entry)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'kprobe_profile' entry\n");
 	return 0;
 }
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 4f815fbce16d..19aff635841a 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -25,7 +25,7 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include <linux/types.h>
 #include <linux/string.h>
 #include <linux/ctype.h>
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 75e19e86c954..6cf935316769 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -12,7 +12,7 @@
 #include <linux/list.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
-#include <linux/debugfs.h>
+#include <linux/tracefs.h>
 #include "trace_stat.h"
 #include "trace.h"
 
@@ -65,7 +65,7 @@ static void reset_stat_session(struct stat_session *session)
 
 static void destroy_session(struct stat_session *session)
 {
-	debugfs_remove(session->file);
+	tracefs_remove(session->file);
 	__reset_stat_session(session);
 	mutex_destroy(&session->stat_mutex);
 	kfree(session);
@@ -279,9 +279,9 @@ static int tracing_stat_init(void)
 	if (IS_ERR(d_tracing))
 		return 0;
 
-	stat_dir = debugfs_create_dir("trace_stat", d_tracing);
+	stat_dir = tracefs_create_dir("trace_stat", d_tracing);
 	if (!stat_dir)
-		pr_warning("Could not create debugfs "
+		pr_warning("Could not create tracefs "
 			   "'trace_stat' entry\n");
 	return 0;
 }
@@ -291,7 +291,7 @@ static int init_stat_file(struct stat_session *session)
 	if (!stat_dir && tracing_stat_init())
 		return -ENODEV;
 
-	session->file = debugfs_create_file(session->ts->name, 0644,
+	session->file = tracefs_create_file(session->ts->name, 0644,
 					    stat_dir,
 					    session, &tracing_stat_fops);
 	if (!session->file)
-- 
2.1.4




* [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-23 15:55 [PATCH 0/5 v2] tracing: Add new file system tracefs Steven Rostedt
  2015-01-23 15:55 ` [PATCH 1/5 v2] tracefs: Add new tracefs file system Steven Rostedt
  2015-01-23 15:55 ` [PATCH 2/5 v2] tracing: Convert the tracing facility over to use tracefs Steven Rostedt
@ 2015-01-23 15:55 ` Steven Rostedt
  2015-01-24  3:00   ` Greg Kroah-Hartman
  2015-01-23 15:55 ` [PATCH 4/5 v2] tracefs: Add directory /sys/kernel/tracing Steven Rostedt
  2015-01-23 15:55 ` [PATCH 5/5 v2] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
  4 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2015-01-23 15:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, Al Viro

[-- Attachment #1: 0003-tracing-Automatically-mount-tracefs-on-debugfs-traci.patch --]
[-- Type: text/plain, Size: 3737 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

As tools currently rely on the tracing directory in debugfs, we can not
just create a tracefs infrastructure and expect sysadmins to mount
the new tracefs to have their old tools work.

Instead, the debugfs tracing directory is still created and the tracefs
file system is mounted there when the debugfs filesystem is mounted.

No longer does the tracing infrastructure update the debugfs file system,
but instead interacts with the tracefs file system. But now, it still
appears to the user like nothing changed, except you also have the feature
of mounting just the tracing system without needing all of debugfs!

Note, because debugfs_create_dir() happens to end up setting the
dentry->d_op, we can not use d_set_d_op() but must manually assign the
new op, that has automount set, to the dentry returned. This can be
racy, but since this happens during the initcall sequence on boot up,
there should be nothing that races with it.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index fb577a2a60ea..4fb557917d39 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -32,6 +32,7 @@
 #include <linux/splice.h>
 #include <linux/kdebug.h>
 #include <linux/string.h>
+#include <linux/mount.h>
 #include <linux/rwsem.h>
 #include <linux/slab.h>
 #include <linux/ctype.h>
@@ -5815,10 +5816,35 @@ static __init int register_snapshot_cmd(void)
 static inline __init int register_snapshot_cmd(void) { return 0; }
 #endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
 
+static struct vfsmount *trace_automount(struct path *path)
+{
+	struct vfsmount *mnt;
+	struct file_system_type *type;
+
+	/*
+	 * To maintain backward compatibility for tools that mount
+	 * debugfs to get to the tracing facility, tracefs is automatically
+	 * mounted to the debugfs/tracing directory.
+	 */
+	type = get_fs_type("tracefs");
+	if (!type)
+		return NULL;
+	mnt = vfs_kern_mount(type, 0, "tracefs", NULL);
+	put_filesystem(type);
+	if (IS_ERR(mnt))
+		return NULL;
+	mntget(mnt);
+
+	return mnt;
+}
+
 #define TRACE_TOP_DIR_ENTRY		((struct dentry *)1)
 
 struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
 {
+	static struct dentry_operations trace_ops;
+	struct dentry *traced;
+
 	/* Top entry does not have a descriptor */
 	if (tr->dir == TRACE_TOP_DIR_ENTRY)
 		return NULL;
@@ -5831,7 +5857,30 @@ struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
 		return ERR_PTR(-ENODEV);
 
 	if (tr->flags & TRACE_ARRAY_FL_GLOBAL) {
-		tr->dir = debugfs_create_dir("tracing", NULL);
+		traced = debugfs_create_dir("tracing", NULL);
+		if (!traced)
+			return ERR_PTR(-ENOMEM);
+		/* copy the dentry ops and add an automount to it */
+		if (traced->d_op) {
+			/*
+			 * FIXME:
+			 * Currently debugfs sets the d_op by a side-effect
+			 * of calling simple_lookup(). Normally, we should
+			 * never change d_op of a dentry, but as this is
+			 * happening at boot up and shouldn't be racing with
+			 * any other users, this should be OK. But it is still
+			 * a hack, and needs to be properly done.
+			 */
+			trace_ops = *traced->d_op;
+			trace_ops.d_automount = trace_automount;
+			traced->d_flags |= DCACHE_NEED_AUTOMOUNT;
+			traced->d_op = &trace_ops;
+		} else {
+			/* Ideally, this is what should happen */
+			trace_ops = simple_dentry_operations;
+			trace_ops.d_automount = trace_automount;
+			d_set_d_op(traced, &trace_ops);
+		}
 		tr->dir = TRACE_TOP_DIR_ENTRY;
 		return NULL;
 	}
-- 
2.1.4




* [PATCH 4/5 v2] tracefs: Add directory /sys/kernel/tracing
  2015-01-23 15:55 [PATCH 0/5 v2] tracing: Add new file system tracefs Steven Rostedt
                   ` (2 preceding siblings ...)
  2015-01-23 15:55 ` [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing Steven Rostedt
@ 2015-01-23 15:55 ` Steven Rostedt
  2015-01-24  3:00   ` Greg Kroah-Hartman
  2015-01-23 15:55 ` [PATCH 5/5 v2] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
  4 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2015-01-23 15:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton

[-- Attachment #1: 0004-tracefs-Add-directory-sys-kernel-tracing.patch --]
[-- Type: text/plain, Size: 1126 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

When tracefs is configured, have the directory /sys/kernel/tracing appear
just like /sys/kernel/debug appears when debugfs is configured.

This will give a consistent place for system admins to mount tracefs.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/tracefs/inode.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index c31997a303c7..cdbaa42b44a1 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -16,6 +16,7 @@
 #include <linux/module.h>
 #include <linux/fs.h>
 #include <linux/mount.h>
+#include <linux/kobject.h>
 #include <linux/namei.h>
 #include <linux/tracefs.h>
 #include <linux/fsnotify.h>
@@ -547,10 +548,16 @@ bool tracefs_initialized(void)
 	return tracefs_registered;
 }
 
+static struct kobject *trace_kobj;
+
 static int __init tracefs_init(void)
 {
 	int retval;
 
+	trace_kobj = kobject_create_and_add("tracing", kernel_kobj);
+	if (!trace_kobj)
+		return -EINVAL;
+
 	retval = register_filesystem(&trace_fs_type);
 	if (!retval)
 		tracefs_registered = true;
-- 
2.1.4




* [PATCH 5/5 v2] tracing: Have mkdir and rmdir be part of tracefs
  2015-01-23 15:55 [PATCH 0/5 v2] tracing: Add new file system tracefs Steven Rostedt
                   ` (3 preceding siblings ...)
  2015-01-23 15:55 ` [PATCH 4/5 v2] tracefs: Add directory /sys/kernel/tracing Steven Rostedt
@ 2015-01-23 15:55 ` Steven Rostedt
  4 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2015-01-23 15:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Al Viro, Greg Kroah-Hartman, Ingo Molnar, Andrew Morton, Al Viro

[-- Attachment #1: 0005-tracing-Have-mkdir-and-rmdir-be-part-of-tracefs.patch --]
[-- Type: text/plain, Size: 8387 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The tracing "instances" directory can create sub tracing buffers
with mkdir, and remove them with rmdir. As a mkdir will also create
all the files and directories that control the sub buffer, the inode
mutexes need to be released before this is done, to avoid deadlocks.
It is better to let the tracing system unlock the inode mutexes before
calling the functions that create the files within the new directory
(or delete the files from the one being destroyed).

Now that tracing has been converted over to tracefs, the tracefs file
system can be modified to accommodate this feature. It still releases
the locks, but the filesystem itself can take care of the ugly
business and let the user just do what it needs.

The tracing system now attaches a descriptor to the directory dentry
that lets userspace create or remove sub directories. If this
descriptor does not exist for a dentry, then that dentry can not be
used to create other directories. This descriptor holds mkdir and
rmdir methods that each take only a character string as an argument.

The tracefs file system will first make a copy of the dentry name
before releasing the locks. Then it will pass the copied name to the
methods. It is up to the tracing system that supplied the methods to
handle races with duplicate names and such, as all the inode mutexes
are released when the functions are called.
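
The copy-then-unlock sequence described above can be sketched in
userspace as follows. This is only an analogy, not the kernel code: a
pthread mutex stands in for the inode mutex, and struct dir, dir_mkdir(),
demo_mkdir() and demo_ops are hypothetical names invented for the
example. Unlike the patch, the sketch also rejects a descriptor whose
mkdir method is unset.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for struct tracefs_dir_ops from the patch. */
struct dir_ops {
	int (*mkdir)(const char *name);
	int (*rmdir)(const char *name);
};

/* Hypothetical stand-in for a directory inode: a lock plus optional ops. */
struct dir {
	pthread_mutex_t lock;
	const struct dir_ops *ops;
};

/* Return a NUL-terminated heap copy of the first len bytes of src. */
static char *copy_name(const char *src, size_t len)
{
	char *name = malloc(len + 1);

	if (!name)
		return NULL;
	memcpy(name, src, len);
	name[len] = '\0';
	return name;
}

/* Called with d->lock held; returns with d->lock held again. */
int dir_mkdir(struct dir *d, const char *name, size_t len)
{
	char *copy;
	int ret;

	assert(d != NULL && name != NULL);

	/* No descriptor attached: this directory can not create others. */
	if (!d->ops || !d->ops->mkdir)
		return -EPERM;

	/* Copy the name while the lock still protects it. */
	copy = copy_name(name, len);
	if (!copy)
		return -ENOMEM;

	/* Drop the lock so the callback may take it; races are its problem. */
	pthread_mutex_unlock(&d->lock);
	ret = d->ops->mkdir(copy);
	pthread_mutex_lock(&d->lock);

	free(copy);
	return ret;
}

/* Demo callback: records the name it was handed. */
char last_created[64];

int demo_mkdir(const char *name)
{
	strncpy(last_created, name, sizeof(last_created) - 1);
	last_created[sizeof(last_created) - 1] = '\0';
	return 0;
}

const struct dir_ops demo_ops = { .mkdir = demo_mkdir };
```

The callback only ever sees the private copy, so it does not matter what
happens to the original name once the lock has been dropped.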

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/tracefs/inode.c      | 93 ++++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/tracefs.h |  7 ++++
 kernel/trace/trace.c    | 68 +++---------------------------------
 3 files changed, 103 insertions(+), 65 deletions(-)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index cdbaa42b44a1..a005b951fd85 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -50,6 +50,87 @@ static const struct file_operations tracefs_file_operations = {
 	.llseek =	noop_llseek,
 };
 
+static char *get_dname(struct dentry *dentry)
+{
+	const char *dname;
+	char *name;
+	int len = dentry->d_name.len;
+
+	dname = dentry->d_name.name;
+	name = kmalloc(len + 1, GFP_KERNEL);
+	if (!name)
+		return NULL;
+	memcpy(name, dname, len);
+	name[len] = 0;
+	return name;
+}
+
+static int tracefs_syscall_mkdir(struct inode *inode, struct dentry *dentry, umode_t mode)
+{
+	const struct tracefs_dir_ops *ops = inode ? inode->i_private : NULL;
+	char *name;
+	int ret;
+
+	if (!ops)
+		return -EPERM;
+
+	name = get_dname(dentry);
+	if (!name)
+		return -ENOMEM;
+
+	/*
+	 * The mkdir call can call the generic functions that create
+	 * the files within the tracefs system. It is up to the individual
+	 * mkdir routine to handle races.
+	 */
+	mutex_unlock(&inode->i_mutex);
+	ret = ops->mkdir(name);
+	mutex_lock(&inode->i_mutex);
+
+	kfree(name);
+
+	return ret;
+}
+
+static int tracefs_syscall_rmdir(struct inode *inode, struct dentry *dentry)
+{
+	const struct tracefs_dir_ops *ops = inode->i_private;
+	char *name;
+	int ret;
+
+	if (!ops)
+		return -EPERM;
+
+	name = get_dname(dentry);
+	if (!name)
+		return -ENOMEM;
+
+	/*
+	 * The rmdir call can call the generic functions that remove
+	 * the files within the tracefs system. It is up to the individual
+	 * rmdir routine to handle races.
+	 * This time we need to unlock not only the parent (inode) but
+	 * also the directory that is being deleted.
+	 */
+	mutex_unlock(&inode->i_mutex);
+	mutex_unlock(&dentry->d_inode->i_mutex);
+
+	ret = ops->rmdir(name);
+
+	mutex_lock_nested(&inode->i_mutex, I_MUTEX_PARENT);
+	mutex_lock(&dentry->d_inode->i_mutex);
+
+	kfree(name);
+
+	return ret;
+}
+
+const struct inode_operations tracefs_dir_inode_operations = {
+	.lookup		= simple_lookup,
+	.mkdir		= tracefs_syscall_mkdir,
+	.rmdir		= tracefs_syscall_rmdir,
+};
+
 static struct inode *tracefs_get_inode(struct super_block *sb, umode_t mode, dev_t dev,
 				      void *data, const struct file_operations *fops)
 
@@ -69,7 +150,7 @@ static struct inode *tracefs_get_inode(struct super_block *sb, umode_t mode, dev
 			inode->i_private = data;
 			break;
 		case S_IFDIR:
-			inode->i_op = &simple_dir_inode_operations;
+			inode->i_op = &tracefs_dir_inode_operations;
 			inode->i_fop = &simple_dir_operations;
 
 			/* directory inodes start off with i_nlink == 2
@@ -125,6 +206,16 @@ static int tracefs_create(struct inode *dir, struct dentry *dentry, umode_t mode
 	return res;
 }
 
+void tracefs_add_dir_ops(struct dentry *dentry, const struct tracefs_dir_ops *ops)
+{
+	struct inode *inode = dentry->d_inode;
+
+	if (!inode)
+		return;
+
+	inode->i_private = (void *)ops;
+}
+
 struct tracefs_mount_opts {
 	kuid_t uid;
 	kgid_t gid;
diff --git a/include/linux/tracefs.h b/include/linux/tracefs.h
index 23e04ce21749..d142b1f9d453 100644
--- a/include/linux/tracefs.h
+++ b/include/linux/tracefs.h
@@ -34,6 +34,13 @@ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent);
 void tracefs_remove(struct dentry *dentry);
 void tracefs_remove_recursive(struct dentry *dentry);
 
+struct tracefs_dir_ops {
+	int (*mkdir)(const char *name);
+	int (*rmdir)(const char *name);
+};
+
+void tracefs_add_dir_ops(struct dentry *dentry, const struct tracefs_dir_ops *ops);
+
 bool tracefs_initialized(void);
 
 #endif /* CONFIG_TRACING */
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4fb557917d39..ce9a331ebc9c 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6349,7 +6349,7 @@ static void free_trace_buffers(struct trace_array *tr)
 #endif
 }
 
-static int new_instance_create(const char *name)
+static int instance_mkdir(const char *name)
 {
 	struct trace_array *tr;
 	int ret;
@@ -6419,7 +6419,7 @@ static int new_instance_create(const char *name)
 
 }
 
-static int instance_delete(const char *name)
+static int instance_rmdir(const char *name)
 {
 	struct trace_array *tr;
 	int found = 0;
@@ -6460,66 +6460,7 @@ static int instance_delete(const char *name)
 	return ret;
 }
 
-static int instance_mkdir (struct inode *inode, struct dentry *dentry, umode_t mode)
-{
-	struct dentry *parent;
-	int ret;
-
-	/* Paranoid: Make sure the parent is the "instances" directory */
-	parent = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
-	if (WARN_ON_ONCE(parent != trace_instance_dir))
-		return -ENOENT;
-
-	/*
-	 * The inode mutex is locked, but tracefs_create_dir() will also
-	 * take the mutex. As the instances directory can not be destroyed
-	 * or changed in any other way, it is safe to unlock it, and
-	 * let the dentry try. If two users try to make the same dir at
-	 * the same time, then the new_instance_create() will determine the
-	 * winner.
-	 */
-	mutex_unlock(&inode->i_mutex);
-
-	ret = new_instance_create(dentry->d_iname);
-
-	mutex_lock(&inode->i_mutex);
-
-	return ret;
-}
-
-static int instance_rmdir(struct inode *inode, struct dentry *dentry)
-{
-	struct dentry *parent;
-	int ret;
-
-	/* Paranoid: Make sure the parent is the "instances" directory */
-	parent = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
-	if (WARN_ON_ONCE(parent != trace_instance_dir))
-		return -ENOENT;
-
-	/* The caller did a dget() on dentry */
-	mutex_unlock(&dentry->d_inode->i_mutex);
-
-	/*
-	 * The inode mutex is locked, but tracefs_create_dir() will also
-	 * take the mutex. As the instances directory can not be destroyed
-	 * or changed in any other way, it is safe to unlock it, and
-	 * let the dentry try. If two users try to make the same dir at
-	 * the same time, then the instance_delete() will determine the
-	 * winner.
-	 */
-	mutex_unlock(&inode->i_mutex);
-
-	ret = instance_delete(dentry->d_iname);
-
-	mutex_lock_nested(&inode->i_mutex, I_MUTEX_PARENT);
-	mutex_lock(&dentry->d_inode->i_mutex);
-
-	return ret;
-}
-
-static const struct inode_operations instance_dir_inode_operations = {
-	.lookup		= simple_lookup,
+static const struct tracefs_dir_ops instance_dir_ops = {
 	.mkdir		= instance_mkdir,
 	.rmdir		= instance_rmdir,
 };
@@ -6530,8 +6471,7 @@ static __init void create_trace_instances(struct dentry *d_tracer)
 	if (WARN_ON(!trace_instance_dir))
 		return;
 
-	/* Hijack the dir inode operations, to allow mkdir */
-	trace_instance_dir->d_inode->i_op = &instance_dir_inode_operations;
+	tracefs_add_dir_ops(trace_instance_dir, &instance_dir_ops);
 }
 
 static void
-- 
2.1.4




* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-23 15:55 ` [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing Steven Rostedt
@ 2015-01-24  3:00   ` Greg Kroah-Hartman
  2015-01-24 11:33     ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Greg Kroah-Hartman @ 2015-01-24  3:00 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Fri, Jan 23, 2015 at 10:55:28AM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> As tools currently rely on the tracing directory in debugfs, we can not
> just create a tracefs infrastructure and expect sysadmins to mount
> the new tracefs to have their old tools work.
> 
> Instead, the debugfs tracing directory is still created and the tracefs
> file system is mounted there when the debugfs filesystem is mounted.
> 
> No longer does the tracing infrastructure update the debugfs file system,
> but instead interacts with the tracefs file system. But now, it still
> appears to the user like nothing changed, except you also have the feature
> of mounting just the tracing system without needing all of debugfs!
> 
> Note, because debugfs_create_dir() happens to end up setting the
> dentry->d_op, we can not use d_set_d_op() but must manually assign the
> new op, that has automount set, to the dentry returned. This can be
> racy, but since this happens during the initcall sequence on boot up,
> there should be nothing that races with it.
> 
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  kernel/trace/trace.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index fb577a2a60ea..4fb557917d39 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -32,6 +32,7 @@
>  #include <linux/splice.h>
>  #include <linux/kdebug.h>
>  #include <linux/string.h>
> +#include <linux/mount.h>
>  #include <linux/rwsem.h>
>  #include <linux/slab.h>
>  #include <linux/ctype.h>
> @@ -5815,10 +5816,35 @@ static __init int register_snapshot_cmd(void)
>  static inline __init int register_snapshot_cmd(void) { return 0; }
>  #endif /* defined(CONFIG_TRACER_SNAPSHOT) && defined(CONFIG_DYNAMIC_FTRACE) */
>  
> +static struct vfsmount *trace_automount(struct path *path)
> +{
> +	struct vfsmount *mnt;
> +	struct file_system_type *type;
> +
> +	/*
> +	 * To maintain backward compatibility for tools that mount
> +	 * debugfs to get to the tracing facility, tracefs is automatically
> +	 * mounted to the debugfs/tracing directory.
> +	 */
> +	type = get_fs_type("tracefs");
> +	if (!type)
> +		return NULL;
> +	mnt = vfs_kern_mount(type, 0, "tracefs", NULL);
> +	put_filesystem(type);
> +	if (IS_ERR(mnt))
> +		return NULL;
> +	mntget(mnt);
> +
> +	return mnt;
> +}
> +
>  #define TRACE_TOP_DIR_ENTRY		((struct dentry *)1)
>  
>  struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
>  {
> +	static struct dentry_operations trace_ops;
> +	struct dentry *traced;
> +
>  	/* Top entry does not have a descriptor */
>  	if (tr->dir == TRACE_TOP_DIR_ENTRY)
>  		return NULL;
> @@ -5831,7 +5857,30 @@ struct dentry *tracing_init_dentry_tr(struct trace_array *tr)
>  		return ERR_PTR(-ENODEV);
>  
>  	if (tr->flags & TRACE_ARRAY_FL_GLOBAL) {
> -		tr->dir = debugfs_create_dir("tracing", NULL);
> +		traced = debugfs_create_dir("tracing", NULL);
> +		if (!traced)
> +			return ERR_PTR(-ENOMEM);
> +		/* copy the dentry ops and add an automount to it */
> +		if (traced->d_op) {
> +			/*
> +			 * FIXME:
> +			 * Currently debugfs sets the d_op by a side-effect
> +			 * of calling simple_lookup(). Normally, we should
> +			 * never change d_op of a dentry, but as this is
> +			 * happening at boot up and shouldn't be racing with
> +			 * any other users, this should be OK. But it is still
> +			 * a hack, and needs to be properly done.
> +			 */
> +			trace_ops = *traced->d_op;
> +			trace_ops.d_automount = trace_automount;
> +			traced->d_flags |= DCACHE_NEED_AUTOMOUNT;
> +			traced->d_op = &trace_ops;
> +		} else {
> +			/* Ideally, this is what should happen */
> +			trace_ops = simple_dentry_operations;
> +			trace_ops.d_automount = trace_automount;
> +			d_set_d_op(traced, &trace_ops);

How will this else block run if debugfs is setting d_op in the
debugfs_create_dir() call?

What really do you want to do here, just automount a filesystem on
debugfs?  If so, can't we just add a new debugfs call to do that?

thanks,

greg k-h


* Re: [PATCH 4/5 v2] tracefs: Add directory /sys/kernel/tracing
  2015-01-23 15:55 ` [PATCH 4/5 v2] tracefs: Add directory /sys/kernel/tracing Steven Rostedt
@ 2015-01-24  3:00   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 14+ messages in thread
From: Greg Kroah-Hartman @ 2015-01-24  3:00 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Fri, Jan 23, 2015 at 10:55:29AM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> When tracefs is configured, have the directory /sys/kernel/tracing appear
> just like /sys/kernel/debug appears when debugfs is configured.
> 
> This will give a consistent place for system admins to mount tracefs.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  fs/tracefs/inode.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
> index c31997a303c7..cdbaa42b44a1 100644
> --- a/fs/tracefs/inode.c
> +++ b/fs/tracefs/inode.c
> @@ -16,6 +16,7 @@
>  #include <linux/module.h>
>  #include <linux/fs.h>
>  #include <linux/mount.h>
> +#include <linux/kobject.h>
>  #include <linux/namei.h>
>  #include <linux/tracefs.h>
>  #include <linux/fsnotify.h>
> @@ -547,10 +548,16 @@ bool tracefs_initialized(void)
>  	return tracefs_registered;
>  }
>  
> +static struct kobject *trace_kobj;
> +
>  static int __init tracefs_init(void)
>  {
>  	int retval;
>  
> +	trace_kobj = kobject_create_and_add("tracing", kernel_kobj);
> +	if (!trace_kobj)
> +		return -EINVAL;
> +
>  	retval = register_filesystem(&trace_fs_type);
>  	if (!retval)
>  		tracefs_registered = true;
> -- 
> 2.1.4
> 

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-24  3:00   ` Greg Kroah-Hartman
@ 2015-01-24 11:33     ` Steven Rostedt
  2015-01-25 13:22       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2015-01-24 11:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Sat, 24 Jan 2015 11:00:41 +0800
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> > +		if (traced->d_op) {
> > +			/*
> > +			 * FIXME:
> > +			 * Currently debugfs sets the d_op by a side-effect
> > +			 * of calling simple_lookup(). Normally, we should
> > +			 * never change d_op of a dentry, but as this is
> > +			 * happening at boot up and shouldn't be racing with
> > +			 * any other users, this should be OK. But it is still
> > +			 * a hack, and needs to be properly done.
> > +			 */
> > +			trace_ops = *traced->d_op;
> > +			trace_ops.d_automount = trace_automount;
> > +			traced->d_flags |= DCACHE_NEED_AUTOMOUNT;
> > +			traced->d_op = &trace_ops;
> > +		} else {
> > +			/* Ideally, this is what should happen */
> > +			trace_ops = simple_dentry_operations;
> > +			trace_ops.d_automount = trace_automount;
> > +			d_set_d_op(traced, &trace_ops);
> 
> How will this else block run if debugfs is setting d_op in the
> debugfs_create_dir() call?

It won't; I put the else block there to show what we would like to do.
It would hopefully work if debugfs ever changed.


> 
> What really do you want to do here, just automount a filesystem on
> debugfs?  If so, can't we just add a new debugfs call to do that?

We could add a call to debugfs to do that. Would you prefer that? From
talking with Al, it sounds to me that changing d_ops on the fly is very
racy. Adding a call in debugfs sounds like it would be open for other
users to do the same and do so while the system is running. Would that
be wise?

Doing the automount here is guaranteed not to happen after system boot
up, or when there might be users of debugfs while the ops changes.
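
As a rough illustration of the copy-and-override step in the quoted hunk,
here is a minimal userspace model in C. Every name in it (model_ops,
model_dentry, model_add_automount, MODEL_NEED_AUTOMOUNT) is a hypothetical
stand-in and not kernel API; it only shows the shape of the operation,
under the assumption that nothing else is reading the pointer at the time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct model_ops {
	int (*revalidate)(void);
	int (*automount)(void);
};

struct model_dentry {
	const struct model_ops *op;	/* shared table, normally never changed */
	unsigned int flags;
};

#define MODEL_NEED_AUTOMOUNT 0x1u

static int model_revalidate(void) { return 0; }
static int model_automount(void) { return 42; }

/* Table shared by every dentry of the model filesystem. */
static const struct model_ops shared_ops = {
	.revalidate = model_revalidate,
	.automount = NULL,
};

/* Private copy that receives the one extra hook. */
static struct model_ops private_ops;

static void model_add_automount(struct model_dentry *d)
{
	assert(d && d->op);

	private_ops = *d->op;			 /* copy the existing table */
	private_ops.automount = model_automount; /* add the new hook */
	d->flags |= MODEL_NEED_AUTOMOUNT;
	d->op = &private_ops;	/* plain store: only safe with no concurrent readers */
}
```

The shared table is left untouched; only the one dentry is repointed.
The final store is the part that depends on nobody else looking at the
dentry, which is why doing it only during boot up matters here.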

-- Steve


* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-24 11:33     ` Steven Rostedt
@ 2015-01-25 13:22       ` Greg Kroah-Hartman
  2015-01-25 19:38         ` Steven Rostedt
  0 siblings, 1 reply; 14+ messages in thread
From: Greg Kroah-Hartman @ 2015-01-25 13:22 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Sat, Jan 24, 2015 at 06:33:30AM -0500, Steven Rostedt wrote:
> On Sat, 24 Jan 2015 11:00:41 +0800
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> > > +		if (traced->d_op) {
> > > +			/*
> > > +			 * FIXME:
> > > +			 * Currently debugfs sets the d_op by a side-effect
> > > +			 * of calling simple_lookup(). Normally, we should
> > > +			 * never change d_op of a dentry, but as this is
> > > +			 * happening at boot up and shouldn't be racing with
> > > +			 * any other users, this should be OK. But it is still
> > > +			 * a hack, and needs to be properly done.
> > > +			 */
> > > +			trace_ops = *traced->d_op;
> > > +			trace_ops.d_automount = trace_automount;
> > > +			traced->d_flags |= DCACHE_NEED_AUTOMOUNT;
> > > +			traced->d_op = &trace_ops;
> > > +		} else {
> > > +			/* Ideally, this is what should happen */
> > > +			trace_ops = simple_dentry_operations;
> > > +			trace_ops.d_automount = trace_automount;
> > > +			d_set_d_op(traced, &trace_ops);
> > 
> > How will this else block run if debugfs is setting d_op in the
> > debugfs_create_dir() call?
> 
> It won't; I put the else block there to show what we would like to do.
> It would hopefully work if debugfs ever changed.
> 
> 
> > 
> > What really do you want to do here, just automount a filesystem on
> > debugfs?  If so, can't we just add a new debugfs call to do that?
> 
> We could add a call to debugfs to do that. Would you prefer that? From
> talking with Al, it sounds to me that changing d_ops on the fly is very
> racy. Adding a call in debugfs sounds like it would be open for other
> users to do the same and do so while the system is running. Would that
> be wise?

If we could do it in a non-racy way, that would be good, otherwise I
don't see us being able to even take this patch :(

thanks,

greg k-h


* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-25 13:22       ` Greg Kroah-Hartman
@ 2015-01-25 19:38         ` Steven Rostedt
  2015-01-25 19:59           ` Al Viro
  0 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2015-01-25 19:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: linux-kernel, Al Viro, Ingo Molnar, Andrew Morton

On Sun, 25 Jan 2015 21:22:07 +0800
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:


> If we could do it in a non-racy way, that would be good, otherwise I
> don't see us being able to even take this patch :(

Is it still racy even if it's only done at boot up? This path only gets
hit the first time it is called. "if (tr->flags & TRACE_ARRAY_FL_GLOBAL)"
is the top level tracing directory ("tracing") and is only called
during boot up (fs_initcall) and never hit again. I could even have
this called directly by that code so we could label it "__init" to make
sure that it is never hit again. Or is this racy even when done by
fs_initcall?

Waiting for Al to comment on this, because I can't add this feature
unless debugfs/tracing still contains the tracing information;
otherwise it will break all the tools that interact with the tracing
infrastructure, and we all know how happy Linus feels about such
changes.

-- Steve


* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-25 19:38         ` Steven Rostedt
@ 2015-01-25 19:59           ` Al Viro
  2015-01-25 20:27             ` Al Viro
  0 siblings, 1 reply; 14+ messages in thread
From: Al Viro @ 2015-01-25 19:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Andrew Morton

On Sun, Jan 25, 2015 at 02:38:30PM -0500, Steven Rostedt wrote:
> On Sun, 25 Jan 2015 21:22:07 +0800
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> 
> > If we could do it in a non-racy way, that would be good, otherwise I
> > don't see us being able to even take this patch :(
> 
> Is it still racy even if it's only done at boot up? This path only gets
> hit the first time it is called. "if (tr->flags & TRACE_ARRAY_FL_GLOBAL)"
> is the top level tracing directory ("tracing") and is only called
> during boot up (fs_initcall) and never hit again. I could even have
> this called directly by that code so we could label it "__init" to make
> sure that it is never hit again. Or is this racy even when done by
> fs_initcall?
> 
> Waiting for Al to comment on this, because I can't add this feature
> unless debugfs/tracing still contains the tracing information;
> otherwise it will break all the tools that interact with the tracing
> infrastructure, and we all know how happy Linus feels about such
> changes.

Actually, I'm almost done massaging that sucker into adding
debugfs_create_automount().  The only remaining question is what arguments
we give it; for now I'm giving it dentry_operations + data (to go into
inode->i_private), but it might be better to give it a pointer just to
d_automount() callback + data for it...


* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-25 19:59           ` Al Viro
@ 2015-01-25 20:27             ` Al Viro
  2015-01-25 20:31               ` Al Viro
  0 siblings, 1 reply; 14+ messages in thread
From: Al Viro @ 2015-01-25 20:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Andrew Morton

On Sun, Jan 25, 2015 at 07:59:32PM +0000, Al Viro wrote:
> On Sun, Jan 25, 2015 at 02:38:30PM -0500, Steven Rostedt wrote:
> > On Sun, 25 Jan 2015 21:22:07 +0800
> > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > 
> > 
> > > If we could do it in a non-racy way, that would be good, otherwise I
> > > don't see us being able to even take this patch :(
> > 
> > Is it still racy even if it's only done at boot up? This path only gets
> > hit the first time it is called. "if (tr->flags & TRACE_ARRAY_FL_GLOBAL)"
> > is the top level tracing directory ("tracing") and is only called
> > during boot up (fs_initcall) and never hit again. I could even have
> > this called directly by that code so we could label it "__init" to make
> > sure that it is never hit again. Or is this racy even when done by
> > fs_initcall?
> > 
> > Waiting for Al to comment on this, because I can't add this feature
> > unless debugfs/tracing still contains the tracing information;
> > otherwise it will break all the tools that interact with the tracing
> > infrastructure, and we all know how happy Linus feels about such
> > changes.
> 
> Actually, I'm almost done massaging that sucker into adding
> debugfs_create_automount().  The only remaining question is what arguments
> we give it; for now I'm giving it dentry_operations + data (to go into
> inode->i_private), but it might be better to give it a pointer just to
> d_automount() callback + data for it...

Turns out that it is better that way (and less prone to abuse).  See
vfs.git#debugfs_automount; some massage on top of 3.19-rc5, the payoff is
in the last commit.

For your code it's a matter of replacing struct path *path with void *unused
in trace_automount() and just calling
	debugfs_create_automount("tracing", NULL, trace_automount, NULL);
to create the sucker.  That's it - no games with ->d_op, etc.
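
A minimal userspace sketch of the "callback + data" shape described above,
assuming only what this thread states: the creator takes a name, an
automount callback and an opaque data pointer, and the callback takes a
void *. The model_* names and types are hypothetical stand-ins, and the
real helper in vfs.git#debugfs_automount may well differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types. */
struct model_vfsmount {
	const char *fstype;
};

struct model_dentry {
	const char *name;
	struct model_vfsmount *(*automount)(void *);	/* the only hook exposed */
	void *data;					/* handed back to the hook */
};

static struct model_vfsmount tracefs_mnt = { "tracefs" };

/* The callback ignores its argument, matching "void *unused". */
static struct model_vfsmount *model_trace_automount(void *unused)
{
	(void)unused;
	return &tracefs_mnt;
}

/*
 * Loosely mirrors the argument order quoted in the thread:
 * (name, parent, callback, data). The parent is replaced here by
 * caller-provided storage, since the model has no directory tree.
 */
static void model_create_automount(struct model_dentry *d, const char *name,
				   struct model_vfsmount *(*f)(void *),
				   void *data)
{
	assert(d && f);
	d->name = name;
	d->automount = f;
	d->data = data;
}

/* What a path walk would do when it crosses the directory. */
static struct model_vfsmount *model_cross(struct model_dentry *d)
{
	return d->automount(d->data);
}
```

Because the caller supplies a single function pointer instead of a whole
operations table, there is nothing to copy or swap after the entry has
been created.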


* Re: [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing
  2015-01-25 20:27             ` Al Viro
@ 2015-01-25 20:31               ` Al Viro
  0 siblings, 0 replies; 14+ messages in thread
From: Al Viro @ 2015-01-25 20:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Andrew Morton

On Sun, Jan 25, 2015 at 08:27:29PM +0000, Al Viro wrote:

> Turns out that it is better that way (and less prone to abuse).  See
> vfs.git#debugfs_automount; some massage on top of 3.19-rc5, the payoff is
> in the last commit.
> 
> For your code it's a matter of replacing struct path *path with void *unused
> in trace_automount() and just calling
> 	debugfs_create_automount("tracing", NULL, trace_automount, NULL);
> to create the sucker.  That's it - no games with ->d_op, etc.

... and the total is plus 8 lines.  Would be negative, if not for the
kerneldoc comment in there...


end of thread, other threads:[~2015-01-25 20:31 UTC | newest]

Thread overview: 14+ messages
2015-01-23 15:55 [PATCH 0/5 v2] tracing: Add new file system tracefs Steven Rostedt
2015-01-23 15:55 ` [PATCH 1/5 v2] tracefs: Add new tracefs file system Steven Rostedt
2015-01-23 15:55 ` [PATCH 2/5 v2] tracing: Convert the tracing facility over to use tracefs Steven Rostedt
2015-01-23 15:55 ` [PATCH 3/5 v2] tracing: Automatically mount tracefs on debugfs/tracing Steven Rostedt
2015-01-24  3:00   ` Greg Kroah-Hartman
2015-01-24 11:33     ` Steven Rostedt
2015-01-25 13:22       ` Greg Kroah-Hartman
2015-01-25 19:38         ` Steven Rostedt
2015-01-25 19:59           ` Al Viro
2015-01-25 20:27             ` Al Viro
2015-01-25 20:31               ` Al Viro
2015-01-23 15:55 ` [PATCH 4/5 v2] tracefs: Add directory /sys/kernel/tracing Steven Rostedt
2015-01-24  3:00   ` Greg Kroah-Hartman
2015-01-23 15:55 ` [PATCH 5/5 v2] tracing: Have mkdir and rmdir be part of tracefs Steven Rostedt
