* [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent)
@ 2007-11-13 19:33 Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 1/7] Include marker.h in kernel.h -- temporary, for code readability Mathieu Desnoyers
                   ` (6 more replies)
  0 siblings, 7 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC
  To: akpm, linux-kernel

Hi,

I submit this instrumentation of the main kernel events using markers to the
Linux community as an RFC. This is the instrumentation LTTng (Linux Trace
Toolkit Next Generation, at http://ltt.polymtl.ca) uses.

In addition to this, I also have architecture-dependent instrumentation for:
- traps
- system call entry/exit
- kernel thread creation
- IPC calls
which I plan to submit a bit later, once these architecture-independent core
instrumentation patches have been discussed.

The goal here is to be able to tell "what the kernel is doing" so we can:
- get execution traces from bug reporters easily.
- help userspace developers understand the interactions of their programs
  with other processes and with the kernel.
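
For readers unfamiliar with markers: a marker site pairs an event name with a
printf-style format string and its arguments, as in the trace_mark() calls in
the patches of this series. What follows is a minimal user-space sketch of that
idea, not the kernel implementation. The names demo_trace_mark, demo_probe,
demo_markers_enabled and demo_last_event are hypothetical and exist only for
illustration; the real <linux/marker.h> works differently.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-in for a marker site; NOT <linux/marker.h>. */
static int  demo_markers_enabled;	/* 0: marker sites do nothing */
static char demo_last_event[256];	/* most recent formatted record */

/* Format "<name>: <args>" into demo_last_event. */
static void demo_probe(const char *name, const char *fmt, ...)
{
	va_list ap;
	int n;

	n = snprintf(demo_last_event, sizeof(demo_last_event), "%s: ", name);
	assert(n > 0 && (size_t)n < sizeof(demo_last_event));
	va_start(ap, fmt);
	vsnprintf(demo_last_event + n, sizeof(demo_last_event) - n, fmt, ap);
	va_end(ap);
}

/* The event name is stringified; arguments are only formatted when enabled. */
#define demo_trace_mark(name, fmt, ...)					\
	do {								\
		if (demo_markers_enabled)				\
			demo_probe(#name, fmt, __VA_ARGS__);		\
	} while (0)

/* Same shape as the fs_open site: record the fd and the filename. */
static void demo_open_event(int fd, const char *filename)
{
	demo_trace_mark(fs_open, "fd %d filename %s", fd, filename);
}
```

With demo_markers_enabled set to 1, calling demo_open_event(3, "/etc/hostname")
leaves a record for fs_open in demo_last_event; with it set to 0 the call leaves
the buffer untouched.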

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* [RFC 1/7] Include marker.h in kernel.h -- temporary, for code readability
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 2/7] LTTng instrumentation fs Mathieu Desnoyers
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers

[-- Attachment #1: lttng-instrument-kernelh.patch --]
[-- Type: text/plain, Size: 1003 bytes --]

This patch is a hack to make my life easier: it reduces the conflicts caused by
header includes that change between kernel versions.

The proper way to do this is to include <linux/marker.h> in every file using the
markers.

NOT FOR UPSTREAM.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 include/linux/kernel.h |    1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6-lttng/include/linux/kernel.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/kernel.h	2007-06-15 16:13:48.000000000 -0400
+++ linux-2.6-lttng/include/linux/kernel.h	2007-06-15 16:14:28.000000000 -0400
@@ -14,6 +14,7 @@
 #include <linux/compiler.h>
 #include <linux/bitops.h>
 #include <linux/log2.h>
+#include <linux/marker.h>
 #include <asm/byteorder.h>
 #include <asm/bug.h>
 



* [RFC 2/7] LTTng instrumentation fs
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 1/7] Include marker.h in kernel.h -- temporary, for code readability Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 3/7] LTTng instrumentation ipc Mathieu Desnoyers
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, Alexander Viro

[-- Attachment #1: lttng-instrumentation-fs.patch --]
[-- Type: text/plain, Size: 7039 bytes --]

Markers for core filesystem events.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
---
 fs/buffer.c     |    2 ++
 fs/compat.c     |    1 +
 fs/exec.c       |    1 +
 fs/ioctl.c      |    2 ++
 fs/open.c       |    2 ++
 fs/read_write.c |   21 +++++++++++++++++++--
 fs/select.c     |    4 ++++
 7 files changed, 31 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/fs/buffer.c
===================================================================
--- linux-2.6-lttng.orig/fs/buffer.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/buffer.c	2007-11-13 09:49:29.000000000 -0500
@@ -89,7 +89,9 @@ void fastcall unlock_buffer(struct buffe
  */
 void __wait_on_buffer(struct buffer_head * bh)
 {
+	trace_mark(fs_buffer_wait_start, "bh %p", bh);
 	wait_on_bit(&bh->b_state, BH_Lock, sync_buffer, TASK_UNINTERRUPTIBLE);
+	trace_mark(fs_buffer_wait_end, "bh %p", bh);
 }
 
 static void
Index: linux-2.6-lttng/fs/compat.c
===================================================================
--- linux-2.6-lttng.orig/fs/compat.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/compat.c	2007-11-13 09:49:29.000000000 -0500
@@ -1408,6 +1408,7 @@ int compat_do_execve(char * filename,
 
 	retval = search_binary_handler(bprm, regs);
 	if (retval >= 0) {
+		trace_mark(fs_exec, "filename %s", filename);
 		/* execve success */
 		security_bprm_free(bprm);
 		acct_update_integrals(current);
Index: linux-2.6-lttng/fs/ioctl.c
===================================================================
--- linux-2.6-lttng.orig/fs/ioctl.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/ioctl.c	2007-11-13 09:49:29.000000000 -0500
@@ -164,6 +164,8 @@ asmlinkage long sys_ioctl(unsigned int f
 	if (!filp)
 		goto out;
 
+	trace_mark(fs_ioctl, "fd %u cmd %u arg %lu", fd, cmd, arg);
+
 	error = security_file_ioctl(filp, cmd, arg);
 	if (error)
 		goto out_fput;
Index: linux-2.6-lttng/fs/open.c
===================================================================
--- linux-2.6-lttng.orig/fs/open.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/open.c	2007-11-13 09:49:29.000000000 -0500
@@ -1043,6 +1043,7 @@ long do_sys_open(int dfd, const char __u
 				fsnotify_open(f->f_path.dentry);
 				fd_install(fd, f);
 			}
+			trace_mark(fs_open, "fd %d filename %s", fd, tmp);
 		}
 		putname(tmp);
 	}
@@ -1133,6 +1134,7 @@ asmlinkage long sys_close(unsigned int f
 	filp = fdt->fd[fd];
 	if (!filp)
 		goto out_unlock;
+	trace_mark(fs_close, "fd %u", fd);
 	rcu_assign_pointer(fdt->fd[fd], NULL);
 	FD_CLR(fd, fdt->close_on_exec);
 	__put_unused_fd(files, fd);
Index: linux-2.6-lttng/fs/read_write.c
===================================================================
--- linux-2.6-lttng.orig/fs/read_write.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/read_write.c	2007-11-13 09:49:29.000000000 -0500
@@ -146,6 +146,9 @@ asmlinkage off_t sys_lseek(unsigned int 
 		if (res != (loff_t)retval)
 			retval = -EOVERFLOW;	/* LFS: should only happen on 32 bit platforms */
 	}
+
+	trace_mark(fs_lseek, "fd %u offset %ld origin %u", fd, offset, origin);
+
 	fput_light(file, fput_needed);
 bad:
 	return retval;
@@ -173,6 +176,9 @@ asmlinkage long sys_llseek(unsigned int 
 	offset = vfs_llseek(file, ((loff_t) offset_high << 32) | offset_low,
 			origin);
 
+	trace_mark(fs_llseek, "fd %u offset %lld origin %u", fd, offset,
+			origin);
+
 	retval = (int)offset;
 	if (offset >= 0) {
 		retval = -EFAULT;
@@ -363,6 +369,7 @@ asmlinkage ssize_t sys_read(unsigned int
 	file = fget_light(fd, &fput_needed);
 	if (file) {
 		loff_t pos = file_pos_read(file);
+		trace_mark(fs_read, "fd %u count %zu", fd, count);
 		ret = vfs_read(file, buf, count, &pos);
 		file_pos_write(file, pos);
 		fput_light(file, fput_needed);
@@ -381,6 +388,7 @@ asmlinkage ssize_t sys_write(unsigned in
 	file = fget_light(fd, &fput_needed);
 	if (file) {
 		loff_t pos = file_pos_read(file);
+		trace_mark(fs_write, "fd %u count %zu", fd, count);
 		ret = vfs_write(file, buf, count, &pos);
 		file_pos_write(file, pos);
 		fput_light(file, fput_needed);
@@ -402,8 +410,12 @@ asmlinkage ssize_t sys_pread64(unsigned 
 	file = fget_light(fd, &fput_needed);
 	if (file) {
 		ret = -ESPIPE;
-		if (file->f_mode & FMODE_PREAD)
+		if (file->f_mode & FMODE_PREAD) {
+			trace_mark(fs_pread64, "fd %u count %zu pos %lld",
+				fd, count, pos);
 			ret = vfs_read(file, buf, count, &pos);
+		}
+
 		fput_light(file, fput_needed);
 	}
 
@@ -423,8 +435,11 @@ asmlinkage ssize_t sys_pwrite64(unsigned
 	file = fget_light(fd, &fput_needed);
 	if (file) {
 		ret = -ESPIPE;
-		if (file->f_mode & FMODE_PWRITE)  
+		if (file->f_mode & FMODE_PWRITE) {
+			trace_mark(fs_pwrite64, "fd %u count %zu pos %lld",
+				fd, count, pos);
 			ret = vfs_write(file, buf, count, &pos);
+		}
 		fput_light(file, fput_needed);
 	}
 
@@ -670,6 +685,7 @@ sys_readv(unsigned long fd, const struct
 	file = fget_light(fd, &fput_needed);
 	if (file) {
 		loff_t pos = file_pos_read(file);
+		trace_mark(fs_readv, "fd %lu vlen %lu", fd, vlen);
 		ret = vfs_readv(file, vec, vlen, &pos);
 		file_pos_write(file, pos);
 		fput_light(file, fput_needed);
@@ -691,6 +707,7 @@ sys_writev(unsigned long fd, const struc
 	file = fget_light(fd, &fput_needed);
 	if (file) {
 		loff_t pos = file_pos_read(file);
+		trace_mark(fs_writev, "fd %lu vlen %lu", fd, vlen);
 		ret = vfs_writev(file, vec, vlen, &pos);
 		file_pos_write(file, pos);
 		fput_light(file, fput_needed);
Index: linux-2.6-lttng/fs/select.c
===================================================================
--- linux-2.6-lttng.orig/fs/select.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/select.c	2007-11-13 09:49:29.000000000 -0500
@@ -231,6 +231,9 @@ int do_select(int n, fd_set_bits *fds, s
 				file = fget_light(i, &fput_needed);
 				if (file) {
 					f_op = file->f_op;
+					trace_mark(fs_select,
+							"fd %d timeout #8d%llu",
+							i, *timeout);
 					mask = DEFAULT_POLLMASK;
 					if (f_op && f_op->poll)
 						mask = (*f_op->poll)(file, retval ? NULL : wait);
@@ -559,6 +562,7 @@ static inline unsigned int do_pollfd(str
 		file = fget_light(fd, &fput_needed);
 		mask = POLLNVAL;
 		if (file != NULL) {
+			trace_mark(fs_pollfd, "fd %d", fd);
 			mask = DEFAULT_POLLMASK;
 			if (file->f_op && file->f_op->poll)
 				mask = file->f_op->poll(file, pwait);
Index: linux-2.6-lttng/fs/exec.c
===================================================================
--- linux-2.6-lttng.orig/fs/exec.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/fs/exec.c	2007-11-13 09:49:29.000000000 -0500
@@ -1351,6 +1351,7 @@ int do_execve(char * filename,
 
 	retval = search_binary_handler(bprm,regs);
 	if (retval >= 0) {
+		trace_mark(fs_exec, "filename %s", filename);
 		/* execve success */
 		free_arg_pages(bprm);
 		security_bprm_free(bprm);



* [RFC 3/7] LTTng instrumentation ipc
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 1/7] Include marker.h in kernel.h -- temporary, for code readability Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 2/7] LTTng instrumentation fs Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 4/7] LTTng instrumentation kernel Mathieu Desnoyers
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers

[-- Attachment #1: lttng-instrumentation-ipc.patch --]
[-- Type: text/plain, Size: 2836 bytes --]

Interprocess communication, core events.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 ipc/msg.c |    5 ++++-
 ipc/sem.c |    5 ++++-
 ipc/shm.c |    5 ++++-
 3 files changed, 12 insertions(+), 3 deletions(-)

Index: linux-2.6-lttng/ipc/msg.c
===================================================================
--- linux-2.6-lttng.orig/ipc/msg.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/ipc/msg.c	2007-11-13 09:49:31.000000000 -0500
@@ -315,6 +315,7 @@ asmlinkage long sys_msgget(key_t key, in
 	struct ipc_namespace *ns;
 	struct ipc_ops msg_ops;
 	struct ipc_params msg_params;
+	long ret;
 
 	ns = current->nsproxy->ipc_ns;
 
@@ -325,7 +326,9 @@ asmlinkage long sys_msgget(key_t key, in
 	msg_params.key = key;
 	msg_params.flg = msgflg;
 
-	return ipcget(ns, &msg_ids(ns), &msg_ops, &msg_params);
+	ret = ipcget(ns, &msg_ids(ns), &msg_ops, &msg_params);
+	trace_mark(ipc_msg_create, "id %ld flags %d", ret, msgflg);
+	return ret;
 }
 
 static inline unsigned long
Index: linux-2.6-lttng/ipc/sem.c
===================================================================
--- linux-2.6-lttng.orig/ipc/sem.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/ipc/sem.c	2007-11-13 09:49:31.000000000 -0500
@@ -334,6 +334,7 @@ asmlinkage long sys_semget(key_t key, in
 	struct ipc_namespace *ns;
 	struct ipc_ops sem_ops;
 	struct ipc_params sem_params;
+	long err;
 
 	ns = current->nsproxy->ipc_ns;
 
@@ -348,7 +349,9 @@ asmlinkage long sys_semget(key_t key, in
 	sem_params.flg = semflg;
 	sem_params.u.nsems = nsems;
 
-	return ipcget(ns, &sem_ids(ns), &sem_ops, &sem_params);
+	err = ipcget(ns, &sem_ids(ns), &sem_ops, &sem_params);
+	trace_mark(ipc_sem_create, "id %ld flags %d", err, semflg);
+	return err;
 }
 
 /* Manage the doubly linked list sma->sem_pending as a FIFO:
Index: linux-2.6-lttng/ipc/shm.c
===================================================================
--- linux-2.6-lttng.orig/ipc/shm.c	2007-11-13 09:49:27.000000000 -0500
+++ linux-2.6-lttng/ipc/shm.c	2007-11-13 09:49:31.000000000 -0500
@@ -497,6 +497,7 @@ asmlinkage long sys_shmget (key_t key, s
 	struct ipc_namespace *ns;
 	struct ipc_ops shm_ops;
 	struct ipc_params shm_params;
+	long err;
 
 	ns = current->nsproxy->ipc_ns;
 
@@ -508,7 +509,9 @@ asmlinkage long sys_shmget (key_t key, s
 	shm_params.flg = shmflg;
 	shm_params.u.size = size;
 
-	return ipcget(ns, &shm_ids(ns), &shm_ops, &shm_params);
+	err = ipcget(ns, &shm_ids(ns), &shm_ops, &shm_params);
+	trace_mark(ipc_shm_create, "id %ld flags %d", err, shmflg);
+	return err;
 }
 
 static inline unsigned long copy_shmid_to_user(void __user *buf, struct shmid64_ds *in, int version)
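
As an illustration outside the patch itself: all three hunks above replace a
direct "return ipcget(...)" with a local variable so that the marker can record
the result before it is returned. The sketch below shows that shape in user
space. demo_ipcget, demo_msgget and demo_record are hypothetical stand-ins, and
the id values are made up; only the event name and format string come from the
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

static char demo_record[64];	/* last record written by the stand-in marker */

/* Hypothetical stand-in for ipcget(): an id on success, -errno on failure. */
static long demo_ipcget(int key)
{
	return key >= 0 ? 100 + key : -EINVAL;
}

/* Same shape as the patched sys_msgget(): keep the result, trace it, return it. */
static long demo_msgget(int key, int msgflg)
{
	long ret;
	int n;

	ret = demo_ipcget(key);
	/* Stand-in for trace_mark(ipc_msg_create, "id %ld flags %d", ...). */
	n = snprintf(demo_record, sizeof(demo_record),
		     "ipc_msg_create: id %ld flags %d", ret, msgflg);
	assert(n > 0 && (size_t)n < sizeof(demo_record));
	return ret;
}
```

Because the record is written unconditionally after the call, a failed lookup
is traced as well, with the negative error code in the id field.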



* [RFC 4/7] LTTng instrumentation kernel
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
                   ` (2 preceding siblings ...)
  2007-11-13 19:33 ` [RFC 3/7] LTTng instrumentation ipc Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  2007-11-15 23:30   ` Mike Mason
  2007-11-13 19:33 ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers

[-- Attachment #1: lttng-instrumentation-kernel.patch --]
[-- Type: text/plain, Size: 15894 bytes --]

Core kernel events.

*Not* present in this patch because they are architecture-specific:
- syscall entry/exit
- traps
- kernel thread creation

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 include/linux/module.h |    1 +
 kernel/exit.c          |    5 +++++
 kernel/fork.c          |    4 ++++
 kernel/irq/handle.c    |    6 ++++++
 kernel/itimer.c        |   11 +++++++++++
 kernel/kthread.c       |    4 ++++
 kernel/lockdep.c       |   19 +++++++++++++++++++
 kernel/module.c        |   25 +++++++++++++++++++++++++
 kernel/printk.c        |   26 ++++++++++++++++++++++++++
 kernel/sched.c         |   11 +++++++++++
 kernel/signal.c        |    2 ++
 kernel/softirq.c       |   22 ++++++++++++++++++++++
 kernel/timer.c         |   12 +++++++++++-
 13 files changed, 147 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/kernel/irq/handle.c
===================================================================
--- linux-2.6-lttng.orig/kernel/irq/handle.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/irq/handle.c	2007-11-13 09:49:33.000000000 -0500
@@ -130,6 +130,10 @@ irqreturn_t handle_IRQ_event(unsigned in
 {
 	irqreturn_t ret, retval = IRQ_NONE;
 	unsigned int status = 0;
+	struct pt_regs *regs = get_irq_regs();
+
+	trace_mark(kernel_irq_entry, "irq_id %u kernel_mode %u", irq,
+		(regs)?(!user_mode(regs)):(1));
 
 	handle_dynamic_tick(action);
 
@@ -148,6 +152,8 @@ irqreturn_t handle_IRQ_event(unsigned in
 		add_interrupt_randomness(irq);
 	local_irq_disable();
 
+	trace_mark(kernel_irq_exit, MARK_NOARGS);
+
 	return retval;
 }
 
Index: linux-2.6-lttng/kernel/itimer.c
===================================================================
--- linux-2.6-lttng.orig/kernel/itimer.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/itimer.c	2007-11-13 09:49:33.000000000 -0500
@@ -132,6 +132,8 @@ enum hrtimer_restart it_real_fn(struct h
 	struct signal_struct *sig =
 		container_of(timer, struct signal_struct, real_timer);
 
+	trace_mark(kernel_timer_itimer_expired, "pid %d", sig->tsk->pid);
+
 	send_group_sig_info(SIGALRM, SEND_SIG_PRIV, sig->tsk);
 
 	return HRTIMER_NORESTART;
@@ -157,6 +159,15 @@ int do_setitimer(int which, struct itime
 	    !timeval_valid(&value->it_interval))
 		return -EINVAL;
 
+	trace_mark(kernel_timer_itimer_set,
+			"which %d interval_sec %ld interval_usec %ld "
+			"value_sec %ld value_usec %ld",
+			which,
+			value->it_interval.tv_sec,
+			value->it_interval.tv_usec,
+			value->it_value.tv_sec,
+			value->it_value.tv_usec);
+
 	switch (which) {
 	case ITIMER_REAL:
 again:
Index: linux-2.6-lttng/kernel/kthread.c
===================================================================
--- linux-2.6-lttng.orig/kernel/kthread.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/kthread.c	2007-11-13 09:49:33.000000000 -0500
@@ -195,6 +195,8 @@ int kthread_stop(struct task_struct *k)
 	/* It could exit after stop_info.k set, but before wake_up_process. */
 	get_task_struct(k);
 
+	trace_mark(kernel_kthread_stop, "pid %d", k->pid);
+
 	/* Must init completion *before* thread sees kthread_stop_info.k */
 	init_completion(&kthread_stop_info.done);
 	smp_wmb();
@@ -210,6 +212,8 @@ int kthread_stop(struct task_struct *k)
 	ret = kthread_stop_info.err;
 	mutex_unlock(&kthread_stop_lock);
 
+	trace_mark(kernel_kthread_stop_ret, "ret %d", ret);
+
 	return ret;
 }
 EXPORT_SYMBOL(kthread_stop);
Index: linux-2.6-lttng/kernel/lockdep.c
===================================================================
--- linux-2.6-lttng.orig/kernel/lockdep.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/lockdep.c	2007-11-13 09:49:33.000000000 -0500
@@ -2014,6 +2014,9 @@ void trace_hardirqs_on(void)
 	struct task_struct *curr = current;
 	unsigned long ip;
 
+	_trace_mark(locking_hardirqs_on, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;
 
@@ -2061,6 +2064,9 @@ void trace_hardirqs_off(void)
 {
 	struct task_struct *curr = current;
 
+	_trace_mark(locking_hardirqs_off, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;
 
@@ -2088,6 +2094,9 @@ void trace_softirqs_on(unsigned long ip)
 {
 	struct task_struct *curr = current;
 
+	_trace_mark(locking_softirqs_on, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks))
 		return;
 
@@ -2122,6 +2131,9 @@ void trace_softirqs_off(unsigned long ip
 {
 	struct task_struct *curr = current;
 
+	_trace_mark(locking_softirqs_off, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks))
 		return;
 
@@ -2358,6 +2370,10 @@ static int __lock_acquire(struct lockdep
 	int chain_head = 0;
 	u64 chain_key;
 
+	_trace_mark(locking_lock_acquire,
+		"ip #p%lu subclass %u lock %p trylock %d",
+		ip, subclass, lock, trylock);
+
 	if (!prove_locking)
 		check = 1;
 
@@ -2631,6 +2647,9 @@ __lock_release(struct lockdep_map *lock,
 {
 	struct task_struct *curr = current;
 
+	_trace_mark(locking_lock_release, "ip #p%lu lock %p nested %d",
+		ip, lock, nested);
+
 	if (!check_unlock(curr, lock, ip))
 		return;
 
Index: linux-2.6-lttng/kernel/printk.c
===================================================================
--- linux-2.6-lttng.orig/kernel/printk.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/printk.c	2007-11-13 09:49:33.000000000 -0500
@@ -619,6 +619,7 @@ asmlinkage int printk(const char *fmt, .
 	int r;
 
 	va_start(args, fmt);
+	trace_mark(kernel_printk, "ip %p", __builtin_return_address(0));
 	r = vprintk(fmt, args);
 	va_end(args);
 
@@ -653,6 +654,31 @@ asmlinkage int vprintk(const char *fmt, 
 	/* Emit the output into the temporary buffer */
 	printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);
 
+	if (printed_len > 0) {
+		unsigned int loglevel;
+		int mark_len;
+		char *mark_buf;
+		char saved_char;
+
+		if (printk_buf[0] == '<' && printk_buf[1] >= '0' &&
+		   printk_buf[1] <= '7' && printk_buf[2] == '>') {
+			loglevel = printk_buf[1] - '0';
+			mark_buf = &printk_buf[3];
+			mark_len = printed_len - 3;
+		} else {
+			loglevel = default_message_loglevel;
+			mark_buf = printk_buf;
+			mark_len = printed_len;
+		}
+		if (mark_buf[mark_len - 1] == '\n')
+			mark_len--;
+		saved_char = mark_buf[mark_len];
+		mark_buf[mark_len] = '\0';
+		_trace_mark(kernel_vprintk, "loglevel %c string %s ip %p",
+			loglevel, mark_buf, __builtin_return_address(0));
+		mark_buf[mark_len] = saved_char;
+	}
+
 	/*
 	 * Copy the output into log_buf.  If the caller didn't provide
 	 * appropriate log level tags, we insert them here
Index: linux-2.6-lttng/kernel/sched.c
===================================================================
--- linux-2.6-lttng.orig/kernel/sched.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/sched.c	2007-11-13 09:49:33.000000000 -0500
@@ -1161,6 +1161,8 @@ void wait_task_inactive(struct task_stru
 		 * just go back and repeat.
 		 */
 		rq = task_rq_lock(p, &flags);
+		trace_mark(kernel_sched_wait_task, "pid %d state %ld",
+			p->pid, p->state);
 		running = task_running(rq, p);
 		on_rq = p->se.on_rq;
 		task_rq_unlock(rq, &flags);
@@ -1495,6 +1497,8 @@ static int try_to_wake_up(struct task_st
 #endif
 
 	rq = task_rq_lock(p, &flags);
+	trace_mark(kernel_sched_try_wakeup, "pid %d state %ld",
+		p->pid, p->state);
 	old_state = p->state;
 	if (!(old_state & state))
 		goto out;
@@ -1733,6 +1737,8 @@ void fastcall wake_up_new_task(struct ta
 	struct rq *rq;
 
 	rq = task_rq_lock(p, &flags);
+	trace_mark(kernel_sched_wakeup_new_task, "pid %d state %ld",
+		p->pid, p->state);
 	BUG_ON(p->state != TASK_RUNNING);
 	update_rq_clock(rq);
 
@@ -1911,6 +1917,9 @@ context_switch(struct rq *rq, struct tas
 	struct mm_struct *mm, *oldmm;
 
 	prepare_task_switch(rq, prev, next);
+	trace_mark(kernel_sched_schedule,
+		"prev_pid %d next_pid %d prev_state %ld",
+		prev->pid, next->pid, prev->state);
 	mm = next->mm;
 	oldmm = prev->active_mm;
 	/*
@@ -2139,6 +2148,8 @@ static void sched_migrate_task(struct ta
 	    || unlikely(cpu_is_offline(dest_cpu)))
 		goto out;
 
+	trace_mark(kernel_sched_migrate_task, "pid %d state %ld dest_cpu %d",
+		p->pid, p->state, dest_cpu);
 	/* force the process onto the specified CPU */
 	if (migrate_task(p, dest_cpu, &req)) {
 		/* Need to wait for migration thread (might exit: take ref). */
Index: linux-2.6-lttng/kernel/signal.c
===================================================================
--- linux-2.6-lttng.orig/kernel/signal.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/signal.c	2007-11-13 09:49:33.000000000 -0500
@@ -663,6 +663,8 @@ static int send_signal(int sig, struct s
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	trace_mark(kernel_send_signal, "pid %d signal %d", t->pid, sig);
+
 	/*
 	 * Deliver the signal to listening signalfds. This must be called
 	 * with the sighand lock held.
Index: linux-2.6-lttng/kernel/softirq.c
===================================================================
--- linux-2.6-lttng.orig/kernel/softirq.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/softirq.c	2007-11-13 09:49:33.000000000 -0500
@@ -229,7 +229,15 @@ restart:
 
 	do {
 		if (pending & 1) {
+			trace_mark(kernel_softirq_entry, "softirq_id %lu",
+				((unsigned long)h
+					- (unsigned long)softirq_vec)
+					/ sizeof(*h));
 			h->action(h);
+			trace_mark(kernel_softirq_exit, "softirq_id %lu",
+				((unsigned long)h
+					- (unsigned long)softirq_vec)
+					/ sizeof(*h));
 			rcu_bh_qsctr_inc(cpu);
 		}
 		h++;
@@ -315,6 +323,8 @@ void irq_exit(void)
  */
 inline fastcall void raise_softirq_irqoff(unsigned int nr)
 {
+	trace_mark(kernel_softirq_raise, "softirq %u", nr);
+
 	__raise_softirq_irqoff(nr);
 
 	/*
@@ -400,7 +410,13 @@ static void tasklet_action(struct softir
 			if (!atomic_read(&t->count)) {
 				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
 					BUG();
+				trace_mark(kernel_tasklet_low_entry,
+						"func %p data %lu",
+						t->func, t->data);
 				t->func(t->data);
+				trace_mark(kernel_tasklet_low_exit,
+						"func %p data %lu",
+						t->func, t->data);
 				tasklet_unlock(t);
 				continue;
 			}
@@ -433,7 +449,13 @@ static void tasklet_hi_action(struct sof
 			if (!atomic_read(&t->count)) {
 				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
 					BUG();
+				trace_mark(kernel_tasklet_high_entry,
+						"func %p data %lu",
+						t->func, t->data);
 				t->func(t->data);
+				trace_mark(kernel_tasklet_high_exit,
+						"func %p data %lu",
+						t->func, t->data);
 				tasklet_unlock(t);
 				continue;
 			}
Index: linux-2.6-lttng/kernel/timer.c
===================================================================
--- linux-2.6-lttng.orig/kernel/timer.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/timer.c	2007-11-13 09:49:33.000000000 -0500
@@ -43,6 +43,7 @@
 #include <asm/div64.h>
 #include <asm/timex.h>
 #include <asm/io.h>
+#include <asm/irq_regs.h>
 
 u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
 
@@ -290,6 +291,8 @@ static void internal_add_timer(tvec_base
 		i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
 		vec = base->tv5.vec + i;
 	}
+	trace_mark(kernel_timer_set, "expires %lu function %p data %lu",
+		expires, timer->function, timer->data);
 	/*
 	 * Timers are FIFO:
 	 */
@@ -931,6 +934,11 @@ void do_timer(unsigned long ticks)
 {
 	jiffies_64 += ticks;
 	update_times(ticks);
+	trace_mark(kernel_timer_update_time,
+		"jiffies #8u%llu xtime_sec %ld xtime_nsec %ld "
+		"walltomonotonic_sec %ld walltomonotonic_nsec %ld",
+		jiffies_64, xtime.tv_sec, xtime.tv_nsec,
+		wall_to_monotonic.tv_sec, wall_to_monotonic.tv_nsec);
 }
 
 #ifdef __ARCH_WANT_SYS_ALARM
@@ -1012,7 +1020,9 @@ asmlinkage long sys_getegid(void)
 
 static void process_timeout(unsigned long __data)
 {
-	wake_up_process((struct task_struct *)__data);
+	struct task_struct *task = (struct task_struct *)__data;
+	trace_mark(kernel_timer_timeout, "pid %d", task->pid);
+	wake_up_process(task);
 }
 
 /**
Index: linux-2.6-lttng/kernel/exit.c
===================================================================
--- linux-2.6-lttng.orig/kernel/exit.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/exit.c	2007-11-13 09:49:33.000000000 -0500
@@ -177,6 +177,7 @@ repeat:
 
 	write_unlock_irq(&tasklist_lock);
 	release_thread(p);
+	trace_mark(kernel_process_free, "pid %d", p->pid);
 	call_rcu(&p->rcu, delayed_put_task_struct);
 
 	p = leader;
@@ -994,6 +995,8 @@ fastcall NORET_TYPE void do_exit(long co
 
 	if (group_dead)
 		acct_process();
+	trace_mark(kernel_process_exit, "pid %d", tsk->pid);
+
 	exit_sem(tsk);
 	__exit_files(tsk);
 	__exit_fs(tsk);
@@ -1539,6 +1542,8 @@ static long do_wait(pid_t pid, int optio
 	int flag, retval;
 	int allowed, denied;
 
+	trace_mark(kernel_process_wait, "pid %d", pid);
+
 	add_wait_queue(&current->signal->wait_chldexit,&wait);
 repeat:
 	/*
Index: linux-2.6-lttng/kernel/fork.c
===================================================================
--- linux-2.6-lttng.orig/kernel/fork.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/fork.c	2007-11-13 09:49:33.000000000 -0500
@@ -1435,6 +1435,10 @@ long do_fork(unsigned long clone_flags,
 	if (!IS_ERR(p)) {
 		struct completion vfork;
 
+		trace_mark(kernel_process_fork,
+			"parent_pid %d child_pid %d child_tgid %d",
+			current->pid, p->pid, p->tgid);
+
 		/*
 		 * this is enough to call pid_nr_ns here, but this if
 		 * improves optimisation of regular fork()
Index: linux-2.6-lttng/include/linux/module.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/module.h	2007-11-13 09:48:41.000000000 -0500
+++ linux-2.6-lttng/include/linux/module.h	2007-11-13 09:49:33.000000000 -0500
@@ -466,6 +466,7 @@ int register_module_notifier(struct noti
 int unregister_module_notifier(struct notifier_block * nb);
 
 extern void print_modules(void);
+extern void list_modules(void *call_data);
 
 extern void module_update_markers(void);
 
Index: linux-2.6-lttng/kernel/module.c
===================================================================
--- linux-2.6-lttng.orig/kernel/module.c	2007-11-13 09:49:16.000000000 -0500
+++ linux-2.6-lttng/kernel/module.c	2007-11-13 09:49:33.000000000 -0500
@@ -1294,6 +1294,8 @@ static int __unlink_module(void *_mod)
 /* Free a module, remove from lists, etc (must hold module_mutex). */
 static void free_module(struct module *mod)
 {
+	trace_mark(kernel_module_free, "name %s", mod->name);
+
 	/* Delete from various lists */
 	stop_machine_run(__unlink_module, mod, NR_CPUS);
 	remove_notes_attrs(mod);
@@ -2063,6 +2065,8 @@ static struct module *load_module(void _
 	/* Get rid of temporary copy */
 	vfree(hdr);
 
+	trace_mark(kernel_module_load, "name %s", mod->name);
+
 	/* Done! */
 	return mod;
 
@@ -2426,6 +2430,27 @@ const struct seq_operations modules_op =
 	.show	= m_show
 };
 
+void list_modules(void *call_data)
+{
+	/* Enumerate loaded modules */
+	struct list_head	*i;
+	struct module		*mod;
+	unsigned long refcount = 0;
+
+	mutex_lock(&module_mutex);
+	list_for_each(i, &modules) {
+		mod = list_entry(i, struct module, list);
+#ifdef CONFIG_MODULE_UNLOAD
+		refcount = local_read(&mod->ref[0].count);
+#endif
+		__trace_mark(0, list_module, call_data,
+				"name %s state %d refcount %lu",
+				mod->name, mod->state, refcount);
+	}
+	mutex_unlock(&module_mutex);
+}
+EXPORT_SYMBOL_GPL(list_modules);
+
 /* Given an address, look for it in the module exception tables. */
 const struct exception_table_entry *search_module_extables(unsigned long addr)
 {
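
As an illustration outside the patch itself: before firing the kernel_vprintk
marker, the vprintk() hunk above strips an optional "<N>" log-level prefix and
one trailing newline from the formatted message. The user-space sketch below
reproduces that splitting step only. demo_split_printk, struct demo_mark and
DEMO_DEFAULT_LOGLEVEL are hypothetical names, and the value 4 is an assumed
stand-in for default_message_loglevel. Unlike the patch, which terminates the
string in place and restores the saved character afterwards, this sketch
returns a pointer and a length, and it adds explicit length checks because it
does not assume a NUL-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_DEFAULT_LOGLEVEL 4	/* assumption: stands in for default_message_loglevel */

struct demo_mark {
	unsigned int loglevel;	/* 0..7, or the default when there is no "<N>" prefix */
	const char *text;	/* first byte after the prefix, if any */
	size_t len;		/* text length without one trailing '\n' */
};

/* Split buf[0..len) the way the vprintk() hunk does before its marker fires. */
static struct demo_mark demo_split_printk(const char *buf, size_t len)
{
	struct demo_mark m = { DEMO_DEFAULT_LOGLEVEL, buf, len };

	assert(buf != NULL);
	if (len >= 3 && buf[0] == '<' && buf[1] >= '0' && buf[1] <= '7' &&
	    buf[2] == '>') {
		m.loglevel = (unsigned int)(buf[1] - '0');
		m.text = buf + 3;
		m.len = len - 3;
	}
	/* Drop a single trailing newline, as the patch does. */
	if (m.len > 0 && m.text[m.len - 1] == '\n')
		m.len--;
	return m;
}
```

For the 13-byte input "<3>disk full\n" this yields log level 3 and the 9-byte
text "disk full"; an input without a prefix keeps the default level and its
full length.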



* [RFC 5/7] LTTng instrumentation mm
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
                   ` (3 preceding siblings ...)
  2007-11-13 19:33 ` [RFC 4/7] LTTng instrumentation kernel Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  2007-11-15 21:06   ` Dave Hansen
  2007-11-13 19:33 ` [RFC 6/7] LTTng instrumentation net Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 7/7] Add Markers Into Semaphore Primitives Mathieu Desnoyers
  6 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, linux-mm

[-- Attachment #1: lttng-instrumentation-mm.patch --]
[-- Type: text/plain, Size: 4322 bytes --]

Memory management core events.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
---
 mm/filemap.c    |    4 ++++
 mm/memory.c     |   34 +++++++++++++++++++++++++---------
 mm/page_alloc.c |    5 +++++
 mm/page_io.c    |    1 +
 4 files changed, 35 insertions(+), 9 deletions(-)

Index: linux-2.6-lttng/mm/filemap.c
===================================================================
--- linux-2.6-lttng.orig/mm/filemap.c	2007-11-13 09:25:26.000000000 -0500
+++ linux-2.6-lttng/mm/filemap.c	2007-11-13 09:49:35.000000000 -0500
@@ -514,9 +514,13 @@ void fastcall wait_on_page_bit(struct pa
 {
 	DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
 
+	trace_mark(mm_filemap_wait_start, "address %p", page_address(page));
+
 	if (test_bit(bit_nr, &page->flags))
 		__wait_on_bit(page_waitqueue(page), &wait, sync_page,
 							TASK_UNINTERRUPTIBLE);
+
+	trace_mark(mm_filemap_wait_end, "address %p", page_address(page));
 }
 EXPORT_SYMBOL(wait_on_page_bit);
 
Index: linux-2.6-lttng/mm/memory.c
===================================================================
--- linux-2.6-lttng.orig/mm/memory.c	2007-11-13 09:45:41.000000000 -0500
+++ linux-2.6-lttng/mm/memory.c	2007-11-13 09:49:35.000000000 -0500
@@ -2072,6 +2072,7 @@ static int do_swap_page(struct mm_struct
 	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
 	page = lookup_swap_cache(entry);
 	if (!page) {
+		trace_mark(mm_swap_in, "address #p%lu", address);
 		grab_swap_token(); /* Contend for token _before_ read-in */
  		swapin_readahead(entry, address, vma);
  		page = read_swap_cache_async(entry, vma, address);
@@ -2526,30 +2527,45 @@ unlock:
 int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		unsigned long address, int write_access)
 {
+	int res;
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
+	trace_mark(mm_handle_fault_entry, "address %lu ip #p%ld",
+		address, KSTK_EIP(current));
+
 	__set_current_state(TASK_RUNNING);
 
 	count_vm_event(PGFAULT);
 
-	if (unlikely(is_vm_hugetlb_page(vma)))
-		return hugetlb_fault(mm, vma, address, write_access);
+	if (unlikely(is_vm_hugetlb_page(vma))) {
+		res = hugetlb_fault(mm, vma, address, write_access);
+		goto end;
+	}
 
 	pgd = pgd_offset(mm, address);
 	pud = pud_alloc(mm, pgd, address);
-	if (!pud)
-		return VM_FAULT_OOM;
+	if (!pud) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 	pmd = pmd_alloc(mm, pud, address);
-	if (!pmd)
-		return VM_FAULT_OOM;
+	if (!pmd) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 	pte = pte_alloc_map(mm, pmd, address);
-	if (!pte)
-		return VM_FAULT_OOM;
+	if (!pte) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 
-	return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+	res = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+end:
+	trace_mark(mm_handle_fault_exit, MARK_NOARGS);
+	return res;
 }
 
 #ifndef __PAGETABLE_PUD_FOLDED
Index: linux-2.6-lttng/mm/page_alloc.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_alloc.c	2007-11-13 09:25:26.000000000 -0500
+++ linux-2.6-lttng/mm/page_alloc.c	2007-11-13 09:49:35.000000000 -0500
@@ -519,6 +519,9 @@ static void __free_pages_ok(struct page 
 	int i;
 	int reserved = 0;
 
+	trace_mark(mm_page_free, "order %u address %p",
+		order, page_address(page));
+
 	for (i = 0 ; i < (1 << order) ; ++i)
 		reserved += free_pages_check(page + i);
 	if (reserved)
@@ -1639,6 +1642,8 @@ fastcall unsigned long __get_free_pages(
 	page = alloc_pages(gfp_mask, order);
 	if (!page)
 		return 0;
+	trace_mark(mm_page_alloc, "order %u address %p",
+		order, page_address(page));
 	return (unsigned long) page_address(page);
 }
 
Index: linux-2.6-lttng/mm/page_io.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_io.c	2007-11-13 09:25:26.000000000 -0500
+++ linux-2.6-lttng/mm/page_io.c	2007-11-13 09:49:35.000000000 -0500
@@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
 		rw |= (1 << BIO_RW_SYNC);
 	count_vm_event(PSWPOUT);
 	set_page_writeback(page);
+	trace_mark(mm_swap_out, "address %p", page_address(page));
 	unlock_page(page);
 	submit_bio(rw, bio);
 out:

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* [RFC 6/7] LTTng instrumentation net
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
                   ` (4 preceding siblings ...)
  2007-11-13 19:33 ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  2007-11-13 19:33 ` [RFC 7/7] Add Markers Into Semaphore Primitives Mathieu Desnoyers
  6 siblings, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mathieu Desnoyers, netdev

[-- Attachment #1: lttng-instrumentation-net.patch --]
[-- Type: text/plain, Size: 3739 bytes --]

Network core events.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: netdev@vger.kernel.org
---
 net/core/dev.c     |    5 +++++
 net/ipv4/devinet.c |    5 +++++
 net/socket.c       |   18 ++++++++++++++++++
 3 files changed, 28 insertions(+)

Index: linux-2.6-lttng/net/core/dev.c
===================================================================
--- linux-2.6-lttng.orig/net/core/dev.c	2007-11-13 09:25:26.000000000 -0500
+++ linux-2.6-lttng/net/core/dev.c	2007-11-13 09:49:37.000000000 -0500
@@ -1637,6 +1637,8 @@ int dev_queue_xmit(struct sk_buff *skb)
 	}
 
 gso:
+	trace_mark(net_dev_xmit, "skb %p protocol #2u%hu", skb, skb->protocol);
+
 	spin_lock_prefetch(&dev->queue_lock);
 
 	/* Disable soft irqs for various locks below. Also
@@ -2037,6 +2039,9 @@ int netif_receive_skb(struct sk_buff *sk
 
 	__get_cpu_var(netdev_rx_stat).total++;
 
+	trace_mark(net_dev_receive, "skb %p protocol #2u%hu",
+		skb, skb->protocol);
+
 	skb_reset_network_header(skb);
 	skb_reset_transport_header(skb);
 	skb->mac_len = skb->network_header - skb->mac_header;
Index: linux-2.6-lttng/net/ipv4/devinet.c
===================================================================
--- linux-2.6-lttng.orig/net/ipv4/devinet.c	2007-11-13 09:25:26.000000000 -0500
+++ linux-2.6-lttng/net/ipv4/devinet.c	2007-11-13 09:49:37.000000000 -0500
@@ -262,6 +262,8 @@ static void __inet_del_ifa(struct in_dev
 		struct in_ifaddr **ifap1 = &ifa1->ifa_next;
 
 		while ((ifa = *ifap1) != NULL) {
+			trace_mark(net_del_ifa_ipv4, "label %s",
+				ifa->ifa_label);
 			if (!(ifa->ifa_flags & IFA_F_SECONDARY) &&
 			    ifa1->ifa_scope <= ifa->ifa_scope)
 				last_prim = ifa;
@@ -368,6 +370,9 @@ static int __inet_insert_ifa(struct in_i
 			}
 			ifa->ifa_flags |= IFA_F_SECONDARY;
 		}
+		trace_mark(net_insert_ifa_ipv4, "label %s address #4u%lu",
+			ifa->ifa_label,
+			(unsigned long)ifa->ifa_address);
 	}
 
 	if (!(ifa->ifa_flags & IFA_F_SECONDARY)) {
Index: linux-2.6-lttng/net/socket.c
===================================================================
--- linux-2.6-lttng.orig/net/socket.c	2007-11-13 09:25:26.000000000 -0500
+++ linux-2.6-lttng/net/socket.c	2007-11-13 09:49:37.000000000 -0500
@@ -563,6 +563,11 @@ int sock_sendmsg(struct socket *sock, st
 	struct sock_iocb siocb;
 	int ret;
 
+	trace_mark(net_socket_sendmsg,
+		"sock %p family %d type %d protocol %d size %zu",
+		sock, sock->sk->sk_family, sock->sk->sk_type,
+		sock->sk->sk_protocol, size);
+
 	init_sync_kiocb(&iocb, NULL);
 	iocb.private = &siocb;
 	ret = __sock_sendmsg(&iocb, sock, msg, size);
@@ -646,7 +651,13 @@ int sock_recvmsg(struct socket *sock, st
 	struct sock_iocb siocb;
 	int ret;
 
+	trace_mark(net_socket_recvmsg,
+		"sock %p family %d type %d protocol %d size %zu",
+		sock, sock->sk->sk_family, sock->sk->sk_type,
+		sock->sk->sk_protocol, size);
+
 	init_sync_kiocb(&iocb, NULL);
+
 	iocb.private = &siocb;
 	ret = __sock_recvmsg(&iocb, sock, msg, size, flags);
 	if (-EIOCBQUEUED == ret)
@@ -1212,6 +1223,11 @@ asmlinkage long sys_socket(int family, i
 	if (retval < 0)
 		goto out_release;
 
+	trace_mark(net_socket_create,
+		"sock %p family %d type %d protocol %d fd %d",
+		sock, sock->sk->sk_family, sock->sk->sk_type,
+		sock->sk->sk_protocol, retval);
+
 out:
 	/* It may be already another descriptor 8) Not kernel problem. */
 	return retval;
@@ -2021,6 +2037,8 @@ asmlinkage long sys_socketcall(int call,
 	a0 = a[0];
 	a1 = a[1];
 
+	trace_mark(net_socket_call, "call %d a0 %lu", call, a0);
+
 	switch (call) {
 	case SYS_SOCKET:
 		err = sys_socket(a0, a1, a[2]);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* [RFC 7/7] Add Markers Into Semaphore Primitives
  2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
                   ` (5 preceding siblings ...)
  2007-11-13 19:33 ` [RFC 6/7] LTTng instrumentation net Mathieu Desnoyers
@ 2007-11-13 19:33 ` Mathieu Desnoyers
  6 siblings, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-13 19:33 UTC (permalink / raw)
  To: akpm, linux-kernel; +Cc: Mike Mason, David Wilder, Mathieu Desnoyers

[-- Attachment #1: add-markers-into-semaphore-primitives.patch --]
[-- Type: text/plain, Size: 3111 bytes --]

This patch adds several markers around semaphore primitives.
Along with a tracing application this patch can be useful for measuring
kernel semaphore usage and contention.

Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 lib/semaphore-sleepers.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/lib/semaphore-sleepers.c b/lib/semaphore-sleepers.c
index 1281805..5343a96 100644
--- a/lib/semaphore-sleepers.c
+++ b/lib/semaphore-sleepers.c
@@ -15,6 +15,7 @@
 #include <linux/sched.h>
 #include <linux/err.h>
 #include <linux/init.h>
+#include <linux/marker.h>
 #include <asm/semaphore.h>
 
 /*
@@ -50,6 +51,7 @@
 
 fastcall void __up(struct semaphore *sem)
 {
+	trace_mark(sem_up, "%p", sem);
 	wake_up(&sem->wait);
 }
 
@@ -59,6 +61,7 @@ fastcall void __sched __down(struct semaphore * sem)
 	DECLARE_WAITQUEUE(wait, tsk);
 	unsigned long flags;
 
+	trace_mark(sem_down, "%p", sem);
 	tsk->state = TASK_UNINTERRUPTIBLE;
 	spin_lock_irqsave(&sem->wait.lock, flags);
 	add_wait_queue_exclusive_locked(&sem->wait, &wait);
@@ -73,12 +76,14 @@ fastcall void __sched __down(struct semaphore * sem)
 		 * the wait_queue_head.
 		 */
 		if (!atomic_add_negative(sleepers - 1, &sem->count)) {
+			trace_mark(sem_down_resume, "%p", sem);
 			sem->sleepers = 0;
 			break;
 		}
 		sem->sleepers = 1;	/* us - see -1 above */
 		spin_unlock_irqrestore(&sem->wait.lock, flags);
 
+		trace_mark(sem_down_sched, "%p", sem);
 		schedule();
 
 		spin_lock_irqsave(&sem->wait.lock, flags);
@@ -97,6 +102,7 @@ fastcall int __sched __down_interruptible(struct semaphore * sem)
 	DECLARE_WAITQUEUE(wait, tsk);
 	unsigned long flags;
 
+	trace_mark(sem_down_intr, "%p", sem);
 	tsk->state = TASK_INTERRUPTIBLE;
 	spin_lock_irqsave(&sem->wait.lock, flags);
 	add_wait_queue_exclusive_locked(&sem->wait, &wait);
@@ -113,6 +119,7 @@ fastcall int __sched __down_interruptible(struct semaphore * sem)
 		 * and exit.
 		 */
 		if (signal_pending(current)) {
+			trace_mark(sem_down_intr_fail, "%p", sem);
 			retval = -EINTR;
 			sem->sleepers = 0;
 			atomic_add(sleepers, &sem->count);
@@ -126,12 +133,14 @@ fastcall int __sched __down_interruptible(struct semaphore * sem)
 		 * still hoping to get the semaphore.
 		 */
 		if (!atomic_add_negative(sleepers - 1, &sem->count)) {
+			trace_mark(sem_down_intr_resume, "%p", sem);
 			sem->sleepers = 0;
 			break;
 		}
 		sem->sleepers = 1;	/* us - see -1 above */
 		spin_unlock_irqrestore(&sem->wait.lock, flags);
 
+		trace_mark(sem_down_intr_sched, "%p", sem);
 		schedule();
 
 		spin_lock_irqsave(&sem->wait.lock, flags);



-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-13 19:33 ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
@ 2007-11-15 21:06   ` Dave Hansen
  2007-11-15 21:51     ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-15 21:06 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm

> On Tue, 2007-11-13 at 14:33 -0500, Mathieu Desnoyers wrote:
>  linux-2.6-lttng/mm/page_io.c        2007-11-13 09:49:35.000000000 -0500
> @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
>                 rw |= (1 << BIO_RW_SYNC);
>         count_vm_event(PSWPOUT);
>         set_page_writeback(page);
> +       trace_mark(mm_swap_out, "address %p", page_address(page));
>         unlock_page(page);
>         submit_bio(rw, bio);
>  out:

I'm not sure all this page_address() stuff makes any sense on highmem
systems.  How about page_to_pfn()?

I also have to wonder if you should be hooking into count_vm_event() and
using those.  Could you give a high-level overview of exactly why you
need these hooks, and perhaps what you expect from future people adding
things to the VM?

-- Dave



* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-15 21:06   ` Dave Hansen
@ 2007-11-15 21:51     ` Mathieu Desnoyers
  2007-11-15 22:16       ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-15 21:51 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> > On Tue, 2007-11-13 at 14:33 -0500, Mathieu Desnoyers wrote:
> >  linux-2.6-lttng/mm/page_io.c        2007-11-13 09:49:35.000000000 -0500
> > @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
> >                 rw |= (1 << BIO_RW_SYNC);
> >         count_vm_event(PSWPOUT);
> >         set_page_writeback(page);
> > +       trace_mark(mm_swap_out, "address %p", page_address(page));
> >         unlock_page(page);
> >         submit_bio(rw, bio);
> >  out:
> 
> I'm not sure all this page_address() stuff makes any sense on highmem
> systems.  How about page_to_pfn()?
> 

Hrm, maybe both?

Knowing which page frame number has been swapped out is not always as
relevant as knowing the page's virtual address (when it has one). Saving
both the PFN and the page's virtual address could give us useful
information when the page is not mapped.

We have two possible approaches: either we save both the address and
the PFN in each event, so the trace contains everything directly, or we
instrument the kernel virtual address map/unmap operations and let the
trace analyzer figure out the mappings.

It is sometimes a big benefit traffic-wise to let the userspace tool
recreate the kernel structures from the traced information, but it
involves specialized treatment in the userspace tools. If we choose this
solution, we could simply save the PFN in the event, as you propose.
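
To make the second approach more concrete, here is a minimal userspace
sketch of what the trace analyzer would have to do. Everything in it is
hypothetical: the event kinds (SK_EV_MAP, SK_EV_UNMAP), the structure
layout and the sk_ names are invented for illustration and are not the
actual LTTng event format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_PFN 1024		/* hypothetical tiny machine, illustration only */

/* Hypothetical event kinds emitted by map/unmap instrumentation. */
enum sk_kind { SK_EV_MAP, SK_EV_UNMAP };

struct sk_event {
	enum sk_kind kind;
	unsigned long pfn;
	uintptr_t vaddr;	/* only meaningful for SK_EV_MAP */
};

/* Analyzer-side state: the virtual address each PFN is mapped at. */
struct sk_state {
	uintptr_t vaddr[SK_MAX_PFN];	/* 0 means "not mapped" */
};

/* Replay the first n events, in trace order, to rebuild the mappings. */
static void sk_replay(struct sk_state *st, const struct sk_event *ev, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(ev[i].pfn < SK_MAX_PFN);
		if (ev[i].kind == SK_EV_MAP)
			st->vaddr[ev[i].pfn] = ev[i].vaddr;
		else
			st->vaddr[ev[i].pfn] = 0;
	}
}

/* Resolve the PFN carried by a later event; returns 0 if not mapped. */
static uintptr_t sk_lookup(const struct sk_state *st, unsigned long pfn)
{
	assert(pfn < SK_MAX_PFN);
	return st->vaddr[pfn];
}
```

With such a table, a swap-out event carrying only a PFN can still be
shown with a virtual address by calling sk_lookup() at the event's
position in the trace. The price is the specialized analyzer code
mentioned above.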


> I also have to wonder if you should be hooking into count_vm_event() and
> using those.  Could you give a high-level overview of exactly why you
> need these hooks, and perhaps what you expect from future people adding
> things to the VM?
> 

Yep, I guess we could put useful markers beside the count_vm_event()
inline function calls.

High-level overview:

We currently have a "LTTng statedump", which iterates on the mappings of
all tasks at trace start time to dump them in the trace. We also
instrument memory allocation/free. We therefore have much of the
information needed to recreate the memory mappings in the kernel at any
point during the trace by "replaying" the trace.
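
As a rough illustration of "replaying", the sketch below recomputes how
many page blocks of each order are outstanding after a given number of
events. It is a userspace toy under stated assumptions: the event kinds
mirror the mm_page_alloc and mm_page_free markers from this patch, but
the sk_mm_ names, the structure and the SK_MAX_ORDER limit are made up
here, and a real analyzer would start from the statedump rather than
from zero.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_ORDER 11		/* assumption: orders 0..10 */

/* Hypothetical decoded forms of the mm_page_alloc/mm_page_free events. */
enum sk_mm_kind { SK_MM_PAGE_ALLOC, SK_MM_PAGE_FREE };

struct sk_mm_event {
	enum sk_mm_kind kind;
	unsigned int order;
};

/*
 * Replay the first `upto` events and store, per order, the net number of
 * blocks allocated minus freed. A count can go negative when a block
 * allocated before the trace started is freed during it.
 */
static void sk_mm_replay(const struct sk_mm_event *ev, size_t upto,
			 long live[SK_MAX_ORDER])
{
	size_t i;

	for (i = 0; i < SK_MAX_ORDER; i++)
		live[i] = 0;

	for (i = 0; i < upto; i++) {
		assert(ev[i].order < SK_MAX_ORDER);
		if (ev[i].kind == SK_MM_PAGE_ALLOC)
			live[ev[i].order]++;
		else
			live[ev[i].order]--;
	}
}
```

Calling sk_mm_replay() with different values of `upto` gives the state
at different points in the trace, which is the property the analysis
plugins would rely on.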

Having events that help us recreate it
- precisely
- efficiently
- with a level of generality that should not break "too much" between
  kernel versions

would be useful to us.

Then we could start creating plugins in our userspace trace analysis
tool to analyze fun stuff such as sources of memory fragmentation.

Then, eventually coupling that with performance counters, we could start
doing really fun things with cache misses...

It can also help you guys track down problems, by adding ad-hoc
instrumentation to the VM code when pinpointing their cause.
Martin Bligh did interesting things applying a tracer to the VM,
described in "Linux Kernel Debugging on Google-sized clusters" in the
OLS 2007 proceedings.

(https://ols2006.108.redhat.com/2007/Reprints/OLS2007-Proceedings-V1.pdf)

Mathieu

> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-15 21:51     ` Mathieu Desnoyers
@ 2007-11-15 22:16       ` Dave Hansen
  2007-11-16 14:30         ` Mathieu Desnoyers
  2007-11-16 14:47         ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
  0 siblings, 2 replies; 46+ messages in thread
From: Dave Hansen @ 2007-11-15 22:16 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Thu, 2007-11-15 at 16:51 -0500, Mathieu Desnoyers wrote:
> * Dave Hansen (haveblue@us.ibm.com) wrote:
> > > On Tue, 2007-11-13 at 14:33 -0500, Mathieu Desnoyers wrote:
> > >  linux-2.6-lttng/mm/page_io.c        2007-11-13 09:49:35.000000000 -0500
> > > @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
> > >                 rw |= (1 << BIO_RW_SYNC);
> > >         count_vm_event(PSWPOUT);
> > >         set_page_writeback(page);
> > > +       trace_mark(mm_swap_out, "address %p", page_address(page));
> > >         unlock_page(page);
> > >         submit_bio(rw, bio);
> > >  out:
> > 
> > I'm not sure all this page_address() stuff makes any sense on highmem
> > systems.  How about page_to_pfn()?
>
> Knowing which page frame number has been swapped out is not always as
> relevant as knowing the page's virtual address (when it has one). Saving
> both the PFN and the page's virtual address could give us useful
> information when the page is not mapped.

For most (all?) architectures, the PFN and the virtual address in the
kernel's linear mapping are interchangeable with pretty trivial
arithmetic.  All pages have a pfn, but not all have a virtual address.
Thus, I suggested using the pfn.  What kind of virtual addresses are you
talking about?
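
The arithmetic in question can be sketched in userspace as follows. The
constants are assumptions chosen to resemble a 32-bit layout with 4 KiB
pages, physical memory starting at 0 and 896 MiB of directly mapped
memory; the sk_ names are hypothetical and are not the kernel's own
helpers.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define SK_PAGE_OFFSET	UINT64_C(0xC0000000)	/* assumed start of linear map */
#define SK_LOWMEM_PFNS	UINT64_C(229376)	/* assumed 896 MiB of lowmem */

/* Every page has a PFN, but only lowmem PFNs have a linear-map address. */
static int sk_pfn_in_linear_map(uint64_t pfn)
{
	return pfn < SK_LOWMEM_PFNS;
}

/* PFN to linear-map virtual address; only valid for lowmem PFNs. */
static uint64_t sk_pfn_to_virt(uint64_t pfn)
{
	assert(sk_pfn_in_linear_map(pfn));
	return SK_PAGE_OFFSET + (pfn << SK_PAGE_SHIFT);
}

/* Linear-map virtual address back to PFN (the page offset is dropped). */
static uint64_t sk_virt_to_pfn(uint64_t vaddr)
{
	assert(vaddr >= SK_PAGE_OFFSET);
	return (vaddr - SK_PAGE_OFFSET) >> SK_PAGE_SHIFT;
}
```

In this model a highmem PFN fails sk_pfn_in_linear_map(), so there is
nothing meaningful to log as an address for it, while the PFN itself is
always available.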

-- Dave



* Re: [RFC 4/7] LTTng instrumentation kernel
  2007-11-13 19:33 ` [RFC 4/7] LTTng instrumentation kernel Mathieu Desnoyers
@ 2007-11-15 23:30   ` Mike Mason
  2007-11-15 23:54     ` Mike Mason
  2007-11-16  2:22     ` Mathieu Desnoyers
  0 siblings, 2 replies; 46+ messages in thread
From: Mike Mason @ 2007-11-15 23:30 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel

This patch uses _trace_mark in lockdep.c and printk.c.  I assume they should be trace_mark (no '_' prefix).

Mike Mason


Mathieu Desnoyers wrote:

Core kernel events.

*not* present in this patch because they are architecture-specific:
- syscall entry/exit
- traps
- kernel thread creation

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 include/linux/module.h |    1 +
 kernel/exit.c          |    5 +++++
 kernel/fork.c          |    4 ++++
 kernel/irq/handle.c    |    6 ++++++
 kernel/itimer.c        |   11 +++++++++++
 kernel/kthread.c       |    4 ++++
 kernel/lockdep.c       |   19 +++++++++++++++++++
 kernel/module.c        |   25 +++++++++++++++++++++++++
 kernel/printk.c        |   26 ++++++++++++++++++++++++++
 kernel/sched.c         |   11 +++++++++++
 kernel/signal.c        |    2 ++
 kernel/softirq.c       |   22 ++++++++++++++++++++++
 kernel/timer.c         |   12 +++++++++++-
 13 files changed, 147 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/kernel/irq/handle.c
===================================================================
--- linux-2.6-lttng.orig/kernel/irq/handle.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/irq/handle.c	2007-11-13 09:49:33.000000000 -0500
@@ -130,6 +130,10 @@ irqreturn_t handle_IRQ_event(unsigned in
 {
 	irqreturn_t ret, retval = IRQ_NONE;
 	unsigned int status = 0;
+	struct pt_regs *regs = get_irq_regs();
+
+	trace_mark(kernel_irq_entry, "irq_id %u kernel_mode %u", irq,
+		(regs)?(!user_mode(regs)):(1));

 	handle_dynamic_tick(action);

@@ -148,6 +152,8 @@ irqreturn_t handle_IRQ_event(unsigned in
 		add_interrupt_randomness(irq);
 	local_irq_disable();

+	trace_mark(kernel_irq_exit, MARK_NOARGS);
+
 	return retval;
 }

Index: linux-2.6-lttng/kernel/itimer.c
===================================================================
--- linux-2.6-lttng.orig/kernel/itimer.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/itimer.c	2007-11-13 09:49:33.000000000 -0500
@@ -132,6 +132,8 @@ enum hrtimer_restart it_real_fn(struct h
 	struct signal_struct *sig =
 		container_of(timer, struct signal_struct, real_timer);

+	trace_mark(kernel_timer_itimer_expired, "pid %d", sig->tsk->pid);
+
 	send_group_sig_info(SIGALRM, SEND_SIG_PRIV, sig->tsk);

 	return HRTIMER_NORESTART;
@@ -157,6 +159,15 @@ int do_setitimer(int which, struct itime
 	    !timeval_valid(&value->it_interval))
 		return -EINVAL;

+	trace_mark(kernel_timer_itimer_set,
+			"which %d interval_sec %ld interval_usec %ld "
+			"value_sec %ld value_usec %ld",
+			which,
+			value->it_interval.tv_sec,
+			value->it_interval.tv_usec,
+			value->it_value.tv_sec,
+			value->it_value.tv_usec);
+
 	switch (which) {
 	case ITIMER_REAL:
 again:
Index: linux-2.6-lttng/kernel/kthread.c
===================================================================
--- linux-2.6-lttng.orig/kernel/kthread.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/kthread.c	2007-11-13 09:49:33.000000000 -0500
@@ -195,6 +195,8 @@ int kthread_stop(struct task_struct *k)
 	/* It could exit after stop_info.k set, but before wake_up_process. */
 	get_task_struct(k);

+	trace_mark(kernel_kthread_stop, "pid %d", k->pid);
+
 	/* Must init completion *before* thread sees kthread_stop_info.k */
 	init_completion(&kthread_stop_info.done);
 	smp_wmb();
@@ -210,6 +212,8 @@ int kthread_stop(struct task_struct *k)
 	ret = kthread_stop_info.err;
 	mutex_unlock(&kthread_stop_lock);

+	trace_mark(kernel_kthread_stop_ret, "ret %d", ret);
+
 	return ret;
 }
 EXPORT_SYMBOL(kthread_stop);
Index: linux-2.6-lttng/kernel/lockdep.c
===================================================================
--- linux-2.6-lttng.orig/kernel/lockdep.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/lockdep.c	2007-11-13 09:49:33.000000000 -0500
@@ -2014,6 +2014,9 @@ void trace_hardirqs_on(void)
 	struct task_struct *curr = current;
 	unsigned long ip;

+	_trace_mark(locking_hardirqs_on, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;

@@ -2061,6 +2064,9 @@ void trace_hardirqs_off(void)
 {
 	struct task_struct *curr = current;

+	_trace_mark(locking_hardirqs_off, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks || current->lockdep_recursion))
 		return;

@@ -2088,6 +2094,9 @@ void trace_softirqs_on(unsigned long ip)
 {
 	struct task_struct *curr = current;

+	_trace_mark(locking_softirqs_on, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks))
 		return;

@@ -2122,6 +2131,9 @@ void trace_softirqs_off(unsigned long ip
 {
 	struct task_struct *curr = current;

+	_trace_mark(locking_softirqs_off, "ip #p%lu",
+		(unsigned long) __builtin_return_address(0));
+
 	if (unlikely(!debug_locks))
 		return;

@@ -2358,6 +2370,10 @@ static int __lock_acquire(struct lockdep
 	int chain_head = 0;
 	u64 chain_key;

+	_trace_mark(locking_lock_acquire,
+		"ip #p%lu subclass %u lock %p trylock %d",
+		ip, subclass, lock, trylock);
+
 	if (!prove_locking)
 		check = 1;

@@ -2631,6 +2647,9 @@ __lock_release(struct lockdep_map *lock,
 {
 	struct task_struct *curr = current;

+	_trace_mark(locking_lock_release, "ip #p%lu lock %p nested %d",
+		ip, lock, nested);
+
 	if (!check_unlock(curr, lock, ip))
 		return;

Index: linux-2.6-lttng/kernel/printk.c
===================================================================
--- linux-2.6-lttng.orig/kernel/printk.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/printk.c	2007-11-13 09:49:33.000000000 -0500
@@ -619,6 +619,7 @@ asmlinkage int printk(const char *fmt, .
 	int r;

 	va_start(args, fmt);
+	trace_mark(kernel_printk, "ip %p", __builtin_return_address(0));
 	r = vprintk(fmt, args);
 	va_end(args);

@@ -653,6 +654,31 @@ asmlinkage int vprintk(const char *fmt, 
 	/* Emit the output into the temporary buffer */
 	printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);

+	if (printed_len > 0) {
+		unsigned int loglevel;
+		int mark_len;
+		char *mark_buf;
+		char saved_char;
+
+		if (printk_buf[0] == '<' && printk_buf[1] >= '0' &&
+		   printk_buf[1] <= '7' && printk_buf[2] == '>') {
+			loglevel = printk_buf[1] - '0';
+			mark_buf = &printk_buf[3];
+			mark_len = printed_len - 3;
+		} else {
+			loglevel = default_message_loglevel;
+			mark_buf = printk_buf;
+			mark_len = printed_len;
+		}
+		if (mark_buf[mark_len - 1] == '\n')
+			mark_len--;
+		saved_char = mark_buf[mark_len];
+		mark_buf[mark_len] = '\0';
+		_trace_mark(kernel_vprintk, "loglevel %c string %s ip %p",
+			loglevel, mark_buf, __builtin_return_address(0));
+		mark_buf[mark_len] = saved_char;
+	}
+
 	/*
 	 * Copy the output into log_buf.  If the caller didn't provide
 	 * appropriate log level tags, we insert them here
Index: linux-2.6-lttng/kernel/sched.c
===================================================================
--- linux-2.6-lttng.orig/kernel/sched.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/sched.c	2007-11-13 09:49:33.000000000 -0500
@@ -1161,6 +1161,8 @@ void wait_task_inactive(struct task_stru
 		 * just go back and repeat.
 		 */
 		rq = task_rq_lock(p, &flags);
+		trace_mark(kernel_sched_wait_task, "pid %d state %ld",
+			p->pid, p->state);
 		running = task_running(rq, p);
 		on_rq = p->se.on_rq;
 		task_rq_unlock(rq, &flags);
@@ -1495,6 +1497,8 @@ static int try_to_wake_up(struct task_st
 #endif

 	rq = task_rq_lock(p, &flags);
+	trace_mark(kernel_sched_try_wakeup, "pid %d state %ld",
+		p->pid, p->state);
 	old_state = p->state;
 	if (!(old_state & state))
 		goto out;
@@ -1733,6 +1737,8 @@ void fastcall wake_up_new_task(struct ta
 	struct rq *rq;

 	rq = task_rq_lock(p, &flags);
+	trace_mark(kernel_sched_wakeup_new_task, "pid %d state %ld",
+		p->pid, p->state);
 	BUG_ON(p->state != TASK_RUNNING);
 	update_rq_clock(rq);

@@ -1911,6 +1917,9 @@ context_switch(struct rq *rq, struct tas
 	struct mm_struct *mm, *oldmm;

 	prepare_task_switch(rq, prev, next);
+	trace_mark(kernel_sched_schedule,
+		"prev_pid %d next_pid %d prev_state %ld",
+		prev->pid, next->pid, prev->state);
 	mm = next->mm;
 	oldmm = prev->active_mm;
 	/*
@@ -2139,6 +2148,8 @@ static void sched_migrate_task(struct ta
 	    || unlikely(cpu_is_offline(dest_cpu)))
 		goto out;

+	trace_mark(kernel_sched_migrate_task, "pid %d state %ld dest_cpu %d",
+		p->pid, p->state, dest_cpu);
 	/* force the process onto the specified CPU */
 	if (migrate_task(p, dest_cpu, &req)) {
 		/* Need to wait for migration thread (might exit: take ref). */
Index: linux-2.6-lttng/kernel/signal.c
===================================================================
--- linux-2.6-lttng.orig/kernel/signal.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/signal.c	2007-11-13 09:49:33.000000000 -0500
@@ -663,6 +663,8 @@ static int send_signal(int sig, struct s
 	struct sigqueue * q = NULL;
 	int ret = 0;

+	trace_mark(kernel_send_signal, "pid %d signal %d", t->pid, sig);
+
 	/*
 	 * Deliver the signal to listening signalfds. This must be called
 	 * with the sighand lock held.
Index: linux-2.6-lttng/kernel/softirq.c
===================================================================
--- linux-2.6-lttng.orig/kernel/softirq.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/softirq.c	2007-11-13 09:49:33.000000000 -0500
@@ -229,7 +229,15 @@ restart:

 	do {
 		if (pending & 1) {
+			trace_mark(kernel_softirq_entry, "softirq_id %lu",
+				((unsigned long)h
+					- (unsigned long)softirq_vec)
+					/ sizeof(*h));
 			h->action(h);
+			trace_mark(kernel_softirq_exit, "softirq_id %lu",
+				((unsigned long)h
+					- (unsigned long)softirq_vec)
+					/ sizeof(*h));
 			rcu_bh_qsctr_inc(cpu);
 		}
 		h++;
@@ -315,6 +323,8 @@ void irq_exit(void)
  */
 inline fastcall void raise_softirq_irqoff(unsigned int nr)
 {
+	trace_mark(kernel_softirq_raise, "softirq %u", nr);
+
 	__raise_softirq_irqoff(nr);

 	/*
@@ -400,7 +410,13 @@ static void tasklet_action(struct softir
 			if (!atomic_read(&t->count)) {
 				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
 					BUG();
+				trace_mark(kernel_tasklet_low_entry,
+						"func %p data %lu",
+						t->func, t->data);
 				t->func(t->data);
+				trace_mark(kernel_tasklet_low_exit,
+						"func %p data %lu",
+						t->func, t->data);
 				tasklet_unlock(t);
 				continue;
 			}
@@ -433,7 +449,13 @@ static void tasklet_hi_action(struct sof
 			if (!atomic_read(&t->count)) {
 				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
 					BUG();
+				trace_mark(kernel_tasklet_high_entry,
+						"func %p data %lu",
+						t->func, t->data);
 				t->func(t->data);
+				trace_mark(kernel_tasklet_high_exit,
+						"func %p data %lu",
+						t->func, t->data);
 				tasklet_unlock(t);
 				continue;
 			}
Index: linux-2.6-lttng/kernel/timer.c
===================================================================
--- linux-2.6-lttng.orig/kernel/timer.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/timer.c	2007-11-13 09:49:33.000000000 -0500
@@ -43,6 +43,7 @@
 #include <asm/div64.h>
 #include <asm/timex.h>
 #include <asm/io.h>
+#include <asm/irq_regs.h>

 u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;

@@ -290,6 +291,8 @@ static void internal_add_timer(tvec_base
 		i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
 		vec = base->tv5.vec + i;
 	}
+	trace_mark(kernel_timer_set, "expires %lu function %p data %lu",
+		expires, timer->function, timer->data);
 	/*
 	 * Timers are FIFO:
 	 */
@@ -931,6 +934,11 @@ void do_timer(unsigned long ticks)
 {
 	jiffies_64 += ticks;
 	update_times(ticks);
+	trace_mark(kernel_timer_update_time,
+		"jiffies #8u%llu xtime_sec %ld xtime_nsec %ld "
+		"walltomonotonic_sec %ld walltomonotonic_nsec %ld",
+		jiffies_64, xtime.tv_sec, xtime.tv_nsec,
+		wall_to_monotonic.tv_sec, wall_to_monotonic.tv_nsec);
 }

 #ifdef __ARCH_WANT_SYS_ALARM
@@ -1012,7 +1020,9 @@ asmlinkage long sys_getegid(void)

 static void process_timeout(unsigned long __data)
 {
-	wake_up_process((struct task_struct *)__data);
+	struct task_struct *task = (struct task_struct *)__data;
+	trace_mark(kernel_timer_timeout, "pid %d", task->pid);
+	wake_up_process(task);
 }

 /**
Index: linux-2.6-lttng/kernel/exit.c
===================================================================
--- linux-2.6-lttng.orig/kernel/exit.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/exit.c	2007-11-13 09:49:33.000000000 -0500
@@ -177,6 +177,7 @@ repeat:

 	write_unlock_irq(&tasklist_lock);
 	release_thread(p);
+	trace_mark(kernel_process_free, "pid %d", p->pid);
 	call_rcu(&p->rcu, delayed_put_task_struct);

 	p = leader;
@@ -994,6 +995,8 @@ fastcall NORET_TYPE void do_exit(long co

 	if (group_dead)
 		acct_process();
+	trace_mark(kernel_process_exit, "pid %d", tsk->pid);
+
 	exit_sem(tsk);
 	__exit_files(tsk);
 	__exit_fs(tsk);
@@ -1539,6 +1542,8 @@ static long do_wait(pid_t pid, int optio
 	int flag, retval;
 	int allowed, denied;

+	trace_mark(kernel_process_wait, "pid %d", pid);
+
 	add_wait_queue(&current->signal->wait_chldexit,&wait);
 repeat:
 	/*
Index: linux-2.6-lttng/kernel/fork.c
===================================================================
--- linux-2.6-lttng.orig/kernel/fork.c	2007-11-13 09:25:27.000000000 -0500
+++ linux-2.6-lttng/kernel/fork.c	2007-11-13 09:49:33.000000000 -0500
@@ -1435,6 +1435,10 @@ long do_fork(unsigned long clone_flags,
 	if (!IS_ERR(p)) {
 		struct completion vfork;

+		trace_mark(kernel_process_fork,
+			"parent_pid %d child_pid %d child_tgid %d",
+			current->pid, p->pid, p->tgid);
+
 		/*
 		 * this is enough to call pid_nr_ns here, but this if
 		 * improves optimisation of regular fork()
Index: linux-2.6-lttng/include/linux/module.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/module.h	2007-11-13 09:48:41.000000000 -0500
+++ linux-2.6-lttng/include/linux/module.h	2007-11-13 09:49:33.000000000 -0500
@@ -466,6 +466,7 @@ int register_module_notifier(struct noti
 int unregister_module_notifier(struct notifier_block * nb);

 extern void print_modules(void);
+extern void list_modules(void *call_data);

 extern void module_update_markers(void);

Index: linux-2.6-lttng/kernel/module.c
===================================================================
--- linux-2.6-lttng.orig/kernel/module.c	2007-11-13 09:49:16.000000000 -0500
+++ linux-2.6-lttng/kernel/module.c	2007-11-13 09:49:33.000000000 -0500
@@ -1294,6 +1294,8 @@ static int __unlink_module(void *_mod)
 /* Free a module, remove from lists, etc (must hold module_mutex). */
 static void free_module(struct module *mod)
 {
+	trace_mark(kernel_module_free, "name %s", mod->name);
+
 	/* Delete from various lists */
 	stop_machine_run(__unlink_module, mod, NR_CPUS);
 	remove_notes_attrs(mod);
@@ -2063,6 +2065,8 @@ static struct module *load_module(void _
 	/* Get rid of temporary copy */
 	vfree(hdr);

+	trace_mark(kernel_module_load, "name %s", mod->name);
+
 	/* Done! */
 	return mod;

@@ -2426,6 +2430,27 @@ const struct seq_operations modules_op =
 	.show	= m_show
 };

+void list_modules(void *call_data)
+{
+	/* Enumerate loaded modules */
+	struct list_head	*i;
+	struct module		*mod;
+	unsigned long refcount = 0;
+
+	mutex_lock(&module_mutex);
+	list_for_each(i, &modules) {
+		mod = list_entry(i, struct module, list);
+#ifdef CONFIG_MODULE_UNLOAD
+		refcount = local_read(&mod->ref[0].count);
+#endif
+		__trace_mark(0, list_module, call_data,
+				"name %s state %d refcount %lu",
+				mod->name, mod->state, refcount);
+	}
+	mutex_unlock(&module_mutex);
+}
+EXPORT_SYMBOL_GPL(list_modules);
+
 /* Given an address, look for it in the module exception tables. */
 const struct exception_table_entry *search_module_extables(unsigned long addr)
 {

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 4/7] LTTng instrumentation kernel
  2007-11-15 23:30   ` Mike Mason
@ 2007-11-15 23:54     ` Mike Mason
  2007-11-16  2:42       ` Mathieu Desnoyers
  2007-11-16  2:22     ` Mathieu Desnoyers
  1 sibling, 1 reply; 46+ messages in thread
From: Mike Mason @ 2007-11-15 23:54 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel

snip
> 
> +void list_modules(void *call_data)
> +{
> +    /* Enumerate loaded modules */
> +    struct list_head    *i;
> +    struct module        *mod;
> +    unsigned long refcount = 0;
> +
> +    mutex_lock(&module_mutex);
> +    list_for_each(i, &modules) {
> +        mod = list_entry(i, struct module, list);
> +#ifdef CONFIG_MODULE_UNLOAD
> +        refcount = local_read(&mod->ref[0].count);
> +#endif
> +        __trace_mark(0, list_module, call_data,
> +                "name %s state %d refcount %lu",
> +                mod->name, mod->state, refcount);
> +    }
> +    mutex_unlock(&module_mutex);
> +}
> +EXPORT_SYMBOL_GPL(list_modules);
> +
> /* Given an address, look for it in the module exception tables. */
> const struct exception_table_entry *search_module_extables(unsigned long addr)
> {

What is the purpose of list_modules() in this patch?  Seems outside the scope of the patches' intent.  I assume LTTng uses it for some purpose, but it's not required to use the markers added by the patch.

Also, if list_modules() remains, the 0 should be removed from "__trace_mark(0, ..." 

Mike Mason

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 4/7] LTTng instrumentation kernel
  2007-11-15 23:30   ` Mike Mason
  2007-11-15 23:54     ` Mike Mason
@ 2007-11-16  2:22     ` Mathieu Desnoyers
  1 sibling, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-16  2:22 UTC (permalink / raw)
  To: Mike Mason; +Cc: akpm, linux-kernel

* Mike Mason (mmlnx@us.ibm.com) wrote:
> This patch uses _trace_mark in lockdep.c and printk.c.  I assume they 
> should be trace_mark (no '_' prefix).
>

Since this patch follows the "markers with immediate values" patch, it
has to use the underscored version: because of the interrupt enable
instrumentation, the lockdep code can be called on the return path from
a trap (thus a breakpoint), which would cause a recursive trap.

The underscored version means "don't use the optimized version".

Mathieu


> Mike Mason
>
>
> Mathieu Desnoyers wrote:
>
> Core kernel events.
>
> *not* present in this patch because they are architecture specific :
> - syscall entry/exit
> - traps
> - kernel thread creation
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> ---
> include/linux/module.h |    1 +
> kernel/exit.c          |    5 +++++
> kernel/fork.c          |    4 ++++
> kernel/irq/handle.c    |    6 ++++++
> kernel/itimer.c        |   11 +++++++++++
> kernel/kthread.c       |    4 ++++
> kernel/lockdep.c       |   19 +++++++++++++++++++
> kernel/module.c        |   25 +++++++++++++++++++++++++
> kernel/printk.c        |   26 ++++++++++++++++++++++++++
> kernel/sched.c         |   11 +++++++++++
> kernel/signal.c        |    2 ++
> kernel/softirq.c       |   22 ++++++++++++++++++++++
> kernel/timer.c         |   12 +++++++++++-
> 13 files changed, 147 insertions(+), 1 deletion(-)
>
> Index: linux-2.6-lttng/kernel/irq/handle.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/irq/handle.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/irq/handle.c	2007-11-13 09:49:33.000000000 -0500
> @@ -130,6 +130,10 @@ irqreturn_t handle_IRQ_event(unsigned in
> {
> 	irqreturn_t ret, retval = IRQ_NONE;
> 	unsigned int status = 0;
> +	struct pt_regs *regs = get_irq_regs();
> +
> +	trace_mark(kernel_irq_entry, "irq_id %u kernel_mode %u", irq,
> +		(regs)?(!user_mode(regs)):(1));
>
> 	handle_dynamic_tick(action);
>
> @@ -148,6 +152,8 @@ irqreturn_t handle_IRQ_event(unsigned in
> 		add_interrupt_randomness(irq);
> 	local_irq_disable();
>
> +	trace_mark(kernel_irq_exit, MARK_NOARGS);
> +
> 	return retval;
> }
>
> Index: linux-2.6-lttng/kernel/itimer.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/itimer.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/itimer.c	2007-11-13 09:49:33.000000000 -0500
> @@ -132,6 +132,8 @@ enum hrtimer_restart it_real_fn(struct h
> 	struct signal_struct *sig =
> 		container_of(timer, struct signal_struct, real_timer);
>
> +	trace_mark(kernel_timer_itimer_expired, "pid %d", sig->tsk->pid);
> +
> 	send_group_sig_info(SIGALRM, SEND_SIG_PRIV, sig->tsk);
>
> 	return HRTIMER_NORESTART;
> @@ -157,6 +159,15 @@ int do_setitimer(int which, struct itime
> 	    !timeval_valid(&value->it_interval))
> 		return -EINVAL;
>
> +	trace_mark(kernel_timer_itimer_set,
> +			"which %d interval_sec %ld interval_usec %ld "
> +			"value_sec %ld value_usec %ld",
> +			which,
> +			value->it_interval.tv_sec,
> +			value->it_interval.tv_usec,
> +			value->it_value.tv_sec,
> +			value->it_value.tv_usec);
> +
> 	switch (which) {
> 	case ITIMER_REAL:
> again:
> Index: linux-2.6-lttng/kernel/kthread.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/kthread.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/kthread.c	2007-11-13 09:49:33.000000000 -0500
> @@ -195,6 +195,8 @@ int kthread_stop(struct task_struct *k)
> 	/* It could exit after stop_info.k set, but before wake_up_process. */
> 	get_task_struct(k);
>
> +	trace_mark(kernel_kthread_stop, "pid %d", k->pid);
> +
> 	/* Must init completion *before* thread sees kthread_stop_info.k */
> 	init_completion(&kthread_stop_info.done);
> 	smp_wmb();
> @@ -210,6 +212,8 @@ int kthread_stop(struct task_struct *k)
> 	ret = kthread_stop_info.err;
> 	mutex_unlock(&kthread_stop_lock);
>
> +	trace_mark(kernel_kthread_stop_ret, "ret %d", ret);
> +
> 	return ret;
> }
> EXPORT_SYMBOL(kthread_stop);
> Index: linux-2.6-lttng/kernel/lockdep.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/lockdep.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/lockdep.c	2007-11-13 09:49:33.000000000 -0500
> @@ -2014,6 +2014,9 @@ void trace_hardirqs_on(void)
> 	struct task_struct *curr = current;
> 	unsigned long ip;
>
> +	_trace_mark(locking_hardirqs_on, "ip #p%lu",
> +		(unsigned long) __builtin_return_address(0));
> +
> 	if (unlikely(!debug_locks || current->lockdep_recursion))
> 		return;
>
> @@ -2061,6 +2064,9 @@ void trace_hardirqs_off(void)
> {
> 	struct task_struct *curr = current;
>
> +	_trace_mark(locking_hardirqs_off, "ip #p%lu",
> +		(unsigned long) __builtin_return_address(0));
> +
> 	if (unlikely(!debug_locks || current->lockdep_recursion))
> 		return;
>
> @@ -2088,6 +2094,9 @@ void trace_softirqs_on(unsigned long ip)
> {
> 	struct task_struct *curr = current;
>
> +	_trace_mark(locking_softirqs_on, "ip #p%lu",
> +		(unsigned long) __builtin_return_address(0));
> +
> 	if (unlikely(!debug_locks))
> 		return;
>
> @@ -2122,6 +2131,9 @@ void trace_softirqs_off(unsigned long ip
> {
> 	struct task_struct *curr = current;
>
> +	_trace_mark(locking_softirqs_off, "ip #p%lu",
> +		(unsigned long) __builtin_return_address(0));
> +
> 	if (unlikely(!debug_locks))
> 		return;
>
> @@ -2358,6 +2370,10 @@ static int __lock_acquire(struct lockdep
> 	int chain_head = 0;
> 	u64 chain_key;
>
> +	_trace_mark(locking_lock_acquire,
> +		"ip #p%lu subclass %u lock %p trylock %d",
> +		ip, subclass, lock, trylock);
> +
> 	if (!prove_locking)
> 		check = 1;
>
> @@ -2631,6 +2647,9 @@ __lock_release(struct lockdep_map *lock,
> {
> 	struct task_struct *curr = current;
>
> +	_trace_mark(locking_lock_release, "ip #p%lu lock %p nested %d",
> +		ip, lock, nested);
> +
> 	if (!check_unlock(curr, lock, ip))
> 		return;
>
> Index: linux-2.6-lttng/kernel/printk.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/printk.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/printk.c	2007-11-13 09:49:33.000000000 -0500
> @@ -619,6 +619,7 @@ asmlinkage int printk(const char *fmt, .
> 	int r;
>
> 	va_start(args, fmt);
> +	trace_mark(kernel_printk, "ip %p", __builtin_return_address(0));
> 	r = vprintk(fmt, args);
> 	va_end(args);
>
> @@ -653,6 +654,31 @@ asmlinkage int vprintk(const char *fmt,
> 	/* Emit the output into the temporary buffer */
> 	printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);
>
> +	if (printed_len > 0) {
> +		unsigned int loglevel;
> +		int mark_len;
> +		char *mark_buf;
> +		char saved_char;
> +
> +		if (printk_buf[0] == '<' && printk_buf[1] >= '0' &&
> +		   printk_buf[1] <= '7' && printk_buf[2] == '>') {
> +			loglevel = printk_buf[1] - '0';
> +			mark_buf = &printk_buf[3];
> +			mark_len = printed_len - 3;
> +		} else {
> +			loglevel = default_message_loglevel;
> +			mark_buf = printk_buf;
> +			mark_len = printed_len;
> +		}
> +		if (mark_buf[mark_len - 1] == '\n')
> +			mark_len--;
> +		saved_char = mark_buf[mark_len];
> +		mark_buf[mark_len] = '\0';
> +		_trace_mark(kernel_vprintk, "loglevel %c string %s ip %p",
> +			loglevel, mark_buf, __builtin_return_address(0));
> +		mark_buf[mark_len] = saved_char;
> +	}
> +
> 	/*
> 	 * Copy the output into log_buf.  If the caller didn't provide
> 	 * appropriate log level tags, we insert them here
> Index: linux-2.6-lttng/kernel/sched.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/sched.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/sched.c	2007-11-13 09:49:33.000000000 -0500
> @@ -1161,6 +1161,8 @@ void wait_task_inactive(struct task_stru
> 		 * just go back and repeat.
> 		 */
> 		rq = task_rq_lock(p, &flags);
> +		trace_mark(kernel_sched_wait_task, "pid %d state %ld",
> +			p->pid, p->state);
> 		running = task_running(rq, p);
> 		on_rq = p->se.on_rq;
> 		task_rq_unlock(rq, &flags);
> @@ -1495,6 +1497,8 @@ static int try_to_wake_up(struct task_st
> #endif
>
> 	rq = task_rq_lock(p, &flags);
> +	trace_mark(kernel_sched_try_wakeup, "pid %d state %ld",
> +		p->pid, p->state);
> 	old_state = p->state;
> 	if (!(old_state & state))
> 		goto out;
> @@ -1733,6 +1737,8 @@ void fastcall wake_up_new_task(struct ta
> 	struct rq *rq;
>
> 	rq = task_rq_lock(p, &flags);
> +	trace_mark(kernel_sched_wakeup_new_task, "pid %d state %ld",
> +		p->pid, p->state);
> 	BUG_ON(p->state != TASK_RUNNING);
> 	update_rq_clock(rq);
>
> @@ -1911,6 +1917,9 @@ context_switch(struct rq *rq, struct tas
> 	struct mm_struct *mm, *oldmm;
>
> 	prepare_task_switch(rq, prev, next);
> +	trace_mark(kernel_sched_schedule,
> +		"prev_pid %d next_pid %d prev_state %ld",
> +		prev->pid, next->pid, prev->state);
> 	mm = next->mm;
> 	oldmm = prev->active_mm;
> 	/*
> @@ -2139,6 +2148,8 @@ static void sched_migrate_task(struct ta
> 	    || unlikely(cpu_is_offline(dest_cpu)))
> 		goto out;
>
> +	trace_mark(kernel_sched_migrate_task, "pid %d state %ld dest_cpu %d",
> +		p->pid, p->state, dest_cpu);
> 	/* force the process onto the specified CPU */
> 	if (migrate_task(p, dest_cpu, &req)) {
> 		/* Need to wait for migration thread (might exit: take ref). */
> Index: linux-2.6-lttng/kernel/signal.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/signal.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/signal.c	2007-11-13 09:49:33.000000000 -0500
> @@ -663,6 +663,8 @@ static int send_signal(int sig, struct s
> 	struct sigqueue * q = NULL;
> 	int ret = 0;
>
> +	trace_mark(kernel_send_signal, "pid %d signal %d", t->pid, sig);
> +
> 	/*
> 	 * Deliver the signal to listening signalfds. This must be called
> 	 * with the sighand lock held.
> Index: linux-2.6-lttng/kernel/softirq.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/softirq.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/softirq.c	2007-11-13 09:49:33.000000000 -0500
> @@ -229,7 +229,15 @@ restart:
>
> 	do {
> 		if (pending & 1) {
> +			trace_mark(kernel_softirq_entry, "softirq_id %lu",
> +				((unsigned long)h
> +					- (unsigned long)softirq_vec)
> +					/ sizeof(*h));
> 			h->action(h);
> +			trace_mark(kernel_softirq_exit, "softirq_id %lu",
> +				((unsigned long)h
> +					- (unsigned long)softirq_vec)
> +					/ sizeof(*h));
> 			rcu_bh_qsctr_inc(cpu);
> 		}
> 		h++;
> @@ -315,6 +323,8 @@ void irq_exit(void)
>  */
> inline fastcall void raise_softirq_irqoff(unsigned int nr)
> {
> +	trace_mark(kernel_softirq_raise, "softirq %u", nr);
> +
> 	__raise_softirq_irqoff(nr);
>
> 	/*
> @@ -400,7 +410,13 @@ static void tasklet_action(struct softir
> 			if (!atomic_read(&t->count)) {
> 				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> 					BUG();
> +				trace_mark(kernel_tasklet_low_entry,
> +						"func %p data %lu",
> +						t->func, t->data);
> 				t->func(t->data);
> +				trace_mark(kernel_tasklet_low_exit,
> +						"func %p data %lu",
> +						t->func, t->data);
> 				tasklet_unlock(t);
> 				continue;
> 			}
> @@ -433,7 +449,13 @@ static void tasklet_hi_action(struct sof
> 			if (!atomic_read(&t->count)) {
> 				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
> 					BUG();
> +				trace_mark(kernel_tasklet_high_entry,
> +						"func %p data %lu",
> +						t->func, t->data);
> 				t->func(t->data);
> +				trace_mark(kernel_tasklet_high_exit,
> +						"func %p data %lu",
> +						t->func, t->data);
> 				tasklet_unlock(t);
> 				continue;
> 			}
> Index: linux-2.6-lttng/kernel/timer.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/timer.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/timer.c	2007-11-13 09:49:33.000000000 -0500
> @@ -43,6 +43,7 @@
> #include <asm/div64.h>
> #include <asm/timex.h>
> #include <asm/io.h>
> +#include <asm/irq_regs.h>
>
> u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
>
> @@ -290,6 +291,8 @@ static void internal_add_timer(tvec_base
> 		i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
> 		vec = base->tv5.vec + i;
> 	}
> +	trace_mark(kernel_timer_set, "expires %lu function %p data %lu",
> +		expires, timer->function, timer->data);
> 	/*
> 	 * Timers are FIFO:
> 	 */
> @@ -931,6 +934,11 @@ void do_timer(unsigned long ticks)
> {
> 	jiffies_64 += ticks;
> 	update_times(ticks);
> +	trace_mark(kernel_timer_update_time,
> +		"jiffies #8u%llu xtime_sec %ld xtime_nsec %ld "
> +		"walltomonotonic_sec %ld walltomonotonic_nsec %ld",
> +		jiffies_64, xtime.tv_sec, xtime.tv_nsec,
> +		wall_to_monotonic.tv_sec, wall_to_monotonic.tv_nsec);
> }
>
> #ifdef __ARCH_WANT_SYS_ALARM
> @@ -1012,7 +1020,9 @@ asmlinkage long sys_getegid(void)
>
> static void process_timeout(unsigned long __data)
> {
> -	wake_up_process((struct task_struct *)__data);
> +	struct task_struct *task = (struct task_struct *)__data;
> +	trace_mark(kernel_timer_timeout, "pid %d", task->pid);
> +	wake_up_process(task);
> }
>
> /**
> Index: linux-2.6-lttng/kernel/exit.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/exit.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/exit.c	2007-11-13 09:49:33.000000000 -0500
> @@ -177,6 +177,7 @@ repeat:
>
> 	write_unlock_irq(&tasklist_lock);
> 	release_thread(p);
> +	trace_mark(kernel_process_free, "pid %d", p->pid);
> 	call_rcu(&p->rcu, delayed_put_task_struct);
>
> 	p = leader;
> @@ -994,6 +995,8 @@ fastcall NORET_TYPE void do_exit(long co
>
> 	if (group_dead)
> 		acct_process();
> +	trace_mark(kernel_process_exit, "pid %d", tsk->pid);
> +
> 	exit_sem(tsk);
> 	__exit_files(tsk);
> 	__exit_fs(tsk);
> @@ -1539,6 +1542,8 @@ static long do_wait(pid_t pid, int optio
> 	int flag, retval;
> 	int allowed, denied;
>
> +	trace_mark(kernel_process_wait, "pid %d", pid);
> +
> 	add_wait_queue(&current->signal->wait_chldexit,&wait);
> repeat:
> 	/*
> Index: linux-2.6-lttng/kernel/fork.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/fork.c	2007-11-13 09:25:27.000000000 -0500
> +++ linux-2.6-lttng/kernel/fork.c	2007-11-13 09:49:33.000000000 -0500
> @@ -1435,6 +1435,10 @@ long do_fork(unsigned long clone_flags,
> 	if (!IS_ERR(p)) {
> 		struct completion vfork;
>
> +		trace_mark(kernel_process_fork,
> +			"parent_pid %d child_pid %d child_tgid %d",
> +			current->pid, p->pid, p->tgid);
> +
> 		/*
> 		 * this is enough to call pid_nr_ns here, but this if
> 		 * improves optimisation of regular fork()
> Index: linux-2.6-lttng/include/linux/module.h
> ===================================================================
> --- linux-2.6-lttng.orig/include/linux/module.h	2007-11-13 09:48:41.000000000 -0500
> +++ linux-2.6-lttng/include/linux/module.h	2007-11-13 09:49:33.000000000 -0500
> @@ -466,6 +466,7 @@ int register_module_notifier(struct noti
> int unregister_module_notifier(struct notifier_block * nb);
>
> extern void print_modules(void);
> +extern void list_modules(void *call_data);
>
> extern void module_update_markers(void);
>
> Index: linux-2.6-lttng/kernel/module.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/module.c	2007-11-13 09:49:16.000000000 -0500
> +++ linux-2.6-lttng/kernel/module.c	2007-11-13 09:49:33.000000000 -0500
> @@ -1294,6 +1294,8 @@ static int __unlink_module(void *_mod)
> /* Free a module, remove from lists, etc (must hold module_mutex). */
> static void free_module(struct module *mod)
> {
> +	trace_mark(kernel_module_free, "name %s", mod->name);
> +
> 	/* Delete from various lists */
> 	stop_machine_run(__unlink_module, mod, NR_CPUS);
> 	remove_notes_attrs(mod);
> @@ -2063,6 +2065,8 @@ static struct module *load_module(void _
> 	/* Get rid of temporary copy */
> 	vfree(hdr);
>
> +	trace_mark(kernel_module_load, "name %s", mod->name);
> +
> 	/* Done! */
> 	return mod;
>
> @@ -2426,6 +2430,27 @@ const struct seq_operations modules_op =
> 	.show	= m_show
> };
>
> +void list_modules(void *call_data)
> +{
> +	/* Enumerate loaded modules */
> +	struct list_head	*i;
> +	struct module		*mod;
> +	unsigned long refcount = 0;
> +
> +	mutex_lock(&module_mutex);
> +	list_for_each(i, &modules) {
> +		mod = list_entry(i, struct module, list);
> +#ifdef CONFIG_MODULE_UNLOAD
> +		refcount = local_read(&mod->ref[0].count);
> +#endif
> +		__trace_mark(0, list_module, call_data,
> +				"name %s state %d refcount %lu",
> +				mod->name, mod->state, refcount);
> +	}
> +	mutex_unlock(&module_mutex);
> +}
> +EXPORT_SYMBOL_GPL(list_modules);
> +
> /* Given an address, look for it in the module exception tables. */
> const struct exception_table_entry *search_module_extables(unsigned long addr)
> {
>

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 4/7] LTTng instrumentation kernel
  2007-11-15 23:54     ` Mike Mason
@ 2007-11-16  2:42       ` Mathieu Desnoyers
  0 siblings, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-16  2:42 UTC (permalink / raw)
  To: Mike Mason; +Cc: akpm, linux-kernel

* Mike Mason (mmlnx@us.ibm.com) wrote:
> snip
>> +void list_modules(void *call_data)
>> +{
>> +    /* Enumerate loaded modules */
>> +    struct list_head    *i;
>> +    struct module        *mod;
>> +    unsigned long refcount = 0;
>> +
>> +    mutex_lock(&module_mutex);
>> +    list_for_each(i, &modules) {
>> +        mod = list_entry(i, struct module, list);
>> +#ifdef CONFIG_MODULE_UNLOAD
>> +        refcount = local_read(&mod->ref[0].count);
>> +#endif
>> +        __trace_mark(0, list_module, call_data,
>> +                "name %s state %d refcount %lu",
>> +                mod->name, mod->state, refcount);
>> +    }
>> +    mutex_unlock(&module_mutex);
>> +}
>> +EXPORT_SYMBOL_GPL(list_modules);
>> +
>> /* Given an address, look for it in the module exception tables. */
>> const struct exception_table_entry *search_module_extables(unsigned long addr)
>> {
>
> What is the purpose of list_modules() in this patch?  Seems outside the 
> scope of the patches' intent.  I assume LTTng uses it for some purpose, but 
> it's not required to use the markers added by the patch.
>

Right, I should move it down in my patchset.

> Also, if list_modules() remains, the 0 should be removed from 
> "__trace_mark(0, ..." 
> Mike Mason

With the markers based on immediate values, the 0 selects the optimized
(non-generic) marker. I use __trace_mark directly so that I can pass
the call_data argument.
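
To illustrate the relationship between the three entry points, here is a
minimal userspace sketch. It is not the kernel's marker.h: the sketch_*
names, the log buffer and the argument values are all made up for the
example, and the only things taken from this thread are the meaning of
the first argument (0 = optimized, non-zero = generic) and the presence
of a call_data argument.

```c
#include <stdio.h>

/* Hypothetical log buffer standing in for a probe's output. */
static char sketch_log[128];

/*
 * Stand-in for __trace_mark(generic, name, call_data, format, ...).
 * generic == 0 selects the optimized path, generic != 0 the generic one;
 * call_data is an opaque pointer handed through to the probe.
 * Uses the GNU ##__VA_ARGS__ extension (gcc, clang).
 */
#define sketch_trace_mark_raw(generic, name, call_data, format, ...)     \
	snprintf(sketch_log, sizeof(sketch_log),                          \
		 "%s generic=%d private=%s " format, #name, (generic),    \
		 (call_data) ? "yes" : "no", ##__VA_ARGS__)

/* Stand-in for trace_mark(): optimized, no private data. */
#define sketch_trace_mark(name, format, ...) \
	sketch_trace_mark_raw(0, name, NULL, format, ##__VA_ARGS__)

/* Stand-in for _trace_mark(): forces the generic (non-optimized) path. */
#define sketch_trace_mark_generic(name, format, ...) \
	sketch_trace_mark_raw(1, name, NULL, format, ##__VA_ARGS__)

/* Mirrors the list_modules() call site: raw form, so call_data can be passed. */
static const char *sketch_list_module(void *call_data)
{
	sketch_trace_mark_raw(0, list_module, call_data,
			      "name %s state %d refcount %lu",
			      "ext3", 0, 2UL);
	return sketch_log;
}
```

Under these assumptions, the two wrapper macros only fix the first two
arguments, which is why a caller that needs call_data has to go through
the raw form.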

Thanks for the review,

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-15 22:16       ` Dave Hansen
@ 2007-11-16 14:30         ` Mathieu Desnoyers
  2007-11-19 18:04           ` Dave Hansen
  2007-11-16 14:47         ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
  1 sibling, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-16 14:30 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Thu, 2007-11-15 at 16:51 -0500, Mathieu Desnoyers wrote:
> > * Dave Hansen (haveblue@us.ibm.com) wrote:
> > > > On Tue, 2007-11-13 at 14:33 -0500, Mathieu Desnoyers wrote:
> > > >  linux-2.6-lttng/mm/page_io.c        2007-11-13 09:49:35.000000000 -0500
> > > > @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
> > > >                 rw |= (1 << BIO_RW_SYNC);
> > > >         count_vm_event(PSWPOUT);
> > > >         set_page_writeback(page);
> > > > +       trace_mark(mm_swap_out, "address %p", page_address(page));
> > > >         unlock_page(page);
> > > >         submit_bio(rw, bio);
> > > >  out:
> > > 
> > > I'm not sure all this page_address() stuff makes any sense on highmem
> > > systems.  How about page_to_pfn()?
> >
> > Knowing which page frame number has been swapped out is not always as
> > relevant as knowing the page's virtual address (when it has one). Saving
> > both the PFN and the page's virtual address could give us useful
> > information when the page is not mapped.
> 
> For most (all?) architectures, the PFN and the virtual address in the
> kernel's linear are interchangeable with pretty trivial arithmetic.  All
> pages have a pfn, but not all have a virtual address.  Thus, I suggested
> using the pfn.  What kind of virtual addresses are you talking about?
> 

Hum, the mappings I was referring to are the virtual memory mappings of
all processes, which are not at all what interests us here.

Let's use the PFN then.

I see that the standard macro to get the kernel address from a pfn is :

asm-x86/page_32.h:#define pfn_to_kaddr(pfn)      __va((pfn) << PAGE_SHIFT)

The question might seem trivial, but I wonder how this deals with large
pages ?

Mathieu


> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-15 22:16       ` Dave Hansen
  2007-11-16 14:30         ` Mathieu Desnoyers
@ 2007-11-16 14:47         ` Mathieu Desnoyers
  2007-11-19 18:07           ` Dave Hansen
  1 sibling, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-16 14:47 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Thu, 2007-11-15 at 16:51 -0500, Mathieu Desnoyers wrote:
> > * Dave Hansen (haveblue@us.ibm.com) wrote:
> > > > On Tue, 2007-11-13 at 14:33 -0500, Mathieu Desnoyers wrote:
> > > >  linux-2.6-lttng/mm/page_io.c        2007-11-13 09:49:35.000000000 -0500
> > > > @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
> > > >                 rw |= (1 << BIO_RW_SYNC);
> > > >         count_vm_event(PSWPOUT);
> > > >         set_page_writeback(page);
> > > > +       trace_mark(mm_swap_out, "address %p", page_address(page));
> > > >         unlock_page(page);
> > > >         submit_bio(rw, bio);
> > > >  out:
> > > 
> > > I'm not sure all this page_address() stuff makes any sense on highmem
> > > systems.  How about page_to_pfn()?
> >
> > Knowing which page frame number has been swapped out is not always as
> > relevant as knowing the page's virtual address (when it has one). Saving
> > both the PFN and the page's virtual address could give us useful
> > information when the page is not mapped.
> 
> For most (all?) architectures, the PFN and the virtual address in the
> kernel's linear are interchangeable with pretty trivial arithmetic.  All
> pages have a pfn, but not all have a virtual address.  Thus, I suggested
> using the pfn.  What kind of virtual addresses are you talking about?
> 

Hrm, in asm-generic/memory_model.h, we have various versions of
__page_to_pfn. Normally they all cast the result to (unsigned long),
except for :


#elif defined(CONFIG_SPARSEMEM_VMEMMAP)

/* memmap is virtually contigious.  */
#define __pfn_to_page(pfn)      (vmemmap + (pfn))
#define __page_to_pfn(page)     ((page) - vmemmap)

So I guess the result is a pointer ? Should this be expected ?

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-16 14:30         ` Mathieu Desnoyers
@ 2007-11-19 18:04           ` Dave Hansen
  2007-11-28 14:09             ` [RFC PATCH] LTTng instrumentation mm (using page_to_pfn) Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 18:04 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Fri, 2007-11-16 at 09:30 -0500, Mathieu Desnoyers wrote:
> I see that the standard macro to get the kernel address from a pfn is :
> 
> asm-x86/page_32.h:#define pfn_to_kaddr(pfn)      __va((pfn) << PAGE_SHIFT)
> 
> The question might seem trivial, but I wonder how this deals with large
> pages ?

Well, first of all, large pages are a virtual addressing concept.  We're
only talking about physical addresses here.  You still address the
memory the same way no matter if it is composed of large or small pages.
The physical address (and pfn) never change no matter what we do with
the page or how we allocate it.
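
As a rough illustration of that arithmetic, here is a small userspace
sketch. The sketch_* names are made up for the example, and it assumes
4 KiB base pages (a page shift of 12); it does not use any kernel header.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Physical address of the first byte of a page frame. */
static uint64_t sketch_pfn_to_phys(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Page frame number that contains a given physical address. */
static uint64_t sketch_phys_to_pfn(uint64_t phys)
{
	return phys >> SKETCH_PAGE_SHIFT;
}

/* Number of base page frames covered by a region of 'size' bytes. */
static uint64_t sketch_frames_spanned(uint64_t size)
{
	return size >> SKETCH_PAGE_SHIFT;
}
```

With these assumptions, a 4 MiB large page is simply 1024 consecutive
pfns, and the pfn of any byte inside it is computed exactly as for a
small page.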

-- Dave


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-16 14:47         ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
@ 2007-11-19 18:07           ` Dave Hansen
  2007-11-19 18:52             ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 18:07 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Fri, 2007-11-16 at 09:47 -0500, Mathieu Desnoyers wrote:
> * Dave Hansen (haveblue@us.ibm.com) wrote:
> > For most (all?) architectures, the PFN and the virtual address in the
> > kernel's linear are interchangeable with pretty trivial arithmetic.  All
> > pages have a pfn, but not all have a virtual address.  Thus, I suggested
> > using the pfn.  What kind of virtual addresses are you talking about?
> > 
> 
> Hrm, in asm-generic/memory_model.h, we have various versions of
> __page_to_pfn. Normally they all cast the result to (unsigned long),
> except for :
> 
> 
> #elif defined(CONFIG_SPARSEMEM_VMEMMAP)
> 
> /* memmap is virtually contigious.  */
> #define __pfn_to_page(pfn)      (vmemmap + (pfn))
> #define __page_to_pfn(page)     ((page) - vmemmap)
> 
> So I guess the result is a pointer ? Should this be expected ?

Nope.  'pointer - pointer' is an integer.  Just solve this equation for
integer:

	'pointer + integer = pointer'
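
A small userspace sketch of the same equation, for illustration only.
struct page here is a stand-in, fake_vmemmap is a hypothetical array playing
the role of vmemmap, and the sketch_* helpers merely mirror the shape of the
two macros quoted above.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct page; the real layout is irrelevant here. */
struct page { unsigned long flags; };

/* Hypothetical userspace analogue of the virtually contiguous memmap. */
static struct page fake_vmemmap[16];

/* Same shape as __pfn_to_page(): pointer + integer = pointer. */
static struct page *sketch_pfn_to_page(ptrdiff_t pfn)
{
	return fake_vmemmap + pfn;
}

/* Same shape as __page_to_pfn(): pointer - pointer = integer (ptrdiff_t). */
static ptrdiff_t sketch_page_to_pfn(const struct page *page)
{
	return page - fake_vmemmap;
}
```

The subtraction yields a count of elements, not bytes, and its type is the
signed integer type ptrdiff_t.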

-- Dave



* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-19 18:07           ` Dave Hansen
@ 2007-11-19 18:52             ` Mathieu Desnoyers
  2007-11-19 19:00               ` Mathieu Desnoyers
  2007-11-19 19:43               ` Dave Hansen
  0 siblings, 2 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-19 18:52 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Fri, 2007-11-16 at 09:47 -0500, Mathieu Desnoyers wrote:
> > * Dave Hansen (haveblue@us.ibm.com) wrote:
> > > For most (all?) architectures, the PFN and the virtual address in the
> > > kernel's linear are interchangeable with pretty trivial arithmetic.  All
> > > pages have a pfn, but not all have a virtual address.  Thus, I suggested
> > > using the pfn.  What kind of virtual addresses are you talking about?
> > > 
> > 
> > Hrm, in asm-generic/memory_model.h, we have various versions of
> > __page_to_pfn. Normally they all cast the result to (unsigned long),
> > except for :
> > 
> > 
> > #elif defined(CONFIG_SPARSEMEM_VMEMMAP)
> > 
> > /* memmap is virtually contigious.  */
> > #define __pfn_to_page(pfn)      (vmemmap + (pfn))
> > #define __page_to_pfn(page)     ((page) - vmemmap)
> > 
> > So I guess the result is a pointer ? Should this be expected ?
> 
> Nope.  'pointer - pointer' is an integer.  Just solve this equation for
> integer:
> 
> 	'pointer + integer = pointer'
> 

Well, using page_to_pfn turns out to be ugly in markers (and in
printks) then. Depending on the architecture, it will result in either
an unsigned long (x86_64) or an unsigned int (i386), which corresponds
to %lu or %u and will print a warning if we don't cast it explicitly.

Mathieu


> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-19 18:52             ` Mathieu Desnoyers
@ 2007-11-19 19:00               ` Mathieu Desnoyers
  2007-11-19 19:43                 ` Dave Hansen
  2007-11-19 19:43               ` Dave Hansen
  1 sibling, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-19 19:00 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:
> * Dave Hansen (haveblue@us.ibm.com) wrote:
> > On Fri, 2007-11-16 at 09:47 -0500, Mathieu Desnoyers wrote:
> > > * Dave Hansen (haveblue@us.ibm.com) wrote:
> > > > For most (all?) architectures, the PFN and the virtual address in the
> > > > kernel's linear are interchangeable with pretty trivial arithmetic.  All
> > > > pages have a pfn, but not all have a virtual address.  Thus, I suggested
> > > > using the pfn.  What kind of virtual addresses are you talking about?
> > > > 
> > > 
> > > Hrm, in asm-generic/memory_model.h, we have various versions of
> > > __page_to_pfn. Normally they all cast the result to (unsigned long),
> > > except for :
> > > 
> > > 
> > > #elif defined(CONFIG_SPARSEMEM_VMEMMAP)
> > > 
> > > /* memmap is virtually contigious.  */
> > > #define __pfn_to_page(pfn)      (vmemmap + (pfn))
> > > #define __page_to_pfn(page)     ((page) - vmemmap)
> > > 
> > > So I guess the result is a pointer ? Should this be expected ?
> > 
> > Nope.  'pointer - pointer' is an integer.  Just solve this equation for
> > integer:
> > 
> > 	'pointer + integer = pointer'
> > 
> 
> Well, using page_to_pfn turns out to be ugly in markers (and in
> printks) then. Depending on the architecture, it will result in either
> an unsigned long (x86_64) or an unsigned int (i386), which corresponds

Well, it's signed long and signed int, but the point is still valid.

> to %lu or %u and will print a warning if we don't cast it explicitly.
> 
> Mathieu
> 
> 
> > -- Dave
> > 
> 
> -- 
> Mathieu Desnoyers
> Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-19 18:52             ` Mathieu Desnoyers
  2007-11-19 19:00               ` Mathieu Desnoyers
@ 2007-11-19 19:43               ` Dave Hansen
  2007-11-19 19:52                 ` [PATCH] Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM Mathieu Desnoyers
  1 sibling, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 19:43 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Mon, 2007-11-19 at 13:52 -0500, Mathieu Desnoyers wrote:
> > > So I guess the result is a pointer ? Should this be expected ?
> > 
> > Nope.  'pointer - pointer' is an integer.  Just solve this equation for
> > integer:
> > 
> >       'pointer + integer = pointer'
> > 
> 
> Well, using page_to_pfn turns out to be ugly in markers (and in
> printks) then. Depending on the architecture, it will result in either
> an unsigned long (x86_64) or an unsigned int (i386), which corresponds
> to %lu or %u and will print a warning if we don't cast it explicitly. 

Casting the i386 one to be an unconditional 'unsigned long' shouldn't be
an issue.  We don't generally expect pfns to fit into ints anyway. 
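
As a sketch of what the unconditional cast buys at the call site: the
userspace helper below (format_pfn() is a made-up name, and snprintf stands
in for printk or a marker) uses a single "%lu" format string regardless of
whether the pointer difference is int-sized or long-sized on the target.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a pointer difference with one format string on every architecture. */
static int format_pfn(char *buf, size_t len, ptrdiff_t diff)
{
	/* The explicit cast makes the argument match %lu on 32-bit and 64-bit. */
	return snprintf(buf, len, "pfn %lu", (unsigned long)diff);
}
```

Without the cast, the argument type would follow the platform's ptrdiff_t,
and a fixed "%lu" would draw a format warning wherever that type is not long.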

-- Dave



* Re: [RFC 5/7] LTTng instrumentation mm
  2007-11-19 19:00               ` Mathieu Desnoyers
@ 2007-11-19 19:43                 ` Dave Hansen
  0 siblings, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 19:43 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Mon, 2007-11-19 at 14:00 -0500, Mathieu Desnoyers wrote:
> > Well, using page_to_pfn turns out to be ugly in markers (and in
> > printks) then. Depending on the architecture, it will result in either
> > an unsigned long (x86_64) or an unsigned int (i386), which corresponds
> 
> Well, it's signed long and signed int, but the point is still valid. 

The result of page_to_pfn() may end up being signed in practice, but it
never needs to be.  Just cast it to an unsigned long and make it
consistent everywhere.  

-- Dave



* [PATCH] Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 19:43               ` Dave Hansen
@ 2007-11-19 19:52                 ` Mathieu Desnoyers
  2007-11-19 20:09                   ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-19 19:52 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Mon, 2007-11-19 at 13:52 -0500, Mathieu Desnoyers wrote:
> > > > So I guess the result is a pointer ? Should this be expected ?
> > > 
> > > Nope.  'pointer - pointer' is an integer.  Just solve this equation for
> > > integer:
> > > 
> > >       'pointer + integer = pointer'
> > > 
> > 
> > Well, using page_to_pfn turns out to be ugly in markers (and in
> > printks) then. Depending on the architecture, it will result in either
> > an unsigned long (x86_64) or an unsigned int (i386), which corresponds
> > to %lu or %u and will print a warning if we don't cast it explicitly. 
> 
> Casting the i386 one to be an unconditional 'unsigned long' shouldn't be
> an issue.  We don't generally expect pfns to fit into ints anyway. 

So would this make sense ?

Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM

Make sure the type returned by __page_to_pfn is always unsigned long. If we
don't cast it explicitly, it can be int on i386, but long on x86_64. This is
especially inelegant for printks.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Dave Hansen <haveblue@us.ibm.com>
CC: linux-mm@kvack.org
CC: linux-kernel@vger.kernel.org
---
 include/asm-generic/memory_model.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-generic/memory_model.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-generic/memory_model.h	2007-11-19 14:47:30.000000000 -0500
+++ linux-2.6-lttng/include/asm-generic/memory_model.h	2007-11-19 14:48:30.000000000 -0500
@@ -50,7 +50,7 @@
 
 /* memmap is virtually contigious.  */
 #define __pfn_to_page(pfn)	(vmemmap + (pfn))
-#define __page_to_pfn(page)	((page) - vmemmap)
+#define __page_to_pfn(page)	((unsigned long)((page) - vmemmap))
 
 #elif defined(CONFIG_SPARSEMEM)
 /*

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 19:52                 ` [PATCH] Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM Mathieu Desnoyers
@ 2007-11-19 20:09                   ` Dave Hansen
  2007-11-19 20:20                     ` [PATCH] Cast page_to_pfn " Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 20:09 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Mon, 2007-11-19 at 14:52 -0500, Mathieu Desnoyers wrote:
> * Dave Hansen (haveblue@us.ibm.com) wrote:
> > On Mon, 2007-11-19 at 13:52 -0500, Mathieu Desnoyers wrote:
> > > > > So I guess the result is a pointer ? Should this be expected ?
> > > > 
> > > > Nope.  'pointer - pointer' is an integer.  Just solve this equation for
> > > > integer:
> > > > 
> > > >       'pointer + integer = pointer'
> > > > 
> > > 
> > > Well, using page_to_pfn turns out to be ugly in markers (and in
> > > printks) then. Depending on the architecture, it will result in either
> > > an unsigned long (x86_64) or an unsigned int (i386), which corresponds
> > > to %lu or %u and will print a warning if we don't cast it explicitly. 
> > 
> > Casting the i386 one to be an unconditional 'unsigned long' shouldn't be
> > an issue.  We don't generally expect pfns to fit into ints anyway. 
> 
> So would this make sense ?
> 
> Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM
> 
> Make sure the type returned by __page_to_pfn is always unsigned long. If we
> don't cast it explicitly, it can be int on i386, but long on x86_64. This is
> especially inelegant for printks.

The only thing I might suggest doing differently is actually using the
page_to_pfn() definition itself:

memory_model.h:#define page_to_pfn __page_to_pfn

The full inline function version should do this already, and we
shouldn't have any real direct __page_to_pfn() users anyway.    

-- Dave



* Re: [PATCH] Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 20:09                   ` Dave Hansen
@ 2007-11-19 20:20                     ` Mathieu Desnoyers
  2007-11-19 21:08                       ` Andrew Morton
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-19 20:20 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> The only thing I might suggest doing differently is actually using the
> page_to_pfn() definition itself:
> 
> memory_model.h:#define page_to_pfn __page_to_pfn
> 
> The full inline function version should do this already, and we
> shouldn't have any real direct __page_to_pfn() users anyway.    
> 

Like this then..

Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM

Make sure the type returned by page_to_pfn is always unsigned long. If we
don't cast it explicitly, it can be int on i386, but long on x86_64. This is
especially inelegant for printks.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Dave Hansen <haveblue@us.ibm.com>
CC: linux-mm@kvack.org
CC: linux-kernel@vger.kernel.org
---
 include/asm-generic/memory_model.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-generic/memory_model.h
===================================================================
--- linux-2.6-lttng.orig/include/asm-generic/memory_model.h	2007-11-19 15:06:40.000000000 -0500
+++ linux-2.6-lttng/include/asm-generic/memory_model.h	2007-11-19 15:18:57.000000000 -0500
@@ -76,7 +76,7 @@ struct page;
 extern struct page *pfn_to_page(unsigned long pfn);
 extern unsigned long page_to_pfn(struct page *page);
 #else
-#define page_to_pfn __page_to_pfn
+#define page_to_pfn ((unsigned long)__page_to_pfn)
 #define pfn_to_page __pfn_to_page
 #endif /* CONFIG_OUT_OF_LINE_PFN_TO_PAGE */
 



-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 20:20                     ` [PATCH] Cast page_to_pfn " Mathieu Desnoyers
@ 2007-11-19 21:08                       ` Andrew Morton
  2007-11-19 21:19                         ` Dave Hansen
  2007-11-20 17:34                         ` Mathieu Desnoyers
  0 siblings, 2 replies; 46+ messages in thread
From: Andrew Morton @ 2007-11-19 21:08 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: haveblue, linux-kernel, linux-mm, mbligh

On Mon, 19 Nov 2007 15:20:23 -0500
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> * Dave Hansen (haveblue@us.ibm.com) wrote:
> > The only thing I might suggest doing differently is actually using the
> > page_to_pfn() definition itself:
> > 
> > memory_model.h:#define page_to_pfn __page_to_pfn
> > 
> > The full inline function version should do this already, and we
> > shouldn't have any real direct __page_to_pfn() users anyway.    
> > 
> 
> Like this then..
> 
> Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
> 
> Make sure the type returned by page_to_pfn is always unsigned long. If we
> don't cast it explicitly, it can be int on i386, but long on x86_64.

formally ptrdiff_t, I believe.

> This is
> especially inelegant for printks.
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> CC: Dave Hansen <haveblue@us.ibm.com>
> CC: linux-mm@kvack.org
> CC: linux-kernel@vger.kernel.org
> ---
>  include/asm-generic/memory_model.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-2.6-lttng/include/asm-generic/memory_model.h
> ===================================================================
> --- linux-2.6-lttng.orig/include/asm-generic/memory_model.h	2007-11-19 15:06:40.000000000 -0500
> +++ linux-2.6-lttng/include/asm-generic/memory_model.h	2007-11-19 15:18:57.000000000 -0500
> @@ -76,7 +76,7 @@ struct page;
>  extern struct page *pfn_to_page(unsigned long pfn);
>  extern unsigned long page_to_pfn(struct page *page);
>  #else
> -#define page_to_pfn __page_to_pfn
> +#define page_to_pfn ((unsigned long)__page_to_pfn)
>  #define pfn_to_page __pfn_to_page
>  #endif /* CONFIG_OUT_OF_LINE_PFN_TO_PAGE */

I'd have thought that __pfn_to_page() was the place to fix this: the
lower-level point.  Because someone might later start using __pfn_to_page()
for something.

Heaven knows why though - why does __pfn_to_page() even exist?


* Re: [PATCH] Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 21:08                       ` Andrew Morton
@ 2007-11-19 21:19                         ` Dave Hansen
  2007-11-19 21:26                           ` Dave Hansen
  2007-11-21 20:12                           ` Christoph Lameter
  2007-11-20 17:34                         ` Mathieu Desnoyers
  1 sibling, 2 replies; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 21:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Mathieu Desnoyers, linux-kernel, linux-mm, mbligh

On Mon, 2007-11-19 at 13:08 -0800, Andrew Morton wrote:
> 
> >  #else
> > -#define page_to_pfn __page_to_pfn
> > +#define page_to_pfn ((unsigned long)__page_to_pfn)
> >  #define pfn_to_page __pfn_to_page
> >  #endif /* CONFIG_OUT_OF_LINE_PFN_TO_PAGE */
> 
> I'd have thought that __pfn_to_page() was the place to fix this: the
> lower-level point.  Because someone might later start using
> __pfn_to_page()
> for something.
> 
> Heaven knows why though - why does __pfn_to_page() even exist?

I think it's this stuff:
        
        #ifdef CONFIG_OUT_OF_LINE_PFN_TO_PAGE
        struct page *pfn_to_page(unsigned long pfn)
        {
                return __pfn_to_page(pfn);
        }
        unsigned long page_to_pfn(struct page *page)
        {
                return __page_to_pfn(page);
        }
        EXPORT_SYMBOL(pfn_to_page);
        EXPORT_SYMBOL(page_to_pfn);
        #endif /* CONFIG_OUT_OF_LINE_PFN_TO_PAGE */
        
Which comes from:
        
        config OUT_OF_LINE_PFN_TO_PAGE
                def_bool X86_64
                depends on DISCONTIGMEM
        
and only on x86_64.  Perhaps it can go away with the
discontig->sparsemem-vmemmap conversion.

-- Dave




* Re: [PATCH] Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 21:19                         ` Dave Hansen
@ 2007-11-19 21:26                           ` Dave Hansen
  2007-11-21 20:12                           ` Christoph Lameter
  1 sibling, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2007-11-19 21:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Mathieu Desnoyers, linux-kernel, linux-mm, mbligh

On Mon, 2007-11-19 at 13:19 -0800, Dave Hansen wrote:
> On Mon, 2007-11-19 at 13:08 -0800, Andrew Morton wrote:
> > Heaven knows why though - why does __pfn_to_page() even exist?
> Perhaps it can go away with the
> discontig->sparsemem-vmemmap conversion.

In fact, Christoph Lameter's

	Subject: x86_64: Make sparsemem/vmemmap the default memory model V2
	Date: Thu, 15 Nov 2007 19:55:11 -0800 (PST)

does remove it.  

-- Dave



* Re: [PATCH] Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 21:08                       ` Andrew Morton
  2007-11-19 21:19                         ` Dave Hansen
@ 2007-11-20 17:34                         ` Mathieu Desnoyers
  1 sibling, 0 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-20 17:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: haveblue, linux-kernel, linux-mm, mbligh

* Andrew Morton (akpm@linux-foundation.org) wrote:
> On Mon, 19 Nov 2007 15:20:23 -0500
> Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > * Dave Hansen (haveblue@us.ibm.com) wrote:
> > > The only thing I might suggest doing differently is actually using the
> > > page_to_pfn() definition itself:
> > > 
> > > memory_model.h:#define page_to_pfn __page_to_pfn
> > > 
> > > The full inline function version should do this already, and we
> > > shouldn't have any real direct __page_to_pfn() users anyway.    
> > > 
> > 
> > Like this then..
> > 
> > Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
> > 
> > Make sure the type returned by page_to_pfn is always unsigned long. If we
> > don't cast it explicitly, it can be int on i386, but long on x86_64.
> 
> formally ptrdiff_t, I believe.
> 
> > This is
> > especially inelegant for printks.
> > 
> > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> > CC: Dave Hansen <haveblue@us.ibm.com>
> > CC: linux-mm@kvack.org
> > CC: linux-kernel@vger.kernel.org
> > ---
> >  include/asm-generic/memory_model.h |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Index: linux-2.6-lttng/include/asm-generic/memory_model.h
> > ===================================================================
> > --- linux-2.6-lttng.orig/include/asm-generic/memory_model.h	2007-11-19 15:06:40.000000000 -0500
> > +++ linux-2.6-lttng/include/asm-generic/memory_model.h	2007-11-19 15:18:57.000000000 -0500
> > @@ -76,7 +76,7 @@ struct page;
> >  extern struct page *pfn_to_page(unsigned long pfn);
> >  extern unsigned long page_to_pfn(struct page *page);
> >  #else
> > -#define page_to_pfn __page_to_pfn
> > +#define page_to_pfn ((unsigned long)__page_to_pfn)
> >  #define pfn_to_page __pfn_to_page
> >  #endif /* CONFIG_OUT_OF_LINE_PFN_TO_PAGE */
> 
> I'd have thought that __pfn_to_page() was the place to fix this: the
> lower-level point.  Because someone might later start using __pfn_to_page()
> for something.
> 
> Heaven knows why though - why does __pfn_to_page() even exist?

Since it all goes away with Christoph's patchset anyway, please drop
this patch. (I think there is also an issue with this patch version,
which is that the define should take the arguments...).
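
For the record, a sketch of the argument-taking form, written as a userspace
analogue rather than a tested kernel patch. struct page, fake_vmemmap and the
raw_/sketch_ macro names are stand-ins for illustration. The object-like
define would turn page_to_pfn(p) into ((unsigned long)__page_to_pfn)(p),
where the function-like macro __page_to_pfn is not followed by a parenthesis
and therefore is never expanded.

```c
/* Stand-in for the kernel's struct page. */
struct page { unsigned long flags; };

/* Hypothetical userspace analogue of vmemmap. */
static struct page fake_vmemmap[16];

/* Lower-level macro: the result has type ptrdiff_t. */
#define raw_page_to_pfn(page)		((page) - fake_vmemmap)

/* Function-like wrapper: the cast applies to the expanded result. */
#define sketch_page_to_pfn(page)	((unsigned long)raw_page_to_pfn(page))
```

With the argument in place, the result is an unsigned long on every
architecture.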

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH] Cast page_to_pfn to unsigned long in CONFIG_SPARSEMEM
  2007-11-19 21:19                         ` Dave Hansen
  2007-11-19 21:26                           ` Dave Hansen
@ 2007-11-21 20:12                           ` Christoph Lameter
  1 sibling, 0 replies; 46+ messages in thread
From: Christoph Lameter @ 2007-11-21 20:12 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, Mathieu Desnoyers, linux-kernel, linux-mm, mbligh

On Mon, 19 Nov 2007, Dave Hansen wrote:

> Which comes from:
>         
>         config OUT_OF_LINE_PFN_TO_PAGE
>                 def_bool X86_64
>                 depends on DISCONTIGMEM
>         
> and only on x86_64.  Perhaps it can go away with the
> discontig->sparsemem-vmemmap conversion.

The discontig/flatmem removal patch for x86_64 in mm already removes this.



* [RFC PATCH] LTTng instrumentation mm (using page_to_pfn)
  2007-11-19 18:04           ` Dave Hansen
@ 2007-11-28 14:09             ` Mathieu Desnoyers
  2007-11-28 16:54               ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-28 14:09 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

LTTng instrumentation mm

Memory management core events.

Changelog:
- Use page_to_pfn for swap out instrumentation, wait_on_page_bit, do_swap_page,
  page alloc/free.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
CC: Dave Hansen <haveblue@us.ibm.com>
---
 mm/filemap.c    |    4 ++++
 mm/memory.c     |   34 +++++++++++++++++++++++++---------
 mm/page_alloc.c |    5 +++++
 mm/page_io.c    |    1 +
 4 files changed, 35 insertions(+), 9 deletions(-)

Index: linux-2.6-lttng/mm/filemap.c
===================================================================
--- linux-2.6-lttng.orig/mm/filemap.c	2007-11-28 08:38:46.000000000 -0500
+++ linux-2.6-lttng/mm/filemap.c	2007-11-28 08:59:05.000000000 -0500
@@ -514,9 +514,13 @@ void fastcall wait_on_page_bit(struct pa
 {
 	DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
 
+	trace_mark(mm_filemap_wait_start, "pfn %lu", page_to_pfn(page));
+
 	if (test_bit(bit_nr, &page->flags))
 		__wait_on_bit(page_waitqueue(page), &wait, sync_page,
 							TASK_UNINTERRUPTIBLE);
+
+	trace_mark(mm_filemap_wait_end, "pfn %lu", page_to_pfn(page));
 }
 EXPORT_SYMBOL(wait_on_page_bit);
 
Index: linux-2.6-lttng/mm/memory.c
===================================================================
--- linux-2.6-lttng.orig/mm/memory.c	2007-11-28 08:42:09.000000000 -0500
+++ linux-2.6-lttng/mm/memory.c	2007-11-28 09:02:57.000000000 -0500
@@ -2072,6 +2072,7 @@ static int do_swap_page(struct mm_struct
 	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
 	page = lookup_swap_cache(entry);
 	if (!page) {
+		trace_mark(mm_swap_in, "pfn %lu", page_to_pfn(page));
 		grab_swap_token(); /* Contend for token _before_ read-in */
  		swapin_readahead(entry, address, vma);
  		page = read_swap_cache_async(entry, vma, address);
@@ -2526,30 +2527,45 @@ unlock:
 int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		unsigned long address, int write_access)
 {
+	int res;
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
+	trace_mark(mm_handle_fault_entry, "address %lu ip #p%ld",
+		address, KSTK_EIP(current));
+
 	__set_current_state(TASK_RUNNING);
 
 	count_vm_event(PGFAULT);
 
-	if (unlikely(is_vm_hugetlb_page(vma)))
-		return hugetlb_fault(mm, vma, address, write_access);
+	if (unlikely(is_vm_hugetlb_page(vma))) {
+		res = hugetlb_fault(mm, vma, address, write_access);
+		goto end;
+	}
 
 	pgd = pgd_offset(mm, address);
 	pud = pud_alloc(mm, pgd, address);
-	if (!pud)
-		return VM_FAULT_OOM;
+	if (!pud) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 	pmd = pmd_alloc(mm, pud, address);
-	if (!pmd)
-		return VM_FAULT_OOM;
+	if (!pmd) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 	pte = pte_alloc_map(mm, pmd, address);
-	if (!pte)
-		return VM_FAULT_OOM;
+	if (!pte) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 
-	return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+	res = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+end:
+	trace_mark(mm_handle_fault_exit, MARK_NOARGS);
+	return res;
 }
 
 #ifndef __PAGETABLE_PUD_FOLDED
Index: linux-2.6-lttng/mm/page_alloc.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_alloc.c	2007-11-28 08:38:46.000000000 -0500
+++ linux-2.6-lttng/mm/page_alloc.c	2007-11-28 09:05:36.000000000 -0500
@@ -519,6 +519,9 @@ static void __free_pages_ok(struct page 
 	int i;
 	int reserved = 0;
 
+	trace_mark(mm_page_free, "order %u pfn %lu",
+		order, page_to_pfn(page));
+
 	for (i = 0 ; i < (1 << order) ; ++i)
 		reserved += free_pages_check(page + i);
 	if (reserved)
@@ -1639,6 +1642,8 @@ fastcall unsigned long __get_free_pages(
 	page = alloc_pages(gfp_mask, order);
 	if (!page)
 		return 0;
+	trace_mark(mm_page_alloc, "order %u pfn %lu",
+		order, page_to_pfn(page));
 	return (unsigned long) page_address(page);
 }
 
Index: linux-2.6-lttng/mm/page_io.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_io.c	2007-11-28 08:38:47.000000000 -0500
+++ linux-2.6-lttng/mm/page_io.c	2007-11-28 08:52:14.000000000 -0500
@@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
 		rw |= (1 << BIO_RW_SYNC);
 	count_vm_event(PSWPOUT);
 	set_page_writeback(page);
+	trace_mark(mm_swap_out, "pfn %lu", page_to_pfn(page));
 	unlock_page(page);
 	submit_bio(rw, bio);
 out:
-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (using page_to_pfn)
  2007-11-28 14:09             ` [RFC PATCH] LTTng instrumentation mm (using page_to_pfn) Mathieu Desnoyers
@ 2007-11-28 16:54               ` Dave Hansen
  2007-11-29  2:34                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-28 16:54 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Wed, 2007-11-28 at 09:09 -0500, Mathieu Desnoyers wrote:
> ===================================================================
> --- linux-2.6-lttng.orig/mm/filemap.c	2007-11-28 08:38:46.000000000 -0500
> +++ linux-2.6-lttng/mm/filemap.c	2007-11-28 08:59:05.000000000 -0500
> @@ -514,9 +514,13 @@ void fastcall wait_on_page_bit(struct pa
>  {
>  	DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
> 
> +	trace_mark(mm_filemap_wait_start, "pfn %lu", page_to_pfn(page));
> +
>  	if (test_bit(bit_nr, &page->flags))
>  		__wait_on_bit(page_waitqueue(page), &wait, sync_page,
>  							TASK_UNINTERRUPTIBLE);
> +
> +	trace_mark(mm_filemap_wait_end, "pfn %lu", page_to_pfn(page));
>  }
>  EXPORT_SYMBOL(wait_on_page_bit);

I've got some small nits with this.  I guess I just wish that if we're
going to sprinkle hooks all over that we'd have those hooks be as useful
as possible for people who have to look at them on a daily basis.

Do you also want to put in the page bit which is being waited on?
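
Something along these lines, perhaps. This is only a userspace sketch:
sketch_trace_mark() is a stand-in that formats into a buffer instead of
calling a probe, and the "bit_nr %d" field name is an assumption, not
something taken from the patch.

```c
#include <stdio.h>

/* Buffer holding the most recently recorded event, for inspection. */
static char last_event[128];

/* Userspace stand-in for the marker macro: event name, format, arguments. */
#define sketch_trace_mark(name, fmt, ...) \
	snprintf(last_event, sizeof(last_event), #name " " fmt, __VA_ARGS__)

/* Hypothetical wait-start event that also records the bit being waited on. */
static void mark_wait_start(unsigned long pfn, int bit_nr)
{
	sketch_trace_mark(mm_filemap_wait_start, "pfn %lu bit_nr %d",
			  pfn, bit_nr);
}
```

With the bit recorded, a trace reader could tell a wait on the lock bit from
a wait on writeback without guessing from context.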

> 
> Index: linux-2.6-lttng/mm/memory.c
> ===================================================================
> --- linux-2.6-lttng.orig/mm/memory.c	2007-11-28 08:42:09.000000000 -0500
> +++ linux-2.6-lttng/mm/memory.c	2007-11-28 09:02:57.000000000 -0500
> @@ -2072,6 +2072,7 @@ static int do_swap_page(struct mm_struct
>  	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
>  	page = lookup_swap_cache(entry);
>  	if (!page) {
> +		trace_mark(mm_swap_in, "pfn %lu", page_to_pfn(page));
>  		grab_swap_token(); /* Contend for token _before_ read-in */
>   		swapin_readahead(entry, address, vma);
>   		page = read_swap_cache_async(entry, vma, address);

How about putting the swap file number and the offset as well?
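
Both values can be pulled out of the swap entry. The kernel has swp_type()
and swp_offset() in linux/swapops.h for this; the userspace sketch below
only mimics the idea with a made-up layout (5 type bits at the top of the
word, which is an assumption for illustration), so the sketch_* names and
constants are not the kernel's.

```c
/* Assumed layout: the swap file number lives in the top 5 bits of the word. */
#define SKETCH_TYPE_BITS	5
#define SKETCH_TYPE_SHIFT	(sizeof(unsigned long) * 8 - SKETCH_TYPE_BITS)
#define SKETCH_OFFSET_MASK	((1UL << SKETCH_TYPE_SHIFT) - 1)

/* Stand-in for swp_entry_t: one word encoding both type and offset. */
typedef struct { unsigned long val; } sketch_swp_entry_t;

/* Pack a swap file number and a page offset into one entry. */
static sketch_swp_entry_t sketch_swp_entry(unsigned int type,
					   unsigned long offset)
{
	sketch_swp_entry_t e;

	e.val = ((unsigned long)type << SKETCH_TYPE_SHIFT) |
		(offset & SKETCH_OFFSET_MASK);
	return e;
}

/* Swap file number: the high bits of the entry. */
static unsigned int sketch_swp_type(sketch_swp_entry_t e)
{
	return (unsigned int)(e.val >> SKETCH_TYPE_SHIFT);
}

/* Page offset within that swap file: the low bits of the entry. */
static unsigned long sketch_swp_offset(sketch_swp_entry_t e)
{
	return e.val & SKETCH_OFFSET_MASK;
}
```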

> @@ -2526,30 +2527,45 @@ unlock:
>  int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  		unsigned long address, int write_access)
>  {
> +	int res;
>  	pgd_t *pgd;
>  	pud_t *pud;
>  	pmd_t *pmd;
>  	pte_t *pte;
> 
> +	trace_mark(mm_handle_fault_entry, "address %lu ip #p%ld",
> +		address, KSTK_EIP(current));

For knowing this one, the write access can be pretty important.  It can
help you find copy-on-write situations as well as some common bugs that
creep in here. 

>  	__set_current_state(TASK_RUNNING);
> 
>  	count_vm_event(PGFAULT);
> 
> -	if (unlikely(is_vm_hugetlb_page(vma)))
> -		return hugetlb_fault(mm, vma, address, write_access);
> +	if (unlikely(is_vm_hugetlb_page(vma))) {
> +		res = hugetlb_fault(mm, vma, address, write_access);
> +		goto end;
> +	}

I think you should also add tracing to the hugetlb code while you're at
it.  Those poor fellows seem to be always getting left out these
days. :)

>  	pgd = pgd_offset(mm, address);
>  	pud = pud_alloc(mm, pgd, address);
> -	if (!pud)
> -		return VM_FAULT_OOM;
> +	if (!pud) {
> +		res = VM_FAULT_OOM;
> +		goto end;
> +	}
>  	pmd = pmd_alloc(mm, pud, address);
> -	if (!pmd)
> -		return VM_FAULT_OOM;
> +	if (!pmd) {
> +		res = VM_FAULT_OOM;
> +		goto end;
> +	}
>  	pte = pte_alloc_map(mm, pmd, address);
> -	if (!pte)
> -		return VM_FAULT_OOM;
> +	if (!pte) {
> +		res = VM_FAULT_OOM;
> +		goto end;
> +	}
> 
> -	return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
> +	res = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
> +end:
> +	trace_mark(mm_handle_fault_exit, MARK_NOARGS);
> +	return res;
>  }
> 
>  #ifndef __PAGETABLE_PUD_FOLDED
> Index: linux-2.6-lttng/mm/page_alloc.c
> ===================================================================
> --- linux-2.6-lttng.orig/mm/page_alloc.c	2007-11-28 08:38:46.000000000 -0500
> +++ linux-2.6-lttng/mm/page_alloc.c	2007-11-28 09:05:36.000000000 -0500
> @@ -519,6 +519,9 @@ static void __free_pages_ok(struct page 
>  	int i;
>  	int reserved = 0;
> 
> +	trace_mark(mm_page_free, "order %u pfn %lu",
> +		order, page_to_pfn(page));
> +
>  	for (i = 0 ; i < (1 << order) ; ++i)
>  		reserved += free_pages_check(page + i);
>  	if (reserved)
> @@ -1639,6 +1642,8 @@ fastcall unsigned long __get_free_pages(
>  	page = alloc_pages(gfp_mask, order);
>  	if (!page)
>  		return 0;
> +	trace_mark(mm_page_alloc, "order %u pfn %lu",
> +		order, page_to_pfn(page));
>  	return (unsigned long) page_address(page);
>  }
> 
> Index: linux-2.6-lttng/mm/page_io.c
> ===================================================================
> --- linux-2.6-lttng.orig/mm/page_io.c	2007-11-28 08:38:47.000000000 -0500
> +++ linux-2.6-lttng/mm/page_io.c	2007-11-28 08:52:14.000000000 -0500
> @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
>  		rw |= (1 << BIO_RW_SYNC);
>  	count_vm_event(PSWPOUT);
>  	set_page_writeback(page);
> +	trace_mark(mm_swap_out, "pfn %lu", page_to_pfn(page));
>  	unlock_page(page);
>  	submit_bio(rw, bio);

I'd also like to see the swap file number and the location in swap for
this one.  

-- Dave



* Re: [RFC PATCH] LTTng instrumentation mm (using page_to_pfn)
  2007-11-28 16:54               ` Dave Hansen
@ 2007-11-29  2:34                 ` Mathieu Desnoyers
  2007-11-29  6:25                   ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-29  2:34 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

I am adding the rest.. two questions left :

* Dave Hansen (haveblue@us.ibm.com) wrote:
 
> > 
> > Index: linux-2.6-lttng/mm/memory.c
> > ===================================================================
> > --- linux-2.6-lttng.orig/mm/memory.c	2007-11-28 08:42:09.000000000 -0500
> > +++ linux-2.6-lttng/mm/memory.c	2007-11-28 09:02:57.000000000 -0500
> > @@ -2072,6 +2072,7 @@ static int do_swap_page(struct mm_struct
> >  	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
> >  	page = lookup_swap_cache(entry);
> >  	if (!page) {
> > +		trace_mark(mm_swap_in, "pfn %lu", page_to_pfn(page));
> >  		grab_swap_token(); /* Contend for token _before_ read-in */
> >   		swapin_readahead(entry, address, vma);
> >   		page = read_swap_cache_async(entry, vma, address);
> 
> How about putting the swap file number and the offset as well?
> 
[...]
> > Index: linux-2.6-lttng/mm/page_io.c
> > ===================================================================
> > --- linux-2.6-lttng.orig/mm/page_io.c	2007-11-28 08:38:47.000000000 -0500
> > +++ linux-2.6-lttng/mm/page_io.c	2007-11-28 08:52:14.000000000 -0500
> > @@ -114,6 +114,7 @@ int swap_writepage(struct page *page, st
> >  		rw |= (1 << BIO_RW_SYNC);
> >  	count_vm_event(PSWPOUT);
> >  	set_page_writeback(page);
> > +	trace_mark(mm_swap_out, "pfn %lu", page_to_pfn(page));
> >  	unlock_page(page);
> >  	submit_bio(rw, bio);
> 
> I'd also like to see the swap file number and the location in swap for
> this one.  
> 

Before I start digging deeper in checking whether it is already
instrumented by the fs instrumentation (and would therefore be
redundant), is there a particular data structure from mm/ that you
suggest taking the swap file number and location in swap from ?

Mathieu

> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (using page_to_pfn)
  2007-11-29  2:34                 ` Mathieu Desnoyers
@ 2007-11-29  6:25                   ` Dave Hansen
  2007-11-30 16:11                     ` [RFC PATCH] LTTng instrumentation mm (updated) Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-29  6:25 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Wed, 2007-11-28 at 21:34 -0500, Mathieu Desnoyers wrote:
> Before I start digging deeper in checking whether it is already
> instrumented by the fs instrumentation (and would therefore be
> redundant), is there a particular data structure from mm/ that you
> suggest taking the swap file number and location in swap from ?

page_private() at this point stores a swp_entry_t.  There are swp_type()
and swp_offset() helpers to decode the two bits you need after you've
turned page_private() into a swp_entry_t.  See how get_swap_bio()
creates a temporary swp_entry_t from the page_private() passed into it,
then uses swp_type/offset() on it?

I don't know if there is some history behind it, but it doesn't make a
whole ton of sense to me to be passing page_private(page) into
get_swap_bio() (which happens from its only two call sites).  It just
kinda obfuscates where 'index' came from.

I think we could probably just be doing

	swp_entry_t entry = { .val = page_private(page), };

in get_swap_bio() and not passing page_private().  We have the page in
there already, so we don't need to pass a derived value like
page_private().  At the least, it'll save some clutter in the function
declaration.  

Or, make a helper:

static swp_entry_t page_swp_entry(struct page *page)
{
	swp_entry_t entry;
	VM_BUG_ON(!PageSwapCache(page));
	entry.val = page_private(page);
	return entry;
}

I see at least 4 call sites that could use this.  The try_to_unmap_one()
caller would trip over the debug check, so you'd have to move the call
inside of the if(PageSwapCache(page)) statement.
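
To make the decoding concrete, here is a rough userspace model of the
idea (a sketch only, not the kernel headers). The 5-bit type width and
"struct fake_page" are assumptions made up for illustration; the real
definitions live in include/linux/swapops.h and include/linux/swap.h.

```c
#include <assert.h>
#include <limits.h>

/* Userspace model of swp_entry_t: type in the top bits, offset below. */
typedef struct { unsigned long val; } swp_entry_t;

/* Assumed width of the type field; check the tree you are building. */
#define MAX_SWAPFILES_SHIFT 5
#define SWP_TYPE_SHIFT  (sizeof(unsigned long) * CHAR_BIT - MAX_SWAPFILES_SHIFT)
#define SWP_OFFSET_MASK ((1UL << SWP_TYPE_SHIFT) - 1)

/* Pack a swap file index (type) and a slot number (offset). */
static swp_entry_t swp_entry(unsigned long type, unsigned long offset)
{
	swp_entry_t e;

	e.val = (type << SWP_TYPE_SHIFT) | (offset & SWP_OFFSET_MASK);
	return e;
}

/* Which swap file: the index into swap_info[]. */
static unsigned swp_type(swp_entry_t e)
{
	return (unsigned)(e.val >> SWP_TYPE_SHIFT);
}

/* Where in that swap file. */
static unsigned long swp_offset(swp_entry_t e)
{
	return e.val & SWP_OFFSET_MASK;
}

/* Hypothetical stand-in for struct page: only what the helper needs. */
struct fake_page {
	unsigned long private;	/* what page_private() would return */
	int in_swap_cache;	/* what PageSwapCache() would return */
};

static swp_entry_t page_swp_entry(const struct fake_page *page)
{
	swp_entry_t e;

	assert(page->in_swap_cache);	/* plays the role of VM_BUG_ON() */
	e.val = page->private;
	return e;
}
```

The point of the helper is that callers hand over the page and get both
fields back through swp_type()/swp_offset(), instead of passing a
derived unsigned long around.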

-- Dave



* [RFC PATCH] LTTng instrumentation mm (updated)
  2007-11-29  6:25                   ` Dave Hansen
@ 2007-11-30 16:11                     ` Mathieu Desnoyers
  2007-11-30 17:46                       ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-30 16:11 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

LTTng instrumentation mm

Memory management core events.

Changelog:
- Use page_to_pfn for swap out instrumentation, wait_on_page_bit, do_swap_page,
  page alloc/free.
- Add missing free_hot_cold_page instrumentation.
- Add hugetlb page_alloc page_free instrumentation.
- Add write_access to mm fault.
- Add page bit_nr waited for by wait_on_page_bit.
- Move page alloc instrumentation to __alloc_pages so we cover the alloc zeroed
  page path.
- Add swap file used for swap in and swap out events.
- Dump the swap files, instrument swapon and swapoff.

(note : I did not change the other sites where page_swp_entry could be
used)
(note 2 : my FS instrumentation does not dump the kernel vfs mounts,
which would be useful to interpret the "dump swap files"
instrumentation. I should add this eventually.)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: linux-mm@kvack.org
CC: Dave Hansen <haveblue@us.ibm.com>
---
 include/linux/swapops.h |    8 ++++++++
 mm/filemap.c            |    6 ++++++
 mm/hugetlb.c            |    2 ++
 mm/memory.c             |   38 +++++++++++++++++++++++++++++---------
 mm/page_alloc.c         |    6 ++++++
 mm/page_io.c            |    5 +++++
 mm/swapfile.c           |   22 ++++++++++++++++++++++
 7 files changed, 78 insertions(+), 9 deletions(-)

Index: linux-2.6-lttng/mm/filemap.c
===================================================================
--- linux-2.6-lttng.orig/mm/filemap.c	2007-11-29 20:22:52.000000000 -0500
+++ linux-2.6-lttng/mm/filemap.c	2007-11-29 20:23:01.000000000 -0500
@@ -514,9 +514,15 @@ void fastcall wait_on_page_bit(struct pa
 {
 	DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
 
+	trace_mark(mm_filemap_wait_start, "pfn %lu bit_nr %d",
+		page_to_pfn(page), bit_nr);
+
 	if (test_bit(bit_nr, &page->flags))
 		__wait_on_bit(page_waitqueue(page), &wait, sync_page,
 							TASK_UNINTERRUPTIBLE);
+
+	trace_mark(mm_filemap_wait_end, "pfn %lu bit_nr %d",
+		page_to_pfn(page), bit_nr);
 }
 EXPORT_SYMBOL(wait_on_page_bit);
 
Index: linux-2.6-lttng/mm/memory.c
===================================================================
--- linux-2.6-lttng.orig/mm/memory.c	2007-11-29 20:22:52.000000000 -0500
+++ linux-2.6-lttng/mm/memory.c	2007-11-29 20:42:36.000000000 -0500
@@ -2090,6 +2090,10 @@ static int do_swap_page(struct mm_struct
 		/* Had to read the page from swap area: Major fault */
 		ret = VM_FAULT_MAJOR;
 		count_vm_event(PGMAJFAULT);
+		trace_mark(mm_swap_in, "pfn %lu filp %p offset %lu",
+			page_to_pfn(page),
+			get_swap_info_struct(swp_type(entry))->swap_file,
+			swp_offset(entry));
 	}
 
 	mark_page_accessed(page);
@@ -2526,30 +2530,46 @@ unlock:
 int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		unsigned long address, int write_access)
 {
+	int res;
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 
+	trace_mark(mm_handle_fault_entry,
+		"address %lu ip #p%ld write_access %d",
+		address, KSTK_EIP(current), write_access);
+
 	__set_current_state(TASK_RUNNING);
 
 	count_vm_event(PGFAULT);
 
-	if (unlikely(is_vm_hugetlb_page(vma)))
-		return hugetlb_fault(mm, vma, address, write_access);
+	if (unlikely(is_vm_hugetlb_page(vma))) {
+		res = hugetlb_fault(mm, vma, address, write_access);
+		goto end;
+	}
 
 	pgd = pgd_offset(mm, address);
 	pud = pud_alloc(mm, pgd, address);
-	if (!pud)
-		return VM_FAULT_OOM;
+	if (!pud) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 	pmd = pmd_alloc(mm, pud, address);
-	if (!pmd)
-		return VM_FAULT_OOM;
+	if (!pmd) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 	pte = pte_alloc_map(mm, pmd, address);
-	if (!pte)
-		return VM_FAULT_OOM;
+	if (!pte) {
+		res = VM_FAULT_OOM;
+		goto end;
+	}
 
-	return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+	res = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+end:
+	trace_mark(mm_handle_fault_exit, MARK_NOARGS);
+	return res;
 }
 
 #ifndef __PAGETABLE_PUD_FOLDED
Index: linux-2.6-lttng/mm/page_alloc.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_alloc.c	2007-11-29 20:22:52.000000000 -0500
+++ linux-2.6-lttng/mm/page_alloc.c	2007-11-29 20:23:01.000000000 -0500
@@ -519,6 +519,9 @@ static void __free_pages_ok(struct page 
 	int i;
 	int reserved = 0;
 
+	trace_mark(mm_page_free, "order %u pfn %lu",
+		order, page_to_pfn(page));
+
 	for (i = 0 ; i < (1 << order) ; ++i)
 		reserved += free_pages_check(page + i);
 	if (reserved)
@@ -981,6 +984,8 @@ static void fastcall free_hot_cold_page(
 	struct per_cpu_pages *pcp;
 	unsigned long flags;
 
+	trace_mark(mm_page_free, "order %u pfn %lu", 0, page_to_pfn(page));
+
 	if (PageAnon(page))
 		page->mapping = NULL;
 	if (free_pages_check(page))
@@ -1625,6 +1630,7 @@ nopage:
 		show_mem();
 	}
 got_pg:
+	trace_mark(mm_page_alloc, "order %u pfn %lu", order, page_to_pfn(page));
 	return page;
 }
 
Index: linux-2.6-lttng/mm/page_io.c
===================================================================
--- linux-2.6-lttng.orig/mm/page_io.c	2007-11-29 20:22:52.000000000 -0500
+++ linux-2.6-lttng/mm/page_io.c	2007-11-29 20:43:02.000000000 -0500
@@ -114,6 +114,11 @@ int swap_writepage(struct page *page, st
 		rw |= (1 << BIO_RW_SYNC);
 	count_vm_event(PSWPOUT);
 	set_page_writeback(page);
+	trace_mark(mm_swap_out, "pfn %lu filp %p offset %lu",
+			page_to_pfn(page),
+			get_swap_info_struct(swp_type(
+				page_swp_entry(page)))->swap_file,
+			swp_offset(page_swp_entry(page)));
 	unlock_page(page);
 	submit_bio(rw, bio);
 out:
Index: linux-2.6-lttng/mm/hugetlb.c
===================================================================
--- linux-2.6-lttng.orig/mm/hugetlb.c	2007-11-29 20:22:52.000000000 -0500
+++ linux-2.6-lttng/mm/hugetlb.c	2007-11-29 20:23:01.000000000 -0500
@@ -118,6 +118,7 @@ static void free_huge_page(struct page *
 	int nid = page_to_nid(page);
 	struct address_space *mapping;
 
+	trace_mark(mm_huge_page_free, "pfn %lu", page_to_pfn(page));
 	mapping = (struct address_space *) page_private(page);
 	BUG_ON(page_count(page));
 	INIT_LIST_HEAD(&page->lru);
@@ -401,6 +402,7 @@ static struct page *alloc_huge_page(stru
 	if (!IS_ERR(page)) {
 		set_page_refcounted(page);
 		set_page_private(page, (unsigned long) mapping);
+		trace_mark(mm_huge_page_alloc, "pfn %lu", page_to_pfn(page));
 	}
 	return page;
 }
Index: linux-2.6-lttng/include/linux/swapops.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/swapops.h	2007-11-29 20:22:52.000000000 -0500
+++ linux-2.6-lttng/include/linux/swapops.h	2007-11-29 20:23:01.000000000 -0500
@@ -68,6 +68,14 @@ static inline pte_t swp_entry_to_pte(swp
 	return __swp_entry_to_pte(arch_entry);
 }
 
+static inline swp_entry_t page_swp_entry(struct page *page)
+{
+	swp_entry_t entry;
+	VM_BUG_ON(!PageSwapCache(page));
+	entry.val = page_private(page);
+	return entry;
+}
+
 #ifdef CONFIG_MIGRATION
 static inline swp_entry_t make_migration_entry(struct page *page, int write)
 {
Index: linux-2.6-lttng/mm/swapfile.c
===================================================================
--- linux-2.6-lttng.orig/mm/swapfile.c	2007-11-30 09:18:38.000000000 -0500
+++ linux-2.6-lttng/mm/swapfile.c	2007-11-30 10:21:50.000000000 -0500
@@ -1279,6 +1279,7 @@ asmlinkage long sys_swapoff(const char _
 	swap_map = p->swap_map;
 	p->swap_map = NULL;
 	p->flags = 0;
+	trace_mark(mm_swap_file_close, "filp %p", swap_file);
 	spin_unlock(&swap_lock);
 	mutex_unlock(&swapon_mutex);
 	vfree(swap_map);
@@ -1660,6 +1661,8 @@ asmlinkage long sys_swapon(const char __
 	} else {
 		swap_info[prev].next = p - swap_info;
 	}
+	trace_mark(mm_swap_file_open, "filp %p filename %s",
+		swap_file, name);
 	spin_unlock(&swap_lock);
 	mutex_unlock(&swapon_mutex);
 	error = 0;
@@ -1796,3 +1799,22 @@ int valid_swaphandles(swp_entry_t entry,
 	spin_unlock(&swap_lock);
 	return ret;
 }
+
+void ltt_dump_swap_files(void *call_data)
+{
+	int type;
+	struct swap_info_struct * p = NULL;
+
+	mutex_lock(&swapon_mutex);
+	for (type = swap_list.head; type >= 0; type = swap_info[type].next) {
+		p = swap_info + type;
+		if ((p->flags & SWP_ACTIVE) != SWP_ACTIVE)
+			continue;
+		__trace_mark(0, statedump_swap_files, call_data,
+			"filp %p vfsmount %p dname %s",
+			p->swap_file, p->swap_file->f_vfsmnt,
+			p->swap_file->f_dentry->d_name.name);
+	}
+	mutex_unlock(&swapon_mutex);
+}
+EXPORT_SYMBOL_GPL(ltt_dump_swap_files);
-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-11-30 17:46                       ` Dave Hansen
@ 2007-11-30 17:05                         ` Mathieu Desnoyers
  2007-11-30 18:42                           ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-30 17:05 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Fri, 2007-11-30 at 11:11 -0500, Mathieu Desnoyers wrote:
> > +static inline swp_entry_t page_swp_entry(struct page *page)
> > +{
> > +       swp_entry_t entry;
> > +       VM_BUG_ON(!PageSwapCache(page));
> > +       entry.val = page_private(page);
> > +       return entry;
> > +}
> 
> This probably needs to be introduced (and used) in a separate patch.
> Please fix up those other places in the code that can take advantage of
> it.
> 
Sure,

> >  #ifdef CONFIG_MIGRATION
> >  static inline swp_entry_t make_migration_entry(struct page *page, int
> > write)
> >  {
> > Index: linux-2.6-lttng/mm/swapfile.c
> > ===================================================================
> > --- linux-2.6-lttng.orig/mm/swapfile.c  2007-11-30 09:18:38.000000000
> > -0500
> > +++ linux-2.6-lttng/mm/swapfile.c       2007-11-30 10:21:50.000000000
> > -0500
> > @@ -1279,6 +1279,7 @@ asmlinkage long sys_swapoff(const char _
> >         swap_map = p->swap_map;
> >         p->swap_map = NULL;
> >         p->flags = 0;
> > +       trace_mark(mm_swap_file_close, "filp %p", swap_file);
> >         spin_unlock(&swap_lock);
> >         mutex_unlock(&swapon_mutex);
> >         vfree(swap_map);
> > @@ -1660,6 +1661,8 @@ asmlinkage long sys_swapon(const char __
> >         } else {
> >                 swap_info[prev].next = p - swap_info;
> >         }
> > +       trace_mark(mm_swap_file_open, "filp %p filename %s",
> > +               swap_file, name); 
> 
> You print out the filp a number of times here, but how does that help in
> a trace?  If I was trying to figure out which swapfile, I'd probably
> just want to know the swp_entry_t->type, then I could look at this:
> 
> dave@foo:~/garbage$ cat /proc/swaps 
> Filename                                Type            Size    Used    Priority
> /dev/sda2                               partition       1992052 649336  -1
> 
> to see the ordering.
> 

Given a trace including :
- Swapfiles initially used
- multiple swapon/swapoff
- swap in/out events

We would like to be able to tell which swap file the information has
been written to/read from at any given time during the trace.

Therefore, I dump the swap file information at the beginning of the
trace (see the ltt_dump_swap_files function) and also follow each
swapon/swapoff.

The minimal information that has to be saved at each swap read/write
seems to be the struct file * that is used by the operation. We can then
map back to the file used by knowing the mapping between struct file *
and associated file names (dump/swapon/swapoff instrumentation).
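
As a rough illustration of the trace-analysis side (a sketch, not the
actual LTTng viewer code), the mapping can be kept in a small table that
is updated by the dump/swapon/swapoff events and consulted for each swap
in/out event. The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SWAP_FILES 32	/* arbitrary table size for this sketch */

/* One known swap file: the filp value seen in the trace, plus its name. */
struct swap_map_entry {
	unsigned long filp;
	char name[64];
	int live;
};

struct swap_map {
	struct swap_map_entry e[MAX_SWAP_FILES];
};

/* Apply a statedump_swap_files or mm_swap_file_open event. */
static int swap_map_open(struct swap_map *m, unsigned long filp,
			 const char *name)
{
	size_t i;

	for (i = 0; i < MAX_SWAP_FILES; i++) {
		if (m->e[i].live)
			continue;
		m->e[i].filp = filp;
		strncpy(m->e[i].name, name, sizeof(m->e[i].name) - 1);
		m->e[i].name[sizeof(m->e[i].name) - 1] = '\0';
		m->e[i].live = 1;
		return 0;
	}
	return -1;	/* table full */
}

/* Apply an mm_swap_file_close event. */
static void swap_map_close(struct swap_map *m, unsigned long filp)
{
	size_t i;

	for (i = 0; i < MAX_SWAP_FILES; i++)
		if (m->e[i].live && m->e[i].filp == filp)
			m->e[i].live = 0;
}

/* Resolve the filp carried by an mm_swap_in / mm_swap_out event. */
static const char *swap_map_lookup(const struct swap_map *m,
				   unsigned long filp)
{
	size_t i;

	for (i = 0; i < MAX_SWAP_FILES; i++)
		if (m->e[i].live && m->e[i].filp == filp)
			return m->e[i].name;
	return NULL;	/* not known at this point in the trace */
}
```

Because the table is replayed in trace order, the same pointer value can
resolve to different files at different times, which is what the
swapon/swapoff events are there to disambiguate.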

The swp_entry_t->type does not seem to map to any specific information
in /proc/swaps ? (or I may have missed a detail) Even if it does, it is
limited to a specific point in time and does not follow swapon/swapoff
events.

You are talking about ordering in /proc/swaps. I wonder what happens if
we add/remove swap files from the array. I guess the swp_entry_t type
may no longer match the position of the file in the /proc/swaps output,
since the type is an index into the swap_info array, and swapon fills
the empty spots in that array (again, unless I missed a clever detail).

Mathieu

> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-11-30 16:11                     ` [RFC PATCH] LTTng instrumentation mm (updated) Mathieu Desnoyers
@ 2007-11-30 17:46                       ` Dave Hansen
  2007-11-30 17:05                         ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-30 17:46 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Fri, 2007-11-30 at 11:11 -0500, Mathieu Desnoyers wrote:
> +static inline swp_entry_t page_swp_entry(struct page *page)
> +{
> +       swp_entry_t entry;
> +       VM_BUG_ON(!PageSwapCache(page));
> +       entry.val = page_private(page);
> +       return entry;
> +}

This probably needs to be introduced (and used) in a separate patch.
Please fix up those other places in the code that can take advantage of
it.

>  #ifdef CONFIG_MIGRATION
>  static inline swp_entry_t make_migration_entry(struct page *page, int
> write)
>  {
> Index: linux-2.6-lttng/mm/swapfile.c
> ===================================================================
> --- linux-2.6-lttng.orig/mm/swapfile.c  2007-11-30 09:18:38.000000000
> -0500
> +++ linux-2.6-lttng/mm/swapfile.c       2007-11-30 10:21:50.000000000
> -0500
> @@ -1279,6 +1279,7 @@ asmlinkage long sys_swapoff(const char _
>         swap_map = p->swap_map;
>         p->swap_map = NULL;
>         p->flags = 0;
> +       trace_mark(mm_swap_file_close, "filp %p", swap_file);
>         spin_unlock(&swap_lock);
>         mutex_unlock(&swapon_mutex);
>         vfree(swap_map);
> @@ -1660,6 +1661,8 @@ asmlinkage long sys_swapon(const char __
>         } else {
>                 swap_info[prev].next = p - swap_info;
>         }
> +       trace_mark(mm_swap_file_open, "filp %p filename %s",
> +               swap_file, name); 

You print out the filp a number of times here, but how does that help in
a trace?  If I was trying to figure out which swapfile, I'd probably
just want to know the swp_entry_t->type, then I could look at this:

dave@foo:~/garbage$ cat /proc/swaps 
Filename                                Type            Size    Used    Priority
/dev/sda2                               partition       1992052 649336  -1

to see the ordering.

-- Dave



* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-11-30 17:05                         ` Mathieu Desnoyers
@ 2007-11-30 18:42                           ` Dave Hansen
  2007-11-30 19:10                             ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-11-30 18:42 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: akpm, linux-kernel, linux-mm, mbligh

On Fri, 2007-11-30 at 12:05 -0500, Mathieu Desnoyers wrote:
> 
> 
> Given a trace including :
> - Swapfiles initially used
> - multiple swapon/swapoff
> - swap in/out events
> 
> We would like to be able to tell which swap file the information has
> been written to/read from at any given time during the trace.

Oh, tracing is expected to be on at all times?  I figured someone would
encounter a problem, then turn it on to dig down a little deeper, then
turn it off.

As for why I care what is in /proc/swaps.  Take a look at this:

struct swap_info_struct *
get_swap_info_struct(unsigned type)
{
        return &swap_info[type];
}

Then, look at the proc functions: 

static void *swap_next(struct seq_file *swap, void *v, loff_t *pos)
{
        struct swap_info_struct *ptr;
        struct swap_info_struct *endptr = swap_info + nr_swapfiles;

        if (v == SEQ_START_TOKEN)
                ptr = swap_info;
...

I guess if that swap_info[] has any holes, we can't relate indexes in
there right back to /proc/swaps, but maybe we should add some
information so that we _can_.

-- Dave



* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-11-30 18:42                           ` Dave Hansen
@ 2007-11-30 19:10                             ` Mathieu Desnoyers
  2007-12-04 19:15                               ` Frank Ch. Eigler
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-11-30 19:10 UTC (permalink / raw)
  To: Dave Hansen; +Cc: akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Fri, 2007-11-30 at 12:05 -0500, Mathieu Desnoyers wrote:
> > 
> > 
> > Given a trace including :
> > - Swapfiles initially used
> > - multiple swapon/swapoff
> > - swap in/out events
> > 
> > We would like to be able to tell which swap file the information has
> > been written to/read from at any given time during the trace.
> 
> Oh, tracing is expected to be on at all times?  I figured someone would
> encounter a problem, then turn it on to dig down a little deeper, then
> turn it off.
> 

Yep, it can be expected to be on at all times, especially on production
systems using "flight recorder" tracing, which records information in a
circular buffer and dumps the buffers when some trigger (an error
condition) fires.
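
For readers unfamiliar with the term, a minimal sketch of the flight
recorder idea follows. It is only an illustration with a hypothetical
four-slot buffer of integers; the real LTTng buffers are per-CPU and far
more involved.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 4	/* tiny on purpose, to make the overwrite visible */

struct ring {
	unsigned long slot[RING_SLOTS];
	size_t head;	/* next slot to write */
	size_t count;	/* number of valid entries, at most RING_SLOTS */
};

/* Record an event, overwriting the oldest one once the buffer is full. */
static void ring_record(struct ring *r, unsigned long event)
{
	r->slot[r->head] = event;
	r->head = (r->head + 1) % RING_SLOTS;
	if (r->count < RING_SLOTS)
		r->count++;
}

/* On a trigger, copy the surviving events to out[], oldest first. */
static size_t ring_dump(const struct ring *r, unsigned long *out)
{
	size_t start = (r->head + RING_SLOTS - r->count) % RING_SLOTS;
	size_t i;

	for (i = 0; i < r->count; i++)
		out[i] = r->slot[(start + i) % RING_SLOTS];
	return r->count;
}
```

Recording events 1 through 6 and then dumping yields 3, 4, 5, 6: the
early events are gone. That is why a swapon seen only at the start of
recording cannot be relied upon, and why the state has to be dumped
again when a trace starts.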

> As for why I care what is in /proc/swaps.  Take a look at this:
> 
> struct swap_info_struct *
> get_swap_info_struct(unsigned type)
> {
>         return &swap_info[type];
> }
> 
> Then, look at the proc functions: 
> 
> static void *swap_next(struct seq_file *swap, void *v, loff_t *pos)
> {
>         struct swap_info_struct *ptr;
>         struct swap_info_struct *endptr = swap_info + nr_swapfiles;
> 
>         if (v == SEQ_START_TOKEN)
>                 ptr = swap_info;
> ...
> 
> I guess if that swap_info[] has any holes, we can't relate indexes in
> there right back to /proc/swaps, but maybe we should add some
> information so that we _can_.
> 

The if (!(ptr->flags & SWP_USED)) test in swap_next seems to skip the
unused swap_info entries.
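
A small userspace model of the consequence (a sketch under assumptions:
"struct slot" and listing_row_for_type() are made-up names, and the
SWP_USED value is defined locally for the model only). When the listing
skips unused slots, the row of a swap file in the listing is not its
index in the array:

```c
#include <assert.h>
#include <stddef.h>

#define SWP_USED 1	/* flag value assumed for this model only */

/* Stand-in for swap_info_struct: only the flags matter here. */
struct slot {
	unsigned flags;
};

/*
 * Zero-based row that swap_info[type] would occupy in a listing that
 * skips unused slots, or -1 if that slot is itself unused.
 */
static int listing_row_for_type(const struct slot *info, size_t nr,
				size_t type)
{
	size_t i;
	int row = 0;

	if (type >= nr || !(info[type].flags & SWP_USED))
		return -1;
	for (i = 0; i < type; i++)
		if (info[i].flags & SWP_USED)
			row++;
	return row;
}
```

With slots 0 and 2 in use and slot 1 empty, type 2 shows up on row 1, so
a type number alone does not identify a line of the listing.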

Why should we care about get_swap_info_struct always returning a "used"
swap info struct ?

> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-11-30 19:10                             ` Mathieu Desnoyers
@ 2007-12-04 19:15                               ` Frank Ch. Eigler
  2007-12-04 19:25                                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Frank Ch. Eigler @ 2007-12-04 19:15 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Dave Hansen, akpm, linux-kernel, linux-mm, mbligh

Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> writes:

> [...]
>> > We would like to be able to tell which swap file the information has
>> > been written to/read from at any given time during the trace.
>> 
>> Oh, tracing is expected to be on at all times?  I figured someone would
>> encounter a problem, then turn it on to dig down a little deeper, then
>> turn it off.
>
> Yep, it can be expected to be on at all times, especially on production
> systems using "flight recorder" tracing to record information in a
> circular buffer [...]

Considering how early in the boot sequence swap partitions are
activated, it seems optimistic to assume that the monitoring equipment
will always start up in time to catch the initial swapons.  It would
be more useful if a marker parameter was included in the swap events
to let a tool/user map to /proc/swaps or a file name.

- FChE


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-12-04 19:15                               ` Frank Ch. Eigler
@ 2007-12-04 19:25                                 ` Mathieu Desnoyers
  2007-12-04 19:40                                   ` Dave Hansen
  0 siblings, 1 reply; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-12-04 19:25 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: Dave Hansen, akpm, linux-kernel, linux-mm, mbligh

* Frank Ch. Eigler (fche@redhat.com) wrote:
> Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> writes:
> 
> > [...]
> >> > We would like to be able to tell which swap file the information has
> >> > been written to/read from at any given time during the trace.
> >> 
> >> Oh, tracing is expected to be on at all times?  I figured someone would
> >> encounter a problem, then turn it on to dig down a little deeper, then
> >> turn it off.
> >
> > Yep, it can be expected to be on at all times, especially on production
> > systems using "flight recorder" tracing to record information in a
> > circular buffer [...]
> 
> Considering how early in the boot sequence swap partitions are
> activated, it seems optimistic to assume that the monitoring equipment
> will always start up in time to catch the initial swapons.  It would
> be more useful if a marker parameter was included in the swap events
> to let a tool/user map to /proc/swaps or a file name.
> 
> - FChE

Not early at all ? We have userspace processes running.. this is _late_
in the boot sequence! ;)

Anyhow, what I have now is a combination that includes your proposal:

- I dump the swapon/swapoff events.
- I also dump the equivalent of /proc/swaps (with kernel internal
  information) at trace start to know what swap files are currently
  used.

Does it sound fair ?

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-12-04 19:25                                 ` Mathieu Desnoyers
@ 2007-12-04 19:40                                   ` Dave Hansen
  2007-12-04 20:05                                     ` Mathieu Desnoyers
  0 siblings, 1 reply; 46+ messages in thread
From: Dave Hansen @ 2007-12-04 19:40 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Frank Ch. Eigler, akpm, linux-kernel, linux-mm, mbligh

On Tue, 2007-12-04 at 14:25 -0500, Mathieu Desnoyers wrote:
> 
> - I also dump the equivalent of /proc/swaps (with kernel internal
>   information) at trace start to know what swap files are currently
>   used.

What about just enhancing /proc/swaps so that this information can be
useful to people other than those doing traces?

Now that we have /proc/$pid/pagemap, we expose some of the same
information about which userspace virtual addresses are stored where and
in which swapfile.  

-- Dave



* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-12-04 19:40                                   ` Dave Hansen
@ 2007-12-04 20:05                                     ` Mathieu Desnoyers
  2007-12-04 20:24                                       ` Dave Hansen
  2007-12-04 20:28                                       ` Dave Hansen
  0 siblings, 2 replies; 46+ messages in thread
From: Mathieu Desnoyers @ 2007-12-04 20:05 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Frank Ch. Eigler, akpm, linux-kernel, linux-mm, mbligh

* Dave Hansen (haveblue@us.ibm.com) wrote:
> On Tue, 2007-12-04 at 14:25 -0500, Mathieu Desnoyers wrote:
> > 
> > - I also dump the equivalent of /proc/swaps (with kernel internal
> >   information) at trace start to know what swap files are currently
> >   used.
> 
> What about just enhancing /proc/swaps so that this information can be
> useful to people other than those doing traces?
> 

It includes an in-kernel struct file pointer, exporting it to userspace
would be somewhat ugly.

> Now that we have /proc/$pid/pagemap, we expose some of the same
> information about which userspace virtual addresses are stored where and
> in which swapfile.  
> 

The problems with /proc :

- It exports all the data in formatted text. What I need for my traces
  is pure binary, compact representation.
- It's not very neat to export in-kernel pointer information like a
  kernel tracer would need.
- The locking is very often wrong. I started correcting /proc/modules a
  while ago, but I fear there are quite a few cases where a procfile
  reader could release the locks between two consecutive reads of the
  same list and therefore cause missing information or corruption. While
  being manageable for a proc text file, this is _highly_ unwanted in a
  trace. See my previous "seq file sorted" and "module.c sort module
  list" patches about this. My tracer deals with addition/removal of
  elements to a list between dumps done by "chunks" by tracing the
  modifications done to the list at the same time. However, /proc seq
  files will just get corrupted or forget about an element not touched
  by the modification, which my tracer cannot cope with.
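
One way to picture the chunked-dump approach in the last point is the
replay below. It is a simplified sketch with made-up record types and
integer ids, not the LTTng implementation: dump records and
modification records land in the same ordered trace, and the reader
simply applies them in order.

```c
#include <assert.h>
#include <stddef.h>

/* Record kinds as they would appear, in order, in the trace. */
enum rec_kind {
	REC_DUMP,	/* element seen while dumping a chunk of the list */
	REC_ADD,	/* element added to the list (traced modification) */
	REC_REMOVE	/* element removed from the list (traced modification) */
};

struct rec {
	enum rec_kind kind;
	unsigned id;	/* hypothetical element identifier */
};

/*
 * Replay the records in trace order. present[id] ends up non-zero for
 * every element that is on the list after the last record.
 */
static void replay(const struct rec *recs, size_t n,
		   unsigned char *present, size_t nr_ids)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (recs[i].id >= nr_ids)
			continue;	/* ignore ids outside the table */
		present[recs[i].id] = (recs[i].kind != REC_REMOVE);
	}
}
```

An element removed after its chunk was dumped ends up absent, and one
added behind the dump cursor ends up present, because the modification
events are ordered with respect to the dump records. A text file read
in several chunks has no such second stream to correct it.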

Mathieu


> -- Dave
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-12-04 20:05                                     ` Mathieu Desnoyers
@ 2007-12-04 20:24                                       ` Dave Hansen
  2007-12-04 20:28                                       ` Dave Hansen
  1 sibling, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2007-12-04 20:24 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Frank Ch. Eigler, akpm, linux-kernel, linux-mm, mbligh

On Tue, 2007-12-04 at 15:05 -0500, Mathieu Desnoyers wrote:
> * Dave Hansen (haveblue@us.ibm.com) wrote:
> > On Tue, 2007-12-04 at 14:25 -0500, Mathieu Desnoyers wrote:
> > > 
> > > - I also dump the equivalent of /proc/swaps (with kernel internal
> > >   information) at trace start to know what swap files are currently
> > >   used.
> > 
> > What about just enhancing /proc/swaps so that this information can be
> > useful to people other than those doing traces?
> 
> It includes an in-kernel struct file pointer, exporting it to userspace
> would be somewhat ugly.

What about just exporting the 'type' field that we use to index into
swap_info[]?

As far as /proc goes, it may not be _ideal_ for your traces, but it sure
beats not getting the information out at all. ;)  I guess I'm just not
that familiar with the tracing requirements and I can't really assess
whether what you're asking for is reasonable, or horrible
over-engineering.  Dunno.

-- Dave


* Re: [RFC PATCH] LTTng instrumentation mm (updated)
  2007-12-04 20:05                                     ` Mathieu Desnoyers
  2007-12-04 20:24                                       ` Dave Hansen
@ 2007-12-04 20:28                                       ` Dave Hansen
  1 sibling, 0 replies; 46+ messages in thread
From: Dave Hansen @ 2007-12-04 20:28 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: Frank Ch. Eigler, akpm, linux-kernel, linux-mm, mbligh

Or, think out of the box...

Maybe you can introduce some interfaces that expose information both in
sysfs (in normal human-readable formats) and in a way that lets you get
the same data out in some binary format.  
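
One way to picture this suggestion is a single record with two
renderings. The sketch below is hypothetical: struct swap_rec, its
fields, and the functions swap_rec_text() and swap_rec_binary() are
invented for illustration and are not kernel interfaces. It only shows
that the text form and the compact binary form can be produced from the
same data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record; the fields are illustrative, not the kernel's. */
struct swap_rec {
	uint32_t type;	/* index into swap_info[], as suggested in the thread */
	uint32_t pages;
	uint32_t inuse;
};

/* Human-readable rendering, sysfs-style. Returns snprintf()'s result. */
static int swap_rec_text(const struct swap_rec *r, char *buf, size_t len)
{
	assert(r && buf);
	return snprintf(buf, len, "type=%u pages=%u inuse=%u\n",
			(unsigned)r->type, (unsigned)r->pages,
			(unsigned)r->inuse);
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(unsigned char *p, uint32_t v)
{
	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}

/*
 * Compact rendering: three fixed-size little-endian fields, 12 bytes.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t swap_rec_binary(const struct swap_rec *r,
			      unsigned char *buf, size_t len)
{
	assert(r && buf);
	if (len < 12)
		return 0;
	put_le32(buf, r->type);
	put_le32(buf + 4, r->pages);
	put_le32(buf + 8, r->inuse);
	return 12;
}
```

Both functions read the same struct, so a text consumer and a trace
consumer would see the same values and differ only in encoding.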

Seems to me you'll have a lot easier time justifying all of these lines
of code spread all over the kernel if there are a few more users off the
bat.  

-- Dave


Thread overview: 46+ messages
2007-11-13 19:33 [RFC 0/7] LTTng Kernel Instrumentation (Architecture Independent) Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 1/7] Include marker.h in kernel.h -- temporary, for code readability Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 2/7] LTTng instrumentation fs Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 3/7] LTTng instrumentation ipc Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 4/7] LTTng instrumentation kernel Mathieu Desnoyers
2007-11-15 23:30   ` Mike Mason
2007-11-15 23:54     ` Mike Mason
2007-11-16  2:42       ` Mathieu Desnoyers
2007-11-16  2:22     ` Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
2007-11-15 21:06   ` Dave Hansen
2007-11-15 21:51     ` Mathieu Desnoyers
2007-11-15 22:16       ` Dave Hansen
2007-11-16 14:30         ` Mathieu Desnoyers
2007-11-19 18:04           ` Dave Hansen
2007-11-28 14:09             ` [RFC PATCH] LTTng instrumentation mm (using page_to_pfn) Mathieu Desnoyers
2007-11-28 16:54               ` Dave Hansen
2007-11-29  2:34                 ` Mathieu Desnoyers
2007-11-29  6:25                   ` Dave Hansen
2007-11-30 16:11                     ` [RFC PATCH] LTTng instrumentation mm (updated) Mathieu Desnoyers
2007-11-30 17:46                       ` Dave Hansen
2007-11-30 17:05                         ` Mathieu Desnoyers
2007-11-30 18:42                           ` Dave Hansen
2007-11-30 19:10                             ` Mathieu Desnoyers
2007-12-04 19:15                               ` Frank Ch. Eigler
2007-12-04 19:25                                 ` Mathieu Desnoyers
2007-12-04 19:40                                   ` Dave Hansen
2007-12-04 20:05                                     ` Mathieu Desnoyers
2007-12-04 20:24                                       ` Dave Hansen
2007-12-04 20:28                                       ` Dave Hansen
2007-11-16 14:47         ` [RFC 5/7] LTTng instrumentation mm Mathieu Desnoyers
2007-11-19 18:07           ` Dave Hansen
2007-11-19 18:52             ` Mathieu Desnoyers
2007-11-19 19:00               ` Mathieu Desnoyers
2007-11-19 19:43                 ` Dave Hansen
2007-11-19 19:43               ` Dave Hansen
2007-11-19 19:52                 ` [PATCH] Cast __page_to_pfn to unsigned long in CONFIG_SPARSEMEM Mathieu Desnoyers
2007-11-19 20:09                   ` Dave Hansen
2007-11-19 20:20                     ` [PATCH] Cast page_to_pfn " Mathieu Desnoyers
2007-11-19 21:08                       ` Andrew Morton
2007-11-19 21:19                         ` Dave Hansen
2007-11-19 21:26                           ` Dave Hansen
2007-11-21 20:12                           ` Christoph Lameter
2007-11-20 17:34                         ` Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 6/7] LTTng instrumentation net Mathieu Desnoyers
2007-11-13 19:33 ` [RFC 7/7] Add Markers Into Semaphore Primitives Mathieu Desnoyers
