LKML Archive on lore.kernel.org
* [patch v2 0/3] OProfile support for System z's hardware sampling
@ 2011-01-21 10:06 Heinz Graalfs
  2011-01-21 10:06 ` [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up) Heinz Graalfs
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Heinz Graalfs @ 2011-01-21 10:06 UTC
  To: robert.richter
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens

Hello Robert,

I'm resending yesterday's mail because I failed to specify the correct sender information.

This is a re-posting of the patch series originally posted last month:

http://marc.info/?l=linux-s390&m=129285043619973&w=2

Heinz

Changes in

v2:
   - kernel module hwsampler removed, everything is now in oprofile kernel module
   - functions from hwsampler-main.c and smpctl.c merged into arch/s390/oprofile/hwsampler.c
     - functions made static
   - arch/s390/include/asm/hwsampler.h moved to arch/s390/oprofile/hwsampler.h
     - structs have now hws_ prefix
   - config variables changed, HAVE_HWSAMPLER used only
   - original patch 4 (handle_munmap.patch) removed

Description:

So far, OProfile takes samples by using a software interrupt.
The purpose of this series of patches is to add support for System z hardware sampling to OProfile.

Hardware (HW) sampling is a feature provided by System z processors (z10 and later models).
When sampling, the processor takes samples containing the instruction address, PID, and other information.
The samples are taken at a programmable rate and stored into a buffer provided by the operating system.
The sampling process is implemented in hardware and millicode and thus does not affect the operating system
being observed, apart from requiring buffer memory that the Linux kernel must provide.
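
As an illustration of the programmable rate: patch 1/3 queries the minimum and maximum sampling interval of every online CPU, keeps the largest minimum and the smallest maximum, and rejects a requested interval outside that range. The following is only a user-space sketch of that check. The names rate_range, common_range and check_interval are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical container for one CPU's supported interval range. */
struct rate_range {
	unsigned long min;
	unsigned long max;
};

/* Intersect all per-CPU ranges: largest minimum, smallest maximum. */
static struct rate_range common_range(const struct rate_range *cpu, size_t n)
{
	struct rate_range r;
	size_t i;

	assert(cpu != NULL && n > 0);
	r = cpu[0];
	for (i = 1; i < n; i++) {
		if (cpu[i].min > r.min)
			r.min = cpu[i].min;
		if (cpu[i].max < r.max)
			r.max = cpu[i].max;
	}
	return r;
}

/* Reject an interval outside the common range, as the start path does. */
static int check_interval(struct rate_range r, unsigned long interval)
{
	if (interval < r.min || interval > r.max)
		return -EINVAL;
	return 0;
}
```

With two CPUs reporting the made-up ranges {100, 1000} and {200, 800}, the common range is {200, 800}, so an interval of 150 would be rejected and 500 accepted.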

Hardware sampling is available in LPAR mode on 64-bit processors only.

The overall approach is to replace the software-based sample generation by hardware sampling.
All required functionality to control the HW sampling mechanism is added to the oprofile kernel module.
The functions provide support for
 - controlling the sampling hardware,
 - setting up appropriate buffer structures (HW buffers),
 - retrieving sample entries from these buffers.
Multiple CPUs can be handled.
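
The order in which these control functions may be called can be pictured as a small state machine. The sketch below is a simplified user-space model, not the kernel code: it covers only the success path of setup, allocate, start, stop and deallocate, and the model_* names are hypothetical. The real stop function in patch 1/3 is more lenient about its starting state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Subset of the sampler states; MODEL_UNSET stands for "before setup". */
enum model_state {
	MODEL_UNSET = 0,
	MODEL_DEALLOCATED,
	MODEL_STOPPED,
	MODEL_STARTED,
};

struct model {
	enum model_state state;
};

/* Move from 'from' to 'to', or fail with -EINVAL in any other state. */
static int model_step(struct model *m, enum model_state from,
		      enum model_state to)
{
	assert(m != NULL);
	if (m->state != from)
		return -EINVAL;
	m->state = to;
	return 0;
}

static int model_setup(struct model *m)
{
	return model_step(m, MODEL_UNSET, MODEL_DEALLOCATED);
}

static int model_allocate(struct model *m)
{
	return model_step(m, MODEL_DEALLOCATED, MODEL_STOPPED);
}

static int model_start_all(struct model *m)
{
	return model_step(m, MODEL_STOPPED, MODEL_STARTED);
}

static int model_stop_all(struct model *m)
{
	return model_step(m, MODEL_STARTED, MODEL_STOPPED);
}

static int model_deallocate(struct model *m)
{
	return model_step(m, MODEL_STOPPED, MODEL_DEALLOCATED);
}
```

In this model, starting before buffers are allocated, or deallocating while sampling is running, fails with -EINVAL.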

The samples contain the instruction address, a bit distinguishing between kernel and user space,
and for user space samples also the PID.
OProfile then takes its samples from the HW buffers instead of from its own per-CPU buffers.
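
In patch 1/3 the HW buffers are tables of sample-data-block addresses. The last used entry of each table is a link entry (bit 0 set) pointing to the next table, and the final link points back to the first table, so the tables form a ring. The following user-space sketch shows one way to walk such a ring. The function names are hypothetical, and it assumes a well-formed ring whose block and table addresses are at least 2-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINK_BIT ((uintptr_t)1)

/* A table entry with bit 0 set links to the next table. */
static int entry_is_link(uintptr_t entry)
{
	return (entry & LINK_BIT) != 0;
}

/* Strip the link bit to recover the address of the next table. */
static uintptr_t *entry_link_target(uintptr_t entry)
{
	return (uintptr_t *)(entry & ~LINK_BIT);
}

/* Count the data-block slots reachable from 'first' until the ring closes. */
static size_t count_block_slots(uintptr_t *first)
{
	uintptr_t *cur = first;
	size_t n = 0;

	assert(first != NULL);
	for (;;) {
		if (entry_is_link(*cur)) {
			cur = entry_link_target(*cur);
			if (cur == first)
				return n;
		} else {
			n++;
			cur++;
		}
	}
}
```

The same link test decides, when consuming samples, whether to advance to the next slot or to jump to the next table.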

If hardware sampling can be enabled on the current System z processor, it becomes the new default.
To switch back to timer-based sampling, use

   echo 0 > /dev/oprofile/hwsampling/hwsampler

OProfile's user-space tools also need to be extended with options to control HW sampling.



* [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up)
  2011-01-21 10:06 [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
@ 2011-01-21 10:06 ` Heinz Graalfs
  2011-02-14 18:57   ` Robert Richter
  2011-03-25 11:00   ` Robert Richter
  2011-01-21 10:06 ` [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature Heinz Graalfs
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Heinz Graalfs @ 2011-01-21 10:06 UTC
  To: robert.richter
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

[-- Attachment #1: hwsampler-base.patch --]
[-- Type: text/plain, Size: 34690 bytes --]

From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>

System z's hardware sampling is described in detail in:

   SA23-2260-01 "The Load-Program-Parameter and CPU-Measurement Facilities"

The patch
 - introduces support for System z's hardware sampler in OProfile's kernel module
 - adds functions that control all hardware-sampling-related operations, such as
   - checking whether the hardware sampling feature is available
     (i.e. on System z models z10 and up, in LPAR mode only, and authorised during LPAR activation)
   - allocating memory for the hardware sampling feature
   - starting/stopping hardware sampling
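
The availability check in this patch tests one bit of the facility list: bit 59 (counted from the least-significant end) of the second doubleword. With the facility list numbered from the most-significant bit of the first doubleword, that is facility bit 68. The following user-space sketch shows that numbering. The helper name facility_bit_set is hypothetical and is not a kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Test facility bit 'nr' in a list of 64-bit doublewords. Bit 0 is the
 * most-significant bit of fac_list[0]. Bits beyond the list count as
 * not set.
 */
static int facility_bit_set(const uint64_t *fac_list, size_t nr_dwords,
			    unsigned int nr)
{
	assert(fac_list != NULL);
	if (nr / 64 >= nr_dwords)
		return 0;
	return (int)((fac_list[nr / 64] >> (63 - nr % 64)) & 1);
}
```

For a two-doubleword list whose second doubleword is 1ULL << 59, the helper reports bit 68 as set and bit 67 as clear.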

All functions required to start and stop hardware sampling have to be
invoked by the oprofile kernel module as provided by the other patches of this patch set.

If hardware-based sampling cannot be set up, OProfile uses standard timer-based sampling.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
---
 arch/Kconfig                   |    3 
 arch/s390/Kconfig              |    1 
 arch/s390/oprofile/hwsampler.c | 1256 +++++++++++++++++++++++++++++++++++++++++
 arch/s390/oprofile/hwsampler.h |  113 +++
 4 files changed, 1373 insertions(+)

Index: linux-2.6/arch/s390/oprofile/hwsampler.c
===================================================================
--- /dev/null
+++ linux-2.6/arch/s390/oprofile/hwsampler.c
@@ -0,0 +1,1256 @@
+/**
+ * arch/s390/oprofile/hwsampler.c
+ *
+ * Copyright IBM Corp. 2010
+ * Author: Heinz Graalfs <graalfs@de.ibm.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/smp.h>
+#include <linux/errno.h>
+#include <linux/workqueue.h>
+#include <linux/interrupt.h>
+#include <linux/notifier.h>
+#include <linux/cpu.h>
+#include <linux/mutex.h>
+#include <linux/oom.h>
+#include <linux/oprofile.h>
+
+#include <asm/lowcore.h>
+#include <asm/s390_ext.h>
+
+#include "hwsampler.h"
+
+#define MAX_NUM_SDB 511
+#define MIN_NUM_SDB 1
+
+#define ALERT_REQ_MASK   0x4000000000000000ul
+#define BUFFER_FULL_MASK 0x8000000000000000ul
+
+#define EI_IEA      (1 << 31)	/* invalid entry address              */
+#define EI_ISE      (1 << 30)	/* incorrect SDBT entry               */
+#define EI_PRA      (1 << 29)	/* program request alert              */
+#define EI_SACA     (1 << 23)	/* sampler authorization change alert */
+#define EI_LSDA     (1 << 22)	/* loss of sample data alert          */
+
+DECLARE_PER_CPU(struct hws_cpu_buffer, sampler_cpu_buffer);
+
+struct hws_execute_parms {
+	void *buffer;
+	signed int rc;
+};
+
+DEFINE_PER_CPU(struct hws_cpu_buffer, sampler_cpu_buffer);
+EXPORT_PER_CPU_SYMBOL(sampler_cpu_buffer);
+
+static DEFINE_MUTEX(hws_sem);
+static DEFINE_MUTEX(hws_sem_oom);
+
+static unsigned char hws_flush_all;
+static unsigned int hws_oom;
+static struct workqueue_struct *hws_wq;
+
+static unsigned int hws_state;
+enum {
+	HWS_INIT = 1,
+	HWS_DEALLOCATED,
+	HWS_STOPPED,
+	HWS_STARTED,
+	HWS_STOPPING };
+
+/* set to 1 if called by kernel during memory allocation */
+static unsigned char oom_killer_was_active;
+/* size of SDBT and SDB as of allocate API */
+static unsigned long num_sdbt = 100;
+static unsigned long num_sdb = 511;
+/* sampling interval (machine cycles) */
+static unsigned long interval;
+
+static unsigned long min_sampler_rate;
+static unsigned long max_sampler_rate;
+
+static int ssctl(void *buffer)
+{
+	int cc;
+
+	/* set in order to detect a program check */
+	cc = 1;
+
+	asm volatile(
+		"0: .insn s,0xB2870000,0(%1)\n"
+		"1: ipm %0\n"
+		"   srl %0,28\n"
+		"2:\n"
+		EX_TABLE(0b, 2b) EX_TABLE(1b, 2b)
+		: "+d" (cc), "+a" (buffer)
+		: "m" (*((struct hws_ssctl_request_block *)buffer))
+		: "cc", "memory");
+
+	return cc ? -EINVAL : 0;
+}
+
+static int qsi(void *buffer)
+{
+	int cc;
+	cc = 1;
+
+	asm volatile(
+		"0: .insn s,0xB2860000,0(%1)\n"
+		"1: lhi %0,0\n"
+		"2:\n"
+		EX_TABLE(0b, 2b) EX_TABLE(1b, 2b)
+		: "+d" (cc), "+a" (buffer)
+		: "m" (*((struct hws_qsi_info_block *)buffer))
+		: "cc", "memory");
+
+	return cc ? -EINVAL : 0;
+}
+
+static void execute_qsi(void *parms)
+{
+	struct hws_execute_parms *ep = parms;
+
+	ep->rc = qsi(ep->buffer);
+}
+
+static void execute_ssctl(void *parms)
+{
+	struct hws_execute_parms *ep = parms;
+
+	ep->rc = ssctl(ep->buffer);
+}
+
+static int smp_ctl_ssctl_stop(int cpu)
+{
+	int rc;
+	struct hws_execute_parms ep;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	cb->ssctl.es = 0;
+	cb->ssctl.cs = 0;
+
+	ep.buffer = &cb->ssctl;
+	smp_call_function_single(cpu, execute_ssctl, &ep, 1);
+	rc = ep.rc;
+	if (rc) {
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF SSCTL failed.\n", cpu);
+		dump_stack();
+	}
+
+	ep.buffer = &cb->qsi;
+	smp_call_function_single(cpu, execute_qsi, &ep, 1);
+
+	if (cb->qsi.es || cb->qsi.cs) {
+		printk(KERN_EMERG "CPUMF sampling did not stop properly.\n");
+		dump_stack();
+	}
+
+	return rc;
+}
+
+static int smp_ctl_ssctl_deactivate(int cpu)
+{
+	int rc;
+	struct hws_execute_parms ep;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	cb->ssctl.es = 1;
+	cb->ssctl.cs = 0;
+
+	ep.buffer = &cb->ssctl;
+	smp_call_function_single(cpu, execute_ssctl, &ep, 1);
+	rc = ep.rc;
+	if (rc)
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF SSCTL failed.\n", cpu);
+
+	ep.buffer = &cb->qsi;
+	smp_call_function_single(cpu, execute_qsi, &ep, 1);
+
+	if (cb->qsi.cs)
+		printk(KERN_EMERG "CPUMF sampling was not set inactive.\n");
+
+	return rc;
+}
+
+static int smp_ctl_ssctl_enable_activate(int cpu, unsigned long interval)
+{
+	int rc;
+	struct hws_execute_parms ep;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	cb->ssctl.h = 1;
+	cb->ssctl.tear = cb->first_sdbt;
+	cb->ssctl.dear = *(unsigned long *) cb->first_sdbt;
+	cb->ssctl.interval = interval;
+	cb->ssctl.es = 1;
+	cb->ssctl.cs = 1;
+
+	ep.buffer = &cb->ssctl;
+	smp_call_function_single(cpu, execute_ssctl, &ep, 1);
+	rc = ep.rc;
+	if (rc)
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF SSCTL failed.\n", cpu);
+
+	ep.buffer = &cb->qsi;
+	smp_call_function_single(cpu, execute_qsi, &ep, 1);
+	if (ep.rc)
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF QSI failed.\n", cpu);
+
+	return rc;
+}
+
+static int smp_ctl_qsi(int cpu)
+{
+	struct hws_execute_parms ep;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	ep.buffer = &cb->qsi;
+	smp_call_function_single(cpu, execute_qsi, &ep, 1);
+
+	return ep.rc;
+}
+
+static inline unsigned long *trailer_entry_ptr(unsigned long v)
+{
+	void *ret;
+
+	ret = (void *)v;
+	ret += PAGE_SIZE;
+	ret -= sizeof(struct hws_trailer_entry);
+
+	return (unsigned long *) ret;
+}
+
+/* prototypes for external interrupt handler and worker */
+static void hws_ext_handler(unsigned int ext_int_code,
+				unsigned int param32, unsigned long param64);
+
+static void worker(struct work_struct *work);
+
+static void add_samples_to_oprofile(unsigned cpu, unsigned long *,
+				unsigned long *dear);
+
+static void init_all_cpu_buffers(void)
+{
+	int cpu;
+	struct hws_cpu_buffer *cb;
+
+	for_each_online_cpu(cpu) {
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+		memset(cb, 0, sizeof(struct hws_cpu_buffer));
+	}
+}
+
+static int is_link_entry(unsigned long *s)
+{
+	return *s & 0x1ul ? 1 : 0;
+}
+
+static unsigned long *get_next_sdbt(unsigned long *s)
+{
+	return (unsigned long *) (*s & ~0x1ul);
+}
+
+static int prepare_cpu_buffers(void)
+{
+	int cpu;
+	int rc;
+	struct hws_cpu_buffer *cb;
+
+	rc = 0;
+	for_each_online_cpu(cpu) {
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+		atomic_set(&cb->ext_params, 0);
+		cb->worker_entry = 0;
+		cb->sample_overflow = 0;
+		cb->req_alert = 0;
+		cb->incorrect_sdbt_entry = 0;
+		cb->invalid_entry_address = 0;
+		cb->loss_of_sample_data = 0;
+		cb->sample_auth_change_alert = 0;
+		cb->finish = 0;
+		cb->oom = 0;
+		cb->stop_mode = 0;
+	}
+
+	return rc;
+}
+
+/*
+ * allocate_sdbt() - allocate sampler memory
+ * @cpu: the cpu for which sampler memory is allocated
+ *
+ * A 4K page is allocated for each requested SDBT.
+ * A maximum of 511 4K pages are allocated for the SDBs in each of the SDBTs.
+ * Set ALERT_REQ mask in each SDB's trailer.
+ * Returns zero if successful, <0 otherwise.
+ */
+static int allocate_sdbt(int cpu)
+{
+	int j, k, rc;
+	unsigned long *sdbt;
+	unsigned long  sdb;
+	unsigned long *tail;
+	unsigned long *trailer;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	if (cb->first_sdbt)
+		return -EINVAL;
+
+	sdbt = NULL;
+	tail = sdbt;
+
+	for (j = 0; j < num_sdbt; j++) {
+		sdbt = (unsigned long *)get_zeroed_page(GFP_KERNEL);
+
+		mutex_lock(&hws_sem_oom);
+		/* OOM killer might have been activated */
+		barrier();
+		if (oom_killer_was_active || !sdbt) {
+			if (sdbt)
+				free_page((unsigned long)sdbt);
+
+			goto allocate_sdbt_error;
+		}
+		if (cb->first_sdbt == 0)
+			cb->first_sdbt = (unsigned long)sdbt;
+
+		/* link current page to tail of chain */
+		if (tail)
+			*tail = (unsigned long)(void *)sdbt + 1;
+
+		mutex_unlock(&hws_sem_oom);
+
+		for (k = 0; k < num_sdb; k++) {
+			/* get and set SDB page */
+			sdb = get_zeroed_page(GFP_KERNEL);
+
+			mutex_lock(&hws_sem_oom);
+			/* OOM killer might have been activated */
+			barrier();
+			if (oom_killer_was_active || !sdb) {
+				if (sdb)
+					free_page(sdb);
+
+				goto allocate_sdbt_error;
+			}
+			*sdbt = sdb;
+			trailer = trailer_entry_ptr(*sdbt);
+			*trailer = ALERT_REQ_MASK;
+			sdbt++;
+			mutex_unlock(&hws_sem_oom);
+		}
+		tail = sdbt;
+	}
+	mutex_lock(&hws_sem_oom);
+	if (oom_killer_was_active)
+		goto allocate_sdbt_error;
+
+	rc = 0;
+	if (tail)
+		*tail = (unsigned long)
+			((void *)cb->first_sdbt) + 1;
+
+allocate_sdbt_exit:
+	mutex_unlock(&hws_sem_oom);
+	return rc;
+
+allocate_sdbt_error:
+	rc = -ENOMEM;
+	goto allocate_sdbt_exit;
+}
+
+/*
+ * deallocate_sdbt() - deallocate all sampler memory
+ *
+ * For each online CPU all SDBT trees are deallocated.
+ * Returns the number of freed pages.
+ */
+static int deallocate_sdbt(void)
+{
+	int cpu;
+	int counter;
+
+	counter = 0;
+
+	for_each_online_cpu(cpu) {
+		unsigned long start;
+		unsigned long sdbt;
+		unsigned long *curr;
+		struct hws_cpu_buffer *cb;
+
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+		if (!cb->first_sdbt)
+			continue;
+
+		sdbt = cb->first_sdbt;
+		curr = (unsigned long *) sdbt;
+		start = sdbt;
+
+		/* we'll free the SDBT after all SDBs are processed... */
+		while (1) {
+			if (!*curr || !sdbt)
+				break;
+
+			/* watch for link entry reset if found */
+			if (is_link_entry(curr)) {
+				curr = get_next_sdbt(curr);
+				if (sdbt)
+					free_page(sdbt);
+
+				/* we are done if we reach the start */
+				if ((unsigned long) curr == start)
+					break;
+				else
+					sdbt = (unsigned long) curr;
+			} else {
+				/* process SDB pointer */
+				if (*curr) {
+					free_page(*curr);
+					curr++;
+				}
+			}
+			counter++;
+		}
+		cb->first_sdbt = 0;
+	}
+	return counter;
+}
+
+static int start_sampling(int cpu)
+{
+	int rc;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+	rc = smp_ctl_ssctl_enable_activate(cpu, interval);
+	if (rc) {
+		printk(KERN_INFO "hwsampler: CPU %d ssctl failed.\n", cpu);
+		goto start_exit;
+	}
+
+	rc = -EINVAL;
+	if (!cb->qsi.es) {
+		printk(KERN_INFO "hwsampler: CPU %d ssctl not enabled.\n", cpu);
+		goto start_exit;
+	}
+
+	if (!cb->qsi.cs) {
+		printk(KERN_INFO "hwsampler: CPU %d ssctl not active.\n", cpu);
+		goto start_exit;
+	}
+
+	printk(KERN_INFO
+		"hwsampler: CPU %d, CPUMF Sampling started, interval %lu.\n",
+		cpu, interval);
+
+	rc = 0;
+
+start_exit:
+	return rc;
+}
+
+static int stop_sampling(int cpu)
+{
+	unsigned long v;
+	int rc;
+	struct hws_cpu_buffer *cb;
+
+	rc = smp_ctl_qsi(cpu);
+	WARN_ON(rc);
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+	if (!rc && !cb->qsi.es)
+		printk(KERN_INFO "hwsampler: CPU %d, already stopped.\n", cpu);
+
+	rc = smp_ctl_ssctl_stop(cpu);
+	if (rc) {
+		printk(KERN_INFO "hwsampler: CPU %d, ssctl stop error %d.\n",
+				cpu, rc);
+		goto stop_exit;
+	}
+
+	printk(KERN_INFO "hwsampler: CPU %d, CPUMF Sampling stopped.\n", cpu);
+
+stop_exit:
+	v = cb->req_alert;
+	if (v)
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF Request alert,"
+				" count=%lu.\n", cpu, v);
+
+	v = cb->loss_of_sample_data;
+	if (v)
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF Loss of sample data,"
+				" count=%lu.\n", cpu, v);
+
+	v = cb->invalid_entry_address;
+	if (v)
+		printk(KERN_ERR "hwsampler: CPU %d CPUMF Invalid entry address,"
+				" count=%lu.\n", cpu, v);
+
+	v = cb->incorrect_sdbt_entry;
+	if (v)
+		printk(KERN_ERR
+				"hwsampler: CPU %d CPUMF Incorrect SDBT address,"
+				" count=%lu.\n", cpu, v);
+
+	v = cb->sample_auth_change_alert;
+	if (v)
+		printk(KERN_ERR
+				"hwsampler: CPU %d CPUMF Sample authorization change,"
+				" count=%lu.\n", cpu, v);
+
+	return rc;
+}
+
+static int check_hardware_prerequisites(void)
+{
+	unsigned long long facility_bits[2];
+
+	memcpy(facility_bits, S390_lowcore.stfle_fac_list, sizeof(facility_bits));
+	if (!(facility_bits[1] & (1ULL << 59)))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+/*
+ * hws_oom_callback() - the OOM callback function
+ *
+ * In case the callback is invoked during memory allocation for the
+ *  hw sampler, all obtained memory is deallocated and a flag is set
+ *  so main sampler memory allocation can exit with a failure code.
+ * In case the callback is invoked during sampling the hw sampler
+ *  is deactivated for all CPUs.
+ */
+static int hws_oom_callback(struct notifier_block *nfb,
+	unsigned long dummy, void *parm)
+{
+	unsigned long *freed;
+	int cpu;
+	struct hws_cpu_buffer *cb;
+
+	freed = parm;
+
+	mutex_lock(&hws_sem_oom);
+
+	if (hws_state == HWS_DEALLOCATED) {
+		/* during memory allocation */
+		if (oom_killer_was_active == 0) {
+			oom_killer_was_active = 1;
+			*freed += deallocate_sdbt();
+		}
+	} else {
+		int i;
+		cpu = get_cpu();
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+		if (!cb->oom) {
+			for_each_online_cpu(i) {
+				smp_ctl_ssctl_deactivate(i);
+				cb->oom = 1;
+			}
+			cb->finish = 1;
+
+			printk(KERN_INFO
+				"hwsampler: CPU %d, OOM notify during CPUMF Sampling.\n",
+				cpu);
+		}
+	}
+
+	mutex_unlock(&hws_sem_oom);
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block hws_oom_notifier = {
+	.notifier_call = hws_oom_callback
+};
+
+static int __cpuinit hws_cpu_callback(struct notifier_block *nfb,
+	unsigned long action, void *hcpu)
+{
+	/* We do not have sampler space available for all possible CPUs.
+	   All CPUs should be online when hw sampling is activated. */
+	return NOTIFY_BAD;
+}
+
+static struct notifier_block hws_cpu_notifier = {
+	.notifier_call = hws_cpu_callback
+};
+
+/**
+ * hwsampler_deactivate() - set hardware sampling temporarily inactive
+ * @cpu:  specifies the CPU to be set inactive.
+ *
+ * Returns 0 on success, !0 on failure.
+ */
+int hwsampler_deactivate(unsigned int cpu)
+{
+	/*
+	 * Deactivate hw sampling temporarily and flush the buffer
+	 * by pushing all the pending samples to oprofile buffer.
+	 *
+	 * This function can be called under one of the following conditions:
+	 *     Memory unmap, task is exiting.
+	 */
+	int rc;
+	struct hws_cpu_buffer *cb;
+
+	rc = 0;
+	mutex_lock(&hws_sem);
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+	if (hws_state == HWS_STARTED) {
+		rc = smp_ctl_qsi(cpu);
+		WARN_ON(rc);
+		if (cb->qsi.cs) {
+			rc = smp_ctl_ssctl_deactivate(cpu);
+			if (rc) {
+				printk(KERN_INFO
+				"hwsampler: CPU %d, CPUMF Deactivation failed.\n", cpu);
+				cb->finish = 1;
+				hws_state = HWS_STOPPING;
+			} else  {
+				hws_flush_all = 1;
+				/* Add work to queue to read pending samples.*/
+				queue_work_on(cpu, hws_wq, &cb->worker);
+			}
+		}
+	}
+	mutex_unlock(&hws_sem);
+
+	if (hws_wq)
+		flush_workqueue(hws_wq);
+
+	return rc;
+}
+
+/**
+ * hwsampler_activate() - activate/resume hardware sampling which was deactivated
+ * @cpu:  specifies the CPU to be set active.
+ *
+ * Returns 0 on success, !0 on failure.
+ */
+int hwsampler_activate(unsigned int cpu)
+{
+	/*
+	 * Re-activate hw sampling. This should be called in pair with
+	 * hwsampler_deactivate().
+	 */
+	int rc;
+	struct hws_cpu_buffer *cb;
+
+	rc = 0;
+	mutex_lock(&hws_sem);
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+	if (hws_state == HWS_STARTED) {
+		rc = smp_ctl_qsi(cpu);
+		WARN_ON(rc);
+		if (!cb->qsi.cs) {
+			hws_flush_all = 0;
+			rc = smp_ctl_ssctl_enable_activate(cpu, interval);
+			if (rc) {
+				printk(KERN_ERR
+				"CPU %d, CPUMF activate sampling failed.\n",
+					 cpu);
+			}
+		}
+	}
+
+	mutex_unlock(&hws_sem);
+
+	return rc;
+}
+
+static void hws_ext_handler(unsigned int ext_int_code,
+			    unsigned int param32, unsigned long param64)
+{
+	int cpu;
+	struct hws_cpu_buffer *cb;
+
+	cpu = smp_processor_id();
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	atomic_xchg(
+			&cb->ext_params,
+			atomic_read(&cb->ext_params)
+				| S390_lowcore.ext_params);
+
+	if (hws_wq)
+		queue_work(hws_wq, &cb->worker);
+}
+
+static int check_qsi_on_setup(void)
+{
+	int rc;
+	unsigned int cpu;
+	struct hws_cpu_buffer *cb;
+
+	for_each_online_cpu(cpu) {
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+		rc = smp_ctl_qsi(cpu);
+		WARN_ON(rc);
+		if (rc)
+			return -EOPNOTSUPP;
+
+		if (!cb->qsi.as) {
+			printk(KERN_INFO "hwsampler: CPUMF sampling is not authorized.\n");
+			return -EINVAL;
+		}
+
+		if (cb->qsi.es) {
+			printk(KERN_WARNING "hwsampler: CPUMF is still enabled.\n");
+			rc = smp_ctl_ssctl_stop(cpu);
+			if (rc)
+				return -EINVAL;
+
+			printk(KERN_INFO
+				"CPU %d, CPUMF Sampling stopped now.\n", cpu);
+		}
+	}
+	return 0;
+}
+
+static int check_qsi_on_start(void)
+{
+	unsigned int cpu;
+	int rc;
+	struct hws_cpu_buffer *cb;
+
+	for_each_online_cpu(cpu) {
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+		rc = smp_ctl_qsi(cpu);
+		WARN_ON(rc);
+
+		if (!cb->qsi.as)
+			return -EINVAL;
+
+		if (cb->qsi.es)
+			return -EINVAL;
+
+		if (cb->qsi.cs)
+			return -EINVAL;
+	}
+	return 0;
+}
+
+static void worker_on_start(unsigned int cpu)
+{
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+	cb->worker_entry = cb->first_sdbt;
+}
+
+static int worker_check_error(unsigned int cpu, int ext_params)
+{
+	int rc;
+	unsigned long *sdbt;
+	struct hws_cpu_buffer *cb;
+
+	rc = 0;
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+	sdbt = (unsigned long *) cb->worker_entry;
+
+	if (!sdbt || !*sdbt)
+		return -EINVAL;
+
+	if (ext_params & EI_PRA)
+		cb->req_alert++;
+
+	if (ext_params & EI_LSDA)
+		cb->loss_of_sample_data++;
+
+	if (ext_params & EI_IEA) {
+		cb->invalid_entry_address++;
+		rc = -EINVAL;
+	}
+
+	if (ext_params & EI_ISE) {
+		cb->incorrect_sdbt_entry++;
+		rc = -EINVAL;
+	}
+
+	if (ext_params & EI_SACA) {
+		cb->sample_auth_change_alert++;
+		rc = -EINVAL;
+	}
+
+	return rc;
+}
+
+static void worker_on_finish(unsigned int cpu)
+{
+	int rc, i;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	if (cb->finish) {
+		rc = smp_ctl_qsi(cpu);
+		WARN_ON(rc);
+		if (cb->qsi.es) {
+			printk(KERN_INFO
+				"hwsampler: CPU %d, CPUMF Stop/Deactivate sampling.\n",
+				cpu);
+			rc = smp_ctl_ssctl_stop(cpu);
+			if (rc)
+				printk(KERN_INFO
+					"hwsampler: CPU %d, CPUMF Deactivation failed.\n",
+					cpu);
+
+			for_each_online_cpu(i) {
+				if (i == cpu)
+					continue;
+				if (!cb->finish) {
+					cb->finish = 1;
+					queue_work_on(i, hws_wq,
+						&cb->worker);
+				}
+			}
+		}
+	}
+}
+
+static void worker_on_interrupt(unsigned int cpu)
+{
+	unsigned long *sdbt;
+	unsigned char done;
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	sdbt = (unsigned long *) cb->worker_entry;
+
+	done = 0;
+	/* do not proceed if stop was entered,
+	 * forget the buffers not yet processed */
+	while (!done && !cb->stop_mode) {
+		unsigned long *trailer;
+		struct hws_trailer_entry *te;
+		unsigned long *dear = 0;
+
+		trailer = trailer_entry_ptr(*sdbt);
+		/* leave loop if no more work to do */
+		if (!(*trailer & BUFFER_FULL_MASK)) {
+			done = 1;
+			if (!hws_flush_all)
+				continue;
+		}
+
+		te = (struct hws_trailer_entry *)trailer;
+		cb->sample_overflow += te->overflow;
+
+		add_samples_to_oprofile(cpu, sdbt, dear);
+
+		/* reset trailer */
+		xchg((unsigned char *) te, 0x40);
+
+		/* advance to next sdb slot in current sdbt */
+		sdbt++;
+		/* in case link bit is set use address w/o link bit */
+		if (is_link_entry(sdbt))
+			sdbt = get_next_sdbt(sdbt);
+
+		cb->worker_entry = (unsigned long)sdbt;
+	}
+}
+
+static void add_samples_to_oprofile(unsigned int cpu, unsigned long *sdbt,
+		unsigned long *dear)
+{
+	struct hws_data_entry *sample_data_ptr;
+	unsigned long *trailer;
+
+	trailer = trailer_entry_ptr(*sdbt);
+	if (dear) {
+		if (dear > trailer)
+			return;
+		trailer = dear;
+	}
+
+	sample_data_ptr = (struct hws_data_entry *)(*sdbt);
+
+	while ((unsigned long *)sample_data_ptr < trailer) {
+		struct pt_regs *regs = NULL;
+		struct task_struct *tsk = NULL;
+
+		/*
+		 * Check sampling mode, 1 indicates basic (=customer) sampling
+		 * mode.
+		 */
+		if (sample_data_ptr->def != 1) {
+			/* sample slot is not yet written */
+			break;
+		} else {
+			/* make sure we don't use it twice,
+			 * the next time the sampler will set it again */
+			sample_data_ptr->def = 0;
+		}
+
+		/* Get pt_regs. */
+		if (sample_data_ptr->P == 1) {
+			/* userspace sample */
+			unsigned int pid = sample_data_ptr->prim_asn;
+			rcu_read_lock();
+			tsk = pid_task(find_vpid(pid), PIDTYPE_PID);
+			if (tsk)
+				regs = task_pt_regs(tsk);
+			rcu_read_unlock();
+		} else {
+			/* kernelspace sample */
+			regs = task_pt_regs(current);
+		}
+
+		mutex_lock(&hws_sem);
+		oprofile_add_ext_hw_sample(sample_data_ptr->ia, regs, 0,
+				!sample_data_ptr->P, tsk);
+		mutex_unlock(&hws_sem);
+
+		sample_data_ptr++;
+	}
+}
+
+static void worker(struct work_struct *work)
+{
+	unsigned int cpu;
+	int ext_params;
+	struct hws_cpu_buffer *cb;
+
+	cb = container_of(work, struct hws_cpu_buffer, worker);
+	cpu = smp_processor_id();
+	ext_params = atomic_xchg(&cb->ext_params, 0);
+
+	if (!cb->worker_entry)
+		worker_on_start(cpu);
+
+	if (worker_check_error(cpu, ext_params))
+		return;
+
+	if (!cb->finish)
+		worker_on_interrupt(cpu);
+
+	if (cb->finish)
+		worker_on_finish(cpu);
+}
+
+/**
+ * hwsampler_allocate() - allocate memory for the hardware sampler
+ * @sdbt:  number of SDBTs per online CPU (must be > 0)
+ * @sdb:   number of SDBs per SDBT (minimum 1, maximum 511)
+ *
+ * Returns 0 on success, !0 on failure.
+ */
+int hwsampler_allocate(unsigned long sdbt, unsigned long sdb)
+{
+	int cpu, rc;
+	mutex_lock(&hws_sem);
+
+	rc = -EINVAL;
+	if (hws_state != HWS_DEALLOCATED)
+		goto allocate_exit;
+
+	if (sdbt < 1)
+		goto allocate_exit;
+
+	if (sdb > MAX_NUM_SDB || sdb < MIN_NUM_SDB)
+		goto allocate_exit;
+
+	num_sdbt = sdbt;
+	num_sdb = sdb;
+
+	oom_killer_was_active = 0;
+	register_oom_notifier(&hws_oom_notifier);
+
+	for_each_online_cpu(cpu) {
+		if (allocate_sdbt(cpu)) {
+			unregister_oom_notifier(&hws_oom_notifier);
+			goto allocate_error;
+		}
+	}
+	unregister_oom_notifier(&hws_oom_notifier);
+	if (oom_killer_was_active)
+		goto allocate_error;
+
+	hws_state = HWS_STOPPED;
+	rc = 0;
+
+allocate_exit:
+	mutex_unlock(&hws_sem);
+	return rc;
+
+allocate_error:
+	rc = -ENOMEM;
+	printk(KERN_ERR "hwsampler: CPUMF Memory allocation failed.\n");
+	goto allocate_exit;
+}
+
+/**
+ * hwsampler_deallocate() - deallocate hardware sampler memory
+ *
+ * Returns 0 on success, !0 on failure.
+ */
+int hwsampler_deallocate(void)
+{
+	int rc;
+
+	mutex_lock(&hws_sem);
+
+	rc = -EINVAL;
+	if (hws_state != HWS_STOPPED)
+		goto deallocate_exit;
+
+	smp_ctl_clear_bit(0, 5); /* set bit 58 CR0 off */
+	deallocate_sdbt();
+
+	hws_state = HWS_DEALLOCATED;
+	rc = 0;
+
+deallocate_exit:
+	mutex_unlock(&hws_sem);
+
+	return rc;
+}
+
+long hwsampler_query_min_interval(void)
+{
+	if (min_sampler_rate)
+		return min_sampler_rate;
+	else
+		return -EINVAL;
+}
+
+long hwsampler_query_max_interval(void)
+{
+	if (max_sampler_rate)
+		return max_sampler_rate;
+	else
+		return -EINVAL;
+}
+
+unsigned long hwsampler_get_sample_overflow_count(unsigned int cpu)
+{
+	struct hws_cpu_buffer *cb;
+
+	cb = &per_cpu(sampler_cpu_buffer, cpu);
+
+	return cb->sample_overflow;
+}
+
+int hwsampler_setup(void)
+{
+	int rc;
+	int cpu;
+	struct hws_cpu_buffer *cb;
+
+	mutex_lock(&hws_sem);
+
+	rc = -EINVAL;
+	if (hws_state)
+		goto setup_exit;
+
+	hws_state = HWS_INIT;
+
+	init_all_cpu_buffers();
+
+	rc = check_hardware_prerequisites();
+	if (rc)
+		goto setup_exit;
+
+	rc = check_qsi_on_setup();
+	if (rc)
+		goto setup_exit;
+
+	rc = -EINVAL;
+	hws_wq = create_workqueue("hwsampler");
+	if (!hws_wq)
+		goto setup_exit;
+
+	register_cpu_notifier(&hws_cpu_notifier);
+
+	for_each_online_cpu(cpu) {
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+		INIT_WORK(&cb->worker, worker);
+		rc = smp_ctl_qsi(cpu);
+		WARN_ON(rc);
+		if (min_sampler_rate != cb->qsi.min_sampl_rate) {
+			if (min_sampler_rate) {
+				printk(KERN_WARNING
+					"hwsampler: different min sampler rate values.\n");
+				if (min_sampler_rate < cb->qsi.min_sampl_rate)
+					min_sampler_rate =
+						cb->qsi.min_sampl_rate;
+			} else
+				min_sampler_rate = cb->qsi.min_sampl_rate;
+		}
+		if (max_sampler_rate != cb->qsi.max_sampl_rate) {
+			if (max_sampler_rate) {
+				printk(KERN_WARNING
+					"hwsampler: different max sampler rate values.\n");
+				if (max_sampler_rate > cb->qsi.max_sampl_rate)
+					max_sampler_rate =
+						cb->qsi.max_sampl_rate;
+			} else
+				max_sampler_rate = cb->qsi.max_sampl_rate;
+		}
+	}
+	register_external_interrupt(0x1407, hws_ext_handler);
+
+	hws_state = HWS_DEALLOCATED;
+	rc = 0;
+
+setup_exit:
+	mutex_unlock(&hws_sem);
+	return rc;
+}
+
+int hwsampler_shutdown(void)
+{
+	int rc;
+
+	mutex_lock(&hws_sem);
+
+	rc = -EINVAL;
+	if (hws_state == HWS_DEALLOCATED || hws_state == HWS_STOPPED) {
+		mutex_unlock(&hws_sem);
+
+		if (hws_wq)
+			flush_workqueue(hws_wq);
+
+		mutex_lock(&hws_sem);
+
+		if (hws_state == HWS_STOPPED) {
+			smp_ctl_clear_bit(0, 5); /* set bit 58 CR0 off */
+			deallocate_sdbt();
+		}
+		if (hws_wq) {
+			destroy_workqueue(hws_wq);
+			hws_wq = NULL;
+		}
+
+		unregister_external_interrupt(0x1407, hws_ext_handler);
+		hws_state = HWS_INIT;
+		rc = 0;
+	}
+	mutex_unlock(&hws_sem);
+
+	unregister_cpu_notifier(&hws_cpu_notifier);
+
+	return rc;
+}
+
+/**
+ * hwsampler_start_all() - start hardware sampling on all online CPUs
+ * @rate:  specifies the used interval when samples are taken
+ *
+ * Returns 0 on success, !0 on failure.
+ */
+int hwsampler_start_all(unsigned long rate)
+{
+	int rc, cpu;
+
+	mutex_lock(&hws_sem);
+
+	hws_oom = 0;
+
+	rc = -EINVAL;
+	if (hws_state != HWS_STOPPED)
+		goto start_all_exit;
+
+	interval = rate;
+
+	/* fail if rate is not valid */
+	if (interval < min_sampler_rate || interval > max_sampler_rate)
+		goto start_all_exit;
+
+	rc = check_qsi_on_start();
+	if (rc)
+		goto start_all_exit;
+
+	rc = prepare_cpu_buffers();
+	if (rc)
+		goto start_all_exit;
+
+	for_each_online_cpu(cpu) {
+		rc = start_sampling(cpu);
+		if (rc)
+			break;
+	}
+	if (rc) {
+		for_each_online_cpu(cpu) {
+			stop_sampling(cpu);
+		}
+		goto start_all_exit;
+	}
+	hws_state = HWS_STARTED;
+	rc = 0;
+
+start_all_exit:
+	mutex_unlock(&hws_sem);
+
+	if (rc)
+		return rc;
+
+	register_oom_notifier(&hws_oom_notifier);
+	hws_oom = 1;
+	hws_flush_all = 0;
+	/* now let them in, 1407 CPUMF external interrupts */
+	smp_ctl_set_bit(0, 5); /* set CR0 bit 58 */
+
+	return 0;
+}
+
+/**
+ * hwsampler_stop_all() - stop hardware sampling on all online CPUs
+ *
+ * Returns 0 on success, !0 on failure.
+ */
+int hwsampler_stop_all(void)
+{
+	int tmp_rc, rc, cpu;
+	struct hws_cpu_buffer *cb;
+
+	mutex_lock(&hws_sem);
+
+	rc = 0;
+	if (hws_state == HWS_INIT) {
+		mutex_unlock(&hws_sem);
+		return rc;
+	}
+	hws_state = HWS_STOPPING;
+	mutex_unlock(&hws_sem);
+
+	for_each_online_cpu(cpu) {
+		cb = &per_cpu(sampler_cpu_buffer, cpu);
+		cb->stop_mode = 1;
+		tmp_rc = stop_sampling(cpu);
+		if (tmp_rc)
+			rc = tmp_rc;
+	}
+
+	if (hws_wq)
+		flush_workqueue(hws_wq);
+
+	mutex_lock(&hws_sem);
+	if (hws_oom) {
+		unregister_oom_notifier(&hws_oom_notifier);
+		hws_oom = 0;
+	}
+	hws_state = HWS_STOPPED;
+	mutex_unlock(&hws_sem);
+
+	return rc;
+}
Index: linux-2.6/arch/s390/Kconfig
===================================================================
--- linux-2.6.orig/arch/s390/Kconfig
+++ linux-2.6/arch/s390/Kconfig
@@ -127,6 +127,7 @@ config S390
 	select ARCH_INLINE_WRITE_UNLOCK_BH
 	select ARCH_INLINE_WRITE_UNLOCK_IRQ
 	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
+	select HAVE_HWSAMPLER
 
 config SCHED_OMIT_FRAME_POINTER
 	bool
Index: linux-2.6/arch/s390/oprofile/hwsampler.h
===================================================================
--- /dev/null
+++ linux-2.6/arch/s390/oprofile/hwsampler.h
@@ -0,0 +1,113 @@
+/*
+ * CPUMF HW sampler functions and internal structures
+ *
+ *    Copyright IBM Corp. 2010
+ *    Author(s): Heinz Graalfs <graalfs@de.ibm.com>
+ */
+
+#ifndef HWSAMPLER_H_
+#define HWSAMPLER_H_
+
+#include <linux/workqueue.h>
+
+struct hws_qsi_info_block          /* QUERY SAMPLING information block  */
+{ /* Bit(s) */
+	unsigned int b0_13:14;      /* 0-13: zeros                       */
+	unsigned int as:1;          /* 14: sampling authorisation control*/
+	unsigned int b15_21:7;      /* 15-21: zeros                      */
+	unsigned int es:1;          /* 22: sampling enable control       */
+	unsigned int b23_29:7;      /* 23-29: zeros                      */
+	unsigned int cs:1;          /* 30: sampling activation control   */
+	unsigned int:1;             /* 31: reserved                      */
+	unsigned int bsdes:16;      /* 4-5: size of sampling entry       */
+	unsigned int:16;            /* 6-7: reserved                     */
+	unsigned long min_sampl_rate; /* 8-15: minimum sampling interval */
+	unsigned long max_sampl_rate; /* 16-23: maximum sampling interval*/
+	unsigned long tear;         /* 24-31: TEAR contents              */
+	unsigned long dear;         /* 32-39: DEAR contents              */
+	unsigned int rsvrd0;        /* 40-43: reserved                   */
+	unsigned int cpu_speed;     /* 44-47: CPU speed                  */
+	unsigned long long rsvrd1;  /* 48-55: reserved                   */
+	unsigned long long rsvrd2;  /* 56-63: reserved                   */
+};
+
+struct hws_ssctl_request_block     /* SET SAMPLING CONTROLS req block   */
+{ /* bytes 0 - 7  Bit(s) */
+	unsigned int s:1;           /* 0: maximum buffer indicator       */
+	unsigned int h:1;           /* 1: part. level reserved for VM use*/
+	unsigned long b2_53:52;     /* 2-53: zeros                       */
+	unsigned int es:1;          /* 54: sampling enable control       */
+	unsigned int b55_61:7;      /* 55-61: - zeros                    */
+	unsigned int cs:1;          /* 62: sampling activation control   */
+	unsigned int b63:1;         /* 63: zero                          */
+	unsigned long interval;     /* 8-15: sampling interval           */
+	unsigned long tear;         /* 16-23: TEAR contents              */
+	unsigned long dear;         /* 24-31: DEAR contents              */
+	/* 32-63:                                                        */
+	unsigned long rsvrd1;       /* reserved                          */
+	unsigned long rsvrd2;       /* reserved                          */
+	unsigned long rsvrd3;       /* reserved                          */
+	unsigned long rsvrd4;       /* reserved                          */
+};
+
+struct hws_cpu_buffer {
+	unsigned long first_sdbt;       /* @ of 1st SDB-Table for this CP*/
+	unsigned long worker_entry;
+	unsigned long sample_overflow;  /* taken from SDB ...            */
+	struct hws_qsi_info_block qsi;
+	struct hws_ssctl_request_block ssctl;
+	struct work_struct worker;
+	atomic_t ext_params;
+	unsigned long req_alert;
+	unsigned long loss_of_sample_data;
+	unsigned long invalid_entry_address;
+	unsigned long incorrect_sdbt_entry;
+	unsigned long sample_auth_change_alert;
+	unsigned int finish:1;
+	unsigned int oom:1;
+	unsigned int stop_mode:1;
+};
+
+struct hws_data_entry {
+	unsigned int def:16;        /* 0-15  Data Entry Format           */
+	unsigned int R:4;           /* 16-19 reserved                    */
+	unsigned int U:4;           /* 20-23 Number of unique instruct.  */
+	unsigned int z:2;           /* 24-25 zeros                       */
+	unsigned int T:1;           /* 26 PSW DAT mode                   */
+	unsigned int W:1;           /* 27 PSW wait state                 */
+	unsigned int P:1;           /* 28 PSW Problem state              */
+	unsigned int AS:2;          /* 29-30 PSW address-space control   */
+	unsigned int I:1;           /* 31 entry valid or invalid         */
+	unsigned int:16;
+	unsigned int prim_asn:16;   /* primary ASN                       */
+	unsigned long long ia;      /* Instruction Address               */
+	unsigned long long lpp;     /* Logical-Partition Program Param.  */
+	unsigned long long vpp;     /* Virtual-Machine Program Param.    */
+};
+
+struct hws_trailer_entry {
+	unsigned int f:1;           /* 0 - Block Full Indicator          */
+	unsigned int a:1;           /* 1 - Alert request control         */
+	unsigned long:62;           /* 2 - 63: Reserved                  */
+	unsigned long overflow;     /* 64 - sample Overflow count        */
+	unsigned long timestamp;    /* 16 - time-stamp                   */
+	unsigned long timestamp1;   /*                                   */
+	unsigned long reserved1;    /* 32 -Reserved                      */
+	unsigned long reserved2;    /*                                   */
+	unsigned long progusage1;   /* 48 - reserved for programming use */
+	unsigned long progusage2;   /*                                   */
+};
+
+int hwsampler_setup(void);
+int hwsampler_shutdown(void);
+int hwsampler_allocate(unsigned long sdbt, unsigned long sdb);
+int hwsampler_deallocate(void);
+long hwsampler_query_min_interval(void);
+long hwsampler_query_max_interval(void);
+int hwsampler_start_all(unsigned long interval);
+int hwsampler_stop_all(void);
+int hwsampler_deactivate(unsigned int cpu);
+int hwsampler_activate(unsigned int cpu);
+unsigned long hwsampler_get_sample_overflow_count(unsigned int cpu);
+
+#endif /*HWSAMPLER_H_*/
Index: linux-2.6/arch/Kconfig
===================================================================
--- linux-2.6.orig/arch/Kconfig
+++ linux-2.6/arch/Kconfig
@@ -30,6 +30,9 @@ config OPROFILE_EVENT_MULTIPLEX
 config HAVE_OPROFILE
 	bool
 
+config HAVE_HWSAMPLER
+	bool
+
 config KPROBES
 	bool "Kprobes"
 	depends on MODULES


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature
  2011-01-21 10:06 [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
  2011-01-21 10:06 ` [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up) Heinz Graalfs
@ 2011-01-21 10:06 ` Heinz Graalfs
  2011-02-14 19:01   ` Robert Richter
  2011-01-21 10:06 ` [patch v2 3/3] This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample) Heinz Graalfs
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Heinz Graalfs @ 2011-01-21 10:06 UTC (permalink / raw)
  To: robert.richter
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

[-- Attachment #1: oprofile_vfs.patch --]
[-- Type: text/plain, Size: 10038 bytes --]

From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>

OProfile is enhanced to export all files for controlling System z's hardware sampling,
and to invoke hwsampler exported functions to initialize and use System z's hardware sampling.

The patch invokes hwsampler_setup() during oprofile init and, if the setup
succeeds, exports the following hwsampler files under oprofilefs:

A new directory for hardware sampling based files

 /dev/oprofile/hwsampling/

The userland daemon must explicitly write to the following files
to disable (or enable) hardware based sampling

 /dev/oprofile/hwsampling/hwsampler

to modify the actual sampling rate

 /dev/oprofile/hwsampling/hw_interval

to modify the amount of sampling memory (measured in 4K pages)

 /dev/oprofile/hwsampling/hw_sdbt_blocks

The following files are read only and show
the possible minimum sampling rate

 /dev/oprofile/hwsampling/hw_min_interval

the possible maximum sampling rate

 /dev/oprofile/hwsampling/hw_max_interval

The patch splits the oprofile_timer_[init/exit] functions so that they can also be called
from user context (through oprofilefs) without causing a kernel oops.
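
As an illustration of the file interface described above, here is a minimal
userland sketch in C. It is not part of the patch set: the helper names
write_ulong_file() and select_hwsampler() are hypothetical, and the sketch
assumes that the hwsampler file accepts a decimal number written as text,
as "echo 0 > /dev/oprofile/hwsampling/hwsampler" does.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write an unsigned long as decimal text to an existing control file,
 * the way "echo <val> > file" would.
 * Returns 0 on success, a negative errno value on failure.
 */
static int write_ulong_file(const char *path, unsigned long val)
{
	char buf[32];
	int len, fd, err = 0;
	ssize_t n;

	len = snprintf(buf, sizeof(buf), "%lu\n", val);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, buf, len);
	if (n < 0)
		err = -errno;
	else if (n != len)
		err = -EIO;

	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}

/*
 * Hypothetical helper: on == 0 selects timer based sampling,
 * on == 1 selects hardware sampling.
 * oprofilefs_dir is the oprofilefs mount point, e.g. "/dev/oprofile".
 */
static int select_hwsampler(const char *oprofilefs_dir, unsigned long on)
{
	char path[256];
	int len;

	len = snprintf(path, sizeof(path), "%s/hwsampling/hwsampler",
		       oprofilefs_dir);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -ENAMETOOLONG;

	return write_ulong_file(path, on);
}
```

A daemon would call select_hwsampler("/dev/oprofile", 0) before starting a
profiling run to fall back to timer based sampling. According to the code in
this patch, the write fails with -EBUSY while profiling is running and with
-EINVAL if the requested mode is already active.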

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
---
 arch/s390/oprofile/Makefile          |    3 
 arch/s390/oprofile/hwsampler_files.c |  146 +++++++++++++++++++++++++++++++++++
 arch/s390/oprofile/init.c            |    7 +
 drivers/oprofile/oprof.c             |   32 +++++++
 drivers/oprofile/oprof.h             |    2 
 drivers/oprofile/timer_int.c         |   16 +++
 include/linux/oprofile.h             |   21 +++++
 7 files changed, 222 insertions(+), 5 deletions(-)

Index: linux-2.6/arch/s390/oprofile/hwsampler_files.c
===================================================================
--- /dev/null
+++ linux-2.6/arch/s390/oprofile/hwsampler_files.c
@@ -0,0 +1,146 @@
+/**
+ * arch/s390/oprofile/hwsampler_files.c
+ *
+ * Copyright IBM Corp. 2010
+ * Author: Mahesh Salgaonkar (mahesh@linux.vnet.ibm.com)
+ */
+#include <linux/oprofile.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+
+#include "hwsampler.h"
+
+#define DEFAULT_INTERVAL	4096
+
+#define DEFAULT_SDBT_BLOCKS	1
+#define DEFAULT_SDB_BLOCKS	511
+
+static unsigned long oprofile_hw_interval = DEFAULT_INTERVAL;
+static unsigned long oprofile_min_interval;
+static unsigned long oprofile_max_interval;
+
+static unsigned long oprofile_sdbt_blocks = DEFAULT_SDBT_BLOCKS;
+static unsigned long oprofile_sdb_blocks = DEFAULT_SDB_BLOCKS;
+
+static unsigned long oprofile_hwsampler;
+
+static int oprofile_hwsampler_start(void)
+{
+	int retval;
+
+	retval = hwsampler_allocate(oprofile_sdbt_blocks, oprofile_sdb_blocks);
+	if (retval)
+		return retval;
+
+	retval = hwsampler_start_all(oprofile_hw_interval);
+	if (retval)
+		hwsampler_deallocate();
+
+	return retval;
+}
+
+static void oprofile_hwsampler_stop(void)
+{
+	hwsampler_stop_all();
+	hwsampler_deallocate();
+	return;
+}
+
+int oprofile_arch_set_hwsampler(struct oprofile_operations *ops)
+{
+	printk(KERN_INFO "oprofile: using hardware sampling\n");
+	ops->start = oprofile_hwsampler_start;
+	ops->stop = oprofile_hwsampler_stop;
+	ops->cpu_type = "timer";
+
+	return 0;
+}
+
+static ssize_t hwsampler_read(struct file *file, char __user *buf,
+		size_t count, loff_t *offset)
+{
+	return oprofilefs_ulong_to_user(oprofile_hwsampler, buf, count, offset);
+}
+
+static ssize_t hwsampler_write(struct file *file, char const __user *buf,
+		size_t count, loff_t *offset)
+{
+	unsigned long val;
+	int retval;
+
+	if (*offset)
+		return -EINVAL;
+
+	retval = oprofilefs_ulong_from_user(&val, buf, count);
+	if (retval)
+		return retval;
+
+	if (oprofile_hwsampler == val)
+		return -EINVAL;
+
+	retval = oprofile_set_hwsampler(val);
+
+	if (retval)
+		return retval;
+
+	oprofile_hwsampler = val;
+	return count;
+}
+
+static const struct file_operations hwsampler_fops = {
+	.read		= hwsampler_read,
+	.write		= hwsampler_write,
+};
+
+static int oprofile_create_hwsampling_files(struct super_block *sb,
+						struct dentry *root)
+{
+	struct dentry *hw_dir;
+
+	/* reinitialize default values */
+	oprofile_hwsampler = 1;
+
+	hw_dir = oprofilefs_mkdir(sb, root, "hwsampling");
+	if (!hw_dir)
+		return -EINVAL;
+
+	oprofilefs_create_file(sb, hw_dir, "hwsampler", &hwsampler_fops);
+	oprofilefs_create_ulong(sb, hw_dir, "hw_interval",
+				&oprofile_hw_interval);
+	oprofilefs_create_ro_ulong(sb, hw_dir, "hw_min_interval",
+				&oprofile_min_interval);
+	oprofilefs_create_ro_ulong(sb, hw_dir, "hw_max_interval",
+				&oprofile_max_interval);
+	oprofilefs_create_ulong(sb, hw_dir, "hw_sdbt_blocks",
+				&oprofile_sdbt_blocks);
+
+	return 0;
+}
+
+int oprofile_hwsampler_init(struct oprofile_operations* ops)
+{
+	if (hwsampler_setup())
+		return -ENODEV;
+
+	/*
+	 * create hwsampler files only if hwsampler_setup() succeeds.
+	 */
+	ops->create_files = oprofile_create_hwsampling_files;
+	oprofile_min_interval = hwsampler_query_min_interval();
+	if ((long)oprofile_min_interval < 0) {
+		oprofile_min_interval = 0;
+		return -ENODEV;
+	}
+	oprofile_max_interval = hwsampler_query_max_interval();
+	if ((long)oprofile_max_interval < 0) {
+		oprofile_max_interval = 0;
+		return -ENODEV;
+	}
+	oprofile_arch_set_hwsampler(ops);
+	return 0;
+}
+
+void oprofile_hwsampler_exit(void)
+{
+	hwsampler_shutdown();
+}
Index: linux-2.6/drivers/oprofile/oprof.c
===================================================================
--- linux-2.6.orig/drivers/oprofile/oprof.c
+++ linux-2.6/drivers/oprofile/oprof.c
@@ -239,6 +239,38 @@ int oprofile_set_ulong(unsigned long *ad
 	return err;
 }
 
+#ifdef CONFIG_HAVE_HWSAMPLER
+int oprofile_set_hwsampler(unsigned long val)
+{
+	int err = 0;
+
+	mutex_lock(&start_mutex);
+
+	if (oprofile_started) {
+		err = -EBUSY;
+		goto out;
+	}
+
+	switch (val) {
+	case 1:
+		/* Switch to hardware sampling. */
+		__oprofile_timer_exit();
+		err = oprofile_arch_set_hwsampler(&oprofile_ops);
+		break;
+	case 0:
+		printk(KERN_INFO "oprofile: using timer interrupt.\n");
+		err = __oprofile_timer_init(&oprofile_ops);
+		break;
+	default:
+		err = -EINVAL;
+	}
+
+out:
+	mutex_unlock(&start_mutex);
+	return err;
+}
+#endif /* CONFIG_HAVE_HWSAMPLER */
+
 static int __init oprofile_init(void)
 {
 	int err;
Index: linux-2.6/drivers/oprofile/oprof.h
===================================================================
--- linux-2.6.orig/drivers/oprofile/oprof.h
+++ linux-2.6/drivers/oprofile/oprof.h
@@ -35,7 +35,9 @@ struct dentry;
 
 void oprofile_create_files(struct super_block *sb, struct dentry *root);
 int oprofile_timer_init(struct oprofile_operations *ops);
+int __oprofile_timer_init(struct oprofile_operations *ops);
 void oprofile_timer_exit(void);
+void __oprofile_timer_exit(void);
 
 int oprofile_set_ulong(unsigned long *addr, unsigned long val);
 int oprofile_set_timeout(unsigned long time);
Index: linux-2.6/drivers/oprofile/timer_int.c
===================================================================
--- linux-2.6.orig/drivers/oprofile/timer_int.c
+++ linux-2.6/drivers/oprofile/timer_int.c
@@ -97,14 +97,13 @@ static struct notifier_block __refdata o
 	.notifier_call = oprofile_cpu_notify,
 };
 
-int __init oprofile_timer_init(struct oprofile_operations *ops)
+int __oprofile_timer_init(struct oprofile_operations *ops)
 {
 	int rc;
 
 	rc = register_hotcpu_notifier(&oprofile_cpu_notifier);
 	if (rc)
 		return rc;
-	ops->create_files = NULL;
 	ops->setup = NULL;
 	ops->shutdown = NULL;
 	ops->start = oprofile_hrtimer_start;
@@ -113,7 +112,18 @@ int __init oprofile_timer_init(struct op
 	return 0;
 }
 
-void __exit oprofile_timer_exit(void)
+int __init oprofile_timer_init(struct oprofile_operations *ops)
+{
+	return __oprofile_timer_init(ops);
+}
+
+void __oprofile_timer_exit(void)
 {
 	unregister_hotcpu_notifier(&oprofile_cpu_notifier);
 }
+
+void __exit oprofile_timer_exit(void)
+{
+	__oprofile_timer_exit();
+}
+
Index: linux-2.6/include/linux/oprofile.h
===================================================================
--- linux-2.6.orig/include/linux/oprofile.h
+++ linux-2.6/include/linux/oprofile.h
@@ -89,6 +89,27 @@ int oprofile_arch_init(struct oprofile_o
  */
 void oprofile_arch_exit(void);
 
+#ifdef CONFIG_HAVE_HWSAMPLER
+/**
+ * setup hardware sampler for oprofiling.
+ */
+
+int oprofile_set_hwsampler(unsigned long);
+
+/**
+ * hardware sampler module initialization for the s390 arch
+ */
+
+int oprofile_arch_set_hwsampler(struct oprofile_operations *ops);
+
+/**
+ * Add an s390 hardware sample.
+ */
+void oprofile_add_ext_hw_sample(unsigned long pc, struct pt_regs * const regs,
+	unsigned long event, int is_kernel,
+	struct task_struct *task);
+#endif /* CONFIG_HAVE_HWSAMPLER */
+
 /**
  * Add a sample. This may be called from any context.
  */
Index: linux-2.6/arch/s390/oprofile/init.c
===================================================================
--- linux-2.6.orig/arch/s390/oprofile/init.c
+++ linux-2.6/arch/s390/oprofile/init.c
@@ -11,16 +11,21 @@
 #include <linux/oprofile.h>
 #include <linux/init.h>
 #include <linux/errno.h>
+#include <linux/fs.h>
 
+extern int oprofile_hwsampler_init(struct oprofile_operations* ops);
+extern void oprofile_hwsampler_exit(void);
 
 extern void s390_backtrace(struct pt_regs * const regs, unsigned int depth);
 
 int __init oprofile_arch_init(struct oprofile_operations* ops)
 {
 	ops->backtrace = s390_backtrace;
-	return -ENODEV;
+
+	return oprofile_hwsampler_init(ops);
 }
 
 void oprofile_arch_exit(void)
 {
+	oprofile_hwsampler_exit();
 }
Index: linux-2.6/arch/s390/oprofile/Makefile
===================================================================
--- linux-2.6.orig/arch/s390/oprofile/Makefile
+++ linux-2.6/arch/s390/oprofile/Makefile
@@ -6,4 +6,5 @@ DRIVER_OBJS = $(addprefix ../../../drive
 		oprofilefs.o oprofile_stats.o  \
 		timer_int.o )
 
-oprofile-y				:= $(DRIVER_OBJS) init.o backtrace.o
+oprofile-y				:= $(DRIVER_OBJS) init.o backtrace.o \
+					hwsampler_files.o hwsampler.o


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [patch v2 3/3] This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample)
  2011-01-21 10:06 [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
  2011-01-21 10:06 ` [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up) Heinz Graalfs
  2011-01-21 10:06 ` [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature Heinz Graalfs
@ 2011-01-21 10:06 ` Heinz Graalfs
  2011-02-14 18:55   ` Robert Richter
  2011-02-07  8:23 ` [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
  2011-02-14 19:42 ` Robert Richter
  4 siblings, 1 reply; 17+ messages in thread
From: Heinz Graalfs @ 2011-01-21 10:06 UTC (permalink / raw)
  To: robert.richter
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

[-- Attachment #1: oprofile_worker.patch --]
[-- Type: text/plain, Size: 3662 bytes --]

From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>

This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample)
that also takes a task_struct as an argument. The hwsampler code uses it
when copying hardware samples to OProfile buffers.
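
The fallback this relies on can be sketched in plain C, outside the kernel.
The names struct task, current_task and sample_owner() below are stand-ins
chosen for illustration only; in the patch the same selection is done inside
log_sample() with "task ? task : current". The sketch assumes that the
copying code may run in a different context than the task that was sampled,
which is presumably why the task has to be passed explicitly.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct task_struct; only a pid is modelled. */
struct task {
	int pid;
};

/* Stand-in for the kernel's "current" task pointer. */
static struct task *current_task;

/*
 * Pick the task a sample is attributed to: the explicitly passed task
 * if there is one, otherwise the task that is running right now.
 */
static struct task *sample_owner(struct task *task)
{
	return task ? task : current_task;
}
```

Existing callers pass NULL and keep their old behaviour, so only the new
entry point has to know about the extra argument.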

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
---
 drivers/oprofile/cpu_buffer.c |   27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

Index: linux-2.6/drivers/oprofile/cpu_buffer.c
===================================================================
--- linux-2.6.orig/drivers/oprofile/cpu_buffer.c
+++ linux-2.6/drivers/oprofile/cpu_buffer.c
@@ -22,6 +22,7 @@
 #include <linux/sched.h>
 #include <linux/oprofile.h>
 #include <linux/errno.h>
+#include <linux/module.h>
 
 #include "event_buffer.h"
 #include "cpu_buffer.h"
@@ -258,8 +259,10 @@ op_add_sample(struct oprofile_cpu_buffer
  */
 static int
 log_sample(struct oprofile_cpu_buffer *cpu_buf, unsigned long pc,
-	   unsigned long backtrace, int is_kernel, unsigned long event)
+	unsigned long backtrace, int is_kernel, unsigned long event,
+	struct task_struct *task)
 {
+	struct task_struct *tsk = task ? task : current;
 	cpu_buf->sample_received++;
 
 	if (pc == ESCAPE_CODE) {
@@ -267,7 +270,7 @@ log_sample(struct oprofile_cpu_buffer *c
 		return 0;
 	}
 
-	if (op_add_code(cpu_buf, backtrace, is_kernel, current))
+	if (op_add_code(cpu_buf, backtrace, is_kernel, tsk))
 		goto fail;
 
 	if (op_add_sample(cpu_buf, pc, event))
@@ -292,7 +295,8 @@ static inline void oprofile_end_trace(st
 
 static inline void
 __oprofile_add_ext_sample(unsigned long pc, struct pt_regs * const regs,
-			  unsigned long event, int is_kernel)
+			  unsigned long event, int is_kernel,
+			  struct task_struct *task)
 {
 	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
 	unsigned long backtrace = oprofile_backtrace_depth;
@@ -301,7 +305,7 @@ __oprofile_add_ext_sample(unsigned long
 	 * if log_sample() fail we can't backtrace since we lost the
 	 * source of this event
 	 */
-	if (!log_sample(cpu_buf, pc, backtrace, is_kernel, event))
+	if (!log_sample(cpu_buf, pc, backtrace, is_kernel, event, task))
 		/* failed */
 		return;
 
@@ -316,9 +320,18 @@ __oprofile_add_ext_sample(unsigned long
 void oprofile_add_ext_sample(unsigned long pc, struct pt_regs * const regs,
 			     unsigned long event, int is_kernel)
 {
-	__oprofile_add_ext_sample(pc, regs, event, is_kernel);
+	__oprofile_add_ext_sample(pc, regs, event, is_kernel, NULL);
 }
 
+#ifdef CONFIG_HAVE_HWSAMPLER
+void oprofile_add_ext_hw_sample(unsigned long pc, struct pt_regs * const regs,
+				unsigned long event, int is_kernel,
+				struct task_struct *task)
+{
+	__oprofile_add_ext_sample(pc, regs, event, is_kernel, task);
+}
+#endif /* CONFIG_HAVE_HWSAMPLER */
+
 void oprofile_add_sample(struct pt_regs * const regs, unsigned long event)
 {
 	int is_kernel;
@@ -332,7 +345,7 @@ void oprofile_add_sample(struct pt_regs
 		pc = ESCAPE_CODE; /* as this causes an early return. */
 	}
 
-	__oprofile_add_ext_sample(pc, regs, event, is_kernel);
+	__oprofile_add_ext_sample(pc, regs, event, is_kernel, NULL);
 }
 
 /*
@@ -403,7 +416,7 @@ int oprofile_write_commit(struct op_entr
 void oprofile_add_pc(unsigned long pc, int is_kernel, unsigned long event)
 {
 	struct oprofile_cpu_buffer *cpu_buf = &__get_cpu_var(op_cpu_buffer);
-	log_sample(cpu_buf, pc, 0, is_kernel, event);
+	log_sample(cpu_buf, pc, 0, is_kernel, event, NULL);
 }
 
 void oprofile_add_trace(unsigned long pc)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch v2 0/3] OProfile support for System z's hardware sampling
  2011-01-21 10:06 [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
                   ` (2 preceding siblings ...)
  2011-01-21 10:06 ` [patch v2 3/3] This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample) Heinz Graalfs
@ 2011-02-07  8:23 ` Heinz Graalfs
  2011-02-14 19:42 ` Robert Richter
  4 siblings, 0 replies; 17+ messages in thread
From: Heinz Graalfs @ 2011-02-07  8:23 UTC (permalink / raw)
  To: robert.richter
  Cc: linux-s390, heiko.carstens, linux-kernel, borntraeger,
	oprofile-list, schwidefsky

Hello Robert,

when will you have a chance to look at my patches?

Looking forward to a positive reply.

Heinz

On Fri, 2011-01-21 at 11:06 +0100, Heinz Graalfs wrote:
> Hello Robert,
> 
> I'm resending yesterday's mail because I missed to specify the correct sender information.
> 
> This is a re-posting of the patch series originally posted last month:
> 
> http://marc.info/?l=linux-s390&m=129285043619973&w=2
> 
> Heinz
> 
> Changes in
> 
> v2:
>    - kernel module hwsampler removed, everything is now in oprofile kernel module
>    - functions from hwsampler-main.c and smpctl.c merged into arch/s390/oprofile/hwsampler.c
>      - functions made static
>    - arch/s390/include/asm/hwsampler.h moved to arch/s390/oprofile/hwsampler.h
>      - structs have now hws_ prefix
>    - config variables changed, HAVE_HWSAMPLER used only
>    - original patch 4 (handle_munmap.patch) removed
> 
> Description:
> 
> So far, OProfile takes samples by using a software interrupt.
> The purpose of this series of patches is to add support for System z hardware sampling to OProfile.
> 
> Hardware (HW) sampling is a feature provided by System z processors (z10 and follow ons).
> When sampling, the processor takes samples containing the instruction address, PID, and other information.
> The samples are taken at a programmable rate and stored into a buffer provided by the operating system.
> The sampling process is implemented in hardware and millicode and thus does not affect the operating system
> being oberved, apart from requiring buffer memory that the Linux kernel must provide.
> 
> Hardware sampling is available in LPAR mode on 64 BIT processors only.
> 
> The overall approach is to replace the software-based sample generation by hardware sampling.
> All required functionality to control the HW sampling mechanism is added to the oprofile kernel module.
> The functions provide support for
>  - controlling the sampling hardware,
>  - setting up appropriate buffer structures (HW buffers),
>  - retrieving sample entries from these buffers.
> Multiple CPUs can be handled.
> 
> The samples contain the instruction address, a bit distinguishing between kernel and user space,
> and for user space samples also the PID.
> Instead of taking samples from its own per-CPU buffers, OProfile would rather take samples from the
> HW buffers.
> 
> When hardware sampling can be enabled on the current System z processor it will be the new default.
> Switching back to timer based sampling can be established by using
> 
>    echo 0 > /dev/oprofile/hwsampling/hwsampler
> 
> The user space drivers of OProfile also need an extension to control hw sampling by appropriate options.
> 
> 



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch v2 3/3] This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample)
  2011-01-21 10:06 ` [patch v2 3/3] This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample) Heinz Graalfs
@ 2011-02-14 18:55   ` Robert Richter
  0 siblings, 0 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 18:55 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On 21.01.11 05:06:54, Heinz Graalfs wrote:
> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> 
> This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample)
> that can also take task_struct as an argument, which is used by the hwsampler kernel module 
> when copying hardware samples to OProfile buffers.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> ---
>  drivers/oprofile/cpu_buffer.c |   27 ++++++++++++++++++++-------
>  1 file changed, 20 insertions(+), 7 deletions(-)

Applied with following changes:
    * removed #include <linux/module.h>
    * whitespace changes
    * removed conditional compilation (CONFIG_HAVE_HWSAMPLER)
    * modified order of functions
    * fix missing function definition in header file

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up)
  2011-01-21 10:06 ` [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up) Heinz Graalfs
@ 2011-02-14 18:57   ` Robert Richter
  2011-03-25 11:00   ` Robert Richter
  1 sibling, 0 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 18:57 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On 21.01.11 05:06:52, Heinz Graalfs wrote:
> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> 
> System z's hardware sampling is described in detail in:
> 
>    SA23-2260-01 "The Load-Program-Parameter and CPU-Measurement Facilities"
> 
> The patch introduces
>  - support for System z's hardware sampler in OProfile's kernel module
>  - it adds functions that control all hardware sampling related operations as
>   - checking if hardware sampling feature is available
>     - ie: on System z models z10 and up, in LPAR mode only, and authorised during LPAR activation
>   - allocating memory for the hardware sampling feature
>   - starting/stopping hardware sampling
> 
> All functions required to start and stop hardware sampling have to be
> invoked by the oprofile kernel module as provided by the other patches of this patch set.
> 
> In case hardware based sampling cannot be setup standard timer based sampling is used by OProfile.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> ---
>  arch/Kconfig                   |    3
>  arch/s390/Kconfig              |    1
>  arch/s390/oprofile/hwsampler.c | 1256 +++++++++++++++++++++++++++++++++++++++++
>  arch/s390/oprofile/hwsampler.h |  113 +++
>  4 files changed, 1373 insertions(+)

Applied with following changes:
    * enable compilation in Makefile

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature
  2011-01-21 10:06 ` [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature Heinz Graalfs
@ 2011-02-14 19:01   ` Robert Richter
  2011-02-14 19:03     ` Robert Richter
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 19:01 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On 21.01.11 05:06:53, Heinz Graalfs wrote:
> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> 
> OProfile is enhanced to export all files for controlling System z's hardware sampling,
> and to invoke hwsampler exported functions to initialize and use System z's hardware sampling.
> 
> The patch invokes hwsampler_setup() during oprofile init and exports following
> hwsampler files under oprofilefs if hwsampler's setup succeeded:
> 
> A new directory for hardware sampling based files
> 
>  /dev/oprofile/hwsampling/
> 
> The userland daemon must explicitly write to the following files
> to disable (or enable) hardware based sampling
> 
>  /dev/oprofile/hwsampling/hwsampler
> 
> to modify the actual sampling rate
> 
>  /dev/oprofile/hwsampling/hw_interval
> 
> to modify the amount of sampling memory (measured in 4K pages)
> 
>  /dev/oprofile/hwsampling/hw_sdbt_blocks
> 
> The following files are read only and show
> the possible minimum sampling rate
> 
>  /dev/oprofile/hwsampling/hw_min_interval
> 
> the possible maximum sampling rate
> 
>  /dev/oprofile/hwsampling/hw_max_interval
> 
> The patch splits the oprofile_timer_[init/exit] function so that it can be also called
> through user context (oprofilefs) to avoid kernel oops.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> ---
>  arch/s390/oprofile/Makefile          |    3 
>  arch/s390/oprofile/hwsampler_files.c |  146 +++++++++++++++++++++++++++++++++++
>  arch/s390/oprofile/init.c            |    7 +
>  drivers/oprofile/oprof.c             |   32 +++++++
>  drivers/oprofile/oprof.h             |    2 
>  drivers/oprofile/timer_int.c         |   16 +++
>  include/linux/oprofile.h             |   21 +++++
>  7 files changed, 222 insertions(+), 5 deletions(-)

Applied with the following changes:
    * whitespace changes in Makefile and timer_int.c

I reworked some changes in a follow-on patch which I apply on top of
your patch set.
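
The control files listed in the quoted description are oprofilefs
files, so a userland daemon could toggle hardware sampling roughly as
sketched below. This is only a sketch, not code from the patch set:
write_ulong_file(), read_ulong_file(), disable_hw_sampling() and
roundtrip_check() are hypothetical helper names, and it assumes the
files take a decimal number written as text. The path comes from the
patch description and only exists on a kernel carrying this series.

```c
#include <assert.h>
#include <stdio.h>

/* Path from the patch description; present only with this series applied. */
#define HWSAMPLER_FILE "/dev/oprofile/hwsampling/hwsampler"

/* Hypothetical helper: write val as decimal text; 0 on success, -1 on error. */
int write_ulong_file(const char *path, unsigned long val)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%lu\n", val) > 0;
	/* fclose() flushes the buffer, so a rejected write shows up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Hypothetical helper: read one decimal value; 0 on success, -1 on error. */
int read_ulong_file(const char *path, unsigned long *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%lu", val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Ask for hardware sampling to be switched off (writes 0 to the file). */
int disable_hw_sampling(void)
{
	return write_ulong_file(HWSAMPLER_FILE, 0);
}

/* Round-trip check against an ordinary scratch file, for testing only. */
void roundtrip_check(const char *scratch_path)
{
	unsigned long v = 99;

	assert(write_ulong_file(scratch_path, 1) == 0);
	assert(read_ulong_file(scratch_path, &v) == 0);
	assert(v == 1);
}
```

With the follow-on patch in this thread, a write to the hwsampler file
while profiling is active is expected to fail with EBUSY, which the
helper above would report as -1.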

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* Re: [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature
  2011-02-14 19:01   ` Robert Richter
@ 2011-02-14 19:03     ` Robert Richter
  2011-02-14 19:05     ` [PATCH] oprofile, s390: Rework hwsampler implementation Robert Richter
  2011-02-14 19:07     ` [PATCH] oprofile, s390: Fix section mismatch of function hws_cpu_callback() Robert Richter
  2 siblings, 0 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 19:03 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On 14.02.11 20:01:57, Robert Richter wrote:
> I reworked some changes in a follow-on patch which I apply on top of
> your patch set.

See below for the patch.

-Robert



From c43ad95d99aec85ba497bec3a8a8131a93098281 Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Fri, 11 Feb 2011 17:31:44 +0100
Subject: [PATCH] oprofile, s390: Rework hwsampler implementation

This patch is a rework of the hwsampler oprofile implementation that
has been applied recently. There are now fewer non-architectural
changes. The only ones remaining are:

* introduction of oprofile_add_ext_hw_sample(), and
* removal of section attributes of oprofile_timer_init/_exit().

To set up hwsampler for oprofile we need to modify the start()/stop()
callbacks and add hwsampler control files to oprofilefs. We no longer
reinitialize the timer or hwsampler mode by calling init/exit() again;
instead, hwsampler_running is used to switch the mode directly in
oprofile_hwsampler_start/_stop(). For locking reasons there is also
hwsampler_file, which reflects the value in oprofilefs.
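
The start()/stop() delegation described above can be modelled in
userspace roughly as follows. This is a simplified sketch, not the
kernel code: struct ops is a stand-in for struct oprofile_operations,
the timer_started and hw_started flags replace the real timer and
hardware sampler, and hws_init() and hws_demo() are hypothetical names.
It shows the requested mode being latched at start() so that stop()
always undoes whatever start() actually did.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct oprofile_operations. */
struct ops {
	int (*start)(void);
	void (*stop)(void);
};

static int timer_started, hw_started;	/* observable state for the demo */

static int timer_start(void) { timer_started = 1; return 0; }
static void timer_stop(void) { timer_started = 0; }

static struct ops timer_ops;	/* saved copy of the timer callbacks */
static int hwsampler_file = 1;	/* value userland last wrote */
static int hwsampler_running;	/* mode latched when start() runs */

static int hws_start(void)
{
	hwsampler_running = hwsampler_file;
	if (!hwsampler_running)
		return timer_ops.start();	/* fall back to the timer */
	hw_started = 1;
	return 0;
}

static void hws_stop(void)
{
	if (!hwsampler_running) {
		timer_ops.stop();
		return;
	}
	hw_started = 0;
}

/* Install the timer callbacks, keep a copy, then override start/stop. */
void hws_init(struct ops *ops)
{
	ops->start = timer_start;
	ops->stop = timer_stop;
	memcpy(&timer_ops, ops, sizeof(timer_ops));
	ops->start = hws_start;
	ops->stop = hws_stop;
}

void hws_demo(void)
{
	struct ops ops;

	hws_init(&ops);

	/* Default: hardware sampling is used. */
	assert(ops.start() == 0 && hw_started && !timer_started);
	ops.stop();
	assert(!hw_started);

	/* Mode 0 requested: the saved timer callbacks are used instead. */
	hwsampler_file = 0;
	assert(ops.start() == 0 && timer_started && !hw_started);

	/* A change made while running does not confuse stop(). */
	hwsampler_file = 1;
	ops.stop();
	assert(!timer_started && !hw_started);
}
```

Keeping the value visible in oprofilefs separate from the latched mode
is what lets the write handler skip taking start_mutex in this design.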

The overall diffstat of the oprofile s390 hwsampler implementation
shows the low impact on non-architectural code:

 arch/Kconfig                         |    3 +
 arch/s390/Kconfig                    |    1 +
 arch/s390/oprofile/Makefile          |    2 +-
 arch/s390/oprofile/hwsampler.c       | 1256 ++++++++++++++++++++++++++++++++++
 arch/s390/oprofile/hwsampler.h       |  113 +++
 arch/s390/oprofile/hwsampler_files.c |  162 +++++
 arch/s390/oprofile/init.c            |    6 +-
 drivers/oprofile/cpu_buffer.c        |   24 +-
 drivers/oprofile/timer_int.c         |    4 +-
 include/linux/oprofile.h             |    7 +
 10 files changed, 1567 insertions(+), 11 deletions(-)

Signed-off-by: Robert Richter <robert.richter@amd.com>
---
 arch/s390/oprofile/hwsampler_files.c |   60 +++++++++++++++++++++------------
 arch/s390/oprofile/init.c            |    1 -
 drivers/oprofile/oprof.c             |   32 ------------------
 drivers/oprofile/oprof.h             |    2 -
 drivers/oprofile/timer_int.c         |   15 ++-------
 include/linux/oprofile.h             |   21 ------------
 6 files changed, 41 insertions(+), 90 deletions(-)

diff --git a/arch/s390/oprofile/hwsampler_files.c b/arch/s390/oprofile/hwsampler_files.c
index 493f7cc..2e1da24 100644
--- a/arch/s390/oprofile/hwsampler_files.c
+++ b/arch/s390/oprofile/hwsampler_files.c
@@ -8,6 +8,7 @@
 #include <linux/errno.h>
 #include <linux/fs.h>
 
+#include "../../../drivers/oprofile/oprof.h"
 #include "hwsampler.h"
 
 #define DEFAULT_INTERVAL	4096
@@ -22,12 +23,20 @@ static unsigned long oprofile_max_interval;
 static unsigned long oprofile_sdbt_blocks = DEFAULT_SDBT_BLOCKS;
 static unsigned long oprofile_sdb_blocks = DEFAULT_SDB_BLOCKS;
 
-static unsigned long oprofile_hwsampler;
+static int hwsampler_file;
+static int hwsampler_running;	/* start_mutex must be held to change */
+
+static struct oprofile_operations timer_ops;
 
 static int oprofile_hwsampler_start(void)
 {
 	int retval;
 
+	hwsampler_running = hwsampler_file;
+
+	if (!hwsampler_running)
+		return timer_ops.start();
+
 	retval = hwsampler_allocate(oprofile_sdbt_blocks, oprofile_sdb_blocks);
 	if (retval)
 		return retval;
@@ -41,25 +50,20 @@ static int oprofile_hwsampler_start(void)
 
 static void oprofile_hwsampler_stop(void)
 {
+	if (!hwsampler_running) {
+		timer_ops.stop();
+		return;
+	}
+
 	hwsampler_stop_all();
 	hwsampler_deallocate();
 	return;
 }
 
-int oprofile_arch_set_hwsampler(struct oprofile_operations *ops)
-{
-	printk(KERN_INFO "oprofile: using hardware sampling\n");
-	ops->start = oprofile_hwsampler_start;
-	ops->stop = oprofile_hwsampler_stop;
-	ops->cpu_type = "timer";
-
-	return 0;
-}
-
 static ssize_t hwsampler_read(struct file *file, char __user *buf,
 		size_t count, loff_t *offset)
 {
-	return oprofilefs_ulong_to_user(oprofile_hwsampler, buf, count, offset);
+	return oprofilefs_ulong_to_user(hwsampler_file, buf, count, offset);
 }
 
 static ssize_t hwsampler_write(struct file *file, char const __user *buf,
@@ -75,15 +79,16 @@ static ssize_t hwsampler_write(struct file *file, char const __user *buf,
 	if (retval)
 		return retval;
 
-	if (oprofile_hwsampler == val)
-		return -EINVAL;
-
-	retval = oprofile_set_hwsampler(val);
+	if (oprofile_started)
+		/*
+		 * save to do without locking as we set
+		 * hwsampler_running in start() when start_mutex is
+		 * held
+		 */
+		return -EBUSY;
 
-	if (retval)
-		return retval;
+	hwsampler_file = val;
 
-	oprofile_hwsampler = val;
 	return count;
 }
 
@@ -98,7 +103,7 @@ static int oprofile_create_hwsampling_files(struct super_block *sb,
 	struct dentry *hw_dir;
 
 	/* reinitialize default values */
-	oprofile_hwsampler = 1;
+	hwsampler_file = 1;
 
 	hw_dir = oprofilefs_mkdir(sb, root, "hwsampling");
 	if (!hw_dir)
@@ -125,7 +130,6 @@ int oprofile_hwsampler_init(struct oprofile_operations* ops)
 	/*
 	 * create hwsampler files only if hwsampler_setup() succeeds.
 	 */
-	ops->create_files = oprofile_create_hwsampling_files;
 	oprofile_min_interval = hwsampler_query_min_interval();
 	if (oprofile_min_interval < 0) {
 		oprofile_min_interval = 0;
@@ -136,11 +140,23 @@ int oprofile_hwsampler_init(struct oprofile_operations* ops)
 		oprofile_max_interval = 0;
 		return -ENODEV;
 	}
-	oprofile_arch_set_hwsampler(ops);
+
+	if (oprofile_timer_init(ops))
+		return -ENODEV;
+
+	printk(KERN_INFO "oprofile: using hardware sampling\n");
+
+	memcpy(&timer_ops, ops, sizeof(timer_ops));
+
+	ops->start = oprofile_hwsampler_start;
+	ops->stop = oprofile_hwsampler_stop;
+	ops->create_files = oprofile_create_hwsampling_files;
+
 	return 0;
 }
 
 void oprofile_hwsampler_exit(void)
 {
+	oprofile_timer_exit();
 	hwsampler_shutdown();
 }
diff --git a/arch/s390/oprofile/init.c b/arch/s390/oprofile/init.c
index f6b3f72..059b44b 100644
--- a/arch/s390/oprofile/init.c
+++ b/arch/s390/oprofile/init.c
@@ -11,7 +11,6 @@
 #include <linux/oprofile.h>
 #include <linux/init.h>
 #include <linux/errno.h>
-#include <linux/fs.h>
 
 extern int oprofile_hwsampler_init(struct oprofile_operations* ops);
 extern void oprofile_hwsampler_exit(void);
diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c
index 43b01da..f9bda64 100644
--- a/drivers/oprofile/oprof.c
+++ b/drivers/oprofile/oprof.c
@@ -239,38 +239,6 @@ int oprofile_set_ulong(unsigned long *addr, unsigned long val)
 	return err;
 }
 
-#ifdef CONFIG_HAVE_HWSAMPLER
-int oprofile_set_hwsampler(unsigned long val)
-{
-	int err = 0;
-
-	mutex_lock(&start_mutex);
-
-	if (oprofile_started) {
-		err = -EBUSY;
-		goto out;
-	}
-
-	switch (val) {
-	case 1:
-		/* Switch to hardware sampling. */
-		__oprofile_timer_exit();
-		err = oprofile_arch_set_hwsampler(&oprofile_ops);
-		break;
-	case 0:
-		printk(KERN_INFO "oprofile: using timer interrupt.\n");
-		err = __oprofile_timer_init(&oprofile_ops);
-		break;
-	default:
-		err = -EINVAL;
-	}
-
-out:
-	mutex_unlock(&start_mutex);
-	return err;
-}
-#endif /* CONFIG_HAVE_HWSAMPLER */
-
 static int __init oprofile_init(void)
 {
 	int err;
diff --git a/drivers/oprofile/oprof.h b/drivers/oprofile/oprof.h
index 5a6ceb1..177b73d 100644
--- a/drivers/oprofile/oprof.h
+++ b/drivers/oprofile/oprof.h
@@ -35,9 +35,7 @@ struct dentry;
 
 void oprofile_create_files(struct super_block *sb, struct dentry *root);
 int oprofile_timer_init(struct oprofile_operations *ops);
-int __oprofile_timer_init(struct oprofile_operations *ops);
 void oprofile_timer_exit(void);
-void __oprofile_timer_exit(void);
 
 int oprofile_set_ulong(unsigned long *addr, unsigned long val);
 int oprofile_set_timeout(unsigned long time);
diff --git a/drivers/oprofile/timer_int.c b/drivers/oprofile/timer_int.c
index 0099a45..3ef4462 100644
--- a/drivers/oprofile/timer_int.c
+++ b/drivers/oprofile/timer_int.c
@@ -97,13 +97,14 @@ static struct notifier_block __refdata oprofile_cpu_notifier = {
 	.notifier_call = oprofile_cpu_notify,
 };
 
-int  __oprofile_timer_init(struct oprofile_operations *ops)
+int oprofile_timer_init(struct oprofile_operations *ops)
 {
 	int rc;
 
 	rc = register_hotcpu_notifier(&oprofile_cpu_notifier);
 	if (rc)
 		return rc;
+	ops->create_files = NULL;
 	ops->setup = NULL;
 	ops->shutdown = NULL;
 	ops->start = oprofile_hrtimer_start;
@@ -112,17 +113,7 @@ int  __oprofile_timer_init(struct oprofile_operations *ops)
 	return 0;
 }
 
-int __init oprofile_timer_init(struct oprofile_operations *ops)
-{
-	return __oprofile_timer_init(ops);
-}
-
-void __oprofile_timer_exit(void)
+void oprofile_timer_exit(void)
 {
 	unregister_hotcpu_notifier(&oprofile_cpu_notifier);
 }
-
-void __exit oprofile_timer_exit(void)
-{
-	__oprofile_timer_exit();
-}
diff --git a/include/linux/oprofile.h b/include/linux/oprofile.h
index b517d86..7f5cfd3 100644
--- a/include/linux/oprofile.h
+++ b/include/linux/oprofile.h
@@ -91,27 +91,6 @@ int oprofile_arch_init(struct oprofile_operations * ops);
  */
 void oprofile_arch_exit(void);
 
-#ifdef CONFIG_HAVE_HWSAMPLER
-/**
- * setup hardware sampler for oprofiling.
- */
-
-int oprofile_set_hwsampler(unsigned long);
-
-/**
- * hardware sampler module initialization for the s390 arch
- */
-
-int oprofile_arch_set_hwsampler(struct oprofile_operations *ops);
-
-/**
- * Add an s390 hardware sample.
- */
-void oprofile_add_ext_hw_sample(unsigned long pc, struct pt_regs * const regs,
-	unsigned long event, int is_kernel,
-	struct task_struct *task);
-#endif /* CONFIG_HAVE_HWSAMPLER */
-
 /**
  * Add a sample. This may be called from any context.
  */
-- 
1.7.3.4



-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* [PATCH] oprofile, s390: Rework hwsampler implementation
  2011-02-14 19:01   ` Robert Richter
  2011-02-14 19:03     ` Robert Richter
@ 2011-02-14 19:05     ` Robert Richter
  2011-02-14 19:07     ` [PATCH] oprofile, s390: Fix section mismatch of function hws_cpu_callback() Robert Richter
  2 siblings, 0 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 19:05 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

(resent with subject changed)



-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* [PATCH] oprofile, s390: Fix section mismatch of function hws_cpu_callback()
  2011-02-14 19:01   ` Robert Richter
  2011-02-14 19:03     ` Robert Richter
  2011-02-14 19:05     ` [PATCH] oprofile, s390: Rework hwsampler implementation Robert Richter
@ 2011-02-14 19:07     ` Robert Richter
  2 siblings, 0 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 19:07 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On 14.02.11 20:01:57, Robert Richter wrote:
> I reworked some changes in a follow-on patch which I apply on top of
> your patch set.

And another fix below.

-Robert



From 11a2be68e575c02ec74a918e05c590627ff16e9c Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Mon, 14 Feb 2011 19:08:33 +0100
Subject: [PATCH] oprofile, s390: Fix section mismatch of function hws_cpu_callback()

Fixes the following section mismatch:

 Section mismatch in reference from the variable hws_cpu_notifier to the function .cpuinit.text:hws_cpu_callback()

Signed-off-by: Robert Richter <robert.richter@amd.com>
---
 arch/s390/oprofile/hwsampler.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/s390/oprofile/hwsampler.c b/arch/s390/oprofile/hwsampler.c
index ab3f770e..3d48f4d 100644
--- a/arch/s390/oprofile/hwsampler.c
+++ b/arch/s390/oprofile/hwsampler.c
@@ -578,7 +578,7 @@ static struct notifier_block hws_oom_notifier = {
 	.notifier_call = hws_oom_callback
 };
 
-static int __cpuinit hws_cpu_callback(struct notifier_block *nfb,
+static int hws_cpu_callback(struct notifier_block *nfb,
 	unsigned long action, void *hcpu)
 {
 	/* We do not have sampler space available for all possible CPUs.
-- 
1.7.3.4



-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* Re: [patch v2 0/3] OProfile support for System z's hardware sampling
  2011-01-21 10:06 [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
                   ` (3 preceding siblings ...)
  2011-02-07  8:23 ` [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
@ 2011-02-14 19:42 ` Robert Richter
  2011-02-15  7:17   ` Heiko Carstens
  2011-02-15 16:59   ` Heinz Graalfs
  4 siblings, 2 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-14 19:42 UTC (permalink / raw)
  To: Heinz Graalfs, Martin Schwidefsky, Heiko Carstens
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger

Heinz,

On 21.01.11 05:06:51, Heinz Graalfs wrote:
> I'm resending yesterday's mail because I missed to specify the correct sender information.
> 
> This is a re-posting of the patch series originally posted last month:
> 
> http://marc.info/?l=linux-s390&m=129285043619973&w=2
> 
> Heinz
> 
> Changes in
> 
> v2:
>    - kernel module hwsampler removed, everything is now in oprofile kernel module
>    - functions from hwsampler-main.c and smpctl.c merged into arch/s390/oprofile/hwsampler.c
>      - functions made static
>    - arch/s390/include/asm/hwsampler.h moved to arch/s390/oprofile/hwsampler.h
>      - structs have now hws_ prefix
>    - config variables changed, HAVE_HWSAMPLER used only
>    - original patch 4 (handle_munmap.patch) removed

thanks for your patches. I have applied them with modifications to:

 git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git s390

So that each patch compiles on its own, I changed the patch order and
the Makefile where necessary. The branch also includes my follow-on
patches.

I will be travelling for the next few weeks and wanted to have this
series in linux-next now for later inclusion in .39. We would run out
of time otherwise (my bad). Please review and test the changes I made
and send me any updates.

Although I applied the patches, I would like the following changes:

* Merge init.c and hwsampler_files.c; two files are overkill here, and
  hwsampler_files.c is a poor and overly long name.

* Rework functions in cpu_buffer.c (log_sample,
  __oprofile_add_ext_sample, oprofile_add_ext_hw_sample, etc.). All
  the (static) functions can be merged into a single function by
  implementing a struct that holds all current function arguments.
  Something like:

static void __oprofile_add_ext_sample(struct foobar *fb);
void oprofile_add_sample(struct pt_regs * const regs, unsigned long event)
{
	struct foobar fb = { .regs = regs, .event = event };
	__oprofile_add_ext_sample(&fb);
}

  The naming and description of oprofile_add_ext_hw_sample() are also
  not the best. As the interface in include/linux/oprofile.h, we could
  then merge oprofile_add_ext_sample() and oprofile_add_ext_hw_sample()
  into a single function.

It would be nice if you could implement this.

For all patches in the oprofile/s390 branch I also need the
s390 maintainer's ack.

Martin or Heiko, please ack.

Thanks,

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* Re: [patch v2 0/3] OProfile support for System z's hardware sampling
  2011-02-14 19:42 ` Robert Richter
@ 2011-02-15  7:17   ` Heiko Carstens
  2011-02-15 16:59   ` Heinz Graalfs
  1 sibling, 0 replies; 17+ messages in thread
From: Heiko Carstens @ 2011-02-15  7:17 UTC (permalink / raw)
  To: Robert Richter
  Cc: Heinz Graalfs, Martin Schwidefsky, mingo, oprofile-list,
	linux-kernel, linux-s390, borntraeger

On Mon, Feb 14, 2011 at 08:42:06PM +0100, Robert Richter wrote:
> For all patches in the oprofile/s390 branch I also need the
> s390 maintainer's ack.
> 
> Martin or Heiko, please ack.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>


* Re: [patch v2 0/3] OProfile support for System z's hardware sampling
  2011-02-14 19:42 ` Robert Richter
  2011-02-15  7:17   ` Heiko Carstens
@ 2011-02-15 16:59   ` Heinz Graalfs
  2011-02-15 17:29     ` Robert Richter
  1 sibling, 1 reply; 17+ messages in thread
From: Heinz Graalfs @ 2011-02-15 16:59 UTC (permalink / raw)
  To: Robert Richter
  Cc: Martin Schwidefsky, Heiko Carstens, mingo, oprofile-list,
	linux-kernel, linux-s390, borntraeger

Hello Robert,

thanks for the positive reply.

Please see my comments below. I suppose I still need some more detailed
directions from you.
Unfortunately I will not be in the office Thursday and Friday.

Heinz

On Mon, 2011-02-14 at 20:42 +0100, Robert Richter wrote:
> Heinz,
> 
> On 21.01.11 05:06:51, Heinz Graalfs wrote:
> > I'm resending yesterday's mail because I missed to specify the correct sender information.
> > 
> > This is a re-posting of the patch series originally posted last month:
> > 
> > http://marc.info/?l=linux-s390&m=129285043619973&w=2
> > 
> > Heinz
> > 
> > Changes in
> > 
> > v2:
> >    - kernel module hwsampler removed, everything is now in oprofile kernel module
> >    - functions from hwsampler-main.c and smpctl.c merged into arch/s390/oprofile/hwsampler.c
> >      - functions made static
> >    - arch/s390/include/asm/hwsampler.h moved to arch/s390/oprofile/hwsampler.h
> >      - structs have now hws_ prefix
> >    - config variables changed, HAVE_HWSAMPLER used only
> >    - original patch 4 (handle_munmap.patch) removed
> 
> thanks for your patches. I have applied them with modifications to:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile.git s390
> 
> For atomic compilation of single patches I changed patch order and the
> Makefile where necessary. The branch also includes my follow-on
> patches.
> 
> I will be travelling the next weeks and wanted to have this series in
> linux-next now for later inclusion to .39. We would run out of time
> otherwise (my bad). Please review and test the changes I made and send
> me possibly updates.
> 
> Though I applied the patches I want to have the following changes:
> 
> * Merge init.c and hwsampler_file.c, two files are bloated here and
>   hwsampler_file.c is a bad and too long naming.
> 

OK, I've merged all hwsampler_files.c contents into init.c

> * Rework functions in cpu_buffer.c (log_sample,
>   __oprofile_add_ext_sample, oprofile_add_ext_hw_sample, etc.). All
>   the (static) functions can be merged into a single function by
>   implementing a struct that holds all current function arguments.
>   Something like:
> 
> static void __oprofile_add_ext_sample(struct foobar *fb);
> void oprofile_add_sample(struct pt_regs * const regs, unsigned long event)
> {
> 	struct foobar fb = { .regs = regs, .event = event };
> 	__oprofile_add_ext_sample(&fb);
> }

OK, I've done this

> 
>   The naming and description of oprofile_add_ext_hw_sample() is also
>   not the best. As interface in include/linux/oprofile.h we could then
>   merge oprofile_add_ext_sample() and oprofile_add_ext_hw_sample() to
>   a single function.
> 
> It would be nice if you could implement this.

Sure, I will do this; however, I'm not sure what exactly you mean.
Could you specify the interface in oprofile.h that you have in mind?

> 
> For all patches in the oprofile/s390 branch I also need the
> s390 maintainer's ack.
> 
> Martin or Heiko, please ack.
> 
> Thanks,
> 
> -Robert
> 




* Re: [patch v2 0/3] OProfile support for System z's hardware sampling
  2011-02-15 16:59   ` Heinz Graalfs
@ 2011-02-15 17:29     ` Robert Richter
  0 siblings, 0 replies; 17+ messages in thread
From: Robert Richter @ 2011-02-15 17:29 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: Martin Schwidefsky, Heiko Carstens, mingo, oprofile-list,
	linux-kernel, linux-s390, borntraeger

On 15.02.11 11:59:29, Heinz Graalfs wrote:

> > * Merge init.c and hwsampler_file.c, two files are bloated here and
> >   hwsampler_file.c is a bad and too long naming.
> > 
> 
> OK, I've merged all hwsampler_files.c contents into init.c
> 
> > * Rework functions in cpu_buffer.c (log_sample,
> >   __oprofile_add_ext_sample, oprofile_add_ext_hw_sample, etc.). All
> >   the (static) functions can be merged into a single function by
> >   implementing a struct that holds all current function arguments.
> >   Something like:
> > 
> > static void __oprofile_add_ext_sample(struct foobar *fb);
> > void oprofile_add_sample(struct pt_regs * const regs, unsigned long event)
> > {
> > 	struct foobar fb = { .regs = regs, .event = event };
> > 	__oprofile_add_ext_sample(&fb);
> > }
> 
> OK, I've done this
> 
> > 
> >   The naming and description of oprofile_add_ext_hw_sample() is also
> >   not the best. As interface in include/linux/oprofile.h we could then
> >   merge oprofile_add_ext_sample() and oprofile_add_ext_hw_sample() to
> >   a single function.
> > 
> > It would be nice if you could implement this.
> 
> Sure, I will do this. However, I'm not sure exactly what you mean.
> Could you specify the interface in oprofile.h that you have in
> mind?

I mean to replace oprofile_add_ext_sample() and
oprofile_add_ext_hw_sample() by a new one. The interface would be in
the form of:

struct foobar {
       ...
};

static void __oprofile_add_ext_sample(struct foobar *fb);

Hope this makes sense. The advantage would be that we don't need to
extend the function's argument list anymore, we simply extend the
struct.
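
For illustration, here is a minimal user-space sketch of the pattern
described above. It is not the oprofile code: struct sample_args,
add_sample_common(), add_sample(), add_hw_sample(), the is_kernel field
and the stand-in struct pt_regs are all hypothetical names chosen for
the example. It only shows how a new field can be added to the struct
without touching the signature of the common function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct pt_regs { unsigned long addr; };

/* Hypothetical parameter struct ("struct foobar" in the mail above). */
struct sample_args {
	struct pt_regs *regs;
	unsigned long event;
	int is_kernel;	/* field added later; older callers leave it 0 */
};

/* Keeps the last sample so the sketch has something observable. */
static struct sample_args last_sample;

/* Single common worker: its signature never changes again. */
static void add_sample_common(const struct sample_args *args)
{
	assert(args != NULL);
	last_sample = *args;
}

/* Old-style wrapper: unnamed fields are zero-initialized. */
void add_sample(struct pt_regs *regs, unsigned long event)
{
	struct sample_args args = { .regs = regs, .event = event };

	add_sample_common(&args);
}

/* Wrapper that needs the extra field: only the struct grew. */
void add_hw_sample(struct pt_regs *regs, unsigned long event, int is_kernel)
{
	struct sample_args args = {
		.regs = regs, .event = event, .is_kernel = is_kernel,
	};

	add_sample_common(&args);
}
```

Both wrappers funnel into one worker, so adding another argument later
means adding one struct member and setting it only where it is needed.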

Please send me delta patches to oprofile/s390.

-Robert

-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* Re: [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up)
  2011-01-21 10:06 ` [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up) Heinz Graalfs
  2011-02-14 18:57   ` Robert Richter
@ 2011-03-25 11:00   ` Robert Richter
  2011-03-29 12:38     ` Heinz Graalfs
  1 sibling, 1 reply; 17+ messages in thread
From: Robert Richter @ 2011-03-25 11:00 UTC (permalink / raw)
  To: Heinz Graalfs
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On 21.01.11 05:06:52, Heinz Graalfs wrote:
> From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> 
> System z's hardware sampling is described in detail in:
> 
>    SA23-2260-01 "The Load-Program-Parameter and CPU-Measurement Facilities"
> 
> The patch introduces
>  - support for System z's hardware sampler in OProfile's kernel module
>  - it adds functions that control all hardware sampling related operations as
>   - checking if hardware sampling feature is available
>     - ie: on System z models z10 and up, in LPAR mode only, and authorised during LPAR activation
>   - allocating memory for the hardware sampling feature
>   - starting/stopping hardware sampling
> 
> All functions required to start and stop hardware sampling have to be
> invoked by the oprofile kernel module as provided by the other patches of this patch set.
> 
> In case hardware based sampling cannot be setup standard timer based sampling is used by OProfile.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
> Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> ---
>  arch/Kconfig                   |    3
>  arch/s390/Kconfig              |    1
>  arch/s390/oprofile/hwsampler.c | 1256 +++++++++++++++++++++++++++++++++++++++++
>  arch/s390/oprofile/hwsampler.h |  113 +++
>  4 files changed, 1373 insertions(+)

> +int hwsampler_setup()
> +{
> +       int rc;
> +       int cpu;
> +       struct hws_cpu_buffer *cb;
> +
> +       mutex_lock(&hws_sem);
> +
> +       rc = -EINVAL;
> +       if (hws_state)
> +               goto setup_exit;
> +
> +       hws_state = HWS_INIT;

hws_state is never set to zero again, so we will pass this point only
one time, even after hwsampler_shutdown(). Is this intended? Only
loading and unloading this as module would work.

Maybe we clear hws_state in hwsampler_shutdown()?
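
A small user-space sketch of this concern follows. It assumes that the
initial-state constant is non-zero, which the quoted patch does not
show; the enum values and the names setup_once() and shutdown_guarded()
are hypothetical and only model the "if (state) fail" guard.

```c
#include <errno.h>

/* Hypothetical states; ST_UNSET (0) is what a zeroed static holds. */
enum sketch_state { ST_UNSET = 0, ST_INIT, ST_DEALLOCATED };

static enum sketch_state state;

/* Mirrors the guard in the quoted setup code: any non-zero state fails. */
static int setup_once(void)
{
	if (state)
		return -EINVAL;
	state = ST_INIT;
	/* ... hardware checks and registrations would happen here ... */
	state = ST_DEALLOCATED;
	return 0;
}

/*
 * clear_state == 0 models resetting to the (assumed non-zero) init state,
 * clear_state != 0 models clearing the state back to zero.
 */
static int shutdown_guarded(int clear_state)
{
	if (state != ST_DEALLOCATED)
		return -EINVAL;
	state = clear_state ? ST_UNSET : ST_INIT;
	return 0;
}
```

Under that assumption, a second setup_once() after shutdown_guarded(0)
fails, while it succeeds after shutdown_guarded(1) or after the static
is reinitialized, as happens when a module is reloaded.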

-Robert

> +
> +       init_all_cpu_buffers();
> +
> +       rc = check_hardware_prerequisites();
> +       if (rc)
> +               goto setup_exit;
> +
> +       rc = check_qsi_on_setup();
> +       if (rc)
> +               goto setup_exit;
> +
> +       rc = -EINVAL;
> +       hws_wq = create_workqueue("hwsampler");
> +       if (!hws_wq)
> +               goto setup_exit;
> +
> +       register_cpu_notifier(&hws_cpu_notifier);
> +
> +       for_each_online_cpu(cpu) {
> +               cb = &per_cpu(sampler_cpu_buffer, cpu);
> +               INIT_WORK(&cb->worker, worker);
> +               rc = smp_ctl_qsi(cpu);
> +               WARN_ON(rc);
> +               if (min_sampler_rate != cb->qsi.min_sampl_rate) {
> +                       if (min_sampler_rate) {
> +                               printk(KERN_WARNING
> +                                       "hwsampler: different min sampler rate values.\n");
> +                               if (min_sampler_rate < cb->qsi.min_sampl_rate)
> +                                       min_sampler_rate =
> +                                               cb->qsi.min_sampl_rate;
> +                       } else
> +                               min_sampler_rate = cb->qsi.min_sampl_rate;
> +               }
> +               if (max_sampler_rate != cb->qsi.max_sampl_rate) {
> +                       if (max_sampler_rate) {
> +                               printk(KERN_WARNING
> +                                       "hwsampler: different max sampler rate values.\n");
> +                               if (max_sampler_rate > cb->qsi.max_sampl_rate)
> +                                       max_sampler_rate =
> +                                               cb->qsi.max_sampl_rate;
> +                       } else
> +                               max_sampler_rate = cb->qsi.max_sampl_rate;
> +               }
> +       }
> +       register_external_interrupt(0x1407, hws_ext_handler);
> +
> +       hws_state = HWS_DEALLOCATED;
> +       rc = 0;
> +
> +setup_exit:
> +       mutex_unlock(&hws_sem);
> +       return rc;
> +}
> +
> +int hwsampler_shutdown()
> +{
> +       int rc;
> +
> +       mutex_lock(&hws_sem);
> +
> +       rc = -EINVAL;
> +       if (hws_state == HWS_DEALLOCATED || hws_state == HWS_STOPPED) {
> +               mutex_unlock(&hws_sem);
> +
> +               if (hws_wq)
> +                       flush_workqueue(hws_wq);
> +
> +               mutex_lock(&hws_sem);
> +
> +               if (hws_state == HWS_STOPPED) {
> +                       smp_ctl_clear_bit(0, 5); /* set bit 58 CR0 off */
> +                       deallocate_sdbt();
> +               }
> +               if (hws_wq) {
> +                       destroy_workqueue(hws_wq);
> +                       hws_wq = NULL;
> +               }
> +
> +               unregister_external_interrupt(0x1407, hws_ext_handler);
> +               hws_state = HWS_INIT;
> +               rc = 0;
> +       }
> +       mutex_unlock(&hws_sem);
> +
> +       unregister_cpu_notifier(&hws_cpu_notifier);
> +
> +       return rc;
> +}

-- 
Advanced Micro Devices, Inc.
Operating System Research Center



* Re: [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up)
  2011-03-25 11:00   ` Robert Richter
@ 2011-03-29 12:38     ` Heinz Graalfs
  0 siblings, 0 replies; 17+ messages in thread
From: Heinz Graalfs @ 2011-03-29 12:38 UTC (permalink / raw)
  To: Robert Richter
  Cc: mingo, oprofile-list, linux-kernel, linux-s390, borntraeger,
	schwidefsky, heiko.carstens, Mahesh Salgaonkar,
	Maran Pakkirisamy

On Fri, 2011-03-25 at 12:00 +0100, Robert Richter wrote: 
> On 21.01.11 05:06:52, Heinz Graalfs wrote:
> > From: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> > 
> > System z's hardware sampling is described in detail in:
> > 
> >    SA23-2260-01 "The Load-Program-Parameter and CPU-Measurement Facilities"
> > 
> > The patch introduces
> >  - support for System z's hardware sampler in OProfile's kernel module
> >  - it adds functions that control all hardware sampling related operations as
> >   - checking if hardware sampling feature is available
> >     - ie: on System z models z10 and up, in LPAR mode only, and authorised during LPAR activation
> >   - allocating memory for the hardware sampling feature
> >   - starting/stopping hardware sampling
> > 
> > All functions required to start and stop hardware sampling have to be
> > invoked by the oprofile kernel module as provided by the other patches of this patch set.
> > 
> > In case hardware based sampling cannot be setup standard timer based sampling is used by OProfile.
> > 
> > Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> > Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
> > Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
> > ---
> >  arch/Kconfig                   |    3
> >  arch/s390/Kconfig              |    1
> >  arch/s390/oprofile/hwsampler.c | 1256 +++++++++++++++++++++++++++++++++++++++++
> >  arch/s390/oprofile/hwsampler.h |  113 +++
> >  4 files changed, 1373 insertions(+)
> 
> > +int hwsampler_setup()
> > +{
> > +       int rc;
> > +       int cpu;
> > +       struct hws_cpu_buffer *cb;
> > +
> > +       mutex_lock(&hws_sem);
> > +
> > +       rc = -EINVAL;
> > +       if (hws_state)
> > +               goto setup_exit;
> > +
> > +       hws_state = HWS_INIT;
> 
> hws_state is never set to zero again, so we will pass this point only
> one time, even after hwsampler_shutdown(). Is this intended? Only
> loading and unloading this as module would work.
> 
> Maybe we clear hws_state in hwsampler_shutdown()?
> 
> -Robert
> 
Yes, the intent is:
hwsampler_setup() is called on module load - oprofile_arch_init()
and
hwsampler_shutdown() is called on unload - oprofile_arch_exit()

hws_state is reset in hwsampler_shutdown()

hwsampler_shutdown() only succeeds if we are in state
HWS_DEALLOCATED or HWS_STOPPED.
Usually this is ensured by the order in which the functions are
called, and in this case hws_state is also reset to HWS_INIT.

If hwsampler_shutdown() is called in any other ('invalid') state, it
should return -EINVAL.
However, even in this case unregister_cpu_notifier() would be executed,
which is not good and should be avoided.

Although the current code never calls hwsampler_shutdown() in an
invalid state, I suppose it would be better to do the unregister call,
as intended, only if hwsampler_shutdown() succeeds.

I will submit a separate patch.
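
As a rough user-space sketch of that change (not the actual follow-up
patch): the enum values, the notifier_registered flag and the name
shutdown_checked() are hypothetical, and the flag merely stands in for
the unregister_cpu_notifier() call.

```c
#include <errno.h>

/* Hypothetical states, modelled on the names used in the quoted patch. */
enum sketch_state { ST_INIT = 1, ST_DEALLOCATED, ST_STOPPED, ST_STARTED };

static enum sketch_state state = ST_DEALLOCATED;
static int notifier_registered = 1;	/* stands in for the CPU notifier */

static int shutdown_checked(void)
{
	int rc = -EINVAL;

	if (state == ST_DEALLOCATED || state == ST_STOPPED) {
		/* ... flush work, free buffers, drop the interrupt ... */
		state = ST_INIT;
		rc = 0;
	}

	/* Unregister only if the state check passed. */
	if (rc == 0)
		notifier_registered = 0;

	return rc;
}
```

In an invalid state the sketch returns -EINVAL and leaves the notifier
registered, which is the behaviour described above.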

Heinz

> > +
> > +       init_all_cpu_buffers();
> > +
> > +       rc = check_hardware_prerequisites();
> > +       if (rc)
> > +               goto setup_exit;
> > +
> > +       rc = check_qsi_on_setup();
> > +       if (rc)
> > +               goto setup_exit;
> > +
> > +       rc = -EINVAL;
> > +       hws_wq = create_workqueue("hwsampler");
> > +       if (!hws_wq)
> > +               goto setup_exit;
> > +
> > +       register_cpu_notifier(&hws_cpu_notifier);
> > +
> > +       for_each_online_cpu(cpu) {
> > +               cb = &per_cpu(sampler_cpu_buffer, cpu);
> > +               INIT_WORK(&cb->worker, worker);
> > +               rc = smp_ctl_qsi(cpu);
> > +               WARN_ON(rc);
> > +               if (min_sampler_rate != cb->qsi.min_sampl_rate) {
> > +                       if (min_sampler_rate) {
> > +                               printk(KERN_WARNING
> > +                                       "hwsampler: different min sampler rate values.\n");
> > +                               if (min_sampler_rate < cb->qsi.min_sampl_rate)
> > +                                       min_sampler_rate =
> > +                                               cb->qsi.min_sampl_rate;
> > +                       } else
> > +                               min_sampler_rate = cb->qsi.min_sampl_rate;
> > +               }
> > +               if (max_sampler_rate != cb->qsi.max_sampl_rate) {
> > +                       if (max_sampler_rate) {
> > +                               printk(KERN_WARNING
> > +                                       "hwsampler: different max sampler rate values.\n");
> > +                               if (max_sampler_rate > cb->qsi.max_sampl_rate)
> > +                                       max_sampler_rate =
> > +                                               cb->qsi.max_sampl_rate;
> > +                       } else
> > +                               max_sampler_rate = cb->qsi.max_sampl_rate;
> > +               }
> > +       }
> > +       register_external_interrupt(0x1407, hws_ext_handler);
> > +
> > +       hws_state = HWS_DEALLOCATED;
> > +       rc = 0;
> > +
> > +setup_exit:
> > +       mutex_unlock(&hws_sem);
> > +       return rc;
> > +}
> > +
> > +int hwsampler_shutdown()
> > +{
> > +       int rc;
> > +
> > +       mutex_lock(&hws_sem);
> > +
> > +       rc = -EINVAL;
> > +       if (hws_state == HWS_DEALLOCATED || hws_state == HWS_STOPPED) {
> > +               mutex_unlock(&hws_sem);
> > +
> > +               if (hws_wq)
> > +                       flush_workqueue(hws_wq);
> > +
> > +               mutex_lock(&hws_sem);
> > +
> > +               if (hws_state == HWS_STOPPED) {
> > +                       smp_ctl_clear_bit(0, 5); /* set bit 58 CR0 off */
> > +                       deallocate_sdbt();
> > +               }
> > +               if (hws_wq) {
> > +                       destroy_workqueue(hws_wq);
> > +                       hws_wq = NULL;
> > +               }
> > +
> > +               unregister_external_interrupt(0x1407, hws_ext_handler);
> > +               hws_state = HWS_INIT;
> > +               rc = 0;
> > +       }
> > +       mutex_unlock(&hws_sem);
> > +
> > +       unregister_cpu_notifier(&hws_cpu_notifier);
> > +
> > +       return rc;
> > +}
> 





Thread overview: 17+ messages
2011-01-21 10:06 [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
2011-01-21 10:06 ` [patch v2 1/3] This patch adds support for hardware based sampling on System z processors (models z10 and up) Heinz Graalfs
2011-02-14 18:57   ` Robert Richter
2011-03-25 11:00   ` Robert Richter
2011-03-29 12:38     ` Heinz Graalfs
2011-01-21 10:06 ` [patch v2 2/3] This patch enhances OProfile to support System zs hardware sampling feature Heinz Graalfs
2011-02-14 19:01   ` Robert Richter
2011-02-14 19:03     ` Robert Richter
2011-02-14 19:05     ` [PATCH] oprofile, s390: Rework hwsampler implementation Robert Richter
2011-02-14 19:07     ` [PATCH] oprofile, s390: Fix section mismatch of function hws_cpu_callback() Robert Richter
2011-01-21 10:06 ` [patch v2 3/3] This patch introduces a new oprofile sample add function (oprofile_add_ext_hw_sample) Heinz Graalfs
2011-02-14 18:55   ` Robert Richter
2011-02-07  8:23 ` [patch v2 0/3] OProfile support for System z's hardware sampling Heinz Graalfs
2011-02-14 19:42 ` Robert Richter
2011-02-15  7:17   ` Heiko Carstens
2011-02-15 16:59   ` Heinz Graalfs
2011-02-15 17:29     ` Robert Richter
