LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH 0/4] perf tools: introduce --map-adjustment.
@ 2015-04-01 10:33 Wang Nan
  2015-04-01 10:33 ` [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno Wang Nan
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Wang Nan @ 2015-04-01 10:33 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: mingo, lizefan, pi3orama, linux-kernel

This series of patches introduce a --map-adjustment argument for
dealing with private dynamic linkers.

Some programs write their private dynamic loader instead of glibc ld for
different reasons. They mmap() executable memory area, assemble code
from different '.so' and '.o' files then do the relocation and code
fixing by itself. Since the memory area is not file-backended, perf is
unable to handle symbol information in those files.

Actually, it is not hard for us to create a /tmp/perf-%d.map file from
those ELF objects then utilize the JIT interface. However, without this
series of patches, dwarf unwind information is lost. We are unable to
unwind stack recorded by --call-graph=dwarf if they are compiled without
frame pointer. In addition, we are unable to use annotation to analysis
instruction level histogram.

This series of patches solve this problem by introducing
'--map-adjustment' argument and let users directly hint perf-report
about the private mapping which known to be copied from ELF files.

Patch 1/4: fix a bug in unwind hooks.
Patch 2/4: extracts common code from machine__process_mmap2_event and
           machine__process_mmap_event, create machine_map_new().
Patch 3/4: the main part of this series. The usage of the newly
           introduced argument is described in the commit message of
	   that patch. It also update document for the argument.
Patch 4/4: Allows libunwind to try to read from user provided dso even
           the required address is not actually mapped.

Wang Nan (4):
  perf tools: unwind: ensure unwind hooks return negative errorno.
  perf tools: introduce machine_map_new to merge mmap/mmap2 processing
    code.
  perf tools: report: introduce --map-adjustment argument.
  perf tools: unwinding: try to read from map_adj for a unmapped
    address.

 tools/perf/Documentation/perf-report.txt |  11 +
 tools/perf/builtin-report.c              |   2 +
 tools/perf/util/machine.c                | 355 ++++++++++++++++++++++++++++++-
 tools/perf/util/machine.h                |   3 +
 tools/perf/util/unwind-libunwind.c       |  28 ++-
 5 files changed, 383 insertions(+), 16 deletions(-)

-- 
1.8.3.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno.
  2015-04-01 10:33 [PATCH 0/4] perf tools: introduce --map-adjustment Wang Nan
@ 2015-04-01 10:33 ` Wang Nan
  2015-04-01 12:12   ` Jiri Olsa
  2015-04-01 10:33 ` [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code Wang Nan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Wang Nan @ 2015-04-01 10:33 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: mingo, lizefan, pi3orama, linux-kernel

According to man pages of libunwind, unwind hooks should return
'negative value of one of the unw_error_t error-codes', they are
different from generic error code. In addition, access_dso_mem()
returns '!(size == sizeof(*data))', compiler never ensure it is
negative when failure, which causes libunwind get undesire value
when accessing //anon memory.

This patch fixes this problem by force returning negative value when
error, instead of returning 'ret' itself when it is non-zero.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 tools/perf/util/unwind-libunwind.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 7b09a44..78a32c7 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -422,7 +422,7 @@ static int access_dso_mem(struct unwind_info *ui, unw_word_t addr,
 	size = dso__data_read_addr(al.map->dso, al.map, ui->machine,
 				   addr, (u8 *) data, sizeof(*data));
 
-	return !(size == sizeof(*data));
+	return (size == sizeof(*data)) ? 0 : -UNW_EINVAL;
 }
 
 static int access_mem(unw_addr_space_t __maybe_unused as,
@@ -443,13 +443,13 @@ static int access_mem(unw_addr_space_t __maybe_unused as,
 
 	ret = perf_reg_value(&start, &ui->sample->user_regs, PERF_REG_SP);
 	if (ret)
-		return ret;
+		return -UNW_EBADREG;
 
 	end = start + stack->size;
 
 	/* Check overflow. */
 	if (addr + sizeof(unw_word_t) < addr)
-		return -EINVAL;
+		return -UNW_EINVAL;
 
 	if (addr < start || addr + sizeof(unw_word_t) >= end) {
 		ret = access_dso_mem(ui, addr, valp);
@@ -491,12 +491,12 @@ static int access_reg(unw_addr_space_t __maybe_unused as,
 
 	id = libunwind__arch_reg_id(regnum);
 	if (id < 0)
-		return -EINVAL;
+		return -UNW_EBADREG;
 
 	ret = perf_reg_value(&val, &ui->sample->user_regs, id);
 	if (ret) {
 		pr_err("unwind: can't read reg %d\n", regnum);
-		return ret;
+		return -UNW_EBADREG;
 	}
 
 	*valp = (unw_word_t) val;
-- 
1.8.3.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code.
  2015-04-01 10:33 [PATCH 0/4] perf tools: introduce --map-adjustment Wang Nan
  2015-04-01 10:33 ` [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno Wang Nan
@ 2015-04-01 10:33 ` Wang Nan
  2015-04-01 12:18   ` Jiri Olsa
  2015-04-01 10:33 ` [PATCH 3/4] perf tools: report: introduce --map-adjustment argument Wang Nan
  2015-04-01 10:33 ` [PATCH 4/4] perf tools: unwinding: try to read from map_adj for a unmapped address Wang Nan
  3 siblings, 1 reply; 13+ messages in thread
From: Wang Nan @ 2015-04-01 10:33 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: mingo, lizefan, pi3orama, linux-kernel

Create a machine_map_new() and merge mapping code in
machine__process_mmap2_event() and machine__process_mmap_event()
together. This patch is a preparation for following map adjustment
patches.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 tools/perf/util/machine.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e335330..051883a 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1155,13 +1155,29 @@ out_problem:
 	return -1;
 }
 
+static int machine_map_new(struct machine *machine, u64 start, u64 len,
+		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
+		     u64 ino_gen, u32 prot, u32 flags, char *filename,
+		     enum map_type type, struct thread *thread)
+{
+	struct map *map;
+
+	map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
+			ino, ino_gen, prot, flags, filename, type, thread);
+
+	if (map == NULL)
+		return -1;
+
+	thread__insert_map(thread, map);
+	return 0;
+}
+
 int machine__process_mmap2_event(struct machine *machine,
 				 union perf_event *event,
 				 struct perf_sample *sample __maybe_unused)
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 	struct thread *thread;
-	struct map *map;
 	enum map_type type;
 	int ret = 0;
 
@@ -1186,7 +1202,7 @@ int machine__process_mmap2_event(struct machine *machine,
 	else
 		type = MAP__FUNCTION;
 
-	map = map__new(machine, event->mmap2.start,
+	ret = machine_map_new(machine, event->mmap2.start,
 			event->mmap2.len, event->mmap2.pgoff,
 			event->mmap2.pid, event->mmap2.maj,
 			event->mmap2.min, event->mmap2.ino,
@@ -1195,10 +1211,8 @@ int machine__process_mmap2_event(struct machine *machine,
 			event->mmap2.flags,
 			event->mmap2.filename, type, thread);
 
-	if (map == NULL)
+	if (ret)
 		goto out_problem;
-
-	thread__insert_map(thread, map);
 	return 0;
 
 out_problem:
@@ -1211,7 +1225,6 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 {
 	u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
 	struct thread *thread;
-	struct map *map;
 	enum map_type type;
 	int ret = 0;
 
@@ -1236,16 +1249,15 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
 	else
 		type = MAP__FUNCTION;
 
-	map = map__new(machine, event->mmap.start,
+	ret = machine_map_new(machine, event->mmap.start,
 			event->mmap.len, event->mmap.pgoff,
 			event->mmap.pid, 0, 0, 0, 0, 0, 0,
 			event->mmap.filename,
 			type, thread);
 
-	if (map == NULL)
+	if (ret)
 		goto out_problem;
 
-	thread__insert_map(thread, map);
 	return 0;
 
 out_problem:
-- 
1.8.3.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.
  2015-04-01 10:33 [PATCH 0/4] perf tools: introduce --map-adjustment Wang Nan
  2015-04-01 10:33 ` [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno Wang Nan
  2015-04-01 10:33 ` [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code Wang Nan
@ 2015-04-01 10:33 ` Wang Nan
  2015-04-01 13:21   ` Jiri Olsa
  2015-04-01 14:23   ` pi3orama
  2015-04-01 10:33 ` [PATCH 4/4] perf tools: unwinding: try to read from map_adj for a unmapped address Wang Nan
  3 siblings, 2 replies; 13+ messages in thread
From: Wang Nan @ 2015-04-01 10:33 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: mingo, lizefan, pi3orama, linux-kernel

This patch introduces a --map-adjustment argument for perf report. The
goal of this option is to deal with private dynamic loader used in some
special program.

Some programs write their private dynamic loader instead of glibc ld for
different reasons. They mmap() executable memory area, assemble code
from different '.so' and '.o' files then do the relocation and code
fixing by itself. The memory area is not file-backended so perf is
unable to handle symbol information in those files.

This patch allows user to give perf report hints directly using
'--map-adjustment' argument. Perf report will regard such mapping as
file-backended mapping and treat them as dso instead of private mapping
area.

The main part of this patch resides in util/machine.c. struct map_adj is
introduced to represent each adjustment. They are sorted and linked
together to map_adj_list linked list. When a real MMAP event raises,
perf checks such adjustments before calling map__new() and
thread__insert_map(), then setup filename and pgoff according to user
hints. It also splits MMAP events when necessary.

Usage of --map-adjustment is appended into Documentation/perf-report.txt.

Here is an example:

 $ perf report --map-adjustment=./libtest.so@0x7fa52fcb1000,0x4000,0x21000,92051 \
		--no-children

Where 0x7fa52fcb1000 is private map area got through:

   mmap(NULL, 4096 * 4, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE,
		   -1, 0);

And its contents are copied from libtest.so.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 tools/perf/Documentation/perf-report.txt |  11 ++
 tools/perf/builtin-report.c              |   2 +
 tools/perf/util/machine.c                | 276 ++++++++++++++++++++++++++++++-
 tools/perf/util/machine.h                |   2 +
 4 files changed, 288 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 4879cf6..e19349c 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -323,6 +323,17 @@ OPTIONS
 --header-only::
 	Show only perf.data header (forces --stdio).
 
+--map-adjustment=objfile@start,length[,pgoff[,pid]]::
+	Give memory layout hints for specific or all process. This makes
+	perf regard provided range of memory as mapped from provided
+	file instead of its original attributes found in perf.data.
+	start and length should be hexadecimal values represent the
+	address range. pgoff should be hexadecimal values represent
+	mapping offset (in pages) of that file. Default pgoff value is
+	0 (map from start of the file). If pid is ommited, such
+	adjustment will be applied to all process in this trace. This
+	should be used when perf.data contains only 1 process.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-annotate[1]
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index b5b2ad4..9fdfb05 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -717,6 +717,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "Don't show entries under that percent", parse_percent_limit),
 	OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
 		     "how to display percentage of filtered entries", parse_filter_percentage),
+	OPT_CALLBACK(0, "map-adjustment", NULL, "objfile@start,length[,pgoff[,pid]]",
+		     "Provide map adjustment hinting", parse_map_adjustment),
 	OPT_END()
 	};
 	struct perf_data_file file = {
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 051883a..dc9e91e 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1155,21 +1155,291 @@ out_problem:
 	return -1;
 }
 
+/*
+ * Users are allowed to provide map adjustment setting for the case
+ * that an address range is actually privatly mapped but known to be
+ * ELF object file backended. Like this:
+ *
+ * |<- copied from libx.so ->|  |<- copied from liby.so ->|
+ * |<-------------------- MMAP area --------------------->|
+ *
+ * When dealing with such mmap events, try to obey user adjustment.
+ * Such adjustment settings are not allowed overlapping.
+ * Adjustments won't be considered as valid code until real MMAP events
+ * take place. Therefore, users are allowed to provide adjustments which
+ * cover never mapped areas, like:
+ *
+ * |<- libx.so ->|  |<- liby.so ->|
+ *      |<-- MMAP area -->|
+ *
+ * This feature is useful when dealing with private dynamic linkers,
+ * which assemble code piece from different ELF objects.
+ *
+ * map_adj_list is an ordered linked list. Order of two adjustments is
+ * first defined by their pid, and then by their start address.
+ * Therefore, adjustments for specific pids are groupped together
+ * naturally.
+ */
+static LIST_HEAD(map_adj_list);
+struct map_adj {
+	u32 pid;
+	u64 start;
+	u64 len;
+	u64 pgoff;
+	struct list_head list;
+	char filename[PATH_MAX];
+};
+
+enum map_adj_cross {
+	MAP_ADJ_LEFT_PID,
+	MAP_ADJ_LEFT,
+	MAP_ADJ_CROSS,
+	MAP_ADJ_RIGHT,
+	MAP_ADJ_RIGHT_PID,
+};
+
+/*
+ * Check whether two map_adj cross over each other. This function is
+ * used for comparing adjustments. For overlapping adjustments, it
+ * reports different between two start address and the length of
+ * overlapping area. Signess of pgoff_diff can be used to determine
+ * which one is the left one.
+ *
+ * If anyone in r and l has pid set as -1, don't consider pid.
+ */
+static enum map_adj_cross
+check_map_adj_cross(struct map_adj* l, struct map_adj* r,
+		int *pgoff_diff, u64 *cross_len)
+{
+	bool swapped = false;
+
+	if ((l->pid != (u32)(-1)) && (r->pid != (u32)(-1))
+			&& (l->pid != r->pid))
+		return (l->pid < r->pid) ? MAP_ADJ_LEFT_PID : MAP_ADJ_RIGHT_PID;
+
+	if (l->start > r->start) {
+		struct map_adj *t = l;
+		swapped = true;
+		l = r;
+		r = t;
+	}
+
+	if (l->start + l->len > r->start) {
+		if (pgoff_diff)
+			*pgoff_diff = ((r->start - l->start) / page_size) *
+				(swapped ? -1 : 1);
+		if (cross_len) {
+			u64 cross_start = r->start;
+			u64 l_end = l->start + l->len;
+			u64 r_end = r->start + r->len;
+
+			*cross_len = (l_end < r_end ? l_end : r_end) -
+					cross_start;
+		}
+		return MAP_ADJ_CROSS;
+	}
+
+	return swapped ? MAP_ADJ_RIGHT : MAP_ADJ_LEFT;
+}
+
+static int machine_add_map_adj(u32 pid, u64 start, u64 len,
+		     u64 pgoff, const char *filename)
+{
+	struct map_adj *pos;
+	struct map_adj *new;
+	struct map_adj tmp = {
+		.pid = pid,
+		.start = start,
+		.len = len,
+	};
+
+	if (!filename)
+		return -EINVAL;
+
+	if ((start % page_size) || (len % page_size)) {
+		pr_err("Map adjustment is not page aligned for %d%s.\n", pid,
+				pid == (u32)(-1) ? " (all pids)" : "");
+		return -EINVAL;
+	}
+
+	if ((pid != (u32)(-1)) && (!list_empty(&map_adj_list))) {
+		/*
+		 * Don't allow mixing (u32)(-1) (for all pids) and
+		 * normal pid.
+		 *
+		 * During sorting, (u32)(-1) should be considered as
+		 * the largest pid.
+		 */
+		struct map_adj *largest = list_entry(map_adj_list.prev,
+				struct map_adj, list);
+
+		if (largest->pid == (u32)(-1)) {
+			pr_err("Providing both system-wide and pid specific map adjustments is forbidden.\n");
+			return -EINVAL;
+		}
+	}
+
+	/*
+	 * Find the first one which is larger than tmp and insert new
+	 * adj prior to it.
+	 */
+	list_for_each_entry(pos, &map_adj_list, list) {
+		enum map_adj_cross cross;
+
+		cross = check_map_adj_cross(&tmp, pos, NULL, NULL);
+		if (cross < MAP_ADJ_CROSS)
+			break;
+		if (cross == MAP_ADJ_CROSS) {
+			pr_err("Overlapping map adjustments provided for pid %d%s\n", pid,
+					pid == (u32)(-1) ? " (all pids)" : "");
+			return -EINVAL;
+		}
+	}
+
+	new = malloc(sizeof(*new));
+	if (!new)
+		return -EINVAL;
+
+	new->pid = pid;
+	new->start = start;
+	new->len = len;
+	new->pgoff = pgoff;
+	strncpy(new->filename, filename, PATH_MAX);
+	list_add(&new->list, pos->list.prev);
+	return 0;
+}
+
 static int machine_map_new(struct machine *machine, u64 start, u64 len,
 		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
 		     u64 ino_gen, u32 prot, u32 flags, char *filename,
 		     enum map_type type, struct thread *thread)
 {
+	struct map_adj *pos;
 	struct map *map;
 
-	map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
-			ino, ino_gen, prot, flags, filename, type, thread);
+	list_for_each_entry(pos, &map_adj_list, list) {
+		u64 adj_start, adj_len, adj_pgoff, cross_len;
+		enum map_adj_cross cross;
+		struct map_adj tmp;
+		int pgoff_diff;
+
+again:
+		if (len == 0)
+			break;
+
+		tmp.pid = pid;
+		tmp.start = start;
+		tmp.len = len;
+
+		cross = check_map_adj_cross(&tmp,
+				pos, &pgoff_diff, &cross_len);
+
+		if (cross < MAP_ADJ_CROSS)
+			break;
+		if (cross > MAP_ADJ_CROSS)
+			continue;
+
+		if (pgoff_diff <= 0) {
+			/*
+			 *       |<----- tmp ----->|
+			 * |<----- pos ----->|
+			 */
+
+			adj_start = tmp.start;
+			adj_len = cross_len;
+			adj_pgoff = pos->pgoff + (-pgoff_diff);
+			map = map__new(machine, adj_start, adj_len, adj_pgoff,
+					pid, 0, 0, 0, 0, prot, flags,
+					pos->filename, type, thread);
+		} else {
+			/*
+			 * |<----- tmp ----->|
+			 * |<-- X -->|<----- pos ----->|
+			 * In this case, only deal with tmp part X. goto again
+			 * instead of next pos.
+			 */
+			adj_start = tmp.start;
+			adj_len = tmp.len - cross_len;
+			adj_pgoff = tmp.pgoff;
+			map = map__new(machine, adj_start, adj_len, adj_pgoff,
+					pid, d_maj, d_min, ino, ino_gen, prot,
+					flags, filename, type, thread);
+
+		}
+
+		if (map == NULL)
+			goto error;
+
+		thread__insert_map(thread, map);
+
+		pgoff += adj_len / page_size;
+		start = tmp.start + adj_len;
+		len -= adj_len;
+		if (pgoff_diff > 0)
+			goto again;
+	}
+
+	map = map__new(machine, start, len, pgoff,
+			pid, d_maj, d_min, ino, ino_gen, prot,
+			flags, filename, type, thread);
 
 	if (map == NULL)
-		return -1;
+		goto error;
 
 	thread__insert_map(thread, map);
+
 	return 0;
+error:
+	return -1;
+}
+
+int parse_map_adjustment(const struct option *opt __maybe_unused,
+		const char *arg, int unset __maybe_unused)
+{
+	const char *ptr;
+	char *sep;
+	int err;
+	u64 start, len, pgoff = 0;
+	u32 pid = (u32)(-1);
+	char filename[PATH_MAX];
+
+	sep = strchr(arg, '@');
+	if (sep == NULL)
+		goto err;
+
+	strncpy(filename, arg, sep - arg);
+
+	ptr = sep + 1; /* Skip '@' */
+
+	/* start */
+	start = strtoll(ptr, &sep, 16);
+	if (*sep != ',')
+		goto err;
+	ptr = sep + 1;
+
+	/* len */
+	len = strtoll(ptr, &sep, 16);
+	if (*sep == ',') {
+		/* pgoff */
+		ptr = sep + 1;
+		pgoff = strtoll(ptr, &sep, 16);
+
+		if (*sep == ',') {
+			/* pid */
+			ptr = sep + 1;
+			pid = strtol(ptr, &sep, 10);
+		}
+	}
+
+	if (*sep != '\0')
+		goto err;
+
+	err = machine_add_map_adj(pid, start, len, pgoff, filename);
+	return err;
+
+err:
+	fprintf(stderr, "invalid map adjustment setting: %s\n", arg);
+	return -1;
 }
 
 int machine__process_mmap2_event(struct machine *machine,
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index e2faf3b..73b49e4 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -223,4 +223,6 @@ pid_t machine__get_current_tid(struct machine *machine, int cpu);
 int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
 			     pid_t tid);
 
+int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
+
 #endif /* __PERF_MACHINE_H */
-- 
1.8.3.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4/4] perf tools: unwinding: try to read from map_adj for a unmapped address.
  2015-04-01 10:33 [PATCH 0/4] perf tools: introduce --map-adjustment Wang Nan
                   ` (2 preceding siblings ...)
  2015-04-01 10:33 ` [PATCH 3/4] perf tools: report: introduce --map-adjustment argument Wang Nan
@ 2015-04-01 10:33 ` Wang Nan
  3 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2015-04-01 10:33 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: mingo, lizefan, pi3orama, linux-kernel

Previous patch allows users to use --map-adjustment to hint perf about
address ranges known to be shared object files backended. However, if
only parts of such object files are read, libunwind will fail to get
required information from unmapped area of those files.

This patch makes access_dso_mem() to try it best when when reading from
DSO. When it tried and fail the first time, it iterates over
map_adj_list to search a mapped DSO whcih contains such address and read
from it.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 tools/perf/util/machine.c          | 55 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/machine.h          |  1 +
 tools/perf/util/unwind-libunwind.c | 18 +++++++++++--
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index dc9e91e..5176932 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1186,6 +1186,9 @@ struct map_adj {
 	u64 start;
 	u64 len;
 	u64 pgoff;
+	/* filled when first mapping */
+	bool mapped;
+	u64 map_start;
 	struct list_head list;
 	char filename[PATH_MAX];
 };
@@ -1304,6 +1307,7 @@ static int machine_add_map_adj(u32 pid, u64 start, u64 len,
 	new->start = start;
 	new->len = len;
 	new->pgoff = pgoff;
+	new->mapped = false;
 	strncpy(new->filename, filename, PATH_MAX);
 	list_add(&new->list, pos->list.prev);
 	return 0;
@@ -1351,6 +1355,11 @@ again:
 			map = map__new(machine, adj_start, adj_len, adj_pgoff,
 					pid, 0, 0, 0, 0, prot, flags,
 					pos->filename, type, thread);
+
+			if (!pos->mapped) {
+				pos->mapped = true;
+				pos->map_start = adj_start;
+			}
 		} else {
 			/*
 			 * |<----- tmp ----->|
@@ -1393,6 +1402,52 @@ error:
 	return -1;
 }
 
+/*
+ * Search map_adj according to pid and address and return an already
+ * mapped address. This function is used for dwarf unwinding. When
+ * libunwind tries to read from a DSO and fail because the area is not
+ * mapped, this function can help it to find cooresponding map_adj, and
+ * give it another chance to read from DSO.
+ *
+ * However, we are unable to ensure this searching always correct. See
+ * this:
+ *
+ *    |<--- liba.so --->|<-.. liba.so (unmapped) ..->|
+ *                      |<--- libb.so --->|<-.. libb.so (unmapped) ..->|
+ *    |<---     mmaped map_adj        --->|   ^
+ *
+ * Actually mapped area is assembled by part of liba.so and part of
+ * libb.so. When libunwind tries to read from ^ marked place, this
+ * searching returns liba.so, but in fact no one (expect libunwind
+ * itself) knows which map_adj is correct.
+ */
+int machine__search_map_adj(pid_t pid, u64 addr, u64 *map_start)
+{
+	struct map_adj *pos;
+
+	if (!map_start)
+		return -EINVAL;
+
+	list_for_each_entry(pos, &map_adj_list, list) {
+		if (pos->pid != (u32)(-1)) {
+			if (pos->pid < (u32)pid)
+				continue;
+			if (pos->pid > (u32)pid)
+				break;
+		}
+
+		if (!pos->mapped)
+			continue;
+
+		if ((pos->start <= addr) && (pos->start + pos->len > addr)) {
+			*map_start = pos->map_start;
+			return 0;
+		}
+	}
+
+	return -EEXIST;
+}
+
 int parse_map_adjustment(const struct option *opt __maybe_unused,
 		const char *arg, int unset __maybe_unused)
 {
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 73b49e4..b0c66ce 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -224,5 +224,6 @@ int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
 			     pid_t tid);
 
 int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
+int machine__search_map_adj(pid_t pid, u64 addr, u64 *map_start);
 
 #endif /* __PERF_MACHINE_H */
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 78a32c7..a78c280 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -412,8 +412,22 @@ static int access_dso_mem(struct unwind_info *ui, unw_word_t addr,
 	thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
 			      MAP__FUNCTION, addr, &al);
 	if (!al.map) {
-		pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
-		return -1;
+		u64 map_start;
+
+		pr_debug("unwind: try find map from map_adj: %lx\n", (unsigned long)addr);
+		if (machine__search_map_adj(ui->thread->pid_, (u64)addr, &map_start)) {
+			pr_debug("unwind: no map for %lx even consider map adjustment\n",
+					(unsigned long)addr);
+			return -1;
+		}
+
+		thread__find_addr_map(ui->thread, PERF_RECORD_MISC_USER,
+				MAP__FUNCTION, map_start, &al);
+		if (!al.map) {
+			pr_debug("unwind: unable to find map for %lx (through %lx)\n",
+					(unsigned long)addr, (unsigned long)map_start);
+			return -1;
+		}
 	}
 
 	if (!al.map->dso)
-- 
1.8.3.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno.
  2015-04-01 10:33 ` [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno Wang Nan
@ 2015-04-01 12:12   ` Jiri Olsa
  2015-04-01 12:41     ` Wang Nan
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2015-04-01 12:12 UTC (permalink / raw)
  To: Wang Nan; +Cc: acme, jolsa, namhyung, mingo, lizefan, pi3orama, linux-kernel

On Wed, Apr 01, 2015 at 10:33:12AM +0000, Wang Nan wrote:
> According to man pages of libunwind, unwind hooks should return
> 'negative value of one of the unw_error_t error-codes', they are
> different from generic error code. In addition, access_dso_mem()
> returns '!(size == sizeof(*data))', compiler never ensure it is
> negative when failure, which causes libunwind get undesire value
> when accessing //anon memory.
> 
> This patch fixes this problem by force returning negative value when
> error, instead of returning 'ret' itself when it is non-zero.

hum, how about find_proc_info callback.. should it follow the same rules?

thanks,
jirka

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code.
  2015-04-01 10:33 ` [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code Wang Nan
@ 2015-04-01 12:18   ` Jiri Olsa
  2015-04-01 14:58     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2015-04-01 12:18 UTC (permalink / raw)
  To: Wang Nan; +Cc: acme, jolsa, namhyung, mingo, lizefan, pi3orama, linux-kernel

On Wed, Apr 01, 2015 at 10:33:13AM +0000, Wang Nan wrote:
> Create a machine_map_new() and merge mapping code in
> machine__process_mmap2_event() and machine__process_mmap_event()
> together. This patch is a preparation for following map adjustment
> patches.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
>  tools/perf/util/machine.c | 30 +++++++++++++++++++++---------
>  1 file changed, 21 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index e335330..051883a 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1155,13 +1155,29 @@ out_problem:
>  	return -1;
>  }
>  
> +static int machine_map_new(struct machine *machine, u64 start, u64 len,
> +		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
> +		     u64 ino_gen, u32 prot, u32 flags, char *filename,
> +		     enum map_type type, struct thread *thread)

the name style for this should be more like 'machine__new_map'
I think Arnaldo will chime in.. ;-)

jirka

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno.
  2015-04-01 12:12   ` Jiri Olsa
@ 2015-04-01 12:41     ` Wang Nan
  2015-04-07  8:30       ` Wang Nan
  0 siblings, 1 reply; 13+ messages in thread
From: Wang Nan @ 2015-04-01 12:41 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: acme, jolsa, namhyung, mingo, lizefan, pi3orama, linux-kernel

On 2015/4/1 20:12, Jiri Olsa wrote:
> On Wed, Apr 01, 2015 at 10:33:12AM +0000, Wang Nan wrote:
>> According to man pages of libunwind, unwind hooks should return
>> 'negative value of one of the unw_error_t error-codes', they are
>> different from generic error code. In addition, access_dso_mem()
>> returns '!(size == sizeof(*data))', compiler never ensure it is
>> negative when failure, which causes libunwind get undesire value
>> when accessing //anon memory.
>>
>> This patch fixes this problem by force returning negative value when
>> error, instead of returning 'ret' itself when it is non-zero.
> 
> hum, how about find_proc_info callback.. should it follow the same rules?
> 

Yes, but it only returns -EINVAL and dwarf_search_unwind_table(....). The latter
one is part of libunwind so we can trust it returns negative when fail.

> thanks,
> jirka
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.
  2015-04-01 10:33 ` [PATCH 3/4] perf tools: report: introduce --map-adjustment argument Wang Nan
@ 2015-04-01 13:21   ` Jiri Olsa
  2015-04-02  1:15     ` Wang Nan
  2015-04-01 14:23   ` pi3orama
  1 sibling, 1 reply; 13+ messages in thread
From: Jiri Olsa @ 2015-04-01 13:21 UTC (permalink / raw)
  To: Wang Nan; +Cc: acme, jolsa, namhyung, mingo, lizefan, pi3orama, linux-kernel

On Wed, Apr 01, 2015 at 10:33:14AM +0000, Wang Nan wrote:
> This patch introduces a --map-adjustment argument for perf report. The
> goal of this option is to deal with private dynamic loader used in some
> special program.
> 

SNIP

> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 051883a..dc9e91e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1155,21 +1155,291 @@ out_problem:
>  	return -1;
>  }
>  
> +/*
> + * Users are allowed to provide map adjustment setting for the case
> + * that an address range is actually privatly mapped but known to be
> + * ELF object file backended. Like this:
> + *
> + * |<- copied from libx.so ->|  |<- copied from liby.so ->|
> + * |<-------------------- MMAP area --------------------->|
> + *
> + * When dealing with such mmap events, try to obey user adjustment.
> + * Such adjustment settings are not allowed overlapping.
> + * Adjustments won't be considered as valid code until real MMAP events
> + * take place. Therefore, users are allowed to provide adjustments which
> + * cover never mapped areas, like:
> + *
> + * |<- libx.so ->|  |<- liby.so ->|
> + *      |<-- MMAP area -->|
> + *
> + * This feature is useful when dealing with private dynamic linkers,
> + * which assemble code piece from different ELF objects.
> + *
> + * map_adj_list is an ordered linked list. Order of two adjustments is
> + * first defined by their pid, and then by their start address.
> + * Therefore, adjustments for specific pids are groupped together
> + * naturally.
> + */
> +static LIST_HEAD(map_adj_list);

we dont like global stuff ;-)

I think this belongs to the machine object, which is created
within the perf_session__new, so after options parsing.. hum

perhaps you could stash stash 'struct map_adj' objects and
add some interface to init perf_session::machines::host
once it's created?

> +struct map_adj {

IMHO 'struct map_adjust' suits better.. using 'adjust' instead
of 'adj' is not such a waste of space and it's more readable
(for all 'adj' instances in the patch)

> +	u32 pid;
> +	u64 start;
> +	u64 len;
> +	u64 pgoff;
> +	struct list_head list;
> +	char filename[PATH_MAX];
> +};
> +
> +enum map_adj_cross {

'enum map_adjust' ?

> +	MAP_ADJ_LEFT_PID,
> +	MAP_ADJ_LEFT,
> +	MAP_ADJ_CROSS,
> +	MAP_ADJ_RIGHT,
> +	MAP_ADJ_RIGHT_PID,
> +};
> +
> +/*
> + * Check whether two map_adj cross over each other. This function is
> + * used for comparing adjustments. For overlapping adjustments, it
> + * reports different between two start address and the length of
> + * overlapping area. Signess of pgoff_diff can be used to determine
> + * which one is the left one.
> + *
> + * If anyone in r and l has pid set as -1, don't consider pid.
> + */

SNIP

>  static int machine_map_new(struct machine *machine, u64 start, u64 len,
>  		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
>  		     u64 ino_gen, u32 prot, u32 flags, char *filename,
>  		     enum map_type type, struct thread *thread)
>  {
> +	struct map_adj *pos;
>  	struct map *map;
>  
> -	map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
> -			ino, ino_gen, prot, flags, filename, type, thread);

could you please loop below into separate function?

> +	list_for_each_entry(pos, &map_adj_list, list) {
> +		u64 adj_start, adj_len, adj_pgoff, cross_len;
> +		enum map_adj_cross cross;
> +		struct map_adj tmp;
> +		int pgoff_diff;

just curious.. how many --map-adjust entries do you normaly use?
maybe if it's bigger number then a) using rb_tree might be faster
and b) using some sort of config file could be better way for
input might be easier

> +
> +again:
> +		if (len == 0)
> +			break;
> +
> +		tmp.pid = pid;
> +		tmp.start = start;
> +		tmp.len = len;
> +
> +		cross = check_map_adj_cross(&tmp,
> +				pos, &pgoff_diff, &cross_len);
> +
> +		if (cross < MAP_ADJ_CROSS)
> +			break;
> +		if (cross > MAP_ADJ_CROSS)
> +			continue;
> +
> +		if (pgoff_diff <= 0) {
> +			/*
> +			 *       |<----- tmp ----->|
> +			 * |<----- pos ----->|
> +			 */
> +
> +			adj_start = tmp.start;

SNIP

> +int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
> +
>  #endif /* __PERF_MACHINE_H */
> -- 
> 1.8.3.4
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.
  2015-04-01 10:33 ` [PATCH 3/4] perf tools: report: introduce --map-adjustment argument Wang Nan
  2015-04-01 13:21   ` Jiri Olsa
@ 2015-04-01 14:23   ` pi3orama
  1 sibling, 0 replies; 13+ messages in thread
From: pi3orama @ 2015-04-01 14:23 UTC (permalink / raw)
  To: Wang Nan
  Cc: <acme@kernel.org>, <jolsa@kernel.org>,
	<namhyung@kernel.org>, <mingo@redhat.com>,
	<lizefan@huawei.com>, <linux-kernel@vger.kernel.org>



发自我的 iPhone

> 在 2015年4月1日,下午6:33,Wang Nan <wangnan0@huawei.com> 写道:
> 
> This patch introduces a --map-adjustment argument for perf report. The
> goal of this option is to deal with private dynamic loader used in some
> special program.
> 
> Some programs write their private dynamic loader instead of glibc ld for
> different reasons. They mmap() executable memory area, assemble code
> from different '.so' and '.o' files then do the relocation and code
> fixing by itself. The memory area is not file-backended so perf is
> unable to handle symbol information in those files.
> 
> This patch allows user to give perf report hints directly using
> '--map-adjustment' argument. Perf report will regard such mapping as
> file-backended mapping and treat them as dso instead of private mapping
> area.
> 
> The main part of this patch resides in util/machine.c. struct map_adj is
> introduced to represent each adjustment. They are sorted and linked
> together to map_adj_list linked list. When a real MMAP event raises,
> perf checks such adjustments before calling map__new() and
> thread__insert_map(), then setup filename and pgoff according to user
> hints. It also splits MMAP events when necessary.
> 
> Usage of --map-adjustment is appended into Documentation/perf-report.txt.
> 
> Here is an example:
> 
> $ perf report --map-adjustment=./libtest.so@0x7fa52fcb1000,0x4000,0x21000,92051 \
>        --no-children
> 
> Where 0x7fa52fcb1000 is private map area got through:
> 
>   mmap(NULL, 4096 * 4, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE,
>           -1, 0);
> 
> And its contents are copied from libtest.so.
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
> tools/perf/Documentation/perf-report.txt |  11 ++
> tools/perf/builtin-report.c              |   2 +
> tools/perf/util/machine.c                | 276 ++++++++++++++++++++++++++++++-
> tools/perf/util/machine.h                |   2 +
> 4 files changed, 288 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 4879cf6..e19349c 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -323,6 +323,17 @@ OPTIONS
> --header-only::
>    Show only perf.data header (forces --stdio).
> 
> +--map-adjustment=objfile@start,length[,pgoff[,pid]]::
> +    Give memory layout hints for specific or all process. This makes
> +    perf regard provided range of memory as mapped from provided
> +    file instead of its original attributes found in perf.data.
> +    start and length should be hexadecimal values represent the
> +    address range. pgoff should be hexadecimal values represent
> +    mapping offset (in pages) of that file. Default pgoff value is
> +    0 (map from start of the file). If pid is ommited, such
> +    adjustment will be applied to all process in this trace. This
> +    should be used when perf.data contains only 1 process.
> +
> SEE ALSO
> --------
> linkperf:perf-stat[1], linkperf:perf-annotate[1]
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index b5b2ad4..9fdfb05 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -717,6 +717,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>             "Don't show entries under that percent", parse_percent_limit),
>    OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
>             "how to display percentage of filtered entries", parse_filter_percentage),
> +    OPT_CALLBACK(0, "map-adjustment", NULL, "objfile@start,length[,pgoff[,pid]]",
> +             "Provide map adjustment hinting", parse_map_adjustment),
>    OPT_END()
>    };
>    struct perf_data_file file = {
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 051883a..dc9e91e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1155,21 +1155,291 @@ out_problem:
>    return -1;
> }
> 
> +/*
> + * Users are allowed to provide map adjustment setting for the case
> + * that an address range is actually privatly mapped but known to be
> + * ELF object file backended. Like this:
> + *
> + * |<- copied from libx.so ->|  |<- copied from liby.so ->|
> + * |<-------------------- MMAP area --------------------->|
> + *
> + * When dealing with such mmap events, try to obey user adjustment.
> + * Such adjustment settings are not allowed overlapping.
> + * Adjustments won't be considered as valid code until real MMAP events
> + * take place. Therefore, users are allowed to provide adjustments which
> + * cover never mapped areas, like:
> + *
> + * |<- libx.so ->|  |<- liby.so ->|
> + *      |<-- MMAP area -->|
> + *
> + * This feature is useful when dealing with private dynamic linkers,
> + * which assemble code piece from different ELF objects.
> + *
> + * map_adj_list is an ordered linked list. Order of two adjustments is
> + * first defined by their pid, and then by their start address.
> + * Therefore, adjustments for specific pids are groupped together
> + * naturally.
> + */
> +static LIST_HEAD(map_adj_list);
> +struct map_adj {
> +    u32 pid;
> +    u64 start;
> +    u64 len;
> +    u64 pgoff;
> +    struct list_head list;
> +    char filename[PATH_MAX];
> +};
> +
> +enum map_adj_cross {
> +    MAP_ADJ_LEFT_PID,
> +    MAP_ADJ_LEFT,
> +    MAP_ADJ_CROSS,
> +    MAP_ADJ_RIGHT,
> +    MAP_ADJ_RIGHT_PID,
> +};
> +
> +/*
> + * Check whether two map_adj cross over each other. This function is
> + * used for comparing adjustments. For overlapping adjustments, it
> + * reports different between two start address and the length of
> + * overlapping area. Signess of pgoff_diff can be used to determine
> + * which one is the left one.
> + *
> + * If anyone in r and l has pid set as -1, don't consider pid.
> + */
> +static enum map_adj_cross
> +check_map_adj_cross(struct map_adj* l, struct map_adj* r,
> +        int *pgoff_diff, u64 *cross_len)
> +{
> +    bool swapped = false;
> +
> +    if ((l->pid != (u32)(-1)) && (r->pid != (u32)(-1))
> +            && (l->pid != r->pid))
> +        return (l->pid < r->pid) ? MAP_ADJ_LEFT_PID : MAP_ADJ_RIGHT_PID;
> +
> +    if (l->start > r->start) {
> +        struct map_adj *t = l;
> +        swapped = true;
> +        l = r;
> +        r = t;
> +    }
> +
> +    if (l->start + l->len > r->start) {
> +        if (pgoff_diff)
> +            *pgoff_diff = ((r->start - l->start) / page_size) *
> +                (swapped ? -1 : 1);
> +        if (cross_len) {
> +            u64 cross_start = r->start;
> +            u64 l_end = l->start + l->len;
> +            u64 r_end = r->start + r->len;
> +
> +            *cross_len = (l_end < r_end ? l_end : r_end) -
> +                    cross_start;
> +        }
> +        return MAP_ADJ_CROSS;
> +    }
> +
> +    return swapped ? MAP_ADJ_RIGHT : MAP_ADJ_LEFT;
> +}
> +
> +static int machine_add_map_adj(u32 pid, u64 start, u64 len,
> +             u64 pgoff, const char *filename)
> +{
> +    struct map_adj *pos;
> +    struct map_adj *new;
> +    struct map_adj tmp = {
> +        .pid = pid,
> +        .start = start,
> +        .len = len,
> +    };
> +
> +    if (!filename)
> +        return -EINVAL;
> +
> +    if ((start % page_size) || (len % page_size)) {
> +        pr_err("Map adjustment is not page aligned for %d%s.\n", pid,
> +                pid == (u32)(-1) ? " (all pids)" : "");
> +        return -EINVAL;
> +    }
> +
> +    if ((pid != (u32)(-1)) && (!list_empty(&map_adj_list))) {
> +        /*
> +         * Don't allow mixing (u32)(-1) (for all pids) and
> +         * normal pid.
> +         *
> +         * During sorting, (u32)(-1) should be considered as
> +         * the largest pid.
> +         */
> +        struct map_adj *largest = list_entry(map_adj_list.prev,
> +                struct map_adj, list);
> +
> +        if (largest->pid == (u32)(-1)) {
> +            pr_err("Providing both system-wide and pid specific map adjustments is forbidden.\n");
> +            return -EINVAL;
> +        }
> +    }
> +
> +    /*
> +     * Find the first one which is larger than tmp and insert new
> +     * adj prior to it.
> +     */
> +    list_for_each_entry(pos, &map_adj_list, list) {
> +        enum map_adj_cross cross;
> +
> +        cross = check_map_adj_cross(&tmp, pos, NULL, NULL);
> +        if (cross < MAP_ADJ_CROSS)
> +            break;
> +        if (cross == MAP_ADJ_CROSS) {
> +            pr_err("Overlapping map adjustments provided for pid %d%s\n", pid,
> +                    pid == (u32)(-1) ? " (all pids)" : "");
> +            return -EINVAL;
> +        }
> +    }
> +
> +    new = malloc(sizeof(*new));
> +    if (!new)
> +        return -EINVAL;
> +
> +    new->pid = pid;
> +    new->start = start;
> +    new->len = len;
> +    new->pgoff = pgoff;
> +    strncpy(new->filename, filename, PATH_MAX);
> +    list_add(&new->list, pos->list.prev);
> +    return 0;
> +}
> +
> static int machine_map_new(struct machine *machine, u64 start, u64 len,
>             u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
>             u64 ino_gen, u32 prot, u32 flags, char *filename,
>             enum map_type type, struct thread *thread)
> {
> +    struct map_adj *pos;
>    struct map *map;
> 
> -    map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
> -            ino, ino_gen, prot, flags, filename, type, thread);
> +    list_for_each_entry(pos, &map_adj_list, list) {
> +        u64 adj_start, adj_len, adj_pgoff, cross_len;
> +        enum map_adj_cross cross;
> +        struct map_adj tmp;
> +        int pgoff_diff;
> +
> +again:
> +        if (len == 0)
> +            break;
> +
> +        tmp.pid = pid;
> +        tmp.start = start;
> +        tmp.len = len;
> +
> +        cross = check_map_adj_cross(&tmp,
> +                pos, &pgoff_diff, &cross_len);
> +
> +        if (cross < MAP_ADJ_CROSS)
> +            break;
> +        if (cross > MAP_ADJ_CROSS)
> +            continue;
> +
> +        if (pgoff_diff <= 0) {
> +            /*
> +             *       |<----- tmp ----->|
> +             * |<----- pos ----->|
> +             */
> +
> +            adj_start = tmp.start;
> +            adj_len = cross_len;
> +            adj_pgoff = pos->pgoff + (-pgoff_diff);
> +            map = map__new(machine, adj_start, adj_len, adj_pgoff,
> +                    pid, 0, 0, 0, 0, prot, flags,
> +                    pos->filename, type, thread);
> +        } else {
> +            /*
> +             * |<----- tmp ----->|
> +             * |<-- X -->|<----- pos ----->|
> +             * In this case, only deal with tmp part X. goto again
> +             * instead of next pos.
> +             */
> +            adj_start = tmp.start;
> +            adj_len = tmp.len - cross_len;
> +            adj_pgoff = tmp.pgoff;
> +            map = map__new(machine, adj_start, adj_len, adj_pgoff,
> +                    pid, d_maj, d_min, ino, ino_gen, prot,
> +                    flags, filename, type, thread);
> +
> +        }
> +
> +        if (map == NULL)
> +            goto error;
> +
> +        thread__insert_map(thread, map);
> +
> +        pgoff += adj_len / page_size;
> +        start = tmp.start + adj_len;
> +        len -= adj_len;
> +        if (pgoff_diff > 0)
> +            goto again;
> +    }
> +
> +    map = map__new(machine, start, len, pgoff,
> +            pid, d_maj, d_min, ino, ino_gen, prot,
> +            flags, filename, type, thread);

We'd better check the value of len, and only do this mapping if len is not 0.

>    if (map == NULL)
> -        return -1;
> +        goto error;
> 
>    thread__insert_map(thread, map);
> +
>    return 0;
> +error:
> +    return -1;
> +}
> +
> +int parse_map_adjustment(const struct option *opt __maybe_unused,
> +        const char *arg, int unset __maybe_unused)
> +{
> +    const char *ptr;
> +    char *sep;
> +    int err;
> +    u64 start, len, pgoff = 0;
> +    u32 pid = (u32)(-1);
> +    char filename[PATH_MAX];
> +
> +    sep = strchr(arg, '@');
> +    if (sep == NULL)
> +        goto err;
> +
> +    strncpy(filename, arg, sep - arg);
> +
> +    ptr = sep + 1; /* Skip '@' */
> +
> +    /* start */
> +    start = strtoll(ptr, &sep, 16);
> +    if (*sep != ',')
> +        goto err;
> +    ptr = sep + 1;
> +
> +    /* len */
> +    len = strtoll(ptr, &sep, 16);
> +    if (*sep == ',') {
> +        /* pgoff */
> +        ptr = sep + 1;
> +        pgoff = strtoll(ptr, &sep, 16);
> +
> +        if (*sep == ',') {
> +            /* pid */
> +            ptr = sep + 1;
> +            pid = strtol(ptr, &sep, 10);
> +        }
> +    }
> +
> +    if (*sep != '\0')
> +        goto err;
> +
> +    err = machine_add_map_adj(pid, start, len, pgoff, filename);
> +    return err;
> +
> +err:
> +    fprintf(stderr, "invalid map adjustment setting: %s\n", arg);
> +    return -1;
> }
> 
> int machine__process_mmap2_event(struct machine *machine,
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index e2faf3b..73b49e4 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -223,4 +223,6 @@ pid_t machine__get_current_tid(struct machine *machine, int cpu);
> int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
>                 pid_t tid);
> 
> +int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
> +
> #endif /* __PERF_MACHINE_H */
> -- 
> 1.8.3.4
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code.
  2015-04-01 12:18   ` Jiri Olsa
@ 2015-04-01 14:58     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-04-01 14:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Wang Nan, jolsa, namhyung, mingo, lizefan, pi3orama, linux-kernel

Em Wed, Apr 01, 2015 at 02:18:53PM +0200, Jiri Olsa escreveu:
> On Wed, Apr 01, 2015 at 10:33:13AM +0000, Wang Nan wrote:
> > Create a machine_map_new() and merge mapping code in
> > machine__process_mmap2_event() and machine__process_mmap_event()
> > together. This patch is a preparation for following map adjustment
> > patches.

> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c

> > +static int machine_map_new(struct machine *machine, u64 start, u64 len,
> > +		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
> > +		     u64 ino_gen, u32 prot, u32 flags, char *filename,
> > +		     enum map_type type, struct thread *thread)
 
> the name style for this should be more like 'machine__new_map'
> I think Arnaldo will chime in.. ;-)

Right, I wonder if this is a function that is about 'maps' or 'machines'
mostly, that would say what prefix should be used, if map__ or
machine__, still need to review this patchset.

Just give a really quick lock, eyebrows raised, even more complexity for
the symbol layer, wonder if this can't be simplified...

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.
  2015-04-01 13:21   ` Jiri Olsa
@ 2015-04-02  1:15     ` Wang Nan
  0 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2015-04-02  1:15 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: acme, jolsa, namhyung, mingo, lizefan, pi3orama, linux-kernel

On 2015/4/1 21:21, Jiri Olsa wrote:
> On Wed, Apr 01, 2015 at 10:33:14AM +0000, Wang Nan wrote:
>> This patch introduces a --map-adjustment argument for perf report. The
>> goal of this option is to deal with private dynamic loader used in some
>> special program.
>>
> 
> SNIP
> 
>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
>> index 051883a..dc9e91e 100644
>> --- a/tools/perf/util/machine.c
>> +++ b/tools/perf/util/machine.c
>> @@ -1155,21 +1155,291 @@ out_problem:
>>  	return -1;
>>  }
>>  
>> +/*
>> + * Users are allowed to provide map adjustment setting for the case
>> + * that an address range is actually privatly mapped but known to be
>> + * ELF object file backended. Like this:
>> + *
>> + * |<- copied from libx.so ->|  |<- copied from liby.so ->|
>> + * |<-------------------- MMAP area --------------------->|
>> + *
>> + * When dealing with such mmap events, try to obey user adjustment.
>> + * Such adjustment settings are not allowed overlapping.
>> + * Adjustments won't be considered as valid code until real MMAP events
>> + * take place. Therefore, users are allowed to provide adjustments which
>> + * cover never mapped areas, like:
>> + *
>> + * |<- libx.so ->|  |<- liby.so ->|
>> + *      |<-- MMAP area -->|
>> + *
>> + * This feature is useful when dealing with private dynamic linkers,
>> + * which assemble code piece from different ELF objects.
>> + *
>> + * map_adj_list is an ordered linked list. Order of two adjustments is
>> + * first defined by their pid, and then by their start address.
>> + * Therefore, adjustments for specific pids are groupped together
>> + * naturally.
>> + */
>> +static LIST_HEAD(map_adj_list);
> 
> we dont like global stuff ;-)
> 
> I think this belongs to the machine object, which is created
> within the perf_session__new, so after options parsing.. hum
> 

Do you think such struct map_adj objects should better reside in thread
objects?

SNIP

> 
> just curious.. how many --map-adjust entries do you normaly use?
> maybe if it's bigger number then a) using rb_tree might be faster
> and b) using some sort of config file could be better way for
> input might be easier
> 

The address and pid are dynamically allocated so I don't think static config
file is a good way for input. I'll consider rb_tree in my next post.

Thank you.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno.
  2015-04-01 12:41     ` Wang Nan
@ 2015-04-07  8:30       ` Wang Nan
  0 siblings, 0 replies; 13+ messages in thread
From: Wang Nan @ 2015-04-07  8:30 UTC (permalink / raw)
  To: Jiri Olsa, acme, jolsa, mingo; +Cc: namhyung, lizefan, pi3orama, linux-kernel

Hi folks,

I'm rethinking --map-adjustment now, and I believe what we need should be something
like 'perf inject', which allows us to inject fake mmap events into perf.data to
make 'perf report' believe some //anon memory are file based mapping. Patch 2/4 - 4/4
seem not useful now. However, patch 1/4 is still useful because it is a bugfix. Could
you please drop the other 3 patches and merge this one?

Thank you.

On 2015/4/1 20:41, Wang Nan wrote:
> On 2015/4/1 20:12, Jiri Olsa wrote:
>> On Wed, Apr 01, 2015 at 10:33:12AM +0000, Wang Nan wrote:
>>> According to man pages of libunwind, unwind hooks should return
>>> 'negative value of one of the unw_error_t error-codes', they are
>>> different from generic error code. In addition, access_dso_mem()
>>> returns '!(size == sizeof(*data))', compiler never ensure it is
>>> negative when failure, which causes libunwind get undesire value
>>> when accessing //anon memory.
>>>
>>> This patch fixes this problem by force returning negative value when
>>> error, instead of returning 'ret' itself when it is non-zero.
>>
>> hum, how about find_proc_info callback.. should it follow the same rules?
>>
> 
> Yes, but it only returns -EINVAL and dwarf_search_unwind_table(....). The latter
> one is part of libunwind so we can trust it returns negative when fail.
> 
>> thanks,
>> jirka
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-04-07  8:31 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-01 10:33 [PATCH 0/4] perf tools: introduce --map-adjustment Wang Nan
2015-04-01 10:33 ` [PATCH 1/4] perf tools: unwind: ensure unwind hooks return negative errorno Wang Nan
2015-04-01 12:12   ` Jiri Olsa
2015-04-01 12:41     ` Wang Nan
2015-04-07  8:30       ` Wang Nan
2015-04-01 10:33 ` [PATCH 2/4] perf tools: introduce machine_map_new to merge mmap/mmap2 processing code Wang Nan
2015-04-01 12:18   ` Jiri Olsa
2015-04-01 14:58     ` Arnaldo Carvalho de Melo
2015-04-01 10:33 ` [PATCH 3/4] perf tools: report: introduce --map-adjustment argument Wang Nan
2015-04-01 13:21   ` Jiri Olsa
2015-04-02  1:15     ` Wang Nan
2015-04-01 14:23   ` pi3orama
2015-04-01 10:33 ` [PATCH 4/4] perf tools: unwinding: try to read from map_adj for a unmapped address Wang Nan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).