LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v1 00/14] perf: Stream comparison
@ 2020-03-10  7:02 Jin Yao
  2020-03-10  7:02 ` [PATCH v1 01/14] perf util: Create source line mapping table Jin Yao
                   ` (13 more replies)
  0 siblings, 14 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Sometimes, a small change in a hot function reducing the cycles of
this function, but the overall workload doesn't get faster. It is
interesting where the cycles are moved to.

What it would like is to diff before/after streams. A stream we think
is a callchain which is aggregated by the branch records from samples.

By browsing the hot streams, we can understand the hot code flow.
By comparing the cycles variation of same streams between old perf
data and new perf data, we can understand if the cycles are moved to
the unchanged code.

The before stream is the stream before source code changed
(e.g. streams in perf.data.old). The after stream is the stream
after source code changed (e.g. streams in perf.data).

Diffing before/after streams compares all streams (or compares top
N hot streams) between two perf data files.

If all entries of one stream in perf.data.old are fully matched with
all entries of another stream in perf.data, we think these two streams
are matched otherwise the streams are not matched.

For example,

   cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
--------------------------              --------------------------
             main div.c:39                           main div.c:39
             main div.c:44                           main div.c:44

It looks that two streams are matched and we can see for the same
streams the cycles are equal and the callchain hit percents are
slightly changed. That's expected in the normal range.

But that's not always true if source code is changed in perf.data
(e.g. source line div.c:39 is changed). If the source line is changed,
they are different streams, we can't compare them. We will think the
stream in perf.data is a new stream.

The challenge is how to identify the changed source lines. The basic
idea is to use linux command "diff" to compare the source file A and
source file A* line by line (assume A is used in perf.data.old
and A* is updated in perf.data). According to "diff" output, we can
create a source line mapping table.

For example,

  Execute 'diff ./before/div.c ./after/div.c'

  25c25
  <       i = rand() % 2;
  ---
  >       i = rand() % 4;
  39c39
  <       for (i = 0; i < 2000000000; i++) {
  ---
  >       for (i = 0; i < 20000000001; i++) {

  div.c (after -> before) lines mapping:
  0 -> 0
  1 -> 1
  2 -> 2
  3 -> 3
  4 -> 4
  5 -> 5
  6 -> 6
  7 -> 7
  8 -> 8
  9 -> 9
  ...
  24 -> 24
  25 -> -1
  26 -> 26
  27 -> 27
  28 -> 28
  29 -> 29
  30 -> 30
  31 -> 31
  32 -> 32
  33 -> 33
  34 -> 34
  35 -> 35
  36 -> 36
  37 -> 37
  38 -> 38
  39 -> -1
  40 -> 40
  ...

From the table, we can easily know div.c:39 is source line changed.
(mapped to -1). So these two streams are not matched.

Besides the hot streams comparison, this patch series also support
the top N hottest blocks comparison.

It's also useful to figure out the top N hottest blocks from old perf
data file and figure out the top N hottest blocks from new perf data file,
and then compare them for the cycles diff. It can let us easily know
how many cycles are moved from one block to another block.

Now let's see examples.

perf record -b ...      Generate perf.data.old with branch data
perf record -b ...      Generate perf.data with branch data
perf diff --stream --percent-limit 2

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39

hot chain pair 3:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 4:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

Ignore the rightmost columns such as '[Program Block Range]' and 'Shared Object' for saving space

# Output based on old perf data:
#
# Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
# ...............  ..........  ...............................  ..........................  .....................
#
           25.20%          18                     -0.36%,   -1                           -                      -
           15.24%           7                     -0.45%,    0                           -                      -
            5.07%           2                      0.09%,    0                           -                      -
            4.84%           2                      0.26%,    0                           -                      -
            4.72%           2                      0.30%,    0                           -                      -
            3.91%           1                      0.29%,    0                           -                      -
            3.05%           1                      0.11%,    0                           -                      -
            2.90%           1                      0.08%,    0                           -                      -
            2.71%           1                     -0.11%,    0                           -                      -
            2.44%           1                      0.09%,    0                           -                      -
            2.35%           1                     -0.09%,    0                           -                      -
            2.27%           1                      0.15%,    0                           -                      -
            2.27%           1                      0.06%,    0                           -                      -
            2.17%           1                      0.09%,    0                           -                      -

If we enable the source line comparison, the output might be different.

perf diff --stream --before ./before --after ./after

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 2:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39*
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39*

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

# Output based on old perf data:
#
# Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
# ...............  ..........  ...............................  ..........................  .....................
#
           25.20%          18    [block changed in new stream]                      24.84%                     17
           15.24%           7                     -0.45%,    0                           -                      -
            5.07%           2                      0.09%,    0                           -                      -
            4.84%           2                      0.26%,    0                           -                      -
            4.72%           2                      0.30%,    0                           -                      -
            3.91%           1                      0.29%,    0                           -                      -
            3.05%           1                      0.11%,    0                           -                      -
            2.90%           1                      0.08%,    0                           -                      -
            2.71%           1                     -0.11%,    0                           -                      -
            2.44%           1                      0.09%,    0                           -                      -
            2.35%           1                     -0.09%,    0                           -                      -
            2.27%           1                      0.15%,    0                           -                      -
            2.27%           1                      0.06%,    0                           -                      -
            2.17%           1                      0.09%,    0                           -                      -

Sometime some changes are not reflected in the source code,
e.g. changing the compiler option. So for this, we can't get
the changes by diffing the source code lines.

This patch series also introduces a new perf-diff option "--changed-func".
It passes the names of changed functions then perf-diff can know what
functions are changed.

For example,
perf diff --stream --changed-func main --changed-func rand

NOTE:
-----
1. For the patches:

  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Return per-event callchain streams
  perf util: Compare two streams
  perf util: Calculate the sum of all streams hits
  perf util: Report hot streams
  perf diff: Support hot streams comparison

  These patches support the hot stream comparison.

2. For the patches:
  perf util: Add new block info functions for top N hot blocks comparison
  perf util: Add new block info fmts for showing hot blocks comparison
  perf util: Enable block source line comparison
  perf diff: support hot blocks comparison

  These patches support the hot blocks comparison.

3. For the patches
  perf util: Filter out streams by name of changed functions
  perf util: Filter out blocks by name of changed functions
  perf diff: Filter out streams by changed functions

  These patches support a user specified function name list which let
  perf-diff know these functions are changed.

Jin Yao (14):
  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Return per-event callchain streams
  perf util: Compare two streams
  perf util: Calculate the sum of all streams hits
  perf util: Report hot streams
  perf diff: Support hot streams comparison
  perf util: Add new block info functions for top N hot blocks
    comparison
  perf util: Add new block info fmts for showing hot blocks comparison
  perf util: Enable block source line comparison
  perf diff: support hot blocks comparison
  perf util: Filter out streams by name of changed functions
  perf util: Filter out blocks by name of changed functions
  perf diff: Filter out streams by changed functions

 tools/perf/Documentation/perf-diff.txt |  19 +
 tools/perf/builtin-diff.c              | 324 ++++++++++++---
 tools/perf/util/Build                  |   1 +
 tools/perf/util/block-info.c           | 450 +++++++++++++++++++-
 tools/perf/util/block-info.h           |  38 +-
 tools/perf/util/callchain.c            | 521 +++++++++++++++++++++++
 tools/perf/util/callchain.h            |  34 ++
 tools/perf/util/srclist.c              | 552 +++++++++++++++++++++++++
 tools/perf/util/srclist.h              |  74 ++++
 9 files changed, 1953 insertions(+), 60 deletions(-)
 create mode 100644 tools/perf/util/srclist.c
 create mode 100644 tools/perf/util/srclist.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 01/14] perf util: Create source line mapping table
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10 15:08   ` Arnaldo Carvalho de Melo
  2020-03-10  7:02 ` [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains Jin Yao
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Sometimes, a small change in a hot function reducing the cycles of
this function, but the overall workload doesn't get faster. It is
interesting where the cycles are moved to.

What it would like is to diff before/after streams. A stream is a
callchain which is aggregated by the branch records from samples.

By browsing the hot streams, we can understand the hot code flow.
By comparing the cycles variation of same streams between old perf
data and new perf data, we can understand if the cycles are moved to
the unchanged code.

The before stream is the stream before source line changed
(e.g. streams in perf.data.old). The after stream is the stream
after source line changed (e.g. streams in perf.data).

Diffing before/after streams compares all streams (or compares top
N hot streams) between two perf data files.

If all entries of one stream in perf.data.old are fully matched with
all entries of another stream in perf.data, we think these two streams
are matched, otherwise the streams are not matched.

For example,

   cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
--------------------------              --------------------------
             main div.c:39                           main div.c:39
             main div.c:44                           main div.c:44

It looks that two streams are matched and we can see for the same
streams the cycles are equal and the stream hit percents are
slightly changed. That's expected in the normal range.

But that's not always true if source lines are changed in perf.data
(e.g. div.c:39 is changed). If the source line is changed, they
are different streams, we can't compare them. We just think the
stream in perf.data is a new one.

The challenge is how to identify the changed source lines. The basic
idea is to use linux command 'diff' to compare the source file A and
source file A* line by line (assume A is used in perf.data.old
and A* is updated in perf.data). Create a source line mapping table.

For example,

  Execute 'diff ./before/div.c ./after/div.c'

  25c25
  <       i = rand() % 2;
  ---
  >       i = rand() % 4;
  39c39
  <       for (i = 0; i < 2000000000; i++) {
  ---
  >       for (i = 0; i < 20000000001; i++) {

  div.c (after -> before) lines mapping:
  0 -> 0
  1 -> 1
  2 -> 2
  3 -> 3
  4 -> 4
  5 -> 5
  6 -> 6
  7 -> 7
  8 -> 8
  9 -> 9
  ...
  24 -> 24
  25 -> -1
  26 -> 26
  27 -> 27
  28 -> 28
  29 -> 29
  30 -> 30
  31 -> 31
  32 -> 32
  33 -> 33
  34 -> 34
  35 -> 35
  36 -> 36
  37 -> 37
  38 -> 38
  39 -> -1
  40 -> 40
  ...

From the table, we can easily know div.c:39 is source line changed.
(it's mapped to -1). So these two streams are not matched.

This patch can also handle the cases of new source lines insertion
and old source lines deletion. This source line mapping table is a
base for next processing for stream comparison.

This patch creates a new rblist 'srclist' to manage the source line
mapping tables. Each node contains the source line mapping table of
a before/after source file pair. The node is created on demand as
we process the samples. So it's likely we don't need to create so many
nodes.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/Build     |   1 +
 tools/perf/util/srclist.c | 552 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/srclist.h |  65 +++++
 3 files changed, 618 insertions(+)
 create mode 100644 tools/perf/util/srclist.c
 create mode 100644 tools/perf/util/srclist.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index c0cf8dff694e..b9bf620b60f0 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -99,6 +99,7 @@ perf-y += call-path.o
 perf-y += rwsem.o
 perf-y += thread-stack.o
 perf-y += spark.o
+perf-y += srclist.o
 perf-$(CONFIG_AUXTRACE) += auxtrace.o
 perf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
 perf-$(CONFIG_AUXTRACE) += intel-pt.o
diff --git a/tools/perf/util/srclist.c b/tools/perf/util/srclist.c
new file mode 100644
index 000000000000..cb1d42a3a8ed
--- /dev/null
+++ b/tools/perf/util/srclist.c
@@ -0,0 +1,552 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Manage difference of source lines
+ * Copyright (c) 2020, Intel Corporation.
+ * Author: Jin Yao
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include <sys/types.h>
+#include <inttypes.h>
+#include <string.h>
+#include <strings.h>
+#include <unistd.h>
+#include <err.h>
+#include <errno.h>
+#include <limits.h>
+#include <stdbool.h>
+#include <linux/zalloc.h>
+#include "strlist.h"
+#include "srclist.h"
+#include "debug.h"
+
+enum {
+	DIFF_LINE_ADD,
+	DIFF_LINE_DEL,
+	DIFF_LINE_CHANGE
+};
+
+static void get_full_path(const char *dir, const char *rel_path,
+			  char *full_path, int size)
+{
+	if (!dir) {
+		snprintf(full_path, size, "%s", rel_path);
+		return;
+	}
+
+	if (dir[strlen(dir) - 1] == '/')
+		snprintf(full_path, size, "%s%s", dir, rel_path);
+	else
+		snprintf(full_path, size, "%s/%s", dir, rel_path);
+}
+
+static int get_line_num(char *path)
+{
+	FILE *fh;
+	int num = 0, ch, last = 0;
+
+	fh = fopen(path, "r");
+	if (!fh) {
+		pr_debug("Failed to open file %s\n", path);
+		return -1;
+	}
+
+	while (!feof(fh)) {
+		ch = fgetc(fh);
+		if (ch == '\n')
+			num++;
+		last = ch;
+	}
+
+	fclose(fh);
+
+	if (last != '\n')
+		num++;
+
+	return num;
+}
+
+static int get_digit(char *str, char **end_str, int *val)
+{
+	int v;
+	char *end;
+
+	errno = 0;
+	v = strtol(str, &end, 10);
+	if ((errno == ERANGE && (v == INT_MAX || v == INT_MIN))
+		|| (errno != 0 && v == 0)) {
+		return -1;
+	}
+
+	if (end == str)
+		return -1;
+
+	*val = v;
+	*end_str = end;
+	return 0;
+}
+
+static int get_range(char *str, int *start_val, int *end_val)
+{
+	int val, ret;
+	char *end, *next;
+
+	*start_val = -1;
+	*end_val = -1;
+
+	ret = get_digit(str, &end, &val);
+	if (ret)
+		return ret;
+
+	*start_val = val;
+
+	if (*end != '\0') {
+		next = end + 1;
+		ret = get_digit(next, &end, &val);
+		if (ret)
+			return ret;
+
+		*end_val = val;
+	}
+
+	return 0;
+}
+
+static int opr_str_idx(char *str, int len, char ch)
+{
+	int i;
+
+	for (i = 0; i < len; i++) {
+		if (str[i] == ch)
+			break;
+	}
+
+	if (i == len || i == 0 || i == len - 1)
+		return -1;
+
+	if (str[i - 1] >= '0' && str[i - 1] <= '9' &&
+	    str[i + 1] >= '0' && str[i + 1] <= '9') {
+		return i;
+	}
+
+	return -1;
+}
+
+static bool get_diff_operation(char *str, int *opr, int *b_start, int *b_end,
+			       int *a_start, int *a_end)
+{
+	int idx, len;
+
+	if (str[0] == '<' || str[0] == '>' || str[0] == '-')
+		return false;
+
+	len = strlen(str);
+
+	/*
+	 * For example,
+	 * 5,6d4
+	 * 8a7
+	 * 20,21c19
+	 */
+	idx = opr_str_idx(str, len, 'd');
+	if (idx != -1) {
+		*opr = DIFF_LINE_DEL;
+		goto exit;
+	}
+
+	idx = opr_str_idx(str, len, 'a');
+	if (idx != -1) {
+		*opr = DIFF_LINE_ADD;
+		goto exit;
+	}
+
+	idx = opr_str_idx(str, len, 'c');
+	if (idx != -1) {
+		*opr = DIFF_LINE_CHANGE;
+		goto exit;
+	}
+
+exit:
+	if (idx != -1) {
+		str[idx] = 0;
+		get_range(str, b_start, b_end);
+		get_range(&str[idx + 1], a_start, a_end);
+		return true;
+	}
+
+	return false;
+}
+
+static int del_lines(struct line_pair *lines, int b_start, int b_end,
+		     int a_start, int a_end __maybe_unused,
+		     int *nr_lines, int *b_adjust)
+{
+	int i = *nr_lines;
+	bool set = false;
+
+	/*
+	 * For example, 5,6d4
+	 * It means line5~line6 of file1 are not same as line4 of file2
+	 * since line5~line6 are deleted.
+	 */
+
+	if (a_start == i) {
+		lines[i].a_nr = i;
+		lines[i].b_nr = i + *b_adjust;
+		set = true;
+	}
+
+	if (b_end != -1)
+		*b_adjust += b_end - b_start + 1;
+	else
+		*b_adjust += 1;
+
+	if (!set) {
+		lines[i].a_nr = i;
+		lines[i].b_nr = i + *b_adjust;
+	}
+
+	*nr_lines = i + 1;
+	return 0;
+}
+
+static int add_lines(struct line_pair *lines,
+		     int b_start __maybe_unused, int b_end __maybe_unused,
+		     int a_start, int a_end, int *nr_lines, int *b_adjust)
+{
+	int i = *nr_lines;
+
+	/*
+	 * For example, 8a7
+	 * It means line8 of file1 is not same as line7 of file2
+	 * since a new line (line7) is inserted to file2.
+	 */
+	if (a_end != -1) {
+		for (int j = 0; j < a_end - a_start + 1; j++) {
+			lines[i].a_nr = i;
+			lines[i].b_nr = -1;
+			i++;
+		}
+		*b_adjust -= a_end - a_start + 1;
+	} else {
+		lines[i].a_nr = i;
+		lines[i].b_nr = -1;
+		i++;
+		*b_adjust -= 1;
+	}
+
+	*nr_lines = i;
+	return 0;
+}
+
+static int change_lines(struct line_pair *lines, int b_start, int b_end,
+			int a_start, int a_end, int *nr_lines, int *b_adjust)
+{
+	int i = *nr_lines;
+
+	/*
+	 * For example, 20,21c19
+	 * It means line20~line21 of file1 are not same as line19 of file2
+	 * since they are changed.
+	 */
+
+	if (a_end != -1) {
+		for (int j = 0; j < a_end - a_start + 1; j++) {
+			lines[i].a_nr = i;
+			lines[i].b_nr = -1;
+			i++;
+		}
+	} else {
+		lines[i].a_nr = i;
+		lines[i].b_nr = -1;
+		i++;
+	}
+
+	if (b_end != -1)
+		*b_adjust += b_end - b_start;
+
+	if (a_end != -1)
+		*b_adjust -= a_end - a_start;
+
+	*nr_lines = i;
+	return 0;
+}
+
+static int update_lines(struct line_pair *lines, int opr, int b_start,
+			int b_end, int a_start, int a_end, int *nr_lines,
+			int *b_adjust)
+{
+	int i = *nr_lines;
+
+	while (i < a_start) {
+		lines[i].a_nr = i;
+		lines[i].b_nr = i + *b_adjust;
+		i++;
+	}
+
+	*nr_lines = i;
+
+	switch (opr) {
+	case DIFF_LINE_DEL:
+		del_lines(lines, b_start, b_end, a_start, a_end,
+			  nr_lines, b_adjust);
+		break;
+
+	case DIFF_LINE_ADD:
+		add_lines(lines, b_start, b_end, a_start, a_end,
+			  nr_lines, b_adjust);
+		break;
+
+	case DIFF_LINE_CHANGE:
+		change_lines(lines, b_start, b_end, a_start, a_end,
+			     nr_lines, b_adjust);
+		break;
+
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+static void update_remaining(struct line_pair *lines, int a_num, int *nr_lines,
+			     int b_adjust)
+{
+	int i;
+
+	for (i = *nr_lines; i <= a_num; i++) {
+		lines[i].a_nr = i;
+		lines[i].b_nr = i + b_adjust;
+	}
+
+	*nr_lines = i;
+}
+
+static int build_mapping_table(FILE *diff_fd, struct line_pair *lines,
+			       int *nr_lines, int a_num)
+{
+	char *str, *p;
+	size_t len;
+	int opr, b_start, b_end, a_start, a_end, b_adjust = 0;
+
+	*nr_lines = 1;  /* First line is reserved */
+
+	while (!feof(diff_fd)) {
+		str = NULL;
+		if (getline(&str, &len, diff_fd) < 0 || !len)
+			break;
+
+		p = strchr(str, '\n');
+		if (p)
+			*p = '\0';
+
+		pr_debug("%s\n", str);
+
+		if (!get_diff_operation(str, &opr, &b_start, &b_end,
+					&a_start, &a_end)) {
+			continue;
+		}
+
+		update_lines(lines, opr, b_start, b_end, a_start, a_end,
+			     nr_lines, &b_adjust);
+	}
+
+	update_remaining(lines, a_num, nr_lines, b_adjust);
+
+	free(str);
+	return 0;
+}
+
+static int create_line_mapping(struct src_info *info, char *b_path,
+			       char *a_path)
+{
+	char cmd[PATH_MAX];
+	FILE *diff_fd = NULL;
+	int ret = -1;
+
+	info->nr_lines_before = get_line_num(b_path);
+	if (info->nr_lines_before == -1)
+		goto out;
+
+	info->nr_lines_after = get_line_num(a_path);
+	if (info->nr_lines_after == -1)
+		goto out;
+
+	/* Reserve first line */
+	info->nr_max = info->nr_lines_before + info->nr_lines_after + 1;
+	info->lines = calloc(info->nr_max, sizeof(struct line_pair));
+	if (!info->lines) {
+		pr_err("Failed to alloc memory for lines\n");
+		goto out;
+	}
+
+	snprintf(cmd, sizeof(cmd), "diff %s %s",
+		 b_path, a_path);
+
+	pr_debug("Execute '%s'\n", cmd);
+
+	diff_fd = popen(cmd, "r");
+	if (!diff_fd) {
+		pr_err("Failed to execute '%s'\n", cmd);
+		goto out;
+	}
+
+	ret = build_mapping_table(diff_fd, info->lines, &info->nr_lines,
+				  info->nr_lines_after);
+
+out:
+	if (diff_fd)
+		fclose(diff_fd);
+
+	return ret;
+}
+
+static void free_src_info(struct src_info *info)
+{
+	if (info->rel_path)
+		zfree(&info->rel_path);
+
+	if (info->lines)
+		zfree(&info->lines);
+}
+
+static int init_src_info(char *b_path, char *a_path, const char *rel_path,
+			 struct src_info *info)
+{
+	int ret;
+
+	info->rel_path = strdup(rel_path);
+	if (!info->rel_path)
+		return -1;
+
+	ret = create_line_mapping(info, b_path, a_path);
+	if (ret) {
+		free_src_info(info);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void free_src_node(struct src_node *node)
+{
+	free_src_info(&node->info);
+	free(node);
+}
+
+static struct rb_node *srclist__node_new(struct rblist *rblist,
+					 const void *entry)
+{
+	struct srclist *slist = container_of(rblist, struct srclist, rblist);
+	const char *rel_path = entry;
+	char b_path[PATH_MAX], a_path[PATH_MAX];
+	struct src_node *node;
+	int ret;
+
+	get_full_path(slist->before_dir, rel_path, b_path, PATH_MAX);
+	get_full_path(slist->after_dir, rel_path, a_path, PATH_MAX);
+
+	node = calloc(1, sizeof(*node));
+	if (!node)
+		return NULL;
+
+	ret = init_src_info(b_path, a_path, rel_path, &node->info);
+	if (ret)
+		goto err;
+
+	return &node->rb_node;
+
+err:
+	free_src_node(node);
+	return NULL;
+}
+
+static void srclist__node_delete(struct rblist *rblist __maybe_unused,
+				 struct rb_node *rb_node)
+{
+	struct src_node *node = container_of(rb_node, struct src_node, rb_node);
+
+	free_src_info(&node->info);
+	free(node);
+}
+
+static int srclist__node_cmp(struct rb_node *rb_node, const void *entry)
+{
+	struct src_node *node = container_of(rb_node, struct src_node, rb_node);
+	const char *str = entry;
+
+	return strcmp(node->info.rel_path, str);
+}
+
+struct srclist *srclist__new(const char *before_dir, const char *after_dir)
+{
+	struct srclist *slist = calloc(1, sizeof(*slist));
+
+	if (!slist)
+		return NULL;
+
+	rblist__init(&slist->rblist);
+	slist->rblist.node_cmp = srclist__node_cmp;
+	slist->rblist.node_new = srclist__node_new;
+	slist->rblist.node_delete = srclist__node_delete;
+
+	slist->before_dir = before_dir;
+	slist->after_dir = after_dir;
+	return slist;
+}
+
+void srclist__delete(struct srclist *slist)
+{
+	if (slist)
+		rblist__delete(&slist->rblist);
+}
+
+struct src_node *srclist__find(struct srclist *slist, char *rel_path,
+			       bool create)
+{
+	struct src_node *node = NULL;
+	struct rb_node *rb_node;
+
+	if (create)
+		rb_node = rblist__findnew(&slist->rblist, (void *)rel_path);
+	else
+		rb_node = rblist__find(&slist->rblist, (void *)rel_path);
+
+	if (rb_node)
+		node = container_of(rb_node, struct src_node, rb_node);
+
+	return node;
+}
+
+struct line_pair *srclist__line_pair(struct src_node *node, int a_nr)
+{
+	struct src_info *info = &node->info;
+
+	if (a_nr < info->nr_lines && a_nr > 0)
+		return &info->lines[a_nr];
+
+	pr_debug("Exceeds line range: nr_lines = %d, a_nr = %d\n",
+		 info->nr_lines, a_nr);
+
+	return NULL;
+}
+
+void srclist__dump(struct srclist *slist)
+{
+	struct src_node *node;
+	struct src_info *info;
+	int i;
+
+	srclist__for_each_entry(node, slist) {
+		info = &node->info;
+
+		pr_debug("%s (after -> before) lines mapping:\n",
+			 info->rel_path);
+
+		for (i = 0; i < info->nr_lines; i++) {
+			pr_debug("%d -> %d\n",
+				 info->lines[i].a_nr,
+				 info->lines[i].b_nr);
+		}
+	}
+}
diff --git a/tools/perf/util/srclist.h b/tools/perf/util/srclist.h
new file mode 100644
index 000000000000..f25b0de91a13
--- /dev/null
+++ b/tools/perf/util/srclist.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_SRCLIST_H
+#define __PERF_SRCLIST_H
+
+#include <linux/types.h>
+#include <linux/rbtree.h>
+#include "rblist.h"
+
+struct line_pair {
+	int a_nr;	/* line nr in after.c */
+	int b_nr;	/* line nr in before.c */
+};
+
+struct src_node;
+
+struct src_info {
+	char *rel_path; /* relative path */
+	struct line_pair *lines;
+	int nr_max;
+	int nr_lines;
+	int nr_lines_before;
+	int nr_lines_after;
+};
+
+struct src_node {
+	struct rb_node rb_node;
+	struct src_info info;
+};
+
+struct srclist {
+	struct rblist rblist;
+	const char *before_dir;
+	const char *after_dir;
+};
+
+struct srclist *srclist__new(const char *before_dir, const char *after_dir);
+void srclist__delete(struct srclist *slist);
+
+struct src_node *srclist__find(struct srclist *slist, char *rel_path,
+			       bool create);
+
+struct line_pair *srclist__line_pair(struct src_node *node, int a_nr);
+void srclist__dump(struct srclist *slist);
+
+static inline struct src_node *srclist__first(struct srclist *slist)
+{
+	struct rb_node *rn = rb_first_cached(&slist->rblist.entries);
+
+	return rn ? rb_entry(rn, struct src_node, rb_node) : NULL;
+}
+
+static inline struct src_node *srclist__next(struct src_node *sn)
+{
+	struct rb_node *rn;
+
+	if (!sn)
+		return NULL;
+	rn = rb_next(&sn->rb_node);
+	return rn ? rb_entry(rn, struct src_node, rb_node) : NULL;
+}
+
+#define srclist__for_each_entry(pos, slist)	\
+	for (pos = srclist__first(slist); pos; pos = srclist__next(pos))
+
+#endif /* __PERF_SRCLIST_H */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
  2020-03-10  7:02 ` [PATCH v1 01/14] perf util: Create source line mapping table Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10 15:11   ` Arnaldo Carvalho de Melo
  2020-03-10  7:02 ` [PATCH v1 03/14] perf util: Return per-event callchain streams Jin Yao
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We think the stream is a callchain which is aggregated by the LBR
records from samples. By browsing the stream, we can understand
the code flow.

The struct callchain_node represents one callchain and we use the
callchain_node->hit to measure the hot level of this callchain.
Higher is hotter.

Since in perf data file, there may be many callchains so we just
need to focus on the top N hottest callchains. N is a user defined
parameter or just a predefined default value.

This patch saves the top N hottest callchains in 'struct stream_node'
type array, which is defined in a per event 'struct callchain_streams'.

So now we can get the per-event top N hottest callchains.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 125 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  16 +++++
 2 files changed, 141 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 818aa4efd386..d9c68a8e7619 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -31,6 +31,7 @@
 #include "callchain.h"
 #include "branch.h"
 #include "symbol.h"
+#include "evlist.h"
 #include "../perf.h"
 
 #define CALLCHAIN_PARAM_DEFAULT			\
@@ -1599,3 +1600,127 @@ void callchain_cursor_reset(struct callchain_cursor *cursor)
 	for (node = cursor->first; node != NULL; node = node->next)
 		map__zput(node->ms.map);
 }
+
+static void free_evsel_streams(struct callchain_streams *callchain_streams,
+			       int nr_evsel)
+{
+	for (int i = 0; i < nr_evsel; i++) {
+		if (callchain_streams[i].streams)
+			free(callchain_streams[i].streams);
+	}
+
+	free(callchain_streams);
+}
+
+static struct callchain_streams *create_evsel_streams(int nr_evsel,
+						      int nr_streams_max)
+{
+	struct callchain_streams *callchain_streams;
+
+	callchain_streams = calloc(nr_evsel, sizeof(struct callchain_streams));
+	if (!callchain_streams)
+		return NULL;
+
+	for (int i = 0; i < nr_evsel; i++) {
+		struct callchain_streams *s = &callchain_streams[i];
+
+		s->streams = calloc(nr_streams_max, sizeof(struct stream_node));
+		if (!s->streams)
+			goto err;
+
+		s->nr_streams_max = nr_streams_max;
+		s->evsel_idx = -1;
+	}
+
+	return callchain_streams;
+
+err:
+	free_evsel_streams(callchain_streams, nr_evsel);
+	return NULL;
+}
+
+/*
+ * The cnodes with high hit number are hot callchains.
+ */
+static void set_hot_cnode(struct callchain_streams *s,
+			  struct callchain_node *cnode)
+{
+	int i, idx = 0;
+	u64 hit;
+
+	if (s->nr_streams < s->nr_streams_max) {
+		i = s->nr_streams;
+		s->streams[i].cnode = cnode;
+		s->nr_streams++;
+		return;
+	}
+
+	/*
+	 * Since only a few number of hot streams, so only use simple
+	 * way to find the cnode with smallest hit number and replace.
+	 */
+	hit = (s->streams[0].cnode)->hit;
+	for (i = 1; i < s->nr_streams; i++) {
+		if ((s->streams[i].cnode)->hit < hit) {
+			hit = (s->streams[i].cnode)->hit;
+			idx = i;
+		}
+	}
+
+	if (cnode->hit > hit)
+		s->streams[idx].cnode = cnode;
+}
+
+static void update_hot_streams(struct hist_entry *he,
+			       struct callchain_streams *s)
+{
+	struct rb_root *root = &he->sorted_chain;
+	struct rb_node *rb_node = rb_first(root);
+	struct callchain_node *node;
+
+	while (rb_node) {
+		node = rb_entry(rb_node, struct callchain_node, rb_node);
+		set_hot_cnode(s, node);
+		rb_node = rb_next(rb_node);
+	}
+}
+
+static void get_hot_streams(struct hists *hists,
+			    struct callchain_streams *s)
+{
+	struct rb_node *next;
+
+	next = rb_first_cached(&hists->entries);
+	while (next) {
+		struct hist_entry *he;
+
+		he = rb_entry(next, struct hist_entry, rb_node);
+		update_hot_streams(he, s);
+		next = rb_next(&he->rb_node);
+	}
+}
+
+struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
+							 int nr_streams_max,
+							 int *nr_evsel_streams)
+{
+	int nr_evsel = evlist->core.nr_entries, i = 0;
+	struct callchain_streams *callchain_streams;
+	struct evsel *pos;
+
+	callchain_streams = create_evsel_streams(nr_evsel, nr_streams_max);
+	if (!callchain_streams)
+		return NULL;
+
+	evlist__for_each_entry(evlist, pos) {
+		struct hists *hists = evsel__hists(pos);
+
+		hists__output_resort(hists, NULL);
+		get_hot_streams(hists, &callchain_streams[i]);
+		callchain_streams[i].evsel_idx = pos->idx;
+		i++;
+	}
+
+	*nr_evsel_streams = nr_evsel;
+	return callchain_streams;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 706bb7bbe1e1..5852990cdf60 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -13,6 +13,7 @@ struct ip_callchain;
 struct map;
 struct perf_sample;
 struct thread;
+struct evlist;
 
 #define HELP_PAD "\t\t\t\t"
 
@@ -159,6 +160,17 @@ struct callchain_cursor {
 	struct callchain_cursor_node	*curr;
 };
 
+struct stream_node {
+	struct callchain_node	*cnode;
+};
+
+struct callchain_streams {
+	struct stream_node	*streams;
+	int			nr_streams_max;
+	int			nr_streams;
+	int			evsel_idx;
+};
+
 extern __thread struct callchain_cursor callchain_cursor;
 
 static inline void callchain_init(struct callchain_root *root)
@@ -289,4 +301,8 @@ int callchain_branch_counts(struct callchain_root *root,
 			    u64 *branch_count, u64 *predicted_count,
 			    u64 *abort_count, u64 *cycles_count);
 
+struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
+							 int nr_streams_max,
+							 int *nr_evsel_streams);
+
 #endif	/* __PERF_CALLCHAIN_H */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 03/14] perf util: Return per-event callchain streams
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
  2020-03-10  7:02 ` [PATCH v1 01/14] perf util: Create source line mapping table Jin Yao
  2020-03-10  7:02 ` [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 04/14] perf util: Compare two streams Jin Yao
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

In previous patch, we have created a 'struct callchain_streams'
array and each array entry contains per-event callchain streams.

This patch returns the pointer of per-event callchain streams
according to the evsel_idx.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 12 ++++++++++++
 tools/perf/util/callchain.h |  4 ++++
 2 files changed, 16 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index d9c68a8e7619..31ecc1abe7f9 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1724,3 +1724,15 @@ struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
 	*nr_evsel_streams = nr_evsel;
 	return callchain_streams;
 }
+
+struct callchain_streams *callchain_evsel_streams_get(struct callchain_streams *cs,
+						      int nr_streams_max,
+						      int evsel_idx)
+{
+	for (int i = 0; i < nr_streams_max; i++) {
+		if (cs[i].evsel_idx == evsel_idx)
+			return &cs[i];
+	}
+
+	return NULL;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 5852990cdf60..4e881361e154 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -305,4 +305,8 @@ struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
 							 int nr_streams_max,
 							 int *nr_evsel_streams);
 
+struct callchain_streams *callchain_evsel_streams_get(struct callchain_streams *cs,
+						      int nr_streams_max,
+						      int evsel_idx);
+
 #endif	/* __PERF_CALLCHAIN_H */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 04/14] perf util: Compare two streams
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (2 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 03/14] perf util: Return per-event callchain streams Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 05/14] perf util: Calculate the sum of all streams hits Jin Yao
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

If all callchain entries of one stream are fully matched with
all entries of another stream, we think these two streams are
matched.

For example,

   cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
   -----------------------                 -----------------------
             main div.c:39                           main div.c:39
             main div.c:44                           main div.c:44

Above two streams are matched if div.c is not changed.

But the streams are not matched if div.c is source line changed
(e.g. div.c:39 is changed). If the source line is changed, they
are different streams.

So the challenge is how to identify the changed source lines.
For this purpose, we use the source line mapping table (see patch
"perf util: Create source line mapping table"). The source line
mapping information is saved in src_list.

So the matching logic is, if we can find the source lines, the
callchain entry comparison is based on source lines. Otherwise,
fallback to dso address comparison.

Once two streams are fully matched, they will be linked as
a pair. From the pair, we can know which streams are matched.
The pair information will be used in next patches.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 172 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |   8 ++
 2 files changed, 180 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 31ecc1abe7f9..a9dd91268b00 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1736,3 +1736,175 @@ struct callchain_streams *callchain_evsel_streams_get(struct callchain_streams *
 
 	return NULL;
 }
+
+static bool chain_srclist_match(struct srclist *src_list, const char *srcline_a,
+				const char *srcline_b, bool *src_found,
+				bool *src_changed)
+{
+	char file_a[PATH_MAX], file_b[PATH_MAX];
+	int line_a, line_b;
+	struct src_node *node;
+	struct line_pair *lp;
+
+	if (sscanf(srcline_a, "%[^:]:%d", file_a, &line_a) != 2)
+		return false;
+
+	if (sscanf(srcline_b, "%[^:]:%d", file_b, &line_b) != 2)
+		return false;
+
+	if (strcmp(file_a, file_b))
+		return false;
+
+	node = srclist__find(src_list, file_a, true);
+	if (!node)
+		return false;
+
+	*src_found = true;
+
+	lp = srclist__line_pair(node, line_a);
+	if (!lp)
+		return false;
+
+	if (line_a == line_b && lp->b_nr == -1) {
+		*src_changed = true;
+		return true;
+	}
+
+	if (lp->b_nr == line_b)
+		return true;
+
+	return false;
+}
+
+static bool chain_match(struct callchain_list *base_chain,
+			struct callchain_list *pair_chain,
+			struct srclist *src_list,
+			bool *src_changed)
+{
+	enum match_result match;
+	bool src_found = false;
+
+	*src_changed = false;
+
+	if (src_list && chain_srclist_match(src_list, pair_chain->srcline,
+					    base_chain->srcline, &src_found,
+					    src_changed)) {
+		return true;
+	}
+
+	if (!src_found)  {
+		match = match_chain_strings(base_chain->srcline,
+					    pair_chain->srcline);
+		if (match != MATCH_ERROR)
+			return match == MATCH_EQ;
+
+		match = match_chain_dso_addresses(base_chain->ms.map,
+						  base_chain->ip,
+						  pair_chain->ms.map,
+						  pair_chain->ip);
+
+		return match == MATCH_EQ;
+	}
+
+	return false;
+}
+
+static bool callchain_node_matched(struct callchain_node *base_cnode,
+				   struct callchain_node *pair_cnode,
+				   struct srclist *src_list,
+				   int *nr_changed)
+{
+	struct callchain_list *base_chain, *pair_chain;
+	bool match = false;
+
+	pair_chain = list_first_entry(&pair_cnode->val,
+				      struct callchain_list,
+				      list);
+
+	list_for_each_entry(base_chain, &base_cnode->val, list) {
+		bool src_changed;
+
+		if (&pair_chain->list == &pair_cnode->val)
+			return false;
+
+		if (!base_chain->srcline || !pair_chain->srcline) {
+			pair_chain = list_next_entry(pair_chain, list);
+			continue;
+		}
+
+		match = chain_match(base_chain, pair_chain, src_list,
+				    &src_changed);
+
+		if (src_changed) {
+			pair_chain->src_changed = true;
+			*nr_changed += 1;
+		}
+
+		if (!match)
+			return false;
+
+		pair_chain = list_next_entry(pair_chain, list);
+	}
+
+	/*
+	 * Say chain1 is ABC, chain2 is ABCD, we consider they are
+	 * not fully matched.
+	 */
+	if (pair_chain && (&pair_chain->list != &pair_cnode->val))
+		return false;
+
+	return match;
+}
+
+static struct stream_node *stream_node_match(struct stream_node *base_node,
+					     struct callchain_streams *cs_pair,
+					     struct srclist *src_list,
+					     bool *src_changed)
+{
+	*src_changed = false;
+
+	for (int i = 0; i < cs_pair->nr_streams; i++) {
+		struct stream_node *pair_node = &cs_pair->streams[i];
+		int nr_changed = 0;
+
+		if (callchain_node_matched(base_node->cnode, pair_node->cnode,
+					   src_list, &nr_changed)) {
+			if (nr_changed)
+				*src_changed = true;
+
+			return pair_node;
+		}
+	}
+
+	return NULL;
+}
+
+static void stream_nodes_link(struct stream_node *base_node,
+			      struct stream_node *pair_node,
+			      bool src_changed)
+{
+	base_node->pair_cnode = pair_node->cnode;
+	base_node->src_changed = src_changed;
+	pair_node->pair_cnode = base_node->cnode;
+}
+
+void callchain_match_streams(struct callchain_streams *cs_base,
+			     struct callchain_streams *cs_pair,
+			     struct srclist *src_list)
+{
+	for (int i = 0; i < cs_base->nr_streams; i++) {
+		struct stream_node *base_node = &cs_base->streams[i];
+		struct stream_node *pair_node;
+		bool src_changed;
+
+		pair_node = stream_node_match(base_node, cs_pair, src_list,
+					      &src_changed);
+		if (pair_node)
+			stream_nodes_link(base_node, pair_node, src_changed);
+		else
+			base_node->src_changed = src_changed;
+	}
+
+	if (src_list)
+		srclist__dump(src_list);
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 4e881361e154..c996ab4fb108 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -6,6 +6,7 @@
 #include <linux/rbtree.h>
 #include "map_symbol.h"
 #include "branch.h"
+#include "srclist.h"
 
 struct addr_location;
 struct evsel;
@@ -131,6 +132,7 @@ struct callchain_list {
 	u64			iter_cycles;
 	struct branch_type_stat brtype_stat;
 	const char		*srcline;
+	bool			src_changed;
 	struct list_head	list;
 };
 
@@ -162,6 +164,8 @@ struct callchain_cursor {
 
 struct stream_node {
 	struct callchain_node	*cnode;
+	struct callchain_node	*pair_cnode;
+	bool			src_changed;
 };
 
 struct callchain_streams {
@@ -309,4 +313,8 @@ struct callchain_streams *callchain_evsel_streams_get(struct callchain_streams *
 						      int nr_streams_max,
 						      int evsel_idx);
 
+void callchain_match_streams(struct callchain_streams *cs_base,
+			     struct callchain_streams *cs_pair,
+			     struct srclist *src_list);
+
 #endif	/* __PERF_CALLCHAIN_H */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 05/14] perf util: Calculate the sum of all streams hits
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (3 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 04/14] perf util: Compare two streams Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10 15:14   ` Arnaldo Carvalho de Melo
  2020-03-10  7:02 ` [PATCH v1 06/14] perf util: Report hot streams Jin Yao
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We have used callchain_node->hit to measure the hot level of one
stream. This patch calculates the sum of hits of all streams.

Then in next patch, we can use following formula to report hot
percent for one stream.

hot percent = callchain_node->hit / sum of all hits

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 35 +++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |  1 +
 2 files changed, 36 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index a9dd91268b00..040995405664 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1685,6 +1685,39 @@ static void update_hot_streams(struct hist_entry *he,
 	}
 }
 
+static u64 count_callchain_hits(struct hist_entry *he)
+{
+	struct rb_root *root = &he->sorted_chain;
+	struct rb_node *rb_node = rb_first(root);
+	struct callchain_node *node;
+	u64 chain_hits = 0;
+
+	while (rb_node) {
+		node = rb_entry(rb_node, struct callchain_node, rb_node);
+		chain_hits += node->hit;
+		rb_node = rb_next(rb_node);
+	}
+
+	return chain_hits;
+}
+
+static u64 total_callchain_hits(struct hists *hists)
+{
+	struct rb_node *next;
+	u64 chain_hits = 0;
+
+	next = rb_first_cached(&hists->entries);
+	while (next) {
+		struct hist_entry *he;
+
+		he = rb_entry(next, struct hist_entry, rb_node);
+		chain_hits += count_callchain_hits(he);
+		next = rb_next(&he->rb_node);
+	}
+
+	return chain_hits;
+}
+
 static void get_hot_streams(struct hists *hists,
 			    struct callchain_streams *s)
 {
@@ -1698,6 +1731,8 @@ static void get_hot_streams(struct hists *hists,
 		update_hot_streams(he, s);
 		next = rb_next(&he->rb_node);
 	}
+
+	s->chain_hits = total_callchain_hits(hists);
 }
 
 struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index c996ab4fb108..3c0e0b45656b 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -173,6 +173,7 @@ struct callchain_streams {
 	int			nr_streams_max;
 	int			nr_streams;
 	int			evsel_idx;
+	u64			chain_hits;
 };
 
 extern __thread struct callchain_cursor callchain_cursor;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 06/14] perf util: Report hot streams
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (4 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 05/14] perf util: Calculate the sum of all streams hits Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 07/14] perf diff: Support hot streams comparison Jin Yao
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We show the streams separately. They are divided into different sections.

1. "Matched hot chains between old perf data and new perf data"

2. "Hot chains in old perf data but source line changed in new perf data"

3. "Hot chains in old perf data only"

4. "Hot chains in new perf data only".

For each stream, we report the cycles and hot percent (hits%).

For example,

     cycles: 2, hits: 4.08%
 --------------------------
              main div.c:42
      compute_flag div.c:28

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/callchain.c | 155 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/callchain.h |   3 +
 2 files changed, 158 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 040995405664..e13eee1042eb 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1781,6 +1781,9 @@ static bool chain_srclist_match(struct srclist *src_list, const char *srcline_a,
 	struct src_node *node;
 	struct line_pair *lp;
 
+	if (!srcline_a || !srcline_b)
+		return false;
+
 	if (sscanf(srcline_a, "%[^:]:%d", file_a, &line_a) != 2)
 		return false;
 
@@ -1821,6 +1824,10 @@ static bool chain_match(struct callchain_list *base_chain,
 
 	*src_changed = false;
 
+	/*
+	 * Check sourceline first. If not matched,
+	 * fallback to symbol match and address match.
+	 */
 	if (src_list && chain_srclist_match(src_list, pair_chain->srcline,
 					    base_chain->srcline, &src_found,
 					    src_changed)) {
@@ -1943,3 +1950,151 @@ void callchain_match_streams(struct callchain_streams *cs_base,
 	if (src_list)
 		srclist__dump(src_list);
 }
+
+static s64 callchain_avg_cycles(struct callchain_node *cnode)
+{
+	struct callchain_list *chain;
+	s64 cycles = 0;
+
+	list_for_each_entry(chain, &cnode->val, list) {
+		if (chain->srcline && chain->branch_count)
+			cycles += chain->cycles_count / chain->branch_count;
+	}
+
+	return cycles;
+}
+
+static void print_callchain_pair(struct stream_node *base_node, int idx,
+				 struct callchain_streams *cs_base,
+				 struct callchain_streams *cs_pair)
+{
+	struct callchain_node *base_cnode = base_node->cnode;
+	struct callchain_node *pair_cnode = base_node->pair_cnode;
+	struct callchain_list *base_chain, *pair_chain;
+	char buf1[512], buf2[512], cbuf1[256], cbuf2[256];
+	char *s1, *s2;
+	double pct;
+
+	printf("\nhot chain pair %d:\n", idx);
+
+	pct = (double)base_cnode->hit / (double)cs_base->chain_hits;
+	scnprintf(buf1, sizeof(buf1), "cycles: %ld, hits: %.2f%%",
+		  callchain_avg_cycles(base_cnode), pct * 100.0);
+
+	pct = (double)pair_cnode->hit / (double)cs_pair->chain_hits;
+	scnprintf(buf2, sizeof(buf2), "cycles: %ld, hits: %.2f%%",
+		  callchain_avg_cycles(pair_cnode), pct * 100.0);
+
+	printf("%35s\t%35s\n", buf1, buf2);
+
+	printf("%35s\t%35s\n",
+	       "---------------------------",
+	       "--------------------------");
+
+	pair_chain = list_first_entry(&pair_cnode->val,
+				      struct callchain_list,
+				      list);
+
+	list_for_each_entry(base_chain, &base_cnode->val, list) {
+		if (&pair_chain->list == &pair_cnode->val)
+			return;
+
+		s1 = callchain_list__sym_name(base_chain, cbuf1, sizeof(cbuf1),
+					      false);
+		s2 = callchain_list__sym_name(pair_chain, cbuf2, sizeof(cbuf2),
+					      false);
+
+		if (!pair_chain->src_changed)
+			scnprintf(buf1, sizeof(buf1), "%35s\t%35s", s1, s2);
+		else
+			scnprintf(buf1, sizeof(buf1), "%35s\t%35s*", s1, s2);
+
+		printf("%s\n", buf1);
+		pair_chain = list_next_entry(pair_chain, list);
+	}
+}
+
+static void print_stream_callchain(struct stream_node *node, int idx,
+				   struct callchain_streams *cs, bool pair)
+{
+	struct callchain_node *cnode = node->cnode;
+	struct callchain_list *chain;
+	char buf[512], cbuf[256], *s;
+	double pct;
+
+	printf("\nhot chain %d:\n", idx);
+
+	pct = (double)cnode->hit / (double)cs->chain_hits;
+	scnprintf(buf, sizeof(buf), "cycles: %ld, hits: %.2f%%",
+		  callchain_avg_cycles(cnode), pct * 100.0);
+
+	if (pair) {
+		printf("%35s\t%35s\n", "", buf);
+		printf("%35s\t%35s\n",
+		       "", "--------------------------");
+	} else {
+		printf("%35s\n", buf);
+		printf("%35s\n", "--------------------------");
+	}
+
+	list_for_each_entry(chain, &cnode->val, list) {
+		s = callchain_list__sym_name(chain, cbuf, sizeof(cbuf), false);
+
+		if (pair)
+			scnprintf(buf, sizeof(buf), "%35s\t%35s", "", s);
+		else
+			scnprintf(buf, sizeof(buf), "%35s", s);
+
+		printf("%s\n", buf);
+	}
+}
+
+void callchain_stream_report(struct callchain_streams *cs_base,
+			     struct callchain_streams *cs_pair)
+{
+	struct stream_node *base_node;
+	int i, idx = 0;
+
+	printf("[ Matched hot chains between old perf data and new perf data ]\n");
+	for (i = 0; i < cs_base->nr_streams; i++) {
+		base_node = &cs_base->streams[i];
+		if (base_node->pair_cnode) {
+			if (!base_node->src_changed) {
+				print_callchain_pair(base_node, ++idx,
+						     cs_base, cs_pair);
+			}
+		}
+	}
+
+	idx = 0;
+	printf("\n[ Hot chains in old perf data but source line changed (*) in new perf data ]\n");
+	for (i = 0; i < cs_base->nr_streams; i++) {
+		base_node = &cs_base->streams[i];
+		if (base_node->pair_cnode) {
+			if (base_node->src_changed) {
+				print_callchain_pair(base_node, ++idx,
+						     cs_base, cs_pair);
+			}
+		}
+	}
+
+	idx = 0;
+	printf("\n[ Hot chains in old perf data only ]\n");
+	for (i = 0; i < cs_base->nr_streams; i++) {
+		base_node = &cs_base->streams[i];
+		if (!base_node->pair_cnode) {
+			print_stream_callchain(base_node, ++idx,
+					       cs_base, false);
+		}
+	}
+
+	idx = 0;
+	printf("\n[ Hot chains in new perf data only ]\n");
+	for (i = 0; i < cs_pair->nr_streams; i++) {
+		base_node = &cs_pair->streams[i];
+		if (!base_node->pair_cnode) {
+			print_stream_callchain(base_node, ++idx,
+					       cs_pair, true);
+		}
+	}
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 3c0e0b45656b..90826a280476 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -318,4 +318,7 @@ void callchain_match_streams(struct callchain_streams *cs_base,
 			     struct callchain_streams *cs_pair,
 			     struct srclist *src_list);
 
+void callchain_stream_report(struct callchain_streams *cs_base,
+			     struct callchain_streams *cs_pair);
+
 #endif	/* __PERF_CALLCHAIN_H */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 07/14] perf diff: Support hot streams comparison
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (5 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 06/14] perf util: Report hot streams Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison Jin Yao
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

This patch enables perf-diff with "--stream", "--before" and "--after"
options.

"--stream": Enable hot streams comparison
"--before": Source code directory corresponding to perf.data.old
"--after" : Source code directory corresponding to perf.data

Let's see two examples

perf record -b ...      Generate perf.data.old with branch data
perf record -b ...      Generate perf.data with branch data
perf diff --stream

[ Matched hot chains between old perf data and new perf data ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39

hot chain pair 3:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 4:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

If we enable the source line comparison, the output might be different.

perf diff --stream --before ./before --after ./after

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 2:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39*
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39*

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40
                                                              main div.c:39

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-diff.txt |  14 ++
 tools/perf/builtin-diff.c              | 170 +++++++++++++++++++++++--
 2 files changed, 171 insertions(+), 13 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index f50ca0fef0a4..296fea98ac07 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -182,6 +182,20 @@ OPTIONS
 --tid=::
 	Only diff samples for given thread ID (comma separated list).
 
+--stream::
+	Enable hot streams comparison. Stream is a callchain which is
+	aggregated by the branch records from samples. The option can be
+	used with --before and --after if source line comparison is
+	required.
+
+--before::
+	Source code directory corresponding to perf.data.old. Should be
+	used with --stream and --after.
+
+--after::
+	Source code directory corresponding to perf.data. Should be
+	used with --stream and --before.
+
 COMPARISON
 ----------
 The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 5e697cd2224a..9c801b9bc5bb 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -25,6 +25,8 @@
 #include "util/map.h"
 #include "util/spark.h"
 #include "util/block-info.h"
+#include "util/srclist.h"
+#include "util/callchain.h"
 #include <linux/err.h>
 #include <linux/zalloc.h>
 #include <subcmd/pager.h>
@@ -38,10 +40,15 @@
 struct perf_diff {
 	struct perf_tool		 tool;
 	const char			*time_str;
+	const char			*before_dir;
+	const char			*after_dir;
 	struct perf_time_interval	*ptime_range;
 	int				 range_size;
 	int				 range_num;
 	bool				 has_br_stack;
+	bool				 src_cmp;
+	bool				 stream;
+	struct srclist			*src_list;
 };
 
 /* Diff command specific HPP columns. */
@@ -72,6 +79,8 @@ struct data__file {
 	struct perf_data	 data;
 	int			 idx;
 	struct hists		*hists;
+	struct callchain_streams *evsel_streams;
+	int			 nr_evsel_streams;
 	struct diff_hpp_fmt	 fmt[PERF_HPP_DIFF__MAX_INDEX];
 };
 
@@ -106,6 +115,7 @@ enum {
 	COMPUTE_DELTA_ABS,
 	COMPUTE_CYCLES,
 	COMPUTE_MAX,
+	COMPUTE_STREAM,	/* After COMPUTE_MAX to avoid use current compute arrays */
 };
 
 const char *compute_names[COMPUTE_MAX] = {
@@ -393,6 +403,11 @@ static int diff__process_sample_event(struct perf_tool *tool,
 	struct perf_diff *pdiff = container_of(tool, struct perf_diff, tool);
 	struct addr_location al;
 	struct hists *hists = evsel__hists(evsel);
+	struct hist_entry_iter iter = {
+		.evsel	= evsel,
+		.sample	= sample,
+		.ops	= &hist_iter_normal,
+	};
 	int ret = -1;
 
 	if (perf_time__ranges_skip_sample(pdiff->ptime_range, pdiff->range_num,
@@ -411,14 +426,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
 		goto out_put;
 	}
 
-	if (compute != COMPUTE_CYCLES) {
-		if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
-				      true)) {
-			pr_warning("problem incrementing symbol period, "
-				   "skipping event\n");
-			goto out_put;
-		}
-	} else {
+	switch (compute) {
+	case COMPUTE_CYCLES:
 		if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL,
 					  NULL, NULL, sample, true)) {
 			pr_warning("problem incrementing symbol period, "
@@ -428,6 +437,23 @@ static int diff__process_sample_event(struct perf_tool *tool,
 
 		hist__account_cycles(sample->branch_stack, &al, sample, false,
 				     NULL);
+		break;
+
+	case COMPUTE_STREAM:
+		if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
+					 NULL)) {
+			pr_debug("problem adding hist entry, skipping event\n");
+			goto out_put;
+		}
+		break;
+
+	default:
+		if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
+				      true)) {
+			pr_warning("problem incrementing symbol period, "
+				   "skipping event\n");
+			goto out_put;
+		}
 	}
 
 	/*
@@ -995,6 +1021,53 @@ static void data_process(void)
 	}
 }
 
+static int process_base_stream(struct data__file *data_base,
+			       struct data__file *data_pair,
+			       const char *title __maybe_unused,
+			       const char *base_dir __maybe_unused,
+			       const char *pair_dir __maybe_unused)
+{
+	struct evlist *evlist_base = data_base->session->evlist;
+	struct evlist *evlist_pair = data_pair->session->evlist;
+	struct evsel *evsel_base, *evsel_pair;
+	struct callchain_streams *es_base, *es_pair;
+
+	evlist__for_each_entry(evlist_base, evsel_base) {
+		evsel_pair = evsel_match(evsel_base, evlist_pair);
+		if (!evsel_pair)
+			continue;
+
+		es_base = callchain_evsel_streams_get(data_base->evsel_streams,
+						      data_base->nr_evsel_streams,
+						      evsel_base->idx);
+		if (!es_base)
+			return -1;
+
+		es_pair = callchain_evsel_streams_get(data_pair->evsel_streams,
+						      data_pair->nr_evsel_streams,
+						      evsel_pair->idx);
+		if (!es_pair)
+			return -1;
+
+		callchain_match_streams(es_base, es_pair, pdiff.src_list);
+		callchain_stream_report(es_base, es_pair);
+	}
+
+	return 0;
+}
+
+static void stream_process(void)
+{
+	/*
+	 * Stream comparison only supports two data files.
+	 * perf.data.old and perf.data. data__files[0] is perf.data.old,
+	 * data__files[1] is perf.data.
+	 */
+	process_base_stream(&data__files[0], &data__files[1],
+			    "# Output based on old perf data:\n#\n",
+			    pdiff.before_dir, pdiff.after_dir);
+}
+
 static void data__free(struct data__file *d)
 {
 	int col;
@@ -1108,6 +1181,19 @@ static int check_file_brstack(void)
 	return 0;
 }
 
+static struct callchain_streams *create_evsel_streams(struct evlist *evlist,
+						      int nr_streams_max,
+						      int *nr_evsel_streams)
+{
+	struct callchain_streams *evsel_streams;
+
+	evsel_streams = callchain_evsel_streams_create(evlist,
+						       nr_streams_max,
+						       nr_evsel_streams);
+
+	return evsel_streams;
+}
+
 static int __cmd_diff(void)
 {
 	struct data__file *d;
@@ -1121,6 +1207,14 @@ static int __cmd_diff(void)
 	abstime_tmp = abstime_ostr;
 	ret = -EINVAL;
 
+	if (pdiff.src_cmp) {
+		pdiff.src_list = srclist__new(pdiff.before_dir,
+					      pdiff.after_dir);
+
+		if (!pdiff.src_list)
+			goto out_free;
+	}
+
 	data__for_each_file(i, d) {
 		d->session = perf_session__new(&d->data, false, &pdiff.tool);
 		if (IS_ERR(d->session)) {
@@ -1152,9 +1246,21 @@ static int __cmd_diff(void)
 
 		if (pdiff.ptime_range)
 			zfree(&pdiff.ptime_range);
+
+		if (compute == COMPUTE_STREAM) {
+			d->evsel_streams = create_evsel_streams(
+						d->session->evlist,
+						5,
+						&d->nr_evsel_streams);
+			if (!d->evsel_streams)
+				goto out_delete;
+		}
 	}
 
-	data_process();
+	if (compute == COMPUTE_STREAM)
+		stream_process();
+	else
+		data_process();
 
  out_delete:
 	data__for_each_file(i, d) {
@@ -1162,6 +1268,7 @@ static int __cmd_diff(void)
 		data__free(d);
 	}
 
+ out_free:
 	free(data__files);
 
 	if (pdiff.ptime_range)
@@ -1170,6 +1277,9 @@ static int __cmd_diff(void)
 	if (abstime_ostr)
 		free(abstime_ostr);
 
+	if (pdiff.src_list)
+		srclist__delete(pdiff.src_list);
+
 	return ret;
 }
 
@@ -1227,6 +1337,15 @@ static const struct option options[] = {
 		   "only consider symbols in these pids"),
 	OPT_STRING(0, "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
 		   "only consider symbols in these tids"),
+	OPT_BOOLEAN(0, "stream", &pdiff.stream,
+		    "Enable hot streams comparison. "
+		    "Can be used with --before and --after"),
+	OPT_STRING(0, "before", &pdiff.before_dir, "dir",
+		   "Source code directory corresponding to perf.data.old. "
+		   "WARNING: use with --after and --stream"),
+	OPT_STRING(0, "after", &pdiff.after_dir, "dir",
+		   "Source code directory corresponding to perf.data. "
+		   "WARNING: use with --before and --stream"),
 	OPT_END()
 };
 
@@ -1886,6 +2005,19 @@ int cmd_diff(int argc, const char **argv)
 	if (cycles_hist && (compute != COMPUTE_CYCLES))
 		usage_with_options(diff_usage, options);
 
+	if (pdiff.stream) {
+		compute = COMPUTE_STREAM;
+		if ((pdiff.before_dir && !pdiff.after_dir) ||
+		    (!pdiff.before_dir && pdiff.after_dir)) {
+			usage_with_options(diff_usage, options);
+		}
+
+		if (pdiff.before_dir && pdiff.after_dir)
+			pdiff.src_cmp = true;
+	} else if (pdiff.before_dir || pdiff.after_dir) {
+		usage_with_options(diff_usage, options);
+	}
+
 	symbol__annotation_init();
 
 	if (symbol__init(NULL) < 0)
@@ -1897,13 +2029,25 @@ int cmd_diff(int argc, const char **argv)
 	if (check_file_brstack() < 0)
 		return -1;
 
-	if (compute == COMPUTE_CYCLES && !pdiff.has_br_stack)
+	if ((compute == COMPUTE_CYCLES || compute == COMPUTE_STREAM)
+	    && !pdiff.has_br_stack) {
 		return -1;
+	}
 
-	if (ui_init() < 0)
-		return -1;
+	if (compute == COMPUTE_STREAM) {
+		symbol_conf.show_branchflag_count = true;
+		callchain_param.mode = CHAIN_FLAT;
+		callchain_param.key = CCKEY_SRCLINE;
+		callchain_param.branch_callstack = 1;
+		symbol_conf.use_callchain = true;
+		callchain_register_param(&callchain_param);
+		sort_order = "srcline,symbol,dso";
+	} else {
+		if (ui_init() < 0)
+			return -1;
 
-	sort__mode = SORT_MODE__DIFF;
+		sort__mode = SORT_MODE__DIFF;
+	}
 
 	if (setup_sorting(NULL) < 0)
 		usage_with_options(diff_usage, options);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (6 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 07/14] perf diff: Support hot streams comparison Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10 15:17   ` Arnaldo Carvalho de Melo
  2020-03-10  7:02 ` [PATCH v1 09/14] perf util: Add new block info fmts for showing " Jin Yao
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It's also useful to figure out the top N hottest blocks from old perf
data file and figure out the top N hottest blocks from new perf data file,
and then compare them for the cycles diff. It can let us easily know
how many cycles are moved from one block to another block.

This patch adds new helper functions and data structures for the block
comparison.

And it also updates the existing perf-diff to be compatible with
the new interface.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c    |  45 +-----
 tools/perf/util/block-info.c | 305 ++++++++++++++++++++++++++++++++++-
 tools/perf/util/block-info.h |  32 +++-
 tools/perf/util/srclist.h    |   9 ++
 4 files changed, 345 insertions(+), 46 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 9c801b9bc5bb..dcbc9bba4e61 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -598,27 +598,6 @@ static void init_block_hist(struct block_hist *bh)
 	bh->valid = true;
 }
 
-static struct hist_entry *get_block_pair(struct hist_entry *he,
-					 struct hists *hists_pair)
-{
-	struct rb_root_cached *root = hists_pair->entries_in;
-	struct rb_node *next = rb_first_cached(root);
-	int64_t cmp;
-
-	while (next != NULL) {
-		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
-						      rb_node_in);
-
-		next = rb_next(&he_pair->rb_node_in);
-
-		cmp = __block_info__cmp(he_pair, he);
-		if (!cmp)
-			return he_pair;
-	}
-
-	return NULL;
-}
-
 static void init_spark_values(unsigned long *svals, int num)
 {
 	for (int i = 0; i < num; i++)
@@ -665,26 +644,6 @@ static void compute_cycles_diff(struct hist_entry *he,
 	}
 }
 
-static void block_hists_match(struct hists *hists_base,
-			      struct hists *hists_pair)
-{
-	struct rb_root_cached *root = hists_base->entries_in;
-	struct rb_node *next = rb_first_cached(root);
-
-	while (next != NULL) {
-		struct hist_entry *he = rb_entry(next, struct hist_entry,
-						 rb_node_in);
-		struct hist_entry *pair = get_block_pair(he, hists_pair);
-
-		next = rb_next(&he->rb_node_in);
-
-		if (pair) {
-			hist_entry__add_pair(pair, he);
-			compute_cycles_diff(he, pair);
-		}
-	}
-}
-
 static void hists__precompute(struct hists *hists)
 {
 	struct rb_root_cached *root;
@@ -737,7 +696,9 @@ static void hists__precompute(struct hists *hists)
 
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
-							  &pair_bh->block_hists);
+							  &pair_bh->block_hists,
+							  NULL,
+							  compute_cycles_diff);
 					hists__output_resort(&pair_bh->block_hists,
 							     NULL);
 				}
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 423ec69bda6c..247c87b8df56 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -12,6 +12,7 @@
 #include "evlist.h"
 #include "hist.h"
 #include "ui/browsers/hists.h"
+#include "debug.h"
 
 static struct block_header_column {
 	const char *name;
@@ -50,10 +51,24 @@ struct block_info *block_info__get(struct block_info *bi)
 	return bi;
 }
 
+static void free_block_line(struct block_line **bl)
+{
+	if ((*bl)->start_file)
+		free((*bl)->start_file);
+
+	if ((*bl)->end_file)
+		free((*bl)->end_file);
+
+	zfree(bl);
+}
+
 void block_info__put(struct block_info *bi)
 {
-	if (bi && refcount_dec_and_test(&bi->refcnt))
+	if (bi && refcount_dec_and_test(&bi->refcnt)) {
+		if (bi->line)
+			free_block_line(&bi->line);
 		free(bi);
+	}
 }
 
 struct block_info *block_info__new(void)
@@ -65,7 +80,8 @@ struct block_info *block_info__new(void)
 	return bi;
 }
 
-int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right)
+int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
+			  struct srclist *src_list __maybe_unused)
 {
 	struct block_info *bi_l = left->block_info;
 	struct block_info *bi_r = right->block_info;
@@ -93,7 +109,7 @@ int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right)
 int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 			struct hist_entry *left, struct hist_entry *right)
 {
-	return __block_info__cmp(left, right);
+	return __block_info__cmp(left, right, NULL);
 }
 
 static void init_block_info(struct block_info *bi, struct symbol *sym,
@@ -446,8 +462,10 @@ struct block_report *block_info__create_report(struct evlist *evlist,
 	evlist__for_each_entry(evlist, pos) {
 		struct hists *hists = evsel__hists(pos);
 
+		hists__output_resort(hists, NULL);
 		process_block_report(hists, &block_reports[i], total_cycles,
 				     block_hpps, nr_hpps);
+		block_reports[i].evsel_idx = pos->idx;
 		i++;
 	}
 
@@ -496,3 +514,284 @@ float block_info__total_cycles_percent(struct hist_entry *he)
 
 	return 0.0;
 }
+
+struct block_report *block_info__get_report(struct block_report *reps,
+					    int nr_reps, int evsel_idx)
+{
+	for (int i = 0; i < nr_reps; i++) {
+		if (reps[i].evsel_idx == evsel_idx)
+			return &reps[i];
+	}
+
+	return NULL;
+}
+
+static char *move_to_relpath(char *path, const char *dir)
+{
+	const char *d = dir, *end = dir + strlen(dir);
+	char *s;
+
+	/* skip '.' and '/' */
+	while ((d != end) && ((*d == '.') || (*d == '/')))
+		d++;
+
+	if (d == end)
+		return NULL;
+
+	s = strstr(path, d);
+	if (!s)
+		return NULL;
+
+	s += strlen(d);
+
+	while (*s == '/' && *s != 0)
+		s++;
+
+	if (*s == 0)
+		return NULL;
+
+	return s;
+}
+
+static int block_addr2line(struct hist_entry *he, const char *dir)
+{
+	struct block_info *bi = he->block_info;
+	struct block_line *bl;
+
+	if (!he->ms.map || !he->ms.map->dso)
+		return -1;
+
+	bl = zalloc(sizeof(*bl));
+	if (!bl)
+		return -1;
+
+	symbol_conf.disable_add2line_warn = true;
+	bl->start_file = get_srcline_split(he->ms.map->dso,
+					   map__rip_2objdump(he->ms.map,
+							     bi->sym->start + bi->start),
+					   &bl->start_nr);
+	if (!bl->start_file)
+		goto err;
+
+	if (dir)
+		bl->start_rel = move_to_relpath(bl->start_file, dir);
+
+	bl->end_file = get_srcline_split(he->ms.map->dso,
+					 map__rip_2objdump(he->ms.map,
+							   bi->sym->start + bi->end),
+					 &bl->end_nr);
+	if (!bl->end_file)
+		goto err;
+
+	if (dir)
+		bl->end_rel = move_to_relpath(bl->end_file, dir);
+
+	bi->line = bl;
+	return 0;
+
+err:
+	free_block_line(&bl);
+	return -1;
+}
+
+int block_hists_addr2line(struct hists *hists, const char *dir)
+{
+	struct rb_root_cached *root = hists->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+
+	while (next != NULL) {
+		struct hist_entry *he = rb_entry(next, struct hist_entry,
+						 rb_node_in);
+
+		if (!he)
+			break;
+
+		block_addr2line(he, dir);
+		next = rb_next(&he->rb_node_in);
+	}
+
+	return 0;
+}
+
+bool block_same_srcfiles(struct block_line *bl_a, struct block_line *bl_b)
+{
+	if (!bl_a->start_file || !bl_a->end_file ||
+	    !bl_b->start_file || !bl_b->end_file) {
+		return false;
+	}
+
+	if (!strcmp(bl_a->start_file, bl_b->start_file) &&
+	    !strcmp(bl_a->end_file, bl_b->end_file) &&
+	    !strcmp(bl_a->start_file, bl_a->end_file)) {
+		return true;
+	}
+
+	if (!bl_a->start_rel || !bl_a->end_rel ||
+	    !bl_b->start_rel || !bl_b->end_rel) {
+		return false;
+	}
+
+	if (!strcmp(bl_a->start_rel, bl_b->start_rel) &&
+	    !strcmp(bl_a->end_rel, bl_b->end_rel) &&
+	    !strcmp(bl_a->start_rel, bl_a->end_rel)) {
+		return true;
+	}
+
+	return false;
+}
+
+void block_line_dump(struct block_info *bi, const char *str)
+{
+	if (bi && bi->line) {
+		pr_debug("%s: %s:%d -> %s:%d\n",
+			 str,
+			 bi->line->start_file,
+			 bi->line->start_nr,
+			 bi->line->end_file,
+			 bi->line->end_nr);
+	}
+}
+
+static bool block_line_matched(struct src_node *node,
+			       int start_nr_a, int end_nr_a,
+			       int start_nr_b, int end_nr_b,
+			       bool *changed)
+{
+	int i, j, start_a, end_a, start_b, changed_nr = 0;
+	struct line_pair *lp;
+
+	*changed = false;
+
+	if (abs(end_nr_a - start_nr_a) !=
+	    abs(end_nr_b - start_nr_b)) {
+		return false;
+	}
+
+	i = start_a = (start_nr_a < end_nr_a) ? start_nr_a : end_nr_a;
+	end_a = (end_nr_a > start_nr_a) ? end_nr_a : start_nr_a;
+
+	j = start_b = (start_nr_b < end_nr_b) ? start_nr_b : end_nr_b;
+
+	while (i <= end_a) {
+		lp = srclist__line_pair(node, i);
+		if (!lp)
+			return false;
+
+		if (lp->b_nr != j)
+			changed_nr++;
+
+		i++; j++;
+	}
+
+	if ((i == end_a + 1) && (changed_nr == 0))
+		return true;
+
+	/*
+	 * At least one line is unchanged in this block,
+	 * we think this block is changed (not a new block).
+	 */
+	if (changed_nr < end_a - start_a + 1)
+		*changed = true;
+
+	return false;
+}
+
+bool block_srclist_matched(struct srclist *slist, char *rel_path,
+			   int start_nr_a, int end_nr_a,
+			   int start_nr_b, int end_nr_b,
+			   bool *changed)
+{
+	struct src_node *node;
+	bool ret;
+
+	*changed = false;
+
+	node = srclist__find(slist, rel_path, true);
+	if (!node)
+		return false;
+
+	ret = block_line_matched(node, start_nr_a, end_nr_a,
+				 start_nr_b, end_nr_b, changed);
+
+	if (ret) {
+		pr_debug("block MATCHED (a vs. b)\t\t%s: (%d-%d) vs. (%d-%d)\n",
+			 node->info.rel_path,
+			 start_nr_a, end_nr_a,
+			 start_nr_b, end_nr_b);
+	} else if (*changed) {
+		pr_debug("block CHANGED (a vs. b)\t\t%s: (%d-%d) vs. (%d-%d)\n",
+			 node->info.rel_path,
+			 start_nr_a, end_nr_a,
+			 start_nr_b, end_nr_b);
+	} else {
+		pr_debug("block UNMATCHED (a vs. b)\t%s: (%d-%d) vs. (%d-%d)\n",
+			 node->info.rel_path,
+			 start_nr_a, end_nr_a,
+			 start_nr_b, end_nr_b);
+	}
+
+	return ret;
+}
+
+static struct hist_entry *get_block_pair(struct hist_entry *he,
+					 struct hists *hists_pair,
+					 struct srclist *src_list)
+{
+	struct rb_root_cached *root = hists_pair->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+	int64_t cmp;
+
+	while (next != NULL) {
+		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
+						      rb_node_in);
+
+		next = rb_next(&he_pair->rb_node_in);
+
+		cmp = __block_info__cmp(he_pair, he, src_list);
+		if (!cmp)
+			return he_pair;
+	}
+
+	return NULL;
+}
+
+void block_hists_match(struct hists *hists_base,
+		       struct hists *hists_pair,
+		       struct srclist *src_list,
+		       void (*func)(struct hist_entry *,
+				    struct hist_entry *))
+{
+	struct rb_root_cached *root = hists_base->entries_in;
+	struct rb_node *next = rb_first_cached(root);
+
+	while (next != NULL) {
+		struct hist_entry *he = rb_entry(next, struct hist_entry,
+						 rb_node_in);
+		struct hist_entry *pair = get_block_pair(he, hists_pair,
+							 src_list);
+
+		next = rb_next(&he->rb_node_in);
+
+		if (pair) {
+			hist_entry__add_pair(pair, he);
+
+			if (func)
+				(*func)(he, pair);
+		}
+	}
+}
+
+int block_info__match_report(struct block_report *rep_base,
+			     struct block_report *rep_pair,
+			     struct srclist *src_list,
+			     void (*func)(struct hist_entry *,
+					  struct hist_entry *))
+{
+	struct block_hist *bh_base = &rep_base->hist;
+	struct block_hist *bh_pair = &rep_pair->hist;
+
+	block_hists_match(&bh_base->block_hists, &bh_pair->block_hists,
+			  src_list, func);
+
+	return 0;
+}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index 42e9dcc4cf0a..458bd998089d 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -20,6 +20,8 @@ struct block_info {
 	int			num;
 	int			num_aggr;
 	refcount_t		refcnt;
+	struct block_line	*line;
+	bool			srcline_matched;
 };
 
 struct block_fmt {
@@ -46,6 +48,7 @@ struct block_report {
 	u64			cycles;
 	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
 	int			nr_fmts;
+	int			evsel_idx;
 };
 
 struct block_hist;
@@ -62,7 +65,8 @@ static inline void __block_info__zput(struct block_info **bi)
 
 #define block_info__zput(bi) __block_info__zput(&bi)
 
-int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right);
+int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
+			  struct srclist *src_list __maybe_unused);
 
 int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 			struct hist_entry *left, struct hist_entry *right);
@@ -83,4 +87,30 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
 
 float block_info__total_cycles_percent(struct hist_entry *he);
 
+struct block_report *block_info__get_report(struct block_report *reps,
+					    int nr_reps, int evsel_idx);
+
+int block_hists_addr2line(struct hists *hists, const char *dir);
+
+void block_line_dump(struct block_info *bi, const char *str);
+
+bool block_same_srcfiles(struct block_line *bl_a, struct block_line *bl_b);
+
+bool block_srclist_matched(struct srclist *slist, char *rel_path,
+			   int start_nr_a, int end_nr_a,
+			   int start_nr_b, int end_nr_b,
+			   bool *changed);
+
+void block_hists_match(struct hists *hists_base,
+		       struct hists *hists_pair,
+		       struct srclist *src_list,
+		       void (*func)(struct hist_entry *,
+				    struct hist_entry *));
+
+int block_info__match_report(struct block_report *rep_base,
+			     struct block_report *rep_pair,
+			     struct srclist *src_list,
+			     void (*func)(struct hist_entry *,
+					  struct hist_entry *));
+
 #endif /* __PERF_BLOCK_H */
diff --git a/tools/perf/util/srclist.h b/tools/perf/util/srclist.h
index f25b0de91a13..46866d5c51d7 100644
--- a/tools/perf/util/srclist.h
+++ b/tools/perf/util/srclist.h
@@ -33,6 +33,15 @@ struct srclist {
 	const char *after_dir;
 };
 
+struct block_line {
+	char *start_file;
+	char *end_file;
+	char *start_rel;
+	char *end_rel;
+	unsigned int start_nr;
+	unsigned int end_nr;
+};
+
 struct srclist *srclist__new(const char *before_dir, const char *after_dir);
 void srclist__delete(struct srclist *slist);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 09/14] perf util: Add new block info fmts for showing hot blocks comparison
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (7 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 10/14] perf util: Enable block source line comparison Jin Yao
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

We support three new columns in output

New Stream Diff(cycles%,cycles):
  The diff of 'Sampled Cycles%' and 'Avg Cycles' between new stream
  and old stream for the same block.

  If block is not matched, it reports "[block not in new stream]" or
  "[block changed in new stream]".

New Stream Sampled Cycles%:
  The 'Sampled Cycles%' of the block in new stream. It's only reported
  when the block is changed in new stream.

New Stream Avg Cycles:
  The 'Average Cycles' of the block in new stream. It's only reported
  when the block is changed in new stream.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/block-info.c | 117 ++++++++++++++++++++++++++++++++++-
 tools/perf/util/block-info.h |   4 ++
 2 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 247c87b8df56..2ad9e2dd7f77 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -41,7 +41,19 @@ static struct block_header_column {
 	[PERF_HPP_REPORT__BLOCK_DSO] = {
 		.name = "Shared Object",
 		.width = 20,
-	}
+	},
+	[PERF_HPP__BLOCK_NEW_STREAM_DIFF] = {
+		.name = "New Stream Diff(cycles%,cycles)",
+		.width = 31,
+	},
+	[PERF_HPP__BLOCK_NEW_STREAM_TOTAL_CYCLES_PCT] = {
+		.name = "New Stream Sampled Cycles%",
+		.width = 26,
+	},
+	[PERF_HPP__BLOCK_NEW_STREAM_AVG_CYCLES] = {
+		.name = "New Stream Avg Cycles",
+		.width = 21,
+	},
 };
 
 struct block_info *block_info__get(struct block_info *bi)
@@ -342,6 +354,100 @@ static int block_dso_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
 			 "[unknown]");
 }
 
+static int block_new_stream_diff_entry(struct perf_hpp_fmt *fmt,
+				       struct perf_hpp *hpp,
+				       struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct hist_entry *pair = hist_entry__next_pair(he);
+	struct block_info *bi_h, *bi_p;
+	double ratio_h = 0.0, ratio_p = 0.0;
+	s64 cycles_diff;
+	char buf[32];
+
+	if (!pair) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 "[block not in new stream]");
+	}
+
+	bi_h = he->block_info;
+	bi_p = pair->block_info;
+
+	if (bi_p->block_changed) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 "[block changed in new stream]");
+	}
+
+	if (bi_h->total_cycles)
+		ratio_h = (double)bi_h->cycles / (double)bi_h->total_cycles;
+
+	if (bi_p->total_cycles)
+		ratio_p = (double)bi_p->cycles / (double)bi_p->total_cycles;
+
+	cycles_diff = (s64)(bi_p->cycles_aggr / bi_p->num_aggr) -
+		(s64)(bi_h->cycles_aggr / bi_h->num_aggr);
+
+	snprintf(buf, sizeof(buf), "%4.2f%%, %4ld",
+		 100.0 * (ratio_p - ratio_h), cycles_diff);
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
+}
+
+static int block_new_stream_total_cycles_pct_entry(struct perf_hpp_fmt *fmt,
+						   struct perf_hpp *hpp,
+						   struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct hist_entry *pair = hist_entry__next_pair(he);
+	struct block_info *bi_p;
+	double ratio = 0.0;
+
+	if (!pair) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 "-");
+	}
+
+	bi_p = pair->block_info;
+
+	if (!bi_p->block_changed) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 "-");
+	}
+
+	if (bi_p->total_cycles)
+		ratio = (double)bi_p->cycles / (double)bi_p->total_cycles;
+
+	return color_pct(hpp, block_fmt->width, 100.0 * ratio);
+}
+
+static int block_new_stream_avg_cycles_entry(struct perf_hpp_fmt *fmt,
+					     struct perf_hpp *hpp,
+					     struct hist_entry *he)
+{
+	struct block_fmt *block_fmt = container_of(fmt, struct block_fmt, fmt);
+	struct hist_entry *pair = hist_entry__next_pair(he);
+	struct block_info *bi_p;
+	char cycles_buf[16];
+
+	if (!pair) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 "-");
+	}
+
+	bi_p = pair->block_info;
+
+	if (!bi_p->block_changed) {
+		return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+				 "-");
+	}
+
+	cycles_string(bi_p->cycles_aggr / bi_p->num_aggr, cycles_buf,
+		      sizeof(cycles_buf));
+
+	return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width,
+			 cycles_buf);
+}
+
 static void init_block_header(struct block_fmt *block_fmt)
 {
 	struct perf_hpp_fmt *fmt = &block_fmt->fmt;
@@ -385,6 +491,15 @@ static void hpp_register(struct block_fmt *block_fmt, int idx,
 	case PERF_HPP_REPORT__BLOCK_DSO:
 		fmt->entry = block_dso_entry;
 		break;
+	case PERF_HPP__BLOCK_NEW_STREAM_DIFF:
+		fmt->entry = block_new_stream_diff_entry;
+		break;
+	case PERF_HPP__BLOCK_NEW_STREAM_TOTAL_CYCLES_PCT:
+		fmt->entry = block_new_stream_total_cycles_pct_entry;
+		break;
+	case PERF_HPP__BLOCK_NEW_STREAM_AVG_CYCLES:
+		fmt->entry = block_new_stream_avg_cycles_entry;
+		break;
 	default:
 		return;
 	}
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index 458bd998089d..4e22bce81731 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -22,6 +22,7 @@ struct block_info {
 	refcount_t		refcnt;
 	struct block_line	*line;
 	bool			srcline_matched;
+	bool			block_changed;
 };
 
 struct block_fmt {
@@ -40,6 +41,9 @@ enum {
 	PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
 	PERF_HPP_REPORT__BLOCK_RANGE,
 	PERF_HPP_REPORT__BLOCK_DSO,
+	PERF_HPP__BLOCK_NEW_STREAM_DIFF,
+	PERF_HPP__BLOCK_NEW_STREAM_TOTAL_CYCLES_PCT,
+	PERF_HPP__BLOCK_NEW_STREAM_AVG_CYCLES,
 	PERF_HPP_REPORT__BLOCK_MAX_INDEX
 };
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 10/14] perf util: Enable block source line comparison
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (8 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 09/14] perf util: Add new block info fmts for showing " Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 11/14] perf diff: support hot blocks comparison Jin Yao
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Previously we only supported address comparison, but it was not good
for the case that address might be changed if source code was updated.

This patch enables for the source line comparison.

1. If all of the source lines in a block are not changed (or only
   moved some offsets as a whole), we think the block is not changed.

2. If we can aware any source line in this block is changed,
   we think this block is changed.

3. If 1 and 2 are both not matched, fallback to address comparison.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/block-info.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 2ad9e2dd7f77..3aa564151d84 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -93,11 +93,12 @@ struct block_info *block_info__new(void)
 }
 
 int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
-			  struct srclist *src_list __maybe_unused)
+			  struct srclist *src_list)
 {
 	struct block_info *bi_l = left->block_info;
 	struct block_info *bi_r = right->block_info;
 	int cmp;
+	bool changed;
 
 	if (!bi_l->sym || !bi_r->sym) {
 		if (!bi_l->sym && !bi_r->sym)
@@ -112,6 +113,27 @@ int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
 	if (cmp)
 		return cmp;
 
+	if (src_list && bi_l->line && bi_r->line) {
+		if (block_same_srcfiles(bi_l->line, bi_r->line) &&
+		    bi_l->line->start_rel) {
+
+			if (block_srclist_matched(src_list,
+						  bi_l->line->start_rel,
+						  bi_l->line->start_nr,
+						  bi_l->line->end_nr,
+						  bi_r->line->start_nr,
+						  bi_r->line->end_nr,
+						  &changed)) {
+				bi_l->srcline_matched = true;
+				return 0;
+			} else if (changed) {
+				bi_l->block_changed = true;
+				return 0;
+			} else
+				return -1;
+		}
+	}
+
 	if (bi_l->start != bi_r->start)
 		return (int64_t)(bi_r->start - bi_l->start);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 11/14] perf diff: support hot blocks comparison
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (9 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 10/14] perf util: Enable block source line comparison Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 12/14] perf util: Filter out streams by name of changed functions Jin Yao
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

This patch reports the results of hot blocks comparison.

For example, following outputs are added to the end of
hot streams reports.

  # Output based on old stream (perf.data.old):
  #
  # Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
  # ...............  ..........  ...............................  ..........................  .....................
  #
             24.76%          17                     -4.54%,   -7                           -                      -
             14.72%           7                      0.13%,   -1                           -                      -
              5.28%           2        [block not in new stream]                           -                      -
              5.09%           2                      0.52%,    0                           -                      -
              5.05%           2                     -0.15%,   -1                           -                      -
              4.18%           1                      0.12%,    0                           -                      -
              3.16%           1                      0.28%,    0                           -                      -

If we enable the source line comparison, the output might be different.

  # Output based on old stream (perf.data.old):
  #
  # Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
  # ...............  ..........  ...............................  ..........................  .....................
  #
             24.76%          17    [block changed in new stream]                      20.22%                     10
             14.72%           7                      0.13%,   -1                           -                      -
              5.28%           2        [block not in new stream]                           -                      -
              5.09%           2                      0.52%,    0                           -                      -
              5.05%           2                     -0.15%,   -1                           -                      -
              4.18%           1    [block changed in new stream]                       4.31%                      1
              3.16%           1                      0.28%,    0                           -                      -

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c | 85 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 84 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index dcbc9bba4e61..566e811054b1 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -49,6 +49,8 @@ struct perf_diff {
 	bool				 src_cmp;
 	bool				 stream;
 	struct srclist			*src_list;
+	u64				 total_cycles;
+	float				 min_percent;
 };
 
 /* Diff command specific HPP columns. */
@@ -76,8 +78,10 @@ struct diff_hpp_fmt {
 
 struct data__file {
 	struct perf_session	*session;
+	struct block_report	*block_reports;
 	struct perf_data	 data;
 	int			 idx;
+	int			 nr_block_reports;
 	struct hists		*hists;
 	struct callchain_streams *evsel_streams;
 	int			 nr_evsel_streams;
@@ -445,8 +449,10 @@ static int diff__process_sample_event(struct perf_tool *tool,
 			pr_debug("problem adding hist entry, skipping event\n");
 			goto out_put;
 		}
-		break;
 
+		hist__account_cycles(sample->branch_stack, &al, sample, false,
+				     &pdiff->total_cycles);
+		break;
 	default:
 		if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
 				      true)) {
@@ -992,6 +998,7 @@ static int process_base_stream(struct data__file *data_base,
 	struct evlist *evlist_pair = data_pair->session->evlist;
 	struct evsel *evsel_base, *evsel_pair;
 	struct callchain_streams *es_base, *es_pair;
+	struct block_report *rep_base, *rep_pair;
 
 	evlist__for_each_entry(evlist_base, evsel_base) {
 		evsel_pair = evsel_match(evsel_base, evlist_pair);
@@ -1014,6 +1021,37 @@ static int process_base_stream(struct data__file *data_base,
 		callchain_stream_report(es_base, es_pair);
 	}
 
+	evlist__for_each_entry(evlist_base, evsel_base) {
+		rep_base = block_info__get_report(data_base->block_reports,
+						  data_base->nr_block_reports,
+						  evsel_base->idx);
+
+		if (!rep_base)
+			return -1;
+
+		block_hists_addr2line(&rep_base->hist.block_hists, base_dir);
+
+		evlist_pair = data_pair->session->evlist;
+		evsel_pair = evsel_match(evsel_base, evlist_pair);
+		if (!evsel_pair)
+			continue;
+
+		rep_pair = block_info__get_report(data_pair->block_reports,
+						  data_pair->nr_block_reports,
+						  evsel_pair->idx);
+
+		block_hists_addr2line(&rep_pair->hist.block_hists, pair_dir);
+
+		block_info__match_report(rep_base, rep_pair,
+					 pdiff.src_list, NULL);
+
+		fprintf(stdout, "%s", title);
+
+		use_browser = 0;
+		report__browse_block_hists(&rep_base->hist, pdiff.min_percent,
+					   evsel_base, NULL, NULL);
+	}
+
 	return 0;
 }
 
@@ -1142,6 +1180,26 @@ static int check_file_brstack(void)
 	return 0;
 }
 
+static struct block_report *create_block_reports(struct evlist *evlist,
+						 u64 total_cycles,
+						 int *nr_block_reports)
+{
+	struct block_report *reps;
+	int block_hpps[7] = {
+		PERF_HPP_REPORT__BLOCK_TOTAL_CYCLES_PCT,
+		PERF_HPP_REPORT__BLOCK_AVG_CYCLES,
+		PERF_HPP__BLOCK_NEW_STREAM_DIFF,
+		PERF_HPP__BLOCK_NEW_STREAM_TOTAL_CYCLES_PCT,
+		PERF_HPP__BLOCK_NEW_STREAM_AVG_CYCLES,
+		PERF_HPP_REPORT__BLOCK_RANGE,
+		PERF_HPP_REPORT__BLOCK_DSO,
+	};
+
+	reps = block_info__create_report(evlist, total_cycles, block_hpps, 7,
+					 nr_block_reports);
+	return reps;
+}
+
 static struct callchain_streams *create_evsel_streams(struct evlist *evlist,
 						      int nr_streams_max,
 						      int *nr_evsel_streams)
@@ -1197,6 +1255,8 @@ static int __cmd_diff(void)
 				goto out_delete;
 		}
 
+		pdiff.total_cycles = 0;
+
 		ret = perf_session__process_events(d->session);
 		if (ret) {
 			pr_err("Failed to process %s\n", d->data.path);
@@ -1215,6 +1275,12 @@ static int __cmd_diff(void)
 						&d->nr_evsel_streams);
 			if (!d->evsel_streams)
 				goto out_delete;
+
+			d->block_reports = create_block_reports(d->session->evlist,
+								pdiff.total_cycles,
+								&d->nr_block_reports);
+			if (!d->block_reports)
+				goto out_delete;
 		}
 	}
 
@@ -1225,6 +1291,11 @@ static int __cmd_diff(void)
 
  out_delete:
 	data__for_each_file(i, d) {
+		if (d->block_reports) {
+			block_info__free_report(d->block_reports,
+						d->nr_block_reports);
+		}
+
 		perf_session__delete(d->session);
 		data__free(d);
 	}
@@ -1244,6 +1315,15 @@ static int __cmd_diff(void)
 	return ret;
 }
 
+static int parse_percent_limit(const struct option *opt, const char *arg,
+			       int unset __maybe_unused)
+{
+	struct perf_diff *d = opt->value;
+
+	d->min_percent = strtof(arg, NULL);
+	return 0;
+}
+
 static const char * const diff_usage[] = {
 	"perf diff [<options>] [old_file] [new_file]",
 	NULL,
@@ -1307,6 +1387,9 @@ static const struct option options[] = {
 	OPT_STRING(0, "after", &pdiff.after_dir, "dir",
 		   "Source code directory corresponding to perf.data. "
 		   "WARNING: use with --before and --stream"),
+	OPT_CALLBACK(0, "percent-limit", &pdiff, "percent",
+		     "Don't show entries under that percent",
+		     parse_percent_limit),
 	OPT_END()
 };
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 12/14] perf util: Filter out streams by name of changed functions
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (10 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 11/14] perf diff: support hot blocks comparison Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 13/14] perf util: Filter out blocks " Jin Yao
  2020-03-10  7:02 ` [PATCH v1 14/14] perf diff: Filter out streams by " Jin Yao
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Sometime some changes are not reflected in the source code,
e.g. changing the compiler option. So for this, we can't get
the changes by diffing the source code lines.

This idea is to let user provide a list of changed functions,
then perf-diff can know these functions are not matched between
old perf data and new perf data.

In this patch, the names of changed function names are stored
in strlist, and we will check if the symbol name is in strlist.
If yes, that means this function is changed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c   |  2 +-
 tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++----
 tools/perf/util/callchain.h |  4 +++-
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 566e811054b1..2f891e8a5122 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1017,7 +1017,7 @@ static int process_base_stream(struct data__file *data_base,
 		if (!es_pair)
 			return -1;
 
-		callchain_match_streams(es_base, es_pair, pdiff.src_list);
+		callchain_match_streams(es_base, es_pair, pdiff.src_list, NULL);
 		callchain_stream_report(es_base, es_pair);
 	}
 
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index e13eee1042eb..19019329ea32 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1814,16 +1814,35 @@ static bool chain_srclist_match(struct srclist *src_list, const char *srcline_a,
 	return false;
 }
 
+static bool sym_name_changed(struct callchain_list *base_chain,
+			     struct callchain_list *pair_chain,
+			     struct strlist *func_list)
+{
+	if (!func_list || !base_chain->ms.sym || !pair_chain->ms.sym)
+		return false;
+
+	if (strlist__has_entry(func_list, base_chain->ms.sym->name))
+		return true;
+
+	return false;
+}
+
 static bool chain_match(struct callchain_list *base_chain,
 			struct callchain_list *pair_chain,
 			struct srclist *src_list,
+			struct strlist *func_list,
 			bool *src_changed)
 {
 	enum match_result match;
 	bool src_found = false;
+	bool func_changed;
 
 	*src_changed = false;
 
+	func_changed = sym_name_changed(base_chain, pair_chain, func_list);
+	if (func_changed)
+		return false;
+
 	/*
 	 * Check sourceline first. If not matched,
 	 * fallback to symbol match and address match.
@@ -1854,6 +1873,7 @@ static bool chain_match(struct callchain_list *base_chain,
 static bool callchain_node_matched(struct callchain_node *base_cnode,
 				   struct callchain_node *pair_cnode,
 				   struct srclist *src_list,
+				   struct strlist *func_list,
 				   int *nr_changed)
 {
 	struct callchain_list *base_chain, *pair_chain;
@@ -1875,7 +1895,7 @@ static bool callchain_node_matched(struct callchain_node *base_cnode,
 		}
 
 		match = chain_match(base_chain, pair_chain, src_list,
-				    &src_changed);
+				    func_list, &src_changed);
 
 		if (src_changed) {
 			pair_chain->src_changed = true;
@@ -1901,6 +1921,7 @@ static bool callchain_node_matched(struct callchain_node *base_cnode,
 static struct stream_node *stream_node_match(struct stream_node *base_node,
 					     struct callchain_streams *cs_pair,
 					     struct srclist *src_list,
+					     struct strlist *func_list,
 					     bool *src_changed)
 {
 	*src_changed = false;
@@ -1910,7 +1931,7 @@ static struct stream_node *stream_node_match(struct stream_node *base_node,
 		int nr_changed = 0;
 
 		if (callchain_node_matched(base_node->cnode, pair_node->cnode,
-					   src_list, &nr_changed)) {
+					   src_list, func_list, &nr_changed)) {
 			if (nr_changed)
 				*src_changed = true;
 
@@ -1932,7 +1953,8 @@ static void stream_nodes_link(struct stream_node *base_node,
 
 void callchain_match_streams(struct callchain_streams *cs_base,
 			     struct callchain_streams *cs_pair,
-			     struct srclist *src_list)
+			     struct srclist *src_list,
+			     struct strlist *func_list)
 {
 	for (int i = 0; i < cs_base->nr_streams; i++) {
 		struct stream_node *base_node = &cs_base->streams[i];
@@ -1940,7 +1962,7 @@ void callchain_match_streams(struct callchain_streams *cs_base,
 		bool src_changed;
 
 		pair_node = stream_node_match(base_node, cs_pair, src_list,
-					      &src_changed);
+					      func_list, &src_changed);
 		if (pair_node)
 			stream_nodes_link(base_node, pair_node, src_changed);
 		else
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 90826a280476..a6153af9f776 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -7,6 +7,7 @@
 #include "map_symbol.h"
 #include "branch.h"
 #include "srclist.h"
+#include "strlist.h"
 
 struct addr_location;
 struct evsel;
@@ -316,7 +317,8 @@ struct callchain_streams *callchain_evsel_streams_get(struct callchain_streams *
 
 void callchain_match_streams(struct callchain_streams *cs_base,
 			     struct callchain_streams *cs_pair,
-			     struct srclist *src_list);
+			     struct srclist *src_list,
+			     struct strlist *func_list);
 
 void callchain_stream_report(struct callchain_streams *cs_base,
 			     struct callchain_streams *cs_pair);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 13/14] perf util: Filter out blocks by name of changed functions
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (11 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 12/14] perf util: Filter out streams by name of changed functions Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  2020-03-10  7:02 ` [PATCH v1 14/14] perf diff: Filter out streams by " Jin Yao
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Similar as previous patch "perf util: Filter out streams
by name of changed functions", user provides a list of changed
functions, then perf-diff can know these functions are not
matched between old perf data and new perf data.

In this patch, the names of changed function names have been
stored in strlist, and we will check if the symbol name is in
strlist. If yes, that means this function is changed and the
related blocks are not matched.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/builtin-diff.c    |  4 ++--
 tools/perf/util/block-info.c | 18 ++++++++++++------
 tools/perf/util/block-info.h |  4 +++-
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 2f891e8a5122..98e9ab8c69ce 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -703,7 +703,7 @@ static void hists__precompute(struct hists *hists)
 				if (bh->valid && pair_bh->valid) {
 					block_hists_match(&bh->block_hists,
 							  &pair_bh->block_hists,
-							  NULL,
+							  NULL, NULL,
 							  compute_cycles_diff);
 					hists__output_resort(&pair_bh->block_hists,
 							     NULL);
@@ -1043,7 +1043,7 @@ static int process_base_stream(struct data__file *data_base,
 		block_hists_addr2line(&rep_pair->hist.block_hists, pair_dir);
 
 		block_info__match_report(rep_base, rep_pair,
-					 pdiff.src_list, NULL);
+					 pdiff.src_list, NULL, NULL);
 
 		fprintf(stdout, "%s", title);
 
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 3aa564151d84..0e9aeb407455 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -93,7 +93,7 @@ struct block_info *block_info__new(void)
 }
 
 int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
-			  struct srclist *src_list)
+			  struct srclist *src_list, struct strlist *func_list)
 {
 	struct block_info *bi_l = left->block_info;
 	struct block_info *bi_r = right->block_info;
@@ -113,6 +113,9 @@ int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
 	if (cmp)
 		return cmp;
 
+	if (func_list && strlist__has_entry(func_list, bi_l->sym->name))
+		return -1;
+
 	if (src_list && bi_l->line && bi_r->line) {
 		if (block_same_srcfiles(bi_l->line, bi_r->line) &&
 		    bi_l->line->start_rel) {
@@ -143,7 +146,7 @@ int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
 int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 			struct hist_entry *left, struct hist_entry *right)
 {
-	return __block_info__cmp(left, right, NULL);
+	return __block_info__cmp(left, right, NULL, NULL);
 }
 
 static void init_block_info(struct block_info *bi, struct symbol *sym,
@@ -872,7 +875,8 @@ bool block_srclist_matched(struct srclist *slist, char *rel_path,
 
 static struct hist_entry *get_block_pair(struct hist_entry *he,
 					 struct hists *hists_pair,
-					 struct srclist *src_list)
+					 struct srclist *src_list,
+					 struct strlist *func_list)
 {
 	struct rb_root_cached *root = hists_pair->entries_in;
 	struct rb_node *next = rb_first_cached(root);
@@ -884,7 +888,7 @@ static struct hist_entry *get_block_pair(struct hist_entry *he,
 
 		next = rb_next(&he_pair->rb_node_in);
 
-		cmp = __block_info__cmp(he_pair, he, src_list);
+		cmp = __block_info__cmp(he_pair, he, src_list, func_list);
 		if (!cmp)
 			return he_pair;
 	}
@@ -895,6 +899,7 @@ static struct hist_entry *get_block_pair(struct hist_entry *he,
 void block_hists_match(struct hists *hists_base,
 		       struct hists *hists_pair,
 		       struct srclist *src_list,
+		       struct strlist *func_list,
 		       void (*func)(struct hist_entry *,
 				    struct hist_entry *))
 {
@@ -905,7 +910,7 @@ void block_hists_match(struct hists *hists_base,
 		struct hist_entry *he = rb_entry(next, struct hist_entry,
 						 rb_node_in);
 		struct hist_entry *pair = get_block_pair(he, hists_pair,
-							 src_list);
+							 src_list, func_list);
 
 		next = rb_next(&he->rb_node_in);
 
@@ -921,6 +926,7 @@ void block_hists_match(struct hists *hists_base,
 int block_info__match_report(struct block_report *rep_base,
 			     struct block_report *rep_pair,
 			     struct srclist *src_list,
+			     struct strlist *func_list,
 			     void (*func)(struct hist_entry *,
 					  struct hist_entry *))
 {
@@ -928,7 +934,7 @@ int block_info__match_report(struct block_report *rep_base,
 	struct block_hist *bh_pair = &rep_pair->hist;
 
 	block_hists_match(&bh_base->block_hists, &bh_pair->block_hists,
-			  src_list, func);
+			  src_list, func_list, func);
 
 	return 0;
 }
diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
index 4e22bce81731..2b440cef4ef7 100644
--- a/tools/perf/util/block-info.h
+++ b/tools/perf/util/block-info.h
@@ -70,7 +70,7 @@ static inline void __block_info__zput(struct block_info **bi)
 #define block_info__zput(bi) __block_info__zput(&bi)
 
 int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
-			  struct srclist *src_list __maybe_unused);
+			  struct srclist *src_list, struct strlist *func_list);
 
 int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
 			struct hist_entry *left, struct hist_entry *right);
@@ -108,12 +108,14 @@ bool block_srclist_matched(struct srclist *slist, char *rel_path,
 void block_hists_match(struct hists *hists_base,
 		       struct hists *hists_pair,
 		       struct srclist *src_list,
+		       struct strlist *func_list,
 		       void (*func)(struct hist_entry *,
 				    struct hist_entry *));
 
 int block_info__match_report(struct block_report *rep_base,
 			     struct block_report *rep_pair,
 			     struct srclist *src_list,
+			     struct strlist *func_list,
 			     void (*func)(struct hist_entry *,
 					  struct hist_entry *));
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v1 14/14] perf diff: Filter out streams by changed functions
  2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
                   ` (12 preceding siblings ...)
  2020-03-10  7:02 ` [PATCH v1 13/14] perf util: Filter out blocks " Jin Yao
@ 2020-03-10  7:02 ` Jin Yao
  13 siblings, 0 replies; 23+ messages in thread
From: Jin Yao @ 2020-03-10  7:02 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Sometime some changes are not reflected in the source code,
e.g. changing the compiler option. So for this, we can't get
the changes by diffing the source code lines.

This patch introduces a new perf-diff option "--changed-func".
It passes the names of changed functions then perf-diff can
know what functions are changed.

For example,
perf diff --stream --changed-func main --changed-func rand

It passes the function list {"main", "rand"} to perf-diff.
Now perf-diff knows the functions "main" and "rand" in new perf
data file are changed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-diff.txt |  5 +++++
 tools/perf/builtin-diff.c              | 30 ++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index 296fea98ac07..784598c12e26 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -196,6 +196,11 @@ OPTIONS
 	Source code directory corresponding to perf.data. Should be
 	used with --stream and --before.
 
+--changed-func=::
+	The given function is changed in new perf data file. This option
+	needs to be used with --stream option. Multiple functions can be given
+	by using this option more than once.
+
 COMPARISON
 ----------
 The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 98e9ab8c69ce..5e5f29105fe1 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -27,6 +27,7 @@
 #include "util/block-info.h"
 #include "util/srclist.h"
 #include "util/callchain.h"
+#include "util/strlist.h"
 #include <linux/err.h>
 #include <linux/zalloc.h>
 #include <subcmd/pager.h>
@@ -49,6 +50,7 @@ struct perf_diff {
 	bool				 src_cmp;
 	bool				 stream;
 	struct srclist			*src_list;
+	struct strlist			*func_list;
 	u64				 total_cycles;
 	float				 min_percent;
 };
@@ -1017,7 +1019,8 @@ static int process_base_stream(struct data__file *data_base,
 		if (!es_pair)
 			return -1;
 
-		callchain_match_streams(es_base, es_pair, pdiff.src_list, NULL);
+		callchain_match_streams(es_base, es_pair, pdiff.src_list,
+					pdiff.func_list);
 		callchain_stream_report(es_base, es_pair);
 	}
 
@@ -1043,7 +1046,7 @@ static int process_base_stream(struct data__file *data_base,
 		block_hists_addr2line(&rep_pair->hist.block_hists, pair_dir);
 
 		block_info__match_report(rep_base, rep_pair,
-					 pdiff.src_list, NULL, NULL);
+					 pdiff.src_list, pdiff.func_list, NULL);
 
 		fprintf(stdout, "%s", title);
 
@@ -1213,6 +1216,24 @@ static struct callchain_streams *create_evsel_streams(struct evlist *evlist,
 	return evsel_streams;
 }
 
+static int parse_func(const struct option *opt __maybe_unused,
+		      const char *str, int unset __maybe_unused)
+{
+	int ret;
+
+	if (!pdiff.func_list) {
+		pdiff.func_list = strlist__new(NULL, NULL);
+		if (!pdiff.func_list)
+			return -ENOMEM;
+	}
+
+	ret = strlist__add(pdiff.func_list, str);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
 static int __cmd_diff(void)
 {
 	struct data__file *d;
@@ -1312,6 +1333,9 @@ static int __cmd_diff(void)
 	if (pdiff.src_list)
 		srclist__delete(pdiff.src_list);
 
+	if (pdiff.func_list)
+		strlist__delete(pdiff.func_list);
+
 	return ret;
 }
 
@@ -1390,6 +1414,8 @@ static const struct option options[] = {
 	OPT_CALLBACK(0, "percent-limit", &pdiff, "percent",
 		     "Don't show entries under that percent",
 		     parse_percent_limit),
+	OPT_CALLBACK(0, "changed-func", NULL, "func",
+		     "Given function is changed", parse_func),
 	OPT_END()
 };
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 01/14] perf util: Create source line mapping table
  2020-03-10  7:02 ` [PATCH v1 01/14] perf util: Create source line mapping table Jin Yao
@ 2020-03-10 15:08   ` Arnaldo Carvalho de Melo
  2020-03-11  5:33     ` Jin, Yao
  0 siblings, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-10 15:08 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Tue, Mar 10, 2020 at 03:02:32PM +0800, Jin Yao escreveu:
> Sometimes, a small change in a hot function reducing the cycles of
> this function, but the overall workload doesn't get faster. It is
> interesting where the cycles are moved to.
> 
> What it would like is to diff before/after streams. A stream is a
> callchain which is aggregated by the branch records from samples.
> 
> By browsing the hot streams, we can understand the hot code flow.
> By comparing the cycles variation of same streams between old perf
> data and new perf data, we can understand if the cycles are moved to
> the unchanged code.
> 
> The before stream is the stream before source line changed
> (e.g. streams in perf.data.old). The after stream is the stream
> after source line changed (e.g. streams in perf.data).
> 
> Diffing before/after streams compares all streams (or compares top
> N hot streams) between two perf data files.
> 
> If all entries of one stream in perf.data.old are fully matched with
> all entries of another stream in perf.data, we think these two streams
> are matched, otherwise the streams are not matched.
> 
> For example,
> 
>    cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
> --------------------------              --------------------------
>              main div.c:39                           main div.c:39
>              main div.c:44                           main div.c:44
> 
> It looks that two streams are matched and we can see for the same
> streams the cycles are equal and the stream hit percents are
> slightly changed. That's expected in the normal range.
> 
> But that's not always true if source lines are changed in perf.data
> (e.g. div.c:39 is changed). If the source line is changed, they
> are different streams, we can't compare them. We just think the
> stream in perf.data is a new one.
> 
> The challenge is how to identify the changed source lines. The basic
> idea is to use linux command 'diff' to compare the source file A and
> source file A* line by line (assume A is used in perf.data.old
> and A* is updated in perf.data). Create a source line mapping table.
> 
> For example,
> 
>   Execute 'diff ./before/div.c ./after/div.c'
> 
>   25c25
>   <       i = rand() % 2;
>   ---
>   >       i = rand() % 4;
>   39c39
>   <       for (i = 0; i < 2000000000; i++) {
>   ---
>   >       for (i = 0; i < 20000000001; i++) {
> 
>   div.c (after -> before) lines mapping:
>   0 -> 0
>   1 -> 1
>   2 -> 2
>   3 -> 3
>   4 -> 4
>   5 -> 5
>   6 -> 6
>   7 -> 7
>   8 -> 8
>   9 -> 9
>   ...
>   24 -> 24
>   25 -> -1
>   26 -> 26
>   27 -> 27
>   28 -> 28
>   29 -> 29
>   30 -> 30
>   31 -> 31
>   32 -> 32
>   33 -> 33
>   34 -> 34
>   35 -> 35
>   36 -> 36
>   37 -> 37
>   38 -> 38
>   39 -> -1
>   40 -> 40
>   ...
> 
> From the table, we can easily know div.c:39 is source line changed.
> (it's mapped to -1). So these two streams are not matched.
> 
> This patch can also handle the cases of new source lines insertion
> and old source lines deletion. This source line mapping table is a
> base for next processing for stream comparison.
> 
> This patch creates a new rblist 'srclist' to manage the source line
> mapping tables. Each node contains the source line mapping table of
> a before/after source file pair. The node is created on demand as
> we process the samples. So it's likely we don't need to create so many
> nodes.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/util/Build     |   1 +
>  tools/perf/util/srclist.c | 552 ++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/srclist.h |  65 +++++
>  3 files changed, 618 insertions(+)
>  create mode 100644 tools/perf/util/srclist.c
>  create mode 100644 tools/perf/util/srclist.h
> 
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index c0cf8dff694e..b9bf620b60f0 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -99,6 +99,7 @@ perf-y += call-path.o
>  perf-y += rwsem.o
>  perf-y += thread-stack.o
>  perf-y += spark.o
> +perf-y += srclist.o
>  perf-$(CONFIG_AUXTRACE) += auxtrace.o
>  perf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
>  perf-$(CONFIG_AUXTRACE) += intel-pt.o
> diff --git a/tools/perf/util/srclist.c b/tools/perf/util/srclist.c
> new file mode 100644
> index 000000000000..cb1d42a3a8ed
> --- /dev/null
> +++ b/tools/perf/util/srclist.c
> @@ -0,0 +1,552 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Manage difference of source lines
> + * Copyright (c) 2020, Intel Corporation.
> + * Author: Jin Yao
> + */
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <sys/types.h>
> +#include <inttypes.h>
> +#include <string.h>
> +#include <strings.h>
> +#include <unistd.h>
> +#include <err.h>
> +#include <errno.h>
> +#include <limits.h>
> +#include <stdbool.h>
> +#include <linux/zalloc.h>
> +#include "strlist.h"
> +#include "srclist.h"
> +#include "debug.h"
> +
> +enum {
> +	DIFF_LINE_ADD,
> +	DIFF_LINE_DEL,
> +	DIFF_LINE_CHANGE
> +};
> +
> +static void get_full_path(const char *dir, const char *rel_path,
> +			  char *full_path, int size)
> +{
> +	if (!dir) {
> +		snprintf(full_path, size, "%s", rel_path);
> +		return;
> +	}
> +
> +	if (dir[strlen(dir) - 1] == '/')
> +		snprintf(full_path, size, "%s%s", dir, rel_path);
> +	else
> +		snprintf(full_path, size, "%s/%s", dir, rel_path);

Just use 

	snprintf(full_path, size, "%s/%s", dir, rel_path);

an extra / won't matter, the code will be smaller. At first I thought
you could use path__join(), please check if that is the case.

> +}
> +
> +static int get_line_num(char *path)

So this is to get how many lines a given file has? I.e. 'wc -l'? Maybe:

static int path__number_of_lines(const char *path)

?

> +{
> +	FILE *fh;
> +	int num = 0, ch, last = 0;
> +
> +	fh = fopen(path, "r");
> +	if (!fh) {
> +		pr_debug("Failed to open file %s\n", path);
> +		return -1;
> +	}
> +
> +	while (!feof(fh)) {
> +		ch = fgetc(fh);
> +		if (ch == '\n')
> +			num++;
> +		last = ch;

Maybe use fgets() instead as it already stops at '\n'?

> +	}
> +
> +	fclose(fh);
> +
> +	if (last != '\n')
> +		num++;
> +
> +	return num;
> +}
> +
> +static int get_digit(char *str, char **end_str, int *val)
> +{
> +	int v;
> +	char *end;
> +
> +	errno = 0;
> +	v = strtol(str, &end, 10);
> +	if ((errno == ERANGE && (v == INT_MAX || v == INT_MIN))
> +		|| (errno != 0 && v == 0)) {
> +		return -1;
> +	}
> +
> +	if (end == str)
> +		return -1;
> +
> +	*val = v;
> +	*end_str = end;
> +	return 0;
> +}
> +
> +static int get_range(char *str, int *start_val, int *end_val)
> +{
> +	int val, ret;
> +	char *end, *next;
> +
> +	*start_val = -1;
> +	*end_val = -1;
> +
> +	ret = get_digit(str, &end, &val);
> +	if (ret)
> +		return ret;
> +
> +	*start_val = val;
> +
> +	if (*end != '\0') {
> +		next = end + 1;
> +		ret = get_digit(next, &end, &val);
> +		if (ret)
> +			return ret;
> +
> +		*end_val = val;
> +	}
> +
> +	return 0;
> +}
> +
> +static int opr_str_idx(char *str, int len, char ch)
> +{
> +	int i;
> +
> +	for (i = 0; i < len; i++) {
> +		if (str[i] == ch)
> +			break;
> +	}
> +
> +	if (i == len || i == 0 || i == len - 1)
> +		return -1;
> +
> +	if (str[i - 1] >= '0' && str[i - 1] <= '9' &&
> +	    str[i + 1] >= '0' && str[i + 1] <= '9') {
> +		return i;
> +	}
> +
> +	return -1;
> +}
> +
> +static bool get_diff_operation(char *str, int *opr, int *b_start, int *b_end,
> +			       int *a_start, int *a_end)
> +{
> +	int idx, len;
> +
> +	if (str[0] == '<' || str[0] == '>' || str[0] == '-')
> +		return false;
> +
> +	len = strlen(str);
> +
> +	/*
> +	 * For example,
> +	 * 5,6d4
> +	 * 8a7
> +	 * 20,21c19
> +	 */
> +	idx = opr_str_idx(str, len, 'd');
> +	if (idx != -1) {
> +		*opr = DIFF_LINE_DEL;
> +		goto exit;
> +	}
> +
> +	idx = opr_str_idx(str, len, 'a');
> +	if (idx != -1) {
> +		*opr = DIFF_LINE_ADD;
> +		goto exit;
> +	}
> +
> +	idx = opr_str_idx(str, len, 'c');
> +	if (idx != -1) {
> +		*opr = DIFF_LINE_CHANGE;
> +		goto exit;
> +	}
> +
> +exit:
> +	if (idx != -1) {
> +		str[idx] = 0;
> +		get_range(str, b_start, b_end);
> +		get_range(&str[idx + 1], a_start, a_end);

get_range() may return -1, don't you have to check this here?

> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static int del_lines(struct line_pair *lines, int b_start, int b_end,
> +		     int a_start, int a_end __maybe_unused,
> +		     int *nr_lines, int *b_adjust)
> +{
> +	int i = *nr_lines;
> +	bool set = false;
> +
> +	/*
> +	 * For example, 5,6d4
> +	 * It means line5~line6 of file1 are not same as line4 of file2
> +	 * since line5~line6 are deleted.
> +	 */
> +
> +	if (a_start == i) {
> +		lines[i].a_nr = i;
> +		lines[i].b_nr = i + *b_adjust;
> +		set = true;
> +	}
> +
> +	if (b_end != -1)
> +		*b_adjust += b_end - b_start + 1;
> +	else
> +		*b_adjust += 1;
> +
> +	if (!set) {
> +		lines[i].a_nr = i;
> +		lines[i].b_nr = i + *b_adjust;
> +	}
> +
> +	*nr_lines = i + 1;
> +	return 0;
> +}
> +
> +static int add_lines(struct line_pair *lines,
> +		     int b_start __maybe_unused, int b_end __maybe_unused,
> +		     int a_start, int a_end, int *nr_lines, int *b_adjust)
> +{
> +	int i = *nr_lines;
> +
> +	/*
> +	 * For example, 8a7
> +	 * It means line8 of file1 is not same as line7 of file2
> +	 * since a new line (line7) is inserted to file2.
> +	 */
> +	if (a_end != -1) {
> +		for (int j = 0; j < a_end - a_start + 1; j++) {
> +			lines[i].a_nr = i;
> +			lines[i].b_nr = -1;
> +			i++;
> +		}
> +		*b_adjust -= a_end - a_start + 1;
> +	} else {
> +		lines[i].a_nr = i;
> +		lines[i].b_nr = -1;
> +		i++;
> +		*b_adjust -= 1;
> +	}
> +
> +	*nr_lines = i;
> +	return 0;
> +}
> +
> +static int change_lines(struct line_pair *lines, int b_start, int b_end,
> +			int a_start, int a_end, int *nr_lines, int *b_adjust)
> +{
> +	int i = *nr_lines;
> +
> +	/*
> +	 * For example, 20,21c19
> +	 * It means line20~line21 of file1 are not same as line19 of file2
> +	 * since they are changed.
> +	 */
> +
> +	if (a_end != -1) {
> +		for (int j = 0; j < a_end - a_start + 1; j++) {
> +			lines[i].a_nr = i;
> +			lines[i].b_nr = -1;
> +			i++;
> +		}
> +	} else {
> +		lines[i].a_nr = i;
> +		lines[i].b_nr = -1;
> +		i++;
> +	}
> +
> +	if (b_end != -1)
> +		*b_adjust += b_end - b_start;
> +
> +	if (a_end != -1)
> +		*b_adjust -= a_end - a_start;
> +
> +	*nr_lines = i;
> +	return 0;
> +}
> +
> +static int update_lines(struct line_pair *lines, int opr, int b_start,
> +			int b_end, int a_start, int a_end, int *nr_lines,
> +			int *b_adjust)


Functions operating on thisa 'struct line_pair' should have the
line_pair__ prefix, so that we can use ctags, etc, as we have other
places that process lines, etc, which may end up clashing.

> +{
> +	int i = *nr_lines;
> +
> +	while (i < a_start) {
> +		lines[i].a_nr = i;
> +		lines[i].b_nr = i + *b_adjust;
> +		i++;
> +	}
> +
> +	*nr_lines = i;
> +
> +	switch (opr) {
> +	case DIFF_LINE_DEL:
> +		del_lines(lines, b_start, b_end, a_start, a_end,
> +			  nr_lines, b_adjust);
> +		break;
> +
> +	case DIFF_LINE_ADD:
> +		add_lines(lines, b_start, b_end, a_start, a_end,
> +			  nr_lines, b_adjust);
> +		break;
> +
> +	case DIFF_LINE_CHANGE:
> +		change_lines(lines, b_start, b_end, a_start, a_end,
> +			     nr_lines, b_adjust);
> +		break;
> +
> +	default:
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +static void update_remaining(struct line_pair *lines, int a_num, int *nr_lines,
> +			     int b_adjust)
> +{
> +	int i;
> +
> +	for (i = *nr_lines; i <= a_num; i++) {
> +		lines[i].a_nr = i;
> +		lines[i].b_nr = i + b_adjust;
> +	}
> +
> +	*nr_lines = i;
> +}
> +
> +static int build_mapping_table(FILE *diff_fd, struct line_pair *lines,
> +			       int *nr_lines, int a_num)
> +{
> +	char *str, *p;
> +	size_t len;
> +	int opr, b_start, b_end, a_start, a_end, b_adjust = 0;
> +
> +	*nr_lines = 1;  /* First line is reserved */
> +
> +	while (!feof(diff_fd)) {
> +		str = NULL;

Forgot to set len to zero? (See getline manpage)

> +		if (getline(&str, &len, diff_fd) < 0 || !len)
> +			break;
> +
> +		p = strchr(str, '\n');
> +		if (p)
> +			*p = '\0';
> +
> +		pr_debug("%s\n", str);
> +
> +		if (!get_diff_operation(str, &opr, &b_start, &b_end,
> +					&a_start, &a_end)) {
> +			continue;
> +		}
> +
> +		update_lines(lines, opr, b_start, b_end, a_start, a_end,
> +			     nr_lines, &b_adjust);

So, since str was allocated, where is it being kept? above you're always
setting it to NULL before calling getline.

> +	}
> +
> +	update_remaining(lines, a_num, nr_lines, b_adjust);
> +
> +	free(str);
> +	return 0;
> +}
> +
> +static int create_line_mapping(struct src_info *info, char *b_path,
> +			       char *a_path)

src_info__create_line_mappings() ?

> +{
> +	char cmd[PATH_MAX];
> +	FILE *diff_fd = NULL;
> +	int ret = -1;
> +
> +	info->nr_lines_before = get_line_num(b_path);
> +	if (info->nr_lines_before == -1)
> +		goto out;
> +
> +	info->nr_lines_after = get_line_num(a_path);
> +	if (info->nr_lines_after == -1)
> +		goto out;
> +
> +	/* Reserve first line */
> +	info->nr_max = info->nr_lines_before + info->nr_lines_after + 1;
> +	info->lines = calloc(info->nr_max, sizeof(struct line_pair));
> +	if (!info->lines) {
> +		pr_err("Failed to alloc memory for lines\n");
> +		goto out;
> +	}
> +
> +	snprintf(cmd, sizeof(cmd), "diff %s %s",
> +		 b_path, a_path);
> +
> +	pr_debug("Execute '%s'\n", cmd);
> +
> +	diff_fd = popen(cmd, "r");
> +	if (!diff_fd) {
> +		pr_err("Failed to execute '%s'\n", cmd);
> +		goto out;
> +	}
> +
> +	ret = build_mapping_table(diff_fd, info->lines, &info->nr_lines,
> +				  info->nr_lines_after);
> +
> +out:
> +	if (diff_fd)
> +		fclose(diff_fd);
> +
> +	return ret;
> +}
> +
> +static void free_src_info(struct src_info *info)
> +{
> +	if (info->rel_path)
> +		zfree(&info->rel_path);
> +
> +	if (info->lines)
> +		zfree(&info->lines);

No need to check if it is NULL, call zfree() straight away.

> +}
> +
> +static int init_src_info(char *b_path, char *a_path, const char *rel_path,
> +			 struct src_info *info)
> +{
> +	int ret;
> +
> +	info->rel_path = strdup(rel_path);
> +	if (!info->rel_path)
> +		return -1;
> +
> +	ret = create_line_mapping(info, b_path, a_path);
> +	if (ret) {
> +		free_src_info(info);
> +		return ret;
> +	}
> +
> +	return 0;

	if (ret)
		free_src_info(info);

	return ret;

Should be equivalent and shorter, no?


> +}
> +
> +static void free_src_node(struct src_node *node)
> +{

Free routines should accept a NULL arg, i.e. first thing:

        if (node == NULL)
		return;

> +	free_src_info(&node->info);
> +	free(node);
> +}
> +
> +static struct rb_node *srclist__node_new(struct rblist *rblist,
> +					 const void *entry)
> +{
> +	struct srclist *slist = container_of(rblist, struct srclist, rblist);
> +	const char *rel_path = entry;
> +	char b_path[PATH_MAX], a_path[PATH_MAX];
> +	struct src_node *node;
> +	int ret;
> +
> +	get_full_path(slist->before_dir, rel_path, b_path, PATH_MAX);
> +	get_full_path(slist->after_dir, rel_path, a_path, PATH_MAX);
> +
> +	node = calloc(1, sizeof(*node));

	zalloc()?

> +	if (!node)
> +		return NULL;
> +
> +	ret = init_src_info(b_path, a_path, rel_path, &node->info);
> +	if (ret)
> +		goto err;
> +
> +	return &node->rb_node;
> +
> +err:
> +	free_src_node(node);
> +	return NULL;
> +}
> +
> +static void srclist__node_delete(struct rblist *rblist __maybe_unused,
> +				 struct rb_node *rb_node)
> +{
> +	struct src_node *node = container_of(rb_node, struct src_node, rb_node);
> +
> +	free_src_info(&node->info);
> +	free(node);
> +}
> +
> +static int srclist__node_cmp(struct rb_node *rb_node, const void *entry)
> +{
> +	struct src_node *node = container_of(rb_node, struct src_node, rb_node);
> +	const char *str = entry;
> +
> +	return strcmp(node->info.rel_path, str);
> +}
> +
> +struct srclist *srclist__new(const char *before_dir, const char *after_dir)
> +{
> +	struct srclist *slist = calloc(1, sizeof(*slist));

zalloc()

> +
> +	if (!slist)
> +		return NULL;
> +
> +	rblist__init(&slist->rblist);
> +	slist->rblist.node_cmp = srclist__node_cmp;
> +	slist->rblist.node_new = srclist__node_new;
> +	slist->rblist.node_delete = srclist__node_delete;
> +
> +	slist->before_dir = before_dir;
> +	slist->after_dir = after_dir;
> +	return slist;
> +}
> +
> +void srclist__delete(struct srclist *slist)
> +{
> +	if (slist)
> +		rblist__delete(&slist->rblist);


Correct, delete operations should accept NULL, just like free()

> +}
> +
> +struct src_node *srclist__find(struct srclist *slist, char *rel_path,
> +			       bool create)
> +{
> +	struct src_node *node = NULL;
> +	struct rb_node *rb_node;
> +
> +	if (create)
> +		rb_node = rblist__findnew(&slist->rblist, (void *)rel_path);
> +	else
> +		rb_node = rblist__find(&slist->rblist, (void *)rel_path);
> +
> +	if (rb_node)
> +		node = container_of(rb_node, struct src_node, rb_node);
> +
> +	return node;
> +}
> +
> +struct line_pair *srclist__line_pair(struct src_node *node, int a_nr)
> +{
> +	struct src_info *info = &node->info;
> +
> +	if (a_nr < info->nr_lines && a_nr > 0)
> +		return &info->lines[a_nr];
> +
> +	pr_debug("Exceeds line range: nr_lines = %d, a_nr = %d\n",
> +		 info->nr_lines, a_nr);
> +
> +	return NULL;
> +}
> +
> +void srclist__dump(struct srclist *slist)
> +{
> +	struct src_node *node;
> +	struct src_info *info;
> +	int i;
> +
> +	srclist__for_each_entry(node, slist) {
> +		info = &node->info;
> +
> +		pr_debug("%s (after -> before) lines mapping:\n",
> +			 info->rel_path);
> +
> +		for (i = 0; i < info->nr_lines; i++) {
> +			pr_debug("%d -> %d\n",
> +				 info->lines[i].a_nr,
> +				 info->lines[i].b_nr);
> +		}
> +	}
> +}
> diff --git a/tools/perf/util/srclist.h b/tools/perf/util/srclist.h
> new file mode 100644
> index 000000000000..f25b0de91a13
> --- /dev/null
> +++ b/tools/perf/util/srclist.h
> @@ -0,0 +1,65 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __PERF_SRCLIST_H
> +#define __PERF_SRCLIST_H
> +
> +#include <linux/types.h>
> +#include <linux/rbtree.h>
> +#include "rblist.h"
> +
> +struct line_pair {
> +	int a_nr;	/* line nr in after.c */
> +	int b_nr;	/* line nr in before.c */
> +};
> +
> +struct src_node;
> +
> +struct src_info {
> +	char *rel_path; /* relative path */
> +	struct line_pair *lines;
> +	int nr_max;
> +	int nr_lines;
> +	int nr_lines_before;
> +	int nr_lines_after;
> +};
> +
> +struct src_node {
> +	struct rb_node rb_node;
> +	struct src_info info;
> +};
> +
> +struct srclist {
> +	struct rblist rblist;
> +	const char *before_dir;
> +	const char *after_dir;
> +};
> +
> +struct srclist *srclist__new(const char *before_dir, const char *after_dir);
> +void srclist__delete(struct srclist *slist);
> +
> +struct src_node *srclist__find(struct srclist *slist, char *rel_path,
> +			       bool create);
> +
> +struct line_pair *srclist__line_pair(struct src_node *node, int a_nr);
> +void srclist__dump(struct srclist *slist);
> +
> +static inline struct src_node *srclist__first(struct srclist *slist)
> +{
> +	struct rb_node *rn = rb_first_cached(&slist->rblist.entries);
> +
> +	return rn ? rb_entry(rn, struct src_node, rb_node) : NULL;
> +}
> +
> +static inline struct src_node *srclist__next(struct src_node *sn)
> +{
> +	struct rb_node *rn;
> +
> +	if (!sn)
> +		return NULL;
> +	rn = rb_next(&sn->rb_node);
> +	return rn ? rb_entry(rn, struct src_node, rb_node) : NULL;
> +}
> +
> +#define srclist__for_each_entry(pos, slist)	\
> +	for (pos = srclist__first(slist); pos; pos = srclist__next(pos))
> +
> +#endif /* __PERF_SRCLIST_H */
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains
  2020-03-10  7:02 ` [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains Jin Yao
@ 2020-03-10 15:11   ` Arnaldo Carvalho de Melo
  2020-03-11  5:38     ` Jin, Yao
  0 siblings, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-10 15:11 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Tue, Mar 10, 2020 at 03:02:33PM +0800, Jin Yao escreveu:
> We think the stream is a callchain which is aggregated by the LBR
> records from samples. By browsing the stream, we can understand
> the code flow.
> 
> The struct callchain_node represents one callchain and we use the
> callchain_node->hit to measure the hot level of this callchain.
> Higher is hotter.
> 
> Since in perf data file, there may be many callchains so we just
> need to focus on the top N hottest callchains. N is a user defined
> parameter or just a predefined default value.
> 
> This patch saves the top N hottest callchains in 'struct stream_node'
> type array, which is defined in a per event 'struct callchain_streams'.
> 
> So now we can get the per-event top N hottest callchains.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/util/callchain.c | 125 ++++++++++++++++++++++++++++++++++++
>  tools/perf/util/callchain.h |  16 +++++
>  2 files changed, 141 insertions(+)
> 
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 818aa4efd386..d9c68a8e7619 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -31,6 +31,7 @@
>  #include "callchain.h"
>  #include "branch.h"
>  #include "symbol.h"
> +#include "evlist.h"
>  #include "../perf.h"
>  
>  #define CALLCHAIN_PARAM_DEFAULT			\
> @@ -1599,3 +1600,127 @@ void callchain_cursor_reset(struct callchain_cursor *cursor)
>  	for (node = cursor->first; node != NULL; node = node->next)
>  		map__zput(node->ms.map);
>  }
> +
> +static void free_evsel_streams(struct callchain_streams *callchain_streams,
> +			       int nr_evsel)
> +{
> +	for (int i = 0; i < nr_evsel; i++) {
> +		if (callchain_streams[i].streams)
> +			free(callchain_streams[i].streams);

free(NULL) is valid, so remove that extra check and use zfree() to reset
that entry to NULL, i.e.:

	for ()
		zfree(&callchain_streams[i].streams);

> +	}
> +
> +	free(callchain_streams);
> +}
> +
> +static struct callchain_streams *create_evsel_streams(int nr_evsel,
> +						      int nr_streams_max)
> +{
> +	struct callchain_streams *callchain_streams;
> +
> +	callchain_streams = calloc(nr_evsel, sizeof(struct callchain_streams));

calloc is the right thing here, as this is an array

> +	if (!callchain_streams)
> +		return NULL;
> +
> +	for (int i = 0; i < nr_evsel; i++) {
> +		struct callchain_streams *s = &callchain_streams[i];
> +
> +		s->streams = calloc(nr_streams_max, sizeof(struct stream_node));
> +		if (!s->streams)
> +			goto err;
> +
> +		s->nr_streams_max = nr_streams_max;
> +		s->evsel_idx = -1;
> +	}
> +
> +	return callchain_streams;
> +
> +err:
> +	free_evsel_streams(callchain_streams, nr_evsel);
> +	return NULL;
> +}
> +
> +/*
> + * The cnodes with high hit number are hot callchains.
> + */
> +static void set_hot_cnode(struct callchain_streams *s,
> +			  struct callchain_node *cnode)
> +{
> +	int i, idx = 0;
> +	u64 hit;
> +
> +	if (s->nr_streams < s->nr_streams_max) {
> +		i = s->nr_streams;
> +		s->streams[i].cnode = cnode;
> +		s->nr_streams++;
> +		return;
> +	}
> +
> +	/*
> +	 * Since only a few number of hot streams, so only use simple
> +	 * way to find the cnode with smallest hit number and replace.
> +	 */
> +	hit = (s->streams[0].cnode)->hit;
> +	for (i = 1; i < s->nr_streams; i++) {
> +		if ((s->streams[i].cnode)->hit < hit) {
> +			hit = (s->streams[i].cnode)->hit;
> +			idx = i;
> +		}
> +	}
> +
> +	if (cnode->hit > hit)
> +		s->streams[idx].cnode = cnode;
> +}
> +
> +static void update_hot_streams(struct hist_entry *he,
> +			       struct callchain_streams *s)
> +{
> +	struct rb_root *root = &he->sorted_chain;
> +	struct rb_node *rb_node = rb_first(root);
> +	struct callchain_node *node;
> +
> +	while (rb_node) {
> +		node = rb_entry(rb_node, struct callchain_node, rb_node);
> +		set_hot_cnode(s, node);
> +		rb_node = rb_next(rb_node);
> +	}
> +}
> +
> +static void get_hot_streams(struct hists *hists,
> +			    struct callchain_streams *s)
> +{
> +	struct rb_node *next;
> +
> +	next = rb_first_cached(&hists->entries);
> +	while (next) {
> +		struct hist_entry *he;
> +
> +		he = rb_entry(next, struct hist_entry, rb_node);
> +		update_hot_streams(he, s);
> +		next = rb_next(&he->rb_node);
> +	}
> +}
> +
> +struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
> +							 int nr_streams_max,
> +							 int *nr_evsel_streams)
> +{
> +	int nr_evsel = evlist->core.nr_entries, i = 0;
> +	struct callchain_streams *callchain_streams;
> +	struct evsel *pos;
> +
> +	callchain_streams = create_evsel_streams(nr_evsel, nr_streams_max);
> +	if (!callchain_streams)
> +		return NULL;
> +
> +	evlist__for_each_entry(evlist, pos) {
> +		struct hists *hists = evsel__hists(pos);
> +
> +		hists__output_resort(hists, NULL);
> +		get_hot_streams(hists, &callchain_streams[i]);
> +		callchain_streams[i].evsel_idx = pos->idx;
> +		i++;
> +	}
> +
> +	*nr_evsel_streams = nr_evsel;
> +	return callchain_streams;
> +}
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index 706bb7bbe1e1..5852990cdf60 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -13,6 +13,7 @@ struct ip_callchain;
>  struct map;
>  struct perf_sample;
>  struct thread;
> +struct evlist;
>  
>  #define HELP_PAD "\t\t\t\t"
>  
> @@ -159,6 +160,17 @@ struct callchain_cursor {
>  	struct callchain_cursor_node	*curr;
>  };
>  
> +struct stream_node {
> +	struct callchain_node	*cnode;
> +};
> +
> +struct callchain_streams {
> +	struct stream_node	*streams;
> +	int			nr_streams_max;
> +	int			nr_streams;
> +	int			evsel_idx;
> +};
> +
>  extern __thread struct callchain_cursor callchain_cursor;
>  
>  static inline void callchain_init(struct callchain_root *root)
> @@ -289,4 +301,8 @@ int callchain_branch_counts(struct callchain_root *root,
>  			    u64 *branch_count, u64 *predicted_count,
>  			    u64 *abort_count, u64 *cycles_count);
>  
> +struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
> +							 int nr_streams_max,
> +							 int *nr_evsel_streams);
> +
>  #endif	/* __PERF_CALLCHAIN_H */
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 05/14] perf util: Calculate the sum of all streams hits
  2020-03-10  7:02 ` [PATCH v1 05/14] perf util: Calculate the sum of all streams hits Jin Yao
@ 2020-03-10 15:14   ` Arnaldo Carvalho de Melo
  2020-03-11  5:44     ` Jin, Yao
  0 siblings, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-10 15:14 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Tue, Mar 10, 2020 at 03:02:36PM +0800, Jin Yao escreveu:
> We have used callchain_node->hit to measure the hot level of one
> stream. This patch calculates the sum of hits of all streams.
> 
> Then in next patch, we can use following formula to report hot
> percent for one stream.
> 
> hot percent = callchain_node->hit / sum of all hits
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/util/callchain.c | 35 +++++++++++++++++++++++++++++++++++
>  tools/perf/util/callchain.h |  1 +
>  2 files changed, 36 insertions(+)
> 
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index a9dd91268b00..040995405664 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -1685,6 +1685,39 @@ static void update_hot_streams(struct hist_entry *he,
>  	}
>  }
>  
> +static u64 count_callchain_hits(struct hist_entry *he)
> +{
> +	struct rb_root *root = &he->sorted_chain;
> +	struct rb_node *rb_node = rb_first(root);
> +	struct callchain_node *node;
> +	u64 chain_hits = 0;
> +
> +	while (rb_node) {
> +		node = rb_entry(rb_node, struct callchain_node, rb_node);
> +		chain_hits += node->hit;
> +		rb_node = rb_next(rb_node);
> +	}
> +
> +	return chain_hits;
> +}
> +
> +static u64 total_callchain_hits(struct hists *hists)
> +{
> +	struct rb_node *next;
> +	u64 chain_hits = 0;
> +
> +	next = rb_first_cached(&hists->entries);

Try to combine the variable decl line with its initial assignment,
saving one line, i.e.:

+static u64 total_callchain_hits(struct hists *hists)
+{
+	struct rb_node *next = rb_first_cached(&hists->entries);
+	u64 chain_hits = 0;
+
> +	while (next) {
> +		struct hist_entry *he;
> +
> +		he = rb_entry(next, struct hist_entry, rb_node);

Ditto:

+		struct hist_entry *he = rb_entry(next, struct hist_entry, rb_node);

> +		chain_hits += count_callchain_hits(he);
> +		next = rb_next(&he->rb_node);
> +	}
> +
> +	return chain_hits;
> +}
> +
>  static void get_hot_streams(struct hists *hists,
>  			    struct callchain_streams *s)
>  {
> @@ -1698,6 +1731,8 @@ static void get_hot_streams(struct hists *hists,
>  		update_hot_streams(he, s);
>  		next = rb_next(&he->rb_node);
>  	}
> +
> +	s->chain_hits = total_callchain_hits(hists);
>  }
>  
>  struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index c996ab4fb108..3c0e0b45656b 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -173,6 +173,7 @@ struct callchain_streams {
>  	int			nr_streams_max;
>  	int			nr_streams;
>  	int			evsel_idx;
> +	u64			chain_hits;
>  };
>  
>  extern __thread struct callchain_cursor callchain_cursor;
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison
  2020-03-10  7:02 ` [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison Jin Yao
@ 2020-03-10 15:17   ` Arnaldo Carvalho de Melo
  2020-03-11  5:47     ` Jin, Yao
  0 siblings, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-10 15:17 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Tue, Mar 10, 2020 at 03:02:39PM +0800, Jin Yao escreveu:
> It's also useful to figure out the top N hottest blocks from old perf
> data file and figure out the top N hottest blocks from new perf data file,
> and then compare them for the cycles diff. It can let us easily know
> how many cycles are moved from one block to another block.
> 
> This patch adds new helper functions and data structures for the block
> comparison.
> 
> And it also updates the existing perf-diff to be compatible with
> the new interface.
> 
> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
> ---
>  tools/perf/builtin-diff.c    |  45 +-----
>  tools/perf/util/block-info.c | 305 ++++++++++++++++++++++++++++++++++-
>  tools/perf/util/block-info.h |  32 +++-
>  tools/perf/util/srclist.h    |   9 ++
>  4 files changed, 345 insertions(+), 46 deletions(-)
> 
> diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
> index 9c801b9bc5bb..dcbc9bba4e61 100644
> --- a/tools/perf/builtin-diff.c
> +++ b/tools/perf/builtin-diff.c
> @@ -598,27 +598,6 @@ static void init_block_hist(struct block_hist *bh)
>  	bh->valid = true;
>  }
>  
> -static struct hist_entry *get_block_pair(struct hist_entry *he,
> -					 struct hists *hists_pair)
> -{
> -	struct rb_root_cached *root = hists_pair->entries_in;
> -	struct rb_node *next = rb_first_cached(root);
> -	int64_t cmp;
> -
> -	while (next != NULL) {
> -		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
> -						      rb_node_in);
> -
> -		next = rb_next(&he_pair->rb_node_in);
> -
> -		cmp = __block_info__cmp(he_pair, he);
> -		if (!cmp)
> -			return he_pair;
> -	}
> -
> -	return NULL;
> -}
> -
>  static void init_spark_values(unsigned long *svals, int num)
>  {
>  	for (int i = 0; i < num; i++)
> @@ -665,26 +644,6 @@ static void compute_cycles_diff(struct hist_entry *he,
>  	}
>  }
>  
> -static void block_hists_match(struct hists *hists_base,
> -			      struct hists *hists_pair)
> -{
> -	struct rb_root_cached *root = hists_base->entries_in;
> -	struct rb_node *next = rb_first_cached(root);
> -
> -	while (next != NULL) {
> -		struct hist_entry *he = rb_entry(next, struct hist_entry,
> -						 rb_node_in);
> -		struct hist_entry *pair = get_block_pair(he, hists_pair);
> -
> -		next = rb_next(&he->rb_node_in);
> -
> -		if (pair) {
> -			hist_entry__add_pair(pair, he);
> -			compute_cycles_diff(he, pair);
> -		}
> -	}
> -}
> -
>  static void hists__precompute(struct hists *hists)
>  {
>  	struct rb_root_cached *root;
> @@ -737,7 +696,9 @@ static void hists__precompute(struct hists *hists)
>  
>  				if (bh->valid && pair_bh->valid) {
>  					block_hists_match(&bh->block_hists,
> -							  &pair_bh->block_hists);
> +							  &pair_bh->block_hists,
> +							  NULL,
> +							  compute_cycles_diff);
>  					hists__output_resort(&pair_bh->block_hists,
>  							     NULL);
>  				}
> diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
> index 423ec69bda6c..247c87b8df56 100644
> --- a/tools/perf/util/block-info.c
> +++ b/tools/perf/util/block-info.c
> @@ -12,6 +12,7 @@
>  #include "evlist.h"
>  #include "hist.h"
>  #include "ui/browsers/hists.h"
> +#include "debug.h"
>  
>  static struct block_header_column {
>  	const char *name;
> @@ -50,10 +51,24 @@ struct block_info *block_info__get(struct block_info *bi)
>  	return bi;
>  }
>  
> +static void free_block_line(struct block_line **bl)

static void block_line__zdelete(struct block_line **bl)

> +{
> +	if ((*bl)->start_file)
> +		free((*bl)->start_file);
> +
> +	if ((*bl)->end_file)
> +		free((*bl)->end_file);
> +
> +	zfree(bl);
> +}
> +
>  void block_info__put(struct block_info *bi)

Correct naming, cool.

>  {
> -	if (bi && refcount_dec_and_test(&bi->refcnt))
> +	if (bi && refcount_dec_and_test(&bi->refcnt)) {
> +		if (bi->line)
> +			free_block_line(&bi->line);
>  		free(bi);
> +	}
>  }
>  
>  struct block_info *block_info__new(void)
> @@ -65,7 +80,8 @@ struct block_info *block_info__new(void)
>  	return bi;
>  }
>  
> -int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right)
> +int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
> +			  struct srclist *src_list __maybe_unused)
>  {
>  	struct block_info *bi_l = left->block_info;
>  	struct block_info *bi_r = right->block_info;
> @@ -93,7 +109,7 @@ int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right)
>  int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>  			struct hist_entry *left, struct hist_entry *right)
>  {
> -	return __block_info__cmp(left, right);
> +	return __block_info__cmp(left, right, NULL);
>  }
>  
>  static void init_block_info(struct block_info *bi, struct symbol *sym,
> @@ -446,8 +462,10 @@ struct block_report *block_info__create_report(struct evlist *evlist,
>  	evlist__for_each_entry(evlist, pos) {
>  		struct hists *hists = evsel__hists(pos);
>  
> +		hists__output_resort(hists, NULL);
>  		process_block_report(hists, &block_reports[i], total_cycles,
>  				     block_hpps, nr_hpps);
> +		block_reports[i].evsel_idx = pos->idx;
>  		i++;
>  	}
>  
> @@ -496,3 +514,284 @@ float block_info__total_cycles_percent(struct hist_entry *he)
>  
>  	return 0.0;
>  }
> +
> +struct block_report *block_info__get_report(struct block_report *reps,
> +					    int nr_reps, int evsel_idx)
> +{
> +	for (int i = 0; i < nr_reps; i++) {
> +		if (reps[i].evsel_idx == evsel_idx)
> +			return &reps[i];
> +	}
> +
> +	return NULL;
> +}
> +
> +static char *move_to_relpath(char *path, const char *dir)
> +{
> +	const char *d = dir, *end = dir + strlen(dir);
> +	char *s;
> +
> +	/* skip '.' and '/' */
> +	while ((d != end) && ((*d == '.') || (*d == '/')))
> +		d++;
> +
> +	if (d == end)
> +		return NULL;
> +
> +	s = strstr(path, d);
> +	if (!s)
> +		return NULL;
> +
> +	s += strlen(d);
> +
> +	while (*s == '/' && *s != 0)
> +		s++;
> +
> +	if (*s == 0)
> +		return NULL;
> +
> +	return s;
> +}
> +
> +static int block_addr2line(struct hist_entry *he, const char *dir)
> +{
> +	struct block_info *bi = he->block_info;
> +	struct block_line *bl;
> +
> +	if (!he->ms.map || !he->ms.map->dso)
> +		return -1;
> +
> +	bl = zalloc(sizeof(*bl));
> +	if (!bl)
> +		return -1;
> +
> +	symbol_conf.disable_add2line_warn = true;
> +	bl->start_file = get_srcline_split(he->ms.map->dso,
> +					   map__rip_2objdump(he->ms.map,
> +							     bi->sym->start + bi->start),
> +					   &bl->start_nr);
> +	if (!bl->start_file)
> +		goto err;
> +
> +	if (dir)
> +		bl->start_rel = move_to_relpath(bl->start_file, dir);
> +
> +	bl->end_file = get_srcline_split(he->ms.map->dso,
> +					 map__rip_2objdump(he->ms.map,
> +							   bi->sym->start + bi->end),
> +					 &bl->end_nr);
> +	if (!bl->end_file)
> +		goto err;
> +
> +	if (dir)
> +		bl->end_rel = move_to_relpath(bl->end_file, dir);
> +
> +	bi->line = bl;
> +	return 0;
> +
> +err:
> +	free_block_line(&bl);
> +	return -1;
> +}
> +
> +int block_hists_addr2line(struct hists *hists, const char *dir)
> +{
> +	struct rb_root_cached *root = hists->entries_in;
> +	struct rb_node *next = rb_first_cached(root);
> +
> +	while (next != NULL) {
> +		struct hist_entry *he = rb_entry(next, struct hist_entry,
> +						 rb_node_in);
> +
> +		if (!he)
> +			break;
> +
> +		block_addr2line(he, dir);
> +		next = rb_next(&he->rb_node_in);
> +	}
> +
> +	return 0;
> +}
> +
> +bool block_same_srcfiles(struct block_line *bl_a, struct block_line *bl_b)
> +{
> +	if (!bl_a->start_file || !bl_a->end_file ||
> +	    !bl_b->start_file || !bl_b->end_file) {
> +		return false;
> +	}
> +
> +	if (!strcmp(bl_a->start_file, bl_b->start_file) &&
> +	    !strcmp(bl_a->end_file, bl_b->end_file) &&
> +	    !strcmp(bl_a->start_file, bl_a->end_file)) {
> +		return true;
> +	}
> +
> +	if (!bl_a->start_rel || !bl_a->end_rel ||
> +	    !bl_b->start_rel || !bl_b->end_rel) {
> +		return false;
> +	}
> +
> +	if (!strcmp(bl_a->start_rel, bl_b->start_rel) &&
> +	    !strcmp(bl_a->end_rel, bl_b->end_rel) &&
> +	    !strcmp(bl_a->start_rel, bl_a->end_rel)) {
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +void block_line_dump(struct block_info *bi, const char *str)
> +{
> +	if (bi && bi->line) {
> +		pr_debug("%s: %s:%d -> %s:%d\n",
> +			 str,
> +			 bi->line->start_file,
> +			 bi->line->start_nr,
> +			 bi->line->end_file,
> +			 bi->line->end_nr);
> +	}
> +}
> +
> +static bool block_line_matched(struct src_node *node,
> +			       int start_nr_a, int end_nr_a,
> +			       int start_nr_b, int end_nr_b,
> +			       bool *changed)
> +{
> +	int i, j, start_a, end_a, start_b, changed_nr = 0;
> +	struct line_pair *lp;
> +
> +	*changed = false;
> +
> +	if (abs(end_nr_a - start_nr_a) !=
> +	    abs(end_nr_b - start_nr_b)) {
> +		return false;
> +	}
> +
> +	i = start_a = (start_nr_a < end_nr_a) ? start_nr_a : end_nr_a;
> +	end_a = (end_nr_a > start_nr_a) ? end_nr_a : start_nr_a;
> +
> +	j = start_b = (start_nr_b < end_nr_b) ? start_nr_b : end_nr_b;
> +
> +	while (i <= end_a) {
> +		lp = srclist__line_pair(node, i);
> +		if (!lp)
> +			return false;
> +
> +		if (lp->b_nr != j)
> +			changed_nr++;
> +
> +		i++; j++;
> +	}
> +
> +	if ((i == end_a + 1) && (changed_nr == 0))
> +		return true;
> +
> +	/*
> +	 * At least one line is unchanged in this block,
> +	 * we think this block is changed (not a new block).
> +	 */
> +	if (changed_nr < end_a - start_a + 1)
> +		*changed = true;
> +
> +	return false;
> +}
> +
> +bool block_srclist_matched(struct srclist *slist, char *rel_path,
> +			   int start_nr_a, int end_nr_a,
> +			   int start_nr_b, int end_nr_b,
> +			   bool *changed)
> +{
> +	struct src_node *node;
> +	bool ret;
> +
> +	*changed = false;
> +
> +	node = srclist__find(slist, rel_path, true);
> +	if (!node)
> +		return false;
> +
> +	ret = block_line_matched(node, start_nr_a, end_nr_a,
> +				 start_nr_b, end_nr_b, changed);
> +
> +	if (ret) {
> +		pr_debug("block MATCHED (a vs. b)\t\t%s: (%d-%d) vs. (%d-%d)\n",
> +			 node->info.rel_path,
> +			 start_nr_a, end_nr_a,
> +			 start_nr_b, end_nr_b);
> +	} else if (*changed) {
> +		pr_debug("block CHANGED (a vs. b)\t\t%s: (%d-%d) vs. (%d-%d)\n",
> +			 node->info.rel_path,
> +			 start_nr_a, end_nr_a,
> +			 start_nr_b, end_nr_b);
> +	} else {
> +		pr_debug("block UNMATCHED (a vs. b)\t%s: (%d-%d) vs. (%d-%d)\n",
> +			 node->info.rel_path,
> +			 start_nr_a, end_nr_a,
> +			 start_nr_b, end_nr_b);
> +	}
> +
> +	return ret;
> +}
> +
> +static struct hist_entry *get_block_pair(struct hist_entry *he,
> +					 struct hists *hists_pair,
> +					 struct srclist *src_list)
> +{
> +	struct rb_root_cached *root = hists_pair->entries_in;
> +	struct rb_node *next = rb_first_cached(root);
> +	int64_t cmp;
> +
> +	while (next != NULL) {
> +		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
> +						      rb_node_in);
> +
> +		next = rb_next(&he_pair->rb_node_in);
> +
> +		cmp = __block_info__cmp(he_pair, he, src_list);
> +		if (!cmp)
> +			return he_pair;
> +	}
> +
> +	return NULL;
> +}
> +
> +void block_hists_match(struct hists *hists_base,
> +		       struct hists *hists_pair,
> +		       struct srclist *src_list,
> +		       void (*func)(struct hist_entry *,
> +				    struct hist_entry *))
> +{
> +	struct rb_root_cached *root = hists_base->entries_in;
> +	struct rb_node *next = rb_first_cached(root);
> +
> +	while (next != NULL) {
> +		struct hist_entry *he = rb_entry(next, struct hist_entry,
> +						 rb_node_in);
> +		struct hist_entry *pair = get_block_pair(he, hists_pair,
> +							 src_list);
> +
> +		next = rb_next(&he->rb_node_in);
> +
> +		if (pair) {
> +			hist_entry__add_pair(pair, he);
> +
> +			if (func)
> +				(*func)(he, pair);
> +		}
> +	}
> +}
> +
> +int block_info__match_report(struct block_report *rep_base,
> +			     struct block_report *rep_pair,
> +			     struct srclist *src_list,
> +			     void (*func)(struct hist_entry *,
> +					  struct hist_entry *))
> +{
> +	struct block_hist *bh_base = &rep_base->hist;
> +	struct block_hist *bh_pair = &rep_pair->hist;
> +
> +	block_hists_match(&bh_base->block_hists, &bh_pair->block_hists,
> +			  src_list, func);
> +
> +	return 0;
> +}
> diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
> index 42e9dcc4cf0a..458bd998089d 100644
> --- a/tools/perf/util/block-info.h
> +++ b/tools/perf/util/block-info.h
> @@ -20,6 +20,8 @@ struct block_info {
>  	int			num;
>  	int			num_aggr;
>  	refcount_t		refcnt;
> +	struct block_line	*line;
> +	bool			srcline_matched;
>  };
>  
>  struct block_fmt {
> @@ -46,6 +48,7 @@ struct block_report {
>  	u64			cycles;
>  	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
>  	int			nr_fmts;
> +	int			evsel_idx;
>  };
>  
>  struct block_hist;
> @@ -62,7 +65,8 @@ static inline void __block_info__zput(struct block_info **bi)
>  
>  #define block_info__zput(bi) __block_info__zput(&bi)
>  
> -int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right);
> +int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
> +			  struct srclist *src_list __maybe_unused);
>  
>  int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>  			struct hist_entry *left, struct hist_entry *right);
> @@ -83,4 +87,30 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
>  
>  float block_info__total_cycles_percent(struct hist_entry *he);
>  
> +struct block_report *block_info__get_report(struct block_report *reps,
> +					    int nr_reps, int evsel_idx);
> +
> +int block_hists_addr2line(struct hists *hists, const char *dir);
> +
> +void block_line_dump(struct block_info *bi, const char *str);
> +
> +bool block_same_srcfiles(struct block_line *bl_a, struct block_line *bl_b);
> +
> +bool block_srclist_matched(struct srclist *slist, char *rel_path,
> +			   int start_nr_a, int end_nr_a,
> +			   int start_nr_b, int end_nr_b,
> +			   bool *changed);
> +
> +void block_hists_match(struct hists *hists_base,
> +		       struct hists *hists_pair,
> +		       struct srclist *src_list,
> +		       void (*func)(struct hist_entry *,
> +				    struct hist_entry *));
> +
> +int block_info__match_report(struct block_report *rep_base,
> +			     struct block_report *rep_pair,
> +			     struct srclist *src_list,
> +			     void (*func)(struct hist_entry *,
> +					  struct hist_entry *));
> +
>  #endif /* __PERF_BLOCK_H */
> diff --git a/tools/perf/util/srclist.h b/tools/perf/util/srclist.h
> index f25b0de91a13..46866d5c51d7 100644
> --- a/tools/perf/util/srclist.h
> +++ b/tools/perf/util/srclist.h
> @@ -33,6 +33,15 @@ struct srclist {
>  	const char *after_dir;
>  };
>  
> +struct block_line {
> +	char *start_file;
> +	char *end_file;
> +	char *start_rel;
> +	char *end_rel;
> +	unsigned int start_nr;
> +	unsigned int end_nr;
> +};
> +
>  struct srclist *srclist__new(const char *before_dir, const char *after_dir);
>  void srclist__delete(struct srclist *slist);
>  
> -- 
> 2.17.1
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 01/14] perf util: Create source line mapping table
  2020-03-10 15:08   ` Arnaldo Carvalho de Melo
@ 2020-03-11  5:33     ` Jin, Yao
  0 siblings, 0 replies; 23+ messages in thread
From: Jin, Yao @ 2020-03-11  5:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Hi Arnaldo,

On 3/10/2020 11:08 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 10, 2020 at 03:02:32PM +0800, Jin Yao escreveu:
>> Sometimes, a small change in a hot function reducing the cycles of
>> this function, but the overall workload doesn't get faster. It is
>> interesting where the cycles are moved to.
>>
>> What it would like is to diff before/after streams. A stream is a
>> callchain which is aggregated by the branch records from samples.
>>
>> By browsing the hot streams, we can understand the hot code flow.
>> By comparing the cycles variation of same streams between old perf
>> data and new perf data, we can understand if the cycles are moved to
>> the unchanged code.
>>
>> The before stream is the stream before source line changed
>> (e.g. streams in perf.data.old). The after stream is the stream
>> after source line changed (e.g. streams in perf.data).
>>
>> Diffing before/after streams compares all streams (or compares top
>> N hot streams) between two perf data files.
>>
>> If all entries of one stream in perf.data.old are fully matched with
>> all entries of another stream in perf.data, we think these two streams
>> are matched, otherwise the streams are not matched.
>>
>> For example,
>>
>>     cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
>> --------------------------              --------------------------
>>               main div.c:39                           main div.c:39
>>               main div.c:44                           main div.c:44
>>
>> It looks that two streams are matched and we can see for the same
>> streams the cycles are equal and the stream hit percents are
>> slightly changed. That's expected in the normal range.
>>
>> But that's not always true if source lines are changed in perf.data
>> (e.g. div.c:39 is changed). If the source line is changed, they
>> are different streams, we can't compare them. We just think the
>> stream in perf.data is a new one.
>>
>> The challenge is how to identify the changed source lines. The basic
>> idea is to use linux command 'diff' to compare the source file A and
>> source file A* line by line (assume A is used in perf.data.old
>> and A* is updated in perf.data). Create a source line mapping table.
>>
>> For example,
>>
>>    Execute 'diff ./before/div.c ./after/div.c'
>>
>>    25c25
>>    <       i = rand() % 2;
>>    ---
>>    >       i = rand() % 4;
>>    39c39
>>    <       for (i = 0; i < 2000000000; i++) {
>>    ---
>>    >       for (i = 0; i < 20000000001; i++) {
>>
>>    div.c (after -> before) lines mapping:
>>    0 -> 0
>>    1 -> 1
>>    2 -> 2
>>    3 -> 3
>>    4 -> 4
>>    5 -> 5
>>    6 -> 6
>>    7 -> 7
>>    8 -> 8
>>    9 -> 9
>>    ...
>>    24 -> 24
>>    25 -> -1
>>    26 -> 26
>>    27 -> 27
>>    28 -> 28
>>    29 -> 29
>>    30 -> 30
>>    31 -> 31
>>    32 -> 32
>>    33 -> 33
>>    34 -> 34
>>    35 -> 35
>>    36 -> 36
>>    37 -> 37
>>    38 -> 38
>>    39 -> -1
>>    40 -> 40
>>    ...
>>
>>  From the table, we can easily know div.c:39 is source line changed.
>> (it's mapped to -1). So these two streams are not matched.
>>
>> This patch can also handle the cases of new source lines insertion
>> and old source lines deletion. This source line mapping table is a
>> base for next processing for stream comparison.
>>
>> This patch creates a new rblist 'srclist' to manage the source line
>> mapping tables. Each node contains the source line mapping table of
>> a before/after source file pair. The node is created on demand as
>> we process the samples. So it's likely we don't need to create so many
>> nodes.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/util/Build     |   1 +
>>   tools/perf/util/srclist.c | 552 ++++++++++++++++++++++++++++++++++++++
>>   tools/perf/util/srclist.h |  65 +++++
>>   3 files changed, 618 insertions(+)
>>   create mode 100644 tools/perf/util/srclist.c
>>   create mode 100644 tools/perf/util/srclist.h
>>
>> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
>> index c0cf8dff694e..b9bf620b60f0 100644
>> --- a/tools/perf/util/Build
>> +++ b/tools/perf/util/Build
>> @@ -99,6 +99,7 @@ perf-y += call-path.o
>>   perf-y += rwsem.o
>>   perf-y += thread-stack.o
>>   perf-y += spark.o
>> +perf-y += srclist.o
>>   perf-$(CONFIG_AUXTRACE) += auxtrace.o
>>   perf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
>>   perf-$(CONFIG_AUXTRACE) += intel-pt.o
>> diff --git a/tools/perf/util/srclist.c b/tools/perf/util/srclist.c
>> new file mode 100644
>> index 000000000000..cb1d42a3a8ed
>> --- /dev/null
>> +++ b/tools/perf/util/srclist.c
>> @@ -0,0 +1,552 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Manage difference of source lines
>> + * Copyright (c) 2020, Intel Corporation.
>> + * Author: Jin Yao
>> + */
>> +#include <stdlib.h>
>> +#include <stdio.h>
>> +#include <sys/types.h>
>> +#include <inttypes.h>
>> +#include <string.h>
>> +#include <strings.h>
>> +#include <unistd.h>
>> +#include <err.h>
>> +#include <errno.h>
>> +#include <limits.h>
>> +#include <stdbool.h>
>> +#include <linux/zalloc.h>
>> +#include "strlist.h"
>> +#include "srclist.h"
>> +#include "debug.h"
>> +
>> +enum {
>> +	DIFF_LINE_ADD,
>> +	DIFF_LINE_DEL,
>> +	DIFF_LINE_CHANGE
>> +};
>> +
>> +static void get_full_path(const char *dir, const char *rel_path,
>> +			  char *full_path, int size)
>> +{
>> +	if (!dir) {
>> +		snprintf(full_path, size, "%s", rel_path);
>> +		return;
>> +	}
>> +
>> +	if (dir[strlen(dir) - 1] == '/')
>> +		snprintf(full_path, size, "%s%s", dir, rel_path);
>> +	else
>> +		snprintf(full_path, size, "%s/%s", dir, rel_path);
> 
> Just use
> 
> 	snprintf(full_path, size, "%s/%s", dir, rel_path);
> 
> an extra / won't matter, the code will be smaller. At first I thought
> you could use path__join(), please check if that is the case.
> 

Yes, path__join() looks better. Let me try this function.

>> +}
>> +
>> +static int get_line_num(char *path)
> 
> So this is to get how many lines a given file has? I.e. 'wc -l'? Maybe:
> 
> static int path__number_of_lines(const char *path)
> 
> ?
> 

Yes, something like "wc -l".

I will use the new function name "path__number_of_lines".

>> +{
>> +	FILE *fh;
>> +	int num = 0, ch, last = 0;
>> +
>> +	fh = fopen(path, "r");
>> +	if (!fh) {
>> +		pr_debug("Failed to open file %s\n", path);
>> +		return -1;
>> +	}
>> +
>> +	while (!feof(fh)) {
>> +		ch = fgetc(fh);
>> +		if (ch == '\n')
>> +			num++;
>> +		last = ch;
> 
> Maybe use fgets() instead as it already stops at '\n'?
> 

Yes, fgets() can be used here to replace fgetc().

>> +	}
>> +
>> +	fclose(fh);
>> +
>> +	if (last != '\n')
>> +		num++;
>> +
>> +	return num;
>> +}
>> +
>> +static int get_digit(char *str, char **end_str, int *val)
>> +{
>> +	int v;
>> +	char *end;
>> +
>> +	errno = 0;
>> +	v = strtol(str, &end, 10);
>> +	if ((errno == ERANGE && (v == INT_MAX || v == INT_MIN))
>> +		|| (errno != 0 && v == 0)) {
>> +		return -1;
>> +	}
>> +
>> +	if (end == str)
>> +		return -1;
>> +
>> +	*val = v;
>> +	*end_str = end;
>> +	return 0;
>> +}
>> +
>> +static int get_range(char *str, int *start_val, int *end_val)
>> +{
>> +	int val, ret;
>> +	char *end, *next;
>> +
>> +	*start_val = -1;
>> +	*end_val = -1;
>> +
>> +	ret = get_digit(str, &end, &val);
>> +	if (ret)
>> +		return ret;
>> +
>> +	*start_val = val;
>> +
>> +	if (*end != '\0') {
>> +		next = end + 1;
>> +		ret = get_digit(next, &end, &val);
>> +		if (ret)
>> +			return ret;
>> +
>> +		*end_val = val;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int opr_str_idx(char *str, int len, char ch)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < len; i++) {
>> +		if (str[i] == ch)
>> +			break;
>> +	}
>> +
>> +	if (i == len || i == 0 || i == len - 1)
>> +		return -1;
>> +
>> +	if (str[i - 1] >= '0' && str[i - 1] <= '9' &&
>> +	    str[i + 1] >= '0' && str[i + 1] <= '9') {
>> +		return i;
>> +	}
>> +
>> +	return -1;
>> +}
>> +
>> +static bool get_diff_operation(char *str, int *opr, int *b_start, int *b_end,
>> +			       int *a_start, int *a_end)
>> +{
>> +	int idx, len;
>> +
>> +	if (str[0] == '<' || str[0] == '>' || str[0] == '-')
>> +		return false;
>> +
>> +	len = strlen(str);
>> +
>> +	/*
>> +	 * For example,
>> +	 * 5,6d4
>> +	 * 8a7
>> +	 * 20,21c19
>> +	 */
>> +	idx = opr_str_idx(str, len, 'd');
>> +	if (idx != -1) {
>> +		*opr = DIFF_LINE_DEL;
>> +		goto exit;
>> +	}
>> +
>> +	idx = opr_str_idx(str, len, 'a');
>> +	if (idx != -1) {
>> +		*opr = DIFF_LINE_ADD;
>> +		goto exit;
>> +	}
>> +
>> +	idx = opr_str_idx(str, len, 'c');
>> +	if (idx != -1) {
>> +		*opr = DIFF_LINE_CHANGE;
>> +		goto exit;
>> +	}
>> +
>> +exit:
>> +	if (idx != -1) {
>> +		str[idx] = 0;
>> +		get_range(str, b_start, b_end);
>> +		get_range(&str[idx + 1], a_start, a_end);
> 
> get_range() may return -1, don't you have to check this here?
> 

Thanks for reminding, I should add checking here.

>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +static int del_lines(struct line_pair *lines, int b_start, int b_end,
>> +		     int a_start, int a_end __maybe_unused,
>> +		     int *nr_lines, int *b_adjust)
>> +{
>> +	int i = *nr_lines;
>> +	bool set = false;
>> +
>> +	/*
>> +	 * For example, 5,6d4
>> +	 * It means line5~line6 of file1 are not same as line4 of file2
>> +	 * since line5~line6 are deleted.
>> +	 */
>> +
>> +	if (a_start == i) {
>> +		lines[i].a_nr = i;
>> +		lines[i].b_nr = i + *b_adjust;
>> +		set = true;
>> +	}
>> +
>> +	if (b_end != -1)
>> +		*b_adjust += b_end - b_start + 1;
>> +	else
>> +		*b_adjust += 1;
>> +
>> +	if (!set) {
>> +		lines[i].a_nr = i;
>> +		lines[i].b_nr = i + *b_adjust;
>> +	}
>> +
>> +	*nr_lines = i + 1;
>> +	return 0;
>> +}
>> +
>> +static int add_lines(struct line_pair *lines,
>> +		     int b_start __maybe_unused, int b_end __maybe_unused,
>> +		     int a_start, int a_end, int *nr_lines, int *b_adjust)
>> +{
>> +	int i = *nr_lines;
>> +
>> +	/*
>> +	 * For example, 8a7
>> +	 * It means line8 of file1 is not same as line7 of file2
>> +	 * since a new line (line7) is inserted to file2.
>> +	 */
>> +	if (a_end != -1) {
>> +		for (int j = 0; j < a_end - a_start + 1; j++) {
>> +			lines[i].a_nr = i;
>> +			lines[i].b_nr = -1;
>> +			i++;
>> +		}
>> +		*b_adjust -= a_end - a_start + 1;
>> +	} else {
>> +		lines[i].a_nr = i;
>> +		lines[i].b_nr = -1;
>> +		i++;
>> +		*b_adjust -= 1;
>> +	}
>> +
>> +	*nr_lines = i;
>> +	return 0;
>> +}
>> +
>> +static int change_lines(struct line_pair *lines, int b_start, int b_end,
>> +			int a_start, int a_end, int *nr_lines, int *b_adjust)
>> +{
>> +	int i = *nr_lines;
>> +
>> +	/*
>> +	 * For example, 20,21c19
>> +	 * It means line20~line21 of file1 are not same as line19 of file2
>> +	 * since they are changed.
>> +	 */
>> +
>> +	if (a_end != -1) {
>> +		for (int j = 0; j < a_end - a_start + 1; j++) {
>> +			lines[i].a_nr = i;
>> +			lines[i].b_nr = -1;
>> +			i++;
>> +		}
>> +	} else {
>> +		lines[i].a_nr = i;
>> +		lines[i].b_nr = -1;
>> +		i++;
>> +	}
>> +
>> +	if (b_end != -1)
>> +		*b_adjust += b_end - b_start;
>> +
>> +	if (a_end != -1)
>> +		*b_adjust -= a_end - a_start;
>> +
>> +	*nr_lines = i;
>> +	return 0;
>> +}
>> +
>> +static int update_lines(struct line_pair *lines, int opr, int b_start,
>> +			int b_end, int a_start, int a_end, int *nr_lines,
>> +			int *b_adjust)
> 
> 
> Functions operating on thisa 'struct line_pair' should have the
> line_pair__ prefix, so that we can use ctags, etc, as we have other
> places that process lines, etc, which may end up clashing.
> 

OK, I will use the function names like line_pair__change_lines(), 
line_pair__update_lines() and etc.

>> +{
>> +	int i = *nr_lines;
>> +
>> +	while (i < a_start) {
>> +		lines[i].a_nr = i;
>> +		lines[i].b_nr = i + *b_adjust;
>> +		i++;
>> +	}
>> +
>> +	*nr_lines = i;
>> +
>> +	switch (opr) {
>> +	case DIFF_LINE_DEL:
>> +		del_lines(lines, b_start, b_end, a_start, a_end,
>> +			  nr_lines, b_adjust);
>> +		break;
>> +
>> +	case DIFF_LINE_ADD:
>> +		add_lines(lines, b_start, b_end, a_start, a_end,
>> +			  nr_lines, b_adjust);
>> +		break;
>> +
>> +	case DIFF_LINE_CHANGE:
>> +		change_lines(lines, b_start, b_end, a_start, a_end,
>> +			     nr_lines, b_adjust);
>> +		break;
>> +
>> +	default:
>> +		break;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void update_remaining(struct line_pair *lines, int a_num, int *nr_lines,
>> +			     int b_adjust)
>> +{
>> +	int i;
>> +
>> +	for (i = *nr_lines; i <= a_num; i++) {
>> +		lines[i].a_nr = i;
>> +		lines[i].b_nr = i + b_adjust;
>> +	}
>> +
>> +	*nr_lines = i;
>> +}
>> +
>> +static int build_mapping_table(FILE *diff_fd, struct line_pair *lines,
>> +			       int *nr_lines, int a_num)
>> +{
>> +	char *str, *p;
>> +	size_t len;
>> +	int opr, b_start, b_end, a_start, a_end, b_adjust = 0;
>> +
>> +	*nr_lines = 1;  /* First line is reserved */
>> +
>> +	while (!feof(diff_fd)) {
>> +		str = NULL;
> 
> Forgot to set len to zero? (See getline manpage)
> 

I've set str to NULL yet. Yes, I should set len to 0 too. The manpage says

"If *lineptr is set to NULL and *n is set 0 before the call, then 
getline() will allocate a buffer for storing the line."

It mentions the "and" here.

>> +		if (getline(&str, &len, diff_fd) < 0 || !len)
>> +			break;
>> +
>> +		p = strchr(str, '\n');
>> +		if (p)
>> +			*p = '\0';
>> +
>> +		pr_debug("%s\n", str);
>> +
>> +		if (!get_diff_operation(str, &opr, &b_start, &b_end,
>> +					&a_start, &a_end)) {
>> +			continue;
>> +		}
>> +
>> +		update_lines(lines, opr, b_start, b_end, a_start, a_end,
>> +			     nr_lines, &b_adjust);
> 
> So, since str was allocated, where is it being kept? above you're always
> setting it to NULL before calling getline.
> 

Let me update the code as following.

size_t len = 0;
char *str = NULL;

while (getline(&str, &len, diff_fd) > 0) {
	...
}

free(str);

>> +	}
>> +
>> +	update_remaining(lines, a_num, nr_lines, b_adjust);
>> +
>> +	free(str);
>> +	return 0;
>> +}
>> +
>> +static int create_line_mapping(struct src_info *info, char *b_path,
>> +			       char *a_path)
> 
> src_info__create_line_mappings() ?
>

OK, I will change it to src_info__create_line_mappings().

>> +{
>> +	char cmd[PATH_MAX];
>> +	FILE *diff_fd = NULL;
>> +	int ret = -1;
>> +
>> +	info->nr_lines_before = get_line_num(b_path);
>> +	if (info->nr_lines_before == -1)
>> +		goto out;
>> +
>> +	info->nr_lines_after = get_line_num(a_path);
>> +	if (info->nr_lines_after == -1)
>> +		goto out;
>> +
>> +	/* Reserve first line */
>> +	info->nr_max = info->nr_lines_before + info->nr_lines_after + 1;
>> +	info->lines = calloc(info->nr_max, sizeof(struct line_pair));
>> +	if (!info->lines) {
>> +		pr_err("Failed to alloc memory for lines\n");
>> +		goto out;
>> +	}
>> +
>> +	snprintf(cmd, sizeof(cmd), "diff %s %s",
>> +		 b_path, a_path);
>> +
>> +	pr_debug("Execute '%s'\n", cmd);
>> +
>> +	diff_fd = popen(cmd, "r");
>> +	if (!diff_fd) {
>> +		pr_err("Failed to execute '%s'\n", cmd);
>> +		goto out;
>> +	}
>> +
>> +	ret = build_mapping_table(diff_fd, info->lines, &info->nr_lines,
>> +				  info->nr_lines_after);
>> +
>> +out:
>> +	if (diff_fd)
>> +		fclose(diff_fd);
>> +
>> +	return ret;
>> +}
>> +
>> +static void free_src_info(struct src_info *info)
>> +{
>> +	if (info->rel_path)
>> +		zfree(&info->rel_path);
>> +
>> +	if (info->lines)
>> +		zfree(&info->lines);
> 
> No need to check if it is NULL, call zfree() straight away.
> 

OK

>> +}
>> +
>> +static int init_src_info(char *b_path, char *a_path, const char *rel_path,
>> +			 struct src_info *info)
>> +{
>> +	int ret;
>> +
>> +	info->rel_path = strdup(rel_path);
>> +	if (!info->rel_path)
>> +		return -1;
>> +
>> +	ret = create_line_mapping(info, b_path, a_path);
>> +	if (ret) {
>> +		free_src_info(info);
>> +		return ret;
>> +	}
>> +
>> +	return 0;
> 
> 	if (ret)
> 		free_src_info(info);
> 
> 	return ret;
> 
> Should be equivalent and shorter, no?
> 
> 

Yes, much better. :)

>> +}
>> +
>> +static void free_src_node(struct src_node *node)
>> +{
> 
> Free routines should accept a NULL arg, i.e. first thing:
> 
>          if (node == NULL)
> 		return;
> 

Yes, will add NULL check here.

>> +	free_src_info(&node->info);
>> +	free(node);
>> +}
>> +
>> +static struct rb_node *srclist__node_new(struct rblist *rblist,
>> +					 const void *entry)
>> +{
>> +	struct srclist *slist = container_of(rblist, struct srclist, rblist);
>> +	const char *rel_path = entry;
>> +	char b_path[PATH_MAX], a_path[PATH_MAX];
>> +	struct src_node *node;
>> +	int ret;
>> +
>> +	get_full_path(slist->before_dir, rel_path, b_path, PATH_MAX);
>> +	get_full_path(slist->after_dir, rel_path, a_path, PATH_MAX);
>> +
>> +	node = calloc(1, sizeof(*node));
> 
> 	zalloc()?
> 

Will change it to zalloc().

>> +	if (!node)
>> +		return NULL;
>> +
>> +	ret = init_src_info(b_path, a_path, rel_path, &node->info);
>> +	if (ret)
>> +		goto err;
>> +
>> +	return &node->rb_node;
>> +
>> +err:
>> +	free_src_node(node);
>> +	return NULL;
>> +}
>> +
>> +static void srclist__node_delete(struct rblist *rblist __maybe_unused,
>> +				 struct rb_node *rb_node)
>> +{
>> +	struct src_node *node = container_of(rb_node, struct src_node, rb_node);
>> +
>> +	free_src_info(&node->info);
>> +	free(node);
>> +}
>> +
>> +static int srclist__node_cmp(struct rb_node *rb_node, const void *entry)
>> +{
>> +	struct src_node *node = container_of(rb_node, struct src_node, rb_node);
>> +	const char *str = entry;
>> +
>> +	return strcmp(node->info.rel_path, str);
>> +}
>> +
>> +struct srclist *srclist__new(const char *before_dir, const char *after_dir)
>> +{
>> +	struct srclist *slist = calloc(1, sizeof(*slist));
> 
> zalloc()
> 

zalloc :)

>> +
>> +	if (!slist)
>> +		return NULL;
>> +
>> +	rblist__init(&slist->rblist);
>> +	slist->rblist.node_cmp = srclist__node_cmp;
>> +	slist->rblist.node_new = srclist__node_new;
>> +	slist->rblist.node_delete = srclist__node_delete;
>> +
>> +	slist->before_dir = before_dir;
>> +	slist->after_dir = after_dir;
>> +	return slist;
>> +}
>> +
>> +void srclist__delete(struct srclist *slist)
>> +{
>> +	if (slist)
>> +		rblist__delete(&slist->rblist);
> 
> 
> Correct, delete operations should accept NULL, just like free()
> 

Yes.

Sorry for so many isues in this patch. I will update according to your 
suggestions.

Thanks
Jin Yao

>> +}
>> +
>> +struct src_node *srclist__find(struct srclist *slist, char *rel_path,
>> +			       bool create)
>> +{
>> +	struct src_node *node = NULL;
>> +	struct rb_node *rb_node;
>> +
>> +	if (create)
>> +		rb_node = rblist__findnew(&slist->rblist, (void *)rel_path);
>> +	else
>> +		rb_node = rblist__find(&slist->rblist, (void *)rel_path);
>> +
>> +	if (rb_node)
>> +		node = container_of(rb_node, struct src_node, rb_node);
>> +
>> +	return node;
>> +}
>> +
>> +struct line_pair *srclist__line_pair(struct src_node *node, int a_nr)
>> +{
>> +	struct src_info *info = &node->info;
>> +
>> +	if (a_nr < info->nr_lines && a_nr > 0)
>> +		return &info->lines[a_nr];
>> +
>> +	pr_debug("Exceeds line range: nr_lines = %d, a_nr = %d\n",
>> +		 info->nr_lines, a_nr);
>> +
>> +	return NULL;
>> +}
>> +
>> +void srclist__dump(struct srclist *slist)
>> +{
>> +	struct src_node *node;
>> +	struct src_info *info;
>> +	int i;
>> +
>> +	srclist__for_each_entry(node, slist) {
>> +		info = &node->info;
>> +
>> +		pr_debug("%s (after -> before) lines mapping:\n",
>> +			 info->rel_path);
>> +
>> +		for (i = 0; i < info->nr_lines; i++) {
>> +			pr_debug("%d -> %d\n",
>> +				 info->lines[i].a_nr,
>> +				 info->lines[i].b_nr);
>> +		}
>> +	}
>> +}
>> diff --git a/tools/perf/util/srclist.h b/tools/perf/util/srclist.h
>> new file mode 100644
>> index 000000000000..f25b0de91a13
>> --- /dev/null
>> +++ b/tools/perf/util/srclist.h
>> @@ -0,0 +1,65 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef __PERF_SRCLIST_H
>> +#define __PERF_SRCLIST_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/rbtree.h>
>> +#include "rblist.h"
>> +
>> +struct line_pair {
>> +	int a_nr;	/* line nr in after.c */
>> +	int b_nr;	/* line nr in before.c */
>> +};
>> +
>> +struct src_node;
>> +
>> +struct src_info {
>> +	char *rel_path; /* relative path */
>> +	struct line_pair *lines;
>> +	int nr_max;
>> +	int nr_lines;
>> +	int nr_lines_before;
>> +	int nr_lines_after;
>> +};
>> +
>> +struct src_node {
>> +	struct rb_node rb_node;
>> +	struct src_info info;
>> +};
>> +
>> +struct srclist {
>> +	struct rblist rblist;
>> +	const char *before_dir;
>> +	const char *after_dir;
>> +};
>> +
>> +struct srclist *srclist__new(const char *before_dir, const char *after_dir);
>> +void srclist__delete(struct srclist *slist);
>> +
>> +struct src_node *srclist__find(struct srclist *slist, char *rel_path,
>> +			       bool create);
>> +
>> +struct line_pair *srclist__line_pair(struct src_node *node, int a_nr);
>> +void srclist__dump(struct srclist *slist);
>> +
>> +static inline struct src_node *srclist__first(struct srclist *slist)
>> +{
>> +	struct rb_node *rn = rb_first_cached(&slist->rblist.entries);
>> +
>> +	return rn ? rb_entry(rn, struct src_node, rb_node) : NULL;
>> +}
>> +
>> +static inline struct src_node *srclist__next(struct src_node *sn)
>> +{
>> +	struct rb_node *rn;
>> +
>> +	if (!sn)
>> +		return NULL;
>> +	rn = rb_next(&sn->rb_node);
>> +	return rn ? rb_entry(rn, struct src_node, rb_node) : NULL;
>> +}
>> +
>> +#define srclist__for_each_entry(pos, slist)	\
>> +	for (pos = srclist__first(slist); pos; pos = srclist__next(pos))
>> +
>> +#endif /* __PERF_SRCLIST_H */
>> -- 
>> 2.17.1
>>
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains
  2020-03-10 15:11   ` Arnaldo Carvalho de Melo
@ 2020-03-11  5:38     ` Jin, Yao
  0 siblings, 0 replies; 23+ messages in thread
From: Jin, Yao @ 2020-03-11  5:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/10/2020 11:11 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 10, 2020 at 03:02:33PM +0800, Jin Yao escreveu:
>> We think the stream is a callchain which is aggregated by the LBR
>> records from samples. By browsing the stream, we can understand
>> the code flow.
>>
>> The struct callchain_node represents one callchain and we use the
>> callchain_node->hit to measure the hot level of this callchain.
>> Higher is hotter.
>>
>> Since in perf data file, there may be many callchains so we just
>> need to focus on the top N hottest callchains. N is a user defined
>> parameter or just a predefined default value.
>>
>> This patch saves the top N hottest callchains in 'struct stream_node'
>> type array, which is defined in a per event 'struct callchain_streams'.
>>
>> So now we can get the per-event top N hottest callchains.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/util/callchain.c | 125 ++++++++++++++++++++++++++++++++++++
>>   tools/perf/util/callchain.h |  16 +++++
>>   2 files changed, 141 insertions(+)
>>
>> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
>> index 818aa4efd386..d9c68a8e7619 100644
>> --- a/tools/perf/util/callchain.c
>> +++ b/tools/perf/util/callchain.c
>> @@ -31,6 +31,7 @@
>>   #include "callchain.h"
>>   #include "branch.h"
>>   #include "symbol.h"
>> +#include "evlist.h"
>>   #include "../perf.h"
>>   
>>   #define CALLCHAIN_PARAM_DEFAULT			\
>> @@ -1599,3 +1600,127 @@ void callchain_cursor_reset(struct callchain_cursor *cursor)
>>   	for (node = cursor->first; node != NULL; node = node->next)
>>   		map__zput(node->ms.map);
>>   }
>> +
>> +static void free_evsel_streams(struct callchain_streams *callchain_streams,
>> +			       int nr_evsel)
>> +{
>> +	for (int i = 0; i < nr_evsel; i++) {
>> +		if (callchain_streams[i].streams)
>> +			free(callchain_streams[i].streams);
> 
> free(NULL) is valid, so remove that extra check and use zfree() to reset
> that entry to NULL, i.e.:
> 
> 	for ()
> 		zfree(&callchain_streams[i].streams);
> 

Yes, will use zfree() here.

>> +	}
>> +
>> +	free(callchain_streams);
>> +}
>> +
>> +static struct callchain_streams *create_evsel_streams(int nr_evsel,
>> +						      int nr_streams_max)
>> +{
>> +	struct callchain_streams *callchain_streams;
>> +
>> +	callchain_streams = calloc(nr_evsel, sizeof(struct callchain_streams));
> 
> calloc is the right thing here, as this is an array
> 

Yes. If just allocation for one element, I should use zalloc().

Thanks
Jin Yao

>> +	if (!callchain_streams)
>> +		return NULL;
>> +
>> +	for (int i = 0; i < nr_evsel; i++) {
>> +		struct callchain_streams *s = &callchain_streams[i];
>> +
>> +		s->streams = calloc(nr_streams_max, sizeof(struct stream_node));
>> +		if (!s->streams)
>> +			goto err;
>> +
>> +		s->nr_streams_max = nr_streams_max;
>> +		s->evsel_idx = -1;
>> +	}
>> +
>> +	return callchain_streams;
>> +
>> +err:
>> +	free_evsel_streams(callchain_streams, nr_evsel);
>> +	return NULL;
>> +}
>> +
>> +/*
>> + * The cnodes with high hit number are hot callchains.
>> + */
>> +static void set_hot_cnode(struct callchain_streams *s,
>> +			  struct callchain_node *cnode)
>> +{
>> +	int i, idx = 0;
>> +	u64 hit;
>> +
>> +	if (s->nr_streams < s->nr_streams_max) {
>> +		i = s->nr_streams;
>> +		s->streams[i].cnode = cnode;
>> +		s->nr_streams++;
>> +		return;
>> +	}
>> +
>> +	/*
>> +	 * Since only a few number of hot streams, so only use simple
>> +	 * way to find the cnode with smallest hit number and replace.
>> +	 */
>> +	hit = (s->streams[0].cnode)->hit;
>> +	for (i = 1; i < s->nr_streams; i++) {
>> +		if ((s->streams[i].cnode)->hit < hit) {
>> +			hit = (s->streams[i].cnode)->hit;
>> +			idx = i;
>> +		}
>> +	}
>> +
>> +	if (cnode->hit > hit)
>> +		s->streams[idx].cnode = cnode;
>> +}
>> +
>> +static void update_hot_streams(struct hist_entry *he,
>> +			       struct callchain_streams *s)
>> +{
>> +	struct rb_root *root = &he->sorted_chain;
>> +	struct rb_node *rb_node = rb_first(root);
>> +	struct callchain_node *node;
>> +
>> +	while (rb_node) {
>> +		node = rb_entry(rb_node, struct callchain_node, rb_node);
>> +		set_hot_cnode(s, node);
>> +		rb_node = rb_next(rb_node);
>> +	}
>> +}
>> +
>> +static void get_hot_streams(struct hists *hists,
>> +			    struct callchain_streams *s)
>> +{
>> +	struct rb_node *next;
>> +
>> +	next = rb_first_cached(&hists->entries);
>> +	while (next) {
>> +		struct hist_entry *he;
>> +
>> +		he = rb_entry(next, struct hist_entry, rb_node);
>> +		update_hot_streams(he, s);
>> +		next = rb_next(&he->rb_node);
>> +	}
>> +}
>> +
>> +struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
>> +							 int nr_streams_max,
>> +							 int *nr_evsel_streams)
>> +{
>> +	int nr_evsel = evlist->core.nr_entries, i = 0;
>> +	struct callchain_streams *callchain_streams;
>> +	struct evsel *pos;
>> +
>> +	callchain_streams = create_evsel_streams(nr_evsel, nr_streams_max);
>> +	if (!callchain_streams)
>> +		return NULL;
>> +
>> +	evlist__for_each_entry(evlist, pos) {
>> +		struct hists *hists = evsel__hists(pos);
>> +
>> +		hists__output_resort(hists, NULL);
>> +		get_hot_streams(hists, &callchain_streams[i]);
>> +		callchain_streams[i].evsel_idx = pos->idx;
>> +		i++;
>> +	}
>> +
>> +	*nr_evsel_streams = nr_evsel;
>> +	return callchain_streams;
>> +}
>> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
>> index 706bb7bbe1e1..5852990cdf60 100644
>> --- a/tools/perf/util/callchain.h
>> +++ b/tools/perf/util/callchain.h
>> @@ -13,6 +13,7 @@ struct ip_callchain;
>>   struct map;
>>   struct perf_sample;
>>   struct thread;
>> +struct evlist;
>>   
>>   #define HELP_PAD "\t\t\t\t"
>>   
>> @@ -159,6 +160,17 @@ struct callchain_cursor {
>>   	struct callchain_cursor_node	*curr;
>>   };
>>   
>> +struct stream_node {
>> +	struct callchain_node	*cnode;
>> +};
>> +
>> +struct callchain_streams {
>> +	struct stream_node	*streams;
>> +	int			nr_streams_max;
>> +	int			nr_streams;
>> +	int			evsel_idx;
>> +};
>> +
>>   extern __thread struct callchain_cursor callchain_cursor;
>>   
>>   static inline void callchain_init(struct callchain_root *root)
>> @@ -289,4 +301,8 @@ int callchain_branch_counts(struct callchain_root *root,
>>   			    u64 *branch_count, u64 *predicted_count,
>>   			    u64 *abort_count, u64 *cycles_count);
>>   
>> +struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
>> +							 int nr_streams_max,
>> +							 int *nr_evsel_streams);
>> +
>>   #endif	/* __PERF_CALLCHAIN_H */
>> -- 
>> 2.17.1
>>
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 05/14] perf util: Calculate the sum of all streams hits
  2020-03-10 15:14   ` Arnaldo Carvalho de Melo
@ 2020-03-11  5:44     ` Jin, Yao
  0 siblings, 0 replies; 23+ messages in thread
From: Jin, Yao @ 2020-03-11  5:44 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/10/2020 11:14 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 10, 2020 at 03:02:36PM +0800, Jin Yao escreveu:
>> We have used callchain_node->hit to measure the hot level of one
>> stream. This patch calculates the sum of hits of all streams.
>>
>> Then in next patch, we can use following formula to report hot
>> percent for one stream.
>>
>> hot percent = callchain_node->hit / sum of all hits
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/util/callchain.c | 35 +++++++++++++++++++++++++++++++++++
>>   tools/perf/util/callchain.h |  1 +
>>   2 files changed, 36 insertions(+)
>>
>> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
>> index a9dd91268b00..040995405664 100644
>> --- a/tools/perf/util/callchain.c
>> +++ b/tools/perf/util/callchain.c
>> @@ -1685,6 +1685,39 @@ static void update_hot_streams(struct hist_entry *he,
>>   	}
>>   }
>>   
>> +static u64 count_callchain_hits(struct hist_entry *he)
>> +{
>> +	struct rb_root *root = &he->sorted_chain;
>> +	struct rb_node *rb_node = rb_first(root);
>> +	struct callchain_node *node;
>> +	u64 chain_hits = 0;
>> +
>> +	while (rb_node) {
>> +		node = rb_entry(rb_node, struct callchain_node, rb_node);
>> +		chain_hits += node->hit;
>> +		rb_node = rb_next(rb_node);
>> +	}
>> +
>> +	return chain_hits;
>> +}
>> +
>> +static u64 total_callchain_hits(struct hists *hists)
>> +{
>> +	struct rb_node *next;
>> +	u64 chain_hits = 0;
>> +
>> +	next = rb_first_cached(&hists->entries);
> 
> Try to combine the variable decl line with its initial assignment,
> saving one line, i.e.:
> 
> +static u64 total_callchain_hits(struct hists *hists)
> +{
> +	struct rb_node *next = rb_first_cached(&hists->entries);
> +	u64 chain_hits = 0;
> +

Yes, combining to one line is easy for reading. I will update the code.

>> +	while (next) {
>> +		struct hist_entry *he;
>> +
>> +		he = rb_entry(next, struct hist_entry, rb_node);
> 
> Ditto:
> 
> +		struct hist_entry *he = rb_entry(next, struct hist_entry, rb_node);
> 

Got it.

Thanks
Jin Yao

>> +		chain_hits += count_callchain_hits(he);
>> +		next = rb_next(&he->rb_node);
>> +	}
>> +
>> +	return chain_hits;
>> +}
>> +
>>   static void get_hot_streams(struct hists *hists,
>>   			    struct callchain_streams *s)
>>   {
>> @@ -1698,6 +1731,8 @@ static void get_hot_streams(struct hists *hists,
>>   		update_hot_streams(he, s);
>>   		next = rb_next(&he->rb_node);
>>   	}
>> +
>> +	s->chain_hits = total_callchain_hits(hists);
>>   }
>>   
>>   struct callchain_streams *callchain_evsel_streams_create(struct evlist *evlist,
>> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
>> index c996ab4fb108..3c0e0b45656b 100644
>> --- a/tools/perf/util/callchain.h
>> +++ b/tools/perf/util/callchain.h
>> @@ -173,6 +173,7 @@ struct callchain_streams {
>>   	int			nr_streams_max;
>>   	int			nr_streams;
>>   	int			evsel_idx;
>> +	u64			chain_hits;
>>   };
>>   
>>   extern __thread struct callchain_cursor callchain_cursor;
>> -- 
>> 2.17.1
>>
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison
  2020-03-10 15:17   ` Arnaldo Carvalho de Melo
@ 2020-03-11  5:47     ` Jin, Yao
  0 siblings, 0 replies; 23+ messages in thread
From: Jin, Yao @ 2020-03-11  5:47 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/10/2020 11:17 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 10, 2020 at 03:02:39PM +0800, Jin Yao escreveu:
>> It's also useful to figure out the top N hottest blocks from old perf
>> data file and figure out the top N hottest blocks from new perf data file,
>> and then compare them for the cycles diff. It can let us easily know
>> how many cycles are moved from one block to another block.
>>
>> This patch adds new helper functions and data structures for the block
>> comparison.
>>
>> And it also updates the existing perf-diff to be compatible with
>> the new interface.
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>>   tools/perf/builtin-diff.c    |  45 +-----
>>   tools/perf/util/block-info.c | 305 ++++++++++++++++++++++++++++++++++-
>>   tools/perf/util/block-info.h |  32 +++-
>>   tools/perf/util/srclist.h    |   9 ++
>>   4 files changed, 345 insertions(+), 46 deletions(-)
>>
>> diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
>> index 9c801b9bc5bb..dcbc9bba4e61 100644
>> --- a/tools/perf/builtin-diff.c
>> +++ b/tools/perf/builtin-diff.c
>> @@ -598,27 +598,6 @@ static void init_block_hist(struct block_hist *bh)
>>   	bh->valid = true;
>>   }
>>   
>> -static struct hist_entry *get_block_pair(struct hist_entry *he,
>> -					 struct hists *hists_pair)
>> -{
>> -	struct rb_root_cached *root = hists_pair->entries_in;
>> -	struct rb_node *next = rb_first_cached(root);
>> -	int64_t cmp;
>> -
>> -	while (next != NULL) {
>> -		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
>> -						      rb_node_in);
>> -
>> -		next = rb_next(&he_pair->rb_node_in);
>> -
>> -		cmp = __block_info__cmp(he_pair, he);
>> -		if (!cmp)
>> -			return he_pair;
>> -	}
>> -
>> -	return NULL;
>> -}
>> -
>>   static void init_spark_values(unsigned long *svals, int num)
>>   {
>>   	for (int i = 0; i < num; i++)
>> @@ -665,26 +644,6 @@ static void compute_cycles_diff(struct hist_entry *he,
>>   	}
>>   }
>>   
>> -static void block_hists_match(struct hists *hists_base,
>> -			      struct hists *hists_pair)
>> -{
>> -	struct rb_root_cached *root = hists_base->entries_in;
>> -	struct rb_node *next = rb_first_cached(root);
>> -
>> -	while (next != NULL) {
>> -		struct hist_entry *he = rb_entry(next, struct hist_entry,
>> -						 rb_node_in);
>> -		struct hist_entry *pair = get_block_pair(he, hists_pair);
>> -
>> -		next = rb_next(&he->rb_node_in);
>> -
>> -		if (pair) {
>> -			hist_entry__add_pair(pair, he);
>> -			compute_cycles_diff(he, pair);
>> -		}
>> -	}
>> -}
>> -
>>   static void hists__precompute(struct hists *hists)
>>   {
>>   	struct rb_root_cached *root;
>> @@ -737,7 +696,9 @@ static void hists__precompute(struct hists *hists)
>>   
>>   				if (bh->valid && pair_bh->valid) {
>>   					block_hists_match(&bh->block_hists,
>> -							  &pair_bh->block_hists);
>> +							  &pair_bh->block_hists,
>> +							  NULL,
>> +							  compute_cycles_diff);
>>   					hists__output_resort(&pair_bh->block_hists,
>>   							     NULL);
>>   				}
>> diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
>> index 423ec69bda6c..247c87b8df56 100644
>> --- a/tools/perf/util/block-info.c
>> +++ b/tools/perf/util/block-info.c
>> @@ -12,6 +12,7 @@
>>   #include "evlist.h"
>>   #include "hist.h"
>>   #include "ui/browsers/hists.h"
>> +#include "debug.h"
>>   
>>   static struct block_header_column {
>>   	const char *name;
>> @@ -50,10 +51,24 @@ struct block_info *block_info__get(struct block_info *bi)
>>   	return bi;
>>   }
>>   
>> +static void free_block_line(struct block_line **bl)
> 
> static void block_line__zdelete(struct block_line **bl)
> 

Sorry about that. I should use the struct name as the perfix. I will 
keep in mind.

>> +{
>> +	if ((*bl)->start_file)
>> +		free((*bl)->start_file);
>> +
>> +	if ((*bl)->end_file)
>> +		free((*bl)->end_file);
>> +
>> +	zfree(bl);
>> +}
>> +
>>   void block_info__put(struct block_info *bi)
> 
> Correct naming, cool.
> 

:)

Thanks
Jin Yao

>>   {
>> -	if (bi && refcount_dec_and_test(&bi->refcnt))
>> +	if (bi && refcount_dec_and_test(&bi->refcnt)) {
>> +		if (bi->line)
>> +			free_block_line(&bi->line);
>>   		free(bi);
>> +	}
>>   }
>>   
>>   struct block_info *block_info__new(void)
>> @@ -65,7 +80,8 @@ struct block_info *block_info__new(void)
>>   	return bi;
>>   }
>>   
>> -int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right)
>> +int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
>> +			  struct srclist *src_list __maybe_unused)
>>   {
>>   	struct block_info *bi_l = left->block_info;
>>   	struct block_info *bi_r = right->block_info;
>> @@ -93,7 +109,7 @@ int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right)
>>   int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>>   			struct hist_entry *left, struct hist_entry *right)
>>   {
>> -	return __block_info__cmp(left, right);
>> +	return __block_info__cmp(left, right, NULL);
>>   }
>>   
>>   static void init_block_info(struct block_info *bi, struct symbol *sym,
>> @@ -446,8 +462,10 @@ struct block_report *block_info__create_report(struct evlist *evlist,
>>   	evlist__for_each_entry(evlist, pos) {
>>   		struct hists *hists = evsel__hists(pos);
>>   
>> +		hists__output_resort(hists, NULL);
>>   		process_block_report(hists, &block_reports[i], total_cycles,
>>   				     block_hpps, nr_hpps);
>> +		block_reports[i].evsel_idx = pos->idx;
>>   		i++;
>>   	}
>>   
>> @@ -496,3 +514,284 @@ float block_info__total_cycles_percent(struct hist_entry *he)
>>   
>>   	return 0.0;
>>   }
>> +
>> +struct block_report *block_info__get_report(struct block_report *reps,
>> +					    int nr_reps, int evsel_idx)
>> +{
>> +	for (int i = 0; i < nr_reps; i++) {
>> +		if (reps[i].evsel_idx == evsel_idx)
>> +			return &reps[i];
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +static char *move_to_relpath(char *path, const char *dir)
>> +{
>> +	const char *d = dir, *end = dir + strlen(dir);
>> +	char *s;
>> +
>> +	/* skip '.' and '/' */
>> +	while ((d != end) && ((*d == '.') || (*d == '/')))
>> +		d++;
>> +
>> +	if (d == end)
>> +		return NULL;
>> +
>> +	s = strstr(path, d);
>> +	if (!s)
>> +		return NULL;
>> +
>> +	s += strlen(d);
>> +
>> +	while (*s == '/' && *s != 0)
>> +		s++;
>> +
>> +	if (*s == 0)
>> +		return NULL;
>> +
>> +	return s;
>> +}
>> +
>> +static int block_addr2line(struct hist_entry *he, const char *dir)
>> +{
>> +	struct block_info *bi = he->block_info;
>> +	struct block_line *bl;
>> +
>> +	if (!he->ms.map || !he->ms.map->dso)
>> +		return -1;
>> +
>> +	bl = zalloc(sizeof(*bl));
>> +	if (!bl)
>> +		return -1;
>> +
>> +	symbol_conf.disable_add2line_warn = true;
>> +	bl->start_file = get_srcline_split(he->ms.map->dso,
>> +					   map__rip_2objdump(he->ms.map,
>> +							     bi->sym->start + bi->start),
>> +					   &bl->start_nr);
>> +	if (!bl->start_file)
>> +		goto err;
>> +
>> +	if (dir)
>> +		bl->start_rel = move_to_relpath(bl->start_file, dir);
>> +
>> +	bl->end_file = get_srcline_split(he->ms.map->dso,
>> +					 map__rip_2objdump(he->ms.map,
>> +							   bi->sym->start + bi->end),
>> +					 &bl->end_nr);
>> +	if (!bl->end_file)
>> +		goto err;
>> +
>> +	if (dir)
>> +		bl->end_rel = move_to_relpath(bl->end_file, dir);
>> +
>> +	bi->line = bl;
>> +	return 0;
>> +
>> +err:
>> +	free_block_line(&bl);
>> +	return -1;
>> +}
>> +
>> +int block_hists_addr2line(struct hists *hists, const char *dir)
>> +{
>> +	struct rb_root_cached *root = hists->entries_in;
>> +	struct rb_node *next = rb_first_cached(root);
>> +
>> +	while (next != NULL) {
>> +		struct hist_entry *he = rb_entry(next, struct hist_entry,
>> +						 rb_node_in);
>> +
>> +		if (!he)
>> +			break;
>> +
>> +		block_addr2line(he, dir);
>> +		next = rb_next(&he->rb_node_in);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +bool block_same_srcfiles(struct block_line *bl_a, struct block_line *bl_b)
>> +{
>> +	if (!bl_a->start_file || !bl_a->end_file ||
>> +	    !bl_b->start_file || !bl_b->end_file) {
>> +		return false;
>> +	}
>> +
>> +	if (!strcmp(bl_a->start_file, bl_b->start_file) &&
>> +	    !strcmp(bl_a->end_file, bl_b->end_file) &&
>> +	    !strcmp(bl_a->start_file, bl_a->end_file)) {
>> +		return true;
>> +	}
>> +
>> +	if (!bl_a->start_rel || !bl_a->end_rel ||
>> +	    !bl_b->start_rel || !bl_b->end_rel) {
>> +		return false;
>> +	}
>> +
>> +	if (!strcmp(bl_a->start_rel, bl_b->start_rel) &&
>> +	    !strcmp(bl_a->end_rel, bl_b->end_rel) &&
>> +	    !strcmp(bl_a->start_rel, bl_a->end_rel)) {
>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +void block_line_dump(struct block_info *bi, const char *str)
>> +{
>> +	if (bi && bi->line) {
>> +		pr_debug("%s: %s:%d -> %s:%d\n",
>> +			 str,
>> +			 bi->line->start_file,
>> +			 bi->line->start_nr,
>> +			 bi->line->end_file,
>> +			 bi->line->end_nr);
>> +	}
>> +}
>> +
>> +static bool block_line_matched(struct src_node *node,
>> +			       int start_nr_a, int end_nr_a,
>> +			       int start_nr_b, int end_nr_b,
>> +			       bool *changed)
>> +{
>> +	int i, j, start_a, end_a, start_b, changed_nr = 0;
>> +	struct line_pair *lp;
>> +
>> +	*changed = false;
>> +
>> +	if (abs(end_nr_a - start_nr_a) !=
>> +	    abs(end_nr_b - start_nr_b)) {
>> +		return false;
>> +	}
>> +
>> +	i = start_a = (start_nr_a < end_nr_a) ? start_nr_a : end_nr_a;
>> +	end_a = (end_nr_a > start_nr_a) ? end_nr_a : start_nr_a;
>> +
>> +	j = start_b = (start_nr_b < end_nr_b) ? start_nr_b : end_nr_b;
>> +
>> +	while (i <= end_a) {
>> +		lp = srclist__line_pair(node, i);
>> +		if (!lp)
>> +			return false;
>> +
>> +		if (lp->b_nr != j)
>> +			changed_nr++;
>> +
>> +		i++; j++;
>> +	}
>> +
>> +	if ((i == end_a + 1) && (changed_nr == 0))
>> +		return true;
>> +
>> +	/*
>> +	 * At least one line is unchanged in this block,
>> +	 * we think this block is changed (not a new block).
>> +	 */
>> +	if (changed_nr < end_a - start_a + 1)
>> +		*changed = true;
>> +
>> +	return false;
>> +}
>> +
>> +bool block_srclist_matched(struct srclist *slist, char *rel_path,
>> +			   int start_nr_a, int end_nr_a,
>> +			   int start_nr_b, int end_nr_b,
>> +			   bool *changed)
>> +{
>> +	struct src_node *node;
>> +	bool ret;
>> +
>> +	*changed = false;
>> +
>> +	node = srclist__find(slist, rel_path, true);
>> +	if (!node)
>> +		return false;
>> +
>> +	ret = block_line_matched(node, start_nr_a, end_nr_a,
>> +				 start_nr_b, end_nr_b, changed);
>> +
>> +	if (ret) {
>> +		pr_debug("block MATCHED (a vs. b)\t\t%s: (%d-%d) vs. (%d-%d)\n",
>> +			 node->info.rel_path,
>> +			 start_nr_a, end_nr_a,
>> +			 start_nr_b, end_nr_b);
>> +	} else if (*changed) {
>> +		pr_debug("block CHANGED (a vs. b)\t\t%s: (%d-%d) vs. (%d-%d)\n",
>> +			 node->info.rel_path,
>> +			 start_nr_a, end_nr_a,
>> +			 start_nr_b, end_nr_b);
>> +	} else {
>> +		pr_debug("block UNMATCHED (a vs. b)\t%s: (%d-%d) vs. (%d-%d)\n",
>> +			 node->info.rel_path,
>> +			 start_nr_a, end_nr_a,
>> +			 start_nr_b, end_nr_b);
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static struct hist_entry *get_block_pair(struct hist_entry *he,
>> +					 struct hists *hists_pair,
>> +					 struct srclist *src_list)
>> +{
>> +	struct rb_root_cached *root = hists_pair->entries_in;
>> +	struct rb_node *next = rb_first_cached(root);
>> +	int64_t cmp;
>> +
>> +	while (next != NULL) {
>> +		struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
>> +						      rb_node_in);
>> +
>> +		next = rb_next(&he_pair->rb_node_in);
>> +
>> +		cmp = __block_info__cmp(he_pair, he, src_list);
>> +		if (!cmp)
>> +			return he_pair;
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +void block_hists_match(struct hists *hists_base,
>> +		       struct hists *hists_pair,
>> +		       struct srclist *src_list,
>> +		       void (*func)(struct hist_entry *,
>> +				    struct hist_entry *))
>> +{
>> +	struct rb_root_cached *root = hists_base->entries_in;
>> +	struct rb_node *next = rb_first_cached(root);
>> +
>> +	while (next != NULL) {
>> +		struct hist_entry *he = rb_entry(next, struct hist_entry,
>> +						 rb_node_in);
>> +		struct hist_entry *pair = get_block_pair(he, hists_pair,
>> +							 src_list);
>> +
>> +		next = rb_next(&he->rb_node_in);
>> +
>> +		if (pair) {
>> +			hist_entry__add_pair(pair, he);
>> +
>> +			if (func)
>> +				(*func)(he, pair);
>> +		}
>> +	}
>> +}
>> +
>> +int block_info__match_report(struct block_report *rep_base,
>> +			     struct block_report *rep_pair,
>> +			     struct srclist *src_list,
>> +			     void (*func)(struct hist_entry *,
>> +					  struct hist_entry *))
>> +{
>> +	struct block_hist *bh_base = &rep_base->hist;
>> +	struct block_hist *bh_pair = &rep_pair->hist;
>> +
>> +	block_hists_match(&bh_base->block_hists, &bh_pair->block_hists,
>> +			  src_list, func);
>> +
>> +	return 0;
>> +}
>> diff --git a/tools/perf/util/block-info.h b/tools/perf/util/block-info.h
>> index 42e9dcc4cf0a..458bd998089d 100644
>> --- a/tools/perf/util/block-info.h
>> +++ b/tools/perf/util/block-info.h
>> @@ -20,6 +20,8 @@ struct block_info {
>>   	int			num;
>>   	int			num_aggr;
>>   	refcount_t		refcnt;
>> +	struct block_line	*line;
>> +	bool			srcline_matched;
>>   };
>>   
>>   struct block_fmt {
>> @@ -46,6 +48,7 @@ struct block_report {
>>   	u64			cycles;
>>   	struct block_fmt	fmts[PERF_HPP_REPORT__BLOCK_MAX_INDEX];
>>   	int			nr_fmts;
>> +	int			evsel_idx;
>>   };
>>   
>>   struct block_hist;
>> @@ -62,7 +65,8 @@ static inline void __block_info__zput(struct block_info **bi)
>>   
>>   #define block_info__zput(bi) __block_info__zput(&bi)
>>   
>> -int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right);
>> +int64_t __block_info__cmp(struct hist_entry *left, struct hist_entry *right,
>> +			  struct srclist *src_list __maybe_unused);
>>   
>>   int64_t block_info__cmp(struct perf_hpp_fmt *fmt __maybe_unused,
>>   			struct hist_entry *left, struct hist_entry *right);
>> @@ -83,4 +87,30 @@ int report__browse_block_hists(struct block_hist *bh, float min_percent,
>>   
>>   float block_info__total_cycles_percent(struct hist_entry *he);
>>   
>> +struct block_report *block_info__get_report(struct block_report *reps,
>> +					    int nr_reps, int evsel_idx);
>> +
>> +int block_hists_addr2line(struct hists *hists, const char *dir);
>> +
>> +void block_line_dump(struct block_info *bi, const char *str);
>> +
>> +bool block_same_srcfiles(struct block_line *bl_a, struct block_line *bl_b);
>> +
>> +bool block_srclist_matched(struct srclist *slist, char *rel_path,
>> +			   int start_nr_a, int end_nr_a,
>> +			   int start_nr_b, int end_nr_b,
>> +			   bool *changed);
>> +
>> +void block_hists_match(struct hists *hists_base,
>> +		       struct hists *hists_pair,
>> +		       struct srclist *src_list,
>> +		       void (*func)(struct hist_entry *,
>> +				    struct hist_entry *));
>> +
>> +int block_info__match_report(struct block_report *rep_base,
>> +			     struct block_report *rep_pair,
>> +			     struct srclist *src_list,
>> +			     void (*func)(struct hist_entry *,
>> +					  struct hist_entry *));
>> +
>>   #endif /* __PERF_BLOCK_H */
>> diff --git a/tools/perf/util/srclist.h b/tools/perf/util/srclist.h
>> index f25b0de91a13..46866d5c51d7 100644
>> --- a/tools/perf/util/srclist.h
>> +++ b/tools/perf/util/srclist.h
>> @@ -33,6 +33,15 @@ struct srclist {
>>   	const char *after_dir;
>>   };
>>   
>> +struct block_line {
>> +	char *start_file;
>> +	char *end_file;
>> +	char *start_rel;
>> +	char *end_rel;
>> +	unsigned int start_nr;
>> +	unsigned int end_nr;
>> +};
>> +
>>   struct srclist *srclist__new(const char *before_dir, const char *after_dir);
>>   void srclist__delete(struct srclist *slist);
>>   
>> -- 
>> 2.17.1
>>
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-03-11  5:47 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-10  7:02 [PATCH v1 00/14] perf: Stream comparison Jin Yao
2020-03-10  7:02 ` [PATCH v1 01/14] perf util: Create source line mapping table Jin Yao
2020-03-10 15:08   ` Arnaldo Carvalho de Melo
2020-03-11  5:33     ` Jin, Yao
2020-03-10  7:02 ` [PATCH v1 02/14] perf util: Create streams for managing top N hottest callchains Jin Yao
2020-03-10 15:11   ` Arnaldo Carvalho de Melo
2020-03-11  5:38     ` Jin, Yao
2020-03-10  7:02 ` [PATCH v1 03/14] perf util: Return per-event callchain streams Jin Yao
2020-03-10  7:02 ` [PATCH v1 04/14] perf util: Compare two streams Jin Yao
2020-03-10  7:02 ` [PATCH v1 05/14] perf util: Calculate the sum of all streams hits Jin Yao
2020-03-10 15:14   ` Arnaldo Carvalho de Melo
2020-03-11  5:44     ` Jin, Yao
2020-03-10  7:02 ` [PATCH v1 06/14] perf util: Report hot streams Jin Yao
2020-03-10  7:02 ` [PATCH v1 07/14] perf diff: Support hot streams comparison Jin Yao
2020-03-10  7:02 ` [PATCH v1 08/14] perf util: Add new block info functions for top N hot blocks comparison Jin Yao
2020-03-10 15:17   ` Arnaldo Carvalho de Melo
2020-03-11  5:47     ` Jin, Yao
2020-03-10  7:02 ` [PATCH v1 09/14] perf util: Add new block info fmts for showing " Jin Yao
2020-03-10  7:02 ` [PATCH v1 10/14] perf util: Enable block source line comparison Jin Yao
2020-03-10  7:02 ` [PATCH v1 11/14] perf diff: support hot blocks comparison Jin Yao
2020-03-10  7:02 ` [PATCH v1 12/14] perf util: Filter out streams by name of changed functions Jin Yao
2020-03-10  7:02 ` [PATCH v1 13/14] perf util: Filter out blocks " Jin Yao
2020-03-10  7:02 ` [PATCH v1 14/14] perf diff: Filter out streams by " Jin Yao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).