LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH V6 1/1] perf tool: perf diff support for different binaries
@ 2014-12-02 15:39 kan.liang
  2015-01-05 16:04 ` Liang, Kan
  2015-01-06 14:53 ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 7+ messages in thread
From: kan.liang @ 2014-12-02 15:39 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: linux-kernel, ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the function names.

Here is an examples.
The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.

example_v1.c

noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {

                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;

}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

example_v2.c

noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;
}

noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {
                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522
samples) ]

[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015
samples) ]

Old perf diff result.
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
 Event 'cycles'

 Baseline    Delta  Shared Object     Symbol
 ........  .......  ................  ...............................

                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%           example           [.] f2
    53.71%   -7.55%  example           [.] f3
            +53.81%  example           [.] f3
     0.02%           example           [.] main

New perf diff result.
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%   -0.08%  example           [.] f2
    53.71%   +0.11%  example           [.] f3
     0.02%           example           [.] main

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>

---

The patch is seperated from "[PATCH V4 0/3] perf tool: perf diff sort changes" patch set. 
The first patch of the patch set has been merged. 
The third patch of the patch set for symoff will be sent out by another thread. 

Changes since V4: 
 - Seperate from old patch set 
 - Add an example in changelog 
 - Update man page 

Changes since V5:
 - Correct patch style

 tools/perf/Documentation/perf-diff.txt | 5 +++++
 tools/perf/util/sort.c                 | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index e463caa..5182661 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -20,6 +20,11 @@ If no parameters are passed it will assume perf.data.old and perf.data.
 The differential profile is displayed only for events matching both
 specified perf.data files.
 
+If no parameters are passed the samples will be sorted by dso and symbol.
+As the perf.data files could come from different binaries, the symbols addresses
+could vary. So perf diff is based on the comparison of the files and
+symbols name.
+
 OPTIONS
 -------
 -D::
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 9139dda..e0e4173 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1432,6 +1432,15 @@ int sort_dimension__add(const char *tok)
 			sort__has_parent = 1;
 		} else if (sd->entry == &sort_sym) {
 			sort__has_sym = 1;
+			/*
+			 * perf diff displays the performance difference amongst
+			 * two or more perf.data files. Those files could come
+			 * from different binaries. So we should not compare
+			 * their ips, but the name of symbol.
+			 */
+			if (sort__mode == SORT_MODE__DIFF)
+				sd->entry->se_collapse = sort__sym_sort;
+
 		} else if (sd->entry == &sort_dso) {
 			sort__has_dso = 1;
 		}
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries
  2014-12-02 15:39 [PATCH V6 1/1] perf tool: perf diff support for different binaries kan.liang
@ 2015-01-05 16:04 ` Liang, Kan
  2015-01-06 14:53 ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2015-01-05 16:04 UTC (permalink / raw)
  To: acme, jolsa, namhyung; +Cc: linux-kernel, ak

Hi Arnaldo,

The patch is one month old. Kim and Jirka have reviewed it. 

There is also another perf diff related patch which has similar situation.
https://lkml.org/lkml/2014/12/1/380
It was also reviewed by Jirka a month ago.

Both of them still apply to current perf/core. 
Should I re-post them for more review?

Thanks,
Kan

> 
> From: Kan Liang <kan.liang@intel.com>
> 
> Currently, the perf diff only works with same binaries. That's because it
> compares the symbol start address. It doesn't work if the perf.data comes
> from different binaries. This patch matches the function names.
> 
> Here is an examples.
> The only difference between example_v1.c and example_v2.c is the
> location of f2 and f3. There is no change in behavior, but the previous perf
> diff display the wrong differential profile.
> 
> example_v1.c
> 
> noinline void f3(void)
> {
>         volatile int i;
>         for (i = 0; i < 10000;) {
> 
>                 if(i%2)
>                         i++;
>                 else
>                         i++;
>         }
> }
> 
> noinline void f2(void)
> {
>         volatile int a = 100, b, c;
>         for (b = 0; b < 10000; b++)
>                 c = a * b;
> 
> }
> 
> noinline void f1(void)
> {
>                 f2();
>                 f3();
> }
> 
> int main()
> {
>         int i;
>         for (i = 0; i < 100000; i++)
>                 f1();
> }
> 
> example_v2.c
> 
> noinline void f2(void)
> {
>         volatile int a = 100, b, c;
>         for (b = 0; b < 10000; b++)
>                 c = a * b;
> }
> 
> noinline void f3(void)
> {
>         volatile int i;
>         for (i = 0; i < 10000;) {
>                 if(i%2)
>                         i++;
>                 else
>                         i++;
>         }
> }
> 
> noinline void f1(void)
> {
>                 f2();
>                 f3();
> }
> 
> int main()
> {
>         int i;
>         for (i = 0; i < 100000; i++)
>                 f1();
> }
> 
> [lk@localhost perf_diff]$ gcc example_v1.c -o example [lk@localhost
> perf_diff]$ perf record -o example_v1.data ./example [ perf record:
> Woken up 4 times to write data ] [ perf record: Captured and wrote 0.813
> MB example_v1.data (~35522
> samples) ]
> 
> [lk@localhost perf_diff]$ gcc example_v2.c -o example [lk@localhost
> perf_diff]$ perf record -o example_v2.data ./example [ perf record:
> Woken up 4 times to write data ] [ perf record: Captured and wrote 0.824
> MB example_v2.data (~36015
> samples) ]
> 
> Old perf diff result.
> [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
> Event 'cycles'
> 
>  Baseline    Delta  Shared Object     Symbol
>  ........  .......  ................  ...............................
> 
>                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
>      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
>                      [kernel.vmlinux]  [k] idle_cpu
>                      [kernel.vmlinux]  [k] intel_pstate_timer_func
>                      [kernel.vmlinux]  [k] native_read_msr_safe
>      0.00%           [kernel.vmlinux]  [k] native_read_tsc
>      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
>                      [kernel.vmlinux]  [k] ntp_tick_length
>      0.00%           [kernel.vmlinux]  [k] rb_erase
>      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
>      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
>      0.00%           [kernel.vmlinux]  [k] update_wall_time
>      0.00%           example           [.] f1
>     46.24%           example           [.] f2
>     53.71%   -7.55%  example           [.] f3
>             +53.81%  example           [.] f3
>      0.02%           example           [.] main
> 
> New perf diff result.
> [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
>                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
>      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
>                      [kernel.vmlinux]  [k] idle_cpu
>                      [kernel.vmlinux]  [k] intel_pstate_timer_func
>                      [kernel.vmlinux]  [k] native_read_msr_safe
>      0.00%           [kernel.vmlinux]  [k] native_read_tsc
>      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
>                      [kernel.vmlinux]  [k] ntp_tick_length
>      0.00%           [kernel.vmlinux]  [k] rb_erase
>      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
>      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
>      0.00%           [kernel.vmlinux]  [k] update_wall_time
>      0.00%           example           [.] f1
>     46.24%   -0.08%  example           [.] f2
>     53.71%   +0.11%  example           [.] f3
>      0.02%           example           [.] main
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> ---
> 
> The patch is seperated from "[PATCH V4 0/3] perf tool: perf diff sort
> changes" patch set.
> The first patch of the patch set has been merged.
> The third patch of the patch set for symoff will be sent out by another
> thread.
> 
> Changes since V4:
>  - Seperate from old patch set
>  - Add an example in changelog
>  - Update man page
> 
> Changes since V5:
>  - Correct patch style
> 
>  tools/perf/Documentation/perf-diff.txt | 5 +++++
>  tools/perf/util/sort.c                 | 9 +++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf-diff.txt
> b/tools/perf/Documentation/perf-diff.txt
> index e463caa..5182661 100644
> --- a/tools/perf/Documentation/perf-diff.txt
> +++ b/tools/perf/Documentation/perf-diff.txt
> @@ -20,6 +20,11 @@ If no parameters are passed it will assume
> perf.data.old and perf.data.
>  The differential profile is displayed only for events matching both
> specified perf.data files.
> 
> +If no parameters are passed the samples will be sorted by dso and symbol.
> +As the perf.data files could come from different binaries, the symbols
> +addresses could vary. So perf diff is based on the comparison of the
> +files and symbols name.
> +
>  OPTIONS
>  -------
>  -D::
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index
> 9139dda..e0e4173 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1432,6 +1432,15 @@ int sort_dimension__add(const char *tok)
>  			sort__has_parent = 1;
>  		} else if (sd->entry == &sort_sym) {
>  			sort__has_sym = 1;
> +			/*
> +			 * perf diff displays the performance difference
> amongst
> +			 * two or more perf.data files. Those files could
> come
> +			 * from different binaries. So we should not
> compare
> +			 * their ips, but the name of symbol.
> +			 */
> +			if (sort__mode == SORT_MODE__DIFF)
> +				sd->entry->se_collapse = sort__sym_sort;
> +
>  		} else if (sd->entry == &sort_dso) {
>  			sort__has_dso = 1;
>  		}
> --
> 1.8.3.2


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V6 1/1] perf tool: perf diff support for different binaries
  2014-12-02 15:39 [PATCH V6 1/1] perf tool: perf diff support for different binaries kan.liang
  2015-01-05 16:04 ` Liang, Kan
@ 2015-01-06 14:53 ` Arnaldo Carvalho de Melo
  2015-01-06 16:16   ` Arnaldo Carvalho de Melo
                     ` (2 more replies)
  1 sibling, 3 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-06 14:53 UTC (permalink / raw)
  To: kan.liang; +Cc: jolsa, namhyung, linux-kernel, ak

Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.liang@intel.com escreveu:
> From: Kan Liang <kan.liang@intel.com>
> 
> Currently, the perf diff only works with same binaries. That's because
> it compares the symbol start address. It doesn't work if the perf.data
> comes from different binaries. This patch matches the function names.

Sorry Kan, I saw these patches, its just that it the above statement
looked completely strange to me, i.e. of course it should look at the
dso name, then at the symbol name, comparing via addresses makes no
sense to me, so I kept leaving this patch to look after processing other
patches and doing other things, then the year end holidays, etc.

So now I'm looking at old code, from 2009:

  commit 604c5c92972dcb4b98be34775452d09a5d4ec248
  Author: Arnaldo Carvalho de Melo <acme@redhat.com>
  Date:   Wed Dec 16 14:09:53 2009 -0200

    perf diff: Change the default sort order to "dso,symbol"
    
    This is a more intuitive / more meaningful default:
    
    $ perf diff | head -8
         9.02%     +1.00%  libc-2.10.1.so               [.] _IO_vfprintf_internal
         2.91%     -1.00%  [kernel]                     [k] __kmalloc
         2.85%     -1.00%  [kernel]                     [k] ext4_htree_store_dirent
         1.99%     -1.00%  [kernel]                     [k] _atomic_dec_and_lock
         2.44%             [kernel]
    $
    
    Suggested-by: Ingo Molnar <mingo@elte.hu>

and I see:

static void perf_session__match_hists(struct perf_session *old_session,
                                      struct perf_session *new_session)
{
        struct rb_node *nd;

        perf_session__resort_by_name(old_session);

        for (nd = rb_first(&new_session->hists); nd; nd = rb_next(nd)) {
                struct hist_entry *pos = rb_entry(nd, struct hist_entry, rb_node);
                pos->pair = perf_session__find_hist_entry_by_name(old_session, pos);
        }
}

Ok, it will process all samples, store everything in hist_entries, etc, then traverse all
entries in the most recent perf.data file and look for a pair, _by name_:

static struct hist_entry *
perf_session__find_hist_entry_by_name(struct perf_session *self,
                                      struct hist_entry *he)
{
        struct rb_node *n = self->hists.rb_node;

        while (n) {
                struct hist_entry *iter = rb_entry(n, struct hist_entry, rb_node);
                int cmp = strcmp(he->map->dso->name, iter->map->dso->name);

                if (cmp > 0)
                        n = n->rb_left;
                else if (cmp < 0)
                        n = n->rb_right;
                else {
                        cmp = strcmp(he->sym->name, iter->sym->name);
                        if (cmp > 0)
                                n = n->rb_left;
                        else if (cmp < 0)
                                n = n->rb_right;
                        else
                                return iter;
                }
        }

        return NULL;
}

See? Resort the old session by "dso,symbol_name", then go on pairing them.

So at some point this got broken :-\

I.e. this is not changing an existing correct behaviour, but fixing a bug, i.e.
the changeset comment confused me :-\

Now I'm trying to figure out _when_ this got broken, what was the reasoning for
that code to have changed in a way that made it not match by dso_name, symbol_name.

Also please, when providing code to be built to reproduce your results, make it
self contained,  i.e. in the case below, I had to replace "noinline" with
"__attribute__((noinline))".

- Arnaldo
 
> Here is an examples.
> The only difference between example_v1.c and example_v2.c is the
> location of f2 and f3. There is no change in behavior, but the previous
> perf diff display the wrong differential profile.
> 
> example_v1.c
> 
> noinline void f3(void)
> {
>         volatile int i;
>         for (i = 0; i < 10000;) {
> 
>                 if(i%2)
>                         i++;
>                 else
>                         i++;
>         }
> }
> 
> noinline void f2(void)
> {
>         volatile int a = 100, b, c;
>         for (b = 0; b < 10000; b++)
>                 c = a * b;
> 
> }
> 
> noinline void f1(void)
> {
>                 f2();
>                 f3();
> }
> 
> int main()
> {
>         int i;
>         for (i = 0; i < 100000; i++)
>                 f1();
> }
> 
> example_v2.c
> 
> noinline void f2(void)
> {
>         volatile int a = 100, b, c;
>         for (b = 0; b < 10000; b++)
>                 c = a * b;
> }
> 
> noinline void f3(void)
> {
>         volatile int i;
>         for (i = 0; i < 10000;) {
>                 if(i%2)
>                         i++;
>                 else
>                         i++;
>         }
> }
> 
> noinline void f1(void)
> {
>                 f2();
>                 f3();
> }
> 
> int main()
> {
>         int i;
>         for (i = 0; i < 100000; i++)
>                 f1();
> }
> 
> [lk@localhost perf_diff]$ gcc example_v1.c -o example
> [lk@localhost perf_diff]$ perf record -o example_v1.data ./example
> [ perf record: Woken up 4 times to write data ]
> [ perf record: Captured and wrote 0.813 MB example_v1.data (~35522
> samples) ]
> 
> [lk@localhost perf_diff]$ gcc example_v2.c -o example
> [lk@localhost perf_diff]$ perf record -o example_v2.data ./example
> [ perf record: Woken up 4 times to write data ]
> [ perf record: Captured and wrote 0.824 MB example_v2.data (~36015
> samples) ]
> 
> Old perf diff result.
> [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
>  Event 'cycles'
> 
>  Baseline    Delta  Shared Object     Symbol
>  ........  .......  ................  ...............................
> 
>                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
>      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
>                      [kernel.vmlinux]  [k] idle_cpu
>                      [kernel.vmlinux]  [k] intel_pstate_timer_func
>                      [kernel.vmlinux]  [k] native_read_msr_safe
>      0.00%           [kernel.vmlinux]  [k] native_read_tsc
>      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
>                      [kernel.vmlinux]  [k] ntp_tick_length
>      0.00%           [kernel.vmlinux]  [k] rb_erase
>      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
>      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
>      0.00%           [kernel.vmlinux]  [k] update_wall_time
>      0.00%           example           [.] f1
>     46.24%           example           [.] f2
>     53.71%   -7.55%  example           [.] f3
>             +53.81%  example           [.] f3
>      0.02%           example           [.] main
> 
> New perf diff result.
> [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
>                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
>      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
>                      [kernel.vmlinux]  [k] idle_cpu
>                      [kernel.vmlinux]  [k] intel_pstate_timer_func
>                      [kernel.vmlinux]  [k] native_read_msr_safe
>      0.00%           [kernel.vmlinux]  [k] native_read_tsc
>      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
>                      [kernel.vmlinux]  [k] ntp_tick_length
>      0.00%           [kernel.vmlinux]  [k] rb_erase
>      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
>      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
>      0.00%           [kernel.vmlinux]  [k] update_wall_time
>      0.00%           example           [.] f1
>     46.24%   -0.08%  example           [.] f2
>     53.71%   +0.11%  example           [.] f3
>      0.02%           example           [.] main
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> ---
> 
> The patch is seperated from "[PATCH V4 0/3] perf tool: perf diff sort changes" patch set. 
> The first patch of the patch set has been merged. 
> The third patch of the patch set for symoff will be sent out by another thread. 
> 
> Changes since V4: 
>  - Seperate from old patch set 
>  - Add an example in changelog 
>  - Update man page 
> 
> Changes since V5:
>  - Correct patch style
> 
>  tools/perf/Documentation/perf-diff.txt | 5 +++++
>  tools/perf/util/sort.c                 | 9 +++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
> index e463caa..5182661 100644
> --- a/tools/perf/Documentation/perf-diff.txt
> +++ b/tools/perf/Documentation/perf-diff.txt
> @@ -20,6 +20,11 @@ If no parameters are passed it will assume perf.data.old and perf.data.
>  The differential profile is displayed only for events matching both
>  specified perf.data files.
>  
> +If no parameters are passed the samples will be sorted by dso and symbol.
> +As the perf.data files could come from different binaries, the symbols addresses
> +could vary. So perf diff is based on the comparison of the files and
> +symbols name.
> +
>  OPTIONS
>  -------
>  -D::
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 9139dda..e0e4173 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1432,6 +1432,15 @@ int sort_dimension__add(const char *tok)
>  			sort__has_parent = 1;
>  		} else if (sd->entry == &sort_sym) {
>  			sort__has_sym = 1;
> +			/*
> +			 * perf diff displays the performance difference amongst
> +			 * two or more perf.data files. Those files could come
> +			 * from different binaries. So we should not compare
> +			 * their ips, but the name of symbol.
> +			 */
> +			if (sort__mode == SORT_MODE__DIFF)
> +				sd->entry->se_collapse = sort__sym_sort;
> +
>  		} else if (sd->entry == &sort_dso) {
>  			sort__has_dso = 1;
>  		}
> -- 
> 1.8.3.2

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH V6 1/1] perf tool: perf diff support for different binaries
  2015-01-06 14:53 ` Arnaldo Carvalho de Melo
@ 2015-01-06 16:16   ` Arnaldo Carvalho de Melo
  2015-01-12 18:06     ` Liang, Kan
  2015-01-06 16:39   ` Liang, Kan
  2015-01-26 14:15   ` Liang, Kan
  2 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-06 16:16 UTC (permalink / raw)
  To: kan.liang; +Cc: jolsa, namhyung, linux-kernel, ak

Em Tue, Jan 06, 2015 at 11:53:56AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.liang@intel.com escreveu:
> > Currently, the perf diff only works with same binaries. That's because
> > it compares the symbol start address. It doesn't work if the perf.data
> > comes from different binaries. This patch matches the function names.
 
> Sorry Kan, I saw these patches, its just that it the above statement
> looked completely strange to me, i.e. of course it should look at the
> dso name, then at the symbol name, comparing via addresses makes no
> sense to me, so I kept leaving this patch to look after processing other
> patches and doing other things, then the year end holidays, etc.
 
> So now I'm looking at old code, from 2009:
 
>   commit 604c5c92972dcb4b98be34775452d09a5d4ec248
>   Author: Arnaldo Carvalho de Melo <acme@redhat.com>
>   Date:   Wed Dec 16 14:09:53 2009 -0200
 
>     perf diff: Change the default sort order to "dso,symbol"

<SNIP>

> Now I'm trying to figure out _when_ this got broken, what was the reasoning for
> that code to have changed in a way that made it not match by dso_name, symbol_name.

<SNIP>

> > New perf diff result.
> > [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
> >                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
> >      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
> >                      [kernel.vmlinux]  [k] idle_cpu
> >                      [kernel.vmlinux]  [k] intel_pstate_timer_func
> >                      [kernel.vmlinux]  [k] native_read_msr_safe
> >      0.00%           [kernel.vmlinux]  [k] native_read_tsc
> >      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
> >                      [kernel.vmlinux]  [k] ntp_tick_length
> >      0.00%           [kernel.vmlinux]  [k] rb_erase
> >      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
> >      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
> >      0.00%           [kernel.vmlinux]  [k] update_wall_time
> >      0.00%           example           [.] f1
> >     46.24%   -0.08%  example           [.] f2
> >     53.71%   +0.11%  example           [.] f3
> >      0.02%           example           [.] main

So, with:

[root@ssdandy l]# perf --version
perf version 3.1.rc9.37.g3f2728.dirty

commit 3f2728bdb6a4cad0d18b09d03e2008121c902751
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Wed Oct 5 16:10:06 2011 -0300

    perf report: Add option to show total period

I get:

[root@ssdandy l]# gcc example_v1.c -o example
[root@ssdandy l]# perf record -F 10000 ./example
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.739 MB perf.data (~75981 samples) ]
[root@ssdandy l]# gcc example_v2.c -o example
[root@ssdandy l]# perf record -F 10000 ./example
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.733 MB perf.data (~75735 samples) ]
[root@ssdandy l]# perf diff
# Baseline  Delta          Shared Object                       Symbol
# ........ ..........  .................  ...........................
#
     0.00%    +53.60%  example            [.] f3
    54.23%     -8.02%  example            [.] f2
     0.02%             [kernel.kallsyms]  [k] native_read_msr_safe
     0.04%     -0.01%  [kernel.kallsyms]  [k] native_write_msr_safe
     0.01%             [kernel.kallsyms]  [k] thermal_interrupt
     0.01%             [kernel.kallsyms]  [k] apic_timer_interrupt
     0.00%             [kernel.kallsyms]  [k] update_wall_time
     0.00%             example            [.] main
     0.00%             example            [.] f1
     0.00%             [kernel.kallsyms]  [k] notifier_call_chain
     0.00%             [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     0.01%             [kernel.kallsyms]  [k] _raw_spin_lock
     0.00%             [kernel.kallsyms]  [k] native_read_tsc
     0.00%             [kernel.kallsyms]  [k] task_tick_fair
     0.00%             [kernel.kallsyms]  [k] account_user_time
     0.00%             [kernel.kallsyms]  [k] unmap_page_range
     0.00%             [kernel.kallsyms]  [k] kfree
     0.00%             [kernel.kallsyms]  [k] sidtab_context_to_sid
     0.00%             ld-2.17.so         [.] _dl_relocate_object
     0.00%             [kernel.kallsyms]  [k] update_cfs_rq_blocked_load
     0.00%             [kernel.kallsyms]  [k] perf_event_task_tick
     0.00%             [kernel.kallsyms]  [k] rcu_irq_enter
     0.00%             [kernel.kallsyms]  [k] run_timer_softirq
     0.00%             [kernel.kallsyms]  [k] task_cputime
     0.00%             [kernel.kallsyms]  [k] exit_idle
     0.00%             [kernel.kallsyms]  [k] __update_cpu_load
     0.00%             [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     0.00%             [kernel.kallsyms]  [k] account_process_tick
     0.00%             [kernel.kallsyms]  [k] raise_softirq
     0.00%             [kernel.kallsyms]  [k] update_process_times
     0.01%             [kernel.kallsyms]  [k] native_sched_clock
     0.00%             [kernel.kallsyms]  [k] therm_throt_process
     0.00%             [kernel.kallsyms]  [k] intel_thermal_interrupt
     0.01%             [kernel.kallsyms]  [k] update_vsyscall
     0.00%             [kernel.kallsyms]  [k] jiffies_to_timeval
     0.00%             [kernel.kallsyms]  [k] intel_pstate_timer_func
     0.00%             [kernel.kallsyms]  [k] x86_pmu_disable
     0.00%             [kernel.kallsyms]  [k] __do_softirq
     0.00%             [kernel.kallsyms]  [k] hrtimer_interrupt
     0.00%             [kernel.kallsyms]  [k] tick_sched_do_timer
     0.00%             [kernel.kallsyms]  [k] do_softirq
     0.00%             [kernel.kallsyms]  [k] sidtab_search_core
     0.00%             [kernel.kallsyms]  [k] selinux_inode_permission
     0.00%             [kernel.kallsyms]  [k] kmem_cache_alloc
     0.00%             [kernel.kallsyms]  [k] finish_wait
[root@ssdandy l]# 

After this there were the changes to allow using the hists browser to be used
with 'top' and 'diff' was left broken for a while, I'll see if I can continue
bisecting this after lunch.

- Arnaldo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries
  2015-01-06 14:53 ` Arnaldo Carvalho de Melo
  2015-01-06 16:16   ` Arnaldo Carvalho de Melo
@ 2015-01-06 16:39   ` Liang, Kan
  2015-01-26 14:15   ` Liang, Kan
  2 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2015-01-06 16:39 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: jolsa, namhyung, linux-kernel, ak



> Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.liang@intel.com escreveu:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > Currently, the perf diff only works with same binaries. That's because
> > it compares the symbol start address. It doesn't work if the perf.data
> > comes from different binaries. This patch matches the function names.
> 
> Sorry Kan, I saw these patches, its just that it the above statement looked
> completely strange to me, i.e. of course it should look at the dso name,
> then at the symbol name, comparing via addresses makes no sense to me,
> so I kept leaving this patch to look after processing other patches and doing
> other things, then the year end holidays, etc.
> 
> So now I'm looking at old code, from 2009:
> 
>   commit 604c5c92972dcb4b98be34775452d09a5d4ec248
>   Author: Arnaldo Carvalho de Melo <acme@redhat.com>
>   Date:   Wed Dec 16 14:09:53 2009 -0200
> 
>     perf diff: Change the default sort order to "dso,symbol"
> 
>     This is a more intuitive / more meaningful default:
> 
>     $ perf diff | head -8
>          9.02%     +1.00%  libc-2.10.1.so               [.] _IO_vfprintf_internal
>          2.91%     -1.00%  [kernel]                     [k] __kmalloc
>          2.85%     -1.00%  [kernel]                     [k] ext4_htree_store_dirent
>          1.99%     -1.00%  [kernel]                     [k] _atomic_dec_and_lock
>          2.44%             [kernel]
>     $
> 
>     Suggested-by: Ingo Molnar <mingo@elte.hu>
> 
> and I see:
> 
> static void perf_session__match_hists(struct perf_session *old_session,
>                                       struct perf_session *new_session) {
>         struct rb_node *nd;
> 
>         perf_session__resort_by_name(old_session);
> 
>         for (nd = rb_first(&new_session->hists); nd; nd = rb_next(nd)) {
>                 struct hist_entry *pos = rb_entry(nd, struct hist_entry, rb_node);
>                 pos->pair = perf_session__find_hist_entry_by_name(old_session,
> pos);
>         }
> }
> 
> Ok, it will process all samples, store everything in hist_entries, etc, then
> traverse all entries in the most recent perf.data file and look for a pair, _by
> name_:
> 
> static struct hist_entry *
> perf_session__find_hist_entry_by_name(struct perf_session *self,
>                                       struct hist_entry *he) {
>         struct rb_node *n = self->hists.rb_node;
> 
>         while (n) {
>                 struct hist_entry *iter = rb_entry(n, struct hist_entry, rb_node);
>                 int cmp = strcmp(he->map->dso->name, iter->map->dso->name);
> 
>                 if (cmp > 0)
>                         n = n->rb_left;
>                 else if (cmp < 0)
>                         n = n->rb_right;
>                 else {
>                         cmp = strcmp(he->sym->name, iter->sym->name);
>                         if (cmp > 0)
>                                 n = n->rb_left;
>                         else if (cmp < 0)
>                                 n = n->rb_right;
>                         else
>                                 return iter;
>                 }
>         }
> 
>         return NULL;
> }
> 
> See? Resort the old session by "dso,symbol_name", then go on pairing
> them.
> 
> So at some point this got broken :-\
> 
> I.e. this is not changing an existing correct behaviour, but fixing a bug, i.e.
> the changeset comment confused me :-\
> 
> Now I'm trying to figure out _when_ this got broken, what was the
> reasoning for that code to have changed in a way that made it not match
> by dso_name, symbol_name.
> 


It looks it got broken just after two weeks later in 2009. 
perf_session__find_hist_entry_by_name was replaced by hist_entry__cmp.
After that it doesn't compares the name any more.


commit 9c443dfdd31eddea6cbe6ee0ca469fbcc4e1dc3b
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Dec 28 22:48:36 2009 -0200

    perf diff: Fix support for all --sort combinations

@@ -122,29 +112,28 @@ static void perf_session__resort_by_name(struct perf_session *self)
        self->hists = tmp;
 }

+static void perf_session__set_hist_entries_positions(struct perf_session *self)
+{
+       perf_session__output_resort(self, self->events_stats.total);
+       perf_session__resort_hist_entries(self);
+}
+
 static struct hist_entry *
-perf_session__find_hist_entry_by_name(struct perf_session *self,
-                                     struct hist_entry *he)
+perf_session__find_hist_entry(struct perf_session *self,
+                             struct hist_entry *he)
 {
        struct rb_node *n = self->hists.rb_node;

        while (n) {
                struct hist_entry *iter = rb_entry(n, struct hist_entry, rb_node);
-               int cmp = strcmp(he->map->dso->name, iter->map->dso->name);
+               int64_t cmp = hist_entry__cmp(he, iter);

-               if (cmp > 0)
+               if (cmp < 0)
                        n = n->rb_left;
-               else if (cmp < 0)
+               else if (cmp > 0)
                        n = n->rb_right;
-               else {
-                       cmp = strcmp(he->sym->name, iter->sym->name);
-                       if (cmp > 0)
-                               n = n->rb_left;
-                       else if (cmp < 0)
-                               n = n->rb_right;
-                       else
-                               return iter;
-               }
+               else
+                       return iter;
        }

        return NULL;
@@ -155,11 +144,9 @@ static void perf_session__match_hists(struct perf_session *old_session,
 {
        struct rb_node *nd;

-       perf_session__resort_by_name(old_session);
-
        for (nd = rb_first(&new_session->hists); nd; nd = rb_next(nd)) {
                struct hist_entry *pos = rb_entry(nd, struct hist_entry, rb_node);
-               pos->pair = perf_session__find_hist_entry_by_name(old_session, pos);
+               pos->pair = perf_session__find_hist_entry(old_session, pos);
        }
 }





^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries
  2015-01-06 16:16   ` Arnaldo Carvalho de Melo
@ 2015-01-12 18:06     ` Liang, Kan
  0 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2015-01-12 18:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: jolsa, namhyung, linux-kernel, ak

> Em Tue, Jan 06, 2015 at 11:53:56AM -0300, Arnaldo Carvalho de Melo
> escreveu:
> > Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.liang@intel.com escreveu:
> > > Currently, the perf diff only works with same binaries. That's
> > > because it compares the symbol start address. It doesn't work if the
> > > perf.data comes from different binaries. This patch matches the
> function names.
> 
> > Sorry Kan, I saw these patches, its just that it the above statement
> > looked completely strange to me, i.e. of course it should look at the
> > dso name, then at the symbol name, comparing via addresses makes no
> > sense to me, so I kept leaving this patch to look after processing
> > other patches and doing other things, then the year end holidays, etc.
> 
> > So now I'm looking at old code, from 2009:
> 
> >   commit 604c5c92972dcb4b98be34775452d09a5d4ec248
> >   Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> >   Date:   Wed Dec 16 14:09:53 2009 -0200
> 
> >     perf diff: Change the default sort order to "dso,symbol"
> 
> <SNIP>
> 
> > Now I'm trying to figure out _when_ this got broken, what was the
> > reasoning for that code to have changed in a way that made it not match
> by dso_name, symbol_name.
> 
> <SNIP>
> 
> > > New perf diff result.
> > > [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
> > >                      [kernel.vmlinux]  [k] __perf_event_task_sched_out
> > >      0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
> > >                      [kernel.vmlinux]  [k] idle_cpu
> > >                      [kernel.vmlinux]  [k] intel_pstate_timer_func
> > >                      [kernel.vmlinux]  [k] native_read_msr_safe
> > >      0.00%           [kernel.vmlinux]  [k] native_read_tsc
> > >      0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
> > >                      [kernel.vmlinux]  [k] ntp_tick_length
> > >      0.00%           [kernel.vmlinux]  [k] rb_erase
> > >      0.00%           [kernel.vmlinux]  [k] tick_sched_timer
> > >      0.00%           [kernel.vmlinux]  [k] unmap_single_vma
> > >      0.00%           [kernel.vmlinux]  [k] update_wall_time
> > >      0.00%           example           [.] f1
> > >     46.24%   -0.08%  example           [.] f2
> > >     53.71%   +0.11%  example           [.] f3
> > >      0.02%           example           [.] main
> 
> So, with:
> 
> [root@ssdandy l]# perf --version
> perf version 3.1.rc9.37.g3f2728.dirty
> 
> commit 3f2728bdb6a4cad0d18b09d03e2008121c902751
> Author: Arnaldo Carvalho de Melo <acme@redhat.com>
> Date:   Wed Oct 5 16:10:06 2011 -0300
> 
>     perf report: Add option to show total period
> 
> I get:
> 
> [root@ssdandy l]# gcc example_v1.c -o example [root@ssdandy l]# perf
> record -F 10000 ./example [ perf record: Woken up 7 times to write data ]
> [ perf record: Captured and wrote 1.739 MB perf.data (~75981 samples) ]
> [root@ssdandy l]# gcc example_v2.c -o example [root@ssdandy l]# perf
> record -F 10000 ./example [ perf record: Woken up 7 times to write data ]
> [ perf record: Captured and wrote 1.733 MB perf.data (~75735 samples) ]
> [root@ssdandy l]# perf diff
> # Baseline  Delta          Shared Object                       Symbol
> # ........ ..........  .................  ...........................
> #
>      0.00%    +53.60%  example            [.] f3
>     54.23%     -8.02%  example            [.] f2
>      0.02%             [kernel.kallsyms]  [k] native_read_msr_safe
>      0.04%     -0.01%  [kernel.kallsyms]  [k] native_write_msr_safe
>      0.01%             [kernel.kallsyms]  [k] thermal_interrupt
>      0.01%             [kernel.kallsyms]  [k] apic_timer_interrupt
>      0.00%             [kernel.kallsyms]  [k] update_wall_time
>      0.00%             example            [.] main
>      0.00%             example            [.] f1
>      0.00%             [kernel.kallsyms]  [k] notifier_call_chain
>      0.00%             [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
>      0.01%             [kernel.kallsyms]  [k] _raw_spin_lock
>      0.00%             [kernel.kallsyms]  [k] native_read_tsc
>      0.00%             [kernel.kallsyms]  [k] task_tick_fair
>      0.00%             [kernel.kallsyms]  [k] account_user_time
>      0.00%             [kernel.kallsyms]  [k] unmap_page_range
>      0.00%             [kernel.kallsyms]  [k] kfree
>      0.00%             [kernel.kallsyms]  [k] sidtab_context_to_sid
>      0.00%             ld-2.17.so         [.] _dl_relocate_object
>      0.00%             [kernel.kallsyms]  [k] update_cfs_rq_blocked_load
>      0.00%             [kernel.kallsyms]  [k] perf_event_task_tick
>      0.00%             [kernel.kallsyms]  [k] rcu_irq_enter
>      0.00%             [kernel.kallsyms]  [k] run_timer_softirq
>      0.00%             [kernel.kallsyms]  [k] task_cputime
>      0.00%             [kernel.kallsyms]  [k] exit_idle
>      0.00%             [kernel.kallsyms]  [k] __update_cpu_load
>      0.00%             [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
>      0.00%             [kernel.kallsyms]  [k] account_process_tick
>      0.00%             [kernel.kallsyms]  [k] raise_softirq
>      0.00%             [kernel.kallsyms]  [k] update_process_times
>      0.01%             [kernel.kallsyms]  [k] native_sched_clock
>      0.00%             [kernel.kallsyms]  [k] therm_throt_process
>      0.00%             [kernel.kallsyms]  [k] intel_thermal_interrupt
>      0.01%             [kernel.kallsyms]  [k] update_vsyscall
>      0.00%             [kernel.kallsyms]  [k] jiffies_to_timeval
>      0.00%             [kernel.kallsyms]  [k] intel_pstate_timer_func
>      0.00%             [kernel.kallsyms]  [k] x86_pmu_disable
>      0.00%             [kernel.kallsyms]  [k] __do_softirq
>      0.00%             [kernel.kallsyms]  [k] hrtimer_interrupt
>      0.00%             [kernel.kallsyms]  [k] tick_sched_do_timer
>      0.00%             [kernel.kallsyms]  [k] do_softirq
>      0.00%             [kernel.kallsyms]  [k] sidtab_search_core
>      0.00%             [kernel.kallsyms]  [k] selinux_inode_permission
>      0.00%             [kernel.kallsyms]  [k] kmem_cache_alloc
>      0.00%             [kernel.kallsyms]  [k] finish_wait
> [root@ssdandy l]#
> 
> After this there were the changes to allow using the hists browser to be
> used with 'top' and 'diff' was left broken for a while, I'll see if I can continue
> bisecting this after lunch.
> 

Hi Arnaldo,

I'm not quite what should I modified for this patch.

I think it may be broken after the commit as below (not test)
9c443dfdd31e ("perf diff: Fix support for all --sort combinations").

Should I change the patch description to bug fix, remove the examples,
 and remove the changes in Documentation?

It's only one-line change, so I will keep the code part.

What do you think?

Thanks,
Kan



^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH V6 1/1] perf tool: perf diff support for different binaries
  2015-01-06 14:53 ` Arnaldo Carvalho de Melo
  2015-01-06 16:16   ` Arnaldo Carvalho de Melo
  2015-01-06 16:39   ` Liang, Kan
@ 2015-01-26 14:15   ` Liang, Kan
  2 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2015-01-26 14:15 UTC (permalink / raw)
  To: 'Arnaldo Carvalho de Melo'; +Cc: jolsa, namhyung, linux-kernel, ak



> 
> Em Tue, Dec 02, 2014 at 10:39:18AM -0500, kan.liang@intel.com escreveu:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > Currently, the perf diff only works with same binaries. That's because
> > it compares the symbol start address. It doesn't work if the perf.data
> > comes from different binaries. This patch matches the function names.
> 
> Sorry Kan, I saw these patches, its just that it the above statement looked
> completely strange to me, i.e. of course it should look at the dso name,
> then at the symbol name, comparing via addresses makes no sense to me,
> so I kept leaving this patch to look after processing other patches and doing
> other things, then the year end holidays, etc.
> 
> So now I'm looking at old code, from 2009:
> 
>   commit 604c5c92972dcb4b98be34775452d09a5d4ec248
>   Author: Arnaldo Carvalho de Melo <acme@redhat.com>
>   Date:   Wed Dec 16 14:09:53 2009 -0200
> 
>     perf diff: Change the default sort order to "dso,symbol"
> 
>     This is a more intuitive / more meaningful default:
> 
>     $ perf diff | head -8
>          9.02%     +1.00%  libc-2.10.1.so               [.] _IO_vfprintf_internal
>          2.91%     -1.00%  [kernel]                     [k] __kmalloc
>          2.85%     -1.00%  [kernel]                     [k] ext4_htree_store_dirent
>          1.99%     -1.00%  [kernel]                     [k] _atomic_dec_and_lock
>          2.44%             [kernel]
>     $
> 
>     Suggested-by: Ingo Molnar <mingo@elte.hu>
> 
> and I see:
> 
> static void perf_session__match_hists(struct perf_session *old_session,
>                                       struct perf_session *new_session) {
>         struct rb_node *nd;
> 
>         perf_session__resort_by_name(old_session);
> 
>         for (nd = rb_first(&new_session->hists); nd; nd = rb_next(nd)) {
>                 struct hist_entry *pos = rb_entry(nd, struct hist_entry, rb_node);
>                 pos->pair = perf_session__find_hist_entry_by_name(old_session,
> pos);
>         }
> }
> 
> Ok, it will process all samples, store everything in hist_entries, etc, then
> traverse all entries in the most recent perf.data file and look for a pair, _by
> name_:
> 
> static struct hist_entry *
> perf_session__find_hist_entry_by_name(struct perf_session *self,
>                                       struct hist_entry *he) {
>         struct rb_node *n = self->hists.rb_node;
> 
>         while (n) {
>                 struct hist_entry *iter = rb_entry(n, struct hist_entry, rb_node);
>                 int cmp = strcmp(he->map->dso->name, iter->map->dso->name);
> 
>                 if (cmp > 0)
>                         n = n->rb_left;
>                 else if (cmp < 0)
>                         n = n->rb_right;
>                 else {
>                         cmp = strcmp(he->sym->name, iter->sym->name);
>                         if (cmp > 0)
>                                 n = n->rb_left;
>                         else if (cmp < 0)
>                                 n = n->rb_right;
>                         else
>                                 return iter;
>                 }
>         }
> 
>         return NULL;
> }
> 
> See? Resort the old session by "dso,symbol_name", then go on pairing
> them.
> 
> So at some point this got broken :-\
> 


Hi Arnaldo,

I tried the 2009 version perf diff. The code still only work for the same binaries.

Although it compared by name, it used a global list of dsos. 
That means the binaries which has same name can only be loaded once. It's OK
for same binaries. But for different binaries, it's a problem. For example, we have
an old binary and an updated binary. They very likely have same name and most
of the function. So only dsos from old binary will be loaded. When processing the
data from updated binary, perf still use the symbol information from old binary. 
That's wrong for different binaries.

Then about two weeks later, the patch as below to use IP to replace symbol name.

author	Arnaldo Carvalho de Melo <acme@redhat.com>	2009-12-29 00:48:36 (GMT)
committer	Ingo Molnar <mingo@elte.hu>	2009-12-30 11:00:00 (GMT)
commit	9c443dfdd31eddea6cbe6ee0ca469fbcc4e1dc3b (patch)
tree	6e217aaf89b941fbbc7b905046d5350392d642db /tools/perf
parent	cdbae31408cf39372402076cf2e189ec693daa71 (diff)
perf diff: Fix support for all --sort combinations


The global dsos is discarded from a patch in 2010. Since the previous patch has already
use the IP to compare. So it still doesn't work for different binaries

author	Zhang, Yanmin <yanmin_zhang@linux.intel.com>	2010-04-19 05:32:50 (GMT)
committer	Avi Kivity <avi@redhat.com>	2010-04-19 09:37:24 (GMT)
commit	a1645ce12adb6c9cc9e19d7695466204e3f017fe (patch)
tree	5d31aaaf534997e6e9cebc07f38eca35f76986cf /tools/perf
parent	ff9d07a0e7ce756a183e7c2e483aec452ee6b574 (diff)
perf: 'perf kvm' tool for monitoring guest performance from host


So I think perf diff never works for different binaries.  My patch should not be a bug fix,
but a feature enhancement as I mentioned in the description. 

What do you think?

Thanks,
Kan


> I.e. this is not changing an existing correct behaviour, but fixing a bug, i.e.
> the changeset comment confused me :-\
> 
> Now I'm trying to figure out _when_ this got broken, what was the
> reasoning for that code to have changed in a way that made it not match
> by dso_name, symbol_name.
> 
> Also please, when providing code to be built to reproduce your results,
> make it self contained,  i.e. in the case below, I had to replace "noinline"
> with "__attribute__((noinline))".
> 
> - Arnaldo
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-01-26 14:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-02 15:39 [PATCH V6 1/1] perf tool: perf diff support for different binaries kan.liang
2015-01-05 16:04 ` Liang, Kan
2015-01-06 14:53 ` Arnaldo Carvalho de Melo
2015-01-06 16:16   ` Arnaldo Carvalho de Melo
2015-01-12 18:06     ` Liang, Kan
2015-01-06 16:39   ` Liang, Kan
2015-01-26 14:15   ` Liang, Kan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).