LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Vince Weaver <vincent.weaver@maine.edu>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org,
	vincent.weaver@maine.edu, eranian@gmail.com, jolsa@redhat.com,
	mark.rutland@arm.com, torvalds@linux-foundation.org,
	tglx@linutronix.de
Subject: Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
Date: Tue, 27 Jan 2015 16:28:47 -0500 (EST)	[thread overview]
Message-ID: <alpine.DEB.2.11.1501271621550.5038@vincent-weaver-1.umelst.maine.edu> (raw)
In-Reply-To: <20150126162639.GA21418@twins.programming.kicks-ass.net>

On Mon, 26 Jan 2015, Peter Zijlstra wrote:

> I think the below should cure this; if we install a group leader it will
> iterate the (still intact) group list and find its siblings and try and
> install those too -- even though those still have the old event->ctx --
> in the new ctx.

I've been fuzzing 24 hours (despite the blizzard) with this patch
plus the original series on top of 3.19-rc6 on the Haswell machine and it 
is looking good.

Still getting the "perfevents: irq loop stuck!" warnings seen below, but 
that's probably a problem for a different day.  

(as an aside, when this warning is triggered, almost always the fixed 
counter 0 has a count of 0000fffffffffffe, not sure if that's relevant)

[27788.169271] ------------[ cut here ]------------
[27788.174226] WARNING: CPU: 2 PID: 0 at arch/x86/kernel/cpu/perf_event_intel.c:1372 intel_pmu_handle_irq+0x282/0x3a0()
[27788.185501] perfevents: irq loop stuck!
[27788.189640] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
[27788.261403] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0-rc6+ #126
[27788.268409] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[27788.276362]  ffffffff81a266b0 ffff88011ea85b50 ffffffff816b6761 0000000000000000
[27788.284445]  ffff88011ea85ba0 ffff88011ea85b90 ffffffff8106dcda ffffffff8115c439
[27788.292492]  0000000000000064 ffff88011ea8bd40 ffff8800366d6000 0000000000000040
[27788.300600] Call Trace:
[27788.303225]  <NMI>  [<ffffffff816b6761>] dump_stack+0x45/0x57
[27788.309459]  [<ffffffff8106dcda>] warn_slowpath_common+0x8a/0xc0
[27788.315855]  [<ffffffff8115c439>] ? __perf_event_overflow+0xd9/0x270
[27788.322724]  [<ffffffff8106dd56>] warn_slowpath_fmt+0x46/0x50
[27788.328943]  [<ffffffff81032732>] intel_pmu_handle_irq+0x282/0x3a0
[27788.335605]  [<ffffffff810b405c>] ? check_chain_key+0x12c/0x1e0
[27788.341986]  [<ffffffff8102a04b>] perf_event_nmi_handler+0x2b/0x50
[27788.348620]  [<ffffffff81018610>] nmi_handle+0xa0/0x150
[27788.354261]  [<ffffffff81018575>] ? nmi_handle+0x5/0x150
[27788.359898]  [<ffffffff810188ba>] default_do_nmi+0x4a/0x140
[27788.365938]  [<ffffffff81018a48>] do_nmi+0x98/0xe0
[27788.371107]  [<ffffffff816c0bf1>] end_repeat_nmi+0x1e/0x2e
[27788.377003]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[27788.383629]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[27788.390296]  [<ffffffff81056e9a>] ? native_write_msr_safe+0xa/0x10
[27788.396953]  <<EOE>>  <IRQ>  [<ffffffff81032346>] intel_pmu_enable_event+0x236/0x250
[27788.405309]  [<ffffffff8102b281>] x86_pmu_start+0x81/0x120
[27788.411284]  [<ffffffff8102ba55>] x86_pmu_enable+0x295/0x310
[27788.417339]  [<ffffffff8115858a>] perf_pmu_enable+0x2a/0x30
[27788.423344]  [<ffffffff81029e38>] x86_pmu_commit_txn+0x78/0xa0
[27788.429615]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[27788.436600]  [<ffffffff8115aec4>] ? perf_event_update_userpage+0xd4/0x160
[27788.443927]  [<ffffffff810b594f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[27788.450880]  [<ffffffff810b47c8>] ? __lock_is_held+0x58/0x80
[27788.456951]  [<ffffffff81159410>] ? event_sched_in.isra.74+0x180/0x280
[27788.463962]  [<ffffffff811596c8>] group_sched_in+0x1b8/0x1e0
[27788.470071]  [<ffffffff8101d90a>] ? native_sched_clock+0x2a/0x90
[27788.476575]  [<ffffffff81159ddc>] __perf_event_enable+0x25c/0x2a0
[27788.483139]  [<ffffffff810ea649>] ? tick_nohz_irq_exit+0x29/0x30
[27788.489614]  [<ffffffff81154570>] remote_function+0x50/0x60
[27788.495572]  [<ffffffff810ef822>] flush_smp_call_function_queue+0x62/0x140
[27788.502919]  [<ffffffff810efe43>] generic_smp_call_function_single_interrupt+0x13/0x60
[27788.511384]  [<ffffffff81046e07>] smp_call_function_single_interrupt+0x27/0x40
[27788.519150]  [<ffffffff816bfc7d>] call_function_single_interrupt+0x6d/0x80
[27788.527654]  <EOI>  [<ffffffff81554185>] ? cpuidle_enter_state+0x65/0x160
[27788.536161]  [<ffffffff81554171>] ? cpuidle_enter_state+0x51/0x160
[27788.543978]  [<ffffffff81554367>] cpuidle_enter+0x17/0x20
[27788.550939]  [<ffffffff810aec31>] cpu_startup_entry+0x311/0x3c0
[27788.558382]  [<ffffffff810476e0>] start_secondary+0x140/0x150
[27788.565763] ---[ end trace 55752a03ec8ab977 ]---
[27788.574572] CPU#2: ctrl:       0000000000000000
[27788.580619] CPU#2: status:     0000000000000000
[27788.586609] CPU#2: overflow:   0000000000000000
[27788.592580] CPU#2: fixed:      00000000000000ba
[27788.598551] CPU#2: pebs:       0000000000000000
[27788.604522] CPU#2: active:     0000000300000000
[27788.610460] CPU#2:   gen-PMC0 ctrl:  00000000a8938001
[27788.617003] CPU#2:   gen-PMC0 count: 0000000000014b9c
[27788.623545] CPU#2:   gen-PMC0 left:  0000ffffffffffff
[27788.630032] CPU#2:   gen-PMC1 ctrl:  000000001833ca08
[27788.636563] CPU#2:   gen-PMC1 count: 0000000000000001
[27788.643123] CPU#2:   gen-PMC1 left:  0000ffffffffffff
[27788.649666] CPU#2:   gen-PMC2 ctrl:  0000000300130000
[27788.656189] CPU#2:   gen-PMC2 count: 0000000000000001
[27788.662622] CPU#2:   gen-PMC2 left:  0000ffffffffffff
[27788.669091] CPU#2:   gen-PMC3 ctrl:  00000000801300e2
[27788.675519] CPU#2:   gen-PMC3 count: 0000000000000001
[27788.681882] CPU#2:   gen-PMC3 left:  0000ffffffffffff
[27788.688207] CPU#2: fixed-PMC0 count: 0000fffffffffffe
[27788.694448] CPU#2: fixed-PMC1 count: 0000fff958ec3d6f
[27788.700703] CPU#2: fixed-PMC2 count: 00000000002b3c77
[27788.706922] perf_event_intel: clearing PMU state on CPU#2
[27788.713598] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 544.301 msecs
[27788.723308] perf interrupt took too long (4252836 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[27922.376291] perf interrupt took too long (4219620 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[28039.869794] perf interrupt took too long (4186668 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[28123.044927] perf interrupt took too long (4153967 > 38461), lowering kernel.perf_event_max_sample_rate to 3250
[28128.807297] perf interrupt took too long (4121532 > 71428), lowering kernel.perf_event_max_sample_rate to 1750
[28128.818494] perf interrupt took too long (4089339 > 125000), lowering kernel.perf_event_max_sample_rate to 1000
[28128.829828] perf interrupt took too long (4057399 > 250000), lowering kernel.perf_event_max_sample_rate to 500
[28128.840903] perf interrupt took too long (4025786 > 500000), lowering kernel.perf_event_max_sample_rate to 250
[46336.872295] perf_event_intel: clearing PMU state on CPU#2
[69001.500107] perf_event_intel: clearing PMU state on CPU#2
[69001.561872] perf_event_intel: clearing PMU state on CPU#2





  reply	other threads:[~2015-01-27 21:27 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-23 12:51 [RFC][PATCH 0/3] perf fixes Peter Zijlstra
2015-01-23 12:52 ` [RFC][PATCH 1/3] perf: Tighten (and fix) the grouping condition Peter Zijlstra
2015-01-23 15:02   ` Mark Rutland
2015-01-23 15:07     ` Peter Zijlstra
2015-01-23 15:22       ` Mark Rutland
2015-02-04 12:59         ` Peter Zijlstra
2015-01-28 14:30   ` [tip:perf/urgent] " tip-bot for Peter Zijlstra
2015-01-23 12:52 ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
2015-01-26 16:26   ` Peter Zijlstra
2015-01-27 21:28     ` Vince Weaver [this message]
2015-01-29  2:16       ` Vince Weaver
2015-01-29  7:51         ` Jiri Olsa
2015-01-29 13:44           ` Peter Zijlstra
2015-02-04 14:39             ` [tip:perf/core] perf: Fix put_event() ctx lock tip-bot for Peter Zijlstra
2015-01-29 14:47     ` [RFC][PATCH 2/3] perf: Add a bit of paranoia Peter Zijlstra
2015-02-02  6:33       ` Vince Weaver
2015-02-02 15:42         ` Peter Zijlstra
2015-02-02 17:32           ` Peter Zijlstra
2015-02-04 14:51             ` Jiri Olsa
2015-02-04 16:33               ` Peter Zijlstra
2015-02-04 16:43                 ` Peter Zijlstra
2015-02-04 14:39     ` [tip:perf/core] perf: Fix move_group() order tip-bot for Peter Zijlstra (Intel)
2015-02-04 14:39   ` [tip:perf/core] perf: Add a bit of paranoia tip-bot for Peter Zijlstra
2015-01-23 12:52 ` [RFC][PATCH 3/3] perf: Fix event->ctx locking Peter Zijlstra
2015-02-04 14:39   ` [tip:perf/core] " tip-bot for Peter Zijlstra
2015-01-26 14:30 ` [RFC][PATCH 0/3] perf fixes Vince Weaver
2015-01-26 18:03   ` Vince Weaver
2015-01-26 18:34     ` Jiri Olsa
2015-01-26 18:52       ` Vince Weaver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.11.1501271621550.5038@vincent-weaver-1.umelst.maine.edu \
    --to=vincent.weaver@maine.edu \
    --cc=eranian@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --subject='Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).