LKML Archive on lore.kernel.org
* [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks
@ 2020-02-07 19:39 Waiman Long
  2020-03-04  9:20 ` [tip: timers/core] tick/common: " tip-bot2 for Waiman Long
  2020-03-16  2:20 ` [RFC PATCH v2] tick: " Guenter Roeck
  0 siblings, 2 replies; 6+ messages in thread
From: Waiman Long @ 2020-02-07 19:39 UTC (permalink / raw)
  To: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar
  Cc: linux-kernel, Jeremy Linton, pbunyan, Waiman Long

The tick_periodic() function is used in the early part of the boot
process for timekeeping while the other clock sources are being
initialized.

The current code assumes that all timer interrupts are handled in a
timely manner with no missed ticks. That is not actually true: some
ticks are missed, and there are discrepancies between the tick time
(jiffies) and the timestamps reported in the kernel log. Some systems,
however, are more prone to missing ticks than others. In the extreme
case, the discrepancy can cause a soft lockup message to be printed by
the watchdog kthread. For example, on a Cavium ThunderX2 Sabre arm64
system:

 [   25.496379] watchdog: BUG: soft lockup - CPU#14 stuck for 22s!

On that system, the missed ticks are especially prevalent during the
smp_init() phase of the boot process. With an instrumented kernel, it
was found that about 24s of timestamp time elapsed while the tick
accumulated only 4s.

Investigation and bisection done by others seemed to point to commit
73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or lack
thereof") as the culprit. It could also be a firmware issue, as new
firmware that would fix the problem had been promised.

To properly address this problem, we cannot assume that there will be
no missed ticks in tick_periodic(). The function is now modified to
follow the example of tick_do_update_jiffies64() and use another
reference clock to check for missed ticks. Since the watchdog timer
uses running_clock(), it is used here as the reference. With this patch
applied, the soft lockup problem on the arm64 system is gone and the
tick time tracks the timestamp time much more closely.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/time/tick-common.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

 v2: Avoid direct u64 division and use a better ns-to-ktime conversion.

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 7e5d3524e924..55dbbe0f5573 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -16,6 +16,7 @@
 #include <linux/profile.h>
 #include <linux/sched.h>
 #include <linux/module.h>
+#include <linux/sched/clock.h>
 #include <trace/events/power.h>
 
 #include <asm/irq_regs.h>
@@ -84,12 +85,41 @@ int tick_is_oneshot_available(void)
 static void tick_periodic(int cpu)
 {
 	if (tick_do_timer_cpu == cpu) {
+		/*
+		 * Use running_clock() as reference to check for missing ticks.
+		 */
+		static ktime_t last_update;
+		ktime_t now;
+		int ticks = 1;
+
+		now = ns_to_ktime(running_clock());
 		write_seqlock(&jiffies_lock);
 
-		/* Keep track of the next tick event */
-		tick_next_period = ktime_add(tick_next_period, tick_period);
+		if (last_update) {
+			u64 delta = ktime_sub(now, last_update);
 
-		do_timer(1);
+			/*
+			 * Compute missed ticks
+			 *
+			 * There is likely a persistent delta between
+			 * last_update and tick_next_period. So they are
+			 * updated separately.
+			 */
+			if (delta >= 2 * tick_period) {
+				s64 period = ktime_to_ns(tick_period);
+
+				ticks = ktime_divns(delta, period);
+			}
+			last_update = ktime_add(last_update,
+						ticks * tick_period);
+		} else {
+			last_update = now;
+		}
+
+		/* Keep track of the next tick event */
+		tick_next_period = ktime_add(tick_next_period,
+					     ticks * tick_period);
+		do_timer(ticks);
 		write_sequnlock(&jiffies_lock);
 		update_wall_time();
 	}
-- 
2.18.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip: timers/core] tick/common: Make tick_periodic() check for missing ticks
  2020-02-07 19:39 [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks Waiman Long
@ 2020-03-04  9:20 ` tip-bot2 for Waiman Long
  2020-03-16  2:20 ` [RFC PATCH v2] tick: " Guenter Roeck
  1 sibling, 0 replies; 6+ messages in thread
From: tip-bot2 for Waiman Long @ 2020-03-04  9:20 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Waiman Long, Thomas Gleixner, x86, LKML

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     d441dceb5dce71150f28add80d36d91bbfccba99
Gitweb:        https://git.kernel.org/tip/d441dceb5dce71150f28add80d36d91bbfccba99
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Fri, 07 Feb 2020 14:39:29 -05:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 04 Mar 2020 10:18:11 +01:00

tick/common: Make tick_periodic() check for missing ticks

The tick_periodic() function is used in the early part of the boot
process for timekeeping while the other clock sources are being
initialized.

The current code assumes that all timer interrupts are handled in a
timely manner with no missed ticks. That is not actually true: some
ticks are missed, and there are discrepancies between the tick time
(jiffies) and the timestamps reported in the kernel log. Some systems,
however, are more prone to missing ticks than others. In the extreme
case, the discrepancy can cause a soft lockup message to be printed by
the watchdog kthread. For example, on a Cavium ThunderX2 Sabre arm64
system:

 [   25.496379] watchdog: BUG: soft lockup - CPU#14 stuck for 22s!

On that system, the missed ticks are especially prevalent during the
smp_init() phase of the boot process. With an instrumented kernel, it
was found that about 24s of timestamp time elapsed while the tick
accumulated only 4s.

Investigation and bisection done by others seemed to point to commit
73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or lack
thereof") as the culprit. It could also be a firmware issue, as new
firmware that would fix the problem had been promised.

To properly address this problem, stop assuming that there will be no
missed ticks in tick_periodic(). Modify it to follow the example of
tick_do_update_jiffies64() and use another reference clock to check
for missed ticks. Since the watchdog timer uses running_clock(), it is
used here as the reference. With this applied, the soft lockup problem
on the affected arm64 system is gone and the tick time tracks the
timestamp time much more closely.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200207193929.27308-1-longman@redhat.com
---
 kernel/time/tick-common.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 7e5d352..cce4ed1 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -16,6 +16,7 @@
 #include <linux/profile.h>
 #include <linux/sched.h>
 #include <linux/module.h>
+#include <linux/sched/clock.h>
 #include <trace/events/power.h>
 
 #include <asm/irq_regs.h>
@@ -84,12 +85,41 @@ int tick_is_oneshot_available(void)
 static void tick_periodic(int cpu)
 {
 	if (tick_do_timer_cpu == cpu) {
+		/*
+		 * Use running_clock() as reference to check for missing ticks.
+		 */
+		static ktime_t last_update;
+		ktime_t now;
+		int ticks = 1;
+
+		now = ns_to_ktime(running_clock());
 		write_seqlock(&jiffies_lock);
 
-		/* Keep track of the next tick event */
-		tick_next_period = ktime_add(tick_next_period, tick_period);
+		if (last_update) {
+			u64 delta = ktime_sub(now, last_update);
 
-		do_timer(1);
+			/*
 +			 * Check for possibly missed ticks
+			 *
+			 * There is likely a persistent delta between
+			 * last_update and tick_next_period. So they are
+			 * updated separately.
+			 */
+			if (delta >= 2 * tick_period) {
+				s64 period = ktime_to_ns(tick_period);
+
+				ticks = ktime_divns(delta, period);
+			}
+			last_update = ktime_add(last_update,
+						ticks * tick_period);
+		} else {
+			last_update = now;
+		}
+
+		/* Keep track of the next tick event */
+		tick_next_period = ktime_add(tick_next_period,
+					     ticks * tick_period);
+		do_timer(ticks);
 		write_sequnlock(&jiffies_lock);
 		update_wall_time();
 	}


* Re: [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks
  2020-02-07 19:39 [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks Waiman Long
  2020-03-04  9:20 ` [tip: timers/core] tick/common: " tip-bot2 for Waiman Long
@ 2020-03-16  2:20 ` Guenter Roeck
  2020-03-16  2:43   ` Waiman Long
  1 sibling, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2020-03-16  2:20 UTC (permalink / raw)
  To: Waiman Long
  Cc: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Jeremy Linton, pbunyan

Hi,

On Fri, Feb 07, 2020 at 02:39:29PM -0500, Waiman Long wrote:
> The tick_periodic() function is used at the beginning part of the
> bootup process for time keeping while the other clock sources are
> being initialized.
> 
> The current code assumes that all the timer interrupts are handled in
> a timely manner with no missing ticks. That is not actually true. Some
> ticks are missed and there are some discrepancies between the tick time
> (jiffies) and the timestamp reported in the kernel log.  Some systems,
> however, are more prone to missing ticks than the others.  In the extreme
> case, the discrepancy can actually cause a soft lockup message to be
> printed by the watchdog kthread. For example, on a Cavium ThunderX2
> Sabre arm64 system:
> 
>  [   25.496379] watchdog: BUG: soft lockup - CPU#14 stuck for 22s!
> 
> On that system, the missing ticks are especially prevalent during the
> smp_init() phase of the boot process. With an instrumented kernel,
> it was found that it took about 24s as reported by the timestamp for
> the tick to accumulate 4s of time.
> 
> Investigation and bisection done by others seemed to point to the
> commit 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or
> lack thereof") as the culprit. It could also be a firmware issue as
> new firmware was promised that would fix the issue.
> 
> To properly address this problem, we cannot assume that there will
> be no missing tick in tick_periodic(). This function is now modified
> to follow the example of tick_do_update_jiffies64() by using another
> reference clock to check for missing ticks. Since the watchdog timer
> uses running_clock(), it is used here as the reference. With this patch
> applied, the soft lockup problem in the arm64 system is gone and tick
> time tracks much more closely to the timestamp time.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Since this patch is in linux-next, roughly 10% of my x86 and x86_64
qemu emulation boots are stalling. Typical log:

[    0.002016] smpboot: Total of 1 processors activated (7576.40 BogoMIPS)
[    0.002016] devtmpfs: initialized
[    0.002016] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.002016] futex hash table entries: 256 (order: 3, 32768 bytes, linear)
[    0.002016] xor: measuring software checksum speed

another:

[    0.002653] Freeing SMP alternatives memory: 44K
[    0.002653] smpboot: CPU0: Intel Westmere E56xx/L56xx/X56xx (IBRS update) (family: 0x6, model: 0x2c, stepping: 0x1)
[    0.002653] Performance Events: unsupported p6 CPU model 44 no PMU driver, software events only.
[    0.002653] rcu: Hierarchical SRCU implementation.
[    0.002653] smp: Bringing up secondary CPUs ...
[    0.002653] x86: Booting SMP configuration:
[    0.002653] .... node  #0, CPUs:      #1
[    0.000000] smpboot: CPU 1 Converting physical 0 to logical die 1

... and then there is silence until the test aborts.

This is only (or at least predominantly) seen if the system running
the emulation is under load.

Reverting this patch fixes the problem.

Guenter


* Re: [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks
  2020-03-16  2:20 ` [RFC PATCH v2] tick: " Guenter Roeck
@ 2020-03-16  2:43   ` Waiman Long
  2020-03-16  2:57     ` Guenter Roeck
  0 siblings, 1 reply; 6+ messages in thread
From: Waiman Long @ 2020-03-16  2:43 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Jeremy Linton, pbunyan

On 3/15/20 10:20 PM, Guenter Roeck wrote:
> Hi,
>
> On Fri, Feb 07, 2020 at 02:39:29PM -0500, Waiman Long wrote:
>> The tick_periodic() function is used at the beginning part of the
>> bootup process for time keeping while the other clock sources are
>> being initialized.
>>
>> The current code assumes that all the timer interrupts are handled in
>> a timely manner with no missing ticks. That is not actually true. Some
>> ticks are missed and there are some discrepancies between the tick time
>> (jiffies) and the timestamp reported in the kernel log.  Some systems,
>> however, are more prone to missing ticks than the others.  In the extreme
>> case, the discrepancy can actually cause a soft lockup message to be
>> printed by the watchdog kthread. For example, on a Cavium ThunderX2
>> Sabre arm64 system:
>>
>>  [   25.496379] watchdog: BUG: soft lockup - CPU#14 stuck for 22s!
>>
>> On that system, the missing ticks are especially prevalent during the
>> smp_init() phase of the boot process. With an instrumented kernel,
>> it was found that it took about 24s as reported by the timestamp for
>> the tick to accumulate 4s of time.
>>
>> Investigation and bisection done by others seemed to point to the
>> commit 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or
>> lack thereof") as the culprit. It could also be a firmware issue as
>> new firmware was promised that would fix the issue.
>>
>> To properly address this problem, we cannot assume that there will
>> be no missing tick in tick_periodic(). This function is now modified
>> to follow the example of tick_do_update_jiffies64() by using another
>> reference clock to check for missing ticks. Since the watchdog timer
>> uses running_clock(), it is used here as the reference. With this patch
>> applied, the soft lockup problem in the arm64 system is gone and tick
>> time tracks much more closely to the timestamp time.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
> Since this patch is in linux-next, roughly 10% of my x86 and x86_64
> qemu emulation boots are stalling. Typical log:
>
> [    0.002016] smpboot: Total of 1 processors activated (7576.40 BogoMIPS)
> [    0.002016] devtmpfs: initialized
> [    0.002016] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
> [    0.002016] futex hash table entries: 256 (order: 3, 32768 bytes, linear)
> [    0.002016] xor: measuring software checksum speed
>
> another:
>
> [    0.002653] Freeing SMP alternatives memory: 44K
> [    0.002653] smpboot: CPU0: Intel Westmere E56xx/L56xx/X56xx (IBRS update) (family: 0x6, model: 0x2c, stepping: 0x1)
> [    0.002653] Performance Events: unsupported p6 CPU model 44 no PMU driver, software events only.
> [    0.002653] rcu: Hierarchical SRCU implementation.
> [    0.002653] smp: Bringing up secondary CPUs ...
> [    0.002653] x86: Booting SMP configuration:
> [    0.002653] .... node  #0, CPUs:      #1
> [    0.000000] smpboot: CPU 1 Converting physical 0 to logical die 1
>
> ... and then there is silence until the test aborts.
>
> This is only (or at least predominantly) seen if the system running
> the emulation is under load.
>
> Reverting this patch fixes the problem.

I was aware that there are some problems with this patch, but they are
hard to reproduce. Do you have a more consistent way to reproduce them?
When you say under load, you mean that the host system is also busy so
that there is a lot of vCPU preemption, right? Could you give me the
x86-64 .config file that you use?

Thanks,
Longman



* Re: [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks
  2020-03-16  2:43   ` Waiman Long
@ 2020-03-16  2:57     ` Guenter Roeck
  2020-03-16 14:20       ` Waiman Long
  0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2020-03-16  2:57 UTC (permalink / raw)
  To: Waiman Long
  Cc: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Jeremy Linton, pbunyan

On 3/15/20 7:43 PM, Waiman Long wrote:
> On 3/15/20 10:20 PM, Guenter Roeck wrote:
>> Hi,
>>
>> On Fri, Feb 07, 2020 at 02:39:29PM -0500, Waiman Long wrote:
>>> The tick_periodic() function is used at the beginning part of the
>>> bootup process for time keeping while the other clock sources are
>>> being initialized.
>>>
>>> The current code assumes that all the timer interrupts are handled in
>>> a timely manner with no missing ticks. That is not actually true. Some
>>> ticks are missed and there are some discrepancies between the tick time
>>> (jiffies) and the timestamp reported in the kernel log.  Some systems,
>>> however, are more prone to missing ticks than the others.  In the extreme
>>> case, the discrepancy can actually cause a soft lockup message to be
>>> printed by the watchdog kthread. For example, on a Cavium ThunderX2
>>> Sabre arm64 system:
>>>
>>>  [   25.496379] watchdog: BUG: soft lockup - CPU#14 stuck for 22s!
>>>
>>> On that system, the missing ticks are especially prevalent during the
>>> smp_init() phase of the boot process. With an instrumented kernel,
>>> it was found that it took about 24s as reported by the timestamp for
>>> the tick to accumulate 4s of time.
>>>
>>> Investigation and bisection done by others seemed to point to the
>>> commit 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or
>>> lack thereof") as the culprit. It could also be a firmware issue as
>>> new firmware was promised that would fix the issue.
>>>
>>> To properly address this problem, we cannot assume that there will
>>> be no missing tick in tick_periodic(). This function is now modified
>>> to follow the example of tick_do_update_jiffies64() by using another
>>> reference clock to check for missing ticks. Since the watchdog timer
>>> uses running_clock(), it is used here as the reference. With this patch
>>> applied, the soft lockup problem in the arm64 system is gone and tick
>>> time tracks much more closely to the timestamp time.
>>>
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>> Since this patch is in linux-next, roughly 10% of my x86 and x86_64
>> qemu emulation boots are stalling. Typical log:
>>
>> [    0.002016] smpboot: Total of 1 processors activated (7576.40 BogoMIPS)
>> [    0.002016] devtmpfs: initialized
>> [    0.002016] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
>> [    0.002016] futex hash table entries: 256 (order: 3, 32768 bytes, linear)
>> [    0.002016] xor: measuring software checksum speed
>>
>> another:
>>
>> [    0.002653] Freeing SMP alternatives memory: 44K
>> [    0.002653] smpboot: CPU0: Intel Westmere E56xx/L56xx/X56xx (IBRS update) (family: 0x6, model: 0x2c, stepping: 0x1)
>> [    0.002653] Performance Events: unsupported p6 CPU model 44 no PMU driver, software events only.
>> [    0.002653] rcu: Hierarchical SRCU implementation.
>> [    0.002653] smp: Bringing up secondary CPUs ...
>> [    0.002653] x86: Booting SMP configuration:
>> [    0.002653] .... node  #0, CPUs:      #1
>> [    0.000000] smpboot: CPU 1 Converting physical 0 to logical die 1
>>
>> ... and then there is silence until the test aborts.
>>
>> This is only (or at least predominantly) seen if the system running
>> the emulation is under load.
>>
>> Reverting this patch fixes the problem.
> 
> I was aware that there are some problem with this patch, but it is hard
> to reproduce it. Do you have a more consistent way to reproduce it.
> When  you say under load, you mean that the host system is also busy so
> that there are a lot of vcpu preemption. Right? Could you give me the

Correct. I am able to reproduce the problem quite reliably (i.e., 2-3
boots out of ~25 fail) if I run a kernel compilation in parallel, but
not (or only rarely) if the system is otherwise idle.

> x86-64 .config file that you use?
> 

Attached. It is pretty much defconfig with various debug and test options
enabled.

Guenter

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 30037 bytes --]


* Re: [RFC PATCH v2] tick: Make tick_periodic() check for missing ticks
  2020-03-16  2:57     ` Guenter Roeck
@ 2020-03-16 14:20       ` Waiman Long
  0 siblings, 0 replies; 6+ messages in thread
From: Waiman Long @ 2020-03-16 14:20 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Frederic Weisbecker, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Jeremy Linton, pbunyan

On 3/15/20 10:57 PM, Guenter Roeck wrote:
> On 3/15/20 7:43 PM, Waiman Long wrote:
>> On 3/15/20 10:20 PM, Guenter Roeck wrote:
>>> Hi,
>>>
>>> On Fri, Feb 07, 2020 at 02:39:29PM -0500, Waiman Long wrote:
>>>> The tick_periodic() function is used at the beginning part of the
>>>> bootup process for time keeping while the other clock sources are
>>>> being initialized.
>>>>
>>>> The current code assumes that all the timer interrupts are handled in
>>>> a timely manner with no missing ticks. That is not actually true. Some
>>>> ticks are missed and there are some discrepancies between the tick time
>>>> (jiffies) and the timestamp reported in the kernel log.  Some systems,
>>>> however, are more prone to missing ticks than the others.  In the extreme
>>>> case, the discrepancy can actually cause a soft lockup message to be
>>>> printed by the watchdog kthread. For example, on a Cavium ThunderX2
>>>> Sabre arm64 system:
>>>>
>>>>  [   25.496379] watchdog: BUG: soft lockup - CPU#14 stuck for 22s!
>>>>
>>>> On that system, the missing ticks are especially prevalent during the
>>>> smp_init() phase of the boot process. With an instrumented kernel,
>>>> it was found that it took about 24s as reported by the timestamp for
>>>> the tick to accumulate 4s of time.
>>>>
>>>> Investigation and bisection done by others seemed to point to the
>>>> commit 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or
>>>> lack thereof") as the culprit. It could also be a firmware issue as
>>>> new firmware was promised that would fix the issue.
>>>>
>>>> To properly address this problem, we cannot assume that there will
>>>> be no missing tick in tick_periodic(). This function is now modified
>>>> to follow the example of tick_do_update_jiffies64() by using another
>>>> reference clock to check for missing ticks. Since the watchdog timer
>>>> uses running_clock(), it is used here as the reference. With this patch
>>>> applied, the soft lockup problem in the arm64 system is gone and tick
>>>> time tracks much more closely to the timestamp time.
>>>>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>> Since this patch is in linux-next, roughly 10% of my x86 and x86_64
>>> qemu emulation boots are stalling. Typical log:
>>>
>>> [    0.002016] smpboot: Total of 1 processors activated (7576.40 BogoMIPS)
>>> [    0.002016] devtmpfs: initialized
>>> [    0.002016] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
>>> [    0.002016] futex hash table entries: 256 (order: 3, 32768 bytes, linear)
>>> [    0.002016] xor: measuring software checksum speed
>>>
>>> another:
>>>
>>> [    0.002653] Freeing SMP alternatives memory: 44K
>>> [    0.002653] smpboot: CPU0: Intel Westmere E56xx/L56xx/X56xx (IBRS update) (family: 0x6, model: 0x2c, stepping: 0x1)
>>> [    0.002653] Performance Events: unsupported p6 CPU model 44 no PMU driver, software events only.
>>> [    0.002653] rcu: Hierarchical SRCU implementation.
>>> [    0.002653] smp: Bringing up secondary CPUs ...
>>> [    0.002653] x86: Booting SMP configuration:
>>> [    0.002653] .... node  #0, CPUs:      #1
>>> [    0.000000] smpboot: CPU 1 Converting physical 0 to logical die 1
>>>
>>> ... and then there is silence until the test aborts.
>>>
>>> This is only (or at least predominantly) seen if the system running
>>> the emulation is under load.
>>>
>>> Reverting this patch fixes the problem.
>> I was aware that there are some problem with this patch, but it is hard
>> to reproduce it. Do you have a more consistent way to reproduce it.
>> When  you say under load, you mean that the host system is also busy so
>> that there are a lot of vcpu preemption. Right? Could you give me the
> Correct. I am able to reproduce the problem quite reliably (ie 2-3 boots
> out of ~25 fail) if I run a kernel compilation in parallel, but not (or
> rarely) if the system is otherwise idle.
>
>> x86-64 .config file that you use?
>>
> Attached. It is pretty much defconfig with various debug and test options
> enabled.
>
> Guenter

Thanks,
Longman


