LKML Archive on lore.kernel.org
* [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing
@ 2018-03-31  0:33 Dmitry Safonov
  2018-03-31  0:33 ` [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler Dmitry Safonov
  2018-05-02  2:22 ` [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
  0 siblings, 2 replies; 15+ messages in thread
From: Dmitry Safonov @ 2018-03-31  0:33 UTC (permalink / raw)
  To: linux-kernel, joro
  Cc: 0x7f454c46, Dmitry Safonov, Alex Williamson, David Woodhouse,
	Ingo Molnar, Lu Baolu, iommu

There is a ratelimit for printing, but it's incremented each time the
cpu receives a dmar fault interrupt, while one interrupt may signal
*many* faults.
Measuring the impact shows that reading/clearing one fault takes
< 1 usec, while printing info about the fault takes ~170 msec.

Given that the maximum number of fault recording registers per
remapping hardware unit is 256, the IRQ handler may run for (170*256) msec.
And as the fault-serving loop runs without a time limit, new faults may
occur during servicing.

Ratelimit each fault printing rather than each irq printing.

Fixes: c43fce4eebae ("iommu/vt-d: Ratelimit fault handler")

BUG: spinlock lockup suspected on CPU#0, CliShell/9903
 lock: 0xffffffff81a47440, .magic: dead4ead, .owner: kworker/u16:2/8915, .owner_cpu: 6
CPU: 0 PID: 9903 Comm: CliShell
Call Trace:
[..] dump_stack+0x65/0x83
[..] spin_dump+0x8f/0x94
[..] do_raw_spin_lock+0x123/0x170
[..] _raw_spin_lock_irqsave+0x32/0x3a
[..] uart_chars_in_buffer+0x20/0x4d
[..] tty_chars_in_buffer+0x18/0x1d
[..] n_tty_poll+0x1cb/0x1f2
[..] tty_poll+0x5e/0x76
[..] do_select+0x363/0x629
[..] compat_core_sys_select+0x19e/0x239
[..] compat_SyS_select+0x98/0xc0
[..] sysenter_dispatch+0x7/0x25
[..]
NMI backtrace for cpu 6
CPU: 6 PID: 8915 Comm: kworker/u16:2
Workqueue: dmar_fault dmar_fault_work
Call Trace:
[..] wait_for_xmitr+0x26/0x8f
[..] serial8250_console_putchar+0x1c/0x2c
[..] uart_console_write+0x40/0x4b
[..] serial8250_console_write+0xe6/0x13f
[..] call_console_drivers.constprop.13+0xce/0x103
[..] console_unlock+0x1f8/0x39b
[..] vprintk_emit+0x39e/0x3e6
[..] printk+0x4d/0x4f
[..] dmar_fault+0x1a8/0x1fc
[..] dmar_fault_work+0x15/0x17
[..] process_one_work+0x1e8/0x3a9
[..] worker_thread+0x25d/0x345
[..] kthread+0xea/0xf2
[..] ret_from_fork+0x58/0x90

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/iommu/dmar.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index accf58388bdb..6c4ea32ee6a9 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1618,17 +1618,13 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
 	int reg, fault_index;
 	u32 fault_status;
 	unsigned long flag;
-	bool ratelimited;
 	static DEFINE_RATELIMIT_STATE(rs,
 				      DEFAULT_RATELIMIT_INTERVAL,
 				      DEFAULT_RATELIMIT_BURST);
 
-	/* Disable printing, simply clear the fault when ratelimited */
-	ratelimited = !__ratelimit(&rs);
-
 	raw_spin_lock_irqsave(&iommu->register_lock, flag);
 	fault_status = readl(iommu->reg + DMAR_FSTS_REG);
-	if (fault_status && !ratelimited)
+	if (fault_status && __ratelimit(&rs))
 		pr_err("DRHD: handling fault status reg %x\n", fault_status);
 
 	/* TBD: ignore advanced fault log currently */
@@ -1638,6 +1634,8 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
 	fault_index = dma_fsts_fault_record_index(fault_status);
 	reg = cap_fault_reg_offset(iommu->cap);
 	while (1) {
+		/* Disable printing, simply clear the fault when ratelimited */
+		bool ratelimited = !__ratelimit(&rs);
 		u8 fault_reason;
 		u16 source_id;
 		u64 guest_addr;
-- 
2.13.6
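
The effect of moving the ratelimit check into the loop can be sketched
with a small userspace model. This is only an illustration, not the
kernel code: struct rl_state, rl_allow() and the two counting helpers
are hypothetical names, and the model ignores the extra check done for
the fault status message before the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a ratelimit state; not the kernel's struct. */
struct rl_state {
	long interval;  /* window length, in arbitrary ticks */
	int burst;      /* messages allowed per window */
	long begin;     /* start of the current window */
	int printed;    /* messages allowed so far in this window */
};

/* Returns true when the caller may print, false when it is ratelimited. */
static bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);
	if (now - rs->begin >= rs->interval) {
		/* A new window starts: forget what was printed before. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Old behaviour: one ratelimit check per interrupt covers all its faults. */
static int prints_per_irq_check(struct rl_state *rs, long now, int nr_faults)
{
	bool ratelimited = !rl_allow(rs, now);
	int i, printed = 0;

	for (i = 0; i < nr_faults; i++)
		if (!ratelimited)
			printed++;
	return printed;
}

/* Patched behaviour: one ratelimit check per fault record. */
static int prints_per_fault_check(struct rl_state *rs, long now, int nr_faults)
{
	int i, printed = 0;

	for (i = 0; i < nr_faults; i++)
		if (rl_allow(rs, now))
			printed++;
	return printed;
}
```

Assuming a burst of 10 messages per window, one interrupt that carries
256 faults prints 256 messages with the per-irq check and 10 with the
per-fault check. At ~170 msec per message that is roughly 43.5 sec
versus 1.7 sec spent printing.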

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-03-31  0:33 [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
@ 2018-03-31  0:33 ` Dmitry Safonov
  2018-05-02  6:34   ` Lu Baolu
  2018-05-02  2:22 ` [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
  1 sibling, 1 reply; 15+ messages in thread
From: Dmitry Safonov @ 2018-03-31  0:33 UTC (permalink / raw)
  To: linux-kernel, joro
  Cc: 0x7f454c46, Dmitry Safonov, Alex Williamson, David Woodhouse,
	Ingo Molnar, Lu Baolu, iommu

Theoretically, on some machines faults might be generated faster than
the CPU clears them. Let's limit the clearing loop to the number of hw
fault registers.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/iommu/dmar.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 6c4ea32ee6a9..cf1105111209 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1615,7 +1615,7 @@ static int dmar_fault_do_one(struct intel_iommu *iommu, int type,
 irqreturn_t dmar_fault(int irq, void *dev_id)
 {
 	struct intel_iommu *iommu = dev_id;
-	int reg, fault_index;
+	int reg, fault_index, i;
 	u32 fault_status;
 	unsigned long flag;
 	static DEFINE_RATELIMIT_STATE(rs,
@@ -1633,7 +1633,7 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
 
 	fault_index = dma_fsts_fault_record_index(fault_status);
 	reg = cap_fault_reg_offset(iommu->cap);
-	while (1) {
+	for (i = 0; i < cap_num_fault_regs(iommu->cap); i++) {
 		/* Disable printing, simply clear the fault when ratelimited */
 		bool ratelimited = !__ratelimit(&rs);
 		u8 fault_reason;
-- 
2.13.6

* Re: [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing
  2018-03-31  0:33 [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
  2018-03-31  0:33 ` [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler Dmitry Safonov
@ 2018-05-02  2:22 ` Dmitry Safonov
  2018-05-03 12:40   ` Joerg Roedel
  1 sibling, 1 reply; 15+ messages in thread
From: Dmitry Safonov @ 2018-05-02  2:22 UTC (permalink / raw)
  To: linux-kernel, joro
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar,
	Lu Baolu, iommu

Hi Joerg,

is there anything I can do about these two patches?
In 2/2 I've limited the loop count as discussed in v3.
This one solves a softlockup for us, so it might be useful.

On Sat, 2018-03-31 at 01:33 +0100, Dmitry Safonov wrote:
> There is a ratelimit for printing, but it's incremented each time the
> cpu recives dmar fault interrupt. While one interrupt may signal
> about
> *many* faults.
> So, measuring the impact it turns out that reading/clearing one fault
> takes < 1 usec, and printing info about the fault takes ~170 msec.
> 
> Having in mind that maximum number of fault recording registers per
> remapping hardware unit is 256.. IRQ handler may run for (170*256)
> msec.
> And as fault-serving loop runs without a time limit, during servicing
> new faults may occur..
> 
> Ratelimit each fault printing rather than each irq printing.
> 
> Fixes: commit c43fce4eebae ("iommu/vt-d: Ratelimit fault handler")
> 
> BUG: spinlock lockup suspected on CPU#0, CliShell/9903
>  lock: 0xffffffff81a47440, .magic: dead4ead, .owner:
> kworker/u16:2/8915, .owner_cpu: 6
> CPU: 0 PID: 9903 Comm: CliShell
> Call Trace:
> [..] dump_stack+0x65/0x83
> [..] spin_dump+0x8f/0x94
> [..] do_raw_spin_lock+0x123/0x170
> [..] _raw_spin_lock_irqsave+0x32/0x3a
> [..] uart_chars_in_buffer+0x20/0x4d
> [..] tty_chars_in_buffer+0x18/0x1d
> [..] n_tty_poll+0x1cb/0x1f2
> [..] tty_poll+0x5e/0x76
> [..] do_select+0x363/0x629
> [..] compat_core_sys_select+0x19e/0x239
> [..] compat_SyS_select+0x98/0xc0
> [..] sysenter_dispatch+0x7/0x25
> [..]
> NMI backtrace for cpu 6
> CPU: 6 PID: 8915 Comm: kworker/u16:2
> Workqueue: dmar_fault dmar_fault_work
> Call Trace:
> [..] wait_for_xmitr+0x26/0x8f
> [..] serial8250_console_putchar+0x1c/0x2c
> [..] uart_console_write+0x40/0x4b
> [..] serial8250_console_write+0xe6/0x13f
> [..] call_console_drivers.constprop.13+0xce/0x103
> [..] console_unlock+0x1f8/0x39b
> [..] vprintk_emit+0x39e/0x3e6
> [..] printk+0x4d/0x4f
> [..] dmar_fault+0x1a8/0x1fc
> [..] dmar_fault_work+0x15/0x17
> [..] process_one_work+0x1e8/0x3a9
> [..] worker_thread+0x25d/0x345
> [..] kthread+0xea/0xf2
> [..] ret_from_fork+0x58/0x90
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: iommu@lists.linux-foundation.org
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  drivers/iommu/dmar.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index accf58388bdb..6c4ea32ee6a9 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1618,17 +1618,13 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
>  	int reg, fault_index;
>  	u32 fault_status;
>  	unsigned long flag;
> -	bool ratelimited;
>  	static DEFINE_RATELIMIT_STATE(rs,
>  				      DEFAULT_RATELIMIT_INTERVAL,
>  				      DEFAULT_RATELIMIT_BURST);
>  
> -	/* Disable printing, simply clear the fault when ratelimited */
> -	ratelimited = !__ratelimit(&rs);
> -
>  	raw_spin_lock_irqsave(&iommu->register_lock, flag);
>  	fault_status = readl(iommu->reg + DMAR_FSTS_REG);
> -	if (fault_status && !ratelimited)
> +	if (fault_status && __ratelimit(&rs))
> 		pr_err("DRHD: handling fault status reg %x\n", fault_status);
>  
>  	/* TBD: ignore advanced fault log currently */
> @@ -1638,6 +1634,8 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
>  	fault_index = dma_fsts_fault_record_index(fault_status);
>  	reg = cap_fault_reg_offset(iommu->cap);
>  	while (1) {
> +		/* Disable printing, simply clear the fault when ratelimited */
> +		bool ratelimited = !__ratelimit(&rs);
>  		u8 fault_reason;
>  		u16 source_id;
>  		u64 guest_addr;

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-03-31  0:33 ` [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler Dmitry Safonov
@ 2018-05-02  6:34   ` Lu Baolu
  2018-05-02 12:38     ` Dmitry Safonov
  0 siblings, 1 reply; 15+ messages in thread
From: Lu Baolu @ 2018-05-02  6:34 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi,

On 03/31/2018 08:33 AM, Dmitry Safonov wrote:
> Theoretically, on some machines faults might be generated faster than
> they're cleared by CPU.

Is this a real case?

>  Let's limit the cleaning-loop by number of hw
> fault registers.

Will this leave the fault recording registers full of faults, so that new
faults are dropped without logging? And, even worse, will new faults
fail to generate interrupts?

Best regards,
Lu Baolu

>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Cc: iommu@lists.linux-foundation.org
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  drivers/iommu/dmar.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 6c4ea32ee6a9..cf1105111209 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1615,7 +1615,7 @@ static int dmar_fault_do_one(struct intel_iommu *iommu, int type,
>  irqreturn_t dmar_fault(int irq, void *dev_id)
>  {
>  	struct intel_iommu *iommu = dev_id;
> -	int reg, fault_index;
> +	int reg, fault_index, i;
>  	u32 fault_status;
>  	unsigned long flag;
>  	static DEFINE_RATELIMIT_STATE(rs,
> @@ -1633,7 +1633,7 @@ irqreturn_t dmar_fault(int irq, void *dev_id)
>  
>  	fault_index = dma_fsts_fault_record_index(fault_status);
>  	reg = cap_fault_reg_offset(iommu->cap);
> -	while (1) {
> +	for (i = 0; i < cap_num_fault_regs(iommu->cap); i++) {
>  		/* Disable printing, simply clear the fault when ratelimited */
>  		bool ratelimited = !__ratelimit(&rs);
>  		u8 fault_reason;

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-02  6:34   ` Lu Baolu
@ 2018-05-02 12:38     ` Dmitry Safonov
  2018-05-02 23:49       ` Lu Baolu
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Safonov @ 2018-05-02 12:38 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi Lu,

On Wed, 2018-05-02 at 14:34 +0800, Lu Baolu wrote:
> Hi,
> 
> On 03/31/2018 08:33 AM, Dmitry Safonov wrote:
> > Theoretically, on some machines faults might be generated faster
> > than
> > they're cleared by CPU.
> 
> Is this a real case?

No. 1/2 is a real case and this one was discussed on v3:
lkml.kernel.org/r/<20180215191729.15777-1-dima@arista.com>

I couldn't trigger it on my hw as far as I tried, but the outcome of the
discussion was to fix this theoretical issue too.

> 
> >  Let's limit the cleaning-loop by number of hw
> > fault registers.
> 
> Will this cause the fault recording registers full of faults, hence
> new faults will be dropped without logging?

If faults come faster than they're being cleared, some of them will be
dropped without logging. I'm not sure it's worth reporting all faults in
such a theoretical(!) situation.
If the number of reported faults in such a situation is not enough and
it's worth keeping all the faults, then we should probably introduce a
workqueue here (which I did in v1, but it was rejected on the grounds
that it would introduce some latency in fault reporting).

> And even worse, new faults will not generate interrupts?

They will: we clear the fault overflow bit (PFO) outside of the loop, so
any new fault will raise an interrupt, iiuc.

-- 
Thanks,
             Dmitry

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-02 12:38     ` Dmitry Safonov
@ 2018-05-02 23:49       ` Lu Baolu
  2018-05-03  0:52         ` Dmitry Safonov
  0 siblings, 1 reply; 15+ messages in thread
From: Lu Baolu @ 2018-05-02 23:49 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi,

On 05/02/2018 08:38 PM, Dmitry Safonov wrote:
> Hi Lu,
>
> On Wed, 2018-05-02 at 14:34 +0800, Lu Baolu wrote:
>> Hi,
>>
>> On 03/31/2018 08:33 AM, Dmitry Safonov wrote:
>>> Theoretically, on some machines faults might be generated faster
>>> than
>>> they're cleared by CPU.
>> Is this a real case?
> No. 1/2 is a real case and this one was discussed on v3:
> lkml.kernel.org/r/<20180215191729.15777-1-dima@arista.com>
>
> It's not possible on my hw as far as I tried, but the discussion result
> was to fix this theoretical issue too.

If faults are generated faster than the CPU can clear them, the PCIe
device must be in a very, very bad state. How about disabling
the PCIe device and asking the administrator to replace it? Anyway,
I don't think that's the goal of this patch series. :-)

>
>>>  Let's limit the cleaning-loop by number of hw
>>> fault registers.
>> Will this cause the fault recording registers full of faults, hence
>> new faults will be dropped without logging?
> If faults come faster then they're being cleared - some of them will be
> dropped without logging. Not sure if it's worth to report all faults in
> such theoretical(!) situation.
> If amount of reported faults for such situation is not enough and it's
> worth to keep all the faults, then probably we should introduce a
> workqueue here (which I did in v1, but it was rejected by the reason
> that it will introduce some latency in fault reporting).
>
>> And even worse, new faults will not generate interrupts?
> They will, we clear page fault overflow outside of the loop, so any new
> fault will raise interrupt, iiuc.
>

I am afraid that they might not generate interrupts any more.

Say, the fault registers are full of events that are not cleared,
then a new fault comes. There is no room for this event and
hence the hardware might drop it silently.

Best regards,
Lu Baolu

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-02 23:49       ` Lu Baolu
@ 2018-05-03  0:52         ` Dmitry Safonov
  2018-05-03  1:32           ` Lu Baolu
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Safonov @ 2018-05-03  0:52 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

On Thu, 2018-05-03 at 07:49 +0800, Lu Baolu wrote:
> Hi,
> 
> On 05/02/2018 08:38 PM, Dmitry Safonov wrote:
> > Hi Lu,
> > 
> > On Wed, 2018-05-02 at 14:34 +0800, Lu Baolu wrote:
> > > Hi,
> > > 
> > > On 03/31/2018 08:33 AM, Dmitry Safonov wrote:
> > > > Theoretically, on some machines faults might be generated
> > > > faster
> > > > than
> > > > they're cleared by CPU.
> > > 
> > > Is this a real case?
> > 
> > No. 1/2 is a real case and this one was discussed on v3:
> > lkml.kernel.org/r/<20180215191729.15777-1-dima@arista.com>
> > 
> > It's not possible on my hw as far as I tried, but the discussion
> > result
> > was to fix this theoretical issue too.
> 
> If faults are generated faster than CPU can clear them, the PCIe
> device should be in a very very bad state. How about disabling
> the PCIe device and ask the administrator to replace it? Anyway,
> I don't think that's goal of this patch series. :-)

Uhm, yeah, my point is not about the number of faults, but about the
physical ability of the iommu to generate faults faster than the cpu
processes them. I might be wrong that it's not possible (like low cpu freq?)

But the number of interrupts might be high. Say you have many
mappings on the iommu and a PCIe device goes away. It could be just a link
flap. I think it makes sense not to lock up on such occasions.

> > > >  Let's limit the cleaning-loop by number of hw
> > > > fault registers.
> > > 
> > > Will this cause the fault recording registers full of faults,
> > > hence
> > > new faults will be dropped without logging?
> > 
> > If faults come faster then they're being cleared - some of them
> > will be
> > dropped without logging. Not sure if it's worth to report all
> > faults in
> > such theoretical(!) situation.
> > If amount of reported faults for such situation is not enough and
> > it's
> > worth to keep all the faults, then probably we should introduce a
> > workqueue here (which I did in v1, but it was rejected by the
> > reason
> > that it will introduce some latency in fault reporting).
> > 
> > > And even worse, new faults will not generate interrupts?
> > 
> > They will, we clear page fault overflow outside of the loop, so any
> > new
> > fault will raise interrupt, iiuc.
> > 
> 
> I am afraid that they might not generate interrupts any more.
> 
> Say, the fault registers are full of events that are not cleared,
> then a new fault comes. There is no room for this event and
> hence the hardware might drop it silently.

AFAICS, we're doing fault-clearing in a loop inside the irq handler.
That means that if a fault is raised while we're clearing, it will
assert the irq (level- or edge-triggered) on the lapic. So, whenever we
return from the irq handler, the irq will be raised again.

-- 
Thanks,
             Dmitry

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-03  0:52         ` Dmitry Safonov
@ 2018-05-03  1:32           ` Lu Baolu
  2018-05-03  1:59             ` Dmitry Safonov
  0 siblings, 1 reply; 15+ messages in thread
From: Lu Baolu @ 2018-05-03  1:32 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi,

On 05/03/2018 08:52 AM, Dmitry Safonov wrote:
> On Thu, 2018-05-03 at 07:49 +0800, Lu Baolu wrote:
>> Hi,
>>
>> On 05/02/2018 08:38 PM, Dmitry Safonov wrote:
>>> Hi Lu,
>>>
>>> On Wed, 2018-05-02 at 14:34 +0800, Lu Baolu wrote:
>>>> Hi,
>>>>
>>>> On 03/31/2018 08:33 AM, Dmitry Safonov wrote:
>>>>> Theoretically, on some machines faults might be generated
>>>>> faster
>>>>> than
>>>>> they're cleared by CPU.
>>>> Is this a real case?
>>> No. 1/2 is a real case and this one was discussed on v3:
>>> lkml.kernel.org/r/<20180215191729.15777-1-dima@arista.com>
>>>
>>> It's not possible on my hw as far as I tried, but the discussion
>>> result
>>> was to fix this theoretical issue too.
>> If faults are generated faster than CPU can clear them, the PCIe
>> device should be in a very very bad state. How about disabling
>> the PCIe device and ask the administrator to replace it? Anyway,
>> I don't think that's goal of this patch series. :-)
> Uhm, yeah, my point is not about the number of faults, but about
> physical ability of iommu to generate faults faster than cpu processes
> them. I might be wrong that it's not possible (like low cpu freq?)
>
> But the number of interrupts might be high. It's like you've many
> mappings on iommu and PCIe device went off. It could be just a link
> flap. I think it makes sense not lockup on such occasions.
>
>>>>>  Let's limit the cleaning-loop by number of hw
>>>>> fault registers.
>>>> Will this cause the fault recording registers full of faults,
>>>> hence
>>>> new faults will be dropped without logging?
>>> If faults come faster then they're being cleared - some of them
>>> will be
>>> dropped without logging. Not sure if it's worth to report all
>>> faults in
>>> such theoretical(!) situation.
>>> If amount of reported faults for such situation is not enough and
>>> it's
>>> worth to keep all the faults, then probably we should introduce a
>>> workqueue here (which I did in v1, but it was rejected by the
>>> reason
>>> that it will introduce some latency in fault reporting).
>>>
>>>> And even worse, new faults will not generate interrupts?
>>> They will, we clear page fault overflow outside of the loop, so any
>>> new
>>> fault will raise interrupt, iiuc.
>>>
>> I am afraid that they might not generate interrupts any more.
>>
>> Say, the fault registers are full of events that are not cleared,
>> then a new fault comes. There is no room for this event and
>> hence the hardware might drop it silently.
> AFAICS, we're doing fault-clearing in a loop inside irq handler.
> That means that while we're clearing if a fault raises, it'll make
> an irq level triggered (or on edge) on lapic. So, whenever we return
> from the irq handler, irq will raise again.
>

Uhm, double checked with the spec. Interrupts should be generated
since we always clear the fault overflow bit.

Anyway, we can't clear faults in a limited loop, as the spec says in 7.3.1:

Software is expected to process the non-recoverable faults reported through the Fault Recording
Registers in a circular FIFO fashion starting from the Fault Recording Register referenced by the Fault
Recording Index (FRI) field, until it finds a Fault Recording Register with no faults (F field Clear).

Best regards,
Lu Baolu

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-03  1:32           ` Lu Baolu
@ 2018-05-03  1:59             ` Dmitry Safonov
  2018-05-03  2:16               ` Lu Baolu
  0 siblings, 1 reply; 15+ messages in thread
From: Dmitry Safonov @ 2018-05-03  1:59 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

On Thu, 2018-05-03 at 09:32 +0800, Lu Baolu wrote:
> Hi,
> 
> On 05/03/2018 08:52 AM, Dmitry Safonov wrote:
> > AFAICS, we're doing fault-clearing in a loop inside irq handler.
> > That means that while we're clearing if a fault raises, it'll make
> > an irq level triggered (or on edge) on lapic. So, whenever we
> > return
> > from the irq handler, irq will raise again.
> > 
> 
> Uhm, double checked with the spec. Interrupts should be generated
> since we always clear the fault overflow bit.
> 
> Anyway, we can't clear faults in a limited loop, as the spec says in
> 7.3.1:

Mind elaborating?
ITOW, I do not see a contradiction. We're still clearing faults in FIFO
fashion. Nothing forbids doing some other work in between
clearings (return from interrupt, then fault again and continue).

> Software is expected to process the non-recoverable faults reported
> through the Fault Recording
> Registers in a circular FIFO fashion starting from the Fault
> Recording Register referenced by the Fault
> Recording Index (FRI) field, until it finds a Fault Recording
> Register with no faults (F field Clear).
> 
> Best regards,
> Lu Baolu

-- 
Thanks,
             Dmitry

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-03  1:59             ` Dmitry Safonov
@ 2018-05-03  2:16               ` Lu Baolu
  2018-05-03  2:32                 ` Lu Baolu
  2018-05-03  2:34                 ` Dmitry Safonov
  0 siblings, 2 replies; 15+ messages in thread
From: Lu Baolu @ 2018-05-03  2:16 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi,

On 05/03/2018 09:59 AM, Dmitry Safonov wrote:
> On Thu, 2018-05-03 at 09:32 +0800, Lu Baolu wrote:
>> Hi,
>>
>> On 05/03/2018 08:52 AM, Dmitry Safonov wrote:
>>> AFAICS, we're doing fault-clearing in a loop inside irq handler.
>>> That means that while we're clearing if a fault raises, it'll make
>>> an irq level triggered (or on edge) on lapic. So, whenever we
>>> return
>>> from the irq handler, irq will raise again.
>>>
>> Uhm, double checked with the spec. Interrupts should be generated
>> since we always clear the fault overflow bit.
>>
>> Anyway, we can't clear faults in a limited loop, as the spec says in
>> 7.3.1:
> Mind to elaborate?
> ITOW, I do not see a contradiction. We're still clearing faults in FIFO
> fashion. There is no limitation to do some spare work in between
> clearings (return from interrupt, then fault again and continue).

Hardware maintains an internal index that references the fault recording
register in which the next fault can be recorded. When a fault comes,
hardware checks the Fault bit (bit 31 of the 4th 32-bit word of the fault
recording register) referenced by the internal index. If this bit is set,
hardware will not record the fault.

Since we would no longer keep clearing F bits until we reach a register
entry whose F bit is clear, we might exit fault handling with some
register entries still having the F bit set.

  F
| 0 |  xxxxxxxxxxxxx|
| 0 |  xxxxxxxxxxxxx|
| 0 |  xxxxxxxxxxxxx|  <--- Fault record index in fault status register
| 0 |  xxxxxxxxxxxxx|
| 1 |  xxxxxxxxxxxxx|  <--- hardware maintained index
| 1 |  xxxxxxxxxxxxx|
| 1 |  xxxxxxxxxxxxx|
| 0 |  xxxxxxxxxxxxx|
| 0 |  xxxxxxxxxxxxx|
| 0 |  xxxxxxxxxxxxx|
| 0 |  xxxxxxxxxxxxx|

Taking the example above, hardware could record only 2 more faults, with
all others dropped.
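
The concern above can be sketched with a simplified userspace model of
the fault recording registers. It is not spec-exact and does not
reproduce the diagram above: NR_REGS, struct fault_regs and the helper
names are assumptions made for illustration, and each register is
reduced to its F bit only.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_REGS 8 /* hypothetical size; real hardware reports its own count */

struct fault_regs {
	bool f[NR_REGS]; /* F bit of each fault recording register */
	int hw_next;     /* hardware-internal index of the next register to fill */
};

/* Hardware side: record a fault, or drop it if the target register is busy. */
static bool hw_record_fault(struct fault_regs *r)
{
	if (r->f[r->hw_next])
		return false; /* F bit still set: the fault is not recorded */
	r->f[r->hw_next] = true;
	r->hw_next = (r->hw_next + 1) % NR_REGS;
	return true;
}

/*
 * Software side: clear F bits in circular FIFO order from 'start', stopping
 * at the first register with F clear or after 'max_iter' registers.
 * Returns the number of registers cleared.
 */
static int sw_clear_faults(struct fault_regs *r, int start, int max_iter)
{
	int idx = start, cleared = 0;

	assert(start >= 0 && start < NR_REGS);
	while (cleared < max_iter && r->f[idx]) {
		r->f[idx] = false;
		idx = (idx + 1) % NR_REGS;
		cleared++;
	}
	return cleared;
}

/* Number of registers that still have the F bit set. */
static int count_pending(const struct fault_regs *r)
{
	int i, n = 0;

	for (i = 0; i < NR_REGS; i++)
		if (r->f[i])
			n++;
	return n;
}
```

In this model, filling all 8 registers and then clearing with a bound
of 3 leaves 5 registers with F set, so hardware can record only 3 new
faults before it reaches a busy register and starts dropping them.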

Best regards,
Lu Baolu

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-03  2:16               ` Lu Baolu
@ 2018-05-03  2:32                 ` Lu Baolu
  2018-05-03  2:34                 ` Dmitry Safonov
  1 sibling, 0 replies; 15+ messages in thread
From: Lu Baolu @ 2018-05-03  2:32 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi,

On 05/03/2018 10:16 AM, Lu Baolu wrote:
> Hi,
>
> On 05/03/2018 09:59 AM, Dmitry Safonov wrote:
>> On Thu, 2018-05-03 at 09:32 +0800, Lu Baolu wrote:
>>> Hi,
>>>
>>> On 05/03/2018 08:52 AM, Dmitry Safonov wrote:
>>>> AFAICS, we're doing fault-clearing in a loop inside irq handler.
>>>> That means that while we're clearing if a fault raises, it'll make
>>>> an irq level triggered (or on edge) on lapic. So, whenever we
>>>> return
>>>> from the irq handler, irq will raise again.
>>>>
>>> Uhm, double checked with the spec. Interrupts should be generated
>>> since we always clear the fault overflow bit.
>>>
>>> Anyway, we can't clear faults in a limited loop, as the spec says in
>>> 7.3.1:
>> Mind to elaborate?
>> ITOW, I do not see a contradiction. We're still clearing faults in FIFO
>> fashion. There is no limitation to do some spare work in between
>> clearings (return from interrupt, then fault again and continue).
> Hardware maintains an internal index to reference the fault recording
> register in which the next fault can be recorded. When a fault comes,
> hardware will check the Fault bit (bit 31 of the 4th 32-bit register recording
> register) referenced by the internal index. If this bit is set, hardware will
> not record the fault.
>
> Since we now don't clear the F bit until a register entry which has the F bit
> cleared, we might exit the fault handling with some register entries still
> have the F bit set.
>
>   F
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|  <--- Fault record index in fault status register

Forgot to mention: this fault record index that software reads from
the fault status register is also maintained by hardware. It is
the index of the first fault recording register in which hardware
recorded a fault last time.

Software doesn't maintain its own index, right? So there might be some
registers left with the F bit set.

Best regards,
Lu Baolu

> | 0 |  xxxxxxxxxxxxx|
> | 1 |  xxxxxxxxxxxxx|  <--- hardware maintained index
> | 1 |  xxxxxxxxxxxxx|
> | 1 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
>
> Take an example as above, hardware could only record 2 more faults with
> others all dropped.
>
> Best regards,
> Lu Baolu
>

* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-03  2:16               ` Lu Baolu
  2018-05-03  2:32                 ` Lu Baolu
@ 2018-05-03  2:34                 ` Dmitry Safonov
  2018-05-03  2:44                   ` Lu Baolu
  1 sibling, 1 reply; 15+ messages in thread
From: Dmitry Safonov @ 2018-05-03  2:34 UTC (permalink / raw)
  To: Lu Baolu, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

On Thu, 2018-05-03 at 10:16 +0800, Lu Baolu wrote:
> Hi,
> 
> On 05/03/2018 09:59 AM, Dmitry Safonov wrote:
> > On Thu, 2018-05-03 at 09:32 +0800, Lu Baolu wrote:
> > > Hi,
> > > 
> > > On 05/03/2018 08:52 AM, Dmitry Safonov wrote:
> > > > AFAICS, we're doing fault-clearing in a loop inside irq
> > > > handler.
> > > > That means that while we're clearing if a fault raises, it'll
> > > > make
> > > > an irq level triggered (or on edge) on lapic. So, whenever we
> > > > return
> > > > from the irq handler, irq will raise again.
> > > > 
> > > 
> > > Uhm, double checked with the spec. Interrupts should be generated
> > > since we always clear the fault overflow bit.
> > > 
> > > Anyway, we can't clear faults in a limited loop, as the spec says
> > > in
> > > 7.3.1:
> > 
> > Mind to elaborate?
> > ITOW, I do not see a contradiction. We're still clearing faults in
> > FIFO
> > fashion. There is no limitation to do some spare work in between
> > clearings (return from interrupt, then fault again and continue).
> 
> Hardware maintains an internal index to reference the fault recording
> register in which the next fault can be recorded. When a fault comes,
> hardware will check the Fault bit (bit 31 of the 4th 32-bit word of
> the fault recording register) referenced by the internal index. If
> this bit is set, hardware will not record the fault.
> 
> Since we now no longer clear F bits until we reach a register entry
> with the F bit cleared, we might exit the fault handling with some
> register entries still having the F bit set.
> 
>   F
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|  <--- Fault record index in fault status register
> | 0 |  xxxxxxxxxxxxx|
> | 1 |  xxxxxxxxxxxxx|  <--- hardware maintained index
> | 1 |  xxxxxxxxxxxxx|
> | 1 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> | 0 |  xxxxxxxxxxxxx|
> 
> Take an example as above, hardware could only record 2 more faults
> with others all dropped.

Ugh, yeah, I get what you're saying. Thanks for the explanation.
So, we shouldn't mark faults as cleared until we've actually processed
them here:
:        writel(DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_PRO,
:               iommu->reg + DMAR_FSTS_REG);

As Joerg mentioned, we do care about latency here, so this fault work
can't be moved entirely into a workqueue. But we might limit the loop
and, if we've hit the limit, continue servicing faults in a wq: in
that case we should care more about staying too long in an
irq-disabled section than about latency.
Does that make any sense? What do you think?
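
The idea could be sketched roughly as below, again as a plain userspace
model rather than kernel code. Everything here is hypothetical:
fault_unit, IRQ_FAULT_BUDGET, the software-kept sw_next index (standing in
for the fault record index read from the status register), and the
work_pending flag that stands in for queueing a work item. The status bit
is acknowledged only once no register with F set remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_FRCD          8 /* hypothetical number of fault recording registers */
#define IRQ_FAULT_BUDGET 3 /* hypothetical per-interrupt limit */

struct fault_unit {
	bool f[NR_FRCD];      /* F bit of each fault recording register */
	size_t sw_next;       /* stand-in for the fault record index */
	bool status_pending;  /* stand-in for the pending-fault status bit */
	bool work_pending;    /* stand-in for a queued work item */
};

/* Service up to `budget` faults in FIFO order; returns how many were handled. */
static size_t service_faults(struct fault_unit *u, size_t budget)
{
	size_t done = 0;

	assert(u->sw_next < NR_FRCD);
	while (done < budget && u->f[u->sw_next]) {
		/* A real handler would read and report the fault here. */
		u->f[u->sw_next] = false;
		u->sw_next = (u->sw_next + 1) % NR_FRCD;
		done++;
	}
	/* Acknowledge the status only when nothing is left to process. */
	if (!u->f[u->sw_next])
		u->status_pending = false;
	return done;
}

/* Interrupt path: bounded work, hand the remainder to deferred context. */
static size_t fault_irq(struct fault_unit *u)
{
	size_t done = service_faults(u, IRQ_FAULT_BUDGET);

	if (u->f[u->sw_next])
		u->work_pending = true;
	return done;
}

/* Deferred path: drain whatever the interrupt path left behind. */
static size_t fault_work(struct fault_unit *u)
{
	u->work_pending = false;
	return service_faults(u, NR_FRCD);
}
```

With five faults pending, fault_irq() handles three and leaves both
work_pending and status_pending set; fault_work() then handles the
remaining two and clears both flags.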

I can possibly rewrite 2/2 with the idea above.
And it would be a bit of a joy to have 1/2 applied, as it's an
independent fix for an issue that happens for real on our devices, heh.

-- 
Thanks,
             Dmitry


* Re: [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler
  2018-05-03  2:34                 ` Dmitry Safonov
@ 2018-05-03  2:44                   ` Lu Baolu
  0 siblings, 0 replies; 15+ messages in thread
From: Lu Baolu @ 2018-05-03  2:44 UTC (permalink / raw)
  To: Dmitry Safonov, linux-kernel, joro, Raj, Ashok
  Cc: 0x7f454c46, Alex Williamson, David Woodhouse, Ingo Molnar, iommu

Hi,

On 05/03/2018 10:34 AM, Dmitry Safonov wrote:
> On Thu, 2018-05-03 at 10:16 +0800, Lu Baolu wrote:
>> Hi,
>>
>> On 05/03/2018 09:59 AM, Dmitry Safonov wrote:
>>> On Thu, 2018-05-03 at 09:32 +0800, Lu Baolu wrote:
>>>> Hi,
>>>>
>>>> On 05/03/2018 08:52 AM, Dmitry Safonov wrote:
>>>>> AFAICS, we're doing fault-clearing in a loop inside irq
>>>>> handler.
>>>>> That means that while we're clearing if a fault raises, it'll
>>>>> make
>>>>> an irq level triggered (or on edge) on lapic. So, whenever we
>>>>> return
>>>>> from the irq handler, irq will raise again.
>>>>>
>>>> Uhm, double checked with the spec. Interrupts should be generated
>>>> since we always clear the fault overflow bit.
>>>>
>>>> Anyway, we can't clear faults in a limited loop, as the spec says
>>>> in
>>>> 7.3.1:
>>> Mind to elaborate?
>>> ITOW, I do not see a contradiction. We're still clearing faults in
>>> FIFO
>>> fashion. There is no limitation to do some spare work in between
>>> clearings (return from interrupt, then fault again and continue).
>> Hardware maintains an internal index to reference the fault recording
>> register in which the next fault can be recorded. When a fault comes,
>> hardware will check the Fault bit (bit 31 of the 4th 32-bit word of
>> the fault recording register) referenced by the internal index. If
>> this bit is set, hardware will not record the fault.
>>
>> Since we now no longer clear F bits until we reach a register entry
>> with the F bit cleared, we might exit the fault handling with some
>> register entries still having the F bit set.
>>
>>   F
>> | 0 |  xxxxxxxxxxxxx|
>> | 0 |  xxxxxxxxxxxxx|
>> | 0 |  xxxxxxxxxxxxx|  <--- Fault record index in fault status register
>> | 0 |  xxxxxxxxxxxxx|
>> | 1 |  xxxxxxxxxxxxx|  <--- hardware maintained index
>> | 1 |  xxxxxxxxxxxxx|
>> | 1 |  xxxxxxxxxxxxx|
>> | 0 |  xxxxxxxxxxxxx|
>> | 0 |  xxxxxxxxxxxxx|
>> | 0 |  xxxxxxxxxxxxx|
>> | 0 |  xxxxxxxxxxxxx|
>>
>> Take an example as above, hardware could only record 2 more faults
>> with others all dropped.
> Ugh, yeah, I get what you're saying. Thanks for the explanation.
> So, we shouldn't mark faults as cleared until we've actually processed
> them here:
> :        writel(DMA_FSTS_PFO | DMA_FSTS_PPF | DMA_FSTS_PRO,
> :               iommu->reg + DMAR_FSTS_REG);
>
> As Joerg mentioned, we do care about latency here, so this fault work
> can't be moved entirely into a workqueue. But we might limit the loop
> and, if we've hit the limit, continue servicing faults in a wq: in
> that case we should care more about staying too long in an
> irq-disabled section than about latency.
> Does that make any sense? What do you think?
>
> I can possibly rewrite 2/2 with the idea above.

Much appreciated. I am open to the idea. :-)

Best regards,
Lu Baolu


* Re: [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing
  2018-05-02  2:22 ` [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
@ 2018-05-03 12:40   ` Joerg Roedel
  2018-05-03 16:12     ` Dmitry Safonov
  0 siblings, 1 reply; 15+ messages in thread
From: Joerg Roedel @ 2018-05-03 12:40 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, 0x7f454c46, Alex Williamson, David Woodhouse,
	Ingo Molnar, Lu Baolu, iommu

On Wed, May 02, 2018 at 03:22:24AM +0100, Dmitry Safonov wrote:
> Hi Joerg,
> 
> is there anything I may do about those two patches?
> In 2/2 I've limited loop cnt as discussed in v3.
> This one solves softlockup for us, might be useful.

Applied the first patch, thanks. Please re-work the second one according
to the comments.


Thanks,

	Joerg


* Re: [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing
  2018-05-03 12:40   ` Joerg Roedel
@ 2018-05-03 16:12     ` Dmitry Safonov
  0 siblings, 0 replies; 15+ messages in thread
From: Dmitry Safonov @ 2018-05-03 16:12 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: linux-kernel, 0x7f454c46, Alex Williamson, David Woodhouse,
	Ingo Molnar, Lu Baolu, iommu

On Thu, 2018-05-03 at 14:40 +0200, Joerg Roedel wrote:
> On Wed, May 02, 2018 at 03:22:24AM +0100, Dmitry Safonov wrote:
> > Hi Joerg,
> > 
> > is there anything I may do about those two patches?
> > In 2/2 I've limited loop cnt as discussed in v3.
> > This one solves softlockup for us, might be useful.
> 
> Applied the first patch, thanks. Please re-work the second one
> according
> to the comments.

Will do.

-- 
Thanks,
             Dmitry


Thread overview: 15+ messages
2018-03-31  0:33 [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
2018-03-31  0:33 ` [PATCHv4 2/2] iommu/vt-d: Limit number of faults to clear in irq handler Dmitry Safonov
2018-05-02  6:34   ` Lu Baolu
2018-05-02 12:38     ` Dmitry Safonov
2018-05-02 23:49       ` Lu Baolu
2018-05-03  0:52         ` Dmitry Safonov
2018-05-03  1:32           ` Lu Baolu
2018-05-03  1:59             ` Dmitry Safonov
2018-05-03  2:16               ` Lu Baolu
2018-05-03  2:32                 ` Lu Baolu
2018-05-03  2:34                 ` Dmitry Safonov
2018-05-03  2:44                   ` Lu Baolu
2018-05-02  2:22 ` [PATCHv4 1/2] iommu/vt-d: Ratelimit each dmar fault printing Dmitry Safonov
2018-05-03 12:40   ` Joerg Roedel
2018-05-03 16:12     ` Dmitry Safonov
