LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* E1000:   e1000_update_stats()
@ 2008-03-05 20:45 Todd Tomaino
  2008-03-05 21:22 ` Chris Friesen
  2008-03-05 21:34 ` Kok, Auke
  0 siblings, 2 replies; 5+ messages in thread
From: Todd Tomaino @ 2008-03-05 20:45 UTC (permalink / raw)
  To: linux-kernel



I have been having trouble with interrupt latencies.  While profiling my system, I measured the e1000_update_stats() function disabling interrupts for roughly 100 usec.  For my application, this long of a delay results in data loss.  The e1000_update_stats() function is broken down into 3 main operations:

1. acquire spinlock
2. read registers & update adapter->stat counters
3. update OS adapter->net_stats statistics
4. update phy stats.
5. release spinlock.

On my system I measured 57 usec to perform operations #2 and #3, and 52 usec to perform operation #4.

Reading the comments in the code it appeared that updating adapter->net_stats and adapter->stat required interrupts to be turned off to prevent the e1000_adjust_tbi_stats() function from modifying these structures from interrupt context.  In order to reduce the amount of time spent with interrupts disabled I separated the register reads from updating adapter->stat counters.  The register reads take up the bulk of the 57 usec and shouldn't require protection (I could be wrong).  The new operation looks like this and reduces the amount of time spent with interrupts turned off by 1/2:

1. read registers
2. acquire spinlock
3. update adapter->stat counters
4. update OS adapter->net_stats statistics
5. update phy stats.
6. release spinlock.


It seemed like a bad idea to attempt to move the phy reads out of the spin locked code area.

These changes have greatly reduced my interrupt latency, but are there any negative side effects that could be caused??


Best regards,
Todd


System Info:

  Kernel:            2.6.18 (RHEL5.1)

  E1000 driver:      7.3.20-k2

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: E1000:   e1000_update_stats()
  2008-03-05 20:45 E1000: e1000_update_stats() Todd Tomaino
@ 2008-03-05 21:22 ` Chris Friesen
  2008-03-05 21:40   ` Kok, Auke
  2008-03-05 21:34 ` Kok, Auke
  1 sibling, 1 reply; 5+ messages in thread
From: Chris Friesen @ 2008-03-05 21:22 UTC (permalink / raw)
  To: Todd Tomaino; +Cc: linux-kernel

Todd Tomaino wrote:

> Reading the comments in the code it appeared that updating
> adapter->net_stats and adapter->stat required interrupts to be turned
> off to prevent the e1000_adjust_tbi_stats() function from modifying
> these structures from interrupt context.  In order to reduce the
> amount of time spent with interrupts disabled I separated the
> register reads from updating adapter->stat counters.  The register
> reads take up the bulk of the 57 usec and shouldn't require
> protection (I could be wrong).  The new operation looks like this and
> reduces the amount of time spent with interrupts turned off by 1/2:
> 
> 1. read registers 2. acquire spinlock 3. update adapter->stat
> counters 4. update OS adapter->net_stats statistics 5. update phy
> stats. 6. release spinlock.

One potential problem with this is that the stats can be further out of 
date.

Consider this case:

cpu0 reads some of the registers, then gets preempted/interrupted
cpu1 reads all the registers, updates the stats, then dumps the data


At this point cpu1 dumped information that doesn't include the registers 
that cpu0 was in the middle of updating.  Eventually cpu0 will run again 
and the stats will be accurate, but there is a window where the stats 
may be self-inconsistent and the various counts may not add up.

Chris


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: E1000:   e1000_update_stats()
  2008-03-05 20:45 E1000: e1000_update_stats() Todd Tomaino
  2008-03-05 21:22 ` Chris Friesen
@ 2008-03-05 21:34 ` Kok, Auke
  1 sibling, 0 replies; 5+ messages in thread
From: Kok, Auke @ 2008-03-05 21:34 UTC (permalink / raw)
  To: Todd Tomaino; +Cc: linux-kernel, NetDev


[ Adding netdev@vger.kernel.org !]

Todd Tomaino wrote:
> 
> I have been having trouble with interrupt latencies.  While profiling my system, I measured the e1000_update_stats() function disabling interrupts for roughly 100 usec.  For my application, this long of a delay results in data loss.  The e1000_update_stats() function is broken down into 3 main operations:
> 
> 1. acquire spinlock
> 2. read registers & update adapter->stat counters
> 3. update OS adapter->net_stats statistics
> 4. update phy stats.
> 5. release spinlock.
> 
> On my system I measured 57 usec to perform operations #2 and #3, and 52 usec to perform operation #4.
> 
> Reading the comments in the code it appeared that updating adapter->net_stats and adapter->stat required interrupts to be turned off to prevent the e1000_adjust_tbi_stats() function from modifying these structures from interrupt context.  In order to reduce the amount of time spent with interrupts disabled I separated the register reads from updating adapter->stat counters.  The register reads take up the bulk of the 57 usec and shouldn't require protection (I could be wrong).  The new operation looks like this and reduces the amount of time spent with interrupts turned off by 1/2:
> 
> 1. read registers
> 2. acquire spinlock
> 3. update adapter->stat counters
> 4. update OS adapter->net_stats statistics
> 5. update phy stats.
> 6. release spinlock.
> 
> 
> It seemed like a bad idea to attempt to move the phy reads out of the spin locked code area.
> 
> These changes have greatly reduced my interrupt latency, but are there any negative side effects that could be caused??

not per se, I think you did the right thing here. I think it might even be safe to
move the PHY reads outside the spinlock area, and this is certainly something I
would like to try.

would you care to send an RFC patch to netdev@vger.kernel.org?

Also, I notice you are using an old driver/kernel. care to retest against the
latest git trees?

I'll certainly take your suggestion and have it tested.

Auke

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: E1000:   e1000_update_stats()
  2008-03-05 21:22 ` Chris Friesen
@ 2008-03-05 21:40   ` Kok, Auke
  2008-03-05 22:00     ` Chris Friesen
  0 siblings, 1 reply; 5+ messages in thread
From: Kok, Auke @ 2008-03-05 21:40 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Todd Tomaino, linux-kernel

Chris Friesen wrote:
> Todd Tomaino wrote:
> 
>> Reading the comments in the code it appeared that updating
>> adapter->net_stats and adapter->stat required interrupts to be turned
>> off to prevent the e1000_adjust_tbi_stats() function from modifying
>> these structures from interrupt context.  In order to reduce the
>> amount of time spent with interrupts disabled I separated the
>> register reads from updating adapter->stat counters.  The register
>> reads take up the bulk of the 57 usec and shouldn't require
>> protection (I could be wrong).  The new operation looks like this and
>> reduces the amount of time spent with interrupts turned off by 1/2:
>>
>> 1. read registers 2. acquire spinlock 3. update adapter->stat
>> counters 4. update OS adapter->net_stats statistics 5. update phy
>> stats. 6. release spinlock.
> 
> One potential problem with this is that the stats can be further out of
> date.
> 
> Consider this case:
> 
> cpu0 reads some of the registers, then gets preempted/interrupted
> cpu1 reads all the registers, updates the stats, then dumps the data
> 
> 
> At this point cpu1 dumped information that doesn't include the registers
> that cpu0 was in the middle of updating.  Eventually cpu0 will run again
> and the stats will be accurate, but there is a window where the stats
> may be self-inconsistent and the various counts may not add up.

the update_stats function is only called every two seconds, and scheduled once. I
think that the chance that two of these functions run in contention in the way you
point out are practically zero, and if they do then we have much bigger problems
than just counters being wrong...

the lock is there to prevent an mii-tool PHY read from interfering with the
regularly scheduled PHY reads, so we might want to keep those reads locked
together as Todd suggests himself, but not the bulk MAC reads.

Auke

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: E1000:   e1000_update_stats()
  2008-03-05 21:40   ` Kok, Auke
@ 2008-03-05 22:00     ` Chris Friesen
  0 siblings, 0 replies; 5+ messages in thread
From: Chris Friesen @ 2008-03-05 22:00 UTC (permalink / raw)
  To: Kok, Auke; +Cc: Todd Tomaino, linux-kernel

Kok, Auke wrote:

> the update_stats function is only called every two seconds, and scheduled once. I
> think that the chance that two of these functions run in contention in the way you
> point out are practically zero, and if they do then we have much bigger problems
> than just counters being wrong...

I recently did an investigation that ended up sampling a subset of those 
stats every millisecond (to get an accurate picture of how many packets 
were being arriving vs being dropped due to rx_fifo errors over short 
intervals).  Ultimately it turned out that we had very bursty traffic. 
So it's not entirely unheard-of to sample at much higher rates.

At the very least, if we're going to remove the atomicity of reading 
registers and updating stats, how about adding a comment to this effect 
in the code?

Chris

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-03-05 22:00 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-05 20:45 E1000: e1000_update_stats() Todd Tomaino
2008-03-05 21:22 ` Chris Friesen
2008-03-05 21:40   ` Kok, Auke
2008-03-05 22:00     ` Chris Friesen
2008-03-05 21:34 ` Kok, Auke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).