LKML Archive on lore.kernel.org
* Re: peculiar problem with 2.6, 8139too + ACPI
       [not found] <A6974D8E5F98D511BB910002A50A6647615FB5FE@hdsmsx403.hd.intel.com>
@ 2004-05-15  1:36 ` Len Brown
  2004-05-15 23:30   ` Robert Fendt
  2004-05-17 10:30   ` Robert Fendt
  0 siblings, 2 replies; 9+ messages in thread
From: Len Brown @ 2004-05-15  1:36 UTC (permalink / raw)
  To: Robert Fendt; +Cc: linux-kernel

On Fri, 2004-05-14 at 09:28, Robert Fendt wrote:

> I have the following problem: the 8139too driver produces lots of
> overruns and is _very_ slow (but strangely not always; the problem
> seems to be site-dependent, too). And here is what is weird: if I
> artificially raise the system load (e.g. compile a kernel just for
> fun), the download speed grows by at least an order of magnitude.

It is possible that the system is getting into a high power saving
mode on idle.  Device bus master activity or interrupts will wake
it up -- but the latency to return from power savings mode may be
so high that the device experiences receive buffer overruns.

Some devices handle this latency better than others,
and with a network, dropping RX packets can cause the
connection to thrash, and it seems that is what you see.

If the 8139too has statistics counters showing if it gets
RX buffer over-runs, that would be interesting to observe.

Also, to see what idle power saving states you have, their
latency and their usage, please do this:
cat /proc/acpi/processor/CPU0/power

It would also be interesting to know if you see the problem
more frequently when running on battery power, since some
systems have higher c-state exit latency when on battery.

It would also be interesting to know if you see the same
frequency of the problem on 2.4, since it has 100HZ clock
vs 1000HZ clock on 2.6 -- and this can have a significant
effect on the effectiveness of idle c-states.
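
To make the clock-rate point concrete, here is a minimal arithmetic sketch. It assumes a purely periodic timer tick, so that an otherwise idle CPU is woken at least once every 1/HZ seconds; the helper name is illustrative only.

```python
def tick_period_us(hz: int) -> float:
    """Length of one timer tick in microseconds for a given HZ value."""
    return 1_000_000 / hz


# With a periodic tick, no single idle period can outlast one tick.
for kernel, hz in (("2.4", 100), ("2.6", 1000)):
    period = tick_period_us(hz)
    print(f"{kernel}: HZ={hz}, at most {period:.0f} us between timer wakeups")
```

Under that assumption, 2.6 enters and leaves idle states roughly ten times as often as 2.4 on an otherwise quiet system, so any per-transition cost is paid far more frequently.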

cheers,
-Len




* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-15  1:36 ` peculiar problem with 2.6, 8139too + ACPI Len Brown
@ 2004-05-15 23:30   ` Robert Fendt
  2004-05-17 10:30   ` Robert Fendt
  1 sibling, 0 replies; 9+ messages in thread
From: Robert Fendt @ 2004-05-15 23:30 UTC (permalink / raw)
  To: linux-kernel

> It is possible that the system is getting into a high power saving
> mode on idle.  Device bus master activity or interrupts will wake
> it up -- but the latency to return from power savings mode may be
> so high that the device experiences receive buffer overruns.

Yes, I also thought in that direction, since the main difference between
the processor module loaded or not seems to be the idle handler.

> Some devices handle this latency better than others,
> and with a network, dropping RX packets can cause the
> connection to thrash, and it seems that is what you see.
> 
> If the 8139too has statistics counters showing if it gets
> RX buffer over-runs, that would be interesting to observe.

I seem to be unable to reproduce the problem on my home network. It is a
small (switched) 100BaseT network which is connected to the outside via an
asymmetric DSL line (128/768kbit). Maybe the different LAN topology or
the slow external link makes the difference. I will retry on Monday on the
corporate network. The latter is a large university net, with our section
consisting of 100BaseT-to-100BaseFX transceiver switches. I do not have
information on the rest of the network.

> Also, to see what idle power saving states you have, their
> latency and their usage, please do this:
> cat /proc/acpi/processor/CPU0/power

betazed:~# cat /proc/acpi/processor/CPU1/power
active state:            C2
default state:           C1
bus master activity:     ffffffff
states:
    C1:                  promotion[C2] demotion[--] latency[000] usage[00000010]
   *C2:                  promotion[C3] demotion[C1] latency[001] usage[00025200]
    C3:                  promotion[--] demotion[C2] latency[101] usage[00024564]
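
For anyone who wants to compare such dumps across runs, the following is a rough sketch of a parser for the state lines shown above. It assumes the `latency[...]` and `usage[...]` fields are plain decimal numbers and that `usage` counts entries into each state; the function and field names are made up for illustration.

```python
import re

# One line per C-state, e.g.
#    *C2:   promotion[C3] demotion[C1] latency[001] usage[00025200]
STATE_RE = re.compile(
    r"^\s*(?P<active>\*)?(?P<name>C\d+):\s+"
    r"promotion\[(?P<promotion>[^\]]*)\]\s+"
    r"demotion\[(?P<demotion>[^\]]*)\]\s+"
    r"latency\[(?P<latency>\d+)\]\s+"
    r"usage\[(?P<usage>\d+)\]"
)

SAMPLE = """\
active state:            C2
default state:           C1
bus master activity:     ffffffff
states:
    C1:                  promotion[C2] demotion[--] latency[000] usage[00000010]
   *C2:                  promotion[C3] demotion[C1] latency[001] usage[00025200]
    C3:                  promotion[--] demotion[C2] latency[101] usage[00024564]
"""


def parse_power(text):
    """Return one dict per C-state line found in the given text."""
    states = []
    for line in text.splitlines():
        m = STATE_RE.match(line)
        if m:
            states.append({
                "name": m.group("name"),
                "active": m.group("active") is not None,
                "latency": int(m.group("latency")),
                "usage": int(m.group("usage")),
            })
    return states


states = parse_power(SAMPLE)
total = sum(s["usage"] for s in states)
for s in states:
    print(f'{s["name"]}: latency={s["latency"]} share={s["usage"] / total:.1%}')
```

On a live system the same function could be fed the contents of the power file instead of `SAMPLE`.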

> It would also be interesting to know if you see the problem
> more frequently when running on battery power, since some
> systems have higher c-state exit latency when on battery.

I cannot say at this moment, since I cannot reproduce the problem at home
(see above). I will try to get some info on this matter on Monday. Stay
tuned.

> It would also be interesting to know if you see the same
> frequency of the problem on 2.4, since it has 100HZ clock
> vs 1000HZ clock on 2.6 -- and this can have a significant
> effect on the effectiveness of idle c-states.

Will try to install a 2.4 kernel on the box and see what happens.

Regards,
Robert


* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-15  1:36 ` peculiar problem with 2.6, 8139too + ACPI Len Brown
  2004-05-15 23:30   ` Robert Fendt
@ 2004-05-17 10:30   ` Robert Fendt
  2004-05-17 18:24     ` Len Brown
  1 sibling, 1 reply; 9+ messages in thread
From: Robert Fendt @ 2004-05-17 10:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Len Brown

On 14 May 2004 21:36:38 -0400
Len Brown <len.brown@intel.com> wrote:

> If the 8139too has statistics counters showing if it gets
> RX buffer over-runs, that would be interesting to observe.

Hmm I am not entirely sure I understand what you mean. ifconfig output
is as follows:

a) with 'processor' loaded

robert@betazed:~$ wget http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
--12:27:16--  http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
           => `smgl-i386-2.6.5-20040414.iso.bz2'
Resolving download.sourcemage.org... 152.2.210.81
Connecting to download.sourcemage.org[152.2.210.81]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 142,065,569 [text/plain]

 0% [                                     ] 202,609        2.30K/s ETA 10:17:41


robert@betazed:~$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:6E:8A:DD:BA  
          inet addr:129.217.168.125  Bcast:129.217.168.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:6eff:fe8a:ddba/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:933 errors:117 dropped:212 overruns:117 frame:0
          TX packets:638 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:622241 (607.6 KiB)  TX bytes:54355 (53.0 KiB)
          Interrupt:5 Base address:0xc800 


b) without 'processor' loaded

robert@betazed:~$ wget http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
--11:29:17--  http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
           => `smgl-i386-2.6.5-20040414.iso.bz2.2'
Resolving download.sourcemage.org... 152.2.210.81
Connecting to download.sourcemage.org[152.2.210.81]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 142,065,569 [text/plain]

 3% [=>                                                  ] 5,526,132    514.93K/s

robert@betazed:~$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:6E:8A:DD:BA  
          inet addr:129.217.168.125  Bcast:129.217.168.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:6eff:fe8a:ddba/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4187 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2313 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:5904292 (5.6 MiB)  TX bytes:149285 (145.7 KiB)
          Interrupt:5 Base address:0xc800 
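
The two RX lines above can be compared mechanically. This is only a small sketch that pulls the counters out of the ifconfig text with a regular expression; the function name is illustrative, and the pattern assumes the old net-tools output layout shown here.

```python
import re

RX_RE = re.compile(
    r"RX packets:(\d+) errors:(\d+) dropped:(\d+) overruns:(\d+) frame:(\d+)"
)
RX_KEYS = ("packets", "errors", "dropped", "overruns", "frame")


def rx_counters(ifconfig_text):
    """Extract the RX counters from old-style ifconfig output."""
    m = RX_RE.search(ifconfig_text)
    if m is None:
        raise ValueError("no RX counter line found")
    return dict(zip(RX_KEYS, map(int, m.groups())))


# RX lines copied from the two runs above.
with_processor = rx_counters(
    "RX packets:933 errors:117 dropped:212 overruns:117 frame:0")
without_processor = rx_counters(
    "RX packets:4187 errors:0 dropped:0 overruns:0 frame:0")

for label, c in (("with", with_processor), ("without", without_processor)):
    rate = c["overruns"] / c["packets"]
    print(f"{label} 'processor': {c['overruns']} overruns, {rate:.1%} of RX packets")
```

With 'processor' loaded, the overrun count is about one eighth of the received-packet count; without it, the count is zero.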


One additional problem in debugging this is that it seems to be
depending on the local network topology, since I somehow cannot
reproduce it when downloading from machines on the LAN or when I have a
slow downstream connection (e.g. DSL).

> It would also be interesting to know if you see the problem
> more frequently when running on battery power, since some
> systems have higher c-state exit latency when on battery.

Hmmm, I cannot see a difference between battery and ac. I will look into
it a bit more, though.

Regards,
Robert

-- 
Robert Fendt
Experimentelle Physik I, Universität Dortmund
Otto-Hahn-Str. 4, 44221 Dortmund, -Germany-
Tel. +49-231-755-3522


* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-17 10:30   ` Robert Fendt
@ 2004-05-17 18:24     ` Len Brown
  2004-05-17 19:10       ` Jeff Garzik
  2004-05-20 23:53       ` Robert Fendt
  0 siblings, 2 replies; 9+ messages in thread
From: Len Brown @ 2004-05-17 18:24 UTC (permalink / raw)
  To: Robert Fendt, Jeff Garzik; +Cc: linux-kernel, James P Ketrenos

On Mon, 2004-05-17 at 06:30, Robert Fendt wrote:
> On 14 May 2004 21:36:38 -0400
> Len Brown <len.brown@intel.com> wrote:
> 
> > If the 8139too has statistics counters showing if it gets
> > RX buffer over-runs, that would be interesting to observe.
> 

> a) with 'processor' loaded
> 
> robert@betazed:~$ wget http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
> --12:27:16--  http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
>            => `smgl-i386-2.6.5-20040414.iso.bz2'
> Resolving download.sourcemage.org... 152.2.210.81
> Connecting to download.sourcemage.org[152.2.210.81]:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 142,065,569 [text/plain]
> 
>  0% [                                     ] 202,609        2.30K/s ETA 10:17:41
> 
> 
> robert@betazed:~$ /sbin/ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:0C:6E:8A:DD:BA  
>           inet addr:129.217.168.125  Bcast:129.217.168.255  Mask:255.255.255.0
>           inet6 addr: fe80::20c:6eff:fe8a:ddba/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:933 errors:117 dropped:212 overruns:117 frame:0

BINGO

There may be a way to get more detailed stats out of the driver with
netstat or something, Jeff would know.

>           TX packets:638 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000 
>           RX bytes:622241 (607.6 KiB)  TX bytes:54355 (53.0 KiB)
>           Interrupt:5 Base address:0xc800 
> 
> 
> b) without 'processor' loaded
> 
> robert@betazed:~$ wget http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
> --11:29:17--  http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
>            => `smgl-i386-2.6.5-20040414.iso.bz2.2'
> Resolving download.sourcemage.org... 152.2.210.81
> Connecting to download.sourcemage.org[152.2.210.81]:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 142,065,569 [text/plain]
> 
>  3% [=>                                                  ] 5,526,132    514.93K/s
> 
> robert@betazed:~$ /sbin/ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:0C:6E:8A:DD:BA  
>           inet addr:129.217.168.125  Bcast:129.217.168.255  Mask:255.255.255.0
>           inet6 addr: fe80::20c:6eff:fe8a:ddba/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:4187 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:2313 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000 
>           RX bytes:5904292 (5.6 MiB)  TX bytes:149285 (145.7 KiB)
>           Interrupt:5 Base address:0xc800 
> 
> 
> One additional problem in debugging this is that it seems to be
> depending on the local network topology, since I somehow cannot
> reproduce it when downloading from machines on the LAN or when I have a
> slow downstream connection (e.g. DSL).

Probably something to do with packet arrival time and the ability of the
system to think it is idle.  What topology does it fail with?

> > It would also be interesting to know if you see the problem
> > more frequently when running on battery power, since some
> > systems have higher c-state exit latency when on battery.
> 
> Hmmm, I cannot see a difference between battery and ac. I will look into
> it a bit more, though.

Does
cat /proc/acpi/processor/CPU0/power
show any C3 usage?

This could be either an ACPI issue -- we may enter C3 when there really
isn't enough time to enter and exit C3 w/o thrashing the system; AND/OR
a NIC problem where the driver/device is prone to over-run errors
when the latency to memory is high.
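
As a back-of-the-envelope illustration of why exit latency matters to a NIC, the sketch below computes how much data arrives at line rate while the CPU is still waking up. It assumes the latency figures in the power file are in microseconds and that the link runs at a full 100 Mbit/s; how much of that the 8139 can absorb on its own is not something this sketch knows.

```python
def bytes_arriving(link_mbit_per_s: float, latency_us: float) -> float:
    """Bytes received at line rate during a wake-up latency.

    Mbit/s multiplied by microseconds gives bits, so divide by 8 for bytes.
    """
    return link_mbit_per_s * latency_us / 8


# Latency values taken from the C2 and C3 lines of the power file.
for state, latency_us in (("C2", 1), ("C3", 101)):
    print(f"{state}: {bytes_arriving(100, latency_us):.1f} bytes in {latency_us} us")
```

Under those assumptions a single C3 exit spans roughly a full-size Ethernet frame of incoming data, while a C2 exit spans only a few bytes.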

I think we're having a similar problem with the ipw2100.  It would
be interesting if you plugged an e100 into the failing config if
it fails the same way.

-Len




* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-17 18:24     ` Len Brown
@ 2004-05-17 19:10       ` Jeff Garzik
  2004-05-20 23:53       ` Robert Fendt
  1 sibling, 0 replies; 9+ messages in thread
From: Jeff Garzik @ 2004-05-17 19:10 UTC (permalink / raw)
  To: Len Brown; +Cc: Robert Fendt, linux-kernel, James P Ketrenos

Len Brown wrote:
> On Mon, 2004-05-17 at 06:30, Robert Fendt wrote:
> 
>>On 14 May 2004 21:36:38 -0400
>>Len Brown <len.brown@intel.com> wrote:
>>
>>
>>>If the 8139too has statistics counters showing if it gets
>>>RX buffer over-runs, that would be interesting to observe.
>>
> 
>>a) with 'processor' loaded
>>
>>robert@betazed:~$ wget http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
>>--12:27:16--  http://download.sourcemage.org/iso/smgl-i386-2.6.5-20040414.iso.bz2
>>           => `smgl-i386-2.6.5-20040414.iso.bz2'
>>Resolving download.sourcemage.org... 152.2.210.81
>>Connecting to download.sourcemage.org[152.2.210.81]:80... connected.
>>HTTP request sent, awaiting response... 200 OK
>>Length: 142,065,569 [text/plain]
>>
>> 0% [                                     ] 202,609        2.30K/s ETA 10:17:41
>>
>>
>>robert@betazed:~$ /sbin/ifconfig
>>eth0      Link encap:Ethernet  HWaddr 00:0C:6E:8A:DD:BA  
>>          inet addr:129.217.168.125  Bcast:129.217.168.255  Mask:255.255.255.0
>>          inet6 addr: fe80::20c:6eff:fe8a:ddba/64 Scope:Link
>>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>          RX packets:933 errors:117 dropped:212 overruns:117 frame:0
> 
> 
> BINGO
> 
> There may be a way to get more detailed stats out of the driver with
> netstat or something, Jeff would know.


Almost all the standard stats are listed in /proc/net/dev...  8139 
hardware doesn't have any NIC-specific stats besides the Rx-Missed 
counter either, IIRC.
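
A short sketch of reading those counters from /proc/net/dev follows. The "fifo" column on the receive side is the one ifconfig reports as "overruns". The sample text is constructed from the ifconfig figures quoted above rather than captured from the machine, and the function name is illustrative.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")

# Illustrative sample only: built from the ifconfig numbers in this thread.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0:  622241     933  117  212  117     0          0         0    54355     638    0    0    0     0       0          0
"""


def rx_stats(proc_net_dev_text):
    """Map interface name to its receive-side counters."""
    stats = {}
    # The first two lines are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        values = rest.split()
        stats[name.strip()] = dict(zip(RX_FIELDS, map(int, values[:8])))
    return stats


eth0 = rx_stats(SAMPLE)["eth0"]
print("rx fifo (overrun) errors:", eth0["fifo"])
```

On a running system, `rx_stats(open("/proc/net/dev").read())` would return the live counters, which can be sampled before and after a download.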

	Jeff





* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-17 18:24     ` Len Brown
  2004-05-17 19:10       ` Jeff Garzik
@ 2004-05-20 23:53       ` Robert Fendt
  2004-05-21  1:16         ` Len Brown
  1 sibling, 1 reply; 9+ messages in thread
From: Robert Fendt @ 2004-05-20 23:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: Len Brown, Jeff Garzik

On 17 May 2004 14:24:42 -0400
Len Brown <len.brown@intel.com> wrote:

> > One additional problem in debugging this is that it seems to be
> > depending on the local network topology, since I somehow cannot
> > reproduce it when downloading from machines on the LAN or when I have a
> > slow downstream connection (e.g. DSL).

Maybe I can add at least some details on the differing network
topologies. My local network is a small 100BaseT network, one single
8-port switch. Connection to "the outside" is made through a
masquerading router (an old 233-PentiumMMX box), which has an ADSL modem
connection (128/768kbit up/down). As I said, I cannot reproduce the
problem here, neither locally nor when fetching data from external
sources.

The second topology is a large corporate-style network (university), so
it consists of several hundred machines and some dozens of switches and
routers, at least. The particular section I am connected to is 100BaseFX
based, with BaseT-to-fiber transceiver switches in every office. So the
office is 100BaseT, and the connection between the offices is 100BaseFX.
The whole is connected to the rest of the network through a switch
somewhere in the building, a Cisco thingy I would guess (though I have
never seen it). The connection of the university to the Internet is
something in the gigabit/s range, through the "DFN" (deutsches
Forschungsnetzwerk, German scientific network). I do not have more
information, sorry.

> Does
> cat /proc/acpi/processor/CPU0/power
> show any C3 usage?

Yes, if I read this correctly, it does. BTW, seemingly pretty much the same on AC or battery.

betazed:~# cat /proc/acpi/processor/CPU1/power
active state:            C2
default state:           C1
bus master activity:     ffffffff
states:
    C1:                  promotion[C2] demotion[--] latency[000] usage[00000010]
   *C2:                  promotion[C3] demotion[C1] latency[001] usage[00025200]
    C3:                  promotion[--] demotion[C2] latency[101] usage[00024564]


> I think we're having a similar problem with the ipw2100.  It would
> be interesting if you plugged an e100 into the failing config if
> it fails the same way.

Could be difficult. We do not have such a card in the workgroup (would
have to be PCMCIA, of course; is there even such a thing?), and I would
have to convince my boss of buying one for testing purposes. I _could_
of course try to get the ipw2100 driver working, since we have an AP in
the group (and the laptop is Centrino based, as I mentioned before). I
am not informed on its status, however: does WEP encryption work now?
Also a test would have to wait until Tuesday, anyway, since I will not
return to the office before.

Regards,
Robert


* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-20 23:53       ` Robert Fendt
@ 2004-05-21  1:16         ` Len Brown
  2004-05-27 13:45           ` Robert Fendt
  0 siblings, 1 reply; 9+ messages in thread
From: Len Brown @ 2004-05-21  1:16 UTC (permalink / raw)
  To: Robert Fendt; +Cc: linux-kernel, Jeff Garzik

On Thu, 2004-05-20 at 19:53, Robert Fendt wrote:
> On 17 May 2004 14:24:42 -0400

> differing network topologies.

probably not important, just shows that this is timing dependent.
System must be quiet enough that it gets into idle.

> > Does
> > cat /proc/acpi/processor/CPU0/power
> > show any C3 usage?
> 
> Yes, if I read this correctly, it does. BTW, seemingly pretty much the same on AC or battery.
> 
> betazed:~# cat /proc/acpi/processor/CPU1/power
> active state:            C2
> default state:           C1
> bus master activity:     ffffffff
> states:
>     C1:                  promotion[C2] demotion[--] latency[000] usage[00000010]
>    *C2:                  promotion[C3] demotion[C1] latency[001] usage[00025200]
>     C3:                  promotion[--] demotion[C2] latency[101] usage[00024564]
> 

Please verify that the problem goes away when you exclude the
acpi/processor module (CONFIG_ACPI_PROCESSOR) from the system.

With the recent spate of C3 issues, we should make an easier way to
disable C3 until it is fixed...

thanks,
-Len




* Re: peculiar problem with 2.6, 8139too + ACPI
  2004-05-21  1:16         ` Len Brown
@ 2004-05-27 13:45           ` Robert Fendt
  0 siblings, 0 replies; 9+ messages in thread
From: Robert Fendt @ 2004-05-27 13:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: Len Brown

On 20 May 2004 21:16:32 -0400
Len Brown <len.brown@intel.com> wrote:

> Please verify that the problem goes away when you exclude the
> acpi/processor module (CONFIG_ACPI_PROCESSOR) from the system.
> 
> With the recent spate of C3 issues, we should make an easier way to
> disable C3 until it is fixed...

Sorry for the delay, but I could not do further tests before now. I have
now at least managed to get 2.6.6 to boot on the box (although it breaks
the synaptics mouse pad driver, but that's a different matter; this too
seems to be ACPI-related, though). The boot problem was the new cpufreq
detection mode producing a segfault during boot (and guess what: ACPI
again).

So far I can confirm the following: 1) the problem still exists in 2.6.6
and 2) the problem can be switched on and off by (un)loading the
'processor' module. Test environment is still Debian-unstable, this time
with 2.6.6 vanilla kernel in Debian distro config (a.k.a. pretty bloated
but most things as modules).

I do not expect different results when disabling processor support
during kernel compile. Of course I can try this, if wanted. Any patch
suggestions to try out so far?

Regards,
Robert


* peculiar problem with 2.6, 8139too + ACPI
@ 2004-05-14 13:28 Robert Fendt
  0 siblings, 0 replies; 9+ messages in thread
From: Robert Fendt @ 2004-05-14 13:28 UTC (permalink / raw)
  To: linux-kernel

Hi folks,

I have the following problem: the 8139too driver produces lots of
overruns and is _very_ slow (but strangely not always; the problem seems
to be site-dependent, too). And here is what is weird: if I artificially
raise the system load (e.g. compile a kernel just for fun), the download
speed grows by at least an order of magnitude.

Machine is an Asus M2400N notebook, Centrino-based, 8139-internal (lspci
says rev.10), kernel revisions tested now are 2.6.0, 2.6.3, 2.6.5. I did
not really test 2.6.6 yet, since I am having problems getting the beast
to work correctly. Distribution is Debian/unstable. My home router
(quite an old machine) is not ACPI capable, so it does not show the
problem (see below), and I effectively have only this one 8139-based
system to test.

After some detective work, I have narrowed the problem down to an
apparent conflict between 8139too and the 'processor' module, and
therefore to the ACPI code, which is AFAIK all-new in 2.6. If I rmmod
the module, the network speed is fine, but I lose the option to use the
'thermal' module. The big question is: has anyone an idea whether this
is a bug in the 8139too driver or in the ACPI code (IOW: is the problem
maybe not-too-new after all and I was just too dumb to find it)? Or in
which direction I should look? I am afraid I am no kernel/driver hacker
(but on the other hand not a total C newbie, either). I just want to
avoid poking around in the dark and crashing my system in the process,
if possible :-)

Regards,
Robert

