LKML Archive on lore.kernel.org
* scheduling oddity on 2.6.20.3 stock
@ 2007-05-03  1:31 david
  2007-05-03 21:10 ` David Schwartz
  0 siblings, 1 reply; 7+ messages in thread
From: david @ 2007-05-03  1:31 UTC (permalink / raw)
  To: linux-kernel

I needed to recompress some files from .bz2 to .gz, so I set up a script to
do

bunzip2 -c $file.bz2 |gzip -9 >$file.gz

I expected that the two CPU-heavy processes would end up on different
CPUs and spend a little time shuffling data between the two CPUs of the
system (a dual-core Opteron).

Instead, I find that each process gets 50% of one CPU while the other
CPU is 97% idle.

David Lang


* RE: scheduling oddity on 2.6.20.3 stock
  2007-05-03  1:31 scheduling oddity on 2.6.20.3 stock david
@ 2007-05-03 21:10 ` David Schwartz
  2007-05-03 21:25   ` david
  2007-05-16 19:49   ` david
  0 siblings, 2 replies; 7+ messages in thread
From: David Schwartz @ 2007-05-03 21:10 UTC (permalink / raw)
  To: linux-kernel


> I needed to recompress some files from .bz2 to .gz so I setup a script to
> do
>
> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>
> I expected that the two CPU heavy processes would end up on different
> cpu's and spend a little time shuffling data between the two cpu's on a
> system (dual core opteron)
>
> however, instead what I find is that each process is getting 50% of one
> cpu while the other cpu is 97% idle.

The overlap you expected would only be possible if the
compression/decompression block size is small compared to the maximum pipe
buffer size. I suspect the reverse is the case.

It would be interesting to write an intermediate process that basically
enlarged the pipe buffers and see if that changed anything. Basically, the
intermediate process would allocate a large buffer (16MB or so) and fill it
from 'bunzip2' while draining it to 'gzip' in a non-blocking way (unless the
buffer was full/empty, of course).

DS




* RE: scheduling oddity on 2.6.20.3 stock
  2007-05-03 21:10 ` David Schwartz
@ 2007-05-03 21:25   ` david
  2007-05-16 19:49   ` david
  1 sibling, 0 replies; 7+ messages in thread
From: david @ 2007-05-03 21:25 UTC (permalink / raw)
  To: David Schwartz; +Cc: linux-kernel

On Thu, 3 May 2007, David Schwartz wrote:

>> I needed to recompress some files from .bz2 to .gz so I setup a script to
>> do
>>
>> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>>
>> I expected that the two CPU heavy processes would end up on different
>> cpu's and spend a little time shuffling data between the two cpu's on a
>> system (dual core opteron)
>>
>> however, instead what I find is that each process is getting 50% of one
>> cpu while the other cpu is 97% idle.
>
> That would only be possible if the compression/decompression block size is
> small compared to the maximum pipe buffer size. I suspect the reverse is the
> case.
>
> It would be interesting to write an intermediate process that basically
> enlarged the pipe buffers and see if that changed anything. Basically, the
> intermediate process would allocate a large buffer (16MB or so) and fill it
> from 'bunzip2' while draining it to 'gzip' in a non-blocking way (unless the
> buffer was full/empty, of course).

Hmm, how about

bunzip2 -c $file.bz2 | dd bs=8M | gzip -9 >$file.gz

Would that work?

David Lang


* RE: scheduling oddity on 2.6.20.3 stock
  2007-05-03 21:10 ` David Schwartz
  2007-05-03 21:25   ` david
@ 2007-05-16 19:49   ` david
  2007-05-16 20:06     ` David Schwartz
  1 sibling, 1 reply; 7+ messages in thread
From: david @ 2007-05-16 19:49 UTC (permalink / raw)
  To: David Schwartz; +Cc: linux-kernel

On Thu, 3 May 2007, David Schwartz wrote:

>> I needed to recompress some files from .bz2 to .gz so I setup a script to
>> do
>>
>> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
>>
>> I expected that the two CPU heavy processes would end up on different
>> cpu's and spend a little time shuffling data between the two cpu's on a
>> system (dual core opteron)
>>
>> however, instead what I find is that each process is getting 50% of one
>> cpu while the other cpu is 97% idle.
>
> That would only be possible if the compression/decompression block size is
> small compared to the maximum pipe buffer size. I suspect the reverse is the
> case.

I'm still running into this problem in various forms.

Is there an easy way to change the maximum pipe buffer size? A simple
change to the kernel source would be fine; I compile my own kernels.

> It would be interesting to write an intermediate process that basically
> enlarged the pipe buffers and see if that changed anything. Basically, the
> intermediate process would allocate a large buffer (16MB or so) and fill it
> from 'bunzip2' while draining it to 'gzip' in a non-blocking way (unless the
> buffer was full/empty, of course).


* RE: scheduling oddity on 2.6.20.3 stock
  2007-05-16 19:49   ` david
@ 2007-05-16 20:06     ` David Schwartz
  2007-05-16 20:59       ` David Schwartz
  0 siblings, 1 reply; 7+ messages in thread
From: David Schwartz @ 2007-05-16 20:06 UTC (permalink / raw)
  To: david; +Cc: linux-kernel


> On Thu, 3 May 2007, David Schwartz wrote:
>
> >> I needed to recompress some files from .bz2 to .gz so I setup a script to
> >> do
> >>
> >> bunzip2 -c $file.bz2 |gzip -9 >$file.gz
> >>
> >> I expected that the two CPU heavy processes would end up on different
> >> cpu's and spend a little time shuffling data between the two cpu's on a
> >> system (dual core opteron)
> >>
> >> however, instead what I find is that each process is getting 50% of one
> >> cpu while the other cpu is 97% idle.
> >
> > That would only be possible if the compression/decompression block size is
> > small compared to the maximum pipe buffer size. I suspect the reverse is the
> > case.
>
> I'm still running into this problem in various forms
>
> is there an easy way to change the maximum pipe buffer size? (including a
> simple change to the kernel source, I do compile my own kernels)

No. Changing the size will not do what you want, since that value only
tells the kernel what the size is; it does not determine what it is.
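
For readers on later kernels: the pipe capacity could not be adjusted at run time on the 2.6.20/2.6.21 kernels discussed here, but Linux 2.6.35 and newer let a process resize an individual pipe with fcntl(F_SETPIPE_SZ), up to the limit in /proc/sys/fs/pipe-max-size. A minimal sketch, assuming such a kernel; grow_pipe is a hypothetical helper name, and nothing here applies to the kernels in this thread.

```python
import fcntl
import os

# Linux fcntl command numbers; Python 3.10+ also exposes them by name.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def grow_pipe(fd, wanted):
    """Ask the kernel to enlarge the pipe behind fd; return its capacity."""
    try:
        fcntl.fcntl(fd, F_SETPIPE_SZ, wanted)
    except OSError:
        # Request refused (for example above pipe-max-size) or unsupported;
        # the pipe keeps whatever capacity it already had.
        pass
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


# Hypothetical usage: create a pipe and request a 1 MiB capacity.
read_end, write_end = os.pipe()
print("capacity now:", grow_pipe(write_end, 1 << 20))
os.close(read_end)
os.close(write_end)
```

A larger pipe only helps if it can hold a whole block from the producer, so the capacity that is useful depends on the block sizes of the programs at each end.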

> > It would be interesting to write an intermediate process that basically
> > enlarged the pipe buffers and see if that changed anything. Basically, the
> > intermediate process would allocate a large buffer (16MB or so) and fill it
> > from 'bunzip2' while draining it to 'gzip' in a non-blocking way (unless the
> > buffer was full/empty, of course).

It is not particularly hard to write such a process. I have a proxy that I
can easily tweak to do this. I'm going to give it a shot and see if it
helps.

DS




* RE: scheduling oddity on 2.6.20.3 stock
  2007-05-16 20:06     ` David Schwartz
@ 2007-05-16 20:59       ` David Schwartz
  2007-06-02 22:12         ` Bill Davidsen
  0 siblings, 1 reply; 7+ messages in thread
From: David Schwartz @ 2007-05-16 20:59 UTC (permalink / raw)
  To: David Schwartz, david; +Cc: linux-kernel


> > >> bunzip2 -c $file.bz2 |gzip -9 >$file.gz

So here are some actual results from a dual P3-1GHz machine (2.6.21.1,
CFSv9). First, let's time each operation individually:

$ time bunzip2 -k linux-2.6.21.tar.bz2

real    1m5.626s
user    1m2.240s
sys     0m3.144s


$ time gzip -9 linux-2.6.21.tar

real    1m17.652s
user    1m15.609s
sys     0m1.912s

The compression took longer (no surprise there), but the two are close
enough that efficient overlap will definitely affect the total wall time. If
we can both decompress and compress in 1:17, we are optimal. First, let's
try the normal way:

$ time (bunzip2 -c linux-2.6.21.tar.bz2 | gzip -9 > test1)

real    1m45.051s
user    2m16.945s
sys     0m2.752s

1:45, or 1/3 over optimal. Now, with a 32MB non-blocking cache between the
two processes ('accel' creates a 32MB cache and uses 'select' to fill from
stdin and empty to stdout without blocking either direction):

$ time (bunzip2 -c linux-2.6.21.tar.bz2 | ./accel | gzip -9 > test2)

real    1m18.361s
user    2m19.589s
sys     0m6.356s

Within testing accuracy of optimal.
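
The "1/3 over optimal" and "within testing accuracy" figures can be checked from the wall-clock times above. A small sketch of the arithmetic, taking the standalone gzip run (1m17.652s) as the best case a perfectly overlapped pipeline could reach; the helper names are illustrative only.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmY.YYYs' reading from time(1) into seconds."""
    return minutes * 60 + seconds


def percent_over(measured, baseline):
    """How far `measured` exceeds `baseline`, as a percentage."""
    return (measured - baseline) / baseline * 100


gzip_alone = to_seconds(1, 17.652)  # slowest single stage, i.e. the optimum
plain_pipe = to_seconds(1, 45.051)  # bunzip2 | gzip
with_accel = to_seconds(1, 18.361)  # bunzip2 | accel | gzip

print(f"plain pipe: {percent_over(plain_pipe, gzip_alone):.1f}% over optimal")
print(f"with accel: {percent_over(with_accel, gzip_alone):.1f}% over optimal")
```

This works out to about 35% over optimal for the plain pipe and under 1% with the buffer in between.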

So it's not the scheduler. It's the fact that bunzip2/gzip have inadequate
input/output buffering. I don't think it's unreasonable to consider this a
defect in those programs.
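
The 'accel' source was not posted in this thread. What follows is a minimal sketch of the same idea under stated assumptions, not the original program: a bounded in-memory buffer that uses select to read from stdin whenever there is room and write to stdout whenever data is queued, so neither neighbour waits on the other unless the buffer is full or empty. The names relay and main, the 64KB chunk size, and the choice of Python are all illustrative.

```python
import os
import select
import sys
from collections import deque


def relay(in_fd, out_fd, capacity=32 * 1024 * 1024, chunk=64 * 1024):
    """Copy in_fd to out_fd through a bounded in-memory buffer."""
    os.set_blocking(in_fd, False)
    os.set_blocking(out_fd, False)
    pending = deque()  # chunks read but not yet fully written
    buffered = 0       # total bytes currently held in `pending`
    eof = False
    while not eof or buffered:
        # Poll for input only while there is room, for output only while
        # data is queued; at least one of the two lists is always non-empty.
        rlist = [in_fd] if not eof and buffered < capacity else []
        wlist = [out_fd] if buffered else []
        readable, writable, _ = select.select(rlist, wlist, [])
        if readable:
            try:
                data = os.read(in_fd, min(chunk, capacity - buffered))
            except BlockingIOError:
                data = None  # spurious wakeup, try again next round
            if data:
                pending.append(data)
                buffered += len(data)
            elif data == b"":
                eof = True
        if writable:
            head = pending.popleft()
            try:
                written = os.write(out_fd, head)
            except BlockingIOError:
                written = 0
            buffered -= written
            if written < len(head):
                # Partial write: keep the unwritten tail at the front.
                pending.appendleft(head[written:])


def main():
    relay(sys.stdin.fileno(), sys.stdout.fileno())
```

Saved as a script that ends by calling main(), this would sit in the pipeline where './accel' does above. One caveat of the sketch: the non-blocking flag is set on the open file description, so it is visible to any other process sharing the same stdin or stdout.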

DS




* Re: scheduling oddity on 2.6.20.3 stock
  2007-05-16 20:59       ` David Schwartz
@ 2007-06-02 22:12         ` Bill Davidsen
  0 siblings, 0 replies; 7+ messages in thread
From: Bill Davidsen @ 2007-06-02 22:12 UTC (permalink / raw)
  To: davids; +Cc: david, linux-kernel

David Schwartz wrote:
> So it's not the scheduler. It's the fact that bunzip2/gzip have inadequate
> input/output buffering. I don't think it's unreasonable to consider this a
> defect in those programs.
> 
They are hardly designed to optimize this operation...

For a tunable buffer program that lets you set the buffer size and the
number of buffers in the pool, see the ptbuf program at
www.tmr.com/~public/source. I wrote it as a proof of concept for a pthreads
presentation I was giving, and it happened to be useful.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

