LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Question about synchronous write on SSD
@ 2008-02-19  5:48 Kyungmin Park
  2008-02-19  8:16 ` Thomas Petazzoni
  0 siblings, 1 reply; 6+ messages in thread
From: Kyungmin Park @ 2008-02-19  5:48 UTC (permalink / raw)
  To: Linux Filesystem Mailing List, linux-kernel

Hi,

Don't you remember the topic "solid state drive access and context
switching" [1].
I want to measure it is really better performance on SSD?

To write it on ssd synchronously, I hacked the
'generic_make_request()' [2] and got following results.

# echo 3 > /proc/sys/vm/drop_caches
# tiotest -f 100 -R -d /dev/sdd1
Tiotest results for 4 concurrent io threads:
| Write         400 MBs |    4.7 s |  84.306 MB/s |   8.4 %  |  77.5 % |
| Read          400 MBs |    4.3 s |  92.945 MB/s |   7.2 %  |  53.5 % |
Tiotest latency results:
| Write        |        0.126 ms |      706.379 ms |  0.00000 |   0.00000 |
| Read         |        0.161 ms |      311.738 ms |  0.00000 |   0.00000 |

# echo 3 > /proc/sys/vm/drop_caches
# tiotest -f 1000 -R -d /dev/sdd1
Tiotest results for 4 concurrent io threads:
| Write        4000 MBs |   47.5 s |  84.124 MB/s |   7.0 %  |  83.6 % |
| Read         4000 MBs |   41.9 s |  95.530 MB/s |   7.8 %  |  55.6 % |
Tiotest latency results:
| Write        |        0.176 ms |      714.677 ms |  0.00000 |   0.00000 |
| Read         |        0.161 ms |      311.815 ms |  0.00000 |   0.00000 |

However it's same performance as before. It means the patch is meaningless.

Could you tell me is it the proper place to hack or others?

Thank you,
Kyungmin Park

p.s. Of cource I got the following message
WARNING: at block/blk-core.c:1351 generic_make_request+0x126/0x3d8()

1. http://lkml.org/lkml/2007/12/3/247
2. simple hack
diff --git a/block/blk-core.c b/block/blk-core.c
index e9754dc..7262720 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1345,6 +1345,14 @@ static inline void
__generic_make_request(struct bio *bio)
        if (bio_check_eod(bio, nr_sectors))
                goto end_io;

+       /* FIXME simple hack by kmpark */
+       if (MINOR(bio->bi_bdev->bd_dev) == 49 &&
+           MAJOR(bio->bi_bdev->bd_dev) == 8 && bio_data_dir(bio) == WRITE) {
+               WARN_ON_ONCE(1);
+               /* Write synchronous */
+               bio->bi_rw |= (1 << BIO_RW_SYNC);
+       }
+
        /*
         * Resolve the mapping until finished. (drivers are
         * still free to implement/resolve their own stacking

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about synchronous write on SSD
  2008-02-19  5:48 Question about synchronous write on SSD Kyungmin Park
@ 2008-02-19  8:16 ` Thomas Petazzoni
  2008-02-19  9:39   ` Kyungmin Park
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Petazzoni @ 2008-02-19  8:16 UTC (permalink / raw)
  To: Kyungmin Park; +Cc: Linux Filesystem Mailing List, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

Hi,

Le Tue, 19 Feb 2008 14:48:18 +0900,
"Kyungmin Park" <kmpark@infradead.org> a écrit :

> +               /* Write synchronous */
> +               bio->bi_rw |= (1 << BIO_RW_SYNC);

Adding BIO_RW_SYNC doesn't make generic_make_request() synchronous as
in "generic_make_request() returns only after write completion".
BIO_RW_SYNC only asks the I/O layer to unplug immediatly. But
generic_make_request() still returns before the completion of the I/O,
and the completion is notified asynchronously.

See:
 http://lxr.free-electrons.com/source/include/linux/bio.h#139

Sincerly,

Thomas
-- 
Thomas Petazzoni, Free Electrons
Free Embedded Linux Training Materials
on http://free-electrons.com/training
(More than 1500 pages!)

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Question about synchronous write on SSD
  2008-02-19  8:16 ` Thomas Petazzoni
@ 2008-02-19  9:39   ` Kyungmin Park
  2008-02-19  9:43     ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Kyungmin Park @ 2008-02-19  9:39 UTC (permalink / raw)
  To: 'Thomas Petazzoni'
  Cc: 'Linux Filesystem Mailing List', linux-kernel

> Le Tue, 19 Feb 2008 14:48:18 +0900,
> "Kyungmin Park" <kmpark@infradead.org> a écrit :
> 
> > +               /* Write synchronous */
> > +               bio->bi_rw |= (1 << BIO_RW_SYNC);
> 
> Adding BIO_RW_SYNC doesn't make generic_make_request() synchronous as
> in "generic_make_request() returns only after write completion".
> BIO_RW_SYNC only asks the I/O layer to unplug immediatly. But
> generic_make_request() still returns before the completion of the I/O,
> and the completion is notified asynchronously.
> 

Agree, however see the following sequence.

__generic_make_request call q->make_request_fn(q, bio);
It was set by blk_init_queue_node with __make_request.
There are two ways in __make_request.
Case 1, get_rq
Case 2, out or merged (otherwise you mean unplug case)

In case 1, if the BIO_RW_SYNC is set, the request gets the REQ_RW_SYNC
And REQ_RW_SYNC says 
"include/linux/blkdev.h":112: __REQ_RW_SYNC,          /* request is sync (O_DIRECT) */
It means it acts as O_DIRECT flag. Is it right?
And it also is same as case 2. Unplug the device.
So next time it hasn't chance to merge???

Is it right?

BR,
Kyungmin Park


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about synchronous write on SSD
  2008-02-19  9:39   ` Kyungmin Park
@ 2008-02-19  9:43     ` Jens Axboe
  2008-02-19  9:55       ` Kyungmin Park
  0 siblings, 1 reply; 6+ messages in thread
From: Jens Axboe @ 2008-02-19  9:43 UTC (permalink / raw)
  To: Kyungmin Park
  Cc: 'Thomas Petazzoni',
	'Linux Filesystem Mailing List',
	linux-kernel

On Tue, Feb 19 2008, Kyungmin Park wrote:
> > Le Tue, 19 Feb 2008 14:48:18 +0900,
> > "Kyungmin Park" <kmpark@infradead.org> a écrit :
> > 
> > > +               /* Write synchronous */
> > > +               bio->bi_rw |= (1 << BIO_RW_SYNC);
> > 
> > Adding BIO_RW_SYNC doesn't make generic_make_request() synchronous as
> > in "generic_make_request() returns only after write completion".
> > BIO_RW_SYNC only asks the I/O layer to unplug immediatly. But
> > generic_make_request() still returns before the completion of the I/O,
> > and the completion is notified asynchronously.
> > 
> 
> Agree, however see the following sequence.
> 
> __generic_make_request call q->make_request_fn(q, bio);
> It was set by blk_init_queue_node with __make_request.
> There are two ways in __make_request.
> Case 1, get_rq
> Case 2, out or merged (otherwise you mean unplug case)
> 
> In case 1, if the BIO_RW_SYNC is set, the request gets the REQ_RW_SYNC
> And REQ_RW_SYNC says 
> "include/linux/blkdev.h":112: __REQ_RW_SYNC,          /* request is sync (O_DIRECT) */
> It means it acts as O_DIRECT flag. Is it right?
> And it also is same as case 2. Unplug the device.
> So next time it hasn't chance to merge???

But that still doesn't make it sync. I think you are working the wrong
way. For ssd we still want merging and plugging also makes sense to some
degree, though it probably should be minimized. It'll only cause an
initial latency, for busy IO workloads you wont be plugging/unplugging
much anyway.

In fact your patch makes things WORSE, since the io schedulers will now
treat the IO as sync and introduce idling for the disk head. And you
definitely don't want that.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: Question about synchronous write on SSD
  2008-02-19  9:43     ` Jens Axboe
@ 2008-02-19  9:55       ` Kyungmin Park
  2008-02-19 10:00         ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Kyungmin Park @ 2008-02-19  9:55 UTC (permalink / raw)
  To: 'Jens Axboe'
  Cc: 'Thomas Petazzoni',
	'Linux Filesystem Mailing List',
	linux-kernel

> >
> > Agree, however see the following sequence.
> >
> > __generic_make_request call q->make_request_fn(q, bio);
> > It was set by blk_init_queue_node with __make_request.
> > There are two ways in __make_request.
> > Case 1, get_rq
> > Case 2, out or merged (otherwise you mean unplug case)
> >
> > In case 1, if the BIO_RW_SYNC is set, the request gets the REQ_RW_SYNC
> > And REQ_RW_SYNC says
> > "include/linux/blkdev.h":112: __REQ_RW_SYNC,          /* request is sync (O_DIRECT) */
> > It means it acts as O_DIRECT flag. Is it right?
> > And it also is same as case 2. Unplug the device.
> > So next time it hasn't chance to merge???
> 
> But that still doesn't make it sync. I think you are working the wrong
> way. For ssd we still want merging and plugging also makes sense to some
> degree, though it probably should be minimized. It'll only cause an
> initial latency, for busy IO workloads you wont be plugging/unplugging
> much anyway.
> 
> In fact your patch makes things WORSE, since the io schedulers will now
> treat the IO as sync and introduce idling for the disk head. And you
> definitely don't want that.

Yes, you're right. It's for testing.
I just want to know the worst or corner case, if all writes are synchronous.
Of course I can measure the using tiotest "Do write synchronous" option.
Then you think it's the worse case?

Kyungmin Park


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question about synchronous write on SSD
  2008-02-19  9:55       ` Kyungmin Park
@ 2008-02-19 10:00         ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2008-02-19 10:00 UTC (permalink / raw)
  To: Kyungmin Park
  Cc: 'Thomas Petazzoni',
	'Linux Filesystem Mailing List',
	linux-kernel

On Tue, Feb 19 2008, Kyungmin Park wrote:
> > >
> > > Agree, however see the following sequence.
> > >
> > > __generic_make_request call q->make_request_fn(q, bio);
> > > It was set by blk_init_queue_node with __make_request.
> > > There are two ways in __make_request.
> > > Case 1, get_rq
> > > Case 2, out or merged (otherwise you mean unplug case)
> > >
> > > In case 1, if the BIO_RW_SYNC is set, the request gets the REQ_RW_SYNC
> > > And REQ_RW_SYNC says
> > > "include/linux/blkdev.h":112: __REQ_RW_SYNC,          /* request is sync (O_DIRECT) */
> > > It means it acts as O_DIRECT flag. Is it right?
> > > And it also is same as case 2. Unplug the device.
> > > So next time it hasn't chance to merge???
> > 
> > But that still doesn't make it sync. I think you are working the wrong
> > way. For ssd we still want merging and plugging also makes sense to some
> > degree, though it probably should be minimized. It'll only cause an
> > initial latency, for busy IO workloads you wont be plugging/unplugging
> > much anyway.
> > 
> > In fact your patch makes things WORSE, since the io schedulers will now
> > treat the IO as sync and introduce idling for the disk head. And you
> > definitely don't want that.
> 
> Yes, you're right. It's for testing.
> I just want to know the worst or corner case, if all writes are synchronous.
> Of course I can measure the using tiotest "Do write synchronous" option.
> Then you think it's the worse case?

If you want to test when all writes are sync, then either mount with -o
sync, open with O_SYNC or use O_DIRECT writes. You can't force that
behaviour by changing the block layer code. Perhaps you could force
O_SYNC when a file is opened, if you want to experiment with worst case
generally. Not sure that makes a lot of sense, though.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-02-19 10:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-19  5:48 Question about synchronous write on SSD Kyungmin Park
2008-02-19  8:16 ` Thomas Petazzoni
2008-02-19  9:39   ` Kyungmin Park
2008-02-19  9:43     ` Jens Axboe
2008-02-19  9:55       ` Kyungmin Park
2008-02-19 10:00         ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).