LKML Archive on lore.kernel.org
* AIO, FIO and Threads ...
From: Davide Libenzi @ 2007-03-21  4:58 UTC
  To: Linux Kernel Mailing List; +Cc: Ingo Molnar, Linus Torvalds, Jens Axboe

[-- Attachment #1: Type: TEXT/PLAIN, Size: 7317 bytes --]


I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):

http://www.xmailserver.org/guasi-lib.html

I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
and 8GB of RAM.
Mind that I'm no FIO expert at all, but I got some interesting 
results when comparing GUASI with libaio at 8/1000/10000 depths.
If I read those results correctly (Jens may help), GUASI throughput is more 
than double the libaio one.
Lots of context switches, yes. But the throughput looks like 2+ times higher.
Can someone try to repeat the measurements and/or spot the error?
Or tell me which other tests to run?
This is kind of a surprise for me ...
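
For the curious, the request lifecycle boils down to four calls. A minimal
sketch along the lines of what the attached engine does (see guasi.h for the
exact prototypes; error handling omitted):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <guasi.h>
#include <guasi_syscalls.h>

int main(void)
{
	char buf[4096];
	guasi_t hctx;
	guasi_req_t req;
	struct guasi_reqinfo rinf;
	int fd = open("testfile", O_RDONLY);

	/* create the context: min threads, max threads, priority */
	hctx = guasi_create(8, 32, 1);

	/* submit: ctx, private data, asid cookie, prio, then the
	 * plain pread(2) arguments */
	guasi__pread(hctx, NULL, NULL, 0, fd, buf, sizeof(buf), 0);

	/* collect up to one completion (timeout in ms, -1 = block),
	 * then pick up the result */
	if (guasi_fetch(hctx, &req, 1, -1) > 0 &&
	    guasi_req_info(req, &rinf) >= 0)
		printf("pread result: %ld\n", (long) rinf.result);

	guasi_free(hctx);
	close(fd);
	return 0;
}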



PS: The FIO patch to support GUASI is attached. You also need to fetch GUASI 
    and run (configure && make install).



- Davide



>> fio --name=global --rw=randread --size=64m --ioengine=guasi --name=job1 --iodepth=8 --thread

job1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=8
Starting 1 thread
Jobs: 1: [r] [100.0% done] [  3135/     0 kb/s] [eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=29298
  read : io=65,536KiB, bw=1,576KiB/s, iops=384, runt= 42557msec
    slat (msec): min=    0, max=    0, avg= 0.00, stdev= 0.00
    clat (msec): min=    0, max=  212, avg=20.26, stdev=18.83
    bw (KiB/s) : min= 1166, max= 3376, per=98.51%, avg=1552.50, stdev=317.42
  cpu          : usr=7.69%, sys=92.99%, ctx=97648
  IO depths    : 1=0.0%, 2=0.0%, 4=0.1%, 8=99.9%, 16=0.0%, 32=0.0%, >=64=0.0%
     lat (msec): 2=1.4%, 4=3.6%, 10=25.3%, 20=34.0%, 50=28.1%, 100=6.8%
     lat (msec): 250=0.8%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2000=0.0%

Run status group 0 (all jobs):
   READ: io=65,536KiB, aggrb=1,576KiB/s, minb=1,576KiB/s, maxb=1,576KiB/s, mint=42557msec, maxt=42557msec

Disk stats (read/write):
  sda: ios=16376/98, merge=8/135, ticks=339481/2810, in_queue=342290, util=99.17%


>> fio --name=global --rw=randread --size=64m --ioengine=libaio --name=job1 --iodepth=8 --thread

job1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=8
Starting 1 thread
Jobs: 1: [r] [95.9% done] [  2423/     0 kb/s] [eta 00m:03s]
job1: (groupid=0, jobs=1): err= 0: pid=29332
  read : io=65,536KiB, bw=929KiB/s, iops=226, runt= 72181msec
    slat (msec): min=    0, max=   98, avg=31.30, stdev=15.53
    clat (msec): min=    0, max=    0, avg= 0.00, stdev= 0.00
    bw (KiB/s) : min=  592, max= 2835, per=98.56%, avg=915.58, stdev=325.29
  cpu          : usr=0.02%, sys=0.34%, ctx=23023
  IO depths    : 1=22.2%, 2=22.2%, 4=44.4%, 8=11.1%, 16=0.0%, 32=0.0%, >=64=0.0%
     lat (msec): 2=100.0%, 4=0.0%, 10=0.0%, 20=0.0%, 50=0.0%, 100=0.0%
     lat (msec): 250=0.0%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2000=0.0%

Run status group 0 (all jobs):
   READ: io=65,536KiB, aggrb=929KiB/s, minb=929KiB/s, maxb=929KiB/s, mint=72181msec, maxt=72181msec

Disk stats (read/write):
  sda: ios=16384/43, merge=0/42, ticks=71889/20573, in_queue=92461, util=99.57%


>> fio --name=global --rw=randread --size=64m --ioengine=guasi --name=job1 --iodepth=1000 --thread

job1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=1000
Starting 1 thread
Jobs: 1: [r] [93.9% done] [   815/     0 kb/s] [eta 00m:02s]
job1: (groupid=0, jobs=1): err= 0: pid=29343
  read : io=65,536KiB, bw=2,130KiB/s, iops=520, runt= 31500msec
    slat (msec): min=    0, max=   26, avg= 1.02, stdev= 4.19
    clat (msec): min=   12, max=28024, avg=1920.73, stdev=764.20
    bw (KiB/s) : min= 1139, max= 3376, per=95.21%, avg=2027.87, stdev=354.38
  cpu          : usr=7.35%, sys=93.77%, ctx=104637
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.1%, 32=0.2%, >=64=99.6%
     lat (msec): 2=0.0%, 4=0.0%, 10=0.0%, 20=0.0%, 50=0.1%, 100=0.4%
     lat (msec): 250=1.2%, 500=1.0%, 750=0.8%, 1000=0.7%, >=2000=45.5%

Run status group 0 (all jobs):
   READ: io=65,536KiB, aggrb=2,130KiB/s, minb=2,130KiB/s, maxb=2,130KiB/s, mint=31500msec, maxt=31500msec

Disk stats (read/write):
  sda: ios=16267/31, merge=115/28, ticks=4019824/313471, in_queue=4333625, util=98.84%


>> fio --name=global --rw=randread --size=64m --ioengine=libaio --name=job1 --iodepth=1000 --thread

job1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1000
Starting 1 thread
Jobs: 1: [r] [98.6% done] [  4083/     0 kb/s] [eta 00m:01s]
job1: (groupid=0, jobs=1): err= 0: pid=30346
  read : io=65,536KiB, bw=920KiB/s, iops=224, runt= 72925msec
    slat (msec): min=    0, max= 5539, avg=4431.27, stdev=1268.03
    clat (msec): min=    0, max=    0, avg= 0.00, stdev= 0.00
    bw (KiB/s) : min=    0, max= 2361, per=103.56%, avg=952.75, stdev=499.54
  cpu          : usr=0.02%, sys=0.39%, ctx=23089
  IO depths    : 1=0.2%, 2=0.2%, 4=0.4%, 8=0.8%, 16=1.7%, 32=3.3%, >=64=93.4%
     lat (msec): 2=100.0%, 4=0.0%, 10=0.0%, 20=0.0%, 50=0.0%, 100=0.0%
     lat (msec): 250=0.0%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2000=0.0%

Run status group 0 (all jobs):
   READ: io=65,536KiB, aggrb=920KiB/s, minb=920KiB/s, maxb=920KiB/s, mint=72925msec, maxt=72925msec

Disk stats (read/write):
  sda: ios=16384/70, merge=0/54, ticks=72644/31038, in_queue=103682, util=99.61%


>> fio --name=global --rw=randread --size=64m --ioengine=guasi --name=job1 --iodepth=10000 --thread

job1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=10000
Starting 1 thread
Jobs: 1: [r] [100.0% done] [ 40752/     0 kb/s] [eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=32203
  read : io=65,536KiB, bw=1,965KiB/s, iops=479, runt= 34148msec
    slat (msec): min=    0, max=  323, avg=124.06, stdev=112.39
    clat (msec): min=    0, max=33982, avg=20686.86, stdev=13689.22
    bw (KiB/s) : min=    1, max= 2187, per=94.75%, avg=1861.75, stdev=392.89
  cpu          : usr=0.35%, sys=2.42%, ctx=166667
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.1%, 32=0.2%, >=64=99.6%
     lat (msec): 2=0.0%, 4=0.0%, 10=0.0%, 20=0.1%, 50=0.5%, 100=1.5%
     lat (msec): 250=5.0%, 500=5.6%, 750=1.8%, 1000=0.8%, >=2000=2.3%

Run status group 0 (all jobs):
   READ: io=65,536KiB, aggrb=1,965KiB/s, minb=1,965KiB/s, maxb=1,965KiB/s, mint=34148msec, maxt=34148msec

Disk stats (read/write):
  sda: ios=16064/122, merge=319/73, ticks=4350268/172548, in_queue=4521657, util=98.95%



>> fio --name=global --rw=randread --size=64m --ioengine=libaio --name=job1 --iodepth=10000 --thread

job1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=10000
Starting 1 thread
Jobs: 1: [r] [61.3% done] [     0/     0 kb/s] [eta 00m:46s]
job1: (groupid=0, jobs=1): err= 0: pid=9791
  read : io=65,536KiB, bw=917KiB/s, iops=224, runt= 73118msec
    slat (msec): min=    1, max=52656, avg=40082.23, stdev=15703.83
    clat (msec): min=    0, max=    3, avg= 2.61, stdev= 0.49
    bw (KiB/s) : min=    0, max= 2002, per=109.16%, avg=1001.00, stdev=1415.63
  cpu          : usr=0.02%, sys=0.40%, ctx=23095
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=0.2%, 32=0.4%, >=64=99.2%
     lat (msec): 2=0.0%, 4=100.0%, 10=0.0%, 20=0.0%, 50=0.0%, 100=0.0%
     lat (msec): 250=0.0%, 500=0.0%, 750=0.0%, 1000=0.0%, >=2000=0.0%

Run status group 0 (all jobs):
   READ: io=65,536KiB, aggrb=917KiB/s, minb=917KiB/s, maxb=917KiB/s, mint=73118msec, maxt=73118msec

Disk stats (read/write):
  sda: ios=16384/82, merge=0/86, ticks=72720/36477, in_queue=109197, util=99.44%


[-- Attachment #2: Type: TEXT/x-diff, Size: 8085 bytes --]

diff -Nru fio-1.14/engines/guasi.c fio-1.14.guasi/engines/guasi.c
--- fio-1.14/engines/guasi.c	1969-12-31 16:00:00.000000000 -0800
+++ fio-1.14.guasi/engines/guasi.c	2007-03-20 21:26:58.000000000 -0700
@@ -0,0 +1,256 @@
+/*
+ * guasi engine
+ *
+ * IO engine using the GUASI library.
+ *
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <errno.h>
+#include <assert.h>
+
+#include "../fio.h"
+#include "../os.h"
+
+#ifdef FIO_HAVE_GUASI
+
+#define GFIO_MIN_THREADS 32
+
+#include <guasi.h>
+#include <guasi_syscalls.h>
+
+#ifdef GFIO_DEBUG
+#define GDBG_PRINT(a) printf a
+#else
+#define GDBG_PRINT(a) (void) 0
+#endif
+
+#define STFU_GCC(a) a = a
+
+
+struct guasi_data {
+	guasi_t hctx;
+	int max_reqs;
+	guasi_req_t *reqs;
+	struct io_u **io_us;
+	int reqs_nr;
+};
+
+static int fio_guasi_prep(struct thread_data fio_unused *td, struct io_u *io_u)
+{
+	STFU_GCC(io_u);
+
+	GDBG_PRINT(("fio_guasi_prep(%p)\n", io_u));
+
+	return 0;
+}
+
+static struct io_u *fio_guasi_event(struct thread_data *td, int event)
+{
+	struct guasi_data *ld = td->io_ops->data;
+	struct io_u *io_u;
+	struct guasi_reqinfo rinf;
+
+	GDBG_PRINT(("fio_guasi_event(%d)\n", event));
+	if (guasi_req_info(ld->reqs[event], &rinf) < 0) {
+		fprintf(stderr, "guasi_req_info(%d) FAILED!\n", event);
+		return NULL;
+	}
+	io_u = rinf.asid;
+	GDBG_PRINT(("fio_guasi_event(%d) -> %p\n", event, io_u));
+
+	if (io_u->ddir == DDIR_READ ||
+	    io_u->ddir == DDIR_WRITE) {
+		if (rinf.result != (long) io_u->xfer_buflen) {
+			if (rinf.result < 0)
+				io_u->error = rinf.error;
+			else
+				io_u->resid = io_u->xfer_buflen - rinf.result;
+		} else
+			io_u->error = 0;
+	} else
+		io_u->error = rinf.result;
+
+	return io_u;
+}
+
+static int fio_guasi_getevents(struct thread_data *td, int min, int max,
+			       struct timespec *t)
+{
+	struct guasi_data *ld = td->io_ops->data;
+	int n = 0, r;
+	long timeo = -1;
+
+	GDBG_PRINT(("fio_guasi_getevents(%d, %d)\n", min, max));
+	if (min > ld->max_reqs)
+		min = ld->max_reqs;
+	if (max > ld->max_reqs)
+		max = ld->max_reqs;
+	if (t)
+		timeo = t->tv_sec * 1000L + t->tv_nsec / 1000000L;
+	do {
+		r = guasi_fetch(ld->hctx, ld->reqs + n, max - n, timeo);
+		if (r < 0)
+			break;
+		n += r;
+		if (n >= min)
+			break;
+	} while (1);
+	GDBG_PRINT(("fio_guasi_getevents() -> %d\n", n));
+
+	return n;
+}
+
+static int fio_guasi_queue(struct thread_data *td, struct io_u *io_u)
+{
+	struct guasi_data *ld = td->io_ops->data;
+
+	GDBG_PRINT(("fio_guasi_queue(%p)\n", io_u));
+	if (ld->reqs_nr == (int) td->iodepth)
+		return FIO_Q_BUSY;
+
+	ld->io_us[ld->reqs_nr] = io_u;
+	ld->reqs_nr++;
+	return FIO_Q_QUEUED;
+}
+
+static void fio_guasi_queued(struct thread_data *td, struct io_u **io_us,
+			     unsigned int nr)
+{
+	struct timeval now;
+	struct io_u *io_u = io_us[nr];
+
+	fio_gettime(&now, NULL);
+	memcpy(&io_u->issue_time, &now, sizeof(now));
+	io_u_queued(td, io_u);
+}
+
+static int fio_guasi_commit(struct thread_data *td)
+{
+	struct guasi_data *ld = td->io_ops->data;
+	int i;
+	struct io_u *io_u;
+	struct fio_file *f;
+
+	GDBG_PRINT(("fio_guasi_commit()\n"));
+	for (i = 0; i < ld->reqs_nr; i++) {
+		io_u = ld->io_us[i];
+		f = io_u->file;
+		io_u->greq = NULL;
+		if (io_u->ddir == DDIR_READ)
+			io_u->greq = guasi__pread(ld->hctx, ld, io_u, 0,
+						  f->fd, io_u->xfer_buf, io_u->xfer_buflen,
+						  io_u->offset);
+		else if (io_u->ddir == DDIR_WRITE)
+			io_u->greq = guasi__pwrite(ld->hctx, ld, io_u, 0,
+						   f->fd, io_u->xfer_buf, io_u->xfer_buflen,
+						   io_u->offset);
+		else if (io_u->ddir == DDIR_SYNC)
+			io_u->greq = guasi__fsync(ld->hctx, ld, io_u, 0, f->fd);
+		else {
+			fprintf(stderr, "fio_guasi_commit() FAILED: %d\n", io_u->ddir);
+		}
+		if (io_u->greq != NULL)
+			fio_guasi_queued(td, ld->io_us, i);
+	}
+	ld->reqs_nr = 0;
+	GDBG_PRINT(("fio_guasi_commit() -> %d\n", i));
+
+	return 0;
+}
+
+static int fio_guasi_cancel(struct thread_data *td, struct io_u *io_u)
+{
+	struct guasi_data *ld = td->io_ops->data;
+
+	STFU_GCC(ld);
+	GDBG_PRINT(("fio_guasi_cancel(%p)\n", io_u));
+
+	return guasi_req_cancel(io_u->greq);
+}
+
+static void fio_guasi_cleanup(struct thread_data *td)
+{
+	struct guasi_data *ld = td->io_ops->data;
+
+	if (ld) {
+		guasi_free(ld->hctx);
+		free(ld->reqs);
+		free(ld->io_us);
+		free(ld);
+		td->io_ops->data = NULL;
+	}
+}
+
+static int fio_guasi_init(struct thread_data *td)
+{
+	int maxthr;
+	struct guasi_data *ld = malloc(sizeof(*ld));
+
+	GDBG_PRINT(("fio_guasi_init(): depth=%d\n", td->iodepth));
+	memset(ld, 0, sizeof(*ld));
+	maxthr = td->iodepth > GFIO_MIN_THREADS ? td->iodepth: GFIO_MIN_THREADS;
+	if ((ld->hctx = guasi_create(GFIO_MIN_THREADS, maxthr, 1)) == NULL) {
+		td_verror(td, errno, "guasi_create");
+		free(ld);
+		return 1;
+	}
+	ld->max_reqs = td->iodepth;
+	ld->reqs = malloc(ld->max_reqs * sizeof(guasi_req_t));
+	ld->io_us = malloc(ld->max_reqs * sizeof(struct io_u *));
+	memset(ld->io_us, 0, ld->max_reqs * sizeof(struct io_u *));
+	ld->reqs_nr = 0;
+
+	td->io_ops->data = ld;
+	GDBG_PRINT(("fio_guasi_init(): depth=%d -> %p\n", td->iodepth, ld));
+
+	return 0;
+}
+
+static struct ioengine_ops ioengine = {
+	.name		= "guasi",
+	.version	= FIO_IOOPS_VERSION,
+	.init		= fio_guasi_init,
+	.prep		= fio_guasi_prep,
+	.queue		= fio_guasi_queue,
+	.commit		= fio_guasi_commit,
+	.cancel		= fio_guasi_cancel,
+	.getevents	= fio_guasi_getevents,
+	.event		= fio_guasi_event,
+	.cleanup	= fio_guasi_cleanup,
+	.open_file	= generic_open_file,
+	.close_file	= generic_close_file,
+};
+
+#else /* FIO_HAVE_GUASI */
+
+/*
+ * When we have a proper configure system in place, we simply won't build
+ * and install this io engine. For now, install a crippled version that
+ * just complains and fails to load.
+ */
+static int fio_guasi_init(struct thread_data fio_unused *td)
+{
+	fprintf(stderr, "fio: guasi not available\n");
+	return 1;
+}
+
+static struct ioengine_ops ioengine = {
+	.name		= "guasi",
+	.version	= FIO_IOOPS_VERSION,
+	.init		= fio_guasi_init,
+};
+
+#endif
+
+static void fio_init fio_guasi_register(void)
+{
+	register_ioengine(&ioengine);
+}
+
+static void fio_exit fio_guasi_unregister(void)
+{
+	unregister_ioengine(&ioengine);
+}
+
diff -Nru fio-1.14/fio.h fio-1.14.guasi/fio.h
--- fio-1.14/fio.h	2007-03-14 06:24:42.000000000 -0700
+++ fio-1.14.guasi/fio.h	2007-03-20 20:06:48.000000000 -0700
@@ -23,6 +23,10 @@
 #include "syslet.h"
 #endif
 
+#ifdef FIO_HAVE_GUASI
+#include <guasi.h>
+#endif
+
 enum fio_ddir {
 	DDIR_READ = 0,
 	DDIR_WRITE,
@@ -110,6 +114,9 @@
 #ifdef FIO_HAVE_SYSLET
 		struct syslet_req req;
 #endif
+#ifdef FIO_HAVE_GUASI
+		guasi_req_t greq;
+#endif
 	};
 	struct timeval start_time;
 	struct timeval issue_time;
diff -Nru fio-1.14/Makefile fio-1.14.guasi/Makefile
--- fio-1.14/Makefile	2007-03-14 06:24:42.000000000 -0700
+++ fio-1.14.guasi/Makefile	2007-03-20 21:13:42.000000000 -0700
@@ -18,6 +18,7 @@
 OBJS += engines/null.o
 OBJS += engines/net.o
 OBJS += engines/syslet-rw.o
+OBJS += engines/guasi.o
 
 INSTALL = install
 prefix = /usr/local
@@ -26,7 +27,7 @@
 all: $(PROGS) $(SCRIPTS)
 
 fio: $(OBJS)
-	$(CC) $(CFLAGS) -o $@ $(filter %.o,$^) -lpthread -lm -ldl -laio -lrt
+	$(CC) $(CFLAGS) -o $@ $(filter %.o,$^) -lguasi -lpthread -lm -ldl -laio -lrt
 
 clean:
 	-rm -f *.o .depend cscope.out $(PROGS) engines/*.o core.* core
diff -Nru fio-1.14/os-linux.h fio-1.14.guasi/os-linux.h
--- fio-1.14/os-linux.h	2007-03-14 06:24:42.000000000 -0700
+++ fio-1.14.guasi/os-linux.h	2007-03-20 21:13:58.000000000 -0700
@@ -10,6 +10,7 @@
 
 #define FIO_HAVE_LIBAIO
 #define FIO_HAVE_POSIXAIO
+#define FIO_HAVE_GUASI
 #define FIO_HAVE_FADVISE
 #define FIO_HAVE_CPU_AFFINITY
 #define FIO_HAVE_DISK_UTIL


* Re: AIO, FIO and Threads ...
From: Davide Libenzi @ 2007-03-21  6:54 UTC
  To: Linux Kernel Mailing List; +Cc: Ingo Molnar, Linus Torvalds, Jens Axboe

On Tue, 20 Mar 2007, Davide Libenzi wrote:

> 
> I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
> FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> 
> http://www.xmailserver.org/guasi-lib.html
> 
> I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> and 8GB of RAM.
> Mind that I'm no FIO expert at all, but I got some interesting 
> results when comparing GUASI with libaio at 8/1000/10000 depths.
> If I read those results correctly (Jens may help), GUASI throughput is more 
> than double the libaio one.
> Lots of context switches, yes. But the throughput looks like 2+ times higher.
> Can someone try to repeat the measurements and/or spot the error?
> Or tell me which other tests to run?
> This is kind of a surprise for me ...

Tests with block sizes bigger than 4KB bring libaio performance close to 
GUASI, but not quite:

http://www.xmailserver.org/guasi-libaio-fio-results-1.txt

I dropped the last FIO+GUASI patch here:

http://www.xmailserver.org/fio-guasi-0.5.diff

And Jens FIO is here:

http://brick.kernel.dk/snaps/



- Davide




* Re: AIO, FIO and Threads ...
From: Jens Axboe @ 2007-03-21  7:40 UTC
  To: Davide Libenzi; +Cc: Linux Kernel Mailing List, Ingo Molnar, Linus Torvalds

On Tue, Mar 20 2007, Davide Libenzi wrote:
> 
> I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
> FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> 
> http://www.xmailserver.org/guasi-lib.html
> 
> I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> and 8GB of RAM.
> Mind that I'm no FIO expert at all, but I got some interesting 
> results when comparing GUASI with libaio at 8/1000/10000 depths.
> If I read those results correctly (Jens may help), GUASI throughput is more 
> than double the libaio one.
> Lots of context switches, yes. But the throughput looks like 2+ times higher.
> Can someone try to repeat the measurements and/or spot the error?
> Or tell me which other tests to run?
> This is kind of a surprise for me ...

I don't know guasi at all, but libaio requires O_DIRECT to be async. I'm
sure you know this, but you may not know that fio defaults to buffered IO,
so you have to tell it to use O_DIRECT :-)

So try adding a --direct=1 (or --buffered=0, same thing) as an extra
option when comparing depths > 1.
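
To be clear about what "async" means here: the libaio pattern fio drives
looks roughly like this. A hand-rolled sketch, not fio's engine code; note
that with O_DIRECT the buffer, offset and length must be sector aligned:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <libaio.h>

int main(void)
{
	io_context_t ctx = 0;
	struct iocb cb, *cbs[1] = { &cb };
	struct io_event ev;
	void *buf;
	int fd = open("testfile", O_RDONLY | O_DIRECT);

	/* O_DIRECT wants sector-aligned buffers */
	posix_memalign(&buf, 4096, 4096);
	io_queue_init(8, &ctx);		/* io_setup(2) wrapper */

	io_prep_pread(&cb, fd, buf, 4096, 0);
	io_submit(ctx, 1, cbs);		/* only truly async with O_DIRECT */

	if (io_getevents(ctx, 1, 1, &ev, NULL) == 1)
		printf("pread result: %ld\n", (long) ev.res);

	io_queue_release(ctx);
	free(buf);
	close(fd);
	return 0;
}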

I'll add your guasi engine, but disable it. Unfortunately fio still
doesn't have a nifty configure setup, so these things are still
manual...

-- 
Jens Axboe



* Re: AIO, FIO and Threads ...
From: Davide Libenzi @ 2007-03-21 15:23 UTC
  To: Jens Axboe; +Cc: Linux Kernel Mailing List, Ingo Molnar, Linus Torvalds

On Wed, 21 Mar 2007, Jens Axboe wrote:

> On Tue, Mar 20 2007, Davide Libenzi wrote:
> > 
> > I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
> > FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> > 
> > http://www.xmailserver.org/guasi-lib.html
> > 
> > I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> > and 8GB of RAM.
> > Mind that I'm no FIO expert at all, but I got some interesting 
> > results when comparing GUASI with libaio at 8/1000/10000 depths.
> > If I read those results correctly (Jens may help), GUASI throughput is more 
> > than double the libaio one.
> > Lots of context switches, yes. But the throughput looks like 2+ times higher.
> > Can someone try to repeat the measurements and/or spot the error?
> > Or tell me which other tests to run?
> > This is kind of a surprise for me ...
> 
> I don't know guasi at all, but libaio requires O_DIRECT to be async. I'm
> sure you know this, but you may not know that fio defaults to buffered IO,
> so you have to tell it to use O_DIRECT :-)
> 
> So try adding a --direct=1 (or --buffered=0, same thing) as an extra
> option when comparing depths > 1.

I knew about AIO and O_DIRECT, but I thought FIO was using it by default :)
I used it for the first time last night, and there is a pretty wide 
set of options. Will re-run today with --direct.
I was pretty surprised though. Since libaio was matching syslets, I 
was thinking that a userspace version using a queue-always design (hard to 
do the cache-hit optimization if you're not inside the scheduler ;) was 
going to be considerably slower.


> I'll add your guasi engine, but disable it. Unfortunately fio still
> doesn't have a nifty configure setup, so these things are still
> manual...

Well, you do have your own HAVE_*, you just need to make autoconf/automake 
do the checks for you. Of course, then you'll be pissed every time an 
autoconf/automake update breaks your setup, but lately it seems to be 
getting better. Really :)


- Davide




* Re: AIO, FIO and Threads ...
From: Davide Libenzi @ 2007-03-22  2:02 UTC
  To: Jens Axboe; +Cc: Linux Kernel Mailing List, Ingo Molnar, Linus Torvalds

On Wed, 21 Mar 2007, Jens Axboe wrote:

> On Tue, Mar 20 2007, Davide Libenzi wrote:
> > 
> > I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
> > FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> > 
> > http://www.xmailserver.org/guasi-lib.html
> > 
> > I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> > and 8GB of RAM.
> > Mind that I'm no FIO expert at all, but I got some interesting 
> > results when comparing GUASI with libaio at 8/1000/10000 depths.
> > If I read those results correctly (Jens may help), GUASI throughput is more 
> > than double the libaio one.
> > Lots of context switches, yes. But the throughput looks like 2+ times higher.
> > Can someone try to repeat the measurements and/or spot the error?
> > Or tell me which other tests to run?
> > This is kind of a surprise for me ...
> 
> I don't know guasi at all, but libaio requires O_DIRECT to be async. I'm
> sure you know this, but you may not know that fio defaults to buffered IO,
> so you have to tell it to use O_DIRECT :-)
> 
> So try adding a --direct=1 (or --buffered=0, same thing) as an extra
> option when comparing depths > 1.

This is a much smaller box: a P4 with 1GB of RAM, running a 1GB RW test. 
Now libaio performs well, even though GUASI can keep pace. Quite a 
surprise, nonetheless ...




- Davide



*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 
        --ioengine=guasi --name=job1 --iodepth=100 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=100
Starting 1 thread
Jobs: 1: [m] [100.0% done] [   609/     0 kb/s] [eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=21862
  read : io=8,300KiB, bw=412KiB/s, iops=100, runt= 20599msec
    slat (msec): min=    0, max=    2, avg= 0.04, stdev= 0.29
    clat (msec): min=   11, max= 1790, avg=564.76, stdev=350.58
    bw (KiB/s) : min=   47, max=  692, per=121.80%, avg=501.83, stdev=196.86
  write: io=11,344KiB, bw=563KiB/s, iops=137, runt= 20599msec
    slat (msec): min=    0, max=    2, avg= 0.04, stdev= 0.28
    clat (msec): min=    2, max=  643, avg=311.86, stdev=108.85
    bw (KiB/s) : min=    0, max= 1695, per=143.52%, avg=808.00, stdev=632.11
  cpu          : usr=0.19%, sys=1.94%, ctx=28036
  IO depths    : 1=0.0%, 2=0.0%, 4=0.1%, 8=0.2%, 16=0.3%, 32=0.7%, >=64=98.7%
     lat (msec): 2=0.0%, 4=0.0%, 10=0.0%, 20=0.3%, 50=1.2%, 100=2.2%
     lat (msec): 250=16.4%, 500=51.4%, 750=16.8%, 1000=6.5%, >=2000=5.1%

Run status group 0 (all jobs):
   READ: io=8,300KiB, aggrb=412KiB/s, minb=412KiB/s, maxb=412KiB/s, mint=20599msec, maxt=20599msec
  WRITE: io=11,344KiB, aggrb=563KiB/s, minb=563KiB/s, maxb=563KiB/s, mint=20599msec, maxt=20599msec

Disk stats (read/write):
  sda: ios=2074/2846, merge=1/26, ticks=945564/14060, in_queue=959624, util=97.65%



*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 
        --ioengine=libaio --name=job1 --iodepth=100 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=100
Starting 1 thread
Jobs: 1: [m] [100.0% done] [   406/   438 kb/s] [eta 00m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=21860
  read : io=8,076KiB, bw=403KiB/s, iops=98, runt= 20495msec
    slat (msec): min=    0, max=  494, avg= 0.55, stdev=14.75
    clat (msec): min=    0, max= 1788, avg=509.38, stdev=391.43
    bw (KiB/s) : min=   20, max=  682, per=104.55%, avg=421.32, stdev=153.91
  write: io=11,024KiB, bw=550KiB/s, iops=134, runt= 20495msec
    slat (msec): min=    0, max=  441, avg= 0.23, stdev= 8.40
    clat (msec): min=    0, max= 1695, avg=368.51, stdev=308.11
    bw (KiB/s) : min=    0, max= 1787, per=105.78%, avg=581.78, stdev=438.43
  cpu          : usr=0.06%, sys=0.76%, ctx=6185
  IO depths    : 1=0.1%, 2=0.2%, 4=0.3%, 8=0.7%, 16=1.3%, 32=2.7%, >=64=94.7%
     lat (msec): 2=0.4%, 4=0.1%, 10=0.7%, 20=2.5%, 50=7.0%, 100=9.4%
     lat (msec): 250=20.2%, 500=23.2%, 750=17.8%, 1000=10.9%, >=2000=7.9%

Run status group 0 (all jobs):
   READ: io=8,076KiB, aggrb=403KiB/s, minb=403KiB/s, maxb=403KiB/s, mint=20495msec, maxt=20495msec
  WRITE: io=11,024KiB, aggrb=550KiB/s, minb=550KiB/s, maxb=550KiB/s, mint=20495msec, maxt=20495msec

Disk stats (read/write):
  sda: ios=2019/2788, merge=0/38, ticks=988100/1048096, in_queue=2036196, util=99.38%



*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 
        --ioengine=guasi --name=job1 --iodepth=1000 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=guasi, iodepth=1000
Starting 1 thread
Jobs: 1: [m] [1.9% done] [  2348/  1710 kb/s] [eta 21m:00s]
job1: (groupid=0, jobs=1): err= 0: pid=19471
  read : io=10,640KiB, bw=436KiB/s, iops=106, runt= 24972msec
    slat (msec): min=    0, max=   26, avg= 3.46, stdev= 7.68
    clat (msec): min=   27, max= 9800, avg=5048.47, stdev=1728.19
    bw (KiB/s) : min=   44, max=  689, per=98.52%, avg=429.54, stdev=190.74
  write: io=9,748KiB, bw=399KiB/s, iops=97, runt= 24972msec
    slat (msec): min=    0, max=   26, avg= 4.04, stdev= 8.21
    clat (msec): min=    9, max= 9153, avg=4718.51, stdev=1692.57
    bw (KiB/s) : min=    0, max= 1586, per=109.27%, avg=436.00, stdev=395.51
  cpu          : usr=0.24%, sys=1.57%, ctx=20661
  IO depths    : 1=0.0%, 2=0.0%, 4=0.1%, 8=0.2%, 16=0.3%, 32=0.6%, >=64=98.8%
     lat (msec): 2=0.0%, 4=0.0%, 10=0.0%, 20=0.0%, 50=0.2%, 100=0.1%
     lat (msec): 250=0.5%, 500=0.4%, 750=1.1%, 1000=1.0%, >=2000=3.8%

Run status group 0 (all jobs):
   READ: io=10,640KiB, aggrb=436KiB/s, minb=436KiB/s, maxb=436KiB/s, mint=24972msec, maxt=24972msec
  WRITE: io=9,748KiB, aggrb=399KiB/s, minb=399KiB/s, maxb=399KiB/s, mint=24972msec, maxt=24972msec

Disk stats (read/write):
  sda: ios=2660/2449, merge=0/27, ticks=1016188/22112, in_queue=1038296, util=97.60%


*** fio --name=global --rw=randrw --size=1g --bs=4k --direct=1 
        --ioengine=libaio --name=job1 --iodepth=1000 --thread --runtime=20
job1: (g=0): rw=randrw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=1000
Starting 1 thread
Jobs: 1: [m] [1.8% done] [  3502/     0 kb/s] [eta 20m:01s]
job1: (groupid=0, jobs=1): err= 0: pid=19723
  read : io=10,944KiB, bw=493KiB/s, iops=120, runt= 22697msec
    slat (msec): min=  363, max= 5490, avg=3917.46, stdev=1842.45
    clat (msec): min=    0, max= 5491, avg=980.56, stdev=1764.36
    bw (KiB/s) : min=    1, max= 4091, per=211.44%, avg=1042.38, stdev=1517.27
  write: io=8,468KiB, bw=382KiB/s, iops=93, runt= 22697msec
    slat (msec): min=    0, max= 3025, avg=2864.85, stdev=115.63
    clat (msec): min=    0, max= 6104, avg=381.86, stdev=1124.52
    bw (KiB/s) : min=    0, max= 3046, per=196.70%, avg=751.40, stdev=1305.17
  cpu          : usr=0.04%, sys=0.62%, ctx=2148
  IO depths    : 1=0.0%, 2=0.0%, 4=0.1%, 8=0.2%, 16=0.3%, 32=0.7%, >=64=98.7%
     lat (msec): 2=74.0%, 4=0.1%, 10=0.1%, 20=0.0%, 50=0.0%, 100=0.0%
     lat (msec): 250=0.0%, 500=2.8%, 750=0.0%, 1000=4.6%, >=2000=2.6%

Run status group 0 (all jobs):
   READ: io=10,944KiB, aggrb=493KiB/s, minb=493KiB/s, maxb=493KiB/s, mint=22697msec, maxt=22697msec
  WRITE: io=8,468KiB, aggrb=382KiB/s, minb=382KiB/s, maxb=382KiB/s, mint=22697msec, maxt=22697msec

Disk stats (read/write):
  sda: ios=2734/2135, merge=2/36, ticks=2345836/1359304, in_queue=3722228, util=99.23%





* Re: AIO, FIO and Threads ...
From: Jens Axboe @ 2007-03-22  6:47 UTC
  To: Davide Libenzi; +Cc: Linux Kernel Mailing List, Ingo Molnar, Linus Torvalds

On Wed, Mar 21 2007, Davide Libenzi wrote:
> On Wed, 21 Mar 2007, Jens Axboe wrote:
> 
> > On Tue, Mar 20 2007, Davide Libenzi wrote:
> > > 
> > > I was looking at Jens' FIO stuff, and I decided to cook a quick patch for 
> > > FIO to support GUASI (Generic Userspace Asynchronous Syscall Interface):
> > > 
> > > http://www.xmailserver.org/guasi-lib.html
> > > 
> > > I then ran a few tests on my Dual Opteron 252 with SATA drives (sata_nv) 
> > > and 8GB of RAM.
> > > Mind that I'm no FIO expert at all, but I got some interesting 
> > > results when comparing GUASI with libaio at 8/1000/10000 depths.
> > > If I read those results correctly (Jens may help), GUASI throughput is more 
> > > than double the libaio one.
> > > Lots of context switches, yes. But the throughput looks like 2+ times higher.
> > > Can someone try to repeat the measurements and/or spot the error?
> > > Or tell me which other tests to run?
> > > This is kind of a surprise for me ...
> > 
> > I don't know guasi at all, but libaio requires O_DIRECT to be async. I'm
> > sure you know this, but you may not know that fio defaults to buffered IO,
> > so you have to tell it to use O_DIRECT :-)
> > 
> > So try adding a --direct=1 (or --buffered=0, same thing) as an extra
> > option when comparing depths > 1.
> 
> I knew about AIO and O_DIRECT, but I thought FIO was using it by default :)

It actually used to, but I changed the default a few months ago as I
think that is more appropriate.

> I used it for the first time last night, and there is a pretty wide 
> set of options. Will re-run today with --direct.

Yep, I try to add good explanations for all of them though, also
available through --cmdhelp or --cmdhelp=option so you don't have to
look up the documentation all the time.

-- 
Jens Axboe


