LKML Archive on lore.kernel.org
* [PATCH 00/12] Posix Alarm Timers full patchset
From: John Stultz @ 2011-01-06  2:15 UTC
  To: linux-kernel
  Cc: John Stultz, Arve Hjønnevåg, Brian Swetland,
	Thomas Gleixner, Alessandro Zummo

The first 8 of these patches are already in -tip for inclusion
in 2.6.38, but I wanted to send them all out again so folks could
try them out and I could hopefully get some feedback on the remaining 
patches in this set.

Specifically, I'm really hoping to get some feedback from any Android
folks on the introduction of CLOCK_BOOTTIME (generic posix version of
ANDROID_ALARM_ELAPSED_REALTIME) which provides CLOCK_MONOTONIC + suspend
time, as well as the posix alarm timers (CLOCK_REALTIME_ALARM and
CLOCK_BOOTTIME_ALARM), which try to provide equivalent functionality
as the Android Alarm Timers.
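
As a rough userspace illustration of the CLOCK_BOOTTIME semantics described
above (a sketch, not code from this patchset): if CLOCK_BOOTTIME is
CLOCK_MONOTONIC plus suspend time, then the difference between the two clocks
approximates the time spent suspended. The helper names clock_ns() and
report_suspend_time() are hypothetical, and the fallback value 7 for
CLOCK_BOOTTIME is an assumption that matches mainline Linux headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#ifndef CLOCK_BOOTTIME
#define CLOCK_BOOTTIME 7	/* assumption: clockid value in mainline Linux */
#endif

/* Read clock @id into @ns (nanoseconds). Returns 0 on success, -1 on error. */
int clock_ns(clockid_t id, int64_t *ns)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	*ns = (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
	return 0;
}

/*
 * Print both clocks and their difference. The difference is only an
 * estimate of suspend time, because the two clocks are sampled at
 * slightly different instants.
 */
int report_suspend_time(void)
{
	int64_t mono, boot;

	if (clock_ns(CLOCK_MONOTONIC, &mono) || clock_ns(CLOCK_BOOTTIME, &boot))
		return -1;
	printf("monotonic: %lld ns\n", (long long)mono);
	printf("boottime:  %lld ns\n", (long long)boot);
	printf("suspended: ~%lld ns\n", (long long)(boot - mono));
	return 0;
}
```

On a machine that has never been suspended, the two readings should be nearly
identical. On older glibc versions, clock_gettime() may require linking with
-lrt.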

There are still a few remaining TODOs here, but getting some feedback
as to any limitations this interface has compared to the Android
/dev/alarm interface would be great.

The full set applies on top of v2.6.37.

thanks
-john

CC: Arve Hjønnevåg <arve@android.com>
CC: Brian Swetland <swetland@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>

John Stultz (11):
  timers: Introduce timerlist infrastructure.
  timers: Rename timerlist infrastructure to timerqueue
  timers: Fixup allmodconfig build issue
  hrtimers: Convert hrtimers to use timerlist infrastructure
  hrtimer: fix timerqueue conversion flub
  RTC: Rework RTC code to use timerqueue for events
  RTC: Remove UIE emulation
  [RFC] hrtimers: extend hrtimer base code to handle more then 2
    clockids
  [RFC] hrtimers: Add CLOCK_BOOTTIME clockid, hrtimerbase and posix
    interface
  timers: Add rb_init_node() to allow for stack allocated rb nodes
  [RFC] Introduce Alarm (hybrid) timers

Thomas Gleixner (1):
  rtc: Namespace fixup

 drivers/rtc/class.c          |   13 +
 drivers/rtc/interface.c      |  574 +++++++++++++++++++++--------------
 drivers/rtc/rtc-dev.c        |  104 -------
 drivers/rtc/rtc-lib.c        |   28 ++
 include/linux/alarmtimer.h   |   30 ++
 include/linux/hrtimer.h      |   40 ++-
 include/linux/posix-timers.h |    2 +
 include/linux/rbtree.h       |    8 +
 include/linux/rtc.h          |   51 +++-
 include/linux/time.h         |    6 +
 include/linux/timerqueue.h   |   37 +++
 kernel/hrtimer.c             |  146 +++++-----
 kernel/posix-timers.c        |   16 +-
 kernel/time/Makefile         |    2 +-
 kernel/time/alarmtimer.c     |  699 ++++++++++++++++++++++++++++++++++++++++++
 kernel/time/timekeeping.c    |   79 +++++-
 kernel/time/timer_list.c     |    8 +-
 lib/Makefile                 |    2 +-
 lib/timerqueue.c             |  121 ++++++++
 19 files changed, 1525 insertions(+), 441 deletions(-)
 create mode 100644 include/linux/alarmtimer.h
 create mode 100644 include/linux/timerqueue.h
 create mode 100644 kernel/time/alarmtimer.c
 create mode 100644 lib/timerqueue.c

-- 
1.7.3.2.146.gca209



* [PATCH 01/12] timers: Introduce timerlist infrastructure.
From: John Stultz @ 2011-01-06  2:15 UTC
  To: linux-kernel
  Cc: John Stultz, Alessandro Zummo, Thomas Gleixner, Richard Cochran

The timerlist infrastructure is a thin layer over the rbtree
code that implements a simple list of timers sorted by an
expires value, and a getnext function that provides a pointer
to the earliest timer.

This infrastructure allows drivers and other kernel infrastructure
to easily implement timers without duplicating code.
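
The idea can be sketched in plain userspace C (an illustration only, not the
code from this patch). The sketch keeps timers in a tree ordered by expiry and
caches a pointer to the earliest one, so looking up the next timer is O(1).
It assumes a simple unbalanced binary search tree in place of the kernel
rbtree, and all tq_* and tnode/thead names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct tnode {
	struct tnode *parent, *left, *right;
	int64_t expires;
};

struct thead {
	struct tnode *root;
	struct tnode *next;	/* cached pointer to the earliest timer */
};

static struct tnode *leftmost(struct tnode *n)
{
	while (n->left)
		n = n->left;
	return n;
}

/* In-order successor: the timer that expires after @n, or NULL. */
struct tnode *tq_successor(struct tnode *n)
{
	if (n->right)
		return leftmost(n->right);
	while (n->parent && n == n->parent->right)
		n = n->parent;
	return n->parent;
}

/* Insert @n sorted by expiry; equal expiries go to the right. */
void tq_add(struct thead *h, struct tnode *n)
{
	struct tnode **p = &h->root, *parent = NULL;

	while (*p) {
		parent = *p;
		p = n->expires < parent->expires ? &parent->left : &parent->right;
	}
	n->parent = parent;
	n->left = n->right = NULL;
	*p = n;
	if (!h->next || n->expires < h->next->expires)
		h->next = n;
}

/* Make @repl take the place of @old under old's parent. */
static void transplant(struct thead *h, struct tnode *old, struct tnode *repl)
{
	if (!old->parent)
		h->root = repl;
	else if (old == old->parent->left)
		old->parent->left = repl;
	else
		old->parent->right = repl;
	if (repl)
		repl->parent = old->parent;
}

/* Remove @n, updating the cached earliest pointer first if needed. */
void tq_del(struct thead *h, struct tnode *n)
{
	if (h->next == n)
		h->next = tq_successor(n);
	if (!n->left) {
		transplant(h, n, n->right);
	} else if (!n->right) {
		transplant(h, n, n->left);
	} else {
		struct tnode *s = leftmost(n->right);

		if (s->parent != n) {
			transplant(h, s, s->right);
			s->right = n->right;
			s->right->parent = s;
		}
		transplant(h, n, s);
		s->left = n->left;
		s->left->parent = s;
	}
	n->parent = n->left = n->right = NULL;
}

/* Queue 30, 10, 20; drop the earliest; return the new earliest expiry. */
int64_t tq_demo(void)
{
	struct thead h = { NULL, NULL };
	struct tnode a = { .expires = 30 }, b = { .expires = 10 },
		     c = { .expires = 20 };

	tq_add(&h, &a);
	tq_add(&h, &b);
	tq_add(&h, &c);
	tq_del(&h, &b);
	return h.next->expires;
}
```

As with the infrastructure described here, the sketch does no locking, so
callers would have to serialize access themselves. An unbalanced tree can
degrade to O(n) inserts; balancing, as an rbtree does, avoids that.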

Signed-off-by: John Stultz <john.stultz@linaro.org>
LKML Reference: <1290136329-18291-2-git-send-email-john.stultz@linaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Richard Cochran <richardcochran@gmail.com>
---
 include/linux/timerlist.h |   37 ++++++++++++++
 lib/Makefile              |    2 +-
 lib/timerlist.c           |  118 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 156 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/timerlist.h
 create mode 100644 lib/timerlist.c

diff --git a/include/linux/timerlist.h b/include/linux/timerlist.h
new file mode 100644
index 0000000..c46b28a
--- /dev/null
+++ b/include/linux/timerlist.h
@@ -0,0 +1,37 @@
+#ifndef _LINUX_TIMERLIST_H
+#define _LINUX_TIMERLIST_H
+
+#include <linux/rbtree.h>
+#include <linux/ktime.h>
+
+
+struct timerlist_node {
+	struct rb_node node;
+	ktime_t expires;
+};
+
+struct timerlist_head {
+	struct rb_root head;
+	struct timerlist_node *next;
+};
+
+
+extern void timerlist_add(struct timerlist_head *head,
+				struct timerlist_node *node);
+extern void timerlist_del(struct timerlist_head *head,
+				struct timerlist_node *node);
+extern struct timerlist_node *timerlist_getnext(struct timerlist_head *head);
+extern struct timerlist_node *timerlist_iterate_next(
+						struct timerlist_node *node);
+
+static inline void timerlist_init(struct timerlist_node *node)
+{
+	RB_CLEAR_NODE(&node->node);
+}
+
+static inline void timerlist_init_head(struct timerlist_head *head)
+{
+	head->head = RB_ROOT;
+	head->next = NULL;
+}
+#endif /* _LINUX_TIMERLIST_H */
diff --git a/lib/Makefile b/lib/Makefile
index e6a3763..8b475cf 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -8,7 +8,7 @@ KBUILD_CFLAGS = $(subst -pg,,$(ORIG_CFLAGS))
 endif
 
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
-	 rbtree.o radix-tree.o dump_stack.o \
+	 rbtree.o radix-tree.o dump_stack.o timerlist.o\
 	 idr.o int_sqrt.o extable.o prio_tree.o \
 	 sha1.o irq_regs.o reciprocal_div.o argv_split.o \
 	 proportions.o prio_heap.o ratelimit.o show_mem.o \
diff --git a/lib/timerlist.c b/lib/timerlist.c
new file mode 100644
index 0000000..9101b42
--- /dev/null
+++ b/lib/timerlist.c
@@ -0,0 +1,118 @@
+/*
+ *  Generic Timer-list
+ *
+ *  Manages a simple list of timers, ordered by expiration time.
+ *  Uses rbtrees for quick list adds and expiration.
+ *
+ *  NOTE: All of the following functions need to be serialized
+ *  to avoid races. No locking is done by this libary code.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/timerlist.h>
+#include <linux/rbtree.h>
+
+/**
+ * timerlist_add - Adds timer to timerlist.
+ *
+ * @head: head of timerlist
+ * @node: timer node to be added
+ *
+ * Adds the timer node to the timerlist, sorted by the
+ * node's expires value.
+ */
+void timerlist_add(struct timerlist_head *head, struct timerlist_node *node)
+{
+	struct rb_node **p = &head->head.rb_node;
+	struct rb_node *parent = NULL;
+	struct timerlist_node  *ptr;
+
+	/* Make sure we don't add nodes that are already added */
+	WARN_ON_ONCE(!RB_EMPTY_NODE(&node->node));
+
+	while (*p) {
+		parent = *p;
+		ptr = rb_entry(parent, struct timerlist_node, node);
+		if (node->expires.tv64 < ptr->expires.tv64)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	rb_link_node(&node->node, parent, p);
+	rb_insert_color(&node->node, &head->head);
+
+	if (!head->next || node->expires.tv64 < head->next->expires.tv64)
+		head->next = node;
+}
+
+/**
+ * timerlist_del - Removes a timer from the timerlist.
+ *
+ * @head: head of timerlist
+ * @node: timer node to be removed
+ *
+ * Removes the timer node from the timerlist.
+ */
+void timerlist_del(struct timerlist_head *head, struct timerlist_node *node)
+{
+	WARN_ON_ONCE(RB_EMPTY_NODE(&node->node));
+
+	/* update next pointer */
+	if (head->next == node) {
+		struct rb_node *rbn = rb_next(&node->node);
+
+		head->next = rbn ?
+			rb_entry(rbn, struct timerlist_node, node) : NULL;
+	}
+	rb_erase(&node->node, &head->head);
+	RB_CLEAR_NODE(&node->node);
+}
+
+
+/**
+ * timerlist_getnext - Returns the timer with the earlies expiration time
+ *
+ * @head: head of timerlist
+ *
+ * Returns a pointer to the timer node that has the
+ * earliest expiration time.
+ */
+struct timerlist_node *timerlist_getnext(struct timerlist_head *head)
+{
+	return head->next;
+}
+
+
+/**
+ * timerlist_iterate_next - Returns the timer after the provided timer
+ *
+ * @node: Pointer to a timer.
+ *
+ * Provides the timer that is after the given node. This is used, when
+ * necessary, to iterate through the list of timers in a timer list
+ * without modifying the list.
+ */
+struct timerlist_node *timerlist_iterate_next(struct timerlist_node *node)
+{
+	struct rb_node *next;
+
+	if (!node)
+		return NULL;
+	next = rb_next(&node->node);
+	if (!next)
+		return NULL;
+	return container_of(next, struct timerlist_node, node);
+}
-- 
1.7.3.2.146.gca209



* [PATCH 02/12] timers: Rename timerlist infrastructure to timerqueue
From: John Stultz @ 2011-01-06  2:15 UTC
  To: linux-kernel; +Cc: John Stultz

Thomas pointed out a namespace collision between the new timerlist
infrastructure I introduced and the existing timer_list.c.

So to avoid confusion, I've renamed the timerlist infrastructure
to timerqueue.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/timerlist.h  |   37 --------------
 include/linux/timerqueue.h |   37 ++++++++++++++
 lib/Makefile               |    2 +-
 lib/timerlist.c            |  118 --------------------------------------------
 lib/timerqueue.c           |  118 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 156 insertions(+), 156 deletions(-)
 delete mode 100644 include/linux/timerlist.h
 create mode 100644 include/linux/timerqueue.h
 delete mode 100644 lib/timerlist.c
 create mode 100644 lib/timerqueue.c

diff --git a/include/linux/timerlist.h b/include/linux/timerlist.h
deleted file mode 100644
index c46b28a..0000000
--- a/include/linux/timerlist.h
+++ /dev/null
@@ -1,37 +0,0 @@
-#ifndef _LINUX_TIMERLIST_H
-#define _LINUX_TIMERLIST_H
-
-#include <linux/rbtree.h>
-#include <linux/ktime.h>
-
-
-struct timerlist_node {
-	struct rb_node node;
-	ktime_t expires;
-};
-
-struct timerlist_head {
-	struct rb_root head;
-	struct timerlist_node *next;
-};
-
-
-extern void timerlist_add(struct timerlist_head *head,
-				struct timerlist_node *node);
-extern void timerlist_del(struct timerlist_head *head,
-				struct timerlist_node *node);
-extern struct timerlist_node *timerlist_getnext(struct timerlist_head *head);
-extern struct timerlist_node *timerlist_iterate_next(
-						struct timerlist_node *node);
-
-static inline void timerlist_init(struct timerlist_node *node)
-{
-	RB_CLEAR_NODE(&node->node);
-}
-
-static inline void timerlist_init_head(struct timerlist_head *head)
-{
-	head->head = RB_ROOT;
-	head->next = NULL;
-}
-#endif /* _LINUX_TIMERLIST_H */
diff --git a/include/linux/timerqueue.h b/include/linux/timerqueue.h
new file mode 100644
index 0000000..406b103
--- /dev/null
+++ b/include/linux/timerqueue.h
@@ -0,0 +1,37 @@
+#ifndef _LINUX_TIMERQUEUE_H
+#define _LINUX_TIMERQUEUE_H
+
+#include <linux/rbtree.h>
+#include <linux/ktime.h>
+
+
+struct timerqueue_node {
+	struct rb_node node;
+	ktime_t expires;
+};
+
+struct timerqueue_head {
+	struct rb_root head;
+	struct timerqueue_node *next;
+};
+
+
+extern void timerqueue_add(struct timerqueue_head *head,
+				struct timerqueue_node *node);
+extern void timerqueue_del(struct timerqueue_head *head,
+				struct timerqueue_node *node);
+extern struct timerqueue_node *timerqueue_getnext(struct timerqueue_head *head);
+extern struct timerqueue_node *timerqueue_iterate_next(
+						struct timerqueue_node *node);
+
+static inline void timerqueue_init(struct timerqueue_node *node)
+{
+	RB_CLEAR_NODE(&node->node);
+}
+
+static inline void timerqueue_init_head(struct timerqueue_head *head)
+{
+	head->head = RB_ROOT;
+	head->next = NULL;
+}
+#endif /* _LINUX_TIMERQUEUE_H */
diff --git a/lib/Makefile b/lib/Makefile
index 8b475cf..9e2db72 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -8,7 +8,7 @@ KBUILD_CFLAGS = $(subst -pg,,$(ORIG_CFLAGS))
 endif
 
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
-	 rbtree.o radix-tree.o dump_stack.o timerlist.o\
+	 rbtree.o radix-tree.o dump_stack.o timerqueue.o\
 	 idr.o int_sqrt.o extable.o prio_tree.o \
 	 sha1.o irq_regs.o reciprocal_div.o argv_split.o \
 	 proportions.o prio_heap.o ratelimit.o show_mem.o \
diff --git a/lib/timerlist.c b/lib/timerlist.c
deleted file mode 100644
index 9101b42..0000000
--- a/lib/timerlist.c
+++ /dev/null
@@ -1,118 +0,0 @@
-/*
- *  Generic Timer-list
- *
- *  Manages a simple list of timers, ordered by expiration time.
- *  Uses rbtrees for quick list adds and expiration.
- *
- *  NOTE: All of the following functions need to be serialized
- *  to avoid races. No locking is done by this libary code.
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; either version 2 of the License, or
- *  (at your option) any later version.
- *
- *  This program is distributed in the hope that it will be useful,
- *  but WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License
- *  along with this program; if not, write to the Free Software
- *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
- */
-
-#include <linux/timerlist.h>
-#include <linux/rbtree.h>
-
-/**
- * timerlist_add - Adds timer to timerlist.
- *
- * @head: head of timerlist
- * @node: timer node to be added
- *
- * Adds the timer node to the timerlist, sorted by the
- * node's expires value.
- */
-void timerlist_add(struct timerlist_head *head, struct timerlist_node *node)
-{
-	struct rb_node **p = &head->head.rb_node;
-	struct rb_node *parent = NULL;
-	struct timerlist_node  *ptr;
-
-	/* Make sure we don't add nodes that are already added */
-	WARN_ON_ONCE(!RB_EMPTY_NODE(&node->node));
-
-	while (*p) {
-		parent = *p;
-		ptr = rb_entry(parent, struct timerlist_node, node);
-		if (node->expires.tv64 < ptr->expires.tv64)
-			p = &(*p)->rb_left;
-		else
-			p = &(*p)->rb_right;
-	}
-	rb_link_node(&node->node, parent, p);
-	rb_insert_color(&node->node, &head->head);
-
-	if (!head->next || node->expires.tv64 < head->next->expires.tv64)
-		head->next = node;
-}
-
-/**
- * timerlist_del - Removes a timer from the timerlist.
- *
- * @head: head of timerlist
- * @node: timer node to be removed
- *
- * Removes the timer node from the timerlist.
- */
-void timerlist_del(struct timerlist_head *head, struct timerlist_node *node)
-{
-	WARN_ON_ONCE(RB_EMPTY_NODE(&node->node));
-
-	/* update next pointer */
-	if (head->next == node) {
-		struct rb_node *rbn = rb_next(&node->node);
-
-		head->next = rbn ?
-			rb_entry(rbn, struct timerlist_node, node) : NULL;
-	}
-	rb_erase(&node->node, &head->head);
-	RB_CLEAR_NODE(&node->node);
-}
-
-
-/**
- * timerlist_getnext - Returns the timer with the earlies expiration time
- *
- * @head: head of timerlist
- *
- * Returns a pointer to the timer node that has the
- * earliest expiration time.
- */
-struct timerlist_node *timerlist_getnext(struct timerlist_head *head)
-{
-	return head->next;
-}
-
-
-/**
- * timerlist_iterate_next - Returns the timer after the provided timer
- *
- * @node: Pointer to a timer.
- *
- * Provides the timer that is after the given node. This is used, when
- * necessary, to iterate through the list of timers in a timer list
- * without modifying the list.
- */
-struct timerlist_node *timerlist_iterate_next(struct timerlist_node *node)
-{
-	struct rb_node *next;
-
-	if (!node)
-		return NULL;
-	next = rb_next(&node->node);
-	if (!next)
-		return NULL;
-	return container_of(next, struct timerlist_node, node);
-}
diff --git a/lib/timerqueue.c b/lib/timerqueue.c
new file mode 100644
index 0000000..f46de84
--- /dev/null
+++ b/lib/timerqueue.c
@@ -0,0 +1,118 @@
+/*
+ *  Generic Timer-queue
+ *
+ *  Manages a simple queue of timers, ordered by expiration time.
+ *  Uses rbtrees for quick list adds and expiration.
+ *
+ *  NOTE: All of the following functions need to be serialized
+ *  to avoid races. No locking is done by this libary code.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/timerqueue.h>
+#include <linux/rbtree.h>
+
+/**
+ * timerqueue_add - Adds timer to timerqueue.
+ *
+ * @head: head of timerqueue
+ * @node: timer node to be added
+ *
+ * Adds the timer node to the timerqueue, sorted by the
+ * node's expires value.
+ */
+void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
+{
+	struct rb_node **p = &head->head.rb_node;
+	struct rb_node *parent = NULL;
+	struct timerqueue_node  *ptr;
+
+	/* Make sure we don't add nodes that are already added */
+	WARN_ON_ONCE(!RB_EMPTY_NODE(&node->node));
+
+	while (*p) {
+		parent = *p;
+		ptr = rb_entry(parent, struct timerqueue_node, node);
+		if (node->expires.tv64 < ptr->expires.tv64)
+			p = &(*p)->rb_left;
+		else
+			p = &(*p)->rb_right;
+	}
+	rb_link_node(&node->node, parent, p);
+	rb_insert_color(&node->node, &head->head);
+
+	if (!head->next || node->expires.tv64 < head->next->expires.tv64)
+		head->next = node;
+}
+
+/**
+ * timerqueue_del - Removes a timer from the timerqueue.
+ *
+ * @head: head of timerqueue
+ * @node: timer node to be removed
+ *
+ * Removes the timer node from the timerqueue.
+ */
+void timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
+{
+	WARN_ON_ONCE(RB_EMPTY_NODE(&node->node));
+
+	/* update next pointer */
+	if (head->next == node) {
+		struct rb_node *rbn = rb_next(&node->node);
+
+		head->next = rbn ?
+			rb_entry(rbn, struct timerqueue_node, node) : NULL;
+	}
+	rb_erase(&node->node, &head->head);
+	RB_CLEAR_NODE(&node->node);
+}
+
+
+/**
+ * timerqueue_getnext - Returns the timer with the earlies expiration time
+ *
+ * @head: head of timerqueue
+ *
+ * Returns a pointer to the timer node that has the
+ * earliest expiration time.
+ */
+struct timerqueue_node *timerqueue_getnext(struct timerqueue_head *head)
+{
+	return head->next;
+}
+
+
+/**
+ * timerqueue_iterate_next - Returns the timer after the provided timer
+ *
+ * @node: Pointer to a timer.
+ *
+ * Provides the timer that is after the given node. This is used, when
+ * necessary, to iterate through the list of timers in a timer list
+ * without modifying the list.
+ */
+struct timerqueue_node *timerqueue_iterate_next(struct timerqueue_node *node)
+{
+	struct rb_node *next;
+
+	if (!node)
+		return NULL;
+	next = rb_next(&node->node);
+	if (!next)
+		return NULL;
+	return container_of(next, struct timerqueue_node, node);
+}
-- 
1.7.3.2.146.gca209



* [PATCH 03/12] timers: Fixup allmodconfig build issue
From: John Stultz @ 2011-01-06  2:15 UTC
  To: linux-kernel; +Cc: John Stultz

Adds the missing EXPORT_SYMBOL lines, whose absence causes the following
build failures with allmodconfig:
ERROR: "timerqueue_add" [drivers/rtc/rtc-core.ko] undefined!
ERROR: "timerqueue_getnext" [drivers/rtc/rtc-core.ko] undefined!
ERROR: "timerqueue_del" [drivers/rtc/rtc-core.ko] undefined!

Reported-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 lib/timerqueue.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/timerqueue.c b/lib/timerqueue.c
index f46de84..444b093 100644
--- a/lib/timerqueue.c
+++ b/lib/timerqueue.c
@@ -24,6 +24,7 @@
 
 #include <linux/timerqueue.h>
 #include <linux/rbtree.h>
+#include <linux/module.h>
 
 /**
  * timerqueue_add - Adds timer to timerqueue.
@@ -57,6 +58,7 @@ void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
 	if (!head->next || node->expires.tv64 < head->next->expires.tv64)
 		head->next = node;
 }
+EXPORT_SYMBOL_GPL(timerqueue_add);
 
 /**
  * timerqueue_del - Removes a timer from the timerqueue.
@@ -80,7 +82,7 @@ void timerqueue_del(struct timerqueue_head *head, struct timerqueue_node *node)
 	rb_erase(&node->node, &head->head);
 	RB_CLEAR_NODE(&node->node);
 }
-
+EXPORT_SYMBOL_GPL(timerqueue_del);
 
 /**
  * timerqueue_getnext - Returns the timer with the earlies expiration time
@@ -94,7 +96,7 @@ struct timerqueue_node *timerqueue_getnext(struct timerqueue_head *head)
 {
 	return head->next;
 }
-
+EXPORT_SYMBOL_GPL(timerqueue_getnext);
 
 /**
  * timerqueue_iterate_next - Returns the timer after the provided timer
@@ -116,3 +118,4 @@ struct timerqueue_node *timerqueue_iterate_next(struct timerqueue_node *node)
 		return NULL;
 	return container_of(next, struct timerqueue_node, node);
 }
+EXPORT_SYMBOL_GPL(timerqueue_iterate_next);
-- 
1.7.3.2.146.gca209



* [PATCH 04/12] hrtimers: Convert hrtimers to use timerlist infrastructure
From: John Stultz @ 2011-01-06  2:15 UTC
  To: linux-kernel
  Cc: John Stultz, Alessandro Zummo, Thomas Gleixner, Richard Cochran

Converts the hrtimer code to use the new timerlist (since renamed to
timerqueue) infrastructure.
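
The shape of the conversion can be sketched in userspace C (an illustration
under stated assumptions, not the kernel code). The timer embeds a generic
queue node that owns the hard expiry, and container_of() recovers the
enclosing timer from a node pointer returned by the queue. The names
queue_node, my_timer and the my_timer_* helpers are hypothetical, and the
container_of() macro here is a simplified stand-in for the kernel's.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel macro: no type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic node: all the queue needs to know is the expiry. */
struct queue_node {
	int64_t expires;
};

/*
 * Hypothetical timer type. The id field is placed first only to show
 * that the node does not have to sit at offset 0.
 */
struct my_timer {
	int id;
	struct queue_node node;	/* hard expiry lives in node.expires */
	int64_t softexpires;
};

/*
 * Set a soft expiry plus slack. Plain addition is used here; a real
 * implementation would want to guard against overflow.
 */
void my_timer_set_expires_range(struct my_timer *t, int64_t time,
				int64_t delta)
{
	t->softexpires = time;
	t->node.expires = time + delta;
}

/* Map a queue node back to the timer that embeds it. */
struct my_timer *my_timer_from_node(struct queue_node *n)
{
	return container_of(n, struct my_timer, node);
}
```

Because the queue only ever handles the embedded node, the same sorting code
can serve any timer type that embeds one.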

Signed-off-by: John Stultz <john.stultz@linaro.org>
LKML Reference: <1290136329-18291-3-git-send-email-john.stultz@linaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Richard Cochran <richardcochran@gmail.com>
---
 include/linux/hrtimer.h  |   32 ++++++++---------
 kernel/hrtimer.c         |   86 ++++++++++++++++------------------------------
 kernel/time/timer_list.c |    8 ++--
 3 files changed, 49 insertions(+), 77 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index fd0c1b8..0a7abaa 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -22,7 +22,7 @@
 #include <linux/wait.h>
 #include <linux/percpu.h>
 #include <linux/timer.h>
-
+#include <linux/timerqueue.h>
 
 struct hrtimer_clock_base;
 struct hrtimer_cpu_base;
@@ -79,8 +79,8 @@ enum hrtimer_restart {
 
 /**
  * struct hrtimer - the basic hrtimer structure
- * @node:	red black tree node for time ordered insertion
- * @_expires:	the absolute expiry time in the hrtimers internal
+ * @node:	timerqueue node, which also manages node.expires,
+ *		the absolute expiry time in the hrtimers internal
  *		representation. The time is related to the clock on
  *		which the timer is based. Is setup by adding
  *		slack to the _softexpires value. For non range timers
@@ -101,8 +101,7 @@ enum hrtimer_restart {
  * The hrtimer structure must be initialized by hrtimer_init()
  */
 struct hrtimer {
-	struct rb_node			node;
-	ktime_t				_expires;
+	struct timerqueue_node		node;
 	ktime_t				_softexpires;
 	enum hrtimer_restart		(*function)(struct hrtimer *);
 	struct hrtimer_clock_base	*base;
@@ -141,8 +140,7 @@ struct hrtimer_sleeper {
 struct hrtimer_clock_base {
 	struct hrtimer_cpu_base	*cpu_base;
 	clockid_t		index;
-	struct rb_root		active;
-	struct rb_node		*first;
+	struct timerqueue_head	active;
 	ktime_t			resolution;
 	ktime_t			(*get_time)(void);
 	ktime_t			softirq_time;
@@ -184,43 +182,43 @@ struct hrtimer_cpu_base {
 
 static inline void hrtimer_set_expires(struct hrtimer *timer, ktime_t time)
 {
-	timer->_expires = time;
+	timer->node.expires = time;
 	timer->_softexpires = time;
 }
 
 static inline void hrtimer_set_expires_range(struct hrtimer *timer, ktime_t time, ktime_t delta)
 {
 	timer->_softexpires = time;
-	timer->_expires = ktime_add_safe(time, delta);
+	timer->node.expires = ktime_add_safe(time, delta);
 }
 
 static inline void hrtimer_set_expires_range_ns(struct hrtimer *timer, ktime_t time, unsigned long delta)
 {
 	timer->_softexpires = time;
-	timer->_expires = ktime_add_safe(time, ns_to_ktime(delta));
+	timer->node.expires = ktime_add_safe(time, ns_to_ktime(delta));
 }
 
 static inline void hrtimer_set_expires_tv64(struct hrtimer *timer, s64 tv64)
 {
-	timer->_expires.tv64 = tv64;
+	timer->node.expires.tv64 = tv64;
 	timer->_softexpires.tv64 = tv64;
 }
 
 static inline void hrtimer_add_expires(struct hrtimer *timer, ktime_t time)
 {
-	timer->_expires = ktime_add_safe(timer->_expires, time);
+	timer->node.expires = ktime_add_safe(timer->node.expires, time);
 	timer->_softexpires = ktime_add_safe(timer->_softexpires, time);
 }
 
 static inline void hrtimer_add_expires_ns(struct hrtimer *timer, u64 ns)
 {
-	timer->_expires = ktime_add_ns(timer->_expires, ns);
+	timer->node.expires = ktime_add_ns(timer->node.expires, ns);
 	timer->_softexpires = ktime_add_ns(timer->_softexpires, ns);
 }
 
 static inline ktime_t hrtimer_get_expires(const struct hrtimer *timer)
 {
-	return timer->_expires;
+	return timer->node.expires;
 }
 
 static inline ktime_t hrtimer_get_softexpires(const struct hrtimer *timer)
@@ -230,7 +228,7 @@ static inline ktime_t hrtimer_get_softexpires(const struct hrtimer *timer)
 
 static inline s64 hrtimer_get_expires_tv64(const struct hrtimer *timer)
 {
-	return timer->_expires.tv64;
+	return timer->node.expires.tv64;
 }
 static inline s64 hrtimer_get_softexpires_tv64(const struct hrtimer *timer)
 {
@@ -239,12 +237,12 @@ static inline s64 hrtimer_get_softexpires_tv64(const struct hrtimer *timer)
 
 static inline s64 hrtimer_get_expires_ns(const struct hrtimer *timer)
 {
-	return ktime_to_ns(timer->_expires);
+	return ktime_to_ns(timer->node.expires);
 }
 
 static inline ktime_t hrtimer_expires_remaining(const struct hrtimer *timer)
 {
-    return ktime_sub(timer->_expires, timer->base->get_time());
+	return ktime_sub(timer->node.expires, timer->base->get_time());
 }
 
 #ifdef CONFIG_HIGH_RES_TIMERS
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 72206cf..f5aaea2 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -516,10 +516,13 @@ hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
 		struct hrtimer *timer;
+		struct timerqueue_node *next;
 
-		if (!base->first)
+		next = timerqueue_getnext(&base->active);
+		if (!next)
 			continue;
-		timer = rb_entry(base->first, struct hrtimer, node);
+		timer = container_of(next, struct hrtimer, node);
+
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 		/*
 		 * clock_was_set() has changed base->offset so the
@@ -840,48 +843,17 @@ EXPORT_SYMBOL_GPL(hrtimer_forward);
 static int enqueue_hrtimer(struct hrtimer *timer,
 			   struct hrtimer_clock_base *base)
 {
-	struct rb_node **link = &base->active.rb_node;
-	struct rb_node *parent = NULL;
-	struct hrtimer *entry;
-	int leftmost = 1;
-
 	debug_activate(timer);
 
-	/*
-	 * Find the right place in the rbtree:
-	 */
-	while (*link) {
-		parent = *link;
-		entry = rb_entry(parent, struct hrtimer, node);
-		/*
-		 * We dont care about collisions. Nodes with
-		 * the same expiry time stay together.
-		 */
-		if (hrtimer_get_expires_tv64(timer) <
-				hrtimer_get_expires_tv64(entry)) {
-			link = &(*link)->rb_left;
-		} else {
-			link = &(*link)->rb_right;
-			leftmost = 0;
-		}
-	}
+	timerqueue_add(&base->active, &timer->node);
 
 	/*
-	 * Insert the timer to the rbtree and check whether it
-	 * replaces the first pending timer
-	 */
-	if (leftmost)
-		base->first = &timer->node;
-
-	rb_link_node(&timer->node, parent, link);
-	rb_insert_color(&timer->node, &base->active);
-	/*
 	 * HRTIMER_STATE_ENQUEUED is or'ed to the current state to preserve the
 	 * state of a possibly running callback.
 	 */
 	timer->state |= HRTIMER_STATE_ENQUEUED;
 
-	return leftmost;
+	return (&timer->node == base->active.next);
 }
 
 /*
@@ -901,12 +873,7 @@ static void __remove_hrtimer(struct hrtimer *timer,
 	if (!(timer->state & HRTIMER_STATE_ENQUEUED))
 		goto out;
 
-	/*
-	 * Remove the timer from the rbtree and replace the first
-	 * entry pointer if necessary.
-	 */
-	if (base->first == &timer->node) {
-		base->first = rb_next(&timer->node);
+	if (&timer->node == timerqueue_getnext(&base->active)) {
 #ifdef CONFIG_HIGH_RES_TIMERS
 		/* Reprogram the clock event device. if enabled */
 		if (reprogram && hrtimer_hres_active()) {
@@ -919,7 +886,7 @@ static void __remove_hrtimer(struct hrtimer *timer,
 		}
 #endif
 	}
-	rb_erase(&timer->node, &base->active);
+	timerqueue_del(&base->active, &timer->node);
 out:
 	timer->state = newstate;
 }
@@ -1128,11 +1095,13 @@ ktime_t hrtimer_get_next_event(void)
 	if (!hrtimer_hres_active()) {
 		for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++, base++) {
 			struct hrtimer *timer;
+			struct timerqueue_node *next;
 
-			if (!base->first)
+			next = timerqueue_getnext(&base->active);
+			if (!next)
 				continue;
 
-			timer = rb_entry(base->first, struct hrtimer, node);
+			timer = container_of(next, struct hrtimer, node);
 			delta.tv64 = hrtimer_get_expires_tv64(timer);
 			delta = ktime_sub(delta, base->get_time());
 			if (delta.tv64 < mindelta.tv64)
@@ -1162,6 +1131,7 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 
 	timer->base = &cpu_base->clock_base[clock_id];
 	hrtimer_init_timer_hres(timer);
+	timerqueue_init(&timer->node);
 
 #ifdef CONFIG_TIMER_STATS
 	timer->start_site = NULL;
@@ -1278,14 +1248,14 @@ retry:
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		ktime_t basenow;
-		struct rb_node *node;
+		struct timerqueue_node *node;
 
 		basenow = ktime_add(now, base->offset);
 
-		while ((node = base->first)) {
+		while ((node = timerqueue_getnext(&base->active))) {
 			struct hrtimer *timer;
 
-			timer = rb_entry(node, struct hrtimer, node);
+			timer = container_of(node, struct hrtimer, node);
 
 			/*
 			 * The immediate goal for using the softexpires is
@@ -1441,7 +1411,7 @@ void hrtimer_run_pending(void)
  */
 void hrtimer_run_queues(void)
 {
-	struct rb_node *node;
+	struct timerqueue_node *node;
 	struct hrtimer_cpu_base *cpu_base = &__get_cpu_var(hrtimer_bases);
 	struct hrtimer_clock_base *base;
 	int index, gettime = 1;
@@ -1450,9 +1420,11 @@ void hrtimer_run_queues(void)
 		return;
 
 	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
-		base = &cpu_base->clock_base[index];
+		struct timerqueue_node *next;
 
-		if (!base->first)
+		base = &cpu_base->clock_base[index];
+		next = timerqueue_getnext(&base->active);
+		if (!next)
 			continue;
 
 		if (gettime) {
@@ -1462,10 +1434,10 @@ void hrtimer_run_queues(void)
 
 		raw_spin_lock(&cpu_base->lock);
 
-		while ((node = base->first)) {
+		while ((node = next)) {
 			struct hrtimer *timer;
 
-			timer = rb_entry(node, struct hrtimer, node);
+			timer = container_of(node, struct hrtimer, node);
 			if (base->softirq_time.tv64 <=
 					hrtimer_get_expires_tv64(timer))
 				break;
@@ -1630,8 +1602,10 @@ static void __cpuinit init_hrtimers_cpu(int cpu)
 
 	raw_spin_lock_init(&cpu_base->lock);
 
-	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++)
+	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		cpu_base->clock_base[i].cpu_base = cpu_base;
+		timerqueue_init_head(&cpu_base->clock_base[i].active);
+	}
 
 	hrtimer_init_hres(cpu_base);
 }
@@ -1642,10 +1616,10 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 				struct hrtimer_clock_base *new_base)
 {
 	struct hrtimer *timer;
-	struct rb_node *node;
+	struct timerqueue_node *node;
 
-	while ((node = rb_first(&old_base->active))) {
-		timer = rb_entry(node, struct hrtimer, node);
+	while ((node = timerqueue_getnext(&old_base->active))) {
+		timer = container_of(node, struct hrtimer, node);
 		BUG_ON(hrtimer_callback_running(timer));
 		debug_deactivate(timer);
 
diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index ab8f5e3..32a19f9 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -79,26 +79,26 @@ print_active_timers(struct seq_file *m, struct hrtimer_clock_base *base,
 {
 	struct hrtimer *timer, tmp;
 	unsigned long next = 0, i;
-	struct rb_node *curr;
+	struct timerqueue_node *curr;
 	unsigned long flags;
 
 next_one:
 	i = 0;
 	raw_spin_lock_irqsave(&base->cpu_base->lock, flags);
 
-	curr = base->first;
+	curr = timerqueue_getnext(&base->active);
 	/*
 	 * Crude but we have to do this O(N*N) thing, because
 	 * we have to unlock the base when printing:
 	 */
 	while (curr && i < next) {
-		curr = rb_next(curr);
+		curr = timerqueue_iterate_next(curr);
 		i++;
 	}
 
 	if (curr) {
 
-		timer = rb_entry(curr, struct hrtimer, node);
+		timer = container_of(curr, struct hrtimer, node);
 		tmp = *timer;
 		raw_spin_unlock_irqrestore(&base->cpu_base->lock, flags);
 
-- 
1.7.3.2.146.gca209



* [PATCH 05/12] hrtimer: fix timerqueue conversion flub
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (3 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 04/12] hrtimers: Convert hrtimers to use timerlist infrastructure John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 06/12] RTC: Rework RTC code to use timerqueue for events John Stultz
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: John Stultz

In converting the hrtimers to timerqueue, I missed
a spot in hrtimer_run_queues where we loop running
timers. We end up not pulling the new next value out
and instead just use the last next value, causing
boot time hangs in some cases.

The proper fix is to pull timerqueue_getnext each iteration
instead of using a local next value.
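
The pattern behind the fix can be sketched outside the kernel. This is
a minimal user-space model, not the hrtimer code: struct node,
struct queue, queue_getnext(), queue_del_head() and run_expired() are
hypothetical stand-ins for the timerqueue types and helpers. The point
it illustrates is that the head is re-read on every pass, so removing
the expired entry inside the loop body cannot leave us spinning on a
stale pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a timerqueue: a list sorted by expiry. */
struct node {
	long expires;
	struct node *next;
};

struct queue {
	struct node *head;	/* earliest-expiring entry, or NULL */
};

/* Analogue of timerqueue_getnext(): peek at the earliest entry. */
static struct node *queue_getnext(struct queue *q)
{
	return q->head;
}

/* Drop the earliest entry, as running a timer would dequeue it. */
static void queue_del_head(struct queue *q)
{
	if (q->head)
		q->head = q->head->next;
}

/* Expire every entry with expires <= now; returns how many fired. */
static int run_expired(struct queue *q, long now)
{
	struct node *node;
	int fired = 0;

	/*
	 * Fetch the head inside the loop condition. Caching it in a
	 * local before the loop would keep handing back the same,
	 * already-removed entry.
	 */
	while ((node = queue_getnext(q))) {
		if (node->expires > now)
			break;
		queue_del_head(q);
		fired++;
	}
	return fired;
}
```

With the head cached before the loop instead, the condition never
changes once the first entry has been handled, which matches the hang
described above.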

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/hrtimer.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index f5aaea2..f2429fc 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1420,11 +1420,8 @@ void hrtimer_run_queues(void)
 		return;
 
 	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
-		struct timerqueue_node *next;
-
 		base = &cpu_base->clock_base[index];
-		next = timerqueue_getnext(&base->active);
-		if (!next)
+		if (!timerqueue_getnext(&base->active))
 			continue;
 
 		if (gettime) {
@@ -1434,7 +1431,7 @@ void hrtimer_run_queues(void)
 
 		raw_spin_lock(&cpu_base->lock);
 
-		while ((node = next)) {
+		while ((node = timerqueue_getnext(&base->active))) {
 			struct hrtimer *timer;
 
 			timer = container_of(node, struct hrtimer, node);
-- 
1.7.3.2.146.gca209



* [PATCH 06/12] RTC: Rework RTC code to use timerqueue for events
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (4 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 05/12] hrtimer: fix timerqueue conversion flub John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 07/12] RTC: Remove UIE emulation John Stultz
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: John Stultz, Alessandro Zummo, Thomas Gleixner, Richard Cochran

This patch reworks a large portion of the generic RTC code
to in-effect virtualize the rtc interrupt code.

The current RTC interface is very much a raw hardware interface.
Via the proc, /dev/, or sysfs interfaces, applications can set
the hardware to trigger interrupts in one of three modes:

AIE: Alarm interrupt
UIE: Update interrupt (ie: once per second)
PIE: Periodic interrupt (sub-second irqs)

The problem with this interface is that it limits the RTC hardware
so it can only be used by one application at a time.

The purpose of this patch is to extend the RTC code so that we can
multiplex multiple applications' event needs onto a single RTC device.
This is done by utilizing the timerqueue infrastructure to manage
a list of events, which cause the RTC hardware to be programmed
to fire an interrupt for the next event in the list.
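
The multiplexing idea can be modelled in a few lines of user-space C.
This is only a sketch under stated assumptions: struct ev, struct mux,
mux_program(), mux_add() and mux_del() are hypothetical names, a sorted
linked list stands in for the timerqueue, and the hw_alarm field stands
in for the single hardware alarm register. Unlike rtctimer_remove() in
this patch, the model also clears the alarm when the queue becomes
empty.

```c
#include <stddef.h>

/* One pending event; hypothetical stand-in for an rtc_timer. */
struct ev {
	long expires;
	struct ev *next;
};

/* Many events share one alarm; hypothetical stand-in for the device. */
struct mux {
	struct ev *head;	/* sorted, earliest first */
	long hw_alarm;		/* programmed alarm time, -1 if none */
};

/* Program the single alarm for the earliest pending event. */
static void mux_program(struct mux *m)
{
	m->hw_alarm = m->head ? m->head->expires : -1;
}

/* Insert in expiry order; reprogram only if the new event is first. */
static void mux_add(struct mux *m, struct ev *e)
{
	struct ev **link = &m->head;

	while (*link && (*link)->expires <= e->expires)
		link = &(*link)->next;
	e->next = *link;
	*link = e;
	if (m->head == e)
		mux_program(m);
}

/* Remove an event; reprogram only if it was the earliest one. */
static void mux_del(struct mux *m, struct ev *e)
{
	struct ev **link = &m->head;
	int was_head = (m->head == e);

	while (*link && *link != e)
		link = &(*link)->next;
	if (*link)
		*link = e->next;
	if (was_head)
		mux_program(m);
}
```

Adding or removing an event that is not the earliest leaves the
hardware untouched, which is why only the head of the queue matters
when deciding whether to reprogram.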

In order to preserve the functionality of the existing proc, /dev/ and
sysfs interfaces, we emulate the different interrupt modes as follows:

AIE: We create a rtc_timer dedicated to AIE mode interrupts. There is
only one per device, so we don't change existing interface semantics.

UIE: Again, a dedicated rtc_timer, set for periodic mode, is used
to emulate UIE interrupts. Again, only one per device.

PIE: Since PIE mode interrupts fire faster than the RTC's clock read
granularity, we emulate PIE mode interrupts using a hrtimer. Again,
one per device.
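
For the PIE case, the arithmetic is simple enough to show standalone.
The sketch below is a user-space model and makes assumptions:
pie_period_ns() and forward() are hypothetical helpers, plain 64-bit
nanosecond counts stand in for ktime_t, and forward() only approximates
what a periodic hrtimer callback does when it pushes its expiry past
the current time and reports how many periods went by.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Period in nanoseconds for a given interrupt frequency in Hz. */
static int64_t pie_period_ns(int freq_hz)
{
	return freq_hz > 0 ? NSEC_PER_SEC / freq_hz : 0;
}

/*
 * Advance *expires in whole periods until it lies after now.
 * Returns the number of periods that elapsed, which is the count
 * of emulated interrupts to report in one go.
 */
static uint64_t forward(int64_t *expires, int64_t now, int64_t period)
{
	uint64_t overruns;

	if (period <= 0 || now < *expires)
		return 0;
	overruns = (uint64_t)((now - *expires) / period) + 1;
	*expires += (int64_t)overruns * period;
	return overruns;
}
```

Reporting the elapsed count rather than always 1 means a callback that
runs late still accounts for every period it covered.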

With this patch, the rtctest.c application in Documentation/rtc.txt
passes fine on x86 hardware. However, there may very well still be
bugs, so I'd greatly appreciate any feedback or testing!

Signed-off-by: John Stultz <john.stultz@linaro.org>
LKML Reference: <1290136329-18291-4-git-send-email-john.stultz@linaro.org>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Richard Cochran <richardcochran@gmail.com>
---
 drivers/rtc/class.c     |   13 +
 drivers/rtc/interface.c |  574 ++++++++++++++++++++++++++++------------------
 drivers/rtc/rtc-lib.c   |   28 +++
 include/linux/rtc.h     |   43 +++-
 4 files changed, 428 insertions(+), 230 deletions(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index e6539cb..8347c4b 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -16,6 +16,7 @@
 #include <linux/kdev_t.h>
 #include <linux/idr.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 
 #include "rtc-core.h"
 
@@ -152,6 +153,18 @@ struct rtc_device *rtc_device_register(const char *name, struct device *dev,
 	spin_lock_init(&rtc->irq_task_lock);
 	init_waitqueue_head(&rtc->irq_queue);
 
+	/* Init timerqueue */
+	timerqueue_init_head(&rtc->timerqueue);
+	INIT_WORK(&rtc->irqwork, rtctimer_do_work);
+	/* Init aie timer */
+	rtctimer_init(&rtc->aie_timer, rtc_aie_update_irq, (void *)rtc);
+	/* Init uie timer */
+	rtctimer_init(&rtc->uie_rtctimer, rtc_uie_update_irq, (void *)rtc);
+	/* Init pie timer */
+	hrtimer_init(&rtc->pie_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	rtc->pie_timer.function = rtc_pie_update_irq;
+	rtc->pie_enabled = 0;
+
 	strlcpy(rtc->name, name, RTC_DEVICE_NAME_SIZE);
 	dev_set_name(&rtc->dev, "rtc%d", id);
 
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index a0c8162..c81c50b 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -14,15 +14,11 @@
 #include <linux/rtc.h>
 #include <linux/sched.h>
 #include <linux/log2.h>
+#include <linux/workqueue.h>
 
-int rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm)
+static int __rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm)
 {
 	int err;
-
-	err = mutex_lock_interruptible(&rtc->ops_lock);
-	if (err)
-		return err;
-
 	if (!rtc->ops)
 		err = -ENODEV;
 	else if (!rtc->ops->read_time)
@@ -31,7 +27,18 @@ int rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm)
 		memset(tm, 0, sizeof(struct rtc_time));
 		err = rtc->ops->read_time(rtc->dev.parent, tm);
 	}
+	return err;
+}
+
+int rtc_read_time(struct rtc_device *rtc, struct rtc_time *tm)
+{
+	int err;
 
+	err = mutex_lock_interruptible(&rtc->ops_lock);
+	if (err)
+		return err;
+
+	err = __rtc_read_time(rtc, tm);
 	mutex_unlock(&rtc->ops_lock);
 	return err;
 }
@@ -106,188 +113,54 @@ int rtc_set_mmss(struct rtc_device *rtc, unsigned long secs)
 }
 EXPORT_SYMBOL_GPL(rtc_set_mmss);
 
-static int rtc_read_alarm_internal(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
+int rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 {
 	int err;
 
 	err = mutex_lock_interruptible(&rtc->ops_lock);
 	if (err)
 		return err;
-
-	if (rtc->ops == NULL)
-		err = -ENODEV;
-	else if (!rtc->ops->read_alarm)
-		err = -EINVAL;
-	else {
-		memset(alarm, 0, sizeof(struct rtc_wkalrm));
-		err = rtc->ops->read_alarm(rtc->dev.parent, alarm);
-	}
-
+	alarm->enabled = rtc->aie_timer.enabled;
+	if (alarm->enabled)
+		alarm->time = rtc_ktime_to_tm(rtc->aie_timer.node.expires);
 	mutex_unlock(&rtc->ops_lock);
-	return err;
+
+	return 0;
 }
+EXPORT_SYMBOL_GPL(rtc_read_alarm);
 
-int rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
+int __rtc_set_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 {
+	struct rtc_time tm;
+	long now, scheduled;
 	int err;
-	struct rtc_time before, now;
-	int first_time = 1;
-	unsigned long t_now, t_alm;
-	enum { none, day, month, year } missing = none;
-	unsigned days;
-
-	/* The lower level RTC driver may return -1 in some fields,
-	 * creating invalid alarm->time values, for reasons like:
-	 *
-	 *   - The hardware may not be capable of filling them in;
-	 *     many alarms match only on time-of-day fields, not
-	 *     day/month/year calendar data.
-	 *
-	 *   - Some hardware uses illegal values as "wildcard" match
-	 *     values, which non-Linux firmware (like a BIOS) may try
-	 *     to set up as e.g. "alarm 15 minutes after each hour".
-	 *     Linux uses only oneshot alarms.
-	 *
-	 * When we see that here, we deal with it by using values from
-	 * a current RTC timestamp for any missing (-1) values.  The
-	 * RTC driver prevents "periodic alarm" modes.
-	 *
-	 * But this can be racey, because some fields of the RTC timestamp
-	 * may have wrapped in the interval since we read the RTC alarm,
-	 * which would lead to us inserting inconsistent values in place
-	 * of the -1 fields.
-	 *
-	 * Reading the alarm and timestamp in the reverse sequence
-	 * would have the same race condition, and not solve the issue.
-	 *
-	 * So, we must first read the RTC timestamp,
-	 * then read the RTC alarm value,
-	 * and then read a second RTC timestamp.
-	 *
-	 * If any fields of the second timestamp have changed
-	 * when compared with the first timestamp, then we know
-	 * our timestamp may be inconsistent with that used by
-	 * the low-level rtc_read_alarm_internal() function.
-	 *
-	 * So, when the two timestamps disagree, we just loop and do
-	 * the process again to get a fully consistent set of values.
-	 *
-	 * This could all instead be done in the lower level driver,
-	 * but since more than one lower level RTC implementation needs it,
-	 * then it's probably best best to do it here instead of there..
-	 */
 
-	/* Get the "before" timestamp */
-	err = rtc_read_time(rtc, &before);
-	if (err < 0)
+	err = rtc_valid_tm(&alarm->time);
+	if (err)
 		return err;
-	do {
-		if (!first_time)
-			memcpy(&before, &now, sizeof(struct rtc_time));
-		first_time = 0;
-
-		/* get the RTC alarm values, which may be incomplete */
-		err = rtc_read_alarm_internal(rtc, alarm);
-		if (err)
-			return err;
-		if (!alarm->enabled)
-			return 0;
-
-		/* full-function RTCs won't have such missing fields */
-		if (rtc_valid_tm(&alarm->time) == 0)
-			return 0;
-
-		/* get the "after" timestamp, to detect wrapped fields */
-		err = rtc_read_time(rtc, &now);
-		if (err < 0)
-			return err;
-
-		/* note that tm_sec is a "don't care" value here: */
-	} while (   before.tm_min   != now.tm_min
-		 || before.tm_hour  != now.tm_hour
-		 || before.tm_mon   != now.tm_mon
-		 || before.tm_year  != now.tm_year);
-
-	/* Fill in the missing alarm fields using the timestamp; we
-	 * know there's at least one since alarm->time is invalid.
-	 */
-	if (alarm->time.tm_sec == -1)
-		alarm->time.tm_sec = now.tm_sec;
-	if (alarm->time.tm_min == -1)
-		alarm->time.tm_min = now.tm_min;
-	if (alarm->time.tm_hour == -1)
-		alarm->time.tm_hour = now.tm_hour;
-
-	/* For simplicity, only support date rollover for now */
-	if (alarm->time.tm_mday == -1) {
-		alarm->time.tm_mday = now.tm_mday;
-		missing = day;
-	}
-	if (alarm->time.tm_mon == -1) {
-		alarm->time.tm_mon = now.tm_mon;
-		if (missing == none)
-			missing = month;
-	}
-	if (alarm->time.tm_year == -1) {
-		alarm->time.tm_year = now.tm_year;
-		if (missing == none)
-			missing = year;
-	}
-
-	/* with luck, no rollover is needed */
-	rtc_tm_to_time(&now, &t_now);
-	rtc_tm_to_time(&alarm->time, &t_alm);
-	if (t_now < t_alm)
-		goto done;
-
-	switch (missing) {
+	rtc_tm_to_time(&alarm->time, &scheduled);
 
-	/* 24 hour rollover ... if it's now 10am Monday, an alarm that
-	 * that will trigger at 5am will do so at 5am Tuesday, which
-	 * could also be in the next month or year.  This is a common
-	 * case, especially for PCs.
-	 */
-	case day:
-		dev_dbg(&rtc->dev, "alarm rollover: %s\n", "day");
-		t_alm += 24 * 60 * 60;
-		rtc_time_to_tm(t_alm, &alarm->time);
-		break;
-
-	/* Month rollover ... if it's the 31th, an alarm on the 3rd will
-	 * be next month.  An alarm matching on the 30th, 29th, or 28th
-	 * may end up in the month after that!  Many newer PCs support
-	 * this type of alarm.
+	/* Make sure we're not setting alarms in the past */
+	err = __rtc_read_time(rtc, &tm);
+	rtc_tm_to_time(&tm, &now);
+	if (scheduled <= now)
+		return -ETIME;
+	/*
+	 * XXX - We just checked to make sure the alarm time is not
+	 * in the past, but there is still a race window where if
+	 * the alarm is set for the next second and the second ticks
+	 * over right here, before we set the alarm.
 	 */
-	case month:
-		dev_dbg(&rtc->dev, "alarm rollover: %s\n", "month");
-		do {
-			if (alarm->time.tm_mon < 11)
-				alarm->time.tm_mon++;
-			else {
-				alarm->time.tm_mon = 0;
-				alarm->time.tm_year++;
-			}
-			days = rtc_month_days(alarm->time.tm_mon,
-					alarm->time.tm_year);
-		} while (days < alarm->time.tm_mday);
-		break;
-
-	/* Year rollover ... easy except for leap years! */
-	case year:
-		dev_dbg(&rtc->dev, "alarm rollover: %s\n", "year");
-		do {
-			alarm->time.tm_year++;
-		} while (rtc_valid_tm(&alarm->time) != 0);
-		break;
-
-	default:
-		dev_warn(&rtc->dev, "alarm rollover not handled\n");
-	}
 
-done:
-	return 0;
+	if (!rtc->ops)
+		err = -ENODEV;
+	else if (!rtc->ops->set_alarm)
+		err = -EINVAL;
+	else
+		err = rtc->ops->set_alarm(rtc->dev.parent, alarm);
+
+	return err;
 }
-EXPORT_SYMBOL_GPL(rtc_read_alarm);
 
 int rtc_set_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 {
@@ -300,16 +173,18 @@ int rtc_set_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 	err = mutex_lock_interruptible(&rtc->ops_lock);
 	if (err)
 		return err;
-
-	if (!rtc->ops)
-		err = -ENODEV;
-	else if (!rtc->ops->set_alarm)
-		err = -EINVAL;
-	else
-		err = rtc->ops->set_alarm(rtc->dev.parent, alarm);
-
+	if (rtc->aie_timer.enabled) {
+		rtctimer_remove(rtc, &rtc->aie_timer);
+		rtc->aie_timer.enabled = 0;
+	}
+	rtc->aie_timer.node.expires = rtc_tm_to_ktime(alarm->time);
+	rtc->aie_timer.period = ktime_set(0, 0);
+	if (alarm->enabled) {
+		rtc->aie_timer.enabled = 1;
+		rtctimer_enqueue(rtc, &rtc->aie_timer);
+	}
 	mutex_unlock(&rtc->ops_lock);
-	return err;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(rtc_set_alarm);
 
@@ -319,6 +194,16 @@ int rtc_alarm_irq_enable(struct rtc_device *rtc, unsigned int enabled)
 	if (err)
 		return err;
 
+	if (rtc->aie_timer.enabled != enabled) {
+		if (enabled) {
+			rtc->aie_timer.enabled = 1;
+			rtctimer_enqueue(rtc, &rtc->aie_timer);
+		} else {
+			rtctimer_remove(rtc, &rtc->aie_timer);
+			rtc->aie_timer.enabled = 0;
+		}
+	}
+
 	if (!rtc->ops)
 		err = -ENODEV;
 	else if (!rtc->ops->alarm_irq_enable)
@@ -337,52 +222,53 @@ int rtc_update_irq_enable(struct rtc_device *rtc, unsigned int enabled)
 	if (err)
 		return err;
 
-#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
-	if (enabled == 0 && rtc->uie_irq_active) {
-		mutex_unlock(&rtc->ops_lock);
-		return rtc_dev_update_irq_enable_emul(rtc, enabled);
+	/* make sure we're changing state */
+	if (rtc->uie_rtctimer.enabled == enabled)
+		goto out;
+
+	if (enabled) {
+		struct rtc_time tm;
+		ktime_t now, onesec;
+
+		__rtc_read_time(rtc, &tm);
+		onesec = ktime_set(1, 0);
+		now = rtc_tm_to_ktime(tm);
+		rtc->uie_rtctimer.node.expires = ktime_add(now, onesec);
+		rtc->uie_rtctimer.period = ktime_set(1, 0);
+		rtc->uie_rtctimer.enabled = 1;
+		rtctimer_enqueue(rtc, &rtc->uie_rtctimer);
+	} else {
+		rtctimer_remove(rtc, &rtc->uie_rtctimer);
+		rtc->uie_rtctimer.enabled = 0;
 	}
-#endif
-
-	if (!rtc->ops)
-		err = -ENODEV;
-	else if (!rtc->ops->update_irq_enable)
-		err = -EINVAL;
-	else
-		err = rtc->ops->update_irq_enable(rtc->dev.parent, enabled);
 
+out:
 	mutex_unlock(&rtc->ops_lock);
-
-#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
-	/*
-	 * Enable emulation if the driver did not provide
-	 * the update_irq_enable function pointer or if returned
-	 * -EINVAL to signal that it has been configured without
-	 * interrupts or that are not available at the moment.
-	 */
-	if (err == -EINVAL)
-		err = rtc_dev_update_irq_enable_emul(rtc, enabled);
-#endif
 	return err;
+
 }
 EXPORT_SYMBOL_GPL(rtc_update_irq_enable);
 
+
 /**
- * rtc_update_irq - report RTC periodic, alarm, and/or update irqs
- * @rtc: the rtc device
- * @num: how many irqs are being reported (usually one)
- * @events: mask of RTC_IRQF with one or more of RTC_PF, RTC_AF, RTC_UF
- * Context: any
+ * rtc_handle_legacy_irq - AIE, UIE and PIE event hook
+ * @rtc: pointer to the rtc device
+ *
+ * This function is called when an AIE, UIE or PIE mode interrupt
+ * has occurred (or been emulated).
+ *
+ * Triggers the registered irq_task function callback.
  */
-void rtc_update_irq(struct rtc_device *rtc,
-		unsigned long num, unsigned long events)
+static void rtc_handle_legacy_irq(struct rtc_device *rtc, int num, int mode)
 {
 	unsigned long flags;
 
+	/* mark one irq of the appropriate mode */
 	spin_lock_irqsave(&rtc->irq_lock, flags);
-	rtc->irq_data = (rtc->irq_data + (num << 8)) | events;
+	rtc->irq_data = (rtc->irq_data + (num << 8)) | (RTC_IRQF|mode);
 	spin_unlock_irqrestore(&rtc->irq_lock, flags);
 
+	/* call the task func */
 	spin_lock_irqsave(&rtc->irq_task_lock, flags);
 	if (rtc->irq_task)
 		rtc->irq_task->func(rtc->irq_task->private_data);
@@ -391,6 +277,69 @@ void rtc_update_irq(struct rtc_device *rtc,
 	wake_up_interruptible(&rtc->irq_queue);
 	kill_fasync(&rtc->async_queue, SIGIO, POLL_IN);
 }
+
+
+/**
+ * rtc_aie_update_irq - AIE mode rtctimer hook
+ * @private: pointer to the rtc_device
+ *
+ * This function is called when the aie_timer expires.
+ */
+void rtc_aie_update_irq(void *private)
+{
+	struct rtc_device *rtc = (struct rtc_device *)private;
+	rtc_handle_legacy_irq(rtc, 1, RTC_AF);
+}
+
+
+/**
+ * rtc_uie_update_irq - UIE mode rtctimer hook
+ * @private: pointer to the rtc_device
+ *
+ * This function is called when the uie_timer expires.
+ */
+void rtc_uie_update_irq(void *private)
+{
+	struct rtc_device *rtc = (struct rtc_device *)private;
+	rtc_handle_legacy_irq(rtc, 1,  RTC_UF);
+}
+
+
+/**
+ * rtc_pie_update_irq - PIE mode hrtimer hook
+ * @timer: pointer to the pie mode hrtimer
+ *
+ * This function is used to emulate PIE mode interrupts
+ * using an hrtimer. This function is called when the periodic
+ * hrtimer expires.
+ */
+enum hrtimer_restart rtc_pie_update_irq(struct hrtimer *timer)
+{
+	struct rtc_device *rtc;
+	ktime_t period;
+	int count;
+	rtc = container_of(timer, struct rtc_device, pie_timer);
+
+	period = ktime_set(0, NSEC_PER_SEC/rtc->irq_freq);
+	count = hrtimer_forward_now(timer, period);
+
+	rtc_handle_legacy_irq(rtc, count, RTC_PF);
+
+	return HRTIMER_RESTART;
+}
+
+/**
+ * rtc_update_irq - Triggered when a RTC interrupt occurs.
+ * @rtc: the rtc device
+ * @num: how many irqs are being reported (usually one)
+ * @events: mask of RTC_IRQF with one or more of RTC_PF, RTC_AF, RTC_UF
+ * Context: any
+ */
+void rtc_update_irq(struct rtc_device *rtc,
+		unsigned long num, unsigned long events)
+{
+	schedule_work(&rtc->irqwork);
+}
 EXPORT_SYMBOL_GPL(rtc_update_irq);
 
 static int __rtc_match(struct device *dev, void *data)
@@ -477,18 +426,20 @@ int rtc_irq_set_state(struct rtc_device *rtc, struct rtc_task *task, int enabled
 	int err = 0;
 	unsigned long flags;
 
-	if (rtc->ops->irq_set_state == NULL)
-		return -ENXIO;
-
 	spin_lock_irqsave(&rtc->irq_task_lock, flags);
 	if (rtc->irq_task != NULL && task == NULL)
 		err = -EBUSY;
 	if (rtc->irq_task != task)
 		err = -EACCES;
-	spin_unlock_irqrestore(&rtc->irq_task_lock, flags);
 
-	if (err == 0)
-		err = rtc->ops->irq_set_state(rtc->dev.parent, enabled);
+	if (enabled) {
+		ktime_t period = ktime_set(0, NSEC_PER_SEC/rtc->irq_freq);
+		hrtimer_start(&rtc->pie_timer, period, HRTIMER_MODE_REL);
+	} else {
+		hrtimer_cancel(&rtc->pie_timer);
+	}
+	rtc->pie_enabled = enabled;
+	spin_unlock_irqrestore(&rtc->irq_task_lock, flags);
 
 	return err;
 }
@@ -509,21 +460,194 @@ int rtc_irq_set_freq(struct rtc_device *rtc, struct rtc_task *task, int freq)
 	int err = 0;
 	unsigned long flags;
 
-	if (rtc->ops->irq_set_freq == NULL)
-		return -ENXIO;
-
 	spin_lock_irqsave(&rtc->irq_task_lock, flags);
 	if (rtc->irq_task != NULL && task == NULL)
 		err = -EBUSY;
 	if (rtc->irq_task != task)
 		err = -EACCES;
-	spin_unlock_irqrestore(&rtc->irq_task_lock, flags);
-
 	if (err == 0) {
-		err = rtc->ops->irq_set_freq(rtc->dev.parent, freq);
-		if (err == 0)
-			rtc->irq_freq = freq;
+		rtc->irq_freq = freq;
+		if (rtc->pie_enabled) {
+			ktime_t period;
+			hrtimer_cancel(&rtc->pie_timer);
+			period = ktime_set(0, NSEC_PER_SEC/rtc->irq_freq);
+			hrtimer_start(&rtc->pie_timer, period,
+					HRTIMER_MODE_REL);
+		}
 	}
+	spin_unlock_irqrestore(&rtc->irq_task_lock, flags);
 	return err;
 }
 EXPORT_SYMBOL_GPL(rtc_irq_set_freq);
+
+/**
+ * rtctimer_enqueue - Adds a rtc_timer to the rtc_device timerqueue
+ * @rtc rtc device
+ * @timer timer being added.
+ *
+ * Enqueues a timer onto the rtc devices timerqueue and sets
+ * the next alarm event appropriately.
+ *
+ * Must hold ops_lock for proper serialization of timerqueue
+ */
+void rtctimer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer)
+{
+	timerqueue_add(&rtc->timerqueue, &timer->node);
+	if (&timer->node == timerqueue_getnext(&rtc->timerqueue)) {
+		struct rtc_wkalrm alarm;
+		int err;
+		alarm.time = rtc_ktime_to_tm(timer->node.expires);
+		alarm.enabled = 1;
+		err = __rtc_set_alarm(rtc, &alarm);
+		if (err == -ETIME)
+			schedule_work(&rtc->irqwork);
+	}
+}
+
+/**
+ * rtctimer_remove - Removes a rtc_timer from the rtc_device timerqueue
+ * @rtc rtc device
+ * @timer timer being removed.
+ *
+ * Removes a timer from the rtc devices timerqueue and sets
+ * the next alarm event appropriately.
+ *
+ * Must hold ops_lock for proper serialization of timerqueue
+ */
+void rtctimer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
+{
+	struct timerqueue_node *next = timerqueue_getnext(&rtc->timerqueue);
+	timerqueue_del(&rtc->timerqueue, &timer->node);
+
+	if (next == &timer->node) {
+		struct rtc_wkalrm alarm;
+		int err;
+		next = timerqueue_getnext(&rtc->timerqueue);
+		if (!next)
+			return;
+		alarm.time = rtc_ktime_to_tm(next->expires);
+		alarm.enabled = 1;
+		err = __rtc_set_alarm(rtc, &alarm);
+		if (err == -ETIME)
+			schedule_work(&rtc->irqwork);
+	}
+}
+
+/**
+ * rtctimer_do_work - Expires rtc timers
+ * @rtc rtc device
+ * @timer timer being removed.
+ *
+ * Expires rtc timers. Reprograms next alarm event if needed.
+ * Called via worktask.
+ *
+ * Serializes access to timerqueue via ops_lock mutex
+ */
+void rtctimer_do_work(struct work_struct *work)
+{
+	struct rtc_timer *timer;
+	struct timerqueue_node *next;
+	ktime_t now;
+	struct rtc_time tm;
+
+	struct rtc_device *rtc =
+		container_of(work, struct rtc_device, irqwork);
+
+	mutex_lock(&rtc->ops_lock);
+again:
+	__rtc_read_time(rtc, &tm);
+	now = rtc_tm_to_ktime(tm);
+	while ((next = timerqueue_getnext(&rtc->timerqueue))) {
+		if (next->expires.tv64 > now.tv64)
+			break;
+
+		/* expire timer */
+		timer = container_of(next, struct rtc_timer, node);
+		timerqueue_del(&rtc->timerqueue, &timer->node);
+		timer->enabled = 0;
+		if (timer->task.func)
+			timer->task.func(timer->task.private_data);
+
+		/* Re-add/fwd periodic timers */
+		if (ktime_to_ns(timer->period)) {
+			timer->node.expires = ktime_add(timer->node.expires,
+							timer->period);
+			timer->enabled = 1;
+			timerqueue_add(&rtc->timerqueue, &timer->node);
+		}
+	}
+
+	/* Set next alarm */
+	if (next) {
+		struct rtc_wkalrm alarm;
+		int err;
+		alarm.time = rtc_ktime_to_tm(next->expires);
+		alarm.enabled = 1;
+		err = __rtc_set_alarm(rtc, &alarm);
+		if (err == -ETIME)
+			goto again;
+	}
+
+	mutex_unlock(&rtc->ops_lock);
+}
+
+
+/* rtctimer_init - Initializes an rtc_timer
+ * @timer: timer to be initialized
+ * @f: function pointer to be called when timer fires
+ * @data: private data passed to function pointer
+ *
+ * Kernel interface to initializing an rtc_timer.
+ */
+void rtctimer_init(struct rtc_timer *timer, void (*f)(void* p), void* data)
+{
+	timerqueue_init(&timer->node);
+	timer->enabled = 0;
+	timer->task.func = f;
+	timer->task.private_data = data;
+}
+
+/* rtctimer_start - Sets an rtc_timer to fire in the future
+ * @ rtc: rtc device to be used
+ * @ timer: timer being set
+ * @ expires: time at which to expire the timer
+ * @ period: period that the timer will recur
+ *
+ * Kernel interface to set an rtc_timer
+ */
+int rtctimer_start(struct rtc_device *rtc, struct rtc_timer* timer,
+			ktime_t expires, ktime_t period)
+{
+	int ret = 0;
+	mutex_lock(&rtc->ops_lock);
+	if (timer->enabled)
+		rtctimer_remove(rtc, timer);
+
+	timer->node.expires = expires;
+	timer->period = period;
+
+	timer->enabled = 1;
+	rtctimer_enqueue(rtc, timer);
+
+	mutex_unlock(&rtc->ops_lock);
+	return ret;
+}
+
+/* rtctimer_cancel - Stops an rtc_timer
+ * @ rtc: rtc device to be used
+ * @ timer: timer being set
+ *
+ * Kernel interface to cancel an rtc_timer
+ */
+int rtctimer_cancel(struct rtc_device *rtc, struct rtc_timer* timer)
+{
+	int ret = 0;
+	mutex_lock(&rtc->ops_lock);
+	if (timer->enabled)
+		rtctimer_remove(rtc, timer);
+	timer->enabled = 0;
+	mutex_unlock(&rtc->ops_lock);
+	return ret;
+}
+
+
diff --git a/drivers/rtc/rtc-lib.c b/drivers/rtc/rtc-lib.c
index 773851f..075f170 100644
--- a/drivers/rtc/rtc-lib.c
+++ b/drivers/rtc/rtc-lib.c
@@ -117,4 +117,32 @@ int rtc_tm_to_time(struct rtc_time *tm, unsigned long *time)
 }
 EXPORT_SYMBOL(rtc_tm_to_time);
 
+/*
+ * Convert rtc_time to ktime
+ */
+ktime_t rtc_tm_to_ktime(struct rtc_time tm)
+{
+	time_t time;
+	rtc_tm_to_time(&tm, &time);
+	return ktime_set(time, 0);
+}
+EXPORT_SYMBOL_GPL(rtc_tm_to_ktime);
+
+/*
+ * Convert ktime to rtc_time
+ */
+struct rtc_time rtc_ktime_to_tm(ktime_t kt)
+{
+	struct timespec ts;
+	struct rtc_time ret;
+
+	ts = ktime_to_timespec(kt);
+	/* Round up any ns */
+	if (ts.tv_nsec)
+		ts.tv_sec++;
+	rtc_time_to_tm(ts.tv_sec, &ret);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(rtc_ktime_to_tm);
+
 MODULE_LICENSE("GPL");
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 14dbc83..a3421ab 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -107,12 +107,17 @@ extern int rtc_year_days(unsigned int day, unsigned int month, unsigned int year
 extern int rtc_valid_tm(struct rtc_time *tm);
 extern int rtc_tm_to_time(struct rtc_time *tm, unsigned long *time);
 extern void rtc_time_to_tm(unsigned long time, struct rtc_time *tm);
+ktime_t rtc_tm_to_ktime(struct rtc_time tm);
+struct rtc_time rtc_ktime_to_tm(ktime_t kt);
+
 
 #include <linux/device.h>
 #include <linux/seq_file.h>
 #include <linux/cdev.h>
 #include <linux/poll.h>
 #include <linux/mutex.h>
+#include <linux/timerqueue.h>
+#include <linux/workqueue.h>
 
 extern struct class *rtc_class;
 
@@ -151,7 +156,19 @@ struct rtc_class_ops {
 };
 
 #define RTC_DEVICE_NAME_SIZE 20
-struct rtc_task;
+typedef struct rtc_task {
+	void (*func)(void *private_data);
+	void *private_data;
+} rtc_task_t;
+
+
+struct rtc_timer {
+	struct rtc_task	task;
+	struct timerqueue_node node;
+	ktime_t period;
+	int enabled;
+};
+
 
 /* flags */
 #define RTC_DEV_BUSY 0
@@ -179,6 +196,15 @@ struct rtc_device
 	spinlock_t irq_task_lock;
 	int irq_freq;
 	int max_user_freq;
+
+	struct timerqueue_head timerqueue;
+	struct rtc_timer aie_timer;
+	struct rtc_timer uie_rtctimer;
+	struct hrtimer pie_timer; /* sub second exp, so needs hrtimer */
+	int pie_enabled;
+	struct work_struct irqwork;
+
+
 #ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
 	struct work_struct uie_task;
 	struct timer_list uie_timer;
@@ -224,15 +250,22 @@ extern int rtc_alarm_irq_enable(struct rtc_device *rtc, unsigned int enabled);
 extern int rtc_dev_update_irq_enable_emul(struct rtc_device *rtc,
 						unsigned int enabled);
 
-typedef struct rtc_task {
-	void (*func)(void *private_data);
-	void *private_data;
-} rtc_task_t;
+void rtc_aie_update_irq(void *private);
+void rtc_uie_update_irq(void *private);
+enum hrtimer_restart rtc_pie_update_irq(struct hrtimer *timer);
 
 int rtc_register(rtc_task_t *task);
 int rtc_unregister(rtc_task_t *task);
 int rtc_control(rtc_task_t *t, unsigned int cmd, unsigned long arg);
 
+void rtctimer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer);
+void rtctimer_remove(struct rtc_device *rtc, struct rtc_timer *timer);
+void rtctimer_init(struct rtc_timer *timer, void (*f)(void* p), void* data);
+int rtctimer_start(struct rtc_device *rtc, struct rtc_timer* timer,
+			ktime_t expires, ktime_t period);
+int rtctimer_cancel(struct rtc_device *rtc, struct rtc_timer* timer);
+void rtctimer_do_work(struct work_struct *work);
+
 static inline bool is_leap_year(unsigned int year)
 {
 	return (!(year % 4) && (year % 100)) || !(year % 400);
-- 
1.7.3.2.146.gca209



* [PATCH 07/12] RTC: Remove UIE emulation
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (5 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 06/12] RTC: Rework RTC code to use timerqueue for events John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 08/12] rtc: Namespace fixup John Stultz
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: John Stultz, Alessandro Zummo, Thomas Gleixner, Richard Cochran

Since UIE interrupts are now provided via an rtc_timer, the old
emulation code can be removed.

Signed-off-by: John Stultz <john.stultz@linaro.org>
LKML Reference: <1290136329-18291-5-git-send-email-john.stultz@linaro.org>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Richard Cochran <richardcochran@gmail.com>
---
 drivers/rtc/rtc-dev.c |  104 -------------------------------------------------
 include/linux/rtc.h   |   12 ------
 2 files changed, 0 insertions(+), 116 deletions(-)

diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c
index 62227cd..212b16e 100644
--- a/drivers/rtc/rtc-dev.c
+++ b/drivers/rtc/rtc-dev.c
@@ -46,105 +46,6 @@ static int rtc_dev_open(struct inode *inode, struct file *file)
 	return err;
 }
 
-#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
-/*
- * Routine to poll RTC seconds field for change as often as possible,
- * after first RTC_UIE use timer to reduce polling
- */
-static void rtc_uie_task(struct work_struct *work)
-{
-	struct rtc_device *rtc =
-		container_of(work, struct rtc_device, uie_task);
-	struct rtc_time tm;
-	int num = 0;
-	int err;
-
-	err = rtc_read_time(rtc, &tm);
-
-	spin_lock_irq(&rtc->irq_lock);
-	if (rtc->stop_uie_polling || err) {
-		rtc->uie_task_active = 0;
-	} else if (rtc->oldsecs != tm.tm_sec) {
-		num = (tm.tm_sec + 60 - rtc->oldsecs) % 60;
-		rtc->oldsecs = tm.tm_sec;
-		rtc->uie_timer.expires = jiffies + HZ - (HZ/10);
-		rtc->uie_timer_active = 1;
-		rtc->uie_task_active = 0;
-		add_timer(&rtc->uie_timer);
-	} else if (schedule_work(&rtc->uie_task) == 0) {
-		rtc->uie_task_active = 0;
-	}
-	spin_unlock_irq(&rtc->irq_lock);
-	if (num)
-		rtc_update_irq(rtc, num, RTC_UF | RTC_IRQF);
-}
-static void rtc_uie_timer(unsigned long data)
-{
-	struct rtc_device *rtc = (struct rtc_device *)data;
-	unsigned long flags;
-
-	spin_lock_irqsave(&rtc->irq_lock, flags);
-	rtc->uie_timer_active = 0;
-	rtc->uie_task_active = 1;
-	if ((schedule_work(&rtc->uie_task) == 0))
-		rtc->uie_task_active = 0;
-	spin_unlock_irqrestore(&rtc->irq_lock, flags);
-}
-
-static int clear_uie(struct rtc_device *rtc)
-{
-	spin_lock_irq(&rtc->irq_lock);
-	if (rtc->uie_irq_active) {
-		rtc->stop_uie_polling = 1;
-		if (rtc->uie_timer_active) {
-			spin_unlock_irq(&rtc->irq_lock);
-			del_timer_sync(&rtc->uie_timer);
-			spin_lock_irq(&rtc->irq_lock);
-			rtc->uie_timer_active = 0;
-		}
-		if (rtc->uie_task_active) {
-			spin_unlock_irq(&rtc->irq_lock);
-			flush_scheduled_work();
-			spin_lock_irq(&rtc->irq_lock);
-		}
-		rtc->uie_irq_active = 0;
-	}
-	spin_unlock_irq(&rtc->irq_lock);
-	return 0;
-}
-
-static int set_uie(struct rtc_device *rtc)
-{
-	struct rtc_time tm;
-	int err;
-
-	err = rtc_read_time(rtc, &tm);
-	if (err)
-		return err;
-	spin_lock_irq(&rtc->irq_lock);
-	if (!rtc->uie_irq_active) {
-		rtc->uie_irq_active = 1;
-		rtc->stop_uie_polling = 0;
-		rtc->oldsecs = tm.tm_sec;
-		rtc->uie_task_active = 1;
-		if (schedule_work(&rtc->uie_task) == 0)
-			rtc->uie_task_active = 0;
-	}
-	rtc->irq_data = 0;
-	spin_unlock_irq(&rtc->irq_lock);
-	return 0;
-}
-
-int rtc_dev_update_irq_enable_emul(struct rtc_device *rtc, unsigned int enabled)
-{
-	if (enabled)
-		return set_uie(rtc);
-	else
-		return clear_uie(rtc);
-}
-EXPORT_SYMBOL(rtc_dev_update_irq_enable_emul);
-
-#endif /* CONFIG_RTC_INTF_DEV_UIE_EMUL */
 
 static ssize_t
 rtc_dev_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
@@ -493,11 +394,6 @@ void rtc_dev_prepare(struct rtc_device *rtc)
 
 	rtc->dev.devt = MKDEV(MAJOR(rtc_devt), rtc->id);
 
-#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
-	INIT_WORK(&rtc->uie_task, rtc_uie_task);
-	setup_timer(&rtc->uie_timer, rtc_uie_timer, (unsigned long)rtc);
-#endif
-
 	cdev_init(&rtc->char_dev, &rtc_dev_fops);
 	rtc->char_dev.owner = rtc->owner;
 }
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index a3421ab..44e18a2 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -203,18 +203,6 @@ struct rtc_device
 	struct hrtimer pie_timer; /* sub second exp, so needs hrtimer */
 	int pie_enabled;
 	struct work_struct irqwork;
-
-
-#ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
-	struct work_struct uie_task;
-	struct timer_list uie_timer;
-	/* Those fields are protected by rtc->irq_lock */
-	unsigned int oldsecs;
-	unsigned int uie_irq_active:1;
-	unsigned int stop_uie_polling:1;
-	unsigned int uie_task_active:1;
-	unsigned int uie_timer_active:1;
-#endif
 };
 #define to_rtc_device(d) container_of(d, struct rtc_device, dev)
 
-- 
1.7.3.2.146.gca209



* [PATCH 08/12] rtc: Namespace fixup
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (6 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 07/12] RTC: Remove UIE emulation John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 09/12] [RFC] hrtimers: extend hrtimer base code to handle more then 2 clockids John Stultz
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: Thomas Gleixner, John Stultz

From: Thomas Gleixner <tglx@linutronix.de>

rtctimer_* is already occupied by sound/core/rtctimer.c. Instead of
fiddling with that, rename the new functions to rtc_timer_* which
reads nicer anyway.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
---
 drivers/rtc/class.c     |    6 +++---
 drivers/rtc/interface.c |   42 +++++++++++++++++++++---------------------
 include/linux/rtc.h     |   12 ++++++------
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/rtc/class.c b/drivers/rtc/class.c
index 8347c4b..9583cbc 100644
--- a/drivers/rtc/class.c
+++ b/drivers/rtc/class.c
@@ -155,11 +155,11 @@ struct rtc_device *rtc_device_register(const char *name, struct device *dev,
 
 	/* Init timerqueue */
 	timerqueue_init_head(&rtc->timerqueue);
-	INIT_WORK(&rtc->irqwork, rtctimer_do_work);
+	INIT_WORK(&rtc->irqwork, rtc_timer_do_work);
 	/* Init aie timer */
-	rtctimer_init(&rtc->aie_timer, rtc_aie_update_irq, (void *)rtc);
+	rtc_timer_init(&rtc->aie_timer, rtc_aie_update_irq, (void *)rtc);
 	/* Init uie timer */
-	rtctimer_init(&rtc->uie_rtctimer, rtc_uie_update_irq, (void *)rtc);
+	rtc_timer_init(&rtc->uie_rtctimer, rtc_uie_update_irq, (void *)rtc);
 	/* Init pie timer */
 	hrtimer_init(&rtc->pie_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	rtc->pie_timer.function = rtc_pie_update_irq;
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index c81c50b..90384b9 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -174,14 +174,14 @@ int rtc_set_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 	if (err)
 		return err;
 	if (rtc->aie_timer.enabled) {
-		rtctimer_remove(rtc, &rtc->aie_timer);
+		rtc_timer_remove(rtc, &rtc->aie_timer);
 		rtc->aie_timer.enabled = 0;
 	}
 	rtc->aie_timer.node.expires = rtc_tm_to_ktime(alarm->time);
 	rtc->aie_timer.period = ktime_set(0, 0);
 	if (alarm->enabled) {
 		rtc->aie_timer.enabled = 1;
-		rtctimer_enqueue(rtc, &rtc->aie_timer);
+		rtc_timer_enqueue(rtc, &rtc->aie_timer);
 	}
 	mutex_unlock(&rtc->ops_lock);
 	return 0;
@@ -197,9 +197,9 @@ int rtc_alarm_irq_enable(struct rtc_device *rtc, unsigned int enabled)
 	if (rtc->aie_timer.enabled != enabled) {
 		if (enabled) {
 			rtc->aie_timer.enabled = 1;
-			rtctimer_enqueue(rtc, &rtc->aie_timer);
+			rtc_timer_enqueue(rtc, &rtc->aie_timer);
 		} else {
-			rtctimer_remove(rtc, &rtc->aie_timer);
+			rtc_timer_remove(rtc, &rtc->aie_timer);
 			rtc->aie_timer.enabled = 0;
 		}
 	}
@@ -236,9 +236,9 @@ int rtc_update_irq_enable(struct rtc_device *rtc, unsigned int enabled)
 		rtc->uie_rtctimer.node.expires = ktime_add(now, onesec);
 		rtc->uie_rtctimer.period = ktime_set(1, 0);
 		rtc->uie_rtctimer.enabled = 1;
-		rtctimer_enqueue(rtc, &rtc->uie_rtctimer);
+		rtc_timer_enqueue(rtc, &rtc->uie_rtctimer);
 	} else {
-		rtctimer_remove(rtc, &rtc->uie_rtctimer);
+		rtc_timer_remove(rtc, &rtc->uie_rtctimer);
 		rtc->uie_rtctimer.enabled = 0;
 	}
 
@@ -481,7 +481,7 @@ int rtc_irq_set_freq(struct rtc_device *rtc, struct rtc_task *task, int freq)
 EXPORT_SYMBOL_GPL(rtc_irq_set_freq);
 
 /**
- * rtctimer_enqueue - Adds a rtc_timer to the rtc_device timerqueue
+ * rtc_timer_enqueue - Adds a rtc_timer to the rtc_device timerqueue
  * @rtc rtc device
  * @timer timer being added.
  *
@@ -490,7 +490,7 @@ EXPORT_SYMBOL_GPL(rtc_irq_set_freq);
  *
  * Must hold ops_lock for proper serialization of timerqueue
  */
-void rtctimer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer)
+void rtc_timer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer)
 {
 	timerqueue_add(&rtc->timerqueue, &timer->node);
 	if (&timer->node == timerqueue_getnext(&rtc->timerqueue)) {
@@ -505,7 +505,7 @@ void rtctimer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer)
 }
 
 /**
- * rtctimer_remove - Removes a rtc_timer from the rtc_device timerqueue
+ * rtc_timer_remove - Removes a rtc_timer from the rtc_device timerqueue
  * @rtc rtc device
  * @timer timer being removed.
  *
@@ -514,7 +514,7 @@ void rtctimer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer)
  *
  * Must hold ops_lock for proper serialization of timerqueue
  */
-void rtctimer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
+void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
 {
 	struct timerqueue_node *next = timerqueue_getnext(&rtc->timerqueue);
 	timerqueue_del(&rtc->timerqueue, &timer->node);
@@ -534,7 +534,7 @@ void rtctimer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
 }
 
 /**
- * rtctimer_do_work - Expires rtc timers
+ * rtc_timer_do_work - Expires rtc timers
  * @rtc rtc device
  * @timer timer being removed.
  *
@@ -543,7 +543,7 @@ void rtctimer_remove(struct rtc_device *rtc, struct rtc_timer *timer)
  *
  * Serializes access to timerqueue via ops_lock mutex
  */
-void rtctimer_do_work(struct work_struct *work)
+void rtc_timer_do_work(struct work_struct *work)
 {
 	struct rtc_timer *timer;
 	struct timerqueue_node *next;
@@ -592,14 +592,14 @@ again:
 }
 
 
-/* rtctimer_init - Initializes an rtc_timer
+/* rtc_timer_init - Initializes an rtc_timer
  * @timer: timer to be intiialized
  * @f: function pointer to be called when timer fires
  * @data: private data passed to function pointer
  *
  * Kernel interface to initializing an rtc_timer.
  */
-void rtctimer_init(struct rtc_timer *timer, void (*f)(void* p), void* data)
+void rtc_timer_init(struct rtc_timer *timer, void (*f)(void* p), void* data)
 {
 	timerqueue_init(&timer->node);
 	timer->enabled = 0;
@@ -607,7 +607,7 @@ void rtctimer_init(struct rtc_timer *timer, void (*f)(void* p), void* data)
 	timer->task.private_data = data;
 }
 
-/* rtctimer_start - Sets an rtc_timer to fire in the future
+/* rtc_timer_start - Sets an rtc_timer to fire in the future
  * @ rtc: rtc device to be used
  * @ timer: timer being set
  * @ expires: time at which to expire the timer
@@ -615,36 +615,36 @@ void rtctimer_init(struct rtc_timer *timer, void (*f)(void* p), void* data)
  *
  * Kernel interface to set an rtc_timer
  */
-int rtctimer_start(struct rtc_device *rtc, struct rtc_timer* timer,
+int rtc_timer_start(struct rtc_device *rtc, struct rtc_timer* timer,
 			ktime_t expires, ktime_t period)
 {
 	int ret = 0;
 	mutex_lock(&rtc->ops_lock);
 	if (timer->enabled)
-		rtctimer_remove(rtc, timer);
+		rtc_timer_remove(rtc, timer);
 
 	timer->node.expires = expires;
 	timer->period = period;
 
 	timer->enabled = 1;
-	rtctimer_enqueue(rtc, timer);
+	rtc_timer_enqueue(rtc, timer);
 
 	mutex_unlock(&rtc->ops_lock);
 	return ret;
 }
 
-/* rtctimer_cancel - Stops an rtc_timer
+/* rtc_timer_cancel - Stops an rtc_timer
  * @ rtc: rtc device to be used
  * @ timer: timer being set
  *
  * Kernel interface to cancel an rtc_timer
  */
-int rtctimer_cancel(struct rtc_device *rtc, struct rtc_timer* timer)
+int rtc_timer_cancel(struct rtc_device *rtc, struct rtc_timer* timer)
 {
 	int ret = 0;
 	mutex_lock(&rtc->ops_lock);
 	if (timer->enabled)
-		rtctimer_remove(rtc, timer);
+		rtc_timer_remove(rtc, timer);
 	timer->enabled = 0;
 	mutex_unlock(&rtc->ops_lock);
 	return ret;
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 44e18a2..3c995b4 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -246,13 +246,13 @@ int rtc_register(rtc_task_t *task);
 int rtc_unregister(rtc_task_t *task);
 int rtc_control(rtc_task_t *t, unsigned int cmd, unsigned long arg);
 
-void rtctimer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer);
-void rtctimer_remove(struct rtc_device *rtc, struct rtc_timer *timer);
-void rtctimer_init(struct rtc_timer *timer, void (*f)(void* p), void* data);
-int rtctimer_start(struct rtc_device *rtc, struct rtc_timer* timer,
+void rtc_timer_enqueue(struct rtc_device *rtc, struct rtc_timer *timer);
+void rtc_timer_remove(struct rtc_device *rtc, struct rtc_timer *timer);
+void rtc_timer_init(struct rtc_timer *timer, void (*f)(void* p), void* data);
+int rtc_timer_start(struct rtc_device *rtc, struct rtc_timer* timer,
 			ktime_t expires, ktime_t period);
-int rtctimer_cancel(struct rtc_device *rtc, struct rtc_timer* timer);
-void rtctimer_do_work(struct work_struct *work);
+int rtc_timer_cancel(struct rtc_device *rtc, struct rtc_timer* timer);
+void rtc_timer_do_work(struct work_struct *work);
 
 static inline bool is_leap_year(unsigned int year)
 {
-- 
1.7.3.2.146.gca209



* [PATCH 09/12] [RFC] hrtimers: extend hrtimer base code to handle more then 2 clockids
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (7 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 08/12] rtc: Namespace fixup John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 10/12] [RFC] hrtimers: Add CLOCK_BOOTTIME clockid, hrtimerbase and posix interface John Stultz
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: John Stultz, Jamie Lokier, Thomas Gleixner, Alexander Shishkin,
	Arve Hjønnevåg

The hrtimer code is written mainly with CLOCK_REALTIME and CLOCK_MONOTONIC
in mind. These are clockids 0 and 1 respectively. However, if we are
to introduce any new hrtimer bases using new clockids, we have to skip
the cputimers (clockids 2,3) as well as other clockids that may not
implement timers.

This patch adds a little bit of indirection between the clockid and
the base, so that we can extend the base by one when we add
a new clockid at number 7 or so.
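
The indirection described above can be sketched in userspace as a small
lookup table. This is only an illustrative model, not the kernel code:
the model_* names are hypothetical, the table size of 16 is an assumption,
and the bounds check is an addition that the patch's
hrtimer_clockid_to_base() does not have. The entry for clockid 7 shows the
"extend the base by one" case mentioned above.

```c
/*
 * Userspace model of a clockid -> hrtimer base lookup table.
 * All model_* names are hypothetical; only the clockid numbers
 * (0, 1, 2, 3, 7) follow the patch description.
 */
#define MODEL_MAX_CLOCKS 16	/* assumed table size */

enum model_clockid {
	MODEL_CLOCK_REALTIME = 0,
	MODEL_CLOCK_MONOTONIC = 1,
	MODEL_CLOCK_PROCESS_CPUTIME_ID = 2,	/* cputimer, no hrtimer base */
	MODEL_CLOCK_THREAD_CPUTIME_ID = 3,	/* cputimer, no hrtimer base */
	MODEL_CLOCK_BOOTTIME = 7,		/* the "number 7 or so" case */
};

enum model_base {
	MODEL_BASE_REALTIME,
	MODEL_BASE_MONOTONIC,
	MODEL_BASE_BOOTTIME,
	MODEL_MAX_BASES,
};

static int model_table[MODEL_MAX_CLOCKS];

/* Mark every clockid as "no base" (-1), then fill in the known ones. */
void model_init_table(void)
{
	int i;

	for (i = 0; i < MODEL_MAX_CLOCKS; i++)
		model_table[i] = -1;
	model_table[MODEL_CLOCK_REALTIME] = MODEL_BASE_REALTIME;
	model_table[MODEL_CLOCK_MONOTONIC] = MODEL_BASE_MONOTONIC;
	model_table[MODEL_CLOCK_BOOTTIME] = MODEL_BASE_BOOTTIME;
}

/* Returns the base index, or -1 if the clockid has no hrtimer base. */
int model_clockid_to_base(int clock_id)
{
	if (clock_id < 0 || clock_id >= MODEL_MAX_CLOCKS)
		return -1;	/* bounds check added for this sketch only */
	return model_table[clock_id];
}
```

With this layout the per-cpu base array stays dense (three entries here),
while the clockid space can remain sparse.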

CC: Jamie Lokier <jamie@shareable.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Alexander Shishkin <virtuoso@slind.org>
CC: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/hrtimer.h |    6 ++++-
 kernel/hrtimer.c        |   48 ++++++++++++++++++++++++++++++++++------------
 2 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 0a7abaa..f96c43d 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -149,7 +149,11 @@ struct hrtimer_clock_base {
 #endif
 };
 
-#define HRTIMER_MAX_CLOCK_BASES 2
+enum  hrtimer_base_type {
+	HRTIMER_BASE_REALTIME,
+	HRTIMER_BASE_MONOTONIC,
+	HRTIMER_MAX_CLOCK_BASES,
+};
 
 /*
  * struct hrtimer_cpu_base - the per cpu clock bases
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index f2429fc..fb20e08 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -53,11 +53,10 @@
 /*
  * The timer bases:
  *
- * Note: If we want to add new timer bases, we have to skip the two
- * clock ids captured by the cpu-timers. We do this by holding empty
- * entries rather than doing math adjustment of the clock ids.
- * This ensures that we capture erroneous accesses to these clock ids
- * rather than moving them into the range of valid clock id's.
+ * There are more clockids than hrtimer bases. Thus, we index
+ * into the timer bases by the hrtimer_base_type enum. When trying
+ * to reach a base using a clockid, hrtimer_clockid_to_base()
+ * is used to convert from clockid to the proper hrtimer_base_type.
  */
 DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 {
@@ -77,6 +76,17 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 	}
 };
 
+static int hrtimer_clock_to_base_table[MAX_CLOCKS];
+
+static inline int hrtimer_clockid_to_base(clockid_t clock_id)
+{
+	int ret = hrtimer_clock_to_base_table[clock_id];
+
+	WARN_ON(ret == -1);
+	return ret;
+}
+
+
 /*
  * Get the coarse grained time at the softirq based on xtime and
  * wall_to_monotonic.
@@ -95,8 +105,8 @@ static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
 
 	xtim = timespec_to_ktime(xts);
 	tomono = timespec_to_ktime(tom);
-	base->clock_base[CLOCK_REALTIME].softirq_time = xtim;
-	base->clock_base[CLOCK_MONOTONIC].softirq_time =
+	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
+	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time =
 		ktime_add(xtim, tomono);
 }
 
@@ -184,10 +194,11 @@ switch_hrtimer_base(struct hrtimer *timer, struct hrtimer_clock_base *base,
 	struct hrtimer_cpu_base *new_cpu_base;
 	int this_cpu = smp_processor_id();
 	int cpu = hrtimer_get_target(this_cpu, pinned);
+	int basenum = hrtimer_clockid_to_base(base->index);
 
 again:
 	new_cpu_base = &per_cpu(hrtimer_bases, cpu);
-	new_base = &new_cpu_base->clock_base[base->index];
+	new_base = &new_cpu_base->clock_base[basenum];
 
 	if (base != new_base) {
 		/*
@@ -627,7 +638,7 @@ static void retrigger_next_event(void *arg)
 
 	/* Adjust CLOCK_REALTIME offset */
 	raw_spin_lock(&base->lock);
-	base->clock_base[CLOCK_REALTIME].offset =
+	base->clock_base[HRTIMER_BASE_REALTIME].offset =
 		timespec_to_ktime(realtime_offset);
 
 	hrtimer_force_reprogram(base, 0);
@@ -725,8 +736,8 @@ static int hrtimer_switch_to_hres(void)
 		return 0;
 	}
 	base->hres_active = 1;
-	base->clock_base[CLOCK_REALTIME].resolution = KTIME_HIGH_RES;
-	base->clock_base[CLOCK_MONOTONIC].resolution = KTIME_HIGH_RES;
+	base->clock_base[HRTIMER_BASE_REALTIME].resolution = KTIME_HIGH_RES;
+	base->clock_base[HRTIMER_BASE_MONOTONIC].resolution = KTIME_HIGH_RES;
 
 	tick_setup_sched_timer();
 
@@ -1121,6 +1132,7 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 			   enum hrtimer_mode mode)
 {
 	struct hrtimer_cpu_base *cpu_base;
+	int base;
 
 	memset(timer, 0, sizeof(struct hrtimer));
 
@@ -1129,7 +1141,8 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 	if (clock_id == CLOCK_REALTIME && mode != HRTIMER_MODE_ABS)
 		clock_id = CLOCK_MONOTONIC;
 
-	timer->base = &cpu_base->clock_base[clock_id];
+	base = hrtimer_clockid_to_base(clock_id);
+	timer->base = &cpu_base->clock_base[base];
 	hrtimer_init_timer_hres(timer);
 	timerqueue_init(&timer->node);
 
@@ -1165,9 +1178,10 @@ EXPORT_SYMBOL_GPL(hrtimer_init);
 int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
 {
 	struct hrtimer_cpu_base *cpu_base;
+	int base = hrtimer_clockid_to_base(which_clock);
 
 	cpu_base = &__raw_get_cpu_var(hrtimer_bases);
-	*tp = ktime_to_timespec(cpu_base->clock_base[which_clock].resolution);
+	*tp = ktime_to_timespec(cpu_base->clock_base[base].resolution);
 
 	return 0;
 }
@@ -1714,6 +1728,14 @@ static struct notifier_block __cpuinitdata hrtimers_nb = {
 
 void __init hrtimers_init(void)
 {
+	int i;
+
+	/* Init the clockid -> base mapping */
+	for (i = 0; i < MAX_CLOCKS; i++)
+		hrtimer_clock_to_base_table[i] = -1;
+	hrtimer_clock_to_base_table[CLOCK_REALTIME] = HRTIMER_BASE_REALTIME;
+	hrtimer_clock_to_base_table[CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC;
+
 	hrtimer_cpu_notify(&hrtimers_nb, (unsigned long)CPU_UP_PREPARE,
 			  (void *)(long)smp_processor_id());
 	register_cpu_notifier(&hrtimers_nb);
-- 
1.7.3.2.146.gca209



* [PATCH 10/12] [RFC] hrtimers: Add CLOCK_BOOTTIME clockid, hrtimerbase and posix interface
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (8 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 09/12] [RFC] hrtimers: extend hrtimer base code to handle more then 2 clockids John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 11/12] timers: Add rb_init_node() to allow for stack allocated rb nodes John Stultz
  2011-01-06  2:15 ` [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers John Stultz
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: John Stultz, Jamie Lokier, Thomas Gleixner, Alexander Shishkin,
	Arve Hjønnevåg

CLOCK_MONOTONIC stops while the system is in suspend, because system
suspend is meant to be invisible to applications. However, a growing
set of applications want to be suspend-aware but do not want to deal
with the complications of CLOCK_REALTIME (which might jump around if
settimeofday is called).

For these applications, I propose a new clockid: CLOCK_BOOTTIME.
CLOCK_BOOTTIME is identical to CLOCK_MONOTONIC, except it also
includes any time spent in suspend.

This patch adds the new CLOCK_BOOTTIME clockid, as well as the
infrastructure needed to support hrtimers against it, and the
wiring to expose it out via the posix interface.
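
From userspace, the new clockid would be read through the ordinary
clock_gettime() call. The following is a minimal sketch, assuming a kernel
and libc that expose CLOCK_BOOTTIME; the helper names (ts_to_ns,
clock_readable, suspend_estimate_ns) are hypothetical, and the fallback
define uses the value 7 assigned in this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Fallback for libc headers that lack the id; 7 is the value in this patch. */
#ifndef CLOCK_BOOTTIME
#define CLOCK_BOOTTIME 7
#endif

/* Convert a timespec to nanoseconds. */
long long ts_to_ns(struct timespec ts)
{
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Returns 1 if the given clock can be read on this system, 0 otherwise. */
int clock_readable(clockid_t id)
{
	struct timespec ts;

	return clock_gettime(id, &ts) == 0;
}

/*
 * Hypothetical helper: estimate the time spent in suspend as
 * CLOCK_BOOTTIME - CLOCK_MONOTONIC. The two clocks are read one after
 * the other, so the result also contains the small delay between the
 * two calls. Returns 0 on success, -1 if either clock cannot be read.
 */
int suspend_estimate_ns(long long *out)
{
	struct timespec mono, boot;

	if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0)
		return -1;
	if (clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
		return -1;
	*out = ts_to_ns(boot) - ts_to_ns(mono);
	return 0;
}
```

On a system that has never suspended, the estimate should stay close to
zero; after a suspend/resume cycle it should grow by roughly the time
spent suspended.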

CC: Jamie Lokier <jamie@shareable.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Alexander Shishkin <virtuoso@slind.org>
CC: Arve Hjønnevåg <arve@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/hrtimer.h   |    2 +
 include/linux/time.h      |    4 ++
 kernel/hrtimer.c          |   15 +++++++-
 kernel/posix-timers.c     |   16 ++++++++-
 kernel/time/timekeeping.c |   79 ++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 112 insertions(+), 4 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index f96c43d..7f22dbb 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -152,6 +152,7 @@ struct hrtimer_clock_base {
 enum  hrtimer_base_type {
 	HRTIMER_BASE_REALTIME,
 	HRTIMER_BASE_MONOTONIC,
+	HRTIMER_BASE_BOOTTIME,
 	HRTIMER_MAX_CLOCK_BASES,
 };
 
@@ -314,6 +315,7 @@ static inline int hrtimer_is_hres_active(struct hrtimer *timer)
 
 extern ktime_t ktime_get(void);
 extern ktime_t ktime_get_real(void);
+extern ktime_t ktime_get_boottime(void);
 
 
 DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
diff --git a/include/linux/time.h b/include/linux/time.h
index 9f15ac7..4ed031a 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -126,6 +126,8 @@ unsigned long get_seconds(void);
 struct timespec current_kernel_time(void);
 struct timespec __current_kernel_time(void); /* does not take xtime_lock */
 struct timespec __get_wall_to_monotonic(void); /* does not take xtime_lock */
+extern struct timespec __get_sleep_time(void); /* does not take xtime_lock */
+
 struct timespec get_monotonic_coarse(void);
 
 #define CURRENT_TIME		(current_kernel_time())
@@ -160,6 +162,7 @@ extern void getnstimeofday(struct timespec *tv);
 extern void getrawmonotonic(struct timespec *ts);
 extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
+extern void get_monotonic_boottime(struct timespec *ts);
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
 extern int timekeeping_valid_for_hres(void);
@@ -290,6 +293,7 @@ struct itimerval {
 #define CLOCK_MONOTONIC_RAW		4
 #define CLOCK_REALTIME_COARSE		5
 #define CLOCK_MONOTONIC_COARSE		6
+#define CLOCK_BOOTTIME			7
 
 /*
  * The IDs of various hardware clocks:
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index fb20e08..f507610 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -73,6 +73,11 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 			.get_time = &ktime_get,
 			.resolution = KTIME_LOW_RES,
 		},
+		{
+			.index = CLOCK_BOOTTIME,
+			.get_time = &ktime_get_boottime,
+			.resolution = KTIME_LOW_RES,
+		},
 	}
 };
 
@@ -93,21 +98,26 @@ static inline int hrtimer_clockid_to_base(clockid_t clock_id)
  */
 static void hrtimer_get_softirq_time(struct hrtimer_cpu_base *base)
 {
-	ktime_t xtim, tomono;
-	struct timespec xts, tom;
+	ktime_t xtim, tomono, sleep;
+	struct timespec xts, tom, slp;
 	unsigned long seq;
 
 	do {
 		seq = read_seqbegin(&xtime_lock);
 		xts = __current_kernel_time();
 		tom = __get_wall_to_monotonic();
+		slp = __get_sleep_time();
 	} while (read_seqretry(&xtime_lock, seq));
 
 	xtim = timespec_to_ktime(xts);
 	tomono = timespec_to_ktime(tom);
+	sleep = timespec_to_ktime(slp);
 	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
 	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time =
 		ktime_add(xtim, tomono);
+	base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time =
+		ktime_add(base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time,
+				sleep);
 }
 
 /*
@@ -1735,6 +1745,7 @@ void __init hrtimers_init(void)
 		hrtimer_clock_to_base_table[i] = -1;
 	hrtimer_clock_to_base_table[CLOCK_REALTIME] = HRTIMER_BASE_REALTIME;
 	hrtimer_clock_to_base_table[CLOCK_MONOTONIC] = HRTIMER_BASE_MONOTONIC;
+	hrtimer_clock_to_base_table[CLOCK_BOOTTIME] = HRTIMER_BASE_BOOTTIME;
 
 	hrtimer_cpu_notify(&hrtimers_nb, (unsigned long)CPU_UP_PREPARE,
 			  (void *)(long)smp_processor_id());
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 9ca4973..1bc2572 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -234,7 +234,7 @@ static int posix_ktime_get_ts(clockid_t which_clock, struct timespec *tp)
 }
 
 /*
- * Get monotonic time for posix timers
+ * Get monotonic-raw time for posix timers
  */
 static int posix_get_monotonic_raw(clockid_t which_clock, struct timespec *tp)
 {
@@ -261,6 +261,14 @@ static int posix_get_coarse_res(const clockid_t which_clock, struct timespec *tp
 	*tp = ktime_to_timespec(KTIME_LOW_RES);
 	return 0;
 }
+
+static int posix_get_boottime(const clockid_t which_clock, struct timespec *tp)
+{
+	get_monotonic_boottime(tp);
+	return 0;
+}
+
+
 /*
  * Initialize everything, well, just everything in Posix clocks/timers ;)
  */
@@ -295,12 +303,18 @@ static __init int init_posix_timers(void)
 		.timer_create = no_timer_create,
 		.nsleep = no_nsleep,
 	};
+	struct k_clock clock_boottime = {
+		.clock_getres = hrtimer_get_res,
+		.clock_get = posix_get_boottime,
+		.clock_set = do_posix_clock_nosettime,
+	};
 
 	register_posix_clock(CLOCK_REALTIME, &clock_realtime);
 	register_posix_clock(CLOCK_MONOTONIC, &clock_monotonic);
 	register_posix_clock(CLOCK_MONOTONIC_RAW, &clock_monotonic_raw);
 	register_posix_clock(CLOCK_REALTIME_COARSE, &clock_realtime_coarse);
 	register_posix_clock(CLOCK_MONOTONIC_COARSE, &clock_monotonic_coarse);
+	register_posix_clock(CLOCK_BOOTTIME, &clock_boottime);
 
 	posix_timers_cache = kmem_cache_create("posix_timers_cache",
 					sizeof (struct k_itimer), 0, SLAB_PANIC,
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 49010d8..e3db3c9 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -823,7 +823,7 @@ void update_wall_time(void)
  * getboottime - Return the real time of system boot.
  * @ts:		pointer to the timespec to be set
  *
- * Returns the time of day in a timespec.
+ * Returns the wall-time of boot in a timespec.
  *
  * This is based on the wall_to_monotonic offset and the total suspend
  * time. Calls to settimeofday will affect the value returned (which
@@ -841,6 +841,83 @@ void getboottime(struct timespec *ts)
 }
 EXPORT_SYMBOL_GPL(getboottime);
 
+
+/**
+ * get_monotonic_boottime - Returns monotonic time since boot
+ * @ts:		pointer to the timespec to be set
+ *
+ * Returns the monotonic time since boot in a timespec.
+ *
+ * This is similar to CLOCK_MONOTONIC/ktime_get_ts, but also
+ * includes the time spent in suspend.
+ */
+void get_monotonic_boottime(struct timespec *ts)
+{
+	struct timespec tomono, sleep;
+	unsigned int seq;
+	s64 nsecs;
+
+	WARN_ON(timekeeping_suspended);
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		*ts = xtime;
+		tomono = wall_to_monotonic;
+		sleep = total_sleep_time;
+		nsecs = timekeeping_get_ns();
+
+	} while (read_seqretry(&xtime_lock, seq));
+
+	set_normalized_timespec(ts, ts->tv_sec + tomono.tv_sec + sleep.tv_sec,
+			ts->tv_nsec + tomono.tv_nsec + sleep.tv_nsec + nsecs);
+}
+EXPORT_SYMBOL_GPL(get_monotonic_boottime);
+
+/**
+ * ktime_get_boottime - Returns monotonic time since boot in a ktime
+ *
+ * Returns the monotonic time since boot in a ktime
+ *
+ * This is similar to CLOCK_MONOTONIC/ktime_get, but also
+ * includes the time spent in suspend.
+ */
+ktime_t ktime_get_boottime(void)
+{
+	unsigned int seq;
+	s64 secs, nsecs;
+
+	WARN_ON(timekeeping_suspended);
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		secs = xtime.tv_sec;
+		secs += wall_to_monotonic.tv_sec;
+		secs += total_sleep_time.tv_sec;
+		nsecs = xtime.tv_nsec;
+		nsecs += wall_to_monotonic.tv_nsec;
+		nsecs += total_sleep_time.tv_nsec;
+		nsecs += timekeeping_get_ns();
+
+	} while (read_seqretry(&xtime_lock, seq));
+	/*
+	 * Use ktime_set/ktime_add_ns to create a proper ktime on
+	 * 32-bit architectures without CONFIG_KTIME_SCALAR.
+	 */
+	return ktime_add_ns(ktime_set(secs, 0), nsecs);
+}
+EXPORT_SYMBOL_GPL(ktime_get_boottime);
+
+/**
+ * __get_sleep_time - returns total_sleep_time
+ *
+ * Returns total time spent in suspend.
+ * Requires the xtime lock be held
+ */
+struct timespec __get_sleep_time(void)
+{
+	return total_sleep_time;
+}
+
 /**
  * monotonic_to_bootbased - Convert the monotonic time to boot based.
  * @ts:		pointer to the timespec to be converted
-- 
1.7.3.2.146.gca209



* [PATCH 11/12] timers: Add rb_init_node() to allow for stack allocated rb nodes
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (9 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 10/12] [RFC] hrtimers: Add CLOCK_BOOTTIME clockid, hrtimerbase and posix interface John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  2:15 ` [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers John Stultz
  11 siblings, 0 replies; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel; +Cc: John Stultz

In cases where a timerqueue_node or some structure that utilizes
a timerqueue_node is allocated on the stack, gcc would give warnings
caused by timerqueue_init() calling RB_CLEAR_NODE, which
self-references the node's uninitialized data.

The solution is to create an rb_init_node() function that zeros
the rb_node structure out and then calls RB_CLEAR_NODE(), and
then call the new init function from timerqueue_init().
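
A minimal userspace sketch of the problem and the fix follows. The
struct layout and macros are simplified stand-ins modeled on
include/linux/rbtree.h (an assumption of this example), and
demo_stack_node() is a hypothetical helper added only to exercise them.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rb_node. */
struct rb_node {
	unsigned long rb_parent_color;	/* parent pointer, color in low bits */
	struct rb_node *rb_right;
	struct rb_node *rb_left;
};

#define rb_parent(r)	((struct rb_node *)((r)->rb_parent_color & ~3UL))
#define RB_EMPTY_NODE(node)	(rb_parent(node) == (node))

/*
 * Keeps the existing color bits, so it reads rb_parent_color before
 * writing it. On a fresh stack node that read is of uninitialized
 * data, which is what gcc warns about.
 */
static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
{
	rb->rb_parent_color = (rb->rb_parent_color & 3) | (unsigned long)p;
}

#define RB_CLEAR_NODE(node)	(rb_set_parent(node, node))

/* Zero the node first, then mark it as not being in any tree. */
static inline void rb_init_node(struct rb_node *rb)
{
	rb->rb_parent_color = 0;
	rb->rb_right = NULL;
	rb->rb_left = NULL;
	RB_CLEAR_NODE(rb);
}

/* Hypothetical helper: init a stack node and check that it reads as empty. */
static int demo_stack_node(void)
{
	struct rb_node node;

	rb_init_node(&node);
	return RB_EMPTY_NODE(&node) && !node.rb_left && !node.rb_right;
}
```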

Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/rbtree.h     |    8 ++++++++
 include/linux/timerqueue.h |    2 +-
 2 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 7066acb..033b507 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -136,6 +136,14 @@ static inline void rb_set_color(struct rb_node *rb, int color)
 #define RB_EMPTY_NODE(node)	(rb_parent(node) == node)
 #define RB_CLEAR_NODE(node)	(rb_set_parent(node, node))
 
+static inline void rb_init_node(struct rb_node *rb)
+{
+	rb->rb_parent_color = 0;
+	rb->rb_right = NULL;
+	rb->rb_left = NULL;
+	RB_CLEAR_NODE(rb);
+}
+
 extern void rb_insert_color(struct rb_node *, struct rb_root *);
 extern void rb_erase(struct rb_node *, struct rb_root *);
 
diff --git a/include/linux/timerqueue.h b/include/linux/timerqueue.h
index 406b103..67cca3d 100644
--- a/include/linux/timerqueue.h
+++ b/include/linux/timerqueue.h
@@ -26,7 +26,7 @@ extern struct timerqueue_node *timerqueue_iterate_next(
 
 static inline void timerqueue_init(struct timerqueue_node *node)
 {
-	RB_CLEAR_NODE(&node->node);
+	rb_init_node(&node->node);
 }
 
 static inline void timerqueue_init_head(struct timerqueue_head *head)
-- 
1.7.3.2.146.gca209



* [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
                   ` (10 preceding siblings ...)
  2011-01-06  2:15 ` [PATCH 11/12] timers: Add rb_init_node() to allow for stack allocated rb nodes John Stultz
@ 2011-01-06  2:15 ` John Stultz
  2011-01-06  4:07   ` Arve Hjønnevåg
  11 siblings, 1 reply; 22+ messages in thread
From: John Stultz @ 2011-01-06  2:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: John Stultz, Arve Hjønnevåg, Brian Swetland,
	Thomas Gleixner, Alessandro Zummo

This patch introduces a new type of hybrid timer, called
alarm-timers.

Alarm-timers are similar to hrtimers, but when a system
is suspended, the RTC device is set to fire when the soonest
alarm timer expires.
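
As a rough model of the "soonest alarm" selection done at suspend time
(a userspace sketch, not the kernel code): struct base_model and
soonest_delta_ns() are hypothetical names, and the sketch uses an
explicit "found" flag where the patch uses a zero ktime as the "unset"
marker. The freezer delta handling is left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one alarm base: the current time on that
 * base's clock and the expiry of its earliest queued alarm, in ns.
 */
struct base_model {
	int64_t now_ns;
	int64_t next_expiry_ns;
	int has_alarm;		/* 0 if the base's queue is empty */
};

/*
 * Return the smallest (expiry - now) over all bases that have an
 * alarm queued, or 0 if no base has one.
 */
static int64_t soonest_delta_ns(const struct base_model *bases, size_t n)
{
	int64_t min = 0;
	int found = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int64_t delta;

		if (!bases[i].has_alarm)
			continue;
		delta = bases[i].next_expiry_ns - bases[i].now_ns;
		if (!found || delta < min) {
			min = delta;
			found = 1;
		}
	}
	return found ? min : 0;
}
```

Relative deltas are compared, rather than absolute expiry times,
because each base runs against its own clock. The resulting delta is
then added to the RTC's current time to program the wakeup.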

Alarm-timers are exposed to userland via the posix clock and timers
interface, using two new clockids: CLOCK_REALTIME_ALARM and
CLOCK_BOOTTIME_ALARM. Both clockids behave identically to
CLOCK_REALTIME and CLOCK_BOOTTIME, respectively, but timers
set against the _ALARM suffixed clockids will wake the system if
it is suspended.
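
From userspace, usage might look something like the sketch below. This
is only an illustration under assumptions: the clockid numbers are the
ones added to include/linux/time.h by this patch, alarm_sleep_rel() is
a hypothetical helper, and the call can fail on kernels that lack this
support or refuse the request, in which case the error number is simply
returned to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Clock ids as numbered in this patch, in case libc headers lack them. */
#ifndef CLOCK_REALTIME_ALARM
#define CLOCK_REALTIME_ALARM 8
#endif
#ifndef CLOCK_BOOTTIME_ALARM
#define CLOCK_BOOTTIME_ALARM 9
#endif

/*
 * Hypothetical helper: relative sleep of 'secs' seconds against the
 * given clock. With an _ALARM clockid the intent is that the sleep
 * also wakes the system from suspend when it expires.
 *
 * Returns 0 on success or a positive error number; clock_nanosleep()
 * reports errors through its return value, not through errno.
 */
static int alarm_sleep_rel(clockid_t clk, time_t secs)
{
	struct timespec req = { .tv_sec = secs, .tv_nsec = 0 };

	return clock_nanosleep(clk, 0, &req, NULL);
}
```

A caller would use it as alarm_sleep_rel(CLOCK_BOOTTIME_ALARM, 60) and
check the return value before relying on the wakeup.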

The concept for Alarm-timers was inspired by the Android Alarm
driver (by Arve Hjønnevåg) found in the Android kernel tree.

See: http://android.git.kernel.org/?p=kernel/common.git;a=blob;f=drivers/rtc/alarm.c;h=1250edfbdf3302f5e4ea6194847c6ef4bb7beb1c;hb=android-2.6.36

The android alarm driver was built on top of direct RTC
manipulation, and also implemented its own rbtree timer
code, so instead of trying to port it over to my
timerlist and RTC timer code, I just re-implemented the
basic functionality roughly following the android in-kernel
interface.

Another distinction is that while the in-kernel interface
is pretty similar, the user-space interface for android alarm
timers is via ioctls. As mentioned above, I've instead chosen
to export this functionality via the posix interface, as it
seemed a little simpler and avoids creating duplicate interfaces
to things like CLOCK_REALTIME and CLOCK_MONOTONIC under alternate
names (i.e. ANDROID_ALARM_RTC and ANDROID_ALARM_SYSTEMTIME).

It's possible that I've missed some subtleties of the Android
alarm driver interface, and that some of the interface decisions
I've made may not allow Android to use this interface directly.
I'd be very interested if those details could be pointed out,
and hopefully we can find a good solution to get this useful
functionality upstream.

Arve: It would be really great to get some feedback from you on this.

Signed-off-by: John Stultz <john.stultz@linaro.org>
CC: Arve Hjønnevåg <arve@android.com>
CC: Brian Swetland <swetland@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
---
 include/linux/alarmtimer.h   |   30 ++
 include/linux/posix-timers.h |    2 +
 include/linux/time.h         |    2 +
 kernel/time/Makefile         |    2 +-
 kernel/time/alarmtimer.c     |  699 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 734 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/alarmtimer.h
 create mode 100644 kernel/time/alarmtimer.c

diff --git a/include/linux/alarmtimer.h b/include/linux/alarmtimer.h
new file mode 100644
index 0000000..6861f28
--- /dev/null
+++ b/include/linux/alarmtimer.h
@@ -0,0 +1,30 @@
+#ifndef _LINUX_ALARMTIMER_H
+#define _LINUX_ALARMTIMER_H
+
+#include <linux/time.h>
+#include <linux/hrtimer.h>
+#include <linux/timerqueue.h>
+#include <linux/rtc.h>
+
+enum alarmtimer_type {
+	ALARM_REALTIME,
+	ALARM_BOOTTIME,
+
+	ALARM_NUMTYPE,
+};
+
+struct alarm {
+	struct timerqueue_node node;
+	ktime_t period;
+	void (*function)(struct alarm *);
+	enum alarmtimer_type type;
+	char enabled;
+	void *data;
+};
+
+void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
+		void (*function)(struct alarm *));
+void alarm_start(struct alarm *alarm, ktime_t start, ktime_t period);
+void alarm_cancel(struct alarm *alarm);
+
+#endif
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3e23844a..0361309 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -4,6 +4,7 @@
 #include <linux/spinlock.h>
 #include <linux/list.h>
 #include <linux/sched.h>
+#include <linux/alarmtimer.h>
 
 union cpu_time_count {
 	cputime_t cpu;
@@ -63,6 +64,7 @@ struct k_itimer {
 			unsigned long incr;
 			unsigned long expires;
 		} mmtimer;
+		struct alarm alarmtimer;
 	} it;
 };
 
diff --git a/include/linux/time.h b/include/linux/time.h
index 4ed031a..4dd2b34 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -294,6 +294,8 @@ struct itimerval {
 #define CLOCK_REALTIME_COARSE		5
 #define CLOCK_MONOTONIC_COARSE		6
 #define CLOCK_BOOTTIME			7
+#define CLOCK_REALTIME_ALARM		8
+#define CLOCK_BOOTTIME_ALARM		9
 
 /*
  * The IDs of various hardware clocks:
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index ee26662..f85145f 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -1,4 +1,4 @@
-obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o timecompare.o timeconv.o
+obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o timecompare.o timeconv.o alarmtimer.o
 
 obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD)		+= clockevents.o
 obj-$(CONFIG_GENERIC_CLOCKEVENTS)		+= tick-common.o
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
new file mode 100644
index 0000000..7d2ca0e
--- /dev/null
+++ b/kernel/time/alarmtimer.c
@@ -0,0 +1,699 @@
+/*
+ * Alarmtimer interface
+ *
+ * This interface provides a timer which is similar to hrtimers,
+ * but triggers an RTC alarm if the box is suspended.
+ *
+ * This interface is influenced by the Android RTC Alarm timer
+ * interface.
+ *
+ * Copyright (C) 2010 IBM Corporation
+ *
+ * Author: John Stultz <john.stultz@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <linux/time.h>
+#include <linux/hrtimer.h>
+#include <linux/timerqueue.h>
+#include <linux/rtc.h>
+#include <linux/alarmtimer.h>
+#include <linux/mutex.h>
+#include <linux/platform_device.h>
+#include <linux/posix-timers.h>
+#include <linux/workqueue.h>
+#include <linux/freezer.h>
+
+
+static struct alarm_base {
+	struct mutex lock;
+	struct timerqueue_head timerqueue;
+	struct hrtimer timer;
+	ktime_t (*gettime)(void);
+	clockid_t base_clockid;
+	struct work_struct irqwork;
+} alarm_bases[ALARM_NUMTYPE];
+
+static struct rtc_timer rtctimer;
+static struct rtc_device *rtcdev;
+
+static ktime_t freezer_delta;
+static DEFINE_SPINLOCK(freezer_delta_lock);
+
+
+/**************************************************************************
+ * alarmtimer management code
+ */
+
+/*
+ * alarmtimer_enqueue - Adds an alarm timer to an alarm_base timerqueue
+ * @base: pointer to the base where the timer is being run
+ * @alarm: pointer to alarm being enqueued.
+ *
+ * Adds alarm to an alarm_base timerqueue and if necessary sets
+ * an hrtimer to run.
+ *
+ * Must hold base->lock when calling.
+ */
+static void alarmtimer_enqueue(struct alarm_base *base, struct alarm *alarm)
+{
+	timerqueue_add(&base->timerqueue, &alarm->node);
+	if (&alarm->node == timerqueue_getnext(&base->timerqueue)) {
+		hrtimer_try_to_cancel(&base->timer);
+		hrtimer_start(&base->timer, alarm->node.expires,
+				HRTIMER_MODE_ABS);
+	}
+}
+
+/*
+ * alarmtimer_remove - Removes an alarm timer from an alarm_base timerqueue
+ * @base: pointer to the base where the timer is running
+ * @alarm: pointer to alarm being removed
+ *
+ * Removes alarm from an alarm_base timerqueue and if necessary sets
+ * a new timer to run.
+ *
+ * Must hold base->lock when calling.
+ */
+static void alarmtimer_remove(struct alarm_base *base, struct alarm *alarm)
+{
+	struct timerqueue_node *next = timerqueue_getnext(&base->timerqueue);
+
+	timerqueue_del(&base->timerqueue, &alarm->node);
+	if (next == &alarm->node) {
+		hrtimer_try_to_cancel(&base->timer);
+		next = timerqueue_getnext(&base->timerqueue);
+		if (!next)
+			return;
+		hrtimer_start(&base->timer, next->expires, HRTIMER_MODE_ABS);
+	}
+}
+
+/*
+ * alarmtimer_do_work - Handles alarm being fired.
+ * @work: pointer to the work item being run
+ *
+ * When a timer fires, this runs through the timerqueue to see
+ * which alarm timers have expired, and runs those. If there are
+ * more alarm timers queued, we set the hrtimer to fire in the
+ * future.
+ */
+void alarmtimer_do_work(struct work_struct *work)
+{
+	struct alarm_base *base = container_of(work, struct alarm_base,
+						irqwork);
+	struct timerqueue_node *next;
+	ktime_t now;
+
+	mutex_lock(&base->lock);
+	now = base->gettime();
+	while ((next = timerqueue_getnext(&base->timerqueue))) {
+		struct alarm *alarm;
+		ktime_t expired = next->expires;
+
+		if (expired.tv64 >= now.tv64)
+			break;
+
+		alarm = container_of(next, struct alarm, node);
+
+		timerqueue_del(&base->timerqueue, &alarm->node);
+		alarm->enabled = 0;
+		/* Re-add periodic timers */
+		if (alarm->period.tv64) {
+			alarm->node.expires = ktime_add(expired, alarm->period);
+			timerqueue_add(&base->timerqueue, &alarm->node);
+			alarm->enabled = 1;
+		}
+		mutex_unlock(&base->lock);
+		if (alarm->function)
+			alarm->function(alarm);
+		mutex_lock(&base->lock);
+	}
+
+	if (next) {
+		hrtimer_start(&base->timer, next->expires,
+				HRTIMER_MODE_ABS);
+	}
+	mutex_unlock(&base->lock);
+}
+
+
+/*
+ * alarmtimer_fired - Handles alarm hrtimer being fired.
+ * @timer: pointer to hrtimer being run
+ *
+ * When a timer fires, this schedules the do_work function to
+ * be run.
+ */
+static enum hrtimer_restart alarmtimer_fired(struct hrtimer *timer)
+{
+	struct alarm_base *base = container_of(timer, struct alarm_base, timer);
+	schedule_work(&base->irqwork);
+	return HRTIMER_NORESTART;
+}
+
+
+/*
+ * alarmtimer_suspend - Suspend time callback
+ * @dev: unused
+ * @state: unused
+ *
+ * When we are going into suspend, we look through the bases
+ * to see which is the soonest timer to expire. We then
+ * set an rtc timer to fire that far into the future, which
+ * will wake us from suspend.
+ */
+static int alarmtimer_suspend(struct device *dev)
+{
+	struct rtc_time tm;
+	ktime_t min, now;
+	unsigned long flags;
+	int i;
+
+	spin_lock_irqsave(&freezer_delta_lock, flags);
+	min = freezer_delta;
+	freezer_delta = ktime_set(0,0);
+	spin_unlock_irqrestore(&freezer_delta_lock, flags);
+
+	/* If we have no rtcdev, just return */
+	if (!rtcdev)
+		return 0;
+
+	/* Find the soonest timer to expire */
+	for (i = 0; i < ALARM_NUMTYPE; i++) {
+		struct alarm_base *base = &alarm_bases[i];
+		struct timerqueue_node *next;
+		ktime_t delta;
+
+		mutex_lock(&base->lock);
+		next = timerqueue_getnext(&base->timerqueue);
+		mutex_unlock(&base->lock);
+		if (!next)
+			continue;
+		delta = ktime_sub(next->expires, base->gettime());
+		if (!min.tv64 || (delta.tv64 < min.tv64))
+			min = delta;
+	}
+	if (min.tv64 == 0)
+		return 0;
+
+	/* XXX - Should we enforce a minimum sleep time? */
+	WARN_ON(min.tv64 < NSEC_PER_SEC);
+
+	/* Setup an rtc timer to fire that far in the future */
+	rtc_timer_cancel(rtcdev, &rtctimer);
+	rtc_read_time(rtcdev, &tm);
+	now = rtc_tm_to_ktime(tm);
+	now = ktime_add(now, min);
+
+	rtc_timer_start(rtcdev, &rtctimer, now, ktime_set(0, 0));
+
+	return 0;
+}
+
+
+static void alarmtimer_freezerset(ktime_t absexp, enum alarmtimer_type type)
+{
+	ktime_t delta;
+	unsigned long flags;
+	struct alarm_base *base = &alarm_bases[type];
+
+	delta = ktime_sub(absexp, base->gettime());
+
+	spin_lock_irqsave(&freezer_delta_lock, flags);
+	if (!freezer_delta.tv64 || (delta.tv64 < freezer_delta.tv64))
+		freezer_delta = delta;
+	spin_unlock_irqrestore(&freezer_delta_lock, flags);
+}
+
+
+/**************************************************************************
+ * alarm kernel interface code
+ */
+
+/*
+ * alarm_init - Initialize an alarm structure
+ * @alarm: ptr to alarm to be initialized
+ * @type: the type of the alarm
+ * @function: callback that is run when the alarm fires
+ *
+ * In-kernel interface to initialize the alarm structure.
+ */
+void alarm_init(struct alarm *alarm, enum alarmtimer_type type,
+		void (*function)(struct alarm *))
+{
+	timerqueue_init(&alarm->node);
+	alarm->period = ktime_set(0, 0);
+	alarm->function = function;
+	alarm->type = type;
+	alarm->enabled = 0;
+}
+
+/*
+ * alarm_start - Sets an alarm to fire
+ * @alarm: ptr to alarm to set
+ * @start: time to run the alarm
+ * @period: period at which the alarm will recur
+ *
+ * In-kernel interface to set an alarm timer.
+ */
+void alarm_start(struct alarm *alarm, ktime_t start, ktime_t period)
+{
+	struct alarm_base *base = &alarm_bases[alarm->type];
+
+	mutex_lock(&base->lock);
+	if (alarm->enabled)
+		alarmtimer_remove(base, alarm);
+	alarm->node.expires = start;
+	alarm->period = period;
+	alarmtimer_enqueue(base, alarm);
+	alarm->enabled = 1;
+	mutex_unlock(&base->lock);
+}
+
+/*
+ * alarm_cancel - Tries to cancel an alarm timer
+ * @alarm: ptr to alarm to be canceled
+ *
+ * In-kernel interface to cancel an alarm timer.
+ */
+void alarm_cancel(struct alarm *alarm)
+{
+	struct alarm_base *base = &alarm_bases[alarm->type];
+
+	mutex_lock(&base->lock);
+	if (alarm->enabled)
+		alarmtimer_remove(base, alarm);
+	alarm->enabled = 0;
+	mutex_unlock(&base->lock);
+}
+
+
+/**************************************************************************
+ * alarm posix interface code
+ */
+
+/*
+ * clock2alarm - helper that converts from clockid to alarmtypes
+ * @clockid: clockid.
+ *
+ * Helper function that converts from clockids to alarmtypes
+ */
+static enum alarmtimer_type clock2alarm(clockid_t clockid)
+{
+	if (clockid == CLOCK_REALTIME_ALARM)
+		return ALARM_REALTIME;
+	if (clockid == CLOCK_BOOTTIME_ALARM)
+		return ALARM_BOOTTIME;
+	return -1;
+}
+
+/*
+ * alarm_handle_timer - Callback for posix timers
+ * @alarm: alarm that fired
+ *
+ * Posix timer callback for expired alarm timers.
+ */
+static void alarm_handle_timer(struct alarm *alarm)
+{
+	struct k_itimer *ptr = container_of(alarm, struct k_itimer,
+						it.alarmtimer);
+	if (posix_timer_event(ptr, 0) != 0)
+		ptr->it_overrun++;
+}
+
+/*
+ * alarm_clock_getres - posix getres interface
+ * @which_clock: clockid
+ * @tp: timespec to fill
+ *
+ * Returns the granularity of underlying alarm base clock
+ */
+static int alarm_clock_getres(const clockid_t which_clock, struct timespec *tp)
+{
+	clockid_t baseid = alarm_bases[clock2alarm(which_clock)].base_clockid;
+
+	return hrtimer_get_res(baseid, tp);
+}
+
+/**
+ * alarm_clock_get - posix clock_get interface
+ * @which_clock: clockid
+ * @tp: timespec to fill.
+ *
+ * Provides the underlying alarm base time.
+ */
+static int alarm_clock_get(clockid_t which_clock, struct timespec *tp)
+{
+	struct alarm_base *base = &alarm_bases[clock2alarm(which_clock)];
+
+	*tp = ktime_to_timespec(base->gettime());
+	return 0;
+}
+
+/**
+ * alarm_timer_create - posix timer_create interface
+ * @new_timer: k_itimer pointer to manage
+ *
+ * Initializes the k_itimer structure.
+ */
+static int alarm_timer_create(struct k_itimer *new_timer)
+{
+	enum  alarmtimer_type type;
+	struct alarm_base *base;
+
+	type = clock2alarm(new_timer->it_clock);
+	base = &alarm_bases[type];
+	alarm_init(&new_timer->it.alarmtimer, type, alarm_handle_timer);
+	return 0;
+}
+
+/**
+ * alarm_timer_get - posix timer_get interface
+ * @timr: k_itimer pointer
+ * @cur_setting: itimerspec data to fill
+ *
+ * Copies the itimerspec data out from the k_itimer
+ */
+static void alarm_timer_get(struct k_itimer *timr,
+				struct itimerspec *cur_setting)
+{
+	cur_setting->it_interval =
+			ktime_to_timespec(timr->it.alarmtimer.period);
+	cur_setting->it_value =
+			ktime_to_timespec(timr->it.alarmtimer.node.expires);
+	return;
+}
+
+/**
+ * alarm_timer_del - posix timer_del interface
+ * @timr: k_itimer pointer to be deleted
+ *
+ * Cancels any programmed alarms for the given timer.
+ */
+static int alarm_timer_del(struct k_itimer *timr)
+{
+	alarm_cancel(&timr->it.alarmtimer);
+	return 0;
+}
+
+/**
+ * alarm_timer_set - posix timer_set interface
+ * @timr: k_itimer pointer to be set
+ * @flags: timer flags
+ * @new_setting: itimerspec to be used
+ * @old_setting: itimerspec being replaced
+ *
+ * Sets the timer to new_setting, and starts the timer.
+ */
+static int alarm_timer_set(struct k_itimer *timr, int flags,
+				struct itimerspec *new_setting,
+				struct itimerspec *old_setting)
+{
+	/* Save old values */
+	old_setting->it_interval =
+			ktime_to_timespec(timr->it.alarmtimer.period);
+	old_setting->it_value =
+			ktime_to_timespec(timr->it.alarmtimer.node.expires);
+
+	/* If the timer was already set, cancel it */
+	alarm_cancel(&timr->it.alarmtimer);
+
+	/* start the timer */
+	alarm_start(&timr->it.alarmtimer,
+			timespec_to_ktime(new_setting->it_value),
+			timespec_to_ktime(new_setting->it_interval));
+	return 0;
+}
+
+/**
+ * alarmtimer_nsleep_wakeup - Wakeup function for alarm_timer_nsleep
+ * @alarm: ptr to alarm that fired
+ *
+ * Wakes up the task that set the alarmtimer
+ */
+static void alarmtimer_nsleep_wakeup(struct alarm *alarm)
+{
+	struct task_struct *task = (struct task_struct *)alarm->data;
+
+	alarm->data = NULL;
+	if (task)
+		wake_up_process(task);
+}
+
+/**
+ * alarmtimer_do_nsleep - Internal alarmtimer nsleep implementation
+ * @alarm: ptr to alarmtimer
+ * @absexp: absolute expiration time
+ *
+ * Sets the alarm timer and sleeps until it is fired or interrupted.
+ */
+static int alarmtimer_do_nsleep(struct alarm *alarm, ktime_t absexp)
+{
+	alarm->data = (void *)current;
+	do {
+		set_current_state(TASK_INTERRUPTIBLE);
+		alarm_start(alarm, absexp, ktime_set(0, 0));
+		if (likely(alarm->data))
+			schedule();
+
+		alarm_cancel(alarm);
+	} while (alarm->data && !signal_pending(current));
+
+	__set_current_state(TASK_RUNNING);
+
+	return (alarm->data == NULL);
+}
+
+
+/**
+ * update_rmtp - Update remaining timespec value
+ * @exp: expiration time
+ * @type: timer type
+ * @rmtp: user pointer to remaining timespec value
+ *
+ * Helper function that fills in rmtp value with time between
+ * now and the exp value
+ */
+static int update_rmtp(ktime_t exp, enum  alarmtimer_type type,
+			struct timespec __user *rmtp)
+{
+	struct timespec rmt;
+	ktime_t rem;
+
+	rem = ktime_sub(exp, alarm_bases[type].gettime());
+
+	if (rem.tv64 <= 0)
+		return 0;
+	rmt = ktime_to_timespec(rem);
+
+	if (copy_to_user(rmtp, &rmt, sizeof(*rmtp)))
+		return -EFAULT;
+
+	return 1;
+
+}
+
+/**
+ * alarm_timer_nsleep_restart - restartblock alarmtimer nsleep
+ * @restart: ptr to restart block
+ *
+ * Handles restarted clock_nanosleep calls
+ */
+static long __sched alarm_timer_nsleep_restart(struct restart_block *restart)
+{
+	enum  alarmtimer_type type = restart->nanosleep.index;
+	ktime_t exp;
+	struct timespec __user  *rmtp;
+	struct alarm alarm;
+	int ret = 0;
+
+	exp.tv64 = restart->nanosleep.expires;
+	alarm_init(&alarm, type, alarmtimer_nsleep_wakeup);
+
+	if (alarmtimer_do_nsleep(&alarm, exp))
+		goto out;
+
+	if (freezing(current)) {
+		alarmtimer_freezerset(exp, type);
+	}
+
+	rmtp = restart->nanosleep.rmtp;
+	if (rmtp) {
+		ret = update_rmtp(exp, type, rmtp);
+		if (ret <= 0)
+			goto out;
+	}
+
+
+	/* The other values in restart are already filled in */
+	ret = -ERESTART_RESTARTBLOCK;
+out:
+	return ret;
+}
+
+/**
+ * alarm_timer_nsleep - alarmtimer nanosleep
+ * @which_clock: clockid
+ * @flags: determines absolute or relative time
+ * @tsreq: requested sleep time (abs or rel)
+ * @rmtp: remaining sleep time saved
+ *
+ * Handles clock_nanosleep calls against _ALARM clockids
+ */
+static int alarm_timer_nsleep(const clockid_t which_clock, int flags,
+		     struct timespec *tsreq, struct timespec __user *rmtp)
+{
+	enum  alarmtimer_type type = clock2alarm(which_clock);
+	struct alarm alarm;
+	ktime_t exp;
+	int ret = 0;
+	struct restart_block *restart;
+
+	alarm_init(&alarm, type, alarmtimer_nsleep_wakeup);
+
+	exp = timespec_to_ktime(*tsreq);
+	/* Convert (if necessary) to absolute time */
+	if (flags != TIMER_ABSTIME) {
+		ktime_t now = alarm_bases[type].gettime();
+		exp = ktime_add(now, exp);
+	}
+
+	if (alarmtimer_do_nsleep(&alarm, exp))
+		goto out;
+
+	if (freezing(current)) {
+		alarmtimer_freezerset(exp, type);
+	}
+
+	/* abs timers don't set remaining time or restart */
+	if (flags == TIMER_ABSTIME) {
+		ret = -ERESTARTNOHAND;
+		goto out;
+	}
+
+	if (rmtp) {
+		ret = update_rmtp(exp, type, rmtp);
+		if (ret <= 0)
+			goto out;
+	}
+
+	restart = &current_thread_info()->restart_block;
+	restart->fn = alarm_timer_nsleep_restart;
+	restart->nanosleep.index = type;
+	restart->nanosleep.expires = exp.tv64;
+	restart->nanosleep.rmtp = rmtp;
+	ret = -ERESTART_RESTARTBLOCK;
+
+out:
+	return ret;
+}
+
+/**************************************************************************
+ * alarmtimer initialization code
+ */
+
+/* Suspend hook structures */
+static const struct dev_pm_ops alarmtimer_pm_ops = {
+	.suspend = alarmtimer_suspend,
+};
+
+static struct platform_driver alarmtimer_driver = {
+	.driver = {
+		.name = "alarmtimer",
+		.pm = &alarmtimer_pm_ops,
+	}
+};
+
+/**
+ * alarmtimer_init - Initialize alarm timer code
+ *
+ * This function initializes the alarm bases and registers
+ * the posix clock ids.
+ */
+static int __init alarmtimer_init(void)
+{
+	int error = 0;
+	int i;
+	struct k_clock alarm_clock = {
+		.clock_getres = alarm_clock_getres,
+		.clock_get = alarm_clock_get,
+		.clock_set = do_posix_clock_nosettime,
+		.timer_create = alarm_timer_create,
+		.timer_set = alarm_timer_set,
+		.timer_del = alarm_timer_del,
+		.timer_get = alarm_timer_get,
+		.nsleep = alarm_timer_nsleep,
+	};
+
+	register_posix_clock(CLOCK_REALTIME_ALARM, &alarm_clock);
+	register_posix_clock(CLOCK_BOOTTIME_ALARM, &alarm_clock);
+
+	/* Initialize alarm bases */
+	alarm_bases[ALARM_REALTIME].base_clockid = CLOCK_REALTIME;
+	alarm_bases[ALARM_REALTIME].gettime = &ktime_get_real;
+	alarm_bases[ALARM_BOOTTIME].base_clockid = CLOCK_BOOTTIME;
+	alarm_bases[ALARM_BOOTTIME].gettime = &ktime_get_boottime;
+	for (i = 0; i < ALARM_NUMTYPE; i++) {
+		timerqueue_init_head(&alarm_bases[i].timerqueue);
+		mutex_init(&alarm_bases[i].lock);
+		hrtimer_init(&alarm_bases[i].timer,
+				alarm_bases[i].base_clockid,
+				HRTIMER_MODE_ABS);
+		alarm_bases[i].timer.function = alarmtimer_fired;
+		INIT_WORK(&alarm_bases[i].irqwork, alarmtimer_do_work);
+	}
+	error = platform_driver_register(&alarmtimer_driver);
+	platform_device_register_simple("alarmtimer", -1, NULL, 0);
+
+	return error;
+}
+device_initcall(alarmtimer_init);
+
+/**
+ * has_wakealarm - check rtc device has wakealarm ability
+ * @dev: current device
+ * @name_ptr: name to be returned
+ *
+ * This helper function checks to see if the rtc device can wake
+ * from suspend.
+ */
+static int __init has_wakealarm(struct device *dev, void *name_ptr)
+{
+	struct rtc_device *candidate = to_rtc_device(dev);
+
+	if (!candidate->ops->set_alarm)
+		return 0;
+	if (!device_may_wakeup(candidate->dev.parent))
+		return 0;
+
+	*(const char **)name_ptr = dev_name(dev);
+	return 1;
+}
+
+/**
+ * alarmtimer_init_late - Late initializing of alarmtimer code
+ *
+ * This function locates a rtc device to use for wakealarms.
+ * Run as late_initcall to make sure rtc devices have been
+ * registered.
+ */
+static int __init alarmtimer_init_late(void)
+{
+	char *str = NULL;
+
+	/* Find an rtc device and init the rtc_timer */
+	class_find_device(rtc_class, NULL, &str, has_wakealarm);
+	if (str)
+		rtcdev = rtc_class_open(str);
+	if (!rtcdev) {
+		printk(KERN_WARNING "No RTC device found, ALARM timers will"
+			" not wake from suspend\n");
+	}
+	rtc_timer_init(&rtctimer, NULL, NULL);
+
+	return 0;
+}
+late_initcall(alarmtimer_init_late);
-- 
1.7.3.2.146.gca209



* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-06  2:15 ` [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers John Stultz
@ 2011-01-06  4:07   ` Arve Hjønnevåg
  2011-01-06 18:04     ` John Stultz
  2011-01-11 18:56     ` John Stultz
  0 siblings, 2 replies; 22+ messages in thread
From: Arve Hjønnevåg @ 2011-01-06  4:07 UTC (permalink / raw)
  To: John Stultz
  Cc: linux-kernel, Brian Swetland, Thomas Gleixner, Alessandro Zummo

On Wed, Jan 5, 2011 at 6:15 PM, John Stultz <john.stultz@linaro.org> wrote:
> This patch introduces a new type of hybrid timer, called
> alarm-timers.
>
> Alarm-timers are similar to hrtimers, but when a system
> is suspended, the RTC device is set to fire when the soonest
> alarm timer expires.
>
> Alarm-timers are exposed to userland via the posix clock and timers
> interface, using two new clockids: CLOCK_REALTIME_ALARM and
> CLOCK_BOOTTIME_ALARM. Both clockids behave identically to
> CLOCK_REALTIME and CLOCK_BOOTTIME, respectively, but timers
> set against the _ALARM suffixed clockids will wake the system if
> it is suspended.
>
> The concept for Alarm-timers was inspired by the Android Alarm
> driver (by Arve Hjønnevåg) found in the Android kernel tree.
>
> See: http://android.git.kernel.org/?p=kernel/common.git;a=blob;f=drivers/rtc/alarm.c;h=1250edfbdf3302f5e4ea6194847c6ef4bb7beb1c;hb=android-2.6.36
>
> The android alarm driver was built on top of direct RTC
> manipulation, and also implemented its own rbtree timer
> code, so instead of trying to port it over to my
> timerlist and RTC timer code, I just re-implemented the
> basic functionality roughly following the android in-kernel
> interface.
>
> Another distinction is that while the in-kernel interface
> is pretty similar, the user-space interface for android alarm
> timers is via ioctls. As mentioned above, I've instead chosen
> to export this functionality via the posix interface, as it
> seemed a little simpler and avoids creating duplicate interfaces
> to things like CLOCK_REALTIME and CLOCK_MONOTONIC under alternate
> names (ie:ANDROID_ALARM_RTC and ANDROID_ALARM_SYSTEMTIME).
>
> It's possible that I've missed some subtleties of the Android
> alarm driver interface, and that some of the interface decisions
> I've made may not allow Android to use this interface directly.
> I'd be very interested if those details could be pointed out,
> and hopefully we can find a good solution to get this useful
> functionality upstream.
>

I don't know how suited the posix interface is for this, but I think
it is critical to prevent suspend while an alarm is pending. If an
alarm is important enough to wake the system up from suspend, it is
probably not safe to suspend right after it triggered. The android
alarm driver holds a wakelock until user-space calls back in to wait
for the next alarm, while in-kernel alarms are called from interrupt
context. The apis provided in include/linux/pm_wakeup.h should provide
the functionality you need to prevent suspend until the alarms have
been fully processed, but I have not tried this api yet.

It would also be useful to still allow in-kernel alarms to be
activated from atomic context (we currently do this in a couple of
drivers to avoid using a second wakelock).

-- 
Arve Hjønnevåg


* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-06  4:07   ` Arve Hjønnevåg
@ 2011-01-06 18:04     ` John Stultz
  2011-01-07  0:58       ` Arve Hjønnevåg
  2011-01-11 18:56     ` John Stultz
  1 sibling, 1 reply; 22+ messages in thread
From: John Stultz @ 2011-01-06 18:04 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: linux-kernel, Brian Swetland, Thomas Gleixner, Alessandro Zummo

On Wed, 2011-01-05 at 20:07 -0800, Arve Hjønnevåg wrote:
> On Wed, Jan 5, 2011 at 6:15 PM, John Stultz <john.stultz@linaro.org> wrote:
> > It's possible that I've missed some subtleties of the Android
> > alarm driver interface, and that some of the interface decisions
> > I've made may not allow Android to use this interface directly.
> > I'd be very interested if those details could be pointed out,
> > and hopefully we can find a good solution to get this useful
> > functionality upstream.
> >
> 
> I don't know how suited the posix interface is for this, but I think
> it is critical to prevent suspend while an alarm is pending. If an
> alarm is important enough to wake the system up from suspend, it is
> probably not safe to suspend right after it triggered. The android
> alarm driver holds a wakelock until user-space calls back in to wait
> for the next alarm, while in-kernel alarms are called from interrupt

Hrm. I was hoping to avoid wakelock discussions for now. What happens if
an app sets a single alarm and then never calls back in? I assume
closing the device drops the wakelock?

> context. The apis provided in include/linux/pm_wakeup.h should provide
> the functionality you need to prevent suspend until the alarms have
> been fully processed, but I have not tried this api yet.

Ok. I'll have to check out the pm_wakeup.h api and see if it can be
used.

> It would also be useful to still allow in-kernel alarms to be
> activated from atomic context (we currently do this in a couple of
> drivers to avoid using a second wakelock).

This is useful. I think I was being overly cautious using a mutex
instead of a spinlock for the base lock since I was worried about
calling into the RTC code, which requires mutexes, but we only do that at
suspend, so it should be ok to use a spinlock there. I'll revise and add
that in.

So otherwise, do you see any reason why android might not be able to
adapt this code to replace the android alarm timers?

thanks
-john




* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-06 18:04     ` John Stultz
@ 2011-01-07  0:58       ` Arve Hjønnevåg
  2011-01-07  2:30         ` John Stultz
  0 siblings, 1 reply; 22+ messages in thread
From: Arve Hjønnevåg @ 2011-01-07  0:58 UTC (permalink / raw)
  To: John Stultz
  Cc: linux-kernel, Brian Swetland, Thomas Gleixner, Alessandro Zummo

2011/1/6 John Stultz <john.stultz@linaro.org>:
> On Wed, 2011-01-05 at 20:07 -0800, Arve Hjønnevåg wrote:
>> On Wed, Jan 5, 2011 at 6:15 PM, John Stultz <john.stultz@linaro.org> wrote:
>> > It's possible that I've missed some subtleties of the Android
>> > alarm driver interface, and that some of the interface decisions
>> > I've made may not allow Android to use this interface directly.
>> > I'd be very interested if those details could be pointed out,
>> > and hopefully we can find a good solution to get this useful
>> > functionality upstream.
>> >
>>
>> I don't know how suited the posix interface is for this, but I think
>> it is critical to prevent suspend while an alarm is pending. If an
>> alarm is important enough to wake the system up from suspend, it is
>> probably not safe to suspend right after it triggered. The android
>> alarm driver holds a wakelock until user-space calls back in to wait
>> for the next alarm, while in-kernel alarms are called from interrupt
>
> Hrm. I was hoping to avoid wakelock discussions for now. What happens if
> an app sets a single alarm and then never calls back in? I assume
> closing the device drops the wakelock?
>

Yes, but the current driver only supports a single client, so this
only happens when the system_server crashes, not when an app exits.

>> context. The apis provided in include/linux/pm_wakeup.h should provide
>> the functionality you need to prevent suspend until the alarms have
>> been fully processed, but I have not tried this api yet.
>
> Ok. I'll have to check out the pm_wakeup.h api and see if it can be
> used.
>
>> It would also be useful to still allow in-kernel alarms to be
>> activated from atomic context (we currently do this in a couple of
>> drivers to avoid using a second wakelock).
>
> This is useful. I think I was being overly cautious using a mutex
> instead of a spinlock for the base lock since I was worried about
> calling into the RTC code which require mutexes, but we only do that at
> suspend, so it should be ok to use a spinlock there. I'll revise and add
> that in.
>
> So otherwise, do you see any reason why android might not be able to
> adapt this code to replace the android alarm timers?
>

The user-space interface does not look appealing, but I don't see any
reason why the in-kernel interface(s) cannot be shared. Our user-space
code has a single thread that waits for alarms to trigger, while the
alarms can be modified from any thread. As far as I can tell, using
the posix interface would either require a thread per alarm (up to 5)
or using signals. Both make the user-space code more complicated, and
it is not clear if either of them provides a clear hand-off between
where the kernel needs to block suspend because the alarm has not been
delivered to user-space and where user-space needs to block suspend
because it is handling the alarm.
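
The "single thread waits, any thread may modify the alarms" pattern described above can be sketched in plain user-space C. This is only an illustration, assuming pthreads and CLOCK_REALTIME as stand-ins; it is not the Android driver's ioctl interface, it does not wake the system from suspend, and it does nothing about wakelocks. The names alarm_table, alarm_set() and alarm_wait() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define NUM_ALARMS 5  /* "up to 5" alarm slots, as mentioned in the message */

/* Hypothetical shared table: one lock, one condition variable. */
struct alarm_table {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool            armed[NUM_ALARMS];
    struct timespec expires[NUM_ALARMS];  /* absolute CLOCK_REALTIME */
};

static struct alarm_table table = {
    .lock    = PTHREAD_MUTEX_INITIALIZER,
    .changed = PTHREAD_COND_INITIALIZER,
};

static bool ts_before(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec < b->tv_sec ||
           (a->tv_sec == b->tv_sec && a->tv_nsec < b->tv_nsec);
}

/* Callable from any thread: (re)arm one slot and wake the waiter so it
 * can recompute its deadline. */
void alarm_set(int slot, struct timespec when)
{
    assert(slot >= 0 && slot < NUM_ALARMS);
    pthread_mutex_lock(&table.lock);
    table.armed[slot] = true;
    table.expires[slot] = when;
    pthread_cond_signal(&table.changed);
    pthread_mutex_unlock(&table.lock);
}

/* Caller holds table.lock. Returns the earliest armed slot, or -1. */
static int earliest_slot(void)
{
    int best = -1;
    for (int i = 0; i < NUM_ALARMS; i++) {
        if (!table.armed[i])
            continue;
        if (best < 0 || ts_before(&table.expires[i], &table.expires[best]))
            best = i;
    }
    return best;
}

/* The single waiter: blocks until the earliest armed alarm has expired,
 * disarms it and returns its slot number. */
int alarm_wait(void)
{
    int fired = -1;

    pthread_mutex_lock(&table.lock);
    while (fired < 0) {
        int slot = earliest_slot();
        if (slot < 0) {
            /* Nothing armed: sleep until some thread calls alarm_set(). */
            pthread_cond_wait(&table.changed, &table.lock);
            continue;
        }
        struct timespec now;
        clock_gettime(CLOCK_REALTIME, &now);
        if (!ts_before(&now, &table.expires[slot])) {
            table.armed[slot] = false;
            fired = slot;
        } else {
            /* Sleep until the deadline or until the table changes. */
            struct timespec deadline = table.expires[slot];
            pthread_cond_timedwait(&table.changed, &table.lock, &deadline);
        }
    }
    pthread_mutex_unlock(&table.lock);
    return fired;
}
```

One thread loops on alarm_wait() while any other thread calls alarm_set(); the waiter re-evaluates the earliest deadline each time it is signalled, which is the "nanosleep() that other threads can extend or shorten" behaviour asked about in the reply.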

-- 
Arve Hjønnevåg


* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-07  0:58       ` Arve Hjønnevåg
@ 2011-01-07  2:30         ` John Stultz
  2011-01-08 10:36           ` Rafael J. Wysocki
  0 siblings, 1 reply; 22+ messages in thread
From: John Stultz @ 2011-01-07  2:30 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: linux-kernel, Brian Swetland, Thomas Gleixner, Alessandro Zummo, rjw

On Thu, 2011-01-06 at 16:58 -0800, Arve Hjønnevåg wrote:
> 2011/1/6 John Stultz <john.stultz@linaro.org>:
> > So otherwise, do you see any reason why android might not be able to
> > adapt this code to replace the android alarm timers?
> >
> 
> The user-space interface does not look appealing, but I don't see any
> reason why the in-kernel interface(s) cannot be shared. Our user-space
> code has a single thread that waits for alarms to trigger, while the
> alarms can be modified from any thread.

So it's something like nanosleep(), only other threads can extend or
shorten the sleep time?

Could you explain some of the rationale for such an interface, so I can
better understand the need?

>  As far as I can tell, using
> the posix interface would either require a thread per alarm (up to 5)
> or using signals. Both make the user-space code more complicated, and

Yea, it probably would need signals, but I'd have to grok the use case a
little better. And it's possible it would complicate the user-space code
some, but on the other hand, it would be using a more standard kernel
interface. The other option is extending the posix interface to try to
better match the need.

> it is not clear if either of them provide a clear hand-off between
> where the kernel needs to block suspend because the alarm has not been
> delivered to user-space and where user-space needs to block suspend
> because it is handling the alarm.

Indeed. I'm still looking into the pm_wake details to see the
limitations there. Some method of inheriting a stay_awake seems to be
needed, but sounds pretty ugly. Alternatively we may need some method or
callback to the kernel to detect that a signal has been handled by
userland (allowing the pm_relax to occur).

Rafael: Any thoughts here?

thanks
-john




* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-07  2:30         ` John Stultz
@ 2011-01-08 10:36           ` Rafael J. Wysocki
  2011-01-13  5:03             ` Arve Hjønnevåg
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2011-01-08 10:36 UTC (permalink / raw)
  To: John Stultz
  Cc: Arve Hjønnevåg, linux-kernel, Brian Swetland,
	Thomas Gleixner, Alessandro Zummo, Linux-pm mailing list

On Friday, January 07, 2011, John Stultz wrote:
> On Thu, 2011-01-06 at 16:58 -0800, Arve Hjønnevåg wrote:
> > 2011/1/6 John Stultz <john.stultz@linaro.org>:
> > > So otherwise, do you see any reason why android might not be able to
> > > adapt this code to replace the android alarm timers?
> > >
> > 
> > The user-space interface does not look appealing, but I don't see any
> > reason why the in-kernel interface(s) cannot be shared. Our user-space
> > code has a single thread that waits for alarms to trigger, while the
> > alarms can be modified from any thread.
> 
> So its something like nanosleep(), only other threads can extend or
> shorten the sleep time?
> 
> Could you explain some of the rational for such an interface, so I can
> better understand the need?
> 
> >  As far as I can tell, using
> > the posix interface would either require a thread per alarm (up to 5)
> > or using signals. Both make the user-space code more complicated, and
> 
> Yea, it probably would need signals, but I'd have to grok the use case a
> little better. And its possible it would complicate the user-space code
> some, but on the other hand, it would be using a more standard kernel
> interface. The other option is extending the posix interface to try to
> better match the need.
> 
> > it is not clear if either of them provide a clear hand-off between
> > where the kernel needs to block suspend because the alarm has not been
> > delivered to user-space and where user-space needs to block suspend
> > because it is handling the alarm.
> 
> Indeed. I'm still looking into the pm_wake details to see the
> limitations there. Some method of inheriting a stay_awake seems to be
> needed, but sounds pretty ugly. Alternatively we may need some method or
> callback to the kernel to detect that a signal has been handled by
> userland (allowing the pm_relax to occur).
> 
> Rafael: Any thoughts here?

I think this problem is specific to Android where suspend is started
automatically from kernel space, so user space needs an interface to actively
prevent the kernel from starting suspend.

The mainline model is that suspend will always be started from user space,
so instead of telling the kernel not to suspend user space needs to avoid
starting suspend in the first place.  In this model the kernel code can simply
call pm_relax() as soon as _it_ doesn't need to prevent the system from
suspending any more (eg. it knows that user space has learnt of the alarm) and
it need not worry about the user space part (eg. whether or not user space
is still handling the alarm). 
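
As a toy illustration of the division of labour described above, the following user-space C model tracks only what the kernel side would need to know. It is a sketch under stated assumptions, not the kernel's implementation: struct wakeup_model and the model_*() helpers are hypothetical names, and the "snapshot" check is loosely modelled on the idea of user space confirming that no new wakeup event has arrived before it starts suspend.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of kernel-side wakeup accounting. */
struct wakeup_model {
    unsigned int in_progress;  /* events the kernel is still processing */
    unsigned int total;        /* events reported so far */
};

/* Kernel side: an alarm fired and must not be lost to suspend
 * (stands in for a pm_stay_awake()-style call). */
void model_stay_awake(struct wakeup_model *m)
{
    m->in_progress++;
    m->total++;
}

/* Kernel side: user space has learnt of the alarm, so the kernel no
 * longer needs to hold off suspend (stands in for pm_relax()). */
void model_relax(struct wakeup_model *m)
{
    assert(m->in_progress > 0);
    m->in_progress--;
}

/* User side: suspend is only started if nothing is in flight and no
 * event has arrived since user space last looked ("snapshot"). */
bool model_try_suspend(const struct wakeup_model *m, unsigned int snapshot)
{
    return m->in_progress == 0 && m->total == snapshot;
}
```

In this model the kernel never tracks whether user space has finished handling the alarm; user space simply refrains from calling model_try_suspend() until it is done.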

Thanks,
Rafael


* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-06  4:07   ` Arve Hjønnevåg
  2011-01-06 18:04     ` John Stultz
@ 2011-01-11 18:56     ` John Stultz
  2011-01-11 19:22       ` Anca Emanuel
  2011-01-13  4:50       ` Arve Hjønnevåg
  1 sibling, 2 replies; 22+ messages in thread
From: John Stultz @ 2011-01-11 18:56 UTC (permalink / raw)
  To: Arve Hjønnevåg, Rafael J. Wysocki
  Cc: linux-kernel, Brian Swetland, Thomas Gleixner, Alessandro Zummo

On Wed, 2011-01-05 at 20:07 -0800, Arve Hjønnevåg wrote:
> I don't know how suited the posix interface is for this, but I think
> it is critical to prevent suspend while an alarm is pending. If an
> alarm is important enough to wake the system up from suspend, it is
> probably not safe to suspend right after it triggered. The android
> alarm driver holds a wakelock until user-space calls back in to wait
> for the next alarm, while in-kernel alarms are called from interrupt
> context. The apis provided in include/linux/pm_wakeup.h should provide
> the functionality you need to prevent suspend until the alarms have
> been fully processed, but I have not tried this api yet.

So again, I was really hoping to avoid wading into the wakelocks
discussion. However, I'm hesitant to push the posix alarm timers
interface into the kernel if it is insufficient to replace the android
alarm driver.  Wakelocks are not upstream, so they shouldn't block
upstream progress, but I don't want to create an interface that ends up
being short sighted if some wakelock-like solution were to later be
included upstream.

So into the water I slowly wade.

I've been thinking about Arve's example above. The part that concerns me
the most is the implicit suspend blocker that is acquired by the kernel
when the alarm fires in order to inhibit suspend during the user-space
processing until the process calls back into the alarm device.

I was considering various ideas, like a special signal that tells
userland that it holds a wakelock and is responsible for dropping it. Or
some sort of callback when signal handling is complete by userland
allowing userland to grab its own lock and let the kernel drop its held
lock.

But in my mind, it seems it would be cleaner if the userland application
did something to mark itself as inhibiting suspend. Then if it was to
block waiting on something like an alarm timer, the kernel would drop
the suspend blocker. Then when the alarm timer fires, the kernel would
re-acquire the suspend-blocker for the process when waking it up (the
kernel may do its own suspend inhibition internally as well - but there
wouldn't be any cross kernel/userland implicit lock passing). This is
sort of like SCHED_FIFO 99 style semantics, where a realtime process
won't be preempted unless it explicitly blocks.

I realize this might be more complicated, as suspend inhibition might be
desirable while a process is blocked, such as waiting on the disk, or
blocking on non-alarm triggering timers (although that seems wasteful).
But it seems that any blocking on devices that trigger wakeups would be
a fine time for us to drop the suspend blocker, as we know we will be
woken up after that point.

Arve: Would something like the above resolve the issue you brought up? I
realize Android might not be eager to convert to some new semantics,
(nor the upstream kernel be eager to start using optimistic suspend),
but should that day come, do you think such a solution would be
sufficient?
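
To make the proposed semantics concrete, here is a toy state machine in user-space C. It is only a sketch of the idea in this mail, with hypothetical names (task_model, model_block_on_alarm(), model_alarm_fired()); no such kernel interface exists in this patchset.

```c
#include <assert.h>
#include <stdbool.h>

enum model_state { MODEL_RUNNING, MODEL_BLOCKED_ON_ALARM };

/* Hypothetical per-task state for the proposed semantics. */
struct task_model {
    bool wants_inhibit;        /* the task marked itself as inhibiting suspend */
    enum model_state state;
};

/* Suspend is inhibited only while a marked task is actually running. */
bool model_suspend_allowed(const struct task_model *t)
{
    return !(t->wants_inhibit && t->state == MODEL_RUNNING);
}

/* The task blocks waiting on an alarm timer: the blocker is dropped,
 * since the alarm itself will wake the system. */
void model_block_on_alarm(struct task_model *t)
{
    assert(t->state == MODEL_RUNNING);
    t->state = MODEL_BLOCKED_ON_ALARM;
}

/* The alarm fires: the task is woken and the blocker is re-acquired on
 * its behalf, with no explicit lock hand-off across the boundary. */
void model_alarm_fired(struct task_model *t)
{
    assert(t->state == MODEL_BLOCKED_ON_ALARM);
    t->state = MODEL_RUNNING;
}
```

The point of the model is that the inhibit flag is a property of the task and is set once; whether suspend is actually blocked follows from the task's run state rather than from a lock passed between kernel and userland.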

thanks
-john




* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-11 18:56     ` John Stultz
@ 2011-01-11 19:22       ` Anca Emanuel
  2011-01-13  4:50       ` Arve Hjønnevåg
  1 sibling, 0 replies; 22+ messages in thread
From: Anca Emanuel @ 2011-01-11 19:22 UTC (permalink / raw)
  To: John Stultz
  Cc: Arve Hjønnevåg, Rafael J. Wysocki, linux-kernel,
	Brian Swetland, Thomas Gleixner, Alessandro Zummo

2011/1/11 John Stultz <john.stultz@linaro.org>:
> On Wed, 2011-01-05 at 20:07 -0800, Arve Hjønnevåg wrote:
>> I don't know how suited the posix interface is for this, but I think
>> it is critical to prevent suspend while an alarm is pending. If an
>> alarm is important enough to wake the system up from suspend, it is
>> probably not safe to suspend right after it triggered. The android
>> alarm driver holds a wakelock until user-space calls back in to wait
>> for the next alarm, while in-kernel alarms are called from interrupt
>> context. The apis provided in include/linux/pm_wakeup.h should provide
>> the functionality you need to prevent suspend until the alarms have
>> been fully processed, but I have not tried this api yet.
>
> So again, I was really hoping to avoid wading into the wakelocks
> discussion. However, I'm hesitant to push the posix alarm timers
> interface into the kernel if it is insufficient to replace the android
> alarm driver.  Wakelocks are not upstream, so they shouldn't block
> upstream progress, but I don't want to create an interface that ends up
> being short sighted if some wakelock-like solution were to later be
> included upstream.
>
> So into the water i slowly wade.
>
> I've been thinking about Arve's example above. The part that concerns me
> the most is the implicit suspend blocker that is acquired by the kernel
> when the alarm fires in order to inhibit suspend during the user-space
> processing until the process calls back into the alarm device.
>
> I was considering various ideas, like a special signal that tells
> userland that it holds a wakelock and is responsible for dropping it. Or
> some sort of callback when signal handling is complete by userland
> allowing userland to grab its own lock and let the kernel drop its held
> lock.
>
> But in my mind, it seems it would be cleaner if the userland application
> did something to mark itself as inhibiting suspend. Then if it was to
> block waiting on something like an alarm timer, the kernel would drop
> the suspend blocker. Then when the alarm timer fires, the kernel would
> re-aquire the suspend-blocker for the process when waking it up (the
> kernel may do its own suspend inhibition internally as well - but there
> wouldn't be any cross kernel/userland implicit lock passing). This is
> sort of like SCHED_FIFO 99 style semantics, where a realtime process
> won't be preempted unless it explicitly blocks.
>
> I realize this might be more complicated, as suspend inhibition might be
> desirable while a process is blocked, such as waiting on the disk, or
> blocking on non-alarm triggering timers (although that seems wasteful).
> But it seems that any blocking on devices that trigger wakeups would be
> fine time for us to drop suspend blocker, as we know we will be woken up
> after that point.
>
> Arve: Would something like the above resolve the issue you brought up? I
> realize Android might not be eager to convert to some new semantics,
> (nor the upstream kernel be eager to start using optimistic suspend),
> but should that day come, do you think such a solution would be
> sufficient?
>
> thanks
> -john
>

This is off-topic, but my system is set to suspend after 60 minutes,
and the screen after 5 minutes.
I move my mouse to avoid the suspend.
Isn't that the most trivial form of suspend blocker?
If I watch a movie on my PC, I want the movie app to be a suspend
blocker, because I want to watch it until it finishes.
But what if I walk away? Then some sensor should tell it to pause the
movie and suspend.


* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-11 18:56     ` John Stultz
  2011-01-11 19:22       ` Anca Emanuel
@ 2011-01-13  4:50       ` Arve Hjønnevåg
  1 sibling, 0 replies; 22+ messages in thread
From: Arve Hjønnevåg @ 2011-01-13  4:50 UTC (permalink / raw)
  To: John Stultz
  Cc: Rafael J. Wysocki, linux-kernel, Brian Swetland, Thomas Gleixner,
	Alessandro Zummo

2011/1/11 John Stultz <john.stultz@linaro.org>:
> On Wed, 2011-01-05 at 20:07 -0800, Arve Hjønnevåg wrote:
>> I don't know how suited the posix interface is for this, but I think
>> it is critical to prevent suspend while an alarm is pending. If an
>> alarm is important enough to wake the system up from suspend, it is
>> probably not safe to suspend right after it triggered. The android
>> alarm driver holds a wakelock until user-space calls back in to wait
>> for the next alarm, while in-kernel alarms are called from interrupt
>> context. The apis provided in include/linux/pm_wakeup.h should provide
>> the functionality you need to prevent suspend until the alarms have
>> been fully processed, but I have not tried this api yet.
>
> So again, I was really hoping to avoid wading into the wakelocks
> discussion. However, I'm hesitant to push the posix alarm timers
> interface into the kernel if it is insufficient to replace the android
> alarm driver.  Wakelocks are not upstream, so they shouldn't block
> upstream progress, but I don't want to create an interface that ends up
> being short sighted if some wakelock-like solution were to later be
> included upstream.

It is my understanding that a wakelock-like solution is already
upstream, but I have not tried it yet.

>
> So into the water i slowly wade.
>
> I've been thinking about Arve's example above. The part that concerns me
> the most is the implicit suspend blocker that is acquired by the kernel
> when the alarm fires in order to inhibit suspend during the user-space
> processing until the process calls back into the alarm device.
>
> I was considering various ideas, like a special signal that tells
> userland that it holds a wakelock and is responsible for dropping it. Or
> some sort of callback when signal handling is complete by userland
> allowing userland to grab its own lock and let the kernel drop its held
> lock.
>
> But in my mind, it seems it would be cleaner if the userland application
> did something to mark itself as inhibiting suspend. Then if it was to
> block waiting on something like an alarm timer, the kernel would drop
> the suspend blocker. Then when the alarm timer fires, the kernel would
> re-aquire the suspend-blocker for the process when waking it up (the
> kernel may do its own suspend inhibition internally as well - but there
> wouldn't be any cross kernel/userland implicit lock passing). This is
> sort of like SCHED_FIFO 99 style semantics, where a realtime process
> won't be preempted unless it explicitly blocks.
>
> I realize this might be more complicated, as suspend inhibition might be
> desirable while a process is blocked, such as waiting on the disk, or
> blocking on non-alarm triggering timers (although that seems wasteful).
> But it seems that any blocking on devices that trigger wakeups would be
> fine time for us to drop suspend blocker, as we know we will be woken up
> after that point.
>
> Arve: Would something like the above resolve the issue you brought up? I
> realize Android might not be eager to convert to some new semantics,
> (nor the upstream kernel be eager to start using optimistic suspend),
> but should that day come, do you think such a solution would be
> sufficient?
>

The android alarm driver does in practice block suspend whenever the
user-space thread that waits for alarm is not blocked in the ioctl
(which seems similar to what you describe above), but I think the model we
use for input events may be better for a generic alarm interface since
it is more flexible. For input events we require user-space to use
select() or poll() to wait for events to become available, and
user-space must then acquire a wakelock before reading the event. The
kernel then only needs to block suspend if the event has not been
read.
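
The ordering described above (wait with poll(), take a wakelock, then read) can be sketched in user-space C as follows. This is a minimal illustration under assumptions: the wakelock is replaced by a plain counter, wakelock_acquire() and wakelock_release() are hypothetical stand-ins for whatever suspend-blocking call user space really has, and any readable file descriptor stands in for the event source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Stand-in for a user-space wakelock: just a counter in this sketch. */
static int wakelock_held;
static int held_during_read;  /* recorded so the ordering can be checked */

static void wakelock_acquire(void) { wakelock_held++; }

static void wakelock_release(void)
{
    assert(wakelock_held > 0);
    wakelock_held--;
}

/* Waits up to timeout_ms for an event on fd and consumes it.
 * Returns bytes read (> 0), 0 on timeout, -1 on error. */
ssize_t wait_and_handle_event(int fd, int timeout_ms, char *buf, size_t len)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    /* While we sleep here no user-space lock is held; the kernel is
     * responsible for blocking suspend as long as an event is unread. */
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;
    if (!(pfd.revents & POLLIN))
        return -1;

    /* Take over responsibility before consuming the event, so there is
     * no window in which neither side blocks suspend. */
    wakelock_acquire();
    held_during_read = wakelock_held;
    ssize_t n = read(fd, buf, len);
    /* ... handle the event here ... */
    wakelock_release();
    return n;
}
```

Because the lock is taken after poll() reports readiness but before read(), the kernel's obligation ends exactly when the event is read, which matches the hand-off rule stated above.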

-- 
Arve Hjønnevåg


* Re: [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers
  2011-01-08 10:36           ` Rafael J. Wysocki
@ 2011-01-13  5:03             ` Arve Hjønnevåg
  0 siblings, 0 replies; 22+ messages in thread
From: Arve Hjønnevåg @ 2011-01-13  5:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, linux-kernel, Brian Swetland, Thomas Gleixner,
	Alessandro Zummo, Linux-pm mailing list

On Sat, Jan 8, 2011 at 2:36 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, January 07, 2011, John Stultz wrote:
>> On Thu, 2011-01-06 at 16:58 -0800, Arve Hjønnevåg wrote:
>> > 2011/1/6 John Stultz <john.stultz@linaro.org>:
>> > > So otherwise, do you see any reason why android might not be able to
>> > > adapt this code to replace the android alarm timers?
>> > >
>> >
>> > The user-space interface does not look appealing, but I don't see any
>> > reason why the in-kernel interface(s) cannot be shared. Our user-space
>> > code has a single thread that waits for alarms to trigger, while the
>> > alarms can be modified from any thread.
>>
>> So its something like nanosleep(), only other threads can extend or
>> shorten the sleep time?
>>
>> Could you explain some of the rational for such an interface, so I can
>> better understand the need?
>>
>> >  As far as I can tell, using
>> > the posix interface would either require a thread per alarm (up to 5)
>> > or using signals. Both make the user-space code more complicated, and
>>
>> Yea, it probably would need signals, but I'd have to grok the use case a
>> little better. And its possible it would complicate the user-space code
>> some, but on the other hand, it would be using a more standard kernel
>> interface. The other option is extending the posix interface to try to
>> better match the need.
>>
>> > it is not clear if either of them provide a clear hand-off between
>> > where the kernel needs to block suspend because the alarm has not been
>> > delivered to user-space and where user-space needs to block suspend
>> > because it is handling the alarm.
>>
>> Indeed. I'm still looking into the pm_wake details to see the
>> limitations there. Some method of inheriting a stay_awake seems to be
>> needed, but sounds pretty ugly. Alternatively we may need some method or
>> callback to the kernel to detect that a signal has been handled by
>> userland (allowing the pm_relax to occur).
>>
>> Rafael: Any thoughts here?
>
> I think this problem is specific to Android where suspend is started
> automatically from kernel space, so user space needs an interface to actively
> prevent the kernel from starting suspend.
>
> The mainline model is that suspend will always be started from user space,
> so instead of telling the kernel not to suspend user space needs to avoid
> starting suspend in the first place.  In this model the kernel code can simply
> call pm_relax() as soon as _it_ doesn't need to prevent the system from
> suspending any more (eg. it knows that user space has learnt of the alarm) and
> it need not worry about the user space part (eg. whether or not user space
> is still handling the alarm).
>

You still have to make sure a race-free implementation is possible. If
you are implementing alarms by calling nanosleep(), your model requires
the thread sleeping in nanosleep() to also respond to requests from the
thread that initiates suspend when that thread checks if it is safe to
suspend.

-- 
Arve Hjønnevåg


end of thread, other threads:[~2011-01-13  5:09 UTC | newest]

Thread overview: 22+ messages
2011-01-06  2:15 [PATCH 00/12] Posix Alarm Timers full patchset John Stultz
2011-01-06  2:15 ` [PATCH 01/12] timers: Introduce timerlist infrastructure John Stultz
2011-01-06  2:15 ` [PATCH 02/12] timers: Rename timerlist infrastructure to timerqueue John Stultz
2011-01-06  2:15 ` [PATCH 03/12] timers: Fixup allmodconfig build issue John Stultz
2011-01-06  2:15 ` [PATCH 04/12] hrtimers: Convert hrtimers to use timerlist infrastructure John Stultz
2011-01-06  2:15 ` [PATCH 05/12] hrtimer: fix timerqueue conversion flub John Stultz
2011-01-06  2:15 ` [PATCH 06/12] RTC: Rework RTC code to use timerqueue for events John Stultz
2011-01-06  2:15 ` [PATCH 07/12] RTC: Remove UIE emulation John Stultz
2011-01-06  2:15 ` [PATCH 08/12] rtc: Namespace fixup John Stultz
2011-01-06  2:15 ` [PATCH 09/12] [RFC] hrtimers: extend hrtimer base code to handle more then 2 clockids John Stultz
2011-01-06  2:15 ` [PATCH 10/12] [RFC] hrtimers: Add CLOCK_BOOTTIME clockid, hrtimerbase and posix interface John Stultz
2011-01-06  2:15 ` [PATCH 11/12] timers: Add rb_init_node() to allow for stack allocated rb nodes John Stultz
2011-01-06  2:15 ` [PATCH 12/12] [RFC] Introduce Alarm (hybrid) timers John Stultz
2011-01-06  4:07   ` Arve Hjønnevåg
2011-01-06 18:04     ` John Stultz
2011-01-07  0:58       ` Arve Hjønnevåg
2011-01-07  2:30         ` John Stultz
2011-01-08 10:36           ` Rafael J. Wysocki
2011-01-13  5:03             ` Arve Hjønnevåg
2011-01-11 18:56     ` John Stultz
2011-01-11 19:22       ` Anca Emanuel
2011-01-13  4:50       ` Arve Hjønnevåg
