LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
@ 2015-03-12 19:13 Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 01/39] gpio: omap: use raw locks for locking Steven Rostedt
                   ` (41 more replies)
  0 siblings, 42 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker


Dear RT Folks,

This is the RT stable review cycle of patch 3.14.34-rt32-rc1.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 3/16/2015.

Enjoy,

-- Steve


To build 3.14.34-rt32-rc1 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.14.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.14.34.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.14/patch-3.14.34-rt32-rc1.patch.xz

You can also build from 3.14.34-rt31 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/3.14/incr/patch-3.14.34-rt31-rt32-rc1.patch.xz


Changes from 3.14.34-rt31:

---


Brad Mouring (1):
      rtmutex.c: Fix incorrect waiter check

Daniel Wagner (2):
      work-simple: Simple work queue implemenation
      thermal: Defer thermal wakups to threads

Gustavo Bittencourt (1):
      rtmutex: enable deadlock detection in ww_mutex_lock functions

Josh Cartwright (1):
      lockdep: selftest: fix warnings due to missing PREEMPT_RT conditionals

Mike Galbraith (5):
      rt,locking: fix __ww_mutex_lock_interruptible() lockdep annotation
      x86: UV: raw_spinlock conversion
      scheduling while atomic in cgroup code
      sunrpc: make svc_xprt_do_enqueue() use get_cpu_light()
      locking: ww_mutex: fix ww_mutex vs self-deadlock

Paul E. McKenney (4):
      timers: Track total number of timers in list
      timers: Reduce __run_timers() latency for empty list
      timers: Reduce future __run_timers() latency for newly emptied list
      timers: Reduce future __run_timers() latency for first add to empty list

Paul Gortmaker (1):
      sas-ata/isci: dont't disable interrupts in qc_issue handler

Sebastian Andrzej Siewior (5):
      gpio: omap: use raw locks for locking
      locking/rt-mutex: avoid a NULL pointer dereference on deadlock
      arm/futex: disable preemption during futex_atomic_cmpxchg_inatomic()
      Revert "rwsem-rt: Do not allow readers to nest"
      fs/aio: simple simple work

Steven Rostedt (Red Hat) (2):
      staging: Mark rtl8821ae as broken
      Linux 3.14.34-rt32-rc1

Thomas Gleixner (14):
      rtmutex: Simplify rtmutex_slowtrylock()
      rtmutex: Simplify and document try_to_take_rtmutex()
      rtmutex: No need to keep task ref for lock owner check
      rtmutex: Clarify the boost/deboost part
      rtmutex: Document pi chain walk
      rtmutex: Simplify remove_waiter()
      rtmutex: Confine deadlock logic to futex
      rtmutex: Cleanup deadlock detector debug logic
      rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
      futex: Make unlock_pi more robust
      futex: Use futex_top_waiter() in lookup_pi_state()
      futex: Split out the waiter check from lookup_pi_state()
      futex: Split out the first waiter attachment from lookup_pi_state()
      futex: Simplify futex_lock_pi_atomic() and make it more robust

Yadi.hu (1):
      ARM: enable irq in translation/section permission fault handlers

Yang Shi (1):
      mips: rt: Replace pagefault_* to raw version

Yong Zhang (1):
      ARM: cmpxchg: define __HAVE_ARCH_CMPXCHG for armv6 and later

----
 arch/arm/include/asm/cmpxchg.h         |   2 +
 arch/arm/include/asm/futex.h           |   4 +
 arch/arm/mm/fault.c                    |   6 +
 arch/mips/mm/init.c                    |   4 +-
 arch/x86/include/asm/uv/uv_bau.h       |  14 +-
 arch/x86/include/asm/uv/uv_hub.h       |   2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c     |   2 +-
 arch/x86/platform/uv/tlb_uv.c          |  26 +-
 arch/x86/platform/uv/uv_time.c         |  21 +-
 drivers/gpio/gpio-omap.c               |  72 ++--
 drivers/scsi/libsas/sas_ata.c          |   4 +-
 drivers/staging/rtl8821ae/Kconfig      |   1 +
 drivers/thermal/x86_pkg_temp_thermal.c |  50 ++-
 fs/aio.c                               |  24 +-
 include/linux/rtmutex.h                |   8 +-
 include/linux/rwsem_rt.h               |   1 +
 include/linux/work-simple.h            |  24 ++
 kernel/futex.c                         | 402 +++++++++++------------
 kernel/locking/rt.c                    |  37 ++-
 kernel/locking/rtmutex-debug.c         |   5 +-
 kernel/locking/rtmutex-debug.h         |   7 +-
 kernel/locking/rtmutex-tester.c        |   4 +-
 kernel/locking/rtmutex.c               | 582 ++++++++++++++++++++++++---------
 kernel/locking/rtmutex.h               |   7 +-
 kernel/locking/rtmutex_common.h        |  22 +-
 kernel/sched/Makefile                  |   2 +-
 kernel/sched/work-simple.c             | 172 ++++++++++
 kernel/timer.c                         |  26 ++
 lib/locking-selftest.c                 |  27 ++
 localversion-rt                        |   2 +-
 mm/memcontrol.c                        |   7 +-
 net/sunrpc/svc_xprt.c                  |   4 +-
 32 files changed, 1089 insertions(+), 482 deletions(-)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 01/39] gpio: omap: use raw locks for locking
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 02/39] rtmutex: Simplify rtmutex_slowtrylock() Steven Rostedt
                   ` (40 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt

[-- Attachment #1: 0001-gpio-omap-use-raw-locks-for-locking.patch --]
[-- Type: text/plain, Size: 10579 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

This patch converts gpio_bank.lock from a spin_lock into a
raw_spin_lock. The call path is to access this lock is always under a
raw_spin_lock, for instance
- __setup_irq() holds &desc->lock with irq off
  + __irq_set_trigger()
   + omap_gpio_irq_type()

- handle_level_irq() (runs with irqs off therefore raw locks)
  + mask_ack_irq()
   + omap_gpio_mask_irq()

This fixes the obvious backtrace on -RT. However the locking vs context
is not and this is not limited to -RT:
- omap_gpio_irq_type() is called with IRQ off and has an conditional
  call to pm_runtime_get_sync() which may sleep. Either it may happen or
  it may not happen but pm_runtime_get_sync() should not be called with
  irqs off.

- omap_gpio_debounce() is holding the lock with IRQs off.
  + omap2_set_gpio_debounce()
   + clk_prepare_enable()
    + clk_prepare() this one might sleep.
  The number of users of gpiod_set_debounce() / gpio_set_debounce()
  looks low but still this is not good.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 drivers/gpio/gpio-omap.c | 72 ++++++++++++++++++++++++------------------------
 1 file changed, 36 insertions(+), 36 deletions(-)

diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c
index 424319061e09..0f057af22fb1 100644
--- a/drivers/gpio/gpio-omap.c
+++ b/drivers/gpio/gpio-omap.c
@@ -59,7 +59,7 @@ struct gpio_bank {
 	u32 saved_datain;
 	u32 level_mask;
 	u32 toggle_mask;
-	spinlock_t lock;
+	raw_spinlock_t lock;
 	struct gpio_chip chip;
 	struct clk *dbck;
 	u32 mod_usage;
@@ -503,14 +503,14 @@ static int gpio_irq_type(struct irq_data *d, unsigned type)
 		(type & (IRQ_TYPE_LEVEL_LOW|IRQ_TYPE_LEVEL_HIGH)))
 		return -EINVAL;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	offset = GPIO_INDEX(bank, gpio);
 	retval = _set_gpio_triggering(bank, offset, type);
 	if (!LINE_USED(bank->mod_usage, offset)) {
 		_enable_gpio_module(bank, offset);
 		_set_gpio_direction(bank, offset, 1);
 	} else if (!gpio_is_input(bank, 1 << offset)) {
-		spin_unlock_irqrestore(&bank->lock, flags);
+		raw_spin_unlock_irqrestore(&bank->lock, flags);
 		return -EINVAL;
 	}
 
@@ -518,12 +518,12 @@ static int gpio_irq_type(struct irq_data *d, unsigned type)
 	if (retval) {
 		dev_err(bank->dev, "unable to lock offset %d for IRQ\n",
 			offset);
-		spin_unlock_irqrestore(&bank->lock, flags);
+		raw_spin_unlock_irqrestore(&bank->lock, flags);
 		return retval;
 	}
 
 	bank->irq_usage |= 1 << GPIO_INDEX(bank, gpio);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	if (type & (IRQ_TYPE_LEVEL_LOW | IRQ_TYPE_LEVEL_HIGH))
 		__irq_set_handler_locked(d->irq, handle_level_irq);
@@ -640,14 +640,14 @@ static int _set_gpio_wakeup(struct gpio_bank *bank, int gpio, int enable)
 		return -EINVAL;
 	}
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	if (enable)
 		bank->context.wake_en |= gpio_bit;
 	else
 		bank->context.wake_en &= ~gpio_bit;
 
 	writel_relaxed(bank->context.wake_en, bank->base + bank->regs->wkup_en);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
@@ -682,7 +682,7 @@ static int omap_gpio_request(struct gpio_chip *chip, unsigned offset)
 	if (!BANK_USED(bank))
 		pm_runtime_get_sync(bank->dev);
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	/* Set trigger to none. You need to enable the desired trigger with
 	 * request_irq() or set_irq_type(). Only do this if the IRQ line has
 	 * not already been requested.
@@ -692,7 +692,7 @@ static int omap_gpio_request(struct gpio_chip *chip, unsigned offset)
 		_enable_gpio_module(bank, offset);
 	}
 	bank->mod_usage |= 1 << offset;
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
@@ -702,11 +702,11 @@ static void omap_gpio_free(struct gpio_chip *chip, unsigned offset)
 	struct gpio_bank *bank = container_of(chip, struct gpio_bank, chip);
 	unsigned long flags;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	bank->mod_usage &= ~(1 << offset);
 	_disable_gpio_module(bank, offset);
 	_reset_gpio(bank, bank->chip.base + offset);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	/*
 	 * If this is the last gpio to be freed in the bank,
@@ -804,12 +804,12 @@ static void gpio_irq_shutdown(struct irq_data *d)
 	unsigned long flags;
 	unsigned offset = GPIO_INDEX(bank, gpio);
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	gpio_unlock_as_irq(&bank->chip, offset);
 	bank->irq_usage &= ~(1 << offset);
 	_disable_gpio_module(bank, offset);
 	_reset_gpio(bank, gpio);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	/*
 	 * If this is the last IRQ to be freed in the bank,
@@ -833,10 +833,10 @@ static void gpio_mask_irq(struct irq_data *d)
 	unsigned int gpio = irq_to_gpio(bank, d->hwirq);
 	unsigned long flags;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	_set_gpio_irqenable(bank, gpio, 0);
 	_set_gpio_triggering(bank, GPIO_INDEX(bank, gpio), IRQ_TYPE_NONE);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 }
 
 static void gpio_unmask_irq(struct irq_data *d)
@@ -847,7 +847,7 @@ static void gpio_unmask_irq(struct irq_data *d)
 	u32 trigger = irqd_get_trigger_type(d);
 	unsigned long flags;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	if (trigger)
 		_set_gpio_triggering(bank, GPIO_INDEX(bank, gpio), trigger);
 
@@ -859,7 +859,7 @@ static void gpio_unmask_irq(struct irq_data *d)
 	}
 
 	_set_gpio_irqenable(bank, gpio, 1);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 }
 
 static struct irq_chip gpio_irq_chip = {
@@ -882,9 +882,9 @@ static int omap_mpuio_suspend_noirq(struct device *dev)
 					OMAP_MPUIO_GPIO_MASKIT / bank->stride;
 	unsigned long		flags;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	writel_relaxed(0xffff & ~bank->context.wake_en, mask_reg);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
@@ -897,9 +897,9 @@ static int omap_mpuio_resume_noirq(struct device *dev)
 					OMAP_MPUIO_GPIO_MASKIT / bank->stride;
 	unsigned long		flags;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	writel_relaxed(bank->context.wake_en, mask_reg);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
@@ -942,9 +942,9 @@ static int gpio_input(struct gpio_chip *chip, unsigned offset)
 	unsigned long flags;
 
 	bank = container_of(chip, struct gpio_bank, chip);
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	_set_gpio_direction(bank, offset, 1);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 	return 0;
 }
 
@@ -968,10 +968,10 @@ static int gpio_output(struct gpio_chip *chip, unsigned offset, int value)
 	unsigned long flags;
 
 	bank = container_of(chip, struct gpio_bank, chip);
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	bank->set_dataout(bank, offset, value);
 	_set_gpio_direction(bank, offset, 0);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 	return 0;
 }
 
@@ -983,9 +983,9 @@ static int gpio_debounce(struct gpio_chip *chip, unsigned offset,
 
 	bank = container_of(chip, struct gpio_bank, chip);
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	_set_gpio_debounce(bank, offset, debounce);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
@@ -996,9 +996,9 @@ static void gpio_set(struct gpio_chip *chip, unsigned offset, int value)
 	unsigned long flags;
 
 	bank = container_of(chip, struct gpio_bank, chip);
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 	bank->set_dataout(bank, offset, value);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 }
 
 /*---------------------------------------------------------------------*/
@@ -1210,7 +1210,7 @@ static int omap_gpio_probe(struct platform_device *pdev)
 	else
 		bank->set_dataout = _set_gpio_dataout_mask;
 
-	spin_lock_init(&bank->lock);
+	raw_spin_lock_init(&bank->lock);
 
 	/* Static mapping, never released */
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
@@ -1267,7 +1267,7 @@ static int omap_gpio_runtime_suspend(struct device *dev)
 	unsigned long flags;
 	u32 wake_low, wake_hi;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 
 	/*
 	 * Only edges can generate a wakeup event to the PRCM.
@@ -1320,7 +1320,7 @@ update_gpio_context_count:
 				bank->get_context_loss_count(bank->dev);
 
 	_gpio_dbck_disable(bank);
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
@@ -1335,7 +1335,7 @@ static int omap_gpio_runtime_resume(struct device *dev)
 	unsigned long flags;
 	int c;
 
-	spin_lock_irqsave(&bank->lock, flags);
+	raw_spin_lock_irqsave(&bank->lock, flags);
 
 	/*
 	 * On the first resume during the probe, the context has not
@@ -1371,14 +1371,14 @@ static int omap_gpio_runtime_resume(struct device *dev)
 			if (c != bank->context_loss_count) {
 				omap_gpio_restore_context(bank);
 			} else {
-				spin_unlock_irqrestore(&bank->lock, flags);
+				raw_spin_unlock_irqrestore(&bank->lock, flags);
 				return 0;
 			}
 		}
 	}
 
 	if (!bank->workaround_enabled) {
-		spin_unlock_irqrestore(&bank->lock, flags);
+		raw_spin_unlock_irqrestore(&bank->lock, flags);
 		return 0;
 	}
 
@@ -1433,7 +1433,7 @@ static int omap_gpio_runtime_resume(struct device *dev)
 	}
 
 	bank->workaround_enabled = false;
-	spin_unlock_irqrestore(&bank->lock, flags);
+	raw_spin_unlock_irqrestore(&bank->lock, flags);
 
 	return 0;
 }
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 02/39] rtmutex: Simplify rtmutex_slowtrylock()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 01/39] gpio: omap: use raw locks for locking Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 03/39] rtmutex: Simplify and document try_to_take_rtmutex() Steven Rostedt
                   ` (39 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Oleg Nesterov, Lai Jiangshan

[-- Attachment #1: 0002-rtmutex-Simplify-rtmutex_slowtrylock.patch --]
[-- Type: text/plain, Size: 2484 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream-commit: 88f2b4c15e561bb5c28709d666364f273bf54b98

Oleg noticed that rtmutex_slowtrylock() has a pointless check for
rt_mutex_owner(lock) != current.

To avoid calling try_to_take_rtmutex() we really want to check whether
the lock has an owner at all or whether the trylock failed because the
owner is NULL, but the RT_MUTEX_HAS_WAITERS bit is set. This covers
the lock is owned by caller situation as well.

We can actually do this check lockless. trylock is taking a chance
whether we take lock->wait_lock to do the check or not.

Add comments to the function while at it.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Conflicts:
	kernel/locking/rtmutex.c
---
 kernel/locking/rtmutex.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6c40660d1168..ccb3b8dbe6d4 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1465,23 +1465,32 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 /*
  * Slow path try-lock function:
  */
-static inline int
-rt_mutex_slowtrylock(struct rt_mutex *lock)
+static inline int rt_mutex_slowtrylock(struct rt_mutex *lock)
 {
-	int ret = 0;
+	int ret;
 
+	/*
+	 * If the lock already has an owner we fail to get the lock.
+	 * This can be done without taking the @lock->wait_lock as
+	 * it is only being read, and this is a trylock anyway.
+	 */
+	if (rt_mutex_owner(lock))
+		return 0;
+
+	/*
+	 * The mutex has currently no owner. Lock the wait lock and
+	 * try to acquire the lock.
+	 */
 	if (!raw_spin_trylock(&lock->wait_lock))
-		return ret;
+		return 0;
 
-	if (likely(rt_mutex_owner(lock) != current)) {
+	ret = try_to_take_rt_mutex(lock, current, NULL);
 
-		ret = try_to_take_rt_mutex(lock, current, NULL);
-		/*
-		 * try_to_take_rt_mutex() sets the lock waiters
-		 * bit unconditionally. Clean this up.
-		 */
-		fixup_rt_mutex_waiters(lock);
-	}
+	/*
+	 * try_to_take_rt_mutex() sets the lock waiters bit
+	 * unconditionally. Clean this up.
+	 */
+	fixup_rt_mutex_waiters(lock);
 
 	raw_spin_unlock(&lock->wait_lock);
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 03/39] rtmutex: Simplify and document try_to_take_rtmutex()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 01/39] gpio: omap: use raw locks for locking Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 02/39] rtmutex: Simplify rtmutex_slowtrylock() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 04/39] rtmutex: No need to keep task ref for lock owner check Steven Rostedt
                   ` (38 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker

[-- Attachment #1: 0003-rtmutex-Simplify-and-document-try_to_take_rtmutex.patch --]
[-- Type: text/plain, Size: 6390 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 358c331f391f3e0432f4f96f25017d12ac8d10b1

The current implementation of try_to_take_rtmutex() is correct, but
requires more than a single brain twist to understand the clever
encoded conditionals.

Untangle it and document the cases proper.

Looks less efficient at the first glance, but actually reduces the
binary code size on x8664 by 80 bytes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Conflicts:
	kernel/locking/rtmutex.c
---
 kernel/locking/rtmutex.c | 132 +++++++++++++++++++++++++++++++----------------
 1 file changed, 87 insertions(+), 45 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ccb3b8dbe6d4..e4cbd0cd5405 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -575,78 +575,120 @@ static inline int lock_is_stealable(struct task_struct *task,
  *
  * Must be called with lock->wait_lock held.
  *
- * @lock:   the lock to be acquired.
- * @task:   the task which wants to acquire the lock
- * @waiter: the waiter that is queued to the lock's wait list. (could be NULL)
+ * @lock:   The lock to be acquired.
+ * @task:   The task which wants to acquire the lock
+ * @waiter: The waiter that is queued to the lock's wait list if the
+ *	    callsite called task_blocked_on_lock(), otherwise NULL
  */
 static int
 __try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
 		       struct rt_mutex_waiter *waiter, int mode)
 {
+	unsigned long flags;
+
 	/*
-	 * We have to be careful here if the atomic speedups are
-	 * enabled, such that, when
-	 *  - no other waiter is on the lock
-	 *  - the lock has been released since we did the cmpxchg
-	 * the lock can be released or taken while we are doing the
-	 * checks and marking the lock with RT_MUTEX_HAS_WAITERS.
+	 * Before testing whether we can acquire @lock, we set the
+	 * RT_MUTEX_HAS_WAITERS bit in @lock->owner. This forces all
+	 * other tasks which try to modify @lock into the slow path
+	 * and they serialize on @lock->wait_lock.
 	 *
-	 * The atomic acquire/release aware variant of
-	 * mark_rt_mutex_waiters uses a cmpxchg loop. After setting
-	 * the WAITERS bit, the atomic release / acquire can not
-	 * happen anymore and lock->wait_lock protects us from the
-	 * non-atomic case.
+	 * The RT_MUTEX_HAS_WAITERS bit can have a transitional state
+	 * as explained at the top of this file if and only if:
 	 *
-	 * Note, that this might set lock->owner =
-	 * RT_MUTEX_HAS_WAITERS in the case the lock is not contended
-	 * any more. This is fixed up when we take the ownership.
-	 * This is the transitional state explained at the top of this file.
+	 * - There is a lock owner. The caller must fixup the
+	 *   transient state if it does a trylock or leaves the lock
+	 *   function due to a signal or timeout.
+	 *
+	 * - @task acquires the lock and there are no other
+	 *   waiters. This is undone in rt_mutex_set_owner(@task) at
+	 *   the end of this function.
 	 */
 	mark_rt_mutex_waiters(lock);
 
+	/*
+	 * If @lock has an owner, give up.
+	 */
 	if (rt_mutex_owner(lock))
 		return 0;
 
 	/*
-	 * It will get the lock because of one of these conditions:
-	 * 1) there is no waiter
-	 * 2) higher priority than waiters
-	 * 3) it is top waiter
+	 * If @waiter != NULL, @task has already enqueued the waiter
+	 * into @lock waiter list. If @waiter == NULL then this is a
+	 * trylock attempt.
 	 */
-	if (rt_mutex_has_waiters(lock)) {
-		struct task_struct *pown = rt_mutex_top_waiter(lock)->task;
-
-		if (task != pown && !lock_is_stealable(task, pown, mode))
+	if (waiter) {
+		/*
+		 * If waiter is not the highest priority waiter of
+		 * @lock, give up.
+		 */
+		if (waiter != rt_mutex_top_waiter(lock))
 			return 0;
-	}
-
-	/* We got the lock. */
-
-	if (waiter || rt_mutex_has_waiters(lock)) {
-		unsigned long flags;
-		struct rt_mutex_waiter *top;
 
-		raw_spin_lock_irqsave(&task->pi_lock, flags);
-
-		/* remove the queued waiter. */
-		if (waiter) {
-			rt_mutex_dequeue(lock, waiter);
-			task->pi_blocked_on = NULL;
-		}
+		/*
+		 * We can acquire the lock. Remove the waiter from the
+		 * lock waiters list.
+		 */
+		rt_mutex_dequeue(lock, waiter);
 
+	} else {
 		/*
-		 * We have to enqueue the top waiter(if it exists) into
-		 * task->pi_waiters list.
+		 * If the lock has waiters already we check whether @task is
+		 * eligible to take over the lock.
+		 *
+		 * If there are no other waiters, @task can acquire
+		 * the lock.  @task->pi_blocked_on is NULL, so it does
+		 * not need to be dequeued.
 		 */
 		if (rt_mutex_has_waiters(lock)) {
-			top = rt_mutex_top_waiter(lock);
-			rt_mutex_enqueue_pi(task, top);
+			/*
+			 * If @task->prio is greater than or equal to
+			 * the top waiter priority (kernel view),
+			 * @task lost.
+			 */
+			if (task->prio >= rt_mutex_top_waiter(lock)->prio)
+				return 0;
+
+			/*
+			 * The current top waiter stays enqueued. We
+			 * don't have to change anything in the lock
+			 * waiters order.
+			 */
+		} else {
+			/*
+			 * No waiters. Take the lock without the
+			 * pi_lock dance.@task->pi_blocked_on is NULL
+			 * and we have no waiters to enqueue in @task
+			 * pi waiters list.
+			 */
+			goto takeit;
 		}
-		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 	}
 
+	/*
+	 * Clear @task->pi_blocked_on. Requires protection by
+	 * @task->pi_lock. Redundant operation for the @waiter == NULL
+	 * case, but conditionals are more expensive than a redundant
+	 * store.
+	 */
+	raw_spin_lock_irqsave(&task->pi_lock, flags);
+	task->pi_blocked_on = NULL;
+	/*
+	 * Finish the lock acquisition. @task is the new owner. If
+	 * other waiters exist we have to insert the highest priority
+	 * waiter into @task->pi_waiters list.
+	 */
+	if (rt_mutex_has_waiters(lock))
+		rt_mutex_enqueue_pi(task, rt_mutex_top_waiter(lock));
+	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+
+takeit:
+	/* We got the lock. */
 	debug_rt_mutex_lock(lock);
 
+	/*
+	 * This either preserves the RT_MUTEX_HAS_WAITERS bit if there
+	 * are still waiters or clears it.
+	 */
 	rt_mutex_set_owner(lock, task);
 
 	rt_mutex_deadlock_account_lock(lock, task);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 04/39] rtmutex: No need to keep task ref for lock owner check
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (2 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 03/39] rtmutex: Simplify and document try_to_take_rtmutex() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 05/39] rtmutex: Clarify the boost/deboost part Steven Rostedt
                   ` (37 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Lai Jiangshan

[-- Attachment #1: 0004-rtmutex-No-need-to-keep-task-ref-for-lock-owner-chec.patch --]
[-- Type: text/plain, Size: 1456 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 2ffa5a5cd2fe792b6399c903d5172adf088d8ff7

There is no point to keep the task ref across the check for lock
owner. Drop the ref before that, so the protection context is clear.

Found while documenting the chain walk.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index e4cbd0cd5405..901a29345451 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -484,6 +484,8 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 
 	/* Release the task */
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+	put_task_struct(task);
+
 	if (!rt_mutex_owner(lock)) {
 		struct rt_mutex_waiter *lock_top_waiter;
 
@@ -495,9 +497,8 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		if (top_waiter != lock_top_waiter)
 			rt_mutex_wake_waiter(lock_top_waiter);
 		raw_spin_unlock(&lock->wait_lock);
-		goto out_put_task;
+		return 0;
 	}
-	put_task_struct(task);
 
 	/* Grab the next task */
 	task = rt_mutex_owner(lock);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 05/39] rtmutex: Clarify the boost/deboost part
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (3 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 04/39] rtmutex: No need to keep task ref for lock owner check Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 06/39] rtmutex: Document pi chain walk Steven Rostedt
                   ` (36 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker

[-- Attachment #1: 0005-rtmutex-Clarify-the-boost-deboost-part.patch --]
[-- Type: text/plain, Size: 4792 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: a57594a13a446d1a6ab1dcd48339f799ce586843

Add a separate local variable for the boost/deboost logic to make the
code more readable. Add comments where appropriate.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Conflicts:
	kernel/locking/rtmutex.c
---
 kernel/locking/rtmutex.c | 57 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 48 insertions(+), 9 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 901a29345451..85f5369e27f3 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -366,9 +366,10 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 				      struct rt_mutex_waiter *orig_waiter,
 				      struct task_struct *top_task)
 {
-	struct rt_mutex *lock;
 	struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
+	struct rt_mutex_waiter *prerequeue_top_waiter;
 	int detect_deadlock, ret = 0, depth = 0;
+	struct rt_mutex *lock;
 	unsigned long flags;
 
 	detect_deadlock = debug_rt_mutex_detect_deadlock(orig_waiter,
@@ -475,9 +476,14 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		goto out_unlock_pi;
 	}
 
-	top_waiter = rt_mutex_top_waiter(lock);
+	/*
+	 * Store the current top waiter before doing the requeue
+	 * operation on @lock. We need it for the boost/deboost
+	 * decision below.
+	 */
+	prerequeue_top_waiter = rt_mutex_top_waiter(lock);
 
-	/* Requeue the waiter */
+	/* Requeue the waiter in the lock waiter list. */
 	rt_mutex_dequeue(lock, waiter);
 	waiter->prio = task->prio;
 	rt_mutex_enqueue(lock, waiter);
@@ -486,6 +492,11 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 	put_task_struct(task);
 
+	/*
+	 * We must abort the chain walk if there is no lock owner even
+	 * in the dead lock detection case, as we have nothing to
+	 * follow here. This is the end of the chain we are walking.
+	 */
 	if (!rt_mutex_owner(lock)) {
 		struct rt_mutex_waiter *lock_top_waiter;
 
@@ -494,29 +505,48 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		 * to wake the new top waiter up to try to get the lock.
 		 */
 		lock_top_waiter = rt_mutex_top_waiter(lock);
-		if (top_waiter != lock_top_waiter)
+		if (prerequeue_top_waiter != lock_top_waiter)
 			rt_mutex_wake_waiter(lock_top_waiter);
 		raw_spin_unlock(&lock->wait_lock);
 		return 0;
 	}
 
-	/* Grab the next task */
+	/* Grab the next task, i.e. the owner of @lock */
 	task = rt_mutex_owner(lock);
 	get_task_struct(task);
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
 
 	if (waiter == rt_mutex_top_waiter(lock)) {
-		/* Boost the owner */
-		rt_mutex_dequeue_pi(task, top_waiter);
+		/*
+		 * The waiter became the new top (highest priority)
+		 * waiter on the lock. Replace the previous top waiter
+		 * in the owner tasks pi waiters list with this waiter
+		 * and adjust the priority of the owner.
+		 */
+		rt_mutex_dequeue_pi(task, prerequeue_top_waiter);
 		rt_mutex_enqueue_pi(task, waiter);
 		__rt_mutex_adjust_prio(task);
 
-	} else if (top_waiter == waiter) {
-		/* Deboost the owner */
+	} else if (prerequeue_top_waiter == waiter) {
+		/*
+		 * The waiter was the top waiter on the lock, but is
+		 * no longer the top prority waiter. Replace waiter in
+		 * the owner tasks pi waiters list with the new top
+		 * (highest priority) waiter and adjust the priority
+		 * of the owner.
+		 * The new top waiter is stored in @waiter so that
+		 * @waiter == @top_waiter evaluates to true below and
+		 * we continue to deboost the rest of the chain.
+		 */
 		rt_mutex_dequeue_pi(task, waiter);
 		waiter = rt_mutex_top_waiter(lock);
 		rt_mutex_enqueue_pi(task, waiter);
 		__rt_mutex_adjust_prio(task);
+	} else {
+		/*
+		 * Nothing changed. No need to do any priority
+		 * adjustment.
+		 */
 	}
 
 	/*
@@ -529,6 +559,10 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
+	/*
+	 * Store the top waiter of @lock for the end of chain walk
+	 * decision below.
+	 */
 	top_waiter = rt_mutex_top_waiter(lock);
 	raw_spin_unlock(&lock->wait_lock);
 
@@ -539,6 +573,11 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	if (!next_lock)
 		goto out_put_task;
 
+	/*
+	 * If the current waiter is not the top waiter on the lock,
+	 * then we can stop the chain walk here if we are not in full
+	 * deadlock detection mode.
+	 */
 	if (!detect_deadlock && waiter != top_waiter)
 		goto out_put_task;
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 06/39] rtmutex: Document pi chain walk
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (4 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 05/39] rtmutex: Clarify the boost/deboost part Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 07/39] rtmutex: Simplify remove_waiter() Steven Rostedt
                   ` (35 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker

[-- Attachment #1: 0006-rtmutex-Document-pi-chain-walk.patch --]
[-- Type: text/plain, Size: 7251 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 3eb65aeadf701976b084e9171e16bb7d1e83fbb0

Add commentry to document the chain walk and the protection mechanisms
and their scope.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 100 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 91 insertions(+), 9 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 85f5369e27f3..400026ec7f05 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -358,6 +358,48 @@ static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
  * @top_task:	the current top waiter
  *
  * Returns 0 or -EDEADLK.
+ *
+ * Chain walk basics and protection scope
+ *
+ * [R] refcount on task
+ * [P] task->pi_lock held
+ * [L] rtmutex->wait_lock held
+ *
+ * Step	Description				Protected by
+ *	function arguments:
+ *	@task					[R]
+ *	@orig_lock if != NULL			@top_task is blocked on it
+ *	@next_lock				Unprotected. Cannot be
+ *						dereferenced. Only used for
+ *						comparison.
+ *	@orig_waiter if != NULL			@top_task is blocked on it
+ *	@top_task				current, or in case of proxy
+ *						locking protected by calling
+ *						code
+ *	again:
+ *	  loop_sanity_check();
+ *	retry:
+ * [1]	  lock(task->pi_lock);			[R] acquire [P]
+ * [2]	  waiter = task->pi_blocked_on;		[P]
+ * [3]	  check_exit_conditions_1();		[P]
+ * [4]	  lock = waiter->lock;			[P]
+ * [5]	  if (!try_lock(lock->wait_lock)) {	[P] try to acquire [L]
+ *	    unlock(task->pi_lock);		release [P]
+ *	    goto retry;
+ *	  }
+ * [6]	  check_exit_conditions_2();		[P] + [L]
+ * [7]	  requeue_lock_waiter(lock, waiter);	[P] + [L]
+ * [8]	  unlock(task->pi_lock);		release [P]
+ *	  put_task_struct(task);		release [R]
+ * [9]	  check_exit_conditions_3();		[L]
+ * [10]	  task = owner(lock);			[L]
+ *	  get_task_struct(task);		[L] acquire [R]
+ *	  lock(task->pi_lock);			[L] acquire [P]
+ * [11]	  requeue_pi_waiter(tsk, waiters(lock));[P] + [L]
+ * [12]	  check_exit_conditions_4();		[P] + [L]
+ * [13]	  unlock(task->pi_lock);		release [P]
+ *	  unlock(lock->wait_lock);		release [L]
+ *	  goto again;
  */
 static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 				      int deadlock_detect,
@@ -382,6 +424,9 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	 * carefully whether things change under us.
 	 */
  again:
+	/*
+	 * We limit the lock chain length for each invocation.
+	 */
 	if (++depth > max_lock_depth) {
 		static int prev_max;
 
@@ -399,13 +444,28 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 
 		return -EDEADLK;
 	}
+
+	/*
+	 * We are fully preemptible here and only hold the refcount on
+	 * @task. So everything can have changed under us since the
+	 * caller or our own code below (goto retry/again) dropped all
+	 * locks.
+	 */
  retry:
 	/*
-	 * Task can not go away as we did a get_task() before !
+	 * [1] Task cannot go away as we did a get_task() before !
 	 */
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
 
+	/*
+	 * [2] Get the waiter on which @task is blocked on.
+	 */
 	waiter = task->pi_blocked_on;
+
+	/*
+	 * [3] check_exit_conditions_1() protected by task->pi_lock.
+	 */
+
 	/*
 	 * Check whether the end of the boosting chain has been
 	 * reached or the state of the chain has changed while we
@@ -456,7 +516,15 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	if (!detect_deadlock && waiter->prio == task->prio)
 		goto out_unlock_pi;
 
+	/*
+	 * [4] Get the next lock
+	 */
 	lock = waiter->lock;
+	/*
+	 * [5] We need to trylock here as we are holding task->pi_lock,
+	 * which is the reverse lock order versus the other rtmutex
+	 * operations.
+	 */
 	if (!raw_spin_trylock(&lock->wait_lock)) {
 		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 		cpu_relax();
@@ -464,6 +532,9 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	}
 
 	/*
+	 * [6] check_exit_conditions_2() protected by task->pi_lock and
+	 * lock->wait_lock.
+	 *
 	 * Deadlock detection. If the lock is the same as the original
 	 * lock which caused us to walk the lock chain or if the
 	 * current lock is owned by the task which initiated the chain
@@ -483,16 +554,18 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	 */
 	prerequeue_top_waiter = rt_mutex_top_waiter(lock);
 
-	/* Requeue the waiter in the lock waiter list. */
+	/* [7] Requeue the waiter in the lock waiter list. */
 	rt_mutex_dequeue(lock, waiter);
 	waiter->prio = task->prio;
 	rt_mutex_enqueue(lock, waiter);
 
-	/* Release the task */
+	/* [8] Release the task */
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 	put_task_struct(task);
 
 	/*
+	 * [9] check_exit_conditions_3 protected by lock->wait_lock.
+	 *
 	 * We must abort the chain walk if there is no lock owner even
 	 * in the dead lock detection case, as we have nothing to
 	 * follow here. This is the end of the chain we are walking.
@@ -501,8 +574,9 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		struct rt_mutex_waiter *lock_top_waiter;
 
 		/*
-		 * If the requeue above changed the top waiter, then we need
-		 * to wake the new top waiter up to try to get the lock.
+		 * If the requeue [7] above changed the top waiter,
+		 * then we need to wake the new top waiter up to try
+		 * to get the lock.
 		 */
 		lock_top_waiter = rt_mutex_top_waiter(lock);
 		if (prerequeue_top_waiter != lock_top_waiter)
@@ -511,11 +585,12 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		return 0;
 	}
 
-	/* Grab the next task, i.e. the owner of @lock */
+	/* [10] Grab the next task, i.e. the owner of @lock */
 	task = rt_mutex_owner(lock);
 	get_task_struct(task);
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
 
+	/* [11] requeue the pi waiters if necessary */
 	if (waiter == rt_mutex_top_waiter(lock)) {
 		/*
 		 * The waiter became the new top (highest priority)
@@ -550,23 +625,30 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	}
 
 	/*
+	 * [12] check_exit_conditions_4() protected by task->pi_lock
+	 * and lock->wait_lock. The actual decisions are made after we
+	 * dropped the locks.
+	 *
 	 * Check whether the task which owns the current lock is pi
 	 * blocked itself. If yes we store a pointer to the lock for
 	 * the lock chain change detection above. After we dropped
 	 * task->pi_lock next_lock cannot be dereferenced anymore.
 	 */
 	next_lock = task_blocked_on_lock(task);
-
-	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
-
 	/*
 	 * Store the top waiter of @lock for the end of chain walk
 	 * decision below.
 	 */
 	top_waiter = rt_mutex_top_waiter(lock);
+
+	/* [13] Drop the locks */
+	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 	raw_spin_unlock(&lock->wait_lock);
 
 	/*
+	 * Make the actual exit decisions [12], based on the stored
+	 * values.
+	 *
 	 * We reached the end of the lock chain. Stop right here. No
 	 * point to go back just to figure that out.
 	 */
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 07/39] rtmutex: Simplify remove_waiter()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (5 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 06/39] rtmutex: Document pi chain walk Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 08/39] rtmutex: Confine deadlock logic to futex Steven Rostedt
                   ` (34 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Lai Jiangshan

[-- Attachment #1: 0007-rtmutex-Simplify-remove_waiter.patch --]
[-- Type: text/plain, Size: 2674 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 1ca7b86062ec8473d03c5cdfd336abc8b1c8098c

Exit right away, when the removed waiter was not the top priority
waiter on the lock. Get rid of the extra indent level.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Conflicts:
	kernel/locking/rtmutex.c
---
 kernel/locking/rtmutex.c | 36 +++++++++++++++++++-----------------
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 400026ec7f05..6c74a0525dcb 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -985,7 +985,7 @@ static void wakeup_next_waiter(struct rt_mutex *lock)
 static void remove_waiter(struct rt_mutex *lock,
 			  struct rt_mutex_waiter *waiter)
 {
-	int first = (waiter == rt_mutex_top_waiter(lock));
+	bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex *next_lock = NULL;
 	unsigned long flags;
@@ -995,30 +995,32 @@ static void remove_waiter(struct rt_mutex *lock,
 	current->pi_blocked_on = NULL;
 	raw_spin_unlock_irqrestore(&current->pi_lock, flags);
 
-	if (!owner)
+	/*
+	 * Only update priority if the waiter was the highest priority
+	 * waiter of the lock and there is an owner to update.
+	 */
+	if (!owner || !is_top_waiter)
 		return;
 
-	if (first) {
+	raw_spin_lock_irqsave(&owner->pi_lock, flags);
 
-		raw_spin_lock_irqsave(&owner->pi_lock, flags);
+	rt_mutex_dequeue_pi(owner, waiter);
 
-		rt_mutex_dequeue_pi(owner, waiter);
+	if (rt_mutex_has_waiters(lock))
+		rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
 
-		if (rt_mutex_has_waiters(lock)) {
-			struct rt_mutex_waiter *next;
+	__rt_mutex_adjust_prio(owner);
 
-			next = rt_mutex_top_waiter(lock);
-			rt_mutex_enqueue_pi(owner, next);
-		}
-		__rt_mutex_adjust_prio(owner);
-
-		/* Store the lock on which owner is blocked or NULL */
-		if (rt_mutex_real_waiter(owner->pi_blocked_on))
-			next_lock = task_blocked_on_lock(owner);
+	/* Store the lock on which owner is blocked or NULL */
+	if (rt_mutex_real_waiter(owner->pi_blocked_on))
+		next_lock = task_blocked_on_lock(owner);
 
-		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
-	}
+	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 
+	/*
+	 * Don't walk the chain, if the owner task is not blocked
+	 * itself.
+	 */
 	if (!next_lock)
 		return;
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 08/39] rtmutex: Confine deadlock logic to futex
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (6 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 07/39] rtmutex: Simplify remove_waiter() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 09/39] rtmutex: Cleanup deadlock detector debug logic Steven Rostedt
                   ` (33 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker

[-- Attachment #1: 0008-rtmutex-Confine-deadlock-logic-to-futex.patch --]
[-- Type: text/plain, Size: 11433 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: c051b21f71d1ffdfd7ad406a1ef5ede5e5f974c5

The deadlock logic is only required for futexes.

Remove the extra arguments for the public functions and also for the
futex specific ones which get always called with deadlock detection
enabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Conflicts:
	include/linux/rtmutex.h
	kernel/locking/rtmutex.c
---
 include/linux/rtmutex.h         |  8 +++---
 kernel/futex.c                  | 10 ++++----
 kernel/locking/rt.c             |  8 +++---
 kernel/locking/rtmutex-tester.c |  4 +--
 kernel/locking/rtmutex.c        | 54 ++++++++++++++++++++---------------------
 kernel/locking/rtmutex_common.h |  7 +++---
 6 files changed, 44 insertions(+), 47 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index edb77fd15803..d5a04ea47a13 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -105,12 +105,10 @@ extern void __rt_mutex_init(struct rt_mutex *lock, const char *name);
 extern void rt_mutex_destroy(struct rt_mutex *lock);
 
 extern void rt_mutex_lock(struct rt_mutex *lock);
-extern int rt_mutex_lock_interruptible(struct rt_mutex *lock,
-						int detect_deadlock);
-extern int rt_mutex_lock_killable(struct rt_mutex *lock, int detect_deadlock);
+extern int rt_mutex_lock_interruptible(struct rt_mutex *lock);
+extern int rt_mutex_lock_killable(struct rt_mutex *lock);
 extern int rt_mutex_timed_lock(struct rt_mutex *lock,
-					struct hrtimer_sleeper *timeout,
-					int detect_deadlock);
+			       struct hrtimer_sleeper *timeout);
 
 extern int rt_mutex_trylock(struct rt_mutex *lock);
 
diff --git a/kernel/futex.c b/kernel/futex.c
index 5ba3c0fc4d19..d2349337d1a0 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1708,7 +1708,7 @@ retry_private:
 			this->pi_state = pi_state;
 			ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex,
 							this->rt_waiter,
-							this->task, 1);
+							this->task);
 			if (ret == 1) {
 				/* We got the lock. */
 				requeue_pi_wake_futex(this, &key2, hb2);
@@ -2337,9 +2337,9 @@ retry_private:
 	/*
 	 * Block on the PI mutex:
 	 */
-	if (!trylock)
-		ret = rt_mutex_timed_lock(&q.pi_state->pi_mutex, to, 1);
-	else {
+	if (!trylock) {
+		ret = rt_mutex_timed_futex_lock(&q.pi_state->pi_mutex, to);
+	} else {
 		ret = rt_mutex_trylock(&q.pi_state->pi_mutex);
 		/* Fixup the trylock return value: */
 		ret = ret ? 0 : -EWOULDBLOCK;
@@ -2703,7 +2703,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 		 */
 		WARN_ON(!q.pi_state);
 		pi_mutex = &q.pi_state->pi_mutex;
-		ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);
+		ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter);
 		debug_rt_mutex_free_waiter(&rt_waiter);
 
 		spin_lock(&hb2->lock);
diff --git a/kernel/locking/rt.c b/kernel/locking/rt.c
index 90b8ba03e2a4..eac2ddea9c45 100644
--- a/kernel/locking/rt.c
+++ b/kernel/locking/rt.c
@@ -98,7 +98,7 @@ int __lockfunc _mutex_lock_interruptible(struct mutex *lock)
 	int ret;
 
 	mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	ret = rt_mutex_lock_interruptible(&lock->lock, 0);
+	ret = rt_mutex_lock_interruptible(&lock->lock);
 	if (ret)
 		mutex_release(&lock->dep_map, 1, _RET_IP_);
 	return ret;
@@ -110,7 +110,7 @@ int __lockfunc _mutex_lock_killable(struct mutex *lock)
 	int ret;
 
 	mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-	ret = rt_mutex_lock_killable(&lock->lock, 0);
+	ret = rt_mutex_lock_killable(&lock->lock);
 	if (ret)
 		mutex_release(&lock->dep_map, 1, _RET_IP_);
 	return ret;
@@ -137,7 +137,7 @@ int __lockfunc _mutex_lock_interruptible_nested(struct mutex *lock, int subclass
 	int ret;
 
 	mutex_acquire_nest(&lock->dep_map, subclass, 0, NULL, _RET_IP_);
-	ret = rt_mutex_lock_interruptible(&lock->lock, 0);
+	ret = rt_mutex_lock_interruptible(&lock->lock);
 	if (ret)
 		mutex_release(&lock->dep_map, 1, _RET_IP_);
 	return ret;
@@ -149,7 +149,7 @@ int __lockfunc _mutex_lock_killable_nested(struct mutex *lock, int subclass)
 	int ret;
 
 	mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
-	ret = rt_mutex_lock_killable(&lock->lock, 0);
+	ret = rt_mutex_lock_killable(&lock->lock);
 	if (ret)
 		mutex_release(&lock->dep_map, 1, _RET_IP_);
 	return ret;
diff --git a/kernel/locking/rtmutex-tester.c b/kernel/locking/rtmutex-tester.c
index 1d96dd0d93c1..a6b605b51b97 100644
--- a/kernel/locking/rtmutex-tester.c
+++ b/kernel/locking/rtmutex-tester.c
@@ -16,7 +16,7 @@
 #include <linux/freezer.h>
 #include <linux/stat.h>
 
-#include "rtmutex.h"
+#include "rtmutex_common.h"
 
 #define MAX_RT_TEST_THREADS	8
 #define MAX_RT_TEST_MUTEXES	8
@@ -107,7 +107,7 @@ static int handle_op(struct test_thread_data *td, int lockwakeup)
 
 		td->mutexes[id] = 1;
 		td->event = atomic_add_return(1, &rttest_event);
-		ret = rt_mutex_lock_interruptible(&mutexes[id], 0);
+		ret = rt_mutex_lock_interruptible(&mutexes[id]);
 		td->event = atomic_add_return(1, &rttest_event);
 		td->mutexes[id] = ret ? 0 : 4;
 		return ret ? -EINTR : 0;
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6c74a0525dcb..ebfc02f151e3 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1734,17 +1734,17 @@ rt_mutex_slowunlock(struct rt_mutex *lock)
  */
 static inline int
 rt_mutex_fastlock(struct rt_mutex *lock, int state,
-		  int detect_deadlock, struct ww_acquire_ctx *ww_ctx,
+		  struct ww_acquire_ctx *ww_ctx,
 		  int (*slowfn)(struct rt_mutex *lock, int state,
 				struct hrtimer_sleeper *timeout,
 				int detect_deadlock,
 				struct ww_acquire_ctx *ww_ctx))
 {
-	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
+	if (likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
 	} else
-		return slowfn(lock, state, NULL, detect_deadlock, ww_ctx);
+		return slowfn(lock, state, NULL, 0, ww_ctx);
 }
 
 static inline int
@@ -1793,7 +1793,7 @@ void __sched rt_mutex_lock(struct rt_mutex *lock)
 {
 	might_sleep();
 
-	rt_mutex_fastlock(lock, TASK_UNINTERRUPTIBLE, 0, NULL, rt_mutex_slowlock);
+	rt_mutex_fastlock(lock, TASK_UNINTERRUPTIBLE, NULL, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_lock);
 
@@ -1801,41 +1801,47 @@ EXPORT_SYMBOL_GPL(rt_mutex_lock);
  * rt_mutex_lock_interruptible - lock a rt_mutex interruptible
  *
  * @lock:		the rt_mutex to be locked
- * @detect_deadlock:	deadlock detection on/off
  *
  * Returns:
  *  0		on success
  * -EINTR	when interrupted by a signal
- * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
  */
-int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock,
-						 int detect_deadlock)
+int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock)
 {
 	might_sleep();
 
-	return rt_mutex_fastlock(lock, TASK_INTERRUPTIBLE,
-				 detect_deadlock, NULL, rt_mutex_slowlock);
+	return rt_mutex_fastlock(lock, TASK_INTERRUPTIBLE, NULL, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
 
+/*
+ * Futex variant with full deadlock detection.
+ */
+int rt_mutex_timed_futex_lock(struct rt_mutex *lock,
+			      struct hrtimer_sleeper *timeout)
+{
+	might_sleep();
+
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 1,
+				       NULL, rt_mutex_slowlock);
+}
+
 /**
  * rt_mutex_lock_killable - lock a rt_mutex killable
  *
  * @lock:		the rt_mutex to be locked
- * @detect_deadlock:	deadlock detection on/off
  *
  * Returns:
  *  0		on success
  * -EINTR	when interrupted by a signal
  * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
  */
-int __sched rt_mutex_lock_killable(struct rt_mutex *lock,
-				   int detect_deadlock)
+int __sched rt_mutex_lock_killable(struct rt_mutex *lock)
 {
 	might_sleep();
 
 	return rt_mutex_fastlock(lock, TASK_KILLABLE,
-				 detect_deadlock, NULL, rt_mutex_slowlock);
+				 NULL, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_lock_killable);
 
@@ -1846,22 +1852,19 @@ EXPORT_SYMBOL_GPL(rt_mutex_lock_killable);
  *
  * @lock:		the rt_mutex to be locked
  * @timeout:		timeout structure or NULL (no timeout)
- * @detect_deadlock:	deadlock detection on/off
  *
  * Returns:
  *  0		on success
  * -EINTR	when interrupted by a signal
  * -ETIMEDOUT	when the timeout expired
- * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
  */
 int
-rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout,
-		    int detect_deadlock)
+rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
 {
 	might_sleep();
 
-	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
-				       detect_deadlock, NULL, rt_mutex_slowlock);
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
+				       NULL, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
 
@@ -1966,7 +1969,6 @@ void rt_mutex_proxy_unlock(struct rt_mutex *lock,
  * @lock:		the rt_mutex to take
  * @waiter:		the pre-initialized rt_mutex_waiter
  * @task:		the task to prepare
- * @detect_deadlock:	perform deadlock detection (1) or not (0)
  *
  * Returns:
  *  0 - task blocked on lock
@@ -1977,7 +1979,7 @@ void rt_mutex_proxy_unlock(struct rt_mutex *lock,
  */
 int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 			      struct rt_mutex_waiter *waiter,
-			      struct task_struct *task, int detect_deadlock)
+			      struct task_struct *task)
 {
 	int ret;
 
@@ -2064,22 +2066,20 @@ struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock)
  * rt_mutex_finish_proxy_lock() - Complete lock acquisition
  * @lock:		the rt_mutex we were woken on
  * @to:			the timeout, null if none. hrtimer should already have
- * 			been started.
+ *			been started.
  * @waiter:		the pre-initialized rt_mutex_waiter
- * @detect_deadlock:	perform deadlock detection (1) or not (0)
  *
  * Complete the lock acquisition started our behalf by another thread.
  *
  * Returns:
  *  0 - success
- * <0 - error, one of -EINTR, -ETIMEDOUT, or -EDEADLK
+ * <0 - error, one of -EINTR, -ETIMEDOUT
  *
  * Special API call for PI-futex requeue support
  */
 int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
 			       struct hrtimer_sleeper *to,
-			       struct rt_mutex_waiter *waiter,
-			       int detect_deadlock)
+			       struct rt_mutex_waiter *waiter)
 {
 	int ret;
 
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index ac636d37ec32..ec51460fa233 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -115,12 +115,11 @@ extern void rt_mutex_proxy_unlock(struct rt_mutex *lock,
 				  struct task_struct *proxy_owner);
 extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 				     struct rt_mutex_waiter *waiter,
-				     struct task_struct *task,
-				     int detect_deadlock);
+				     struct task_struct *task);
 extern int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
 				      struct hrtimer_sleeper *to,
-				      struct rt_mutex_waiter *waiter,
-				      int detect_deadlock);
+				      struct rt_mutex_waiter *waiter);
+extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
 
 #ifdef CONFIG_DEBUG_RT_MUTEXES
 # include "rtmutex-debug.h"
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 09/39] rtmutex: Cleanup deadlock detector debug logic
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (7 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 08/39] rtmutex: Confine deadlock logic to futex Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 10/39] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Steven Rostedt
                   ` (32 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Peter Zijlstra, Lai Jiangshan

[-- Attachment #1: 0009-rtmutex-Cleanup-deadlock-detector-debug-logic.patch --]
[-- Type: text/plain, Size: 12551 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 8930ed80f970a90a795239e7415c9b0e6f964649

The conditions under which deadlock detection is conducted are unclear
and undocumented.

Add constants instead of using 0/1 and provide a selection function
which hides the additional debug dependency from the calling code.

Add comments where needed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/20140522031949.947264874@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Conflicts:
	kernel/locking/rtmutex.c
---
 kernel/locking/rtmutex-debug.c  |  5 +--
 kernel/locking/rtmutex-debug.h  |  7 ++--
 kernel/locking/rtmutex.c        | 80 +++++++++++++++++++++++++++++------------
 kernel/locking/rtmutex.h        |  7 +++-
 kernel/locking/rtmutex_common.h | 15 ++++++++
 5 files changed, 85 insertions(+), 29 deletions(-)

diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 49b2ed3dced8..62b6cee8ea7f 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -66,12 +66,13 @@ void rt_mutex_debug_task_free(struct task_struct *task)
  * the deadlock. We print when we return. act_waiter can be NULL in
  * case of a remove waiter operation.
  */
-void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *act_waiter,
+void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
+			     struct rt_mutex_waiter *act_waiter,
 			     struct rt_mutex *lock)
 {
 	struct task_struct *task;
 
-	if (!debug_locks || detect || !act_waiter)
+	if (!debug_locks || chwalk == RT_MUTEX_FULL_CHAINWALK || !act_waiter)
 		return;
 
 	task = rt_mutex_owner(act_waiter->lock);
diff --git a/kernel/locking/rtmutex-debug.h b/kernel/locking/rtmutex-debug.h
index ab29b6a22669..d0519c3432b6 100644
--- a/kernel/locking/rtmutex-debug.h
+++ b/kernel/locking/rtmutex-debug.h
@@ -20,14 +20,15 @@ extern void debug_rt_mutex_unlock(struct rt_mutex *lock);
 extern void debug_rt_mutex_proxy_lock(struct rt_mutex *lock,
 				      struct task_struct *powner);
 extern void debug_rt_mutex_proxy_unlock(struct rt_mutex *lock);
-extern void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *waiter,
+extern void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
+				    struct rt_mutex_waiter *waiter,
 				    struct rt_mutex *lock);
 extern void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter);
 # define debug_rt_mutex_reset_waiter(w)			\
 	do { (w)->deadlock_lock = NULL; } while (0)
 
-static inline int debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
-						 int detect)
+static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
+						  enum rtmutex_chainwalk walk)
 {
 	return (waiter != NULL);
 }
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ebfc02f151e3..07fb4d717039 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -329,6 +329,32 @@ static void rt_mutex_wake_waiter(struct rt_mutex_waiter *waiter)
 }
 
 /*
+ * Deadlock detection is conditional:
+ *
+ * If CONFIG_DEBUG_RT_MUTEXES=n, deadlock detection is only conducted
+ * if the detect argument is == RT_MUTEX_FULL_CHAINWALK.
+ *
+ * If CONFIG_DEBUG_RT_MUTEXES=y, deadlock detection is always
+ * conducted independent of the detect argument.
+ *
+ * If the waiter argument is NULL this indicates the deboost path and
+ * deadlock detection is disabled independent of the detect argument
+ * and the config settings.
+ */
+static bool rt_mutex_cond_detect_deadlock(struct rt_mutex_waiter *waiter,
+					  enum rtmutex_chainwalk chwalk)
+{
+	/*
+	 * This is just a wrapper function for the following call,
+	 * because debug_rt_mutex_detect_deadlock() smells like a magic
+	 * debug feature and I wanted to keep the cond function in the
+	 * main source file along with the comments instead of having
+	 * two of the same in the headers.
+	 */
+	return debug_rt_mutex_detect_deadlock(waiter, chwalk);
+}
+
+/*
  * Max number of times we'll walk the boosting chain:
  */
 int max_lock_depth = 1024;
@@ -402,7 +428,7 @@ static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
  *	  goto again;
  */
 static int rt_mutex_adjust_prio_chain(struct task_struct *task,
-				      int deadlock_detect,
+				      enum rtmutex_chainwalk chwalk,
 				      struct rt_mutex *orig_lock,
 				      struct rt_mutex *next_lock,
 				      struct rt_mutex_waiter *orig_waiter,
@@ -410,12 +436,12 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 {
 	struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
 	struct rt_mutex_waiter *prerequeue_top_waiter;
-	int detect_deadlock, ret = 0, depth = 0;
+	int ret = 0, depth = 0;
 	struct rt_mutex *lock;
+	bool detect_deadlock;
 	unsigned long flags;
 
-	detect_deadlock = debug_rt_mutex_detect_deadlock(orig_waiter,
-							 deadlock_detect);
+	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
 
 	/*
 	 * The (de)boosting is a step by step approach with a lot of
@@ -541,7 +567,7 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	 * walk, we detected a deadlock.
 	 */
 	if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
-		debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
+		debug_rt_mutex_deadlock(chwalk, orig_waiter, lock);
 		raw_spin_unlock(&lock->wait_lock);
 		ret = -EDEADLK;
 		goto out_unlock_pi;
@@ -835,7 +861,7 @@ try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
 static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 				   struct rt_mutex_waiter *waiter,
 				   struct task_struct *task,
-				   int detect_deadlock)
+				   enum rtmutex_chainwalk chwalk)
 {
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex_waiter *top_waiter = waiter;
@@ -898,7 +924,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 		__rt_mutex_adjust_prio(owner);
 		if (rt_mutex_real_waiter(owner->pi_blocked_on))
 			chain_walk = 1;
-	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
+	} else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) {
 		chain_walk = 1;
 	}
 
@@ -923,7 +949,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock,
+	res = rt_mutex_adjust_prio_chain(owner, chwalk, lock,
 					 next_lock, waiter, task);
 
 	raw_spin_lock(&lock->wait_lock);
@@ -1029,7 +1055,8 @@ static void remove_waiter(struct rt_mutex *lock,
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);
+	rt_mutex_adjust_prio_chain(owner, RT_MUTEX_MIN_CHAINWALK, lock,
+				   next_lock, NULL, current);
 
 	raw_spin_lock(&lock->wait_lock);
 }
@@ -1058,7 +1085,8 @@ void rt_mutex_adjust_pi(struct task_struct *task)
 	/* gets dropped in rt_mutex_adjust_prio_chain()! */
 	get_task_struct(task);
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
-	rt_mutex_adjust_prio_chain(task, 0, NULL, next_lock, NULL, task);
+	rt_mutex_adjust_prio_chain(task, RT_MUTEX_MIN_CHAINWALK, NULL,
+				   next_lock, NULL, task);
 }
 
 #ifdef CONFIG_PREEMPT_RT_FULL
@@ -1571,7 +1599,8 @@ static void ww_mutex_account_lock(struct rt_mutex *lock,
 static int __sched
 rt_mutex_slowlock(struct rt_mutex *lock, int state,
 		  struct hrtimer_sleeper *timeout,
-		  int detect_deadlock, struct ww_acquire_ctx *ww_ctx)
+		  enum rtmutex_chainwalk chwalk,
+		  struct ww_acquire_ctx *ww_ctx)
 {
 	struct rt_mutex_waiter waiter;
 	int ret = 0;
@@ -1597,7 +1626,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 			timeout->task = NULL;
 	}
 
-	ret = task_blocks_on_rt_mutex(lock, &waiter, current, detect_deadlock);
+	ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
 
 	if (likely(!ret))
 		ret = __rt_mutex_slowlock(lock, state, timeout, &waiter, ww_ctx);
@@ -1606,7 +1635,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 
 	if (unlikely(ret)) {
 		remove_waiter(lock, &waiter);
-		rt_mutex_handle_deadlock(ret, detect_deadlock, &waiter);
+		rt_mutex_handle_deadlock(ret, chwalk, &waiter);
 	} else if (ww_ctx) {
 		ww_mutex_account_lock(lock, ww_ctx);
 	}
@@ -1737,30 +1766,32 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state,
 		  struct ww_acquire_ctx *ww_ctx,
 		  int (*slowfn)(struct rt_mutex *lock, int state,
 				struct hrtimer_sleeper *timeout,
-				int detect_deadlock,
+				enum rtmutex_chainwalk chwalk,
 				struct ww_acquire_ctx *ww_ctx))
 {
 	if (likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
 	} else
-		return slowfn(lock, state, NULL, 0, ww_ctx);
+		return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK, ww_ctx);
 }
 
 static inline int
 rt_mutex_timed_fastlock(struct rt_mutex *lock, int state,
-			struct hrtimer_sleeper *timeout, int detect_deadlock,
+			struct hrtimer_sleeper *timeout,
+			enum rtmutex_chainwalk chwalk,
 			struct ww_acquire_ctx *ww_ctx,
 			int (*slowfn)(struct rt_mutex *lock, int state,
 				      struct hrtimer_sleeper *timeout,
-				      int detect_deadlock,
+				      enum rtmutex_chainwalk chwalk,
 				      struct ww_acquire_ctx *ww_ctx))
 {
-	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
+	if (chwalk == RT_MUTEX_MIN_CHAINWALK &&
+	    likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
 	} else
-		return slowfn(lock, state, timeout, detect_deadlock, ww_ctx);
+		return slowfn(lock, state, timeout, chwalk, ww_ctx);
 }
 
 static inline int
@@ -1822,8 +1853,9 @@ int rt_mutex_timed_futex_lock(struct rt_mutex *lock,
 {
 	might_sleep();
 
-	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 1,
-				       NULL, rt_mutex_slowlock);
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
+				       RT_MUTEX_FULL_CHAINWALK, NULL,
+				       rt_mutex_slowlock);
 }
 
 /**
@@ -1863,7 +1895,8 @@ rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
 {
 	might_sleep();
 
-	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
+				       RT_MUTEX_MIN_CHAINWALK,
 				       NULL, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
@@ -2020,7 +2053,8 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 #endif
 
 	/* We enforce deadlock detection for futexes */
-	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);
+	ret = task_blocks_on_rt_mutex(lock, waiter, task,
+				      RT_MUTEX_FULL_CHAINWALK);
 
 	if (ret && !rt_mutex_owner(lock)) {
 		/*
diff --git a/kernel/locking/rtmutex.h b/kernel/locking/rtmutex.h
index f6a1f3c133b1..c4060584c407 100644
--- a/kernel/locking/rtmutex.h
+++ b/kernel/locking/rtmutex.h
@@ -22,10 +22,15 @@
 #define debug_rt_mutex_init(m, n)			do { } while (0)
 #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
 #define debug_rt_mutex_print_deadlock(w)		do { } while (0)
-#define debug_rt_mutex_detect_deadlock(w,d)		(d)
 #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
 
 static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
 {
 	WARN(1, "rtmutex deadlock detected\n");
 }
+
+static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *w,
+						  enum rtmutex_chainwalk walk)
+{
+	return walk == RT_MUTEX_FULL_CHAINWALK;
+}
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index ec51460fa233..c6dcda5e53af 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -103,6 +103,21 @@ static inline struct task_struct *rt_mutex_owner(struct rt_mutex *lock)
 }
 
 /*
+ * Constants for rt mutex functions which have a selectable deadlock
+ * detection.
+ *
+ * RT_MUTEX_MIN_CHAINWALK:	Stops the lock chain walk when there are
+ *				no further PI adjustments to be made.
+ *
+ * RT_MUTEX_FULL_CHAINWALK:	Invoke deadlock detection with a full
+ *				walk of the lock chain.
+ */
+enum rtmutex_chainwalk {
+	RT_MUTEX_MIN_CHAINWALK,
+	RT_MUTEX_FULL_CHAINWALK,
+};
+
+/*
  * PI-futex support (proxy locking functions, etc.):
  */
 #define PI_WAKEUP_INPROGRESS	((struct rt_mutex_waiter *) 1)
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 10/39] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (8 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 09/39] rtmutex: Cleanup deadlock detector debug logic Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 11/39] futex: Make unlock_pi more robust Steven Rostedt
                   ` (31 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Peter Zijlstra, Lai Jiangshan

[-- Attachment #1: 0010-rtmutex-Avoid-pointless-requeueing-in-the-deadlock-d.patch --]
[-- Type: text/plain, Size: 4368 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 67792e2cabadbadd1a93f6790fa7bcbd47eca7c3

In case the dead lock detector is enabled we follow the lock chain to
the end in rt_mutex_adjust_prio_chain, even if we could stop earlier
due to the priority/waiter constellation.

But once we are no longer the top priority waiter in a certain step
or the task holding the lock has already the same priority then there
is no point in dequeing and enqueing along the lock chain as there is
no change at all.

So stop the queueing at this point.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/20140522031950.280830190@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 77 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 70 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 07fb4d717039..a0d9eadcfe02 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -440,6 +440,7 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	struct rt_mutex *lock;
 	bool detect_deadlock;
 	unsigned long flags;
+	bool requeue = true;
 
 	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
 
@@ -529,18 +530,31 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 			goto out_unlock_pi;
 		/*
 		 * If deadlock detection is off, we stop here if we
-		 * are not the top pi waiter of the task.
+		 * are not the top pi waiter of the task. If deadlock
+		 * detection is enabled we continue, but stop the
+		 * requeueing in the chain walk.
 		 */
-		if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
-			goto out_unlock_pi;
+		if (top_waiter != task_top_pi_waiter(task)) {
+			if (!detect_deadlock)
+				goto out_unlock_pi;
+			else
+				requeue = false;
+		}
 	}
 
 	/*
-	 * When deadlock detection is off then we check, if further
-	 * priority adjustment is necessary.
+	 * If the waiter priority is the same as the task priority
+	 * then there is no further priority adjustment necessary.  If
+	 * deadlock detection is off, we stop the chain walk. If its
+	 * enabled we continue, but stop the requeueing in the chain
+	 * walk.
 	 */
-	if (!detect_deadlock && waiter->prio == task->prio)
-		goto out_unlock_pi;
+	if (waiter->prio == task->prio) {
+		if (!detect_deadlock)
+			goto out_unlock_pi;
+		else
+			requeue = false;
+	}
 
 	/*
 	 * [4] Get the next lock
@@ -574,6 +588,55 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	}
 
 	/*
+	 * If we just follow the lock chain for deadlock detection, no
+	 * need to do all the requeue operations. To avoid a truckload
+	 * of conditionals around the various places below, just do the
+	 * minimum chain walk checks.
+	 */
+	if (!requeue) {
+		/*
+		 * No requeue[7] here. Just release @task [8]
+		 */
+		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+		put_task_struct(task);
+
+		/*
+		 * [9] check_exit_conditions_3 protected by lock->wait_lock.
+		 * If there is no owner of the lock, end of chain.
+		 */
+		if (!rt_mutex_owner(lock)) {
+			raw_spin_unlock(&lock->wait_lock);
+			return 0;
+		}
+
+		/* [10] Grab the next task, i.e. owner of @lock */
+		task = rt_mutex_owner(lock);
+		get_task_struct(task);
+		raw_spin_lock_irqsave(&task->pi_lock, flags);
+
+		/*
+		 * No requeue [11] here. We just do deadlock detection.
+		 *
+		 * [12] Store whether owner is blocked
+		 * itself. Decision is made after dropping the locks
+		 */
+		next_lock = task_blocked_on_lock(task);
+		/*
+		 * Get the top waiter for the next iteration
+		 */
+		top_waiter = rt_mutex_top_waiter(lock);
+
+		/* [13] Drop locks */
+		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+		raw_spin_unlock(&lock->wait_lock);
+
+		/* If owner is not blocked, end of chain. */
+		if (!next_lock)
+			goto out_put_task;
+		goto again;
+	}
+
+	/*
 	 * Store the current top waiter before doing the requeue
 	 * operation on @lock. We need it for the boost/deboost
 	 * decision below.
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 11/39] futex: Make unlock_pi more robust
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (9 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 10/39] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 12/39] futex: Use futex_top_waiter() in lookup_pi_state() Steven Rostedt
                   ` (30 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Peter Zijlstra, Darren Hart,
	Davidlohr Bueso, Kees Cook, wad

[-- Attachment #1: 0011-futex-Make-unlock_pi-more-robust.patch --]
[-- Type: text/plain, Size: 4839 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: ccf9e6a80d9e1b9df69c98e6b9745cf49869ee15

The kernel tries to atomically unlock the futex without checking
whether there is kernel state associated to the futex.

So if user space manipulated the user space value, this will leave
kernel internal state around associated to the owner task.

For robustness sake, lookup first whether there are waiters on the
futex. If there are waiters, wake the top priority waiter with all the
proper sanity checks applied.

If there are no waiters, do the atomic release. We do not have to
preserve the waiters bit in this case, because a potentially incoming
waiter is blocked on the hb->lock and will acquire the futex
atomically. We neither have to preserve the owner died bit. The caller
is the owner and it was supposed to cleanup the mess.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <darren@dvhart.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Kees Cook <kees@outflux.net>
Cc: wad@chromium.org
Link: http://lkml.kernel.org/r/20140611204237.016987332@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/futex.c | 76 +++++++++++++++++++---------------------------------------
 1 file changed, 25 insertions(+), 51 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index d2349337d1a0..a3ce98c65127 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1176,22 +1176,6 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this)
 	return 0;
 }
 
-static int unlock_futex_pi(u32 __user *uaddr, u32 uval)
-{
-	u32 uninitialized_var(oldval);
-
-	/*
-	 * There is no waiter, so we unlock the futex. The owner died
-	 * bit has not to be preserved here. We are the owner:
-	 */
-	if (cmpxchg_futex_value_locked(&oldval, uaddr, uval, 0))
-		return -EFAULT;
-	if (oldval != uval)
-		return -EAGAIN;
-
-	return 0;
-}
-
 /*
  * Express the locking dependencies for lockdep:
  */
@@ -2401,10 +2385,10 @@ uaddr_faulted:
  */
 static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
 {
-	struct futex_hash_bucket *hb;
-	struct futex_q *this, *next;
+	u32 uninitialized_var(curval), uval, vpid = task_pid_vnr(current);
 	union futex_key key = FUTEX_KEY_INIT;
-	u32 uval, vpid = task_pid_vnr(current);
+	struct futex_hash_bucket *hb;
+	struct futex_q *match;
 	int ret;
 
 retry:
@@ -2417,57 +2401,47 @@ retry:
 		return -EPERM;
 
 	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_WRITE);
-	if (unlikely(ret != 0))
-		goto out;
+	if (ret)
+		return ret;
 
 	hb = hash_futex(&key);
 	spin_lock(&hb->lock);
 
 	/*
-	 * To avoid races, try to do the TID -> 0 atomic transition
-	 * again. If it succeeds then we can return without waking
-	 * anyone else up. We only try this if neither the waiters nor
-	 * the owner died bit are set.
+	 * Check waiters first. We do not trust user space values at
+	 * all and we at least want to know if user space fiddled
+	 * with the futex value instead of blindly unlocking.
 	 */
-	if (!(uval & ~FUTEX_TID_MASK) &&
-	    cmpxchg_futex_value_locked(&uval, uaddr, vpid, 0))
-		goto pi_faulted;
-	/*
-	 * Rare case: we managed to release the lock atomically,
-	 * no need to wake anyone else up:
-	 */
-	if (unlikely(uval == vpid))
-		goto out_unlock;
-
-	/*
-	 * Ok, other tasks may need to be woken up - check waiters
-	 * and do the wakeup if necessary:
-	 */
-	plist_for_each_entry_safe(this, next, &hb->chain, list) {
-		if (!match_futex (&this->key, &key))
-			continue;
-		ret = wake_futex_pi(uaddr, uval, this);
+	match = futex_top_waiter(hb, &key);
+	if (match) {
+		ret = wake_futex_pi(uaddr, uval, match);
 		/*
-		 * The atomic access to the futex value
-		 * generated a pagefault, so retry the
-		 * user-access and the wakeup:
+		 * The atomic access to the futex value generated a
+		 * pagefault, so retry the user-access and the wakeup:
 		 */
 		if (ret == -EFAULT)
 			goto pi_faulted;
 		goto out_unlock;
 	}
+
 	/*
-	 * No waiters - kernel unlocks the futex:
+	 * We have no kernel internal state, i.e. no waiters in the
+	 * kernel. Waiters which are about to queue themselves are stuck
+	 * on hb->lock. So we can safely ignore them. We do neither
+	 * preserve the WAITERS bit not the OWNER_DIED one. We are the
+	 * owner.
 	 */
-	ret = unlock_futex_pi(uaddr, uval);
-	if (ret == -EFAULT)
+	if (cmpxchg_futex_value_locked(&curval, uaddr, uval, 0))
 		goto pi_faulted;
 
+	/*
+	 * If uval has changed, let user space handle it.
+	 */
+	ret = (curval == uval) ? 0 : -EAGAIN;
+
 out_unlock:
 	spin_unlock(&hb->lock);
 	put_futex_key(&key);
-
-out:
 	return ret;
 
 pi_faulted:
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 12/39] futex: Use futex_top_waiter() in lookup_pi_state()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (10 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 11/39] futex: Make unlock_pi more robust Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 13/39] futex: Split out the waiter check from lookup_pi_state() Steven Rostedt
                   ` (29 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Darren Hart, Peter Zijlstra,
	Davidlohr Bueso, Kees Cook, wad

[-- Attachment #1: 0012-futex-Use-futex_top_waiter-in-lookup_pi_state.patch --]
[-- Type: text/plain, Size: 4720 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: bd1dbcc67cd2c1181e2c01daac51eabf1b964dd8

No point in open coding the same function again.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Kees Cook <kees@outflux.net>
Cc: wad@chromium.org
Link: http://lkml.kernel.org/r/20140611204237.092947239@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/futex.c | 124 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 61 insertions(+), 63 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index a3ce98c65127..8bb158215e2c 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -786,87 +786,85 @@ static int
 lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
 		union futex_key *key, struct futex_pi_state **ps)
 {
+	struct futex_q *match = futex_top_waiter(hb, key);
 	struct futex_pi_state *pi_state = NULL;
-	struct futex_q *this, *next;
 	struct task_struct *p;
 	pid_t pid = uval & FUTEX_TID_MASK;
 
-	plist_for_each_entry_safe(this, next, &hb->chain, list) {
-		if (match_futex(&this->key, key)) {
-			/*
-			 * Sanity check the waiter before increasing
-			 * the refcount and attaching to it.
-			 */
-			pi_state = this->pi_state;
-			/*
-			 * Userspace might have messed up non-PI and
-			 * PI futexes [3]
-			 */
-			if (unlikely(!pi_state))
-				return -EINVAL;
+	if (match) {
+		/*
+		 * Sanity check the waiter before increasing the
+		 * refcount and attaching to it.
+		 */
+		pi_state = match->pi_state;
+		/*
+		 * Userspace might have messed up non-PI and PI
+		 * futexes [3]
+		 */
+		if (unlikely(!pi_state))
+			return -EINVAL;
 
-			WARN_ON(!atomic_read(&pi_state->refcount));
+		WARN_ON(!atomic_read(&pi_state->refcount));
 
+		/*
+		 * Handle the owner died case:
+		 */
+		if (uval & FUTEX_OWNER_DIED) {
 			/*
-			 * Handle the owner died case:
+			 * exit_pi_state_list sets owner to NULL and
+			 * wakes the topmost waiter. The task which
+			 * acquires the pi_state->rt_mutex will fixup
+			 * owner.
 			 */
-			if (uval & FUTEX_OWNER_DIED) {
+			if (!pi_state->owner) {
 				/*
-				 * exit_pi_state_list sets owner to NULL and
-				 * wakes the topmost waiter. The task which
-				 * acquires the pi_state->rt_mutex will fixup
-				 * owner.
+				 * No pi state owner, but the user
+				 * space TID is not 0. Inconsistent
+				 * state. [5]
 				 */
-				if (!pi_state->owner) {
-					/*
-					 * No pi state owner, but the user
-					 * space TID is not 0. Inconsistent
-					 * state. [5]
-					 */
-					if (pid)
-						return -EINVAL;
-					/*
-					 * Take a ref on the state and
-					 * return. [4]
-					 */
-					goto out_state;
-				}
-
-				/*
-				 * If TID is 0, then either the dying owner
-				 * has not yet executed exit_pi_state_list()
-				 * or some waiter acquired the rtmutex in the
-				 * pi state, but did not yet fixup the TID in
-				 * user space.
-				 *
-				 * Take a ref on the state and return. [6]
-				 */
-				if (!pid)
-					goto out_state;
-			} else {
+				if (pid)
+					return -EINVAL;
 				/*
-				 * If the owner died bit is not set,
-				 * then the pi_state must have an
-				 * owner. [7]
+				 * Take a ref on the state and
+				 * return. [4]
 				 */
-				if (!pi_state->owner)
-					return -EINVAL;
+				goto out_state;
 			}
 
 			/*
-			 * Bail out if user space manipulated the
-			 * futex value. If pi state exists then the
-			 * owner TID must be the same as the user
-			 * space TID. [9/10]
+			 * If TID is 0, then either the dying owner
+			 * has not yet executed exit_pi_state_list()
+			 * or some waiter acquired the rtmutex in the
+			 * pi state, but did not yet fixup the TID in
+			 * user space.
+			 *
+			 * Take a ref on the state and return. [6]
+			 */
+			if (!pid)
+				goto out_state;
+		} else {
+			/*
+			 * If the owner died bit is not set,
+			 * then the pi_state must have an
+			 * owner. [7]
 			 */
-			if (pid != task_pid_vnr(pi_state->owner))
+			if (!pi_state->owner)
 				return -EINVAL;
-
-		out_state:
-			atomic_inc(&pi_state->refcount);
-			*ps = pi_state;
-			return 0;
 		}
+
+		/*
+		 * Bail out if user space manipulated the
+		 * futex value. If pi state exists then the
+		 * owner TID must be the same as the user
+		 * space TID. [9/10]
+		 */
+		if (pid != task_pid_vnr(pi_state->owner))
+			return -EINVAL;
+
+	out_state:
+		atomic_inc(&pi_state->refcount);
+		*ps = pi_state;
+		return 0;
 	}
 
 	/*
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 13/39] futex: Split out the waiter check from lookup_pi_state()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (11 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 12/39] futex: Use futex_top_waiter() in lookup_pi_state() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 14/39] futex: Split out the first waiter attachment " Steven Rostedt
                   ` (28 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Darren Hart, Peter Zijlstra,
	Davidlohr Bueso, Kees Cook, wad

[-- Attachment #1: 0013-futex-Split-out-the-waiter-check-from-lookup_pi_stat.patch --]
[-- Type: text/plain, Size: 5471 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: e60cbc5ceaa518d630ab8f35a7d05cee1c752648

We want to be a bit more clever in futex_lock_pi_atomic() and separate
the possible states. Split out the waiter verification into a separate
function. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Kees Cook <kees@outflux.net>
Cc: wad@chromium.org
Link: http://lkml.kernel.org/r/20140611204237.180458410@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/futex.c | 138 +++++++++++++++++++++++++++++----------------------------
 1 file changed, 71 insertions(+), 67 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 8bb158215e2c..8377eec3f650 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -782,92 +782,96 @@ void exit_pi_state_list(struct task_struct *curr)
  * [10] There is no transient state which leaves owner and user space
  *	TID out of sync.
  */
-static int
-lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
-		union futex_key *key, struct futex_pi_state **ps)
+
+/*
+ * Validate that the existing waiter has a pi_state and sanity check
+ * the pi_state against the user space value. If correct, attach to
+ * it.
+ */
+static int attach_to_pi_state(u32 uval, struct futex_pi_state *pi_state,
+			      struct futex_pi_state **ps)
 {
-	struct futex_q *match = futex_top_waiter(hb, key);
-	struct futex_pi_state *pi_state = NULL;
-	struct task_struct *p;
 	pid_t pid = uval & FUTEX_TID_MASK;
 
-	if (match) {
-		/*
-		 * Sanity check the waiter before increasing the
-		 * refcount and attaching to it.
-		 */
-		pi_state = match->pi_state;
-		/*
-		 * Userspace might have messed up non-PI and PI
-		 * futexes [3]
-		 */
-		if (unlikely(!pi_state))
-			return -EINVAL;
+	/*
+	 * Userspace might have messed up non-PI and PI futexes [3]
+	 */
+	if (unlikely(!pi_state))
+		return -EINVAL;
 
-		WARN_ON(!atomic_read(&pi_state->refcount));
+	WARN_ON(!atomic_read(&pi_state->refcount));
 
+	/*
+	 * Handle the owner died case:
+	 */
+	if (uval & FUTEX_OWNER_DIED) {
 		/*
-		 * Handle the owner died case:
+		 * exit_pi_state_list sets owner to NULL and wakes the
+		 * topmost waiter. The task which acquires the
+		 * pi_state->rt_mutex will fixup owner.
 		 */
-		if (uval & FUTEX_OWNER_DIED) {
-			/*
-			 * exit_pi_state_list sets owner to NULL and
-			 * wakes the topmost waiter. The task which
-			 * acquires the pi_state->rt_mutex will fixup
-			 * owner.
-			 */
-			if (!pi_state->owner) {
-				/*
-				 * No pi state owner, but the user
-				 * space TID is not 0. Inconsistent
-				 * state. [5]
-				 */
-				if (pid)
-					return -EINVAL;
-				/*
-				 * Take a ref on the state and
-				 * return. [4]
-				 */
-				goto out_state;
-			}
-
+		if (!pi_state->owner) {
 			/*
-			 * If TID is 0, then either the dying owner
-			 * has not yet executed exit_pi_state_list()
-			 * or some waiter acquired the rtmutex in the
-			 * pi state, but did not yet fixup the TID in
-			 * user space.
-			 *
-			 * Take a ref on the state and return. [6]
+			 * No pi state owner, but the user space TID
+			 * is not 0. Inconsistent state. [5]
 			 */
-			if (!pid)
-				goto out_state;
-		} else {
+			if (pid)
+				return -EINVAL;
 			/*
-			 * If the owner died bit is not set,
-			 * then the pi_state must have an
-			 * owner. [7]
+			 * Take a ref on the state and return success. [4]
 			 */
-			if (!pi_state->owner)
-				return -EINVAL;
+			goto out_state;
 		}
 
 		/*
-		 * Bail out if user space manipulated the
-		 * futex value. If pi state exists then the
-		 * owner TID must be the same as the user
-		 * space TID. [9/10]
+		 * If TID is 0, then either the dying owner has not
+		 * yet executed exit_pi_state_list() or some waiter
+		 * acquired the rtmutex in the pi state, but did not
+		 * yet fixup the TID in user space.
+		 *
+		 * Take a ref on the state and return success. [6]
+		 */
+		if (!pid)
+			goto out_state;
+	} else {
+		/*
+		 * If the owner died bit is not set, then the pi_state
+		 * must have an owner. [7]
 		 */
-		if (pid != task_pid_vnr(pi_state->owner))
+		if (!pi_state->owner)
 			return -EINVAL;
-
-	out_state:
-		atomic_inc(&pi_state->refcount);
-		*ps = pi_state;
-		return 0;
 	}
 
 	/*
+	 * Bail out if user space manipulated the futex value. If pi
+	 * state exists then the owner TID must be the same as the
+	 * user space TID. [9/10]
+	 */
+	if (pid != task_pid_vnr(pi_state->owner))
+		return -EINVAL;
+out_state:
+	atomic_inc(&pi_state->refcount);
+	*ps = pi_state;
+	return 0;
+}
+
+static int
+lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
+		union futex_key *key, struct futex_pi_state **ps)
+{
+	struct futex_q *match = futex_top_waiter(hb, key);
+	struct futex_pi_state *pi_state = NULL;
+	struct task_struct *p;
+	pid_t pid = uval & FUTEX_TID_MASK;
+
+	/*
+	 * If there is a waiter on that futex, validate it and
+	 * attach to the pi_state when the validation succeeds.
+	 */
+	if (match)
+		return attach_to_pi_state(uval, match->pi_state, ps);
+
+	/*
 	 * We are the first waiter - try to look up the real owner and attach
 	 * the new pi_state to it, but bail out when TID = 0 [1]
 	 */
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 14/39] futex: Split out the first waiter attachment from lookup_pi_state()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (12 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 13/39] futex: Split out the waiter check from lookup_pi_state() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 15/39] futex: Simplify futex_lock_pi_atomic() and make it more robust Steven Rostedt
                   ` (27 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Darren Hart, Peter Zijlstra,
	Davidlohr Bueso, Kees Cook, wad

[-- Attachment #1: 0014-futex-Split-out-the-first-waiter-attachment-from-loo.patch --]
[-- Type: text/plain, Size: 3035 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: 04e1b2e52b17195c9a1daa5935c55a4c8716095c

We want to be a bit more clever in futex_lock_pi_atomic() and separate
the possible states. Split out the code which attaches the first
waiter to the owner into a separate function. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Kees Cook <kees@outflux.net>
Cc: wad@chromium.org
Link: http://lkml.kernel.org/r/20140611204237.271300614@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/futex.c | 42 ++++++++++++++++++++++++++++--------------
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 8377eec3f650..e0802beff9ad 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -855,21 +855,16 @@ out_state:
 	return 0;
 }
 
-static int
-lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
-		union futex_key *key, struct futex_pi_state **ps)
+/*
+ * Lookup the task for the TID provided from user space and attach to
+ * it after doing proper sanity checks.
+ */
+static int attach_to_pi_owner(u32 uval, union futex_key *key,
+			      struct futex_pi_state **ps)
 {
-	struct futex_q *match = futex_top_waiter(hb, key);
-	struct futex_pi_state *pi_state = NULL;
-	struct task_struct *p;
 	pid_t pid = uval & FUTEX_TID_MASK;
-
-	/*
-	 * If there is a waiter on that futex, validate it and
-	 * attach to the pi_state when the validation succeeds.
-	 */
-	if (match)
-		return attach_to_pi_state(uval, match->pi_state, ps);
+	struct futex_pi_state *pi_state;
+	struct task_struct *p;
 
 	/*
 	 * We are the first waiter - try to look up the real owner and attach
@@ -912,7 +907,7 @@ lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
 	pi_state = alloc_pi_state();
 
 	/*
-	 * Initialize the pi_mutex in locked state and make 'p'
+	 * Initialize the pi_mutex in locked state and make @p
 	 * the owner of it:
 	 */
 	rt_mutex_init_proxy_locked(&pi_state->pi_mutex, p);
@@ -932,6 +927,25 @@ lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
 	return 0;
 }
 
+static int lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
+			   union futex_key *key, struct futex_pi_state **ps)
+{
+	struct futex_q *match = futex_top_waiter(hb, key);
+
+	/*
+	 * If there is a waiter on that futex, validate it and
+	 * attach to the pi_state when the validation succeeds.
+	 */
+	if (match)
+		return attach_to_pi_state(uval, match->pi_state, ps);
+
+	/*
+	 * We are the first waiter - try to look up the owner based on
+	 * @uval and attach to it.
+	 */
+	return attach_to_pi_owner(uval, key, ps);
+}
+
 /**
  * futex_lock_pi_atomic() - Atomic work required to acquire a pi aware futex
  * @uaddr:		the pi futex user address
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 15/39] futex: Simplify futex_lock_pi_atomic() and make it more robust
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (13 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 14/39] futex: Split out the first waiter attachment " Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock Steven Rostedt
                   ` (26 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Peter Zijlstra, Davidlohr Bueso,
	Kees Cook, wad, Darren Hart

[-- Attachment #1: 0015-futex-Simplify-futex_lock_pi_atomic-and-make-it-more.patch --]
[-- Type: text/plain, Size: 7041 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

upstream commit: af54d6a1c3ad474bbc9893c9905022646be6092c

futex_lock_pi_atomic() is a maze of retry hoops and loops.

Reduce it to simple and understandable states:

First step is to lookup existing waiters (state) in the kernel.

If there is an existing waiter, validate it and attach to it.

If there is no existing waiter, check the user space value

If the TID encoded in the user space value is 0, take over the futex
preserving the owner died bit.

If the TID encoded in the user space value is != 0, lookup the owner
task, validate it and attach to it.

Reduces text size by 128 bytes on x8664.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Kees Cook <kees@outflux.net>
Cc: wad@chromium.org
Cc: Darren Hart <darren@dvhart.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1406131137020.5170@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/futex.c | 148 ++++++++++++++++++++++++---------------------------------
 1 file changed, 61 insertions(+), 87 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index e0802beff9ad..e7f392e56be4 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -946,6 +946,17 @@ static int lookup_pi_state(u32 uval, struct futex_hash_bucket *hb,
 	return attach_to_pi_owner(uval, key, ps);
 }
 
+static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
+{
+	u32 uninitialized_var(curval);
+
+	if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)))
+		return -EFAULT;
+
+	/*If user space value changed, let the caller retry */
+	return curval != uval ? -EAGAIN : 0;
+}
+
 /**
  * futex_lock_pi_atomic() - Atomic work required to acquire a pi aware futex
  * @uaddr:		the pi futex user address
@@ -969,113 +980,69 @@ static int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucket *hb,
 				struct futex_pi_state **ps,
 				struct task_struct *task, int set_waiters)
 {
-	int lock_taken, ret, force_take = 0;
-	u32 uval, newval, curval, vpid = task_pid_vnr(task);
-
-retry:
-	ret = lock_taken = 0;
+	u32 uval, newval, vpid = task_pid_vnr(task);
+	struct futex_q *match;
+	int ret;
 
 	/*
-	 * To avoid races, we attempt to take the lock here again
-	 * (by doing a 0 -> TID atomic cmpxchg), while holding all
-	 * the locks. It will most likely not succeed.
+	 * Read the user space value first so we can validate a few
+	 * things before proceeding further.
 	 */
-	newval = vpid;
-	if (set_waiters)
-		newval |= FUTEX_WAITERS;
-
-	if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, 0, newval)))
+	if (get_futex_value_locked(&uval, uaddr))
 		return -EFAULT;
 
 	/*
 	 * Detect deadlocks.
 	 */
-	if ((unlikely((curval & FUTEX_TID_MASK) == vpid)))
+	if ((unlikely((uval & FUTEX_TID_MASK) == vpid)))
 		return -EDEADLK;
 
 	/*
-	 * Surprise - we got the lock, but we do not trust user space at all.
+	 * Lookup existing state first. If it exists, try to attach to
+	 * its pi_state.
 	 */
-	if (unlikely(!curval)) {
-		/*
-		 * We verify whether there is kernel state for this
-		 * futex. If not, we can safely assume, that the 0 ->
-		 * TID transition is correct. If state exists, we do
-		 * not bother to fixup the user space state as it was
-		 * corrupted already.
-		 */
-		return futex_top_waiter(hb, key) ? -EINVAL : 1;
-	}
-
-	uval = curval;
-
-	/*
-	 * Set the FUTEX_WAITERS flag, so the owner will know it has someone
-	 * to wake at the next unlock.
-	 */
-	newval = curval | FUTEX_WAITERS;
+	match = futex_top_waiter(hb, key);
+	if (match)
+		return attach_to_pi_state(uval, match->pi_state, ps);
 
 	/*
-	 * Should we force take the futex? See below.
+	 * No waiter and user TID is 0. We are here because the
+	 * waiters or the owner died bit is set or called from
+	 * requeue_cmp_pi or for whatever reason something took the
+	 * syscall.
 	 */
-	if (unlikely(force_take)) {
+	if (!(uval & FUTEX_TID_MASK)) {
 		/*
-		 * Keep the OWNER_DIED and the WAITERS bit and set the
-		 * new TID value.
+		 * We take over the futex. No other waiters and the user space
+		 * TID is 0. We preserve the owner died bit.
 		 */
-		newval = (curval & ~FUTEX_TID_MASK) | vpid;
-		force_take = 0;
-		lock_taken = 1;
-	}
+		newval = uval & FUTEX_OWNER_DIED;
+		newval |= vpid;
 
-	if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)))
-		return -EFAULT;
-	if (unlikely(curval != uval))
-		goto retry;
+		/* The futex requeue_pi code can enforce the waiters bit */
+		if (set_waiters)
+			newval |= FUTEX_WAITERS;
+
+		ret = lock_pi_update_atomic(uaddr, uval, newval);
+		/* If the take over worked, return 1 */
+		return ret < 0 ? ret : 1;
+	}
 
 	/*
-	 * We took the lock due to forced take over.
+	 * First waiter. Set the waiters bit before attaching ourself to
+	 * the owner. If owner tries to unlock, it will be forced into
+	 * the kernel and blocked on hb->lock.
 	 */
-	if (unlikely(lock_taken))
-		return 1;
-
+	newval = uval | FUTEX_WAITERS;
+	ret = lock_pi_update_atomic(uaddr, uval, newval);
+	if (ret)
+		return ret;
 	/*
-	 * We dont have the lock. Look up the PI state (or create it if
-	 * we are the first waiter):
+	 * If the update of the user space value succeeded, we try to
+	 * attach to the owner. If that fails, no harm done, we only
+	 * set the FUTEX_WAITERS bit in the user space variable.
 	 */
-	ret = lookup_pi_state(uval, hb, key, ps);
-
-	if (unlikely(ret)) {
-		switch (ret) {
-		case -ESRCH:
-			/*
-			 * We failed to find an owner for this
-			 * futex. So we have no pi_state to block
-			 * on. This can happen in two cases:
-			 *
-			 * 1) The owner died
-			 * 2) A stale FUTEX_WAITERS bit
-			 *
-			 * Re-read the futex value.
-			 */
-			if (get_futex_value_locked(&curval, uaddr))
-				return -EFAULT;
-
-			/*
-			 * If the owner died or we have a stale
-			 * WAITERS bit the owner TID in the user space
-			 * futex is 0.
-			 */
-			if (!(curval & FUTEX_TID_MASK)) {
-				force_take = 1;
-				goto retry;
-			}
-		default:
-			break;
-		}
-	}
-
-	return ret;
+	return attach_to_pi_owner(uval, key, ps);
 }
 
 /**
@@ -1649,7 +1616,12 @@ retry_private:
 				goto retry;
 			goto out;
 		case -EAGAIN:
-			/* The owner was exiting, try again. */
+			/*
+			 * Two reasons for this:
+			 * - Owner is exiting and we just wait for the
+			 *   exit to complete.
+			 * - The user space value changed.
+			 */
 			double_unlock_hb(hb1, hb2);
 			hb_waiters_dec(hb2);
 			put_futex_key(&key2);
@@ -2316,8 +2288,10 @@ retry_private:
 			goto uaddr_faulted;
 		case -EAGAIN:
 			/*
-			 * Task is exiting and we just wait for the
-			 * exit to complete.
+			 * Two reasons for this:
+			 * - Task is exiting and we just wait for the
+			 *   exit to complete.
+			 * - The user space value changed.
 			 */
 			queue_unlock(hb);
 			put_futex_key(&q.key);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (14 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 15/39] futex: Simplify futex_lock_pi_atomic() and make it more robust Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-13 10:40   ` Sebastian Andrzej Siewior
  2015-03-12 19:13 ` [PATCH RT 17/39] rtmutex.c: Fix incorrect waiter check Steven Rostedt
                   ` (25 subsequent siblings)
  41 siblings, 1 reply; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt

[-- Attachment #1: 0016-locking-rt-mutex-avoid-a-NULL-pointer-dereference-on.patch --]
[-- Type: text/plain, Size: 1412 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

With task_blocks_on_rt_mutex() returning early -EDEADLK we never add the
waiter to the waitqueue. Later, we try to remove it via remove_waiter()
and go boom in rt_mutex_top_waiter() because rb_entry() gives a NULL
pointer.
Tested on v3.18-RT where rtmutex is used for regular mutex and I tried
to get one twice in a row.

Not sure when this started but I guess 397335f00 ("rtmutex: Fix deadlock
detector for real") or commit 3d5c9340 ("rtmutex: Handle deadlock
detection smarter").

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index a0d9eadcfe02..ec72272ab73a 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1697,7 +1697,8 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 	set_current_state(TASK_RUNNING);
 
 	if (unlikely(ret)) {
-		remove_waiter(lock, &waiter);
+		if (rt_mutex_has_waiters(lock))
+			remove_waiter(lock, &waiter);
 		rt_mutex_handle_deadlock(ret, chwalk, &waiter);
 	} else if (ww_ctx) {
 		ww_mutex_account_lock(lock, ww_ctx);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 17/39] rtmutex.c: Fix incorrect waiter check
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (15 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 18/39] rt,locking: fix __ww_mutex_lock_interruptible() lockdep annotation Steven Rostedt
                   ` (24 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Brad Mouring, Ben Shelton,
	T Makphaibulchoke, stable-rt

[-- Attachment #1: 0017-rtmutex.c-Fix-incorrect-waiter-check.patch --]
[-- Type: text/plain, Size: 1308 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Brad Mouring <bmouring@ni.com>

In task_blocks_on_lock, there's a null check on pi_blocked_on
of the task_struct. This pointer can encode the fact that the
task that contains the pointer is waking (preventing requeuing)
and therefore is non-null. Use the inline function to avoid
dereferencing an invalid "pointer"

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Reported-by: Ben Shelton <ben.shelton@ni.com>
Reviewed-by: T Makphaibulchoke <tmac@hp.com>
Tested-by: T Makphaibulchoke <tmac@hp.com>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ec72272ab73a..74a824523d3a 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -361,7 +361,8 @@ int max_lock_depth = 1024;
 
 static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
 {
-	return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
+	return rt_mutex_real_waiter(p->pi_blocked_on) ?
+		p->pi_blocked_on->lock : NULL;
 }
 
 /*
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 18/39] rt,locking: fix __ww_mutex_lock_interruptible() lockdep annotation
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (16 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 17/39] rtmutex.c: Fix incorrect waiter check Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 19/39] rtmutex: enable deadlock detection in ww_mutex_lock functions Steven Rostedt
                   ` (23 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Mike Galbraith

[-- Attachment #1: 0018-rt-locking-fix-__ww_mutex_lock_interruptible-lockdep.patch --]
[-- Type: text/plain, Size: 2587 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Mike Galbraith <umgwanakikbuti@gmail.com>

Using mutex_acquire_nest() as used in __ww_mutex_lock() fixes the
splat below.  Remove superfluous line break in __ww_mutex_lock()
as well.

|=============================================
|[ INFO: possible recursive locking detected ]
|3.14.4-rt5 #26 Not tainted
|---------------------------------------------
|Xorg/4298 is trying to acquire lock:
| (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
|but task is already holding lock:
| (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]
|other info that might help us debug this:
| Possible unsafe locking scenario:
|       CPU0
|       ----
|  lock(reservation_ww_class_mutex);
|  lock(reservation_ww_class_mutex);
|
| *** DEADLOCK ***
|
| May be due to missing lock nesting notation
|
|3 locks held by Xorg/4298:
| #0:  (&cli->mutex){+.+.+.}, at: [<ffffffffa02b597b>] nouveau_abi16_get+0x2b/0x100 [nouveau]
| #1:  (reservation_ww_class_acquire){+.+...}, at: [<ffffffffa0160cd2>] drm_ioctl+0x4d2/0x610 [drm]
| #2:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffffa02b4270>] nouveau_gem_ioctl_pushbuf+0x870/0x19f0 [nouveau]

Cc: stable-rt@vger.kernel.org
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 74a824523d3a..1794fae96a2a 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -2238,7 +2238,7 @@ __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ww_c
 
 	might_sleep();
 
-	mutex_acquire(&lock->base.dep_map, 0, 0, _RET_IP_);
+	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
 	ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 0, ww_ctx);
 	if (ret)
 		mutex_release(&lock->base.dep_map, 1, _RET_IP_);
@@ -2256,8 +2256,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx)
 
 	might_sleep();
 
-	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map,
-			_RET_IP_);
+	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
 	ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 0, ww_ctx);
 	if (ret)
 		mutex_release(&lock->base.dep_map, 1, _RET_IP_);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 19/39] rtmutex: enable deadlock detection in ww_mutex_lock functions
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (17 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 18/39] rt,locking: fix __ww_mutex_lock_interruptible() lockdep annotation Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 20/39] x86: UV: raw_spinlock conversion Steven Rostedt
                   ` (22 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Gustavo Bittencourt

[-- Attachment #1: 0019-rtmutex-enable-deadlock-detection-in-ww_mutex_lock-f.patch --]
[-- Type: text/plain, Size: 1882 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Gustavo Bittencourt <gbitten@gmail.com>

The functions ww_mutex_lock_interruptible and ww_mutex_lock should return -EDEADLK when faced with
a deadlock. To do so, the paramenter detect_deadlock in rt_mutex_slowlock must be TRUE.
This patch corrects potential deadlocks when running PREEMPT_RT with nouveau driver.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Gustavo Bittencourt <gbitten@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 1794fae96a2a..27a1993111f9 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -2239,7 +2239,8 @@ __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ww_c
 	might_sleep();
 
 	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
-	ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 0, ww_ctx);
+	ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL,
+				RT_MUTEX_FULL_CHAINWALK, ww_ctx);
 	if (ret)
 		mutex_release(&lock->base.dep_map, 1, _RET_IP_);
 	else if (!ret && ww_ctx->acquired > 1)
@@ -2257,7 +2258,8 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx)
 	might_sleep();
 
 	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
-	ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 0, ww_ctx);
+	ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL,
+				RT_MUTEX_FULL_CHAINWALK, ww_ctx);
 	if (ret)
 		mutex_release(&lock->base.dep_map, 1, _RET_IP_);
 	else if (!ret && ww_ctx->acquired > 1)
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 20/39] x86: UV: raw_spinlock conversion
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (18 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 19/39] rtmutex: enable deadlock detection in ww_mutex_lock functions Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 22/39] arm/futex: disable preemption during futex_atomic_cmpxchg_inatomic() Steven Rostedt
                   ` (21 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Mike Galbraith

[-- Attachment #1: 0020-x86-UV-raw_spinlock-conversion.patch --]
[-- Type: text/plain, Size: 8391 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Mike Galbraith <umgwanakikbuti@gmail.com>

Shrug.  Lots of hobbyists have a beast in their basement, right?

Cc: stable-rt@vger.kernel.org
Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/uv/uv_bau.h   | 14 +++++++-------
 arch/x86/include/asm/uv/uv_hub.h   |  2 +-
 arch/x86/kernel/apic/x2apic_uv_x.c |  2 +-
 arch/x86/platform/uv/tlb_uv.c      | 26 +++++++++++++-------------
 arch/x86/platform/uv/uv_time.c     | 21 +++++++++++++--------
 5 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 0b46ef261c77..c2c8a7828f04 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -611,9 +611,9 @@ struct bau_control {
 	cycles_t		send_message;
 	cycles_t		period_end;
 	cycles_t		period_time;
-	spinlock_t		uvhub_lock;
-	spinlock_t		queue_lock;
-	spinlock_t		disable_lock;
+	raw_spinlock_t		uvhub_lock;
+	raw_spinlock_t		queue_lock;
+	raw_spinlock_t		disable_lock;
 	/* tunables */
 	int			max_concurr;
 	int			max_concurr_const;
@@ -773,15 +773,15 @@ static inline int atom_asr(short i, struct atomic_short *v)
  * to be lowered below the current 'v'.  atomic_add_unless can only stop
  * on equal.
  */
-static inline int atomic_inc_unless_ge(spinlock_t *lock, atomic_t *v, int u)
+static inline int atomic_inc_unless_ge(raw_spinlock_t *lock, atomic_t *v, int u)
 {
-	spin_lock(lock);
+	raw_spin_lock(lock);
 	if (atomic_read(v) >= u) {
-		spin_unlock(lock);
+		raw_spin_unlock(lock);
 		return 0;
 	}
 	atomic_inc(v);
-	spin_unlock(lock);
+	raw_spin_unlock(lock);
 	return 1;
 }
 
diff --git a/arch/x86/include/asm/uv/uv_hub.h b/arch/x86/include/asm/uv/uv_hub.h
index a30836c8ac4d..f559a092d8fa 100644
--- a/arch/x86/include/asm/uv/uv_hub.h
+++ b/arch/x86/include/asm/uv/uv_hub.h
@@ -502,7 +502,7 @@ struct uv_blade_info {
 	unsigned short	nr_online_cpus;
 	unsigned short	pnode;
 	short		memory_nid;
-	spinlock_t	nmi_lock;	/* obsolete, see uv_hub_nmi */
+	raw_spinlock_t	nmi_lock;	/* obsolete, see uv_hub_nmi */
 	unsigned long	nmi_count;	/* obsolete, see uv_hub_nmi */
 };
 extern struct uv_blade_info *uv_blade_info;
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index d263b1307de1..1ef690e5459f 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -911,7 +911,7 @@ void __init uv_system_init(void)
 			uv_blade_info[blade].pnode = pnode;
 			uv_blade_info[blade].nr_possible_cpus = 0;
 			uv_blade_info[blade].nr_online_cpus = 0;
-			spin_lock_init(&uv_blade_info[blade].nmi_lock);
+			raw_spin_lock_init(&uv_blade_info[blade].nmi_lock);
 			min_pnode = min(pnode, min_pnode);
 			max_pnode = max(pnode, max_pnode);
 			blade++;
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index dfe605ac1bcd..999fd5be9aa1 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -719,9 +719,9 @@ static void destination_plugged(struct bau_desc *bau_desc,
 
 		quiesce_local_uvhub(hmaster);
 
-		spin_lock(&hmaster->queue_lock);
+		raw_spin_lock(&hmaster->queue_lock);
 		reset_with_ipi(&bau_desc->distribution, bcp);
-		spin_unlock(&hmaster->queue_lock);
+		raw_spin_unlock(&hmaster->queue_lock);
 
 		end_uvhub_quiesce(hmaster);
 
@@ -741,9 +741,9 @@ static void destination_timeout(struct bau_desc *bau_desc,
 
 		quiesce_local_uvhub(hmaster);
 
-		spin_lock(&hmaster->queue_lock);
+		raw_spin_lock(&hmaster->queue_lock);
 		reset_with_ipi(&bau_desc->distribution, bcp);
-		spin_unlock(&hmaster->queue_lock);
+		raw_spin_unlock(&hmaster->queue_lock);
 
 		end_uvhub_quiesce(hmaster);
 
@@ -764,7 +764,7 @@ static void disable_for_period(struct bau_control *bcp, struct ptc_stats *stat)
 	cycles_t tm1;
 
 	hmaster = bcp->uvhub_master;
-	spin_lock(&hmaster->disable_lock);
+	raw_spin_lock(&hmaster->disable_lock);
 	if (!bcp->baudisabled) {
 		stat->s_bau_disabled++;
 		tm1 = get_cycles();
@@ -777,7 +777,7 @@ static void disable_for_period(struct bau_control *bcp, struct ptc_stats *stat)
 			}
 		}
 	}
-	spin_unlock(&hmaster->disable_lock);
+	raw_spin_unlock(&hmaster->disable_lock);
 }
 
 static void count_max_concurr(int stat, struct bau_control *bcp,
@@ -840,7 +840,7 @@ static void record_send_stats(cycles_t time1, cycles_t time2,
  */
 static void uv1_throttle(struct bau_control *hmaster, struct ptc_stats *stat)
 {
-	spinlock_t *lock = &hmaster->uvhub_lock;
+	raw_spinlock_t *lock = &hmaster->uvhub_lock;
 	atomic_t *v;
 
 	v = &hmaster->active_descriptor_count;
@@ -972,7 +972,7 @@ static int check_enable(struct bau_control *bcp, struct ptc_stats *stat)
 	struct bau_control *hmaster;
 
 	hmaster = bcp->uvhub_master;
-	spin_lock(&hmaster->disable_lock);
+	raw_spin_lock(&hmaster->disable_lock);
 	if (bcp->baudisabled && (get_cycles() >= bcp->set_bau_on_time)) {
 		stat->s_bau_reenabled++;
 		for_each_present_cpu(tcpu) {
@@ -984,10 +984,10 @@ static int check_enable(struct bau_control *bcp, struct ptc_stats *stat)
 				tbcp->period_giveups = 0;
 			}
 		}
-		spin_unlock(&hmaster->disable_lock);
+		raw_spin_unlock(&hmaster->disable_lock);
 		return 0;
 	}
-	spin_unlock(&hmaster->disable_lock);
+	raw_spin_unlock(&hmaster->disable_lock);
 	return -1;
 }
 
@@ -1895,9 +1895,9 @@ static void __init init_per_cpu_tunables(void)
 		bcp->cong_reps			= congested_reps;
 		bcp->disabled_period =		sec_2_cycles(disabled_period);
 		bcp->giveup_limit =		giveup_limit;
-		spin_lock_init(&bcp->queue_lock);
-		spin_lock_init(&bcp->uvhub_lock);
-		spin_lock_init(&bcp->disable_lock);
+		raw_spin_lock_init(&bcp->queue_lock);
+		raw_spin_lock_init(&bcp->uvhub_lock);
+		raw_spin_lock_init(&bcp->disable_lock);
 	}
 }
 
diff --git a/arch/x86/platform/uv/uv_time.c b/arch/x86/platform/uv/uv_time.c
index 5c86786bbfd2..c039afa26aa2 100644
--- a/arch/x86/platform/uv/uv_time.c
+++ b/arch/x86/platform/uv/uv_time.c
@@ -58,7 +58,7 @@ static DEFINE_PER_CPU(struct clock_event_device, cpu_ced);
 
 /* There is one of these allocated per node */
 struct uv_rtc_timer_head {
-	spinlock_t	lock;
+	raw_spinlock_t	lock;
 	/* next cpu waiting for timer, local node relative: */
 	int		next_cpu;
 	/* number of cpus on this node: */
@@ -178,7 +178,7 @@ static __init int uv_rtc_allocate_timers(void)
 				uv_rtc_deallocate_timers();
 				return -ENOMEM;
 			}
-			spin_lock_init(&head->lock);
+			raw_spin_lock_init(&head->lock);
 			head->ncpus = uv_blade_nr_possible_cpus(bid);
 			head->next_cpu = -1;
 			blade_info[bid] = head;
@@ -232,7 +232,7 @@ static int uv_rtc_set_timer(int cpu, u64 expires)
 	unsigned long flags;
 	int next_cpu;
 
-	spin_lock_irqsave(&head->lock, flags);
+	raw_spin_lock_irqsave(&head->lock, flags);
 
 	next_cpu = head->next_cpu;
 	*t = expires;
@@ -244,12 +244,12 @@ static int uv_rtc_set_timer(int cpu, u64 expires)
 		if (uv_setup_intr(cpu, expires)) {
 			*t = ULLONG_MAX;
 			uv_rtc_find_next_timer(head, pnode);
-			spin_unlock_irqrestore(&head->lock, flags);
+			raw_spin_unlock_irqrestore(&head->lock, flags);
 			return -ETIME;
 		}
 	}
 
-	spin_unlock_irqrestore(&head->lock, flags);
+	raw_spin_unlock_irqrestore(&head->lock, flags);
 	return 0;
 }
 
@@ -268,7 +268,7 @@ static int uv_rtc_unset_timer(int cpu, int force)
 	unsigned long flags;
 	int rc = 0;
 
-	spin_lock_irqsave(&head->lock, flags);
+	raw_spin_lock_irqsave(&head->lock, flags);
 
 	if ((head->next_cpu == bcpu && uv_read_rtc(NULL) >= *t) || force)
 		rc = 1;
@@ -280,7 +280,7 @@ static int uv_rtc_unset_timer(int cpu, int force)
 			uv_rtc_find_next_timer(head, pnode);
 	}
 
-	spin_unlock_irqrestore(&head->lock, flags);
+	raw_spin_unlock_irqrestore(&head->lock, flags);
 
 	return rc;
 }
@@ -300,13 +300,18 @@ static int uv_rtc_unset_timer(int cpu, int force)
 static cycle_t uv_read_rtc(struct clocksource *cs)
 {
 	unsigned long offset;
+	cycle_t cycles;
 
+	preempt_disable();
 	if (uv_get_min_hub_revision_id() == 1)
 		offset = 0;
 	else
 		offset = (uv_blade_processor_id() * L1_CACHE_BYTES) % PAGE_SIZE;
 
-	return (cycle_t)uv_read_local_mmr(UVH_RTC | offset);
+	cycles = (cycle_t)uv_read_local_mmr(UVH_RTC | offset);
+	preempt_enable();
+
+	return cycles;
 }
 
 /*
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 22/39] arm/futex: disable preemption during futex_atomic_cmpxchg_inatomic()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (19 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 20/39] x86: UV: raw_spinlock conversion Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 23/39] ARM: cmpxchg: define __HAVE_ARCH_CMPXCHG for armv6 and later Steven Rostedt
                   ` (20 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt

[-- Attachment #1: 0022-arm-futex-disable-preemption-during-futex_atomic_cmp.patch --]
[-- Type: text/plain, Size: 1879 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

The ARM UP implementation of futex_atomic_cmpxchg_inatomic() assumes that
pagefault_disable() inherits a preempt disabled section. This assumtion
is true for mainline but -RT reverts this and allows preemption in
pagefault disabled regions.
The code sequence of futex_atomic_cmpxchg_inatomic():

|   x = *futex;
|   if (x == oldval)
|           *futex = newval;

The problem occurs if the code is preempted after reading the futex value or
after comparing it with x. While preempted, the futex owner has to be
scheduled which then releases the lock (in userland because it has no waiter
yet). Once the code is back on the CPU, it overwrites the futex value
with with the old PID and the waiter bit set.

The workaround is to explicit disable code preemption to avoid the
described race window.

Debugged-by:  Thomas Gleixner <tglx@linutronix.de>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/include/asm/futex.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index 2aff798fbef4..54d26a98ff84 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -90,6 +90,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
 
+	preempt_disable_rt();
+
 	__asm__ __volatile__("@futex_atomic_cmpxchg_inatomic\n"
 	"1:	" TUSER(ldr) "	%1, [%4]\n"
 	"	teq	%1, %2\n"
@@ -101,6 +103,8 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	: "cc", "memory");
 
 	*uval = val;
+
+	preempt_enable_rt();
 	return ret;
 }
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 23/39] ARM: cmpxchg: define __HAVE_ARCH_CMPXCHG for armv6 and later
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (20 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 22/39] arm/futex: disable preemption during futex_atomic_cmpxchg_inatomic() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 24/39] mips: rt: Replace pagefault_* to raw version Steven Rostedt
                   ` (19 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt

[-- Attachment #1: 0023-ARM-cmpxchg-define-__HAVE_ARCH_CMPXCHG-for-armv6-and.patch --]
[-- Type: text/plain, Size: 1566 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

Both pi_stress and sigwaittest in rt-test show performance gain with
__HAVE_ARCH_CMPXCHG. Testing result on coretile_express_a9x4:

pi_stress -p 99 --duration=300 (on linux-3.4-rc5; bigger is better)
  vanilla:     Total inversion performed: 5493381
  patched:     Total inversion performed: 5621746

sigwaittest -p 99 -l 100000 (on linux-3.4-rc5-rt6; less is better)
  3.4-rc5-rt6: Min   24, Cur   27, Avg   30, Max   98
  patched:     Min   19, Cur   21, Avg   23, Max   96

Signed-off-by: Yong Zhang <yong.zhang0 at gmail.com>
Cc: Russell King <rmk+kernel at arm.linux.org.uk>
Cc: Nicolas Pitre <nico at linaro.org>
Cc: Will Deacon <will.deacon at arm.com>
Cc: Catalin Marinas <catalin.marinas at arm.com>
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: linux-arm-kernel at lists.infradead.org
Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/include/asm/cmpxchg.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/include/asm/cmpxchg.h b/arch/arm/include/asm/cmpxchg.h
index df2fbba7efc8..30b1a4010b89 100644
--- a/arch/arm/include/asm/cmpxchg.h
+++ b/arch/arm/include/asm/cmpxchg.h
@@ -127,6 +127,8 @@ static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size
 
 #else	/* min ARCH >= ARMv6 */
 
+#define __HAVE_ARCH_CMPXCHG 1
+
 extern void __bad_cmpxchg(volatile void *ptr, int size);
 
 /*
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 24/39] mips: rt: Replace pagefault_* to raw version
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (21 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 23/39] ARM: cmpxchg: define __HAVE_ARCH_CMPXCHG for armv6 and later Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 25/39] sas-ata/isci: dontt disable interrupts in qc_issue handler Steven Rostedt
                   ` (18 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Yang Shi

[-- Attachment #1: 0024-mips-rt-Replace-pagefault_-to-raw-version.patch --]
[-- Type: text/plain, Size: 4086 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Yang Shi <yang.shi@windriver.com>

In k{un}map_coherent, pagefault_disable and pagefault_enable are called
respectively, but k{un}map_coherent needs preempt disabled according to
commit f8829caee311207afbc882794bdc5aa0db5caf33 ("[MIPS] Fix aliasing bug
in copy_to_user_page / copy_from_user_page") to avoid dcache alias on COW.

k{un}map_coherent are just called when cpu_has_dc_aliases == 1 with VIPT cache.
However, actually, the most modern MIPS processors have PIPT dcache without
dcache alias issue. In such case, k{un}map_atomic will be called with preempt
enabled.

To fix this, we replace pagefault_* to raw version in k{un}map_coherent, which
disables preempt, otherwise the following kernel panic may be caught:

CPU 0 Unable to handle kernel paging request at virtual address fffffffffffd5000, epc == ffffffff80122c00, ra == ffffffff8011fbcc
Oops[#1]:
CPU: 0 PID: 409 Comm: runltp Not tainted 3.14.17-rt5 #1
task: 980000000fa936f0 ti: 980000000eed0000 task.ti: 980000000eed0000
$ 0 : 0000000000000000 000000001400a4e1 fffffffffffd5000 0000000000000001
$ 4 : 980000000cded000 fffffffffffd5000 980000000cdedf00 ffffffffffff00fe
$ 8 : 0000000000000000 ffffffffffffff00 000000000000000d 0000000000000004
$12 : 980000000eed3fe0 000000000000a400 ffffffffa00ae278 0000000000000000
$16 : 980000000cded000 000000726eb855c8 98000000012ccfe8 ffffffff8095e0c0
$20 : ffffffff80ad0000 ffffffff8095e0c0 98000000012d0bd8 980000000fb92000
$24 : 0000000000000000 ffffffff80177fb0
$28 : 980000000eed0000 980000000eed3b60 980000000fb92060 ffffffff8011fbcc
Hi : 000000000002cb02
Lo : 000000000000ee56
epc : ffffffff80122c00 copy_page+0x38/0x548
    Not tainted
ra : ffffffff8011fbcc copy_user_highpage+0x16c/0x180
Status: 1400a4e3 KX SX UX KERNEL EXL IE
Cause : 10800408
BadVA : fffffffffffd5000
PrId : 00010000 (MIPS64R2-generic)
Modules linked in: i2c_piix4 i2c_core uhci_hcd
Process runltp (pid: 409, threadinfo=980000000eed0000, task=980000000fa936f0, tls=000000fff7756700)
Stack : 98000000012ccfe8 980000000eeb7ba8 980000000ecc7508 000000000666da5b
000000726eb855c8 ffffffff802156e0 000000726ea4a000 98000000010007e0
980000000fb92060 0000000000000000 0000000000000000 6db6db6db6db6db7
0000000000000080 000000726eb855c8 980000000fb92000 980000000eeeec28
980000000ecc7508 980000000fb92060 0000000000000001 00000000000000a9
ffffffff80995e60 ffffffff80218910 000000001400a4e0 ffffffff804efd24
980000000ee25b90 ffffffff8079cec4 ffffffff8079d49c ffffffff80979658
000000000666da5b 980000000eeb7ba8 000000726eb855c8 00000000000000a9
980000000fb92000 980000000fa936f0 980000000eed3eb0 0000000000000001
980000000fb92088 0000000000030002 980000000ecc7508 ffffffff8011ecd0
...
Call Trace:
[<ffffffff80122c00>] copy_page+0x38/0x548
[<ffffffff8011fbcc>] copy_user_highpage+0x16c/0x180
[<ffffffff802156e0>] do_wp_page+0x658/0xcd8
[<ffffffff80218910>] handle_mm_fault+0x7d8/0x1070
[<ffffffff8011ecd0>] __do_page_fault+0x1a0/0x508
[<ffffffff80104d84>] resume_userspace_check+0x0/0x10

Or there may be random segmentation fault happened.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/mips/mm/init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 6b59617760c1..740219e35d6c 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -124,7 +124,7 @@ void *kmap_coherent(struct page *page, unsigned long addr)
 
 	BUG_ON(Page_dcache_dirty(page));
 
-	pagefault_disable();
+	raw_pagefault_disable();
 	idx = (addr >> PAGE_SHIFT) & (FIX_N_COLOURS - 1);
 #ifdef CONFIG_MIPS_MT_SMTC
 	idx += FIX_N_COLOURS * smp_processor_id() +
@@ -191,7 +191,7 @@ void kunmap_coherent(void)
 	write_c0_entryhi(old_ctx);
 	EXIT_CRITICAL(flags);
 #endif
-	pagefault_enable();
+	raw_pagefault_enable();
 }
 
 void copy_user_highpage(struct page *to, struct page *from,
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 25/39] sas-ata/isci: dontt disable interrupts in qc_issue handler
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (22 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 24/39] mips: rt: Replace pagefault_* to raw version Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 26/39] scheduling while atomic in cgroup code Steven Rostedt
                   ` (17 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt

[-- Attachment #1: 0025-sas-ata-isci-dont-t-disable-interrupts-in-qc_issue-h.patch --]
[-- Type: text/plain, Size: 3473 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Paul Gortmaker <paul.gortmaker@windriver.com>

On 3.14-rt we see the following trace on Canoe Pass for
SCSI_ISCI "Intel(R) C600 Series Chipset SAS Controller"
when the sas qc_issue handler is run:

 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:905
 in_atomic(): 0, irqs_disabled(): 1, pid: 432, name: udevd
 CPU: 11 PID: 432 Comm: udevd Not tainted 3.14.28-rt22 #2
 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
 ffff880fab500000 ffff880fa9f239c0 ffffffff81a2d273 0000000000000000
 ffff880fa9f239d8 ffffffff8107f023 ffff880faac23dc0 ffff880fa9f239f0
 ffffffff81a33cc0 ffff880faaeb1400 ffff880fa9f23a40 ffffffff815de891
 Call Trace:
 [<ffffffff81a2d273>] dump_stack+0x4e/0x7a
 [<ffffffff8107f023>] __might_sleep+0xe3/0x160
 [<ffffffff81a33cc0>] rt_spin_lock+0x20/0x50
 [<ffffffff815de891>] isci_task_execute_task+0x171/0x2f0  <-----
 [<ffffffff815cfecb>] sas_ata_qc_issue+0x25b/0x2a0
 [<ffffffff81606363>] ata_qc_issue+0x1f3/0x370
 [<ffffffff8160c600>] ? ata_scsi_invalid_field+0x40/0x40
 [<ffffffff8160c8f5>] ata_scsi_translate+0xa5/0x1b0
 [<ffffffff8160efc6>] ata_sas_queuecmd+0x86/0x280
 [<ffffffff815ce446>] sas_queuecommand+0x196/0x230
 [<ffffffff81081fad>] ? get_parent_ip+0xd/0x50
 [<ffffffff815b05a4>] scsi_dispatch_cmd+0xb4/0x210
 [<ffffffff815b7744>] scsi_request_fn+0x314/0x530

and gdb shows:

(gdb) list * isci_task_execute_task+0x171
0xffffffff815ddfb1 is in isci_task_execute_task (drivers/scsi/isci/task.c:138).
133             dev_dbg(&ihost->pdev->dev, "%s: num=%d\n", __func__, num);
134
135             for_each_sas_task(num, task) {
136                     enum sci_status status = SCI_FAILURE;
137
138                     spin_lock_irqsave(&ihost->scic_lock, flags);    <-----
139                     idev = isci_lookup_device(task->dev);
140                     io_ready = isci_device_io_ready(idev, task);
141                     tag = isci_alloc_tag(ihost);
142                     spin_unlock_irqrestore(&ihost->scic_lock, flags);
(gdb)

In addition to the scic_lock, the function also contains locking of
the task_state_lock -- which is clearly not a candidate for raw lock
conversion.  As can be seen by the comment nearby, we really should
be running the qc_issue code with interrupts enabled anyway.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 drivers/scsi/libsas/sas_ata.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index d2895836f9fa..c85e07ab51e4 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -191,7 +191,7 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 	/* TODO: audit callers to ensure they are ready for qc_issue to
 	 * unconditionally re-enable interrupts
 	 */
-	local_irq_save(flags);
+	local_irq_save_nort(flags);
 	spin_unlock(ap->lock);
 
 	/* If the device fell off, no sense in issuing commands */
@@ -261,7 +261,7 @@ static unsigned int sas_ata_qc_issue(struct ata_queued_cmd *qc)
 
  out:
 	spin_lock(ap->lock);
-	local_irq_restore(flags);
+	local_irq_restore_nort(flags);
 	return ret;
 }
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 26/39] scheduling while atomic in cgroup code
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (23 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 25/39] sas-ata/isci: dontt disable interrupts in qc_issue handler Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-17 20:10   ` Paul Gortmaker
  2015-03-12 19:13 ` [PATCH RT 27/39] work-simple: Simple work queue implemenation Steven Rostedt
                   ` (16 subsequent siblings)
  41 siblings, 1 reply; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Nikita Yushchenko,
	Mike Galbraith

[-- Attachment #1: 0026-scheduling-while-atomic-in-cgroup-code.patch --]
[-- Type: text/plain, Size: 2201 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Mike Galbraith <umgwanakikbuti@gmail.com>

mm, memcg: make refill_stock() use get_cpu_light()

Nikita reported the following memcg scheduling while atomic bug:

Call Trace:
[e22d5a90] [c0007ea8] show_stack+0x4c/0x168 (unreliable)
[e22d5ad0] [c0618c04] __schedule_bug+0x94/0xb0
[e22d5ae0] [c060b9ec] __schedule+0x530/0x550
[e22d5bf0] [c060bacc] schedule+0x30/0xbc
[e22d5c00] [c060ca24] rt_spin_lock_slowlock+0x180/0x27c
[e22d5c70] [c00b39dc] res_counter_uncharge_until+0x40/0xc4
[e22d5ca0] [c013ca88] drain_stock.isra.20+0x54/0x98
[e22d5cc0] [c01402ac] __mem_cgroup_try_charge+0x2e8/0xbac
[e22d5d70] [c01410d4] mem_cgroup_charge_common+0x3c/0x70
[e22d5d90] [c0117284] __do_fault+0x38c/0x510
[e22d5df0] [c011a5f4] handle_pte_fault+0x98/0x858
[e22d5e50] [c060ed08] do_page_fault+0x42c/0x6fc
[e22d5f40] [c000f5b4] handle_page_fault+0xc/0x80

What happens:

   refill_stock()
      get_cpu_var()
      drain_stock()
         res_counter_uncharge()
            res_counter_uncharge_until()
               spin_lock() <== boom

Fix it by replacing get/put_cpu_var() with get/put_cpu_light().

Cc: stable-rt@vger.kernel.org
Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 mm/memcontrol.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fa54408b66b2..be24d2d186fa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2512,14 +2512,17 @@ static void __init memcg_stock_init(void)
  */
 static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
-	struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock);
+	struct memcg_stock_pcp *stock;
+	int cpu = get_cpu_light();
+
+	stock = &per_cpu(memcg_stock, cpu);
 
 	if (stock->cached != memcg) { /* reset if necessary */
 		drain_stock(stock);
 		stock->cached = memcg;
 	}
 	stock->nr_pages += nr_pages;
-	put_cpu_var(memcg_stock);
+	put_cpu_light();
 }
 
 /*
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 27/39] work-simple: Simple work queue implemenation
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (24 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 26/39] scheduling while atomic in cgroup code Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 28/39] sunrpc: make svc_xprt_do_enqueue() use get_cpu_light() Steven Rostedt
                   ` (15 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Daniel Wagner

[-- Attachment #1: 0027-work-simple-Simple-work-queue-implemenation.patch --]
[-- Type: text/plain, Size: 5808 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Daniel Wagner <daniel.wagner@bmw-carit.de>

Provides a framework for enqueuing callbacks from irq context
PREEMPT_RT_FULL safe. The callbacks are executed in kthread context.

Bases on wait-simple.

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/work-simple.h |  24 +++++++
 kernel/sched/Makefile       |   2 +-
 kernel/sched/work-simple.c  | 172 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 197 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/work-simple.h
 create mode 100644 kernel/sched/work-simple.c

diff --git a/include/linux/work-simple.h b/include/linux/work-simple.h
new file mode 100644
index 000000000000..f175fa9a6016
--- /dev/null
+++ b/include/linux/work-simple.h
@@ -0,0 +1,24 @@
+#ifndef _LINUX_SWORK_H
+#define _LINUX_SWORK_H
+
+#include <linux/list.h>
+
+struct swork_event {
+	struct list_head item;
+	unsigned long flags;
+	void (*func)(struct swork_event *);
+};
+
+static inline void INIT_SWORK(struct swork_event *event,
+			      void (*func)(struct swork_event *))
+{
+	event->flags = 0;
+	event->func = func;
+}
+
+bool swork_queue(struct swork_event *sev);
+
+int swork_get(void);
+void swork_put(void);
+
+#endif /* _LINUX_SWORK_H */
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index b14a512da520..d8374f999e9b 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -13,7 +13,7 @@ endif
 
 obj-y += core.o proc.o clock.o cputime.o
 obj-y += idle_task.o fair.o rt.o deadline.o stop_task.o
-obj-y += wait.o wait-simple.o completion.o
+obj-y += wait.o wait-simple.o work-simple.o completion.o
 obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o
 obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
 obj-$(CONFIG_SCHEDSTATS) += stats.o
diff --git a/kernel/sched/work-simple.c b/kernel/sched/work-simple.c
new file mode 100644
index 000000000000..c996f755dba6
--- /dev/null
+++ b/kernel/sched/work-simple.c
@@ -0,0 +1,172 @@
+/*
+ * Copyright (C) 2014 BMW Car IT GmbH, Daniel Wagner daniel.wagner@bmw-carit.de
+ *
+ * Provides a framework for enqueuing callbacks from irq context
+ * PREEMPT_RT_FULL safe. The callbacks are executed in kthread context.
+ */
+
+#include <linux/wait-simple.h>
+#include <linux/work-simple.h>
+#include <linux/kthread.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#define SWORK_EVENT_PENDING     (1 << 0)
+
+static DEFINE_MUTEX(worker_mutex);
+static struct sworker *glob_worker;
+
+struct sworker {
+	struct list_head events;
+	struct swait_head wq;
+
+	raw_spinlock_t lock;
+
+	struct task_struct *task;
+	int refs;
+};
+
+static bool swork_readable(struct sworker *worker)
+{
+	bool r;
+
+	if (kthread_should_stop())
+		return true;
+
+	raw_spin_lock_irq(&worker->lock);
+	r = !list_empty(&worker->events);
+	raw_spin_unlock_irq(&worker->lock);
+
+	return r;
+}
+
+static int swork_kthread(void *arg)
+{
+	struct sworker *worker = arg;
+
+	for (;;) {
+		swait_event_interruptible(worker->wq,
+					swork_readable(worker));
+		if (kthread_should_stop())
+			break;
+
+		raw_spin_lock_irq(&worker->lock);
+		while (!list_empty(&worker->events)) {
+			struct swork_event *sev;
+
+			sev = list_first_entry(&worker->events,
+					struct swork_event, item);
+			list_del(&sev->item);
+			raw_spin_unlock_irq(&worker->lock);
+
+			WARN_ON_ONCE(!test_and_clear_bit(SWORK_EVENT_PENDING,
+							 &sev->flags));
+			sev->func(sev);
+			raw_spin_lock_irq(&worker->lock);
+		}
+		raw_spin_unlock_irq(&worker->lock);
+	}
+	return 0;
+}
+
+static struct sworker *swork_create(void)
+{
+	struct sworker *worker;
+
+	worker = kzalloc(sizeof(*worker), GFP_KERNEL);
+	if (!worker)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&worker->events);
+	raw_spin_lock_init(&worker->lock);
+	init_swait_head(&worker->wq);
+
+	worker->task = kthread_run(swork_kthread, worker, "kswork");
+	if (IS_ERR(worker->task)) {
+		kfree(worker);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return worker;
+}
+
+static void swork_destroy(struct sworker *worker)
+{
+	kthread_stop(worker->task);
+
+	WARN_ON(!list_empty(&worker->events));
+	kfree(worker);
+}
+
+/**
+ * swork_queue - queue swork
+ *
+ * Returns %false if @work was already on a queue, %true otherwise.
+ *
+ * The work is queued and processed on a random CPU
+ */
+bool swork_queue(struct swork_event *sev)
+{
+	unsigned long flags;
+
+	if (test_and_set_bit(SWORK_EVENT_PENDING, &sev->flags))
+		return false;
+
+	raw_spin_lock_irqsave(&glob_worker->lock, flags);
+	list_add_tail(&sev->item, &glob_worker->events);
+	raw_spin_unlock_irqrestore(&glob_worker->lock, flags);
+
+	swait_wake(&glob_worker->wq);
+	return true;
+}
+EXPORT_SYMBOL_GPL(swork_queue);
+
+/**
+ * swork_get - get an instance of the sworker
+ *
+ * Returns an negative error code if the initialization if the worker did not
+ * work, %0 otherwise.
+ *
+ */
+int swork_get(void)
+{
+	struct sworker *worker;
+
+	mutex_lock(&worker_mutex);
+	if (!glob_worker) {
+		worker = swork_create();
+		if (IS_ERR(worker)) {
+			mutex_unlock(&worker_mutex);
+			return -ENOMEM;
+		}
+
+		glob_worker = worker;
+	}
+
+	glob_worker->refs++;
+	mutex_unlock(&worker_mutex);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(swork_get);
+
+/**
+ * swork_put - puts an instance of the sworker
+ *
+ * Will destroy the sworker thread. This function must not be called until all
+ * queued events have been completed.
+ */
+void swork_put(void)
+{
+	mutex_lock(&worker_mutex);
+
+	glob_worker->refs--;
+	if (glob_worker->refs > 0)
+		goto out;
+
+	swork_destroy(glob_worker);
+	glob_worker = NULL;
+out:
+	mutex_unlock(&worker_mutex);
+}
+EXPORT_SYMBOL_GPL(swork_put);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 28/39] sunrpc: make svc_xprt_do_enqueue() use get_cpu_light()
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (25 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 27/39] work-simple: Simple work queue implemenation Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 29/39] Revert "rwsem-rt: Do not allow readers to nest" Steven Rostedt
                   ` (14 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Mike Galbraith

[-- Attachment #1: 0028-sunrpc-make-svc_xprt_do_enqueue-use-get_cpu_light.patch --]
[-- Type: text/plain, Size: 2120 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Mike Galbraith <umgwanakikbuti@gmail.com>

|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:915
|in_atomic(): 1, irqs_disabled(): 0, pid: 3194, name: rpc.nfsd
|Preemption disabled at:[<ffffffffa06bf0bb>] svc_xprt_received+0x4b/0xc0 [sunrpc]
|CPU: 6 PID: 3194 Comm: rpc.nfsd Not tainted 3.18.7-rt1 #9
|Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.404 11/06/2014
| ffff880409630000 ffff8800d9a33c78 ffffffff815bdeb5 0000000000000002
| 0000000000000000 ffff8800d9a33c98 ffffffff81073c86 ffff880408dd6008
| ffff880408dd6000 ffff8800d9a33cb8 ffffffff815c3d84 ffff88040b3ac000
|Call Trace:
| [<ffffffff815bdeb5>] dump_stack+0x4f/0x9e
| [<ffffffff81073c86>] __might_sleep+0xe6/0x150
| [<ffffffff815c3d84>] rt_spin_lock+0x24/0x50
| [<ffffffffa06beec0>] svc_xprt_do_enqueue+0x80/0x230 [sunrpc]
| [<ffffffffa06bf0bb>] svc_xprt_received+0x4b/0xc0 [sunrpc]
| [<ffffffffa06c03ed>] svc_add_new_perm_xprt+0x6d/0x80 [sunrpc]
| [<ffffffffa06b2693>] svc_addsock+0x143/0x200 [sunrpc]
| [<ffffffffa072e69c>] write_ports+0x28c/0x340 [nfsd]
| [<ffffffffa072d2ac>] nfsctl_transaction_write+0x4c/0x80 [nfsd]
| [<ffffffff8117ee83>] vfs_write+0xb3/0x1d0
| [<ffffffff8117f889>] SyS_write+0x49/0xb0
| [<ffffffff815c4556>] system_call_fastpath+0x16/0x1b

Cc: stable-rt@vger.kernel.org
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 net/sunrpc/svc_xprt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index a4acaf2bcf18..22ca25818120 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -349,9 +349,9 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 	if (!svc_xprt_has_something_to_do(xprt))
 		return;
 
-	cpu = get_cpu();
+	cpu = get_cpu_light();
 	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
-	put_cpu();
+	put_cpu_light();
 
 	spin_lock_bh(&pool->sp_lock);
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 29/39] Revert "rwsem-rt: Do not allow readers to nest"
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (26 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 28/39] sunrpc: make svc_xprt_do_enqueue() use get_cpu_light() Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 30/39] locking: ww_mutex: fix ww_mutex vs self-deadlock Steven Rostedt
                   ` (13 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt

[-- Attachment #1: 0029-Revert-rwsem-rt-Do-not-allow-readers-to-nest.patch --]
[-- Type: text/plain, Size: 3114 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

This behaviour is required by cpufreq and its logic is "okay": It does a
read_lock followed by a try_read_lock.
Lockdep warns if one try a read_lock twice in -RT and vanilla so it
should be good. We still only allow multiple readers as long as it is in
the same process.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/rwsem_rt.h |  1 +
 kernel/locking/rt.c      | 29 +++++++++++++++++++++++------
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/include/linux/rwsem_rt.h b/include/linux/rwsem_rt.h
index 0065b08fbb7a..924c2d274ab5 100644
--- a/include/linux/rwsem_rt.h
+++ b/include/linux/rwsem_rt.h
@@ -20,6 +20,7 @@
 
 struct rw_semaphore {
 	struct rt_mutex		lock;
+	int			read_depth;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 #endif
diff --git a/kernel/locking/rt.c b/kernel/locking/rt.c
index eac2ddea9c45..9a972e996185 100644
--- a/kernel/locking/rt.c
+++ b/kernel/locking/rt.c
@@ -322,7 +322,8 @@ EXPORT_SYMBOL(rt_up_write);
 void  rt_up_read(struct rw_semaphore *rwsem)
 {
 	rwsem_release(&rwsem->dep_map, 1, _RET_IP_);
-	rt_mutex_unlock(&rwsem->lock);
+	if (--rwsem->read_depth == 0)
+		rt_mutex_unlock(&rwsem->lock);
 }
 EXPORT_SYMBOL(rt_up_read);
 
@@ -333,6 +334,7 @@ EXPORT_SYMBOL(rt_up_read);
 void  rt_downgrade_write(struct rw_semaphore *rwsem)
 {
 	BUG_ON(rt_mutex_owner(&rwsem->lock) != current);
+	rwsem->read_depth = 1;
 }
 EXPORT_SYMBOL(rt_downgrade_write);
 
@@ -370,12 +372,23 @@ EXPORT_SYMBOL(rt_down_write_nested_lock);
 
 int  rt_down_read_trylock(struct rw_semaphore *rwsem)
 {
-	int ret;
+	struct rt_mutex *lock = &rwsem->lock;
+	int ret = 1;
 
-	ret = rt_mutex_trylock(&rwsem->lock);
-	if (ret)
-		rwsem_acquire(&rwsem->dep_map, 0, 1, _RET_IP_);
+	/*
+	 * recursive read locks succeed when current owns the rwsem,
+	 * but not when read_depth == 0 which means that the rwsem is
+	 * write locked.
+	 */
+	if (rt_mutex_owner(lock) != current)
+		ret = rt_mutex_trylock(&rwsem->lock);
+	else if (!rwsem->read_depth)
+		ret = 0;
 
+	if (ret) {
+		rwsem->read_depth++;
+		rwsem_acquire(&rwsem->dep_map, 0, 1, _RET_IP_);
+	}
 	return ret;
 }
 EXPORT_SYMBOL(rt_down_read_trylock);
@@ -383,7 +396,10 @@ EXPORT_SYMBOL(rt_down_read_trylock);
 static void __rt_down_read(struct rw_semaphore *rwsem, int subclass)
 {
 	rwsem_acquire(&rwsem->dep_map, subclass, 0, _RET_IP_);
-	rt_mutex_lock(&rwsem->lock);
+
+	if (rt_mutex_owner(&rwsem->lock) != current)
+		rt_mutex_lock(&rwsem->lock);
+	rwsem->read_depth++;
 }
 
 void  rt_down_read(struct rw_semaphore *rwsem)
@@ -408,6 +424,7 @@ void  __rt_rwsem_init(struct rw_semaphore *rwsem, const char *name,
 	debug_check_no_locks_freed((void *)rwsem, sizeof(*rwsem));
 	lockdep_init_map(&rwsem->dep_map, name, key, 0);
 #endif
+	rwsem->read_depth = 0;
 	rwsem->lock.save_state = 0;
 }
 EXPORT_SYMBOL(__rt_rwsem_init);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 30/39] locking: ww_mutex: fix ww_mutex vs self-deadlock
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (27 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 29/39] Revert "rwsem-rt: Do not allow readers to nest" Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 31/39] thermal: Defer thermal wakups to threads Steven Rostedt
                   ` (12 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Mike Galbraith

[-- Attachment #1: 0030-locking-ww_mutex-fix-ww_mutex-vs-self-deadlock.patch --]
[-- Type: text/plain, Size: 3198 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Mike Galbraith <umgwanakikbuti@gmail.com>

If the caller already holds the mutex, task_blocks_on_rt_mutex()
returns -EDEADLK, we proceed directly to rt_mutex_handle_deadlock()
where it's instant game over.

Let ww_mutexes return EDEADLK/EALREADY as they want to instead.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/locking/rtmutex.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 27a1993111f9..ee4e7e747e06 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1694,13 +1694,20 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 
 	if (likely(!ret))
 		ret = __rt_mutex_slowlock(lock, state, timeout, &waiter, ww_ctx);
+	else if (ww_ctx) {
+		/* ww_mutex received EDEADLK, let it become EALREADY */
+		ret = __mutex_lock_check_stamp(lock, ww_ctx);
+		BUG_ON(!ret);
+	}
 
 	set_current_state(TASK_RUNNING);
 
 	if (unlikely(ret)) {
 		if (rt_mutex_has_waiters(lock))
 			remove_waiter(lock, &waiter);
-		rt_mutex_handle_deadlock(ret, chwalk, &waiter);
+		/* ww_mutex want to report EDEADLK/EALREADY, let them */
+		if (!ww_ctx)
+			rt_mutex_handle_deadlock(ret, chwalk, &waiter);
 	} else if (ww_ctx) {
 		ww_mutex_account_lock(lock, ww_ctx);
 	}
@@ -2239,8 +2246,7 @@ __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ww_c
 	might_sleep();
 
 	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
-	ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL,
-				RT_MUTEX_FULL_CHAINWALK, ww_ctx);
+	ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 0, ww_ctx);
 	if (ret)
 		mutex_release(&lock->base.dep_map, 1, _RET_IP_);
 	else if (!ret && ww_ctx->acquired > 1)
@@ -2258,8 +2264,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx)
 	might_sleep();
 
 	mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map, _RET_IP_);
-	ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL,
-				RT_MUTEX_FULL_CHAINWALK, ww_ctx);
+	ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 0, ww_ctx);
 	if (ret)
 		mutex_release(&lock->base.dep_map, 1, _RET_IP_);
 	else if (!ret && ww_ctx->acquired > 1)
@@ -2271,11 +2276,13 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock);
 
 void __sched ww_mutex_unlock(struct ww_mutex *lock)
 {
+	int nest = !!lock->ctx;
+
 	/*
 	 * The unlocking fastpath is the 0->1 transition from 'locked'
 	 * into 'unlocked' state:
 	 */
-	if (lock->ctx) {
+	if (nest) {
 #ifdef CONFIG_DEBUG_MUTEXES
 		DEBUG_LOCKS_WARN_ON(!lock->ctx->acquired);
 #endif
@@ -2284,7 +2291,7 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
 		lock->ctx = NULL;
 	}
 
-	mutex_release(&lock->base.dep_map, 1, _RET_IP_);
+	mutex_release(&lock->base.dep_map, nest, _RET_IP_);
 	rt_mutex_unlock(&lock->base.lock);
 }
 EXPORT_SYMBOL(ww_mutex_unlock);
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 31/39] thermal: Defer thermal wakups to threads
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (28 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 30/39] locking: ww_mutex: fix ww_mutex vs self-deadlock Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 32/39] lockdep: selftest: fix warnings due to missing PREEMPT_RT conditionals Steven Rostedt
                   ` (11 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Daniel Wagner

[-- Attachment #1: 0031-thermal-Defer-thermal-wakups-to-threads.patch --]
[-- Type: text/plain, Size: 4191 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Daniel Wagner <wagi@monom.org>

On RT the spin lock in pkg_temp_thermal_platfrom_thermal_notify will
call schedule while we run in irq context.

[<ffffffff816850ac>] dump_stack+0x4e/0x8f
[<ffffffff81680f7d>] __schedule_bug+0xa6/0xb4
[<ffffffff816896b4>] __schedule+0x5b4/0x700
[<ffffffff8168982a>] schedule+0x2a/0x90
[<ffffffff8168a8b5>] rt_spin_lock_slowlock+0xe5/0x2d0
[<ffffffff8168afd5>] rt_spin_lock+0x25/0x30
[<ffffffffa03a7b75>] pkg_temp_thermal_platform_thermal_notify+0x45/0x134 [x86_pkg_temp_thermal]
[<ffffffff8103d4db>] ? therm_throt_process+0x1b/0x160
[<ffffffff8103d831>] intel_thermal_interrupt+0x211/0x250
[<ffffffff8103d8c1>] smp_thermal_interrupt+0x21/0x40
[<ffffffff8169415d>] thermal_interrupt+0x6d/0x80

Let's defer the work to a kthread.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bigeasy: reoder init/denit position. TODO: flush swork on exit]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/thermal/x86_pkg_temp_thermal.c | 50 ++++++++++++++++++++++++++++++++--
 1 file changed, 47 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/x86_pkg_temp_thermal.c b/drivers/thermal/x86_pkg_temp_thermal.c
index 081fd7e6a9f0..f297ddd3365c 100644
--- a/drivers/thermal/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/x86_pkg_temp_thermal.c
@@ -29,6 +29,7 @@
 #include <linux/pm.h>
 #include <linux/thermal.h>
 #include <linux/debugfs.h>
+#include <linux/work-simple.h>
 #include <asm/cpu_device_id.h>
 #include <asm/mce.h>
 
@@ -352,7 +353,7 @@ static void pkg_temp_thermal_threshold_work_fn(struct work_struct *work)
 	}
 }
 
-static int pkg_temp_thermal_platform_thermal_notify(__u64 msr_val)
+static void platform_thermal_notify_work(struct swork_event *event)
 {
 	unsigned long flags;
 	int cpu = smp_processor_id();
@@ -369,7 +370,7 @@ static int pkg_temp_thermal_platform_thermal_notify(__u64 msr_val)
 			pkg_work_scheduled[phy_id]) {
 		disable_pkg_thres_interrupt();
 		spin_unlock_irqrestore(&pkg_work_lock, flags);
-		return -EINVAL;
+		return;
 	}
 	pkg_work_scheduled[phy_id] = 1;
 	spin_unlock_irqrestore(&pkg_work_lock, flags);
@@ -378,9 +379,48 @@ static int pkg_temp_thermal_platform_thermal_notify(__u64 msr_val)
 	schedule_delayed_work_on(cpu,
 				&per_cpu(pkg_temp_thermal_threshold_work, cpu),
 				msecs_to_jiffies(notify_delay_ms));
+}
+
+#ifdef CONFIG_PREEMPT_RT_FULL
+static struct swork_event notify_work;
+
+static int thermal_notify_work_init(void)
+{
+	int err;
+
+	err = swork_get();
+	if (err)
+		return err;
+
+	INIT_SWORK(&notify_work, platform_thermal_notify_work);
 	return 0;
 }
 
+static void thermal_notify_work_cleanup(void)
+{
+	swork_put();
+}
+
+static int pkg_temp_thermal_platform_thermal_notify(__u64 msr_val)
+{
+	swork_queue(&notify_work);
+	return 0;
+}
+
+#else  /* !CONFIG_PREEMPT_RT_FULL */
+
+static int thermal_notify_work_init(void) { return 0; }
+
+static int thermal_notify_work_cleanup(void) {  }
+
+static int pkg_temp_thermal_platform_thermal_notify(__u64 msr_val)
+{
+	platform_thermal_notify_work(NULL);
+
+	return 0;
+}
+#endif /* CONFIG_PREEMPT_RT_FULL */
+
 static int find_siblings_cpu(int cpu)
 {
 	int i;
@@ -584,6 +624,9 @@ static int __init pkg_temp_thermal_init(void)
 	if (!x86_match_cpu(pkg_temp_thermal_ids))
 		return -ENODEV;
 
+	if (!thermal_notify_work_init())
+		return -ENODEV;
+
 	spin_lock_init(&pkg_work_lock);
 	platform_thermal_package_notify =
 			pkg_temp_thermal_platform_thermal_notify;
@@ -608,7 +651,7 @@ err_ret:
 	kfree(pkg_work_scheduled);
 	platform_thermal_package_notify = NULL;
 	platform_thermal_package_rate_control = NULL;
-
+	thermal_notify_work_cleanup();
 	return -ENODEV;
 }
 
@@ -633,6 +676,7 @@ static void __exit pkg_temp_thermal_exit(void)
 	mutex_unlock(&phy_dev_list_mutex);
 	platform_thermal_package_notify = NULL;
 	platform_thermal_package_rate_control = NULL;
+	thermal_notify_work_cleanup();
 	for_each_online_cpu(i)
 		cancel_delayed_work_sync(
 			&per_cpu(pkg_temp_thermal_threshold_work, i));
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 32/39] lockdep: selftest: fix warnings due to missing PREEMPT_RT conditionals
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (29 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 31/39] thermal: Defer thermal wakups to threads Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 33/39] fs/aio: simple simple work Steven Rostedt
                   ` (10 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Josh Cartwright,
	Xander Huff, Gratian Crisan

[-- Attachment #1: 0032-lockdep-selftest-fix-warnings-due-to-missing-PREEMPT.patch --]
[-- Type: text/plain, Size: 4377 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Josh Cartwright <josh.cartwright@ni.com>

"lockdep: Selftest: Only do hardirq context test for raw spinlock"
disabled the execution of certain tests with PREEMPT_RT_FULL, but did
not prevent the tests from still being defined.  This leads to warnings
like:

  ./linux/lib/locking-selftest.c:574:1: warning: 'irqsafe1_hard_rlock_12' defined but not used [-Wunused-function]
  ./linux/lib/locking-selftest.c:574:1: warning: 'irqsafe1_hard_rlock_21' defined but not used [-Wunused-function]
  ./linux/lib/locking-selftest.c:577:1: warning: 'irqsafe1_hard_wlock_12' defined but not used [-Wunused-function]
  ./linux/lib/locking-selftest.c:577:1: warning: 'irqsafe1_hard_wlock_21' defined but not used [-Wunused-function]
  ./linux/lib/locking-selftest.c:580:1: warning: 'irqsafe1_soft_spin_12' defined but not used [-Wunused-function]
  ...

Fixed by wrapping the test definitions in #ifndef CONFIG_PREEMPT_RT_FULL
conditionals.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Josh Cartwright <josh.cartwright@ni.com>
Signed-off-by: Xander Huff <xander.huff@ni.com>
Acked-by: Gratian Crisan <gratian.crisan@ni.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 lib/locking-selftest.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 19ae2366a7a9..b93a6103fa4d 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -590,6 +590,8 @@ GENERATE_TESTCASE(init_held_rsem)
 #include "locking-selftest-spin-hardirq.h"
 GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_hard_spin)
 
+#ifndef CONFIG_PREEMPT_RT_FULL
+
 #include "locking-selftest-rlock-hardirq.h"
 GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_hard_rlock)
 
@@ -605,9 +607,12 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_soft_rlock)
 #include "locking-selftest-wlock-softirq.h"
 GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_soft_wlock)
 
+#endif
+
 #undef E1
 #undef E2
 
+#ifndef CONFIG_PREEMPT_RT_FULL
 /*
  * Enabling hardirqs with a softirq-safe lock held:
  */
@@ -640,6 +645,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2A_rlock)
 #undef E1
 #undef E2
 
+#endif
+
 /*
  * Enabling irqs with an irq-safe lock held:
  */
@@ -663,6 +670,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2A_rlock)
 #include "locking-selftest-spin-hardirq.h"
 GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_hard_spin)
 
+#ifndef CONFIG_PREEMPT_RT_FULL
+
 #include "locking-selftest-rlock-hardirq.h"
 GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_hard_rlock)
 
@@ -678,6 +687,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_rlock)
 #include "locking-selftest-wlock-softirq.h"
 GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_wlock)
 
+#endif
+
 #undef E1
 #undef E2
 
@@ -709,6 +720,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_wlock)
 #include "locking-selftest-spin-hardirq.h"
 GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_hard_spin)
 
+#ifndef CONFIG_PREEMPT_RT_FULL
+
 #include "locking-selftest-rlock-hardirq.h"
 GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_hard_rlock)
 
@@ -724,6 +737,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_rlock)
 #include "locking-selftest-wlock-softirq.h"
 GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_wlock)
 
+#endif
+
 #undef E1
 #undef E2
 #undef E3
@@ -757,6 +772,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_wlock)
 #include "locking-selftest-spin-hardirq.h"
 GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_hard_spin)
 
+#ifndef CONFIG_PREEMPT_RT_FULL
+
 #include "locking-selftest-rlock-hardirq.h"
 GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_hard_rlock)
 
@@ -772,10 +789,14 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_soft_rlock)
 #include "locking-selftest-wlock-softirq.h"
 GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_soft_wlock)
 
+#endif
+
 #undef E1
 #undef E2
 #undef E3
 
+#ifndef CONFIG_PREEMPT_RT_FULL
+
 /*
  * read-lock / write-lock irq inversion.
  *
@@ -838,6 +859,10 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_wlock)
 #undef E2
 #undef E3
 
+#endif
+
+#ifndef CONFIG_PREEMPT_RT_FULL
+
 /*
  * read-lock / write-lock recursion that is actually safe.
  */
@@ -876,6 +901,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 #undef E2
 #undef E3
 
+#endif
+
 /*
  * read-lock / write-lock recursion that is unsafe.
  */
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 33/39] fs/aio: simple simple work
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (30 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 32/39] lockdep: selftest: fix warnings due to missing PREEMPT_RT conditionals Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 34/39] timers: Track total number of timers in list Steven Rostedt
                   ` (9 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, stable-rt, Mike Galbraith,
	Benjamin LaHaise

[-- Attachment #1: 0033-fs-aio-simple-simple-work.patch --]
[-- Type: text/plain, Size: 3751 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:768
|in_atomic(): 1, irqs_disabled(): 0, pid: 26, name: rcuos/2
|2 locks held by rcuos/2/26:
| #0:  (rcu_callback){.+.+..}, at: [<ffffffff810b1a12>] rcu_nocb_kthread+0x1e2/0x380
| #1:  (rcu_read_lock_sched){.+.+..}, at: [<ffffffff812acd26>] percpu_ref_kill_rcu+0xa6/0x1c0
|Preemption disabled at:[<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
|Call Trace:
| [<ffffffff81582e9e>] dump_stack+0x4e/0x9c
| [<ffffffff81077aeb>] __might_sleep+0xfb/0x170
| [<ffffffff81589304>] rt_spin_lock+0x24/0x70
| [<ffffffff811c5790>] free_ioctx_users+0x30/0x130
| [<ffffffff812ace34>] percpu_ref_kill_rcu+0x1b4/0x1c0
| [<ffffffff810b1a93>] rcu_nocb_kthread+0x263/0x380
| [<ffffffff8106e046>] kthread+0xd6/0xf0
| [<ffffffff81591eec>] ret_from_fork+0x7c/0xb0

replace this preempt_disable() friendly swork.

Cc: stable-rt@vger.kernel.org
Reported-By: Mike Galbraith <umgwanakikbuti@gmail.com>
Suggested-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 fs/aio.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 2f7e8c2e3e76..5cddcbfe1b71 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -40,6 +40,7 @@
 #include <linux/ramfs.h>
 #include <linux/percpu-refcount.h>
 #include <linux/mount.h>
+#include <linux/work-simple.h>
 
 #include <asm/kmap_types.h>
 #include <asm/uaccess.h>
@@ -110,7 +111,7 @@ struct kioctx {
 	struct page		**ring_pages;
 	long			nr_pages;
 
-	struct work_struct	free_work;
+	struct swork_event	free_work;
 
 	/*
 	 * signals when all in-flight requests are done
@@ -227,6 +228,7 @@ static int __init aio_setup(void)
 		.mount		= aio_mount,
 		.kill_sb	= kill_anon_super,
 	};
+	BUG_ON(swork_get());
 	aio_mnt = kern_mount(&aio_fs);
 	if (IS_ERR(aio_mnt))
 		panic("Failed to create aio fs mount.");
@@ -506,9 +508,9 @@ static int kiocb_cancel(struct kioctx *ctx, struct kiocb *kiocb)
 	return cancel(kiocb);
 }
 
-static void free_ioctx(struct work_struct *work)
+static void free_ioctx(struct swork_event *sev)
 {
-	struct kioctx *ctx = container_of(work, struct kioctx, free_work);
+	struct kioctx *ctx = container_of(sev, struct kioctx, free_work);
 
 	pr_debug("freeing %p\n", ctx);
 
@@ -525,8 +527,8 @@ static void free_ioctx_reqs(struct percpu_ref *ref)
 	if (ctx->requests_done)
 		complete(ctx->requests_done);
 
-	INIT_WORK(&ctx->free_work, free_ioctx);
-	schedule_work(&ctx->free_work);
+	INIT_SWORK(&ctx->free_work, free_ioctx);
+	swork_queue(&ctx->free_work);
 }
 
 /*
@@ -534,9 +536,9 @@ static void free_ioctx_reqs(struct percpu_ref *ref)
  * and ctx->users has dropped to 0, so we know no more kiocbs can be submitted -
  * now it's safe to cancel any that need to be.
  */
-static void free_ioctx_users(struct percpu_ref *ref)
+static void free_ioctx_users_work(struct swork_event *sev)
 {
-	struct kioctx *ctx = container_of(ref, struct kioctx, users);
+	struct kioctx *ctx = container_of(sev, struct kioctx, free_work);
 	struct kiocb *req;
 
 	spin_lock_irq(&ctx->ctx_lock);
@@ -555,6 +557,14 @@ static void free_ioctx_users(struct percpu_ref *ref)
 	percpu_ref_put(&ctx->reqs);
 }
 
+static void free_ioctx_users(struct percpu_ref *ref)
+{
+	struct kioctx *ctx = container_of(ref, struct kioctx, users);
+
+	INIT_SWORK(&ctx->free_work, free_ioctx_users_work);
+	swork_queue(&ctx->free_work);
+}
+
 static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
 {
 	unsigned i, new_nr;
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 34/39] timers: Track total number of timers in list
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (31 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 33/39] fs/aio: simple simple work Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 35/39] timers: Reduce __run_timers() latency for empty list Steven Rostedt
                   ` (8 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Paul E. McKenney, Josh Triplett,
	Peter Zijlstra, Oleg Nesterov, Mike Galbraith

[-- Attachment #1: 0034-timers-Track-total-number-of-timers-in-list.patch --]
[-- Type: text/plain, Size: 2249 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

upstream commit: fff421580f512fc044cc7421fdff31a7a6997350

Currently, the tvec_base structure's ->active_timers field tracks only
the non-deferrable timers, which means that even if ->active_timers is
zero, there might well be deferrable timers in the list.  This commit
therefore adds an ->all_timers field to track all the timers, whether
deferrable or not.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/timer.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/timer.c b/kernel/timer.c
index 8f1687a4b9c2..251fdacfebdb 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -84,6 +84,7 @@ struct tvec_base {
 	unsigned long timer_jiffies;
 	unsigned long next_timer;
 	unsigned long active_timers;
+	unsigned long all_timers;
 	struct tvec_root tv1;
 	struct tvec tv2;
 	struct tvec tv3;
@@ -395,6 +396,7 @@ static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
 			base->next_timer = timer->expires;
 		base->active_timers++;
 	}
+	base->all_timers++;
 }
 
 #ifdef CONFIG_TIMER_STATS
@@ -674,6 +676,7 @@ detach_expired_timer(struct timer_list *timer, struct tvec_base *base)
 	detach_timer(timer, true);
 	if (!tbase_get_deferrable(timer->base))
 		base->active_timers--;
+	base->all_timers--;
 }
 
 static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,
@@ -688,6 +691,7 @@ static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,
 		if (timer->expires == base->next_timer)
 			base->next_timer = base->timer_jiffies;
 	}
+	base->all_timers--;
 	return 1;
 }
 
@@ -1677,6 +1681,7 @@ static int init_timers_cpu(int cpu)
 	base->timer_jiffies = jiffies;
 	base->next_timer = base->timer_jiffies;
 	base->active_timers = 0;
+	base->all_timers = 0;
 	return 0;
 }
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 35/39] timers: Reduce __run_timers() latency for empty list
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (32 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 34/39] timers: Track total number of timers in list Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 36/39] timers: Reduce future __run_timers() latency for newly emptied list Steven Rostedt
                   ` (7 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Paul E. McKenney, Josh Triplett,
	Peter Zijlstra, Oleg Nesterov, Mike Galbraith

[-- Attachment #1: 0035-timers-Reduce-__run_timers-latency-for-empty-list.patch --]
[-- Type: text/plain, Size: 2130 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

upstream commit: d550e81dc0ddc04f1b417c179c214103a28e0ee8

The __run_timers() function currently steps through the list one jiffy at
a time in order to update the timer wheel.  However, if the timer wheel
is empty, no adjustment is needed other than updating ->timer_jiffies.
In this case, which is likely to be common for NO_HZ_FULL kernels, the
kernel currently incurs a large latency for no good reason.  This commit
therefore short-circuits this case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/timer.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/timer.c b/kernel/timer.c
index 251fdacfebdb..9cf70c562fac 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -341,6 +341,20 @@ void set_timer_slack(struct timer_list *timer, int slack_hz)
 }
 EXPORT_SYMBOL_GPL(set_timer_slack);
 
+/*
+ * If the list is empty, catch up ->timer_jiffies to the current time.
+ * The caller must hold the tvec_base lock.  Returns true if the list
+ * was empty and therefore ->timer_jiffies was updated.
+ */
+static bool catchup_timer_jiffies(struct tvec_base *base)
+{
+	if (!base->all_timers) {
+		base->timer_jiffies = jiffies;
+		return true;
+	}
+	return false;
+}
+
 static void
 __internal_add_timer(struct tvec_base *base, struct timer_list *timer)
 {
@@ -1203,6 +1217,10 @@ static inline void __run_timers(struct tvec_base *base)
 	struct timer_list *timer;
 
 	spin_lock_irq(&base->lock);
+	if (catchup_timer_jiffies(base)) {
+		spin_unlock_irq(&base->lock);
+		return;
+	}
 	while (time_after_eq(jiffies, base->timer_jiffies)) {
 		struct list_head work_list;
 		struct list_head *head = &work_list;
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 36/39] timers: Reduce future __run_timers() latency for newly emptied list
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (33 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 35/39] timers: Reduce __run_timers() latency for empty list Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 37/39] timers: Reduce future __run_timers() latency for first add to empty list Steven Rostedt
                   ` (6 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Paul E. McKenney, Josh Triplett,
	Peter Zijlstra, Oleg Nesterov, Mike Galbraith

[-- Attachment #1: 0036-timers-Reduce-future-__run_timers-latency-for-newly-.patch --]
[-- Type: text/plain, Size: 1897 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

upstream commit: 16d937f880312e3f47157d4d6d6ebf7e61523378

The __run_timers() function currently steps through the list one jiffy at
a time in order to update the timer wheel.  However, if the timer wheel
is empty, no adjustment is needed other than updating ->timer_jiffies.
Therefore, if we just emptied the timer wheel, for example, by deleting
the last timer, we should mark the timer wheel as being up to date.
This marking will reduce (and perhaps eliminate) the jiffy-stepping that
a future __run_timers() call will need to do in response to some future
timer posting or migration.  This commit therefore catches ->timer_jiffies
for this case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/timer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/timer.c b/kernel/timer.c
index 9cf70c562fac..380f125ad100 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -691,6 +691,7 @@ detach_expired_timer(struct timer_list *timer, struct tvec_base *base)
 	if (!tbase_get_deferrable(timer->base))
 		base->active_timers--;
 	base->all_timers--;
+	(void)catchup_timer_jiffies(base);
 }
 
 static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,
@@ -706,6 +707,7 @@ static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,
 			base->next_timer = base->timer_jiffies;
 	}
 	base->all_timers--;
+	(void)catchup_timer_jiffies(base);
 	return 1;
 }
 
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 37/39] timers: Reduce future __run_timers() latency for first add to empty list
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (34 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 36/39] timers: Reduce future __run_timers() latency for newly emptied list Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 38/39] staging: Mark rtl8821ae as broken Steven Rostedt
                   ` (5 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker, Paul E. McKenney, Josh Triplett,
	Peter Zijlstra, Oleg Nesterov, Mike Galbraith

[-- Attachment #1: 0037-timers-Reduce-future-__run_timers-latency-for-first-.patch --]
[-- Type: text/plain, Size: 1654 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

upstream commit: 18d8cb64c9c074cbe2bd677ab10fff8283abdb62

The __run_timers() function currently steps through the list one jiffy at
a time in order to update the timer wheel.  However, if the timer wheel
is empty, no adjustment is needed other than updating ->timer_jiffies.
Therefore, just before we add a timer to an empty timer wheel, we should
mark the timer wheel as being up to date.  This marking will reduce (and
perhaps eliminate) the jiffy-stepping that a future __run_timers() call
will need to do in response to some future timer posting or migration.
This commit therefore updates ->timer_jiffies for this case.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/timer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/timer.c b/kernel/timer.c
index 380f125ad100..2059f6b27595 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -401,6 +401,7 @@ __internal_add_timer(struct tvec_base *base, struct timer_list *timer)
 
 static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
 {
+	(void)catchup_timer_jiffies(base);
 	__internal_add_timer(base, timer);
 	/*
 	 * Update base->active_timers and base->next_timer
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 38/39] staging: Mark rtl8821ae as broken
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (35 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 37/39] timers: Reduce future __run_timers() latency for first add to empty list Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-12 19:13 ` [PATCH RT 39/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (4 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker

[-- Attachment #1: 0038-staging-Mark-rtl8821ae-as-broken.patch --]
[-- Type: text/plain, Size: 803 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

rtl8821ae does not build with a make allmodconfig.

Mark it as broken.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 drivers/staging/rtl8821ae/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/rtl8821ae/Kconfig b/drivers/staging/rtl8821ae/Kconfig
index abccc9dabd65..50a407f71717 100644
--- a/drivers/staging/rtl8821ae/Kconfig
+++ b/drivers/staging/rtl8821ae/Kconfig
@@ -2,6 +2,7 @@ config R8821AE
 	tristate "RealTek RTL8821AE Wireless LAN NIC driver"
 	depends on PCI && WLAN && MAC80211
 	depends on m
+	depends on BROKEN
 	select WIRELESS_EXT
 	select WEXT_PRIV
 	select EEPROM_93CX6
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH RT 39/39] Linux 3.14.34-rt32-rc1
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (36 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 38/39] staging: Mark rtl8821ae as broken Steven Rostedt
@ 2015-03-12 19:13 ` Steven Rostedt
  2015-03-13  4:49 ` [PATCH RT 00/39] " Mike Galbraith
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-12 19:13 UTC (permalink / raw)
  To: linux-kernel, linux-rt-users
  Cc: Thomas Gleixner, Carsten Emde, Sebastian Andrzej Siewior,
	John Kacur, Paul Gortmaker

[-- Attachment #1: 0039-Linux-3.14.34-rt32-rc1.patch --]
[-- Type: text/plain, Size: 414 bytes --]

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index a68b4337d4ce..2744705998c0 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt31
+-rt32-rc1
-- 
2.1.4



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (37 preceding siblings ...)
  2015-03-12 19:13 ` [PATCH RT 39/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
@ 2015-03-13  4:49 ` Mike Galbraith
  2015-03-13 13:50   ` Steven Rostedt
  2015-03-13 15:11   ` Steven Rostedt
  2015-03-13 11:01 ` Sebastian Andrzej Siewior
                   ` (2 subsequent siblings)
  41 siblings, 2 replies; 54+ messages in thread
From: Mike Galbraith @ 2015-03-13  4:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, Paul Gortmaker

On Thu, 2015-03-12 at 15:13 -0400, Steven Rostedt wrote:
> Dear RT Folks,
> 
> This is the RT stable review cycle of patch 3.14.34-rt32-rc1.
> 
> Please scream at me if I messed something up.

Hm, in both 12/14 series, 21/N went on a walkabout.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock
  2015-03-12 19:13 ` [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock Steven Rostedt
@ 2015-03-13 10:40   ` Sebastian Andrzej Siewior
  2015-03-13 10:56     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-13 10:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, stable-rt

* Steven Rostedt | 2015-03-12 15:13:23 [-0400]:

>3.14.34-rt32-rc1 stable review patch.
>If anyone has any objections, please let me know.

If you take this, could you take 9d3e2d02f54160725d97f4ab1e1e8de493fbf33a
("locking/rtmutex: Set state back to running on error") it sits on my
queue and is upstream already.

Sebastian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock
  2015-03-13 10:40   ` Sebastian Andrzej Siewior
@ 2015-03-13 10:56     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-13 10:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker, stable-rt

* Sebastian Andrzej Siewior | 2015-03-13 11:40:47 [+0100]:

>If you take this, could you take 9d3e2d02f54160725d97f4ab1e1e8de493fbf33a
>("locking/rtmutex: Set state back to running on error") it sits on my
>queue and is upstream already.

Or forget what I said. It is there already, it has been removed in
v4.0-rc1. That is why it was on my queue and I did not apply it…

Sebastian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (38 preceding siblings ...)
  2015-03-13  4:49 ` [PATCH RT 00/39] " Mike Galbraith
@ 2015-03-13 11:01 ` Sebastian Andrzej Siewior
  2015-03-13 11:33 ` Sebastian Andrzej Siewior
  2015-03-16 13:59 ` Sebastian Andrzej Siewior
  41 siblings, 0 replies; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-13 11:01 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker

* Steven Rostedt | 2015-03-12 15:13:07 [-0400]:

>Please scream at me if I messed something up. Please test the patches too.

Could you add Mike's "3.14.23-rt20 - fs,btrfs: fix rt deadlock on
extent_buffer->loc" [0] and that is the upper chunk (ctree.c only). As I
mentioned in the thread, the code is gone in v3.18 and I didn't apply
it.

[0] https://lkml.org/lkml/2014/11/2/34

Sebastian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (39 preceding siblings ...)
  2015-03-13 11:01 ` Sebastian Andrzej Siewior
@ 2015-03-13 11:33 ` Sebastian Andrzej Siewior
  2015-03-16 13:59 ` Sebastian Andrzej Siewior
  41 siblings, 0 replies; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-13 11:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker

* Steven Rostedt | 2015-03-12 15:13:07 [-0400]:

>Please scream at me if I messed something up. Please test the patches too.

There is also "[PATCH v3.14-rt] netpoll: guard the access to dev->npinfo
with rcu_read_lock/unlock" which I did not apply because the code is
gone in v3.18. I stopped looking at it after I noticed that the code is
gone.

Sebastian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-13  4:49 ` [PATCH RT 00/39] " Mike Galbraith
@ 2015-03-13 13:50   ` Steven Rostedt
  2015-03-13 15:11   ` Steven Rostedt
  1 sibling, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-13 13:50 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, Paul Gortmaker

On Fri, 13 Mar 2015 05:49:13 +0100
Mike Galbraith <umgwanakikbuti@gmail.com> wrote:

> On Thu, 2015-03-12 at 15:13 -0400, Steven Rostedt wrote:
> > Dear RT Folks,
> > 
> > This is the RT stable review cycle of patch 3.14.34-rt32-rc1.
> > 
> > Please scream at me if I messed something up.
> 
> Hm, in both 12/14 series, 21/N went on a walkabout.

I guess I need switch my mail server. My ISP (default) seems to be
trying to protect me from sending spam :-/  There seems to be a rate
limit if you send too many emails in some small frame of time (like
quilt mail does). I sent out 3.10-rt, 3.4-rt and a 3.2-rt patch series,
and it doesn't look like anything made it pass my ISP server.

Time to switch to my other mail server (the one I send via my mail
clients).

-- Steve

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-13  4:49 ` [PATCH RT 00/39] " Mike Galbraith
  2015-03-13 13:50   ` Steven Rostedt
@ 2015-03-13 15:11   ` Steven Rostedt
  2015-03-13 15:24     ` Steven Rostedt
  1 sibling, 1 reply; 54+ messages in thread
From: Steven Rostedt @ 2015-03-13 15:11 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, Paul Gortmaker

On Fri, Mar 13, 2015 at 05:49:13AM +0100, Mike Galbraith wrote:
> On Thu, 2015-03-12 at 15:13 -0400, Steven Rostedt wrote:
> > Dear RT Folks,
> > 
> > This is the RT stable review cycle of patch 3.14.34-rt32-rc1.
> > 
> > Please scream at me if I messed something up.
> 
> Hm, in both 12/14 series, 21/N went on a walkabout.

Here's 21 in 3.14-rt:


>From fe58592f2863d70ef37866d99d32b69d3cc0c3f3 Mon Sep 17 00:00:00 2001
Date: Wed, 10 Dec 2014 10:32:09 +0800
Subject: ARM: enable irq in translation/section permission fault handlers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

3.14.34-rt32-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: "Yadi.hu" <yadi.hu@windriver.com>

Probably happens on all ARM, with
CONFIG_PREEMPT_RT_FULL
CONFIG_DEBUG_ATOMIC_SLEEP

This simple program....

int main() {
   *((char*)0xc0001000) = 0;
};

[ 512.742724] BUG: sleeping function called from invalid context at kernel/rtmutex.c:658
[ 512.743000] in_atomic(): 0, irqs_disabled(): 128, pid: 994, name: a
[ 512.743217] INFO: lockdep is turned off.
[ 512.743360] irq event stamp: 0
[ 512.743482] hardirqs last enabled at (0): [< (null)>] (null)
[ 512.743714] hardirqs last disabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
[ 512.744013] softirqs last enabled at (0): [<c0426370>] copy_process+0x3b0/0x11c0
[ 512.744303] softirqs last disabled at (0): [< (null)>] (null)
[ 512.744631] [<c041872c>] (unwind_backtrace+0x0/0x104)
[ 512.745001] [<c09af0c4>] (dump_stack+0x20/0x24)
[ 512.745355] [<c0462490>] (__might_sleep+0x1dc/0x1e0)
[ 512.745717] [<c09b6770>] (rt_spin_lock+0x34/0x6c)
[ 512.746073] [<c0441bf0>] (do_force_sig_info+0x34/0xf0)
[ 512.746457] [<c0442668>] (force_sig_info+0x18/0x1c)
[ 512.746829] [<c041d880>] (__do_user_fault+0x9c/0xd8)
[ 512.747185] [<c041d938>] (do_bad_area+0x7c/0x94)
[ 512.747536] [<c041d990>] (do_sect_fault+0x40/0x48)
[ 512.747898] [<c040841c>] (do_DataAbort+0x40/0xa0)
[ 512.748181] Exception stack(0xecaa1fb0 to 0xecaa1ff8)

Oxc0000000 belongs to kernel address space, user task can not be
allowed to access it. For above condition, correct result is that
test case should receive a “segment fault” and exits but not stacks.

the root cause is commit 02fe2845d6a8 ("avoid enabling interrupts in
prefetch/data abort handlers"),it deletes irq enable block in Data
abort assemble code and move them into page/breakpiont/alignment fault
handlers instead. But author does not enable irq in translation/section
permission fault handlers. ARM disables irq when it enters exception/
interrupt mode, if kernel doesn't enable irq, it would be still disabled
during translation/section permission fault.

We see the above splat because do_force_sig_info is still called with
IRQs off, and that code eventually does a:

        spin_lock_irqsave(&t->sighand->siglock, flags);

As this is architecture independent code, and we've not seen any other
need for other arch to have the siglock converted to raw lock, we can
conclude that we should enable irq for ARM translation/section
permission exception.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Yadi.hu <yadi.hu@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/arm/mm/fault.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index b40d4bab8e07..c15d2a0826c6 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -431,6 +431,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
 	if (addr < TASK_SIZE)
 		return do_page_fault(addr, fsr, regs);
 
+	if (interrupts_enabled(regs))
+		local_irq_enable();
+
 	if (user_mode(regs))
 		goto bad_area;
 
@@ -498,6 +501,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
 static int
 do_sect_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 {
+	if (interrupts_enabled(regs))
+		local_irq_enable();
+
 	do_bad_area(addr, fsr, regs);
 	return 0;
 }
-- 
2.1.4


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-13 15:11   ` Steven Rostedt
@ 2015-03-13 15:24     ` Steven Rostedt
  0 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-13 15:24 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, Paul Gortmaker

On Fri, 13 Mar 2015 11:11:35 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, Mar 13, 2015 at 05:49:13AM +0100, Mike Galbraith wrote:
> > On Thu, 2015-03-12 at 15:13 -0400, Steven Rostedt wrote:
> > > Dear RT Folks,
> > > 
> > > This is the RT stable review cycle of patch 3.14.34-rt32-rc1.
> > > 
> > > Please scream at me if I messed something up.
> > 
> > Hm, in both 12/14 series, 21/N went on a walkabout.
> 

I missed the error message about these. For some reason this patch
causes the following errors:


<linux-kernel@vger.kernel.org>: host vger.kernel.org[209.132.180.67]
said: 550 5.7.1 Content-Policy reject msg: Wrong MIME labeling on 8-bit
character texts. BF:<H 0>; S932171AbbCMPPW (in reply to end of DATA
command)

<linux-rt-users@vger.kernel.org>: host vger.kernel.org[209.132.180.67]
said: 550 5.7.1 Content-Policy reject msg: Wrong MIME labeling on 8-bit
character texts. BF:<H 0>; S932171AbbCMPPW (in reply to end of DATA
command)

My other series will be missing them too.

-- Steve

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
                   ` (40 preceding siblings ...)
  2015-03-13 11:33 ` Sebastian Andrzej Siewior
@ 2015-03-16 13:59 ` Sebastian Andrzej Siewior
  2015-03-16 14:02   ` Steven Rostedt
  41 siblings, 1 reply; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-16 13:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker

* Steven Rostedt | 2015-03-12 15:13:07 [-0400]:

>Please scream at me if I messed something up. Please test the patches too.

So Paul remided us about the dead lock thingy that has been reported.
Users reported that it does not occur with v3.18-RT and they think it is
due to 'Revert "timers: do not raise softirq unconditionally"' in 
Revert-timers-do-not-raise-softirq-unconditionally.patch.

I reverted it because I couldn't get highres to get to work at all on
v3.18 due to different synchronisation / expectaion of the timer
framework. Since the trylock might record a different lock owner it is
possible that this causes the deadlock (it thinks). Therefore it has no
stable tag nor any reference to the deadlock problem.

With this patch applied FULL_NOHZ should not properly work (again) due
to timer softirq wake ups (but this is a different problem).

Sebastian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-16 13:59 ` Sebastian Andrzej Siewior
@ 2015-03-16 14:02   ` Steven Rostedt
  2015-03-16 14:10     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 54+ messages in thread
From: Steven Rostedt @ 2015-03-16 14:02 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker

On Mon, 16 Mar 2015 14:59:10 +0100
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> * Steven Rostedt | 2015-03-12 15:13:07 [-0400]:
> 
> >Please scream at me if I messed something up. Please test the patches too.
> 
> So Paul remided us about the dead lock thingy that has been reported.
> Users reported that it does not occur with v3.18-RT and they think it is
> due to 'Revert "timers: do not raise softirq unconditionally"' in 
> Revert-timers-do-not-raise-softirq-unconditionally.patch.
> 
> I reverted it because I couldn't get highres to get to work at all on
> v3.18 due to different synchronisation / expectaion of the timer
> framework. Since the trylock might record a different lock owner it is
> possible that this causes the deadlock (it thinks). Therefore it has no
> stable tag nor any reference to the deadlock problem.

I guess the question is, is there any other place that does a trylock
in hard irq context? If so, the revert isn't going to fix it.

-- Steve

> 
> With this patch applied FULL_NOHZ should not properly work (again) due
> to timer softirq wake ups (but this is a different problem).
> 
> Sebastian


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 00/39] Linux 3.14.34-rt32-rc1
  2015-03-16 14:02   ` Steven Rostedt
@ 2015-03-16 14:10     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-16 14:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, Paul Gortmaker

On 03/16/2015 03:02 PM, Steven Rostedt wrote:
> On Mon, 16 Mar 2015 14:59:10 +0100
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
>> * Steven Rostedt | 2015-03-12 15:13:07 [-0400]:
>>
>>> Please scream at me if I messed something up. Please test the patches too.
>>
>> So Paul remided us about the dead lock thingy that has been reported.
>> Users reported that it does not occur with v3.18-RT and they think it is
>> due to 'Revert "timers: do not raise softirq unconditionally"' in 
>> Revert-timers-do-not-raise-softirq-unconditionally.patch.
>>
>> I reverted it because I couldn't get highres to get to work at all on
>> v3.18 due to different synchronisation / expectaion of the timer
>> framework. Since the trylock might record a different lock owner it is
>> possible that this causes the deadlock (it thinks). Therefore it has no
>> stable tag nor any reference to the deadlock problem.
> 
> I guess the question is, is there any other place that does a trylock
> in hard irq context? If so, the revert isn't going to fix it.

This is the only place and I introduced it only for that reason.

> 
> -- Steve

Sebastian


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 26/39] scheduling while atomic in cgroup code
  2015-03-12 19:13 ` [PATCH RT 26/39] scheduling while atomic in cgroup code Steven Rostedt
@ 2015-03-17 20:10   ` Paul Gortmaker
  2015-03-17 20:13     ` Steven Rostedt
  2015-03-18  8:37     ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 54+ messages in thread
From: Paul Gortmaker @ 2015-03-17 20:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, stable-rt,
	Nikita Yushchenko, Mike Galbraith

[[PATCH RT 26/39] scheduling while atomic in cgroup code] On 12/03/2015 (Thu 15:13) Steven Rostedt wrote:

> 3.14.34-rt32-rc1 stable review patch.
> If anyone has any objections, please let me know.
> 
> ------------------
> 
> From: Mike Galbraith <umgwanakikbuti@gmail.com>
> 
> mm, memcg: make refill_stock() use get_cpu_light()

This looks like only 1/2 of Mike's original patch:

https://lkml.org/lkml/2014/6/21/11 

I suspect that is because 3.18 could only use 1/2 of it, and based on
the SOB lines, this is backported from 3.18.

The other half applies to 3.14 -- testing in progress; not sure about
the 3.10-rt and earlier....

P.
--

> 
> Nikita reported the following memcg scheduling while atomic bug:
> 
> Call Trace:
> [e22d5a90] [c0007ea8] show_stack+0x4c/0x168 (unreliable)
> [e22d5ad0] [c0618c04] __schedule_bug+0x94/0xb0
> [e22d5ae0] [c060b9ec] __schedule+0x530/0x550
> [e22d5bf0] [c060bacc] schedule+0x30/0xbc
> [e22d5c00] [c060ca24] rt_spin_lock_slowlock+0x180/0x27c
> [e22d5c70] [c00b39dc] res_counter_uncharge_until+0x40/0xc4
> [e22d5ca0] [c013ca88] drain_stock.isra.20+0x54/0x98
> [e22d5cc0] [c01402ac] __mem_cgroup_try_charge+0x2e8/0xbac
> [e22d5d70] [c01410d4] mem_cgroup_charge_common+0x3c/0x70
> [e22d5d90] [c0117284] __do_fault+0x38c/0x510
> [e22d5df0] [c011a5f4] handle_pte_fault+0x98/0x858
> [e22d5e50] [c060ed08] do_page_fault+0x42c/0x6fc
> [e22d5f40] [c000f5b4] handle_page_fault+0xc/0x80
> 
> What happens:
> 
>    refill_stock()
>       get_cpu_var()
>       drain_stock()
>          res_counter_uncharge()
>             res_counter_uncharge_until()
>                spin_lock() <== boom
> 
> Fix it by replacing get/put_cpu_var() with get/put_cpu_light().
> 
> Cc: stable-rt@vger.kernel.org
> Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  mm/memcontrol.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index fa54408b66b2..be24d2d186fa 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2512,14 +2512,17 @@ static void __init memcg_stock_init(void)
>   */
>  static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  {
> -	struct memcg_stock_pcp *stock = &get_cpu_var(memcg_stock);
> +	struct memcg_stock_pcp *stock;
> +	int cpu = get_cpu_light();
> +
> +	stock = &per_cpu(memcg_stock, cpu);
>  
>  	if (stock->cached != memcg) { /* reset if necessary */
>  		drain_stock(stock);
>  		stock->cached = memcg;
>  	}
>  	stock->nr_pages += nr_pages;
> -	put_cpu_var(memcg_stock);
> +	put_cpu_light();
>  }
>  
>  /*
> -- 
> 2.1.4
> 
> 

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 26/39] scheduling while atomic in cgroup code
  2015-03-17 20:10   ` Paul Gortmaker
@ 2015-03-17 20:13     ` Steven Rostedt
  2015-03-18  8:37     ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-17 20:13 UTC (permalink / raw)
  To: Paul Gortmaker
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	Sebastian Andrzej Siewior, John Kacur, stable-rt,
	Nikita Yushchenko, Mike Galbraith

On Tue, 17 Mar 2015 16:10:11 -0400
Paul Gortmaker <paul.gortmaker@windriver.com> wrote:

> [[PATCH RT 26/39] scheduling while atomic in cgroup code] On 12/03/2015 (Thu 15:13) Steven Rostedt wrote:
> 
> > 3.14.34-rt32-rc1 stable review patch.
> > If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Mike Galbraith <umgwanakikbuti@gmail.com>
> > 
> > mm, memcg: make refill_stock() use get_cpu_light()
> 
> This looks like only 1/2 of Mike's original patch:
> 
> https://lkml.org/lkml/2014/6/21/11 
> 
> I suspect that is because 3.18 could only use 1/2 of it, and based on
> the SOB lines, this is backported from 3.18.
> 
> The other half applies to 3.14 -- testing in progress; not sure about
> the 3.10-rt and earlier....

Grumble. Yeah, I'm sure I got this from 3.18 and backported it.

-- Steve


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 26/39] scheduling while atomic in cgroup code
  2015-03-17 20:10   ` Paul Gortmaker
  2015-03-17 20:13     ` Steven Rostedt
@ 2015-03-18  8:37     ` Sebastian Andrzej Siewior
  2015-03-18 13:20       ` Steven Rostedt
  1 sibling, 1 reply; 54+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-18  8:37 UTC (permalink / raw)
  To: Paul Gortmaker, Steven Rostedt
  Cc: linux-kernel, linux-rt-users, Thomas Gleixner, Carsten Emde,
	John Kacur, stable-rt, Nikita Yushchenko, Mike Galbraith

On 03/17/2015 09:10 PM, Paul Gortmaker wrote:
> [[PATCH RT 26/39] scheduling while atomic in cgroup code] On 12/03/2015 (Thu 15:13) Steven Rostedt wrote:
> 
>> 3.14.34-rt32-rc1 stable review patch.
>> If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: Mike Galbraith <umgwanakikbuti@gmail.com>
>>
>> mm, memcg: make refill_stock() use get_cpu_light()
> 
> This looks like only 1/2 of Mike's original patch:
> 
> https://lkml.org/lkml/2014/6/21/11 
> 
> I suspect that is because 3.18 could only use 1/2 of it, and based on
> the SOB lines, this is backported from 3.18.

no. I didn't take the upper chunk because get_cpu_var() does only a
preempt_disable() and in that section there is not point of preemption
or a lock in involved. So it is fine the way it is.
The suggest change does a migrate_disable() which is a little more code.

The lower part (the chunk I applied) invokes drain_stock() and that is
where the sleeping-while-atomic warning came from.

So is the upper half really required and if so, why?

> The other half applies to 3.14 -- testing in progress; not sure about
> the 3.10-rt and earlier....
> 
> P.

Sebastian


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH RT 26/39] scheduling while atomic in cgroup code
  2015-03-18  8:37     ` Sebastian Andrzej Siewior
@ 2015-03-18 13:20       ` Steven Rostedt
  0 siblings, 0 replies; 54+ messages in thread
From: Steven Rostedt @ 2015-03-18 13:20 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Paul Gortmaker, linux-kernel, linux-rt-users, Thomas Gleixner,
	Carsten Emde, John Kacur, stable-rt, Nikita Yushchenko,
	Mike Galbraith

On Wed, 18 Mar 2015 09:37:02 +0100
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> On 03/17/2015 09:10 PM, Paul Gortmaker wrote:
> > [[PATCH RT 26/39] scheduling while atomic in cgroup code] On 12/03/2015 (Thu 15:13) Steven Rostedt wrote:
> > 
> >> 3.14.34-rt32-rc1 stable review patch.
> >> If anyone has any objections, please let me know.
> >>
> >> ------------------
> >>
> >> From: Mike Galbraith <umgwanakikbuti@gmail.com>
> >>
> >> mm, memcg: make refill_stock() use get_cpu_light()
> > 
> > This looks like only 1/2 of Mike's original patch:
> > 
> > https://lkml.org/lkml/2014/6/21/11 
> > 
> > I suspect that is because 3.18 could only use 1/2 of it, and based on
> > the SOB lines, this is backported from 3.18.
> 
> no. I didn't take the upper chunk because get_cpu_var() does only a
> preempt_disable() and in that section there is not point of preemption
> or a lock in involved. So it is fine the way it is.
> The suggest change does a migrate_disable() which is a little more code.
> 
> The lower part (the chunk I applied) invokes drain_stock() and that is
> where the sleeping-while-atomic warning came from.
> 
> So is the upper half really required and if so, why?

Ug, you are right. I was trying to get this all working that I just
applied it based on Paul's suggestion. Looking into it, the part of the
patch that was dropped is not needed.

/me goes and reverts that change :-(

-- Steve


> 
> > The other half applies to 3.14 -- testing in progress; not sure about
> > the 3.10-rt and earlier....
> > 
> > P.
> 
> Sebastian


^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2015-03-18 13:20 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-12 19:13 [PATCH RT 00/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 01/39] gpio: omap: use raw locks for locking Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 02/39] rtmutex: Simplify rtmutex_slowtrylock() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 03/39] rtmutex: Simplify and document try_to_take_rtmutex() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 04/39] rtmutex: No need to keep task ref for lock owner check Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 05/39] rtmutex: Clarify the boost/deboost part Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 06/39] rtmutex: Document pi chain walk Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 07/39] rtmutex: Simplify remove_waiter() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 08/39] rtmutex: Confine deadlock logic to futex Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 09/39] rtmutex: Cleanup deadlock detector debug logic Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 10/39] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 11/39] futex: Make unlock_pi more robust Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 12/39] futex: Use futex_top_waiter() in lookup_pi_state() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 13/39] futex: Split out the waiter check from lookup_pi_state() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 14/39] futex: Split out the first waiter attachment " Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 15/39] futex: Simplify futex_lock_pi_atomic() and make it more robust Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 16/39] locking/rt-mutex: avoid a NULL pointer dereference on deadlock Steven Rostedt
2015-03-13 10:40   ` Sebastian Andrzej Siewior
2015-03-13 10:56     ` Sebastian Andrzej Siewior
2015-03-12 19:13 ` [PATCH RT 17/39] rtmutex.c: Fix incorrect waiter check Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 18/39] rt,locking: fix __ww_mutex_lock_interruptible() lockdep annotation Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 19/39] rtmutex: enable deadlock detection in ww_mutex_lock functions Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 20/39] x86: UV: raw_spinlock conversion Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 22/39] arm/futex: disable preemption during futex_atomic_cmpxchg_inatomic() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 23/39] ARM: cmpxchg: define __HAVE_ARCH_CMPXCHG for armv6 and later Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 24/39] mips: rt: Replace pagefault_* to raw version Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 25/39] sas-ata/isci: dontt disable interrupts in qc_issue handler Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 26/39] scheduling while atomic in cgroup code Steven Rostedt
2015-03-17 20:10   ` Paul Gortmaker
2015-03-17 20:13     ` Steven Rostedt
2015-03-18  8:37     ` Sebastian Andrzej Siewior
2015-03-18 13:20       ` Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 27/39] work-simple: Simple work queue implemenation Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 28/39] sunrpc: make svc_xprt_do_enqueue() use get_cpu_light() Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 29/39] Revert "rwsem-rt: Do not allow readers to nest" Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 30/39] locking: ww_mutex: fix ww_mutex vs self-deadlock Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 31/39] thermal: Defer thermal wakups to threads Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 32/39] lockdep: selftest: fix warnings due to missing PREEMPT_RT conditionals Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 33/39] fs/aio: simple simple work Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 34/39] timers: Track total number of timers in list Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 35/39] timers: Reduce __run_timers() latency for empty list Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 36/39] timers: Reduce future __run_timers() latency for newly emptied list Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 37/39] timers: Reduce future __run_timers() latency for first add to empty list Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 38/39] staging: Mark rtl8821ae as broken Steven Rostedt
2015-03-12 19:13 ` [PATCH RT 39/39] Linux 3.14.34-rt32-rc1 Steven Rostedt
2015-03-13  4:49 ` [PATCH RT 00/39] " Mike Galbraith
2015-03-13 13:50   ` Steven Rostedt
2015-03-13 15:11   ` Steven Rostedt
2015-03-13 15:24     ` Steven Rostedt
2015-03-13 11:01 ` Sebastian Andrzej Siewior
2015-03-13 11:33 ` Sebastian Andrzej Siewior
2015-03-16 13:59 ` Sebastian Andrzej Siewior
2015-03-16 14:02   ` Steven Rostedt
2015-03-16 14:10     ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).