* 2.6.28-rc3-git6: Reported regressions from 2.6.27
@ 2008-11-09 17:53 Rafael J. Wysocki
  2008-11-09 17:53 ` [Bug #11799] xorg can not start up with stolen memory Rafael J. Wysocki
                   ` (38 more replies)
  0 siblings, 39 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:53 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich,
	Kernel Testers List, Network Development, Linux ACPI,
	Linux PM List, Linux SCSI List

This message contains a list of some regressions from 2.6.27 for which, as far
as I know, there are no fixes in the mainline yet.  If any of them have been
fixed already, please let me know.

If you know of any other unresolved regressions from 2.6.27, please let me know
too and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.

Each entry from the list will also be sent in an automatic reply to this
message, with CCs to the people involved in reporting and handling the
issue.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2008-11-09       73       40          27
  2008-11-02       55       41          29
  2008-10-25       26       25          20
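
The columns of the table above can be cross-checked with a little arithmetic.
The following is only a sketch in Python, under the assumption (not stated in
the report) that "Pending" is the sum of unresolved regressions and regressions
with patches, and that the remainder of "Total" consists of entries already
closed.  The helper name breakdown is hypothetical.

```python
# Rows copied from the statistics table: (date, total, pending, unresolved).
rows = [
    ("2008-11-09", 73, 40, 27),
    ("2008-11-02", 55, 41, 29),
    ("2008-10-25", 26, 25, 20),
]


def breakdown(total, pending, unresolved):
    """Derive the implied counts, assuming pending = unresolved + with_patches."""
    return {
        "with_patches": pending - unresolved,  # pending entries that have a patch
        "closed": total - pending,             # assumed: no longer pending
    }


for date, total, pending, unresolved in rows:
    print(date, breakdown(total, pending, unresolved))
```

For the newest row this gives 13 regressions with patches, which agrees with
the number of entries under "Regressions with patches" in this report.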


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11996
Subject		: Tracing framework regression in 2.6.28-rc3
Submitter	: Pekka Paalanen <pq@iki.fi>
Date		: 2008-11-09 10:13 (1 day old)
References	: http://marc.info/?l=linux-kernel&m=122624392229317&w=4
Handled-By	: Steven Rostedt <rostedt@goodmis.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11994
Subject		: Computer doesn't power down after commit ACPI: EC: do transaction from interrupt context
Submitter	: François Valenduc <francois.valenduc@tvcablenet.be>
Date		: 2008-11-09 02:02 (1 day old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5ceb40417bca2045350e77f740e0c4c94875fff2
Handled-By	: ykzhao <yakui.zhao@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11989
Subject		: Suspend failure on NForce4-based boards due to changes in stop_machine
Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
Date		: 2008-11-03 0:28 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
References	: http://marc.info/?l=linux-kernel&m=122567187604356&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11987
Subject		: Bootup time regression from 2.6.27 to 2.6.28-rc3+
Submitter	: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Date		: 2008-11-04 17:33 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=122582006601658&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11986
Subject		: 2.6.28-rc2-git1: spitz still won't boot
Submitter	: Pavel Machek <pavel@suse.cz>
Date		: 2008-11-05 14:23 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=122589528016337&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11984
Subject		: regression when switching TTY->X, input related?
Submitter	: Bernhard Schmidt <berni@birkenwald.de>
Date		: 2008-11-05 22:04 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=122592278403853&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11970
Subject		: gettimeofday return a old time in mmbench
Submitter	: alexs <alex.shi@intel.com>
Date		: 2008-11-06 23:57 (4 days old)
Handled-By	: Ingo Molnar <mingo@elte.hu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11965
Subject		: regression introduced by - timers: fix itimer/many thread hang
Submitter	: Doug Chapman <doug.chapman@hp.com>
Date		: 2008-11-06 11:03 (4 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f06febc96ba8e0af80bcc3eaec0a109e88275fac
References	: http://marc.info/?l=linux-kernel&m=122596943416648&w=4
Handled-By	: Frank Mayhar <fmayhar@google.com>
		  Peter Zijlstra <peterz@infradead.org>
		  Ingo Molnar <mingo@elte.hu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11958
Subject		: [2.6.27.x => 2.6.28-rc3] Xorg crash with xf86MapVidMem error
Submitter	: Tomasz Chmielewski <tch@wpkg.org>
Date		: 2008-11-05 05:37 (5 days old)


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11947
Subject		: 2.6.28-rc VC switching with Intel graphics broken
Submitter	: Romano Giannetti <romano.giannetti@gmail.com>
Date		: 2008-11-03 12:10 (7 days old)
Handled-By	: Jesse Barnes <jbarnes@virtuousgeek.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11928
Subject		: ath5k gets lost with eeepc-laptop removal
Submitter	: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Date		: 2008-10-31 13:05 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=122545827204957&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11913
Subject		: USB/INPUT: slab error in cache_alloc_debugcheck_after(): double free?
Submitter	: Helge Deller <deller@gmx.de>
Date		: 2008-10-30 23:11 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=cb8f488c33539f096580e202f5438a809195008f
References	: http://marc.info/?l=linux-kernel&m=122540833301394&w=4
Handled-By	: Jiri Kosina <jkosina@suse.cz>
		  Jiri Slaby <jirislaby@gmail.com>
		  Denys Vlasenko <vda.linux@googlemail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11908
Subject		: linux-2.6.28-rc2 regression : oprofile doesn't work anymore
Submitter	: Eric Dumazet <dada1@cosmosbay.com>
Date		: 2008-10-30 18:01 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c493756e2a8a78bcaae30668317890dcfe86e7c3
References	: http://marc.info/?l=linux-kernel&m=122539004100532&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11906
Subject		: 2.6.28-rc2 seems to fail at powering down the monitor when it should
Submitter	: Gene Heskett <gene.heskett@gmail.com>
Date		: 2008-10-30 6:39 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=122534879721424&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11905
Subject		: lots of extra timer interrupts costing 2W
Submitter	: Theodore Ts'o <tytso@mit.edu>
Date		: 2008-10-30 2:18 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fb02fbc14d17837b4b7b02dbb36142c16a7bf208
References	: http://marc.info/?l=linux-kernel&m=122533314305315&w=4
		  http://marc.info/?l=linux-kernel&m=122541849114444&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11899
Subject		: sometime boot failed on T61 laptop
Submitter	: alexs <alex.shi@intel.com>
Date		: 2008-10-30 02:04 (11 days old)
Handled-By	: Tejun Heo <tj@kernel.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11896
Subject		: [2.6.28-rc2] EeePC ACPI errors & exceptions
Submitter	: Darren Salt <linux@youmustbejoking.demon.co.uk>
Date		: 2008-10-27 22:52 (14 days old)
References	: http://marc.info/?l=linux-kernel&m=122514911328761&w=4
Handled-By	: Alexey Starikovskiy <aystarik@gmail.com>
		  Zhao Yakui <yakui.zhao@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11891
Subject		: resume from disk broken on hp/compaq nx7000 (DRM problem)
Submitter	: Markus Meier <maekke@gentoo.org>
Date		: 2008-10-29 14:42 (12 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0a3e67a4caac273a3bfc4ced3da364830b1ab241
Handled-By	: Jesse Barnes <jbarnes@virtuousgeek.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11875
Subject		: radeonfb lockup in .28-rc (bisected)
Submitter	: James Cloos <cloos@jhcloos.com>
Date		: 2008-10-28 0:00 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b1ee26bab14886350ba12a5c10cbc0696ac679bf
References	: http://marc.info/?l=linux-kernel&m=122515210200530&w=4
Handled-By	: Benjamin Herrenschmidt <benh@kernel.crashing.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11873
Subject		: unable to mount ext3 root filesystem due to htree_dirblock_to_tree
Submitter	: Jimmy.Jazz@gmx.net
Date		: 2008-10-28 05:09 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c46501d1659475dc6c89554af6ce7fe6ecf615c
Handled-By	: Tejun Heo <tj@kernel.org>
		  Neil Brown <neilb@suse.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11858
Subject		: Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9
Submitter	: Tejun Heo <tj@kernel.org>
Date		: 2008-10-26 9:46 (15 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=242f9dcb8ba6f68fcd217a119a7648a4f69290e9
References	: http://marc.info/?l=linux-kernel&m=122501447326698&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11849
Subject		: default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems)
Submitter	: Kumar Gala <galak@kernel.crashing.org>
Date		: 2008-10-24 12:45 (17 days old)
References	: http://marc.info/?l=linux-kernel&m=122485245924125&w=4
Handled-By	: Chris Snook <csnook@redhat.com>
		  Scott Wood <scottwood@freescale.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11834
Subject		: iwl3945: if I leave my machine running overnight, wifi will not work in the morning
Submitter	: Pavel Machek <pavel@suse.cz>
Date		: 2008-10-19 21:40 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=122445440206101&w=4
Handled-By	: reinette chatre <reinette.chatre@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11828
Subject		: Linux 2.6.27-git3: no SD card reader
Submitter	: J.A. Magallón <jamagallon@ono.com>
Date		: 2008-10-14 0:54 (27 days old)
References	: http://marc.info/?l=linux-kernel&m=122394573904699&w=4
Handled-By	: Pierre Ossman <drzeus-list@drzeus.cx>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11826
Subject		: extreme slowness of IO stuff using 2.6.28-rc1
Submitter	: Yves-Alexis Perez <corsac@debian.org>
Date		: 2008-10-25 04:25 (16 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dc4304f7deee29fcdf6a2b62f7146ea7f505fd42
References	: http://marc.info/?l=linux-kernel&m=122521238402963&w=4
Handled-By	: Arjan van de Ven <arjan@linux.intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11822
Subject		: ACPI Warning (nspredef-0858): _SB_.PCI0.LPC_.EC__.BAT0._BIF: Return Package type mismatch at index 9 - found Buffer, expected String [20080926]
Submitter	: Len Brown <len.brown@intel.com>
Date		: 2008-10-25 01:26 (16 days old)
Handled-By	: Robert Moore <Robert.Moore@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11799
Subject		: xorg can not start up with stolen memory
Submitter	: arrow zhang <arrow.ebd@gmail.com>
Date		: 2008-10-21 06:08 (20 days old)


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11988
Subject		: Eliminate recursive mutex in compat fb ioctl path
Submitter	: Keith Packard <keithp@keithp.com>
Date		: 2008-11-03 7:06 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
Handled-By	: Keith Packard <keithp@keithp.com>
		  Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Patch		: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
		  http://lkml.org/lkml/2008/10/31/162


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11985
Subject		: 2.6.28-rc3 truncates nfsd results
Submitter	: Doug Nazar <nazard@dragoninc.ca>
Date		: 2008-11-04 18:27 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=122582366509153&w=4
Handled-By	: Doug Nazar <nazard@dragoninc.ca>
		  J. Bruce Fields <bfields@fieldses.org>
Patch		: http://marc.info/?l=linux-kernel&m=122592648119790&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11982
Subject		: Fan level 7 after resume with 2.6.28-rc3
Submitter	: Tino Keitel <tino.keitel@tikei.de>
Date		: 2008-11-05 7:33 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=122587043409186&w=4
Handled-By	: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=18744&action=view


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11942
Subject		: AMD64 reboot regression
Submitter	: Michael B. Trausch <mike@trausch.us>
Date		: 2008-11-02 20:30 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=122565790519736&w=4
Handled-By	: Len Brown <lenb@kernel.org>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=11942#c11


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11937
Subject		: ext3 __log_wait_for_space: no transactions
Submitter	: Meelis Roos <mroos@linux.ee>
Date		: 2008-10-30 9:49 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=122536026105643&w=4
Handled-By	: Theodore Tso <tytso@mit.edu>
Patch		: http://lkml.org/lkml/2008/11/1/61


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11925
Subject		: cdrom: missing compat ioctls
Submitter	: Andreas Schwab <schwab@suse.de>
Date		: 2008-10-31 14:02 (10 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=33c2dca4957bd0da3e1af7b96d0758d97e708ef6
Handled-By	: Andreas Schwab <schwab@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=122548923531545&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11917
Subject		: Asus Eee PC hotkeys stop working after prolonged usage
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2008-10-31 03:21 (10 days old)
Handled-By	: Alexey Starikovskiy <astarikovskiy@suse.de>
Patch		: http://marc.info/?l=linux-acpi&m=122603281422097&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11911
Subject		: new PCMCIA device instance after resume - orinoco can't download firmware
Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
Date		: 2008-10-28 19:19 (13 days old)
References	: http://marc.info/?l=linux-wireless&m=122522165719760&w=4
Handled-By	: Dave <kilroyd@googlemail.com>
Patch		: http://marc.info/?l=linux-wireless&m=122539058601588&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11903
Subject		: regression: vmalloc easily fail
Submitter	: Glauber Costa <glommer@redhat.com>
Date		: 2008-10-28 20:59 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db64fe02258f1507e13fe5212a989922323685ce
References	: http://marc.info/?l=linux-kernel&m=122522755530998&w=4
Handled-By	: Glauber Costa <glommer@redhat.com>
		  Nick Piggin <npiggin@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=122609055221549&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11898
Subject		: mke2fs hang on AIC79 device.
Submitter	: alexs <alex.shi@intel.com>
Date		: 2008-10-30 01:17 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f0c0a376d0fcd4c5579ecf5e95f88387cba85211
Handled-By	: James Bottomley <James.Bottomley@HansenPartnership.com>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=11898#c28


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11895
Subject		: 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000
Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
Date		: 2008-10-28 19:05 (13 days old)
References	: http://marc.info/?l=linux-acpi&m=122522085418555&w=4
Handled-By	: Andrey Borzenkov <arvidjaar@mail.ru>
Patch		: http://marc.info/?l=linux-kernel&m=122547719810921&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11841
Subject		: plenty of line "ACPI: EC: non-query interrupt received, switching to interrupt mode" in dmesg and system not powering down
Submitter	: François Valenduc <francois.valenduc@tvcablenet.be>
Date		: 2008-10-25 10:29 (16 days old)
Handled-By	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Patch		: http://marc.info/?l=linux-acpi&m=122603281922125&w=4


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11806
Subject		: iwl3945 fails with microcode error
Submitter	: Johannes Berg <johannes@sipsolutions.net>
Date		: 2008-10-22 02:36 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=122450235730661&w=4
Handled-By	: Reinette Chatre <reinette.chatre@intel.com>
Patch		: http://marc.info/?l=linux-wireless&m=122583010822172&w=2


For details, please visit the bug entries and follow the links given in
references.
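
The entries listed above share a simple layout: a field name, a colon, a value,
and tab-indented continuation lines for additional values of the same field.
As a minimal sketch only (Python, using a hypothetical parse_entry helper that
is not part of whatever tooling generated this report), one entry could be
turned into a dictionary like this:

```python
import re

# Matches "Field-Name<whitespace>: value" at the start of an unindented line.
FIELD_RE = re.compile(r"^([A-Za-z-]+)\s*:\s*(.*)$")


def parse_entry(text):
    """Parse one 'Field : value' block into a dict mapping field -> list of values."""
    entry = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between or around entries
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1)
            entry.setdefault(current, []).append(match.group(2).strip())
        elif current is not None:
            # Indented continuation line: one more value for the previous field.
            entry[current].append(line.strip())
    return entry


# Sample text copied from the entry for bug 11849 in this report.
sample = (
    "Bug-Entry\t: http://bugzilla.kernel.org/show_bug.cgi?id=11849\n"
    "Subject\t\t: default IRQ affinity change in v2.6.27 "
    "(breaking several SMP PPC based systems)\n"
    "Submitter\t: Kumar Gala <galak@kernel.crashing.org>\n"
    "Date\t\t: 2008-10-24 12:45 (17 days old)\n"
    "Handled-By\t: Chris Snook <csnook@redhat.com>\n"
    "\t\t  Scott Wood <scottwood@freescale.com>\n"
)

entry = parse_entry(sample)
print(entry["Handled-By"])
```

Keeping every field as a list avoids special-casing fields such as Handled-By,
References and Patch, which may span several lines.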

As you can see, there is a Bugzilla entry for each of the listed regressions.
There is also a Bugzilla entry used for tracking all regressions from 2.6.27,
both unresolved and resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=11808

Please let me know if there are any Bugzilla entries that should be added to
that list.

Thanks,
Rafael



* [Bug #11799] xorg can not start up with stolen memory
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
@ 2008-11-09 17:53 ` Rafael J. Wysocki
  2008-11-09 17:54 ` [Bug #11806] iwl3945 fails with microcode error Rafael J. Wysocki
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:53 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, arrow zhang

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11799
Subject		: xorg can not start up with stolen memory
Submitter	: arrow zhang <arrow.ebd@gmail.com>
Date		: 2008-10-21 06:08 (20 days old)




* [Bug #11806] iwl3945 fails with microcode error
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
  2008-11-09 17:53 ` [Bug #11799] xorg can not start up with stolen memory Rafael J. Wysocki
@ 2008-11-09 17:54 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11849] default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems) Rafael J. Wysocki
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:54 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Johannes Berg, Reinette Chatre

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11806
Subject		: iwl3945 fails with microcode error
Submitter	: Johannes Berg <johannes@sipsolutions.net>
Date		: 2008-10-22 02:36 (19 days old)
References	: http://marc.info/?l=linux-kernel&m=122450235730661&w=4
Handled-By	: Reinette Chatre <reinette.chatre@intel.com>
Patch		: http://marc.info/?l=linux-wireless&m=122583010822172&w=2




* [Bug #11849] default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems)
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
  2008-11-09 17:53 ` [Bug #11799] xorg can not start up with stolen memory Rafael J. Wysocki
  2008-11-09 17:54 ` [Bug #11806] iwl3945 fails with microcode error Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11834] iwl3945: if I leave my machine running overnight, wifi will not work in the morning Rafael J. Wysocki
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Chris Snook, Kumar Gala, Scott Wood

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11849
Subject		: default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems)
Submitter	: Kumar Gala <galak@kernel.crashing.org>
Date		: 2008-10-24 12:45 (17 days old)
References	: http://marc.info/?l=linux-kernel&m=122485245924125&w=4
Handled-By	: Chris Snook <csnook@redhat.com>
		  Scott Wood <scottwood@freescale.com>




* [Bug #11841] plenty of line "ACPI: EC: non-query interrupt received, switching to interrupt mode" in dmesg and system not powering down
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (5 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11826] extreme slowness of IO stuff using 2.6.28-rc1 Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11891] resume from disk broken on hp/compaq nx7000 (DRM problem) Rafael J. Wysocki
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Jenkins, François Valenduc

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11841
Subject		: plenty of line "ACPI: EC: non-query interrupt received, switching to interrupt mode" in dmesg and system not powering down
Submitter	: François Valenduc <francois.valenduc@tvcablenet.be>
Date		: 2008-10-25 10:29 (16 days old)
Handled-By	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Patch		: http://marc.info/?l=linux-acpi&m=122603281922125&w=4




* [Bug #11834] iwl3945: if I leave my machine running overnight, wifi will not work in the morning
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11849] default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems) Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11822] ACPI Warning (nspredef-0858): _SB_.PCI0.LPC_.EC__.BAT0._BIF: Return Package type mismatch at index 9 - found Buffer, expected String [20080926] Rafael J. Wysocki
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Pavel Machek, reinette chatre

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11834
Subject		: iwl3945: if I leave my machine running overnight, wifi will not work in the morning
Submitter	: Pavel Machek <pavel@suse.cz>
Date		: 2008-10-19 21:40 (22 days old)
References	: http://marc.info/?l=linux-kernel&m=122445440206101&w=4
Handled-By	: reinette chatre <reinette.chatre@intel.com>




* [Bug #11826] extreme slowness of IO stuff using 2.6.28-rc1
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11822] ACPI Warning (nspredef-0858): _SB_.PCI0.LPC_.EC__.BAT0._BIF: Return Package type mismatch at index 9 - found Buffer, expected String [20080926] Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11841] plenty of line "ACPI: EC: non-query interrupt received, switching to interrupt mode" in dmesg and system not powering down Rafael J. Wysocki
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Arjan van de Ven, Carlos R. Mafra,
	Frans Pop, Yves-Alexis Perez

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11826
Subject		: extreme slowness of IO stuff using 2.6.28-rc1
Submitter	: Yves-Alexis Perez <corsac@debian.org>
Date		: 2008-10-25 04:25 (16 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dc4304f7deee29fcdf6a2b62f7146ea7f505fd42
References	: http://marc.info/?l=linux-kernel&m=122521238402963&w=4
Handled-By	: Arjan van de Ven <arjan@linux.intel.com>




* [Bug #11822] ACPI Warning (nspredef-0858): _SB_.PCI0.LPC_.EC__.BAT0._BIF: Return Package type mismatch at index 9 - found Buffer, expected String [20080926]
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11834] iwl3945: if I leave my machine running overnight, wifi will not work in the morning Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11826] extreme slowness of IO stuff using 2.6.28-rc1 Rafael J. Wysocki
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Len Brown, Robert Moore

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11822
Subject		: ACPI Warning (nspredef-0858): _SB_.PCI0.LPC_.EC__.BAT0._BIF: Return Package type mismatch at index 9 - found Buffer, expected String [20080926]
Submitter	: Len Brown <len.brown@intel.com>
Date		: 2008-10-25 01:26 (16 days old)
Handled-By	: Robert Moore <Robert.Moore@intel.com>




* [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (7 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11891] resume from disk broken on hp/compaq nx7000 (DRM problem) Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 21:15   ` Benjamin Herrenschmidt
  2008-11-10  5:46   ` Benjamin Herrenschmidt
  2008-11-09 17:59 ` [Bug #11873] unable to mount ext3 root filesystem due to htree_dirblock_to_tree Rafael J. Wysocki
                   ` (29 subsequent siblings)
  38 siblings, 2 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Morton, Benjamin Herrenschmidt,
	David S. Miller, James Cloos, Linus Torvalds

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11875
Subject		: radeonfb lockup in .28-rc (bisected)
Submitter	: James Cloos <cloos@jhcloos.com>
Date		: 2008-10-28 0:00 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b1ee26bab14886350ba12a5c10cbc0696ac679bf
References	: http://marc.info/?l=linux-kernel&m=122515210200530&w=4
Handled-By	: Benjamin Herrenschmidt <benh@kernel.crashing.org>




* [Bug #11873] unable to mount ext3 root filesystem due to htree_dirblock_to_tree
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (8 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11875] radeonfb lockup in .28-rc (bisected) Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11858] Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9 Rafael J. Wysocki
                   ` (28 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Jens Axboe, Jimmy.Jazz@gmx.net, Neil Brown,
	Tejun Heo

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11873
Subject		: unable to mount ext3 root filesystem due to htree_dirblock_to_tree
Submitter	: Jimmy.Jazz@gmx.net
Date		: 2008-10-28 05:09 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4c46501d1659475dc6c89554af6ce7fe6ecf615c
Handled-By	: Tejun Heo <tj@kernel.org>
		  Neil Brown <neilb@suse.de>




* [Bug #11891] resume from disk broken on hp/compaq nx7000 (DRM problem)
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (6 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11841] plenty of line "ACPI: EC: non-query interrupt received, switching to interrupt mode" in dmesg and system not powering down Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11875] radeonfb lockup in .28-rc (bisected) Rafael J. Wysocki
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Dave Airlie, Eric Anholt, Jesse Barnes,
	Markus Meier

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11891
Subject		: resume from disk broken on hp/compaq nx7000 (DRM problem)
Submitter	: Markus Meier <maekke@gentoo.org>
Date		: 2008-10-29 14:42 (12 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0a3e67a4caac273a3bfc4ced3da364830b1ab241
Handled-By	: Jesse Barnes <jbarnes@virtuousgeek.org>




* [Bug #11858] Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (9 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11873] unable to mount ext3 root filesystem due to htree_dirblock_to_tree Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11896] [2.6.28-rc2] EeePC ACPI errors & exceptions Rafael J. Wysocki
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Tejun Heo

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11858
Subject		: Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9
Submitter	: Tejun Heo <tj@kernel.org>
Date		: 2008-10-26 09:46 (15 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=242f9dcb8ba6f68fcd217a119a7648a4f69290e9
References	: http://marc.info/?l=linux-kernel&m=122501447326698&w=4




* [Bug #11898] mke2fs hang on AIC79 device.
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (13 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000 Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11903] regression: vmalloc easily fail Rafael J. Wysocki
                   ` (23 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, alexs, James Bottomley, Mike Christie, Yanmin Zhang

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11898
Subject		: mke2fs hang on AIC79 device.
Submitter	: alexs <alex.shi@intel.com>
Date		: 2008-10-30 01:17 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f0c0a376d0fcd4c5579ecf5e95f88387cba85211
Handled-By	: James Bottomley <James.Bottomley@HansenPartnership.com>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=11898#c28




* [Bug #11899] sometime boot failed on T61 laptop
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (11 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11896] [2.6.28-rc2] EeePC ACPI errors & exceptions Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000 Rafael J. Wysocki
                   ` (25 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, alexs, Tejun Heo, Yanmin Zhang

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11899
Subject		: sometime boot failed on T61 laptop
Submitter	: alexs <alex.shi@intel.com>
Date		: 2008-10-30 02:04 (11 days old)
Handled-By	: Tejun Heo <tj@kernel.org>




* [Bug #11896] [2.6.28-rc2] EeePC ACPI errors & exceptions
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (10 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11858] Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9 Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11899] sometime boot failed on T61 laptop Rafael J. Wysocki
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alexey Starikovskiy, Darren Salt, Zhao Yakui

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11896
Subject		: [2.6.28-rc2] EeePC ACPI errors & exceptions
Submitter	: Darren Salt <linux@youmustbejoking.demon.co.uk>
Date		: 2008-10-27 22:52 (14 days old)
References	: http://marc.info/?l=linux-kernel&m=122514911328761&w=4
Handled-By	: Alexey Starikovskiy <aystarik@gmail.com>
		  Zhao Yakui <yakui.zhao@intel.com>




* [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (12 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11899] sometime boot failed on T61 laptop Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-10 16:53   ` Andrey Borzenkov
  2008-11-09 17:59 ` [Bug #11898] mke2fs hang on AIC79 device Rafael J. Wysocki
                   ` (24 subsequent siblings)
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrey Borzenkov, Avi Kivity, Ingo Molnar

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11895
Subject		: 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000
Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
Date		: 2008-10-28 19:05 (13 days old)
References	: http://marc.info/?l=linux-acpi&m=122522085418555&w=4
Handled-By	: Andrey Borzenkov <arvidjaar@mail.ru>
Patch		: http://marc.info/?l=linux-kernel&m=122547719810921&w=4




* [Bug #11906] 2.6.28-rc2 seems to fail at powering down the monitor when it should
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (15 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11903] regression: vmalloc easily fail Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11905] lots of extra timer interrupts costing 2W Rafael J. Wysocki
                   ` (21 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Gene Heskett

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11906
Subject		: 2.6.28-rc2 seems to fail at powering down the monitor when it should
Submitter	: Gene Heskett <gene.heskett@gmail.com>
Date		: 2008-10-30 06:39 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=122534879721424&w=4




* [Bug #11905] lots of extra timer interrupts costing 2W
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (16 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11906] 2.6.28-rc2 seems to fail at powering down the monitor when it should Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11913] USB/INPUT: slab error in cache_alloc_debugcheck_after(): double free? Rafael J. Wysocki
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Lukas Hejtmanek, Theodore Ts'o,
	Thomas Gleixner, Venki Pallipadi

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11905
Subject		: lots of extra timer interrupts costing 2W
Submitter	: Theodore Ts'o <tytso@mit.edu>
Date		: 2008-10-30 02:18 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fb02fbc14d17837b4b7b02dbb36142c16a7bf208
References	: http://marc.info/?l=linux-kernel&m=122533314305315&w=4
		  http://marc.info/?l=linux-kernel&m=122541849114444&w=4




* [Bug #11903] regression: vmalloc easily fail
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (14 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11898] mke2fs hang on AIC79 device Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11906] 2.6.28-rc2 seems to fail at powering down the monitor when it should Rafael J. Wysocki
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Andrew Morton, Glauber Costa,
	Linus Torvalds, Nick Piggin

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11903
Subject		: regression: vmalloc easily fail
Submitter	: Glauber Costa <glommer@redhat.com>
Date		: 2008-10-28 20:59 (13 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=db64fe02258f1507e13fe5212a989922323685ce
References	: http://marc.info/?l=linux-kernel&m=122522755530998&w=4
Handled-By	: Glauber Costa <glommer@redhat.com>
		  Nick Piggin <npiggin@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=122609055221549&w=4




* [Bug #11908] linux-2.6.28-rc2 regression : oprofile doesnt work anymore
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (19 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11911] new PCMCIA device instance after resume - orinoco can't download firmware Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11917] Asus Eee PC hotkeys stop working after prolonged usage Rafael J. Wysocki
                   ` (17 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Eric Dumazet, Jesper Dangaard Brouer,
	Pekka Enberg, Robert Richter

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11908
Subject		: linux-2.6.28-rc2 regression : oprofile doesnt work anymore
Submitter	: Eric Dumazet <dada1@cosmosbay.com>
Date		: 2008-10-30 18:01 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c493756e2a8a78bcaae30668317890dcfe86e7c3
References	: http://marc.info/?l=linux-kernel&m=122539004100532&w=4




* [Bug #11917] Asus Eee PC hotkeys stop working after prolonged usage
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (20 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11908] linux-2.6.28-rc2 regression : oprofile doesnt work anymore Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11928] ath5k gets lost with eeepc-laptop removal Rafael J. Wysocki
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Jenkins, Alexey Starikovskiy

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11917
Subject		: Asus Eee PC hotkeys stop working after prolonged usage
Submitter	: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date		: 2008-10-31 03:21 (10 days old)
Handled-By	: Alexey Starikovskiy <astarikovskiy@suse.de>
Patch		: http://marc.info/?l=linux-acpi&m=122603281422097&w=4




* [Bug #11913] USB/INPUT: slab error in cache_alloc_debugcheck_after(): double free?
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (17 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11905] lots of extra timer interrupts costing 2W Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11911] new PCMCIA device instance after resume - orinoco can't download firmware Rafael J. Wysocki
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Denys Vlasenko, Helge Deller,
	Jeroen Roovers, Jiri Kosina, Jiri Slaby

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11913
Subject		: USB/INPUT: slab error in cache_alloc_debugcheck_after(): double free?
Submitter	: Helge Deller <deller@gmx.de>
Date		: 2008-10-30 23:11 (11 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=cb8f488c33539f096580e202f5438a809195008f
References	: http://marc.info/?l=linux-kernel&m=122540833301394&w=4
Handled-By	: Jiri Kosina <jkosina@suse.cz>
		  Jiri Slaby <jirislaby@gmail.com>
		  Denys Vlasenko <vda.linux@googlemail.com>




* [Bug #11911] new PCMCIA device instance after resume - orinoco can't download firmware
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (18 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11913] USB/INPUT: slab error in cache_alloc_debugcheck_after(): double free? Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-10  3:55   ` Andrey Borzenkov
  2008-11-09 17:59 ` [Bug #11908] linux-2.6.28-rc2 regression : oprofile doesnt work anymore Rafael J. Wysocki
                   ` (18 subsequent siblings)
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Andrey Borzenkov, Dave

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11911
Subject		: new PCMCIA device instance after resume - orinoco can't download firmware
Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
Date		: 2008-10-28 19:19 (13 days old)
References	: http://marc.info/?l=linux-wireless&m=122522165719760&w=4
Handled-By	: Dave <kilroyd@googlemail.com>
Patch		: http://marc.info/?l=linux-wireless&m=122539058601588&w=4




* [Bug #11925] cdrom: missing compat ioctls
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (23 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11937] ext3 __log_wait_for_space: no transactions Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 23:00   ` Andreas Schwab
  2008-11-09 17:59 ` [Bug #11942] AMD64 reboot regression Rafael J. Wysocki
                   ` (13 subsequent siblings)
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Andreas Schwab

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11925
Subject		: cdrom: missing compat ioctls
Submitter	: Andreas Schwab <schwab@suse.de>
Date		: 2008-10-31 14:02 (10 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=33c2dca4957bd0da3e1af7b96d0758d97e708ef6
Handled-By	: Andreas Schwab <schwab@suse.de>
Patch		: http://marc.info/?l=linux-kernel&m=122548923531545&w=2




* [Bug #11928] ath5k gets lost with eeepc-laptop removal
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (21 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11917] Asus Eee PC hotkeys stop working after prolonged usage Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11937] ext3 __log_wait_for_space: no transactions Rafael J. Wysocki
                   ` (15 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Alan Jenkins, Luiz Fernando N. Capitulino,
	Matthew Garrett, Nick Kossifidis

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11928
Subject		: ath5k gets lost with eeepc-laptop removal
Submitter	: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Date		: 2008-10-31 13:05 (10 days old)
References	: http://marc.info/?l=linux-kernel&m=122545827204957&w=4
Handled-By	: Nick Kossifidis <mickflemm@gmail.com>




* [Bug #11937] ext3 __log_wait_for_space: no transactions
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (22 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11928] ath5k gets lost with eeepc-laptop removal Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11925] cdrom: missing compat ioctls Rafael J. Wysocki
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bartlomiej Zolnierkiewicz, Meelis Roos,
	Simon Arlott, Theodore Tso

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11937
Subject		: ext3 __log_wait_for_space: no transactions
Submitter	: Meelis Roos <mroos@linux.ee>
Date		: 2008-10-30 09:49 (11 days old)
References	: http://marc.info/?l=linux-kernel&m=122536026105643&w=4
Handled-By	: Theodore Tso <tytso@mit.edu>
Patch		: http://lkml.org/lkml/2008/11/1/61




* [Bug #11942] AMD64 reboot regression
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (24 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11925] cdrom: missing compat ioctls Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11947] 2.6.28-rc VC switching with Intel graphics broken Rafael J. Wysocki
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Len Brown, Michael B. Trausch, Zhao Yakui

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11942
Subject		: AMD64 reboot regression
Submitter	: Michael B. Trausch <mike@trausch.us>
Date		: 2008-11-02 20:30 (8 days old)
References	: http://marc.info/?l=linux-kernel&m=122565790519736&w=4
Handled-By	: Len Brown <lenb@kernel.org>
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=11942#c11




* [Bug #11965] regression introduced by - timers: fix itimer/many thread hang
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (27 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11958] [2.6.27.x => 2.6.28-rc3] Xorg crash with xf86MapVidMem error Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11984] regression when switching TTY->X, input related? Rafael J. Wysocki
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Doug Chapman, Frank Mayhar, Ingo Molnar,
	Peter Zijlstra

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11965
Subject		: regression introduced by - timers: fix itimer/many thread hang
Submitter	: Doug Chapman <doug.chapman@hp.com>
Date		: 2008-11-06 11:03 (4 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f06febc96ba8e0af80bcc3eaec0a109e88275fac
References	: http://marc.info/?l=linux-kernel&m=122596943416648&w=4
Handled-By	: Frank Mayhar <fmayhar@google.com>
		  Peter Zijlstra <peterz@infradead.org>
		  Ingo Molnar <mingo@elte.hu>




* [Bug #11958] [2.6.27.x => 2.6.28-rc3] Xorg crash with xf86MapVidMem error
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (26 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11947] 2.6.28-rc VC switching with Intel graphics broken Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11965] regression introduced by - timers: fix itimer/many thread hang Rafael J. Wysocki
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Tomasz Chmielewski

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11958
Subject		: [2.6.27.x => 2.6.28-rc3] Xorg crash with xf86MapVidMem error
Submitter	: Tomasz Chmielewski <tch@wpkg.org>
Date		: 2008-11-05 05:37 (5 days old)




* [Bug #11947] 2.6.28-rc VC switching with Intel graphics broken
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (25 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11942] AMD64 reboot regression Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-11  9:28   ` Romano Giannetti
  2008-11-09 17:59 ` [Bug #11958] [2.6.27.x => 2.6.28-rc3] Xorg crash with xf86MapVidMem error Rafael J. Wysocki
                   ` (11 subsequent siblings)
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Jesse Barnes, Romano Giannetti

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11947
Subject		: 2.6.28-rc VC switching with Intel graphics broken
Submitter	: Romano Giannetti <romano.giannetti@gmail.com>
Date		: 2008-11-03 12:10 (7 days old)
Handled-By	: Jesse Barnes <jbarnes@virtuousgeek.org>




* [Bug #11985] 2.6.28-rc3 truncates nfsd results
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (30 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11982] Fan level 7 after resume wit 2.6.28-rc3 Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 21:05   ` J. Bruce Fields
  2008-11-09 17:59 ` [Bug #11970] gettimeofday return a old time in mmbench Rafael J. Wysocki
                   ` (6 subsequent siblings)
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Doug Nazar, J. Bruce Fields, Jeff Garzik

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11985
Subject		: 2.6.28-rc3 truncates nfsd results
Submitter	: Doug Nazar <nazard@dragoninc.ca>
Date		: 2008-11-04 18:27 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=122582366509153&w=4
Handled-By	: Doug Nazar <nazard@dragoninc.ca>
		  J. Bruce Fields <bfields@fieldses.org>
Patch		: http://marc.info/?l=linux-kernel&m=122592648119790&w=4




* [Bug #11984] regression when switching TTY->X, input related?
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (28 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11965] regression introduced by - timers: fix itimer/many thread hang Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11982] Fan level 7 after resume wit 2.6.28-rc3 Rafael J. Wysocki
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Bernhard Schmidt, Dmitry Torokhov, Jiri Kosina

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11984
Subject		: regression when switching TTY->X, input related?
Submitter	: Bernhard Schmidt <berni@birkenwald.de>
Date		: 2008-11-05 22:04 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=122592278403853&w=4




* [Bug #11970] gettimeofday return a old time in mmbench
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (31 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11985] 2.6.28-rc3 truncates nfsd results Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11986] 2.6.28-rc2-git1: spitz still won't boot Rafael J. Wysocki
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, alexs, Ingo Molnar

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11970
Subject		: gettimeofday return a old time in mmbench
Submitter	: alexs <alex.shi@intel.com>
Date		: 2008-11-06 23:57 (4 days old)
Handled-By	: Ingo Molnar <mingo@elte.hu>




* [Bug #11982] Fan level 7 after resume wit 2.6.28-rc3
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (29 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11984] regression when switching TTY->X, input related? Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11985] 2.6.28-rc3 truncates nfsd results Rafael J. Wysocki
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Henrique de Moraes Holschuh, Tino Keitel

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11982
Subject		: Fan level 7 after resume wit 2.6.28-rc3
Submitter	: Tino Keitel <tino.keitel@tikei.de>
Date		: 2008-11-05 7:33 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=122587043409186&w=4
Handled-By	: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=18744&action=view



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11994] Computer doesn't power down after commit ACPI: EC: do transaction from interrupt context
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (34 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11988] Eliminate recursive mutex in compat fb ioctl path Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine Rafael J. Wysocki
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, François Valenduc, ykzhao

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11994
Subject		: Computer doesn't power down after commit ACPI: EC: do transaction from interrupt context
Submitter	: François Valenduc <francois.valenduc@tvcablenet.be>
Date		: 2008-11-09 02:02 (1 day old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5ceb40417bca2045350e77f740e0c4c94875fff2
Handled-By	: ykzhao <yakui.zhao@intel.com>



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (35 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11994] Computer doesn't power down after commit ACPI: EC: do transaction from interrupt context Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-10 12:04   ` Heiko Carstens
  2008-11-09 17:59 ` [Bug #11987] Bootup time regression from 2.6.27 to 2.6.28-rc3+ Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11996] Tracing framework regression in 2.6.28-rc3 Rafael J. Wysocki
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Heiko Carstens, Rafael J. Wysocki, Rusty Russell

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11989
Subject		: Suspend failure on NForce4-based boards due to changes in stop_machine
Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
Date		: 2008-11-03 0:28 (7 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
References	: http://marc.info/?l=linux-kernel&m=122567187604356&w=4



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11988] Eliminate recursive mutex in compat fb ioctl path
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (33 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11986] 2.6.28-rc2-git1: spitz still won't boot Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-14 14:51   ` Geert Uytterhoeven
  2008-11-09 17:59 ` [Bug #11994] Computer doesn't power down after commit ACPI: EC: do transaction from interrupt context Rafael J. Wysocki
                   ` (3 subsequent siblings)
  38 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Geert Uytterhoeven, Keith Packard

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11988
Subject		: Eliminate recursive mutex in compat fb ioctl path
Submitter	: Keith Packard <keithp@keithp.com>
Date		: 2008-11-03 7:06 (7 days old)
References	: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
Handled-By	: Keith Packard <keithp@keithp.com>
		  Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Patch		: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
		  http://lkml.org/lkml/2008/10/31/162



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11986] 2.6.28-rc2-git1: spitz still won't boot
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (32 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11970] gettimeofday returns an old time in mmbench Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11988] Eliminate recursive mutex in compat fb ioctl path Rafael J. Wysocki
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Pavel Machek

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11986
Subject		: 2.6.28-rc2-git1: spitz still won't boot
Submitter	: Pavel Machek <pavel@suse.cz>
Date		: 2008-11-05 14:23 (5 days old)
References	: http://marc.info/?l=linux-kernel&m=122589528016337&w=4



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11987] Bootup time regression from 2.6.27 to 2.6.28-rc3+
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (36 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  2008-11-09 17:59 ` [Bug #11996] Tracing framework regression in 2.6.28-rc3 Rafael J. Wysocki
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Kernel Testers List, Lukas Hejtmanek

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11987
Subject		: Bootup time regression from 2.6.27 to 2.6.28-rc3+
Submitter	: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Date		: 2008-11-04 17:33 (6 days old)
References	: http://marc.info/?l=linux-kernel&m=122582006601658&w=4



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11996] Tracing framework regression in 2.6.28-rc3
  2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
                   ` (37 preceding siblings ...)
  2008-11-09 17:59 ` [Bug #11987] Bootup time regression from 2.6.27 to 2.6.28-rc3+ Rafael J. Wysocki
@ 2008-11-09 17:59 ` Rafael J. Wysocki
  38 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 17:59 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Pekka Paalanen, Steven Rostedt

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11996
Subject		: Tracing framework regression in 2.6.28-rc3
Submitter	: Pekka Paalanen <pq@iki.fi>
Date		: 2008-11-09 10:13 (1 day old)
References	: http://marc.info/?l=linux-kernel&m=122624392229317&w=4
Handled-By	: Steven Rostedt <rostedt@goodmis.org>



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11985] 2.6.28-rc3 truncates nfsd results
  2008-11-09 17:59 ` [Bug #11985] 2.6.28-rc3 truncates nfsd results Rafael J. Wysocki
@ 2008-11-09 21:05   ` J. Bruce Fields
  0 siblings, 0 replies; 106+ messages in thread
From: J. Bruce Fields @ 2008-11-09 21:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Doug Nazar, Jeff Garzik

On Sun, Nov 09, 2008 at 06:59:15PM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11985
> Subject		: 2.6.28-rc3 truncates nfsd results
> Submitter	: Doug Nazar <nazard@dragoninc.ca>
> Date		: 2008-11-04 18:27 (6 days old)
> References	: http://marc.info/?l=linux-kernel&m=122582366509153&w=4
> Handled-By	: Doug Nazar <nazard@dragoninc.ca>
> 		  J. Bruce Fields <bfields@fieldses.org>
> Patch		: http://marc.info/?l=linux-kernel&m=122592648119790&w=4

The above patch has just been submitted to Linus.

--b.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-09 17:59 ` [Bug #11875] radeonfb lockup in .28-rc (bisected) Rafael J. Wysocki
@ 2008-11-09 21:15   ` Benjamin Herrenschmidt
  2008-11-10  5:46   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-09 21:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Andrew Morton,
	David S. Miller, James Cloos, Linus Torvalds

On Sun, 2008-11-09 at 18:59 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11875
> Subject		: radeonfb lockup in .28-rc (bisected)
> Submitter	: James Cloos <cloos@jhcloos.com>
> Date		: 2008-10-28 0:00 (13 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b1ee26bab14886350ba12a5c10cbc0696ac679bf
> References	: http://marc.info/?l=linux-kernel&m=122515210200530&w=4
> Handled-By	: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> 

FYI. I'm back at work today, at which point I'll have a similar machine
to one of the victims which should allow me to either reproduce & fix,
or if I can't, send a workaround in the form of disabling that
specific acceleration unless explicitly enabled from the command line.

So expect a patch later today.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11925] cdrom: missing compat ioctls
  2008-11-09 17:59 ` [Bug #11925] cdrom: missing compat ioctls Rafael J. Wysocki
@ 2008-11-09 23:00   ` Andreas Schwab
  2008-11-09 23:29     ` Rafael J. Wysocki
  0 siblings, 1 reply; 106+ messages in thread
From: Andreas Schwab @ 2008-11-09 23:00 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Kernel Testers List

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11925
> Subject		: cdrom: missing compat ioctls
> Submitter	: Andreas Schwab <schwab@suse.de>
> Date		: 2008-10-31 14:02 (10 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=33c2dca4957bd0da3e1af7b96d0758d97e708ef6
> Handled-By	: Andreas Schwab <schwab@suse.de>
> Patch		: http://marc.info/?l=linux-kernel&m=122548923531545&w=2

The patch has been picked up by akpm.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11925] cdrom: missing compat ioctls
  2008-11-09 23:00   ` Andreas Schwab
@ 2008-11-09 23:29     ` Rafael J. Wysocki
  2008-11-09 23:39       ` Andreas Schwab
  0 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-09 23:29 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Linux Kernel Mailing List, Kernel Testers List

On Monday, 10 of November 2008, Andreas Schwab wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11925
> > Subject		: cdrom: missing compat ioctls
> > Submitter	: Andreas Schwab <schwab@suse.de>
> > Date		: 2008-10-31 14:02 (10 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=33c2dca4957bd0da3e1af7b96d0758d97e708ef6
> > Handled-By	: Andreas Schwab <schwab@suse.de>
> > Patch		: http://marc.info/?l=linux-kernel&m=122548923531545&w=2
> 
> The patch has been picked up by akpm.

OK, but has it been merged into mainline already?

Rafael

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11925] cdrom: missing compat ioctls
  2008-11-09 23:29     ` Rafael J. Wysocki
@ 2008-11-09 23:39       ` Andreas Schwab
  0 siblings, 0 replies; 106+ messages in thread
From: Andreas Schwab @ 2008-11-09 23:39 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Kernel Testers List

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> OK, but has it been merged into mainline already?

No.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11911] new PCMCIA device instance after resume - orinoco can't download firmware
  2008-11-09 17:59 ` [Bug #11911] new PCMCIA device instance after resume - orinoco can't download firmware Rafael J. Wysocki
@ 2008-11-10  3:55   ` Andrey Borzenkov
  0 siblings, 0 replies; 106+ messages in thread
From: Andrey Borzenkov @ 2008-11-10  3:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Dave, linux-pcmcia

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

On Sunday 09 November 2008, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 

Still present in rc4.

> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11911
> Subject		: new PCMCIA device instance after resume - orinoco can't download firmware
> Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
> Date		: 2008-10-28 19:19 (13 days old)
> References	: http://marc.info/?l=linux-wireless&m=122522165719760&w=4
> Handled-By	: Dave <kilroyd@googlemail.com>
> Patch		: http://marc.info/?l=linux-wireless&m=122539058601588&w=4
> 
> 
> 



[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-09 17:59 ` [Bug #11875] radeonfb lockup in .28-rc (bisected) Rafael J. Wysocki
  2008-11-09 21:15   ` Benjamin Herrenschmidt
@ 2008-11-10  5:46   ` Benjamin Herrenschmidt
  2008-11-10  7:13     ` Paul Collins
                       ` (3 more replies)
  1 sibling, 4 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-10  5:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Andrew Morton,
	David S. Miller, James Cloos, Paul Collins, Linus Torvalds

On Sun, 2008-11-09 at 18:59 +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).

All right, so I finally managed to find a machine to reproduce it and
I have a patch that fixes it here. I'm basically implementing the same
thing as X which is to ensure the bitmap is padded to 32 pixels. The
core fbcon has support for that to a certain extent so it's a fairly
small change.
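
As a rough illustration of the padding arithmetic described above (a
sketch only, not taken from the driver; the helper names are
hypothetical), the following C fragment rounds a bitmap width up to a
multiple of 32 pixels and compares the number of 32-bit words to upload
for a 1 bpp image with and without per-scanline padding:

```c
#include <assert.h>
#include <stdint.h>

/* Round a width in pixels up to the next multiple of 32.
 * (Hypothetical helper, for illustration only.) */
static uint32_t pad_width_32(uint32_t width)
{
	assert(width <= UINT32_MAX - 31u);	/* guard against wrap-around */
	return (width + 31u) & ~31u;
}

/* 32-bit words needed for a 1 bpp bitmap when every scanline is padded
 * to a multiple of 32 pixels, i.e. to a whole number of dwords. */
static uint32_t padded_dwords(uint32_t width, uint32_t height)
{
	assert(width <= UINT32_MAX - 31u);
	return ((width + 31u) >> 5) * height;
}

/* 32-bit words needed when scanlines are only byte-aligned and the
 * whole image is packed back to back. */
static uint32_t packed_dwords(uint32_t width, uint32_t depth, uint32_t height)
{
	uint32_t src_bytes = ((width * depth + 7u) / 8u) * height;

	return (src_bytes + 3u) / 4u;
}
```

For a hypothetical 8x16 console glyph, pad_width_32(8) is 32,
padded_dwords(8, 16) is 16 (one dword per scanline) and
packed_dwords(8, 1, 16) is 4, so padding can mean more data pushed to
the chip per glyph.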

Note that there was another bug: I think I was missing one
wait_for_fifo(), though fixing that didn't make a difference here.

However, it's possible that this significantly impacts performance,
maybe to the point where we may want to back out the imageblit
acceleration.

David, would you mind testing on your machine ? It's the one that shows
the biggest performance improvement, and I would like to know how much
it is affected by that patch. As long as the "worst case" performance
is still reasonable, I'm ok to take the hit if the improvement for you
is still significant.

Cheers,
Ben.

radeonfb: Fix accel problems with new imageblit hook

Some radeon chips have issues with color expansion of pixmaps that
aren't a multiple of 32 pixels wide. This works around it the same
way X does by requesting the right pitch alignment from fbcon and
then using the chip scissors to do clipping to the requested size.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

If confirmed by the reporters (in CC), please apply for .28 as it
fixes a regression.

Index: linux-work/drivers/video/aty/radeon_accel.c
===================================================================
--- linux-work.orig/drivers/video/aty/radeon_accel.c	2008-11-10 14:05:06.000000000 +1100
+++ linux-work/drivers/video/aty/radeon_accel.c	2008-11-10 14:34:45.000000000 +1100
@@ -179,7 +179,7 @@ static void radeonfb_prim_imageblit(stru
 
 	radeonfb_set_creg(rinfo, DP_GUI_MASTER_CNTL, &rinfo->dp_gui_mc_cache,
 			  rinfo->dp_gui_mc_base |
-			  GMC_BRUSH_NONE |
+			  GMC_BRUSH_NONE | GMC_DST_CLIP_LEAVE |
 			  GMC_SRC_DATATYPE_MONO_FG_BG |
 			  ROP3_S |
 			  GMC_BYTE_ORDER_MSB_TO_LSB |
@@ -189,9 +189,6 @@ static void radeonfb_prim_imageblit(stru
 	radeonfb_set_creg(rinfo, DP_SRC_FRGD_CLR, &rinfo->dp_src_fg_cache, fg);
 	radeonfb_set_creg(rinfo, DP_SRC_BKGD_CLR, &rinfo->dp_src_bg_cache, bg);
 
-	radeon_fifo_wait(rinfo, 1);
-	OUTREG(DST_Y_X, (image->dy << 16) | image->dx);
-
 	/* Ensure the dst cache is flushed and the engine idle before
 	 * issuing the operation.
 	 *
@@ -205,13 +202,19 @@ static void radeonfb_prim_imageblit(stru
 
 	/* X here pads width to a multiple of 32 and uses the clipper to
 	 * adjust the result. Is that really necessary ? Things seem to
-	 * work ok for me without that and the doco doesn't seem to imply
+	 * work ok for me without that and the doco doesn't seem to imply]
 	 * there is such a restriction.
 	 */
-	OUTREG(DST_WIDTH_HEIGHT, (image->width << 16) | image->height);
+	radeon_fifo_wait(rinfo, 4);
+	OUTREG(SC_TOP_LEFT, (image->dy << 16) | image->dx);
+	OUTREG(SC_BOTTOM_RIGHT, ((image->dy + image->height) << 16) |
+	       (image->dx + image->width));
+	OUTREG(DST_Y_X, (image->dy << 16) | image->dx);
+
+	OUTREG(DST_HEIGHT_WIDTH, (image->height << 16) | ((image->width + 31) & ~31));
 
-	src_bytes = (((image->width * image->depth) + 7) / 8) * image->height;
-	dwords = (src_bytes + 3) / 4;
+	dwords = (image->width + 31) >> 5;
+	dwords *= image->height;
 	bits = (u32*)(image->data);
 
 	while(dwords >= 8) {
Index: linux-work/drivers/video/aty/radeon_base.c
===================================================================
--- linux-work.orig/drivers/video/aty/radeon_base.c	2008-11-10 14:01:50.000000000 +1100
+++ linux-work/drivers/video/aty/radeon_base.c	2008-11-10 14:36:26.000000000 +1100
@@ -1875,6 +1875,7 @@ static int __devinit radeon_set_fbinfo (
 	info->fbops = &radeonfb_ops;
 	info->screen_base = rinfo->fb_base;
 	info->screen_size = rinfo->mapped_vram;
+
 	/* Fill fix common fields */
 	strlcpy(info->fix.id, rinfo->name, sizeof(info->fix.id));
         info->fix.smem_start = rinfo->fb_base_phys;
@@ -1889,8 +1890,25 @@ static int __devinit radeon_set_fbinfo (
         info->fix.mmio_len = RADEON_REGSIZE;
 	info->fix.accel = FB_ACCEL_ATI_RADEON;
 
+	/* Allocate colormap */
 	fb_alloc_cmap(&info->cmap, 256, 0);
 
+	/* Setup pixmap used for acceleration */
+#define PIXMAP_SIZE	(2048 * 4)
+
+	info->pixmap.addr = kmalloc(PIXMAP_SIZE, GFP_KERNEL);
+	if (!info->pixmap.addr) {
+		printk(KERN_ERR "radeonfb: Failed to allocate pixmap !\n");
+		noaccel = 1;
+		goto bail;
+	}
+	info->pixmap.size = PIXMAP_SIZE;
+	info->pixmap.flags = FB_PIXMAP_SYSTEM;
+	info->pixmap.scan_align = 4;
+	info->pixmap.buf_align = 4;
+	info->pixmap.access_align = 32;
+
+bail:
 	if (noaccel)
 		info->flags |= FBINFO_HWACCEL_DISABLED;
 



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10  5:46   ` Benjamin Herrenschmidt
@ 2008-11-10  7:13     ` Paul Collins
  2008-11-10  9:05       ` Benjamin Herrenschmidt
  2008-11-10  9:06     ` David Miller
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 106+ messages in thread
From: Paul Collins @ 2008-11-10  7:13 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Linus Torvalds

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Sun, 2008-11-09 at 18:59 +0100, Rafael J. Wysocki wrote:
>> This message has been generated automatically as a part of a report
>> of recent regressions.
>> 
>> The following bug entry is on the current list of known regressions
>> from 2.6.27.  Please verify if it still should be listed and let me know
>> (either way).
>
> All right, so I finally managed to find a machine to reproduce it and
> I have a patch that fixes it here. I'm basically implementing the same
> thing as X which is to ensure the bitmap is padded to 32 pixels.

Works great here (as you might expect).

-- 
Paul Collins
Wellington, New Zealand

Dag vijandelijk luchtschip de huismeester is dood

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10  7:13     ` Paul Collins
@ 2008-11-10  9:05       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-10  9:05 UTC (permalink / raw)
  To: Paul Collins
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Linus Torvalds

On Mon, 2008-11-10 at 20:13 +1300, Paul Collins wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> 
> > On Sun, 2008-11-09 at 18:59 +0100, Rafael J. Wysocki wrote:
> >> This message has been generated automatically as a part of a report
> >> of recent regressions.
> >> 
> >> The following bug entry is on the current list of known regressions
> >> from 2.6.27.  Please verify if it still should be listed and let me know
> >> (either way).
> >
> > All right, so I finally managed to find a machine to reproduce it and
> > I have a patch that fixes it here. I'm basically implementing the same
> > thing as X which is to ensure the bitmap is padded to 32 pixels.
> 
> Works great here (as you might expect).

Yeah, well, Albook G4 with rv350, I think we have the same machine :-)

Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10  5:46   ` Benjamin Herrenschmidt
  2008-11-10  7:13     ` Paul Collins
@ 2008-11-10  9:06     ` David Miller
  2008-11-10 20:39     ` Andreas Schwab
  2008-11-13 23:11     ` David Miller
  3 siblings, 0 replies; 106+ messages in thread
From: David Miller @ 2008-11-10  9:06 UTC (permalink / raw)
  To: benh; +Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Mon, 10 Nov 2008 16:46:25 +1100

> David, would you mind testing on your machine ? It's the one that shows
> the biggest performance improvement, and I would like to know how much
> it is affected by that patch. As long as the "worst case" performance
> is still reasonable, I'm ok to take the hit if the improvement for you
> is still significant.

I will test this out at the very next opportunity.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-09 17:59 ` [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine Rafael J. Wysocki
@ 2008-11-10 12:04   ` Heiko Carstens
  2008-11-10 14:47     ` Rafael J. Wysocki
  0 siblings, 1 reply; 106+ messages in thread
From: Heiko Carstens @ 2008-11-10 12:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Rusty Russell

On Sun, Nov 09, 2008 at 06:59:16PM +0100, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11989
> Subject		: Suspend failure on NForce4-based boards due to changes in stop_machine
> Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
> Date		: 2008-11-03 0:28 (7 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
> References	: http://marc.info/?l=linux-kernel&m=122567187604356&w=4

Hi Rafael,

could you provide more information on this, please?

What is your kernel configuration?
Do you have any binary only modules (nvidia?) loaded?

Is it possible to recreate the bug by e.g. just doing something like

echo 0 > /sys/devices/system/cpu/cpu1/online

(or any other online cpu)? Or does it trigger any lockdep warnings?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-10 12:04   ` Heiko Carstens
@ 2008-11-10 14:47     ` Rafael J. Wysocki
  2008-11-10 22:55       ` Rafael J. Wysocki
  0 siblings, 1 reply; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-10 14:47 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Linux Kernel Mailing List, Kernel Testers List, Rusty Russell

On Monday, 10 of November 2008, Heiko Carstens wrote:
> On Sun, Nov 09, 2008 at 06:59:16PM +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.27.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11989
> > Subject		: Suspend failure on NForce4-based boards due to changes in stop_machine
> > Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
> > Date		: 2008-11-03 0:28 (7 days old)
> > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
> > References	: http://marc.info/?l=linux-kernel&m=122567187604356&w=4
> 
> Hi Rafael,

Hi,

> could you provide more information on this, please?
> 
> What is your kernel configuration?

Available at: http://www.sisk.pl/kernel/debug/mainline/2.6.28-rc3/kitty-config

> Do you have any binary only modules (nvidia?) loaded?

No, I don't.

> Is it possible to recreate the bug by e.g. just doing something like
> 
> echo 0 > /sys/devices/system/cpu/cpu1/online

I haven't checked (yet), I'll do that later today and let you know.

> (or any other online cpu)? Or does it trigger any lockdep warnings?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000
  2008-11-09 17:59 ` [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000 Rafael J. Wysocki
@ 2008-11-10 16:53   ` Andrey Borzenkov
  2008-11-10 18:06     ` Rafael J. Wysocki
  0 siblings, 1 reply; 106+ messages in thread
From: Andrey Borzenkov @ 2008-11-10 16:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Avi Kivity,
	Ingo Molnar, Len Brown

[-- Attachment #1: Type: text/plain, Size: 532 bytes --]

On Sunday 09 November 2008, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 

it is fixed in rc4

> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11895

Could you reassign this to the ACPI product so this bug can be
investigated further, or should I open a separate one?

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000
  2008-11-10 16:53   ` Andrey Borzenkov
@ 2008-11-10 18:06     ` Rafael J. Wysocki
  0 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-10 18:06 UTC (permalink / raw)
  To: Andrey Borzenkov
  Cc: Linux Kernel Mailing List, Kernel Testers List, Avi Kivity,
	Ingo Molnar, Len Brown

On Monday, 10 of November 2008, Andrey Borzenkov wrote:
> On Sunday 09 November 2008, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.27.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> 
> it is fixed in rc4
> 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11895
> 
> Could you reassign this to the ACPI product so this bug can be
> investigated further, or should I open a separate one?

Since you're saying it's fixed in -rc4, I'll close it and please open a
separate one for the issue that's not been fixed yet.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10  5:46   ` Benjamin Herrenschmidt
  2008-11-10  7:13     ` Paul Collins
  2008-11-10  9:06     ` David Miller
@ 2008-11-10 20:39     ` Andreas Schwab
  2008-11-10 21:52       ` Benjamin Herrenschmidt
  2008-11-13 23:11     ` David Miller
  3 siblings, 1 reply; 106+ messages in thread
From: Andreas Schwab @ 2008-11-10 20:39 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> radeonfb: Fix accel problems with new imageblit hook
>
> Some radeon chips have issues with color expansion of pixmaps that
> aren't a multiple of 32 pixels wide. This works around it the same
> way X does by requesting the right pitch alignment from fbcon and
> then using the chip scissors to do clipping to the requested size.

Unfortunately this does not fix the suspend regression on PowerBook6,7.
Instead I have to use the workaround in
<http://marc.info/?l=linux-kernel&m=122515268301239&w=2>.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10 20:39     ` Andreas Schwab
@ 2008-11-10 21:52       ` Benjamin Herrenschmidt
  2008-11-10 23:20         ` Andreas Schwab
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-10 21:52 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

On Mon, 2008-11-10 at 21:39 +0100, Andreas Schwab wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> 
> > radeonfb: Fix accel problems with new imageblit hook
> >
> > Some radeon chips have issues with color expansion of pixmaps that
> > aren't a multiple of 32 pixels wide. This works around it the same
> > way X does by requesting the right pitch alignment from fbcon and
> > then using the chip scissors to do clipping to the requested size.
> 
> Unfortunately this does not fix the suspend regression on PowerBook6,7.
> Instead I have to use the workaround in
> <http://marc.info/?l=linux-kernel&m=122515268301239&w=2>.

Strange. Does the suspend problem also happen when X hasn't been launched at all?

Ben.




* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-10 14:47     ` Rafael J. Wysocki
@ 2008-11-10 22:55       ` Rafael J. Wysocki
  2008-11-11 10:52         ` Ingo Molnar
  2008-11-11 21:28         ` Dmitry Adamushko
  0 siblings, 2 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-10 22:55 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Linux Kernel Mailing List, Kernel Testers List, Rusty Russell

On Monday, 10 of November 2008, Rafael J. Wysocki wrote:
> On Monday, 10 of November 2008, Heiko Carstens wrote:
> > On Sun, Nov 09, 2008 at 06:59:16PM +0100, Rafael J. Wysocki wrote:
> > > This message has been generated automatically as a part of a report
> > > of recent regressions.
> > > 
> > > The following bug entry is on the current list of known regressions
> > > from 2.6.27.  Please verify if it still should be listed and let me know
> > > (either way).
> > > 
> > > 
> > > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11989
> > > Subject		: Suspend failure on NForce4-based boards due to chanes in stop_machine
> > > Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
> > > Date		: 2008-11-03 0:28 (7 days old)
> > > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
> > > References	: http://marc.info/?l=linux-kernel&m=122567187604356&w=4
> > 
> > Hi Rafael,
> 
> Hi,
> 
> > could you provide more information for this, please?
> > 
> > What is your kernel configuration?
> 
> Available at: http://www.sisk.pl/kernel/debug/mainline/2.6.28-rc3/kitty-config
> 
> > Do you have any binary only modules (nvidia?) loaded?
> 
> No, I don't.
> 
> > Is it possible to recreate the bug by e.g. just doing something like
> > 
> > echo 0 > /sys/devices/system/cpu/cpu1/online
> 
> I haven't checked (yet), I'll do that later today and let you know.
> 
> > (or any other online cpu)? Or does it trigger any lockdep warnings?

It cannot be reproduced with offlining CPU1 and it doesn't trigger any
warnings from lockdep.

However, it is reproducible by doing

# echo core > /sys/power/pm_test

and repeating

# echo disk > /sys/power/state

for a couple of times, in which case the last two lines printed to the console
before a (solid) hang are:

SMP alternatives: switching to SMP code
Booting processor 1 APIC 0x1 ip 0x6000

So, it evidently fails while re-enabling the non-boot CPU and not during
disabling it as I thought before.

With commit c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc reverted the issue is
not reproducible any more.
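
For reference, the steps above can be sketched as a small shell function.
This is only an illustration under stated assumptions: pm_core_cycle and
its two arguments are made-up names, and the sysfs directory is a parameter
purely so the loop can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch only: select the "core" pm_test level, then repeat the
# hibernation test cycle a given number of times.
# pm_core_cycle, root and count are hypothetical names.
pm_core_cycle() {
    root=${1:-/sys/power}    # directory holding the pm_test and state files
    count=${2:-5}            # number of test cycles to attempt

    # Pick the "core" test level once, before the first cycle.
    echo core > "$root/pm_test" || return 1

    i=1
    while [ "$i" -le "$count" ]; do
        echo "cycle $i of $count"
        # Each write starts one test cycle and returns when it is over.
        echo disk > "$root/state" || return 1
        i=$((i + 1))
    done
}
```

Something like "pm_core_cycle /sys/power 10" (as root) would attempt ten
cycles; on an affected box it may hang before the loop finishes.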

Thanks,
Rafael


* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10 21:52       ` Benjamin Herrenschmidt
@ 2008-11-10 23:20         ` Andreas Schwab
  2008-11-10 23:34           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: Andreas Schwab @ 2008-11-10 23:20 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Mon, 2008-11-10 at 21:39 +0100, Andreas Schwab wrote:
>> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
>> 
>> > radeonfb: Fix accel problems with new imageblit hook
>> >
>> > Some radeon chips have issues with color expansion of pixmaps that
>> > aren't a multiple of 32 pixels wide. This works around it the same
>> > way X does by requesting the right pitch alignment from fbcon and
>> > then using the chip scissors to do clipping to the requested size.
>> 
>> Unfortunately this does not fix the suspend regression on PowerBook6,7.
>> Instead I have to use the workaround in
>> <http://marc.info/?l=linux-kernel&m=122515268301239&w=2>.
>
> Strange. Does the suspend problem also happen when X hasn't been launched at all?

There seems to be some race involved here.  I cannot reproduce the
problem ATM.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10 23:20         ` Andreas Schwab
@ 2008-11-10 23:34           ` Benjamin Herrenschmidt
  2008-11-10 23:54             ` Andreas Schwab
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-10 23:34 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds


> There seems to be some race involved here.  I cannot reproduce the
> problem ATM.

I wonder if it's related to the new acceleration at all then.

I've tried various suspend/resume cycles in straight console mode using
directly snooze -f (kernel ioctl) and from X using ubuntu intrepid and
gnome power manager and it worked fine on a 5,6 which should be fairly
similar to your 6,7 I think.

It's possible that there's yet another X related race though. I've seen
cases of X whacking the chip -after- it has relinquished the console back
to the kernel (back to KD_TEXT) in the past which is very wrong, though
I didn't spot that during my testing, there could be some race lurking
there.

Can you describe your problem more precisely ? I didn't see (or forgot)
your initial report. Did it crash on suspend or wakeup ? what symptoms ?

Note also that on PowerBooks, there's a platform hook that allows
radeonfb to wake up the video chip _very_ early, thus allowing easier
debugging of the boot process, so even races like that on wakeup would
surprise me since we do wake up the chip before we even get a chance to
schedule userspace again (in fact before we even bring back the L2
cache !)  

Cheers,
Ben.




* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10 23:34           ` Benjamin Herrenschmidt
@ 2008-11-10 23:54             ` Andreas Schwab
  2008-11-11  1:49               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: Andreas Schwab @ 2008-11-10 23:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> Can you describe your problem more precisely ?

It crashes during suspend (after the console was switched away from X),
but I can only see a frame buffer with apparently random contents when
it happens.  When suspend works then those random frame buffer contents
are only briefly visible before the screen is cleared.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10 23:54             ` Andreas Schwab
@ 2008-11-11  1:49               ` Benjamin Herrenschmidt
  2008-11-11  2:47                 ` Linus Torvalds
  2008-11-11  9:31                 ` Andreas Schwab
  0 siblings, 2 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-11  1:49 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

On Tue, 2008-11-11 at 00:54 +0100, Andreas Schwab wrote:
> 
> > Can you describe your problem more precisely ?
> 
> It crashes during suspend (after the console was switched away from X),
> but I can only see a frame buffer with apparently random contents when
> it happens.  When suspend works then those random frame buffer contents
> are only briefly visible before the screen is cleared.

Does it actually switch away from X?

IE, do you see the console before the crap on the console or not?

I've seen what you describe happening when doing snooze -f (direct
kernel ioctl) straight from within X. It seems to me that the problem
was that for some reason it didn't switch the console, which would
definitely make it crash. I need to double check what's up, it's
possible that the kernel fails to switch it properly or fails to wait
for X to ack the switch.

In any case, it doesn't seem to be directly related to those radeonfb
changes, though a clash with X like that is indeed more likely to
actually happen if radeonfb relies more heavily on acceleration.

I'll have a look later today at the console switch from X in the kernel
to see if it's been broken in one way or another.

Note: I just did some tests using both echo "mem" >/sys/power/state and
snooze -f and it worked fine. IE, the console switch away from X worked.
So while I think I observed your problem once, I also cannot reproduce
it now.

I wonder if there's a race condition in the VT switch. It's possible
that it could be yet another case of X whacking the chip after it has
effectively relinquished control of the VT to the kernel, or it could be
a kernel race.

Cheers,
Ben.




* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-11  1:49               ` Benjamin Herrenschmidt
@ 2008-11-11  2:47                 ` Linus Torvalds
  2008-11-11  3:21                   ` Benjamin Herrenschmidt
  2008-11-11  9:31                 ` Andreas Schwab
  1 sibling, 1 reply; 106+ messages in thread
From: Linus Torvalds @ 2008-11-11  2:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Andreas Schwab, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins



On Tue, 11 Nov 2008, Benjamin Herrenschmidt wrote:
> 
> In any case, it doesn't seem to be directly related to those radeonfb
> changes, though a clash with X like that is indeed more likely to
> actually happen if radeonfb relies more heavily on acceleration.

Just a silly question, without actually looking at the code - since you 
now do acceleration in radeonfb, do you wait for everything to drain 
before you switch consoles? 

There could be races that depend on timing, where perhaps X is unhappy 
about being entered with the acceleration engine busy, or conversely the 
radeonfb code is unhappy about perhaps some still-in-progress X thing that 
hasn't been synchronously waited for..

Before, radeonfb_imageblit() would always end up doing a 
"radeon_engine_idle()", so in practice, I think just about any fbcon 
access ended up idling the engine. Now, we can probably do a lot more 
without synchronizing - maybe there's insufficient synchronization at the 
switch-over from X to text-mode or vice versa?

		Linus


* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-11  2:47                 ` Linus Torvalds
@ 2008-11-11  3:21                   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-11  3:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andreas Schwab, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins

On Mon, 2008-11-10 at 18:47 -0800, Linus Torvalds wrote:
> 
> On Tue, 11 Nov 2008, Benjamin Herrenschmidt wrote:
> > 
> > In any case, it doesn't seem to be directly related to those radeonfb
> > changes, though a clash with X like that is indeed more likely to
> > actually happen if radeonfb relies more heavily on acceleration.
> 
> Just a silly question, without actually looking at the code - since you 
> now do acceleration in radeonfb, do you wait for everything to drain 
> before you switch consoles? 

radeonfb has been doing acceleration for some time :-) Just not color
expansion, only blits and solid fills (so basically scrolling). That is
a lot less common though and thus it's possible that existing races
didn't show up until now.

It does drain the engine in various cases, typically mode change,
blanking, sync callback. fbcon core should at least sync if not blank
when switching to KD_GRAPHICS (or at least used to, I need to double
check). I have additional guards also that disable use of the engine
when sleeping.

> There could be races that depend on timing, where perhaps X is unhappy 
> about being entered with the acceleration engine busy, or conversely the 
> radeonfb code is unhappy about perhaps some still-in-progress X thing that 
> hasn't been synchronously waited for..

Yes. From what's been reported, the more likely thing would be a race
when switching away from X.

> Before, radeonfb_imageblit() would always end up doing a 
> "radeon_engine_idle()", so in practice, I think just about any fbcon 
> access ended up idling the engine. Now, we can probably do a lot more 
> without synchronizing - maybe there's insufficient synchronization at the 
> switch-over from X to text-mode or vice versa?

Switch over from X should restore KD_TEXT which should turn into a call to
set_par() that idles the engine before anything gets written to the
screen, but those code paths are intertwined between the VT code and fbcon
and things may well be subtly broken. I'll dig later today after I'm
done with some other emergency.

At one point, I fixed a crapload of VT bugs where things were done
without any locking, nowadays, everything should pretty much be covered
by the console semaphore, but maybe there's still a problem there.
Another area to look at is X itself. I've had problems with X (or the
DRM) still whacking the card after handing back the console to the
kernel in the past, so it wouldn't surprise me if there was something
bogus there too.

I also had problems with fbcon trying to draw before it re-initialized
the card (ie, it -should- call set_par before any new draw operation
when switching back from KD_GRAPHICS, if not, we don't properly get to
reconfigure the engine before we try to use it, which can be fatal), but
those were fixed last time I looked.

Anyway, I'll dig and let you know what I find.

Cheers,
Ben.



* Re: [Bug #11947] 2.6.28-rc VC switching with Intel graphics broken
  2008-11-09 17:59 ` [Bug #11947] 2.6.28-rc VC switching with Intel graphics broken Rafael J. Wysocki
@ 2008-11-11  9:28   ` Romano Giannetti
  0 siblings, 0 replies; 106+ messages in thread
From: Romano Giannetti @ 2008-11-11  9:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Jesse Barnes

Rafael J. Wysocki wrote:
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11947
> Subject		: 2.6.28-rc VC switching with Intel graphics broken
> Submitter	: Romano Giannetti <romano.giannetti@gmail.com>
> Date		: 2008-11-03 12:10 (7 days old)
> Handled-By	: Jesse Barnes <jbarnes@virtuousgeek.org>

Still here in 2.6.28-rc4. Complete lock switching back from a VC to X.

Romano




* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-11  1:49               ` Benjamin Herrenschmidt
  2008-11-11  2:47                 ` Linus Torvalds
@ 2008-11-11  9:31                 ` Andreas Schwab
  2008-11-11 11:30                   ` Benjamin Herrenschmidt
                                     ` (2 more replies)
  1 sibling, 3 replies; 106+ messages in thread
From: Andreas Schwab @ 2008-11-11  9:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

It looks like you are observing the same failure mode that I do.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-10 22:55       ` Rafael J. Wysocki
@ 2008-11-11 10:52         ` Ingo Molnar
  2008-11-11 11:31           ` Heiko Carstens
                             ` (3 more replies)
  2008-11-11 21:28         ` Dmitry Adamushko
  1 sibling, 4 replies; 106+ messages in thread
From: Ingo Molnar @ 2008-11-11 10:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Heiko Carstens, Linux Kernel Mailing List, Kernel Testers List,
	Rusty Russell, Vegard Nossum, Peter Zijlstra, Oleg Nesterov,
	Dmitry Adamushko, Andrew Morton


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> However, it is reproducible by doing
> 
> # echo core > /sys/power/pm_test
> 
> and repeating
> 
> # echo disk > /sys/power/state
> 
> for a couple of times, in which case the last two lines printed to the console
> before a (solid) hang are:
> 
> SMP alternatives: switching to SMP code
> Booting processor 1 APIC 0x1 ip 0x6000
> 
> So, it evidently fails while re-enabling the non-boot CPU and not 
> during disabling it as I thought before.
> 
> With commit c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc reverted the 
> issue is not reproducible any more.

[ Cc:-ed workqueue/locking/suspend-race-condition experts. ]

Seems like the new kernel/stop_machine.c logic has a race for the test 
sequence above. (Below is the bisected commit again, maybe the race is 
visible via email review as well.)

	Ingo

-------------->
From c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc Mon Sep 17 00:00:00 2001
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Mon, 13 Oct 2008 23:50:10 +0200
Subject: [PATCH] stop_machine: use workqueues instead of kernel threads

Convert stop_machine to a workqueue based approach. Instead of using kernel
threads for stop_machine we now use an rt workqueue to synchronize all
cpus.
This has the advantage that all needed per cpu threads are already created
when stop_machine gets called. And therefore a call to stop_machine won't
fail anymore. This is needed for s390 which needs a mechanism to synchronize
all cpus without allocating any memory.
As Rusty pointed out free_module() needs a non-failing stop_machine interface
as well.

As a side effect the stop_machine code gets simplified.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 kernel/stop_machine.c |  111 ++++++++++++++++++-------------------------------
 1 files changed, 41 insertions(+), 70 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index af3c7ce..0e688c6 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -37,9 +37,13 @@ struct stop_machine_data {
 /* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */
 static unsigned int num_threads;
 static atomic_t thread_ack;
-static struct completion finished;
 static DEFINE_MUTEX(lock);
 
+static struct workqueue_struct *stop_machine_wq;
+static struct stop_machine_data active, idle;
+static const cpumask_t *active_cpus;
+static void *stop_machine_work;
+
 static void set_state(enum stopmachine_state newstate)
 {
 	/* Reset ack counter. */
@@ -51,21 +55,25 @@ static void set_state(enum stopmachine_state newstate)
 /* Last one to ack a state moves to the next state. */
 static void ack_state(void)
 {
-	if (atomic_dec_and_test(&thread_ack)) {
-		/* If we're the last one to ack the EXIT, we're finished. */
-		if (state == STOPMACHINE_EXIT)
-			complete(&finished);
-		else
-			set_state(state + 1);
-	}
+	if (atomic_dec_and_test(&thread_ack))
+		set_state(state + 1);
 }
 
-/* This is the actual thread which stops the CPU.  It exits by itself rather
- * than waiting for kthread_stop(), because it's easier for hotplug CPU. */
-static int stop_cpu(struct stop_machine_data *smdata)
+/* This is the actual function which stops the CPU. It runs
+ * in the context of a dedicated stopmachine workqueue. */
+static void stop_cpu(struct work_struct *unused)
 {
 	enum stopmachine_state curstate = STOPMACHINE_NONE;
-
+	struct stop_machine_data *smdata = &idle;
+	int cpu = smp_processor_id();
+
+	if (!active_cpus) {
+		if (cpu == first_cpu(cpu_online_map))
+			smdata = &active;
+	} else {
+		if (cpu_isset(cpu, *active_cpus))
+			smdata = &active;
+	}
 	/* Simple state machine */
 	do {
 		/* Chill out and ensure we re-read stopmachine_state. */
@@ -90,7 +98,6 @@ static int stop_cpu(struct stop_machine_data *smdata)
 	} while (curstate != STOPMACHINE_EXIT);
 
 	local_irq_enable();
-	do_exit(0);
 }
 
 /* Callback for CPUs which aren't supposed to do anything. */
@@ -101,78 +108,34 @@ static int chill(void *unused)
 
 int __stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
 {
-	int i, err;
-	struct stop_machine_data active, idle;
-	struct task_struct **threads;
+	struct work_struct *sm_work;
+	int i;
 
+	/* Set up initial state. */
+	mutex_lock(&lock);
+	num_threads = num_online_cpus();
+	active_cpus = cpus;
 	active.fn = fn;
 	active.data = data;
 	active.fnret = 0;
 	idle.fn = chill;
 	idle.data = NULL;
 
-	/* This could be too big for stack on large machines. */
-	threads = kcalloc(NR_CPUS, sizeof(threads[0]), GFP_KERNEL);
-	if (!threads)
-		return -ENOMEM;
-
-	/* Set up initial state. */
-	mutex_lock(&lock);
-	init_completion(&finished);
-	num_threads = num_online_cpus();
 	set_state(STOPMACHINE_PREPARE);
 
-	for_each_online_cpu(i) {
-		struct stop_machine_data *smdata = &idle;
-		struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
-
-		if (!cpus) {
-			if (i == first_cpu(cpu_online_map))
-				smdata = &active;
-		} else {
-			if (cpu_isset(i, *cpus))
-				smdata = &active;
-		}
-
-		threads[i] = kthread_create((void *)stop_cpu, smdata, "kstop%u",
-					    i);
-		if (IS_ERR(threads[i])) {
-			err = PTR_ERR(threads[i]);
-			threads[i] = NULL;
-			goto kill_threads;
-		}
-
-		/* Place it onto correct cpu. */
-		kthread_bind(threads[i], i);
-
-		/* Make it highest prio. */
-		if (sched_setscheduler_nocheck(threads[i], SCHED_FIFO, &param))
-			BUG();
-	}
-
-	/* We've created all the threads.  Wake them all: hold this CPU so one
+	/* Schedule the stop_cpu work on all cpus: hold this CPU so one
 	 * doesn't hit this CPU until we're ready. */
 	get_cpu();
-	for_each_online_cpu(i)
-		wake_up_process(threads[i]);
-
+	for_each_online_cpu(i) {
+		sm_work = percpu_ptr(stop_machine_work, i);
+		INIT_WORK(sm_work, stop_cpu);
+		queue_work_on(i, stop_machine_wq, sm_work);
+	}
 	/* This will release the thread on our CPU. */
 	put_cpu();
-	wait_for_completion(&finished);
+	flush_workqueue(stop_machine_wq);
 	mutex_unlock(&lock);
-
-	kfree(threads);
-
 	return active.fnret;
-
-kill_threads:
-	for_each_online_cpu(i)
-		if (threads[i])
-			kthread_stop(threads[i]);
-	mutex_unlock(&lock);
-
-	kfree(threads);
-	return err;
 }
 
 int stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
@@ -187,3 +150,11 @@ int stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
 	return ret;
 }
 EXPORT_SYMBOL_GPL(stop_machine);
+
+static int __init stop_machine_init(void)
+{
+	stop_machine_wq = create_rt_workqueue("kstop");
+	stop_machine_work = alloc_percpu(struct work_struct);
+	return 0;
+}
+early_initcall(stop_machine_init);


* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-11  9:31                 ` Andreas Schwab
@ 2008-11-11 11:30                   ` Benjamin Herrenschmidt
  2008-11-21  2:55                   ` Benjamin Herrenschmidt
  2008-11-21  3:02                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-11 11:30 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

On Tue, 2008-11-11 at 10:31 +0100, Andreas Schwab wrote:
> It looks like you are observing the same failure mode that I do.

Yup, once, haven't reproduced it ever since though :-(

Ben.




* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 10:52         ` Ingo Molnar
@ 2008-11-11 11:31           ` Heiko Carstens
  2008-11-11 12:42             ` Heiko Carstens
  2008-11-11 13:36           ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Vegard Nossum
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 106+ messages in thread
From: Heiko Carstens @ 2008-11-11 11:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton

On Tue, Nov 11, 2008 at 11:52:14AM +0100, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > However, it is reproducible by doing
> > 
> > # echo core > /sys/power/pm_test
> > 
> > and repeating
> > 
> > # echo disk > /sys/power/state
> > 
> > for a couple of times, in which case the last two lines printed to the console
> > before a (solid) hang are:
> > 
> > SMP alternatives: switching to SMP code
> > Booting processor 1 APIC 0x1 ip 0x6000
> > 
> > So, it evidently fails while re-enabling the non-boot CPU and not 
> > during disabling it as I thought before.
> > 
> > With commit c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc reverted the 
> > issue is not reproducible any more.
> 
> [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
> 
> Seems like the new kernel/stop_machine.c logic has a race for the test 
> sequence above. (Below is the bisected commit again, maybe the race is 
> visible via email review as well.)

FWIW, I tried to reproduce this on s390 and got the following:

A process that would do nothing but onlining/offlining cpus would get
stuck after a while:

 0 schedule+842 [0x342522]
 1 schedule_timeout+200 [0x342ec4]
 2 wait_for_common+362 [0x341fd6]
 3 wait_for_completion+54 [0x342146]
 4 __synchronize_sched+80 [0x81670]
 5 cpu_down+172 [0x33c030]
 6 store_online+96 [0x33c488]
 7 sysdev_store+52 [0x1bda84]
 8 sysfs_write_file+242 [0x1350ba]
 9 vfs_write+176 [0xd2028]
10 sys_write+82 [0xd21ea]
11 sysc_noemu+16 [0x269d8]

All cpus are in cpu_idle and no other task in state TASK_INTERRUPTIBLE
or TASK_UNINTERRUPTIBLE. However it would continue to work as soon as
I log in to the system or generate a console interrupt.
I'm going to look into the dump and see if I can figure out what is
broken here.
Dunno if it is the same bug or something else.
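
A loop of roughly this shape would exercise the same online/offline path.
It is a sketch rather than the exact test that was run: cpu_toggle_loop and
its arguments are hypothetical names, and the default path is the cpu1
"online" file mentioned earlier in the thread.

```shell
#!/bin/sh
# Sketch only: repeatedly take one CPU offline and bring it back online
# by writing 0 and 1 to its sysfs "online" file.
# cpu_toggle_loop, file and count are hypothetical names.
cpu_toggle_loop() {
    file=${1:-/sys/devices/system/cpu/cpu1/online}
    count=${2:-100}          # number of offline/online cycles

    i=0
    while [ "$i" -lt "$count" ]; do
        echo 0 > "$file" || return 1    # offline the CPU
        echo 1 > "$file" || return 1    # online it again
        i=$((i + 1))
    done
    echo "completed $count offline/online cycles"
}
```

If the loop gets stuck, the final "completed" line is never printed, which
makes a hang easy to tell apart from a clean run.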


* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 11:31           ` Heiko Carstens
@ 2008-11-11 12:42             ` Heiko Carstens
  2008-11-11 13:13               ` Ingo Molnar
  2008-11-11 14:35               ` Paul E. McKenney
  0 siblings, 2 replies; 106+ messages in thread
From: Heiko Carstens @ 2008-11-11 12:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Paul E. McKenney, Steven Rostedt

On Tue, Nov 11, 2008 at 12:31:34PM +0100, Heiko Carstens wrote:
> On Tue, Nov 11, 2008 at 11:52:14AM +0100, Ingo Molnar wrote:
> > 
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > 
> > > However, it is reproducible by doing
> > > 
> > > # echo core > /sys/power/pm_test
> > > 
> > > and repeating
> > > 
> > > # echo disk > /sys/power/state
> > > 
> > > for a couple of times, in which case the last two lines printed to the console
> > > before a (solid) hang are:
> > > 
> > > SMP alternatives: switching to SMP code
> > > Booting processor 1 APIC 0x1 ip 0x6000
> > > 
> > > So, it evidently fails while re-enabling the non-boot CPU and not 
> > > during disabling it as I thought before.
> > > 
> > > With commit c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc reverted the 
> > > issue is not reproducible any more.
> > 
> > [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
> > 
> > Seems like the new kernel/stop_machine.c logic has a race for the test 
> > sequence above. (Below is the bisected commit again, maybe the race is 
> > visible via email review as well.)
> 
> FWIW, I tried to reproduce this on s390 and got the following:
> 
> A process that would do nothing but onlining/offlining cpus would get
> stuck after a while:
> 
>  0 schedule+842 [0x342522]
>  1 schedule_timeout+200 [0x342ec4]
>  2 wait_for_common+362 [0x341fd6]
>  3 wait_for_completion+54 [0x342146]
>  4 __synchronize_sched+80 [0x81670]
>  5 cpu_down+172 [0x33c030]
>  6 store_online+96 [0x33c488]
>  7 sysdev_store+52 [0x1bda84]
>  8 sysfs_write_file+242 [0x1350ba]
>  9 vfs_write+176 [0xd2028]
> 10 sys_write+82 [0xd21ea]
> 11 sysc_noemu+16 [0x269d8]
> 
> All cpus are in cpu_idle and no other task in state TASK_INTERRUPTIBLE
> or TASK_UNINTERRUPTIBLE. However it would continue to work as soon as
> I log in to the system or generate a console interrupt.
> I'm going to look into the dump and see if I can figure out what is
> broken here.
> Dunno if it is the same bug or something else.

[Cc:-ed Steven and Paul, since this backtrace seems to be RCU specific]

Steven, Paul, any idea what could cause the hang? I think I would
get lost in the RCU code...
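
For anyone who wants to try the same kind of stress test, a process that does nothing but offline and online a CPU could look roughly like the sketch below. This is not the program used for the s390 run. The helper names (cpu_online_path, set_cpu_online, hotplug_stress) are made up for illustration, and the sysfs root is passed in by the caller, typically /sys/devices/system/cpu.

```c
#include <assert.h>
#include <stdio.h>

/* Build "<root>/cpu<N>/online"; returns 0 on success, -1 if buf is too small. */
static int cpu_online_path(char *buf, size_t len, const char *root, int cpu)
{
	int n;

	assert(buf && root && cpu >= 0);
	n = snprintf(buf, len, "%s/cpu%d/online", root, cpu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "0" or "1" to the CPU's online file; returns 0 on success, -1 on error. */
static int set_cpu_online(const char *root, int cpu, int online)
{
	char path[256];
	FILE *f;
	int ret;

	if (cpu_online_path(path, sizeof(path), root, cpu))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ret = fputs(online ? "1" : "0", f) < 0 ? -1 : 0;
	if (fclose(f))		/* the write may only fail when flushed */
		ret = -1;
	return ret;
}

/* Offline and re-online one CPU `rounds` times; stops at the first error. */
static int hotplug_stress(const char *root, int cpu, int rounds)
{
	for (int i = 0; i < rounds; i++)
		if (set_cpu_online(root, cpu, 0) || set_cpu_online(root, cpu, 1))
			return -1;
	return 0;
}
```

Run as root, something like hotplug_stress("/sys/devices/system/cpu", 1, 1000) from a small main() would exercise the cpu_down() path seen in the backtrace above. A hang would show up as the write never returning.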

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 12:42             ` Heiko Carstens
@ 2008-11-11 13:13               ` Ingo Molnar
  2008-11-11 14:35               ` Paul E. McKenney
  1 sibling, 0 replies; 106+ messages in thread
From: Ingo Molnar @ 2008-11-11 13:13 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Paul E. McKenney, Steven Rostedt, Thomas Gleixner


* Heiko Carstens <heiko.carstens@de.ibm.com> wrote:

> [Cc:-ed Steven and Paul, since this backtrace seems to be RCU specific]
> 
> Steven, Paul, any idea what could cause the hang? I think I would
> get lost in the RCU code...

Cc:-ed Thomas - sometimes "RCU hangs" happen due to nohz confusion:
no timer IRQ happens, so there is nothing to drive the RCU machinery.

	Ingo

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 10:52         ` Ingo Molnar
  2008-11-11 11:31           ` Heiko Carstens
@ 2008-11-11 13:36           ` Vegard Nossum
  2008-11-11 13:46             ` Vegard Nossum
  2008-11-11 13:49             ` Peter Zijlstra
  2008-11-11 14:47           ` Vegard Nossum
  2008-11-12  3:39           ` Rusty Russell
  3 siblings, 2 replies; 106+ messages in thread
From: Vegard Nossum @ 2008-11-11 13:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rafael J. Wysocki, Heiko Carstens, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Peter Zijlstra,
	Oleg Nesterov, Dmitry Adamushko, Andrew Morton

On Tue, Nov 11, 2008 at 11:52 AM, Ingo Molnar <mingo@elte.hu> wrote:
> [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]

Heh. I am no expert, but I looked at the code. The obviously suspicious
thing is the use of unpaired barriers. Maybe like this:

static void set_state(enum stopmachine_state newstate)
{
        /* Reset ack counter. */
        atomic_set(&thread_ack, num_threads);
        smp_wmb();

+       /* force ordering between thread_ack/state */

        state = newstate;
}

/* Last one to ack a state moves to the next state. */
static void ack_state(void)
{
        if (atomic_dec_and_test(&thread_ack))

Maybe
+       /* force ordering between thread_ack/state */
+       smp_rmb();
here?

                set_state(state + 1);
}

Or maybe I am wrong. But Documentation/memory-barriers.txt is rather
explicit on this point.
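
To illustrate what "paired" means here, below is a minimal userspace sketch, not kernel code. C11 fences stand in for smp_wmb() and smp_rmb(), and the names payload, ready, publish() and try_consume() are invented for the example. The writer orders "data, then flag"; the reader must order "flag, then data" for the writer's barrier to mean anything.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared variables; neither name comes from stop_machine.c. */
static atomic_int payload;
static atomic_int ready;

/*
 * Writer: store the data, then the flag, with a release fence in between
 * (roughly the role smp_wmb() plays in set_state()).
 */
static void publish(int value)
{
	atomic_store_explicit(&payload, value, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/*
 * Reader: load the flag, then the data, with the pairing acquire fence
 * (roughly smp_rmb()). Returns 1 and fills *out once the flag is seen.
 */
static int try_consume(int *out)
{
	assert(out);
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return 0;
	atomic_thread_fence(memory_order_acquire);
	*out = atomic_load_explicit(&payload, memory_order_relaxed);
	return 1;
}
```

Without the fence on the reader side, a reader that sees ready == 1 could in principle still read a stale payload on a weakly ordered machine.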


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 13:36           ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Vegard Nossum
@ 2008-11-11 13:46             ` Vegard Nossum
  2008-11-11 13:49             ` Peter Zijlstra
  1 sibling, 0 replies; 106+ messages in thread
From: Vegard Nossum @ 2008-11-11 13:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rafael J. Wysocki, Heiko Carstens, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Peter Zijlstra,
	Oleg Nesterov, Dmitry Adamushko, Andrew Morton

On Tue, Nov 11, 2008 at 2:36 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> On Tue, Nov 11, 2008 at 11:52 AM, Ingo Molnar <mingo@elte.hu> wrote:
>> [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
>
> Heh. I am not expert, but I looked at the code. The obvious suspicious
> thing to see is the use of unpaired barriers? Maybe like this:

...

> /* Last one to ack a state moves to the next state. */
> static void ack_state(void)
> {
>         if (atomic_dec_and_test(&thread_ack))
>
> Maybe
> + /* force ordering between thread_ack/state */
> + smp_rmb();
> here?

Oops, I am wrong (after a small investigation).

Documentation/memory-barriers.txt says:

"Any atomic operation that modifies some state in memory and returns
information about the state (old or new) implies an SMP-conditional
general memory barrier (smp_mb()) on each side of the actual operation
(with the exception of explicit lock operations, described later).
These include:

...
        atomic_dec_and_test();"

Won't fix the problem at hand, but maybe something like this would be
nice for future generations :-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 0e688c6..6796bb1 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -55,6 +55,7 @@ static void set_state(enum stopmachine_state newstate)
 /* Last one to ack a state moves to the next state. */
 static void ack_state(void)
 {
+       /* Implicit memory barrier; no smp_rmb() needed */
        if (atomic_dec_and_test(&thread_ack))
                set_state(state + 1);
 }
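
For readers outside the kernel, the "last one to ack advances the state" pattern can be sketched in userspace C11 as below. This is only an analogy: a sequentially consistent atomic_fetch_sub() is used as a stand-in for atomic_dec_and_test(), the fence stands in for smp_wmb(), and num_threads is fixed at 3 purely for illustration. It makes no claim about how the kernel primitives are implemented on any architecture.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int thread_ack;
static atomic_int state;
static int num_threads = 3;	/* assumed thread count for this sketch */

static void set_state(int newstate)
{
	/* Reset ack counter, then publish the new state. */
	atomic_store_explicit(&thread_ack, num_threads, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stands in for smp_wmb() */
	atomic_store_explicit(&state, newstate, memory_order_relaxed);
}

/* Last one to ack a state moves to the next state. */
static void ack_state(void)
{
	/* seq_cst read-modify-write: ordered on both sides, no extra fence */
	int old = atomic_fetch_sub(&thread_ack, 1);

	assert(old > 0);
	if (old == 1)
		set_state(atomic_load(&state) + 1);
}
```

With three participants, the first two calls to ack_state() leave the state alone and the third one bumps it and re-arms the counter.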


Vegard


* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 13:36           ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Vegard Nossum
  2008-11-11 13:46             ` Vegard Nossum
@ 2008-11-11 13:49             ` Peter Zijlstra
  1 sibling, 0 replies; 106+ messages in thread
From: Peter Zijlstra @ 2008-11-11 13:49 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Ingo Molnar, Rafael J. Wysocki, Heiko Carstens,
	Linux Kernel Mailing List, Kernel Testers List, Rusty Russell,
	Oleg Nesterov, Dmitry Adamushko, Andrew Morton

On Tue, 2008-11-11 at 14:36 +0100, Vegard Nossum wrote:
> On Tue, Nov 11, 2008 at 11:52 AM, Ingo Molnar <mingo@elte.hu> wrote:
> > [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
> 
> Heh. I am not expert, but I looked at the code. The obvious suspicious
> thing to see is the use of unpaired barriers? Maybe like this:
> 
> static void set_state(enum stopmachine_state newstate)
> {
>         /* Reset ack counter. */
>         atomic_set(&thread_ack, num_threads);
>         smp_wmb();
> 
> +       /* force ordering between thread_ack/state */
> 
>         state = newstate;
> }
> 
> /* Last one to ack a state moves to the next state. */
> static void ack_state(void)
> {
>         if (atomic_dec_and_test(&thread_ack))
> 
> Maybe
> +       /* force ordering between thread_ack/state */
> +       smp_rmb();
> here?

all atomic ops that have return values imply a full barrier, iirc

>                 set_state(state + 1);
> }
> 
> Or maybe I am wrong. But Documentation/memory-barriers.txt is rather
> explicit on this point.
> 
> 
> Vegard
> 


* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 12:42             ` Heiko Carstens
  2008-11-11 13:13               ` Ingo Molnar
@ 2008-11-11 14:35               ` Paul E. McKenney
  2008-11-11 15:01                 ` Heiko Carstens
  2008-11-11 15:02                 ` Paul E. McKenney
  1 sibling, 2 replies; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-11 14:35 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 01:42:01PM +0100, Heiko Carstens wrote:
> [Cc:-ed Steven and Paul, since this backtrace seems to be RCU specific]
> 
> Steven, Paul, any idea what could cause the hang? I think I would
> get lost in the RCU code...

Hello, Heiko,

Could you please apply the following debug patch (due to Jiangshan and
myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
then mount debugfs after boot, for example, on /debug.  This will
create a /debug/rcu directory with three files, "rcucb", "rcu_data",
and "rcu_bh_data".  Since you are still able to log in, could you
please send the contents of these three files?

							Thanx, Paul

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 10:52         ` Ingo Molnar
  2008-11-11 11:31           ` Heiko Carstens
  2008-11-11 13:36           ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Vegard Nossum
@ 2008-11-11 14:47           ` Vegard Nossum
  2008-11-11 15:11             ` Dmitry Adamushko
  2008-11-11 16:31             ` Oleg Nesterov
  2008-11-12  3:39           ` Rusty Russell
  3 siblings, 2 replies; 106+ messages in thread
From: Vegard Nossum @ 2008-11-11 14:47 UTC (permalink / raw)
  To: Ingo Molnar, Rafael J. Wysocki
  Cc: Heiko Carstens, Linux Kernel Mailing List, Kernel Testers List,
	Rusty Russell, Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko,
	Andrew Morton

On Tue, Nov 11, 2008 at 11:52 AM, Ingo Molnar <mingo@elte.hu> wrote:
> [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
>
> Seems like the new kernel/stop_machine.c logic has a race for the test
> sequence above. (Below is the bisected commit again, maybe the race is
> visible via email review as well.)

Let me try again.

I think that the test for stop_machine_data in stop_cpu() should not
have been moved out of __stop_machine(), because cpu_online_map may now
change between calls to stop_cpu() (if the callback tries to
online/offline CPUs), and the end result may be different.

Maybe?


Vegard

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 14:35               ` Paul E. McKenney
@ 2008-11-11 15:01                 ` Heiko Carstens
  2008-11-11 16:17                   ` Paul E. McKenney
  2008-11-11 15:02                 ` Paul E. McKenney
  1 sibling, 1 reply; 106+ messages in thread
From: Heiko Carstens @ 2008-11-11 15:01 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 06:35:05AM -0800, Paul E. McKenney wrote:
> > > A process that would do nothing but onlining/offlining cpus would get
> > > stuck after a while:
> > > 
> > >  0 schedule+842 [0x342522]
> > >  1 schedule_timeout+200 [0x342ec4]
> > >  2 wait_for_common+362 [0x341fd6]
> > >  3 wait_for_completion+54 [0x342146]
> > >  4 __synchronize_sched+80 [0x81670]
> > >  5 cpu_down+172 [0x33c030]
> > >  6 store_online+96 [0x33c488]
> > >  7 sysdev_store+52 [0x1bda84]
> > >  8 sysfs_write_file+242 [0x1350ba]
> > >  9 vfs_write+176 [0xd2028]
> > > 10 sys_write+82 [0xd21ea]
> > > 11 sysc_noemu+16 [0x269d8]
> > > 
> > > All cpus are in cpu_idle and no other task is in state TASK_INTERRUPTIBLE
> > > or TASK_UNINTERRUPTIBLE. However, it would continue to work as soon as
> > > I log in to the system or generate a console interrupt.
> > > I'm going to look into the dump and see if I can figure out what is
> > > broken here.
> > > Dunno if it is the same bug or something else.
> > 
> > [Cc:-ed Steven and Paul, since this backtrace seems to be RCU specific]
> > 
> > Steven, Paul, any idea what could cause the hang? I think I would
> > get lost in the RCU code...
> 
> Hello, Heiko,
> 
> Could you please apply the following debug patch (due to Jiangshan and
> myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> then mount debugfs after boot, for example, on /debug.  This will
> create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> and "rcu_bh_data".  Since you are still able to log in, could you
> please send the contents of these three files?

Hi Paul,

could you attach the patch please? :)

Does the patch also make sense if the system continues to work? That
is, the machine isn't stalled anymore as soon as I log in.
On the other hand, I do have a dump of the system and can look at
whatever data structures you want, if that helps.

Thanks,
Heiko

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 14:35               ` Paul E. McKenney
  2008-11-11 15:01                 ` Heiko Carstens
@ 2008-11-11 15:02                 ` Paul E. McKenney
  2008-11-11 16:14                   ` Heiko Carstens
  2008-11-11 17:03                   ` Q: force_quiescent_state && cpu_online_map Oleg Nesterov
  1 sibling, 2 replies; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-11 15:02 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 06:35:05AM -0800, Paul E. McKenney wrote:
> Hello, Heiko,
> 
> Could you please apply the following debug patch (due to Jiangshan and
> myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> then mount debugfs after boot, for example, on /debug.  This will
> create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> and "rcu_bh_data".  Since you are still able to log in, could you
> please send the contents of these three files?
> 
> 							Thanx, Paul

This time with the patch actually attached...  Thanks to Peter Z.
for alerting me to my omission.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---

diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 4ab8436..735f35a 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -54,6 +54,9 @@ struct rcu_ctrlblk {
 				 /* for current batch to proceed.        */
 } ____cacheline_internodealigned_in_smp;
 
+extern struct rcu_ctrlblk rcu_ctrlblk;
+extern struct rcu_ctrlblk rcu_bh_ctrlblk;
+
 /* Is batch a before batch b ? */
 static inline int rcu_batch_before(long a, long b)
 {
@@ -76,6 +79,7 @@ struct rcu_data {
 	long		quiescbatch;     /* Batch # for grace period */
 	int		passed_quiesc;	 /* User-mode/idle loop etc. */
 	int		qs_pending;	 /* core waits for quiesc state */
+	bool		beenonline;	 /* CPU online at least once */
 
 	/* 2) batch handling */
 	long  	       	batch;           /* Batch # for current RCU batch */
diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
index 9fdba03..ba32338 100644
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -68,7 +68,6 @@ config PREEMPT_RCU
 
 config RCU_TRACE
 	bool "Enable tracing for RCU - currently stats in debugfs"
-	depends on PREEMPT_RCU
 	select DEBUG_FS
 	default y
 	help
diff --git a/kernel/Makefile b/kernel/Makefile
index 4e1d7df..e0bfce7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -77,6 +77,8 @@ obj-$(CONFIG_CLASSIC_RCU) += rcuclassic.o
 obj-$(CONFIG_PREEMPT_RCU) += rcupreempt.o
 ifeq ($(CONFIG_PREEMPT_RCU),y)
 obj-$(CONFIG_RCU_TRACE) += rcupreempt_trace.o
+else
+obj-$(CONFIG_RCU_TRACE) += rcuclassic_trace.o
 endif
 obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_SYSCTL) += utsname_sysctl.o
diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
index aad93cd..06472fc 100644
--- a/kernel/rcuclassic.c
+++ b/kernel/rcuclassic.c
@@ -57,13 +57,13 @@ EXPORT_SYMBOL_GPL(rcu_lock_map);
 
 
 /* Definition for rcupdate control block. */
-static struct rcu_ctrlblk rcu_ctrlblk = {
+struct rcu_ctrlblk rcu_ctrlblk = {
 	.cur = -300,
 	.completed = -300,
 	.lock = __SPIN_LOCK_UNLOCKED(&rcu_ctrlblk.lock),
 	.cpumask = CPU_MASK_NONE,
 };
-static struct rcu_ctrlblk rcu_bh_ctrlblk = {
+struct rcu_ctrlblk rcu_bh_ctrlblk = {
 	.cur = -300,
 	.completed = -300,
 	.lock = __SPIN_LOCK_UNLOCKED(&rcu_bh_ctrlblk.lock),
@@ -564,6 +564,7 @@ static void rcu_init_percpu_data(int cpu, struct rcu_ctrlblk *rcp,
 	rdp->donetail = &rdp->donelist;
 	rdp->quiescbatch = rcp->completed;
 	rdp->qs_pending = 0;
+	rdp->beenonline = 1;
 	rdp->cpu = cpu;
 	rdp->blimit = blimit;
 }
diff --git a/kernel/rcuclassic_trace.c b/kernel/rcuclassic_trace.c
new file mode 100644
index 0000000..b719048
--- /dev/null
+++ b/kernel/rcuclassic_trace.c
@@ -0,0 +1,198 @@
+/*
+ * Read-Copy Update tracing for classic implementation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright IBM Corporation, 2008
+ *
+ * Updated to use seqfile by Lai Jiangshan.
+ *
+ * Papers:  http://www.rdrop.com/users/paulmck/RCU
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * 		Documentation/RCU
+ *
+ */
+#include <linux/rcupdate.h>
+#include <linux/module.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+/* Print out rcu_data structures using seqfile facility. */
+
+static struct rcu_data *get_rcu_data_bh(int cpu)
+{
+	return &per_cpu(rcu_bh_data, cpu);
+}
+
+static struct rcu_data *get_rcu_data(int cpu)
+{
+	return &per_cpu(rcu_data, cpu);
+}
+
+static int show_rcu_data(struct seq_file *m, void *v)
+{
+	struct rcu_data *rdp = v;
+
+	if (!rdp->beenonline)
+		return 0;
+
+	seq_printf(m, "processor\t: %d", rdp->cpu);
+	if (cpu_is_offline(rdp->cpu))
+		seq_puts(m, "!\n");
+	else
+		seq_puts(m, "\n");
+	seq_printf(m, "quiescbatch\t: %ld\n", rdp->quiescbatch);
+	seq_printf(m, "batch\t\t: %ld\n", rdp->batch);
+	seq_printf(m, "passed_quiesc\t: %d\n", rdp->passed_quiesc);
+	seq_printf(m, "qs_pending\t: %d\n", rdp->qs_pending);
+	seq_printf(m, "qlen\t\t: %ld\n", rdp->qlen);
+	seq_printf(m, "blimit\t\t: %ld\n", rdp->blimit);
+	seq_puts(m, "\n");
+	return 0;
+}
+
+static void *c_start(struct seq_file *m, loff_t *pos)
+{
+	typedef struct rcu_data *(*get_data_func)(int);
+
+	if (*pos == 0)  /* just in case, cpu 0 is not the first */
+		*pos = first_cpu(cpu_possible_map);
+	else
+		*pos = next_cpu_nr(*pos - 1, cpu_possible_map);
+	if ((*pos) < nr_cpu_ids)
+		return ((get_data_func)m->private)(*pos);
+	return NULL;
+}
+
+static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	(*pos)++;
+	return c_start(m, pos);
+}
+
+static void c_stop(struct seq_file *m, void *v)
+{
+}
+
+const struct seq_operations rcu_data_seq_op = {
+	.start	= c_start,
+	.next	= c_next,
+	.stop	= c_stop,
+	.show	= show_rcu_data,
+};
+
+static int rcu_data_open(struct inode *inode, struct file *file)
+{
+	int ret = seq_open(file, &rcu_data_seq_op);
+
+	if (ret)
+		return ret;
+	((struct seq_file *)file->private_data)->private = inode->i_private;
+	return 0;
+}
+
+static const struct file_operations rcu_data_fops = {
+	.owner		= THIS_MODULE,
+	.open		= rcu_data_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
+/* Print out rcu_ctrlblk structures using seqfile facility. */
+
+static void print_one_rcu_ctrlblk(struct seq_file *m, struct rcu_ctrlblk *rcp)
+{
+	seq_printf(m, "cur=%ld  completed=%ld   next_pending=%d  s=%d\n\t",
+		   rcp->cur, rcp->completed, rcp->next_pending, rcp->signaled);
+	seq_cpumask(m, &rcp->cpumask);
+	seq_puts(m, "\n");
+}
+
+static int show_rcucb(struct seq_file *m, void *unused)
+{
+	seq_puts(m, "rcu: ");
+	print_one_rcu_ctrlblk(m, &rcu_ctrlblk);
+	seq_puts(m, "rcu_bh: ");
+	print_one_rcu_ctrlblk(m, &rcu_bh_ctrlblk);
+	seq_puts(m, "online: ");
+	seq_cpumask(m, &cpu_online_map);
+	seq_puts(m, "\n");
+	return 0;
+}
+
+static int rcucb_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, show_rcucb, NULL);
+}
+
+static struct file_operations rcucb_fops = {
+	.owner		= THIS_MODULE,
+	.open		= rcucb_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static struct dentry *rcudir, *rcu_bh_data_file, *rcu_data_file, *rcucb_file;
+
+static int __init rcuclassic_trace_init(void)
+{
+	rcudir = debugfs_create_dir("rcu", NULL);
+	if (!rcudir)
+		goto out;
+
+	rcu_bh_data_file = debugfs_create_file("rcu_bh_data", 0444, rcudir,
+					       get_rcu_data_bh, &rcu_data_fops);
+	if (!rcu_bh_data_file)
+		goto out_rcudir;
+
+	rcu_data_file = debugfs_create_file("rcu_data", 0444, rcudir,
+					    get_rcu_data, &rcu_data_fops);
+	if (!rcu_data_file)
+		goto out_rcudata_bh_file;
+
+	rcucb_file = debugfs_create_file("rcucb", 0444, rcudir,
+					 NULL, &rcucb_fops);
+	if (!rcucb_file)
+		goto out_rcudata_file;
+	return 0;
+
+out_rcudata_file:
+	debugfs_remove(rcu_data_file);
+out_rcudata_bh_file:
+	debugfs_remove(rcu_bh_data_file);
+out_rcudir:
+	debugfs_remove(rcudir);
+out:
+	return 1;
+}
+
+static void __exit rcuclassic_trace_cleanup(void)
+{
+	debugfs_remove(rcucb_file);
+	debugfs_remove(rcu_data_file);
+	debugfs_remove(rcu_bh_data_file);
+	debugfs_remove(rcudir);
+}
+
+module_init(rcuclassic_trace_init);
+module_exit(rcuclassic_trace_cleanup);
+
+MODULE_AUTHOR("Paul E. McKenney");
+MODULE_DESCRIPTION("Read-Copy Update tracing for classic implementation");
+MODULE_LICENSE("GPL");
+

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 14:47           ` Vegard Nossum
@ 2008-11-11 15:11             ` Dmitry Adamushko
  2008-11-11 16:31             ` Oleg Nesterov
  1 sibling, 0 replies; 106+ messages in thread
From: Dmitry Adamushko @ 2008-11-11 15:11 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Ingo Molnar, Rafael J. Wysocki, Heiko Carstens,
	Linux Kernel Mailing List, Kernel Testers List, Rusty Russell,
	Peter Zijlstra, Oleg Nesterov, Andrew Morton

2008/11/11 Vegard Nossum <vegard.nossum@gmail.com>:
> On Tue, Nov 11, 2008 at 11:52 AM, Ingo Molnar <mingo@elte.hu> wrote:
>> [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
>>
>> Seems like the new kernel/stop_machine.c logic has a race for the test
>> sequence above. (Below is the bisected commit again, maybe the race is
>> visible via email review as well.)
>
> I try again.
>
> I think that the test for stop_machine_data in stop_cpu() should not
> have been moved from __stop_machine().

Do you mean the following test?

        if (!active_cpus) {
                if (cpu == first_cpu(cpu_online_map))
                        smdata = &active;
        } else {
                if (cpu_isset(cpu, *active_cpus))
                        smdata = &active;
        }

> Because now cpu_online_map may
> change in-between calls to stop_cpu() (if the callback tries to
> online/offline CPUs), and the end result may be different.

take_cpu_down() cannot run before stop_cpu() has completed the
STOPMACHINE_DISABLE_IRQ step on all the cpus, i.e. not before "state ==
STOPMACHINE_RUN". By that moment, 'smdata' has been set up on all
cpus... if this is the case you had in mind.
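
To make the concern easier to discuss, here is a small userspace model of the selection test quoted above. It is only a sketch: plain unsigned long bitmasks replace cpumask_t, and first_cpu_in() and picks_active() are made-up names. It shows why the result depends on the online mask at the moment each CPU evaluates the test when no explicit active_cpus mask is given. Whether the mask can really change in that window is exactly the question here.

```c
#include <assert.h>
#include <stddef.h>

/* Lowest set bit of a non-empty mask (userspace stand-in for first_cpu()). */
static int first_cpu_in(unsigned long mask)
{
	int cpu = 0;

	assert(mask != 0);
	while (!(mask & 1UL)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Returns 1 if `cpu` would pick the "active" data, i.e. run the callback. */
static int picks_active(int cpu, unsigned long online,
			const unsigned long *active_cpus)
{
	assert(cpu >= 0 && cpu < (int)(sizeof(unsigned long) * 8));
	if (active_cpus == NULL)
		return cpu == first_cpu_in(online);
	return (int)((*active_cpus >> cpu) & 1UL);
}
```

With online = 0x3 only CPU 0 is chosen, but if CPU 1 evaluated the test against online = 0x2 it would be chosen as well. With an explicit active_cpus mask the online map does not enter into it.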


>
> Maybe?
>
>
> Vegard
>


-- 
Best regards,
Dmitry Adamushko

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 15:02                 ` Paul E. McKenney
@ 2008-11-11 16:14                   ` Heiko Carstens
  2008-11-11 16:45                     ` Paul E. McKenney
  2008-11-11 17:03                   ` Q: force_quiescent_state && cpu_online_map Oleg Nesterov
  1 sibling, 1 reply; 106+ messages in thread
From: Heiko Carstens @ 2008-11-11 16:14 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

> > Could you please apply the following debug patch (due to Jiangshan and
> > myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> > then mount debugfs after boot, for example, on /debug.  This will
> > create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> > and "rcu_bh_data".  Since you are still able to log in, could you
> > please send the contents of these three files?
> > 
> > 							Thanx, Paul
> 
> This time with the patch actually attached...  Thanks to Peter Z.
> for alerting me to my omission.

Well, your patch doesn't apply on git head. However I used preemptible
RCU instead and had tracing enabled.

This is the output of the three files after it stalled (and continued,
because I caused an interrupt by sending a network packet) twice:

[root@h0545001 rcu]# cat rcuctrs 
CPU last cur F M
  1    0   0 1 1
  3    0   0 1 1
  4    0   0 0 0
  5    0   0 0 1
  6    0   0 0 0
ggp = 1640, state = waitack

[root@h0545001 rcu]# cat rcugp 
oldggp=1652  newggp=1655

[root@h0545001 rcu]# cat rcustats 
na=33948 nl=3 wa=33945 wl=0 da=33945 dl=0 dr=33945 di=0
1=0 e1=0 i1=1674 ie1=4 g1=1670 a1=1920 ae1=251 a2=1669
z1=1669 ze1=0 z2=1669 m1=4411 me1=2742 m2=1669


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 15:01                 ` Heiko Carstens
@ 2008-11-11 16:17                   ` Paul E. McKenney
  0 siblings, 0 replies; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-11 16:17 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 04:01:32PM +0100, Heiko Carstens wrote:
> On Tue, Nov 11, 2008 at 06:35:05AM -0800, Paul E. McKenney wrote:
> > > > A process that would do nothing but onlining/offlining cpus would get
> > > > stuck after a while:
> > > > 
> > > >  0 schedule+842 [0x342522]
> > > >  1 schedule_timeout+200 [0x342ec4]
> > > >  2 wait_for_common+362 [0x341fd6]
> > > >  3 wait_for_completion+54 [0x342146]
> > > >  4 __synchronize_sched+80 [0x81670]
> > > >  5 cpu_down+172 [0x33c030]
> > > >  6 store_online+96 [0x33c488]
> > > >  7 sysdev_store+52 [0x1bda84]
> > > >  8 sysfs_write_file+242 [0x1350ba]
> > > >  9 vfs_write+176 [0xd2028]
> > > > 10 sys_write+82 [0xd21ea]
> > > > 11 sysc_noemu+16 [0x269d8]
> > > > 
> > > > All cpus are in cpu_idle and no other task in state TASK_INTERRUPTIBLE
> > > > or TASK_UNINTERRUPTIBLE. However it would continue to work as soon as
> > > > I login into the system or generate a console interrupt.
> > > > I'm going to look into the dump and see if I can figure out what is
> > > > broken here.
> > > > Dunno if it is the same bug or something else.
> > > 
> > > [Cc:-ed Steven and Paul, since this backtrace seems to be RCU specific]
> > > 
> > > Steven, Paul, any idea what could cause the hang? I think I would
> > > get lost in the RCU code...
> > 
> > Hello, Heiko,
> > 
> > Could you please apply the following debug patch (due to Jiangshan and
> > myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> > then mount debugfs after boot, for example, on /debug.  This will
> > create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> > and "rcu_bh_data".  Since you are still able to log in, could you
> > please send the contents of these three files?
> 
> Hi Paul,
> 
> could you attach the patch please? :)

Peter Z. beat you to it.  ;-)

See previous email.

> Does the patch also make sense if the system continues to work? That
> is the machine isn't stalled anymore as soon as I log in.
> On the other hand I do have a dump of the system and can look in
> whatever data structures you want. If that helps.

Ah!

I would like to see the value of rcu_ctrlblk.cpumask and also the value
of cpu_online_map.  One guess would be that rcu_ctrlblk.cpumask has a
bit set that is -not- set in cpu_online_map, which would indicate that
RCU was incorrectly waiting on an offline CPU.

On the other hand, if all the bits set in rcu_ctrlblk.cpumask are also
set in cpu_online_map, then could you please dump out the instances of
the rcu_data per-CPU variable that correspond to the bits set in
rcu_ctrlblk.cpumask?

Finally, if no bits are set in rcu_ctrlblk.cpumask, the question would
be "why isn't the synchronize_sched() waking up?"
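
To make the three cases above concrete, here is a minimal userspace
sketch, assuming each cpumask fits in one unsigned long. The enum and
the function name are hypothetical and not kernel API; the point is
only the order of the checks (an empty mask is also a subset of
cpu_online_map, so it has to be tested before the subset case).

```c
/* Hypothetical labels for the three diagnostic cases. */
enum rcu_hang_case {
	RCU_WAITING_ON_OFFLINE_CPU,	/* bit set in cpumask but not online */
	RCU_WAITING_ON_ONLINE_CPUS,	/* all set bits online: dump rcu_data */
	RCU_MASK_EMPTY,			/* nothing pending: why no wakeup? */
};

/* Classify a (rcu_ctrlblk.cpumask, cpu_online_map) pair taken from a dump. */
enum rcu_hang_case classify_rcu_hang(unsigned long rcu_cpumask,
				     unsigned long online_map)
{
	/* Any bit outside the online map means RCU waits on an offline CPU. */
	if (rcu_cpumask & ~online_map)
		return RCU_WAITING_ON_OFFLINE_CPU;

	/* No bits at all: the grace period should already have ended. */
	if (rcu_cpumask == 0)
		return RCU_MASK_EMPTY;

	/* Otherwise every CPU still being waited on is online. */
	return RCU_WAITING_ON_ONLINE_CPUS;
}
```

For example, cpumask 0x6 against an online map of 0x2 falls into the
first case, because CPU 2 is still being waited on but is offline.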

BTW, I am assuming that you have the same config as Rafael, in other
words, that you are running Classic RCU rather than preemptable RCU.

The point of the patch is that it allows you to see this info by catting
out the /debug/rcu files, at least assuming that the system is healthy
enough to allow you to cat files.  But if you already have a crash dump...

							Thanx, Paul

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 14:47           ` Vegard Nossum
  2008-11-11 15:11             ` Dmitry Adamushko
@ 2008-11-11 16:31             ` Oleg Nesterov
  2008-11-12  3:30               ` Rusty Russell
  1 sibling, 1 reply; 106+ messages in thread
From: Oleg Nesterov @ 2008-11-11 16:31 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Ingo Molnar, Rafael J. Wysocki, Heiko Carstens,
	Linux Kernel Mailing List, Kernel Testers List, Rusty Russell,
	Peter Zijlstra, Dmitry Adamushko, Andrew Morton

On 11/11, Vegard Nossum wrote:
>
> I think that the test for stop_machine_data in stop_cpu() should not
> have been moved from __stop_machine(). Because now cpu_online_map may
> change in-between calls to stop_cpu() (if the callback tries to
> online/offline CPUs), and the end result may be different.

I don't think this is possible, the callback must not be called unless
all threads ack (at least) the STOPMACHINE_PREPARE state.


Off-topic question, __stop_machine() does:
	
	/* Schedule the stop_cpu work on all cpus: hold this CPU so one
	 * doesn't hit this CPU until we're ready. */
	get_cpu();
	for_each_online_cpu(i) {
		sm_work = percpu_ptr(stop_machine_work, i);
		INIT_WORK(sm_work, stop_cpu);
		queue_work_on(i, stop_machine_wq, sm_work);
	}
	/* This will release the thread on our CPU. */
	put_cpu();

Don't we actually need preempt_disable/preempt_enable instead of
get/put cpu? (yes, they're the same currently). We don't care about
the CPU we are running on, and it can't go away until we queue all
works. But we must ensure that stop_cpu() on the same CPU can't
preempt us, right?

Oleg.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 16:14                   ` Heiko Carstens
@ 2008-11-11 16:45                     ` Paul E. McKenney
  2008-11-11 17:34                       ` Paul E. McKenney
  0 siblings, 1 reply; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-11 16:45 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 05:14:01PM +0100, Heiko Carstens wrote:
> > > Could you please apply the following debug patch (due to Jiangshan and
> > > myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> > > then mount debugfs after boot, for example, on /debug.  This will
> > > create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> > > and "rcu_bh_data".  Since you are still able to log in, could you
> > > please send the contents of these three files?
> > > 
> > > 							Thanx, Paul
> > 
> > This time with the patch actually attached...  Thanks to Peter Z.
> > for alerting me to my omission.
> 
> Well, your patch doesn't apply on git head. However I used preemptible
> RCU instead and had tracing enabled.

Were you using preemptible RCU earlier as well?  Rafael was using
classic RCU.  Don't get me wrong, all problems need fixing, just trying
to make sure I understand where the problems are occurring.

> This is the output of the three files after it stalled (and continued,
> because I caused an interrupt by sending a network packet) twice:
> 
> [root@h0545001 rcu]# cat rcuctrs 
> CPU last cur F M
>   1    0   0 1 1
>   3    0   0 1 1
>   4    0   0 0 0
>   5    0   0 0 1
>   6    0   0 0 0
> ggp = 1640, state = waitack
> 
> [root@h0545001 rcu]# cat rcugp 
> oldggp=1652  newggp=1655
> 
> [root@h0545001 rcu]# cat rcustats 
> na=33948 nl=3 wa=33945 wl=0 da=33945 dl=0 dr=33945 di=0
> 1=0 e1=0 i1=1674 ie1=4 g1=1670 a1=1920 ae1=251 a2=1669
> z1=1669 ze1=0 z2=1669 m1=4411 me1=2742 m2=1669

This hang also involved synchronize_sched()?  Or synchronize_rcu()?

The reason I ask is that the above stats are for the synchronize_rcu()
rather than synchronize_sched().

							Thanx, Paul

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Q: force_quiescent_state && cpu_online_map
  2008-11-11 15:02                 ` Paul E. McKenney
  2008-11-11 16:14                   ` Heiko Carstens
@ 2008-11-11 17:03                   ` Oleg Nesterov
  2008-11-11 17:25                     ` Paul E. McKenney
  1 sibling, 1 reply; 106+ messages in thread
From: Oleg Nesterov @ 2008-11-11 17:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Heiko Carstens, Ingo Molnar, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Rusty Russell,
	Vegard Nossum, Peter Zijlstra, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

I don't think this matters, but still...

	force_quiescent_state:

			 * cpu_online_map is updated by the _cpu_down()
			 * using __stop_machine(). Since we're in irqs disabled
			 * section, __stop_machine() is not executing, hence
			 * the cpu_online_map is stable.
			 *
			 * However,  a cpu might have been offlined _just_ before
			 * we disabled irqs while entering here.
			 * And rcu subsystem might not yet have handled the CPU_DEAD
			 * notification, leading to the offlined cpu's bit
			 * being set in the rcp->cpumask.
			 *
			 * Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent
			 * sending smp_reschedule() to an offlined CPU.
			 */
			cpus_and(cpumask, rcp->cpumask, cpu_online_map);
			cpu_clear(rdp->cpu, cpumask);
			for_each_cpu_mask_nr(cpu, cpumask)
				smp_send_reschedule(cpu);

However,

	// called by __stop_machine take_cpu_down()
	arch/x86/kernel/smpboot.c:cpu_disable_common()

		/*
		 * HACK:
		 * Allow any queued timer interrupts to get serviced
		 * This is only a temporary solution until we cleanup
		 * fixup_irqs as we do for IA64.
		 */
		local_irq_enable();
		mdelay(1);
		local_irq_disable();
		...
		remove_cpu_from_maps(cpu);

So it is possible to send the ipi to the dying CPU. I know nothing
about this low-level irq code, most probably this is harmless. We
already did clear_local_APIC(), but I don't understand what it does.
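
The window can be shown with plain bitmasks. A rough sketch, assuming
the masks fit in one unsigned long; resched_targets() is a hypothetical
name that only mirrors the cpus_and()/cpu_clear() pair quoted above:

```c
/* Compute which cpus would get smp_send_reschedule(): the cpus RCU is
 * still waiting on, restricted to the online map, minus ourselves. */
unsigned long resched_targets(unsigned long rcp_cpumask,
			      unsigned long online_map, int self)
{
	unsigned long mask = rcp_cpumask & online_map;	/* cpus_and() */

	mask &= ~(1UL << self);				/* cpu_clear() */
	return mask;
}
```

With cpus 0-3, self == 0 and cpu 3 on its way down: as long as cpu 3 is
still in the online map (0xf), it is part of the result (0xe). Only
once it has been removed from the map (0x7) does it drop out (0x6).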

Oleg.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Q: force_quiescent_state && cpu_online_map
  2008-11-11 17:03                   ` Q: force_quiescent_state && cpu_online_map Oleg Nesterov
@ 2008-11-11 17:25                     ` Paul E. McKenney
  0 siblings, 0 replies; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-11 17:25 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Heiko Carstens, Ingo Molnar, Rafael J. Wysocki,
	Linux Kernel Mailing List, Kernel Testers List, Rusty Russell,
	Vegard Nossum, Peter Zijlstra, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 06:03:27PM +0100, Oleg Nesterov wrote:
> I don't think this matters, but still...
> 
> 	force_quiescent_state:
> 
> 			 * cpu_online_map is updated by the _cpu_down()
> 			 * using __stop_machine(). Since we're in irqs disabled
> 			 * section, __stop_machine() is not executing, hence
> 			 * the cpu_online_map is stable.
> 			 *
> 			 * However,  a cpu might have been offlined _just_ before
> 			 * we disabled irqs while entering here.
> 			 * And rcu subsystem might not yet have handled the CPU_DEAD
> 			 * notification, leading to the offlined cpu's bit
> 			 * being set in the rcp->cpumask.
> 			 *
> 			 * Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent
> 			 * sending smp_reschedule() to an offlined CPU.
> 			 */
> 			cpus_and(cpumask, rcp->cpumask, cpu_online_map);
> 			cpu_clear(rdp->cpu, cpumask);
> 			for_each_cpu_mask_nr(cpu, cpumask)
> 				smp_send_reschedule(cpu);
> 
> However,
> 
> 	// called by __stop_machine take_cpu_down()
> 	arch/x86/kernel/smpboot.c:cpu_disable_common()
> 
> 		/*
> 		 * HACK:
> 		 * Allow any queued timer interrupts to get serviced
> 		 * This is only a temporary solution until we cleanup
> 		 * fixup_irqs as we do for IA64.
> 		 */
> 		local_irq_enable();
> 		mdelay(1);
> 		local_irq_disable();
> 		...
> 		remove_cpu_from_maps(cpu);
> 
> So it is possible to send the ipi to the dying CPU. I know nothing
> about this low-level irq code, most probably this is harmless. We
> already did clear_local_APIC(), but I don't understand what it does.

Indeed, some of the things I am doing as part of the hierarchical RCU
implementation need to be applied to preemptable RCU.  :-/

							Thanx, Paul

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 16:45                     ` Paul E. McKenney
@ 2008-11-11 17:34                       ` Paul E. McKenney
  2008-11-12  9:05                         ` Heiko Carstens
  0 siblings, 1 reply; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-11 17:34 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 08:45:23AM -0800, Paul E. McKenney wrote:
> On Tue, Nov 11, 2008 at 05:14:01PM +0100, Heiko Carstens wrote:
> > > > Could you please apply the following debug patch (due to Jiangshan and
> > > > myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> > > > then mount debugfs after boot, for example, on /debug.  This will
> > > > create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> > > > and "rcu_bh_data".  Since you are still able to log in, could you
> > > > please send the contents of these three files?
> > > > 
> > > > 							Thanx, Paul
> > > 
> > > This time with the patch actually attached...  Thanks to Peter Z.
> > > for alerting me to my omission.
> > 
> > Well, your patch doesn't apply on git head. However I used preemptible
> > RCU instead and had tracing enabled.
> 
> Were you using preemptible RCU earlier as well?  Rafael was using
> classic RCU.  Don't get me wrong, all problems need fixing, just trying
> to make sure I understand where the problems are occurring.

And here is a version of the patch rebased to linux-2.6 git head.

This adds tracing to classic RCU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---

 include/linux/rcuclassic.h |    4 
 kernel/Kconfig.preempt     |    1 
 kernel/Makefile            |    2 
 kernel/rcuclassic.c        |    5 -
 kernel/rcuclassic_trace.c  |  198 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 207 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcuclassic.h b/include/linux/rcuclassic.h
index 5f89b62..ce183a8 100644
--- a/include/linux/rcuclassic.h
+++ b/include/linux/rcuclassic.h
@@ -63,6 +63,9 @@ struct rcu_ctrlblk {
 				 /* for current batch to proceed.        */
 } ____cacheline_internodealigned_in_smp;
 
+extern struct rcu_ctrlblk rcu_ctrlblk;
+extern struct rcu_ctrlblk rcu_bh_ctrlblk;
+
 /* Is batch a before batch b ? */
 static inline int rcu_batch_before(long a, long b)
 {
@@ -81,6 +84,7 @@ struct rcu_data {
 	long		quiescbatch;     /* Batch # for grace period */
 	int		passed_quiesc;	 /* User-mode/idle loop etc. */
 	int		qs_pending;	 /* core waits for quiesc state */
+	bool		beenonline;	 /* CPU online at least once */
 
 	/* 2) batch handling */
 	/*
diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
index 9fdba03..ba32338 100644
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -68,7 +68,6 @@ config PREEMPT_RCU
 
 config RCU_TRACE
 	bool "Enable tracing for RCU - currently stats in debugfs"
-	depends on PREEMPT_RCU
 	select DEBUG_FS
 	default y
 	help
diff --git a/kernel/Makefile b/kernel/Makefile
index 9a3ec66..9771050 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -79,6 +79,8 @@ obj-$(CONFIG_CLASSIC_RCU) += rcuclassic.o
 obj-$(CONFIG_PREEMPT_RCU) += rcupreempt.o
 ifeq ($(CONFIG_PREEMPT_RCU),y)
 obj-$(CONFIG_RCU_TRACE) += rcupreempt_trace.o
+else
+obj-$(CONFIG_RCU_TRACE) += rcuclassic_trace.o
 endif
 obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_SYSCTL) += utsname_sysctl.o
diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
index 37f72e5..54bd23b 100644
--- a/kernel/rcuclassic.c
+++ b/kernel/rcuclassic.c
@@ -58,14 +58,14 @@ EXPORT_SYMBOL_GPL(rcu_lock_map);
 
 
 /* Definition for rcupdate control block. */
-static struct rcu_ctrlblk rcu_ctrlblk = {
+struct rcu_ctrlblk rcu_ctrlblk = {
 	.cur = -300,
 	.completed = -300,
 	.pending = -300,
 	.lock = __SPIN_LOCK_UNLOCKED(&rcu_ctrlblk.lock),
 	.cpumask = CPU_MASK_NONE,
 };
-static struct rcu_ctrlblk rcu_bh_ctrlblk = {
+struct rcu_ctrlblk rcu_bh_ctrlblk = {
 	.cur = -300,
 	.completed = -300,
 	.pending = -300,
@@ -725,6 +725,7 @@ static void rcu_init_percpu_data(int cpu, struct rcu_ctrlblk *rcp,
 	rdp->donetail = &rdp->donelist;
 	rdp->quiescbatch = rcp->completed;
 	rdp->qs_pending = 0;
+	rdp->beenonline = 1;
 	rdp->cpu = cpu;
 	rdp->blimit = blimit;
 	spin_unlock_irqrestore(&rcp->lock, flags);
diff --git a/kernel/rcuclassic_trace.c b/kernel/rcuclassic_trace.c
new file mode 100644
index 0000000..612170c
--- /dev/null
+++ b/kernel/rcuclassic_trace.c
@@ -0,0 +1,198 @@
+/*
+ * Read-Copy Update tracing for classic implementation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright IBM Corporation, 2008
+ *
+ * Updated to use seqfile by Lai Jiangshan.
+ *
+ * Papers:  http://www.rdrop.com/users/paulmck/RCU
+ *
+ * For detailed explanation of Read-Copy Update mechanism see -
+ * 		Documentation/RCU
+ *
+ */
+#include <linux/rcupdate.h>
+#include <linux/module.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+
+/* Print out rcu_data structures using seqfile facility. */
+
+static struct rcu_data *get_rcu_data_bh(int cpu)
+{
+	return &per_cpu(rcu_bh_data, cpu);
+}
+
+static struct rcu_data *get_rcu_data(int cpu)
+{
+	return &per_cpu(rcu_data, cpu);
+}
+
+static int show_rcu_data(struct seq_file *m, void *v)
+{
+	struct rcu_data *rdp = v;
+
+	if (!rdp->beenonline)
+		return 0;
+
+	seq_printf(m, "processor\t: %d", rdp->cpu);
+	if (cpu_is_offline(rdp->cpu))
+		seq_puts(m, "!\n");
+	else
+		seq_puts(m, "\n");
+	seq_printf(m, "quiescbatch\t: %ld\n", rdp->quiescbatch);
+	seq_printf(m, "batch\t\t: %ld\n", rdp->batch);
+	seq_printf(m, "passed_quiesc\t: %d\n", rdp->passed_quiesc);
+	seq_printf(m, "qs_pending\t: %d\n", rdp->qs_pending);
+	seq_printf(m, "qlen\t\t: %ld\n", rdp->qlen);
+	seq_printf(m, "blimit\t\t: %ld\n", rdp->blimit);
+	seq_puts(m, "\n");
+	return 0;
+}
+
+static void *c_start(struct seq_file *m, loff_t *pos)
+{
+	typedef struct rcu_data *(*get_data_func)(int);
+
+	if (*pos == 0)  /* just in case, cpu 0 is not the first */
+		*pos = first_cpu(cpu_possible_map);
+	else
+		*pos = next_cpu_nr(*pos - 1, cpu_possible_map);
+	if ((*pos) < nr_cpu_ids)
+		return ((get_data_func)m->private)(*pos);
+	return NULL;
+}
+
+static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	(*pos)++;
+	return c_start(m, pos);
+}
+
+static void c_stop(struct seq_file *m, void *v)
+{
+}
+
+const struct seq_operations rcu_data_seq_op = {
+	.start	= c_start,
+	.next	= c_next,
+	.stop	= c_stop,
+	.show	= show_rcu_data,
+};
+
+static int rcu_data_open(struct inode *inode, struct file *file)
+{
+	int ret = seq_open(file, &rcu_data_seq_op);
+
+	if (ret)
+		return ret;
+	((struct seq_file *)file->private_data)->private = inode->i_private;
+	return 0;
+}
+
+static const struct file_operations rcu_data_fops = {
+	.owner		= THIS_MODULE,
+	.open		= rcu_data_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= seq_release,
+};
+
+/* Print out rcu_ctrlblk structures using seqfile facility. */
+
+static void print_one_rcu_ctrlblk(struct seq_file *m, struct rcu_ctrlblk *rcp)
+{
+	seq_printf(m, "cur=%ld  completed=%ld   pending=%d  s=%d\n\t",
+		   rcp->cur, rcp->completed, rcp->pending, rcp->signaled);
+	seq_cpumask(m, &rcp->cpumask);
+	seq_puts(m, "\n");
+}
+
+static int show_rcucb(struct seq_file *m, void *unused)
+{
+	seq_puts(m, "rcu: ");
+	print_one_rcu_ctrlblk(m, &rcu_ctrlblk);
+	seq_puts(m, "rcu_bh: ");
+	print_one_rcu_ctrlblk(m, &rcu_bh_ctrlblk);
+	seq_puts(m, "online: ");
+	seq_cpumask(m, &cpu_online_map);
+	seq_puts(m, "\n");
+	return 0;
+}
+
+static int rcucb_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, show_rcucb, NULL);
+}
+
+static struct file_operations rcucb_fops = {
+	.owner		= THIS_MODULE,
+	.open		= rcucb_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static struct dentry *rcudir, *rcu_bh_data_file, *rcu_data_file, *rcucb_file;
+
+static int __init rcuclassic_trace_init(void)
+{
+	rcudir = debugfs_create_dir("rcu", NULL);
+	if (!rcudir)
+		goto out;
+
+	rcu_bh_data_file = debugfs_create_file("rcu_bh_data", 0444, rcudir,
+					       get_rcu_data_bh, &rcu_data_fops);
+	if (!rcu_bh_data_file)
+		goto out_rcudir;
+
+	rcu_data_file = debugfs_create_file("rcu_data", 0444, rcudir,
+					    get_rcu_data, &rcu_data_fops);
+	if (!rcu_data_file)
+		goto out_rcudata_bh_file;
+
+	rcucb_file = debugfs_create_file("rcucb", 0444, rcudir,
+					 NULL, &rcucb_fops);
+	if (!rcucb_file)
+		goto out_rcudata_file;
+	return 0;
+
+out_rcudata_file:
+	debugfs_remove(rcu_data_file);
+out_rcudata_bh_file:
+	debugfs_remove(rcu_bh_data_file);
+out_rcudir:
+	debugfs_remove(rcudir);
+out:
+	return 1;
+}
+
+static void __exit rcuclassic_trace_cleanup(void)
+{
+	debugfs_remove(rcucb_file);
+	debugfs_remove(rcu_data_file);
+	debugfs_remove(rcu_bh_data_file);
+	debugfs_remove(rcudir);
+}
+
+module_init(rcuclassic_trace_init);
+module_exit(rcuclassic_trace_cleanup);
+
+MODULE_AUTHOR("Paul E. McKenney");
+MODULE_DESCRIPTION("Read-Copy Update tracing for classic implementation");
+MODULE_LICENSE("GPL");
+

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-10 22:55       ` Rafael J. Wysocki
  2008-11-11 10:52         ` Ingo Molnar
@ 2008-11-11 21:28         ` Dmitry Adamushko
  2008-11-11 23:43           ` Rafael J. Wysocki
  1 sibling, 1 reply; 106+ messages in thread
From: Dmitry Adamushko @ 2008-11-11 21:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Heiko Carstens, Linux Kernel Mailing List, Kernel Testers List,
	Rusty Russell

2008/11/10 Rafael J. Wysocki <rjw@sisk.pl>:
> On Monday, 10 of November 2008, Rafael J. Wysocki wrote:
>> On Monday, 10 of November 2008, Heiko Carstens wrote:
>> > On Sun, Nov 09, 2008 at 06:59:16PM +0100, Rafael J. Wysocki wrote:
>> > > This message has been generated automatically as a part of a report
>> > > of recent regressions.
>> > >
>> > > The following bug entry is on the current list of known regressions
>> > > from 2.6.27.  Please verify if it still should be listed and let me know
>> > > (either way).
>> > >
>> > >
>> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11989
>> > > Subject           : Suspend failure on NForce4-based boards due to chanes in stop_machine
>> > > Submitter : Rafael J. Wysocki <rjw@sisk.pl>
>> > > Date              : 2008-11-03 0:28 (7 days old)
>> > > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
>> > > References        : http://marc.info/?l=linux-kernel&m=122567187604356&w=4
>> >
>> > Hi Rafael,
>>
>> Hi,
>>
>> > could you provide more informations for this, please?
>> >
>> > What is your kernel configuration?
>>
>> Available at: http://www.sisk.pl/kernel/debug/mainline/2.6.28-rc3/kitty-config
>>
>> > Do you have any binary only modules (nvidia?) loaded?
>>
>> No, I don't.
>>
>> > Is it possible to recreate the bug by e.g. just doing something like
>> >
>> > echo 0 > /sys/devices/system/cpu/cpu1/online
>>
>> I haven't checked (yet), I'll do that later today and let you know.
>>
>> > (or any other online cpu)? Or does it trigger any lockdep warnings?
>
> It cannot be reproduced with offlining CPU1 and it doesn't trigger any
> warnings from lockdep.
>
> However, it is reproducible by doing
>
> # echo core > /sys/power/pm_test
>
> and repeating
>
> # echo disk > /sys/power/state
>
> for a couple of times, in which case the last two lines printed to the console
> before a (solid) hang are:
>
> SMP alternatives: switching to SMP code
> Booting processor 1 APIC 0x1 ip 0x6000
>
> So, it evidently fails while re-enabling the non-boot CPU and not during
> disabling it as I thought before.

Can you also provide the full log including the messages when a system
goes down please?

At first glance, "Booting processor..." as the last message looks
strange in this context.
So either wakeup_secondary_cpu()'s completion failed for some reason
(say, due to some kind of a problem that took place while disabling
non-boot cpus... I'm purely speculating here so far) or the printk's
output was not complete.

Perhaps, redoing the test with pr_debug() in arch/x86/kernel/smpboot.c
enabled would shed more light...


-- 
Best regards,
Dmitry Adamushko

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 21:28         ` Dmitry Adamushko
@ 2008-11-11 23:43           ` Rafael J. Wysocki
  0 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-11 23:43 UTC (permalink / raw)
  To: Dmitry Adamushko
  Cc: Heiko Carstens, Linux Kernel Mailing List, Kernel Testers List,
	Rusty Russell

On Tuesday, 11 of November 2008, Dmitry Adamushko wrote:
> 2008/11/10 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Monday, 10 of November 2008, Rafael J. Wysocki wrote:
> >> On Monday, 10 of November 2008, Heiko Carstens wrote:
> >> > On Sun, Nov 09, 2008 at 06:59:16PM +0100, Rafael J. Wysocki wrote:
> >> > > This message has been generated automatically as a part of a report
> >> > > of recent regressions.
> >> > >
> >> > > The following bug entry is on the current list of known regressions
> >> > > from 2.6.27.  Please verify if it still should be listed and let me know
> >> > > (either way).
> >> > >
> >> > >
> >> > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11989
> >> > > Subject           : Suspend failure on NForce4-based boards due to chanes in stop_machine
> >> > > Submitter : Rafael J. Wysocki <rjw@sisk.pl>
> >> > > Date              : 2008-11-03 0:28 (7 days old)
> >> > > First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
> >> > > References        : http://marc.info/?l=linux-kernel&m=122567187604356&w=4
> >> >
> >> > Hi Rafael,
> >>
> >> Hi,
> >>
> >> > could you provide more informations for this, please?
> >> >
> >> > What is your kernel configuration?
> >>
> >> Available at: http://www.sisk.pl/kernel/debug/mainline/2.6.28-rc3/kitty-config
> >>
> >> > Do you have any binary only modules (nvidia?) loaded?
> >>
> >> No, I don't.
> >>
> >> > Is it possible to recreate the bug by e.g. just doing something like
> >> >
> >> > echo 0 > /sys/devices/system/cpu/cpu1/online
> >>
> >> I haven't checked (yet), I'll do that later today and let you know.
> >>
> >> > (or any other online cpu)? Or does it trigger any lockdep warnings?
> >
> > It cannot be reproduced with offlining CPU1 and it doesn't trigger any
> > warnings from lockdep.
> >
> > However, it is reproducible by doing
> >
> > # echo core > /sys/power/pm_test
> >
> > and repeating
> >
> > # echo disk > /sys/power/state
> >
> > for a couple of times, in which case the last two lines printed to the console
> > before a (solid) hang are:
> >
> > SMP alternatives: switching to SMP code
> > Booting processor 1 APIC 0x1 ip 0x6000
> >
> > So, it evidently fails while re-enabling the non-boot CPU and not during
> > disabling it as I thought before.
> 
> Can you also provide the full log including the messages when a system
> goes down please?
> 
> At first glance, "Booting processor..." as the last message looks
> strange in this context.
> So either wakeup_secondary_cpu()'s completion failed for some reason
> (say, due to some kind of a problem that took place while disabling
> non-boot cpus... I'm purely speculating here so far) or the printk's
> output was not complete.
>
> Perhaps, redoing the test with pr_debug() in arch/x86/kernel/smpboot.c
> enabled would shed more light...

Will do tomorrow.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 16:31             ` Oleg Nesterov
@ 2008-11-12  3:30               ` Rusty Russell
  0 siblings, 0 replies; 106+ messages in thread
From: Rusty Russell @ 2008-11-12  3:30 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Vegard Nossum, Ingo Molnar, Rafael J. Wysocki, Heiko Carstens,
	Linux Kernel Mailing List, Kernel Testers List, Peter Zijlstra,
	Dmitry Adamushko, Andrew Morton

On Wednesday 12 November 2008 03:01:18 Oleg Nesterov wrote:
> On 11/11, Vegard Nossum wrote:
> > I think that the test for stop_machine_data in stop_cpu() should not
> > have been moved from __stop_machine(). Because now cpu_online_map may
> > change in-between calls to stop_cpu() (if the callback tries to
> > online/offline CPUs), and the end result may be different.
>
> I don't think this is possible, the callback must not be called unless
> all threads ack (at least) the STOPMACHINE_PREPARE state.
>
>
> Off-topic question, __stop_machine() does:
>
> 	/* Schedule the stop_cpu work on all cpus: hold this CPU so one
> 	 * doesn't hit this CPU until we're ready. */
> 	get_cpu();
> 	for_each_online_cpu(i) {
> 		sm_work = percpu_ptr(stop_machine_work, i);
> 		INIT_WORK(sm_work, stop_cpu);
> 		queue_work_on(i, stop_machine_wq, sm_work);
> 	}
> 	/* This will release the thread on our CPU. */
> 	put_cpu();
>
> Don't we actually need preempt_disable/preempt_enable instead of
> get/put cpu? (yes, they're the same currently). We don't care about
> the CPU we are running on, and it can't go away until we queue all
> works. But we must ensure that stop_cpu() on the same CPU can't
> preempt us, right?

A subtle distinction, but yes.  It used to be true before the recent changes, 
where we manually did "this" cpu.

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine
  2008-11-11 10:52         ` Ingo Molnar
                             ` (2 preceding siblings ...)
  2008-11-11 14:47           ` Vegard Nossum
@ 2008-11-12  3:39           ` Rusty Russell
  2008-11-15 13:37             ` Rafael J. Wysocki
  3 siblings, 1 reply; 106+ messages in thread
From: Rusty Russell @ 2008-11-12  3:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rafael J. Wysocki, Heiko Carstens, Linux Kernel Mailing List,
	Kernel Testers List, Vegard Nossum, Peter Zijlstra,
	Oleg Nesterov, Dmitry Adamushko, Andrew Morton

On Tuesday 11 November 2008 21:22:14 Ingo Molnar wrote:
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > So, it evidently fails while re-enabling the non-boot CPU and not
> > during disabling it as I thought before.

(Resend, due to HTML version previously)

But what is calling stop_machine in that path?

There *is* a race, but I don't think it could cause this (we should make a
copy of active.fnret inside the lock before returning it).

Two patches: one fixes that race, the next adds debugging spew.
stop_machine: fix race with return value
We should not access active.fnret outside the lock; in theory the nextstop_machine could overwrite it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>--- kernel/stop_machine.c |    5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff -r d7c9a15da615 kernel/stop_machine.c--- a/kernel/stop_machine.c	Mon Nov 10 09:47:45 2008 +1100+++ b/kernel/stop_machine.c	Tue Nov 11 23:19:47 2008 +1030@@ -112,7 +112,7 @@ int __stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus) { 	struct work_struct *sm_work;-	int i;+	int i, ret;  	/* Set up initial state. */ 	mutex_lock(&lock);@@ -137,8 +137,9 @@ 	/* This will release the thread on our CPU. */ 	put_cpu(); 	flush_workqueue(stop_machine_wq);+	ret = active.fnret; 	mutex_unlock(&lock);-	return active.fnret;+	return ret; }  int stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)===diff -r fe7dd39b1cff kernel/stop_machine.c--- a/kernel/stop_machine.c	Wed Nov 12 14:07:18 2008 +1030+++ b/kernel/stop_machine.c	Wed Nov 12 14:09:08 2008 +1030@@ -89,6 +89,8 @@ 			case STOPMACHINE_RUN: 				/* On multiple CPUs only a single error code 				 * is needed to tell that something failed. */+				printk("stop_machine: %i running %p\n",+				       smp_processor_id(), smdata->fn); 				err = smdata->fn(smdata->data); 				if (err) 					smdata->fnret = err;@@ -106,6 +108,7 @@ /* Callback for CPUs which aren't supposed to do anything. */ static int chill(void *unused) {+	printk("stop_machine: %i chilling\n", smp_processor_id()); 	return 0; } @@ -126,17 +129,23 @@  	set_state(STOPMACHINE_PREPARE); +	printk("stop_machine: running on %i cpus:\n", num_threads);+	dump_stack();+ 	/* Schedule the stop_cpu work on all cpus: hold this CPU so one 	 * doesn't hit this CPU until we're ready. */ 	get_cpu(); 	for_each_online_cpu(i) {+		printk("stop_machine: setting up cpu %i\n", i); 		sm_work = percpu_ptr(stop_machine_work, i); 		INIT_WORK(sm_work, stop_cpu); 		queue_work_on(i, stop_machine_wq, sm_work); 	} 	/* This will release the thread on our CPU. 
*/+	printk("stop_machine: releasing CPU %i\n", smp_processor_id()); 	put_cpu(); 	flush_workqueue(stop_machine_wq);+	printk("stop_machine: done\n"); 	ret = active.fnret; 	mutex_unlock(&lock); 	return ret;\0ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-11 17:34                       ` Paul E. McKenney
@ 2008-11-12  9:05                         ` Heiko Carstens
  2008-11-12 16:03                           ` Paul E. McKenney
  0 siblings, 1 reply; 106+ messages in thread
From: Heiko Carstens @ 2008-11-12  9:05 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt

On Tue, Nov 11, 2008 at 09:34:51AM -0800, Paul E. McKenney wrote:
> On Tue, Nov 11, 2008 at 08:45:23AM -0800, Paul E. McKenney wrote:
> > On Tue, Nov 11, 2008 at 05:14:01PM +0100, Heiko Carstens wrote:
> > > > > Could you please apply the following debug patch (due to Jiangshan and
> > > > > myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> > > > > then mount debugfs after boot, for example, on /debug.  This will
> > > > > create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> > > > > and "rcu_bh_data".  Since you are still able to log in, could you
> > > > > please send the contents of these three files?
> > > > > 
> > > > > 							Thanx, Paul
> > > > 
> > > > This time with the patch actually attached...  Thanks to Peter Z.
> > > > for alerting me to my omission.
> > > 
> > > Well, your patch doesn't apply on git head. However I used preemptible
> > > RCU instead and had tracing enabled.
> > 
> > Were you using preemptible RCU earlier as well?  Rafael was using
> > classic RCU.  Don't get me wrong, all problems need fixing, just trying
> > to make sure I understand where the problems are occurring.

Indeed, my fault. I am now trying to reproduce a cpu hotplug bug with classic
RCU and a cpu hotplug stress test, but no luck so far.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-12  9:05                         ` Heiko Carstens
@ 2008-11-12 16:03                           ` Paul E. McKenney
  2008-11-12 16:51                             ` Heiko Carstens
  0 siblings, 1 reply; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-12 16:03 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt, manfred

On Wed, Nov 12, 2008 at 10:05:08AM +0100, Heiko Carstens wrote:
> On Tue, Nov 11, 2008 at 09:34:51AM -0800, Paul E. McKenney wrote:
> > On Tue, Nov 11, 2008 at 08:45:23AM -0800, Paul E. McKenney wrote:
> > > On Tue, Nov 11, 2008 at 05:14:01PM +0100, Heiko Carstens wrote:
> > > > > > Could you please apply the following debug patch (due to Jiangshan and
> > > > > > myself)?  Then you should be able to build with CONFIG_RCU_TRACE,
> > > > > > then mount debugfs after boot, for example, on /debug.  This will
> > > > > > create a /debug/rcu directory with three files, "rcucb", "rcu_data",
> > > > > > and "rcu_bh_data".  Since you are still able to log in, could you
> > > > > > please send the contents of these three files?
> > > > > > 
> > > > > > 							Thanx, Paul
> > > > > 
> > > > > This time with the patch actually attached...  Thanks to Peter Z.
> > > > > for alerting me to my omission.
> > > > 
> > > > Well, your patch doesn't apply on git head. However I used preemptible
> > > > RCU instead and had tracing enabled.
> > > 
> > > Were you using preemptible RCU earlier as well?  Rafael was using
> > > classic RCU.  Don't get me wrong, all problems need fixing, just trying
> > > to make sure I understand where the problems are occurring.
> 
> Indeed, my fault. I am now trying to reproduce a cpu hotplug bug with classic
> RCU and a cpu hotplug stress test, but no luck so far.

OK, then my next step will be to send Rafael an updated version of
my hierarchical RCU, which is more robust than classic RCU against
online/offline stress tests.  On the machines I have access to, anyway.  ;-)

Then I will look at preemptible RCU, which undoubtedly needs some of the
same help that I have been giving to hierarchical RCU.  Manfred thus
wins the clairvoyance award!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-12 16:03                           ` Paul E. McKenney
@ 2008-11-12 16:51                             ` Heiko Carstens
  2008-11-12 19:43                               ` Paul E. McKenney
  0 siblings, 1 reply; 106+ messages in thread
From: Heiko Carstens @ 2008-11-12 16:51 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt, manfred

On Wed, Nov 12, 2008 at 08:03:49AM -0800, Paul E. McKenney wrote:
> On Wed, Nov 12, 2008 at 10:05:08AM +0100, Heiko Carstens wrote:
> > On Tue, Nov 11, 2008 at 09:34:51AM -0800, Paul E. McKenney wrote:
> > > > Were you using preemptible RCU earlier as well?  Rafael was using
> > > > classic RCU.  Don't get me wrong, all problems need fixing, just trying
> > > > to make sure I understand where the problems are occurring.
> > 
> > Indeed, my fault. I am now trying to reproduce a cpu hotplug bug with classic
> > RCU and a cpu hotplug stress test, but no luck so far.
> 
> OK, then my next step will be to send Rafael an updated version of
> my hierarchical RCU, which is more robust than classic RCU against
> online/offline stress tests.  On the machines I have access to, anyway.  ;-)
> 
> Then I will look at preemptible RCU, which undoubtedly needs some of the
> same help that I have been giving to hierarchical RCU.  Manfred thus
> wins the clairvoyance award!

Well, I tried all day long to reproduce a cpu hotplug/stop_machine hang
with classic RCU and a kernel configuration that is as close as possible
to Rafael's configuration, but it just continues to work without a bug.

One of the machines is a virtual machine with 8 virtual cpus mapped on
two real cpus. The real cpus are again shared with other guests. So I end
up with cpu steal times of 50-90%. That should have revealed races in the
stop_machine code, considering that thousands of cpu hotplug operations
happened.

I'll leave these test machines running overnight. Maybe something happens...
but at first glance it looks more like the reworked stop_machine code
triggers a different bug that is already present. Hmmm...

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-12 16:51                             ` Heiko Carstens
@ 2008-11-12 19:43                               ` Paul E. McKenney
  0 siblings, 0 replies; 106+ messages in thread
From: Paul E. McKenney @ 2008-11-12 19:43 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Rusty Russell, Vegard Nossum,
	Peter Zijlstra, Oleg Nesterov, Dmitry Adamushko, Andrew Morton,
	Steven Rostedt, manfred

On Wed, Nov 12, 2008 at 05:51:18PM +0100, Heiko Carstens wrote:
> On Wed, Nov 12, 2008 at 08:03:49AM -0800, Paul E. McKenney wrote:
> > On Wed, Nov 12, 2008 at 10:05:08AM +0100, Heiko Carstens wrote:
> > > On Tue, Nov 11, 2008 at 09:34:51AM -0800, Paul E. McKenney wrote:
> > > > > Were you using preemptible RCU earlier as well?  Rafael was using
> > > > > classic RCU.  Don't get me wrong, all problems need fixing, just trying
> > > > > to make sure I understand where the problems are occurring.
> > > 
> > > Indeed, my fault. I am now trying to reproduce a cpu hotplug bug with classic
> > > RCU and a cpu hotplug stress test, but no luck so far.
> > 
> > OK, then my next step will be to send Rafael an updated version of
> > my hierarchical RCU, which is more robust than classic RCU against
> > online/offline stress tests.  On the machines I have access to, anyway.  ;-)
> > 
> > Then I will look at preemptible RCU, which undoubtedly needs some of the
> > same help that I have been giving to hierarchical RCU.  Manfred thus
> > wins the clairvoyance award!
> 
> Well, I tried all day long to reproduce a cpu hotplug/stop_machine hang
> with classic RCU and a kernel configuration that is as close as possible
> to Rafael's configuration, but it just continues to work without a bug.
> 
> One of the machines is a virtual machine with 8 virtual cpus mapped on
> two real cpus. The real cpus are again shared with other guests. So I end
> up with cpu steal times of 50-90%. That should have revealed races in the
> stop_machine code, considering that thousands of cpu hotplug operations
> happened.
> 
> I'll leave these test machines running overnight. Maybe something happens...
> but at first glance it looks more like the reworked stop_machine code
> triggers a different bug that is already present. Hmmm...

I can make Classic RCU break in 2.6.28-rc3, but I need a 128-CPU machine to
break it.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-10  5:46   ` Benjamin Herrenschmidt
                       ` (2 preceding siblings ...)
  2008-11-10 20:39     ` Andreas Schwab
@ 2008-11-13 23:11     ` David Miller
  2008-11-14  0:54       ` Benjamin Herrenschmidt
  3 siblings, 1 reply; 106+ messages in thread
From: David Miller @ 2008-11-13 23:11 UTC (permalink / raw)
  To: benh; +Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Mon, 10 Nov 2008 16:46:25 +1100

> David, would you mind testing on your machine ? It's the one that shows
> the biggest performance improvement, and I would like to know how much
> it is affected by that patch. As long as the "worst case" performance
> is still reasonable, I'm ok to take the hit if the improvement for you
> is still significant.

Finally got around to this, we lose about a full second in the
"cat rfc3261.txt" benchmark:

2.6.28-rc4 vanilla:

7.634
7.704
7.688

2.6.28rc4+patch:

8.712
8.685
8.702
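
For reference, the two sets of timings above can be averaged to put a
number on the slowdown. The following is only a minimal sketch: the
function and array names are made up for illustration, and the samples
are just the six values quoted above. It works out to a mean slowdown of
roughly 1.02 seconds, or about 13% relative to vanilla.

```c
#include <assert.h>
#include <stddef.h>

/* Timings in seconds, copied from the six runs listed above. */
const double vanilla[] = { 7.634, 7.704, 7.688 };
const double patched[] = { 8.712, 8.685, 8.702 };

/* Arithmetic mean of n samples (n must be non-zero). */
double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Absolute slowdown in seconds: mean(patched) - mean(vanilla). */
double slowdown_seconds(void)
{
	return mean(patched, 3) - mean(vanilla, 3);
}

/* Slowdown as a percentage of the vanilla mean. */
double slowdown_percent(void)
{
	return 100.0 * slowdown_seconds() / mean(vanilla, 3);
}
```

Three runs per kernel is a small sample, so the figures are best read as
a rough indication and not as a precise measurement.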

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-13 23:11     ` David Miller
@ 2008-11-14  0:54       ` Benjamin Herrenschmidt
  2008-11-14  2:50         ` David Miller
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-14  0:54 UTC (permalink / raw)
  To: David Miller
  Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

On Thu, 2008-11-13 at 15:11 -0800, David Miller wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Mon, 10 Nov 2008 16:46:25 +1100
> 
> > David, would you mind testing on your machine ? It's the one that shows
> > the biggest performance improvement, and I would like to know how much
> > it is affected by that patch. As long as the "worst case" performance
> > is still reasonable, I'm ok to take the hit if the improvement for you
> > is still significant.
> 
> Finally got around to this, we lose about a full second in the
> "cat rfc3261.txt" benchmark:
> 
> 2.6.28-rc4 vanilla:
> 
> 7.634
> 7.704
> 7.688
> 
> 2.6.28rc4+patch:
> 
> 8.712
> 8.685
> 8.702

How does it compare with not having the acceleration? I.e., I don't think
I can do anything about it, except maybe optimize for the case where the
pixmap is already aligned (and thus doesn't need scissors). The main
question is whether the acceleration is still worth it at all, since it's
generally not worth it on other architectures.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-14  0:54       ` Benjamin Herrenschmidt
@ 2008-11-14  2:50         ` David Miller
  2008-11-14  3:04           ` David Miller
  0 siblings, 1 reply; 106+ messages in thread
From: David Miller @ 2008-11-14  2:50 UTC (permalink / raw)
  To: benh; +Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Fri, 14 Nov 2008 11:54:20 +1100

> How does it compare with not having the acceleration ?

I'll find out for you.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-14  2:50         ` David Miller
@ 2008-11-14  3:04           ` David Miller
  2008-11-14  3:29             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: David Miller @ 2008-11-14  3:04 UTC (permalink / raw)
  To: benh; +Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

From: David Miller <davem@davemloft.net>
Date: Thu, 13 Nov 2008 18:50:59 -0800 (PST)

> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Fri, 14 Nov 2008 11:54:20 +1100
> 
> > How does it compare with not having the acceleration ?
> 
> I'll find out for you.

It makes a huge difference, with the acceleration patch:

commit b1ee26bab14886350ba12a5c10cbc0696ac679bf
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Wed Oct 15 22:03:46 2008 -0700

    radeonfb: accelerate imageblit and other improvements

reverted, the test case takes 25 seconds or more instead of
the 7 or 8 seconds we're seeing now.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-14  3:04           ` David Miller
@ 2008-11-14  3:29             ` Benjamin Herrenschmidt
  2008-11-14  4:28               ` David Miller
  0 siblings, 1 reply; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-14  3:29 UTC (permalink / raw)
  To: David Miller
  Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds


> It makes a huge difference, with the acceleration patch:
> 
> commit b1ee26bab14886350ba12a5c10cbc0696ac679bf
> Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date:   Wed Oct 15 22:03:46 2008 -0700
> 
>     radeonfb: accelerate imageblit and other improvements
> 
> reverted, the test case takes 25 seconds or more instead of
> the 7 or 8 seconds we're seeing now.

Ok, thanks a lot for those tests !

So I consider the performance loss due to the workaround to be minor enough
here. I'll submit the patch for inclusion.

I might look later at skipping the clipping when things are already
aligned, but I doubt it's going to be worth the pain.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-14  3:29             ` Benjamin Herrenschmidt
@ 2008-11-14  4:28               ` David Miller
  2008-11-14  8:51                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 106+ messages in thread
From: David Miller @ 2008-11-14  4:28 UTC (permalink / raw)
  To: benh; +Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Fri, 14 Nov 2008 14:29:11 +1100

> 
> So I consider the performance loss due to the workaround to be minor enough
> here. I'll submit the patch for inclusion.

BTW, there is a warning generated by this fix, the src_bytes
variable becomes unused or something like that.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-14  4:28               ` David Miller
@ 2008-11-14  8:51                 ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-14  8:51 UTC (permalink / raw)
  To: David Miller
  Cc: rjw, linux-kernel, kernel-testers, akpm, cloos, paul, torvalds

On Thu, 2008-11-13 at 20:28 -0800, David Miller wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Fri, 14 Nov 2008 14:29:11 +1100
> 
> > 
> > So I consider the performance loss due to the workaround to be minor enough
> > here. I'll submit the patch for inclusion.
> 
> BTW, there is a warning generated by this fix, the src_bytes
> variable becomes unused or something like that.

Ok thanks. I'll check that asap. I think I did indeed remove the use of some
intermediate variable, and probably forgot to remove its declaration
too.

Cheers,
Ben.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11988] Eliminate recursive mutex in compat fb ioctl path
  2008-11-09 17:59 ` [Bug #11988] Eliminate recursive mutex in compat fb ioctl path Rafael J. Wysocki
@ 2008-11-14 14:51   ` Geert Uytterhoeven
  2008-11-15 11:51     ` Rafael J. Wysocki
  0 siblings, 1 reply; 106+ messages in thread
From: Geert Uytterhoeven @ 2008-11-14 14:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux Kernel Mailing List, Kernel Testers List, Keith Packard

On Sun, 9 Nov 2008, Rafael J. Wysocki wrote:
> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11988
> Subject		: Eliminate recursive mutex in compat fb ioctl path
> Submitter	: Keith Packard <keithp@keithp.com>
> Date		: 2008-11-03 7:06 (7 days old)
> References	: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
> Handled-By	: Keith Packard <keithp@keithp.com>
> 		  Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
> Patch		: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
> 		  http://lkml.org/lkml/2008/10/31/162

Fixed in mainline.

commit a684e7d33096892093456dd56a582cfc3bfad648
Author: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Date:   Thu Nov 6 12:53:37 2008 -0800

    fbdev: fix fb_compat_ioctl() deadlocks

    commit 3e680aae4e53ab54cdbb0c29257dae0cbb158e1c ("fb: convert
    lock/unlock_kernel() into local fb mutex") introduced several deadlocks
    in the fb_compat_ioctl() path, as mutex_lock() doesn't allow recursion,
    unlike lock_kernel().  This broke frame buffer applications on 64-bit
    systems with a 32-bit userland.
    
    commit 120a37470c2831fea49fdebaceb5a7039f700ce6 ("framebuffer compat_ioctl
    deadlock") fixed one of the deadlocks.
    
    This patch fixes the remaining deadlocks:
      - Revert commit 120a37470c2831fea49fdebaceb5a7039f700ce6,
      - Extract the core logic of fb_ioctl() into a new function do_fb_ioctl(),
      - Change all callsites of fb_ioctl() where info->lock is already held to
        call do_fb_ioctl() instead,
      - Add sparse annotations to all routines that take info->lock.
    
    Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
    Cc: Mikulas Patocka <mpatocka@redhat.com>
    Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
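
The pattern the commit message describes (a locking wrapper around a
lock-free core, with already-locked callers going straight to the core)
can be sketched in userspace with a pthread mutex. This is only an
illustration under stated assumptions: struct dev_info, the CMD_*
constants, and the dev_*() functions are hypothetical stand-ins, not the
actual fbdev code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fb_info: a lock plus some state. */
struct dev_info {
	pthread_mutex_t lock;
	int mode;
};

/* Hypothetical command numbers, for illustration only. */
enum { CMD_GET_MODE = 1, CMD_SET_MODE = 2 };

/* Core logic. The caller must already hold info->lock (cf. do_fb_ioctl()). */
int do_dev_ioctl(struct dev_info *info, int cmd, int arg)
{
	assert(info != NULL);
	switch (cmd) {
	case CMD_GET_MODE:
		return info->mode;
	case CMD_SET_MODE:
		info->mode = arg;
		return 0;
	default:
		return -1;
	}
}

/* Regular entry point: take the lock, then call the core (cf. fb_ioctl()). */
int dev_ioctl(struct dev_info *info, int cmd, int arg)
{
	int ret;

	pthread_mutex_lock(&info->lock);
	ret = do_dev_ioctl(info, cmd, arg);
	pthread_mutex_unlock(&info->lock);
	return ret;
}

/*
 * Compat entry point that holds the lock itself. It must call the core
 * directly: calling dev_ioctl() here would try to take a non-recursive
 * mutex twice and deadlock.
 */
int dev_compat_ioctl(struct dev_info *info, int cmd, int arg)
{
	int ret;

	pthread_mutex_lock(&info->lock);
	/* Argument translation for the 32-bit caller would go here. */
	ret = do_dev_ioctl(info, cmd, arg);
	pthread_mutex_unlock(&info->lock);
	return ret;
}
```

A caller would initialise the lock with PTHREAD_MUTEX_INITIALIZER and use
either entry point; only do_dev_ioctl() assumes the lock is already held.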

With kind regards,

Geert Uytterhoeven
Software Architect

Sony Techsoft Centre Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium

Phone:    +32 (0)2 700 8453
Fax:      +32 (0)2 700 8622
E-mail:   Geert.Uytterhoeven@sonycom.com
Internet: http://www.sony-europe.com/

A division of Sony Europe (Belgium) N.V.
VAT BE 0413.825.160 · RPR Brussels
Fortis · BIC GEBABEBB · IBAN BE41293037680010

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11988] Eliminate recursive mutex in compat fb ioctl path
  2008-11-14 14:51   ` Geert Uytterhoeven
@ 2008-11-15 11:51     ` Rafael J. Wysocki
  0 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-15 11:51 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linux Kernel Mailing List, Kernel Testers List, Keith Packard

On Friday, 14 of November 2008, Geert Uytterhoeven wrote:
> On Sun, 9 Nov 2008, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.27.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11988
> > Subject		: Eliminate recursive mutex in compat fb ioctl path
> > Submitter	: Keith Packard <keithp@keithp.com>
> > Date		: 2008-11-03 7:06 (7 days old)
> > References	: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
> > Handled-By	: Keith Packard <keithp@keithp.com>
> > 		  Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
> > Patch		: http://marc.info/?l=linux-kernel&m=122569604828448&w=4
> > 		  http://lkml.org/lkml/2008/10/31/162
> 
> Fixed in mainline.
> 
> commit a684e7d33096892093456dd56a582cfc3bfad648
> Author: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
> Date:   Thu Nov 6 12:53:37 2008 -0800
> 
>     fbdev: fix fb_compat_ioctl() deadlocks
> 
>     commit 3e680aae4e53ab54cdbb0c29257dae0cbb158e1c ("fb: convert
>     lock/unlock_kernel() into local fb mutex") introduced several deadlocks
>     in the fb_compat_ioctl() path, as mutex_lock() doesn't allow recursion,
>     unlike lock_kernel().  This broke frame buffer applications on 64-bit
>     systems with a 32-bit userland.
>     
>     commit 120a37470c2831fea49fdebaceb5a7039f700ce6 ("framebuffer compat_ioctl
>     deadlock") fixed one of the deadlocks.
>     
>     This patch fixes the remaining deadlocks:
>       - Revert commit 120a37470c2831fea49fdebaceb5a7039f700ce6,
>       - Extract the core logic of fb_ioctl() into a new function do_fb_ioctl(),
>       - Change all callsites of fb_ioctl() where info->lock is already held to
>         call do_fb_ioctl() instead,
>       - Add sparse annotations to all routines that take info->lock.
>     
>     Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
>     Cc: Mikulas Patocka <mpatocka@redhat.com>
>     Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
>     Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Thanks, closed.

Rafael

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-12  3:39           ` Rusty Russell
@ 2008-11-15 13:37             ` Rafael J. Wysocki
  0 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-15 13:37 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Ingo Molnar, Heiko Carstens, Linux Kernel Mailing List,
	Kernel Testers List, Vegard Nossum, Peter Zijlstra,
	Oleg Nesterov, Dmitry Adamushko, Andrew Morton

On Wednesday, 12 of November 2008, Rusty Russell wrote:
> On Tuesday 11 November 2008 21:22:14 Ingo Molnar wrote:
> > * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > So, it evidently fails while re-enabling the non-boot CPU and not
> > > during disabling it as I thought before.
> 
> (Resend, due to HTML version previously)
> 
> But what is calling stop_machine in that path?
> 
> There *is* a race, but I don't think it could cause this (we should make a
> copy of active.fnret inside the lock before returning it).

Still, that seems to be the case.

> Two patches: one fixes that race, the next adds debugging spew.
> 
> stop_machine: fix race with return value

With this patch applied (reproduced below for clarity) the problem is not
reproducible any more.

Care to push it upstream ASAP?

Thanks,
Rafael

---
stop_machine: fix race with return value

We should not access active.fnret outside the lock; in theory the next
stop_machine could overwrite it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 kernel/stop_machine.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff -r d7c9a15da615 kernel/stop_machine.c
--- a/kernel/stop_machine.c	Mon Nov 10 09:47:45 2008 +1100
+++ b/kernel/stop_machine.c	Tue Nov 11 23:19:47 2008 +1030
@@ -112,7 +112,7 @@
 int __stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
 {
 	struct work_struct *sm_work;
-	int i;
+	int i, ret;
 
 	/* Set up initial state. */
 	mutex_lock(&lock);
@@ -137,8 +137,9 @@
 	/* This will release the thread on our CPU. */
 	put_cpu();
 	flush_workqueue(stop_machine_wq);
+	ret = active.fnret;
 	mutex_unlock(&lock);
-	return active.fnret;
+	return ret;
 }
 
 int stop_machine(int (*fn)(void *), void *data, const cpumask_t *cpus)
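
The idea behind the fix above (snapshot the shared result while the lock
is still held, and return the snapshot) can be shown in a small userspace
sketch. It is an illustration only: sm_lock, the active struct,
run_locked() and return_arg() are hypothetical names using a pthread
mutex, not the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Shared state, protected by sm_lock (loosely modelled on "active"). */
static pthread_mutex_t sm_lock = PTHREAD_MUTEX_INITIALIZER;
static struct {
	int fnret;
} active;

/* Example callback: returns the int that data points to. */
int return_arg(void *data)
{
	return *(int *)data;
}

/*
 * Run fn(data) under the lock and return its result.
 *
 * The result is copied into a local variable before the lock is dropped.
 * Reading active.fnret after pthread_mutex_unlock() would race with the
 * next caller, which may already have overwritten it.
 */
int run_locked(int (*fn)(void *), void *data)
{
	int ret;

	assert(fn != NULL);
	pthread_mutex_lock(&sm_lock);
	active.fnret = fn(data);
	ret = active.fnret;	/* snapshot while we still own the lock */
	pthread_mutex_unlock(&sm_lock);
	return ret;		/* do not re-read active.fnret here */
}
```

The local copy costs nothing and makes the return value independent of
whatever the next lock holder writes into the shared structure.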


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-11  9:31                 ` Andreas Schwab
  2008-11-11 11:30                   ` Benjamin Herrenschmidt
@ 2008-11-21  2:55                   ` Benjamin Herrenschmidt
  2008-11-21  3:02                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-21  2:55 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

On Tue, 2008-11-11 at 10:31 +0100, Andreas Schwab wrote:
> It looks like you are observing the same failure mode that I do.

The lockup when shutting down isn't happening for me anymore with recent
X (ubuntu intrepid) btw.

I haven't quite figured out what's up yet.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Bug #11875] radeonfb lockup in .28-rc (bisected)
  2008-11-11  9:31                 ` Andreas Schwab
  2008-11-11 11:30                   ` Benjamin Herrenschmidt
  2008-11-21  2:55                   ` Benjamin Herrenschmidt
@ 2008-11-21  3:02                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 106+ messages in thread
From: Benjamin Herrenschmidt @ 2008-11-21  3:02 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
	Kernel Testers List, Andrew Morton, David S. Miller, James Cloos,
	Paul Collins, Linus Torvalds

On Tue, 2008-11-11 at 10:31 +0100, Andreas Schwab wrote:
> It looks like you are observing the same failure mode that I do.

BTW, I've been running a torture script that does an ls -lR / in a
console and constantly switches (chvt) between that console and X, and so
far I haven't got it to crash...

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* [Bug #11989] Suspend failure on NForce4-based boards due to changes in stop_machine
  2008-11-16 16:24 2.6.28-rc5: Reported regressions from 2.6.27 Rafael J. Wysocki
@ 2008-11-16 16:35 ` Rafael J. Wysocki
  0 siblings, 0 replies; 106+ messages in thread
From: Rafael J. Wysocki @ 2008-11-16 16:35 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Kernel Testers List, Heiko Carstens, Rafael J. Wysocki, Rusty Russell

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.27.  Please verify if it still should be listed and let me know
(either way).


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11989
Subject		: Suspend failure on NForce4-based boards due to changes in stop_machine
Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
Date		: 2008-11-03 0:28 (14 days old)
First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c9583e55fa2b08a230c549bd1e3c0bde6c50d9cc
References	: http://marc.info/?l=linux-kernel&m=122567187604356&w=4
Handled-By	: Rusty Russell <rusty@rustcorp.com.au>
Patch		: http://lkml.org/lkml/2008/11/15/69



^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2008-11-21  3:03 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-09 17:53 2.6.28-rc3-git6: Reported regressions from 2.6.27 Rafael J. Wysocki
2008-11-09 17:53 ` [Bug #11799] xorg can not start up with stolen memory Rafael J. Wysocki
2008-11-09 17:54 ` [Bug #11806] iwl3945 fails with microcode error Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11849] default IRQ affinity change in v2.6.27 (breaking several SMP PPC based systems) Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11834] iwl3945: if I leave my machine running overnight, wifi will not work in the morning Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11822] ACPI Warning (nspredef-0858): _SB_.PCI0.LPC_.EC__.BAT0._BIF: Return Package type mismatch at index 9 - found Buffer, expected String [20080926] Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11826] extreme slowness of IO stuff using 2.6.28-rc1 Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11841] plenty of line "ACPI: EC: non-query interrupt received, switching to interrupt mode" in dmesg and system not powering down Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11891] resume from disk broken on hp/compaq nx7000 (DRM problem) Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11875] radeonfb lockup in .28-rc (bisected) Rafael J. Wysocki
2008-11-09 21:15   ` Benjamin Herrenschmidt
2008-11-10  5:46   ` Benjamin Herrenschmidt
2008-11-10  7:13     ` Paul Collins
2008-11-10  9:05       ` Benjamin Herrenschmidt
2008-11-10  9:06     ` David Miller
2008-11-10 20:39     ` Andreas Schwab
2008-11-10 21:52       ` Benjamin Herrenschmidt
2008-11-10 23:20         ` Andreas Schwab
2008-11-10 23:34           ` Benjamin Herrenschmidt
2008-11-10 23:54             ` Andreas Schwab
2008-11-11  1:49               ` Benjamin Herrenschmidt
2008-11-11  2:47                 ` Linus Torvalds
2008-11-11  3:21                   ` Benjamin Herrenschmidt
2008-11-11  9:31                 ` Andreas Schwab
2008-11-11 11:30                   ` Benjamin Herrenschmidt
2008-11-21  2:55                   ` Benjamin Herrenschmidt
2008-11-21  3:02                   ` Benjamin Herrenschmidt
2008-11-13 23:11     ` David Miller
2008-11-14  0:54       ` Benjamin Herrenschmidt
2008-11-14  2:50         ` David Miller
2008-11-14  3:04           ` David Miller
2008-11-14  3:29             ` Benjamin Herrenschmidt
2008-11-14  4:28               ` David Miller
2008-11-14  8:51                 ` Benjamin Herrenschmidt
2008-11-09 17:59 ` [Bug #11873] unable to mount ext3 root filesystem due to htree_dirblock_to_tree Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11858] Timeout regression introduced by 242f9dcb8ba6f68fcd217a119a7648a4f69290e9 Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11896] [2.6.28-rc2] EeePC ACPI errors & exceptions Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11899] sometime boot failed on T61 laptop Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11895] 2.6.28-rc2 regression: keyboard dead after reboot on Toshiba Portege 4000 Rafael J. Wysocki
2008-11-10 16:53   ` Andrey Borzenkov
2008-11-10 18:06     ` Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11898] mke2fs hang on AIC79 device Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11903] regression: vmalloc easily fail Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11906] 2.6.28-rc2 seems to fail at powering down the monitor when it should Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11905] lots of extra timer interrupts costing 2W Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11913] USB/INPUT: slab error in cache_alloc_debugcheck_after(): double free? Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11911] new PCMCIA device instance after resume - orinoco can't download firmware Rafael J. Wysocki
2008-11-10  3:55   ` Andrey Borzenkov
2008-11-09 17:59 ` [Bug #11908] linux-2.6.28-rc2 regression : oprofile doesnt work anymore Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11917] Asus Eee PC hotkeys stop working after prolonged usage Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11928] ath5k gets lost with eeepc-laptop removal Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11937] ext3 __log_wait_for_space: no transactions Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11925] cdrom: missing compat ioctls Rafael J. Wysocki
2008-11-09 23:00   ` Andreas Schwab
2008-11-09 23:29     ` Rafael J. Wysocki
2008-11-09 23:39       ` Andreas Schwab
2008-11-09 17:59 ` [Bug #11942] AMD64 reboot regression Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11947] 2.6.28-rc VC switching with Intel graphics broken Rafael J. Wysocki
2008-11-11  9:28   ` Romano Giannetti
2008-11-09 17:59 ` [Bug #11958] [2.6.27.x => 2.6.28-rc3] Xorg crash with xf86MapVidMem error Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11965] regression introduced by - timers: fix itimer/many thread hang Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11984] regression when switching TTY->X, input related? Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11982] Fan level 7 after resume wit 2.6.28-rc3 Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11985] 2.6.28-rc3 truncates nfsd results Rafael J. Wysocki
2008-11-09 21:05   ` J. Bruce Fields
2008-11-09 17:59 ` [Bug #11970] gettimeofday return a old time in mmbench Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11986] 2.6.28-rc2-git1: spitz still won't boot Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11988] Eliminate recursive mutex in compat fb ioctl path Rafael J. Wysocki
2008-11-14 14:51   ` Geert Uytterhoeven
2008-11-15 11:51     ` Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11994] Computer doesn't power down after commit CPI: EC: do transaction from interrupt context Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Rafael J. Wysocki
2008-11-10 12:04   ` Heiko Carstens
2008-11-10 14:47     ` Rafael J. Wysocki
2008-11-10 22:55       ` Rafael J. Wysocki
2008-11-11 10:52         ` Ingo Molnar
2008-11-11 11:31           ` Heiko Carstens
2008-11-11 12:42             ` Heiko Carstens
2008-11-11 13:13               ` Ingo Molnar
2008-11-11 14:35               ` Paul E. McKenney
2008-11-11 15:01                 ` Heiko Carstens
2008-11-11 16:17                   ` Paul E. McKenney
2008-11-11 15:02                 ` Paul E. McKenney
2008-11-11 16:14                   ` Heiko Carstens
2008-11-11 16:45                     ` Paul E. McKenney
2008-11-11 17:34                       ` Paul E. McKenney
2008-11-12  9:05                         ` Heiko Carstens
2008-11-12 16:03                           ` Paul E. McKenney
2008-11-12 16:51                             ` Heiko Carstens
2008-11-12 19:43                               ` Paul E. McKenney
2008-11-11 17:03                   ` Q: force_quiescent_state && cpu_online_map Oleg Nesterov
2008-11-11 17:25                     ` Paul E. McKenney
2008-11-11 13:36           ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Vegard Nossum
2008-11-11 13:46             ` Vegard Nossum
2008-11-11 13:49             ` Peter Zijlstra
2008-11-11 14:47           ` Vegard Nossum
2008-11-11 15:11             ` Dmitry Adamushko
2008-11-11 16:31             ` Oleg Nesterov
2008-11-12  3:30               ` Rusty Russell
2008-11-12  3:39           ` Rusty Russell
2008-11-15 13:37             ` Rafael J. Wysocki
2008-11-11 21:28         ` Dmitry Adamushko
2008-11-11 23:43           ` Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11987] Bootup time regression from 2.6.27 to 2.6.28-rc3+ Rafael J. Wysocki
2008-11-09 17:59 ` [Bug #11996] Tracing framework regression in 2.6.28-rc3 Rafael J. Wysocki
2008-11-16 16:24 2.6.28-rc5: Reported regressions from 2.6.27 Rafael J. Wysocki
2008-11-16 16:35 ` [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine Rafael J. Wysocki
