LKML Archive on lore.kernel.org
* 2.6.25-rc5: Reported regressions from 2.6.24
@ 2008-03-10 23:14 Rafael J. Wysocki
  2008-03-11  0:40 ` Jeff Garzik
                   ` (3 more replies)
  0 siblings, 4 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2008-03-10 23:14 UTC (permalink / raw)
  To: LKML; +Cc: Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich

[We have closed some entries since yesterday, located some patches and found
a couple of new regressions, so here's an updated report.]

This message contains a list of some regressions from 2.6.24 reported since
2.6.25-rc1 was released, for which there are no fixes in the mainline I know
of.  If any of them have been fixed already, please let me know.

If you know of any other unresolved regressions from 2.6.24, please let me know
as well and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2008-03-11      141       58          43
  2008-03-10      138       66          47
  2008-03-03      115       65          49
  2008-02-25       90       51          39
  2008-02-17       61       45          37
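
One rough way to read the table above is to take Total minus Pending as the
number of entries closed so far. That interpretation is an assumption made
here for illustration; the report itself does not define the columns. A
minimal Python sketch:

```python
# Rows copied from the statistics table: (date, total, pending, unresolved).
STATS = [
    ("2008-03-11", 141, 58, 43),
    ("2008-03-10", 138, 66, 47),
    ("2008-03-03", 115, 65, 49),
    ("2008-02-25", 90, 51, 39),
    ("2008-02-17", 61, 45, 37),
]


def closed_counts(rows):
    """Return {date: total - pending}.

    Treating this difference as the number of closed entries is an
    assumption, not something stated in the report.
    """
    return {date: total - pending for date, total, pending, _ in rows}


if __name__ == "__main__":
    for date, closed in closed_counts(STATS).items():
        print(f"{date}  closed={closed}")
```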


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9954
Subject		: iwl3945: not only it periodically dies, it also BUG()s
Submitter	: Pavel Machek <pavel@ucw.cz>
Date		: 2008-02-05 22:44
References	: http://lkml.org/lkml/2008/2/5/453
Handled-By	: Chatre, Reinette <reinette.chatre@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9958
Subject		: parisc compile error
Submitter	: Adrian Bunk <bunk@kernel.org>
Date		: 2008-02-08 01:12
References	: http://lkml.org/lkml/2008/2/7/572
Handled-By	: Kyle McMartin <kyle@mcmartin.ca>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9962
Subject		: mount: could not find filesystem
Submitter	: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Date		: 2008-02-12 14:34
References	: http://lkml.org/lkml/2008/2/12/91
Handled-By	: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
		  Yinghai Lu <yhlu.kernel@gmail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9976
Subject		: BUG: 2.6.25-rc1: iptables postrouting setup causes oops
Submitter	: Ben Nizette <bn@niasdigital.com>
Date		: 2008-02-12 12:46
References	: http://lkml.org/lkml/2008/2/12/148
Handled-By	: Haavard Skinnemoen <hskinnemoen@atmel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9978
Subject		: 2.6.25-rc1: volanoMark 45% regression
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		:  Fri Jan 25 21:08:00 2008 +0100
References	: http://lkml.org/lkml/2008/2/13/128
Handled-By	: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
		  Balbir Singh <balbir@linux.vnet.ibm.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9980
Subject		: 2.6.25-rc1 on Sun Ultra 40
Submitter	: Jasper Bryant-Greene <jasper@unix.geek.nz>
Date		: 2008-02-13 12:25
References	: http://lkml.org/lkml/2008/2/13/181
Handled-By	: Yinghai Lu <Yinghai.Lu@sun.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9983
Subject		: PROBLEM: 2.6.25-rc1-git2 freezes when accessing external USB hard disk (ehci-hcd)
Submitter	: Linas Žvirblis <0x0007@gmail.com>
Date		: 2008-02-13 22:38
References	: http://lkml.org/lkml/2008/2/13/566


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9984
Subject		: problem with starting 2.6.25-rc1 and latest git
Submitter	: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Date		: 2008-02-13 23:16
References	: http://lkml.org/lkml/2008/2/13/587


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9992
Subject		: 2.6.24-git: kmap_atomic() WARN_ON()
Submitter	: Thomas Gleixner <tglx@linutronix.de>
Date		: 2008-02-07 00:58
References	: http://lkml.org/lkml/2008/2/6/451
		  http://lkml.org/lkml/2007/1/14/38


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9995
Subject		: 2.6.25-rc1 regression - backlight controlls do not work - ThinkPad T61
Submitter	: Lukas Hejtmanek <xhejtman@fi.muni.cz>
Date		: 2008-02-15 04:51
Handled-By	: Zhang Rui <rui.zhang@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10011
Subject		: The computer is blocked when X is started
Submitter	: François Valenduc <francois.valenduc@tvcablenet.be>
Date		: 2008-02-17 06:28
Handled-By	: Thomas Gleixner <tglx@linutronix.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10016
Subject		: cobalt_btns.c <-> struct platform_device compile error
Submitter	: Adrian Bunk <bunk@kernel.org>
Date		: 2008-02-17 12:12
References	: http://lkml.org/lkml/2008/2/17/293


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10017
Subject		: cdev removal broke cobalt_btns.c compilation
Submitter	: Adrian Bunk <bunk@kernel.org>
Date		: 2008-02-17 12:14
References	: http://lkml.org/lkml/2008/2/17/295


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10025
Subject		: Current git very broken on the Dreamcast
Submitter	: Adrian McMenamin <adrian@newgolddream.dyndns.info>
Date		: 2008-02-16 19:38
References	: http://lkml.org/lkml/2008/2/16/196
Handled-By	: Kristoffer Ericson <kristoffer.ericson@gmail.com>
		  Magnus Damm <magnus.damm@gmail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10027
Subject		: 2.6.25-rc[12] Video4Linux Bttv Regression
Submitter	: Bongani Hlope <bonganilinux@mweb.co.za>
Date		: 2008-02-17 09:36
References	: http://lkml.org/lkml/2008/2/17/55
Handled-By	: Mauro Carvalho Chehab <mchehab@infradead.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10033
Subject		: mips yosemite_defconfig compile error
Submitter	: Adrian Bunk <bunk@kernel.org>
Date		: 2008-02-17 16:45
References	: http://lkml.org/lkml/2008/2/17/383


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10041
Subject		: 2.6.25-rc1/2 regression: first-time login into gnome fails
Submitter	: Romano Giannetti <romanol@upcomillas.es>
Date		: 2008-02-18 11:56
References	: http://lkml.org/lkml/2008/2/18/145
Handled-By	: Ray Lee <ray-lk@madrabbit.org>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10051
Subject		: Spurious messages at boot, eventually hangs the usb subsustem
Submitter	: Jean-Luc Coulon <jean.luc.coulon@gmail.com>
Date		: 2008-02-20 09:10


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10061
Subject		: Hang in md5_resync
Submitter	: Steinar H. Gunderson <sgunderson@bigfoot.com>
Date		: 2008-02-21 13:13


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10065
Subject		: 2.6.25-rc2 regression - hang on suspend
Submitter	: Soeren Sonnenburg <kernel@nn7.de>
Date		: 2008-02-19 12:59
References	: http://lkml.org/lkml/2008/2/19/165
		  http://lkml.org/lkml/2008/2/17/381
Handled-By	: Rafael J. Wysocki <rjw@sisk.pl>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10067
Subject		: TUNER_TDA8290=y, VIDEO_DEV=n build error
Submitter	: Toralf Förster <toralf.foerster@gmx.de>
Date		: 2008-02-22 10:36
References	: http://lkml.org/lkml/2008/2/19/262


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10078
Subject		: USB OOPS 2.6.25-rc2-git1
Submitter	: Andre Tomt <andre@tomt.net>
Date		: 2008-02-19 16:19
References	: http://lkml.org/lkml/2008/2/19/253
Handled-By	: David Brownell <david-b@pacbell.net>
		  Alan Stern <stern@rowland.harvard.edu>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10080
Subject		: 2.6.25-rc2: ohci1394 problem
Submitter	: Thomas Meyer <thomas@m3y3r.de>
Date		: 2008-02-20 08:47
References	: http://lkml.org/lkml/2008/2/20/58
Handled-By	: Stefan Richter <stefanr@s5r6.in-berlin.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10082
Subject		: [BUG] 2.6.25-rc2-git4 - Regression Kernel oops while running kernbench and tbench on powerpc
Submitter	: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Date		: 2008-02-20 16:01
References	: http://lkml.org/lkml/2008/2/20/218
		  http://lkml.org/lkml/2008/1/18/71


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10084
Subject		: 2.6.25-rc2-git4 BUG: sysfs_readdir
Submitter	: Randy Dunlap <randy.dunlap@oracle.com>
Date		: 2008-02-21 17:25
References	: http://lkml.org/lkml/2008/2/21/212
Handled-By	: Greg KH <greg@kroah.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10086
Subject		: 2.6.25-rc2 + smartd = hang
Submitter	: Anders Eriksson <aeriksson@fastmail.fm>
Date		:  Sat Jan 26 20:13:12 2008 +0100
References	: http://lkml.org/lkml/2008/2/22/239
Handled-By	: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10093
Subject		: 2.6.25-current-git hangs on boot
Submitter	: Soeren Sonnenburg <kernel@nn7.de>
Date		: 2008-02-23 18:55
References	: http://lkml.org/lkml/2008/2/23/263
		  http://marc.info/?l=linux-acpi&m=120387537018467&w=4
Handled-By	: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10097
Subject		: SMP BUG in __nf_conntrack_find
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2008-02-25 10:44


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10100
Subject		: 208c70a45624400fafd7511b96bc426bf01f8f5e breaks EC init
Submitter	: Michael S. Tsirkin <m.s.tsirkin@gmail.com>
Date		: 2008-02-25 20:19
References	: http://lkml.org/lkml/2008/2/25/282
Handled-By	: Alexey Starikovskiy <astarikovskiy@suse.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10102
Subject		: 2.6.25-rc2 Regression Thinkpad acpi
Submitter	: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Date		: 2008-02-25 12:47
References	: http://lkml.org/lkml/2008/2/25/73
Handled-By	: Henrique de Moraes Holschuh <hmh@hmh.eng.br>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10117
Subject		: 2.6.25-current-git hangs on boot (pci=nommconf helps)
Submitter	: Soeren Sonnenburg <kernel@nn7.de>
Date		: 2008-02-23 18:55
References	: http://lkml.org/lkml/2008/2/23/263


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10133
Subject		: INFO: possible circular locking in the resume
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2008-02-27
References	: http://lkml.org/lkml/2008/2/26/479
Handled-By	: Gautham R Shenoy <ego@in.ibm.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10146
Subject		: 2.6.25-rc: complete lockup on boot/start of X (bisected)
Submitter	: Marcin Slusarz <marcin.slusarz@gmail.com>
Date		:  Fri Jan 25 21:08:29 2008 +0100
References	: http://lkml.org/lkml/2008/3/2/91
Handled-By	: Peter Zijlstra <a.p.zijlstra@chello.nl>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10152
Subject		: Clocksource tsc is always unstable with 2.6.25-* kernels and CONFIG_NO_HZ=y on my box
Submitter	: Gabriel C <nix.or.die@googlemail.com>
Date		: 2008-02-24 01:31
References	: http://lkml.org/lkml/2008/2/23/380
		  http://lkml.org/lkml/2008/2/24/281
Handled-By	: Thomas Gleixner <tglx@linutronix.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10156
Subject		: KVM & Qemu crashed with infinite recursive kernel loop in the guest
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2008-02-28 11:25
References	: http://lkml.org/lkml/2008/2/28/106


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10164
Subject		: ntfs build failure on no-mmu
Submitter	: Mike Frysinger <vapier.adi@gmail.com>
Date		: 2008-03-03 11:05
References	: http://lkml.org/lkml/2008/3/1/179


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10172
Subject		: INFO: inconsistent lock state
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2008-03-05 03:26


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10190
Subject		: [BUG] Linux-2.6.25-rc4 (and also in rc3) Compile Error
Submitter	: Tarkan Erimer <tarkan@netone.net.tr>
Date		: 2008-03-05 05:01
References	: http://www.ussg.iu.edu/hypermail/linux/kernel/0803.0/1867.html


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10191
Subject		: Treason uncloaked spams syslog with latest git
Submitter	: Thomas Gleixner <tglx@linutronix.de>
Date		: 2008-03-06 05:47
References	: http://www.ussg.iu.edu/hypermail/linux/kernel/0803.0/2444.html


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10203
Subject		: Unable to ifconfig up b43 wireless interface
Submitter	: Christian Casteyde <casteyde.christian@free.fr>
Date		: 2008-03-09 00:55
Handled-By	: Michael Buesch <mb@bu3sch.de>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10207
Subject		: INFO: task mount:11202 blocked for more than 120 seconds
Submitter	: Christian Kujau <lists@nerdbynature.de>
Date		: 2008-03-07 21:32
References	: http://lkml.org/lkml/2008/3/7/308
		  http://lkml.org/lkml/2008/3/9/186
Handled-By	: David Chinner <dgc@sgi.com>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10211
Subject		: build #408 issue for v2.6.25-rc4-56-gd7fe321 in cx2341x_ctrl_get_menu
Submitter	: Toralf Förster <toralf.foerster@gmx.de>
Date		: 2008-03-07 13:48
References	: http://lkml.org/lkml/2008/3/7/168


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10214
Subject		: [regression] 2.6.25-rc4 snd-es18xx broken on Alpha
Submitter	: Bob Tracy <rct@frus.com>
Date		: 2008-03-08 04:58
References	: http://lkml.org/lkml/2008/3/7/409
Handled-By	: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
		  Rene Herman <rene.herman@keyaccess.nl>


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10219
Subject		: 25rc4-git3 blockdev/loopback related lockdep trace.
Submitter	: Dave Jones <davej@codemonkey.org.uk>
Date		: 2008-03-10 16:01
References	: http://lkml.org/lkml/2008/3/10/120


Regressions with patches
------------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9969
Subject		: 2.6.24-git15 Keyboard Issue?
Submitter	: Chris Holvenstot <cholvenstot@comcast.net>
Date		: 2008-02-06 14:02
References	: http://lkml.org/lkml/2008/2/6/100
		  http://lkml.org/lkml/2008/2/13/82
Handled-By	: Thomas Gleixner <tglx@linutronix.de>
Patch		: http://lkml.org/lkml/2008/2/15/343


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10013
Subject		: tbench regression in 2.6.25-rc1
Submitter	: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Date		: 2008-02-15 02:52
References	: http://lkml.org/lkml/2008/2/14/546
Handled-By	: Eric Dumazet <dada1@cosmosbay.com>
		  David Miller <davem@davemloft.net>
Patch		: http://lkml.org/lkml/2008/2/18/66
		  http://lkml.org/lkml/2008/2/18/117


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10031
Subject		: [2.6.25-rc2] e100: Trying to free already-free IRQ 11 during suspend ...
Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
Date		: 2008-02-16 13:36
References	: http://lkml.org/lkml/2008/2/17/125
Handled-By	: Kok, Auke <auke-jan.h.kok@intel.com>
Patch		: http://lkml.org/lkml/2008/2/21/259


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10104
Subject		: 2.6.25-rc3: WARNING: at arch/x86/mm/ioremap.c:137
Submitter	: Phil Oester <kernel@linuxace.com>
Date		: 2008-02-25 03:09
References	: http://lkml.org/lkml/2008/2/24/265
Handled-By	: Ingo Molnar <mingo@elte.hu>
Patch		: http://lkml.org/lkml/2008/3/10/240


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10122
Subject		: FIXED_PHY must depend on PHYLIB=y
Submitter	: Olaf Hering <olaf@aepfle.de>
Date		: 2008-02-27 07:14
References	: http://lkml.org/lkml/2008/2/27/90
Handled-By	: Adrian Bunk <adrian.bunk@movial.fi>
Patch		: http://lkml.org/lkml/2008/2/27/157


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
Submitter	: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Date		:  Tue May 22 22:47:54 2007 -0400
References	: http://lkml.org/lkml/2008/3/10/340
Handled-By	: Gautham R Shenoy <ego@in.ibm.com>
Patch		: http://lkml.org/lkml/2008/3/10/91


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10132
Subject		: 2.6.25 git regression, oops on boot
Submitter	: Jonathan McDowell <noodles@earth.li>
Date		: 2008-02-29 11:09
References	: http://marc.info/?l=linux-kernel&m=120423268404812&w=2
		  http://lkml.org/lkml/2008/2/28/369
Handled-By	: Zhang Rui <rui.zhang@intel.com>
		  Lin Ming <ming.m.lin@intel.com>
Patch		: http://lkml.org/lkml/2008/2/29/49


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10153
Subject		: (regression) kernel/timeconst.h bugs with HZ=128
Submitter	: David Brownell <david-b@pacbell.net>
Date		: 2008-02-26 19:32
References	: http://lkml.org/lkml/2008/2/26/294
Handled-By	: H. Peter Anvin <hpa@zytor.com>
Patch		: http://bugzilla.kernel.org/attachment.cgi?id=15114&action=view
		  http://bugzilla.kernel.org/attachment.cgi?id=15115&action=view


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10168
Subject		: WARNING: at drivers/usb/host/ehci-hcd.c:287
Submitter	: Christian Kujau <lists@nerdbynature.de>
Date		: 2008-03-03 01:05
References	: http://lkml.org/lkml/2008/3/2/171
Handled-By	: Alan Stern <stern@rowland.harvard.edu>
		  David Brownell <david-b@pacbell.net>
Patch		: http://lkml.org/lkml/2008/3/4/420


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10186
Subject		: SCSI_AIC94XX must depend on SCSI
Submitter	: Toralf Förster <toralf.foerster@gmx.de>
Date		: 2008-03-06 19:09
References	: http://marc.info/?l=linux-kernel&m=120483073617232&w=2
Handled-By	: Adrian Bunk <bunk@kernel.org>
Patch		: http://marc.info/?l=linux-kernel&m=120483499725928&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10189
Subject		: libata: allow LLDs w/o any reset method
Submitter	: Ingo Molnar <mingo@elte.hu>
Date		: 2008-03-06 10:25
References	: http://marc.info/?l=linux-kernel&m=120479928020617&w=2
Handled-By	: Tejun Heo <htejun@gmail.com>
Patch		: http://marc.info/?l=linux-ide&m=120477660124629&w=2


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10209
Subject		: 2.6.25 sysdev API problem
Submitter	: Mikael Pettersson <mikpe@it.uu.se>
Date		: 2008-03-08 16:56
References	: http://lkml.org/lkml/2008/3/8/59
Handled-By	: Greg KH <gregkh@suse.de>
		  Balaji Rao <balajirrao@gmail.com>
Patch		: http://lkml.org/lkml/2008/3/9/10


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10210
Subject		: [Regression] 2.6.25-rc4-git3: Handling of audio CDs broken on pata_ali
Submitter	: Rafael J. Wysocki <rjw@sisk.pl>
Date		: 2008-03-08 22:46
References	: http://lkml.org/lkml/2008/3/8/123
Handled-By	: Tejun Heo <htejun@gmail.com>
Patch		: http://lkml.org/lkml/2008/3/10/69


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10218
Subject		: [patch] fix ACPI boot regression (was: Re: Linux 2.6.25-rc5)
Submitter	: Ingo Molnar <mingo@elte.hu>
Date		: 2008-03-10 18:04
References	: http://lkml.org/lkml/2008/3/10/171
Handled-By	: Ingo Molnar <mingo@elte.hu>
Patch		: http://lkml.org/lkml/2008/3/10/171


For details, please visit the bug entries and follow the links given in
references.

As you can see, there is a Bugzilla entry for each of the listed regressions.
There also is a Bugzilla entry used for tracking the regressions from 2.6.24,
unresolved as well as resolved, at:

http://bugzilla.kernel.org/show_bug.cgi?id=9832

Please let me know if there are any Bugzilla entries that should be added to
the list in there.

Thanks,
Rafael
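
The entries in the report above share a simple "Field : value" layout, with
indented continuation lines for extra References, Handled-By or Patch values.
A minimal Python sketch of how such records might be parsed follows. The
function name `parse_entries` and the `FIELDS` set are hypothetical helpers
introduced here for illustration; only the field names themselves come from
the report.

```python
# Field names as they appear in the report; everything else is illustrative.
FIELDS = {"Bug-Entry", "Subject", "Submitter", "Date",
          "References", "Handled-By", "Patch"}


def parse_entries(text):
    """Parse blank-line-separated records into a list of dicts.

    Each dict maps a field name to a list of values, so that continuation
    lines (for example a second Handled-By address) are kept together.
    """
    entries, current, last_key = [], {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:                      # a blank line ends the current record
            if current:
                entries.append(current)
            current, last_key = {}, None
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:         # a new "Field : value" line
            last_key = key
            current.setdefault(key, []).append(value.strip())
        elif last_key is not None:        # continuation of the previous field
            current[last_key].append(line)
    if current:
        entries.append(current)
    return entries


SAMPLE = """\
Bug-Entry\t: http://bugzilla.kernel.org/show_bug.cgi?id=10168
Subject\t\t: WARNING: at drivers/usb/host/ehci-hcd.c:287
Handled-By\t: Alan Stern <stern@rowland.harvard.edu>
\t\t  David Brownell <david-b@pacbell.net>
Patch\t\t: http://lkml.org/lkml/2008/3/4/420
"""

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE):
        print(entry["Bug-Entry"][0], entry.get("Patch", []))
```

Matching the key against a fixed set of field names, rather than splitting on
every colon, keeps URLs and subjects that contain ":" from being mistaken for
new fields.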


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-10 23:14 2.6.25-rc5: Reported regressions from 2.6.24 Rafael J. Wysocki
@ 2008-03-11  0:40 ` Jeff Garzik
  2008-03-11  1:05   ` Rafael J. Wysocki
  2008-03-11  2:15   ` Linus Torvalds
  2008-03-11 12:22 ` Stefan Richter
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 41+ messages in thread
From: Jeff Garzik @ 2008-03-11  0:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich

Rafael J. Wysocki wrote:
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9992
> Subject		: 2.6.24-git: kmap_atomic() WARN_ON()
> Submitter	: Thomas Gleixner <tglx@linutronix.de>
> Date		: 2008-02-07 00:58
> References	: http://lkml.org/lkml/2008/2/6/451
> 		  http://lkml.org/lkml/2007/1/14/38

Solved by b445c56815d84b9fce40707f99811bdc354458e0


> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
> Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
> Submitter	: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> Date		:  Tue May 22 22:47:54 2007 -0400
> References	: http://lkml.org/lkml/2008/3/10/340
> Handled-By	: Gautham R Shenoy <ego@in.ibm.com>
> Patch		: http://lkml.org/lkml/2008/3/10/91

FWIW, I have this same problem.



* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11  0:40 ` Jeff Garzik
@ 2008-03-11  1:05   ` Rafael J. Wysocki
  2008-03-11  2:15   ` Linus Torvalds
  1 sibling, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2008-03-11  1:05 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: LKML, Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich

On Tuesday, 11 of March 2008, Jeff Garzik wrote:
> Rafael J. Wysocki wrote:
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=9992
> > Subject		: 2.6.24-git: kmap_atomic() WARN_ON()
> > Submitter	: Thomas Gleixner <tglx@linutronix.de>
> > Date		: 2008-02-07 00:58
> > References	: http://lkml.org/lkml/2008/2/6/451
> > 		  http://lkml.org/lkml/2007/1/14/38
> 
> Solved by b445c56815d84b9fce40707f99811bdc354458e0

Thanks, Adrian has just closed it.

> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
> > Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3) kernels
> > Submitter	: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> > Date		:  Tue May 22 22:47:54 2007 -0400
> > References	: http://lkml.org/lkml/2008/3/10/340
> > Handled-By	: Gautham R Shenoy <ego@in.ibm.com>
> > Patch		: http://lkml.org/lkml/2008/3/10/91
> 
> FWIW, I have this same problem.

Does the patch help?


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11  0:40 ` Jeff Garzik
  2008-03-11  1:05   ` Rafael J. Wysocki
@ 2008-03-11  2:15   ` Linus Torvalds
  2008-03-11  3:00     ` Andrew Morton
  2008-03-11 18:57     ` Jeff Garzik
  1 sibling, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-03-11  2:15 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Rafael J. Wysocki, LKML, Adrian Bunk, Andrew Morton, Natalie Protasevich



On Mon, 10 Mar 2008, Jeff Garzik wrote:
>
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
> > Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3)
> > kernels
> > Submitter	: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> > Date		:  Tue May 22 22:47:54 2007 -0400
> > References	: http://lkml.org/lkml/2008/3/10/340
> > Handled-By	: Gautham R Shenoy <ego@in.ibm.com>
> > Patch		: http://lkml.org/lkml/2008/3/10/91
> 
> FWIW, I have this same problem.

There's a newer patch in

	http://lkml.org/lkml/2008/3/10/343

which I think should replace the 2008/3/10/91 one, but which needs 
testing.

Jeff, does that one ("keep rd->online and cpu_online_map in sync") fix the 
problem for you?

(Andrew - I saw you say that the older patch fixed things for you, does 
the newer one - on its own - also do so?)

		Linus


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11  2:15   ` Linus Torvalds
@ 2008-03-11  3:00     ` Andrew Morton
  2008-03-11  8:28       ` Ingo Molnar
  2008-03-11 18:57     ` Jeff Garzik
  1 sibling, 1 reply; 41+ messages in thread
From: Andrew Morton @ 2008-03-11  3:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Rafael J. Wysocki, LKML, Adrian Bunk, Natalie Protasevich

On Mon, 10 Mar 2008 19:15:45 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Mon, 10 Mar 2008, Jeff Garzik wrote:
> >
> > > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
> > > Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3)
> > > kernels
> > > Submitter	: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
> > > Date		:  Tue May 22 22:47:54 2007 -0400
> > > References	: http://lkml.org/lkml/2008/3/10/340
> > > Handled-By	: Gautham R Shenoy <ego@in.ibm.com>
> > > Patch		: http://lkml.org/lkml/2008/3/10/91
> > 
> > FWIW, I have this same problem.
> 
> There's a newer patch in
> 
> 	http://lkml.org/lkml/2008/3/10/343
> 
> which I think should replace the 2008/3/10/91 one, but which needs 
> testing.
> 
> Jeff, does that one ("keep rd->online and cpu_online_map in sync") fix the 
> problem for you?
> 
> (Andrew - I saw you say that the older patch fixed things for you, does 
> the newer one - on its own - also do so?)
> 

Yes, it does.


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11  3:00     ` Andrew Morton
@ 2008-03-11  8:28       ` Ingo Molnar
  0 siblings, 0 replies; 41+ messages in thread
From: Ingo Molnar @ 2008-03-11  8:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Jeff Garzik, Rafael J. Wysocki, LKML,
	Adrian Bunk, Natalie Protasevich


* Andrew Morton <akpm@linux-foundation.org> wrote:

> > There's a newer patch in
> > 
> > 	http://lkml.org/lkml/2008/3/10/343
> > 
> > which I think should replace the 2008/3/10/91 one, but which needs 
> > testing.
> > 
> > Jeff, does that one ("keep rd->online and cpu_online_map in sync") fix the 
> > problem for you?
> > 
> > (Andrew - I saw you say that the older patch fixed things for you, does 
> > the newer one - on its own - also do so?)
> > 
> 
> Yes, it does.

great. I've picked it up too and will push it through the test-grind.

	Ingo


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-10 23:14 2.6.25-rc5: Reported regressions from 2.6.24 Rafael J. Wysocki
  2008-03-11  0:40 ` Jeff Garzik
@ 2008-03-11 12:22 ` Stefan Richter
  2008-03-11 13:04   ` Adrian Bunk
  2008-03-12 22:12 ` Christian Kujau
  2008-03-13  5:03 ` David Chinner
  3 siblings, 1 reply; 41+ messages in thread
From: Stefan Richter @ 2008-03-11 12:22 UTC (permalink / raw)
  To: Rafael J. Wysocki, Thomas Meyer
  Cc: LKML, Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich

Rafael J. Wysocki wrote:
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10080
> Subject		: 2.6.25-rc2: ohci1394 problem
> Submitter	: Thomas Meyer <thomas@m3y3r.de>
> Date		: 2008-02-20 08:47
> References	: http://lkml.org/lkml/2008/2/20/58
> Handled-By	: Stefan Richter <stefanr@s5r6.in-berlin.de>

Thomas wrote on 2008-02-25:
''So i did a "make clean" and a "make" (not a make
-j3 as i use to do) and recompiled 2.6.25-rc3 and now it works again.
Case closed under strange error.''

I have closed the bugzilla bug now.
-- 
Stefan Richter
-=====-==--- --== -=-==
http://arcgraph.de/sr/


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11 12:22 ` Stefan Richter
@ 2008-03-11 13:04   ` Adrian Bunk
  2008-03-17 21:28     ` Thomas Meyer
  0 siblings, 1 reply; 41+ messages in thread
From: Adrian Bunk @ 2008-03-11 13:04 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Rafael J. Wysocki, Thomas Meyer, LKML, Andrew Morton,
	Linus Torvalds, Natalie Protasevich

On Tue, Mar 11, 2008 at 01:22:42PM +0100, Stefan Richter wrote:
> Rafael J. Wysocki wrote:
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10080
> > Subject		: 2.6.25-rc2: ohci1394 problem
> > Submitter	: Thomas Meyer <thomas@m3y3r.de>
> > Date		: 2008-02-20 08:47
> > References	: http://lkml.org/lkml/2008/2/20/58
> > Handled-By	: Stefan Richter <stefanr@s5r6.in-berlin.de>
> 
> Thomas wrote on 2008-02-25:
> ''So i did a "make clean" and a "make" (not a make
> -j3 as i use to do) and recompiled 2.6.25-rc3 and now it works again.
> Case closed under strange error.''
>...

Although I don't think this would cause the error, it would be nice if 
Thomas could verify that the -j3 did not cause the problem.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed



* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11  2:15   ` Linus Torvalds
  2008-03-11  3:00     ` Andrew Morton
@ 2008-03-11 18:57     ` Jeff Garzik
  2008-03-11 22:41       ` Rafael J. Wysocki
  1 sibling, 1 reply; 41+ messages in thread
From: Jeff Garzik @ 2008-03-11 18:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Andrew Morton, Ingo Molnar, Len Brown

Linus Torvalds wrote:
> On Mon, 10 Mar 2008, Jeff Garzik wrote:
>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
>>> Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3)

>> FWIW, I have this same problem.

> Jeff, does that one ("keep rd->online and cpu_online_map in sync") fix the 
> problem for you?

Nope.  I am running baadac8b10c5ac15ce3d26b68fa266c8889b163f now, and it 
still hangs on reboot or power-off.

Interestingly, if I reboot -immediately- from gdm, it succeeds.  However 
if I login to Fedora GNOME via gdm, and load my standard apps (1001 
terminals, firefox, tbird, IRC) reboot and poweroff no longer work.

My guess was always some ACPI regression.  I'll bisect today or 
tomorrow.  It is a reproducible regression that appeared recently (circa 
2.6.24 or 2.6.25-rc1 I think), so I should be able to find the culprit.

	Jeff
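
Bisection of the kind mentioned above, as `git bisect` performs it, is at
heart a binary search for the first commit at which the regression
reproduces. A minimal Python sketch of that idea follows. It assumes a
linear, oldest-to-newest list of commits and a hypothetical `is_bad`
callback that builds and tests one commit; neither comes from this thread,
and real `git bisect` also handles non-linear history.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True.

    Assumes commits are ordered oldest to newest, that the last commit is
    bad, and that once a commit is bad every later commit is bad as well.
    """
    lo, hi = 0, len(commits) - 1      # hi always points at a known-bad commit
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid                  # the culprit is mid or earlier
        else:
            lo = mid + 1              # the culprit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Illustrative only: 100 numbered "commits", the regression starts at 37.
    history = list(range(100))
    print(first_bad(history, lambda commit: commit >= 37))
```

Each test halves the remaining range, so a history of N commits needs
roughly log2(N) build-and-test cycles.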





* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11 18:57     ` Jeff Garzik
@ 2008-03-11 22:41       ` Rafael J. Wysocki
  2008-03-12 20:01         ` Linus Torvalds
  2008-03-17 19:20         ` 2.6.25-rc5: Reported regressions from 2.6.24 Jeff Garzik
  0 siblings, 2 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2008-03-11 22:41 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linus Torvalds, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski, Greg KH

On Tuesday, 11 of March 2008, Jeff Garzik wrote:
> Linus Torvalds wrote:
> > On Mon, 10 Mar 2008, Jeff Garzik wrote:
> >>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
> >>> Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3)
> 
> >> FWIW, I have this same problem.
> 
> > Jeff, does that one ("keep rd->online and cpu_online_map in sync") fix the 
> > problem for you?
> 
> Nope.  I am running baadac8b10c5ac15ce3d26b68fa266c8889b163f now, and it 
> still hangs on reboot or power-off.
> 
> Interestingly, if I reboot -immediately- from gdm, it succeeds.  However 
> if I login to Fedora GNOME via gdm, and load my standard apps (1001 
> terminals, firefox, tbird, IRC) reboot and poweroff no longer work.
> 
> My guess was always some ACPI regression.  I'll bisect today or 
> tomorrow.  It is a reproducible regression that appeared recently (circa 
> 2.6.24 or 2.6.25-rc1 I think), so I should be able to find the culprit.

In http://bugzilla.kernel.org/show_bug.cgi?id=10123 Guennadi says that
reverting

commit fd7d1ced29e5beb88c9068801da7a362606d8273
Author: Greg Kroah-Hartman <gregkh@suse.de>
Date:   Tue May 22 22:47:54 2007 -0400

    PCI: make pci_bus a struct device

fixes the problem for him (this seems to be yet another reboot/poweroff IOW).

Thanks,
Rafael


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11 22:41       ` Rafael J. Wysocki
@ 2008-03-12 20:01         ` Linus Torvalds
  2008-03-12 20:32           ` Greg KH
  2008-03-17 19:20         ` 2.6.25-rc5: Reported regressions from 2.6.24 Jeff Garzik
  1 sibling, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 20:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski, Greg KH



On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:

> 
> In http://bugzilla.kernel.org/show_bug.cgi?id=10123 Guennadi says that
> reverting
> 
> commit fd7d1ced29e5beb88c9068801da7a362606d8273
> Author: Greg Kroah-Hartman <gregkh@suse.de>
> Date:   Tue May 22 22:47:54 2007 -0400
> 
>     PCI: make pci_bus a struct device
> 
> fixes the problem for him (this seems to be yet another reboot/poweroff IOW).

Ahh, I thought this was done already, but nope, my PCI pull from Greg 
didn't contain the revert.

Greg? I know you must be aware of the problem, because you replied to the 
email at some point. Wazzup?

		Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-12 20:01         ` Linus Torvalds
@ 2008-03-12 20:32           ` Greg KH
  2008-03-12 21:27             ` pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24) Greg KH
  0 siblings, 1 reply; 41+ messages in thread
From: Greg KH @ 2008-03-12 20:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 01:01:15PM -0700, Linus Torvalds wrote:
> 
> 
> On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:
> 
> > 
> > In http://bugzilla.kernel.org/show_bug.cgi?id=10123 Guennadi says that
> > reverting
> > 
> > commit fd7d1ced29e5beb88c9068801da7a362606d8273
> > Author: Greg Kroah-Hartman <gregkh@suse.de>
> > Date:   Tue May 22 22:47:54 2007 -0400
> > 
> >     PCI: make pci_bus a struct device
> > 
> > fixes the problem for him (this seems to be yet another reboot/poweroff IOW).
> 
> Ahh, I thought this was done already, but nope, my PCI pull from Greg 
> didn't contain the revert.
> 
> Greg? I know you must be aware of the problem, because you replied to the 
> email at some point. Wazzup?

I'm still trying to figure out why his is the only machine having
problems with this.  I think it's an acpi "we walk the list of pci
devices twice" type thing, but don't know yet.

I'm still working on it...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 20:32           ` Greg KH
@ 2008-03-12 21:27             ` Greg KH
  2008-03-12 21:38               ` Greg KH
                                 ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Greg KH @ 2008-03-12 21:27 UTC (permalink / raw)
  To: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown, Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 01:32:05PM -0700, Greg KH wrote:
> On Wed, Mar 12, 2008 at 01:01:15PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:
> > 
> > > 
> > > In http://bugzilla.kernel.org/show_bug.cgi?id=10123 Guennadi says that
> > > reverting
> > > 
> > > commit fd7d1ced29e5beb88c9068801da7a362606d8273
> > > Author: Greg Kroah-Hartman <gregkh@suse.de>
> > > Date:   Tue May 22 22:47:54 2007 -0400
> > > 
> > >     PCI: make pci_bus a struct device
> > > 
> > > fixes the problem for him (this seems to be yet another reboot/poweroff IOW).
> > 
> > Ahh, I thought this was done already, but nope, my PCI pull from Greg 
> > didn't contain the revert.
> > 
> > Greg? I know you must be aware of the problem, because you replied to the 
> > email at some point. Wazzup?
> 
> I'm still trying to figure out why his is the only machine having
> problems with this.  I think it's an acpi "we walk the list of pci
> devices twice" type thing, but don't know yet.

Ok, I think I got it.  And it looks like an ACPI bug, but one that we
might have been ignoring for a long time...


In looking at the log files at boot, we see that we are using ACPI to
find the PCI devices:

	ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

Followed by a lot of kobjects for pci devices being added, including
this root bus:
	kobject: '0000:01:00.0' (c7c978cc): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
	kobject: '0000:01:00.0' (c7c978cc): kobject_uevent_env
	kobject: '0000:01:00.0' (c7c978cc): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0'
	kobject: '0000:01' (c7c35900): kobject_add_internal: parent: 'pci_bus', set: 'devices'
	kobject: '0000:01' (c7c35900): kobject_uevent_env
	kobject: '0000:01' (c7c35900): fill_kobj_path: path = '/class/pci_bus/0000:01'

All is fine, until later on we decide to fall back to the "old" style of
probing:
	PCI: Probing PCI hardware
	kobject (c7c35900): tried to init an initialized object, something is seriously wrong.
	Pid: 1, comm: swapper Not tainted 2.6.25-rc2-testpm #30
	 [<c01ea0e9>] kobject_init+0x89/0x90
	 [<c025094e>] device_initialize+0x1e/0x90
	 [<c025119b>] device_register+0xb/0x20
	 [<c01f3fd8>] pci_bus_add_devices+0x98/0x140
	 [<c030aff7>] ? pcibios_scan_root+0x27/0xa0
	 [<c03f69d0>] pci_legacy_init+0x50/0xf0
	 [<c03db5c2>] kernel_init+0x132/0x310
	 [<c010303a>] ? ret_from_fork+0x6/0x1c
	 [<c03db490>] ? kernel_init+0x0/0x310
	 [<c03db490>] ? kernel_init+0x0/0x310
	 [<c0103d3f>] kernel_thread_helper+0x7/0x18
	=======================
	kobject: '0000:01' (c7c35900): kobject_add_internal: parent: 'pci_bus', set: 'devices'

This shows that we are trying to register the exact same kobject that we
had already previously registered.  Not nice...

Now we have a check in the pci bus code to not register anything that we
had already registered in the past:

        list_for_each_entry(dev, &bus->devices, bus_list) {
                /*
                 * Skip already-present devices (which are on the
                 * global device list.)
                 */
                if (!list_empty(&dev->global_list))
                        continue;
                retval = pci_bus_add_device(dev);

But, in redoing the pci list logic (coming in .26 and in -mm and -next)
I realized that this wasn't a real check, as this list is just a
"shadow" list that some types of pci probing never set up.

So that explains why we get the warning in the kobject core when trying to
register a device multiple times.
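
To make the difference concrete, here is a minimal userspace sketch (not
kernel code).  It assumes a hypothetical registration path,
register_without_list(), that registers a device but never links it into
the global list; struct toy_dev and every helper name are made up for
illustration.  Under that assumption a list_empty()-style check
re-registers the device on a rescan, while an explicit flag does not:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modeled loosely on list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail_node(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical device; this is NOT the real struct pci_dev. */
struct toy_dev {
	struct list_head global_list;	/* "shadow" list membership */
	bool is_added;			/* set by every registration path */
	int register_calls;		/* how often registration was requested */
};

static void toy_dev_init(struct toy_dev *d)
{
	init_list_head(&d->global_list);
	d->is_added = false;
	d->register_calls = 0;
}

/* Registration path that also maintains the shadow list. */
static void register_with_list(struct toy_dev *d, struct list_head *global)
{
	d->register_calls++;
	d->is_added = true;
	list_add_tail_node(&d->global_list, global);
}

/* ASSUMED path: registers the device but never touches the shadow list. */
static void register_without_list(struct toy_dev *d)
{
	d->register_calls++;
	d->is_added = true;
}

/* Rescan guarded by list membership, like the list_empty() check. */
static void rescan_by_list(struct toy_dev *d, struct list_head *global)
{
	if (!list_is_empty(&d->global_list))
		return;		/* looks registered already */
	register_with_list(d, global);
}

/* Rescan guarded by the explicit flag. */
static void rescan_by_flag(struct toy_dev *d, struct list_head *global)
{
	if (d->is_added)
		return;
	register_with_list(d, global);
}
```

For a device first registered through register_without_list(),
rescan_by_list() asks for a second registration while rescan_by_flag()
skips it.  Whether any real probing path behaves like
register_without_list() is exactly the open question here.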

But why does this happen in the first place?

The code in arch/x86/pci/legacy.c::pci_legacy_init() checks the
pcibios_scanned flag to determine if we had already scanned the PCI bus.
Which we did in the ACPI code, right?

So, Len, shouldn't we be setting this flag in the ACPI core if we had
already scanned the pci bus there?

I can fix this problem by putting the check in the pci core in
pci_bus_add_devices() like we have done in -next, but I think that we
also need to do something in ACPI as well.

Guennadi, could you test the -next kernel tree to see if the logic there
solves this issue for you?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 21:27             ` pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24) Greg KH
@ 2008-03-12 21:38               ` Greg KH
  2008-03-12 22:25                 ` Linus Torvalds
  2008-03-12 21:41               ` Len Brown
  2008-03-12 22:20               ` Linus Torvalds
  2 siblings, 1 reply; 41+ messages in thread
From: Greg KH @ 2008-03-12 21:38 UTC (permalink / raw)
  To: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown, Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 02:27:04PM -0700, Greg KH wrote:
> On Wed, Mar 12, 2008 at 01:32:05PM -0700, Greg KH wrote:
> > On Wed, Mar 12, 2008 at 01:01:15PM -0700, Linus Torvalds wrote:
> > > 
> > > 
> > > On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:
> > > 
> > > > 
> > > > In http://bugzilla.kernel.org/show_bug.cgi?id=10123 Guennadi says that
> > > > reverting
> > > > 
> > > > commit fd7d1ced29e5beb88c9068801da7a362606d8273
> > > > Author: Greg Kroah-Hartman <gregkh@suse.de>
> > > > Date:   Tue May 22 22:47:54 2007 -0400
> > > > 
> > > >     PCI: make pci_bus a struct device
> > > > 
> > > > fixes the problem for him (this seems to be yet another reboot/poweroff IOW).
> > > 
> > > Ahh, I thought this was done already, but nope, my PCI pull from Greg 
> > > didn't contain the revert.
> > > 
> > > Greg? I know you must be aware of the problem, because you replied to the 
> > > email at some point. Wazzup?
> > 
> > I'm still trying to figure out why his is the only machine having
> > problems with this.  I think it's an acpi "we walk the list of pci
> > devices twice" type thing, but don't know yet.
> 
> Ok, I think I got it.  And it looks like an ACPI bug, but one that we
> might have been ignoring for a long time...
> 
> 
> In looking at the log files at boot, we see that we are using ACPI to
> find the PCI devices:
> 
> 	ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> 
> Followed by a lot of kobjects for pci devices being added, including
> this root bus:
> 	kobject: '0000:01:00.0' (c7c978cc): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
> 	kobject: '0000:01:00.0' (c7c978cc): kobject_uevent_env
> 	kobject: '0000:01:00.0' (c7c978cc): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0'
> 	kobject: '0000:01' (c7c35900): kobject_add_internal: parent: 'pci_bus', set: 'devices'
> 	kobject: '0000:01' (c7c35900): kobject_uevent_env
> 	kobject: '0000:01' (c7c35900): fill_kobj_path: path = '/class/pci_bus/0000:01'
> 
> All is fine, until later on we decide to fallback to the "old" style of
> probing:
> 	PCI: Probing PCI hardware
> 	kobject (c7c35900): tried to init an initialized object, something is seriously wrong.
> 	Pid: 1, comm: swapper Not tainted 2.6.25-rc2-testpm #30
> 	 [<c01ea0e9>] kobject_init+0x89/0x90
> 	 [<c025094e>] device_initialize+0x1e/0x90
> 	 [<c025119b>] device_register+0xb/0x20
> 	 [<c01f3fd8>] pci_bus_add_devices+0x98/0x140
> 	 [<c030aff7>] ? pcibios_scan_root+0x27/0xa0
> 	 [<c03f69d0>] pci_legacy_init+0x50/0xf0
> 	 [<c03db5c2>] kernel_init+0x132/0x310
> 	 [<c010303a>] ? ret_from_fork+0x6/0x1c
> 	 [<c03db490>] ? kernel_init+0x0/0x310
> 	 [<c03db490>] ? kernel_init+0x0/0x310
> 	 [<c0103d3f>] kernel_thread_helper+0x7/0x18
> 	=======================
> 	kobject: '0000:01' (c7c35900): kobject_add_internal: parent: 'pci_bus', set: 'devices'
> 
> This shows that we are trying to register the exact same kobject that we
> had already previously registered.  Not nice...
> 
> Now we have a check in the pci bus code to not register anything that we
> had already registered in the past:
> 
>         list_for_each_entry(dev, &bus->devices, bus_list) {
>                 /*
>                  * Skip already-present devices (which are on the
>                  * global device list.)
>                  */
>                 if (!list_empty(&dev->global_list))
>                         continue;
>                 retval = pci_bus_add_device(dev);
> 
> But, in redoing the pci list logic (coming in .26 and in -mm and -next)
> I realized that this wasn't a real check, as this list is just a
> "shadow" list that some types of pci probing never set up.
> 
> So that explains why the warning we get when trying to register a device
> multiple times in the kobject core.
> 
> But why does this happen in the first place?
> 
> The code in arch/x86/pci/legacy.c::pci_legacy_init() checks the
> pcibios_scanned flag to determine if we had already scanned the PCI bus.
> Which we did in the ACPI code, right?
> 
> So, Len, shouldn't we be setting this flag in the ACPI core if we had
> already scanned the pci bus there?
> 
> I can fix this problem by putting the check in the pci core in
> pci_bus_add_devices() like we have done in -next, but I think that we
> also need to do something in ACPI as well.
> 
> Guennadi, could you test the -next kernel tree to see if the logic there
> solves this issue for you?

Actually, here's a simple patch from -next that should test this logic
for you.  Can you let me know if this solves the start up WARNING dump
for you?

thanks,

greg k-h

------------

Date: Thu, 14 Feb 2008 14:56:56 -0800
From: Greg Kroah-Hartman <gregkh@suse.de>
Subject: PCI: add is_added flag to struct pci_dev

This lets us check if the device is really added to the driver core or
not, which is what we need when walking some of the bus lists.  The flag
is there in anticipation of getting rid of the other PCI device list,
which is what we used to check in this situation.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/powerpc/platforms/pseries/pci_dlpar.c |    7 ++-----
 drivers/pci/bus.c                          |   11 ++++-------
 drivers/pci/probe.c                        |    2 +-
 drivers/pci/remove.c                       |    6 ++----
 include/linux/pci.h                        |    1 +
 5 files changed, 10 insertions(+), 17 deletions(-)

--- a/arch/powerpc/platforms/pseries/pci_dlpar.c
+++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
@@ -88,11 +88,8 @@ pcibios_fixup_new_pci_devices(struct pci
 	struct pci_dev *dev;
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-		/*
-		 * Skip already-present devices (which are on the
-		 * global device list.)
-		 */
-		if (list_empty(&dev->global_list)) {
+		/* Skip already-added devices */
+		if (!dev->is_added) {
 			int i;
 
 			/* Fill device archdata and setup iommu table */
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -84,6 +84,7 @@ int pci_bus_add_device(struct pci_dev *d
 	if (retval)
 		return retval;
 
+	dev->is_added = 1;
 	down_write(&pci_bus_sem);
 	list_add_tail(&dev->global_list, &pci_devices);
 	up_write(&pci_bus_sem);
@@ -112,11 +113,8 @@ void pci_bus_add_devices(struct pci_bus 
 	int retval;
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-		/*
-		 * Skip already-present devices (which are on the
-		 * global device list.)
-		 */
-		if (!list_empty(&dev->global_list))
+		/* Skip already-added devices */
+		if (dev->is_added)
 			continue;
 		retval = pci_bus_add_device(dev);
 		if (retval)
@@ -124,8 +122,7 @@ void pci_bus_add_devices(struct pci_bus 
 	}
 
 	list_for_each_entry(dev, &bus->devices, bus_list) {
-
-		BUG_ON(list_empty(&dev->global_list));
+		BUG_ON(!dev->is_added);
 
 		/*
 		 * If there is an unattached subordinate bus, attach
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -984,7 +984,7 @@ EXPORT_SYMBOL(pci_scan_single_device);
  *
  * Scan a PCI slot on the specified PCI bus for devices, adding
  * discovered devices to the @bus->devices list.  New devices
- * will have an empty dev->global_list head.
+ * will not have is_added set.
  */
 int pci_scan_slot(struct pci_bus *bus, int devfn)
 {
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -18,13 +18,11 @@ static void pci_free_resources(struct pc
 
 static void pci_stop_dev(struct pci_dev *dev)
 {
-	if (!dev->global_list.next)
-		return;
-
-	if (!list_empty(&dev->global_list)) {
+	if (dev->is_added) {
 		pci_proc_detach_device(dev);
 		pci_remove_sysfs_dev_files(dev);
 		device_unregister(&dev->dev);
+		dev->is_added = 0;
 		down_write(&pci_bus_sem);
 		list_del(&dev->global_list);
 		dev->global_list.next = dev->global_list.prev = NULL;
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -181,6 +181,7 @@ struct pci_dev {
 	unsigned int	transparent:1;	/* Transparent PCI bridge */
 	unsigned int	multifunction:1;/* Part of multi-function device */
 	/* keep track of device state */
+	unsigned int	is_added:1;
 	unsigned int	is_busmaster:1; /* device is busmaster */
 	unsigned int	no_msi:1;	/* device may not use msi */
 	unsigned int	no_d1d2:1;   /* only allow d0 or d3 */

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 21:27             ` pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24) Greg KH
  2008-03-12 21:38               ` Greg KH
@ 2008-03-12 21:41               ` Len Brown
  2008-03-12 22:20               ` Linus Torvalds
  2 siblings, 0 replies; 41+ messages in thread
From: Len Brown @ 2008-03-12 21:41 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Guennadi Liakhovetski

On Wednesday 12 March 2008, Greg KH wrote:
> On Wed, Mar 12, 2008 at 01:32:05PM -0700, Greg KH wrote:
> > On Wed, Mar 12, 2008 at 01:01:15PM -0700, Linus Torvalds wrote:
> > > 
> > > 
> > > On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:
> > > 
> > > > 
> > > > In http://bugzilla.kernel.org/show_bug.cgi?id=10123 Guennadi says that
> > > > reverting
> > > > 
> > > > commit fd7d1ced29e5beb88c9068801da7a362606d8273
> > > > Author: Greg Kroah-Hartman <gregkh@suse.de>
> > > > Date:   Tue May 22 22:47:54 2007 -0400
> > > > 
> > > >     PCI: make pci_bus a struct device
> > > > 
> > > > fixes the problem for him (this seems to be yet another reboot/poweroff IOW).
> > > 
> > > Ahh, I thought this was done already, but nope, my PCI pull from Greg 
> > > didn't contain the revert.
> > > 
> > > Greg? I know you must be aware of the problem, because you replied to the 
> > > email at some point. Wazzup?
> > 
> > I'm still trying to figure out why his is the only machine having
> > problems with this.  I think it's an acpi "we walk the list of pci
> > devices twice" type thing, but don't know yet.
> 
> Ok, I think I got it.  And it looks like an ACPI bug, but one that we
> might have been ignoring for a long time...
> 
> 
> In looking at the log files at boot, we see that we are using ACPI to
> find the PCI devices:
> 
> 	ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]

This is just ACPI telling us that it found a PCI Interrupt Routing table.
We load it and have it available for reference later when the PCI
devices request their IRQs, i.e. it responds to PCI probing;
it doesn't cause PCI probing.

> Followed by a lot of kobjects for pci devices being added, including
> this root bus:
> 	kobject: '0000:01:00.0' (c7c978cc): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
> 	kobject: '0000:01:00.0' (c7c978cc): kobject_uevent_env
> 	kobject: '0000:01:00.0' (c7c978cc): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:00.0'
> 	kobject: '0000:01' (c7c35900): kobject_add_internal: parent: 'pci_bus', set: 'devices'
> 	kobject: '0000:01' (c7c35900): kobject_uevent_env
> 	kobject: '0000:01' (c7c35900): fill_kobj_path: path = '/class/pci_bus/0000:01'

I don't think ACPI is doing this directly.
More likely that PNP is doing it via PNPACPI.
(try booting with pnpacpi=off)

> All is fine, until later on we decide to fallback to the "old" style of
> probing:
> 	PCI: Probing PCI hardware

Why do we fall back?
I don't see this line at all on my test box.

-Len

> 	kobject (c7c35900): tried to init an initialized object, something is seriously wrong.
> 	Pid: 1, comm: swapper Not tainted 2.6.25-rc2-testpm #30
> 	 [<c01ea0e9>] kobject_init+0x89/0x90
> 	 [<c025094e>] device_initialize+0x1e/0x90
> 	 [<c025119b>] device_register+0xb/0x20
> 	 [<c01f3fd8>] pci_bus_add_devices+0x98/0x140
> 	 [<c030aff7>] ? pcibios_scan_root+0x27/0xa0
> 	 [<c03f69d0>] pci_legacy_init+0x50/0xf0
> 	 [<c03db5c2>] kernel_init+0x132/0x310
> 	 [<c010303a>] ? ret_from_fork+0x6/0x1c
> 	 [<c03db490>] ? kernel_init+0x0/0x310
> 	 [<c03db490>] ? kernel_init+0x0/0x310
> 	 [<c0103d3f>] kernel_thread_helper+0x7/0x18
> 	=======================
> 	kobject: '0000:01' (c7c35900): kobject_add_internal: parent: 'pci_bus', set: 'devices'
> 
> This shows that we are trying to register the exact same kobject that we
> had already previously registered.  Not nice...
> 
> Now we have a check in the pci bus code to not register anything that we
> had already registered in the past:
> 
>         list_for_each_entry(dev, &bus->devices, bus_list) {
>                 /*
>                  * Skip already-present devices (which are on the
>                  * global device list.)
>                  */
>                 if (!list_empty(&dev->global_list))
>                         continue;
>                 retval = pci_bus_add_device(dev);
> 
> But, in redoing the pci list logic (coming in .26 and in -mm and -next)
> I realized that this wasn't a real check, as this list is just a
> "shadow" list that some types of pci probing never set up.
> 
> So that explains why the warning we get when trying to register a device
> multiple times in the kobject core.
> 
> But why does this happen in the first place?
> 
> The code in arch/x86/pci/legacy.c::pci_legacy_init() checks the
> pcibios_scanned flag to determine if we had already scanned the PCI bus.
> Which we did in the ACPI code, right?
> 
> So, Len, shouldn't we be setting this flag in the ACPI core if we had
> already scanned the pci bus there?
> 
> I can fix this problem by putting the check in the pci core in
> pci_bus_add_devices() like we have done in -next, but I think that we
> also need to do something in ACPI as well.
> 
> Guennadi, could you test the -next kernel tree to see if the logic there
> solves this issue for you?
> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-10 23:14 2.6.25-rc5: Reported regressions from 2.6.24 Rafael J. Wysocki
  2008-03-11  0:40 ` Jeff Garzik
  2008-03-11 12:22 ` Stefan Richter
@ 2008-03-12 22:12 ` Christian Kujau
  2008-03-12 23:01   ` Rafael J. Wysocki
  2008-03-13  5:03 ` David Chinner
  3 siblings, 1 reply; 41+ messages in thread
From: Christian Kujau @ 2008-03-12 22:12 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: LKML

On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10191
> Subject		: Treason uncloaked spams syslog with latest git
> Submitter	: Thomas Gleixner <tglx@linutronix.de>
> Date		: 2008-03-06 05:47
> References	: http://www.ussg.iu.edu/hypermail/linux/kernel/0803.0/2444.html

Dunno if you got this, but this seems to be fixed: 
http://lkml.org/lkml/2008/3/11/435

> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10207
> Subject		: INFO: task mount:11202 blocked for more than 120 seconds
> Submitter	: Christian Kujau <lists@nerdbynature.de>
> Date		: 2008-03-07 21:32
> References	: http://lkml.org/lkml/2008/3/7/308
> 		  http://lkml.org/lkml/2008/3/9/186
> Handled-By	: David Chinner <dgc@sgi.com>

FWIW, it has been reported by Chr too: http://lkml.org/lkml/2008/3/12/313
And David could be taken out of the loop, as it seems dm-crypt related 
(dm-devel@redhat.com is already notified), not XFS related.


Thanks for maintaining this list,
Christian.
-- 
BOFH excuse #348:

We're on Token Ring, and it looks like the token got loose.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 21:27             ` pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24) Greg KH
  2008-03-12 21:38               ` Greg KH
  2008-03-12 21:41               ` Len Brown
@ 2008-03-12 22:20               ` Linus Torvalds
  2008-03-12 22:34                 ` Greg KH
  2 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 22:20 UTC (permalink / raw)
  To: Greg KH
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski



On Wed, 12 Mar 2008, Greg KH wrote:
> 
> Ok, I think I got it.  And it looks like an ACPI bug, but one that we
> might have been ignoring for a long time...

I still think that the fact that it regressed in that PCI patch means that 
there is simply something wrong with the patch. At the very least that 
patch changed behaviour, which was *not* what it was claiming it was 
doing.

I do think it's triggered by the "acpi=noirq" setting: that means that 
ACPI *won't* disable the legacy scan. Now, admittedly that's a really odd 
thing to do, and I think it's really strange how pci_acpi_init() does that

	pcibios_scanned++;

in a place where it is not actually scanning the bus, so I do agree that 
ACPI is doing something really odd here, but the fact is, this code all 
used to work.
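
The interaction described above can be modeled in userspace as follows.
This is a sketch under assumptions, not the real init code: the early
return on acpi_noirq is inferred from the description, early_enumeration()
is a hypothetical stand-in for whatever registered the bus first, and
struct boot_state is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; the names echo the thread, the bodies are made up. */
struct boot_state {
	int pcibios_scanned;	/* nonzero once some path claims the scan is done */
	int bus_registrations;	/* how many times the root bus was registered */
};

/* Hypothetical stand-in for the earlier enumeration of the bus. */
static void early_enumeration(struct boot_state *s)
{
	s->bus_registrations++;
}

/*
 * Model of pci_acpi_init() as described: with acpi=noirq it returns
 * before touching the flag (ASSUMPTION), otherwise it bumps the flag
 * even though it does not scan anything itself.
 */
static void model_pci_acpi_init(struct boot_state *s, bool acpi_noirq)
{
	if (acpi_noirq)
		return;
	s->pcibios_scanned++;
}

/* Model of pci_legacy_init(): probe only if nobody has set the flag. */
static void model_pci_legacy_init(struct boot_state *s)
{
	if (s->pcibios_scanned++)
		return;
	s->bus_registrations++;	/* the "PCI: Probing PCI hardware" path */
}

/* Run the modeled boot sequence and report how often the bus was registered. */
static int boot(bool acpi_noirq)
{
	struct boot_state s = { 0, 0 };

	early_enumeration(&s);
	model_pci_acpi_init(&s, acpi_noirq);
	model_pci_legacy_init(&s);
	return s.bus_registrations;
}
```

In this model boot(false) registers the bus once, while boot(true), the
acpi=noirq case, lets the legacy probe run and registers it a second time.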

Can we please just fix the regression caused by that offending patch?  In 
other words: why did that patch change behaviour AT ALL?

Quite frankly, we're too late in the game to say "this exposed some other 
long-time bug". That *particular* patch needs to be fixed, or reverted. We 
can look at changing ACPI in 2.6.26, not in -rc6.

				Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 21:38               ` Greg KH
@ 2008-03-12 22:25                 ` Linus Torvalds
  2008-03-12 22:54                   ` Greg KH
  0 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 22:25 UTC (permalink / raw)
  To: Greg KH
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski



On Wed, 12 Mar 2008, Greg KH wrote:
> 
> Actually, here's a simple patch from -next that should test this logic
> for you.  Can you let me know if this solves the start up WARNING dump
> for you?

This patch looks bogus.

Why do you introduce a "dev->is_added" field that apparently has to match 
the old "list_empty(&dev->global_list)" 1:1 anyway?

In other words: when is it *ever* permissible for "is_added" to have a 
different value from the "list_empty(..)" logic? And if they must always 
match (and it looks like they have to, since you set and clear the flag 
exactly when you add/remove it from the list), then what exactly is this 
supposed to fix?

			Linus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 22:20               ` Linus Torvalds
@ 2008-03-12 22:34                 ` Greg KH
  2008-03-12 23:02                   ` Linus Torvalds
  0 siblings, 1 reply; 41+ messages in thread
From: Greg KH @ 2008-03-12 22:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 03:20:51PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 12 Mar 2008, Greg KH wrote:
> > 
> > Ok, I think I got it.  And it looks like an ACPI bug, but one that we
> > might have been ignoring for a long time...
> 
> I still think that the fact that it regressed in that PCI patch means that 
> there is simply something wrong with the patch. At the very least that 
> patch changed behaviour, which was *not* what it was claiming it was 
> doing.
> 
> I do think it's triggered by the "acpi=noirq" setting: that means that 
> ACPI *won't* disable the legacy scan. Now, admittedly that's a really odd 
> thing to do, and I think it's really strange how pci_acpi_init() does that
> 
> 	pcibios_scanned++;
> 
> in a place where it is not actually scanning the bus, so I do agree that 
> ACPI is doing something really odd here, but the fact is, this code all 
> used to work.
> 
> Can we please just fix the regression caused by that offending patch?  In 
> other words: why did that patch change behaviour AT ALL?
> 
> Quite frankly, we're too late in the game to say "this exposed some other 
> long-time bug". That *particular* patch needs to be fixed, or reverted. We 
> can look at changing ACPI in 2.6.26, not in -rc6.

What happened in .25-rc was that we now catch these kinds of problems
(watching for duplicate kobjects being registered and such.)  So this
might have always been happening, but no warning was ever produced.

I can revert the "catch this kind of thing" patch, but I don't think
that's the real solution here :)

The reason we aren't shutting down is also due to the way kobjects now
work.  If you don't clean up properly, they linger around and something
on the shutdown path (I haven't figured that out yet) doesn't want to
stop the machine.

We have seen this in a number of places, all catching real problems in
subsystems where they were grabbing 2 references to an object, and then
only releasing one when finished (cpufreq is an example of this.)  When
those are fixed, the shutdown problem goes away.

So in this case, we are registering a kobject twice, which increases the
reference count, and then we never clean it up on shutdown properly as
we only drop it once.  Hence the shutdown problem.
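
A minimal userspace sketch of that imbalance, assuming a toy refcounted
object in which each registration takes one reference.  struct toy_obj and
its helpers are hypothetical; this models the claim above and is not how
the kobject core is actually implemented:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical refcounted object, standing in for a kobject. */
struct toy_obj {
	int refcount;
	bool released;	/* set once the last reference is dropped */
};

static void toy_get(struct toy_obj *o)
{
	o->refcount++;
}

static void toy_put(struct toy_obj *o)
{
	if (--o->refcount == 0)
		o->released = true;
}

/* ASSUMPTION: every registration takes exactly one reference. */
static void toy_register(struct toy_obj *o)
{
	toy_get(o);
}

/* Teardown drops exactly one reference. */
static void toy_unregister(struct toy_obj *o)
{
	toy_put(o);
}

/*
 * Register the object the given number of times, unregister it once
 * (as a shutdown path would), and report whether it was fully released.
 */
static bool released_after_shutdown(int registrations)
{
	struct toy_obj o = { 0, false };
	int i;

	for (i = 0; i < registrations; i++)
		toy_register(&o);
	toy_unregister(&o);
	return o.released;
}
```

With one registration the single unregister releases the object; with two,
one reference is left over and the object lingers.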

So, we need to not register the device twice; my patch should fix that
problem.  Or we can add the pcibios_scanned++ call somewhere in PNP or
ACPI to prevent us from ever attempting to register the device
twice.  Either way should fix Guennadi's issue.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 22:25                 ` Linus Torvalds
@ 2008-03-12 22:54                   ` Greg KH
  2008-03-12 23:09                     ` Linus Torvalds
  0 siblings, 1 reply; 41+ messages in thread
From: Greg KH @ 2008-03-12 22:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 03:25:41PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 12 Mar 2008, Greg KH wrote:
> > 
> > Actually, here's a simple patch from -next that should test this logic
> > for you.  Can you let me know if this solves the start up WARNING dump
> > for you?
> 
> This patch looks bogus.
> 
> Why do you introduce a "dev->is_added" field that apparently has to match 
> the old "list_empty(&dev->global_list)" 1:1 anyway?
> 
> In other words: when is it *ever* permissible for "is_added" to have a 
> different value from the "list_empty(..)" logic? And if they must always 
> match (and it looks like they have to, since you set and clear the flag 
> exactly when you add/remove it from the list), then what exactly is this 
> supposed to fix?

In the patch series in -next, it is supposed to replace the list_empty()
logic exactly, as that list goes away in the next patch in the series.

So yes, it is not a "fix" per se, but it would be nice to see if it solves
this issue in some way.

All I can think is that somehow this pci device for the root hub isn't
added to that extra list (as that is only done in the pcibios logic) and
so it isn't set.

I can't get a box here to produce both of those PCI: messages myself,
and neither can Len, so something is really odd here.  And that has
nothing to do with the pci_bus rework; that is just showing the problem
more accurately now.  Even if it were to be reverted, the root problem
would still be present.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-12 22:12 ` Christian Kujau
@ 2008-03-12 23:01   ` Rafael J. Wysocki
  0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2008-03-12 23:01 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, Herbert Xu

On Wednesday, 12 of March 2008, Christian Kujau wrote:
> On Tue, 11 Mar 2008, Rafael J. Wysocki wrote:
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10191
> > Subject		: Treason uncloaked spams syslog with latest git
> > Submitter	: Thomas Gleixner <tglx@linutronix.de>
> > Date		: 2008-03-06 05:47
> > References	: http://www.ussg.iu.edu/hypermail/linux/kernel/0803.0/2444.html
> 
> Dunno if you got this, but this seems to be fixed: 
> http://lkml.org/lkml/2008/3/11/435

Yes, I noticed the patch and updated the Bugzilla entry with a link to it,
but it will be closed when the patch appears in Linus' tree.

> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10207
> > Subject		: INFO: task mount:11202 blocked for more than 120 seconds
> > Submitter	: Christian Kujau <lists@nerdbynature.de>
> > Date		: 2008-03-07 21:32
> > References	: http://lkml.org/lkml/2008/3/7/308
> > 		  http://lkml.org/lkml/2008/3/9/186
> > Handled-By	: David Chinner <dgc@sgi.com>
> 
> FWIW, it has been reported by Chr too: http://lkml.org/lkml/2008/3/12/313

Yes.

> And David could be taken out of the loop, as it seems dm-crypt related 
> (dm-devel@redhat.com is already notified), not XFS related.

The entry has already been updated and Herbert is on its CC list.

Thanks,
Rafael


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 22:34                 ` Greg KH
@ 2008-03-12 23:02                   ` Linus Torvalds
  2008-03-12 23:16                     ` Greg KH
  2008-03-12 23:17                     ` Guennadi Liakhovetski
  0 siblings, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 23:02 UTC (permalink / raw)
  To: Greg KH
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski



On Wed, 12 Mar 2008, Greg KH wrote:
> 
> What happened in .25-rc was that we now catch these kinds of problems
> (watching for duplicate kobjects to be registered and such.)  So this
> might have always been happening, but no warning was ever produced.

It's not the warning that worries me. It's the apparent oops (keyboard 
leds blinking?) at shutdown/poweroff!

> The reason we aren't shutting down is also due to the way kobjects now
> work.  If you don't clean up properly, they linger around and something
> on the shutdown path (I haven't figured that out yet) doesn't want to
> stop the machine.

.. and that's my issue! We're too late in the game to try to figure things 
out and leave things hanging. The patch broke something, it needs to be 
fixed or reverted. It's been going on too long.

I think it should have been reverted probably two weeks ago already. We 
can re-apply it early in the 2.6.26 series, and then try to fix it right. 

Since there is at least a patch worth trying now, I'll hold off reverting 
it and wait for Guennadi to test the patch, but the fact is, we shouldn't 
have a known-broken kernel for several weeks, when there is a known fix 
for it in reverting a single commit!

We have _way_ too many regressions as it is. Regressions are bad. Ones 
that have known causes and haven't been fixed in three weeks are 
unacceptable.

			Linus


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 22:54                   ` Greg KH
@ 2008-03-12 23:09                     ` Linus Torvalds
  2008-03-13  4:48                       ` Greg KH
  2008-03-13  5:44                       ` Greg KH
  0 siblings, 2 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 23:09 UTC (permalink / raw)
  To: Greg KH
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski



On Wed, 12 Mar 2008, Greg KH wrote:
> 
> I can't get a box here to produce both of those PCI: messages myself,
> and neither can Len, so something is really odd here.

You can't?

I can trivially reproduce the warnings on my laptop by just adding 
"acpi=noirq" to the command line in grub.

	PCI: Probing PCI hardware
	kobject (ffff81007e08d9c8): tried to init an initialized object, something is seriously wrong.
	Pid: 1, comm: swapper Not tainted 2.6.25-rc3-00081-g7704a8b #29
	
	Call Trace:
	 [<ffffffff8054f921>] __down_read+0x12/0x93
	 [<ffffffff80313d60>] kobject_init+0x39/0x82
	 [<ffffffff803956d6>] device_initialize+0x25/0xa4
	 [<ffffffff80395f83>] device_register+0x9/0x12
	 [<ffffffff80322cdc>] pci_bus_add_devices+0xe2/0x13e
	 [<ffffffff807491be>] pci_legacy_init+0x66/0xf9
	 [<ffffffff8039763e>] bus_register+0x15b/0x221
	 [<ffffffff8072a6ba>] kernel_init+0x14a/0x2b4
	 [<ffffffff8020be38>] child_rip+0xa/0x12
	 [<ffffffff8072a570>] kernel_init+0x0/0x2b4
	 [<ffffffff8020be2e>] child_rip+0x0/0x12

did you try just adding that simple command line thing?

			Linus
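
The trace above shows device_register() reaching kobject_init() for an
object that had already been set up once. A minimal user-space sketch of
that kind of double-init check follows. It is an assumption about the shape
of the check, not the kernel's code: struct model_kobject, the
state_initialized flag as used here, and model_kobject_init() are all
hypothetical names for illustration.

```c
#include <stdio.h>

/* Hypothetical stand-in for a kobject; only the init flag matters here. */
struct model_kobject {
	unsigned int state_initialized:1;
};

/*
 * Initialize the object, complaining if it was initialized before.
 * Returns 1 when a repeated init was detected, 0 on a first init.
 * Like the warning in the log, this reports the problem and carries on.
 */
static int model_kobject_init(struct model_kobject *kobj)
{
	int was_initialized = kobj->state_initialized;

	if (was_initialized)
		fprintf(stderr,
			"kobject (%p): tried to init an initialized object\n",
			(void *)kobj);

	kobj->state_initialized = 1;
	return was_initialized;
}
```

Calling model_kobject_init() twice on the same object returns 0 and then 1,
which corresponds to a bus being registered through two different paths.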


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:02                   ` Linus Torvalds
@ 2008-03-12 23:16                     ` Greg KH
  2008-03-12 23:32                       ` Guennadi Liakhovetski
  2008-03-12 23:35                       ` Linus Torvalds
  2008-03-12 23:17                     ` Guennadi Liakhovetski
  1 sibling, 2 replies; 41+ messages in thread
From: Greg KH @ 2008-03-12 23:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 04:02:57PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 12 Mar 2008, Greg KH wrote:
> > 
> > What happened in .25-rc was that we now catch these kinds of problems
> > (watching for duplicate kobjects to be registered and such.)  So this
> > might have always been happening, but no warning was ever produced.
> 
> It's not the warning that worries me. It's the apparent oops (keyboard 
> leds blinking?) at shutdown/poweroff!

It oopses at shutdown?  I thought this was originally reported as a
"will not power off" which for a while was attributed to the cpufreq fix
that went into -rc2 or -rc3.

I didn't realize there was an oops, sorry.

> > The reason we aren't shutting down is also due to the way kobjects now
> > work.  If you don't clean up properly, they linger around and something
> > on the shutdown path (I haven't figured that out yet) doesn't want to
> > stop the machine.
> 
> .. and that's my issue! We're too late in the game to try to figure things 
> out and leave things hanging. The patch broke something, it needs to be 
> fixed or reverted. It's been going on too long.
> 
> I think it should have been reverted probably two weeks ago already. We 
> can re-apply it early in the 2.6.26 series, and then try to fix it right. 
> 
> Since there is at least a patch worth trying now, I'll hold off reverting 
> it and wait for Guennadi to test the patch, but the fact is, we shouldn't 
> have a known-broken kernel for several weeks, when there is a known fix 
> for it in reverting a single commit!
> 
> We have _way_ too many regressions as it is. Regressions are bad. Ones 
> that have known causes and haven't been fixed in three weeks are 
> unacceptable.

Sorry, I thought this was just a warning at boot time.

It would be interesting to see if reverting the pci_bus patch did
anything about the fact that we register the root PCI bus through two
different methods.

thanks,

greg k-h


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:02                   ` Linus Torvalds
  2008-03-12 23:16                     ` Greg KH
@ 2008-03-12 23:17                     ` Guennadi Liakhovetski
  2008-03-12 23:37                       ` Linus Torvalds
  1 sibling, 1 reply; 41+ messages in thread
From: Guennadi Liakhovetski @ 2008-03-12 23:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Ingo Molnar, Len Brown

On Wed, 12 Mar 2008, Linus Torvalds wrote:

> On Wed, 12 Mar 2008, Greg KH wrote:
> > 
> > What happened in .25-rc was that we now catch these kinds of problems
> > (watching for duplicate kobjects to be registered and such.)  So this
> > might have always been happening, but no warning was ever produced.
> 
> It's not the warning that worries me. It's the apparent oops (keyboard 
> leds blinking?) at shutdown/poweroff!

No, no oops, no blinking LEDs. The machine just stays there after syncing 
SCSI disks. I can still call sysrqs, and I've captured them with the 
serial console - see complete dumps here: 
http://bugzilla.kernel.org/attachment.cgi?id=15057

> Since there is at least a patch worth trying now, I'll hold off reverting 
> it and wait for Guennadi to test the patch, but the fact is, we shouldn't 
> have a known-broken kernel for several weeks, when there is a known fix 
> for it in reverting a single commit!

I'll test it in about 12 hours.

Thanks
Guennadi
---
Guennadi Liakhovetski


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:16                     ` Greg KH
@ 2008-03-12 23:32                       ` Guennadi Liakhovetski
  2008-03-12 23:37                         ` Greg KH
  2008-03-12 23:35                       ` Linus Torvalds
  1 sibling, 1 reply; 41+ messages in thread
From: Guennadi Liakhovetski @ 2008-03-12 23:32 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown

On Wed, 12 Mar 2008, Greg KH wrote:

> It oopses at shutdown?  I thought this was originally reported as a
> "will not power off" which for a while was attributed to the cpufreq fix
> that went into -rc2 or -rc3.

As I already replied to Linus, no, it doesn't.

> It would be interesting to see if reverting the pci_bus patch did
> anything about the fact that we register the root PCI bus through two
> different methods.

You mean this: http://marc.info/?l=linux-kernel&m=120483340622706&w=2

Thanks
Guennadi
---
Guennadi Liakhovetski


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:16                     ` Greg KH
  2008-03-12 23:32                       ` Guennadi Liakhovetski
@ 2008-03-12 23:35                       ` Linus Torvalds
  1 sibling, 0 replies; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 23:35 UTC (permalink / raw)
  To: Greg KH
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski



On Wed, 12 Mar 2008, Greg KH wrote:
> >
> > It's not the warning that worries me. It's the apparent oops (keyboard 
> > leds blinking?) at shutdown/poweroff!
> 
> It oopses at shutdown?  I thought this was originally reported as a
> "will not power off" which for a while was attributed to the cpufreq fix
> that went into -rc2 or -rc3.
> 
> I didn't realize there was an oops, sorry.

I'm not at all sure there is an oops - in fact, I'd have expected it to 
show up on the serial console if there was one.

The bug report says that the keyboard leds blink, which is *sometimes* due 
to having the led oops blinking code enabled, but hey, no actual oops was 
ever shown, and sometimes a blink is just a blink.

Did you see the full dmesg from syslog? That one has not just the 
warnings, but also sysrq output at the point it hangs. The suspicious 
thing seems to be

	halt          R running      0  3291   3289
	       c013d75a 7488242e 00000180 75b8ffa0 c6efbddc c6efbddc c0426d00 c6efbdf0 
	       c0125da4 c6efbe10 c0125ed1 0000000a 00000001 c0426d00 c0426d00 00000046 
	       b7efcff4 c6efbe20 00000046 c0426d00 c0426d00 c6efbe2c c0126068 c110c060 
	Call Trace:
	 [<c013d75a>] ? tick_program_event+0x4a/0x80
	 [<c0125da4>] ? _local_bh_enable+0x24/0x80
	 [<c0125ed1>] ? __do_softirq+0xd1/0xf0
	 [<c0126068>] ? irq_exit+0x28/0x90
	 [<c0313079>] ? preempt_schedule_irq+0x49/0x70
	 [<c0103c28>] ? apic_timer_interrupt+0x28/0x30
	 [<c0251c58>] ? device_shutdown+0x48/0x70
	 [<c012e618>] ? kernel_shutdown_prepare+0x28/0x30
	 [<c012e630>] ? kernel_power_off+0x10/0x40
	...

which makes me suspect we're in some endless loop in device_shutdown(), 
but that's just a random guess (it seems to be running on the other CPU: 
CPU0 is in idle - and when that happens the stack trace is really not 
very reliable at all, so take all that with a huge pinch of salt!).

> Sorry, I thought this was just a warning at boot time.

If it had been just the warning, I would ignore it as a good thing to be 
cleaned up later. But no, the original problem was the inability to halt 
and reboot, and the bugzilla entry says

	It also introduces these two errors:
	   ^^^^

with underlining by me. So the warnings in themselves are just an 
interesting coincidence (and probably related to the cause, of course).

		Linus


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:32                       ` Guennadi Liakhovetski
@ 2008-03-12 23:37                         ` Greg KH
  0 siblings, 0 replies; 41+ messages in thread
From: Greg KH @ 2008-03-12 23:37 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown

On Thu, Mar 13, 2008 at 12:32:48AM +0100, Guennadi Liakhovetski wrote:
> On Wed, 12 Mar 2008, Greg KH wrote:
> 
> > It oopses at shutdown?  I thought this was originally reported as a
> > "will not power off" which for a while was attributed to the cpufreq fix
> > that went into -rc2 or -rc3.
> 
> As I already replied to Linus, no, it doesn't.
> 
> > It would be interesting to see if reverting the pci_bus patch did
> > anything about the fact that we register the root PCI bus through two
> > different methods.
> 
> You mean this: http://marc.info/?l=linux-kernel&m=120483340622706&w=2

Yes, the warnings go away as there is no more struct device to register,
but the big "PCI:" messages in the syslog at startup with the patch
reverted are what I am curious about.

I'll test more in a few hours, have to go herd the kids off to piano
lessons...

thanks,

greg k-h


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:17                     ` Guennadi Liakhovetski
@ 2008-03-12 23:37                       ` Linus Torvalds
  2008-03-12 23:47                         ` Guennadi Liakhovetski
  0 siblings, 1 reply; 41+ messages in thread
From: Linus Torvalds @ 2008-03-12 23:37 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Greg KH, Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Ingo Molnar, Len Brown



On Thu, 13 Mar 2008, Guennadi Liakhovetski wrote:
> 
> No, no oops, no blinking LEDs.

Oh, the original report says:

  "Problem Description: Power off / reboot blink keyboard LEDs and leave 
   the system at .."

which is why I thought it might be a hidden oops.

But that must have been some unrelated and misleading red herring.

		Linus


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:37                       ` Linus Torvalds
@ 2008-03-12 23:47                         ` Guennadi Liakhovetski
  0 siblings, 0 replies; 41+ messages in thread
From: Guennadi Liakhovetski @ 2008-03-12 23:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Greg KH, Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk,
	Andrew Morton, Natalie Protasevich, Ingo Molnar, Len Brown

On Wed, 12 Mar 2008, Linus Torvalds wrote:

> On Thu, 13 Mar 2008, Guennadi Liakhovetski wrote:
> > 
> > No, no oops, no blinking LEDs.
> 
> Oh, the original report says:
> 
>   "Problem Description: Power off / reboot blink keyboard LEDs and leave 
>    the system at .."
> 
> which is why I thought it might be a hidden oops.
> 
> But that must have been some unrelated and misleading red herring.

Ah, ok, I see now. No, no herring here. The LEDs just blink _once_ as they 
always do before shutdown (on this machine?). But the machine stays on.

Thanks
Guennadi
---
Guennadi Liakhovetski


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:09                     ` Linus Torvalds
@ 2008-03-13  4:48                       ` Greg KH
  2008-03-13  5:44                       ` Greg KH
  1 sibling, 0 replies; 41+ messages in thread
From: Greg KH @ 2008-03-13  4:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 04:09:17PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 12 Mar 2008, Greg KH wrote:
> > 
> > I can't get a box here to produce both of those PCI: messages myself,
> > and neither can Len, so something is really odd here.
> 
> You can't?
> 
> I can trivially reproduce the warnings on my laptop by just adding 
> "acpi=noirq" to the command line in grub.
> 
> 	PCI: Probing PCI hardware
> 	kobject (ffff81007e08d9c8): tried to init an initialized object, something is seriously wrong.
> 	Pid: 1, comm: swapper Not tainted 2.6.25-rc3-00081-g7704a8b #29
> 	
> 	Call Trace:
> 	 [<ffffffff8054f921>] __down_read+0x12/0x93
> 	 [<ffffffff80313d60>] kobject_init+0x39/0x82
> 	 [<ffffffff803956d6>] device_initialize+0x25/0xa4
> 	 [<ffffffff80395f83>] device_register+0x9/0x12
> 	 [<ffffffff80322cdc>] pci_bus_add_devices+0xe2/0x13e
> 	 [<ffffffff807491be>] pci_legacy_init+0x66/0xf9
> 	 [<ffffffff8039763e>] bus_register+0x15b/0x221
> 	 [<ffffffff8072a6ba>] kernel_init+0x14a/0x2b4
> 	 [<ffffffff8020be38>] child_rip+0xa/0x12
> 	 [<ffffffff8072a570>] kernel_init+0x0/0x2b4
> 	 [<ffffffff8020be2e>] child_rip+0x0/0x12
> 
> did you try just adding that simple command line thing?

This wasn't doing anything on my laptop, but it does cause the warning
on my mac mini, thanks for showing how to trigger it.

And that's with the patch I posted, so that's no good.

Let me see if I can figure it out now that I can reproduce it...

thanks,

greg k-h


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-10 23:14 2.6.25-rc5: Reported regressions from 2.6.24 Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2008-03-12 22:12 ` Christian Kujau
@ 2008-03-13  5:03 ` David Chinner
  2008-03-13 21:12   ` Rafael J. Wysocki
  3 siblings, 1 reply; 41+ messages in thread
From: David Chinner @ 2008-03-13  5:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Adrian Bunk, Andrew Morton, Linus Torvalds, Natalie Protasevich

On Tue, Mar 11, 2008 at 12:14:52AM +0100, Rafael J. Wysocki wrote:
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10207
> Subject		: INFO: task mount:11202 blocked for more than 120 seconds
> Submitter	: Christian Kujau <lists@nerdbynature.de>
> Date		: 2008-03-07 21:32
> References	: http://lkml.org/lkml/2008/3/7/308
> 		  http://lkml.org/lkml/2008/3/9/186
> Handled-By	: David Chinner <dgc@sgi.com>

Rafael, this looks to be something related to dm-crypt, not XFS. Can you
reassign it appropriately?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-12 23:09                     ` Linus Torvalds
  2008-03-13  4:48                       ` Greg KH
@ 2008-03-13  5:44                       ` Greg KH
  2008-03-13  6:24                         ` Yinghai Lu
  2008-03-13 10:06                         ` Guennadi Liakhovetski
  1 sibling, 2 replies; 41+ messages in thread
From: Greg KH @ 2008-03-13  5:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Jeff Garzik, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 04:09:17PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 12 Mar 2008, Greg KH wrote:
> > 
> > I can't get a box here to produce both of those PCI: messages myself,
> > and neither can Len, so something is really odd here.

Ok, stupid me, this was my fault.  I was assuming that pci busses would
never be registered multiple times with the pci core.  Obviously this
isn't true.  The previous patch I proposed was only paying attention to
the PCI devices, and that logic is just fine (it's already protected
when it is attempted to be registered multiple times.)

So, the patch below fixes the issue for me, and reboot seems to work as
well.

Guennadi, can you test this out on your machine?

thanks for your patience,

greg k-h

From: Greg Kroah-Hartman <gregkh@suse.de>
Subject: PCI: fix issue with busses registering multiple times in sysfs

PCI busses can be registered multiple times, so we need to detect if we
have registered our bus structure in sysfs already.  If so, don't do it
again.

Thanks to Guennadi Liakhovetski <g.liakhovetski@gmx.de> for reporting
the problem, and to Linus for poking me to get me to believe that it was
a real problem.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/pci/bus.c   |    6 +++++-
 include/linux/pci.h |    1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -143,14 +143,18 @@ void pci_bus_add_devices(struct pci_bus 
 			/* register the bus with sysfs as the parent is now
 			 * properly registered. */
 			child_bus = dev->subordinate;
+			if (child_bus->is_added)
+				continue;
 			child_bus->dev.parent = child_bus->bridge;
 			retval = device_register(&child_bus->dev);
 			if (retval)
 				dev_err(&dev->dev, "Error registering pci_bus,"
 					" continuing...\n");
-			else
+			else {
+				child_bus->is_added = 1;
 				retval = device_create_file(&child_bus->dev,
 							&dev_attr_cpuaffinity);
+			}
 			if (retval)
 				dev_err(&dev->dev, "Error creating cpuaffinity"
 					" file, continuing...\n");
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -278,6 +278,7 @@ struct pci_bus {
 	struct device		dev;
 	struct bin_attribute	*legacy_io; /* legacy I/O for this bus */
 	struct bin_attribute	*legacy_mem; /* legacy mem */
+	unsigned int		is_added:1;
 };
 
 #define pci_bus_b(n)	list_entry(n, struct pci_bus, node)
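
The guard added by the patch above can be exercised outside the kernel with
a small model. This is a sketch only, assuming a simplified bus structure:
struct model_bus, model_device_register() and model_bus_add() are
hypothetical names, and the stub registration always succeeds.

```c
/* Hypothetical stand-in for a bus; counts how often it gets registered. */
struct model_bus {
	unsigned int is_added:1;
	int register_calls;
};

/* Stub for the registration step; always succeeds in this model. */
static int model_device_register(struct model_bus *bus)
{
	bus->register_calls++;
	return 0;
}

/*
 * Register the bus unless that has already happened.
 * Returns 1 if it was registered by this call, 0 if the call was
 * skipped because the bus was already added, or a negative error.
 */
static int model_bus_add(struct model_bus *bus)
{
	int retval;

	if (bus->is_added)
		return 0;	/* second scan of the same bus: do nothing */

	retval = model_device_register(bus);
	if (retval)
		return retval;	/* leave is_added clear on failure */

	bus->is_added = 1;	/* only mark the bus once it is registered */
	return 1;
}
```

Setting the flag only after a successful registration mirrors the else
branch in the diff, so a failed attempt does not block a later retry.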


* Re: pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-13  5:44                       ` Greg KH
@ 2008-03-13  6:24                         ` Yinghai Lu
  2008-03-13 10:07                           ` Guennadi Liakhovetski
  2008-03-13 10:06                         ` Guennadi Liakhovetski
  1 sibling, 1 reply; 41+ messages in thread
From: Yinghai Lu @ 2008-03-13  6:24 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown, Guennadi Liakhovetski

On Wed, Mar 12, 2008 at 10:44 PM, Greg KH <greg@kroah.com> wrote:
> On Wed, Mar 12, 2008 at 04:09:17PM -0700, Linus Torvalds wrote:
>  >
>  >
>
> > On Wed, 12 Mar 2008, Greg KH wrote:
>  > >
>  > > I can't get a box here to produce both of those PCI: messages myself,
>  > > and neither can Len, so something is really odd here.
>
>  Ok, stupid me, this was my fault.  I was assuming that pci busses would
>  never be registered multiple times with the pci core.  Obviously this
>  isn't true.  The previous patch I proposed was only paying attention to
>  the PCI devices, and that logic is just fine (it's already protected
>  when it is attempted to be registered multiple times.)
>
>  So, the patch below fixes the issue for me, and reboot seems to work as
>  well.
>
>  Guennadi, can you test this out on your machine?
>
>  thanks for your patience,
>
>  greg k-h
>
>
>  From: Greg Kroah-Hartman <gregkh@suse.de>
>  Subject: PCI: fix issue with busses registering multiple times in sysfs
>
>  PCI busses can be registered multiple times, so we need to detect if we
>  have registered our bus structure in sysfs already.  If so, don't do it
>  again.
>
>  Thanks to Guennadi Liakhovetski <g.liakhovetski@gmx.de> for reporting
>  the problem, and to Linus for poking me to get me to believe that it was
>  a real problem.
>
>  Cc: Linus Torvalds <torvalds@linux-foundation.org>
>  Cc: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
>
> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

wonder if

http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-x86.git;a=commitdiff;h=fff07473e243989a2739b9d802d63e051ade7188

helps.

YH


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-13  5:44                       ` Greg KH
  2008-03-13  6:24                         ` Yinghai Lu
@ 2008-03-13 10:06                         ` Guennadi Liakhovetski
  2008-03-13 15:32                           ` Greg KH
  1 sibling, 1 reply; 41+ messages in thread
From: Guennadi Liakhovetski @ 2008-03-13 10:06 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown

On Wed, 12 Mar 2008, Greg KH wrote:

> On Wed, Mar 12, 2008 at 04:09:17PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Wed, 12 Mar 2008, Greg KH wrote:
> > > 
> > > I can't get a box here to produce both of those PCI: messages myself,
> > > and neither can Len, so something is really odd here.
> 
> Ok, stupid me, this was my fault.  I was assuming that pci busses would
> never be registered multiple times with the pci core.  Obviously this
> isn't true.  The previous patch I proposed was only paying attention to
> the PCI devices, and that logic is just fine (it's already protected
> when it is attempted to be registered multiple times.)
> 
> So, the patch below fixes the issue for me, and reboot seems to work as
> well.
> 
> Guennadi, can you test this out on your machine?

Yes, it fixes all _3_ startup warnings and lets the machine reboot and 
power off again. 3 warnings are the 2 reported as a regression from 
2.6.24, and one present also under 2.6.24:

PCI: Probing PCI hardware
sysfs: duplicate filename 'bridge' can not be created
WARNING: at fs/sysfs/dir.c:424 sysfs_add_one()
Pid: 1, comm: swapper Not tainted 2.6.24-hires-nohz #2
 [<c010541a>] show_trace_log_lvl+0x1a/0x30
 [<c0105ed2>] show_trace+0x12/0x20
 [<c010675e>] dump_stack+0x6e/0x80
 [<c01b246d>] sysfs_add_one+0x9d/0xe0
 [<c01b328b>] sysfs_create_link+0x8b/0x130
 [<c01f9f14>] pci_bus_add_devices+0x94/0x120
 [<c03f3920>] pci_legacy_init+0x50/0xf0
 [<c03d95f2>] kernel_init+0x142/0x320
 [<c0104fe3>] kernel_thread_helper+0x7/0x14
 =======================
pci 0000:00:01.0: Error creating sysfs bridge symlink, continuing...

So, well done! I was going to disturb you with that one after 2.6.25, now 
I don't have to any more, unless we want it fixed in 2.6.24-stable.

> thanks for your patience,

always at your disposal:-)

Thanks
Guennadi
---
Guennadi Liakhovetski


* Re: pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-13  6:24                         ` Yinghai Lu
@ 2008-03-13 10:07                           ` Guennadi Liakhovetski
  0 siblings, 0 replies; 41+ messages in thread
From: Guennadi Liakhovetski @ 2008-03-13 10:07 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Greg KH, Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown

On Wed, 12 Mar 2008, Yinghai Lu wrote:

> wonder if
> 
> http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-x86.git;a=commitdiff;h=fff07473e243989a2739b9d802d63e051ade7188
> 
> helps.

Firstly, it doesn't apply to -rc4-ish, secondly, no, after applying it 
manually, it didn't improve anything.

Thanks
Guennadi
---
Guennadi Liakhovetski


* Re: pcibios_scanned needs to be set in ACPI?  (was Re: 2.6.25-rc5: Reported regressions from 2.6.24)
  2008-03-13 10:06                         ` Guennadi Liakhovetski
@ 2008-03-13 15:32                           ` Greg KH
  0 siblings, 0 replies; 41+ messages in thread
From: Greg KH @ 2008-03-13 15:32 UTC (permalink / raw)
  To: Guennadi Liakhovetski
  Cc: Linus Torvalds, Rafael J. Wysocki, Jeff Garzik, LKML,
	Adrian Bunk, Andrew Morton, Natalie Protasevich, Ingo Molnar,
	Len Brown

On Thu, Mar 13, 2008 at 11:06:10AM +0100, Guennadi Liakhovetski wrote:
> On Wed, 12 Mar 2008, Greg KH wrote:
> 
> > On Wed, Mar 12, 2008 at 04:09:17PM -0700, Linus Torvalds wrote:
> > > 
> > > 
> > > On Wed, 12 Mar 2008, Greg KH wrote:
> > > > 
> > > > I can't get a box here to produce both of those PCI: messages myself,
> > > > and neither can Len, so something is really odd here.
> > 
> > Ok, stupid me, this was my fault.  I was assuming that pci busses would
> > never be registered multiple times with the pci core.  Obviously this
> > isn't true.  The previous patch I proposed was only paying attention to
> > the PCI devices, and that logic is just fine (it's already protected
> > when it is attempted to be registered multiple times.)
> > 
> > So, the patch below fixes the issue for me, and reboot seems to work as
> > well.
> > 
> > Guennadi, can you test this out on your machine?
> 
> Yes, it fixes all _3_ startup warnings and lets the machine reboot and 
> power off again. 3 warnings are the 2 reported as a regression from 
> 2.6.24, and one present also under 2.6.24:
> 
> PCI: Probing PCI hardware
> sysfs: duplicate filename 'bridge' can not be created

Yes, I noticed that if you were having this problem, .24 would also be
complaining to you about creating sysfs links.  I'll make up a .24 patch
for -stable after this.

Kay just pointed out that I can use a struct device field instead of
creating my own in the bus device, so I'll simplify the patch and then
send it to Linus in a bit.

Thanks so much for testing quickly and letting me know.

greg k-h


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-13  5:03 ` David Chinner
@ 2008-03-13 21:12   ` Rafael J. Wysocki
  0 siblings, 0 replies; 41+ messages in thread
From: Rafael J. Wysocki @ 2008-03-13 21:12 UTC (permalink / raw)
  To: David Chinner
  Cc: LKML, Adrian Bunk, Andrew Morton, Linus Torvalds,
	Natalie Protasevich, Herbert Xu

On Thursday, 13 of March 2008, David Chinner wrote:
> On Tue, Mar 11, 2008 at 12:14:52AM +0100, Rafael J. Wysocki wrote:
> > 
> > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10207
> > Subject		: INFO: task mount:11202 blocked for more than 120 seconds
> > Submitter	: Christian Kujau <lists@nerdbynature.de>
> > Date		: 2008-03-07 21:32
> > References	: http://lkml.org/lkml/2008/3/7/308
> > 		  http://lkml.org/lkml/2008/3/9/186
> > Handled-By	: David Chinner <dgc@sgi.com>
> 
> Rafael, this looks to be something related to dm-crypt, not XFS. Can you
> reassign it appropriately?

Well, it's already been assigned to IO/Storage->LVM2/DM, but I've reassigned
it to Herbert Xu.

Thanks,
Rafael


* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11 22:41       ` Rafael J. Wysocki
  2008-03-12 20:01         ` Linus Torvalds
@ 2008-03-17 19:20         ` Jeff Garzik
  1 sibling, 0 replies; 41+ messages in thread
From: Jeff Garzik @ 2008-03-17 19:20 UTC
  To: Rafael J. Wysocki
  Cc: Linus Torvalds, LKML, Adrian Bunk, Andrew Morton,
	Natalie Protasevich, Ingo Molnar, Len Brown,
	Guennadi Liakhovetski, Greg KH

Rafael J. Wysocki wrote:
> On Tuesday, 11 of March 2008, Jeff Garzik wrote:
>> Linus Torvalds wrote:
>>> On Mon, 10 Mar 2008, Jeff Garzik wrote:
>>>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10123
>>>>> Subject		: No power-off / reboot with 2.6.25-rcX (up to -rc3)
>>>> FWIW, I have this same problem.
>>> Jeff, does that one ("keep rd->online and cpu_online_map in sync") fix the 
>>> problem for you?
>> Nope.  I am running baadac8b10c5ac15ce3d26b68fa266c8889b163f now, and it 
>> still hangs on reboot or power-off.
>>
>> Interestingly, if I reboot -immediately- from gdm, it succeeds.  However 
>> if I login to Fedora GNOME via gdm, and load my standard apps (1001 
>> terminals, firefox, tbird, IRC) reboot and poweroff no longer work.
>>
>> My guess was always some ACPI regression.  I'll bisect today or 
>> tomorrow.  It is reproducible regression that appeared recently (circa 
>> 2.6.24 or 2.6.25-rc1 I think), so I should be able to find the culprit.


Well, after going through several kernel versions (back to 2.6.19 so 
far), this machine continues to have reboot problems.  I'm going to 
back-burner this, as it is looking more like a hardware or BIOS problem 
that cropped up recently.

	Jeff






* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-11 13:04   ` Adrian Bunk
@ 2008-03-17 21:28     ` Thomas Meyer
  2008-03-19  9:15       ` Stefan Richter
  0 siblings, 1 reply; 41+ messages in thread
From: Thomas Meyer @ 2008-03-17 21:28 UTC
  To: Adrian Bunk
  Cc: Stefan Richter, Rafael J. Wysocki, LKML, Andrew Morton,
	Linus Torvalds, Natalie Protasevich

Adrian Bunk wrote:
> On Tue, Mar 11, 2008 at 01:22:42PM +0100, Stefan Richter wrote:
>   
>> Rafael J. Wysocki wrote:
>>     
>>> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10080
>>> Subject		: 2.6.25-rc2: ohci1394 problem
>>> Submitter	: Thomas Meyer <thomas@m3y3r.de>
>>> Date		: 2008-02-20 08:47
>>> References	: http://lkml.org/lkml/2008/2/20/58
>>> Handled-By	: Stefan Richter <stefanr@s5r6.in-berlin.de>
>>>       
>> Thomas wrote on 2008-02-25:
>> ''So i did a "make clean" and a "make" (not a make
>> -j3 as i use to do) and recompiled 2.6.25-rc3 and now it works again.
>> Case closed under strange error.''
>> ...
>>     
>
> Although I don't think this would cause the error, it would be nice if 
> Thomas could verify that the -j3 did not cause the problem.
>   
I still cannot *believe* this bug, but I just checked out the latest 
kernel and did a make distclean and a make (with Mr. Bunk's patch 
applied), and there it is again:
$ dmesg

(cut)
[  464.852986] ohci1394: fw-host0: physical posted write error
[  464.852991] ohci1394: fw-host0: respTxComplete: dma prg stopped
[  464.852997] ohci1394: fw-host0: SelfID received outside of bus reset sequence
[  464.853002] ohci1394: fw-host0: Unhandled interrupt(s) 0xfc7cfe0c
[  464.896722] ohci1394: fw-host0: Unrecoverable error!
[  464.896722] ohci1394: fw-host0: Async Rsp Tx Context died: ctrl[f0002a00] cmdptr[f0002a00]
[  464.896722] ohci1394: fw-host0: Iso Recv 3 Context died: ctrl[d4000d0e] cmdptr[0014c397] match[00000000]
[  464.896722] ohci1394: fw-host0: Iso Recv 17 Context died: ctrl[7c006e38] cmdptr[f58b18cd] match[4910c683]
[  464.896722] ohci1394: fw-host0: Iso Recv 18 Context died: ctrl[003cacf0] cmdptr[88f2eb10] match[46e8104e]
[  464.896722] ohci1394: fw-host0: Iso Recv 19 Context died: ctrl[0c047e80] cmdptr[83060246] match[83060846]
[  464.896722] ohci1394: fw-host0: Iso Recv 26 Context died: ctrl[00656c62] cmdptr[6e696461] match[706f2067]
[  464.896722] ohci1394: fw-host0: Iso Recv 27 Context died: ctrl[4d006d65] cmdptr[61726570] match[676e6974]
[  464.896722] ohci1394: fw-host0: physical posted write error
[  464.896722] ohci1394: fw-host0: respTxComplete: dma prg stopped
[  464.896722] ohci1394: fw-host0: SelfID received outside of bus reset sequence
[  464.896722] ohci1394: fw-host0: Unhandled interrupt(s) 0xfc7cfe0c
[  464.898957] ohci1394: fw-host0: Unrecoverable error!
[  464.898957] ohci1394: fw-host0: Async Rsp Tx Context died: ctrl[f0002a00] cmdptr[f0002a00]
[  464.898957] ohci1394: fw-host0: Iso Recv 3 Context died: ctrl[d4000d0e] cmdptr[0014c397] match[00000000]
[  464.898957] ohci1394: fw-host0: Iso Recv 17 Context died: ctrl[7c006e38] cmdptr[f58b18cd] match[4910c683]
[  464.898957] ohci1394: fw-host0: Iso Recv 18 Context died: ctrl[003cacf0] cmdptr[88f2eb10] match[46e8104e]
[  464.898957] ohci1394: fw-host0: Iso Recv 19 Context died: ctrl[0c047e80] cmdptr[83060246] match[83060846]
[  464.898957] ohci1394: fw-host0: Iso Recv 26 Context died: ctrl[00656c62] cmdptr[6e696461] match[706f2067]
[  464.898957] ohci1394: fw-host0: Iso Recv 27 Context died: ctrl[4d006d65] cmdptr[61726570] match[676e6974]
[  464.898957] ohci1394: fw-host0: physical posted write error
[  464.898957] ohci1394: fw-host0: respTxComplete: dma prg stopped
[  464.898957] ohci1394: fw-host0: SelfID received outside of bus reset sequence
[  464.898957] ohci1394: fw-host0: Unhandled interrupt(s) 0xfc7cfe0c
and so on....

$ git describe
v2.6.25-rc6-14-gbde4f8f
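
For readers unfamiliar with the format: the string above has the shape
<tag>-<n>-g<hash>, i.e. 14 commits on top of the v2.6.25-rc6 tag, at the
commit abbreviated bde4f8f. A minimal sketch of splitting such a string
follows. The names describe_info and parse_describe are hypothetical
helpers made up for this illustration; they are not part of git.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three parts of "<tag>-<n>-g<hash>". */
struct describe_info {
    char tag[64];           /* nearest tag, e.g. "v2.6.25-rc6" */
    int  commits_since_tag; /* commits on top of that tag */
    char abbrev[41];        /* abbreviated commit hash, without the 'g' */
};

/* Returns 0 on success, -1 if s does not look like "<tag>-<n>-g<hash>". */
static int parse_describe(const char *s, struct describe_info *out)
{
    const char *g = strrchr(s, '-');    /* last '-' precedes "g<hash>" */
    if (!g || g == s || g[1] != 'g' || g[2] == '\0')
        return -1;

    const char *n = g - 1;              /* walk back to the '-' before <n> */
    while (n > s && *n != '-')
        n--;
    if (n == s || n + 1 == g)           /* no tag, or empty count */
        return -1;

    size_t tag_len = (size_t)(n - s);
    size_t hash_len = strlen(g + 2);
    if (tag_len >= sizeof(out->tag) || hash_len >= sizeof(out->abbrev))
        return -1;

    char *end;
    long count = strtol(n + 1, &end, 10);
    if (end != g || count < 0)          /* count must be all digits */
        return -1;

    memcpy(out->tag, s, tag_len);
    out->tag[tag_len] = '\0';
    out->commits_since_tag = (int)count;
    memcpy(out->abbrev, g + 2, hash_len + 1);
    return 0;
}
```

A plain tag such as "v2.6.24" has no -<n>-g<hash> suffix, so this sketch
rejects it rather than guessing.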

As I already wrote, I tried to bisect this behavior, but with no result.

And Stefan didn't change anything in the involved drivers. I have no 
idea what could cause this kind of bug!
Suggestions?

- Maybe my build chain produces corrupted code?
- Maybe a udev error?
- ...?

$ emerge --info
Portage 2.1.4.4 (default-linux/x86/2006.1, gcc-4.2.3, glibc-2.7-r1, 2.6.25-rc6 i686)
=================================================================
System uname: 2.6.25-rc6 i686 Genuine Intel(R) CPU T2400 @ 1.83GHz
Timestamp of tree: Mon, 17 Mar 2008 19:00:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
app-shells/bash:     3.2_p33
dev-java/java-config: 1.3.7, 2.1.5
dev-lang/python:     2.4.4-r4, 2.5.1-r5
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 2.0.0_rc6-r1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.61-r1
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1
sys-devel/binutils:  2.18-r1
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.24
ACCEPT_KEYWORDS="x86 ~x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=prescott -O2 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/kde/4.0/env /usr/kde/4.0/share/config /usr/kde/4.0/shutdown /usr/share/config"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-march=prescott -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="de_DE"
LC_ALL="de_DE"
LINGUAS="de"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
(cut)



* Re: 2.6.25-rc5: Reported regressions from 2.6.24
  2008-03-17 21:28     ` Thomas Meyer
@ 2008-03-19  9:15       ` Stefan Richter
  0 siblings, 0 replies; 41+ messages in thread
From: Stefan Richter @ 2008-03-19  9:15 UTC
  To: Thomas Meyer
  Cc: Adrian Bunk, Rafael J. Wysocki, LKML, Andrew Morton,
	Linus Torvalds, Natalie Protasevich

Thomas Meyer wrote:
> Adrian Bunk wrote:
>> On Tue, Mar 11, 2008 at 01:22:42PM +0100, Stefan Richter wrote:
>>  
>>> Rafael J. Wysocki wrote:
>>>    
>>>> Bug-Entry    : http://bugzilla.kernel.org/show_bug.cgi?id=10080

I have reopened, re-assigned, and slightly renamed that bug now.

>>>> Subject        : 2.6.25-rc2: ohci1394 problem
>>>> Submitter    : Thomas Meyer <thomas@m3y3r.de>
>>>> Date        : 2008-02-20 08:47
>>>> References    : http://lkml.org/lkml/2008/2/20/58
>>>> Handled-By    : Stefan Richter <stefanr@s5r6.in-berlin.de>

This bug is not handled by me.

>>>>       
>>> Thomas wrote on 2008-02-25:
>>> ''So i did a "make clean" and a "make" (not a make
>>> -j3 as i use to do) and recompiled 2.6.25-rc3 and now it works again.
>>> Case closed under strange error.''
>>> ...
>>>     
>>
>> Although I don't think this would cause the error, it would be nice if 
>> Thomas could verify that the -j3 did not cause the problem.
>>   
> I still cannot *believe* this bug, but I just checked out the latest 
> kernel and did a make distclean and a make (with Mr. Bunk's patch 
> applied), and there it is again:
> $ dmesg
> 
> (cut)
> [  464.852986] ohci1394: fw-host0: physical posted write error
[...]
> [  464.898957] ohci1394: fw-host0: Unhandled interrupt(s) 0xfc7cfe0c
> and so on....
> 
> $ git describe
> v2.6.25-rc6-14-gbde4f8f
> 
> As I already wrote, I tried to bisect this behavior, but with no result.
> 
> And Stefan didn't change anything in the involved drivers. I have no 
> idea what could cause this kind of bug!

The messages which Thomas posted result from ohci1394 getting ~0 (i.e. 
0xffffffff) from some or all MMIO reads.  This is not a FireWire driver bug.

MMIO has been broken by something after 2.6.24.
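
To illustrate the point about ~0: when an MMIO read returns all ones,
every flag tested in the value appears set at once, so a driver reports
many unrelated "events" in one go. The sketch below is only an
illustration under stated assumptions. The bit names and positions are
hypothetical and are not the real OHCI register layout, and the helpers
mmio_read_looks_dead and count_events are made up for this example; they
are not taken from ohci1394.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits, for illustration only (not the OHCI layout). */
#define EV_POSTED_WRITE_ERR  (1u << 0)
#define EV_UNRECOVERABLE_ERR (1u << 1)
#define EV_SELFID_COMPLETE   (1u << 2)

/* An all-ones value is the pattern described above: ~0, i.e. 0xffffffff. */
static bool mmio_read_looks_dead(uint32_t val)
{
    return val == UINT32_MAX;
}

/* Count how many of the illustrative event bits appear set in val. */
static int count_events(uint32_t val)
{
    int n = 0;

    if (val & EV_POSTED_WRITE_ERR)
        n++;
    if (val & EV_UNRECOVERABLE_ERR)
        n++;
    if (val & EV_SELFID_COMPLETE)
        n++;
    return n;
}
```

With val = 0xffffffff, count_events() sees all three illustrative events
at once, which matches the pattern of many different errors logged at the
same timestamp. Checking for the all-ones pattern before decoding
individual bits is one way to tell a dead read from real events.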
-- 
Stefan Richter
-=====-==--- --== =--==
http://arcgraph.de/sr/


Thread overview: 41+ messages
2008-03-10 23:14 2.6.25-rc5: Reported regressions from 2.6.24 Rafael J. Wysocki
2008-03-11  0:40 ` Jeff Garzik
2008-03-11  1:05   ` Rafael J. Wysocki
2008-03-11  2:15   ` Linus Torvalds
2008-03-11  3:00     ` Andrew Morton
2008-03-11  8:28       ` Ingo Molnar
2008-03-11 18:57     ` Jeff Garzik
2008-03-11 22:41       ` Rafael J. Wysocki
2008-03-12 20:01         ` Linus Torvalds
2008-03-12 20:32           ` Greg KH
2008-03-12 21:27             ` pcibios_scanned needs to be set in ACPI? (was Re: 2.6.25-rc5: Reported regressions from 2.6.24) Greg KH
2008-03-12 21:38               ` Greg KH
2008-03-12 22:25                 ` Linus Torvalds
2008-03-12 22:54                   ` Greg KH
2008-03-12 23:09                     ` Linus Torvalds
2008-03-13  4:48                       ` Greg KH
2008-03-13  5:44                       ` Greg KH
2008-03-13  6:24                         ` Yinghai Lu
2008-03-13 10:07                           ` Guennadi Liakhovetski
2008-03-13 10:06                         ` Guennadi Liakhovetski
2008-03-13 15:32                           ` Greg KH
2008-03-12 21:41               ` Len Brown
2008-03-12 22:20               ` Linus Torvalds
2008-03-12 22:34                 ` Greg KH
2008-03-12 23:02                   ` Linus Torvalds
2008-03-12 23:16                     ` Greg KH
2008-03-12 23:32                       ` Guennadi Liakhovetski
2008-03-12 23:37                         ` Greg KH
2008-03-12 23:35                       ` Linus Torvalds
2008-03-12 23:17                     ` Guennadi Liakhovetski
2008-03-12 23:37                       ` Linus Torvalds
2008-03-12 23:47                         ` Guennadi Liakhovetski
2008-03-17 19:20         ` 2.6.25-rc5: Reported regressions from 2.6.24 Jeff Garzik
2008-03-11 12:22 ` Stefan Richter
2008-03-11 13:04   ` Adrian Bunk
2008-03-17 21:28     ` Thomas Meyer
2008-03-19  9:15       ` Stefan Richter
2008-03-12 22:12 ` Christian Kujau
2008-03-12 23:01   ` Rafael J. Wysocki
2008-03-13  5:03 ` David Chinner
2008-03-13 21:12   ` Rafael J. Wysocki
