LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
       [not found] ` <200810151417.14661.laurent.pinchart@skynet.be>
@ 2008-10-15 16:43   ` Alan Jenkins
  2008-10-15 18:17     ` Laurent Pinchart
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Jenkins @ 2008-10-15 16:43 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: linux-uvc-devel, linux-kernel

Laurent Pinchart wrote:
> Hi Alan,
>
> On Friday 26 September 2008, Alan Jenkins wrote:
>   
>> This is is on v2.6.27-rc6.  I originally saw it on todays -tip tree.
>>
>> # while true; do modprobe uvcvideo; modprobe -r uvcvideo; done
>>
>> After a few tens of cycles, modprobe gets stuck in "D" state, and the
>> following backtrace appears:
>>     
>
> [snip]
>
> I can't reproduce the issue here on 2.6.27. Could you please test that 
> version ?
>   

Sure... still happens.

If you look at the trace, it happens as "hald-probe-video" opens the 
video device.  This is from Ubuntu 8.04.  Possibly it's significant that 
I use the camera first, to make sure it works (I use Kopete, the 
settings dialogue includes a video test).

[  267.056655] input: UVC Camera (eb1a:2761) as /class/input/input59
[  267.069990] usbcore: registered new interface driver uvcvideo
[  267.069990] USB Video Class driver (v0.1.0)
[  267.299989] usbcore: deregistering interface driver uvcvideo
[  267.336656] Linux video capture interface: v2.00
[  267.349981] uvcvideo: Found UVC 1.00 device <unnamed> (eb1a:2761)
[  267.353317] input: UVC Camera (eb1a:2761) as /class/input/input60
[  267.363324] usbcore: registered new interface driver uvcvideo
[  267.363324] USB Video Class driver (v0.1.0)
[  267.373325] BUG: unable to handle kernel NULL pointer dereference at 
00000030
[  267.373325] IP: [<e03e13ec>] :videodev:video_open+0x8b/0xe4
[  267.373325] *pdpt = 0000000011ebc001 *pde = 0000000000000000
[  267.373325] Oops: 0000 [#1]
[  267.373325] Modules linked in: uvcvideo(-) compat_ioctl32 videodev 
v4l1_compat aes_i586 aes_generic af_packet i915 drm cpufreq_userspace 
cpufreq_powersave cpufreq_ondemand cpufreq_stats freq_table 
cpufreq_conservative wmi sbs sbshc ip6t_LOG nf_conntrack_ipv6 ipt_LOG 
xt_limit ipt_addrtype xt_state xt_tcpudp xt_conntrack ip6table_filter 
ip6_tables ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat 
nf_conntrack_ipv4 nf_conntrack_ftp nf_conntrack iptable_filter ip_tables 
x_tables dm_crypt dm_mod fuse joydev arc4 ecb crypto_blkcipher ath5k 
mac80211 led_class sg cfg80211 psmouse serio_raw snd_hda_intel 
snd_pcm_oss snd_mixer_oss ata_generic snd_pcm snd_timer snd_page_alloc 
snd_hwdep shpchp pci_hotplug intel_agp agpgart video output battery ac 
eeepc_laptop backlight button evdev uhci_hcd ehci_hcd usb_storage 
libusual pcspkr usbcore thermal processor fan [last unloaded: v4l1_compat]
[  267.373325]
[  267.373325] Pid: 7709, comm: hald-probe-vide Not tainted (2.6.27eeepc 
#61)
[  267.373325] EIP: 0060:[<e03e13ec>] EFLAGS: 00010246 CPU: 0
[  267.373325] EIP is at video_open+0x8b/0xe4 [videodev]
[  267.373325] EAX: 00000000 EBX: e03e49d0 ECX: e03e1361 EDX: d1e6c000
[  267.373325] ESI: d1ea5c80 EDI: 00000000 EBP: debf41ec ESP: d1efde98
[  267.373325]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  267.373325] Process hald-probe-vide (pid: 7709, ti=d1efc000 
task=d1c586e0 task.ti=d1efc000)
[  267.373325] Stack: 00000000 df1e17c0 00000000 debf41ec c01604e6 
d1ea5c80 00000000 d1ea5c80
[  267.373325]        debf41ec d1efdf14 c0160406 c015d36d deae0580 
def61700 d1efdf14 d1ea5c80
[  267.373325]        d1efdf14 00008001 c015d471 d1ea5c80 00000000 
00000000 d1efdf14 c016692e
[  267.373325] Call Trace:
[  267.373325]  [<c01604e6>] chrdev_open+0xe0/0xf6
[  267.373325]  [<c0160406>] chrdev_open+0x0/0xf6
[  267.373325]  [<c015d36d>] __dentry_open+0xf2/0x1da
[  267.373325]  [<c015d471>] nameidata_to_filp+0x1c/0x2c
[  267.373325]  [<c016692e>] do_filp_open+0x343/0x61e
[  267.373325]  [<c014e1ad>] handle_mm_fault+0x27d/0x528
[  267.373325]  [<c016df2d>] alloc_fd+0x46/0xad
[  267.373325]  [<c015d1a8>] do_sys_open+0x3f/0xb3
[  267.373325]  [<c015d260>] sys_open+0x1e/0x23
[  267.373325]  [<c01035c1>] sysenter_do_call+0x12/0x21
[  267.373325]  [<c0280000>] xfrm_state_find+0x3bb/0x4b1
[  267.373325]  =======================
[  267.373325] Code: c4 0c 85 d2 74 6d 8b 02 8b 5e 10 85 c0 74 15 8b 00 
85 c0 74 0b 83 38 02 74 0a ff 80 40 01 00 00 8b 02 eb 02 31 c0 89 46 10 
31 ff <8b> 48 30 85 c9 74 36 89 f2 89 e8 ff d1 85 c0 89 c7 74 2a 8b 46
[  267.373325] EIP: [<e03e13ec>] video_open+0x8b/0xe4 [videodev] SS:ESP 
0068:d1efde98
[  267.373325] ---[ end trace 1a1a7d2a0a55bf75 ]---
[  267.379201] usbcore: deregistering interface driver uvcvideo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-15 16:43   ` [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test Alan Jenkins
@ 2008-10-15 18:17     ` Laurent Pinchart
  2008-10-15 18:54       ` Alan Jenkins
  0 siblings, 1 reply; 9+ messages in thread
From: Laurent Pinchart @ 2008-10-15 18:17 UTC (permalink / raw)
  To: Alan Jenkins; +Cc: linux-uvc-devel, linux-kernel, Mauro Carvalho Chehab

Hi Alan,

On Wednesday 15 October 2008, Alan Jenkins wrote:
> Laurent Pinchart wrote:
> > Hi Alan,
> >
> > On Friday 26 September 2008, Alan Jenkins wrote:
> >> This is is on v2.6.27-rc6.  I originally saw it on todays -tip tree.
> >>
> >> # while true; do modprobe uvcvideo; modprobe -r uvcvideo; done
> >>
> >> After a few tens of cycles, modprobe gets stuck in "D" state, and the
> >> following backtrace appears:
> >
> > [snip]
> >
> > I can't reproduce the issue here on 2.6.27. Could you please test that
> > version ?
>
> Sure... still happens.

I had secretly hoped it would have disapearred :-)

> If you look at the trace, it happens as "hald-probe-video" opens the
> video device.  This is from Ubuntu 8.04.  Possibly it's significant that
> I use the camera first, to make sure it works (I use Kopete, the
> settings dialogue includes a video test).

The NULL pointer (or rather 0x00000030 pointer) dereference happens in 
video_open:

        file->f_op = fops_get(vfl->fops);
        if (file->f_op->open)
                err = file->f_op->open(inode, file);

file->f_op ends up being NULL. Either vfl->fops is NULL to begin with, or 
fops_get failed to get a reference to the file_operations structure.

I'd be surprised if vfl->fops was NULL. To rule out that case, can you add a 
BUG_ON(vfl->fops == NULL) before the call to fops_get ?

I'm not too familiar with the module loader, but a quick look at the code 
shows that the module could be marked as being unloaded (MODULE_STATE_GOING) 
before its exit function is called. If this is the case video_open would 
still be called, as the video device would still be registered, but fops_get 
would fail in try_module_get and return a NULL pointer. It seems the pointer 
returned by fops_get should be tested in video_open.

I've CC'ed the v4l maintainer to get his opinion on this.

> [  267.056655] input: UVC Camera (eb1a:2761) as /class/input/input59
> [  267.069990] usbcore: registered new interface driver uvcvideo
> [  267.069990] USB Video Class driver (v0.1.0)
> [  267.299989] usbcore: deregistering interface driver uvcvideo
> [  267.336656] Linux video capture interface: v2.00
> [  267.349981] uvcvideo: Found UVC 1.00 device <unnamed> (eb1a:2761)
> [  267.353317] input: UVC Camera (eb1a:2761) as /class/input/input60
> [  267.363324] usbcore: registered new interface driver uvcvideo
> [  267.363324] USB Video Class driver (v0.1.0)
> [  267.373325] BUG: unable to handle kernel NULL pointer dereference at
> 00000030
> [  267.373325] IP: [<e03e13ec>] :videodev:video_open+0x8b/0xe4
> [  267.373325] *pdpt = 0000000011ebc001 *pde = 0000000000000000
> [  267.373325] Oops: 0000 [#1]
> [  267.373325] Modules linked in: uvcvideo(-) compat_ioctl32 videodev
> v4l1_compat aes_i586 aes_generic af_packet i915 drm cpufreq_userspace
> cpufreq_powersave cpufreq_ondemand cpufreq_stats freq_table
> cpufreq_conservative wmi sbs sbshc ip6t_LOG nf_conntrack_ipv6 ipt_LOG
> xt_limit ipt_addrtype xt_state xt_tcpudp xt_conntrack ip6table_filter
> ip6_tables ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat
> nf_conntrack_ipv4 nf_conntrack_ftp nf_conntrack iptable_filter ip_tables
> x_tables dm_crypt dm_mod fuse joydev arc4 ecb crypto_blkcipher ath5k
> mac80211 led_class sg cfg80211 psmouse serio_raw snd_hda_intel
> snd_pcm_oss snd_mixer_oss ata_generic snd_pcm snd_timer snd_page_alloc
> snd_hwdep shpchp pci_hotplug intel_agp agpgart video output battery ac
> eeepc_laptop backlight button evdev uhci_hcd ehci_hcd usb_storage
> libusual pcspkr usbcore thermal processor fan [last unloaded: v4l1_compat]
> [  267.373325]
> [  267.373325] Pid: 7709, comm: hald-probe-vide Not tainted (2.6.27eeepc
> #61)
> [  267.373325] EIP: 0060:[<e03e13ec>] EFLAGS: 00010246 CPU: 0
> [  267.373325] EIP is at video_open+0x8b/0xe4 [videodev]
> [  267.373325] EAX: 00000000 EBX: e03e49d0 ECX: e03e1361 EDX: d1e6c000
> [  267.373325] ESI: d1ea5c80 EDI: 00000000 EBP: debf41ec ESP: d1efde98
> [  267.373325]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> [  267.373325] Process hald-probe-vide (pid: 7709, ti=d1efc000
> task=d1c586e0 task.ti=d1efc000)
> [  267.373325] Stack: 00000000 df1e17c0 00000000 debf41ec c01604e6
> d1ea5c80 00000000 d1ea5c80
> [  267.373325]        debf41ec d1efdf14 c0160406 c015d36d deae0580
> def61700 d1efdf14 d1ea5c80
> [  267.373325]        d1efdf14 00008001 c015d471 d1ea5c80 00000000
> 00000000 d1efdf14 c016692e
> [  267.373325] Call Trace:
> [  267.373325]  [<c01604e6>] chrdev_open+0xe0/0xf6
> [  267.373325]  [<c0160406>] chrdev_open+0x0/0xf6
> [  267.373325]  [<c015d36d>] __dentry_open+0xf2/0x1da
> [  267.373325]  [<c015d471>] nameidata_to_filp+0x1c/0x2c
> [  267.373325]  [<c016692e>] do_filp_open+0x343/0x61e
> [  267.373325]  [<c014e1ad>] handle_mm_fault+0x27d/0x528
> [  267.373325]  [<c016df2d>] alloc_fd+0x46/0xad
> [  267.373325]  [<c015d1a8>] do_sys_open+0x3f/0xb3
> [  267.373325]  [<c015d260>] sys_open+0x1e/0x23
> [  267.373325]  [<c01035c1>] sysenter_do_call+0x12/0x21
> [  267.373325]  [<c0280000>] xfrm_state_find+0x3bb/0x4b1
> [  267.373325]  =======================
> [  267.373325] Code: c4 0c 85 d2 74 6d 8b 02 8b 5e 10 85 c0 74 15 8b 00
> 85 c0 74 0b 83 38 02 74 0a ff 80 40 01 00 00 8b 02 eb 02 31 c0 89 46 10
> 31 ff <8b> 48 30 85 c9 74 36 89 f2 89 e8 ff d1 85 c0 89 c7 74 2a 8b 46
> [  267.373325] EIP: [<e03e13ec>] video_open+0x8b/0xe4 [videodev] SS:ESP
> 0068:d1efde98
> [  267.373325] ---[ end trace 1a1a7d2a0a55bf75 ]---
> [  267.379201] usbcore: deregistering interface driver uvcvideo

Best regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-15 18:17     ` Laurent Pinchart
@ 2008-10-15 18:54       ` Alan Jenkins
  2008-10-15 21:19         ` Laurent Pinchart
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Jenkins @ 2008-10-15 18:54 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: linux-uvc-devel, linux-kernel, Mauro Carvalho Chehab

Laurent Pinchart wrote:
> Hi Alan,
>
> On Wednesday 15 October 2008, Alan Jenkins wrote:
>   
>> Laurent Pinchart wrote:
>>     
>>> Hi Alan,
>>>
>>> On Friday 26 September 2008, Alan Jenkins wrote:
>>>       
>>>> This is is on v2.6.27-rc6.  I originally saw it on todays -tip tree.
>>>>
>>>> # while true; do modprobe uvcvideo; modprobe -r uvcvideo; done
>>>>
>>>> After a few tens of cycles, modprobe gets stuck in "D" state, and the
>>>> following backtrace appears:
>>>>         
>>> [snip]
>>>
>>> I can't reproduce the issue here on 2.6.27. Could you please test that
>>> version ?
>>>       
>> Sure... still happens.
>>     
>
> I had secretly hoped it would have disapearred :-)
>
>   
>> If you look at the trace, it happens as "hald-probe-video" opens the
>> video device.  This is from Ubuntu 8.04.  Possibly it's significant that
>> I use the camera first, to make sure it works (I use Kopete, the
>> settings dialogue includes a video test).
>>     
>
> The NULL pointer (or rather 0x00000030 pointer) dereference happens in 
> video_open:
>
>         file->f_op = fops_get(vfl->fops);
>         if (file->f_op->open)
>                 err = file->f_op->open(inode, file);
>
> file->f_op ends up being NULL. Either vfl->fops is NULL to begin with, or 
> fops_get failed to get a reference to the file_operations structure.
>
> I'd be surprised if vfl->fops was NULL. To rule out that case, can you add a 
> BUG_ON(vfl->fops == NULL) before the call to fops_get ?
>
> I'm not too familiar with the module loader, but a quick look at the code 
> shows that the module could be marked as being unloaded (MODULE_STATE_GOING) 
> before its exit function is called. If this is the case video_open would 
> still be called, as the video device would still be registered, but fops_get 
> would fail in try_module_get and return a NULL pointer. It seems the pointer 
> returned by fops_get should be tested in video_open.
>
> I've CC'ed the v4l maintainer to get his opinion on this.
>   

I put one before and one after

134    BUG_ON(vfl->fops == NULL);
135    file->f_op = fops_get(vfl->fops);
136    BUG_ON(file->f_op == NULL);

and the second one triggered

[  245.379990] ------------[ cut here ]------------
[  245.379990] kernel BUG at drivers/media/video/v4l2-dev.c:136!
[  245.379990] invalid opcode: 0000 [#1]
[  245.379990] Modules linked in: uvcvideo(-) compat_ioctl32 videodev 
v4l1_compat aes_i586 aes_generic af_packet i915 drm cpufreq_userspace 
cpufreq_powersave cpufreq_ondemand cpufreq_stats freq_table 
cpufreq_conservative wmi sbs sbshc ip6t_LOG nf_conntrack_ipv6 ipt_LOG 
xt_limit ipt_addrtype xt_state xt_tcpudp xt_conntrack ip6table_filter 
ip6_tables ipv6 nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat 
nf_conntrack_ipv4 nf_conntrack_ftp nf_conntrack iptable_filter ip_tables 
x_tables dm_crypt dm_mod fuse joydev arc4 ecb crypto_blkcipher ath5k 
mac80211 led_class cfg80211 psmouse serio_raw sg snd_hda_intel 
ata_generic snd_pcm_oss snd_mixer_oss snd_pcm shpchp pci_hotplug 
snd_timer snd_page_alloc snd_hwdep intel_agp agpgart video output 
battery ac eeepc_laptop backlight button evdev uhci_hcd ehci_hcd 
usb_storage libusual usbcore pcspkr thermal processor fan [last 
unloaded: v4l1_compat]
[  245.379990]
[  245.379990] Pid: 11686, comm: hald-probe-vide Not tainted 
(2.6.27eeepc #62)
[  245.379990] EIP: 0060:[<e024f3f2>] EFLAGS: 00010246 CPU: 0
[  245.379990] EIP is at video_open+0x8f/0xee [videodev]
[  245.379990] EAX: e03fbf40 EBX: e02529dc ECX: 00000000 EDX: d1d20e00
[  245.379990] ESI: d1f33780 EDI: ffffffed EBP: de4e182c ESP: d1fade98
[  245.379990]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[  245.379990] Process hald-probe-vide (pid: 11686, ti=d1fac000 
task=d8e73810 task.ti=d1fac000)
[  245.379990] Stack: 00000000 d1f20540 00000000 de4e182c c01604e6 
d1f33780 00000000 d1f33780
[  245.379990]        de4e182c d1fadf14 c0160406 c015d36d deae2580 
d22c8600 d1fadf14 d1f33780
[  245.379990]        d1fadf14 00008001 c015d471 d1f33780 00000000 
00000000 d1fadf14 c016692e
[  245.379990] Call Trace:
[  245.379990]  [<c01604e6>] chrdev_open+0xe0/0xf6
[  245.379990]  [<c0160406>] chrdev_open+0x0/0xf6
[  245.379990]  [<c015d36d>] __dentry_open+0xf2/0x1da
[  245.379990]  [<c015d471>] nameidata_to_filp+0x1c/0x2c
[  245.379990]  [<c016692e>] do_filp_open+0x343/0x61e
[  245.379990]  [<c014e1ad>] handle_mm_fault+0x27d/0x528
[  245.379990]  [<c016df2d>] alloc_fd+0x46/0xad
[  245.379990]  [<c015d1a8>] do_sys_open+0x3f/0xb3
[  245.379990]  [<c015d260>] sys_open+0x1e/0x23
[  245.379990]  [<c01035c1>] sysenter_do_call+0x12/0x21
[  245.379990]  [<c0280000>] xfrm_state_find+0x3bb/0x4b1
[  245.379990]  =======================
[  245.379990] Code: 74 77 8b 02 8b 5e 10 85 c0 75 04 0f 0b eb fe 8b 00 
85 c0 74 0d 31 c9 83 38 02 74 08 ff 80 40 01 00 00 8b 0a 85 c9 89 4e 10 
75 04 <0f> 0b eb fe 8b 49 30 31 ff 85 c9 74 36 89 f2 89 e8 ff d1 85 c0
[  245.379990] EIP: [<e024f3f2>] video_open+0x8f/0xee [videodev] SS:ESP 
0068:d1fade98
[  245.379990] ---[ end trace 2385a52acb7b9557 ]---


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-15 18:54       ` Alan Jenkins
@ 2008-10-15 21:19         ` Laurent Pinchart
  2008-10-16 10:01           ` Alan Jenkins
  0 siblings, 1 reply; 9+ messages in thread
From: Laurent Pinchart @ 2008-10-15 21:19 UTC (permalink / raw)
  To: Alan Jenkins; +Cc: linux-uvc-devel, linux-kernel, Mauro Carvalho Chehab

[-- Attachment #1: Type: text/plain, Size: 1768 bytes --]

Hi Alan,

On Wednesday 15 October 2008, Alan Jenkins wrote:
> Laurent Pinchart wrote:
> > On Wednesday 15 October 2008, Alan Jenkins wrote:
> > > If you look at the trace, it happens as "hald-probe-video" opens the
> > > video device.  This is from Ubuntu 8.04.  Possibly it's significant that
> > > I use the camera first, to make sure it works (I use Kopete, the
> > > settings dialogue includes a video test).
> >
> > The NULL pointer (or rather 0x00000030 pointer) dereference happens in
> > video_open:
> >
> >         file->f_op = fops_get(vfl->fops);
> >         if (file->f_op->open)
> >                 err = file->f_op->open(inode, file);
> >
> > file->f_op ends up being NULL. Either vfl->fops is NULL to begin with, or
> > fops_get failed to get a reference to the file_operations structure.
> >
> > I'd be surprised if vfl->fops was NULL. To rule out that case, can you
> > add a BUG_ON(vfl->fops == NULL) before the call to fops_get ?
> >
> > I'm not too familiar with the module loader, but a quick look at the code
> > shows that the module could be marked as being unloaded
> > (MODULE_STATE_GOING) before its exit function is called. If this is the
> > case video_open would still be called, as the video device would still be
> > registered, but fops_get would fail in try_module_get and return a NULL
> > pointer. It seems the pointer returned by fops_get should be tested in
> > video_open.
> >
> > I've CC'ed the v4l maintainer to get his opinion on this.
>
> I put one before and one after
>
> 134    BUG_ON(vfl->fops == NULL);
> 135    file->f_op = fops_get(vfl->fops);
> 136    BUG_ON(file->f_op == NULL);
>
> and the second one triggered

This confirms my suspicion. Could you please try the attached patch ?

Best regards,

Laurent Pinchart

[-- Attachment #2: fops_get_check.patch --]
[-- Type: text/x-diff, Size: 710 bytes --]

diff --git a/drivers/media/video/v4l2-dev.c b/drivers/media/video/v4l2-dev.c
index 155fdec..7a3c1ed 100644
--- a/drivers/media/video/v4l2-dev.c
+++ b/drivers/media/video/v4l2-dev.c
@@ -132,6 +132,11 @@ static int video_open(struct inode *inode, struct file *file)
 	}
 	old_fops = file->f_op;
 	file->f_op = fops_get(vfl->fops);
+	if (file->f_op == NULL) {
+		file->f_op = old_fops;
+		err = -ENODEV;
+		goto out;
+	}
 	if (file->f_op->open)
 		err = file->f_op->open(inode, file);
 	if (err) {
@@ -139,6 +144,7 @@ static int video_open(struct inode *inode, struct file *file)
 		file->f_op = fops_get(old_fops);
 	}
 	fops_put(old_fops);
+out:
 	mutex_unlock(&videodev_lock);
 	unlock_kernel();
 	return err;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-15 21:19         ` Laurent Pinchart
@ 2008-10-16 10:01           ` Alan Jenkins
  2008-10-16 12:03             ` Laurent Pinchart
  2008-10-24 14:31             ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 9+ messages in thread
From: Alan Jenkins @ 2008-10-16 10:01 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: linux-uvc-devel, linux-kernel, Mauro Carvalho Chehab

Laurent Pinchart wrote:
> Hi Alan,
>
> On Wednesday 15 October 2008, Alan Jenkins wrote:
>   
>> Laurent Pinchart wrote:
>>     
>>> On Wednesday 15 October 2008, Alan Jenkins wrote:
>>>       
>>>> If you look at the trace, it happens as "hald-probe-video" opens the
>>>> video device.  This is from Ubuntu 8.04.  Possibly it's significant that
>>>> I use the camera first, to make sure it works (I use Kopete, the
>>>> settings dialogue includes a video test).
>>>>         
>>> The NULL pointer (or rather 0x00000030 pointer) dereference happens in
>>> video_open:
>>>
>>>         file->f_op = fops_get(vfl->fops);
>>>         if (file->f_op->open)
>>>                 err = file->f_op->open(inode, file);
>>>
>>> file->f_op ends up being NULL. Either vfl->fops is NULL to begin with, or
>>> fops_get failed to get a reference to the file_operations structure.
>>>
>>> I'd be surprised if vfl->fops was NULL. To rule out that case, can you
>>> add a BUG_ON(vfl->fops == NULL) before the call to fops_get ?
>>>
>>> I'm not too familiar with the module loader, but a quick look at the code
>>> shows that the module could be marked as being unloaded
>>> (MODULE_STATE_GOING) before its exit function is called. If this is the
>>> case video_open would still be called, as the video device would still be
>>> registered, but fops_get would fail in try_module_get and return a NULL
>>> pointer. It seems the pointer returned by fops_get should be tested in
>>> video_open.
>>>
>>> I've CC'ed the v4l maintainer to get his opinion on this.
>>>       
>> I put one before and one after
>>
>> 134    BUG_ON(vfl->fops == NULL);
>> 135    file->f_op = fops_get(vfl->fops);
>> 136    BUG_ON(file->f_op == NULL);
>>
>> and the second one triggered
>>     
>
> This confirms my suspicion. Could you please try the attached patch ?
>   

Yup, that seems to fix it.

I wonder if there are more instances of this error in other subsystems.

Ta
Alan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-16 10:01           ` Alan Jenkins
@ 2008-10-16 12:03             ` Laurent Pinchart
  2008-10-16 12:22               ` Alan Jenkins
  2008-10-24 14:31             ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 9+ messages in thread
From: Laurent Pinchart @ 2008-10-16 12:03 UTC (permalink / raw)
  To: Alan Jenkins; +Cc: linux-uvc-devel, linux-kernel, Mauro Carvalho Chehab

Hi Alan,

On Thursday 16 October 2008, Alan Jenkins wrote:
> Laurent Pinchart wrote:
> > On Wednesday 15 October 2008, Alan Jenkins wrote:
> >> Laurent Pinchart wrote:
> >>> On Wednesday 15 October 2008, Alan Jenkins wrote:
> >>>> If you look at the trace, it happens as "hald-probe-video" opens the
> >>>> video device.  This is from Ubuntu 8.04.  Possibly it's significant
> >>>> that I use the camera first, to make sure it works (I use Kopete, the
> >>>> settings dialogue includes a video test).
> >>>
> >>> The NULL pointer (or rather 0x00000030 pointer) dereference happens in
> >>> video_open:
> >>>
> >>>         file->f_op = fops_get(vfl->fops);
> >>>         if (file->f_op->open)
> >>>                 err = file->f_op->open(inode, file);
> >>>
> >>> file->f_op ends up being NULL. Either vfl->fops is NULL to begin with,
> >>> or fops_get failed to get a reference to the file_operations structure.
> >>>
> >>> I'd be surprised if vfl->fops was NULL. To rule out that case, can you
> >>> add a BUG_ON(vfl->fops == NULL) before the call to fops_get ?
> >>>
> >>> I'm not too familiar with the module loader, but a quick look at the
> >>> code shows that the module could be marked as being unloaded
> >>> (MODULE_STATE_GOING) before its exit function is called. If this is the
> >>> case video_open would still be called, as the video device would still
> >>> be registered, but fops_get would fail in try_module_get and return a
> >>> NULL pointer. It seems the pointer returned by fops_get should be
> >>> tested in video_open.
> >>>
> >>> I've CC'ed the v4l maintainer to get his opinion on this.
> >>
> >> I put one before and one after
> >>
> >> 134    BUG_ON(vfl->fops == NULL);
> >> 135    file->f_op = fops_get(vfl->fops);
> >> 136    BUG_ON(file->f_op == NULL);
> >>
> >> and the second one triggered
> >
> > This confirms my suspicion. Could you please try the attached patch ?
>
> Yup, that seems to fix it.

Great.

> I wonder if there are more instances of this error in other subsystems.

>From a quick grep it seems the following subsystems are affected:

drivers/media/video
drivers/media/video/dvb/dvb-core
drivers/gpu/drm
sound/core

Unless the issue is critical and should be fixed before 2.6.28, 
drivers/media/video won't matter as the v4l core has already been moved to 
the cdev API in the kernel tree, removing the offending code.

Will you submit patches for the other three subsystems or would you like me to 
take care of that ?

Best regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-16 12:03             ` Laurent Pinchart
@ 2008-10-16 12:22               ` Alan Jenkins
  0 siblings, 0 replies; 9+ messages in thread
From: Alan Jenkins @ 2008-10-16 12:22 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: linux-uvc-devel, linux-kernel, Mauro Carvalho Chehab

Laurent Pinchart wrote:
> Hi Alan,
>
> On Thursday 16 October 2008, Alan Jenkins wrote:
>   
>> Laurent Pinchart wrote:
>>     
>>> On Wednesday 15 October 2008, Alan Jenkins wrote:
>>>       
>>>> Laurent Pinchart wrote:
>>>>         
>>>>> On Wednesday 15 October 2008, Alan Jenkins wrote:
>>>>>           
>>>>>> If you look at the trace, it happens as "hald-probe-video" opens the
>>>>>> video device.  This is from Ubuntu 8.04.  Possibly it's significant
>>>>>> that I use the camera first, to make sure it works (I use Kopete, the
>>>>>> settings dialogue includes a video test).
>>>>>>             
>>>>> The NULL pointer (or rather 0x00000030 pointer) dereference happens in
>>>>> video_open:
>>>>>
>>>>>         file->f_op = fops_get(vfl->fops);
>>>>>         if (file->f_op->open)
>>>>>                 err = file->f_op->open(inode, file);
>>>>>
>>>>> file->f_op ends up being NULL. Either vfl->fops is NULL to begin with,
>>>>> or fops_get failed to get a reference to the file_operations structure.
>>>>>
>>>>> I'd be surprised if vfl->fops was NULL. To rule out that case, can you
>>>>> add a BUG_ON(vfl->fops == NULL) before the call to fops_get ?
>>>>>
>>>>> I'm not too familiar with the module loader, but a quick look at the
>>>>> code shows that the module could be marked as being unloaded
>>>>> (MODULE_STATE_GOING) before its exit function is called. If this is the
>>>>> case video_open would still be called, as the video device would still
>>>>> be registered, but fops_get would fail in try_module_get and return a
>>>>> NULL pointer. It seems the pointer returned by fops_get should be
>>>>> tested in video_open.
>>>>>
>>>>> I've CC'ed the v4l maintainer to get his opinion on this.
>>>>>           
>>>> I put one before and one after
>>>>
>>>> 134    BUG_ON(vfl->fops == NULL);
>>>> 135    file->f_op = fops_get(vfl->fops);
>>>> 136    BUG_ON(file->f_op == NULL);
>>>>
>>>> and the second one triggered
>>>>         
>>> This confirms my suspicion. Could you please try the attached patch ?
>>>       
>> Yup, that seems to fix it.
>>     
>
> Great.
>
>   
>> I wonder if there are more instances of this error in other subsystems.
>>     
>
> From a quick grep it seems the following subsystems are affected:
>
> drivers/media/video
> drivers/media/video/dvb/dvb-core
> drivers/gpu/drm
> sound/core
>
> Unless the issue is critical and should be fixed before 2.6.28, 
> drivers/media/video won't matter as the v4l core has already been moved to 
> the cdev API in the kernel tree, removing the offending code.
>
> Will you submit patches for the other three subsystems or would you like me to 
> take care of that ?
>   

I need to start concentrating on project work.  Feel free to take all
the hard work^W^W patch credits :).

It's not critical to me.  This only happens in the stress test because
it unloads the module "too soon" after loading it, while HAL tries to
open the new device.

It's a completely artificial test.  I ran it to see what happens with
input devices - the device numbers don't seem to be reallocated like
e.g. usb storage devices.  Continually reloading the uvcvideo driver
means the number assigned to input device increments each time, and you
get "input67" and so on.  I haven't worked out whether this is a bug
though.  For all I know the numbers are reused eventually, once they've
run through the entire device minor space.

Alan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-16 10:01           ` Alan Jenkins
  2008-10-16 12:03             ` Laurent Pinchart
@ 2008-10-24 14:31             ` Mauro Carvalho Chehab
  2008-10-25 11:19               ` Laurent Pinchart
  1 sibling, 1 reply; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2008-10-24 14:31 UTC (permalink / raw)
  To: Alan Jenkins; +Cc: Laurent Pinchart, linux-uvc-devel, linux-kernel

On Thu, 16 Oct 2008 11:01:27 +0100
Alan Jenkins <alan-jenkins@tuffmail.co.uk> wrote:

> Laurent Pinchart wrote:
> > Hi Alan,
> >
> > On Wednesday 15 October 2008, Alan Jenkins wrote:
> >   
> >> Laurent Pinchart wrote:
> >>     
> >>> On Wednesday 15 October 2008, Alan Jenkins wrote:
> >>>       
> >>>> If you look at the trace, it happens as "hald-probe-video" opens the
> >>>> video device.  This is from Ubuntu 8.04.  Possibly it's significant that
> >>>> I use the camera first, to make sure it works (I use Kopete, the
> >>>> settings dialogue includes a video test).
> >>>>         
> >>> The NULL pointer (or rather 0x00000030 pointer) dereference happens in
> >>> video_open:
> >>>
> >>>         file->f_op = fops_get(vfl->fops);
> >>>         if (file->f_op->open)
> >>>                 err = file->f_op->open(inode, file);
> >>>
> >>> file->f_op ends up being NULL. Either vfl->fops is NULL to begin with, or
> >>> fops_get failed to get a reference to the file_operations structure.
> >>>
> >>> I'd be surprised if vfl->fops was NULL. To rule out that case, can you
> >>> add a BUG_ON(vfl->fops == NULL) before the call to fops_get ?
> >>>
> >>> I'm not too familiar with the module loader, but a quick look at the code
> >>> shows that the module could be marked as being unloaded
> >>> (MODULE_STATE_GOING) before its exit function is called. If this is the
> >>> case video_open would still be called, as the video device would still be
> >>> registered, but fops_get would fail in try_module_get and return a NULL
> >>> pointer. It seems the pointer returned by fops_get should be tested in
> >>> video_open.
> >>>
> >>> I've CC'ed the v4l maintainer to get his opinion on this.

Sorry for being late with this. Too much work here...

I suspect that you only hit this bug due to BKL removal from open/close fops.
maybe you're calling open() before having the device fully initialized?

Anyway, I think that the proposed check is interesting on other places where
there are similar code.

Cheers,
Mauro

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test
  2008-10-24 14:31             ` Mauro Carvalho Chehab
@ 2008-10-25 11:19               ` Laurent Pinchart
  0 siblings, 0 replies; 9+ messages in thread
From: Laurent Pinchart @ 2008-10-25 11:19 UTC (permalink / raw)
  To: linux-uvc-devel; +Cc: Mauro Carvalho Chehab, Alan Jenkins, linux-kernel

Hi Mauro,

On Friday 24 October 2008, Mauro Carvalho Chehab wrote:
> On Thu, 16 Oct 2008 11:01:27 +0100
>
> Alan Jenkins <alan-jenkins@tuffmail.co.uk> wrote:
> > Laurent Pinchart wrote:
> > > Hi Alan,
> > >
> > > On Wednesday 15 October 2008, Alan Jenkins wrote:
> > >> Laurent Pinchart wrote:
> > >>> On Wednesday 15 October 2008, Alan Jenkins wrote:
> > >>>> If you look at the trace, it happens as "hald-probe-video" opens the
> > >>>> video device.  This is from Ubuntu 8.04.  Possibly it's significant
> > >>>> that I use the camera first, to make sure it works (I use Kopete,
> > >>>> the settings dialogue includes a video test).
> > >>>
> > >>> The NULL pointer (or rather 0x00000030 pointer) dereference happens
> > >>> in video_open:
> > >>>
> > >>>         file->f_op = fops_get(vfl->fops);
> > >>>         if (file->f_op->open)
> > >>>                 err = file->f_op->open(inode, file);
> > >>>
> > >>> file->f_op ends up being NULL. Either vfl->fops is NULL to begin
> > >>> with, or fops_get failed to get a reference to the file_operations
> > >>> structure.
> > >>>
> > >>> I'd be surprised if vfl->fops was NULL. To rule out that case, can
> > >>> you add a BUG_ON(vfl->fops == NULL) before the call to fops_get ?
> > >>>
> > >>> I'm not too familiar with the module loader, but a quick look at the
> > >>> code shows that the module could be marked as being unloaded
> > >>> (MODULE_STATE_GOING) before its exit function is called. If this is
> > >>> the case video_open would still be called, as the video device would
> > >>> still be registered, but fops_get would fail in try_module_get and
> > >>> return a NULL pointer. It seems the pointer returned by fops_get
> > >>> should be tested in video_open.
> > >>>
> > >>> I've CC'ed the v4l maintainer to get his opinion on this.
>
> Sorry for being late with this. Too much work here...
>
> I suspect that you only hit this bug due to BKL removal from open/close
> fops. maybe you're calling open() before having the device fully
> initialized?

It's actually the other way around. open() is closed while unloading the 
module. As v4l is being moved to cdev the bug will go away.

> Anyway, I think that the proposed check is interesting on other places
> where there are similar code.

Best regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-10-25 11:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <gbiduj$9p3$1@ger.gmane.org>
     [not found] ` <200810151417.14661.laurent.pinchart@skynet.be>
2008-10-15 16:43   ` [Linux-uvc-devel] [BUG] NULL pointer dereference caused by uvcvideo stress test Alan Jenkins
2008-10-15 18:17     ` Laurent Pinchart
2008-10-15 18:54       ` Alan Jenkins
2008-10-15 21:19         ` Laurent Pinchart
2008-10-16 10:01           ` Alan Jenkins
2008-10-16 12:03             ` Laurent Pinchart
2008-10-16 12:22               ` Alan Jenkins
2008-10-24 14:31             ` Mauro Carvalho Chehab
2008-10-25 11:19               ` Laurent Pinchart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).