LKML Archive on lore.kernel.org
* [BUG 2.4] NFS unlocking operation accesses invalid file struct 
@ 2003-11-25 11:00 Akinobu Mita
  2003-11-26  0:35 ` Trond Myklebust
  0 siblings, 1 reply; 7+ messages in thread
From: Akinobu Mita @ 2003-11-25 11:00 UTC
  To: linux-kernel

Hi,

I'm investigating the reliability of NFS locking.
I noticed a possible NFS-locking-related crash in the following situation:

process A
process B
  -- A and B share the same fd array.
     (clone()d with CLONE_FILES)

file F
  -- The file on NFS

file descriptor p (equivalent to file struct P)
file descriptor q (equivalent to file struct Q)

  -- p and q are individual file descriptors for the file F
     (not dup()-ed)

file lock L

  -- The file lock L was acquired via fcntl() on file descriptor q by
     process B (so it is associated with file struct Q)


1. Process A closes file descriptor p.

In filp_close(), as process A closes file struct P, it unlocks all the
file locks on the inode of file F that are held by any process sharing
the fd array that process A refers to (locks_remove_posix).

2. Process A unlocks the file lock L.

First, process A removes the file lock L from the inode's list of file
locks for file F. Then it calls `nfs_lock' to perform the
filesystem-specific part of the unlock.

3. While executing `nfs_lock', which involves an RPC call, process A
   sleeps for a while.

Meanwhile:
4. Process B closes file descriptor q.

Because process A has already removed the file lock's entry from the list,
process B cannot find it, so it returns without doing anything to the
list.
The system treats process B's close as complete while process A is still
sleeping.
Process B then invalidates file struct Q because it is no longer needed.

But process A has not yet finished unlocking the file lock L.

5. When process A wakes up, it attempts to carry out the rest of the
   unlock and accesses file struct Q.

Because file struct Q is no longer valid, this is likely to cause a NULL
pointer dereference.
File struct Q might also have been reused for another file, in which case
the data would become inconsistent.
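
The five steps above can be modelled in user space. This is only a toy
sketch: struct toy_file, struct toy_lock and every toy_* function are
hypothetical stand-ins invented for illustration, not kernel code, and the
"sleep" is simulated by simply running B's close between A's unlink and
A's later access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct toy_file {
	int count;	/* reference count, in the spirit of f_count */
	int valid;	/* cleared when the last reference is dropped */
};

struct toy_lock {
	struct toy_lock *next;
	const void *owner;	/* the shared fd array, in the spirit of fl_owner */
	struct toy_file *file;	/* file the lock was taken through */
};

/* Unlink every lock belonging to owner and return the unlinked chain. */
static struct toy_lock *toy_remove_posix(struct toy_lock **head,
					 const void *owner)
{
	struct toy_lock *removed = NULL, **before = head, *fl;

	while ((fl = *before) != NULL) {
		if (fl->owner == owner) {
			*before = fl->next;
			fl->next = removed;
			removed = fl;
			continue;
		}
		before = &fl->next;
	}
	return removed;
}

/* Drop one reference; the file becomes invalid when the count hits zero. */
static void toy_put(struct toy_file *f)
{
	if (--f->count == 0)
		f->valid = 0;
}

/* Returns 1 if Q is still valid when A resumes the unlock, 0 otherwise. */
int toy_race_q_valid_at_wakeup(void)
{
	int fd_array;	/* only its address is used, as the owner cookie */
	struct toy_file Q = { 1, 1 };
	struct toy_lock L = { NULL, &fd_array, &Q };
	struct toy_lock *i_flock = &L;
	struct toy_lock *pending;

	/* Steps 1-3: A closes p, unlinks L, then sleeps in the RPC. */
	pending = toy_remove_posix(&i_flock, &fd_array);
	assert(pending == &L && i_flock == NULL);

	/* Step 4: B closes q, finds no locks, drops the last reference to Q. */
	assert(toy_remove_posix(&i_flock, &fd_array) == NULL);
	toy_put(&Q);

	/* Step 5: A wakes up and goes back to the file behind L. */
	return pending->file->valid;
}
```

In this model toy_race_q_valid_at_wakeup() returns 0: by the time A
resumes, nothing is keeping Q alive.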

Does anyone have an idea of how to fix it?

Regards,
-- 
Akinobu Mita



* Re: [BUG 2.4] NFS unlocking operation accesses invalid file struct
  2003-11-25 11:00 [BUG 2.4] NFS unlocking operation accesses invalid file struct Akinobu Mita
@ 2003-11-26  0:35 ` Trond Myklebust
  2003-11-27 11:54   ` Akinobu Mita
  0 siblings, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2003-11-26  0:35 UTC
  To: Akinobu Mita; +Cc: linux-kernel

>>>>> " " == Akinobu Mita <mita@miraclelinux.com> writes:


     > Does anyone have a idea of how to fix it ?

Yes. I posted a patch about a week or 2 ago. The original patch can be
found on

  http://www.fys.uio.no/~trondmy/src/Linux-2.4.x/2.4.23-rc1/linux-2.4.23-01-posix_race.dif

However, I now believe the real problem here is that
locks_remove_posix() should also be checking the pid (as is done in
all the other POSIX locking checks by calling locks_same_owner()).

It is wrong for locks_remove_posix() to be deleting locks that don't
belong to this pid... Note: this bug exists in 2.6.x too, although
there it does not cause an Oops...

Cheers,
  Trond

--- linux-2.4.23-rc1/fs/locks.c.orig	2003-11-16 19:30:53.000000000 -0500
+++ linux-2.4.23-rc1/fs/locks.c	2003-11-25 19:34:02.000000000 -0500
@@ -1746,7 +1746,8 @@
 	lock_kernel();
 	before = &inode->i_flock;
 	while ((fl = *before) != NULL) {
-		if ((fl->fl_flags & FL_POSIX) && fl->fl_owner == owner) {
+		if ((fl->fl_flags & FL_POSIX) && fl->fl_owner == owner &&
+				fl->fl_pid == current->pid) {
 			locks_unlock_delete(before);
 			before = &inode->i_flock;
 			continue;
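
The effect of the extra test can be sketched in user space. This is a toy
model under stated assumptions: struct toy_lock and both toy_match_*
functions are hypothetical names that only mirror the condition in the
hunk above, with the closing task's pid passed in explicitly instead of
being read from current->pid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct file_lock. */
struct toy_lock {
	int posix;		/* nonzero for a POSIX (fcntl) lock */
	const void *owner;	/* shared fd array, in the spirit of fl_owner */
	int pid;		/* pid of the task that took the lock */
};

/* Condition before the patch: lock type and owner only. */
int toy_match_unpatched(const struct toy_lock *fl, const void *owner)
{
	assert(fl != NULL);
	return fl->posix && fl->owner == owner;
}

/* Condition after the patch: the closing task's pid must match as well. */
int toy_match_patched(const struct toy_lock *fl, const void *owner, int pid)
{
	assert(fl != NULL);
	return fl->posix && fl->owner == owner && fl->pid == pid;
}
```

With a lock taken by pid 200 and a close issued by pid 100 on the same fd
array, the unpatched condition matches and the patched one does not, so
the closing task leaves the other task's lock on the list.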


* Re: [BUG 2.4] NFS unlocking operation accesses invalid file struct
  2003-11-26  0:35 ` Trond Myklebust
@ 2003-11-27 11:54   ` Akinobu Mita
  2003-11-27 15:15     ` Gene Heskett
  2003-11-27 16:23     ` Trond Myklebust
  0 siblings, 2 replies; 7+ messages in thread
From: Akinobu Mita @ 2003-11-27 11:54 UTC
  To: Trond Myklebust; +Cc: linux-kernel

Thanks, Trond.

But your patch causes a memory leak.


# gcc leak.c -o leak -lpthread
# find /usr -type f -exec ./leak {} \; &
# while true; do sleep 1; grep file_lock_cache /proc/slabinfo;done

-- leak.c --
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

/* Thread B: take a one-byte read lock on every other byte of the file. */
void *process_B(void *arg)
{
	int i;
	struct stat st;
	int fd = *(int *)arg;
	struct flock lck;

	if (fstat(fd, &st) < 0) {
		perror("fstat");
		return NULL;
	}
	for (i = 0; i < st.st_size / 2; i++) {
		lck.l_type = F_RDLCK;
		lck.l_whence = SEEK_SET;
		lck.l_start = 2 * i;
		lck.l_len = 1;
		/* The ranges are not adjacent, so they cannot be merged. */
		if (fcntl(fd, F_SETLK, &lck) < 0) {
			perror("fcntl");
			return NULL;
		}
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int p;
	pthread_t tid;

	if (argc < 2) {
		fprintf(stderr, "usage: %s file\n", argv[0]);
		exit(1);
	}
	p = open(argv[1], O_RDWR);
	if (p < 0) {
		perror("open");
		exit(1);
	}
	pthread_create(&tid, NULL, process_B, &p);
	pthread_join(tid, NULL);
	/* Closed by the main thread, not by the thread that took the locks. */
	if (close(p) < 0)
		perror("close");
	exit(0);
}
----
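
Why the locks pile up can be sketched with a toy model. It assumes that
the thread which takes the locks and the main thread which calls close()
have different pids (as appears to be the case for pthreads on 2.4) while
sharing one fd array. struct toy_lock and the toy_* functions are
hypothetical, and the list walk only mimics the patched condition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct file_lock. */
struct toy_lock {
	struct toy_lock *next;
	const void *owner;	/* shared fd array */
	int pid;		/* pid of the task that took the lock */
};

/*
 * Unlink the locks that match both owner and closer_pid, as the patched
 * condition does, and return how many locks are still on the list.
 */
size_t toy_close_with_pid_check(struct toy_lock **head, const void *owner,
				int closer_pid)
{
	struct toy_lock **before = head, *fl;
	size_t left = 0;

	while ((fl = *before) != NULL) {
		if (fl->owner == owner && fl->pid == closer_pid) {
			*before = fl->next;
			continue;
		}
		left++;
		before = &fl->next;
	}
	return left;
}

/* n locks are taken by worker_pid; the file is then closed by main_pid. */
size_t toy_leaked_after_close(size_t n, int worker_pid, int main_pid)
{
	struct toy_lock locks[16];
	struct toy_lock *head = NULL;
	int files;	/* only its address is used, as the owner cookie */
	size_t i;

	assert(n <= 16);
	for (i = 0; i < n; i++) {
		locks[i].next = head;
		locks[i].owner = &files;
		locks[i].pid = worker_pid;
		head = &locks[i];
	}
	return toy_close_with_pid_check(&head, &files, main_pid);
}
```

When the two pids differ, every lock survives the close; when they are
equal, none does. In the model, a surviving lock has nobody left to remove
it, which would match a file_lock_cache count that keeps growing.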

It seems that your other patch could not avoid the race completely.

Cheers,
--
Akinobu Mita






* Re: [BUG 2.4] NFS unlocking operation accesses invalid file struct
  2003-11-27 11:54   ` Akinobu Mita
@ 2003-11-27 15:15     ` Gene Heskett
  2003-11-27 16:23     ` Trond Myklebust
  1 sibling, 0 replies; 7+ messages in thread
From: Gene Heskett @ 2003-11-27 15:15 UTC
  To: Akinobu Mita, Trond Myklebust; +Cc: linux-kernel

On Thursday 27 November 2003 06:54, Akinobu Mita wrote:

From somebody trying to learn something here.

Unpatched 2.6.0-test11 here, using the anticipatory scheduler.

What should I expect to see occurring when this is executed?

Here, after a few initial cycles of the numbers getting larger, then 
stepping smaller and restarting the rise, eventually (a minute or so) 
the numbers started to rise and never stopped till I killed it.
The first 2 numbers always matched, and a much smaller pair near the 
end of the line always matched, the first pair being something above 
30,000 when I stopped it after about 2 1/2 minutes.
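
For reading those numbers programmatically, here is a minimal sketch. It
assumes that the first two numeric columns of a /proc/slabinfo line are
the active and total object counts for the named cache; the function name
parse_slab_line is made up for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/slabinfo. Returns 1 and stores the first two
 * numeric columns in *active and *total if the line belongs to the cache
 * called name. Returns 0 and leaves both untouched otherwise.
 */
int parse_slab_line(const char *line, const char *name,
		    unsigned long *active, unsigned long *total)
{
	char cache[64];
	unsigned long a, t;

	if (sscanf(line, "%63s %lu %lu", cache, &a, &t) != 3)
		return 0;
	if (strcmp(cache, name) != 0)
		return 0;
	*active = a;
	*total = t;
	return 1;
}
```

Feeding it each line read from /proc/slabinfo with name set to
"file_lock_cache" would give the pair of counts to watch over time.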

>Thanks, Trond.
>
>but, your patch causes memory leak.

[snip code]

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz  512M
99.27% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.



* Re: [BUG 2.4] NFS unlocking operation accesses invalid file struct
  2003-11-27 11:54   ` Akinobu Mita
  2003-11-27 15:15     ` Gene Heskett
@ 2003-11-27 16:23     ` Trond Myklebust
  2003-12-10  1:06       ` Akinobu Mita
  1 sibling, 1 reply; 7+ messages in thread
From: Trond Myklebust @ 2003-11-27 16:23 UTC
  To: Akinobu Mita; +Cc: linux-kernel

>>>>> " " == Akinobu Mita <mita@miraclelinux.com> writes:

     > Thanks, Trond.  but, your patch causes memory leak.


Yep. Worse: pthreads assumes that we don't use the pid as the lock
owner. That again means that the test in locks_same_owner() is
incorrect.
For 2.6.x, the NPTL further complicates matters by introducing the
tgid as their equivalent of the posix process id, and not tying
CLONE_THREAD to CLONE_FILES. AFAICS there's nothing we can do about
that...



So then the correct thing to do is indeed to wrap the call to
locks_unlock_delete() with an fget()/fput() pair, and then to remove
the test for fl_pid in locks_same_owner().
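
The idea of holding a reference across the unlock can be sketched in user
space. This is a toy model, not the actual patch: struct toy_file,
toy_get() and toy_put() are hypothetical and only imitate reference
counting. The kernel's fget() looks a file up by descriptor number, which
the model skips.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted struct file. */
struct toy_file {
	int count;	/* number of references held */
	int valid;	/* cleared when the last reference is dropped */
};

/* Take an extra reference on a file that is still alive. */
void toy_get(struct toy_file *f)
{
	assert(f->valid);
	f->count++;
}

/* Drop a reference; the file becomes invalid when the count hits zero. */
void toy_put(struct toy_file *f)
{
	if (--f->count == 0)
		f->valid = 0;
}

/*
 * Run an unlock while holding an extra reference. concurrent_close
 * stands in for another task closing the file while the unlock sleeps.
 * Returns 1 if the file was still valid during the unlock.
 */
int toy_unlock_pinned(struct toy_file *q,
		      void (*concurrent_close)(struct toy_file *))
{
	int valid_during_unlock;

	toy_get(q);			/* pin the file before unlocking */
	concurrent_close(q);		/* the other task drops its reference */
	valid_during_unlock = q->valid;
	toy_put(q);			/* unpin; this may be the last reference */
	return valid_during_unlock;
}
```

Starting from a file with a single reference and passing toy_put as the
concurrent close, the file stays valid for the whole unlock and is only
invalidated by the final toy_put().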

We then need to fix lockd so that it generates correct fl_owners for
its locks...

Let me see if I can get that right.

Cheers,
  Trond


* Re: [BUG 2.4] NFS unlocking operation accesses invalid file struct
  2003-11-27 16:23     ` Trond Myklebust
@ 2003-12-10  1:06       ` Akinobu Mita
  2003-12-10  1:27         ` hanasaki
  0 siblings, 1 reply; 7+ messages in thread
From: Akinobu Mita @ 2003-12-10  1:06 UTC
  To: trond.myklebust; +Cc: linux-kernel

Hello Trond,

I apologize for the delay in responding.

On Friday 28 November 2003 01:23, Trond Myklebust wrote:
> So then the correct thing to do is indeed to wrap the call to
> locks_unlock_delete() with an fget()/fput() pair, and then to remove
> the test for fl_pid in locks_same_owner().
>
> We then need to fix lockd so that it generates correct fl_owners for
> its locks...
>
> Let me see if I can get that right.
>

I looked at your patch carefully
(http://www.fys.uio.no/~trondmy/src/Linux-2.4.x/2.4.23-rc1/linux-2.4.23-01-posix_race.dif)
and I think it would fix the problem completely.

Thanks,

--
Akinobu Mita



* Re: [BUG 2.4] NFS unlocking operation accesses invalid file struct
  2003-12-10  1:06       ` Akinobu Mita
@ 2003-12-10  1:27         ` hanasaki
  0 siblings, 0 replies; 7+ messages in thread
From: hanasaki @ 2003-12-10  1:27 UTC
  Cc: linux-kernel

Could this be related to the problem I had?

Debian sarge NFS client on 2.4.23
Debian sarge NFS server on 2.6test11 - RPC number errors and bad locks


