LKML Archive on lore.kernel.org
* Re: your mail
  2007-02-05 15:41 logic
@ 2007-02-05 12:36 ` Joerg Roedel
  2007-02-05 14:01   ` Pekka Enberg
  0 siblings, 1 reply; 341+ messages in thread
From: Joerg Roedel @ 2007-02-05 12:36 UTC (permalink / raw)
  To: logic; +Cc: linux-kernel

On Mon, Feb 05, 2007 at 05:41:29PM +0200, logic@thinknet.ro wrote:
> Good morning,
> 
> I am experiencing a bug, I think. I am running a 2.6.19.2 kernel on a 3 GHz
> Intel with HT enabled, 1 GB RAM, and a no-name motherboard. Here is the
> output of the hang:

Hmm, this seems to be the same issue as in [1] and [2]. A page that is
assumed to belong to the slab but is no longer marked as a slab page.
Could this be a bug in the memory management?

Joerg

[1] http://lkml.org/lkml/2007/2/4/77
[2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=406477

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




* Re: your mail
  2007-02-05 12:36 ` your mail Joerg Roedel
@ 2007-02-05 14:01   ` Pekka Enberg
  2007-02-06  9:41     ` Joerg Roedel
  0 siblings, 1 reply; 341+ messages in thread
From: Pekka Enberg @ 2007-02-05 14:01 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: logic, linux-kernel

Hi Joerg,

On 2/5/07, Joerg Roedel <joerg.roedel@amd.com> wrote:
> Hmm, this seems to be the same issue as in [1] and [2]. A page that is
> assumed to belong to the slab but is no longer marked as a slab page.
> Could this be a bug in the memory management?

The BUG_ON triggers whenever you feed an invalid pointer to kfree() or
kmem_cache_free(), so I am guessing the caller is simply broken. Note
that kernels prior to 2.6.18 would quietly corrupt the slab unless
CONFIG_DEBUG_SLAB was enabled, which might explain why this hasn't been
noticed before.

                               Pekka
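
The check described in this reply can be modelled in user space. The sketch
below is not the kernel's mm/slab.c: struct fake_page, FAKE_PG_SLAB and the
fake_page_*() helpers are hypothetical names, used only to illustrate why a
pointer whose page is not marked as a slab page can be caught at free time
instead of silently corrupting allocator state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; stands in for the "this is a slab page" marker. */
#define FAKE_PG_SLAB (1u << 0)

/* Hypothetical page descriptor; NOT the kernel's struct page. */
struct fake_page {
	unsigned int flags; /* FAKE_PG_SLAB is set while the allocator owns the page */
	void *cache;        /* owning cache; only meaningful for slab pages */
};

/* True if the page is currently marked as belonging to the slab allocator. */
static bool fake_page_is_slab(const struct fake_page *page)
{
	assert(page != NULL);
	return (page->flags & FAKE_PG_SLAB) != 0;
}

/*
 * Model of the lookup done on the free path: map a page back to its cache.
 * Returns NULL in the situation where the real allocator would hit its
 * BUG_ON(), i.e. the caller passed a pointer that does not live in a slab
 * page (never allocated from the slab, or already given back).
 */
static void *fake_page_get_cache(const struct fake_page *page)
{
	if (!fake_page_is_slab(page))
		return NULL;
	return page->cache;
}
```

A caller would treat a NULL result as "invalid pointer passed to free"; the
kernel stops with a BUG at that point, whereas this model only reports it.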


* (no subject)
@ 2007-02-05 15:41 logic
  2007-02-05 12:36 ` your mail Joerg Roedel
  0 siblings, 1 reply; 341+ messages in thread
From: logic @ 2007-02-05 15:41 UTC (permalink / raw)
  To: linux-kernel

Good morning,

I am experiencing a bug, I think. I am running a 2.6.19.2 kernel on a 3 GHz
Intel with HT enabled, 1 GB RAM, and a no-name motherboard. Here is the
output of the hang:
------------[ cut here ]------------

Message from syslogd@mail at Fri Feb  2 22:47:32 2007 ...
mail kernel: kernel BUG at mm/slab.c:607!
mail kernel: invalid opcode: 0000 [#1]
mail kernel: CPU:    1
mail kernel: EIP:    0060:[<c0155ebb>]    Not tainted VLI
mail kernel: eax: 1af3a451   ebx: c89ba000   ecx: 4a5ffc80   edx: c214bfe0
mail kernel: ds: 007b   es: 007b   ss: 0068
mail kernel: EIP is at free_block+0x44/0xda
mail kernel: PREEMPT SMP
mail kernel: EFLAGS: 00010046   (2.6.19.2 #1)
mail kernel: Call Trace:
mail kernel:  =======================
mail kernel: Process events/1 (pid: 9, ti=c19b6000 task=c1983a90 task.ti=c19b6000)
mail kernel: Stack: 00000000 0000001e c196d018 c196d018 0000001e c196d000 00000000 c015664d
mail kernel:        00000000 00000000 c196fc80 c19485c0 c196fc80 c19485c0 c194b140 00000213



-- 
Regards,
Alexandru Gheorghita
Technical Manager
Think.Net
phone: 0788.700.842
mail: logic@thinknet.ro




* Re: your mail
  2007-02-05 14:01   ` Pekka Enberg
@ 2007-02-06  9:41     ` Joerg Roedel
  0 siblings, 0 replies; 341+ messages in thread
From: Joerg Roedel @ 2007-02-06  9:41 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: logic, linux-kernel

On Mon, Feb 05, 2007 at 04:01:23PM +0200, Pekka Enberg wrote:
> Hi Joerg,
> 
> On 2/5/07, Joerg Roedel <joerg.roedel@amd.com> wrote:
> >Hmm, this seems to be the same issue as in [1] and [2]. A page that is
> >assumed to belong to the slab but is no longer marked as a slab page.
> >Could this be a bug in the memory management?
> 
> The BUG_ON triggers whenever you feed an invalid pointer to kfree() or
> kmem_cache_free(), so I am guessing the caller is simply broken. Note
> that kernels prior to 2.6.18 would quietly corrupt the slab unless
> CONFIG_DEBUG_SLAB was enabled, which might explain why this hasn't been
> noticed before.

OK, I was not aware of that. Thanks for the clarification.

Joerg

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




* Re: your mail
       [not found] <CAP7CzPfLu6mm6f2fon-zez3PW6rDACEH6ihF2aG+1Dc7Zc2WuQ@mail.gmail.com>
@ 2021-09-13  6:06 ` Willy Tarreau
  0 siblings, 0 replies; 341+ messages in thread
From: Willy Tarreau @ 2021-09-13  6:06 UTC (permalink / raw)
  To: zhao xc
  Cc: tglx, peterz, keescook, mingo, joe, john.garry, song.bao.hua,
	linux-kernel

Hi,

On Mon, Sep 13, 2021 at 01:32:51PM +0800, zhao xc wrote:
> Hi maintainer:
>         delete blank line between two enum definitions

Could you please make sure to place a subject (and a meaningful one) in
the "subject" field of your e-mails? There's nothing more annoying than
receiving messages with no subject and having to read them to figure out
you were not interested!

Thanks,
Willy


* Re: your mail
  2021-08-21  8:59 Kari Argillander
@ 2021-08-22 13:13 ` CGEL
  0 siblings, 0 replies; 341+ messages in thread
From: CGEL @ 2021-08-22 13:13 UTC (permalink / raw)
  To: Kari Argillander
  Cc: viro, christian.brauner, jamorris, gladkov.alexey, yang.yang29,
	tj, paul.gortmaker, linux-fsdevel, linux-kernel, Zeal Robot

On Sat, Aug 21, 2021 at 11:59:39AM +0300, Kari Argillander wrote:
> Bcc:
> Subject: Re: [PATCH] proc: prevent mount proc on same mountpoint in one pid
>  namespace
> Reply-To:
> In-Reply-To: <20210821083105.30336-1-yang.yang29@zte.com.cn>
> 
> On Sat, Aug 21, 2021 at 01:31:05AM -0700, cgel.zte@gmail.com wrote:
> > From: Yang Yang <yang.yang29@zte.com.cn>
> > 
> > Patch "proc: allow to mount many instances of proc in one pid namespace"
> > aims to mount many instances of proc on different mountpoints, see
> > tools/testing/selftests/proc/proc-multiple-procfs.c.
> > 
> > But there is a side effect: a user can mount many instances of proc on
> > the same mountpoint in one pid namespace, which was not allowed before.
> > This duplicate mount makes no sense and wastes memory and CPU, and the
> > user may be confused about why the kernel allows it.
> > 
> > The logic of this patch is: when trying to mount proc on /mnt, check
> > whether a proc instance is already mounted on /mnt in the same pid
> > namespace. If so, return -EBUSY.
> > 
> > This check can't be done in proc_get_tree(), which calls
> > get_tree_nodev() and creates a new super_block unconditionally.
> > Other nodev filesystems may face the same case, so add a new hook in
> > fs_context_operations.
> > 
> > Reported-by: Zeal Robot <zealci@zte.com.cn>
> > Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
> > ---
> >  fs/namespace.c             |  9 +++++++++
> >  fs/proc/root.c             | 15 +++++++++++++++
> >  include/linux/fs_context.h |  1 +
> >  3 files changed, 25 insertions(+)
> > 
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index f79d9471cb76..84da649a70c5 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -2878,6 +2878,7 @@ static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint,
> >  static int do_new_mount(struct path *path, const char *fstype, int sb_flags,
> >  			int mnt_flags, const char *name, void *data)
> >  {
> > +	int (*check_mntpoint)(struct fs_context *fc, struct path *path);
> >  	struct file_system_type *type;
> >  	struct fs_context *fc;
> >  	const char *subtype = NULL;
> > @@ -2906,6 +2907,13 @@ static int do_new_mount(struct path *path, const char *fstype, int sb_flags,
> >  	if (IS_ERR(fc))
> >  		return PTR_ERR(fc);
> >  
> > +	/* check if there is a same super_block mount on path*/
> > +	check_mntpoint = fc->ops->check_mntpoint;
> > +	if (check_mntpoint)
> > +		err = check_mntpoint(fc, path);
> > +	if (err < 0)
> > +		goto err_fc;
> > +
> >  	if (subtype)
> >  		err = vfs_parse_fs_string(fc, "subtype",
> >  					  subtype, strlen(subtype));
> > @@ -2920,6 +2928,7 @@ static int do_new_mount(struct path *path, const char *fstype, int sb_flags,
> >  	if (!err)
> >  		err = do_new_mount_fc(fc, path, mnt_flags);
> >  
> > +err_fc:
> >  	put_fs_context(fc);
> >  	return err;
> >  }
> > diff --git a/fs/proc/root.c b/fs/proc/root.c
> > index c7e3b1350ef8..0971d6b0bec2 100644
> > --- a/fs/proc/root.c
> > +++ b/fs/proc/root.c
> > @@ -237,11 +237,26 @@ static void proc_fs_context_free(struct fs_context *fc)
> >  	kfree(ctx);
> >  }
> >  
> > +static int proc_check_mntpoint(struct fs_context *fc, struct path *path)
> > +{
> > +	struct super_block *mnt_sb = path->mnt->mnt_sb;
> > +	struct proc_fs_info *fs_info;
> > +
> > +	if (strcmp(mnt_sb->s_type->name, "proc") == 0) {
> > +		fs_info = mnt_sb->s_fs_info;
> > +		if (fs_info->pid_ns == task_active_pid_ns(current) &&
> > +		    path->mnt->mnt_root == path->dentry)
> > +			return -EBUSY;
> > +	}
> > +	return 0;
> > +}
> > +
> >  static const struct fs_context_operations proc_fs_context_ops = {
> >  	.free		= proc_fs_context_free,
> >  	.parse_param	= proc_parse_param,
> >  	.get_tree	= proc_get_tree,
> >  	.reconfigure	= proc_reconfigure,
> > +	.check_mntpoint	= proc_check_mntpoint,
> >  };
> >  
> >  static int proc_init_fs_context(struct fs_context *fc)
> > diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h
> > index 6b54982fc5f3..090a05fb2d7d 100644
> > --- a/include/linux/fs_context.h
> > +++ b/include/linux/fs_context.h
> > @@ -119,6 +119,7 @@ struct fs_context_operations {
> >  	int (*parse_monolithic)(struct fs_context *fc, void *data);
> >  	int (*get_tree)(struct fs_context *fc);
> >  	int (*reconfigure)(struct fs_context *fc);
> > +	int (*check_mntpoint)(struct fs_context *fc, struct path *path);
> 
> Don't you think this should be its own patch? It is, after all, an internal
> API change. This also needs documentation. It would be confusing if
> someone converts to the new mount API and there is one line that just
> addresses some proc stuff, while the commit message does not even say
> whether every fs needs to add this.
> 
> Documentation is in very good shape right now, and we are in a phase where
> everyone is migrating to the new mount API, so everything should be well
> documented.
>
Thanks for your reply!

I will write the commit message more carefully next time.
Since I was not quite sure about this patch, I didn't write
documentation for patch v1. Al Viro has made it clear, so this
patch is abandoned.
> >  };
> >  
> >  /*
> > -- 
> > 2.25.1
> > 


* Re: your mail
  2021-08-16  2:46 Kari Argillander
@ 2021-08-16 12:27 ` Christoph Hellwig
  0 siblings, 0 replies; 341+ messages in thread
From: Christoph Hellwig @ 2021-08-16 12:27 UTC (permalink / raw)
  To: Kari Argillander
  Cc: Konstantin Komarov, Christoph Hellwig, ntfs3, linux-kernel,
	linux-fsdevel, Pali Rohár, Matthew Wilcox

On Mon, Aug 16, 2021 at 05:46:59AM +0300, Kari Argillander wrote:
> I would really like to get fsparam_flag_no for no_acs_rules as well,
> but then we would have to come up with a new name for it. Another
> possibility is to modify the mount API so that a mount option can be
> negated with either no or no_. I think that might be a good change.

I don't think adding another no_ alias is a good idea.  I'd suggest
to just rename the existing flag before the ntfs3 driver ever hits
mainline.


* Re: your mail
  2021-04-07  8:25 ` your mail Huang Rui
@ 2021-04-07  9:25   ` Christian König
  0 siblings, 0 replies; 341+ messages in thread
From: Christian König @ 2021-04-07  9:25 UTC (permalink / raw)
  To: Huang Rui, songqiang; +Cc: airlied, daniel, linux-kernel, dri-devel

Thanks Ray for pointing this out. Looks like the mail ended up in my 
spam folder otherwise.

Apart from that this patch is a really really big NAK. I can't count how 
often I had to reject stuff like this!

Using the page reference for TTM pages is illegal and can lead to struct 
page corruption.

Can you please describe why you need that?

Regards,
Christian.

Am 07.04.21 um 10:25 schrieb Huang Rui:
> On Wed, Apr 07, 2021 at 09:27:46AM +0800, songqiang wrote:
>
> Please add the description in the commit message and subject.
>
> Thanks,
> Ray
>
>> Signed-off-by: songqiang <songqiang@uniontech.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_page_alloc.c | 18 ++++++++++++++----
>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
>> index 14660f723f71..f3698f0ad4d7 100644
>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
>> @@ -736,8 +736,16 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
>>   					if (++p != pages[i + j])
>>   					    break;
>>   
>> -				if (j == HPAGE_PMD_NR)
>> +				if (j == HPAGE_PMD_NR) {
>>   					order = HPAGE_PMD_ORDER;
>> +					for (j = 1; j < HPAGE_PMD_NR; ++j)
>> +						page_ref_dec(pages[i+j]);
>> +				}
>>   			}
>>   #endif
>>   
>> @@ -868,10 +876,12 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
>>   				p = alloc_pages(huge_flags, HPAGE_PMD_ORDER);
>>   				if (!p)
>>   					break;
>> -
>> -				for (j = 0; j < HPAGE_PMD_NR; ++j)
>> +				for (j = 0; j < HPAGE_PMD_NR; ++j) {
>>   					pages[i++] = p++;
>> -
>> +					if (j > 0)
>> +						page_ref_inc(pages[i-1]);
>> +				}
>>   				npages -= HPAGE_PMD_NR;
>>   			}
>>   		}
>>
>>
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel



* Re: your mail
  2021-04-07  1:27 [PATCH] drivers/gpu/drm/ttm/ttm_page_allo.c: adjust ttm pages refcount fix the bug:
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271034] BUG: Bad page state in process blur_image pfn:7aee2
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271037] page:980000025fca4170 count:0 mapcount:0 mapping:980000025a0dca60 index:0x0
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271039] flags: 0x1e01fff000000()
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271042] raw: 0001e01fff000000 0000000000000100 0000000000000200 980000025a0dca60
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271044] raw: 0000000000000000 0000000000000000 00000000ffffffff
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271046] page dumped because: non-NULL mapping
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271047] Modules linked in: bnep fuse bluetooth ecdh_generic sha256_generic cfg80211 rfkill vfat fat serio_raw uio_pdrv_genirq binfmt_misc ip_tables amdgpu chash radeon r8168 loongson gpu_sched
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271059] CPU: 3 PID: 9554 Comm: blur_image Tainted: G B 4.19.0-loongson-3-desktop #3036
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271061] Hardware name: Haier Kunlun-LS3A4000-LS7A-desktop/Kunlun-LS3A4000-LS7A-desktop, BIOS Kunlun-V4.0.12V4.0 LS3A4000 03/19/2020
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271063] Stack : 000000000000007b 000000007400cce0 0000000000000000 0000000000000007
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271067] 0000000000000000 0000000000000000 0000000000002a82 ffffffff8202c910
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271070] 0000000000000000 0000000000002a82 0000000000000000 ffffffff81e20000
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271074] 0000000000000000 ffffffff8021301c ffffffff82040000 6e754b20534f4942
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271078] ffff000000000000 0000000000000000 000000007400cce0 0000000000000000
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271082] 9800000007155d40 ffffffff81cc5470 0000000000000005 6db6db6db6db0000
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271086] 0000000000000003 fffffffffffffffb 0000000000006000 98000002559f4000
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271090] 980000024a448000 980000024a44b7f0 9800000007155d50 ffffffff819f5158
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271094] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271097] 9800000007155d40 ffffffff802310c4 ffffffff81e70000 ffffffff819f5158
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271101] ...
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271103] Call Trace:
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271107] [<ffffffff802310c4>] show_stack+0x44/0x1c0
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271110] [<ffffffff819f5158>] dump_stack+0x1d8/0x240
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271113] [<ffffffff80491c10>] bad_page+0x210/0x2c0
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271116] [<ffffffff804931c8>] free_pcppages_bulk+0x708/0x900
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271119] [<ffffffff804980cc>] free_unref_page_list+0x1cc/0x2c0
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271122] [<ffffffff804ad2c8>] release_pages+0x648/0x900
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271125] [<ffffffff804f3b48>] tlb_flush_mmu_free+0x88/0x100
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271128] [<ffffffff804f8a24>] zap_pte_range+0xa24/0x1480
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271132] [<ffffffff804f98b0>] unmap_page_range+0x1f0/0x500
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271135] [<ffffffff804fa054>] unmap_vmas+0x154/0x200
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271138] [<ffffffff8051190c>] exit_mmap+0x20c/0x380
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271142] [<ffffffff802bb9c8>] mmput+0x148/0x300
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271145] [<ffffffff802c80d8>] do_exit+0x6d8/0x1900
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271148] [<ffffffff802cb288>] do_group_exit+0x88/0x1c0
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271151] [<ffffffff802cb3d8>] sys_exit_group+0x18/0x40
    Feb 6 17:13:13 aaa-PC kernel: [ 466.271155] [<ffffffff8023f954>] syscall_common+0x34/0xa4 songqiang
@ 2021-04-07  8:25 ` Huang Rui
  2021-04-07  9:25   ` Christian König
  0 siblings, 1 reply; 341+ messages in thread
From: Huang Rui @ 2021-04-07  8:25 UTC (permalink / raw)
  To: songqiang; +Cc: Koenig, Christian, airlied, daniel, linux-kernel, dri-devel

On Wed, Apr 07, 2021 at 09:27:46AM +0800, songqiang wrote:

Please add the description in the commit message and subject.

Thanks,
Ray

> Signed-off-by: songqiang <songqiang@uniontech.com>
> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index 14660f723f71..f3698f0ad4d7 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -736,8 +736,16 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
>  					if (++p != pages[i + j])
>  					    break;
>  
> -				if (j == HPAGE_PMD_NR)
> +				if (j == HPAGE_PMD_NR) {
>  					order = HPAGE_PMD_ORDER;
> +					for (j = 1; j < HPAGE_PMD_NR; ++j)
> +						page_ref_dec(pages[i+j]);
> +				}
>  			}
>  #endif
>  
> @@ -868,10 +876,12 @@ static int ttm_get_pages(struct page **pages, unsigned npages, int flags,
>  				p = alloc_pages(huge_flags, HPAGE_PMD_ORDER);
>  				if (!p)
>  					break;
> -
> -				for (j = 0; j < HPAGE_PMD_NR; ++j)
> +				for (j = 0; j < HPAGE_PMD_NR; ++j) {
>  					pages[i++] = p++;
> -
> +					if (j > 0)
> +						page_ref_inc(pages[i-1]);
> +				}
>  				npages -= HPAGE_PMD_NR;
>  			}
>  		}
> 
> 
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: your mail
  2021-04-01 21:16 Bhaumik Bhatt
@ 2021-04-07  6:56 ` Manivannan Sadhasivam
  0 siblings, 0 replies; 341+ messages in thread
From: Manivannan Sadhasivam @ 2021-04-07  6:56 UTC (permalink / raw)
  To: Bhaumik Bhatt
  Cc: linux-arm-msm, hemantk, jhugo, linux-kernel, carl.yin,
	naveen.kumar, loic.poulain

On Thu, Apr 01, 2021 at 02:16:09PM -0700, Bhaumik Bhatt wrote:
> Subject: [PATCH v8 0/9] Updates to MHI channel handling
> 

Subject is present in the body ;)

> MHI specification shows a state machine with support for STOP channel command
> and the validity of certain state transitions. MHI host currently does not
> provide any mechanism to stop a channel and restart it without resetting it.
> There are also times when the device moves on to a different execution
> environment while client drivers on the host are unaware of it and still
> attempt to reset the channels facing unnecessary timeouts.
> 
> This series addresses the above areas to provide support for stopping an MHI
> channel, resuming it back, improved documentation and improving upon channel
> state machine handling in general.
> 
> This set of patches was tested on arm64 and x86_64 architecture.
> 

Series applied to mhi-next!

Thanks,
Mani

> v8:
> -Split the state machine improvements patch to three patches as per review
> 
> v7:
> -Tested on x86_64 architecture
> -Drop the patch "Do not clear channel context more than once" as issue is fixed
> differently using "bus: mhi: core: Fix double dma free()"
> -Update the commit text to better reflect changes on state machine improvements
> 
> v6:
> -Dropped the patch which introduced start/stop transfer APIs for lack of users
> -Updated error handling and debug prints on channel handling improvements patch
> -Improved commit text to better explain certain patches based on review comments
> -Removed references to new APIs from the documentation improvement patch
> 
> v5:
> -Added reviewed-by tags from Hemant I missed earlier
> -Added patch to prevent kernel warnings on clearing channel context twice
> 
> v4:
> -Updated commit text/descriptions and addressed checkpatch checks
> -Added context validity check before starting/stopping channels from new API
> -Added patch to clear channel context configuration after reset/unprepare
> 
> v3:
> -Updated documentation for channel transfer APIs to highlight differences
> -Create separate patch for "allowing channel to be disabled from stopped state"
> 
> v2:
> -Renamed the newly introduced APIs to mhi_start_transfer() / mhi_stop_transfer()
> -Added improved documentation to avoid confusion with the new APIs
> -Removed the __ prefix from mhi_unprepare_channel() API for consistency.
> 
> Bhaumik Bhatt (9):
>   bus: mhi: core: Allow sending the STOP channel command
>   bus: mhi: core: Clear context for stopped channels from remove()
>   bus: mhi: core: Improvements to the channel handling state machine
>   bus: mhi: core: Update debug messages to use client device
>   bus: mhi: core: Hold device wake for channel update commands
>   bus: mhi: core: Clear configuration from channel context during reset
>   bus: mhi: core: Check channel execution environment before issuing
>     reset
>   bus: mhi: core: Remove __ prefix for MHI channel unprepare function
>   bus: mhi: Improve documentation on channel transfer setup APIs
> 
>  drivers/bus/mhi/core/init.c     |  22 ++++-
>  drivers/bus/mhi/core/internal.h |  12 +++
>  drivers/bus/mhi/core/main.c     | 190 ++++++++++++++++++++++++----------------
>  include/linux/mhi.h             |  18 +++-
>  4 files changed, 162 insertions(+), 80 deletions(-)
> 
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 


* Re: your mail
       [not found] <20210322213644.333112726@goodmis.org>
@ 2021-03-22 21:40 ` Steven Rostedt
  0 siblings, 0 replies; 341+ messages in thread
From: Steven Rostedt @ 2021-03-22 21:40 UTC (permalink / raw)
  To: linux-kernel

On Mon, Mar 22, 2021 at 05:36:44PM -0400, Steven Rostedt wrote:

$@#@#$%%%!!!!

Bah! There was another typo in the email list!

Take 3

-- Steve



* Re: your mail
       [not found] <20210322212156.440428241@goodmis.org>
@ 2021-03-22 21:36 ` Steven Rostedt
  0 siblings, 0 replies; 341+ messages in thread
From: Steven Rostedt @ 2021-03-22 21:36 UTC (permalink / raw)
  To: linux-kernel

On Mon, Mar 22, 2021 at 05:21:56PM -0400, Steven Rostedt wrote:

Bah! John 'Warthog' Hawley's email address still had those single quotes in
it when I cut and pasted it into the Cc list, causing the quilt mail parsing
to fail, but as LKML was in the "To" part, it still sent!

Take 2

-- Steve



* Re: your mail
  2020-12-02 18:51             ` your mail Andy Shevchenko
@ 2020-12-02 18:56               ` Andy Shevchenko
  0 siblings, 0 replies; 341+ messages in thread
From: Andy Shevchenko @ 2020-12-02 18:56 UTC (permalink / raw)
  To: Yun Levi
  Cc: Yury Norov, Rasmus Villemoes, dushistov, Arnd Bergmann,
	Andrew Morton, Gustavo A. R. Silva, William Breathitt Gray,
	richard.weiyang, joseph.qi, skalluru, Josh Poimboeuf,
	Linux Kernel Mailing List, linux-arch

On Wed, Dec 02, 2020 at 08:51:27PM +0200, Andy Shevchenko wrote:
> On Thu, Dec 03, 2020 at 03:27:33AM +0900, Yun Levi wrote:
> > On Thu, Dec 3, 2020 at 2:36 AM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > > On Wed, Dec 02, 2020 at 09:26:05AM -0800, Yury Norov wrote:

...

> > > Side note: speaking of performance, any plans to fix for_each_*_bit*() for
> > > cases when the nbits is known to be <= BITS_PER_LONG?
> > >
> > > Right now it generates awful code (something like a few hundred bytes of
> > > code).
> 
> > Frankly speaking, I don't have an idea right now.
> > Could you share your idea or wisdom?
> 
> Something like the following (I may be mistaken about names, etc.; I'm not a
> compiler expert, and this is pseudo-code, since I don't remember all the API
> names by heart; it is just to express the idea) as a rough first step:
> 
> __builtin_constant(nbits, find_next_set_bit_long, find_next_set_bit)
> 
> find_next_set_bit_long()
> {
> 	unsigned long v = BIT_LAST_WORD(i);
> 	return ffs_long(v);
> }
> 
> Same for find_first_set_bit() -> map it to ffs_long().
> 
> And I believe it can be optimized more.

Btw, it will also require reconsidering the test cases where such constant
small nbits values are passed (forcing the compiler to avoid the optimization
somehow; one way is to try random nbits for some test cases).

-- 
With Best Regards,
Andy Shevchenko




* Re: your mail
  2020-12-02 18:27           ` Yun Levi
@ 2020-12-02 18:51             ` Andy Shevchenko
  2020-12-02 18:56               ` Andy Shevchenko
  0 siblings, 1 reply; 341+ messages in thread
From: Andy Shevchenko @ 2020-12-02 18:51 UTC (permalink / raw)
  To: Yun Levi
  Cc: Yury Norov, Rasmus Villemoes, dushistov, Arnd Bergmann,
	Andrew Morton, Gustavo A. R. Silva, William Breathitt Gray,
	richard.weiyang, joseph.qi, skalluru, Josh Poimboeuf,
	Linux Kernel Mailing List, linux-arch

On Thu, Dec 03, 2020 at 03:27:33AM +0900, Yun Levi wrote:
> On Thu, Dec 3, 2020 at 2:36 AM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Wed, Dec 02, 2020 at 09:26:05AM -0800, Yury Norov wrote:

...

> > Side note: speaking of performance, any plans to fix for_each_*_bit*() for
> > cases when the nbits is known to be <= BITS_PER_LONG?
> >
> > Right now it generates awful code (something like a few hundred bytes of
> > code).

> Frankly speaking, I don't have an idea right now.
> Could you share your idea or wisdom?

Something like the following (I may be mistaken about names, etc.; I'm not a
compiler expert, and this is pseudo-code, since I don't remember all the API
names by heart; it is just to express the idea) as a rough first step:

__builtin_constant(nbits, find_next_set_bit_long, find_next_set_bit)

find_next_set_bit_long()
{
	unsigned long v = BIT_LAST_WORD(i);
	return ffs_long(v);
}

Same for find_first_set_bit() -> map it to ffs_long().

And I believe it can be optimized more.
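
One possible plain-C reading of the pseudo-code above, as a sketch only. The
names generic_find_next_bit(), word_find_next_bit() and sketch_find_next_bit()
are hypothetical and are not kernel API. It assumes GCC or Clang, which
provide __builtin_constant_p() and __builtin_ctzl(). When nbits is a
compile-time constant no larger than one word, the search collapses to two
masks and a count-trailing-zeros; otherwise it falls back to a generic loop.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(CHAR_BIT * sizeof(unsigned long)))

/* Generic fallback: scan bit by bit (stand-in for an out-of-line helper). */
static unsigned long generic_find_next_bit(const unsigned long *addr,
					   unsigned long nbits,
					   unsigned long offset)
{
	for (; offset < nbits; offset++)
		if (addr[offset / BITS_PER_LONG] &
		    (1UL << (offset % BITS_PER_LONG)))
			return offset;
	return nbits; /* nothing found */
}

/* Single-word fast path: only valid for 0 < nbits <= BITS_PER_LONG. */
static inline unsigned long word_find_next_bit(unsigned long word,
					       unsigned long nbits,
					       unsigned long offset)
{
	assert(nbits > 0 && nbits <= BITS_PER_LONG);
	if (offset >= nbits)
		return nbits;
	word &= ~0UL << offset;                  /* drop bits below offset */
	word &= ~0UL >> (BITS_PER_LONG - nbits); /* drop bits at or above nbits */
	return word ? (unsigned long)__builtin_ctzl(word) : nbits;
}

/* Pick the fast path only when nbits is a small compile-time constant. */
#define sketch_find_next_bit(addr, nbits, offset)                       \
	((__builtin_constant_p(nbits) && (nbits) > 0 &&                 \
	  (nbits) <= BITS_PER_LONG)                                     \
		 ? word_find_next_bit((addr)[0], (nbits), (offset))     \
		 : generic_find_next_bit((addr), (nbits), (offset)))
```

Both paths return nbits when no set bit is found, so callers can treat the
two interchangeably; only the generated code differs.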

-- 
With Best Regards,
Andy Shevchenko




* Re: your mail
  2020-08-17 17:12     ` Amit Pundir
@ 2020-08-30 18:58       ` Bjorn Andersson
  0 siblings, 0 replies; 341+ messages in thread
From: Bjorn Andersson @ 2020-08-30 18:58 UTC (permalink / raw)
  To: Amit Pundir
  Cc: Konrad Dybcio, Andy Gross, dt, John Stultz, linux-arm-msm, lkml,
	Rob Herring, Sumit Semwal

On Mon 17 Aug 17:12 UTC 2020, Amit Pundir wrote:

> On Thu, 13 Aug 2020 at 12:38, Bjorn Andersson
> <bjorn.andersson@linaro.org> wrote:
> >
> > On Thu 06 Aug 15:31 PDT 2020, Konrad Dybcio wrote:
> >
> > > Subject: Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
> > >
> > > >// This removed_region is needed to boot the device
> > > >               // TODO: Find out the user of this reserved memory
> > > >               removed_region: memory@88f00000 {
> > >
> > > This region seems to belong to the Trust Zone. When Linux tries to access it, TZ bites and shuts the device down.
> > >
> >
> > This is in line with what the documentation indicates and then it would
> > be better to just bump &tz_mem to a size of 0x4900000.
> 
> Hi, so just to be sure that I got this right, you want me to extend
> &tz_mem to the size of 0x4900000 from the default size of 0x2D00000 by
> including this downstream &removed_region (of size 0x1A00000) +
> previously unreserved downstream memory region (of size 0x200000), to
> align with the starting address of &qseecom_mem?
> 

Yes

Regards,
Bjorn

> I just gave this &tz_mem change a spin and I do not see any obvious
> regression in my limited smoke testing (Boots AOSP to UI with
> v5.9-rc1. Touch/BT/WiFi works) so far, with 20+ out-of-tree patches.
> 
> Regards,
> Amit Pundir
> 
> >
> > Regards,
> > Bjorn


* Re: your mail
  2020-08-13  7:04   ` your mail Bjorn Andersson
@ 2020-08-17 17:12     ` Amit Pundir
  2020-08-30 18:58       ` Bjorn Andersson
  0 siblings, 1 reply; 341+ messages in thread
From: Amit Pundir @ 2020-08-17 17:12 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Konrad Dybcio, Andy Gross, dt, John Stultz, linux-arm-msm, lkml,
	Rob Herring, Sumit Semwal

On Thu, 13 Aug 2020 at 12:38, Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Thu 06 Aug 15:31 PDT 2020, Konrad Dybcio wrote:
>
> > Subject: Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
> >
> > >// This removed_region is needed to boot the device
> > >               // TODO: Find out the user of this reserved memory
> > >               removed_region: memory@88f00000 {
> >
> > This region seems to belong to the Trust Zone. When Linux tries to access it, TZ bites and shuts the device down.
> >
>
> This is in line with what the documentation indicates and then it would
> be better to just bump &tz_mem to a size of 0x4900000.

Hi, so just to be sure that I got this right, you want me to extend
&tz_mem to the size of 0x4900000 from the default size of 0x2D00000 by
including this downstream &removed_region (of size 0x1A00000) +
previously unreserved downstream memory region (of size 0x200000), to
align with the starting address of &qseecom_mem?
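
The size arithmetic in the question above can be checked with a small sketch. The three sizes and the removed_region base address are the ones quoted in this thread. The helper names are hypothetical, and tz_mem_implied_base() additionally assumes that &tz_mem ends exactly where removed_region begins, which the thread does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the thread. */
#define TZ_MEM_DEFAULT_SIZE  UINT64_C(0x2D00000)
#define REMOVED_REGION_BASE  UINT64_C(0x88f00000)
#define REMOVED_REGION_SIZE  UINT64_C(0x1A00000)
#define UNRESERVED_GAP_SIZE  UINT64_C(0x200000)

/* Size of &tz_mem once it absorbs removed_region and the gap behind it. */
static uint64_t tz_mem_extended_size(void)
{
	return TZ_MEM_DEFAULT_SIZE + REMOVED_REGION_SIZE + UNRESERVED_GAP_SIZE;
}

/* First address past the extended region. */
static uint64_t tz_mem_extended_end(void)
{
	return REMOVED_REGION_BASE + REMOVED_REGION_SIZE + UNRESERVED_GAP_SIZE;
}

/*
 * Base of &tz_mem, ASSUMING it is directly followed by removed_region.
 * This adjacency is an assumption made for illustration only.
 */
static uint64_t tz_mem_implied_base(void)
{
	assert(REMOVED_REGION_BASE >= TZ_MEM_DEFAULT_SIZE);
	return REMOVED_REGION_BASE - TZ_MEM_DEFAULT_SIZE;
}
```

tz_mem_extended_size() evaluates to 0x4900000, matching the suggested size, and tz_mem_extended_end() evaluates to 0x8ab00000, which is where &qseecom_mem would have to start for the regions to line up as described.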

I just gave this &tz_mem change a spin and I do not see any obvious
regression in my limited smoke testing (Boots AOSP to UI with
v5.9-rc1. Touch/BT/WiFi works) so far, with 20+ out-of-tree patches.

Regards,
Amit Pundir

>
> Regards,
> Bjorn

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2020-08-06 22:31 ` Konrad Dybcio
@ 2020-08-13  7:04   ` Bjorn Andersson
  2020-08-17 17:12     ` Amit Pundir
  0 siblings, 1 reply; 341+ messages in thread
From: Bjorn Andersson @ 2020-08-13  7:04 UTC (permalink / raw)
  To: Konrad Dybcio
  Cc: amit.pundir, agross, devicetree, john.stultz, linux-arm-msm,
	linux-kernel, robh+dt, sumit.semwal

On Thu 06 Aug 15:31 PDT 2020, Konrad Dybcio wrote:

> Subject: Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
> 
> >// This removed_region is needed to boot the device
> >               // TODO: Find out the user of this reserved memory
> >               removed_region: memory@88f00000 {
> 
> This region seems to belong to the Trust Zone. When Linux tries to access it, TZ bites and shuts the device down.
> 

This is in line with what the documentation indicates and then it would
be better to just bump &tz_mem to a size of 0x4900000.

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2020-06-09 11:38 Gaurav Singh
@ 2020-06-09 11:54 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2020-06-09 11:54 UTC (permalink / raw)
  To: Gaurav Singh
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh,
	David S. Miller, Jakub Kicinski, Jesper Dangaard Brouer,
	open list:BPF (Safe dynamic programs and tools),
	open list:BPF (Safe dynamic programs and tools),
	open list

On Tue, Jun 09, 2020 at 07:38:38AM -0400, Gaurav Singh wrote:
> Please find the patch below.
> 
> Thanks and regards,
> Gaurav.
> 
> >From Gaurav Singh <gaurav1086@gmail.com> # This line is ignored.
> From: Gaurav Singh <gaurav1086@gmail.com>
> Reply-To: 
> Subject: 
> In-Reply-To: 
> 
> 

I think something went wrong in your submission, just use 'git
send-email' to send the patch out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2020-05-06  5:52 Jiaxun Yang
@ 2020-05-07 11:00 ` Thomas Bogendoerfer
  0 siblings, 0 replies; 341+ messages in thread
From: Thomas Bogendoerfer @ 2020-05-07 11:00 UTC (permalink / raw)
  To: Jiaxun Yang
  Cc: linux-mips, clang-built-linux, Maciej W . Rozycki, Fangrui Song,
	Kees Cook, Nathan Chancellor, Paul Burton, Masahiro Yamada,
	Jouni Hogander, Kevin Darbyshire-Bryant, Borislav Petkov,
	Heiko Carstens, linux-kernel

On Wed, May 06, 2020 at 01:52:45PM +0800, Jiaxun Yang wrote:
> Subject: [PATCH v6] MIPS: Truncate link address into 32bit for 32bit kernel
> In-Reply-To: <20200413062651.3992652-1-jiaxun.yang@flygoat.com>
> 
> LLD failed to link vmlinux with 64bit load address for 32bit ELF
> while bfd will strip 64bit address into 32bit silently.
> To fix LLD build, we should truncate load address provided by platform
> into 32bit for 32bit kernel.
> 
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/786
> Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25784
> Reviewed-by: Fangrui Song <maskray@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
> Cc: Maciej W. Rozycki <macro@linux-mips.org>
> ---
> V2: Take MaskRay's shell magic.
> 
> V3: After spending an hour dealing with a special-character issue in the
> Makefile, I gave up on shell hacks and wrote a util in C instead.
> Thanks Maciej for pointing out the Makefile variable problem.
> 
> v4: Finally we managed to find a Makefile method to do it properly
> thanks to Kees. As it's too far from the initial version, I removed
> Review & Test tag from Nick and Fangrui and Cc instead.
> 
> v5: Handle vmlinuz as well.
> 
> v6: Rename to LINKER_LOAD_ADDRESS
> ---
>  arch/mips/Makefile                 | 13 ++++++++++++-
>  arch/mips/boot/compressed/Makefile |  2 +-
>  arch/mips/kernel/vmlinux.lds.S     |  2 +-
>  3 files changed, 14 insertions(+), 3 deletions(-)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20191026192359.27687-1-frank-w@public-files.de>
@ 2019-10-26 19:30 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 341+ messages in thread
From: Greg Kroah-Hartman @ 2019-10-26 19:30 UTC (permalink / raw)
  To: Frank Wunderlich
  Cc: linux-mediatek, Matthias Brugger, linux-serial, linux-arm-kernel,
	linux-kernel

On Sat, Oct 26, 2019 at 09:23:59PM +0200, Frank Wunderlich wrote:
> Date: Sat, 26 Oct 2019 20:53:28 +0200
> Subject: [PATCH] serial: 8250-mtk: Ask for IRQ-count before request one

Odd email with no subject line :(

Please fix up and resend.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190626145238.19708-1-bigeasy@linutronix.de>
@ 2019-06-27 21:13 ` Tejun Heo
  0 siblings, 0 replies; 341+ messages in thread
From: Tejun Heo @ 2019-06-27 21:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, Lai Jiangshan, Peter Zijlstra, Thomas Gleixner

On Wed, Jun 26, 2019 at 04:52:36PM +0200, Sebastian Andrzej Siewior wrote:
> A small series of tiny cleanups.

Applied 1-2 to wq/for-5.3.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-04-11 10:53 ` Peter Zijlstra
@ 2019-04-12  3:23   ` Nicholas Piggin
  0 siblings, 0 replies; 341+ messages in thread
From: Nicholas Piggin @ 2019-04-12  3:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, linux-kernel, Ingo Molnar, Thomas Gleixner

Peter Zijlstra's on April 11, 2019 8:53 pm:
> Was this supposed to be patch 6/5 of your previous series?

Dang, I screwed up the headers? Thanks for the ping, I will resend.

It is standalone. It seems more suited to the scheduler tree than the
timers one, but your call.

It is generally of more use when CPU0 is _not_ a housekeeping one,
and that's where I've done most testing, but I don't see any hard
dependency.

Thanks,
Nick

> 
> On Thu, Apr 11, 2019 at 04:05:36PM +1000, Nicholas Piggin wrote:
>> Date: Tue, 9 Apr 2019 20:23:16 +1000
>> Subject: [PATCH] kernel/sched: run nohz idle load balancer on HK_FLAG_MISC
>>  CPUs
>> 
>> The nohz idle balancer runs on the lowest idle CPU. This can
>> interfere with isolated CPUs, so confine it to HK_FLAG_MISC
>> housekeeping CPUs.
>> 
>> HK_FLAG_SCHED is not used for this because it is not set anywhere
>> at the moment. This could be folded into HK_FLAG_SCHED once that
>> option is fixed.
>> 
>> The problem was observed with increased jitter on an application
>> running on CPU0, caused by nohz idle load balancing being run on
>> CPU1 (an SMT sibling).
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  kernel/sched/fair.c | 16 ++++++++++------
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>> 
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index fdab7eb6f351..d29ca323214d 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9522,22 +9522,26 @@ static inline int on_null_domain(struct rq *rq)
>>   * - When one of the busy CPUs notice that there may be an idle rebalancing
>>   *   needed, they will kick the idle load balancer, which then does idle
>>   *   load balancing for all the idle CPUs.
>> + * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set
>> + *   anywhere yet.
>>   */
>>  
>>  static inline int find_new_ilb(void)
>>  {
>> -	int ilb = cpumask_first(nohz.idle_cpus_mask);
>> +	int ilb;
>>  
>> -	if (ilb < nr_cpu_ids && idle_cpu(ilb))
>> -		return ilb;
>> +	for_each_cpu_and(ilb, nohz.idle_cpus_mask,
>> +			      housekeeping_cpumask(HK_FLAG_MISC)) {
>> +		if (idle_cpu(ilb))
>> +			return ilb;
>> +	}
>>  
>>  	return nr_cpu_ids;
>>  }
>>  
>>  /*
>> - * Kick a CPU to do the nohz balancing, if it is time for it. We pick the
>> - * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle
>> - * CPU (if there is one).
>> + * Kick a CPU to do the nohz balancing, if it is time for it. We pick any
>> + * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one).
>>   */
>>  static void kick_ilb(unsigned int flags)
>>  {
>> -- 
>> 2.20.1
>> 
> 

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190411060536.22409-1-npiggin@gmail.com>
@ 2019-04-11 10:53 ` Peter Zijlstra
  2019-04-12  3:23   ` Nicholas Piggin
  0 siblings, 1 reply; 341+ messages in thread
From: Peter Zijlstra @ 2019-04-11 10:53 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Thomas Gleixner, Frederic Weisbecker, Ingo Molnar, linux-kernel

Was this supposed to be patch 6/5 of your previous series?

On Thu, Apr 11, 2019 at 04:05:36PM +1000, Nicholas Piggin wrote:
> Date: Tue, 9 Apr 2019 20:23:16 +1000
> Subject: [PATCH] kernel/sched: run nohz idle load balancer on HK_FLAG_MISC
>  CPUs
> 
> The nohz idle balancer runs on the lowest idle CPU. This can
> interfere with isolated CPUs, so confine it to HK_FLAG_MISC
> housekeeping CPUs.
> 
> HK_FLAG_SCHED is not used for this because it is not set anywhere
> at the moment. This could be folded into HK_FLAG_SCHED once that
> option is fixed.
> 
> The problem was observed with increased jitter on an application
> running on CPU0, caused by nohz idle load balancing being run on
> CPU1 (an SMT sibling).
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  kernel/sched/fair.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fdab7eb6f351..d29ca323214d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9522,22 +9522,26 @@ static inline int on_null_domain(struct rq *rq)
>   * - When one of the busy CPUs notice that there may be an idle rebalancing
>   *   needed, they will kick the idle load balancer, which then does idle
>   *   load balancing for all the idle CPUs.
> + * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set
> + *   anywhere yet.
>   */
>  
>  static inline int find_new_ilb(void)
>  {
> -	int ilb = cpumask_first(nohz.idle_cpus_mask);
> +	int ilb;
>  
> -	if (ilb < nr_cpu_ids && idle_cpu(ilb))
> -		return ilb;
> +	for_each_cpu_and(ilb, nohz.idle_cpus_mask,
> +			      housekeeping_cpumask(HK_FLAG_MISC)) {
> +		if (idle_cpu(ilb))
> +			return ilb;
> +	}
>  
>  	return nr_cpu_ids;
>  }
>  
>  /*
> - * Kick a CPU to do the nohz balancing, if it is time for it. We pick the
> - * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle
> - * CPU (if there is one).
> + * Kick a CPU to do the nohz balancing, if it is time for it. We pick any
> + * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one).
>   */
>  static void kick_ilb(unsigned int flags)
>  {
> -- 
> 2.20.1
> 
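
The selection logic in the quoted find_new_ilb() change can be illustrated in userspace as follows. This is only a sketch under stated assumptions: plain 64-bit masks stand in for the kernel's cpumasks, SKETCH_NR_CPUS plays the role of nr_cpu_ids, and sketch_idle_cpu() is a hypothetical stand-in for idle_cpu(). None of these names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64	/* hypothetical limit: one bit per CPU */

/* Hypothetical stand-in for idle_cpu(): bit n set means CPU n is idle. */
static bool sketch_idle_cpu(uint64_t idle_now, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return (idle_now >> cpu) & 1;
}

/*
 * Same shape as the patched find_new_ilb(): walk the CPUs present in both
 * masks, lowest first, and return the first one that is still idle.
 * Returns SKETCH_NR_CPUS if no such CPU exists.
 */
static int sketch_find_new_ilb(uint64_t nohz_idle_mask,
			       uint64_t housekeeping_mask,
			       uint64_t idle_now)
{
	uint64_t candidates = nohz_idle_mask & housekeeping_mask;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!((candidates >> cpu) & 1))
			continue;
		if (sketch_idle_cpu(idle_now, cpu))
			return cpu;
	}

	return SKETCH_NR_CPUS;
}
```

With CPUs 1-3 in the nohz idle mask and only CPUs 2-3 in the housekeeping mask, the sketch returns CPU 2 when both are idle, whereas taking the first CPU of the idle mask alone, as the old code did, would have picked CPU 1.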

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-19 15:22 ` Keith Busch
  2019-03-19 23:49   ` Chaitanya Kulkarni
  2019-03-20 16:30   ` Maxim Levitsky
@ 2019-04-08 10:04   ` Maxim Levitsky
  2 siblings, 0 replies; 341+ messages in thread
From: Maxim Levitsky @ 2019-04-08 10:04 UTC (permalink / raw)
  To: Keith Busch
  Cc: Fam Zheng, Keith Busch, Sagi Grimberg, kvm, Wolfram Sang,
	Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre, linux-kernel,
	linux-nvme, David S . Miller, Jens Axboe, Alex Williamson,
	Kirti Wankhede, Mauro Carvalho Chehab, Paolo Bonzini,
	Liu Changpeng, Paul E . McKenney, Amnon Ilan, Christoph Hellwig,
	John Ferlan

On Tue, 2019-03-19 at 09:22 -0600, Keith Busch wrote:
> On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> >   -> Share the NVMe device between host and guest. 
> >      Even in fully virtualized configurations,
> >      some partitions of nvme device could be used by guests as block
> > devices 
> >      while others passed through with nvme-mdev to achieve balance between
> >      all features of full IO stack emulation and performance.
> >   
> >   -> NVME-MDEV is a bit faster due to the fact that in-kernel driver 
> >      can send interrupts to the guest directly without a context 
> >      switch that can be expensive due to meltdown mitigation.
> > 
> >   -> Is able to utilize interrupts to get reasonable performance. 
> >      This is only implemented
> >      as a proof of concept and not included in the patches, 
> >      but interrupt driven mode shows reasonable performance
> >      
> >   -> This is a framework that later can be used to support NVMe devices 
> >      with more of the IO virtualization built-in 
> >      (IOMMU with PASID support coupled with device that supports it)
> 
> Would be very interested to see the PASID support. You wouldn't even
> need to mediate the IO doorbells or translations if assigning entire
> namespaces, and should be much faster than the shadow doorbells.
> 
> I think you should send 6/9 "nvme/pci: init shadow doorbell after each
> reset" separately for immediate inclusion.
> 
> I like the idea in principle, but it will take me a little time to get
> through reviewing your implementation. I would have guessed we could
> have leveraged something from the existing nvme/target for the mediating
> controller register access and admin commands. Maybe even start with
> implementing an nvme passthrough namespace target type (we currently
> have block and file).


Hi!

Sorry to bother you, but any update?

I was somewhat sick for the last week, now finally back in shape to continue
working on this and other tasks I have.

I am now studying the nvme target code and io_uring to evaluate the
difficulty of using something similar to talk to the block device instead of, or
in addition to, the direct connection I implemented.

I would be glad to hear more feedback on this project.

I will also soon post the few fixes separately as you suggested.

Best regards,
    Maxim Levitsky





^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190323171738.GA26736@titus.pi.local>
@ 2019-03-26  8:42 ` Dan Carpenter
  0 siblings, 0 replies; 341+ messages in thread
From: Dan Carpenter @ 2019-03-26  8:42 UTC (permalink / raw)
  To: William J. Cunningham; +Cc: gregkh, devel, linux-kernel

On Sat, Mar 23, 2019 at 01:17:38PM -0400, William J. Cunningham wrote:
> >From bb04b0ca982b7042902fffe1377e0e38e83b402b Mon Sep 17 00:00:00 2001
> From: Will Cunningham <wjcunningham7@gmail.com>
> Date: Sat, 23 Mar 2019 12:54:34 -0400
> Subject: [PATCH] Staging: emxx_udc: emxx_udc: Fixed a coding style error
> 
> Removed unnecessary parentheses.
> 
> Signed-off-by: Will Cunningham <wjcunningham7@gmail.com>

Please fix up the headers and resend.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-21 17:07   ` Maxim Levitsky
@ 2019-03-25 16:46     ` Stefan Hajnoczi
  0 siblings, 0 replies; 341+ messages in thread
From: Stefan Hajnoczi @ 2019-03-25 16:46 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-nvme, linux-kernel, kvm, Jens Axboe, Alex Williamson,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, Kirti Wankhede,
	David S . Miller, Mauro Carvalho Chehab, Greg Kroah-Hartman,
	Wolfram Sang, Nicolas Ferre, Paul E . McKenney, Paolo Bonzini,
	Liang Cunming, Liu Changpeng, Fam Zheng, Amnon Ilan, John Ferlan

[-- Attachment #1: Type: text/plain, Size: 2913 bytes --]

On Thu, Mar 21, 2019 at 07:07:38PM +0200, Maxim Levitsky wrote:
> On Thu, 2019-03-21 at 16:13 +0000, Stefan Hajnoczi wrote:
> > On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> > > Date: Tue, 19 Mar 2019 14:45:45 +0200
> > > Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> > > 
> > > Hi everyone!
> > > 
> > > In this patch series, I would like to introduce my take on the problem of
> > > doing 
> > > as fast as possible virtualization of storage with emphasis on low latency.
> > > 
> > > In this patch series I implemented a kernel vfio based, mediated device
> > > that 
> > > allows the user to pass through a partition and/or whole namespace to a
> > > guest.
> > > 
> > > The idea behind this driver is based on paper you can find at
> > > https://www.usenix.org/conference/atc18/presentation/peng,
> > > 
> > > > Note, though, that I started the development independently, before
> > > > reading this paper.
> > > > 
> > > > In addition, the implementation is not based on the code used in the
> > > > paper, as I was not able to obtain the source at that time.
> > > 
> > > ***Key points about the implementation:***
> > > 
> > > * Polling kernel thread is used. The polling is stopped after a 
> > > predefined timeout (1/2 sec by default).
> > > Support for all interrupt driven mode is planned, and it shows promising
> > > results.
> > > 
> > > * Guest sees a standard NVME device - this allows to run guest with 
> > > unmodified drivers, for example windows guests.
> > > 
> > > * The NVMe device is shared between host and guest.
> > > That means that even a single namespace can be split between host 
> > > and guest based on different partitions.
> > > 
> > > * Simple configuration
> > > 
> > > *** Performance ***
> > > 
> > > Performance was tested on Intel DC P3700, With Xeon E5-2620 v2 
> > > and both latency and throughput is very similar to SPDK.
> > > 
> > > Soon I will test this on a better server and nvme device and provide
> > > more formal performance numbers.
> > > 
> > > Latency numbers:
> > > ~80ms - spdk with fio plugin on the host.
> > > ~84ms - nvme driver on the host
> > > ~87ms - mdev-nvme + nvme driver in the guest
> > 
> > You mentioned the spdk numbers are with vhost-user-nvme.  Have you
> > measured SPDK's vhost-user-blk?
> 
> I did a lot of measurements of vhost-user-blk vs vhost-user-nvme.
> vhost-user-nvme was always faster, but only slightly.
> Thus I don't think it makes sense to benchmark against vhost-user-blk.

It's interesting because mdev-nvme is closest to the hardware while
vhost-user-blk is closest to software.  Doing things at the NVMe level
isn't buying much performance because it's still going through a
software path comparable to vhost-user-blk.

From what you say it sounds like there isn't much to optimize away :(.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-21 16:13 ` Stefan Hajnoczi
@ 2019-03-21 17:07   ` Maxim Levitsky
  2019-03-25 16:46     ` Stefan Hajnoczi
  0 siblings, 1 reply; 341+ messages in thread
From: Maxim Levitsky @ 2019-03-21 17:07 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: linux-nvme, linux-kernel, kvm, Jens Axboe, Alex Williamson,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, Kirti Wankhede,
	David S . Miller, Mauro Carvalho Chehab, Greg Kroah-Hartman,
	Wolfram Sang, Nicolas Ferre, Paul E . McKenney, Paolo Bonzini,
	Liang Cunming, Liu Changpeng, Fam Zheng, Amnon Ilan, John Ferlan

On Thu, 2019-03-21 at 16:13 +0000, Stefan Hajnoczi wrote:
> On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> > Date: Tue, 19 Mar 2019 14:45:45 +0200
> > Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> > 
> > Hi everyone!
> > 
> > In this patch series, I would like to introduce my take on the problem of
> > doing 
> > as fast as possible virtualization of storage with emphasis on low latency.
> > 
> > In this patch series I implemented a kernel vfio based, mediated device
> > that 
> > allows the user to pass through a partition and/or whole namespace to a
> > guest.
> > 
> > The idea behind this driver is based on paper you can find at
> > https://www.usenix.org/conference/atc18/presentation/peng,
> > 
> > Note, though, that I started the development independently, before
> > reading this paper.
> > 
> > In addition, the implementation is not based on the code used in the
> > paper, as I was not able to obtain the source at that time.
> > 
> > ***Key points about the implementation:***
> > 
> > * Polling kernel thread is used. The polling is stopped after a 
> > predefined timeout (1/2 sec by default).
> > Support for all interrupt driven mode is planned, and it shows promising
> > results.
> > 
> > * Guest sees a standard NVME device - this allows to run guest with 
> > unmodified drivers, for example windows guests.
> > 
> > * The NVMe device is shared between host and guest.
> > That means that even a single namespace can be split between host 
> > and guest based on different partitions.
> > 
> > * Simple configuration
> > 
> > *** Performance ***
> > 
> > Performance was tested on Intel DC P3700, With Xeon E5-2620 v2 
> > and both latency and throughput is very similar to SPDK.
> > 
> > Soon I will test this on a better server and nvme device and provide
> > more formal performance numbers.
> > 
> > Latency numbers:
> > ~80ms - spdk with fio plugin on the host.
> > ~84ms - nvme driver on the host
> > ~87ms - mdev-nvme + nvme driver in the guest
> 
> You mentioned the spdk numbers are with vhost-user-nvme.  Have you
> measured SPDK's vhost-user-blk?

I did a lot of measurements of vhost-user-blk vs vhost-user-nvme.
vhost-user-nvme was always faster, but only slightly.
Thus I don't think it makes sense to benchmark against vhost-user-blk.

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190319144116.400-1-mlevitsk@redhat.com>
  2019-03-19 15:22 ` Keith Busch
@ 2019-03-21 16:13 ` Stefan Hajnoczi
  2019-03-21 17:07   ` Maxim Levitsky
  1 sibling, 1 reply; 341+ messages in thread
From: Stefan Hajnoczi @ 2019-03-21 16:13 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-nvme, linux-kernel, kvm, Jens Axboe, Alex Williamson,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, Kirti Wankhede,
	David S . Miller, Mauro Carvalho Chehab, Greg Kroah-Hartman,
	Wolfram Sang, Nicolas Ferre, Paul E . McKenney ,
	Paolo Bonzini, Liang Cunming, Liu Changpeng, Fam Zheng,
	Amnon Ilan, John Ferlan

[-- Attachment #1: Type: text/plain, Size: 2018 bytes --]

On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> Date: Tue, 19 Mar 2019 14:45:45 +0200
> Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> 
> Hi everyone!
> 
> In this patch series, I would like to introduce my take on the problem of doing 
> as fast as possible virtualization of storage with emphasis on low latency.
> 
> In this patch series I implemented a kernel vfio based, mediated device that 
> allows the user to pass through a partition and/or whole namespace to a guest.
> 
> The idea behind this driver is based on paper you can find at
> https://www.usenix.org/conference/atc18/presentation/peng,
> 
> Note, though, that I started the development independently, before
> reading this paper.
> 
> In addition, the implementation is not based on the code used in the
> paper, as I was not able to obtain the source at that time.
> 
> ***Key points about the implementation:***
> 
> * Polling kernel thread is used. The polling is stopped after a 
> predefined timeout (1/2 sec by default).
> Support for all interrupt driven mode is planned, and it shows promising results.
> 
> * Guest sees a standard NVME device - this allows to run guest with 
> unmodified drivers, for example windows guests.
> 
> * The NVMe device is shared between host and guest.
> That means that even a single namespace can be split between host 
> and guest based on different partitions.
> 
> * Simple configuration
> 
> *** Performance ***
> 
> Performance was tested on Intel DC P3700, With Xeon E5-2620 v2 
> and both latency and throughput is very similar to SPDK.
> 
> Soon I will test this on a better server and nvme device and provide
> more formal performance numbers.
> 
> Latency numbers:
> ~80ms - spdk with fio plugin on the host.
> ~84ms - nvme driver on the host
> ~87ms - mdev-nvme + nvme driver in the guest

You mentioned the spdk numbers are with vhost-user-nvme.  Have you
measured SPDK's vhost-user-blk?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-20 17:03     ` Keith Busch
@ 2019-03-20 17:33       ` Maxim Levitsky
  0 siblings, 0 replies; 341+ messages in thread
From: Maxim Levitsky @ 2019-03-20 17:33 UTC (permalink / raw)
  To: Keith Busch
  Cc: Fam Zheng, Keith Busch, Sagi Grimberg, kvm, Wolfram Sang,
	Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre, linux-kernel,
	linux-nvme, David S . Miller, Jens Axboe, Alex Williamson,
	Kirti Wankhede, Mauro Carvalho Chehab, Paolo Bonzini,
	Liu Changpeng, Paul E . McKenney, Amnon Ilan, Christoph Hellwig,
	John Ferlan

On Wed, 2019-03-20 at 11:03 -0600, Keith Busch wrote:
> On Wed, Mar 20, 2019 at 06:30:29PM +0200, Maxim Levitsky wrote:
> > Or instead I can use the block backend, 
> > (but note that currently the block back-end doesn't support polling which is
> > critical for the performance).
> 
> Oh, I think you can do polling through there. For reference, fs/io_uring.c
> has a pretty good implementation that aligns with how you could use it.


That is exactly my thought. Polling recently got a lot of improvements in the
block layer, which might make this feasible.

I will give it a try.

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-20 16:30   ` Maxim Levitsky
@ 2019-03-20 17:03     ` Keith Busch
  2019-03-20 17:33       ` Maxim Levitsky
  0 siblings, 1 reply; 341+ messages in thread
From: Keith Busch @ 2019-03-20 17:03 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Fam Zheng, Keith Busch, Sagi Grimberg, kvm, Wolfram Sang,
	Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre, linux-kernel,
	linux-nvme, David S . Miller, Jens Axboe, Alex Williamson,
	Kirti Wankhede, Mauro Carvalho Chehab, Paolo Bonzini,
	Liu Changpeng, Paul E . McKenney, Amnon Ilan, Christoph Hellwig,
	John Ferlan

On Wed, Mar 20, 2019 at 06:30:29PM +0200, Maxim Levitsky wrote:
> Or instead I can use the block backend, 
> (but note that currently the block back-end doesn't support polling which is
> critical for the performance).

Oh, I think you can do polling through there. For reference, fs/io_uring.c
has a pretty good implementation that aligns with how you could use it.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-19 23:49   ` Chaitanya Kulkarni
@ 2019-03-20 16:44     ` Maxim Levitsky
  0 siblings, 0 replies; 341+ messages in thread
From: Maxim Levitsky @ 2019-03-20 16:44 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Keith Busch
  Cc: Fam Zheng, Jens Axboe, Sagi Grimberg, kvm, Wolfram Sang,
	Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre, linux-nvme,
	linux-kernel, Keith Busch, Alex Williamson, Christoph Hellwig,
	Kirti Wankhede, Mauro Carvalho Chehab, Paolo Bonzini,
	Liu Changpeng, Paul E . McKenney, Amnon Ilan, David S . Miller,
	John Ferlan

On Tue, 2019-03-19 at 23:49 +0000, Chaitanya Kulkarni wrote:
> Hi Keith,
> On 03/19/2019 08:21 AM, Keith Busch wrote:
> > On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> > >    -> Share the NVMe device between host and guest.
> > >       Even in fully virtualized configurations,
> > >       some partitions of nvme device could be used by guests as block
> > > devices
> > >       while others passed through with nvme-mdev to achieve balance
> > > between
> > >       all features of full IO stack emulation and performance.
> > > 
> > >    -> NVME-MDEV is a bit faster due to the fact that in-kernel driver
> > >       can send interrupts to the guest directly without a context
> > >       switch that can be expensive due to meltdown mitigation.
> > > 
> > >    -> Is able to utilize interrupts to get reasonable performance.
> > >       This is only implemented
> > >       as a proof of concept and not included in the patches,
> > >       but interrupt driven mode shows reasonable performance
> > > 
> > >    -> This is a framework that later can be used to support NVMe devices
> > >       with more of the IO virtualization built-in
> > >       (IOMMU with PASID support coupled with device that supports it)
> > 
> > Would be very interested to see the PASID support. You wouldn't even
> > need to mediate the IO doorbells or translations if assigning entire
> > namespaces, and should be much faster than the shadow doorbells.
> > 
> > I think you should send 6/9 "nvme/pci: init shadow doorbell after each
> > reset" separately for immediate inclusion.
> > 
> > I like the idea in principle, but it will take me a little time to get
> > through reviewing your implementation. I would have guessed we could
> > have leveraged something from the existing nvme/target for the mediating
> > controller register access and admin commands. Maybe even start with
> > implementing an nvme passthrough namespace target type (we currently
> > have block and file).
> 
> I have the code for the NVMeOf target passthru-ctrl, I think we can use 
> that as it is if you are looking for the passthru for NVMeOF.
> 
> I'll post patch-series based on the latest code base soon.

I am very interested in this code.
Could you explain how your NVMeOF target passthrough works? 
Which components of the NVME stack does it involve?

Best regards,
	Maxim Levitsky




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-19 15:22 ` Keith Busch
  2019-03-19 23:49   ` Chaitanya Kulkarni
@ 2019-03-20 16:30   ` Maxim Levitsky
  2019-03-20 17:03     ` Keith Busch
  2019-04-08 10:04   ` Maxim Levitsky
  2 siblings, 1 reply; 341+ messages in thread
From: Maxim Levitsky @ 2019-03-20 16:30 UTC (permalink / raw)
  To: Keith Busch
  Cc: Fam Zheng, Keith Busch, Sagi Grimberg, kvm, Wolfram Sang,
	Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre, linux-kernel,
	linux-nvme, David S . Miller, Jens Axboe, Alex Williamson,
	Kirti Wankhede, Mauro Carvalho Chehab, Paolo Bonzini,
	Liu Changpeng, Paul E . McKenney, Amnon Ilan, Christoph Hellwig,
	John Ferlan

On Tue, 2019-03-19 at 09:22 -0600, Keith Busch wrote:
> On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> >   -> Share the NVMe device between host and guest. 
> >      Even in fully virtualized configurations,
> >      some partitions of nvme device could be used by guests as block
> > devices 
> >      while others passed through with nvme-mdev to achieve balance between
> >      all features of full IO stack emulation and performance.
> >   
> >   -> NVME-MDEV is a bit faster due to the fact that in-kernel driver 
> >      can send interrupts to the guest directly without a context 
> >      switch that can be expensive due to meltdown mitigation.
> > 
> >   -> Is able to utilize interrupts to get reasonable performance. 
> >      This is only implemented
> >      as a proof of concept and not included in the patches, 
> >      but interrupt driven mode shows reasonable performance
> >      
> >   -> This is a framework that later can be used to support NVMe devices 
> >      with more of the IO virtualization built-in 
> >      (IOMMU with PASID support coupled with device that supports it)
> 

> Would be very interested to see the PASID support. You wouldn't even
> need to mediate the IO doorbells or translations if assigning entire
> namespaces, and should be much faster than the shadow doorbells.

I fully agree with that.
Note that two things have to happen on the vendor side to enable PASID support.

1. Mature support for IOMMUs with PASID. On the Intel side, I know that they
have only released a spec, and the kernel bits to support it are currently
being put in place.
I still don't know when a product that actually supports this spec is going to
be released. For other vendors (ARM/AMD) I haven't yet researched the state of
PASID-based IOMMU support on their platforms.

2. The NVMe spec has to be extended to support PASID. At minimum, we need the
ability to assign a PASID to an sq/cq queue pair and the ability to relocate the
doorbells, such that each guest would get its own (hardware-backed) MMIO page
with its own doorbells. Plus, of course, the hardware vendors have to embrace
the spec. I guess these two things will happen in a collaborative manner.
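
To illustrate why the doorbells would need to be relocatable: in the current
NVMe register layout the doorbells of all queues are packed back to back
starting at offset 0x1000, with a stride of 4 << CAP.DSTRD bytes, so with the
minimal stride many queue pairs share one 4 KiB MMIO page. A minimal sketch of
that arithmetic (the helper names are made up for illustration and are not
taken from the driver):

```c
#include <assert.h>
#include <stdint.h>

/* Doorbell registers start at offset 0x1000 of the controller registers. */
#define NVME_DB_BASE 0x1000u

/* Byte stride between consecutive doorbells: 4 << CAP.DSTRD. */
static uint32_t db_stride(uint32_t dstrd)
{
	return 4u << dstrd;
}

/* Submission queue 'qid' tail doorbell offset. */
static uint32_t sq_tail_db(uint32_t qid, uint32_t dstrd)
{
	return NVME_DB_BASE + (2 * qid) * db_stride(dstrd);
}

/* Completion queue 'qid' head doorbell offset. */
static uint32_t cq_head_db(uint32_t qid, uint32_t dstrd)
{
	return NVME_DB_BASE + (2 * qid + 1) * db_stride(dstrd);
}

/* Index of the 4 KiB page that contains a register offset. */
static uint32_t page_of(uint32_t off)
{
	return off >> 12;
}
```

With DSTRD = 0 each queue pair takes 8 bytes, so a single 4 KiB page holds the
doorbells of 512 queue pairs, and a page cannot simply be handed to one guest.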


> 
> I think you should send 6/9 "nvme/pci: init shadow doorbell after each
> reset" separately for immediate inclusion.
I'll do this soon. 

Also '5/9 nvme/pci: add known admin effects to augment admin effects log page'
can be considered for immediate inclusion as well, as it works around a badly
implemented admin command effects log page on the NVMe controller, with no side
effects (pun intended) for spec-compliant controllers (I think). 

This can be fixed with a quirk if you prefer though.

> 
> I like the idea in principle, but it will take me a little time to get
> through reviewing your implementation. I would have guessed we could
> have leveraged something from the existing nvme/target for the mediating
> controller register access and admin commands. Maybe even start with
> implementing an nvme passthrough namespace target type (we currently
> have block and file).

I fully agree with you that I could have used some of the nvme/target code,
and I am planning to do so eventually.

For that I would need to make my driver one of the target drivers, and I
would need to add another target back end, as you said, to allow my target
driver to talk directly to the nvme hardware, bypassing the block layer.

Or I can use the block back end instead
(but note that the block back end currently doesn't support polling, which is
critical for performance).

Switching to the target code might, though, have some (probably minor)
performance impact, as it would probably lengthen the critical code path a bit
(I might need, for instance, to translate the PRP lists I am getting from the
virtual controller to a scatter-gather list and back).

This is why I did this the way I did, but now, knowing that I can probably
afford to lose a bit of performance, I can look at doing that.

Best regards,
Thanks in advance for the review,
	Maxim Levitsky

PS:

For reference, the IO path currently looks more or less like this:

My IO thread notices a doorbell write, reads a command from a submission queue,
translates it (without even looking at the data pointer) and sends it to the
nvme pci driver together with a pointer to a 'data iterator'.

The nvme pci driver calls the data iterator N times, which makes the iterator
translate and fetch the DMA addresses where the data is already mapped on
its pci nvme device (the mdev driver maps all the guest memory to the nvme pci
device).
The nvme pci driver uses the addresses it receives to create a PRP list,
which it puts into the data pointer.

The nvme pci driver also allocates a free command ID from a list, puts it into
the command ID field and sends the command to the real hardware.

Later the IO thread calls into the nvme pci driver to poll the queue. When
completions arrive, the nvme pci driver returns them to the IO thread.
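
The flow above can be modelled in a few lines of plain C. This is only a
user-space sketch under simplifying assumptions, not the code from the patches:
all names (host_submit, host_poll, guest_iter, struct data_ctx) are
hypothetical, guest memory is assumed to be mapped as one contiguous DMA range,
and every in-flight command is treated as completed by the time of the poll.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CMDS  8	/* size of the host command ID pool (hypothetical) */
#define MAX_PAGES 4	/* max data pages per command in this sketch */

/* Data iterator: returns the DMA address of the idx-th data page. */
typedef uint64_t (*data_iter_fn)(void *ctx, unsigned int idx);

struct host_cmd {
	int in_use;
	uint16_t guest_cid;		/* command ID chosen by the guest */
	uint64_t prp[MAX_PAGES];	/* PRP entries built by the host driver */
};

static struct host_cmd pool[MAX_CMDS];

/* Host driver: allocate a free command ID and build the PRP list. */
static int host_submit(uint16_t guest_cid, unsigned int npages,
		       data_iter_fn iter, void *ctx)
{
	if (npages > MAX_PAGES)
		return -1;
	for (int cid = 0; cid < MAX_CMDS; cid++) {
		if (pool[cid].in_use)
			continue;
		pool[cid].in_use = 1;
		pool[cid].guest_cid = guest_cid;
		for (unsigned int i = 0; i < npages; i++)
			pool[cid].prp[i] = iter(ctx, i);
		return cid;
	}
	return -1;	/* no free command ID */
}

/* Host driver: report completions, translated back to guest command IDs. */
static int host_poll(uint16_t *guest_cids, int max)
{
	int n = 0;

	for (int cid = 0; cid < MAX_CMDS && n < max; cid++) {
		if (!pool[cid].in_use)
			continue;
		guest_cids[n++] = pool[cid].guest_cid;
		pool[cid].in_use = 0;	/* command ID returns to the pool */
	}
	return n;
}

/* mdev side: guest memory is pre-mapped, so translation is an offset. */
struct data_ctx {
	uint64_t guest_base;	/* start of guest memory */
	uint64_t dma_base;	/* where it is mapped for the device */
	uint64_t guest_addr;	/* first data page of this command */
};

static uint64_t guest_iter(void *ctx, unsigned int idx)
{
	struct data_ctx *c = ctx;

	return c->dma_base + (c->guest_addr - c->guest_base) +
	       (uint64_t)idx * 4096;
}
```

The point of the iterator is that the host driver never sees guest addresses;
it only asks for the next DMA address and stores it in the PRP list.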




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190319022012.11051-1-thirtythreeforty@gmail.com>
@ 2019-03-20  7:26 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 341+ messages in thread
From: Greg Kroah-Hartman @ 2019-03-20  7:26 UTC (permalink / raw)
  To: George Hilliard; +Cc: linux-mips, linux-kernel

On Mon, Mar 18, 2019 at 08:20:01PM -0600, George Hilliard wrote:
> Because of this change, the driver now expects a pinctrl device
> reference in the mmc controller's device tree node; without it, it will
> bail out.  This could break existing setups that don't specify it
> because it "just worked" up until now.  So currently I just let the old
> behavior fall away because this is a staging driver.  But if this is a
> problem, the old behavior could be added back as a fallback.
> 
> This is version 2 of a patchset that I requested feedback for about a
> month ago.  Please review as if they are a new patchset; all the patches
> were rebased several times and a couple new correctness fixes added.
> 
> The TODO list is largely unchanged, aside from the couple of TODO
> comments in the code that I have addressed.  Ultimately, I think this
> driver could potentially be merged with the "real" mtk-mmc driver as the
> TODO suggests, but someone who is more familiar with the IP core will
> have to do that.  Mediatek documentation (that I can find) is very
> sparse.
> 
> This is ready to merge if there is no other feedback!
> 
> >From George Hilliard <thirtythreeforty@gmail.com> # This line is ignored.
> From: George Hilliard <thirtythreeforty@gmail.com>
> Reply-To: 
> Subject: [PATCH v2 00/11] mt7621-mmc: Various correctness fixes
> In-Reply-To: 
> 
> 

No subject for this email?


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2019-03-19 15:22 ` Keith Busch
@ 2019-03-19 23:49   ` Chaitanya Kulkarni
  2019-03-20 16:44     ` Maxim Levitsky
  2019-03-20 16:30   ` Maxim Levitsky
  2019-04-08 10:04   ` Maxim Levitsky
  2 siblings, 1 reply; 341+ messages in thread
From: Chaitanya Kulkarni @ 2019-03-19 23:49 UTC (permalink / raw)
  To: Keith Busch, Maxim Levitsky
  Cc: Fam Zheng, Keith Busch, Sagi Grimberg, kvm, Wolfram Sang,
	Greg Kroah-Hartman, Liang Cunming, Nicolas Ferre, linux-kernel,
	linux-nvme, David S . Miller, Jens Axboe, Alex Williamson,
	Kirti Wankhede, Mauro Carvalho Chehab, Paolo Bonzini,
	Liu Changpeng, Paul E . McKenney ,
	Amnon Ilan, Christoph Hellwig, John Ferlan

Hi Keith,
On 03/19/2019 08:21 AM, Keith Busch wrote:
> On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
>>    -> Share the NVMe device between host and guest.
>>       Even in fully virtualized configurations,
>>       some partitions of nvme device could be used by guests as block devices
>>       while others passed through with nvme-mdev to achieve balance between
>>       all features of full IO stack emulation and performance.
>>
>>    -> NVME-MDEV is a bit faster due to the fact that in-kernel driver
>>       can send interrupts to the guest directly without a context
>>       switch that can be expensive due to meltdown mitigation.
>>
>>    -> Is able to utilize interrupts to get reasonable performance.
>>       This is only implemented
>>       as a proof of concept and not included in the patches,
>>       but interrupt driven mode shows reasonable performance
>>
>>    -> This is a framework that later can be used to support NVMe devices
>>       with more of the IO virtualization built-in
>>       (IOMMU with PASID support coupled with device that supports it)
>
> Would be very interested to see the PASID support. You wouldn't even
> need to mediate the IO doorbells or translations if assigning entire
> namespaces, and should be much faster than the shadow doorbells.
>
> I think you should send 6/9 "nvme/pci: init shadow doorbell after each
> reset" separately for immediate inclusion.
>
> I like the idea in principle, but it will take me a little time to get
> through reviewing your implementation. I would have guessed we could
> have leveraged something from the existing nvme/target for the mediating
> controller register access and admin commands. Maybe even start with
> implementing an nvme passthrough namespace target type (we currently
> have block and file).

I have the code for the NVMeOf target passthru-ctrl, I think we can use 
that as it is if you are looking for the passthru for NVMeOF.

I'll post patch-series based on the latest code base soon.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190319144116.400-1-mlevitsk@redhat.com>
@ 2019-03-19 15:22 ` Keith Busch
  2019-03-19 23:49   ` Chaitanya Kulkarni
                     ` (2 more replies)
  2019-03-21 16:13 ` Stefan Hajnoczi
  1 sibling, 3 replies; 341+ messages in thread
From: Keith Busch @ 2019-03-19 15:22 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: linux-nvme, linux-kernel, kvm, Jens Axboe, Alex Williamson,
	Keith Busch, Christoph Hellwig, Sagi Grimberg, Kirti Wankhede,
	David S . Miller, Mauro Carvalho Chehab, Greg Kroah-Hartman,
	Wolfram Sang, Nicolas Ferre, Paul E . McKenney ,
	Paolo Bonzini, Liang Cunming, Liu Changpeng, Fam Zheng,
	Amnon Ilan, John Ferlan

On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
>   -> Share the NVMe device between host and guest. 
>      Even in fully virtualized configurations,
>      some partitions of nvme device could be used by guests as block devices 
>      while others passed through with nvme-mdev to achieve balance between
>      all features of full IO stack emulation and performance.
>   
>   -> NVME-MDEV is a bit faster due to the fact that in-kernel driver 
>      can send interrupts to the guest directly without a context 
>      switch that can be expensive due to meltdown mitigation.
> 
>   -> Is able to utilize interrupts to get reasonable performance. 
>      This is only implemented
>      as a proof of concept and not included in the patches, 
>      but interrupt driven mode shows reasonable performance
>      
>   -> This is a framework that later can be used to support NVMe devices 
>      with more of the IO virtualization built-in 
>      (IOMMU with PASID support coupled with device that supports it)

Would be very interested to see the PASID support. You wouldn't even
need to mediate the IO doorbells or translations if assigning entire
namespaces, and should be much faster than the shadow doorbells.

I think you should send 6/9 "nvme/pci: init shadow doorbell after each
reset" separately for immediate inclusion.

I like the idea in principle, but it will take me a little time to get
through reviewing your implementation. I would have guessed we could
have leveraged something from the existing nvme/target for the mediating
controller register access and admin commands. Maybe even start with
implementing an nvme passthrough namespace target type (we currently
have block and file).

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20190225201635.4648-1-hannes@cmpxchg.org>
@ 2019-02-26 23:49 ` Roman Gushchin
  0 siblings, 0 replies; 341+ messages in thread
From: Roman Gushchin @ 2019-02-26 23:49 UTC (permalink / raw)
  To: up, the, LRU, counts, tracking
  Cc: Andrew Morton, Tejun Heo, linux-mm, cgroups, linux-kernel, Kernel Team

On Mon, Feb 25, 2019 at 03:16:29PM -0500, Johannes Weiner wrote:
> [resending, rebased on top of latest mmots]
> 
> The memcg LRU stats usage is currently a bit messy. Memcg has private
> per-zone counters because reclaim needs zone granularity sometimes,
> but we also have plenty of users that need to awkwardly sum them up to
> node or memcg granularity. Meanwhile the canonical per-memcg vmstats
> do not track the LRU counts (NR_INACTIVE_ANON etc.) as you'd expect.
> 
> This series enables LRU count tracking in the per-memcg vmstats array
> such that lruvec_page_state() and memcg_page_state() work on the enum
> node_stat_item items for the LRU counters. Then it converts all the
> callers that don't specifically need per-zone numbers over to that.

The updated version looks very good* to me!
Please, feel free to use:
Reviewed-by: Roman Gushchin <guro@fb.com>

Looking through the patchset, I have a feeling that we're sometimes
gathering too much data. Perhaps we don't need the whole set
of counters to be per-cpu on both memcg- and memcg-per-node levels.
Merging them can save quite a lot of space. Anyway, it's a separate
topic.

* except "to" and "subject" of the cover letter

Thanks!

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20180827145032.9522-1-hch@lst.de>
@ 2018-08-31 20:23 ` Paul Burton
  0 siblings, 0 replies; 341+ messages in thread
From: Paul Burton @ 2018-08-31 20:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: iommu, Marek Szyprowski, Robin Murphy, Greg Kroah-Hartman,
	linux-mips, linux-kernel

Hi Christoph,

On Mon, Aug 27, 2018 at 04:50:27PM +0200, Christoph Hellwig wrote:
> Subject: [RFC] merge dma_direct_ops and dma_noncoherent_ops
> 
> While most architectures are either always or never dma coherent for a
> given build, the arm, arm64, mips and soon arc architectures can have
> different dma coherent settings on a per-device basis.  Additionally
> some mips builds can decide at boot time if dma is coherent or not.
> 
> I've started to look into handling noncoherent dma in swiotlb, and
> moving the dma-iommu ops into common code [1], and for that we need a
> generic way to check if a given device is coherent or not.  Moving
> this flag into struct device also simplifies the conditionally coherent
> architecture implementations.
> 
> These patches are also available in a git tree given that they have
> a few previous posted dependencies:
> 
>     git://git.infradead.org/users/hch/misc.git dma-direct-noncoherent-merge
> 
> Gitweb:
> 
>     http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-direct-noncoherent-merge

Apart from the nits in patch 2, these look sane to me from a MIPS
perspective, so for patches 1-4:

    Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20180724222212.8742-1-tsotsos@gmail.com>
@ 2018-07-25  7:39 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 341+ messages in thread
From: Greg Kroah-Hartman @ 2018-07-25  7:39 UTC (permalink / raw)
  To: Georgios Tsotsos; +Cc: devel, James Hogan, linux-kernel, Aaro Koskinen

On Wed, Jul 25, 2018 at 01:22:07AM +0300, Georgios Tsotsos wrote:
> Date: Wed, 25 Jul 2018 01:18:58 +0300
> Subject: [PATCH 0/4] Staging: octeon-usb: Fixes and Coding style applied. 
> 
> Hello, 

Somehow your subject here got messed up and put in the body of the email.
Not a big deal this time, but be more careful next time please.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <2018071901551081442221@163.com>
@ 2018-07-18 20:04 ` Johan Hovold
  0 siblings, 0 replies; 341+ messages in thread
From: Johan Hovold @ 2018-07-18 20:04 UTC (permalink / raw)
  To: m13297920107; +Cc: johan, gregkh, linux-usb, linux-kernel, moviesong, billli

On Thu, Jul 19, 2018 at 01:55:12AM +0800, m13297920107@163.com wrote:
> From 14bd57ea5c5fc385bd36b5a3ea5c805337bbc8db Mon Sep 17 00:00:00 2001
> From: Movie Song <MovieSong@aten-itlab.cn>
> Date: Thu, 19 Jul 2018 02:20:48 +0800
> Subject: [PATCH] USB:serial:pl2303:add a new device id for ATEN

Add spaces after the colons (':') in the Subject above, and place a
short commit message here before your SoB.

> Signed-off-by:MovieSong<MovieSong@aten-itlab.cn>

Missing spaces in your SoB as well.

> ---
>  drivers/usb/serial/pl2303.c | 2 ++
>  drivers/usb/serial/pl2303.h | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
> index 5d1a193..e41f725 100644
> --- a/drivers/usb/serial/pl2303.c
> +++ b/drivers/usb/serial/pl2303.c
> @@ -52,6 +52,8 @@
>   .driver_info = PL2303_QUIRK_ENDPOINT_HACK },
>   { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_UC485),
>   .driver_info = PL2303_QUIRK_ENDPOINT_HACK },
> + { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_UC232B),
> + .driver_info = PL2303_QUIRK_ENDPOINT_HACK },
>   { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_ID2) },
>   { USB_DEVICE(ATEN_VENDOR_ID2, ATEN_PRODUCT_ID) },
>   { USB_DEVICE(ELCOM_VENDOR_ID, ELCOM_PRODUCT_ID) },
> diff --git a/drivers/usb/serial/pl2303.h b/drivers/usb/serial/pl2303.h
> index fcd7239..26965cc 100644
> --- a/drivers/usb/serial/pl2303.h
> +++ b/drivers/usb/serial/pl2303.h
> @@ -24,6 +24,7 @@
>  #define ATEN_VENDOR_ID2 0x0547
>  #define ATEN_PRODUCT_ID 0x2008
>  #define ATEN_PRODUCT_UC485 0x2021
> +#define ATEN_PRODUCT_UC232B 0x2022
>  #define ATEN_PRODUCT_ID2 0x2118
>  
>  #define IODATA_VENDOR_ID 0x04bb

As I suggested earlier, try sending the patch to yourself first and run
scripts/checkpatch.pl on it. The patch is still whitespace corrupted
(probably by your mail client) as checkpatch would have let you know:

WARNING: Use a single space after Signed-off-by:
#13: 
Signed-off-by:MovieSong<MovieSong@aten-itlab.cn>

WARNING: email address 'MovieSong<MovieSong@aten-itlab.cn>' might be better as 'MovieSong <MovieSong@aten-itlab.cn>'
#13: 
Signed-off-by:MovieSong<MovieSong@aten-itlab.cn>

WARNING: please, no spaces at the start of a line
#27: FILE: drivers/usb/serial/pl2303.c:55:
+ { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_UC232B),$

WARNING: please, no spaces at the start of a line
#28: FILE: drivers/usb/serial/pl2303.c:56:
+ .driver_info = PL2303_QUIRK_ENDPOINT_HACK },$

total: 1 errors, 4 warnings, 15 lines checked


git-send-email is convenient for sending patches (e.g. generated with
git-format-patch). Perhaps you can set that up.

One more try?

Thanks,
Johan

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <201807160555.w6G5t9Dc075492@mse.aten.com.tw>
@ 2018-07-16 10:03 ` Johan Hovold
  0 siblings, 0 replies; 341+ messages in thread
From: Johan Hovold @ 2018-07-16 10:03 UTC (permalink / raw)
  To: moviesong; +Cc: johan, gregkh, linux-usb, linux-kernel, YorkDai, BillLi

On Mon, Jul 16, 2018 at 09:46:05AM +0800, MovieSong wrote:
> From cff42ec450bdd1fb44dd80564cb622660a9a8071 Mon Sep 17 00:00:00 2001
> From: Movie Song <MovieSong@aten-itlab.cn>
> Date: Fri, 13 Jul 2018 17:46:19 +0800
> Subject: [PATCH] This add a new device for ATEN
> 
> Signed-off-by: Movie Song <MovieSong@aten-itlab.cn>

First, your mail still has the legal disclaimer footer which prevents us
from using this patch.

Second, the patch is now inline, but it's unfortunately white-space
damaged (tabs replaced with spaces).

Take a look at

	https://marc.info/?l=linux-usb&m=150576193231309

for an example of what the subject and commit message should look like.

Send it to yourself first and make sure it has no legal disclaimer
footers, and that you can apply it using git-am.

> ---
>  drivers/usb/serial/pl2303.c | 2 ++
>  drivers/usb/serial/pl2303.h | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
> index 5d1a193..99f7e1f 100644
> --- a/drivers/usb/serial/pl2303.c
> +++ b/drivers/usb/serial/pl2303.c
> @@ -52,6 +52,8 @@
>   .driver_info = PL2303_QUIRK_ENDPOINT_HACK },
>   { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_UC485),
>   .driver_info = PL2303_QUIRK_ENDPOINT_HACK },
> + { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_UC485),
> + .driver_info = PL2303_QUIRK_ENDPOINT_HACK },

And here you add a duplicate entry instead of the one based on the new
id you add.
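
As a stand-alone illustration of the intended result, the table should end up
with one entry per product ID. The sketch below is not the driver code: struct
dev_id and count_matches() are simplified stand-ins for the USB ID table, and
the ATEN_VENDOR_ID value is an assumption because it is not shown in the quoted
diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ATEN_VENDOR_ID		0x0557	/* assumption: not in the quoted diff */
#define ATEN_PRODUCT_UC485	0x2021
#define ATEN_PRODUCT_UC232B	0x2022

/* Simplified stand-in for a USB device ID table entry. */
struct dev_id {
	uint16_t vendor;
	uint16_t product;
};

static const struct dev_id id_table[] = {
	{ ATEN_VENDOR_ID, ATEN_PRODUCT_UC485 },
	{ ATEN_VENDOR_ID, ATEN_PRODUCT_UC232B },	/* the new device */
};

#define ID_TABLE_LEN (sizeof(id_table) / sizeof(id_table[0]))

/* Count how many table entries match a vendor/product pair. */
static size_t count_matches(const struct dev_id *t, size_t n,
			    uint16_t vendor, uint16_t product)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].vendor == vendor && t[i].product == product)
			hits++;
	return hits;
}
```

With the duplicate from the quoted hunk, UC485 would match twice and UC232B not
at all, so the new device would still not bind.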

>   { USB_DEVICE(ATEN_VENDOR_ID, ATEN_PRODUCT_ID2) },
>   { USB_DEVICE(ATEN_VENDOR_ID2, ATEN_PRODUCT_ID) },
>   { USB_DEVICE(ELCOM_VENDOR_ID, ELCOM_PRODUCT_ID) },
> diff --git a/drivers/usb/serial/pl2303.h b/drivers/usb/serial/pl2303.h
> index fcd7239..26965cc 100644
> --- a/drivers/usb/serial/pl2303.h
> +++ b/drivers/usb/serial/pl2303.h
> @@ -24,6 +24,7 @@
>  #define ATEN_VENDOR_ID2 0x0547
>  #define ATEN_PRODUCT_ID 0x2008
>  #define ATEN_PRODUCT_UC485 0x2021
> +#define ATEN_PRODUCT_UC232B 0x2022
>  #define ATEN_PRODUCT_ID2 0x2118
> 
>  #define IODATA_VENDOR_ID 0x04bb

Thanks,
Johan

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] ` <20180613173128.32384-1-vasilyev@ispras.ru>
@ 2018-06-19  7:42   ` Dan Carpenter
  0 siblings, 0 replies; 341+ messages in thread
From: Dan Carpenter @ 2018-06-19  7:42 UTC (permalink / raw)
  To: Anton Vasilyev
  Cc: Andy Shevchenko, devel, ldv-project, Johannes Thumshirn,
	linux-kernel, Sinan Kaya, Hannes Reinecke, Gaurav Pathak

Thanks for this.  This is a lot of work.

On Wed, Jun 13, 2018 at 08:31:28PM +0300, Anton Vasilyev wrote:
> diff --git a/drivers/staging/rts5208/rtsx.c b/drivers/staging/rts5208/rtsx.c
> index 70e0b8623110..69e6abe14abf 100644
> --- a/drivers/staging/rts5208/rtsx.c
> +++ b/drivers/staging/rts5208/rtsx.c
> @@ -857,7 +857,7 @@ static int rtsx_probe(struct pci_dev *pci,
>  	dev->chip = kzalloc(sizeof(*dev->chip), GFP_KERNEL);
>  	if (!dev->chip) {
>  		err = -ENOMEM;
> -		goto errout;
> +		goto chip_alloc_fail;

The most recent successful allocation is scsi_host_alloc().  I was
really hoping this would say something like "goto err_free_host;" or
something.  The naming style here is a "come from" label which doesn't
say if it's going to free the scsi host or not...  It turns out we don't
free the host, but we should:

err_put_host:
	scsi_host_put(host);

The kzalloc() has its own error message built in, as do all the other
error paths, so the dev_err() is not super important to me...

Killing the threads actually seems really complicated, so maybe we should
just have a separate error path for that.  I'm not sure...
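
The label naming described above, where a label says what it releases rather
than where the jump came from, can be shown with a small user-space sketch. It
is only a model under stated assumptions: host_alloc() and host_put() are
hypothetical stand-ins for scsi_host_alloc() and scsi_host_put(), and the
fail_chip argument exists only to force the second allocation to fail.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct chip { int dummy; };
struct host { struct chip *chip; };

static int live_hosts;	/* hosts allocated and not yet released */

/* Stand-in for scsi_host_alloc(). */
static struct host *host_alloc(void)
{
	struct host *h = calloc(1, sizeof(*h));

	if (h)
		live_hosts++;
	return h;
}

/* Stand-in for scsi_host_put(). */
static void host_put(struct host *h)
{
	free(h);
	live_hosts--;
}

static int probe(int fail_chip, struct host **out)
{
	struct host *host;
	int err;

	host = host_alloc();
	if (!host)
		return -ENOMEM;

	host->chip = fail_chip ? NULL : calloc(1, sizeof(*host->chip));
	if (!host->chip) {
		err = -ENOMEM;
		goto err_put_host;	/* undo the most recent allocation */
	}

	*out = host;
	return 0;

err_put_host:
	host_put(host);
	return err;
}
```

Reading the goto alone is enough to see that the host is released on this
path, which a "come from" label like chip_alloc_fail does not tell you.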

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-12-07  9:26 Alexander Kappner
@ 2017-12-07 10:38 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 341+ messages in thread
From: Greg Kroah-Hartman @ 2017-12-07 10:38 UTC (permalink / raw)
  To: Alexander Kappner; +Cc: mathias.nyman, linux-usb, linux-kernel

On Thu, Dec 07, 2017 at 01:26:14AM -0800, Alexander Kappner wrote:
> Date: Wed, 6 Dec 2017 15:28:37 -0800
> Subject: [PATCH] usb-core: Fix potential null pointer dereference in xhci-debugfs.c

Something went wrong here, resulting in an email with no subject.

Can you fix this up and resend?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-08-18 17:42 Rajneesh Bhardwaj
@ 2017-08-18 17:53 ` Rajneesh Bhardwaj
  0 siblings, 0 replies; 341+ messages in thread
From: Rajneesh Bhardwaj @ 2017-08-18 17:53 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Peter Zijlstra (Intel),
	Platform Driver, dvhart, Andy Shevchenko, linux-kernel,
	Vishwanath Somayaji, dbasehore, rjw, rajatja

On Fri, Aug 18, 2017 at 11:12:14PM +0530, Rajneesh Bhardwaj wrote:
> Bcc: 
> Subject: Re: [PATCH] platform/x86: intel_pmc_core: Add Package C-states
>  residency info
> Reply-To: 
> In-Reply-To: <CAHp75Vd5Wnio-RCEBENtonYWOJF2+88FDvqkUv1HzV3CdcaaPA@mail.gmail.com>
>

Please ignore my previous email without subject. It was sent by mistake.

> On Fri, Aug 18, 2017 at 08:17:32PM +0300, Andy Shevchenko wrote:
> > +PeterZ (since I mentioned his name)
> > 
> > On Fri, Aug 18, 2017 at 5:58 PM, Rajneesh Bhardwaj
> > <rajneesh.bhardwaj@intel.com> wrote:
> > > On Fri, Aug 18, 2017 at 03:57:34PM +0300, Andy Shevchenko wrote:
> > >> On Fri, Aug 18, 2017 at 3:37 PM, Rajneesh Bhardwaj
> > >> <rajneesh.bhardwaj@intel.com> wrote:
> > >> > This patch introduces a new debugfs entry to read current Package C-state
> > >> > residency values and, one new kernel API to read the Package C-10 residency
> > >> > counter.
> > >> >
> > >> > Package C-state residency MSRs provide useful debug information about system
> > >> > idle states. In idle states system must enter deeper Package C-states.
> > 
> > >> Why this patch is needed?
> > >
> > > Andy, I'll try to give some background for this.
> > >
> > > This is needed to enhance the S0ix failure debug capabilities from within
> > > the kernel. On ChromeOS we have S0ix failsafe kernel framework that is used
> > > to validate S0ix and report the blockers in case of a failure.
> > > https://patchwork.kernel.org/patch/9148999/
> > 
> > (It's not part of upstream)
> 
> Sorry i sent an older link. There are fresh attempts to get this into
> mainline kernel and looks like there is a traction for it.
> https://patchwork.kernel.org/patch/9831229/
> 
> Package C-state (PC10) validation is discussed there.
> 
> > 
> > > So far only intel_pmc_slp_s0_counter_read is called by this framework to
> > > check whether the previous attempt to enter S0ix was success or not.
> > 
> > I harder see even a single user of that API in current kernel. It
> > should be unexported and removed I think.
> > 
> > >  Having
> > > another PC10 counter related exported function enhances the S0ix debug since
> > > PC10 state is a prerequisite to enter S0ix.
> > >
> > >> See, we have turbostat and cpupower user space tools which do this
> > >> without any additional code to be written in kernel. What prevents
> > >> your user space application do the same?
> > >>
> > >> Moreover, we have events for cstate, I assume perf or something alike
> > >> can monitor those counters as well.
> > >
> > > You're right, perhaps the debugfs is redundant when we have those user space
> > > tools but such tools are not available readily for all platforms/distros.
> > > Interfaces like /dev/cpu/*/msr that turbostat uses are not available on all
> > > the platforms.
> > > PMC driver is a debug driver so i thought its better to show Package C-state
> > > related info for low power debug here.
> > >
> > >>
> > >> Sorry, NAK.
> > >
> > > This patch has two parts i.e. exported PC10 API and the debugfs. Based on
> > > the above explanation, if the patch is not good as is, please let me know if
> > > i should drop the debugfs part and respin a v2 with just the exported API or
> > > drop this totally.
> > >
> > > Thanks for the feedback and thanks for taking time to review!
> > 
> > Reading above makes me think that entire design of this is misguided.
> > Since the most of values are counters they better to be accessed in a
> > way how perf does.
> > 
> > In case you need *in-kernel* facility, do some APIs (if it's not done
> > yet) for events drivers first.
> > cstate event driver is already in upstream.
> > 
> > Sorry, NAK for entire patch until it would be blessed by people like Peter Z.
> > 
> > -- 
> > With Best Regards,
> > Andy Shevchenko
> 
> -- 
> Best Regards,
> Rajneesh

-- 
Best Regards,
Rajneesh

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-06-04 11:59 Yury Norov
@ 2017-06-14 20:16 ` Yury Norov
  0 siblings, 0 replies; 341+ messages in thread
From: Yury Norov @ 2017-06-14 20:16 UTC (permalink / raw)
  To: Catalin Marinas, linux-arm-kernel, linux-kernel, linux-doc,
	Arnd Bergmann
  Cc: Andrew Pinski, Andrew Pinski, Adam Borowski, Chris Metcalf,
	Steve Ellcey, Maxim Kuvyrkov, Ramana Radhakrishnan,
	Florian Weimer, Bamvor Zhangjian, Andreas Schwab, Chris Metcalf,
	Heiko Carstens, schwidefsky, broonie, Joseph Myers,
	christoph.muellner, szabolcs.nagy, klimov.linux, Nathan_Lynch,
	agraf, Prasun.Kapoor, geert, philipp.tomsich, manuel.montezelo,
	linyongting, davem, zhouchengming1

Hi Catalin, all.

Thank you for your time on reviewing the series. I really appreciate it.

This is the updated version where I tried to address all comments:
https://github.com/norov/linux/commits/ilp32-20170613.4

(the last 3 patches here are Andrew Pinski's rework of the vdso, rebased on
the ilp32 series)

If no further review comments come in, I'll send v8 at the beginning of
next week. Is this plan OK?

And this is the backport on the v4.11 kernel:
https://github.com/norov/linux/commits/ilp32-4.11.4

Yury

On Sun, Jun 04, 2017 at 02:59:49PM +0300, Yury Norov wrote:
> Subject: [PATCH v7 resend 2 00/20] ILP32 for ARM64
> 
> Hi Catalin,
>  
> Here is a rebase of latest kernel patchset against next-20170602. There's almost
> no changes, but there are some conflicts that are not trivial, and I'd like to
> refresh the submission therefore.
> 
> How are your experiments with testing and benchmarking of ILP32 are going? In
> my current tests I see 0 failures on LTP. Benchmarking on SPEC CPU2006 and
> LMBench shows no difference for LP64 and expected performance boost for ILP32
> (compared to LP64 results).
> 
> Steve Ellcey is handling upstream submission of Glibc patches. The patches are
> ready and have been reviewed and reworked per community’s comments. There are
> no outstanding userspace ABI issues from Glibc. Glibc submission is now waiting
> on ILP32 kernel submission.
> 
> Catalin, regarding rootfs, is OpenSuSe’s build sufficient for your experiments?
> I’ve also seen Wookey merging patches for ILP32 triplet to binutils and pushing
> them to Debian.
> 
> One last thing I wanted to check with you about is ILP32 PCS - does, in your
> view, ARM Ltd. needs to publish any additional docs for ABI to become official?
> 
> Below is the regular description.
> 
> Thanks.
> Yury
> 
> --------
> 
> This series enables aarch64 with ilp32 mode.
> 
> As supporting work, it introduces ARCH_32BIT_OFF_T configuration
> option that is enabled for existing 32-bit architectures but disabled
> for new arches (so 64-bit off_t is is used by new userspace). Also it
> deprecates getrlimit and setrlimit syscalls prior to prlimit64.
> 
> This version is based on linux-next from 2017-03-01. It works with
> glibc-2.25, and tested with LTP, glibc testsuite, trinity, lmbench,
> CPUSpec.
> 
> Patches 1, 2, 3 and 8 are general, and may be applied separately.
> 
> This is the rebase of v7 - still no major changes has been made.
> 
> Kernel and GLIBC trees:
> https://github.com/norov/linux/tree/ilp32-20170602
> https://github.com/norov/glibc/tree/dev9
> 
> (GLIBC patches are managed by Steve Ellcey, so my tree is only for
> reference.)
> 
> Changes:
> v3: https://lkml.org/lkml/2014/9/3/704
> v4: https://lkml.org/lkml/2015/4/13/691
> v5: https://lkml.org/lkml/2015/9/29/911
> v6: https://lkml.org/lkml/2016/5/23/661
> v7: RFC nowrap:  https://lkml.org/lkml/2016/6/17/990
> v7: RFC2 nowrap: https://lkml.org/lkml/2016/8/17/245
> v7: RFC3 nowrap: https://lkml.org/lkml/2016/10/21/883
> v7: https://lkml.org/lkml/2017/1/9/213
> v7: Resend: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-March/490801.html
> v7: Resend 2:
>     - vdso-ilp32 Makefile synced with lp64 Makefile (patch 19);
>     - rebased on next-20170602.
> 
> Andrew Pinski (6):
>   arm64: rename COMPAT to AARCH32_EL0 in Kconfig
>   arm64: ensure the kernel is compiled for LP64
>   arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64
>   arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use
>     it
>   arm64: ilp32: introduce ilp32-specific handlers for sigframe and
>     ucontext
>   arm64:ilp32: add ARM64_ILP32 to Kconfig
> 
> Philipp Tomsich (1):
>   arm64:ilp32: add vdso-ilp32 and use for signal return
> 
> Yury Norov (13):
>   compat ABI: use non-compat openat and open_by_handle_at variants
>   32-bit ABI: introduce ARCH_32BIT_OFF_T config option
>   asm-generic: Drop getrlimit and setrlimit syscalls from default list
>   arm64: ilp32: add documentation on the ILP32 ABI for ARM64
>   thread: move thread bits accessors to separated file
>   arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat)
>   arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64
>   arm64: introduce binfmt_elf32.c
>   arm64: ilp32: introduce binfmt_ilp32.c
>   arm64: ilp32: share aarch32 syscall handlers
>   arm64: signal: share lp64 signal routines to ilp32
>   arm64: signal32: move ilp32 and aarch32 common code to separated file
>   arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
> 
>  Documentation/arm64/ilp32.txt                 |  45 +++++++
>  arch/Kconfig                                  |   4 +
>  arch/arc/Kconfig                              |   1 +
>  arch/arc/include/uapi/asm/unistd.h            |   1 +
>  arch/arm/Kconfig                              |   1 +
>  arch/arm64/Kconfig                            |  19 ++-
>  arch/arm64/Makefile                           |   8 ++
>  arch/arm64/include/asm/compat.h               |  19 +--
>  arch/arm64/include/asm/elf.h                  |  37 ++----
>  arch/arm64/include/asm/fpsimd.h               |   2 +-
>  arch/arm64/include/asm/ftrace.h               |   2 +-
>  arch/arm64/include/asm/hwcap.h                |   6 +-
>  arch/arm64/include/asm/is_compat.h            |  90 ++++++++++++++
>  arch/arm64/include/asm/memory.h               |   5 +-
>  arch/arm64/include/asm/processor.h            |  11 +-
>  arch/arm64/include/asm/ptrace.h               |   2 +-
>  arch/arm64/include/asm/seccomp.h              |   2 +-
>  arch/arm64/include/asm/signal32.h             |   9 +-
>  arch/arm64/include/asm/signal32_common.h      |  27 ++++
>  arch/arm64/include/asm/signal_common.h        |  33 +++++
>  arch/arm64/include/asm/signal_ilp32.h         |  38 ++++++
>  arch/arm64/include/asm/syscall.h              |   2 +-
>  arch/arm64/include/asm/thread_info.h          |   4 +-
>  arch/arm64/include/asm/unistd.h               |   6 +-
>  arch/arm64/include/asm/vdso.h                 |   6 +
>  arch/arm64/include/uapi/asm/bitsperlong.h     |   9 +-
>  arch/arm64/include/uapi/asm/unistd.h          |  13 ++
>  arch/arm64/kernel/Makefile                    |   8 +-
>  arch/arm64/kernel/asm-offsets.c               |   9 +-
>  arch/arm64/kernel/binfmt_elf32.c              |  38 ++++++
>  arch/arm64/kernel/binfmt_ilp32.c              |  85 +++++++++++++
>  arch/arm64/kernel/cpufeature.c                |   8 +-
>  arch/arm64/kernel/cpuinfo.c                   |  20 +--
>  arch/arm64/kernel/entry.S                     |  34 +++++-
>  arch/arm64/kernel/entry32.S                   |  80 ------------
>  arch/arm64/kernel/entry32_common.S            | 107 ++++++++++++++++
>  arch/arm64/kernel/entry_ilp32.S               |  22 ++++
>  arch/arm64/kernel/head.S                      |   2 +-
>  arch/arm64/kernel/hw_breakpoint.c             |   8 +-
>  arch/arm64/kernel/perf_regs.c                 |   2 +-
>  arch/arm64/kernel/process.c                   |   7 +-
>  arch/arm64/kernel/ptrace.c                    |  80 ++++++++++--
>  arch/arm64/kernel/signal.c                    | 102 ++++++++++------
>  arch/arm64/kernel/signal32.c                  | 107 ----------------
>  arch/arm64/kernel/signal32_common.c           | 135 ++++++++++++++++++++
>  arch/arm64/kernel/signal_ilp32.c              | 170 ++++++++++++++++++++++++++
>  arch/arm64/kernel/sys_ilp32.c                 | 100 +++++++++++++++
>  arch/arm64/kernel/traps.c                     |   5 +-
>  arch/arm64/kernel/vdso-ilp32/.gitignore       |   2 +
>  arch/arm64/kernel/vdso-ilp32/Makefile         |  80 ++++++++++++
>  arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S     |  33 +++++
>  arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S |  95 ++++++++++++++
>  arch/arm64/kernel/vdso.c                      |  69 +++++++++--
>  arch/arm64/kernel/vdso/gettimeofday.S         |  20 ++-
>  arch/arm64/kernel/vdso/vdso.S                 |   6 +-
>  arch/blackfin/Kconfig                         |   1 +
>  arch/c6x/include/uapi/asm/unistd.h            |   1 +
>  arch/cris/Kconfig                             |   1 +
>  arch/frv/Kconfig                              |   1 +
>  arch/h8300/Kconfig                            |   1 +
>  arch/h8300/include/uapi/asm/unistd.h          |   1 +
>  arch/hexagon/Kconfig                          |   1 +
>  arch/hexagon/include/uapi/asm/unistd.h        |   1 +
>  arch/m32r/Kconfig                             |   1 +
>  arch/m68k/Kconfig                             |   1 +
>  arch/metag/Kconfig                            |   1 +
>  arch/metag/include/uapi/asm/unistd.h          |   1 +
>  arch/microblaze/Kconfig                       |   1 +
>  arch/mips/Kconfig                             |   1 +
>  arch/mn10300/Kconfig                          |   1 +
>  arch/nios2/Kconfig                            |   1 +
>  arch/nios2/include/uapi/asm/unistd.h          |   1 +
>  arch/openrisc/Kconfig                         |   1 +
>  arch/openrisc/include/uapi/asm/unistd.h       |   1 +
>  arch/parisc/Kconfig                           |   1 +
>  arch/powerpc/Kconfig                          |   1 +
>  arch/score/Kconfig                            |   1 +
>  arch/score/include/uapi/asm/unistd.h          |   1 +
>  arch/sh/Kconfig                               |   1 +
>  arch/sparc/Kconfig                            |   1 +
>  arch/tile/Kconfig                             |   1 +
>  arch/tile/include/uapi/asm/unistd.h           |   1 +
>  arch/tile/kernel/compat.c                     |   3 +
>  arch/unicore32/Kconfig                        |   1 +
>  arch/unicore32/include/uapi/asm/unistd.h      |   1 +
>  arch/x86/Kconfig                              |   1 +
>  arch/x86/um/Kconfig                           |   1 +
>  arch/xtensa/Kconfig                           |   1 +
>  drivers/clocksource/arm_arch_timer.c          |   2 +-
>  include/linux/fcntl.h                         |   2 +-
>  include/linux/thread_bits.h                   |  63 ++++++++++
>  include/linux/thread_info.h                   |  66 ++--------
>  include/uapi/asm-generic/unistd.h             |  10 +-
>  93 files changed, 1601 insertions(+), 413 deletions(-)
>  create mode 100644 Documentation/arm64/ilp32.txt
>  create mode 100644 arch/arm64/include/asm/is_compat.h
>  create mode 100644 arch/arm64/include/asm/signal32_common.h
>  create mode 100644 arch/arm64/include/asm/signal_common.h
>  create mode 100644 arch/arm64/include/asm/signal_ilp32.h
>  create mode 100644 arch/arm64/kernel/binfmt_elf32.c
>  create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
>  create mode 100644 arch/arm64/kernel/entry32_common.S
>  create mode 100644 arch/arm64/kernel/entry_ilp32.S
>  create mode 100644 arch/arm64/kernel/signal32_common.c
>  create mode 100644 arch/arm64/kernel/signal_ilp32.c
>  create mode 100644 arch/arm64/kernel/sys_ilp32.c
>  create mode 100644 arch/arm64/kernel/vdso-ilp32/.gitignore
>  create mode 100644 arch/arm64/kernel/vdso-ilp32/Makefile
>  create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
>  create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
>  create mode 100644 include/linux/thread_bits.h
> 
> -- 
> 2.11.0
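
For readers unfamiliar with the naming in the cover letter: ILP32 denotes
the C data model in which int, long and pointers are all 32 bits wide,
while LP64 keeps int at 32 bits and makes long and pointers 64 bits wide.
A minimal sketch of that classification follows. The helper names
data_model_name() and host_data_model() are made up for illustration and
are not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: classify a C data model from the sizes (in bytes)
 * of int, long and a data pointer. Only the two models named in the
 * cover letter are recognised; everything else is reported as "other".
 */
const char *data_model_name(size_t int_sz, size_t long_sz, size_t ptr_sz)
{
	if (int_sz == 4 && long_sz == 4 && ptr_sz == 4)
		return "ILP32";
	if (int_sz == 4 && long_sz == 8 && ptr_sz == 8)
		return "LP64";
	return "other";
}

/* Report the data model the current translation unit was compiled for. */
const char *host_data_model(void)
{
	return data_model_name(sizeof(int), sizeof(long), sizeof(void *));
}
```

Calling host_data_model() from a program built for each ABI is one way to
confirm which model a given toolchain targets.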

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-27  2:08                       ` Joonsoo Kim
@ 2017-04-27 15:10                         ` Michal Hocko
  0 siblings, 0 replies; 341+ messages in thread
From: Michal Hocko @ 2017-04-27 15:10 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Thu 27-04-17 11:08:38, Joonsoo Kim wrote:
> On Wed, Apr 26, 2017 at 11:19:06AM +0200, Michal Hocko wrote:
> > > > [...]
> > > > 
> > > > > > You are trying to change a semantic of something that has a well defined
> > > > > > meaning. I disagree that we should change it. It might sound like a
> > > > > > simpler thing to do because pfn walkers will have to be checked but what
> > > > > > you are proposing is conflating two different things together.
> > > > > 
> > > > > I don't think that *I* try to change the semantic of pfn_valid().
> > > > > It would be original semantic of pfn_valid().
> > > > > 
> > > > > "If pfn_valid() returns true, we can get proper struct page and the
> > > > > zone information,"
> > > > 
> > > > I do not see any guarantee about the zone information anywhere. In fact
> > > > this is not true with the original implementation as I've tried to
> > > > explain already. We do have new pages associated with a zone but that
> > > > association might change during the online phase. So you cannot really
> > > > rely on that information until the page is online. There is no real
> > > > change in that regards after my rework.
> > > 
> > > I know that what you did doesn't change thing much. What I try to say
> > > is that previous implementation related to pfn_valid() in hotplug is
> > > wrong. Please do not assume that hotplug implementation is correct and
> > > other pfn_valid() users are incorrect. There is no design document so
> > > I'm not sure which one is correct but assumption that pfn_valid() user
> > > can access whole the struct page information makes much sense to me.
> > 
> > Not really. E.g. ZONE_DEVICE pages are never online AFAIK. I believe we
> > still need pfn_valid to work for those pfns. Really, pfn_valid has a
> 
> That is really a counterexample to your argument. They require not only
> the struct page but also other information, especially the zone index.
> They check the zone index to know whether the page belongs to
> ZONE_DEVICE or not.

Yes, and they guarantee that this association holds, without memory
onlining though. This memory is never online for anybody who is asking.

[...]

> I think that I did my best to explain my reasoning. It seems that we
> cannot agree with each other so it's better for some others to express
> their opinion to this problem. I will stop this discussion from now
> on.

I _do_ appreciate your feedback and if the general consensus is to
modify pfn_valid I can go in that direction, but my gut feeling tells me
that conflating the "existing struct page" test with the "fully online
and initialized" one is the wrong thing to do.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-26  9:19                     ` Michal Hocko
@ 2017-04-27  2:08                       ` Joonsoo Kim
  2017-04-27 15:10                         ` Michal Hocko
  0 siblings, 1 reply; 341+ messages in thread
From: Joonsoo Kim @ 2017-04-27  2:08 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Wed, Apr 26, 2017 at 11:19:06AM +0200, Michal Hocko wrote:
> > > [...]
> > > 
> > > > > You are trying to change a semantic of something that has a well defined
> > > > > meaning. I disagree that we should change it. It might sound like a
> > > > > simpler thing to do because pfn walkers will have to be checked but what
> > > > > you are proposing is conflating two different things together.
> > > > 
> > > > I don't think that *I* try to change the semantic of pfn_valid().
> > > > It would be original semantic of pfn_valid().
> > > > 
> > > > "If pfn_valid() returns true, we can get proper struct page and the
> > > > zone information,"
> > > 
> > > I do not see any guarantee about the zone information anywhere. In fact
> > > this is not true with the original implementation as I've tried to
> > > explain already. We do have new pages associated with a zone but that
> > > association might change during the online phase. So you cannot really
> > > rely on that information until the page is online. There is no real
> > > change in that regards after my rework.
> > 
> > I know that what you did doesn't change thing much. What I try to say
> > is that previous implementation related to pfn_valid() in hotplug is
> > wrong. Please do not assume that hotplug implementation is correct and
> > other pfn_valid() users are incorrect. There is no design document so
> > I'm not sure which one is correct but assumption that pfn_valid() user
> > can access whole the struct page information makes much sense to me.
> 
> Not really. E.g. ZONE_DEVICE pages are never online AFAIK. I believe we
> still need pfn_valid to work for those pfns. Really, pfn_valid has a

That is really a counterexample to your argument. They require not only
the struct page but also other information, especially the zone index.
They check the zone index to know whether the page belongs to
ZONE_DEVICE or not.

So, pfn_valid() for ZONE_DEVICE pages assumes that the struct page has
all the valid information. That matches my suggestion perfectly.
Onlining isn't the important issue here. The important point is the
condition under which pfn_valid() returns true. pfn_valid() for
ZONE_DEVICE returns true after arch_add_memory() since all the struct
page information is fixed there.

If the zone of hotplugged memory cannot be fixed at this moment, you can
defer it until all the information is fixed (onlining). That seems to be
a better semantic for pfn_valid() to me.

> different meaning than you would like it to have. Who knows how many
> others like that are lurking there. I feel much more comfortable to go
> and hunt already broken code and fix it rather than break something
> unexpectedly.

I think that I did my best to explain my reasoning. It seems that we
cannot agree with each other so it's better for some others to express
their opinion to this problem. I will stop this discussion from now
on.

Thanks.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-25  2:50                   ` Joonsoo Kim
@ 2017-04-26  9:19                     ` Michal Hocko
  2017-04-27  2:08                       ` Joonsoo Kim
  0 siblings, 1 reply; 341+ messages in thread
From: Michal Hocko @ 2017-04-26  9:19 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Tue 25-04-17 11:50:45, Joonsoo Kim wrote:
> On Mon, Apr 24, 2017 at 09:53:12AM +0200, Michal Hocko wrote:
> > On Mon 24-04-17 10:44:43, Joonsoo Kim wrote:
> > > On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> > > > On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > > > > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > > > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > > > > [...]
> > > > > > > > Which pfn walkers you have in mind?
> > > > > > > 
> > > > > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > > > > using pfn_valid().
> > > > > > 
> > > > > > Yeah, I've checked that one and in fact this is a good example of the
> > > > > > case where you do not really care about holes. It just checks the page
> > > > > > count which is a valid information under any circumstances.
> > > > > 
> > > > > I don't think so. First, it checks the page *map* count. Is it still valid
> > > > > even if PageReserved() is set?
> > > > 
> > > > I do not know about any user which would manipulate page map count for
> > > > referenced pages. The core MM code doesn't.
> > > 
> > > That's weird that we can get *map* count without PageReserved() check,
> > > but we cannot get zone information.
> > > Zone information is more static information than map count.
> > 
> > As I've already pointed out the rework of the hotplug code is mainly
> > about postponing the zone initialization from the physical hot add to
> > the logical onlining. The zone is really not clear until that moment.
> >  
> > > It should be defined/documented in this time that what information in
> > > the struct page is valid even if PageReserved() is set. And then, we
> > > need to fix all the things based on this design decision.
> > 
> > Where would you suggest documenting this? We do have
> > Documentation/memory-hotplug.txt but it is not really specific about
> > struct page.
> 
> pfn_valid() in include/linux/mmzone.h looks like the proper place.

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c412e6a3a1e9..443258fcac93 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1288,10 +1288,14 @@ unsigned long __init node_memmap_size_bytes(int, unsigned long, unsigned long);
 #ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
 /*
  * pfn_valid() is meant to be able to tell if a given PFN has valid memmap
- * associated with it or not. In FLATMEM, it is expected that holes always
- * have valid memmap as long as there is valid PFNs either side of the hole.
- * In SPARSEMEM, it is assumed that a valid section has a memmap for the
- * entire section.
+ * associated with it or not. This means that a struct page exists for this
+ * pfn. The caller cannot assume the page is fully initialized though.
+ * pfn_to_online_page() should be used to make sure the struct page is fully
+ * initialized.
+ *
+ * In FLATMEM, it is expected that holes always have valid memmap as long as
+ * there is valid PFNs either side of the hole. In SPARSEMEM, it is assumed
+ * that a valid section has a memmap for the entire section.
  *
  * However, an ARM, and maybe other embedded architectures in the future
  * free memmap backing holes to save memory on the assumption the memmap is

> > [...]
> > 
> > > > You are trying to change a semantic of something that has a well defined
> > > > meaning. I disagree that we should change it. It might sound like a
> > > > simpler thing to do because pfn walkers will have to be checked but what
> > > > you are proposing is conflating two different things together.
> > > 
> > > I don't think that *I* try to change the semantic of pfn_valid().
> > > It would be original semantic of pfn_valid().
> > > 
> > > "If pfn_valid() returns true, we can get proper struct page and the
> > > zone information,"
> > 
> > I do not see any guarantee about the zone information anywhere. In fact
> > this is not true with the original implementation as I've tried to
> > explain already. We do have new pages associated with a zone but that
> > association might change during the online phase. So you cannot really
> > rely on that information until the page is online. There is no real
> > change in that regards after my rework.
> 
> I know that what you did doesn't change thing much. What I try to say
> is that previous implementation related to pfn_valid() in hotplug is
> wrong. Please do not assume that hotplug implementation is correct and
> other pfn_valid() users are incorrect. There is no design document so
> I'm not sure which one is correct but assumption that pfn_valid() user
> can access whole the struct page information makes much sense to me.

Not really. E.g. ZONE_DEVICE pages are never online AFAIK. I believe we
still need pfn_valid to work for those pfns. Really, pfn_valid has a
different meaning than you would like it to have. Who knows how many
others like that are lurking there. I feel much more comfortable to go
and hunt already broken code and fix it rather than break something
unexpectedly.
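
The distinction argued for here (pfn_valid() only says that a struct page
exists, while a pfn_to_online_page() style check says that the page is
fully initialized) can be sketched as a small userspace model. Everything
below is an assumption made for illustration: the model_* names, the four
states and the array-based memmap are hypothetical and do not reflect the
kernel's actual data structures or implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the states a pfn can be in. */
enum model_state {
	MODEL_ABSENT,	/* hole: no struct page at all */
	MODEL_OFFLINE,	/* hot-added: struct page exists, not onlined yet */
	MODEL_ONLINE,	/* onlined: fully initialized, zone is final */
	MODEL_DEVICE	/* ZONE_DEVICE-like: has a struct page, never onlined */
};

struct model_page {
	enum model_state state;
	int mapcount;
};

/* "A struct page exists for this pfn", and nothing more than that. */
int model_pfn_valid(const struct model_page *map, size_t nr, size_t pfn)
{
	return pfn < nr && map[pfn].state != MODEL_ABSENT;
}

/* Return the page only if it exists and is online, NULL otherwise. */
const struct model_page *
model_pfn_to_online_page(const struct model_page *map, size_t nr, size_t pfn)
{
	if (!model_pfn_valid(map, nr, pfn) || map[pfn].state != MODEL_ONLINE)
		return NULL;
	return &map[pfn];
}

/* A walker that depends on full initialization skips everything else. */
long model_sum_online_mapcount(const struct model_page *map, size_t nr)
{
	long sum = 0;

	for (size_t pfn = 0; pfn < nr; pfn++) {
		const struct model_page *page =
			model_pfn_to_online_page(map, nr, pfn);

		if (page)
			sum += page->mapcount;
	}
	return sum;
}
```

In this model the device-like page stays visible to model_pfn_valid() even
though it is never online, which is the property that would be lost if the
two tests were merged into one.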
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-24  7:53                 ` Michal Hocko
@ 2017-04-25  2:50                   ` Joonsoo Kim
  2017-04-26  9:19                     ` Michal Hocko
  0 siblings, 1 reply; 341+ messages in thread
From: Joonsoo Kim @ 2017-04-25  2:50 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Mon, Apr 24, 2017 at 09:53:12AM +0200, Michal Hocko wrote:
> On Mon 24-04-17 10:44:43, Joonsoo Kim wrote:
> > On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> > > On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > > > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > > > [...]
> > > > > > > Which pfn walkers you have in mind?
> > > > > > 
> > > > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > > > using pfn_valid().
> > > > > 
> > > > > Yeah, I've checked that one and in fact this is a good example of the
> > > > > case where you do not really care about holes. It just checks the page
> > > > > count which is a valid information under any circumstances.
> > > > 
> > > > I don't think so. First, it checks the page *map* count. Is it still valid
> > > > even if PageReserved() is set?
> > > 
> > > I do not know about any user which would manipulate page map count for
> > > referenced pages. The core MM code doesn't.
> > 
> > That's weird that we can get *map* count without PageReserved() check,
> > but we cannot get zone information.
> > Zone information is more static information than map count.
> 
> As I've already pointed out the rework of the hotplug code is mainly
> about postponing the zone initialization from the physical hot add to
> the logical onlining. The zone is really not clear until that moment.
>  
> > It should be defined/documented in this time that what information in
> > the struct page is valid even if PageReserved() is set. And then, we
> > need to fix all the things based on this design decision.
> 
> Where would you suggest documenting this? We do have
> Documentation/memory-hotplug.txt but it is not really specific about
> struct page.

pfn_valid() in include/linux/mmzone.h looks like the proper place.

> 
> [...]
> 
> > > You are trying to change a semantic of something that has a well defined
> > > meaning. I disagree that we should change it. It might sound like a
> > > simpler thing to do because pfn walkers will have to be checked but what
> > > you are proposing is conflating two different things together.
> > 
> > I don't think that *I* try to change the semantic of pfn_valid().
> > It would be original semantic of pfn_valid().
> > 
> > "If pfn_valid() returns true, we can get proper struct page and the
> > zone information,"
> 
> I do not see any guarantee about the zone information anywhere. In fact
> this is not true with the original implementation as I've tried to
> explain already. We do have new pages associated with a zone but that
> association might change during the online phase. So you cannot really
> rely on that information until the page is online. There is no real
> change in that regards after my rework.

I know that what you did doesn't change things much. What I am trying to
say is that the previous implementation related to pfn_valid() in hotplug
is wrong. Please do not assume that the hotplug implementation is correct
and other pfn_valid() users are incorrect. There is no design document, so
I'm not sure which one is correct, but the assumption that a pfn_valid()
user can access the whole struct page information makes much sense to me.
So I would ask you to fix the hotplug implementation rather than modify
every pfn_valid() user.

> 
> [...]
> > > So please do not conflate those two different concepts together. I
> > > believe that the most prominent pfn walkers should be covered now and
> > > others can be evaluated later.
> > 
> > Even if original pfn_valid()'s semantic is not the one that I mentioned,
> > I think that suggested semantic from me is better.
> > Only hotplug code need to be changed and others doesn't need to be changed.
> > There is no overhead for others. What's the problem about this approach?
> 
> That this would require to check _every_ single pfn_valid user in the
> kernel. That is beyond my time capacity and not really necessary because
> the current code already suffers from the same/similar class of
> problems.

I don't think that pfn_valid() users consider the hole case.
Contrary to your expectation, if your way is taken, _every_
pfn_valid() user has to be checked.

Thanks.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-24  1:44               ` Joonsoo Kim
@ 2017-04-24  7:53                 ` Michal Hocko
  2017-04-25  2:50                   ` Joonsoo Kim
  0 siblings, 1 reply; 341+ messages in thread
From: Michal Hocko @ 2017-04-24  7:53 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Mon 24-04-17 10:44:43, Joonsoo Kim wrote:
> On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> > On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > > [...]
> > > > > > Which pfn walkers you have in mind?
> > > > > 
> > > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > > using pfn_valid().
> > > > 
> > > > Yeah, I've checked that one and in fact this is a good example of the
> > > > case where you do not really care about holes. It just checks the page
> > > > count which is a valid information under any circumstances.
> > > 
> > > I don't think so. First, it checks the page *map* count. Is it still valid
> > > even if PageReserved() is set?
> > 
> > I do not know about any user which would manipulate page map count for
> > referenced pages. The core MM code doesn't.
> 
> That's weird that we can get *map* count without PageReserved() check,
> but we cannot get zone information.
> Zone information is more static information than map count.

As I've already pointed out the rework of the hotplug code is mainly
about postponing the zone initialization from the physical hot add to
the logical onlining. The zone is really not clear until that moment.
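
The two phases described above can be sketched as a tiny state model: the
physical hot add makes the memmap exist, and only the logical onlining
fixes the zone. This is a simplified illustration under stated
assumptions, not the hotplug code itself. The model_* names and the
per-section granularity are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone ids; UNSET means "not decided yet". */
enum model_zone { MODEL_ZONE_UNSET = 0, MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

struct model_section {
	bool present;		/* memmap allocated at physical hot add */
	bool online;		/* set by logical onlining */
	enum model_zone zone;	/* meaningful only once online */
};

/* Phase 1: physical hot add. The zone is deliberately left undecided. */
void model_hot_add(struct model_section *s)
{
	s->present = true;
	s->online = false;
	s->zone = MODEL_ZONE_UNSET;
}

/* Phase 2: logical onlining. Fails unless the section was hot-added. */
bool model_online(struct model_section *s, enum model_zone zone)
{
	if (!s->present || zone == MODEL_ZONE_UNSET)
		return false;
	s->zone = zone;
	s->online = true;
	return true;
}

/* The zone question has an answer only after onlining. */
bool model_zone_known(const struct model_section *s)
{
	return s->present && s->online;
}
```

Between model_hot_add() and model_online() the section is present but
model_zone_known() is false, which corresponds to the window in which
the zone is not clear.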
 
> It should be defined/documented in this time that what information in
> the struct page is valid even if PageReserved() is set. And then, we
> need to fix all the things based on this design decision.

Where would you suggest documenting this? We do have
Documentation/memory-hotplug.txt but it is not really specific about
struct page.

[...]

> > You are trying to change a semantic of something that has a well defined
> > meaning. I disagree that we should change it. It might sound like a
> > simpler thing to do because pfn walkers will have to be checked but what
> > you are proposing is conflating two different things together.
> 
> I don't think that *I* try to change the semantic of pfn_valid().
> It would be original semantic of pfn_valid().
> 
> "If pfn_valid() returns true, we can get proper struct page and the
> zone information,"

I do not see any guarantee about the zone information anywhere. In fact
this is not true with the original implementation as I've tried to
explain already. We do have new pages associated with a zone but that
association might change during the online phase. So you cannot really
rely on that information until the page is online. There is no real
change in that regards after my rework.

[...]
> > So please do not conflate those two different concepts together. I
> > believe that the most prominent pfn walkers should be covered now and
> > others can be evaluated later.
> 
> Even if original pfn_valid()'s semantic is not the one that I mentioned,
> I think that suggested semantic from me is better.
> Only hotplug code need to be changed and others doesn't need to be changed.
> There is no overhead for others. What's the problem about this approach?

That this would require to check _every_ single pfn_valid user in the
kernel. That is beyond my time capacity and not really necessary because
the current code already suffers from the same/similar class of
problems.
 
> And, I'm not sure that you covered the most prominent pfn walkers.
> Please see pagetypeinfo_showblockcount_print() in mm/vmstat.c.

I probably haven't (and will send a patch to fix this one - thanks for
pointing to it) but the point is that those are broken already and they
can be fixed in follow-up patches. If you change pfn_valid you might
break existing code in unexpected ways.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-21  7:16             ` Michal Hocko
@ 2017-04-24  1:44               ` Joonsoo Kim
  2017-04-24  7:53                 ` Michal Hocko
  0 siblings, 1 reply; 341+ messages in thread
From: Joonsoo Kim @ 2017-04-24  1:44 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Fri, Apr 21, 2017 at 09:16:16AM +0200, Michal Hocko wrote:
> On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> > On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > > [...]
> > > > > Which pfn walkers you have in mind?
> > > > 
> > > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > > using pfn_valid().
> > > 
> > > Yeah, I've checked that one and in fact this is a good example of the
> > > case where you do not really care about holes. It just checks the page
> > > count which is a valid information under any circumstances.
> > 
> > I don't think so. First, it checks the page *map* count. Is it still valid
> > even if PageReserved() is set?
> 
> I do not know about any user which would manipulate page map count for
> referenced pages. The core MM code doesn't.

It is weird that we can get the *map* count without a PageReserved()
check but cannot get the zone information. Zone information is more
static than the map count.

It should be defined and documented now which information in the struct
page is valid even if PageReserved() is set. Then we need to fix
everything based on this design decision.

> 
> > What I'd like to ask in this example is
> > that what information is valid if PageReserved() is set. Is there any
> > design document on this? I think that we need to define/document it first.
> 
> NO, it is not AFAIK.
> 
> [...]
> > > OK, fair enough. I didn't consider memblock allocations. I will rethink
> > > this patch but there are essentially 3 options
> > > 	- use a different criterion for the offline holes detection. I
> > > 	  have just realized we might do it by storing the online
> > > 	  information into the mem sections
> > > 	- drop this patch
> > > 	- move the PageReferenced check down the chain into
> > > 	  isolate_freepages_block resp. isolate_migratepages_block
> > > 
> > > I would prefer 3 over 2 over 1. I definitely want to make this more
> > > robust so 1 is preferable long term but I do not want this to be a
> > > roadblock to the rest of the rework. Does that sound acceptable to you?
> > 
> > I like #1 among of above options and I already see your patch for #1.
> > It's much better than your first attempt but I'm still not happy due
> > to the semantic of pfn_valid().
> 
> You are trying to change a semantic of something that has a well defined
> meaning. I disagree that we should change it. It might sound like a
> simpler thing to do because pfn walkers will have to be checked but what
> you are proposing is conflating two different things together.

I don't think that *I* am the one trying to change the semantics of
pfn_valid(). What I describe is its original semantics:

"If pfn_valid() returns true, we can get a proper struct page and the
zone information."

Your *hotplug rework* patch is what changes that situation:

"Even if pfn_valid() returns true, we cannot get the zone information
without a PageReserved() check, since the *zone is determined during
onlining* and pfn_valid() returns true as soon as the memory is added."

> 
> > > [..]
> > > > Let me clarify my desire(?) for this issue.
> > > > 
> > > > 1. If pfn_valid() returns true, struct page has valid information, at
> > > > least, in flags (zone id, node id, flags, etc...). So, we can use them
> > > > without checking PageReserved().
> > > 
> > > This is no longer true after my rework. Pages are associated with the
> > > zone during _onlining_ rather than when they are physically hotplugged.
> > 
> > If your rework make information valid during _onlining_, my
> > suggestion is making pfn_valid() return false until onlining.
> > 
> > Caller of pfn_valid() expects that they can get valid information from
> > the struct page. There is no reason to access the struct page if they
> > can't get valid information from it. So, passing pfn_valid() should
> > guarantee that, at least, some kind of information is valid.
> > 
> > If pfn_valid() doesn't guarantee it, most of the pfn walker should
> > > check PageReserved() to make sure of the validity of information from
> > the struct page.
> 
> This is true only for those walkers which really depend on the full
> initialization. This is not the case for all of them. I do not see any
> reason to introduce another _pfn_valid to just check whether there is a
> struct page...

It's a really confusing concept that only some information is valid for
a struct page that is *not* fully initialized. There is not even any
documentation of which information is valid for such a half-initialized
struct page.

A better design would be to regard all information as invalid for a
half-initialized struct page. In that case, it's natural to make
pfn_valid() return false for it.

>  
> So please do not conflate those two different concepts together. I
> believe that the most prominent pfn walkers should be covered now and
> others can be evaluated later.

Even if the original semantics of pfn_valid() are not what I described,
I think the semantics I suggest are better. Only the hotplug code needs
to change; nothing else does, and there is no overhead for other users.
What's the problem with this approach?

Also, I'm not sure that you have covered the most prominent pfn walkers.
Please see pagetypeinfo_showblockcount_print() in mm/vmstat.c.
As you admitted, the additional-check approach is really error-prone, and
this example shows that.
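
To make the concern concrete, here is a minimal userspace sketch (not
kernel code). Every model_* name and all the sizes are made up for
illustration. The only thing taken from this thread is the behaviour
under discussion: struct pages exist after hot-add, so a
pfn_valid()-style check passes, but the zone is only assigned at
onlining.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; every model_* name and size is an assumption. */
#define MODEL_PAGES_PER_SECTION 4UL
#define MODEL_NR_SECTIONS       4UL
#define MODEL_NR_PAGES          (MODEL_PAGES_PER_SECTION * MODEL_NR_SECTIONS)

#define MODEL_HAS_MEM_MAP (1UL << 0)	/* struct pages exist */
#define MODEL_IS_ONLINE   (1UL << 1)	/* section was onlined */
#define MODEL_ZONE_UNSET  (-1)		/* zone not assigned yet */

struct model_page {
	int zone;
};

static unsigned long model_section_flags[MODEL_NR_SECTIONS];
static struct model_page model_mem_map[MODEL_NR_PAGES];

/* Hot-add: struct pages appear, but no zone is assigned. */
void model_add_section(unsigned long nr)
{
	unsigned long pfn = nr * MODEL_PAGES_PER_SECTION;
	unsigned long end = pfn + MODEL_PAGES_PER_SECTION;

	assert(nr < MODEL_NR_SECTIONS);
	model_section_flags[nr] |= MODEL_HAS_MEM_MAP;
	for (; pfn < end; pfn++)
		model_mem_map[pfn].zone = MODEL_ZONE_UNSET;
}

/* Onlining: only now does every page in the section get its zone. */
void model_online_section(unsigned long nr, int zone)
{
	unsigned long pfn = nr * MODEL_PAGES_PER_SECTION;
	unsigned long end = pfn + MODEL_PAGES_PER_SECTION;

	assert(nr < MODEL_NR_SECTIONS);
	assert(model_section_flags[nr] & MODEL_HAS_MEM_MAP);
	model_section_flags[nr] |= MODEL_IS_ONLINE;
	for (; pfn < end; pfn++)
		model_mem_map[pfn].zone = zone;
}

/* True as soon as a struct page exists, online or not. */
int model_pfn_valid(unsigned long pfn)
{
	if (pfn >= MODEL_NR_PAGES)
		return 0;
	return (model_section_flags[pfn / MODEL_PAGES_PER_SECTION] &
		MODEL_HAS_MEM_MAP) != 0;
}

/* Returns the page only if its section is online, else NULL. */
struct model_page *model_pfn_to_online_page(unsigned long pfn)
{
	if (pfn >= MODEL_NR_PAGES)
		return NULL;
	if (!(model_section_flags[pfn / MODEL_PAGES_PER_SECTION] &
	      MODEL_IS_ONLINE))
		return NULL;
	return &model_mem_map[pfn];
}

/* Walker trusting the valid check alone: counts unset zones it reads. */
unsigned long model_walk_valid_only(void)
{
	unsigned long pfn, unset = 0;

	for (pfn = 0; pfn < MODEL_NR_PAGES; pfn++)
		if (model_pfn_valid(pfn) &&
		    model_mem_map[pfn].zone == MODEL_ZONE_UNSET)
			unset++;
	return unset;
}

/* Walker using the online lookup: same count, offline pages skipped. */
unsigned long model_walk_online_only(void)
{
	unsigned long pfn, unset = 0;

	for (pfn = 0; pfn < MODEL_NR_PAGES; pfn++) {
		struct model_page *page = model_pfn_to_online_page(pfn);

		if (page && page->zone == MODEL_ZONE_UNSET)
			unset++;
	}
	return unset;
}
```

With one section added and onlined and a second one only added, the
first walker reads pages whose zone was never set, while the second
never sees them. Every walker that depends on the zone has to be found
and converted by hand, which is the error-prone part.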

Thanks.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-21  4:38           ` Joonsoo Kim
@ 2017-04-21  7:16             ` Michal Hocko
  2017-04-24  1:44               ` Joonsoo Kim
  0 siblings, 1 reply; 341+ messages in thread
From: Michal Hocko @ 2017-04-21  7:16 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Fri 21-04-17 13:38:28, Joonsoo Kim wrote:
> On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> > On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> > [...]
> > > > Which pfn walkers you have in mind?
> > > 
> > > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > > using pfn_valid().
> > 
> > Yeah, I've checked that one and in fact this is a good example of the
> > case where you do not really care about holes. It just checks the page
> > count which is a valid information under any circumstances.
> 
> I don't think so. First, it checks the page *map* count. Is it still valid
> even if PageReserved() is set?

I do not know about any user which would manipulate page map count for
reserved pages. The core MM code doesn't.

> What I'd like to ask in this example is
> that what information is valid if PageReserved() is set. Is there any
> design document on this? I think that we need to define/document it first.

No, there is none AFAIK.

[...]
> > OK, fair enough. I didn't consider memblock allocations. I will rethink
> > this patch but there are essentially 3 options
> > 	- use a different criterion for the offline holes detection. I
> > 	  have just realized we might do it by storing the online
> > 	  information into the mem sections
> > 	- drop this patch
> > 	- move the PageReserved check down the chain into
> > 	  isolate_freepages_block resp. isolate_migratepages_block
> > 
> > I would prefer 3 over 2 over 1. I definitely want to make this more
> > robust so 1 is preferable long term but I do not want this to be a
> > roadblock to the rest of the rework. Does that sound acceptable to you?
> 
> I like #1 among the above options and I have already seen your patch for #1.
> It's much better than your first attempt but I'm still not happy due
> to the semantic of pfn_valid().

You are trying to change a semantic of something that has a well defined
meaning. I disagree that we should change it. It might sound like a
simpler thing to do because pfn walkers will have to be checked but what
you are proposing is conflating two different things together.

> > [..]
> > > Let me clarify my desire(?) for this issue.
> > > 
> > > 1. If pfn_valid() returns true, struct page has valid information, at
> > > least, in flags (zone id, node id, flags, etc...). So, we can use them
> > > without checking PageReserved().
> > 
> > This is no longer true after my rework. Pages are associated with the
> > zone during _onlining_ rather than when they are physically hotplugged.
> 
> If your rework make information valid during _onlining_, my
> suggestion is making pfn_valid() return false until onlining.
> 
> Caller of pfn_valid() expects that they can get valid information from
> the struct page. There is no reason to access the struct page if they
> can't get valid information from it. So, passing pfn_valid() should
> guarantee that, at least, some kind of information is valid.
> 
> If pfn_valid() doesn't guarantee it, most of the pfn walker should
> check PageReserved() to make sure of the validity of information from
> the struct page.

This is true only for those walkers which really depend on the full
initialization. This is not the case for all of them. I do not see any
reason to introduce another _pfn_valid to just check whether there is a
struct page...
 
So please do not conflate those two different concepts together. I
believe that the most prominent pfn walkers should be covered now and
others can be evaluated later.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-20  7:28         ` Michal Hocko
  2017-04-20  8:49           ` Michal Hocko
@ 2017-04-21  4:38           ` Joonsoo Kim
  2017-04-21  7:16             ` Michal Hocko
  1 sibling, 1 reply; 341+ messages in thread
From: Joonsoo Kim @ 2017-04-21  4:38 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Thu, Apr 20, 2017 at 09:28:20AM +0200, Michal Hocko wrote:
> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> [...]
> > > Which pfn walkers you have in mind?
> > 
> > For example, kpagecount_read() in fs/proc/page.c. I searched it by
> > using pfn_valid().
> 
> Yeah, I've checked that one and in fact this is a good example of the
> case where you do not really care about holes. It just checks the page
> count which is a valid information under any circumstances.

I don't think so. First, it checks the page *map* count. Is it still valid
even if PageReserved() is set? What I'd like to ask in this example is
that what information is valid if PageReserved() is set. Is there any
design document on this? I think that we need to define/document it first.

And, I hope that all the information in the flags field is valid in all
cases if pfn_valid() returns true, by design.

This keeps all the existing pfn walkers happy since they don't need an
additional check for PageReserved().

> 
> > > > The other problem I found is that your change will makes some
> > > > contiguous zones to be considered as non-contiguous. Memory allocated
> > > > by memblock API is also marked as PageReserved. If we consider this as
> > > > a hole, we will set such a zone as non-contiguous.
> > > 
> > > Why would that be a problem? We shouldn't touch those pages anyway?
> > 
> > Skipping those pages in compaction are valid so no problem in this
> > case.
> > 
> > The problem I mentioned above is that adding PageReserved() check in
> > __pageblock_pfn_to_page() invalidates optimization by
> > set_zone_contiguous(). In compaction, we need to get a valid struct
> > page and it requires a lot of work. There is performance problem
> > report due to this so set_zone_contiguous() optimization is added. It
> > checks if the zone is contiguous or not in boot time. If zone is
> > determined as contiguous, we can easily get a valid struct page in
> > runtime without expensive checks.
> 
> OK, I see. I've had some vague understanding and the clarification helps.
> 
> > Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> > would make that zone->contiguous usually returns false since memory
> > used by memblock API is marked as PageReserved() and your patch regard
> > it as a hole. It invalidates set_zone_contiguous() optimization and I
> > worry about it.
> 
> OK, fair enough. I didn't consider memblock allocations. I will rethink
> this patch but there are essentially 3 options
> 	- use a different criterion for the offline holes detection. I
> 	  have just realized we might do it by storing the online
> 	  information into the mem sections
> 	- drop this patch
> 	- move the PageReserved check down the chain into
> 	  isolate_freepages_block resp. isolate_migratepages_block
> 
> I would prefer 3 over 2 over 1. I definitely want to make this more
> robust so 1 is preferable long term but I do not want this to be a
> roadblock to the rest of the rework. Does that sound acceptable to you?

I like #1 among the above options and I have already seen your patch for #1.
It's much better than your first attempt but I'm still not happy due
to the semantic of pfn_valid().

> [..]
> > Let me clarify my desire(?) for this issue.
> > 
> > 1. If pfn_valid() returns true, struct page has valid information, at
> > least, in flags (zone id, node id, flags, etc...). So, we can use them
> > without checking PageReserved().
> 
> This is no longer true after my rework. Pages are associated with the
> zone during _onlining_ rather than when they are physically hotplugged.

If your rework make information valid during _onlining_, my
suggestion is making pfn_valid() return false until onlining.

Caller of pfn_valid() expects that they can get valid information from
the struct page. There is no reason to access the struct page if they
can't get valid information from it. So, passing pfn_valid() should
guarantee that, at least, some kind of information is valid.

If pfn_valid() doesn't guarantee it, most of the pfn walker should
check PageReserved() to make sure of the validity of information from
the struct page.

> Basically only the nid is set properly. Strictly speaking this is the
> case also without my rework because the zone might change during online
> phase so you cannot assume it is correct even now. It just happens that
> it more or less works just fine.
>
> > 2. pfn_valid() for offlined holes returns false. This can be easily
> > (?) implemented by manipulating SECTION_MAP_MASK in hotplug code. I
> > guess that there is no reason that pfn_valid() returns true for
> > offlined holes. If there is, please let me know.
> 
> There is some code which really expects that pfn_valid returns true iff
> there is a struct page and it doesn't care about the online status.
> E.g. hotplug code itself so no, we cannot change pfn_valid. What we can
> do though is to add pfn_to_online_page which would do the proper check.
> I have already sent [1]. As noted above we can (ab)use the remaining bit
> in SECTION_MAP_MASK to detect offline pages more robustly.

Some pfn_valid() callers in the hotplug code look wrong. They want to
check the section's validity rather than the pfn's validity. Others want
to access the struct page, so they fit my assumption (?) about pfn_valid().
Therefore, we can change pfn_valid() to return false until onlining.

> > 3. We don't need to check PageReserved() in most of pfn walkers in
> > order to check offline holes.
> 
> We still have to distinguish those who care about offline pages from
> those who do not care about it.

Hotplug code can distinguish those another way, by using the new section
mask as you did in the new patch. If anyone other than the hotplug code
cares about offline pages, it would be just for optimization rather
than correctness. I think that's okay.

Thanks.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-20 11:56             ` Vlastimil Babka
@ 2017-04-20 12:13               ` Michal Hocko
  0 siblings, 0 replies; 341+ messages in thread
From: Michal Hocko @ 2017-04-20 12:13 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Joonsoo Kim, linux-mm, Andrew Morton, Mel Gorman,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Thu 20-04-17 13:56:34, Vlastimil Babka wrote:
> On 04/20/2017 10:49 AM, Michal Hocko wrote:
> > On Thu 20-04-17 09:28:20, Michal Hocko wrote:
> >> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> > [...]
> >>> Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> >>> would make that zone->contiguous usually returns false since memory
> >>> used by memblock API is marked as PageReserved() and your patch regard
> >>> it as a hole. It invalidates set_zone_contiguous() optimization and I
> >>> worry about it.
> >>
> >> OK, fair enough. I didn't consider memblock allocations. I will rethink
> >> this patch but there are essentially 3 options
> >> 	- use a different criterion for the offline holes detection. I
> >> 	  have just realized we might do it by storing the online
> >> 	  information into the mem sections
> >> 	- drop this patch
> >> 	- move the PageReserved check down the chain into
> >> 	  isolate_freepages_block resp. isolate_migratepages_block
> >>
> >> I would prefer 3 over 2 over 1. I definitely want to make this more
> >> robust so 1 is preferable long term but I do not want this to be a
> >> roadblock to the rest of the rework. Does that sound acceptable to you?
> > 
> > So I've played with all three options just to see what the outcome would
> > look like and it turned out that going with 1 will be easiest in the
> > end. What do you think about the following? It should be free of any
> > false positives. I have only compile tested it so far.
> 
> That looks fine, can't say immediately if fully correct. I think you'll
> need to bump SECTION_NID_SHIFT as well and make sure things still fit?
> Otherwise looks like nobody needed a new section bit since 2005, so we
> should be fine.

You are absolutely right. Thanks for spotting this! I have folded this
in:

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 611ff869fa4d..c412e6a3a1e9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1166,7 +1166,7 @@ extern unsigned long usemap_size(void);
 #define SECTION_IS_ONLINE	(1UL<<2)
 #define SECTION_MAP_LAST_BIT	(1UL<<3)
 #define SECTION_MAP_MASK	(~(SECTION_MAP_LAST_BIT-1))
-#define SECTION_NID_SHIFT	2
+#define SECTION_NID_SHIFT	3
 
 static inline struct page *__section_mem_map_addr(struct mem_section *section)
 {
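
For reference, a small userspace sketch of why the shift has to move.
It assumes the early node id is stored as nid << SECTION_NID_SHIFT in
the same word as the flag bits and read back with a plain right shift;
the model_* names are hypothetical and this is not the kernel code
itself.

```c
#include <assert.h>

/* Flag bits as in the patch; the model_* names are illustrative only. */
#define MODEL_MARKED_PRESENT (1UL << 0)
#define MODEL_HAS_MEM_MAP    (1UL << 1)
#define MODEL_IS_ONLINE      (1UL << 2)

/* Store the node id above `shift` low flag bits. */
unsigned long model_encode_early_nid(unsigned long nid, unsigned int shift)
{
	assert(shift < 8 * sizeof(unsigned long));
	return nid << shift;
}

/* Read the node id back by dropping the low `shift` bits. */
unsigned long model_decode_early_nid(unsigned long word, unsigned int shift)
{
	assert(shift < 8 * sizeof(unsigned long));
	return word >> shift;
}

/* Build the word as memory_present() does once the online bit exists. */
unsigned long model_memory_present_word(unsigned long nid, unsigned int shift)
{
	return model_encode_early_nid(nid, shift) |
	       MODEL_MARKED_PRESENT | MODEL_IS_ONLINE;
}
```

With a shift of 2 the online bit lands in the lowest bit of the node id,
so node 0 would read back as node 1. With a shift of 3 the node id
survives the round trip.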
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-20  8:49           ` Michal Hocko
@ 2017-04-20 11:56             ` Vlastimil Babka
  2017-04-20 12:13               ` Michal Hocko
  0 siblings, 1 reply; 341+ messages in thread
From: Vlastimil Babka @ 2017-04-20 11:56 UTC (permalink / raw)
  To: Michal Hocko, Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Andrea Arcangeli,
	Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu, qiuxishi,
	Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On 04/20/2017 10:49 AM, Michal Hocko wrote:
> On Thu 20-04-17 09:28:20, Michal Hocko wrote:
>> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> [...]
>>> Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
>>> would make that zone->contiguous usually returns false since memory
>>> used by memblock API is marked as PageReserved() and your patch regard
>>> it as a hole. It invalidates set_zone_contiguous() optimization and I
>>> worry about it.
>>
>> OK, fair enough. I didn't consider memblock allocations. I will rethink
>> this patch but there are essentially 3 options
>> 	- use a different criterion for the offline holes detection. I
>> 	  have just realized we might do it by storing the online
>> 	  information into the mem sections
>> 	- drop this patch
>> 	- move the PageReserved check down the chain into
>> 	  isolate_freepages_block resp. isolate_migratepages_block
>>
>> I would prefer 3 over 2 over 1. I definitely want to make this more
>> robust so 1 is preferable long term but I do not want this to be a
>> roadblock to the rest of the rework. Does that sound acceptable to you?
> 
> So I've played with all three options just to see what the outcome would
> look like and it turned out that going with 1 will be easiest in the
> end. What do you think about the following? It should be free of any
> false positives. I have only compile tested it so far.

That looks fine, can't say immediately if fully correct. I think you'll
need to bump SECTION_NID_SHIFT as well and make sure things still fit?
Otherwise looks like nobody needed a new section bit since 2005, so we
should be fine.

> ---
> From 747794c13c0e82b55b793a31cdbe1a84ee1c6920 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Thu, 13 Apr 2017 10:28:45 +0200
> Subject: [PATCH] mm: consider zone which is not fully populated to have holes
> 
> __pageblock_pfn_to_page has two users currently, set_zone_contiguous
> which checks whether the given zone contains holes and
> pageblock_pfn_to_page which then carefully returns the first valid
> page from the given pfn range for the given zone. This doesn't handle
> zones which are not fully populated though. Memory pageblocks can be
> offlined or might not have been onlined yet. In such a case the zone
> should be considered to have holes otherwise pfn walkers can touch
> and play with offline pages.
> 
> Current callers of pageblock_pfn_to_page in compaction seem to work
> properly right now because they only isolate PageBuddy
> (isolate_freepages_block) or PageLRU resp. __PageMovable
> (isolate_migratepages_block) which will be always false for these pages.
> It would be safer to skip these pages altogether, though.
> 
> In order to do this, the patch adds a new memory section state
> (SECTION_IS_ONLINE) which is set in memory_present (during boot
> time) or in online_pages_range during memory hotplug. Similarly,
> offline_mem_sections clears the bit and is called when the memory
> range is offlined.
> 
> A pfn_to_online_page helper is then added which checks the mem section
> and only returns a page if it is already online.
> 
> Use the new helper in __pageblock_pfn_to_page and skip the whole page
> block in such a case.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/memory_hotplug.h | 21 ++++++++++++++++++++
>  include/linux/mmzone.h         | 20 ++++++++++++++++++-
>  mm/memory_hotplug.c            |  3 +++
>  mm/page_alloc.c                |  5 ++++-
>  mm/sparse.c                    | 45 +++++++++++++++++++++++++++++++++++++++++-
>  5 files changed, 91 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 3c8cf86201c3..fc1c873504eb 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -14,6 +14,19 @@ struct memory_block;
>  struct resource;
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +/*
> + * Return page for the valid pfn only if the page is online. All pfn
> + * walkers which rely on the fully initialized page->flags and others
> + * should use this rather than pfn_valid && pfn_to_page
> + */
> +#define pfn_to_online_page(pfn)				\
> +({							\
> +	struct page *___page = NULL;			\
> +							\
> +	if (online_section_nr(pfn_to_section_nr(pfn)))	\
> +		___page = pfn_to_page(pfn);		\
> +	___page;					\
> +})
>  
>  /*
>   * Types for free bootmem stored in page->lru.next. These have to be in
> @@ -203,6 +216,14 @@ extern void set_zone_contiguous(struct zone *zone);
>  extern void clear_zone_contiguous(struct zone *zone);
>  
>  #else /* ! CONFIG_MEMORY_HOTPLUG */
> +#define pfn_to_online_page(pfn)			\
> +({						\
> +	struct page *___page = NULL;		\
> +	if (pfn_valid(pfn))			\
> +		___page = pfn_to_page(pfn);	\
> +	___page;				\
> + })
> +
>  /*
>   * Stub functions for when hotplug is off
>   */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0fc121bbf4ff..cad16ac080f5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1143,7 +1143,8 @@ extern unsigned long usemap_size(void);
>   */
>  #define	SECTION_MARKED_PRESENT	(1UL<<0)
>  #define SECTION_HAS_MEM_MAP	(1UL<<1)
> -#define SECTION_MAP_LAST_BIT	(1UL<<2)
> +#define SECTION_IS_ONLINE	(1UL<<2)
> +#define SECTION_MAP_LAST_BIT	(1UL<<3)
>  #define SECTION_MAP_MASK	(~(SECTION_MAP_LAST_BIT-1))
>  #define SECTION_NID_SHIFT	2
>  
> @@ -1174,6 +1175,23 @@ static inline int valid_section_nr(unsigned long nr)
>  	return valid_section(__nr_to_section(nr));
>  }
>  
> +static inline int online_section(struct mem_section *section)
> +{
> +	return (section && (section->section_mem_map & SECTION_IS_ONLINE));
> +}
> +
> +static inline int online_section_nr(unsigned long nr)
> +{
> +	return online_section(__nr_to_section(nr));
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
> +#endif
> +#endif
> +
>  static inline struct mem_section *__pfn_to_section(unsigned long pfn)
>  {
>  	return __nr_to_section(pfn_to_section_nr(pfn));
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index caa58338d121..98f565c279bf 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -929,6 +929,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  	unsigned long i;
>  	unsigned long onlined_pages = *(unsigned long *)arg;
>  	struct page *page;
> +
> +	online_mem_sections(start_pfn, start_pfn + nr_pages);
> +
>  	if (PageReserved(pfn_to_page(start_pfn)))
>  		for (i = 0; i < nr_pages; i++) {
>  			page = pfn_to_page(start_pfn + i);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5d72d29a6ece..fa752de84eef 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
>  	if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
>  		return NULL;
>  
> -	start_page = pfn_to_page(start_pfn);
> +	start_page = pfn_to_online_page(start_pfn);
> +	if (!start_page)
> +		return NULL;
>  
>  	if (page_zone(start_page) != zone)
>  		return NULL;
> @@ -7686,6 +7688,7 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
>  			break;
>  	if (pfn == end_pfn)
>  		return;
> +	offline_mem_sections(pfn, end_pfn);
>  	zone = page_zone(pfn_to_page(pfn));
>  	spin_lock_irqsave(&zone->lock, flags);
>  	pfn = start_pfn;
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 6903c8fc3085..79017f90d8fc 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -185,7 +185,8 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
>  		ms = __nr_to_section(section);
>  		if (!ms->section_mem_map)
>  			ms->section_mem_map = sparse_encode_early_nid(nid) |
> -							SECTION_MARKED_PRESENT;
> +							SECTION_MARKED_PRESENT |
> +							SECTION_IS_ONLINE;
>  	}
>  }
>  
> @@ -590,6 +591,48 @@ void __init sparse_init(void)
>  }
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +
> +/* Mark all memory sections within the pfn range as online */
> +void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
> +{
> +	unsigned long pfn;
> +
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +		unsigned long section_nr = pfn_to_section_nr(pfn);
> +		struct mem_section *ms;
> +
> +		/* onlining code should never touch invalid ranges */
> +		if (WARN_ON(!valid_section_nr(section_nr)))
> +			continue;
> +
> +		ms = __nr_to_section(section_nr);
> +		ms->section_mem_map |= SECTION_IS_ONLINE;
> +	}
> +}
> +
> +#ifdef CONFIG_MEMORY_HOTREMOVE
> +/* Mark all memory sections within the pfn range as offline */
> +void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
> +{
> +	unsigned long pfn;
> +
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
> +		unsigned long section_nr = pfn_to_section_nr(pfn);
> +		struct mem_section *ms;
> +
> +		/*
> +		 * TODO this needs some double checking. Offlining code makes
> +		 * sure to check pfn_valid but those checks might be just bogus
> +		 */
> +		if (WARN_ON(!valid_section_nr(section_nr)))
> +			continue;
> +
> +		ms = __nr_to_section(section_nr);
> +		ms->section_mem_map &= ~SECTION_IS_ONLINE;
> +	}
> +}
> +#endif
> +
>  #ifdef CONFIG_SPARSEMEM_VMEMMAP
>  static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid)
>  {
> 
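
A userspace model of the two section-marking loops quoted above may help
when double checking them. It is only a sketch: the model_* names and
the tiny sizes are made up, and it assumes the ranges are section
aligned, as hotplug ranges are expected to be. The section number is
derived from the loop variable, so a range that spans several sections
marks each of them.

```c
#include <assert.h>

/* Illustrative sizes; real values depend on the architecture. */
#define MODEL_PAGES_PER_SECTION 8UL
#define MODEL_NR_SECTIONS       8UL
#define MODEL_IS_ONLINE         (1UL << 2)

static unsigned long model_section_mem_map[MODEL_NR_SECTIONS];

unsigned long model_pfn_to_section_nr(unsigned long pfn)
{
	return pfn / MODEL_PAGES_PER_SECTION;
}

int model_online_section_nr(unsigned long nr)
{
	assert(nr < MODEL_NR_SECTIONS);
	return (model_section_mem_map[nr] & MODEL_IS_ONLINE) != 0;
}

/* Set the online bit for every section in [start_pfn, end_pfn). */
void model_online_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn;

	for (pfn = start_pfn; pfn < end_pfn; pfn += MODEL_PAGES_PER_SECTION) {
		/* Use the loop variable so each pass picks a new section. */
		unsigned long nr = model_pfn_to_section_nr(pfn);

		assert(nr < MODEL_NR_SECTIONS);
		model_section_mem_map[nr] |= MODEL_IS_ONLINE;
	}
}

/* Clear the online bit for every section in [start_pfn, end_pfn). */
void model_offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn;

	for (pfn = start_pfn; pfn < end_pfn; pfn += MODEL_PAGES_PER_SECTION) {
		unsigned long nr = model_pfn_to_section_nr(pfn);

		assert(nr < MODEL_NR_SECTIONS);
		model_section_mem_map[nr] &= ~MODEL_IS_ONLINE;
	}
}
```

Onlining pfns 8 to 32 in this model marks sections 1, 2 and 3, and
offlining pfns 16 to 24 afterwards clears only section 2.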

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-20  7:28         ` Michal Hocko
@ 2017-04-20  8:49           ` Michal Hocko
  2017-04-20 11:56             ` Vlastimil Babka
  2017-04-21  4:38           ` Joonsoo Kim
  1 sibling, 1 reply; 341+ messages in thread
From: Michal Hocko @ 2017-04-20  8:49 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Thu 20-04-17 09:28:20, Michal Hocko wrote:
> On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
[...]
> > Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> > would make that zone->contiguous usually returns false since memory
> > used by memblock API is marked as PageReserved() and your patch regard
> > it as a hole. It invalidates set_zone_contiguous() optimization and I
> > worry about it.
> 
> OK, fair enough. I didn't consider memblock allocations. I will rethink
> this patch but there are essentially 3 options
> 	- use a different criterion for the offline holes detection. I
> 	  have just realized we might do it by storing the online
> 	  information into the mem sections
> 	- drop this patch
> 	- move the PageReserved check down the chain into
> 	  isolate_freepages_block resp. isolate_migratepages_block
> 
> I would prefer 3 over 2 over 1. I definitely want to make this more
> robust so 1 is preferable long term but I do not want this to be a
> roadblock to the rest of the rework. Does that sound acceptable to you?

So I've played with all three options just to see what the outcome would
look like and it turned out that going with 1 will be easiest in the
end. What do you think about the following? It should be free of any
false positives. I have only compile tested it so far.
---
From 747794c13c0e82b55b793a31cdbe1a84ee1c6920 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Thu, 13 Apr 2017 10:28:45 +0200
Subject: [PATCH] mm: consider zone which is not fully populated to have holes

__pageblock_pfn_to_page has two users currently, set_zone_contiguous
which checks whether the given zone contains holes and
pageblock_pfn_to_page which then carefully returns the first valid
page from the given pfn range for the given zone. This doesn't handle
zones which are not fully populated though. Memory pageblocks can be
offlined or might not have been onlined yet. In such a case the zone
should be considered to have holes otherwise pfn walkers can touch
and play with offline pages.

Current callers of pageblock_pfn_to_page in compaction seem to work
properly right now because they only isolate PageBuddy
(isolate_freepages_block) or PageLRU resp. __PageMovable
(isolate_migratepages_block) which will be always false for these pages.
It would be safer to skip these pages altogether, though.

In order to do this, the patch adds a new memory section state
(SECTION_IS_ONLINE) which is set in memory_present (during boot
time) or in online_pages_range during memory hotplug. Similarly,
offline_mem_sections clears the bit and is called when the memory
range is offlined.

A pfn_to_online_page helper is then added which checks the mem section
and only returns a page if it is already online.

Use the new helper in __pageblock_pfn_to_page and skip the whole page
block in such a case.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/memory_hotplug.h | 21 ++++++++++++++++++++
 include/linux/mmzone.h         | 20 ++++++++++++++++++-
 mm/memory_hotplug.c            |  3 +++
 mm/page_alloc.c                |  5 ++++-
 mm/sparse.c                    | 45 +++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3c8cf86201c3..fc1c873504eb 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -14,6 +14,19 @@ struct memory_block;
 struct resource;
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+/*
+ * Return page for the valid pfn only if the page is online. All pfn
+ * walkers which rely on the fully initialized page->flags and others
+ * should use this rather than pfn_valid && pfn_to_page
+ */
+#define pfn_to_online_page(pfn)				\
+({							\
+	struct page *___page = NULL;			\
+							\
+	if (online_section_nr(pfn_to_section_nr(pfn)))	\
+		___page = pfn_to_page(pfn);		\
+	___page;					\
+})
 
 /*
  * Types for free bootmem stored in page->lru.next. These have to be in
@@ -203,6 +216,14 @@ extern void set_zone_contiguous(struct zone *zone);
 extern void clear_zone_contiguous(struct zone *zone);
 
 #else /* ! CONFIG_MEMORY_HOTPLUG */
+#define pfn_to_online_page(pfn)			\
+({						\
+	struct page *___page = NULL;		\
+	if (pfn_valid(pfn))			\
+		___page = pfn_to_page(pfn);	\
+	___page;				\
+ })
+
 /*
  * Stub functions for when hotplug is off
  */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0fc121bbf4ff..cad16ac080f5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1143,7 +1143,8 @@ extern unsigned long usemap_size(void);
  */
 #define	SECTION_MARKED_PRESENT	(1UL<<0)
 #define SECTION_HAS_MEM_MAP	(1UL<<1)
-#define SECTION_MAP_LAST_BIT	(1UL<<2)
+#define SECTION_IS_ONLINE	(1UL<<2)
+#define SECTION_MAP_LAST_BIT	(1UL<<3)
 #define SECTION_MAP_MASK	(~(SECTION_MAP_LAST_BIT-1))
 #define SECTION_NID_SHIFT	2
 
@@ -1174,6 +1175,23 @@ static inline int valid_section_nr(unsigned long nr)
 	return valid_section(__nr_to_section(nr));
 }
 
+static inline int online_section(struct mem_section *section)
+{
+	return (section && (section->section_mem_map & SECTION_IS_ONLINE));
+}
+
+static inline int online_section_nr(unsigned long nr)
+{
+	return online_section(__nr_to_section(nr));
+}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
+#ifdef CONFIG_MEMORY_HOTREMOVE
+void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn);
+#endif
+#endif
+
 static inline struct mem_section *__pfn_to_section(unsigned long pfn)
 {
 	return __nr_to_section(pfn_to_section_nr(pfn));
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index caa58338d121..98f565c279bf 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -929,6 +929,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	unsigned long i;
 	unsigned long onlined_pages = *(unsigned long *)arg;
 	struct page *page;
+
+	online_mem_sections(start_pfn, start_pfn + nr_pages);
+
 	if (PageReserved(pfn_to_page(start_pfn)))
 		for (i = 0; i < nr_pages; i++) {
 			page = pfn_to_page(start_pfn + i);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5d72d29a6ece..fa752de84eef 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1353,7 +1353,9 @@ struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
 	if (!pfn_valid(start_pfn) || !pfn_valid(end_pfn))
 		return NULL;
 
-	start_page = pfn_to_page(start_pfn);
+	start_page = pfn_to_online_page(start_pfn);
+	if (!start_page)
+		return NULL;
 
 	if (page_zone(start_page) != zone)
 		return NULL;
@@ -7686,6 +7688,7 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 			break;
 	if (pfn == end_pfn)
 		return;
+	offline_mem_sections(pfn, end_pfn);
 	zone = page_zone(pfn_to_page(pfn));
 	spin_lock_irqsave(&zone->lock, flags);
 	pfn = start_pfn;
diff --git a/mm/sparse.c b/mm/sparse.c
index 6903c8fc3085..79017f90d8fc 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -185,7 +185,8 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
 		ms = __nr_to_section(section);
 		if (!ms->section_mem_map)
 			ms->section_mem_map = sparse_encode_early_nid(nid) |
-							SECTION_MARKED_PRESENT;
+							SECTION_MARKED_PRESENT |
+							SECTION_IS_ONLINE;
 	}
 }
 
@@ -590,6 +591,48 @@ void __init sparse_init(void)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+
+/* Mark all memory sections within the pfn range as online */
+void online_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long pfn;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		unsigned long section_nr = pfn_to_section_nr(pfn);
+		struct mem_section *ms;
+
+		/* onlining code should never touch invalid ranges */
+		if (WARN_ON(!valid_section_nr(section_nr)))
+			continue;
+
+		ms = __nr_to_section(section_nr);
+		ms->section_mem_map |= SECTION_IS_ONLINE;
+	}
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/* Mark all memory sections within the pfn range as offline */
+void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
+{
+	unsigned long pfn;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		unsigned long section_nr = pfn_to_section_nr(pfn);
+		struct mem_section *ms;
+
+		/*
+		 * TODO this needs some double checking. Offlining code makes
+		 * sure to check pfn_valid but those checks might be just bogus
+		 */
+		if (WARN_ON(!valid_section_nr(section_nr)))
+			continue;
+
+		ms = __nr_to_section(section_nr);
+		ms->section_mem_map &= ~SECTION_IS_ONLINE;
+	}
+}
+#endif
+
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid)
 {
-- 
2.11.0

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-20  1:27       ` Joonsoo Kim
@ 2017-04-20  7:28         ` Michal Hocko
  2017-04-20  8:49           ` Michal Hocko
  2017-04-21  4:38           ` Joonsoo Kim
  0 siblings, 2 replies; 341+ messages in thread
From: Michal Hocko @ 2017-04-20  7:28 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Thu 20-04-17 10:27:55, Joonsoo Kim wrote:
> On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
[...]
> > Which pfn walkers you have in mind?
> 
> For example, kpagecount_read() in fs/proc/page.c. I searched it by
> using pfn_valid().

Yeah, I've checked that one and in fact this is a good example of the
case where you do not really care about holes. It just checks the page
count, which is valid information under any circumstances.

> > > The other problem I found is that your change will makes some
> > > contiguous zones to be considered as non-contiguous. Memory allocated
> > > by memblock API is also marked as PageResereved. If we consider this as
> > > a hole, we will set such a zone as non-contiguous.
> > 
> > Why would that be a problem? We shouldn't touch those pages anyway?
> 
> Skipping those pages in compaction are valid so no problem in this
> case.
> 
> The problem I mentioned above is that adding PageReserved() check in
> __pageblock_pfn_to_page() invalidates optimization by
> set_zone_contiguous(). In compaction, we need to get a valid struct
> page and it requires a lot of work. There is performance problem
> report due to this so set_zone_contiguous() optimization is added. It
> checks if the zone is contiguous or not in boot time. If zone is
> determined as contiguous, we can easily get a valid struct page in
> runtime without expensive checks.

OK, I see. I've had some vague understanding and the clarification helps.

> Your patch try to add PageReserved() to __pageblock_pfn_to_page(). It
> woule make that zone->contiguous usually returns false since memory
> used by memblock API is marked as PageReserved() and your patch regard
> it as a hole. It invalidates set_zone_contiguous() optimization and I
> worry about it.

OK, fair enough. I didn't consider memblock allocations. I will rethink
this patch but there are essentially 3 options
	- use a different criterion for the offline holes detection. I
	  have just realized we might do it by storing the online
	  information into the mem sections
	- drop this patch
	- move the PageReserved check down the chain into
	  isolate_freepages_block resp. isolate_migratepages_block

I would prefer 3 over 2 over 1. I definitely want to make this more
robust so 1 is preferable long term but I do not want this to be a
roadblock to the rest of the rework. Does that sound acceptable to you?
 
[..]
> Let me clarify my desire(?) for this issue.
> 
> 1. If pfn_valid() returns true, struct page has valid information, at
> least, in flags (zone id, node id, flags, etc...). So, we can use them
> without checking PageResereved().

This is no longer true after my rework. Pages are associated with the
zone during _onlining_ rather than when they are physically hotplugged.
Basically only the nid is set properly. Strictly speaking this is the
case also without my rework because the zone might change during online
phase so you cannot assume it is correct even now. It just happens that
it more or less works just fine.

> 2. pfn_valid() for offlined holes returns false. This can be easily
> (?) implemented by manipulating SECTION_MAP_MASK in hotplug code. I
> guess that there is no reason that pfn_valid() returns true for
> offlined holes. If there is, please let me know.

There is some code which really expects that pfn_valid returns true iff
there is a struct page and it doesn't care about the online status.
E.g. hotplug code itself so no, we cannot change pfn_valid. What we can
do though is to add pfn_to_online_page which would do the proper check.
I have already sent [1]. As noted above we can (ab)use the remaining bit
in SECTION_MAP_MASK to detect offline pages more robustly.
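
To make the distinction concrete, here is a minimal userspace sketch of the
idea, not the kernel implementation. The flag names follow the patch; the
array sizes, the sections[]/mem_map[] layout and the hot_add_section,
mark_section_online and mark_section_offline helpers are assumptions made
up for illustration. pfn_valid keeps answering "is there a struct page",
while pfn_to_online_page additionally requires the online bit.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy userspace model, not kernel code. Flag names follow the patch;
 * sizes, arrays and the hot_add/online/offline helpers are hypothetical.
 */
#define SECTION_MARKED_PRESENT	(1UL << 0)
#define SECTION_HAS_MEM_MAP	(1UL << 1)
#define SECTION_IS_ONLINE	(1UL << 2)

#define PAGES_PER_SECTION	8UL	/* toy value */
#define NR_MEM_SECTIONS		4UL	/* toy value */

struct page { unsigned long flags; };
struct mem_section { unsigned long section_mem_map; };

static struct mem_section sections[NR_MEM_SECTIONS];
static struct page mem_map[NR_MEM_SECTIONS * PAGES_PER_SECTION];

static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn / PAGES_PER_SECTION;
}

/* True as soon as a struct page exists, whether online or not. */
static bool pfn_valid(unsigned long pfn)
{
	unsigned long nr = pfn_to_section_nr(pfn);

	return nr < NR_MEM_SECTIONS &&
	       (sections[nr].section_mem_map & SECTION_HAS_MEM_MAP) != 0;
}

/* Returns the page only if its section carries the online bit. */
static struct page *pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = pfn_to_section_nr(pfn);

	if (nr >= NR_MEM_SECTIONS)
		return NULL;
	if (!(sections[nr].section_mem_map & SECTION_IS_ONLINE))
		return NULL;
	return &mem_map[pfn];
}

/* Physical hot-add: the memmap exists but nothing is online yet. */
static void hot_add_section(unsigned long nr)
{
	sections[nr].section_mem_map |= SECTION_MARKED_PRESENT |
					SECTION_HAS_MEM_MAP;
}

static void mark_section_online(unsigned long nr)
{
	sections[nr].section_mem_map |= SECTION_IS_ONLINE;
}

static void mark_section_offline(unsigned long nr)
{
	sections[nr].section_mem_map &= ~SECTION_IS_ONLINE;
}
```

In this model a hot-added but not yet onlined section is pfn_valid and still
yields NULL from pfn_to_online_page, which is the behaviour a walker that
reads page->flags would want.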

> 3. We don't need to check PageReserved() in most of pfn walkers in
> order to check offline holes.

We still have to distinguish those who care about offline pages from
those who do not care about it.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-17  8:15     ` Michal Hocko
@ 2017-04-20  1:27       ` Joonsoo Kim
  2017-04-20  7:28         ` Michal Hocko
  0 siblings, 1 reply; 341+ messages in thread
From: Joonsoo Kim @ 2017-04-20  1:27 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Mon, Apr 17, 2017 at 10:15:15AM +0200, Michal Hocko wrote:
> On Mon 17-04-17 14:47:20, Joonsoo Kim wrote:
> > On Sat, Apr 15, 2017 at 02:17:31PM +0200, Michal Hocko wrote:
> > > Hi,
> > > here I 3 more preparatory patches which I meant to send on Thursday but
> > > forgot... After more thinking about pfn walkers I have realized that
> > > the current code doesn't check offline holes in zones. From a quick
> > > review that doesn't seem to be a problem currently. Pfn walkers can race
> > > with memory offlining and with the original hotplug impementation those
> > > offline pages can change the zone but I wasn't able to find any serious
> > > problem other than small confusion. The new hotplug code, will not have
> > > any valid zone, though so those code paths should check PageReserved
> > > to rule offline holes. I hope I have addressed all of them in these 3
> > > patches. I would appreciate if Vlastimil and Jonsoo double check after
> > > me.
> > 
> > Hello, Michal.
> > 
> > s/Jonsoo/Joonsoo. :)
> 
> ups, sorry about that.
> 
> > I'm not sure that it's a good idea to add PageResereved() check in pfn
> > walkers. First, this makes struct page validity check as two steps,
> > pfn_valid() and then PageResereved().
> 
> Yes, those are two separate checkes because semantically they are
> different. Not all pfn walkers do care about the online status.

If an offlined page has no valid information, reading information
about offlined pages is just wrong. So, all pfn walkers that read
information about the page should care about it.

I guess that many callers of pfn_valid() are in this category.

> 
> > If we should not use struct page
> > in this case, it's better to pfn_valid() returns false rather than
> > adding a separate check. Anyway, we need to fix more places (all pfn
> > walker?) if we want to check validity by two steps.
> 
> Which pfn walkers you have in mind?

For example, kpagecount_read() in fs/proc/page.c. I searched it by
using pfn_valid().

> > The other problem I found is that your change will makes some
> > contiguous zones to be considered as non-contiguous. Memory allocated
> > by memblock API is also marked as PageResereved. If we consider this as
> > a hole, we will set such a zone as non-contiguous.
> 
> Why would that be a problem? We shouldn't touch those pages anyway?

Skipping those pages in compaction is valid so no problem in this
case.

The problem I mentioned above is that adding a PageReserved() check in
__pageblock_pfn_to_page() invalidates the optimization done by
set_zone_contiguous(). In compaction, we need to get a valid struct
page and that requires a lot of work. There was a performance problem
report because of this, so the set_zone_contiguous() optimization was
added. It checks at boot time whether the zone is contiguous. If the
zone is determined to be contiguous, we can easily get a valid struct
page at runtime without expensive checks.

Your patch tries to add PageReserved() to __pageblock_pfn_to_page(). It
would make zone->contiguous usually false, since memory used by the
memblock API is marked as PageReserved() and your patch regards it as a
hole. It invalidates the set_zone_contiguous() optimization and I
worry about it.
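
A rough userspace sketch of the concern, not the kernel code: the struct
layouts, the toy sizes, the init_toy_zone helper and the reserved_is_hole
switch are all assumptions made up for illustration. The zone is scanned
once with the slow check; if every pageblock passes, later lookups take the
fast path. Treating reserved pages as holes in the slow check makes that
scan fail for any zone that contains memblock-reserved memory.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; all sizes and fields below are hypothetical. */
#define NR_PAGES		32UL
#define PAGEBLOCK_NR_PAGES	8UL

struct zone {
	unsigned long start_pfn;
	unsigned long end_pfn;		/* exclusive */
	bool contiguous;
};

struct page {
	bool present;
	bool reserved;
	struct zone *zone;
};

static struct page mem_map[NR_PAGES];

/* When true, the slow path treats a reserved first page as a hole. */
static bool reserved_is_hole;

/* Slow path: validate both ends of the block (end_pfn is exclusive). */
static struct page *slow_pageblock_pfn_to_page(unsigned long start_pfn,
					       unsigned long end_pfn,
					       struct zone *zone)
{
	struct page *start_page, *end_page;

	if (start_pfn >= end_pfn || end_pfn > NR_PAGES)
		return NULL;

	start_page = &mem_map[start_pfn];
	end_page = &mem_map[end_pfn - 1];

	if (!start_page->present || !end_page->present)
		return NULL;
	if (reserved_is_hole && start_page->reserved)
		return NULL;
	if (start_page->zone != zone || end_page->zone != zone)
		return NULL;
	return start_page;
}

/* One-time scan: the zone is contiguous only if every block passes. */
static void set_zone_contiguous(struct zone *zone)
{
	unsigned long pfn, end;

	zone->contiguous = false;
	for (pfn = zone->start_pfn; pfn < zone->end_pfn;
	     pfn += PAGEBLOCK_NR_PAGES) {
		end = pfn + PAGEBLOCK_NR_PAGES;
		if (end > zone->end_pfn)
			end = zone->end_pfn;
		if (!slow_pageblock_pfn_to_page(pfn, end, zone))
			return;
	}
	zone->contiguous = true;
}

/* Fast path: skip all checks when the zone is known to be contiguous. */
static struct page *pageblock_pfn_to_page(unsigned long start_pfn,
					  unsigned long end_pfn,
					  struct zone *zone)
{
	if (zone->contiguous)
		return &mem_map[start_pfn];
	return slow_pageblock_pfn_to_page(start_pfn, end_pfn, zone);
}

/* Mark every page of the zone as present and belonging to it. */
static void init_toy_zone(struct zone *zone)
{
	unsigned long pfn;

	for (pfn = zone->start_pfn; pfn < zone->end_pfn; pfn++) {
		mem_map[pfn].present = true;
		mem_map[pfn].reserved = false;
		mem_map[pfn].zone = zone;
	}
}
```

With one reserved page at the start of a pageblock, the scan succeeds while
reserved_is_hole is false and fails once it is true, so every later lookup
in that zone falls back to the slow path.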

>  
> > And, I guess that it's not enough to check PageResereved() in
> > pageblock_pfn_to_page() in order to skip these pages in compaction. If
> > holes are in the middle of the pageblock, pageblock_pfn_to_page()
> > cannot catch it and compaction will use struct page for this hole.
> 
> Yes pageblock_pfn_to_page cannot catch it and it wouldn't with the
> current implementation anyway. So the implementation won't be any worse
> than with the current code. On the other hand offline holes will always
> fill the whole pageblock (assuming those are not spanning multiple
> memblocks).
>  
> > Therefore, I think that making pfn_valid() return false for not
> > onlined memory is a better solution for this problem. I don't know the
> > implementation detail for hotplug and I don't see your recent change
> > but we may defer memmap initialization until the zone is determined.
> > It will make pfn_valid() return false for un-initialized range.
> 
> I am not really sure. pfn_valid is used in many context and its only
> purpose is to tell whether pfn_to_page will return a valid struct page
> AFAIU.
> 
> I agree that having more checks is more error prone and we can add a
> helper pfn_to_valid_page or something similar but I believe we can do
> that on top of the current hotplug rework. This would require a non
> trivial amount of changes and I believe that a lacking check for the
> offline holes is not critical - we would (ab)use the lowest zone which
> is similar to (ab)using ZONE_NORMAL/MOVABLE with the original code.

I'm not objecting to your hotplug rework. In fact, I don't know the
relationship between this work and the hotplug rework. I agree
with checking offline holes but I don't like the design and
implementation of it.

Let me clarify my desire(?) for this issue.

1. If pfn_valid() returns true, struct page has valid information, at
least, in flags (zone id, node id, flags, etc...). So, we can use them
without checking PageReserved().

2. pfn_valid() for offlined holes returns false. This can be easily
(?) implemented by manipulating SECTION_MAP_MASK in hotplug code. I
guess that there is no reason that pfn_valid() returns true for
offlined holes. If there is, please let me know.

3. We don't need to check PageReserved() in most of pfn walkers in
order to check offline holes.

Thanks.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-17  5:47   ` your mail Joonsoo Kim
@ 2017-04-17  8:15     ` Michal Hocko
  2017-04-20  1:27       ` Joonsoo Kim
  0 siblings, 1 reply; 341+ messages in thread
From: Michal Hocko @ 2017-04-17  8:15 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Mon 17-04-17 14:47:20, Joonsoo Kim wrote:
> On Sat, Apr 15, 2017 at 02:17:31PM +0200, Michal Hocko wrote:
> > Hi,
> > here I 3 more preparatory patches which I meant to send on Thursday but
> > forgot... After more thinking about pfn walkers I have realized that
> > the current code doesn't check offline holes in zones. From a quick
> > review that doesn't seem to be a problem currently. Pfn walkers can race
> > with memory offlining and with the original hotplug impementation those
> > offline pages can change the zone but I wasn't able to find any serious
> > problem other than small confusion. The new hotplug code, will not have
> > any valid zone, though so those code paths should check PageReserved
> > to rule offline holes. I hope I have addressed all of them in these 3
> > patches. I would appreciate if Vlastimil and Jonsoo double check after
> > me.
> 
> Hello, Michal.
> 
> s/Jonsoo/Joonsoo. :)

ups, sorry about that.

> I'm not sure that it's a good idea to add PageResereved() check in pfn
> walkers. First, this makes struct page validity check as two steps,
> pfn_valid() and then PageResereved().

Yes, those are two separate checks because semantically they are
different. Not all pfn walkers care about the online status.

> If we should not use struct page
> in this case, it's better to pfn_valid() returns false rather than
> adding a separate check. Anyway, we need to fix more places (all pfn
> walker?) if we want to check validity by two steps.

Which pfn walkers you have in mind?

> The other problem I found is that your change will makes some
> contiguous zones to be considered as non-contiguous. Memory allocated
> by memblock API is also marked as PageResereved. If we consider this as
> a hole, we will set such a zone as non-contiguous.

Why would that be a problem? We shouldn't touch those pages anyway?
 
> And, I guess that it's not enough to check PageResereved() in
> pageblock_pfn_to_page() in order to skip these pages in compaction. If
> holes are in the middle of the pageblock, pageblock_pfn_to_page()
> cannot catch it and compaction will use struct page for this hole.

Yes pageblock_pfn_to_page cannot catch it and it wouldn't with the
current implementation anyway. So the implementation won't be any worse
than with the current code. On the other hand offline holes will always
fill the whole pageblock (assuming those are not spanning multiple
memblocks).
 
> Therefore, I think that making pfn_valid() return false for not
> onlined memory is a better solution for this problem. I don't know the
> implementation detail for hotplug and I don't see your recent change
> but we may defer memmap initialization until the zone is determined.
> It will make pfn_valid() return false for un-initialized range.

I am not really sure. pfn_valid is used in many contexts and its only
purpose is to tell whether pfn_to_page will return a valid struct page
AFAIU.

I agree that having more checks is more error prone and we can add a
helper pfn_to_valid_page or something similar but I believe we can do
that on top of the current hotplug rework. This would require a non
trivial amount of changes and I believe that a lacking check for the
offline holes is not critical - we would (ab)use the lowest zone which
is similar to (ab)using ZONE_NORMAL/MOVABLE with the original code.

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2017-04-15 12:17 ` Michal Hocko
@ 2017-04-17  5:47   ` Joonsoo Kim
  2017-04-17  8:15     ` Michal Hocko
  0 siblings, 1 reply; 341+ messages in thread
From: Joonsoo Kim @ 2017-04-17  5:47 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Jerome Glisse, Reza Arbab, Yasuaki Ishimatsu,
	qiuxishi, Kani Toshimitsu, slaoub, Andi Kleen, David Rientjes,
	Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, LKML

On Sat, Apr 15, 2017 at 02:17:31PM +0200, Michal Hocko wrote:
> Hi,
> here I 3 more preparatory patches which I meant to send on Thursday but
> forgot... After more thinking about pfn walkers I have realized that
> the current code doesn't check offline holes in zones. From a quick
> review that doesn't seem to be a problem currently. Pfn walkers can race
> with memory offlining and with the original hotplug impementation those
> offline pages can change the zone but I wasn't able to find any serious
> problem other than small confusion. The new hotplug code, will not have
> any valid zone, though so those code paths should check PageReserved
> to rule offline holes. I hope I have addressed all of them in these 3
> patches. I would appreciate if Vlastimil and Jonsoo double check after
> me.

Hello, Michal.

s/Jonsoo/Joonsoo. :)

I'm not sure that it's a good idea to add a PageReserved() check in pfn
walkers. First, this makes the struct page validity check two steps,
pfn_valid() and then PageReserved(). If we should not use struct page
in this case, it's better for pfn_valid() to return false rather than
adding a separate check. Anyway, we need to fix more places (all pfn
walkers?) if we want to check validity in two steps.

The other problem I found is that your change will make some
contiguous zones be considered non-contiguous. Memory allocated
by the memblock API is also marked as PageReserved. If we consider this
a hole, we will set such a zone as non-contiguous.

And, I guess that it's not enough to check PageReserved() in
pageblock_pfn_to_page() in order to skip these pages in compaction. If
holes are in the middle of the pageblock, pageblock_pfn_to_page()
cannot catch it and compaction will use struct page for this hole.

Therefore, I think that making pfn_valid() return false for not
onlined memory is a better solution for this problem. I don't know the
implementation detail for hotplug and I don't see your recent change
but we may defer memmap initialization until the zone is determined.
It will make pfn_valid() return false for un-initialized range.

Thanks.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2016-11-16 14:25   ` Steven Rostedt
@ 2016-11-16 14:28     ` Peter Zijlstra
  0 siblings, 0 replies; 341+ messages in thread
From: Peter Zijlstra @ 2016-11-16 14:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Christoph Lameter, Daniel Vacek, Daniel Bristot de Oliveira,
	Tommaso Cucinotta, LKML, linux-rt-users, Ingo Molnar

On Wed, Nov 16, 2016 at 09:25:43AM -0500, Steven Rostedt wrote:
> On Wed, 16 Nov 2016 11:40:14 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> 
> > On top of which, the implementation had issues; now I know you're the
> > blinder kind of person that disregards everything not in his immediate
> > interest, but if you'd looked at the patch you'd have seen he'd added
> > code the idle entry path, which will slow down every single to-idle
> > transition.
> 
> Isn't to-idle a bit bloated anyway? Or has that been fixed. I know
> there was some issues with idle_balance() which can add latency to
> wakeups. idle_balance() is also in the to-idle path.
> 

Yes it is too heavy as is, but just stacking more crap in just because
it's already expensive seems the wrong way around.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2016-11-16 10:40 ` your mail Peter Zijlstra
@ 2016-11-16 14:25   ` Steven Rostedt
  2016-11-16 14:28     ` Peter Zijlstra
  0 siblings, 1 reply; 341+ messages in thread
From: Steven Rostedt @ 2016-11-16 14:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Lameter, Daniel Vacek, Daniel Bristot de Oliveira,
	Tommaso Cucinotta, LKML, linux-rt-users, Ingo Molnar

On Wed, 16 Nov 2016 11:40:14 +0100
Peter Zijlstra <peterz@infradead.org> wrote:


> On top of which, the implementation had issues; now I know you're the
> blinder kind of person that disregards everything not in his immediate
> interest, but if you'd looked at the patch you'd have seen he'd added
> code the idle entry path, which will slow down every single to-idle
> transition.

Isn't to-idle a bit bloated anyway? Or has that been fixed? I know
there were some issues with idle_balance() which can add latency to
wakeups. idle_balance() is also in the to-idle path.

Note, that this is a sched feature which would be a nop (jump_label)
when disabled. And I'm sure it could also be optimized to be a static
inline as well when it is enabled.

I'm not saying we need to go this approach, but I'm just saying that
the to-idle issue is a bit of a red herring.

-- Steve

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2016-11-15 20:29 Christoph Lameter
@ 2016-11-16 10:40 ` Peter Zijlstra
  2016-11-16 14:25   ` Steven Rostedt
  0 siblings, 1 reply; 341+ messages in thread
From: Peter Zijlstra @ 2016-11-16 10:40 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Daniel Vacek, Daniel Bristot de Oliveira, Tommaso Cucinotta,
	LKML, linux-rt-users, Steven Rostedt, Ingo Molnar

On Tue, Nov 15, 2016 at 02:29:16PM -0600, Christoph Lameter wrote:
> 
> > > There is a deadlock, Peter!!!
> >
> > Describe please? Also, have you tried disabling RT_RUNTIME_SHARE ?
> >
> 
> 
> The description was given earlier in the the thread and the drawbacks of
> using RT_RUNTIME_SHARE as well.

I've not seen a deadlock described. It either was an unbounded priority
inversion or a starvation issue, both of which are 'design' features of
the !rt kernel.

Neither thing is new, so it's not a regression either.

And, as stated, I'm not really happy to muck with this known troublesome
code and add features for which we must then maintain feature parity
when replacing it either.

On top of which, the implementation had issues; now I know you're the
blinder kind of person that disregards everything not in his immediate
interest, but if you'd looked at the patch you'd have seen he'd added
code to the idle entry path, which will slow down every single to-idle
transition.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2016-09-20 22:21 Andrew Banman
@ 2016-09-20 22:23 ` andrew banman
  0 siblings, 0 replies; 341+ messages in thread
From: andrew banman @ 2016-09-20 22:23 UTC (permalink / raw)
  To: Andrew Banman
  Cc: mingo, akpm, tglx, hpa, travis, rja, sivanich, x86, linux-kernel

Subject line got dropped the first time around. Will send again.

Apologies for the chatter,

Andrew

On Tue, Sep 20, 2016 at 05:21:06PM -0500, Andrew Banman wrote:
> From Andrew Banman <abanman@sgi.com> # This line is ignored.
> From: Andrew Banman <abanman@sgi.com>
> Subject: [PATCH 0/9] arch/x86/platform/uv: add UV4 support to BAU
> In-Reply-To: 
> 
> The following patch set adds support for UV4 architecture to the Broadcast
> Assist Unit (BAU). Major hardware changes to the BAU require these fixes to
> ensure correct operation and to avoid illegal MMR writes.
> 
>  arch/x86/include/asm/uv/uv_bau.h |  45 ++----------------------------
>  arch/x86/platform/uv/tlb_uv.c    | 114 ++++++++++++++++++++++++---------------------------------------
> -------------
> 
> The patch set can be thought of in three logical groups:
> 
> 1) General cleanup.
> 
>  [PATCH 1/9] arch/x86/platform/uv: BAU cleanup: update printks
>  [PATCH 2/9] arch/x86/platform/uv: BAU cleanup: pq_init
>  [PATCH 3/9] arch/x86/platform/uv: BAU replace uv_physnodeaddr
> 
>  These housekeeping patches make the subsequent UV4 patches clearer,
>  and they should be done in any case.
> 
> 
> 2) Implement a new scheme to abstract UV version-specific functions.
> 
>  [PATCH 4/9] arch/x86/platform/uv: BAU add generic function pointers
>  [PATCH 5/9] arch/x86/platform/uv: BAU use generic function pointers
> 
>  We add a struct of function pointers to define version-specific BAU
>  operations. The philosophy is to abstract functions that perform the same
>  operation on all UV versions but have different implementations. This will
>  simplify their use in the body of the driver code and greatly simplify the
>  UV4 patches to follow.
> 
> 
> 3) Add UV4 functionality.
> 
>  [PATCH 6/9] arch/x86/platform/uv: BAU UV4 populate uvhub_version
>  [PATCH 7/9] arch/x86/platform/uv: BAU UV4 disable software timeout
>  [PATCH 8/9] arch/x86/platform/uv: BAU UV4 fix payload queue setup
>  [PATCH 9/9] arch/x86/platform/uv: BAU UV4 add version-specific
> 
>  These patches feature a minimal set of changes to make the BAU on UV4
>  operational.
> 
> 
> This patch set has been tested for regressions on pre-UV4 architectures and
> for correct functionality on UV4. The patches apply cleanly to 4.8-rc7.
> Fine-tuned performance tweaking for UV4 will come in a future patch set.
> 
> 
> Thank you,
> 
> Andrew Banman

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2015-08-03  6:18 Shraddha Barke
  2015-08-03  7:12 ` your mail Sudip Mukherjee
@ 2015-08-03  7:24 ` Dan Carpenter
  1 sibling, 0 replies; 341+ messages in thread
From: Dan Carpenter @ 2015-08-03  7:24 UTC (permalink / raw)
  To: Shraddha Barke
  Cc: Oleg Drokin, Al Viro, Julia Lawall, aybuke ozdemir,
	Andreas Dilger, John L. Hammond, Frank Zago, Greg Kroah-Hartman,
	HPDD-discuss, devel, linux-kernel

Returning EINVAL here is the wrong thing.  Just leave the code as is.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2015-08-03  6:18 Shraddha Barke
@ 2015-08-03  7:12 ` Sudip Mukherjee
  2015-08-03  7:24 ` Dan Carpenter
  1 sibling, 0 replies; 341+ messages in thread
From: Sudip Mukherjee @ 2015-08-03  7:12 UTC (permalink / raw)
  To: Shraddha Barke
  Cc: Oleg Drokin, Al Viro, Julia Lawall, aybuke ozdemir,
	Andreas Dilger, John L. Hammond, Frank Zago, Greg Kroah-Hartman,
	HPDD-discuss, devel, linux-kernel

On Mon, Aug 03, 2015 at 11:48:59AM +0530, Shraddha Barke wrote:
> From b67c6c20455b04b77447ab4561e44f1a75dd978d Mon Sep 17 00:00:00 2001
> From: Shraddha Barke <shraddha.6596@gmail.com>
> Date: Mon, 3 Aug 2015 11:34:19 +0530
> Subject: [PATCH] Staging : lustre : Use -EINVAL instead of -ENOSYS

You do not need these in the commit message.

regards
sudip

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2015-01-21 23:57   ` Jason Gunthorpe
  2015-01-22 20:50     ` One Thousand Gnomes
@ 2015-01-28 22:09     ` atull
  1 sibling, 0 replies; 341+ messages in thread
From: atull @ 2015-01-28 22:09 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: One Thousand Gnomes, michal.simek, linux-kernel,
	delicious.quinoa, dinguyen, yvanderv

On Wed, 21 Jan 2015, Jason Gunthorpe wrote:

> [unfutzd the cc a bit, sorry]
> 
> On Wed, Jan 21, 2015 at 04:19:17PM -0600, atull wrote:
> > > If we consider a Zynq, for instance, there are a number of clock nets
> > > that the CPU drives into the FPGA fabric. These nets are controlled by
> > > the kernel CLK framework. So, before we program the FPGA bitstream the
> > > clocks must be setup properly.
> > 
> > It's pretty normal for drivers to find out what their clocks are from
> > the DT and enable them.  
> 
> Sure, but the clocks are bitfile specific, and not related to
> programming. Some bitfiles may not require CPU clocks at all.

The bitfile specific clocks are the clocks that are turned on by the device
driver for that chunk of the bitfile.  So those clocks can be specified
the same way as clocks are specified in the DT.

> 
> > Yes the DT overlay can specify:
> >   * clock info
> >   * firmware file name if user is doing it that way
> >   * fpga manager - specific info
> >     * compatiblity string specifies what type of fpga it is
> >     * which fpga this image should go into
> >   * fpga/processor bridges to enable
> >   * driver(s) info that is dependent on the above
> 
> All sounds reasonable
> 
> > > Today in our Zynq systems we have the bootloader preconfigure
> > > everything for what we are trying to do - but that is specific to the
> > > particular FPGA we are expecting to run, and eg, I expect if we ran a
> > > kernel using the Zynq clk framework there would be problems with it
> > > mangling the configuration.
> > > 
> > > So there would have to be some kind of sequence where the DT is
> > > loaded, the zynq specific FPGA programmer does its pre setup, then the
> > > request_firmarw/fpga_program_fw loads the bitstream and another pass
> > > for a zynq specific post setup and completion handshake?
> > 
> > fpga-mgr.c has the concept that each different FPGA family will
> > likely need its own way of doing these 3 steps:
> >  * write_init (prepare fgpa for receiving configuration information)
> >  * write (write configuration info to the fpga)
> >  * write_complete (done writing, put fpga into running mode)
> > 
> > There are callbacks into the manufacturor/fpga family specific lower
> > level driver to do these things (as part of the "fpga_manager_ops"
> 
> I think the missing bit here is that there are bitfile specific things
> as well.
> 
> The functions above are fine for a generic manufacturer bitfile loader,
> ie Xilinx GPIO twiddling, Altera JTAG, Zynq DMA, etc.
> 
> But wrappered around that should be another set of functions that are
> bitfile specific.
> 
> Like Zynq-PL-boot-protocol-v1 - which deasserts a reset line and waits
> for the PL to signal back that it has completed reset.
> 
> Or jgg-boot-protocol-v1 which monitors the configuration GPIOs for a
> specific ready pattern..
> 
> Or ... 
> 
> All of those procedures depend on the bitfile to implement something.
> 
> > > The DT needs to specify not only the bitstream programming HW to use
> > > but this ancillary programming protocol. There are many ways to do
> > > a out of reset and completion handshake on Zynq, for instance.
> > 
> > Currently the lower level driver supports only one preferred method
> > of programming.  I guess we could add an enumerated DT property to
> > select programming protocol.  It would have to be manufacturor specific.
> > Alternatively it could be encoded into the compatibility string if that
> > makes sense.
> 
> From a DT perspective I'd expect it to look something like:
> 
> soc {
> 
>   // This is the 'how to program a bitstream'
>   fpga-bitstream0: zynq_pl_dma 
>   {
>      compatible = "xilinx,zynq,pl,dma";
>      regs = <..>
>   }
> 
>   fpga: ..
>   {
>      // This is 'what is in the bitstream'
>      boot-protocol = "xilinx,zynq,protocol1";
>      compatible = "jgg,fpga-foo-bar";
>      manager = @fpga-bitstream0
>      clocks = ..
>      clock-frequency = ...
> 
>      zynq_axi_gp0
>      {
>       // Settings for a CPU to FPGA AXI bridge
>        axi setting 1 = ...
>        [...]
>      }
>   }
> }

That's good.  It also needs to specify the driver(s) for the hardware that the
bitstream will instantiate.

> 
> I could also see integrating with the regulator framework as well to
> power up FPGA specific controllable power supplies.
> 
> > > And then user space would need to have control points between each of
> > > these steps.
> 
> > We could have two options, configurable from the ioctl:
> > * When the DT is loaded, do everything
> > * Even when the DT is loaded, wait for further instructions from ioctl or
> > 
> > Freewheeling flow:
> > * Tell ioctl that we are in freewheeling mode
> > * Load DT overlay
> > 
> > Tightly controlled flow:
> > * Tell ioctl that we are doing things stepwise
> > * Load DT overlay
> > * Use ioctl to step through getting the fpga loaded and known to be happy
> 
> I think you've certainly got the idea!
> 
> Thinking it through some more, if the kernel DT tells the fpga-mgr
> that it is 
> 
>   boot-protocol = "xilinx,zynq,protocol1","jgg,foo-bar";
> 
> Then the kernel should refuse to start it if it doesn't know how to do
> both 'xilinx,zynq,protocol1' and 'jgg,foo-bar'.
> 
> Thus the user space ioctl interface becomes more of how to implement a
> boot protocol helper in userspace? With the proper locking - while the
> helper is working the FPGA cannot be messed with..
> 
> Jason
> 

* Re: your mail
  2015-01-21 23:57   ` Jason Gunthorpe
@ 2015-01-22 20:50     ` One Thousand Gnomes
  2015-01-28 22:09     ` atull
  1 sibling, 0 replies; 341+ messages in thread
From: One Thousand Gnomes @ 2015-01-22 20:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: atull, michal.simek, linux-kernel, delicious.quinoa, dinguyen, yvanderv

> The functions above are fine for a generic manufacturer bitfile loader,
> ie Xilinx GPIO twiddling, Altera JTAG, Zynq DMA, etc.
> 
> But wrappered around that should be another set of functions that are
> bitfile specific.

And also a transport layer. You can have the same FPGA with the same
loader protocol off multiple different bus types (from USB to on CPU die
and all the way between).

Alan

* Re: your mail
       [not found] ` <alpine.DEB.2.02.1501211520150.13480@linuxheads99>
@ 2015-01-21 23:57   ` Jason Gunthorpe
  2015-01-22 20:50     ` One Thousand Gnomes
  2015-01-28 22:09     ` atull
  0 siblings, 2 replies; 341+ messages in thread
From: Jason Gunthorpe @ 2015-01-21 23:57 UTC (permalink / raw)
  To: atull
  Cc: One Thousand Gnomes, michal.simek, linux-kernel,
	delicious.quinoa, dinguyen, yvanderv

[unfutzd the cc a bit, sorry]

On Wed, Jan 21, 2015 at 04:19:17PM -0600, atull wrote:
> > If we consider a Zynq, for instance, there are a number of clock nets
> > that the CPU drives into the FPGA fabric. These nets are controlled by
> > the kernel CLK framework. So, before we program the FPGA bitstream the
> > clocks must be setup properly.
> 
> It's pretty normal for drivers to find out what their clocks are from
> the DT and enable them.  

Sure, but the clocks are bitfile specific, and not related to
programming. Some bitfiles may not require CPU clocks at all.

> Yes the DT overlay can specify:
>   * clock info
>   * firmware file name if user is doing it that way
>   * fpga manager - specific info
>     * compatiblity string specifies what type of fpga it is
>     * which fpga this image should go into
>   * fpga/processor bridges to enable
>   * driver(s) info that is dependent on the above

All sounds reasonable

> > Today in our Zynq systems we have the bootloader preconfigure
> > everything for what we are trying to do - but that is specific to the
> > particular FPGA we are expecting to run, and eg, I expect if we ran a
> > kernel using the Zynq clk framework there would be problems with it
> > mangling the configuration.
> > 
> > So there would have to be some kind of sequence where the DT is
> > loaded, the zynq specific FPGA programmer does its pre setup, then the
> > request_firmarw/fpga_program_fw loads the bitstream and another pass
> > for a zynq specific post setup and completion handshake?
> 
> fpga-mgr.c has the concept that each different FPGA family will
> likely need its own way of doing these 3 steps:
>  * write_init (prepare fgpa for receiving configuration information)
>  * write (write configuration info to the fpga)
>  * write_complete (done writing, put fpga into running mode)
> 
> There are callbacks into the manufacturor/fpga family specific lower
> level driver to do these things (as part of the "fpga_manager_ops"

I think the missing bit here is that there are bitfile specific things
as well.

The functions above are fine for a generic manufacturer bitfile loader,
ie Xilinx GPIO twiddling, Altera JTAG, Zynq DMA, etc.

But wrappered around that should be another set of functions that are
bitfile specific.

Like Zynq-PL-boot-protocol-v1 - which deasserts a reset line and waits
for the PL to signal back that it has completed reset.

Or jgg-boot-protocol-v1 which monitors the configuration GPIOs for a
specific ready pattern..

Or ... 

All of those procedures depend on the bitfile to implement something.
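
A rough user-space sketch of that layering, for illustration only. The step names write_init, write and write_complete come from the thread; the struct layouts, the boot_protocol hooks, fpga_load() and the demo_* callbacks are all hypothetical, and the real fpga_manager_ops callbacks take different arguments.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the three manager steps named in the thread.
 * Assumption: the real fpga_manager_ops callbacks have other signatures. */
struct mgr_ops {
	int (*write_init)(void *priv);
	int (*write)(void *priv, const char *buf, size_t len);
	int (*write_complete)(void *priv);
};

/* Hypothetical bitfile-specific hooks wrapped around the generic loader. */
struct boot_protocol {
	int (*pre_setup)(void *priv);   /* e.g. set up clocks, hold reset */
	int (*post_setup)(void *priv);  /* e.g. release reset, wait for ready */
};

/* Run the whole sequence and stop at the first step that fails. */
int fpga_load(const struct mgr_ops *mgr, const struct boot_protocol *proto,
	      void *priv, const char *buf, size_t len)
{
	int ret;

	if (proto && proto->pre_setup) {
		ret = proto->pre_setup(priv);
		if (ret)
			return ret;
	}
	ret = mgr->write_init(priv);
	if (ret)
		return ret;
	ret = mgr->write(priv, buf, len);
	if (ret)
		return ret;
	ret = mgr->write_complete(priv);
	if (ret)
		return ret;
	if (proto && proto->post_setup)
		return proto->post_setup(priv);
	return 0;
}

/* Demo callbacks: each appends one letter to a trace string in priv. */
static int trace_step(void *priv, char c)
{
	char *trace = priv;
	size_t n = strlen(trace);

	trace[n] = c;
	trace[n + 1] = '\0';
	return 0;
}

static int demo_pre(void *priv) { return trace_step(priv, 'p'); }
static int demo_init(void *priv) { return trace_step(priv, 'i'); }
static int demo_write(void *priv, const char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return trace_step(priv, 'w');
}
static int demo_complete(void *priv) { return trace_step(priv, 'c'); }
static int demo_post(void *priv) { return trace_step(priv, 'd'); }

const struct mgr_ops demo_mgr = {
	.write_init = demo_init,
	.write = demo_write,
	.write_complete = demo_complete,
};
const struct boot_protocol demo_proto = {
	.pre_setup = demo_pre,
	.post_setup = demo_post,
};
```

Passing a NULL boot_protocol leaves just the generic three-step load, which is the case of a bitfile that needs no handshake at all.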

> > The DT needs to specify not only the bitstream programming HW to use
> > but this ancillary programming protocol. There are many ways to do
> > a out of reset and completion handshake on Zynq, for instance.
> 
> Currently the lower level driver supports only one preferred method
> of programming.  I guess we could add an enumerated DT property to
> select programming protocol.  It would have to be manufacturor specific.
> Alternatively it could be encoded into the compatibility string if that
> makes sense.

From a DT perspective I'd expect it to look something like:

soc {

  // This is the 'how to program a bitstream'
  fpga-bitstream0: zynq_pl_dma 
  {
     compatible = "xilinx,zynq,pl,dma";
     regs = <..>
  }

  fpga: ..
  {
     // This is 'what is in the bitstream'
     boot-protocol = "xilinx,zynq,protocol1";
     compatible = "jgg,fpga-foo-bar";
     manager = @fpga-bitstream0
     clocks = ..
     clock-frequency = ...

     zynq_axi_gp0
     {
      // Settings for a CPU to FPGA AXI bridge
       axi setting 1 = ...
       [...]
     }
  }
}

I could also see integrating with the regulator framework as well to
power up FPGA specific controllable power supplies.

> > And then user space would need to have control points between each of
> > these steps.

> We could have two options, configurable from the ioctl:
> * When the DT is loaded, do everything
> * Even when the DT is loaded, wait for further instructions from ioctl or
> 
> Freewheeling flow:
> * Tell ioctl that we are in freewheeling mode
> * Load DT overlay
> 
> Tightly controlled flow:
> * Tell ioctl that we are doing things stepwise
> * Load DT overlay
> * Use ioctl to step through getting the fpga loaded and known to be happy

I think you've certainly got the idea!

Thinking it through some more, if the kernel DT tells the fpga-mgr
that it is 

  boot-protocol = "xilinx,zynq,protocol1","jgg,foo-bar";

Then the kernel should refuse to start it if it doesn't know how to do
both 'xilinx,zynq,protocol1' and 'jgg,foo-bar'.
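
A minimal sketch of that all-or-nothing check, in user-space C. The known_protocols table and the function names are made up for illustration; only the two protocol strings come from the example above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of boot protocols this kernel build can drive. */
static const char *const known_protocols[] = {
	"xilinx,zynq,protocol1",
};

static bool protocol_known(const char *name)
{
	size_t count = sizeof(known_protocols) / sizeof(known_protocols[0]);

	for (size_t i = 0; i < count; i++)
		if (strcmp(known_protocols[i], name) == 0)
			return true;
	return false;
}

/* Allow the FPGA to start only if every requested protocol is known. */
bool can_start_fpga(const char *const *requested, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!protocol_known(requested[i]))
			return false;
	return true;
}
```

With only 'xilinx,zynq,protocol1' in the table, a request that also lists 'jgg,foo-bar' is refused as a whole rather than partially started.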

Thus the user space ioctl interface becomes more of how to implement a
boot protocol helper in userspace? With the proper locking - while the
helper is working the FPGA cannot be messed with..

Jason

* Re: your mail
  2014-10-15  8:10 Christoph Lameter
@ 2014-10-27 15:07 ` Tejun Heo
  0 siblings, 0 replies; 341+ messages in thread
From: Tejun Heo @ 2014-10-27 15:07 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-kernel

Hello, Christoph.

On Wed, Oct 15, 2014 at 03:10:37AM -0500, Christoph Lameter wrote:
> Subject: Convert remaining __get_cpu_var uses
> 
> During the 3.18 merge period additional __get_cpu_var uses were
> added. The patch converts these to this_cpu_ptr().
> 
> [This does not address the powerpc issue where the conversion
> patches were routed directly to the powerpc maintainers but were
> not applied in the merge period. Will have to be handled separately]
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

Can you please repost with proper subject line and the subsys
maintainers cc'd?

Thanks.

-- 
tejun

* Re: your mail
  2014-09-01 15:47 sunwxg
@ 2014-09-01 17:01 ` Dan Carpenter
  0 siblings, 0 replies; 341+ messages in thread
From: Dan Carpenter @ 2014-09-01 17:01 UTC (permalink / raw)
  To: sunwxg
  Cc: Greg Kroah-Hartman, Dulshani Gunawardhana, Josh Triplett,
	John L. Hammond, Andreas Dilger, Chi Pham, Oleg Drokin, devel,
	linux-kernel

No subject.

It should be a subject about adding spaces.

regards,
dan carpenter



* Re: your mail
       [not found] <1409556896-21523-2-git-send-email-xiaoguang_wang5188@qq.com>
@ 2014-09-01  8:04 ` Dan Carpenter
  0 siblings, 0 replies; 341+ messages in thread
From: Dan Carpenter @ 2014-09-01  8:04 UTC (permalink / raw)
  To: sunwxg
  Cc: Benjamin Romer, David Kershner, Greg Kroah-Hartman, Ken Cox,
	Iulia Manda, Luis R. Rodriguez, Masaru Nomura, devel,
	sparmaintainer, linux-kernel

On Mon, Sep 01, 2014 at 03:34:56PM +0800, sunwxg wrote:
> From: Sun Wang <xiaoguang_wang5188@qq.com>
> 
> Subject: [PATCH] staging: unisys: visorutil: procobjecttree: fix coding style issue 
> 

Your email headers are mangled.  The subject is vague.

regards,
dan carpenter


* Re: your mail
  2014-07-09  1:03 James Ban
@ 2014-07-09  7:56 ` Mark Brown
  0 siblings, 0 replies; 341+ messages in thread
From: Mark Brown @ 2014-07-09  7:56 UTC (permalink / raw)
  To: James Ban; +Cc: Liam Girdwood, Support Opensource, LKML, David Dajun Chen

On Wed, Jul 09, 2014 at 10:03:32AM +0900, James Ban wrote:

> > > +	ret = regmap_read(chip->regmap, DA9211_REG_EVENT_B, &reg_val);
> > > +	if (ret < 0)
> > > +		goto error_i2c;

> > > +	if (reg_val & DA9211_E_OV_CURR_A) {

> > > +	if (reg_val & DA9211_E_OV_CURR_B) {

> > > +	return IRQ_HANDLED;

> > This is buggy - the driver should only return IRQ_HANDLED if it handled the
> > interrupt somehow, otherwise it should return IRQ_NONE and let the interrupt
> > core handle things.  This is especially important since the device appears to
> > require that interrupts are explicitly acknoweldged so if something is flagged
> > but not handled the interrupt will just sit constantly asserted.

> Basically all interrupts are masked when the chip wakes up. 
> Only two interrupts are unmasked at the start of driver like below.

I know that's the intention but the code should still be written
robustly - something might go wrong somewhere which causes another
interrupt to be enabled, or we might even gain support for shared
threaded interrupts in the interrupt core and someone could then
try to use that in a system.
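
A small user-space sketch of the robust pattern being asked for: report IRQ_HANDLED only when a known event bit was actually seen. The enum is a local stand-in for the kernel's irqreturn_t, the bit positions are placeholders rather than the real DA9211 register layout, and handle_event_b() is a hypothetical name.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's irqreturn_t values. */
enum irq_result { IRQ_NONE = 0, IRQ_HANDLED = 1 };

/* Placeholder bit positions; the real values live in the driver header. */
#define E_OV_CURR_A (1u << 0)
#define E_OV_CURR_B (1u << 1)

/* Decide the return value from the event register contents.
 * ack_mask receives the bits that were handled and need acknowledging. */
enum irq_result handle_event_b(uint32_t reg_val, uint32_t *ack_mask)
{
	uint32_t handled = 0;

	if (reg_val & E_OV_CURR_A)
		handled |= E_OV_CURR_A;   /* notify over-current on A here */
	if (reg_val & E_OV_CURR_B)
		handled |= E_OV_CURR_B;   /* notify over-current on B here */

	*ack_mask = handled;
	return handled ? IRQ_HANDLED : IRQ_NONE;
}
```

An unexpected bit alone yields IRQ_NONE, so the interrupt core gets the chance to notice a stuck line instead of being told everything is fine.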

* Re: your mail
       [not found] <CAM0G4ztXWM5kw6dV4WRrTVJBMmeJDXuRnbeRBE603hM+7c=PCg@mail.gmail.com>
@ 2014-02-25 15:01 ` Will Deacon
  0 siblings, 0 replies; 341+ messages in thread
From: Will Deacon @ 2014-02-25 15:01 UTC (permalink / raw)
  To: srikanth TS; +Cc: ts.srikanth, linux-kernel, iommu, sungjinn.chung

On Tue, Feb 25, 2014 at 11:20:11AM +0000, srikanth TS wrote:
> 
> On Feb 25, 2014 2:28 AM, "Will Deacon" <will.deacon@arm.com<mailto:will.deacon@arm.com>> wrote:
> >
> > On Mon, Feb 24, 2014 at 03:12:21PM +0000, srikanth TS wrote:
> > > Hi Will Deacon,
> >
> > Hello,
> >
> > > Currently SMMU driver expecting all stream ID used by respective master
> > > should be defined in the DT.
> > >
> > > We want to know how to handle in the case of virtual functions dynamically
> > > created and destroyed.
> > >
> > > Is PCI driver responsible for creating stream ID respective BDand
> > > requesting SMMU to add to the mapping table[stream Id to context mapping
> > > table]?
> > >
> > > Or is there any right way of doing it?
> >
> > Correct, the driver currently doesn't support dynamic mappings (mainly
> > because I didn't want to try and invent something that I couldn't test).
> >
> > There are a couple of ways to solve this:
> >
> >   (1) Add a way for a PCI RC to dynamically allocate StreamIDs on an SMMU
> >       within a fixed range. That would probably need some code in the bus
> >       layer, so that a bus notifier can kick and call back to the relevant
> >       SMMU.
> 
> I think first way of solving seems to be better, because we don't know how many
> VF are used and i feel its not good idea to keep whole list of streamID [which is
> equal to max num vf] in DT. Again in this method we need to generate the stream ID
> dynamically whenever VF is added in pci iov driver side. And then pass that
> stream ID to SMMU.
> 
> Is it ok this way?  Or you prefer 2nd way which is simpler.

I'm happy either way, but I'd need to see some patches before I can merge
anything ;)

Will

* RE: your mail
  2014-02-24 17:28 ` Will Deacon
@ 2014-02-25 11:28   ` Varun Sethi
  0 siblings, 0 replies; 341+ messages in thread
From: Varun Sethi @ 2014-02-25 11:28 UTC (permalink / raw)
  To: Will Deacon, srikanth TS; +Cc: iommu, sungjinn.chung, linux-kernel, ts.srikanth



> -----Original Message-----
> From: iommu-bounces@lists.linux-foundation.org [mailto:iommu-
> bounces@lists.linux-foundation.org] On Behalf Of Will Deacon
> Sent: Monday, February 24, 2014 10:59 PM
> To: srikanth TS
> Cc: iommu@lists.linux-foundation.org; sungjinn.chung@samsung.com; linux-
> kernel@vger.kernel.org; ts.srikanth@samsung.com
> Subject: Re: your mail
> 
> On Mon, Feb 24, 2014 at 03:12:21PM +0000, srikanth TS wrote:
> > Hi Will Deacon,
> 
> Hello,
> 
> > Currently SMMU driver expecting all stream ID used by respective
> > master should be defined in the DT.
> >
> > We want to know how to handle in the case of virtual functions
> > dynamically created and destroyed.
> >
> > Is PCI driver responsible for creating stream ID respective BDand
> > requesting SMMU to add to the mapping table[stream Id to context
> > mapping table]?
> >
> > Or is there any right way of doing it?
> 
> Correct, the driver currently doesn't support dynamic mappings (mainly
> because I didn't want to try and invent something that I couldn't test).
> 
> There are a couple of ways to solve this:
> 
>   (1) Add a way for a PCI RC to dynamically allocate StreamIDs on an SMMU
>       within a fixed range. That would probably need some code in the bus
>       layer, so that a bus notifier can kick and call back to the
> relevant
>       SMMU.
This could be done in the add-device notifier. I am working on a similar
(non-PCI) hot-plug device infrastructure for the ARM SMMU driver. I will
post an RFC patch by next week.

-Varun


* Re: your mail
       [not found] <CAM0G4zvu1BHcOrSgBuobvb-+fVsNWXjXdzZdV51T70B9_ZC4XQ@mail.gmail.com>
@ 2014-02-24 17:28 ` Will Deacon
  2014-02-25 11:28   ` Varun Sethi
  0 siblings, 1 reply; 341+ messages in thread
From: Will Deacon @ 2014-02-24 17:28 UTC (permalink / raw)
  To: srikanth TS; +Cc: iommu, linux-kernel, ts.srikanth, sungjinn.chung

On Mon, Feb 24, 2014 at 03:12:21PM +0000, srikanth TS wrote:
> Hi Will Deacon,

Hello,

> Currently SMMU driver expecting all stream ID used by respective master
> should be defined in the DT.
> 
> We want to know how to handle in the case of virtual functions dynamically
> created and destroyed.
> 
> Is PCI driver responsible for creating stream ID respective BDand
> requesting SMMU to add to the mapping table[stream Id to context mapping
> table]?
> 
> Or is there any right way of doing it?

Correct, the driver currently doesn't support dynamic mappings (mainly
because I didn't want to try and invent something that I couldn't test).

There are a couple of ways to solve this:

  (1) Add a way for a PCI RC to dynamically allocate StreamIDs on an SMMU
      within a fixed range. That would probably need some code in the bus
      layer, so that a bus notifier can kick and call back to the relevant
      SMMU.

  (2) Describe the RID -> SID mapping in the device-tree. We probably want
      to avoid an enormous table, so this would only work for simple `SID =
      RID + offset' or 'SID = RID & mask' cases.
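
The two simple cases in (2) amount to one line of arithmetic each. A sketch in plain C, assuming the usual PCI requester ID layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0); the function names and the example offset and mask values are invented for illustration.

```c
#include <stdint.h>

/* Build a PCI requester ID from bus, device and function numbers. */
uint16_t pci_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return (uint16_t)(((uint16_t)bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* SID = RID + offset */
uint32_t sid_from_rid_offset(uint16_t rid, uint32_t offset)
{
	return (uint32_t)rid + offset;
}

/* SID = RID & mask */
uint32_t sid_from_rid_mask(uint16_t rid, uint32_t mask)
{
	return (uint32_t)rid & mask;
}
```

Either form lets a single DT property describe every function behind a root complex, including virtual functions that do not exist yet at boot.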

How do your IDs map to each other?

Will

* Re: your mail
  2014-01-23  9:06 Prabhakar Lad
@ 2014-01-23 19:55 ` Mark Brown
  0 siblings, 0 replies; 341+ messages in thread
From: Mark Brown @ 2014-01-23 19:55 UTC (permalink / raw)
  To: Prabhakar Lad; +Cc: LKML

On Thu, Jan 23, 2014 at 02:36:05PM +0530, Prabhakar Lad wrote:
> Hi Mark,

Please use a subject line for your e-mails, otherwise they look a lot
like spam.

> So currently I am booting it traditional way (NON DT way) and
> regulator_dev_lookup()
> fails (return NULL)  and for this check it fails.

> +    if (ret && ret != -ENODEV) {
>          regulator = ERR_PTR(ret);
>          goto out;
>      }
> In the NON-DT case the 'ret' is never updated in regulator_dev_lookup().

What is the problem you're trying to report here?  You're describing the
behaviour of the code but I don't understand the problem.

* Re: your mail
  2014-01-09 18:49   ` Joe Borġ
@ 2014-01-14 16:40     ` Steven Rostedt
  0 siblings, 0 replies; 341+ messages in thread
From: Steven Rostedt @ 2014-01-14 16:40 UTC (permalink / raw)
  To: Joe Borġ; +Cc: Greg KH, abbotti, hsweeten, devel, linux-kernel

On Thu, Jan 09, 2014 at 06:49:39PM +0000, Joe Borġ wrote:
> 
> I didn't do the changes as root, I sent them from my server as it has SMTP out.
> 

Hmm, this gives me an idea. There's nothing, I believe, that makes the root user
have to have the name "root" except for the passwd file. Maybe I'll just
rename "root" to "walley" and then use "root" as my normal account. If anyone tries
to break into "root" they will just gain access to a normal account and nothing
more ;-)

/me goes back to hacking

-- Steve


* Re: your mail
  2014-01-09 18:39 ` Greg KH
@ 2014-01-09 18:49   ` Joe Borġ
  2014-01-14 16:40     ` Steven Rostedt
  0 siblings, 1 reply; 341+ messages in thread
From: Joe Borġ @ 2014-01-09 18:49 UTC (permalink / raw)
  To: Greg KH; +Cc: abbotti, hsweeten, devel, linux-kernel

Hi Greg,

I'll redo them tonight.

I didn't do the changes as root, I sent them from my server as it has SMTP out.

Thanks

Regards,
Joseph David Borġ
http://www.jdborg.com


On 9 January 2014 18:39, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Mon, Dec 30, 2013 at 05:40:44PM +0000, Joe Borg wrote:
>> From 6d9f6446434c4021cc9452e31c374ac50e08f0f9 Mon Sep 17 00:00:00 2001
>> From: Joe Borg <root@josephb.org>
>
> This isn't matching your "from:" line on your email, why should I trust
> it?
>
> And doing kernel work as 'root'?  That's not a good idea for lots of
> reasons...
>
>> Date: Mon, 30 Dec 2013 15:35:08 +0000
>> Subject: [PATCH 62/62] DAS1800: Fixing error from checkpatch.
>>
>> Fixed pointer typeo; foo * bar should be foo *bar.
>>
>> Signed-off by Joe Borg <root@josephb.org>
>
> What happened to your Subject:?
>
> And why is the whole git header in the email, please use git send-email
> so that I don't have to hand-edit the body of the email to apply it.
>
> Can you please fix this up and resend?
>
> thanks,
>
> greg k-h

* Re: your mail
       [not found] <1388425244-10017-1-git-send-email-jdb@sitrep3.com>
@ 2014-01-09 18:39 ` Greg KH
  2014-01-09 18:49   ` Joe Borġ
  0 siblings, 1 reply; 341+ messages in thread
From: Greg KH @ 2014-01-09 18:39 UTC (permalink / raw)
  To: Joe Borg; +Cc: abbotti, hsweeten, devel, linux-kernel

On Mon, Dec 30, 2013 at 05:40:44PM +0000, Joe Borg wrote:
> From 6d9f6446434c4021cc9452e31c374ac50e08f0f9 Mon Sep 17 00:00:00 2001
> From: Joe Borg <root@josephb.org>

This isn't matching your "from:" line on your email, why should I trust
it?

And doing kernel work as 'root'?  That's not a good idea for lots of
reasons...

> Date: Mon, 30 Dec 2013 15:35:08 +0000
> Subject: [PATCH 62/62] DAS1800: Fixing error from checkpatch.
> 
> Fixed pointer typeo; foo * bar should be foo *bar.
> 
> Signed-off by Joe Borg <root@josephb.org>

What happened to your Subject:?

And why is the whole git header in the email, please use git send-email
so that I don't have to hand-edit the body of the email to apply it.

Can you please fix this up and resend?

thanks,

greg k-h

* Re: your mail
       [not found] <CACaajQtCTW_PKA25q3-4o4XAV6sgZnyD+Skkw6mhUHpRBEgbjQ@mail.gmail.com>
@ 2012-11-26 18:29 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2012-11-26 18:29 UTC (permalink / raw)
  To: Vasiliy Tolstov; +Cc: linux-kernel, stable

On Mon, Nov 26, 2012 at 10:14:44PM +0400, Vasiliy Tolstov wrote:
> Hello, Greg. Hello kernel team! I'm system enginer at clodo.ru (russian cloud
> hosting provider) we are use xen and sles11-sp2 for our compute xen nodes.
> Each virtual machine (domU) have disks that attached by Infiniband SRP. On top
> of disk that attached by srp we use multipath (to do failover)
> Now we have issues like all commands that uses multipath hang while one storage
> is rebooted.
> After some discussion with maintainer of linux-rdma (Bart Van Assche) and using
> it backported ib_srp with HA patches we can't solve deadlock issues. Bart
> thinks that SLES team does not backport some core scsi patches to their kernel
> (3.0.42) to prevent multipath deadlock (currently is about 2.5 minutes) on
> failed target.
> Is that possible to determine or getting help to solve this issue?

As you are using SLES, please contact SUSE for support for that
kernel, as you are paying for it, and the community can't do anything to
support their kernel, sorry.

Best of luck,

greg k-h

* Re: your mail
  2012-10-10 15:06 Kent Yoder
@ 2012-10-10 15:12 ` Kent Yoder
  0 siblings, 0 replies; 341+ messages in thread
From: Kent Yoder @ 2012-10-10 15:12 UTC (permalink / raw)
  To: Kent Yoder; +Cc: linux-kernel, linux-security-module, tpmdd-devel

 Please ignore.

On Wed, Oct 10, 2012 at 10:06:53AM -0500, Kent Yoder wrote:
> The following changes since commit ecefbd94b834fa32559d854646d777c56749ef1c:
> 
>   Merge tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm (2012-10-04 09:30:33 -0700)
> 
> are available in the git repository at:
> 
> 
>   git://github.com/shpedoikal/linux.git tpmdd-fixes-v3.6
> 
> for you to fetch changes up to 1631cfb7cee28388b04aef6c0a73050f6fd76e4d:
> 
>   driver/char/tpm: fix regression causesd by ppi (2012-10-10 09:50:56 -0500)
> 
> ----------------------------------------------------------------
> Gang Wei (1):
>       driver/char/tpm: fix regression causesd by ppi
> 
>  drivers/char/tpm/tpm.c     |  3 ++-
>  drivers/char/tpm/tpm.h     |  9 +++++++--
>  drivers/char/tpm/tpm_ppi.c | 18 ++++++++++--------
>  3 files changed, 19 insertions(+), 11 deletions(-)


* Re: your mail
  2012-10-04 16:50 Andrea Arcangeli
@ 2012-10-04 18:17 ` Christoph Lameter
  0 siblings, 0 replies; 341+ messages in thread
From: Christoph Lameter @ 2012-10-04 18:17 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-kernel, linux-mm, Linus Torvalds, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, Mel Gorman, Hugh Dickins,
	Rik van Riel, Johannes Weiner, Hillf Danton, Andrew Jones,
	Dan Smith, Thomas Gleixner, Paul Turner, Suresh Siddha,
	Mike Galbraith, Paul E. McKenney

On Thu, 4 Oct 2012, Andrea Arcangeli wrote:

> So we could drop page_autonuma by creating a CONFIG_SLUB=y dependency
> (AUTONUMA wouldn't be available in the kernel config if SLAB=y, and it
> also wouldn't be available on 32bit archs but the latter isn't a
> problem).

Nope, it should depend on page struct alignment. Other kernel subsystems
may be depending on page struct alignment in the future (and some other
arches may already have that requirement).


* Re: your mail
  2012-08-03 17:43 Tejun Heo
@ 2012-08-08 16:39 ` Tejun Heo
  0 siblings, 0 replies; 341+ messages in thread
From: Tejun Heo @ 2012-08-08 16:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: torvalds, akpm, padovan, marcel, peterz, mingo, davem,
	dougthompson, ibm-acpi, cbou, rui.zhang, tomi.valkeinen

On Fri, Aug 03, 2012 at 10:43:45AM -0700, Tejun Heo wrote:
> delayed_work has been annoyingly missing the mechanism to modify timer
> of a pending delayed_work - ie. mod_timer() counterpart.  delayed_work
> users have been working around this using several methods - using an
> explicit timer + work item, messing directly with delayed_work->timer,
> and canceling before re-queueing, all of which are error-prone and/or
> ugly.
> 
> Gustavo Padovan posted a RFC implementation[1] of mod_delayed_work() a
> while back but it wasn't complete.  To properly implement
> mod_delayed_work[_on](), it should be able to steal pending work items
> which may be on timer or worklist or anywhere inbetween.  This is
> similar to what __cancel_work_timer() does but it turns out that there
> are a lot of holes around this area and try_to_grab_pending() needs
> considerable amount of work to be used for other purposes too.
> 
> This patchset improves canceling and try_to_grab_pending(), use it to
> implement mod_delayed_work[_on](), convert easy ones, and drop
> __cancel_delayed_work_sync() which doesn't have relevant users
> afterwards.

Applied to wq/for-3.7.

Thanks.

-- 
tejun

* Re: your mail
  2012-02-21 15:39 Yang Honggang
@ 2012-02-21 11:34 ` Hans J. Koch
  0 siblings, 0 replies; 341+ messages in thread
From: Hans J. Koch @ 2012-02-21 11:34 UTC (permalink / raw)
  To: Yang Honggang; +Cc: linux-kernel, hjk

On Tue, Feb 21, 2012 at 10:39:18AM -0500, Yang Honggang wrote:
> hi, everyone

Please give your mail a proper subject line before posting.
If you talk about UIO, it should start with uio:
Otherwise, people won't read it and just send it to /dev/null.

> 
> Is there a mail list dedicated for UIO (userspace I/O)?

No, there's not enough mail traffic to justify that.

> I want to contribute to UIO but did not find the right
> mail list.

Please send your contribution to LKML and Cc: me and Greg
Kroah-Hartman. If you change an existing driver, also Cc:
the author of that driver.

Thanks,
Hans


* Re: your mail
  2012-02-12 19:06     ` Al Viro
@ 2012-02-13  9:40       ` Jiri Slaby
  0 siblings, 0 replies; 341+ messages in thread
From: Jiri Slaby @ 2012-02-13  9:40 UTC (permalink / raw)
  To: Al Viro
  Cc: Jiri Slaby, Richard Weinberger, linux-kernel,
	user-mode-linux-devel, akpm, alan, gregkh

On 02/12/2012 08:06 PM, Al Viro wrote:
> Yecchhh...  If I'm reading (and grepping) it right, there are only two
> non-default instance of tty_operations ->shutdown() - pty and vt ones.
> Lovely...  And while we are at it, vt instance is definitely not safe
> from interrupts - calls console_lock().  Not that it was relevant in
> this case...

Thanks for looking into that. I was too lazy to do that on Sunday.

You're right that it may cause problems. Fortunately vt doesn't refcount
ttys. Hence con_shutdown can be called only from release_tty (close
path) in the user context.

Adding to my TODO list, unless somebody beats me to fixing it.

thanks,
-- 
js
suse labs

* Re: your mail
  2012-02-12 19:11 ` Al Viro
@ 2012-02-13  9:15   ` Jiri Slaby
  0 siblings, 0 replies; 341+ messages in thread
From: Jiri Slaby @ 2012-02-13  9:15 UTC (permalink / raw)
  To: Al Viro
  Cc: Richard Weinberger, linux-kernel, user-mode-linux-devel, akpm,
	alan, gregkh

On 02/12/2012 08:11 PM, Al Viro wrote:
> On Sun, Feb 12, 2012 at 01:21:10AM +0100, Richard Weinberger wrote:
> 
>> @@ -343,7 +267,7 @@ static irqreturn_t line_write_interrupt(int irq, void *data)
>>  {
>>  	struct chan *chan = data;
>>  	struct line *line = chan->line;
>> -	struct tty_struct *tty = line->tty;
>> +	struct tty_struct *tty = tty_port_tty_get(&line->port);
>>  	int err;
>>  
>>  	/*
>> @@ -354,6 +278,9 @@ static irqreturn_t line_write_interrupt(int irq, void *data)
>>  	spin_lock(&line->lock);
>>  	err = flush_buffer(line);
>>  	if (err == 0) {
>> +		tty_kref_put(tty);
>> +
>> +		spin_unlock(&line->lock);
>>  		return IRQ_NONE;
>>  	} else if (err < 0) {
>>  		line->head = line->buffer;
>> @@ -365,9 +292,12 @@ static irqreturn_t line_write_interrupt(int irq, void *data)
>>  		return IRQ_NONE;
>>  
>>  	tty_wakeup(tty);
>> +	tty_kref_put(tty);
>>  	return IRQ_HANDLED;
>>  }
> 
> That, BTW, smells ugly.  Note that return before the last one has no
> tty_kref_put() for a very good reason - it's under if (!tty).  And
> just as line->tty, port->tty can become NULL, so tty_port_tty_get()
> can, indeed, return NULL here.  Which makes the first tty_kref_put()
> oopsable...

Nope, it is allowed to call tty_kref_put(NULL).
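
The pattern in question, a put that tolerates NULL so callers need not check, can be modelled in a few lines of user-space C. This is a toy model with hypothetical names (struct obj, obj_get, obj_put), not the tty code itself.

```c
#include <stddef.h>

struct obj {
	int refs;      /* reference count */
	int released;  /* set once the last reference is dropped */
};

/* Take a reference; a NULL object is passed straight through. */
struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refs++;
	return o;
}

/* Drop a reference; calling this with NULL is a harmless no-op. */
void obj_put(struct obj *o)
{
	if (!o)
		return;
	if (--o->refs == 0)
		o->released = 1;
}
```

With the NULL check inside the put, an early-return path can drop the reference unconditionally even when the get returned NULL.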

regards,
-- 
js
suse labs

* Re: your mail
  2012-02-12  0:21 Richard Weinberger
  2012-02-12  0:25 ` your mail Jesper Juhl
  2012-02-12  1:02 ` Al Viro
@ 2012-02-12 19:11 ` Al Viro
  2012-02-13  9:15   ` Jiri Slaby
  2 siblings, 1 reply; 341+ messages in thread
From: Al Viro @ 2012-02-12 19:11 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: linux-kernel, user-mode-linux-devel, akpm, alan, gregkh

On Sun, Feb 12, 2012 at 01:21:10AM +0100, Richard Weinberger wrote:

> @@ -343,7 +267,7 @@ static irqreturn_t line_write_interrupt(int irq, void *data)
>  {
>  	struct chan *chan = data;
>  	struct line *line = chan->line;
> -	struct tty_struct *tty = line->tty;
> +	struct tty_struct *tty = tty_port_tty_get(&line->port);
>  	int err;
>  
>  	/*
> @@ -354,6 +278,9 @@ static irqreturn_t line_write_interrupt(int irq, void *data)
>  	spin_lock(&line->lock);
>  	err = flush_buffer(line);
>  	if (err == 0) {
> +		tty_kref_put(tty);
> +
> +		spin_unlock(&line->lock);
>  		return IRQ_NONE;
>  	} else if (err < 0) {
>  		line->head = line->buffer;
> @@ -365,9 +292,12 @@ static irqreturn_t line_write_interrupt(int irq, void *data)
>  		return IRQ_NONE;
>  
>  	tty_wakeup(tty);
> +	tty_kref_put(tty);
>  	return IRQ_HANDLED;
>  }

That, BTW, smells ugly.  Note that return before the last one has no
tty_kref_put() for a very good reason - it's under if (!tty).  And
just as line->tty, port->tty can become NULL, so tty_port_tty_get()
can, indeed, return NULL here.  Which makes the first tty_kref_put()
oopsable...

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2012-02-12 12:40   ` Jiri Slaby
@ 2012-02-12 19:06     ` Al Viro
  2012-02-13  9:40       ` Jiri Slaby
  0 siblings, 1 reply; 341+ messages in thread
From: Al Viro @ 2012-02-12 19:06 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Richard Weinberger, linux-kernel, user-mode-linux-devel, akpm,
	alan, gregkh, Jiri Slaby

On Sun, Feb 12, 2012 at 01:40:47PM +0100, Jiri Slaby wrote:
> > Is tty_kref_put() safe in interrupt?  Here it seems to be OK, but in other
> > callers...  More or less at random: drivers/tty/serial/lantiq.c has it
> > called from lqasc_rx_int().  It seems to be possible to have it end up
> > calling ->ops->shutdown() and in this case that'd be lqasc_shutdown().
> > Which does a bunch of free_irq(), including the ->rx_irq, i.e. the one
> > we have it called from.  Alan?
> 
> I'm not Alan, but will reply anyway. Yes, it is safe (unless the driver
> does something tricky). In the driver you mention, this is uart_ops,
> called from tty_port_operations' ->shutdown. And that's different from
> tty_operations' ->shutdown.
> 
> Yes, there are:
> * tty->ops
> * tty_port->ops
> * uart_port->ops
> 
> uart_port->ops->shutdown is supposed to tear down interrupts like in
> lantiq.c. It is called from tty_port->ops->shutdown. And that one is
> allowed to be called only from user context (tty->ops->close and
> tty->ops->hangup).

Yecchhh...  If I'm reading (and grepping) it right, there are only two
non-default instances of tty_operations ->shutdown() - pty and vt ones.
Lovely...  And while we are at it, vt instance is definitely not safe
from interrupts - calls console_lock().  Not that it was relevant in
this case...

It's probably too late in this case, but I would've called that method
->sync_cleanup().  Assuming I'm not misreading its intent and history...

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2012-02-12  1:02 ` Al Viro
@ 2012-02-12 12:40   ` Jiri Slaby
  2012-02-12 19:06     ` Al Viro
  0 siblings, 1 reply; 341+ messages in thread
From: Jiri Slaby @ 2012-02-12 12:40 UTC (permalink / raw)
  To: Al Viro
  Cc: Richard Weinberger, linux-kernel, user-mode-linux-devel, akpm,
	alan, gregkh, Jiri Slaby

On 02/12/2012 02:02 AM, Al Viro wrote:
> On Sun, Feb 12, 2012 at 01:21:10AM +0100, Richard Weinberger wrote:
>> +++ b/arch/um/drivers/line.c
>> @@ -19,19 +19,29 @@ static irqreturn_t line_interrupt(int irq, void *data)
>>  {
>>  	struct chan *chan = data;
>>  	struct line *line = chan->line;
>> +	struct tty_struct *tty;
>> +
>> +	if (line) {
>> +		tty = tty_port_tty_get(&line->port);
>> +		chan_interrupt(&line->chan_list, &line->task, tty, irq);
>> +		tty_kref_put(tty);
>> +	}
>>  
>> -	if (line)
>> -		chan_interrupt(&line->chan_list, &line->task, line->tty, irq);
>>  	return IRQ_HANDLED;
>>  }
> 
> Is tty_kref_put() safe in interrupt?  Here it seems to be OK, but in other
> callers...  More or less at random: drivers/tty/serial/lantiq.c has it
> called from lqasc_rx_int().  It seems to be possible to have it end up
> calling ->ops->shutdown() and in this case that'd be lqasc_shutdown().
> Which does a bunch of free_irq(), including the ->rx_irq, i.e. the one
> we have it called from.  Alan?

I'm not Alan, but will reply anyway. Yes, it is safe (unless the driver
does something tricky). In the driver you mention, this is uart_ops,
called from tty_port_operations' ->shutdown. And that's different from
tty_operations' ->shutdown.

Yes, there are:
* tty->ops
* tty_port->ops
* uart_port->ops

uart_port->ops->shutdown is supposed to tear down interrupts like in
lantiq.c. It is called from tty_port->ops->shutdown. And that one is
allowed to be called only from user context (tty->ops->close and
tty->ops->hangup).

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2012-02-12  0:21 Richard Weinberger
  2012-02-12  0:25 ` your mail Jesper Juhl
@ 2012-02-12  1:02 ` Al Viro
  2012-02-12 12:40   ` Jiri Slaby
  2012-02-12 19:11 ` Al Viro
  2 siblings, 1 reply; 341+ messages in thread
From: Al Viro @ 2012-02-12  1:02 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: linux-kernel, user-mode-linux-devel, akpm, alan, gregkh

On Sun, Feb 12, 2012 at 01:21:10AM +0100, Richard Weinberger wrote:

Not a full review by any means, but...

> +++ b/arch/um/drivers/line.c
> @@ -19,19 +19,29 @@ static irqreturn_t line_interrupt(int irq, void *data)
>  {
>  	struct chan *chan = data;
>  	struct line *line = chan->line;
> +	struct tty_struct *tty;
> +
> +	if (line) {
> +		tty = tty_port_tty_get(&line->port);
> +		chan_interrupt(&line->chan_list, &line->task, tty, irq);
> +		tty_kref_put(tty);
> +	}
>  
> -	if (line)
> -		chan_interrupt(&line->chan_list, &line->task, line->tty, irq);
>  	return IRQ_HANDLED;
>  }

Is tty_kref_put() safe in interrupt?  Here it seems to be OK, but in other
callers...  More or less at random: drivers/tty/serial/lantiq.c has it
called from lqasc_rx_int().  It seems to be possible to have it end up
calling ->ops->shutdown() and in this case that'd be lqasc_shutdown().
Which does a bunch of free_irq(), including the ->rx_irq, i.e. the one
we have it called from.  Alan?

> @@ -495,13 +413,6 @@ static int setup_one_line(struct line *lines, int n, char *init, int init_prio,
>  	struct line *line = &lines[n];
>  	int err = -EINVAL;
>  
> -	spin_lock(&line->count_lock);
> -
> -	if (line->count) {
> -		*error_out = "Device is already open";
> -		goto out;
> -	}

... and similar in line_open() - just what happens if you try to reconfigure
an opened one?

> @@ -612,13 +523,15 @@ int line_get_config(char *name, struct line *lines, unsigned int num, char *str,
>  
>  	line = &lines[dev];
>  
> -	spin_lock(&line->count_lock);
> +	tty = tty_port_tty_get(&line->port);
> +
>  	if (!line->valid)
>  		CONFIG_CHUNK(str, size, n, "none", 1);
> -	else if (line->tty == NULL)
> +	else if (tty == NULL)
>  		CONFIG_CHUNK(str, size, n, line->init_str, 1);
>  	else n = chan_config_string(&line->chan_list, str, size, error_out);
> -	spin_unlock(&line->count_lock);
> +
> +	tty_kref_put(tty);

again, where's the exclusion with config changes?

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2012-02-12  0:21 Richard Weinberger
@ 2012-02-12  0:25 ` Jesper Juhl
  2012-02-12  1:02 ` Al Viro
  2012-02-12 19:11 ` Al Viro
  2 siblings, 0 replies; 341+ messages in thread
From: Jesper Juhl @ 2012-02-12  0:25 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: linux-kernel, user-mode-linux-devel, viro, akpm, alan, gregkh

On Sun, 12 Feb 2012, Richard Weinberger wrote:

> Can you please review this patch?
> 

A subject on the mail along with a description of the patch would make 
that a great deal easier...

-- 
Jesper Juhl <jj@chaosbits.net>       http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20120110061735.9BD676BA98@mailhub.coreip.homeip.net>
@ 2012-01-10  7:45 ` Dmitry Torokhov
  0 siblings, 0 replies; 341+ messages in thread
From: Dmitry Torokhov @ 2012-01-10  7:45 UTC (permalink / raw)
  To: Milton Miller; +Cc: Che-Liang Chiou, linux-kernel

On Mon, Jan 09, 2012 at 10:17:35PM -0800, Milton Miller wrote:
> Subject	Re: [PATCH 1/2] Input: serio_raw - cosmetic fixes
> In-Reply-To: <20120109082412.GC4049@core.coreip.homeip.net>
> References: <20120109082412.GC4049@core.coreip.homeip.net>
> 	<1325847795-30486-1-git-send-email-clchiou@chromium.org>
> Date: Tue, 10 Jan 2012 00:14:35 -0600
> Subject: (No subject header)
> X-Originating-IP: 71.22.127.106
> Message-ID: <1326176075_1502@mail4.comsite.net>
> 
> On Mon, 9 Jan 2012 about 00:24:12 -0800, Dmitry Torokhov wrote:
> > >  	struct serio_raw_client *client = file->private_data;
> > >  	struct serio_raw *serio_raw = client->serio_raw;
> > > -	unsigned int mask;
> > > 
> > >  	poll_wait(file, &serio_raw->wait, wait);
> > > 
> > > -	mask = serio_raw->dead ? POLLHUP | POLLERR : POLLOUT | POLLWRNORM;
> > >  	if (serio_raw->head != serio_raw->tail)
> > >  		return POLLIN | POLLRDNORM;
> > > 
> > 
> > This however is not quite correct. I will be applying the patch below
> > instead.
> > 
> > 
> > diff --git a/drivers/input/serio/serio_raw.c b/drivers/input/serio/serio_raw.c
> > index ca78a89..c2c6ad8 100644
> > --- a/drivers/input/serio/serio_raw.c
> > +++ b/drivers/input/serio/serio_raw.c
> > @@ -237,7 +237,7 @@ static unsigned int serio_raw_poll(struct file *file, poll_table *wait)
> >  
> >  	mask = serio_raw->dead ? POLLHUP | POLLERR : POLLOUT | POLLWRNORM;
> >  	if (serio_raw->head != serio_raw->tail)
> > -		return POLLIN | POLLRDNORM;
> > +		mask |= POLLIN | POLLRDNORM;
> >  
> >  	return 0;
> 
> doesn't this need to be changed to return mask?

Doh! Of course it does.
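
For reference, the shape of the fixed function would be roughly as
follows. This is a userspace sketch only: struct ring and
ring_poll_mask() are hypothetical stand-ins for struct serio_raw and
serio_raw_poll(), and the poll_wait() call is left out.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for POLLRDNORM / POLLWRNORM */
#endif
#include <poll.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields of struct serio_raw used by poll. */
struct ring {
	unsigned int head;
	unsigned int tail;
	bool dead;
};

/* Build the complete event mask and return it, rather than returning 0. */
static unsigned int ring_poll_mask(const struct ring *r)
{
	unsigned int mask;

	mask = r->dead ? POLLHUP | POLLERR : POLLOUT | POLLWRNORM;
	if (r->head != r->tail)
		mask |= POLLIN | POLLRDNORM;

	return mask;
}
```

The point is that the readable bits are OR-ed into the writable/hangup
bits and the combined mask is what gets returned.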

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-09-21 21:54 jim.cromie
@ 2011-09-26 23:23 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2011-09-26 23:23 UTC (permalink / raw)
  To: jim.cromie; +Cc: jbaron, joe, bart.vanassche, linux-kernel

On Wed, Sep 21, 2011 at 03:54:49PM -0600, jim.cromie@gmail.com wrote:
> hi all,
> 
> this reworked* patchset enhances dynamic-debug with:

I need acks from Jason before I can apply any of this...


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-05-18 19:22   ` your mail Greg KH
@ 2011-05-18 20:35     ` Alessio Igor Bogani
  0 siblings, 0 replies; 341+ messages in thread
From: Alessio Igor Bogani @ 2011-05-18 20:35 UTC (permalink / raw)
  To: Greg KH
  Cc: Rusty Russell, Tim Bird, Christoph Hellwig, Anders Kaseorg,
	Tim Abbott, LKML, Linux Embedded, Jason Wessel, Dirk Behme

Dear Mr. Kroah-Hartman,

2011/5/18 Greg KH <greg@kroah.com>:
[...]
> Care to resend it without all the stuff above so someone (Rusty I guess)
> can apply it?

Sure! It'll follow in a few minutes.

Thank you very much!

Ciao,
Alessio

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-05-18 18:55 ` Alessio Igor Bogani
@ 2011-05-18 19:22   ` Greg KH
  2011-05-18 20:35     ` Alessio Igor Bogani
  0 siblings, 1 reply; 341+ messages in thread
From: Greg KH @ 2011-05-18 19:22 UTC (permalink / raw)
  To: Alessio Igor Bogani
  Cc: Rusty Russell, Tim Bird, Christoph Hellwig, Anders Kaseorg,
	Tim Abbott, LKML, Linux Embedded, Jason Wessel, Dirk Behme

On Wed, May 18, 2011 at 08:55:25PM +0200, Alessio Igor Bogani wrote:
> Dear Mr. Bird, Dear Mr. Kroah-Hartman,
> 
> Sorry for my very bad English.
> 
> 2011/5/18 Tim Bird <tim.bird@am.sony.com>:
> [...]
> > Alessio - do you have any timings you can share for the speedup?
> 
> You can find a little benchmark using ftrace at end of this email:
> https://lkml.org/lkml/2011/4/5/341
> 
> > On 05/17/2011 04:22 PM, Greg KH wrote:
> >> On Tue, May 17, 2011 at 10:56:03PM +0200, Alessio Igor Bogani wrote:
> >>> This work was supported by a hardware donation from the CE Linux Forum.
> [...]
> >> Please explain why you make a change, not just who sponsored the change,
> >> that's not very interesting to developers.
> 
> You are right. I apologize.
> 
> This patch is a missing piece (not essential it is only a further little
> optimization) of this little patchset:
> https://lkml.org/lkml/2011/4/16/48
> 
> Unfortunately I forgot to include this patch in the series (my first error)
> then I avoided explaining the changes because I had thought that those were
> already enough explained in the cover-letter of the patchset (my second error).
> 
> Sorry for my mistakes.
> 
> Is this better?
> 
> Subject: [PATCH] module: Use binary search in lookup_symbol()
> 
> The function is_exported(), with its helper function lookup_symbol(), is used to
> verify whether a given symbol is actually exported by the kernel or by the
> modules. Now that both have their symbols sorted, we can replace the linear search
> with a binary search, which provides a considerable speed-up.
> 
> This work was supported by a hardware donation from the CE Linux Forum.
> 
> Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>

Much better, I have no objection to this at all.

	Acked-by: Greg Kroah-Hartman <gregkh@suse.de>

Care to resend it without all the stuff above so someone (Rusty I guess)
can apply it?
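
For readers following along, the idea of the change can be sketched in
userspace C as below. This is only an illustration built on the standard
bsearch(); struct sym, cmp_name() and lookup_sym() are hypothetical
names, not the kernel's lookup_symbol() or its symbol-table types.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol-table entry; the table must be sorted by name. */
struct sym {
	const char *name;
	unsigned long value;
};

/* bsearch() comparator: key is a C string, element is a struct sym. */
static int cmp_name(const void *key, const void *elem)
{
	const struct sym *s = elem;

	return strcmp(key, s->name);
}

/* O(log n) lookup in a name-sorted table; returns NULL if not found. */
static const struct sym *lookup_sym(const char *name,
				    const struct sym *table, size_t count)
{
	return bsearch(name, table, count, sizeof(*table), cmp_name);
}
```

The binary search is only valid because the table is kept sorted by
name; on an unsorted table it could miss entries that are present.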

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-05-16  9:44 ` your mail Felipe Balbi
@ 2011-05-16 10:07   ` Munegowda, Keshava
  0 siblings, 0 replies; 341+ messages in thread
From: Munegowda, Keshava @ 2011-05-16 10:07 UTC (permalink / raw)
  To: balbi; +Cc: linux-usb, linux-omap, linux-kernel, gadiyar, sameo, parthab

On Mon, May 16, 2011 at 3:14 PM, Felipe Balbi <balbi@ti.com> wrote:
> Hi,
>
> On Mon, May 16, 2011 at 03:04:20PM +0530, Keshava Munegowda wrote:
>> Following 2 hwmod structures are added:
>> UHH hwmod of usbhs with uhh base address and
>> EHCI , OHCI irq and base addresses.
>> TLL hwmod of usbhs with the TLL base address and irq.
>>
>> Signed-off-by: Keshava Munegowda <keshava_mgowda@ti.com>
>
> missing subject line.

Yes, I have already corrected it and resent this patch as [RESEND] [PATCH 1/5]...

Regards
keshava

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-05-16  9:34 Keshava Munegowda
@ 2011-05-16  9:44 ` Felipe Balbi
  2011-05-16 10:07   ` Munegowda, Keshava
  0 siblings, 1 reply; 341+ messages in thread
From: Felipe Balbi @ 2011-05-16  9:44 UTC (permalink / raw)
  To: Keshava Munegowda
  Cc: linux-usb, linux-omap, linux-kernel, balbi, gadiyar, sameo, parthab

[-- Attachment #1: Type: text/plain, Size: 364 bytes --]

Hi,

On Mon, May 16, 2011 at 03:04:20PM +0530, Keshava Munegowda wrote:
> Following 2 hwmod structures are added:
> UHH hwmod of usbhs with uhh base address and
> EHCI , OHCI irq and base addresses.
> TLL hwmod of usbhs with the TLL base address and irq.
> 
> Signed-off-by: Keshava Munegowda <keshava_mgowda@ti.com>

missing subject line.

-- 
balbi

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-01-14  1:14 Omar Ramirez Luna
@ 2011-01-14  4:36 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2011-01-14  4:36 UTC (permalink / raw)
  To: Omar Ramirez Luna; +Cc: Felipe Contreras, devel, linux-kernel

On Thu, Jan 13, 2011 at 07:14:53PM -0600, Omar Ramirez Luna wrote:
> Please pull these changes for 2.6.38:
> 
> The following changes since commit 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5:
> 
>   Linux 2.6.37 (2011-01-04 16:50:19 -0800)
> 
> are available in the git repository at:
>   git://dev.omapzoom.org/pub/scm/tidspbridge/kernel-dspbridge.git for-gkh-2.6.38
> 
> Guzman Lugo, Fernando (1):
>       staging: tidspbridge: configure full L1 MMU range
> 
> Omar Ramirez Luna (1):
>       staging: tidspbridge: replace mbox callback with notifier_call
> 
>  drivers/staging/tidspbridge/core/tiomap3430.c |   15 +++++++--------
>  1 files changed, 7 insertions(+), 8 deletions(-)

You forgot a Subject: line.

Also, as these are just 2 patches, care to just email them so we can
review them?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-01-03 17:03 ` your mail Stanislaw Gruszka
@ 2011-01-04  5:17   ` Tejun Heo
  0 siblings, 0 replies; 341+ messages in thread
From: Tejun Heo @ 2011-01-04  5:17 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: castet.matthieu, linux-kernel, linux-usb, stf_xl

On Mon, Jan 03, 2011 at 06:03:17PM +0100, Stanislaw Gruszka wrote:
> On Mon, Jan 03, 2011 at 05:38:00PM +0100, castet.matthieu@free.fr wrote:
> > could you CC me on ueagle-atm.c patches.

Will try to, but maybe it's a good idea to add a MAINTAINERS entry?

> > From what I remember, we sleep in the workqueue, which is why we couldn't use the
> > system one (it freezes the keyboard...). But maybe the code has changed.
> In the case where the firmware is not available, we can sleep for a few seconds in
> the work function. That blocks the keyboard driver, which also uses the common workqueue.
> If Tejun's recent workqueue rewrite allows a long sleep in a work function without
> hurting other workqueue users, the patch is OK.

Yeap, work items can sleep all they want on the system_wq.  It won't
delay execution of other work items.

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2011-01-03 16:38 castet.matthieu
@ 2011-01-03 17:03 ` Stanislaw Gruszka
  2011-01-04  5:17   ` Tejun Heo
  0 siblings, 1 reply; 341+ messages in thread
From: Stanislaw Gruszka @ 2011-01-03 17:03 UTC (permalink / raw)
  To: castet.matthieu; +Cc: linux-kernel, linux-usb, stf_xl, tj

On Mon, Jan 03, 2011 at 05:38:00PM +0100, castet.matthieu@free.fr wrote:
> Hi,
> 
> could you CC me on ueagle-atm.c patches.
> 
> From what I remember, we sleep in the workqueue, which is why we couldn't use the
> system one (it freezes the keyboard...). But maybe the code has changed.
In the case where the firmware is not available, we can sleep for a few seconds in
the work function. That blocks the keyboard driver, which also uses the common workqueue.
If Tejun's recent workqueue rewrite allows a long sleep in a work function without
hurting other workqueue users, the patch is OK.

Unfortunately I'm not able to test the patch, my ueagle device was physically
damaged a few months ago.

Stanislaw

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2010-06-13  6:16 Mike Gilks
@ 2010-06-18 23:52 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2010-06-18 23:52 UTC (permalink / raw)
  To: Mike Gilks; +Cc: gregkh, mchehab, julia, joe, devel, linux-kernel

On Sun, Jun 13, 2010 at 02:16:47PM +0800, Mike Gilks wrote:
> Subject:r8192U_core.c Last pass
> In-Reply-To: 
> 
> 
> This is the last patch I can manage for this file.
> Everything else to do with checkpatch.pl issues may require an actual developer to look at it.

I have a whole bunch of series of patches from you (one duplicating
Linus's patch, I don't think you meant to send that...)  So, which should
I apply?

How about I delete them all and you send me the latest ones that you
want me to apply, as I'm totally confused which is your latest version
and which isn't and I see lots of duplicates.

Sound good?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2010-04-15 23:41   ` Rafi Rubin
@ 2010-04-16  4:21     ` Dmitry Torokhov
  0 siblings, 0 replies; 341+ messages in thread
From: Dmitry Torokhov @ 2010-04-16  4:21 UTC (permalink / raw)
  To: Rafi Rubin; +Cc: Alan Cox, linux-i2c, khali, linux-input, linux-kernel

On Thu, Apr 15, 2010 at 07:41:22PM -0400, Rafi Rubin wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> >> +	if (ts->tc.event_sended == false) {
> > 
> > We set "event_sended" to false immediately before calling
> > cy8ctmg110_send_event() so I do not see the point of this flag.
> 
> On that note:
> 
> $ git grep -n sended
> drivers/net/eth16i.c:1295:
> 		how many packets there is to be sended */
> drivers/net/wan/sbni.c:638:
> 		/* if frame was sended but not ACK'ed - resend it */
> drivers/net/wan/sbni.c:659:
> 		* frame sended then in prepare_to_send next frame
> drivers/usb/serial/aircable.c:13:
> 		* next two bytes must say how much data will be sended.
> 

Well, if you want to go down that path...

[dtor@hammer work]$ grep -r -e "\(setted\|setuped\|split\+ed\)" . | wc -l
54
[dtor@hammer work]$ 

-- 
Dmitry

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2010-04-14 23:16 ` your mail Dmitry Torokhov
@ 2010-04-15 23:41   ` Rafi Rubin
  2010-04-16  4:21     ` Dmitry Torokhov
  0 siblings, 1 reply; 341+ messages in thread
From: Rafi Rubin @ 2010-04-15 23:41 UTC (permalink / raw)
  To: Dmitry Torokhov; +Cc: Alan Cox, linux-i2c, khali, linux-input, linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>> +	if (ts->tc.event_sended == false) {
> 
> We set "event_sended" to false immediately before calling
> cy8ctmg110_send_event() so I do not see the point of this flag.

On that note:

$ git grep -n sended
drivers/net/eth16i.c:1295:
		how many packets there is to be sended */
drivers/net/wan/sbni.c:638:
		/* if frame was sended but not ACK'ed - resend it */
drivers/net/wan/sbni.c:659:
		* frame sended then in prepare_to_send next frame
drivers/usb/serial/aircable.c:13:
		* next two bytes must say how much data will be sended.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvHpB4ACgkQwuRiAT9o609wAgCfbGjTP2lIN6JJyX28VzjPHxTY
ylIAn15FZRPpBEkWaFR8oAFKCCRmNF4d
=u4nx
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2010-04-14 12:54 Alan Cox
@ 2010-04-14 23:16 ` Dmitry Torokhov
  2010-04-15 23:41   ` Rafi Rubin
  0 siblings, 1 reply; 341+ messages in thread
From: Dmitry Torokhov @ 2010-04-14 23:16 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-i2c, khali, linux-input, linux-kernel

On Wed, Apr 14, 2010 at 01:54:02PM +0100, Alan Cox wrote:
> Subject: [FOR COMMENT] cy8ctmg110 for review
> 
> From: Samuli Konttila <samuli.konttila@aavamobile.com>
> 
> Add support for the cy8ctmg110 capacitive touchscreen used on some embedded
> devices.
> 
> (Some clean up by Alan Cox)
> 
> (No signed off, not yet ready to go in)
> ---
> 
>  drivers/input/touchscreen/Kconfig         |   12 +
>  drivers/input/touchscreen/Makefile        |    3 
>  drivers/input/touchscreen/cy8ctmg110_ts.c |  521 +++++++++++++++++++++++++++++
>  3 files changed, 535 insertions(+), 1 deletions(-)
>  create mode 100644 drivers/input/touchscreen/cy8ctmg110_ts.c
> 
> 
> diff --git a/drivers/input/touchscreen/Kconfig b/drivers/input/touchscreen/Kconfig
> index b3ba374..89a3eb1 100644
> --- a/drivers/input/touchscreen/Kconfig
> +++ b/drivers/input/touchscreen/Kconfig
> @@ -591,4 +591,16 @@ config TOUCHSCREEN_TPS6507X
>  	  To compile this driver as a module, choose M here: the
>  	  module will be called tps6507x_ts.
>  
> +config TOUCHSCREEN_CY8CTMG110
> +	tristate "cy8ctmg110 touchscreen"
> +	depends on I2C
> +	help
> +	  Say Y here if you have a cy8ctmg110 touchscreen capacitive
> +	  touchscreen
> +
> +	  If unsure, say N.
> +
> +	  To compile this driver as a module, choose M here: the
> +	  module will be called cy8ctmg110_ts.
> +
>  endif
> diff --git a/drivers/input/touchscreen/Makefile b/drivers/input/touchscreen/Makefile
> index dfb7239..c7acb65 100644
> --- a/drivers/input/touchscreen/Makefile
> +++ b/drivers/input/touchscreen/Makefile
> @@ -1,5 +1,5 @@
>  #
> -# Makefile for the touchscreen drivers.
> +# Makefile for the touchscreen drivers.mororor
>  #
>  
>  # Each configuration option enables a list of files.
> @@ -12,6 +12,7 @@ obj-$(CONFIG_TOUCHSCREEN_AD7879)	+= ad7879.o
>  obj-$(CONFIG_TOUCHSCREEN_ADS7846)	+= ads7846.o
>  obj-$(CONFIG_TOUCHSCREEN_ATMEL_TSADCC)	+= atmel_tsadcc.o
>  obj-$(CONFIG_TOUCHSCREEN_BITSY)		+= h3600_ts_input.o
> +obj-$(CONFIG_TOUCHSCREEN_CY8CTMG110)    += cy8ctmg110_ts.o
>  obj-$(CONFIG_TOUCHSCREEN_DYNAPRO)	+= dynapro.o
>  obj-$(CONFIG_TOUCHSCREEN_GUNZE)		+= gunze.o
>  obj-$(CONFIG_TOUCHSCREEN_EETI)		+= eeti_ts.o
> diff --git a/drivers/input/touchscreen/cy8ctmg110_ts.c b/drivers/input/touchscreen/cy8ctmg110_ts.c
> new file mode 100644
> index 0000000..4adbe87
> --- /dev/null
> +++ b/drivers/input/touchscreen/cy8ctmg110_ts.c
> @@ -0,0 +1,521 @@
> +/*
> + * cy8ctmg110_ts.c Driver for cypress touch screen controller
> + * Copyright (c) 2009 Aava Mobile
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/input.h>
> +#include <linux/slab.h>
> +#include <linux/interrupt.h>
> +#include <asm/io.h>
> +#include <linux/i2c.h>
> +#include <linux/timer.h>
> +#include <linux/gpio.h>
> +#include <linux/hrtimer.h>
> +
> +#include <linux/platform_device.h>
> +#include <linux/delay.h>
> +#include <linux/fs.h>
> +#include <asm/ioctl.h>
> +#include <asm/uaccess.h>
> +#include <linux/device.h>
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include <linux/delay.h>
> +#include <linux/fs.h>
> +#include <asm/ioctl.h>
> +#include <linux/fs.h>
> +#include <linux/init.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +
> +
> +#define CY8CTMG110_DRIVER_NAME      "cy8ctmg110"
> +
> +
> +/*HW definations*/
> +#define CY8CTMG110_RESET_PIN_GPIO   43
> +#define CY8CTMG110_IRQ_PIN_GPIO     59
> +#define CY8CTMG110_I2C_ADDR         0x38
> +#define CY8CTMG110_I2C_ADDR_EXT     0x39
> +#define CY8CTMG110_I2C_ADDR_        0x2	/*i2c address first sample */
> +#define CY8CTMG110_I2C_ADDR__       53	/*i2c address to FW where irq support missing */
> +#define CY8CTMG110_TOUCH_IRQ        21
> +#define CY8CTMG110_TOUCH_LENGHT     9787
> +#define CY8CTMG110_SCREEN_LENGHT    8424
> +
> +
> +/*Touch coordinates*/
> +#define CY8CTMG110_X_MIN        0
> +#define CY8CTMG110_Y_MIN        0
> +#define CY8CTMG110_X_MAX        864
> +#define CY8CTMG110_Y_MAX        480
> +
> +
> +/*cy8ctmg110 registers defination*/
> +#define CY8CTMG110_TOUCH_WAKEUP_TIME   0
> +#define CY8CTMG110_TOUCH_SLEEP_TIME    2
> +#define CY8CTMG110_TOUCH_X1            3
> +#define CY8CTMG110_TOUCH_Y1            5
> +#define CY8CTMG110_TOUCH_X2            7
> +#define CY8CTMG110_TOUCH_Y2            9
> +#define CY8CTMG110_FINGERS             11
> +#define CY8CTMG110_GESTURE             12
> +#define CY8CTMG110_REG_MAX             13
> +
> +#define CY8CTMG110_POLL_TIMER_DELAY  1000*1000*100
> +#define TOUCH_MAX_I2C_FAILS          50
> +
> +/* Scale factors for coordinates */
> +#define X_SCALE_FACTOR 9387/8424
> +#define Y_SCALE_FACTOR 97/100
> +
> +/* For tracing */
> +static int g_y_trace_coord = 0;
> +module_param(g_y_trace_coord, int, 0600);
> +
> +/* Polling mode */
> +static int polling = 0;
> +module_param(polling, int, 0);
> +MODULE_PARM_DESC(polling, "Set to enabling polling of the touchscreen");
> +
> +
> +/*
> + * The touch position structure.
> + */
> +struct ts_event {
> +	int x1;
> +	int y1;
> +	int x2;
> +	int y2;
> +	bool event_sended;
> +};
> +
> +/*
> + * The touch driver structure.
> + */
> +struct cy8ctmg110 {
> +	struct input_dev *input;
> +	char phys[32];
> +	struct ts_event tc;
> +	struct i2c_client *client;
> +	bool pending;
> +	spinlock_t lock;
> +	bool initController;
> +	bool sleepmode;
> +	int i2c_fail_count;
> +	struct hrtimer timer;
> +};
> +
> +/*
> + * cy8ctmg110_poweroff is the routine that is called when touch hardware 
> + * will powered off
> + */
> +static void cy8ctmg110_power(bool poweron)
> +{
> +	if (poweron)
> +		gpio_direction_output(CY8CTMG110_RESET_PIN_GPIO, 0);
> +	else
> +		gpio_direction_output(CY8CTMG110_RESET_PIN_GPIO, 1);
> +}
> +
> +/*
> + * cy8ctmg110_write_req write regs to the i2c devices
> + * 
> + */
> +static int cy8ctmg110_write_req(struct cy8ctmg110 *tsc, unsigned char reg,
> +		unsigned char len, unsigned char *value)
> +{
> +	struct i2c_client *client = tsc->client;
> +	unsigned int ret;
> +	unsigned char i2c_data[] = { 0, 0, 0, 0, 0, 0 };
> +	struct i2c_msg msg[] = {
> +			{client->addr, 0, len + 1, i2c_data},
> +			};
> +
> +	i2c_data[0] = reg;
> +	memcpy(i2c_data + 1, value, len);
> +
> +	ret = i2c_transfer(client->adapter, msg, 1);
> +	if (ret != 1) {
> +		printk("cy8ctmg110 touch : i2c write data cmd failed \n");
> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * cy8ctmg110_read_req read regs from i2c devise
> + * 
> + */
> +
> +static int cy8ctmg110_read_req(struct cy8ctmg110 *tsc,
> +		unsigned char *i2c_data, unsigned char len, unsigned char cmd)
> +{
> +	struct i2c_client *client = tsc->client;
> +	unsigned int ret;
> +	unsigned char regs_cmd[2] = { 0, 0 };
> +	struct i2c_msg msg1[] = {
> +		{client->addr, 0, 1, regs_cmd},
> +	};
> +	struct i2c_msg msg2[] = {
> +		{client->addr, I2C_M_RD, len, i2c_data},
> +	};
> +
> +	regs_cmd[0] = cmd;
> +
> +	/* first write slave position to i2c devices */
> +	ret = i2c_transfer(client->adapter, msg1, 1);
> +	if (ret != 1) {
> +		tsc->i2c_fail_count++;
> +		return ret;
> +	}
> +
> +	/* Second read data from position */
> +	ret = i2c_transfer(client->adapter, msg2, 1);
> +	if (ret != 1) {
> +		tsc->i2c_fail_count++;
> +		return ret;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * cy8ctmg110_send_event delevery touch event to the userpace
> + * function use normal input interface
> + */
> +static void cy8ctmg110_send_event(void *tsc)
> +{
> +	struct cy8ctmg110 *ts = tsc;
> +	struct input_dev *input = ts->input;
> +	u16 x, y;
> +	u16 x2, y2;
> +
> +	x = ts->tc.x1;
> +	y = ts->tc.y1;
> +
> +	if (ts->tc.event_sended == false) {

We set "event_sended" to false immediately before calling
cy8ctmg110_send_event() so I do not see the point of this flag.

> +		input_report_key(input, BTN_TOUCH, 1);
> +		ts->pending = true;
> +		x2 = (u16) (y * X_SCALE_FACTOR);
> +		y2 = (u16) (x * Y_SCALE_FACTOR);
> +		input_report_abs(input, ABS_X, x2);
> +		input_report_abs(input, ABS_Y, y2);
> +		input_sync(input);
> +		if (g_y_trace_coord)
> +			printk("cy8ctmg110 touch position X:%d (was = %d) Y:%d (was = %d)\n", x2, y, y2, x);

Do we really need this? Seems to be an early development diagnostic.

> +	}
> +
> +}
> +
> +/*
> + * cy8ctmg110_touch_pos check touch position from i2c devices
> + * 
> + */
> +static int cy8ctmg110_touch_pos(struct cy8ctmg110 *tsc)
> +{
> +	unsigned char reg_p[CY8CTMG110_REG_MAX];
> +	int x, y;
> +
> +	memset(reg_p, 0, CY8CTMG110_REG_MAX);
> +
> +	/*Reading coordinates */
> +	if (cy8ctmg110_read_req(tsc, reg_p, 9, CY8CTMG110_TOUCH_X1) != 0)
> +		return -EIO;
> +		
> +	y = reg_p[2] << 8 | reg_p[3];
> +	x = reg_p[0] << 8 | reg_p[1];
> +		/*number of touch */
> +	if (reg_p[8] == 0) {
> +		if (tsc->pending == true) {
> +			struct input_dev *input = tsc->input;
> +
> +			input_report_key(input, BTN_TOUCH, 0);
> +			tsc->tc.event_sended = true;
> +			tsc->pending = false;
> +		}

Just do input_report_key(input, BTN_TOUCH, 0); and let the input core take
care of filtering duplicates. This will allow you to get rid of a bunch of
flags. Also, input_sync() is missing here.
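
A minimal userspace sketch of that suggestion, not kernel code. The
report_touch() helper is a hypothetical stand-in for the duplicate
filtering the input core is assumed to do for key events, and struct
fake_input is invented for illustration; input_sync() is not modelled.
The point is only that the driver side needs no "pending" or
"event_sended" bookkeeping:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device key state kept by the input core. */
struct fake_input {
	bool touch_down;	/* last BTN_TOUCH value that was passed on */
	int events_passed;	/* number of reports that were not filtered */
};

/* Pass a BTN_TOUCH report on only when it changes the recorded state. */
static void report_touch(struct fake_input *in, bool down)
{
	if (in->touch_down == down)
		return;		/* duplicate, dropped */
	in->touch_down = down;
	in->events_passed++;
}

/* Driver side: report what was read, without any extra flags. */
static void handle_sample(struct fake_input *in, unsigned int touches)
{
	report_touch(in, touches != 0);
}
```

Feeding the same touch count twice changes nothing the second time, which
is the behaviour the flags in the patch try to reproduce by hand.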

> +	} else if (tsc->tc.x1 != x || tsc->tc.y1 != y) {
> +		tsc->tc.y1 = y;
> +		tsc->tc.x1 = x;
> +		tsc->tc.event_sended = false;
> +		cy8ctmg110_send_event(tsc);
> +	}
> +	return 0;
> +}
> +
> +/*
> + * if interrupt isn't in use the touch positions can reads by polling
> + * 
> + */
> +static enum hrtimer_restart cy8ctmg110_timer(struct hrtimer *handle)
> +{
> +	struct cy8ctmg110 *ts = container_of(handle, struct cy8ctmg110, timer);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&ts->lock, flags);
> +
> +	cy8ctmg110_touch_pos(ts);
> +	if (ts->i2c_fail_count < TOUCH_MAX_I2C_FAILS)
> +		hrtimer_start(&ts->timer, ktime_set(0, CY8CTMG110_POLL_TIMER_DELAY), HRTIMER_MODE_REL);
> +

So the device simply dies after that many errors?

> +	spin_unlock_irqrestore(&ts->lock, flags);

The timer handler is the only user of the spinlock, so what is the point?

> +	return HRTIMER_NORESTART;
> +}
> +
> +/*
> + * 
> + */
> +static bool cy8ctmg110_set_sleepmode(struct cy8ctmg110 *ts)
> +{
> +	unsigned char reg_p[3];
> +
> +	if (ts->sleepmode == true) {
> +		reg_p[0] = 0x00;
> +		reg_p[1] = 0xff;
> +		reg_p[2] = 5;
> +	} else {
> +		reg_p[0] = 0x10;
> +		reg_p[1] = 0xff;
> +		reg_p[2] = 0;
> +	}
> +
> +	if (cy8ctmg110_write_req(ts, CY8CTMG110_TOUCH_WAKEUP_TIME, 3, reg_p))
> +		return false;
> +
> +	ts->initController = true;
> +	return true;
> +}
> +
> +/*
> + * cy8ctmg110_irq_handler irq handling function
> + * 
> + */
> +
> +static irqreturn_t cy8ctmg110_irq_handler(int irq, void *dev_id)
> +{
> +	struct cy8ctmg110 *tsc = (struct cy8ctmg110 *) dev_id;
> +
> +	if (tsc->initController == false) {
> +		if (cy8ctmg110_set_sleepmode(tsc) == true)
> +			tsc->initController = true;
> +	} else
> +		cy8ctmg110_touch_pos(tsc);

Initializing the device from the interrupt handler is quite a novel concept...

> +
> +	/* if interrupt supported in the touch controller
> +	   timer polling need to stop */
> +	tsc->i2c_fail_count = TOUCH_MAX_I2C_FAILS;
> +	return IRQ_HANDLED;
> +}
> +
> +
> +static int cy8ctmg110_probe(struct i2c_client *client, const struct i2c_device_id *id)
> +{
> +	struct cy8ctmg110 *ts;
> +	struct input_dev *input_dev;
> +	int err;
> +	client->irq = CY8CTMG110_TOUCH_IRQ;
> +
> +	if (!i2c_check_functionality(client->adapter,
> +					I2C_FUNC_SMBUS_READ_WORD_DATA))
> +		return -EIO;
> +
> +	ts = kzalloc(sizeof(struct cy8ctmg110), GFP_KERNEL);
> +	input_dev = input_allocate_device();
> +
> +	if (!ts || !input_dev) {
> +		err = -ENOMEM;
> +		goto err_free_mem;
> +	}
> +
> +	ts->client = client;
> +	i2c_set_clientdata(client, ts);
> +
> +	ts->input = input_dev;
> +	ts->pending = false;
> +	ts->sleepmode = false;
> +
> +	snprintf(ts->phys, sizeof(ts->phys), "%s/input0",
> +						dev_name(&client->dev));
> +
> +	input_dev->name = CY8CTMG110_DRIVER_NAME " Touchscreen";
> +	input_dev->phys = ts->phys;
> +	input_dev->id.bustype = BUS_I2C;
> +
> +	spin_lock_init(&ts->lock);
> +
> +	input_dev->evbit[0] = BIT_MASK(EV_KEY) | BIT_MASK(EV_REP) |

You usually do not set up autorepeat for pointing devices.

> +					BIT_MASK(EV_REL) | BIT_MASK(EV_ABS);

The device does not emit relative events.

> +	input_dev->keybit[BIT_WORD(BTN_TOUCH)] = BIT_MASK(BTN_TOUCH);
> +
> +	input_set_capability(input_dev, EV_KEY, KEY_F);

KEY_F?
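
Taken together, the three remarks above amount to advertising only
EV_KEY and EV_ABS (plus BTN_TOUCH in keybit). A small userspace sketch
of the resulting event mask follows. BIT_MASK is re-defined locally, and
the numeric event codes are copied in as assumptions from the kernel's
input event code header; touchscreen_evbits() is a hypothetical helper:

```c
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BIT_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))

/* Event type codes, assumed to match the kernel's definitions. */
#define EV_KEY	0x01
#define EV_REL	0x02	/* relative axes: not emitted by a touchscreen */
#define EV_ABS	0x03
#define EV_REP	0x14	/* autorepeat: not wanted for a pointing device */

/* Event types a plain single-touch screen would advertise. */
static unsigned long touchscreen_evbits(void)
{
	return BIT_MASK(EV_KEY) | BIT_MASK(EV_ABS);
}
```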

> +
> +	input_set_abs_params(input_dev, ABS_X, CY8CTMG110_X_MIN, CY8CTMG110_X_MAX, 0, 0);
> +	input_set_abs_params(input_dev, ABS_Y, CY8CTMG110_Y_MIN, CY8CTMG110_Y_MAX, 0, 0);
> +
> +	err = gpio_request(CY8CTMG110_RESET_PIN_GPIO, NULL);
> +
> +	if (err) {
> +		dev_err(&client->dev, "cy8ctmg110_ts: Unable to request GPIO pin %d.\n",
> +						CY8CTMG110_RESET_PIN_GPIO);
> +		goto err_free_irq;
> +	}
> +	cy8ctmg110_power(true);
> +
> +	ts->initController = false;
> +	ts->i2c_fail_count = 0;
> +
> +	hrtimer_init(&ts->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	ts->timer.function = cy8ctmg110_timer;
> +
> +	if (polling)
> +		hrtimer_start(&ts->timer, ktime_set(10, 0), HRTIMER_MODE_REL);
> +

Polling mode should be controlled by platform data, not by the kernel module, I think.

> +	/* Can we fall back to polling if these bits fail - something to look
> +	   at for robustness */
> +
> +	err = gpio_request(CY8CTMG110_IRQ_PIN_GPIO, "touch_irq_key");
> +	if (err < 0) {
> +		dev_err(&client->dev,
> +			"cy8ctmg110_ts: failed to request GPIO %d, error %d\n",
> +						CY8CTMG110_IRQ_PIN_GPIO, err);
> +		goto err_free_timer;
> +	}
> +
> +	err = gpio_direction_input(CY8CTMG110_IRQ_PIN_GPIO);
> +
> +	if (err < 0) {
> +		dev_err(&client->dev,
> +			"cy8ctmg110_ts: failed to configure input direction for GPIO %d, error %d\n",
> +						CY8CTMG110_IRQ_PIN_GPIO, err);
> +		goto err_free_gpio;
> +	}
> +	client->irq = gpio_to_irq(CY8CTMG110_IRQ_PIN_GPIO);
> +
> +	if (client->irq < 0) {
> +		err = client->irq;
> +		dev_err(&client->dev,
> +	"cy8ctmg110_ts: Unable to get irq number" " for GPIO %d, error %d\n",
> +						CY8CTMG110_IRQ_PIN_GPIO, err);
> +		goto err_free_gpio;
> +	}
> +	err = request_irq(client->irq, cy8ctmg110_irq_handler, IRQF_TRIGGER_RISING | IRQF_SHARED, "touch_reset_key", ts);
> +	if (err < 0) {
> +		dev_err(&client->dev,
> +			"cy8ctmg110 irq %d busy? error %d\n",
> +				client->irq, err);
> +		goto err_free_gpio;
> +	}
> +
> +	err = input_register_device(input_dev);
> +	if (!err)
> +		return 0;
> +err_free_gpio:
> +	gpio_free(CY8CTMG110_IRQ_PIN_GPIO);
> +err_free_timer:
> +	if (polling)
> +		hrtimer_cancel(&ts->timer);
> +err_free_irq:
> +	free_irq(client->irq, ts);
> +err_free_mem:
> +	input_free_device(input_dev);
> +	kfree(ts);
> +	return err;
> +}
> +
> +/*
> + * cy8ctmg110_suspend
> + * 
> + */
> +
> +static int cy8ctmg110_suspend(struct i2c_client *client, pm_message_t mesg)
> +{

Stop timer here? Also power down the device?

> +	if (device_may_wakeup(&client->dev))
> +		enable_irq_wake(client->irq);
> +
> +	return 0;
> +}
> +
> +/*
> + * cy8ctmg110_resume 
> + * 
> + */
> +
> +static int cy8ctmg110_resume(struct i2c_client *client)
> +{
> +	if (device_may_wakeup(&client->dev))
> +		disable_irq_wake(client->irq);
> +
> +	return 0;
> +}
> +
> +/*
> + * cy8ctmg110_remove
> + * 
> + */
> +
> +static int cy8ctmg110_remove(struct i2c_client *client)
> +{
> +	struct cy8ctmg110 *ts = i2c_get_clientdata(client);
> +
> +	cy8ctmg110_power(false);
> +
> +	if (polling)
> +		hrtimer_cancel(&ts->timer);

Implement close() method and move the code above there? Also do open().
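
A rough userspace sketch of the open()/close() split, under stated
assumptions: struct fake_ts and every function below are hypothetical
stand-ins, and the user counting only mimics how the input core is
assumed to call open() for the first user and close() for the last one.
Power and polling are then tied to the device actually being in use
rather than to probe()/remove():

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver state. */
struct fake_ts {
	bool powered;
	bool polling;
	unsigned int users;
};

/* Would be wired up as the input device's open() method. */
static int fake_ts_open(struct fake_ts *ts)
{
	ts->powered = true;
	ts->polling = true;
	return 0;
}

/* Would be wired up as the input device's close() method. */
static void fake_ts_close(struct fake_ts *ts)
{
	ts->polling = false;	/* stop the timer first */
	ts->powered = false;	/* then power the controller down */
}

/* Mimics the core: call open() for the first user only. */
static int fake_core_open(struct fake_ts *ts)
{
	if (ts->users++ == 0)
		return fake_ts_open(ts);
	return 0;
}

/* Mimics the core: call close() when the last user goes away. */
static void fake_core_close(struct fake_ts *ts)
{
	if (ts->users > 0 && --ts->users == 0)
		fake_ts_close(ts);
}
```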

> +	free_irq(client->irq, ts);
> +	input_unregister_device(ts->input);
> +	/* FIXME: Do we need to free the GPIO ? */
> +	kfree(ts);
> +	return 0;
> +}
> +
> +static struct i2c_device_id cy8ctmg110_idtable[] = {
> +	{CY8CTMG110_DRIVER_NAME, 1},
> +	{}
> +};
> +
> +MODULE_DEVICE_TABLE(i2c, cy8ctmg110_idtable);
> +
> +static struct i2c_driver cy8ctmg110_driver = {
> +	.driver = {
> +		   .owner = THIS_MODULE,
> +		   .name = CY8CTMG110_DRIVER_NAME,
> +		   .bus = &i2c_bus_type,
> +		   },
> +	.id_table = cy8ctmg110_idtable,
> +	.probe = cy8ctmg110_probe,
> +	.remove = cy8ctmg110_remove,
> +	.suspend = cy8ctmg110_suspend,
> +	.resume = cy8ctmg110_resume,
> +};
> +
> +static int __init cy8ctmg110_init(void)
> +{
> +	return i2c_add_driver(&cy8ctmg110_driver);
> +}
> +
> +static void __exit cy8ctmg110_exit(void)
> +{
> +	i2c_del_driver(&cy8ctmg110_driver);
> +}
> +
> +module_init(cy8ctmg110_init);
> +module_exit(cy8ctmg110_exit);
> +
> +MODULE_AUTHOR("Samuli Konttila <samuli.konttila@aavamobile.com>");
> +MODULE_DESCRIPTION("cy8ctmg110 TouchScreen Driver");
> +MODULE_LICENSE("GPL v2");
> 

-- 
Dmitry


* Re: your mail
       [not found] <20100113004939.289333186@suse.com>
@ 2010-01-13 14:57 ` scameron
  0 siblings, 0 replies; 341+ messages in thread
From: scameron @ 2010-01-13 14:57 UTC (permalink / raw)
  To: Jeff Mahoney; +Cc: Linux Kernel Mailing List, Andrew Morton, Linux SCSI

On Tue, Jan 12, 2010 at 07:49:00PM -0500, Jeff Mahoney wrote:
> Subject: [patch 5/6] hpsa: Fix section mismatch
> References: <20100113004855.550486769@suse.com>
> Content-Disposition: inline; filename=patches.rpmify/hpsa-fix-section-mismatch
> 
>  hpsa_pci_init calls hpsa_interrupt_mode which is a __devinit function.
>  hpsa_pci_init is only called by hpsa_init_one which is also __devinit, so
>  mark it __devinit as well.
> 
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> ---
>  drivers/scsi/hpsa.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -3111,7 +3111,7 @@ default_int_mode:
>  	return;
>  }
>  
> -static int hpsa_pci_init(struct ctlr_info *h, struct pci_dev *pdev)
> +static int __devinit hpsa_pci_init(struct ctlr_info *h, struct pci_dev *pdev)
>  {
>  	ushort subsystem_vendor_id, subsystem_device_id, command;
>  	__u32 board_id, scratchpad = 0;
> 

Thanks.

-- steve



* Re: your mail
  2009-05-07 10:20                   ` your mail Ingo Molnar
@ 2009-05-08  3:27                     ` Casey Schaufler
  0 siblings, 0 replies; 341+ messages in thread
From: Casey Schaufler @ 2009-05-08  3:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: James Morris, Chris Wright, Oleg Nesterov, Roland McGrath,
	Andrew Morton, linux-kernel, Al Viro, linux-security-module

Ingo Molnar wrote:
> * James Morris <jmorris@namei.org> wrote:
>
>   
>> On Thu, 7 May 2009, Chris Wright wrote:
>>
>>     
>>> * Ingo Molnar (mingo@elte.hu) wrote:
>>>       
>> [Added LSM list to the CC; please do so whenever making changes in this 
>> area...]
>>
>>     
>>>> They have no active connection to the core kernel 
>>>> ptrace_may_access() check in any case:
>>>>         
>>> Not sure what you mean:
>>>
>>> ptrace_may_access
>>>  __ptrace_may_access
>>>   security_ptrace_may_access
>>>
>>> Looks like your patch won't compile.
>>>
>>>       
>> Below is an updated version which fixes the bug, against 
>> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6#next
>>
>> Boot tested with SELinux.
>>     
>
> thanks! Below are the two patches i wrote and tested.
>   

I hate to make an assumption regarding whether or not your tests
included Smack, so I'll ask. Does "tested" mean tested with Smack?

Thank you.

> 	Ingo
>
> ----- Forwarded message from Ingo Molnar <mingo@elte.hu> -----
>
> Date: Thu, 7 May 2009 11:49:47 +0200
> From: Ingo Molnar <mingo@elte.hu>
> To: Chris Wright <chrisw@sous-sol.org>
> Subject: [patch 1/2] ptrace, security: rename ptrace_may_access =>
> 	ptrace_access_check
> Cc: Oleg Nesterov <oleg@redhat.com>, Roland McGrath <roland@redhat.com>,
> 	Andrew Morton <akpm@linux-foundation.org>,
> 	linux-kernel@vger.kernel.org, Al Viro <viro@ZenIV.linux.org.uk>
>
> The ptrace_may_access() methods are named confusingly - some 
> variants return a bool, while the security subsystem methods have a 
> retval convention.
>
> Rename it to ptrace_access_check, to reduce the confusion factor. A 
> followup patch eliminates the bool usage.
>
> [ Impact: cleanup, no code changed ]
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> Cc: Roland McGrath <roland@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Chris Wright <chrisw@sous-sol.org>
> Cc: Al Viro <viro@ZenIV.linux.org.uk>
> Cc: Oleg Nesterov <oleg@redhat.com>
> LKML-Reference: <20090507084943.GB19133@elte.hu>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  fs/proc/array.c            |    2 +-
>  fs/proc/base.c             |   10 +++++-----
>  fs/proc/task_mmu.c         |    2 +-
>  include/linux/ptrace.h     |    4 ++--
>  include/linux/security.h   |   14 +++++++-------
>  kernel/ptrace.c            |   10 +++++-----
>  security/capability.c      |    2 +-
>  security/commoncap.c       |    4 ++--
>  security/root_plug.c       |    2 +-
>  security/security.c        |    4 ++--
>  security/selinux/hooks.c   |    6 +++---
>  security/smack/smack_lsm.c |    8 ++++----
>  12 files changed, 34 insertions(+), 34 deletions(-)
>
> Index: linux/fs/proc/array.c
> ===================================================================
> --- linux.orig/fs/proc/array.c
> +++ linux/fs/proc/array.c
> @@ -366,7 +366,7 @@ static int do_task_stat(struct seq_file 
>  
>  	state = *get_task_state(task);
>  	vsize = eip = esp = 0;
> -	permitted = ptrace_may_access(task, PTRACE_MODE_READ);
> +	permitted = ptrace_access_check(task, PTRACE_MODE_READ);
>  	mm = get_task_mm(task);
>  	if (mm) {
>  		vsize = task_vsize(mm);
> Index: linux/fs/proc/base.c
> ===================================================================
> --- linux.orig/fs/proc/base.c
> +++ linux/fs/proc/base.c
> @@ -222,7 +222,7 @@ static int check_mem_permission(struct t
>  		rcu_read_lock();
>  		match = (tracehook_tracer_task(task) == current);
>  		rcu_read_unlock();
> -		if (match && ptrace_may_access(task, PTRACE_MODE_ATTACH))
> +		if (match && ptrace_access_check(task, PTRACE_MODE_ATTACH))
>  			return 0;
>  	}
>  
> @@ -242,7 +242,7 @@ struct mm_struct *mm_for_maps(struct tas
>  	if (task->mm != mm)
>  		goto out;
>  	if (task->mm != current->mm &&
> -	    __ptrace_may_access(task, PTRACE_MODE_READ) < 0)
> +	    __ptrace_access_check(task, PTRACE_MODE_READ) < 0)
>  		goto out;
>  	task_unlock(task);
>  	return mm;
> @@ -322,7 +322,7 @@ static int proc_pid_wchan(struct task_st
>  	wchan = get_wchan(task);
>  
>  	if (lookup_symbol_name(wchan, symname) < 0)
> -		if (!ptrace_may_access(task, PTRACE_MODE_READ))
> +		if (!ptrace_access_check(task, PTRACE_MODE_READ))
>  			return 0;
>  		else
>  			return sprintf(buffer, "%lu", wchan);
> @@ -559,7 +559,7 @@ static int proc_fd_access_allowed(struct
>  	 */
>  	task = get_proc_task(inode);
>  	if (task) {
> -		allowed = ptrace_may_access(task, PTRACE_MODE_READ);
> +		allowed = ptrace_access_check(task, PTRACE_MODE_READ);
>  		put_task_struct(task);
>  	}
>  	return allowed;
> @@ -938,7 +938,7 @@ static ssize_t environ_read(struct file 
>  	if (!task)
>  		goto out_no_task;
>  
> -	if (!ptrace_may_access(task, PTRACE_MODE_READ))
> +	if (!ptrace_access_check(task, PTRACE_MODE_READ))
>  		goto out;
>  
>  	ret = -ENOMEM;
> Index: linux/fs/proc/task_mmu.c
> ===================================================================
> --- linux.orig/fs/proc/task_mmu.c
> +++ linux/fs/proc/task_mmu.c
> @@ -656,7 +656,7 @@ static ssize_t pagemap_read(struct file 
>  		goto out;
>  
>  	ret = -EACCES;
> -	if (!ptrace_may_access(task, PTRACE_MODE_READ))
> +	if (!ptrace_access_check(task, PTRACE_MODE_READ))
>  		goto out_task;
>  
>  	ret = -EINVAL;
> Index: linux/include/linux/ptrace.h
> ===================================================================
> --- linux.orig/include/linux/ptrace.h
> +++ linux/include/linux/ptrace.h
> @@ -99,9 +99,9 @@ extern void ptrace_fork(struct task_stru
>  #define PTRACE_MODE_READ   1
>  #define PTRACE_MODE_ATTACH 2
>  /* Returns 0 on success, -errno on denial. */
> -extern int __ptrace_may_access(struct task_struct *task, unsigned int mode);
> +extern int __ptrace_access_check(struct task_struct *task, unsigned int mode);
>  /* Returns true on success, false on denial. */
> -extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
> +extern bool ptrace_access_check(struct task_struct *task, unsigned int mode);
>  
>  static inline int ptrace_reparented(struct task_struct *child)
>  {
> Index: linux/include/linux/security.h
> ===================================================================
> --- linux.orig/include/linux/security.h
> +++ linux/include/linux/security.h
> @@ -52,7 +52,7 @@ struct audit_krule;
>  extern int cap_capable(struct task_struct *tsk, const struct cred *cred,
>  		       int cap, int audit);
>  extern int cap_settime(struct timespec *ts, struct timezone *tz);
> -extern int cap_ptrace_may_access(struct task_struct *child, unsigned int mode);
> +extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
>  extern int cap_ptrace_traceme(struct task_struct *parent);
>  extern int cap_capget(struct task_struct *target, kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted);
>  extern int cap_capset(struct cred *new, const struct cred *old,
> @@ -1209,7 +1209,7 @@ static inline void security_free_mnt_opt
>   *	@alter contains the flag indicating whether changes are to be made.
>   *	Return 0 if permission is granted.
>   *
> - * @ptrace_may_access:
> + * @ptrace_access_check:
>   *	Check permission before allowing the current process to trace the
>   *	@child process.
>   *	Security modules may also want to perform a process tracing check
> @@ -1224,7 +1224,7 @@ static inline void security_free_mnt_opt
>   *	Check that the @parent process has sufficient permission to trace the
>   *	current process before allowing the current process to present itself
>   *	to the @parent process for tracing.
> - *	The parent process will still have to undergo the ptrace_may_access
> + *	The parent process will still have to undergo the ptrace_access_check
>   *	checks before it is allowed to trace this one.
>   *	@parent contains the task_struct structure for debugger process.
>   *	Return 0 if permission is granted.
> @@ -1336,7 +1336,7 @@ static inline void security_free_mnt_opt
>  struct security_operations {
>  	char name[SECURITY_NAME_MAX + 1];
>  
> -	int (*ptrace_may_access) (struct task_struct *child, unsigned int mode);
> +	int (*ptrace_access_check) (struct task_struct *child, unsigned int mode);
>  	int (*ptrace_traceme) (struct task_struct *parent);
>  	int (*capget) (struct task_struct *target,
>  		       kernel_cap_t *effective,
> @@ -1617,7 +1617,7 @@ extern int security_module_enable(struct
>  extern int register_security(struct security_operations *ops);
>  
>  /* Security operations */
> -int security_ptrace_may_access(struct task_struct *child, unsigned int mode);
> +int security_ptrace_access_check(struct task_struct *child, unsigned int mode);
>  int security_ptrace_traceme(struct task_struct *parent);
>  int security_capget(struct task_struct *target,
>  		    kernel_cap_t *effective,
> @@ -1798,10 +1798,10 @@ static inline int security_init(void)
>  	return 0;
>  }
>  
> -static inline int security_ptrace_may_access(struct task_struct *child,
> +static inline int security_ptrace_access_check(struct task_struct *child,
>  					     unsigned int mode)
>  {
> -	return cap_ptrace_may_access(child, mode);
> +	return cap_ptrace_access_check(child, mode);
>  }
>  
>  static inline int security_ptrace_traceme(struct task_struct *parent)
> Index: linux/kernel/ptrace.c
> ===================================================================
> --- linux.orig/kernel/ptrace.c
> +++ linux/kernel/ptrace.c
> @@ -127,7 +127,7 @@ int ptrace_check_attach(struct task_stru
>  	return ret;
>  }
>  
> -int __ptrace_may_access(struct task_struct *task, unsigned int mode)
> +int __ptrace_access_check(struct task_struct *task, unsigned int mode)
>  {
>  	const struct cred *cred = current_cred(), *tcred;
>  
> @@ -162,14 +162,14 @@ int __ptrace_may_access(struct task_stru
>  	if (!dumpable && !capable(CAP_SYS_PTRACE))
>  		return -EPERM;
>  
> -	return security_ptrace_may_access(task, mode);
> +	return security_ptrace_access_check(task, mode);
>  }
>  
> -bool ptrace_may_access(struct task_struct *task, unsigned int mode)
> +bool ptrace_access_check(struct task_struct *task, unsigned int mode)
>  {
>  	int err;
>  	task_lock(task);
> -	err = __ptrace_may_access(task, mode);
> +	err = __ptrace_access_check(task, mode);
>  	task_unlock(task);
>  	return !err;
>  }
> @@ -217,7 +217,7 @@ repeat:
>  	/* the same process cannot be attached many times */
>  	if (task->ptrace & PT_PTRACED)
>  		goto bad;
> -	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH);
> +	retval = __ptrace_access_check(task, PTRACE_MODE_ATTACH);
>  	if (retval)
>  		goto bad;
>  
> Index: linux/security/capability.c
> ===================================================================
> --- linux.orig/security/capability.c
> +++ linux/security/capability.c
> @@ -863,7 +863,7 @@ struct security_operations default_secur
>  
>  void security_fixup_ops(struct security_operations *ops)
>  {
> -	set_to_cap_if_null(ops, ptrace_may_access);
> +	set_to_cap_if_null(ops, ptrace_access_check);
>  	set_to_cap_if_null(ops, ptrace_traceme);
>  	set_to_cap_if_null(ops, capget);
>  	set_to_cap_if_null(ops, capset);
> Index: linux/security/commoncap.c
> ===================================================================
> --- linux.orig/security/commoncap.c
> +++ linux/security/commoncap.c
> @@ -79,7 +79,7 @@ int cap_settime(struct timespec *ts, str
>  }
>  
>  /**
> - * cap_ptrace_may_access - Determine whether the current process may access
> + * cap_ptrace_access_check - Determine whether the current process may access
>   *			   another
>   * @child: The process to be accessed
>   * @mode: The mode of attachment.
> @@ -87,7 +87,7 @@ int cap_settime(struct timespec *ts, str
>   * Determine whether a process may access another, returning 0 if permission
>   * granted, -ve if denied.
>   */
> -int cap_ptrace_may_access(struct task_struct *child, unsigned int mode)
> +int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
>  {
>  	int ret = 0;
>  
> Index: linux/security/root_plug.c
> ===================================================================
> --- linux.orig/security/root_plug.c
> +++ linux/security/root_plug.c
> @@ -72,7 +72,7 @@ static int rootplug_bprm_check_security 
>  
>  static struct security_operations rootplug_security_ops = {
>  	/* Use the capability functions for some of the hooks */
> -	.ptrace_may_access =		cap_ptrace_may_access,
> +	.ptrace_access_check =		cap_ptrace_access_check,
>  	.ptrace_traceme =		cap_ptrace_traceme,
>  	.capget =			cap_capget,
>  	.capset =			cap_capset,
> Index: linux/security/security.c
> ===================================================================
> --- linux.orig/security/security.c
> +++ linux/security/security.c
> @@ -127,9 +127,9 @@ int register_security(struct security_op
>  
>  /* Security operations */
>  
> -int security_ptrace_may_access(struct task_struct *child, unsigned int mode)
> +int security_ptrace_access_check(struct task_struct *child, unsigned int mode)
>  {
> -	return security_ops->ptrace_may_access(child, mode);
> +	return security_ops->ptrace_access_check(child, mode);
>  }
>  
>  int security_ptrace_traceme(struct task_struct *parent)
> Index: linux/security/selinux/hooks.c
> ===================================================================
> --- linux.orig/security/selinux/hooks.c
> +++ linux/security/selinux/hooks.c
> @@ -1854,12 +1854,12 @@ static inline u32 open_file_to_av(struct
>  
>  /* Hook functions begin here. */
>  
> -static int selinux_ptrace_may_access(struct task_struct *child,
> +static int selinux_ptrace_access_check(struct task_struct *child,
>  				     unsigned int mode)
>  {
>  	int rc;
>  
> -	rc = cap_ptrace_may_access(child, mode);
> +	rc = cap_ptrace_access_check(child, mode);
>  	if (rc)
>  		return rc;
>  
> @@ -5318,7 +5318,7 @@ static int selinux_key_getsecurity(struc
>  static struct security_operations selinux_ops = {
>  	.name =				"selinux",
>  
> -	.ptrace_may_access =		selinux_ptrace_may_access,
> +	.ptrace_access_check =		selinux_ptrace_access_check,
>  	.ptrace_traceme =		selinux_ptrace_traceme,
>  	.capget =			selinux_capget,
>  	.capset =			selinux_capset,
> Index: linux/security/smack/smack_lsm.c
> ===================================================================
> --- linux.orig/security/smack/smack_lsm.c
> +++ linux/security/smack/smack_lsm.c
> @@ -92,7 +92,7 @@ struct inode_smack *new_inode_smack(char
>   */
>  
>  /**
> - * smack_ptrace_may_access - Smack approval on PTRACE_ATTACH
> + * smack_ptrace_access_check - Smack approval on PTRACE_ATTACH
>   * @ctp: child task pointer
>   * @mode: ptrace attachment mode
>   *
> @@ -100,11 +100,11 @@ struct inode_smack *new_inode_smack(char
>   *
>   * Do the capability checks, and require read and write.
>   */
> -static int smack_ptrace_may_access(struct task_struct *ctp, unsigned int mode)
> +static int smack_ptrace_access_check(struct task_struct *ctp, unsigned int mode)
>  {
>  	int rc;
>  
> -	rc = cap_ptrace_may_access(ctp, mode);
> +	rc = cap_ptrace_access_check(ctp, mode);
>  	if (rc != 0)
>  		return rc;
>  
> @@ -2826,7 +2826,7 @@ static void smack_release_secctx(char *s
>  struct security_operations smack_ops = {
>  	.name =				"smack",
>  
> -	.ptrace_may_access =		smack_ptrace_may_access,
> +	.ptrace_access_check =		smack_ptrace_access_check,
>  	.ptrace_traceme =		smack_ptrace_traceme,
>  	.capget = 			cap_capget,
>  	.capset = 			cap_capset,
>
> ----- End forwarded message -----
> ----- Forwarded message from Ingo Molnar <mingo@elte.hu> -----
>
> Date: Thu, 7 May 2009 11:50:54 +0200
> From: Ingo Molnar <mingo@elte.hu>
> To: Chris Wright <chrisw@sous-sol.org>
> Subject: [patch 2/2] ptrace: turn ptrace_access_check() into a retval
> 	function
> Cc: Oleg Nesterov <oleg@redhat.com>, Roland McGrath <roland@redhat.com>,
> 	Andrew Morton <akpm@linux-foundation.org>,
> 	linux-kernel@vger.kernel.org, Al Viro <viro@ZenIV.linux.org.uk>
>
> ptrace_access_check() returns a bool, while most of the ptrace 
> access check machinery works with Linux retvals (where 0 indicates 
> success, negative indicates an error).
>
> So eliminate the bool and invert the usage at the call sites.
>
> ( Note: "< 0" checks are used instead of !0 checks, because that's
>   the convention for retval checks and it results in similarly fast
>   assembly code. )
>
> [ Impact: cleanup ]
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  fs/proc/array.c        |    2 +-
>  fs/proc/base.c         |    8 ++++----
>  fs/proc/task_mmu.c     |    2 +-
>  include/linux/ptrace.h |    2 +-
>  kernel/ptrace.c        |    6 ++++--
>  5 files changed, 11 insertions(+), 9 deletions(-)
>
> Index: linux/fs/proc/array.c
> ===================================================================
> --- linux.orig/fs/proc/array.c
> +++ linux/fs/proc/array.c
> @@ -366,7 +366,7 @@ static int do_task_stat(struct seq_file 
>  
>  	state = *get_task_state(task);
>  	vsize = eip = esp = 0;
> -	permitted = ptrace_access_check(task, PTRACE_MODE_READ);
> +	permitted = !ptrace_access_check(task, PTRACE_MODE_READ);
>  	mm = get_task_mm(task);
>  	if (mm) {
>  		vsize = task_vsize(mm);
> Index: linux/fs/proc/base.c
> ===================================================================
> --- linux.orig/fs/proc/base.c
> +++ linux/fs/proc/base.c
> @@ -222,7 +222,7 @@ static int check_mem_permission(struct t
>  		rcu_read_lock();
>  		match = (tracehook_tracer_task(task) == current);
>  		rcu_read_unlock();
> -		if (match && ptrace_access_check(task, PTRACE_MODE_ATTACH))
> +		if (match && !ptrace_access_check(task, PTRACE_MODE_ATTACH))
>  			return 0;
>  	}
>  
> @@ -322,7 +322,7 @@ static int proc_pid_wchan(struct task_st
>  	wchan = get_wchan(task);
>  
>  	if (lookup_symbol_name(wchan, symname) < 0)
> -		if (!ptrace_access_check(task, PTRACE_MODE_READ))
> +		if (ptrace_access_check(task, PTRACE_MODE_READ) < 0)
>  			return 0;
>  		else
>  			return sprintf(buffer, "%lu", wchan);
> @@ -559,7 +559,7 @@ static int proc_fd_access_allowed(struct
>  	 */
>  	task = get_proc_task(inode);
>  	if (task) {
> -		allowed = ptrace_access_check(task, PTRACE_MODE_READ);
> +		allowed = !ptrace_access_check(task, PTRACE_MODE_READ);
>  		put_task_struct(task);
>  	}
>  	return allowed;
> @@ -938,7 +938,7 @@ static ssize_t environ_read(struct file 
>  	if (!task)
>  		goto out_no_task;
>  
> -	if (!ptrace_access_check(task, PTRACE_MODE_READ))
> +	if (ptrace_access_check(task, PTRACE_MODE_READ) < 0)
>  		goto out;
>  
>  	ret = -ENOMEM;
> Index: linux/fs/proc/task_mmu.c
> ===================================================================
> --- linux.orig/fs/proc/task_mmu.c
> +++ linux/fs/proc/task_mmu.c
> @@ -656,7 +656,7 @@ static ssize_t pagemap_read(struct file 
>  		goto out;
>  
>  	ret = -EACCES;
> -	if (!ptrace_access_check(task, PTRACE_MODE_READ))
> +	if (ptrace_access_check(task, PTRACE_MODE_READ) < 0)
>  		goto out_task;
>  
>  	ret = -EINVAL;
> Index: linux/include/linux/ptrace.h
> ===================================================================
> --- linux.orig/include/linux/ptrace.h
> +++ linux/include/linux/ptrace.h
> @@ -101,7 +101,7 @@ extern void ptrace_fork(struct task_stru
>  /* Returns 0 on success, -errno on denial. */
>  extern int __ptrace_access_check(struct task_struct *task, unsigned int mode);
>  /* Returns true on success, false on denial. */
> -extern bool ptrace_access_check(struct task_struct *task, unsigned int mode);
> +extern int ptrace_access_check(struct task_struct *task, unsigned int mode);
>  
>  static inline int ptrace_reparented(struct task_struct *child)
>  {
> Index: linux/kernel/ptrace.c
> ===================================================================
> --- linux.orig/kernel/ptrace.c
> +++ linux/kernel/ptrace.c
> @@ -165,13 +165,15 @@ int __ptrace_access_check(struct task_st
>  	return security_ptrace_access_check(task, mode);
>  }
>  
> -bool ptrace_access_check(struct task_struct *task, unsigned int mode)
> +int ptrace_access_check(struct task_struct *task, unsigned int mode)
>  {
>  	int err;
> +
>  	task_lock(task);
>  	err = __ptrace_access_check(task, mode);
>  	task_unlock(task);
> -	return !err;
> +
> +	return err;
>  }
>  
>  int ptrace_attach(struct task_struct *task)
>
> ----- End forwarded message -----
>
>
>   



* Re: your mail
  2009-05-07  9:54                 ` James Morris
@ 2009-05-07 10:20                   ` Ingo Molnar
  2009-05-08  3:27                     ` Casey Schaufler
  0 siblings, 1 reply; 341+ messages in thread
From: Ingo Molnar @ 2009-05-07 10:20 UTC (permalink / raw)
  To: James Morris
  Cc: Chris Wright, Oleg Nesterov, Roland McGrath, Andrew Morton,
	linux-kernel, Al Viro, linux-security-module


* James Morris <jmorris@namei.org> wrote:

> On Thu, 7 May 2009, Chris Wright wrote:
> 
> > * Ingo Molnar (mingo@elte.hu) wrote:
> 
> [Added LSM list to the CC; please do so whenever making changes in this 
> area...]
> 
> > > They have no active connection to the core kernel 
> > > ptrace_may_access() check in any case:
> > 
> > Not sure what you mean:
> > 
> > ptrace_may_access
> >  __ptrace_may_access
> >   security_ptrace_may_access
> > 
> > Looks like your patch won't compile.
> > 
> 
> Below is an updated version which fixes the bug, against 
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6#next
> 
> Boot tested with SELinux.

thanks! Below are the two patches i wrote and tested.

	Ingo

----- Forwarded message from Ingo Molnar <mingo@elte.hu> -----

Date: Thu, 7 May 2009 11:49:47 +0200
From: Ingo Molnar <mingo@elte.hu>
To: Chris Wright <chrisw@sous-sol.org>
Subject: [patch 1/2] ptrace, security: rename ptrace_may_access =>
	ptrace_access_check
Cc: Oleg Nesterov <oleg@redhat.com>, Roland McGrath <roland@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Al Viro <viro@ZenIV.linux.org.uk>

The ptrace_may_access() methods are named confusingly - some 
variants return a bool, while the security subsystem methods have a 
retval convention.

Rename it to ptrace_access_check, to reduce the confusion factor. A 
followup patch eliminates the bool usage.

[ Impact: cleanup, no code changed ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
LKML-Reference: <20090507084943.GB19133@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 fs/proc/array.c            |    2 +-
 fs/proc/base.c             |   10 +++++-----
 fs/proc/task_mmu.c         |    2 +-
 include/linux/ptrace.h     |    4 ++--
 include/linux/security.h   |   14 +++++++-------
 kernel/ptrace.c            |   10 +++++-----
 security/capability.c      |    2 +-
 security/commoncap.c       |    4 ++--
 security/root_plug.c       |    2 +-
 security/security.c        |    4 ++--
 security/selinux/hooks.c   |    6 +++---
 security/smack/smack_lsm.c |    8 ++++----
 12 files changed, 34 insertions(+), 34 deletions(-)

Index: linux/fs/proc/array.c
===================================================================
--- linux.orig/fs/proc/array.c
+++ linux/fs/proc/array.c
@@ -366,7 +366,7 @@ static int do_task_stat(struct seq_file 
 
 	state = *get_task_state(task);
 	vsize = eip = esp = 0;
-	permitted = ptrace_may_access(task, PTRACE_MODE_READ);
+	permitted = ptrace_access_check(task, PTRACE_MODE_READ);
 	mm = get_task_mm(task);
 	if (mm) {
 		vsize = task_vsize(mm);
Index: linux/fs/proc/base.c
===================================================================
--- linux.orig/fs/proc/base.c
+++ linux/fs/proc/base.c
@@ -222,7 +222,7 @@ static int check_mem_permission(struct t
 		rcu_read_lock();
 		match = (tracehook_tracer_task(task) == current);
 		rcu_read_unlock();
-		if (match && ptrace_may_access(task, PTRACE_MODE_ATTACH))
+		if (match && ptrace_access_check(task, PTRACE_MODE_ATTACH))
 			return 0;
 	}
 
@@ -242,7 +242,7 @@ struct mm_struct *mm_for_maps(struct tas
 	if (task->mm != mm)
 		goto out;
 	if (task->mm != current->mm &&
-	    __ptrace_may_access(task, PTRACE_MODE_READ) < 0)
+	    __ptrace_access_check(task, PTRACE_MODE_READ) < 0)
 		goto out;
 	task_unlock(task);
 	return mm;
@@ -322,7 +322,7 @@ static int proc_pid_wchan(struct task_st
 	wchan = get_wchan(task);
 
 	if (lookup_symbol_name(wchan, symname) < 0)
-		if (!ptrace_may_access(task, PTRACE_MODE_READ))
+		if (!ptrace_access_check(task, PTRACE_MODE_READ))
 			return 0;
 		else
 			return sprintf(buffer, "%lu", wchan);
@@ -559,7 +559,7 @@ static int proc_fd_access_allowed(struct
 	 */
 	task = get_proc_task(inode);
 	if (task) {
-		allowed = ptrace_may_access(task, PTRACE_MODE_READ);
+		allowed = ptrace_access_check(task, PTRACE_MODE_READ);
 		put_task_struct(task);
 	}
 	return allowed;
@@ -938,7 +938,7 @@ static ssize_t environ_read(struct file 
 	if (!task)
 		goto out_no_task;
 
-	if (!ptrace_may_access(task, PTRACE_MODE_READ))
+	if (!ptrace_access_check(task, PTRACE_MODE_READ))
 		goto out;
 
 	ret = -ENOMEM;
Index: linux/fs/proc/task_mmu.c
===================================================================
--- linux.orig/fs/proc/task_mmu.c
+++ linux/fs/proc/task_mmu.c
@@ -656,7 +656,7 @@ static ssize_t pagemap_read(struct file 
 		goto out;
 
 	ret = -EACCES;
-	if (!ptrace_may_access(task, PTRACE_MODE_READ))
+	if (!ptrace_access_check(task, PTRACE_MODE_READ))
 		goto out_task;
 
 	ret = -EINVAL;
Index: linux/include/linux/ptrace.h
===================================================================
--- linux.orig/include/linux/ptrace.h
+++ linux/include/linux/ptrace.h
@@ -99,9 +99,9 @@ extern void ptrace_fork(struct task_stru
 #define PTRACE_MODE_READ   1
 #define PTRACE_MODE_ATTACH 2
 /* Returns 0 on success, -errno on denial. */
-extern int __ptrace_may_access(struct task_struct *task, unsigned int mode);
+extern int __ptrace_access_check(struct task_struct *task, unsigned int mode);
 /* Returns true on success, false on denial. */
-extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
+extern bool ptrace_access_check(struct task_struct *task, unsigned int mode);
 
 static inline int ptrace_reparented(struct task_struct *child)
 {
Index: linux/include/linux/security.h
===================================================================
--- linux.orig/include/linux/security.h
+++ linux/include/linux/security.h
@@ -52,7 +52,7 @@ struct audit_krule;
 extern int cap_capable(struct task_struct *tsk, const struct cred *cred,
 		       int cap, int audit);
 extern int cap_settime(struct timespec *ts, struct timezone *tz);
-extern int cap_ptrace_may_access(struct task_struct *child, unsigned int mode);
+extern int cap_ptrace_access_check(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
 extern int cap_capget(struct task_struct *target, kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted);
 extern int cap_capset(struct cred *new, const struct cred *old,
@@ -1209,7 +1209,7 @@ static inline void security_free_mnt_opt
  *	@alter contains the flag indicating whether changes are to be made.
  *	Return 0 if permission is granted.
  *
- * @ptrace_may_access:
+ * @ptrace_access_check:
  *	Check permission before allowing the current process to trace the
  *	@child process.
  *	Security modules may also want to perform a process tracing check
@@ -1224,7 +1224,7 @@ static inline void security_free_mnt_opt
  *	Check that the @parent process has sufficient permission to trace the
  *	current process before allowing the current process to present itself
  *	to the @parent process for tracing.
- *	The parent process will still have to undergo the ptrace_may_access
+ *	The parent process will still have to undergo the ptrace_access_check
  *	checks before it is allowed to trace this one.
  *	@parent contains the task_struct structure for debugger process.
  *	Return 0 if permission is granted.
@@ -1336,7 +1336,7 @@ static inline void security_free_mnt_opt
 struct security_operations {
 	char name[SECURITY_NAME_MAX + 1];
 
-	int (*ptrace_may_access) (struct task_struct *child, unsigned int mode);
+	int (*ptrace_access_check) (struct task_struct *child, unsigned int mode);
 	int (*ptrace_traceme) (struct task_struct *parent);
 	int (*capget) (struct task_struct *target,
 		       kernel_cap_t *effective,
@@ -1617,7 +1617,7 @@ extern int security_module_enable(struct
 extern int register_security(struct security_operations *ops);
 
 /* Security operations */
-int security_ptrace_may_access(struct task_struct *child, unsigned int mode);
+int security_ptrace_access_check(struct task_struct *child, unsigned int mode);
 int security_ptrace_traceme(struct task_struct *parent);
 int security_capget(struct task_struct *target,
 		    kernel_cap_t *effective,
@@ -1798,10 +1798,10 @@ static inline int security_init(void)
 	return 0;
 }
 
-static inline int security_ptrace_may_access(struct task_struct *child,
+static inline int security_ptrace_access_check(struct task_struct *child,
 					     unsigned int mode)
 {
-	return cap_ptrace_may_access(child, mode);
+	return cap_ptrace_access_check(child, mode);
 }
 
 static inline int security_ptrace_traceme(struct task_struct *parent)
Index: linux/kernel/ptrace.c
===================================================================
--- linux.orig/kernel/ptrace.c
+++ linux/kernel/ptrace.c
@@ -127,7 +127,7 @@ int ptrace_check_attach(struct task_stru
 	return ret;
 }
 
-int __ptrace_may_access(struct task_struct *task, unsigned int mode)
+int __ptrace_access_check(struct task_struct *task, unsigned int mode)
 {
 	const struct cred *cred = current_cred(), *tcred;
 
@@ -162,14 +162,14 @@ int __ptrace_may_access(struct task_stru
 	if (!dumpable && !capable(CAP_SYS_PTRACE))
 		return -EPERM;
 
-	return security_ptrace_may_access(task, mode);
+	return security_ptrace_access_check(task, mode);
 }
 
-bool ptrace_may_access(struct task_struct *task, unsigned int mode)
+bool ptrace_access_check(struct task_struct *task, unsigned int mode)
 {
 	int err;
 	task_lock(task);
-	err = __ptrace_may_access(task, mode);
+	err = __ptrace_access_check(task, mode);
 	task_unlock(task);
 	return !err;
 }
@@ -217,7 +217,7 @@ repeat:
 	/* the same process cannot be attached many times */
 	if (task->ptrace & PT_PTRACED)
 		goto bad;
-	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH);
+	retval = __ptrace_access_check(task, PTRACE_MODE_ATTACH);
 	if (retval)
 		goto bad;
 
Index: linux/security/capability.c
===================================================================
--- linux.orig/security/capability.c
+++ linux/security/capability.c
@@ -863,7 +863,7 @@ struct security_operations default_secur
 
 void security_fixup_ops(struct security_operations *ops)
 {
-	set_to_cap_if_null(ops, ptrace_may_access);
+	set_to_cap_if_null(ops, ptrace_access_check);
 	set_to_cap_if_null(ops, ptrace_traceme);
 	set_to_cap_if_null(ops, capget);
 	set_to_cap_if_null(ops, capset);
Index: linux/security/commoncap.c
===================================================================
--- linux.orig/security/commoncap.c
+++ linux/security/commoncap.c
@@ -79,7 +79,7 @@ int cap_settime(struct timespec *ts, str
 }
 
 /**
- * cap_ptrace_may_access - Determine whether the current process may access
+ * cap_ptrace_access_check - Determine whether the current process may access
  *			   another
  * @child: The process to be accessed
  * @mode: The mode of attachment.
@@ -87,7 +87,7 @@ int cap_settime(struct timespec *ts, str
  * Determine whether a process may access another, returning 0 if permission
  * granted, -ve if denied.
  */
-int cap_ptrace_may_access(struct task_struct *child, unsigned int mode)
+int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
 {
 	int ret = 0;
 
Index: linux/security/root_plug.c
===================================================================
--- linux.orig/security/root_plug.c
+++ linux/security/root_plug.c
@@ -72,7 +72,7 @@ static int rootplug_bprm_check_security 
 
 static struct security_operations rootplug_security_ops = {
 	/* Use the capability functions for some of the hooks */
-	.ptrace_may_access =		cap_ptrace_may_access,
+	.ptrace_access_check =		cap_ptrace_access_check,
 	.ptrace_traceme =		cap_ptrace_traceme,
 	.capget =			cap_capget,
 	.capset =			cap_capset,
Index: linux/security/security.c
===================================================================
--- linux.orig/security/security.c
+++ linux/security/security.c
@@ -127,9 +127,9 @@ int register_security(struct security_op
 
 /* Security operations */
 
-int security_ptrace_may_access(struct task_struct *child, unsigned int mode)
+int security_ptrace_access_check(struct task_struct *child, unsigned int mode)
 {
-	return security_ops->ptrace_may_access(child, mode);
+	return security_ops->ptrace_access_check(child, mode);
 }
 
 int security_ptrace_traceme(struct task_struct *parent)
Index: linux/security/selinux/hooks.c
===================================================================
--- linux.orig/security/selinux/hooks.c
+++ linux/security/selinux/hooks.c
@@ -1854,12 +1854,12 @@ static inline u32 open_file_to_av(struct
 
 /* Hook functions begin here. */
 
-static int selinux_ptrace_may_access(struct task_struct *child,
+static int selinux_ptrace_access_check(struct task_struct *child,
 				     unsigned int mode)
 {
 	int rc;
 
-	rc = cap_ptrace_may_access(child, mode);
+	rc = cap_ptrace_access_check(child, mode);
 	if (rc)
 		return rc;
 
@@ -5318,7 +5318,7 @@ static int selinux_key_getsecurity(struc
 static struct security_operations selinux_ops = {
 	.name =				"selinux",
 
-	.ptrace_may_access =		selinux_ptrace_may_access,
+	.ptrace_access_check =		selinux_ptrace_access_check,
 	.ptrace_traceme =		selinux_ptrace_traceme,
 	.capget =			selinux_capget,
 	.capset =			selinux_capset,
Index: linux/security/smack/smack_lsm.c
===================================================================
--- linux.orig/security/smack/smack_lsm.c
+++ linux/security/smack/smack_lsm.c
@@ -92,7 +92,7 @@ struct inode_smack *new_inode_smack(char
  */
 
 /**
- * smack_ptrace_may_access - Smack approval on PTRACE_ATTACH
+ * smack_ptrace_access_check - Smack approval on PTRACE_ATTACH
  * @ctp: child task pointer
  * @mode: ptrace attachment mode
  *
@@ -100,11 +100,11 @@ struct inode_smack *new_inode_smack(char
  *
  * Do the capability checks, and require read and write.
  */
-static int smack_ptrace_may_access(struct task_struct *ctp, unsigned int mode)
+static int smack_ptrace_access_check(struct task_struct *ctp, unsigned int mode)
 {
 	int rc;
 
-	rc = cap_ptrace_may_access(ctp, mode);
+	rc = cap_ptrace_access_check(ctp, mode);
 	if (rc != 0)
 		return rc;
 
@@ -2826,7 +2826,7 @@ static void smack_release_secctx(char *s
 struct security_operations smack_ops = {
 	.name =				"smack",
 
-	.ptrace_may_access =		smack_ptrace_may_access,
+	.ptrace_access_check =		smack_ptrace_access_check,
 	.ptrace_traceme =		smack_ptrace_traceme,
 	.capget = 			cap_capget,
 	.capset = 			cap_capset,

----- End forwarded message -----
----- Forwarded message from Ingo Molnar <mingo@elte.hu> -----

Date: Thu, 7 May 2009 11:50:54 +0200
From: Ingo Molnar <mingo@elte.hu>
To: Chris Wright <chrisw@sous-sol.org>
Subject: [patch 2/2] ptrace: turn ptrace_access_check() into a retval
	function
Cc: Oleg Nesterov <oleg@redhat.com>, Roland McGrath <roland@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Al Viro <viro@ZenIV.linux.org.uk>

ptrace_access_check() returns a bool, while most of the ptrace 
access check machinery works with Linux retvals (where 0 indicates 
success, negative indicates an error).

So eliminate the bool and invert the usage at the call sites.

( Note: "< 0" checks are used instead of !0 checks, because that's
  the convention for retval checks and it results in similarly fast
  assembly code. )

[ Impact: cleanup ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 fs/proc/array.c        |    2 +-
 fs/proc/base.c         |    8 ++++----
 fs/proc/task_mmu.c     |    2 +-
 include/linux/ptrace.h |    2 +-
 kernel/ptrace.c        |    6 ++++--
 5 files changed, 11 insertions(+), 9 deletions(-)

Index: linux/fs/proc/array.c
===================================================================
--- linux.orig/fs/proc/array.c
+++ linux/fs/proc/array.c
@@ -366,7 +366,7 @@ static int do_task_stat(struct seq_file 
 
 	state = *get_task_state(task);
 	vsize = eip = esp = 0;
-	permitted = ptrace_access_check(task, PTRACE_MODE_READ);
+	permitted = !ptrace_access_check(task, PTRACE_MODE_READ);
 	mm = get_task_mm(task);
 	if (mm) {
 		vsize = task_vsize(mm);
Index: linux/fs/proc/base.c
===================================================================
--- linux.orig/fs/proc/base.c
+++ linux/fs/proc/base.c
@@ -222,7 +222,7 @@ static int check_mem_permission(struct t
 		rcu_read_lock();
 		match = (tracehook_tracer_task(task) == current);
 		rcu_read_unlock();
-		if (match && ptrace_access_check(task, PTRACE_MODE_ATTACH))
+		if (match && !ptrace_access_check(task, PTRACE_MODE_ATTACH))
 			return 0;
 	}
 
@@ -322,7 +322,7 @@ static int proc_pid_wchan(struct task_st
 	wchan = get_wchan(task);
 
 	if (lookup_symbol_name(wchan, symname) < 0)
-		if (!ptrace_access_check(task, PTRACE_MODE_READ))
+		if (ptrace_access_check(task, PTRACE_MODE_READ) < 0)
 			return 0;
 		else
 			return sprintf(buffer, "%lu", wchan);
@@ -559,7 +559,7 @@ static int proc_fd_access_allowed(struct
 	 */
 	task = get_proc_task(inode);
 	if (task) {
-		allowed = ptrace_access_check(task, PTRACE_MODE_READ);
+		allowed = !ptrace_access_check(task, PTRACE_MODE_READ);
 		put_task_struct(task);
 	}
 	return allowed;
@@ -938,7 +938,7 @@ static ssize_t environ_read(struct file 
 	if (!task)
 		goto out_no_task;
 
-	if (!ptrace_access_check(task, PTRACE_MODE_READ))
+	if (ptrace_access_check(task, PTRACE_MODE_READ) < 0)
 		goto out;
 
 	ret = -ENOMEM;
Index: linux/fs/proc/task_mmu.c
===================================================================
--- linux.orig/fs/proc/task_mmu.c
+++ linux/fs/proc/task_mmu.c
@@ -656,7 +656,7 @@ static ssize_t pagemap_read(struct file 
 		goto out;
 
 	ret = -EACCES;
-	if (!ptrace_access_check(task, PTRACE_MODE_READ))
+	if (ptrace_access_check(task, PTRACE_MODE_READ) < 0)
 		goto out_task;
 
 	ret = -EINVAL;
Index: linux/include/linux/ptrace.h
===================================================================
--- linux.orig/include/linux/ptrace.h
+++ linux/include/linux/ptrace.h
@@ -101,7 +101,7 @@ extern void ptrace_fork(struct task_stru
 /* Returns 0 on success, -errno on denial. */
 extern int __ptrace_access_check(struct task_struct *task, unsigned int mode);
 /* Returns true on success, false on denial. */
-extern bool ptrace_access_check(struct task_struct *task, unsigned int mode);
+extern int ptrace_access_check(struct task_struct *task, unsigned int mode);
 
 static inline int ptrace_reparented(struct task_struct *child)
 {
Index: linux/kernel/ptrace.c
===================================================================
--- linux.orig/kernel/ptrace.c
+++ linux/kernel/ptrace.c
@@ -165,13 +165,15 @@ int __ptrace_access_check(struct task_st
 	return security_ptrace_access_check(task, mode);
 }
 
-bool ptrace_access_check(struct task_struct *task, unsigned int mode)
+int ptrace_access_check(struct task_struct *task, unsigned int mode)
 {
 	int err;
+
 	task_lock(task);
 	err = __ptrace_access_check(task, mode);
 	task_unlock(task);
-	return !err;
+
+	return err;
 }
 
 int ptrace_attach(struct task_struct *task)

----- End forwarded message -----
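
The call-site inversion in patch 2/2 can be sketched in plain userspace C.
This is only an illustration under stated assumptions: every demo_* name
below is hypothetical and stands in for the kernel functions; nothing here
is kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for __ptrace_access_check(): 0 or -errno. */
static int demo_inner_check(bool allowed)
{
	return allowed ? 0 : -EPERM;
}

/* Before the patch: bool wrapper, true means access is granted. */
static bool demo_check_bool(bool allowed)
{
	return !demo_inner_check(allowed);
}

/* After the patch: the retval is passed through unchanged. */
static int demo_check_retval(bool allowed)
{
	return demo_inner_check(allowed);
}

/* Old call-site shape: "if (!check(...)) goto out;" */
static bool demo_denied_old(bool allowed)
{
	return !demo_check_bool(allowed);
}

/* New call-site shape: "if (check(...) < 0) goto out;" */
static bool demo_denied_new(bool allowed)
{
	return demo_check_retval(allowed) < 0;
}
```

Both call-site shapes take the denial path in exactly the same cases; the
retval form additionally keeps the error code available to the caller.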

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-03-27 23:26 Eric Anholt
@ 2009-03-28  0:02 ` Linus Torvalds
  0 siblings, 0 replies; 341+ messages in thread
From: Linus Torvalds @ 2009-03-28  0:02 UTC (permalink / raw)
  To: Eric Anholt; +Cc: lkml, dri-devel



On Fri, 27 Mar 2009, Eric Anholt wrote:
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel drm-intel-next

Grr.

Guys, what the *hell* is wrong with you, when you can't even react to 
trivial warnings and fix buggy code pointed out by the compiler?

If you had _ever_ compiled this on x86-64, you would have seen:

  drivers/gpu/drm/i915/i915_gem_debugfs.c: In function ‘i915_gem_fence_regs_info’:
  drivers/gpu/drm/i915/i915_gem_debugfs.c:201: warning: format ‘%08x’ expects type ‘unsigned int’, but argument 7 has type ‘size_t’

and this is not the first time this has happened.

See commits f06da264cfb0f9444d41ca247213e419f90aa72a and 
aeb565dfc3ac4c8b47c5049085b4c7bfb2c7d5d7.

What's so hard with keeping the build warning-clean, and fixing these 
things _long_ before they hit my tree?

Some basic quality control. PLEASE.
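
For reference, a minimal userspace sketch of a warning-free way to print a
size_t, assuming the C99 'z' length modifier (which the kernel's printk
family also accepts). The helper name demo_format_size() is hypothetical,
and this is not necessarily the fix that went into the driver.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t as zero-padded hex. The 'z' length modifier matches
 * size_t on both 32-bit and 64-bit targets, unlike a plain "%08x",
 * which expects unsigned int.
 */
static int demo_format_size(char *buf, size_t buflen, size_t value)
{
	return snprintf(buf, buflen, "%08zx", value);
}
```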

		Linus

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <8F90F944E50427428C60E12A34A309D21C401BA619@carmd-exchmb01.sierrawireless.local>
@ 2009-03-13 16:54 ` Ralf Nyren
  0 siblings, 0 replies; 341+ messages in thread
From: Ralf Nyren @ 2009-03-13 16:54 UTC (permalink / raw)
  To: Rory Filer; +Cc: linux-kernel, Kevin Lloyd

Hi Rory,

Sounds great, send the driver and I'll give it a try at once. I'll report back
to you with the results.

Many thanks, Ralf

On Fri, 13 Mar 2009, Rory Filer wrote:

> Hi Ralf
>
> Kevin passed your email on to me and I think we can help you with this problem. We've been doing a lot of work on our drivers lately and I've got a freshly-ready version of sierra.c just for 2.6.28. We've done a lot of testing here and it seems pretty robust; perhaps you'd be willing to give it a try?
>
> Since I'm not sure about the etiquette for posting to this list, I will attach the driver in a separate email to you.
>
> Regards
>
> Rory Filer
>
> -----Original Message-----
> From: Ralf Nyren [mailto:ralf@nyren.net]
> Sent: Friday, March 13, 2009 8:01 AM
> To: linux-kernel@vger.kernel.org
> Cc: Kevin Lloyd
> Subject: Sierra Wireless (MC8780) HSDPA speed issue
>
> Hi,
>
> I have a Sierra Wireless MC8780 UMTS card in a Fujitsu S6410 laptop running kernel 2.6.28.7, with the in-kernel sierra driver v1.3.2.
>
> The card works, but download speed seems limited to approx. 1.0 Mbit/s using the Linux driver. Testing the card in Windows XP yields download speeds close to 5.0 Mbit/s.
>
> I recently updated the firmware of the card to support HSDPA/HSUPA. The update gave the desired result in Windows but not in Linux. The speed improved in Linux but didn't increase above 1 Mbit/s.
>
> Are there any known driver limitations, or is this a configuration issue?
>
> Please let me know if you need any additional information.
>
> Best regards, Ralf

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-03-11 14:59 ` your mail Linus Torvalds
@ 2009-03-11 17:23   ` Vitaly Mayatskikh
  0 siblings, 0 replies; 341+ messages in thread
From: Vitaly Mayatskikh @ 2009-03-11 17:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Vitaly Mayatskikh, linux-kernel


> On Wed, 11 Mar 2009, Vitaly Mayatskikh wrote:
> > 
> > (v)scnprintf says it should return 0 when size is 0, but doesn't do
> > so. Also size_t is unsigned, it can't be less than 0. Fix the code and
> > comments.
> 
> That is bogus.
> 
> The code really does (or "did"? Maybe you removed it) check for _smaller_
> than 0:

Well, (v)scnprintf says it returns 0 for size <= 0, but really returns
-1 for size == 0. I think this code can't return 0 for size == 0:

	i=vsnprintf(buf,size,fmt,args);
	return (i >= size) ? (size - 1) : i;

Systemtap's script:

function test:long()
%{
        char tmp[256];
        long err;
        err = scnprintf(tmp, 0, "%lu", (long)128);
        THIS->__retvalue = err;
%}

probe begin
{
        printf("scnprintf returns %d\n", test());
}

stap -g scnprintf.stp
scnprintf returns -1
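
The same arithmetic can be reproduced in userspace. This is a sketch, with
i standing in for the value returned by vsnprintf(); the demo_* names are
hypothetical, the casts are added only to keep the compiler quiet, and the
guarded variant is one possible shape of a fix, not necessarily the one
that was merged. The conversion of a huge size_t to int is
implementation-defined; it yields -1 on the usual two's complement
compilers.

```c
#include <stddef.h>

/* The clamp quoted above: for size == 0, size - 1 wraps to SIZE_MAX. */
static int demo_clamp_old(int i, size_t size)
{
	return ((size_t)i >= size) ? (int)(size - 1) : i;
}

/* Same clamp, but with the size == 0 case handled explicitly. */
static int demo_clamp_guarded(int i, size_t size)
{
	if (size == 0)
		return 0;
	return ((size_t)i >= size) ? (int)(size - 1) : i;
}
```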

-- 
wbr, Vitaly

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-03-11 10:47 Vitaly Mayatskikh
@ 2009-03-11 14:59 ` Linus Torvalds
  2009-03-11 17:23   ` Vitaly Mayatskikh
  0 siblings, 1 reply; 341+ messages in thread
From: Linus Torvalds @ 2009-03-11 14:59 UTC (permalink / raw)
  To: Vitaly Mayatskikh; +Cc: linux-kernel



On Wed, 11 Mar 2009, Vitaly Mayatskikh wrote:
> 
> (v)scnprintf says it should return 0 when size is 0, but doesn't do
> so. Also size_t is unsigned, it can't be less than 0. Fix the code and
> comments.

That is bogus.

The code really does (or "did"? Maybe you removed it) check for _smaller_
than 0:

	int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
	{
		...
		/* Reject out-of-range values early.  Large positive sizes are
		   used for unknown buffer sizes. */
		if (unlikely((int) size < 0)) {
			/* There can be only one.. */
			static char warn = 1;
			WARN_ON(warn);
			warn = 0;
			return 0;
		}
		...

because under/overflows have happened.

The kernel is _not_ a regular libc. We have different rules.
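
To make the "(int) size < 0" test concrete, here is a small userspace
sketch. The demo_* helpers are hypothetical, and the size_t to int
conversion is implementation-defined in ISO C (it truncates on the
compilers the kernel is built with). It shows how a size computed by an
underflowing subtraction trips the check, while a size of exactly 0 does
not.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Mirrors the "(int) size < 0" test: a size_t produced by an
 * underflowing subtraction looks negative once viewed as an int.
 */
static bool demo_size_out_of_range(size_t size)
{
	return (int)size < 0;
}

/*
 * Hypothetical caller bug: remaining space computed as len - used
 * wraps around to a huge unsigned value when used > len.
 */
static size_t demo_remaining(size_t len, size_t used)
{
	return len - used;
}
```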

		Linus

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-02-13  0:45 Youngwhan Kim
@ 2009-02-13  3:40 ` Johannes Weiner
  0 siblings, 0 replies; 341+ messages in thread
From: Johannes Weiner @ 2009-02-13  3:40 UTC (permalink / raw)
  To: Youngwhan Kim; +Cc: linux-kernel

On Fri, Feb 13, 2009 at 09:45:13AM +0900, Youngwhan Kim wrote:
> unsubscribe

There is just no way out!

> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org        ^^^^^^^^^^^^

                           ^^^^^^^^^^^^^^^^^^^^^^^^^

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-01-19  2:54 Gao, Yunpeng
@ 2009-01-19  3:07 ` Matthew Wilcox
  0 siblings, 0 replies; 341+ messages in thread
From: Matthew Wilcox @ 2009-01-19  3:07 UTC (permalink / raw)
  To: Gao, Yunpeng; +Cc: linux-ia64, linux-kernel

On Mon, Jan 19, 2009 at 10:54:02AM +0800, Gao, Yunpeng wrote:
> I have to use a 64-bit variable in my 2.6.27 kernel NAND driver, as below:
> ---------------------------------------------------------------------------
> u64 NAND_capacity;
> unsigned int block_num, block_size;
> ...
> block_num = NAND_capacity / block_size;
> ---------------------------------------------------------------------------
> but the build fails with "undefined reference to `__udivdi3'".

Presumably block_size is a power of two, so you can do:

	block_num = NAND_capacity >> block_shift;
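
A userspace sketch of that suggestion, assuming block_size really is a
power of two. The demo_* helpers are hypothetical. On 32-bit builds a
plain 64-bit division makes gcc emit a call to __udivdi3, which the kernel
does not link in; a shift avoids it. For divisors that are not a power of
two, the kernel's do_div() is the usual tool instead.

```c
#include <stdint.h>

/* log2 of a power-of-two block size; returns -1 if it is not one. */
static int demo_block_shift(unsigned int block_size)
{
	int shift = 0;

	if (block_size == 0 || (block_size & (block_size - 1)) != 0)
		return -1;
	while ((1u << shift) != block_size)
		shift++;
	return shift;
}

/* Replace the 64-bit division by a shift, which needs no libgcc helper. */
static unsigned int demo_block_num(uint64_t capacity, int block_shift)
{
	return (unsigned int)(capacity >> block_shift);
}
```

The shift would normally be computed once at probe time and stored next to
block_size.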

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-01-13  6:10 Steven Rostedt
@ 2009-01-13 13:21 ` Steven Rostedt
  0 siblings, 0 replies; 341+ messages in thread
From: Steven Rostedt @ 2009-01-13 13:21 UTC (permalink / raw)
  To: linux-kernel

On Tue, Jan 13, 2009 at 01:10:04AM -0500, Steven Rostedt wrote:

Bah! Sorry for the noise here. My scripts to send out the patch
queue failed to handle the comma in the "Luck, Tony" email address,
but unfortunately still did a partial send :-(

I had to modify Tony's email for the final send.

-- Steve


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2009-01-11  3:41 Jose Luis Marchetti
@ 2009-01-11  6:47 ` Jesper Juhl
  0 siblings, 0 replies; 341+ messages in thread
From: Jesper Juhl @ 2009-01-11  6:47 UTC (permalink / raw)
  To: Jose Luis Marchetti; +Cc: linux-kernel

On Sat, 10 Jan 2009, Jose Luis Marchetti wrote:

> Hi,
> 
> I would like to open/read/write/close a regular file from my device
> driver.

That's probably a bad idea and what you really want to do is use procfs, 
sysfs, debugfs, relayfs, module parameters or similar.

Take a look here: 
http://kernelnewbies.org/FAQ/WhyWritingFilesFromKernelIsBad 


-- 
Jesper Juhl <jj@chaosbits.net>        http://personal.chaosbits.net/
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2008-05-24 20:05 Thomas Gleixner
@ 2008-05-24 21:06 ` Daniel Walker
  0 siblings, 0 replies; 341+ messages in thread
From: Daniel Walker @ 2008-05-24 21:06 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel


On Sat, 2008-05-24 at 22:05 +0200, Thomas Gleixner wrote:

> > If that's the requirement, then code that cleans up the corner case that
> > I've identified, and which is also minimal, should be acceptable, since
> > it meets the same requirement you laid out above for the original
> > plist changes.
> 
> Your code solves the least worrying corner case and hurts
> performance for nothing. You take extra locks in the hot path for no
> benefit.
> 
> Aside from that, it introduces lock order problems and we can really do
> without extra useless complexity in the futex code.
> 
> You can argue in circles. This is not going anywhere near mainline.

Above, I'm not speaking about my code; I'm only speaking in terms of a
solution to this case, even if it isn't mine.

Daniel




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
@ 2008-05-24 20:05 Thomas Gleixner
  2008-05-24 21:06 ` Daniel Walker
  0 siblings, 1 reply; 341+ messages in thread
From: Thomas Gleixner @ 2008-05-24 20:05 UTC (permalink / raw)
  To: Daniel Walker; +Cc: linux-kernel

On Sat, 24 May 2008, Daniel Walker wrote:
> > There is no kernel side controlled handover of a normal futex. The
> > woken up waiters race for it and a low prio thread on another CPU can
> > steal it even if there is a high prio waiter woken up.
> 
> After reading futex_wake, doesn't it depend on how many waiters are woken?
> Given that comes from userspace, glibc could wake a single waiter and
> obtain a priority ordering, couldn't it?

It could and it does. Still this does not protect against another
lower prio task taking the futex before the woken waiter can do it,
which is happening way more often than your theoretical setscheduler
case. Again, setscheduler is called in the startup code of a program, not
at arbitrary points during runtime that rely on lock ordering.

> > The plist add-on works correctly in most cases, nothing more. To
> > achieve full correctness there is much more necessary than this
> > setscheduler issue. The plist changes were accepted because the
> > overhead is really minimal, but achieving full correctness would hurt
> > performance badly.
> 
> If that's the requirement, then code that cleans up the corner case that
> I've identified, and which is also minimal, should be acceptable, since
> it meets the same requirement you laid out above for the original
> plist changes.

Your code solves the least worrying corner case and hurts
performance for nothing. You take extra locks in the hot path for no
benefit.

Aside from that, it introduces lock order problems and we can really do
without extra useless complexity in the futex code.

You can argue in circles. This is not going anywhere near mainline.

Thanks,
	tglx

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2008-05-20 12:34 Lukas Hejtmanek
@ 2008-05-20 15:20 ` Alan Stern
  0 siblings, 0 replies; 341+ messages in thread
From: Alan Stern @ 2008-05-20 15:20 UTC (permalink / raw)
  To: Lukas Hejtmanek
  Cc: Oliver Neukum, Rafael J. Wysocki, Linux Kernel Mailing List,
	greg, linux-usb

On Tue, 20 May 2008, Lukas Hejtmanek wrote:

> <stern@rowland.harvard.edu>, Greg KH <greg@kroah.com>
> Bcc: 
> Subject: Re: [Bug #10630] USB devices plugged into dock are not discoverred
> 	until reload of ehci-hcd
> Reply-To: 
> In-Reply-To: <200805201327.34678.oliver@neukum.org>
> X-echelon: NSA, CIA, CI5, MI5, FBI, KGB, BIS, Plutonium, Bin Laden, bomb
> 
> On Tue, May 20, 2008 at 01:27:34PM +0200, Oliver Neukum wrote:
> > > done.
> > > http://bugzilla.kernel.org/show_bug.cgi?id=10630
> > 
> > Aha. Thanks.
> > Please recompile without CONFIG_USB_SUSPEND
> 
> Hm, without USB_SUSPEND it works. So what next: is this considered fixed,
> or is any further investigation needed?

No further investigation is needed.  I tried doing essentially the same 
thing on my system and the same problem occurred.

It is caused by the way ehci-hcd "auto-clears" the port
change-suspend feature.  This patch should fix the problem.  Please 
try it out and let us know if it works.
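
The idea, as a rough userspace sketch: the change-suspend bit is latched
in a per-controller bitmap when a resume completes, and it stays reported
until the hub driver explicitly clears the feature. The struct and the
demo_* helpers below are hypothetical miniatures, not the ehci-hcd code.

```c
#include <stdbool.h>

/* Hypothetical miniature of the per-controller state. */
struct demo_hub {
	unsigned long port_c_suspend;	/* one latched change bit per port */
};

/* Resume finished on this port: latch the change bit. */
static void demo_resume_done(struct demo_hub *hub, unsigned int port)
{
	hub->port_c_suspend |= 1UL << port;
}

/* GetPortStatus: report the latched bit without clearing it. */
static bool demo_get_c_suspend(const struct demo_hub *hub, unsigned int port)
{
	return (hub->port_c_suspend >> port) & 1UL;
}

/* ClearPortFeature(C_SUSPEND): only an explicit clear drops the bit. */
static void demo_clear_c_suspend(struct demo_hub *hub, unsigned int port)
{
	hub->port_c_suspend &= ~(1UL << port);
}
```

With the bit latched, a status poll that happens after the resume has
completed still sees the change, instead of only the one poll that
happened to observe the completion.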

Alan Stern



Index: usb-2.6/drivers/usb/host/ehci.h
===================================================================
--- usb-2.6.orig/drivers/usb/host/ehci.h
+++ usb-2.6/drivers/usb/host/ehci.h
@@ -97,6 +97,8 @@ struct ehci_hcd {			/* one per controlle
 			dedicated to the companion controller */
 	unsigned long		owned_ports;		/* which ports are
 			owned by the companion during a bus suspend */
+	unsigned long		port_c_suspend;		/* which ports have
+			the change-suspend feature turned on */
 
 	/* per-HC memory pools (could be per-bus, but ...) */
 	struct dma_pool		*qh_pool;	/* qh per active urb */
Index: usb-2.6/drivers/usb/host/ehci-hub.c
===================================================================
--- usb-2.6.orig/drivers/usb/host/ehci-hub.c
+++ usb-2.6/drivers/usb/host/ehci-hub.c
@@ -609,7 +609,7 @@ static int ehci_hub_control (
 			}
 			break;
 		case USB_PORT_FEAT_C_SUSPEND:
-			/* we auto-clear this feature */
+			clear_bit(wIndex, &ehci->port_c_suspend);
 			break;
 		case USB_PORT_FEAT_POWER:
 			if (HCS_PPC (ehci->hcs_params))
@@ -688,7 +688,7 @@ static int ehci_hub_control (
 			/* resume completed? */
 			else if (time_after_eq(jiffies,
 					ehci->reset_done[wIndex])) {
-				status |= 1 << USB_PORT_FEAT_C_SUSPEND;
+				set_bit(wIndex, &ehci->port_c_suspend);
 				ehci->reset_done[wIndex] = 0;
 
 				/* stop resume signaling */
@@ -765,6 +765,8 @@ static int ehci_hub_control (
 			status |= 1 << USB_PORT_FEAT_RESET;
 		if (temp & PORT_POWER)
 			status |= 1 << USB_PORT_FEAT_POWER;
+		if (test_bit(wIndex, &ehci->port_c_suspend))
+			status |= 1 << USB_PORT_FEAT_C_SUSPEND;
 
 #ifndef	VERBOSE_DEBUG
 	if (status & ~0xffff)	/* only if wPortChange is interesting */
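
The latch pattern this patch introduces can be sketched in plain
userspace C. This is only an illustration: the struct and helper names
below are made up, and a bare bitmask stands in for the kernel's
set_bit()/test_bit()/clear_bit() on ehci->port_c_suspend. The value 18
for USB_PORT_FEAT_C_SUSPEND is the standard hub feature selector.

```c
#include <stdint.h>

#define USB_PORT_FEAT_C_SUSPEND 18	/* hub feature selector, as in the patch */

/* Stand-in for the new ehci->port_c_suspend field: one bit per port. */
struct port_state {
	unsigned long port_c_suspend;
};

/* Resume signaling finished on this port: latch the change bit. */
static void resume_completed(struct port_state *ps, unsigned int port)
{
	ps->port_c_suspend |= 1UL << port;
}

/* GetPortStatus: report the latched bit on every read until cleared. */
static uint32_t get_port_status(const struct port_state *ps, unsigned int port)
{
	uint32_t status = 0;

	if (ps->port_c_suspend & (1UL << port))
		status |= 1U << USB_PORT_FEAT_C_SUSPEND;
	return status;
}

/* ClearPortFeature(C_SUSPEND): only now does the bit go away. */
static void clear_c_suspend(struct port_state *ps, unsigned int port)
{
	ps->port_c_suspend &= ~(1UL << port);
}
```

As far as the diff shows, the old code reported the bit only in the one
status read that noticed the completed resume; with the latch, every
read reports it until the feature is explicitly cleared.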


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
@ 2007-10-17 18:28 nicholas.thompson1
  0 siblings, 0 replies; 341+ messages in thread
From: nicholas.thompson1 @ 2007-10-17 18:28 UTC (permalink / raw)
  To: linux-kernel

>Nope, wrong clues.
>The right clues are in the footer of this message after it travels thru the list.
>
>I supplied them to Nicholas already, but apparently others need to be reminded of
>them every now and then  :-]   That footer is in these list messages for a reason!
>
>    /Matti Aarnio -- one of  <postmaster@vger.kernel.org>
>
>PS: You want to contact VGER's email and list managers ?
>    We use the internet email standard address "postmaster"
>

Jan, Matti, + List,
 I am very sorry about the noise, that's what I get for using cut and paste while tired and before my third cup of coffee. ;p Apologies.

Nick 

>>On Wed, Oct 17, 2007 at 06:36:19PM +0200, Jan Engelhardt wrote:
>> Date: Wed, 17 Oct 2007 18:36:19 +0200 (CEST)
>> From: Jan Engelhardt <jengelh@computergmbh.de>
>> To: nicholas.thompson1@mchsi.com
>> cc: linux-kernel@vger.kernel.org
>> Subject: Re: your mail
>> 
>> On Oct 17 2007 16:30, nicholas.thompson1@mchsi.com wrote:
>> >Date: Wed, 17 Oct 2007 16:30:24 +0000
>> >From:  <nicholas.thompson1@mchsi.com>
>> >To:  <linux-kernel@vger.kernel.org>
>>              ^^^^^^
> >>
> >>subscribe linux-alpha
>>                  ^^^^^

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-10-17 16:36 ` your mail Jan Engelhardt
@ 2007-10-17 17:50   ` Matti Aarnio
  0 siblings, 0 replies; 341+ messages in thread
From: Matti Aarnio @ 2007-10-17 17:50 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: linux-kernel

Nope, wrong clues.
The right clues are in the footer of this message after it travels thru the list.

I supplied them to Nicholas already, but apparently others need to be reminded of
them every now and then  :-]   That footer is in these list messages for a reason!

    /Matti Aarnio -- one of  <postmaster@vger.kernel.org>

PS: You want to contact VGER's email and list managers ?
    We use the internet email standard address "postmaster"


On Wed, Oct 17, 2007 at 06:36:19PM +0200, Jan Engelhardt wrote:
> Date: Wed, 17 Oct 2007 18:36:19 +0200 (CEST)
> From: Jan Engelhardt <jengelh@computergmbh.de>
> To: nicholas.thompson1@mchsi.com
> cc: linux-kernel@vger.kernel.org
> Subject: Re: your mail
> 
> On Oct 17 2007 16:30, nicholas.thompson1@mchsi.com wrote:
> >Date: Wed, 17 Oct 2007 16:30:24 +0000
> >From:  <nicholas.thompson1@mchsi.com>
> >To:  <linux-kernel@vger.kernel.org>
>              ^^^^^^
> >
> >subscribe linux-alpha
>                  ^^^^^

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-10-17 16:30 nicholas.thompson1
@ 2007-10-17 16:36 ` Jan Engelhardt
  2007-10-17 17:50   ` Matti Aarnio
  0 siblings, 1 reply; 341+ messages in thread
From: Jan Engelhardt @ 2007-10-17 16:36 UTC (permalink / raw)
  To: nicholas.thompson1; +Cc: linux-kernel


On Oct 17 2007 16:30, nicholas.thompson1@mchsi.com wrote:
>Date: Wed, 17 Oct 2007 16:30:24 +0000
>From:  <nicholas.thompson1@mchsi.com>
>To:  <linux-kernel@vger.kernel.org>
             ^^^^^^
>
>subscribe linux-alpha
                 ^^^^^


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-09-24 20:44 Steven Rostedt
@ 2007-09-24 20:50 ` Steven Rostedt
  0 siblings, 0 replies; 341+ messages in thread
From: Steven Rostedt @ 2007-09-24 20:50 UTC (permalink / raw)
  To: Jaswinder Singh; +Cc: linux-kernel, mingo, linux-rt-users



--
On Mon, 24 Sep 2007, Steven Rostedt wrote:

> linux-rt-users@vger.kernel.org
> Bcc:
> Subject: Re: realtime preemption performance difference
> Reply-To:
> In-Reply-To: <3f9a31f40709240448h4a9e8337t437328b5c675ecd5@mail.gmail.com>

[ I'm actually just learning how to screw-up^Wuse mutt ]

bah!

-- Steve


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-08-16  0:36                 ` Satyam Sharma
@ 2007-08-16  0:32                   ` Herbert Xu
  0 siblings, 0 replies; 341+ messages in thread
From: Herbert Xu @ 2007-08-16  0:32 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Segher Boessenkool, horms, Stefan Richter,
	Linux Kernel Mailing List, Paul E. McKenney, ak, netdev,
	cfriesen, Heiko Carstens, rpjday, jesper.juhl, linux-arch,
	Andrew Morton, zlynx, clameter, schwidefsky, Chris Snook, davem,
	Linus Torvalds, wensong, wjiang

On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote:
> 
> that are:
> 
> 	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
> 		mdelay(1);
> 		msecs--;
> 	}
> 
> where mdelay() becomes __const_udelay() which happens to be in another
> translation unit (arch/i386/lib/delay.c) and hence saves this callsite
> from being a bug :-)

The udelay itself certainly should have some form of cpu_relax in it.
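
A minimal userspace sketch of such a polling loop, assuming GCC or
Clang. The names cpu_relax_sketch(), mdelay_sketch() and
wait_for_counter() are hypothetical; the empty asm with a "memory"
clobber is only a compiler barrier, which is the least a cpu_relax()
would need to provide so that the counter is re-read on every pass no
matter which translation unit the delay lives in.

```c
#include <stdatomic.h>

/* Assumed stand-in for cpu_relax(): at minimum a compiler barrier,
 * so memory is re-read on every loop iteration. */
static inline void cpu_relax_sketch(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/* Stand-in for mdelay(1); a real one would spin for a millisecond. */
static inline void mdelay_sketch(void)
{
	cpu_relax_sketch();
}

/* Poll until the counter drops to zero or the budget runs out.
 * Returns the number of iterations left (0 means timed out). */
static int wait_for_counter(atomic_int *counter, int msecs)
{
	while (atomic_load_explicit(counter, memory_order_relaxed) > 0 && msecs) {
		mdelay_sketch();
		msecs--;
	}
	return msecs;
}
```

The C11 atomic load used here already forces a fresh read by itself;
the barrier in the delay helper is what would make the loop safe if the
read were a plain, non-volatile load.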

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-05-16 17:11   ` Olof Johansson
@ 2007-05-16 17:24     ` Bob Picco
  0 siblings, 0 replies; 341+ messages in thread
From: Bob Picco @ 2007-05-16 17:24 UTC (permalink / raw)
  To: Olof Johansson
  Cc: Linas Vepstas, Bob Picco, johnrose, linuxppc-dev, Andrew Morton,
	linux-kernel

Olof Johansson wrote:	[Wed May 16 2007, 01:11:00PM EDT]
> On Wed, May 16, 2007 at 11:43:41AM -0500, Linas Vepstas wrote:
> > On Wed, May 16, 2007 at 09:30:46AM -0400, Bob Picco wrote:
> > > Subject: Re: 2.6.22-rc1-mm1 powerpc build breakage
> > > 
> > > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: error: unknown field `subsys' specified in initializer
> > > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: warning: initialization from incompatible pointer type
> > > make[4]: *** [drivers/pci/hotplug/rpadlpar_sysfs.o] Error 1
> > > make[3]: *** [drivers/pci/hotplug] Error 2
> > > make[2]: *** [drivers/pci] Error 2
> > > make[1]: *** [drivers] Error 2
> > > make: *** [_all] Error 2
> > 
> > John Rose is working to fix this "real soon now".
> 
> Do you mean the fix Al Viro posted yesterday?
> 
> http://patchwork.ozlabs.org/linuxppc/patch?id=11177
> 
> 
> -Olof
Missed that patch.

thanks,

bob

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-05-16 16:43 ` your mail Linas Vepstas
@ 2007-05-16 17:11   ` Olof Johansson
  2007-05-16 17:24     ` Bob Picco
  0 siblings, 1 reply; 341+ messages in thread
From: Olof Johansson @ 2007-05-16 17:11 UTC (permalink / raw)
  To: Linas Vepstas
  Cc: Bob Picco, johnrose, linuxppc-dev, Andrew Morton, linux-kernel

On Wed, May 16, 2007 at 11:43:41AM -0500, Linas Vepstas wrote:
> On Wed, May 16, 2007 at 09:30:46AM -0400, Bob Picco wrote:
> > Subject: Re: 2.6.22-rc1-mm1 powerpc build breakage
> > 
> > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: error: unknown field `subsys' specified in initializer
> > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: warning: initialization from incompatible pointer type
> > make[4]: *** [drivers/pci/hotplug/rpadlpar_sysfs.o] Error 1
> > make[3]: *** [drivers/pci/hotplug] Error 2
> > make[2]: *** [drivers/pci] Error 2
> > make[1]: *** [drivers] Error 2
> > make: *** [_all] Error 2
> 
> John Rose is working to fix this "real soon now".

Do you mean the fix Al Viro posted yesterday?

http://patchwork.ozlabs.org/linuxppc/patch?id=11177


-Olof

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-05-16 13:30 Bob Picco
@ 2007-05-16 16:43 ` Linas Vepstas
  2007-05-16 17:11   ` Olof Johansson
  0 siblings, 1 reply; 341+ messages in thread
From: Linas Vepstas @ 2007-05-16 16:43 UTC (permalink / raw)
  To: Bob Picco, johnrose; +Cc: Andrew Morton, linuxppc-dev, linux-kernel

On Wed, May 16, 2007 at 09:30:46AM -0400, Bob Picco wrote:
> Subject: Re: 2.6.22-rc1-mm1 powerpc build breakage
> 
> /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: error: unknown field `subsys' specified in initializer
> /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: warning: initialization from incompatible pointer type
> make[4]: *** [drivers/pci/hotplug/rpadlpar_sysfs.o] Error 1
> make[3]: *** [drivers/pci/hotplug] Error 2
> make[2]: *** [drivers/pci] Error 2
> make[1]: *** [drivers] Error 2
> make: *** [_all] Error 2

John Rose is working to fix this "real soon now".

--linas

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-03-29 21:42 ` your mail Jan Engelhardt
  2007-03-29 21:46   ` David Miller
@ 2007-03-29 21:48   ` Gerard Braad
  1 sibling, 0 replies; 341+ messages in thread
From: Gerard Braad @ 2007-03-29 21:48 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Sorry, this wasn't supposed to happen. Already done...
Unsubscribed due to lack of a digest mail.

> I wonder why people can't send their unsubscribe message to the same
> address they sent their subscribe message to.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-03-29 21:42 ` your mail Jan Engelhardt
@ 2007-03-29 21:46   ` David Miller
  2007-03-29 21:48   ` Gerard Braad
  1 sibling, 0 replies; 341+ messages in thread
From: David Miller @ 2007-03-29 21:46 UTC (permalink / raw)
  To: jengelh; +Cc: linux-kernel

From: Jan Engelhardt <jengelh@linux01.gwdg.de>
Date: Thu, 29 Mar 2007 23:42:17 +0200 (MEST)

> > unsubscribe linux-kernel ..
> 
> I wonder why people can't send their unsubscribe message to the same 
> address they sent their subscribe message to.

People get frustrated that it doesn't work then start doing stupid
things like sending it to the actual list, like this person did.

Of course they always fail to consider doing the proper thing which is
to ask postmaster@vger.kernel.org or the list owner
(linux-kernel-owner@vger.kernel.org in this case) for help if it is
the case that their email has changed and they no longer have a way to
send from the subscribed address.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2007-03-29 21:39 Gerard Braad Jr.
@ 2007-03-29 21:42 ` Jan Engelhardt
  2007-03-29 21:46   ` David Miller
  2007-03-29 21:48   ` Gerard Braad
  0 siblings, 2 replies; 341+ messages in thread
From: Jan Engelhardt @ 2007-03-29 21:42 UTC (permalink / raw)
  To: Linux Kernel Mailing List


>
> unsubscribe linux-kernel ..

I wonder why people can't send their unsubscribe message to the same 
address they sent their subscribe message to.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2005-11-25 22:06 root
@ 2005-11-26  0:11 ` Hugh Dickins
  0 siblings, 0 replies; 341+ messages in thread
From: Hugh Dickins @ 2005-11-26  0:11 UTC (permalink / raw)
  To: root; +Cc: linux-kernel

On Fri, 25 Nov 2005, root wrote:

> Nov 25 21:59:24 txiringo kernel: [17182458.504000] program ddcprobe
> is using MAP_PRIVATE, PROT_WRITE mmap of VM_RESERVED memory, which
> is deprecated. Please report this to linux-kernel@vger.kernel.org

Thanks for the report: now fixed, please upgrade to 2.6.15-rc2-git3 or later.

Hugh

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2005-06-16 23:32 ` your mail Chris Wedgwood
@ 2005-06-17  1:46   ` Tom McNeal
  0 siblings, 0 replies; 341+ messages in thread
From: Tom McNeal @ 2005-06-17  1:46 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: linux-kernel

I'll look at that.  This occurs on all Linux platforms, including a generic
2.4.31 I downloaded from kernel.org. The user test is trivial, just doing
the nonblocking connect, the poll, the send, and then the close, in that loop.

Tom

Chris Wedgwood wrote:
> On Thu, Jun 16, 2005 at 11:08:28PM +0000, trmcneal@comcast.net wrote:
> 
> 
>>>I've been working with some tcp network test programs that have
>>>multiple clients opening nonblocking sockets to a single server
>>>port, sending a short message, and then closing the socket,
>>>100,000 times.  Since the socket is non-blocking, it generally
>>>tries to connect and then does a poll since the socket is busy.
>>>The test fails if the poll times out in 10 seconds.  It fails
>>>consistently on Linux servers but succeeds on Solaris servers; the
>>>client is a non-issue unless it's loopback on the Linux server.
> 
> 
> where is the code for this?  are you sure you're not overflowing the
> listen backlog somewhere?  that would show up in some cases but not
> all depending on latencies and local scheduler behavior
> 

-- 
Tom McNeal
(650)906-0761(cell)
(650)964-8459(fax)
Email: trmcneal@comcast.net

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2005-06-16 23:08 trmcneal
@ 2005-06-16 23:32 ` Chris Wedgwood
  2005-06-17  1:46   ` Tom McNeal
  0 siblings, 1 reply; 341+ messages in thread
From: Chris Wedgwood @ 2005-06-16 23:32 UTC (permalink / raw)
  To: trmcneal; +Cc: linux-kernel

On Thu, Jun 16, 2005 at 11:08:28PM +0000, trmcneal@comcast.net wrote:

> > I've been working with some tcp network test programs that have
> > multiple clients opening nonblocking sockets to a single server
> > port, sending a short message, and then closing the socket,
> > 100,000 times.  Since the socket is non-blocking, it generally
> > tries to connect and then does a poll since the socket is busy.
> > The test fails if the poll times out in 10 seconds.  It fails
> > consistently on Linux servers but succeeds on Solaris servers; the
> > client is a non-issue unless it's loopback on the Linux server.

where is the code for this?  are you sure you're not overflowing the
listen backlog somewhere?  that would show up in some cases but not
all depending on latencies and local scheduler behavior
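
For reference, one connect/poll step of the kind of test described in
the quoted text might look like the sketch below, together with the
place where the listen backlog is set on the server side. This is not
the reporter's code, which was not posted; the helper names
listen_loopback() and connect_with_poll() are invented for the example,
and it only uses standard POSIX socket calls.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listen on an ephemeral loopback port; "backlog" bounds the queue of
 * connections still waiting for accept(). Returns the fd, or -1. */
static int listen_loopback(int backlog, unsigned short *port)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a port */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, backlog) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}
	*port = ntohs(addr.sin_port);
	return fd;
}

/* Non-blocking connect, then poll for completion. Returns a connected
 * fd, or -1 on error or if poll() times out. */
static int connect_with_poll(unsigned short port, int timeout_ms)
{
	struct sockaddr_in addr;
	struct pollfd pfd;
	int err = 0, flags;
	socklen_t errlen = sizeof(err);
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		goto fail;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
		return fd;	/* connected immediately */
	if (errno != EINPROGRESS)
		goto fail;
	pfd.fd = fd;
	pfd.events = POLLOUT;	/* writable means the connect finished */
	if (poll(&pfd, 1, timeout_ms) <= 0)
		goto fail;	/* timeout or poll error */
	/* The connect result is reported through SO_ERROR. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0 || err != 0)
		goto fail;
	return fd;
fail:
	close(fd);
	return -1;
}
```

A server that never calls accept() fast enough fills the queue sized by
the backlog argument, which is one way a client poll() could time out
in a loop like this.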

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2005-02-03  0:17 Aleksey Gorelov
  2005-02-03  1:12 ` your mail Matthew Dharm
@ 2005-02-03 16:03 ` Alan Stern
  1 sibling, 0 replies; 341+ messages in thread
From: Alan Stern @ 2005-02-03 16:03 UTC (permalink / raw)
  To: Aleksey Gorelov; +Cc: mdharm-usb, linux-kernel

On Wed, 2 Feb 2005, Aleksey Gorelov wrote:

> Hi Matt, Alan, 
> 
>   Could you please tell me (a link would do) why the default delay_use=5
> is really necessary (from the patch below)?
> https://lists.one-eyed-alien.net/pipermail/usb-storage/2004-August/000747.html
> 
> It makes USB boot really painful and slow :(
> 
>   I understand there should be a good reason for it. I've tried to find
> an answer in the archives, without much success though.

Lots of devices don't need that delay, but enough of them do that we 
decided to add it.  The value of 5 seconds was more or less arbitrary; it 
was long enough for every device we could test and it didn't seem _too_ 
long.  Maybe 1 second would be long enough -- we just didn't know so we 
were conservative.

Alan Stern


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2005-02-03  0:17 Aleksey Gorelov
@ 2005-02-03  1:12 ` Matthew Dharm
  2005-02-03 16:03 ` Alan Stern
  1 sibling, 0 replies; 341+ messages in thread
From: Matthew Dharm @ 2005-02-03  1:12 UTC (permalink / raw)
  To: Aleksey Gorelov; +Cc: stern, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1039 bytes --]

It's basically just like the code says.

A lot of devices choke if you access them too quickly after enumeration.
The 5 second delay seems to be enough for most devices.  But we made it
adjustable exactly for people like you.

Matt

On Wed, Feb 02, 2005 at 04:17:13PM -0800, Aleksey Gorelov wrote:
> Hi Matt, Alan, 
> 
>   Could you please tell me (a link would do) why the default delay_use=5
> is really necessary (from the patch below)?
> https://lists.one-eyed-alien.net/pipermail/usb-storage/2004-August/000747.html
> 
> It makes USB boot really painful and slow :(
> 
>   I understand there should be a good reason for it. I've tried to find
> an answer in the archives, without much success though.
> 
> Thanks,
> Aleks.

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

Now payink attention, please.  This is mouse.  Click-click. Easy to 
use, da? Now you try...
					-- Pitr to Miranda
User Friendly, 10/11/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-09-19 12:29 plt
@ 2004-09-19 18:22 ` Jesper Juhl
  0 siblings, 0 replies; 341+ messages in thread
From: Jesper Juhl @ 2004-09-19 18:22 UTC (permalink / raw)
  To: plt; +Cc: linux-kernel

On Sun, 19 Sep 2004 plt@taylorassociate.com wrote:

> Question: Are you guys going to work on cleaning up some of the errors in
> the code so we can get a cleaner compile?
> 
I think it's safe to say that there is an ongoing effort to do that.

Some more strict typechecking has recently been introduced (read more 
here: http://kerneltrap.org/node/view/3848 ) and this currently causes a 
lot of compiler warnings that have yet to be cleaned, but that will happen 
in time - faster if you lend a hand.

> 
> drivers/mtd/nftlmount.c:44: warning: unused variable `oob'
> 
This is due to the fact that the code using that variable is currently 
within an  #if 0  block. I am not familiar with the mtd code, but the 
comment in there has this to say :

#if 0 /* Some people seem to have devices without ECC or erase marks
         on the Media Header blocks. There are enough other sanity
         checks in here that we can probably do without it.
      */

...

#endif

So it would seem that this bit of code could be on its way out. I'd assume 
that once it goes (if it does) that the variable will then be removed as 
well.
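
To illustrate how this kind of warning arises and one way to avoid it,
here is a small sketch. It is not the actual nftlmount.c code: the
function, the CHECK_ERASE_MARKS switch and the check itself are
invented for the example. The point is only that a variable declared
outside a compiled-out block is reported as unused, while one declared
inside the guarded block is not.

```c
#include <stddef.h>

/* Hypothetical switch standing in for the "#if 0" in nftlmount.c. */
#define CHECK_ERASE_MARKS 0

/* Returns 1 if the header block looks valid, 0 otherwise.
 * Illustration only, not the actual mtd code. */
static int header_looks_valid(const unsigned char *buf, size_t len)
{
#if CHECK_ERASE_MARKS
	/* Declared inside the guarded block, so no "unused variable"
	 * warning appears while the check is compiled out. */
	unsigned char oob = len ? buf[len - 1] : 0;

	if (oob != 0xff)
		return 0;
#endif
	return buf != NULL && len > 0;
}
```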


Ohh and btw, if you want people to pay attention to your emails you should 
try adding a descriptive Subject:  :)


--
Jesper Juhl


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 15:42 Jon Smirl
@ 2004-08-16 23:55 ` Dave Airlie
  0 siblings, 0 replies; 341+ messages in thread
From: Dave Airlie @ 2004-08-16 23:55 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Christoph Hellwig, torvalds, Andrew Morton, linux-kernel


> But DRM still has to live with existing fbdev drivers. The same DRM
> code is used in 2.4 and 2.6 so existing fbdev drivers are not going
> away anytime soon. When DRM detects a fbdev it will revert back into
> stealth mode where it attaches itself to the hardware without telling
> the kernel that it is doing so. DRM can not use stealth mode when
> running without fbdev present since it will mess up hotplug by not
> marking the resources in use.
>
> I don't believe the ordering between fbdev and DRM is an issue. If you
> are using fbdev you likely have it compiled in. In that case fbdev
> always loads first and DRM second. In the non-ppc world, most of us
> have x86 boxes which don't use fbdev. In those machines DRM needs to be
> a first class driver. In the real world I don't know anyone other than
> a developer who would load DRM first and then fbdev. If this is a
> problem you will need to fix fbdev to fall back into stealth mode
> like DRM does.

This is a good point: we are being forced into stealth mode by the fb
driver. If they want to load after us, they should respect us and do the
same (no, this isn't an us-and-them, DRM vs fb; I think we have a
solution and are heading in the correct direction)...

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 12:37                 ` Christoph Hellwig
@ 2004-08-16 23:33                   ` Dave Airlie
  0 siblings, 0 replies; 341+ messages in thread
From: Dave Airlie @ 2004-08-16 23:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alan Cox, torvalds, Andrew Morton, Linux Kernel Mailing List

>
> All the fbdev handling code in X is also an accident?

I've no idea, I've nothing to do with X... but the fact that graphics work
at all with fb/drm/X is not by design, it is pure hack...

> Really, why do you even push for this change if the better fix isn't that
> far away?  Send the i915 driver and the other misc cleanups to Linus now
> and get a proper graphics stub driver done, it's not that much work.  I'll
> hack up the fbdev side once I get a little time, but the drm code is
> far too disgusting to touch, sorry.

It means writing 6 or 7 stub drivers, for cards we don't have, it means
making PCI probing different for some fbdev drivers and some DRM drivers
(e.g. the i915 doesn't have a framebuffer driver in 2.6 so do I write a
stub on the chance that someone writes an fb driver for it? -  why do this
when the DRM will start encompassing the fb soon..) it is a lot of work
that we intend throwing away, the final solution is not to merge DRM/fb
via a stub, it is to create a single driver for each card, what happens
when the DRM starts doing memory management and 2d stuff.. we won't want
fb to be able to load anymore as it will break the DRM...I see Jon Smirl
has found the thread, please discuss with him as he was the one doing all
the legwork at the kernel summit...

again this doesn't break any real setups, it is the path of least
resistance as it doesn't affect fb drivers, why should DRM be a second
class citizen, when it is clearly going to have to be a first class 2.6
driver to do its job... if you can find someone with a real world setup
that this breaks I'll consider it a really bad idea... but I think Jon has
made his point far better than I...

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
@ 2004-08-16 15:42 Jon Smirl
  2004-08-16 23:55 ` Dave Airlie
  0 siblings, 1 reply; 341+ messages in thread
From: Jon Smirl @ 2004-08-16 15:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: torvalds, Andrew Morton, linux-kernel, Dave Airlie

Graphics drivers in the kernel are broken. The kernel was never
designed to have two device drivers trying to control the same piece of
hardware. 
I have posted a long list of 25 points that we are working towards to
unify things. http://lkml.org/lkml/2004/8/2/111 The PCI ROM patch that
has been posted recently addresses the first one.

In the meanwhile we have to transition somehow between what we have and
where we are going. Since fbdev has taken the path to pretend that DRM
doesn't exist DRM has to go through a lot of trouble to work when fbdev
is in the system. DRM also has to work when fbdev is not in the system.

DRM is being reworked into a first class driver with full support for
2.6 and hotplug. Part of being a first class driver means that DRM has
to register itself with the kernel like a real driver and claim all of
its resources. I'm also fixing the driver to use 2.6 module parameters
and to support dynamic assignment of minors. Sysfs support is in the
patch being discussed.

But DRM still has to live with existing fbdev drivers. The same DRM
code is used in 2.4 and 2.6 so existing fbdev drivers are not going
away anytime soon. When DRM detects a fbdev it will revert back into
stealth mode where it attaches itself to the hardware without telling
the kernel that it is doing so. DRM can not use stealth mode when
running without fbdev present since it will mess up hotplug by not
marking the resources in use.
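
As a rough illustration of the two modes, consider the toy model below.
None of these names are kernel APIs, and the real driver works on PCI
resources rather than a single owner pointer; the sketch only shows the
"claim if alone, otherwise stay in stealth mode" decision.

```c
#include <errno.h>
#include <stddef.h>

/* Toy model of one device's resources: at most one owner may claim
 * them. None of these names are kernel APIs. */
struct toy_device {
	const char *owner;	/* NULL while nobody has claimed the device */
};

/* What a first class driver does: claim, or fail if already owned. */
static int toy_claim(struct toy_device *dev, const char *who)
{
	if (dev->owner)
		return -EBUSY;
	dev->owner = who;
	return 0;
}

/* DRM attach as described above: claim when alone, otherwise fall back
 * to stealth mode and use the hardware without registering.
 * Returns 1 if the device was claimed, 0 if running in stealth mode. */
static int toy_drm_attach(struct toy_device *dev)
{
	return toy_claim(dev, "drm") == 0;
}
```

In this model a driver that claims the device first blocks any later
claim, which is the load-order concern raised elsewhere in the thread.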

I don't believe the ordering between fbdev and DRM is an issue. If you
are using fbdev you likely have it compiled in. In that case fbdev
always loads first and DRM second. In the non-ppc world, most of us
have x86 boxes which don't use fbdev. In those machines DRM needs to be
a first class driver. In the real world I don't know anyone other than
a developer who would load DRM first and then fbdev. If this is a
problem you will need to fix fbdev to fall back into stealth mode
like DRM does.

I would like to encourage you to work towards the points on the above
referenced list. It has been widely distributed and commented on. It
has been posted to lkml, dri-dev, fb-dev and xorg lists and discussed
at OLS. 

Sorry, but I can't add an In-Reply-To header in the middle of thread on
yahoo. cc me on a reply to the main thread so that I will pick up the header.

=====
Jon Smirl
jonsmirl@yahoo.com


		
__________________________________
Do you Yahoo!?
Yahoo! Mail - 50x more storage than other providers!
http://promotions.yahoo.com/new_mail

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 12:24               ` Dave Airlie
@ 2004-08-16 12:37                 ` Christoph Hellwig
  2004-08-16 23:33                   ` Dave Airlie
  0 siblings, 1 reply; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-16 12:37 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Christoph Hellwig, Alan Cox, torvalds, Andrew Morton,
	Linux Kernel Mailing List

On Mon, Aug 16, 2004 at 01:24:30PM +0100, Dave Airlie wrote:
> >
> > Works fine on all my pmacs here.  In fact X works only on fbdev for
> > full features.
> 
> I think Alan would classify that as luck rather than design... and I would

All the fbdev handling code in X is also an accident?

Really, why do you even push for this change if the better fix isn't that
far away?  Send the i915 driver and the other misc cleanups to Linus now
and get a proper graphics stub driver done, it's not that much work.  I'll
hack up the fbdev side once I get a little time, but the drm code is
far too disgusting to touch, sorry.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 12:20             ` Christoph Hellwig
@ 2004-08-16 12:24               ` Dave Airlie
  2004-08-16 12:37                 ` Christoph Hellwig
  0 siblings, 1 reply; 341+ messages in thread
From: Dave Airlie @ 2004-08-16 12:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alan Cox, torvalds, Andrew Morton, Linux Kernel Mailing List

>
> Works fine on all my pmacs here.  In fact X works only on fbdev for
> full features.

I think Alan would classify that as luck rather than design... and I would
tend to agree. Does it work if you load the driver modules in any order?
Or do you always load fb then drm? Or the other way around?

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 11:12           ` Alan Cox
@ 2004-08-16 12:20             ` Christoph Hellwig
  2004-08-16 12:24               ` Dave Airlie
  0 siblings, 1 reply; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-16 12:20 UTC (permalink / raw)
  To: Alan Cox
  Cc: Christoph Hellwig, Dave Airlie, torvalds, Andrew Morton,
	Linux Kernel Mailing List

On Mon, Aug 16, 2004 at 12:12:00PM +0100, Alan Cox wrote:
> On Llu, 2004-08-16 at 10:50, Christoph Hellwig wrote:
> > no, now you're acting like an even more broken driver, preventing an fbdev
> > driver from being loaded afterwards and doing all kinds of funny things.  Please
> > revert to the old method until you have a common pci_driver for fbdev and dri.
> 
> fbdev and DRI are not functional together in the general case. They
> sometimes happen to work by luck. fbdev and X for that matter are
> generally incompatible except unaccelerated.

Works fine on all my pmacs here.  In fact X works only on fbdev for
full features.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 11:08                 ` Christoph Hellwig
  2004-08-16 11:12                   ` Alan Cox
@ 2004-08-16 11:47                   ` Dave Airlie
  1 sibling, 0 replies; 341+ messages in thread
From: Dave Airlie @ 2004-08-16 11:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: torvalds, Andrew Morton, linux-kernel


> > Yes and that is the final goal but you are dodging the point we cannot
> > jump to a fully finished state in one simple transition, it is great to
> > hear "fbdrv/drm into a common driver" it's a simple sentence surely coding
> > it must be simple, well its not and we are taking the route that should
>
> It _is_ simple.  Look at drivers/message/fusion/ for a driver doing multiple
> protocols on a single pci_driver.  I don't demand full-blown memory management
> integration or anything else fancy.  Just get your crap sorted out.
>
> You could probably have done a prototype in the time you wasted arguing here.

We could write one quickly enough for one card, but now make it work on
combinations of mach64/i810/radeon/r128/i830/i915/mga cards and test it so
that it doesn't break current setups; it's just not going to happen. This
change doesn't break nearly as many setups (I'd be surprised if it broke any
real world setups at all...). I don't have the hardware to test this on all
those cards. The hope is to get the DRM into a state where we can start
proving the shared idea on one card. It will also mean changes to fb
drivers which I'm not comfortable making and which will cause more hassles...

> I want you a) to back out this particular broken change in your current
> mega-patch.  and b) submit small reviewable changes in the future, as every
> other driver maintainer does.

I'm considering your argument and have taken it on board. I await Linus's
decision for now; I'll start looking into the info you've given me and
I'll talk to the DRM people actually doing the work (not one line of this
is originally from me!!..)

All DRM changes are available in small chunks in DRM CVS and DRM bk trees,
the -mm tree picks up the DRM changes and I fix the bugs that come up in
the -mm tree and then I submit the bk tree to Linus, I thought this was
how kernel development worked these days,

The patch you are against is
http://drm.bkbits.net:8080/drm-2.6/patch@1.1784.4.4?nav=index.html|tags|ChangeSet@1.1722.154.18..|cset@1.1784.4.4

with a couple of bugfixes on top of it from testing in -mm.. if I'm
missing the kernel development process somehow please inform me.. I'm new
to this maintainer job and the drm hasn't been maintained in years so I'm
not starting from a good place...

Thanks,
Dave.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-08-16 11:08                 ` Christoph Hellwig
@ 2004-08-16 11:12                   ` Alan Cox
  2004-08-16 11:47                   ` Dave Airlie
  1 sibling, 0 replies; 341+ messages in thread
From: Alan Cox @ 2004-08-16 11:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Airlie, torvalds, Andrew Morton, Linux Kernel Mailing List

On Mon, 2004-08-16 at 12:08, Christoph Hellwig wrote:
> I want you a) to back out this particular broken change in your current
> mega-patch.  and b) submit small reviewable changes in the future, as every
> other driver maintainer does.

DRI is done as small reviewable changes. If you want to be involved then
follow the DRI list too or ask for the entire list to be gated to
linux-kernel for your pleasure...



* Re: your mail
  2004-08-16  9:50         ` Christoph Hellwig
  2004-08-16 10:29           ` Dave Airlie
@ 2004-08-16 11:12           ` Alan Cox
  2004-08-16 12:20             ` Christoph Hellwig
  1 sibling, 1 reply; 341+ messages in thread
From: Alan Cox @ 2004-08-16 11:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Airlie, torvalds, Andrew Morton, Linux Kernel Mailing List

On Mon, 2004-08-16 at 10:50, Christoph Hellwig wrote:
> no, now you're acting like an even more broken driver, preventing a fbdev
> driver to be loaded afterwards and doing all kinds of funny things.  Please
> revert to the old method until you have a common pci_driver for fbdev and dri.

fbdev and DRI are not functional together in the general case. They
sometimes happen to work by luck. fbdev and X for that matter are
generally incompatible except unaccelerated.




* Re: your mail
  2004-08-16 11:02               ` Dave Airlie
@ 2004-08-16 11:08                 ` Christoph Hellwig
  2004-08-16 11:12                   ` Alan Cox
  2004-08-16 11:47                   ` Dave Airlie
  0 siblings, 2 replies; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-16 11:08 UTC (permalink / raw)
  To: Dave Airlie; +Cc: torvalds, Andrew Morton, linux-kernel

On Mon, Aug 16, 2004 at 12:02:15PM +0100, Dave Airlie wrote:
> > 	You do stop fb from being loaded after drm
> > and thus break perfectly working setups during stable series.  And you
> 
> I doubt anyone has a system that does it and they should have a broken one
> if they do it.. drm has also said you should load fb before it.. and
> having both fb and drm loaded on the same hardware is a hack anyways..

So fix it properly instead of making it even more broken.

> Yes and that is the final goal but you are dodging the point we cannot
> jump to a fully finished state in one simple transition, it is great to
> hear "fbdrv/drm into a common driver" it's a simple sentence surely coding
> it must be simple, well its not and we are taking the route that should

It _is_ simple.  Look at drivers/message/fusion/ for a driver doing multiple
protocols on a single pci_driver.  I don't demand full-blown memory management
integration or anything else fancy.  Just get your crap sorted out.

You could probably have done a prototype in the time you wasted arguing here.

> You seem to want us to go down the finished unmergeable mega-patch road
> to avoid breaking something that is broken and might work, the benefits
> don't outweigh the costs.. so it makes no sense..

I want you a) to back out this particular broken change in your current
mega-patch.  and b) submit small reviewable changes in the future, as every
other driver maintainer does.



* Re: your mail
  2004-08-16 10:38             ` Christoph Hellwig
@ 2004-08-16 11:02               ` Dave Airlie
  2004-08-16 11:08                 ` Christoph Hellwig
  0 siblings, 1 reply; 341+ messages in thread
From: Dave Airlie @ 2004-08-16 11:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: torvalds, Andrew Morton, linux-kernel


>
> 3) stop making broken changes.

The current system is broken in way more subtle ways...

> 	You do stop fb from being loaded after drm
> and thus break perfectly working setups during stable series.  And you

I doubt anyone has a system that does this, and if they do, it should already
be broken. The DRM has also said you should load fb before it, and
having both fb and drm loaded on the same hardware is a hack anyway.

> introduce indeterministic behaviour, and although I haven't looked at the
> code because unlike every guideline tells you you didn't post it to the
> list, probably horribly broken code.

I just did post it. It's been in the DRM CVS tree for 3-6 months now and
in -mm for 1.5 months. I've followed what Andrew and Linus told me to
do to get the DRM maintained. The link I posted in the last mail points to the
broken-out patch in the -mm tree; the only files that really change are
drm_drv.h and some bits in drm_stub.h. We have discovered that the current
code is horribly broken in a lot of cases, and I've gotten nothing back to
say this code is any worse.

> If you want pci_driver semantics - and apparently you do - move fbdev
> and drm into a common driver or introduce a stub.  This was discussed to
> death and all kinds of list and Kernel Summit and now please follow what
> was agreed on instead of introducing subtle hacks.

Yes, and that is the final goal, but you are dodging the point: we cannot
jump to a fully finished state in one simple transition. It is great to
hear "fbdrv/drm into a common driver"; it's a simple sentence, so surely coding
it must be simple. Well, it's not, and we are taking the route that should
affect the fewest people. I'm heavily involved in the discussion and I was
the one who agreed to carry out the maintenance paths between DRM and LK.
This code is needed for us to move forward with the merged drivers. If
Linus/Andrew decide not to merge it, I'll go back to the DRM team and it'll
be reworked until they do accept it, but we have to stop the fb from
loading after the DRM at some stage, and it may as well be earlier (if
2.7 was going to happen I'd wait, but kernel development seems to be
changing...).

You seem to want us to go down the finished, unmergeable mega-patch road
to avoid breaking something that is already broken and only might work. The
benefits don't outweigh the costs, so it makes no sense.

Again, if Linus/Andrew bounce this we will have to rework it, but something
like this has to go in at some stage...

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person



* Re: your mail
  2004-08-16 10:29           ` Dave Airlie
@ 2004-08-16 10:38             ` Christoph Hellwig
  2004-08-16 11:02               ` Dave Airlie
  0 siblings, 1 reply; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-16 10:38 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Christoph Hellwig, torvalds, Andrew Morton, linux-kernel

On Mon, Aug 16, 2004 at 11:29:48AM +0100, Dave Airlie wrote:
> 1) move the DRM to be a real PCI driver now - stop fb from working on same
> card
> 
> 2) move the DRM to act like a real PCI driver when fb isn't loaded, when
> we merge we rip the code out..

3) stop making broken changes.

	You do stop fb from being loaded after drm
and thus break perfectly working setups during a stable series.  And you
introduce nondeterministic behaviour and, although I haven't looked at the
code because, contrary to every guideline, you didn't post it to the
list, probably horribly broken code.

If you want pci_driver semantics - and apparently you do - move fbdev
and drm into a common driver or introduce a stub.  This was discussed to
death on all kinds of lists and at Kernel Summit, so now please follow what
was agreed on instead of introducing subtle hacks.


* Re: your mail
  2004-08-16  9:50         ` Christoph Hellwig
@ 2004-08-16 10:29           ` Dave Airlie
  2004-08-16 10:38             ` Christoph Hellwig
  2004-08-16 11:12           ` Alan Cox
  1 sibling, 1 reply; 341+ messages in thread
From: Dave Airlie @ 2004-08-16 10:29 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: torvalds, Andrew Morton, linux-kernel


>
> no, now you're acting like an even more broken driver, preventing a fbdev
> driver to be loaded afterwards and doing all kinds of funny things.  Please
> revert to the old method until you have a common pci_driver for fbdev and dri.
>

the options we have are
1) move the DRM to be a real PCI driver now - stop fb from working on same
card

2) move the DRM to act like a real PCI driver when fb isn't loaded, when
we merge we rip the code out..

the other option is not going to happen unless Linus/Andrew/Alan tell us
to go away and do it that way and will then unconditionally merge a
mega-patch when I'm finished. You can't have it both ways: we fix things
step by step, or we leave it as is and nobody fixes it. So, Christoph, I
respect your opinion, but unless you care about this enough to do the work
on it, the way we are going seems to be the best way to avoid breaking
things, and I'm leaving the decision on whether to merge this stuff or not
to Linus/Andrew. BTW, in case anyone wants to look, the patch is at:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.8-rc4/2.6.8-rc4-mm1/broken-out/bk-drm.patch

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person



* Re: your mail
  2004-08-16  9:30       ` Dave Airlie
@ 2004-08-16  9:50         ` Christoph Hellwig
  2004-08-16 10:29           ` Dave Airlie
  2004-08-16 11:12           ` Alan Cox
  0 siblings, 2 replies; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-16  9:50 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Christoph Hellwig, torvalds, Andrew Morton, linux-kernel

On Mon, Aug 16, 2004 at 10:30:55AM +0100, Dave Airlie wrote:
> >
> > Eeek, doing different styles of probing is even worse than what you did
> > before.  Please revert to pci_find_device() until you have a proper common
> > driver ready.
> 
> There was nothing wrong with what we did before it just happened to work
> like 2.4. we are now acting like real 2.6 drivers,

No, now you're acting like an even more broken driver, preventing an fbdev
driver from being loaded afterwards and doing all kinds of funny things.  Please
revert to the old method until you have a common pci_driver for fbdev and dri.



* Re: your mail
  2004-08-16  9:17     ` Christoph Hellwig
@ 2004-08-16  9:30       ` Dave Airlie
  2004-08-16  9:50         ` Christoph Hellwig
  0 siblings, 1 reply; 341+ messages in thread
From: Dave Airlie @ 2004-08-16  9:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: torvalds, Andrew Morton, linux-kernel

>
> Eeek, doing different styles of probing is even worse than what you did
> before.  Please revert to pci_find_device() until you have a proper common
> driver ready.

There was nothing wrong with what we did before; it just happened to work
like 2.4. We are now acting like real 2.6 drivers, which we need to do for
sysfs and hotplug to work. Jon Smirl is working on proper minor device
support (like USB does, I think). We need to get this work done before we
can have proper common drivers, and I don't want to do all this work in
hiding and then have it refused because we told no-one.

The DRM will be in flux a lot over the next while (while we get this common
drm/fb stuff together), and as long as we can keep the changes from
actually breaking it I think people should be able to live with it...

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person



* Re: your mail
  2004-08-15 23:40   ` Dave Airlie
@ 2004-08-16  9:17     ` Christoph Hellwig
  2004-08-16  9:30       ` Dave Airlie
  0 siblings, 1 reply; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-16  9:17 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Christoph Hellwig, torvalds, Andrew Morton, linux-kernel

On Mon, Aug 16, 2004 at 12:40:43AM +0100, Dave Airlie wrote:
> Probably should say PCI APIs properly, it now does enable/disable devices
> and registers the DRM as owning the memory regions, does proper PCI
> probing .. in cases where the fb is loaded on the card already it falls
> back to the old ways (evil direct register writing.. ), this change will
> stop you loading the fb driver after the drm driver but this shouldn't be
> a common case at all..

Eeek, doing different styles of probing is even worse than what you did
before.  Please revert to pci_find_device() until you have a proper common
driver ready.



* Re: your mail
  2004-08-15 12:34 ` your mail Christoph Hellwig
@ 2004-08-15 23:40   ` Dave Airlie
  2004-08-16  9:17     ` Christoph Hellwig
  0 siblings, 1 reply; 341+ messages in thread
From: Dave Airlie @ 2004-08-15 23:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: torvalds, Andrew Morton, linux-kernel

> On Sun, Aug 15, 2004 at 01:19:31PM +0100, Dave Airlie wrote:
> > Graphics, and the DRM now uses PCI properly if no framebuffer is loaded
> > (it falls back if framebuffer is enabled...),
>
> Can you explain what this means?
>

I should probably have said it uses the PCI APIs properly: it now
enables/disables devices, registers the DRM as owning the memory regions, and
does proper PCI probing. In cases where the fb is already loaded on the card
it falls back to the old ways (evil direct register writing...). This change
will stop you loading the fb driver after the drm driver, but this shouldn't
be a common case at all.

Dave.

-- 
David Airlie, Software Engineer
http://www.skynet.ie/~airlied / airlied at skynet.ie
pam_smb / Linux DECstation / Linux VAX / ILUG person



* Re: your mail
  2004-08-15 12:19 Dave Airlie
@ 2004-08-15 12:34 ` Christoph Hellwig
  2004-08-15 23:40   ` Dave Airlie
  0 siblings, 1 reply; 341+ messages in thread
From: Christoph Hellwig @ 2004-08-15 12:34 UTC (permalink / raw)
  To: Dave Airlie; +Cc: torvalds, Andrew Morton, linux-kernel

On Sun, Aug 15, 2004 at 01:19:31PM +0100, Dave Airlie wrote:
> Graphics, and the DRM now uses PCI properly if no framebuffer is loaded
> (it falls back if framebuffer is enabled...),

Can you explain what this means?



* Re: your mail
  2004-05-24 23:04 Laughlin, Joseph V
  2004-05-24 23:13 ` Bernd Petrovitsch
@ 2004-05-24 23:21 ` Chris Wright
  1 sibling, 0 replies; 341+ messages in thread
From: Chris Wright @ 2004-05-24 23:21 UTC (permalink / raw)
  To: Laughlin, Joseph V; +Cc: Herbert Poetzl, linux-kernel

* Laughlin, Joseph V (Joseph.V.Laughlin@boeing.com) wrote:
> Currently, we're using sched_setaffinity() to control it, which existed
> in our 2.4.19 kernel.  (but, you have to be root to use it, and we'd
> like non-root users to be able to change the affinity.)

Sounds like it's patched in.  And it likely doesn't require root per se,
but CAP_SYS_NICE (as the 2.6 code does).

So, you've got choices of how to disable those capability checks to do
what you want.
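
For reference, the affinity call under discussion can be exercised from
userspace on a current glibc/Linux system (an assumption; this is not the
patched 2.4.19 kernel mentioned above). Per sched_setaffinity(2), a process
may change its own affinity, or that of a process with a matching UID,
without privilege; CAP_SYS_NICE is only needed for other users' processes.
The helper name pin_to_first_allowed_cpu is made up for this sketch.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/*
 * Pin process `pid` (0 means the caller) to the lowest-numbered CPU it is
 * currently allowed to run on.  Returns that CPU index, or -1 on failure
 * (errno is left as set by the failing call).
 */
int pin_to_first_allowed_cpu(pid_t pid)
{
    cpu_set_t allowed, target;

    /* Ask the kernel which CPUs the process may use right now. */
    if (sched_getaffinity(pid, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;

        /* Build a mask holding only this CPU and apply it. */
        CPU_ZERO(&target);
        CPU_SET(cpu, &target);
        if (sched_setaffinity(pid, sizeof(target), &target) != 0)
            return -1;
        return cpu;
    }
    return -1; /* empty mask; should not happen for a live process */
}
```

Calling pin_to_first_allowed_cpu(0) pins the caller itself. Passing the PID
of another user's process is expected to fail with EPERM unless the caller
holds CAP_SYS_NICE.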

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net


* RE: your mail
@ 2004-05-24 23:15 Laughlin, Joseph V
  0 siblings, 0 replies; 341+ messages in thread
From: Laughlin, Joseph V @ 2004-05-24 23:15 UTC (permalink / raw)
  To: Bernd Petrovitsch, linux-kernel

> -----Original Message-----
> From: Bernd Petrovitsch [mailto:bernd@firmix.at] 
> Sent: Monday, May 24, 2004 4:13 PM
> To: Laughlin, Joseph V; linux-kernel@vger.kernel.org
> Subject: RE: your mail
> 
> 
> On Tue, 2004-05-25 at 01:04, Laughlin, Joseph V wrote:
> > > -----Original Message-----
> [...]
> > > On Mon, May 24, 2004 at 03:20:33PM -0700, Laughlin, 
> Joseph V wrote:
> > > > I've been tasked with modifying a 2.4 kernel so that a
> > > non-root user
> > > > can do the following:
> > > > 
> > > > Dynamically change the priorities of processes (up and 
> down) Lock
> > > > processes in memory Can change process cpu affinity
> [...]
> > Currently, we're using sched_setaffinity() to control it, which 
> > existed in our 2.4.19 kernel.  (but, you have to be root to use it, 
> > and we'd like non-root users to be able to change the affinity.)
> 
> And using sudo or setuid Binaries?
> 
> 	Bernd
> -- 

Not an option, unfortunately. 


* RE: your mail
  2004-05-24 23:04 Laughlin, Joseph V
@ 2004-05-24 23:13 ` Bernd Petrovitsch
  2004-05-24 23:21 ` Chris Wright
  1 sibling, 0 replies; 341+ messages in thread
From: Bernd Petrovitsch @ 2004-05-24 23:13 UTC (permalink / raw)
  To: Laughlin, Joseph V, linux-kernel

On Tue, 2004-05-25 at 01:04, Laughlin, Joseph V wrote:
> > -----Original Message-----
[...]
> > On Mon, May 24, 2004 at 03:20:33PM -0700, Laughlin, Joseph V wrote:
> > > I've been tasked with modifying a 2.4 kernel so that a 
> > non-root user 
> > > can do the following:
> > > 
> > > Dynamically change the priorities of processes (up and down) Lock 
> > > processes in memory Can change process cpu affinity
[...]
> Currently, we're using sched_setaffinity() to control it, which existed
> in our 2.4.19 kernel.  (but, you have to be root to use it, and we'd
> like non-root users to be able to change the affinity.)

And using sudo or setuid Binaries?

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services




* RE: your mail
@ 2004-05-24 23:04 Laughlin, Joseph V
  2004-05-24 23:13 ` Bernd Petrovitsch
  2004-05-24 23:21 ` Chris Wright
  0 siblings, 2 replies; 341+ messages in thread
From: Laughlin, Joseph V @ 2004-05-24 23:04 UTC (permalink / raw)
  To: Herbert Poetzl; +Cc: linux-kernel

> -----Original Message-----
> From: Herbert Poetzl [mailto:herbert@13thfloor.at] 
> Sent: Monday, May 24, 2004 3:30 PM
> To: Laughlin, Joseph V
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: your mail
> 
> 
> On Mon, May 24, 2004 at 03:20:33PM -0700, Laughlin, Joseph V wrote:
> > I've been tasked with modifying a 2.4 kernel so that a 
> non-root user 
> > can do the following:
> > 
> > Dynamically change the priorities of processes (up and down) Lock 
> > processes in memory Can change process cpu affinity
> > 
> > Anyone got any ideas about how I could start doing this?  
> (I'm new to 
> > kernel development, btw.)
> 
> check the kernel capability system ...
> (it's quite simple)
> 
> #define CAP_SYS_NICE         23
> #define CAP_IPC_LOCK         14
> 
> cpu scheduler affinity isn't part of 2.4 AFAIK
> so there is no easy way to 'control' it ...
> 

Currently, we're using sched_setaffinity() to control it, which existed
in our 2.4.19 kernel.  (but, you have to be root to use it, and we'd
like non-root users to be able to change the affinity.)

Joe



* Re: your mail
  2004-05-24 22:30 ` your mail Herbert Poetzl
@ 2004-05-24 22:34   ` Marc-Christian Petersen
  0 siblings, 0 replies; 341+ messages in thread
From: Marc-Christian Petersen @ 2004-05-24 22:34 UTC (permalink / raw)
  To: linux-kernel; +Cc: Herbert Poetzl, Laughlin, Joseph V

On Tuesday 25 May 2004 00:30, Herbert Poetzl wrote:

Hi Joseph,

> > Dynamically change the priorities of processes (up and down)
> > Lock processes in memory
> > Can change process cpu affinity
> > Anyone got any ideas about how I could start doing this?  (I'm new to
> > kernel development, btw.)
> check the kernel capability system ...
> (it's quite simple)
> #define CAP_SYS_NICE         23
> #define CAP_IPC_LOCK         14
> cpu scheduler affinity isn't part of 2.4 AFAIK
> so there is no easy way to 'control' it ...

At least I have a patch in my 2.4 tree that lets a user in a predefined GID
(changeable via /proc) change the nice value of his/her own processes up
and down.

ciao, Marc


* Re: your mail
  2004-05-24 22:20 Laughlin, Joseph V
  2004-05-24 22:30 ` your mail Herbert Poetzl
@ 2004-05-24 22:33 ` Chris Wright
  1 sibling, 0 replies; 341+ messages in thread
From: Chris Wright @ 2004-05-24 22:33 UTC (permalink / raw)
  To: Laughlin, Joseph V; +Cc: linux-kernel

* Laughlin, Joseph V (Joseph.V.Laughlin@boeing.com) wrote:
> I've been tasked with modifying a 2.4 kernel so that a non-root user can
> do the following:
> 
> Dynamically change the priorities of processes (up and down)

Requires CAP_SYS_NICE.

> Lock processes in memory

Currently requires CAP_IPC_LOCK.  However, this one has already been
done using rlimits (at least via mlock() and friends; SHM_LOCK has a
different issue).
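
A rough userspace sketch of the rlimit-based approach, assuming a kernel
where unprivileged mlock() is bounded by RLIMIT_MEMLOCK (as mlock(2)
documents for later 2.6 kernels). fits_memlock_limit and try_mlock are
hypothetical helper names. The pre-check is only advisory, since the kernel
accounts in whole pages and also counts memory the process has already
locked.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Return 1 if a request of `len` bytes fits under `soft_limit`, else 0. */
int fits_memlock_limit(size_t len, rlim_t soft_limit)
{
    return soft_limit == RLIM_INFINITY || len <= soft_limit;
}

/*
 * Try to lock [addr, addr + len) into RAM without needing CAP_IPC_LOCK.
 * Returns 0 if locked, 1 if the request exceeds the soft RLIMIT_MEMLOCK,
 * and -1 if getrlimit() or mlock() itself failed.
 */
int try_mlock(void *addr, size_t len)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (!fits_memlock_limit(len, rl.rlim_cur))
        return 1;
    return mlock(addr, len) == 0 ? 0 : -1;
}
```

A caller would pair a successful try_mlock() with munlock() on the same
range once the memory no longer needs to stay resident.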

> Can change process cpu affinity

Requires CAP_SYS_NICE (but I believe this was a 2.6 feature).

> Anyone got any ideas about how I could start doing this?  (I'm new to
> kernel development, btw.)

There are a few approaches floating about.  Probably the simplest is to
disable the checks globally, but this will also be less secure.  I have
an example of this in 2.6 if you'd like.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net


* Re: your mail
  2004-05-24 22:20 Laughlin, Joseph V
@ 2004-05-24 22:30 ` Herbert Poetzl
  2004-05-24 22:34   ` Marc-Christian Petersen
  2004-05-24 22:33 ` Chris Wright
  1 sibling, 1 reply; 341+ messages in thread
From: Herbert Poetzl @ 2004-05-24 22:30 UTC (permalink / raw)
  To: Laughlin, Joseph V; +Cc: linux-kernel

On Mon, May 24, 2004 at 03:20:33PM -0700, Laughlin, Joseph V wrote:
> I've been tasked with modifying a 2.4 kernel so that a non-root user can
> do the following:
> 
> Dynamically change the priorities of processes (up and down)
> Lock processes in memory
> Can change process cpu affinity
> 
> Anyone got any ideas about how I could start doing this?  (I'm new to
> kernel development, btw.)

check the kernel capability system ...
(it's quite simple)

#define CAP_SYS_NICE         23
#define CAP_IPC_LOCK         14

cpu scheduler affinity isn't part of 2.4 AFAIK
so there is no easy way to 'control' it ...

HTH,
Herbert

> Thanks,
> 
> Joe Laughlin
> Phantom Works - Integrated Technology Development Labs 
> The Boeing Company


* Re: your mail
  2004-04-29  3:03 whitehorse
@ 2004-04-29  3:21 ` Jon
  0 siblings, 0 replies; 341+ messages in thread
From: Jon @ 2004-04-29  3:21 UTC (permalink / raw)
  To: whitehorse; +Cc: linux-kernel

On Wed, Apr 28, 2004 at 11:03:08PM -0400, whitehorse@mustika.net wrote:
> Dear Sir,
>  I have a problem after compiling kernel 2.6.4 on a system running kernel
>  2.4.19. I use Debian woody. When I reboot into the new kernel, a message
>  like this appears:
>  "modprobe: QM_MODULES: function not implemented"
>  and I can't load my modules at boot. I am waiting for anyone who can
>  answer this. Please reply to this address. Thanks
> 
>  Best regards,
> 
>  Hafid
>  Indonesia
> 
You need to install module-init-tools, which is not in Debian Woody.
A backport of it for x86 machines is here:
http://www.backports.org/debian/dists/woody/module-init-tools/
-- 
Jon
http://tesla.resnet.mtu.edu
The only meaning in life is the meaning you create for it.



* Re: your mail
       [not found] <200404121623.42558.vda@port.imtp.ilyichevsk.odessa.ua>
@ 2004-04-13 13:46 ` James Morris
  0 siblings, 0 replies; 341+ messages in thread
From: James Morris @ 2004-04-13 13:46 UTC (permalink / raw)
  To: Denis Vlasenko
  Cc: David S. Miller, netdev,
	YOSHIFUJI Hideaki / 吉藤英明,
	linux-kernel

On Mon, 12 Apr 2004, Denis Vlasenko wrote:

> According to my measurements,
> 
> ip_vs_control_add() (from include/net/ip_vs.h) is called twice
> and
> sock_queue_rcv_skb() (from include/net/sock.h) is called 19 times
> from various kernel .c files.
> 
> Both these includes generate more than 500 bytes of code on x86.
> 
> These patches uninline them. Please apply.

What kind of performance impact (if any) does this patch have?


- James
-- 
James Morris
<jmorris@redhat.com>




* Re: your mail
  2004-04-09 17:54 Martin Knoblauch
@ 2004-04-09 18:12 ` Joel Jaeggli
  0 siblings, 0 replies; 341+ messages in thread
From: Joel Jaeggli @ 2004-04-09 18:12 UTC (permalink / raw)
  To: Martin Knoblauch; +Cc: linux-kernel

On Fri, 9 Apr 2004, Martin Knoblauch wrote:

> >I was wondering if for linux, or better for a linux filesystem,
> >there is something like dynamic swapping of files possible.
> >For explanation: I have access to an Infinstor via NFS and
> >linux is running there. This server has a nice function I'd
> >like to have: if there are files that are not used for a
> >specified time (e.g. 30 days) they are moved to another storage
> >(disk and after that to a streamer tape) and are replaced
> >by some kind of 'link'. So if you look at your directory you
> >can see everything that was there, but if you try to open it,
> >you have to wait a moment (some seconds if the file was
> >swapped to another disk) or a longer moment (some
> >minutes if the file is on a tape) and then it is restored to
> >its old place.
> >
> 
>  Good description of a HSM (Hierarchical Storage Management)
> System.
> 
> >So is there anything which provides such a feature? By now
> >I have a little script that moves such files out of the way and
> >replaces them by links. But restoring is somewhat harder and
> >it's not automatic.
> >
> >Any ideas?
> >

Part of the picture for us (my group at UO) right now is that tape robots
aren't cheaper than disk, so a lot of our offline/near-line backup is slowly
moving in that direction. 1TB LTO jukeboxes cost on the order of $8-9K each,
and the driver for your commercial tape-backup software can cost nearly that
much on top of it, but I can put 3.5TB of disk in a 5U enclosure and
locate it in some other building for a similar price, if not less. Even if I
buy it in something like a NetApp filer it's still only around $10,000 a TB,
so HSM systems involving tape don't really have the same appeal as when we
were paying $1200 each for 4GB SCSI disks. If I had sunk costs in something
like a StorageTek Powderhorn with 6000-tape capacity I might think a little
differently, but I suspect your situation is closer to mine than it is to
that of the sorts of people who buy those.

>  Really depends. As far as I know there are no "free" HSM Systems
> out there for Linux. The only one that I am faintly familiar with
> that runs on Linux is StorNext from ADIC. Definitely not free.
> 
>  DMF/Irix may now be ported to Linux (Altix/IA64), but I doubt
> it will be free.
> 
>  Sun is most likely not (yet) interested in doing a Linux port
> of SAM-FS (there are still Sparc/Solaris Machines to sell).
> And it won't be free (my guess).
> 
>  Tivoli/IBM and UniTree are also sold for Linux. Again "sold" is
> the important word
> 
> Martin
> 
> 
> =====
> ------------------------------------------------------
> Martin Knoblauch
> email: k n o b i AT knobisoft DOT de
> www:   http://www.knobisoft.de

-- 
-------------------------------------------------------------------------- 
Joel Jaeggli  	       Unix Consulting 	       joelja@darkwing.uoregon.edu    
GPG Key Fingerprint:     5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2




* Re: your mail
  2004-03-15 22:49 Kevin Leung
@ 2004-03-15 23:26 ` Richard B. Johnson
  0 siblings, 0 replies; 341+ messages in thread
From: Richard B. Johnson @ 2004-03-15 23:26 UTC (permalink / raw)
  To: Kevin Leung; +Cc: linux-kernel

On Mon, 15 Mar 2004, Kevin Leung wrote:

> Hello All,
>
> I am very new to Linux and am working on a project. The nature of the
> project is to essentially record all process/thread scheduling activity for
> use in a later application. I wanted to know if any experts out there knew
> of any libraries that could essentially "monitor" or "listen" for any
> scheduling changes made. For instance if the kernel decides to set process A
> from "sleeping" to "running" and process B from "running" to "sleeping", I
> wanted to know if there was a function that could generate an immediate
> notification of this event.

No. FYI, there are hundreds of thousands of such "events" per second
of operation! Basically, any time some task is waiting for I/O its
CPU is taken away and given to somebody else. This is what "sleeping"
usually means. Once the I/O completes, the task gets the CPU
again and that's what "running" means. If you were to instrument
these two state-changes for all tasks, it would certainly leave
only a few percent of CPU available for the tasks. This would
royally screw up the meaning of anything you were trying to
instrument.

> Priority change information is also desirable.

If you mean the dynamic priority that keeps changing until
the task is executed, no. If you mean priority like
'nice', you can instrument the sys-call.

> The more aspects which trigger notification, the better. As a first attempt,

There is a kernel logging daemon that writes 'printk' messages. This
works by having a user-mode daemon open and read /proc/kmsg. You can
make a similar communications interface, using the existing daemon
as a template, that will instrument anything you want.
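
A minimal sketch of the user-mode half of such an interface, assuming the
kernel side exposes a line-oriented file the way /proc/kmsg does. The helper
relay_lines is a hypothetical name, not part of any existing daemon; it
simply copies records from one stream to another, which is the core loop a
logging daemon runs.

```c
#include <stdio.h>

/*
 * Copy up to `max_lines` lines from `in` to `out` and return how many were
 * copied.  A line longer than the buffer is passed through in pieces and
 * each piece counts as one line.  On a blocking source such as /proc/kmsg,
 * fgets() sleeps until the kernel has a new record.
 */
long relay_lines(FILE *in, FILE *out, long max_lines)
{
    char line[1024];
    long copied = 0;

    while (copied < max_lines && fgets(line, sizeof line, in) != NULL) {
        fputs(line, out);
        copied++;
    }
    fflush(out);
    return copied;
}
```

A daemon would typically open the source with fopen("/proc/kmsg", "r"),
which normally requires root, and call relay_lines() in a loop with a log
file or stdout as the destination.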

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: your mail
  2004-02-24 13:58 Jim Deas
@ 2004-02-24 14:44 ` Richard B. Johnson
  0 siblings, 0 replies; 341+ messages in thread
From: Richard B. Johnson @ 2004-02-24 14:44 UTC (permalink / raw)
  To: Jim Deas; +Cc: linux-kernel

On Tue, 24 Feb 2004, Jim Deas wrote:

> Can someone point me in the right direction.
> I am getting a oops on a driver I am porting from 2.4 to 2.6.2 kernel.
> I have expanded the file_operations structures and have a driver that
> loads and inits the hardware but when I call the open function I
> get an oops. The best I can track it is
>

Fix your line-wrap!

> EIP 0060:[c0188954]
> chrdev_open +0x104
>
> What is the best debug tool to put this oops information in clear
> sight? It appears to never get to my modules open routine so I am
> at a debugging crossroad. What is the option on a kernel compile
> to get the compile listing so I can see what is at 0x104 in this
> block of code?
>

Nothing is going to help with that EIP with a segment value of
0x60. It looks like some dumb coding error, using a pointer
that disappeared after the module init function. In other
words, it's probably something like:

int __init init_module(void)
{
    struct file_operations fops;    /* BUG: lives on the stack */
    memset(&fops, 0x00, sizeof(fops));
    fops.open = open;
    fops.release = close;
    fops.owner = THIS_MODULE;
    return register_chrdev(DEV_MAJOR, dev, &fops);
}

So, once init_module returns, everything on its stack, fops included, is
GONE. Your program calls open() and the pointer the kernel kept gets
dereferenced to junk.

There are kernel debugging tools, however I have found that
the most useful tools are printk() and some discipline.

In the case of code above, don't just change the declaration
of the fops object to static. Instead, move it outside the
function, so it's obviously where it won't go away.
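
A userspace model of that lifetime rule, as a sketch only. toy_ops,
toy_init and toy_open are hypothetical stand-ins, not the kernel API;
the point is just that the registry keeps the pointer it is handed, so
the object it points to has to outlive the init function.

```c
/* Toy stand-in for struct file_operations; not the kernel's definition. */
struct toy_ops {
    int (*open)(void);
    int (*release)(void);
};

/* Toy stand-in for the table register_chrdev() fills in: it keeps the
 * pointer it is given, it does not copy the structure. */
static const struct toy_ops *registered;

static int my_open(void)    { return 42; }
static int my_release(void) { return 0; }

/* File scope: this object exists for as long as the module is loaded. */
static const struct toy_ops my_ops = {
    .open    = my_open,
    .release = my_release,
};

int toy_init(void)
{
    registered = &my_ops;   /* still valid after toy_init() returns */
    return 0;
}

/* Models the later open() from userspace, long after init has returned. */
int toy_open(void)
{
    return registered && registered->open ? registered->open() : -1;
}
```

With my_ops declared inside toy_init() instead, toy_open() would be
following a pointer into a dead stack frame.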

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-02-19 13:52 Joilnen Leite
@ 2004-02-19 14:12 ` Richard B. Johnson
  0 siblings, 0 replies; 341+ messages in thread
From: Richard B. Johnson @ 2004-02-19 14:12 UTC (permalink / raw)
  To: Joilnen Leite; +Cc: linux-kernel, linux-ide

On Thu, 19 Feb 2004, Joilnen Leite wrote:

> excuse me friends schedule_timeout(1) is not a problem
> for spin_lock_irqsave ?
>
> drivers/scsi/ide-scsi.c:897
>
> spin_lock_irqsave(&ide_lock, flags);
> while (HWGROUP(drive)->handler) {
>        HWGROUP(drive)->handler = NULL;
>        schedule_timeout(1);
> }
>
> pub 1024D/5139533E Joilnen Batista Leite
> F565 BD0B 1A39 390D 827E 03E5 0CD4 0F20 5139 533E

Which kernel version?  That is very bad. You can't sleep while
holding a spin-lock!
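
The usual shape of the cure is to drop the lock around the sleep and
retake it before looking at shared state again. Below is a toy model of
that shape, not kernel code and not a patch for ide-scsi.c: toy_lock,
toy_unlock and toy_sleep are hypothetical stand-ins for the spin-lock
calls and schedule_timeout(), and the "handler" simply goes away after
three polls.

```c
#include <stdbool.h>

/* Toy model, not kernel code: counts how many "spinlocks" are held. */
static int locks_held;
static bool handler_pending = true;
static int polls;

static void toy_lock(void)   { locks_held++; }
static void toy_unlock(void) { locks_held--; }

/* Stand-in for schedule_timeout(): returns false if called with a lock
 * held, which is the situation the quoted loop gets into. */
static bool toy_sleep(void)
{
    if (locks_held > 0)
        return false;
    if (++polls == 3)           /* pretend the handler finishes eventually */
        handler_pending = false;
    return true;
}

/* Wait for the handler to go away, dropping the lock around every sleep.
 * Returns true if no sleep ever happened under the lock. */
bool wait_for_handler(void)
{
    bool ok = true;

    toy_lock();
    while (handler_pending) {
        toy_unlock();           /* never sleep with the lock held */
        ok = toy_sleep() && ok;
        toy_lock();             /* retake it before reading shared state */
    }
    toy_unlock();
    return ok;
}
```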

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an Intel Pentium III machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.



^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail
@ 2004-02-13 19:23 Bloch, Jack
  0 siblings, 0 replies; 341+ messages in thread
From: Bloch, Jack @ 2004-02-13 19:23 UTC (permalink / raw)
  To: 'Maciej Zenczykowski'; +Cc: 'linux-kernel@vger.kernel.org'

By the way, shouldn't a munmap call really free the memory? I have an strace
showing that the process calls munmap a lot but I do not see any gaps in the
map file.

Jack Bloch 
Siemens ICN
phone                (561) 923-6550
e-mail                jack.bloch@icn.siemens.com


-----Original Message-----
From: Bloch, Jack 
Sent: Friday, February 13, 2004 2:14 PM
To: 'Maciej Zenczykowski'
Cc: linux-kernel@vger.kernel.org
Subject: RE: your mail


Yes, your assumption about the 1GB is correct.

Jack Bloch 
Siemens ICN
phone                (561) 923-6550
e-mail                jack.bloch@icn.siemens.com


-----Original Message-----
From: Maciej Zenczykowski [mailto:maze@cela.pl]
Sent: Friday, February 13, 2004 1:11 PM
To: Bloch, Jack
Cc: linux-kernel@vger.kernel.org
Subject: Re: your mail


The deleted marks in question mean that the file in question has been 
unlinked (rm'ed), however it is still being used and the inode in question 
still exists.  This memory is in use and thus validly takes up mapping 
space.  You'd need to unmap in order to free that memory.  Deleting a file 
does not delete that file until _all_ processes close and unmap any 
references to it.  What's more worrying is the large area of unmapped 
memory below 1GB (0x40000000), wonder why it doesn't get allocated?  But I 
think the answer is that the standard allocator only searches 1GB..3GB for 
free areas...

Cheers,
MaZe.

On Fri, 13 Feb 2004, Bloch, Jack wrote:

> I am running a 2.4.19 kernel and have a problem where a process is using
> address space all the way up to 0xC0000000. It is no longer possible for
> this process to get any more memory via mmap or via shmget. However, when
> I dump the /proc/#/maps file, I see large chunks of memory marked deleted,
> i.e. this should be freely available to be used by the next call. I do not
> see these addresses get re-used. The maps file is attached.
> 
>  <<9369>> 
> 
> Jack Bloch 
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com
> 
> 

^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail
@ 2004-02-13 19:14 Bloch, Jack
  0 siblings, 0 replies; 341+ messages in thread
From: Bloch, Jack @ 2004-02-13 19:14 UTC (permalink / raw)
  To: 'Maciej Zenczykowski'; +Cc: linux-kernel

Yes, your assumption about the 1GB is correct.

Jack Bloch 
Siemens ICN
phone                (561) 923-6550
e-mail                jack.bloch@icn.siemens.com


-----Original Message-----
From: Maciej Zenczykowski [mailto:maze@cela.pl]
Sent: Friday, February 13, 2004 1:11 PM
To: Bloch, Jack
Cc: linux-kernel@vger.kernel.org
Subject: Re: your mail


The deleted marks in question mean that the file in question has been 
unlinked (rm'ed), however it is still being used and the inode in question 
still exists.  This memory is in use and thus validly takes up mapping 
space.  You'd need to unmap in order to free that memory.  Deleting a file 
does not delete that file until _all_ processes close and unmap any 
references to it.  What's more worrying is the large area of unmapped 
memory below 1GB (0x40000000), wonder why it doesn't get allocated?  But I 
think the answer is that the standard allocator only searches 1GB..3GB for 
free areas...

Cheers,
MaZe.

On Fri, 13 Feb 2004, Bloch, Jack wrote:

> I am running a 2.4.19 kernel and have a problem where a process is using
> address space all the way up to 0xC0000000. It is no longer possible for
> this process to get any more memory via mmap or via shmget. However, when
> I dump the /proc/#/maps file, I see large chunks of memory marked deleted,
> i.e. this should be freely available to be used by the next call. I do not
> see these addresses get re-used. The maps file is attached.
> 
>  <<9369>> 
> 
> Jack Bloch 
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com
> 
> 

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-02-13 16:45 Bloch, Jack
@ 2004-02-13 18:11 ` Maciej Zenczykowski
  0 siblings, 0 replies; 341+ messages in thread
From: Maciej Zenczykowski @ 2004-02-13 18:11 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

The deleted marks in question mean that the file in question has been 
unlinked (rm'ed), however it is still being used and the inode in question 
still exists.  This memory is in use and thus validly takes up mapping 
space.  You'd need to unmap in order to free that memory.  Deleting a file 
does not delete that file until _all_ processes close and unmap any 
references to it.  What's more worrying is the large area of unmapped 
memory below 1GB (0x40000000), wonder why it doesn't get allocated?  But I 
think the answer is that the standard allocator only searches 1GB..3GB for 
free areas...
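
The unlink-while-mapped behaviour is easy to see from userspace. A
small sketch, assuming a writable /tmp; the function name and the
temporary file name are made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a temporary file, unlink it, and check that the mapping still works.
 * Returns 0 if the data stayed readable after unlink() and munmap()
 * succeeded, -1 on any failure.
 */
int unlinked_mapping_demo(void)
{
    char path[] = "/tmp/deleted_map_XXXXXX";   /* hypothetical temp name */
    const char msg[] = "still here";
    const size_t len = 4096;
    int ok, fd = mkstemp(path);
    char *p;

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0 || write(fd, msg, sizeof(msg)) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);              /* the mapping holds its own reference */
    unlink(path);           /* maps now lists this region as "(deleted)" */
    if (p == MAP_FAILED)
        return -1;

    ok = memcmp(p, msg, sizeof(msg)) == 0;   /* inode and pages still exist */
    if (munmap(p, len) < 0)                  /* only now is the range free */
        return -1;
    return ok ? 0 : -1;
}
```

Between the unlink() and the munmap() the region is exactly what shows
up as a deleted entry in the maps file: gone from the directory, but
still occupying address space.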

Cheers,
MaZe.

On Fri, 13 Feb 2004, Bloch, Jack wrote:

> I am running a 2.4.19 kernel and have a problem where a process is using
> address space all the way up to 0xC0000000. It is no longer possible for
> this process to get any more memory via mmap or via shmget. However, when
> I dump the /proc/#/maps file, I see large chunks of memory marked deleted,
> i.e. this should be freely available to be used by the next call. I do not
> see these addresses get re-used. The maps file is attached.
> 
>  <<9369>> 
> 
> Jack Bloch 
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com
> 
> 


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2004-02-10 23:36 Bloch, Jack
@ 2004-02-11  1:09 ` Maciej Zenczykowski
  0 siblings, 0 replies; 341+ messages in thread
From: Maciej Zenczykowski @ 2004-02-11  1:09 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

On Tue, 10 Feb 2004, Bloch, Jack wrote:

> I have a system with 2GB of memory. One of my processes calls mmap to try to
> map a 100MB file into memory. This call fails with -ENOMEM. I rebuilt the
> kernel with a few debug printk statements in mmap.c to see where the failure
> was occurring. It occurred in the function arch_get_unmapped_area. The code
> is as follows:
> 
> for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
> 		/* At this point:  (!vma || addr < vma->vm_end). */
> 		unsigned long __heap_stack_gap;
> 		if (TASK_SIZE - len < addr)
>                 { 

It's valid: there's no point in searching further for an area of at least
len bytes.  The user area is 0 .. TASK_SIZE-1.  The addr is the address
currently being checked, the len is the requested length.  If addr+len is
greater than TASK_SIZE then the current addr (which is increasing
within this loop) already causes such a mapping to overflow into kernel
space (exceeds the TASK_SIZE virtual address limit).  This is precisely as
expected.

I'd assume your program has fragmented memory to such a level that a 
single consecutive 100 MB area is no longer free (not that hard to do, 
since TASK_SIZE is 3 GB).
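
A toy first-fit search shows how fragmentation alone produces the
failure. This is a sketch and not the kernel's arch_get_unmapped_area:
struct toy_vma, TOY_TASK_SIZE and the sorted-array layout are
assumptions made for the example, and 0 stands in for -ENOMEM.

```c
#include <stddef.h>

#define TOY_TASK_SIZE 0xC0000000UL   /* 3 GB of user space, as on i386 */

struct toy_vma {
    unsigned long start, end;        /* [start, end), sorted by address */
};

/*
 * First-fit search for `len` free bytes at or above `addr`.
 * Returns the chosen address, or 0 to stand in for -ENOMEM.
 */
unsigned long toy_get_unmapped_area(const struct toy_vma *vma, size_t n,
                                    unsigned long addr, unsigned long len)
{
    size_t i = 0;

    if (len > TOY_TASK_SIZE)
        return 0;
    while (i < n && vma[i].end <= addr)      /* like find_vma(mm, addr) */
        i++;
    for (;; i++) {
        /* Same test as in the quoted code, written so it cannot wrap:
         * true exactly when addr + len would exceed TASK_SIZE. */
        if (TOY_TASK_SIZE - len < addr)
            return 0;
        if (i == n || addr + len <= vma[i].start)
            return addr;                     /* the hole is big enough */
        addr = vma[i].end;                   /* skip past this mapping */
    }
}
```

With two mappings that leave only 1 MB holes, a 1 MB request succeeds
and a 100 MB request walks off the end of the list and fails, however
much memory is free in total.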

Cheers,
MaZe.



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-12-26 22:27 ` your mail Linus Torvalds
@ 2004-01-05 10:59   ` Gerd Knorr
  0 siblings, 0 replies; 341+ messages in thread
From: Gerd Knorr @ 2004-01-05 10:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: caszonyi, linux-kernel

> 		....
>                         while (voffset >= sg_dma_len(vsg)) {
>                                 voffset -= sg_dma_len(vsg);
>                                 vsg++;
>                         }
> 		....

> I suspect the problem is that 
> 
> 	"voffset >= sg_dma_len(vsg)"
> 
> test: if "voffset" is _exactly_ the same as sg_dma_len(), then we will 
> test one more iteration (when "voffset" is 0), and that iteration may be 
> past the end of the "vsg" array.

That certainly makes sense, the 'v' plane is the last one in the memory
block for the video frame to be captured, so voffset / vsg will walk to
the last sg entry and may overrun as described.  Good catch, I'm impressed.

> I suspect the fix might be to change the test to
> 
> 	"voffset && voffset >= sg_dma_len(vsg)"

Merged into my tree, thanks.

still busy with the xmas mail backlog,

  Gerd


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-12-26 20:20 caszonyi
@ 2003-12-26 22:27 ` Linus Torvalds
  2004-01-05 10:59   ` Gerd Knorr
  0 siblings, 1 reply; 341+ messages in thread
From: Linus Torvalds @ 2003-12-26 22:27 UTC (permalink / raw)
  To: caszonyi; +Cc: linux-kernel, kraxel



On Fri, 26 Dec 2003 caszonyi@rdslink.ro wrote:
> 
> I was trying to capture a TV program with mencoder when the oops occurred.
> A couple of hours later the system froze without leaving a single trace
> in the logs. I was able to reboot with SysRq.
> 
> Programs versions, config and dmesg are attached.

Looks like this loop:

		....
                        while (voffset >= sg_dma_len(vsg)) {
                                voffset -= sg_dma_len(vsg);
                                vsg++;
                        }
		....

and in particular, it's the "sg_dma_len()" access that oopses, apparently 
because vsg was stale to begin with, or because it incremented past the 
last pointer.

The pointer that fails (0xc4bea00c) looks reasonable, so it's almost
certainly due to CONFIG_DEBUG_PAGEALLOC showing some kind of use-after-free
problem (ie the pointer is stale, and the memory has already been freed).

I suspect the problem is that 

	"voffset >= sg_dma_len(vsg)"

test: if "voffset" is _exactly_ the same as sg_dma_len(), then we will 
test one more iteration (when "voffset" is 0), and that iteration may be 
past the end of the "vsg" array.

I suspect the fix might be to change the test to

	"voffset && voffset >= sg_dma_len(vsg)"

to make sure that we never access vsg past the end of the array.
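
A standalone model of the two loop conditions, for anyone who wants to
see the off-by-one without the driver around it. sg_skip and its
parameters are hypothetical; the array of lengths stands in for
sg_dma_len() over the vsg entries, and the function reports the
would-be overrun instead of actually reading past the array.

```c
#include <stddef.h>

/*
 * Walk `len[0..n-1]` the way the quoted loop walks vsg.
 * `fixed` selects the "voffset && voffset >= ..." form of the test.
 * Returns the index the walk stops at; *overrun is set to 1 if the
 * loop test would have read len[n], i.e. one entry past the array.
 */
size_t sg_skip(const unsigned *len, size_t n, unsigned voffset,
               int fixed, int *overrun)
{
    size_t i = 0;

    *overrun = 0;
    for (;;) {
        if (fixed && voffset == 0)
            break;                  /* new test: stop before touching len[i] */
        if (i >= n) {
            *overrun = 1;           /* the driver would read len[n] here */
            break;
        }
        if (voffset < len[i])
            break;
        voffset -= len[i];
        i++;
    }
    return i;
}
```

With two 4096-byte entries and voffset == 8192, the original test
reports an overrun and the changed test does not; for an offset that
ends inside an entry the two behave the same.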
		

		Linus

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-12-23 14:54 ` your mail Matti Aarnio
@ 2003-12-23 17:36   ` Norberto Bensa
  0 siblings, 0 replies; 341+ messages in thread
From: Norberto Bensa @ 2003-12-23 17:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Matti Aarnio

[-- Attachment #1: signed data --]
[-- Type: text/plain, Size: 354 bytes --]

Matti Aarnio wrote:
> Folks, I don't understand you...
> In EVERY list posting there are explicit instructions
> of how to unsubscribe, and STILL people do it wrong...

People don't read.

Regards,
Norberto

-- 
Linux 2.6.0-mm1 Pentium III (Coppermine) GenuineIntel GNU/Linux
 14:35:46 up 39 min,  1 user,  load average: 0.34, 0.18, 0.13

[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-12-23 14:16 dublinux
@ 2003-12-23 14:54 ` Matti Aarnio
  2003-12-23 17:36   ` Norberto Bensa
  0 siblings, 1 reply; 341+ messages in thread
From: Matti Aarnio @ 2003-12-23 14:54 UTC (permalink / raw)
  To: dublinux; +Cc: linux-kernel

Folks, I don't understand you...
In EVERY list posting there are explicit instructions
of how to unsubscribe, and STILL people do it wrong...

Do tell us (postmaster@vger.kernel.org) if you do find that
there is something confusing, and should be improved.

  /Matti Aarnio -- one of  <postmaster@vger.kernel.org>

On Tue, Dec 23, 2003 at 03:16:22PM +0100, dublinux wrote:
> Date:	Tue, 23 Dec 2003 15:16:22 +0100
> From:	dublinux <dublinux@box.it>
> To:	linux-kernel@vger.kernel.org
> 
> unsubscribe linux-kernel
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <20031210120336.GU8039@holomorphy.com>
@ 2003-12-10 13:17 ` Stephan von Krawczynski
  0 siblings, 0 replies; 341+ messages in thread
From: Stephan von Krawczynski @ 2003-12-10 13:17 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: paul, marcelo.tosatti, thornber, linux-kernel

On Wed, 10 Dec 2003 04:03:36 -0800
William Lee Irwin III <wli@holomorphy.com> wrote:

> On Tue, 9 Dec 2003, William Lee Irwin III wrote:
> >>> Just apply the patch if you're for some reason terrified of 2.6.
> 
> On Wed, 10 Dec 2003 00:15:17 +0000 (GMT) Paul Jakma <paul@clubi.ie> wrote:
> >> Or get RedHat or Fedora to apply the patch.
> 
> On Wed, Dec 10, 2003 at 11:49:28AM +0000, skraw@ithnet.com wrote:
> > There it is again, this /dev/null argument.
> > "Multi-billion dollar companies" have gone bankrupt on the simple
> > fact that diversification of one product can rattle customers/users
> > to a degree that they in fact decide against the whole product range.
> > IOW go on with the idea to spread around an unknown number of kernel
> > versions and you can be sure that linux as a whole will greatly suffer.
> > This is a "user" issue, not a "developer" issue of course. Developers
> > can apply any kind of patches they like, but don't go and tell the
> > vast user base to "just apply patch xyz". They won't honor this at
> > all, your level of acceptance will dramatically drop.
> 
> One of the main reasons to have an open source OS is customization.
> Arguing that it's not truly feasible to customize will not hold water.

Are you calling a user-configured (not user-patched) kernel "customized" or
not?
_The_ top reason (at least when reading Al's posts :-) is probably that the
source is cross-checked by many eyes. If you create an infinite number of
patched kernel versions it is obvious you will lose this primary advantage.
The more versions, the less cross-checking.
IOW a "customized" but unstable OS is worth exactly zero.

> Pretty much every "productized" version of Linux is heavily customized
> to get some kind of value-add. There's no reason to bother mainline
> with this; if it's a serious user issue of that magnitude vendors will
> pick it up.

"Serious" is a subjective argument, therefore different people see different
issues as serious. In my opinion a kernel.org kernel should cover most if not
all possible stable customizations; see it as a pool.
So my primary question for inclusion would not be "what is it worth?" but "does
it do any harm?". I am not god, therefore I do not and can not judge
"worth". Can you?

Regards,
Stephan


^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail
@ 2003-12-03 16:19 Bloch, Jack
  0 siblings, 0 replies; 341+ messages in thread
From: Bloch, Jack @ 2003-12-03 16:19 UTC (permalink / raw)
  To: 'Linus Torvalds'; +Cc: linux-kernel

Thanks,

I found the problem. I do have errno.h included. I was doing a read of errno
after calling perror. If I read it directly after getting the negative one
back, it contains the right value.

Jack Bloch 
Siemens ICN
phone                (561) 923-6550
e-mail                jack.bloch@icn.siemens.com


-----Original Message-----
From: Linus Torvalds [mailto:torvalds@osdl.org]
Sent: Wednesday, December 03, 2003 11:04 AM
To: Bloch, Jack
Cc: linux-kernel@vger.kernel.org
Subject: Re: your mail




On Wed, 3 Dec 2003, Bloch, Jack wrote:
>
> I try to open a non-existent device driver node file. The kernel returns a
> value of -1 (expected). However, when I read the value of errno it contains
> a value of 29. A call to the perror function does print out the correct
> error message (a value of 2). Why does this happen?

Because you forgot a "#include <errno.h>"? Or you have something else
wrong in your program that makes "errno" mean the wrong thing?

		Linus

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-12-03 15:08 Bloch, Jack
  2003-12-03 15:43 ` your mail Richard B. Johnson
@ 2003-12-03 16:03 ` Linus Torvalds
  1 sibling, 0 replies; 341+ messages in thread
From: Linus Torvalds @ 2003-12-03 16:03 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel



On Wed, 3 Dec 2003, Bloch, Jack wrote:
>
> I try to open a non-existent device driver node file. The kernel returns a
> value of -1 (expected). However, when I read the value of errno it contains
> a value of 29. A call to the perror function does print out the correct
> error message (a value of 2). Why does this happen?

Because you forgot a "#include <errno.h>"? Or you have something else
wrong in your program that makes "errno" mean the wrong thing?

		Linus

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-12-03 15:08 Bloch, Jack
@ 2003-12-03 15:43 ` Richard B. Johnson
  2003-12-03 16:03 ` Linus Torvalds
  1 sibling, 0 replies; 341+ messages in thread
From: Richard B. Johnson @ 2003-12-03 15:43 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

On Wed, 3 Dec 2003, Bloch, Jack wrote:

> I try to open a non-existent device driver node file. The kernel returns a
> value of -1 (expected). However, when I read the value of errno it contains
> a value of 29. A call to the perror function does print out the correct
> error message (a value of 2). Why does this happen?
>
> Jack Bloch
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com


Because it doesn't happen! You are likely polluting the errno
variable either with another system call before you test it
or by not including the correct header file (errno may be a
MACRO).


Try this program:


#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
#include <errno.h>

int main(int args, char *argv[])
{
    int fd, save_errno;
    if(args < 2) {
        fprintf(stderr, "Usage:\n%s <filename>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    if((fd = open(argv[1], O_RDONLY)) < 0) {
        save_errno = errno;
        perror("open");
        fprintf(stderr, "Was %d (%s)\n", save_errno, strerror(save_errno));
        exit(EXIT_FAILURE);
    }
    (void)close(fd);
    return 0;
}

Script started on Wed Dec  3 10:41:24 2003
# ./xxx /dev/XXX
open: No such file or directory
Was 2 (No such file or directory)
# ./xxx /dev/VXI
open: Operation not supported by device
Was 19 (Operation not supported by device)
# exit
exit
Script done on Wed Dec  3 10:42:12 2003

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 14:50     ` Dave Jones
  2003-09-30 15:30       ` Jamie Lokier
@ 2003-09-30 16:34       ` Adrian Bunk
  1 sibling, 0 replies; 341+ messages in thread
From: Adrian Bunk @ 2003-09-30 16:34 UTC (permalink / raw)
  To: Dave Jones, Jamie Lokier, John Bradford, akpm, torvalds, linux-kernel

On Tue, Sep 30, 2003 at 03:50:08PM +0100, Dave Jones wrote:
>...
>  > Basically, if you're building a
>  > distro boot kernel, you must turn on all known workarounds.  That's
>  > certainly lowest-common-denominator, but it's a far cry from the
>  > configuration that a 386-as-firewall user wants.
> 
> Ok, I see what you're getting at, but Adrian's patch turned arch/i386/Kconfig
> and arch/i386/Makefile into guacamole.  After spending so much time
> getting that crap into something maintainable, it seemed a huge step
> backwards to litter it with dozens of ifdefs and duplication.
> There has to be a cleaner way of pleasing everyone.
>...

Referring to the latest patch I sent:

arch/i386/Kconfig:
The only problems seem to be some CPU_ONLY_* derived symbols I haven't 
yet found a better solution for.

arch/i386/Makefile:
There are two ifdefs to deal with Pentium 4 and K7/K8 selected at the 
same time:
ifdef CONFIG_CPU_PENTIUM4
  cpuflags-$(CONFIG_CPU_K{7,8})    := ...
else
  cpuflags-$(CONFIG_CPU_K{7,8})    := ...
endif

That's perhaps not optimal but IMHO not that bad.

The dozens of ifdefs were in other areas where I tried to add some 
additional space optimizations. It was a mistake to put them into the 
same patch and in the latest patches I sent they were already separated 
and they are _not_ required for the CPU selection scheme.

> 		Dave

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 14:50     ` Dave Jones
@ 2003-09-30 15:30       ` Jamie Lokier
  2003-09-30 16:34       ` Adrian Bunk
  1 sibling, 0 replies; 341+ messages in thread
From: Jamie Lokier @ 2003-09-30 15:30 UTC (permalink / raw)
  To: Dave Jones, John Bradford, akpm, torvalds, linux-kernel

Dave Jones wrote:
>  > I'm not sure what the fuss is; a strict 386 kernel runs just fine
>  > without any problems on an Athlon.  But anyway...
> 
> Unless it got configured away as proposed in your earlier patch.

No, I don't understand.  What about my patch, or indeed anything else,
stops a "strict 386" kernel from running on an Athlon?

>  > The latter is for distro boots.  The former is for that
>  > 386-as-a-firewall with 1MB of RAM, where it _really_ has to trim
>  > everything it can, and no errata thank you.
> 
> Again, 'trimming' away a few hundred bytes of errata workarounds
> is ridiculous when we have bigger fish to fry where we can save
> KBs of .text size, and MB's of runtime memory.

Well I think both are worthwhile.  Low hanging fruit and all that -
this is an example of a small saving that's very clear and easy.

>  > I've not heard of anyone actually wanting a strict 386 kernel lately,
>  > but strict 486 is not so unusual.
> 
> ISTR that current gcc's emit 486 instructions anyway, so it's possible
> that with a modern toolchain, you can't *build* a 386 kernel.
> I'm not sure if that got fixed or not, I don't track gcc lists any more.

Afaict GCC has fine targetting for the 386, better than it did years
ago.  It didn't used to use the "leave" instruction, have an option to
optimise for size, or options for selecting exactly which
architectural instruction set it would use.

Anyway, there is very little difference between 386 and 486 from
an application point of view.  You may be thinking of the
recent C++ ABI debacle, I think it was, which accidentally turned out
to require some instruction emulation in the Debian kernel.  I think
they've fixed it in GCC now.

>  > Just as some people want a P4 optimised kernel, and some people want a
>  > K7 optimised kernel, so some people want a 386 or 486 or Pentium
>  > optimised kernel.  Lowest-common-denominator means it runs on
>  > everything, and isn't really anything to do with 386 any more - that's
>  > not really the lowest-common-denominator, by virtue of the obvious
>  > fact that pure 386 code isn't reliable on all other CPUs.
> 
> Elaborate? "pure 386 code" (whatever that means in your definition)
> should run perfectly reliable on every CPU we care about.

If that were true, why are we talking about needing workarounds for
non-386 chips to work correctly?

The canonical example is the F00F sequence: reliable on a 386, crashes
a Pentium.  That's a fine example of pure 386 code not being reliable
on a higher CPU.  And that's why it isn't safe to run Linux 1.0 on
your Pentium web server.

> So first you argue for compiling out a few hundred bytes of errata
> workaround, now you want to instead compile in checks & printk's
> (which probably add up to not far off the same amount of space).

Oh, I have nothing against __init space :)

>  > By selecting a PII kernel, it is possible to configure out the code
>  > for X86_PPRO_FENCE and X86_F00F_BUG, yet as far as I can tell, those
>  > _can_ possibly boot on kernels where the errata are needed, and nary a
>  > printk is emitted for it.  Nasty bugs they are, too.
> 
> Indeed. That's arguably a bug that occurred when someone split the
> original CONFIG_M686 into _M686 & MPENTIUMII.

It's a bit more complicated.  It dates from before we had the
"alternative" macro, and it was still cool to optimise spin_unlock()
into the most minimal instruction sequence at compile time.

It's only since then that we've been generalising to "M586 should run
on all later models correctly".  Arguably, tidying up in the process.

Now we could use "alternative" to put the locked store or non-locked
store there and it would not look out of place.

If we're honest, Linux seems to have evolved through the 2.5 series
from "optimise the primitives as tight as reasonable for a target
architecture" to "a few nops here and there won't hurt".  Perhaps
Transmeta's malign influence, as nops cost virtually nothing on those :)

Or perhaps it's because CPU models have branched and don't make a
straight line any more.  So we have to do more run-time checking to
keep it sane.

>  > More generally than the CPU, you can also configure out BLK_DEV_RZ1000
>  > which is another crucial workaround that needs to go in any
>  > lowest-common-denominator kernel.
> 
> I wouldn't look at the history of drivers/ide/ as a shining example of
> good design 8-)

No, but as an example of needing to enable all the workarounds for a
distro boot kernel, it's a glorious gem.  Even now people aren't quite
sure if multi-sector mode or DMA should be enabled by default :)

>  > Basically, if you're building a
>  > distro boot kernel, you must turn on all known workarounds.  That's
>  > certainly lowest-common-denominator, but it's a far cry from the
>  > configuration that a 386-as-firewall user wants.
> 
> Ok, I see what you're getting at, but Adrian's patch turned arch/i386/Kconfig
> and arch/i386/Makefile into guacamole.  After spending so much time
> getting that crap into something maintainable, it seemed a huge step
> backwards to litter it with dozens of ifdefs and duplication.
> There has to be a cleaner way of pleasing everyone.

Perhaps it's in a name.  It doesn't help that there's an assumed
linear progression of CPUs to support, up to the point where they
branch off all over the place in feature space.  In the linear part,
CONFIG_M586, CONFIG_M686 etc. seem to mean "support this CPU or
later", whatever later means (and it's not stated exactly).  After the
explosion of different feature directions, they stop meaning that and
just become optimisation knobs, as all the different essential features
are supported at run time.

Personally I think Adrian's patch's heart is in the right place,
simply because the menu options make more sense than the present
rather confusing decision, if you intend to (or might ever, take your
pick) run a kernel compiled for one CPU on another.  I am never sure,
for example, if it's safe to take the hard disk from my K6 and drop it
into a P5MMX box and boot from it.  The kernel config just doesn't
make that clear.

With Adrian's it does, even if the code behind it is a little like
guacamole.  Perhaps the code could be cleaner; I don't see that
individual CPU model support is much different than what we already
have, except for the option to fix features at compile time rather than
run time.

And that gives me an idea.... ;)

-- Jamie

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 14:58     ` Jamie Lokier
@ 2003-09-30 15:11       ` Dave Jones
  0 siblings, 0 replies; 341+ messages in thread
From: Dave Jones @ 2003-09-30 15:11 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: John Bradford, akpm, torvalds, linux-kernel

On Tue, Sep 30, 2003 at 03:58:54PM +0100, Jamie Lokier wrote:

 > (Aside: It is quite an anomaly that those cumbersome floating point
 > instructions are emulated on the older CPUs, yet all the other
 > instructions aren't emulated.  Emulation is very slow, and forcing
 > userspace to just use different code instead is good, but that's just
 > as valid for floating point as it is for MMX, cmpxchg etc.)

There was a patch around a while back that did 486 emulation on 386
kernels. I think it even made it into the Mandrake kernel.

 > To be fair, the kernel really ought to just say that and halt.  That
 > is a fine compromise.  It won't make embedded systems folks completely
 > happy, because if you've only got 2MB of NVRAM for your whole kernel
 > _and_ filesystem including user data (think PDA or cellphone), then a
 > hundred bytes here or there is actually worth trimming.

With such tight constraints, why not just use 2.4 (or even 2.2) which
has much lower memory usage and diskspace requirements ?

		Dave

-- 
 Dave Jones     http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 14:10   ` John Bradford
@ 2003-09-30 14:58     ` Jamie Lokier
  2003-09-30 15:11       ` Dave Jones
  0 siblings, 1 reply; 341+ messages in thread
From: Jamie Lokier @ 2003-09-30 14:58 UTC (permalink / raw)
  To: John Bradford; +Cc: Dave Jones, akpm, torvalds, linux-kernel

John Bradford wrote:
> Unless, of course, you object to the possibility that somebody might
> go out of their way to compile a 386 specific kernel from source
> themselves, then run it on an Athlon.  By chance it will probably
> appear to work OK, but won't have the workaround enabled.  So what?

Actually the 386 kernel will work just fine on the AMD...  The
workaround is only needed, in the kernel, to protect against the
kernel's own use of non-386 features...

Userspace is a different matter, but userspace has a lot of
model-specific things to worry about beyond this one instruction on
AMD.  In practice: bswap, cmov, cmpxchg, mmx, sse, sse2, so knowing
whether to use prefetch or not is just one more variable for userspace
- and one which any portable app or library will have to know about in
any case.

(Aside: It is quite an anomaly that those cumbersome floating point
instructions are emulated on the older CPUs, yet all the other
instructions aren't emulated.  Emulation is very slow, and forcing
userspace to just use different code instead is good, but that's just
as valid for floating point as it is for MMX, cmpxchg etc.)

> Only somebody who knows exactly what they were doing is likely to do
> that - how could it happen by accident?  If you really must, put a
> warning in to say, 'This kernel doesn't support your processor', but
> doing that just adds more bloat.  OK, so the bloat will be freed after
> boot, but it's still bloat on the boot device, which matters in some
> embedded systems.

To be fair, the kernel really ought to just say that and halt.  That
is a fine compromise.  It won't make embedded systems folks completely
happy, because if you've only got 2MB of NVRAM for your whole kernel
_and_ filesystem including user data (think PDA or cellphone), then a
hundred bytes here or there is actually worth trimming.

But then, those sort of embedded folks should just figure out
compressed software-suspend, and then they can ditch __init data from
the NVRAM image completely.  It's much better to lose all of __init
than just a few bytes here or there, isn't it?

-- Jamie

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 14:06   ` Jamie Lokier
@ 2003-09-30 14:50     ` Dave Jones
  2003-09-30 15:30       ` Jamie Lokier
  2003-09-30 16:34       ` Adrian Bunk
  0 siblings, 2 replies; 341+ messages in thread
From: Dave Jones @ 2003-09-30 14:50 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: John Bradford, akpm, torvalds, linux-kernel

On Tue, Sep 30, 2003 at 03:06:27PM +0100, Jamie Lokier wrote:
 > Dave Jones wrote:
 > > On Tue, Sep 30, 2003 at 09:17:16AM +0100, John Bradford wrote:
 > >  > Of course a kernel compiled strictly for 386s may seem to boot on an
 > >  > Athlon but not work properly.  So what?  Just don't run the 'wrong'
 > >  > kernel.
 > > Wrong answer. How do you intend to install Linux when a distro boot
 > > kernel is compiled for lowest-common-denominator (386), and is the
 > > 'wrong' kernel for an Athlon ?
 > I'm not sure what the fuss is; a strict 386 kernel runs just fine
 > without any problems on an Athlon.  But anyway...

Unless it got configured away as proposed in your earlier patch.

 > Dave, you are conflating "kernel compiled strictly for 386s" with
 > "compiled for lowest-common-denominator".
 > 
 > They are totally different configurations.  Isn't that why we have
 > "generic" now?

CONFIG_GENERIC could be extended to offer other options yes,
but right now what it does doesn't really match the name IMO.
Right now it's closer to a CONFIG_MAX_CACHELINE_SIZE.

 > The latter is for distro boots.  The former is for that
 > 386-as-a-firewall with 1MB of RAM, where it _really_ has to trim
 > everything it can, and no errata thank you.

Again, 'trimming' away a few hundred bytes of errata workarounds
is ridiculous when we have bigger fish to fry where we can save
KBs of .text size, and MB's of runtime memory.

 > I've not heard of anyone actually wanting a strict 386 kernel lately,
 > but strict 486 is not so unusual.

ISTR that current gcc's emit 486 instructions anyway, so it's possible
that with a modern toolchain, you can't *build* a 386 kernel.
I'm not sure if that got fixed or not, I don't track gcc lists any more.

 > Just as some people want a P4 optimised kernel, and some people want a
 > K7 optimised kernel, so some people want a 386 or 486 or Pentium
 > optimised kernel.  Lowest-common-denominator means it runs on
 > everything, and isn't really anything to do with 386 any more - that's
 > not really the lowest-common-denominator, by virtue of the obvious
 > fact that pure 386 code isn't reliable on all other CPUs.

Elaborate? "pure 386 code" (whatever that means in your definition)
should run perfectly reliably on every CPU we care about.

 > > We hashed this argument out a week or so ago, it seems the message
 > > didn't get across. YOU CAN NOT DISABLE ERRATA WORKAROUNDS IN A KERNEL
 > > THAT MAY POSSIBLY BOOT ON HARDWARE THAT WORKAROUND IS FOR.
 > I agree.  It shouldn't be possible to boot on the wrong hardware: it
 > should refuse.

So first you argue for compiling out a few hundred bytes of errata
workaround, now you want to instead compile in checks & printk's
(which probably add up to not far off the same amount of space).

 > By selecting a PII kernel, it is possible to configure out the code
 > for X86_PPRO_FENCE and X86_F00F_BUG, yet as far as I can tell, those
 > _can_ possibly boot on kernels where the errata are needed, and nary a
 > printk is emitted for it.  Nasty bugs they are, too.

Indeed. That's arguably a bug that occurred when someone split the
original CONFIG_M686 into _M686 & MPENTIUMII.

 > More generally than the CPU, you can also configure out BLK_DEV_RZ1000
 > which is another crucial workaround that needs to go in any
 > lowest-common-denominator kernel.

I wouldn't look at the history of drivers/ide/ as a shining example of
good design 8-)

 > Basically, if you're building a
 > distro boot kernel, you must turn on all known workarounds.  That's
 > certainly lowest-common-denominator, but it's a far cry from the
 > configuration that a 386-as-firewall user wants.

Ok, I see what you're getting at, but Adrian's patch turned arch/i386/Kconfig
and arch/i386/Makefile into guacamole.  After spending so much time
getting that crap into something maintainable, it seemed a huge step
backwards to litter it with dozens of ifdefs and duplication.
There has to be a cleaner way of pleasing everyone.

 > > clearer?
 > If the kernel had a consistent policy so far, it would be more clear,
 > but it doesn't.

Agreed, there are some questionable parts.

		Dave

-- 
 Dave Jones     http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 13:31 ` your mail Dave Jones
  2003-09-30 14:06   ` Jamie Lokier
@ 2003-09-30 14:10   ` John Bradford
  2003-09-30 14:58     ` Jamie Lokier
  1 sibling, 1 reply; 341+ messages in thread
From: John Bradford @ 2003-09-30 14:10 UTC (permalink / raw)
  To: Dave Jones; +Cc: Jamie Lokier, akpm, torvalds, linux-kernel

Quote from Dave Jones <davej@redhat.com>:
> On Tue, Sep 30, 2003 at 09:17:16AM +0100, John Bradford wrote:
>  
>  > Of course a kernel compiled strictly for 386s may seem to boot on an
>  > Athlon but not work properly.  So what?  Just don't run the 'wrong'
>  > kernel.
> 
> Wrong answer. How do you intend to install Linux when a distro boot
> kernel is compiled for lowest-common-denominator (386), and is the
> 'wrong' kernel for an Athlon ?

I don't.  I *never* suggested doing that.  I clearly said a kernel
compiled *strictly* for 386s.  I.E. Without support for other
processors.

> We hashed this argument out a week or so ago, it seems the message
> didn't get across. YOU CAN NOT DISABLE ERRATA WORKAROUNDS IN A KERNEL
> THAT MAY POSSIBLY BOOT ON HARDWARE THAT WORKAROUND IS FOR.

It seems the message didn't get across to you.

Have you actually looked at Adrian's patch?

*Forget* that 386=lowest-common-denominator.  This
'386=lowest-common-denominator' theme is out of date, and we should be
moving away from it - oh, hang on, that's exactly what Adrian's patch
allows us to do.

A distribution installation kernel needs to boot all supported
hardware - of course it does.  So what?  Just select support for all
the processors in the configurator.  No, don't just select 386,
because 386 doesn't mean 386 and above anymore with Adrian's patch, it
means support 386 and don't bloat the kernel with workarounds for
other processors.  Select *all* processors.  Now you have a nice,
(bloated), kernel that boots on the same hardware that your old '386'
one did.  Fine for installation on diverse hardware.  Rubbish for
performance.

Unless, of course, you object to the possibility that somebody might
go out of their way to compile a 386 specific kernel from source
themselves, then run it on an Athlon.  By chance it will probably
appear to work OK, but won't have the workaround enabled.  So what?
Only somebody who knows exactly what they were doing is likely to do
that - how could it happen by accident?  If you really must, put a
warning in to say, 'This kernel doesn't support your processor', but
doing that just adds more bloat.  OK, so the bloat will be freed after
boot, but it's still bloat on the boot device, which matters in some
embedded systems.

> clearer ?

It's clear that you didn't read my original post, yes.

John.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30 13:31 ` your mail Dave Jones
@ 2003-09-30 14:06   ` Jamie Lokier
  2003-09-30 14:50     ` Dave Jones
  2003-09-30 14:10   ` John Bradford
  1 sibling, 1 reply; 341+ messages in thread
From: Jamie Lokier @ 2003-09-30 14:06 UTC (permalink / raw)
  To: Dave Jones, John Bradford, akpm, torvalds, linux-kernel

Dave Jones wrote:
> On Tue, Sep 30, 2003 at 09:17:16AM +0100, John Bradford wrote:
>  > Of course a kernel compiled strictly for 386s may seem to boot on an
>  > Athlon but not work properly.  So what?  Just don't run the 'wrong'
>  > kernel.
> 
> Wrong answer. How do you intend to install Linux when a distro boot
> kernel is compiled for lowest-common-denominator (386), and is the
> 'wrong' kernel for an Athlon ?

I'm not sure what the fuss is; a strict 386 kernel runs just fine
without any problems on an Athlon.  But anyway...

Dave, you are conflating "kernel compiled strictly for 386s" with
"compiled for lowest-common-denominator".

They are totally different configurations.  Isn't that why we have
"generic" now?

The latter is for distro boots.  The former is for that
386-as-a-firewall with 1MB of RAM, where it _really_ has to trim
everything it can, and no errata thank you.

I've not heard of anyone actually wanting a strict 386 kernel lately,
but strict 486 is not so unusual.

Just as some people want a P4 optimised kernel, and some people want a
K7 optimised kernel, so some people want a 386 or 486 or Pentium
optimised kernel.  Lowest-common-denominator means it runs on
everything, and isn't really anything to do with 386 any more - that's
not really the lowest-common-denominator, by virtue of the obvious
fact that pure 386 code isn't reliable on all other CPUs.

> We hashed this argument out a week or so ago, it seems the message
> didn't get across. YOU CAN NOT DISABLE ERRATA WORKAROUNDS IN A KERNEL
> THAT MAY POSSIBLY BOOT ON HARDWARE THAT WORKAROUND IS FOR.

I agree.  It shouldn't be possible to boot on the wrong hardware: it
should refuse.

There is precedent: X86_GOOD_APIC && X86_LOCAL_APIC: when booted on a
non-MMX P5, it refuses to boot, because it does not contain the errata
workaround.

Unfortunately the kernel has opposite precedents too.

By selecting a PII kernel, it is possible to configure out the code
for X86_PPRO_FENCE and X86_F00F_BUG, yet as far as I can tell, those
_can_ possibly boot on kernels where the errata are needed, and nary a
printk is emitted for it.  Nasty bugs they are, too.

More generally than the CPU, you can also configure out BLK_DEV_RZ1000
which is another crucial workaround that needs to go in any
lowest-common-denominator kernel.  Basically, if you're building a
distro boot kernel, you must turn on all known workarounds.  That's
certainly lowest-common-denominator, but it's a far cry from the
configuration that a 386-as-firewall user wants.
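
The "refuse to boot" policy argued for above can be modelled in a few
lines.  This is only a sketch of the idea in Python, not kernel code: the
function names are made up for illustration, and the caller supplies both
the set of errata the detected CPU needs and the set of workarounds that
were compiled in.  The option names in the usage note are simply the ones
mentioned in this thread.

```python
def missing_workarounds(cpu_errata, compiled_in):
    """Return the errata the CPU needs that this build does not handle."""
    return sorted(set(cpu_errata) - set(compiled_in))


def boot_check(cpu_name, cpu_errata, compiled_in):
    """Hypothetical boot-time check: refuse to continue if a workaround is missing."""
    missing = missing_workarounds(cpu_errata, compiled_in)
    if missing:
        # A real kernel would print a message and halt; here we just raise.
        raise RuntimeError(
            "This kernel doesn't support your processor (%s): missing %s"
            % (cpu_name, ", ".join(missing))
        )
```

For example, boot_check("some CPU", {"X86_F00F_BUG"}, set()) raises,
while the same call with {"X86_F00F_BUG"} compiled in returns quietly.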

> clearer?

If the kernel had a consistent policy so far, it would be more clear,
but it doesn't.

-- Jamie

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-09-30  8:17 John Bradford
@ 2003-09-30 13:31 ` Dave Jones
  2003-09-30 14:06   ` Jamie Lokier
  2003-09-30 14:10   ` John Bradford
  0 siblings, 2 replies; 341+ messages in thread
From: Dave Jones @ 2003-09-30 13:31 UTC (permalink / raw)
  To: John Bradford; +Cc: Jamie Lokier, akpm, torvalds, linux-kernel

On Tue, Sep 30, 2003 at 09:17:16AM +0100, John Bradford wrote:
 
 > Of course a kernel compiled strictly for 386s may seem to boot on an
 > Athlon but not work properly.  So what?  Just don't run the 'wrong'
 > kernel.

Wrong answer. How do you intend to install Linux when a distro boot
kernel is compiled for lowest-common-denominator (386), and is the
'wrong' kernel for an Athlon ?

We hashed this argument out a week or so ago, it seems the message
didn't get across. YOU CAN NOT DISABLE ERRATA WORKAROUNDS IN A KERNEL
THAT MAY POSSIBLY BOOT ON HARDWARE THAT WORKAROUND IS FOR.

clearer ?

		Dave

-- 
 Dave Jones     http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-08-28  2:25 warudkar
@ 2003-08-27 16:02 ` William Lee Irwin III
  0 siblings, 0 replies; 341+ messages in thread
From: William Lee Irwin III @ 2003-08-27 16:02 UTC (permalink / raw)
  To: warudkar; +Cc: kernel, linux-kernel, Andrew Morton

On Wed, Aug 27, 2003 at 09:25:23PM -0500, warudkar@vsnl.net wrote:
> Con - With swappiness set to 100, the apps do start up in 3 minutes and kswapd doesn't hog the CPU. But X is still unusable till all of them have started up.
> Wli - Sorry, vmstat segfaults on 2.6!

This is a bug in older versions of vmstat. Upgrade vmstat.


-- wli

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-08-25 16:45 Marcelo Tosatti
@ 2003-08-25 16:59 ` Herbert Pötzl
  0 siblings, 0 replies; 341+ messages in thread
From: Herbert Pötzl @ 2003-08-25 16:59 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: lkml

On Mon, Aug 25, 2003 at 01:45:25PM -0300, Marcelo Tosatti wrote:
> On Mon, 25 Aug 2003, Herbert Pötzl wrote:
> 
> > On Mon, Aug 25, 2003 at 10:53:21AM -0300, Marcelo Tosatti wrote:
> > >
> > > >
> > > >
> > > > Matthias Andree wrote:
> > > >
> > > > >On Mon, 25 Aug 2003, Marcelo Tosatti wrote:
> > > > >
> > > > >
> > > > >>- 2.4.22-rc4 was released as 2.4.22 with no changes.
> > > > >>
> > > > >
> > > > >What are the plans for 2.4.23? XFS merge perhaps <hint>?
> > > > >
> > > >
> > > > Maybe some of Andrea's VM stuff?
> > >
> > > Definitely. That's the first thing I'm going to do after looking
> > > through "2.4.23-pre-patches" folder.
> >
> > any chance for the Bind Mount Extensions? 8-)
> 
> I haven't found time to look at the patch yet but will do so soon.

fine, no problem, let me know if you need something
(like rediff, resend, explanation, etc ...)

best,
Herbert


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
@ 2003-08-25 16:45 Marcelo Tosatti
  2003-08-25 16:59 ` Herbert Pötzl
  0 siblings, 1 reply; 341+ messages in thread
From: Marcelo Tosatti @ 2003-08-25 16:45 UTC (permalink / raw)
  To: Herbert Pötzl; +Cc: lkml

On Mon, 25 Aug 2003, Herbert Pötzl wrote:

> On Mon, Aug 25, 2003 at 10:53:21AM -0300, Marcelo Tosatti wrote:
> >
> > >
> > >
> > > Matthias Andree wrote:
> > >
> > > >On Mon, 25 Aug 2003, Marcelo Tosatti wrote:
> > > >
> > > >
> > > >>- 2.4.22-rc4 was released as 2.4.22 with no changes.
> > > >>
> > > >
> > > >What are the plans for 2.4.23? XFS merge perhaps <hint>?
> > > >
> > >
> > > Maybe some of Andrea's VM stuff?
> >
> > Definitely. That's the first thing I'm going to do after looking
> > through "2.4.23-pre-patches" folder.
>
> any chance for the Bind Mount Extensions? 8-)

I haven't found time to look at the patch yet but will do so soon.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-08-25 13:53 Marcelo Tosatti
@ 2003-08-25 14:30 ` Herbert Pötzl
  0 siblings, 0 replies; 341+ messages in thread
From: Herbert Pötzl @ 2003-08-25 14:30 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: lkml

On Mon, Aug 25, 2003 at 10:53:21AM -0300, Marcelo Tosatti wrote:
> 
> >
> >
> > Matthias Andree wrote:
> >
> > >On Mon, 25 Aug 2003, Marcelo Tosatti wrote:
> > >
> > >
> > >>- 2.4.22-rc4 was released as 2.4.22 with no changes.
> > >>
> > >
> > >What are the plans for 2.4.23? XFS merge perhaps <hint>?
> > >
> >
> > Maybe some of Andrea's VM stuff?
> 
> Definitely. That's the first thing I'm going to do after looking through
> "2.4.23-pre-patches" folder.

any chance for the Bind Mount Extensions? 8-)

best,
Herbert

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-08-18  6:21 "Andrey Borzenkov" 
@ 2003-08-18 20:42 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2003-08-18 20:42 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: jw schultz, linux-kernel

On Mon, Aug 18, 2003 at 10:21:22AM +0400, "Andrey Borzenkov"  wrote:
> 
> just to show what I expected from sysfs - here is entry from Solaris
> /devices:
> 
> brw-r-----   1 root     sys       32,240 Jan 24  2002 /devices/pci@16,4000/scsi@5,1/sd@0,0:a
> 
> this entry identifies disk partition 0 on drive with SCSI ID 0, LUN 0
> connected to bus 1 of controller in slot 5 of PCI bus identified
> by 16. Now you can use whatever policy you like to give human
> meaningful name to this entry. And if you have USB it will continue
> further giving you exact topology starting from the root of your
> device tree.
> 
> and this path does not contain single logical id so it is not subject
> to change if I add the same controller somewhere else.
> 
> hopefully it clarifies what I mean ...

Hm, a bit.  First, have you looked at what sysfs provides?  Here's one
of my machines and tell me if it has all the info you are looking for:

$ tree /sys/bus/scsi/
/sys/bus/scsi/
|-- devices
|   `-- 0:0:0:0 -> ../../../devices/pci0000:00/0000:00:1e.0/0000:02:05.0/host0/0:0:0:0
`-- drivers
    `-- sd
        `-- 0:0:0:0 -> ../../../../devices/pci0000:00/0000:00:1e.0/0000:02:05.0/host0/0:0:0:0

$ tree /sys/block/sda/
/sys/block/sda/
|-- dev
|-- device -> ../../devices/pci0000:00/0000:00:1e.0/0000:02:05.0/host0/0:0:0:0
|-- queue
|   |-- iosched
|   |   |-- antic_expire
|   |   |-- read_batch_expire
|   |   |-- read_expire
|   |   |-- write_batch_expire
|   |   `-- write_expire
|   `-- nr_requests
|-- range
|-- sda1
|   |-- dev
|   |-- size
|   |-- start
|   `-- stat
|-- sda2
|   |-- dev
|   |-- size
|   |-- start
|   `-- stat
|-- sda3
|   |-- dev
|   |-- size
|   |-- start
|   `-- stat
|-- sda4
|   |-- dev
|   |-- size
|   |-- start
|   `-- stat
|-- size
`-- stat


Now, from that you can see exactly where my scsi device is in the pci
tree, and you can see in the block directory, what block device is
assigned to what physical device in the device tree.  Then there are 4
partitions on this disk, all with those specific parameters.

So, when sda shows up, udev can determine that it lives on a specific
scsi device, located in a specific place in the pci space, and that it
has some number of partitions, all of specific sizes, with specific
major/minor numbers.  It can then create all of the /dev links based on
this.
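
The lookup described above can be sketched from user space.  This is a
minimal illustration in Python, not how udev itself is implemented: the
function name and the sysfs_root parameter are made up for the example,
and it assumes the layout shown in the tree output (a "device" symlink
plus one subdirectory per partition holding "dev" and "size").

```python
import os


def describe_block_device(name, sysfs_root="/sys"):
    """Return (physical device path, {partition: (dev, size)}).

    Hypothetical helper; assumes the sysfs layout shown in the tree output.
    """
    block_dir = os.path.join(sysfs_root, "block", name)
    # The 'device' symlink points at the physical device in the device tree.
    physical = os.path.realpath(os.path.join(block_dir, "device"))

    partitions = {}
    for entry in sorted(os.listdir(block_dir)):
        part_dir = os.path.join(block_dir, entry)
        # Partition directories are named after the disk: sda1, sda2, ...
        if entry == name or not entry.startswith(name):
            continue
        if not os.path.isdir(part_dir):
            continue
        with open(os.path.join(part_dir, "dev")) as f:
            dev = f.read().strip()        # "major:minor"
        with open(os.path.join(part_dir, "size")) as f:
            size = int(f.read().strip())  # size as reported by sysfs
        partitions[entry] = (dev, size)
    return physical, partitions
```

On a machine like the one above, describe_block_device("sda") would give
the pci/scsi path of the disk and an entry for each of sda1 to sda4.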

Please, take a few minutes looking at the existing sysfs tree on Linux.
If you then have any specific questions, I would be glad to answer
them.

Hope this helps,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-08-14 21:57 kartikey bhatt
@ 2003-08-15  3:31 ` James Morris
  0 siblings, 0 replies; 341+ messages in thread
From: James Morris @ 2003-08-15  3:31 UTC (permalink / raw)
  To: kartikey bhatt; +Cc: davem, linux-kernel, alan

On Fri, 15 Aug 2003, kartikey bhatt wrote:

> Hi James.
> A little bit work for you.
> Somebody on mailing list commented that you should *really* go for better
> algorithm like CAST6 (rfc2612) to be included in kernel.
> This time I'm sending you cast6.c (cast6 cipher algorithm) implementation.
> But this time it's a patch.

Cool.  Unfortunately the patch is corrupted, please try sending as an 
attachment or via a different mail system.


- James
-- 
James Morris
<jmorris@intercode.com.au>


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <200308031136.17768.lx@lxhp.in-berlin.de>
@ 2003-08-03 18:30 ` Linus Torvalds
  0 siblings, 0 replies; 341+ messages in thread
From: Linus Torvalds @ 2003-08-03 18:30 UTC (permalink / raw)
  To: hp; +Cc: linux-assembly, Kernel Mailing List, David S. Miller


On Sun, 3 Aug 2003, hp wrote:
>
> so He/You did lock me out, too?
> without any notice. by what reason?

Maybe because this has nothing to do with the kernel?

It's ok to discuss kernel issues on the kernel mailing list, but we've had 
tons of totally off-topic flames, rants and general noise.

To the point that a lot of people don't even have time to follow 
linux-kernel any more, since a lot of the discussion has nothing to do 
with the technical kernel work.

Since some of these rants are started (and kept going) by people who don't
ever seem to actually get involved in _real_ kernel-related technical
discussions, David felt that one way to curb it was to just blacklist 
people who repeatedly post things that aren't related to the kernel.

It's ok to be off-topic every once in a while, but it's not ok to 
consistently be so.

That said, David is also not the most politic person I know, and I suspect 
this could have been handled slightly more gracefully. One potential less 
annoying approach is to not block posting from people, but rewrite the 
subject line for such posters with a prepended "[OFF-TOPIC]", and just let 
people filter those out on the receiving end. Or just automatically shunt 
them off to another list.

I dunno. I don't personally much care - but I've never been the maintainer 
of the mailing list, and I sure as hell don't ever want to be. Whoever is 
the maintainer gets to set the rules.

			Linus


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-05-14 18:41 dirf
@ 2003-05-16 10:00 ` Maciej Soltysiak
  0 siblings, 0 replies; 341+ messages in thread
From: Maciej Soltysiak @ 2003-05-16 10:00 UTC (permalink / raw)
  To: dirf; +Cc: linux-kernel

> - Where I can find a list of RFCs?
http://www.ietf.org/rfc

There is a RFC Index link

> - Where I can find a cdfs format ( cd file system format)?
You mean kernel drivers or the specification?
The kernel drivers are in the stock kernel.

Regards,
Maciej


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <053C05D4.4D025D2E.0005F166@netscape.net>
@ 2003-05-08  9:06 ` Gerd Knorr
  0 siblings, 0 replies; 341+ messages in thread
From: Gerd Knorr @ 2003-05-08  9:06 UTC (permalink / raw)
  To: ark925; +Cc: Kernel List

> Actually it does in some cases. I know of two devices that have analog
> tuners on an smbus-like interface (OV511 USB TV and W9967CF USB TV). The
> tuner can be controlled using a pair of i2c_smbus_write_byte_data()
> calls.

Hmm, maybe we should rename the SMBUS class to SENSORS or MAINBOARD or
something like that?  I assumed smbus interfaces are used for
mainboard sensors only ...

> Would a patch that adds smbus algorithm support to tuner.c be
> acceptable?

Yes.  Certainly makes more sense than duplicating the whole rest of
tuner.c just for a smbus-aware tuner driver ;)

  Gerd

-- 
sigfault

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-30 21:39 Mauricio Oliveira Carneiro
@ 2003-05-01  0:05 ` Greg KH
  0 siblings, 0 replies; 341+ messages in thread
From: Greg KH @ 2003-05-01  0:05 UTC (permalink / raw)
  To: Mauricio Oliveira Carneiro; +Cc: linux-kernel

On Wed, Apr 30, 2003 at 06:39:41PM -0300, Mauricio Oliveira Carneiro wrote:
> But I can't see it mounted anywhere in my system, nor can I mount it by 
> hand since I don't know the device filename (/dev/?) .

Have you read the Linux USB Guide at http://www.linux-usb.org/ ?

If you still have questions/problems after reading that, try asking this
on the linux-usb-users mailing list.  The people there can help you out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-25 17:35 Bloch, Jack
@ 2003-04-25 19:43 ` Francois Romieu
  0 siblings, 0 replies; 341+ messages in thread
From: Francois Romieu @ 2003-04-25 19:43 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

Bloch, Jack <Jack.Bloch@icn.siemens.com> :
> Is there example driver source code available for a MUSYCC CN8478 device?
> Please CC me directly on any answers. 

http://ww.google.com/search?q=musycc.c+bsd&ie=ISO-8859-1&hl=fr&lr=

Example source code for a complete reset of a PEB20534 device operating in
last descriptor address control mode will be welcome too.

Regards

--
Ueimor

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2003-04-05  0:38 Ed Vance
@ 2003-04-05  4:51 ` Keith Owens
  0 siblings, 0 replies; 341+ messages in thread
From: Keith Owens @ 2003-04-05  4:51 UTC (permalink / raw)
  To: Ed Vance; +Cc: linux-kernel

On Fri, 4 Apr 2003 16:38:50 -0800 , 
Ed Vance <EdV@macrolink.com> wrote:
>On Fri, Apr 04, 2003 at 3:21 PM, Keith Owens wrote:
>> 
>> On Fri, 4 Apr 2003 14:10:16 -0800 , 
>> Ed Vance <EdV@macrolink.com> wrote:
>> >Perhaps there is a middle ground. Leave the list open, but require a
>> >confirmation reply prior to passing along posts from addresses that:
>> >
>> >1. are not members of the list, AND
>> >2. have not previously done a proper confirmation reply.
>> 
>> 30 seconds after doing that, the spammers will forge email that claims
>> to be from LT, AC, DM, MT etc.  Not to mention all the viruses that
>> forge the headers.  Verification by 'From:' line on an open list is
>> pointless.
>> 
>The goal was to greatly reduce, in one swell foop, the volume of spam that
>the filters (and postmaster) must interactively deal with. I thought that
>perhaps this method could replace one or more of the troublesome filtering
>techniques to achieve the same net spam reduction without evoking as much
>whining.

Paraphrase: Replace filtering code that catches spam with filtering
code based on checking header content that can be trivially forged by
spammers.

>Matti, 
>Roughly what percentage of the spam actually hitting vger today (and
>bouncing off) is based on Keith's flavor of spoofing? Is it even 1 percent? 

Current figures are irrelevant, spammers react to spam filters and they
react very quickly[*].  If you replace "reject HTML bodies" with "allow
HTML based on known From: lines" then the spammers will send HTML
bodies with forged headers, because they know it will get through.
That will require the original HTML filters to be reintroduced, the end
result is you added an extra step for new posters without reducing the
spam or users whining "my mail does not get through".

[*] About 24 hours after slashdot carried a story on Bayesian spam
    filters, I started receiving HTML spam that contained comments that
    were designed to fool the Bayesian filters, like this.

    FREE 1 MONTH SUPP<!--kernel-->Y WITH THIS

    The comment has no effect on the spam display but the use of
    non-spam words skews the Bayesian rules on whether the content is
    spam or not.
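
    One way to see the effect is to compare what a filter tokenizes
    with and without the comments removed.  This is only a sketch in
    Python under my own assumptions (the function names and the simple
    word-splitting rule are made up here, and it is not the behaviour
    of any particular filter):

```python
import re

# Matches an HTML comment, including one that spans several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def naive_tokens(html_body):
    """Tokenize the raw body, comments and all."""
    return re.findall(r"[a-z0-9]+", html_body.lower())


def visible_tokens(html_body):
    """Drop HTML comments first, then tokenize what a reader would see."""
    text = HTML_COMMENT.sub("", html_body)
    return re.findall(r"[a-z0-9]+", text.lower())


sample = "FREE 1 MONTH SUPP<!--kernel-->Y WITH THIS"
print(naive_tokens(sample))    # includes the planted word 'kernel'
print(visible_tokens(sample))  # the word split by the comment is rejoined
```

    Stripping comments before scoring removes the planted word, but as
    noted above the spammers will simply move on to the next trick.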


^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail 
@ 2003-04-05  0:38 Ed Vance
  2003-04-05  4:51 ` Keith Owens
  0 siblings, 1 reply; 341+ messages in thread
From: Ed Vance @ 2003-04-05  0:38 UTC (permalink / raw)
  To: 'Keith Owens', 'Matti Aarnio'; +Cc: linux-kernel

On Fri, Apr 04, 2003 at 3:21 PM, Keith Owens wrote:
> 
> On Fri, 4 Apr 2003 14:10:16 -0800 , 
> Ed Vance <EdV@macrolink.com> wrote:
> >Perhaps there is a middle ground. Leave the list open, but require a
> >confirmation reply prior to passing along posts from addresses that:
> >
> >1. are not members of the list, AND
> >2. have not previously done a proper confirmation reply.
> 
> 30 seconds after doing that, the spammers will forge email that claims
> to be from LT, AC, DM, MT etc.  Not to mention all the viruses that
> forge the headers.  Verification by 'From:' line on an open list is
> pointless.
> 

Keith,

No single method is perfect. Your point is well taken. 

The goal was to greatly reduce, in one swell foop, the volume of spam that
the filters (and postmaster) must interactively deal with. I thought that
perhaps this method could replace one or more of the troublesome filtering
techniques to achieve the same net spam reduction without evoking as much
whining.

imperfect != pointless

Matti, 
Roughly what percentage of the spam actually hitting vger today (and
bouncing off) is based on Keith's flavor of spoofing? Is it even 1 percent? 

Cheers,
Ed

---------------------------------------------------------------- 
Ed Vance              edv (at) macrolink (dot) com
Macrolink, Inc.       1500 N. Kellogg Dr  Anaheim, CA  92807
----------------------------------------------------------------

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2003-04-04 22:10 Ed Vance
  2003-04-04 23:19 ` William Scott Lockwood III
@ 2003-04-04 23:21 ` Keith Owens
  1 sibling, 0 replies; 341+ messages in thread
From: Keith Owens @ 2003-04-04 23:21 UTC (permalink / raw)
  To: Ed Vance; +Cc: linux-kernel

On Fri, 4 Apr 2003 14:10:16 -0800 , 
Ed Vance <EdV@macrolink.com> wrote:
>Perhaps there is a middle ground. Leave the list open, but require a
>confirmation reply prior to passing along posts from addresses that:
>
>1. are not members of the list, AND
>2. have not previously done a proper confirmation reply.

30 seconds after doing that, the spammers will forge email that claims
to be from LT, AC, DM, MT etc.  Not to mention all the viruses that
forge the headers.  Verification by 'From:' line on an open list is
pointless.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail
  2003-04-04 22:10 Ed Vance
@ 2003-04-04 23:19 ` William Scott Lockwood III
  2003-04-04 23:21 ` Keith Owens
  1 sibling, 0 replies; 341+ messages in thread
From: William Scott Lockwood III @ 2003-04-04 23:19 UTC (permalink / raw)
  To: Ed Vance; +Cc: 'Matti Aarnio', linux-kernel

That is the best suggestion I've yet seen.  It's an excellent idea!

On Fri, 4 Apr 2003, Ed Vance wrote:

> On Fri, Apr 04, 2003 at 12:38 PM, Matti Aarnio wrote:
> > [snip]
> > A somewhat better anti-spam filter method, than what we use presently
> > is to use strictly CLOSED list -- e.g. must be a member to post.
> > I have seen what kind of pains closed lists are, I even moderate
> > couple small ones.
> >
> > However we are deliberately running "open for posting, subject to
> > filters" policy, which lets questions and reports to come from
> > non-subscribers.
> >
>
> Perhaps there is a middle ground. Leave the list open, but require a
> confirmation reply prior to passing along posts from addresses that:
>
> 1. are not members of the list, AND
> 2. have not previously done a proper confirmation reply.
>
> The unconfirmed posts would time out and disappear after a decent interval,
> to prevent constipation.
>
> So, anybody could still post, the members would not be inconvenienced, and
> non-members would be inconvenienced only on their first post from each
> address they post from. This would preserve the "real time" nature of the
> list, while gaining the assurance that all who post are life-forms, even if
> they live in front of a keyboard and have no real life.  ;-)
>
> Of course, this would require storage for the list of confirmed addresses
> and pending unconfirmed posts, and the bandwidth and other overhead of the
> infrequent confirmation messages.
>
> Just a thought.
>
> Cheers,
> Ed
>
> ----------------------------------------------------------------
> Ed Vance              edv (at) macrolink (dot) com
> Macrolink, Inc.       1500 N. Kellogg Dr  Anaheim, CA  92807
> ----------------------------------------------------------------
>


^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail
@ 2003-04-04 22:10 Ed Vance
  2003-04-04 23:19 ` William Scott Lockwood III
  2003-04-04 23:21 ` Keith Owens
  0 siblings, 2 replies; 341+ messages in thread
From: Ed Vance @ 2003-04-04 22:10 UTC (permalink / raw)
  To: 'Matti Aarnio'; +Cc: William Scott Lockwood III, linux-kernel

On Fri, Apr 04, 2003 at 12:38 PM, Matti Aarnio wrote:
> [snip]
> A somewhat better anti-spam filter method than what we use at present
> is a strictly CLOSED list -- i.e. you must be a member to post.
> I have seen what kind of pain closed lists are; I even moderate a
> couple of small ones.
> 
> However we are deliberately running an "open for posting, subject to
> filters" policy, which lets questions and reports come in from
> non-subscribers.
> 

Perhaps there is a middle ground. Leave the list open, but require a
confirmation reply prior to passing along posts from addresses that:

1. are not members of the list, AND
2. have not previously done a proper confirmation reply.

The unconfirmed posts would time out and disappear after a decent interval,
to prevent constipation.

So, anybody could still post, the members would not be inconvenienced, and
non-members would be inconvenienced only on their first post from each
address they post from. This would preserve the "real time" nature of the
list, while gaining the assurance that all who post are life-forms, even if
they live in front of a keyboard and have no real life.  ;-)

Of course, this would require storage for the list of confirmed addresses
and pending unconfirmed posts, and the bandwidth and other overhead of the
infrequent confirmation messages.
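
As a rough illustration of the scheme above, here is a minimal Python sketch.
It is not taken from any existing list software: the class name, the method
names, and the one-week default timeout are all assumptions made for the
example. It shows the two stores mentioned (confirmed addresses and pending
unconfirmed posts) and the timeout that clears out stale posts.

```python
import time


class ConfirmationGate:
    """Hold posts from unknown addresses until the sender confirms once.

    Hypothetical sketch: names and the default timeout are assumptions.
    """

    def __init__(self, members, timeout_seconds=7 * 24 * 3600):
        self.members = set(members)   # subscribed addresses
        self.confirmed = set()        # non-members who have confirmed before
        self.pending = {}             # address -> list of (arrival_time, post)
        self.timeout = timeout_seconds

    def submit(self, address, post, now=None):
        """Return "deliver" for known senders, otherwise hold the post."""
        now = time.time() if now is None else now
        if address in self.members or address in self.confirmed:
            return "deliver"
        self.pending.setdefault(address, []).append((now, post))
        return "held"

    def confirm(self, address):
        """Record a confirmation reply and release that sender's held posts."""
        held = self.pending.pop(address, None)
        if held is None:
            # Nothing was waiting, so there is nothing to confirm against.
            return []
        self.confirmed.add(address)
        return [post for _, post in held]

    def expire(self, now=None):
        """Drop held posts older than the timeout; return how many were dropped."""
        now = time.time() if now is None else now
        dropped = 0
        for address in list(self.pending):
            kept = [(t, p) for t, p in self.pending[address]
                    if now - t < self.timeout]
            dropped += len(self.pending[address]) - len(kept)
            if kept:
                self.pending[address] = kept
            else:
                del self.pending[address]
        return dropped
```

Members and previously confirmed addresses never see a delay; everyone else
is held only until their first confirmation, and expire() would be run
periodically to discard posts that were never confirmed.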

Just a thought. 

Cheers,
Ed

---------------------------------------------------------------- 
Ed Vance              edv (at) macrolink (dot) com
Macrolink, Inc.       1500 N. Kellogg Dr  Anaheim, CA  92807
----------------------------------------------------------------

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04 15:28           ` William Scott Lockwood III
                               ` (2 preceding siblings ...)
  2003-04-04 16:10             ` Jens Axboe
@ 2003-04-04 20:37             ` Matti Aarnio
  3 siblings, 0 replies; 341+ messages in thread
From: Matti Aarnio @ 2003-04-04 20:37 UTC (permalink / raw)
  To: William Scott Lockwood III; +Cc: linux-kernel

On Fri, Apr 04, 2003 at 07:28:12AM -0800, William Scott Lockwood III wrote:
...
> The best list is one that is inclusive.  One that tolerates other opinions
> and choices.  LKML has turned into the largest, nastiest clique I've ever
> seen, and that's really sad, as I'm sure it scares some good people away.

Are you speaking about PEOPLE who react to emails by flaming, or
about the list filtering "technology"?

>  Look at all the crap I and others got for using hotmail - I finally
> got sick and tired of the whining and now have to take 3x as long to 
> read my mail - but it's not a hotmail address anymore, so the whining
> stopped.

About people, then..    There I can't help, unfortunately.

We have lots of people subscribing with Hotmail addresses, and the only
complaint I can voice is that those people will at times let their
mailbox quotas overflow, which leads to bounces, and then subscription
revocation...  (Hard controlled quotas are not unique to Hotmail, nor
people who let them overflow...)


> Why not spend less time restricting what people can read and post
> from, and just let people participate?

There is this small thing called spam...

We have various filters (see my other posting), but obviously they
are not infallible, a few spams do leak thru, and earn new filter
rules (if I can think up something suitably specific, while generic..)

A somewhat better anti-spam filter method than what we use at present
is a strictly CLOSED list -- i.e. you must be a member to post.
I have seen what kind of pain closed lists are; I even moderate a
couple of small ones.

However we are deliberately running an "open for posting, subject to
filters" policy, which lets questions and reports come in from
non-subscribers.


/Matti Aarnio

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04 15:28           ` William Scott Lockwood III
  2003-04-04 16:04             ` Richard B. Johnson
  2003-04-04 16:04             ` Christoph Hellwig
@ 2003-04-04 16:10             ` Jens Axboe
  2003-04-04 20:37             ` Matti Aarnio
  3 siblings, 0 replies; 341+ messages in thread
From: Jens Axboe @ 2003-04-04 16:10 UTC (permalink / raw)
  To: William Scott Lockwood III
  Cc: Richard B. Johnson, David S. Miller, linux-kernel

On Fri, Apr 04 2003, William Scott Lockwood III wrote:
> On Fri, 4 Apr 2003, Richard B. Johnson wrote:
> > On Thu, 3 Apr 2003, William Scott Lockwood III wrote:
> > > On Thu, 3 Apr 2003, David S. Miller wrote:
> > > >    From: "Richard B. Johnson" <root@chaos.analogic.com>
> > > >    Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
> > > >    Well it's not a yahoo users problem because yahoo users can't fix
> > > >    it. Some yahoo users have yahoo "free" mail as their only connection
> > > >    to the internet because of fascist network administrators.
> > > > If you want all the SPAM that will result on Linux-kernel, we
> > > > can disable the filter if you want.
> > > > I refuse to sit here and listen to all the "this is the only
> > > > connection person FOO has to the internet" stories, quite frankly I'm
> > > > absolutely sick of hearing them.
> > > > If you don't have properly functioning mail, you can't use these
> > > > lists.
> > > > Period.
> > > When did that become your call?  I didn't realize you owned LKML.
> > Well it's his "baseball" and; "You'll play by my rules or you won't
> > play at all..."
> > FYI, there is no Major Domo. It's Latin, major domus, "master of
> > the house". He doith whatever he careth...
> 
> Yes, I can see that.  No matter who it alienates.  Whether or not he's
> checked with anyone else either.  How about letting those of us who (like
> Linus) choose to use a commercial email product do so?  Garbage about
> headers, etc. is just that - garbage.  The best list is one that is
> inclusive.  One that tolerates other opinions and choices.  LKML has
> turned into the largest, nastiest clique I've ever seen, and that's really
> sad, as I'm sure it scares some good people away.  Look at all the crap I
> and others got for using hotmail - I finally got sick and tired of the
> whining and now have to take 3x as long to read my mail - but it's not a
> hotmail address anymore, so the whining stopped.  Why not spend less time
> restricting what people can read and post from, and just let people
> participate?

Oh please go away. Would you rather see lkml be as ridden with spam as
other lists? You have the right to use a commercial product, and you may
also exercise your right to choose a _bad_ one.

Besides, crap like the above doesn't carry much weight. Especially not
from someone who rarely contributes anything but noise on the list. No
time for whiners, to the kill file you go.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04 15:28           ` William Scott Lockwood III
  2003-04-04 16:04             ` Richard B. Johnson
@ 2003-04-04 16:04             ` Christoph Hellwig
  2003-04-04 16:10             ` Jens Axboe
  2003-04-04 20:37             ` Matti Aarnio
  3 siblings, 0 replies; 341+ messages in thread
From: Christoph Hellwig @ 2003-04-04 16:04 UTC (permalink / raw)
  To: William Scott Lockwood III
  Cc: Richard B. Johnson, David S. Miller, linux-kernel

On Fri, Apr 04, 2003 at 07:28:12AM -0800, William Scott Lockwood III wrote:
> Yes, I can see that.  No matter who it alienates.  Whether or not he's
> checked with anyone else either.

LKML is DaveM's list.  If the choices he and his co-postmaster make don't
suit your or others' needs, set up your own linux kernel list.

> How about letting those of us who (like
> Linus) choose to use a commercial email product do so?  Garbage about
> headers, etc. is just that - garbage.

Who said anything about commercial products?  lkml refuses _broken_
mails, it doesn't check what MUA you used.

> and others got for using hotmail - I finally got sick and tired of the
> whining and now have to take 3x as long to read my mail - but it's not a
> hotmail address anymore, so the whining stopped.  Why not spend less time
> restricting what people can read and post from, and just let people
> participate?

Please read the mail RFC and the netiquette and come back once you've
done that.  Your current whining wastes our time.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04 15:28           ` William Scott Lockwood III
@ 2003-04-04 16:04             ` Richard B. Johnson
  2003-04-04 16:04             ` Christoph Hellwig
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 341+ messages in thread
From: Richard B. Johnson @ 2003-04-04 16:04 UTC (permalink / raw)
  To: William Scott Lockwood III; +Cc: David S. Miller, Linux kernel

On Fri, 4 Apr 2003, William Scott Lockwood III wrote:

> On Fri, 4 Apr 2003, Richard B. Johnson wrote:
> > On Thu, 3 Apr 2003, William Scott Lockwood III wrote:
> > > On Thu, 3 Apr 2003, David S. Miller wrote:
> > > >    From: "Richard B. Johnson" <root@chaos.analogic.com>
> > > >    Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
> > > >    Well it's not a yahoo users problem because yahoo users can't fix
> > > >    it. Some yahoo users have yahoo "free" mail as their only connection
> > > >    to the internet because of fascist network administrators.
> > > > If you want all the SPAM that will result on Linux-kernel, we
> > > > can disable the filter if you want.
> > > > I refuse to sit here and listen to all the "this is the only
> > > > connection person FOO has to the internet" stories, quite frankly I'm
> > > > absolutely sick of hearing them.
> > > > If you don't have properly functioning mail, you can't use these
> > > > lists.
> > > > Period.
> > > When did that become your call?  I didn't realize you owned LKML.
> > Well it's his "baseball" and; "You'll play by my rules or you won't
> > play at all..."
> > FYI, there is no Major Domo. It's Latin, major domus, "master of
> > the house". He doith whatever he careth...
>
> Yes, I can see that.  No matter who it alienates.  Whether or not he's
> checked with anyone else either.  How about letting those of us who (like
> Linus) choose to use a commercial email product do so?  Garbage about
> headers, etc. is just that - garbage.  The best list is one that is
> inclusive.  One that tolerates other opinions and choices.  LKML has
> turned into the largest, nastiest clique I've ever seen, and that's really
> sad, as I'm sure it scares some good people away.  Look at all the crap I
> and others got for using hotmail - I finally got sick and tired of the
> whining and now have to take 3x as long to read my mail - but it's not a
> hotmail address anymore, so the whining stopped.  Why not spend less time
> restricting what people can read and post from, and just let people
> participate?
>

Well SPAM is a very big problem and I can see that David is trying.
Sometimes he has a bad day and pisses a few off with his answers.
However, in every case in which somebody that I know of has complained,
the problems did get "mysteriously" fixed so, like they say; "Don't go
away mad. Just go away!".

Once you get flamed for a few years, you get used to it. That's why
some people send email to me rather than "the list". Sometimes I am
able to help without having to forward their problems to the list.
Sometimes I have to take a work-break and can't help, and other times
I can't help because I don't know what they are talking about. Anyway
if David wants to invoke the rage of the Gods, yawn... It doesn't bother
me anymore....


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04 12:57         ` Richard B. Johnson
@ 2003-04-04 15:28           ` William Scott Lockwood III
  2003-04-04 16:04             ` Richard B. Johnson
                               ` (3 more replies)
  0 siblings, 4 replies; 341+ messages in thread
From: William Scott Lockwood III @ 2003-04-04 15:28 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: David S. Miller, linux-kernel

On Fri, 4 Apr 2003, Richard B. Johnson wrote:
> On Thu, 3 Apr 2003, William Scott Lockwood III wrote:
> > On Thu, 3 Apr 2003, David S. Miller wrote:
> > >    From: "Richard B. Johnson" <root@chaos.analogic.com>
> > >    Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
> > >    Well it's not a yahoo users problem because yahoo users can't fix
> > >    it. Some yahoo users have yahoo "free" mail as their only connection
> > >    to the internet because of fascist network administrators.
> > > If you want all the SPAM that will result on Linux-kernel, we
> > > can disable the filter if you want.
> > > I refuse to sit here and listen to all the "this is the only
> > > connection person FOO has to the internet" stories, quite frankly I'm
> > > absolutely sick of hearing them.
> > > If you don't have properly functioning mail, you can't use these
> > > lists.
> > > Period.
> > When did that become your call?  I didn't realize you owned LKML.
> Well it's his "baseball" and; "You'll play by my rules or you won't
> play at all..."
> FYI, there is no Major Domo. It's Latin, major domus, "master of
> the house". He doith whatever he careth...

Yes, I can see that.  No matter who it alienates.  Whether or not he's
checked with anyone else either.  How about letting those of us who (like
Linus) choose to use a commercial email product do so?  Garbage about
headers, etc. is just that - garbage.  The best list is one that is
inclusive.  One that tolerates other opinions and choices.  LKML has
turned into the largest, nastiest clique I've ever seen, and that's really
sad, as I'm sure it scares some good people away.  Look at all the crap I
and others got for using hotmail - I finally got sick and tired of the
whining and now have to take 3x as long to read my mail - but it's not a
hotmail address anymore, so the whining stopped.  Why not spend less time
restricting what people can read and post from, and just let people
participate?


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04  0:31       ` William Scott Lockwood III
  2003-04-04  0:40         ` David S. Miller
@ 2003-04-04 12:57         ` Richard B. Johnson
  2003-04-04 15:28           ` William Scott Lockwood III
  1 sibling, 1 reply; 341+ messages in thread
From: Richard B. Johnson @ 2003-04-04 12:57 UTC (permalink / raw)
  To: William Scott Lockwood III; +Cc: David S. Miller, linux-kernel

On Thu, 3 Apr 2003, William Scott Lockwood III wrote:

> On Thu, 3 Apr 2003, David S. Miller wrote:
> >    From: "Richard B. Johnson" <root@chaos.analogic.com>
> >    Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
> >    Well it's not a yahoo users problem because yahoo users can't fix
> >    it. Some yahoo users have yahoo "free" mail as their only connection
> >    to the internet because of fascist network administrators.
> > If you want all the SPAM that will result on Linux-kernel, we
> > can disable the filter if you want.
> > I refuse to sit here and listen to all the "this is the only
> > connection person FOO has to the internet" stories, quite frankly I'm
> > absolutely sick of hearing them.
> > If you don't have properly functioning mail, you can't use these
> > lists.
> > Period.
>
> When did that become your call?  I didn't realize you owned LKML.
>

Well it's his "baseball" and; "You'll play by my rules or you won't
play at all..."

FYI, there is no Major Domo. It's Latin, major domus, "master of
the house". He doith whatever he careth...

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04  0:40         ` David S. Miller
@ 2003-04-04  0:47           ` William Scott Lockwood III
  0 siblings, 0 replies; 341+ messages in thread
From: William Scott Lockwood III @ 2003-04-04  0:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: root, linux-kernel

Yeah, sorry - I thought HPA was running it for some reason.  I still don't
think you should make that kind of a call unilaterally, but hey - after
all the incessant whining I put up with about using OE, I finally caved
and moved to a real email address myself.  I guess it just goes to show
that if you whine and act petulant long enough...

On Thu, 3 Apr 2003, David S. Miller wrote:

>    From: William Scott Lockwood III <vlad@geekizoid.com>
>    Date: Thu, 3 Apr 2003 16:31:13 -0800 (PST)
>
>    When did that become your call?  I didn't realize you owned LKML.
>
> Maybe this is news to you, but I've been running LKML for
> 6 or so years now.
>


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-04  0:31       ` William Scott Lockwood III
@ 2003-04-04  0:40         ` David S. Miller
  2003-04-04  0:47           ` William Scott Lockwood III
  2003-04-04 12:57         ` Richard B. Johnson
  1 sibling, 1 reply; 341+ messages in thread
From: David S. Miller @ 2003-04-04  0:40 UTC (permalink / raw)
  To: vlad; +Cc: root, linux-kernel

   From: William Scott Lockwood III <vlad@geekizoid.com>
   Date: Thu, 3 Apr 2003 16:31:13 -0800 (PST)

   When did that become your call?  I didn't realize you owned LKML.
   
Maybe this is news to you, but I've been running LKML for
6 or so years now.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 20:00     ` David S. Miller
  2003-04-03 20:21       ` Richard B. Johnson
@ 2003-04-04  0:31       ` William Scott Lockwood III
  2003-04-04  0:40         ` David S. Miller
  2003-04-04 12:57         ` Richard B. Johnson
  1 sibling, 2 replies; 341+ messages in thread
From: William Scott Lockwood III @ 2003-04-04  0:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: root, linux-kernel

On Thu, 3 Apr 2003, David S. Miller wrote:
>    From: "Richard B. Johnson" <root@chaos.analogic.com>
>    Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
>    Well it's not a yahoo users problem because yahoo users can't fix
>    it. Some yahoo users have yahoo "free" mail as their only connection
>    to the internet because of fascist network administrators.
> If you want all the SPAM that will result on Linux-kernel, we
> can disable the filter if you want.
> I refuse to sit here and listen to all the "this is the only
> connection person FOO has to the internet" stories, quite frankly I'm
> absolutely sick of hearing them.
> If you don't have properly functioning mail, you can't use these
> lists.
> Period.

When did that become your call?  I didn't realize you owned LKML.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 20:02   ` your mail Richard B. Johnson
  2003-04-03 19:24     ` Alan Cox
  2003-04-03 20:00     ` David S. Miller
@ 2003-04-03 20:40     ` Trever L. Adams
  2 siblings, 0 replies; 341+ messages in thread
From: Trever L. Adams @ 2003-04-03 20:40 UTC (permalink / raw)
  To: root; +Cc: David S. Miller, Linux Kernel Mailing List

On Thu, 2003-04-03 at 15:02, Richard B. Johnson wrote:
> Well it's not a yahoo users problem because yahoo users can't fix
> it. Some yahoo users have yahoo "free" mail as their only connection
> to the internet because of fascist network administrators. It gets
> worse now that you can't tell a company to go screw themselves and
> get another job. The three engineers that I know who use yahoo do
> so because they don't have any choice and there is no way that they
> can configure the mailer to get rid of the empty HTML section.

I would suggest that those who think Yahoo is their only option check
out digitalme.com or myrealbox.com.  Web, pop, imap, etc.  All free.

Trever
--
"Never raise your hand to your children - it leaves your midsection
unprotected." -- Matthew Harrell


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 20:00     ` David S. Miller
@ 2003-04-03 20:21       ` Richard B. Johnson
  2003-04-03 20:15         ` David S. Miller
  2003-04-04  0:31       ` William Scott Lockwood III
  1 sibling, 1 reply; 341+ messages in thread
From: Richard B. Johnson @ 2003-04-03 20:21 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

On Thu, 3 Apr 2003, David S. Miller wrote:

>    From: "Richard B. Johnson" <root@chaos.analogic.com>
>    Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
>
>    Well it's not a yahoo users problem because yahoo users can't fix
>    it. Some yahoo users have yahoo "free" mail as their only connection
>    to the internet because of fascist network administrators.
>
> If you want all the SPAM that will result on Linux-kernel, we
> can disable the filter if you want.

No. I think you can let empty HTML sections go through.
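
To make the suggestion concrete, here is a small Python sketch of such a
check. It is only an illustration under stated assumptions: it is not how
vger's filter works, the function name is made up, and "empty" is taken to
mean an HTML part whose body is nothing but whitespace. It uses only the
standard library email package.

```python
from email import message_from_string


def has_nonempty_html_part(raw_message):
    """Return True if any text/html part carries non-whitespace content.

    Hypothetical helper: a filter could reject only when this returns
    True, letting messages with an empty HTML part through.
    """
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # decode=True undoes base64/quoted-printable and yields bytes
            payload = part.get_payload(decode=True) or b""
            if payload.strip():
                return True
    return False
```

A message with a plain-text part plus an empty HTML part would pass such a
check, while one with real HTML content would still be caught.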

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 20:21       ` Richard B. Johnson
@ 2003-04-03 20:15         ` David S. Miller
  0 siblings, 0 replies; 341+ messages in thread
From: David S. Miller @ 2003-04-03 20:15 UTC (permalink / raw)
  To: root; +Cc: linux-kernel

   From: "Richard B. Johnson" <root@chaos.analogic.com>
   Date: Thu, 3 Apr 2003 15:21:25 -0500 (EST)

   On Thu, 3 Apr 2003, David S. Miller wrote:
   
   > If you want all the SPAM that will result on Linux-kernel, we
   > can disable the filter if you want.
   
   No. I think you can let empty HTML sections go through.
   
I think these people who it matters to can petition yahoo.com to drop
this dumb empty HTML section.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 19:22 ` David S. Miller
@ 2003-04-03 20:02   ` Richard B. Johnson
  2003-04-03 19:24     ` Alan Cox
                       ` (2 more replies)
  0 siblings, 3 replies; 341+ messages in thread
From: Richard B. Johnson @ 2003-04-03 20:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux kernel

On Thu, 3 Apr 2003, David S. Miller wrote:

> On Thu, 2003-04-03 at 08:22, Richard B. Johnson wrote:
> > FYI vger rejects mail sent from yahoo.com, claims that it
> > has a HTML subpart and considers it spam or Outlook Virus.
> >
> > FYI any mail sent from yahoo will end up using the yahoo tools
> > (qmail). This will put an empty HTML section in all mail. It
> > is not a good thing to reject this because that means you reject
> > all mail from yahoo.
>
> That's yahoo users problem not ours.  If you can't be bothered
> to get a plain text email out, you shouldn't be using these
> lists.
>

Well it's not a yahoo users problem because yahoo users can't fix
it. Some yahoo users have yahoo "free" mail as their only connection
to the internet because of fascist network administrators. It gets
worse now that you can't tell a company to go screw themselves and
get another job. The three engineers that I know who use yahoo do
so because they don't have any choice and there is no way that they
can configure the mailer to get rid of the empty HTML section.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 20:02   ` your mail Richard B. Johnson
  2003-04-03 19:24     ` Alan Cox
@ 2003-04-03 20:00     ` David S. Miller
  2003-04-03 20:21       ` Richard B. Johnson
  2003-04-04  0:31       ` William Scott Lockwood III
  2003-04-03 20:40     ` Trever L. Adams
  2 siblings, 2 replies; 341+ messages in thread
From: David S. Miller @ 2003-04-03 20:00 UTC (permalink / raw)
  To: root; +Cc: linux-kernel

   From: "Richard B. Johnson" <root@chaos.analogic.com>
   Date: Thu, 3 Apr 2003 15:02:41 -0500 (EST)
   
   Well it's not a yahoo users problem because yahoo users can't fix
   it. Some yahoo users have yahoo "free" mail as their only connection
   to the internet because of fascist network administrators.

If you want all the SPAM that will result on Linux-kernel, we
can disable the filter if you want.

I refuse to sit here and listen to all the "this is the only
connection person FOO has to the internet" stories, quite frankly I'm
absolutely sick of hearing them.

If you don't have properly functioning mail, you can't use these
lists.

Period.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-04-03 20:02   ` your mail Richard B. Johnson
@ 2003-04-03 19:24     ` Alan Cox
  2003-04-03 20:00     ` David S. Miller
  2003-04-03 20:40     ` Trever L. Adams
  2 siblings, 0 replies; 341+ messages in thread
From: Alan Cox @ 2003-04-03 19:24 UTC (permalink / raw)
  To: root; +Cc: David S. Miller, Linux Kernel Mailing List

On Iau, 2003-04-03 at 21:02, Richard B. Johnson wrote:
> Well it's not a yahoo users problem because yahoo users can't fix
> it. Some yahoo users have yahoo "free" mail as their only connection
> to the internet because of fascist network administrators. It gets
> worse now that you can't tell a company to go screw themselves and
> get another job. The three engineers that I know who use yahoo do
> so because they don't have any choice and there is no way that they
> can configure the mailer to get rid of the empty HTML section.

There are lots of other free email providers.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-31 18:46 saurabh  khanna
@ 2003-02-03 12:53 ` Alexander Kellett
  0 siblings, 0 replies; 341+ messages in thread
From: Alexander Kellett @ 2003-02-03 12:53 UTC (permalink / raw)
  To: saurabh khanna; +Cc: linux-kernel

hiya,

unfortunately this list isn't for such problems and it
would be better to contact your distribution or the various
forums it has. try google.

/me wonders again why this list isn't called linux-kernel-dev@...

Alex

On Fri, Jan 31, 2003 at 06:46:05PM -0000, saurabh  khanna wrote:
> Problem: My xwindows did not open and my sound card doesn't work.
> 
> Xwindows:
> I am a novice. I am using Red Hat Linux 8. It detects my graphics card
> correctly, but when I try to open xwindows, my system hangs.
> 
> Sound:
> Linux detected my sound card once but did not configure it, and since
> then it neither works nor is detected by Linux.
> 
> GRUB:
> Also, I can boot Linux through LILO only. GRUB won't work; it gives
> the error "Not enough memory".
> 
> I have re-installed Linux on my computer but the problem remains.
> All other details follow.
> 
> Kernel version:
> Linux version 2.4.18-14 (bhcompile@astest.test.redhat.com)
> (gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7))
> #1 Wed Sep 4 12:13:11 EDT 2002
> 
> 
> Command which triggers the problem:
> startx
> 
> Processor information:
> processor	: 0
> vendor_id	: AuthenticAMD
> cpu family	: 6
> model		: 6
> model name	: AMD Athlon(TM) XP 1700+
> stepping	: 2
> cpu MHz		: 1469.861
> cache size	: 256 KB
> fdiv_bug	: no
> hlt_bug		: no
> f00f_bug	: no
> coma_bug	: no
> fpu		: yes
> fpu_exception	: yes
> cpuid level	: 1
> wp		: yes
> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow
> bogomips	: 2920.57
> 
> 
> 
> Module information:
> nls_iso8859-1           3516   1 (autoclean)
> nls_cp437               5116   1 (autoclean)
> vfat                   13084   1 (autoclean)
> fat                    38744   0 (autoclean) [vfat]
> autofs                 13348   0 (autoclean) (unused)
> ipt_REJECT              3736   2 (autoclean)
> iptable_filter          2412   1 (autoclean)
> ip_tables              14840   2 [ipt_REJECT iptable_filter]
> mousedev                5524   0 (unused)
> keybdev                 2976   0 (unused)
> hid                    22244   0 (unused)
> input                   5888   0 [mousedev keybdev hid]
> usb-ohci               21288   0 (unused)
> usbcore                77056   1 [hid usb-ohci]
> ext3                   70400   1
> jbd                    52212   1 [ext3]
> 
> 
> 
> Loaded driver and hardware information:
> 0000-001f : dma1
> 0020-003f : pic1
> 0040-005f : timer
> 0060-006f : keyboard
> 0070-007f : rtc
> 0080-008f : dma page reg
> 00a0-00bf : pic2
> 00c0-00df : dma2
> 00f0-00ff : fpu
> 01f0-01f7 : ide0
> 02f8-02ff : serial(auto)
> 03c0-03df : vga+
> 03f6-03f6 : ide0
> 03f8-03ff : serial(auto)
> 0cf8-0cff : PCI conf1
> 5000-500f : PCI device 10de:01b4 (nVidia Corporation)
> 5100-511f : PCI device 10de:01b4 (nVidia Corporation)
> 5500-550f : PCI device 10de:01b4 (nVidia Corporation)
> a800-a80f : nVidia Corporation nForce IDE
> a800-a807 : ide0
> a808-a80f : ide1
> b000-bfff : PCI Bus #01
> b800-b807 : Rockwell International HCF 56k Data/Fax/Voice/Spkp (w/Handset) Modem
> d800-d807 : PCI device 10de:01c3 (nVidia Corporation)
> e000-e07f : PCI device 10de:01b1 (nVidia Corporation)
> e100-e1ff : PCI device 10de:01b1 (nVidia Corporation)
> 
> 00000000-0007ffff : System RAM
> 0009fc00-0009ffff : reserved
> 000a0000-000bffff : Video RAM area
> 000c0000-000c7fff : Video ROM
> 000f0000-000fffff : System ROM
> 00100000-06febfff : System RAM
> 00100000-00247f2e : Kernel code
> 00247f2f-0033ed03 : Kernel data
> 06fec000-06feefff : ACPI Tables
> 06fef000-06ffefff : reserved
> 06fff000-06ffffff : ACPI Non-volatile Storage
> eb000000-ec7fffff : PCI Bus #02
> eb000000-ebffffff : nVidia Corporation GeForce2 Integrated GPU
> ec800000-ecffffff : PCI Bus #01
> ec800000-ec80ffff : Rockwell International HCF 56k Data/Fax/Voice/Spkp (w/Handset) Modem
> ed000000-ed000fff : PCI device 10de:01b1 (nVidia Corporation)
> ed800000-ed87ffff : PCI device 10de:01b0 (nVidia Corporation)
> ee000000-ee0003ff : PCI device 10de:01c3 (nVidia Corporation)
> ee800000-ee800fff : PCI device 10de:01c2 (nVidia Corporation)
> ee800000-ee800fff : usb-ohci
> ef000000-ef000fff : PCI device 10de:01c2 (nVidia Corporation)
> ef000000-ef000fff : usb-ohci
> eff00000-f7ffffff : PCI Bus #02
> f0000000-f7ffffff : nVidia Corporation GeForce2 Integrated GPU
> f8000000-fbffffff : PCI device 10de:01a4 (nVidia Corporation)
> fec00000-fec00fff : reserved
> fee00000-fee00fff : reserved
> ffff0000-ffffffff : reserved
> 
> 
> PCI information:
> 00:00.0 Host bridge: nVidia Corporation nForce CPU bridge (rev 
> b2)
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Region 0: Memory at f8000000 (32-bit, prefetchable) [size=64M]
> 	Capabilities: [40] AGP version 2.0
> 		Status: RQ=31 SBA+ 64bit- FW+ Rate=x1,x2,x4
> 		Command: RQ=0 SBA- AGP- 64bit- FW- Rate=x1
> 	Capabilities: [60] #08 [2001]
> 
> 00:00.1 RAM memory: nVidia Corporation nForce 220/420 Memory 
> Controller (rev b2)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 
> 00:00.2 RAM memory: nVidia Corporation nForce 220/420 Memory 
> Controller (rev b2)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 
> 00:00.3 RAM memory: nVidia Corporation: Unknown device 01aa (rev 
> b2)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 
> 00:01.0 ISA bridge: nVidia Corporation nForce ISA Bridge (rev 
> c3)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Capabilities: [50] #08 [01e1]
> 
> 00:01.1 SMBus: nVidia Corporation nForce PCI System Management 
> (rev c1)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Interrupt: pin A routed to IRQ 5
> 	Region 0: I/O ports at 5000 [size=16]
> 	Region 1: I/O ports at 5500 [size=16]
> 	Region 2: I/O ports at 5100 [size=32]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot+,D3cold+)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:02.0 USB Controller: nVidia Corporation: Unknown device 01c2 
> (rev c3) (prog-if 10 [OHCI])
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0 (750ns min, 250ns max)
> 	Interrupt: pin A routed to IRQ 5
> 	Region 0: Memory at ef000000 (32-bit, non-prefetchable) 
> [size=4K]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
> PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:03.0 USB Controller: nVidia Corporation: Unknown device 01c2 
> (rev c3) (prog-if 10 [OHCI])
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0 (750ns min, 250ns max)
> 	Interrupt: pin A routed to IRQ 5
> 	Region 0: Memory at ee800000 (32-bit, non-prefetchable) 
> [size=4K]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
> PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:04.0 Ethernet controller: nVidia Corporation: Unknown device 
> 01c3 (rev c2)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0 (250ns min, 5000ns max)
> 	Interrupt: pin A routed to IRQ 5
> 	Region 0: Memory at ee000000 (32-bit, non-prefetchable) 
> [size=1K]
> 	Region 1: I/O ports at d800 [size=8]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
> PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:05.0 Multimedia audio controller: nVidia Corporation: Unknown 
> device 01b0 (rev c2)
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0 (250ns min, 3000ns max)
> 	Interrupt: pin A routed to IRQ 5
> 	Region 0: Memory at ed800000 (32-bit, non-prefetchable) 
> [size=512K]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:06.0 Multimedia audio controller: nVidia Corporation nForce 
> Audio (rev c2)
> 	Subsystem: nVidia Corporation: Unknown device 8384
> 	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0 (500ns min, 1250ns max)
> 	Interrupt: pin A routed to IRQ 11
> 	Region 0: I/O ports at e100 [size=256]
> 	Region 1: I/O ports at e000 [size=128]
> 	Region 2: Memory at ed000000 (32-bit, non-prefetchable) 
> [disabled] [size=4K]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:08.0 PCI bridge: nVidia Corporation nForce PCI-to-PCI bridge 
> (rev c2) (prog-if 00 [Normal decode])
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> 	I/O behind bridge: 0000b000-0000bfff
> 	Memory behind bridge: ec800000-ecffffff
> 	Prefetchable memory behind bridge: f8000000-f7ffffff
> 	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
> 
> 00:09.0 IDE interface: nVidia Corporation nForce IDE (rev c3) 
> (prog-if 8a [Master SecP PriP])
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0 (750ns min, 250ns max)
> 	Region 4: I/O ports at a800 [size=16]
> 	Capabilities: [44] Power Management version 2
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 
> 00:1e.0 PCI bridge: nVidia Corporation nForce AGP to PCI Bridge 
> (rev b2) (prog-if 00 [Normal decode])
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 0
> 	Bus: primary=00, secondary=02, subordinate=02, sec-latency=64
> 	I/O behind bridge: 0000a000-00009fff
> 	Memory behind bridge: eb000000-ec7fffff
> 	Prefetchable memory behind bridge: eff00000-f7ffffff
> 	BridgeCtl: Parity- SERR- NoISA- VGA+ MAbort- >Reset- FastB2B-
> 
> 01:08.0 Communication controller: Rockwell International HCF 56k 
> Data/Fax/Voice/Spkp (w/Handset) Modem (rev 01)
> 	Subsystem: Rockwell International HCF 56k Data/Fax/Voice/Spkp 
> (w/Handset) Modem
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 64
> 	Interrupt: pin A routed to IRQ 5
> 	Region 0: Memory at ec800000 (32-bit, non-prefetchable) 
> [size=64K]
> 	Region 1: I/O ports at b800 [size=8]
> 	Capabilities: [40] Power Management version 2
> 		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA 
> PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 PME-Enable+ DSel=0 DScale=0 PME-
> 
> 02:00.0 VGA compatible controller: nVidia Corporation NV15 
> [GeForce2 - nForce GPU] (rev b1) (prog-if 00 [VGA])
> 	Subsystem: nVidia Corporation: Unknown device 0c11
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- 
> ParErr- Stepping- SERR- FastB2B-
> 	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
> <TAbort- <MAbort- >SERR- <PERR-
> 	Latency: 32 (1250ns min, 250ns max)
> 	Interrupt: pin A routed to IRQ 11
> 	Region 0: Memory at eb000000 (32-bit, non-prefetchable) 
> [size=16M]
> 	Region 1: Memory at f0000000 (32-bit, prefetchable) 
> [size=128M]
> 	Expansion ROM at efff0000 [disabled] [size=64K]
> 	Capabilities: [60] Power Management version 2
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> 		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [44] AGP version 2.0
> 		Status: RQ=31 SBA- 64bit- FW+ Rate=x1,x4
> 		Command: RQ=0 SBA- AGP- 64bit- FW- Rate=<none>
> 
> 
> 
> XF86Config:
> 
> # File generated by anaconda.
> 
> Section "ServerLayout"
>         Identifier     "Anaconda Configured"
>         Screen      0  "Screen0" 0 0
>         InputDevice    "Mouse0" "CorePointer"
> 	InputDevice	"Mouse1" "SendCoreEvents"
>         InputDevice    "Keyboard0" "CoreKeyboard"
> EndSection
> 
> Section "Files"
> 
> # The location of the RGB database.  Note, this is the name of 
> the
> # file minus the extension (like ".txt" or ".db").  There is 
> normally
> # no need to change the default.
> 
>     RgbPath	"/usr/X11R6/lib/X11/rgb"
> 
> # Multiple FontPath entries are allowed (they are concatenated 
> together)
> # By default, Red Hat 6.0 and later now use a font server 
> independent of
> # the X server to render fonts.
> 
>     FontPath   "unix/:7100"
> 
> EndSection
> 
> Section "Module"
>         Load  "dbe"
>         Load  "extmod"
> 	Load  "fbdevhw"
> 	Load  "dri"
>         Load  "glx"
>         Load  "record"
>         Load  "freetype"
>         Load  "type1"
> EndSection
> 
> Section "InputDevice"
>         Identifier  "Keyboard0"
>         Driver      "keyboard"
> 
> #	Option	"AutoRepeat"	"500 5"
> 
> # when using XQUEUE, comment out the above line, and uncomment 
> the
> # following line
> #	Option	"Protocol"	"Xqueue"
> 
> # Specify which keyboard LEDs can be user-controlled (eg, with 
> xset(1))
> #	Option	"Xleds"		"1 2 3"
> 
> # To disable the XKEYBOARD extension, uncomment XkbDisable.
> #	Option	"XkbDisable"
> 
> # To customise the XKB settings to suit your keyboard, modify 
> the
> # lines below (which are the defaults).  For example, for a 
> non-U.S.
> # keyboard, you will probably want to use:
> #	Option	"XkbModel"	"pc102"
> # If you have a US Microsoft Natural keyboard, you can use:
> #	Option	"XkbModel"	"microsoft"
> #
> # Then to change the language, change the Layout setting.
> # For example, a german layout can be obtained with:
> #	Option	"XkbLayout"	"de"
> # or:
> #	Option	"XkbLayout"	"de"
> #	Option	"XkbVariant"	"nodeadkeys"
> #
> # If you'd like to switch the positions of your capslock and
> # control keys, use:
> #	Option	"XkbOptions"	"ctrl:swapcaps"
> 	Option	"XkbRules"	"xfree86"
> 	Option	"XkbModel"	"pc105"
> 	Option	"XkbLayout"	"us"
> 	#Option	"XkbVariant"	""
> 	#Option	"XkbOptions"	""
> EndSection
> 
> Section "InputDevice"
>         Identifier  "Mouse0"
>         Driver      "mouse"
>         Option      "Protocol" "PS/2"
>         Option      "Device" "/dev/psaux"
>         Option      "ZAxisMapping" "4 5"
>         Option      "Emulate3Buttons" "yes"
> EndSection
> 
> 
> Section "InputDevice"
> 	Identifier	"Mouse1"
> 	Driver		"mouse"
> 	Option		"Device"		"/dev/input/mice"
> 	Option		"Protocol"		"IMPS/2"
> 	Option		"Emulate3Buttons"	"no"
> 	Option		"ZAxisMapping"		"4 5"
> EndSection
> 
> 
> Section "Monitor"
>         Identifier   "Monitor0"
>         VendorName   "Monitor Vendor"
>         ModelName    "Monitor Model"
>         HorizSync   30-55
>         VertRefresh 50-120
>         Option "dpms"
> 
> 
> EndSection
> 
> Section "Device"
> 	# no known options
> 	Identifier   "NVIDIA GeForce 2 MX (generic)"
>         Driver       "nv"
>         VendorName   "NVIDIA GeForce 2 MX (generic)"
>         BoardName     "NVIDIA GeForce 2 MX (generic)"
> 
>         #BusID
> EndSection
> 
> Section "Screen"
> 	Identifier   "Screen0"
>         Device       "NVIDIA GeForce 2 MX (generic)"
>         Monitor      "Monitor0"
> 	DefaultDepth	16
> 
> 	Subsection "Display"
>         	Depth       16
>                 Modes       "1024x768" "800x600" "640x480"
> 	EndSubsection
> 
> EndSection
> 
> Section "DRI"
> 	Mode 0666
> EndSection
> 
> cmdline:
> auto BOOT_IMAGE=linux ro BOOT_FILE=/boot/vmlinuz-2.4.18-14 
> root=LABEL=/
> 
> 
> dma:
> 4: cascade
> 
> 
> interrupts:
> CPU0
>   0:     337647          XT-PIC  timer
>   1:       2694          XT-PIC  keyboard
>   2:          0          XT-PIC  cascade
>   5:          0          XT-PIC  usb-ohci, usb-ohci
>   8:          1          XT-PIC  rtc
>  12:         20          XT-PIC  PS/2 Mouse
>  14:      27338          XT-PIC  ide0
> NMI:          0
> ERR:          2
> 
> 
> partitions:
> major minor  #blocks  name     rio rmerge rsect ruse wio wmerge 
> wsect wuse running use aveq
> 
>    3     0   39121488 hda 2567 4181 52107 22201 1417 1941 26952 
> 45880 -2 329576 7788488
>    3     1    5245191 hda1 9 43 104 109 0 0 0 0 0 109 109
>    3     2          1 hda2 0 0 0 0 0 0 0 0 0 0 0
>    3     5   10490413 hda5 9 43 104 95 0 0 0 0 0 95 95
>    3     6   11695288 hda6 50 43 145 214 7 1 8 95 0 230 310
>    3     7    8104761 hda7 9 43 104 132 0 0 0 0 0 132 132
>    3     8    3277228 hda8 2475 3966 51498 21527 1410 1940 26944 
> 45785 0 22394 67314
>    3     9     305203 hda9 9 25 104 56 0 0 0 0 0 56 56
> 
> 
> 
> My e-mail addresses are: linux_guyus@yahoo.com and 
> linux_guyus@rediff.com
> My postal address: 80, Ahilya Nagar Ext. Annapurna Road, Indore, 
> M.P.,
> India. PIN 452009
> please answer me soon.
> 		Thanking you.
> 			Saurabh Khanna.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

Kind regards,
Alex

-- 
"[...] Konqueror open source project. Weighing in at less than
            one tenth the size of another open source renderer"
Apple,  Jan 2003 (http://www.apple.com/safari/)

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-25 23:10           ` Larry McVoy
@ 2003-01-26  8:12             ` David S. Miller
  0 siblings, 0 replies; 341+ messages in thread
From: David S. Miller @ 2003-01-26  8:12 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Eric W. Biederman, Jason Papadopoulos, linux-kernel, linux-mm

On Sat, 2003-01-25 at 15:10, Larry McVoy wrote:
> All good page coloring implementations do exactly that.  The starting
> index into the page buckets is based on process id.

I think everyone interested in learning more about this
topic should go read the following papers, they were very
helpful when I was fiddling around in this area.

These papers, in turn, reference several others which are
good reads as well.

1) W. L. Lynch, B. K. Bray, and M. J. Flynn. "The effect of page
   allocation on caches". In Micro-25 Conference Proceedings, pages
   222-225, December 1992. 

2) W. Lynch and M. Flynn. "Cache improvements through colored page
   allocation". ACM Transactions on Computer Systems, 1993. Submitted
   for review, 1992. 

3) William L. Lynch. "The Interaction of Virtual Memory and Cache
   Memory". PhD thesis, Stanford University, October
   1993. CSL-TR-93-587.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-25 17:47         ` Eric W. Biederman
@ 2003-01-25 23:10           ` Larry McVoy
  2003-01-26  8:12             ` David S. Miller
  0 siblings, 1 reply; 341+ messages in thread
From: Larry McVoy @ 2003-01-25 23:10 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Larry McVoy, Jason Papadopoulos, linux-kernel, linux-mm

> I am wondering if there is any point in biasing page addresses in between
> processes so that processes are less likely to have a cache conflict.
> i.e.  process 1 address 0 %16K == 0, process 2 address 0 %16K == 4K 

All good page coloring implementations do exactly that.  The starting
index into the page buckets is based on process id.
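
As a rough illustration of the per-process bias discussed here, the
sketch below picks a preferred color from the process id and the virtual
page number. The 16K span and 4K page size come from the example quoted
above; the function names and the (pid + vpn) % n_colors formula are
assumptions made for illustration and are not taken from any actual patch.

```python
PAGE_SIZE = 4 * 1024                 # 4K pages, as in the quoted example
COLOR_SPAN = 16 * 1024               # the "%16K" span from the quoted example
N_COLORS = COLOR_SPAN // PAGE_SIZE   # 4 colors


def preferred_color(pid, vpn, n_colors=N_COLORS):
    """Hypothetical policy: start each process at a color derived from its pid."""
    return (pid + vpn) % n_colors


def offset_in_span(pid, vaddr):
    """Offset (modulo COLOR_SPAN) where the page holding vaddr would ideally land."""
    vpn = vaddr // PAGE_SIZE
    return preferred_color(pid, vpn) * PAGE_SIZE


if __name__ == "__main__":
    # Virtual address 0 of three consecutive processes.
    for pid in (1, 2, 3):
        print(pid, offset_in_span(pid, 0))
```

With this policy, address 0 of consecutive pids lands one page apart, which
is the same 4K spacing as in the quoted example (the absolute offsets depend
on how the pid is folded into the starting index).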
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-25  2:26       ` Larry McVoy
@ 2003-01-25 17:47         ` Eric W. Biederman
  2003-01-25 23:10           ` Larry McVoy
  0 siblings, 1 reply; 341+ messages in thread
From: Eric W. Biederman @ 2003-01-25 17:47 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Jason Papadopoulos, linux-kernel, linux-mm

Larry McVoy <lm@bitmover.com> writes:

> > For the record, I finally got to try my own page coloring patch on a 1GHz
> > Athlon Thunderbird system with 256kB L2 cache. With the present patch, my
> > own number crunching benchmarks and a kernel compile don't show any benefit 
> > at all, and lmbench is completely unchanged except for the mmap latency, 
> > which is slightly worse. Hardly a compelling case for PCs!
> 
> If it works correctly then the variability in lat_ctx should go away.
> Try this
> 
> 	for p in 2 4 8 12 16 24 32 64
> 	do	for size in 0 2 4 8 16
> 		do	for i in 1 2 3 4 5 6 7 8 9 0
> 			do	lat_ctx -s$size $p
> 			done
> 		done
> 	done
> 
> on both kernels, with and without the patch.  Page coloring should make the 
> numbers rock steady; without it, they will bounce a lot.

In the same vein, I have seen some tremendous variability in the
stream benchmark.  Under Linux I have gotten it to vary by as much
as 100MB/sec by running updatedb between runs.  In one case
it ran faster with updatedb running in the background.

But at the same time stream tends to be very steady if you have a quiet
machine and run it several times in a row, because it gets
allocated essentially the same memory every run.

So I do know that the variables of cache contention have an effect on some
real programs.  I have not yet tracked it down to see if cache coloring
could be a benefit.  I suspect the buddy allocator actually comes
quite close most of the time, and tricks like allocating multiple pages
at once could improve that even more with very little effort, while reducing
page fault miss times.

I am wondering if there is any point in biasing page addresses in between
processes so that processes are less likely to have a cache conflict.
i.e.  process 1 address 0 %16K == 0, process 2 address 0 %16K == 4K 

Eric

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  6:06   ` John Alvord
@ 2003-01-25  2:29     ` Jason Papadopoulos
  2003-01-25  2:26       ` Larry McVoy
  0 siblings, 1 reply; 341+ messages in thread
From: Jason Papadopoulos @ 2003-01-25  2:29 UTC (permalink / raw)
  To: linux-kernel, linux-mm

At 10:06 PM 1/23/03 -0800, John Alvord wrote:

>The big challenge in Linux is that several serious attempts to add
>page coloring have foundered on the shoals of "no benefit found". It
>may be that the typical hardware Linux runs on just doesn't experience
>the problem very much.

Another strike against page coloring is that it gives tremendous benefits
when caches are large and not very associative, but if both of these are
not present the benefits are much smaller. In the case of latter-day PCs,
neither of these is the case: the caches are very small and at least 8-way
set associative.
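
To put rough numbers on this, here is a small sketch that counts page
colors for a given cache geometry. It assumes 4 KiB pages and the usual
"pages per cache way" definition of a color; the 256kB, 8-way figures follow
the message above, while the 4MB direct-mapped cache is a hypothetical
contrast case, and the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def num_colors(cache_size, ways, page_size=PAGE_SIZE):
    """Pages per cache way; 1 means page placement cannot matter."""
    return max(1, cache_size // (ways * page_size))


def worst_case_usable(cache_size, ways, page_size=PAGE_SIZE):
    """Cache bytes reachable if every page of a program shares one color."""
    return cache_size // num_colors(cache_size, ways, page_size)


configs = [
    ("256kB, 8-way", 256 * 1024, 8),
    ("4MB, direct-mapped (hypothetical)", 4 * 1024 * 1024, 1),
]

for name, size, ways in configs:
    print(name, num_colors(size, ways), worst_case_usable(size, ways))
```

Under these assumptions the small 8-way cache has only 8 colors, and an
unlucky allocation can still reach one eighth of it, whereas the large
direct-mapped cache has 1024 colors and a worst case of a single page.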

For the record, I finally got to try my own page coloring patch on a 1GHz
Athlon Thunderbird system with 256kB L2 cache. With the present patch, my
own number crunching benchmarks and a kernel compile don't show any benefit 
at all, and lmbench is completely unchanged except for the mmap latency, 
which is slightly worse. Hardly a compelling case for PCs!

Oh well. At least now I'll be able to port to 2.5 :)

jasonp

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-25  2:29     ` Jason Papadopoulos
@ 2003-01-25  2:26       ` Larry McVoy
  2003-01-25 17:47         ` Eric W. Biederman
  0 siblings, 1 reply; 341+ messages in thread
From: Larry McVoy @ 2003-01-25  2:26 UTC (permalink / raw)
  To: Jason Papadopoulos; +Cc: linux-kernel, linux-mm

> For the record, I finally got to try my own page coloring patch on a 1GHz
> Athlon Thunderbird system with 256kB L2 cache. With the present patch, my
> own number crunching benchmarks and a kernel compile don't show any benefit 
> at all, and lmbench is completely unchanged except for the mmap latency, 
> which is slightly worse. Hardly a compelling case for PCs!

If it works correctly then the variability in lat_ctx should go away.
Try this

	for p in 2 4 8 12 16 24 32 64
	do	for size in 0 2 4 8 16
		do	for i in 1 2 3 4 5 6 7 8 9 0
			do	lat_ctx -s$size $p
			done
		done
	done

on both kernels, with and without the patch.  Page coloring should make the 
numbers rock steady; without it, they will bounce a lot.
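
One possible way to compare the two kernels is to summarise the spread of
the repeated timings for each (size, process count) pair. The sketch below
is only an illustration: the helper name is hypothetical, it does not parse
lat_ctx output, and the sample values are made up, not measurements.

```python
import statistics


def spread(samples):
    """Summarise how much repeated timings of one configuration vary."""
    lo, hi = min(samples), max(samples)
    mean = statistics.mean(samples)
    return {"min": lo, "max": hi, "mean": mean, "rel_range": (hi - lo) / mean}


# Made-up numbers purely for illustration, NOT real lat_ctx results.
steady = [10.0, 10.0, 10.0, 10.0]
bouncy = [8.0, 12.0, 10.0, 10.0]

print("steady:", spread(steady)["rel_range"])
print("bouncy:", spread(bouncy)["rel_range"])
```

A relative range near zero across the ten runs would correspond to "rock
steady"; a large one would correspond to numbers that bounce.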
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24 19:14         ` David Lang
@ 2003-01-24 19:40           ` Maciej W. Rozycki
  0 siblings, 0 replies; 341+ messages in thread
From: Maciej W. Rozycki @ 2003-01-24 19:40 UTC (permalink / raw)
  To: David Lang; +Cc: Anoop J., linux-mm, linux-kernel

On Fri, 24 Jan 2003, David Lang wrote:

> The cache never sees the virtual addresses; it operates exclusively on the
> physical addresses, so the problem of aliasing never comes up.

 It depends on the implementation.

> Virtual-to-physical address mapping is all resolved before anything hits
> the cache.

 It depends on the processor.
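
 For readers following along, a minimal sketch of why aliasing can appear
when a cache is indexed with virtual addresses. It assumes 4 KiB pages and
a cache whose index is taken from the low address bits within one way; the
function names and the example addresses are hypothetical, and no specific
processor is being described.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def index_color(addr, way_size, page_size=PAGE_SIZE):
    """Cache-index bits above the page offset, for the address used to index."""
    return (addr % way_size) // page_size


def aliases_conflict(vaddr_a, vaddr_b, way_size):
    """True if two virtual mappings of one physical page would index
    different sets in a virtually indexed cache."""
    return index_color(vaddr_a, way_size) != index_color(vaddr_b, way_size)


WAY_16K = 16 * 1024

# Two page-aligned virtual addresses mapped to the same physical page.
print(aliases_conflict(0x10000, 0x11000, WAY_16K))    # different colors
print(aliases_conflict(0x10000, 0x14000, WAY_16K))    # same color
print(aliases_conflict(0x10000, 0x11000, PAGE_SIZE))  # way no larger than a page
```

 When the way size does not exceed the page size, the index comes entirely
from the page offset, which is identical in virtual and physical addresses,
so the question does not arise. Otherwise, keeping aliases on the same
color is one way to make them share a cache set.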

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  9:49       ` Anoop J.
@ 2003-01-24 19:14         ` David Lang
  2003-01-24 19:40           ` Maciej W. Rozycki
  0 siblings, 1 reply; 341+ messages in thread
From: David Lang @ 2003-01-24 19:14 UTC (permalink / raw)
  To: Anoop J.; +Cc: linux-mm, linux-kernel

The cache never sees the virtual addresses; it operates exclusively on the
physical addresses, so the problem of aliasing never comes up.

Virtual-to-physical address mapping is all resolved before anything hits
the cache.

David Lang

On Fri, 24 Jan 2003, Anoop J. wrote:

> Date: Fri, 24 Jan 2003 15:19:16 +0530 (IST)
> From: Anoop J. <cs99001@nitc.ac.in>
> To: david.lang@digitalinsight.com
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> Subject: Re: your mail
>
> OK, I shall put it another way.
> Since virtual indexing is a representation of the virtual memory,
> it is possible for multiple virtual addresses to represent the same
> physical address. So the problem of aliasing occurs in the cache. Does page
> coloring guarantee a unique mapping of physical addresses? If so, how is the
> mapping from virtual to physical address done?
>
>
>
> Thanks
>
>
>
> > I think this is a case of the same term being used for two different
> > purposes. I don't know the use you are referring to.
> >
> > David Lang
> >
> >
> >
>
>
>
>

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  8:48     ` David Lang
@ 2003-01-24  9:49       ` Anoop J.
  2003-01-24 19:14         ` David Lang
  0 siblings, 1 reply; 341+ messages in thread
From: Anoop J. @ 2003-01-24  9:49 UTC (permalink / raw)
  To: david.lang; +Cc: linux-mm, linux-kernel

OK, I shall put it another way.
Since virtual indexing is a representation of the virtual memory,
it is possible for multiple virtual addresses to represent the same
physical address. So the problem of aliasing occurs in the cache. Does page
coloring guarantee a unique mapping of physical addresses? If so, how is the
mapping from virtual to physical address done?



Thanks



> I think this is a case of the same term being used for two different
> purposes. I don't know the use you are referring to.
>
> David Lang
>
>
>





^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  6:28 ` your mail David Lang
@ 2003-01-24  8:51   ` Anoop J.
  2003-01-24  8:48     ` David Lang
  0 siblings, 1 reply; 341+ messages in thread
From: Anoop J. @ 2003-01-24  8:51 UTC (permalink / raw)
  To: david.lang; +Cc: linux-mm, linux-kernel

I read that the data coherency problem due to virtual indexing is avoided
through page coloring, and that it also has the speed of physical indexing.
Can you elaborate on how this is possible?


Thanks




> Implementing a fully associative cache eliminates the need for page
> coloring, but it has to be implemented in hardware. If you don't have
> fully associative caches in your hardware, page coloring helps avoid the
> worst-case memory allocations.
>
> From what I have seen of the attempts to implement it, the problem is
> that the calculations needed to do page-colored allocations end up
> costing enough that they result in a net loss compared to the old
> method.
>
> David Lang
>
>
>  On Fri, 24 Jan 2003, Anoop J. wrote:
>
>> Date: Fri, 24 Jan 2003 11:24:24 +0530 (IST)
>> From: Anoop J. <cs99001@nitc.ac.in>
>> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
>>
>>
>> How is this different from a fully associative cache? It would be better
>> if you could explain it in terms of the address bits used.
>>




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  8:51   ` Anoop J.
@ 2003-01-24  8:48     ` David Lang
  2003-01-24  9:49       ` Anoop J.
  0 siblings, 1 reply; 341+ messages in thread
From: David Lang @ 2003-01-24  8:48 UTC (permalink / raw)
  To: Anoop J.; +Cc: linux-mm, linux-kernel

I think this is a case of the same term being used for two different
purposes. I don't know the use you are referring to.

David Lang


On Fri, 24 Jan 2003, Anoop J. wrote:

> I read that the data coherency problem due to virtual indexing is avoided
> through page coloring, and that it also has the speed of physical indexing.
> Can you elaborate on how this is possible?
>
>
> Thanks
>
>
>
>
> > Implementing a fully associative cache eliminates the need for page
> > coloring, but it has to be implemented in hardware. If you don't have
> > fully associative caches in your hardware, page coloring helps avoid the
> > worst-case memory allocations.
> >
> > From what I have seen of the attempts to implement it, the problem is
> > that the calculations needed to do page-colored allocations end up
> > costing enough that they result in a net loss compared to the old
> > method.
> >
> > David Lang
> >
> >
> >  On Fri, 24 Jan 2003, Anoop J. wrote:
> >
> >> Date: Fri, 24 Jan 2003 11:24:24 +0530 (IST)
> >> From: Anoop J. <cs99001@nitc.ac.in>
> >> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
> >>
> >>
> >> How is this different from a fully associative cache? It would be better
> >> if you could explain it in terms of the address bits used.
> >>
>
>
>

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  5:54 Anoop J.
@ 2003-01-24  6:28 ` David Lang
  2003-01-24  8:51   ` Anoop J.
  0 siblings, 1 reply; 341+ messages in thread
From: David Lang @ 2003-01-24  6:28 UTC (permalink / raw)
  To: Anoop J.; +Cc: linux-mm, linux-kernel

Implementing a fully associative cache eliminates the need for page
coloring, but it has to be implemented in hardware. If you don't have
fully associative caches in your hardware, page coloring helps avoid the
worst-case memory allocations.

From what I have seen of the attempts to implement it, the problem is that
the calculations needed to do page-colored allocations end up costing
enough that they result in a net loss compared to the old method.

David Lang


 On Fri, 24 Jan 2003, Anoop J. wrote:

> Date: Fri, 24 Jan 2003 11:24:24 +0530 (IST)
> From: Anoop J. <cs99001@nitc.ac.in>
> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
>
>
> How is this different from a fully associative cache? It would be better if
> you could explain it in terms of the address bits used.
>
> Thanks
>
> David Lang wrote:
>
> >The idea of page coloring is based on the fact that common cache
> >implementations can't put any page of memory in any line of the cache (such
> >an implementation is possible, but it is more expensive, so it is not
> >commonly done).
> >
> >With such an implementation, if your program happens to use
> >memory that cannot be mapped to half of the cache lines, then effectively
> >the CPU cache is half its rated size for your program. The next time your
> >program runs it may get a more favorable memory allocation, be able to
> >use all of the cache, and therefore run faster.
> >
> >Page coloring is an attempt to take this into account when allocating
> >memory to programs, so that every program gets to use all of the cache.
> >
> >David Lang
> >
> >
> > On Fri, 24 Jan 2003, Anoop J. wrote:
> >
> >>Date: Fri, 24 Jan 2003 10:38:03 +0530 (IST)
> >>From: Anoop J. <cs99001@nitc.ac.in>
> >>To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
> >>
> >>
> >>How does page coloring work? I want its mechanism, not the implementation.
> >>I went through some pages of W.L.Lynch's paper on cache and VM. Still not
> >>able to grasp it.
> >>
> >>
> >>Thanks in advance
> >>
> >>
> >>
> >>
> >
>
>
>
>

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  5:11 ` your mail David Lang
@ 2003-01-24  6:06   ` John Alvord
  2003-01-25  2:29     ` Jason Papadopoulos
  0 siblings, 1 reply; 341+ messages in thread
From: John Alvord @ 2003-01-24  6:06 UTC (permalink / raw)
  To: David Lang; +Cc: Anoop J., linux-kernel, linux-mm

The big challenge in Linux is that several serious attempts to add
page coloring have foundered on the shoals of "no benefit found". It
may be that the typical hardware Linux runs on just doesn't experience
the problem very much.

john


On Thu, 23 Jan 2003 21:11:10 -0800 (PST), David Lang
<david.lang@digitalinsight.com> wrote:

>The idea of page coloring is based on the fact that common cache
>implementations can't put any page of memory in any line of the cache (such
>an implementation is possible, but it is more expensive, so it is not
>commonly done).
>
>With such an implementation, if your program happens to use
>memory that cannot be mapped to half of the cache lines, then effectively
>the CPU cache is half its rated size for your program. The next time your
>program runs it may get a more favorable memory allocation, be able to
>use all of the cache, and therefore run faster.
>
>Page coloring is an attempt to take this into account when allocating
>memory to programs, so that every program gets to use all of the cache.
>
>David Lang
>
>
> On Fri, 24 Jan 2003, Anoop J. wrote:
>
>> Date: Fri, 24 Jan 2003 10:38:03 +0530 (IST)
>> From: Anoop J. <cs99001@nitc.ac.in>
>> To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
>>
>>
>> How does page coloring work? I want its mechanism, not the implementation.
>> I went through some pages of W.L.Lynch's paper on cache and VM. Still not
>> able to grasp it.
>>
>>
>> Thanks in advance
>>
>>
>>
>>


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-24  5:08 Anoop J.
@ 2003-01-24  5:11 ` David Lang
  2003-01-24  6:06   ` John Alvord
  0 siblings, 1 reply; 341+ messages in thread
From: David Lang @ 2003-01-24  5:11 UTC (permalink / raw)
  To: Anoop J.; +Cc: linux-kernel, linux-mm

The idea of page coloring is based on the fact that common cache
implementations can't put just any page of memory in any line of the cache
(such an implementation is possible, but it is more expensive, so it is not
commonly done).

With this implementation, if your program happens to use memory that
cannot be mapped to half of the cache lines, then effectively the CPU
cache is half its rated size for your program. The next time your
program runs it may get a more favorable memory allocation and be able to
use all of the cache, and therefore run faster.

Page coloring is an attempt to take this into account when allocating
memory to programs so that every program gets to use all of the cache.
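
As a rough illustration of the mapping described above, here is a minimal
sketch in C. It assumes a physically indexed cache; struct cache_geom and
the helper names are hypothetical and not taken from any kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache geometry; real values depend on the CPU. */
struct cache_geom {
	size_t cache_size;	/* total cache size in bytes */
	size_t ways;		/* associativity (1 = direct mapped) */
	size_t page_size;	/* page size in bytes */
};

/* Number of page colors: how many pages fit into one way of the cache. */
static size_t num_colors(const struct cache_geom *g)
{
	size_t n = g->cache_size / (g->ways * g->page_size);

	return n ? n : 1;
}

/*
 * Color of a physical page. Pages with the same color compete for the
 * same cache lines; pages with different colors do not.
 */
static size_t page_color(const struct cache_geom *g, uint64_t phys_addr)
{
	return (size_t)((phys_addr / g->page_size) % num_colors(g));
}
```

With a (made-up) 512 KiB, 2-way cache and 4 KiB pages this gives 64 colors,
so physical pages 0 and 64 share a color. A coloring allocator would try to
hand a program pages spread evenly over all colors.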

David Lang


 On Fri, 24 Jan 2003, Anoop J. wrote:

> Date: Fri, 24 Jan 2003 10:38:03 +0530 (IST)
> From: Anoop J. <cs99001@nitc.ac.in>
> To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
>
>
> How does page coloring work? I want its mechanism, not the implementation.
> I went through some pages of W.L.Lynch's paper on cache and VM. Still not
> able to grasp it.
>
>
> Thanks in advance
>
>
>

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2003-01-12 13:28 Philip K.F. Hölzenspies
@ 2003-01-13 16:37 ` Pete Zaitcev
  0 siblings, 0 replies; 341+ messages in thread
From: Pete Zaitcev @ 2003-01-13 16:37 UTC (permalink / raw)
  To: Philip K.F. Hölzenspies; +Cc: linux-kernel

> Linux version 2.4.20 (root@tomwaits) (gcc version 3.2) #1 SMP Sat Jan 11 18:46:51 CET 2003
> Intel MultiProcessor Specification v1.4
>     Virtual Wire compatibility mode.
>[...]
> PCI: Using IRQ router AMD768 [1022/7443] at 00:07.3
> PCI->APIC IRQ transform: (B1,I5,P0) -> 16
> PCI->APIC IRQ transform: (B2,I5,P0) -> 18

> PCI: Enabling device 02:08.2 (0014 -> 0016)
> PCI: No IRQ known for interrupt pin C of device 02:08.2. Probably buggy
> MP table.

I am sorry to say, I cannot help you. This is the department
of Manfred, most likely. The 95% bet is that your BIOS is crap,
and you have to poke ASUS. However, you might want to explore
the possibility of a bug. The best way to do it is to run the "mptable"
program to dump the table and then get someone who can make
sense of the data. Try to figure out who wrote the code
to support the AMD IRQ router. He may be the culprit (5%, but...)

 http://people.redhat.com/zaitcev/linux/mptable-2.0.15a-1.i386.rpm
 http://people.redhat.com/zaitcev/linux/mptable-2.0.15a-1.src.rpm

-- Pete

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-11-11 19:22 David Mosberger
@ 2002-11-12  1:39 ` Rik van Riel
  0 siblings, 0 replies; 341+ messages in thread
From: Rik van Riel @ 2002-11-12  1:39 UTC (permalink / raw)
  To: davidm; +Cc: Mario Smarduch, linux-ia64, linux-kernel

On Mon, 11 Nov 2002, David Mosberger wrote:
> >>>>> On Mon, 11 Nov 2002 10:29:29 -0600, Mario Smarduch <cms063@email.mot.com> said:
>
>   Mario> I know that on some commercial Unix systems there are ways to
>   Mario> cap the CPU utilization by user/group ids are there such
>   Mario> features/patches available on Linux?

> The kernel patches available from this URL are pretty old (up to
> 2.4.6, as far as I could see), and I'm not sure what the future plans
> for PRM on Linux are.  Perhaps someone else can provide more details.

I'm (slowly) working on a per-user fair scheduler on top of Ingo's
O(1) scheduler.  Slowly because it's a fairly complex thing.

Once that is done it should be possible to change the accounting
to other resource containers and generally have fun assigning
priorities, though that is beyond the scope of what I'm trying to
achieve.

cheers,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/		http://distro.conectiva.com/
Current spamtrap:  <a href=mailto:"october@surriel.com">october@surriel.com</a>


^ permalink raw reply	[flat|nested] 341+ messages in thread

* RE: your mail
@ 2002-10-31 18:13 Bloch, Jack
  0 siblings, 0 replies; 341+ messages in thread
From: Bloch, Jack @ 2002-10-31 18:13 UTC (permalink / raw)
  To: 'Tom Bradley'; +Cc: linux-kernel

Thanks very much.

Jack Bloch 
Siemens ICN
phone                (561) 923-6550
e-mail                jack.bloch@icn.siemens.com


-----Original Message-----
From: Tom Bradley [mailto:tojabr@tojabr.com]
Sent: Thursday, October 31, 2002 1:00 PM
To: Bloch, Jack
Cc: linux-kernel@vger.kernel.org
Subject: Re: your mail


They are just regular values. The UL suffix tells the compiler to treat the
number as an unsigned long.


On Thu, 31 Oct 2002, Bloch, Jack wrote:

> I am looking at some sample driver code which shows the usage of some
> unsigned integers 1UL, 2UL, 4UL, 16UL, 64UL, 128UL and 256UL.  I need to
> know what these are defined as. Please excuse my ignorance.
>
> Please CC me directly on any responses.
>
> Jack Bloch
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com
>

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-10-31 15:39 Bloch, Jack
@ 2002-10-31 18:00 ` Tom Bradley
  0 siblings, 0 replies; 341+ messages in thread
From: Tom Bradley @ 2002-10-31 18:00 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

They are just regular values. The UL suffix tells the compiler to treat the
number as an unsigned long.
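
A small sketch of what that means in practice. The FLAG_* names and the
helpers are made up for illustration; only the UL suffix itself is standard C.

```c
#include <limits.h>

/* Bit flags in the style of the driver code being asked about. */
#define FLAG_A	1UL
#define FLAG_B	2UL
#define FLAG_C	256UL

/*
 * Build a mask for bit n. Because the constant is 1UL, the shift is
 * done in unsigned long, so high bits work on 64-bit longs, where a
 * plain int constant "1" would overflow.
 */
static unsigned long bit_mask(unsigned int n)
{
	return 1UL << n;
}

/* Test whether a flag is set in a flags word. */
static int is_set(unsigned long flags, unsigned long flag)
{
	return (flags & flag) != 0;
}
```

So 256UL is simply the value 256 with type unsigned long, which is the
same as bit_mask(8).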


On Thu, 31 Oct 2002, Bloch, Jack wrote:

> I am looking at some sample driver code which shows the usage of some
> unsigned integers 1UL, 2UL, 4UL, 16UL, 64UL, 128UL and 256UL.  I need to
> know what these are defined as. Please excuse my ignorance.
>
> Please CC me directly on any responses.
>
> Jack Bloch
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com
>


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-10-30 12:45 Roberto Fichera
@ 2002-10-30 14:04 ` Richard B. Johnson
  0 siblings, 0 replies; 341+ messages in thread
From: Richard B. Johnson @ 2002-10-30 14:04 UTC (permalink / raw)
  To: Roberto Fichera; +Cc: linux-kernel

On Wed, 30 Oct 2002, Roberto Fichera wrote:

> I have a problem with a DAT drive on a Compaq Proliant ML350 with a PIII 1GHz,
> 1GB RAM, a Smart Array 451 RAID controller with 3 x 9GB HDDs in RAID 5,
> and an internal Adaptec 7899 Ultra160 SCSI controller to which only a
> 12/24GB DAT drive is connected. The installed distribution is RH7.3 with its
> 2.4.18-10 kernel, but I have tried the standard 2.4.19 with the same problem.
> The problem is that the DAT no longer works with Linux. This DAT works
> well on Win2K :-(! Below there are some logs and a 'ps fax' showing a tar in
> D state.
> 
> Does anyone know a solution ?

> 
> Adaptec AIC7xxx driver version: 6.2.6
> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
> Corrupted Serial EEPROM
^^^^^^^^^^^^^^^^^^^^^^^^^

I think your controller has fallen back into survival mode
because it lost its mind. You may want to upgrade the
controller BIOS to fix this problem. Then, see if it handles
tapes okay.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
   Bush : The Fourth Reich of America



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-10-18  2:47   ` Rusty Russell
@ 2002-10-18 21:50     ` Kai Germaschewski
  0 siblings, 0 replies; 341+ messages in thread
From: Kai Germaschewski @ 2002-10-18 21:50 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Daniel Phillips, S, Roman Zippel, linux-kernel

On Fri, 18 Oct 2002, Rusty Russell wrote:

> > I wonder if this new method is going to be mandatory (the only one
> > available) or optional. I think there are two different kinds of users, for
> > one modules which use an API which provides its own infrastructure for
> > dealing with modules via ->owner, on the other hand things like netfilter
> > (that's probably where you are coming from) where calls into a module,
> > which need protection are really frequent.
> 
> Mandatory for interfaces where the function can sleep (or be preempted).

and is not protected by other means (try_inc_mod_count()), I presume.

> > I see that your approach makes frequent calls into the module cheaper, but
> > I'm not totally convinced that the current safe interfaces need to change
> > just to accommodate rare cases like netfilter (there's most likely some
> > more cases like it, but the majority of modules is not).
> 
> They're not changing.  The current users doing try_inc_mod_count() are
> fine.  It's the ones not doing it which are problematic.

Alright, so I'm fine with it ;) (not that it makes a difference, but...)

--Kai



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-10-17 14:56 ` your mail Kai Germaschewski
@ 2002-10-18  2:47   ` Rusty Russell
  2002-10-18 21:50     ` Kai Germaschewski
  0 siblings, 1 reply; 341+ messages in thread
From: Rusty Russell @ 2002-10-18  2:47 UTC (permalink / raw)
  To: Kai Germaschewski; +Cc: Daniel Phillips, S, Roman Zippel, linux-kernel

In message <Pine.LNX.4.44.0210170930410.6301-100000@chaos.physics.uiowa.edu> you write:
> Since I made the mistake of getting involved in this discussion lately,

My condolences. 8)

> I wonder if this new method is going to be mandatory (the only one
> available) or optional. I think there are two different kinds of users, for
> one modules which use an API which provides its own infrastructure for
> dealing with modules via ->owner, on the other hand things like netfilter
> (that's probably where you are coming from) where calls into a module,
> which need protection are really frequent.

Mandatory for interfaces where the function can sleep (or be preempted).

> Note that for the vast majority of modules, dealing with unload races is 
> as simple as setting ->owner, for example filesystems, network drivers.

Yes.  We do not have complete coverage though, this policy would
extend it.

> I see that your approach makes frequent calls into the module cheaper, but
> I'm not totally convinced that the current safe interfaces need to change
> just to accommodate rare cases like netfilter (there's most likely some
> more cases like it, but the majority of modules is not).

They're not changing.  The current users doing try_inc_mod_count() are
fine.  It's the ones not doing it which are problematic.

> Anyway, I may see further problems, but let me check first: Is your count
> supposed to only count users which are currently executing in the module's
> .text, or is it also to count references to data allocated in the module?
> (I.e. when I register_netdev(), does that keep a reference to the module
> even after the code has left the module's .text?)

It's to protect entry to the function, but of course, some interfaces
(eg. filesystems) lend themselves very neatly to batching this at
mount/unmount time.  Data is already protected by the usual means.

At risk of boring you, here's the document from the documentation
patch.  Suggestions welcome.

+Writing Modules and the Interfaces To Be Used By Them: A Gentle Guide.
+Copyright 2002, Rusty Russell IBM Corporation
+
+Modules are running parts of the kernel which can be added, and
+sometimes removed, while the kernel is operational.
+
+There are several delicate issues involved in this procedure which
+indicate special care should be taken.
+
+There are two cases where you need to be careful:
+
+1) Any code which creates an interface for callbacks (ie. almost any
+   function called register_*)
+	=> See Rule #1
+
+2) Any modules which use (old) interfaces which do not obey Rule #1
+	=> See Rule #2
+
+Rule #1: Module-safe Interfaces.  Any interface which allows
+	registration of callbacks, must also allow registration of a
+	"struct module *owner", either in the structure or as a
+	function parameter, and it must use them to protect the
+	callbacks.  See "MAKING INTERFACES SAFE".
+
+Exception #1: As an optimization, you may skip this protection if you
+	   *know* that the callbacks are non-preemptible and never
+	   sleep (eg. registration of interrupt handlers).
+
+
+Rule #2: Modules using unsafe interfaces.  If your module is using any
+	interface which does not obey rule number 1, that means your
+	module functions may be called from the rest of the kernel
+	without the caller first doing a successful try_module_get().
+
+	You must not register a "module_cleanup" handler, and your module
+	cannot be unloaded except by force.  You must be especially
+	careful in this case with initialization: see "INITIALIZING
+	MODULES WHICH USE UNSAFE INTERFACES".
+
+MAKING INTERFACES SAFE
+
+A caller must always call "try_module_get()" on a function pointer's
+owner before calling through that function pointer.  If
+"try_module_get()" returns 0 (false), the function pointer must *not*
+be called, and the caller should pretend that registration does not
+exist: this means the (module) owner is closing down and doesn't want
+any more calls, or in the process of starting up and isn't ready yet.
+
+For many interfaces, this can be optimized by assuming that a
+structure containing function pointers has the same owner, and knowing
+that one function is always called before the others, such as the
+filesystem code which knows a mount must succeed before any other
+methods can be accessed.
+
+You must call "module_put()" on the owner sometime after you have
+called the function(s).
+
+If you cannot make your interface module-safe in this way, you can at
+least split registration into a "reserve" stage and an "activate"
+stage, so that modules can use the interface, even if they cannot
+(easily) unload.
+
+
+INITIALIZING MODULES WHICH USE UNSAFE INTERFACES
+
+Safe interfaces will never enter your module before module_init() has
+successfully finished, but unsafe interfaces may.  The rule is simple:
+your init_module() function *must* succeed (by returning 0) if it has
+successfully used any unsafe interfaces.
+
+So, if you are only using ONE unsafe interface, simply use that
+interface last.  Otherwise you will have to use printk() to report
+failure and leave the module initialized (but possibly useless).
+
+
+
+If you have questions about how to apply this document to your own
+modules, please ask rusty@rustcorp.com.au or linux-kernel@vger.kernel.org.
+
+Thank you,
+Rusty.
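
To make the "MAKING INTERFACES SAFE" rule concrete, here is a minimal
user-space sketch of the calling pattern. It is not the kernel
implementation: struct module, try_module_get() and module_put() below are
simplified stand-ins with no locking or per-CPU counters, and struct handler
and call_handler() are hypothetical names.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's struct module (hypothetical fields). */
struct module {
	int live;		/* 0 while starting up or shutting down */
	unsigned int refs;	/* callers currently inside the module */
};

/* Returns 1 if the module may be entered, 0 if it must not be called. */
static int try_module_get(struct module *m)
{
	if (!m)
		return 1;	/* no owner: nothing to protect */
	if (!m->live)
		return 0;
	m->refs++;
	return 1;
}

static void module_put(struct module *m)
{
	if (m)
		m->refs--;
}

/* A registered callback together with its owner, as Rule #1 asks for. */
struct handler {
	struct module *owner;
	int (*fn)(int arg);
};

/*
 * Call through the function pointer only after a successful
 * try_module_get(); otherwise pretend the registration does not exist.
 * Returns 0 on success, -1 if the owner is not accepting calls.
 */
static int call_handler(struct handler *h, int arg, int *result)
{
	if (!try_module_get(h->owner))
		return -1;
	*result = h->fn(arg);
	module_put(h->owner);
	return 0;
}

/* Example callback that a module might register. */
static int double_it(int x)
{
	return 2 * x;
}
```

Once the owner's live flag is cleared, call_handler() stops entering the
module, and refs drops back to zero as the calls already in progress finish.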

--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-10-17  7:41 Rusty Russell
@ 2002-10-17 14:56 ` Kai Germaschewski
  2002-10-18  2:47   ` Rusty Russell
  0 siblings, 1 reply; 341+ messages in thread
From: Kai Germaschewski @ 2002-10-17 14:56 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Daniel Phillips, S, Roman Zippel, linux-kernel

On Thu, 17 Oct 2002, Rusty Russell wrote:

> > But that one is easy: the zero check just takes the same spinlock as 
> > TRY_INC_MOD_COUNT, then sets can't-increment only in the case the count
> > is zero, considerably simpler than:
> 
> The current spinlock is horrible.  You could use a brlock, of course,
> but I didn't mainly because of code bloat and speed.  My current code
> looks like:
> 
> static inline int try_module_get(struct module *module)
> {
> 	int ret = 1;
> 
> 	if (module) {
> 		unsigned int cpu = get_cpu();
> 		if (likely(module->ref[cpu].live))
> 			local_inc(&module->ref[cpu].counter);
> 		else
> 			ret = 0;
> 		put_cpu();
> 	}
> 	return ret;
> }

Since I made the mistake of getting involved in this discussion lately,
I wonder if this new method is going to be mandatory (the only one
available) or optional. I think there are two different kinds of users, for
one modules which use an API which provides its own infrastructure for
dealing with modules via ->owner, on the other hand things like netfilter
(that's probably where you are coming from) where calls into a module,
which need protection are really frequent.

Note that for the vast majority of modules, dealing with unload races is 
as simple as setting ->owner, for example filesystems, network drivers.

Sure, we need a global lock (unload_lock) when calling into these modules
initially, but these "binding/unbinding" calls are really rare. For
filesystems, they happen once per mount, for network drivers only for
ifconfig up/down. Afterwards, calling into the module (e.g. accessing the
mounted filesystem, xmitting/receiving data) doesn't have any overhead at
all compared to a linked-in filesystem/driver (well, ignore TLB misses).

I don't see a good reason to change this, in particular, since it provides 
useful information to the user, that is the mod_use_count. It means "Is it 
possible to successfully unload the module now?", and since looking at
the count and the actual unload is protected by unload_lock, the unload 
will either succeed basically immediately, or fail with -EBUSY right away.
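
A minimal user-space sketch of that scheme, assuming a single lock standing
in for unload_lock and a plain counter standing in for mod_use_count. The
struct and function names are hypothetical, and the C11 atomic_flag spinlock
is only a stand-in for the kernel's locking.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical model of a module with a use count. */
struct module_model {
	int use_count;	/* models mod_use_count */
	int unloaded;	/* set once the module is gone */
};

/* Models the global unload_lock as a simple spinlock. */
static atomic_flag unload_lock = ATOMIC_FLAG_INIT;

static void unload_lock_acquire(void)
{
	while (atomic_flag_test_and_set(&unload_lock))
		;	/* spin */
}

static void unload_lock_release(void)
{
	atomic_flag_clear(&unload_lock);
}

/* Rare "binding" call (mount, ifconfig up): take a reference under the lock. */
static int model_bind(struct module_model *m)
{
	int ret = 0;

	unload_lock_acquire();
	if (m->unloaded)
		ret = -ENOENT;
	else
		m->use_count++;
	unload_lock_release();
	return ret;
}

/* Rare "unbinding" call (umount, ifconfig down): drop the reference. */
static void model_unbind(struct module_model *m)
{
	unload_lock_acquire();
	m->use_count--;
	unload_lock_release();
}

/* Unload succeeds at once or fails with -EBUSY; it never waits. */
static int model_unload(struct module_model *m)
{
	int ret = 0;

	unload_lock_acquire();
	if (m->use_count > 0)
		ret = -EBUSY;
	else
		m->unloaded = 1;
	unload_lock_release();
	return ret;
}
```

Calls made between bind and unbind never touch the lock, which is why they
cost nothing extra compared to linked-in code.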

I see that your approach makes frequent calls into the module cheaper, but
I'm not totally convinced that the current safe interfaces need to change
just to accommodate rare cases like netfilter (there's most likely some
more cases like it, but the majority of modules is not).

Anyway, I may see further problems, but let me check first: Is your count
supposed to only count users which are currently executing in the module's
.text, or is it also to count references to data allocated in the module?
(I.e. when I register_netdev(), does that keep a reference to the module
even after the code has left the module's .text?)

--Kai


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-10-14  6:28 Maros RAJNOCH /HiaeR Silvanna/
@ 2002-10-14 12:28 ` Dave Jones
  0 siblings, 0 replies; 341+ messages in thread
From: Dave Jones @ 2002-10-14 12:28 UTC (permalink / raw)
  To: Maros RAJNOCH /HiaeR Silvanna/; +Cc: linux-kernel

On Mon, Oct 14, 2002 at 08:28:28AM +0200, Maros RAJNOCH /HiaeR Silvanna/ wrote:
 > Linux version 2.4.2-2 (root@porky.devel.redhat.com) (gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-79)) #1 Sun Apr 8 20:41:30 EDT 2001

1, 2.4.2 is /very/ old, there are updated errata kernel packages at
    ftp.redhat.com
2, Bugs in Red Hat's kernel should be filed in http://bugzilla.redhat.com
   and not in linux-kernel.

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-10-02 19:58 Mark Peloquin
@ 2002-10-02 20:19 ` jbradford
  0 siblings, 0 replies; 341+ messages in thread
From: jbradford @ 2002-10-02 20:19 UTC (permalink / raw)
  To: Mark Peloquin; +Cc: alan, linux-kernel

> On Wed, 2002-10-02 at 17:09, Alan Cox wrote:
> > Look at history - if such a mess got in, it would never get sorted.
> 
> Instead of throwing around vague statements with little
> context like "compost heap" and "such a mess", why don't
> you spell out the specific design points of EVMS that you
> disagree with. The advantages and disadvantages of
> each point can then be discussed.

Yeah, but he is right in any case - look how the IDE mess of 2.5.x, which, frankly, I don't believe was ever as bad as people seem to be saying it was, has put people off testing 2.5.x.  Instead they are waiting for Linus to type

mv linux-2.5.x linux-2.6.0

at which point they think that all remaining bugs will auto-magically correct themselves and the tree is once again safe to use.  WRONG answer!

Simply from the point of view of not wanting to 'scare off' people from a whole tree, (which is so ridiculous, I think I'll go and patent it), and as a result get less testing, we're better off trying to stop weirdness from getting in.

John.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-10-02 12:41 s.stoklossa
@ 2002-10-02 12:51 ` Sam Ravnborg
  0 siblings, 0 replies; 341+ messages in thread
From: Sam Ravnborg @ 2002-10-02 12:51 UTC (permalink / raw)
  To: s.stoklossa; +Cc: mec, linux-kernel

On Wed, Oct 02, 2002 at 02:41:42PM +0200, s.stoklossa@mentopolis.de wrote:
> When I tried to open the ALSA settings, I got the following error message:
> 
>  Q> ./scripts/Menuconfig: MCmenu74: command not found
> 
> Regards,
> 
> Sven
Known problem, try this patch:
copy-n-pasted so may not apply cleanly, try by hand.
PS: Please write in English next time.

        Sam

--- linux/sound/Config.in       2002-10-01 12:09:44.000000000 +0200
+++ linux/sound/Config.in       2002-10-01 12:21:05.000000000 +0200
@@ -31,10 +31,7 @@
 if [ "$CONFIG_SND" != "n" -a "$CONFIG_ARM" = "y" ]; then
   source sound/arm/Config.in
 fi
-if [ "$CONFIG_SND" != "n" -a "$CONFIG_SPARC32" = "y" ]; then
-  source sound/sparc/Config.in
-fi
-if [ "$CONFIG_SND" != "n" -a "$CONFIG_SPARC64" = "y" ]; then
+if [ "$CONFIG_SND" != "n" -a "$CONFIG_SPARC32" = "y" ] || [ "$CONFIG_SND" != "n" -a "$CONFIG_SPARC64" = "y" ] ; then
   source sound/sparc/Config.in
 fi

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-09-21  5:32 Greg KH
@ 2002-09-23 18:35 ` Patrick Mochel
  0 siblings, 0 replies; 341+ messages in thread
From: Patrick Mochel @ 2002-09-23 18:35 UTC (permalink / raw)
  To: Rhoads, Rob
  Cc: Greg KH, linux-kernel, hardeneddrivers-discuss, cgl_discussion


In general, I agree completely with what Greg says (as usual), but I do 
have a few additional comments.

> (I'll skip the intro, and feel good sections and get into the details
> that you lay out, starting in section 2)
> 
> Section 2:
> 2.1:
> 	- do NOT use /proc for driver info.  Use driverfs.
> 	- If you are using a kernel version that does not have driverfs,
> 	  put all /proc driver info under /proc/drivers, which is where
> 	  it belongs.

Actually, they mention using driverfs in Section 3: Instrumentation. I 
can't tell if this was around before, or this was just added. The date is 
the same (16 Aug), but there is no changelog information about the spec. 

I would suggest not using procfs at all, even if driverfs is not available.
If you're using 2.4, backport driverfs, or clone it for your own
filesystem. It's not dependent on the driver model at all, and has been
done at least once before (Greg's pcihpfs).

> Section 3:

> The Common Statistic Manager:

Please drop the term 'Manager' from your nomenclature. It is ambiguous
because of the context in which it's generally used. Windows uses the
term for any collection of kernel or device data and/or kernel policy. 
It's not a bad term, but it fails to make a clear distinction between 
kernel space and user space, which we insist on. 

Only the mechanism for setting the policy should exist in the kernel, and
it may itself be very intelligent. But the policy itself should exist outside
of the kernel.


> 3.2.5.2:
> (I'm not condoning ANY of these functions or code, just trying to point out how
> they should be done properly, if they were to be in the kernel.)
> 	- do not use typedef
> 	- struct stat_info does not need *unit, as that is already
> 	  specified in the scale field, right?
> 	- the stat_value_t union is just a horrible abomination, don't
> 	  do that.

Please do not pass void *. You should only pass type-safe structures. If 
you cannot get that information, you should redesign the API. 
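
To sketch what I mean by type-safe, and assuming nothing about the spec's
real structures (every name below is hypothetical):

```c
#include <stdint.h>

/* One plain struct per kind of statistic: no typedef, no union, no void *. */
struct counter_stat {
	const char *name;
	uint64_t value;		/* only ever goes up */
};

struct gauge_stat {
	const char *name;
	int64_t level;		/* may go up or down */
};

/*
 * Each helper takes exactly the structure it operates on, so the
 * compiler rejects a gauge passed where a counter is expected.
 */
static void counter_add(struct counter_stat *c, uint64_t n)
{
	c->value += n;
}

static void gauge_set(struct gauge_stat *g, int64_t level)
{
	g->level = level;
}
```

With a void * argument or a catch-all union, the same mistake would compile
silently and only show up at run time.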

> 3.4 Event logging:
> 	- I'm not even going to touch this, sorry.

There are a lot of topics in this spec, most of which are irrelevant to 
actually hardening drivers. They may be features dependent on your APIs, 
but they are completely optional and may hinder acceptance of your primary 
objectives. 

Event logging is definitely one of them, esp. with a function like

evl_log_event_string(  
	ME_EVENT_BUCKET_EMPTY, 
	LOG_WARNING, 
	"Leaky bucket exception (bucket empty):\ 
	Bucket_Level <= Observed_Value - Last_Value\ 
	|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s\ 
	|%s=%u|%s=%u|%s=%u|%s=%u|%s=%u|%s=%u|", 
	RMGT_FacilityIDAttrStr,         RMUUID, 
	RMGT_SubsystemIDAttrStr,    SUBSYSTEM_UUID, 
	RMGT_SubsystemNameAttrStr,  subsystem_name, 
	RMGT_ResourceIDAttrStr,         resource_id, 
	RMGT_ResourceNameAttrStr,   resource_name, 
	ME_MonitorIDAttrStr,        monitor_uuid,  
	ME_StatisticIDAttrStr,         statistic_id, 
	ME_StatisticNameAttrStr,    statistic_name,  
	ME_BucketSizeAttrStr,       bucketsz,  
	ME_FillValueAttrStr,            fillval, 
	ME_FillIntervalAttrStr,         fillint, 
	ME_BucketLevelAttrStr,      bucketlvl, 
	ME_ObservedValueAttrStr,    obsval, 
	ME_LastValueAttrStr,        lastval); 


> In summary, I think that a lot of people have spent a lot of time in
> creating this document, and the surrounding code that matches this
> document.  I really wish that a tiny bit of that effort had gone into
> contacting the Linux kernel development community, and asking to work
> with them on a project like this.  Due to that not happening, and by
> looking at the resultant spec and code, I'm really afraid the majority
> of that time and effort will have been wasted.

I completely agree. There is definitely good intention in some aspects of 
the spec, and definitely in the effort put forth to support this type of 
work. But, in order to gain the support of kernel developers, or even the 
blessing of a few, you should be working with them on the design from the 
beginning.

Designing APIs is hard. Doing it well is very hard. I'm not claiming I've 
done a stellar job, but I have at least learned that. I've made a lot of 
poor design decisions, many of which are also evident in your code 
descriptions and examples. I can't tell you how many times I've rewritten 
things over and over and over because someone hated them (usually Linus or 
Greg).

There are people that are willing to help, as we are trying to do. But,
it's much easier if you do things gradually and get that help from the
beginning.

> What do I think can be salvaged?  Diagnostics are a good idea, and I
> think they fit into the driver model in 2.5 pretty well.  A lot of
> kernel janitoring work could be done by the CG team to clean up, and
> harden (by applying the things in section 2) the existing kernel
> drivers.  That effort alone would go a long way in helping the stability
> of Linux, and also introduce the CG developers into the kernel community
> as active, helping developers.  It would allow the CG developers to
> learn from the existing developers, as we must be doing something right
> for Linux to be working as well as it does :)

Which kernel are you targeting? I didn't see it in the spec, though I 
could have easily missed it. CGL is based on 2.4, so I would assume that. 
But, I would think the ideal choice would be to start in 2.5 and backport 
it to 2.4. 

If that's the case, how do you intend to work with the driver model? 
There will be quite a bit of code and interface duplication between
your code and the driver model. I can see ways to support many of the
things you want in a relatively easy manner, and not punish the common
user or developer; but the margin is too small to write the answer... ;)

Also, there are many projects in areas similar to what you're doing:
diagnostics, HA, etc, etc. It would be nice to see some collaboration
between the developers of those projects instead of having many disparate 
projects with similar goals. 


	-pat







^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-09-14 12:39 Paolo Ciarrocchi
@ 2002-09-14 17:05 ` Rik van Riel
  0 siblings, 0 replies; 341+ messages in thread
From: Rik van Riel @ 2002-09-14 17:05 UTC (permalink / raw)
  To: Paolo Ciarrocchi; +Cc: linux-kernel, conman

On Sat, 14 Sep 2002, Paolo Ciarrocchi wrote:

> I think that only the _memload_ test is not
> working with 2.5.*, am I wrong?

You're right, the memload test doesn't work with 2.5 but
needs the following patch...

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/

Spamtraps of the month:  september@surriel.com trac@trac.org


--- contest-0.1/mem_load.c.orig	2002-09-13 23:36:47.000000000 -0400
+++ contest-0.1/mem_load.c	2002-09-14 11:10:07.000000000 -0400
@@ -47,24 +47,25 @@
   switch (type) {

   case 0: /* RAM */
-    if ((position = strstr(buffer, "Mem:")) == (char *) NULL) {
-      fprintf (stderr, "Can't parse \"Mem:\" in /proc/meminfo\n");
+    if ((position = strstr(buffer, "MemTotal:")) == (char *) NULL) {
+      fprintf (stderr, "Can't parse \"MemTotal:\" in /proc/meminfo\n");
       exit (-1);
     }
-    sscanf (position, "Mem:  %ul", &size);
+    sscanf (position, "MemTotal:  %ul", &size);
     break;

   case 1:
-    if ((position = strstr(buffer, "Swap:")) == (char *) NULL) {
-      fprintf (stderr, "Can't parse \"Swap:\" in /proc/meminfo\n");
+    if ((position = strstr(buffer, "SwapTotal:")) == (char *) NULL) {
+      fprintf (stderr, "Can't parse \"SwapTotal:\" in /proc/meminfo\n");
       exit (-1);
     }
-    sscanf (position, "Swap: %ul", &size);
+    sscanf (position, "SwapTotal: %ul", &size);
     break;

   }

-  return (size / MB);
+  /* convert from kB to MB */
+  return (size / KB);

 }

--- contest-0.1/mem_load.h.orig	2002-09-14 11:09:28.000000000 -0400
+++ contest-0.1/mem_load.h	2002-09-14 11:09:42.000000000 -0400
@@ -24,6 +24,7 @@

 #define MAX_BUF_SIZE 1024          /* size of /proc/meminfo in bytes */
 #define MB (1024 * 1024)           /* 2^20 bytes */
+#define KB 1024
 #define MAX_MEM_IN_MB (1024 * 64)  /* 64 GB */

 /* Tuning parameter.  Increase if you are getting an 'unreasonable' load
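
For reference, the parsing step the patch adjusts can be sketched on its own
like this. It is not the contest code: meminfo_mb() is a hypothetical helper,
it assumes the "MemTotal:      123456 kB" line format, and it reads the value
with "%lu", the conversion for an unsigned long.

```c
#include <stdio.h>
#include <string.h>

#define KB 1024UL

/*
 * Find "key" (e.g. "MemTotal:") in a buffer holding /proc/meminfo text
 * and return its value converted from kB to MB, or -1 if the key is
 * missing or no number follows it.
 */
static long meminfo_mb(const char *buffer, const char *key)
{
	const char *position = strstr(buffer, key);
	unsigned long size_kb;

	if (position == NULL)
		return -1;
	if (sscanf(position + strlen(key), "%lu", &size_kb) != 1)
		return -1;
	return (long)(size_kb / KB);
}
```

Looking up "Mem:" in a 2.5-style buffer returns -1, which is the failure
the memload test ran into before the patch.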


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
       [not found] <200208312335.g7VNZmk37659@sullivan.realtime.net>
@ 2002-09-01  9:53 ` Krzysiek Taraszka
  0 siblings, 0 replies; 341+ messages in thread
From: Krzysiek Taraszka @ 2002-09-01  9:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-kernel

On Sat, 31 Aug 2002, Milton Miller wrote:

> At Fri Aug 30 2002 - 12:54:37 EST Krzysiek Taraszka (dzimi@pld.org.pl) wrote:
> > Great work, but in 2.2.22rc2 powerpc is still broken.
> > First of all, the sources have a lot of unused stuff.
> > For example, look at this:
> > 
> > [dzimi@cyborg linux]$ rgrep -n -R '*.*' 'CONFIG_PPC64' . 
> ...
> 
> Doesn't sound like -rc (release candidate) changes.

Well yes, in 2.2.10 someone tried to add CONFIG_PPC64 support to the 2.2
kernel.
In 2.2.11 someone added CONFIG_PPC64 to Config.in, but in 2.2.12 or
2.2.13 someone removed it ...
(without removing it from directories other than arch/ppc/kernel/)
 
> > Second, kernel 2.2.21 still has time init problems in the symbios driver
> > on the powerpc platform.
> > I am sending you my ugly hack, which works, but IMHO it's ugly ;) I need to do
> > it correctly.
> 
> > Third, the kernel for powerpc boots and works on a g3-266, but on a g3-333 it oopses ...
> > (kernel traps; the kernel wrote: Caused by SRR1 or something like that; in 2.3
> > I saw a #define FIX_SRR1 macro ...)
> 
> Well, SRR1 doesn't cause traps, but it does help tell you why they occurred.
> And the FIX_SRR1 stuff isn't the solution either if you look at it closer.
> How about a decoded oops?  Also, you didn't say what platform you were using.

I used g3 (pmac). My based system was PLD with 2.4.18 tree.
I used gcc-2.95.4 to build 2.2.21 vmlinux.

> As far as the open-pic changes you posted, how about explaining what your
> trying to fix (partly hidden by the rename and move to chrp_setup.c from
> open_pic.c)?

I tried to fix a problem which occurs on my IBM RS/6000 (model b50).
Openpic can't properly initialize my SCSI system (sym82c8xx SCSI
driver). There are some time init problems.

Oh I forgot, 2.2.22rc1/2 or any kernel >= 2.2.16 (2.2 tree) doesn't work on my
IBM RS/6000 (b50).
A build with egcs works, but runs slowly (Bogomips: 16MHz!) and won't reboot
or shutdown -h now.
The same code built with gcc oopses (Kernel Exception, looks like the openpic
allocation address).
I'll post the oops later.

> I see you are wrapping the 8259 checks, but it also refers to a few new
> functions/macros I didn't see defined.

Hmm, yes, that's why my patch is ugly. I want to do this correctly.
 
> How about discussing these problems and patches over at
> linuxppc-dev@lists.linuxppc.org ? (I set the reply-to there).

OK, but first of all I should subscribe there.

Krzysiek Taraszka			(dzimi@pld.org.pl)




* Re: your mail 
  2002-08-30 18:43 Bloch, Jack
  2002-08-30 18:55 ` your mail Matthew Dharm
  2002-08-30 19:22 ` Andreas Dilger
@ 2002-08-31  0:12 ` David Woodhouse
  2 siblings, 0 replies; 341+ messages in thread
From: David Woodhouse @ 2002-08-31  0:12 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Bloch, Jack, linux-kernel



adilger@clusterfs.com said:
>  I would instead suggest using a filesystem like JFFS2 for flash
> devices. This is journaled like ext3, but it also has the benefit of
> doing wear levelling on the device, which otherwise will probably wear
> out the superblock part of the flash rather quickly. 

He said he's using CompactFlash. CompactFlash is not flash, as far as we're
concerned: it is an IDE drive. You may think it has flash inside it; we
couldn't possibly comment.

In fact, it generally has a kind of pseudo-filesystem internally which it 
uses to emulate a block device with 512-byte sectors. It may do its own 
wear-levelling; the manufacturers are often quite cagey about whether it 
actually does or not. Draw your own conclusions about that if you will.

It's quite common to find that this internal pseudo-filesystem _itself_ gets
screwed on power failures. This tends to manifest itself as unrecoverable 
I/O errors.

There is no fundamental reason why every CF card should have these 
problems, in the same way as there is no fundamental reason why all PC 
BIOSes should be crap. But the same expectations apply.

If you want to pass power-fail testing, I would recommend you switch to
using real flash. JFFS2 on real flash has survived days of stress testing
whilst being power cycled randomly every ~5 minutes. The same tests were 
observed to destroy CF cards¹.

CF is bog-roll technology. It's disposable storage designed for temporary
use in stuff like cameras -- not for real computing. Think of it like a
floppy disc and you won't go far wrong.

--
dwmw2
¹ http://www.embeddedlinuxworks.com/articles/jffs_guide.html²
² Constant reboots no longer screw the wear levelling, as reported there.



* Re: your mail
  2002-08-30 18:43 Bloch, Jack
  2002-08-30 18:55 ` your mail Matthew Dharm
@ 2002-08-30 19:22 ` Andreas Dilger
  2002-08-31  0:12 ` David Woodhouse
  2 siblings, 0 replies; 341+ messages in thread
From: Andreas Dilger @ 2002-08-30 19:22 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

On Aug 30, 2002  14:43 -0400, Bloch, Jack wrote:
> I have an embedded system running a 2.4.18-3 kernel. It runs from a 256MB
> compact flash disk (emulates an IDE interface). I am using an EXT2
> filesystem. During some power-off/power-on testing, the disk check failed.
> It dropped me to a shell and I had to run e2fsck -cfv to correct this
> problem. This is all good and well in a lab environment, but in reality,
> there is nobody there to perform the repair (running system is not equipped
> with keyboard and monitor). Is there any way to invoke e2fsck automatically
> or inhibit the failure detection mechanism? Please CC me directly on any
> responses.

I would instead suggest using a filesystem like JFFS2 for flash devices.
This is journaled like ext3, but it also has the benefit of doing wear
levelling on the device, which otherwise will probably wear out the
superblock part of the flash rather quickly.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/



* Re: your mail
  2002-08-30 18:43 Bloch, Jack
@ 2002-08-30 18:55 ` Matthew Dharm
  2002-08-30 19:22 ` Andreas Dilger
  2002-08-31  0:12 ` David Woodhouse
  2 siblings, 0 replies; 341+ messages in thread
From: Matthew Dharm @ 2002-08-30 18:55 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

I would simply recommend switching to ext3, where these types of errors
generally don't occur.

Oh, and if you just edit your initscripts, you can do anything you want.

Matt

On Fri, Aug 30, 2002 at 02:43:52PM -0400, Bloch, Jack wrote:
> I have an embedded system running a 2.4.18-3 kernel. It runs from a 256MB
> compact flash disk (emulates an IDE interface). I am using an EXT2
> filesystem. During some power-off/power-on testing, the disk check failed.
> It dropped me to a shell and I had to run e2fsck -cfv to correct this
> problem. This is all good and well in a lab environment, but in reality,
> there is nobody there to perform the repair (running system is not equipped
> with keyboard and monitor). Is there any way to invoke e2fsck automatically
> or inhibit the failure detection mechanism? Please CC me directly on any
> responses.
> 
> 
> Thanks in advance....
> 
> Jack Bloch 
> Siemens ICN
> phone                (561) 923-6550
> e-mail                jack.bloch@icn.siemens.com
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

My mother not mind to die for stoppink Windows NT!  She is rememberink 
Stalin!
					-- Pitr
User Friendly, 9/6/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]


* Re: your mail
  2002-08-27 18:22 Steffen Persvold
@ 2002-08-27 19:27 ` Willy Tarreau
  0 siblings, 0 replies; 341+ messages in thread
From: Willy Tarreau @ 2002-08-27 19:27 UTC (permalink / raw)
  To: Steffen Persvold; +Cc: linux-kernel

On Tue, Aug 27, 2002 at 08:22:03PM +0200, Steffen Persvold wrote:
 
> I have an idea that this happens because the packets are coming out of 
> order into the receiving node (i.e. the bonding device is alternating 
> between each interface when sending, and when the receiving node gets the 
> packets it is possible that the first interface gets packets number 0, 2, 
> 4 and 6 in one interrupt and queues them to the network stack before packets 
> 1, 3, 5 are handled on the other interface).

You have put your finger on exactly this common problem.
You can use the XOR bonding mode (modprobe bonding mode=2), which uses a
hash of MAC addresses to select the outgoing interface. This is interesting
if you have lots of L2 hosts on the same network switch.
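
To see why XOR mode avoids the reordering, here is a minimal sketch of the
idea, not the bonding driver's actual code. The helper names (mac_to_bytes,
xor_slave_index) and the exact hash (XOR of the last byte of each MAC
address, modulo the number of slaves) are assumptions made for illustration;
check the bonding documentation for what your kernel really does.

```python
def mac_to_bytes(mac: str) -> bytes:
    """Parse 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(p, 16) for p in parts)


def xor_slave_index(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Pick an outgoing slave from a hash of the two MAC addresses.

    Hypothetical hash: XOR the last byte of each address, then take the
    result modulo the number of slave interfaces.
    """
    src = mac_to_bytes(src_mac)
    dst = mac_to_bytes(dst_mac)
    return (src[5] ^ dst[5]) % slave_count


# Made-up addresses: one local host talking to two peers over two slaves.
local = "00:50:56:00:00:10"
for peer in ("00:50:56:00:00:21", "00:50:56:00:00:22"):
    print(peer, "-> slave", xor_slave_index(local, peer, 2))
```

Every frame between a given pair of hosts hashes to the same interface, so
it cannot overtake an earlier frame sent on the other link. The flip side is
that two hosts talking only to each other never use more than one link,
which is why this mode pays off mainly with many peers.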

Or if you have a few hosts on the same switch, you'd better use the "nexthop"
parameter of "ip route". IIRC, it should be something like :
  ip route add <destination> nexthop dev eth0 nexthop dev eth1
but read the help, I'm not certain.

Cheers,
Willy



* Re: your mail
  2002-08-23 16:12     ` Bill Unruh
  2002-08-23 20:33       ` Mike Dresser
@ 2002-08-25  2:05       ` Mike Dresser
  1 sibling, 0 replies; 341+ messages in thread
From: Mike Dresser @ 2002-08-25  2:05 UTC (permalink / raw)
  To: Bill Unruh; +Cc: linux-ppp, linux-kernel

On Fri, 23 Aug 2002, Bill Unruh wrote:

> You could try running the little program I got basically from Carlson in
> http://axion.physics.ubc.ca/modem-chk.html
> to try resetting the serial line before the next attempt (eg, put it into
> /etc/ppp/ip-down).
> Not sure if this is the problem however.

It died again.

I'm going to go out there and swap out the modem with a different model if
I have one.  If that doesn't fix it, I'll get that VIA garbage out of the
system and replaced with a proper Intel 815 based motherboard.

Mike



* Re: your mail
  2002-08-23 16:12     ` Bill Unruh
@ 2002-08-23 20:33       ` Mike Dresser
  2002-08-25  2:05       ` Mike Dresser
  1 sibling, 0 replies; 341+ messages in thread
From: Mike Dresser @ 2002-08-23 20:33 UTC (permalink / raw)
  To: Bill Unruh; +Cc: linux-ppp, linux-kernel

On Fri, 23 Aug 2002, Bill Unruh wrote:

>
> OK, that problem is usually a "hardware" problem-- ie the hardware is
> not responding properly to the ioctl request. This could be because
> there is no hardware there (eg trying to open a serial port which does
> not exist on the machine), or it is busy, or it has been left in some weird
> state. The last sounds most likely here-- eg the serial port on your
> modem thinks it is still busy.
>
> You could try running the little program I got basically from Carlson in
> http://axion.physics.ubc.ca/modem-chk.html
> to try resetting the serial line before the next attempt (eg, put it into
> /etc/ppp/ip-down).
> Not sure if this is the problem however.

Another 7 minutes, and I'll know if this worked or not.

Another data point I just thought of: if I poff chatham, and then pon
chatham, that actually works.

It just hung up.

And redialed.

And connected properly.

Thank you so very much, it looks like your reset-serial did the job.

I'll implement it on future machines, just in case the same problem
happens, rather than pray it works.

I saw a lot of postings on the 5160 USR modem on the serial-pci-info list,
perhaps it's something to do with this modem.

I'll know for sure at 10:30 this evening whether it is definitely working or
not.  I was logged in on the other line to monitor the syslog, and bring
up the internet line, just in case.

Thanks again,

Mike



* Re: your mail
  2002-08-23 15:26   ` Mike Dresser
@ 2002-08-23 16:12     ` Bill Unruh
  2002-08-23 20:33       ` Mike Dresser
  2002-08-25  2:05       ` Mike Dresser
  0 siblings, 2 replies; 341+ messages in thread
From: Bill Unruh @ 2002-08-23 16:12 UTC (permalink / raw)
  To: Mike Dresser; +Cc: linux-ppp, linux-kernel


OK, that problem is usually a "hardware" problem-- ie the hardware is
not responding properly to the ioctl request. This could be because
there is no hardware there (eg trying to open a serial port which does
not exist on the machine), or it is busy, or it has been left in some weird
state. The last sounds most likely here-- eg the serial port on your
modem thinks it is still busy.

You could try running the little program I got basically from Carlson in
http://axion.physics.ubc.ca/modem-chk.html
to try resetting the serial line before the next attempt (eg, put it into
/etc/ppp/ip-down).
Not sure if this is the problem however.

On Fri, 23 Aug 2002, Mike Dresser wrote:

> On Fri, 23 Aug 2002, Bill Unruh wrote:
>
> > Well, it would be good if you actually told us what problem you were
> > describing. Is this a new connection attempt after the first hang up?
> > What?
> >
> > What repeats over and over-- I see no repeat.
> >
> > You also do not tell us info like what kind of modem is this-- external,
> > internal, serial, usb, pci, winmodem,....
> >
> > I assume what you are referring to is the "inappropriate ioctl" line.
> > This indicates a hardware problem.
> >
> > Actually, it looks to me like another pppd is up on the line. Those
> > EchoReq are another pppd receiving stuff on an open pppd on another
> > line. More information on what it is you are trying to do, on what your
> > system is, and what the problem is might get you help.
> >
>
> Sorry.
>
> It's a new connection from the persist option.  The exact same message
> repeats for every dial out it attempts.
>
> It's a PCI 3com 56k Sportster.  It's a hardware modem.
>
> There is sometimes another pppd up on ttyS01.
>
> Here's the setup:
>
> There is an external modem on ttyS01, irq 3, that dials in occasionally as
> needed.
>
> There is an internal PCI modem on ttyS04, irq 5, that dials in permanently
> to the ISP.
>
> Every 6 hours, the ISP enforces the 6 hour hangup rule they have.
>
> The modem is set to persist, maxfail 0.  It is not able to redial, and
> keeps giving the error message that I pasted.
>
> Under 2.2.x, this functioned properly.
>
> System is a VIA VT82C693A/694x [Apollo PRO133x] based motherboard, from
> Giga-byte, if I remember correctly.  Celeron 533.
>
> Sorry about the too brief error message, I fell into my "it makes sense to
> me the way it is" trap.
>
> Mike
>
>

-- 
William G. Unruh        Canadian Institute for          Tel: +1(604)822-3273
Physics&Astronomy          Advanced Research            Fax: +1(604)822-5324
UBC, Vancouver,BC        Program in Cosmology           unruh@physics.ubc.ca
Canada V6T 1Z1               and Gravity           www.theory.physics.ubc.ca/
For step by step instructions about setting up ppp under Linux, see
            http://www.theory.physics.ubc.ca/ppp-linux.html



* Re: your mail
  2002-08-23 15:12 ` your mail Bill Unruh
@ 2002-08-23 15:26   ` Mike Dresser
  2002-08-23 16:12     ` Bill Unruh
  0 siblings, 1 reply; 341+ messages in thread
From: Mike Dresser @ 2002-08-23 15:26 UTC (permalink / raw)
  To: Bill Unruh; +Cc: linux-ppp, linux-kernel

On Fri, 23 Aug 2002, Bill Unruh wrote:

> Well, it would be good if you actually told us what problem you were
> describing. Is this a new connection attempt after the first hang up?
> What?
>
> What repeats over and over-- I see no repeat.
>
> You also do not tell us info like what kind of modem is this-- external,
> internal, serial, usb, pci, winmodem,....
>
> I assume what you are referring to is the "inappropriate ioctl" line.
> This indicates a hardware problem.
>
> Actually, it looks to me like another pppd is up on the line. Those
> EchoReq are another pppd receiving stuff on an open pppd on another
> line. More information on what it is you are trying to do, on what your
> system is, and what the problem is might get you help.
>

Sorry.

It's a new connection from the persist option.  The exact same message
repeats for every dial out it attempts.

It's a PCI 3com 56k Sportster.  It's a hardware modem.

There is sometimes another pppd up on ttyS01.

Here's the setup:

There is an external modem on ttyS01, irq 3, that dials in occasionally as
needed.

There is an internal PCI modem on ttyS04, irq 5, that dials in permanently
to the ISP.

Every 6 hours, the ISP enforces the 6 hour hangup rule they have.

The modem is set to persist, maxfail 0.  It is not able to redial, and
keeps giving the error message that I pasted.

Under 2.2.x, this functioned properly.

System is a VIA VT82C693A/694x [Apollo PRO133x] based motherboard, from
Giga-byte, if I remember correctly.  Celeron 533.

Sorry about the too brief error message, I fell into my "it makes sense to
me the way it is" trap.

Mike



* Re: your mail
  2002-08-23 14:45 Mike Dresser
@ 2002-08-23 15:12 ` Bill Unruh
  2002-08-23 15:26   ` Mike Dresser
  0 siblings, 1 reply; 341+ messages in thread
From: Bill Unruh @ 2002-08-23 15:12 UTC (permalink / raw)
  To: Mike Dresser; +Cc: linux-ppp, linux-kernel

Well, it would be good if you actually told us what problem you were
describing. Is this a new connection attempt after the first hang up?
What?

What repeats over and over-- I see no repeat.

You also do not tell us info like what kind of modem is this-- external,
internal, serial, usb, pci, winmodem,....

I assume what you are referring to is the "inappropriate ioctl" line.
This indicates a hardware problem.

Actually, it looks to me like another pppd is up on the line. Those
EchoReq are another pppd receiving stuff on an open pppd on another
line. More information on what it is you are trying to do, on what your
system is, and what the problem is might get you help.


On Fri, 23 Aug 2002, Mike Dresser wrote:

> I'm having problems with pppd under 2.4.19, with pppd 2.4.1
>
> I can establish a new connection, and no problems.  But once the ISP on
> the other end hangs up, this is what i get in my syslog.
> Repeats over and over.  I saw a few google postings about this, but those
> were back in _1999_, so I would think they were fixed by now.
>
> Doesn't matter if PPP is compiled in with the kernel, or modules.
>
> I'm running Debian 3.0(woody)
>
> This worked under Debian 2.2 and kernel 2.2.21
>
> Aug 23 10:25:55 tilburybackup chat[9825]: abort on (BUSY)
> Aug 23 10:25:55 tilburybackup chat[9825]: abort on (NO CARRIER)
> Aug 23 10:25:55 tilburybackup chat[9825]: abort on (VOICE)
> Aug 23 10:25:55 tilburybackup chat[9825]: abort on (NO DIALTONE)
> Aug 23 10:25:55 tilburybackup chat[9825]: abort on (NO DIAL TONE)
> Aug 23 10:25:55 tilburybackup chat[9825]: abort on (NO ANSWER)
> Aug 23 10:25:55 tilburybackup chat[9825]: send (ATZ^M)
> Aug 23 10:25:55 tilburybackup chat[9825]: expect (OK)
> Aug 23 10:25:55 tilburybackup chat[9825]: ATZ^M^M
> Aug 23 10:25:55 tilburybackup chat[9825]: OK
> Aug 23 10:25:55 tilburybackup chat[9825]:  -- got it
> Aug 23 10:25:55 tilburybackup chat[9825]: send (ATDT3806600^M)
> Aug 23 10:25:55 tilburybackup chat[9825]: expect (CONNECT)
> Aug 23 10:25:55 tilburybackup chat[9825]: ^M
> Aug 23 10:26:11 tilburybackup pppd[9804]: rcvd [LCP EchoReq id=0x4 magic=0x96835d5b]
> Aug 23 10:26:11 tilburybackup pppd[9804]: sent [LCP EchoRep id=0x4 magic=0x72c56787]
> Aug 23 10:26:11 tilburybackup pppd[9804]: sent [LCP EchoReq id=0x4 magic=0x72c56787]
> Aug 23 10:26:11 tilburybackup pppd[9804]: rcvd [LCP EchoRep id=0x4 magic=0x96835d5b]
> Aug 23 10:26:16 tilburybackup chat[9825]: ATDT3806600^M^M
> Aug 23 10:26:16 tilburybackup chat[9825]: CONNECT
> Aug 23 10:26:16 tilburybackup chat[9825]:  -- got it
> Aug 23 10:26:16 tilburybackup chat[9825]: send (\d)
> Aug 23 10:26:17 tilburybackup pppd[329]: Serial connection established.
> Aug 23 10:26:17 tilburybackup pppd[329]: using channel 1179
> Aug 23 10:26:17 tilburybackup pppd[329]: Couldn't create new ppp unit: Inappropriate ioctl for device
> Aug 23 10:26:18 tilburybackup pppd[329]: Hangup (SIGHUP)
>
> tilburybackup:/etc/ppp# egrep -v '#|^ *$' /etc/ppp/options
> asyncmap 0
> auth
> crtscts
> lock
> hide-password
> modem
> proxyarp
> lcp-echo-interval 30
> lcp-echo-failure 4
> noipx
> persist
> maxfail 0
>
> ttyS04 at port 0xcc00 (irq = 5) is a 16550A
>
> 00:0b.0 Serial controller: US Robotics/3Com 56K FaxModem Model 5610 (rev 01) (prog-if 02 [16550])
>         Subsystem: US Robotics/3Com USR 56k Internal FAX Modem (Model 2977)
>         Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
>         Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
>         Interrupt: pin A routed to IRQ 5
>         Region 0: I/O ports at cc00 [size=8]
>         Capabilities: [dc] Power Management version 2
>                 Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0+,D1-,D2+,D3hot+,D3cold+)
>                 Status: D0 PME-Enable- DSel=0 DScale=2 PME-
>
> Any ideas?
>
> Mike
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

-- 
William G. Unruh        Canadian Institute for          Tel: +1(604)822-3273
Physics&Astronomy          Advanced Research            Fax: +1(604)822-5324
UBC, Vancouver,BC        Program in Cosmology           unruh@physics.ubc.ca
Canada V6T 1Z1               and Gravity           www.theory.physics.ubc.ca/
For step by step instructions about setting up ppp under Linux, see
            http://www.theory.physics.ubc.ca/ppp-linux.html



* Re: your mail
  2002-08-19 21:29 Bloch, Jack
@ 2002-08-20  6:47 ` Philipp Matthias Hahn
  0 siblings, 0 replies; 341+ messages in thread
From: Philipp Matthias Hahn @ 2002-08-20  6:47 UTC (permalink / raw)
  To: Bloch, Jack; +Cc: linux-kernel

Hello Jack!

On Mon, Aug 19, 2002 at 05:29:26PM -0400, Bloch, Jack wrote:
> Are there any plans to do an SCTP (RFC 2960) implementation for Linux?
> Please CC me directly on any responses.

The Linux Kernel 2.5 Status page at
	http://www.kernelnewbies.org/status/latest.html
lists the following URL:
	http://www.sf.net/projects/lksctp

BYtE
Philipp
-- 
  / /  (_)__  __ ____  __ Philipp Hahn
 / /__/ / _ \/ // /\ \/ /
/____/_/_//_/\_,_/ /_/\_\ pmhahn@titan.lahn.de


* Re: your mail
  2002-08-16  7:51 Misha Alex
@ 2002-08-16  9:52 ` Willy Tarreau
  0 siblings, 0 replies; 341+ messages in thread
From: Willy Tarreau @ 2002-08-16  9:52 UTC (permalink / raw)
  To: Misha Alex; +Cc: linux-kernel

Hello !

On Fri, Aug 16, 2002 at 07:51:37AM +0000, Misha Alex wrote:
 
>      Also I tried the linear addressing linear = c*H*S + h*S + s - 1. But 
> linear or linear*512 never gave me the exact byte offset to seek.
> 
> I am working in Linux and using a hex editor to seek. How many bytes exactly 
> should I seek to find the extended partition? I read the MBR and found 
> the extended partition.
> 00 01 01 00 02 fe 3f 01 3f 00 00 00 43 7d 00 00
> 80 00 01 02 0b fe bf 7e 82 7d 00 00 3d 26 9c 00
> 00 00 81 7f 0f fe ff ff bf a3 9c 00 f1 49 c3 01
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I haven't played with this for a long time, but I still have some memory
about this. First, when an offset is higher than 8GB, there's no way to
code it with the bios' CHS scheme as you find it in the partition table.
I see that you know how to decode this, so set all the CHS bits to ones
and look at the offset. For this reason, we often use only the size to
locate these partitions. If I recall correctly, the last 4 bytes of each
entry are the size in sectors. For example, hda2 is 9c263d sectors long,
which equals 5.2 GB. You'll notice that bytes 8 to 11 of each entry
are nearly equal to the size of the previous partition. They should be
the start offset in sectors. So in this case, hda3 begins at 9ca3bf
(byte 5255953920), and is 1c349f1 sectors long (15.1 GB).
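
The arithmetic above can be checked with a short sketch. It assumes the
classic 16-byte MBR entry layout (status, 3-byte CHS start, type, 3-byte CHS
end, then two little-endian 32-bit values for the start sector and the
sector count) and 512-byte sectors; the names parse_entries and TABLE_HEX
are made up for the example.

```python
import struct

SECTOR_SIZE = 512  # bytes per sector, assumed

# The four 16-byte partition entries quoted in this thread.
TABLE_HEX = """
00 01 01 00 02 fe 3f 01 3f 00 00 00 43 7d 00 00
80 00 01 02 0b fe bf 7e 82 7d 00 00 3d 26 9c 00
00 00 81 7f 0f fe ff ff bf a3 9c 00 f1 49 c3 01
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""


def parse_entries(table_hex):
    """Return one dict per non-empty partition entry."""
    raw = bytes.fromhex("".join(table_hex.split()))
    entries = []
    for offset in range(0, len(raw), 16):
        # status, CHS start, type, CHS end, start LBA, sector count
        status, _chs_first, ptype, _chs_last, start, count = struct.unpack_from(
            "<B3sB3sII", raw, offset
        )
        if ptype == 0:
            continue  # unused slot
        entries.append(
            {"status": status, "type": ptype, "start_lba": start, "sectors": count}
        )
    return entries


for number, entry in enumerate(parse_entries(TABLE_HEX), start=1):
    byte_offset = entry["start_lba"] * SECTOR_SIZE
    size_gb = entry["sectors"] * SECTOR_SIZE / 1e9
    print(
        f"hda{number}: type {entry['type']:#04x}, "
        f"start sector {entry['start_lba']:#x} (byte {byte_offset}), "
        f"{entry['sectors']:#x} sectors ({size_gb:.2f} GB)"
    )
```

With a hex editor, the extended partition (type 0f) is then reached by
seeking to its start sector multiplied by 512, without using the CHS fields
at all.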

I think that 'fe ff ff' after the partition type indicates that only the
linear mode should be used, but I'm not sure about this, nor do I have
any proof. You should read the partition code to get more clues, IMHO.
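
For what it's worth, the 3-byte CHS fields can be decoded as in the sketch
below. It assumes the usual packing (head in the first byte, sector in the
low 6 bits of the second byte, cylinder in the remaining 10 bits) and a
255-head, 63-sector geometry, which is only a common BIOS translation and
may not match every disk. The function names are made up for the example.
Decoded this way, 'fe ff ff' is simply the largest value the field can
hold, which is consistent with it being a "does not fit" marker rather than
a real address.

```python
def decode_chs(field: bytes):
    """Split a 3-byte MBR CHS field into (cylinder, head, sector)."""
    head = field[0]
    sector = field[1] & 0x3F                         # low 6 bits
    cylinder = ((field[1] & 0xC0) << 2) | field[2]   # 2 high bits + 8 low bits
    return cylinder, head, sector


def chs_to_lba(cylinder, head, sector, heads=255, sectors=63):
    """Linear sector number for a CHS triple under an assumed geometry.

    Same formula as in the question: c*H*S + h*S + s - 1.
    """
    return (cylinder * heads + head) * sectors + (sector - 1)


# Start of hda2 from the table: CHS bytes 00 01 02.
c, h, s = decode_chs(bytes.fromhex("000102"))
print("hda2 starts at CHS", (c, h, s), "-> sector", hex(chs_to_lba(c, h, s)))

# End of hda3 from the table: CHS bytes fe ff ff.
c, h, s = decode_chs(bytes.fromhex("feffff"))
limit_bytes = (chs_to_lba(c, h, s) + 1) * 512
print("largest CHS triple", (c, h, s), "-> limit of", limit_bytes, "bytes")
```

Under that geometry the hda2 start decodes to the same sector number as its
linear start field, and the largest encodable address sits at roughly
8.4 GB, which matches the limit mentioned above.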

Hoping this helps,
Willy



* Re: your mail
  2002-07-06 15:59 Hacksaw
@ 2002-07-07 19:32 ` Min Li
  0 siblings, 0 replies; 341+ messages in thread
From: Min Li @ 2002-07-07 19:32 UTC (permalink / raw)
  To: Hacksaw; +Cc: linux-kernel

Hello, Yes, I tried to subscribe to those two lists. But I don't think they
are working. But I do need help right now...


On Sat, 6 Jul 2002, Hacksaw wrote:

> Hello Min:
> 
> I suggest your questions would be better asked on the kernel newbies list:
> http://mail.nl.linux.org/kernelnewbies/
> 
> and/or on the RedHat install List:
> 
> https://listman.redhat.com/mailman/listinfo/redhat-install-list.
> 
> The kernel list is strictly for talk about developing the kernel. Also, please 
> read the linux kernel mailing list FAQ: http://www.tux.org/lkml/
> -- 
> Powered by beta particles
> http://www.hacksaw.org -- http://www.privatecircus.com -- KB1FVD
> 
> 
> 



* Re: your mail
  2002-07-05  8:47 Christian Berger
@ 2002-07-05 13:34 ` Gerhard Mack
  0 siblings, 0 replies; 341+ messages in thread
From: Gerhard Mack @ 2002-07-05 13:34 UTC (permalink / raw)
  To: Christian Berger; +Cc: linux-kernel

Right command wrong email address.  You need to send that to
majordomo@vger.kernel.org



On 5 Jul 2002, Christian Berger wrote:

> Date: 05 Jul 2002 10:47:32 +0200
> From: Christian Berger <christian@berger-online.de>
> To: linux-kernel@vger.kernel.org
>
> unsubscribe linux-kernel
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

--
Gerhard Mack

gmack@innerfire.net

<>< As a computer I find your faith in technology amusing.



* Re: your mail
       [not found] <000d01c22361$62c9d6f0$0100a8c0@digital>
@ 2002-07-04 20:45 ` Stephen Tweedie
  0 siblings, 0 replies; 341+ messages in thread
From: Stephen Tweedie @ 2002-07-04 20:45 UTC (permalink / raw)
  To: Naseer Bhatti
  Cc: security, security, linux-kernel, sct, akpm, adilger, ext3-users

Hi,

On Thu, Jul 04, 2002 at 06:47:11PM +0500, Naseer Bhatti <naseer@digitallinx.com> wrote:

> I got these errors in the log on a production server. I am running
> ProFTPD 1.2.4 with RedHat 7.2, kernel 2.4.7-10 (not yet compiled myself),
> and Apache 1.3.26. My server stopped responding, and after a reboot I
> checked the logs and found a lot of kernel bugs. ProFTPD was also
> involved in that. httpd (Apache 1.3.26) also gave some stack output.
> Correct me if I am wrong. I have attached the file for detailed analysis.
> Please check it and let me know about the possible bug/solution.

The log shows no sign of any ext3 problem.  I can't see anything in it
which would justify trying to send a compressed log of nearly 400kB to
an ext3 general users mailing list.

For what it's worth, your dcache oopses are most often associated with
bad memory --- try memtest86 on that machine before you go any
further.

--Stephen


* Re: your mail
  2002-06-24  5:49 pah
@ 2002-06-24  7:34 ` Zwane Mwaikambo
  0 siblings, 0 replies; 341+ messages in thread
From: Zwane Mwaikambo @ 2002-06-24  7:34 UTC (permalink / raw)
  To: pah; +Cc: linux-kernel

On 24 Jun 2002 pah@promiscua.org wrote:

> 	I've just found a bug (an insignificant bug) in the panic() function!
> 	There's a possible buffer overflow if the formatted string exceeds
> 1024 characters (I think that the problem is in all kernel releases).
> 	The problem is in the use of vsprintf() instead of vsnprintf()!
> 
> 	I know that this doesn't allow any exploitation by a uid
> other than zero, but it should be fixed in case panic()'s arguments
> exceed the buffer limit (probably via an lkm or something like that) and
> cause (probably) a system crash.
> 

In that case there are quite a number of other places in the kernel which 
can be 'exploited' in various ways.

Cheers,
	Zwane

--
http://function.linuxpower.ca
		



* Re: your mail
  2002-05-16 15:54   ` Sanket Rathi
  2002-05-16 18:05     ` Alan Cox
@ 2002-05-20 18:07     ` David Mosberger
  1 sibling, 0 replies; 341+ messages in thread
From: David Mosberger @ 2002-05-20 18:07 UTC (permalink / raw)
  To: Sanket Rathi; +Cc: Alan Cox, linux-kernel

>>>>> On Thu, 16 May 2002 21:24:10 +0530 (IST), Sanket Rathi <sanket.rathi@cdac.ernet.in> said:

  Sanket> No, actually I don't want that for DMA; it is for a different
  Sanket> requirement.  In our device there is a page table which holds
  Sanket> the virtual-to-physical address translation: we save the
  Sanket> virtual address in the device along with the corresponding
  Sanket> physical address. But we can store only up to 44 bits of the
  Sanket> virtual address; that's why I want that.

  Sanket> Can you help me with this?

There is no way to limit virtual memory to 44 bits.  I don't know how
your device works, but just fyi: ia64 divides the address space into 8
equal-sized regions and user space applications tend to use at least
two regions (2 for text and 3 for data/stack).  This means that even
with the smallest page size, you'll have to take virtual address bits
61-63 into account.
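
A small sketch of the arithmetic, assuming only what is said above: the top
three bits (61-63) of a 64-bit virtual address select the region, and the
device can hold 44 bits. The sample addresses and the function names are
made up for illustration.

```python
REGION_SHIFT = 61   # bits 61-63 select one of the 8 regions
DEVICE_BITS = 44    # address width the device can store, per this thread


def region_of(vaddr: int) -> int:
    """Region number (0-7) encoded in the top three address bits."""
    return (vaddr >> REGION_SHIFT) & 0x7


def fits_device(vaddr: int, bits: int = DEVICE_BITS) -> bool:
    """True if the address can be stored in 'bits' bits without loss."""
    return vaddr < (1 << bits)


# Hypothetical user addresses: one in region 2, one in region 3.
text_addr = (2 << REGION_SHIFT) | 0x4000
data_addr = (3 << REGION_SHIFT) | 0x10000

for name, addr in (("text", text_addr), ("data", data_addr)):
    print(f"{name}: {addr:#018x} region {region_of(addr)} "
          f"fits in {DEVICE_BITS} bits: {fits_device(addr)}")
```

Any address outside region 0 has a bit set at position 61 or above, so a
44-bit field loses the region number no matter how small the offset inside
the region is.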

Hope this helps,

	--david
--
Interested in learning more about IA-64 Linux?  Try http://www.lia64.org/book/


* Re: your mail
  2002-05-16 15:54   ` Sanket Rathi
@ 2002-05-16 18:05     ` Alan Cox
  2002-05-20 18:07     ` David Mosberger
  1 sibling, 0 replies; 341+ messages in thread
From: Alan Cox @ 2002-05-16 18:05 UTC (permalink / raw)
  To: Sanket Rathi; +Cc: Alan Cox, Sanket Rathi, linux-kernel

> No, actually I don't want that for DMA; it is for a different requirement.
> In our device there is a page table which holds the virtual-to-physical
> address translation: we save the virtual address in the device along with
> the corresponding physical address. But we can store only up to 44 bits
> of the virtual address; that's why I want that.

Still read Documentation/DMA-mapping.txt

Whether it's DMA or not, you are going to need to keep the allocations below
44 bits, and that's what the DMA allocators do


* Re: your mail
  2002-05-16 13:38 ` your mail Alan Cox
@ 2002-05-16 15:54   ` Sanket Rathi
  2002-05-16 18:05     ` Alan Cox
  2002-05-20 18:07     ` David Mosberger
  0 siblings, 2 replies; 341+ messages in thread
From: Sanket Rathi @ 2002-05-16 15:54 UTC (permalink / raw)
  To: Alan Cox; +Cc: Sanket Rathi, linux-kernel

No, actually I don't want that for DMA; it is for a different requirement.
In our device there is a page table which holds the virtual-to-physical
address translation: we save the virtual address in the device along with
the corresponding physical address. But we can store only up to 44 bits
of the virtual address; that's why I want that.

Can you help me with this?

Thanks in advance

-----
--------Sanket


> > I just want to know how can we restrict the maximum virtual memory and
> > maximum physical memory on ia64 platform.
> > kernel. Actually we have a device which can only access 44 bits so we cant
> 
> That won't help you. You might not be dealing with RAM at the bottom of the
> address space. You might also be in platforms with an iommu, or doing DMA
> to another PCI target
> 
> > Tell me something related to this or any link which i can refer
> 
> Assuming the device is doing bus mastering. Read
> Documentation/DMA-mapping.txt
> 



* Re: your mail
  2002-05-16 12:40 Sanket Rathi
@ 2002-05-16 13:38 ` Alan Cox
  2002-05-16 15:54   ` Sanket Rathi
  0 siblings, 1 reply; 341+ messages in thread
From: Alan Cox @ 2002-05-16 13:38 UTC (permalink / raw)
  To: Sanket Rathi; +Cc: linux-kernel

> I just want to know how can we restrict the maximum virtual memory and
> maximum physical memory on ia64 platform.
> kernel. Actually we have a device which can only access 44 bits so we cant

That won't help you. You might not be dealing with RAM at the bottom of the
address space. You might also be on platforms with an iommu, or doing DMA
to another PCI target

> Tell me something related to this or any link which i can refer

Assuming the device is doing bus mastering. Read
Documentation/DMA-mapping.txt


* Re: your mail
  2002-05-03 15:29   ` Keith Owens
@ 2002-05-03 15:45     ` tomas szepe
  0 siblings, 0 replies; 341+ messages in thread
From: tomas szepe @ 2002-05-03 15:45 UTC (permalink / raw)
  To: Keith Owens; +Cc: kbuild-devel, linux-kernel

> On Fri, 3 May 2002 16:37:38 +0200, 
> tomas szepe <kala@pinerecords.com> wrote:
>
> >kala@nibbler:~$ tar xzf /usr/src/linux-2.5.13.tgz 
> >kala@nibbler:~$ cd linux-2.5.13 
> >kala@nibbler:~/linux-2.5.13$ zcat /usr/src/kbuild-2.5-core-10.gz /usr/src/kbuild-2.5-common-2.5.13-1.gz /usr/src/kbuild-2.5-i386-2.5.13-1.gz |patch -sp1
> >kala@nibbler:~/linux-2.5.13$ cp /lib/modules/2.5.13/.config .
> >kala@nibbler:~/linux-2.5.13$ make -f Makefile-2.5 oldconfig
> >Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
> 
> The trailing '/' is omitted in one case.  Workaround for common source and object
> 
> export KBUILD_SRCTREE_000=`pwd`/
> make -f Makefile-2.5 oldconfig

Another problem/question:

$ cd build
$ export KBUILD_OBJTREE=$PWD
$ export KBUILD_SRCTREE_000=/usr/src/linux-2.5.13
$ alias M="make -f $KBUILD_SRCTREE_000/Makefile-2.5"
$ cp /lib/modules/2.5.13/.config .
$ M oldconfig
...

$ M installable
...

[so far so good]

$ make -f Makefile-2.5 menuconfig
[enable RAMDISK support, tweak ramdisk size, enable initrd]
...

Now, issuing "M installable" will result in nearly all files getting rebuilt.
The same happens when switching ramdisk off again. How's that?

I tried enabling/disabling many other config options and doing rebuilds but
couldn't find anything as damaging buildtime-wise as the ramdisk stuff.

Hopefully I'm causing no headaches,
T.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-05-03 14:37 ` your mail tomas szepe
  2002-05-03 15:07   ` tomas szepe
@ 2002-05-03 15:29   ` Keith Owens
  2002-05-03 15:45     ` tomas szepe
  1 sibling, 1 reply; 341+ messages in thread
From: Keith Owens @ 2002-05-03 15:29 UTC (permalink / raw)
  To: tomas szepe; +Cc: kbuild-devel, linux-kernel

On Fri, 3 May 2002 16:37:38 +0200, 
tomas szepe <kala@pinerecords.com> wrote:
>kala@nibbler:~$ tar xzf /usr/src/linux-2.5.13.tgz 
>kala@nibbler:~$ cd linux-2.5.13 
>kala@nibbler:~/linux-2.5.13$ zcat /usr/src/kbuild-2.5-core-10.gz /usr/src/kbuild-2.5-common-2.5.13-1.gz /usr/src/kbuild-2.5-i386-2.5.13-1.gz |patch -sp1
>kala@nibbler:~/linux-2.5.13$ cp /lib/modules/2.5.13/.config .
>kala@nibbler:~/linux-2.5.13$ make -f Makefile-2.5 oldconfig
>Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory

The trailing '/' is omitted in one case.  Workaround for common source and object

export KBUILD_SRCTREE_000=`pwd`/
make -f Makefile-2.5 oldconfig


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-05-03 14:37 ` your mail tomas szepe
@ 2002-05-03 15:07   ` tomas szepe
  2002-05-03 15:29   ` Keith Owens
  1 sibling, 0 replies; 341+ messages in thread
From: tomas szepe @ 2002-05-03 15:07 UTC (permalink / raw)
  To: Keith Owens; +Cc: kbuild-devel, linux-kernel

Building as follows works, though.

$ cd /usr/src && tar xzf linux-2.5.13.tar.gz
$ cd ~ && mkdir build && cd build
$ export KBUILD_OBJTREE=$PWD
$ export KBUILD_SRCTREE_000=/usr/src/linux-2.5.13
$ alias M="make -f $KBUILD_SRCTREE_000/Makefile-2.5"
$ cp /lib/modules/2.5.13/.config .
$ M oldconfig
$ M installable

T.


> > Release 2.4 of kernel build for kernel 2.5 (kbuild 2.5) is available.
> > http://sourceforge.net/projects/kbuild/, package kbuild-2.5, download
> > release 2.4.
> >
> > kbuild-2.5-core-13-1.
> 
> I believe you meant 's/13/10/'.
> 
> > kbuild-2.5-common-2.5.13-1.
> > kbuild-2.5-i386-2.5.13-1.
> 
> hmmm.. doesn't look so good.
> 
> kala@nibbler:~$ tar xzf /usr/src/linux-2.5.13.tgz 
> kala@nibbler:~$ cd linux-2.5.13 
> kala@nibbler:~/linux-2.5.13$ zcat /usr/src/kbuild-2.5-core-10.gz /usr/src/kbuild-2.5-common-2.5.13-1.gz /usr/src/kbuild-2.5-i386-2.5.13-1.gz |patch -sp1
> kala@nibbler:~/linux-2.5.13$ cp /lib/modules/2.5.13/.config .
> kala@nibbler:~/linux-2.5.13$ make -f Makefile-2.5 oldconfig
> Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
> /home/kala/linux-2.5.13/scripts/Makefile-2.5:473: /no_such_file-arch/i386/Makefile.defs.config: No such file or directory
> Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
> /home/kala/linux-2.5.13/scripts/Makefile-2.5:473: /no_such_file-arch/i386/Makefile.defs.config: No such file or directory
> Using ARCH='i386' AS='as' LD='ld' CC='/usr/bin/gcc' CPP='/usr/bin/gcc -E' AR='ar' HOSTAS='as' HOSTLD='gcc' HOSTCC='gcc' HOSTAR='ar'
> Generating global Makefile
>   phase 1 (find all inputs)
> ...
> 
> kala@nibbler:~/linux-2.5.13$ make -f Makefile-2.5 installable
> Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
> spec value %p not found
> /home/kala/linux-2.5.13/scripts/Makefile-2.5:473: /no_such_file-arch/i386/Makefile.defs.config: No such file or directory
> Using ARCH='i386' AS='as' LD='ld' CC='/usr/bin/gcc' CPP='/usr/bin/gcc -E' AR='ar' HOSTAS='as' HOSTLD='gcc' HOSTCC='gcc' HOSTAR='ar'
> Generating global Makefile
>   phase 1 (find all inputs)
>   phase 2 (convert all Makefile.in files)
>   phase 3 (evaluate selections)
>   phase 4 (integrity checks, write global makefile)
> pp_makefile4: arch/i386/lib/lib.a is selected but is not part of vmlinux, missing link_subdirs?
> make: *** [phase4] Error 1
> 
> 
> T.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-05-03 14:19 Keith Owens
@ 2002-05-03 14:37 ` tomas szepe
  2002-05-03 15:07   ` tomas szepe
  2002-05-03 15:29   ` Keith Owens
  0 siblings, 2 replies; 341+ messages in thread
From: tomas szepe @ 2002-05-03 14:37 UTC (permalink / raw)
  To: Keith Owens; +Cc: kbuild-devel, linux-kernel

> Release 2.4 of kernel build for kernel 2.5 (kbuild 2.5) is available.
> http://sourceforge.net/projects/kbuild/, package kbuild-2.5, download
> release 2.4.
>
> kbuild-2.5-core-13-1.

I believe you meant 's/13/10/'.

> kbuild-2.5-common-2.5.13-1.
> kbuild-2.5-i386-2.5.13-1.

hmmm.. doesn't look so good.

kala@nibbler:~$ tar xzf /usr/src/linux-2.5.13.tgz 
kala@nibbler:~$ cd linux-2.5.13 
kala@nibbler:~/linux-2.5.13$ zcat /usr/src/kbuild-2.5-core-10.gz /usr/src/kbuild-2.5-common-2.5.13-1.gz /usr/src/kbuild-2.5-i386-2.5.13-1.gz |patch -sp1
kala@nibbler:~/linux-2.5.13$ cp /lib/modules/2.5.13/.config .
kala@nibbler:~/linux-2.5.13$ make -f Makefile-2.5 oldconfig
Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
/home/kala/linux-2.5.13/scripts/Makefile-2.5:473: /no_such_file-arch/i386/Makefile.defs.config: No such file or directory
Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
/home/kala/linux-2.5.13/scripts/Makefile-2.5:473: /no_such_file-arch/i386/Makefile.defs.config: No such file or directory
Using ARCH='i386' AS='as' LD='ld' CC='/usr/bin/gcc' CPP='/usr/bin/gcc -E' AR='ar' HOSTAS='as' HOSTLD='gcc' HOSTCC='gcc' HOSTAR='ar'
Generating global Makefile
  phase 1 (find all inputs)
...

kala@nibbler:~/linux-2.5.13$ make -f Makefile-2.5 installable
Makefile-2.5:251: /no_such_file-arch/i386/Makefile.defs.noconfig: No such file or directory
spec value %p not found
/home/kala/linux-2.5.13/scripts/Makefile-2.5:473: /no_such_file-arch/i386/Makefile.defs.config: No such file or directory
Using ARCH='i386' AS='as' LD='ld' CC='/usr/bin/gcc' CPP='/usr/bin/gcc -E' AR='ar' HOSTAS='as' HOSTLD='gcc' HOSTCC='gcc' HOSTAR='ar'
Generating global Makefile
  phase 1 (find all inputs)
  phase 2 (convert all Makefile.in files)
  phase 3 (evaluate selections)
  phase 4 (integrity checks, write global makefile)
pp_makefile4: arch/i386/lib/lib.a is selected but is not part of vmlinux, missing link_subdirs?
make: *** [phase4] Error 1


T.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-04-24  7:55 Huo Zhigang
  2002-04-24  7:51 ` your mail Zwane Mwaikambo
@ 2002-04-24  8:27 ` Alan Cox
  1 sibling, 0 replies; 341+ messages in thread
From: Alan Cox @ 2002-04-24  8:27 UTC (permalink / raw)
  To: Huo Zhigang; +Cc: linux-kernel

> 
> >INIT: Switching to runlevel: 6
> >INIT: Send processes the TERM signal
> >Unable to handle kernel NULL pointer dereference
>   
>   What's wrong with my machines?  They are all running linux-2.2.18(SMP-supported) with a kernel module which is a driver of Myricom NIC M3S-PCI64C-2 written by my group.
>   Thank you in advance 8-)

If you boot the machine without your driver and then reboot, does the
same happen? If not, then it may well be that your driver has an error,
but only when it closes down.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-04-24  7:55 Huo Zhigang
@ 2002-04-24  7:51 ` Zwane Mwaikambo
  2002-04-24  8:27 ` Alan Cox
  1 sibling, 0 replies; 341+ messages in thread
From: Zwane Mwaikambo @ 2002-04-24  7:51 UTC (permalink / raw)
  To: Huo Zhigang; +Cc: Linux Kernel

On Wed, 24 Apr 2002, Huo Zhigang wrote:

>   Hi, all.
>   My cluster go wrong these days. So many times when I "/sbin/reboot" a node, the following message will be displayed on the console.
> 
> >INIT: Switching to runlevel: 6
> >INIT: Send processes the TERM signal
> >Unable to handle kernel NULL pointer dereference
>   
>   What's wrong with my machines?  They are all running linux-2.2.18(SMP-supported) with a kernel module which is a driver of Myricom NIC M3S-PCI64C-2 written by my group.
>   Thank you in advance 8-)
>   
>             Zhigang Huo
>             zghuo@ncic.ac.cn

Have you tried decoding the oops? Have a look at  
linux/Documentation/oops-tracing.txt

	Zwane

-- 
http://function.linuxpower.ca



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-04-21 21:16 Ivan G.
@ 2002-04-21 23:02 ` Jeff Garzik
  0 siblings, 0 replies; 341+ messages in thread
From: Jeff Garzik @ 2002-04-21 23:02 UTC (permalink / raw)
  To: Ivan G.; +Cc: Urban Widmark, LKML

On Sun, Apr 21, 2002 at 03:16:40PM -0600, Ivan G. wrote:
> Urban,
> 
> About the suggestion to make via_rhine_error handle more interrupts,
> 
> enum intr_status_bits {
>         IntrRxDone=0x0001, IntrRxErr=0x0004, IntrRxEmpty=0x0020,
>         IntrTxDone=0x0002, IntrTxAbort=0x0008, IntrTxUnderrun=0x0010,
>         IntrPCIErr=0x0040,
>         IntrStatsMax=0x0080, IntrRxEarly=0x0100, IntrMIIChange=0x0200,
>         IntrRxOverflow=0x0400, IntrRxDropped=0x0800, IntrRxNoBuf=0x1000,
>         IntrTxAborted=0x2000, IntrLinkChange=0x4000,
>         IntrRxWakeUp=0x8000,
>         IntrNormalSummary=0x0003, IntrAbnormalSummary=0xC260,
> };
> 
> RxEarly, RxOverflow, RxNoBuf are not handled
> (which brings up another question - how should they be handled 
> and where?? It doesn't seem to me that those should end up in error,
> sending CmdTxDemand. )

*blink*  I had not noticed that.

All drivers actually need to handle RxNoBufs and RxOverflow, assuming
they have similar meaning to what I'm familiar with on other chips.
The chip may recover transparently, but one should be at least aware of
them.

RxEarly you very likely do -not- want to handle...
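
A minimal sketch of that split, using the bit values from the enum quoted
above. The rx_stats structure and note_rx_errors() are hypothetical; what
a real driver should do on each event (restart the receiver, refill the
ring) depends on the chip and is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the intr_status_bits enum quoted above. */
enum {
    IntrRxEarly    = 0x0100,
    IntrRxOverflow = 0x0400,
    IntrRxNoBuf    = 0x1000,
};

/* Hypothetical per-device counters, so the events are at least noticed. */
struct rx_stats {
    unsigned overflow;
    unsigned nobuf;
};

/*
 * Count RxOverflow and RxNoBuf, deliberately ignore RxEarly, and return
 * whatever status bits are left for the rest of the handler.
 */
static uint16_t note_rx_errors(uint16_t status, struct rx_stats *st)
{
    if (status & IntrRxOverflow)
        st->overflow++;
    if (status & IntrRxNoBuf)
        st->nobuf++;
    return status & (uint16_t)~(IntrRxOverflow | IntrRxNoBuf | IntrRxEarly);
}
```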

	Jeff





^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-03-13 19:21 Romain Liévin
  2002-03-13 19:43 ` your mail Alan Cox
@ 2002-03-14  7:08 ` Zwane Mwaikambo
  1 sibling, 0 replies; 341+ messages in thread
From: Zwane Mwaikambo @ 2002-03-14  7:08 UTC (permalink / raw)
  To: Romain Liévin; +Cc: Kernel List, Alan Cox, Tim Waugh

Firstly, thanks for doing this =) secondly i'll give your driver a try 
when you release the serial version (i have a serial cable + TI-83)

Cheers,
	Zwane



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-03-13 20:28   ` Romain Liévin
  2002-03-13 20:49     ` Richard B. Johnson
@ 2002-03-13 22:35     ` Alan Cox
  1 sibling, 0 replies; 341+ messages in thread
From: Alan Cox @ 2002-03-13 22:35 UTC (permalink / raw)
  To: Romain Liévin; +Cc: Alan Cox, Kernel List

> +/*
> + * Deal with CONFIG_MODVERSIONS
> + */
> +#if 0 /* Pb with MODVERSIONS */
> +#if CONFIG_MODVERSIONS==1
> +#define MODVERSIONS
> +#include <linux/modversions.h>
> +#endif
> +#endif

[modversions.h is magically included by the kernel for you when its in 
 kernel if you haven't worked that one out yet]

> +#define PP_NO 3
> +struct tipar_struct  table[PP_NO];
static ?

> +               for(i=0; i < delay; i++) {
> +                       inbyte(minor);
> +               }
> +               schedule();

Oh random tip

		  if(current->need_resched)
			schedule();

will just give up the CPU when you are out of time

> +       if(table[minor].opened)
> +               return -EBUSY;
> +       table[minor].opened++;

Think about open/close at the same moment or SMP - the watchdog drivers all
had this problem and now do

	unsigned long opened = 0;

	if(test_and_set_bit(0, &opened))
		return -EBUSY;

	clear_bit(0, &opened)

[this generates atomic operations so is always safe]
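
The same idea can be tried outside the kernel. The sketch below uses the
C11 atomic_flag in place of test_and_set_bit()/clear_bit(); dev_open and
dev_release are hypothetical stand-ins for a driver's open and release
methods, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* One flag per device; clear means "not open". */
static atomic_flag opened = ATOMIC_FLAG_INIT;

/* Succeeds for exactly one caller until dev_release() is called. */
static int dev_open(void)
{
    /* Atomically sets the flag and reports whether it was already set. */
    if (atomic_flag_test_and_set(&opened))
        return -EBUSY;
    return 0;
}

static void dev_release(void)
{
    atomic_flag_clear(&opened);
}
```

Because the test and the set happen in one atomic step, two callers racing
into dev_open() cannot both see the flag clear, which is the window left
open by a separate "if (opened) ... opened++" pair.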

> +       if(!table[minor].opened)
> +               return -EFAULT;

	BUG() may be better - it can't happen so BUG() will get a backtrace
and actually get it reported 8)

> +static long long tipar_lseek(struct file * file, long long offset, int origin)
> +{
> +       return -ESPIPE;
> +}

Can go (you now use no_llseek)


Basically except for the open/close one I'm now just picking holes. 
For the device major/minors see http://www.lanana.org.




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-03-13 20:49     ` Richard B. Johnson
@ 2002-03-13 22:27       ` Alan Cox
  0 siblings, 0 replies; 341+ messages in thread
From: Alan Cox @ 2002-03-13 22:27 UTC (permalink / raw)
  To: root; +Cc: Romain Liévin, Alan Cox, Kernel List

> > +                       START(max);
> > +                       do {
> > +                               WAIT(max);
> > +                       } while (inbyte(minor) & 0x10);
> 
>              This may never happen. You end up waiting forever!

No - its hidden in his macros. Look harder
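
For reference, the timeout sits in START()/WAIT() in the patch: WAIT()
returns -1 once jiffies passes max. One thing that may be worth
double-checking there is the arithmetic in START(). With timeout
documented as tenths of a second, HZ/(timeout/10) gets smaller as timeout
grows and divides by zero for timeout < 10. A small sketch comparing it
with the conversion one would normally expect, assuming HZ = 100
(HZ_SKETCH and both function names are made up for the example):

```c
#include <assert.h>

/* Ticks per second; an assumption for this sketch. */
#define HZ_SKETCH 100UL

/* Convert a timeout given in tenths of a second to ticks. */
static unsigned long tenths_to_ticks(unsigned long tenths)
{
    return (HZ_SKETCH * tenths) / 10;
}

/* The expression START() uses as posted; only defined for tenths >= 10. */
static unsigned long start_as_posted(unsigned long tenths)
{
    assert(tenths >= 10);
    return HZ_SKETCH / (tenths / 10);
}
```

With tenths = 15 (the value in the insmod example) the first form gives
150 ticks, the second 100; with tenths = 100 they give 1000 and 10.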



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-03-13 20:28   ` Romain Liévin
@ 2002-03-13 20:49     ` Richard B. Johnson
  2002-03-13 22:27       ` Alan Cox
  2002-03-13 22:35     ` Alan Cox
  1 sibling, 1 reply; 341+ messages in thread
From: Richard B. Johnson @ 2002-03-13 20:49 UTC (permalink / raw)
  To: Romain Liévin; +Cc: Alan Cox, Kernel List

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=US-ASCII, Size: 4611 bytes --]

On Wed, 13 Mar 2002, [ISO-8859-1] Romain Liévin wrote:
I'm going to comment on a few points:

[SNIPPED most...]

> +
> +/* D-bus protocol:
> +                    1                 0                      0
> +       _______        ______|______    __________|________    __________
> +Red  :        ________      |      ____          |        ____
> +       _        ____________|________      ______|__________       _____
> +White:  ________            |        ______      |          _______
> +*/
> +
> +/* Try to transmit a byte on the specified port (-1 if error). */
> +static int put_ti_parallel(int minor, unsigned char data)
> +{
> +       int bit, i;
> +       unsigned long max;
> +  
> +       for (bit=0; bit<8; bit++) {
> +               if (data & 1) {
> +                       outbyte(2, minor);
> +                       START(max); 
> +                       do {
> +                               WAIT(max);
> +                       } while (inbyte(minor) & 0x10);

             This may never happen. You end up waiting forever!
             If the port doesn't exist or is broken, it may return 0xff
             forever! You need to time-out and get out.


> +                       
> +                       outbyte(3, minor);
> +                       START(max);
> +                       do {
> +                               WAIT(max);
> +                       } while (!(inbyte(minor) & 0x10));

                     This may never happen. You end up waiting forever!
                     You need to time-out and get out.


> +               } else {
> +                       outbyte(1, minor);
> +                       START(max);
> +                       do {
> +                               WAIT(max);
> +                       } while (inbyte(minor) & 0x20);
> +                       
                       This also may never happen!
                       Same applies, time-out and get out.
                     
> +                       outbyte(3, minor);
> +                       START(max);
> +                       do {
> +                               WAIT(max);
> +                       } while (!(inbyte(minor) & 0x20));

                      This also may never happen!
                      Same applies, time-out and get out.

> +               }
> +               data >>= 1;
> +               for(i=0; i < delay; i++) {
> +                       inbyte(minor);
> +               }

> +               schedule();

                  This will just spin without setting
                  current->policy |= SCHED_YIELD;
                  (you really should use sys_sched_yield())


> +       }
> +       
> +       return 0;
> +}
> +
> +/* Receive a byte on the specified port or -1 if error. */
> +static int get_ti_parallel(int minor)
> +{
> +       int bit,i;
> +       unsigned char v, data=0;
> +       unsigned long max;
> +
> +       for (bit=0; bit<8; bit++) {
> +               START(max); 
> +               do {
> +                       WAIT(max);
> +               } while ((v=inbyte(minor) & 0x30) == 0x30);
> +      
                  More wait-forever above...

> +               if (v == 0x10) { 
> +                       data=(data>>1) | 0x80;
> +                       outbyte(1, minor);
> +                       START(max);
> +                       do {
> +                               WAIT(max);
> +                       } while (!(inbyte(minor) & 0x20));

                      More wait-forever above.


> +                       outbyte(3, minor);
> +               } else {
> +                       data=data>>1;
> +                       outbyte(2, minor);
> +                       START(max);
> +                       do {
> +                               WAIT(max);
> +                       } while (!(inbyte(minor) & 0x10));
> +                       outbyte(3, minor);
                      More wait-forever!

> +               }
> +               for(i=0; i<delay; i++) {
> +                       inbyte(minor);
> +               }
> +               schedule();
                  No current->policy

> +       }
> +       return (int)data;
> +}
> +

[SNIPPED rest]


Basically, this code performs a needed function. I have been waiting
for someone to write this! However, it's not yet ready for prime-time.
Never assume that hardware is going to produce what you expect. Don't
wait forever for something that was supposed to happen. You'll hang
the machine.
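
A minimal userspace sketch of that advice: every poll loop carries a
deadline and returns an error once it passes. The tick counter,
tick_before() and poll_with_timeout() are illustrative names, not kernel
API; tick_before() mimics the wrap-safe comparison that time_before()
performs on jiffies, and the two *_ready functions stand in for hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Wrap-safe "a is earlier than b" for a free-running tick counter. */
static bool tick_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

typedef bool (*ready_fn)(void *ctx);

/*
 * Poll ready() until it reports true or timeout_ticks have elapsed.
 * Each failed poll advances the simulated clock *now by one tick.
 * Returns 0 on success, -1 on timeout.
 */
static int poll_with_timeout(ready_fn ready, void *ctx,
                             unsigned long *now, unsigned long timeout_ticks)
{
    unsigned long deadline = *now + timeout_ticks;

    while (!ready(ctx)) {
        if (!tick_before(*now, deadline))
            return -1;          /* give up instead of hanging */
        (*now)++;
    }
    return 0;
}

/* Stand-in for a missing or broken port: never becomes ready. */
static bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}

/* Stand-in for working hardware: ready on the N-th poll. */
static bool ready_after(void *ctx)
{
    int *remaining = (int *)ctx;
    return --(*remaining) <= 0;
}
```

The loop with never_ready() ends after exactly timeout_ticks ticks, even
when the counter wraps around in the middle of the wait.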


Cheers,
Dick Johnson

Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).

                 Windows-2000/Professional isn't.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-03-13 19:43 ` your mail Alan Cox
@ 2002-03-13 20:28   ` Romain Liévin
  2002-03-13 20:49     ` Richard B. Johnson
  2002-03-13 22:35     ` Alan Cox
  0 siblings, 2 replies; 341+ messages in thread
From: Romain Liévin @ 2002-03-13 20:28 UTC (permalink / raw)
  To: Alan Cox; +Cc: Kernel List

Quoting Alan Cox <alan@lxorguk.ukuu.org.uk>:

> > It has been tested on x86 for almost 2 years and on Alpha & Sparc too
> with 
> > various calculators.
> 
> One oddity - some other comments
> 
> > +static int tipar_open(struct inode *inode, struct file *file)
> > +{
> > +       unsigned int minor = minor(inode->i_rdev) - TIPAR_MINOR_0;
> > +
> > +       if (minor >= PP_NO)
> > +               return -ENXIO;  
> > +       
> > +       init_ti_parallel(minor);
> > +
> > +       MOD_INC_USE_COUNT;
> 
> You should remove these and use in 2.4 + . Also what stops multiple
> simultaneous runs of init_ti_parallel if two people open it at once ?
> 
> 
> > +static unsigned int tipar_poll(struct file *file, poll_table *
> wait)
> > +{
> > +       unsigned int mask=0;
> > +       return mask;
> > +}
> 
> That seems unfinished ??
> 
> > +static int tipar_ioctl(struct inode *inode, struct file *file,
> > +                      unsigned int cmd, unsigned long arg)
> > +       case O_NONBLOCK:
> > +               file->f_flags |= O_NONBLOCK;
> > +               return 0;
> 
> O_NDELAY is set by fcntl - your driver never needs this.
> 
> > +       default:
> > +               retval = -EINVAL;
> 
> SuS says -ENOTTY here (lots of drivers get this wrong still)
> 
> > +static long long tipar_lseek(struct file * file, long long offset,
> int origin)
> > +{
> > +       return -ESPIPE;
> > +}
> 
> There is a generic no_llseek function
> 
> > +/* Major & minor number for character devices */
> > +#define TIPAR_MAJOR   61
> 
> These don't appear to be officially assigned via lanana ?

Fixed a few things according to your remarks.

Comments are welcome...

=================== [ cuts here ] =====================
--- linux.orig/drivers/char/tipar.c     Wed Mar 13 19:19:10 2002
+++ linux/drivers/char/tipar.c  Wed Mar 13 21:24:51 2002
@@ -0,0 +1,543 @@
+/* Hey EMACS -*- linux-c -*-
+ *
+ * tipar - low level driver for handling a parallel link cable
+ * designed for Texas Instruments graphing calculators.
+ *
+ * Copyright (C) 2000-2002, Romain Lievin <roms@lpg.ticalc.org>
+ * under the terms of the GNU General Public License.
+ */
+
+#define VERSION "1.12"
+
+/* This driver should, in theory, work with any parallel port that has an
+ * appropriate low-level driver; all I/O is done through the parport
+ * abstraction layer.
+ *
+ * If this driver is built into the kernel, you can configure it using the
+ * kernel command-line.  For example:
+ *
+ *      tipar=timeout,delay       (set timeout and delay)
+ *
+ * If the driver is loaded as a module, similar functionality is available
+ * using module parameters.  The equivalent of the above commands would be:
+ *
+ *      # insmod tipar.o tipar=15,10
+ */
+
+/* COMPATIBILITY WITH OLD KERNELS
+ *
+ * Usually, parallel cables were bound to ports at
+ * particular I/O addresses, as follows:
+ *
+ *      tipar0             0x378
+ *      tipar1             0x278
+ *      tipar2             0x3bc
+ *
+ *
+ * This driver, by default, binds tipar devices according to parport and
+ * the minor number.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/delay.h>
+#include <linux/config.h>
+#include <linux/version.h>
+#include <linux/init.h>
+#include <asm/uaccess.h>
+#include <linux/ioport.h>
+#include <linux/errno.h>
+#include <linux/sched.h>
+#include <linux/fs.h>
+#include <asm/io.h>
+#include <linux/devfs_fs_kernel.h>
+#include <linux/parport.h> /* Our code depends on parport */
+
+/*
+ * TI definitions
+ */
+#include <linux/ticable.h>
+
+/*
+ * Deal with CONFIG_MODVERSIONS
+ */
+#if 0 /* Pb with MODVERSIONS */
+#if CONFIG_MODVERSIONS==1
+#define MODVERSIONS
+#include <linux/modversions.h>
+#endif
+#endif
+
+/* ----- global variables --------------------------------------------- */
+
+struct tipar_struct {
+       struct pardevice *dev;                  /* Parport device entry */
+       int opened;
+};
+
+#define PP_NO 3
+struct tipar_struct  table[PP_NO];
+
+static int delay   = IO_DELAY;      /* inter-bit delay in microseconds */
+static int timeout = TIMAXTIME;     /* timeout in tenth of seconds     */
+
+static devfs_handle_t devfs_handle = NULL;
+static unsigned int tp_count = 0;   /* tipar_count */
+
+/* --- macros for parport access -------------------------------------- */
+
+#define r_dtr(x)        (parport_read_data(table[(x)].dev->port))
+#define r_str(x)        (parport_read_status(table[(x)].dev->port))
+#define w_ctr(x,y)      (parport_write_control(table[(x)].dev->port, (y)))
+#define w_dtr(x,y)      (parport_write_data(table[(x)].dev->port, (y)))
+
+/* --- setting states on the D-bus with the right timing: ------------- */
+
+static inline void outbyte(int value, int minor)
+{
+       w_dtr(minor, value);
+}
+
+static inline int inbyte(int minor)
+{
+       return (r_str(minor) & 0x30);
+}
+
+static inline void init_ti_parallel(int minor)
+{
+       outbyte(3, minor);
+}
+
+/* ----- global defines ----------------------------------------------- */
+
+#define START(x) { max=jiffies+HZ/(timeout/10); }
+#define WAIT(x) { if(!time_before(jiffies, (x))) return -1; schedule(); }
+
+/* ----- D-bus bit-banging functions ---------------------------------- */
+
+/* D-bus protocol:
+                    1                 0                      0
+       _______        ______|______    __________|________    __________
+Red  :        ________      |      ____          |        ____
+       _        ____________|________      ______|__________       _____
+White:  ________            |        ______      |          _______
+*/
+
+/* Try to transmit a byte on the specified port (-1 if error). */
+static int put_ti_parallel(int minor, unsigned char data)
+{
+       int bit, i;
+       unsigned long max;
+  
+       for (bit=0; bit<8; bit++) {
+               if (data & 1) {
+                       outbyte(2, minor);
+                       START(max); 
+                       do {
+                               WAIT(max);
+                       } while (inbyte(minor) & 0x10);
+                       
+                       outbyte(3, minor);
+                       START(max);
+                       do {
+                               WAIT(max);
+                       } while (!(inbyte(minor) & 0x10));
+               } else {
+                       outbyte(1, minor);
+                       START(max);
+                       do {
+                               WAIT(max);
+                       } while (inbyte(minor) & 0x20);
+                       
+                       outbyte(3, minor);
+                       START(max);
+                       do {
+                               WAIT(max);
+                       } while (!(inbyte(minor) & 0x20));
+               }
+               data >>= 1;
+               for(i=0; i < delay; i++) {
+                       inbyte(minor);
+               }
+               schedule();
+       }
+       
+       return 0;
+}
+
+/* Receive a byte on the specified port or -1 if error. */
+static int get_ti_parallel(int minor)
+{
+       int bit,i;
+       unsigned char v, data=0;
+       unsigned long max;
+
+       for (bit=0; bit<8; bit++) {
+               START(max); 
+               do {
+                       WAIT(max);
+               } while ((v=inbyte(minor) & 0x30) == 0x30);
+      
+               if (v == 0x10) { 
+                       data=(data>>1) | 0x80;
+                       outbyte(1, minor);
+                       START(max);
+                       do {
+                               WAIT(max);
+                       } while (!(inbyte(minor) & 0x20));
+                       outbyte(3, minor);
+               } else {
+                       data=data>>1;
+                       outbyte(2, minor);
+                       START(max);
+                       do {
+                               WAIT(max);
+                       } while (!(inbyte(minor) & 0x10));
+                       outbyte(3, minor);
+               }
+               for(i=0; i<delay; i++) {
+                       inbyte(minor);
+               }
+               schedule();
+       }
+       return (int)data;
+}
+
+/* Return non zero if both lines are at logical one */
+static int check_ti_parallel(int minor)
+{
+       return ((inbyte(minor) & 0x30) == 0x30);
+}
+
+/* Try to detect a parallel link cable on the specified port */
+static int probe_ti_parallel(int minor)
+{
+       int i, j;
+       int seq[]={ 0x00, 0x20, 0x10, 0x30 };
+       unsigned char data = 0;
+       
+       for(i=3; i>=0; i--) {
+               outbyte(3, minor);
+               outbyte(i, minor);
+               for(j=0; j<delay; j++) data = inbyte(minor);
+               /*printk("Probing -> %i: 0x%02x 0x%02x\n", i, data & 0x30, seq[i]);*/
+               if( (data & 0x30) != seq[i]) {
+                       outbyte(3, minor);
+                       return -1;
+               }
+       } 
+       outbyte(3, minor);
+       return 0;
+}
+
+/* ----- kernel module functions--------------------------------------- */
+
+static int tipar_open(struct inode *inode, struct file *file)
+{
+       unsigned int minor = minor(inode->i_rdev) - TIPAR_MINOR_0;
+
+       if (minor >= PP_NO)
+               return -ENXIO;
+
+       if(table[minor].opened)
+               return -EBUSY;
+
+       table[minor].opened++;
+
+       parport_claim_or_block(table[minor].dev);
+       init_ti_parallel(minor);
+       parport_release(table[minor].dev);
+
+       return 0;
+}
+
+static int tipar_close(struct inode *inode, struct file *file)
+{
+       if (minor >= PP_NO)
+               return -ENXIO;
+
+       if(!table[minor].opened)
+               return -EFAULT;
+
+       table[minor].opened--;
+
+       return 0;
+}
+
+static ssize_t tipar_write(struct file *file,
+                          const char *buf, size_t count, loff_t *ppos)
+{
+       unsigned int minor = minor(file->f_dentry->d_inode->i_rdev) - 
+               TIPAR_MINOR_0;
+       ssize_t n;
+  
+       if (minor >= PP_NO)
+               return -ENXIO;
+
+       if (table[minor].dev == NULL) 
+               return -ENXIO;
+
+       parport_claim_or_block (table[minor].dev);
+       
+       for(n=0; n<count; n++) {
+               unsigned char b;
+               
+               if(get_user(b, buf + n)) {
+                       n = -EFAULT;
+                       goto out;
+               }
+
+               if(put_ti_parallel(minor, b) == -1) {
+                       init_ti_parallel(minor);
+                       n = -ETIMEDOUT;
+                       goto out;
+               }
+       }
+
+ out:
+       parport_release (table[minor].dev);
+       return n;
+}
+
+static ssize_t tipar_read(struct file *file, char *buf, 
+                         size_t count, loff_t *ppos)
+{
+       int b=0;
+       unsigned int minor=minor(file->f_dentry->d_inode->i_rdev) - 
+               TIPAR_MINOR_0;
+       ssize_t retval = 0;
+
+       if(count == 0)
+               return 0;
+
+       if(ppos != &file->f_pos)
+               return -ESPIPE;
+
+       parport_claim_or_block(table[minor].dev);
+  
+       do {
+               b = get_ti_parallel(minor);
+               if(b == -1) {
+                       init_ti_parallel(minor);
+                       retval = -ETIMEDOUT;
+                       goto out;
+               }
+               else
+                       break;
+      
+               /* Non-blocking mode: try again ! */
+               if (file->f_flags & O_NONBLOCK) {
+                       retval = -EAGAIN;
+                       goto out;
+               }
+               
+               /* Signal pending, try again ! */
+               if (signal_pending(current)) {
+                       retval = -ERESTARTSYS;
+                       goto out;
+               }
+
+               schedule();
+       } while (1);
+
+       retval = put_user(b, (unsigned char *)buf);
+       if(!retval)
+               retval = 1;
+       else
+               retval = -EFAULT;
+
+ out:
+       parport_release(table[minor].dev);
+       return retval;
+}
+
+static int tipar_ioctl(struct inode *inode, struct file *file,
+                      unsigned int cmd, unsigned long arg)
+{
+       unsigned int minor = minor(inode->i_rdev) - TIPAR_MINOR_0;
+       int retval = 0;
+
+       if (minor >= PP_NO) 
+               return -ENODEV;
+
+       switch (cmd) {
+       case 0:
+               break;
+       case TIPAR_DELAY:
+               delay = arg;
+               return 0;
+       case TIPAR_TIMEOUT:
+               timeout = arg;
+               return 0;
+       default:
+               retval = -ENOTTY;
+               break;
+       }
+
+       return retval;
+}
+
+static long long tipar_lseek(struct file * file, long long offset, int origin)
+{
+       return -ESPIPE;
+}
+
+
+/* ----- kernel module registering ------------------------------------ */
+
+static struct file_operations tipar_fops = {
+       owner:   THIS_MODULE,
+       llseek:  no_llseek,
+       read:    tipar_read,
+       write:   tipar_write,
+       ioctl:   tipar_ioctl,
+       open:    tipar_open,
+       release: tipar_close,
+};
+
+/* --- initialisation code ------------------------------------- */
+
+#ifndef MODULE
+/*      You must set these - there is no sane way to probe for this cable.
+ *      You can use tipar=timeout,delay to set these now. */
+static int __init tipar_setup (char *str)
+{
+       int ints[3];    /* ints[0] = count parsed, ints[1..2] = timeout, delay */
+
+        str = get_options (str, ARRAY_SIZE(ints), ints);
+
+        if (ints[0] > 0) {
+                timeout = ints[1];
+                if(ints[0] > 1) {
+                        delay = ints[2];
+               }
+        }
+        return 1;
+}
+#endif
+
+/*
+ * Register our module into parport.
+ * Also pass 2 callback functions to parport: a pre-emptive function and an
+ * interrupt handler function (unused).
+ * Display a message such as "tipar0: using parport0 (polling)".
+ */
+static int tipar_register(int nr, struct parport *port)
+{
+       char name[8];
+       
+       /* Register our module into parport */
+       table[nr].dev = parport_register_device(port, "tipar",
+                                               NULL, NULL, NULL, 0,
+                                               (void *) &table[nr]);
+       
+       if (table[nr].dev == NULL)
+               return 1;
+ 
+       /* Use devfs, tree: /dev/ticables/par/[0..2] */
+       sprintf(name, "%d", nr);
+       devfs_register(devfs_handle, name,
+                       DEVFS_FL_AUTO_DEVNUM, TIPAR_MAJOR, nr,
+                       S_IFCHR | S_IRUGO | S_IWUGO,
+                       &tipar_fops, NULL);
+
+       /* Display information */
+       printk(KERN_INFO "tipar%d: using %s (%s).\n", nr, port->name,
+              (port->irq == PARPORT_IRQ_NONE) ? "polling" : "interrupt-driven");
+
+       if(probe_ti_parallel(nr) != -1)
+               printk("tipar%d: link cable found !\n", nr);
+       else
+               printk("tipar%d: link cable not found (do not plug cable to calc).\n", nr);
+
+       return 0;
+}
+
+static void tipar_attach (struct parport *port)
+{
+       if (tp_count == PP_NO) {
+               printk("tipar: ignoring parallel port (max. %d)\n", 
+                      PP_NO);
+               return;
+       }
+       if (!tipar_register(tp_count, port))
+               tp_count++;
+}
+
+static void tipar_detach (struct parport *port)
+{
+       /* Will be written at some point in the future */
+}
+
+static struct parport_driver tipar_driver = {
+       "tipar",
+       tipar_attach,
+       tipar_detach,
+       NULL
+};
+
+int tipar_init(void)
+{
+       unsigned int i;
+       
+       /* Initialize structure */
+       for (i = 0; i < PP_NO; i++) {
+               table[i].dev = NULL;
+               table[i].opened = 0;
+       }
+
+       /* Register parport device */  
+       if (devfs_register_chrdev (TIPAR_MAJOR, "tipar", &tipar_fops)) {
+               printk("tipar: unable to get major %d\n", TIPAR_MAJOR);
+               return -EIO;
+       }
+
+       /* Use devfs with tree: /dev/ticables/par/[0..2] */
+       devfs_handle = devfs_mk_dir (NULL, "ticables/par", NULL);
+
+       if (parport_register_driver (&tipar_driver)) {
+               printk ("tipar: unable to register with parport\n");
+               return -EIO;
+       }
+
+       return 0;
+}  
+
+int __init tipar_init_module(void)
+{
+       printk("tipar: parallel link cable driver, version %s\n", VERSION);
+       return tipar_init();
+}
+
+void __exit tipar_cleanup_module(void)
+{
+       unsigned int offset;
+
+       /* Unregistering module */
+       parport_unregister_driver (&tipar_driver);
+
+       devfs_unregister (devfs_handle);
+       devfs_unregister_chrdev(TIPAR_MAJOR, "tipar");  
+
+       for (offset = 0; offset < PP_NO; offset++) {
+               if (table[offset].dev == NULL)
+                       continue;
+               parport_unregister_device(table[offset].dev);
+       }
+}
+
+__setup("tipar=", tipar_setup);
+module_init(tipar_init_module);
+module_exit(tipar_cleanup_module);
+
+MODULE_AUTHOR("Author/Maintainer: Romain Lievin <roms@lpg.ticalc.org>");
+MODULE_DESCRIPTION("Device driver for TI/PC parallel link cables");
+MODULE_LICENSE("GPL");
+
+EXPORT_NO_SYMBOLS;
+
+MODULE_PARM(timeout, "i");
+MODULE_PARM_DESC(timeout, "Timeout, default=1.5 seconds");
+MODULE_PARM(delay, "i");
+MODULE_PARM_DESC(delay, "Inter-bit delay, default=10 microseconds");
--- linux.orig/include/linux/ticable.h  Wed Mar 13 19:42:30 2002
+++ linux/include/linux/ticable.h       Wed Mar 13 21:25:03 2002
@@ -0,0 +1,41 @@
+/* Hey EMACS -*- linux-c -*-
+ *
+ * tipar/tiser/tiglusb - low level driver for handling link cables
+ * designed for Texas Instruments graphing calculators.
+ *
+ * Copyright (C) 2000-2002, Romain Lievin <roms@lpg.ticalc.org>
+ * under the terms of the GNU General Public License.
+ */
+
+#ifndef TICABLE_H 
+#define TICABLE_H 1
+
+/* Internal default constants for the kernel module */
+#define TIMAXTIME 10      /* 1 second                          */
+#define IO_DELAY  10      /* 10 micro-seconds  */
+
+/* Major & minor number for character devices */
+#define TIPAR_MAJOR   61
+#define TIPAR_MINOR_0  1
+#define TIPAR_MINOR_1  2
+#define TIPAR_MINOR_2  3
+
+#define TISER_MAJOR   62
+#define TISER_MINOR_0  1
+#define TISER_MINOR_1  2
+#define TISER_MINOR_2  3
+#define TISER_MINOR_3  4
+
+/*
+ * Request values for the 'ioctl' function.
+ * Simply pass the appropriate value as arg of the ioctl call.
+ * These values do not conflict with other ones but they have to be
+ * allocated... (/usr/src/linux/Documentation/ioctl-number.txt).
+ */
+#define TIPAR_DELAY     _IOW('p', 0xa8, int) /* set delay   */
+#define TIPAR_TIMEOUT   _IOW('p', 0xa9, int) /* set timeout */
+
+#define TISER_DELAY     _IOW('p', 0xa0, int) /* set delay   */
+#define TISER_TIMEOUT   _IOW('p', 0xa1, int) /* set timeout */
+
+#endif /* TICABLE_H */


Romain.

---
Romain Liévin (aka roms)
http://lpg.ticalc.org/prj_tilp, prj_usb, prj_tidev, prj_gtktiemu
mail: roms@lpg.ticalc.org

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-03-13 19:21 Romain Liévin
@ 2002-03-13 19:43 ` Alan Cox
  2002-03-13 20:28   ` Romain Liévin
  2002-03-14  7:08 ` Zwane Mwaikambo
  1 sibling, 1 reply; 341+ messages in thread
From: Alan Cox @ 2002-03-13 19:43 UTC (permalink / raw)
  To: Romain Liévin; +Cc: Kernel List, Linus Torvalds, Alan Cox, Tim Waugh

> It has been tested on x86 for almost 2 years and on Alpha & Sparc too with 
> various calculators.

One oddity - some other comments

> +static int tipar_open(struct inode *inode, struct file *file)
> +{
> +       unsigned int minor = minor(inode->i_rdev) - TIPAR_MINOR_0;
> +
> +       if (minor >= PP_NO)
> +               return -ENXIO;  
> +       
> +       init_ti_parallel(minor);
> +
> +       MOD_INC_USE_COUNT;

You should remove these and use in 2.4 + . Also what stops multiple
simultaneous runs of init_ti_parallel if two people open it at once ?
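
To make the question concrete: one common answer is an atomic "already open" flag per port, so that only the first opener runs the init routine and later openers get -EBUSY. The following is a minimal userspace sketch of that idea using C11 atomics, not the driver's actual code. The names port_state, port_open and port_close are made up for illustration, and init_runs merely stands in for a call to init_ti_parallel(). In the kernel itself one would more likely use test_and_set_bit() or a semaphore.

```c
#include <errno.h>
#include <stdatomic.h>

#define PP_NO 3   /* number of ports, as in the patch */

/* Hypothetical per-port state; not the patch's table[] structure. */
struct port_state {
	atomic_flag opened;   /* set while one opener holds the port */
	int init_runs;        /* stands in for calls to init_ti_parallel() */
};

static struct port_state ports[PP_NO] = {
	{ ATOMIC_FLAG_INIT, 0 },
	{ ATOMIC_FLAG_INIT, 0 },
	{ ATOMIC_FLAG_INIT, 0 },
};

/* Returns 0 for the first opener, -EBUSY while the port is held. */
static int port_open(unsigned int minor)
{
	if (minor >= PP_NO)
		return -ENXIO;

	/* Atomically claim the port; a second caller sees the flag set. */
	if (atomic_flag_test_and_set(&ports[minor].opened))
		return -EBUSY;

	ports[minor].init_runs++;   /* only one caller can get here at a time */
	return 0;
}

/* Release the port so that a later open can succeed again. */
static void port_close(unsigned int minor)
{
	if (minor < PP_NO)
		atomic_flag_clear(&ports[minor].opened);
}
```

With this in place a second port_open(0) fails with -EBUSY until port_close(0) is called, so the init routine can never run twice concurrently for the same port.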


> +static unsigned int tipar_poll(struct file *file, poll_table * wait)
> +{
> +       unsigned int mask=0;
> +       return mask;
> +}

That seems unfinished ??

> +static int tipar_ioctl(struct inode *inode, struct file *file,
> +                      unsigned int cmd, unsigned long arg)
> +       case O_NONBLOCK:
> +               file->f_flags |= O_NONBLOCK;
> +               return 0;

O_NDELAY is set by fcntl - your driver never needs this.

> +       default:
> +               retval = -EINVAL;

SuS says -ENOTTY here (lots of drivers get this wrong still)
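
As an illustration of the convention, here is a small userspace model of an ioctl-style dispatcher whose default case returns -ENOTTY ("inappropriate ioctl for device") instead of -EINVAL. It is only a sketch: the command numbers, struct link_cfg and link_ioctl are invented for the example and are not the driver's real interface.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum {
	CMD_SET_DELAY   = 1,
	CMD_SET_TIMEOUT = 2,
};

/* Hypothetical settings block standing in for the driver's globals. */
struct link_cfg {
	int delay;
	int timeout;
};

/* Dispatch one command; unknown commands get -ENOTTY, not -EINVAL. */
static int link_ioctl(struct link_cfg *cfg, unsigned int cmd, unsigned long arg)
{
	switch (cmd) {
	case CMD_SET_DELAY:
		cfg->delay = (int)arg;
		return 0;
	case CMD_SET_TIMEOUT:
		cfg->timeout = (int)arg;
		return 0;
	default:
		/* The command is not understood by this device at all. */
		return -ENOTTY;
	}
}
```

Keeping -EINVAL for "known command, bad argument" and -ENOTTY for "unknown command" lets a caller tell the two cases apart.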

> +static long long tipar_lseek(struct file * file, long long offset, int origin)
> +{
> +       return -ESPIPE;
> +}

There is a generic no_llseek function

> +/* Major & minor number for character devices */
> +#define TIPAR_MAJOR   61

These don't appear to be officially assigned via lanana ?

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-02-28 13:58 shura
@ 2002-03-01 15:30 ` Jan-Marek Glogowski
  0 siblings, 0 replies; 341+ messages in thread
From: Jan-Marek Glogowski @ 2002-03-01 15:30 UTC (permalink / raw)
  To: shura; +Cc: linux-kernel

Hi

> I'm setting up a new machine with a pair of IDE drives connected to
> HPT 370 controller. I defined a RAID-1 array using the HPT370 bios
> setting utility.
> Description - hard:
> motherboard Abit ST6-RAID, HPT370, 2 identical hard disks as
> primary/secondary master on ide3/ide4
> - bios:
> Primary Master:   Mirror (Raid 1) for Array #0 UDMA 5 78150 BOOT
> Primary Slave:    No drive
> Secondary Master: Mirror (Raid 1) for Array #0 UDMA 5 78150 HIDDEEN
> Secondary Slave:  No drive
> - os:
> Linux RedHat 7.1 & kernel 2.4.17
> with compilation option
> CONFIG_BLK_DEV_ATARAID_HPT=y
> Lilo:
> ...
> root=/dev/hde10

The root should be /dev/ataraid/xxx for any ata raid but that is not the
real problem...

> During system booting i see following
> ...
> ataraid/d0: ataraid/d0p1 ataraid/d0p2 ataraid/d0p3 ataraid/d0p4 <>
> Highpoint HPT370 Softwareraid driver for linux version 0.01
> Drive 0 is 76319 Mb
> Drive 6 is 76319 Mb
> Raid array consists of 2 drivers
> ...
> Kernel panic: VFS: Unable to mount root fs on 21:0a
> ...

Ataraid seems to find four partitions, d0p[1234]. But as far as I know,
mirroring isn't supported by the in-kernel open source drivers at all. You
may want to look at the closed source drivers at www.highpoint-tech.com if
you really need the "hpt native" RAID.
(http://people.redhat.com/arjanv/pdcraid/ataraidhowto.html)

> Booting with option root=/dev/atarad/d0p1 ro
> (or root=/dev/ataraid/d0p10 ro)
> and etc - no effect

If you just need to access the hard disks from Linux, the usual suggestion
is to use Linux software RAID (there was a discussion on lkml, if I
remember right). On modern PCs it uses < 5% CPU and is faster, because it
operates at a high level in the kernel rather than right in front of the
hardware, as the "big software part, small hardware part" RAID controllers
from Promise and Highpoint do.

HTH

Jan-Marek


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-02-25  4:02     ` Alexander Viro
@ 2002-02-26  5:50       ` Rusty Russell
  0 siblings, 0 replies; 341+ messages in thread
From: Rusty Russell @ 2002-02-26  5:50 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-kernel

In message <Pine.GSO.4.21.0202242230370.1549-100000@weyl.math.psu.edu> you write:
> Honour or not, in this case your complaint is hardly deserved.  To
> compress the above a bit:
> 
> you: <false statement>
> me: RTFS.  <short description of the reasons why statement is wrong; further
> details could be obtained by reading TFS>

Al, *please* read.

Rusty said:
> First, fd passing sucks: you can't leave an fd somewhere and wait for
> someone to pick it up, and they vanish when you exit.  Secondly, you
> have some arbitrary limit on the number of semaphores.  Thirdly,
> someone has to own them.

These are all true: I was criticising the "fd == semaphore" approach,
in the context of my "tied to mapped location" approach, and Linus's
"magic cookie" approach.

I went on to explain further:

> Consider tdb, the Trivial Database.  There is no "master locking
> daemon".  There is no way for the first opener (who then has to create
> the semaphores in your model) to pass them to other openers: this is a
> library.

You also managed to ignore my previous comment on the "fd ==
semaphore" approach:

> Implemented exactly that (and posted to l-k IIRC), and it's
> *horrible* to use.

And you came out assuming I had no idea how fd passing works:

> Yes, you can.  Please, RTFS

...and then in the next mail you suggested I implement a "master
locking daemon".
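
For reference, the fd passing under discussion is done with SCM_RIGHTS ancillary data on a Unix domain socket. Below is a minimal sketch of the two halves, assuming an already connected socket; send_fd and recv_fd are illustrative helper names, not functions from TDB or from any patch in this thread. It also makes the limitation visible: sender and receiver must both be attached to the socket, so an fd cannot simply be left somewhere for a later opener.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send `fd` over the connected Unix socket `sock`; 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
	char byte = 'F';   /* at least one byte of real data must accompany it */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`; returns the new fd, or -1 on error. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd = -1;

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	/* The kernel has installed a new descriptor in this process. */
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The received descriptor is a fresh number in the receiving process that refers to the same open file, and it disappears again when that process exits.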

I have taken the liberty of rewriting your reply as I might expect to
see from a peer:

================
From: Al Viro's Polite Twin
To: Rusty Russell
Subject: Re: [PATCH] Lightweight userspace semaphores... 
Date: Two days after hell freezes over

On Mon, 25 Feb 2002, Rusty Russell wrote:
> First, fd passing sucks: you can't leave an fd somewhere and wait for
> someone to pick it up, and they vanish when you exit.  Secondly, you

Have you considered using a daemon to hold the fds?  It shouldn't be
that bad.

================

See how it doesn't assume that I am an idiot?  It's not condescending,
and invites further consideration.  It's also shorter than your other
two replies.

I might have replied as follows:

		Yes, and for a "serious" database it's not a problem, as
	it usually has some kind of daemon anyway.  But for TDB, I
	found that it's fragile and extremely unwieldy.  Creating a
	unix domain socket for each .tdb file may not be possible.
	The tdb_open call would have to fork off a daemon if it's the
	first process to access it.  It starts to get fairly icky:
	certainly when compared with the fairly trivial patch to
	support the "semaphore tied to mapped region" approach.

		You can try if you want (TDB enclosed).

Maybe I'm the only one who finds it *really* painful to continually
deal with your "Dan Bernstein of Linux" approach: enough that it
hinders my kernel work.

Genuinely hope this helps,
Rusty.
--
  Taste: it's not just for source code anymore...

#ifndef __TDB_H__
#define __TDB_H__
/* 
   Unix SMB/Netbios implementation.
   Version 3.0
   Samba database functions
   Copyright (C) Andrew Tridgell 1999
   
   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
   (at your option) any later version.
   
   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.
   
   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
#ifdef  __cplusplus
extern "C" {
#endif

/* flags to tdb_store() */
#define TDB_REPLACE 1
#define TDB_INSERT 2
#define TDB_MODIFY 3

/* flags for tdb_open() */
#define TDB_DEFAULT 0 /* just a readability place holder */
#define TDB_CLEAR_IF_FIRST 1
#define TDB_INTERNAL 2 /* don't store on disk */
#define TDB_NOLOCK   4 /* don't do any locking */
#define TDB_NOMMAP   8 /* don't use mmap */
#define TDB_CONVERT 16 /* convert endian (internal use) */

#define TDB_ERRCODE(code, ret) ((tdb->ecode = (code)), ret)

/* error codes */
enum TDB_ERROR {TDB_SUCCESS=0, TDB_ERR_CORRUPT, TDB_ERR_IO, TDB_ERR_LOCK, 
		TDB_ERR_OOM, TDB_ERR_EXISTS, TDB_ERR_NOEXIST, TDB_ERR_NOLOCK };

#ifndef u32
#define u32 unsigned
#endif

typedef struct {
	char *dptr;
	size_t dsize;
} TDB_DATA;

typedef u32 tdb_len;
typedef u32 tdb_off;

/* this is stored at the front of every database */
struct tdb_header {
	char magic_food[32]; /* for /etc/magic */
	u32 version; /* version of the code */
	u32 hash_size; /* number of hash entries */
	tdb_off rwlocks;
	tdb_off reserved[31];
};

struct tdb_lock_type {
	u32 count;
	u32 ltype;
};

struct tdb_traverse_lock {
	struct tdb_traverse_lock *next;
	u32 off;
	u32 hash;
};

/* this is the context structure that is returned from a db open */
typedef struct tdb_context {
	char *name; /* the name of the database */
	void *map_ptr; /* where it is currently mapped */
	int fd; /* open file descriptor for the database */
	tdb_len map_size; /* how much space has been mapped */
	int read_only; /* opened read-only */
	struct tdb_lock_type *locked; /* array of chain locks */
	enum TDB_ERROR ecode; /* error code for last tdb error */
	struct tdb_header header; /* a cached copy of the header */
	u32 flags; /* the flags passed to tdb_open */
	u32 *lockedkeys; /* array of locked keys: first is #keys */
	struct tdb_traverse_lock travlocks; /* current traversal locks */
	struct tdb_context *next; /* all tdbs to avoid multiple opens */
	dev_t device;	/* uniquely identifies this tdb */
	ino_t inode;	/* uniquely identifies this tdb */
} TDB_CONTEXT;

typedef int (*tdb_traverse_func)(TDB_CONTEXT *, TDB_DATA, TDB_DATA, void *);
typedef void (*tdb_log_func)(TDB_CONTEXT *, int , const char *, ...);

TDB_CONTEXT *tdb_open(char *name, int hash_size, int tdb_flags,
		      int open_flags, mode_t mode);

enum TDB_ERROR tdb_error(TDB_CONTEXT *tdb);
const char *tdb_errorstr(TDB_CONTEXT *tdb);
TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key);
int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key);
int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag);
int tdb_close(TDB_CONTEXT *tdb);
TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb);
TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA key);
int tdb_traverse(TDB_CONTEXT *tdb, tdb_traverse_func fn, void *state);
int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key);
int tdb_lockkeys(TDB_CONTEXT *tdb, u32 number, TDB_DATA keys[]);
void tdb_unlockkeys(TDB_CONTEXT *tdb);
int tdb_lockall(TDB_CONTEXT *tdb);
void tdb_unlockall(TDB_CONTEXT *tdb);

/* Low level locking functions: use with care */
int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key);
void tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key);

/* Debug functions. Not used in production. */
void tdb_dump_all(TDB_CONTEXT *tdb);
void tdb_printfreelist(TDB_CONTEXT *tdb);

extern TDB_DATA tdb_null;
#ifdef  __cplusplus
}
#endif

#endif /* tdb.h */
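
The tdb.c that follows gets by without a locking daemon by taking fcntl() locks on single bytes of the database file and counting nested acquisitions itself, since fcntl locks don't nest. Here is a minimal standalone sketch of that pattern. The names byte_lock, nested_acquire, nested_release and struct nested_lock are illustrative only and are not part of the TDB API; the sketch always takes a write lock, whereas the real code passes the lock type through.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Lock or unlock one byte at `offset`; rw_type is F_RDLCK, F_WRLCK or F_UNLCK.
   Blocks until the lock is granted. Returns 0 on success, -1 on error. */
static int byte_lock(int fd, off_t offset, short rw_type)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = rw_type;
	fl.l_whence = SEEK_SET;
	fl.l_start = offset;
	fl.l_len = 1;

	return fcntl(fd, F_SETLKW, &fl) == -1 ? -1 : 0;
}

/* Per-process nesting counter for one lock byte. */
struct nested_lock {
	int count;
};

/* Take the lock: only the outermost acquisition touches the file. */
static int nested_acquire(struct nested_lock *nl, int fd, off_t offset)
{
	if (nl->count == 0 && byte_lock(fd, offset, F_WRLCK) != 0)
		return -1;
	nl->count++;
	return 0;
}

/* Drop the lock: only the last release unlocks the byte in the file. */
static int nested_release(struct nested_lock *nl, int fd, off_t offset)
{
	if (nl->count == 0)
		return -1;   /* unbalanced release */
	if (nl->count == 1 && byte_lock(fd, offset, F_UNLCK) != 0)
		return -1;
	nl->count--;
	return 0;
}
```

Because the locks live on the file itself, every process that opens the same file contends on the same bytes without anyone having to create or own a separate lock object.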

 /* 
   Unix SMB/Netbios implementation.
   Version 3.0
   Samba database functions
   Copyright (C) Andrew Tridgell              1999-2000
   Copyright (C) Luke Kenneth Casson Leighton      2000
   Copyright (C) Paul `Rusty' Russell		   2000
   Copyright (C) Jeremy Allison			   2000
   
   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
   (at your option) any later version.
   
   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.
   
   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*/
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include "tdb.h"

#define TDB_MAGIC_FOOD "TDB file\n"
#define TDB_VERSION (0x26011967 + 6)
#define TDB_MAGIC (0x26011999U)
#define TDB_FREE_MAGIC (~TDB_MAGIC)
#define TDB_DEAD_MAGIC (0xFEE1DEAD)
#define TDB_ALIGNMENT 4
#define MIN_REC_SIZE (2*sizeof(struct list_struct) + TDB_ALIGNMENT)
#define DEFAULT_HASH_SIZE 131
#define TDB_PAGE_SIZE 0x2000
#define FREELIST_TOP (sizeof(struct tdb_header))
#define TDB_ALIGN(x,a) (((x) + (a)-1) & ~((a)-1))
#define TDB_BYTEREV(x) (((((x)&0xff)<<24)|((x)&0xFF00)<<8)|(((x)>>8)&0xFF00)|((x)>>24))
#define TDB_DEAD(r) ((r)->magic == TDB_DEAD_MAGIC)
#define TDB_BAD_MAGIC(r) ((r)->magic != TDB_MAGIC && !TDB_DEAD(r))
#define TDB_HASH_TOP(hash) (FREELIST_TOP + (BUCKET(hash)+1)*sizeof(tdb_off))

/* lock offsets */
#define GLOBAL_LOCK 0
#define ACTIVE_LOCK 4

#ifndef MAP_FILE
#define MAP_FILE 0
#endif

#ifndef MAP_FAILED
#define MAP_FAILED ((void *)-1)
#endif

#define BUCKET(hash) ((hash) % tdb->header.hash_size)
TDB_DATA tdb_null;

/* all contexts, to ensure no double-opens (fcntl locks don't nest!) */
static TDB_CONTEXT *tdbs = NULL;

static void tdb_munmap(TDB_CONTEXT *tdb)
{
	if (tdb->flags & TDB_INTERNAL)
		return;

	if (tdb->map_ptr)
		munmap(tdb->map_ptr, tdb->map_size);
	tdb->map_ptr = NULL;
}

static void tdb_mmap(TDB_CONTEXT *tdb)
{
	if (tdb->flags & TDB_INTERNAL)
		return;

	if (!(tdb->flags & TDB_NOMMAP)) {
		tdb->map_ptr = mmap(NULL, tdb->map_size, 
				    PROT_READ|(tdb->read_only? 0:PROT_WRITE), 
				    MAP_SHARED|MAP_FILE, tdb->fd, 0);

		/*
		 * NB. When mmap fails it returns MAP_FAILED *NOT* NULL !!!!
		 */

		if (tdb->map_ptr == MAP_FAILED)
			tdb->map_ptr = NULL;
	} else {
		tdb->map_ptr = NULL;
	}
}

/* Endian conversion: we only ever deal with 4 byte quantities */
static void *convert(void *buf, u32 size)
{
	u32 i, *p = buf;
	for (i = 0; i < size / 4; i++)
		p[i] = TDB_BYTEREV(p[i]);
	return buf;
}
#define DOCONV() (tdb->flags & TDB_CONVERT)
#define CONVERT(x) (DOCONV() ? convert(&x, sizeof(x)) : &x)

/* the body of the database is made of one list_struct for the free space
   plus a separate data list for each hash value */
struct list_struct {
	tdb_off next; /* offset of the next record in the list */
	tdb_len rec_len; /* total byte length of record */
	tdb_len key_len; /* byte length of key */
	tdb_len data_len; /* byte length of data */
	u32 full_hash; /* the full 32 bit hash of the key */
	u32 magic;   /* try to catch errors */
	/* the following union is implied:
		union {
			char record[rec_len];
			struct {
				char key[key_len];
				char data[data_len];
			}
			u32 totalsize; (tailer)
		}
	*/
};

/* a byte range locking function - return 0 on success
   this function locks/unlocks 1 byte at the specified offset.

   On error, errno is also set so that errors are passed back properly
   through tdb_open(). */
static int tdb_brlock(TDB_CONTEXT *tdb, tdb_off offset, 
		      int rw_type, int lck_type)
{
	struct flock fl;

	if (tdb->flags & TDB_NOLOCK)
		return 0;
	if (tdb->read_only) {
		errno = EACCES;
		return -1;
	}

	fl.l_type = rw_type;
	fl.l_whence = SEEK_SET;
	fl.l_start = offset;
	fl.l_len = 1;
	fl.l_pid = 0;

	if (fcntl(tdb->fd,lck_type,&fl)) {
		/* errno set by fcntl */
		return TDB_ERRCODE(TDB_ERR_LOCK, -1);
	}
	return 0;
}

/* lock a list in the database. list -1 is the alloc list */
static int tdb_lock(TDB_CONTEXT *tdb, int list, int ltype)
{
	if (list < -1 || list >= (int)tdb->header.hash_size) {
		return -1;
	}
	if (tdb->flags & TDB_NOLOCK)
		return 0;

	/* Since fcntl locks don't nest, we do a lock for the first one,
	   and simply bump the count for future ones */
	if (tdb->locked[list+1].count == 0) {
		if (tdb_brlock(tdb,FREELIST_TOP+4*list,ltype,F_SETLKW)) {
			return -1;
		}
		tdb->locked[list+1].ltype = ltype;
	}
	tdb->locked[list+1].count++;
	return 0;
}

/* unlock the database: returns void because it's too late for errors. */
static void tdb_unlock(TDB_CONTEXT *tdb, int list, int ltype)
{
	if (tdb->flags & TDB_NOLOCK)
		return;

	/* Sanity checks */
	if (list < -1 || list >= (int)tdb->header.hash_size)
		return;
	if (tdb->locked[list+1].count==0)
		return;

	if (tdb->locked[list+1].count == 1) {
		/* Down to last nested lock: unlock underneath */
		tdb_brlock(tdb, FREELIST_TOP+4*list, F_UNLCK, F_SETLKW);
	}
	tdb->locked[list+1].count--;
}

/* This is based on the hash algorithm from gdbm */
static u32 tdb_hash(TDB_DATA *key)
{
	u32 value;	/* Used to compute the hash value.  */
	u32   i;	/* Used to cycle through random values. */

	/* Set the initial value from the key size. */
	for (value = 0x238F13AF * key->dsize, i=0; i < key->dsize; i++)
		value = (value + (key->dptr[i] << (i*5 % 24)));

	return (1103515243 * value + 12345);  
}

/* check for an out of bounds access - if it is out of bounds then
   see if the database has been expanded by someone else and expand
   if necessary 
   note that "len" is the minimum length needed for the db
*/
static int tdb_oob(TDB_CONTEXT *tdb, tdb_off len)
{
	struct stat st;
	if (len <= tdb->map_size)
		return 0;
	if (tdb->flags & TDB_INTERNAL) {
		return TDB_ERRCODE(TDB_ERR_IO, -1);
	}

	if (fstat(tdb->fd, &st) == -1)
		return TDB_ERRCODE(TDB_ERR_IO, -1);

	if (st.st_size < (size_t)len) {
		return TDB_ERRCODE(TDB_ERR_IO, -1);
	}

	/* Unmap, update size, remap */
	tdb_munmap(tdb);
	tdb->map_size = st.st_size;
	tdb_mmap(tdb);
	return 0;
}

/* write a lump of data at a specified offset */
static int tdb_write(TDB_CONTEXT *tdb, tdb_off off, void *buf, tdb_len len)
{
	if (tdb_oob(tdb, off + len) != 0)
		return -1;

	if (tdb->map_ptr)
		memcpy(off + (char *)tdb->map_ptr, buf, len);
	else if (lseek(tdb->fd, off, SEEK_SET) != off
		 || write(tdb->fd, buf, len) != (ssize_t)len) {
		return TDB_ERRCODE(TDB_ERR_IO, -1);
	}
	return 0;
}

/* read a lump of data at a specified offset, maybe convert */
static int tdb_read(TDB_CONTEXT *tdb,tdb_off off,void *buf,tdb_len len,int cv)
{
	if (tdb_oob(tdb, off + len) != 0)
		return -1;

	if (tdb->map_ptr)
		memcpy(buf, off + (char *)tdb->map_ptr, len);
	else if (lseek(tdb->fd, off, SEEK_SET) != off
		 || read(tdb->fd, buf, len) != (ssize_t)len) {
		return TDB_ERRCODE(TDB_ERR_IO, -1);
	}
	if (cv)
		convert(buf, len);
	return 0;
}

/* read a lump of data, allocating the space for it */
static char *tdb_alloc_read(TDB_CONTEXT *tdb, tdb_off offset, tdb_len len)
{
	char *buf;

	if (!(buf = malloc(len))) {
		return TDB_ERRCODE(TDB_ERR_OOM, buf);
	}
	if (tdb_read(tdb, offset, buf, len, 0) == -1) {
		free(buf);
		return NULL;
	}
	return buf;
}

/* read/write a tdb_off */
static int ofs_read(TDB_CONTEXT *tdb, tdb_off offset, tdb_off *d)
{
	return tdb_read(tdb, offset, (char*)d, sizeof(*d), DOCONV());
}
static int ofs_write(TDB_CONTEXT *tdb, tdb_off offset, tdb_off *d)
{
	tdb_off off = *d;
	return tdb_write(tdb, offset, CONVERT(off), sizeof(*d));
}

/* read/write a record */
static int rec_read(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec)
{
	if (tdb_read(tdb, offset, rec, sizeof(*rec),DOCONV()) == -1)
		return -1;
	if (TDB_BAD_MAGIC(rec)) {
		return TDB_ERRCODE(TDB_ERR_CORRUPT, -1);
	}
	return tdb_oob(tdb, rec->next+sizeof(*rec));
}
static int rec_write(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec)
{
	struct list_struct r = *rec;
	return tdb_write(tdb, offset, CONVERT(r), sizeof(r));
}

/* read a freelist record and check for simple errors */
static int rec_free_read(TDB_CONTEXT *tdb, tdb_off off, struct list_struct *rec)
{
	if (tdb_read(tdb, off, rec, sizeof(*rec),DOCONV()) == -1)
		return -1;
	if (rec->magic != TDB_FREE_MAGIC) {
		return TDB_ERRCODE(TDB_ERR_CORRUPT, -1);
	}
	if (tdb_oob(tdb, rec->next+sizeof(*rec)) != 0)
		return -1;
	return 0;
}

/* update a record tailer (must hold allocation lock) */
static int update_tailer(TDB_CONTEXT *tdb, tdb_off offset,
			 const struct list_struct *rec)
{
	tdb_off totalsize;

	/* Offset of tailer from record header */
	totalsize = sizeof(*rec) + rec->rec_len;
	return ofs_write(tdb, offset + totalsize - sizeof(tdb_off),
			 &totalsize);
}

static tdb_off tdb_dump_record(TDB_CONTEXT *tdb, tdb_off offset)
{
	struct list_struct rec;
	tdb_off tailer_ofs, tailer;

	if (tdb_read(tdb, offset, (char *)&rec, sizeof(rec), DOCONV()) == -1) {
		printf("ERROR: failed to read record at %u\n", offset);
		return 0;
	}

	printf(" rec: offset=%u next=%d rec_len=%d key_len=%d data_len=%d full_hash=0x%x magic=0x%x\n",
	       offset, rec.next, rec.rec_len, rec.key_len, rec.data_len, rec.full_hash, rec.magic);

	tailer_ofs = offset + sizeof(rec) + rec.rec_len - sizeof(tdb_off);
	if (ofs_read(tdb, tailer_ofs, &tailer) == -1) {
		printf("ERROR: failed to read tailer at %u\n", tailer_ofs);
		return rec.next;
	}

	if (tailer != rec.rec_len + sizeof(rec)) {
		printf("ERROR: tailer does not match record! tailer=%u totalsize=%u\n", tailer, (unsigned)(rec.rec_len + sizeof(rec)));
	}
	return rec.next;
}

static void tdb_dump_chain(TDB_CONTEXT *tdb, int i)
{
	tdb_off rec_ptr, top;

	top = TDB_HASH_TOP(i);

	tdb_lock(tdb, i, F_WRLCK);

	if (ofs_read(tdb, top, &rec_ptr) == -1) {
		tdb_unlock(tdb, i, F_WRLCK);
		return;
	}

	if (rec_ptr)
		printf("hash=%d\n", i);

	while (rec_ptr) {
		rec_ptr = tdb_dump_record(tdb, rec_ptr);
	}
	tdb_unlock(tdb, i, F_WRLCK);
}

void tdb_dump_all(TDB_CONTEXT *tdb)
{
	int i;
	for (i=0;i<tdb->header.hash_size;i++) {
		tdb_dump_chain(tdb, i);
	}
	printf("freelist:\n");
	tdb_dump_chain(tdb, -1);
}

void tdb_printfreelist(TDB_CONTEXT *tdb)
{
	long total_free = 0;
	tdb_off offset, rec_ptr, last_ptr;
	struct list_struct rec;

	tdb_lock(tdb, -1, F_WRLCK);

	last_ptr = 0;
	offset = FREELIST_TOP;

	/* read in the freelist top */
	if (ofs_read(tdb, offset, &rec_ptr) == -1) {
		/* don't leak the freelist lock on the error paths */
		tdb_unlock(tdb, -1, F_WRLCK);
		return;
	}

	printf("freelist top=[0x%08x]\n", rec_ptr );
	while (rec_ptr) {
		if (tdb_read(tdb, rec_ptr, (char *)&rec, sizeof(rec), DOCONV()) == -1) {
			tdb_unlock(tdb, -1, F_WRLCK);
			return;
		}

		if (rec.magic != TDB_FREE_MAGIC) {
			printf("bad magic 0x%08x in free list\n", rec.magic);
			tdb_unlock(tdb, -1, F_WRLCK);
			return;
		}

		printf("entry offset=[0x%08x], rec.rec_len = [0x%08x (%d)]\n", rec.next, rec.rec_len, rec.rec_len );
		total_free += rec.rec_len;

		/* move to the next record */
		rec_ptr = rec.next;
	}
	printf("total rec_len = [0x%08x (%d)]\n", (int)total_free, 
               (int)total_free);

	tdb_unlock(tdb, -1, F_WRLCK);
}

/* Remove an element from the freelist.  Must have alloc lock. */
static int remove_from_freelist(TDB_CONTEXT *tdb, tdb_off off, tdb_off next)
{
	tdb_off last_ptr, i;

	/* read in the freelist top */
	last_ptr = FREELIST_TOP;
	while (ofs_read(tdb, last_ptr, &i) != -1 && i != 0) {
		if (i == off) {
			/* We've found it! */
			return ofs_write(tdb, last_ptr, &next);
		}
		/* Follow chain (next offset is at start of record) */
		last_ptr = i;
	}
	return TDB_ERRCODE(TDB_ERR_CORRUPT, -1);
}

/* Add an element into the freelist. Merge adjacent records if
   necessary. */
static int tdb_free(TDB_CONTEXT *tdb, tdb_off offset, struct list_struct *rec)
{
	tdb_off right, left;

	/* Allocation and tailer lock */
	if (tdb_lock(tdb, -1, F_WRLCK) != 0)
		return -1;

	/* set an initial tailer, so if we fail we don't leave a bogus record */
	update_tailer(tdb, offset, rec);

	/* Look right first (I'm an Australian, dammit) */
	right = offset + sizeof(*rec) + rec->rec_len;
	if (right + sizeof(*rec) <= tdb->map_size) {
		struct list_struct r;

		if (tdb_read(tdb, right, &r, sizeof(r), DOCONV()) == -1) {
			goto left;
		}

		/* If it's free, expand to include it. */
		if (r.magic == TDB_FREE_MAGIC) {
			if (remove_from_freelist(tdb, right, r.next) == -1) {
				goto left;
			}
			rec->rec_len += sizeof(r) + r.rec_len;
		}
	}

left:
	/* Look left */
	left = offset - sizeof(tdb_off);
	if (left > TDB_HASH_TOP(tdb->header.hash_size-1)) {
		struct list_struct l;
		tdb_off leftsize;

		/* Read in tailer and jump back to header */
		if (ofs_read(tdb, left, &leftsize) == -1) {
			goto update;
		}
		left = offset - leftsize;

		/* Now read in record */
		if (tdb_read(tdb, left, &l, sizeof(l), DOCONV()) == -1) {
			goto update;
		}

		/* If it's free, expand to include it. */
		if (l.magic == TDB_FREE_MAGIC) {
			if (remove_from_freelist(tdb, left, l.next) == -1) {
				goto update;
			} else {
				offset = left;
				rec->rec_len += leftsize;
			}
		}
	}

update:
	if (update_tailer(tdb, offset, rec) == -1) {
		goto fail;
	}

	/* Now, prepend to free list */
	rec->magic = TDB_FREE_MAGIC;

	if (ofs_read(tdb, FREELIST_TOP, &rec->next) == -1 ||
	    rec_write(tdb, offset, rec) == -1 ||
	    ofs_write(tdb, FREELIST_TOP, &offset) == -1) {
		goto fail;
	}

	/* And we're done. */
	tdb_unlock(tdb, -1, F_WRLCK);
	return 0;

 fail:
	tdb_unlock(tdb, -1, F_WRLCK);
	return -1;
}


/* expand a file.  we prefer to use ftruncate, as that is what posix
  says to use for mmap expansion */
static int expand_file(TDB_CONTEXT *tdb, tdb_off size, tdb_off addition)
{
	char buf[1024];

	if (ftruncate(tdb->fd, size+addition) != 0) {
		return -1;
	}

	/* now fill the file with something. This ensures that the file isn't sparse, which would be
	   very bad if we ran out of disk. This must be done with write, not via mmap */
	memset(buf, 0x42, sizeof(buf));
	while (addition) {
		int n = addition>sizeof(buf)?sizeof(buf):addition;
		int ret;
		if (lseek(tdb->fd, size, SEEK_SET) != size)
			return -1;
		ret = write(tdb->fd, buf, n);
		if (ret != n) {
			return -1;
		}
		addition -= n;
		size += n;
	}
	return 0;
}


/* expand the database at least size bytes by expanding the underlying
   file and doing the mmap again if necessary */
static int tdb_expand(TDB_CONTEXT *tdb, tdb_off size)
{
	struct list_struct rec;
	tdb_off offset;

	if (tdb_lock(tdb, -1, F_WRLCK) == -1) {
		return -1;
	}

	/* must know about any previous expansions by another process */
	tdb_oob(tdb, tdb->map_size + 1);

	/* always make room for at least 10 more records, and round
           the database up to a multiple of TDB_PAGE_SIZE */
	size = TDB_ALIGN(tdb->map_size + size*10, TDB_PAGE_SIZE) - tdb->map_size;

	if (!(tdb->flags & TDB_INTERNAL))
		tdb_munmap(tdb);

	/*
	 * We must ensure the file is unmapped before doing this
	 * to ensure consistency with systems like OpenBSD where
	 * writes and mmaps are not consistent.
	 */

	/* expand the file itself */
	if (!(tdb->flags & TDB_INTERNAL)) {
		if (expand_file(tdb, tdb->map_size, size) != 0)
			goto fail;
	}

	tdb->map_size += size;

	if (tdb->flags & TDB_INTERNAL)
		tdb->map_ptr = realloc(tdb->map_ptr, tdb->map_size);
	else {
		/*
		 * We must ensure the file is remapped before adding the space
		 * to ensure consistency with systems like OpenBSD where
		 * writes and mmaps are not consistent.
		 */

		/* We're ok if the mmap fails as we'll fallback to read/write */
		tdb_mmap(tdb);
	}

	/* form a new freelist record */
	memset(&rec,'\0',sizeof(rec));
	rec.rec_len = size - sizeof(rec);

	/* link it into the free list */
	offset = tdb->map_size - size;
	if (tdb_free(tdb, offset, &rec) == -1)
		goto fail;

	tdb_unlock(tdb, -1, F_WRLCK);
	return 0;
 fail:
	tdb_unlock(tdb, -1, F_WRLCK);
	return -1;
}

/* allocate some space from the free list. The offset returned points
   to an unconnected list_struct within the database with room for at
   least length bytes of total data

   0 is returned if the space could not be allocated
 */
static tdb_off tdb_allocate(TDB_CONTEXT *tdb, tdb_len length,
			    struct list_struct *rec)
{
	tdb_off rec_ptr, last_ptr, newrec_ptr;
	struct list_struct newrec;

	if (tdb_lock(tdb, -1, F_WRLCK) == -1)
		return 0;

	/* Extra bytes required for tailer */
	length += sizeof(tdb_off);

 again:
	last_ptr = FREELIST_TOP;

	/* read in the freelist top */
	if (ofs_read(tdb, FREELIST_TOP, &rec_ptr) == -1)
		goto fail;

	/* keep looking until we find a freelist record big enough */
	while (rec_ptr) {
		if (rec_free_read(tdb, rec_ptr, rec) == -1)
			goto fail;

		if (rec->rec_len >= length) {
			/* found it - now possibly split it up  */
			if (rec->rec_len > length + MIN_REC_SIZE) {
				/* Length of left piece */
				length = TDB_ALIGN(length, TDB_ALIGNMENT);

				/* Right piece to go on free list */
				newrec.rec_len = rec->rec_len
					- (sizeof(*rec) + length);
				newrec_ptr = rec_ptr + sizeof(*rec) + length;

				/* And left record is shortened */
				rec->rec_len = length;
			} else
				newrec_ptr = 0;

			/* Remove allocated record from the free list */
			if (ofs_write(tdb, last_ptr, &rec->next) == -1)
				goto fail;

			/* Update header: do this before we drop alloc
                           lock, otherwise tdb_free() might try to
                           merge with us, thinking we're free.
                           (Thanks Jeremy Allison). */
			rec->magic = TDB_MAGIC;
			if (rec_write(tdb, rec_ptr, rec) == -1)
				goto fail;

			/* Did we create new block? */
			if (newrec_ptr) {
				/* Update allocated record tailer (we
                                   shortened it). */
				if (update_tailer(tdb, rec_ptr, rec) == -1)
					goto fail;

				/* Free new record */
				if (tdb_free(tdb, newrec_ptr, &newrec) == -1)
					goto fail;
			}

			/* all done - return the new record offset */
			tdb_unlock(tdb, -1, F_WRLCK);
			return rec_ptr;
		}
		/* move to the next record */
		last_ptr = rec_ptr;
		rec_ptr = rec->next;
	}
	/* we didn't find enough space. See if we can expand the
	   database and if we can then try again */
	if (tdb_expand(tdb, length + sizeof(*rec)) == 0)
		goto again;
 fail:
	tdb_unlock(tdb, -1, F_WRLCK);
	return 0;
}

/* initialise a new database with a specified hash size */
static int tdb_new_database(TDB_CONTEXT *tdb, int hash_size)
{
	struct tdb_header *newdb;
	int size, ret = -1;

	/* We make it up in memory, then write it out if not internal */
	size = sizeof(struct tdb_header) + (hash_size+1)*sizeof(tdb_off);
	if (!(newdb = calloc(size, 1)))
		return TDB_ERRCODE(TDB_ERR_OOM, -1);

	/* Fill in the header */
	newdb->version = TDB_VERSION;
	newdb->hash_size = hash_size;
	if (tdb->flags & TDB_INTERNAL) {
		tdb->map_size = size;
		tdb->map_ptr = (char *)newdb;
		memcpy(&tdb->header, newdb, sizeof(tdb->header));
		/* Convert the `ondisk' version if asked. */
		CONVERT(*newdb);
		return 0;
	}
	if (lseek(tdb->fd, 0, SEEK_SET) == -1)
		goto fail;

	if (ftruncate(tdb->fd, 0) == -1)
		goto fail;

	/* This creates an endian-converted header, as if read from disk */
	CONVERT(*newdb);
	memcpy(&tdb->header, newdb, sizeof(tdb->header));
	/* Don't endian-convert the magic food! */
	memcpy(newdb->magic_food, TDB_MAGIC_FOOD, strlen(TDB_MAGIC_FOOD)+1);
	if (write(tdb->fd, newdb, size) != size)
		ret = -1;
	else
		ret = 0;

  fail:
	free(newdb);
	return ret;
}

/* Returns 0 on fail.  On success, return offset of record, and fills
   in rec */
static tdb_off tdb_find(TDB_CONTEXT *tdb, TDB_DATA key, u32 hash,
			struct list_struct *r)
{
	tdb_off rec_ptr;
	
	/* read in the hash top */
	if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1)
		return 0;

	/* keep looking until we find the right record */
	while (rec_ptr) {
		if (rec_read(tdb, rec_ptr, r) == -1)
			return 0;

		if (!TDB_DEAD(r) && hash==r->full_hash && key.dsize==r->key_len) {
			char *k;
			/* a very likely hit - read the key */
			k = tdb_alloc_read(tdb, rec_ptr + sizeof(*r), 
					   r->key_len);
			if (!k)
				return 0;

			if (memcmp(key.dptr, k, key.dsize) == 0) {
				free(k);
				return rec_ptr;
			}
			free(k);
		}
		rec_ptr = r->next;
	}
	return TDB_ERRCODE(TDB_ERR_NOEXIST, 0);
}

/* If they do lockkeys, check that this hash is one they locked */
static int tdb_keylocked(TDB_CONTEXT *tdb, u32 hash)
{
	u32 i;
	if (!tdb->lockedkeys)
		return 1;
	for (i = 0; i < tdb->lockedkeys[0]; i++)
		if (tdb->lockedkeys[i+1] == hash)
			return 1;
	return TDB_ERRCODE(TDB_ERR_NOLOCK, 0);
}

/* As tdb_find, but if you succeed, keep the lock */
static tdb_off tdb_find_lock(TDB_CONTEXT *tdb, TDB_DATA key, int locktype,
			     struct list_struct *rec)
{
	u32 hash, rec_ptr;

	hash = tdb_hash(&key);
	if (!tdb_keylocked(tdb, hash))
		return 0;
	if (tdb_lock(tdb, BUCKET(hash), locktype) == -1)
		return 0;
	if (!(rec_ptr = tdb_find(tdb, key, hash, rec)))
		tdb_unlock(tdb, BUCKET(hash), locktype);
	return rec_ptr;
}

enum TDB_ERROR tdb_error(TDB_CONTEXT *tdb)
{
	return tdb->ecode;
}

static struct tdb_errname {
	enum TDB_ERROR ecode; const char *estring;
} emap[] = { {TDB_SUCCESS, "Success"},
	     {TDB_ERR_CORRUPT, "Corrupt database"},
	     {TDB_ERR_IO, "IO Error"},
	     {TDB_ERR_LOCK, "Locking error"},
	     {TDB_ERR_OOM, "Out of memory"},
	     {TDB_ERR_EXISTS, "Record exists"},
	     {TDB_ERR_NOLOCK, "Lock exists on other keys"},
	     {TDB_ERR_NOEXIST, "Record does not exist"} };

/* Error string for the last tdb error */
const char *tdb_errorstr(TDB_CONTEXT *tdb)
{
	u32 i;
	for (i = 0; i < sizeof(emap) / sizeof(struct tdb_errname); i++)
		if (tdb->ecode == emap[i].ecode)
			return emap[i].estring;
	return "Invalid error code";
}

/* update an entry in place - this only works if the new data size
   is <= the old data size and the key exists.
   on failure return -1
*/
static int tdb_update(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf)
{
	struct list_struct rec;
	tdb_off rec_ptr;
	int ret = -1;

	/* find entry */
	if (!(rec_ptr = tdb_find_lock(tdb, key, F_WRLCK, &rec)))
		return -1;

	/* must be long enough for key, data and tailer */
	if (rec.rec_len < key.dsize + dbuf.dsize + sizeof(tdb_off)) {
		tdb->ecode = TDB_SUCCESS; /* Not really an error */
		goto out;
	}

	if (tdb_write(tdb, rec_ptr + sizeof(rec) + rec.key_len,
		      dbuf.dptr, dbuf.dsize) == -1)
		goto out;

	if (dbuf.dsize != rec.data_len) {
		/* update size */
		rec.data_len = dbuf.dsize;
		ret = rec_write(tdb, rec_ptr, &rec);
	} else
		ret = 0;
 out:
	tdb_unlock(tdb, BUCKET(rec.full_hash), F_WRLCK);
	return ret;
}

/* find an entry in the database given a key */
TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key)
{
	tdb_off rec_ptr;
	struct list_struct rec;
	TDB_DATA ret;

	/* find which hash bucket it is in */
	if (!(rec_ptr = tdb_find_lock(tdb,key,F_RDLCK,&rec)))
		return tdb_null;

	ret.dptr = tdb_alloc_read(tdb, rec_ptr + sizeof(rec) + rec.key_len,
				  rec.data_len);
	ret.dsize = rec.data_len;
	tdb_unlock(tdb, BUCKET(rec.full_hash), F_RDLCK);
	return ret;
}

/* check if an entry in the database exists 

   note that 1 is returned if the key is found and 0 is returned if not found
   this doesn't match the conventions in the rest of this module, but is
   compatible with gdbm
*/
int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key)
{
	struct list_struct rec;
	
	if (tdb_find_lock(tdb, key, F_RDLCK, &rec) == 0)
		return 0;
	tdb_unlock(tdb, BUCKET(rec.full_hash), F_RDLCK);
	return 1;
}

/* record lock stops delete underneath */
static int lock_record(TDB_CONTEXT *tdb, tdb_off off)
{
	return off ? tdb_brlock(tdb, off, F_RDLCK, F_SETLKW) : 0;
}
/*
  Write locks override our own fcntl readlocks, so check it here.
  Note this is meant to be F_SETLK, *not* F_SETLKW, as it's not
  an error to fail to get the lock here.
*/
 
static int write_lock_record(TDB_CONTEXT *tdb, tdb_off off)
{
	struct tdb_traverse_lock *i;
	for (i = &tdb->travlocks; i; i = i->next)
		if (i->off == off)
			return -1;
	return tdb_brlock(tdb, off, F_WRLCK, F_SETLK);
}

/*
  Note this is meant to be F_SETLK, *not* F_SETLKW, as it's not
  an error to fail to get the lock here.
*/

static int write_unlock_record(TDB_CONTEXT *tdb, tdb_off off)
{
	return tdb_brlock(tdb, off, F_UNLCK, F_SETLK);
}
/* fcntl locks don't stack: avoid unlocking someone else's */
static int unlock_record(TDB_CONTEXT *tdb, tdb_off off)
{
	struct tdb_traverse_lock *i;
	u32 count = 0;

	if (off == 0)
		return 0;
	for (i = &tdb->travlocks; i; i = i->next)
		if (i->off == off)
			count++;
	return (count == 1 ? tdb_brlock(tdb, off, F_UNLCK, F_SETLKW) : 0);
}

/* actually delete an entry in the database given the offset */
static int do_delete(TDB_CONTEXT *tdb, tdb_off rec_ptr, struct list_struct*rec)
{
	tdb_off last_ptr, i;
	struct list_struct lastrec;

	if (tdb->read_only) return -1;

	if (write_lock_record(tdb, rec_ptr) == -1) {
		/* Someone traversing here: mark it as dead */
		rec->magic = TDB_DEAD_MAGIC;
		return rec_write(tdb, rec_ptr, rec);
	}
	write_unlock_record(tdb, rec_ptr);

	/* find previous record in hash chain */
	if (ofs_read(tdb, TDB_HASH_TOP(rec->full_hash), &i) == -1)
		return -1;
	for (last_ptr = 0; i != rec_ptr; last_ptr = i, i = lastrec.next)
		if (rec_read(tdb, i, &lastrec) == -1)
			return -1;

	/* unlink it: next ptr is at start of record. */
	if (last_ptr == 0)
		last_ptr = TDB_HASH_TOP(rec->full_hash);
	if (ofs_write(tdb, last_ptr, &rec->next) == -1)
		return -1;

	/* recover the space */
	if (tdb_free(tdb, rec_ptr, rec) == -1)
		return -1;
	return 0;
}

/* Uses traverse lock: 0 = finish, -1 = error, other = record offset */
static int tdb_next_lock(TDB_CONTEXT *tdb, struct tdb_traverse_lock *tlock,
			 struct list_struct *rec)
{
	int want_next = (tlock->off != 0);

	/* No traversal allowed if you've called tdb_lockkeys() */
	if (tdb->lockedkeys)
		return TDB_ERRCODE(TDB_ERR_NOLOCK, -1);

	/* Lock each chain from the start one. */
	for (; tlock->hash < tdb->header.hash_size; tlock->hash++) {
		if (tdb_lock(tdb, tlock->hash, F_WRLCK) == -1)
			return -1;

		/* No previous record?  Start at top of chain. */
		if (!tlock->off) {
			if (ofs_read(tdb, TDB_HASH_TOP(tlock->hash),
				     &tlock->off) == -1)
				goto fail;
		} else {
			/* Otherwise unlock the previous record. */
			unlock_record(tdb, tlock->off);
		}

		if (want_next) {
			/* We have offset of old record: grab next */
			if (rec_read(tdb, tlock->off, rec) == -1)
				goto fail;
			tlock->off = rec->next;
		}

		/* Iterate through chain */
		while( tlock->off) {
			tdb_off current;
			if (rec_read(tdb, tlock->off, rec) == -1)
				goto fail;
			if (!TDB_DEAD(rec)) {
				/* Woohoo: we found one! */
				lock_record(tdb, tlock->off);
				return tlock->off;
			}
			/* Try to clean dead ones from old traverses */
			current = tlock->off;
			tlock->off = rec->next;
			do_delete(tdb, current, rec);
		}
		tdb_unlock(tdb, tlock->hash, F_WRLCK);
		want_next = 0;
	}
	/* We finished iteration without finding anything */
	return TDB_ERRCODE(TDB_SUCCESS, 0);

 fail:
	tlock->off = 0;
	tdb_unlock(tdb, tlock->hash, F_WRLCK);
	return -1;
}

/* traverse the entire database - calling fn(tdb, key, data) on each element.
   return -1 on error or the record count traversed
   if fn is NULL then it is not called
   a non-zero return value from fn() indicates that the traversal should stop
  */
int tdb_traverse(TDB_CONTEXT *tdb, tdb_traverse_func fn, void *state)
{
	TDB_DATA key, dbuf;
	struct list_struct rec;
	struct tdb_traverse_lock tl = { NULL, 0, 0 };
	int ret, count = 0;

	/* This was in the initialization, above, but the IRIX compiler
	 * did not like it.  crh
	 */
	tl.next = tdb->travlocks.next;

	/* fcntl locks don't stack: beware traverse inside traverse */
	tdb->travlocks.next = &tl;

	/* tdb_next_lock places locks on the record returned, and its chain */
	while ((ret = tdb_next_lock(tdb, &tl, &rec)) > 0) {
		count++;
		/* now read the full record */
		key.dptr = tdb_alloc_read(tdb, tl.off + sizeof(rec), 
					  rec.key_len + rec.data_len);
		if (!key.dptr) {
			tdb_unlock(tdb, tl.hash, F_WRLCK);
			unlock_record(tdb, tl.off);
			tdb->travlocks.next = tl.next;
			return -1;
		}
		key.dsize = rec.key_len;
		dbuf.dptr = key.dptr + rec.key_len;
		dbuf.dsize = rec.data_len;

		/* Drop chain lock, call out */
		tdb_unlock(tdb, tl.hash, F_WRLCK);
		if (fn && fn(tdb, key, dbuf, state)) {
			/* They want us to terminate traversal */
			unlock_record(tdb, tl.off);
			tdb->travlocks.next = tl.next;
			free(key.dptr);
			return count;
		}
		free(key.dptr);
	}
	tdb->travlocks.next = tl.next;
	if (ret < 0)
		return -1;
	else
		return count;
}

/* find the first entry in the database and return its key */
TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb)
{
	TDB_DATA key;
	struct list_struct rec;

	/* release any old lock */
	unlock_record(tdb, tdb->travlocks.off);
	tdb->travlocks.off = tdb->travlocks.hash = 0;

	if (tdb_next_lock(tdb, &tdb->travlocks, &rec) <= 0)
		return tdb_null;
	/* now read the key */
	key.dsize = rec.key_len;
	key.dptr =tdb_alloc_read(tdb,tdb->travlocks.off+sizeof(rec),key.dsize);
	tdb_unlock(tdb, BUCKET(tdb->travlocks.hash), F_WRLCK);
	return key;
}

/* find the next entry in the database, returning its key */
TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA oldkey)
{
	u32 oldhash;
	TDB_DATA key = tdb_null;
	struct list_struct rec;
	char *k = NULL;

	/* Is locked key the old key?  If so, traverse will be reliable. */
	if (tdb->travlocks.off) {
		if (tdb_lock(tdb,tdb->travlocks.hash,F_WRLCK))
			return tdb_null;
		if (rec_read(tdb, tdb->travlocks.off, &rec) == -1
		    || !(k = tdb_alloc_read(tdb,tdb->travlocks.off+sizeof(rec),
					    rec.key_len))
		    || memcmp(k, oldkey.dptr, oldkey.dsize) != 0) {
			/* No, it wasn't: unlock it and start from scratch */
			unlock_record(tdb, tdb->travlocks.off);
			tdb_unlock(tdb, tdb->travlocks.hash, F_WRLCK);
			tdb->travlocks.off = 0;
		}

		if (k)
			free(k);
	}

	if (!tdb->travlocks.off) {
		/* No previous element: do normal find, and lock record */
		tdb->travlocks.off = tdb_find_lock(tdb, oldkey, F_WRLCK, &rec);
		if (!tdb->travlocks.off)
			return tdb_null;
		tdb->travlocks.hash = BUCKET(rec.full_hash);
		lock_record(tdb, tdb->travlocks.off);
	}
	oldhash = tdb->travlocks.hash;

	/* Grab next record: locks chain and returned record,
	   unlocks old record */
	if (tdb_next_lock(tdb, &tdb->travlocks, &rec) > 0) {
		key.dsize = rec.key_len;
		key.dptr = tdb_alloc_read(tdb, tdb->travlocks.off+sizeof(rec),
					  key.dsize);
		/* Unlock the chain of this new record */
		tdb_unlock(tdb, tdb->travlocks.hash, F_WRLCK);
	}
	/* Unlock the chain of old record */
	tdb_unlock(tdb, BUCKET(oldhash), F_WRLCK);
	return key;
}

/* delete an entry in the database given a key */
int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key)
{
	tdb_off rec_ptr;
	struct list_struct rec;
	int ret;

	if (!(rec_ptr = tdb_find_lock(tdb, key, F_WRLCK, &rec)))
		return -1;
	ret = do_delete(tdb, rec_ptr, &rec);
	tdb_unlock(tdb, BUCKET(rec.full_hash), F_WRLCK);
	return ret;
}

/* store an element in the database, replacing any existing element
   with the same key 

   return 0 on success, -1 on failure
*/
int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag)
{
	struct list_struct rec;
	u32 hash;
	tdb_off rec_ptr;
	char *p = NULL;
	int ret = 0;

	/* find which hash bucket it is in */
	hash = tdb_hash(&key);
	if (!tdb_keylocked(tdb, hash))
		return -1;
	if (tdb_lock(tdb, BUCKET(hash), F_WRLCK) == -1)
		return -1;

	/* check for it existing, on insert. */
	if (flag == TDB_INSERT) {
		if (tdb_exists(tdb, key)) {
			tdb->ecode = TDB_ERR_EXISTS;
			goto fail;
		}
	} else {
		/* first try in-place update, on modify or replace. */
		if (tdb_update(tdb, key, dbuf) == 0)
			goto out;
		if (flag == TDB_MODIFY && tdb->ecode == TDB_ERR_NOEXIST)
			goto fail;
	}
	/* reset the error code potentially set by the tdb_update() */
	tdb->ecode = TDB_SUCCESS;

	/* delete any existing record - if it doesn't exist we don't
           care.  Doing this first reduces fragmentation, and avoids
           coalescing with `allocated' block before it's updated. */
	if (flag != TDB_INSERT)
		tdb_delete(tdb, key);

	/* Copy key+value *before* allocating free space in case malloc
	   fails and we are left with a dead spot in the tdb. */

	if (!(p = (char *)malloc(key.dsize + dbuf.dsize))) {
		tdb->ecode = TDB_ERR_OOM;
		goto fail;
	}

	memcpy(p, key.dptr, key.dsize);
	memcpy(p+key.dsize, dbuf.dptr, dbuf.dsize);

	/* now we're into insert / modify / replace of a record which
	 * we know could not be optimised by an in-place store (for
	 * various reasons).  */
	if (!(rec_ptr = tdb_allocate(tdb, key.dsize + dbuf.dsize, &rec)))
		goto fail;

	/* Read hash top into next ptr */
	if (ofs_read(tdb, TDB_HASH_TOP(hash), &rec.next) == -1)
		goto fail;

	rec.key_len = key.dsize;
	rec.data_len = dbuf.dsize;
	rec.full_hash = hash;
	rec.magic = TDB_MAGIC;

	/* write out and point the top of the hash chain at it */
	if (rec_write(tdb, rec_ptr, &rec) == -1
	    || tdb_write(tdb, rec_ptr+sizeof(rec), p, key.dsize+dbuf.dsize)==-1
	    || ofs_write(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1) {
	fail:
		/* Need to tdb_unallocate() here */
		ret = -1;
	}
 out:
	if (p)
		free(p); 
	tdb_unlock(tdb, BUCKET(hash), F_WRLCK);
	return ret;
}

static int tdb_already_open(dev_t device,
			    ino_t ino)
{
	TDB_CONTEXT *i;
	
	for (i = tdbs; i; i = i->next) {
		if (i->device == device && i->inode == ino) {
			return 1;
		}
	}

	return 0;
}

/* open the database, creating it if necessary 

   The open_flags and mode are passed straight to the open call on the
   database file. A flags value of O_WRONLY is invalid. The hash size
   is advisory, use zero for a default value.

   Return is NULL on error, in which case errno is also set.  Don't 
   try to call tdb_error or tdb_errname, just do strerror(errno).

   @param name may be NULL for internal databases. */
TDB_CONTEXT *tdb_open(char *name, int hash_size, int tdb_flags,
		      int open_flags, mode_t mode)
{
	TDB_CONTEXT *tdb;
	struct stat st;
	int rev = 0, locked;

	if (!(tdb = calloc(1, sizeof *tdb))) {
		/* Can't log this */
		errno = ENOMEM;
		goto fail;
	}
	tdb->fd = -1;
	tdb->name = NULL;
	tdb->map_ptr = NULL;
	tdb->lockedkeys = NULL;
	tdb->flags = tdb_flags;
	
	if ((open_flags & O_ACCMODE) == O_WRONLY) {
		errno = EINVAL;
		goto fail;
	}
	
	if (hash_size == 0)
		hash_size = DEFAULT_HASH_SIZE;
	if ((open_flags & O_ACCMODE) == O_RDONLY) {
		tdb->read_only = 1;
		/* read only databases don't do locking or clear if first */
		tdb->flags |= TDB_NOLOCK;
		tdb->flags &= ~TDB_CLEAR_IF_FIRST;
	}

	/* internal databases don't mmap or lock, and start off cleared */
	if (tdb->flags & TDB_INTERNAL) {
		tdb->flags |= (TDB_NOLOCK | TDB_NOMMAP);
		tdb->flags &= ~TDB_CLEAR_IF_FIRST;
		tdb_new_database(tdb, hash_size);
		goto internal;
	}

	if ((tdb->fd = open(name, open_flags, mode)) == -1) {
		goto fail;	/* errno set by open(2) */
	}

	/* ensure there is only one process initialising at once */
	if (tdb_brlock(tdb, GLOBAL_LOCK, F_WRLCK, F_SETLKW) == -1) {
		goto fail;	/* errno set by tdb_brlock */
	}

	/* we need to zero database if we are the only one with it open */
	if ((locked = (tdb_brlock(tdb, ACTIVE_LOCK, F_WRLCK, F_SETLK) == 0))
	    && (tdb_flags & TDB_CLEAR_IF_FIRST)) {
		open_flags |= O_CREAT;
		if (ftruncate(tdb->fd, 0) == -1) {
			goto fail; /* errno set by ftruncate */
		}
	}

	if (read(tdb->fd, &tdb->header, sizeof(tdb->header)) != sizeof(tdb->header)
	    || strcmp(tdb->header.magic_food, TDB_MAGIC_FOOD) != 0
	    || (tdb->header.version != TDB_VERSION
		&& !(rev = (tdb->header.version==TDB_BYTEREV(TDB_VERSION))))) {
		/* it's not a valid database - possibly initialise it */
		if (!(open_flags & O_CREAT) || tdb_new_database(tdb, hash_size) == -1) {
			errno = EIO; /* ie bad format or something */
			goto fail;
		}
		rev = (tdb->flags & TDB_CONVERT);
	}
	if (!rev)
		tdb->flags &= ~TDB_CONVERT;
	else {
		tdb->flags |= TDB_CONVERT;
		convert(&tdb->header, sizeof(tdb->header));
	}
	if (fstat(tdb->fd, &st) == -1)
		goto fail;

	/* Is it already in the open list?  If so, fail. */
	if (tdb_already_open(st.st_dev, st.st_ino)) {
		errno = EBUSY;
		goto fail;
	}

	if (!(tdb->name = (char *)strdup(name))) {
		errno = ENOMEM;
		goto fail;
	}

	tdb->map_size = st.st_size;
	tdb->device = st.st_dev;
	tdb->inode = st.st_ino;
	tdb->locked = calloc(tdb->header.hash_size+1, sizeof(tdb->locked[0]));
	if (!tdb->locked) {
		errno = ENOMEM;
		goto fail;
	}
	tdb_mmap(tdb);
	if (locked) {
		if (tdb_brlock(tdb, ACTIVE_LOCK, F_UNLCK, F_SETLK) == -1) {
			goto fail;
		}
	}
	/* leave this lock in place to indicate it's in use */
	if (tdb_brlock(tdb, ACTIVE_LOCK, F_RDLCK, F_SETLKW) == -1)
		goto fail;

 internal:
	/* Internal (memory-only) databases skip all the code above to
	 * do with disk files, and resume here by releasing their
	 * global lock and hooking into the active list. */
	if (tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW) == -1)
		goto fail;
	tdb->next = tdbs;
	tdbs = tdb;
	return tdb;

 fail:
	{ int save_errno = errno;

	if (!tdb)
		return NULL;
	
	if (tdb->map_ptr) {
		if (tdb->flags & TDB_INTERNAL)
			free(tdb->map_ptr);
		else
			tdb_munmap(tdb);
	}
	if (tdb->name)
		free(tdb->name);
	if (tdb->fd != -1)
		close(tdb->fd);
	if (tdb->locked)
		free(tdb->locked);
	errno = save_errno;
	return NULL;
	}
}

/* close a database */
int tdb_close(TDB_CONTEXT *tdb)
{
	TDB_CONTEXT **i;
	int ret = 0;

	if (tdb->map_ptr) {
		if (tdb->flags & TDB_INTERNAL)
			free(tdb->map_ptr);
		else
			tdb_munmap(tdb);
	}
	if (tdb->name)
		free(tdb->name);
	if (tdb->fd != -1)
		ret = close(tdb->fd);
	if (tdb->locked)
		free(tdb->locked);
	if (tdb->lockedkeys)
		free(tdb->lockedkeys);

	/* Remove from contexts list */
	for (i = &tdbs; *i; i = &(*i)->next) {
		if (*i == tdb) {
			*i = tdb->next;
			break;
		}
	}

	memset(tdb, 0, sizeof(*tdb));
	free(tdb);

	return ret;
}

/* lock/unlock entire database */
int tdb_lockall(TDB_CONTEXT *tdb)
{
	u32 i;

	/* There are no locks on read-only dbs */
	if (tdb->read_only)
		return TDB_ERRCODE(TDB_ERR_LOCK, -1);
	if (tdb->lockedkeys)
		return TDB_ERRCODE(TDB_ERR_NOLOCK, -1);
	for (i = 0; i < tdb->header.hash_size; i++) 
		if (tdb_lock(tdb, i, F_WRLCK))
			break;

	/* If error, release locks we have... */
	if (i < tdb->header.hash_size) {
		u32 j;

		for ( j = 0; j < i; j++)
			tdb_unlock(tdb, j, F_WRLCK);
		return TDB_ERRCODE(TDB_ERR_NOLOCK, -1);
	}

	return 0;
}
void tdb_unlockall(TDB_CONTEXT *tdb)
{
	u32 i;
	for (i=0; i < tdb->header.hash_size; i++)
		tdb_unlock(tdb, i, F_WRLCK);
}

int tdb_lockkeys(TDB_CONTEXT *tdb, u32 number, TDB_DATA keys[])
{
	u32 i, j, hash;

	/* Can't lock more keys if already locked */
	if (tdb->lockedkeys)
		return TDB_ERRCODE(TDB_ERR_NOLOCK, -1);
	if (!(tdb->lockedkeys = malloc(sizeof(u32) * (number+1))))
		return TDB_ERRCODE(TDB_ERR_OOM, -1);
	/* First number in array is # keys */
	tdb->lockedkeys[0] = number;

	/* Insertion sort by bucket */
	for (i = 0; i < number; i++) {
		hash = tdb_hash(&keys[i]);
		for (j = 0; j < i && BUCKET(tdb->lockedkeys[j+1]) < BUCKET(hash); j++)
			;	/* empty body: just find the insertion point */
		memmove(&tdb->lockedkeys[j+2], &tdb->lockedkeys[j+1], sizeof(u32) * (i-j));
		tdb->lockedkeys[j+1] = hash;
	}
	/* Finally, lock in order */
	for (i = 0; i < number; i++)
		if (tdb_lock(tdb, BUCKET(tdb->lockedkeys[i+1]), F_WRLCK))
			break;

	/* If error, release locks we have... */
	if (i < number) {
		for ( j = 0; j < i; j++)
			tdb_unlock(tdb, BUCKET(tdb->lockedkeys[j+1]), F_WRLCK);
		free(tdb->lockedkeys);
		tdb->lockedkeys = NULL;
		return TDB_ERRCODE(TDB_ERR_NOLOCK, -1);
	}
	return 0;
}

/* Unlock the keys previously locked by tdb_lockkeys() */
void tdb_unlockkeys(TDB_CONTEXT *tdb)
{
	u32 i;
	for (i = 0; i < tdb->lockedkeys[0]; i++)
		tdb_unlock(tdb, BUCKET(tdb->lockedkeys[i+1]), F_WRLCK);
	free(tdb->lockedkeys);
	tdb->lockedkeys = NULL;
}

/* lock/unlock one hash chain. This is meant to be used to reduce
   contention - it cannot guarantee how many records will be locked */
int tdb_chainlock(TDB_CONTEXT *tdb, TDB_DATA key)
{
	return tdb_lock(tdb, BUCKET(tdb_hash(&key)), F_WRLCK);
}
void tdb_chainunlock(TDB_CONTEXT *tdb, TDB_DATA key)
{
	tdb_unlock(tdb, BUCKET(tdb_hash(&key)), F_WRLCK);
}
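
The tailer trick used by tdb_free() in the listing above can be sketched
on its own. What follows is a minimal illustration, not the tdb code: it
uses an in-memory arena instead of a file, the magic values, struct hdr,
put_record() and free_merge_left() are made-up names for this example,
and it leaves out the free list, the locking and the merge to the right.
It only shows how a record finds the header of its left neighbour by
reading the size stored in the last bytes of that neighbour.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder magic values for this sketch, not tdb's real constants. */
#define FREE_MAGIC 0x46524545u
#define USED_MAGIC 0x55534544u

struct hdr {
	uint32_t magic;
	uint32_t rec_len;	/* bytes after the header, tailer included */
};

static unsigned char arena[256];

/* Store the record's total size in its last 4 bytes (the "tailer"). */
static void write_tailer(uint32_t off, const struct hdr *h)
{
	uint32_t total = (uint32_t)sizeof(*h) + h->rec_len;

	memcpy(arena + off + total - sizeof(total), &total, sizeof(total));
}

/* Write a header and its matching tailer at off. */
static void put_record(uint32_t off, uint32_t magic, uint32_t rec_len)
{
	struct hdr h = { magic, rec_len };

	assert(rec_len >= sizeof(uint32_t));
	assert(off + sizeof(h) + rec_len <= sizeof(arena));
	memcpy(arena + off, &h, sizeof(h));
	write_tailer(off, &h);
}

/* Free the record at off, absorbing a free left neighbour if there is
   one. first_rec is the offset of the first record in the arena.
   Returns the offset of the resulting free record. */
static uint32_t free_merge_left(uint32_t off, uint32_t first_rec)
{
	struct hdr h, left;
	uint32_t leftsize;

	memcpy(&h, arena + off, sizeof(h));
	if (off > first_rec) {
		/* The 4 bytes just before us are the left tailer. */
		memcpy(&leftsize, arena + off - sizeof(leftsize),
		       sizeof(leftsize));
		memcpy(&left, arena + off - leftsize, sizeof(left));
		if (left.magic == FREE_MAGIC) {
			off -= leftsize;
			h.rec_len += leftsize;
		}
	}
	h.magic = FREE_MAGIC;
	memcpy(arena + off, &h, sizeof(h));
	write_tailer(off, &h);
	return off;
}
```

With a free 32-byte record at offset 0 and a used 48-byte record at
offset 32, freeing the second one yields a single free record at offset
0 whose rec_len is 72 (80 bytes in total with the header).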

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-02-25  1:41 Rusty Russell
  2002-02-25  1:58 ` your mail Alexander Viro
@ 2002-02-25 13:16 ` Alan Cox
  1 sibling, 0 replies; 341+ messages in thread
From: Alan Cox @ 2002-02-25 13:16 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, mingo, Matthew Kirkwood, Benjamin LaHaise,
	David Axmark, William Lee Irwin III, linux-kernel

> > 	fd = sem_initialize();
> > 	mmap(fd, ...)
> > 	..
> > 	munmap(..)
> > 
> > which gives you a handle for the semaphore.
> 
> No no no!  Implemented exactly that (and posted to l-k IIRC), and it's
> *horrible* to use.

All Linus forgot was to sem_initialize("filename"); With that the rest
comes out for free.
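
One way to read the point about the filename: with a name, unrelated
processes can find the same object without being handed an fd. The
sketch below only imitates that calling shape in userspace with a
regular file. sem_initialize() here is a hypothetical stand-in, not the
kernel interface being discussed, sem_attach() and sem_detach() are
invented helpers, and the mapped word is plain shared memory rather
than a working semaphore.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in: return an fd for a named, page-sized object,
   creating it if it does not exist yet. */
static int sem_initialize(const char *filename)
{
	int fd = open(filename, O_RDWR | O_CREAT, 0600);

	if (fd == -1)
		return -1;
	/* Same size on every call, so existing contents are preserved. */
	if (ftruncate(fd, (off_t)sysconf(_SC_PAGESIZE)) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Map the object shared; returns NULL on failure. */
static int *sem_attach(int fd)
{
	void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
		       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	return p == MAP_FAILED ? NULL : (int *)p;
}

/* Unmap a mapping obtained from sem_attach(). */
static void sem_detach(int *p)
{
	int rc = munmap(p, (size_t)sysconf(_SC_PAGESIZE));

	assert(rc == 0);
	(void)rc;
}
```

Two processes that call sem_initialize() with the same path and then
sem_attach() end up looking at the same page, which is the property the
filename buys.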

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-02-25  2:14   ` Rusty Russell
  2002-02-25  3:18     ` Davide Libenzi
@ 2002-02-25  4:02     ` Alexander Viro
  2002-02-26  5:50       ` Rusty Russell
  1 sibling, 1 reply; 341+ messages in thread
From: Alexander Viro @ 2002-02-25  4:02 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, mingo, Matthew Kirkwood, Benjamin LaHaise,
	David Axmark, William Lee Irwin III, linux-kernel



On Mon, 25 Feb 2002, Rusty Russell wrote:

> In message <Pine.GSO.4.21.0202242054410.1329-100000@weyl.math.psu.edu> you write:
> > 
> > 
> > On Mon, 25 Feb 2002, Rusty Russell wrote:
> > > First, fd passing sucks: you can't leave an fd somewhere and wait for
> > > someone to pick it up, and they vanish when you exit.  Secondly, you
> > 
> > Yes, you can.  Please, RTFS - what is passed is not a descriptor, it's
> > struct file *.  As soon as datagram is sent, descriptors are resolved and
> > after that point descriptor table of sender (or, for that matter, survival
> > of sender) doesn't matter.
> 
> Please explain how I leave a fd somewhere for other processes to grab
> it.  
> 
> And then please explain how they get the fd after I've exited.
> 
> Al, you are one of the most unpleasant people to deal with on this
> list.  This is *not* an honor, and I beg you to consider a different
> approach in future correspondence.

Honour or not, in this case your complaint is hardly deserved.  To
compress the above a bit:

you: <false statement>
me: RTFS.  <short description of the reasons why statement is wrong; further
details could be obtained by reading TFS>

As for your question, SCM_RIGHTS datagram can easily outlive the sending
process.  You will need a helper process (either per-meeting point or
system-wide) to avoid GC killing the thing, but that's it.

Writing such helper is left as an exercise to reader - it _is_ trivial.
To put fd(s):
	connect to (name of AF_UNIX socket)
	sendmsg to it; no OOB data, one byte of data (non-0)
	form an SCM_RIGHTS datagram with fds in question
	sendmsg it to the same socket.
	close the socket
In helper:
	listen on (name)
repeat:
	accept connection
	read one byte
	if it's non-zero
		put fd of connection into a list
		goto repeat
	else
		take first fd from list
		form an SCM_RIGHTS datagram with that fd
		send it into the new connection
		close fd
		close connection
		goto repeat
To get fd(s):
	connect ....
	sendmsg .................................... (0)
	recvmsg and pick fd from the message
	close connection
	recvmsg from fd and pick the set of fds from the message
	close fd

End of story.  In real-life situation you will want to throttle in helper,
etc., but in any case main loop is ~20 lines of code.
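
The building block of the steps above is a single SCM_RIGHTS message.
Here is a minimal sketch of that primitive only, assuming a connected
AF_UNIX socket; send_fd() and recv_fd() are names made up for this
example, and the helper's accept loop, its list of parked connections
and the one-byte put/get protocol are not shown.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one data byte with fd attached as SCM_RIGHTS ancillary data.
   Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
	char byte = 1;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor. Returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The sender can close its own copy of the descriptor right after
send_fd() returns; the receiver still gets a working one, since what
travels in the message is a reference to the open file and not the
sender's descriptor number.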


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-02-25  2:14   ` Rusty Russell
@ 2002-02-25  3:18     ` Davide Libenzi
  2002-02-25  4:02     ` Alexander Viro
  1 sibling, 0 replies; 341+ messages in thread
From: Davide Libenzi @ 2002-02-25  3:18 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Alexander Viro, Linus Torvalds, mingo, Matthew Kirkwood,
	Benjamin LaHaise, David Axmark, William Lee Irwin III,
	linux-kernel

On Mon, 25 Feb 2002, Rusty Russell wrote:

> In message <Pine.GSO.4.21.0202242054410.1329-100000@weyl.math.psu.edu> you write:
> >
> >
> > On Mon, 25 Feb 2002, Rusty Russell wrote:
> > > First, fd passing sucks: you can't leave an fd somewhere and wait for
> > > someone to pick it up, and they vanish when you exit.  Secondly, you
> >
> > Yes, you can.  Please, RTFS - what is passed is not a descriptor, it's
> > struct file *.  As soon as datagram is sent, descriptors are resolved and
> > after that point descriptor table of sender (or, for that matter, survival
> > of sender) doesn't matter.
>
> Please explain how I leave a fd somewhere for other processes to grab
> it.
>
> And then please explain how they get the fd after I've exited.
>
> Al, you are one of the most unpleasant people to deal with on this
> list.  This is *not* an honor, and I beg you to consider a different
> approach in future correspondence.

Actually, this is one of Al's nicest posts :-)
You obviously can't share an fd# but you can share a file*.
I don't know how you're going to make these semaphores 'externally visible',
whether with numbers like IPC sems, with pathnames like unix sockets, or
with something else. But internally you can keep a number/path/whatever ->
file* mapping, and when a task attaches to the sem you map the file* onto an
fd# in the task's file table. If you keep this mapping persistent (until
explicit deletion), the file* remains alive even with zero attached
processes. I think this is what Al was trying to say.
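
A rough user-space analogy of the name -> file* mapping described above, as a minimal sketch only. The class `FdRegistry` and its methods are hypothetical names, not a kernel or library API; `os.dup` stands in for "map the file* onto an fd# in the task's file table", and the whole thing runs in one process, which a real implementation would not.

```python
import os


class FdRegistry:
    """Hypothetical name -> open-file mapping (user-space analogy only)."""

    def __init__(self):
        self._held = {}

    def publish(self, name, fd):
        if name in self._held:
            raise KeyError(f"{name!r} already published")
        # The registry keeps its own reference to the open file.
        self._held[name] = os.dup(fd)

    def attach(self, name):
        # Hand out a fresh descriptor that refers to the same open file.
        return os.dup(self._held[name])

    def delete(self, name):
        # Explicit deletion drops the registry's reference.
        os.close(self._held.pop(name))


def registry_demo() -> bytes:
    registry = FdRegistry()
    r, w = os.pipe()
    os.write(w, b"sem")
    os.close(w)
    registry.publish("my-sem", r)
    os.close(r)  # the creator lets go; zero attached users remain
    fd = registry.attach("my-sem")  # a later user attaches by name
    try:
        return os.read(fd, 3)
    finally:
        os.close(fd)
        registry.delete("my-sem")
```

The open file outlives the creator's descriptor because the mapping holds a reference until `delete` is called, which mirrors the "persistent until explicit deletion" point.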




- Davide




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail 
  2002-02-25  1:58 ` your mail Alexander Viro
@ 2002-02-25  2:14   ` Rusty Russell
  2002-02-25  3:18     ` Davide Libenzi
  2002-02-25  4:02     ` Alexander Viro
  0 siblings, 2 replies; 341+ messages in thread
From: Rusty Russell @ 2002-02-25  2:14 UTC (permalink / raw)
  To: Alexander Viro
  Cc: Linus Torvalds, mingo, Matthew Kirkwood, Benjamin LaHaise,
	David Axmark, William Lee Irwin III, linux-kernel

In message <Pine.GSO.4.21.0202242054410.1329-100000@weyl.math.psu.edu> you write:
> 
> 
> On Mon, 25 Feb 2002, Rusty Russell wrote:
> > First, fd passing sucks: you can't leave an fd somewhere and wait for
> > someone to pick it up, and they vanish when you exit.  Secondly, you
> 
> Yes, you can.  Please, RTFS - what is passed is not a descriptor, it's
> struct file *.  As soon as datagram is sent, descriptors are resolved and
> after that point descriptor table of sender (or, for that matter, survival
> of sender) doesn't matter.

Please explain how I leave a fd somewhere for other processes to grab
it.  

And then please explain how they get the fd after I've exited.

Al, you are one of the most unpleasant people to deal with on this
list.  This is *not* an honor, and I beg you to consider a different
approach in future correspondence.

Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-02-25  1:41 Rusty Russell
@ 2002-02-25  1:58 ` Alexander Viro
  2002-02-25  2:14   ` Rusty Russell
  2002-02-25 13:16 ` Alan Cox
  1 sibling, 1 reply; 341+ messages in thread
From: Alexander Viro @ 2002-02-25  1:58 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, mingo, Matthew Kirkwood, Benjamin LaHaise,
	David Axmark, William Lee Irwin III, linux-kernel



On Mon, 25 Feb 2002, Rusty Russell wrote:

> > Note that getting a file descriptor is really quite useful - it means that
> > you can pass the file descriptor around through unix domain sockets, for
> > example, and allow sharing of the semaphore across unrelated processes
> > that way.
> 
> First, fd passing sucks: you can't leave an fd somewhere and wait for
> someone to pick it up, and they vanish when you exit.  Secondly, you

Yes, you can.  Please, RTFS - what is passed is not a descriptor, it's
struct file *.  As soon as datagram is sent, descriptors are resolved and
after that point descriptor table of sender (or, for that matter, survival
of sender) doesn't matter.
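
The behaviour described above can be observed from user space with a short sketch. Assumptions: Python 3.9+ on Linux (for `socket.send_fds` and `socket.recv_fds`, which wrap `SCM_RIGHTS`), and sender and receiver simulated inside one process; `pass_fd_demo` is a hypothetical name.

```python
import os
import socket


def pass_fd_demo() -> bytes:
    # A connected pair of Unix-domain sockets stands in for sender and receiver.
    sender, receiver = socket.socketpair()
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, b"hello")
        # SCM_RIGHTS: the descriptor is resolved to the open file at send time.
        socket.send_fds(sender, [b"x"], [read_end])
        # From here on, the sender's descriptor and socket no longer matter.
        os.close(read_end)
        sender.close()
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        received = fds[0]
        try:
            return os.read(received, 5)
        finally:
            os.close(received)
    finally:
        os.close(write_end)
        receiver.close()
```

The receiver still reads the pipe's contents although the sender closed both its descriptor and its socket before the message was picked up.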


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-01-30 18:21 Nickolaos Fotopoulos
  2002-01-30 18:57 ` your mail Matti Aarnio
@ 2002-01-31  1:50 ` Drew P. Vogel
  1 sibling, 0 replies; 341+ messages in thread
From: Drew P. Vogel @ 2002-01-31  1:50 UTC (permalink / raw)
  To: Nickolaos Fotopoulos; +Cc: Linux kernel list (E-mail)

Personally, when I'm getting a few hundred emails per day, I don't even
notice the 5% spam.

--Drew Vogel

On Wed, 30 Jan 2002, Nickolaos Fotopoulos wrote:

>I'm new to this list.  Does it get spammed often, like this guy
>(grumph@pakistanmail.com) is doing?  It is already becoming quite annoying!
>This is by far the busiest list I have ever subscribed to, and there does
>not seem to be any sort of spam blocker working here.  I thought Majordomo
>had stuff like this built in?  If not maybe a list moderator could address
>this.
>				Nick Fotopoulos
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>




^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-01-30 18:21 Nickolaos Fotopoulos
@ 2002-01-30 18:57 ` Matti Aarnio
  2002-01-31  1:50 ` Drew P. Vogel
  1 sibling, 0 replies; 341+ messages in thread
From: Matti Aarnio @ 2002-01-30 18:57 UTC (permalink / raw)
  To: Nickolaos Fotopoulos; +Cc: Linux kernel list (E-mail)

On Wed, Jan 30, 2002 at 01:21:17PM -0500, Nickolaos Fotopoulos wrote:
> I'm new to this list.  Does it get spammed often, like this guy
> (grumph@pakistanmail.com) is doing?  It is already becoming quite annoying!

  I already asked about the phenomenon, and the guy(?) replied that
  he won't use that system anymore as it is doing those repeated
  sends all by itself.

> This is by far the busiest list I have ever subscribed to, and there does
> not seem to be any sort of spam blocker working here.  I thought Majordomo
> had stuff like this built in?  If not maybe a list moderator could address
> this.

  http://vger.kernel.org/majordomo-info.html

  Trust me, there is HEAVY filtering.
  Still, some spam does get through.

> 				Nick Fotopoulos

/Matti Aarnio

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2002-01-09 17:49 Michael Zhu
@ 2002-01-09 18:17 ` Jens Axboe
  0 siblings, 0 replies; 341+ messages in thread
From: Jens Axboe @ 2002-01-09 18:17 UTC (permalink / raw)
  To: Michael Zhu; +Cc: root, linux-kernel

On Wed, Jan 09 2002, Michael Zhu wrote:
> > 
> > This may be a troll. How would you boot? Who decrypts during the
> > boot?
> > 
> 
> You mean that the loop device couldn't en/decrypt the
> whole data on the disk? That means the loop device
> could implement the block level en/decryption.

Please read up on the loop crypto stuff off-list. Most of these
questions are FAQs. You can loop-crypto a whole disk or partition if
you want.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-27 18:55     ` Linus Torvalds
  2001-12-27 19:41       ` Andrew Morton
@ 2001-12-28 22:14       ` Martin Dalecki
  1 sibling, 0 replies; 341+ messages in thread
From: Martin Dalecki @ 2001-12-28 22:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andre Hedrick, Keith Owens, kbuild-devel, linux-kernel

Linus Torvalds wrote:

>(Right now you can see this in block_ioctl.c - while only a few of the
>ioctl's have been converted, you get the idea. I'm actually surprised that
>nobody seems to have commented on that part).
>

That was just too obvious, at least for me... However, I don't see why
you don't just start killing off constructs like:

switch (cmd) {
case BLASH:
case BLAHHH:
case BLASHH:
case BLAASS:
case BLAH:
default:
	return -EINVAL;
}

There are tons of them out there in the block drivers.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-27 18:55     ` Linus Torvalds
@ 2001-12-27 19:41       ` Andrew Morton
  2001-12-28 22:14       ` Martin Dalecki
  1 sibling, 0 replies; 341+ messages in thread
From: Andrew Morton @ 2001-12-27 19:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andre Hedrick, Keith Owens, kbuild-devel, linux-kernel

Linus Torvalds wrote:
> 
> The other part of the bio rewrite has been to get rid of another coupling:
> the coupling between "struct buffer_head" (which is used for a limited
> kind of memory management by a number of filesystems) and the act of
> actually just doing IO.
> 
> I used to think that we could just relegate "struct buffer_head" to _be_
> the IO entity, but it turns out to be much easier to just split off the IO
> part, which is why you now have a separate "bio" structure for the block
> IO part, and the buffer_head stuff uses that to get the work done.
> 

So... would it be correct to say that there won't be any large
changes to the buffer_head concept in 2.5?

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-27 18:09   ` Andre Hedrick
@ 2001-12-27 18:55     ` Linus Torvalds
  2001-12-27 19:41       ` Andrew Morton
  2001-12-28 22:14       ` Martin Dalecki
  0 siblings, 2 replies; 341+ messages in thread
From: Linus Torvalds @ 2001-12-27 18:55 UTC (permalink / raw)
  To: Andre Hedrick; +Cc: Keith Owens, kbuild-devel, linux-kernel


On Thu, 27 Dec 2001, Andre Hedrick wrote:
>
> Lots of luck ... please pass your crack pipe around so the rest of us
> idiots can see your vision or lack of ...

Heh. I think I must have passed it on to you long ago, and you never gave
it back, you sneaky bastard ;)

The vision, btw, is to get the request layer in good enough shape that we
can dispense with the mid-layer approaches of SCSI/IDE, and block devices
turn into _just_ device drivers.

For example, ide-scsi is heading for that big scrap-yard in the sky: it's
not the SCSI layer that handles special ioctl requests any more, because
the upper layers are going to be flexible enough that you can just pass
the requests down the regular pipe.

(Right now you can see this in block_ioctl.c - while only a few of the
ioctl's have been converted, you get the idea. I'm actually surprised that
nobody seems to have commented on that part).

The final end result of this (I sincerely hope) is that we can get rid of
some of the couplings that we've had in the block layer. ide-scsi is just
the most obvious strange coupling - things like "sg.c" in general are
rather horrible. There's very little _SCSI_ in sg.c - it's really about
sending commands down to the block devices.

The reason I want to get rid of the couplings is that they end up being
big anchors holding down development: you can create a clean driver that
isn't dependent on the SCSI layer overheads (and people do, for things
like DAC etc), but when you do that you lose _all_ of the support
infrastructure, not just the bloat. Which is sad.

(And which is why things like ide-scsi exist - IDE didn't really want to
be a SCSI driver, but people _did_ want to be able to use some of the
generic support routines that the SCSI layer offers. You couldn't just
cherry-pick the parts you wanted).

The other part of the bio rewrite has been to get rid of another coupling:
the coupling between "struct buffer_head" (which is used for a limited
kind of memory management by a number of filesystems) and the act of
actually just doing IO.

I used to think that we could just relegate "struct buffer_head" to _be_
the IO entity, but it turns out to be much easier to just split off the IO
part, which is why you now have a separate "bio" structure for the block
IO part, and the buffer_head stuff uses that to get the work done.

Andre, I know that you're worried about the low-level drivers, but:

 - I've long since noticed that we cannot communicate, which is why Jens
   is the block level driver person. You'll have to live with it.

 - I personally don't think you _can_ make a good driver without having
   reasonable interfaces, and we didn't have them.

   For example, the network drivers have improved a lot and do not have
   _nearly_ the amount of problems block drivers have. That's obviously
   partly just because it is a simpler problem, but because it was simpler
   it was also possible to change them. The infrastructure changes in the
   networking during 2.3.x really did help drivers.

And note that the "Jens" and "communication" part is important. If you
have patches, please talk to Jens, tell him what the issues are, and I
know I can communicate with him.

			Linus



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-07  5:10 ` your mail Linus Torvalds
@ 2001-12-27 18:09   ` Andre Hedrick
  2001-12-27 18:55     ` Linus Torvalds
  0 siblings, 1 reply; 341+ messages in thread
From: Andre Hedrick @ 2001-12-27 18:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Keith Owens, kbuild-devel, linux-kernel

On Thu, 6 Dec 2001, Linus Torvalds wrote:

> 
> On Fri, 7 Dec 2001, Keith Owens wrote:
> >
> > Linus, the time has come to convert the 2.5 kernel to kbuild 2.5.
> 
> We're getting the block IO layer in shape first, the time has not come for
> _anything_ else before that.
> 
> 		Linus

Lots of luck ... please pass your crack pipe around so the rest of us
idiots can see your vision or lack of ...

Regards,

Andre Hedrick
CEO/President, LAD Storage Consulting Group
Linux ATA Development
Linux Disk Certification Project


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-14 16:46 ` Gérard Roudier
  2001-12-14 20:09   ` Jens Axboe
@ 2001-12-18  0:34   ` Kirk Alexander
  1 sibling, 0 replies; 341+ messages in thread
From: Kirk Alexander @ 2001-12-18  0:34 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: Jens Axboe, linux-kernel

--- Gérard Roudier <groudier@free.fr> wrote:
> 
[snip]
> 
> You may let me know if sym53c8xx_2 still works with 810 rev 2.
> 



I tried the sym53c8xx_2 driver, put a fair load on the system (lots of sync'ing
and swapping) and didn't seem to have any trouble.

Cheers,
 Kirk Alexander



^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 18:42     ` Sebastian Dröge
@ 2001-12-17 18:43       ` Dave Jones
  0 siblings, 0 replies; 341+ messages in thread
From: Dave Jones @ 2001-12-17 18:43 UTC (permalink / raw)
  To: Sebastian Dröge; +Cc: linux-kernel

On Mon, 17 Dec 2001, Sebastian Dröge wrote:

> So I removed the apic.c hunk
> I think you meant that ;)

*nod*

> Anyway this doesn't solve the problem :(

Ok, this isn't urgent anyway, I'll get around to cleaning that
up later. Thanks for your help tracing this.

Dave.

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 17:23   ` Sebastian Dröge
  2001-12-17 17:25     ` Dave Jones
@ 2001-12-17 18:42     ` Sebastian Dröge
  2001-12-17 18:43       ` Dave Jones
  1 sibling, 1 reply; 341+ messages in thread
From: Sebastian Dröge @ 2001-12-17 18:42 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 685 bytes --]

Hmm...
I don't see anything about ioapic.c in the patch...
So I removed the apic.c hunk
I think you meant that ;)
Anyway this doesn't solve the problem :(

Bye

On Mon, 17 Dec 2001 18:25:37 +0100 (CET)
Dave Jones <davej@suse.de> wrote:

> On Mon, 17 Dec 2001, Sebastian Dröge wrote:
> 
> > Thanks
> > This does work
> 
> Great, now can you edit the patch to remove the ioapic.c hunk,
> reapply, and see if that works..
> 
> > What do you think was exactly the problem?
> 
> looks like I dorked the apic init...
> I'll back that bit out for -dj2, until I've given
> it a bit more work.
> 
> regards,
> Dave.
> 
> -- 
> | Dave Jones.        http://www.codemonkey.org.uk
> | SuSE Labs
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 17:23   ` Sebastian Dröge
@ 2001-12-17 17:25     ` Dave Jones
  2001-12-17 18:42     ` Sebastian Dröge
  1 sibling, 0 replies; 341+ messages in thread
From: Dave Jones @ 2001-12-17 17:25 UTC (permalink / raw)
  To: Sebastian Dröge; +Cc: linux-kernel

On Mon, 17 Dec 2001, Sebastian Dröge wrote:

> Thanks
> This does work

Great, now can you edit the patch to remove the ioapic.c hunk,
reapply, and see if that works..

> What do you think was exactly the problem?

looks like I dorked the apic init...
I'll back that bit out for -dj2, until I've given
it a bit more work.

regards,
Dave.

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 16:52 ` Sebastian Dröge
  2001-12-17 16:55   ` Arnaldo Carvalho de Melo
@ 2001-12-17 17:23   ` Sebastian Dröge
  2001-12-17 17:25     ` Dave Jones
  2001-12-17 18:42     ` Sebastian Dröge
  1 sibling, 2 replies; 341+ messages in thread
From: Sebastian Dröge @ 2001-12-17 17:23 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 706 bytes --]

Thanks
This does work
What do you think was exactly the problem?

Bye

On Mon, 17 Dec 2001 17:57:01 +0100 (CET)
Dave Jones <davej@suse.de> wrote:

> On Mon, 17 Dec 2001, Sebastian Dröge wrote:
> 
> > 2.4.16-2.4.17-rc1 works perfectly
> > 2.5.0-2.5.1 works perfectly
> > Only 2.5.1-dj1 has these 2 errors (ISA-PnP non-detection and USB only root hub detection)
> > All have the same .config
> > If you need some more information feel free to ask me ;)
> 
> Ok, can you try backing out this patch.. (just patch as normal but with -R)
> http://www.codemonkey.org.uk/patches/2.5/small-bits/early-cpuinit-1.diff
> 
> regards,
> Dave.
> 
> -- 
> | Dave Jones.        http://www.codemonkey.org.uk
> | SuSE Labs
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 16:52 ` Sebastian Dröge
@ 2001-12-17 16:55   ` Arnaldo Carvalho de Melo
  2001-12-17 17:23   ` Sebastian Dröge
  1 sibling, 0 replies; 341+ messages in thread
From: Arnaldo Carvalho de Melo @ 2001-12-17 16:55 UTC (permalink / raw)
  To: Sebastian Dröge <sebastian.droege@gmx.de>
  Cc: Dave Jones, linux-kernel, torvalds

Em Mon, Dec 17, 2001 at 05:52:06PM +0100, Sebastian Dröge escreveu:
> PS: 2.5.1 (dj1 or not ;) has one more problem on my pc:
> INIT can't send the TERM signal to all processes...

see the kill(-1,sig) thread...

> Nothing happens... no error message no nothing
> SysRQ works
> I don't know when it went into 2.5 but I think it wasn't there in -pre10 (didn't try -pre11)
> PPS: What the hell is APIC (no I don't mean ACPI)? ;) I've enabled it on my UP machine but don't know what it does...
> Does anyone have information about it?

Advanced Programmable Interrupt Controller, found in SMP machines and in
some UP ones; for UP it shouldn't be enabled in most cases.

- Arnaldo

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 16:07 Sebastian Dröge
  2001-12-17 16:22 ` your mail Dave Jones
@ 2001-12-17 16:52 ` Sebastian Dröge
  2001-12-17 16:55   ` Arnaldo Carvalho de Melo
  2001-12-17 17:23   ` Sebastian Dröge
  1 sibling, 2 replies; 341+ messages in thread
From: Sebastian Dröge @ 2001-12-17 16:52 UTC (permalink / raw)
  To: Dave Jones; +Cc: linux-kernel, torvalds

[-- Attachment #1: Type: text/plain, Size: 1681 bytes --]

Ok...
2.4.16-2.4.17-rc1 works perfectly
2.5.0-2.5.1 works perfectly
Only 2.5.1-dj1 has these 2 errors (ISA-PnP non-detection and USB only root hub detection)
All have the same .config
If you need some more information feel free to ask me ;)

Bye

PS: 2.5.1 (dj1 or not ;) has one more problem on my pc:
INIT can't send the TERM signal to all processes...
Nothing happens... no error message no nothing
SysRQ works
I don't know when it went into 2.5 but I think it wasn't there in -pre10 (didn't try -pre11)
PPS: What the hell is APIC (no I don't mean ACPI)? ;) I've enabled it on my UP machine but don't know what it does...
Does anyone have information about it?

On Mon, 17 Dec 2001 17:22:14 +0100 (CET)
Dave Jones <davej@suse.de> wrote:

> On Mon, 17 Dec 2001, Sebastian Dröge wrote:
> 
> > Attached you find my .config, lspci -vvv and dmesg output
> > I'll test 2.4.17-rc1 in a few minutes and will report what happens ;)
> 
> Thanks. Right now getting 2.4 into a better shape is more
> important than fixing 2.5, so if you find any problems repeatable
> in 2.4.17rc1, Marcelo really needs to know about it.
> 
> The only USB changes in my tree are __devinit_p changes, which
> really shouldn't be causing a problem, but there could be some
> other unrelated-to-usb patch which is causing this..
> 
> 2.4 info would be appreciated.
> 
> Dave.
> 
> -- 
> | Dave Jones.        http://www.codemonkey.org.uk
> | SuSE Labs
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-17 16:07 Sebastian Dröge
@ 2001-12-17 16:22 ` Dave Jones
  2001-12-17 16:52 ` Sebastian Dröge
  1 sibling, 0 replies; 341+ messages in thread
From: Dave Jones @ 2001-12-17 16:22 UTC (permalink / raw)
  To: Sebastian Dröge; +Cc: Linux Kernel Mailing List

On Mon, 17 Dec 2001, Sebastian Dröge wrote:

> Attached you find my .config, lspci -vvv and dmesg output
> I'll test 2.4.17-rc1 in a few minutes and will report what happens ;)

Thanks. Right now getting 2.4 into a better shape is more
important than fixing 2.5, so if you find any problems repeatable
in 2.4.17rc1, Marcelo really needs to know about it.

The only USB changes in my tree are __devinit_p changes, which
really shouldn't be causing a problem, but there could be some
other unrelated-to-usb patch which is causing this..

2.4 info would be appreciated.

Dave.

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-15  0:56 ` Stephan von Krawczynski
@ 2001-12-15  6:59   ` Gérard Roudier
  0 siblings, 0 replies; 341+ messages in thread
From: Gérard Roudier @ 2001-12-15  6:59 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: kirkalx, axboe, linux-kernel



On Sat, 15 Dec 2001, Stephan von Krawczynski wrote:

> On Fri, 14 Dec 2001 17:46:37 +0100 (CET)
> Gérard Roudier <groudier@free.fr> wrote:
>
> > > My system is a clunky old Digital Pentium Pro with a
> > > NCR53c810 rev 2 scsi controller, so it can't use the
> > > sym driver.
> >
> > Use sym53c8xx_2 instead. This one uses 2 different firmwares,
> > [...]
> > You may let me know if sym53c8xx_2 still works with 810 rev 2.
>
> On my system it does. I have it as a second controller and am using sym-2
> without troubles.

Thanks for your report.

  Gérard.


^ permalink raw reply	[flat|nested] 341+ messages in thread

* Re: your mail
  2001-12-15  0:54           ` Peter Bornemann
@ 2001-12-15  6:57             ` Gérard Roudier
  0 siblings, 0 replies; 341+ messages in thread
From: Gérard Roudier @ 2001-12-15  6:57 UTC (permalink / raw)
  To: Peter Bornemann; +Cc: Jens Axboe, Kirk Alexander, linux-kernel



On Sat, 15 Dec 2001, Peter Bornemann wrote:

> On Fri, 14 Dec 2001, Gérard Roudier wrote:
>
> >
> >
> > On Fri, 14 Dec 2001, Peter Bornemann wrote:
> > > Ahemm -- well,
> > > maybe I'm the first one. I have a symbios card, which is recognized by
> > > lspci:  SCSI storage controller: LSI Logic Corp. / Symbios Logic Inc.
> > > (formerly NCR) 53c810 (rev 23).
> > Could you please send me more accurate information?
> > TIA,
> >
>
> Well, it seems I did not make my intention very clear: I do not want you to
> fix anything in the driver; I just wanted you to leave the old
> ncr driver in the kernel, just for the case of a first install. I
> think no newbie with little knowledge will be able to install Linux (or,
> maybe, FreeBSD), when he happens