LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [NET/IPv6] Race condition with flow_cache_genid?
@ 2008-02-03  5:28 Kyle Moffett
  2008-02-06  4:02 ` David Miller
  2008-02-06 18:36 ` Kyle Moffett
  0 siblings, 2 replies; 4+ messages in thread
From: Kyle Moffett @ 2008-02-03  5:28 UTC (permalink / raw)
  To: linux-net, linux-kernel

Hi, I was poking around trying to figure out how to install the Mobile
IPv6 daemons this evening and noticed they required a kernel patch,
although upon further inspection the kernel patch seemed to already be
applied in 2.6.24.  Unfortunately the flow cache appears to be
horribly racy.  Attached below are the only uses of the variable
"flow_cache_genid" in 2.6.24.

Now, I am no expert in this particular area of the code, but the
"atomic_t flow_cache_genid" variable is ONLY ever used with
atomic_inc() and atomic_read().  There are no memory barriers or other
dec_and_test()-style functions, so that variable could just as easily
be replaced with a plain old C int.

Basically either there is some missing locking here or it does not
need to be "atomic_t".  Judging from the way it *appears* to be used
to check if cache entries are up-to-date with the latest changes in
policy, I would guess the former.

In particular that whole "flow_cache_lookup()" thing looks racy as
hell, since somebody could be in the middle of that looking at "if
(fle->genid == atomic_read(&flow_cache_genid))".  It does the
atomic_read(), which BTW is literally implemented as:
  #define atomic_read(atomicvar) ((atomicvar)->value)
on some platforms.  Immediately after the atomic read (or even before,
since there's no cache-flush or read-modify-write), somebody calls
into "selinux_xfrm_notify_policyload()" and increments the
flow_cache_genid becase selinux just loaded a security policy.  Now
we're accepting a cache entry which applies to PREVIOUS security
policy.  I can only assume that's really bad.

Even worse, there seems to be a race between SELinux loading a new
policy and calling selinux_xfrm_notify_policyload(), since we could
easily get packets and process them according to the old cache entry
on one CPU before SELinux has had a chance to update the generation ID
from the other.  Furthermore, there's no guarantee the CPU caches will
get updated in reasonable time.  Clearly SELinux needs to have some
way of atomically invalidating the flow cache of all CPUs
*simultaneously* with loading a new policy, which probably means they
both need to be under the same lock, or something.

The same problem appears to occur with updating the XFRM policy and
incrementing flow_cache_genid.  Probably the fastest solution is to
put the flow cache under the xfrm_policy_lock (which already disables
local bottom-halves), and either take that lock during SELinux policy
load or if there are lock ordering problems then add a variable
"flow_cache_ignore" and change the xfrm_notify hooks:

void selinux_xfrm_notify_policyload_pre(void)
{
	write_lock_bh(&xfrm_policy_lock);
	flow_cache_genid++;
	flow_cache_ignore = 1;
	write_unlock_bh(&xfrm_policy_lock);
}

void selinux_xfrm_notify_policyload_post(void)
{
	write_lock_bh(&xfrm_policy_lock);
	flow_cache_ignore = 0;
	write_unlock_bh(&xfrm_policy_lock);
}

Cheers,
Kyle Moffett


BEGIN QUOTED CODE INVOLVING flow_cache_genid:

include/net/flow.h:94:
extern atomic_t flow_cache_genid;

net/core/flow.c:39:
atomic_t flow_cache_genid = ATOMIC_INIT(0);

net/core/flow.c:169:flow_cache_lookup():
	if (flow_hash_rnd_recalc(cpu))
		flow_new_hash_rnd(cpu);
	hash = flow_hash_code(key, cpu);

	head = &flow_table(cpu)[hash];
	for (fle = *head; fle; fle = fle->next) {
		if (fle->family == family &&
				fle->dir == dir &&
				flow_key_compare(key, &fle->key) == 0) {
			if (fle->genid == atomic_read(&flow_cache_genid)) {
				void *ret = fle->object;

				if (ret)
					atomic_inc(fle->object_ref);
				local_bh_enable();

				return ret;
			}
			break;
		}
	}

net/xfrm/xfrm_policy.c:1025:
int xfrm_policy_delete(struct xfrm_policy *pol, int dir)
{
	write_lock_bh(&xfrm_policy_lock);
	pol = __xfrm_policy_unlink(pol, dir);
	write_unlock_bh(&xfrm_policy_lock);
	if (pol) {
		if (dir < XFRM_POLICY_MAX)
			atomic_inc(&flow_cache_genid);
		xfrm_policy_kill(pol);
		return 0;
	}
	return -ENOENT;
}

net/ipv6/inet6_connection_sock.c:142:
static inline
void __inet6_csk_dst_store(struct sock *sk, struct dst_entry *dst,
		struct in6_addr *daddr, struct in6_addr *saddr)
{
	__ip6_dst_store(sk, dst, daddr, saddr);

#ifdef CONFIG_XFRM
	{
		struct rt6_info *rt = (struct rt6_info  *)dst;
		rt->rt6i_flow_cache_genid = atomic_read(&flow_cache_genid);
	}
#endif
}

security/selinux/include/xfrm.h:41:
static inline void selinux_xfrm_notify_policyload(void)
{
        atomic_inc(&flow_cache_genid);
}

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [NET/IPv6] Race condition with flow_cache_genid?
  2008-02-03  5:28 [NET/IPv6] Race condition with flow_cache_genid? Kyle Moffett
@ 2008-02-06  4:02 ` David Miller
  2008-02-06 18:36 ` Kyle Moffett
  1 sibling, 0 replies; 4+ messages in thread
From: David Miller @ 2008-02-06  4:02 UTC (permalink / raw)
  To: kyle; +Cc: linux-net, linux-kernel


You'll get a better set of eyes on this if you post it
to netdev@vger.kernel.org which is where the networking
developers hang out.

linux-net is for user questions.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [NET/IPv6] Race condition with flow_cache_genid?
  2008-02-03  5:28 [NET/IPv6] Race condition with flow_cache_genid? Kyle Moffett
  2008-02-06  4:02 ` David Miller
@ 2008-02-06 18:36 ` Kyle Moffett
  2008-02-08 18:16   ` Herbert Xu
  1 sibling, 1 reply; 4+ messages in thread
From: Kyle Moffett @ 2008-02-06 18:36 UTC (permalink / raw)
  To: netdev, linux-kernel, davem

Whoops, I accidentally sent this to linux-net@vger instead of
netdev@vger.  Original email below:


Hi, I was poking around trying to figure out how to install the Mobile
IPv6 daemons this evening and noticed they required a kernel patch,
although upon further inspection the kernel patch seemed to already be
applied in 2.6.24.  Unfortunately the flow cache appears to be
horribly racy.  Attached below are the only uses of the variable
"flow_cache_genid" in 2.6.24.

Now, I am no expert in this particular area of the code, but the
"atomic_t flow_cache_genid" variable is ONLY ever used with
atomic_inc() and atomic_read().  There are no memory barriers or other
dec_and_test()-style functions, so that variable could just as easily
be replaced with a plain old C int.

Basically either there is some missing locking here or it does not
need to be "atomic_t".  Judging from the way it *appears* to be used
to check if cache entries are up-to-date with the latest changes in
policy, I would guess the former.

In particular that whole "flow_cache_lookup()" thing looks racy as
hell, since somebody could be in the middle of that looking at "if
(fle->genid == atomic_read(&flow_cache_genid))".  It does the
atomic_read(), which BTW is literally implemented as:
  #define atomic_read(atomicvar) ((atomicvar)->value)
on some platforms.  Immediately after the atomic read (or even before,
since there's no cache-flush or read-modify-write), somebody calls
into "selinux_xfrm_notify_policyload()" and increments the
flow_cache_genid becase selinux just loaded a security policy.  Now
we're accepting a cache entry which applies to PREVIOUS security
policy.  I can only assume that's really bad.

Even worse, there seems to be a race between SELinux loading a new
policy and calling selinux_xfrm_notify_policyload(), since we could
easily get packets and process them according to the old cache entry
on one CPU before SELinux has had a chance to update the generation ID
from the other.  Furthermore, there's no guarantee the CPU caches will
get updated in reasonable time.  Clearly SELinux needs to have some
way of atomically invalidating the flow cache of all CPUs
*simultaneously* with loading a new policy, which probably means they
both need to be under the same lock, or something.

The same problem appears to occur with updating the XFRM policy and
incrementing flow_cache_genid.  Probably the fastest solution is to
put the flow cache under the xfrm_policy_lock (which already disables
local bottom-halves), and either take that lock during SELinux policy
load or if there are lock ordering problems then add a variable
"flow_cache_ignore" and change the xfrm_notify hooks:

void selinux_xfrm_notify_policyload_pre(void)
{
        write_lock_bh(&xfrm_policy_lock);
        flow_cache_genid++;
        flow_cache_ignore = 1;
        write_unlock_bh(&xfrm_policy_lock);
}

void selinux_xfrm_notify_policyload_post(void)
{
        write_lock_bh(&xfrm_policy_lock);
        flow_cache_ignore = 0;
        write_unlock_bh(&xfrm_policy_lock);
}

Cheers,
Kyle Moffett


BEGIN QUOTED CODE INVOLVING flow_cache_genid:

include/net/flow.h:94:
extern atomic_t flow_cache_genid;

net/core/flow.c:39:
atomic_t flow_cache_genid = ATOMIC_INIT(0);

net/core/flow.c:169:flow_cache_lookup():
        if (flow_hash_rnd_recalc(cpu))
                flow_new_hash_rnd(cpu);
        hash = flow_hash_code(key, cpu);

        head = &flow_table(cpu)[hash];
        for (fle = *head; fle; fle = fle->next) {
                if (fle->family == family &&
                                fle->dir == dir &&
                                flow_key_compare(key, &fle->key) == 0) {
                        if (fle->genid == atomic_read(&flow_cache_genid)) {
                                void *ret = fle->object;

                                if (ret)
                                        atomic_inc(fle->object_ref);
                                local_bh_enable();

                                return ret;
                        }
                        break;
                }
        }

net/xfrm/xfrm_policy.c:1025:
int xfrm_policy_delete(struct xfrm_policy *pol, int dir)
{
        write_lock_bh(&xfrm_policy_lock);
        pol = __xfrm_policy_unlink(pol, dir);
        write_unlock_bh(&xfrm_policy_lock);
        if (pol) {
                if (dir < XFRM_POLICY_MAX)
                        atomic_inc(&flow_cache_genid);
                xfrm_policy_kill(pol);
                return 0;
        }
        return -ENOENT;
}

net/ipv6/inet6_connection_sock.c:142:
static inline
void __inet6_csk_dst_store(struct sock *sk, struct dst_entry *dst,
                struct in6_addr *daddr, struct in6_addr *saddr)
{
        __ip6_dst_store(sk, dst, daddr, saddr);

#ifdef CONFIG_XFRM
        {
                struct rt6_info *rt = (struct rt6_info  *)dst;
                rt->rt6i_flow_cache_genid = atomic_read(&flow_cache_genid);
        }
#endif
}

security/selinux/include/xfrm.h:41:
static inline void selinux_xfrm_notify_policyload(void)
{
        atomic_inc(&flow_cache_genid);
}

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [NET/IPv6] Race condition with flow_cache_genid?
  2008-02-06 18:36 ` Kyle Moffett
@ 2008-02-08 18:16   ` Herbert Xu
  0 siblings, 0 replies; 4+ messages in thread
From: Herbert Xu @ 2008-02-08 18:16 UTC (permalink / raw)
  To: Kyle Moffett; +Cc: netdev, linux-kernel, davem

Kyle Moffett <kyle@moffetthome.net> wrote:
>
> Basically either there is some missing locking here or it does not
> need to be "atomic_t".  Judging from the way it *appears* to be used
> to check if cache entries are up-to-date with the latest changes in
> policy, I would guess the former.

You're right that it doesn't really have to be an atomic since
all the writers are from xfrm currently.  However, the fact that
it is atomic is used by the current code since sometimes they
increment the value without holding the xfrm policy lock.

Yes it is racy but that is fine for the purpose that this variable
serves.  All it does is to make sure that extant flow objects get
killed at some point after the increment.  There is absolutely no
requirement that the killing be immediate or synchronised.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-02-08 18:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-02-03  5:28 [NET/IPv6] Race condition with flow_cache_genid? Kyle Moffett
2008-02-06  4:02 ` David Miller
2008-02-06 18:36 ` Kyle Moffett
2008-02-08 18:16   ` Herbert Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).