LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Will Deacon <will@kernel.org>, Rafael Aquini <aquini@redhat.com>,
	Mark Salter <msalter@redhat.com>,
	Jon Masters <jcm@jonmasters.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	Michal Hocko <mhocko@kernel.org>, QI Fuli <qi.fuli@fujitsu.com>
Subject: Re: [PATCH 3/3] arm64: tlb: skip tlbi broadcast
Date: Mon, 9 Mar 2020 11:22:42 +0000	[thread overview]
Message-ID: <20200309112242.GB2487@mbp> (raw)
In-Reply-To: <20200223192520.20808-4-aarcange@redhat.com>

Hi Andrea,

On Sun, Feb 23, 2020 at 02:25:20PM -0500, Andrea Arcangeli wrote:
>  switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  	  struct task_struct *tsk)
>  {
> -	if (prev != next)
> -		__switch_mm(next);
> +	unsigned int cpu = smp_processor_id();
> +
> +	if (!per_cpu(cpu_not_lazy_tlb, cpu)) {
> +		per_cpu(cpu_not_lazy_tlb, cpu) = true;
> +		atomic_inc(&next->context.nr_active_mm);
> +		__switch_mm(next, cpu);
> +	} else if (prev != next) {
> +		atomic_inc(&next->context.nr_active_mm);
> +		__switch_mm(next, cpu);
> +		atomic_dec(&prev->context.nr_active_mm);
> +	}

IIUC, nr_active_mm keeps track of how many instances of the current pgd
(TTBR0_EL1) are active.

> +enum tlb_flush_types tlb_flush_check(struct mm_struct *mm, unsigned int cpu)
> +{
> +	if (atomic_read(&mm->context.nr_active_mm) <= 1) {
> +		bool is_local = current->active_mm == mm &&
> +			per_cpu(cpu_not_lazy_tlb, cpu);
> +		cpumask_t *stale_cpumask = mm_cpumask(mm);
> +		unsigned int next_zero = cpumask_next_zero(-1, stale_cpumask);
> +		bool local_is_clear = false;
> +		if (next_zero < nr_cpu_ids &&
> +		    (is_local && next_zero == cpu)) {
> +			next_zero = cpumask_next_zero(next_zero, stale_cpumask);
> +			local_is_clear = true;
> +		}
> +		if (next_zero < nr_cpu_ids) {
> +			cpumask_setall(stale_cpumask);
> +			local_is_clear = false;
> +		}
> +
> +		/*
> +		 * Enforce CPU ordering between the above
> +		 * cpumask_setall(mm_cpumask) and the below
> +		 * atomic_read(nr_active_mm).
> +		 */
> +		smp_mb();
> +
> +		if (likely(atomic_read(&mm->context.nr_active_mm)) <= 1) {
> +			if (is_local) {
> +				if (!local_is_clear)
> +					cpumask_clear_cpu(cpu, stale_cpumask);
> +				return TLB_FLUSH_LOCAL;
> +			}
> +			if (atomic_read(&mm->context.nr_active_mm) == 0)
> +				return TLB_FLUSH_NO;
> +		}
> +	}
> +	return TLB_FLUSH_BROADCAST;

And this code here can assume that if nr_active_mm <= 1, no broadcast is
necessary.

One concern I have is the ordering between TTBR0_EL1 update in
cpu_do_switch_mm() and the nr_active_mm, both on a different CPU. We
only have an ISB for context synchronisation on that CPU but I don't
think the architecture guarantees any relation between sysreg access and
the memory update. We have a DSB but that's further down in switch_to().

However, what worries me more is that you can now potentially do a TLB
shootdown without clearing the intermediate (e.g. VA to pte) walk caches
from the TLB. Even if the corresponding pgd and ASID are no longer
active on other CPUs, I'm not sure it's entirely safe to free (and
re-allocate) pages belonging to a pgtable without first flushing the
TLB. All the architecture spec states is that the software must first
clear the entry followed by TLBI (the break-before-make rules).

That said, the benchmark numbers are not very encouraging. Around 1%
improvement in a single run, it can as well be noise. Also something
like hackbench may also show a slight impact on the context switch path.
Maybe with a true NUMA machine with hundreds of CPUs we may see a
difference, but it depends on how well the TLBI is implemented.

Thanks.

-- 
Catalin

  parent reply	other threads:[~2020-03-09 11:22 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-23 19:25 [PATCH 0/3] arm64: tlb: skip tlbi broadcast v2 Andrea Arcangeli
2020-02-23 19:25 ` [PATCH 1/3] mm: use_mm: fix for arches checking mm_users to optimize TLB flushes Andrea Arcangeli
2020-03-03  4:28   ` Rafael Aquini
2020-02-23 19:25 ` [PATCH 2/3] arm64: select CPUMASK_OFFSTACK if NUMA Andrea Arcangeli
2020-03-03  4:31   ` Rafael Aquini
2020-02-23 19:25 ` [PATCH 3/3] arm64: tlb: skip tlbi broadcast Andrea Arcangeli
2020-03-02 15:24   ` Rafael Aquini
2020-03-04  4:19     ` Rafael Aquini
2020-03-09 11:22   ` Catalin Marinas [this message]
2020-03-14  3:16     ` Andrea Arcangeli
2020-03-16 14:09       ` Mark Rutland
2020-03-31  9:45         ` Mark Rutland
2020-04-01  0:32           ` Andrea Arcangeli
2020-03-18  8:53 ` [PATCH 0/3] arm64: tlb: skip tlbi broadcast v2 qi.fuli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200309112242.GB2487@mbp \
    --to=catalin.marinas@arm.com \
    --cc=aarcange@redhat.com \
    --cc=aquini@redhat.com \
    --cc=jcm@jonmasters.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=msalter@redhat.com \
    --cc=qi.fuli@fujitsu.com \
    --cc=will@kernel.org \
    --subject='Re: [PATCH 3/3] arm64: tlb: skip tlbi broadcast' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).