LKML Archive on lore.kernel.org
From: Vikas Shivappa <vikas.shivappa@intel.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>,
	vikas.shivappa@intel.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, hpa@zytor.com, tglx@linutronix.de,
	mingo@kernel.org, tj@kernel.org, peterz@infradead.org,
	matt.fleming@intel.com, will.auld@intel.com,
	glenn.p.williamson@intel.com, kanaka.d.juvva@intel.com
Subject: Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide
Date: Thu, 26 Mar 2015 11:38:59 -0700 (PDT)
Message-ID: <alpine.DEB.2.10.1503261133070.19649@vshiva-Udesk>
In-Reply-To: <20150325223941.GA5657@amt.cnet>


Hello Marcelo,

On Wed, 25 Mar 2015, Marcelo Tosatti wrote:

> On Thu, Mar 12, 2015 at 04:16:07PM -0700, Vikas Shivappa wrote:
>> This patch adds a description of Cache allocation technology, overview
>> of kernel implementation and usage of CAT cgroup interface.
>>
>> Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
>> ---
>>  Documentation/cgroups/rdt.txt | 183 ++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 183 insertions(+)
>>  create mode 100644 Documentation/cgroups/rdt.txt
>>
>> diff --git a/Documentation/cgroups/rdt.txt b/Documentation/cgroups/rdt.txt
>> new file mode 100644
>> index 0000000..98eb4b8
>> --- /dev/null
>> +++ b/Documentation/cgroups/rdt.txt
>> @@ -0,0 +1,183 @@
>> +        RDT
>> +        ---
>> +
>> +Copyright (C) 2014 Intel Corporation
>> +Written by vikas.shivappa@linux.intel.com
>> +(based on contents and format from cpusets.txt)
>> +
>> +CONTENTS:
>> +=========
>> +
>> +1. Cache Allocation Technology
>> +  1.1 What is RDT and CAT ?
>> +  1.2 Why is CAT needed ?
>> +  1.3 CAT implementation overview
>> +  1.4 Assignment of CBM and CLOS
>> +  1.5 Scheduling and Context Switch
>> +2. Usage Examples and Syntax
>> +
>> +1. Cache Allocation Technology(CAT)
>> +===================================
>> +
>> +1.1 What is RDT and CAT
>> +-----------------------
>> +
>> +CAT is a part of Resource Director Technology(RDT) or Platform Shared
>> +resource control which provides support to control Platform shared
>> +resources like cache. Currently Cache is the only resource that is
>> +supported in RDT.
>> +More information can be found in the Intel SDM section 17.15.
>> +
>> +Cache Allocation Technology provides a way for the Software (OS/VMM)
>> +to restrict cache allocation to a defined 'subset' of cache which may
>> +be overlapping with other 'subsets'.  This feature is used when
>> +allocating a line in cache, i.e. when pulling new data into the cache.
>> +The programming of the h/w is done via programming MSRs.
>> +
>> +The different cache subsets are identified by CLOS identifier (class
>> +of service) and each CLOS has a CBM (cache bit mask).  The CBM is a
>> +contiguous set of bits which defines the amount of cache resource that
>> +is available for each 'subset'.
>> +
>> +1.2 Why is CAT needed
>> +---------------------
>> +
>> +CAT enables more cache resources to be made available for higher
>> +priority applications based on guidance from the execution
>> +environment.
>> +
>> +The architecture also allows dynamically changing these subsets during
>> +runtime to further optimize the performance of the higher priority
>> +application with minimal degradation to the low priority app.
>> +Additionally, resources can be rebalanced for system throughput
>> +benefit.  (Refer to Section 17.15 in the Intel SDM)
>> +
>> +This technique may be useful in managing large computer systems with
>> +a large LLC. Examples may be large servers running instances of
>> +webservers or database servers. In such complex systems, these subsets
>> +can be used for more careful placing of the available cache
>> +resources.
>> +
>> +The CAT kernel patch would provide a basic kernel framework for users
>> +to be able to implement such cache subsets.
>> +
>> +1.3 CAT implementation Overview
>> +-------------------------------
>> +
>> +Kernel implements a cgroup subsystem to support cache allocation.
>> +
>> +Each cgroup has a CLOSid <-> CBM (cache bit mask) mapping.
>> +A CLOS (class of service) is represented by a CLOSid. The CLOSid is internal
>> +to the kernel and not exposed to the user.  Each cgroup would have one CBM
>> +and would just represent one cache 'subset'.
>> +
>> +The cgroup follows the cgroup hierarchy; mkdir and adding tasks to the
>> +cgroup never fail.  When a child cgroup is created it inherits the
>> +CLOSid and the CBM from its parent.  When a user changes the default
>> +CBM for a cgroup, a new CLOSid may be allocated if the CBM was not
>> +used before.  Changing the 'cbm' may fail with -ENOSPC once the
>> +kernel runs out of the maximum number of CLOSids it can support.
>> +Users can create as many cgroups as they want, but the number of
>> +different CBMs in use at the same time is limited by the maximum number
>> +of CLOSids (multiple cgroups can have the same CBM).
>> +The kernel maintains a CLOSid <-> CBM mapping which keeps a reference
>> +count of the cgroups using each CLOSid.
>> +
>> +The tasks in the cgroup would get to fill the LLC cache represented by
>> +the cgroup's 'cbm' file.
>> +
>> +The root directory would have all available bits set in its 'cbm' file
>> +by default.
>> +
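
The CLOSid bookkeeping described in the quoted section 1.3 can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: the class `ClosTable` and its methods `get`, `put` and `change_cbm` are hypothetical names, CLOSid 0 is assumed to belong to the root cgroup, and the new CLOSid is assumed to be taken before the old one is released.

```python
import errno


class ClosTable:
    """Toy model of a CLOSid <-> CBM table with per-CLOSid reference counts."""

    def __init__(self, max_closids, root_cbm):
        self.max_closids = max_closids
        self.cbm_of = {0: root_cbm}  # CLOSid -> CBM (CLOSid 0 assumed for root)
        self.refs = {0: 1}           # CLOSid -> number of cgroups using it

    def get(self, cbm):
        """Take a reference on the CLOSid for cbm, allocating one if needed."""
        for closid, existing in self.cbm_of.items():
            if existing == cbm:      # CBM already in use: share its CLOSid
                self.refs[closid] += 1
                return closid
        for closid in range(self.max_closids):
            if closid not in self.cbm_of:
                self.cbm_of[closid] = cbm
                self.refs[closid] = 1
                return closid
        raise OSError(errno.ENOSPC, "no free CLOSid")

    def put(self, closid):
        """Drop a reference; free the CLOSid when no cgroup uses it."""
        self.refs[closid] -= 1
        if self.refs[closid] == 0:
            del self.refs[closid]
            del self.cbm_of[closid]

    def change_cbm(self, old_closid, new_cbm):
        """Model writing a new value to a cgroup's 'cbm' file."""
        new_closid = self.get(new_cbm)  # may raise ENOSPC, old state untouched
        self.put(old_closid)
        return new_closid
```

In this model a child cgroup inherits by calling `get()` with its parent's CBM, which only bumps a reference count, so creating cgroups never consumes a new CLOSid; only a change to a CBM that no other cgroup uses can run out of CLOSids.
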
>> +1.4 Assignment of CBM,CLOS
>> +--------------------------
>> +
>> +The 'cbm' needs to be a subset of the parent node's 'cbm'.
>> +Any contiguous subset of these bits (with a minimum of 2 bits) may be
>> +set to indicate the cache mapping desired.  The 'cbm' of 2
>> +directories can overlap. The 'cbm' would represent the cache 'subset'
>> +of the CAT cgroup.  For example, on a system with a 16-bit maximum cbm,
>> +if the directory has the least significant 4 bits set in its 'cbm'
>> +file (meaning the 'cbm' is just 0xf), it would be allocated the right
>> +quarter of the last level cache, which means the tasks belonging to
>> +this CAT cgroup can use the right quarter of the cache to fill. If it
>> +has the most significant 8 bits set, it would be allocated the left
>> +half of the cache (8 bits out of 16 represents 50%).
>> +
>> +The cache portion defined in the CBM file is available to all tasks
>> +within the cgroup to fill and these tasks are not allowed to allocate
>> +space in other parts of the cache.
>
> Is there a reason to expose the hardware interface rather
> than ratios to userspace ?
>
> Say, i'd like to allocate 20% of L3 cache to cgroup A,
> 80% to cgroup B.
>
> Well, you'd have to expose the shared percentages between
> any two cgroups (that information is there in the
> cbm bitmaps, but not in "ratios").
>
> One problem i see with exposing cbm bitmasks is that on hardware
> updates that change cache size or bitmask length, userspace must
> recalculate the bitmaps.
>
> Another is that it's vendor dependent, while ratios (plus shared
> information for two given cgroups) is not.
>

Agreed that this interface does not give an option to allocate directly in
terms of percentage. But note that specifying bitmasks allows the user to
allocate overlapping cache areas, and since we use cgroups we naturally
follow the cgroup hierarchy. The user should be able to convert the bitmasks
into the intended percentage or size values based on the cache size info
available through other hooks like cpuinfo.
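
As a rough illustration of that conversion, here is a minimal Python sketch. It assumes every CBM bit stands for an equal slice of the cache, as in the 16-bit examples in the quoted documentation. The function names are hypothetical, and the cache size is passed in explicitly rather than read from any particular system file.

```python
def cbm_share(cbm, cbm_len, cache_bytes):
    """Return (percent, bytes) of the cache covered by a bitmask.

    Assumes each of the cbm_len bits represents an equal slice of the cache.
    """
    if cbm_len <= 0 or cbm < 0 or cbm >> cbm_len:
        raise ValueError("cbm does not fit in cbm_len bits")
    bits = bin(cbm).count("1")
    return 100.0 * bits / cbm_len, cache_bytes * bits // cbm_len


def cbm_overlap_percent(cbm_a, cbm_b, cbm_len):
    """Percentage of the whole cache that two bitmasks share."""
    return 100.0 * bin(cbm_a & cbm_b).count("1") / cbm_len


if __name__ == "__main__":
    two_mb = 2 * 1024 * 1024
    print(cbm_share(0xf, 16, two_mb))            # least significant 4 of 16 bits
    print(cbm_overlap_percent(0xff, 0xf0, 16))   # shared part of two cgroups
```

The second helper covers the "shared percentages between any two cgroups" raised above: the overlap is simply the bitwise AND of the two masks.
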

We discussed this in more detail on the older patches; here is one thread
from that discussion for your reference:
http://marc.info/?l=linux-kernel&m=142482002022543&w=2

Thanks,
Vikas

>
>> +
>> +1.5 Scheduling and Context Switch
>> +---------------------------------
>> +
>> +During a context switch the kernel implements this by writing the
>> +CLOSid (internally maintained by the kernel) of the cgroup to which the
>> +task belongs into the CPU's IA32_PQR_ASSOC MSR. The MSR is only written
>> +when the CLOSid for the CPU changes, in order to minimize
>> +the latency incurred during context switch.
>> +
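
The "only write the MSR on change" idea in the quoted section 1.5 can be shown with a toy model. This is a user-space sketch, not kernel code: `Cpu`, `write_pqr_assoc` and `switch_to` are hypothetical names, the MSR write is simulated by a counter, and the MSR is assumed to hold CLOSid 0 initially.

```python
class Cpu:
    """Toy per-CPU state: a software copy of the CLOSid last written."""

    def __init__(self):
        self.cached_closid = 0  # assumed initial value of the MSR
        self.msr_writes = 0     # counts simulated MSR writes

    def write_pqr_assoc(self, closid):
        # Stand-in for the real (comparatively slow) MSR write.
        self.msr_writes += 1
        self.cached_closid = closid


def switch_to(cpu, task_closid):
    """Called on context switch with the CLOSid of the incoming task."""
    if cpu.cached_closid != task_closid:  # skip the write when unchanged
        cpu.write_pqr_assoc(task_closid)


if __name__ == "__main__":
    cpu = Cpu()
    for closid in [0, 1, 1, 2, 2, 2, 0]:
        switch_to(cpu, closid)
    print(cpu.msr_writes)
```

Consecutive tasks with the same CLOSid cost only a comparison in this model, which is the latency saving the section describes.
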
>> +2. Usage examples and syntax
>> +============================
>> +
>> +To check whether CAT is enabled on your system:
>> +
>> +  dmesg | grep -i intel_rdt
>> +should output : intel_rdt: cbmlength:xx, Closs:xx
>> +The cbm length and the number of CLOSs depend on the system you use.
>> +
>> +
>> +The following mounts the cache allocation cgroup subsystem and creates
>> +2 directories. Please refer to Documentation/cgroups/cgroups.txt for
>> +details about how to use cgroups.
>> +
>> +  cd /sys/fs/cgroup
>> +  mkdir rdt
>> +  mount -t cgroup -ordt rdt /sys/fs/cgroup/rdt
>> +  cd rdt
>> +
>> +Create 2 rdt cgroups
>> +
>> +  mkdir group1
>> +  mkdir group2
>> +
>> +The following are some of the files in the directory
>> +
>> +  ls
>> +  rdt.cbm
>> +  tasks
>> +
>> +Say the cache is 2MB and the cbm supports 16 bits; then the setting
>> +below allocates the 'right quarter' (512KB) of the cache to group2.
>> +
>> +Edit the CBM for group2 to set the least significant 4 bits.  This
>> +allocates the 'right quarter' of the cache.
>> +
>> +  cd group2
>> +  /bin/echo 0xf > rdt.cbm
>> +
>> +
>> +Edit the CBM for group2 to set the least significant 8 bits.  This
>> +allocates the right half of the cache to 'group2'.
>> +
>> +  cd group2
>> +  /bin/echo 0xff > rdt.cbm
>> +
>> +Assign tasks to group2
>> +
>> +  /bin/echo PID1 > tasks
>> +  /bin/echo PID2 > tasks
>> +
>> +  Now the threads
>> +  PID1 and PID2 get to fill the 'right half' of
>> +  the cache as they belong to cgroup group2.
>> +
>> +Create a group under group2
>> +
>> +  cd group2
>> +  mkdir group21
>> +  cat group21/rdt.cbm
>> +   0xff - inherits parent's mask.
>> +
>> +  /bin/echo 0xfff > group21/rdt.cbm - throws an error as the mask has to be a subset of the parent's mask
>> +
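
The rules for an acceptable 'cbm' stated in the quoted section 1.4 (contiguous bits, at least 2 of them, a subset of the parent's mask) can be checked with a few bit operations. The following is a minimal sketch of those three rules only; `cbm_is_valid` is a hypothetical helper, not the check the patch itself performs.

```python
def cbm_is_valid(cbm, parent_cbm, min_bits=2):
    """Check a candidate mask against the three rules in the documentation."""
    if cbm <= 0:
        return False
    if cbm & ~parent_cbm:                # has a bit the parent does not have
        return False
    if bin(cbm).count("1") < min_bits:   # fewer than the minimum number of bits
        return False
    # Contiguity: shift out the trailing zeros, then the remaining value
    # must look like 0b111...1, i.e. be one less than a power of two.
    lowest_bit = cbm & -cbm
    run = cbm // lowest_bit
    return (run & (run + 1)) == 0


if __name__ == "__main__":
    print(cbm_is_valid(0xf, 0xff))    # subset of the parent's 0xff
    print(cbm_is_valid(0xfff, 0xff))  # the rejected case from the example above
```
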
>> --
>> 1.9.1
>>
>

