LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Robert Walker <robert.walker@arm.com>
To: Mathieu Poirier <mathieu.poirier@linaro.org>,
	Leo Yan <leo.yan@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	mike.leach@linaro.org, kim.phillips@arm.co,
	Tor Jeremiassen <tor@ti.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@redhat.com>, Namhyung Kim <namhyung@kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, coresight@lists.linaro.org
Subject: Re: [RFT v3 1/4] perf cs-etm: Generate branch sample for missed packets
Date: Wed, 30 May 2018 10:45:54 +0100	[thread overview]
Message-ID: <5efe804f-364b-6ae1-3fca-ec7f6d8a383d@arm.com> (raw)
In-Reply-To: <20180528221347.GA4109@xps15>



On 28/05/18 23:13, Mathieu Poirier wrote:
> Leo and/or Robert,
> 
> On Mon, May 28, 2018 at 04:45:00PM +0800, Leo Yan wrote:
>> Commit e573e978fb12 ("perf cs-etm: Inject capabilitity for CoreSight
>> traces") reworks the samples generation flow from CoreSight trace to
>> match the correct format so Perf report tool can display the samples
>> properly.
>>
>> But the change has side effect for branch packet handling, it only
>> generate branch samples by checking previous packet flag
>> 'last_instr_taken_branch' is true, this results in below three kinds
>> packets are missed to generate branch samples:
>>
>> - The start tracing packet at the beginning of tracing data;
>> - The exception handling packet;
>> - If one CS_ETM_TRACE_ON packet is inserted, we also miss to handle it
>>    for branch samples.  CS_ETM_TRACE_ON packet itself can give the info
>>    that there have a discontinuity in the trace, on the other hand we
>>    also miss to generate proper branch sample for packets before and
>>    after CS_ETM_TRACE_ON packet.
>>
>> This patch is to add branch sample handling for up three kinds packets:
>>
>> - In function cs_etm__sample(), check if 'prev_packet->sample_type' is
>>    zero and in this case it generates branch sample for the start tracing
>>    packet; furthermore, we also need to handle the condition for
>>    prev_packet::end_addr is zero in the cs_etm__last_executed_instr();
>>
>> - In function cs_etm__sample(), check if 'prev_packet->exc' is true and
>>    generate branch sample for exception handling packet;
>>
>> - If there has one CS_ETM_TRACE_ON packet is coming, we firstly generate
>>    branch sample in the function cs_etm__flush(), this can save complete
>>    info for the previous CS_ETM_RANGE packet just before CS_ETM_TRACE_ON
>>    packet.  We also generate branch sample for the new CS_ETM_RANGE
>>    packet after CS_ETM_TRACE_ON packet, this have two purposes, the
>>    first one purpose is to save the info for the new CS_ETM_RANGE packet,
>>    the second purpose is to save CS_ETM_TRACE_ON packet info so we can
>>    have hint for a discontinuity in the trace.
>>
>>    For CS_ETM_TRACE_ON packet, its fields 'packet->start_addr' and
>>    'packet->end_addr' equal to 0xdeadbeefdeadbeefUL which are emitted in
>>    the decoder layer as dummy value.  This patch is to convert these
>>    values to zeros for more readable; this is accomplished by functions
>>    cs_etm__last_executed_instr() and cs_etm__first_executed_instr().  The
>>    later one is a new function introduced by this patch.
>>
>> Reviewed-by: Robert Walker <robert.walker@arm.com>
>> Fixes: e573e978fb12 ("perf cs-etm: Inject capabilitity for CoreSight traces")
>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>> ---
>>   tools/perf/util/cs-etm.c | 93 +++++++++++++++++++++++++++++++++++++-----------
>>   1 file changed, 73 insertions(+), 20 deletions(-)
>>
>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>> index 822ba91..8418173 100644
>> --- a/tools/perf/util/cs-etm.c
>> +++ b/tools/perf/util/cs-etm.c
>> @@ -495,6 +495,20 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq)
>>   static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet)
>>   {
>>   	/*
>> +	 * The packet is the start tracing packet if the end_addr is zero,
>> +	 * returns 0 for this case.
>> +	 */
>> +	if (!packet->end_addr)
>> +		return 0;
> 
> What is considered to be the "start tracing packet"?  Right now the only two
> kind of packets inserted in the decoder packet buffer queue are INST_RANGE and
> TRACE_ON.  How can we hit a condition where packet->end-addr == 0?
> 
> 
>> +
>> +	/*
>> +	 * The packet is the CS_ETM_TRACE_ON packet if the end_addr is
>> +	 * magic number 0xdeadbeefdeadbeefUL, returns 0 for this case.
>> +	 */
>> +	if (packet->end_addr == 0xdeadbeefdeadbeefUL)
>> +		return 0;
> 
> As it is with the above, I find triggering on addresses to be brittle and hard
> to maintain on the long run.  Packets all have a sample_type field that should
> be used in cases like this one.  That way we know exactly the condition that is
> targeted.
> 
> While working on this set, please spin-off another patch that defines
> CS_ETM_INVAL_ADDR 0xdeadbeefdeadbeefUL and replace all the cases where the
> numeral is used.  That way we stop using the hard coded value.
> 
>> +
>> +	/*
>>   	 * The packet records the execution range with an exclusive end address
>>   	 *
>>   	 * A64 instructions are constant size, so the last executed
>> @@ -505,6 +519,18 @@ static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet)
>>   	return packet->end_addr - A64_INSTR_SIZE;
>>   }
>>   
>> +static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
>> +{
>> +	/*
>> +	 * The packet is the CS_ETM_TRACE_ON packet if the start_addr is
>> +	 * magic number 0xdeadbeefdeadbeefUL, returns 0 for this case.
>> +	 */
>> +	if (packet->start_addr == 0xdeadbeefdeadbeefUL)
>> +		return 0;
> 
> Same comment as above.
> 
>> +
>> +	return packet->start_addr;
>> +}
>> +
>>   static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet)
>>   {
>>   	/*
>> @@ -546,7 +572,7 @@ static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq)
>>   
>>   	be       = &bs->entries[etmq->last_branch_pos];
>>   	be->from = cs_etm__last_executed_instr(etmq->prev_packet);
>> -	be->to	 = etmq->packet->start_addr;
>> +	be->to	 = cs_etm__first_executed_instr(etmq->packet);
>>   	/* No support for mispredict */
>>   	be->flags.mispred = 0;
>>   	be->flags.predicted = 1;
>> @@ -701,7 +727,7 @@ static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq)
>>   	sample.ip = cs_etm__last_executed_instr(etmq->prev_packet);
>>   	sample.pid = etmq->pid;
>>   	sample.tid = etmq->tid;
>> -	sample.addr = etmq->packet->start_addr;
>> +	sample.addr = cs_etm__first_executed_instr(etmq->packet);
>>   	sample.id = etmq->etm->branches_id;
>>   	sample.stream_id = etmq->etm->branches_id;
>>   	sample.period = 1;
>> @@ -897,13 +923,28 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
>>   		etmq->period_instructions = instrs_over;
>>   	}
>>   
>> -	if (etm->sample_branches &&
>> -	    etmq->prev_packet &&
>> -	    etmq->prev_packet->sample_type == CS_ETM_RANGE &&
>> -	    etmq->prev_packet->last_instr_taken_branch) {
>> -		ret = cs_etm__synth_branch_sample(etmq);
>> -		if (ret)
>> -			return ret;
>> +	if (etm->sample_branches && etmq->prev_packet) {
>> +		bool generate_sample = false;
>> +
>> +		/* Generate sample for start tracing packet */
>> +		if (etmq->prev_packet->sample_type == 0 ||
> 
> What kind of packet is sample_type == 0 ?
> 
>> +		    etmq->prev_packet->sample_type == CS_ETM_TRACE_ON)
>> +			generate_sample = true;
>> +
>> +		/* Generate sample for exception packet */
>> +		if (etmq->prev_packet->exc == true)
>> +			generate_sample = true;
> 
> Please don't do that.  Exception packets have a type of their own and can be
> added to the decoder packet queue the same way INST_RANGE and TRACE_ON packets
> are.  Moreover exception packet containt an address that, if I'm reading the
> documenation properly, can be used to keep track of instructions that were
> executed between the last address of the previous range packet and the address
> executed just before the exception occurred.  Mike and Rob will have to confirm
> this as the decoder may be doing all that hard work for us.
> 
>> +
>> +		/* Generate sample for normal branch packet */
>> +		if (etmq->prev_packet->sample_type == CS_ETM_RANGE &&
>> +		    etmq->prev_packet->last_instr_taken_branch)
>> +			generate_sample = true;
>> +
>> +		if (generate_sample) {
>> +			ret = cs_etm__synth_branch_sample(etmq);
>> +			if (ret)
>> +				return ret;
>> +		}
>>   	}
>>   
>>   	if (etm->sample_branches || etm->synth_opts.last_branch) {
>> @@ -922,11 +963,16 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
>>   static int cs_etm__flush(struct cs_etm_queue *etmq)
>>   {
>>   	int err = 0;
>> +	struct cs_etm_auxtrace *etm = etmq->etm;
>>   	struct cs_etm_packet *tmp;
>>   
>> -	if (etmq->etm->synth_opts.last_branch &&
>> -	    etmq->prev_packet &&
>> -	    etmq->prev_packet->sample_type == CS_ETM_RANGE) {
>> +	if (!etmq->prev_packet)
>> +		return 0;
>> +
>> +	if (etmq->prev_packet->sample_type != CS_ETM_RANGE)
>> +		return 0;
>> +
>> +	if (etmq->etm->synth_opts.last_branch) {
> 
> If you add:
> 
>          if (!etmq->etm->synth_opts.last_branch)
>                  return 0;
> 
> You can avoid indenting the whole block.
> 
>>   		/*
>>   		 * Generate a last branch event for the branches left in the
>>   		 * circular buffer at the end of the trace.
>> @@ -939,18 +985,25 @@ static int cs_etm__flush(struct cs_etm_queue *etmq)
>>   		err = cs_etm__synth_instruction_sample(
>>   			etmq, addr,
>>   			etmq->period_instructions);
>> +		if (err)
>> +			return err;
>>   		etmq->period_instructions = 0;
>> +	}
>>   
>> -		/*
>> -		 * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
>> -		 * the next incoming packet.
>> -		 */
>> -		tmp = etmq->packet;
>> -		etmq->packet = etmq->prev_packet;
>> -		etmq->prev_packet = tmp;
>> +	if (etm->sample_branches) {
>> +		err = cs_etm__synth_branch_sample(etmq);
>> +		if (err)
>> +			return err;
>>   	}
>>   
>> -	return err;
>> +	/*
>> +	 * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for
>> +	 * the next incoming packet.
>> +	 */
>> +	tmp = etmq->packet;
>> +	etmq->packet = etmq->prev_packet;
>> +	etmq->prev_packet = tmp;
> 
> Robert, I remember noticing that when you first submitted the code but forgot to
> go back to it.  What is the point of swapping the packets?  I understand
> 
> etmq->prev_packet = etmq->packet;
> 
> But not
> 
> etmq->packet = tmp;
> 
> After all etmq->packet will be clobbered as soon as cs_etm_decoder__get_packet()
> is called, which is alwasy right after either cs_etm__sample() or
> cs_etm__flush().
>

This is code I inherited from the original versions of these patches, 
but it works because:
- etmq->packet and etmq->prev_packet are pointers to struct 
cs_etm_packet allocated by zalloc() in cs_etm__alloc_queue()
- cs_etm_decoder__get_packet() takes a pointer to struct cs_etm_packet 
and copies the contents of the first packet from the queue into the 
passed location with:
    *packet = decoder->packet_buffer[decoder->head]

So the swap code is only swapping the pointers over, not the contents of 
the packets.

Regards

Rob


> Thanks,
> Mathieu
>
> 
> 
>> +	return 0;
>>   }
>>   
>>   static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
>> -- 
>> 2.7.4
>>

  parent reply	other threads:[~2018-05-30  9:46 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-28  8:44 [RFT v3 0/4] Perf script: Add python script for CoreSight trace disassembler Leo Yan
2018-05-28  8:45 ` [RFT v3 1/4] perf cs-etm: Generate branch sample for missed packets Leo Yan
2018-05-28 22:13   ` Mathieu Poirier
2018-05-29  0:25     ` Leo Yan
2018-05-29 16:04       ` Mathieu Poirier
2018-05-30  0:28         ` Leo Yan
2018-05-30 14:45           ` Mathieu Poirier
2018-05-30 14:49             ` Leo Yan
2018-05-30  9:45     ` Robert Walker [this message]
2018-05-30 15:04       ` Mike Leach
2018-05-30 15:39         ` Leo Yan
2018-05-30 15:49           ` Leo Yan
2018-05-28  8:45 ` [RFT v3 2/4] perf script python: Add addr into perf sample dict Leo Yan
2018-05-31 10:45   ` [tip:perf/urgent] " tip-bot for Leo Yan
2018-05-28  8:45 ` [RFT v3 3/4] perf script python: Add script for CoreSight trace disassembler Leo Yan
2018-05-28  8:45 ` [RFT v3 4/4] coresight: Document " Leo Yan
2018-05-28 20:03 ` [RFT v3 0/4] Perf script: Add python script " Arnaldo Carvalho de Melo
2018-05-28 21:53   ` Mathieu Poirier
2018-05-29 13:32     ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5efe804f-364b-6ae1-3fca-ec7f6d8a383d@arm.com \
    --to=robert.walker@arm.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=coresight@lists.linaro.org \
    --cc=jolsa@redhat.com \
    --cc=kim.phillips@arm.co \
    --cc=leo.yan@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.poirier@linaro.org \
    --cc=mike.leach@linaro.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tor@ti.com \
    --subject='Re: [RFT v3 1/4] perf cs-etm: Generate branch sample for missed packets' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).