LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Samuel Mendoza-Jonas <sam@mendozajonas.com>
To: Justin.Lee1@Dell.com, netdev@vger.kernel.org
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org,
	openbmc@lists.ozlabs.org
Subject: Re: [PATCH net-next v2 5/6] net/ncsi: Reset channel state in ncsi_start_dev()
Date: Thu, 01 Nov 2018 15:30:38 +1100	[thread overview]
Message-ID: <8004d6c236582253a82004ce72425b45f09d7d56.camel@mendozajonas.com> (raw)
In-Reply-To: <2be6038ad8cb43559495e6f84e97b8a6@AUSX13MPS306.AMER.DELL.COM>

On Tue, 2018-10-30 at 18:23 +0000, Justin.Lee1@Dell.com wrote:
> > On Fri, 2018-10-26 at 17:25 +0000, Justin.Lee1@Dell.com wrote:
> > > Hi Samuel,
> > > 
> > > I noticed a few issues and commented below.
> > > 
> > > Thanks,
> > > Justin
> > > 
> > > 
> > > >  /* Resources */
> > > > +int ncsi_reset_dev(struct ncsi_dev *nd);
> > > >  void ncsi_start_channel_monitor(struct ncsi_channel *nc);
> > > >  void ncsi_stop_channel_monitor(struct ncsi_channel *nc);
> > > >  struct ncsi_channel *ncsi_find_channel(struct ncsi_package *np,
> > > > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> > > > index 014321ad31d3..9bad03e3fa5e 100644
> > > > --- a/net/ncsi/ncsi-manage.c
> > > > +++ b/net/ncsi/ncsi-manage.c
> > > > @@ -550,8 +550,10 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
> > > >  		spin_lock_irqsave(&nc->lock, flags);
> > > >  		nc->state = NCSI_CHANNEL_INACTIVE;
> > > >  		spin_unlock_irqrestore(&nc->lock, flags);
> > > > -		ncsi_process_next_channel(ndp);
> > > > -
> > > > +		if (ndp->flags & NCSI_DEV_RESET)
> > > > +			ncsi_reset_dev(nd);
> > > > +		else
> > > > +			ncsi_process_next_channel(ndp);
> > > >  		break;
> > > >  	default:
> > > >  		netdev_warn(nd->dev, "Wrong NCSI state 0x%x in suspend\n",
> > > > @@ -1554,7 +1556,7 @@ int ncsi_start_dev(struct ncsi_dev *nd)
> > > >  		return 0;
> > > >  	}
> > > >  
> > > > -	return ncsi_choose_active_channel(nd);
> > > > +	return ncsi_reset_dev(nd);
> > > 
> > > If there is no available channel due to the whitelist, ncsi_start_dev() function will return failed
> > > Status and the network interface may fail to bring up too. It is possible for user to disable all 
> > > channels and leave the interface up for checking the LOM status.
> > > 
> > 
> > I'm not sure that that is a bug, or at least not in the scope of this
> > series. If the whitelist is set such that no channels are valid then
> > there's nothing for NCSI to do. If we want to do something like always
> > monitor all channels then that would be best to do in another patch.
> > 
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(ncsi_start_dev);
> > > 
> > > Also, if I send set_package_mask and set_channel_mask commands back to back in a program,
> > > the state machine doesn't work well. If I use command line and wait for it to complete for 
> > > each step, then it is fine.
> > 
> > Yeah that's not great; probably hitting some corner cases in the NCSI
> > locking. I'll look into the multi-channel related stuff but I have a
> > feeling that if you tried this with the existing set/clear commands you
> > would probably hit something similar, especially on your dual core
> > platform. If so this is probably something to fix separately.
> > 
> 
> It is possible that it is causing by the following code in ncsi_reset_dev() function.
> The state might be overwritten and the previous operation is interrupted.
> 
> 	spin_lock_irqsave(&ndp->lock, flags);
> 	ndp->flags |= NCSI_DEV_RESET;
> 	ndp->active_channel = active;
> 	ndp->active_package = active->package;
> 	spin_unlock_irqrestore(&ndp->lock, flags);
> 
> 	nd->state = ncsi_dev_state_suspend;

Yep, we should probably add a check before calling ncsi_reset_dev() in
the netlink code if we're already in reset, and check in ncsi_reset_dev()
if we mid-configuration.

For your trace below can you share exactly which commands you were
sending? Those messages aren't upstream so it's not 100% clear what's
being sent.

Thanks!
Sam

> 
> > > npcm7xx-emc f0825000.eth eth2: NCSI: Multi-package enabled on ifindex 2, mask 0x00000001
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 0
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0400
> > > npcm7xx-emc f0825000.eth eth2: NCSI: pkg 0 ch 0 set as preferred channel
> > > npcm7xx-emc f0825000.eth eth2: NCSI: Multi-channel enabled on ifindex 2, mask 0x00000003
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 1
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 1 state 0400
> > > npcm7xx-emc f0825000.eth eth2: NCSI: Package 1 set to all channels disabled
> > > npcm7xx-emc f0825000.eth eth2: NCSI: Multi-channel enabled on ifindex 2, mask 0x00000000
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel()
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pkg 0
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass pkg whitelist
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - ch 0
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass ch whitelist
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - skip
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - ch 1
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pass ch whitelist
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - skip
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - next pkg
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_choose_active_channel() - pkg 1
> > > npcm7xx-emc f0825000.eth eth2: NCSI: No channel found to configure!
> > > npcm7xx-emc f0825000.eth eth2: NCSI interface down
> > > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > > npcm7xx-emc f0825000.eth eth2: Wrong NCSI state 0x100 in workqueue
> > > 
> > > All masks are set correctly, but you can see the PS column is not right and channel doesn't
> > > configure correctly.
> > > 
> > > /sys/kernel/debug/ncsi_protocol# cat ncsi_device_status
> > > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC PS LS RU CR NQ HA
> > > ===================================================================
> > >   2   eth2   ncsi0  000 000 1  1  1  1  1  1  1  0  1  1  1  0  1
> > >   2   eth2   ncsi1  000 001 1  0  1  1  1  1  0  0  1  1  1  0  1
> > >   2   eth2   ncsi2  001 000 0  0  1  1  0  0  0  0  1  1  1  0  1
> > >   2   eth2   ncsi3  001 001 0  0  1  1  0  0  0  0  1  1  1  0  1
> > > ===================================================================
> > > MP: Multi-mode Package     WP: Whitelist Package
> > > MC: Multi-mode Channel     WC: Whitelist Channel
> > > PC: Primary Channel
> > > PS: Poll Status
> > > LS: Link Status
> > > RU: Running
> > > CR: Carrier OK
> > > NQ: Queue Stopped
> > > HA: Hardware Arbitration
> > > 
> > > PS column is getting from (int)nc->monitor.enabled.
> 
> 



  reply	other threads:[~2018-11-01  4:30 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-23 21:51 [PATCH net-next v2 0/6] net/ncsi: Allow enabling multiple packages & channels Samuel Mendoza-Jonas
2018-10-23 21:51 ` [PATCH net-next v2 1/6] net/ncsi: Don't enable all channels when HWA available Samuel Mendoza-Jonas
2018-10-24  3:12   ` kbuild test robot
2018-10-23 21:51 ` [PATCH net-next v2 2/6] net/ncsi: Probe single packages to avoid conflict Samuel Mendoza-Jonas
2018-10-23 21:51 ` [PATCH net-next v2 3/6] net/ncsi: Don't deselect package in suspend if active Samuel Mendoza-Jonas
2018-10-23 21:51 ` [PATCH net-next v2 4/6] net/ncsi: Don't mark configured channels inactive Samuel Mendoza-Jonas
2018-10-23 21:52 ` [PATCH net-next v2 5/6] net/ncsi: Reset channel state in ncsi_start_dev() Samuel Mendoza-Jonas
2018-10-26 17:25   ` Justin.Lee1
2018-10-30  0:23     ` Samuel Mendoza-Jonas
2018-10-30 18:23       ` Justin.Lee1
2018-11-01  4:30         ` Samuel Mendoza-Jonas [this message]
2018-11-01 15:51           ` Justin.Lee1
2018-10-23 21:52 ` [PATCH net-next v2 6/6] net/ncsi: Configure multi-package, multi-channel modes with failover Samuel Mendoza-Jonas
2018-10-26 21:48   ` Justin.Lee1
2018-10-30  3:05     ` Samuel Mendoza-Jonas
2018-10-23 23:55 ` [PATCH net-next v2 0/6] net/ncsi: Allow enabling multiple packages & channels David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8004d6c236582253a82004ce72425b45f09d7d56.camel@mendozajonas.com \
    --to=sam@mendozajonas.com \
    --cc=Justin.Lee1@Dell.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=openbmc@lists.ozlabs.org \
    --subject='Re: [PATCH net-next v2 5/6] net/ncsi: Reset channel state in ncsi_start_dev()' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).