[LU-9933] Hitting ASSERTION in lnet_peer_add_nid() Created: 30/Aug/17 Updated: 02/Oct/17 Resolved: 30/Sep/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0 |
| Fix Version/s: | Lustre 2.11.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | James A Simmons | Assignee: | Amir Shehata (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
More than half the time when I attempt to bring up a file system, I hit this assertion:
2017-08-30T14:33:56.720487-04:00 ninja33.ccs.ornl.gov kernel: LNetError: 1755:0:(peer.c:1248:lnet_peer_add_nid()) ASSERTION( nid ! |
| Comments |
| Comment by Olaf Weber [ 31/Aug/17 ] |
|
To get here, the ping buffer must have contained only a single NID, which should always be the loopback NID. It looks like lnet_peer_data_present() should have the following check changed from

    if (pbuf->pb_info.pi_nnis > 1)
            nid = pbuf->pb_info.pi_ni[1].ns_nid;

to

    if (pbuf->pb_info.pi_nnis <= 1)
            goto out;
    nid = pbuf->pb_info.pi_ni[1].ns_nid; |
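
The difference between the two forms of the check can be shown in a small standalone sketch. It is not the Lustre code: the structs below are simplified stand-ins that only reuse the field names quoted in the comment, `NID_UNSET` is a hypothetical sentinel, and `pick_nid_before()` / `pick_nid_after()` are made-up helper names. It assumes that `nid` starts out at some "unset" value and is handed onward when the buffer holds only one entry, which is what the comment above suggests.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no NID chosen"; not taken from Lustre. */
#define NID_UNSET UINT64_MAX

/* Simplified stand-ins that reuse the field names from the comment.
 * The real LNet structures are laid out differently. */
struct ni_status   { uint64_t ns_nid; };
struct ping_info   { uint32_t pi_nnis; struct ni_status pi_ni[4]; };
struct ping_buffer { struct ping_info pb_info; };

/* Original shape of the check: with a single entry (loopback only),
 * nid is never assigned and the sentinel is returned to the caller. */
static uint64_t pick_nid_before(const struct ping_buffer *pbuf)
{
	uint64_t nid = NID_UNSET;

	if (pbuf->pb_info.pi_nnis > 1)
		nid = pbuf->pb_info.pi_ni[1].ns_nid;
	return nid;
}

/* Proposed shape of the check: bail out early when there is no second
 * entry. Returns 0 and stores the NID on success, -1 otherwise; *nid is
 * left untouched on the early-exit path. */
static int pick_nid_after(const struct ping_buffer *pbuf, uint64_t *nid)
{
	int rc = -1;

	if (pbuf->pb_info.pi_nnis <= 1)
		goto out;
	*nid = pbuf->pb_info.pi_ni[1].ns_nid;
	rc = 0;
out:
	return rc;
}
```

With a buffer whose `pi_nnis` is 1, the first helper hands back the sentinel, while the second returns early so that nothing downstream ever sees an unset NID.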
| Comment by Gerrit Updater [ 31/Aug/17 ] |
|
Olaf Weber (olaf.weber@hpe.com) uploaded a new patch: https://review.whamcloud.com/28811 |
| Comment by Amir Shehata (Inactive) [ 22/Sep/17 ] |
|
This patch has +2. Should we land it, since it's a blocker? |
| Comment by Gerrit Updater [ 30/Sep/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28811/ |
| Comment by Peter Jones [ 30/Sep/17 ] |
|
Landed for 2.11. |
| Comment by Peter Jones [ 30/Sep/17 ] |
|
Does this affect b2_10? |
| Comment by Amir Shehata (Inactive) [ 02/Oct/17 ] |
|
No. This impacts dynamic discovery only.