[LU-4377] Segmentation fault seen in LNet Self Test sub-shell
Created: 11/Dec/13  Updated: 22/Dec/15  Resolved: 23/Jul/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.6.0
Fix Version/s: Lustre 2.7.0

Type: Bug
Priority: Minor
Reporter: Brett Lee (Inactive)
Assignee: Amir Shehata (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
is related to LU-6025 lctl nodemap commands stop working in... Resolved
Severity: 3
Rank (Obsolete): 11992

 Description   

Seen in Lustre 2.5.
The following sequence of commands in the interactive lst shell seems to reproduce the fault:

    # lst
    lst > new_session
    Invalid parameters list in command line
    lst > new_session a
    Invalid parameters list in command line
    lst > end_session
    Invalid parameters list in command line
    lst > stop
    Segmentation fault


 Comments   
Comment by Amir Shehata (Inactive) [ 11/Dec/13 ]

Found the cause of the crash.
Running "new_session a" executes the following code in jt_lst_new_session():
        if (optind == argc - 1) {
                name = argv[optind ++];
                if (strlen(name) >= LST_NAME_SIZE) {
                        fprintf(stderr, "Name size is limited to %d\n", LST_NAME_SIZE - 1);
                        return -1;
                }

This increments optind unnecessarily, and nothing resets it before the next call to getopt_long(), which causes the crash. The same pattern appears to be repeated in several places in the code. The code seems to assume that lst is never run interactively. When it is run interactively, optind needs to be reset before each command so that getopt_long() starts scanning from the beginning of the command line.

The same crash can be observed if you do:
new_session a
new_session
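
The reset described above could look roughly like the sketch below. This is not the actual Lustre fix: demo_new_session(), demo_opts, the "timeout" option, and DEMO_NAME_SIZE are hypothetical stand-ins for illustration. It assumes the traditional convention of setting optind to 1 before rescanning a new argument vector (glibc additionally documents optind = 0 for a full reinitialization), and it reads the name without incrementing optind.

```c
#include <getopt.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the real name limit in the Lustre headers. */
#define DEMO_NAME_SIZE 32

/* Hypothetical option table; the real command's options are not shown here. */
static const struct option demo_opts[] = {
        { "timeout", required_argument, NULL, 't' },
        { NULL, 0, NULL, 0 }
};

/*
 * Hypothetical handler modelled on the pattern quoted in this comment.
 * Returns 0 on success and stores the optional session name (or NULL)
 * in *name_out; returns -1 on invalid input.
 */
static int demo_new_session(int argc, char **argv, const char **name_out)
{
        int c;

        *name_out = NULL;

        /* Restart scanning at argv[1]. Without this, optind keeps the
         * value left behind by the previously executed command. */
        optind = 1;

        while ((c = getopt_long(argc, argv, "t:", demo_opts, NULL)) != -1) {
                if (c != 't')
                        return -1;
        }

        if (optind == argc - 1) {
                /* Read the name without modifying optind. */
                const char *name = argv[optind];

                if (strlen(name) >= DEMO_NAME_SIZE) {
                        fprintf(stderr, "Name size is limited to %d\n",
                                DEMO_NAME_SIZE - 1);
                        return -1;
                }
                *name_out = name;
        } else if (optind != argc) {
                /* More than one non-option argument. */
                return -1;
        }

        return 0;
}
```

With the reset in place, calling the handler for "new_session a" and then for "new_session" parses each command line from its own beginning, regardless of where the previous scan stopped.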
