[LU-1760] Misleading LNet Self Test message when lnet_selftest module is not loaded Created: 16/Aug/12  Updated: 29/May/17

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Prakash Surya (Inactive) Assignee: Amir Shehata (Inactive)
Resolution: Unresolved Votes: 0
Labels: llnl, patch

Severity: 3
Rank (Obsolete): 9760

 Description   

When creating an LNet self-test session without the lnet_selftest module loaded, I see the following message:

-bash-4.1# lst new_session stress
Invalid parameters list in command line

This is misleading as the real problem is not the parameter list, but the fact that lnet_selftest is not loaded. To an uninformed user, this can be extremely confusing.

From a quick scan of the code, the error occurs here:

        rc = lst_new_session_ioctl(name, timeout, force, &session_id);
        if (rc != 0) {
                lst_print_error("session", "Failed to create session: %s\n",
                                strerror(errno));
                return rc;
        }

Because the module is not loaded, the ioctl fails and sets errno to EINVAL. lst_print_error then interprets this error code and prints the misleading message. It would be much better if the tool reported something that hinted at a possibly unloaded module.

Something like the following would be much better:

-bash-4.1# lst new_session stress
Failed to create a new session with error EINVAL. This can be caused by incorrect parameters or because the lnet_selftest module is not loaded.
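
A minimal sketch of how such a message could be built is shown here. The helper name format_session_error is hypothetical and is not part of the lst source; it only illustrates adding the module hint when errno is EINVAL and falling back to the plain strerror text otherwise.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in lst): write a session-creation error
 * message for error code `err` into `buf`. For EINVAL the message also
 * mentions the unloaded-module possibility. Returns what snprintf returns.
 */
int format_session_error(char *buf, size_t len, int err)
{
        if (err == EINVAL)
                return snprintf(buf, len,
                                "Failed to create a new session: %s. "
                                "This can be caused by incorrect parameters "
                                "or because the lnet_selftest module is "
                                "not loaded.", strerror(err));

        return snprintf(buf, len, "Failed to create a new session: %s.",
                        strerror(err));
}
```

The hint is limited to EINVAL because that is the only error code this report ties to the missing module; other codes keep their ordinary description.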

Better yet, if the ioctl fails, the lst command could parse /proc/modules to determine if lnet_selftest is loaded.
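
The /proc/modules check could look roughly like the sketch below. The function name module_listed_in is hypothetical; the path is a parameter so the same code can be pointed at /proc/modules or at a test file. It relies only on the fact that each line of /proc/modules starts with the module name followed by whitespace-separated fields.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in lst): return 1 if `module` is the first
 * field of some line in the file at `path`, 0 if it is not listed,
 * and -1 if the file cannot be opened.
 */
int module_listed_in(const char *path, const char *module)
{
        char line[512];
        size_t want = strlen(module);
        int at_line_start = 1;
        int found = 0;
        FILE *fp = fopen(path, "r");

        if (fp == NULL)
                return -1;

        while (fgets(line, sizeof(line), fp) != NULL) {
                size_t n = strlen(line);

                if (at_line_start) {
                        /* Length of the first whitespace-delimited field. */
                        size_t len = strcspn(line, " \t\n");

                        if (len == want && strncmp(line, module, want) == 0) {
                                found = 1;
                                break;
                        }
                }
                /* A chunk without a trailing newline is a partial line. */
                at_line_start = (n > 0 && line[n - 1] == '\n');
        }

        fclose(fp);
        return found;
}
```

After a failed ioctl, lst could call module_listed_in("/proc/modules", "lnet_selftest") and print the module hint only when the result is 0.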



 Comments   
Comment by Prakash Surya (Inactive) [ 16/Aug/12 ]

strace of the program's execution:

-bash-4.1# strace lst new_session stress
execve("/usr/sbin/lst", ["lst", "new_session", "stress"], [/* 24 vars */]) = 0
brk(0)                                  = 0x10030000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=124024, ...}) = 0
mmap(NULL, 124024, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfff7bef0000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\35\r "..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2172712, ...}) = 0
mmap(NULL, 1982072, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfff7bd00000
mmap(0xfff7bec0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b0000) = 0xfff7bec0000
mmap(0xfff7bee0000, 15992, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfff7bee0000
close(3)                                = 0
mprotect(0xfff7bec0000, 65536, PROT_READ) = 0
mprotect(0xfff7bf60000, 65536, PROT_READ) = 0
munmap(0xfff7bef0000, 124024)           = 0
brk(0)                                  = 0x10030000
brk(0x10060000)                         = 0x10060000
open("/dev/lnet", O_RDWR)               = 3
ioctl(3, 0xc008653f, 0xfffef191700)     = -1 EINVAL (Invalid argument)
write(2, "Invalid parameters list in comma"..., 40Invalid parameters list in command line
) = 40
exit_group(-1)                          = ?
Comment by Peter Jones [ 16/Aug/12 ]

Isaac

Could you please take care of this one?

Thanks

Peter

Comment by Peter Jones [ 16/Aug/12 ]

Doug

Isaac is out this week so can you help instead?

Thanks

Peter

Comment by Andreas Dilger [ 17/Aug/12 ]

Possibly better would be to just modprobe lst and try it again. Easier for the user, and no need for a complex error message.

Comment by Doug Oucharek (Inactive) [ 17/Aug/12 ]

The modprobe approach would certainly be more user friendly. Unless someone sees an issue with attempting the modprobe on this error condition, I'll go with that approach.
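
As a rough sketch of the modprobe-and-retry idea (not the change under review), the logic could be structured as below. All names here are hypothetical. The operation and the loader are passed as function pointers so the retry logic can be exercised without root; demo_new_session and demo_load are stand-ins for the real ioctl wrapper and for modprobe.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*session_op_t)(void *arg);
typedef int (*module_loader_t)(const char *module);

/* Run "modprobe <module>"; return 0 if it exits with status 0, else -1. */
int load_with_modprobe(const char *module)
{
        int status;
        pid_t pid = fork();

        if (pid < 0)
                return -1;
        if (pid == 0) {
                execlp("modprobe", "modprobe", module, (char *)NULL);
                _exit(127);     /* only reached if exec failed */
        }
        while (waitpid(pid, &status, 0) < 0) {
                if (errno != EINTR)
                        return -1;
        }
        return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/*
 * Call op(arg). If it fails with EINVAL, try to load `module` with
 * `load` and call op(arg) one more time. Any other outcome is
 * returned unchanged.
 */
int call_with_module_retry(session_op_t op, void *arg,
                           module_loader_t load, const char *module)
{
        int rc = op(arg);
        int saved_errno;

        if (rc == 0 || errno != EINVAL)
                return rc;

        saved_errno = errno;
        if (load(module) != 0) {
                errno = saved_errno;    /* report the original failure */
                return rc;
        }
        return op(arg);
}

/* Stand-ins for illustration only: fail with EINVAL until "loaded". */
int demo_module_loaded;

int demo_load(const char *module)
{
        (void)module;
        demo_module_loaded = 1;
        return 0;
}

int demo_new_session(void *arg)
{
        (void)arg;
        if (!demo_module_loaded) {
                errno = EINVAL;
                return -1;
        }
        return 0;
}
```

In lst itself the loader argument would be load_with_modprobe with "lnet_selftest" as the module name. Retrying only once keeps a genuine parameter error from looping.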

Comment by Prakash Surya (Inactive) [ 17/Aug/12 ]

I agree, I like the idea of inserting the module for the user.

Comment by Doug Oucharek (Inactive) [ 22/Aug/12 ]

The change created to address this ticket is: http://review.whamcloud.com/3752

Comment by Doug Oucharek (Inactive) [ 06/Dec/13 ]

Amir: can you find an answer to Isaac's question in the Gerrit patch and rebase the patch once you do?
