[LU-14779] lctl get_param does invalid DNS lookups Created: 22/Jun/21  Updated: 15/Jul/21  Resolved: 08/Jul/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Task Priority: Minor
Reporter: John Hammond Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
Related

 Description   

lctl get_param does DNS lookups to determine whether param components may be NIDs, but it doesn't always choose a sensible component. For example,

lctl get_param ldlm.namespaces.MGC1.2.3.4@o2ib.lock_unused_count

will do gethostbyname("namespaces.MGC1.2.3.4").

# gdb lctl
...
(gdb) b gethostbyname
Breakpoint 1 at 0x402d10
(gdb) run get_param ldlm.namespaces.MGC1.2.3.4@o2ib.lock_unused_count
Starting program: /root/ex/lustre-release/lustre/utils/.libs/lctl get_param ldlm.namespaces.MGC1.2.3.4@o2ib.lock_unused_count
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, gethostbyname (name=0x646070 "namespaces.MGC1.2.3.4") at ../nss/getXXbyYY.c:93
93	{
(gdb) bt
#0  gethostbyname (name=0x646070 "namespaces.MGC1.2.3.4") at ../nss/getXXbyYY.c:93
#1  0x00007ffff7bc6334 in libcfs_ip_str2addr () from /lib64/liblustreapi.so.1
#2  0x00007ffff7bc691f in libcfs_str2nid () from /lib64/liblustreapi.so.1
#3  0x0000000000414558 in clean_path (popt=popt@entry=0x7fffffffe070, 
    path=path@entry=0x7fffffffe4eb "ldlm.namespaces.MGC1.2.3.4@o2ib.lock_unused_count") at lustre_cfg.c:770
#4  0x00000000004164ae in jt_lcfg_getparam (argc=2, argv=0x7fffffffe1f0) at lustre_cfg.c:1210
#5  0x00007ffff7bc7911 in Parser_execarg () from /lib64/liblustreapi.so.1
#6  0x0000000000416f59 in lctl_main (argc=3, argv=0x7fffffffe1e8) at lctl.c:562
#7  0x00007ffff71be505 in __libc_start_main (main=0x403160 <main>, argc=3, argv=0x7fffffffe1e8, init=<optimized out>, fini=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=0x7fffffffe1d8) at ../csu/libc-start.c:266
#8  0x000000000040318e in _start ()

Similarly

get_param obdfilter.myfs-ost001d.exports.1.2.3.4@o2ib.ldlm_stats

will look up myfs-ost001d.exports.1.2.3.4 and exports.1.2.3.4.

This happens because we are using libcfs_str2nid() to detect NIDs, but that function is written to handle hostnames as well as NIDs. As the backtrace above shows, it calls libcfs_ip_str2addr(), which ends up in gethostbyname() for the string it is given.

This is not great for performance since, in DNS as in life, looking for something that doesn't exist tends to take a lot longer than looking for something that does.



 Comments   
Comment by John Hammond [ 22/Jun/21 ]

Testing locally, I also see lctl list_param -R "*" doing gethostbyname("namespaces.MGC192.168.122.75").

Comment by Andreas Dilger [ 22/Jun/21 ]

Could we just replace this NID lookup with a simple string comparison, something like "[MGC]*[0-9\\.]*@[a-f0-9]*"? I believe that would match all (currently) valid NIDs without doing any lookups or NID parsing/unparsing at all.

Alternatively, we could add a new libcfs_str2nid_numeric() that doesn't do any hostname lookups, since the parameter files will always have the numeric NIDs in them, and not hostnames.
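
For illustration only, here is a rough sketch of what a numeric-only check could look like. It is not the code from the patch, and looks_like_numeric_nid() is a hypothetical name, not an existing libcfs function. It assumes the caller has already isolated a single candidate string of the form "<IPv4 address>@<net>" and it only handles dotted-quad IPv4 addresses; other NID address formats are out of scope. It relies on inet_pton(), which parses the address locally and does not consult the resolver.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Lustre): return true if 's' has the form
 * "<dotted-quad IPv4>@<net>", e.g. "1.2.3.4@o2ib", using only local parsing.
 * inet_pton() never performs a hostname lookup, so no DNS traffic results.
 */
static bool looks_like_numeric_nid(const char *s)
{
	char addr[INET_ADDRSTRLEN];
	struct in_addr in;
	const char *at;
	const char *p;
	size_t len;

	assert(s != NULL);

	/* A NID needs an '@' separating the address from the network name. */
	at = strchr(s, '@');
	if (at == NULL)
		return false;

	/* Copy the address part; anything too long cannot be a dotted quad. */
	len = (size_t)(at - s);
	if (len == 0 || len >= sizeof(addr))
		return false;
	memcpy(addr, s, len);
	addr[len] = '\0';

	/* Strict numeric parse: returns 1 only for a valid IPv4 address. */
	if (inet_pton(AF_INET, addr, &in) != 1)
		return false;

	/* Assumed network name shape: a letter, then letters or digits. */
	p = at + 1;
	if (!isalpha((unsigned char)*p))
		return false;
	for (p++; *p != '\0'; p++)
		if (!isalnum((unsigned char)*p))
			return false;

	return true;
}
```

With a check like this, "1.2.3.4@o2ib" is accepted, while strings such as "exports.1.2.3.4@o2ib" or "namespaces.MGC1.2.3.4@o2ib" are rejected immediately instead of being handed to gethostbyname().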

Comment by Gerrit Updater [ 23/Jun/21 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44056
Subject: LU-14779 utils: no DNS lookups for NID in get_param
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 429170cabfc8ec4a173ca26f70a17fa4bf453804

Comment by Gerrit Updater [ 08/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/44056/
Subject: LU-14779 utils: no DNS lookups for NID in get_param
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f21c507fbca2afab5a5d97d4e816696a69d1c593

Comment by Peter Jones [ 08/Jul/21 ]

Landed for 2.15
