[LU-13942] Check if exp_obd initialised and return error code to lctl user if not initialised Created: 03/Sep/20  Updated: 02/Jun/21  Resolved: 02/Jun/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Major
Reporter: Artem Blagodarenko (Inactive) Assignee: Artem Blagodarenko (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

NULL pointer dereference at the start of the obd_statfs() function.

Looks like a race between

PID: 4360   TASK: ffff94719c7fd140  CPU: 15  COMMAND: "lctl"
 #0 [ffff94719c7bb8b0] machine_kexec at ffffffff95a63674
 #1 [ffff94719c7bb910] __crash_kexec at ffffffff95b1cf02
 #2 [ffff94719c7bb9e0] crash_kexec at ffffffff95b1cff0
 #3 [ffff94719c7bb9f8] oops_end at ffffffff9616e758
 #4 [ffff94719c7bba20] no_context at ffffffff9615cafe
 #5 [ffff94719c7bba70] __bad_area_nosemaphore at ffffffff9615cb95
 #6 [ffff94719c7bbac0] bad_area_nosemaphore at ffffffff9615cd06
 #7 [ffff94719c7bbad0] __do_page_fault at ffffffff961716b0
 #8 [ffff94719c7bbb40] do_page_fault at ffffffff96171915
 #9 [ffff94719c7bbb70] page_fault at ffffffff9616d758
    [exception RIP: obd_statfs.constprop.43+36]
    RIP: ffffffffc1a47d64  RSP: ffff94719c7bbc20  RFLAGS: 00010246
    RAX: 0000000000000001  RBX: 000000000000b2c7  RCX: 0000000000000001
    RDX: 000000000000b2c7  RSI: ffff94719c7bbd40  RDI: 0000000000000000
    RBP: ffff94719c7bbc60   R8: ffff94716feace40   R9: 0000000000000000
    R10: 0000000000001000  R11: ffffffff95bd609d  R12: 0000000000000000
    R13: 000000000000b2c7  R14: ffff94719c7bbd40  R15: 0000000000000001
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff94719c7bbc68] ll_statfs_internal at ffffffffc1a4fd9d [lustre]
#11 [ffff94719c7bbd38] filesfree_show at ffffffffc1a5df6b [lustre]
#12 [ffff94719c7bbde8] lustre_attr_show at ffffffffc13ffe79 [obdclass]
#13 [ffff94719c7bbdf8] sysfs_kf_seq_show at ffffffff95ccbeaf
#14 [ffff94719c7bbe18] kernfs_seq_show at ffffffff95cca5e6
#15 [ffff94719c7bbe28] seq_read at ffffffff95c68b50
#16 [ffff94719c7bbe98] kernfs_fop_read at ffffffff95ccaf35
#17 [ffff94719c7bbed8] vfs_read at ffffffff95c4118f
#18 [ffff94719c7bbf08] sys_read at ffffffff95c4204f
#19 [ffff94719c7bbf50] system_call_fastpath at ffffffff96176ddb
    RIP: 00007f399f7c66e0  RSP: 00007fff98d7e7e0  RFLAGS: 00010206
    RAX: 0000000000000000  RBX: 00000000006480c0  RCX: 0000000000648100
    RDX: 0000000000001000  RSI: 0000000000648100  RDI: 0000000000000003
    RBP: 000000000064810a   R8: 00000000006480e0   R9: 0000000000001000
    R10: 00007fff98d7e360  R11: 0000000000000246  R12: 0000000000648100
    R13: 0000000000000001  R14: 0000000000000000  R15: 0000000000000003
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

and

PID: 4043   TASK: ffff9471ca155140  CPU: 3   COMMAND: "mount.lustre"
 #0 [ffff947178f3f7b8] __schedule at ffffffff96169b97
 #1 [ffff947178f3f848] schedule at ffffffff9616a099
 #2 [ffff947178f3f858] schedule_timeout at ffffffff96167b71
 #3 [ffff947178f3f908] wait_for_completion at ffffffff9616a44d
 #4 [ffff947178f3f968] llog_process_or_fork at ffffffffc13ddc14 [obdclass]
 #5 [ffff947178f3f9d0] llog_process at ffffffffc13ddef4 [obdclass]
 #6 [ffff947178f3f9e0] class_config_parse_llog at ffffffffc1411b65 [obdclass]
 #7 [ffff947178f3fa28] mgc_process_cfg_log at ffffffffc19a08c8 [mgc]
 #8 [ffff947178f3fab0] mgc_process_log at ffffffffc19a1c23 [mgc]
 #9 [ffff947178f3fb70] mgc_process_config at ffffffffc19a37f3 [mgc]
#10 [ffff947178f3fbf0] lustre_process_log at ffffffffc141d9b8 [obdclass]
#11 [ffff947178f3fc88] ll_fill_super at ffffffffc1a4dc55 [lustre]
#12 [ffff947178f3fd78] lustre_fill_super at ffffffffc1423b03 [obdclass]
#13 [ffff947178f3fdb0] mount_nodev at ffffffff95c452df
#14 [ffff947178f3fde8] lustre_mount at ffffffffc141b808 [obdclass]
#15 [ffff947178f3fe10] mount_fs at ffffffff95c45e5e
#16 [ffff947178f3fe58] vfs_kern_mount at ffffffff95c63a07
#17 [ffff947178f3fe90] do_mount at ffffffff95c6602f
#18 [ffff947178f3ff18] sys_mount at ffffffff95c66e63
#19 [ffff947178f3ff50] system_call_fastpath at ffffffff96176ddb
    RIP: 00007ff8530ed60a  RSP: 00007ffc04d9e948  RFLAGS: 00010206
    RAX: 00000000000000a5  RBX: 0000000000000000  RCX: 0000000001000000
    RDX: 0000000000409e34  RSI: 00007ffc04da4cf8  RDI: 0000000000615010
    RBP: 0000000000000000   R8: 0000000000615420   R9: 0000000000000001
    R10: 0000000001000000  R11: 0000000000000206  R12: 00007ffc04da4cf8
    R13: 00000000fffffff5  R14: 0000000000000301  R15: 0000000000615420
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b
exp_obd is filled in by ll_fill_super() -> client_common_fill_super(), but the mount process is stuck in lustre_process_log() and has not reached client_common_fill_super() yet.

This command was executed before the client mount completed:

crash> ps -a 4360
PID: 4360   TASK: ffff94719c7fd140  CPU: 15  COMMAND: "lctl"
ARG: lctl get_param llite/snx11214-ffff947163641800/filesfree 
ENV: SHELL=/bin/bash
     USER=admin
     PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
     PWD=/
     SHLVL=1
     HOME=/home/admin
     LOGNAME=admin
     _=/usr/sbin/lctl
 

Solution - check whether exp_obd is initialized and return an error code to the lctl user if it is not.
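
A minimal user-space sketch of the kind of guard described above, not the landed patch. The struct and function names (sketch_obd, sketch_export, sketch_statfs) are hypothetical stand-ins for the kernel structures, and the choice of -ENODEV as the error code is an assumption; the ticket does not say which code is returned.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the Lustre definitions. */
struct sketch_obd {
	unsigned long files_free;
};

struct sketch_export {
	struct sketch_obd *exp_obd;	/* NULL until mount setup fills it in */
};

/*
 * Store the free-file count and return 0, or return a negative errno
 * when the export is not ready yet. -ENODEV is an assumed choice here.
 */
static int sketch_statfs(const struct sketch_export *exp,
			 unsigned long *files_free)
{
	/* Refuse to dereference anything the mount has not set up yet. */
	if (exp == NULL || exp->exp_obd == NULL)
		return -ENODEV;

	*files_free = exp->exp_obd->files_free;
	return 0;
}
```

With a guard of this shape, a read of filesfree that arrives while the mount is still processing its configuration log would get an error back instead of faulting.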

Workaround - check that the mount has completed before calling lctl get_param.



 Comments   
Comment by Gerrit Updater [ 03/Sep/20 ]

Artem Blagodarenko (artem.blagodarenko@hpe.com) uploaded a new patch: https://review.whamcloud.com/39812
Subject: LU-13942 obd: check if sbi->ll_md_exp is initialized
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5d4a7b564673a4b2fe7148af62e0f89fe7cf3797

Comment by Gerrit Updater [ 02/Jun/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39812/
Subject: LU-13942 obd: check if sbi->ll_md_exp is initialized
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1de8c3739d6bac76b63a0e59a8aadf4ce491f88a

Comment by Peter Jones [ 02/Jun/21 ]

Landed for 2.15
