[LU-10224] recovery-small test_57: timeout Created: 10/Nov/17  Updated: 03/May/18  Resolved: 11/Dec/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.11.0, Lustre 2.10.4

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Bruno Faccini (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by LU-5378 recovery-small test 57 crash on inval... Resolved
is duplicated by LU-10362 recovery-small/57: Oops in lprocfs_st... Closed
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for liuying <emoly.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/d938f700-c59c-11e7-9c63-52540065bddc.

The sub-test test_57 failed with the following error:

Timeout occurred after 630 mins, last suite running was recovery-small, restarting cluster to continue tests
[14587.215937] general protection fault: 0000 [#1] SMP 
[14587.226037] CPU: 0 PID: 4274 Comm: lctl Tainted: G           OE  ------------   3.10.0-693.5.2.el7.x86_64 #1
[14587.227003] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[14587.228344] RIP: 0010:[<ffffffffc0742fa1>]  [<ffffffffc0742fa1>] lprocfs_stats_collect+0xc1/0x140 [obdclass]
[14587.239740] Call Trace:
[14587.240004]  [<ffffffffc0743b87>] lprocfs_stats_seq_show+0x47/0x140 [obdclass]
[14587.240754]  [<ffffffff81226550>] seq_read+0x250/0x3b0
[14587.241283]  [<ffffffff812702cd>] proc_reg_read+0x3d/0x80
[14587.241831]  [<ffffffff81200b1c>] vfs_read+0x9c/0x170
[14587.242347]  [<ffffffff812019df>] SyS_read+0x7f/0xe0
[14587.242854]  [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b

Please provide additional information about the failure here.

Info required for matching: recovery-small 57



 Comments   
Comment by Steve Guminski (Inactive) [ 10/Nov/17 ]

I believe I also hit this same problem on master:

https://testing.hpdd.intel.com/test_sets/b286f820-c5a4-11e7-a066-52540065bddc

Comment by Andreas Dilger [ 12/Nov/17 ]

Found this in https://testing.hpdd.intel.com/sub_tests/5a997ade-c66c-11e7-9c63-52540065bddc


recovery-small test 57: read procfs entries causes kernel crash =================================== 20:33:25 (1510346005)
[19211.603958] Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
[19211.610675] Lustre: DEBUG MARKER: lsof -t /mnt/lustre
[19211.689718] Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1
[19211.713815] general protection fault: 0000 [#1] SMP
[19211.723227] CPU: 1 PID: 1225 Comm: lctl
[19211.736098] Call Trace:
[19211.736355] [<ffffffffc0745627>] lprocfs_stats_seq_show+0x47/0x140 [obdclass]
[19211.737040] [<ffffffff81226550>] seq_read+0x250/0x3b0
[19211.737536] [<ffffffff812702cd>] proc_reg_read+0x3d/0x80
[19211.738041] [<ffffffff81200b1c>] vfs_read+0x9c/0x170
[19211.738527] [<ffffffff812019df>] SyS_read+0x7f/0xe0
[19211.739000] [<ffffffff816b5089>] system_call_fastpath+0x16/0x1b

Comment by nasf (Inactive) [ 15/Nov/17 ]

+1 on master:
https://testing.hpdd.intel.com/test_sets/6e70c5fa-c9d5-11e7-a066-52540065bddc

Comment by Bruno Faccini (Inactive) [ 20/Nov/17 ]

+1 on master at https://testing.hpdd.intel.com/test_sets/065827f2-cc1d-11e7-9c63-52540065bddc

Comment by Bruno Faccini (Inactive) [ 22/Nov/17 ]

After having a look at an associated crash dump, my guess is that there is a possible race between the umount process and a concurrent "lctl get_param osc.<OSC>.stats", which is exactly what recovery-small/test_57 is intended to detect.

This seems to occur because osc_precleanup() calls ptlrpc_lprocfs_unregister_obd() before lprocfs_obd_cleanup(), allowing the obd->obd_svc_stats/lprocfs_stats struct to be freed before concurrent users of the associated proc file and internal structs are done with it and it can safely be cleaned up. Looks like a revival of LU-106!

I wonder if this can be fixed simply by swapping the order of ptlrpc_lprocfs_unregister_obd() and lprocfs_obd_cleanup() in osc_precleanup() (and maybe also in the other places where both routines are called in the same order, such as mdc_precleanup()/lwp_device_fini()/osp_init0()/osp_device_fini()).
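
To make the proposed ordering concrete, here is a rough sketch (this is only the idea described above, not the patch that eventually landed, and the precleanup signature is simplified): remove the proc entries first, so that proc_remove()/proc_entry_rundown() drains any reader still inside lprocfs_stats_seq_show(), and only then free the stats those readers dereference.

static int osc_precleanup(struct obd_device *obd)
{
        /* Sketch only: remove the obd proc entries first, so that
         * proc_entry_rundown() waits for any reader currently inside
         * lprocfs_stats_seq_show()/lprocfs_stats_collect(). */
        lprocfs_obd_cleanup(obd);

        /* Only now free obd->obd_svc_stats, which those readers were
         * dereferencing. */
        ptlrpc_lprocfs_unregister_obd(obd);

        return 0;
}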

Comment by Bruno Faccini (Inactive) [ 23/Nov/17 ]

Stacks showing the umount/lctl race scenario in the crash dump:

PID: 25606  TASK: ffff88007b579fa0  CPU: 1   COMMAND: "lctl"
bt: seek error: kernel virtual address: ffffffffffffffff  type: "cpu_online_map"
 #0 [ffff880069293b50] machine_kexec at ffffffff8105c4cb
 #1 [ffff880069293bb0] __crash_kexec at ffffffff81104a42
 #2 [ffff880069293c80] crash_kexec at ffffffff81104b30
 #3 [ffff880069293c98] oops_end at ffffffff816ad338
 #4 [ffff880069293cc0] die at ffffffff8102e97b
 #5 [ffff880069293cf0] do_general_protection at ffffffff816accbe
 #6 [ffff880069293d20] general_protection at ffffffff816ac568
    [exception RIP: lprocfs_stats_collect+193]
    RIP: ffffffffc0720a41  RSP: ffff880069293dd8  RFLAGS: 00010246
    RAX: 0000000000007c15  RBX: ffff880069293e20  RCX: 0000000000000006
    RDX: 6f2d313030305953  RSI: 6f2d313030305a53  RDI: 0000000000000006
    RBP: ffff880069293e00   R8: 0000000000000500   R9: 0000000000000100
    R10: 0000000000000000  R11: ffff880069293ca6  R12: ffff88007c15a2c0
    R13: 0000000000000020  R14: ffff880079723000  R15: ffff880068d94d00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff880069293e08] lprocfs_stats_seq_show at ffffffffc0721627 [obdclass]
 #8 [ffff880069293e78] seq_read at ffffffff81226550
 #9 [ffff880069293ee8] proc_reg_poll at ffffffff812702cd  <<<< or proc_reg_read() ???????
#10 [ffff880069293f08] vfs_read at ffffffff81200b1c
#11 [ffff880069293f38] sys_read at ffffffff812019df
#12 [ffff880069293f80] system_call_fastpath at ffffffff816b5089
    RIP: 00007f3b2efdc6f0  RSP: 00007fff00fef468  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: ffffffff816b5089  RCX: ffffffffffffffff
    RDX: 0000000000001000  RSI: 0000000001d74250  RDI: 0000000000000003
    RBP: 0000000001d7425a   R8: 0000000001d72430   R9: 0000000000001000
    R10: 0000000000000063  R11: 0000000000000246  R12: 00007fff00ff0610
    R13: 0000000001d74250  R14: 0000000000000001  R15: 0000000001d72410
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

PID: 25600  TASK: ffff88007ad72f70  CPU: 0   COMMAND: "umount"
 #0 [ffff880068ceb810] __schedule at ffffffff816a9005
 #1 [ffff880068ceb878] schedule at ffffffff816a9589
 #2 [ffff880068ceb888] schedule_timeout at ffffffff816a7099
 #3 [ffff880068ceb930] wait_for_completion at ffffffff816a993d
 #4 [ffff880068ceb990] proc_entry_rundown at ffffffff81270885
 #5 [ffff880068ceb9e0] remove_proc_subtree at ffffffff81276e52
 #6 [ffff880068ceba30] proc_remove at ffffffff81276efe
 #7 [ffff880068ceba40] lprocfs_obd_cleanup at ffffffffc071baff [obdclass]
 #8 [ffff880068ceba58] osc_precleanup at ffffffffc0ad8cd7 [osc]
 #9 [ffff880068ceba80] class_cleanup at ffffffffc072f1e2 [obdclass]
#10 [ffff880068cebaf8] class_process_config at ffffffffc0731b26 [obdclass]
#11 [ffff880068cebba8] class_manual_cleanup at ffffffffc0732746 [obdclass]
#12 [ffff880068cebc50] lov_putref at ffffffffc0b43a22 [lov]
#13 [ffff880068cebcb0] lov_disconnect at ffffffffc0b4ae02 [lov]
#14 [ffff880068cebcd8] obd_disconnect at ffffffffc0bc24f3 [lustre]
#15 [ffff880068cebcf8] ll_put_super at ffffffffc0bc5990 [lustre]
#16 [ffff880068cebe38] generic_shutdown_super at ffffffff81203692
#17 [ffff880068cebe60] kill_anon_super at ffffffff81203a62
#18 [ffff880068cebe78] lustre_kill_super at ffffffffc0734eb5 [obdclass]
#19 [ffff880068cebe90] deactivate_locked_super at ffffffff81203e19
#20 [ffff880068cebeb0] deactivate_super at ffffffff81204586
#21 [ffff880068cebec8] cleanup_mnt at ffffffff812217bf
#22 [ffff880068cebee0] __cleanup_mnt at ffffffff81221852
#23 [ffff880068cebef0] task_work_run at ffffffff810ad275
#24 [ffff880068cebf30] do_notify_resume at ffffffff8102ab62
#25 [ffff880068cebf50] int_signal at ffffffff816b533d
    RIP: 00007f403a9cc3e7  RSP: 00007ffc54bf3958  RFLAGS: 00000246
    RAX: 0000000000000000  RBX: 0000556135a43040  RCX: ffffffffffffffff
    RDX: 0000000000000001  RSI: 0000000000000000  RDI: 0000556135a445a0
    RBP: 0000556135a445a0   R8: 0000000000000008   R9: 000000000000000c
    R10: 00007ffc54bf3670  R11: 0000000000000246  R12: 00007f403b542d58
    R13: 0000000000000000  R14: 0000556135a43210  R15: 0000556135a43040
    ORIG_RAX: 00000000000000a6  CS: 0033  SS: 002b

Comment by Andreas Dilger [ 23/Nov/17 ]

Bruno, thanks for looking into this.

We used to have proper protection of the /proc entries, but I guess that was lost during the transition to sysfs or similar. Is there anything that landed just before this bug started appearing that might be the cause?

Comment by Bruno Faccini (Inactive) [ 23/Nov/17 ]

Andreas, according to LU-5378 and the history of Maloo test results, I believe this is a very old issue that has simply not been frequent enough to be addressed until now.

Also, while testing my fix proposal (a first try covering only the osc lprocfs), I am unable to reproduce the crash on a single-node setup, even with a heavier reproducer than recovery-small/test_57:

# while true ; do umount /mnt/lustre ; mount -t lustre -o rw,flock,user_xattr,lazystatfs 10.2.5.126@tcp:/lustre /mnt/lustre ; done &
# while true ; do lctl get_param osc.*-f*.stats ; done &

But I am still able to get the same crash by stressing the mdc lprocfs with this reproducer:

# while true ; do umount /mnt/lustre ; mount -t lustre -o rw,flock,user_xattr,lazystatfs 10.2.5.126@tcp:/lustre /mnt/lustre ; done &
# while true ; do lctl get_param mdc.*.stats ; done &

and when stressing the osp lprocfs with the same kind of reproducer:

# while true ; do umount /mnt/lustre-mds1 ; mount -t lustre -o loop /tmp/lustre-mdt1 /mnt/lustre-mds1 ; done &
# while true ; do lctl get_param osp.*.stats ; done &

So it seems that, as I suspected, my fix idea is valid and also needs to be applied to all the obd_device types I already identified and listed above.

Will try to push a full patch soon.

Comment by Gerrit Updater [ 24/Nov/17 ]

Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/30249
Subject: LU-10224 obd: free obd_svc_stats when all users are gone
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 1d4fc1f20bd701edb45ff3e2a26b734f7ba08e37

Comment by Gerrit Updater [ 11/Dec/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30249/
Subject: LU-10224 obd: free obd_svc_stats when all users are gone
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ffc843a0aacd78495b1cff51344aaee3e32fc2de

Comment by Peter Jones [ 11/Dec/17 ]

Landed for 2.11

Comment by Gerrit Updater [ 07/Feb/18 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/31197
Subject: LU-10224 obd: free obd_svc_stats when all users are gone
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 758bb9b569c14f30c0257fbb7f4f675b6f003040

Comment by Bob Glossman (Inactive) [ 01/Mar/18 ]

another on b2_10:
https://testing.hpdd.intel.com/test_sets/91b464c2-1ce1-11e8-a10a-52540065bddc

Comment by Gerrit Updater [ 03/May/18 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/31197/
Subject: LU-10224 obd: free obd_svc_stats when all users are gone
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: d9c595b530c5bc5f112698151bd1e9f4d74080cb
