[LU-11391] soft lockup in ldlm_prepare_lru_list() Created: 18/Sep/18  Updated: 22/Nov/18
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Stephane Thiell | Assignee: | Yang Sheng |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Environment: | CentOS 7.5 patchfull and Lustre 2.11.55 on AMD EPYC servers |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
While testing the master branch (tag 2.11.55), I hit soft lockups in ldlm_prepare_lru_list() (workqueue: ldlm_pools_recalc_task) on the client when running mdtest from the IO-500 benchmark using a single client.

[212288.213417] NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [kworker/35:1:600]
[212288.221336] Modules linked in: mgc(OE) lustre(OE) lmv(OE) mdc(OE) osc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase rpcsec_gss_krb5 dell_rbu auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_ucm rpcrdma rdma_ucm ib_uverbs ib_iser ib_umad rdma_cm iw_cm libiscsi ib_ipoib scsi_transport_iscsi ib_cm mlx5_ib ib_core sunrpc vfat fat amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel dcdbas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg dm_multipath ccp dm_mod pcspkr shpchp i2c_piix4 ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[212288.294305] fb_sys_fops ttm mlx5_core crct10dif_pclmul drm ahci mlxfw crct10dif_common tg3 libahci crc32c_intel devlink megaraid_sas ptp libata i2c_core pps_core
[212288.307953] CPU: 35 PID: 600 Comm: kworker/35:1 Kdump: loaded Tainted: G OEL ------------ 3.10.0-862.9.1.el7_lustre.x86_64 #1
[212288.320378] Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.3.6 04/20/2018
[212288.328069] Workqueue: events ldlm_pools_recalc_task [ptlrpc]
[212288.333925] task: ffff908bfc470000 ti: ffff908bfc464000 task.ti: ffff908bfc464000
[212288.341491] RIP: 0010:[<ffffffff9fd08ff2>]  [<ffffffff9fd08ff2>] native_queued_spin_lock_slowpath+0x122/0x200
[212288.351518] RSP: 0018:ffff908bfc467be8  EFLAGS: 00000246
[212288.356918] RAX: 0000000000000000 RBX: 0000000000002000 RCX: 0000000001190000
[212288.364139] RDX: ffff906bffb99740 RSI: 0000000001b10000 RDI: ffff90abfa32953c
[212288.371358] RBP: ffff908bfc467be8 R08: ffff908bffb19740 R09: 0000000000000000
[212288.378577] R10: 0000fbd0948bcb20 R11: 7fffffffffffffff R12: ffff907ca99fd018
[212288.385796] R13: 0000000000000000 R14: 0000000000018b40 R15: 0000000000018b40
[212288.393017] FS:  00007f96c2fbb740(0000) GS:ffff908bffb00000(0000) knlGS:0000000000000000
[212288.401190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[212288.407023] CR2: 00007f8de6b4da88 CR3: 0000000aaf40e000 CR4: 00000000003407e0
[212288.414244] Call Trace:
[212288.416793]  [<ffffffffa0309510>] queued_spin_lock_slowpath+0xb/0xf
[212288.423151]  [<ffffffffa0316840>] _raw_spin_lock+0x20/0x30
[212288.428755]  [<ffffffffc0fb9280>] ldlm_pool_set_clv+0x20/0x40 [ptlrpc]
[212288.435391]  [<ffffffffc0f9c956>] ldlm_cancel_lrur_policy+0xd6/0x100 [ptlrpc]
[212288.442639]  [<ffffffffc0f9e4ca>] ldlm_prepare_lru_list+0x1fa/0x4c0 [ptlrpc]
[212288.449797]  [<ffffffffc0f9c880>] ? ldlm_iter_helper+0x20/0x20 [ptlrpc]
[212288.456522]  [<ffffffffc0fa3e31>] ldlm_cancel_lru+0x61/0x170 [ptlrpc]
[212288.463076]  [<ffffffffc0fb7741>] ldlm_cli_pool_recalc+0x231/0x240 [ptlrpc]
[212288.470148]  [<ffffffffc0fb785c>] ldlm_pool_recalc+0x10c/0x1f0 [ptlrpc]
[212288.476874]  [<ffffffffc0fb7abc>] ldlm_pools_recalc_delay+0x17c/0x1d0 [ptlrpc]
[212288.484208]  [<ffffffffc0fb7cd3>] ldlm_pools_recalc_task+0x1c3/0x260 [ptlrpc]
[212288.491431]  [<ffffffff9fcb35ef>] process_one_work+0x17f/0x440
[212288.497356]  [<ffffffff9fcb4686>] worker_thread+0x126/0x3c0
[212288.503016]  [<ffffffff9fcb4560>] ? manage_workers.isra.24+0x2a0/0x2a0
[212288.509629]  [<ffffffff9fcbb621>] kthread+0xd1/0xe0
[212288.514594]  [<ffffffff9fcbb550>] ? insert_kthread_work+0x40/0x40
[212288.520776]  [<ffffffffa03205e4>] ret_from_fork_nospec_begin+0xe/0x21
[212288.527300]  [<ffffffff9fcbb550>] ? insert_kthread_work+0x40/0x40
[212288.533479] Code: 13 48 c1 ea 0d 48 98 83 e2 30 48 81 c2 40 97 01 00 48 03 14 c5 a0 53 93 a0 4c 89 02 41 8b 40 08 85 c0 75 0f 0f 1f 44 00 00 f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 04 41 0f 18 09 8b

A crash dump was triggered and can be made available if anyone is interested; just let me know. Attaching vmcore-dmesg.txt and the output of foreach bt. The client was running the following part of the IO-500 benchmark:

[Starting] mdtest_easy_stat
[Exec] mpirun -np 24 /home/sthiell/io-500-dev/bin/mdtest -T -F -d /firbench/nodom/datafiles/io500.2018.09.17-19.30.06/mdt_easy -n 200000 -u -L -x /firbench/nodom/datafiles/io500.2018.09.17-19.30.06/mdt_easy-stonewall |
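For reference, one way to see how far the client's lock LRU has grown while mdtest is running is to query the standard LDLM tunables with lctl. A minimal sketch, assuming the stock parameter names (they may differ slightly between Lustre releases):

```
# On the client, while the benchmark is running: show how many DLM locks
# are currently cached per namespace and what the LRU limits are set to.
lctl get_param ldlm.namespaces.*.lock_count
lctl get_param ldlm.namespaces.*.lru_size
lctl get_param ldlm.namespaces.*.lru_max_age
```

A lock_count in the millions would be consistent with the recalc worker spending tens of seconds walking the LRU in ldlm_prepare_lru_list().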
| Comments |
| Comment by Yang Sheng [ 18/Sep/18 ] |
|
Hi Stephane, could you please upload the vmcore to our FTP site (ftp.whamcloud.com)? It would be best to pack it together with the debuginfo RPMs. Thanks, |
| Comment by Stephane Thiell [ 18/Sep/18 ] |
|
Hi Yang Sheng, done: uploaded as LU11391-vmcore-pack.tar with the debuginfo RPMs included. Hope that helps! Best, |
| Comment by Andreas Dilger [ 11/Oct/18 ] |
|
Stephane, could you please try setting the LDLM LRU size to avoid the LRU getting too large:

client$ lctl set_param ldlm.namespaces.*.lru_size=50000

This might avoid the lockup that you are seeing. We are looking at making this the default for an upcoming release, since it seems to be a common problem. |
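For anyone applying this, a hedged sketch of setting the value on a client and verifying it; the persistent -P form run on the MGS is an assumption and may need adjusting for your release:

```
# On the client: takes effect immediately but does not survive a remount.
lctl set_param ldlm.namespaces.*.lru_size=50000

# Verify the new limit.
lctl get_param ldlm.namespaces.*.lru_size

# On the MGS: push the setting to all clients persistently
# (assumes lctl set_param -P is supported for this parameter in your release).
lctl set_param -P ldlm.namespaces.*.lru_size=50000
```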
| Comment by Yang Sheng [ 12/Oct/18 ] |
|
Hi Stephane, I have investigated the vmcore. It looks like we lost the timing of the lockup. From the stack trace you attached, the thread was spinning on pl_lock. It looks like nothing should hold this lock for a long time except on the server side, but this instance is a client. Anyway, I'll try to reproduce it on my side. Thanks, |
| Comment by Johann Peyrard (Inactive) [ 20/Nov/18 ] |
|
We had the same issue last week. The only way we found to make these NMI soft lockup messages nearly disappear was to tune these two parameters:

$ lctl set_param ldlm.namespaces.*.lru_size=10000
$ lctl set_param ldlm.namespaces.*.lru_max_age=1000

Regards, Johann |
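A quick hedged sketch for confirming both tunables after setting them and watching whether the cached lock count now stays bounded; parameter paths are the standard lctl ones, and note that the units of lru_max_age have changed between Lustre versions, so double-check the value on your release:

```
# Confirm both limits on the client.
lctl get_param ldlm.namespaces.*.lru_size ldlm.namespaces.*.lru_max_age

# Watch the cached lock count during the workload; with the lower lru_size
# it should level off instead of growing without bound.
watch -n 5 'lctl get_param ldlm.namespaces.*.lock_count'
```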