Details
- Type: Bug
- Resolution: Unresolved
- Priority: Minor
- Fix Version/s: None
- Affects Version/s: Lustre 2.10.4
- Labels: None
- Environment: SLES12SP3 -- 4.4.132-94.33-default client -- lustre 2.10.4;
  OSS & MDS on CentOS 7.3 -- lustre 2.10.0 (with a couple of hand-picked patches)
- Severity: 3
Description
I'm running a robinhood scan on a Lustre filesystem with 80M files. As the scan puts the system under memory pressure (80M * ~1kB per lustre_inode_cache entry is on the order of 80GB) and the kernel tries to shrink the slabs, I get these errors in /var/log/messages:
2018-06-20T10:49:34.165875-04:00 btsotbal3000 kernel: [ 4889.674359] Lustre: lustre-MDT0000-mdc-ffff883fd53bb000: Connection to lustre-MDT0000 (at 10.11.201.11@o2ib) was lost; in progress operations using this service will wait for recovery to complete
2018-06-20T10:51:38.493202-04:00 btsotbal3000 kernel: [ 5014.015342] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1529505581/real 0] req@ffff881f7b3513c0 x1603798909054112/t0(0) o103->lustre-OST008c-osc-ffff883fd53bb000@10.11.201.18@o2ib:17/18 lens 328/224 e 0 to 1 dl 1529505643 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2018-06-20T10:51:38.493217-04:00 btsotbal3000 kernel: [ 5014.017476] Lustre: lustre-OST008c-osc-ffff883fd53bb000: Connection to lustre-OST008c (at 10.11.201.18@o2ib) was lost; in progress operations using this service will wait for recovery to complete
2018-06-20T10:51:39.012492-04:00 btsotbal3000 kernel: [ 5014.527203] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1529505581/real 0] req@ffff881e37d8f080 x1603798909214432/t0(0) o103->lustre-OST0084-osc-ffff883fd53bb000@10.11.201.17@o2ib:17/18 lens 328/224 e 0 to 1 dl 1529505643 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2018-06-20T10:51:39.012505-04:00 btsotbal3000 kernel: [ 5014.527207] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 262 previous similar messages
2018-06-20T10:51:40.008987-04:00 btsotbal3000 kernel: [ 5015.533078] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1529505582/real 0] req@ffff881f2e08acc0 x1603798909677472/t0(0) o103->lustre-OST0089-osc-ffff883fd53bb000@10.11.201.18@o2ib:17/18 lens 328/224 e 0 to 1 dl 1529505644 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2018-06-20T10:51:40.008998-04:00 btsotbal3000 kernel: [ 5015.533083] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 553 previous similar messages
2018-06-20T10:51:42.022152-04:00 btsotbal3000 kernel: [ 5017.542234] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1529505582/real 0] req@ffff8818937e8340 x1603798910485936/t0(0) o103->lustre-OST00a9-osc-ffff883fd53bb000@10.11.200.15@o2ib:17/18 lens 328/224 e 0 to 1 dl 1529505644 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2018-06-20T10:51:42.022165-04:00 btsotbal3000 kernel: [ 5017.542239] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1102 previous similar messages
2018-06-20T10:51:46.030748-04:00 btsotbal3000 kernel: [ 5021.548246] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1529505584/real 0] req@ffff881e0945b640 x1603798911754992/t0(0) o103->lustre-OST00ab-osc-ffff883fd53bb000@10.11.200.15@o2ib:17/18 lens 328/224 e 0 to 1 dl 1529505648 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2018-06-20T10:51:46.030762-04:00 btsotbal3000 kernel: [ 5021.548249] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2104 previous similar messages
2018-06-20T10:51:54.040347-04:00 btsotbal3000 kernel: [ 5029.557301] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1529505589/real 0] req@ffff881e054d39c0 x1603798914360720/t0(0) o103->lustre-OST009f-osc-ffff883fd53bb000@10.11.200.14@o2ib:17/18 lens 328/224 e 0 to 1 dl 1529505658 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
2018-06-20T10:51:54.040360-04:00 btsotbal3000 kernel: [ 5029.557307] Lustre: 4222:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 4325 previous similar messages
2018-06-20T10:52:01.579715-04:00 btsotbal3000 kernel: [ 5037.097043] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [ptlrpcd_00_18:4222]
2018-06-20T10:52:01.579728-04:00 btsotbal3000 kernel: [ 5037.106479] Modules linked in: nfsv3 nfs fscache mpt3sas raid_class scsi_transport_sas mptctl mptbase osc(OEN) mgc(OEN) lustre(OEN) lmv(OEN) fld(OEN) mdc(OEN) fid(OEN) lov(OEN) ko2iblnd(OEN) ptlrpc(OEN) obdclass(OEN) lnet(OEN) libcfs(OEN) af_packet iscsi_ibft iscsi_boot_sysfs rdma_ucm(OEX) ib_ucm(OEX) rdma_cm(OEX) iw_cm(OEX) configfs ib_ipoib(OEX) ib_cm(OEX) ib_uverbs(OEX) ib_umad(OEX) mlx4_en(OEX) mlx4_ib(OEX) mlx4_core(OEX) msr xfs libcrc32c intel_rapl sb_edac edac_core ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp ixgbe kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mdio drbg ansi_cprng aesni_intel igb mgag200 ttm drm_kms_helper ipmi_si drm ipmi_devintf mei_me ipmi_msghandler pcspkr i2c_algo_bit fjes aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops iTCO_wdt iTCO_vendor_support dcdbas(X) dca shpchp lpc_ich mfd_core processor mei wmi button nfsd nfs_acl lockd auth_rpcgss grace sunrpc mlx5_ib(OEX) ext4 crc16 jbd2 mbcache ib_core(OEX) sd_mod mlx5_core(OEX) vxlan ip6_udp_tunnel udp_tunnel ptp pps_core devlink inet_lro ehci_pci mlx_compat(OEX) ehci_hcd ahci libahci usbcore libata megaraid_sas usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: dell_rbu]
2018-06-20T10:52:01.579731-04:00 btsotbal3000 kernel: [ 5037.106543] Supported: No, Unsupported modules are loaded
2018-06-20T10:52:01.579732-04:00 btsotbal3000 kernel: [ 5037.106546] CPU: 12 PID: 4222 Comm: ptlrpcd_00_18 Tainted: G OE NX 4.4.132-94.33-default #1
2018-06-20T10:52:01.579733-04:00 btsotbal3000 kernel: [ 5037.106548] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.0.2 03/15/2016
2018-06-20T10:52:01.579734-04:00 btsotbal3000 kernel: [ 5037.106549] task: ffff883f99d95640 ti: ffff883f99d98000 task.ti: ffff883f99d98000
2018-06-20T10:52:01.579735-04:00 btsotbal3000 kernel: [ 5037.106550] RIP: 0010:[<ffffffff810cb01f>] [<ffffffff810cb01f>] native_queued_spin_lock_slowpath+0x10f/0x1a0
2018-06-20T10:52:01.579736-04:00 btsotbal3000 kernel: [ 5037.106557] RSP: 0018:ffff883f99d9bc00 EFLAGS: 00000246
2018-06-20T10:52:01.579736-04:00 btsotbal3000 kernel: [ 5037.106557] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000340000
2018-06-20T10:52:01.579737-04:00 btsotbal3000 kernel: [ 5037.106558] RDX: ffff883fff05af80 RSI: 0000000000100001 RDI: ffff881ffae04500
2018-06-20T10:52:01.579737-04:00 btsotbal3000 kernel: [ 5037.106559] RBP: ffff883f99d9bce0 R08: ffff881fff99af80 R09: 0000000000000000
2018-06-20T10:52:01.579738-04:00 btsotbal3000 kernel: [ 5037.106560] R10: 000000000000002d R11: ffff881efa09381b R12: 00000000147f4371
2018-06-20T10:52:01.579739-04:00 btsotbal3000 kernel: [ 5037.106561] R13: 00000000051fd0dc R14: 0000000000000000 R15: ffff883f99d95640
2018-06-20T10:52:01.579740-04:00 btsotbal3000 kernel: [ 5037.106562] FS: 0000000000000000(0000) GS:ffff881fff980000(0000) knlGS:0000000000000000
2018-06-20T10:52:01.579740-04:00 btsotbal3000 kernel: [ 5037.106563] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2018-06-20T10:52:01.579740-04:00 btsotbal3000 kernel: [ 5037.106564] CR2: 00007f0e6c018898 CR3: 0000000001e0c000 CR4: 00000000003606f0
2018-06-20T10:52:01.579741-04:00 btsotbal3000 kernel: [ 5037.106565] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2018-06-20T10:52:01.579742-04:00 btsotbal3000 kernel: [ 5037.106566] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2018-06-20T10:52:01.579743-04:00 btsotbal3000 kernel: [ 5037.106567] Stack:
2018-06-20T10:52:01.579744-04:00 btsotbal3000 kernel: [ 5037.106568] ffffffff811915d2 ffffffffa0b98f15 ffffffffa0eee8db ffffffffa0eee8db
2018-06-20T10:52:01.579744-04:00 btsotbal3000 kernel: [ 5037.106569] ffffffffa0eee8db ffffffffa0eee8db 0000000000000000 00080000ffffffff
2018-06-20T10:52:01.579744-04:00 btsotbal3000 kernel: [ 5037.106571] 5b2a6a1100000000 00000000ffffffff ffff883f92821048 ffffffffa0eee8db
2018-06-20T10:52:01.579745-04:00 btsotbal3000 kernel: [ 5037.106573] Call Trace:
2018-06-20T10:52:01.579745-04:00 btsotbal3000 kernel: [ 5037.106581] [<ffffffff811915d2>] queued_spin_lock_slowpath+0x7/0xa
2018-06-20T10:52:01.579746-04:00 btsotbal3000 kernel: [ 5037.106605] [<ffffffffa0b98f15>] LNetMDUnlink+0x65/0x150 [lnet]
2018-06-20T10:52:01.579747-04:00 btsotbal3000 kernel: [ 5037.106677] [<ffffffffa0e854be>] ptlrpc_unregister_reply+0xee/0x6b0 [ptlrpc]
2018-06-20T10:52:01.579748-04:00 btsotbal3000 kernel: [ 5037.106703] [<ffffffffa0e89de3>] ptlrpc_expire_one_request+0xb3/0x420 [ptlrpc]
2018-06-20T10:52:01.579748-04:00 btsotbal3000 kernel: [ 5037.106726] [<ffffffffa0e8a1f9>] ptlrpc_expired_set+0xa9/0x180 [ptlrpc]
2018-06-20T10:52:01.579749-04:00 btsotbal3000 kernel: [ 5037.106752] [<ffffffffa0eb7679>] ptlrpcd+0x229/0x490 [ptlrpc]
2018-06-20T10:52:01.579749-04:00 btsotbal3000 kernel: [ 5037.106760] [<ffffffff8109ebc9>] kthread+0xc9/0xe0
2018-06-20T10:52:01.579750-04:00 btsotbal3000 kernel: [ 5037.106765] [<ffffffff816177ef>] ret_from_fork+0x3f/0x80
2018-06-20T10:52:01.579751-04:00 btsotbal3000 kernel: [ 5037.107943] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x80
2018-06-20T10:52:01.579752-04:00 btsotbal3000 kernel: [ 5037.107944]
2018-06-20T10:52:01.579752-04:00 btsotbal3000 kernel: [ 5037.107945] Leftover inexact backtrace:
2018-06-20T10:52:01.579753-04:00 btsotbal3000 kernel: [ 5037.107945]
2018-06-20T10:52:01.579753-04:00 btsotbal3000 kernel: [ 5037.107947] [<ffffffff8109eb00>] ? kthread_park+0x50/0x50
2018-06-20T10:52:01.579754-04:00 btsotbal3000 kernel: [ 5037.107948] Code: c1 e8 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 80 af 01 00 48 03 14 c5 20 0d f5 81 4c 89 02 41 8b 40 08 85 c0 75 0a f3 90 <41> 8b 40 08 85 c0 74 f6 4d 8b 08 4d 85 c9 74 08 41 0f 0d 09 eb
At that point, all the CPUs go to 100% and it seems like we are not making progress anymore.
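A hedged way to confirm the slab growth that precedes this: lustre_inode_cache is the slab named above, while ldlm_locks is my assumption about the name of the DLM lock slab.

# Watch the caches the shrinker has to reclaim.
# Columns in /proc/slabinfo: name active_objs num_objs objsize ...
grep -E 'lustre_inode_cache|ldlm_locks' /proc/slabinfo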
I can reproduce this by running robinhood for a few minutes and then dropping the caches (echo 2 > /proc/sys/vm/drop_caches), as sketched below.
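A minimal repro sketch, assuming a robinhood instance already configured for this filesystem (flags are illustrative):

robinhood --scan --once &             # populate the inode and DLM lock caches
sleep 300                             # let the scan cache a few million entries
echo 2 > /proc/sys/vm/drop_caches     # force slab reclaim; this triggers the lockup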
I managed to work around it by limiting lru_max_age to 10s (ldlm.namespaces.*.lru_max_age).
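In shell terms, the workaround is the following (a sketch; the parameter units differ across releases, historically milliseconds, with time suffixes such as "10s" accepted by newer clients):

lctl get_param ldlm.namespaces.*.lru_max_age      # inspect the current value/units first
lctl set_param ldlm.namespaces.*.lru_max_age=10000   # ~10s on a millisecond-based client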
Overall, it feels like when the slab shrinker runs and there are too many locks cached, reclaiming those locks takes a while, prevents the RPC stack from making progress, and we then hit a race condition in LNetMDUnlink.
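A hedged way to check whether a client is in this regime before dropping caches is to look at the per-namespace lock counts:

# Large counts mean the shrinker has many ldlm locks to cancel, which is
# the work that appears to starve the ptlrpcd threads above.
lctl get_param ldlm.namespaces.*.lock_count | sort -t= -k2 -n | tail -5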
Issue Links
- is related to:
  - LU-11075 suspicious code in ldlm_prepare_lru_list() - Resolved
  - LU-17428 reduce default value for lru_max_age to 300s - Resolved
- is related to:
  - LU-11189 OSC flow control - Open
  - LU-11079 Control concurrent statahead instances - Resolved
  - LU-13970 add an option to disable inode cache on Lustre client - Resolved