Details
- Type: Bug
- Resolution: Unresolved
- Priority: Critical
- Fix Version/s: None
- Affects Version/s: Lustre 2.10.3
- Labels: None
- Environment: 3.10.0-693.2.2.el7_lustre.pl1.x86_64
- Severity: 3
Description
We are seeing more server instabilities with 2.10.3.
The problem started this morning with a hung OSS, `oak-io2-s2`. It's not clear what happened, and it will require some analysis of the crash dump. The first stack trace in the log is the following:
```
[1369496.805186] INFO: task systemd:1 blocked for more than 120 seconds.
[1369496.826365] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1369496.852691] systemd D ffffffff81a8ae08 0 1 0 0x00000000
[1369496.876463] ffff88015368fb50 0000000000000082 ffff880153698000 ffff88015368ffd8
[1369496.901403] ffff88015368ffd8 ffff88015368ffd8 ffff880153698000 ffffffff81a8ae00
[1369496.926336] ffffffff81a8ae04 ffff880153698000 00000000ffffffff ffffffff81a8ae08
[1369496.951268] Call Trace:
[1369496.959866] [<ffffffff816aa409>] schedule_preempt_disabled+0x29/0x70
[1369496.981629] [<ffffffff816a8337>] __mutex_lock_slowpath+0xc7/0x1d0
[1369497.002523] [<ffffffff816a774f>] mutex_lock+0x1f/0x2f
[1369497.019998] [<ffffffff8127f9a2>] sysfs_permission+0x32/0x60
[1369497.039175] [<ffffffff8120c31e>] __inode_permission+0x6e/0xc0
[1369497.058921] [<ffffffff8120c388>] inode_permission+0x18/0x50
[1369497.078096] [<ffffffff8120e44e>] link_path_walk+0x27e/0x8b0
[1369497.097273] [<ffffffff8120ebdb>] path_lookupat+0x6b/0x7b0
[1369497.115877] [<ffffffff811df555>] ? kmem_cache_alloc+0x35/0x1e0
[1369497.135925] [<ffffffff81211d2f>] ? getname_flags+0x4f/0x1a0
[1369497.155099] [<ffffffff8120f34b>] filename_lookup+0x2b/0xc0
[1369497.173988] [<ffffffff81212ec7>] user_path_at_empty+0x67/0xc0
[1369497.193750] [<ffffffff8124b919>] ? ep_scan_ready_list.isra.7+0x1b9/0x1f0
[1369497.216641] [<ffffffff81212f31>] user_path_at+0x11/0x20
[1369497.234673] [<ffffffff81206473>] vfs_fstatat+0x63/0xc0
[1369497.252433] [<ffffffff81206a94>] SYSC_newfstatat+0x24/0x60
[1369497.271340] [<ffffffff810eaaba>] ? get_monotonic_boottime+0x4a/0x100
[1369497.293088] [<ffffffff810aec01>] ? posix_get_boottime+0x11/0x20
[1369497.313423] [<ffffffff810b0271>] ? SyS_clock_gettime+0x81/0xc0
[1369497.333456] [<ffffffff81206cde>] SyS_newfstatat+0xe/0x10
[1369497.351774] [<ffffffff816b5009>] system_call_fastpath+0x16/0x1b
```
Because OSTs were unresponsive, we took a crash dump at 11:10:
- vmcore-oak-io2-s2-2018-02-21-11:10:27.gz https://stanford.box.com/s/bm7kgo5vndy6wylmnmo24qx26d377wge
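In case it helps, this is roughly how we would expect the dump to be examined offline with the `crash` utility. This is only a sketch: the debuginfo package name and vmlinux path are assumptions derived from the kernel version above, not something copied from the actual system.

```
# Sketch only: package name and vmlinux path are assumed from the kernel
# version (3.10.0-693.2.2.el7_lustre.pl1.x86_64), not taken from the system.
yum install kernel-debuginfo-3.10.0-693.2.2.el7_lustre.pl1.x86_64

gunzip vmcore-oak-io2-s2-2018-02-21-11:10:27.gz
crash /usr/lib/debug/lib/modules/3.10.0-693.2.2.el7_lustre.pl1.x86_64/vmlinux \
      vmcore-oak-io2-s2-2018-02-21-11:10:27

# Inside crash: dump the kernel log and the stacks of blocked (D-state) tasks.
crash> log
crash> ps -m | grep UN
crash> foreach UN bt
```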
Failing over the OSTs to the partner OSS seemed to work at first.
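(For context only: an OST failover in a typical setup boils down to mounting the OST block devices on the partner OSS. The commands below are a generic manual sketch with made-up device and mount-point names; a real deployment would normally let the HA stack, e.g. Pacemaker, do this.)

```
# Generic manual OST failover sketch; device paths and mount points are made up.
# On the failed OSS (if still reachable), stop serving the target:
umount /mnt/oak/ost0053

# On the partner OSS, mount the same backend device to take the target over:
mount -t lustre /dev/mapper/oak-ost0053 /mnt/oak/ost0053

# Confirm the OST is now active on this node:
lctl dl | grep OST0053
```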
Then users reported that some operations on directories were stuck, so I took a look at the MDS and found warnings and stack traces (see the attached file `oak-md1-s2.kernel.log` for the full logs). This is just the beginning:
```
Feb 21 11:20:54 oak-md1-s2 kernel: LustreError: 102992:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) oak-OST003f-osc-MDT0000: cannot cleanup orphans: rc = -11
Feb 21 11:20:54 oak-md1-s2 kernel: LustreError: 102992:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) Skipped 21 previous similar messages
Feb 21 11:28:39 oak-md1-s2 kernel: Lustre: oak-OST0051-osc-MDT0000: Connection restored to 10.0.2.105@o2ib5 (at 10.0.2.105@o2ib5)
Feb 21 11:28:39 oak-md1-s2 kernel: Lustre: Skipped 946 previous similar messages
Feb 21 11:28:47 oak-md1-s2 kernel: LustreError: 102692:0:(client.c:3007:ptlrpc_replay_interpret()) @@@ status -2, old was 0 req@ffff88020ad62a00 x1592931638579664/t4295689592(4295689592) o6->oak-OST0053-osc-MDT0000@10.0.2.105@o2ib5:28/4 lens 664/400 e 5 to 0 dl 1519241348 ref 2 fl Interpret:R/4/0 rc -2/-2
Feb 21 11:31:33 oak-md1-s2 kernel: INFO: task mdt01_002:102837 blocked for more than 120 seconds.
Feb 21 11:31:33 oak-md1-s2 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 21 11:31:33 oak-md1-s2 kernel: mdt01_002 D ffffffff00000000 0 102837 2 0x00000080
Feb 21 11:31:33 oak-md1-s2 kernel: ffff88102428f558 0000000000000046 ffff88027975cf10 ffff88102428ffd8
Feb 21 11:31:33 oak-md1-s2 kernel: ffff88102428ffd8 ffff88102428ffd8 ffff88027975cf10 ffff88027975cf10
Feb 21 11:31:33 oak-md1-s2 kernel: ffff88036428f248 ffff88036428f240 fffffffe00000001 ffffffff00000000
Feb 21 11:31:33 oak-md1-s2 kernel: Call Trace:
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff816a94e9>] schedule+0x29/0x70
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff816aadd5>] rwsem_down_write_failed+0x225/0x3a0
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff81332047>] call_rwsem_down_write_failed+0x17/0x30
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff816a87cd>] down_write+0x2d/0x3d
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc128f7f4>] lod_qos_prep_create+0xaa4/0x17f0 [lod]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc1017200>] ? qsd_op_begin+0xb0/0x4d0 [lquota]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc1094d10>] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc1290ab8>] lod_prepare_create+0x298/0x3f0 [lod]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc105807e>] ? osd_idc_find_and_init+0x7e/0x100 [osd_ldiskfs]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc128563e>] lod_declare_striped_create+0x1ee/0x970 [lod]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc1287b54>] lod_declare_create+0x1e4/0x540 [lod]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc12f3a1f>] mdd_declare_create_object_internal+0xdf/0x2f0 [mdd]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc12e4b63>] mdd_declare_create+0x53/0xe20 [mdd]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc12e8b69>] mdd_create+0x7d9/0x1320 [mdd]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc11ba9bc>] mdt_reint_open+0x218c/0x31a0 [mdt]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc09ef4ce>] ? upcall_cache_get_entry+0x20e/0x8f0 [obdclass]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc119faa3>] ? ucred_set_jobid+0x53/0x70 [mdt]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc11af8a0>] mdt_reint_rec+0x80/0x210 [mdt]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc119130b>] mdt_reint_internal+0x5fb/0x9c0 [mdt]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc1191832>] mdt_intent_reint+0x162/0x430 [mdt]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc119c59e>] mdt_intent_policy+0x43e/0xc70 [mdt]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0bbc672>] ? ldlm_resource_get+0x5e2/0xa30 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0bb5277>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0bde903>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c06ae0>] ? lustre_swab_ldlm_lock_desc+0x30/0x30 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c63ea2>] tgt_enqueue+0x62/0x210 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c67da5>] tgt_request_handle+0x925/0x1370 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c10b16>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c0d148>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff810c4822>] ? default_wake_function+0x12/0x20
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff810ba588>] ? __wake_up_common+0x58/0x90
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c14252>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffffc0c137c0>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff810b098f>] kthread+0xcf/0xe0
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff816b4f58>] ret_from_fork+0x58/0x90
Feb 21 11:31:33 oak-md1-s2 kernel: [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
Feb 21 11:31:33 oak-md1-s2 kernel: INFO: task mdt00_004:103144 blocked for more than 120 seconds.
```
Then, slightly later:
```
Feb 21 11:35:25 oak-md1-s2 kernel: LustreError: 103142:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519241425, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-oak-MDT0000_UUID lock: ffff8801fa7b4200/0xec47b06a6b3b9c78 lrc: 3/0,1 mode: --/CW res: [0x20000f271:0x2562:0x0].0x0 bits 0x2 rrc: 11 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 103142 timeout: 0 lvb_type: 0
Feb 21 11:35:25 oak-md1-s2 kernel: LustreError: dumping log to /tmp/lustre-log.1519241725.103142
Feb 21 11:35:27 oak-md1-s2 kernel: LustreError: 102833:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519241427, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-oak-MDT0000_UUID lock: ffff880afe56c400/0xec47b06a6b3d396a lrc: 3/0,1 mode: --/CW res: [0x20000f271:0x2562:0x0].0x0 bits 0x2 rrc: 11 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 102833 timeout: 0 lvb_type: 0
Feb 21 11:35:42 oak-md1-s2 kernel: LustreError: 220676:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519241442, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-oak-MDT0000_UUID lock: ffff880e0908aa00/0xec47b06a6b44312f lrc: 3/1,0 mode: --/PR res: [0x20000db01:0xda33:0x0].0x0 bits 0x13 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 220676 timeout: 0 lvb_type: 0
Feb 21 11:37:04 oak-md1-s2 kernel: LNet: Service thread pid 102832 completed after 399.98s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Feb 21 11:37:04 oak-md1-s2 kernel: LNet: Skipped 3 previous similar messages
Feb 21 11:37:06 oak-md1-s2 kernel: LustreError: 220680:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519241526, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-oak-MDT0000_UUID lock: ffff880248745a00/0xec47b06a6b61994d lrc: 3/1,0 mode: --/PR res: [0x20000c387:0x1420f:0x0].0x0 bits 0x13 rrc: 6 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 220680 timeout: 0 lvb_type: 0
Feb 21 11:38:44 oak-md1-s2 kernel: LNet: Service thread pid 103154 completed after 499.94s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Feb 21 11:38:57 oak-md1-s2 kernel: LNet: Service thread pid 103147 was inactive for 412.60s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
```
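(Side note: the `/tmp/lustre-log.*` files mentioned above are binary Lustre debug dumps; they can be converted to text with `lctl debug_file` if that helps the analysis. The output filename below is just an example.)

```
# Convert the binary Lustre debug dump referenced in the log to readable text.
# The output path is arbitrary (chosen here for illustration).
lctl debug_file /tmp/lustre-log.1519241725.103142 /tmp/lustre-log.1519241725.103142.txt
```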
We decided to do a stop/start of the MDT at 12:04:
```
Feb 21 12:04:28 oak-md1-s2 kernel: Lustre: Failing over oak-MDT0000
```
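(A stop/start of an MDT is normally just an unmount and remount on the MDS; the sketch below is generic, with a hypothetical device path and mount point.)

```
# Generic MDT stop/start sketch; device path and mount point are hypothetical.
umount /mnt/oak/mdt0                               # stop oak-MDT0000
mount -t lustre /dev/mapper/oak-mdt0 /mnt/oak/mdt0 # start it again; recovery then begins
```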
The full logs show many granted locks.
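(If useful for the analysis, the DLM lock counts on the MDS can also be sampled live with `lctl`; a small sketch, using the namespace name that appears in the log above:)

```
# Sketch: sample MDT DLM lock state on the MDS; namespace name taken from the log.
lctl get_param ldlm.namespaces.mdt-oak-MDT0000_UUID.lock_count
lctl get_param ldlm.namespaces.mdt-oak-MDT0000_UUID.resource_count
lctl get_param ldlm.namespaces.mdt-oak-MDT0000_UUID.pool.granted
```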
Recovery completed at 12:21 with 1 client evicted:
```
Feb 21 12:21:46 oak-md1-s2 kernel: Lustre: oak-MDT0000: Recovery over after 14:38, of 1263 clients 1262 recovered and 1 was evicted.
```
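(For reference, recovery progress can also be followed on the MDS through the MDT's recovery_status parameter; a minimal sketch:)

```
# Sketch: watch MDT recovery progress on the MDS; target name from the log above.
lctl get_param mdt.oak-MDT0000.recovery_status
# Reports status, recovery_duration, completed_clients, evicted_clients, etc.
```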
This fixed the directory access issues.
This is the third time I have seen this issue (after an OST failover) since the upgrade to 2.10.3. I never saw it with 2.10.1 + patches or with 2.10.2 (without patches). Here, all servers are running 2.10.3; the MDS is running 2.10.3 plus the first patch from LU-10680.
Note: most of our clients are still running 2.10.0 (~900) and 2.10.1 (~300), but we haven't seen any Lustre issues on the clients so far.
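(For reference, the running Lustre version on a node can be confirmed with `lctl`; the fan-out over clients below uses `pdsh` with made-up hostnames, purely as an illustration.)

```
# Report the locally running Lustre version (works on clients and servers).
lctl get_param version

# Illustration only (hostnames are made up): one way to check a set of clients.
pdsh -w oak-client[001-100] 'lctl get_param -n version'
```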
Thanks!