[LU-8010] lfs hsm command hangs up after lfs hsm_cancel Created: 12/Apr/16 Updated: 27/Sep/16 Resolved: 16/May/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Tatsushi Takamura | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HSM |
| Issue Links: |
|
| Severity: | 3 |
| Project: | HSM |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
When an hsm_restore request is canceled, subsequent lfs hsm commands hang, as in the two scenarios below.

(1)
# lfs hsm_state /lustre/file
/lustre/file: (0x0000000d) released exists archived, archive_id:1
# lfs hsm_restore /lustre/file
# lfs hsm_cancel /lustre/file
# lfs hsm_restore /lustre/file    (quickly after lfs hsm_cancel)
# lfs hsm_restore /lustre/file    *hang up

The call trace is as follows:

PID: 9550  TASK: ffff880c33f86040  CPU: 13  COMMAND: "lfs"
 #0 [ffff880c1ca8f568] schedule at ffffffff81539170
 #1 [ffff880c1ca8f640] schedule_timeout at ffffffff8153a042
 #2 [ffff880c1ca8f6f0] ldlm_completion_ast at ffffffffa0847fc9 [ptlrpc]
 #3 [ffff880c1ca8f7a0] ldlm_cli_enqueue_fini at ffffffffa08420e6 [ptlrpc]
 #4 [ffff880c1ca8f840] ldlm_cli_enqueue at ffffffffa08429a1 [ptlrpc]
 #5 [ffff880c1ca8f8f0] mdc_enqueue at ffffffffa0a6e8aa [mdc]
 #6 [ffff880c1ca8fa40] lmv_enqueue at ffffffffa0a25bfb [lmv]
 #7 [ffff880c1ca8fac0] ll_layout_refresh_locked at ffffffffa0b084f6 [lustre]
 #8 [ffff880c1ca8fc00] ll_layout_refresh at ffffffffa0b09159 [lustre]
 #9 [ffff880c1ca8fc50] vvp_io_init at ffffffffa0b5227f [lustre]
#10 [ffff880c1ca8fcc0] cl_io_init0 at ffffffffa06a8e78 [obdclass]
#11 [ffff880c1ca8fd00] cl_io_init at ffffffffa06abdf4 [obdclass]
#12 [ffff880c1ca8fd40] cl_glimpse_size0 at ffffffffa0b4bc05 [lustre]
#13 [ffff880c1ca8fda0] ll_getattr at ffffffffa0b07cb8 [lustre]
#14 [ffff880c1ca8fe40] vfs_getattr at ffffffff81197c61
#15 [ffff880c1ca8fe80] vfs_fstatat at ffffffff81197cf4
#16 [ffff880c1ca8fee0] vfs_lstat at ffffffff81197d9e
#17 [ffff880c1ca8fef0] sys_newlstat at ffffffff81197dc4
#18 [ffff880c1ca8ff80] tracesys at ffffffff8100b2e8 (via system_call)

(2)
# lfs hsm_state /lustre/file
/lustre/file: (0x0000000d) released exists archived, archive_id:1
# lfs hsm_restore /lustre/file
<MDS Immediate reset>
<MDS recover and HSM setup>
# lfs hsm_action /lustre/file
/lustre/file: RESTORE waiting (from 0 to EOF)
# lfs hsm_state /lustre/file
/lustre/file: (0x0000000d) released exists archived, archive_id:1
# lfs hsm_cancel /lustre/file
# lfs hsm_restore /lustre/file    *hang up

The call trace is as follows:

PID: 3731  TASK: ffff880c35087520  CPU: 13  COMMAND: "lfs"
 #0 [ffff880c327df598] schedule at ffffffff81539170
 #1 [ffff880c327df670] schedule_timeout at ffffffff8153a042
 #2 [ffff880c327df720] ptlrpc_set_wait at ffffffffa0860c41 [ptlrpc]
 #3 [ffff880c327df7e0] ptlrpc_queue_wait at ffffffffa0861301 [ptlrpc]
 #4 [ffff880c327df800] mdc_ioc_hsm_request at ffffffffa0a65138 [mdc]
 #5 [ffff880c327df830] mdc_iocontrol at ffffffffa0a66df9 [mdc]
 #6 [ffff880c327df950] obd_iocontrol at ffffffffa0a16aa5 [lmv]
 #7 [ffff880c327df9a0] lmv_iocontrol at ffffffffa0a2d9dc [lmv]
 #8 [ffff880c327dfb90] obd_iocontrol at ffffffffa0af2d15 [lustre]
 #9 [ffff880c327dfbe0] ll_dir_ioctl at ffffffffa0af9ae8 [lustre]
#10 [ffff880c327dfe60] vfs_ioctl at ffffffff811a7972
#11 [ffff880c327dfea0] do_vfs_ioctl at ffffffff811a7b14
#12 [ffff880c327dff30] sys_ioctl at ffffffff811a8091
#13 [ffff880c327dff80] tracesys at ffffffff8100b2e8 (via system_call)
|
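For reference, reproducer (1) condensed into a minimal shell sketch. It reuses only the lfs commands above; the FILE path, the prior archive setup (archive_id 1, file already released), and the timeout/echo wrapper are illustrative assumptions for an unattended run, not part of the original report.

    #!/bin/sh
    # Reproducer (1) sketch: cancel a restore, then quickly re-issue restores.
    # Assumes /lustre/file is already archived and released, i.e. hsm_state
    # reports "released exists archived, archive_id:1".
    FILE=/lustre/file

    lfs hsm_state "$FILE"              # expect: released exists archived
    lfs hsm_restore "$FILE"            # start a restore
    lfs hsm_cancel "$FILE"             # cancel it
    lfs hsm_restore "$FILE"            # re-issue quickly after the cancel
    # The next request is the one that hangs; bound it so the script returns.
    timeout 300 lfs hsm_restore "$FILE" || echo "hsm_restore hung (timed out)"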
| Comments |
| Comment by Bruno Faccini (Inactive) [ 13/Apr/16 ] |
|
Thanks for this report and the reproducer. I am presently working on a similar platform and trying to reproduce. |
| Comment by Tatsushi Takamura [ 14/Apr/16 ] |
|
The syslog messages differ between the two patterns.

(1)
Apr 14 13:31:04 mds kernel: Lustre: lustre-MDT0000: Client 4e2ff398-da98-562d-99e7-cd38a8709ba2 (at 192.168.128.85@o2ib) reconnecting
Apr 14 13:31:04 mds kernel: Lustre: lustre-MDT0000: Connection restored to ce03607c-8712-5fed-7d04-6d296439e868 (at 192.168.128.85@o2ib)

The following are possibly related messages in the syslog of the client on which I ran the lfs command:

Apr 14 13:31:04 clinet kernel: Lustre: lustre-MDT0000-mdc-ffff880c35677c00: Connection to lustre-MDT0000 (at 192.168.128.81@o2ib) was lost; in progress operations using this service will wait for recovery to complete
Apr 14 13:31:04 clinet kernel: Lustre: Skipped 1 previous similar message
Apr 14 13:31:04 clinet kernel: LustreError: 6021:0:(ldlm_request.c:125:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1460607964, 300s ago), entering recovery for lustre-MDT0000_UUID@192.168.128.81@o2ib ns: lustre-MDT0000-mdc-ffff880c35677c00 lock: ffff880c3471dac0/0x241d687a0e4c12e9 lrc: 4/1,0 mode: --/CR res: [0x200000bd1:0x1:0x0].0x0 bits 0x8 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x936186c7472a77bf expref: -99 pid: 6021 timeout: 0 lvb_type: 3
Apr 14 13:31:04 clinet kernel: Lustre: lustre-MDT0000-mdc-ffff880c35677c00: Connection restored to 192.168.128.81@o2ib (at 192.168.128.81@o2ib)

(2)
Apr 14 14:16:09 mds kernel: LNet: Service thread pid 3763 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Apr 14 14:16:09 mds kernel: Pid: 3763, comm: mdt02_004
Apr 14 14:16:09 mds kernel:
Apr 14 14:16:09 mds kernel: Call Trace:
Apr 14 14:16:09 mds kernel: [<ffffffff8153a042>] schedule_timeout+0x192/0x2e0
Apr 14 14:16:09 mds kernel: [<ffffffff81089be0>] ? process_timeout+0x0/0x10
Apr 14 14:16:09 mds kernel: [<ffffffffa08b9e30>] ? ldlm_expired_completion_wait+0x0/0x250 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffffa08be989>] ldlm_completion_ast+0x609/0x9b0 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
Apr 14 14:16:09 mds kernel: [<ffffffffa08bdd8e>] ldlm_cli_enqueue_local+0x21e/0x810 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffffa08be380>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffffa1025130>] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa10318e4>] mdt_object_local_lock+0x3a4/0xb00 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa1025130>] ? mdt_blocking_ast+0x0/0x2e0 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa08be380>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffffa06cf791>] ? lu_object_find_at+0xb1/0xe0 [obdclass]
Apr 14 14:16:09 mds kernel: [<ffffffffa10323b3>] mdt_object_lock_internal+0x63/0x310 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa1032874>] mdt_object_lock+0x14/0x20 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa1032a2e>] mdt_object_find_lock+0x5e/0x170 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa106a683>] mdt_hsm_add_actions+0x973/0x12b0 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffff8129957a>] ? strlcpy+0x4a/0x60
Apr 14 14:16:09 mds kernel: [<ffffffffa103d50f>] ? ucred_set_jobid+0x5f/0x70 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa106116c>] mdt_hsm_request+0x60c/0x9b0 [mdt]
Apr 14 14:16:09 mds kernel: [<ffffffffa094ec2c>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffffa08fbc61>] ptlrpc_main+0xd21/0x1800 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffff815391be>] ? thread_return+0x4e/0x7d0
Apr 14 14:16:09 mds kernel: [<ffffffffa08faf40>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
Apr 14 14:16:09 mds kernel: [<ffffffff810a0fce>] kthread+0x9e/0xc0
Apr 14 14:16:09 mds kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Apr 14 14:16:09 mds kernel: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
Apr 14 14:16:09 mds kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Apr 14 14:16:09 mds kernel:
Apr 14 14:16:09 mds kernel: LustreError: dumping log to /tmp/lustre-log.1460610969.3763
Apr 14 14:17:49 mds kernel: LustreError: 3763:0:(ldlm_request.c:106:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1460610768, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-lustre-MDT0000_UUID lock: ffff880c18f31980/0x605de39e648b48a7 lrc: 3/0,1 mode: --/EX res: [0x200000402:0x1:0x0].0x0 bits 0x8 rrc: 3 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 3763 timeout: 0 lvb_type: 0
Apr 14 14:17:49 mds kernel: LustreError: dumping log to /tmp/lustre-log.1460611069.3763

There are no related messages in the syslog of the client on which I ran the lfs command.
|
| Comment by Bruno Faccini (Inactive) [ 21/Apr/16 ] |
|
In both reproducer scenarios, the root cause is the layout lock not being released, leading to hung threads on both the client and MDS sides during handling of HSM requests on the same file. I think I have found a hole in the HSM code that can leave a layout lock taken for a restore unreleased. I will push a tentative patch to fix it. |
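Not part of the original comment — just an illustrative way to observe the orphaned-lock symptom from the client, reusing only commands already shown in the description; the path is illustrative and the 300s bound mirrors the ldlm completion timeout seen in the syslog above, it is an assumption rather than a documented tunable.

    # Run right after the cancel in reproducer (1); paths are illustrative.
    lfs hsm_action /lustre/file   # does the coordinator still list the RESTORE?
    lfs hsm_state  /lustre/file   # file still shows "released exists archived"
    # The next restore needs the layout lock (client side in scenario (1),
    # MDS side in scenario (2)) and blocks; bound it so the check terminates:
    timeout 300 lfs hsm_restore /lustre/file || echo "request blocked on the layout lock"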
| Comment by Gerrit Updater [ 21/Apr/16 ] |
|
Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/19710 |
| Comment by Bruno Faccini (Inactive) [ 22/Apr/16 ] |
|
Hello Tatsushi, |
| Comment by Tatsushi Takamura [ 26/Apr/16 ] |
|
Hello Bruno, I have confirmed that the command hang-up from reproducer (1) is resolved, but the restore command still hangs up with reproducer (2). |
| Comment by Bruno Faccini (Inactive) [ 29/Apr/16 ] |
|
Right, I double-checked and reproducer #2 is another condition, leading to the same consequence (an orphan layout lock), that must be fixed separately. In a few words, my understanding is that a restore that is in progress when an MDT failure occurs must be set back to the WAITING state so it can be replayed (if legal) upon CDT restart. This will allow it to be cancelled without leaving the associated layout locked. |
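To make the intended post-patch behaviour concrete, a hedged sketch of how reproducer (2) should then play out. Only commands from the description are used; the 60s bound and the echo messages are illustrative assumptions, not taken from the patch or its test.

    # After an MDT failover/restart while a restore was in flight
    # (reproducer (2) in the description); path is illustrative.
    lfs hsm_action /lustre/file    # expected: RESTORE waiting (from 0 to EOF)
    lfs hsm_cancel /lustre/file    # cancel should now release the layout lock
    timeout 60 lfs hsm_restore /lustre/file \
        && echo "restore request accepted, no hang" \
        || echo "still hanging"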
| Comment by Tatsushi Takamura [ 08/May/16 ] |
|
Sorry for the late reply. I have checked with your new patch, and neither reproducer triggers the hang-up any more. |
| Comment by Gerrit Updater [ 16/May/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19710/ |
| Comment by Peter Jones [ 16/May/16 ] |
|
Landed for 2.9 |