[LU-4727] Lhsmtool_posix process stuck in ll_layout_refresh() when restoring - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: Lustre 2.8.0
Affects Version/s: Lustre 2.6.0, Lustre 2.5.1
Labels: HSM, cea
Severity: 3
Rank (Obsolete): 13001

Description

This is easy to reproduce. I hit the problem every time I run the following commands.

rm /mnt/lustre/XXXX -f;
echo XXX > /mnt/lustre/XXXX;
cat /mnt/lustre/XXXX;
lfs hsm_archive --archive=5 /mnt/lustre/XXXX;
cat /mnt/lustre/XXXX;
lfs hsm_release /mnt/lustre/XXXX;
cat /mnt/lustre/XXXX; # This will restore automatically
lfs hsm_release /mnt/lustre/XXXX;
lfs hsm_restore /mnt/lustre/XXXX; # lhsmtool_posix actually hangs here
cat /mnt/lustre/XXXX; # this gets stuck

After some time, the following messages showed up.

INFO: task flush-lustre-1:4106 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-lustre- D 0000000000000005 0 4106 2 0x00000080
ffff8808165b7830 0000000000000046 0000000000000000 0000000000000000
0000000000013180 0000000000000000 ffff880851fc10f8 ffff88082d4e0c00
ffff88082cb7fab8 ffff8808165b7fd8 000000000000fb88 ffff88082cb7fab8
Call Trace:
[<ffffffff814fc9fe>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff814fc89b>] mutex_lock+0x2b/0x50
[<ffffffffa0c2814c>] ll_layout_refresh+0x26c/0x1080 [lustre]
[<ffffffff813104bb>] ? mix_pool_bytes_extract+0x16b/0x180
[<ffffffff81135cf9>] ? zone_statistics+0x99/0xc0
[<ffffffffa059e007>] ? cfs_hash_bd_lookup_intent+0x37/0x130 [libcfs]
[<ffffffffa0c51230>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
[<ffffffffa08b7450>] ? ldlm_completion_ast+0x0/0x930 [ptlrpc]
[<ffffffffa06dbba1>] ? cl_io_slice_add+0xc1/0x190 [obdclass]
[<ffffffffa0c78410>] vvp_io_init+0x340/0x490 [lustre]
[<ffffffffa05a11aa>] ? cfs_hash_find_or_add+0x9a/0x190 [libcfs]
[<ffffffffa06daff8>] cl_io_init0+0x98/0x160 [obdclass]
[<ffffffffa06ddc14>] cl_io_init+0x64/0xe0 [obdclass]
[<ffffffffa0c1894d>] cl_sync_file_range+0x12d/0x500 [lustre]
[<ffffffffa0c46cac>] ll_writepages+0x9c/0x220 [lustre]
[<ffffffff81128d81>] do_writepages+0x21/0x40
[<ffffffff811a43bd>] writeback_single_inode+0xdd/0x290
[<ffffffff811a47ce>] writeback_sb_inodes+0xce/0x180
[<ffffffff811a492b>] writeback_inodes_wb+0xab/0x1b0
[<ffffffff811a4ccb>] wb_writeback+0x29b/0x3f0
[<ffffffff814fb3a0>] ? thread_return+0x4e/0x76e
[<ffffffff8107eb42>] ? del_timer_sync+0x22/0x30
[<ffffffff811a4fb9>] wb_do_writeback+0x199/0x240
[<ffffffff811a50c3>] bdi_writeback_task+0x63/0x1b0
[<ffffffff81091f97>] ? bit_waitqueue+0x17/0xd0
[<ffffffff811379e0>] ? bdi_start_fn+0x0/0x100
[<ffffffff81137a66>] bdi_start_fn+0x86/0x100
[<ffffffff811379e0>] ? bdi_start_fn+0x0/0x100
[<ffffffff81091d66>] kthread+0x96/0xa0
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffff81091cd0>] ? kthread+0x0/0xa0
[<ffffffff8100c140>] ? child_rip+0x0/0x20

It seems the copytool is waiting for md_enqueue(MDS_INODELOCK_LAYOUT), apparently while holding lli->lli_layout_mutex, so any other process that tries to lock lli->lli_layout_mutex gets stuck behind it. The problem does not clear until the lock enqueue times out and the client reconnects.
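
The blocking pattern described above (one task holds a mutex while it waits for a reply, and every other task that wants the mutex stalls) can be illustrated outside the kernel. The following is only an analogy under stated assumptions, not Lustre code: a `mkdir`-based lock stands in for lli->lli_layout_mutex, a FIFO read stands in for the layout lock enqueue, and `demo_blocked_waiter` is a hypothetical name.

```shell
#!/bin/sh
# Analogy only: mkdir on a path acts as the mutex, and reading from a FIFO
# acts as the enqueue reply that the holder is waiting for.
demo_blocked_waiter() {
    dir=$(mktemp -d) || return 1
    lock="$dir/layout_mutex"
    mkfifo "$dir/enqueue_reply" "$dir/holder_ready" || return 1

    # Holder: take the lock, report readiness, block on the reply, release.
    (
        mkdir "$lock" || exit 1
        echo ready > "$dir/holder_ready"
        read -r reply < "$dir/enqueue_reply"
        rmdir "$lock"
    ) &
    holder=$!
    read -r ready < "$dir/holder_ready"

    # Waiter: a try-lock fails while the holder is still waiting.
    if mkdir "$lock" 2>/dev/null; then
        during=acquired
        rmdir "$lock"
    else
        during=blocked
    fi

    # Deliver the reply; the holder finishes and releases the lock.
    echo granted > "$dir/enqueue_reply"
    wait "$holder"
    if mkdir "$lock" 2>/dev/null; then
        after=acquired
        rmdir "$lock"
    else
        after=blocked
    fi

    rm -rf "$dir"
    echo "$during $after"
}
```

Calling `demo_blocked_waiter` should print `blocked acquired`: the waiter cannot make progress until the reply arrives. In the ticket the reply does not arrive until the enqueue times out, which is why the waiters show up in the hung-task report.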

Issue Links

is duplicated by

LU-5196 HSM: client task stuck waiting for mutex in ll_layout_refresh

Resolved

is related to

LUDOC-252 Copytool Recommendations - Add/Clarify

Open

LU-4728 NULL pointer dereference in ldlm_cli_enqueue_local when enabling hsm_control after LU-4727 happends

Resolved

LU-6460 LLIF_FILE_RESTORING is not cleared at end of restore

Resolved

LU-4002 HSM restore vs unlink deadlock

Resolved

Activity

Jinshan Xiong (Inactive) added a comment - 23/Mar/15 7:16 PM

Which process is this? Judging by the name, it is not the copytool. John's patch only fixes the copytool case.

You will also need patch 13138 to cover this case if the restore takes longer than 120 seconds.

Frank Zago (Inactive) added a comment - 23/Mar/15 6:47 PM

Jinshan,

I got this trace when running stat on a file that I was restoring (or that had already been restored). I haven't been able to reproduce it so far. Is it the error you mention in your patch? John's patch is applied on that tree, and works well otherwise.

Mar 20 15:00:55 tasclient01 kernel: INFO: task stat:951 blocked for more than 120 seconds.
Mar 20 15:00:55 tasclient01 kernel:      Tainted: P           ---------------    2.6.32-431.17.1.el6.x86_64 #1
Mar 20 15:00:55 tasclient01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 20 15:00:55 tasclient01 kernel: stat          D 0000000000000000     0   951  29769 0x00000084
Mar 20 15:00:55 tasclient01 kernel: ffff8801d0471a58 0000000000000082 0000000000000000 000000000000000d
Mar 20 15:00:55 tasclient01 kernel: 0000000000000004 ffff880237fee800 ffff880116db5610 0000000000000630
Mar 20 15:00:55 tasclient01 kernel: ffff8802143f1098 ffff8801d0471fd8 000000000000fbc8 ffff8802143f1098
Mar 20 15:00:55 tasclient01 kernel: Call Trace:
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8152935e>] __mutex_lock_slowpath+0x13e/0x180
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff815291fb>] mutex_lock+0x2b/0x50
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa09e009a>] ll_layout_refresh+0x1da/0xc60 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff810f0e46>] ? ftrace_test_stop_func+0x16/0x20
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8100ad96>] ? ftrace_call+0x5/0x2b
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff810f0e46>] ? ftrace_test_stop_func+0x16/0x20
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa0a04ab0>] ? ll_md_blocking_ast+0x0/0x7d0 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa06996a0>] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa04dfe31>] ? cl_io_slice_add+0xc1/0x190 [obdclass]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa0a2d8f0>] vvp_io_init+0x340/0x490 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff810f0e46>] ? ftrace_test_stop_func+0x16/0x20
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8100ad96>] ? ftrace_call+0x5/0x2b
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa04df278>] cl_io_init0+0x98/0x160 [obdclass]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa04e1ea4>] cl_io_init+0x64/0xe0 [obdclass]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa0a23161>] cl_glimpse_size0+0x91/0x1d0 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa09d4c25>] ll_inode_revalidate_it+0x1a5/0x1d0 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa09d4c99>] ll_getattr_it+0x49/0x170 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffffa09d4df7>] ll_getattr+0x37/0x40 [lustre]
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff81227163>] ? security_inode_getattr+0x23/0x30
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8118e631>] vfs_getattr+0x51/0x80
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8118e6c4>] vfs_fstatat+0x64/0xa0
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff810f0e46>] ? ftrace_test_stop_func+0x16/0x20
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8118e76e>] vfs_lstat+0x1e/0x20
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8118e794>] sys_newlstat+0x24/0x50
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff810e1cc7>] ? audit_syscall_entry+0x1d7/0x200
Mar 20 15:00:55 tasclient01 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
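
For triage, hung-task reports like the ones in this ticket can be pulled out of a kernel log with a small filter. This is a minimal sketch, not part of Lustre or of any patch discussed here: `hung_tasks` is a hypothetical helper name, and it assumes the message keeps the "INFO: task NAME:PID blocked for more than N seconds." wording seen in the traces.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "NAME PID SECONDS" for each
# hung-task report. Any syslog prefix before "INFO:" is ignored.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds\..*/\1 \2 \3/p'
}
```

Typical use would be `dmesg | hung_tasks`; each output line holds the task name, the PID, and the timeout in seconds, and all other log lines are dropped.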

Gerrit Updater added a comment - 08/Mar/15 11:40 AM

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13750/
Subject: LU-4727 hsm: use IOC_MDC_GETFILEINFO in restore
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 96dbac2eaef7a5d1090807bedc9951279c06d037

Gerrit Updater added a comment - 12/Feb/15 7:59 PM

John L. Hammond (john.hammond@intel.com) uploaded a new patch: http://review.whamcloud.com/13750
Subject: LU-4727 hsm: use IOC_MDC_GETFILEINFO in restore
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8ec6354ded37f3e1f39d6e0336c9e17b1a97785b

Jinshan Xiong (Inactive) added a comment - 05/Feb/15 5:28 PM

The patch has been in Gerrit for a long time. Please let me know what I can do to move this forward, sigh.

Vinayak Hariharmath (Inactive) added a comment - 19/Dec/14 9:40 AM

http://review.whamcloud.com/13138 solves the problem on a single-node setup on a local VM. Thanks for the patch, Jinshan.

Jinshan Xiong (Inactive) added a comment - 19/Dec/14 4:26 AM

Please try patch 13138 and check whether it fixes the problem.

Gerrit Updater added a comment - 19/Dec/14 4:26 AM

Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/13138
Subject: LU-4727 hsm: flush UPDATE lock for restore
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bf2e4b958f60cb7eda9303ad0c079fd23ff2d16b

Jinshan Xiong (Inactive) added a comment - 18/Dec/14 11:57 PM - edited

I'm thinking about a solution for this problem.

Quoting Patrick:
"When you say 'flush UPDATE lock', how are you suggesting this be done? Take an update lock on the object, then take the layout lock? If so, when do we release the update lock? Before taking the layout lock, after getting the layout lock, or some other time?"

By "flush UPDATE lock", I meant acquiring the UPDATE lock and releasing it immediately.

Quoting Patrick:
"Is this comment in error? If not, what layout lock is it referring to/what sort of lock on layout? I ask because if it's a restore request, we take a layout lock, which seems to imply the caller did not have a layout lock already."

That comment refers to the layout lock taken inside the function, i.e., this code:

                        mdt_lock_reg_init(&crh->crh_lh, LCK_EX);
                        obj = mdt_object_find_lock(mti, &crh->crh_fid,
                                                   &crh->crh_lh,
                                                   MDS_INODELOCK_LAYOUT);
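
The "acquire the UPDATE lock and release it immediately" idea from this comment can be sketched in user space. This is an analogy under stated assumptions, not the MDT implementation: a `mkdir`-based lock stands in for the UPDATE lock, and `flush_lock` and `demo_flush` are hypothetical names. The point it shows is that a flush returns only after the current holder has let go, even though the flusher keeps nothing afterwards.

```shell
#!/bin/sh
# Analogy only: mkdir on a path stands in for a lock; this is not DLM code.

# "Flush": wait until the lock can be taken, then drop it right away.
flush_lock() {
    until mkdir "$1" 2>/dev/null; do sleep 1; done
    rmdir "$1"
}

demo_flush() {
    dir=$(mktemp -d) || return 1
    lock="$dir/update_lock"
    mkfifo "$dir/holder_ready" || return 1

    # Existing holder: keeps the lock briefly, records completion, releases.
    (
        mkdir "$lock" || exit 1
        echo ready > "$dir/holder_ready"
        sleep 1
        : > "$dir/holder_done"
        rmdir "$lock"
    ) &
    read -r ready < "$dir/holder_ready"

    # Returns only after the holder has released the lock.
    flush_lock "$lock"

    if [ -e "$dir/holder_done" ] && [ ! -d "$lock" ]; then
        result="holder finished, lock free"
    else
        result="unexpected state"
    fi
    wait
    rm -rf "$dir"
    echo "$result"
}
```

Calling `demo_flush` should print `holder finished, lock free`. In the ordering proposed in this ticket, such a flush would sit between adding the request to the global list and requesting the LAYOUT lock.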

Patrick Farrell (Inactive) added a comment - 12/Aug/14 7:20 PM

Jinshan - Looking at your description of a possible solution...
"So it requests LAYOUT lock and then add the request into a global list, we should change it to:
1. add to global list
2. flush UPDATE lock
3. request LAYOUT lock"

When you say "flush UPDATE lock", how are you suggesting this be done? Take an update lock on the object, then take the layout lock? If so, when do we release the update lock? Before taking the layout lock, after getting the layout lock, or some other time?

Also, this comment at the top of the function is confusing me:
" * in case of restore, caller must hold layout lock"
Is this comment in error? If not, what layout lock is it referring to/what sort of lock on layout? I ask because if it's a restore request, we take a layout lock, which seems to imply the caller did not have a layout lock already.

Li Xi (Inactive) added a comment - 19/May/14 2:24 AM

Hi all,

Is there any progress on this issue? It is really annoying when I am testing HSM. Is there at least an easy way to work around it? Using a dedicated mount point for the copytool does not help.

Thanks!

People

Assignee: Jinshan Xiong (Inactive)

Reporter: Li Xi (Inactive)

Votes: 0

Watchers: 19

Dates

Created: 07/Mar/14 3:32 AM

Updated: 25/Jan/22 8:54 PM

Resolved: 23/Apr/15 4:44 PM