HSM _not only_ small fixes and to do list goes here
(LU-3647)
|
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.5.0 |
| Type: | Technical task | Priority: | Blocker |
| Reporter: | Jinshan Xiong (Inactive) | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HSM, MB | | |
| Rank (Obsolete): | 10098 |
| Description |
|
I did a quota test today and found a problem with hsm_release. The test script is as follows:

#!/bin/bash
setup() {
    # mount the test filesystem and enable the HSM coordinator on the MDTs
    ( cd srcs/lustre/lustre/tests; sh llmount.sh )
    lctl set_param mdt.*.hsm_control=enabled
    # start the POSIX copytool with a fresh archive directory
    rm -rf /tmp/arc
    mkdir /tmp/arc
    ~/srcs/lustre/lustre/utils/lhsmtool_posix --daemon --hsm-root /tmp/arc /mnt/lustre
    # enable user quota enforcement on the OSTs and MDT
    lctl conf_param lustre.quota.ost=u
    lctl conf_param lustre.quota.mdt=u
}
LFS=~/srcs/lustre/lustre/utils/lfs
file=/mnt/lustre/testfile
setup
rm -f $file
# create a 30 MB test file owned by the quota test user, then archive it
dd if=/dev/zero of=$file bs=1M count=30
chown tstusr.tstusr $file
set -x
$LFS hsm_archive $file
# wait until the file state reports it as archived
while $LFS hsm_state $file | grep -qv archived; do
    sleep 1
done
$LFS hsm_state $file
# enable full debug logging and clear the current debug log
lctl set_param debug=-1
lctl set_param debug_mb=500
lctl dk > /dev/null
count=0
# release and immediately restore the archived file, forever
while :; do
    lctl mark "############# $count"
    count=$((count+1))
    $LFS hsm_release $file
    $LFS hsm_state $file
    $LFS hsm_restore $file
    $LFS hsm_state $file
    sleep 1
done
The output on the console before the script hung:

+ /Users/jinxiong/srcs/lustre/lustre/utils/lfs hsm_state /mnt/lustre/testfile
+ grep -qv archived
+ /Users/jinxiong/srcs/lustre/lustre/utils/lfs hsm_state /mnt/lustre/testfile
/mnt/lustre/testfile: (0x00000009) exists archived, archive_id:1
+ lctl set_param debug=-1
debug=-1
+ lctl set_param debug_mb=500
debug_mb=500
+ lctl dk
+ count=0
+ :
+ lctl mark '############# 0'
+ count=1
+ /Users/jinxiong/srcs/lustre/lustre/utils/lfs hsm_release /mnt/lustre/testfile

It looks like the MDT thread was hung while finding the local root object; for an unknown reason, the local root object was being deleted. This sounds impossible, but it happened:

LNet: Service thread pid 2945 was inactive for 40.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Pid: 2945, comm: mdt_rdpg00_001

Call Trace:
[<ffffffffa03c466e>] cfs_waitq_wait+0xe/0x10 [libcfs]
[<ffffffffa056ffa7>] lu_object_find_at+0xb7/0x360 [obdclass]
[<ffffffff81063410>] ? default_wake_function+0x0/0x20
[<ffffffffa0570266>] lu_object_find+0x16/0x20 [obdclass]
[<ffffffffa0bf5b16>] mdt_object_find+0x56/0x170 [mdt]
[<ffffffffa0c264ef>] mdt_mfd_close+0x15ef/0x1b60 [mdt]
[<ffffffffa03d3900>] ? libcfs_debug_vmsg2+0xba0/0xbb0 [libcfs]
[<ffffffffa0c27e32>] mdt_close+0x682/0xac0 [mdt]
[<ffffffffa0bffa4a>] mdt_handle_common+0x52a/0x1470 [mdt]
[<ffffffffa0c39365>] mds_readpage_handle+0x15/0x20 [mdt]
[<ffffffffa0709a55>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
[<ffffffffa03c454e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[<ffffffffa03d540f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
[<ffffffffa03d3951>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffff81055ad3>] ? __wake_up+0x53/0x70
[<ffffffffa070ad9d>] ptlrpc_main+0xacd/0x1710 [ptlrpc]
[<ffffffffa070a2d0>] ? ptlrpc_main+0x0/0x1710 [ptlrpc]
[<ffffffff81096a36>] kthread+0x96/0xa0
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffff810969a0>] ? kthread+0x0/0xa0
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20

I suspect this issue is related to quota, because when I turned quota off everything worked fine. |
| Comments |
| Comment by Peter Jones [ 05/Sep/13 ] |
|
Niu, could you please comment on this one? Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 09/Sep/13 ] |
|
This can be reproduced even without the chown operation, and I'm not sure how the quota code can affect this test, since almost no quota code is involved. I'll investigate further. |
| Comment by Niu Yawei (Inactive) [ 10/Sep/13 ] |
|
I think I see the problem: lu_object_put_nocache(obj) marks an object as dying, so when lu_object_find_at() finds the dying object, it waits for the dying object to be freed and then retries the lookup. The problem with this logic is that if the object is held by somebody (and so never gets freed), lu_object_find_at() will wait on the dying object forever. In this specific case: the root object is held by LFSCK (lfsck->li_local_root), and the quota code calls the local storage API to create quota files: local_oid_storage_init() -> lastid_compat_check() -> lu_object_put_nocache(root). The root object is then marked as dying but is never freed. Given the mechanism of lu_object_put_nocache(), I think nobody should hold on to any object. nasf, what do you think? Could we just remove li_local_root and get the object on demand? |
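To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It is an illustration only, not the actual obdclass code; the struct and helper names are invented and the reference counts are simplified.

```c
#include <stdio.h>
#include <stdlib.h>

struct obj {
	int refcount;	/* references held by users, e.g. a cached li_local_root */
	int dying;	/* set by the "nocache" put so the object is not reused */
};

/* lu_object_put_nocache()-style release: mark the object dying and free
 * it only once the last reference is dropped. */
static void put_nocache(struct obj *o)
{
	o->dying = 1;
	if (--o->refcount == 0)
		printf("object freed, a waiting lookup could now retry\n");
}

/* lu_object_find_at()-style lookup: a dying object cannot be reused, so
 * the caller has to wait for it to be freed and then retry. */
static void lookup(const struct obj *o)
{
	if (o->dying && o->refcount > 0)
		printf("lookup would wait forever: dying object still has %d reference(s)\n",
		       o->refcount);
	else
		printf("lookup proceeds\n");
}

int main(void)
{
	struct obj *root = calloc(1, sizeof(*root));

	if (root == NULL)
		return 1;
	root->refcount = 2;	/* one ref held long-term, one by the local-storage caller */
	put_nocache(root);	/* the quota setup path drops its ref with the nocache put */
	lookup(root);		/* the HSM close path now "hangs" in the lookup */
	free(root);		/* cleanup for this demo only */
	return 0;
}
```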
| Comment by Jinshan Xiong (Inactive) [ 10/Sep/13 ] |
|
Probably we shouldn't use the nocache version of lu_object_put() at all. |
| Comment by Niu Yawei (Inactive) [ 11/Sep/13 ] |
| Comment by Jinshan Xiong (Inactive) [ 11/Sep/13 ] |
|
Why does it call lu_object_put_nocache() in the first place? |
| Comment by Alex Zhuravlev [ 11/Sep/13 ] |
|
> why does it call lu_object_put_nocache() in the first place?

Because different stacks may want to use this object and each expects its own slices (at different times). I don't think NOCACHE can really help here, because it just postpones potential issues. |
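A rough sketch of the "own slices" point, assuming the usual picture of a compound object assembled from one slice per layer of a device stack. The layer names and helper below are illustrative, not taken from the Lustre source.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SLICES 8

/* A compound object built from one slice per layer of a device stack. */
struct compound_obj {
	const char *slices[MAX_SLICES];
	int nr_slices;
};

/* Assemble the object the way a particular device stack would see it. */
static void assemble(struct compound_obj *o, const char **stack, int n)
{
	o->nr_slices = n;
	memcpy(o->slices, stack, n * sizeof(*stack));
}

int main(void)
{
	/* Hypothetical stacks: a local-storage user only needs the bottom
	 * layer, while the full MDT stack expects all of its layers. */
	const char *local_stack[] = { "osd" };
	const char *mdt_stack[]   = { "mdt", "mdd", "lod", "osd" };
	struct compound_obj cached;

	assemble(&cached, local_stack, 1);

	/* If this object stayed in the cache, a later MDT-stack lookup
	 * would get an object without the upper slices it expects; the
	 * nocache put evicts it so the next finder rebuilds the object
	 * with its own slice set. */
	if (cached.nr_slices != (int)(sizeof(mdt_stack) / sizeof(*mdt_stack)))
		printf("cached object has %d slice(s), the MDT stack expects %zu\n",
		       cached.nr_slices, sizeof(mdt_stack) / sizeof(*mdt_stack));
	return 0;
}
```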
| Comment by Jinshan Xiong (Inactive) [ 11/Sep/13 ] |
|
In that case, it seems it's not allowed to hold an object before the stack is fully initialized. |
| Comment by Alex Zhuravlev [ 11/Sep/13 ] |
|
It's not just stack initialization. The root object (/) has been accessed by a few components during runtime. I'm trying to recall all the details. |
| Comment by Niu Yawei (Inactive) [ 13/Sep/13 ] |
|
Hi Mike, any input on this? Is it possible to get rid of lu_object_put_nocache() for local storage? Thanks. |
| Comment by Alex Zhuravlev [ 13/Sep/13 ] |
|
As a short-term fix I'd suggest fixing LFSCK; there is no real need to hold the local root object. |
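As a sketch of the "get the object on demand" idea (illustrative only; get_root()/put_root() are hypothetical stand-ins for the real lookup/put calls, not the actual LFSCK patch):

```c
#include <stdio.h>

struct root_obj {
	int refcount;
};

static struct root_obj root_cache;	/* stands in for the cached local root */

static struct root_obj *get_root(void)
{
	root_cache.refcount++;		/* take a reference only for this use */
	return &root_cache;
}

static void put_root(struct root_obj *r)
{
	r->refcount--;			/* drop it as soon as the use is done */
}

/* On-demand pattern: no reference outlives the call, so a concurrent
 * nocache put elsewhere can actually let the object be freed instead of
 * leaving it dying-but-pinned, as happens with a long-lived cached ref. */
static void lfsck_use_root(void)
{
	struct root_obj *root = get_root();

	printf("using root, refcount=%d\n", root->refcount);
	put_root(root);
}

int main(void)
{
	lfsck_use_root();
	printf("after use, refcount=%d (nothing pinned)\n", root_cache.refcount);
	return 0;
}
```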
| Comment by Jinshan Xiong (Inactive) [ 13/Sep/13 ] |
|
nasf, can you comment? |
| Comment by nasf (Inactive) [ 13/Sep/13 ] |
|
This is the patch to release the root reference held by LFSCK: |
| Comment by Peter Jones [ 20/Sep/13 ] |
|
Landed for 2.5.0 |