Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.11.0
-
None
-
3
Description
n:lustre-release# bash $LUSTRE/tests/llmount.sh
...
n:lustre-release# time touch /mnt/lustre/f0
real 1m30.060s
user 0m0.001s
sys 0m0.004s
n:lustre-release# time touch /mnt/lustre/f1
real 2m0.058s
user 0m0.000s
sys 0m0.004s
libtoolizing lustre/utils creates a wrapper script for l_getidentity that doesn't work.
When invoked, the wrapper script prints the following to stderr:
/root/lustre-release/lustre/utils/l_getidentity: line 151: ls: command not found
/root/lustre-release/lustre/utils/l_getidentity: line 198: rm: command not found
/root/lustre-release/lustre/utils/l_getidentity: line 212: rm: command not found
/root/lustre-release/lustre/utils/l_getidentity: line 213: mv: command not found
/root/lustre-release/lustre/utils/l_getidentity: line 214: rm: command not found
/root/lustre-release/lustre/utils/l_getidentity: error: `/root/lustre-release/lustre/utils/.libs/lt-l_getidentity' does not exist
This script is just a wrapper for lt-l_getidentity.
See the libtool documentation for more information.
But this doesn't go anywhere because stderr is not connected to anything when l_getidentity is run.
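To see these messages outside the kernel upcall path, it may help to run the helper by hand in a similarly bare environment with stderr sent to a file. This is a minimal sketch, not how the upcall is actually launched: the function name run_like_upcall, the UPCALL_PATH default, and HOME=/ are assumptions made for illustration and are not taken from the Lustre source.

```shell
# Run a helper roughly the way a kernel upcall might: a minimal
# environment, stdin from /dev/null, and stderr captured to a log file.
# UPCALL_PATH is an assumed value chosen for illustration; adjust it to
# match what the kernel actually passes on your system.
UPCALL_PATH=${UPCALL_PATH:-/sbin:/usr/sbin}

run_like_upcall() {
    helper=$1    # path to the helper to execute (must contain a slash)
    log=$2       # file that receives the helper's stderr
    shift 2
    # env -i clears the inherited environment before running the helper.
    env -i HOME=/ PATH="$UPCALL_PATH" "$helper" "$@" </dev/null 2>"$log"
}
```

For example, `run_like_upcall "$LUSTRE/utils/l_getidentity" /tmp/l_getidentity.err <mdtname> <uid>` followed by `cat /tmp/l_getidentity.err` would show whatever the wrapper printed; the arguments here are placeholders.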
l_getidentity should not depend on liblustreapi. We should factor out whatever it needs in separate .c files and add them to l_getidentity dependencies.
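A quick way to check whether a build still produces a wrapper instead of a real binary is to look at the first two bytes of the built file. This is only a rough sketch: is_script is a hypothetical helper, and it detects any "#!" script, not libtool wrappers specifically.

```shell
# Succeeds if the file starts with "#!", i.e. it is a script (such as a
# libtool wrapper) rather than an ELF executable. Fails for binaries and
# for files that cannot be read.
is_script() {
    [ "$(head -c 2 "$1" 2>/dev/null)" = "#!" ]
}
```

After a rebuild, `is_script lustre/utils/l_getidentity && echo "still a wrapper script"` would flag the problem before the MDT ever tries the upcall.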
Also, why does it take 2 minutes for the operation to complete? It seems like we're not handling failure from the identity downcall very well. It gets stuck at:
n:~# stack1 mdt
8856 mdt00_003
[<ffffffffc0921e80>] upcall_cache_get_entry+0x1d0/0x8f0 [obdclass]
[<ffffffffc10120c7>] mdt_identity_get+0x17/0x50 [mdt]
[<ffffffffc0ff36eb>] old_init_ucred_common+0xdb/0x290 [mdt]
[<ffffffffc0ff39c7>] old_init_ucred+0x127/0x240 [mdt]
[<ffffffffc0ff5405>] mdt_init_ucred_intent_getattr+0x85/0xa0 [mdt]
[<ffffffffc0ff04f5>] mdt_intent_getattr+0xc5/0x470 [mdt]
[<ffffffffc0fe60b2>] mdt_intent_opc+0x442/0xad0 [mdt]
[<ffffffffc0fedc73>] mdt_intent_policy+0x1a3/0x360 [mdt]
[<ffffffffc0d042fa>] ldlm_lock_enqueue+0x38a/0x970 [ptlrpc]
[<ffffffffc0d2da33>] ldlm_handle_enqueue0+0x8f3/0x1400 [ptlrpc]
[<ffffffffc0db3752>] tgt_enqueue+0x62/0x210 [ptlrpc]
[<ffffffffc0dbb965>] tgt_request_handle+0x925/0x13b0 [ptlrpc]
[<ffffffffc0d5fc7e>] ptlrpc_server_handle_request+0x24e/0xab0 [ptlrpc]
[<ffffffffc0d63422>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[<ffffffff810b252f>] kthread+0xcf/0xe0
[<ffffffff816b8798>] ret_from_fork+0x58/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
It's sleeping at left = schedule_timeout(expiry) in upcall_cache_get_entry().
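To see which upcall the MDT is configured to run and how long it waits for the downcall, the identity tunables can be read on the MDS. This is a sketch under assumptions: the parameter names mdt.*.identity_upcall and mdt.*.identity_acquire_expire are as I recall them from the Lustre manual and should be confirmed with lctl list_param on the running version. The function name and the LCTL override variable are hypothetical.

```shell
# Print the identity upcall path and the acquire timeout for every MDT
# on this node. LCTL can be overridden to point at a different binary.
# The parameter names are assumptions; verify them with "lctl list_param".
show_identity_params() {
    lctl_cmd=${LCTL:-lctl}
    if ! command -v "$lctl_cmd" >/dev/null 2>&1; then
        echo "$lctl_cmd not found; run this on the MDS" >&2
        return 1
    fi
    # Quote the patterns so the shell does not expand the '*' itself.
    "$lctl_cmd" get_param 'mdt.*.identity_upcall' 'mdt.*.identity_acquire_expire'
}
```

If the acquire timeout turns out to be a divisor of the 1m30s and 2m0s delays above, that would suggest each touch waits out several full upcall timeouts rather than failing fast.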
BTW, libtool is terrible.