Details
Type: Bug
Resolution: Fixed
Priority: Critical
Fix Version/s: Lustre 2.9.0
Labels: None
Severity: 3
Description
replay-single test_70c hung while mounting the MDS:
Starting mds1: /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1 CMD: onyx-33vm7 mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
Console log on MDS:
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache
LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.2.4.127@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 683 previous similar messages
Lustre: 6963:0:(client.c:2113:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1469762588/real 1469762588] req@ffff880051ebaa00 x1541153266605312/t0(0) o250->MGC10.2.4.126@tcp@0@lo:26/25 lens 520/544 e 0 to 1 dl 1469762613 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 6963:0:(client.c:2113:ptlrpc_expire_one_request()) Skipped 13 previous similar messages
Lustre: 29062:0:(service.c:1335:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88004a8faa00 x1541153729978528/t0(0) o101->6a772ed4-43ff-dc51-4d04-2c0278989dc2@10.2.4.120@tcp:-1/-1 lens 872/3512 e 24 to 0 dl 1469763017 ref 2 fl Interpret:/0/0 rc 0/0
Lustre: lustre-MDT0002: Client 6a772ed4-43ff-dc51-4d04-2c0278989dc2 (at 10.2.4.120@tcp) reconnecting
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: Skipped 3 previous similar messages
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: Skipped 6 previous similar messages
Lustre: lustre-MDT0002: Export ffff880057b24400 already connecting from 10.2.4.120@tcp
Lustre: Skipped 12 previous similar messages
LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.2.4.127@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 1909 previous similar messages
Lustre: 6963:0:(client.c:2113:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1469763188/real 1469763188] req@ffff8800546a1200 x1541153266628608/t0(0) o250->MGC10.2.4.126@tcp@0@lo:26/25 lens 520/544 e 0 to 1 dl 1469763213 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 6963:0:(client.c:2113:ptlrpc_expire_one_request()) Skipped 19 previous similar messages
INFO: task mdt00_002:29063 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mdt00_002       D ffffffffa0b1d108     0 29063      2 0x00000080
 ffff88004f7b3aa0 0000000000000046 ffff88004bc15c00 ffff88004f7b3fd8
 ffff88004f7b3fd8 ffff88004f7b3fd8 ffff88004bc15c00 ffffffffa0b1d100
 ffffffffa0b1d104 ffff88004bc15c00 00000000ffffffff ffffffffa0b1d108
Call Trace:
 [<ffffffff8163cb09>] schedule_preempt_disabled+0x29/0x70
 [<ffffffff8163a805>] __mutex_lock_slowpath+0xc5/0x1c0
 [<ffffffff81639c6f>] mutex_lock+0x1f/0x2f
 [<ffffffffa0a8e024>] nodemap_add_member+0x34/0x1b0 [ptlrpc]
 [<ffffffffa0dbf161>] mdt_obd_reconnect+0x81/0x1d0 [mdt]
 [<ffffffffa09d1e6f>] target_handle_connect+0x1c4f/0x2e30 [ptlrpc]
 [<ffffffffa0a6f5f2>] tgt_request_handle+0x3f2/0x1320 [ptlrpc]
 [<ffffffffa0a1bccb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
 [<ffffffffa0a19888>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
 [<ffffffff810b88d2>] ? default_wake_function+0x12/0x20
 [<ffffffff810af038>] ? __wake_up_common+0x58/0x90
 [<ffffffffa0a1fd80>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
 [<ffffffffa0a1f2e0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
 [<ffffffff810a5aef>] kthread+0xcf/0xe0
 [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
 [<ffffffff816469d8>] ret_from_fork+0x58/0x90
 [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
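The key frames are nodemap_add_member() sitting in mutex_lock() under mdt_obd_reconnect(): the MDT service thread is blocked on the global nodemap mutex while handling a client reconnect, and whichever thread holds that mutex is not visible in this trace. As a minimal userspace sketch of that pattern (illustration only, not Lustre code: the lock name follows the trace's nodemap code, and the lock holder's role is an assumption):

/*
 * Sketch of the hang shape seen in the trace, NOT Lustre code.
 * One thread holds a global nodemap mutex across a long operation
 * (the holder is hypothetical); the reconnect-path thread then
 * blocks in mutex_lock(), which in the kernel shows up as the
 * hung-task report above once the wait passes 120 seconds.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t active_config_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for whatever holds the nodemap mutex. */
static void *lock_holder(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&active_config_lock);
	printf("holder: took active_config_lock, working...\n");
	sleep(5);	/* long operation while holding the mutex */
	pthread_mutex_unlock(&active_config_lock);
	return NULL;
}

/* Stand-in for mdt_obd_reconnect() -> nodemap_add_member(). */
static void *reconnect_thread(void *arg)
{
	(void)arg;
	printf("mdt00_002: reconnect, adding member to nodemap\n");
	pthread_mutex_lock(&active_config_lock);	/* blocks here, as in the trace */
	printf("mdt00_002: got the lock, reconnect completes\n");
	pthread_mutex_unlock(&active_config_lock);
	return NULL;
}

int main(void)
{
	pthread_t holder, service;

	pthread_create(&holder, NULL, lock_holder, NULL);
	sleep(1);	/* ensure the holder takes the mutex first */
	pthread_create(&service, NULL, reconnect_thread, NULL);
	pthread_join(holder, NULL);
	pthread_join(service, NULL);
	return 0;
}

In this sketch the reconnect thread makes no progress until the holder releases the mutex; in the kernel the same wait is what drives the repeated "Export ... already connecting" messages and, past 120 seconds, the hung-task warning.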
Maloo reports:
https://testing.hpdd.intel.com/test_sets/3f6a9a0e-557a-11e6-906c-5254006e85c2
https://testing.hpdd.intel.com/test_sets/cecb3c06-54af-11e6-a39e-5254006e85c2
Issue Links
- is related to LU-3291 IU UID/GID Mapping Feature (Resolved)