Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Fix Version/s: None
- Affects Version/s: Lustre 2.8.0, Lustre 2.9.0, Lustre 2.10.0, Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.3, Lustre 2.10.4, Lustre 2.10.5, Lustre 2.12.4
- Labels: None
- Environment: Server/Client: master, build #3225, RHEL 6.7
- Severity: 3
Description
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/1e79d2a6-7d21-11e5-a254-5254006e85c2.
The sub-test test_26 failed with the following error:
"test failed to respond and timed out"
Client dmesg:
Lustre: DEBUG MARKER: test_26 fail mds1 1 times
LustreError: 980:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1445937610, 300s ago), entering recovery for MGS@10.2.4.140@tcp ns: MGC10.2.4.140@tcp lock: ffff88007bdd82c0/0x956ab2c8047544d6 lrc: 4/1,0 mode: --/CR res: [0x65727473756c:0x2:0x0].0x0 rrc: 1 type: PLN flags: 0x1000000000000 nid: local remote: 0x223a79061b204538 expref: -99 pid: 980 timeout: 0 lvb_type: 0
Lustre: 29433:0:(client.c:2039:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1445937910/real 1445937910] req@ffff880028347980 x1516173751413108/t0(0) o250->MGC10.2.4.140@tcp@10.2.4.140@tcp:26/25 lens 520/544 e 0 to 1 dl 1445937916 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 29433:0:(client.c:2039:ptlrpc_expire_one_request()) Skipped 67 previous similar messages
MDS console:
09:22:17:LustreError: 24638:0:(client.c:1138:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff88004d92c980 x1516158358024328/t0(0) o101->lustre-MDT0000-lwp-MDT0000@0@lo:23/10 lens 456/496 e 0 to 0 dl 0 ref 2 fl Rpc:/0/ffffffff rc 0/-1
09:25:19:LustreError: 24638:0:(client.c:1138:ptlrpc_import_delay_req()) Skipped 6 previous similar messages
09:25:19:LustreError: 24638:0:(qsd_reint.c:55:qsd_reint_completion()) lustre-MDT0000: failed to enqueue global quota lock, glb fid:[0x200000006:0x10000:0x0], rc:-5
09:25:19:LustreError: 24638:0:(qsd_reint.c:55:qsd_reint_completion()) Skipped 1 previous similar message
09:25:19:INFO: task umount:24629 blocked for more than 120 seconds.
09:25:19: Not tainted 2.6.32-573.7.1.el6_lustre.x86_64 #1
09:25:19:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
09:25:19:umount D 0000000000000000 0 24629 24628 0x00000080
09:25:19: ffff880059e2bb48 0000000000000086 0000000000000000 00000000000708b7
09:25:20: 0000603500000000 000000ac00000000 00001c1fd9b9c014 ffff880059e2bb98
09:25:20: ffff880059e2bb58 0000000101d3458a ffff880076ee3ad8 ffff880059e2bfd8
09:25:20:Call Trace:
09:25:20: [<ffffffff8153a756>] __mutex_lock_slowpath+0x96/0x210
09:25:20: [<ffffffff8153a27b>] mutex_lock+0x2b/0x50
09:25:20: [<ffffffffa02cb30d>] mgc_process_config+0x1dd/0x1210 [mgc]
09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
09:25:20: [<ffffffffa07fe28d>] obd_process_config.clone.0+0x8d/0x2e0 [obdclass]
09:25:20: [<ffffffffa0476b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
09:25:20: [<ffffffffa08024c2>] lustre_end_log+0x262/0x6a0 [obdclass]
09:25:20: [<ffffffffa082efb1>] server_put_super+0x911/0xed0 [obdclass]
09:25:20: [<ffffffff811b0116>] ? invalidate_inodes+0xf6/0x190
09:25:20: [<ffffffff8119437b>] generic_shutdown_super+0x5b/0xe0
09:25:20: [<ffffffff81194466>] kill_anon_super+0x16/0x60
09:25:20: [<ffffffffa07fa096>] lustre_kill_super+0x36/0x60 [obdclass]
09:25:20: [<ffffffff81194c07>] deactivate_super+0x57/0x80
09:25:20: [<ffffffff811b4a7f>] mntput_no_expire+0xbf/0x110
09:25:20: [<ffffffff811b55cb>] sys_umount+0x7b/0x3a0
09:25:20: [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
Info required for matching: replay-dual test_26
Issue Links
- is duplicated by
  - LU-10771 replay-dual test_26: Kernel panic - not syncing: Out of memory and no killable processes (Open)
  - LU-11038 replay-dual test_26: MDS crash with BUG: unable to handle kernel NULL pointer dereference (Resolved)
  - LU-14749 runtests test 1 hangs on MDS unmount (Resolved)
- is related to
  - LU-14406 replay-dual test 22d fails with "Remote creation failed 1" (Open)
  - LU-14878 replay-dual test_26: dbench 5354 missing (Open)
  - LU-4572 hung mdt threads (Resolved)
  - LU-7640 stuck mdt thread required reboot of mds (Resolved)
  - LU-7692 LNet: Service thread Hung (Resolved)
  - LU-8502 replay-vbr: umount hangs waiting for mgs_ir_fini_fs (Resolved)
  - LU-482 Test failure on test suite replay-dual, subtest test_0a (Resolved)
  - LU-7725 Error unpacking OUT message (Resolved)
  - LU-7716 Do not do subdir check if source and target are in the same directory (Resolved)
  - LU-7765 replay-dual test 26 buggy redirection (Resolved)