[LU-6811] Interop 2.5.3<->master racer test_1: short description of the failure Created: 08/Jul/15 Updated: 11/Sep/18 Resolved: 11/Sep/18 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: 2.5.3 |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for sarah_lw <wei3.liu@intel.com> This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/ba481c76-2513-11e5-9d0f-5254006e85c2. The sub-test test_1 failed with the following error: test failed to respond and timed out test log 01:00:17:Lustre: DEBUG MARKER: == racer test 1: racer on clients: shadow-13vm10.shadow.whamcloud.com,shadow-13vm9 DURATION=900 == 00:55:29 (1436230529) 01:00:17:Lustre: DEBUG MARKER: DURATION=900 MDSCOUNT=1 LFS=/usr/bin/lfs /usr/lib64/lustre/tests/racer/racer.sh /mnt/lustre/racer 01:00:17:Lustre: DEBUG MARKER: DURATION=900 MDSCOUNT=1 LFS=/usr/bin/lfs /usr/lib64/lustre/tests/racer/racer.sh /mnt/lustre2/racer 01:00:17:LustreError: 1701:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000402:0x2:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 30151:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000403:0x7a:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 1409:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000400:0x42:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 11-0: lustre-OST0003-osc-ffff88002f179000: operation ldlm_enqueue to node 10.1.4.156@tcp failed: rc = -12 01:00:17:LustreError: 11381:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000400:0x93:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 28706:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000400:0x11c:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 3238:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff88002f179000: inode [0x200000401:0x17b:0x0] mdc close failed: rc = -13 01:00:17:LustreError: 8305:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000400:0x155:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 8305:0:(llite_lib.c:1647:ll_setattr_raw()) Skipped 1 previous similar message 01:00:17:LustreError: 11-0: lustre-OST0000-osc-ffff880064962c00: operation ldlm_enqueue to node 10.1.4.156@tcp failed: rc = -12 01:00:17:LustreError: 8995:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000400:0x2b9:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 8995:0:(llite_lib.c:1647:ll_setattr_raw()) Skipped 8 previous similar messages 01:00:17:LustreError: 11-0: lustre-OST0000-osc-ffff88002f179000: operation ldlm_enqueue to node 10.1.4.156@tcp failed: rc = -12 01:00:17:LustreError: Skipped 1 previous similar message 01:00:17:LustreError: 11815:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000402:0x384:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 11815:0:(llite_lib.c:1647:ll_setattr_raw()) Skipped 14 previous similar messages 01:00:17:LustreError: 13286:0:(lcommon_cl.c:185:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x49a:0x0]: -16 01:00:17:LustreError: 4491:0:(lcommon_cl.c:185:cl_file_inode_init()) Failure to initialize cl object [0x200000401:0x564:0x0]: -16 01:00:17:6[24703]: segfault at 8 ip 000000385220b453 sp 00007fff93158fa0 error 4 in ld-2.12.so[3852200000+20000] 01:00:17:6[26914]: segfault at 8 ip 000000385220b453 sp 00007fff31d4c470 error 4 in ld-2.12.so[3852200000+20000] 01:00:17:LustreError: 11-0: lustre-MDT0000-mdc-ffff88002f179000: operation ldlm_enqueue to node 10.1.4.157@tcp failed: rc = -71 01:00:17:LustreError: Skipped 1 previous similar message 01:00:17:LustreError: 3152:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000400:0x7b0:0x0]HSM set dirty failed: rc2 = -22 01:00:17:LustreError: 3152:0:(llite_lib.c:1647:ll_setattr_raw()) Skipped 21 previous similar messages 01:00:17:LustreError: 11-0: lustre-OST0004-osc-ffff88002f179000: operation ldlm_enqueue to node 10.1.4.156@tcp failed: rc = -12 01:00:17:LustreError: 15304:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff88002f179000: inode [0x200000403:0x941:0x0] mdc close failed: rc = -13 01:00:17:LustreError: 20600:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff88002f179000: inode [0x200000403:0xa2c:0x0] mdc close failed: rc = -13 01:00:17:LustreError: 11-0: lustre-MDT0000-mdc-ffff880064962c00: operation ldlm_enqueue to node 10.1.4.157@tcp failed: rc = -71 01:00:17:LustreError: Skipped 1 previous similar message 01:00:17:LustreError: 3212:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff88002f179000: inode [0x200000403:0xc20:0x0] mdc close failed: rc = -13 01:00:17:LustreError: 29399:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff880064962c00: inode [0x200000400:0xe62:0x0] mdc close failed: rc = -13 01:00:17:LustreError: 11-0: lustre-MDT0000-mdc-ffff880064962c00: operation ldlm_enqueue to node 10.1.4.157@tcp failed: rc = -71 01:09:53:LustreError: 15097:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000401:0xec9:0x0]HSM set dirty failed: rc2 = -22 01:09:53:LustreError: 15097:0:(llite_lib.c:1647:ll_setattr_raw()) Skipped 48 previous similar messages 01:09:53:LustreError: 16369:0:(lcommon_cl.c:185:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x12ad:0x0]: -16 01:09:53:LustreError: 10510:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff880064962c00: inode [0x200000402:0x10c6:0x0] mdc close failed: rc = -13 01:09:53:LustreError: 11-0: lustre-MDT0000-mdc-ffff880064962c00: operation ldlm_enqueue to node 10.1.4.157@tcp failed: rc = -71 01:09:53:LustreError: Skipped 10 previous similar messages 01:09:53:LustreError: 28692:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff880064962c00: inode [0x200000400:0x16f6:0x0] mdc close failed: rc = -13 01:09:53:3[31441]: segfault at 8 ip 000000385220b453 sp 00007fff7e9345d0 error 4 in ld-2.12.so[3852200000+20000] 01:09:53:LustreError: 12673:0:(lcommon_cl.c:185:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x1ab7:0x0]: -16 01:09:54:LustreError: 20907:0:(llite_lib.c:1647:ll_setattr_raw()) [0x200000402:0x1752:0x0]HSM set dirty failed: rc2 = -22 01:09:54:LustreError: 20907:0:(llite_lib.c:1647:ll_setattr_raw()) Skipped 105 previous similar messages 01:09:54:LustreError: 5767:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff880064962c00: inode [0x200000400:0x1cc1:0x0] mdc close failed: rc = -13 01:09:54:Lustre: 1525:0:(client.c:2018:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1436231046/real 1436231046] req@ffff88005ecd36c0 x1505996565685080/t0(0) o36->lustre-MDT0000-mdc-ffff88002f179000@10.1.4.157@tcp:12/10 lens 608/1016 e 0 to 1 dl 1436231090 ref 2 fl Rpc:X/0/ffffffff rc 0/-1 01:09:54:Lustre: 1525:0:(client.c:2018:ptlrpc_expire_one_request()) Skipped 2 previous similar messages 01:09:54:Lustre: lustre-MDT0000-mdc-ffff88002f179000: Connection to lustre-MDT0000 (at 10.1.4.157@tcp) was lost; in progress operations using this service will wait for recovery to complete 01:09:54:Lustre: Skipped 1 previous similar message 01:09:54:Lustre: lustre-MDT0000-mdc-ffff88002f179000: Connection restored to lustre-MDT0000 (at 10.1.4.157@tcp) 01:09:54:Lustre: Skipped 2 previous similar messages 01:09:54:16[3172]: segfault at 8 ip 000000385220b453 sp 00007fff9358d480 error 4 in ld-2.12.so[3852200000+20000] 01:09:54:LustreError: 17858:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff88002f179000: inode [0x200000401:0x2162:0x0] mdc close failed: rc = -13 01:09:54:LustreError: 2957:0:(lcommon_cl.c:185:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x26ad:0x0]: -16 01:09:54:LustreError: 11-0: lustre-MDT0000-mdc-ffff880064962c00: operation ldlm_enqueue to node 10.1.4.157@tcp failed: rc = -71 01:09:54:LustreError: Skipped 17 previous similar messages 01:09:54:19[28476]: segfault at 8 ip 000000385220b453 sp 00007fffe6786060 error 4 in ld-2.12.so[3852200000+20000] 01:09:54:INFO: task chmod:1525 blocked for more than 120 seconds. 01:09:54: Tainted: G W --------------- 2.6.32-504.23.4.el6.x86_64 #1 01:09:54:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 01:09:54:chmod D 0000000000000001 0 1525 1393 0x00000080 01:09:54: ffff88005a2ffcb8 0000000000000082 ffff88007904d440 0000000000000001 01:09:54: 00000000000158c0 ffff88007904d440 0000000000000001 ffff880037d9ccc0 01:09:54: ffff88005a2ffcf8 ffffffff81529a1e ffff88006a33bad8 ffff88005a2fffd8 01:09:54:Call Trace: 01:09:54: [<ffffffff81529a1e>] ? thread_return+0x4e/0x7d0 01:09:54: [<ffffffff8152b3a6>] __mutex_lock_slowpath+0x96/0x210 01:09:54: [<ffffffff8152aecb>] mutex_lock+0x2b/0x50 01:09:54: [<ffffffffa0a88755>] ll_setattr_raw+0x735/0x1430 [lustre] 01:09:54: [<ffffffffa0a894b5>] ll_setattr+0x65/0xd0 [lustre] 01:09:54: [<ffffffff811ad2d8>] notify_change+0x168/0x340 01:09:54: [<ffffffff8118ba97>] sys_fchmodat+0x117/0x170 01:09:54: [<ffffffff81193ee6>] ? sys_newstat+0x36/0x50 01:09:54: [<ffffffff8100c6f5>] ? math_state_restore+0x45/0x60 01:09:54: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b 01:09:54:LustreError: 29000:0:(lcommon_cl.c:185:cl_file_inode_init()) Failure to initialize cl object [0x200000400:0x2d2f:0x0]: -16 01:09:54:18[19161]: segfault at 8 ip 000000385220b453 sp 00007fff040c3900 error 4 in ld-2.12.so[3852200000+20000] 01:16:48:LustreError: 31170:0:(file.c:184:ll_close_inode_openhandle()) lustre-clilmv-ffff880064962c00: inode [0x200000401:0x2b84:0x0] mdc close failed: rc = -13 01:16:48:LustreError: 31170:0:(file.c:184:ll_close_inode_openhandle()) Skipped 4 previous similar messages 01:16:48:INFO: task chmod:1525 blocked for more than 120 seconds. 01:16:48: Tainted: G W --------------- 2.6.32-504.23.4.el6.x86_64 #1 01:16:48:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 01:16:48:chmod D 0000000000000001 0 1525 1 0x00000080 01:16:48: ffff88005a2ffcb8 0000000000000082 ffff88007904d440 0000000000000001 01:16:48: 00000000000158c0 ffff88007904d440 0000000000000001 ffff880037d9ccc0 01:16:48: ffff88005a2ffcf8 ffffffff81529a1e ffff88006a33bad8 ffff88005a2fffd8 01:16:48:Call Trace: 01:16:48: [<ffffffff81529a1e>] ? thread_return+0x4e/0x7d0 01:16:48: [<ffffffff8152b3a6>] __mutex_lock_slowpath+0x96/0x210 01:16:48: [<ffffffff8152aecb>] mutex_lock+0x2b/0x50 01:16:48: [<ffffffffa0a88755>] ll_setattr_raw+0x735/0x1430 [lustre] 01:16:48: [<ffffffffa0a894b5>] ll_setattr+0x65/0xd0 [lustre] 01:16:48: [<ffffffff811ad2d8>] notify_change+0x168/0x340 01:16:48: [<ffffffff8118ba97>] sys_fchmodat+0x117/0x170 01:16:48: [<ffffffff81193ee6>] ? sys_newstat+0x36/0x50 01:16:48: [<ffffffff8100c6f5>] ? math_state_restore+0x45/0x60 01:16:48: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b 01:16:48:INFO: task chmod:1525 blocked for more than 120 seconds. 01:16:48: Tainted: G W --------------- 2.6.32-504.23.4.el6.x86_64 #1 01:16:48:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 01:16:48:chmod D 0000000000000001 0 1525 1 0x00000080 01:16:48: ffff88005a2ffcb8 0000000000000082 ffff88007904d440 0000000000000001 01:16:48: 00000000000158c0 ffff88007904d440 0000000000000001 ffff880037d9ccc0 01:16:48: ffff88005a2ffcf8 ffffffff81529a1e ffff88006a33bad8 ffff88005a2fffd8 01:16:48:Call Trace: 01:16:48: [<ffffffff81529a1e>] ? thread_return+0x4e/0x7d0 01:16:48: [<ffffffff8152b3a6>] __mutex_lock_slowpath+0x96/0x210 01:16:48: [<ffffffff8152aecb>] mutex_lock+0x2b/0x50 01:16:48: [<ffffffffa0a88755>] ll_setattr_raw+0x735/0x1430 [lustre] 01:16:48: [<ffffffffa0a894b5>] ll_setattr+0x65/0xd0 [lustre] 01:16:48: [<ffffffff811ad2d8>] notify_change+0x168/0x340 01:16:48: [<ffffffff8118ba97>] sys_fchmodat+0x117/0x170 01:16:48: [<ffffffff81193ee6>] ? sys_newstat+0x36/0x50 01:16:48: [<ffffffff8100c6f5>] ? math_state_restore+0x45/0x60 01:16:48: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b 01:16:48:INFO: task chmod:1525 blocked for more than 120 seconds. 01:16:48: Tainted: G W --------------- 2.6.32-504.23.4.el6.x86_64 #1 01:16:48:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 01:16:48:chmod D 0000000000000001 0 1525 1 0x00000080 01:16:48: ffff88005a2ffcb8 0000000000000082 ffff88007904d440 0000000000000001 01:16:48: 00000000000158c0 ffff88007904d440 0000000000000001 ffff880037d9ccc0 01:16:48: ffff88005a2ffcf8 ffffffff81529a1e ffff88006a33bad8 ffff88005a2fffd8 01:16:48:Call Trace: 01:16:48: [<ffffffff81529a1e>] ? thread_return+0x4e/0x7d0 01:16:48: [<ffffffff8152b3a6>] __mutex_lock_slowpath+0x96/0x210 01:16:48: [<ffffffff8152aecb>] mutex_lock+0x2b/0x50 01:16:48: [<ffffffffa0a88755>] ll_setattr_raw+0x735/0x1430 [lustre] 01:16:48: [<ffffffffa0a894b5>] ll_setattr+0x65/0xd0 [lustre] 01:16:48: [<ffffffff811ad2d8>] notify_change+0x168/0x340 01:16:48: [<ffffffff8118ba97>] sys_fchmodat+0x117/0x170 01:16:48: [<ffffffff81193ee6>] ? sys_newstat+0x36/0x50 01:16:48: [<ffffffff8100c6f5>] ? math_state_restore+0x45/0x60 01:16:48: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b 01:16:48:INFO: task chmod:1525 blocked for more than 120 seconds. 01:16:48: Tainted: G W --------------- 2.6.32-504.23.4.el6.x86_64 #1 01:16:48:"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 01:16:48:chmod D 0000000000000001 0 1525 1 0x00000080 01:16:48: ffff88005a2ffcb8 0000000000000082 ffff88007904d440 0000000000000001 01:16:48: 00000000000158c0 ffff88007904d440 0000000000000001 ffff880037d9ccc0 01:16:48: ffff88005a2ffcf8 ffffffff81529a1e ffff88006a33bad8 ffff88005a2fffd8 01:16:48:Call Trace: 01:16:48: [<ffffffff81529a1e>] ? thread_return+0x4e/0x7d0 01:16:48: [<ffffffff8152b3a6>] __mutex_lock_slowpath+0x96/0x210 01:16:48: [<ffffffff8152aecb>] mutex_lock+0x2b/0x50 01:16:48: [<ffffffffa0a88755>] ll_setattr_raw+0x735/0x1430 [lustre] 01:16:48: [<ffffffffa0a894b5>] ll_setattr+0x65/0xd0 [lustre] 01:16:48: [<ffffffff811ad2d8>] notify_change+0x168/0x340 01:16:48: [<ffffffff8118ba97>] sys_fchmodat+0x117/0x170 01:24:43: [<ffffffff81193ee6>] ? sys_newstat+0x36/0x50 01:24:43: [<ffffffff8100c6f5>] ? math_state_restore+0x45/0x60 01:24:43: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b |
| Comments |
| Comment by James Nunez (Inactive) [ 11/Sep/18 ] |
|
Closing this ticket since this issue was seen on an older version of Lustre and has not been seen since. If we have more racer hangs or failures in interop testing, we should open a new ticket. |