[LU-4457] Interop 2.5.0<->2.6 failure on test suite sanity test_118d: Multiop failed to block on fsync Created: 08/Jan/14 Updated: 21/Jan/22 Resolved: 21/Jan/22 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.6.0, Lustre 2.7.0, Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
server: lustre-master build # 1823 RHEL6 ldiskfs |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 12220 | ||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>.
This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/ff6f9f86-77ef-11e3-a6a3-52540035b04c.
The sub-test test_118d failed with the following error:
test log shows:
== sanity test 118d: Fsync validation inject a delay of the bulk ============ 15:07:14 (1389049634)
7+0 records in
7+0 records out
458752 bytes (459 kB) copied, 0.0035145 s, 131 MB/s
CMD: client-21-ib lctl set_param fail_loc=0x214
fail_loc=0x214
sanity test_118d: @@@@@@ FAIL: Multiop failed to block on fsync, pid=26893 |
| Comments |
| Comment by Sarah Liu [ 08/Jan/14 ] |
|
OST console
15:07:43:Lustre: DEBUG MARKER: == sanity test 118d: Fsync validation inject a delay of the bulk ============ 15:07:14 (1389049634)
15:07:43:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x214
15:07:44:LustreError: 8057:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id 214 sleeping for -1000ms
15:07:44:schedule_timeout: wrong timeout value fffffffffffffc18
15:07:44:Pid: 8057, comm: ll_ost_io00_039 Not tainted 2.6.32-358.23.2.el6_lustre.x86_64 #1
15:07:45:Call Trace:
15:07:45: [<ffffffff8150f529>] ? schedule_timeout+0x2c9/0x2e0
15:07:45: [<ffffffffa04b7921>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
15:07:45: [<ffffffffa04b2d0f>] ? __cfs_fail_timeout_set+0xcf/0x150 [libcfs]
15:07:46: [<ffffffffa0887ec9>] ? cfs_fail_timeout_set.clone.2+0x29/0x30 [ptlrpc]
15:07:46: [<ffffffffa088b94b>] ? tgt_brw_write+0x34b/0x1550 [ptlrpc]
15:07:47: [<ffffffffa04b7921>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
15:07:47: [<ffffffffa088dfea>] ? tgt_handle_request0+0x2ea/0x1490 [ptlrpc]
15:07:47: [<ffffffffa04b7921>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
15:07:47: [<ffffffffa088f5ca>] ? tgt_request_handle+0x43a/0x980 [ptlrpc]
15:07:50: [<ffffffffa0842725>] ? ptlrpc_main+0xd25/0x1970 [ptlrpc]
15:07:50: [<ffffffffa0841a00>] ? ptlrpc_main+0x0/0x1970 [ptlrpc]
15:07:50: [<ffffffff81096a36>] ? kthread+0x96/0xa0
15:07:51: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
15:07:52: [<ffffffff810969a0>] ? kthread+0x0/0xa0
15:07:52: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
15:07:52:LustreError: 8057:0:(fail.c:137:__cfs_fail_timeout_set()) cfs_fail_timeout id 214 awake
15:07:52:Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanity test_118d: @@@@@@ FAIL: Multiop failed to block on fsync, pid=26893 |
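Note on the console output above: the value in `schedule_timeout: wrong timeout value fffffffffffffc18` appears to be the same negative delay that `__cfs_fail_timeout_set()` reports as `-1000ms`. Read as a signed 64-bit integer, that hex string decodes to -1000. The sketch below only illustrates the arithmetic; the helper name `to_signed64` is hypothetical and not part of Lustre or the kernel, and equating the two numbers assumes one jiffy per millisecond on the OSS, which the ticket does not state.

```python
def to_signed64(hex_str: str) -> int:
    """Interpret a hex string as a two's-complement signed 64-bit integer.

    Hypothetical helper for reading kernel log values; not a Lustre API.
    """
    value = int(hex_str, 16)
    # Values with the top bit set represent negative numbers.
    if value >= 1 << 63:
        value -= 1 << 64
    return value


if __name__ == "__main__":
    # Value copied from the "schedule_timeout: wrong timeout value" line.
    print(to_signed64("fffffffffffffc18"))  # -1000
```

If that reading is right, the OST thread was asked to sleep for a negative interval, so the injected bulk delay would not have held the write back, which is consistent with the client seeing fsync return without blocking.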
| Comment by Sarah Liu [ 19/Nov/14 ] |
|
Still hitting this error in interop testing between a 2.5.3 client and a 2.7 server, so reopening: https://testing.hpdd.intel.com/test_sets/f2a815bc-6ba5-11e4-88ff-5254006e85c2 |
| Comment by Sarah Liu [ 13/Apr/15 ] |
|
Hit this in interop testing between 2.5.3 client and 2.8 server: |
| Comment by Ashish Purkar (Inactive) [ 20/Jul/16 ] |
|
Has this issue been seen in recent interop testing? |