[LU-2105] 2.3<->2.1 interop: Test failure on test suite sanityn, subtest test_33a
Created: 08/Oct/12  Updated: 29/May/17  Resolved: 29/May/17
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0, Lustre 2.1.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Severity: | 3 |
| Rank (Obsolete): | 4393 |
| Description |
This issue was created by maloo for yujian <yujian@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/11d06cfe-0e3e-11e2-91a3-52540035b04c

The sub-test test_33a failed with the following error:

=== START createmany old: 0 transaction
CMD: client-28vm6.lab.whamcloud.com,client-28vm5 createmany -o /mnt/lustre/d0.sanityn/d33-\$(hostname)-3/f- -r /mnt/lustre2/d0.sanityn/d33-\$(hostname)-3/f- 10000 > /dev/null 2>&1
test failed to respond and timed out

Info required for matching: sanityn 33a

Lustre Client Build: http://build.whamcloud.com/job/lustre-b2_3/28

Console log on MDS (client-28vm3, 10.10.4.166) showed that:

Lustre: DEBUG MARKER: lctl get_param -n osd*.lustre-MDT0000.mntdev
Lustre: DEBUG MARKER: procfile=/proc/fs/jbd/lvm--MDS-P1/info;
[ -f $procfile ] || procfile=/proc/fs/jbd2/lvm--MDS-P1/info;
[ -f $procfile ] || procfile=/proc/fs/jbd2/lvm--MDS-P1\:\*/info;
cat $procfile | head -1;
Lustre: 2698:0:(client.c:1780:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1349281914/real 1349281914] req@ffff88005e2b7800 x1414805216339338/t0(0) o400->lustre-OST0000-osc-MDT0000@10.10.4.167@tcp:28/4 lens 192/192 e 0 to 1 dl 1349281921 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 2698:0:(client.c:1780:ptlrpc_expire_one_request()) Skipped 6 previous similar messages
Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.10.4.167@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 2 previous similar messages
Lustre: 2698:0:(client.c:1780:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1349281919/real 1349281919] req@ffff880054d9a400 x1414805216339346/t0(0) o400->lustre-OST0000-osc-MDT0000@10.10.4.167@tcp:28/4 lens 192/192 e 0 to 1 dl 1349281926 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 2698:0:(client.c:1780:ptlrpc_expire_one_request()) Skipped 6 previous similar messages

<ConMan> Console [client-28vm3] disconnected from <client-28:6002> at 10-03 09:32.

<ConMan> Console [client-28vm3] connected to <client-28:6002> at 10-03 09:32.

Press any key to continue.

GNU GRUB version 0.97 (617K lower / 2094860K upper memory)
Console log on OSS (client-28vm4, 10.10.4.167) showed that:

Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanityn test 33a: commit on sharing, cross crete\/delete, 2 clients, benchmark == 08:29:53 \(1349278193\)
Lustre: DEBUG MARKER: == sanityn test 33a: commit on sharing, cross crete/delete, 2 clients, benchmark == 08:29:53 (1349278193)
Lustre: lustre-OST0000: already connected client lustre-MDT0000-mdtlov_UUID (at 10.10.4.166@tcp) with handle 0xec210ade624773e8. Rejecting client with the same UUID trying to reconnect with handle 0x332bdfaf3e4636f

<ConMan> Console [client-28vm4] disconnected from <client-28:6003> at 10-03 09:31.

<ConMan> Console [client-28vm4] connected to <client-28:6003> at 10-03 09:32.

Press any key to continue.

GNU GRUB version 0.97 (617K lower / 2094860K upper memory)
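For reference, a minimal sketch of the failing load, reconstructed from the CMD line quoted in the description. The createmany flags, paths, and file count are copied verbatim from the log; the use of pdsh to fan the command out to both clients, and the mkdir setup step, are assumptions about how the test framework dispatches such CMD lines, not taken from this run:

# Both clients named in the CMD line above (assumption: pdsh fan-out,
# as the Lustre test framework's do_nodes typically uses pdsh).
CLIENTS="client-28vm6.lab.whamcloud.com,client-28vm5"

# Hostname-tagged test directories on the two client mounts; single quotes
# keep $(hostname) unexpanded locally so it expands on each remote client,
# matching the escaped \$(hostname) in the original CMD line.
pdsh -w "$CLIENTS" 'mkdir -p /mnt/lustre/d0.sanityn/d33-$(hostname)-3 /mnt/lustre2/d0.sanityn/d33-$(hostname)-3'

# Each client creates 10000 files, generating the concurrent create traffic
# on the shared MDT that this commit-on-sharing benchmark measures; in the
# failed run this step hung and the test timed out.
pdsh -w "$CLIENTS" 'createmany -o /mnt/lustre/d0.sanityn/d33-$(hostname)-3/f- -r /mnt/lustre2/d0.sanityn/d33-$(hostname)-3/f- 10000 > /dev/null 2>&1'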
| Comments |
| Comment by Jian Yu [ 10/Oct/12 ] |
Lustre Server Build: http://build.whamcloud.com/job/lustre-b2_1/121

The same issue occurred again: https://maloo.whamcloud.com/test_sets/1c159394-12a6-11e2-a23c-52540035b04c
| Comment by Andreas Dilger [ 29/May/17 ] |
Close old ticket.