[LU-3001] sanity 27C: error: getstripe failed for f.sanity.27C0 Created: 21/Mar/13 Updated: 14/Dec/15 Resolved: 14/Dec/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Emoly Liu |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | dne | ||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 7314 |
| Description |
|
This issue was created by Maloo for Li Wei <liwei@whamcloud.com>.

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/fb85b3e0-904d-11e2-8311-52540035b04c.

The sub-test test_27C failed with the following error:

Info required for matching: sanity 27C

== sanity test 27C: check full striping across all OSTs == 13:26:01 (1363638361)
error: getstripe failed for f.sanity.27C0.
/usr/lib64/lustre/tests/sanity.sh: line 1828: [: -eq: unary operator expected
sanity test_27C: @@@@@@ FAIL:
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:3977:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4000:error()
= /usr/lib64/lustre/tests/sanity.sh:1828:test_27C()
= /usr/lib64/lustre/tests/test-framework.sh:4255:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:4288:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4143:run_test()
= /usr/lib64/lustre/tests/sanity.sh:1832:main()
Dumping lctl log to /logdir/test_logs/2013-03-18/lustre-reviews-el6-x86_64--review--1_1_1__14094__-70104848885040-152208/sanity.test_27C.*.1363638362.log
CMD: c01,c02,c03,c04,c05,c06,c08,c09 /usr/sbin/lctl dk > /logdir/test_logs/2013-03-18/lustre-reviews-el6-x86_64--review--1_1_1__14094__-70104848885040-152208/sanity.test_27C.debug_log.\$(hostname -s).1363638362.log;
dmesg > /logdir/test_logs/2013-03-18/lustre-reviews-el6-x86_64--review--1_1_1__14094__-70104848885040-152208/sanity.test_27C.dmesg.\$(hostname -s).1363638362.log
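The "[: -eq: unary operator expected" message from sanity.sh line 1828 is the classic shell pitfall behind the secondary failure: when a command substitution fails (here, the getstripe call), the variable it fills is empty, and an unquoted `[ $count -eq N ]` collapses into `[ -eq N ]`. A minimal standalone sketch of the failure mode and a defensive guard (the variable name and default below are hypothetical illustrations, not the actual sanity.sh code):

```shell
#!/bin/sh
# Hypothetical sketch: if getstripe errors out, the substitution yields an
# empty string, and an unquoted test such as `[ $count -eq 4 ]` becomes
# `[ -eq 4 ]`, printing "[: -eq: unary operator expected".
count=""                # stands in for: count=$(lfs getstripe -c "$file")

# Quoting the variable and supplying a default keeps the test well-formed
# even when the earlier command failed:
if [ "${count:-0}" -eq 0 ]; then
    echo "stripe count unavailable or zero"
fi
```

With the quoted `${count:-0}` form, a failed getstripe produces a clean "unavailable" path instead of a shell syntax error masking the real problem.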
|
| Comments |
| Comment by Emoly Liu [ 27/Mar/13 ] |
|
I can't reproduce it with 1 MDS + 2 MDTs. I will try it with 2 MDSes later. |
| Comment by Emoly Liu [ 27/Mar/13 ] |
|
I can't reproduce this failure with 2 MDSes on 2 VMs either. According to the Maloo report above, I suspect this failure was probably caused by a previous test failure in test_17k/17n/27u. I will verify it. |
| Comment by Sarah Liu [ 28/Mar/13 ] |
|
Another failure hit with 1 MDS/2 MDTs: https://maloo.whamcloud.com/test_sets/7043259c-9656-11e2-9abb-52540035b04c |
| Comment by Richard Henwood (Inactive) [ 04/Apr/13 ] |
|
Here is, apparently, another: https://maloo.whamcloud.com/test_sets/97e8851c-9cf8-11e2-a280-52540035b04c |
| Comment by Emoly Liu [ 08/Apr/13 ] |
Richard, it seems the Maloo report above is not related to this failure. |
| Comment by Emoly Liu [ 10/Apr/13 ] |
|
1 MDS + 2 MDTs: I tried to use a test patch (http://review.whamcloud.com/#change,5983) to reproduce this failure with "Test-Parameters: fortestonly mdtcount=2 testlist=sanity", but failed. The Maloo report is at https://maloo.whamcloud.com/test_sessions/5c970e94-a139-11e2-b1c3-52540035b04c
09:14:24:== sanity test 27C: check full striping across all OSTs == 09:14:14 (1365524054)
09:14:24:0 1 2 3 4 5 6
09:14:24:1 2 3 4 5 6 0
09:14:24:2 3 4 5 6 0 1
09:14:24:3 4 5 6 0 1 2
09:14:24:4 5 6 0 1 2 3
09:14:24:5 6 0 1 2 3 4
09:14:24:6 0 1 2 3 4 5
2 MDSes + 2 MDTs + 4 OSTs: Since the only valid value for mdscount in Test-Parameters is 1, I used 3 VMs to reproduce it: 2 VMs ran the 2 MDSes and the third VM ran the OST + client.
[root@centos6-2 tests]# mgs_HOST=centos6-1 mds1_HOST=centos6-1 mds2_HOST=centos6-3 ost_HOST=centos6-2 MDSCOUNT=2 PDSH="pdsh -S -Rrsh -w" ONLY=27C sh sanity.sh
centos6-3:
centos6-1:
centos6-2: Logging to local directory: /tmp/test_logs/1365579025
centos6-2: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients centos6-2 environments
Using TIMEOUT=100
centos6-1:
centos6-2: seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 8s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
centos6-3:
centos6-1: running as uid/gid/euid/egid 500/500/500/500, groups: [touch] [/mnt/lustre/d0_runas_test/f2554]
only running test 27C
centos6-2:
centos6-1: excepting tests: 76 42a 42b 42c 42d 45 51d 68b
centos6-1:
centos6-2: skipping tests SLOW=no: 24o 27m 64b 68 71 77f 78 115 124b
centos6-1:
centos6-2: preparing for tests involving mounts
mke2fs 1.42.6.wc2 (10-Dec-2012)
debug=-1
== sanity test 27C: check full striping across all OSTs == 15:30:35 (1365579035)
centos6-2:
centos6-1: mkdir 1 for /mnt/lustre/d0.sanity/d27
0 1 2 3
1 2 3 0
2 3 0 1
3 0 1 2
Resetting fail_loc on all nodes...centos6-2: centos6-1: done.
centos6-1: PASS 27C (0s)
resend_count is set to 4 4 4 4
resend_count is set to 4 4 4 4
resend_count is set to 4 4 4 4
resend_count is set to 4 4 4 4
resend_count is set to 4 4 4 4
== sanity test complete, duration 11 sec == 15:30:36 (1365579036)
And I ran the test_27 series several times but still can't reproduce this failure. I'd like to close this ticket and reopen it if we hit this separately in the future. |
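The index matrix printed by the passing runs above is what test_27C checks: with files striped across all OSTs, each file's object list should be a rotation that covers every OST index exactly once. A standalone sketch of that coverage check on the 4-OST matrix from the run above (pure shell, no Lustre required; the matrix literal is sample data, not live getstripe output):

```shell
#!/bin/sh
# Verify that every row of stripe indices covers all OSTs exactly once,
# using the 4-OST rotation matrix from the passing run as sample input.
nost=4
matrix="0 1 2 3
1 2 3 0
2 3 0 1
3 0 1 2"

# Sorted form every valid row must reduce to, e.g. "0 1 2 3 " for 4 OSTs.
expected=$(seq 0 $((nost - 1)) | tr '\n' ' ')

echo "$matrix" | while read -r row; do
    # Word-split the row, sort numerically, and rejoin for comparison.
    sorted=$(printf '%s\n' $row | sort -n | tr '\n' ' ')
    [ "$sorted" = "$expected" ] || { echo "FAIL: $row"; exit 1; }
done && echo "all rows cover all $nost OSTs"
```

In the real test the rows would come from `lfs getstripe` on each created file; the sort-and-compare step is the same either way and is insensitive to which OST the rotation starts on.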
| Comment by nasf (Inactive) [ 13/Dec/15 ] |
|
It seems to be reproducible on the latest master: |
| Comment by Andreas Dilger [ 14/Dec/15 ] |
|
Going to use |