[LU-541] 1.8<->2.1 interop: sanity test_65e: FAIL: no stripe info failed Created: 27/Jul/11 Updated: 16/Aug/16 Resolved: 16/Aug/16 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0, Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.1.0, Lustre 1.8.7 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | Zhenyu Xu |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre Clients: Lustre Servers: |
||
| Severity: | 3 |
| Rank (Obsolete): | 4931 |
| Description |
|
sanity test 65e failed as follows:
== test 65e: directory setstripe defaults ======================= == 04:55:44
sanity test_65e: @@@@@@ FAIL: no stripe info failed
Dumping lctl log to /home/yujian/test_logs/2011-07-27/042626/sanity.test_65e.*.1311767744.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/042626/sanity-1311767744.tar.bz2
error: can't get EA for /mnt/lustre/d65 : Success (0)
Resetting fail_loc on all nodes...done.
FAIL (68s)
Maloo report: https://maloo.whamcloud.com/test_sets/756128ea-b853-11e0-8bdf-52540025f9af |
| Comments |
| Comment by Jian Yu [ 27/Jul/11 ] |
|
sanity test 65g failed as follows:
== test 65g: directory setstripe -d =========================== == 04:56:53
sanity test_65g: @@@@@@ FAIL: delete default stripe failed
Dumping lctl log to /home/yujian/test_logs/2011-07-27/042626/sanity.test_65g.*.1311767814.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/042626/sanity-1311767814.tar.bz2
Resetting fail_loc on all nodes...done.
FAIL (39s)
Maloo report: https://maloo.whamcloud.com/test_sets/756128ea-b853-11e0-8bdf-52540025f9af |
| Comment by Peter Jones [ 27/Jul/11 ] |
|
Bobijam, could you please look into this one? Thanks, Peter |
| Comment by Andreas Dilger [ 27/Jul/11 ] |
|
This is caused by the changes to "lfs getstripe" on master in a77212cd13627b2b9f1835c48599e91c82aeed9d. I don't think this is a critical failure, but the test output doesn't show the actual result, so it is hard to say.

It would be useful to run a simple "lfs getstripe /mnt/lustre" from the 1.8.6 client against a 2.1.0 MDS and see what it reports, compared to a 1.8.6 client on a 1.8.6 MDS. Maybe Prakash has already done this? It would also be possible to have the MDS return default striping to 1.8 clients (those missing OBD_CONNECT_FULL20), but I don't know if this is strictly necessary.

It looks like there is a bug in ll_dirstripe_verify.c, because in the places it is calling llapi_file_get_stripe() it is using "rc" to determine if there was a problem and "errno" for the error code, but "errno" has been clobbered by other operations inside llapi_file_get_stripe(). It should instead be using "rc" to check if the error was ENODATA. Similarly the call to llapi_file_get_lov_uuid() should use "rc" instead of "errno". Also (or instead?) it makes sense to set errno = rc inside the llapi_*() functions, because llapi_err() is also using errno to print out the error messages.

I also see that test_65e is in sanity.sh ALWAYS_EXCEPT due to https://bugzilla.lustre.org/show_bug.cgi?id=12653, but that bug is closed and this test (and test_65a also) should have been removed from ALWAYS_EXCEPT.

So, two or three things to fix for this bug:
|
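The errno problem described in the comment above can be illustrated with a minimal sketch. It is not the ll_dirstripe_verify.c or liblustreapi code: `get_stripe_sketch()` is a hypothetical stand-in for a call such as llapi_file_get_stripe(), and the "return a negative errno value" convention is an assumption made for the example. The cleanup step overwrites errno in the same way the comment describes, which would also be consistent with the "Success (0)" text in the test_65e output.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for a library call such as llapi_file_get_stripe():
 * it fails with ENODATA, then performs cleanup that overwrites errno. */
int get_stripe_sketch(void)
{
	int rc = -ENODATA;	/* the real failure reason */

	errno = ENODATA;
	close(-1);		/* cleanup on an invalid fd sets errno to EBADF */
	return rc;		/* negative errno value (assumed convention) */
}

/* Fragile: trusts errno, which was overwritten during cleanup. */
int no_stripe_info_via_errno(void)
{
	int rc = get_stripe_sketch();

	return rc < 0 && errno == ENODATA;
}

/* Robust: decodes the failure from the return value itself. */
int no_stripe_info_via_rc(void)
{
	int rc = get_stripe_sketch();

	assert(rc <= 0);	/* sketch assumes 0 or a negative errno */
	return rc == -ENODATA;
}
```

With these definitions, `no_stripe_info_via_rc()` recognises the ENODATA case while `no_stripe_info_via_errno()` misses it, because errno holds EBADF by the time the caller looks at it.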
| Comment by Oleg Drokin [ 28/Jul/11 ] |
|
This might be related to |
| Comment by Zhenyu Xu [ 28/Jul/11 ] |
|
Yes, it's caused by the change on master in a77212cd13627b2b9f1835c48599e91c82aeed9d.
c. If the directory in question DOES NOT have it's EA set, AND it
The 1.8 client tries to getstripe /mnt/lustre/d65; the 2.0 server returns NODATA, while a 1.8 server returns the filesystem's default EA values. |
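The option raised earlier in the thread, having the MDS return default striping to clients that lack OBD_CONNECT_FULL20, might look roughly like the sketch below. Everything here is hypothetical: the struct, the function name, and the flag bit are placeholders for illustration (the bit value is not the real OBD_CONNECT_FULL20 value), and none of it is taken from the Lustre source or from the patch at review 1154.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit: NOT the real value of OBD_CONNECT_FULL20. */
#define SKETCH_CONNECT_FULL20 (1ULL << 0)

/* Hypothetical, simplified striping description. */
struct striping_sketch {
	uint32_t stripe_count;
	uint32_t stripe_size;
};

/* Returns true and fills *out when striping should be sent to the client;
 * returns false for the "no data" reply. dir_ea is NULL when the directory
 * has no striping EA of its own. */
bool dir_striping_reply_sketch(uint64_t client_flags,
			       const struct striping_sketch *dir_ea,
			       const struct striping_sketch *fs_default,
			       struct striping_sketch *out)
{
	if (dir_ea != NULL) {
		*out = *dir_ea;		/* directory has its own default */
		return true;
	}
	if (!(client_flags & SKETCH_CONNECT_FULL20)) {
		*out = *fs_default;	/* old client: keep 1.8-style reply */
		return true;
	}
	return false;			/* new client: report no data */
}
```

The design choice in this sketch is to keep the newer "no data" reply for current clients and special-case only old ones, so that 1.8 tools would see the output they were written against.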
| Comment by Jian Yu [ 28/Jul/11 ] |
|
sanity test 102k also failed with the "lfs getstripe" issue:
== test 102k: setfattr without parameter of value shouldn't cause a crash == 23:32:11
/usr/lib64/lustre/tests/sanity.sh: line 4491: [: too many arguments
sanity test_102k: @@@@@@ FAIL: stripe size /mnt/lustre/d102k has no stripe info != /mnt/lustre/d102k has no stripe info
Dumping lctl log to /home/yujian/test_logs/2011-07-27/230722/sanity.test_102k.*.1311834732.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/230722/sanity-1311834732.tar.bz2
/usr/lib64/lustre/tests/sanity.sh: line 4492: [: too many arguments
sanity test_102k: @@@@@@ FAIL: stripe count /mnt/lustre/d102k has no stripe info != /mnt/lustre/d102k has no stripe info
Dumping lctl log to /home/yujian/test_logs/2011-07-27/230722/sanity.test_102k.*.1311834809.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/230722/sanity-1311834809.tar.bz2
/usr/lib64/lustre/tests/sanity.sh: line 4493: [: too many arguments
sanity test_102k: @@@@@@ FAIL: stripe offset /mnt/lustre/d102k has no stripe info != /mnt/lustre/d102k has no stripe info
Dumping lctl log to /home/yujian/test_logs/2011-07-27/230722/sanity.test_102k.*.1311834850.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/230722/sanity-1311834850.tar.bz2
Resetting fail_loc on all nodes...done.
FAIL (164s)
Maloo report: https://maloo.whamcloud.com/test_sets/c3eac614-b8ea-11e0-8bdf-52540025f9af |
| Comment by Zhenyu Xu [ 28/Jul/11 ] |
|
b18 interop patch tracking at http://review.whamcloud.com/1154 |
| Comment by Peter Jones [ 28/Jul/11 ] |
|
Seems to warrant being a blocker |
| Comment by Jian Yu [ 28/Jul/11 ] |
|
ost-pools test 24 also failed with the "lfs getstripe" issue:
== test 24: Independence of pool from other setstripe parameters == 04:12:59
fat-amd-1-ib: Pool lustre.pool1 created
Updated after 0 sec: wanted '' got ''
fat-amd-1-ib: OST lustre-OST0000_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0001_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0002_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0003_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0004_UUID added to pool lustre.pool1
fat-amd-1-ib: OST lustre-OST0005_UUID added to pool lustre.pool1
Updated after 0 sec: wanted 'lustre-OST0000_UUID lustre-OST0001_UUID lustre-OST0002_UUID lustre-OST0003_UUID lustre-OST0004_UUID lustre-OST0005_UUID ' got 'lustre-OST0000_UUID lustre-OST0001_UUID lustre-OST0002_UUID lustre-OST0003_UUID lustre-OST0004_UUID lustre-OST0005_UUID '
total: 10 creates in 0.02 seconds: 518.35 creates/second
total: 10 creates in 0.02 seconds: 569.32 creates/second
total: 10 creates in 0.02 seconds: 569.02 creates/second
total: 10 creates in 0.02 seconds: 576.37 creates/second
ost-pools test_24: @@@@@@ FAIL: Stripe count () not inherited in /mnt/lustre/d0.ost-pools/d24/dir4/f240 (1)
Maloo report: https://maloo.whamcloud.com/test_sets/02ffc662-b91b-11e0-8bdf-52540025f9af |
| Comment by Andreas Dilger [ 28/Jul/11 ] |
|
Bobijam,
void llapi_error(int level, int rc, char *fmt, ...)
In the liblustreapi.h header:
/* Compatibility function for old programs using llapi_err() */
And then do a simple search & replace for "llapi_err(...," with "llapi_error(..., rc,". I notice in some places liblustreapi.c is using llapi_err(level | LLAPI_MSG_NO_ERRNO, ...) or llapi_err_noerrno(level, ...), but these can be replaced with llapi_printf(level, ...) and remove some complexity in the code. |
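The idea behind passing rc explicitly, as in the llapi_error() signature proposed above, can be sketched as follows. This is not the liblustreapi implementation: `format_error_sketch()` is a hypothetical helper that writes into a caller-supplied buffer, the negative-errno convention for rc is assumed, and the ": text (code)" layout simply imitates the "Success (0)" message seen in the test_65e output.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: formats the caller's message, then appends the text
 * for the error code that was passed in. The global errno is never read,
 * so it does not matter what other calls did to it in the meantime. */
int format_error_sketch(char *buf, size_t len, int rc, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, len, fmt, args);
	va_end(args);
	if (n < 0 || (size_t)n >= len)
		return n;	/* formatting failed or message truncated */

	/* rc is assumed to be 0 or a negative errno value. */
	return n + snprintf(buf + n, len - n, ": %s (%d)",
			    strerror(abs(rc)), rc);
}
```

A caller holding `rc = -ENODATA` would pass it straight through, for example `format_error_sketch(buf, sizeof(buf), rc, "can't get EA for %s", path)`, and get the ENODATA text regardless of the current errno.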
| Comment by Zhenyu Xu [ 30/Jul/11 ] |
|
llapi_err_noerrno() is for printing to stderr, and llapi_printf() is for stdout. So I think I'd preserve llapi_err_noerrno(), and change llapi_err() to llapi_error() so that it prints an explicitly specified error value. |
| Comment by Build Master (Inactive) [ 08/Aug/11 ] |
|
Integrated in Oleg Drokin : 70c52b3e8ff585d4be806f4024bc202df5e285b7
|
| Comment by Peter Jones [ 08/Aug/11 ] |
|
Lowering priority for remaining 1.8.x work now that 2.1 patch has landed |
| Comment by Build Master (Inactive) [ 29/Aug/11 ] |
|
Integrated in Johann Lombardi : 672005075b3c48c7b67fb35bd4c607f8eb97e1e1
|
| Comment by Minh Diep [ 21/Sep/11 ] |
|
Hit this on 2.1.0 RC2 testing https://maloo.whamcloud.com/test_sets/9d355b6c-e43f-11e0-9909-52540025f9af == sanity test 65e: directory setstripe defaults ========================= 00:52:15 (1316505135) |
| Comment by Minh Diep [ 21/Sep/11 ] |
|
I reopened this because we hit it in 2.1.0 RC2 testing. |
| Comment by Andreas Dilger [ 21/Sep/11 ] |
|
Looking at the test output, it appears that the intended effect of removing the default striping from the directory is properly handled, but the test script expects specific output from "lfs getstripe" which wasn't landed until after 1.8.6 was released (http://review.whamcloud.com/1154). I don't think this should be a blocker for 2.1.0. |
| Comment by Andreas Dilger [ 12/Jan/12 ] |
|
This was hit again in https://maloo.whamcloud.com/test_sets/c55a8168-38f8-11e1-b15b-5254004bbbd3. I don't know which version of Lustre was being tested on the client, or whether it should have fixed this problem during interop testing. |
| Comment by James A Simmons [ 16/Aug/16 ] |
|
Old blocker for unsupported version |