[LU-9248] conf-sanity: test_55 fails with lov_objid size has to be 8192, not 8192 Created: 23/Mar/17 Updated: 19/Mar/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0, Lustre 2.9.0, Lustre 2.10.0, Lustre 2.10.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | tests | ||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Joe Gmitter <joseph.gmitter@intel.com>.
conf-sanity: test_55 fails with "lov_objid size has to be 8192, not 8192".
Starting client: trevis-33vm1.trevis.hpdd.intel.com: -o user_xattr,flock trevis-33vm7@tcp:/lustre /mnt/lustre
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/3e251b4e-0f04-11e7-9053-5254006e85c2. |
| Comments |
| Comment by James Casper [ 24/May/17 ] |
|
2.9.57, b3575: |
| Comment by Andreas Dilger [ 01/Dec/17 ] |
|
This looks like some kind of bash comparison bug in the test that could be easily fixed:
LOV_OBJID_SIZE=$(do_facet mds1 "$DEBUGFS -R 'stat lov_objid' $mdsdev 2>/dev/null" |
grep ^User | awk -F 'Size: ' '{print $2}')
if [ "$LOV_OBJID_SIZE" != $(lov_objid_size $i) ]; then
error "lov_objid size has to be $(lov_objid_size $i), not $LOV_OBJID_SIZE"
It might be that the quoted "$LOV_OBJID_SIZE" doesn't compare nicely with the unquoted $(lov_objid_size $i)? Try removing the double quotes, and using if [[ ... ]] instead? It would also be useful to add single quotes around the values in the error message, in case there are spaces around the values (which would also cause problems with the quoted value). |
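The comparison suggested above could be sketched roughly as follows. This is only an illustration, not the actual test_55 code: `sizes_match` and `report_mismatch` are hypothetical helper names, and the ticket does not confirm that stray whitespace is really the cause of the "8192, not 8192" message.

```shell
#!/bin/bash
# Hypothetical helpers, not part of conf-sanity.sh.

# Compare two size strings after trimming leading/trailing whitespace.
# 'read' strips IFS whitespace from both ends of the line it reads.
sizes_match() {
    local actual expected
    read -r actual <<< "$1"
    read -r expected <<< "$2"
    [[ "$actual" == "$expected" ]]
}

# Build the error text with single quotes around each value so that
# stray spaces become visible in the test log.
report_mismatch() {
    echo "lov_objid size has to be '$2', not '$1'"
}

# Example: a trailing space no longer causes a false failure.
if ! sizes_match "8192 " "8192"; then
    report_mismatch "8192 " "8192"
fi
```

With quoted values in the message, a mismatch caused by whitespace would show up as `'8192 '` versus `'8192'` rather than two seemingly identical numbers.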
| Comment by James Nunez (Inactive) [ 13/Apr/18 ] |
|
We haven't seen conf-sanity test 55 fail with "lov_objid size has to be 8192, not 8192" for over a year. I've gone back to January of 2017 and don't see this error message for this test.
What we do see frequently is the error "lov_objid size has to be 8192, not 0", which is not the same issue. We see this error message only during interop testing, when a client with version 2.9.56 (actually 2.9.55.36) or earlier runs against a server with version 2.9.57 (actually ~2.9.55.38) or later. For example, we see this failure with the following Lustre client/server combinations:
2.9.0 clients and 2.11.50.52 servers
The issue with interop testing is the patch that changed how test_55 extracts the size from the debugfs output:
 echo checking size of lov_objid for ost index $i
- LOV_OBJID_SIZE=$(do_facet mds1 "$DEBUGFS -R 'stat lov_objid' $mdsdev 2>/dev/null" | grep ^User | awk '{print $6}')
+ LOV_OBJID_SIZE=$(do_facet mds1 "$DEBUGFS -R 'stat lov_objid' $mdsdev 2>/dev/null" |
+ grep ^User | awk -F 'Size: ' '{print $2}')
 if [ "$LOV_OBJID_SIZE" != $(lov_objid_size $i) ]; then
 error "lov_objid size has to be $(lov_objid_size $i), not $LOV_OBJID_SIZE"
 else
Looking at a master (2.11.50) MDS on a running system, we see:
# debugfs -R 'stat lov_objid' /dev/vda3 | grep ^User
debugfs 1.42.13.wc6 (05-Feb-2017)
User: 0 Group: 0 Project: 0 Size: 32
Using the "old", pre-2.9.56 grep/awk commands printing $6, we get:
# debugfs -R 'stat lov_objid' /dev/vda3 | grep ^User | awk '{print $6}'
debugfs 1.42.13.wc6 (05-Feb-2017)
0
which explains the output we see with interop testing: with the Project field present, $6 is the project ID rather than the size. Thus, if we want to "fix" this issue, we would need to change which field is printed based on the server version number for all clients at 2.9.0 and earlier, which seems unlikely.
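To make the field shift concrete, here is a small sketch that runs both awk variants against sample "User" lines. The new-format line is taken from the output above; the old-format line (without the Project field) is an assumption about what older debugfs versions print, and the function names are hypothetical.

```shell
#!/bin/bash
# New-format line, as shown in the debugfs output above.
new_line='User: 0 Group: 0 Project: 0 Size: 32'
# Assumed old-format line (no Project field); not taken from the ticket.
old_line='User: 0 Group: 0 Size: 32'

# Old test logic: pick the 6th whitespace-separated field.
size_by_position() {
    awk '{print $6}' <<< "$1"
}

# Newer test logic: split on the literal label "Size: ".
size_by_label() {
    awk -F 'Size: ' '{print $2}' <<< "$1"
}

size_by_position "$old_line"   # prints 32
size_by_position "$new_line"   # prints 0 (the Project value)
size_by_label "$new_line"      # prints 32
```

The positional variant only works while the size happens to be the sixth field, which is why an old client's test reports a size of 0 against a newer server.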
|
| Comment by Andreas Dilger [ 13/Apr/18 ] |
|
It would be better to do something like:
LOV_OBJID_SIZE=$(do_facet mds1 "$DEBUGFS -R 'stat lov_objid' $mdsdev 2>/dev/null" |
grep "^User" | sed -e 's/.*Size: //' -e 's/ [A-Z].*//')
That will drop everything up to and including "Size: ", and then (just in case this changes again in the future) drop anything after the actual size. That should work with both old and new debugfs output, and be flexible in the future. We might even consider replacing the use of "^User" with "Size: " so that it is triggered on the actual data that we want rather than an unrelated value that just happens to exist on the same line. |
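The sed-based extraction can be tried in isolation with a sketch like the one below. `extract_size` is a hypothetical wrapper that takes the debugfs text as an argument instead of calling `do_facet`; the old-format and "future" sample lines are assumptions made up for illustration, and only the new-format line comes from this ticket.

```shell
#!/bin/bash
# Hypothetical wrapper around the suggested grep/sed pipeline.
# $1 is the text that 'debugfs -R "stat lov_objid"' would print.
extract_size() {
    grep "^User" <<< "$1" |
        sed -e 's/.*Size: //' -e 's/ [A-Z].*//'
}

# Sample inputs: the banner line is skipped because it does not start
# with "User".
new_out='debugfs 1.42.13.wc6 (05-Feb-2017)
User: 0 Group: 0 Project: 0 Size: 32'
old_out='User: 0 Group: 0 Size: 32'                 # assumed old layout
future_out='User: 0 Group: 0 Size: 32 Extra: 1'     # invented trailing field

extract_size "$new_out"
extract_size "$old_out"
extract_size "$future_out"
```

Before switching the grep pattern from "^User" to "Size: ", it would be worth checking whether any other line of the stat output also contains "Size: " on the debugfs versions in use, since that would yield more than one value.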
| Comment by Saurabh Tandan (Inactive) [ 08/May/18 ] |
|
+1 on 2.10.3 https://testing.hpdd.intel.com/test_sets/61ca2062-5067-11e8-abc3-52540065bddc |