[LU-9259] sanity test_17o failed with 'stat file should fail' Created: 27/Mar/17 Updated: 29/Jun/17 Resolved: 19/Apr/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0 |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | nasf (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
review-dne |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
sanity test 17o is failing with the error message 'stat file should fail'. sanity test 17o touches a file, fails the MDS and then checks to see if the file exists. Here’s the code:
local WDIR=$DIR/${tdir}o
local mdt_index
local rc=0

test_mkdir -p $WDIR
mdt_index=$($LFS getstripe -M $WDIR)
mdt_index=$((mdt_index+1))

touch $WDIR/$tfile

#fail mds will wait the failover finish then set
#following fail_loc to avoid interfer the recovery process.
fail mds${mdt_index}

#define OBD_FAIL_OSD_LMA_INCOMPAT 0x194
do_facet mds${mdt_index} lctl set_param fail_loc=0x194
ls -l $WDIR/$tfile && rc=1
do_facet mds${mdt_index} lctl set_param fail_loc=0
[[ $rc -ne 0 ]] && error "stat file should fail"
There’s nothing interesting in the console logs to explain why the file exists. So far, I have seen this failure only in review-dne test sessions. The test failed with this error message a few times last year, stopped failing, and recently started failing again. Here are the most recent failures: |
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 27/Mar/17 ] |
|
Hi Fan Yong, Can you please have a look into this issue? Thanks. |
| Comment by Andreas Dilger [ 27/Mar/17 ] |
|
It looks from the logs (2017-03-06 at least) that the fail_loc check is not being hit. I'm not sure if that is because the file is not being created on the MDS where the fail_loc is set, or possibly the file attributes are cached on the client and the MDS isn't being involved in the lookup. The test itself could be improved a bit:
test_mkdir -p $WDIR
mdt_index=$($LFS getstripe -M $WDIR)
mdt_index=$((mdt_index+1))
touch $WDIR/$tfile
The mdt_index should be taken from the file after it is created instead of from the directory, since the directory is striped 2 ways by default, and "getstripe -M" on a striped directory only returns the stripe0/master index, which isn't necessarily where the inode will be allocated (that depends on the filename and hash function). Also, the client MDC DLM lock cache should be flushed so that the client is sure to do a lookup on the MDS. |
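The two suggestions above could look roughly like the following. This is only a sketch of the idea, not the patch that was later uploaded. It assumes the usual test-framework.sh helpers (test_mkdir, cancel_lru_locks, fail, do_facet, error) and the $LFS, $DIR, $tdir and $tfile variables are already set up by the test script. The function name test_17o_sketch is hypothetical, and flushing the locks before the failover (rather than after it) is also an assumption.

```shell
# Sketch only: relies on Lustre test-framework.sh helpers and variables
# being defined by the caller; test_17o_sketch is a hypothetical name.
test_17o_sketch() {
	local wdir="$DIR/${tdir}o"
	local mdt_index
	local rc=0

	test_mkdir -p "$wdir"
	touch "$wdir/$tfile"

	# Query the MDT index of the file itself, not of the (possibly
	# striped) parent directory, so fail_loc is set on the right MDS.
	mdt_index=$($LFS getstripe -M "$wdir/$tfile")
	mdt_index=$((mdt_index+1))

	# Drop cached MDC DLM locks so the later lookup must go to the MDS.
	# (Placement before the failover is an assumption.)
	cancel_lru_locks mdc

	# Wait for failover to finish before setting fail_loc, so recovery
	# is not disturbed.
	fail mds${mdt_index}

	#define OBD_FAIL_OSD_LMA_INCOMPAT 0x194
	do_facet mds${mdt_index} lctl set_param fail_loc=0x194
	ls -l "$wdir/$tfile" && rc=1
	do_facet mds${mdt_index} lctl set_param fail_loc=0

	# Report an error only if the stat unexpectedly succeeded.
	[ "$rc" -eq 0 ] || error "stat file should fail"
}
```

The only changes relative to the current test are the order of touch and getstripe, the getstripe target, and the added cancel_lru_locks call; the fail_loc handling is unchanged.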
| Comment by Gerrit Updater [ 28/Mar/17 ] |
|
Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/26225 |
| Comment by Gerrit Updater [ 19/Apr/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/26225/ |