[LU-3878] sanity-benchmark test fsx: Bus error Created: 04/Sep/13 Updated: 13/Oct/21 Resolved: 23/Nov/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1, Lustre 2.5.0, Lustre 2.6.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Jian Yu | Assignee: | Oleg Drokin |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Lustre build: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1) |
||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 10059 |
| Description |
|
sanity-benchmark test fsx failed as follows:

== sanity-benchmark test fsx: fsx ==================================================================== 22:15:58 (1378185358)
debug=0
Using: fsx -c 50 -p 1000 -S 29278 -P /tmp -l 206139 -N 100000 /mnt/lustre/f0.fsxfile
Chance of close/open is 1 in 50
Seed set to 29278
truncating to largest ever: 0xd3af
/usr/lib64/lustre/tests/sanity-benchmark.sh: line 186: 12471 Bus error $CMD
 sanity-benchmark test_fsx: @@@@@@ FAIL: fsx failed

Maloo report: https://maloo.whamcloud.com/test_sets/becb9218-14ef-11e3-ac48-52540035b04c

This is a regression on Lustre b2_4 branch. |
| Comments |
| Comment by Keith Mannthey (Inactive) [ 04/Sep/13 ] |
|
This looks related to https://jira.hpdd.intel.com/browse/LU-2909 an earlier 2.4 blocker. |
| Comment by Andreas Dilger [ 04/Sep/13 ] |
|
Sorry, it seems |
| Comment by Jian Yu [ 05/Sep/13 ] |
|
Lustre build: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1) sanity-benchmark test fsx also hit the same failure: FYI, here is the query result of sanity-benchmark test_fsx with "FAIL" status on Lustre b2_4 branch: |
| Comment by Jian Yu [ 05/Sep/13 ] |
|
By searching on Maloo, I found that test fsx passed on FC18 on Lustre b2_4 build #40 and previous builds. Builds #41, #42, #43 were not tested on FC18. It seems that the culprit is in build #42. |
| Comment by Jian Yu [ 09/Sep/13 ] |
|
After patch http://review.whamcloud.com/7481 was reverted from Lustre b2_4 branch, the failure did not occur on Lustre 2.4.1 RC2. |
| Comment by Jian Yu [ 02/Nov/13 ] |
|
Lustre build: http://build.whamcloud.com/job/lustre-b2_4/47/ sanity-benchmark test fsx hit the same failure again: |
| Comment by Oleg Drokin [ 18/Nov/13 ] |
|
Is this only happening on zfs? Only on b2_4, but not on master? |
| Comment by Jian Yu [ 18/Nov/13 ] |
|
Here is the search result on Maloo: The failure occurred not only on zfs and b2_4, but also on ldiskfs and master/b2_5: |
| Comment by Jinshan Xiong (Inactive) [ 23/Nov/13 ] |
|
The failure of fsx is probably fallout from the previous failure of iozone. That run used up all disk space on the OSTs, so the client had no grants left, which made mkwrite() fail. |
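If an earlier test can fill the OSTs, a coarse free-space check before starting fsx may help tell this case apart from a real regression. The following is only a sketch: the helper names and any threshold are hypothetical, and `os.statvfs` reports aggregate free space for the mount point, not the per-OST grants held by the client, so a passing check does not guarantee that mkwrite() will succeed.

```python
import os


def free_bytes(path):
    """Return the bytes available to unprivileged users on the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def enough_space(path, min_free_bytes):
    """Return True when at least min_free_bytes are available (hypothetical threshold)."""
    return free_bytes(path) >= min_free_bytes
```

For example, `enough_space("/mnt/lustre", 1 << 30)` could gate the fsx run on roughly 1 GiB of free space; both the path (taken from the description) and the threshold are illustrative.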
| Comment by Oleg Drokin [ 26/Nov/13 ] |
|
But why does it stop after the patches are reverted? |
| Comment by Jinshan Xiong (Inactive) [ 26/Nov/13 ] |
|
The previous iozone run used up all the space. I can't connect this symptom to that patch. From what I have seen so far, you reverted that patch on Sep 24, but the failure still occurred after that. |
| Comment by Jian Yu [ 05/Dec/13 ] |
|
Yes, the failure still occurred on the latest Lustre b2_4 branch with FSTYPE=zfs:

In the above test reports, all of the iozone tests failed as follows:

write: No space left on device

or

Write error No space left on device (rc = -1, len = 4194304)

So, it seems that the out-of-space failure of iozone caused the fsx failure. |
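The correlation described above could be checked mechanically when triaging test reports. The sketch below assumes the iozone and fsx output are available as plain strings; the function name is hypothetical and the patterns are taken only from the messages quoted in this ticket, so other failure signatures are not covered.

```python
import re

# Messages quoted in this ticket: iozone's out-of-space errors and fsx's crash.
ENOSPC_PATTERN = re.compile(r"No space left on device")
BUS_ERROR_PATTERN = re.compile(r"Bus error")


def fsx_failure_likely_from_enospc(iozone_log, fsx_log):
    """Return True when fsx hit a Bus error and the preceding iozone run reported ENOSPC."""
    fsx_crashed = BUS_ERROR_PATTERN.search(fsx_log) is not None
    iozone_out_of_space = ENOSPC_PATTERN.search(iozone_log) is not None
    return fsx_crashed and iozone_out_of_space


iozone_sample = "Write error No space left on device (rc = -1, len = 4194304)"
fsx_sample = "sanity-benchmark.sh: line 186: 12471 Bus error $CMD"
likely_fallout = fsx_failure_likely_from_enospc(iozone_sample, fsx_sample)
```

A True result only indicates that the two symptoms appear together in one test session; it does not by itself prove that the out-of-space condition caused the Bus error.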