Details
Bug
Resolution: Unresolved
Minor
Lustre 2.12.0, Lustre 2.14.0
Description
This issue was created by maloo for James Nunez <james.a.nunez@intel.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f888d9f2-8d67-11e8-87f3-52540065bddc
test_cascading_rw failed with the following error:
cascading_rw failed! 1
In this failure, cascading_rw runs several iterations of writing to a file (104 in this case), then hits a problem and the write returns -1. From the test_log, we see:
23:41:23: Running test #/usr/lib64/lustre/tests/cascading_rw(iter 104)
23:41:23: Process 0 (trevis-9vm1.trevis.whamcloud.com) FAILED in cascading_rw.c:150:rw_file() write of file /mnt/lustre/d0.cascading_rw/cascading_rw return -1
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
The only interesting output in the console or dmesg logs is in the logs for the client running the test. The client console log contains the following message about the old ioctl, but this shouldn’t be causing any issues ... should it?
[88550.129323] Lustre: DEBUG MARKER: == parallel-scale test cascading_rw: cascading_rw ==================================================== 23:40:10 (1532216410)
[88550.536177] Lustre: cascading_rw: using old ioctl(LL_IOC_LOV_GETSTRIPE) on [0x20006bacf:0x10e05:0x0], use llapi_layout_get_by_path()
[88623.058629] Lustre: DEBUG MARKER: /usr/sbin/lctl mark parallel-scale test_cascading_rw: @@@@@@ FAIL: cascading_rw failed! 1
The initial thought is that we are filling the file system. We need to add some debug logging to confirm this, and then we can clean up the message in run_cascading_rw() in functions.sh:
# FIXME
# Need space estimation here.
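One possible shape for the missing space estimation and the debug logging is sketched below. This is only an illustration: avail_kb, have_space, and check_space_or_skip are hypothetical names that do not exist in functions.sh, the 10% safety margin is an arbitrary assumption, and the estimate of how much space cascading_rw needs is left to the caller. It uses plain POSIX df rather than any Lustre-specific tool.

```shell
#!/bin/bash
# Hypothetical helpers; none of these names exist in functions.sh.

# Print the available space, in KB, of the file system holding $1.
# With "df -P -k" the value is column 4 of the second output line.
avail_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 if need_kb fits in avail_kb after reserving a safety
# margin (in percent, assumed default 10), non-zero otherwise.
have_space() {
    local avail=$1 need=$2 margin_pct=${3:-10}
    local usable=$(( avail * (100 - margin_pct) / 100 ))
    [ "$need" -le "$usable" ]
}

# Log the free space for later debugging, then report whether the
# estimated requirement fits. $1 = directory, $2 = estimated KB needed.
check_space_or_skip() {
    local dir=$1 need=$2 free
    free=$(avail_kb "$dir")
    echo "free space on $dir: ${free} KB, estimated need: ${need} KB"
    have_space "$free" "$need"
}
```

A caller could run something like check_space_or_skip "$testdir" "$need_kb" before starting the test, where both variables are placeholders, and skip the test when it returns non-zero. The logged line would also show whether the file system was close to full when iteration 104 failed.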
It’s hard to tell exactly when this started, but the failure first appears around 2018-07-19.
Here are a few other logs for this failure:
https://testing.whamcloud.com/test_sets/c36b7bf8-8b55-11e8-9028-52540065bddc
https://testing.whamcloud.com/test_sets/e22eba50-8dad-11e8-87f3-52540065bddc
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
parallel-scale test_cascading_rw - cascading_rw failed! 1