Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.1.0, Lustre 1.8.6
-
None
-
Old Lustre Version: 1.8.6-wc1
Lustre Build: http://newbuild.whamcloud.com/job/lustre-b1_8/100/
New Lustre Version: master
Lustre Build: http://newbuild.whamcloud.com/job/lustre-master/257/
Rolling upgrade (Lustre servers and clients were upgraded one by one without unmounting the others) from Lustre 1.8.6-wc1 to Lustre master under the following configuration:
OSS1: RHEL5/x86_64
OSS2: RHEL5/x86_64
MDS: RHEL5/x86_64
Client1: RHEL6/x86_64
Client2: RHEL5/x86_64
-
3
-
4080
Description
After upgrading OSS1 (fat-amd-2), running sanity test 27y on Lustre 1.8.6-wc1 clients (client-[12,13]) failed as follows:
== test 27y: create files while OST0 is degraded and the rest inactive == 20:40:14
lustre-OST0001-osc is Deactivate:
open(/mnt/lustre/d0.sanity/d27/f27y0) error: No space left on device
total: 0 creates in 61.40 seconds: 0.00 creates/second
llapi_semantic_traverse: Failed to open '/mnt/lustre/d0.sanity/d27/f27y0': No such file or directory (2)
error: getstripe failed for /mnt/lustre/d0.sanity/d27/f27y0.
sanity test_27y: @@@@@@ FAIL: files created on deactivated OSTs instead of degraded OST
Dumping lctl log to /home/yujian/test_logs/1313465731/sanity.test_27y.*.1313466087.log
Maloo report: https://maloo.whamcloud.com/test_sets/124d3d8a-c7c5-11e0-8d02-52540025f9af
From OST0's dmesg log:
Lustre: DEBUG MARKER: == test 27v: skip object creation on slow OST ================= == 02:28:42
LustreError: 6122:0:(libcfs_fail.h:81:cfs_fail_check_set()) *** cfs_fail_loc=215 ***
LustreError: 6122:0:(libcfs_fail.h:81:cfs_fail_check_set()) *** cfs_fail_loc=215 ***
LustreError: 6122:0:(fail.c:126:__cfs_fail_timeout_set()) cfs_fail_timeout id 705 sleeping for 50000 ms
Lustre: DEBUG MARKER: == test 27w: check lfs setstripe -c -s -i options ============= == 02:29:27
Lustre: DEBUG MARKER: setstripe /mnt/lustre/d0.sanity/d27/f1 -c 1 -i 0
Lustre: DEBUG MARKER: setstripe /mnt/lustre/d0.sanity/d27/f2 -c 2 -i 1
Lustre: DEBUG MARKER: == test 27x: create files while OST0 is degraded == 02:29:29
Lustre: DEBUG MARKER: == test 27y: create files while OST0 is degraded and the rest inactive == 02:29:40
LustreError: 6122:0:(fail.c:130:__cfs_fail_timeout_set()) cfs_fail_timeout id 705 awake
LustreError: 6121:0:(filter.c:3937:filter_precreate()) create failed rc = -28
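Test 27y was affected by test 27v, which set fail_loc to 0x705 to hold up the OST’s object creation. In the log above, the 50000 ms sleep that started during test 27v did not wake until after test 27y had begun, and filter_precreate() then failed with rc = -28 (ENOSPC).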