[LU-594] 1.8<->2.1 interop: sanity test_27y: FAIL: files created on deactivated OSTs instead of degraded OST Created: 16/Aug/11 Updated: 29/Aug/11 Resolved: 29/Aug/11 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0, Lustre 1.8.6 |
| Fix Version/s: | Lustre 2.1.0, Lustre 1.8.7 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Jian Yu | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Old Lustre Version: 1.8.6-wc1
New Lustre Version: master
Rolling upgrade (Lustre servers and clients were upgraded one by one without unmounting the others) from Lustre 1.8.6-wc1 to Lustre master under the following configuration:
OSS1: RHEL5/x86_64 |
||
| Severity: | 3 |
| Rank (Obsolete): | 4080 |
| Description |
|
After upgrading OSS1 (fat-amd-2), running sanity test 27y on Lustre 1.8.6-wc1 clients (client-[12,13]) failed as follows:
== test 27y: create files while OST0 is degraded and the rest inactive == 20:40:14
lustre-OST0001-osc is Deactivate:
open(/mnt/lustre/d0.sanity/d27/f27y0) error: No space left on device
total: 0 creates in 61.40 seconds: 0.00 creates/second
llapi_semantic_traverse: Failed to open '/mnt/lustre/d0.sanity/d27/f27y0': No such file or directory (2)
error: getstripe failed for /mnt/lustre/d0.sanity/d27/f27y0.
sanity test_27y: @@@@@@ FAIL: files created on deactivated OSTs instead of degraded OST
Dumping lctl log to /home/yujian/test_logs/1313465731/sanity.test_27y.*.1313466087.log
Maloo report: https://maloo.whamcloud.com/test_sets/124d3d8a-c7c5-11e0-8d02-52540025f9af |
| Comments |
| Comment by Zhenyu Xu [ 16/Aug/11 ] |
|
OST1 debug log:
00002000:00020000:15.0:1313466086.875951:0:11221:0:(filter.c:3937:filter_precreate()) create failed rc = -28
OST1 has no free files (inodes) left to create objects, which causes the error. |
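The rc = -28 in the filter_precreate() message is a negative errno value. The sketch below is only an illustration for decoding such return codes on a Linux host; the helper name describe_rc is hypothetical and is not part of Lustre.

```python
import errno
import os


def describe_rc(rc: int) -> str:
    """Translate a negative kernel-style return code into an errno name and message.

    Hypothetical helper for reading log lines such as "create failed rc = -28".
    """
    code = abs(rc)
    # errno.errorcode maps numeric codes to symbolic names on the local platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# On Linux, 28 is ENOSPC, which matches the "No space left on device"
# error reported by the client in the test output.
print(describe_rc(-28))
```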
| Comment by Jian Yu [ 16/Aug/11 ] |
|
Before running sanity test 27:
---------------- client-[12-13] ----------------
UUID                   bytes     Used  Available  Use%  Mounted on
lustre-MDT0000_UUID     3.3G   163.9M       3.0G    5%  /mnt/lustre[MDT:0]
lustre-OST0000_UUID     9.4G   409.7M       8.5G    4%  /mnt/lustre[OST:0]
lustre-OST0001_UUID     9.4G   409.7M       8.5G    4%  /mnt/lustre[OST:1]
filesystem summary:    18.8G   819.3M      17.0G    4%  /mnt/lustre

UUID                  Inodes    IUsed      IFree  IUse%  Mounted on
lustre-MDT0000_UUID   832967       28     832939     0%  /mnt/lustre[MDT:0]
lustre-OST0000_UUID   625856       58     625798     0%  /mnt/lustre[OST:0]
lustre-OST0001_UUID   625856       56     625800     0%  /mnt/lustre[OST:1]
filesystem summary:   832967       28     832939     0%  /mnt/lustre

After running sanity test 27:
---------------- client-[12-13] ----------------
UUID                   bytes     Used  Available  Use%  Mounted on
lustre-MDT0000_UUID     3.3G   164.1M       3.0G    5%  /mnt/lustre[MDT:0]
lustre-OST0000_UUID     9.4G   425.0M       8.5G    5%  /mnt/lustre[OST:0]
lustre-OST0001_UUID     9.4G   409.7M       8.5G    4%  /mnt/lustre[OST:1]
filesystem summary:    18.8G   834.7M      17.0G    5%  /mnt/lustre

UUID                  Inodes    IUsed      IFree  IUse%  Mounted on
lustre-MDT0000_UUID   833019      121     832898     0%  /mnt/lustre[MDT:0]
lustre-OST0000_UUID   625856   625856          0   100%  /mnt/lustre[OST:0]
lustre-OST0001_UUID   625856      114     625742     0%  /mnt/lustre[OST:1]
filesystem summary:   833019      121     832898     0%  /mnt/lustre

IFree became 0 on lustre-OST0000_UUID. |
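To spot this condition without reading the table by eye, the inode listing can be scanned for targets whose IFree column is 0. This is a minimal sketch that assumes the six-column layout shown in this comment (UUID, Inodes, IUsed, IFree, IUse%, mount point); the function name exhausted_targets is hypothetical, and the sample rows are copied from the "after" listing.

```python
from typing import List

# Inode rows copied from the "After running sanity test 27" listing.
SAMPLE = """\
UUID Inodes IUsed IFree IUse% Mounted on
lustre-MDT0000_UUID 833019 121 832898 0% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 625856 625856 0 100% /mnt/lustre[OST:0]
lustre-OST0001_UUID 625856 114 625742 0% /mnt/lustre[OST:1]
filesystem summary: 833019 121 832898 0% /mnt/lustre
"""


def exhausted_targets(inode_listing: str) -> List[str]:
    """Return the UUIDs of targets whose IFree column is 0."""
    exhausted = []
    for line in inode_listing.splitlines():
        fields = line.split()
        # Skip the header and the summary row; target rows start with "<name>_UUID".
        if len(fields) < 6 or not fields[0].endswith("_UUID"):
            continue
        if int(fields[3]) == 0:  # fields[3] is the IFree column
            exhausted.append(fields[0])
    return exhausted


print(exhausted_targets(SAMPLE))
```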
| Comment by Jian Yu [ 16/Aug/11 ] |
|
Please take a look at this report: https://maloo.whamcloud.com/test_sets/7a2423ee-c7e6-11e0-8d02-52540025f9af |
| Comment by Zhenyu Xu [ 16/Aug/11 ] |
|
Observed that the MDS keeps issuing object create requests to OST0. Each time, the requested object ID is 32 objects beyond the last ID on OST0, until the inode space on OST0 is exhausted.
00000100:00100000:4.0:1313485047.653387:0:19932:0:(service.c:1705:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_creat_01:lustre-mdtlov_UUID+5:5717:x1377267856994752:12345-10.10.4.132@tcp:5
00000100:00100000:4.0:1313485047.655668:0:19932:0:(service.c:1705:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_creat_01:lustre-mdtlov_UUID+5:5717:x1377267856994753:12345-10.10.4.132@tcp:5 |
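As a rough illustration of why this drains the OST, the toy model below repeats create requests of 32 objects each until no free inodes remain. It is not Lustre code: the function name and the loop are assumptions made for illustration, the batch size of 32 comes from the observation above, and the starting value 625798 is the IFree figure reported for lustre-OST0000_UUID before the test.

```python
from typing import Tuple


def simulate_runaway_precreate(free_inodes: int, batch: int = 32) -> Tuple[int, int]:
    """Toy model: keep requesting `batch` more objects until the inodes run out.

    Returns (number of create requests issued, number of objects created).
    """
    requests = 0
    created = 0
    while free_inodes > 0:
        # The last request can only create as many objects as inodes remain.
        n = min(batch, free_inodes)
        free_inodes -= n
        created += n
        requests += 1
    return requests, created


requests, created = simulate_runaway_precreate(free_inodes=625798)
print(requests, created)
```

Under these assumptions, every free inode ends up consumed by precreated objects, which is consistent with IFree dropping to 0 on OST0.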
| Comment by Zhenyu Xu [ 16/Aug/11 ] |
|
Yu Jian, the MDS log shows little useful information. Would you mind setting the MDS debug level to -1 and trying to reproduce it? I suspect that the oscc precreate on the MDS got an erroneous/incompatible reply and kept looping until there was no inode space left on OST0. |
| Comment by Jian Yu [ 17/Aug/11 ] |
Please look into the following report: I used the latest master code (Jenkins build #259). |
| Comment by Zhenyu Xu [ 18/Aug/11 ] |
|
From OST0's dmesg log:
Lustre: DEBUG MARKER: == test 27v: skip object creation on slow OST ================= == 02:28:42
test 27y was affected by test 27v, which set fail_loc to 0x705 to hold up object creation on the OST. |
| Comment by Zhenyu Xu [ 18/Aug/11 ] |
|
b1_8 patch tracking at http://review.whamcloud.com/1263 |
| Comment by Build Master (Inactive) [ 29/Aug/11 ] |
|
Integrated in Johann Lombardi : b113af75053c721c2540b1e4cac28599ddf84e22
|
| Comment by Build Master (Inactive) [ 29/Aug/11 ] |
|
Integrated in Oleg Drokin : ce95c918eb48bdb5fb910f2d75062fbca27ddc47
|