[LU-7306] OST reported as good but not used after error Created: 14/Oct/15 Updated: 05/Aug/20 Resolved: 05/Aug/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Paul Kline (Inactive) | Assignee: | chroma triage |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Running on LHC with build https://jenkins.iml.intel.com:8080/job/chroma/7714/ |
||
| Attachments: |
|
| Severity: | 3 |
| Project: | Hydra |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
After an error occurred on OST0003 the OST is reported as good in IML but not used in write operations:

Log error:

Oct 14 05:20:13 lotus-27 pengine[15316]: notice: process_pe_message: Calculated Transition 56: /var/lib/pacemaker/pengine/pe-input-56.bz2
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: te_rsc_command: Initiating action 18: monitor masterfs-OST0003_16129c_monitor_5000 on lotus-27.iml.intel.com (local)
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: te_rsc_command: Initiating action 21: monitor masterfs-OST0007_730503_monitor_5000 on lotus-27.iml.intel.com (local)
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: te_rsc_command: Initiating action 24: monitor masterfs-OST0001_1db880_monitor_5000 on lotus-27.iml.intel.com (local)
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: te_rsc_command: Initiating action 27: monitor masterfs-OST0000_8d9981_monitor_5000 on lotus-26.iml.intel.com
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: te_rsc_command: Initiating action 28: start masterfs-OST0005_49bf3b_start_0 on lotus-27.iml.intel.com (local)
Oct 14 05:20:13 lotus-27.iml.intel.com kernel: Lustre: masterfs-OST0001: precreate FID 0x0:275010873 is over 100000 larger than the LAST_ID 0x0:0, only precreating the last 10000 objects.
Oct 14 05:20:13 lotus-27.iml.intel.com kernel: LustreError: 39256:0:(ost_handler.c:170:ost_validate_obdo()) masterfs-OST0003: client 10.14.80.179@tcp sent bad object 0x0:0: rc = -71
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: process_lrm_event: Operation masterfs-OST0007_730503_monitor_5000: ok (node=lotus-27.iml.intel.com, call=42, rc=0, cib-update=113, confirmed=false)
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: process_lrm_event: Operation masterfs-OST0001_1db880_monitor_5000: ok (node=lotus-27.iml.intel.com, call=43, rc=0, cib-update=114, confirmed=false)
Oct 14 05:20:13 lotus-27 crmd[15317]: notice: process_lrm_event: Operation masterfs-OST0003_16129c_monitor_5000: ok (node=lotus-27.iml.intel.com, call=41, rc=0, cib-update=115, confirmed=false)
Oct 14 05:20:14 lotus-27.iml.intel.com kernel: LDISKFS-fs (dm-8): mounted filesystem with ordered data mode. quota=on. Opts:

Output of LFS DF:

[root@lotus-21vm9 ~]# lfs df
UUID                    1K-blocks      Used   Available Use% Mounted on
masterfs-MDT0000_UUID   491695680     81228   458689168   0% /mnt/masterfs[MDT:0]
masterfs-OST0000_UUID   653933816   3217728   617782340   1% /mnt/masterfs[OST:0]
masterfs-OST0001_UUID   653933816   3217984   617781060   1% /mnt/masterfs[OST:1]
masterfs-OST0002_UUID   653933816   3216704   617783364   1% /mnt/masterfs[OST:2]
masterfs-OST0003_UUID   653933816     71860   620936672   0% /mnt/masterfs[OST:3]
masterfs-OST0004_UUID   653933816   3218752   617781316   1% /mnt/masterfs[OST:4]
masterfs-OST0005_UUID   653933816   3216704   617783364   1% /mnt/masterfs[OST:5]
masterfs-OST0006_UUID   653933816   3218752   617781316   1% /mnt/masterfs[OST:6]
masterfs-OST0007_UUID   653933816   3217728   617782340   1% /mnt/masterfs[OST:7]

filesystem summary:    5231470528  22596212  4945411772   0% /mnt/masterfs |
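The imbalance in the lfs df output above (OST0003 holds 71860 KiB used while every other OST holds about 3.2 million) can be checked mechanically. The following is a minimal sketch, not part of IML or Lustre: the function names parse_ost_usage and underused_osts and the 10% threshold are hypothetical choices made for illustration, and the sample text is copied from the output in this ticket.

```python
import re
from statistics import median

# Sample copied from the `lfs df` output in this ticket.
LFS_DF_SAMPLE = """\
UUID 1K-blocks Used Available Use% Mounted on
masterfs-MDT0000_UUID 491695680 81228 458689168 0% /mnt/masterfs[MDT:0]
masterfs-OST0000_UUID 653933816 3217728 617782340 1% /mnt/masterfs[OST:0]
masterfs-OST0001_UUID 653933816 3217984 617781060 1% /mnt/masterfs[OST:1]
masterfs-OST0002_UUID 653933816 3216704 617783364 1% /mnt/masterfs[OST:2]
masterfs-OST0003_UUID 653933816 71860 620936672 0% /mnt/masterfs[OST:3]
masterfs-OST0004_UUID 653933816 3218752 617781316 1% /mnt/masterfs[OST:4]
masterfs-OST0005_UUID 653933816 3216704 617783364 1% /mnt/masterfs[OST:5]
masterfs-OST0006_UUID 653933816 3218752 617781316 1% /mnt/masterfs[OST:6]
masterfs-OST0007_UUID 653933816 3217728 617782340 1% /mnt/masterfs[OST:7]
filesystem summary: 5231470528 22596212 4945411772 0% /mnt/masterfs
"""

# Matches OST rows only; the header, MDT row, and summary row are skipped.
OST_ROW = re.compile(
    r"^\S+_UUID\s+(\d+)\s+(\d+)\s+(\d+)\s+\d+%\s+\S+\[OST:(\d+)\]$"
)


def parse_ost_usage(text):
    """Return {ost_index: used_kib} for every OST row in `lfs df` output."""
    usage = {}
    for line in text.splitlines():
        match = OST_ROW.match(line.strip())
        if match:
            usage[int(match.group(4))] = int(match.group(2))
    return usage


def underused_osts(usage, ratio=0.1):
    """List OST indices whose used space is below `ratio` times the median.

    The 0.1 default is an arbitrary illustrative threshold, not a Lustre value.
    """
    if not usage:
        return []
    typical = median(usage.values())
    return sorted(idx for idx, used in usage.items() if used < ratio * typical)


if __name__ == "__main__":
    print(underused_osts(parse_ost_usage(LFS_DF_SAMPLE)))
```

Run against the sample, the script flags only index 3. A low Used value by itself does not prove the OST is being skipped at allocation time; it only confirms what the table shows at a glance.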
| Comments |
| Comment by Paul Kline (Inactive) [ 14/Oct/15 ] |
|
Re-set striping and checked OST status, OST0003 is still not being used:

[root@lotus-21vm9 ~]# lfs check osts
masterfs-OST0002-osc-ffff88007dbf6c00: active
masterfs-OST0001-osc-ffff88007dbf6c00: active
masterfs-OST0006-osc-ffff88007dbf6c00: active
masterfs-OST0003-osc-ffff88007dbf6c00: active
masterfs-OST0004-osc-ffff88007dbf6c00: active
masterfs-OST0007-osc-ffff88007dbf6c00: active
masterfs-OST0000-osc-ffff88007dbf6c00: active
masterfs-OST0005-osc-ffff88007dbf6c00: active |
| Comment by Brad Hoagland (Inactive) [ 15/Oct/15 ] |
|
HYD-Triage: Needs to be converted to an LDEV ticket. |