[LU-16623] lod_statfs_and_check() does not skip unusable OSTs Created: 07/Mar/23  Updated: 23/Jan/24  Resolved: 14/Jun/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.16.0
Fix Version/s: Lustre 2.16.0

Type: Bug Priority: Major
Reporter: Andreas Dilger Assignee: Andreas Dilger
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-16014 sanity test_27M: crash in lod_qos_pre... Resolved
is related to LU-16648 sanity test_27M: crashed in lod_statf... Resolved
is related to LU-12624 DNE3: striped directory allocate stri... Resolved
is related to LU-16981 LBUG in lod_striped_create, fewer str... Resolved
is related to LU-17199 'lfs setstripe -C -1' can be set beyo... Resolved
is related to LU-16578 osp prealloc_status stuck at -11 afte... Closed
is related to LU-16938 "lfs setstripe -C -1" stripes too wid... Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The LOV_USES_DEFAULT_STRIPE flag is never set during OST object allocation, so min_stripe_count() does not allow reducing the stripe count set on a file. This can result in the MDS trying to allocate objects on too many OSTs, especially when the space is imbalanced, or OSTs are deactivated by "osp.*.max_create_count=0".
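
To illustrate why the missing flag matters, here is a minimal sketch of a flag-gated minimum stripe count. It is not the Lustre source: the names `sketch_hint`, `SKETCH_USES_DEFAULT_STRIPE` and `sketch_min_stripe_count()` are hypothetical, and the choice of allowing a reduction to 3/4 of the requested count is an assumption made only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the allocation hint flags (assumed values). */
enum sketch_hint {
	SKETCH_USES_ASSIGNED_STRIPE = 0,
	SKETCH_USES_DEFAULT_STRIPE  = 1,
};

/*
 * Return the smallest stripe count the allocator will accept.
 * Assumption: only layouts that came from a default may be reduced,
 * and then only down to 3/4 of the requested count.
 */
static uint16_t sketch_min_stripe_count(uint16_t stripe_count, int flags)
{
	if (flags & SKETCH_USES_DEFAULT_STRIPE)
		return (uint16_t)(stripe_count - stripe_count / 4);

	/* Flag not set: the full stripe count is required. */
	return stripe_count;
}
```

Under these assumptions, a request for 8 stripes could be satisfied with 6 when the flag is set, but if the flag is never set the second branch is always taken and all 8 OSTs must be usable.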

It also appears that lod_statfs_and_check() is expecting lod_statfs_check() to only return -ENOSPC, so it isn't skipping an OST when it is mounted read-only or when max_create_count=0. It would probably be better to change lod_statfs_and_check() to skip the OST if lod_statfs_check() returns any error, since all of them are reasons to skip an OST.
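
The difference between the two checks can be sketched in standalone C. This is an illustration only, not the Lustre code: `struct ost_state`, `statfs_check()`, `skip_enospc_only()`, `skip_any_error()` and `count_usable()` are hypothetical stand-ins, and the mapping of each condition to a particular errno value is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of what the MDS knows about one OST. */
struct ost_state {
	bool out_of_space;	/* no free space left */
	bool read_only;		/* mounted read-only */
	bool no_precreate;	/* e.g. max_create_count=0 */
};

/*
 * Stand-in for lod_statfs_check(): 0 if the OST is usable, otherwise a
 * negative errno. The errno chosen for each condition is an assumption.
 */
static int statfs_check(const struct ost_state *ost)
{
	if (ost->out_of_space)
		return -ENOSPC;
	if (ost->read_only)
		return -EROFS;
	if (ost->no_precreate)
		return -ENOBUFS;
	return 0;
}

/* Behavior described in the ticket: only -ENOSPC causes a skip. */
static bool skip_enospc_only(const struct ost_state *ost)
{
	return statfs_check(ost) == -ENOSPC;
}

/* Proposed behavior: any error is a reason to skip the OST. */
static bool skip_any_error(const struct ost_state *ost)
{
	return statfs_check(ost) != 0;
}

/* Count the OSTs an allocator using the given skip test would consider. */
static size_t count_usable(const struct ost_state *osts, size_t n,
			   bool (*skip)(const struct ost_state *))
{
	size_t usable = 0;

	for (size_t i = 0; i < n; i++)
		if (!skip(&osts[i]))
			usable++;
	return usable;
}
```

With one healthy OST, one full OST, one read-only OST and one OST with precreation disabled, the first test still treats three OSTs as candidates, while the second leaves only the healthy one.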

Also, it appears that the merged MDT and OST QOS allocation does skip MDTs marked with OS_STATFS_ENOINO.



 Comments   
Comment by Andreas Dilger [ 07/Mar/23 ]

I think I see a larger problem with the way lod_statfs_check() is being handled. Not only did lod_statfs_and_check() only check for -ENOSPC being returned (which did not cover the case where OS_STATFS_NOPRECREATE is reported because max_create_count=0 is set), but for any other return code the OST would be marked inactive (ltd_active = 0) in the later part of this function. That means OST_DESTROY RPCs would not be sent, and in general the MDS would stop sending any RPCs to this OST:

(lod_qos.c:200:lod_statfs_and_check()) Process entered
(osp_dev.c:779:osp_statfs()) testfs-OST0000-osc-MDT0000: 78276 blocks, 76593 free, 69736 avail, 100000 files, 98890 free files
(lod_qos.c:236:lod_statfs_and_check()) testfs-OST0000-osc-MDT0000: turns inactive  *******
(lod_qos.c:263:lod_statfs_and_check()) Process leaving (rc=18446744073709551511 : -105 : ffffffffffffff97)
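
The "Process leaving" line prints one return code three ways: as an unsigned 64-bit decimal, as a signed value, and in hex. A small sketch of the conversion follows; the helper name `rc_from_unsigned()` is hypothetical. On Linux, errno 105 is ENOBUFS.

```c
#include <stdint.h>

/*
 * Reinterpret an unsigned 64-bit value, as printed in the debug log,
 * as a signed return code. Written without relying on
 * implementation-defined unsigned-to-signed conversion.
 */
static int64_t rc_from_unsigned(uint64_t raw)
{
	if (raw > (uint64_t)INT64_MAX)
		return -(int64_t)(UINT64_MAX - raw) - 1;
	return (int64_t)raw;
}
```

Passing 18446744073709551511 (0xffffffffffffff97) yields -105, matching the signed value shown in the log.
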
Comment by Andreas Dilger [ 08/Mar/23 ]

OK, I may have confused ltd_active=0 with deactivating the whole device with "lctl set_param osp.*.active=0". Setting max_create_count=0 does set ltd_active=0 and reduces the value reported by lod.*.activeobd once a file create is attempted on that MDT, which is also what happens when setting "lctl set_param osp.*.active=0".

One open question is why that is done for OS_STATFS_NOPRECREATE but not consistently for OS_STATFS_ENOSPC or OS_STATFS_ENOINO. Also, the ltd_active field is used for other things in the LOD code: for example, it controls whether lod_sync() will send a sync RPC to the OST/MDT, and it controls llog config and set_info calls.

Comment by Gerrit Updater [ 10/Mar/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50250
Subject: LU-16623 lod: handle object allocation consistently
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: eddaa4e00a06d30033b51573e9a7cf3236dbcaf7

Comment by Gerrit Updater [ 14/Jun/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50250/
Subject: LU-16623 lod: handle object allocation consistently
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ced540165ef573570b8a8cba6e43f79e5fc6539f

Comment by Peter Jones [ 14/Jun/23 ]

Landed for 2.16

Comment by Gerrit Updater [ 09/Jan/24 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53628
Subject: LU-16623 tests: interop sanity-flr/202 sanity-pfl/15
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: 3ce12e111df7421699a938a975dc5860f16306d6

Comment by Gerrit Updater [ 23/Jan/24 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53774
Subject: LU-16623 lod: handle object allocation consistently
Project: fs/lustre-release
Branch: b2_15
Current Patch Set: 1
Commit: f31275d578c86de8d1f3ed05b5041b10298eb421
