[LUDOC-520] Minor detail in section 14.9.3 (Removing an OST from the File System) Created: 17/Nov/23  Updated: 17/Nov/23

Status: Open
Project: Lustre Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Shane Nehring Assignee: Lustre Manual Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Lustre 2.15.2


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This is an extremely minor issue that I bring up only because it confused me for a bit.

The manual outlines the steps necessary to completely remove an OST from a file system. It explicitly says to delete any attach, setup, add_osc, add_pool, and other records. The example shows the add_uuid, attach, setup, and add_osc records being cancelled from the logs.

However, if you followed the entire section, you likely also have a conf_param record that sets osc.active=0 for the OST(s). If, like me, you didn't include those records in the list of llog_cancel commands you ran, you'd run into errors like:

LustreError: 4459:0:(obd_config.c:1526:class_process_config()) no device for: work-OST0000-osc-ffff93ebd7f3e000
LustreError: 4459:0:(obd_config.c:1998:class_config_llog_handler()) MGC172.16.200.250@o2ib: cfg command failed: rc = -22
Lustre:    cmd=cf00f 0:work-OST0000-osc  1:osc.active=0  
LustreError: 15b-f: MGC172.16.200.250@o2ib: Configuration from log work-client failed from MGS -22. Check client and MGS are on compatible version.
Lustre: Unmounted work-client
LustreError: 4447:0:(super25.c:182:lustre_fill_super()) llite: Unable to mount <unknown>: rc = -22

when it came time to have a client mount the file system again.

It might be good to change the wording to "...all records related to the removed OST(s)", or to include "conf_param" in the list of records that should also be removed, just to be as clear as possible. I don't disagree that this is already implied by "other records" in the existing documentation, only that a flu-addled admin (such as my current self) might assume the conf_param record should be retained because it wasn't explicitly called out.
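
To illustrate the "all records related to the removed OST(s)" wording, here is a minimal Python sketch that scans saved llog_print output for every record naming a given OST, whatever its event type. It assumes the one-record-per-line "{ index: N, event: ..., ... }" layout used in the manual's example. The sample records, NIDs, and the helper name records_for_ost are hypothetical and not taken from a real system.

```python
import re


def records_for_ost(llog_text, ost_name):
    """Return (index, event) pairs for every record that mentions ost_name.

    Assumes one record per line, each containing "index: N" and
    "event: NAME" fields, as in the manual's llog_print example.
    """
    hits = []
    for line in llog_text.splitlines():
        if ost_name not in line:
            continue
        # The first "index:" on the line is the record index; add_osc
        # records carry a second "index:" field for the OST index.
        idx = re.search(r"\bindex:\s*(\d+)", line)
        evt = re.search(r"\bevent:\s*(\w+)", line)
        if idx and evt:
            hits.append((int(idx.group(1)), evt.group(1)))
    return hits


# Hypothetical sample in the assumed format, not real output.
SAMPLE = """\
- { index: 21, event: add_uuid, nid: 172.16.200.11@o2ib, node: 172.16.200.11@o2ib }
- { index: 22, event: attach, device: work-OST0000-osc, type: osc, UUID: work-clilov_UUID }
- { index: 23, event: setup, device: work-OST0000-osc, UUID: work-OST0000_UUID, node: 172.16.200.11@o2ib }
- { index: 24, event: add_osc, device: work-clilov, ost: work-OST0000_UUID, index: 0, gen: 1 }
- { index: 30, event: conf_param, device: work-OST0000-osc, parameter: osc.active=0 }
- { index: 31, event: attach, device: work-OST0001-osc, type: osc, UUID: work-clilov_UUID }
"""

if __name__ == "__main__":
    for index, event in records_for_ost(SAMPLE, "work-OST0000"):
        print(index, event)
```

In this sample the conf_param record at index 30 is picked up alongside attach, setup, and add_osc. The add_uuid record is not, because it names only a NID, so a name-based filter like this one would still need a manual check of the records just before each attach.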

