Details
Type: Improvement
Resolution: Fixed
Priority: Minor
Description
Recently, at the CEA, we hit an issue re-creating pools after a --writeconf on a big configuration (many targets and pools).
Errors were returned when OSTs were added to a pool too quickly (using separate commands). The workaround is to add a delay between each command.
This was hit with Lustre 2.12.9 on a standalone MGS, with a mounted client (I am not 100% sure).
Since 2.12, there are several patches that could help:
- LU-17182 utils: pool_add send OSTs in one batch
- LU-15706 llog: deal with "SKIP" pool llog records correctly
- LU-14516 mgc: configurable wait-to-reprocess time
- LU-13686 utils: pool_add/remove error code fix
But I found some issues when I tried to understand the "lctl pool_*" commands:
- with a mounted client (MDT and MGT sharing the same node), the sanity check before touching the MGS configuration is done in userspace by checking the client's lov pool parameters. But nothing guarantees those parameters are in sync with the MGS. Only the MGS configuration should be trusted; otherwise this can lead to inconsistencies (e.g. adding an OST to a non-existent pool). This kind of behavior is more likely to be hit when executing several commands in a row (clients have to cancel their config lock and re-read their configuration for each command).
- on a separate MGS (without a mounted client), the MGS configuration is checked in userspace, but with a lot of overhead. e.g: to add one OST, the MGS client configuration (fsname-client) is read 5 times (sanity checks x3 + kernel x1 + result check x1). So when the configuration is big, this takes time. And this use case is not documented.
- "lctl pool_add" and "lctl pool_remove" do not check the ioctl return code from the kernel.
- check_pool_cmd_result() does not re-compute the client wait delay using the mgc_requeue_timeout_min parameter.
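The first issue (validating against the client's cached lov parameters instead of the MGS) can be sketched with a toy model. This is not Lustre code: the classes, the pool name and the OST name are all illustrative, and the real commands are C utilities, but it shows how a stale client-side view produces spurious errors that go away once the client re-reads its configuration (which is why adding delays between commands works around the problem):

```python
class MGS:
    """Stands in for the authoritative MGS configuration."""
    def __init__(self):
        self.pools = {}
    def pool_new(self, name):
        self.pools[name] = set()
    def pool_add(self, name, ost):
        self.pools[name].add(ost)

class Client:
    """Holds a cached copy of the configuration, refreshed lazily."""
    def __init__(self, mgs):
        self.mgs = mgs
        self.cached_pools = {p: set(s) for p, s in mgs.pools.items()}
    def refresh(self):
        # models the client cancelling its config lock and re-reading
        self.cached_pools = {p: set(s) for p, s in self.mgs.pools.items()}

def userspace_pool_add(client, pool, ost):
    # Flawed sanity check: trusts the client's cached parameters,
    # which may lag behind the authoritative MGS configuration.
    if pool not in client.cached_pools:
        raise RuntimeError(f"pool {pool} does not exist (stale client view)")
    client.mgs.pool_add(pool, ost)

mgs = MGS()
client = Client(mgs)      # client caches the config: no pools yet
mgs.pool_new("mypool")    # the pool now exists on the MGS

try:
    userspace_pool_add(client, "mypool", "OST0000")
except RuntimeError as e:
    print(e)              # spurious error: the pool does exist on the MGS

client.refresh()          # after the client re-reads the configuration...
userspace_pool_add(client, "mypool", "OST0000")  # ...the same command works
```

The inverse inconsistency (the cached view still shows a pool the MGS has removed, letting an invalid command through) follows from the same stale snapshot.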
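The second issue (five full reads of fsname-client per OST added) can be counted with another toy model. Again, this is not Lustre code: the read counts come from the description above, and the "share one read across the sanity checks" variant is only an illustrative sketch of a possible reduction, assuming the kernel pass and the result verification still each need a fresh read:

```python
class ConfigSource:
    """Stands in for the fsname-client configuration llog on the MGS."""
    def __init__(self, records):
        self.records = records
        self.reads = 0
    def read_all(self):
        self.reads += 1           # each call is a full (expensive) llog read
        return list(self.records)

def pool_add_current(cfg):
    for _ in range(3):            # sanity checks: 3 separate full reads
        cfg.read_all()
    cfg.read_all()                # kernel-side processing: 1 read
    cfg.read_all()                # verifying the result: 1 read

def pool_add_shared_sanity_read(cfg):
    snapshot = cfg.read_all()     # one read serves all three sanity checks
    for _ in range(3):
        assert snapshot is not None
    cfg.read_all()                # kernel-side processing still reads once
    cfg.read_all()                # result verification needs a fresh read

cfg = ConfigSource(["testfs-OST0000", "testfs-OST0001"])
pool_add_current(cfg)
print(cfg.reads)                  # 5 reads for a single OST addition

cfg = ConfigSource(["testfs-OST0000", "testfs-OST0001"])
pool_add_shared_sanity_read(cfg)
print(cfg.reads)                  # 3 reads
```

On a big configuration each read is proportional to the number of targets, so the per-command cost multiplies quickly when adding many OSTs one command at a time.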
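For the third issue, the generic pattern is simply to check and propagate the ioctl return code rather than drop it. The sketch below uses Python's fcntl.ioctl and a deliberately closed file descriptor to trigger a failure; the real fix belongs in the C utilities, and FIONREAD is used here only as a convenient request that fails cleanly:

```python
import errno
import fcntl
import os
import termios

def checked_ioctl(fd, request, arg=0):
    """Pattern sketch (not the actual lctl code): propagate the ioctl
    failure as a negative errno instead of silently ignoring it."""
    try:
        return fcntl.ioctl(fd, request, arg)
    except OSError as e:
        return -e.errno

# Force a failure with a closed file descriptor: a caller that ignored the
# return code would report success here even though the kernel refused.
r, w = os.pipe()
os.close(r)
os.close(w)
rc = checked_ioctl(r, termios.FIONREAD)
assert rc == -errno.EBADF
```

Ignoring the return code means a failed kernel-side pool operation looks like a success to the user, which makes the timing-dependent failures above even harder to diagnose.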