
makes "lctl pool_*" more reliable for big configurations


Details

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.16.0

    Description

      Recently, at the CEA, we hit an issue when re-creating pools after a --writeconf on a big configuration (many targets and pools).

      Errors were returned when adding OSTs to a pool too quickly (using separate commands). The workaround is to add a delay between each command, as sketched below.
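
      A minimal sketch of that workaround, assuming a hypothetical file system "testfs" with 32 OSTs and a pool "pool1" (names, OST count and delays are illustrative only):

          # Hypothetical example: re-create a pool and add the OSTs one by one,
          # sleeping between commands so that the MGS and the clients have time
          # to process each configuration change.
          FSNAME=testfs
          POOL=pool1

          lctl pool_new ${FSNAME}.${POOL}
          sleep 2

          for idx in $(seq 0 31); do
                  lctl pool_add ${FSNAME}.${POOL} ${FSNAME}-OST$(printf '%04x' "${idx}")
                  sleep 2      # workaround: delay between each pool_add
          done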

      This was hit with Lustre 2.12.9 on a standalone MGS, with a client mounted (I am not 100% sure about that).
      Since 2.12, there are several patches that could help:

      • LU-17182 utils: pool_add send OSTs in one batch
      • LU-15706 llog: deal with "SKIP" pool llog records correctly
      • LU-14516 mgc: configurable wait-to-reprocess time
      • LU-13686 utils: pool_add/remove error code fix

      But I found several issues while trying to understand the "lctl pool_*" commands:

      1. with a client mounted (MDT and MGT sharing the same node), the sanity check before touching the MGS configuration is done in userspace by reading the client lov pool parameters. But nothing guarantees those parameters are in sync with the MGS. Only the MGS configuration should be trusted, otherwise this can lead to inconsistencies (e.g. adding an OST to a non-existent pool). I think this kind of behavior is more likely to be hit when executing several commands in a row, because clients have to cancel their config lock and re-read their configuration for each command (see the sketch after this list).
      2. on a separate MGS (without a client mounted), the MGS configuration is checked in userspace. But there is a lot of overhead: e.g. to add one OST, the MGS client configuration (fsname-client) is read 5 times (sanity checks x3 + kernel x1 + result check x1). So when the configuration is big, this takes time. And this use case is not documented.
      3. "lctl pool_add/pool_remove" do not check the return code of the kernel ioctl.
      4. check_pool_cmd_result() does not re-compute the client wait delay using the mgc_requeue_timeout_min parameter.
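
      To illustrate points 1 and 4, here is a rough sketch of how the client view can be compared with what the pool commands report, and how the reprocess delay can be checked. The fsname and pool name are hypothetical, and the exact parameter paths may differ between Lustre versions:

          FSNAME=testfs
          POOL=pool1

          # Point 1: pool contents as cached in the client lov parameters
          # (this is what the userspace sanity check relies on).
          lctl get_param lov.${FSNAME}-*.pools.${POOL}

          # Pool contents as reported by "lctl pool_list" (on a standalone MGS
          # this is read from the MGS configuration).
          lctl pool_list ${FSNAME}.${POOL}

          # Point 4: the client wait delay should account for the mgc
          # reprocess timeout (module parameter added by LU-14516; the exact
          # path below is an assumption).
          cat /sys/module/mgc/parameters/mgc_requeue_timeout_min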


People

    Assignee: Etienne Aujames
    Reporter: Etienne Aujames
    Votes: 0
    Watchers: 7
