LU-20079

ldlm: kobject leak when client connect fails before obd_connect completes

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • None
    • None
    • RHEL9.7, RHEL10.1
    • 3

    Description

      When a Lustre client mount fails before obd_connect() completes, cl_mgc_mgsexp stays NULL. lustre_stop_mgc() therefore skips obd_disconnect(), so client_disconnect_export() and ldlm_namespace_free_prior() are never called. client_obd_cleanup() then calls ldlm_namespace_free_post() directly, leaving stale sysfs kobjects under /sys/fs/lustre/ldlm/namespaces/ and /sys/fs/lustre/mgc/. Subsequent mount attempts fail with -EEXIST until the node is rebooted.

      Root cause: client_connect_import() calls class_disconnect() on the error path instead of client_disconnect_export(), skipping ldlm_namespace_free_prior().

      Fix: call ldlm_namespace_free_prior() explicitly in the error path of client_connect_import() after class_disconnect().
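
      The failure and the proposed fix can be illustrated with a small userspace model. This is only a sketch, not Lustre code: every toy_* name, the flat name table standing in for sysfs, and the -ENOTCONN error value are hypothetical. The model simply assumes, as described above, that skipping the free_prior step leaves the sysfs entry behind, so the next registration of the same name returns -EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_KOBJ 8

/* Hypothetical stand-in for a sysfs directory: a flat table of names. */
static const char *toy_sysfs[TOY_MAX_KOBJ];

static void toy_reset(void)
{
    memset(toy_sysfs, 0, sizeof(toy_sysfs));
}

/* Register a name; a duplicate fails with -EEXIST, as a kobject add would. */
static int toy_kobject_add(const char *name)
{
    int free_slot = -1;

    for (int i = 0; i < TOY_MAX_KOBJ; i++) {
        if (toy_sysfs[i] && strcmp(toy_sysfs[i], name) == 0)
            return -EEXIST;
        if (!toy_sysfs[i] && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -ENOMEM;
    toy_sysfs[free_slot] = name;
    return 0;
}

static void toy_kobject_del(const char *name)
{
    for (int i = 0; i < TOY_MAX_KOBJ; i++)
        if (toy_sysfs[i] && strcmp(toy_sysfs[i], name) == 0)
            toy_sysfs[i] = NULL;
}

struct toy_namespace {
    const char *name;
    bool registered;
};

/* Models ldlm_namespace_free_prior(): ASSUMED here to be the step
 * whose absence leaves the sysfs entry behind. */
static void toy_namespace_free_prior(struct toy_namespace *ns)
{
    assert(ns != NULL);
    if (ns->registered) {
        toy_kobject_del(ns->name);
        ns->registered = false;
    }
}

/* Models ldlm_namespace_free_post(): releases the namespace itself
 * but, in this model, does not touch the sysfs table. */
static void toy_namespace_free_post(struct toy_namespace *ns)
{
    assert(ns != NULL);
    ns->name = NULL;
}

/* One mount attempt: setup registers the name, connect may fail,
 * and cleanup always runs free_post on failure. apply_fix adds the
 * explicit free_prior call on the connect error path. */
static int toy_mount_attempt(const char *name, bool fail_connect,
                             bool apply_fix)
{
    struct toy_namespace ns = { NULL, false };
    int rc = toy_kobject_add(name);

    if (rc)
        return rc;              /* -EEXIST when a stale entry exists */
    ns.name = name;
    ns.registered = true;

    if (fail_connect) {
        if (apply_fix)
            toy_namespace_free_prior(&ns);  /* proposed fix */
        toy_namespace_free_post(&ns);       /* cleanup path */
        return -ENOTCONN;       /* hypothetical connect error */
    }
    return 0;
}
```

      Calling toy_mount_attempt("MGC-test", true, false) followed by toy_mount_attempt("MGC-test", false, false) reproduces the -EEXIST on the retry; passing apply_fix = true in the first call lets the retry succeed.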

      Affects: MGC, MDC, and OSC, since all three use client_connect_import() as their ->o_connect() method.

      Verified on RHEL 9.7 and RHEL 10.1.

          People

            hnishida Hiroshi Nishida