Lustre / LU-20160

Crash in device_free() when sysfs registration fails


Details

    • Bug
    • Resolution: Unresolved
    • Medium
    • None
    • Lustre 2.18.0
    • None
    • 3

    Description

      It looks like the LU-18162 series of patches, which converted various components to LU devices, introduced a crash on the cleanup path when device allocation fails for any reason.

      This was noticed when a bug introduced an attempt to double-register a sysfs name, for mdc in particular.

      Trivial reproduction with this patch:

      diff --git a/lustre/ldlm/ldlm_resource.c b/lustre/ldlm/ldlm_resource.c
      index ff81d55377..2cac972b25 100644
      --- a/lustre/ldlm/ldlm_resource.c
      +++ b/lustre/ldlm/ldlm_resource.c
      @@ -1023,6 +1023,11 @@ struct ldlm_namespace *ldlm_namespace_new(struct obd_device *obd, char *name,
           ns->ns_lock_cache_policy = LDLM_LOCK_CACHE_LFRU;
           ns->ns_lock_cache_ops = &ldlm_lfru_cache_ops;
       
      +    if (!strncmp(name, "lustre-MDT0000-mdc-", 19)) {
      +        CERROR("injected sysfs registration failure for %s\n", name);
      +        GOTO(out_hash, rc = -17);
      +    }
      +
           rc = ldlm_namespace_sysfs_register(ns);
           if (rc) {
               CERROR("%s: cannot initialize ns sysfs: rc = %d\n", name, rc);
      

      This crashes like this:

      [ 3782.402249] BUG: unable to handle page fault for address: 000000000000142a
      [ 3782.402254] #PF: supervisor read access in kernel mode
      [ 3782.402257] #PF: error_code(0x0000) - not-present page
      [ 3782.402272] PGD 0 P4D 0 
      [ 3782.402284] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [ 3782.402290] CPU: 0 PID: 102409 Comm: llog_process_th Kdump: loaded Tainted: G           OE     -------  ---  5.14.0rocky96-debug #3
      [ 3782.402296] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-8.fc42 06/10/2025
      [ 3782.402300] RIP: 0010:mdc_device_free+0xc/0x160 [mdc]
      [ 3782.402375] Code: fd 00 00 00 00 04 00 e8 22 49 62 ff 48 c7 c7 60 37 6b c1 e8 b6 1a 62 ff 45 31 e4 eb bf 90 0f 1f 44 00 00 41 54 55 4c 8b 66 28 <66> 41 83 bc 24 2a 14 00 00 00 75 21 48 89 f5 48 85 f6 75 49 31 ff
      [ 3782.402393] RSP: 0018:ffffb2cbc9f9fb80 EFLAGS: 00010282
      [ 3782.402401] RAX: 00000000ffffffef RBX: ffff94f3c433ca10 RCX: ffff94f3c6250000
      [ 3782.402406] RDX: 0000000000000000 RSI: ffff94f455f3a300 RDI: ffffb2cbc9f9fbf8
      [ 3782.402409] RBP: 00000000ffffffef R08: ffff94f3cc79e000 R09: 0000000080080006
      [ 3782.402412] R10: 00000000ffffffef R11: 0000000000000000 R12: 0000000000000000
      [ 3782.402415] R13: ffffb2cbc9f9fbf8 R14: ffffffffc16b3f20 R15: ffff94f3c433cdb8
      [ 3782.402432] FS:  0000000000000000(0000) GS:ffff94f502000000(0000) knlGS:0000000000000000
      [ 3782.402438] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 3782.402442] CR2: 000000000000142a CR3: 0000000086862006 CR4: 0000000000172ef0
      [ 3782.402453] Call Trace:
      [ 3782.402469]  <TASK>
      [ 3782.402476]  ? show_trace_log_lvl+0x1e1/0x31b
      [ 3782.402495]  ? show_trace_log_lvl+0x1e1/0x31b
      [ 3782.402516]  ? mdc_device_alloc+0x16c/0x260 [mdc]
      [ 3782.402587]  ? __die_body.cold+0x8/0xd
      [ 3782.402599]  ? page_fault_oops+0xac/0x150
      [ 3782.402609]  ? kernelmode_fixup_or_oops+0x84/0x110
      [ 3782.402625]  ? exc_page_fault+0x6f/0x190
      [ 3782.402640]  ? asm_exc_page_fault+0x22/0x30
      [ 3782.402658]  ? mdc_device_free+0xc/0x160 [mdc]
      [ 3782.402729]  mdc_device_alloc+0x16c/0x260 [mdc]
      [ 3782.402793]  obd_setup+0x195/0x460 [obdclass]
      [ 3782.403137]  class_setup+0x607/0x7c0 [obdclass]
      [ 3782.403397]  class_process_config+0x1837/0x1e50 [obdclass]
      [ 3782.403674]  ? class_config_llog_handler+0x64a/0x1330 [obdclass]
      [ 3782.403943]  ? lustre_cfg_init+0x88/0x1a0 [obdclass]
      [ 3782.404185]  class_config_llog_handler+0x798/0x1330 [obdclass]
      [ 3782.404466]  llog_process_thread+0xda5/0x1b20 [obdclass]
      [ 3782.404722]  ? llog_validate+0x380/0x380 [obdclass]
      [ 3782.404983]  llog_process_thread_daemonize+0x6d/0x90 [obdclass]
      [ 3782.405222]  kthread+0xf3/0x120
      [ 3782.405267]  ? kthread_park+0x90/0x90
      

      Here the faulting address 000000000000142a corresponds to cli->cl_mod_rpcs_in_flight, which is accessed in:

      +static struct lu_device *mdc_device_free(const struct lu_env *env,
      +                                        struct lu_device *lu)
      +{
      +       struct obd_device *obd = lu->ld_obd;
      +       struct client_obd *cli = &obd->u.cli;
      +       struct osc_device *osc = lu2osc_dev(lu);
      +
      +       LASSERT(cli->cl_mod_rpcs_in_flight == 0);
      
      (gdb) p/x &((struct obd_device *)0)->u.cli.cl_mod_rpcs_in_flight
      $4 = 0x14a2
      

      This tells us that the obd backpointer (lu->ld_obd) is NULL on the error path taken when registration fails, which is not exactly surprising.
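
      To illustrate the reasoning, here is a minimal user-space sketch using mock structures. These are not the real Lustre definitions: the padding sizes are invented so that the field lands at offset 0x142a, and mock_device_free() with its NULL guard is hypothetical, shown only as one possible way to avoid the dereference, not as the actual fix. The sketch shows that a field access through a NULL base pointer faults at an address equal to the field's offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mock stand-ins, NOT the real Lustre definitions. The padding sizes are
 * invented so that the field lands at offset 0x142a, matching CR2 above.
 */
struct mock_client_obd {
	char     pad[0x102a];
	uint16_t cl_mod_rpcs_in_flight;
};

struct mock_obd_device {
	char                   pad[0x400];
	struct mock_client_obd cli;
};

struct mock_lu_device {
	struct mock_obd_device *ld_obd; /* NULL if setup failed before it was set */
};

/* Address a read of cl_mod_rpcs_in_flight would touch for a given obd base. */
static uintptr_t fault_address(const struct mock_obd_device *obd)
{
	return (uintptr_t)obd +
	       offsetof(struct mock_obd_device, cli) +
	       offsetof(struct mock_client_obd, cl_mod_rpcs_in_flight);
}

/*
 * Hypothetical guarded free. Returns 1 if the obd-dependent check ran,
 * 0 if it was skipped because the back-pointer was never set.
 */
static int mock_device_free(const struct mock_lu_device *lu)
{
	const struct mock_obd_device *obd = lu->ld_obd;

	if (obd == NULL)
		return 0;

	assert(obd->cli.cl_mod_rpcs_in_flight == 0);
	return 1;
}
```

      In this mock layout, fault_address(NULL) evaluates to 0x142a, and mock_device_free() returns 0 for a device whose ld_obd was never set instead of dereferencing the pointer.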

      Real crashes in Maloo can be seen here:

      https://testing.whamcloud.com/test_sets/d631fd20-2d64-44cf-ba8e-0beb74d2ae96

      https://testing.whamcloud.com/test_sets/785a27cf-8d9f-49a1-a812-b94a477e3cbe

      https://testing.whamcloud.com/test_sets/8c7da0c8-c742-472b-8379-1349d3499372

      and so on.

      Changing the debug patch to test for OST0000-osc instead causes a crash in osc_cleanup_common().


            People

              wc-triage WC Triage
              green Oleg Drokin