  1. Lustre
  2. LU-18826

Kernel panic due to null pointer from obd_get_mod_rpc_slot

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.16.0, Lustre 2.17.0, Lustre 2.16.1, Lustre 2.15.7
    • Labels: None
    • Severity: 3

    Description

      [Sat Nov 23 23:41:34 UTC 2024] Call trace:
      [Sat Nov 23 23:41:34 UTC 2024]  kthread_should_stop+0x18/0x40
      [Sat Nov 23 23:41:34 UTC 2024]  obd_get_mod_rpc_slot+0x10c/0x43c [obdclass]
      [Sat Nov 23 23:41:34 UTC 2024]  ptlrpc_get_mod_rpc_slot+0x38/0x60 [ptlrpc]
      [Sat Nov 23 23:41:34 UTC 2024]  mdc_close+0x21c/0xe64 [mdc]
      [Sat Nov 23 23:41:34 UTC 2024]  lmv_close+0x1a8/0x480 [lmv]
      [Sat Nov 23 23:41:34 UTC 2024]  ll_close_inode_openhandle+0x404/0xcc8 [lustre]
      [Sat Nov 23 23:41:34 UTC 2024]  ll_md_real_close+0xa4/0x280 [lustre]
      [Sat Nov 23 23:41:34 UTC 2024]  ll_clear_inode+0x1a0/0x7e0 [lustre]
      [Sat Nov 23 23:41:34 UTC 2024]  ll_delete_inode+0x70/0x260 [lustre] 

      Root cause analysis

      1. The kernel tries to start a new kthread from kthreadd, which allocates the thread stack through alloc_thread_stack_node.
      2. The allocation runs out of memory, so the kernel tries to clean up the inode cache.
      3. In obd_get_mod_rpc_slot, all in-flight RPC slots are already taken, so the task is put to sleep using wait_woken.
      4. In kthread_should_stop -> to_kthread, the kernel reads set_child_tid, which is NULL. This is expected: the current task_struct is kthreadd itself, still the parent, because the allocation for the new thread has not yet completed.
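
      The dereference chain in step 4 can be modeled in userspace. This is a minimal sketch, not kernel code: struct task_model and both model_* functions are hypothetical stand-ins, and it assumes a kernel where to_kthread returns set_child_tid and where wait_woken only calls kthread_should_stop for tasks that have PF_KTHREAD set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PF_KTHREAD 0x00200000u	/* same bit value as the kernel flag */

/* Hypothetical stand-in for the two task_struct fields involved. */
struct task_model {
	unsigned int flags;
	void *set_child_tid;	/* assumed to double as the kthread data pointer */
};

/* Models to_kthread(): hands back set_child_tid unchanged. */
static void *model_to_kthread(const struct task_model *t)
{
	assert(t != NULL);
	return t->set_child_tid;
}

/*
 * True when the kthread check inside wait_woken would dereference NULL:
 * the task is flagged as a kernel thread, so kthread_should_stop is called,
 * but there is no kthread data behind set_child_tid.
 */
static bool model_wait_woken_would_fault(const struct task_model *t)
{
	assert(t != NULL);
	return (t->flags & PF_KTHREAD) && model_to_kthread(t) == NULL;
}
```

      Reading the flags value 208840 reported for the faulting task as hexadecimal (an assumption), the PF_KTHREAD bit is set, so a task with those flags and a NULL set_child_tid satisfies the predicate.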

      Repro

      1. Create a dummy obd device.
      2. Modify the current task_struct so that set_child_tid is 0000000000000000 and current->flags is 208840, and set max_in_flight_mod_rpcs to 0 so that the caller is always put to sleep.
      3. Call obd_get_mod_rpc_slot. The kernel panics:
      [ 1621.881498] Call trace:
      [ 1621.881915]  kthread_should_stop+0x18/0x40
      [ 1621.882627]  obd_get_mod_rpc_slot+0x10c/0x43c [obdclass]
      [ 1621.883502]  test_obd_rpc_slot+0xdc/0x270 [task_mod]
      [ 1621.884317]  task_mod_init+0x70/0x1000 [task_mod] 

      Proposed fix

      Skip the waiting step in obd_get_mod_rpc_slot for this case, since we know it will cause a kernel panic.

      https://git.whamcloud.com/?p=fs/lustre-release.git;a=blob;f=lustre/obdclass/genops.c;h=93b1fe8050729d38e507a8432c67aa8dddd8987d;hb=HEAD#l2274

      In the normal flow, the process is put to sleep and later woken up by claim_mod_rpc_function.

              avail = cli->cl_mod_rpcs_in_flight < cli->cl_max_mod_rpcs_in_flight ||
                      (close_req && cli->cl_close_rpcs_in_flight == 0);
              if (avail) {
                      cli->cl_mod_rpcs_in_flight++;
                      if (close_req)
                              cli->cl_close_rpcs_in_flight++;
                      ret = woken_wake_function(wq_entry, mode, flags, key);
                      w->woken = true;
              } else if (cli->cl_close_rpcs_in_flight)
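
      The availability check quoted above can be exercised in isolation. The following is a userspace sketch under stated assumptions: struct cli_model is a hypothetical subset of the client structure with plain unsigned counters, and model_try_claim is an illustrative helper that omits the wake-up call and the locking of the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the client counters read by the quoted snippet. */
struct cli_model {
	unsigned int cl_mod_rpcs_in_flight;
	unsigned int cl_max_mod_rpcs_in_flight;
	unsigned int cl_close_rpcs_in_flight;
};

/* Same predicate as the quoted code: room under the limit, or the one
 * slot that is available to a close request when no close is in flight. */
static bool model_slot_avail(const struct cli_model *cli, bool close_req)
{
	assert(cli != NULL);
	return cli->cl_mod_rpcs_in_flight < cli->cl_max_mod_rpcs_in_flight ||
	       (close_req && cli->cl_close_rpcs_in_flight == 0);
}

/* Claims a slot if one is available; returns false when the caller
 * would have to sleep instead. */
static bool model_try_claim(struct cli_model *cli, bool close_req)
{
	if (!model_slot_avail(cli, close_req))
		return false;
	cli->cl_mod_rpcs_in_flight++;
	if (close_req)
		cli->cl_close_rpcs_in_flight++;
	return true;
}
```

      With the limit set to 0, as in the repro, a non-close request never finds a slot in this model, and a close request finds one only while no other close RPC is in flight.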

      In this special case the process is kthreadd, which triggers a kernel panic if it is put to sleep. The modified logic therefore resembles the close_req handling, which simply takes a guaranteed slot:

                      if (close_req) cli->cl_close_rpcs_in_flight++;

      The fix follows the same approach, except that it also skips the whole enqueue step.

      	if ((current->flags & PF_KTHREAD) && !current->set_child_tid) {
      		/* Kernel thread with a NULL set_child_tid (kthreadd):
      		 * wait_woken would panic in kthread_should_stop, so
      		 * skip the wait and grant a slot directly. */
      		cli->cl_mod_rpcs_in_flight++;
      	} else {
      		/* ... enqueue and wait_woken logic */
      	}
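
      The control flow of the proposed fix can be sketched in userspace as follows. This is an illustration only: struct fix_task, struct fix_cli, enum slot_path, and model_get_slot are hypothetical names, and the bypass condition is read as "kernel thread whose set_child_tid is NULL", which is what the description of the kthreadd case implies.

```c
#include <assert.h>
#include <stddef.h>

#define PF_KTHREAD 0x00200000u	/* same bit value as the kernel flag */

/* Hypothetical stand-ins for the fields the fix inspects. */
struct fix_task {
	unsigned int flags;
	void *set_child_tid;
};

struct fix_cli {
	unsigned int cl_mod_rpcs_in_flight;
	unsigned int cl_max_mod_rpcs_in_flight;
};

enum slot_path {
	SLOT_BYPASS,	/* slot granted without sleeping */
	SLOT_WAIT_PATH	/* caller continues into the enqueue and wait logic */
};

/* Decides which path a caller takes under the proposed fix. */
static enum slot_path model_get_slot(struct fix_cli *cli,
				     const struct fix_task *cur)
{
	assert(cli != NULL && cur != NULL);
	if ((cur->flags & PF_KTHREAD) && cur->set_child_tid == NULL) {
		/* Sleeping here would fault, so take a slot unconditionally. */
		cli->cl_mod_rpcs_in_flight++;
		return SLOT_BYPASS;
	}
	return SLOT_WAIT_PATH;
}
```

      Because the bypass increments the counter without consulting the limit, cl_mod_rpcs_in_flight can exceed cl_max_mod_rpcs_in_flight in this model while such a request is outstanding.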

          People

            lijinc Lijing Chen