LU-16941: MDT deadlock involving lquota_wb


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Affects Version/s: Lustre 2.15.3
    • Severity: 3

    Description

      Hello! Two days ago, we upgraded the Lustre servers of Sherlock's scratch (Fir) from 2.12.9 to 2.15.3. The clients were already running 2.15.x. The server upgrade went smoothly, but I noticed one issue today: we got an alert that a Robinhood changelog reader (also running 2.15.3) was stuck, and indeed:

      Jul  4 11:39:40 fir-rbh01 kernel: INFO: task robinhood:8863 blocked for more than 120 seconds.
      Jul  4 11:39:40 fir-rbh01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jul  4 11:39:40 fir-rbh01 kernel: robinhood       D ffff9bacf6375280     0  8863      1 0x00000080
      Jul  4 11:39:40 fir-rbh01 kernel: Call Trace:
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac7b8bf9>] schedule_preempt_disabled+0x29/0x70
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac7b6ca7>] __mutex_lock_slowpath+0xc7/0x1e0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac7b602f>] mutex_lock+0x1f/0x33
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac7ae6ed>] lookup_slow+0x33/0xab
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac26bf8e>] path_lookupat+0x89e/0x8d0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac235c6b>] ? kmem_cache_alloc+0x19b/0x1f0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac26e16f>] ? getname_flags+0x4f/0x1a0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac26bfeb>] filename_lookup+0x2b/0xd0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac26f337>] user_path_at_empty+0x67/0xc0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac25e493>] ? fput+0x13/0x20
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac26f3a1>] user_path_at+0x11/0x20
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac2618a3>] vfs_fstatat+0x63/0xd0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac261d01>] SYSC_newlstat+0x31/0x70
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac1468f6>] ? __audit_syscall_exit+0x1f6/0x2b0
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac2621ce>] SyS_newlstat+0xe/0x20
      Jul  4 11:39:40 fir-rbh01 kernel: [<ffffffffac7c539a>] system_call_fastpath+0x25/0x2a
      Jul  4 11:39:40 fir-rbh01 kernel: INFO: task robinhood:8864 blocked for more than 120 seconds.
      
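
      For reference, the blocked tasks can also be inspected on the client without crash, along these lines (a sketch; the PID is taken from the hung-task message above):

      # Sample the kernel stack of the blocked robinhood task (PID 8863 above)
      cat /proc/8863/stack

      # Ask the kernel to dump all blocked (D-state) tasks to the kernel log
      # (requires sysrq to be enabled)
      echo w > /proc/sysrq-trigger
      dmesg | tail -n 200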

      I tried restarting the client and Robinhood multiple times without success: same issue. Looking at the MDT (fir-MDT0001), which is still up and running, I noticed a constant CPU load from this process:

         PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                         
       12875 root      20   0       0      0      0 S  13.6  0.0  39:41.42 lquota_wb_fir-M   
      
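
      For what it's worth, the kernel stack of that spinning thread can also be sampled live from /proc with something like the following (a sketch; 12875 is the lquota_wb PID from the top output above):

      # Take a few kernel stack samples of the busy lquota_wb thread
      for i in $(seq 1 5); do cat /proc/12875/stack; echo '---'; sleep 1; done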

      Looking further with crash, there seem to be two lquota_wb threads (12871 and 12875):

      crash> bt 12871
      PID: 12871  TASK: ffff8ccd6c9b0000  CPU: 6   COMMAND: "lquota_wb_fir-M"
       #0 [ffff8ccd6ca93cf0] __schedule at ffffffffb07b78d8
       #1 [ffff8ccd6ca93d58] schedule at ffffffffb07b7ca9
       #2 [ffff8ccd6ca93d68] schedule_timeout at ffffffffb07b5778
       #3 [ffff8ccd6ca93e10] qsd_upd_thread at ffffffffc175a665 [lquota]
       #4 [ffff8ccd6ca93ec8] kthread at ffffffffb00cb621
       #5 [ffff8ccd6ca93f50] ret_from_fork_nospec_begin at ffffffffb07c51dd
      crash> bt 12875
      PID: 12875  TASK: ffff8ccd6cba1080  CPU: 2   COMMAND: "lquota_wb_fir-M"
       #0 [ffff8ccd6cbabcf0] __schedule at ffffffffb07b78d8
       #1 [ffff8ccd6cbabd58] schedule at ffffffffb07b7ca9
       #2 [ffff8ccd6cbabd68] schedule_timeout at ffffffffb07b5778
       #3 [ffff8ccd6cbabe10] qsd_upd_thread at ffffffffc175a665 [lquota]
       #4 [ffff8ccd6cbabec8] kthread at ffffffffb00cb621
       #5 [ffff8ccd6cbabf50] ret_from_fork_nospec_begin at ffffffffb07c51dd
      
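
      If more detail is useful, the same two threads can be dumped with the standard crash commands, e.g. something like:

      crash> ps | grep lquota_wb
      crash> foreach 12871 12875 bt -f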

      At first glance, since this involves lquota_wb, it looks similar to LU-15283, but I can't find the same backtrace signature, and that issue should already be fixed in 2.15.3.
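
      (As a sanity check that the LU-15283 patch is really in the running build, something along these lines should work, assuming the stock server RPMs carry the upstream changelog; the package name may differ:)

      rpm -q --changelog lustre | grep -i LU-15283
      lctl get_param version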

      However, the following thread looks suspicious to me, as it involves lquota and appears to be stuck:

      crash> bt 32263
      PID: 32263  TASK: ffff8cdd78a21080  CPU: 5   COMMAND: "mdt01_011"
       #0 [ffff8ccd6b5a3640] __schedule at ffffffffb07b78d8
       #1 [ffff8ccd6b5a36a8] schedule at ffffffffb07b7ca9
       #2 [ffff8ccd6b5a36b8] schedule_timeout at ffffffffb07b5778
       #3 [ffff8ccd6b5a3760] schedule_timeout_interruptible at ffffffffb07b58fe
       #4 [ffff8ccd6b5a3770] qsd_acquire at ffffffffc175e915 [lquota]
       #5 [ffff8ccd6b5a3810] qsd_op_begin0 at ffffffffc175f6bf [lquota]
       #6 [ffff8ccd6b5a38b0] qsd_op_begin at ffffffffc1760142 [lquota]
       #7 [ffff8ccd6b5a38f8] osd_declare_qid at ffffffffc17ffcb3 [osd_ldiskfs]
       #8 [ffff8ccd6b5a3950] osd_declare_attr_qid at ffffffffc17bbdfc [osd_ldiskfs]
       #9 [ffff8ccd6b5a39a8] osd_declare_attr_set at ffffffffc17bffbe [osd_ldiskfs]
      #10 [ffff8ccd6b5a39f8] lod_sub_declare_attr_set at ffffffffc1a7752c [lod]
      #11 [ffff8ccd6b5a3a48] lod_declare_attr_set at ffffffffc1a5854b [lod]
      #12 [ffff8ccd6b5a3ac8] mdd_attr_set at ffffffffc1d1ab32 [mdd]
      #13 [ffff8ccd6b5a3b48] mdt_attr_set at ffffffffc193ac36 [mdt]
      #14 [ffff8ccd6b5a3b98] mdt_reint_setattr at ffffffffc193bd3a [mdt]
      #15 [ffff8ccd6b5a3c10] mdt_reint_rec at ffffffffc193e69a [mdt]
      #16 [ffff8ccd6b5a3c38] mdt_reint_internal at ffffffffc1913f3c [mdt]
      #17 [ffff8ccd6b5a3c78] mdt_reint at ffffffffc191e807 [mdt]
      #18 [ffff8ccd6b5a3ca8] tgt_request_handle at ffffffffc13b725f [ptlrpc]
      #19 [ffff8ccd6b5a3d38] ptlrpc_server_handle_request at ffffffffc1360aa3 [ptlrpc]
      #20 [ffff8ccd6b5a3df0] ptlrpc_main at ffffffffc1362734 [ptlrpc]
      #21 [ffff8ccd6b5a3ec8] kthread at ffffffffb00cb621
      #22 [ffff8ccd6b5a3f50] ret_from_fork_nospec_begin at ffffffffb07c51dd
      
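
      If it helps, the quota slave/master state on the servers can be dumped with something like the following (a sketch assuming the usual ldiskfs quota_slave and qmt parameter layout; the wildcards may need adjusting):

      # Quota slave state on this MDT (where qsd_acquire appears stuck)
      lctl get_param osd-ldiskfs.*MDT0001*.quota_slave.info

      # Quota master parameters (on the MDS hosting MDT0000)
      lctl list_param -R qmt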

      For now, I am attaching a live "foreach bt" taken on this MDS as fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt. Let me know if you have any ideas on how to fix this. Thanks very much.
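
      (For reference, the dump was captured with crash's "foreach bt", redirected to a file roughly like this:)

      crash> foreach bt > fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt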

      Attachments

        fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt

        Activity

          People

            Assignee: wc-triage (WC Triage)
            Reporter: sthiell (Stephane Thiell)