Lustre / LU-16966

ofd_object_fallocate deadlock?

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.4
    • Affects Version/s: Lustre 2.15.2
    • Labels: None
    • Severity: 2

    Description

      Multiple servers have deadlocked with the following stack trace.

      (Longer console output is attached.)

      Jul 15 05:46:28 nbp11-srv3 kernel: INFO: task ll_ost07_000:9230 blocked for more than 120 seconds.
      Jul 15 05:46:28 nbp11-srv3 kernel:      Tainted: G           OE    --------- -  - 4.18.0-425.3.1.el8_lustre.x86_64 #1
      Jul 15 05:46:28 nbp11-srv3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Jul 15 05:46:28 nbp11-srv3 kernel: task:ll_ost07_000    state:D stack:    0 pid: 9230 ppid:     2 flags:0x80004080
      Jul 15 05:46:28 nbp11-srv3 kernel: Call Trace:
      Jul 15 05:46:28 nbp11-srv3 kernel: __schedule+0x2d1/0x860
      Jul 15 05:46:28 nbp11-srv3 kernel: schedule+0x35/0xa0
      Jul 15 05:46:28 nbp11-srv3 kernel: wait_transaction_locked+0x89/0xd0 [jbd2]
      Jul 15 05:46:28 nbp11-srv3 kernel: ? finish_wait+0x80/0x80
      Jul 15 05:46:28 nbp11-srv3 kernel: add_transaction_credits+0xd4/0x290 [jbd2]
      Jul 15 05:46:28 nbp11-srv3 kernel: ? ldiskfs_do_update_inode+0x604/0x800 [ldiskfs]
      Jul 15 05:46:28 nbp11-srv3 kernel: start_this_handle+0x10a/0x520 [jbd2]
      Jul 15 05:46:28 nbp11-srv3 kernel: ? osd_fallocate_preallocate.isra.38+0x275/0x760 [osd_ldiskfs]
      Jul 15 05:46:28 nbp11-srv3 kernel: ? ldiskfs_mark_iloc_dirty+0x32/0x90 [ldiskfs]
      Jul 15 05:46:28 nbp11-srv3 kernel: jbd2__journal_restart+0xb4/0x160 [jbd2]
      Jul 15 05:46:28 nbp11-srv3 kernel: osd_fallocate_preallocate.isra.38+0x5a6/0x760 [osd_ldiskfs]
      Jul 15 05:46:28 nbp11-srv3 kernel: osd_fallocate+0xfd/0x370 [osd_ldiskfs]
      Jul 15 05:46:28 nbp11-srv3 kernel: ofd_object_fallocate+0x5dd/0xa30 [ofd]
      Jul 15 05:46:28 nbp11-srv3 kernel: ofd_fallocate_hdl+0x467/0x730 [ofd]
      Jul 15 05:46:28 nbp11-srv3 kernel: tgt_request_handle+0xc97/0x1a40 [ptlrpc]
      Jul 15 05:46:28 nbp11-srv3 kernel: ? ptlrpc_nrs_req_get_nolock0+0xff/0x1f0 [ptlrpc]
      Jul 15 05:46:28 nbp11-srv3 kernel: ptlrpc_server_handle_request+0x323/0xbe0 [ptlrpc]
      Jul 15 05:46:28 nbp11-srv3 kernel: ptlrpc_main+0xc0f/0x1570 [ptlrpc]
      Jul 15 05:46:28 nbp11-srv3 kernel: ? ptlrpc_wait_event+0x590/0x590 [ptlrpc]
      Jul 15 05:46:28 nbp11-srv3 kernel: kthread+0x10a/0x120
      Jul 15 05:46:28 nbp11-srv3 kernel: ? set_kthread_struct+0x50/0x50
      Jul 15 05:46:28 nbp11-srv3 kernel: ret_from_fork+0x1f/0x40
      

      Attachments

        1. brw_stats (8 kB)
        2. brw_stats.save.1693236421 (89 kB)
        3. dmesg.out (119 kB)
        4. fallocate-range-locking.patch (1 kB)
        5. nbp15.hang (45 kB)
        6. stack.out (51 kB)
        7. stack1.out (55 kB)
        8. stack1-1.out (55 kB)

            People

              Assignee: Alex Zhuravlev (bzzz)
              Reporter: Mahmoud Hanafi (mhanafi)