Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Affects Version: Lustre 2.15.2
- Fix Version: None
- Environment:
Lustre version:
lustre-iokit-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
kmod-lustre-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
pcp-lustre-0.4.16-2.noarch
lustre-devel-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
lustre-osd-ldiskfs-mount-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
lustre-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
lustre-tests-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
kmod-lustre-osd-ldiskfs-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
kmod-lustre-tests-2.15.2-1nas_mofed496el8_lustre_20230111v1.x86_64
kernel: 4.18.0-425.3.1.el8_lustre.x86_64
mofed: mlnx-ofa_kernel-4.9-mofed496.x86_64
- Severity: 2
Description
We have had multiple servers deadlock with the stack trace below (a longer console output is attached).
Jul 15 05:46:28 nbp11-srv3 kernel: INFO: task ll_ost07_000:9230 blocked for more than 120 seconds.
Jul 15 05:46:28 nbp11-srv3 kernel: Tainted: G OE --------- - - 4.18.0-425.3.1.el8_lustre.x86_64 #1
Jul 15 05:46:28 nbp11-srv3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 15 05:46:28 nbp11-srv3 kernel: task:ll_ost07_000 state:D stack: 0 pid: 9230 ppid: 2 flags:0x80004080
Jul 15 05:46:28 nbp11-srv3 kernel: Call Trace:
Jul 15 05:46:28 nbp11-srv3 kernel: __schedule+0x2d1/0x860
Jul 15 05:46:28 nbp11-srv3 kernel: schedule+0x35/0xa0
Jul 15 05:46:28 nbp11-srv3 kernel: wait_transaction_locked+0x89/0xd0 [jbd2]
Jul 15 05:46:28 nbp11-srv3 kernel: ? finish_wait+0x80/0x80
Jul 15 05:46:28 nbp11-srv3 kernel: add_transaction_credits+0xd4/0x290 [jbd2]
Jul 15 05:46:28 nbp11-srv3 kernel: ? ldiskfs_do_update_inode+0x604/0x800 [ldiskfs]
Jul 15 05:46:28 nbp11-srv3 kernel: start_this_handle+0x10a/0x520 [jbd2]
Jul 15 05:46:28 nbp11-srv3 kernel: ? osd_fallocate_preallocate.isra.38+0x275/0x760 [osd_ldiskfs]
Jul 15 05:46:28 nbp11-srv3 kernel: ? ldiskfs_mark_iloc_dirty+0x32/0x90 [ldiskfs]
Jul 15 05:46:28 nbp11-srv3 kernel: jbd2__journal_restart+0xb4/0x160 [jbd2]
Jul 15 05:46:28 nbp11-srv3 kernel: osd_fallocate_preallocate.isra.38+0x5a6/0x760 [osd_ldiskfs]
Jul 15 05:46:28 nbp11-srv3 kernel: osd_fallocate+0xfd/0x370 [osd_ldiskfs]
Jul 15 05:46:28 nbp11-srv3 kernel: ofd_object_fallocate+0x5dd/0xa30 [ofd]
Jul 15 05:46:28 nbp11-srv3 kernel: ofd_fallocate_hdl+0x467/0x730 [ofd]
Jul 15 05:46:28 nbp11-srv3 kernel: tgt_request_handle+0xc97/0x1a40 [ptlrpc]
Jul 15 05:46:28 nbp11-srv3 kernel: ? ptlrpc_nrs_req_get_nolock0+0xff/0x1f0 [ptlrpc]
Jul 15 05:46:28 nbp11-srv3 kernel: ptlrpc_server_handle_request+0x323/0xbe0 [ptlrpc]
Jul 15 05:46:28 nbp11-srv3 kernel: ptlrpc_main+0xc0f/0x1570 [ptlrpc]
Jul 15 05:46:28 nbp11-srv3 kernel: ? ptlrpc_wait_event+0x590/0x590 [ptlrpc]
Jul 15 05:46:28 nbp11-srv3 kernel: kthread+0x10a/0x120
Jul 15 05:46:28 nbp11-srv3 kernel: ? set_kthread_struct+0x50/0x50
Jul 15 05:46:28 nbp11-srv3 kernel: ret_from_fork+0x1f/0x40
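For context, the server-side path in the trace (ofd_fallocate_hdl -> ofd_object_fallocate -> osd_fallocate_preallocate) is driven by clients calling fallocate(2) on files striped over the affected OSTs. Below is a minimal sketch of such a client-side call; the mount point, file name, and allocation size are hypothetical and only illustrate the kind of operation that exercises this code path.

/* Minimal client-side sketch (not taken from the report): a plain
 * fallocate(2) call on a file in a Lustre mount is what ultimately reaches
 * ofd_fallocate_hdl/osd_fallocate_preallocate on the OST.
 * The path and size below are hypothetical. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        const char *path = "/mnt/lustre/fallocate_test"; /* hypothetical client mount */
        int fd = open(path, O_CREAT | O_WRONLY, 0644);

        if (fd < 0) {
                perror("open");
                return EXIT_FAILURE;
        }

        /* Preallocate 64 GiB; the OST walks this range in
         * osd_fallocate_preallocate(), restarting its jbd2 handle as it goes. */
        if (fallocate(fd, 0, 0, (off_t)64 << 30) < 0)
                perror("fallocate");

        close(fd);
        return EXIT_SUCCESS;
}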
Attachments
Issue Links
- duplicates LU-15800 Fallocate causes transaction deadlock (Resolved)