Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
Description
{noformat}
PID: 154236  TASK: ffff9ab9b2f330c0  CPU: 9  COMMAND: "ll_ost_io01_002"
 #0 [ffff9ab9b2f6af58] __schedule at ffffffff9876ab17
 #1 [ffff9ab9b2f6afe0] schedule at ffffffff9876b019
 #2 [ffff9ab9b2f6aff0] wait_transaction_locked at ffffffffc0760085 [jbd2]
 #3 [ffff9ab9b2f6b048] add_transaction_credits at ffffffffc0760368 [jbd2]
 #4 [ffff9ab9b2f6b0a8] start_this_handle at ffffffffc07605e1 [jbd2]
 #5 [ffff9ab9b2f6b140] jbd2__journal_start at ffffffffc0760a93 [jbd2]
 #6 [ffff9ab9b2f6b188] __ldiskfs_journal_start_sb at ffffffffc19c1c79 [ldiskfs]
 #7 [ffff9ab9b2f6b1c8] ldiskfs_release_dquot at ffffffffc19b92ec [ldiskfs]
 #8 [ffff9ab9b2f6b1e8] dqput at ffffffff982aeb5d
 #9 [ffff9ab9b2f6b210] __dquot_drop at ffffffff982b0215
#10 [ffff9ab9b2f6b248] dquot_drop at ffffffff982b0285
#11 [ffff9ab9b2f6b258] ldiskfs_clear_inode at ffffffffc19bdcf2 [ldiskfs]
#12 [ffff9ab9b2f6b270] ldiskfs_evict_inode at ffffffffc19dccdf [ldiskfs]
#13 [ffff9ab9b2f6b2b0] evict at ffffffff9825ee14
#14 [ffff9ab9b2f6b2d8] dispose_list at ffffffff9825ef1e
#15 [ffff9ab9b2f6b300] prune_icache_sb at ffffffff9825ff2c
#16 [ffff9ab9b2f6b368] prune_super at ffffffff98244323
#17 [ffff9ab9b2f6b3a0] shrink_slab at ffffffff981ca105
#18 [ffff9ab9b2f6b440] do_try_to_free_pages at ffffffff981cd3c2
#19 [ffff9ab9b2f6b4b8] try_to_free_pages at ffffffff981cd5dc
#20 [ffff9ab9b2f6b550] __alloc_pages_slowpath at ffffffff987601ef
#21 [ffff9ab9b2f6b640] __alloc_pages_nodemask at ffffffff981c1465
#22 [ffff9ab9b2f6b6f0] alloc_pages_current at ffffffff9820e2c8
#23 [ffff9ab9b2f6b738] new_slab at ffffffff982192d5
#24 [ffff9ab9b2f6b770] ___slab_alloc at ffffffff9821ad4c
#25 [ffff9ab9b2f6b840] __slab_alloc at ffffffff9876160c
#26 [ffff9ab9b2f6b880] kmem_cache_alloc at ffffffff9821c3eb
#27 [ffff9ab9b2f6b8c0] __radix_tree_preload at ffffffff9837b7b9
#28 [ffff9ab9b2f6b8f0] radix_tree_maybe_preload at ffffffff9837bd0e
#29 [ffff9ab9b2f6b900] __add_to_page_cache_locked at ffffffff981b734a
#30 [ffff9ab9b2f6b940] add_to_page_cache_lru at ffffffff981b74b7
#31 [ffff9ab9b2f6b970] find_or_create_page at ffffffff981b783e
#32 [ffff9ab9b2f6b9b0] osd_bufs_get at ffffffffc1a773c3 [osd_ldiskfs]
#33 [ffff9ab9b2f6ba10] ofd_preprw_write at ffffffffc144f156 [ofd]
#34 [ffff9ab9b2f6ba90] ofd_preprw at ffffffffc14500ce [ofd]
#35 [ffff9ab9b2f6bb28] tgt_brw_write at ffffffffc0ece6e9 [ptlrpc]
#36 [ffff9ab9b2f6bca0] tgt_request_handle at ffffffffc0eccd4a [ptlrpc]
#37 [ffff9ab9b2f6bd30] ptlrpc_server_handle_request at ffffffffc0e72586 [ptlrpc]
#38 [ffff9ab9b2f6bde8] ptlrpc_main at ffffffffc0e7625a [ptlrpc]
#39 [ffff9ab9b2f6bec8] kthread at ffffffff980c1f81
#40 [ffff9ab9b2f6bf50] ret_from_fork_nospec_begin at ffffffff98777c1d
{noformat}
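The stack above shows the OST I/O thread allocating a page-cache page in osd_bufs_get() and falling into direct reclaim, which re-enters ldiskfs (prune_icache_sb -> evict -> dqput -> ldiskfs_release_dquot) and tries to start a new jbd2 handle from within reclaim, blocking behind the same journal the write path depends on. Below is a minimal sketch of the conventional kernel-side way to break this kind of reclaim recursion, assuming the memalloc_nofs_save()/memalloc_nofs_restore() scope API is available; it is not the actual Lustre/osd_ldiskfs change for this ticket, and osd_get_write_page() is a hypothetical helper name.
{noformat}
#include <linux/sched/mm.h>
#include <linux/pagemap.h>

/*
 * Hypothetical helper, illustrative only: allocate the page-cache page for
 * an OST bulk write inside a NOFS scope, so that if the allocation falls
 * into direct reclaim, the shrinkers skip __GFP_FS work (inode eviction,
 * quota release, starting a jbd2 handle) and cannot block against the
 * journal used by this same write.
 */
static struct page *osd_get_write_page(struct address_space *mapping,
				       pgoff_t index)
{
	unsigned int nofs_flags;
	struct page *page;

	/* Every allocation until the matching restore is implicitly
	 * treated as GFP_NOFS, regardless of the mask passed below. */
	nofs_flags = memalloc_nofs_save();

	page = find_or_create_page(mapping, index,
				   mapping_gfp_constraint(mapping, GFP_NOFS));

	memalloc_nofs_restore(nofs_flags);

	return page;	/* returned locked, or NULL on allocation failure */
}
{noformat}
On older kernels that predate the scope API (the trace above is from a RHEL7-era kernel), the same effect is usually obtained by restricting the mapping's gfp mask to GFP_NOFS so that find_or_create_page() never enters filesystem reclaim.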
Attachments
Issue Links
- is related to LU-9728: out of memory on OSS causing allocation failures or hung threads (Resolved)
Activity
Link | New: This issue is related to NCP-58 [ NCP-58 ]
Fix Version/s | New: Lustre 2.16.0 [ 15190 ]
Resolution | New: Fixed [ 1 ]
Status | Original: Open [ 1 ] | New: Resolved [ 5 ]
Comment |
[ Hi Stephane,
I can't find dt_bufs_get/osd_bufs_get in your "foreach bt", so I think this is not the same bug. The following stack trace seems to indicate that the jbd2 thread hangs while committing the transaction to disk (it is waiting for I/O):
{noformat}
PID: 63000  TASK: ffff954835bba100  CPU: 10  COMMAND: "jbd2/md6-8"
 #0 [ffff9548167df930] __schedule at ffffffffae18c028
 #1 [ffff9548167df998] schedule at ffffffffae18c3f9
 #2 [ffff9548167df9a8] schedule_timeout at ffffffffae18a0c1
 #3 [ffff9548167dfa50] io_schedule_timeout at ffffffffae18bcad
 #4 [ffff9548167dfa80] io_schedule at ffffffffae18bd48
 #5 [ffff9548167dfa90] bit_wait_io at ffffffffae18a711
 #6 [ffff9548167dfaa8] __wait_on_bit_lock at ffffffffae18a2c1
 #7 [ffff9548167dfae8] __lock_page at ffffffffadbbd7f4
 #8 [ffff9548167dfb40] write_cache_pages at ffffffffadbcaba0
 #9 [ffff9548167dfc48] generic_writepages at ffffffffadbcac3d
#10 [ffff9548167dfca8] jbd2_journal_commit_transaction at ffffffffc085b58e [jbd2]
#11 [ffff9548167dfe48] kjournald2 at ffffffffc0861f89 [jbd2]
#12 [ffff9548167dfec8] kthread at ffffffffadac5f91
#13 [ffff9548167dff50] ret_from_fork_nospec_begin at ffffffffae199ddd
{noformat}
So maybe this is related to a slow/unresponsive disk when writing the journal blocks. ] |
Attachment | New: fir-io7-s1_crash_foreach_bt_20220831.log [ 45475 ] |
Priority | Original: Minor [ 4 ] | New: Critical [ 2 ] |
Link | New: This issue is related to DDN-2969 [ DDN-2969 ] |
Link | New: This issue is related to DDN-2616 [ DDN-2616 ] |
Link | New: This issue is duplicated by DDN-2756 [ DDN-2756 ] |
Attachment | New: st_vmcore [ 41899 ] |
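A note on the comment quoted in the activity log above: the jbd2 commit thread is reported as waiting for I/O because write_cache_pages() must take each dirty page's lock, and a contended lock_page() on that kernel sleeps via __wait_on_bit_lock()/bit_wait_io(), which calls io_schedule(). The sketch below is illustrative only, not kernel or Lustre source; it shows the writer-side pattern in which a page returned by find_or_create_page() stays locked until unlock_page(), which is what the commit thread ends up waiting on. touch_page_locked() is a hypothetical name.
{noformat}
#include <linux/pagemap.h>
#include <linux/mm.h>

/*
 * Hypothetical writer-side helper, illustrative only: while this thread
 * holds the page lock, any other thread calling lock_page() on the same
 * page (for example write_cache_pages() in the jbd2 commit stack above)
 * sleeps in __wait_on_bit_lock()/bit_wait_io()/io_schedule() until
 * unlock_page() runs here.
 */
static int touch_page_locked(struct address_space *mapping, pgoff_t index)
{
	struct page *page;

	/* find_or_create_page() returns the page locked, or NULL. */
	page = find_or_create_page(mapping, index, GFP_NOFS);
	if (!page)
		return -ENOMEM;

	set_page_dirty(page);	/* work done under the page lock */

	unlock_page(page);	/* wakes any lock_page() waiters */
	put_page(page);
	return 0;
}
{noformat}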