[LU-13350] ldiskfs deadlock during inode expansion Created: 10/Mar/20  Updated: 11/Mar/20  Resolved: 11/Mar/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Shaun Tancheff Assignee: WC Triage
Resolution: Not a Bug Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
    PID: 15742  TASK: ffff880fb4f1eeb0  CPU: 11  COMMAND: "mdt01_077"
    
      #0 [ffff880bd3f5eda8] __schedule at ffffffff816b3de4
      #1 [ffff880bd3f5ee30] schedule at ffffffff816b4409
      #2 [ffff880bd3f5ee40] rwsem_down_read_failed at ffffffff816b5a3d
      #3 [ffff880bd3f5eec8] call_rwsem_down_read_failed at ffffffff81338218
      #4 [ffff880bd3f5ef18] down_read at ffffffff816b3520
      #5 [ffff880bd3f5ef30] ldiskfs_xattr_block_set at ffffffffc13d3a8e [ldiskfs]
      #6 [ffff880bd3f5efb8] ldiskfs_expand_extra_isize_ea at ffffffffc13d4bc1 [ldiskfs]
      #7 [ffff880bd3f5f0b0] ldiskfs_mark_inode_dirty at ffffffffc1421254 [ldiskfs]
      #8 [ffff880bd3f5f108] ldiskfs_dirty_inode at ffffffffc1424960 [ldiskfs]
      #9 [ffff880bd3f5f128] __mark_inode_dirty at ffffffff81232e5d
     #10 [ffff880bd3f5f160] ldiskfs_mb_new_blocks at ffffffffc13f52cd [ldiskfs]
     #11 [ffff880bd3f5f240] ldiskfs_ind_map_blocks at ffffffffc13d8e5b [ldiskfs]
     #12 [ffff880bd3f5f3b8] ldiskfs_map_blocks at ffffffffc141ddd5 [ldiskfs]
     #13 [ffff880bd3f5f440] ldiskfs_getblk at ffffffffc141e295 [ldiskfs]
     #14 [ffff880bd3f5f4a0] ldiskfs_bread at ffffffffc141e457 [ldiskfs]
     #15 [ffff880bd3f5f4d8] ldiskfs_append at ffffffffc13e26c1 [ldiskfs]
     #16 [ffff880bd3f5f518] do_split at ffffffffc13e3e59 [ldiskfs]
     #17 [ffff880bd3f5f600] ldiskfs_dx_add_entry at ffffffffc13e701a [ldiskfs]
     #18 [ffff880bd3f5f750] __ldiskfs_add_entry at ffffffffc13e8624 [ldiskfs]
     #19 [ffff880bd3f5f7c0] osd_ldiskfs_add_entry at ffffffffc1471ef4 [osd_ldiskfs]
     #20 [ffff880bd3f5f818] __osd_ea_add_rec at ffffffffc1472767 [osd_ldiskfs]
     #21 [ffff880bd3f5f888] osd_index_ea_insert at ffffffffc147d356 [osd_ldiskfs]
     #22 [ffff880bd3f5f950] lod_sub_insert at ffffffffc151d355 [lod]
     #23 [ffff880bd3f5f9f0] lod_insert at ffffffffc14fb744 [lod]
     #24 [ffff880bd3f5fa00] __mdd_index_insert_only at ffffffffc12606cf [mdd]
     #25 [ffff880bd3f5fa48] __mdd_index_insert at ffffffffc1261565 [mdd]
     #26 [ffff880bd3f5fa90] mdd_create at ffffffffc126d6a8 [mdd]
     #27 [ffff880bd3f5fb88] mdt_create at ffffffffc1300369 [mdt]
     #28 [ffff880bd3f5fc38] mdt_reint_create at ffffffffc130083b [mdt]
     #29 [ffff880bd3f5fc68] mdt_reint_rec at ffffffffc1301d93 [mdt]
     #30 [ffff880bd3f5fc90] mdt_reint_internal at ffffffffc12e11bb [mdt]
     #31 [ffff880bd3f5fcc8] mdt_reint at ffffffffc12ec187 [mdt]
     #32 [ffff880bd3f5fcf8] tgt_request_handle at ffffffffc0ee18ba [ptlrpc]
     #33 [ffff880bd3f5fd40] ptlrpc_server_handle_request at ffffffffc0e86f13 [ptlrpc]
     #34 [ffff880bd3f5fde0] ptlrpc_main at ffffffffc0e8a862 [ptlrpc]
     #35 [ffff880bd3f5fec8] kthread at ffffffff810b4031
     #36 [ffff880bd3f5ff50] ret_from_fork at ffffffff816c155d



 Comments   
Comment by Andrew Perepechko [ 11/Mar/20 ]

A possible scenario for the deadlock is the following:
step 1) The Lustre filesystem is created with a pre-i_projid mkfs tool.
step 2) A directory is created and filled with entries up to the end of its directory block, so that creating a new direntry will require block allocation.
step 3) The xattrs of the directory inode are filled to the end of the free in-inode space.
step 4) The Lustre filesystem is mounted with an i_projid-aware ldiskfs module (Lustre upgrade).
step 5) A new entry is created in the directory from step 2.
step 5.1) Block allocation is requested with the semaphore held for write.
step 5.2) __mark_inode_dirty() triggers inode expansion because of the on-disk format change.
step 5.3) The xattrs no longer fit in the expanded inode, so block allocation for the xattrs is requested and tries to take the same semaphore for read (DEADLOCK).
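
The locking pattern in steps 5.1 to 5.3 can be illustrated with a toy, single-threaded model. This is only a sketch and not ldiskfs code: the RwSem class, the create_entry() function and the event strings are hypothetical names made up for the illustration. It assumes that both acquisitions target the same non-recursive reader/writer semaphore of the directory inode (likely i_data_sem, though the comment above does not name it). Instead of sleeping forever, the model reports the point where the real task would block.

```python
class RwSem:
    """Toy model of a non-recursive reader/writer semaphore (hypothetical)."""

    def __init__(self):
        self.writer = None   # task currently holding the sem for write
        self.readers = []    # tasks currently holding the sem for read

    def down_write(self, task):
        # A writer needs exclusive access: no writer and no readers.
        if self.writer is not None or self.readers:
            return False     # a real rwsem would put the task to sleep here
        self.writer = task
        return True

    def up_write(self, task):
        assert self.writer == task
        self.writer = None

    def down_read(self, task):
        # Readers are blocked by any writer, including the caller itself.
        if self.writer is not None:
            return False     # a real rwsem would put the task to sleep here
        self.readers.append(task)
        return True

    def up_read(self, task):
        self.readers.remove(task)


def create_entry(sem, task, inode_needs_expansion):
    """Replay steps 5.1 to 5.3 and return the list of events that occurred."""
    events = []
    # Step 5.1: block allocation for the new direntry, sem held for write.
    assert sem.down_write(task)
    events.append("5.1 block allocation under write sem")
    if inode_needs_expansion:
        # Step 5.2: marking the inode dirty triggers inode expansion.
        events.append("5.2 inode expansion")
        # Step 5.3: xattrs move to a block; allocation wants the sem for read.
        if not sem.down_read(task):
            events.append("5.3 DEADLOCK: read requested while holding write")
            return events    # the real task never reaches up_write()
        sem.up_read(task)
    sem.up_write(task)
    events.append("done")
    return events


if __name__ == "__main__":
    for line in create_entry(RwSem(), "mdt01_077", inode_needs_expansion=True):
        print(line)
```

In the model, the upgraded-format case stops at step 5.3 with the semaphore still held for write, which matches the down_read frame at the top of the backtrace in the description. When no inode expansion is needed, the same call completes and releases the semaphore.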

Comment by Andrew Perepechko [ 11/Mar/20 ]

Closing as Not a Bug, since the fix is needed only for kernels 3.17 or earlier and RHEL 7.4 or earlier.
