[LU-9937] Re-order inode_lock and lli_trunc_sem to fix pio deadlock Created: 01/Sep/17  Updated: 29/Jan/22  Resolved: 29/Jan/22

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Patrick Farrell (Inactive) Assignee: Patrick Farrell (Inactive)
Resolution: Won't Fix Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When using pio, we take the inode lock in ll_file_io_generic. Unfortunately, this violates the expected ordering with lli_trunc_sem, as seen in vvp_io_setattr_{start,end}. This can result in a deadlock where a thread doing a write holds the inode lock and wants the trunc sem in read mode, but setattr already holds the trunc sem in write mode and wants the inode lock.

It appears to be safe to change the ordering between inode_lock and the lli_trunc_sem, so we always take the inode lock first.

Also, the forthcoming patch cleans up inode locking with pio by removing lli_inode_locked. The inode info is shared between all threads doing I/O on that inode, so relying on this field meant we would not take the inode lock for operations that needed it if another thread already held it (notably ll_fsync), and we would also let other threads release the inode lock on our behalf.

As a result, an operation that expects to hold the inode lock for its full duration could have the lock released at an arbitrary point. Also, while a pio held the inode lock, multiple ll_fsync operations could run concurrently.
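
The hazard with a shared "already locked" field can be shown with a toy model. This is a deliberately simplified, single-threaded simulation with hypothetical names; it assumes the field behaves as described in this ticket and does not reproduce the real llite logic.

```c
#include <stdbool.h>

/* Toy model of per-inode state shared by every thread doing I/O. */
struct toy_lli {
    int  lock_owner;   /* 0 = unlocked, otherwise id of the holding thread */
    bool inode_locked; /* shared field, standing in for lli_inode_locked */
};

/* Take the lock only if the shared field says nobody holds it. */
static bool toy_lock_unless_flagged(struct toy_lli *lli, int tid)
{
    if (lli->inode_locked)
        return false; /* caller proceeds without owning the lock */
    lli->lock_owner = tid;
    lli->inode_locked = true;
    return true;
}

/* Release the lock whenever the shared field is set, whoever owns it. */
static void toy_unlock_if_flagged(struct toy_lli *lli)
{
    if (lli->inode_locked) {
        lli->lock_owner = 0;
        lli->inode_locked = false;
    }
}

/*
 * Thread 1 (a pio) takes the lock; thread 2 (an fsync-like operation)
 * skips locking because the field is set, then "unlocks" when it finishes.
 * Returns the lock owner that thread 1 would observe afterwards.
 */
static int toy_owner_after_second_op(void)
{
    struct toy_lli lli = { 0, false };

    toy_lock_unless_flagged(&lli, 1);       /* thread 1 now owns the lock */
    (void)toy_lock_unless_flagged(&lli, 2); /* thread 2 runs unprotected */
    toy_unlock_if_flagged(&lli);            /* thread 2 drops thread 1's lock */

    return lli.lock_owner;
}
```

In this model the function returns 0: thread 1 still believes it holds the lock, but it has already been released by thread 2, which also never owned the lock while it ran.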



 Comments   
Comment by Gerrit Updater [ 01/Sep/17 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/28836
Subject: LU-9937 llite: fix pio deadlock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ffbf1e49a97eb86309adc2ad581e4d7e16d56ae8

Comment by Andreas Dilger [ 29/Jan/22 ]

PIO is removed.
