[LU-7925] ll_d_iput() can clear i_nlink for an inode in use Created: 26/Mar/16  Updated: 23/Nov/17  Resolved: 06/Dec/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.9.0

Type: Bug
Priority: Major
Reporter: Andrew Perepechko
Assignee: WC Triage
Resolution: Fixed
Votes: 0
Labels: patch

Issue Links:
Related
is related to LU-8003 ll_ddelete() has obsolete reference t... Resolved
is related to LU-10131 Update inode attributes on unlink Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

ll_d_iput() can sometimes clear i_nlink for an inode that is still referenced by an in-use dentry alias.

That alias can be a dentry for a different path that is hard linked to the same inode, or a dentry for the same path that has been marked invalid/unhashed. The latter can happen with, e.g., 3.x kernels that use atomic_open, for which older servers do not return MDS_INODELOCK_LOOKUP.

Spontaneous i_nlink clearing can cause stat(2) to return an invalid st_nlink, since i_nlink is the field used to fill in that value. The kernel itself also reads i_nlink in some places (see do_coredump(), for instance).
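
The user-visible symptom can be checked from userspace along these lines. This is a minimal sketch, not the test case attached to this ticket: the helper name link_counts and the file names "a" and "b" are made up for illustration. It creates a file, adds a hard link, and reads st_nlink through both names and through an open descriptor.

```python
import os
import tempfile


def link_counts(directory):
    """Create a file with one extra hard link and report st_nlink three ways.

    Returns (via_first_name, via_second_name, via_open_fd).
    The file names are arbitrary and used only for this illustration.
    """
    first = os.path.join(directory, "a")
    second = os.path.join(directory, "b")

    # Keep a descriptor open so the inode stays in use while we stat it.
    fd = os.open(first, os.O_CREAT | os.O_WRONLY, 0o600)
    try:
        os.link(first, second)  # second name for the same inode
        return (
            os.stat(first).st_nlink,
            os.stat(second).st_nlink,
            os.fstat(fd).st_nlink,
        )
    finally:
        os.close(fd)


if __name__ == "__main__":
    # The temporary directory removes both names on exit.
    with tempfile.TemporaryDirectory() as tmp:
        print(link_counts(tmp))
```

On a file system that reports link counts correctly, all three values are 2. To exercise the problem described here, the directory would have to be on a Lustre client mount and the stale alias would have to be released in between, which this sketch does not attempt.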

The original intent for i_nlink clearing seems to be that unused inodes should be reclaimed instantly so that GFP_NOFS allocations (such as OBD_ALLOC()) would not face OOM. However, this piece of code seems to be either broken or completely wrong.

Firstly, although clearing nlink does make the VFS reclaim the inode immediately during the final iput, there is also a dentry cache, so an inode cannot be reclaimed until its dentry has been released. Closing a file does not lead to ll_d_iput(), and even a blocking AST only marks dentries invalid and unhashes them. Dentry reclaim would normally happen via a GFP_FS dentry slab shrink.
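
The reference counting argument above can be illustrated with a toy model. This is not kernel or Lustre code: the Inode and Dentry classes, their fields, and the demo() helper are simplified stand-ins invented for this sketch, under the assumption that each dentry alias holds one inode reference and that the inode is freed only on the final put with a zero link count.

```python
class Inode:
    def __init__(self, nlink=1):
        self.nlink = nlink      # stands in for i_nlink
        self.refcount = 0       # stands in for i_count
        self.freed = False

    def iput(self):
        """Drop one reference; free only on the final put with nlink == 0."""
        self.refcount -= 1
        if self.refcount == 0 and self.nlink == 0:
            self.freed = True


class Dentry:
    def __init__(self, inode):
        self.inode = inode
        inode.refcount += 1     # each dentry alias pins the inode
        self.hashed = True

    def invalidate(self):
        """Unhash only, as the description says a blocking AST does."""
        self.hashed = False

    def release(self, clear_nlink=False):
        """Dentry reclaim: drop this alias's inode reference."""
        inode, self.inode = self.inode, None
        if clear_nlink:
            inode.nlink = 0     # the behaviour this report objects to
        inode.iput()


def demo():
    inode = Inode(nlink=2)
    in_use = Dentry(inode)      # alias that stays in use
    stale = Dentry(inode)       # alias that gets invalidated, then reclaimed

    stale.invalidate()
    after_invalidate = (inode.refcount, inode.nlink, inode.freed)

    stale.release(clear_nlink=True)
    after_release = (inode.refcount, in_use.inode.nlink, inode.freed)
    return after_invalidate, after_release
```

In this model, invalidating the stale alias changes nothing about the inode, and releasing it with clear_nlink=True does not free the inode either, because the other alias still holds a reference. The only visible effect is that the in-use alias now sees a link count of 0.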

Secondly, the fact that OBD_ALLOC() uses GFP_NOFS allocations does not mean that a dcache shrink cannot happen when memory runs short. The slow allocation path wakes up kswapd and keeps retrying indefinitely (see should_alloc_retry()) for any allocation request at or below PAGE_ALLOC_COSTLY_ORDER. Allocations above PAGE_ALLOC_COSTLY_ORDER are a separate issue, in many cases not related to running out of memory.
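
The retry decision referred to above can be sketched as follows. This is a simplified Python model that assumes the logic of should_alloc_retry() as found in 3.x-era kernels; it leaves out the suspend-related check, and the flag constants are placeholder bits for this model only, not the kernel's real GFP values.

```python
PAGE_ALLOC_COSTLY_ORDER = 3  # value defined by mainline Linux

# Placeholder flag bits for this model only; not the kernel's real values.
GFP_NORETRY = 0x1
GFP_NOFAIL = 0x2
GFP_REPEAT = 0x4


def should_retry(flags, order, pages_reclaimed=0):
    """Return True if the allocator would keep looping in the slow path."""
    if flags & GFP_NORETRY:
        return False            # caller asked not to loop
    if flags & GFP_NOFAIL:
        return True             # caller asked to never fail
    if order <= PAGE_ALLOC_COSTLY_ORDER:
        return True             # small orders are retried regardless of FS flags
    if flags & GFP_REPEAT and pages_reclaimed < (1 << order):
        return True             # costly orders retry only while reclaim can help
    return False
```

Whether the request allows file system reclaim never enters this decision, which is the point being made: a small GFP_NOFS request keeps retrying while kswapd, which is not bound by the caller's flags, shrinks the dentry cache in the background.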

A test case and a patch that removes find_cbdata functionality completely will be uploaded shortly.



 Comments   
Comment by Gerrit Updater [ 26/Mar/16 ]

Andrew Perepechko (andrew.perepechko@seagate.com) uploaded a new patch: http://review.whamcloud.com/19163
Subject: LU-7925 tests: a test case for i_nlink zeroing issue
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 19f447f0ee0c65398a9a15cc34d40e09ad6acf1a

Comment by Gerrit Updater [ 26/Mar/16 ]

Andrew Perepechko (andrew.perepechko@seagate.com) uploaded a new patch: http://review.whamcloud.com/19164
Subject: LU-7925 llite: avoid clearing i_nlink for inodes in use
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e37285916047f35eb8531b137e959f4f72bba395

Comment by Gerrit Updater [ 11/Apr/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19164/
Subject: LU-7925 llite: avoid clearing i_nlink for inodes in use
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e6b7abc567ee0a8085e440c91e102d4318575529

Comment by Joseph Gmitter (Inactive) [ 06/Dec/16 ]

Patch landed to master for 2.9.0

Generated at Sat Feb 10 02:13:06 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.