[LU-5912] locking flaw generates logged errors Created: 12/Nov/14 Updated: 23/Apr/15 Resolved: 16/Jan/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0 |
| Fix Version/s: | Lustre 2.7.0, Lustre 2.5.4 |
| Type: | Bug | Priority: | Major |
| Reporter: | Bob Glossman (Inactive) | Assignee: | Bob Glossman (Inactive) |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | None | ||
| Environment: |
el6.6 |
||
| Issue Links: |
|
||||||||||||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 16507 | ||||||||||||
| Description |
|
we are violating the vfs locking scheme in el6.6 during our direct calling of f_op->fsync(). The call sequence cfs_tracefile_dump_all_pages() WARN_ON_ONCE(!mutex_is_locked(&inode->i_mutex) &&
!(inode->i_state & I_FREEING));
Comment in ext4_sync_file() says:
We don't conform to that requirement, thus hitting the WARN_ON_ONCE() check in ext4_flush_unwritten_io(). Not sure how to repair this problem. We can't just put a mutex_lock/mutex_unlock inside or around our filp_fsync() as there are other kernel versions we actively support that lock i_mutex inside their ext4_sync_file() routines, at least el7 and sles11sp3, maybe others too. While this isn't a fatal problem and doesn't cause panics & crashes, it is a regression. As an example a typical instance looks like this in the syslog: Nov 12 08:59:21 centos65-2 kernel: WARNING: at fs/ext4/inode.c:3929 ext4_flush_unwritten_io+0x74/0x80 [ext4]() (Not tainted) Nov 12 08:59:21 centos65-2 kernel: Hardware name: VMware Virtual Platform Nov 12 08:59:21 centos65-2 kernel: Modules linked in: jbd sha512_generic crc32c_intel libcfs(U) rfcomm sco bridge bnep l2cap autofs4 8021q garp stp llc ipv6 fuse uinput microcode vmware_balloon btusb bluetooth rfkill snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 sg i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lnet] Nov 12 08:59:21 centos65-2 kernel: Pid: 106487, comm: lctl Not tainted 2.6.32.504.1.3.l1111 #1 Nov 12 08:59:21 centos65-2 kernel: Call Trace: Nov 12 08:59:21 centos65-2 kernel: [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0 Nov 12 08:59:21 centos65-2 kernel: [<ffffffff81074e4a>] ? warn_slowpath_null+0x1a/0x20 Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa00e8bb4>] ? ext4_flush_unwritten_io+0x74/0x80 [ext4] Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa00e4fc8>] ? ext4_sync_file+0x88/0x1d0 [ext4] Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa03a3cf8>] ? cfs_tracefile_dump_all_pages+0x178/0x2c0 [libcfs] Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa03a3ecb>] ? cfs_trace_dump_debug_buffer_usrstr+0x8b/0x90 [libcfs] Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa039a583>] ? __proc_dump_kernel+0x23/0x30 [libcfs] Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa0399efb>] ? lprocfs_call_handler+0x2b/0x70 [libcfs] Nov 12 08:59:21 centos65-2 kernel: [<ffffffffa0399f95>] ? proc_dump_kernel+0x25/0x30 [libcfs] Nov 12 08:59:21 centos65-2 kernel: [<ffffffff81204157>] ? proc_sys_call_handler+0x97/0xd0 Nov 12 08:59:21 centos65-2 kernel: [<ffffffff812041a4>] ? proc_sys_write+0x14/0x20 Nov 12 08:59:21 centos65-2 kernel: [<ffffffff8118e4a8>] ? vfs_write+0xb8/0x1a0 Nov 12 08:59:21 centos65-2 kernel: [<ffffffff8118ee71>] ? sys_write+0x51/0x90 Nov 12 08:59:21 centos65-2 kernel: [<ffffffff810e5ece>] ? __audit_syscall_exit+0x25e/0x290 Nov 12 08:59:21 centos65-2 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b Nov 12 08:59:21 centos65-2 kernel: ---[ end trace f63906e52fe1f987 ]--- |
| Comments |
| Comment by Andreas Dilger [ 13/Nov/14 ] |
|
Bob, can you check what patch introduced this WARN_ON(), so we can see what the implications of this change are. It seems unlikely that RHEL would have changed the locking in the middle of a stable release, but I've been surprised before. As for the solution, we need a configure check that determines whether the lock is held before calling fsync(), and make this locking conditional in Lustre. Ideally there would be an easy way to check this, but if not then we've resorted to grepping in the sources to check if the mutex is held in the caller or not. On the client we don't have any expectation of access to the source tree, only the headers, so it is hopefully something we can detect from there. |
| Comment by Bob Glossman (Inactive) [ 13/Nov/14 ] |
|
the obvious grep for the specific comment line about holding i_mutex won't work as a configure test. the ext4 source in sles11sp3 has an exactly similar line, but it lies. in sles11sp3 it actually does the locking inside ext4 code in spite of the comment saying otherwise. i'm having a hard time thinking up a configure test that can make the correct determination. |
| Comment by Bob Glossman (Inactive) [ 13/Nov/14 ] |
|
I think the WARN_ON_ONCE() check went into kernel code with the following commit: commit c278531d39f3158bfee93dc67da0b77e09776de2 ext4: fix ext4_flush_completed_IO wait semantics BUG #1) All places where we call ext4_flush_completed_IO are broken BUG #2) Race window becomes wider due to the 'ext4: completed_io This patch make following changes: Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Not exactly sure what kernel.org kernel version it first appears in. |
| Comment by Bob Glossman (Inactive) [ 14/Nov/14 ] |
|
I believe the mod when the check went in may not be the issue. When the locking of i_mutex moved from the caller of a filesystem's fsync routine to inside it may be what we want to find & do a configure test for. Before then it was safe to do the locking in the caller, after then it is not. As far as I can tell i_mutex locking is done inside the fsync as early as 3.0 (sles11sp3) kernels and is still done in the most recent kernels I have looked at. It is only not done in 2.6.32-xxx kernels (6.5 & 6.6). It is only checked for in 6.6 kernels and some later kernels, but the check is gone from the most recent. |
| Comment by Bob Glossman (Inactive) [ 14/Nov/14 ] |
|
I think the kernel mod that moved i_mutex locking into .fsync routines was this one: commit 02c24a82187d5a628c68edfe71ae60dc135cd178 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers Btrfs needs to be able to control how filemap_write_and_wait_range() is called Acked-by: Jan Kara <jack@suse.cz> |
| Comment by Bob Glossman (Inactive) [ 14/Nov/14 ] |
|
As far as I can see it looks like the versions needing i_mutex locking outside ext4_file_sync() exactly matches versions where ->fsync() routines take 3 args. There is already configure tests & #ifdef's for that. I think I should be able to come up with a patch that will fit in the existing conditionals for our filp_fsync() definitions. |
| Comment by Bob Glossman (Inactive) [ 14/Nov/14 ] |
|
proposed fix: |
| Comment by Gerrit Updater [ 14/Nov/14 ] |
|
Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/12731 |
| Comment by Gerrit Updater [ 04/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12731/ |
| Comment by Karsten Weiss [ 12/Jan/15 ] |
|
FWIW: We currently use Lustre 2.5.3 on CentOS 6.5 servers and see this WARNING during shutdown of our Lustre 2.5.3 clients Can we expect to see a fix for the Lustre 2.5.x maintenance branch? (I can't find a RHEL 6.6 kernel update notice in the preliminary release notes of either 2.5.4 or 2.5.5 - only in 2.7.0.) |
| Comment by Gerrit Updater [ 14/Jan/15 ] |
|
Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/13404 |
| Comment by Jodi Levi (Inactive) [ 16/Jan/15 ] |
|
Patch landed to Master. |
| Comment by Gerrit Updater [ 27/Jan/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13404/ |
| Comment by Gerrit Updater [ 11/Feb/15 ] |
|
Dmitry Eremin (dmitry.eremin@intel.com) uploaded a new patch: http://review.whamcloud.com/13730 |
| Comment by Gerrit Updater [ 27/Feb/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13730/ |
| Comment by Gerrit Updater [ 05/Mar/15 ] |
|
Dmitry Eremin (dmitry.eremin@intel.com) uploaded a new patch: http://review.whamcloud.com/13982 |