[LU-9479] sanity test 184d 244: don't instantiate PFL component when taking group lock Created: 09/May/17 Updated: 08/Jan/24 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Andreas Dilger | Assignee: | Zhenyu Xu |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | pfl |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for bobijam <bobijam.xu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/67af86be-2027-11e7-9073-5254006e85c2.

The sub-test test_244 failed with the following error:

test failed to respond and timed out

Info required for matching: sanity 244

sendfile_grouplock.c calls sendfile_copy(sourfile, 0, destfile, 98765), which calls into lov_io_init() and atomic_inc(&lov->lo_active_ios). sendfile_copy() then tries to write to the file, which needs to fetch the layout; ll_layout_refresh() finds an active IO in flight (the one accounted by ll_get_grouplock()), so the write hangs:

sendfile_grou S 0000000000000000     0  7394   7321 0x00000080
 ffff88000eb3f618 0000000000000082 ffff88000eb3f5e0 ffff88000eb3f5dc
 00001ce200000000 ffff88003f828400 0000005dce083b5f ffff880003436ac0
 00000000000005ff 0000000100017a1d ffff88002b57fad0 ffff88000eb3ffd8
Call Trace:
 [<ffffffffa0afa20b>] lov_layout_wait+0x11b/0x220 [lov]
 [<ffffffff810640e0>] ? default_wake_function+0x0/0x20
 [<ffffffffa0afc11e>] lov_conf_set+0x37e/0xa30 [lov]
 [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa059d888>] cl_conf_set+0x58/0x100 [obdclass]
 [<ffffffffa0fa5dd4>] ll_layout_conf+0x84/0x3f0 [lustre]
 [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa0fb0b9d>] ll_layout_refresh+0x96d/0x1710 [lustre]
 [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa0ff7d6f>] vvp_io_init+0x32f/0x450 [lustre]
 [<ffffffffa040f471>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa05a5148>] cl_io_init0+0x88/0x150 [obdclass]
 [<ffffffffa05a7caa>] cl_io_init+0x4a/0xa0 [obdclass]
 [<ffffffffa05a7dbc>] cl_io_rw_init+0xbc/0x200 [obdclass]
 [<ffffffffa0fa7213>] ll_file_io_generic+0x203/0xaf0 [lustre]
 [<ffffffffa0fa941d>] ll_file_aio_write+0x13d/0x280 [lustre]
 [<ffffffffa0fa969a>] ll_file_write+0x13a/0x270 [lustre]
 [<ffffffff81189ef8>] vfs_write+0xb8/0x1a0
 [<ffffffff811ba76d>] kernel_write+0x3d/0x50
 [<ffffffff811ba7da>] write_pipe_buf+0x5a/0x90
 [<ffffffff811b9342>] splice_from_pipe_feed+0x72/0x120
 [<ffffffff811ba780>] ? write_pipe_buf+0x0/0x90
 [<ffffffff811ba780>] ? write_pipe_buf+0x0/0x90
 [<ffffffff811b9d9e>] __splice_from_pipe+0x6e/0x80
 [<ffffffff811ba780>] ? write_pipe_buf+0x0/0x90
 [<ffffffff811b9e01>] splice_from_pipe+0x51/0x70
 [<ffffffff811b9e3d>] default_file_splice_write+0x1d/0x30
 [<ffffffff811b9fca>] do_splice_from+0xba/0xf0
 [<ffffffff811ba020>] direct_splice_actor+0x20/0x30
 [<ffffffff811ba256>] splice_direct_to_actor+0xc6/0x1c0
 [<ffffffff811ba000>] ? direct_splice_actor+0x0/0x30
 [<ffffffff811ba39d>] do_splice_direct+0x4d/0x60
 [<ffffffff8118a344>] do_sendfile+0x184/0x1e0
 [<ffffffff8118a3d4>] sys_sendfile64+0x34/0xb0
 [<ffffffff810e031e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
|
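For context, here is a minimal user-space sketch of the pattern the test exercises. It is loosely modeled on lustre/tests/sendfile_grouplock.c (the real test's structure and arguments differ) and uses the LL_IOC_GROUP_LOCK/LL_IOC_GROUP_UNLOCK ioctls from the Lustre user API; header location varies by Lustre version.

/* Sketch of the hanging pattern; error handling trimmed.
 * The group-lock ioctls come in via lustre/lustreapi.h. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>
#include <lustre/lustreapi.h>

int main(int argc, char **argv)
{
	int src, dst, gid = 98765;
	struct stat st;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <source> <dest>\n", argv[0]);
		return 1;
	}
	src = open(argv[1], O_RDONLY);
	dst = open(argv[2], O_WRONLY | O_CREAT, 0644);
	if (src < 0 || dst < 0 || fstat(src, &st) < 0)
		return 1;

	/* Take a group lock on the destination, as the test does. */
	if (ioctl(dst, LL_IOC_GROUP_LOCK, gid) < 0)
		return 1;

	/* On an affected client, if the destination is a PFL file whose
	 * later components are not yet instantiated, the write underneath
	 * sendfile() triggers a layout refresh that waits for active IOs
	 * to drain, including the group lock we hold, and hangs. */
	if (sendfile(dst, src, NULL, st.st_size) < 0)
		perror("sendfile");

	ioctl(dst, LL_IOC_GROUP_UNLOCK, gid);
	close(src);
	close(dst);
	return 0;
}

On an affected client this blocks inside sendfile() exactly as in the trace above, because the group lock itself is counted in lo_active_ios.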
| Comments |
| Comment by Andreas Dilger [ 09/May/17 ] |
|
This problem was worked around in patch https://review.whamcloud.com/26646 "..." (instantiating all components when the group lock is taken). This is inefficient for the source file, since it is about to be deleted, and for the destination file, since it doesn't need all of those objects. Per comments in |
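To illustrate the shape of that workaround (this is a hedged sketch, not the actual patch): assuming the fix pre-instantiates every PFL component before the group lock is taken, a test program could force the same effect by writing one byte inside each component's extent. The offsets below are hypothetical; a real program would derive them from the layout, e.g. via llapi_layout_get_by_fd().

/* Sketch of the workaround idea: force-instantiate every PFL component
 * before the group lock is taken, so nothing needs instantiating while
 * the lock is held. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical component start offsets; a real program would read them
 * from the file's layout (e.g. with llapi_layout_get_by_fd()). */
static const off_t comp_starts[] = { 0, 1 << 20, 64 << 20 };

static int instantiate_all_components(int fd)
{
	char byte = 0;
	size_t i;

	/* Writing a byte inside an uninstantiated component's extent
	 * causes the MDS to instantiate it. */
	for (i = 0; i < sizeof(comp_starts) / sizeof(comp_starts[0]); i++)
		if (pwrite(fd, &byte, 1, comp_starts[i]) != 1)
			return -1;
	return 0;
}

int main(int argc, char **argv)
{
	int fd;

	if (argc != 2)
		return 1;
	fd = open(argv[1], O_WRONLY);
	if (fd < 0 || instantiate_all_components(fd) < 0) {
		perror(argv[1]);
		return 1;
	}
	close(fd);
	return 0;
}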
| Comment by Zhenyu Xu [ 23/May/17 ] |
|
Jinshan, commit d25892863a30a385cdae2b1991ab0a9ca76fcd7d introduced lov->lo_active_ios, and taking a group lock also increments it:

LU-1876 hsm: account for active IOs correctly

We used to use refcount of lsm to account for active IOs but this
turns out wrong because empty layout files don't have a lsm.
In this patch, lov_object::lo_active_ios is invented to account for
active IOs for both raid0 and empty layout;
I'm thinking of not accounting get_grouplock as an active IO, so that the process holding the grouplock won't hang itself in the first place. Do you think that's plausible? |
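To make the self-deadlock concrete, here is a small user-space model (not Lustre code) of the lo_active_ios accounting: if taking the group lock bumps the active-IO count, a later layout change from the same context waits for that count to drain and never succeeds. The proposal above is effectively to skip the increment for the group-lock "IO".

/* User-space model of the lo_active_ios accounting; not Lustre code. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int lo_active_ios;

/* Models lov_io_init(): every IO, including (today) the group-lock
 * "IO", bumps the active-IO count. */
static void io_start(void) { atomic_fetch_add(&lo_active_ios, 1); }
static void io_fini(void) { atomic_fetch_sub(&lo_active_ios, 1); }

/* Models lov_layout_wait(): a layout change must wait until all active
 * IOs drain; here we just report whether it would block. */
static bool layout_change_would_block(void)
{
	return atomic_load(&lo_active_ios) != 0;
}

int main(void)
{
	io_start();   /* ll_get_grouplock() is counted as an active IO */

	/* The write under sendfile() now needs a layout refresh to
	 * instantiate the next PFL component, but the refresh waits on
	 * the very count our own group lock is holding up. */
	if (layout_change_would_block())
		printf("deadlock: waiting on our own group-lock IO\n");

	io_fini();    /* skipping the increment for group locks avoids this */
	return 0;
}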
| Comment by Zhenyu Xu [ 23/May/17 ] |
|
Well, even if the lo_active_ios accounting is resolved as in my last comment, there is still the problem that a later-instantiated component is not protected by the previously granted group lock. |
| Comment by Andreas Dilger [ 31/Aug/17 ] |
|
Bobijam, can you discuss with Oleg some way we could take group locks on newly-instantiated components that is not racy? It seems that the workaround is not working properly, since multiple problems with group locks are now being seen. |
| Comment by Patrick Farrell (Inactive) [ 20/Feb/18 ] |
|
Is it not possible for a group lock to prevent instantiation of further components by those not holding a group lock? It would imply a group lock also had to take a layout lock, though. |
| Comment by Andreas Dilger [ 22/Feb/18 ] |
|
Patrick, the problem here is that we don't want to prevent new component instantiation, but if new objects are instantiated they do not have a group lock on them (since that is negotiated between each OST and the client separately for each object). |
| Comment by Patrick Farrell (Inactive) [ 22/Feb/18 ] |
|
Right, I understand; note "instantiation of further components by those not holding a group lock". I think that is necessary, because otherwise the problem you note occurs: the new segment doesn't get the old group lock applied to it. I don't see a way to solve that, so instead holding a group lock should prevent instantiation of a new component by those not holding the group lock; otherwise they could get around the group lock and write to that new component. Those holding the group lock can still instantiate, which gives them a chance to apply the group lock to the new component.

This is still potentially problematic, though: if the client that instantiated the new component releases its group lock, other clients holding the group lock will not have that portion locked, and so it could become unlocked. This is not fun. |
| Comment by Andreas Dilger [ 22/Feb/18 ] |
|
The issue is that group locks are handled by the OSS, but object instantiation is done by the MDS, so there is no way for the MDS to even know that there is a group lock on the file... |
| Comment by Patrick Farrell (Inactive) [ 22/Feb/18 ] |
|
Yup... No straightforward suggestions from me. |
| Comment by Patrick Farrell (Inactive) [ 18/May/18 ] |
|
No intention to work on this right now, but I wanted to write this down here so it doesn't get forgotten: As Andreas said, the issue is that group locks are handled by the OSS, but object instantiation is handled by the MDS. As Oleg noted at LUG, given that, it seems like we need to move group locks to being layout locks. A few thoughts on this.
There may also be some ... interesting ... interoperability issues with clients that do not request/expect group locks on the layout. That might be challenging to get right. |
| Comment by Andreas Dilger [ 22/Oct/21 ] |
|
It is worthwhile to note that the "MDS holds IBITS group lock" issue is (partly) solved already via patch https://review.whamcloud.com/39406 "..." That would potentially allow group locks to always lock on the MDT first, and then lock only initialized components.

Older clients that don't understand the MDT group lock would (unfortunately) keep their current behavior of always instantiating and locking all components. The new MDS could potentially revoke the layout lock (and keep it) from older clients that do not have this functionality, to prevent non-group locks from instantiating uninitialized components, but there is unfortunately no OBD_CONNECT_* flag for this.

It might be enough in 2.15 for a group lock to hold the MDS_INODEBITS_LAYOUT lock exclusively when there are uninitialized components, in order to block older clients. This wouldn't be needed if all components are already initialized, or if all clients are new and understand MDS group locking (a new OBD_CONNECT_* flag could be added for this). |
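The decision logic proposed above might look roughly like the following sketch. Every identifier here is hypothetical (none of this exists in Lustre today); it only restates the comment's reasoning in code form.

/* Hypothetical sketch of the proposed MDT-first group-lock flow;
 * all names are invented for illustration. */
#include <stdbool.h>
#include <stdio.h>

enum gl_mode { GL_MDT_SHARED, GL_MDT_EXCL_LAYOUT };

/* Stubs standing in for real layout and connect-flag checks; a new
 * OBD_CONNECT_* flag would back the second one. */
static bool all_components_initialized(void) { return false; }
static bool all_clients_support_mdt_grouplock(void) { return false; }

static enum gl_mode choose_grouplock_mode(void)
{
	/* If nothing can instantiate an uninitialized component behind
	 * our back, a shared MDT lock over the initialized components
	 * is enough. */
	if (all_components_initialized() || all_clients_support_mdt_grouplock())
		return GL_MDT_SHARED;

	/* Otherwise hold the layout IBITS lock exclusively to keep older
	 * clients from instantiating uninitialized components. */
	return GL_MDT_EXCL_LAYOUT;
}

int main(void)
{
	printf("group-lock mode = %d\n", choose_grouplock_mode());
	return 0;
}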