[LU-7198] vvp_io.c:701:vvp_io_fault_start()) binary changed while waiting for the page fault lock Created: 22/Sep/15 Updated: 14/Jun/18 Resolved: 25/Apr/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.4 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | D. Marc Stearman (Inactive) | Assignee: | Zhenyu Xu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl, llnlfixready |
| Environment: | Livermore branch: lustre-2.5.4-4chaos_2.6.32_504.16.2.1chaos.ch5.3.x86_64.x86_64 |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
We continue to see messages like this:

<ConMan> Console [sierra562] log at 2015-07-20 16:00:00 PDT.
<ConMan> Console [sierra562] log at 2015-07-20 17:00:00 PDT.

Creating a new issue rather than reopening the old one. |
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 23/Sep/15 ] |
|
Hi Bobijam, |
| Comment by Gerrit Updater [ 24/Sep/15 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/16622 |
| Comment by Christopher Morrone [ 22/Oct/15 ] |
|
Lustre 2.5.4-12chaos contains change 16650 targeting the b2_5_fe branch. That did not fix the stated problem:

2015-10-19 18:34:53 Lustre: 87030:0:(vvp_io.c:716:vvp_io_fault_start()) binary [0x200053b04:0xee27:0x0] changed while waiting for the page fault lock
<ConMan> Console [cab34] log at 2015-10-19 19:00:00 PDT.
<ConMan> Console [cab34] log at 2015-10-19 20:00:00 PDT.
2015-10-19 20:21:21 Lustre: 88635:0:(vvp_io.c:716:vvp_io_fault_start()) binary [0x200053b04:0xee27:0x0] changed while waiting for the page fault lock

Keep trying. |
| Comment by Christopher Morrone [ 22/Oct/15 ] |
|
And, even worse, we now have applications hanging in sys_ftruncate all over the place. I can't help but think that this patch that messes with locking around vvp_io_setattr_start() is to blame for our current major center-wide disruption. Here is the place where the code is hanging:

2015-10-21 17:31:35 XXXXXX D 0000000000000006 0 28745 28734 0x00000004
2015-10-21 17:31:35 ffff8805f0befb98 0000000000000086 ffff88034ac159c0 ffff88033b277150
2015-10-21 17:31:35 00ff8805f0befc28 0000000000000000 ffff8805f0befb28 ffff88034ac159c0
2015-10-21 17:31:35 ffff8805eee3e0c0 0000000000000006 ffff8805f3043068 ffff8805f0beffd8
2015-10-21 17:31:35 Call Trace:
2015-10-21 17:31:35 [<ffffffff8153e475>] rwsem_down_failed_common+0x95/0x1d0
2015-10-21 17:31:35 [<ffffffff8153e5d3>] rwsem_down_write_failed+0x23/0x30
2015-10-21 17:31:35 [<ffffffff8129fac3>] call_rwsem_down_write_failed+0x13/0x20
2015-10-21 17:31:35 [<ffffffff8153daa5>] ? down_write+0x45/0x50
2015-10-21 17:31:35 [<ffffffffa0b0c229>] vvp_io_setattr_start+0x129/0x170 [lustre]
2015-10-21 17:31:35 [<ffffffffa0638c7a>] cl_io_start+0x6a/0x140 [obdclass]
2015-10-21 17:31:35 [<ffffffffa063d384>] cl_io_loop+0xb4/0x1b0 [obdclass]
2015-10-21 17:31:35 [<ffffffffa0b04948>] cl_setattr_ost+0x218/0x2f0 [lustre]
2015-10-21 17:31:35 [<ffffffffa0ad06f1>] ll_setattr_raw+0xa31/0x1060 [lustre]
2015-10-21 17:31:35 [<ffffffffa0ad0d85>] ll_setattr+0x65/0xd0 [lustre]
2015-10-21 17:31:35 [<ffffffff811b1de8>] notify_change+0x168/0x340
2015-10-21 17:31:35 [<ffffffff81190bd4>] do_truncate+0x64/0xa0
2015-10-21 17:31:35 [<ffffffff8153c771>] ? mutex_lock+0x31/0x60
2015-10-21 17:31:35 [<ffffffff81190f80>] sys_ftruncate+0x120/0x130
2015-10-21 17:31:35 [<ffffffff8100b112>] system_call_fastpath+0x16/0x1b

2015-10-21 17:31:35 XXXXXX D 000000000000000b 0 28750 28734 0x00000004
2015-10-21 17:31:35 ffff88054385bd08 0000000000000082 0000000000000000 ffff8805dce716e0
2015-10-21 17:31:35 ffff88054385bcc8 0000000000000286 00005ab6b25e0bdc ffff88054385bd64
2015-10-21 17:31:35 ffffffffa0b3bb00 0000000105ec1b77 ffff8806395a05f8 ffff88054385bfd8
2015-10-21 17:31:35 Call Trace:
2015-10-21 17:31:35 [<ffffffff8153cc66>] __mutex_lock_slowpath+0x96/0x210
2015-10-21 17:31:35 [<ffffffff8153c77e>] mutex_lock+0x3e/0x60
2015-10-21 17:31:35 [<ffffffffa0acff9a>] ll_setattr_raw+0x2da/0x1060 [lustre]
2015-10-21 17:31:35 [<ffffffffa0ad0d85>] ll_setattr+0x65/0xd0 [lustre]
2015-10-21 17:31:35 [<ffffffff811b1de8>] notify_change+0x168/0x340
2015-10-21 17:31:35 [<ffffffff81190bd4>] do_truncate+0x64/0xa0
2015-10-21 17:31:35 [<ffffffff8153c771>] ? mutex_lock+0x31/0x60
2015-10-21 17:31:35 [<ffffffff81190f80>] sys_ftruncate+0x120/0x130
2015-10-21 17:31:35 [<ffffffff8100b112>] system_call_fastpath+0x16/0x1b |
| Comment by Zhenyu Xu [ 22/Oct/15 ] |
|
Would you mind finding out which source line ll_setattr_raw+0x2da corresponds to? |
| Comment by Jinshan Xiong (Inactive) [ 22/Oct/15 ] |
|
Also, please upload the reproducer program so that we can use it to check whether the same problem exists in master. |
| Comment by Christopher Morrone [ 22/Oct/15 ] |
|
I like your optimism, but no, we don't have a reproducer to upload. I can get the line number later today when I'm free. |
| Comment by Christopher Morrone [ 22/Oct/15 ] |
|
ll_setattr_raw+0x2da looks like the obvious location, the mutex_lock() on line 1620 below:

1616                 }
1617                 ll_finish_md_op_data(op_data);
1618         }
1619         if (!S_ISDIR(inode->i_mode)) {
1620                 mutex_lock(&inode->i_mutex);
1621                 if ((attr->ia_valid & ATTR_SIZE) && !hsm_import)
1622                         inode_dio_wait(inode);
1623         }
1624
1625         ll_stats_ops_tally(ll_i2sbi(inode), (attr->ia_valid & ATTR_SIZE)
|
| Comment by Zhenyu Xu [ 27/Oct/15 ] |
|
Still investigating; I haven't found the deadlock's root cause yet. |
| Comment by Zhenyu Xu [ 28/Oct/15 ] |
|
Are there other kinds of blocked (D state) threads besides these two? I think thread 28750 was waiting for the i_mutex that 28745 was holding, while 28745 seemed to be waiting for a write lock, which is either lli->lli_trunc_sem (most likely) or i_alloc_sem (unlikely), and there should be some other thread holding that lli_trunc_sem. |
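To make the hypothesis above easier to follow, here is a minimal sketch of the inferred wait-for chain. It is not Lustre code; the thread IDs come from the backtraces above, and the identity and lock state of the third thread are assumptions.

/*
 * Hypothesized wait-for chain (sketch only):
 *
 *   thread 28750: mutex_lock(&inode->i_mutex)        -- blocked; 28745 holds i_mutex
 *   thread 28745: holds i_mutex,
 *                 down_write(&lli->lli_trunc_sem)    -- blocked in vvp_io_setattr_start()
 *   thread ?????: holds lli->lli_trunc_sem           -- not captured in the traces above;
 *                                                       presumably blocked on something else
 */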
| Comment by Christopher Morrone [ 28/Oct/15 ] |
|
Unfortunately, I do not have that information. |
| Comment by Zhenyu Xu [ 10/Nov/15 ] |
|
Can you provide the bt -a output, masking the thread names if necessary? |
| Comment by Zhenyu Xu [ 11/Nov/15 ] |
|
I found the deadlock reason: this patch needs some preparatory patches to drop the i_mutex usage in the write path. Otherwise vvp_io_write_start() will hold lli->lli_trunc_sem and then call lustre_generic_file_write()->generic_file_aio_write(), which takes i_mutex, and that ordering differs from the other IO paths (setattr, truncate, etc.). So I need to push some preparatory patches before this one. |
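A minimal sketch of the ordering inversion described above, assuming the write path takes lli_trunc_sem before i_mutex while the setattr/truncate path takes them in the opposite order. The bodies are illustrative, not the real vvp_io_write_start()/ll_setattr_raw() implementations, and the read/write direction of the rwsem acquisitions is a guess.

/* Write path (per the comment above): trunc_sem first, then i_mutex. */
static void write_path_sketch(struct ll_inode_info *lli, struct inode *inode)
{
        down_read(&lli->lli_trunc_sem);         /* vvp_io_write_start() */
        mutex_lock(&inode->i_mutex);            /* generic_file_aio_write() */
        /* ... copy data to the page cache ... */
        mutex_unlock(&inode->i_mutex);
        up_read(&lli->lli_trunc_sem);
}

/* Setattr/truncate path: i_mutex first, then trunc_sem -- the opposite order. */
static void setattr_path_sketch(struct ll_inode_info *lli, struct inode *inode)
{
        mutex_lock(&inode->i_mutex);            /* ll_setattr_raw(), line 1620 quoted earlier */
        down_write(&lli->lli_trunc_sem);        /* vvp_io_setattr_start() */
        /* ... perform the size change ... */
        up_write(&lli->lli_trunc_sem);
        mutex_unlock(&inode->i_mutex);
}

With both orders present, one thread in each path is enough to deadlock; the preparatory patches #17117 and #17116 remove the i_mutex acquisition from the write path so that only a single order remains.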
| Comment by Zhenyu Xu [ 11/Nov/15 ] |
|
Updated patch #16650, which depends on #17117 and #17116 (these two patches drop the i_mutex usage in the write path). |
| Comment by Christopher Morrone [ 11/Nov/15 ] |
|
We can't provide further information on the buggy patch 16650 (bt -a). We stopped using it in production as soon as we identified that patch as the problem. Keep in mind that not only did this patch introduce a deadlock, it also did not fix the error messages that this ticket is about. If you have only fixed the deadlock, then you should really be tracking your 16650 backport in some other ticket. If you think it will address our error message, we would appreciate an explanation of why the latest revision is expected to address the error message when the previous one did not. |
| Comment by Jinshan Xiong (Inactive) [ 13/Nov/15 ] |
|
After taking a further look, I tend to think the timestamp check in vvp_io_fault_start() is not atomic, so it is normal to get this kind of mismatch. Looking at ll_merge_lvb() in b2_5, it first assigns the mtime from the MDS to the inode's i_mtime, and then merges in the timestamps from the OST objects. However, the timestamp from the MDS can be stale, because we maintain mtime mainly on the OST objects (think of the Lustre write implementation). If the check in vvp_io_fault_start() happens before the mtime from the OST objects has been merged, it will read a wrong timestamp from the inode; even the timestamp it fetched in vvp_io_fault_init() could be wrong in the first place. I tend to think we should remove the timestamp check in vvp_io_fault_start(). |
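A rough sketch of the race being described, under the assumption that ll_merge_lvb() updates i_mtime in two separate, non-atomic steps (MDS value first, OST merge second). The function names are taken from the comment above, but the bodies are simplified illustrations, not the real implementations.

/* ll_merge_lvb(), simplified: the inode mtime is updated in two steps. */
static void merge_lvb_sketch(struct inode *inode, u64 mds_mtime, u64 ost_mtime)
{
        /* Step 1: take the (possibly stale) mtime reported by the MDS. */
        inode->i_mtime.tv_sec = mds_mtime;

        /* A page-fault thread that runs its timestamp check right here sees
         * the stale MDS value and reports "binary ... changed while waiting
         * for the page fault lock" even though nothing changed. */

        /* Step 2: merge the authoritative mtime kept on the OST objects. */
        if (ost_mtime > inode->i_mtime.tv_sec)
                inode->i_mtime.tv_sec = ost_mtime;
}

/* vvp_io_fault_start(), simplified: the two values being compared can each
 * be sampled mid-merge, so a mismatch does not imply the file changed. */
static bool fault_mtime_changed_sketch(struct inode *inode, u64 mtime_at_init)
{
        /* mtime_at_init was sampled earlier, in vvp_io_fault_init(). */
        return inode->i_mtime.tv_sec != mtime_at_init;
}

This is why the suggestion above is to drop the check: at this point the inode mtime is not a reliable value to compare against.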
| Comment by Peter Jones [ 15/Dec/15 ] |
|
Bobijam, do you think that Jinshan's suggestion is viable? If so, can you please create a patch on that basis? Thanks, Peter |
| Comment by Zhenyu Xu [ 16/Dec/15 ] |
|
http://review.whamcloud.com/#/c/17334/ is the patch |
| Comment by Peter Jones [ 16/Dec/15 ] |
|
Bobi, LLNL will need a fix targeted at b2_5_fe. Peter |
| Comment by Zhenyu Xu [ 16/Dec/15 ] |
|
Port for b2_5_fe: http://review.whamcloud.com/17624 |
| Comment by D. Marc Stearman (Inactive) [ 19/Feb/16 ] |
|
Chris, can you pull in the latest patch please? |
| Comment by Christopher Morrone [ 19/Feb/16 ] |
|
There are no reviews on the backported patch, and it failed automated testing. |
| Comment by D. Marc Stearman (Inactive) [ 19/Feb/16 ] |
|
Peter, please add this to the work queue. When a patch is ready for us, please add the "llnlfixready" label. |
| Comment by Christopher Morrone [ 25/Mar/16 ] |
|
We need a master branch version of this patch as well. |
| Comment by Gerrit Updater [ 26/Mar/16 ] |
|
Bobi Jam (bobijam@hotmail.com) uploaded a new patch: http://review.whamcloud.com/19162 |
| Comment by Gerrit Updater [ 25/Apr/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19162/ |
| Comment by Peter Jones [ 25/Apr/16 ] |
|
Landed for 2.9 |