[LU-5727] MDS OOMs with 2.5.3 clients and lru_size != 0 Created: 12/Oct/14 Updated: 30/Jan/17 Resolved: 30/Oct/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.5.3 |
| Fix Version/s: | Lustre 2.7.0, Lustre 2.5.4 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | David Singleton | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 16087 |
| Description |
|
We have seen the MDS (admittedly with smallish memory) OOM while testing 2.5.3, whereas there was no problem with 2.5.0. It turns out that, even though we have lru_size=800 everywhere, the client LDLM LRUs are growing huge, so the unreclaimable ldlm slabs on the MDS fill memory.
The root cause looks like the change to ldlm_cancel_aged_policy() in commit 0a6c6fcd46 on the 2.5 branch. Previously the policy was: cancel the lock if (too many in the LRU cache || lock unused too long). In 2.5.3 it is: cancel the lock if (too many in the LRU cache && lock unused too long). Disabling early_lock_cancel doesn't seem to help.
It might be arguable which of the two behaviours is correct, but the lru_size documentation suggests the former; the latter makes lru_size != 0 ineffective in practice. It also looks like the change was not actually necessary for |
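For illustration, a minimal pseudo-C sketch of the two behaviours described above (a sketch only, not the upstream code; too_many_in_lru() and unused_too_long() are hypothetical stand-ins for the real lru_size and ns_max_age checks in ldlm_cancel_aged_policy()):
/* 2.5.0-style behaviour: either condition alone cancels the lock,
 * so the unused-lock LRU cannot grow past lru_size. */
if (too_many_in_lru(ns) || unused_too_long(lock))
        return LDLM_POLICY_CANCEL_LOCK;
return LDLM_POLICY_KEEP_LOCK;

/* 2.5.3 behaviour after commit 0a6c6fcd46: both conditions must hold
 * before a lock is cancelled, so recently used locks are kept even when
 * the LRU already holds far more than lru_size unused locks. */
if (too_many_in_lru(ns) && unused_too_long(lock))
        return LDLM_POLICY_CANCEL_LOCK;
return LDLM_POLICY_KEEP_LOCK;
|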
| Comments |
| Comment by Jinshan Xiong (Inactive) [ 13/Oct/14 ] |
|
indeed. it looks like we need to change the code to:
if (added >= count &&
    cfs_time_before(cfs_time_current(),
                    cfs_time_add(lock->l_last_used, ns->ns_max_age)) &&
    ns->ns_cancel != NULL && ns->ns_cancel(lock) == 0)
        return LDLM_POLICY_KEEP_LOCK;

return LDLM_POLICY_CANCEL_LOCK;
|
| Comment by David Singleton [ 13/Oct/14 ] |
|
Should ns->ns_cancel(lock) == 0 still imply LDLM_POLICY_KEEP_LOCK as in ldlm_cancel_lrur_policy? |
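For context, a hedged sketch of the convention referred to (approximately how ldlm_cancel_lrur_policy treats the callback; not verbatim upstream code):
/* If the namespace provides a cancel callback and it returns 0, the lock
 * is considered still useful and is kept in the LRU. */
if (ns->ns_cancel != NULL && ns->ns_cancel(lock) == 0)
        return LDLM_POLICY_KEEP_LOCK;
|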
| Comment by Niu Yawei (Inactive) [ 28/Oct/14 ] |
| Comment by Peter Jones [ 30/Oct/14 ] |
|
Landed for 2.7 |
| Comment by Patrick Farrell (Inactive) [ 04/Nov/14 ] |
|
Is there any plan to backport this patch to 2.5.3, given that |
| Comment by James A Simmons [ 04/Nov/14 ] |
|
Wasn't the version landed to b2_5 a much lighter-weight patch compared to the one landed to 2.7? |
| Comment by Patrick Farrell (Inactive) [ 04/Nov/14 ] |
|
Yes, but the problem report (in the ticket description) is against 2.5.3, so presumably the bug exists in 2.5. |
| Comment by James A Simmons [ 04/Nov/14 ] |
|
Yep, the one part ported back to b2_5 is the part that is broken. |
| Comment by James A Simmons [ 04/Nov/14 ] |
|
Okay, I back-ported it to b2_5: http://review.whamcloud.com/#/c/12565. I'm going to give it a try to make sure we don't have client regressions. |
| Comment by David Singleton [ 05/Nov/14 ] |
|
Something to try on an idle client when you have the patch in: run /bin/ls -R /lustre and monitor /proc/fs/lustre/ldlm/namespaces/mdc/lock*. We find the (unused) lock count is OK after a) but blows out after b); open(".", ...) seems to be an issue. This appears to be independent of whether lru_size is positive or 0. I'll create another ticket if this is confirmed. |
| Comment by Li Xi (Inactive) [ 14/Nov/14 ] |
|
This problem has been happening really frequently on our system these days, and patch http://review.whamcloud.com/#/c/12565/ doesn't help. The MDC lock count keeps growing rapidly even after it exceeds lru_size when lru_resize is disabled on the client. We found that patch ' I don't know the cancel policy in detail, but do we really need to keep these locks under these circumstances? |
| Comment by Jinshan Xiong (Inactive) [ 14/Nov/14 ] |
|
Hi Li Xi, Do you know what type of locks they are? I guess they must be OPEN locks. I can provide a hot fix after you have verified it. Jinshan |
| Comment by Jinshan Xiong (Inactive) [ 15/Nov/14 ] |
|
The patch |
| Comment by Gerrit Updater [ 15/Nov/14 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/12733 |
| Comment by Gerrit Updater [ 15/Nov/14 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/12734 |
| Comment by Gerrit Updater [ 15/Nov/14 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/12735 |
| Comment by Gerrit Updater [ 15/Nov/14 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/12736 |
| Comment by Gerrit Updater [ 15/Nov/14 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: http://review.whamcloud.com/12565 |
| Comment by Shuichi Ihara (Inactive) [ 15/Nov/14 ] |
|
OK, confirmed the revert patch solved the problem and the lock_count is controlled by lru_size when lru_resize is disabled.

With revert patch:
# mount -t lustre 10.0.8.128@o2ib:/backup /backup
# lctl set_param ldlm.namespaces.*mdc*.lru_size=1000
ldlm.namespaces.backup-MDT0000-mdc-ffff8811c9f17c00.lru_size=1000
# lctl get_param ldlm.namespaces.*mdc*.lock_count
ldlm.namespaces.backup-MDT0000-mdc-ffff8811c9f17c00.lock_count=0
# find /backup/ > /dev/null
# lctl get_param ldlm.namespaces.*mdc*.lock_count
ldlm.namespaces.backup-MDT0000-mdc-ffff8811c9f17c00.lock_count=1000

Without revert patches:
# mount -t lustre 10.0.8.128@o2ib:/backup /backup
# lctl set_param ldlm.namespaces.*mdc*.lru_size=1000
ldlm.namespaces.backup-MDT0000-mdc-ffff8811c9f17c00.lru_size=1000
# lctl get_param ldlm.namespaces.*mdc*.lock_count
ldlm.namespaces.backup-MDT0000-mdc-ffff8811c9f17c00.lock_count=0
# find /backup/ > /dev/null
# lctl get_param ldlm.namespaces.*mdc*.lock_count
ldlm.namespaces.backup-MDT0000-mdc-ffff880c9250b000.lock_count=25966
|
| Comment by Gerrit Updater [ 03/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12733/ |
| Comment by Gerrit Updater [ 04/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12565/ |