[LU-10902] mdd_changelog_user_purge() must check current number of users safer and earlier to disable ChangeLogs recording if none Created: 11/Apr/18 Updated: 06/May/18 Resolved: 06/May/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.12.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Bruno Faccini (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 9223372036854775807 | ||||
| Description |
|
During my analysis of a situation where a ChangeLogs user's purge was taking a very long time (due to single user never being unregistered and idle since ages), I have found that the current code (still true in master) in mdd_changelog_user_purge() only disable ChangeLogs recording when the last user's purge has been synchronously done, meaning that more records will unnecessary continue to be recorded (and compete adding overhead) until the full purge completes : int mdd_changelog_user_purge(const struct lu_env *env,
struct mdd_device *mdd, __u32 id)
{
struct mdd_changelog_user_purge mcup = {
.mcup_id = id,
.mcup_found = false,
.mcup_usercount = 0,
.mcup_minrec = ULLONG_MAX,
};
struct llog_ctxt *ctxt;
int rc;
ENTRY;
CDEBUG(D_IOCTL, "%s: Purge request: id=%u\n",
mdd2obd_dev(mdd)->obd_name, id);
ctxt = llog_get_context(mdd2obd_dev(mdd),
LLOG_CHANGELOG_USER_ORIG_CTXT);
if (ctxt == NULL ||
(ctxt->loc_handle->lgh_hdr->llh_flags & LLOG_F_IS_CAT) == 0)
GOTO(out, rc = -ENXIO);
rc = llog_cat_process(env, ctxt->loc_handle,
mdd_changelog_user_purge_cb, &mcup,
0, 0);
if ((rc == 0) && mcup.mcup_found) {
CDEBUG(D_IOCTL, "%s: Purging changelog entries for user %d "
"record=%llu\n",
mdd2obd_dev(mdd)->obd_name, id, mcup.mcup_minrec);
/* Cancelling record 0 destroys the entire changelog, make sure
we don't do that unless we mean it. */
if (mcup.mcup_minrec != 0 || mcup.mcup_usercount == 0) {
rc = mdd_changelog_llog_cancel(env, mdd, <<<<<< sync, and can take a long
mcup.mcup_minrec); <<<<<<<<< time if huge backlog
}
} else {
CWARN("%s: No changelog for user %u; rc=%d\n",
mdd2obd_dev(mdd)->obd_name, id, rc);
GOTO(out, rc = -ENOENT);
}
if ((rc == 0) && (mcup.mcup_usercount == 0)) {
/* No more users; turn changelogs off */
CDEBUG(D_IOCTL, "turning off changelogs\n");
rc = mdd_changelog_off(env, mdd); <<<<<<<<< ChangeLog recording being stopped
}
EXIT;
out:
if (ctxt != NULL)
llog_ctxt_put(ctxt);
return rc;
}
Also, ChangeLog recording is being stopped without any regard nor protection against the fact a new user may have registered in the interval, leading to a situation where a user can be registered but not ChangeLog recording will happen. Will push a patch soon to fix both problems. |
| Comments |
| Comment by Gerrit Updater [ 16/Apr/18 ] |
|
Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/32007 |
| Comment by Bruno Faccini (Inactive) [ 19/Apr/18 ] |
|
A use case leading to such scenario could occur if an old and unused changelog user is being unregistered and a new one is being registerd (like for a RobinHood operations restart from scratch including full-scan and start of Changelogs gather in parallel) but previous purge has still not completed due to some huge records back-log. |
| Comment by Gerrit Updater [ 06/May/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/32007/ |
| Comment by Peter Jones [ 06/May/18 ] |
|
Landed for 2.12 |