[LU-8999] sanity-quota test_38: skipped id entries Created: 10/Jan/17 Updated: 21/Jan/19 Resolved: 08/Aug/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.10.2 |
| Fix Version/s: | Lustre 2.12.0, Lustre 2.10.7 |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | Hongchao Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
RHEL 7.3 Server/Client |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/d6ff8e38-d464-11e6-ace4-5254006e85c2.

The sub-test test_38 failed with the following error:

skipped id entries

test_logs:

== sanity-quota test 38: Quota accounting iterator doesn't skip id entries =========================== 05:03:51 (1483707831)
Waiting for local destroys to complete
Creating test directory
CMD: onyx-35vm3,onyx-35vm4 lctl set_param fail_val=0 fail_loc=0
fail_val=0
fail_loc=0
fail_val=0
fail_loc=0
Create 10000 files...
CMD: onyx-35vm3 lctl set_param -n osd*.*MDT*.force_sync=1
CMD: onyx-35vm4 lctl set_param -n osd*.*OS*.force_sync=1
CMD: onyx-35vm3 /usr/sbin/lctl get_param osd-ldiskfs.lustre-MDT0000.quota_slave.acct_user
Found 10010 id entries
 sanity-quota test_38: @@@@@@ FAIL: skipped id entries |
| Comments |
| Comment by Nathaniel Clark [ 02/Feb/17 ] |
|
Another failure on master: https://testing.hpdd.intel.com/test_sets/0fa1f43e-e8c8-11e6-935d-5254006e85c2 |
| Comment by James Casper [ 24/May/17 ] |
|
2.9.57, b3575: |
| Comment by Gerrit Updater [ 04/Aug/17 ] |
|
Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/28345 |
| Comment by Gerrit Updater [ 14/Aug/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28530 |
| Comment by Gerrit Updater [ 17/Aug/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28345/ |
| Comment by Peter Jones [ 17/Aug/17 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 18/Aug/17 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28530/ |
| Comment by Sarah Liu [ 19/Sep/17 ] |
|
Still hit this problem on b2_10 during 2.10.1 RC1 testing (RHEL 7.4, ZFS). Another one on 2.10.1 after the patch landed. |
| Comment by Jian Yu [ 22/Sep/17 ] |
|
The original failure occurred in an ldiskfs test session, while the patch fixed ZFS-related code. |
| Comment by James Nunez (Inactive) [ 25/Oct/17 ] |
|
Hongchao - Would you please comment on this issue? Thank you. |
| Comment by Hongchao Zhang [ 16/Nov/17 ] |
|
status update: |
| Comment by Gerrit Updater [ 17/Nov/17 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/30145 |
| Comment by Gerrit Updater [ 24/Nov/17 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/30243 |
| Comment by Gerrit Updater [ 17/Dec/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30243/ |
| Comment by Peter Jones [ 17/Dec/17 ] |
|
It looks to me like a correction for the test has landed and I expect the debug patch will now be abandoned. |
| Comment by Gerrit Updater [ 21/Dec/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30631 |
| Comment by Gerrit Updater [ 04/Jan/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30631/ |
| Comment by Minh Diep [ 04/Jan/18 ] |
|
doesn't look like this is fixed |
| Comment by Hongchao Zhang [ 05/Jan/18 ] |
|
I have checked the log: there are 13 extra quota IDs, which could have been left by other tests, but the quota IDs 0 ~ 9999 required by
[root@zhanghc ~]# grep "id:" quota_ids | awk '{print $3}' | sort -n
0
1
...
9998
9999
10273
14421
18394
19745
20974
23221
23332
25594
27020
29911
30457
30979
31362
- id: 10273
  usage: { inodes: 1, kbytes: 4 }
- id: 14421
  usage: { inodes: 1, kbytes: 4 }
- id: 18394
  usage: { inodes: 1, kbytes: 0 }
- id: 19745
  usage: { inodes: 1, kbytes: 4 }
- id: 20974
  usage: { inodes: 1, kbytes: 4 }
- id: 23221
  usage: { inodes: 1, kbytes: 0 }
- id: 23332
  usage: { inodes: 1, kbytes: 0 }
- id: 25594
  usage: { inodes: 1, kbytes: 4 }
- id: 27020
  usage: { inodes: 1, kbytes: 0 }
- id: 29911
  usage: { inodes: 1, kbytes: 0 }
- id: 30457
  usage: { inodes: 1, kbytes: 0 }
- id: 30979
  usage: { inodes: 1, kbytes: 4 }
- id: 31362
  usage: { inodes: 1, kbytes: 0 }
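
The distinction drawn above, between IDs that are missing and IDs that are merely extra, can be sketched in C. This is only an illustration of the check, not code from sanity-quota or Lustre: the helper name `count_missing_ids` and its parameters are hypothetical. It reports how many IDs in the required range are absent and separately counts IDs outside that range, so leftovers from other tests do not register as skipped entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of Lustre or its test suite).
 * Returns the number of ids in [0, required) that do not appear in ids[].
 * Ids at or above `required` are counted in *extra and otherwise ignored.
 */
size_t count_missing_ids(const unsigned *ids, size_t n,
			 size_t required, size_t *extra)
{
	bool *seen;
	size_t missing = 0;
	size_t i;

	assert(extra != NULL);
	assert(ids != NULL || n == 0);

	*extra = 0;
	if (required == 0) {
		/* Nothing is required, so every listed id is an extra. */
		*extra = n;
		return 0;
	}

	seen = calloc(required, sizeof(*seen));
	if (!seen)
		return required; /* treat allocation failure as "all missing" */

	for (i = 0; i < n; i++) {
		if (ids[i] < required)
			seen[ids[i]] = true;
		else
			(*extra)++;
	}

	for (i = 0; i < required; i++)
		if (!seen[i])
			missing++;

	free(seen);
	return missing;
}
```

With the listing above as input (IDs 0 through 9999 plus the 13 higher IDs), such a check would report 13 extras; whether it reports any missing IDs depends on the entries elided by the "..." in the listing.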
|
| Comment by Gerrit Updater [ 05/Jan/18 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/30730 |
| Comment by Gerrit Updater [ 14/Jan/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30730/ |
| Comment by Peter Jones [ 14/Jan/18 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 17/Jan/18 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30896 |
| Comment by Saurabh Tandan (Inactive) [ 31/Jan/18 ] |
|
Reopening the issue as it still occurs for 2.10.57 "SLES 12 SP3 Server/DNE/ldiskfs SLES 12 SP3 Client" config. |
| Comment by Gerrit Updater [ 31/Jan/18 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/31098 |
| Comment by Gerrit Updater [ 09/Feb/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30896/ |
| Comment by Hongchao Zhang [ 22/Mar/18 ] |
|
I have managed to reproduce this issue in my local VMs, and there are two problems that cause it.
1. The problem caused by multiple calls of "lprocfs_quota_seq_start" of the seq_operations.
2. There is a problem in "walk_tree_dqentry":
int walk_tree_dqentry(const struct lu_env *env, struct osd_object *obj,
		      int type, uint blk, int depth, uint index,
		      struct osd_it_quota *it)
{
	dqbuf_t buf = getdqbuf();
	loff_t ret;
	u32 *ref = (u32 *) buf;

	ENTRY;
	if (!buf)
		RETURN(-ENOMEM);
	ret = quota_read_blk(env, obj, type, blk, buf);
	if (ret < 0) {
		CERROR("Can't read quota tree block %u.\n", blk);
		goto out_buf;
	}
	ret = 1;
	for (; index <= 0xff && ret > 0; index++) {
		blk = le32_to_cpu(ref[index]);
		if (!blk) /* No reference */
			continue;
		if (depth < LUSTRE_DQTREEDEPTH - 1)
			ret = walk_tree_dqentry(env, obj, type, blk,
						depth + 1, 0, it);
		else
			/* <--- here, if ret == 0, the "index++" in the "for"
			 * clause will still be called and cause one entry
			 * to be skipped. */
			ret = walk_block_dqentry(env, obj, type, blk, 0, it);
	}
	it->oiq_blk[depth + 1] = blk;
	it->oiq_index[depth] = index;
out_buf:
	freedqbuf(buf);
	RETURN(ret);
}
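
The loop pitfall annotated above can be reproduced in isolation. The following is a simplified model, not the Lustre code and not necessarily how the landed patch addresses it: `struct cursor`, `visit`, `walk_buggy`, `walk_fixed` and `count_skipped` are all hypothetical names, and the model assumes that a return value of 0 means "pause, and resume at this same index on the next call". Under that assumption, a `for` loop that tests `ret > 0` in its condition still executes `index++` once after `ret` drops to 0, so the saved position lands one slot too far.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical cursor standing in for the saved iterator position. */
struct cursor {
	size_t index;		/* slot at which the next walk resumes */
	int budget;		/* slots the visitor may still accept this pass */
	int seen[NSLOTS];	/* how many times each slot was consumed */
};

/* Returns 1 if the slot was consumed, 0 if the walk must pause at it. */
static int visit(struct cursor *c, size_t slot)
{
	if (c->budget == 0)
		return 0;
	c->budget--;
	c->seen[slot]++;
	return 1;
}

/* Same loop shape as the quoted function: index++ runs even when ret == 0. */
int walk_buggy(struct cursor *c)
{
	size_t index = c->index;
	int ret = 1;

	for (; index < NSLOTS && ret > 0; index++)
		ret = visit(c, index);
	c->index = index;	/* one past the slot that was NOT consumed */
	return ret;
}

/* One possible correction: leave the loop before the increment. */
int walk_fixed(struct cursor *c)
{
	size_t index = c->index;
	int ret = 1;

	while (index < NSLOTS) {
		ret = visit(c, index);
		if (ret <= 0)
			break;	/* index still names the unconsumed slot */
		index++;
	}
	c->index = index;
	return ret;
}

/* Walk every slot in passes of `budget`; return how many were never seen. */
int count_skipped(int (*walk)(struct cursor *), int budget)
{
	struct cursor c = { 0 };
	int skipped = 0;
	size_t i;

	assert(budget > 0);	/* a zero budget would never make progress */
	while (c.index < NSLOTS) {
		c.budget = budget;
		walk(&c);
	}
	for (i = 0; i < NSLOTS; i++)
		if (c.seen[i] == 0)
			skipped++;
	return skipped;
}
```

With a budget of 3, `count_skipped(walk_buggy, 3)` loses slots 3 and 7 (one per pause), while `count_skipped(walk_fixed, 3)` visits all eight slots.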
|
| Comment by Gerrit Updater [ 22/Mar/18 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/31721 |
| Comment by Gerrit Updater [ 09/Apr/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31721/ |
| Comment by Peter Jones [ 09/Apr/18 ] |
|
Landed for 2.12 |
| Comment by Cliff White (Inactive) [ 17/May/18 ] |
|
We are seeing this again on 2.10.4 |
| Comment by Gerrit Updater [ 28/May/18 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/32567 |
| Comment by Gerrit Updater [ 28/May/18 ] |
|
Hongchao Zhang (hongchao.zhang@intel.com) uploaded a new patch: https://review.whamcloud.com/32568 |
| Comment by Peter Jones [ 08/Aug/18 ] |
|
Should not have been reopened for ticket affecting LTS branch |
| Comment by Gerrit Updater [ 19/Jan/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32567/ |