[LU-2979] sanity 133a: proc counter for mkdir on mds1 was not incremented Created: 18/Mar/13 Updated: 14/Aug/16 Resolved: 14/Aug/16 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Di Wang |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | LB |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 7262 |
| Description |
|
This issue was created by maloo for Li Wei <liwei@whamcloud.com>.
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/9b412d48-8dbe-11e2-bb99-52540035b04c.
The sub-test test_133a failed with the following error:
Info required for matching: sanity 133a
== sanity test 133a: Verifying MDT stats ========================================== 09:55:50 (1363366550)
CMD: client-20-ib /usr/sbin/lctl list_param mdt.*.rename_stats
mdt.lustre-MDT0000.rename_stats
CMD: client-20-ib /usr/sbin/lctl set_param mdt.*.md_stats=clear
mdt.lustre-MDT0000.md_stats=clear
CMD: client-21-ib /usr/sbin/lctl set_param obdfilter.*.stats=clear
obdfilter.lustre-OST0000.stats=clear
obdfilter.lustre-OST0001.stats=clear
obdfilter.lustre-OST0002.stats=clear
obdfilter.lustre-OST0003.stats=clear
obdfilter.lustre-OST0004.stats=clear
obdfilter.lustre-OST0005.stats=clear
obdfilter.lustre-OST0006.stats=clear
CMD: client-20-ib /usr/sbin/lctl get_param mdt.lustre-MDT0000.md_stats
sanity test_133a: @@@@@@ FAIL: The counter for mkdir on mds1 was not incremented
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:3973:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:3996:error()
= /usr/lib64/lustre/tests/sanity.sh:8033:check_stats()
= /usr/lib64/lustre/tests/sanity.sh:8056:test_133a()
= /usr/lib64/lustre/tests/test-framework.sh:4251:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:4284:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4139:run_test()
= /usr/lib64/lustre/tests/sanity.sh:8083:main()
Dumping lctl log to /logdir/test_logs/2013-03-15/lustre-reviews-el6-x86_64--review--1_2_1__14053__-70011914322200-085246/sanity.test_133a.*.1363366552.log
CMD: client-20-ib,client-21-ib,client-22-ib,client-23-ib.lab.whamcloud.com /usr/sbin/lctl dk > /logdir/test_logs/2013-03-15/lustre-reviews-el6-x86_64--review--1_2_1__14053__-70011914322200-085246/sanity.test_133a.debug_log.\$(hostname -s).1363366552.log;
dmesg > /logdir/test_logs/2013-03-15/lustre-reviews-el6-x86_64--review--1_2_1__14053__-70011914322200-085246/sanity.test_133a.dmesg.\$(hostname -s).1363366552.log
|
| Comments |
| Comment by Jodi Levi (Inactive) [ 18/Mar/13 ] |
|
Di, |
| Comment by Di Wang [ 18/Mar/13 ] |
|
Jodi, minor is fine with me. IMHO, it is probably a test script or counter tracking problem, which does not impact the "real" functionality. |
| Comment by Sarah Liu [ 21/Mar/13 ] |
|
another instance seen in interop between 2.3.0 client and 2.4 server: |
| Comment by Di Wang [ 23/Apr/13 ] |
|
http://review.whamcloud.com/6136 Add some debug information for the failure to help me understand the problem. |
| Comment by Robert Read (Inactive) [ 10/May/13 ] |
|
IMHO, not having properly functioning performance counters is a pretty major problem. |
| Comment by John Hammond [ 13/May/13 ] |
|
I believe this is a bug related to on-demand allocation of per-cpu stats. If nothing has yet happened on cpu 0, then no stats will have been allocated for cpu 0, and hence lprocfs_stats_counter_get(stats, 0, index) will return NULL. This NULL will in turn be returned from lprocfs_stats_seq_start() and will be interpreted by the seq_file code as an early EOF.

static inline struct lprocfs_counter *
lprocfs_stats_counter_get(struct lprocfs_stats *stats, unsigned int cpuid,
                          int index)
{
        struct lprocfs_counter *cntr;

        cntr = &stats->ls_percpu[cpuid]->lp_cntr[index];
        if ((stats->ls_flags & LPROCFS_STATS_FLAG_IRQ_SAFE) != 0)
                cntr = (void *)cntr + index * sizeof(__s64);
        return cntr;
}

static void *lprocfs_stats_seq_start(struct seq_file *p, loff_t *pos)
{
        struct lprocfs_stats *stats = p->private;

        /* return 1st cpu location */
        return (*pos >= stats->ls_num) ? NULL :
                lprocfs_stats_counter_get(stats, 0, *pos);
}
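
The failure mode described above can be modelled in a small user-space program. This is only a sketch under stated assumptions, not Lustre code: struct stats, stats_add(), stats_sum(), seq_start_buggy(), seq_start_by_pos(), count_visible() and the NUM_CPUS / NUM_COUNTERS constants are all hypothetical names. The position-based start function is just one possible way to avoid the early EOF and is not claimed to be what the patch under review does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_CPUS     4	/* hypothetical CPU count */
#define NUM_COUNTERS 3	/* hypothetical number of counters per stats object */

struct counter {
	long count;
};

/* Per-CPU counter arrays, allocated lazily on first use (hypothetical model). */
struct stats {
	struct counter *percpu[NUM_CPUS];
};

/* Record n events for counter idx on the given CPU, allocating on demand. */
static void stats_add(struct stats *s, int cpu, int idx, long n)
{
	assert(cpu >= 0 && cpu < NUM_CPUS && idx >= 0 && idx < NUM_COUNTERS);
	if (s->percpu[cpu] == NULL)
		s->percpu[cpu] = calloc(NUM_COUNTERS, sizeof(struct counter));
	if (s->percpu[cpu] != NULL)
		s->percpu[cpu][idx].count += n;
}

/* Sum counter idx over every CPU that has an allocated array. */
static long stats_sum(const struct stats *s, int idx)
{
	long total = 0;

	for (int cpu = 0; cpu < NUM_CPUS; cpu++)
		if (s->percpu[cpu] != NULL)
			total += s->percpu[cpu][idx].count;
	return total;
}

/*
 * Models the reported problem: the iterator token is a pointer into the
 * CPU 0 array, so it is NULL (read as EOF) whenever CPU 0 is unallocated.
 */
static void *seq_start_buggy(struct stats *s, long *pos)
{
	if (*pos >= NUM_COUNTERS)
		return NULL;
	return s->percpu[0] != NULL ? &s->percpu[0][*pos] : NULL;
}

/*
 * One possible alternative (an assumption, not the landed patch): use the
 * position itself as the token, which never depends on allocation state.
 */
static void *seq_start_by_pos(struct stats *s, long *pos)
{
	(void)s;
	return *pos < NUM_COUNTERS ? (void *)pos : NULL;
}

typedef void *(*start_fn)(struct stats *, long *);

/* Count how many counters a reader would see before hitting EOF (NULL). */
static int count_visible(struct stats *s, start_fn start)
{
	int shown = 0;

	for (long pos = 0; start(s, &pos) != NULL; pos++)
		shown++;
	return shown;
}
```

In this model, after a single event is recorded on CPU 2 with stats_add(&s, 2, 2, 1), stats_sum(&s, 2) is 1, yet count_visible(&s, seq_start_buggy) is 0 because CPU 0 has no array yet. count_visible(&s, seq_start_by_pos) reports all NUM_COUNTERS entries regardless of which CPUs have been touched.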
|
| Comment by Di Wang [ 13/May/13 ] |
|
Ah, quite possible. I spent so much time investigating the collection side of the problem that I did not look at the listing side. Thank you. |
| Comment by John Hammond [ 13/May/13 ] |
|
Please see http://review.whamcloud.com/6328. |
| Comment by Jodi Levi (Inactive) [ 14/May/13 ] |
|
Patch landed to master |
| Comment by Sarah Liu [ 21/May/13 ] |
|
Verified with the latest tag-2.4.50RC1; the client is running tag-2.4.50RC1 and the server is running 2.3.0. |
| Comment by Andreas Dilger [ 12/Jul/13 ] |
|
http://review.whamcloud.com/6136 is still not landed |
| Comment by James A Simmons [ 14/Aug/16 ] |
|
Old blocker for unsupported version |