[LU-3246] fc18: sanity test_133c: @@@@@@ FAIL: The counter for destroy on ost was not incremented
Created: 30/Apr/13  Updated: 05/Aug/15  Resolved: 05/Aug/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0, Lustre 2.4.1
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Minh Diep
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: MB, yuc2
Environment:

https://maloo.whamcloud.com/test_sets/f457225a-aec7-11e2-a127-52540035b04c


Issue Links:
Related
is related to LU-4513 sanity test_220: prealloc_last_id: Fo... Resolved
Severity: 3
Rank (Obsolete): 8033

 Description   

== sanity test 133c: Verifying OST stats ============================================================= 17:23:17 (1366935797)
CMD: client-27vm7 lctl set_param -n osd*.*MD*.force_sync 1
CMD: client-27vm7 lctl get_param -n osc.*MDT*.sync_*
CMD: client-27vm7 lctl get_param -n osc.*MDT*.sync_*
CMD: client-27vm7 lctl get_param -n osc.*MDT*.sync_*
CMD: client-27vm7 lctl get_param -n osc.*MDT*.sync_*
CMD: client-27vm7 lctl get_param -n osc.*MDT*.sync_*
CMD: client-27vm7 lctl get_param -n osc.*MDT*.sync_*
Waiting for local destroys to complete
CMD: client-27vm7 /usr/sbin/lctl set_param mdt.*.md_stats=clear
mdt.lustre-MDT0000.md_stats=clear
CMD: client-27vm8 /usr/sbin/lctl set_param obdfilter.*.stats=clear
obdfilter.lustre-OST0000.stats=clear
obdfilter.lustre-OST0001.stats=clear
obdfilter.lustre-OST0002.stats=clear
obdfilter.lustre-OST0003.stats=clear
obdfilter.lustre-OST0004.stats=clear
obdfilter.lustre-OST0005.stats=clear
obdfilter.lustre-OST0006.stats=clear
1+0 records in
1+0 records out
524288 bytes (524 kB) copied, 0.00290828 s, 180 MB/s
CMD: client-27vm8 /usr/sbin/lctl get_param obdfilter.lustre-OST0000.stats
write_bytes 1 samples [bytes] 524288 524288 524288
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.0022926 s, 447 kB/s
CMD: client-27vm8 /usr/sbin/lctl get_param obdfilter.lustre-OST0000.stats
read_bytes 1 samples [bytes] 4096 4096 4096
CMD: client-27vm8 /usr/sbin/lctl get_param obdfilter.lustre-OST0000.stats
punch 1 samples [reqs]
Waiting for local destroys to complete
CMD: client-27vm8 /usr/sbin/lctl get_param obdfilter.lustre-OST0000.stats

sanity test_133c: @@@@@@ FAIL: The counter for destroy on ost was not incremented
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4024:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4047:error()
= /usr/lib64/lustre/tests/sanity.sh:8163:check_stats()
= /usr/lib64/lustre/tests/sanity.sh:8274:test_133c()
= /usr/lib64/lustre/tests/test-framework.sh:4301:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:4334:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4189:run_test()
= /usr/lib64/lustre/tests/sanity.sh:8278:main()
Dumping lctl log to /logdir/test_logs/2013-04-25/lustre-reviews-el6-x86_64-vs-lustre-reviews-fc18-x86_64-full-2_4_1_15112_-70194508889820-153004/sanity.test_133c.*.1366935807.log
CMD: client-27vm1,client-27vm2.lab.whamcloud.com,client-27vm7,client-27vm8 /usr/sbin/lctl dk > /logdir/test_logs/2013-04-25/lustre-reviews-el6-x86_64-vs-lustre-reviews-fc18-x86_64-full-2_4_1_15112_-70194508889820-153004/sanity.test_133c.debug_log.\$(hostname -s).1366935807.log;
dmesg > /logdir/test_logs/2013-04-25/lustre-reviews-el6-x86_64-vs-lustre-reviews-fc18-x86_64-full-2_4_1_15112_-70194508889820-153004/sanity.test_133c.dmesg.\$(hostname -s).1366935807.log
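
For reference, the kind of check that failed here can be sketched as below. This is only an illustration under stated assumptions, not the actual check_stats() code from sanity.sh: the helper name counter_samples is hypothetical, and the stats text is pasted from the log lines above rather than read live with "lctl get_param obdfilter.lustre-OST0000.stats". It shows that write_bytes, read_bytes and punch each report one sample, while no destroy line is present at all.

```shell
#!/bin/sh
# Hypothetical helper (not part of test-framework.sh): read obdfilter stats
# text on stdin and print the sample count for the counter named in $1.
# Prints 0 when the counter does not appear in the stats at all.
counter_samples() {
    awk -v name="$1" '$1 == name { n = $2 } END { print n + 0 }'
}

# Sample input assembled from the stats lines quoted in the log above.
# On a live system this text would come from the OSS via lctl get_param.
stats='write_bytes 1 samples [bytes] 524288 524288 524288
read_bytes 1 samples [bytes] 4096 4096 4096
punch 1 samples [reqs]'

# Report each operation the test exercises; destroy is the one that is missing.
for op in write_bytes read_bytes punch destroy; do
    count=$(printf '%s\n' "$stats" | counter_samples "$op")
    if [ "$count" -gt 0 ]; then
        echo "$op: $count samples"
    else
        echo "$op: counter was not incremented"
    fi
done
```

With the sample input, only the destroy line reports that the counter was not incremented, which matches the failure message. Since the destroy is processed asynchronously (note the "Waiting for local destroys to complete" step), a check like this can race with the OST if the wait returns too early.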



 Comments   
Comment by Sarah Liu [ 15/May/13 ]

Hit a similar issue when testing interop between a 2.3.0 client and a 2.4 server:
https://maloo.whamcloud.com/test_sets/e89c3aac-bbee-11e2-b013-52540035b04c

Comment by Jian Yu [ 09/Aug/13 ]

Lustre Branch: b2_4
Lustre Build: http://build.whamcloud.com/job/lustre-b2_4/27/
Distro/Arch: RHEL6.4/x86_64 + FC18/x86_64 (Server + Client)

The failure occurred regularly on the Lustre b2_4 branch with an FC18 client:
https://maloo.whamcloud.com/test_sets/996b2278-fd79-11e2-9fdb-52540035b04c

Comment by Jian Yu [ 04/Sep/13 ]

Lustre client: http://build.whamcloud.com/job/lustre-b2_3/41/ (2.3.0)
Lustre server: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1)

sanity test 133c hit the same failure:
https://maloo.whamcloud.com/test_sets/cde5bab8-14f3-11e3-9828-52540035b04c

Comment by Jian Yu [ 04/Sep/13 ]

Lustre build: http://build.whamcloud.com/job/lustre-b2_4/44/ (2.4.1 RC1)
Distro/Arch: RHEL6.4/x86_64 + FC18/x86_64 (Server + Client)

sanity test 133c hit the same failure:
https://maloo.whamcloud.com/test_sets/0cbde1d0-14ee-11e3-ac48-52540035b04c

Comment by Jian Yu [ 19/Dec/13 ]

Lustre client: http://build.whamcloud.com/job/lustre-b2_3/41/ (2.3.0)
Lustre server: http://build.whamcloud.com/job/lustre-b2_4/69/ (2.4.2 RC1)

sanity test 133c hit the same failure:
https://maloo.whamcloud.com/test_sets/3c82ea6e-685e-11e3-a16f-52540035b04c

Comment by Andreas Dilger [ 20/Jan/14 ]

This has suddenly started to be hit very often on master.

Comment by James A Simmons [ 20/Jan/14 ]

It's due to patch 8029 landing. I'm looking at a fix.

Comment by Jodi Levi (Inactive) [ 04/Feb/14 ]

James,
Have you had a chance to make any progress on this one?

Comment by Jodi Levi (Inactive) [ 04/Feb/14 ]

We are seeing this failing less than 2% of the time, so reducing from blocker.

Comment by James A Simmons [ 16/May/14 ]

Is this still showing up? Can we close it?

Comment by James A Simmons [ 05/Aug/15 ]

This is an old ticket that can be closed.

Comment by Peter Jones [ 05/Aug/15 ]

ok
