[LU-7537] sanity 133c FAIL:The destroy counter on ost is wrong - expected 1 Created: 10/Dec/15  Updated: 16/Mar/17  Resolved: 16/Mar/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.10.0

Type: Bug Priority: Minor
Reporter: Sarah Liu Assignee: WC Triage
Resolution: Fixed Votes: 0
Labels: None
Environment:

before rolling upgrade: 2.5.5 RHEL6.6
after rolling upgrade: master/3264 RHEL6.7 ldiskfs


Attachments: File 7537.tar.gz    
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

After a rolling upgrade from 2.5.5 (RHEL6.6) to master/3264 (RHEL6.7), sanity test 133c failed as shown below. This may be the same issue as LU-2066.

onyx-28: == sanity test 133c: Verifying OST stats ========================================== 18:13:24 (1449713604)
onyx-28: Delete is not completed in 25 seconds
onyx-28: osc.lustre-OST0000-osc-MDT0000.sync_changes=6766
onyx-28: osc.lustre-OST0000-osc-MDT0000.sync_in_flight=0
onyx-28: osc.lustre-OST0000-osc-MDT0000.sync_in_progress=4096
onyx-28: Waiting for local destroys to complete
onyx-28: mdt.lustre-MDT0000.md_stats=clear
onyx-28: obdfilter.lustre-OST0000.stats=clear
onyx-28: 1+0 records in
onyx-28: 1+0 records out
onyx-28: 524288 bytes (524 kB) copied, 0.00680449 s, 77.1 MB/s
onyx-28: write_bytes 1 samples [bytes] 524288 524288 524288
onyx-28: 1+0 records in
onyx-28: 1+0 records out
onyx-28: 1024 bytes (1.0 kB) copied, 0.00397918 s, 257 kB/s
onyx-28: read_bytes 1 samples [bytes] 4096 4096 4096
onyx-28: punch 1 samples [reqs]
onyx-28: Waiting for local destroys to complete
onyx-28: destroy 2671 samples [reqs]
onyx-28:  sanity test_133c: @@@@@@ FAIL: The destroy counter on ost is wrong - expected 1 
onyx-28:   Trace dump:
onyx-28:   = /usr/lib64/lustre/tests/test-framework.sh:4822:error_noexit()
onyx-28:   = /usr/lib64/lustre/tests/test-framework.sh:4853:error()
onyx-28:   = /usr/lib64/lustre/tests/sanity.sh:9118:check_stats()
onyx-28:   = /usr/lib64/lustre/tests/sanity.sh:9226:test_133c()
onyx-28:   = /usr/lib64/lustre/tests/test-framework.sh:5100:run_one()
onyx-28:   = /usr/lib64/lustre/tests/test-framework.sh:5137:run_one_logged()
onyx-28:   = /usr/lib64/lustre/tests/test-framework.sh:5002:run_test()
onyx-28:   = /usr/lib64/lustre/tests/sanity.sh:9230:main()
onyx-28: Dumping lctl log to /tmp/test_logs/2015-12-09/171436/sanity.test_133c.*.1449713652.log
onyx-28: onyx-25: Host key verification failed.
onyx-28: onyx-25: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
onyx-28: onyx-25: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
onyx-28: onyx-27: Host key verification failed.
onyx-28: onyx-27: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
onyx-28: onyx-27: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
onyx-28: onyx-26: Host key verification failed.
onyx-28: onyx-26: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
onyx-28: onyx-26: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
onyx-28: onyx-28: Host key verification failed.
onyx-28: onyx-28: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
onyx-28: onyx-28: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
onyx-28: FAIL 133c (53s)
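
The log above reports "Delete is not completed in 25 seconds" while two of the three `sync_*` counters are still nonzero. As a rough illustration only, the sketch below pulls those counters out of the log lines and flags whether any are nonzero. The helper names `parse_sync_counters` and `destroys_pending` are hypothetical, and treating a nonzero counter as "destroys may still be queued" is an assumption, not something the ticket states.

```python
def parse_sync_counters(lines):
    """Collect sync_* counters from 'host: param.path.name=value' log lines."""
    counters = {}
    for line in lines:
        # Drop the leading "onyx-28: " style host prefix, if any.
        text = line.split(": ", 1)[-1].strip()
        if "=" not in text:
            continue
        name, _, value = text.partition("=")
        key = name.rsplit(".", 1)[-1]  # last dotted component, e.g. sync_changes
        if key.startswith("sync_") and value.isdigit():
            counters[key] = int(value)
    return counters


def destroys_pending(counters):
    """Assumption: any nonzero sync_* counter means work is still queued."""
    return any(value > 0 for value in counters.values())


# Lines copied from the log in this ticket.
LOG = [
    "onyx-28: Delete is not completed in 25 seconds",
    "onyx-28: osc.lustre-OST0000-osc-MDT0000.sync_changes=6766",
    "onyx-28: osc.lustre-OST0000-osc-MDT0000.sync_in_flight=0",
    "onyx-28: osc.lustre-OST0000-osc-MDT0000.sync_in_progress=4096",
    "onyx-28: mdt.lustre-MDT0000.md_stats=clear",
]

counters = parse_sync_counters(LOG)
print(counters, destroys_pending(counters))
```

If the counters were all zero before the stats were cleared, a later destroy count of 1 would be easier to trust; here they were not.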


 Comments   
Comment by Andreas Dilger [ 10/Dec/15 ]

Sarah, are there logs in Maloo for this test failure? Are the client and server versions the same?

Comment by Andreas Dilger [ 10/Dec/15 ]

It looks like the destroy count is 2671 when it should be 1.
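
For illustration, the comparison described here can be sketched as below. This is not the `check_stats()` code from sanity.sh; `parse_stat` and `check_counter` are hypothetical names, and the only assumption about the format is the `name count samples [unit]` layout visible in the log.

```python
import re

# Matches stats lines such as "destroy 2671 samples [reqs]".
STAT_RE = re.compile(r"^(?P<name>\w+)\s+(?P<count>\d+)\s+samples\s+\[\w+\]")


def parse_stat(line):
    """Return (name, sample_count) for a stats line, or None if it is not one."""
    # Drop the leading "onyx-28: " style host prefix, if any.
    text = line.split(": ", 1)[-1].strip()
    match = STAT_RE.match(text)
    if match is None:
        return None
    return match.group("name"), int(match.group("count"))


def check_counter(lines, name, expected):
    """Return (ok, actual) for the first stats line whose name matches."""
    for line in lines:
        parsed = parse_stat(line)
        if parsed is not None and parsed[0] == name:
            return parsed[1] == expected, parsed[1]
    return False, None  # counter not found in the log


# Lines copied from the log in this ticket.
LOG = [
    "onyx-28: write_bytes 1 samples [bytes] 524288 524288 524288",
    "onyx-28: read_bytes 1 samples [bytes] 4096 4096 4096",
    "onyx-28: punch 1 samples [reqs]",
    "onyx-28: Waiting for local destroys to complete",
    "onyx-28: destroy 2671 samples [reqs]",
]

for counter in ("write_bytes", "read_bytes", "punch", "destroy"):
    ok, actual = check_counter(LOG, counter, 1)
    print(counter, "ok" if ok else "WRONG", actual)
```

Only the destroy counter fails the check; the write, read, and punch counters each show the single expected sample.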

Comment by Sarah Liu [ 10/Dec/15 ]

Andreas,

I will upload the related logs to this ticket. The server and client are the same version after the rolling upgrade of all of them to master.

Comment by Sarah Liu [ 10/Dec/15 ]

logs

Comment by Gerrit Updater [ 22/Feb/17 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: https://review.whamcloud.com/25583
Subject: LU-7537 tests: clean up sanity test_133c
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 23c42c4f5ae9705b8a92b586a2a83e082b534d6e

Comment by Gerrit Updater [ 16/Mar/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25583/
Subject: LU-7537 tests: clean up sanity test_133c
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6937a7dc35f2c28403d0f5ceb5e15b67db3b78fe

Comment by Peter Jones [ 16/Mar/17 ]

Landed for 2.10
