[LU-1726] Test failure on test suite runtests Created: 08/Aug/12  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: Jinshan Xiong (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: None

Attachments: File 1726.tar.gz    
Severity: 3
Rank (Obsolete): 10166

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/29c65974-dccd-11e1-8744-52540035b04c.

runtests : @@@@@@ FAIL: old and new files are different: rc=22 


 Comments   
Comment by Jodi Levi (Inactive) [ 09/Aug/12 ]

Sarah, can you provide the logs for this, please?

Comment by Peter Jones [ 10/Aug/12 ]

Lai

Could you please look into this one?

Thanks

Peter

Comment by Lai Siyao [ 10/Aug/12 ]

The log shows that the copied files are identical right after the copy, but after Lustre is stopped and started again, all file contents are different.

I can't reproduce it, but I suspect file data is not flushed correctly upon umount.

Hi Sarah, does this failure occur every time, or was it observed only once? If it happens often, could you reserve the failed system and let me check there? In the meantime, I'll try to reproduce it locally.
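One way to narrow down the suspicion above is to record a checksum of every copied file before the filesystem is stopped and to compare against that record after it is started again. The following is only a minimal sketch of that idea, not the runtests script itself; the function names (file_digest, snapshot, save_manifest, compare) and the manifest file are hypothetical, and it assumes the manifest is written to a location outside the filesystem under test.

```python
import hashlib
import json
import os
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each regular file under root (relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(root, manifest_path):
    """Record digests before the stop/start cycle.

    manifest_path is assumed to live OUTSIDE the filesystem under test,
    so the record itself cannot be affected by the suspected problem.
    """
    manifest = snapshot(root)
    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)
        f.flush()
        os.fsync(f.fileno())  # make sure the record itself reaches disk
    return manifest


def compare(root, manifest_path):
    """Return (changed, missing) relative paths versus the saved record."""
    with open(manifest_path) as f:
        before = json.load(f)
    after = snapshot(root)
    changed = sorted(k for k in before if k in after and before[k] != after[k])
    missing = sorted(k for k in before if k not in after)
    return changed, missing
```

In this sketch, save_manifest would be called on the copied tree right after the copy, and compare after the filesystem is mounted again. A non-empty changed list would show which files differ, which may help tell whether all files or only the most recently written ones are affected.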

Comment by Sarah Liu [ 10/Aug/12 ]

I found only two failure instances in master. I ran this test 4 times and could not reproduce it, though.
https://maloo.whamcloud.com/test_sessions/f09de4de-dc84-11e1-853a-52540035b04c
https://maloo.whamcloud.com/test_sessions/27d75f28-dccd-11e1-8744-52540035b04c

It looks like this error is seen more often in b2_1:

https://maloo.whamcloud.com/test_sets/query?utf8=%E2%9C%93&test_set[test_set_script_id]=f946ba8e-32bc-11e0-aaee-52540025f9ae&test_set[status]=FAIL&test_set[query_bugs]=&test_session[test_host]=&test_session[test_group]=&test_session[user_id]=a9a64c70-4b39-11e0-9bc2-52540025f9af&test_session[query_date]=&test_session[query_recent_period]=&test_node[os_type_id]=&test_node[distribution_type_id]=&test_node[architecture_type_id]=&test_node[file_system_type_id]=&test_node[lustre_branch_id]=&test_node_network[network_type_id]=&commit=Update+results

Comment by Sarah Liu [ 10/Aug/12 ]

Lai, I tried b2_1/build #109 twice and got this failure once, so I think it would be easier to reproduce on that branch. Attached are the debug and dmesg logs of the client and server. If you still have trouble reproducing it on your local system, please let me know.

https://maloo.whamcloud.com/test_sets/7030d978-e323-11e1-9f91-52540035b04c

Comment by Sarah Liu [ 10/Aug/12 ]

debug and dmesg logs

Comment by Lai Siyao [ 16/Aug/12 ]

I don't see anything special in the logs, and I can't reproduce it anyway.

Hi Jinshan, I suspect dirty data is not flushed to disk upon umount. Do you have any suggestions on how to debug such a case?

Comment by Jinshan Xiong (Inactive) [ 16/Aug/12 ]

I'll take a look at this bug

Comment by Jodi Levi (Inactive) [ 23/Aug/12 ]

This is not happening on the latest tag. If it happens again on future tags, we can mark it as a blocker again.

Comment by Andreas Dilger [ 29/May/17 ]

Close old ticket.

Generated at Sat Feb 10 01:19:09 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.