[LU-8446] metadata-updates: FAIL: wrong timestamps Created: 27/Jul/16  Updated: 01/Nov/16  Resolved: 29/Sep/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Critical
Reporter: Jian Yu Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: File metadata-updates.1472950648.tar.gz    
Issue Links:
Duplicate
Related
is related to LU-8603 improve metadata-updates.sh with sub-... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

metadata-updates test failed as follows:

onyx-40vm2: /mnt/lustre/d0.metadata-updates/onyx-40vm1/testfile [ atime mtime ] expected : 978307200 1115251200 ;  got : 1469410011 1115251200 
onyx-40vm1: /mnt/lustre/d0.metadata-updates/onyx-40vm2/testfile [ atime mtime ] expected : 978307200 1115251200 ;  got : 1469410011 1115251200 
 metadata-updates : @@@@@@ FAIL: wrong timestamps
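
A minimal sketch for reading failure lines like the two above: it extracts the expected and returned values and reports only the fields that differ, with the epoch values decoded to UTC. The regular expression is an assumption inferred from the two quoted lines, and `diff_timestamps` and `utc` are hypothetical helper names, not part of the Lustre test framework.

```python
import re
from datetime import datetime, timezone

# Assumed layout of a failure line, inferred from the two lines quoted in
# this ticket; the real test script may format other failures differently.
FAIL_RE = re.compile(
    r"^(?P<node>\S+): (?P<path>\S+) \[ (?P<fields>[^\]]+) \] "
    r"expected : (?P<expected>[\d ]+?) ;\s+got : (?P<got>[\d ]+?)\s*$"
)


def diff_timestamps(line):
    """Return {field: (expected, got)} for every field that does not match."""
    m = FAIL_RE.match(line)
    if m is None:
        raise ValueError("unrecognized failure line")
    names = m.group("fields").split()
    expected = [int(v) for v in m.group("expected").split()]
    got = [int(v) for v in m.group("got").split()]
    return {n: (e, g) for n, e, g in zip(names, expected, got) if e != g}


def utc(epoch):
    """Format an epoch value (seconds) as a UTC date and time string."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )


if __name__ == "__main__":
    line = (
        "onyx-40vm2: /mnt/lustre/d0.metadata-updates/onyx-40vm1/testfile "
        "[ atime mtime ] expected : 978307200 1115251200 ;  "
        "got : 1469410011 1115251200 "
    )
    for field, (exp, got) in diff_timestamps(line).items():
        print(f"{field}: expected {utc(exp)}, got {utc(got)}")
```

For the first quoted line this reports only atime: the expected value decodes to 2001-01-01 00:00:00 UTC and the returned value to 2016-07-25 01:26:51 UTC, while mtime matches.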

Maloo reports:
https://testing.hpdd.intel.com/test_sets/825136bc-524f-11e6-bf87-5254006e85c2
https://testing.hpdd.intel.com/test_sets/957409a0-5037-11e6-bf87-5254006e85c2
https://testing.hpdd.intel.com/test_sets/92afe362-5008-11e6-9f8e-5254006e85c2



 Comments   
Comment by Joseph Gmitter (Inactive) [ 01/Aug/16 ]

Hi Jian,

Can you determine which build this problem started occurring in? That will help narrow down the patches that could have introduced the issue.

thanks.
Joe

Comment by Jian Yu [ 02/Aug/16 ]

Hi Joe,

By searching on Maloo, I found the failure started occurring on 2016-06-06:
https://testing.hpdd.intel.com/test_sets/12cab8e0-2c5c-11e6-a0ce-5254006e85c2

The build is https://build.hpdd.intel.com/job/lustre-master/3384. However, it's been removed from Jenkins.

Comment by Saurabh Tandan (Inactive) [ 06/Sep/16 ]

This issue has occurred roughly 100 times in the past 30 days overall.

Comment by Saurabh Tandan (Inactive) [ 07/Sep/16 ]

This issue was first seen on 2016-06-06 in build #3384 on master, tag 2.8.53, which was the last build for this tag:
https://testing.hpdd.intel.com/test_sets/12cab8e0-2c5c-11e6-a0ce-5254006e85c2

Comment by Peter Jones [ 08/Sep/16 ]

Niu

Could you please assist with this issue?

Thanks

Peter

Comment by Jian Yu [ 12/Sep/16 ]

I'm not sure why Maloo doesn't provide a debug log for this kind of test (metadata-updates).

I think this is because the tests in metadata-updates.sh are not wrapped as test_x() sub-tests. When the logs are gathered, the log file names contain no sub-test name, so Maloo doesn't import them (LU-8603).

From report https://testing.hpdd.intel.com/test_sets/12cab8e0-2c5c-11e6-a0ce-5254006e85c2, we can see:

Dumping lctl log to /logdir/test_logs/2016-06-05/lustre-master-el6_7-x86_64-vs-lustre-master-el7-x86_64--full--2_17_1__3384__-69908717151380-111335/metadata-updates..*.1465232271.log
CMD: onyx-32vm3,onyx-32vm4,onyx-32vm5.onyx.hpdd.intel.com,onyx-32vm6 /usr/sbin/lctl dk > /logdir/test_logs/2016-06-05/lustre-master-el6_7-x86_64-vs-lustre-master-el7-x86_64--full--2_17_1__3384__-69908717151380-111335/metadata-updates..debug_log.\$(hostname -s).1465232271.log;
         dmesg > /logdir/test_logs/2016-06-05/lustre-master-el6_7-x86_64-vs-lustre-master-el7-x86_64--full--2_17_1__3384__-69908717151380-111335/metadata-updates..dmesg.\$(hostname -s).1465232271.log

I'm not sure whether autotest has deleted those logs or not. If not, you can find them in /home/autotest/logdir/... or /home/autotest2/logdir/... on the test cluster.
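
The missing sub-test name is visible in the file names of the command above, where two dots follow `metadata-updates`. A small sketch that splits such a name into its parts makes this explicit. The layout `<suite>.<subtest>.<kind>.<host>.<epoch>.log` is an assumption inferred from the quoted paths, `parse_log_name` is a hypothetical helper, and this is not Maloo's actual import logic.

```python
from typing import NamedTuple


class LogName(NamedTuple):
    suite: str
    subtest: str  # empty when the suite has no test_x() sub-tests
    kind: str
    host: str
    epoch: int


def parse_log_name(name):
    """Split a log file name of the assumed form
    <suite>.<subtest>.<kind>.<host>.<epoch>.log into its fields."""
    parts = name.split(".")
    if len(parts) != 6 or parts[-1] != "log":
        raise ValueError(f"unexpected log file name: {name!r}")
    suite, subtest, kind, host, epoch, _ = parts
    return LogName(suite, subtest, kind, host, int(epoch))


if __name__ == "__main__":
    # Name built from the pattern in the command above, with one of the
    # listed hosts substituted for $(hostname -s).
    name = "metadata-updates..debug_log.onyx-32vm3.1465232271.log"
    parsed = parse_log_name(name)
    print(parsed)
    print("has sub-test name:", bool(parsed.subtest))
```

A name from a suite that does use sub-tests would carry a non-empty second field under this assumed layout.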

Comment by Niu Yawei (Inactive) [ 14/Sep/16 ]

Thanks for the log, Yujian.

The log looks normal: the OST updated the LVB correctly, and the clients even returned correct timestamps on glimpse shortly before the failure point. I'll investigate it further.

Comment by Gerrit Updater [ 20/Sep/16 ]

Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/22623
Subject: LU-8446 llite: clear inode timestamps after losing UPDATE lock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c631f414ea2d3e9d3acbb653bc85c10f68dcb9d5

Comment by Gerrit Updater [ 29/Sep/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22623/
Subject: LU-8446 llite: clear inode timestamps after losing UPDATE lock
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 4a5e3556a6016cf5ded9c4126454916ab847a1b6

Comment by Peter Jones [ 29/Sep/16 ]

Landed for 2.9
