[LU-5490] sanity test_133d: FAIL: samedir_rename_size error
Created: 14/Aug/14  Updated: 03/May/18  Resolved: 15/Mar/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0, Lustre 2.5.3, Lustre 2.9.0
Fix Version/s: Lustre 2.11.0, Lustre 2.10.4

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: Nathaniel Clark
Resolution: Fixed
Votes: 0
Labels: zfs
Environment:

Lustre build: https://build.hpdd.intel.com/job/lustre-master/2615/
Distro/Arch: RHEL6.5/x86_64
FSTYPE=zfs


Issue Links:
Related
is related to LU-8066 Move lustre procfs handling to sysfs ... Open
is related to LU-4203 Test failure sanity test_133d: crossd... Resolved
Severity: 3
Rank (Obsolete): 15318

 Description   

sanity test 133d failed as follows:

== sanity test 133d: Verifying rename_stats ========================================== 01:32:10 (1408005130)
CMD: onyx-47vm7 /usr/sbin/lctl list_param mdt.*.rename_stats
mdt.lustre-MDT0000.rename_stats
CMD: onyx-47vm7 /usr/sbin/lctl set_param mdt.*.rename_stats=clear
mdt.lustre-MDT0000.rename_stats=clear
total: 512 creates in 1.52 seconds: 336.70 creates/second
source rename dir size: 64K
target rename dir size: 8K
CMD: onyx-47vm7 /usr/sbin/lctl get_param mdt.*.rename_stats
mdt.lustre-MDT0000.rename_stats=
rename_stats:
- snapshot_time:  1408005133.29235
- same_dir       
      512bytes: { sample:   1, pct: 100, cum_pct: 100 }
CMD: onyx-47vm7 /usr/sbin/lctl get_param mdt.*.rename_stats
CMD: onyx-47vm7 /usr/sbin/lctl get_param mdt.*.rename_stats
 sanity test_133d: @@@@@@ FAIL: samedir_rename_size error

Maloo report: https://testing.hpdd.intel.com/test_sets/4b98ba1c-23f5-11e4-b2ba-5254006e85c2



 Comments   
Comment by Jian Yu [ 24/Aug/14 ]

Lustre Build: https://build.hpdd.intel.com/job/lustre-b2_5/84/
Distro/Arch: RHEL6.5/x86_64
FSTYPE=zfs

The same failure occurred:
https://testing.hpdd.intel.com/test_sets/880a64a2-2b23-11e4-bb80-5254006e85c2

Comment by James Nunez (Inactive) [ 25/Mar/15 ]

I hit this failure on master (pre-2.8.0) in review-zfs. The results are at https://testing.hpdd.intel.com/test_sessions/12f05e44-d21a-11e4-a005-5254006e85c2

Comment by Emoly Liu [ 04/May/15 ]

Another failure instance on master in review-zfs: https://testing.hpdd.intel.com/test_sets/efa92684-eaf9-11e4-95aa-5254006e85c2

Comment by Andreas Dilger [ 12/May/15 ]

Between this failure and LU-4203 there have been 11 failures in the past month. I suspect there is some inconsistency between how osd-zfs records the directory size in the rename stats (which I think is the number of entries) and how the test or /proc checks it (which is the number of bytes). I don't think we need to change osd-zfs to record the stats differently, but rather fix the test so that it handles this consistently. It isn't clear why the test only fails intermittently.
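
To illustrate the entries-versus-bytes hypothesis, here is a minimal sketch. It assumes the histogram buckets are powers of two, which the labels in the log (512bytes, 8K, 64K) are consistent with but do not prove. round_up_pow2 is a hypothetical helper, not a function from sanity.sh or osd-zfs.

```shell
# Hypothetical helper: round a positive integer up to the next power of two,
# as a stand-in for how a power-of-two histogram would bucket a value.
round_up_pow2() {
    local n=$1 p=1
    while [ "$p" -lt "$n" ]; do
        p=$((p * 2))
    done
    echo "$p"
}

round_up_pow2 512      # size counted as entries (the log shows 512 creates)
round_up_pow2 65536    # size counted as bytes (the test printed 64K)
```

Under this assumption the two ways of measuring the same directory land in different buckets (512 versus 65536), so a lookup keyed on the byte size would miss a sample recorded by entry count.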

Comment by James Nunez (Inactive) [ 10/Jul/15 ]

More failures on master at:
2015-07-10 03:43:38 - https://testing.hpdd.intel.com/test_sets/840b7844-26ec-11e5-b3d7-5254006e85c2
2015-07-22 20:01:38 - https://testing.hpdd.intel.com/test_sets/c4fdb934-30f1-11e5-a788-5254006e85c2
2015-08-24 15:36:45 - https://testing.hpdd.intel.com/test_sets/fb01ec64-4aad-11e5-88e8-5254006e85c2
2016-01-04 01:56:27 - https://testing.hpdd.intel.com/test_sets/fafafd4a-b2b4-11e5-8114-5254006e85c2
2016-02-02 00:40:04 - https://testing.hpdd.intel.com/test_sets/95eca8d4-c968-11e5-9e6a-5254006e85c2
2016-02-04 07:09:54 - https://testing.hpdd.intel.com/test_sets/57eaf714-cb3b-11e5-b49e-5254006e85c2
2016-02-04 09:47:15 - https://testing.hpdd.intel.com/test_sets/4b7d3746-cb4d-11e5-be8d-5254006e85c2

Comment by Andreas Dilger [ 05/Feb/16 ]

Nathaniel, could you take a crack at this if you have time? It is not critical, but it is a source of ongoing annoyance for patch landings.

Comment by Nathaniel Clark [ 11/Feb/16 ]

The "cause" of the error seems to be that the directory size reported by ls/stat is one value (e.g. "64K") while the size in mdt.lustre-MDT0000.rename_stats is another (e.g. "512bytes"). I am not sure yet why they differ, but this mismatch is why the test is failing.

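The mismatch can be reproduced against the rename_stats text from the description. The following is a sketch only: rename_samples is a hypothetical helper, not the function used by sanity.sh, and the field layout is assumed from the single log excerpt in this ticket.

```shell
# Hypothetical helper (not from sanity.sh): print the sample count recorded
# for a size label in one section of rename_stats output read from stdin.
rename_samples() {
    local section=$1 label=$2
    awk -v section="$section" -v label="$label:" '
        # Lines such as "- same_dir" open a section; track whether it is ours.
        $1 == "-" { in_section = ($2 == section); next }
        # Bucket lines look like: 512bytes: { sample:   1, pct: 100, ... }
        in_section && $1 == label {
            gsub(/,/, "", $4)      # strip the trailing comma: "1," -> "1"
            print $4
            found = 1
        }
        END { if (!found) print 0 }
    '
}

# rename_stats output copied from the description of this ticket.
stats='rename_stats:
- snapshot_time:  1408005133.29235
- same_dir
      512bytes: { sample:   1, pct: 100, cum_pct: 100 }'

printf '%s\n' "$stats" | rename_samples same_dir 64K        # prints 0
printf '%s\n' "$stats" | rename_samples same_dir 512bytes   # prints 1
```

Looking up the label derived from ls/stat ("64K") finds no sample, while the label the MDT actually recorded ("512bytes") finds one, which matches the failure reported by the test.
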
Comment by Nathaniel Clark [ 03/Mar/16 ]

This is still happening on master:
https://testing.hpdd.intel.com/test_sets/242be33c-e0d2-11e5-97b5-5254006e85c2
https://testing.hpdd.intel.com/test_sets/75019ae8-dfee-11e5-9400-5254006e85c2
https://testing.hpdd.intel.com/test_sets/29b44c56-df1d-11e5-8471-5254006e85c2

Comment by Bob Glossman (Inactive) [ 21/Apr/16 ]

Another on master:
https://testing.hpdd.intel.com/test_sets/233fe796-0749-11e6-b5f1-5254006e85c2

Comment by Patrick Farrell (Inactive) [ 12/Sep/17 ]

+1 on master:
https://testing.hpdd.intel.com/test_sets/1d65eb6a-975b-11e7-b761-5254006e85c2

Comment by Gerrit Updater [ 18/Sep/17 ]

Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/29053
Subject: LU-5490 tests: FORTESTONLY
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 821baea4626beb70dbca822f3109d58180db3fd2

Comment by Nathaniel Clark [ 07/Mar/18 ]

This test can fail under DNE because test_133d only checks one MDS.
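
One way to avoid reading the wrong MDS is to query a specific MDT's parameter instead of the mdt.* wildcard on a single node. The sketch below only builds the parameter name. mdt_rename_stats_param is a hypothetical helper, and how the MDT index of the test directory is obtained is left as an assumption; it is not taken from the patch.

```shell
# Hypothetical helper: build the rename_stats parameter name for one MDT.
# Target names follow the fsname-MDTxxxx pattern seen in the log
# (lustre-MDT0000); the index is assumed to be formatted as 4 hex digits.
mdt_rename_stats_param() {
    local fsname=$1 mdt_index=$2
    printf 'mdt.%s-MDT%04x.rename_stats\n' "$fsname" "$mdt_index"
}

mdt_rename_stats_param lustre 0    # prints mdt.lustre-MDT0000.rename_stats
mdt_rename_stats_param lustre 1    # prints mdt.lustre-MDT0001.rename_stats
```

The resulting name could then be passed to lctl get_param on the MDS that hosts that MDT, so the stats read and the renamed directory refer to the same target.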

Comment by Gerrit Updater [ 08/Mar/18 ]

Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/31585
Subject: LU-5490 tests: Sanity/133d ensure stats read is on correct MDT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: fa38f33d74bcd91ecf8768dde3dece69d629c099

Comment by Gerrit Updater [ 15/Mar/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/31585/
Subject: LU-5490 tests: Sanity/133d ensure stats read is on correct MDT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1d80fd72bf42f653a5d6a4a31fc2c5df571f1afc

Comment by Peter Jones [ 15/Mar/18 ]

Landed for 2.11

Comment by Minh Diep [ 25/Apr/18 ]

+1 on b2_10 https://testing.hpdd.intel.com/test_sets/0307fc3a-4887-11e8-960d-52540065bddc

Comment by Gerrit Updater [ 25/Apr/18 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/32149
Subject: LU-5490 tests: Sanity/133d ensure stats read is on correct MDT
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: dcbadac5ad9143ad19e6db8afddcf050253a84ff

Comment by Gerrit Updater [ 03/May/18 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/32149/
Subject: LU-5490 tests: Sanity/133d ensure stats read is on correct MDT
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 4eb9ba01430acf7f16d9a801c8cb6b39b2c75a27
