[LU-3573] lustre-rsync-test test_8: @@@@@@ FAIL: Failure in replication; differences found. Created: 10/Jul/13  Updated: 20/Jul/15  Resolved: 12/Dec/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.1, Lustre 2.7.0, Lustre 2.5.3
Fix Version/s: Lustre 2.7.0, Lustre 2.5.4

Type: Bug Priority: Critical
Reporter: Keith Mannthey (Inactive) Assignee: Nathaniel Clark
Resolution: Fixed Votes: 0
Labels: prz, zfs
Environment:

a patch pushed to autotest


Issue Links:
Duplicate
is duplicated by LU-5765 sanity test_123a test_123b: rm: no su... Resolved
Related
is related to LU-5254 readdir missing a directory Resolved
is related to LU-5924 conf-sanity test_32b: list verificati... Resolved
is related to LU-6101 sanity test_24A: Can not delete direc... Resolved
is related to LU-5475 readdir missing a directory Resolved
is related to LU-3580 Panic in ptlrpc when rerunning lustre... Resolved
Severity: 3
Rank (Obsolete): 9026

 Description   

On ZFS an error was seen with lustre-rsync-test test 8. The logs are here:
https://maloo.whamcloud.com/test_sets/e3492890-e901-11e2-ae91-52540035b04c

I don't really know how to read lrsync_log.

The test reports a very basic error:

== lustre-rsync-test test 8: Replicate multiple file/directory moves == 16:00:59 (1373410859)
CMD: wtm-10vm7 lctl --device lustre-MDT0000 changelog_register -n
lustre-MDT0000: Registered changelog user cl13
CMD: wtm-10vm7 lctl get_param -n mdd.lustre-MDT0000.changelog_users
Lustre filesystem: lustre
MDT device: lustre-MDT0000
Source: /mnt/lustre
Target: /tmp/target
Statuslog: /tmp/lustre_rsync.log
Changelog registration: cl13
Starting changelog record: 0
Clear changelog after use: no
Errors: 0
lustre_rsync took 107 seconds
Changelog records consumed: 1881
Only in /tmp/target/d0.lustre-rsync-test/d8/d08/d083: a3
 lustre-rsync-test test_8: @@@@@@ FAIL: Failure in replication; differences found. 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4066:error_noexit()
  = /usr/lib64/lustre/tests/test-framework.sh:4093:error()
  .....

Out of the last 100 runs, 1 error is reported, so this could be related to the base patch or simply a rare failure.



 Comments   
Comment by Jodi Levi (Inactive) [ 11/Jul/13 ]

Nathaniel,
Could you have a look at this one?
Thank you!

Comment by Andreas Dilger [ 11/Jul/13 ]

It appears that this has been failing for several weeks. I searched for "lustre-rsync-test test8" in Maloo Results->Search->SubTest and found quite a few hits, dating back to 2013-03-25 after LU-2886 was fixed (which totally broke lustre-rsync-test for interop):

https://maloo.whamcloud.com/sub_tests/1d923f9a-959d-11e2-bc9e-52540035b04c
https://maloo.whamcloud.com/sub_tests/b0fef176-99d8-11e2-8bb4-52540035b04c
https://maloo.whamcloud.com/sub_tests/92a06174-b1a7-11e2-802d-52540035b04c
https://maloo.whamcloud.com/sub_tests/7f4534c4-b2b0-11e2-9b6a-52540035b04c
https://maloo.whamcloud.com/sub_tests/8a0326fc-bf47-11e2-88e0-52540035b04c
:
:

The actual file reported changes from failure to failure ("a8", "a9", etc.). The Maloo search also confirmed that none of the recent test_8 failures happened on ldiskfs, only on ZFS.

Comment by nasf (Inactive) [ 27/Dec/13 ]

Another failure instance:

https://maloo.whamcloud.com/test_sets/395e5f0a-6ec4-11e3-a57c-52540035b04c

Comment by Jian Yu [ 13/Mar/14 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_5/41/ (2.5.1 RC3)
Distro/Arch: RHEL6.5/x86_64
FSTYPE=zfs

The same failure occurred:
https://maloo.whamcloud.com/test_sets/f858f01c-aa58-11e3-b4b1-52540035b04c

Comment by Nathaniel Clark [ 03/Apr/14 ]

This bug may be fixed by LU-4838, specifically patch http://review.whamcloud.com/9838

Comment by Peter Jones [ 24/Apr/14 ]

Nathaniel

Could you please prove/disprove this theory by porting the LU-4838 patch to b2_5 and using test parameters to run the test in question?

Thanks

Peter

Comment by Nathaniel Clark [ 24/Jul/14 ]

LU-4838 did not fix this issue.

Comment by Jian Yu [ 23/Sep/14 ]

While testing patch http://review.whamcloud.com/12012 on Lustre b2_5 branch with FSTYPE=zfs, the same failure occurred:
https://testing.hpdd.intel.com/test_sets/983a9d84-431e-11e4-bf48-5254006e85c2

Comment by Nathaniel Clark [ 02/Oct/14 ]

I've been running tests 1-8 repeatedly on my local test cluster but have not reproduced the failure locally, so I cannot debug the system in the failed state. This failure has only happened 4 times out of 121 test runs (zfs on master) in the last week (about 3%).

Comment by Nathaniel Clark [ 03/Oct/14 ]

Finally reproduced:

The file reported as missing from lustre but present in the rsync'd dir is in fact missing from lustre. Remounting the client and then the MDT did not cause the file to reappear, and there is no log of the file being deleted.

This behaviour can be reproduced by just running:

    for i in 1 2 3 4 5 6 7 8 9; do
        mkdir $DIR/$tdir/d$i
        for j in 1 2 3 4 5 6 7 8 9; do
            mkdir $DIR/$tdir/d$i/d$i$j
            createmany -o $DIR/$tdir/d$i/d$i$j/a 10 > /dev/null
            mv $DIR/$tdir/d$i/d$i$j $DIR/$tdir/d$i/d0$i$j
            createmany -o $DIR/$tdir/d$i/d0$i$j/b 10 > /dev/null
            mv $DIR/$tdir/d$i/d0$i$j/a0 $DIR/$tdir/d$i/d0$i$j/c0
        done
        mv $DIR/$tdir/d$i $DIR/$tdir/d0$i
    done

and comparing the created files to what would be expected.

    for i in 1 2 3 4 5 6 7 8 9; do
        for j in 1 2 3 4 5 6 7 8 9; do
            for k in 1 2 3 4 5 6 7 8 9; do
                echo $D/d0$i/d0$i$j/a$k >> $F
            done
            for k in 0 1 2 3 4 5 6 7 8 9; do
                echo $D/d0$i/d0$i$j/b$k >> $F
            done
            echo $D/d0$i/d0$i$j/c0 >> $F
        done
    done

    find $DIR/$tdir -type f | sort | diff $F - || error "Differences Found"
Comment by Jodi Levi (Inactive) [ 05/Oct/14 ]

Nathaniel,
Is the solution to fix this clear or do we need to involve anyone else?

Comment by Nathaniel Clark [ 06/Oct/14 ]

The file actually does exist in the MDT's ZFS filesystem, and can be accessed directly from the client, but ls does not show it in the directory. This reminds me of a "recent" bug, but I'm having trouble locating it.
Update:
Similar bugs are LU-5475 and LU-5254

Comment by Nathaniel Clark [ 16/Oct/14 ]

Status of current testing:
The issue appears to be in the iteration of the directory. All the files are correctly created on the MDT and are visible (as far as I can tell by mounting the MDT as ZFS), but iterating across that directory using zap_cursor_advance and zap_cursor_retrieve somehow misses that one file. No errors are returned when creating the object or adding it to the directory; the object is simply missing during iteration.

Comment by Andreas Dilger [ 16/Oct/14 ]

Nathaniel, does this problem disappear if the client is unmounted/remounted, or is it a function of the directory itself? Is it possible that there are hash collisions in the name entries that are not being handled properly?

Comment by Nathaniel Clark [ 21/Oct/14 ]

Andreas,
It is a function of the directory itself. The client can be unmounted and remounted without any change, and the MDT can be unmounted, exported, imported, and remounted without any change. It feels like it might be an issue in ZFS itself, but I don't have a smoking gun.

Comment by Nathaniel Clark [ 22/Oct/14 ]

Not sure if this is related, but I was trying to see whether the problem was in the file structures on disk vs. something under the hood in ZFS, so I followed the manual's directions for a file-level backup and restore. After restoring, I tried to mount the MDT, resulting in this:

LustreError: 46380:0:(osd_oi.c:232:osd_fld_lookup()) ASSERTION( ss != ((void *)0) ) failed: 
LustreError: 46380:0:(osd_oi.c:232:osd_fld_lookup()) LBUG
Kernel panic - not syncing: LBUG
Pid: 46380, comm: mount.lustre Tainted: P           ---------------    2.6.32-431.20.3.el6_lustre.g5a7c614.x86_64 #1
Call Trace:
 [<ffffffff8152859c>] ? panic+0xa7/0x16f
 [<ffffffffa0665eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa0e9c78f>] ? osd_fld_lookup+0xcf/0xd0 [osd_zfs]
 [<ffffffffa0e9c88e>] ? fid_is_on_ost+0xfe/0x310 [osd_zfs]
 [<ffffffffa0392869>] ? dbuf_rele_and_unlock+0x169/0x1e0 [zfs]
 [<ffffffffa0e9cbac>] ? osd_get_name_n_idx+0x4c/0xe80 [osd_zfs]
 [<ffffffffa03b3f53>] ? dsl_dataset_block_freeable+0x43/0x60 [zfs]
 [<ffffffffa03a8d93>] ? dmu_tx_hold_zap+0x1a3/0x200 [zfs]
 [<ffffffffa0e9dd1a>] ? osd_convert_root_to_new_seq+0x33a/0x670 [osd_zfs]
 [<ffffffffa0e8c4b8>] ? osd_mount+0xbc8/0xf60 [osd_zfs]
 [<ffffffffa0e8ed66>] ? osd_device_alloc+0x2a6/0x3b0 [osd_zfs]
 [<ffffffffa07ebf0f>] ? obd_setup+0x1bf/0x290 [obdclass]
 [<ffffffffa07ec1e8>] ? class_setup+0x208/0x870 [obdclass]
 [<ffffffffa07f438c>] ? class_process_config+0xc5c/0x1ac0 [obdclass]
 [<ffffffffa07f94b5>] ? lustre_cfg_new+0x4f5/0x6f0 [obdclass]
 [<ffffffffa07f9808>] ? do_lcfg+0x158/0x450 [obdclass]
 [<ffffffffa07f9b94>] ? lustre_start_simple+0x94/0x200 [obdclass]
 [<ffffffffa0832df1>] ? server_fill_super+0x1061/0x1720 [obdclass]
 [<ffffffffa07ff6b8>] ? lustre_fill_super+0x1d8/0x550 [obdclass]
 [<ffffffffa07ff4e0>] ? lustre_fill_super+0x0/0x550 [obdclass]
 [<ffffffff8118c01f>] ? get_sb_nodev+0x5f/0xa0
 [<ffffffffa07f72b5>] ? lustre_get_sb+0x25/0x30 [obdclass]
 [<ffffffff8118b67b>] ? vfs_kern_mount+0x7b/0x1b0
 [<ffffffff8118b822>] ? do_kern_mount+0x52/0x130
 [<ffffffff8119e422>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff811ad1fb>] ? do_mount+0x2fb/0x930
 [<ffffffff811ad8c0>] ? sys_mount+0x90/0xe0
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

I had fully and cleanly unmounted everything prior to creating the backup, and I reformatted the ZFS filesystem before restoring.

Update:
This happens when seq is FID_SEQ_ROOT.

Comment by Andreas Dilger [ 24/Oct/14 ]

I don't think file-level backup and restore is supported for ZFS. For ZFS you need to use "zfs send" and "zfs recv", or else the OI files will be broken, and there is currently no OI Scrub functionality for osd-zfs.
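As a rough sketch, the ZFS-level flow described above might look like the following. The pool/dataset/snapshot names (lustre-mdt1/mdt1@backup) and the backup path are hypothetical, and the script only echoes the commands it would run, since a real run requires a formatted MDT pool:

```shell
#!/bin/sh
# Dry-run sketch of a ZFS-level MDT backup/restore using zfs send/recv,
# as opposed to a file-level (tar) backup, which breaks the OI files on
# osd-zfs.  POOL/DATASET/SNAP are hypothetical names.
POOL=lustre-mdt1
DATASET=mdt1
SNAP=backup

run() { echo "$@"; }    # dry run: print the command instead of executing it

# 1. With the MDT unmounted, snapshot the dataset.
run zfs snapshot "$POOL/$DATASET@$SNAP"
# 2. Stream the snapshot to a file (or pipe it to a remote host).
run "zfs send $POOL/$DATASET@$SNAP > /backup/mdt1.zsend"
# 3. Restore by receiving the stream into a fresh dataset.
run "zfs recv $POOL/$DATASET-restore < /backup/mdt1.zsend"
```

A real run would replace the echo in run() with actual execution, and the restored dataset would then be renamed into place before remounting.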

If that isn't clearly documented in the Lustre User Manual's Backup and Restore section, then that is a defect in the manual that should be fixed.

Comment by Nathaniel Clark [ 24/Oct/14 ]

Doing the restore using the LUDOC-161 procedure (zfs send/recv) instead of tar works much better. However, it shows no change after the restore; the file is still missing from the Lustre client.

Comment by Jian Yu [ 25/Oct/14 ]

One more instance on master branch: https://testing.hpdd.intel.com/test_sets/a2161baa-5bea-11e4-a35f-5254006e85c2

Comment by Nathaniel Clark [ 27/Oct/14 ]

The file itself isn't hidden; the name is "cloaked".
In this example d074/a9 has been "lost"; notice that a9 never appears to ls:

[root@lubuilder d074]# ls
a1  a2  a3  a4  a5  a6  a7  a8  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9  c0
[root@lubuilder d074]# mv a9 a9a
[root@lubuilder d074]# ls
a1  a2  a3  a4  a5  a6  a7  a8  a9a  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9  c0
[root@lubuilder d074]# mv a9a a9
[root@lubuilder d074]# ls
a1  a2  a3  a4  a5  a6  a7  a8  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9  c0
[root@lubuilder d074]# mv a9 a9a
[root@lubuilder d074]# ls
a1  a2  a3  a4  a5  a6  a7  a8  a9a  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9  c0
[root@lubuilder d074]# mv a8 a9
[root@lubuilder d074]# ls
a1  a2  a3  a4  a5  a6  a7  a9a  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9  c0
[root@lubuilder d074]# mv a9 a8
[root@lubuilder d074]# ls
a1  a2  a3  a4  a5  a6  a7  a8  a9a  b0  b1  b2  b3  b4  b5  b6  b7  b8  b9  c0
Comment by Andreas Dilger [ 29/Oct/14 ]

Does ZFS have whiteout entries, or is there some problem with the hashing that prevents the directory entry from appearing? I know that "ls" will not show entries that have inode == 0, but the inode number should be independent of the filename, so that wouldn't behave in this manner. Maybe there is a ZFS "hidden" flag that is not being initialized correctly and in some cases this flag is set? Does anything appear with "strace" or with "zdb"?

Comment by Nathaniel Clark [ 30/Oct/14 ]

The inode is not 0; the inode is tied to the file, and this issue is with the name in that directory.
I can delete the file and create a new one, and it will still be hidden. It must be something in how ZFS stores the name; I'm trying to debug that now.
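Following up on the zdb suggestion above, a minimal sketch of inspecting the directory's ZAP from the MDS might look like this. The pool/dataset names are assumptions, and DIRPATH would be the failing directory (e.g. d074) in the mounted MDT dataset; the zdb step is skipped when zdb is not installed:

```shell
#!/bin/sh
# Sketch: dump the ZAP of a suspect directory with zdb and compare its
# name entries against what ls shows.  POOL/DATASET are assumed names.
POOL=lustre-mdt1
DATASET=mdt1
DIRPATH=${DIRPATH:-.}   # the directory holding the "cloaked" name

# On ZFS, the inode number ls reports is the dataset object number.
OBJ=$(ls -id "$DIRPATH" | awk '{print $1}')
echo "object number: $OBJ"

# -dddd dumps the object in detail, including its ZAP (name -> object)
# entries.  A name absent here vs. present here but hidden from ls
# points at different layers of the problem.
command -v zdb >/dev/null 2>&1 && zdb -dddd "$POOL/$DATASET" "$OBJ" || true
```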

Comment by Jian Yu [ 03/Nov/14 ]

One more instance on Lustre b2_5 branch with FSTYPE=zfs: https://testing.hpdd.intel.com/test_sets/5310f46c-61fd-11e4-bd1f-5254006e85c2

Comment by Nathaniel Clark [ 05/Nov/14 ]

http://review.whamcloud.com/12582

Comment by Peter Jones [ 10/Nov/14 ]

Fix landed for 2.7

Comment by Bob Glossman (Inactive) [ 10/Nov/14 ]

in b2_5:
http://review.whamcloud.com/12649

Comment by Andreas Dilger [ 14/Nov/14 ]

Unfortunately, version 2 of the http://review.whamcloud.com/12582 patch failed review-zfs conf-sanity test_32b with a real problem that is now causing serious test failures on master. The fact that it failed on the unlanded patch, when no other tests were failing before that time, makes it very likely that this patch caused the problem. I've opened LU-5924 for that issue, but we might consider reverting this patch if it cannot be addressed quickly.

Comment by Jodi Levi (Inactive) [ 17/Nov/14 ]

Patch landed to Master. b2_5 patch ready to land and tracked outside of this ticket.

Comment by Gerrit Updater [ 01/Dec/14 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/12887
Subject: Revert "LU-3573 osd-zfs: Only advance zap cursor as needed"
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5f6d6b371a18a8b164f0414a5dc7356f066fd2c4

Comment by Gerrit Updater [ 02/Dec/14 ]

Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/12904
Subject: LU-3573 osd-zfs: Only advance zap cursor as needed
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 90a5fa16f900ef24412bad534c848e99e091e44b

Comment by Nathaniel Clark [ 02/Dec/14 ]

This should be a fix for this bug without re-introducing LU-5924
http://review.whamcloud.com/12904

Comment by Nathaniel Clark [ 02/Dec/14 ]

When http://review.whamcloud.com/12887 (the revert) is landed, this bug will be re-opened.

Comment by Gerrit Updater [ 02/Dec/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12887/
Subject: Revert "LU-3573 osd-zfs: Only advance zap cursor as needed"
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 038d747f2ac5971875734701f5319047d6703844

Comment by Gerrit Updater [ 10/Dec/14 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12904/
Subject: LU-3573 osd-zfs: Only advance zap cursor as needed
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0780fde4d415aa1f407ed0094f20f30e6f1b0d3a

Comment by Jodi Levi (Inactive) [ 12/Dec/14 ]

Patches landed to Master.

Comment by Gerrit Updater [ 12/Jan/15 ]

Bob Glossman (bob.glossman@intel.com) uploaded a new patch: http://review.whamcloud.com/13357
Subject: LU-3573 osd-zfs: Only advance zap cursor as needed
Project: fs/lustre-release
Branch: b2_5
Current Patch Set: 1
Commit: 78d6d3aeea75334fa7af3332c2835fcf573a8a7a

Comment by Gerrit Updater [ 27/Jan/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13357/
Subject: LU-3573 osd-zfs: Only advance zap cursor as needed
Project: fs/lustre-release
Branch: b2_5
Current Patch Set:
Commit: ce75d8f368a506993730da4224f4cfc18c2a0260

Generated at Sat Feb 10 01:35:04 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.