Details

    • Technical task
    • Resolution: Duplicate
    • Blocker
    • Lustre 2.6.0
    • Lustre 2.5.0
    • Patches submitted to autotest
    • 9548

    Description

      from https://maloo.whamcloud.com/test_sets/0afc2c56-fc86-11e2-8ce2-52540035b04c

      This sanity-hsm test 21 seems to be failing a lot right now;
      "wrong block number" is one of the errors seen.

      test_21
          Error: 'wrong block number'
          Failure Rate: 33.00% of last 100 executions [all branches]
      
      == sanity-hsm test 21: Simple release tests == 23:18:20 (1375510700)
      2+0 records in
      2+0 records out
      2097152 bytes (2.1 MB) copied, 0.353933 s, 5.9 MB/s
       sanity-hsm test_21: @@@@@@ FAIL: wrong block number 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:4202:error_noexit()
      

      Activity

            [LU-3700] sanity-hsm test_21 Error: 'wrong block number'

            The OSS is definitely returning a different number of blocks after doing hsm_archive. I checked this on my local setup with wireshark, and OST_GETATTR is replying with a different number of blocks than before doing hsm_archive.

            It may be that the passing tests are the ones hitting the cached value on the client, and the failing ones are the ones getting the value from the OST.

            OST debug log:

            1386629604.515513:0:1344:0:(ldlm_lockd.c:1167:ldlm_handle_enqueue0()) ### server-side enqueue handler START
            1386629604.515525:0:1344:0:(ldlm_lockd.c:1253:ldlm_handle_enqueue0()) ### server-side enqueue handler, new lock created ns: filter-lustre-OST0001_UUID lock: ffff880072cb7740/0xb2c4c9585d128715 lrc: 2/0,0 mode: --/PR res: [0x21aa:0x0:0x0].0 rrc: 2 type: EXT [0->0] (req 0->0) flags: 0x40000000000000 nid: local remote: 0x59a0a522abcc8a58 expref: -99 pid: 1344 timeout: 0 lvb_type: 0
            1386629604.515559:0:1344:0:(client.c:1473:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc ll_ost00_002:lustre-OST0001_UUID:1344:1453973284418480:10.10.16.236@tcp:106
            1386629604.515582:0:1344:0:(client.c:2123:ptlrpc_set_wait()) set ffff880072921ec0 going to sleep for 6 seconds
            1386629604.516784:0:1344:0:(ofd_lvb.c:195:ofd_lvbo_update()) res: [0x21aa:0x0:0x0] updating lvb size: 0 -> 2097152
            1386629604.516789:0:1344:0:(ofd_lvb.c:207:ofd_lvbo_update()) res: [0x21aa:0x0:0x0] updating lvb atime: 0 -> 1386629600
            1386629604.516799:0:1344:0:(ofd_lvb.c:265:ofd_lvbo_update()) res: [0x100000000:0x21aa:0x0] updating lvb blocks from disk: 1 -> 4103
            1386629604.516813:0:1344:0:(client.c:1840:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ll_ost00_002:lustre-OST0001_UUID:1344:1453973284418480:10.10.16.236@tcp:106
            
            utopiabound Nathaniel Clark added a comment
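
            A minimal sketch of how one might verify the cached-vs-OST difference described above, assuming a mounted client at $MOUNT and a hypothetical test file f21 (the cancel_lru_locks helper in test-framework.sh issues the same lru_size=clear underneath):

                # st_blocks here may be served from the client's cached attributes
                stat -c %b $MOUNT/f21
                # drop the client's osc DLM locks so the next stat refetches from the OST
                lctl set_param -n "ldlm.namespaces.*osc*.lru_size=clear"
                # this stat now triggers a fresh OST_GETATTR, as seen in the log above
                stat -c %b $MOUNT/f21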

            Andreas, I am not sure I understand you. Do you mean that on ZFS, to get the correct st_blocks from "stat -c %b" just after make_small(), we need to "cancel_lru_locks osc" in between?

            bfaccini Bruno Faccini (Inactive) added a comment

            Hmm, looking at the test more closely now that we have a test failure, it seems the test itself is defective. The blocks count == 1 should not be true after ARCHIVE (as it was checked before the 8467 patch), only after RELEASE. The failed test reported:

            # error "$f: wrong size after archive: $size != $orig_size"
            
            wrong block number after archive:  4103 != 1
            

            so it is setting "orig_blocks" == 1.

            That the test passes on ldiskfs is likely just a race, because it is a bit faster and the client does not have the actual on-disk blocks count, so it will return "st_blocks=1" to userspace to avoid bugs in tar/rsync that think st_blocks=0 means the file has no data in it. The zfs test is a bit slower to archive, and the client gets the actual st_blocks value back, and (incorrectly) fails this check.

            I suspect that even with make_small using conv=fsync, the client still has the original blocks count cached, so what is needed to fix this bug is to flush the locks on the client before getting orig_blocks:

            cancel_lru_locks osc

            That way, it returns an accurate value for st_blocks for later testing.

            Nathaniel, can you please submit a patch ASAP so we can maybe land this by tomorrow?

            adilger Andreas Dilger added a comment
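
            A minimal sketch of where that lock flush could sit in test_21, using the cancel_lru_locks helper from test-framework.sh (the placement is illustrative, not the landed patch):

                local fid=$(make_small $f)
                # drop the client's cached osc locks so the following stats
                # fetch the real on-disk size/blocks from the OST
                cancel_lru_locks osc
                local orig_size=$(stat -c "%s" $f)
                local orig_blocks=$(stat -c "%b" $f)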

            There was a failure https://maloo.whamcloud.com/test_sets/8d7a2b66-6134-11e3-bd66-52540035b04c post #8467. It is, so far, unique (and seemingly not patch-related).

            utopiabound Nathaniel Clark added a comment

            In case 8467 doesn't fix test_21 for ZFS, here is a patch to EXCEPT it: http://review.whamcloud.com/8503

            utopiabound Nathaniel Clark added a comment
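
            For reference, a sketch of how such an exception is commonly expressed in the Lustre test suites (the ALWAYS_EXCEPT convention; the exact form in patch 8503 may differ):

                # skip test_21 when the servers are ZFS-backed
                if [ "$(facet_fstype ost1)" = "zfs" ]; then
                        ALWAYS_EXCEPT="$ALWAYS_EXCEPT 21"
                fi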

            Nathaniel, can you please submit a separate patch to disable this test for ZFS only? That can be tested, and possibly landed, in parallel with 8467 in case that patch does not give us any relief.

            adilger Andreas Dilger added a comment

            Patch is at http://review.whamcloud.com/8467.
            But I wonder how, even for ZFS, a released file's st_blocks can be anything other than 1 after my patch #7776 forced it as part of LU-3864.

            bfaccini Bruno Faccini (Inactive) added a comment (edited)
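
            For context, the behavior that comment expects, as a minimal illustration with standard lfs/stat commands ($f is a placeholder path):

                $LFS hsm_release $f
                stat -c %b $f    # per LU-3864 / patch #7776, a released file should report 1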

            I'll work up and test a patch per Andreas's comment.

            utopiabound Nathaniel Clark added a comment

            It would make sense to me that "lfs hsm_release" would cause all of the DLM locks to be revoked from the client, so any stat from the client would return st_blocks == 1. This should be visible in the debug logs, if this test is run with at least +dlmtrace enabled.

            I think in the short term it makes sense to fix sanity-hsm.sh test_21 to enable full debug for this test (using debugsave() and debugrestore()), and print the actual block number that is returned. Maybe it is as simple as ZFS returning 2 with an external xattr or something, which might even happen with ldiskfs? Probably it makes sense to also allow some small margin, like 5 blocks or so. This test is also bad because there are two places that print "wrong block number", and it isn't even clear which one is failing.
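
            A minimal sketch of that debug wrapping, assuming the debugsave()/debugrestore() helpers from test-framework.sh (the dump path is illustrative):

                debugsave
                $LCTL set_param debug=-1           # full debug, including +dlmtrace
                # ... run the archive/release steps under test ...
                $LCTL dk > $TMP/test_21_debug.log  # dump the kernel debug buffer
                debugrestore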

            I think it makes sense to submit a patch to change this immediately to the following:

                    local fid=$(make_small $f)
                    local orig_size=$(stat -c "%s" $f)
                    local orig_blocks=$(stat -c "%b" $f)
            
                    check_hsm_flags $f "0x00000000"
                    $LFS hsm_archive $f || error "could not archive file"
                    wait_request_state $fid ARCHIVE SUCCEED
            
                    local blocks=$(stat -c "%b" $f)
                    [ $blocks -eq $orig_blocks ] || error "$f: wrong blocks after archive: $blocks != $orig_blocks"
                    local size=$(stat -c "%s" $f)
                    [ $size -eq $orig_size ] || error "$f: wrong size after archive: $size != $orig_size"
            
                    # Release and check states
                    $LFS hsm_release $f || error "$f: could not release file"
                    check_hsm_flags $f "0x0000000d"
            
                    blocks=$(stat -c "%b" $f)
                    [ $blocks -le 5 ] || error "$f: too many blocks after release: $blocks > 5"
                    size=$(stat -c "%s" $f)
                    [ $size -eq $orig_size ] || error "$f: wrong size after release: $size != $orig_size"
            

            Maybe this will allow ZFS to pass, but even if it doesn't, we will have more information to debug the problem.

            adilger Andreas Dilger added a comment

            Could it be that there is some timing delay, more likely with ZFS, between hsm_release and st_blocks becoming 1?
            I will set up a ZFS platform and run sanity-hsm/test_21 in a loop to reproduce.

            bfaccini Bruno Faccini (Inactive) added a comment
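
            A sketch of such a reproduction loop, assuming a configured Lustre test environment and the ONLY= convention used by the test scripts:

                # run only test_21 of sanity-hsm repeatedly until it fails
                for i in $(seq 1 50); do
                        ONLY=21 bash sanity-hsm.sh || break
                done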

            Failed again on ZFS: https://maloo.whamcloud.com/test_sessions/58a20a92-5759-11e3-8d5c-52540035b04c

            adilger Andreas Dilger added a comment

            People

              bfaccini Bruno Faccini (Inactive)
              keith Keith Mannthey (Inactive)
              Votes: 0
              Watchers: 9
