[LU-7144] Interop 2.7.0<->master- sanity-scrub test_14: (6) Some entry under /lost+found should be repaired Created: 11/Sep/15  Updated: 02/Sep/16  Resolved: 24/Jul/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.9.0

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: nasf (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Environment:

Client: 2.7.0
Server: lustre-master# 3166 , RHEL 7


Issue Links:
Duplicate
Related
is related to LU-7202 sanity-benchmark: no label for lustre... Resolved
is related to LU-7746 skip test of new functionality on ups... Resolved
is related to LU-6463 Interop 2.5.3<->master ost-pools test... Closed
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/c84cd21a-514d-11e5-9f68-5254006e85c2.

The sub-test test_14 failed with the following error:

(6) Some entry under /lost+found should be repaired

Test log:

Starting ost1:   /dev/lvm-Role_OSS/P1 /mnt/ost1
CMD: shadow-18vm3 mkdir -p /mnt/ost1; mount -t lustre   		                   /dev/lvm-Role_OSS/P1 /mnt/ost1
CMD: shadow-18vm3 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/openmpi/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck\" \"all -lnet -lnd -pinger\" 4 
CMD: shadow-18vm3 e2label /dev/lvm-Role_OSS/P1 2>/dev/null
Started lustre-OST0000
Starting client: shadow-18vm5.shadow.whamcloud.com:  -o user_xattr,flock shadow-18vm12@tcp:/lustre /mnt/lustre
CMD: shadow-18vm5.shadow.whamcloud.com mkdir -p /mnt/lustre
CMD: shadow-18vm5.shadow.whamcloud.com mount -t lustre -o user_xattr,flock shadow-18vm12@tcp:/lustre /mnt/lustre
CMD: shadow-18vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.oi_scrub
/usr/lib64/lustre/tests/sanity-scrub.sh: line 1076: [: -gt: unary operator expected
 sanity-scrub test_14: @@@@@@ FAIL: (6) Some entry under /lost+found should be repaired 

Console:

03:53:27:Lustre: DEBUG MARKER: == sanity-scrub test 14: OI scrub can repair objects under lost+found == 03:53:14 (1441079594)
03:53:27:Lustre: DEBUG MARKER: grep -c /mnt/lustre' ' /proc/mounts
03:53:27:Lustre: DEBUG MARKER: lsof -t /mnt/lustre
03:53:27:Lustre: DEBUG MARKER: umount /mnt/lustre 2>&1
03:53:27:Lustre: DEBUG MARKER: mkdir -p /mnt/lustre
03:53:27:Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock shadow-18vm12@tcp:/lustre /mnt/lustre
03:53:27:LustreError: 11-0: lustre-OST0000-osc-ffff88007952b400: operation ost_connect to node 10.1.4.213@tcp failed: rc = -16
03:53:27:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity-scrub test_14: @@@@@@ FAIL: \(6\) Some entry under \/lost+found should be repaired 
03:53:27:Lustre: DEBUG MARKER: sanity-scrub test_14: @@@@@@ FAIL: (6) Some entry under /lost+found should be repaired
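
The "[: -gt: unary operator expected" message from line 1076 in the test log is what the shell's `[` builtin reports when an unquoted variable expands to nothing in front of `-gt`. The sketch below reproduces that failure mode. It assumes, without confirmation from the log, that the counter parsed out of the oi_scrub output came back empty; `scrub_output`, `get_field`, and the field name `some_counter` are hypothetical stand-ins, not the real sanity-scrub.sh code.

```shell
#!/bin/bash
# Hypothetical stand-in for oi_scrub output from a server whose field
# names differ from what the client-side script expects.
scrub_output() {
	printf 'name: OI_scrub\nstatus: completed\n'
}

# Print the value of the named field; prints nothing if it is absent.
get_field() {
	scrub_output | awk -v key="$1:" '$1 == key { print $2 }'
}

# Fragile: an empty $repaired turns the test into "[ -gt 0 ]", which is
# a syntax error, so the check fails for the wrong reason.
check_fragile() {
	local repaired=$(get_field some_counter)
	[ $repaired -gt 0 ]
}

# More defensive: default the value to 0 and quote the expansion, so a
# missing field is reported as "nothing repaired" instead of a syntax error.
check_defensive() {
	local repaired
	repaired=$(get_field some_counter)
	[ "${repaired:-0}" -gt 0 ]
}
```

With the defensive form the test would still fail against a server that reports the counter under a different name, but the failure would point at the missing value rather than at the shell syntax.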


 Comments   
Comment by Andreas Dilger [ 04/Dec/15 ]

Fan Yong, could you please add a skip for this test when running against a too-old server?

Would it make sense to just skip sanity-scrub and sanity-lfsck entirely when running in interop mode? It seems like there is little benefit to running the client-version sanity-scrub and sanity-lfsck, when they will just be wrong compared to what code is implemented on the server.

Are there any subtests that run as part of sanity-scrub or sanity-lfsck that actually do something on the client that should continue to be tested in interop mode?

Either we run the server-side version of the tests on the client, or we skip these scripts entirely in interop mode and just depend on non-interop testing for sanity-scrub and sanity-lfsck.

Comment by nasf (Inactive) [ 09/Dec/15 ]

Andreas, I agree with you that it is NOT necessary to test sanity-scrub/sanity-lfsck in interoperation mode. I will make a patch to skip them.

Comment by Gerrit Updater [ 09/Dec/15 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/17520
Subject: LU-7144 tests: skip scrub/lfsck test under interoperation
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ba2a970da529c29b75de035adfa855b6a6adf223

Comment by Gerrit Updater [ 13/Dec/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17520/
Subject: LU-7144 tests: skip scrub/lfsck test under interoperation
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: be0c22a64ae1675d4995ab3ae6da75fbd04f9426

Comment by Peter Jones [ 14/Dec/15 ]

Landed for 2.8

Comment by Saurabh Tandan (Inactive) [ 15/Dec/15 ]

Another instance for the following interop config, but these tests ran before the patch landed.
Server: Master, Build# 3266, Tag 2.7.64
Client: 2.5.5, b2_5_fe/62
https://testing.hpdd.intel.com/test_sets/b61e3db8-9fcc-11e5-a33d-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 18/Dec/15 ]

Another instance for EL6.7 Server/EL6.7 Client - ZFS
Master, build# 3270
Failed to run any tests on sanity-benchmark.

no label for lustre-ost5/ost5

https://testing.hpdd.intel.com/test_sets/7f526b00-a275-11e5-bdef-5254006e85c2
Tests ran on : 2015-12-12

Comment by Saurabh Tandan (Inactive) [ 18/Dec/15 ]

Another instance for EL7.1 Server/EL7.1 Client - ZFS
Master, build# 3264
https://testing.hpdd.intel.com/test_sets/2cac0c80-a135-11e5-83b8-5254006e85c2

Comment by nasf (Inactive) [ 19/Dec/15 ]

Another instance for EL6.7 Server/EL6.7 Client - ZFS
Master, build# 3270
Failed to run any tests on sanity-benchmark.
no label for lustre-ost5/ost5
https://testing.hpdd.intel.com/test_sets/7f526b00-a275-11e5-bdef-5254006e85c2
Tests ran on : 2015-12-12

I do not think it is related to the sanity-scrub interoperability test.

Comment by Saurabh Tandan (Inactive) [ 24/Dec/15 ]

Another instance found on:
Server: master, build# 3276 , RHEL 6.7
Client: 2.7.1 , b2_7_fe/34
https://testing.hpdd.intel.com/test_sets/9a5d0066-a592-11e5-a14a-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 08/Feb/16 ]

Encountered same issue for tag 2.7.66, for interop config - EL6.7 Server/2.7.1 Client
master - build# 3316 , b2_7_fe/34
test log:

Starting ost1:   /dev/lvm-Role_OSS/P1 /mnt/ost1
CMD: shadow-3vm7 mkdir -p /mnt/ost1; mount -t lustre   		                   /dev/lvm-Role_OSS/P1 /mnt/ost1
CMD: shadow-3vm7 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/qt-3.3/bin:/usr/lib64/openmpi/bin:/usr/bin:/bin:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh set_default_debug \"vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck\" \"all -lnet -lnd -pinger\" 4 
CMD: shadow-3vm7 e2label /dev/lvm-Role_OSS/P1 2>/dev/null
Started lustre-OST0000
Starting client: shadow-3vm5.shadow.whamcloud.com:  -o user_xattr,flock shadow-3vm12@tcp:/lustre /mnt/lustre
CMD: shadow-3vm5.shadow.whamcloud.com mkdir -p /mnt/lustre
CMD: shadow-3vm5.shadow.whamcloud.com mount -t lustre -o user_xattr,flock shadow-3vm12@tcp:/lustre /mnt/lustre
CMD: shadow-3vm7 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-OST0000.oi_scrub
/usr/lib64/lustre/tests/sanity-scrub.sh: line 1076: [: -gt: unary operator expected
 sanity-scrub test_14: @@@@@@ FAIL: (6) Some entry under /lost+found should be repaired 

Comment by nasf (Inactive) [ 09/Feb/16 ]

We have the patch http://review.whamcloud.com/#/c/17521/ for b2_7_fe.

Comment by Saurabh Tandan (Inactive) [ 10/Feb/16 ]

Another instance found for interop tag 2.7.66 - EL6.7 Server/2.7.1 Client, build# 3316
https://testing.hpdd.intel.com/test_sets/535a0f2e-cc98-11e5-b80c-5254006e85c2

Another instance found for interop tag 2.7.66 - EL6.7 Server/2.5.5 Client, build# 3316
https://testing.hpdd.intel.com/test_sets/ad6dd9b2-cc9f-11e5-963e-5254006e85c2

Another instance found for interop tag 2.7.66 - EL7 Server/2.5.5 Client, build# 3316
https://testing.hpdd.intel.com/test_sets/781e3562-cc46-11e5-901d-5254006e85c2

Comment by Gerrit Updater [ 10/Feb/16 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/18399
Subject: LU-7144 tests: print client/server versions for tests
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5280fe64f9bcdf1587e84883a693c24ba240aefe

Comment by nasf (Inactive) [ 11/Feb/16 ]

Another instance found for interop tag 2.7.66 - EL6.7 Server/2.7.1 Client, build# 3316
https://testing.hpdd.intel.com/test_sets/535a0f2e-cc98-11e5-b80c-5254006e85c2
Another instance found for interop tag 2.7.66 - EL6.7 Server/2.5.5 Client, build# 3316
https://testing.hpdd.intel.com/test_sets/ad6dd9b2-cc9f-11e5-963e-5254006e85c2
Another instance found for interop tag 2.7.66 - EL7 Server/2.5.5 Client, build# 3316
https://testing.hpdd.intel.com/test_sets/781e3562-cc46-11e5-901d-5254006e85c2

The reason is that the test scripts run on the client. Although the related patch (http://review.whamcloud.com/17520/) has landed on master (b2_8), the tested clients are b2_7 or b2_5 based, and the related patches have NOT landed on those branches yet, so we still hit the failure.
