[LU-3152] test_27z did not consider fid on OST. Created: 10/Apr/13  Updated: 19/Apr/13  Resolved: 19/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Di Wang Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: LB

Severity: 3
Rank (Obsolete): 7642

 Description   

= sanity test 27z: check SEQ/OID on the MDT and OST filesystems == 11:25:05 (1365618305)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0425101 s, 24.7 MB/s
8+0 records in
8+0 records out
8388608 bytes (8.4 MB) copied, 0.317879 s, 26.4 MB/s
check file /mnt/lustre/d0.sanity/d27/f.sanity.27z-1
FID seq 0x4c0000400, oid 0x7aa ver 0x0
LOV seq 0x4c0000400, oid 0x7aa, count: 1
want: stripe:0 ost:0 oid:332/0x14c seq:0x200000400
Stopping /mnt/ost1 (opts on oss01
oss01: find: `/mnt/ost1/O/0x200000400': No such file or directory
Starting ost1: /dev/disk/by-id/scsi-3600d0231000d0143506404614c0b55f4-part1 /mnt/ost1
Started lustre-OST0000
sanity test_27z: @@@@@@ FAIL: : no filter_fid info
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4022:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4045:error()
= /usr/lib64/lustre/tests/sanity.sh:1745:check_seq_oid()
= /usr/lib64/lustre/tests/sanity.sh:1789:test_27z()
= /usr/lib64/lustre/tests/test-framework.sh:4284:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:4317:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4187:run_test()
= /usr/lib64/lustre/tests/sanity.sh:1792:main()
Dumping lctl log to /tmp/test_logs/2013-04-10/104917/sanity.test_27z.*.1365618315.log
c21: Host key verification failed.
c21: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
c21: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
c13: Host key verification failed.
c13: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
c13: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
c15: Host key verification failed.
c15: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
c15: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
c14: Host key verification failed.
c14: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
c14: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
c22: Host key verification failed.
c22: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
c22: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
mds03: Host key verification failed.
mds03: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
mds03: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
oss02: Host key verification failed.
oss02: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
oss02: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
mds02: Host key verification failed.
mds02: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
mds02: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
oss01: Host key verification failed.
oss01: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
oss01: rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
sanity test_27z: @@@@@@ FAIL: test_27z failed with 5
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4022:error_noexit()
= /usr/lib64/lustre/tests/test-framework.sh:4045:error()
= /usr/lib64/lustre/tests/test-framework.sh:4284:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:4317:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:4187:run_test()
= /usr/lib64/lustre/tests/sanity.sh:1792:main()
Dumping lctl log to /tmp/test_logs/2013-04-10/104917/sanity.test_27z.*.1365618328.log



 Comments   
Comment by Li Wei (Inactive) [ 11/Apr/13 ]

I have a fix in http://review.whamcloud.com/5785 that removes the "0x" prefix of the sequence number.
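For illustration only, the mismatch can be sketched as follows. This is not the content of the patch; it assumes, based on the `find` error in the log and the fix description, that the OST directory under `O/` is named by the hex sequence without the "0x" prefix. The helper name `strip_hex_prefix` is hypothetical.

```shell
# Hypothetical helper (not from sanity.sh): drop a leading "0x" from a hex string.
strip_hex_prefix() {
    printf '%s\n' "${1#0x}"
}

seq=0x200000400      # sequence as printed in the "want:" line of the log
ost_mnt=/mnt/ost1    # OST mount point from the log

# Assumption: the on-disk directory name carries no "0x" prefix.
seq_dir="$ost_mnt/O/$(strip_hex_prefix "$seq")"
echo "$seq_dir"
```

With the values from the log this yields /mnt/ost1/O/200000400 rather than the /mnt/ost1/O/0x200000400 path that `find` could not locate.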

Comment by Oleg Drokin [ 11/Apr/13 ]

So how serious is this? Does it only occur when we run in FID-on-OST mode (is that just for DNE)? I don't think we have ever hit this in our regular testing.

Comment by Andreas Dilger [ 11/Apr/13 ]

There is also a separate issue with the output of "lfs getstripe -v", which prints lmm_object_id and lmm_seq (formerly called lmm_object_gr), for a file that was created with Lustre versions 2.1-2.3.56. The 1.8 and current master branches use this consistently, where lmm_oi.oi_fid.f_seq is the sequence (IGIF seq == inode in the 1.8 case).

I submitted http://review.whamcloud.com/6026 to fix that, which should also be ported to b2_1 in case someone is using old clients with 2.4 servers. The lmm_oi will be repaired by LFSCK Phase 2 so that its usage is consistent.

Comment by Andreas Dilger [ 11/Apr/13 ]

Per issues in LU-2888 I'm bumping this to be an HB, since the on-disk format for the LLOG files was accidentally changed, but missed during upgrade testing. Di is working on a patch to fix that, which should be included in the next tag instead of afterward.

Comment by Andreas Dilger [ 12/Apr/13 ]

My patch in http://review.whamcloud.com/6026 will not be needed at all if Di's patch in http://review.whamcloud.com/6037 lands.

Comment by Peter Jones [ 18/Apr/13 ]

Andreas

The latter patch has been abandoned but I now see http://review.whamcloud.com/#change,6022 for this issue - should that land?

Peter

Comment by Peter Jones [ 19/Apr/13 ]

Landed for 2.4

Generated at Sat Feb 10 01:31:25 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.