[LU-1295] mds-survey doesn't work with one thread Created: 09/Apr/12  Updated: 04/May/12  Resolved: 04/May/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.2.0, Lustre 2.3.0
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Richard Henwood (Inactive) Assignee: Di Wang
Resolution: Cannot Reproduce Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 6419

 Description   

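Running mds-survey with a single thread seems to fail and corrupt the MDT. The first run below completes the create phase but reports ERROR for every later phase; an identical second run then fails immediately in test_mkdir with "File exists".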

# mkfs.lustre --reformat --fsname=survey --mdt --mgs --index=0 /dev/rsXX0

   Permanent disk data:
Target:     survey-MDT0000
Index:      0
Lustre FS:  survey
Mount type: ldiskfs
Flags:      0x65
              (MDT MGS first_time update )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

device size = 430074MB
formatting backing filesystem ldiskfs on /dev/rsXX0
	target name  survey-MDT0000
	4k blocks     110098944
	options        -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F
mkfs_cmd = mke2fs -j -b 4096 -L survey-MDT0000  -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,dir_nlink,huge_file,flex_bg -E lazy_journal_init -F /dev/rsXX0 110098944
Writing CONFIGS/mountdata
bin# mount -t lustre /dev/rsXX0 /mnt
bin# dir_count=1 thrlo=1 thrhi=1 file_count=250000 sh mds-survey
Mon Apr  9 15:15:40 CDT 2012 mds-survey from mds51.ls4.tacc.utexas.edu
mdt 1 file  250000 dir    1 thr    1 create 35157.45 [30996.56,36061.95] lookup             ERROR md_getattr             ERROR setxattr             ERROR destroy             ERROR 
program exited with error 
bin# dir_count=1 thrlo=1 thrhi=1 file_count=250000 sh mds-survey
Mon Apr  9 15:15:53 CDT 2012 mds-survey from mds51.ls4.tacc.utexas.edu
error: test_mkdir: File exists
ERROR: fail test_mkdir
created directories on localhost:survey-MDT0000_ecc failed
program exited with error 
bin# 
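
When triaging output like the first run above, it can help to pull out exactly which phases failed. The following is a minimal sketch, not part of mds-survey itself: `failed_phases` is a hypothetical helper name, and it assumes the summary line keeps the layout shown in this report, where each phase name is immediately followed by either a rate or the literal word ERROR.

```shell
# Hypothetical helper (not shipped with mds-survey): print the name of every
# phase whose result field is the literal word ERROR.
# Assumption: the summary line looks like "<phase> <rate|ERROR> ...", as in
# the transcript in this report.
failed_phases() {
    awk '{ for (i = 2; i <= NF; i++) if ($i == "ERROR") print $(i - 1) }'
}

# Summary line copied from the failing run in this report.
line='mdt 1 file  250000 dir    1 thr    1 create 35157.45 [30996.56,36061.95] lookup             ERROR md_getattr             ERROR setxattr             ERROR destroy             ERROR'

printf '%s\n' "$line" | failed_phases
# prints:
#   lookup
#   md_getattr
#   setxattr
#   destroy
```

A summary line with no ERROR fields produces no output, so an empty result can be treated as "all phases reported a rate".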


 Comments   
Comment by Peter Jones [ 12/Apr/12 ]

Wangdi

Could you please look into this one?

Thanks

Peter

Comment by Di Wang [ 12/Apr/12 ]

Hmm, it works for me:

[root@testnode1 mds-survey]# dir_count=1 thrlo=1 thrhi=1 file_count=2500 sh mds-survey
Thu Apr 12 18:21:24 EDT 2012 mds-survey from testnode1
mdt 1 file 2500 dir 1 thr 1 create 18383.84 [18383.84,18383.84] lookup 139813.21 [139813.21,139813.21] md_getattr 85327.14 [85327.14,85327.14] setxattr 19402.10 [19402.10,19402.10] destroy 15801.98 [15801.98,15801.98]
done!

Richard, did the test include patch 2356 (http://review.whamcloud.com/#change,2356) from LU-1244?

Comment by Richard Henwood (Inactive) [ 04/May/12 ]

I've tried a recent master and I can't reliably reproduce this in my virtual environment.

Generated at Sat Feb 10 01:15:23 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.