[LU-1810] Striping of mount point Created: 31/Aug/12 Updated: 21/Sep/12 Resolved: 21/Sep/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.3 |
| Fix Version/s: | None |
| Type: | Task | Priority: | Minor |
| Reporter: | Douglas Allen Cain (Inactive) | Assignee: | Cliff White (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | striping | ||
| Environment: |
rhel6 |
||
| Attachments: |
|
| Rank (Obsolete): | 10236 |
| Description |
|
After running <lfs setstripe -d /data>, we ran the command below to start striping /data:
lfs setstripe -s 0 -c 7 --offset=4
When we execute <lfs getstripe -v /data>, it shows:
/data stripe count: 7 stripe_size: 1048576 stripe_offset: -1
Why is it that when we execute <watch lfs df>, it still shows only ost0 being written to? We have stopped all services that write to this directory and unmounted /data, but after remounting /data and restarting the services, the writes still go only to ost0. To us it seems that the default striping is still in effect even though we removed it before running the <lfs getstripe>. |
| Comments |
| Comment by Peter Jones [ 31/Aug/12 ] |
|
Cliff, could you please help with this one? Thanks, Peter |
| Comment by Cliff White (Inactive) [ 31/Aug/12 ] |
|
How large are the files you are creating? Remember, a change in striping policy only affects files created after the change. |
| Comment by Cain, Douglas CTR (US) [ 04/Sep/12 ] |
|
Cliff, I remember reading that but what I don't understand is when we upgraded to To answer your question, "How large are the files you are creating?" After V/r, |
| Comment by George Jackson (Inactive) [ 04/Sep/12 ] |
|
Cliff, Actually, in reviewing the data being sent to the filesystem, the sizes written are anywhere from 250K to 10M in size. Thanks, |
| Comment by Cliff White (Inactive) [ 04/Sep/12 ] |
|
As you can see from the lfs getstripe output (lmm_stripe_count: 1), the file you examined was created with a stripe count of 1. Striping policy changes only apply to files created after the striping policy was set, because file striping is fixed at file creation time. If you run lfs getstripe on the directory, what is the result? Also, lfs df is not especially useful for instantaneous measurement of performance.
That provides a much more accurate picture. However, in this case, I think the issue is with the stripe settings for the directory, |
| Comment by George Jackson (Inactive) [ 05/Sep/12 ] |
|
Cliff, Even with creating a new file after we run lfs setstripe on a newly created dir, /data/big, we're still unable to stripe. The lfs getstripe reports /data/big The newly created file shows an lmm_stripe_count of 1 as well. Could it be because /data is the mount point of the filesystem and needs to be reformatted to include a stripe count of 7? Also, at what size does the file start striping? In other words, we ran a 'dd if=/dev/zero of=/data/big/test1.file bs=5000000' but even after 2 GB we still didn't see striping occurring. If there is any other info you need please let us know. Thanks, George |
| Comment by Cliff White (Inactive) [ 05/Sep/12 ] |
|
It looks like you have set a fixed stripe_offset on /data/big. That is likely causing your problem. Attach the complete output of lfs getstripe to the bug. You should really leave the rest of the defaults alone and only set stripe_count (-c). Also, there is no need to erase striping (-d option). |
| Comment by Cain, Douglas CTR (US) [ 05/Sep/12 ] |
|
Cliff, Ran the below commands: rm -rf /data/big Then ran: V/r, |
| Comment by Cliff White (Inactive) [ 05/Sep/12 ] |
|
That's quite bizarre.
# mkdir foo
# lfs getstripe foo
foo
stripe_count: 1 stripe_size: 1048576 stripe_offset: -1
# lfs setstripe -c 7 foo
# lfs getstripe foo
foo
stripe_count: 7 stripe_size: 1048576 stripe_offset: -1
# cd foo
# touch bar
# lfs getstripe bar
bar
lmm_stripe_count: 7
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 14
obdidx objid objid group
14 238461 0x3a37d 0
12 238461 0x3a37d 0
15 238429 0x3a35d 0
5 238366 0x3a31e 0
18 238494 0x3a39e 0
19 238366 0x3a31e 0
17 238367 0x3a31f 0
Do you have some mount options set? Did you set any mkfsoptions when creating the filesystem? |
| Comment by Cliff White (Inactive) [ 05/Sep/12 ] |
|
Did you actually create the 'test' file in the data/big directory, or did you 'mv' it there?
# touch bob
# lfs getstripe bob
bob
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 18
obdidx objid objid group
18 238434 0x3a362 0
# mv bob foo
# lfs getstripe foo/bob
foo/bob
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 18
obdidx objid objid group
18 238434 0x3a362 0
A 'mv' does not restripe the file. If I wish to restripe an existing file, this works:
# touch baz
# lfs getstripe baz
baz
lmm_stripe_count: 1
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 33
obdidx objid objid group
33 239920 0x3a930 0
# cp baz foo/baz
# lfs getstripe foo/baz
foo/baz
lmm_stripe_count: 7
lmm_stripe_size: 1048576
lmm_layout_gen: 0
lmm_stripe_offset: 4
obdidx objid objid group
4 240095 0x3a9df 0
7 240031 0x3a99f 0
10 240096 0x3a9e0 0
11 239743 0x3a87f 0
9 239746 0x3a882 0
8 239488 0x3a780 0
14 240160 0x3aa20 0
|
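The copy-based restriping shown in the transcript above can be wrapped in a small script. This is a minimal sketch, not an official Lustre tool: the function names `stripe_count_from_getstripe` and `restripe_by_copy` are hypothetical, and the parser assumes `lfs getstripe` prints an `lmm_stripe_count:` line in the form seen in this ticket.

```shell
# Hypothetical helper: read `lfs getstripe <file>` output on stdin and
# print the value that follows the first "lmm_stripe_count:" label.
stripe_count_from_getstripe() {
    awk '$1 == "lmm_stripe_count:" { print $2; exit }'
}

# Hypothetical helper: copy file $1 into directory $2 so that the new copy
# is created under the directory's default striping, then print the copy's
# stripe count. Unlike mv, cp creates a new file, so the layout is chosen
# at creation time from the target directory.
restripe_by_copy() {
    cp -- "$1" "$2/" || return 1
    lfs getstripe "$2/$(basename -- "$1")" | stripe_count_from_getstripe
}
```

For example, `restripe_by_copy baz foo` would copy `baz` into `foo` and print the stripe count of `foo/baz`, which should match the count set on `foo` if directory striping is working.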
| Comment by Cain, Douglas CTR (US) [ 05/Sep/12 ] |
|
Cliff, We are not getting the same results and to answer your question "Did you rm -rf /data/big Then ran: We are running version 2.1.3 on rhel 6. We formatted the ost file system Thanks, |
| Comment by Cliff White (Inactive) [ 05/Sep/12 ] |
|
Another possibility from our testing: are you certain all the OSTs are online and up? |
| Comment by Cliff White (Inactive) [ 05/Sep/12 ] |
|
I would take a step back at this point. Is the filesystem otherwise healthy? Any other client errors?
1) Verify that all your OSTs are online and accessible to the Lustre MGS.
2) Please attach the contents of 'tune2fs -print <your device>' for both the MGS and MDS (if MGS is separate), where <your device> is replaced by the actual /dev/ path.
3) If stripe_count == 1, each newly created file should go to a different OST. To verify this, try something like:
mkdir st1
lfs setstripe -c 1 st1
for i in a b c d e f g; do touch st1/$i; done
If you then run 'lfs getstripe st1', each file should have a different obdidx value (and a different lmm_stripe_offset). Please verify this on your system. |
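The check in step 3 can be automated by counting how many distinct OST indexes appear in the `lfs getstripe` output. This is a rough sketch under stated assumptions: the function names `obdidx_values` and `distinct_ost_count` are hypothetical, and the parser assumes each object table starts with a header row whose first column is `obdidx`, as in the transcripts earlier in this ticket.

```shell
# Hypothetical helper: read `lfs getstripe <dir>` output on stdin and print
# the OST index (first column) of every object row, one per line.
obdidx_values() {
    awk '
        $1 == "obdidx"              { in_table = 1; next }  # header row opens a table
        in_table && $1 ~ /^[0-9]+$/ { print $1; next }      # object row: column 1 is obdidx
                                    { in_table = 0 }        # any other line closes the table
    '
}

# Hypothetical helper: count the distinct OST indexes seen on stdin.
distinct_ost_count() {
    obdidx_values | sort -un | wc -l | tr -d ' '
}
```

A possible invocation is `lfs getstripe st1 | distinct_ost_count`. With seven single-stripe files on a healthy multi-OST filesystem the result would be expected to be greater than 1; a result of 1 means every file landed on the same OST.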
| Comment by Cain, Douglas CTR (US) [ 06/Sep/12 ] |
|
Yes all osts are on line. |
| Comment by Cain, Douglas CTR (US) [ 06/Sep/12 ] |
|
Cliff, I have verified that all mount points are mounted on our stand alone mgs and After logging into the mds and executing lctl dl, I confirmed that the mds After running your for I in I received the same obdidx value = 0 same After running tune2fs on both mgs and mds it returned: Thank you, |
| Comment by Cliff White (Inactive) [ 06/Sep/12 ] |
|
Douglas, I asked you specifically for the output of 'tunefs.lustre --print <device>' run on all your MGS,MDS and OST devices. Please attach that to the bug. |
| Comment by Cain, Douglas CTR (US) [ 06/Sep/12 ] |
|
Cliff, I am waiting on my boss so I can get permission due to the area that we work V/r, |
| Comment by Cain, Douglas CTR (US) [ 06/Sep/12 ] |
|
Cliff, Still awaiting permission to send you the output file, but I can tell you I have to unmount all mount points; if I do not unmount, then I receive: V/r, |
| Comment by Cliff White (Inactive) [ 06/Sep/12 ] |
|
You seem to be misreading my instructions. The command I asked you to run is "tunefs.lustre --print <device>"; the --print option is crucial. Please attach the full, complete output to this bug. |
| Comment by Cliff White (Inactive) [ 06/Sep/12 ] |
|
And, do not umount the devices. This can happen on a live system. |
| Comment by Cain, Douglas CTR (US) [ 06/Sep/12 ] |
|
Cliff, In this email you asked me in step 2 to run tune2fs. I will run tunefs now. V/r, |
| Comment by Cain, Douglas CTR (US) [ 06/Sep/12 ] |
|
Cliff, I updated the ticket through the website but I have been advised not to send V/r, |
| Comment by Cliff White (Inactive) [ 06/Sep/12 ] |
|
Thank you for explaining this is a classified site. That explains a few things. What you are attempting is a very, very basic part of Lustre. It's worked for a long time for a lot of people. I see absolutely nothing unique or unusual in your setup, based on the data you have provided me. In the absence of error messages from a Lustre client or server, or a full script capture of
|
| Comment by Cliff White (Inactive) [ 06/Sep/12 ] |
|
The important thing is that 'lfs getstripe' should return a list of objects, when you create a striped file. Focus on getting that to work. |
| Comment by Cain, Douglas CTR (US) [ 07/Sep/12 ] |
|
Cliff, You asked if we were receiving any errors. On one of our clients we are (file.c:2196:ll_inode_revalidate_fini()) failure -13 Could you please shed some light on this? V/r, |
| Comment by Cliff White (Inactive) [ 07/Sep/12 ] |
|
The error is EACCES 13 /* Permission denied */. I would need more context from the system log to be able to say more. It's unlikely to have anything to do with the striping issue. |
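Lustre console messages such as "failure -13" report the Linux errno as a negative number. As an illustration only, a small lookup can translate such codes; the function name `errno_name` is hypothetical, and the table covers just a few common values from the standard Linux errno list.

```shell
# Hypothetical helper: translate a kernel-style return code (for example -13)
# into its Linux errno name. Only a handful of common values are listed.
errno_name() {
    case "${1#-}" in    # strip a leading minus sign, if any
        1)  echo "EPERM (Operation not permitted)" ;;
        2)  echo "ENOENT (No such file or directory)" ;;
        5)  echo "EIO (Input/output error)" ;;
        13) echo "EACCES (Permission denied)" ;;
        *)  echo "unknown" ;;
    esac
}
```

Calling `errno_name -13` gives the EACCES entry, matching the reading of the client message above.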
| Comment by Cain, Douglas CTR (US) [ 07/Sep/12 ] |
|
Cliff, Also, on the mds side it shows, an error while communicating with Thanks, |
| Comment by Cliff White (Inactive) [ 07/Sep/12 ] |
|
Again, I would need to see the actual error, and some context. |
| Comment by George Jackson (Inactive) [ 21/Sep/12 ] |
|
Cliff, our striping issue is now resolved. Your comments on this were very helpful but we found some underlying issues with our initial configuration related to the indexes of the OSTs. We ended up reformatting all mgt, mdt, and ost using the correct indexing and are now able to stripe. Thanks again for your help, you may close this issue as resolved. |
| Comment by Peter Jones [ 21/Sep/12 ] |
|
Thanks for letting us know George! |