[LUDOC-29] increase maximum stripe count from 160 to 2000 for wide striping in user manual Created: 15/Dec/11  Updated: 06/Apr/12  Resolved: 06/Apr/12

Status: Closed
Project: Lustre Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Andreas Dilger Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: None

Rank (Obsolete): 7181

 Description   

Once the wide striping feature has landed, the manual needs to be updated to indicate that the maximum number of stripes is 2000 instead of 160. The maximum file size is now 2000 * 16TB = 2.5PB instead of 160 * 2TB = 320GB. These limits are explicitly documented in SettingUpLustreSystem.xml, but the 160 stripe count limit is likely listed in several other places in the manual, and possibly also in the lfs.1 man page, code examples, etc.



 Comments   
Comment by Jian Yu [ 09/Jan/12 ]

The maximum file size is now 2000 * 16TB = 2.5PB instead of 160 * 2TB = 320GB.

This should be "2000 * 16TB = 31.25PB instead of 160 * 2TB = 320TB".
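
The corrected figures can be checked with a little arithmetic. The sketch below assumes binary units (1 PB = 1024 TB); the function names are made up for illustration and are not part of any Lustre tool.

```shell
#!/bin/bash
# Hypothetical helper: maximum file size in TB for a given stripe count
# and per-object size limit (also given in TB).
max_file_size_tb() {
    local stripe_count=$1
    local object_size_tb=$2
    echo $(( stripe_count * object_size_tb ))
}

# Hypothetical helper: convert TB to PB, assuming 1 PB = 1024 TB.
tb_to_pb() {
    awk -v tb="$1" 'BEGIN { printf "%.2f\n", tb / 1024 }'
}

new_tb=$(max_file_size_tb 2000 16)   # wide striping limit
old_tb=$(max_file_size_tb 160 2)     # previous limit
echo "new limit: ${new_tb} TB = $(tb_to_pb "$new_tb") PB"
echo "old limit: ${old_tb} TB"
```

With these inputs the script reports 32000 TB (31.25 PB) for the new limit and 320 TB for the old one.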

Comment by Jian Yu [ 09/Jan/12 ]

Patch is in: http://review.whamcloud.com/1929.

Comment by Andreas Dilger [ 18/Jan/12 ]

I just realized that there needs to be additional changes to the documentation for wide striping. In particular, the process to upgrade from an existing filesystem to enable wide striping on the MDT needs to be documented. This is basically "tune2fs -O large_xattr /dev/mdtdev". Once this feature is enabled and in use on the MDT, it will not be possible to directly downgrade the MDT filesystem to an earlier version of Lustre that does not support wide striping. The only way to disable it would be to delete all of the files with large xattrs (or use "lfs_migrate" with a stripe_count = 160 to restripe them), then unmount the MDT, and then run "tune2fs -O ^large_xattr /dev/mdtdev" to turn off this filesystem feature.
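
The two tune2fs steps described here can be sketched as a dry-run script that only prints the commands. /dev/mdtdev is the placeholder device name from this comment; the MDT_DEV variable and the function names are hypothetical.

```shell
#!/bin/bash
# Dry-run sketch: the commands are printed, not executed.
# MDT_DEV defaults to the placeholder device name used in the comment.
MDT_DEV=${MDT_DEV:-/dev/mdtdev}

# Enable the large_xattr feature so files can have more than 160 stripes.
enable_wide_striping_cmd() {
    echo "tune2fs -O large_xattr $MDT_DEV"
}

# Disable the feature again. Per the comment, this only applies once no
# file with a large xattr remains (deleted, or migrated to a stripe count
# of 160) and the MDT has been unmounted.
disable_wide_striping_cmd() {
    echo "tune2fs -O ^large_xattr $MDT_DEV"
}

enable_wide_striping_cmd
disable_wide_striping_cmd
```

Printing rather than running the commands keeps the sketch safe to try on any machine; an administrator would review the output before running it against a real MDT device.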

Comment by Jian Yu [ 19/Jan/12 ]

I just realized that there needs to be additional changes to the documentation for wide striping.

Thanks for the comments. I'm going to do the experiment first and then update the manual accordingly.

Comment by Andreas Dilger [ 20/Jan/12 ]

James, I'm adding you here for review, and in case you want to contribute to the manual updates for the wide striping feature.

Comment by Jian Yu [ 21/Feb/12 ]

Hello Andreas,

I just realized that there needs to be additional changes to the documentation for wide striping. In particular, the process to upgrade from an existing filesystem to enable wide striping on the MDT needs to be documented.

I found that the current UpgradingLustre.xml section refers only to Lustre 2.0 and that there is no section on downgrading Lustre. Should I wait until the port of http://review.whamcloud.com/#change,2078 has landed on master and then add the above upgrade/downgrade notes to the manual?

Comment by Andreas Dilger [ 22/Feb/12 ]

Yes, it makes sense to split the documentation update into two parts. Please make a separate patch to document the upgrade process (tune2fs) based on change 2078. You don't need to wait until it lands before starting to update that documentation.

Comment by Jian Yu [ 29/Feb/12 ]

Hello Andreas,

use "lfs_migrate" with a stripe_count = 160

I found that "lfs_migrate" has no option to specify a new stripe count for the file. What it can do is:

1) with "-R" option to restripe the file using default directory striping
2) without "-R" option to stripe the new file with the same stripe count and size as the old file:

    #......
        UNLINK="-u"
        COUNT=$($LFS getstripe -c "$OLDNAME" 2> /dev/null)
        SIZE=$($LFS getstripe -s "$OLDNAME" 2> /dev/null)
    #......
    [ "$UNLINK" ] && $LFS setstripe -c${COUNT} -s${SIZE} "$NEWNAME"
    #......

So, it seems that to migrate a file with a new stripe count, we have to set the new stripe count on the directory which contains the file, and then run "lfs_migrate -R -y $dir/$file" as follows:

# lfs getstripe -d /mnt/lustre/dir
stripe_count:   400 stripe_size:    1048576 stripe_offset:  -1 
# lfs getstripe -i -c -s /mnt/lustre/dir/file
lmm_stripe_count:   400
lmm_stripe_size:    1048576
lmm_stripe_offset:  656

# lfs setstripe -c 160 /mnt/lustre/dir
# lfs getstripe -d /mnt/lustre/dir
stripe_count:   160 stripe_size:    1048576 stripe_offset:  -1 
# lfs getstripe -i -c -s /mnt/lustre/dir/file
lmm_stripe_count:   400
lmm_stripe_size:    1048576
lmm_stripe_offset:  656

# lfs_migrate -R -y /mnt/lustre/dir/file
/mnt/lustre/dir/file: done
# lfs getstripe -d /mnt/lustre/dir
stripe_count:   160 stripe_size:    1048576 stripe_offset:  -1 
# lfs getstripe -i -c -s /mnt/lustre/dir/file
lmm_stripe_count:   160
lmm_stripe_size:    1048576
lmm_stripe_offset:  1

Do I understand correctly? Should we add an option to lfs_migrate to support migrating a file directly with a new stripe count?
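
As a rough illustration of what such an option could look like in a shell script, the sketch below scans an argument list for a stripe count and falls back to the file's existing count when none is given. It is not taken from lfs_migrate; both function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch, not code from lfs_migrate: scan an argument list for
# "-c N" or "--stripe-count N" and print N (prints an empty line if absent).
parse_stripe_count() {
    local count=""
    while [ $# -gt 0 ]; do
        case "$1" in
        -c|--stripe-count)
            # The option needs a value; fail rather than loop forever.
            [ $# -ge 2 ] || return 1
            count=$2
            shift 2
            ;;
        *)
            shift
            ;;
        esac
    done
    printf '%s\n' "$count"
}

# Hypothetical helper: use the requested count if one was given,
# otherwise keep the file's existing stripe count.
effective_stripe_count() {
    local requested=$1
    local existing=$2
    printf '%s\n' "${requested:-$existing}"
}

requested=$(parse_stripe_count -c 100 -y /mnt/lustre/dir/file)
effective_stripe_count "$requested" 400
```

Keeping the fallback in a separate helper means the default behaviour (reuse the old stripe count) stays unchanged when no count is passed.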

Comment by Andreas Dilger [ 29/Feb/12 ]

Yes, this would be reasonable to do, and not very difficult.

It might also be useful to make "lfs_migrate" a bit more intelligent and restripe all "wide" files to be "-1" again (i.e. restripe to the number of OSTs), where "wide" is the previous stripe count, or 160. This would avoid the problem of specifying a stripe_count = 1000 and then this affects all files being migrated, even those with a stripe_count = 1.

Comment by Jian Yu [ 02/Mar/12 ]

It might also be useful to make "lfs_migrate" a bit more intelligent and restripe all "wide" files to be "-1" again (i.e. restripe to the number of OSTs), where "wide" is the previous stripe count, or 160.

I'm not sure I understand this correctly. If "lfs_migrate" restriped all "wide" files to be "-1", then specifying a new stripe count would not work.

Suppose the number of OSTs is 400 and the previous stripe count of a "wide" file is also 400. Now we want to migrate and restripe the "wide" file with a new stripe count of 100. If "lfs_migrate" intelligently restriped the "wide" file to "-1", its stripe count would still be 400, not 100. So this seems incorrect. Am I misunderstanding?

Comment by Andreas Dilger [ 02/Mar/12 ]

It would be uncommon to restripe from more OSTs to fewer OSTs. I was thinking more about the case where a filesystem had 100 OSTs, and "wide" files have 100 stripes, then the filesystem is upgraded to have 200 OSTs. "wide" files should be restriped to have 200 stripes if they are being migrated.

Having some intelligence in lfs_migrate would avoid the problem of trying to restripe files with different striping (e.g. "1" and "100"). That said, I don't think that any such feature will be accepted into 2.2, and I'm also wondering whether the "smart" lfs_migrate will make mistakes in some cases (e.g. application is tuned for 96 stripes, but lfs_migrate might restripe it to 128 stripes whether this is desirable or not).

So, let's forget this idea, and look at implementing ONLY the change to allow lfs_migrate to accept a "--stripe-count|-c" parameter to allow it to restripe to a new number of OSTs. It is up to the administrator to find suitable input files for her usage of lfs_migrate.
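
One way an administrator might select suitable input files is to let "lfs find" filter by stripe count and pipe the result to lfs_migrate. This is only a sketch: it assumes the installed release provides the "--stripe-count" filter in "lfs find" and the "-c" option for lfs_migrate proposed above. FS_ROOT, MAX_COUNT and the function name are illustrative, not Lustre settings.

```shell
#!/bin/bash
# Sketch only: nothing runs until the function is called on a Lustre client.
# FS_ROOT and MAX_COUNT are illustrative names with illustrative defaults.
FS_ROOT=${FS_ROOT:-/mnt/lustre}
MAX_COUNT=${MAX_COUNT:-160}

# Find regular files with more than MAX_COUNT stripes and restripe them to
# MAX_COUNT. lfs_migrate reads file names from stdin when none are given
# on the command line.
restripe_wide_files() {
    lfs find "$FS_ROOT" --type f --stripe-count "+$MAX_COUNT" |
        lfs_migrate -y -c "$MAX_COUNT"
}
```

Because only files above the threshold are selected, files with a stripe_count of 1 are left alone, which avoids the problem of one stripe count being applied to every migrated file.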

Comment by Jian Yu [ 02/Mar/12 ]

So, let's forget this idea, and look at implementing ONLY the change to allow lfs_migrate to accept a "--stripe-count|-c" parameter to allow it to restripe to a new number of OSTs.

OK, got it. The patches for lfs_migrate are in http://review.whamcloud.com/2247 and http://review.whamcloud.com/2310.

# lfs getstripe -c -i -s /mnt/lustre/dir/file
lmm_stripe_count:   160
lmm_stripe_size:    1048576
lmm_stripe_offset:  481

# lfs_migrate -c 100 -y /mnt/lustre/dir/file
/mnt/lustre/dir/file: done

# lfs getstripe -c -i -s /mnt/lustre/dir/file
lmm_stripe_count:   100
lmm_stripe_size:    1048576
lmm_stripe_offset:  564

# lfs_migrate -c -1 -y /mnt/lustre/dir/file
/mnt/lustre/dir/file: done

# lfs getstripe -c -i -s /mnt/lustre/dir/file
lmm_stripe_count:   160
lmm_stripe_size:    1048576
lmm_stripe_offset:  627

Comment by Jian Yu [ 13/Mar/12 ]

Please make a separate patch to document the upgrade process (tune2fs) based on change 2078.

Patch is in http://review.whamcloud.com/2295.

Comment by Jian Yu [ 06/Apr/12 ]

The manual change for Lustre 2.2.0 has landed.
The lfs_migrate change for Lustre 2.3.0 has also landed.
The manual change for Lustre 2.3.0 will be handled in LUDOC-54.
Let's close this ticket.
