Quota enforcement landing (LU-1842)

[LU-1920] Create and attach Test Plan for quota testing
Created: 12/Sep/12  Updated: 20/Jan/14  Resolved: 20/Jan/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Technical task
Priority: Critical
Reporter: Jodi Levi (Inactive)
Assignee: Cliff White (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Project: Orion
Rank (Obsolete): 4014

 Description   

Please write and attach the test plan for how to test this feature.



 Comments   
Comment by Johann Lombardi (Inactive) [ 23/Oct/12 ]

Quota test plan
***************

1. Correctness (both ldiskfs and zfs)
--------------

  • sanity-quota with SLOW=yes
  • online OST addition
  • failover tests (MDT & OST) with quota on and enforced
  • running all acc-sm tests with quota enabled and enforced

2. Upgrade (ldiskfs only)
----------

  • create a filesystem with 1.8/2.1, enable quota, set some quota limits and create some files
  • upgrade to 2.4 (requires running tunefs.lustre --quota and lctl conf_param lustre.quota...)
  • check limits and usage
  • remove / create more files and check behavior

3. Client Interoperability (both ldiskfs and zfs)
--------------------------
2.3 clients are compatible; clients prior to 2.3 are not yet (they lack EINPROGRESS support)

  • run sanity-quota (s-q) with a 2.3 client and 2.4 servers (requires using the s-q version from 2.4)
  • compatibility with older clients to be tested once the EINPROGRESS patches have landed on b2_1 and b1_8.

4. Impact on performance (both ldiskfs and zfs)
------------------------
Run ior and mdtest on hyperion with:

  • fresh filesystem with no quota settings. This should provide us with reference numbers.
  • quota enabled via conf_param. Impact on performance should be null.
  • quota enforcement enabled with a large limit for the user (via setquota). Impact on performance should be close to null.
  • quota enforcement enabled with a limit close to expected usage (should still fit). Impact on performance to be compared with prior lustre release (2.1, 2.2 or 2.3)
  • quota enforcement enabled with a limit smaller than usage (EDQUOT error expected). We should again compare with a prior lustre release.
  • quota disabled via conf_param. Impact on performance should be null.
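
The comparison against the reference numbers can be sketched as follows. This is only an illustrative helper, not part of the test plan or of any Lustre tooling: the function names, the scenario labels and the 5% default tolerance are all assumptions, and the throughput values would come from the actual ior and mdtest runs.

```python
# Hypothetical helper: names, labels and the default tolerance are
# illustrative assumptions, not taken from the test plan.

# The six configurations listed above, in the same order.
SCENARIOS = [
    "no quota settings (reference)",
    "quota enabled via conf_param",
    "enforcement, large limit",
    "enforcement, limit close to expected usage",
    "enforcement, limit smaller than usage",
    "quota disabled via conf_param",
]


def relative_impact(reference, measured):
    """Return the fractional throughput change versus the reference run.

    Negative values mean the measured run was slower than the reference.
    """
    if reference <= 0:
        raise ValueError("reference throughput must be positive")
    return (measured - reference) / reference


def within_tolerance(reference, measured, tolerance=0.05):
    """True when the measured run is at most `tolerance` slower (assumed 5%)."""
    return relative_impact(reference, measured) >= -tolerance
```

The same two functions could be applied to numbers from a prior Lustre release for the scenarios where the plan asks for a cross-release comparison.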

5. DNE support (both ldiskfs and zfs)
---------------

  • sanity-quota with SLOW=yes with multiple MDTs
  • space rebalancing for inodes (never exercised with a single MDT)
  • impact on metadata performance with remote directory (on MDT1) while master still runs on MDT0
  • online MDT addition
Comment by Johann Lombardi (Inactive) [ 23/Oct/12 ]

Niu, could you please comment on the above TP? Thanks in advance.

Comment by Niu Yawei (Inactive) [ 24/Oct/12 ]

The test plan looks comprehensive to me. Thanks.

Comment by Jodi Levi (Inactive) [ 24/Oct/12 ]

Does this need to be stressed at scale? If no: does this have a feature that needs to be enabled or disabled during normal SWL testing?

Comment by Johann Lombardi (Inactive) [ 25/Oct/12 ]

> Does this need to be stressed at scale?

Yes, it would be great to run the performance test on hyperion.

> If no: does this have a feature that needs to be enabled or disabled during normal SWL testing?

I don't think SWL has any special code for quota. That said, we can probably run it on a filesystem with quota enabled.

Comment by Jodi Levi (Inactive) [ 25/Oct/12 ]

Johann,
Can you please add the details for what the performance test should entail? Also, how is Quota enabled on Hyperion?
Thank you!

Comment by Johann Lombardi (Inactive) [ 25/Oct/12 ]

I have updated the document. I think we need to run the same test with a lustre version <2.4 in order to compare.

Comment by Sarah Liu [ 21/Dec/12 ]

For the online OST addition testing, here are the test steps, please comment if it is not enough.

1. set up lustre with one OST (OST-0), enable quota, create a file and use up all the blocks
2. add an OST online to the file system
3. check that quota is enabled on the new OST-1
4. create a new file, make sure it is on the new OST-1, then write to it; EDQUOT is expected
5. delete the first file on OST-0 and write again on OST-1; success is expected
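
The behavior these steps expect can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Lustre implements quota: the class `ToyQuotaFS` and its methods are hypothetical, "use up all the blocks" is read as reaching the user's block limit, and the model simply treats the limit as one global counter shared by all OSTs.

```python
import errno


class ToyQuotaFS:
    """Toy model (hypothetical): one user, one block limit shared by all OSTs."""

    def __init__(self, block_limit):
        self.block_limit = block_limit
        self.osts = {}  # OST name -> {file name -> blocks used}

    def add_ost(self, name):
        # A newly added OST joins the same global accounting.
        self.osts[name] = {}

    def usage(self):
        # Total blocks used by the user across every OST.
        return sum(sum(files.values()) for files in self.osts.values())

    def write(self, ost, filename, blocks):
        # Usage on every OST counts against the single limit.
        if self.usage() + blocks > self.block_limit:
            raise OSError(errno.EDQUOT, "Disk quota exceeded")
        files = self.osts[ost]
        files[filename] = files.get(filename, 0) + blocks

    def delete(self, ost, filename):
        self.osts[ost].pop(filename, None)


fs = ToyQuotaFS(block_limit=100)
fs.add_ost("OST-0")
fs.write("OST-0", "first", 100)      # step 1: use up all the blocks
fs.add_ost("OST-1")                  # step 2: online OST addition
try:
    fs.write("OST-1", "second", 1)   # step 4: write on the new OST
    hit_quota = False
except OSError as exc:
    hit_quota = exc.errno == errno.EDQUOT
fs.delete("OST-0", "first")          # step 5: free the space on OST-0
fs.write("OST-1", "second", 1)       # the same write is now accepted
print(hit_quota, fs.usage())
```

Step 3 (checking that quota is enabled on the new OST) has no counterpart here, since the toy model enables accounting on every OST by construction.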

Comment by Niu Yawei (Inactive) [ 21/Dec/12 ]

Sarah, it looks good to me. Thanks.

Comment by Johann Lombardi (Inactive) [ 01/Feb/13 ]

Have we finally run the whole test plan? No major issues found?

Comment by Jodi Levi (Inactive) [ 01/Feb/13 ]

It appears the tests were run twice on the 2.3.59 tag on both ldiskfs and ZFS and all passed.
Sarah can comment more.

Comment by Johann Lombardi (Inactive) [ 01/Feb/13 ]

That's good news.

Sarah, could you please share with me the performance numbers?

Comment by Sarah Liu [ 04/Feb/13 ]

Hi Johann,

Cliff is working on the performance part (section 4); I don't have permission to access Hyperion. Section 5 (the DNE part) is skipped for now, and the rest has all been tested (by autotest or manually).

Comment by Sarah Liu [ 26/Feb/13 ]

Just to make it clear: before downgrading, "tune2fs -O ^quota" needs to be run on all MDT and OST devices.
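
As a rough sketch of that step, the per-device commands could be generated like this. The function name and the device paths are hypothetical placeholders; the code only builds and prints the command lines, it does not run anything.

```python
def quota_disable_commands(devices):
    """Build one `tune2fs -O ^quota <device>` argv list per target device."""
    return [["tune2fs", "-O", "^quota", dev] for dev in devices]


# Hypothetical MDT and OST device paths, for illustration only.
for argv in quota_disable_commands(["/dev/mdt0", "/dev/ost0", "/dev/ost1"]):
    print(" ".join(argv))
```

Returning argv lists rather than a single shell string keeps the `^` character from being reinterpreted if the commands are later passed to something like `subprocess.run`.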

Comment by Johann Lombardi (Inactive) [ 13/Mar/13 ]

Now that DNE is landed, can we retest quota with DNE?

Comment by Johann Lombardi (Inactive) [ 13/Mar/13 ]

Reassigning to Cliff for feedback, since those tests need to be run on hyperion, I guess.
I am also interested in the performance results.
BTW, have we run all the quota tests with ZFS as suggested in the test plan?

Comment by Sarah Liu [ 13/Mar/13 ]

Hi Johann,

I've run all tests with ZFS except DNE.

Comment by Cliff White (Inactive) [ 20/Jan/14 ]

This is one year old and, I believe, done. Reopen if needed.
