[LU-8817] min_ost_size function under test-framework incorrect. Created: 09/Nov/16  Updated: 13/Jun/17  Resolved: 13/Jun/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.1
Fix Version/s: Lustre 2.10.0

Type: Bug
Priority: Minor
Reporter: Arshad Hussain
Assignee: WC Triage
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The min_ost_size() function in test-framework.sh is meant to return the minimum kbytesavail among the attached OSTs; its definition is shown below the list. It is used in two places:
1. ./lustre/tests/obdfilter-survey.sh
2. ./lustre/tests/sanity-benchmark.sh

 $LCTL get_param -n osc.*.kbytesavail | sort -n | head -n1

However, it returns the kbytesavail seen by the MDT (OSP), which is a different value.

# lctl get_param osc.*.kbytesavail
osc.lustre-OST0000-osc-MDT0000.kbytesavail=151004 << Reports this value.
osc.lustre-OST0000-osc-ffff880026480400.kbytesavail=151276
osc.lustre-OST0001-osc-MDT0000.kbytesavail=151004
osc.lustre-OST0001-osc-ffff880026480400.kbytesavail=151276
#

It should report the kbytesavail of the OSTs, which also matches the lfs df output.

# lctl get_param osc.*.kbytesavail | grep -v MDT
osc.lustre-OST0000-osc-ffff880026480400.kbytesavail=151276
osc.lustre-OST0001-osc-ffff880026480400.kbytesavail=151276
#
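
The filtering shown above can be folded into a small helper. This is a minimal sketch, not the patch that landed: the function name min_client_kbytesavail is hypothetical, and it reads name=value lines from stdin so it can be tried without a Lustre mount.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not part of test-framework.sh).
# Reads "name=value" lines, as printed by `lctl get_param osc.*.kbytesavail`
# without -n, and prints the smallest value among the client-side OSC entries.
min_client_kbytesavail() {
    # Skip the MDT-side entries (osc.*-osc-MDT*), keep the value after '=',
    # sort numerically and take the first (smallest) line.
    grep -v -- '-osc-MDT' | cut -d= -f2 | sort -n | head -n1
}

# Sample input copied from the single-node output in the description.
sample='osc.lustre-OST0000-osc-MDT0000.kbytesavail=151004
osc.lustre-OST0000-osc-ffff880026480400.kbytesavail=151276
osc.lustre-OST0001-osc-MDT0000.kbytesavail=151004
osc.lustre-OST0001-osc-ffff880026480400.kbytesavail=151276'

printf '%s\n' "$sample" | min_client_kbytesavail   # prints 151276
```

On a mounted client the input would instead come from `lctl get_param osc.*.kbytesavail`.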

Output of 'lfs df' ('Available' field only):

    $ lfs df | grep OST | awk '{print $4}' | sort -n | head -n1
    151276

Output with current code:

    $ lctl get_param -n osc.*.kbytesavail | sort -n | head -n1
    151004
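
The 'lfs df' pipeline above could also be wrapped in a function. This is a sketch under stated assumptions, not necessarily the code that was merged: the name min_ost_avail_from_df is hypothetical, it reads 'lfs df' output from stdin, and the sample rows are taken from the 'lfs df' listing quoted in the comments of this ticket.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption). Reads `lfs df` output from
# stdin and prints the smallest "Available" value among the OST rows.
min_ost_avail_from_df() {
    # OST rows end with "[OST:<index>]"; column 4 is "Available" in 1K blocks.
    awk '/\[OST:/ { print $4 }' | sort -n | head -n1
}

# Sample rows copied from the `lfs df` listing in the comments.
sample='UUID                   1K-blocks        Used   Available Use% Mounted on
scratch-MDT0000_UUID      125368        1868      114140   2% /lustre/scratch[MDT:0]
scratch-OST0000_UUID      350360       15540      307348   5% /lustre/scratch[OST:0]
scratch-OST0001_UUID      350360       13492      309396   4% /lustre/scratch[OST:1]
scratch-OST0002_UUID      350360       13496      309392   4% /lustre/scratch[OST:2]
scratch-OST0003_UUID      350360       13496      309392   4% /lustre/scratch[OST:3]'

printf '%s\n' "$sample" | min_ost_avail_from_df   # prints 307348
```

The MDT row has a smaller 'Available' value (114140) but is ignored, because only rows tagged [OST:...] are considered.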


 Comments   
Comment by Gerrit Updater [ 09/Nov/16 ]

Arshad Hussain (arshad.hussain@seagate.com) uploaded a new patch: http://review.whamcloud.com/23685
Subject: LU-8817 tests: Update 'min_ost_size'
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 43c1df70b063f254edde8a6283a056a9565b78f9

Comment by James Nunez (Inactive) [ 21/Jan/17 ]

Arshad - Is this patch still necessary? I'm running master and I don't see the MDT's available space when I run 'lctl get_param osc.*.kbytesavail' on the client:

# lctl get_param osc.*.kbytesavail
osc.scratch-OST0000-osc-ffff88007acea800.kbytesavail=307348
osc.scratch-OST0001-osc-ffff88007acea800.kbytesavail=309396
osc.scratch-OST0002-osc-ffff88007acea800.kbytesavail=309392
osc.scratch-OST0003-osc-ffff88007acea800.kbytesavail=309392

but there is an MDT:

# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
scratch-MDT0000_UUID      125368        1868      114140   2% /lustre/scratch[MDT:0]
scratch-OST0000_UUID      350360       15540      307348   5% /lustre/scratch[OST:0]
scratch-OST0001_UUID      350360       13492      309396   4% /lustre/scratch[OST:1]
scratch-OST0002_UUID      350360       13496      309392   4% /lustre/scratch[OST:2]
scratch-OST0003_UUID      350360       13496      309392   4% /lustre/scratch[OST:3]

I think your patch is good and works, but I think it's not necessary ... for master. What do you think?

Comment by Arshad Hussain [ 24/Jan/17 ]

Hello James,

Thanks for the review, and apologies for the delayed response.

What I can still see is that on a setup where the MDS and client are on the same node, min_ost_size() (lctl get_param osc.*.kbytesavail) returns the value from the OSP. This does not manifest on a setup with two or more nodes where the client is separate. I believe it is reasonable to update min_ost_size() in test-framework.sh to cover this case as well by switching to "lfs df".

Thanks.

On a single node master these are the results:

# lctl get_param osc.*.kbytesavail
osc.lustre-OST0000-osc-MDT0000.kbytesavail=309124 <<< min_ost_size() Reports this value.
osc.lustre-OST0000-osc-ffff880016c0a800.kbytesavail=309396
osc.lustre-OST0001-osc-MDT0000.kbytesavail=309124
osc.lustre-OST0001-osc-ffff880016c0a800.kbytesavail=309396
#

However, lfs df shows a different value.

# lfs df | grep OST | awk '{print $4}' | sort -un | head -1
309396
#

This is the master version I am using

# git log --format=oneline -5
3b3eeeb08407588c75e47a751a72fad6ac7d78f2 LU-8871 kernel: kernel upgrade [SLES12 SP2 4.4.21-84]
1bb6ce8d8cc4de97d49ab5f1d8a07b60e3dc3639 LU-4423 ptlrpc: use 64-bit times for ptlrpc sec expiry
fb403f8b5f8ba61fe0da28e7f7c5e01776717750 LU-8945 ptlrpc : remove userland usage from ptlrpc
11dfa318972db8c8d4bdc36848a4aee8072557f7 LU-8835 osc: handle 64 bit time properly in osc_cache_too_much
874690977923f9fa984f608e7bf1d6effda04e6b LU-6210 lod: Change positional struct initializers to C99
#

However, on the two-node setup the results match what you pointed out. The mismatch does not manifest here.
Running on the client:

# lctl get_param osc.*.kbytesavail
osc.lustre-OST0000-osc-ffff88003aff3000.kbytesavail=71096
osc.lustre-OST0001-osc-ffff88003aff3000.kbytesavail=71096

lfs df

# lfs df | grep OST | awk '{print $4}' | sort -un | head -1
71096
#
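
The difference between the two setups can be reproduced offline from the outputs quoted in this comment. The following is an illustrative sketch only: the helper names (min_all, min_client_only, compare) are hypothetical, and the inputs are the name=value lines pasted above with the annotation removed.

```shell
#!/bin/sh
# Hypothetical helpers; both read "name=value" lines from stdin.

# Minimum over every osc.* entry, mirroring the original pipeline.
min_all() {
    cut -d= -f2 | sort -n | head -n1
}

# Minimum over client-side entries only (MDT-side entries are skipped).
min_client_only() {
    grep -v -- '-osc-MDT' | cut -d= -f2 | sort -n | head -n1
}

# Print both minima for a labelled sample and say whether they agree.
compare() {
    label=$1
    data=$2
    all=$(printf '%s\n' "$data" | min_all)
    client=$(printf '%s\n' "$data" | min_client_only)
    if [ "$all" = "$client" ]; then
        echo "$label: all=$all client-only=$client (match)"
    else
        echo "$label: all=$all client-only=$client (mismatch)"
    fi
}

# Single-node output (MDS and client on the same node), from this comment.
single_node='osc.lustre-OST0000-osc-MDT0000.kbytesavail=309124
osc.lustre-OST0000-osc-ffff880016c0a800.kbytesavail=309396
osc.lustre-OST0001-osc-MDT0000.kbytesavail=309124
osc.lustre-OST0001-osc-ffff880016c0a800.kbytesavail=309396'

# Two-node output (separate client), from this comment.
two_node='osc.lustre-OST0000-osc-ffff88003aff3000.kbytesavail=71096
osc.lustre-OST0001-osc-ffff88003aff3000.kbytesavail=71096'

compare single-node "$single_node"
compare two-node "$two_node"
```

With these inputs the single-node sample reports a mismatch (309124 versus 309396), while the two-node sample reports 71096 for both.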

Comment by Gerrit Updater [ 13/Jun/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23685/
Subject: LU-8817 tests: Update 'min_ost_size'
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 4eeff96e35c65ba818f604ead2efd66d26241dc0

Comment by Peter Jones [ 13/Jun/17 ]

Landed for 2.10
