[LU-8586] pios_ssf returning ENOSPC due to mixed OST size. Created: 07/Sep/16  Updated: 17/Dec/16  Resolved: 17/Dec/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.1
Fix Version/s: Lustre 2.10.0

Type: Bug Priority: Major
Reporter: Arshad Hussain Assignee: James Nunez (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Issue Links:
Duplicate
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The test (pios_ssf) fails intermittently because the two OSTs are set up with different sizes by TB. The size difference is significant (~200MB), so the smaller OST fills up much faster during the pios run. Any subsequent write to the FS then returns ENOSPC.

Fail case.

/dev/vdc                                     688976      34196    618444   6% /mnt/ost1
/dev/vdd                                     468680      30108    412640   7% /mnt/ost2

Pass case. OSTs with identical size definitions.

/dev/vdc                                     688976      34196    618444   6% /mnt/ost1
/dev/vdd                                     688976      34196    618444   6% /mnt/ost2
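
The two listings can be compared mechanically. The sketch below assumes the columns follow the usual `df` layout (device, total 1K-blocks, used, available, use%, mount point), which the report does not state explicitly. The helper names `parse_df_line` and `size_gap_kb` are hypothetical.

```python
def parse_df_line(line):
    """Split one df-style line into (device, total_kb, used_kb, avail_kb, mount)."""
    dev, total, used, avail, _pct, mount = line.split()
    return dev, int(total), int(used), int(avail), mount


def size_gap_kb(lines):
    """Difference between the largest and smallest OST total size, in KB."""
    totals = [parse_df_line(line)[1] for line in lines]
    return max(totals) - min(totals)


fail_case = [
    "/dev/vdc  688976  34196  618444  6% /mnt/ost1",
    "/dev/vdd  468680  30108  412640  7% /mnt/ost2",
]
pass_case = [
    "/dev/vdc  688976  34196  618444  6% /mnt/ost1",
    "/dev/vdd  688976  34196  618444  6% /mnt/ost2",
]

for name, case in (("fail", fail_case), ("pass", pass_case)):
    gap = size_gap_kb(case)
    print(f"{name} case: OST size gap = {gap} KB ({gap / 1024:.1f} MB)")
```

For the fail case the gap works out to 220296 KB, in line with the ~200MB difference mentioned in the description; for the pass case it is zero.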


 Comments   
Comment by Gerrit Updater [ 07/Sep/16 ]

Arshad Hussain (arshad.hussain@seagate.com) uploaded a new patch: http://review.whamcloud.com/22343
Subject: LU-8586 test: Fix failure due to mixed OST size.
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2b4310be272eb6dba3dbfd2a2aeaf5e3986a6ac5

Comment by Arshad Hussain [ 09/Sep/16 ]

Update:

For the fail case, the error seen is:

stdout.log
Run	Test	Tstamp		T	N	S	C	Aggregate	Lowest	Highest	Run_time
-----------------------------------------------------------------------------------------------------------
1	Write	1452330509	1	37	8MB	1MB	205MB/s		0.036s	0.041s	1.442s
2	Write	1452330510	8	37	8MB	1MB	191MB/s		0.201s	0.410s	1.545s
Error: IO error while doing pwrite64, aborting.
 sanity-benchmark test_pios_ssf: @@@@@@ FAIL: test_pios_ssf failed with 24 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4672:error()
  = /usr/lib64/lustre/tests/test-framework.sh:4932:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:4968:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:4774:run_test()
  = /usr/lib64/lustre/tests/sanity-benchmark.sh:324:main()
Dumping lctl log to /tmp/test_logs/1451120908/sanity-benchmark.test_pios_ssf.*.1451120914.log
Resetting fail_loc and fail_val on all nodes...done.
FAIL pios_ssf (7s)

The OST setup is:

/dev/vdc                                     688976      34196    618444   6% /mnt/ost1
/dev/vdd                                     468680      30108    412640   7% /mnt/ost2

Comment by Gerrit Updater [ 17/Dec/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/22343/
Subject: LU-8586 test: Fix failure due to mixed OST size.
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6e6a3ac4ef3231463e05935ee68a1508b3a5d8d4

Comment by Peter Jones [ 17/Dec/16 ]

Landed for 2.10

Generated at Sat Feb 10 02:18:51 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.