[LU-4167] Interop 2.4.1<->2.5 failure on test suite conf-sanity test_32d: unknown param max_dirty_mb Created: 28/Oct/13  Updated: 11/Feb/15  Resolved: 03/Jun/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.5.1
Fix Version/s: Lustre 2.6.0, Lustre 2.7.0, Lustre 2.5.3

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Emoly Liu
Resolution: Fixed Votes: 0
Labels: None
Environment:

server: 2.4.1 RHEL6 ldiskfs
client: lustre-b2_5 build #2 RHEL6 ldiskfs


Severity: 3
Rank (Obsolete): 11296

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/75f79ad4-3eb9-11e3-a21b-52540035b04c.

The sub-test test_32d failed with the following error:

test_32d failed with 1

MDS console:

23:22:48:Lustre: 5409:0:(obd_mount.c:837:lustre_check_exclusion()) Excluding t32fs-OST0000-osc (on exclusion list)
23:22:48:LustreError: 5409:0:(obd_config.c:1303:class_process_proc_param()) t32fs-OST0000-osc: unknown param max_dirty_mb=15
23:22:48:Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param t32fs-MDT0000.lov.stripesize=4M



 Comments   
Comment by Andreas Dilger [ 29/Oct/13 ]

It isn't possible to run LFSCK on an OST in Lustre 2.4. This test needs to be changed to skip running LFSCK for versions older than 2.4.93 or so.
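
The version gate suggested above could look roughly like the sketch below. It is a minimal illustration, not the landed patch: `ver_code` and `ost_supports_lfsck` are hypothetical helper names, the packing of major/minor/patch into one integer is an assumption made for easy comparison, and 2.4.93 is only the approximate threshold named in the comment.

```shell
#!/bin/sh
# Hypothetical helper: pack "major.minor.patch" into one comparable integer.
ver_code() {
    # Split on dots in a subshell so IFS and the positional
    # parameters of the caller are left untouched.
    (
        IFS=.
        set -- $1
        echo $(( ($1 << 16) | ($2 << 8) | ${3:-0} ))
    )
}

# Hypothetical helper: succeed when the given OST version is at or
# above the assumed LFSCK threshold (2.4.93, "or so" per the comment).
ost_supports_lfsck() {
    [ "$(ver_code "$1")" -ge "$(ver_code 2.4.93)" ]
}

for v in 2.4.1 2.4.3 2.5.0; do
    if ost_supports_lfsck "$v"; then
        echo "$v: run LFSCK on the OST"
    else
        echo "$v: skip LFSCK on the OST"
    fi
done
```

With the assumed threshold, 2.4.1 and 2.4.3 are skipped and 2.5.0 is not.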

Comment by Emoly Liu [ 01/Nov/13 ]

http://review.whamcloud.com/8132

Comment by Jian Yu [ 05/Jan/14 ]

Lustre client build: http://build.whamcloud.com/job/lustre-b2_5/5/
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/70/ (2.4.2)

The same failure occurred:
https://maloo.whamcloud.com/test_sets/73e57c7e-74f7-11e3-95ae-52540035b04c

Comment by Bob Glossman (Inactive) [ 14/Feb/14 ]

Seen again without interop:
https://maloo.whamcloud.com/test_sets/e694ab82-9545-11e3-80d2-52540035b04c

Comment by Jian Yu [ 07/Mar/14 ]

Lustre client build: http://build.whamcloud.com/job/lustre-b2_5/39/ (2.5.1 RC1)
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/70/ (2.4.2)

The same failure occurred:
https://maloo.whamcloud.com/test_sets/416da544-a54b-11e3-9fee-52540035b04c

Comment by Jodi Levi (Inactive) [ 03/Jun/14 ]

Patch landed to Master.

Comment by Jian Yu [ 05/Jun/14 ]

Lustre client build: http://build.whamcloud.com/job/lustre-b2_5/61/
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/73/ (2.4.3)

The same failure occurred: https://maloo.whamcloud.com/test_sets/7e385b20-ead5-11e3-966a-52540035b04c

Comment by Jian Yu [ 21/Aug/14 ]

Lustre client build: https://build.hpdd.intel.com/job/lustre-b2_5/80/
Lustre server build: http://build.whamcloud.com/job/lustre-b2_4/73/ (2.4.3)

The same failure still occurred:
https://testing.hpdd.intel.com/test_sets/d0ffc616-266e-11e4-8ee8-5254006e85c2

Lustre b2_5 build #80 contains patch http://review.whamcloud.com/8132.

Hi Emoly, could you please take a look at this issue? Thanks!

Comment by Emoly Liu [ 21/Aug/14 ]

Hi Fanyong, could you please help with this one? Thanks.
I found that this error occurred even during the patch set 8 Maloo test in http://review.whamcloud.com/8132.

"Patch Set 8:
It is very strange failure:
00:08:45:Lustre: DEBUG MARKER: /usr/sbin/lctl lfsck_start -M t32fs-OST0000
00:08:45:LustreError: 2706:0:(ofd_obd.c:1568:ofd_iocontrol()) t32fs-OST0000: not supported cmd = -1073191194
00:09:26:Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_32d: @@@@@@ FAIL: Start OI scrub on OST0
This means ofd_iocontrol() cannot recognise the @cmd OBD_IOC_START_LFSCK (-1073191194).
So unless the test source code is very old (such as 2.4), the 2.6 candidate should have processed it properly. I want to check the built source code (#20521), but I cannot find it..."

Comment by nasf (Inactive) [ 21/Aug/14 ]

You can run LFSCK directly on the specified server build to check whether that version supports OI scrub on the OST:

lctl lfsck_start -M ${fsname}-OST0000

Comment by Emoly Liu [ 21/Aug/14 ]

Thanks, Fanyong. I will try it.

Comment by Emoly Liu [ 21/Aug/14 ]

I ran the "lfsck_start" command on a 2.4.3 server, as Fanyong suggested. It reported the same error:

[root@onyx-25 ~]# cat /proc/fs/lustre/version
lustre: 2.4.3
kernel: patchless_client
build:  2.4.3-RC1--PRISTINE-2.6.32-358.23.2.el6_lustre.x86_64
[root@onyx-25 ~]# lctl lfsck_start -M lustre-OST0000
Fail to start LFSCK: Inappropriate ioctl for device
[root@onyx-25 ~]# tail /var/log/messages
...
Aug 21 00:30:21 onyx-25 kernel: LustreError: 10157:0:(ofd_obd.c:1568:ofd_iocontrol()) lustre-OST0000: not supported cmd = -1073191194

So I will add an OST version check to that test script to fix the problem.
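
As a rough illustration of where such a check could get its input, the sketch below pulls the version string out of text in the same format as the /proc/fs/lustre/version output shown above. The helper name `lustre_version_from_text` is hypothetical, and the sample text is copied from this comment rather than read from a live server.

```shell
#!/bin/sh
# Sample text copied from the /proc/fs/lustre/version output above.
sample='lustre: 2.4.3
kernel: patchless_client
build:  2.4.3-RC1--PRISTINE-2.6.32-358.23.2.el6_lustre.x86_64'

# Hypothetical helper: print the value of the "lustre:" line read from stdin.
lustre_version_from_text() {
    sed -n 's/^lustre:[[:space:]]*//p'
}

ost_version=$(printf '%s\n' "$sample" | lustre_version_from_text)
echo "OST reports Lustre $ost_version"
```

On a real server the same helper could be fed from the version file instead of the sample variable, assuming the file keeps this "lustre: X.Y.Z" layout.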

Comment by Andreas Dilger [ 21/Aug/14 ]

It would be great to also fix the ofd_iocontrol() message to print the ioctl() CMD argument as hex instead of a signed integer.
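
To show why hex is easier to read here, the sketch below converts the signed value from the log into its unsigned 32-bit form and splits it into fields. The field layout (number, type, size, direction) is an assumption based on the generic Linux ioctl encoding used on x86_64; it is not taken from the Lustre source.

```shell
#!/bin/sh
# Value printed by ofd_iocontrol() in the console log above.
cmd=-1073191194

# Mask to 32 bits so the signed value is shown as the unsigned ioctl number.
ucmd=$(( $cmd & 0xFFFFFFFF ))
printf 'cmd  = 0x%08x\n' "$ucmd"

# Assumed field layout: nr in bits 0-7, type in bits 8-15,
# size in bits 16-29, direction in bits 30-31.
printf 'nr   = %d\n' $(( ucmd & 0xFF ))
printf 'type = 0x%02x\n' $(( (ucmd >> 8) & 0xFF ))
printf 'size = %d\n' $(( (ucmd >> 16) & 0x3FFF ))
printf 'dir  = %d\n' $(( (ucmd >> 30) & 0x3 ))
```

For the logged value this gives 0xc00866e6, which under the assumed layout decodes to number 230, type 0x66 (ASCII 'f'), an 8-byte argument, and a read/write direction. That is much easier to match against an ioctl definition than -1073191194.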

Comment by Emoly Liu [ 22/Aug/14 ]

The patch for master is at http://review.whamcloud.com/11556
The patch for b2_5 is at http://review.whamcloud.com/11574
