[LU-5510] 2.4.3<->2.5.3 interop: sanity-scrub test_15: FAIL: (7) Expected 'inconsistent' on mds1, but got 'inconsistent,upgrade' Created: 20/Aug/14  Updated: 11/Feb/15  Resolved: 08/Feb/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0, Lustre 2.5.3
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Minor
Reporter: Jian Yu Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Lustre client build: https://build.hpdd.intel.com/job/lustre-b2_5/80/
Lustre server build: https://build.hpdd.intel.com/job/lustre-b2_4/73/ (2.4.3)
Distro/Arch: RHEL6.5/x86_64


Severity: 3
Rank (Obsolete): 15367

 Description   

sanity-scrub test 15 failed as follows:

Started LFSCK on the MDT device lustre-MDT0000.
CMD: shadow-46vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
CMD: shadow-46vm3 /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0000.oi_scrub
 sanity-scrub test_15: @@@@@@ FAIL: (7) Expected 'inconsistent' on mds1, but got 'inconsistent,upgrade' 

Maloo report: https://testing.hpdd.intel.com/test_sets/6a6b507e-f8af-11e3-842c-5254006e85c2



 Comments   
Comment by Jian Yu [ 21/Aug/14 ]

Hi Nasf,

Is this a new interop failure related to the patches for LU-4058?

Comment by Jian Yu [ 21/Aug/14 ]

The failure did not occur in interop testing between a Lustre b2_5 build #83 client and a Lustre 2.4.3 server.

Comment by Jian Yu [ 02/Sep/14 ]

Lustre client build: https://build.hpdd.intel.com/job/lustre-b2_5/86/ (2.5.3 RC1)
Lustre server build: https://build.hpdd.intel.com/job/lustre-b2_4/73/ (2.4.3)
Distro/Arch: RHEL6.5/x86_64

The failure occurred again: https://testing.hpdd.intel.com/test_sets/91e037a4-31f4-11e4-8d72-5254006e85c2

Comment by nasf (Inactive) [ 25/Dec/14 ]

Yujian,

I do not think the issue is related to LU-4058. It looks like some objects were missing the LMA EA, so the OI scrub regarded the system as being upgraded from Lustre-1.8. It may not be related to interoperability; instead, it may be reproducible on Lustre-2.4.3 directly. Have you ever run sanity-scrub on pure Lustre-2.4.3 without interoperating with Lustre-2.5.3?
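
For readers unfamiliar with the scrub states: "inconsistent" and "upgrade" are the flags printed by "lctl get_param -n osd-ldiskfs.*.oi_scrub", as shown in the test log above. The sketch below is purely illustrative of the reasoning in this comment, not the actual osd-ldiskfs code -- only the flag names are taken from that output; the inode type and helper are stand-ins -- and it just shows why an object with no LMA EA ends up reported with both flags.

/*
 * Illustrative sketch only -- not the actual osd-ldiskfs scrub code.
 * The flag names mirror the states printed by
 * "lctl get_param -n osd-ldiskfs.*.oi_scrub"; everything else is a stand-in.
 */
#include <stdbool.h>
#include <stdio.h>

enum scrub_flags {
	SF_INCONSISTENT = 1 << 0,	/* OI mapping disagrees with the inode */
	SF_UPGRADE      = 1 << 1,	/* treated as a 1.8 -> 2.x upgrade */
};

struct fake_inode {
	bool has_lma_ea;		/* does the object carry the LMA EA? */
};

static unsigned int scrub_classify(const struct fake_inode *inode)
{
	unsigned int flags = 0;

	if (!inode->has_lma_ea) {
		/*
		 * No LMA EA: the scrub cannot distinguish a damaged object
		 * from one created before the LMA existed, so it assumes an
		 * upgrade from Lustre-1.8 and reports both flags -- the
		 * "inconsistent,upgrade" combination that test_15 hit.
		 */
		flags |= SF_INCONSISTENT | SF_UPGRADE;
	}
	return flags;
}

int main(void)
{
	struct fake_inode no_lma = { .has_lma_ea = false };

	printf("flags = %#x\n", scrub_classify(&no_lma));	/* prints 0x3 */
	return 0;
}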

Comment by Jian Yu [ 25/Dec/14 ]

Have you ever run sanity-scrub on pure Lustre-2.4.3 without interoperating with Lustre-2.5.3?

Yes, and here are the full group test sessions on Lustre 2.4.3:
https://testing.hpdd.intel.com/test_sessions/e75fa67e-ab62-11e3-a696-52540035b04c
https://testing.hpdd.intel.com/test_sessions/3aaccf46-ac5d-11e3-81d7-52540035b04c
https://testing.hpdd.intel.com/test_sessions/ffeb5ee0-abe4-11e3-bcad-52540035b04c

sanity-scrub test passed.

However, I found the following failures on Lustre 2.4.2 full group test sessions:
https://testing.hpdd.intel.com/test_sets/4f889346-6a5e-11e3-81c0-52540035b04c
https://testing.hpdd.intel.com/test_sets/fc56303c-6c58-11e3-92d0-52540035b04c

It looks like the failure already existed on Lustre b2_4 and occurred sporadically.

Comment by Gerrit Updater [ 26/Dec/14 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/13187
Subject: LU-5510 scrub: ldiskfs_create_inode returns locked inode
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cac7f444551d7d495d5dbc744c9114d1bdd77fbb

Comment by Gerrit Updater [ 08/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13187/
Subject: LU-5510 scrub: ldiskfs_create_inode returns locked inode
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3c357081f5c0af79d76e2f556c14ca74ca47cf3b

Comment by Peter Jones [ 08/Feb/15 ]

Landed for 2.7

Comment by Andreas Dilger [ 11/Feb/15 ]

I saw that a few people other than myself hit the problem:

LustreError: 19224:0:(osd_handler.c:2334:__osd_object_create()) ASSERTION( obj->oo_inode->i_state & 8 ) failed

that was added as part of this patch. Just adding a comment here to make it clear that you need to run "make -C ldiskfs clean" to rebuild the ldiskfs module with the new patch. It seems there is something wrong with the ldiskfs build dependencies that prevents it from automatically detecting that the source patches have changed since they were applied.

I was thinking of disabling this LASSERT() and just handling the case of a returned locked inode, but there isn't really any reason to be compatible with different versions of ldiskfs, since it is always included as part of the Lustre modules (AFAIK only LLNL made their own ldiskfs modules with their kernel and have since moved to ZFS anyway).
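
For reference, the "i_state & 8" in the assertion corresponds to I_NEW on the RHEL6 (2.6.32) kernel, i.e. the inode handed back by ldiskfs_create_inode() is still in the new/locked state. A minimal kernel-side sketch of the "handle it instead of asserting" alternative considered (and decided against) above might look like the following; the function name is a placeholder, not the actual osd code.

#include <linux/fs.h>

/*
 * Sketch only: instead of LASSERT(inode->i_state & I_NEW), tolerate both
 * behaviours of ldiskfs_create_inode() -- an inode returned still locked
 * (I_NEW set, value 8 on 2.6.32, matching the assertion above) and one
 * returned already unlocked.
 */
static void osd_settle_new_inode(struct inode *inode)
{
	if (inode->i_state & I_NEW)
		unlock_new_inode(inode);
	/* otherwise the inode is already hashed and unlocked; nothing to do */
}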
