[LU-5957] Broken layout after swapping layouts with unstriped files Created: 26/Nov/14  Updated: 01/Sep/16  Resolved: 03/Feb/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1, Lustre 2.7.0, Lustre 2.5.3
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Minor
Reporter: Henri Doreau (Inactive) Assignee: Bruno Faccini (Inactive)
Resolution: Fixed Votes: 0
Labels: cea, patch

Attachments: File ebadf.sh    
Severity: 3
Rank (Obsolete): 16639

 Description   

The following sequence of operations eventually fails with EBADF (fstripe is a striped file; fnostripe1 and fnostripe2 are unstriped files):

1) swap_layouts fstripe fnostripe1
2) I/O into fstripe
3) swap_layouts fstripe fnostripe2
4) I/O into fstripe # Fails with EBADF

Please find a reproducer attached. This seems to affect current master as well as older versions that provide layout swap.

The file remains inaccessible, even after a client umount/remount or when accessed from another client.

It seems that lov_conf_set() doesn't call lov_layout_change() in the second case, leading to lov_io_init_empty() being called eventually on the write...



 Comments   
Comment by Henri Doreau (Inactive) [ 26/Nov/14 ]

"unstriped" is probably poorly chosen, I mean files with no LOV EA.

Comment by Bruno Faccini (Inactive) [ 27/Nov/14 ]

I wonder if this could be related to LU-2766, with a new symptom?

Comment by Henri Doreau (Inactive) [ 27/Nov/14 ]

I don't think so. According to our analysis (which just ended), this is a server-side issue: an MDT flag is not cleared after a layout swap that removes the LOV. We will provide a patch.

Comment by Bruno Faccini (Inactive) [ 28/Nov/14 ]

Ah, OK. So I wonder if this could be in mdd_swap_layouts(), after the swap has been reordered to start with the file that has no layout?

Comment by Gerrit Updater [ 28/Nov/14 ]

Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/12877
Subject: LU-5957 mdt: Update MDT flags after layout swap
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: dcebecc7d669e71c45597d07715faed4e49885e9

Comment by Henri Doreau (Inactive) [ 28/Nov/14 ]

Bruno, as you can see in the patch above, it's an MDT issue: an I/O on a non-striped file invokes mdt_create_data, which creates the LOV and sets MOF_LOV_CREATED on the object. If you "destripe" this object again and redo an I/O, this flag will prevent the LOV from being re-created as it should be.
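
The stale-flag mechanism described above can be illustrated with a small toy model. This is only a sketch and not Lustre code: ToyObject, create_data, do_io, swap_layouts and the clear_flag switch are hypothetical names invented for the illustration, the lov_created field merely stands in for MOF_LOV_CREATED, and raising EBADF from the model simply mirrors the reported symptom. The "fixed" branch assumes the fix amounts to clearing the flag on an object left without a LOV; the actual change is in the patch.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyObject:
    """Toy stand-in for an MDT object; not a real Lustre structure."""
    name: str
    lov_ea: Optional[str] = None   # None models "no LOV EA"
    lov_created: bool = False      # models the MOF_LOV_CREATED flag


def create_data(obj: ToyObject) -> None:
    """Models the described behavior of mdt_create_data (assumption)."""
    if obj.lov_created:
        return  # flag claims the LOV already exists, so nothing is created
    if obj.lov_ea is None:
        obj.lov_ea = f"layout-for-{obj.name}"
        obj.lov_created = True


def do_io(obj: ToyObject) -> None:
    """I/O first requests the LOV, then fails if there is still none."""
    create_data(obj)
    if obj.lov_ea is None:
        raise OSError(errno.EBADF, "no layout to do I/O against")


def swap_layouts(a: ToyObject, b: ToyObject, clear_flag: bool) -> None:
    """Swap the layouts; optionally clear the flag on objects left bare."""
    a.lov_ea, b.lov_ea = b.lov_ea, a.lov_ea
    if clear_flag:
        for obj in (a, b):
            if obj.lov_ea is None:
                obj.lov_created = False


def run_sequence(clear_flag: bool) -> Optional[int]:
    """Replay steps 1-4 from the description; return errno or None."""
    fstripe = ToyObject("fstripe", lov_ea="striped-layout")
    fnostripe1 = ToyObject("fnostripe1")
    fnostripe2 = ToyObject("fnostripe2")
    try:
        swap_layouts(fstripe, fnostripe1, clear_flag)  # step 1
        do_io(fstripe)                                 # step 2
        swap_layouts(fstripe, fnostripe2, clear_flag)  # step 3
        do_io(fstripe)                                 # step 4
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    print("without flag clearing:", run_sequence(clear_flag=False))
    print("with flag clearing:   ", run_sequence(clear_flag=True))
```

In this model, the run without flag clearing fails at step 4 because the flag set during step 2 survives the second swap, while the run with flag clearing completes all four steps.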

Comment by Gerrit Updater [ 03/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12877/
Subject: LU-5957 mdt: Update MDT flags after layout swap
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 1d3ee3ed2b56d73f392e1b2a033f7d274f5202d9

Comment by Peter Jones [ 03/Feb/15 ]

Landed for 2.7

Comment by Henri Doreau (Inactive) [ 09/Feb/15 ]

Could this be considered for 2.5?

Comment by Gerrit Updater [ 30/Mar/15 ]

Henri Doreau (henri.doreau@cea.fr) uploaded a new patch: http://review.whamcloud.com/14255
Subject: LU-5957 mdt: Update MDT flags after layout swap
Project: fs/lustre-release
Branch: b2_5
Current Patch Set: 1
Commit: 6032c1e646942169d73ea3763c0399c5d663e498

Comment by Henri Doreau (Inactive) [ 30/Mar/15 ]

Here is a 2.5 backport: http://review.whamcloud.com/14255
