[LU-9842] If you disable xattr cache on client and run sanity 102n it will crash the MDS server Created: 07/Aug/17  Updated: 14/Sep/17  Resolved: 28/Aug/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.0, Lustre 2.11.0
Fix Version/s: Lustre 2.10.1, Lustre 2.11.0

Type: Bug Priority: Critical
Reporter: James A Simmons Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Any lustre client


Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Run the following on a client with the maloo test suite installed:

lctl set_param llite.lustre-*.xattr_cache=0
ONLY="102n" sh ./sanity.sh

You will get the following crash on the MDS server:

2017-08-07T13:55:33.198203-04:00 ninja34.ccs.ornl.gov kernel: Lustre: DEBUG MARKER: == sanity test 102n: silently ignore setxattr on internal trusted xattrs ============================= 13:55:32 (1502128532)
2017-08-07T13:55:33.734878-04:00 ninja34.ccs.ornl.gov kernel: LustreError: 28812:0:(osd_handler.c:3818:osd_xattr_get()) ASSERTION( osd_dev(dt->do_lu.lo_dev)->od_is_ost ) failed:
2017-08-07T13:55:33.734922-04:00 ninja34.ccs.ornl.gov kernel: LustreError: 28812:0:(osd_handler.c:3818:osd_xattr_get()) LBUG
2017-08-07T13:55:33.734942-04:00 ninja34.ccs.ornl.gov kernel: Pid: 28812, comm: mdt01_004
2017-08-07T13:55:33.740968-04:00 ninja34.ccs.ornl.gov kernel: #012Call Trace:
2017-08-07T13:55:33.749351-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa08e081e>] libcfs_call_trace+0x4e/0x60 [libcfs]
2017-08-07T13:55:33.758102-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa08e08ac>] lbug_with_loc+0x4c/0xb0 [libcfs]
2017-08-07T13:55:33.766438-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa0c65184>] osd_xattr_get+0x804/0x840 [osd_ldiskfs]
2017-08-07T13:55:33.775382-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffff813217d2>] ? strlcpy+0x42/0x60
2017-08-07T13:55:33.782529-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa163acbc>] lod_xattr_get+0xdc/0x6a0 [lod]
2017-08-07T13:55:33.790618-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa0c5b82c>] ? osd_read_lock+0x5c/0xe0 [osd_ldiskfs]
2017-08-07T13:55:33.799460-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa16a838b>] mdd_xattr_get+0x10b/0x360 [mdd]
2017-08-07T13:55:33.807619-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa156b1ce>] mdt_getxattr+0xa4e/0x1080 [mdt]
2017-08-07T13:55:33.824972-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa12731e2>] ? lustre_msg_get_transno+0x22/0xf0 [ptlrpc]
2017-08-07T13:55:33.825006-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa15554bc>] mdt_tgt_getxattr+0x1c/0x30 [mdt]
2017-08-07T13:55:33.841968-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa12d797d>] tgt_request_handle+0xa3d/0x1310 [ptlrpc]
2017-08-07T13:55:33.842003-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa12802de>] ptlrpc_server_handle_request+0x28e/0xa30 [ptlrpc]
2017-08-07T13:55:33.851568-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffff810ba628>] ? __wake_up_common+0x58/0x90
2017-08-07T13:55:33.859338-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa1284250>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
2017-08-07T13:55:33.875537-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffffa12837b0>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc]
2017-08-07T13:55:33.875571-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0
2017-08-07T13:55:33.882185-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffff810b0980>] ? kthread+0x0/0xe0
2017-08-07T13:55:33.888871-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffff81697418>] ret_from_fork+0x58/0x90
2017-08-07T13:55:33.895964-04:00 ninja34.ccs.ornl.gov kernel: [<ffffffff810b0980>] ? kthread+0x0/0xe0



 Comments   
Comment by John Hammond [ 08/Aug/17 ]

It's enough to ask for trusted.fid with the xattr cache disabled. The assertion was added by https://review.whamcloud.com/24882 ("LU-8998 pfl: enhance PFID EA for PFL"), which is in 2.9.56. So I wonder: is this really present in 2.8.0 and 2.9.0?
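
As a rough illustration of the single request described in this comment, the sketch below asks for one extended attribute and reports either the value or the errno name. It is only a sketch: the helper name probe_xattr and its injectable getxattr parameter are hypothetical and not part of the Lustre test suite, os.getxattr is Linux-only, and reading trusted.* attributes normally requires root.

```python
import errno
import os


def probe_xattr(path, name="trusted.fid", getxattr=None):
    """Ask for one extended attribute and report what came back.

    Returns (value, None) on success or (None, errno_name) on failure.
    `getxattr` is a hypothetical injection point so the logic can be
    exercised without a Lustre mount.
    """
    if getxattr is None:
        getxattr = os.getxattr  # Linux-only; looked up lazily
    try:
        return getxattr(path, name), None
    except OSError as exc:
        # Translate the numeric errno into its symbolic name, e.g. "ENODATA".
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
```

Pointing probe_xattr at a file on a Lustre client with llite.*.xattr_cache=0 should issue the getxattr that this ticket is about. On an affected server that request is what trips the assertion, so it should only be tried on a test system.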

Comment by James A Simmons [ 08/Aug/17 ]

Okay I updated the version impact.

Comment by Peter Jones [ 08/Aug/17 ]

Fan Yong

Could you please advise?

Thanks

Peter

Comment by John Hammond [ 08/Aug/17 ]

Fan Yong, it looks like this assertion can be removed and then the MDT will handle this correctly. Do you agree?

Comment by nasf (Inactive) [ 08/Aug/17 ]
"Fan Yong, it looks like this assertion can be removed and then the MDT will handle this correctly. Do you agree?"

Basically yes, but we can make some improvement. XATTR_NAME_FID is an OST-side EA; if someone calls getxattr() for XATTR_NAME_FID on the MDT, we should return ENODATA.
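
The behaviour proposed here can be modelled roughly as follows. This is a toy Python model, not the osd_xattr_get() code and not the merged patch: the function name xattr_get_model, the is_ost flag and the stored dictionary are hypothetical stand-ins, and mapping XATTR_NAME_FID to "trusted.fid" is inferred from the earlier comment in this thread.

```python
import errno

XATTR_NAME_FID = "trusted.fid"  # assumed value, taken from the thread above


def xattr_get_model(name, is_ost, stored):
    """Toy model of the lookup; returns bytes or a negative errno."""
    if name == XATTR_NAME_FID and not is_ost:
        # Proposed behaviour: report "no such attribute" on the MDT
        # instead of asserting that the device is an OST.
        return -errno.ENODATA
    if name not in stored:
        # Ordinary miss: the attribute is simply not set on this object.
        return -errno.ENODATA
    return stored[name]
```

In this model an MDT-side request for the FID EA takes the same path as a request for any attribute that is not set, so the caller sees an ordinary error return rather than an LBUG.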

Comment by Gerrit Updater [ 08/Aug/17 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/28434
Subject: LU-9842 osd: return ENODATA for XATTR_NAME_FID on MDT
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 39977d3b3cd4fd8e295138956e9ec566bb74831a

Comment by Gerrit Updater [ 28/Aug/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28434/
Subject: LU-9842 osd: return ENODATA for XATTR_NAME_FID on MDT
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: eb34cf7695766b39e15704861d6ac3d636042196

Comment by Gerrit Updater [ 28/Aug/17 ]

Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/28761
Subject: LU-9842 osd: return ENODATA for XATTR_NAME_FID on MDT
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: d409f42ed9082e152dc1731a721f38fec5ff2ae6

Comment by Gerrit Updater [ 14/Sep/17 ]

John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/28761/
Subject: LU-9842 osd: return ENODATA for XATTR_NAME_FID on MDT
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: ab22f5265cf0e0471a3a60cb0c4dcda0d4016092
