[LU-1043] mds_getxattr operation failed with -95 Created: 26/Jan/12 Updated: 22/Jun/16 Resolved: 22/Jun/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Christopher Morrone | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | llnl | ||
| Environment: |
1.8.5-6chaos clients, 2.1.0-17chaos servers |
||
| Severity: | 1 |
| Rank (Obsolete): | 4019 |
| Description |
|
After upgrading a 1.8 server cluster to 2.1, we are seeing this on 1.8 clients:

LustreError: 11-0: lsd-MDT0000-mdc-ffff8101b737d400: An error occurred while communicating with NID 172.16.64.141@tcp; the mds_getxattr operation failed with -95

This is associated with getting "No such file or directory" when we stat some files that existed before the 1.8->2.1 upgrade. Not all files have this problem. On the lustre-discuss list, there are hints that ORNL-3 may have info about this, but that ticket is closed to us, so we don't really know for certain. |
| Comments |
| Comment by Christopher Morrone [ 26/Jan/12 ] |
|
That may have been a little too terse. There ARE files there, but the affected ones are not accessible on the 1.8 clients:

$ ls -l
total 0
?--------- ? ? ? ? ? somefile |
| Comment by Peter Jones [ 26/Jan/12 ] |
|
Oleg, could you please comment on this one? Peter |
| Comment by Oleg Drokin [ 26/Jan/12 ] |
|
Hm, I remember seeing something like that in ORNL testing 2.x (even though ORNL-3 does not mention this particular message). |
| Comment by Christopher Morrone [ 26/Jan/12 ] |
|
No apparent change with the patch from ORNL-3 applied. Note that 2.1 clients can stat the same problem files just fine. It is only the 1.8 clients that have a problem, so it would appear to be a 1.8->2.1 interaction problem. |
| Comment by Christopher Morrone [ 27/Jan/12 ] |
|
The title of this ticket might need to change at some point. At the moment I am tracing just a simple command-line "stat" of the problem file. This also fails, returning -2 (ENOENT), but does not produce the mds_getxattr messages. So far it looks like the 1.8 client is calling ll_glimpse_size(), which fails. The failure first seems to be noticed in lov_merge_lvb(). Here's an excerpt:

00000080:00000001:2:1327701005.463721:0:17015:0:(obd_class.h:1311:obd_merge_lvb()) Process entered
00020000:00000001:2:1327701005.463723:0:17015:0:(lov_offset.c:63:lov_stripe_size()) Process entered
00020000:00000001:2:1327701005.463726:0:17015:0:(lov_offset.c:79:lov_stripe_size()) Process leaving (rc=75497472 : 75497472 : 4800000)
00020000:00000001:2:1327701005.463728:0:17015:0:(lov_merge.c:110:lov_merge_lvb()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
00000080:00000001:2:1327701005.463731:0:17015:0:(obd_class.h:1317:obd_merge_lvb()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)
00000080:00010000:2:1327701005.463734:0:17015:0:(file.c:1015:ll_glimpse_size()) glimpse: size: 75497472, blocks: 73736
00000080:00000001:2:1327701005.463736:0:17015:0:(file.c:1017:ll_glimpse_size()) Process leaving (rc=18446744073709551614 : -2 : fffffffffffffffe)

On one of the 2.1 OSTs contacted by this 1.8 client, I see this on the console:

LustreError: 5980:0:(ldlm_resource.c:1088:ldlm_resource_get()) lvbo_init failed for resource 10043232: rc -2

The client log shows that it is the same resource that the client is interested in:

00020000:00010000:2:1327701005.463570:0:17015:0:(lov_request.c:176:lov_update_enqueue_set()) ### lock acquired, setting rss=0, kms=0 ns: lsd-OST0009-osc-ffff8101b737d400 lock: ffff810233ff3c00/0x949dab10e7808c56 lrc: 4/1,0 mode: PR/PR res: 10043232/0 rrc: 1 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x0 remote: 0xd8619151a9a4ec55 expref: -99 pid: 17015 timeout 0

If that gives anyone any inkling of what might be going on, let me know. |
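For context on the path being traced above: the LOV layer computes the apparent file size by merging the per-stripe lock value blocks (LVBs) returned for each OST object. The following is a minimal, self-contained C sketch of that merge step, not the actual lov_merge_lvb() source; the struct and helper names (stripe_lvb, merge_file_size) are invented for illustration. It shows how a single stripe whose object lookup failed (-ENOENT, i.e. -2) makes the whole merge fail, matching the rc=-2 seen in the trace.

#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-stripe state; not the real Lustre structures. */
struct stripe_lvb {
        int      rc;    /* 0 on success, -ENOENT if the OST object is gone */
        uint64_t kms;   /* object size / known-minimum-size from the OST   */
};

/*
 * Simplified merge: map each object's size back to a file offset and take
 * the maximum.  If any stripe failed, the merge fails -- roughly the 1.8
 * behaviour, and why stat() returns ENOENT here.
 */
static int merge_file_size(const struct stripe_lvb *lvbs, int nr,
                           uint64_t stripe_size, uint64_t *size_out)
{
        uint64_t size = 0;
        int i;

        for (i = 0; i < nr; i++) {
                uint64_t end;

                if (lvbs[i].rc != 0)
                        return lvbs[i].rc;               /* e.g. -ENOENT == -2 */

                if (lvbs[i].kms == 0) {
                        end = 0;
                } else {
                        uint64_t last = lvbs[i].kms - 1; /* last byte in object */

                        /* RAID-0 style object-offset -> file-offset mapping */
                        end = (last / stripe_size) * stripe_size * nr +
                              (uint64_t)i * stripe_size +
                              last % stripe_size + 1;
                }
                if (end > size)
                        size = end;
        }
        *size_out = size;
        return 0;
}

int main(void)
{
        /* Two-stripe file; the object on the second OST is missing. */
        struct stripe_lvb lvbs[2] = {
                { .rc = 0,       .kms = 75497472 },
                { .rc = -ENOENT, .kms = 0        },
        };
        uint64_t size = 0;
        int rc = merge_file_size(lvbs, 2, 1048576, &size);

        printf("rc = %d\n", rc);        /* prints rc = -2, as in the trace */
        return 0;
}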
| Comment by Christopher Morrone [ 27/Jan/12 ] |
|
Ah, filter_fid2dentry() is returning NULL on this OST, and that is why it returns ENOENT. My guess is that we really lost this object some time in the past. What made me think that it was an incompatibility problem is that 2.1 clients APPEARED to be able to stat the file just fine. But in reality the 2.1 behavior has just changed: whereas lstat() returned an error under 1.8, under 2.1 the client appears to merge as much info as it knows and return that. The size reported for the file under 2.1 is the size of that file's contents on just ONE OST (this file is striped over 2 OSTs, which is our current filesystem default). Which makes me wonder: was this change intentional? Do we really want stat to return partial info? We discussed this a little locally, and we can certainly make up reasons for why it should be either way. If this is by design, I suppose that is fine. Although until we have the online lustre fsck, finding this kind of breakage will be more difficult, since stat will not show the problem. |
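To make the partial-size behaviour described above concrete, here is a small illustration with hypothetical numbers (not taken from the affected file): with two stripes and one object lost, counting only the surviving object's bytes silently reports half of a 10 MiB file.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /*
         * Hypothetical 10 MiB file striped over 2 OSTs with a 1 MiB
         * stripe size: each object holds 5 MiB.
         */
        const uint64_t obj_size[2] = { 5ULL << 20, 5ULL << 20 };

        /* True size, with both objects present. */
        uint64_t true_size = obj_size[0] + obj_size[1];

        /*
         * Behaviour described above for the 2.1 client: the missing
         * object is skipped and only the surviving object's bytes are
         * counted, so stat() quietly reports half the file.
         */
        uint64_t partial_size = obj_size[0];

        printf("true size: %llu\n", (unsigned long long)true_size);
        printf("reported:  %llu\n", (unsigned long long)partial_size);
        return 0;
}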
| Comment by Andreas Dilger [ 30/Jan/12 ] |
|
I don't think that it is intentional to return partial information to userspace when one of the objects is missing. There is no way for the client to know for sure whether the missing object had some data in it or not. It is true that in many cases, if the sizes of the objects "before" the missing one are strictly smaller than stripe_size, chances are the file didn't grow large enough to put any data on that object, but this isn't guaranteed in the case of a sparse file, and we have no external way of confirming it (e.g. SOM or a data checksum). |
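A small illustration of the sparse-file caveat above, again with hypothetical numbers: a single write past the first stripe places all of the data on the second object, so the surviving first object being smaller than stripe_size proves nothing about the missing one.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        /*
         * Hypothetical sparse file, 1 MiB stripe size, 2 stripes: a
         * single 4 KiB write at file offset 1 MiB lands entirely on
         * object 1; object 0 stays empty.
         */
        const uint64_t stripe_size = 1ULL << 20;
        const uint64_t obj0_size   = 0;     /* survives, and is < stripe_size */
        const uint64_t obj1_size   = 4096;  /* this object is the one lost    */

        /*
         * Object 0 being smaller than the stripe size suggests the file
         * never reached object 1 -- yet here the missing object held
         * every byte of data.
         */
        printf("stripe size: %llu\n", (unsigned long long)stripe_size);
        printf("object 0: %llu bytes (present)\n", (unsigned long long)obj0_size);
        printf("object 1: %llu bytes (missing, held all the data)\n",
               (unsigned long long)obj1_size);
        return 0;
}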
| Comment by Christopher Morrone [ 30/Jan/12 ] |
|
Even in the case of non-sparse files, the client appears to make no attempt to guess where the real end of the file is. It appears to simply add up the file sizes from the OSTs that returned info and report that number. The user doing the stat() (or whatever) will really not know that anything is wrong with the file if we return partial data, unless they happen to know ahead of time what the file size should be. So it sounds like we are saying that the 2.1 client's behavior should change. |
| Comment by Andreas Dilger [ 04/Jun/12 ] |
|
I suspect that this was changed in the CLIO code, likely by accident. Even if one considers e.g. network RAID-10 (where some OST returns an error for a missing object), the LOV layer should check another OST for the backup copy of that object, and only return an error if that copy is also missing. Similarly, the only time that the client would know the file size without all objects is in the case of SOM, in which case the OSTs shouldn't be contacted at all. In no case should the missing OST object be silently ignored when computing the file size. Reads of parts of the file not located on the missing OST object should still succeed. |
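The policy described above distinguishes size queries, which need every object, from reads confined to surviving objects. Below is a minimal sketch of that distinction, using a made-up stripe map and helper names rather than real Lustre interfaces.

#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stripe map for a 2-stripe, 1 MiB-stripe-size file. */
#define STRIPE_SIZE   (1ULL << 20)
#define STRIPE_COUNT  2

static const int object_missing[STRIPE_COUNT] = { 0, 1 };  /* object 1 is gone */

/* Which object does this file offset live on? */
static unsigned stripe_of(uint64_t offset)
{
        return (unsigned)((offset / STRIPE_SIZE) % STRIPE_COUNT);
}

/* Computing the file size needs every object, so it must fail here. */
static int can_report_size(void)
{
        unsigned i;

        for (i = 0; i < STRIPE_COUNT; i++)
                if (object_missing[i])
                        return -ENOENT;
        return 0;
}

/* A read confined to surviving objects can still be served. */
static int can_read(uint64_t offset, uint64_t len)
{
        uint64_t end = offset + len;
        uint64_t off = offset;

        while (off < end) {
                if (object_missing[stripe_of(off)])
                        return -ENOENT;
                /* advance to the start of the next stripe */
                off = (off / STRIPE_SIZE + 1) * STRIPE_SIZE;
        }
        return 0;
}

int main(void)
{
        printf("size query:             %d\n", can_report_size());           /* -2 */
        printf("read [0, 64K):          %d\n", can_read(0, 65536));          /*  0 */
        printf("read [1 MiB, 1 MiB+4K): %d\n", can_read(1ULL << 20, 4096));  /* -2 */
        return 0;
}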
| Comment by Andreas Dilger [ 04/Jun/12 ] |
|
NB: Anyone working on this bug, please add a test case for it by deleting one of the OST objects and verifying correct behaviour. |
| Comment by Peter Jones [ 04/Jun/12 ] |
|
Reassigning to Niu |
| Comment by Niu Yawei (Inactive) [ 05/Jun/12 ] |
|
Patch for master: http://review.whamcloud.com/#change,3041 (glimpse size reports an error when some OST object is missing) |
| Comment by Niu Yawei (Inactive) [ 30/Aug/13 ] |
|
I suspect the object missing problem (for 1.8) is related to |
| Comment by Andreas Dilger [ 29/Sep/15 ] |
|
Closing old bug, fixed in 2.4. |