[LU-5951] sanity test_39k: mtime is lost on close Created: 24/Nov/14 Updated: 13/Dec/16 Resolved: 09/Dec/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.7.0, Lustre 2.8.0, Lustre 2.5.4 |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Maloo | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | 22pl |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 16613 |
| Description |
| Comments |
| Comment by Andreas Dilger [ 24/Nov/14 ] |
|
Just looking back at patches that landed before Nov 14. I found http://review.whamcloud.com/12243 from |
| Comment by Andreas Dilger [ 24/Nov/14 ] |
|
Another possibility is http://review.whamcloud.com/10858 " |
| Comment by Niu Yawei (Inactive) [ 27/Nov/14 ] |
|
Looks like this was introduced when integrating the OFD stack:
    Author: Mikhail Pershin <tappro@whamcloud.com>
    Date: Wed May 23 23:00:33 2012 +0400
    LU-1406 ofd: IO operations
    add IO functions to OFD
See ofd_commitrw():
    +	if (cmd == OBD_BRW_WRITE) {
    +		/* Don't update timestamps if this write is older than a
    +		 * setattr which modifies the timestamps. b=10150 */
    +
    +		/* XXX when we start having persistent reservations this needs
    +		 * to be changed to ofd_fmd_get() to create the fmd if it
    +		 * doesn't already exist so we can store the reservation handle
    +		 * there. */
    +		valid = OBD_MD_FLUID | OBD_MD_FLGID;
    +		fmd = ofd_fmd_find(exp, &info->fti_fid);
    +		if (!fmd || fmd->fmd_mactime_xid < info->fti_xid)
    +			valid |= OBD_MD_FLATIME | OBD_MD_FLMTIME |
    +				 OBD_MD_FLCTIME;
This should actually be:
    if (fmd && fmd->fmd_mactime_xid > info->fti_xid)
            valid &= ~time_flags;
I'm going to cook a patch soon. |
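For readers following the XID comparison, here is a minimal standalone sketch of the rule suggested in the comment above: a write keeps its timestamp bits unless a setattr with a higher XID has already been recorded for the file. It is not the landed patch. The names `fmd_model`, `write_attr_valid`, and the `MD_*` bit values are hypothetical stand-ins for the real OFD structures and `OBD_MD_FL*` flags, and the sketch assumes `valid` starts with the time flags set.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for the OBD_MD_FL* bits; values are illustrative. */
#define MD_FLATIME 0x01u
#define MD_FLMTIME 0x02u
#define MD_FLCTIME 0x04u
#define MD_FLUID   0x08u
#define MD_FLGID   0x10u
#define MD_TIME_FLAGS (MD_FLATIME | MD_FLMTIME | MD_FLCTIME)

/* Simplified model of the per-file modification data kept on the OST. */
struct fmd_model {
	uint64_t mactime_xid; /* XID of the last setattr that changed timestamps */
};

/*
 * Return the attribute bits a write is allowed to update.
 * fmd == NULL models "no setattr recorded for this file".
 */
static unsigned int write_attr_valid(const struct fmd_model *fmd,
				     uint64_t write_xid)
{
	/* Assumption: start with everything, then clear what must not change. */
	unsigned int valid = MD_FLUID | MD_FLGID | MD_TIME_FLAGS;

	/* Drop the timestamps only when a newer setattr was already seen. */
	if (fmd != NULL && fmd->mactime_xid > write_xid)
		valid &= ~MD_TIME_FLAGS;

	return valid;
}
```

Under this model, a write with XID 5 arriving after a setattr with XID 10 updates only UID/GID, while a write with XID 11 also updates the timestamps. Compared with the original `<` test, this form differs only when the two XIDs are equal, in which case it keeps the timestamp bits.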
| Comment by Gerrit Updater [ 27/Nov/14 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/12865 |
| Comment by Niu Yawei (Inactive) [ 27/Nov/14 ] |
|
patch for master: http://review.whamcloud.com/12865 |
| Comment by Jian Yu [ 30/Nov/14 ] |
|
While verifying patch http://review.whamcloud.com/12804 on Lustre b2_5 branch, the same failure occurred: |
| Comment by Gerrit Updater [ 04/Dec/14 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12865/ |
| Comment by Jian Yu [ 08/Dec/14 ] |
|
More instances on the Lustre b2_5 branch: |
| Comment by Niu Yawei (Inactive) [ 05/Jan/15 ] |
|
patch landed on master. |
| Comment by Gerrit Updater [ 07/Jan/15 ] |
|
Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/13261 |
| Comment by Gerrit Updater [ 27/Jan/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13261/ |
| Comment by Di Wang [ 01/Jul/15 ] |
|
It seems to be a regression; I saw it twice recently: https://testing.hpdd.intel.com/sub_tests/3032df32-1fbc-11e5-bc94-5254006e85c2 |
| Comment by Niu Yawei (Inactive) [ 02/Jul/15 ] |
|
I think the regression was introduced by:
commit bf3e7f67cb33f3b4e0590ef8af3843ac53d0a4e8
Author: Gregoire Pichon <gregoire.pichon@bull.net>
Date: Wed May 13 16:42:44 2015 +0200
LU-5319 ptlrpc: embed highest XID in each request
Atomically assign XIDs and put request and sending list so
we can learn the lowest unreplied XID at any point.
This allows to embed in every resquests the highest XID for
which a reply has been received and does not have an unreplied
lower-numbered XID.
This will be used by the MDT target to release in-memory
reply data corresponding to XIDs of reply received by the client.
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: Ic88fb6db704d8e9a78a34fe16f64abb2cdffc4c4
Reviewed-on: http://review.whamcloud.com/14793
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
That commit deferred the XID assignment from request packing to request sending, which breaks the fix for bug 10150. See osc_build_rpc():
    /* Need to update the timestamps after the request is built in case
     * we race with setattr (locally or in queue at OST). If OST gets
     * later setattr before earlier BRW (as determined by the request xid),
     * the OST will not use BRW timestamps. Sadly, there is no obvious
     * way to do this in a single call. bug 10150 */
It looks like we have to fix the setattr vs. BRW race some other way, or else fix the multi-slot patch. Any suggestions? |
| Comment by Gerrit Updater [ 02/Jul/15 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/15473 |
| Comment by Andreas Dilger [ 03/Jul/15 ] |
|
It isn't totally clear that we need the change from http://review.whamcloud.com/14793 in order for the multi-slot code to work. While it would make the tracking of unreplied RPCs a bit more complex, having an atomic XID assignment set at "send" time is not quite the same as "unreplied" so there still needs to be a mechanism used to track which RPCs have replies. The one major difference would be that there needs to be some mechanism to track RPC XIDs which are never sent, so that they don't permanently get stuck as the lowest unreplied XID. It would seem possible to do this in __ptlrpc_req_free() I think? |
| Comment by Alex Zhuravlev [ 03/Jul/15 ] |
|
Well, if we don't track that, then it's very easy to "lose" some slots: at moment X we used 8 slots, then later we were using 2 slots at most. Using tags we can reuse only those 2 slots, but we can't report that the other slots can be reused. There is no strong need to maintain that absolutely up to date, |
| Comment by Niu Yawei (Inactive) [ 03/Jul/15 ] |
|
Ok, I'll update the patch to maintain an unreplied xid list for each import. |
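The idea discussed above, a per-import list of unreplied XIDs from which the "highest XID with no unreplied lower XID" can be derived, can be sketched as follows. This is an illustrative model only, not the code from the patch: `import_model`, `MAX_INFLIGHT`, and all function names are hypothetical, locking is omitted, and a fixed-size array stands in for whatever list structure the real import uses. Releasing an XID covers both cases raised in the discussion: a reply arriving, and a request being freed without ever being sent.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_INFLIGHT 64 /* hypothetical cap, for this sketch only */

struct import_model {
	uint64_t next_xid;                /* next XID to hand out (must be >= 1) */
	uint64_t unreplied[MAX_INFLIGHT]; /* XIDs assigned but not yet released */
	size_t count;
};

static void import_init(struct import_model *imp, uint64_t first_xid)
{
	imp->next_xid = first_xid;
	imp->count = 0;
}

/* Assign an XID and record it as unreplied; returns 0 if the table is full. */
static uint64_t import_assign_xid(struct import_model *imp)
{
	uint64_t xid;

	if (imp->count == MAX_INFLIGHT)
		return 0;
	xid = imp->next_xid++;
	imp->unreplied[imp->count++] = xid;
	return xid;
}

/* Forget an XID: on reply, or when a request is freed without being sent. */
static void import_release_xid(struct import_model *imp, uint64_t xid)
{
	for (size_t i = 0; i < imp->count; i++) {
		if (imp->unreplied[i] == xid) {
			/* Order does not matter, so swap in the last entry. */
			imp->unreplied[i] = imp->unreplied[--imp->count];
			return;
		}
	}
}

/* Highest XID such that it and every lower XID has been released. */
static uint64_t import_known_replied_xid(const struct import_model *imp)
{
	uint64_t lowest = imp->next_xid;

	for (size_t i = 0; i < imp->count; i++)
		if (imp->unreplied[i] < lowest)
			lowest = imp->unreplied[i];
	return lowest - 1;
}
```

Because the XID is recorded when it is assigned rather than when the request is sent, a request that is never sent must still be released, otherwise it would stay the lowest unreplied XID forever. That is the concern raised about `__ptlrpc_req_free()` in the earlier comment.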
| Comment by James Nunez (Inactive) [ 06/Jul/15 ] |
|
I've seen this issue again: |
| Comment by Gerrit Updater [ 02/Oct/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/15473/ |
| Comment by Joseph Gmitter (Inactive) [ 02/Oct/15 ] |
|
Landed for 2.8.0 |
| Comment by Gerrit Updater [ 06/Oct/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/16734 |
| Comment by Gerrit Updater [ 06/Oct/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16734/ |
| Comment by Joseph Gmitter (Inactive) [ 06/Oct/15 ] |
|
Reopening as the recent landing had caused |
| Comment by Joseph Gmitter (Inactive) [ 06/Oct/15 ] |
|
The fixVersion has been updated to 2.9.0 to properly address the issue that was being addressed by http://review.whamcloud.com/15473/ |
| Comment by Gerrit Updater [ 08/Oct/15 ] |
|
Niu Yawei (yawei.niu@intel.com) uploaded a new patch: http://review.whamcloud.com/16759 |
| Comment by Jeremy Filizetti [ 01/Dec/15 ] |
|
The patch (http://review.whamcloud.com/#/c/16759/) here is necessary for GSS Shared Key (and I assume Kerberos) to function without generating an LBUG. |
| Comment by Gerrit Updater [ 09/Dec/15 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16759/ |
| Comment by Peter Jones [ 09/Dec/15 ] |
|
Landed for 2.8 |