[LU-2943] LBUG mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) ) Created: 11/Mar/13 Updated: 20/Nov/13 Resolved: 20/Nov/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.4 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Diego Moreno (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | mn1 | ||
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 7064 |
| Description |
|
This issue has already been hit on Lustre 2.2 (see the linked ticket). It has now been hit four consecutive times, so it seems quite easy to reproduce. 2013-03-06 16:05:01 LustreError: 31751:0:(mdt_open.c:1023:mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) )
On the crash, the file that triggered the LBUG is a file created by MPI-IO. The onsite support team made the following analysis: the return status (rc) is -EREMOTE (-66). So mdt_reint_open() would return -EREMOTE and the assertion would then be hit. In the attached file you can find the struct mdt_thread_info info data |
| Comments |
| Comment by Bruno Faccini (Inactive) [ 11/Mar/13 ] |
|
Hello Diego, |
| Comment by Oleg Drokin [ 12/Mar/13 ] |
|
I think this might be somewhat related to LU-2275, which I fixed for 2.3; perhaps you should try the patches from there. Additionally, how can there be EREMOTE in 2.1.4? It's not like we really have DNE there. If it's easy to reproduce, I wonder what your reproducer method is? |
| Comment by Alexandre Louvet [ 12/Mar/13 ] |
|
> If it's easy to reproduce, I wonder what's your reproducer method? Right now we don't know. We got it several times in a few days but we were not able to identify the root cause (thousands of nodes and hundreds of jobs running ...). Since then we moved back to 2.1.3, where we don't have such problems. We will continue to investigate and try to reproduce it on a test cluster. |
| Comment by Bruno Faccini (Inactive) [ 12/Mar/13 ] |
|
Yes, I agree EREMOTE with 2.1.4 is strange! Diego, can you also provide, if any, the list of patches that may have been added on top of 2.1.4? |
| Comment by Aurelien Degremont (Inactive) [ 12/Mar/13 ] |
|
> If it's easy to reproduce, I wonder what's your reproducer method? In fact, it is not so easy to reproduce. I'm not sure simply retrying would hit this bug again. Not sure what the client was really doing to trigger this... |
| Comment by Oleg Drokin [ 12/Mar/13 ] |
|
Quick search for EREMOTE in b2_1 returns |
| Comment by Aurelien Degremont (Inactive) [ 12/Mar/13 ] |
Yes, I had the same reaction, but the code is there, and has been for a long time, according to git blame ("Openlock cache forward port", 2008-08):

```c
/* get openlock if this is not replay and if a client requested it */
if (!req_is_replay(req) && create_flags & MDS_OPEN_LOCK) {
        ldlm_mode_t lm;

        if (create_flags & FMODE_WRITE)
                lm = LCK_CW;
        else if (create_flags & MDS_FMODE_EXEC)
                lm = LCK_PR;
        else
                lm = LCK_CR;
        mdt_lock_handle_init(lhc);
        mdt_lock_reg_init(lhc, lm);
        rc = mdt_object_lock(info, child, lhc,
                             MDS_INODELOCK_LOOKUP | MDS_INODELOCK_OPEN,
                             MDT_CROSS_LOCK);
        if (rc) {
                result = rc;
                GOTO(out_child, result);
        } else {
                result = -EREMOTE;
                mdt_set_disposition(info, ldlm_rep, DISP_OPEN_LOCK);
        }
}
```

You can see that under some conditions result can be set to -EREMOTE and returned to the caller (mdt_reint_open()), with absolutely no link to DNE. |
| Comment by Patrick Valentin (Inactive) [ 12/Mar/13 ] |
|
Hello Bruno, Below is the first set of patches, present on top of both 2.1.3 and 2.1.4. As said above, when the customer moved back to 2.1.3, the problem no longer appeared. ORNL-22 general ptlrpcd threads pool support
From branch b2_1 (id: 71350744808a2791d6b623bfb24623052322380d)
LU-1144 ptlrpc: implement a NUMA aware ptlrpcd binding policy
This patch is a backport on lustre 2.1 of the master branch patch.
LU-1110 fid: add full support for open-by-fid
This patch is a backport on lustre 2.1 of the master branch patch.
LU-645 Avoid unnecessary dentry rehashing
This patch is a backport on lustre 2.1 of the b1_8 branch patch.
LU-1331 changelog: allow changelog to extend record
This patch is a backport on lustre 2.1 of the master branch patch.
LU-1448 llite: Prevent NULL pointer dereference on disabled OSC
This patch is a backport on lustre 2.1 of the master branch patch.
LU-1714 lnet: Properly initialize sg_magic value
This patch is a backport on lustre 2.1 of the master branch patch.
And here is the second set of patches, which is only on top of 2.1.4: LU-1887 ptlrpc: grant shrink rpc format is special
From branch b2_1 (id: 1de6014a19aae85ad92fc00265f9aeb86fb7f0cb)
LU-2613 mdt: update disk for fake transactions
This patch is coming from "review.whamcloud.com/#change,5143"
patch set 2, which is still in "Review in Progress" status.
LU-2624 ptlrpc: improve stop of ptlrpcd threads
This patch is a backport on lustre 2.1 of the master branch patch.
LU-2683 lov: release all locks in closure to release sublock
This patch is coming from "review.whamcloud.com/#change,5208" patch
set 2, which was in "Review in Progress" status.
It is now in master branch since 2013-03-04.
LU-1666 obdclass: reduce lock contention on coh_page_guard
From branch b2_1 (id: 3d63043afdbf9842ce763bcff1efa30472ec3881)
LU-744 obdclass: revise cl_page refcount
From branch b2_1 (id: 17f83b93481932e3476b076651ab60e1fbd15136)
Note: I also found |
| Comment by Bruno Faccini (Inactive) [ 12/Mar/13 ] |
|
Thanks, Patrick. Aurelien, it is true that EREMOTE is already there, and has been for quite a long time! And this puzzles me, because I still cannot understand why you did not hit this before. Also, info->mti_spec.sp_cr_flags is MDS_OPEN_OWNEROVERRIDE|MDS_OPEN_LOCK, which should come from an NFS export ... And thus we may only have triggered a very rare open-reconstruct need for an NFSd client request? And like in |
| Comment by Aurelien Degremont (Inactive) [ 13/Mar/13 ] |
|
Bruno, the fact that we hit this bug with both 2.1.4 (Bull 227) and 2.4 (see the linked ticket) is strange.... |
| Comment by Bruno Faccini (Inactive) [ 13/Mar/13 ] |
|
Hello Aurelien, Yes, this looked strange to me as well, and this is why I rather think you hit a very rare situation that triggers an old and still-present bug. Can you try to determine from the crash dump which FS and client/node were involved? I wonder if it could be a Lustre FS that some of your clients then re-export via NFS? BTW, and according to my colleagues, it seems that the EREMOTE usage for the open_lock feature may be avoided, so I may be back with a patch proposal. |
| Comment by Aurelien Degremont (Inactive) [ 13/Mar/13 ] |
|
The filesystem is our scratch fs on TERA-100, which is absolutely not re-exported via NFS. The client that seems involved was a classical compute client which, as previously said, was using MPI-IO (not at all sure this is related). |
| Comment by Bruno Faccini (Inactive) [ 14/Mar/13 ] |
|
Yes, that is right, this can definitely also happen outside of an NFS-export scenario ... So I think that you finally experienced a somewhat rare open-reconstruct/recovery scenario that triggered the bug (like the linked ticket). But let's wait for the currently investigated fix, as part of |
| Comment by Sebastien Buisson (Inactive) [ 25/Mar/13 ] |
|
Hi, Now that a fix for Thanks, |
| Comment by Bruno Faccini (Inactive) [ 26/Mar/13 ] |
|
Master/2.4 fix landed from |
| Comment by Antoine Percher [ 28/Mar/13 ] |
|
I have found the client node that is most certainly the one which sent the failing request: 2013-03-06 16:03:04 INFO: task %%A197:17742 blocked for more than 120 seconds. I have also attached a file with the complete client log |
| Comment by Sebastien Buisson (Inactive) [ 04/Apr/13 ] |
|
Hi Bruno, We tried to backport the fix from TIA, |
| Comment by Bruno Faccini (Inactive) [ 05/Apr/13 ] |
|
Yes, I am working on it, and as you pointed out it is not an easy one; I will keep you updated. |
| Comment by Bruno Faccini (Inactive) [ 05/Apr/13 ] |
|
Seb, |
| Comment by Bruno Faccini (Inactive) [ 07/Apr/13 ] |
|
B2_1 port/patch http://review.whamcloud.com/5954 submitted and successfully passed auto-tests. |
| Comment by Patrick Valentin (Inactive) [ 09/Apr/13 ] |
|
Bruno, |
| Comment by Alexandre Louvet [ 23/Aug/13 ] |
|
Bruno, sorry to say that just after installing the patch, we got a lot of crashes on 3 large clusters. The LBUG message is followed by this stack: Kernel panic - not syncing: LBUG. That was observed on systems running Lustre 2.1.5 + patches
I agree this is not the same context as previously, but it is located exactly where the patch modifies the source code. Alex. |
| Comment by Bruno Faccini (Inactive) [ 26/Aug/13 ] |
|
Hmm, sorry about that; due to your report I am currently looking again at the original/master patch from |
| Comment by Bruno Faccini (Inactive) [ 29/Aug/13 ] |
|
New version/patch-set #3 of b2_1 port/patch http://review.whamcloud.com/5954 submitted and successfully passed auto-tests. |
| Comment by Alexandre Louvet [ 05/Sep/13 ] |
|
What is the current status of the latest patch? |
| Comment by Bruno Faccini (Inactive) [ 09/Sep/13 ] |
|
Hello Alex, |
| Comment by Sebastien Buisson (Inactive) [ 09/Sep/13 ] |
|
Hi Bruno, Patchset #3 of http://review.whamcloud.com/5954 was rolled out at CEA for test purposes at the end of last week. Cheers, |
| Comment by Alexandre Louvet [ 09/Sep/13 ] |
|
Hello Bruno, I'll keep you informed. Cheers, |
| Comment by Bruno Faccini (Inactive) [ 18/Nov/13 ] |
|
Hello Alex and Seb, do you have any update for this ticket? |
| Comment by Sebastien Buisson (Inactive) [ 20/Nov/13 ] |
|
Hi Bruno, The support team confirms that your fix does resolve the issue. Sebastien. |
| Comment by Bruno Faccini (Inactive) [ 20/Nov/13 ] |
|
Cool, thanks for your update Seb. So I am marking this ticket as Fixed. |