[LU-4249] exception RIP: lqe64_hash_keycmp+12 Created: 13/Nov/13 Updated: 26/Feb/15 Resolved: 08/Aug/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.5.3 |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Niu Yawei (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: |
our source tree is at https://github.com/jlan/lustre-nas |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 11583 |
| Description |
|
OSS crashed with a Protection Fault at lqe64_hash_keycmp. See attached files. In libcfs/libcfs/hash.c, cfs_hlist_for_each is defined as hlist_for_each, but hlist_for_each doesn't seem to be defined anywhere. See attached backtrace info. |
| Comments |
| Comment by Niu Yawei (Inactive) [ 14/Nov/13 ] |
|
hlist_for_each() is a kernel function. Mahmoud, do you know if this happened when shutting down the OST? Thanks. |
| Comment by Mahmoud Hanafi [ 14/Nov/13 ] |
|
This was not during shutdown. |
| Comment by Mahmoud Hanafi [ 15/Nov/13 ] |
|
We hit this bug again today. Is there any progress toward a cause/fix? |
| Comment by Andreas Dilger [ 15/Nov/13 ] |
|
Mahmoud, could you please attach the console log from before the crash and the oops message itself. |
| Comment by Mahmoud Hanafi [ 16/Nov/13 ] |
|
Here is the console log from 1 hour before the oops, and the oops message itself. |
| Comment by Niu Yawei (Inactive) [ 18/Nov/13 ] |
|
Looks like the lqe is freed unexpectedly, but I didn't see from the code how it could be freed in this case. Anyway, the message: LustreError: 13891:0:(qsd_handler.c:344:qsd_req_completion()) $$$ DQACQ failed with -37, flags:0x2 qsd:nbp7-OST002a qtype:usr id:12157 enforced:1 granted:4194304 pending:0 waiting:0 req:1 usage:4009436 qunit:1048576 qtune:262144 edquot:0^M indicates something was wrong. If this error message can be reproduced, could you reproduce it and capture the logs on both the OST and MDT with D_QUOTA & D_TRACE enabled? I'd like to figure out why the acquire request always fails with -ENOLCK. |
| Comment by Mahmoud Hanafi [ 18/Nov/13 ] |
|
Uploaded debug logs: |
| Comment by Niu Yawei (Inactive) [ 19/Nov/13 ] |
|
The numerous ENOLCK errors are possibly caused by a race:
Then pre-acquire will always try to get the lock with a stale lockh, and fail with -ENOLCK. But I didn't see why it causes the crash in lqe64_hash_keycmp(); anyway, I'll cook a patch to fix the race first. |
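The suspected race can be sketched in miniature: a cached lock handle goes stale once the lock is cancelled, and every later pre-acquire fails with -ENOLCK until the handle is refreshed. This is a hypothetical user-space model, not the real Lustre ldlm API; all names below are illustrative.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified model of the suspected race: the quota code
 * caches a lock handle (lockh) whose cookie must match a live lock.
 * All names here are invented for illustration. */
struct lock_handle { uint64_t cookie; };

static uint64_t live_lock_cookie = 1;  /* cookie of the currently valid lock */

/* Re-acquiring the lock stores a fresh cookie in the cached handle. */
static void refresh_handle(struct lock_handle *lockh)
{
	lockh->cookie = live_lock_cookie;
}

/* Cancelling the lock makes every cached copy of the old cookie stale --
 * this is the window for the race described above. */
static void cancel_lock(void)
{
	live_lock_cookie++;
}

/* Pre-acquire path: fails the way DQACQ does when the handle is stale. */
static int pre_acquire(const struct lock_handle *lockh)
{
	if (lockh->cookie != live_lock_cookie)
		return -ENOLCK;  /* -37, the error seen in the logs */
	return 0;
}
```

Until something refreshes the cached handle after a cancel, every retry keeps returning -ENOLCK, which would explain the repeating "DQACQ failed with -37" messages.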
| Comment by Niu Yawei (Inactive) [ 19/Nov/13 ] |
|
patch to fix the -ENOLCK error: http://review.whamcloud.com/8322 |
| Comment by Niu Yawei (Inactive) [ 22/Nov/13 ] |
|
Looks like my analysis of the ENOLCK isn't quite right: the DQACQ should have held a lock reference, so the lock can't be cancelled while there is a pending DQACQ. Mahmoud, do you know what kind of operations could cause the problem? Is your source tree the same as our source tree? |
| Comment by Peter Jones [ 25/Nov/13 ] |
|
Niu, NASA do include some patches. There is a link to their tree in the Environment field above. Peter |
| Comment by Mahmoud Hanafi [ 25/Nov/13 ] |
|
We had another crash. We do have a crash dump, but it may only be viewed by a US citizen. I tried to get more info on the DQACQ error: I got a list of users that are generating the error and then looked at their quotas. I found some odd things: some users have a used-files count much lower than what they are actually using. For example,
lfs quota says a user is using 30 files, but find shows >22000 files. There are other examples like this. |
| Comment by Keith Mannthey (Inactive) [ 25/Nov/13 ] |
|
I am able to look at this crash dump as I have done before. I sent an email. |
| Comment by Niu Yawei (Inactive) [ 26/Nov/13 ] |
|
Thank you, Keith, let's see if there is anything new in this crash.
Is this a new system or was it upgraded from an old system? Will 'lfs quota' show the newly created files? Can quotacheck fix the inconsistency problem? ('tune2fs -O ^quota' to disable quota, then 'tune2fs -O quota' to re-enable it, with up-to-date e2fsprogs, see |
| Comment by Mahmoud Hanafi [ 26/Nov/13 ] |
|
This is a filesystem that was upgraded from 2.1.5 to 2.4. We initially ran into this file undercount and ran the tune2fs command. One thing to note about these users: their files are initially copied as root to this filesystem, and then their ownership is changed. Here is an example /nobackupp7 <- top-level directory
So when the ownership is changed, the files are not accounted for. Question about using tune2fs: Keith, Mahmoud |
| Comment by Keith Mannthey (Inactive) [ 26/Nov/13 ] |
|
I will take a good look at this crash and report back. |
| Comment by Niu Yawei (Inactive) [ 27/Nov/13 ] |
Is this reproducible? Any error messages in the log?
No, there is no need to do a mount and umount. |
| Comment by Mahmoud Hanafi [ 21/Dec/13 ] |
|
Keith, we need an update on this. Any luck extracting info from the core files? |
| Comment by Cliff White (Inactive) [ 13/Jan/14 ] |
|
I have taken this over from Keith, very sorry about the delay. I have the dump unpacked, and should be able to get you some information this week. |
| Comment by Cliff White (Inactive) [ 14/Jan/14 ] |
|
I see you are using a non-standard build - would it be possible for you to upload your vmlinux for this crash dump? |
| Comment by Mahmoud Hanafi [ 14/Jan/14 ] |
|
I uploaded the |
| Comment by Mahmoud Hanafi [ 14/Jan/14 ] |
|
Cliff, Thanks, |
| Comment by Niu Yawei (Inactive) [ 15/Jan/14 ] |
|
Mahmoud, Cliff has already checked the crashdump, and it shows that it crashed in the same place as the first crash you posted in this ticket. I'm not sure if it's related to the cfs_hash bug, but you should apply e44670a0301d3edf13ebd1bc728ad4c797369de2 ( |
| Comment by Mahmoud Hanafi [ 15/Jan/14 ] |
|
We will apply Thanks. |
| Comment by Mahmoud Hanafi [ 30/Jan/14 ] |
|
Looks like: LustreError: 13476:0:(qsd_handler.c:344:qsd_req_completion()) Skipped 9 previous similar messages^M |
| Comment by Hongchao Zhang [ 30/Jan/14 ] |
|
From the assembly code of lqe64_hash_keycmp, the problem should be that the lquota_entry managed by cfs_hash was destroyed: 0xffffffffa0bf8f2c <lqe64_hash_keycmp+12>: cmp %rax,0x10(%rsi) <-- 0x10(%rsi) is the "lqe->lqe_id.qid_uid"
[exception RIP: lqe64_hash_keycmp+12]
RIP: ffffffffa0bf8f2c RSP: ffff88058e5736d0 RFLAGS: 00010206
RAX: 0000000000002f7d RBX: ffff880fd6450e00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 5a5a5a5a5a5a5a5a RDI: ffff88058f1b2f60
RBP: ffff88058e5736d0 R8: 0000000000000003 R9: 0000000000000000
R10: ffff88058f1b2f28 R11: ffffffffa0cac2f0 R12: ffff88058e573760
R13: ffff88058f1b2f60 R14: 0000000000000000 R15: 5a5a5a5a5a5a5a5a
The RSI is 0x5a5a5a5a5a5a5a5a, which means the corresponding lquota_entry was freed. |
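The 0x5a pattern is a poison byte: Lustre's memory debugging fills freed allocations with 0x5a, so a register loaded with 0x5a5a5a5a5a5a5a5a strongly suggests a use-after-free (note also that RAX, the key being compared, is 0x2f7d = 12157, the uid from the earlier DQACQ errors). A minimal user-space illustration of the pattern; the struct and helper names are invented, only the poison byte matches the crash dump.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: emulate an allocator that poisons freed objects. */
#define POISON_BYTE 0x5a

struct lqe_like {
	void     *hash_link;  /* stands in for the cfs_hash linkage */
	uint64_t  qid_uid;    /* the id read by lqe64_hash_keycmp */
};

/* Emulate freeing with poisoning enabled: the whole object is
 * overwritten with the poison byte. */
static void poison_free(struct lqe_like *lqe)
{
	memset(lqe, POISON_BYTE, sizeof(*lqe));
}

/* A 64-bit value read from poisoned memory looks exactly like the RSI
 * in the exception frame above. */
static int looks_poisoned(uint64_t value)
{
	return value == 0x5a5a5a5a5a5a5a5aULL;
}
```

Any hash walk that dereferences such an entry after the free will compare against 0x5a5a5a5a5a5a5a5a, which is what the exception frame shows.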
| Comment by Mahmoud Hanafi [ 30/Jan/14 ] |
|
Can we expect a patch soon? This is our most critical issue. |
| Comment by Johann Lombardi (Inactive) [ 31/Jan/14 ] |
|
Hi Mahmoud, One possible scenario is that the quota code screws up the refcount on the lquota_entry structure, causing a lqe to be freed while still referenced in the hash table. We are working on a diagnostic patch under http://review.whamcloud.com/#/c/9070. We would like to make sure it passes our regression tests before you apply it. Stay tuned. |
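The failure mode Johann describes can be modelled as follows: the hash table owns one reference, a lookup takes its own, and a put that drops the last reference while the entry is still hashed is exactly what the diagnostic patch turns into an LBUG. This is a hedged sketch with invented names; the real helpers are lqe_getref()/lqe_putref() in lustre/quota/lquota_internal.h.

```c
#include <assert.h>

/* Minimal model of the suspected refcount bug. */
struct lqe_ref {
	int refcount;
	int in_hash;  /* still linked in the cfs_hash table? */
	int freed;    /* for illustration only */
};

static void ref_get(struct lqe_ref *lqe)
{
	lqe->refcount++;
}

/* Returns 1 when the put would free the entry while it is still hashed,
 * mirroring "Freeing quota entry while it is still referenced in the
 * hash!!!" from the later crash logs. */
static int ref_put(struct lqe_ref *lqe)
{
	assert(lqe->refcount > 0);
	if (--lqe->refcount > 0)
		return 0;
	if (lqe->in_hash)
		return 1;  /* the buggy path: LBUG in the debug patch */
	lqe->freed = 1;    /* the correct path: unhashed, safe to free */
	return 0;
}
```

The correct teardown order is unhash first, then drop the final reference; a misplaced extra put anywhere in the quota code collapses that ordering.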
| Comment by Mahmoud Hanafi [ 12/Mar/14 ] |
|
We got a crash and here is the debug output.
LustreError: 167-0: nbp7-MDT0000-lwp-OST004a: This client was evicted by nbp7-MDT0000; in progress operations using this service will fail.
Lustre: nbp7-MDT0000-lwp-OST003e: Connection restored to nbp7-MDT0000 (at 10.151.27.38@o2ib)
Lustre: Evicted from MGS (at 10.151.27.38@o2ib) after server handle changed from 0x77507a47a4c36f84 to 0x5127d6849d7f5d1c
Lustre: nbp7-MDT0000-lwp-OST004a: Connection restored to nbp7-MDT0000 (at 10.151.27.38@o2ib)
Lustre: 45157:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST0032: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 45157:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST0032: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: 45187:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST0026: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 45187:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST0026: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: 45188:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST0042: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 45188:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST0042: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: 45189:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST002a: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 45189:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST002a: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: nbp7-OST0026: deleting orphan objects from 0x0:16242960 to 0x0:16242977
Lustre: nbp7-OST002a: deleting orphan objects from 0x0:16088539 to 0x0:16088557
Lustre: nbp7-OST002e: deleting orphan objects from 0x0:15914638 to 0x0:15914671
Lustre: nbp7-OST003a: deleting orphan objects from 0x0:15238739 to 0x0:15238764
Lustre: nbp7-OST0042: deleting orphan objects from 0x0:15604504 to 0x0:15604524
Lustre: nbp7-OST003e: deleting orphan objects from 0x0:16407770 to 0x0:16407841
Lustre: nbp7-OST004a: deleting orphan objects from 0x0:15822277 to 0x0:15822318
Lustre: nbp7-OST004e: deleting orphan objects from 0x0:15730335 to 0x0:15730375
Lustre: nbp7-OST0036: deleting orphan objects from 0x0:14913994 to 0x0:14914030
Lustre: nbp7-OST0046: deleting orphan objects from 0x0:15378324 to 0x0:15378342
Lustre: nbp7-OST0032: deleting orphan objects from 0x0:15902313 to 0x0:15902368
Lustre: nbp7-OST0052: deleting orphan objects from 0x0:15714908 to 0x0:15714926
Lustre: 45191:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST0046: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 45191:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST0046: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: 45209:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST003e: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 45209:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST003e: reconciliation failed. [0x0:0x0:0x0], -37
LustreError: 11901:0:(lquota_internal.h:272:lqe_putref()) $$$ Freeing quota entry while it is still referenced in the hash!!! qsd:nbp7-OST002a qtype:usr id:11920 enforced:1 granted:31459296 pending:0 waiting:0 req:0 usage:31396440 qunit:4194304 qtune:524288 edquot:0
LustreError: 11901:0:(lquota_internal.h:273:lqe_putref()) LBUG |
| Comment by Johann Lombardi (Inactive) [ 12/Mar/14 ] |
|
There should be a stack trace dumped after the LBUG. Could you please paste it here? |
| Comment by Mahmoud Hanafi [ 12/Mar/14 ] |
|
This is from a second host that crashed at the same time. It has the call trace:
Lustre: nbp7-MDT0000-lwp-OST003b: Connection restored to nbp7-MDT0000 (at 10.151.27.38@o2ib)
Lustre: Evicted from MGS (at 10.151.27.38@o2ib) after server handle changed from 0x77507a47a4c430e4 to 0x5127d6849d7f8850
Lustre: nbp7-MDT0000-lwp-OST0033: Connection restored to nbp7-MDT0000 (at 10.151.27.38@o2ib)
Lustre: 47116:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST003f: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 47116:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST003f: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: 47117:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST002f: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 47117:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST002f: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: 47122:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST004f: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 47122:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST004f: reconciliation failed. [0x0:0x0:0x0], -37
Lustre: nbp7-OST002b: deleting orphan objects from 0x0:16291625 to 0x0:16291649
Lustre: nbp7-OST003b: deleting orphan objects from 0x0:16000564 to 0x0:16000656
Lustre: nbp7-OST0027: deleting orphan objects from 0x0:16520477 to 0x0:16520523
Lustre: nbp7-OST0033: deleting orphan objects from 0x0:16359404 to 0x0:16359437
Lustre: nbp7-OST0047: deleting orphan objects from 0x0:16321198 to 0x0:16321227
Lustre: nbp7-OST0043: deleting orphan objects from 0x0:16268525 to 0x0:16268561
Lustre: nbp7-OST002f: deleting orphan objects from 0x0:15751870 to 0x0:15751911
Lustre: nbp7-OST003f: deleting orphan objects from 0x0:16136079 to 0x0:16136110
Lustre: nbp7-OST004b: deleting orphan objects from 0x0:15683379 to 0x0:15683400
Lustre: nbp7-OST004f: deleting orphan objects from 0x0:16583505 to 0x0:16583534
Lustre: nbp7-OST0037: deleting orphan objects from 0x0:16228767 to 0x0:16228810
Lustre: nbp7-OST0053: deleting orphan objects from 0x0:16334906 to 0x0:16334966
Lustre: 47127:0:(qsd_reint.c:365:qsd_reconciliation()) nbp7-OST004b: failed to report quota. [0x200000006:0x20000:0x0], -37
Lustre: 47127:0:(qsd_reint.c:525:qsd_reint_main()) nbp7-OST004b: reconciliation failed. [0x0:0x0:0x0], -37
LustreError: 12423:0:(lquota_internal.h:272:lqe_putref()) $$$ Freeing quota entry while it is still referenced in the hash!!! qsd:nbp7-OST003f qtype:usr id:9637 enforced:1 granted:2111482660 pending:0 waiting:0 req:0 usage:2111481984 qunit:4096 qtune:1024 edquot:0
LustreError: 12423:0:(lquota_internal.h:273:lqe_putref()) LBUG
Pid: 12423, comm: lquota_wb_nbp7-
Call Trace:
[<ffffffffa0486895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0486e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0c761ef>] qsd_upd_thread+0xc8f/0xdd0 [lquota]
[<ffffffff81063be0>] ? default_wake_function+0x0/0x20
Entering kdb (current=0xffff881023ab4040, pid 12423) on processor 24
Oops: (null) due to oops @ 0x0 |
| Comment by Niu Yawei (Inactive) [ 14/Mar/14 ] |
|
I didn't see how the quota code screwed up the refcount so far; I will continue to investigate further. Johann, do you have any idea? Thanks. |
| Comment by Niu Yawei (Inactive) [ 17/Mar/14 ] |
|
One thing that confuses me: qsd_reconciliation() reported "failed to report quota" with -ENOLCK, which means qsd_reconciliation() called qsd_adjust() and it returned -ENOLCK; however, I didn't see the corresponding error message from qsd_req_completion() in the log: /* despite -EDQUOT & -EINPROGRESS errors, the master might still
* grant us back quota space to adjust quota overrun */
if (ret != 0 && ret != -EDQUOT && ret != -EINPROGRESS) {
if (ret != -ETIMEDOUT && ret != -ENOTCONN &&
ret != -ESHUTDOWN && ret != -EAGAIN)
/* print errors only if return code is unexpected */
LQUOTA_ERROR(lqe, "DQACQ failed with %d, flags:0x%x",
ret, reqbody->qb_flags);
GOTO(out, ret);
}
|
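Restating the filter quoted above as a standalone predicate makes the puzzle concrete: -ENOLCK (-37) is in neither excluded set, so a "DQACQ failed with -37" message should normally reach the console. This is a simplified restatement, not the original function; dqacq_error_is_printed is an invented name.

```c
#include <assert.h>
#include <errno.h>

/* Returns 1 when LQUOTA_ERROR in qsd_req_completion() would print the
 * failure, per the two nested conditions quoted above. */
static int dqacq_error_is_printed(int ret)
{
	if (ret == 0 || ret == -EDQUOT || ret == -EINPROGRESS)
		return 0;  /* handled without complaint */
	if (ret == -ETIMEDOUT || ret == -ENOTCONN ||
	    ret == -ESHUTDOWN || ret == -EAGAIN)
		return 0;  /* expected transient errors, silenced */
	return 1;          /* unexpected return codes are printed */
}
```

So if -ENOLCK really flowed through this path, the error line should appear unless it was rate-limited or flooded out of the log buffer.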
| Comment by Niu Yawei (Inactive) [ 20/Mar/14 ] |
|
Hi, Mahmoud. I didn't see why "DQACQ failed with -37" wasn't printed in the log; is it possible that the running code isn't the same as 2.4.0-4nasS? Do you have other 2.4 systems that are also suffering from this problem? Thanks. As Johann pointed out, the error messages could have been flooded out. Could you enable quota debug (D_QUOTA) on all OSTs and provide Lustre logs as well when it's reproduced next time? Thank you. |
| Comment by Mahmoud Hanafi [ 20/Mar/14 ] |
|
We have a second filesystem running the same code (2.4.1-5.2nasS); it doesn't have any of the ENOLCK errors. The filesystem with the errors was a 2.1.5 filesystem that was upgraded to 2.4; the other started out on 2.4. We do want to track down the ENOLCK errors. We have D_QUOTA enabled, but when we see a crash the host panics and we are not able to get a debug dump. Additional info: these past few crashes have come after the MDS crashed and finished recovery; then we would see one or more OSSes crash. But we have seen this crash without the MDS crash. I took a current debug dump of all OSSes and the MDS and uploaded it to the ftp site (nbp7.debug.tgz). |
| Comment by Shuichi Ihara (Inactive) [ 25/Mar/14 ] |
|
Hi, we hit the same issue (filed as
Mar 25 00:29:51 ddnoss4 kernel: LustreError: 4503:0:(qsd_handler.c:344:qsd_req_completion()) $$$ DQACQ failed with -37, flags:0x2 qsd:home2-OST0026 qtype:grp id:3303 enforced:1 granted:7364460 pending:0 waiting:0 req:1 usage:7165364 qunit:4194304 qtune:524288 edquot:0
Mar 25 00:39:51 ddnoss4 kernel: LustreError: 4503:0:(qsd_handler.c:344:qsd_req_completion()) $$$ DQACQ failed with -37, flags:0x2 qsd:home2-OST0026 qtype:grp id:3303 enforced:1 granted:7364460 pending:0 waiting:0 req:1 usage:7165364 qunit:4194304 qtune:524288 edquot:0
Mar 25 00:49:51 ddnoss4 kernel: LustreError: 4503:0:(qsd_handler.c:344:qsd_req_completion()) $$$ DQACQ failed with -37, flags:0x2 qsd:home2-OST0026 qtype:grp id:3303 enforced:1 granted:7364460 pending:0 waiting:0 req:1 usage:7165364 qunit:4194304 qtune:524288 edquot:0
Mar 25 00:59:51 ddnoss4 kernel: LustreError: 4503:0:(qsd_handler.c:344:qsd_req_completion()) $$$ DQACQ failed with -37, flags:0x2 qsd:home2-OST0026 qtype:gr |
| Comment by Niu Yawei (Inactive) [ 25/Mar/14 ] |
|
Hi, Ihara. If the OSS is still showing this error message every 10 minutes, collecting debug logs (it's better to have both MDT and OST logs with D_QUOTA enabled) would help us figure out why dqacq failed with -ENOLCK. Thank you. |
| Comment by Niu Yawei (Inactive) [ 26/Mar/14 ] |
|
Seems there is a small race window in qsd_intent_lock(), I updated the patch. |
| Comment by Shuichi Ihara (Inactive) [ 26/Mar/14 ] |
|
Hi Niu, uploaded the debug log to ftp.whamcloud.com/uploads/ |
| Comment by Niu Yawei (Inactive) [ 27/Mar/14 ] |
|
Hi, Ihara/Mahmoud |
| Comment by Jay Lan (Inactive) [ 27/Mar/14 ] |
|
Niu, do you want me to undo patch at http://review.whamcloud.com/#/c/9070 or to keep it, in addition to #9808? Thanks! |
| Comment by Jay Lan (Inactive) [ 27/Mar/14 ] |
|
The #9070 needs to be reverted before applying #9808. However, there are still conflicts in three files. I believe your patch was not generated against the 2.4 branch. The three files with conflicts were: lustre/quota/qmt_pool.c One conflict in qmt_pool.c at qmt_pool_lqe_lookup():
if (qid->qid_uid == 0) {
        /* caller wants to access grace time, no need to look up the
         * entry since we keep a reference on ID 0 all the time */
        lqe = pool->qpi_grace_lqe[qtype];
<<<<<<< HEAD
        lqe_getref(lqe);
        GOTO(out, 0);
=======
        lqe_getref(lqe, LQE_REF_IDX_MAX);
        GOTO(out, lqe);
>>>>>>> 6ec5b50... LU-4249 quota: lqe reference debug patch
}
/* now that we have the pool, let's look-up the quota entry in the
The tip of the 2.4 branch should have a GOTO line like this: I think one example is enough for you to see the difference. |
| Comment by Niu Yawei (Inactive) [ 28/Mar/14 ] |
|
Sorry, I made the patch against master by mistake; here is the patch for b2_4: http://review.whamcloud.com/9833 This new patch doesn't address the conflict in lustre/quota/qsd_writeback.c, because there is a quota fix introduced in 2.4.2 ( |
| Comment by Jay Lan (Inactive) [ 28/Mar/14 ] |
|
Hi Niu, I cherry-picked |
| Comment by Niu Yawei (Inactive) [ 03/Apr/14 ] |
|
Hi Javed, it looks like your post isn't related to this bug; could you open another ticket to track it? Thanks. |
| Comment by javed shaikh (Inactive) [ 03/Apr/14 ] |
|
thanks niu. deleted from here. |
| Comment by Shuichi Ihara (Inactive) [ 03/Apr/14 ] |
|
Hi Niu, so, finally, which patches exactly should we apply against b2_4 if we want the debug patches enabled? |
| Comment by Niu Yawei (Inactive) [ 03/Apr/14 ] |
fix of |
| Comment by Shuichi Ihara (Inactive) [ 18/Apr/14 ] |
|
Niu, we applied the patches, and then the same error messages showed up again. Attached are a recent debug log and syslog messages from after applying the patches. |
| Comment by Niu Yawei (Inactive) [ 18/Apr/14 ] |
|
Hi, Ihara, the ENOLCK error message problem is addressed in another ticket, please see This debug patch is to collect information on the lqe refcount when the system crashes with "exception RIP: ..."; please keep the debug patch applied until the crash problem is reproduced. Thanks. |
| Comment by Niu Yawei (Inactive) [ 04/Jul/14 ] |
| Comment by Niu Yawei (Inactive) [ 09/Jul/14 ] |
|
b2_4: http://review.whamcloud.com/11019 |
| Comment by Peter Jones [ 08/Aug/14 ] |
|
Landed for 2.7 |
| Comment by Jay Lan (Inactive) [ 08/Aug/14 ] |
|
Do we need to undo #9833 (the debug patch) and apply #11019? Also please land #11019 and #11020. |
| Comment by Peter Jones [ 08/Aug/14 ] |
|
I will leave it to Niu to comment definitively about whether the debug patch needs to be removed but my understanding is that #11019 is intended to be the fix for b2_4. The fixes for maintenance branches will be landed when a release is scheduled for those branches. |
| Comment by Jay Lan (Inactive) [ 08/Aug/14 ] |
|
But wouldn't that cause your engineers who work on future patches to have a different code base from ours, with its accumulated b2_4/b2_5 patches? |
| Comment by Peter Jones [ 08/Aug/14 ] |
|
Well, even without this happening it would still be quite possible that the release you baseline on lags behind the tip of the maintenance branch. If such a situation arises where there are multiple "floating" patches that affect the same area then we can always assist in resolving the conflict. I would think that introducing patch dependencies should make this easier though. It's probably simpler to talk about this on the next call than back and forth in a ticket |
| Comment by Niu Yawei (Inactive) [ 11/Aug/14 ] |
Yes, that debug patch is a little heavier than the former one (but it provides more detailed information); it wasn't meant to be carried in a production system all the time. Thanks. |