[LU-7975] "(lod_object.c:700:lod_ah_init()) ASSERTION( lc->ldo_stripenr == 0 )" LBUG/Assert on MDS Created: 01/Apr/16 Updated: 14/Jun/18 Resolved: 31/May/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Bruno Faccini (Inactive) | Assignee: | Bruno Faccini (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | cea |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
A site has encountered multiple crashes with the same signature (stack trace and messages):

LustreError: 89879:0:(osp_precreate.c:1222:osp_object_truncate()) can't punch object: -11
Lustre: composit-OST0009-osc-MDT0000: Connection to composit-OST0009 (at 10.0.14.31@o2ib) was lost; in progress operations using this service will wait for recovery to complete
LustreError: 89879:0:(lod_object.c:700:lod_ah_init()) ASSERTION( lc->ldo_stripenr == 0 ) failed:
LustreError: 89879:0:(lod_object.c:700:lod_ah_init()) LBUG
Pid: 89879, comm: mdt01_006

Call Trace:
[<ffffffffa057e895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa057ee97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa266c0af>] lod_ah_init+0x58f/0x5d0 [lod]
[<ffffffffa26c7ad3>] mdd_object_make_hint+0x83/0xa0 [mdd]
[<ffffffffa26d4502>] mdd_create_data+0x332/0x7d0 [mdd]
[<ffffffffa25a93f0>] mdt_finish_open+0x1350/0x19a0 [mdt]
[<ffffffffa257e5f4>] ? mdt_object_lock+0x14/0x20 [mdt]
[<ffffffffa25a9fbd>] mdt_open_by_fid_lock+0x57d/0x910 [mdt]
[<ffffffffa25aabac>] mdt_reint_open+0x56c/0x21a0 [mdt]
[<ffffffffa059b14c>] ? upcall_cache_get_entry+0x29c/0x890 [libcfs]
[<ffffffffa0983930>] ? lu_ucred+0x20/0x30 [obdclass]
[<ffffffffa2572945>] ? mdt_ucred+0x15/0x20 [mdt]
[<ffffffffa258f8ec>] ? mdt_root_squash+0x2c/0x410 [mdt]
[<ffffffffa123bad6>] ? __req_capsule_get+0x166/0x710 [ptlrpc]
[<ffffffffa2593ab1>] mdt_reint_rec+0x41/0xe0 [mdt]
[<ffffffffa2578f83>] mdt_reint_internal+0x4c3/0x780 [mdt]
[<ffffffffa257950e>] mdt_intent_reint+0x1ee/0x520 [mdt]
[<ffffffffa2576cee>] mdt_intent_policy+0x3ae/0x770 [mdt]
[<ffffffffa11ca2f5>] ldlm_lock_enqueue+0x135/0x980 [ptlrpc]
[<ffffffffa11f43fb>] ldlm_handle_enqueue0+0x51b/0x10c0 [ptlrpc]
[<ffffffffa25771b6>] mdt_enqueue+0x46/0xe0 [mdt]
[<ffffffffa257c84a>] mdt_handle_common+0x52a/0x1470 [mdt]
[<ffffffffa25b98f5>] mds_regular_handle+0x15/0x20 [mdt]
[<ffffffffa12238d5>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
[<ffffffffa05904fa>] ? lc_watchdog_touch+0x7a/0x190 [libcfs]
[<ffffffffa121c289>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
[<ffffffff81057849>] ? __wake_up_common+0x59/0x90
[<ffffffffa122605d>] ptlrpc_main+0xaed/0x1780 [ptlrpc]
[<ffffffffa1225570>] ? ptlrpc_main+0x0/0x1780 [ptlrpc]
[<ffffffff8109e78e>] kthread+0x9e/0xc0
[<ffffffff8100c28a>] child_rip+0xa/0x20
[<ffffffff8109e6f0>] ? kthread+0x0/0xc0
[<ffffffff8100c280>] ? child_rip+0x0/0x20

Looking through existing tickets, I found that this kind of problem has already been (at least partially) addressed in LU-4260, LU-4791 and LU-5346. I have also determined that these MDT asserts always occur just after an OSS crash, hence the -EAGAIN/EWOULDBLOCK error in the "(osp_precreate.c:1222:osp_object_truncate()) can't punch object: -11" message just preceding the assert. |
| Comments |
| Comment by Alex Zhuravlev [ 01/Apr/16 ] |
|
The good thing is that with DNE2 we have new infrastructure that allows the striping procedures (declaration and creation) to be per-thandle instead of putting intermediate state into the object. Another improvement we can make with this infrastructure is to release OSP objects after creation rather than leaving them in the cache consuming memory. |
| Comment by Bruno Faccini (Inactive) [ 01/Apr/16 ] |
|
After more analysis of the crash dump and the Lustre debug logs, I was able to find a reproducer for the situation unveiled at the customer site, which first causes the MDS error message "(osp_precreate.c:1222:osp_object_truncate()) can't punch object: -11" and, a little later, the "(lod_object.c:700:lod_ah_init()) ASSERTION( lc->ldo_stripenr == 0 ) failed" assertion/LBUG():

```
# !!! with all OSTs down !!!
mkdir /mnt/lustre/foo_dir/
lfs setstripe -c 2 /mnt/lustre/foo_dir/

# open file with O_LOV_DELAY_CREATE, then non-0 truncate
multiop /mnt/lustre/foo_dir/foo5 oO_RDWR:O_CREAT:O_LOV_DELAY_CREATE:T1050000c

# normal open
cat /etc/hosts > /mnt/lustre/foo_dir/foo5
--->>> "-bash: /mnt/lustre/foo_dir/foo5: Resource temporarily unavailable"
--->>> "(osp_precreate.c:1222:osp_object_truncate()) can't punch object: -11"

# normal open again
cat /etc/hosts > /mnt/lustre/foo_dir/foo5
--->>> "LustreError: 11915:0:(lod_object.c:700:lod_ah_init()) ASSERTION( lc->ldo_stripenr == 0 ) failed:"
```

Then, digging further into both the full MDS Lustre debug log and the source code, I found the reason for the -EAGAIN reported by osp_object_truncate(): the request is never sent over the wire, but is instead failed with -EWOULDBLOCK in ptlrpc_import_delay_req():

```c
/**
 * Based on the current state of the import, determine if the request
 * can be sent, is an error, or should be delayed.
 *
 * Returns true if this request should be delayed. If false, and
 * *status is set, then the request can not be sent and *status is the
 * error code. If false and status is 0, then request can be sent.
 *
 * The imp->imp_lock must be held.
 */
static int ptlrpc_import_delay_req(struct obd_import *imp,
                                   struct ptlrpc_request *req, int *status)
{
        int delay = 0;
        ENTRY;
        ..............
        } else if (req->rq_send_state != imp->imp_state) {
                /* invalidate in progress - any requests should be drop */
                if (cfs_atomic_read(&imp->imp_inval_count) != 0) {
                        DEBUG_REQ(D_ERROR, req, "invalidate in flight");
                        *status = -EIO;
                } else if (imp->imp_dlm_fake || req->rq_no_delay) {
                        *status = -EWOULDBLOCK;    <<<<<<<<<<<<<<<<<<<<
        .........................
```

because osp_object_truncate() flags its punch request as no-delay/no-resend:

```c
int osp_object_truncate(const struct lu_env *env, struct dt_object *dt,
                        __u64 size)
{
        ...............
        /*
         * XXX: decide how do we do here with resend
         * if we don't resend, then client may see wrong file size
         * if we do resend, then MDS thread can get stuck for quite long
         */
        req->rq_no_resend = req->rq_no_delay = 1;
        ................
```

So this raises more implications: I think a resend (or some other retry mechanism) may no longer be optional at the moment if we do not want a data-corruption failure (the effect of the non-zero truncate() being silently lost) when this particular scenario occurs. Even though the cleanup of the allocated striping state on the error return path must be fixed in the MDS code, the error should not be returned to the user; otherwise further open() calls will fail and the effect of the non-zero truncate() will be lost. |
| Comment by Alex Zhuravlev [ 01/Apr/16 ] |
|
As for truncate, there is a ticket to move this functionality to the client side. This could be a good time to do it. |
| Comment by Gerrit Updater [ 01/Apr/16 ] |
|
Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/19301 |
| Comment by Gerrit Updater [ 01/Apr/16 ] |
|
Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: http://review.whamcloud.com/19302 |
| Comment by Bruno Faccini (Inactive) [ 01/Apr/16 ] |
|
Patch http://review.whamcloud.com/19301 fixes the cleanup in the delayed-stripe error path and also implements a resend mechanism on the MDS side; it may keep an MDS thread busy for some time while doing so. Patch http://review.whamcloud.com/19302 is another way to fix the problem: it performs the same necessary cleanup in the delayed-stripe error path, but offloads the resend mechanism to the client side, which may be less intrusive. |
| Comment by Bruno Faccini (Inactive) [ 24/May/16 ] |
|
http://review.whamcloud.com/19301 has been abandoned in favor of http://review.whamcloud.com/19302, following the reviewers' comments and their choice between the two solutions. |
| Comment by Gerrit Updater [ 31/May/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19302/ |
| Comment by Peter Jones [ 31/May/16 ] |
|
Landed for 2.9 |