Details
-
Bug
-
Resolution: Fixed
-
Critical
-
Lustre 2.4.0
-
3
-
6847
Description
If a successful reply is lost for an intent lock request, MDS will not correctly recover from this situation on resend.
The cause of which seems to be the code of ldlm_handle_enqueue0() and mdt_intent_fixup_resent()
int ldlm_handle_enqueue0(struct ldlm_namespace *ns, struct ptlrpc_request *req, const struct ldlm_request *dlm_req, const struct ldlm_callback_suite *cbs) { ... /* The lock's callback data might be set in the policy function */ lock = ldlm_lock_create(ns, &dlm_req->lock_desc.l_resource.lr_name, dlm_req->lock_desc.l_resource.lr_type, dlm_req->lock_desc.l_req_mode, cbs, NULL, 0); ... lock->l_export = class_export_lock_get(req->rq_export, lock); if (lock->l_export->exp_lock_hash) { cfs_hash_add(lock->l_export->exp_lock_hash, &lock->l_remote_handle, &lock->l_exp_hash); } ... err = ldlm_lock_enqueue(ns, &lock, cookie, &flags); ... }
static void mdt_intent_fixup_resent(struct mdt_thread_info *info, struct ldlm_lock *new_lock, struct ldlm_lock **old_lock, struct mdt_lock_handle *lh) { struct ptlrpc_request *req = mdt_info_req(info); struct obd_export *exp = req->rq_export; struct lustre_handle remote_hdl; struct ldlm_request *dlmreq; struct ldlm_lock *lock; if (!(lustre_msg_get_flags(req->rq_reqmsg) & MSG_RESENT)) return; dlmreq = req_capsule_client_get(info->mti_pill, &RMF_DLM_REQ); remote_hdl = dlmreq->lock_handle[0]; lock = cfs_hash_lookup(exp->exp_lock_hash, &remote_hdl); if (lock) { if (lock != new_lock) { ... }
On resend, ldlm_handle_enqueue0() add the new lock into hash even though there's already a granted lock with the same remote handle. mdt_intent_fixup_resent() will find the newly added lock in hash and ignore it. This will cause to an enqueue request on the newly created lock, deadlock and client eviction.
Alexey thinks that the problem has existed since we moved from the correct code:
static void fixup_handle_for_resent_req(struct ptlrpc_request *req, int offset, struct ldlm_lock *new_lock, struct ldlm_lock **old_lock, struct lustre_handle *lockh) { struct obd_export *exp = req->rq_export; struct ldlm_request *dlmreq = lustre_msg_buf(req->rq_reqmsg, offset, sizeof(*dlmreq)); struct lustre_handle remote_hdl = dlmreq->lock_handle[0]; struct list_head *iter; if (!(lustre_msg_get_flags(req->rq_reqmsg) & MSG_RESENT)) return; spin_lock(&exp->exp_ldlm_data.led_lock); list_for_each(iter, &exp->exp_ldlm_data.led_held_locks) { struct ldlm_lock *lock; lock = list_entry(iter, struct ldlm_lock, l_export_chain); if (lock == new_lock) continue; <==================== N.B. if (lock->l_remote_handle.cookie == remote_hdl.cookie) { lockh->cookie = lock->l_handle.h_cookie; LDLM_DEBUG(lock, "restoring lock cookie"); DEBUG_REQ(D_DLMTRACE, req,"restoring lock cookie "LPX64, lockh->cookie); if (old_lock) *old_lock = LDLM_LOCK_GET(lock); spin_unlock(&exp->exp_ldlm_data.led_lock); return; } } ... }
Logs for this issue will follow.
Attachments
Issue Links
- is related to
-
LU-5266 LBUG on Failover -ldlm_process_extent_lock()) ASSERTION( lock->l_granted_mode != lock->l_req_mode )
-
- Resolved
-
-
LU-4584 Lock revocation process fails consistently
-
- Resolved
-
-
LU-5314 Lustre 2.4.2 MDS hit LBUG and crash
-
- Resolved
-
- is related to
-
LU-5530 MDS thread lockup witrh patched 2.5 server
-
- Resolved
-
Activity
I tested master with the patches as well and got the same results.
With the patches the evictions are gone which is good. Now its just locking up.
I am afraid I am not able to conclude about what is going wrong, using this only MDS Lustre debug-log. Are there still similar stacks being dumped in dmesg/syslog ?
Also, I did not find any eviction trace in this log.
IOR single shared file run is producing the below messages. Also I have a log dumps that I uploaded to
ftp.whamcloud.com/uploads/LU-2827/lustre-log-LU-2827-29-05-2014
May 29 18:47:39 atlas-tds-mds1 kernel: [12046.175064] Lustre: 17065:0:(service.c:1349:ptlrpc_at_send_early_reply()) Skipped 8 previous similar messages
May 29 18:47:45 atlas-tds-mds1 kernel: [12052.188755] Lustre: atlastds-MDT0000: Client 23d4a774-8c3b-4edc-3b98-24d112966233 (at 14@gni2) reconnecting
May 29 18:47:45 atlas-tds-mds1 kernel: [12052.199379] Lustre: Skipped 1 previous similar message
May 29 19:00:15 atlas-tds-mds1 kernel: [12801.706037] Lustre: 17829:0:(service.c:1349:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
May 29 19:00:15 atlas-tds-mds1 kernel: [12801.706039] req@ffff8807e74adc00 x1468567563329976/t0(0) o101->73ecb9e8-d229-752a-3cfc-2036f509fb5b@50@gni2:0/0 lens 512/3512 e 0 to 0 dl 1401404420 ref 2 fl Interpret:/0/0 rc 0/0
May 29 19:00:15 atlas-tds-mds1 kernel: [12802.233175] Lustre: 17062:0:(service.c:1349:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
May 29 19:00:15 atlas-tds-mds1 kernel: [12802.233176] req@ffff8807e76d6400 x1468567565457596/t0(0) o101->e8d3ca06-b910-df85-78e9-5653469b4bb8@81@gni2:0/0 lens 512/3512 e 0 to 0 dl 1401404420 ref 2 fl Interpret:/0/0 rc 0/0
May 29 19:00:15 atlas-tds-mds1 kernel: [12802.263339] Lustre: 17062:0:(service.c:1349:ptlrpc_at_send_early_reply()) Skipped 4 previous similar messages
May 29 19:00:21 atlas-tds-mds1 kernel: [12808.519392] Lustre: atlastds-MDT0000: Client 568959a5-c11b-5ded-e1c9-1c90892a82ca (at 49@gni2) reconnecting
Tested it first on my smallest size test bed and the two patches that I combined and back port to b2_5 did pretty well. So I moved the test to a larger system and when I went to run my simul reproducer I now get this:
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.308755] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.317414] mdt02_247 D 0000000000000008 0 17580 2 0x00000000
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.324711] ffff88104d7b98c8 0000000000000046 0000000000000000 ffffffffa0e0f4ab
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.333113] ffff8810719e5310 ffff8810719e52b8 ffff88104d7a5538 ffffffffa0e0f4ab
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.341442] ffff88104d7b7af8 ffff88104d7b9fd8 000000000000fbc8 ffff88104d7b7af8
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.349651] Call Trace:
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.352455] [<ffffffff8152a6d5>] rwsem_down_failed_common+0x95/0x1d0
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.359226] [<ffffffffa0c0c32b>] ? ldiskfs_xattr_trusted_get+0x2b/0x30 [ldiskfs]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.367490] [<ffffffff811ae017>] ? generic_getxattr+0x87/0x90
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.373729] [<ffffffff8152a866>] rwsem_down_read_failed+0x26/0x30
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.380272] [<ffffffffa08b7083>] ? lod_xattr_get+0x153/0x420 [lod]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.386972] [<ffffffff8128eab4>] call_rwsem_down_read_failed+0x14/0x30
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.393901] [<ffffffff81529d64>] ? down_read+0x24/0x30
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.399595] [<ffffffffa0dd949d>] mdt_object_open_lock+0x1ed/0x9d0 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.406680] [<ffffffffa0dbb7ac>] ? mdt_attr_get_complex+0x4ec/0x770 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.413960] [<ffffffffa0de1ac7>] mdt_reint_open+0x15b7/0x2150 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.420672] [<ffffffffa0453f76>] ? upcall_cache_get_entry+0x296/0x880 [libcfs]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.428803] [<ffffffffa05f3700>] ? lu_ucred+0x20/0x30 [obdclass]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.435224] [<ffffffffa0dca441>] mdt_reint_rec+0x41/0xe0 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.441611] [<ffffffffa0dafe63>] mdt_reint_internal+0x4c3/0x780 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.448515] [<ffffffffa0db03ee>] mdt_intent_reint+0x1ee/0x520 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.455274] [<ffffffffa0dadbce>] mdt_intent_policy+0x3ae/0x770 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.462212] [<ffffffffa070e511>] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.469289] [<ffffffffa0737b9f>] ldlm_handle_enqueue0+0x51f/0x10f0 [ptlrpc]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.476684] [<ffffffffa0dae096>] mdt_enqueue+0x46/0xe0 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.482790] [<ffffffffa0db2c5a>] mdt_handle_common+0x52a/0x1470 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.489687] [<ffffffffa0def765>] mds_regular_handle+0x15/0x20 [mdt]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.496399] [<ffffffffa0767c25>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.504762] [<ffffffffa04384ce>] ? cfs_timer_arm+0xe/0x10 [libcfs]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.511492] [<ffffffffa04493cf>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.518772] [<ffffffffa075f2a9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.525902] [<ffffffff810546b9>] ? __wake_up_common+0x59/0x90
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.532184] [<ffffffffa0768f8d>] ptlrpc_main+0xaed/0x1920 [ptlrpc]
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.545438] [<ffffffff8109ab56>] kthread+0x96/0xa0
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.550656] [<ffffffff8100c20a>] child_rip+0xa/0x20
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.556068] [<ffffffff8109aac0>] ? kthread+0x0/0xa0
May 29 18:27:03 atlas-tds-mds1 kernel: [10809.561451] [<ffffffff8100c200>] ? child_rip+0x0/0x20
Its better in that the MDS doesn't assert any more.I also canceled the job and the system recovered which is a good sign.
I spent some time to reproduce the MDS LBUG//LASSERT(lock->l_granted_mode != lock->l_req_mode) I had when running the simple reproducer from LU-4584 on a Lustre build with this ticket/Gerrit-change LU-2827/#5978-[8,9] and here is what seems to happen :
_ in fact even a simple dd to create a file with a default and significant striping (I am using 154 OSTs), right after file-system start/mount triggers the problem ...
_ what I see in the Client/MDS full debug logs then is that, as expected after LU-4584 scenario analysis, the Client resend occurs during a layout lock (MDS_INODELOCK_LAYOUT) reply from the server that is truncated due to the striping/layout size exceeding reply buffer size :
Client 1st/original request
00000080:00010000:2.0:1400265787.405960:0:2750:0:(file.c:3998:ll_layout_refresh()) ### lustre: requeue layout lock for file [0x200002340:0x1:0x0](ffff880320dbc678) 00000080:00000001:2.0:1400265787.405962:0:2750:0:(obd_class.h:1691:md_enqueue()) Process entered 00800000:00000001:2.0:1400265787.405963:0:2750:0:(lmv_obd.c:1949:lmv_enqueue()) Process entered 00800000:00000001:2.0:1400265787.405964:0:2750:0:(lmv_obd.c:620:lmv_check_connect()) Process entered 00800000:00000001:2.0:1400265787.405965:0:2750:0:(lmv_obd.c:623:lmv_check_connect()) Process leaving (rc=0 : 0 : 0) 00800000:00000002:2.0:1400265787.405967:0:2750:0:(lmv_obd.c:1956:lmv_enqueue()) ENQUEUE 'layout' on [0x200002340:0x1:0x0] 00800000:00000002:2.0:1400265787.405968:0:2750:0:(lmv_obd.c:1963:lmv_enqueue()) ENQUEUE 'layout' on [0x200002340:0x1:0x0] -> mds #0 00800000:00000001:2.0:1400265787.405970:0:2750:0:(obd_class.h:1691:md_enqueue()) Process entered 00000002:00000001:2.0:1400265787.405971:0:2750:0:(mdc_locks.c:802:mdc_enqueue()) Process entered 00000002:00000001:2.0:1400265787.405972:0:2750:0:(mdc_locks.c:511:mdc_intent_layout_pack()) Process entered 00000100:00000010:2.0:1400265787.405973:0:2750:0:(client.c:410:ptlrpc_request_cache_alloc()) slab-alloced 'req': 976 at ffff8802faff9400. 00000020:00000040:2.0:1400265787.405974:0:2750:0:(genops.c:1022:class_import_get()) import ffff880328196800 refcount=6 obd=lustre-MDT0000-mdc-ffff880331322400 00010000:00000001:2.0:1400265787.405977:0:2750:0:(ldlm_request.c:766:ldlm_prep_elc_req()) Process entered 00010000:00000001:2.0:1400265787.405978:0:2750:0:(ldlm_request.c:1649:ldlm_prepare_lru_list()) Process entered 00010000:00000001:2.0:1400265787.405980:0:2750:0:(ldlm_lock.c:211:ldlm_lock_put()) Process entered 00010000:00000001:2.0:1400265787.405981:0:2750:0:(ldlm_lock.c:244:ldlm_lock_put()) Process leaving 00010000:00000001:2.0:1400265787.405982:0:2750:0:(ldlm_request.c:1771:ldlm_prepare_lru_list()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.405984:0:2750:0:(client.c:584:__ptlrpc_request_bufs_pack()) Process entered 02000000:00000001:2.0:1400265787.405985:0:2750:0:(sec.c:416:sptlrpc_req_get_ctx()) Process entered 02000000:00000001:2.0:1400265787.405986:0:2750:0:(sec.c:434:sptlrpc_req_get_ctx()) Process leaving (rc=0 : 0 : 0) 02000000:00000010:2.0:1400265787.405988:0:2750:0:(sec_null.c:174:null_alloc_reqbuf()) kmalloced 'req->rq_reqbuf': 512 at ffff8802f7eb1800. 00000100:00000001:2.0:1400265787.405990:0:2750:0:(client.c:638:__ptlrpc_request_bufs_pack()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:2.0:1400265787.405991:0:2750:0:(ldlm_request.c:1886:ldlm_cli_cancel_list()) Process entered 00010000:00000001:2.0:1400265787.405992:0:2750:0:(ldlm_request.c:1889:ldlm_cli_cancel_list()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:2.0:1400265787.405994:0:2750:0:(ldlm_request.c:1886:ldlm_cli_cancel_list()) Process entered 00010000:00000001:2.0:1400265787.405995:0:2750:0:(ldlm_request.c:1889:ldlm_cli_cancel_list()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:2.0:1400265787.405996:0:2750:0:(ldlm_request.c:817:ldlm_prep_elc_req()) Process leaving (rc=0 : 0 : 0) 00000002:00000001:2.0:1400265787.405998:0:2750:0:(mdc_locks.c:538:mdc_intent_layout_pack()) Process leaving (rc=18446612145115206656 : -131928594344960 : ffff8802faff9400) 00000002:00000001:2.0:1400265787.406000:0:2750:0:(lustre_mdc.h:107:mdc_get_rpc_lock()) Process entered 00010000:00000001:2.0:1400265787.406001:0:2750:0:(ldlm_request.c:875:ldlm_cli_enqueue()) Process entered 00010000:00000001:2.0:1400265787.406002:0:2750:0:(ldlm_lock.c:1596:ldlm_lock_create()) Process entered 00010000:00000001:2.0:1400265787.406003:0:2750:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00010000:00000040:2.0:1400265787.406004:0:2750:0:(ldlm_resource.c:1189:ldlm_resource_getref()) getref res: ffff880325333840 count: 2 00010000:00000001:2.0:1400265787.406006:0:2750:0:(ldlm_lock.c:472:ldlm_lock_new()) Process entered 00010000:00000010:2.0:1400265787.406007:0:2750:0:(ldlm_lock.c:477:ldlm_lock_new()) slab-alloced 'lock': 504 at ffff880320f970c0. 00000020:00000001:2.0:1400265787.406008:0:2750:0:(lustre_handles.c:80:class_handle_hash()) Process entered 00000020:00000040:2.0:1400265787.406009:0:2750:0:(lustre_handles.c:114:class_handle_hash()) added object ffff880320f970c0 with handle 0x5121400b7cdaafb2 to hash 00000020:00000001:2.0:1400265787.406011:0:2750:0:(lustre_handles.c:115:class_handle_hash()) Process leaving 00010000:00000001:2.0:1400265787.406011:0:2750:0:(ldlm_lock.c:515:ldlm_lock_new()) Process leaving (rc=18446612145752338624 : -131927957212992 : ffff880320f970c0) 00010000:00000001:2.0:1400265787.406013:0:2750:0:(ldlm_lock.c:1636:ldlm_lock_create()) Process leaving (rc=18446612145752338624 : -131927957212992 : ffff880320f970c0) 00010000:00000001:2.0:1400265787.406015:0:2750:0:(ldlm_lock.c:274:ldlm_lock_remove_from_lru()) Process entered 00010000:00000001:2.0:1400265787.406016:0:2750:0:(ldlm_lock.c:283:ldlm_lock_remove_from_lru()) Process leaving 00010000:00010000:2.0:1400265787.406017:0:2750:0:(ldlm_lock.c:797:ldlm_lock_addref_internal_nolock()) ### ldlm_lock_addref(CR) ns: lustre-MDT0000-mdc-ffff880331322400 lock: ffff880320f970c0/0x5121400b7cdaafb2 lrc: 3/1,0 mode: --/CR res: [0x200002340:0x1:0x0].0 bits 0x0 rrc: 2 type: IBT flags: 0x10000000000000 nid: local remote: 0x0 expref: -99 pid: 2750 timeout: 0 lvb_type: 3 00010000:00010000:2.0:1400265787.406023:0:2750:0:(ldlm_request.c:926:ldlm_cli_enqueue()) ### client-side enqueue START, flags 1000 ns: lustre-MDT0000-mdc-ffff880331322400 lock: ffff880320f970c0/0x5121400b7cdaafb2 lrc: 3/1,0 mode: --/CR res: [0x200002340:0x1:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 2750 timeout: 0 lvb_type: 3 00010000:00010000:2.0:1400265787.406028:0:2750:0:(ldlm_request.c:988:ldlm_cli_enqueue()) ### sending request ns: lustre-MDT0000-mdc-ffff880331322400 lock: ffff880320f970c0/0x5121400b7cdaafb2 lrc: 3/1,0 mode: --/CR res: [0x200002340:0x1:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x0 nid: local remote: 0x0 expref: -99 pid: 2750 timeout: 0 lvb_type: 3 00000100:00000001:2.0:1400265787.406038:0:2750:0:(client.c:2676:ptlrpc_queue_wait()) Process entered 00000100:00000001:2.0:1400265787.406039:0:2750:0:(client.c:859:ptlrpc_prep_set()) Process entered 00000100:00000010:2.0:1400265787.406040:0:2750:0:(client.c:860:ptlrpc_prep_set()) kmalloced 'set': 144 at ffff8803287f55c0. 00000100:00000001:2.0:1400265787.406041:0:2750:0:(client.c:876:ptlrpc_prep_set()) Process leaving (rc=18446612145878554048 : -131927830997568 : ffff8803287f55c0) 00000100:00000001:2.0:1400265787.406043:0:2750:0:(client.c:2605:ptlrpc_request_addref()) Process entered 00000100:00000001:2.0:1400265787.406044:0:2750:0:(client.c:2607:ptlrpc_request_addref()) Process leaving (rc=18446612145115206656 : -131928594344960 : ffff8802faff9400) 00000020:00000001:2.0:1400265787.406046:0:2750:0:(class_obd.c:138:lustre_get_jobid()) Process entered 00000020:00000001:2.0:1400265787.406047:0:2750:0:(class_obd.c:143:lustre_get_jobid()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.406048:0:2750:0:(client.c:2111:ptlrpc_set_wait()) Process entered 00000100:00000001:2.0:1400265787.406049:0:2750:0:(client.c:1415:ptlrpc_send_new_req()) Process entered 00000100:00000040:2.0:1400265787.406051:0:2750:0:(lustre_net.h:3296:ptlrpc_rqphase_move()) @@@ move req "New" -> "Rpc" req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/1544 e 0 to 0 dl 0 ref 2 fl New:/0/ffffffff rc 0/-1 00000100:00000001:2.0:1400265787.406057:0:2750:0:(client.c:1064:ptlrpc_import_delay_req()) Process entered 00000100:00000001:2.0:1400265787.406058:0:2750:0:(client.c:1115:ptlrpc_import_delay_req()) Process leaving (rc=0 : 0 : 0) 02000000:00000001:2.0:1400265787.406059:0:2750:0:(sec.c:659:sptlrpc_req_refresh_ctx()) Process entered 02000000:00000001:2.0:1400265787.406060:0:2750:0:(sec.c:686:sptlrpc_req_refresh_ctx()) Process leaving (rc=0 : 0 : 0) 00000100:00100000:2.0:1400265787.406062:0:2750:0:(client.c:1478:ptlrpc_send_new_req()) Sending RPC pname:cluuid:pid:xid:nid:opc file:a56bcdbb-ca09-9568-70fd-b3f47d606d12:2750:1466802977730748:10.10.4.17@tcp:101 00000100:00000001:2.0:1400265787.406065:0:2750:0:(niobuf.c:682:ptl_send_rpc()) Process entered 02000000:00000001:2.0:1400265787.406066:0:2750:0:(sec.c:961:sptlrpc_cli_wrap_request()) Process entered 02000000:00000001:2.0:1400265787.406066:0:2750:0:(sec.c:997:sptlrpc_cli_wrap_request()) Process leaving (rc=0 : 0 : 0) 02000000:00000001:2.0:1400265787.406068:0:2750:0:(sec.c:1689:sptlrpc_cli_alloc_repbuf()) Process entered 02000000:00000010:2.0:1400265787.406069:0:2750:0:(sec_null.c:217:null_alloc_repbuf()) kmalloced 'req->rq_repbuf': 2048 at ffff880308075800. 02000000:00000001:2.0:1400265787.406070:0:2750:0:(sec.c:1699:sptlrpc_cli_alloc_repbuf()) Process leaving (rc=0 : 0 : 0) 00000400:00000010:2.0:1400265787.406072:0:2750:0:(lib-lnet.h:500:lnet_me_alloc()) alloc '(me)': 96 at ffff880320454dc0 (tot 83133036). 00000400:00000010:2.0:1400265787.406075:0:2750:0:(lib-lnet.h:468:lnet_md_alloc()) alloc '(md)': 128 at ffff880320454d40 (tot 83133164). 00000100:00000200:2.0:1400265787.406077:0:2750:0:(niobuf.c:808:ptl_send_rpc()) Setup reply buffer: 2048 bytes, xid 1466802977730748, portal 10 00000100:00000001:2.0:1400265787.406079:0:2750:0:(client.c:2605:ptlrpc_request_addref()) Process entered 00000100:00000001:2.0:1400265787.406079:0:2750:0:(client.c:2607:ptlrpc_request_addref()) Process leaving (rc=18446612145115206656 : -131928594344960 : ffff8802faff9400) 00000100:00000040:2.0:1400265787.406082:0:2750:0:(niobuf.c:829:ptl_send_rpc()) @@@ send flg=0 req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/1544 e 0 to 0 dl 1400265794 ref 3 fl Rpc:/0/ffffffff rc 0/-1 00000100:00000001:2.0:1400265787.406088:0:2750:0:(niobuf.c:60:ptl_send_buf()) Process entered
Server handle+reply
00000100:00100000:0.0:1400265787.402458:0:11385:0:(service.c:1939:ptlrpc_server_handle_req_in()) got req x1466802977730748 00000020:00000001:0.0:1400265787.402460:0:11385:0:(genops.c:741:class_conn2export()) Process entered 00000020:00000040:0.0:1400265787.402460:0:11385:0:(genops.c:753:class_conn2export()) looking for export cookie 0xd169abc5abfab541 00000020:00000001:0.0:1400265787.402462:0:11385:0:(lustre_handles.c:172:class_handle2object()) Process entered 00000020:00000040:0.0:1400265787.402463:0:11385:0:(genops.c:839:class_export_get()) GETting export ffff880212978800 : new refcount 12 00000020:00000001:0.0:1400265787.402464:0:11385:0:(lustre_handles.c:195:class_handle2object()) Process leaving (rc=18446612141216073728 : -131932493477888 : ffff880212978800) 00000020:00000001:0.0:1400265787.402466:0:11385:0:(genops.c:755:class_conn2export()) Process leaving (rc=18446612141216073728 : -131932493477888 : ffff880212978800) 00000100:00000001:0.0:1400265787.402468:0:11385:0:(service.c:1068:ptlrpc_update_export_timer()) Process entered 00000100:00000001:0.0:1400265787.402469:0:11385:0:(service.c:1081:ptlrpc_update_export_timer()) Process leaving 00000020:00000010:0.0:1400265787.402470:0:11385:0:(lu_object.c:1643:keys_init()) kmalloced 'ctx->lc_value': 320 at ffff880268932c00. 02000000:00000010:0.0:1400265787.402472:0:11385:0:(capa.c:416:lu_capainfo_key_init()) kmalloced 'value': 128 at ffff880212f7cac0. 00000020:00000010:0.0:1400265787.402474:0:11385:0:(lu_ucred.c:53:lu_ucred_key_init()) kmalloced 'value': 72 at ffff880213336bc0. 00000020:00000010:0.0:1400265787.402475:0:11385:0:(tgt_main.c:187:tgt_ses_key_init()) kmalloced 'value': 144 at ffff880215e7a740. 00000100:00000040:0.0:1400265787.402477:0:11385:0:(service.c:1203:ptlrpc_at_set_timer()) armed mdt at +1s 00000100:00000001:0.0:1400265787.402479:0:11385:0:(service.c:1670:ptlrpc_server_request_add()) Process entered 00000100:00000001:0.0:1400265787.402479:0:11385:0:(service.c:1571:ptlrpc_server_hpreq_init()) Process entered 00000100:00000001:0.0:1400265787.402481:0:11385:0:(service.c:1622:ptlrpc_server_hpreq_init()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1400265787.402482:0:11385:0:(service.c:1685:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1400265787.402486:0:11385:0:(service.c:2002:ptlrpc_server_handle_req_in()) Process leaving (rc=1 : 1 : 1) 00000400:00000001:0.0:1400265787.402488:0:11385:0:(watchdog.c:433:lc_watchdog_disable()) Process entered 00000400:00000001:0.0:1400265787.402489:0:11385:0:(watchdog.c:441:lc_watchdog_disable()) Process leaving 00000400:00000001:0.0:1400265787.402491:0:11385:0:(watchdog.c:416:lc_watchdog_touch()) Process entered 00000400:00000001:0.0:1400265787.402492:0:11385:0:(watchdog.c:427:lc_watchdog_touch()) Process leaving 00000100:00000001:0.0:1400265787.402493:0:11385:0:(service.c:2025:ptlrpc_server_handle_request()) Process entered 00000100:00000001:0.0:1400265787.402494:0:11385:0:(service.c:1803:ptlrpc_server_request_get()) Process entered 00000100:00100000:0.0:1400265787.402496:0:11385:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-10.10.4.10@tcp, seq: 172 00000100:00000040:0.0:1400265787.402498:0:11385:0:(service.c:1841:ptlrpc_server_request_get()) RPC GETting export ffff880212978800 : new rpc_count 1 00000100:00000001:0.0:1400265787.402499:0:11385:0:(service.c:1843:ptlrpc_server_request_get()) Process leaving (rc=18446612141172062208 : -131932537489408 : ffff88020ff7f800) 00000100:00000040:0.0:1400265787.402502:0:11385:0:(lustre_net.h:3296:ptlrpc_rqphase_move()) @@@ move req "New" -> "Interpret" req@ffff88020ff7f800 x1466802977730748/t0(0) o101->a56bcdbb-ca09-9568-70fd-b3f47d606d12@10.10.4.10@tcp:0/0 lens 376/0 e 0 to 0 dl 1400265793 ref 1 fl New:/0/ffffffff rc 0/-1 00000100:00000001:0.0:1400265787.402508:0:11385:0:(service.c:1068:ptlrpc_update_export_timer()) Process entered 00000100:00000001:0.0:1400265787.402509:0:11385:0:(service.c:1081:ptlrpc_update_export_timer()) Process leaving 00000100:00100000:0.0:1400265787.402511:0:11385:0:(service.c:2086:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_002:a56bcdbb-ca09-9568-70fd-b3f47d606d12+12:2750:x1466802977730748:12345-10.10.4.10@tcp:101 00000100:00000200:0.0:1400265787.402513:0:11385:0:(service.c:2091:ptlrpc_server_handle_request()) got req 1466802977730748 00000020:00000001:0.0:1400265787.402515:0:11385:0:(tgt_handler.c:589:tgt_request_handle()) Process entered 00000020:00000001:0.0:1400265787.402516:0:11385:0:(tgt_handler.c:553:tgt_handler_find_check()) Process entered 00000020:00000001:0.0:1400265787.402517:0:11385:0:(tgt_handler.c:575:tgt_handler_find_check()) Process leaving (rc=18446744072111134752 : -1598416864 : ffffffffa0ba1820) 00000020:00000001:0.0:1400265787.402519:0:11385:0:(tgt_handler.c:492:tgt_handle_recovery()) Process entered 00000020:00000001:0.0:1400265787.402520:0:11385:0:(tgt_handler.c:542:tgt_handle_recovery()) Process leaving (rc=1 : 1 : 1) 00000020:00000001:0.0:1400265787.402522:0:11385:0:(tgt_handler.c:377:tgt_handle_request0()) Process entered 00000020:00000001:0.0:1400265787.402522:0:11385:0:(tgt_handler.c:307:tgt_request_preprocess()) Process entered 00000020:00000001:0.0:1400265787.402524:0:11385:0:(tgt_handler.c:362:tgt_request_preprocess()) Process leaving (rc=0 : 0 : 0) 00000020:00000001:0.0:1400265787.402525:0:11385:0:(tgt_handler.c:1220:tgt_enqueue()) Process entered 00010000:00000001:0.0:1400265787.402526:0:11385:0:(ldlm_lockd.c:1165:ldlm_handle_enqueue0()) Process entered 00010000:00010000:0.0:1400265787.402527:0:11385:0:(ldlm_lockd.c:1167:ldlm_handle_enqueue0()) ### server-side enqueue handler START 00010000:00000001:0.0:1400265787.402528:0:11385:0:(ldlm_lockd.c:1556:ldlm_request_cancel()) Process entered 00010000:00000001:0.0:1400265787.402529:0:11385:0:(ldlm_lockd.c:1560:ldlm_request_cancel()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:0.0:1400265787.402531:0:11385:0:(ldlm_lock.c:1596:ldlm_lock_create()) Process entered 00010000:00000001:0.0:1400265787.402532:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00010000:00000040:0.0:1400265787.402534:0:11385:0:(ldlm_resource.c:1189:ldlm_resource_getref()) getref res: ffff8802bdc23a80 count: 2 00010000:00000001:0.0:1400265787.402535:0:11385:0:(ldlm_lock.c:472:ldlm_lock_new()) Process entered 00010000:00000010:0.0:1400265787.402537:0:11385:0:(ldlm_lock.c:477:ldlm_lock_new()) slab-alloced 'lock': 504 at ffff880212a98ac0. 00000020:00000001:0.0:1400265787.402538:0:11385:0:(lustre_handles.c:80:class_handle_hash()) Process entered 00000020:00000040:0.0:1400265787.402539:0:11385:0:(lustre_handles.c:114:class_handle_hash()) added object ffff880212a98ac0 with handle 0xd169abc5abfab66e to hash 00000020:00000001:0.0:1400265787.402541:0:11385:0:(lustre_handles.c:115:class_handle_hash()) Process leaving 00010000:00000001:0.0:1400265787.402542:0:11385:0:(ldlm_lock.c:515:ldlm_lock_new()) Process leaving (rc=18446612141217254080 : -131932492297536 : ffff880212a98ac0) 00010000:00000001:0.0:1400265787.402544:0:11385:0:(ldlm_lock.c:1636:ldlm_lock_create()) Process leaving (rc=18446612141217254080 : -131932492297536 : ffff880212a98ac0) 00010000:00010000:0.0:1400265787.402546:0:11385:0:(ldlm_lockd.c:1258:ldlm_handle_enqueue0()) ### server-side enqueue handler, new lock created ns: mdt-lustre-MDT0000_UUID lock: ffff880212a98ac0/0xd169abc5abfab66e lrc: 2/0,0 mode: --/CR res: [0x200002340:0x1:0x0].0 bits 0x0 rrc: 2 type: IBT flags: 0x40000000000000 nid: local remote: 0x5121400b7cdaafb2 expref: -99 pid: 11385 timeout: 0 lvb_type: 0 00010000:00000040:0.0:1400265787.402552:0:11385:0:(ldlm_lockd.c:1270:ldlm_handle_enqueue0()) lock GETting export ffff880212978800 : new locks_count 9 00000020:00000040:0.0:1400265787.402553:0:11385:0:(genops.c:839:class_export_get()) GETting export ffff880212978800 : new refcount 13 00010000:00000001:0.0:1400265787.402555:0:11385:0:(ldlm_lock.c:1666:ldlm_lock_enqueue()) Process entered 00000004:00000001:0.0:1400265787.402556:0:11385:0:(mdt_handler.c:3497:mdt_intent_policy()) Process entered 00000004:00000001:0.0:1400265787.402558:0:11385:0:(mdt_handler.c:3434:mdt_intent_opc()) Process entered 00000004:00000001:0.0:1400265787.402559:0:11385:0:(mdt_handler.c:2735:mdt_unpack_req_pack_rep()) Process entered 00000004:00000001:0.0:1400265787.402560:0:11385:0:(mdt_handler.c:2753:mdt_unpack_req_pack_rep()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.402562:0:11385:0:(mdt_handler.c:3247:mdt_intent_layout()) Process entered 00000004:00000001:0.0:1400265787.402562:0:11385:0:(mdt_handler.c:2181:mdt_object_find()) Process entered 00000004:00000040:0.0:1400265787.402563:0:11385:0:(mdt_handler.c:2183:mdt_object_find()) Find object for [0x200002340:0x1:0x0] 00000020:00000001:0.0:1400265787.402565:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000020:00000001:0.0:1400265787.402566:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000004:00000001:0.0:1400265787.402568:0:11385:0:(mdt_handler.c:2190:mdt_object_find()) Process leaving (rc=18446612141272470360 : -131932437081256 : ffff880215f41358) 00000004:00000001:0.0:1400265787.402570:0:11385:0:(mdd_object.c:238:mdd_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.402572:0:11385:0:(lod_object.c:498:lod_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.402575:0:11385:0:(lod_object.c:502:lod_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.402577:0:11385:0:(mdd_object.c:257:mdd_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.402579:0:11385:0:(mdt_internal.h:559:mdt_object_put()) Process entered 00000020:00000001:0.0:1400265787.402579:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000004:00000001:0.0:1400265787.402581:0:11385:0:(mdt_internal.h:561:mdt_object_put()) Process leaving 00000100:00000001:0.0:1400265787.402582:0:11385:0:(pack_generic.c:337:lustre_pack_reply_v2()) Process entered 02000000:00000001:0.0:1400265787.402583:0:11385:0:(sec.c:2113:sptlrpc_svc_alloc_rs()) Process entered 02000000:00000010:0.0:1400265787.402587:0:11385:0:(sec_null.c:320:null_alloc_rs()) kmalloced 'rs': 4336 at ffff88020f756000. 02000000:00000001:0.0:1400265787.402588:0:11385:0:(sec.c:2149:sptlrpc_svc_alloc_rs()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1400265787.402590:0:11385:0:(pack_generic.c:371:lustre_pack_reply_v2()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.402592:0:11385:0:(mdt_handler.c:3284:mdt_intent_layout()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.402593:0:11385:0:(mdt_handler.c:3483:mdt_intent_opc()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.402594:0:11385:0:(mdt_handler.c:3536:mdt_intent_policy()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:0.0:1400265787.402596:0:11385:0:(ldlm_inodebits.c:188:ldlm_process_inodebits_lock()) Process entered 00010000:00000001:0.0:1400265787.402597:0:11385:0:(ldlm_inodebits.c:90:ldlm_inodebits_compat_queue()) Process entered 00010000:00000001:0.0:1400265787.402598:0:11385:0:(ldlm_inodebits.c:163:ldlm_inodebits_compat_queue()) Process leaving (rc=1 : 1 : 1) 00010000:00000001:0.0:1400265787.402600:0:11385:0:(ldlm_inodebits.c:90:ldlm_inodebits_compat_queue()) Process entered 00010000:00000001:0.0:1400265787.402601:0:11385:0:(ldlm_inodebits.c:163:ldlm_inodebits_compat_queue()) Process leaving (rc=1 : 1 : 1) 00010000:00000001:0.0:1400265787.402602:0:11385:0:(ldlm_lock.c:1146:ldlm_grant_lock()) Process entered 00010000:00000001:0.0:1400265787.402603:0:11385:0:(ldlm_lock.c:1123:ldlm_grant_lock_with_skiplist()) Process entered 00010000:00000001:0.0:1400265787.402604:0:11385:0:(ldlm_lock.c:1003:search_granted_lock()) Process entered 00010000:00000001:0.0:1400265787.402605:0:11385:0:(ldlm_lock.c:1074:search_granted_lock()) Process leaving 00010000:00000001:0.0:1400265787.402605:0:11385:0:(ldlm_lock.c:1086:ldlm_granted_list_add_lock()) Process entered 00010000:00000040:0.0:1400265787.402606:0:11385:0:(ldlm_resource.c:1414:ldlm_resource_dump()) --- Resource: [0x200002340:0x1:0x0].0 (ffff8802bdc23a80) refcount = 2 00010000:00000040:0.0:1400265787.402608:0:11385:0:(ldlm_resource.c:1417:ldlm_resource_dump()) Granted locks (in reverse order): 00010000:00000040:0.0:1400265787.402610:0:11385:0:(ldlm_resource.c:1420:ldlm_resource_dump()) ### ### ns: mdt-lustre-MDT0000_UUID lock: ffff880212a982c0/0xd169abc5abfab64b lrc: 2/0,0 mode: PR/PR res: [0x200002340:0x1:0x0].0 bits 0x2 rrc: 2 type: IBT flags: 0x40200000000000 nid: 10.10.4.10@tcp remote: 0x5121400b7cdaafa4 expref: 13 pid: 11385 timeout: 0 lvb_type: 0 00010000:00010000:0.0:1400265787.402617:0:11385:0:(ldlm_lock.c:1091:ldlm_granted_list_add_lock()) ### About to add lock: ns: mdt-lustre-MDT0000_UUID lock: ffff880212a98ac0/0xd169abc5abfab66e lrc: 3/0,0 mode: CR/CR res: [0x200002340:0x1:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x50000000000000 nid: 10.10.4.10@tcp remote: 0x5121400b7cdaafb2 expref: 13 pid: 11385 timeout: 0 lvb_type: 3 00010000:00000001:0.0:1400265787.402622:0:11385:0:(ldlm_lock.c:1113:ldlm_granted_list_add_lock()) Process leaving 00010000:00000001:0.0:1400265787.402623:0:11385:0:(ldlm_lock.c:1129:ldlm_grant_lock_with_skiplist()) Process leaving 00010000:00000001:0.0:1400265787.402624:0:11385:0:(ldlm_pool.c:345:ldlm_srv_pool_recalc()) Process entered 00010000:00000001:0.0:1400265787.402625:0:11385:0:(ldlm_pool.c:349:ldlm_srv_pool_recalc()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:0.0:1400265787.402626:0:11385:0:(ldlm_lock.c:1165:ldlm_grant_lock()) Process leaving 00010000:00000001:0.0:1400265787.402627:0:11385:0:(ldlm_inodebits.c:238:ldlm_process_inodebits_lock()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:0.0:1400265787.402628:0:11385:0:(ldlm_lock.c:1762:ldlm_lock_enqueue()) Process leaving via out (rc=0 : 0 : 0x0) 00010000:00000001:0.0:1400265787.402630:0:11385:0:(ldlm_lockd.c:1380:ldlm_handle_enqueue0()) Process leaving 00010000:00010000:0.0:1400265787.402632:0:11385:0:(ldlm_lockd.c:1393:ldlm_handle_enqueue0()) ### server-side enqueue handler, sending reply(err=0, rc=0) ns: mdt-lustre-MDT0000_UUID lock: ffff880212a98ac0/0xd169abc5abfab66e lrc: 3/0,0 mode: CR/CR res: [0x200002340:0x1:0x0].0 bits 0x8 rrc: 2 type: IBT flags: 0x40000000000000 nid: 10.10.4.10@tcp remote: 0x5121400b7cdaafb2 expref: 13 pid: 11385 timeout: 0 lvb_type: 3 00000004:00000001:0.0:1400265787.402639:0:11385:0:(mdt_lvb.c:104:mdt_lvbo_fill()) Process entered 00000020:00000010:0.0:1400265787.402640:0:11385:0:(lu_object.c:1643:keys_init()) kmalloced 'ctx->lc_value': 320 at ffff880212baaa00. 00000020:00000010:0.0:1400265787.402642:0:11385:0:(lu_object.c:429:lu_global_key_init()) kmalloced 'value': 512 at ffff8801f657ce00. 00000020:00000010:0.0:1400265787.402645:0:11385:0:(dt_object.c:55:dt_global_key_init()) kmalloced 'value': 1216 at ffff88032a027000. 00000040:00000010:0.0:1400265787.402647:0:11385:0:(llog_obd.c:278:llog_key_init()) kmalloced 'value': 224 at ffff8802b9611ac0. 00000020:00000010:0.0:1400265787.402649:0:11385:0:(tgt_main.c:146:tgt_key_init()) kmalloced 'value': 1152 at ffff8803277a9800. 80000000:00000010:0.0:1400265787.402651:0:11385:0:(fld_handler.c:60:fld_key_init()) kmalloced 'value': 72 at ffff880212e4a440. 40000000:00000010:0.0:1400265787.402653:0:11385:0:(fid_handler.c:406:seq_key_init()) kmalloced 'value': 48 at ffff8802128a1a00. 00040000:00000010:0.0:1400265787.402656:0:11385:0:(lquota_lib.c:53:lquota_key_init()) kmalloced 'value': 240 at ffff88029dd554c0. 00040000:00000010:0.0:1400265787.402659:0:11385:0:(qmt_dev.c:336:qmt_key_init()) kmalloced 'value': 376 at ffff8801f67b2e00. 00040000:00000010:0.0:1400265787.402661:0:11385:0:(qsd_lib.c:71:qsd_key_init()) kmalloced 'value': 376 at ffff8801f67b2600. 00000004:00000010:0.0:1400265787.402665:0:11385:0:(osd_handler.c:5503:osd_key_init()) kmalloced 'info': 4400 at ffff88020f758000. 00000004:00000010:0.0:1400265787.402668:0:11385:0:(osd_handler.c:5507:osd_key_init()) kmalloced 'info->oti_it_ea_buf': 5120 at ffff88020f75a000. 00100000:00000010:0.0:1400265787.402671:0:11385:0:(lfsck_lib.c:46:lfsck_key_init()) kmalloced 'value': 1144 at ffff88020f723800. 00000004:00000010:0.0:1400265787.402673:0:11385:0:(mdt_handler.c:5817:mdt_key_init()) kmalloced 'value': 1464 at ffff88020f723000. 00000004:00000010:0.0:1400265787.402675:0:11385:0:(lod_dev.c:804:lod_key_init()) kmalloced 'value': 336 at ffff8801f67b2c00. 00000004:00000010:0.0:1400265787.402678:0:11385:0:(mdd_device.c:1553:mdd_key_init()) kmalloced 'value': 1504 at ffff88020f724800. 00000004:00000010:0.0:1400265787.402680:0:11385:0:(osp_dev.c:1171:osp_key_init()) kmalloced 'value': 568 at ffff880212882800. 00000004:00000001:0.0:1400265787.402682:0:11385:0:(mdt_handler.c:2181:mdt_object_find()) Process entered 00000004:00000040:0.0:1400265787.402683:0:11385:0:(mdt_handler.c:2183:mdt_object_find()) Find object for [0x200002340:0x1:0x0] 00000020:00000001:0.0:1400265787.402684:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000020:00000001:0.0:1400265787.402686:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000004:00000001:0.0:1400265787.402688:0:11385:0:(mdt_handler.c:2190:mdt_object_find()) Process leaving (rc=18446612141272470360 : -131932437081256 : ffff880215f41358) 00000004:00000001:0.0:1400265787.402689:0:11385:0:(mdd_object.c:238:mdd_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.402690:0:11385:0:(lod_object.c:498:lod_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.402692:0:11385:0:(lod_object.c:502:lod_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.402694:0:11385:0:(mdd_object.c:257:mdd_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.402695:0:11385:0:(mdd_object.c:238:mdd_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.402696:0:11385:0:(lod_object.c:498:lod_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.402698:0:11385:0:(lod_object.c:502:lod_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.402699:0:11385:0:(mdd_object.c:257:mdd_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.402700:0:11385:0:(mdt_internal.h:559:mdt_object_put()) Process entered 00000020:00000001:0.0:1400265787.402701:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000004:00000001:0.0:1400265787.402703:0:11385:0:(mdt_internal.h:561:mdt_object_put()) Process leaving 00000020:00000010:0.0:1400265787.402704:0:11385:0:(lu_object.c:429:lu_global_key_fini()) kfreed 'info': 512 at ffff8801f657ce00. 00000020:00000010:0.0:1400265787.402706:0:11385:0:(dt_object.c:56:dt_global_key_fini()) kfreed 'info': 1216 at ffff88032a027000. 00000040:00000010:0.0:1400265787.402708:0:11385:0:(llog_obd.c:278:llog_key_fini()) kfreed 'info': 224 at ffff8802b9611ac0. 00000020:00000010:0.0:1400265787.402710:0:11385:0:(tgt_main.c:163:tgt_key_fini()) kfreed 'info': 1152 at ffff8803277a9800. 80000000:00000010:0.0:1400265787.402712:0:11385:0:(fld_handler.c:60:fld_key_fini()) kfreed 'info': 72 at ffff880212e4a440. 40000000:00000010:0.0:1400265787.402714:0:11385:0:(fid_handler.c:406:seq_key_fini()) kfreed 'info': 48 at ffff8802128a1a00. 00040000:00000010:0.0:1400265787.402716:0:11385:0:(lquota_lib.c:53:lquota_key_fini()) kfreed 'info': 240 at ffff88029dd554c0. 00040000:00000010:0.0:1400265787.402717:0:11385:0:(qmt_dev.c:336:qmt_key_fini()) kfreed 'info': 376 at ffff8801f67b2e00. 00040000:00000010:0.0:1400265787.402719:0:11385:0:(qsd_lib.c:71:qsd_key_fini()) kfreed 'info': 376 at ffff8801f67b2600. 00000004:00000010:0.0:1400265787.402721:0:11385:0:(osd_handler.c:5533:osd_key_fini()) kfreed 'info->oti_it_ea_buf': 5120 at ffff88020f75a000. 00000004:00000010:0.0:1400265787.402723:0:11385:0:(osd_handler.c:5536:osd_key_fini()) kfreed 'info': 4400 at ffff88020f758000. 00100000:00000010:0.0:1400265787.402725:0:11385:0:(lfsck_lib.c:55:lfsck_key_fini()) kfreed 'info': 1144 at ffff88020f723800. 00000004:00000010:0.0:1400265787.402727:0:11385:0:(mdt_handler.c:5829:mdt_key_fini()) kfreed 'info': 1464 at ffff88020f723000. 00000004:00000010:0.0:1400265787.402729:0:11385:0:(lod_dev.c:819:lod_key_fini()) kfreed 'info': 336 at ffff8801f67b2c00. 00000004:00000010:0.0:1400265787.402731:0:11385:0:(mdd_device.c:1563:mdd_key_fini()) kfreed 'info': 1504 at ffff88020f724800. 00000004:00000010:0.0:1400265787.402732:0:11385:0:(osp_dev.c:1171:osp_key_fini()) kfreed 'info': 568 at ffff880212882800. 00000020:00000010:0.0:1400265787.402734:0:11385:0:(lu_object.c:1593:keys_fini()) kfreed 'ctx->lc_value': 320 at ffff880212baaa00. 00000004:00000001:0.0:1400265787.402735:0:11385:0:(mdt_lvb.c:179:mdt_lvbo_fill()) Process leaving (rc=3728 : 3728 : e90) 00010000:00000001:0.0:1400265787.402737:0:11385:0:(ldlm_lock.c:2072:ldlm_reprocess_all()) Process entered 00010000:00000001:0.0:1400265787.402738:0:11385:0:(ldlm_lock.c:1793:ldlm_reprocess_queue()) Process entered 00010000:00000001:0.0:1400265787.402739:0:11385:0:(ldlm_lock.c:1812:ldlm_reprocess_queue()) Process leaving (rc=1 : 1 : 1) 00010000:00000001:0.0:1400265787.402740:0:11385:0:(ldlm_lock.c:1793:ldlm_reprocess_queue()) Process entered 00010000:00000001:0.0:1400265787.402741:0:11385:0:(ldlm_lock.c:1812:ldlm_reprocess_queue()) Process leaving (rc=1 : 1 : 1) 00010000:00000001:0.0:1400265787.402742:0:11385:0:(ldlm_lock.c:1976:ldlm_run_ast_work()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:0.0:1400265787.402743:0:11385:0:(ldlm_lock.c:2100:ldlm_reprocess_all()) Process leaving 00010000:00000001:0.0:1400265787.402744:0:11385:0:(ldlm_lock.c:211:ldlm_lock_put()) Process entered 00010000:00000001:0.0:1400265787.402745:0:11385:0:(ldlm_lock.c:244:ldlm_lock_put()) Process leaving 00010000:00010000:0.0:1400265787.402746:0:11385:0:(ldlm_lockd.c:1436:ldlm_handle_enqueue0()) ### server-side enqueue handler END (lock ffff880212a98ac0, rc 0) 00000020:00000001:0.0:1400265787.402747:0:11385:0:(tgt_handler.c:1231:tgt_enqueue()) Process leaving (rc=0 : 0 : 0) 00010000:00000040:0.0:1400265787.402749:0:11385:0:(ldlm_lib.c:2363:target_committed_to_req()) last_committed 21474836483, transno 0, xid 1466802977730748 00010000:00000001:0.0:1400265787.402750:0:11385:0:(ldlm_lib.c:2422:target_send_reply()) Process entered 00010000:00000200:0.0:1400265787.402753:0:11385:0:(ldlm_lib.c:2410:target_send_reply_msg()) @@@ sending reply req@ffff88020ff7f800 x1466802977730748/t0(0) o101->a56bcdbb-ca09-9568-70fd-b3f47d606d12@10.10.4.10@tcp:0/0 lens 376/4072 e 0 to 0 dl 1400265793 ref 1 fl Interpret:/0/0 rc 0/0
Reply truncated at Client side, request is resent
00000100:00000001:0.0:1400265787.406980:0:7821:0:(events.c:97:reply_in_callback()) Process entered 00000100:00000200:0.0:1400265787.406983:0:7821:0:(events.c:99:reply_in_callback()) @@@ type 2, status 0 req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/1544 e 0 to 0 dl 1400265794 ref 2 fl Rpc:/0/ffffffff rc 0/-1 00000100:00100000:0.0:1400265787.406989:0:7821:0:(events.c:126:reply_in_callback()) truncate req ffff8802faff9400 rpc 1544 - 4040+192 00000100:00000001:0.0:1400265787.406993:0:7821:0:(events.c:173:reply_in_callback()) Process leaving 00000400:00000010:0.0:1400265787.406995:0:7821:0:(lib-lnet.h:528:lnet_msg_free()) kfreed 'msg': 352 at ffff8802f7eb7400 (tot 83132940). 00000800:00000001:0.0:1400265787.406997:0:7821:0:(socklnd_cb.c:346:ksocknal_receive()) Process entered 00000800:00000001:0.0:1400265787.406999:0:7821:0:(socklnd_cb.c:384:ksocknal_receive()) Process leaving (rc=1 : 1 : 1) 00000800:00000001:0.1:1400265787.407011:0:7821:0:(socklnd_lib-linux.c:1002:ksocknal_data_ready()) Process entered 00000800:00000001:0.1:1400265787.407013:0:7821:0:(socklnd_cb.c:1566:ksocknal_read_callback()) Process entered 00000800:00000001:0.1:1400265787.407014:0:7821:0:(socklnd_cb.c:1585:ksocknal_read_callback()) Process leaving 00000800:00000001:0.1:1400265787.407015:0:7821:0:(socklnd_lib-linux.c:1017:ksocknal_data_ready()) Process leaving 00000800:00000001:0.0:1400265787.407017:0:7821:0:(socklnd_cb.c:346:ksocknal_receive()) Process entered 00000800:00000001:0.0:1400265787.407020:0:7821:0:(socklnd_cb.c:384:ksocknal_receive()) Process leaving (rc=1 : 1 : 1) 00000800:00000001:0.0:1400265787.407021:0:7821:0:(socklnd_cb.c:346:ksocknal_receive()) Process entered 00000800:00000001:0.0:1400265787.407023:0:7821:0:(socklnd_cb.c:384:ksocknal_receive()) Process leaving (rc=1 : 1 : 1) 00000100:00000001:2.0:1400265787.407086:0:2750:0:(client.c:1525:ptlrpc_check_set()) Process entered 00000400:00000001:2.0:1400265787.407089:0:2750:0:(lib-msg.c:48:lnet_build_unlink_event()) Process entered 00000400:00000001:2.0:1400265787.407090:0:2750:0:(lib-msg.c:57:lnet_build_unlink_event()) Process leaving 00000100:00000001:2.0:1400265787.407091:0:2750:0:(events.c:97:reply_in_callback()) Process entered 00000100:00000200:2.0:1400265787.407095:0:2750:0:(events.c:99:reply_in_callback()) @@@ type 6, status 0 req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/1544 e 0 to 0 dl 1400265794 ref 2 fl Rpc:R/0/ffffffff rc -75/-1 00000100:00000200:2.0:1400265787.407103:0:2750:0:(events.c:120:reply_in_callback()) @@@ unlink req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/1544 e 0 to 0 dl 1400265794 ref 2 fl Rpc:R/0/ffffffff rc -75/-1 00000100:00000001:2.0:1400265787.407109:0:2750:0:(events.c:173:reply_in_callback()) Process leaving 00000400:00000010:2.0:1400265787.407110:0:2750:0:(lib-lnet.h:508:lnet_me_free()) kfreed 'me': 96 at ffff880320454dc0 (tot 83132844). 00000400:00000200:2.0:1400265787.407112:0:2750:0:(lib-md.c:73:lnet_md_unlink()) Unlinking md ffff880320454d40 00000400:00000010:2.0:1400265787.407114:0:2750:0:(lib-lnet.h:491:lnet_md_free()) kfreed 'md': 128 at ffff880320454d40 (tot 83132716). 00000100:00000001:2.0:1400265787.407115:0:2750:0:(client.c:2389:ptlrpc_unregister_reply()) Process leaving (rc=1 : 1 : 1) 00000100:00000001:2.0:1400265787.407117:0:2750:0:(client.c:1220:after_reply()) Process entered 02000000:00000001:2.0:1400265787.407118:0:2750:0:(sec.c:1710:sptlrpc_cli_free_repbuf()) Process entered 02000000:00000010:2.0:1400265787.407119:0:2750:0:(sec_null.c:231:null_free_repbuf()) kfreed 'req->rq_repbuf': 2048 at ffff880308075800. 02000000:00000001:2.0:1400265787.407121:0:2750:0:(sec.c:1724:sptlrpc_cli_free_repbuf()) Process leaving 00000100:00000001:2.0:1400265787.407124:0:2750:0:(client.c:1244:after_reply()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.407125:0:2750:0:(client.c:1886:ptlrpc_check_set()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.407126:0:2750:0:(client.c:1525:ptlrpc_check_set()) Process entered 00000100:00000001:2.0:1400265787.407127:0:2750:0:(client.c:2381:ptlrpc_unregister_reply()) Process leaving (rc=1 : 1 : 1) 00000100:00000001:2.0:1400265787.407129:0:2750:0:(client.c:1064:ptlrpc_import_delay_req()) Process entered 00000100:00000001:2.0:1400265787.407129:0:2750:0:(client.c:1115:ptlrpc_import_delay_req()) Process leaving (rc=0 : 0 : 0) 02000000:00000001:2.0:1400265787.407131:0:2750:0:(sec.c:659:sptlrpc_req_refresh_ctx()) Process entered 02000000:00000001:2.0:1400265787.407132:0:2750:0:(sec.c:686:sptlrpc_req_refresh_ctx()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.407133:0:2750:0:(niobuf.c:682:ptl_send_rpc()) Process entered 02000000:00000001:2.0:1400265787.407135:0:2750:0:(sec.c:961:sptlrpc_cli_wrap_request()) Process entered 02000000:00000001:2.0:1400265787.407136:0:2750:0:(sec.c:997:sptlrpc_cli_wrap_request()) Process leaving (rc=0 : 0 : 0) 02000000:00000001:2.0:1400265787.407137:0:2750:0:(sec.c:1689:sptlrpc_cli_alloc_repbuf()) Process entered 02000000:00000010:2.0:1400265787.407139:0:2750:0:(sec_null.c:217:null_alloc_repbuf()) kmalloced 'req->rq_repbuf': 8192 at ffff8803006c2000. 02000000:00000001:2.0:1400265787.407141:0:2750:0:(sec.c:1699:sptlrpc_cli_alloc_repbuf()) Process leaving (rc=0 : 0 : 0) 00000400:00000010:2.0:1400265787.407142:0:2750:0:(lib-lnet.h:500:lnet_me_alloc()) alloc '(me)': 96 at ffff880320454d40 (tot 83132812). 00000400:00000010:2.0:1400265787.407145:0:2750:0:(lib-lnet.h:468:lnet_md_alloc()) alloc '(md)': 128 at ffff880320454dc0 (tot 83132940). 00000100:00000200:2.0:1400265787.407147:0:2750:0:(niobuf.c:808:ptl_send_rpc()) Setup reply buffer: 8192 bytes, xid 1466802977730748, portal 10 00000100:00000001:2.0:1400265787.407149:0:2750:0:(client.c:2605:ptlrpc_request_addref()) Process entered 00000100:00000001:2.0:1400265787.407150:0:2750:0:(client.c:2607:ptlrpc_request_addref()) Process leaving (rc=18446612145115206656 : -131928594344960 : ffff8802faff9400) 00000100:00000040:2.0:1400265787.407153:0:2750:0:(niobuf.c:829:ptl_send_rpc()) @@@ send flg=2 req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/4232 e 0 to 0 dl 1400265794 ref 3 fl Rpc:/2/ffffffff rc 0/-1 00000100:00000001:2.0:1400265787.407159:0:2750:0:(niobuf.c:60:ptl_send_buf()) Process entered 00000100:00000040:2.0:1400265787.407161:0:2750:0:(niobuf.c:64:ptl_send_buf()) conn=ffff8803282dacc0 id 12345-10.10.4.17@tcp 00000400:00000010:2.0:1400265787.407162:0:2750:0:(lib-lnet.h:468:lnet_md_alloc()) alloc '(md)': 128 at ffff8802f8c444c0 (tot 83133068). 00000100:00000200:2.0:1400265787.407164:0:2750:0:(niobuf.c:86:ptl_send_buf()) Sending 376 bytes to portal 12, xid 1466802977730748, offset 0 00000400:00000010:2.0:1400265787.407166:0:2750:0:(lib-lnet.h:517:lnet_msg_alloc()) alloc '(msg)': 352 at ffff8802f7244000 (tot 83133420). 00000400:00000200:2.0:1400265787.407169:0:2750:0:(lib-move.c:2176:LNetPut()) LNetPut -> 12345-10.10.4.17@tcp 00000800:00000200:2.0:1400265787.407173:0:2750:0:(socklnd_cb.c:947:ksocknal_send()) sending 376 bytes in 1 frags to 12345-10.10.4.17@tcp 00000800:00000010:2.0:1400265787.407175:0:2750:0:(socklnd_cb.c:51:ksocknal_alloc_tx()) alloc '(tx)': 232 at ffff880310ae9180 (tot 83133652). 00000800:00000200:2.0:1400265787.407178:0:2750:0:(socklnd.c:199:ksocknal_find_peer_locked()) got peer [ffff880338348380] -> 12345-10.10.4.17@tcp (5) 00000800:00000200:2.0:1400265787.407182:0:2750:0:(socklnd_cb.c:706:ksocknal_queue_tx_locked()) Sending to 12345-10.10.4.17@tcp ip 10.10.4.17:988 00000800:00000200:2.0:1400265787.407184:0:2750:0:(socklnd_cb.c:725:ksocknal_queue_tx_locked()) Packet ffff880310ae9180 type 1, nob 472 niov 2 nkiov 0 00000100:00000001:2.0:1400265787.407189:0:2750:0:(niobuf.c:101:ptl_send_buf()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.407191:0:2750:0:(niobuf.c:837:ptl_send_rpc()) Process leaving via out (rc=0 : 0 : 0x0) 00000100:00000001:2.0:1400265787.407193:0:2750:0:(client.c:1886:ptlrpc_check_set()) Process leaving (rc=1 : 1 : 1) 00000100:00000001:2.0:1400265787.407196:0:2750:0:(client.c:2058:ptlrpc_set_next_timeout()) Process entered 00000100:00000001:2.0:1400265787.407197:0:2750:0:(client.c:2095:ptlrpc_set_next_timeout()) Process leaving (rc=6 : 6 : 6) 00000100:00100000:2.0:1400265787.407198:0:2750:0:(client.c:2132:ptlrpc_set_wait()) set ffff8803287f55c0 going to sleep for 6 seconds 00000100:00000001:2.0:1400265787.407199:0:2750:0:(client.c:1525:ptlrpc_check_set()) Process entered 00000100:00000001:2.0:1400265787.407200:0:2750:0:(client.c:1886:ptlrpc_check_set()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:2.0:1400265787.407201:0:2750:0:(client.c:1525:ptlrpc_check_set()) Process entered 00000100:00000001:2.0:1400265787.407202:0:2750:0:(client.c:1886:ptlrpc_check_set()) Process leaving (rc=0 : 0 : 0) 00000800:00000200:1.0:1400265787.407277:0:7820:0:(socklnd_cb.c:514:ksocknal_process_transmit()) send(0) 0 00000800:00000001:1.0:1400265787.407285:0:7820:0:(socklnd_cb.c:392:ksocknal_tx_done()) Process entered 00000800:00000010:1.0:1400265787.407286:0:7820:0:(socklnd_cb.c:106:ksocknal_free_tx()) kfreed 'tx': 232 at ffff880310ae9180 (tot 83133420). 00000100:00000001:1.0:1400265787.407289:0:7820:0:(events.c:62:request_out_callback()) Process entered 00000100:00000200:1.0:1400265787.407293:0:7820:0:(events.c:68:request_out_callback()) @@@ type 5, status 0 req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/4232 e 0 to 0 dl 1400265794 ref 3 fl Rpc:/2/ffffffff rc 0/-1 00000100:00000001:1.0:1400265787.407300:0:7820:0:(client.c:2313:__ptlrpc_req_finished()) Process entered 00000100:00000040:1.0:1400265787.407302:0:7820:0:(client.c:2325:__ptlrpc_req_finished()) @@@ refcount now 2 req@ffff8802faff9400 x1466802977730748/t0(0) o101->lustre-MDT0000-mdc-ffff880331322400@10.10.4.17@tcp:12/10 lens 376/4232 e 0 to 0 dl 1400265794 ref 3 fl Rpc:/2/ffffffff rc 0/-1 00000100:00000001:1.0:1400265787.407307:0:7820:0:(client.c:2332:__ptlrpc_req_finished()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:1.0:1400265787.407309:0:7820:0:(events.c:87:request_out_callback()) Process leaving
Server detects resend, but since not expected for layout-lock, handles again, hence the LBUG
00000100:00000001:1.0:1400265787.403522:0:11347:0:(events.c:296:request_in_callback()) Process entered 00000100:00000200:1.0:1400265787.403523:0:11347:0:(events.c:306:request_in_callback()) event type 2, status 0, service mdt00000100:00000010:1.0:1400265787.403525:0:11347:0:(client.c:410:ptlrpc_request_cache_alloc()) slab-alloced 'req': 976 at ffff880214c69000. 00000100:00000040:1.0:1400265787.403526:0:11347:0:(events.c:350:request_in_callback()) incoming req@ffff880214c69000 x1466802977730748 msgsize 376 00000100:00100000:1.0:1400265787.403529:0:11347:0:(events.c:352:request_in_callback()) peer: 12345-10.10.4.10@tcp 00000100:00000001:1.0:1400265787.403533:0:11347:0:(events.c:385:request_in_callback()) Process leaving00000400:00000010:1.0:1400265787.403534:0:11347:0:(lib-lnet.h:528:lnet_msg_free()) kfreed 'msg': 352 at ffff880212baaa00 (tot 117245148). 00000800:00000001:1.0:1400265787.403536:0:11347:0:(socklnd_cb.c:346:ksocknal_receive()) Process entered 00000800:00000001:1.0:1400265787.403538:0:11347:0:(socklnd_cb.c:384:ksocknal_receive()) Process leaving (rc=1 : 1 : 1) 00000400:00000001:0.0:1400265787.403585:0:11385:0:(watchdog.c:416:lc_watchdog_touch()) Process entered 00000400:00000001:0.0:1400265787.403587:0:11385:0:(watchdog.c:427:lc_watchdog_touch()) Process leaving 00000100:00000001:0.0:1400265787.403590:0:11385:0:(service.c:1860:ptlrpc_server_handle_req_in()) Process entered 02000000:00000001:0.0:1400265787.403591:0:11385:0:(sec.c:2053:sptlrpc_svc_unwrap_request()) Process entered 00000100:00000001:0.0:1400265787.403592:0:11385:0:(pack_generic.c:582:__lustre_unpack_msg()) Process entered 00000100:00000001:0.0:1400265787.403593:0:11385:0:(pack_generic.c:601:__lustre_unpack_msg()) Process leaving (rc=0 : 0 : 0) 02000000:00000001:0.0:1400265787.403595:0:11385:0:(sec.c:2100:sptlrpc_svc_unwrap_request()) Process leaving (rc=0 : 0 : 0) 00000100:00100000:0.0:1400265787.403597:0:11385:0:(service.c:1939:ptlrpc_server_handle_req_in()) got req x1466802977730748 00000020:00000001:0.0:1400265787.403598:0:11385:0:(genops.c:741:class_conn2export()) Process entered 00000020:00000040:0.0:1400265787.403599:0:11385:0:(genops.c:753:class_conn2export()) looking for export cookie 0xd169abc5abfab541 00000020:00000001:0.0:1400265787.403600:0:11385:0:(lustre_handles.c:172:class_handle2object()) Process entered 00000020:00000040:0.0:1400265787.403601:0:11385:0:(genops.c:839:class_export_get()) GETting export ffff880212978800 : new refcount 13 00000020:00000001:0.0:1400265787.403603:0:11385:0:(lustre_handles.c:195:class_handle2object()) Process leaving (rc=18446612141216073728 : -131932493477888 : ffff880212978800) 00000020:00000001:0.0:1400265787.403605:0:11385:0:(genops.c:755:class_conn2export()) Process leaving (rc=18446612141216073728 : -131932493477888 : ffff880212978800) 00000100:00000001:0.0:1400265787.403607:0:11385:0:(service.c:1068:ptlrpc_update_export_timer()) Process entered 00000100:00000001:0.0:1400265787.403608:0:11385:0:(service.c:1081:ptlrpc_update_export_timer()) Process leaving 00000020:00000010:0.0:1400265787.403609:0:11385:0:(lu_object.c:1643:keys_init()) kmalloced 'ctx->lc_value': 320 at ffff880268932c00. 02000000:00000010:0.0:1400265787.403611:0:11385:0:(capa.c:416:lu_capainfo_key_init()) kmalloced 'value': 128 at ffff880213336bc0. 00000020:00000010:0.0:1400265787.403612:0:11385:0:(lu_ucred.c:53:lu_ucred_key_init()) kmalloced 'value': 72 at ffff880212f7cac0. 00000020:00000010:0.0:1400265787.403614:0:11385:0:(tgt_main.c:187:tgt_ses_key_init()) kmalloced 'value': 144 at ffff880215e7a740. 00000100:00000040:0.0:1400265787.403616:0:11385:0:(service.c:1203:ptlrpc_at_set_timer()) armed mdt at +1s 00000100:00000001:0.0:1400265787.403618:0:11385:0:(service.c:1670:ptlrpc_server_request_add()) Process entered 00000100:00000001:0.0:1400265787.403618:0:11385:0:(service.c:1571:ptlrpc_server_hpreq_init()) Process entered 00000100:00000001:0.0:1400265787.403620:0:11385:0:(service.c:1622:ptlrpc_server_hpreq_init()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1400265787.403621:0:11385:0:(service.c:1685:ptlrpc_server_request_add()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1400265787.403625:0:11385:0:(service.c:2002:ptlrpc_server_handle_req_in()) Process leaving (rc=1 : 1 : 1) 00000400:00000001:0.0:1400265787.403627:0:11385:0:(watchdog.c:433:lc_watchdog_disable()) Process entered 00000400:00000001:0.0:1400265787.403628:0:11385:0:(watchdog.c:441:lc_watchdog_disable()) Process leaving 00000400:00000001:0.0:1400265787.403629:0:11385:0:(watchdog.c:416:lc_watchdog_touch()) Process entered 00000400:00000001:0.0:1400265787.403630:0:11385:0:(watchdog.c:427:lc_watchdog_touch()) Process leaving 00000100:00000001:0.0:1400265787.403631:0:11385:0:(service.c:2025:ptlrpc_server_handle_request()) Process entered 00000100:00000001:0.0:1400265787.403632:0:11385:0:(service.c:1803:ptlrpc_server_request_get()) Process entered 00000100:00100000:0.0:1400265787.403634:0:11385:0:(nrs_fifo.c:182:nrs_fifo_req_get()) NRS start fifo request from 12345-10.10.4.10@tcp, seq: 173 00000100:00000040:0.0:1400265787.403636:0:11385:0:(service.c:1841:ptlrpc_server_request_get()) RPC GETting export ffff880212978800 : new rpc_count 1 00000100:00000001:0.0:1400265787.403637:0:11385:0:(service.c:1843:ptlrpc_server_request_get()) Process leaving (rc=18446612141252710400 : -131932456841216 : ffff880214c69000) 00000100:00000040:0.0:1400265787.403640:0:11385:0:(lustre_net.h:3296:ptlrpc_rqphase_move()) @@@ move req "New" -> "Interpret" req@ffff880214c69000 x1466802977730748/t0(0) o101->a56bcdbb-ca09-9568-70fd-b3f47d606d12@10.10.4.10@tcp:0/0 lens 376/0 e 0 to 0 dl 1400265793 ref 1 fl New:/2/ffffffff rc 0/-1 00000100:00000001:0.0:1400265787.403652:0:11385:0:(service.c:1068:ptlrpc_update_export_timer()) Process entered 00000100:00000001:0.0:1400265787.403653:0:11385:0:(service.c:1081:ptlrpc_update_export_timer()) Process leaving 00000100:00100000:0.0:1400265787.403655:0:11385:0:(service.c:2086:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc mdt00_002:a56bcdbb-ca09-9568-70fd-b3f47d606d12+13:2750:x1466802977730748:12345-10.10.4.10@tcp:101 00000100:00000200:0.0:1400265787.403657:0:11385:0:(service.c:2091:ptlrpc_server_handle_request()) got req 1466802977730748 00000020:00000001:0.0:1400265787.403659:0:11385:0:(tgt_handler.c:589:tgt_request_handle()) Process entered 00000020:00000001:0.0:1400265787.403660:0:11385:0:(tgt_handler.c:553:tgt_handler_find_check()) Process entered 00000020:00000001:0.0:1400265787.403661:0:11385:0:(tgt_handler.c:575:tgt_handler_find_check()) Process leaving (rc=18446744072111134752 : -1598416864 : ffffffffa0ba1820) 00000020:00000001:0.0:1400265787.403663:0:11385:0:(tgt_handler.c:492:tgt_handle_recovery()) Process entered 00000020:00000001:0.0:1400265787.403664:0:11385:0:(tgt_handler.c:542:tgt_handle_recovery()) Process leaving (rc=1 : 1 : 1) 00000020:00000001:0.0:1400265787.403666:0:11385:0:(tgt_handler.c:377:tgt_handle_request0()) Process entered 00000020:00000001:0.0:1400265787.403667:0:11385:0:(tgt_handler.c:307:tgt_request_preprocess()) Process entered 00000020:00000001:0.0:1400265787.403668:0:11385:0:(tgt_handler.c:362:tgt_request_preprocess()) Process leaving (rc=0 : 0 : 0) 00000020:00000001:0.0:1400265787.403670:0:11385:0:(tgt_handler.c:1220:tgt_enqueue()) Process entered 00010000:00000001:0.0:1400265787.403671:0:11385:0:(ldlm_lockd.c:1165:ldlm_handle_enqueue0()) Process entered 00010000:00010000:0.0:1400265787.403671:0:11385:0:(ldlm_lockd.c:1167:ldlm_handle_enqueue0()) ### server-side enqueue handler START 00010000:00000001:0.0:1400265787.403673:0:11385:0:(ldlm_lockd.c:1556:ldlm_request_cancel()) Process entered 00010000:00000001:0.0:1400265787.403673:0:11385:0:(ldlm_lockd.c:1560:ldlm_request_cancel()) Process leaving (rc=0 : 0 : 0) 00010000:00010000:0.0:1400265787.403677:0:11385:0:(ldlm_lockd.c:1241:ldlm_handle_enqueue0()) @@@ found existing lock cookie 0xd169abc5abfab66e req@ffff880214c69000 x1466802977730748/t0(0) o101->a56bcdbb-ca09-9568-70fd-b3f47d606d12@10.10.4.10@tcp:0/0 lens 376/0 e 0 to 0 dl 1400265793 ref 1 fl Interpret:/2/ffffffff rc 0/-1 00010000:00000001:0.0:1400265787.403683:0:11385:0:(ldlm_lockd.c:1242:ldlm_handle_enqueue0()) Process leaving via existing_lock (rc=0 : 0 : 0x0) 00010000:00000001:0.0:1400265787.403684:0:11385:0:(ldlm_lock.c:1666:ldlm_lock_enqueue()) Process entered 00000004:00000001:0.0:1400265787.403685:0:11385:0:(mdt_handler.c:3497:mdt_intent_policy()) Process entered 00000004:00000001:0.0:1400265787.403687:0:11385:0:(mdt_handler.c:3434:mdt_intent_opc()) Process entered 00000004:00000001:0.0:1400265787.403688:0:11385:0:(mdt_handler.c:2735:mdt_unpack_req_pack_rep()) Process entered 00000004:00000001:0.0:1400265787.403689:0:11385:0:(mdt_handler.c:2753:mdt_unpack_req_pack_rep()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.403690:0:11385:0:(mdt_handler.c:3247:mdt_intent_layout()) Process entered 00000004:00000001:0.0:1400265787.403691:0:11385:0:(mdt_handler.c:2181:mdt_object_find()) Process entered 00000004:00000040:0.0:1400265787.403692:0:11385:0:(mdt_handler.c:2183:mdt_object_find()) Find object for [0x200002340:0x1:0x0] 00000020:00000001:0.0:1400265787.403693:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000020:00000001:0.0:1400265787.403695:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000004:00000001:0.0:1400265787.403697:0:11385:0:(mdt_handler.c:2190:mdt_object_find()) Process leaving (rc=18446612141272470360 : -131932437081256 : ffff880215f41358) 00000004:00000001:0.0:1400265787.403699:0:11385:0:(mdd_object.c:238:mdd_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.403700:0:11385:0:(lod_object.c:498:lod_xattr_get()) Process entered 00000004:00000001:0.0:1400265787.403704:0:11385:0:(lod_object.c:502:lod_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.403706:0:11385:0:(mdd_object.c:257:mdd_xattr_get()) Process leaving (rc=3728 : 3728 : e90) 00000004:00000001:0.0:1400265787.403707:0:11385:0:(mdt_internal.h:559:mdt_object_put()) Process entered 00000020:00000001:0.0:1400265787.403708:0:11385:0:(lustre_fid.h:707:fid_flatten32()) Process leaving (rc=36962305 : 36962305 : 2340001) 00000004:00000001:0.0:1400265787.403710:0:11385:0:(mdt_internal.h:561:mdt_object_put()) Process leaving 00000100:00000001:0.0:1400265787.403711:0:11385:0:(pack_generic.c:337:lustre_pack_reply_v2()) Process entered 02000000:00000001:0.0:1400265787.403712:0:11385:0:(sec.c:2113:sptlrpc_svc_alloc_rs()) Process entered 02000000:00000010:0.0:1400265787.403714:0:11385:0:(sec_null.c:320:null_alloc_rs()) kmalloced 'rs': 4336 at ffff88020f758000. 02000000:00000001:0.0:1400265787.403715:0:11385:0:(sec.c:2149:sptlrpc_svc_alloc_rs()) Process leaving (rc=0 : 0 : 0) 00000100:00000001:0.0:1400265787.403717:0:11385:0:(pack_generic.c:371:lustre_pack_reply_v2()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.403718:0:11385:0:(mdt_handler.c:3284:mdt_intent_layout()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.403719:0:11385:0:(mdt_handler.c:3483:mdt_intent_opc()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:0.0:1400265787.403721:0:11385:0:(mdt_handler.c:3536:mdt_intent_policy()) Process leaving (rc=0 : 0 : 0) 00010000:00000001:0.0:1400265787.403722:0:11385:0:(ldlm_inodebits.c:188:ldlm_process_inodebits_lock()) Process entered 00010000:00040000:0.0:1400265787.403723:0:11385:0:(ldlm_inodebits.c:190:ldlm_process_inodebits_lock()) ASSERTION( lock->l_granted_mode != lock->l_req_mode ) failed: 00010000:00000010:0.1:1400265787.415728:0:11385:0:(ldlm_lock.c:453:lock_handle_free()) slab-freed 'lock': 504 at ffff8802107f1700. 00010000:00040000:0.0:1400265787.415731:0:11385:0:(ldlm_inodebits.c:190:ldlm_process_inodebits_lock()) LBUG 00000400:00000001:0.0:1400265787.503768:0:11385:0:(linux-debug.c:151:libcfs_run_lbug_upcall()) Process entered 00000400:00000001:0.0:1400265787.503770:0:11385:0:(linux-debug.c:118:libcfs_run_upcall()) Process entered 00010000:00000010:0.1:1400265787.504023:0:0:0:(ldlm_lock.c:453:lock_handle_free()) slab-freed 'lock': 504 at ffff880212a986c0. 00000400:00080000:3.0:1400265787.504316:0:11385:0:(linux-debug.c:142:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall LBUG,/var/lib/jenkins/workspace/lustre-reviews/arch/x86_64/build_type/server/distro/el6/ib_stack/inkernel/BUILD/BUILD/lustre-2.5.58/lustre/ptlrpc/../../lustre/ldlm/ldlm_inodebits.c,ldlm_process_inodebits_lock,190
I have an add-on for current Gerrit-change LU-2827/#5978-[8,9] that handles this truncate/resend during layout-lock issue.
And I just submitted it, at http://review.whamcloud.com/10378, as a new patch/change that relates on latest LU-2827/#5978-9 patch-set.
Oleg,
I thought I made it clear already that LU-4584 contains already all information/reproducer/... that fully detail its specific scenario of a layout-lock reply from Server being truncated, due to reply buffer too short on Client vs layout/striping size, then causing Client to resend layout-lock request where current bug leads to a duplicated lock on Server and later Client eviction during 1st lock blocking callback process.
BTW, it reproduces solid on homogeneous b2_4/b2_5/master Servers/Client platforms and again triggers the LBUG/LASSERT(lock->l_granted_mode != lock->l_req_mode) that the change introduced in ldlm_process_inodebits_lock() when running with this ticket/Gerrit-change LU-2827/#5978-8.
Merged b2_4 backport of both #5978 and #10378 master changes for this ticket, is at http://review.whamcloud.com/10902.