[LU-2177] ldlm_flock_completion_ast causes LBUG because of a race Created: 15/Oct/12 Updated: 04/Jan/16 Resolved: 12/Jun/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.3, Lustre 1.8.8 |
| Fix Version/s: | Lustre 2.6.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Hiroya Nozaki | Assignee: | Emoly Liu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch |
| Environment: |
MDS x1, OSS x1 (OST x3), client x1, running an application that uses flock |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 5215 |
| Description |
|
I believe that I have found a possible race condition between ldlm_cli_enqueue_fini() and cleanup_resource() when they handle flocks. The case is below:

Thread B (evict): cleanup_resource()

```c
static void cleanup_resource(struct ldlm_resource *res, cfs_list_t *q,
                             int flags)
{
        ...
        /* Set CBPENDING so nothing in the cancellation path
         * can match this lock */
        lock->l_flags |= LDLM_FL_CBPENDING;
        lock->l_flags |= LDLM_FL_FAILED;
        lock->l_flags |= flags;

        /* ... without sending a CANCEL message for local_only. */
        if (local_only)
                lock->l_flags |= LDLM_FL_LOCAL_ONLY;

        if (local_only && (lock->l_readers || lock->l_writers)) {
                /* This is a little bit gross, but much better than the
                 * alternative: pretend that we got a blocking AST from
                 * the server, so that when the lock is decref'd, it
                 * will go away ... */
                unlock_res(res);
                LDLM_DEBUG(lock, "setting FL_LOCAL_ONLY");
                if (lock->l_completion_ast)
                        lock->l_completion_ast(lock, 0, NULL);
                LDLM_LOCK_RELEASE(lock);
                continue;
        }
        ...
}
```

Thread A (flock): ldlm_cli_enqueue_fini()

```c
int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req,
                          ldlm_type_t type, __u8 with_policy, ldlm_mode_t mode,
                          int *flags, void *lvb, __u32 lvb_len,
                          struct lustre_handle *lockh, int rc)
{
        ...
        if (!is_replay) {
                rc = ldlm_lock_enqueue(ns, &lock, NULL, flags);
                if (lock->l_completion_ast != NULL) {
                        int err = lock->l_completion_ast(lock, *flags, NULL);

                        if (!rc)
                                rc = err;
                        if (rc)
                                cleanup_phase = 1;
                }
        }
        ...
}
```

Result: both threads can end up calling ldlm_flock_completion_ast() on the same lock:

```c
ldlm_flock_completion_ast(struct ldlm_lock *lock, int flags, void *data)
{
        cfs_flock_t *getlk = lock->l_ast_data;
        struct obd_device *obd;
        struct obd_import *imp = NULL;
        struct ldlm_flock_wait_data fwd;
        struct l_wait_info lwi;
        ldlm_error_t err;
        int rc = 0;
        ENTRY;

        CDEBUG(D_DLMTRACE, "flags: 0x%x data: %p getlk: %p\n",
               flags, data, getlk);

        /* Import invalidation. We need to actually release the lock
         * references being held, so that it can go away. No point in
         * holding the lock even if app still believes it has it, since
         * server already dropped it anyway. Only for granted locks too. */
        if ((lock->l_flags & (LDLM_FL_FAILED|LDLM_FL_LOCAL_ONLY)) ==
            (LDLM_FL_FAILED|LDLM_FL_LOCAL_ONLY)) {
                if (lock->l_req_mode == lock->l_granted_mode &&
                    lock->l_granted_mode != LCK_NL &&
                    NULL == data)
                        ldlm_lock_decref_internal(lock, lock->l_req_mode);

                /* Need to wake up the waiter if we were evicted */
                cfs_waitq_signal(&lock->l_waitq);
                RETURN(0);
        }
        ....
}
```

Below is the dlmtrace. It shows that ldlm_lock_decref_internal_nolock() was called on the same lock twice, even though its rw refcount had already reached zero.
```
5827: flock
00010000:00000001:13:1350018852.706543:0:5827:0:(ldlm_request.c:409:ldlm_cli_enqueue_fini()) Process entered
00010000:00000001:13:1350018852.706543:0:5827:0:(ldlm_lock.c:451:__ldlm_handle2lock()) Process entered
00000020:00000001:13:1350018852.706544:0:5827:0:(lustre_handles.c:172:class_handle2object()) Process entered
00000020:00000001:13:1350018852.706545:0:5827:0:(lustre_handles.c:195:class_handle2object()) Process leaving (rc=18446604488316264960 : -139585393286656 : ffff810c3e37a200)
00010000:00000001:13:1350018852.706548:0:5827:0:(ldlm_resource.c:1105:ldlm_resource_putref_internal()) Process entered
00010000:00000001:13:1350018852.706549:0:5827:0:(ldlm_resource.c:1118:ldlm_resource_putref_internal()) Process leaving (rc=0 : 0 : 0)
00010000:00000001:13:1350018852.706551:0:5827:0:(ldlm_lock.c:502:__ldlm_handle2lock()) Process leaving
00010000:00000001:13:1350018852.706552:0:5827:0:(ldlm_lock.c:1217:ldlm_lock_enqueue()) Process entered
00010000:00000001:13:1350018861.719601:0:5827:0:(ldlm_lock.c:895:ldlm_grant_lock()) Process entered
00010000:00000001:13:1350018861.719602:0:5827:0:(ldlm_lock.c:914:ldlm_grant_lock()) Process leaving
00010000:00000001:13:1350018861.719603:0:5827:0:(ldlm_lock.c:1329:ldlm_lock_enqueue()) Process leaving via out (rc=0 : 0 : 0)
00010000:00000001:13:1350018861.719604:0:5827:0:(ldlm_flock.c:525:ldlm_flock_completion_ast()) Process entered
00010000:00010000:4:1350018861.719604:0:5829:0:(ldlm_resource.c:574:cleanup_resource()) ### setting FL_LOCAL_ONLY ns: fefs-MDT0000-mdc-ffff8105e9bbbc00 lock: ffff810c3e37a200/0xee16fc0aef26bb17 lrc: 5/0,1 mode: PW/PW res: 12/706300508 rrc: 2 type: FLK pid: 5827 [0->16777215] flags: 0x2000c10 remote: 0xa520aeaebdefc517 expref: -99 pid: 5827 timeout: 0
00010000:00010000:13:1350018861.719605:0:5827:0:(ldlm_flock.c:528:ldlm_flock_completion_ast()) flags: 0x40000 data: 0000000000000000 getlk: ffff810c3ecc1ad8
00010000:00000001:13:1350018861.719607:0:5827:0:(ldlm_lock.c:659:ldlm_lock_decref_internal()) Process entered
1 ==================================================
00010000:00010000:13:1350018861.719607:0:5827:0:(ldlm_lock.c:637:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PW) ns: fefs-MDT0000-mdc-ffff8105e9bbbc00 lock: ffff810c3e37a200/0xee16fc0aef26bb17 lrc: 5/0,1 mode: PW/PW res: 12/706300508 rrc: 2 type: FLK pid: 5827 [0->16777215] flags: 0x2000c10 remote: 0xa520aeaebdefc517 expref: -99 pid: 5827 timeout: 0
==================================================
00010000:00000001:4:1350018861.719611:0:5829:0:(ldlm_flock.c:525:ldlm_flock_completion_ast()) Process entered
00010000:00000001:13:1350018861.719612:0:5827:0:(ldlm_lock.c:152:ldlm_lock_put()) Process entered
00010000:00010000:4:1350018861.719612:0:5829:0:(ldlm_flock.c:528:ldlm_flock_completion_ast()) flags: 0x0 data: 0000000000000000 getlk: ffff810c3ecc1ad8
00010000:00000001:13:1350018861.719613:0:5827:0:(ldlm_lock.c:182:ldlm_lock_put()) Process leaving
00010000:00010000:13:1350018861.719613:0:5827:0:(ldlm_lock.c:683:ldlm_lock_decref_internal()) ### final decref done on cbpending lock ns: fefs-MDT0000-mdc-ffff8105e9bbbc00 lock: ffff810c3e37a200/0xee16fc0aef26bb17 lrc: 4/0,0 mode: PW/PW res: 12/706300508 rrc: 2 type: FLK pid: 5827 [0->16777215] flags: 0x2000c10 remote: 0xa520aeaebdefc517 expref: -99 pid: 5827 timeout: 0
00010000:00000001:13:1350018861.719617:0:5827:0:(ldlm_lock.c:206:ldlm_lock_remove_from_lru()) Process entered
00010000:00000001:13:1350018861.719618:0:5827:0:(ldlm_lock.c:210:ldlm_lock_remove_from_lru()) Process leaving
00010000:00000001:13:1350018861.719619:0:5827:0:(ldlm_lockd.c:1763:ldlm_bl_to_thread()) Process entered
00010000:00000001:4:1350018861.719619:0:5829:0:(ldlm_lock.c:659:ldlm_lock_decref_internal()) Process entered
2 ==================================================
00010000:00010000:4:1350018861.719619:0:5829:0:(ldlm_lock.c:637:ldlm_lock_decref_internal_nolock()) ### ldlm_lock_decref(PW) ns: fefs-MDT0000-mdc-ffff8105e9bbbc00 lock: ffff810c3e37a200/0xee16fc0aef26bb17 lrc: 5/0,0 mode: PW/PW res: 12/706300508 rrc: 2 type: FLK pid: 5827 [0->16777215] flags: 0x2000c10 remote: 0xa520aeaebdefc517 expref: -99 pid: 5827 timeout: 0
==================================================
```
|
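For illustration only, here is a minimal standalone sketch (plain C with pthreads, not Lustre code) of the failure mode described above: two threads both reach the FAILED|LOCAL_ONLY branch of the completion AST and both drop the same writer reference. The names fake_lock, decrefed and completion_ast() are invented for this sketch; the test-and-set guard shown is just one way to make the release idempotent and is not necessarily how the landed patch (http://review.whamcloud.com/10005) resolves the race.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_lock {
        atomic_int  writers;    /* models lock->l_writers */
        atomic_bool decrefed;   /* hypothetical "reference already dropped" bit */
};

/* Models the FAILED|LOCAL_ONLY branch of the flock completion AST. */
static void completion_ast(struct fake_lock *lk)
{
        /* Only the first caller gets to drop the reference; without this
         * test-and-set both threads decrement, which corresponds to the
         * second ldlm_lock_decref_internal_nolock() seen in the trace. */
        if (atomic_exchange(&lk->decrefed, true))
                return;
        atomic_fetch_sub(&lk->writers, 1);  /* models ldlm_lock_decref_internal() */
}

/* Thread A stands in for the ldlm_cli_enqueue_fini() path,
 * thread B for the cleanup_resource() (eviction) path. */
static void *thread_fn(void *arg)
{
        completion_ast(arg);
        return NULL;
}

int main(void)
{
        struct fake_lock lk = { .writers = 1, .decrefed = false };
        pthread_t a, b;

        pthread_create(&a, NULL, thread_fn, &lk);
        pthread_create(&b, NULL, thread_fn, &lk);
        pthread_join(a, NULL);
        pthread_join(b, NULL);

        printf("writers = %d (0 is correct, -1 would be the double decref)\n",
               atomic_load(&lk.writers));
        return 0;
}
```

Built with a C11 compiler and -pthread, the program prints writers = 0; removing the atomic_exchange() guard makes both threads decrement and the count goes negative, which is the double release that triggers the LBUG in the real code.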
| Comments |
| Comment by Hiroya Nozaki [ 18/Oct/12 ] |
|
patch for b1_8 |
| Comment by Hiroya Nozaki [ 18/Oct/12 ] |
|
patch for master |
| Comment by Hiroya Nozaki [ 18/Oct/12 ] |
|
I uploaded patches to fix the problem on the b1_8 and master branches. |
| Comment by Vitaly Fertman [ 18/Apr/14 ] |
| Comment by Emoly Liu [ 12/Jun/14 ] |
|
The patch http://review.whamcloud.com/10005 landed to 2.6. |