[LU-7593] umount vs tgt_last_rcvd_update deadlock Created: 22/Dec/15  Updated: 27/May/19  Resolved: 25/Oct/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Major
Reporter: Andriy Skulysh Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: bgti, patch

Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

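tgt_client_del() and ofd_commitrw_write()->tgt_last_rcvd_update()
take the transaction and ted->ted_lcd_lock in opposite order.
During umount, thread1 takes ted->ted_lcd_lock in tgt_client_del() and then starts a transaction.
thread2, handling a write, is still inside its transaction when tgt_last_rcvd_update() tries to take the same mutex.
Stack traces (innermost frame first):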

thread1:
osd_trans_start
tgt_client_data_update
tgt_client_del <<< mutex_lock(&ted->ted_lcd_lock);
ofd_obd_disconnect
class_disconnect_export_list
class_disconnect_exports
class_cleanup
...
sys_umount

thread2:
__mutex_lock_slowpath
mutex_lock <<< mutex_lock(&ted->ted_lcd_lock);
tgt_last_rcvd_update
tgt_txn_stop_cb
dt_txn_hook_stop
osd_trans_stop
ofd_trans_stop
ofd_commitrw_write
...
tgt_brw_write
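
The ordering problem in the two traces above can be illustrated with a small user-space sketch. This is not Lustre code: txn_slot, lcd_lock, disconnect_path, write_path and run_both_paths are hypothetical names, and the transaction is modelled as a plain pthread mutex, which is a deliberate simplification of how a real journal transaction behaves. The sketch shows the consistent order (transaction first, then the lcd lock) that the later patch subject "take ted_lcd_lock after transaction started" suggests. With both paths using the same order, neither thread can hold one resource while waiting for the other.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins, not Lustre symbols:
 * txn_slot models "a transaction is open" (simplified to a mutex),
 * lcd_lock models ted->ted_lcd_lock. */
static pthread_mutex_t txn_slot = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lcd_lock = PTHREAD_MUTEX_INITIALIZER;
static int lcd_updates;                 /* protected by lcd_lock */

/* Models the umount/disconnect path after reordering:
 * start the transaction first, only then take the lcd lock. */
static void *disconnect_path(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&txn_slot);  /* "transaction start" */
        pthread_mutex_lock(&lcd_lock);  /* lcd lock taken second */
        lcd_updates++;
        pthread_mutex_unlock(&lcd_lock);
        pthread_mutex_unlock(&txn_slot);/* "transaction stop" */
    }
    return NULL;
}

/* Models the write path: the lcd lock is taken from the
 * transaction-stop hook, i.e. while the transaction is still held. */
static void *write_path(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&txn_slot);  /* transaction already running */
        pthread_mutex_lock(&lcd_lock);  /* taken inside the stop hook */
        lcd_updates++;
        pthread_mutex_unlock(&lcd_lock);
        pthread_mutex_unlock(&txn_slot);
    }
    return NULL;
}

/* Runs both paths concurrently and returns the number of updates.
 * It returns only if the two threads never deadlock. */
int run_both_paths(int iterations)
{
    pthread_t t1, t2;
    int rc;

    lcd_updates = 0;
    rc = pthread_create(&t1, NULL, disconnect_path, &iterations);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, write_path, &iterations);
    assert(rc == 0);
    (void)rc;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return lcd_updates;
}
```

Swapping the two pthread_mutex_lock() calls in disconnect_path would reproduce the inverted order from the thread1 trace, and under this simplified model the two threads could then block each other indefinitely.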



 Comments   
Comment by Gerrit Updater [ 22/Dec/15 ]

Andriy Skulysh (andriy.skulysh@seagate.com) uploaded a new patch: http://review.whamcloud.com/17703
Subject: LU-7593 target: umount vs tgt_last_rcvd_update deadlock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: f3e97908af29d76962bd20542f7c20ff9e4c3fe8

Comment by Gerrit Updater [ 22/Dec/15 ]

Andriy Skulysh (andriy.skulysh@seagate.com) uploaded a new patch: http://review.whamcloud.com/17704
Subject: LU-7593 target: umount vs tgt_last_rcvd_update deadlock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2c3f38babef3a657e5a6e0dc022ec273ae0447a1

Comment by Peter Jones [ 22/Dec/15 ]

Andriy

Could you please elaborate as to why there are two patches here? Was this issue hit in testing on master?

Peter

Comment by Andriy Skulysh [ 23/Dec/15 ]

I forgot to add a test for the deadlock. The second patch contains both the test and the fix.

Comment by Peter Jones [ 23/Dec/15 ]

Ah I see. So only the second patch is intended to land and the first one will be abandoned?

Comment by Andriy Skulysh [ 25/Dec/15 ]

Yes, only the second patch is needed.

Comment by Gerrit Updater [ 02/Sep/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17704/
Subject: LU-7593 target: umount vs tgt_last_rcvd_update deadlock
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9197849bc761f7fbbdee65916ad1346eec79d412

Comment by Peter Jones [ 02/Sep/16 ]

Landed for 2.9

Comment by nasf (Inactive) [ 13/Oct/16 ]

"I forgot to add a test for the deadlock. Second patch contains both test and fix"

The 2nd patch http://review.whamcloud.com/#/c/17704/ does NOT contain a test, so I think you are referring to the 1st patch http://review.whamcloud.com/#/c/17703/ ?
Also, the two patches look quite different.

Comment by Gerrit Updater [ 13/Oct/16 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/23129
Subject: LU-7593 target: take ted_lcd_lock after transaction started
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 57ab9916062248a102bbe7938509c41390c67e0c

Comment by Gerrit Updater [ 25/Oct/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/23129/
Subject: LU-7593 target: take ted_lcd_lock after transaction started
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: cbca8055b75d084bc6554d1a06f3cd8ccf2f8c09

Comment by nasf (Inactive) [ 25/Oct/16 ]

The patch http://review.whamcloud.com/23129/ has landed in Lustre 2.9.
