[LU-4135] mdt_save_lock() is broken Created: 23/Oct/13  Updated: 31/Jan/22  Resolved: 18/Nov/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0
Fix Version/s: Lustre 2.6.0, Lustre 2.5.1

Type: Bug Priority: Blocker
Reporter: Mikhail Pershin Assignee: Mikhail Pershin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
duplicates LU-4103 interop 2.5/2.6 replay-dual test_21a:... Resolved
duplicates LU-4143 MDS OOPS sanity-hsm/test_52 NULL Poin... Closed
duplicates LU-4142 MDS OOPS sanity-hsm/test_52 NULL Poin... Closed
Related
is related to LU-4143 MDS OOPS sanity-hsm/test_52 NULL Poin... Closed
Severity: 3
Rank (Obsolete): 11218

 Description   

mdt_save_lock() is broken: it never saves a lock and simply unlocks it instead. This happens because mti_has_trans is always 0 and has not been updated on transaction execution since commit 607905a789357a34166f34e7c992b03f5040eafc.

Another issue with mdt_save_lock() is the 'req' variable, which can be NULL in the code path mdt_export_cleanup()->mdt_ctxt_add_dirty_flag()->mdt_add_dirty_flag()->mdt_object_unlock()->mdt_save_lock().
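
The two problems described above can be modelled with a minimal, self-contained sketch. It is not the Lustre code: every name prefixed with sketch_ is hypothetical, the structures are simplified stand-ins, and the NULL check is only one possible guard, not necessarily what the actual patch does. The has_trans field stands in for mti_has_trans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Lustre structures. */
struct sketch_lock {
	bool held;	/* lock is currently held */
	bool saved;	/* lock was handed over to the request instead of dropped */
};

struct sketch_request {
	int saved_locks;	/* number of locks attached to this request */
};

struct sketch_thread_info {
	struct sketch_request *req;	/* NULL on the export-cleanup path */
	bool has_trans;			/* models mti_has_trans */
};

/*
 * Models the transaction-execution step. Per the description, the flag is
 * no longer updated at this point, so it stays 0 forever.
 */
static void sketch_trans_executed(struct sketch_thread_info *info)
{
	info->has_trans = true;
}

/*
 * Save the lock only if a transaction was executed AND a request exists.
 * Otherwise just unlock. The req check is an assumed guard for the NULL
 * case, shown for illustration only.
 */
static void sketch_save_lock(struct sketch_thread_info *info,
			     struct sketch_lock *lock)
{
	if (!lock->held)
		return;

	if (info->has_trans && info->req != NULL) {
		info->req->saved_locks++;	/* lock stays held, tied to req */
		lock->saved = true;
	} else {
		lock->held = false;		/* plain unlock */
	}
}
```

In this model, a caller that never invokes sketch_trans_executed() always takes the plain-unlock branch, which mirrors the "never saves any lock" symptom. A caller with req == NULL takes the same branch instead of dereferencing the pointer.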



 Comments   
Comment by Mikhail Pershin [ 23/Oct/13 ]

Patch to fix this issue: http://review.whamcloud.com/8048

Comment by Oleg Drokin [ 23/Oct/13 ]

So how does this manifest itself? A crash in specific circumstances, or what?
Also, how did you find it, and when will this hit in the real world?

Comment by Mikhail Pershin [ 23/Oct/13 ]

I found this while testing side patches for Unified Target, but it is clear that req is taken from mdt_thread_info and is NULL in the mdt_export_cleanup() case in master as well, so I created this bug. Meanwhile, I wondered why we don't see this issue in master tests and found out that things are even worse: currently in master, mdt_save_lock() never saves locks and just unlocks them, because mti_has_trans is always 0 after commit 607905a789357a34166f34e7c992b03f5040eafc. In my patches for UT it works, and the bug happens.

This issue is quite critical now because we broke an important part of recovery, so I'd change the summary to something like "restore mdt_save_lock() functionality". The patch has already been refreshed to fix both issues.

With proper mdt_save_lock() functionality, the oops happens in test 52 of sanity-hsm.sh due to the NULL req variable, as I wrote in the first comment.

Comment by Mikhail Pershin [ 18/Nov/13 ]

Patch was merged.

Comment by Andreas Dilger [ 05/Feb/14 ]

Patch was also merged to b2_5 and will be in the 2.5.1 release.

Generated at Sat Feb 10 01:39:59 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.