[LU-7981] double read of lli_trunc_sem in ll_page_mkwrite and vvp_io_fault_start leads to deadlock

Details

Type: Bug
Resolution: Fixed
Priority: Major
Fix Version/s: Lustre 2.9.0
Affects Version/s: None
Labels: None
Severity: 3

Description

After applying the patch for LU-7927 to our code, another deadlock was exposed. It does not look like LU-7927 caused it; the timing change introduced by that patch seems merely to have allowed this bug to be observed. (Or possibly this code was deadlocking first and went unnoticed. It is hard to say precisely.)

The lli_trunc_sem is taken in read mode in both ll_page_mkwrite and vvp_io_fault_start, on the same call path. This can deadlock against another thread that requests the semaphore in write mode after the first read acquisition but before the second.

—
The issue is a double down_read on lli_trunc_sem:

PID: 35117 TASK: ffff8807c26e9680 CPU: 6 COMMAND: "fsx-linux-aio"
#0 [ffff8807c29f7ac0] schedule at ffffffff8149cf35
#1 [ffff8807c29f7b40] rwsem_down_read_failed at ffffffff8149ed25
#2 [ffff8807c29f7b90] call_rwsem_down_read_failed at ffffffff81271f64
#3 [ffff8807c29f7be8] vvp_io_fault_start at ffffffffa08f2526 [lustre]
#4 [ffff8807c29f7c58] cl_io_start at ffffffffa0522115 [obdclass]
#5 [ffff8807c29f7c80] cl_io_loop at ffffffffa0525705 [obdclass]
#6 [ffff8807c29f7cb0] ll_page_mkwrite at ffffffffa08d2a2a [lustre]
#7 [ffff8807c29f7d30] __do_fault at ffffffff81148c70
#8 [ffff8807c29f7db8] handle_mm_fault at ffffffff8114c2cf
#9 [ffff8807c29f7e40] __do_page_fault at ffffffff814a3420
#10 [ffff8807c29f7f40] do_page_fault at ffffffff814a37de
#11 [ffff8807c29f7f50] page_fault at ffffffff8149ff62
RIP: 000000002002551b RSP: 00007fffffff64c8 RFLAGS: 00010212

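The first down_read is done in ll_page_mkwrite; the second is done in vvp_io_fault_start, which ll_page_mkwrite reaches through cl_io_loop and cl_io_start (frames #6 to #3 above).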

This is a problem because a waiting writer takes priority over any future readers. Here is an example of such a writer, blocked in down_write during a truncate:
PID: 35131 TASK: ffff8807c4ecf1c0 CPU: 13 COMMAND: "fsx-linux-aio"
#0 [ffff8807c3555b58] schedule at ffffffff8149cf35
#1 [ffff8807c3555bd8] rwsem_down_write_failed at ffffffff8149ef45
#2 [ffff8807c3555c50] call_rwsem_down_write_failed at ffffffff81271f93
#3 [ffff8807c3555ca0] vvp_io_setattr_start at ffffffffa08f0cea [lustre]
#4 [ffff8807c3555ce0] cl_io_start at ffffffffa0522115 [obdclass]
#5 [ffff8807c3555d08] cl_io_loop at ffffffffa0525705 [obdclass]
#6 [ffff8807c3555d38] cl_setattr_ost at ffffffffa08eb250 [lustre]
#7 [ffff8807c3555d80] ll_setattr_raw at ffffffffa08be009 [lustre]
#8 [ffff8807c3555e68] ll_setattr at ffffffffa08be313 [lustre]
#9 [ffff8807c3555e78] notify_change at ffffffff8119d401
#10 [ffff8807c3555eb8] do_truncate at ffffffff8118066d
#11 [ffff8807c3555f28] do_sys_ftruncate.constprop.20 at ffffffff811809bb
#12 [ffff8807c3555f70] sys_ftruncate at ffffffff81180a4e
#13 [ffff8807c3555f80] system_call_fastpath at ffffffff814a7db2
RIP: 0000000020152867 RSP: 00007fffffff6678 RFLAGS: 00010246
RAX: 000000000000004d RBX: ffffffff814a7db2 RCX: 0000010000081000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00007fffffff6670 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: ffffffff81180a4e
R13: ffff8807c3555f78 R14: 0000000000000000 R15: 00000000201028b0
ORIG_RAX: 000000000000004d CS: 0033 SS: 002b

To be clear, here is the sequence of events:
Thread 1 (pid 35117 above): down_read() <-- SUCCEEDS
Thread 2 (pid 35131 above): down_write() <-- FAILS, starts waiting
Thread 1: down_read() [again] <-- FAILS, stuck behind thread 2 (which is stuck behind thread 1)
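
The sequence above can be replayed with a small user-space model. This is only a sketch: WriterPreferringRWSem, replay and self_deadlocked are hypothetical names invented for illustration, the model never blocks (it just records who would be waiting), and it assumes the "queued writer blocks later readers" behaviour described in this ticket rather than reproducing the kernel's rwsem code.

```python
from collections import deque


class WriterPreferringRWSem:
    """Toy, non-blocking model of a reader/writer semaphore.

    Assumption (taken from the ticket text, not from kernel source):
    once any waiter is queued, later readers queue behind it.
    """

    def __init__(self):
        self.read_holders = []   # threads currently holding a read lock
        self.active_writer = None
        self.waiters = deque()   # (thread, mode) in arrival order

    def down_read(self, thread):
        # Admit a reader only if no writer holds the lock and nobody
        # is already queued ahead of it.
        if self.active_writer is None and not self.waiters:
            self.read_holders.append(thread)
            return True
        self.waiters.append((thread, "read"))
        return False

    def down_write(self, thread):
        # A writer needs the lock completely free.
        if (self.active_writer is None and not self.read_holders
                and not self.waiters):
            self.active_writer = thread
            return True
        self.waiters.append((thread, "write"))
        return False


def self_deadlocked(sem):
    """True if a thread that already holds a read lock is queued
    behind a writer, which in turn waits for that read lock."""
    seen_writer = False
    for thread, mode in sem.waiters:
        if mode == "write":
            seen_writer = True
        elif seen_writer and thread in sem.read_holders:
            return True
    return False


def replay():
    """Replay the three steps from the ticket."""
    sem = WriterPreferringRWSem()
    results = [
        ("thread 1 down_read", sem.down_read("thread1")),
        ("thread 2 down_write", sem.down_write("thread2")),
        ("thread 1 down_read [again]", sem.down_read("thread1")),
    ]
    return sem, results


if __name__ == "__main__":
    sem, results = replay()
    for step, ok in results:
        print(step, "SUCCEEDS" if ok else "FAILS, waits")
    print("deadlock:", self_deadlocked(sem))
```

In this model the second down_read by thread 1 is queued behind thread 2's down_write, while thread 2 cannot proceed until thread 1 releases the read lock it already holds, which is the cycle shown in the two stack traces.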

People

Assignee: Zhenyu Xu
Reporter: Patrick Farrell (Inactive)
Votes: 0
Watchers: 3

Dates

Created: 04/Apr/16 6:13 PM
Updated: 03/May/16 5:57 PM
Resolved: 03/May/16 5:57 PM