Details
- Bug
- Resolution: Duplicate
- Critical
Description
When trying to use direct I/O on loop devices to format and mount an XFS filesystem backed by a Lustre file, we ran into the following crash:
PID: 51772 TASK: ffff950e22f117c0 CPU: 1 COMMAND: "loop4"
#0 [ffffafe700c8b8a0] machine_kexec at ffffffffb9659a5e
#1 [ffffafe700c8b8f8] __crash_kexec at ffffffffb975928d
#2 [ffffafe700c8b9c0] panic at ffffffffb96b1498
#3 [ffffafe700c8ba48] ll_direct_IO_impl at ffffffffc1755194 [lustre]
#4 [ffffafe700c8bb08] generic_file_read_iter at ffffffffb981cd1f
#5 [ffffafe700c8bb50] vvp_io_read_start at ffffffffc17673be [lustre]
#6 [ffffafe700c8bbe8] cl_io_start at ffffffffc09e673d [obdclass]
#7 [ffffafe700c8bc10] cl_io_loop at ffffffffc09e9d1a [obdclass]
#8 [ffffafe700c8bc48] ll_file_io_generic at ffffffffc170e510 [lustre]
#9 [ffffafe700c8bd40] ll_file_read_iter at ffffffffc170f9a6 [lustre]
#10 [ffffafe700c8bdb0] lo_rw_aio at ffffffffc08037a9 [loop]
#11 [ffffafe700c8be28] loop_queue_work at ffffffffc0804bc7 [loop]
#12 [ffffafe700c8bee0] kthread_worker_fn at ffffffffb96d5224
#13 [ffffafe700c8bf10] kthread at ffffffffb96d4802
#14 [ffffafe700c8bf50] ret_from_fork at ffffffffba000242
This crash is triggered by `LASSERT(ll_dio_aio)` in `ll_file_io_generic`. The loop block device sets `kiocb::ki_flags = IOCB_DIRECT` to request direct I/O, but Lustre only checks `file::f_flags & O_DIRECT` to decide whether an I/O is direct. The two checks can disagree, so the read/write code paths end up without the variables they expect to have been set up.
This crash was produced with:
truncate -s 100M /mnt/lustre/disk
losetup -f -b 4096 --direct-io=on /mnt/lustre/disk
Issue Links
- is related to LU-16695 switch Lustre to use IOCB_APPEND and IOCB_DIRECT instead of file flags (Resolved)