Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.7.0
-
centos 6 + Lustre head of tree (2.7+)
-
3
Description
I can consistently crash the Lustre client with the attached reproducer.
Info from the logs:
<0>LustreError: 26474:0:(osc_cache.c:519:osc_extent_merge()) ASSERTION( cur->oe_dlmlock == victim->oe_dlmlock ) failed:
<0>LustreError: 26474:0:(osc_cache.c:519:osc_extent_merge()) LBUG
Stack trace from crash:
crash> bt
PID: 26474  TASK: ffff88003747caa0  CPU: 3  COMMAND: "llsendfile3"
 #0 [ffff88001a2835f0] machine_kexec at ffffffff81038f3b
 #1 [ffff88001a283650] crash_kexec at ffffffff810c5b62
 #2 [ffff88001a283720] panic at ffffffff815285a3
 #3 [ffff88001a2837a0] lbug_with_loc at ffffffffa0ac8eeb [libcfs]
 #4 [ffff88001a2837c0] osc_extent_merge at ffffffffa06ce57d [osc]
 #5 [ffff88001a2838d0] osc_extent_release at ffffffffa06d3efb [osc]
 #6 [ffff88001a283900] osc_io_end at ffffffffa06c520f [osc]
 #7 [ffff88001a283920] cl_io_end at ffffffffa0dfc270 [obdclass]
 #8 [ffff88001a283950] lov_io_end_wrapper at ffffffffa070f3b1 [lov]
 #9 [ffff88001a283970] lov_io_call at ffffffffa070f0fe [lov]
#10 [ffff88001a2839a0] lov_io_end at ffffffffa0710fbc [lov]
#11 [ffff88001a2839c0] cl_io_end at ffffffffa0dfc270 [obdclass]
#12 [ffff88001a2839f0] cl_io_loop at ffffffffa0e00b52 [obdclass]
#13 [ffff88001a283a20] ll_file_io_generic at ffffffffa125e20c [lustre]
#14 [ffff88001a283b40] ll_file_aio_write at ffffffffa125e933 [lustre]
#15 [ffff88001a283ba0] ll_file_write at ffffffffa125edd9 [lustre]
#16 [ffff88001a283c10] vfs_write at ffffffff81188df8
#17 [ffff88001a283c50] kernel_write at ffffffff811b8ded
#18 [ffff88001a283c80] write_pipe_buf at ffffffff811b8e5a
#19 [ffff88001a283cc0] splice_from_pipe_feed at ffffffff811b7a92
#20 [ffff88001a283d10] __splice_from_pipe at ffffffff811b84ee
#21 [ffff88001a283d50] splice_from_pipe at ffffffff811b8551
#22 [ffff88001a283da0] default_file_splice_write at ffffffff811b858d
#23 [ffff88001a283dc0] do_splice_from at ffffffff811b862e
#24 [ffff88001a283e00] direct_splice_actor at ffffffff811b8680
#25 [ffff88001a283e10] splice_direct_to_actor at ffffffff811b8956
#26 [ffff88001a283e80] do_splice_direct at ffffffff811b8a9d
#27 [ffff88001a283ed0] do_sendfile at ffffffff811891fc
#28 [ffff88001a283f30] sys_sendfile64 at ffffffff81189294
#29 [ffff88001a283f80] system_call_fastpath at ffffffff8100b072
    RIP: 0000003a522df7da  RSP: 00007fffe6f8add8  RFLAGS: 00010206
    RAX: 0000000000000028  RBX: ffffffff8100b072  RCX: 0000000000a00000
    RDX: 0000000000000000  RSI: 0000000000000003  RDI: 0000000000000004
    RBP: 0000000000000004   R8: 0000003a5258f300   R9: 0000003a51a0e9f0
    R10: 0000000000a00000  R11: 0000000000000206  R12: 0000000000000000
    R13: 00007fffe6f8aed0  R14: 0000000000401b90  R15: 0000000000000003
    ORIG_RAX: 0000000000000028  CS: 0033  SS: 002b
This is related to the group lock on the target file. If the group lock is commented out, then no crash happens.
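For readers without the attachment, the call pattern can be sketched roughly as follows. This is not the attached reproducer, only an illustration of a sendfile() into a file that holds a group lock. The register dump is consistent with sendfile(out_fd=4, in_fd=3, offset=NULL, count=0xa00000), i.e. a 10 MiB transfer. The helper name sendfile_copy and its parameters are hypothetical, and the ioctl numbers are assumed to match _IOW('f', 158, long) and _IOW('f', 159, long) from lustre_user.h as encoded on x86-64.

```python
import fcntl
import os


def _iow(type_char, nr, size):
    # Linux _IOW() encoding on x86-64: direction bit 30, size at bit 16,
    # type at bit 8, number in the low byte.
    return (1 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Assumption: values taken from lustre_user.h, with sizeof(long) == 8.
LL_IOC_GROUP_LOCK = _iow("f", 158, 8)
LL_IOC_GROUP_UNLOCK = _iow("f", 159, 8)


def sendfile_copy(src_path, dst_path, count, gid=None):
    """Copy up to `count` bytes from src to dst with sendfile().

    If `gid` is given, take a Lustre group lock with that id on the
    target file before the transfer and drop it afterwards.
    """
    in_fd = os.open(src_path, os.O_RDONLY)
    try:
        out_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            if gid is not None:
                fcntl.ioctl(out_fd, LL_IOC_GROUP_LOCK, gid)
            try:
                total = 0
                while total < count:
                    # offset=None: read from the current position of in_fd
                    sent = os.sendfile(out_fd, in_fd, None, count - total)
                    if sent == 0:
                        break  # end of source file
                    total += sent
                return total
            finally:
                if gid is not None:
                    fcntl.ioctl(out_fd, LL_IOC_GROUP_UNLOCK, gid)
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
```

On an affected client, a call such as sendfile_copy("/tmp/src", "/mnt/lustre/dst", 10 * 1024 * 1024, gid=1234) (paths and gid are placeholders) would exercise the group-locked path, while gid=None corresponds to the case with the group lock commented out.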
Landed for 2.8