[LU-3125] Oops in mdd_swap_layouts() Created: 08/Apr/13  Updated: 23/Apr/13  Resolved: 23/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: John Hammond Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: MB, mdd

Issue Links:
Duplicate
is duplicated by LU-3145 layout swap on stripeless file will i... Closed
Related
is related to LU-3145 layout swap on stripeless file will i... Closed
Severity: 3
Rank (Obsolete): 7595

 Description   

Layout swap on stripeless files may oops in mdd_swap_layouts().

To reproduce:

fd[0] = open("f0", O_CREAT|O_WRONLY, 0666);
fd[1] = open("f1", O_CREAT|O_WRONLY|O_LOV_DELAY_CREAT, 0666);

struct lustre_swap_layouts sl = {
        .sl_fd = fd[1],
        .sl_flags = 0,
        .sl_gid = 0,
};

ioctl(fd[0], LL_IOC_LOV_SWAP_LAYOUTS, &sl);

crash> bt -l
...
    [exception RIP: mdd_swap_layouts+918]
    RIP: ffffffffa051d2b6  RSP: ffff88014a29bc20  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff8801529d16c0  RCX: ffff88015159aa00
    RDX: 0000000000000002  RSI: 0000000000000000  RDI: ffff880154757fc0
    RBP: ffff88014a29bce0   R8: 0000000000000246   R9: 00000000fffffffc
    R10: ffff880161679c60  R11: 0000000000000000  R12: ffff880161679cc0
    R13: ffff88013f458940  R14: ffff88015eac9670  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff88014a29bce8] mdt_swap_layouts at ffffffffa0a4b834 [mdt]
    /root/lustre-release/lustre/include/md_object.h: 620
#10 [ffff88014a29bd48] mdt_handle_common at ffffffffa0a460f8 [mdt]
    /root/lustre-release/lustre/mdt/mdt_handler.c: 2984
#11 [ffff88014a29bd98] mds_regular_handle at ffffffffa0a823f5 [mdt]
    /root/lustre-release/lustre/mdt/mdt_mds.c: 354
#12 [ffff88014a29bda8] ptlrpc_server_handle_request at ffffffffa06293cc [ptlrpc]
    /root/lustre-release/lustre/include/lustre_net.h: 2944
#13 [ffff88014a29bea8] ptlrpc_main at ffffffffa062a8c5 [ptlrpc]
    /root/lustre-release/lustre/ptlrpc/service.c: 2482
#14 [ffff88014a29bf48] kernel_thread at ffffffff8100c0ca
    /usr/src/debug///////kernel-2.6.32-279.19.1.el6/linux-2.6.32-279.19.1.el6_lustre_gcov.x86_64/arch/x86/kernel/entry_64.S: 1213

static int mdd_swap_layouts(const struct lu_env *env, struct md_object *obj1,
                            struct md_object *obj2, __u64 flags)
{
        ...

        /* lmm and generation layout initialization */
        if (fst_buf) {
                fst_lmm = fst_buf->lb_buf;
                fst_gen = le16_to_cpu(fst_lmm->lmm_layout_gen);
        } else {
                fst_lmm = NULL;
                fst_gen = 0;
        }

        ...

        /* save the original lmm common header of first file
         * to be able to roll back */
        OBD_ALLOC_PTR(old_fst_lmm);
        if (old_fst_lmm == NULL)
                GOTO(unlock, rc = -ENOMEM);

        /* XXX fst_lmm may be NULL here. */
        memcpy(old_fst_lmm, fst_lmm, sizeof(*old_fst_lmm));

        ...
}
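
The excerpt above copies from fst_lmm unconditionally, even though the earlier branch sets it to NULL for a stripeless file. As an illustration only, and not the patch that landed, here is a minimal standalone sketch of one way to guard such a copy. The struct layout_hdr type and the save_hdr_for_rollback() helper are hypothetical stand-ins invented for this example; they are not Lustre types or functions, and plain malloc() is used in place of OBD_ALLOC_PTR().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the common layout header; not the real Lustre struct. */
struct layout_hdr {
        uint32_t magic;
        uint16_t layout_gen;
};

/*
 * Save a copy of the first file's header so a failed swap can be rolled back.
 * A NULL src models a stripeless file: there is nothing to save, so *saved
 * is set to NULL and memcpy() is never called with a NULL source.
 * Returns 0 on success or -ENOMEM if the allocation fails.
 */
static int save_hdr_for_rollback(const struct layout_hdr *src,
                                 struct layout_hdr **saved)
{
        assert(saved != NULL);
        *saved = NULL;

        if (src == NULL)
                return 0;

        *saved = malloc(sizeof(**saved));
        if (*saved == NULL)
                return -ENOMEM;

        memcpy(*saved, src, sizeof(**saved));
        return 0;
}
```

With this shape, a caller on the rollback path would need to treat a NULL saved header as "the file had no layout before the swap" rather than as an error.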



 Comments   
Comment by Andreas Dilger [ 08/Apr/13 ]

The patch for fixing this problem should include a test case which triggers the above problem.

Comment by Lai Siyao [ 09/Apr/13 ]

patch is on http://review.whamcloud.com/#change,5998

Comment by Jodi Levi (Inactive) [ 10/Apr/13 ]

Did this patch include the feedback from LU-3145?

Comment by John Hammond [ 10/Apr/13 ]

No. It's really two issues. The later issue is already present but masked by this one. That said, we can merge the issues or not.

Comment by Peter Jones [ 23/Apr/13 ]

Landed for 2.4
