Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.8.0
- 1
- 9223372036854775807
Description
The MDT crashed with the stack trace below. The issue is related to large EA (extended attribute) handling.
[411763.959595] LDISKFS-fs error (device md66): ldiskfs_xattr_inode_iget:
[411763.959667] LDISKFS-fs error (device md66): ldiskfs_xattr_inode_iget: Backpointer from EA inode 2300579986 to parent invalid.
[411763.959670] Aborting journal on device md66-8.
[411763.959758] LustreError: 243183:0:(osd_handler.c:914:osd_trans_commit_cb()) transaction @0xffff8806f77781c0 commit error: 2
[411763.986296] Kernel panic - not syncing: LDISKFS-fs (device md66): panic forced after error
[411763.986297]
[411763.986299] Pid: 179095, comm: mdt01_092 Not tainted 2.6.32-431.17.1.x2.0.61.x86_64 #1
[411763.986300] Call Trace:
[411763.986311] [<ffffffff81524e6e>] ? panic+0xa7/0x16f
[411763.986324] [<ffffffffa10a98a8>] ? ldiskfs_commit_super+0x188/0x210 [ldiskfs]
[411763.986331] [<ffffffffa10a9eb4>] ? ldiskfs_handle_error+0xc4/0xd0 [ldiskfs]
[411763.986338] [<ffffffffa10aa252>] ? __ldiskfs_error+0x82/0x90 [ldiskfs]
[411763.986344] [<ffffffffa10b194f>] ? ldiskfs_xattr_inode_iget+0xbf/0x190 [ldiskfs]
[411763.986350] [<ffffffffa10b1c66>] ? ldiskfs_xattr_inode_get+0x26/0xf0 [ldiskfs]
[411763.986356] [<ffffffffa10b15ce>] ? ldiskfs_xattr_find_entry+0x3e/0x120 [ldiskfs]
[411763.986362] [<ffffffffa10b26bc>] ? ldiskfs_xattr_get+0x18c/0x330 [ldiskfs]
[411763.986368] [<ffffffffa10b492b>] ? ldiskfs_xattr_trusted_get+0x2b/0x30 [ldiskfs]
[411763.986372] [<ffffffff811ae467>] ? generic_getxattr+0x87/0x90
[411763.986384] [<ffffffffa1102a60>] ? osd_xattr_get+0x230/0x2e0 [osd_ldiskfs]
[411763.986392] [<ffffffffa11a657d>] ? lod_xattr_get+0x19d/0x4c0 [lod]
[411763.986403] [<ffffffffa0f5bce1>] ? mdd_xattr_get+0x111/0x400 [mdd]
[411763.986418] [<ffffffffa0faddd0>] ? mdt_big_xattr_get+0x110/0xa70 [mdt]
[411763.986424] [<ffffffffa11a3708>] ? lod_object_read_unlock+0x38/0xd0 [lod]
[411763.986430] [<ffffffffa0f52b38>] ? mdd_read_unlock+0x38/0xd0 [mdd]
[411763.986435] [<ffffffffa0f5bcef>] ? mdd_xattr_get+0x11f/0x400 [mdd]
[411763.986441] [<ffffffffa0f56b15>] ? mdd_la_get+0xb5/0x240 [mdd]
[411763.986451] [<ffffffffa0fae7fc>] ? mdt_attr_get_lov+0xcc/0x1e0 [mdt]
[411763.986460] [<ffffffffa0faeed6>] ? mdt_attr_get_complex+0x5c6/0xbb0 [mdt]
[411763.986469] [<ffffffffa0fafc1f>] ? mdt_getattr_internal+0x23f/0x13f0 [mdt]
[411763.986478] [<ffffffffa0fb3072>] ? mdt_getattr_name_lock+0xbf2/0x1aa0 [mdt]
[411763.986519] [<ffffffffa0a4a294>] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
[411763.986528] [<ffffffffa0fb4442>] ? mdt_intent_getattr+0x292/0x470 [mdt]
[411763.986536] [<ffffffffa0fa3594>] ? mdt_intent_policy+0x494/0xce0 [mdt]
[411763.986557] [<ffffffffa09fc739>] ? ldlm_lock_enqueue+0x129/0x9d0 [ptlrpc]
[411763.986580] [<ffffffffa0a28a2b>] ? ldlm_handle_enqueue0+0x51b/0x13b0 [ptlrpc]
[411763.986592] [<ffffffffa067e63e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[411763.986625] [<ffffffffa0aaaac1>] ? tgt_enqueue+0x61/0x230 [ptlrpc]
[411763.986653] [<ffffffffa0aaa1ee>] ? tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[411763.986678] [<ffffffffa0a5a57a>] ? ptlrpc_main+0xd7a/0x1880 [ptlrpc]
[411763.986681] [<ffffffff8152557e>] ? thread_return+0x4e/0x760
[411763.986706] [<ffffffffa0a59800>] ? ptlrpc_main+0x0/0x1880 [ptlrpc]
[411763.986709] [<ffffffff8109ac66>] ? kthread+0x96/0xa0
[411763.986712] [<ffffffff8100c20a>] ? child_rip+0xa/0x20
[411763.986714] [<ffffffff8109abd0>] ? kthread+0x0/0xa0
[411763.986715] [<ffffffff8100c200>] ? child_rip+0x0/0x20
Zam investigated the issue; the details are below:
1. An inode (not an EA inode, just an ordinary one) has an inode number greater than 2G (2^31).
2. An EA inode is allocated.
3. The backward link from the EA inode is set, using i_mtime.tv_sec.
4. The EA inode is written to disk; inode->i_mtime is written as the i_mtime field of ext4_inode.
5. The fs is remounted, or the EA inode is removed from the inode cache.
6. ldiskfs accesses the EA inode by calling ldiskfs_xattr_inode_iget().
7. ldiskfs_iget(sb, ea_ino) reads the EA inode from disk and sets inode->i_mtime (struct timespec) from the on-disk inode field (__le32 type)
using the macro:
#define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \
do { \
	(inode)->xtime.tv_sec = (signed)le32_to_cpu((raw_inode)->xtime); \
	if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \
		ext4_decode_extra_time(&(inode)->xtime, \
				       raw_inode->xtime ## _extra); \
} while (0)
As a result, the highest bit of the inode number, which we know is 1 (ino > 2G), is sign-extended into the high 32 bits of the long int tv_sec field.
8. The comparison of tv_sec (signed long int, with the high 32 bits set) against inode->i_ino (unsigned long, with the high 32 bits clear) fails:
if (ea_inode->i_xattr_inode_parent != parent->i_ino ||
    ea_inode->i_generation != parent->i_generation) {
	ext4_error(parent->i_sb, "Backpointer from EA inode %d "
		   "to parent invalid.", ea_ino);
	*err = -EINVAL;
	goto error;
}
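The sign extension in step 7 and the failed comparison in step 8 can be reproduced in user space. The following is a minimal sketch, not the ldiskfs code: decode_xtime(), backpointer_matches() and backpointer_matches_low32() are hypothetical helpers, int64_t stands in for the kernel's long tv_sec on x86_64, and uint64_t stands in for i_ino. The low-32-bit comparison is only one possible way to make the check robust and is an assumption; it is not necessarily what Zam's patch does.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the tv_sec assignment in EXT4_INODE_GET_XTIME:
 * the 32-bit on-disk value is converted to a signed 32-bit integer and then
 * widened to 64 bits, which sign-extends it when bit 31 is set.
 * (Converting an out-of-range value to int32_t is implementation-defined
 * before C23; common compilers wrap modulo 2^32, as the kernel relies on.)
 */
int64_t decode_xtime(uint32_t raw)
{
	return (int32_t)raw;
}

/*
 * Mirrors the failing check: the widened signed value is converted to
 * unsigned 64-bit and compared against the parent inode number.
 */
int backpointer_matches(int64_t tv_sec, uint64_t parent_ino)
{
	return (uint64_t)tv_sec == parent_ino;
}

/*
 * Assumed workaround for illustration only: compare just the low 32 bits,
 * which is all the on-disk field can hold anyway.
 */
int backpointer_matches_low32(int64_t tv_sec, uint64_t parent_ino)
{
	return (uint32_t)tv_sec == (uint32_t)parent_ino;
}
```

With a hypothetical parent inode number of 2147483649 (2^31 + 1), decode_xtime() returns a negative value whose high 32 bits are all set, so backpointer_matches() reports a mismatch while backpointer_matches_low32() still matches. For inode numbers below 2^31 both checks agree.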
Zam has reproduced the issue and has a patch to fix it, which he will post shortly.
Issue Links
- is related to LU-7781 kernel update [RHEL7.2 3.10.0-327.10.1.el7] (Resolved)