[LU-11535] Memory corruption by ldiskfs_ext_remove_space slab-256 Created: 17/Oct/18  Updated: 29/Oct/18  Resolved: 29/Oct/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Critical
Reporter: Artem Blagodarenko (Inactive) Assignee: Artem Blagodarenko (Inactive)
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The failure occurred with the following oops:

[27870519.051376] BUG: unable to handle kernel NULL pointer dereference at (null)
[27870519.058800] IP: [<ffffffffa067cac9>] lu_device_put+0x9/0x50 [obdclass]
[27870519.065736] PGD 59d964067 PUD e912b8067 PMD 0 
[27870519.070586] Oops: 0000 [#1] SMP 
[27870519.074201] last sysfs file: /sys/module/ipv6/initstate
[27870519.079770] CPU 7 
…
[27870519.176276] Pid: 92764, comm: jbd2/md141 Tainted: P           ---------------    2.6.32-431.17.1.x2.0.87.x86_64 #1 Seagate SATI-TL/Type2 - Board Product Sati2
[27870519.190999] RIP: 0010:[<ffffffffa067cac9>]  [<ffffffffa067cac9>] lu_device_put+0x9/0x50 [obdclass]
[....
[27870519.282591] Process jbd2/md141 (pid: 92764, threadinfo ffff8805c6f3e000, task ffff880e833a8ae0)
[27870519.291704] Stack:
[27870519.294056]  ffff8805c6f3fcf0 ffffffffa0f25acb ffff8805c6f3fcd0 ffff880592d0d4e8
[27870519.301689] <d> ffff880f43897a98 0000000000000000 ffff880c18808800 0000000000f36fcb
[27870519.309871] <d> ffff8805c6f3fd20 ffffffffa0ecc8e1 ffff8809650ddb9c ffff880f438979c0
[27870519.318320] Call Trace:
[27870519.321125]  [<ffffffffa0f25acb>] osd_trans_commit_cb+0xcb/0x2b0 [osd_ldiskfs]
[27870519.328780]  [<ffffffffa0ecc8e1>] ldiskfs_journal_commit_callback+0x61/0x80 [ldiskfs]
[27870519.337036]  [<ffffffffa03eb8ef>] jbd2_journal_commit_transaction+0x116f/0x15a0 [jbd2]

The transaction was allocated from slab-256. The slab element immediately before the transaction holds an ldiskfs extent path; the function that was executing on it is ldiskfs_ext_remove_space().
The bug is in the while loop where bread is called (the read_extent_tree_block() call below).

        depth = ext_depth(inode);
        if (path) {
                int k = i = depth;
                while (--k > 0)
                        path[k].p_block =
                                le16_to_cpu(path[k].p_hdr->eh_entries)+1;
        } else {
                path = kzalloc(sizeof(struct ldiskfs_ext_path) *
                               LDISKFS_SB(inode->i_sb)->s_max_ext_tree_depth,
                               GFP_NOFS);
                if (path == NULL) {
                        ldiskfs_journal_stop(handle);
                        return -ENOMEM;
                }
                path[0].p_depth = depth;
                path[0].p_hdr = ext_inode_hdr(inode);
                i = 0;
 
                if (ldiskfs_ext_check(inode, path[0].p_hdr, depth, 0)) {
                        err = -EIO;
                        goto out;
                }
        }
        err = 0;
 
        while (i >= 0 && err == 0) {
                if (i == depth) {
                        /* this is leaf block */
                        err = ldiskfs_ext_rm_leaf(handle, inode, path,
                                               &partial_cluster, start,
                                               end);
                        /* root level has p_bh == NULL, brelse() eats this */
                        brelse(path[i].p_bh);
                        path[i].p_bh = NULL;
                        i--;
                        continue;
                }
...
                        memset(path + i + 1, 0, sizeof(*path));
                        bh = read_extent_tree_block(inode,
                                ldiskfs_idx_pblock(path[i].p_idx), depth - i - 1,
                                LDISKFS_EX_NOCACHE);
                        if (IS_ERR(bh)) {
                                /* should we reset i_size? */
                                err = PTR_ERR(bh);
                                break;
                        }

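The allocation holds s_max_ext_tree_depth elements. The iteration index starts at 0 and is compared with depth (the number of elements), so
memset(path + i + 1, 0, sizeof(*path));
can zero memory outside the allocation: when i reaches depth - 1, the call clears path[depth], which is one element past the end of an array of depth elements. The depth is 5 in the vmcore.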
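
The index arithmetic above can be checked with a small user-space sketch. This is only a model written for illustration, not the ldiskfs code: the helper names highest_path_index() and path_array_fits() are hypothetical, and the sketch assumes only what the excerpt shows, namely that i runs from 0 to depth and that every non-leaf level clears path[i + 1].

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the walk in ldiskfs_ext_remove_space(): at every
 * non-leaf level (i < depth) the loop clears path[i + 1] before reading
 * the child block. Returns the highest path[] index the walk touches.
 */
static size_t highest_path_index(size_t depth)
{
        size_t highest = 0;     /* path[0] (root) is always used */
        size_t i;

        for (i = 0; i < depth; i++) {
                /* models memset(path + i + 1, 0, sizeof(*path)) */
                if (i + 1 > highest)
                        highest = i + 1;
        }
        return highest;         /* equals depth */
}

/* Returns 1 if an array of nr_elems entries is large enough for the walk. */
static int path_array_fits(size_t nr_elems, size_t depth)
{
        assert(nr_elems > 0);
        return highest_path_index(depth) < nr_elems;
}
```

With depth = 5, as seen in the vmcore, highest_path_index(5) is 5, so path_array_fits(5, 5) is 0 and path_array_fits(6, 5) is 1. In this model the walk needs depth + 1 elements, which matches the idea of allocating one extra ldiskfs_ext_path for the root.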



 Comments   
Comment by Gerrit Updater [ 17/Oct/18 ]

Artem Blagodarenko (c17828@cray.com) uploaded a new patch: https://review.whamcloud.com/33388
Subject: LU-11535 ldiskfs: allocate extra ldiskfs_ext_path for root
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a1f4ee2715a0a5c8a46e3d3ea8cafb6ef5bc12a6

Comment by Gerrit Updater [ 29/Oct/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33388/
Subject: LU-11535 ldiskfs: allocate extra ldiskfs_ext_path for root
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 7231a4d0d2661ddd81a2296064404529cb87605a

Comment by Peter Jones [ 29/Oct/18 ]

Landed for 2.12
