Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-5753

fix tid wraparound problem for rhel5

    XMLWordPrintable

Details

    • Bug
    • Resolution: Won't Fix
    • Minor
    • None
    • Lustre 2.1.6
    • RHEL5 kernel
    • 3
    • 16149

    Description

      Under RHEL5, we hit a kernel bug like following:

      kernel BUG at fs/jbd2/commit.c:395
      invalid opcode: 0000 [1] SMP
      last sysfs file: /devices/pci0000:00/0000:00:09.0/0000:04:00.0/host2/target2:0:0/2:0:0:27/timeout
      CPU 9
      Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) exportfs ost(U) mgc(U) ldiskfs(U) jbd2 crc16 lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs nfs_acl sg autofs4 ib_srp(U) lockd sunrpc rdma_ucm(U) ib_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6 xfrm_nalgo crypto_api ib_uverbs(U) ib_umad(U) iw_nes(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) loop dm_mirror dm_round_robin dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport mlx4_en(U) mlx4_core(U) i7core_edac tpm_tis bnx2 hpilo tpm edac_mc tpm_bios serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
      Pid: 13516, comm: jbd2/dm-14-8 Tainted: G -------------------- 2.6.18-348.6.1.el5_lustre #1
      RIP: 0010:[<ffffffff88cadf98>] [<ffffffff88cadf98>] :jbd2:jbd2_journal_commit_transaction+0x50/0x1120
      RSP: 0018:ffff810289fcbd70 EFLAGS: 00010246
      RAX: 0000000000000008 RBX: ffff8105d7611800 RCX: 0000000000000246
      RDX: 0000000000000008 RSI: 0000000000000286 RDI: 0000000000000000
      RBP: ffff8105d7611800 R08: ffff8105d7611898 R09: 0000000000000001
      R10: 0000000000000000 R11: 0000000000000280 R12: ffff8105d7611824
      R13: ffff8105d76118c0 R14: 0000000000000000 R15: ffff8105d7611890
      FS: 0000000000000000(0000) GS:ffff81061fe5dc40(0000) knlGS:0000000000000000
      CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 000000000928d2c8 CR3: 000000061db55000 CR4: 00000000000006a0
      Process jbd2/dm-14-8 (pid: 13516, threadinfo ffff810289fca000, task ffff8105ec0f27f0)
      Stack: ffff8105e4eb6000 00000fdcec0f2828 0000000800000000 00000001ffffffff
      0000000000000001 0000000084552f72 0000000000000000 00000000000000e4
      0000000000000000 0000000000000000 0000000000000000 0000000000000001
      Call Trace:
      [<ffffffff8003dddd>] lock_timer_base+0x1b/0x3c
      [<ffffffff8004b384>] try_to_del_timer_sync+0x7f/0x88
      [<ffffffff88cb223e>] :jbd2:kjournald2+0x9a/0x1ec
      [<ffffffff800a3cdf>] autoremove_wake_function+0x0/0x2e
      [<ffffffff88cb21a4>] :jbd2:kjournald2+0x0/0x1ec
      [<ffffffff800a3ac7>] keventd_create_kthread+0x0/0xc4
      [<ffffffff80032c4f>] kthread+0xfe/0x132
      [<ffffffff8005dfc1>] child_rip+0xa/0x11
      [<ffffffff800a3ac7>] keventd_create_kthread+0x0/0xc4
      [<ffffffff80032b51>] kthread+0x0/0x132
      [<ffffffff8005dfb7>] child_rip+0x0/0x11

      {no format}

      it could be reproduced by following steps will make rhel5 kernel crash.
      Generating a big enough sequence id.

      [root@localhost ~]# dumpe2fs sequence_image1 | grep sequence
      dumpe2fs 1.42.9.wc1 (24-Feb-2014)
      Journal sequence: 0xb2a5df5c

      Then running following program under Ext4 filesystem:
      [root@promote mnt]# cat test_fsync.c
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <fcntl.h>

      int main(int argc, char **argv)
      {

      int ret;

      int fd = open(argv[1], O_RDWR | O_CREAT);
      if (fd < 0)
      return 1;

      ret = fdatasync(fd);
      if (ret < 0)

      { printf("we failed here?\n"); goto failed; }

      failed:
      close(fd);
      return 0;
      }

      1. gcc test_fsync.c
        #./a.out new_file ---->kernel will crash

      Since @i_sync_tid and @i_datasync_tid is not properly assigned, it is initialized to zero.
      these value is passed into ext4_sync_file().

      the problem is 0 is a special value, tid_geq() will think 0 is a later commit..this will cause problems here.

      This problem did not exist for rhel6, we need backport this patch for Lustre rhel5 series.

      I confirm following patch could fix the problem:
      commit 688f869ce3bdc892daa993534dc6df18c95df931
      Author: Theodore Ts'o <tytso@mit.edu>
      Date: Wed Mar 16 17:16:31 2011 -0400

      ext4: Initialize fsync transaction ids in ext4_new_inode()

      When allocating a new inode, we need to make sure i_sync_tid and
      i_datasync_tid are initialized. Otherwise, one or both of these two
      values could be left initialized to zero, which could potentially
      result in BUG_ON in jbd2_journal_commit_transaction.

      (This could happen by having journal->commit_request getting set to
      zero, which could wake up the kjournald process even though there is
      no running transaction, which then causes a BUG_ON via the
      J_ASSERT(j_ruinning_transaction != NULL) statement.

      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

      diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
      index 2fd3b0e..a679a48 100644
      — a/fs/ext4/ialloc.c
      +++ b/fs/ext4/ialloc.c
      @@ -1054,6 +1054,11 @@ got:
      }
      }

      + if (ext4_handle_valid(handle))

      { + ei->i_sync_tid = handle->h_transaction->t_tid; + ei->i_datasync_tid = handle->h_transaction->t_tid; + }

      +
      err = ext4_mark_inode_dirty(handle, inode);
      if (err) {
      ext4_std_error(sb, err);

      Attachments

        Activity

          People

            ys Yang Sheng
            wangshilong Wang Shilong (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: