Lustre / LU-3102

kernel BUG at fs/jbd2/transaction.c:1033


Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • None
    • Lustre 2.4.0
    • None
    • wide striping

    • 3
    • 7542

    Description

      While testing wide striping with 2 OSSes and 200 OSTs per OSS (400 OSTs total), I ran the following loop. Note that I set the stripe count as high as 500, which is more than the 400 total OSTs.

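      # Endlessly recreate 500 directories, each with a larger default stripe count.
      while true; do
          rm -rf "/mnt/lustre/dir$(hostname)"
          for i in $(seq 1 500); do
              DIR="/mnt/lustre/dir$(hostname)/dir$i"
              mkdir -p "$DIR"
              # Set the default stripe count for new files created in $DIR.
              lfs setstripe -c "$i" "$DIR"
              touch "$DIR/lustre"
              ls -l "$DIR/lustre"
              lfs getstripe "$DIR" > /dev/null
          done
      done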
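
      As a rough check of the numbers in the description, the sketch below works out how many iterations of the loop request a stripe count larger than the number of OSTs. The helper name over_striped_count is hypothetical and not part of the original report; it only does the arithmetic and does not touch a Lustre filesystem. How Lustre treats those oversized requests is not covered by this report.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not from the report).
# Usage: over_striped_count OSTS_PER_OSS OSS_COUNT MAX_STRIPE_COUNT
# Prints how many stripe counts in 1..MAX_STRIPE_COUNT exceed the OST total.
over_striped_count() {
    total_osts=$(( $1 * $2 ))
    if [ "$3" -gt "$total_osts" ]; then
        echo $(( $3 - total_osts ))
    else
        echo 0
    fi
}

# 200 OSTs per OSS, 2 OSSes, loop runs stripe counts 1..500
over_striped_count 200 2 500   # prints 100
```

      With the values from the description, iterations 401 through 500 ask for more stripes than there are OSTs.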

      -----------[ cut here ]-----------
      kernel BUG at fs/jbd2/transaction.c:1033!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:1a.1/usb4/4-1/speed
      CPU 1
      Modules linked in: osp(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) ldiskfs(U) jbd2 nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa igb mlx4_ib ib_mad ib_core mlx4_en mlx4_core microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

      Pid: 2344, comm: mdt00_004 Not tainted 2.6.32-279.19.1.el6_lustre.gc4681d8.x86_64 #1 Supermicro X8DTT/X8DTT
      RIP: 0010:[<ffffffffa040c86d>] [<ffffffffa040c86d>] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
      RSP: 0018:ffff88020aad75c0 EFLAGS: 00010246
      RAX: ffff8801be71ac80 RBX: ffff8801fab4b978 RCX: ffff88020c1e5610
      RDX: 0000000000000000 RSI: ffff88020c1e5610 RDI: 0000000000000000
      RBP: ffff88020aad75e0 R08: ffff88020c1e5610 R09: fdd5e22cf78d8402
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff880241019a58
      R13: ffff88020c1e5610 R14: ffff880215f16800 R15: 0000000000000000
      FS: 00007f0d793f1700(0000) GS:ffff880032e20000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007fff56043e40 CR3: 000000032e70c000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process mdt00_004 (pid: 2344, threadinfo ffff88020aad6000, task ffff88020aae5500)
      Stack:
      ffff8801fab4b978 ffffffffa0472a50 ffff88020c1e5610 0000000000000000
      <d> ffff88020aad7620 ffffffffa04321bb ffffffffa0472a30 ffff8801fab4b978
      <d> ffff88032b127000 ffff8801e9029c50 ffff8801e9029b80 ffff88020aad76a0
      Call Trace:
      [<ffffffffa04321bb>] __ldiskfs_handle_dirty_metadata+0x7b/0x100 [ldiskfs]
      [<ffffffffa043cd22>] ldiskfs_mark_iloc_dirty+0x332/0x5b0 [ldiskfs]
      [<ffffffffa043e4a3>] ldiskfs_mark_inode_dirty+0x83/0x1f0 [ldiskfs]
      [<ffffffffa043fb90>] ldiskfs_dirty_inode+0x40/0x60 [ldiskfs]
      [<ffffffffa0d153f7>] osd_ldiskfs_write_record+0x2d7/0x330 [osd_ldiskfs]
      [<ffffffffa0d16058>] osd_write+0x148/0x2a0 [osd_ldiskfs]
      [<ffffffffa0623eb5>] dt_record_write+0x45/0x130 [obdclass]
      [<ffffffffa0c32cd0>] ? md_capainfo+0x20/0x30 [mdd]
      [<ffffffffa05f088e>] llog_osd_write_blob+0x2fe/0x730 [obdclass]
      [<ffffffffa05f42c1>] llog_osd_write_rec+0x821/0x1200 [obdclass]
      [<ffffffffa0d0f9c5>] ? iam_path_fini+0x25/0x30 [osd_ldiskfs]
      [<ffffffffa05c03b8>] llog_write_rec+0xc8/0x290 [obdclass]
      [<ffffffffa05c857d>] llog_cat_add_rec+0xad/0x480 [obdclass]
      [<ffffffffa05c01b1>] llog_add+0x91/0x1d0 [obdclass]
      [<ffffffffa0f1fef7>] osp_sync_add_rec+0x247/0x8a0 [osp]
      [<ffffffffa0d0d0cf>] ? osd_oi_delete+0x2af/0x4b0 [osd_ldiskfs]
      [<ffffffffa0f205fb>] osp_sync_add+0x7b/0x80 [osp]
      [<ffffffffa0f144e6>] osp_object_destroy+0x106/0x150 [osp]
      [<ffffffffa0ed0347>] lod_object_destroy+0x1a7/0x350 [lod]
      [<ffffffffa0c2ba19>] mdd_finish_unlink+0x229/0x380 [mdd]
      [<ffffffffa0c2e748>] mdd_unlink+0x9c8/0xe20 [mdd]
      [<ffffffffa0e19a18>] mdo_unlink+0x18/0x50 [mdt]
      [<ffffffffa0e1ccd9>] mdt_reint_unlink+0x739/0xfd0 [mdt]
      [<ffffffffa0e196d1>] mdt_reint_rec+0x41/0xe0 [mdt]
      [<ffffffffa0e12d53>] mdt_reint_internal+0x4e3/0x7d0 [mdt]
      [<ffffffffa0e13084>] mdt_reint+0x44/0xe0 [mdt]
      [<ffffffffa0e01078>] mdt_handle_common+0x648/0x1660 [mdt]
      [<ffffffffa0e3d125>] mds_regular_handle+0x15/0x20 [mdt]
      [<ffffffffa07b51dc>] ptlrpc_server_handle_request+0x41c/0xdf0 [ptlrpc]
      [<ffffffffa04b45de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
      [<ffffffffa04c5d8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      [<ffffffffa07ac819>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
      [<ffffffff81052223>] ? __wake_up+0x53/0x70
      [<ffffffffa07b6725>] ptlrpc_main+0xb75/0x1870 [ptlrpc]
      [<ffffffffa07b5bb0>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      [<ffffffff8100c0ca>] child_rip+0xa/0x20
      [<ffffffffa07b5bb0>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      [<ffffffffa07b5bb0>] ? ptlrpc_main+0x0/0x1870 [ptlrpc]
      [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
      Code: c6 9c 03 00 00 4c 89 f7 e8 a1 ff 0d e1 48 8b 33 ba 01 00 00 00 4c 89 e7 e8 11 ec ff ff 4c 89 f0 66 ff 00 66 66 90 e9 73 ff ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b 66 0f 1f 84 00 00 00 00 00 eb f5
      RIP [<ffffffffa040c86d>] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
      RSP <ffff88020aad75c0>
      Initializing cgroup subsys cpuset
      Initializing cgroup subsys cpu
      Linux version 2.6.32-279.19.1.el6_lustre.gc4681d8.x86_64 (jenkins@builder-1-sde1-el6-x8664.lab.whamcloud.com) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Wed Mar 6 18:02:10 PST 2013
      Command line: ro root=UUID=7bb27ca3-4652-470c-b9ba-d628c22b7754 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD console=ttyS0,115200 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off memmap=exactmap memmap=563K@64K memmap=131513K@49715K elfcorehdr=181228K memmap=64K$0K memmap=13K$627K memmap=104K$920K memmap=8K$3136952K memmap=56K#3136960K memmap=328K#3137016K memmap=64K$3137344K memmap=8272K$3137456K memmap=262144K$3670016K memmap=4K$4175872K memmap=4096K$4190208K
      KERNEL supported cpus:
      Intel GenuineIntel
      AMD AuthenticAMD

            People

              bobijam Zhenyu Xu
              mdiep Minh Diep
