LU-2521

sanity test 60a crash

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.4.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 5937

    Description

      I have been hitting hangs and crashes in sanity test 60a for quite a while and assumed they were OOM-related, but today it happened on a box with more memory, and it crashed like this:

      [500819.731843] Lustre: 14303:0:(llog-test.c:864:llog_test_7()) 7e: test llog_changelog_rec
      [500821.282423] BUG: unable to handle kernel paging request at ffff880011cfced0
      [500821.282912] IP: [<ffffffffa06e45ce>] ldiskfs_journal_commit_callback+0x6e/0xc0 [ldiskfs]
      [500821.283618] PGD 1a26063 PUD 1a2a063 PMD 18e067 PTE 11cfc160
      [500821.284042] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
      [500821.284422] last sysfs file: /sys/devices/virtual/block/loop6/queue/scheduler
      [500821.284517] CPU 1 
      [500821.284517] Modules linked in: llog_test lustre ofd osp lod ost mdt osd_ldiskfs fsfilt_ldiskfs ldiskfs mdd mgs lquota obdecho mgc lov osc mdc lmv fid fld ptlrpc obdclass lvfs ksocklnd lnet libcfs ext2 exportfs jbd sha512_generic sha256_generic ext4 mbcache jbd2 virtio_balloon virtio_console i2c_piix4 i2c_core virtio_blk virtio_net virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod nfs lockd fscache nfs_acl auth_rpcgss sunrpc be2iscsi bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: libcfs]
      [500821.284517] 
      [500821.284517] Pid: 24793, comm: jbd2/loop3-8 Not tainted 2.6.32-debug #6 Bochs Bochs
      [500821.284517] RIP: 0010:[<ffffffffa06e45ce>]  [<ffffffffa06e45ce>] ldiskfs_journal_commit_callback+0x6e/0xc0 [ldiskfs]
      [500821.284517] RSP: 0018:ffff88002190fcd0  EFLAGS: 00010202
      [500821.284517] RAX: ffff88001d951f40 RBX: ffff88001d951f40 RCX: ffff880011cfced0
      [500821.284517] RDX: ffff88006337ff40 RSI: 000000001d953160 RDI: ffff8800a209cb60
      [500821.284517] RBP: ffff88002190fd10 R08: 0000000000000001 R09: ffff880000000000
      [500821.284517] R10: ffff880023d28000 R11: 0000000087654321 R12: ffff88006337ff40
      [500821.284517] R13: ffff8800a209cb60 R14: ffff8800419b0bf0 R15: ffff880011cfced0
      [500821.284517] FS:  0000000000000000(0000) GS:ffff880006280000(0000) knlGS:0000000000000000
      [500821.284517] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      [500821.284517] CR2: ffff880011cfced0 CR3: 0000000001a25000 CR4: 00000000000006e0
      [500821.284517] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [500821.284517] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [500821.284517] Process jbd2/loop3-8 (pid: 24793, threadinfo ffff88002190e000, task ffff8800203a4180)
      [500821.284517] Stack:
      [500821.284517]  ffff88002190fce0 00000000810376d9 ffff88002190fd10 ffff8800256b2c58
      [500821.284517] <d> ffff880011cfcdf0 ffff8800256b27f0 0000000000000000 00000000000000fc
      [500821.284517] <d> ffff88002190fe50 ffffffffa0391d37 ffff88002190fd80 ffffffff81009310
      [500821.284517] Call Trace:
      [500821.284517]  [<ffffffffa0391d37>] jbd2_journal_commit_transaction+0x13d7/0x16e0 [jbd2]
      [500821.284517]  [<ffffffff81009310>] ? __switch_to+0xd0/0x320
      [500821.284517]  [<ffffffff8107c65b>] ? try_to_del_timer_sync+0x7b/0xe0
      [500821.284517]  [<ffffffffa0397627>] kjournald2+0xb7/0x210 [jbd2]
      [500821.284517]  [<ffffffff8108fd60>] ? autoremove_wake_function+0x0/0x40
      [500821.284517]  [<ffffffffa0397570>] ? kjournald2+0x0/0x210 [jbd2]
      [500821.284517]  [<ffffffff8108fa16>] kthread+0x96/0xa0
      [500821.284517]  [<ffffffff8100c14a>] child_rip+0xa/0x20
      [500821.284517]  [<ffffffff8108f980>] ? kthread+0x0/0xa0
      [500821.284517]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
      [500821.284517] Code: 00 00 00 49 81 c7 e0 00 00 00 4c 39 fb 4c 8b 23 48 89 d8 74 48 4c 89 e2 eb 06 0f 1f 00 49 89 d4 48 8b 4b 08 4c 89 ef 48 89 4a 08 <48> 89 11 48 89 03 48 89 43 08 e8 53 69 e1 e0 8b 55 cc 48 89 de 
      

      I have a crashdump with modules in /exports/crashdumps/192.168.10.219-2012-12-22-07:51:12
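
      As an aid to reading the trace, the value in the "Oops: 0002" line is the x86 page-fault error code, printed in hex. The sketch below decodes its low five bits using the architectural bit meanings; the function name decode_page_fault_error_code is hypothetical and exists only for this illustration. For 0002 it reports a kernel-mode write to a non-present page. With DEBUG_PAGEALLOC enabled that is consistent with a write to memory that had already been freed, but it does not prove it.

```python
def decode_page_fault_error_code(code: int) -> dict:
    """Decode the low five bits of an x86 page-fault error code.

    Hypothetical helper for illustration; the bit layout is the
    architectural one (P, W/R, U/S, RSVD, I/D).
    """
    return {
        "page_present": bool(code & 0x01),       # 0 = page not present, 1 = protection violation
        "write_access": bool(code & 0x02),       # 0 = read, 1 = write
        "user_mode": bool(code & 0x04),          # 0 = kernel mode, 1 = user mode
        "reserved_bit_set": bool(code & 0x08),   # reserved bit set in a paging entry
        "instruction_fetch": bool(code & 0x10),  # fault happened on an instruction fetch
    }


# The oops line prints the error code in hex: "Oops: 0002"
oops_code = int("0002", 16)
decoded = decode_page_fault_error_code(oops_code)

for name, value in decoded.items():
    print(f"{name}: {value}")
```

      The faulting address in CR2 (ffff880011cfced0) matches RCX and R15 in the register dump, so the write went through a pointer held in those registers.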

          People

            Assignee: wc-triage WC Triage
            Reporter: green Oleg Drokin