sanity test_253 test_255: ZFS list corruption

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Blocker
    • Lustre 2.10.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/937abbdc-e5e0-11e6-978f-5254006e85c2.

      The sub-test test_253 timed out. Looking at the console log on the server, it appears to be ZFS list corruption:

      00:50:58:[ 4422.775971] WARNING: at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
      00:50:58:[ 4422.775971] list_del corruption. prev->next should be ffffc900031a3010, but was           (null)
      00:50:58:[ 4422.795082] CPU: 0 PID: 32 Comm: kswapd0 Tainted: P           OE  ------------   3.10.0-514.2.2.el7_lustre.x86_64 #1
      00:50:58:[ 4422.815142] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      00:50:58:[ 4422.815142] Call Trace:
      00:50:58:[ 4422.815142]  [<ffffffff81686318>] dump_stack+0x19/0x1b
      00:50:58:[ 4422.815142]  [<ffffffff81085940>] warn_slowpath_common+0x70/0xb0
      00:50:58:[ 4422.815142]  [<ffffffff810859dc>] warn_slowpath_fmt+0x5c/0x80
      00:50:58:[ 4422.815142]  [<ffffffff81333301>] __list_del_entry+0xa1/0xd0
      00:50:58:[ 4422.815142]  [<ffffffff8133333d>] list_del+0xd/0x30
      00:50:58:[ 4422.815142]  [<ffffffffa065cf1d>] __spl_cache_flush+0xed/0x150 [spl]
      00:50:58:[ 4422.815142]  [<ffffffffa065d046>] spl_cache_flush+0x36/0x50 [spl]
      00:50:58:[ 4422.815142]  [<ffffffffa065d71f>] spl_kmem_cache_reap_now+0x10f/0x120 [spl]
      00:50:58:[ 4422.815142]  [<ffffffffa070b3c9>] arc_kmem_reap_now+0x79/0xe0 [zfs]
      00:50:58:[ 4422.815142]  [<ffffffffa0710bb7>] arc_shrinker_func+0x97/0x130 [zfs]
      00:50:58:[ 4422.815142]  [<ffffffff81194213>] shrink_slab+0x163/0x330
      00:50:58:[ 4422.815142]  [<ffffffff811f5361>] ? vmpressure+0x21/0x90
      00:50:58:[ 4422.815142]  [<ffffffff81198001>] balance_pgdat+0x4b1/0x5e0
      00:50:58:[ 4422.815142]  [<ffffffff811982a3>] kswapd+0x173/0x450
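
      For context on the warning above: with list debugging enabled, the kernel checks that an entry's neighbours still point back at it before unlinking it, and the "prev->next should be ..., but was (null)" message means the previous node's next pointer had been overwritten. The following is a minimal userspace sketch of that kind of check, not the kernel's or SPL's actual code; the helper name checked_list_del and the simplified struct are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified doubly linked list node, modeled on the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry immediately after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/*
 * Hypothetical helper (illustration only): unlink entry only if both
 * neighbours still point back at it. Returns false and leaves the list
 * untouched when the links are inconsistent.
 */
static bool checked_list_del(struct list_head *entry)
{
    struct list_head *prev = entry->prev;
    struct list_head *next = entry->next;

    if (prev->next != entry) {
        fprintf(stderr,
                "list_del corruption. prev->next should be %p, but was %p\n",
                (void *)entry, (void *)prev->next);
        return false;
    }
    if (next->prev != entry) {
        fprintf(stderr,
                "list_del corruption. next->prev should be %p, but was %p\n",
                (void *)entry, (void *)next->prev);
        return false;
    }

    /* Links are consistent, so it is safe to splice the entry out. */
    next->prev = prev;
    prev->next = next;
    return true;
}
```

      In this sketch, setting a neighbour's next pointer to NULL before calling checked_list_del() on the entry makes the first check fail, which corresponds to the condition reported in the console log. It says nothing about what overwrote the pointer in the first place.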

      Info required for matching: sanity 253
      Info required for matching: sanity 255

          Activity

            [LU-9097] sanity test_253 test_255: ZFS list corruption

            bzzz Alex Zhuravlev added a comment - this is a duplicate of LU-9110

            adilger Andreas Dilger added a comment - Alex, can you please look into this? It needs to be traced back to when it started happening: whether it is caused by one of your recent patch landings, is related to the update to a newer ZFS release, or is some kind of random memory corruption.

            jgmitter Joseph Gmitter (Inactive) added a comment - Hi Alex, Can you please look into this one? Thanks. Joe

            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: maloo Maloo