performance-sanity subtest test_3: list_add corruption

Details

    • Bug
    • Resolution: Duplicate
    • Blocker
    • None
    • Lustre 2.3.0
    • None
    • 3
    • 6326

    Description

      This issue was created by maloo for yujian <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/d0f9d278-fd9a-11e1-afe5-52540035b04c.

      Info required for matching: performance-sanity 3

      Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/17

      Console log on MDS (fat-intel-2):

      Lustre: DEBUG MARKER: ===== mdsrate-create-small.sh
      ------------[ cut here ]------------
      WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Not tainted)
      Hardware name: X8DTT-H
      list_add corruption. prev->next should be next (ffffc90022c8c01c), but was (null). (prev=ffff8805f43957b8).
      Modules linked in: nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) jbd2 lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core e1000e microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      general
      Initializing cgroup subsys cpuset
      

          Activity

            [LU-1929] performance-sanity subtest test_3: list_add corruption
            yujian Jian Yu added a comment -

            This is fixed in LU-1881.

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/19

            performance-sanity test passed: https://maloo.whamcloud.com/test_sets/2413e27e-fe44-11e1-b4cd-52540035b04c

            yujian Jian Yu added a comment - - edited

            Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/18

            parallel-scale test_compilebench: https://maloo.whamcloud.com/test_sets/6592fc34-fdaa-11e1-a1b4-52540035b04c

            Console log on MDS (fat-intel-2):

            Lustre: DEBUG MARKER: ./compilebench -D /mnt/lustre/d0.compilebench -i 4 -r 4 --makej
            BUG: unable to handle kernel 
            ------------[ cut here ]------------
            WARNING: at lib/list_debug.c:51 list_del+0x8d/0xa0() (Not tainted)
            Hardware name: X8DTT-H
            list_del corruption. next->prev should be ffff880335a9f000, but was (null)
            Modules linked in: cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) jbd2 nfs fscache lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa mlx4_ib ib_mad ib_core mlx4_en mlx4_core e1000e microcode serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
            Pid: 104, comm: events/5 Not tainted 2.6.32-279.5.1.el6_lustre.x86_64 #1
            Call Trace:
             [<ffffffff8106b747>] ? warn_slowpath_common+0x87/0xc0
             [<ffffffff8106b836>] ? warn_slowpath_fmt+0x46/0x50
             [<ffffffff812833bd>] ? list_del+0x8d/0xa0
             [<ffffffff81164008>] ? free_block+0xc8/0x170
             [<ffffffff811642e1>] ? drain_array+0xc1/0x100
             [<ffffffff811652ae>] ? cache_reap+0x8e/0x260
             [<ffffffff810923be>] ? prepare_to_wait+0x4e/0x80
             [<ffffffff81165220>] ? cache_reap+0x0/0x260
             [<ffffffff8108c760>] ? worker_thread+0x170/0x2a0
             [<ffffffff810920d0>] ? autoremove_wake_function+0x0/0x40
             [<ffffffff8108c5f0>] ? worker_thread+0x0/0x2a0
             [<ffffffff81091d66>] ? kthread+0x96/0xa0
             [<ffffffff8100c14a>] ? child_rip+0xa/0x20
             [<ffffffff81091cd0>] ? kthread+0x0/0xa0
            

            Please refer to the above report for more console logs.
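
            For readers unfamiliar with these warnings: both the list_add and list_del messages above come from the kernel's doubly linked list sanity checks, which verify that the neighbouring nodes still point at each other before the list is modified. A NULL where a neighbour pointer is expected usually means the node's memory was freed or overwritten by something else. The following is a minimal userspace sketch of that kind of check, not the kernel source. The names checked_list_add, checked_list_del and simulate_corrupted_add are hypothetical, the message text is modeled on the console logs quoted in this ticket, and, as an assumption of this sketch only, the functions refuse to touch a list that looks corrupted instead of warning and carrying on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Make an empty circular list: the head points at itself. */
void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Insert 'node' between 'prev' and 'next'.
 * Returns 0 on success, -1 if the two neighbours do not point at each
 * other. This sketch leaves a suspicious list unmodified.
 */
int checked_list_add(struct list_head *node, struct list_head *prev,
                     struct list_head *next)
{
    if (next->prev != prev) {
        fprintf(stderr, "list_add corruption. next->prev should be "
                "prev (%p), but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        return -1;
    }
    if (prev->next != next) {
        fprintf(stderr, "list_add corruption. prev->next should be "
                "next (%p), but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        return -1;
    }
    next->prev = node;
    node->next = next;
    node->prev = prev;
    prev->next = node;
    return 0;
}

/*
 * Unlink 'entry' from its list.
 * Returns 0 on success, -1 if either neighbour no longer points back
 * at 'entry'. Assumes entry->prev and entry->next are non-NULL.
 */
int checked_list_del(struct list_head *entry)
{
    if (entry->prev->next != entry) {
        fprintf(stderr, "list_del corruption. prev->next should be %p, "
                "but was %p\n",
                (void *)entry, (void *)entry->prev->next);
        return -1;
    }
    if (entry->next->prev != entry) {
        fprintf(stderr, "list_del corruption. next->prev should be %p, "
                "but was %p\n",
                (void *)entry, (void *)entry->next->prev);
        return -1;
    }
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = NULL;
    entry->prev = NULL;
    return 0;
}

/*
 * Reproduce the list_add symptom: clear prev->next behind the list's
 * back (standing in for a stray write), then try to insert after it.
 * Returns the result of the second, failing insert.
 */
int simulate_corrupted_add(void)
{
    struct list_head head, a, b;

    list_init(&head);
    if (checked_list_add(&a, &head, head.next) != 0)
        return 1; /* unexpected: the list was healthy at this point */

    head.next = NULL; /* simulated corruption */
    return checked_list_add(&b, &head, &a);
}
```

            Calling simulate_corrupted_add() from a small main() prints a "prev->next should be next" message of the same shape as the one in the description and returns -1. The check only detects that a neighbour pointer changed. It cannot say which code overwrote it, which is why a trace like the one above gives little information about the root cause.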

            laisiyao Lai Siyao added a comment -

            In my previous testing, this test was quite likely to trigger LU-1881. This crash doesn't give much information, but it is quite possibly the same issue. IMO it's better to retest performance-sanity after the LU-1881 fix is merged.

            pjones Peter Jones added a comment -

            Lai

            It seems that it might be best to understand this failure first.

            Peter


            People

              laisiyao Lai Siyao
              maloo Maloo