
[LU-10056] sanity test_60a invokes oom-killer in subtest 7f and times out

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.11.0
    • Labels: None
    • Severity: 3

    Description

      Console log on MDS:

       Lustre: 32199:0:(llog_test.c:1018:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767
       Lustre: 32199:0:(llog_test.c:1124:llog_test_7()) 7f: test llog_changelog_user_rec
       sssd_ssh invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
       sssd_ssh cpuset=/ mems_allowed=0
       CPU: 0 PID: 665 Comm: sssd_ssh Tainted: P           OE  ------------   3.10.0-693.1.1.el7_lustre.x86_64 #1
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
        ffff88003690dee0 000000007415b4cd ffff88007a0b39f0 ffffffff816a3d6d
        ffff88007a0b3a80 ffffffff8169f186 ffff88007a0b3ae8 ffff88007a0b3a40
        ffffffff816b04dc ffffffff81a6ea00 0000000000000000 0000000000000000
       Call Trace:
        [<ffffffff816a3d6d>] dump_stack+0x19/0x1b
        [<ffffffff8169f186>] dump_header+0x90/0x229
        [<ffffffff816b04dc>] ? notifier_call_chain+0x4c/0x70
        [<ffffffff810b6ab8>] ? __blocking_notifier_call_chain+0x58/0x70
        [<ffffffff8118653e>] check_panic_on_oom+0x2e/0x60
        [<ffffffff8118695b>] out_of_memory+0x23b/0x4f0
        [<ffffffff8169fc8a>] __alloc_pages_slowpath+0x5d6/0x724
        [<ffffffff8118cd85>] __alloc_pages_nodemask+0x405/0x420
        [<ffffffff811d412f>] alloc_pages_vma+0xaf/0x1f0
        [<ffffffff811c3830>] ? end_swap_bio_write+0x80/0x80
        [<ffffffff811c453d>] read_swap_cache_async+0xed/0x160
        [<ffffffff811c4658>] swapin_readahead+0xa8/0x110
        [<ffffffff811b235b>] handle_mm_fault+0xadb/0xfa0
        [<ffffffff8109ea4c>] ? signal_setup_done+0x3c/0x60
        [<ffffffff816affb4>] __do_page_fault+0x154/0x450
        [<ffffffff816b02e5>] do_page_fault+0x35/0x90
        [<ffffffff816ac508>] page_fault+0x28/0x30
      

      Maloo reports:
      https://testing.hpdd.intel.com/test_sessions/2a0ab571-1893-4650-bdbf-4b24c85e8367
      https://testing.hpdd.intel.com/test_sessions/9e63bf6c-094f-4c49-8823-de2ad48b5302


          Activity

            bogl Bob Glossman (Inactive) added a comment - another on b2_10: https://testing.hpdd.intel.com/test_sets/975629ee-0385-11e8-a7cd-52540065bddc


            adilger Andreas Dilger added a comment - This has been hit a few more times in the past 4 weeks.

            bfaccini Bruno Faccini (Inactive) added a comment - +1 at https://testing.hpdd.intel.com/test_sets/68e44f30-b87d-11e7-9abd-52540065bddc

            I have done some debugging on the associated MDS crash dump from the OOM. It looks like, once again, the kmalloc-512 kmem_cache slabs consume almost all available memory (>1.2GB of 1.6GB), as in LU-7329 and LU-7883, but this time at the earlier llog_test_7() step rather than at llog_test_10().

            Could it be that something in the auto-test VM/OS/daemon configuration has changed? If so, we may need to apply the same fix (dt_sync() calls to flush journal callbacks) to the earlier llog_test sub-tests, not only to llog_test_10(). A minimal sketch of that idea follows.
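            The snippet below is only a hedged illustration of that suggestion, not the actual LU-10056 patch. It wraps one test-7 sub-pass with the same dt_sync() flush that the LU-7329/LU-7883 fix added to llog_test_10(); the helper name llog_test_7_sub_synced, the way the dt_device is passed in, and the llog_test_7_sub() signature are assumptions for illustration.

               /*
                * Hypothetical sketch (not the actual patch): run one test-7
                * sub-pass, then force a journal flush so commit callbacks can
                * release their kmalloc-512 allocations before the next pass
                * starts filling memory again.
                */
               static int llog_test_7_sub_synced(const struct lu_env *env,
                                                 struct llog_ctxt *ctxt,
                                                 struct dt_device *dt)
               {
                       int rc;

                       rc = llog_test_7_sub(env, ctxt); /* existing sub-test */
                       if (rc != 0)
                               return rc;

                       /* Same idea as the llog_test_10() fix: dt_sync() waits
                        * for the journal to commit, which runs the commit
                        * callbacks that free the per-record allocations. */
                       return dt_sync(env, dt);
               }

            The point of the sync is ordering: the OOM happens because each 7_* pass allocates commit-callback state faster than the journal retires it, so draining the journal between passes bounds the kmalloc-512 footprint.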

            People

              Assignee: wc-triage WC Triage
              Reporter: yujian Jian Yu
              Votes: 0
              Watchers: 4
