Details

    • Bug
    • Resolution: Duplicate
    • Minor
    • Lustre 2.8.0
    • Lustre 2.8.0
    • 3
    • 6,380
    • 17628

    Description

      This issue was created by maloo for wangdi <di.wang@intel.com>

      This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/61eb856e-bd39-11e4-8d85-5254006e85c2.

      The sub-test test_4 failed with the following error:

      test failed to respond and timed out
      

      Please provide additional information about the failure here.

      Info required for matching: sanity-lfsck 4

          Activity

            [LU-6295] sanity-lfsck test_4: oom on MDT0
            di.wang Di Wang added a comment -

            Duplicate of LU-6380.


            yong.fan nasf (Inactive) added a comment -

            But what will happen if such a FID mapping is really corrupted in the real world?

            di.wang Di Wang added a comment - - edited

            This is because the injected failure makes recovery hang, so it cannot finish.

            20:26:51:Lustre: lustre-MDT0000-osd: the OI mapping for the FID [0x200000009:0x0:0x0] become inconsistent, the given ID 111/4224599767, the ID in OI mapping 111/111
            20:26:51:LustreError: 11956:0:(lod_sub_object.c:903:lod_sub_prep_llog()) lustre-MDT0000-mdtlov: can't get id from catalogs: rc = -78
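
The return code in the LustreError line above can be decoded mechanically. The following is a minimal sketch, assuming that `rc` is a negated Linux errno value (the usual kernel convention) and that 78 is `EREMCHG` in the Linux generic errno numbering. The helper name `parse_lustre_error` and the regular expression are hypothetical and written only for this one line format; they are not part of any Lustre tool.

```python
import re

# Linux errno numbering (asm-generic); only the code seen in the log is listed.
# Assumption: Lustre reports rc as a negated Linux errno.
LINUX_ERRNO_NAMES = {78: "EREMCHG"}

LOG_LINE = (
    "20:26:51:LustreError: 11956:0:(lod_sub_object.c:903:lod_sub_prep_llog()) "
    "lustre-MDT0000-mdtlov: can't get id from catalogs: rc = -78"
)

# Matches "(file:line:func()) device: message: rc = N" at the end of a line.
PATTERN = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<device>[\w-]+): (?P<msg>.*): rc = (?P<rc>-?\d+)$"
)


def parse_lustre_error(line):
    """Return the source location, device, message and rc of an error line."""
    match = PATTERN.search(line)
    if match is None:
        return None
    rc = int(match.group("rc"))
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "func": match.group("func"),
        "device": match.group("device"),
        "msg": match.group("msg"),
        "rc": rc,
        # Unknown codes are reported as-is rather than guessed.
        "errno_name": LINUX_ERRNO_NAMES.get(-rc, "unknown"),
    }


parsed = parse_lustre_error(LOG_LINE)
print(parsed["func"], parsed["rc"], parsed["errno_name"])
```

Under those assumptions, the error raised in `lod_sub_prep_llog()` decodes to `EREMCHG`, which fits the OI mapping inconsistency reported for the same FID one line earlier.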
            
            di.wang Di Wang added a comment -

            16:30:47:Mem-Info:
            16:30:47:Node 0 DMA per-cpu:
            16:30:47:CPU 0: hi: 0, btch: 1 usd: 0
            16:30:47:CPU 1: hi: 0, btch: 1 usd: 0
            16:30:47:Node 0 DMA32 per-cpu:
            16:30:47:CPU 0: hi: 186, btch: 31 usd: 30
            16:30:47:CPU 1: hi: 186, btch: 31 usd: 0
            16:30:47:active_anon:1002 inactive_anon:1020 isolated_anon:0
            16:30:47: active_file:30 inactive_file:93 isolated_file:0
            16:30:47: unevictable:0 dirty:0 writeback:1054 unstable:0
            16:30:47: free:13256 slab_reclaimable:1615 slab_unreclaimable:10739
            16:30:47: mapped:26 shmem:13 pagetables:539 bounce:0
            16:30:47:Node 0 DMA free:8336kB min:332kB low:412kB high:496kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15348kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:24kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
            16:30:47:lowmem_reserve[]: 0 2004 2004 2004
            16:30:47:Node 0 DMA32 free:44688kB min:44720kB low:55900kB high:67080kB active_anon:4008kB inactive_anon:4080kB active_file:120kB inactive_file:372kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2052308kB mlocked:0kB dirty:0kB writeback:4216kB mapped:104kB shmem:52kB slab_reclaimable:6460kB slab_unreclaimable:42932kB kernel_stack:1672kB pagetables:2156kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2304 all_unreclaimable? no
            16:30:47:lowmem_reserve[]: 0 0 0 0
            16:30:47:Node 0 DMA: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 2*2048kB 1*4096kB = 8336kB
            16:30:47:Node 0 DMA32: 2304*4kB 1172*8kB 445*16kB 167*32kB 77*64kB 18*128kB 3*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 44688kB
            16:30:47:1189 total pagecache pages
            16:30:47:1052 pages in swap cache
            16:30:47:Swap cache stats: add 2969, delete 1917, find 29/40
            16:30:47:Free swap = 4117292kB
            16:30:47:Total swap = 4128764kB
            16:30:47:524284 pages RAM
            16:30:47:43706 pages reserved
            16:30:47:316 pages shared
            16:30:47:462410 pages non-shared
            16:30:47:[ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
            16:30:47:[ 412] 0 412 2692 33 0 -17 -1000 udevd
            16:30:47:[ 1049] 0 1049 6899 30 0 -17 -1000 auditd
            16:30:47:[ 1069] 0 1069 62273 87 1 0 0 rsyslogd
            16:30:47:[ 1099] 0 1099 4560 30 1 0 0 irqbalance
            16:30:47:[ 1115] 32 1115 4744 19 1 0 0 rpcbind
            16:30:47:[ 1135] 29 1135 5837 6 1 0 0 rpc.statd
            16:30:47:[ 1252] 81 1252 6418 4 1 0 0 dbus-daemon
            16:30:47:[ 1269] 0 1269 53919 20 1 0 0 ypbind
            16:30:47:[ 1338] 0 1338 1020 9 1 0 0 acpid
            16:30:47:[ 1348] 68 1348 10507 95 1 0 0 hald
            16:30:47:[ 1349] 0 1349 5099 4 0 0 0 hald-runner
            16:30:47:[ 1381] 0 1381 5629 4 1 0 0 hald-addon-inpu
            16:30:47:[ 1391] 68 1391 4501 3 1 0 0 hald-addon-acpi
            16:30:47:[ 1429] 0 1429 26827 2 0 0 0 rpc.rquotad
            16:30:47:[ 1434] 0 1434 5417 0 0 0 0 rpc.mountd
            16:30:47:[ 1474] 0 1474 6291 3 0 0 0 rpc.idmapd
            16:30:47:[ 1507] 498 1507 57325 150 1 0 0 munged
            16:30:47:[ 1525] 0 1525 16553 7 0 -17 -1000 sshd
            16:30:47:[ 1534] 0 1534 5429 21 0 0 0 xinetd
            16:30:47:[ 1562] 0 1562 22208 2 1 0 0 sendmail
            16:30:47:[ 1571] 51 1571 20071 0 0 0 0 sendmail
            16:30:47:[ 1595] 0 1595 29215 130 1 0 0 crond
            16:30:47:[ 1608] 0 1608 5276 51 0 0 0 atd
            16:30:47:[ 1622] 0 1622 1020 25 1 0 0 agetty
            16:30:47:[ 1623] 0 1623 1016 23 1 0 0 mingetty
            16:30:47:[ 1625] 0 1625 1016 24 1 0 0 mingetty
            16:30:47:[ 1627] 0 1627 1016 23 1 0 0 mingetty
            16:30:47:[ 1629] 0 1629 1016 23 1 0 0 mingetty
            16:30:47:[ 1631] 0 1631 1016 24 1 0 0 mingetty
            16:30:47:[ 1633] 0 1633 2692 36 1 -17 -1000 udevd
            16:30:47:[ 1634] 0 1634 2691 24 0 -17 -1000 udevd
            16:30:47:[ 1636] 0 1636 1016 24 0 0 0 mingetty
            16:30:47:[ 2278] 38 2278 7689 171 0 0 0 ntpd
            16:30:47:[12785] 0 12785 4763 60 0 0 0 anacron
            16:30:47:Kernel panic - not syncing: Out of memory: system-wide panic_on_oom is enabled
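
The memory dump above can be cross-checked with a little arithmetic. The following is a minimal sketch, using values copied from the DMA32 lines of the dump (the zone line is shortened to its first four fields). The helper names `parse_watermarks` and `buddy_total_kb` are hypothetical. It sums the buddy allocator free lists and compares the free memory of the zone with its `min` watermark.

```python
import re

# Shortened copies of the DMA32 lines from the dump above.
ZONE_DMA32 = "Node 0 DMA32 free:44688kB min:44720kB low:55900kB high:67080kB"
BUDDY_DMA32 = (
    "Node 0 DMA32: 2304*4kB 1172*8kB 445*16kB 167*32kB 77*64kB "
    "18*128kB 3*256kB 1*512kB 1*1024kB 0*2048kB 1*4096kB = 44688kB"
)


def parse_watermarks(line):
    """Extract every 'name:<n>kB' field of a zone line as an int (in kB)."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)kB", line)}


def buddy_total_kb(line):
    """Sum 'count*size kB' entries of a buddy allocator free-list line."""
    return sum(int(count) * int(size)
               for count, size in re.findall(r"(\d+)\*(\d+)kB", line))


watermarks = parse_watermarks(ZONE_DMA32)
total_kb = buddy_total_kb(BUDDY_DMA32)

print("buddy total:", total_kb, "kB")
print("free below min watermark:", watermarks["free"] < watermarks["min"])
```

The free lists add up to the 44688kB that the zone line reports, and that figure is just under the 44720kB `min` watermark. One plausible reading is that DMA32 could no longer satisfy normal allocations, which is consistent with the OOM panic at the end of the dump.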


            People

              di.wang Di Wang
              maloo Maloo
              Votes: 0
              Watchers: 3
