Lustre / LU-5461

mdt_readpage returning -ENOMEM causes directory to be unreadable

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.7.0
    • Affects Version/s: Lustre 2.5.2
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 15215

    Description

      While testing the HSM copytool in a single VM with 512MB of memory, I saw page allocation errors in mdt_readpage, followed by IO errors on the client when it tried to read that directory again. It appears the client is caching the error page and not allowing ll_get_dir_page() to try to fetch it again. Here are the errors on the client side:

      LustreError: 18907:0:(dir.c:422:ll_get_dir_page()) read cache page: [0x200000402:0x27a:0x0] at 0: rc -12
      LustreError: 18907:0:(dir.c:584:ll_dir_read()) error reading dir [0x200000402:0x27a:0x0] at 0: rc -12
      LustreError: 18912:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000402:0x27a:0x0] at 0: rc -5
      LustreError: 18912:0:(dir.c:584:ll_dir_read()) error reading dir [0x200000402:0x27a:0x0] at 0: rc -5
      LustreError: 7358:0:(dir.c:398:ll_get_dir_page()) dir page locate: [0x200000402:0x27a:0x0] at 0: rc -5
      LustreError: 7358:0:(dir.c:584:ll_dir_read()) error reading dir [0x200000402:0x27a:0x0] at 0: rc -5
      

      And this is the allocation failure on the MDT:

      mdt_rdpg00_001: page allocation failure. order:0, mode:0xc0
      Pid: 4794, comm: mdt_rdpg00_001 Not tainted 2.6.32-431.17.1.el6_lustre.x86_64 #1
      Call Trace:
      [<ffffffff8112f64a>] ? __alloc_pages_nodemask+0x74a/0x8d0
      [<ffffffffa0653d10>] ? lustre_swab_mdt_body+0x0/0x140 [ptlrpc]
      [<ffffffff8116769a>] ? alloc_pages_current+0xaa/0x110
      [<ffffffffa0c9f3c0>] ? mdt_readpage+0x1d0/0x940 [mdt]
      [<ffffffffa0c8f58a>] ? mdt_handle_common+0x52a/0x1470 [mdt]
      [<ffffffffa0ccb735>] ? mds_readpage_handle+0x15/0x20 [mdt]
      [<ffffffffa0660bc5>] ? ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
      [<ffffffffa03713cf>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
      [<ffffffffa06582a9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
      [<ffffffffa0661f2d>] ? ptlrpc_main+0xaed/0x1740 [ptlrpc]
      [<ffffffffa0661440>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
      [<ffffffff8109ab56>] ? kthread+0x96/0xa0
      [<ffffffff8100c20a>] ? child_rip+0xa/0x20
      [<ffffffff8109aac0>] ? kthread+0x0/0xa0
      [<ffffffff8100c200>] ? child_rip+0x0/0x20
      Mem-Info:
      Node 0 DMA per-cpu:
      CPU    0: hi:    0, btch:   1 usd:   0
      Node 0 DMA32 per-cpu:
      CPU    0: hi:  186, btch:  31 usd:  72
      active_anon:9625 inactive_anon:10498 isolated_anon:0
      active_file:24454 inactive_file:28863 isolated_file:0
      unevictable:0 dirty:4370 writeback:0 unstable:0
      free:1018 slab_reclaimable:5718 slab_unreclaimable:18294
      mapped:2472 shmem:132 pagetables:1268 bounce:0
      Node 0 DMA free:2020kB min:84kB low:104kB high:124kB active_anon:464kB inactive_anon:2924kB active_file:1412kB inactive_file:7108kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15368kB mlocked:0kB dirty:408kB writeback:0kB mapped:268kB shmem:308kB slab_reclaimable:264kB slab_unreclaimable:632kB kernel_stack:40kB pagetables:704kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
      lowmem_reserve[]: 0 489 489 489
      Node 0 DMA32 free:2052kB min:2784kB low:3480kB high:4176kB active_anon:38036kB inactive_anon:39068kB active_file:96404kB inactive_file:108344kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:500896kB mlocked:0kB dirty:17072kB writeback:0kB mapped:9620kB shmem:220kB slab_reclaimable:22608kB slab_unreclaimable:72544kB kernel_stack:1608kB pagetables:4368kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64 all_unreclaimable? no
      lowmem_reserve[]: 0 0 0 0
      Node 0 DMA: 1*4kB 0*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2020kB
      Node 0 DMA32: 183*4kB 5*8kB 4*16kB 2*32kB 2*64kB 0*128kB 2*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2052kB
      53571 total pagecache pages
      112 pages in swap cache
      Swap cache stats: add 245, delete 133, find 25/32
      Free swap  = 834836kB
      Total swap = 835576kB
      131055 pages RAM
      5534 pages reserved
      66002 pages shared
      75464 pages non-shared
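
As a cross-check of the numbers in the Mem-Info dump, the short script below sums the per-zone free lists and compares the DMA32 free total against its min watermark. All values are copied from the dump above; the 4kB page size is the x86_64 base page size, and the variable and function names are illustrative only.

```python
PAGE_KB = 4  # base page size on x86_64, in kB

# (block count, block size in kB) pairs copied from the "Node 0 DMA" and "Node 0 DMA32" lines
dma_blocks = [(1, 4), (0, 8), (0, 16), (1, 32), (1, 64), (1, 128),
              (1, 256), (1, 512), (1, 1024), (0, 2048), (0, 4096)]
dma32_blocks = [(183, 4), (5, 8), (4, 16), (2, 32), (2, 64), (0, 128),
                (2, 256), (1, 512), (0, 1024), (0, 2048), (0, 4096)]

DMA32_MIN_KB = 2784  # "min:" watermark reported for the DMA32 zone


def free_kb(blocks):
    """Total free memory in kB for one zone's free lists."""
    return sum(count * size for count, size in blocks)


dma_free = free_kb(dma_blocks)
dma32_free = free_kb(dma32_blocks)
total_free_pages = (dma_free + dma32_free) // PAGE_KB

print(dma_free, dma32_free)        # 2020 2052
print(total_free_pages)            # 1018, matching "free:1018"
print(dma32_free < DMA32_MIN_KB)   # True: DMA32 is below its min watermark
```

The DMA32 zone holds almost all of the VM's memory and has less free memory than its min watermark, which is consistent with the order-0 allocation in mdt_readpage failing even though swap is nearly unused.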
      

          People

            laisiyao Lai Siyao
            rread Robert Read