  1. Lustre
  2. LU-13680

large allocations in osd_bufs_get() failing

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Lustre 2.14.0
    • None
    • None
    • 3

    Description

      Large allocations in osd_bufs_get() can fail when OSS memory has become fragmented through use and the IO thread is either newly started or has not serviced a cacheless IO since startup:

      kernel: ll_ost_io04_052: page allocation failure: order:5, mode:0x10c050
      kernel: CPU: 4 PID: 1980 Comm: ll_ost_io04_052 
      Call Trace:
      dump_stack+0x19/0x1b
      warn_alloc_failed+0x110/0x180
      __alloc_pages_slowpath+0x6b6/0x724
      __alloc_pages_nodemask+0x404/0x420
      alloc_pages_current+0x98/0x110
      __get_free_pages+0xe/0x40
      kmalloc_order_trace+0x2e/0xa0
      osd_bufs_get+0x7b7/0x870 [osd_ldiskfs]
      ofd_preprw_read+0x2ea/0x1110 [ofd]
      ofd_preprw+0x499/0x8c0 [ofd]
      tgt_brw_read+0x9e3/0x1e40 [ptlrpc]
      tgt_request_handle+0xada/0x1570 [ptlrpc]
      ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      ptlrpc_main+0xb34/0x1470 [ptlrpc]
      

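      The order-5 allocation is 128KB (2^5 = 32 physically contiguous pages, assuming 4KB pages), which is hard to satisfy once memory is fragmented, so osd_bufs_get() should be using OBD_ALLOC_LARGE() or equivalent.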
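
      The size arithmetic and the suggested fix can be sketched in userspace C. This is an illustration only, not the Lustre code: it assumes 4KB pages and assumes that OBD_ALLOC_LARGE() behaves like a "try contiguous, then fall back to a non-contiguous allocation" wrapper, which this ticket does not spell out. All names below (SKETCH_PAGE_SIZE, order_to_bytes, bytes_to_order, contig_alloc_fragmented, noncontig_alloc, alloc_large_sketch) are hypothetical, and malloc() stands in for both kernel allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: 4 KiB base pages (e.g. x86_64). */
#define SKETCH_PAGE_SIZE 4096UL

/* Bytes covered by a page allocation of the given order:
 * 2^order physically contiguous pages. */
size_t order_to_bytes(unsigned int order)
{
	assert(order < 20);	/* keep the shift well within range */
	return SKETCH_PAGE_SIZE << order;
}

/* Smallest order whose block holds at least `size` bytes. */
unsigned int bytes_to_order(size_t size)
{
	unsigned int order = 0;

	while (order_to_bytes(order) < size)
		order++;
	return order;
}

/* Hypothetical allocator hook, used only by this sketch. */
typedef void *(*alloc_fn)(size_t size);

/* Stand-in for a contiguous allocator on a fragmented system:
 * anything larger than one page fails. */
void *contig_alloc_fragmented(size_t size)
{
	return size > SKETCH_PAGE_SIZE ? NULL : malloc(size);
}

/* Stand-in for a non-contiguous allocator, which does not need
 * physically adjacent pages and so still succeeds. */
void *noncontig_alloc(size_t size)
{
	return malloc(size);
}

/* Try the contiguous allocator first; fall back if it fails.
 * *used_fallback is set to 1 only when the fallback supplied the buffer. */
void *alloc_large_sketch(size_t size, alloc_fn contig, alloc_fn fallback,
			 int *used_fallback)
{
	void *p = contig(size);

	*used_fallback = 0;
	if (p == NULL) {
		p = fallback(size);
		*used_fallback = (p != NULL);
	}
	return p;
}
```

      With these stand-ins, a 128KB request (order 5) fails in the contiguous path and is served by the fallback, while a request of one page or less never reaches the fallback.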

            People

              Assignee: adilger Andreas Dilger
              Reporter: adilger Andreas Dilger