LU-9728: out of memory on OSS causing allocation failures or hung threads

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.10.1, Lustre 2.11.0
    • Affects Version/s: Lustre 2.7.0, Lustre 2.5.3, Lustre 2.10.0

    Description

      In several recent cases, memory allocations on the OSS have failed because the Lustre read cache was using a large amount of RAM:

      LNet: Service thread pid 4950 was inactive for 200.73s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      
      schedule+0x29/0x70
      schedule_timeout+0x209/0x2d0
      io_schedule_timeout+0xae/0x130
      io_schedule+0x18/0x20
      sleep_on_page+0xe/0x20
      __wait_on_bit_lock+0x5b/0xc0
      __lock_page+0x78/0xa0
      __find_lock_page+0x54/0x70
      find_or_create_page+0x34/0xa0
      osd_bufs_get+0x20f/0x410 [osd_ldiskfs]
      ofd_preprw+0x647/0x11a0 [ofd]
      tgt_brw_read+0x9a1/0x14c0 [ptlrpc]
      tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
      ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
      ptlrpc_main+0xc00/0x1f60 [ptlrpc]
      

      Following the page allocation path from osd_bufs_get() to osd_get_page(), the code appears to use only GFP_NOFS for allocations, to avoid recursing into the filesystem:

      static struct page *osd_get_page(struct dt_object *dt, loff_t offset, int rw)
      {
              page = find_or_create_page(inode->i_mapping, offset >> PAGE_SHIFT,
                                         GFP_NOFS | __GFP_HIGHMEM);
      

      However, the equivalent pre-OSD code used GFP_HIGHUSER for requests from non-local clients, which allowed the OSS threads to generate memory pressure and perform direct reclaim when memory was short:

      /*
       * the routine is used to request pages from pagecache
       *
       * use GFP_NOFS for requests from a local client not allowing to enter FS
       * as we might end up waiting on a page he sent in the request we're serving.
       * use __GFP_HIGHMEM so that the pages can use all of the available memory
       * on 32-bit machines
       * use more aggressive GFP_HIGHUSER flags from non-local clients to be able to
       * generate more memory pressure.
       *
       * See Bug 19529 and Bug 19917 for details.
       */
      static struct page *filter_get_page(struct obd_device *obd, struct inode *inode,
                                          obd_off offset, int localreq)
      {
              page = find_or_create_page(inode->i_mapping, offset >> CFS_PAGE_SHIFT,
                                         (localreq ? (GFP_NOFS | __GFP_HIGHMEM) :
                                                   GFP_HIGHUSER));
      

      Something similar looks possible in the ldiskfs OSD code at least. It is less clear what can be done for ZFS, since its buffer allocation is handled quite differently.
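
      As a rough illustration of the flag selection that filter_get_page() performed, here is a minimal userspace sketch. It is not the actual patch and does not use kernel headers: the SKETCH_* bits are stand-ins with made-up values, the flag composition is simplified, and sketch_page_gfp_mask() and its local_request parameter are hypothetical names introduced only for this example.

```c
#include <stdbool.h>

/*
 * Stand-in flag bits for illustration only. The real definitions live in
 * the kernel's <linux/gfp.h>; their values and exact composition differ
 * between kernel versions and are simplified here.
 */
#define SKETCH_GFP_IO       0x01u  /* allocation may start block I/O */
#define SKETCH_GFP_FS       0x02u  /* allocation may call into the filesystem */
#define SKETCH_GFP_RECLAIM  0x04u  /* allocation may wait for memory reclaim */
#define SKETCH_GFP_HIGHMEM  0x08u  /* page may come from high memory */

/* Simplified models of the two masks discussed in this ticket. */
#define SKETCH_GFP_NOFS     (SKETCH_GFP_RECLAIM | SKETCH_GFP_IO)
#define SKETCH_GFP_HIGHUSER (SKETCH_GFP_RECLAIM | SKETCH_GFP_IO | \
                             SKETCH_GFP_FS | SKETCH_GFP_HIGHMEM)

/*
 * Hypothetical helper mirroring the old filter_get_page() choice:
 * requests from a local client must not re-enter the filesystem,
 * while requests from remote clients may use the more aggressive mask.
 */
unsigned int sketch_page_gfp_mask(bool local_request)
{
        if (local_request)
                return SKETCH_GFP_NOFS | SKETCH_GFP_HIGHMEM;

        return SKETCH_GFP_HIGHUSER;
}
```

      In an ldiskfs OSD change, a mask chosen this way would presumably be passed down to find_or_create_page() in place of the fixed GFP_NOFS | __GFP_HIGHMEM. How the OSD would determine that a request is local is left open here.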

            People

              Assignee: Andreas Dilger (adilger)
              Reporter: Andreas Dilger (adilger)