LU-502: ll_ost_io threads can be killed by the OOM killer

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.1.0
    • Affects Version/s: Lustre 2.1.0
    • Labels: None
    • Environment: RHEL6, OST
    • Severity: 3
    • Rank (Obsolete): 4944

    Description

      When an OSS node runs low on memory while under heavy I/O load from IOR, the OOM killer starts killing processes.

      [root@link ~]# grep -i oom /var/log/messages | grep ost_io
      Jul 10 12:02:17 link kernel: ll_ost_io_103 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 12:02:26 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 12:02:31 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 12:02:53 link kernel: ll_ost_io_125 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:12:30 link kernel: ll_ost_io_107 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:13:03 link kernel: ll_ost_io_111 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:15:10 link kernel: ll_ost_io_110 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:16:02 link kernel: ll_ost_io_114 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:16:02 link kernel: ll_ost_io_122 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:16:26 link kernel: ll_ost_io_101 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:17:00 link kernel: ll_ost_io_112 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
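To see which service threads triggered the OOM killer and how often, the grep output above can be tallied per thread. The following is a minimal sketch, assuming the syslog line format shown above; the helper name `count_oom_invocations` and the regular expression are illustrative and not part of Lustre.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt in this report:
#   "<date> <host> kernel: ll_ost_io_<N> invoked oom-killer: gfp_mask=..., ..."
OOM_RE = re.compile(r"kernel: (?P<thread>ll_ost_io_\d+) invoked oom-killer:")


def count_oom_invocations(lines):
    """Return a Counter mapping thread name to number of oom-killer invocations."""
    counts = Counter()
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            counts[match.group("thread")] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    "Jul 10 12:02:17 link kernel: ll_ost_io_103 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0",
    "Jul 10 12:02:26 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0",
    "Jul 10 12:02:31 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0",
]

counts = count_oom_invocations(sample)
for thread, n in counts.most_common():
    print(thread, n)
```

In practice the function could be fed the lines of /var/log/messages directly, for example `count_oom_invocations(open("/var/log/messages"))`.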

      In that case the OST believes the client's request is still being processed and refuses the client's reconnection attempts with EBUSY until someone reboots the OSS node.

      Jul 10 10:44:23 link kernel: Lustre: 6542:0:(ldlm_lib.c:846:target_handle_connect()) stry-OST0004: refuse reconnection from dfa0119a-9163-4b83-c579-10ff0e228ac4@172.18.1.167@o2ib to 0xffff88009bbee400/9
      Jul 10 10:44:23 link kernel: LustreError: 6542:0:(ldlm_lib.c:2118:target_send_reply_msg()) @@@ processing error (16) req@ffff8800351d5400 x1372270371834801/t0(0) o-1>dfa0119a-9163-4b83-c579-10ff0e228ac4@NET_0x50000ac1201a7_UUID:0/0 lens 368/264 e 0 to 0 dl 1310319963 ref 1 fl Interpret:/ffffffff/ffffffff rc -16/-1
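The "processing error (16)" and the trailing "rc -16/-1" in the message above carry the error code; on Linux, errno 16 is EBUSY. The following is a minimal sketch of how that value could be pulled out of such a line and mapped to its symbolic name. It assumes the trailing `rc <a>/<b>` format seen in this log line, and the helper name `reply_errno_name` is hypothetical.

```python
import errno
import re

# Assumed format: the line ends with "rc <first>/<second>", as in the excerpt above.
RC_RE = re.compile(r"\brc (-?\d+)/(-?\d+)\s*$")


def reply_errno_name(line):
    """Return the symbolic errno name for the first rc value, or None if absent."""
    match = RC_RE.search(line)
    if match is None:
        return None
    rc = int(match.group(1))
    if rc >= 0:
        return None  # not a negative errno-style return code
    return errno.errorcode.get(-rc)


# The LustreError line copied from the excerpt above.
line = (
    "Jul 10 10:44:23 link kernel: LustreError: 6542:0:(ldlm_lib.c:2118:target_send_reply_msg()) "
    "@@@ processing error (16) req@ffff8800351d5400 x1372270371834801/t0(0) "
    "o-1>dfa0119a-9163-4b83-c579-10ff0e228ac4@NET_0x50000ac1201a7_UUID:0/0 lens 368/264 "
    "e 0 to 0 dl 1310319963 ref 1 fl Interpret:/ffffffff/ffffffff rc -16/-1"
)

print(reply_errno_name(line))
```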

          People

            rread Robert Read
            shadow Alexey Lyashkov