Lustre / LU-502

ll_ost_io threads can be killed by the OOM killer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Lustre 2.1.0
    • Fix Version/s: Lustre 2.1.0
    • Labels:
      None
    • Environment:
      RHEL6, OST
    • Severity:
      3
    • Rank (Obsolete):
      4944

      Description

      When an OSS node is low on memory and under heavy I/O load from IOR, the OOM killer starts killing processes.

      [root@link ~]# grep -i oom /var/log/messages | grep ost_io
      Jul 10 12:02:17 link kernel: ll_ost_io_103 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 12:02:26 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 12:02:31 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 12:02:53 link kernel: ll_ost_io_125 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:12:30 link kernel: ll_ost_io_107 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:13:03 link kernel: ll_ost_io_111 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:15:10 link kernel: ll_ost_io_110 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:16:02 link kernel: ll_ost_io_114 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:16:02 link kernel: ll_ost_io_122 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:16:26 link kernel: ll_ost_io_101 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
      Jul 10 16:17:00 link kernel: ll_ost_io_112 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
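      To see which ll_ost_io threads hit the OOM path most often, the grep output above can be tallied per thread. The following is a minimal sketch, not part of the original report: the regular expression is inferred only from the log lines quoted in this ticket, and the names OOM_LINE and count_oom_invocations are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this ticket, e.g.
#   Jul 10 12:02:17 link kernel: ll_ost_io_103 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
# Other kernels may format this message differently (assumption).
OOM_LINE = re.compile(
    r"kernel: (?P<thread>ll_ost_io_\d+) invoked oom-killer: "
    r"gfp_mask=(?P<gfp>0x[0-9a-fA-F]+), order=(?P<order>\d+)"
)


def count_oom_invocations(lines):
    """Return a Counter mapping thread name -> number of oom-killer invocations."""
    counts = Counter()
    for line in lines:
        match = OOM_LINE.search(line)
        if match:
            counts[match.group("thread")] += 1
    return counts


# Three of the lines from the report above, used as sample input.
sample = [
    "Jul 10 12:02:17 link kernel: ll_ost_io_103 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0",
    "Jul 10 12:02:26 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0",
    "Jul 10 12:02:31 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0",
]

if __name__ == "__main__":
    for thread, n in count_oom_invocations(sample).most_common():
        print(thread, n)
```

      On a live node the same function could be fed the lines of /var/log/messages instead of the sample list.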

      In that case the OST believes the client's request is still being processed and refuses the client's reconnection with EBUSY until someone reboots the OSS node.

      Jul 10 10:44:23 link kernel: Lustre: 6542:0:(ldlm_lib.c:846:target_handle_connect()) stry-OST0004: refuse reconnection from dfa0119a-9163-4b83-c579-10ff0e228ac4@172.18.1.167@o2ib to 0xffff88009bbee400/9
      Jul 10 10:44:23 link kernel: LustreError: 6542:0:(ldlm_lib.c:2118:target_send_reply_msg()) @@@ processing error (16) req@ffff8800351d5400 x1372270371834801/t0(0) o-1>dfa0119a-9163-4b83-c579-10ff0e228ac4@NET_0x50000ac1201a7_UUID:0/0 lens 368/264 e 0 to 0 dl 1310319963 ref 1 fl Interpret:/ffffffff/ffffffff rc -16/-1
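      The "rc -16/-1" field at the end of the LustreError line is what identifies the refusal as EBUSY. The following is a minimal sketch of extracting and decoding that field, assuming the line ends with "rc <first>/<second>" as in the message above and that the first value is a negated Linux errno. The names RC_FIELD and decode_rc are hypothetical.

```python
import errno
import re

# Matches the trailing "rc -16/-1" field as it appears in the message above.
# Assumption: the first number is a negated errno value.
RC_FIELD = re.compile(r"\brc (-?\d+)/(-?\d+)\s*$")


def decode_rc(line):
    """Return (rc, errno_name) for a log line, or None if no rc field is found."""
    match = RC_FIELD.search(line)
    if match is None:
        return None
    rc = int(match.group(1))
    # errno.errorcode maps positive errno numbers to their symbolic names.
    return rc, errno.errorcode.get(-rc, "UNKNOWN")


# Tail of the LustreError line quoted in this ticket.
sample_line = "dl 1310319963 ref 1 fl Interpret:/ffffffff/ffffffff rc -16/-1"

if __name__ == "__main__":
    print(decode_rc(sample_line))
```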

            People

            • Assignee:
              rread Robert Read (Inactive)
            • Reporter:
              shadow Alexey Lyashkov
