Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version: Lustre 2.1.0
- Fix Version: None
- Environment: RHEL6, OST
- 3
- 4944
Description
When an OSS node is running in a low-memory situation under high I/O load from IOR, the OOM killer starts killing processes.
[root@link ~]# grep -i oom /var/log/messages | grep ost_io
Jul 10 12:02:17 link kernel: ll_ost_io_103 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 12:02:26 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 12:02:31 link kernel: ll_ost_io_120 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 12:02:53 link kernel: ll_ost_io_125 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:12:30 link kernel: ll_ost_io_107 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:13:03 link kernel: ll_ost_io_111 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:15:10 link kernel: ll_ost_io_110 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:16:02 link kernel: ll_ost_io_114 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:16:02 link kernel: ll_ost_io_122 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:16:26 link kernel: ll_ost_io_101 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
Jul 10 16:17:00 link kernel: ll_ost_io_112 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_adj=0
In that case the OST believes the client's request is still being processed and refuses the client's reconnection attempts with EBUSY until someone reboots the OSS node.
Jul 10 10:44:23 link kernel: Lustre: 6542:0:(ldlm_lib.c:846:target_handle_connect()) stry-OST0004: refuse reconnection from dfa0119a-9163-4b83-c579-10ff0e228ac4@172.18.1.167@o2ib to 0xffff88009bbee400/9
Jul 10 10:44:23 link kernel: LustreError: 6542:0:(ldlm_lib.c:2118:target_send_reply_msg()) @@@ processing error (16) req@ffff8800351d5400 x1372270371834801/t0(0) o-1>dfa0119a-9163-4b83-c579-10ff0e228ac4@NET_0x50000ac1201a7_UUID:0/0 lens 368/264 e 0 to 0 dl 1310319963 ref 1 fl Interpret:/ffffffff/ffffffff rc -16/-1