
LU-179: lustre client lockup when under memory pressure

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 1.8.6
    • Environment: Client is running 2.6.27.45-lustre-1.8.3.ddn3.3. Connectivity is 10GigE.
    • Severity: 3
    • 10103

    Description

      A customer is seeing a problem where a client loses access to Lustre when the node is subjected to memory pressure from an errant application.

      Lustre starts reporting -113 (No route to host) errors for certain NIDs in the filesystem despite the TCP/IP network being functional. After the memory pressure is relieved, the Lustre errors remain. I am collecting logs currently.

      From the customer report:

      LNet is reporting no-route-to-host for a significant number of OSSes and MDSes (client log attached).

      Mar 29 09:23:27 cgp-bigmem kernel: [589295.826095] LustreError: 4980:0:(events.c:66:request_out_callback()) @@@ type 4, status 113 req@ffff881d2e995400 x1363985318437337/t0 o8->lus03-OST0000_UUID@172.17.128.130@tcp:28/4 lens 368/584 e 0 to 1 dl 1301387122 ref 2 fl Rpc:N/0/0 rc 0/0

      But from user-space on the client, all those nodes are pingable:

      cgp-bigmem:/var/log# ping 172.17.128.130
      PING 172.17.128.130 (172.17.128.130) 56(84) bytes of data.
      64 bytes from 172.17.128.130: icmp_seq=1 ttl=62 time=0.102 ms
      64 bytes from 172.17.128.130: icmp_seq=2 ttl=62 time=0.091 ms
      64 bytes from 172.17.128.130: icmp_seq=3 ttl=62 time=0.091 ms
      64 bytes from 172.17.128.130: icmp_seq=4 ttl=62 time=0.090 ms

      However, an LNet ping hangs:
      cgp-bigmem:~# lctl ping 172.17.128.130@tcp

      From another client, the ping works as expected:

      farm2-head1:# lctl ping 172.17.128.130@tcp
      12345-0@lo
      12345-172.17.128.130@tcp

      cgp-bigmem:~# lfs check servers | grep -v active
      error: check 'lus01-OST0007-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST0008-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST0009-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST000a-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST000b-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST000c-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST000d-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus01-OST000e-osc-ffff88205bd52000' Resource temporarily unavailable
      error: check 'lus02-MDT0000-mdc-ffff8880735ea000' Resource temporarily unavailable
      error: check 'lus03-OST0000-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0001-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0002-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0003-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0004-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0005-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0006-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0007-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0008-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0009-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST000a-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST000b-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST000c-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST0019-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus03-OST001a-osc-ffff8840730a1400' Resource temporarily unavailable
      error: check 'lus05-OST0010-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0012-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0014-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0016-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0018-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST001a-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST001c-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST000f-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0011-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0013-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0015-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0017-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST0019-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST001b-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus05-OST001d-osc-ffff886070dab800' Resource temporarily unavailable
      error: check 'lus04-OST0001-osc-ffff88806e9d8c00' Resource temporarily unavailable
      error: check 'lus04-OST0003-osc-ffff88806e9d8c00' Resource temporarily unavailable
      error: check 'lus04-OST0005-osc-ffff88806e9d8c00' Resource temporarily unavailable
      error: check 'lus04-OST0007-osc-ffff88806e9d8c00' Resource temporarily unavailable
      error: check 'lus04-OST0009-osc-ffff88806e9d8c00' Resource temporarily unavailable
      error: check 'lus04-OST000b-osc-ffff88806e9d8c00' Resource temporarily unavailable
      error: check 'lus04-OST000d-osc-ffff88806e9d8c00' Resource temporarily unavailable

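      To separate an LNet-level stall from a genuine TCP problem, it can also help to compare the peer state LNet itself reports against the userspace view. A minimal sketch, assuming the 1.8-era /proc/sys/lnet interface (paths may differ on other builds):

      # Per-peer credit and queue state; a peer with exhausted credits and a
      # growing queue points at LNet being wedged rather than the network
      cat /proc/sys/lnet/peers
      # Local network interface state and queue depths
      cat /proc/sys/lnet/nis
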
      Attachments

        Activity

          bobijam Zhenyu Xu added a comment -

          Closing ticket per Guy Coates' update.

          gmpc@sanger.ac.uk Guy Coates added a comment -

          We upgraded this machine from a 2.6.27/SLES11 kernel + 1.8.5.56 Lustre client to a 2.6.32 kernel + 1.8.5.56 Lustre client, and the problems seem to have stopped.

          You can close this issue.

          Thanks,

          Guy

          bobijam Zhenyu Xu added a comment - edited

          Can you help check out what the Lustre threads were doing during this hang? (It would be better to have thread stacks.)

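          One way to capture the requested stacks on this 2.6.27 client is the kernel's sysrq facility; a sketch, assuming sysrq is available and the kernel log buffer is large enough to hold the dump:

          # Enable sysrq, then dump every task's stack to the kernel log
          echo 1 > /proc/sys/kernel/sysrq
          echo t > /proc/sysrq-trigger
          dmesg > /tmp/task-stacks.txt
          # Pick out the Lustre service and writeback threads of interest
          grep -A 20 -E 'ptlrpc|ldlm|pdflush' /tmp/task-stacks.txt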

          apittman Ashley Pittman (Inactive) added a comment -

          Yes.

          The system is a NUMA system that currently has 512 GB of RAM. The problem seems to happen during memory pressure; a figure of 70% has been quoted, but it's worth noting the application is single-threaded, so it's quite likely that some NUMA regions are experiencing 100% memory usage.

          One thing I've suggested is pinning the application to a different NUMA region from the Lustre kernel threads (if this is even possible), so the application wouldn't starve Lustre of memory so easily.

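          If the node layout allows it, that pinning could be done with numactl; a hypothetical sketch (the application name and node numbers are placeholders, not from the ticket):

          # Show the node layout and per-node free memory first
          numactl --hardware
          # Confine the errant application's CPUs and allocations to node 1,
          # leaving node 0 to the kernel and Lustre threads
          numactl --cpunodebind=1 --membind=1 ./app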
          bobijam Zhenyu Xu added a comment -

          Is it the same pattern as before, i.e. lnet ping hangs while ICMP ping works OK?


          apittman Ashley Pittman (Inactive) added a comment -

          As above, the customer was still observing this problem with the latest code on 10th June; could you reopen this bug accordingly?

          gmpc@sanger.ac.uk Guy Coates added a comment -

          I was able to get output from top during the last client lockup; pdflush is sitting at 100% CPU.

          top - 09:10:36 up 2 days, 20:08, 2 users, load average: 801.64, 799.78, 796.51
          Tasks: 2891 total, 36 running, 2855 sleeping, 0 stopped, 0 zombie
          Cpu(s): 0.0%us, 25.1%sy, 0.0%ni, 70.8%id, 4.1%wa, 0.0%hi, 0.0%si, 0.0%st
          Mem: 528386840k total, 70774068k used, 457612772k free, 112k buffers
          Swap: 4192924k total, 0k used, 4192924k free, 81176k cached

          PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
          13691 cgppipe 39 19 23.6g 23g 9992 S 201 4.6 7646:35 java
          5640 cgppipe 39 19 3122m 2.7g 744 S 100 0.5 5717:22 bwa
          18662 root 0 -20 4 4 0 R 100 0.0 3756:26 elim.uptime
          153 root 20 0 0 0 0 R 100 0.0 3759:05 pdflush
          5528 root 20 0 13992 1528 900 R 100 0.0 3761:22 pim
          1809 root 20 0 56440 7628 2240 R 3 0.0 0:04.24 top
          4612 root 20 0 8832 532 404 S 0 0.0 2:30.10 irqbalance

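          Since pdflush is the kernel's writeback thread, a snapshot of the dirty-page counters taken alongside the stacks would show whether it is spinning against writeback that cannot make progress; a sketch using standard kernel interfaces:

          # Dirty and under-writeback page totals
          grep -E 'Dirty|Writeback' /proc/meminfo
          # The thresholds pdflush works against
          sysctl vm.dirty_ratio vm.dirty_background_ratio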
          gmpc@sanger.ac.uk Guy Coates added a comment - edited

          We've just had a recurrence of this problem running 1.8.5.56 (as tagged in git).
          The client starts logging problems at Jun 9 14:49:13.

          gmpc@sanger.ac.uk Guy Coates added a comment -

          Client log


          hudson Build Master (Inactive) added a comment -

          Integrated in lustre-b1_8 » i686,server,el5,ofa #71
          Remove changelog entry for LU-179

          Johann Lombardi: 08b76cd92b2a4b6854ce3910a07531996449a9fd
          Files:

          • lustre/ChangeLog

          People

            Assignee: bobijam Zhenyu Xu
            Reporter: ihara Shuichi Ihara (Inactive)
            Votes: 0
            Watchers: 3
