Details

    • Bug
    • Resolution: Fixed
    • Major
    • None
    • Lustre 2.2.0
    • 3
    • 6441

    Description

      It is possible that I am seeing multiple bugs. So, you may want to split this one bug into several bugs.

      Let me emphasize that this problem occurs only on one system, and that is the system built withhttp://review.whamcloud.com/#change,2170, which is code specifically for the 2.6.38.2 kernel.

      This bug is preventing us from shipping the code to this customer.

      PROBLEM 1
      ---------
      My testing of the client on kernel 2.6.38.2 was going well overnight. The performance I am getting is the line-rate of the 10G NICs while running IOZone. Then, I switched to xdd.linux, which uses direct IO; xdd.linux also got line-rate.

      Then, I noticed that I may have hit a minor bug.

      After the iozone tests, I removed all of the files. Then, I did an lfs df -h, and I saw that some OSTs had 20G used still. After I unmounted all of the clients then remounted them, the problem went away (that is, all the OSTs had the same amount of used space). Here is some session output:

      [root@compute-01-32 lustre]# cd /mnt/lustre
      [root@compute-01-32 lustre]# find . -type f

      [root@compute-01-32 lustre]# # Note that there are no files; I removed them a while ago.
      [root@compute-01-32 lustre]# lfs df -h
      UUID bytes Used Available Use% Mounted on
      denver-MDT0000_UUID 73.2M 4.6M 68.6M 6% /mnt/lustre[MDT:0]
      denver-OST0000_UUID 190.0G 459.5M 189.6G 0% /mnt/lustre[OST:0]
      denver-OST0001_UUID 190.0G 459.5M 189.6G 0% /mnt/lustre[OST:1]
      denver-OST0002_UUID 190.0G 459.5M 189.6G 0% /mnt/lustre[OST:2]
      denver-OST0003_UUID 190.0G 459.5M 189.6G 0% /mnt/lustre[OST:3]
      denver-OST0004_UUID 190.0G 459.5M 189.6G 0% /mnt/lustre[OST:4]
      denver-OST0005_UUID 190.0G 20.4G 169.6G 11% /mnt/lustre[OST:5]
      denver-OST0006_UUID 190.0G 459.5M 189.6G 0% /mnt/lustre[OST:6]
      denver-OST0007_UUID 190.0G 20.4G 169.6G 11% /mnt/lustre[OST:7]

      filesystem summary: 1.5T 43.6G 1.4T 3% /mnt/lustre

      [root@compute-01-32 lustre]# # Note that two OSTs have used 20.4G, even though no files !

      [root@compute-01-32 lustre]# df
      Filesystem 1K-blocks Used Available Use% Mounted on
      /dev/mapper/vg_compute0132-lv_root
      51606140 12008608 36976092 25% /
      tmpfs 1924356 88 1924268 1% /dev/shm
      /dev/sda2 495844 59594 410650 13% /boot
      /dev/mapper/vg_compute0132-lv_home
      413557824 26856608 365693664 7% /home
      172.20.0.1:/depot/shared
      12095040 12094688 0 100% /shared
      172.20.0.1:/home 413557824 26856608 365693664 7% /home
      10.7.200.111@tcp0:10.7.200.112@tcp0:/denver
      1594036256 45706984 1548328760 3% /mnt/lustre

      PROBLEM 2
      ---------

      I then unmounted Lustre on all clients, rebooted all the clients,
      then remounted Lustre on all clients.
      The problem described above seemed to have been resolved.

      I decided to try and reproduce this bug. So, I started up IOZone again. Keep in mind that this is after a client reboot.

      IOZone ran for a few seconds, then hung. I could ping the node with 2.6.38.8 kernel, but I could not ssh to it. The video console was locked up. Pressing Caps Lock and NumLock on the keyboard did not light up any LEDs on the keyboard. So, I power cycled.

      This is all that I saw in /var/log/messages:

      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Joining mDNS multicast group on interface eth2.IPv4 with address 10.7.1.32.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: New relevant interface eth2.IPv4 for mDNS.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Registering new address record for 10.7.1.32 on eth2.IPv4.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Withdrawing address record for 10.7.1.32 on eth2.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Leaving mDNS multicast group on interface eth2.IPv4 with address 10.7.1.32.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Interface eth2.IPv4 no longer relevant for mDNS.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Joining mDNS multicast group on interface eth2.IPv4 with address 10.7.1.32.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: New relevant interface eth2.IPv4 for mDNS.
      Feb 28 09:33:53 compute-01-32 avahi-daemon[1508]: Registering new address record for 10.7.1.32 on eth2.IPv4.
      Feb 28 09:33:53 compute-01-32 kernel: device eth2 entered promiscuous mode
      Feb 28 09:33:54 compute-01-32 kernel: Lustre: Lustre: Build Version: 2.1.56-g17d2c48-CHANGED-../lustre/scripts
      Feb 28 09:33:55 compute-01-32 kernel: Lustre: Added LNI 10.7.1.32@tcp [8/256/0/180]
      Feb 28 09:33:55 compute-01-32 kernel: Lustre: Accept secure, port 988
      Feb 28 09:33:55 compute-01-32 kernel: Lustre: MGC10.7.200.111@tcp: Reactivating import
      Feb 28 09:33:55 compute-01-32 kernel: Lustre: Client denver-client has started
      Feb 28 09:37:28 compute-01-32 kernel: LustreError: 2567:0:(osc_request.c:797:osc_announce_cached()) dirty 2028 - 2028 > system dirty_max 647168
      Feb 28 09:37:40 compute-01-32 kernel: LustreError: 2568:0:(osc_request.c:797:osc_announce_cached()) dirty 2040 - 2040 > system dirty_max 647168
      Feb 28 09:37:46 compute-01-32 kernel: LustreError: 2565:0:(osc_request.c:797:osc_announce_cached()) dirty 1954 - 1954 > system dirty_max 647168
      Feb 28 09:37:47 compute-01-32 kernel: LustreError: 2561:0:(osc_request.c:797:osc_announce_cached()) dirty 1975 - 1976 > system dirty_max 647168
      Feb 28 09:37:50 compute-01-32 kernel: LustreError: 2566:0:(osc_request.c:797:osc_announce_cached()) dirty 2001 - 2002 > system dirty_max 647168
      Feb 28 09:38:09 compute-01-32 kernel: LustreError: 2567:0:(osc_request.c:797:osc_announce_cached()) dirty 1946 - 1946 > system dirty_max 647168
      Feb 28 09:39:20 compute-01-32 kernel: LustreError: 2615:0:(osc_request.c:797:osc_announce_cached()) dirty 2017 - 2017 > system dirty_max 647168
      Feb 28 09:39:47 compute-01-32 kernel: LustreError: 2615:0:(osc_request.c:797:osc_announce_cached()) dirty 1847 - 1848 > system dirty_max 647168
      Feb 28 09:39:47 compute-01-32 kernel: LustreError: 2615:0:(osc_request.c:797:osc_announce_cached()) Skipped 3 previous similar messages
      Feb 28 09:40:23 compute-01-32 kernel: LustreError: 2609:0:(osc_request.c:797:osc_announce_cached()) dirty 1918 - 1919 > system dirty_max 647168
      Feb 28 09:40:23 compute-01-32 kernel: LustreError: 2609:0:(osc_request.c:797:osc_announce_cached()) Skipped 1 previous similar message

      I did not have a serial cable plugged into get the crash dump. I have set this up now, so if the problem occurs again, we should get more data. I will try to reproduce this bug.

      PROBLEM 3
      ---------

      I set up the serial cable, and I verified that SysRq-T worked and sent its output over the serial cable to another node that was capturing the data.

      After many hours of testing, the 2.6.38.2 client became unresponsive again. I see some Out of memory messages. I will keep an eye on the slab usage next time.

      Below following is the output over ttyS0. After the problem occurred, I plugged in a monitor and keyboard, and SysRq-T did not work. Perhaps I need to have the keyboard in prior to the problem occurring .

      LustreError: 19819:0:(ldlm_request.c:1171:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
      LustreError: 19819:0:(ldlm_request.c:1797:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
      LustreError: 19819:0:(ldlm_request.c:1171:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
      LustreError: 19819:0:(ldlm_request.c:1797:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
      Out of memory: Kill process 1421 (rsyslogd) score 1 or sacrifice child
      Killed process 1421 (rsyslogd) total-vm:242652kB, anon-rss:0kB, file-rss:784kB
      Out of memory: Kill process 1452 (irqbalance) score 1 or sacrifice child
      Killed process 1452 (irqbalance) total-vm:9252kB, anon-rss:0kB, file-rss:416kB
      INFO: task irqbalance:1452 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task irqbalance:1452 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task automount:1823 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task ksmtuned:1975 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task irqbalance:1452 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task automount:1823 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task ksmtuned:1975 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task irqbalance:1452 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task automount:1823 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      INFO: task ksmtuned:1975 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

      Attachments

        Activity

          [LU-1153] Client Unresponsive
          pjones Peter Jones added a comment -

          As per Terascala ok to close

          pjones Peter Jones added a comment - As per Terascala ok to close

          I have a client with kernel: 2.6.32-220.el6.x86_64

          Is that what you want me to try?

          What git tag should I use to get the 2.1.x code?

          rspellman Roger Spellman (Inactive) added a comment - I have a client with kernel: 2.6.32-220.el6.x86_64 Is that what you want me to try? What git tag should I use to get the 2.1.x code?
          pjones Peter Jones added a comment -

          Roger

          Peter said today that he thought that this issue was independent of using the 2.6.38 client. Are you able to reproduce this same behaviour when running vanilla 2.1.x and RHEL6 clients, say?

          Peter

          pjones Peter Jones added a comment - Roger Peter said today that he thought that this issue was independent of using the 2.6.38 client. Are you able to reproduce this same behaviour when running vanilla 2.1.x and RHEL6 clients, say? Peter

          I updated to the 5th patch on:
          http://review.whamcloud.com/#change,2170

          This bug is still present there.

          rspellman Roger Spellman (Inactive) added a comment - I updated to the 5th patch on: http://review.whamcloud.com/#change,2170 This bug is still present there.

          I had no trouble building against 2.6.38.2.

          I don't see LB_LINUX_COMPILE_IFELSE in .config in this release.

          [root@RS_vm-2_6_38_2 linux-2.6.38.2]# pwd
          /usr/src/kernels/linux-2.6.38.2
          [root@RS_vm-2_6_38_2 linux-2.6.38.2]# grep LB_LINUX_COMPILE_IFELSE .config
          [root@RS_vm-2_6_38_2 linux-2.6.38.2]#

          -Roger

          rspellman Roger Spellman (Inactive) added a comment - I had no trouble building against 2.6.38.2. I don't see LB_LINUX_COMPILE_IFELSE in .config in this release. [root@RS_vm-2_6_38_2 linux-2.6.38.2] # pwd /usr/src/kernels/linux-2.6.38.2 [root@RS_vm-2_6_38_2 linux-2.6.38.2] # grep LB_LINUX_COMPILE_IFELSE .config [root@RS_vm-2_6_38_2 linux-2.6.38.2] # -Roger
          laisiyao Lai Siyao added a comment -

          Hmm, if you use kernel-devel package to build kernel module, lustre configure will fail on LB_LINUX_COMPILE_IFELSE in build/autoconf/lustre-build-linux.m4, I tried to tweak it a bit, but didn't succeed. In my setup, I build lustre patchless client against kernel source, and you need update kernel source Makefile 'EXTRAVERSION' to be in accordance with your kernel version(my setup is '.6-26.rc1.fc15.x86_64').

          laisiyao Lai Siyao added a comment - Hmm, if you use kernel-devel package to build kernel module, lustre configure will fail on LB_LINUX_COMPILE_IFELSE in build/autoconf/lustre-build-linux.m4, I tried to tweak it a bit, but didn't succeed. In my setup, I build lustre patchless client against kernel source, and you need update kernel source Makefile 'EXTRAVERSION' to be in accordance with your kernel version(my setup is '.6-26.rc1.fc15.x86_64').
          jay Jinshan Xiong (Inactive) added a comment - - edited

          I'm trying to compile lustre-master with fc15 but it doesn't work. Can you please tell me which distribution you;re using for your client?

          jay Jinshan Xiong (Inactive) added a comment - - edited I'm trying to compile lustre-master with fc15 but it doesn't work. Can you please tell me which distribution you;re using for your client?
          laisiyao Lai Siyao added a comment - Jinshan, please check http://review.whamcloud.com/#change,1863 and http://review.whamcloud.com/#change,2145 .

          I'll look into this issue. For the warning of simple_setattr(), it means we shouldn't use simple_setattr() if we have truncate method implemented. We used to call inode_setattr() for old kernels, we really need to fix that.

          jay Jinshan Xiong (Inactive) added a comment - I'll look into this issue. For the warning of simple_setattr(), it means we shouldn't use simple_setattr() if we have truncate method implemented. We used to call inode_setattr() for old kernels, we really need to fix that.

          Peter,
          Thanks for the clarification.

          Jinshan, I believe that the problem has to do with writing then over-writing the same file. IOZone does this by default, and the loop that I submitted does that too. When I just write a file once, I am not seeing this problem (in the time frame that I'm testing). You can do that in IOZone with the -+n option.

          Roger

          rspellman Roger Spellman (Inactive) added a comment - Peter, Thanks for the clarification. Jinshan, I believe that the problem has to do with writing then over-writing the same file. IOZone does this by default, and the loop that I submitted does that too. When I just write a file once, I am not seeing this problem (in the time frame that I'm testing). You can do that in IOZone with the -+n option. Roger

          People

            laisiyao Lai Siyao
            rspellman Roger Spellman (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: