Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-14883

umount blocked due to remaining statahead threads

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.12.6
    • None
    • CentOS 7.9 clients (3.10.0-1160.24.1.el7.x86_64)
    • 3
    • 9223372036854775807

    Description

      We noticed today on a cluster running 2.12.6 clients that a large number of them were blocked at umount when applying a security patch. No jobs were running anymore on these clients, but umount doesn't finish.

      Dumping the tasks on a client has revealed that 3 ll_sa threads are still running:

      PID: 25857  TASK: ffff9a78cd32b180  CPU: 36  COMMAND: "ll_sa_24325"
       #0 [ffff9a7519997c98] __schedule at ffffffffa2f8899f
       #1 [ffff9a7519997d28] schedule at ffffffffa2f88eb9
       #2 [ffff9a7519997d38] ll_statahead_thread at ffffffffc133787e [lustre]
       #3 [ffff9a7519997ec8] kthread at ffffffffa28c5da1
       #4 [ffff9a7519997f50] ret_from_fork_nospec_begin at ffffffffa2f95de4
      
      PID: 34631  TASK: ffff9b754a283180  CPU: 10  COMMAND: "ll_sa_30018"
       #0 [ffff99fd4ebd7c98] __schedule at ffffffffa2f8899f
       #1 [ffff99fd4ebd7d28] schedule at ffffffffa2f88eb9
       #2 [ffff99fd4ebd7d38] ll_statahead_thread at ffffffffc133787e [lustre]
       #3 [ffff99fd4ebd7ec8] kthread at ffffffffa28c5da1
       #4 [ffff99fd4ebd7f50] ret_from_fork_nospec_begin at ffffffffa2f95de4
      
      PID: 41809  TASK: ffff9afeeeb8b180  CPU: 40  COMMAND: "ll_sa_40406"
       #0 [ffff99fe617c7c98] __schedule at ffffffffa2f8899f
       #1 [ffff99fe617c7d28] schedule at ffffffffa2f88eb9
       #2 [ffff99fe617c7d38] ll_statahead_thread at ffffffffc133787e [lustre]
       #3 [ffff99fe617c7ec8] kthread at ffffffffa28c5da1
       #4 [ffff99fe617c7f50] ret_from_fork_nospec_begin at ffffffffa2f95de4
      

      This is blocking umount/ll_kill_super():

      crash> bt -f 19912
      PID: 19912  TASK: ffff9b7532fae300  CPU: 44  COMMAND: "umount"
       #0 [ffff9a002139bd08] __schedule at ffffffffa2f8899f
          ffff9a002139bd10: 0000000000000082 ffff9a002139bfd8 
          ffff9a002139bd20: ffff9a002139bfd8 ffff9a002139bfd8 
          ffff9a002139bd30: 000000000001acc0 ffff9afb94b4e300 
          ffff9a002139bd40: ffff9b797f2d3940 00000001e05d9d9f 
          ffff9a002139bd50: 0000000000000282 000000006a88e28b 
          ffff9a002139bd60: ffff9a002139bdc0 ffff9b797f313940 
          ffff9a002139bd70: ffff9a002139bdc0 00000001e05d9e1c 
          ffff9a002139bd80: ffff9b797f313940 0000000000015980 
          ffff9a002139bd90: ffff9a002139bda0 ffffffffa2f88eb9 
       #1 [ffff9a002139bd98] schedule at ffffffffa2f88eb9
          ffff9a002139bda0: ffff9a002139be48 ffffffffa2f868d8 
       #2 [ffff9a002139bda8] schedule_timeout at ffffffffa2f868d8
          ffff9a002139bdb0: ffff9a0073ffd888 0000000000000282 
          ffff9a002139bdc0: ffff9b797f314b48 ffff9b797f314b48 
          ffff9a002139bdd0: 00000001e05d9e1c ffff9b797f313940 
          ffff9a002139bde0: ffffffffa28ad7f0 ffff9b7532fae300 
          ffff9a002139bdf0: ffffffffffffffff 0000000000000000 
          ffff9a002139be00: 0000000000000000 0000000000000000 
          ffff9a002139be10: 0000000000000282 000000006a88e28b 
          ffff9a002139be20: ffff9b7532fae300 ffff9a0030bf0000 
          ffff9a002139be30: ffff9b7532fae300 ffffffffa3907570 
          ffff9a002139be40: ffff9b7532faead0 ffff9a002139be70 
          ffff9a002139be50: ffffffffc130a449 
       #3 [ffff9a002139be50] ll_kill_super at ffffffffc130a449 [lustre]
          ffff9a002139be58: 0000000000000000 ffff9b75f7eba000 
          ffff9a002139be68: ffffffffc0fa8ae0 ffff9a002139be88 
          ffff9a002139be78: ffffffffc0f599ad 
       #4 [ffff9a002139be78] lustre_kill_super at ffffffffc0f599ad [obdclass]
          ffff9a002139be80: ffff9b75f7eba000 ffff9a002139bea8 
          ffff9a002139be90: ffffffffa2a50fae 
       #5 [ffff9a002139be90] deactivate_locked_super at ffffffffa2a50fae
          ffff9a002139be98: ffff9b75f7eba000 0000000000000000 
          ffff9a002139bea8: ffff9a002139bec0 ffffffffa2a51736 
       #6 [ffff9a002139beb0] deactivate_super at ffffffffa2a51736
          ffff9a002139beb8: ffff9a0073ffd680 ffff9a002139bed8 
          ffff9a002139bec8: ffffffffa2a70dbf 
       #7 [ffff9a002139bec8] cleanup_mnt at ffffffffa2a70dbf
          ffff9a002139bed0: ffff9a0073ffd6b8 ffff9a002139bee8 
          ffff9a002139bee0: ffffffffa2a70e52 
       #8 [ffff9a002139bee0] __cleanup_mnt at ffffffffa2a70e52
          ffff9a002139bee8: ffff9a002139bf28 ffffffffa28c28db 
       #9 [ffff9a002139bef0] task_work_run at ffffffffa28c28db
          ffff9a002139bef8: ffff9b7532faeb04 0000000000000002 
          ffff9a002139bf08: ffff9a002139bf58 0000000000000000 
          ffff9a002139bf18: 0000000000000000 0000000000000000 
          ffff9a002139bf28: ffff9a002139bf48 ffffffffa282cc65 
      #10 [ffff9a002139bf30] do_notify_resume at ffffffffa282cc65
          ffff9a002139bf38: 0000000000000000 0000000000000000 
          ffff9a002139bf48: 0000000000000000 ffffffffa2f962ef 
      #11 [ffff9a002139bf50] int_signal at ffffffffa2f962ef
          RIP: 00002b88f55a5a97  RSP: 00007fffaf522e48  RFLAGS: 00000246
          RAX: 0000000000000000  RBX: 00005603f7e53040  RCX: ffffffffffffffff
          RDX: 0000000000000001  RSI: 0000000000000000  RDI: 00005603f7e54840
          RBP: 00005603f7e54840   R8: 0000000000000000   R9: 0000000000000005
          R10: 00007fffaf5228a0  R11: 0000000000000246  R12: 00002b88f4e3ad78
          R13: 0000000000000000  R14: 00005603f7e54960  R15: 00005603f7e53040
          ORIG_RAX: 00000000000000a6  CS: 0033  SS: 002b
      
      sb = 0xffff9b75f7eba000
      
      crash> struct super_block.s_fs_info ffff9b75f7eba000
        s_fs_info = 0xffff9b75f7eb9000
      crash> struct lustre_sb_info.lsi_llsbi 0xffff9b75f7eb9000
        lsi_llsbi = 0xffff9a0030bf0000
      crash> struct ll_sb_info.ll_umounting 0xffff9a0030bf0000
        ll_umounting = 1
      crash> struct ll_sb_info.ll_sa_running 0xffff9a0030bf0000
        ll_sa_running = {
          counter = 3
        }
      

      Here on this cluster clients are all 2.12.6, and server is Oak which is running 2.12.6 too.
      We have one client that is available for live troubleshooting (it's still blocked at umount).

      Is there another known bug with such problem in 2.12? Happy to provide more info.

      Attachments

        Issue Links

          Activity

            [LU-14883] umount blocked due to remaining statahead threads

            Hi Lai,

            Unfortunately, the process is gone, same for the other ones.

            However for 30018, there is also a ll_agl_30018:

            USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
            root 34631 0.0 0.0 0 0 ? S Jun12 0:00 [ll_sa_30018]
            root 34632 0.0 0.0 0 0 ? S Jun12 0:00 [ll_agl_30018]
            

            But I don't think it's releaving anything interesting:

            PID: 34632 TASK: ffff9b754a280000 CPU: 31 COMMAND: "ll_agl_30018"
             #0 [ffff9a685cb9bd88] __schedule at ffffffffa2f8899f
             #1 [ffff9a685cb9be18] schedule at ffffffffa2f88eb9
             #2 [ffff9a685cb9be28] ll_agl_thread at ffffffffc1336d9e [lustre]
             #3 [ffff9a685cb9bec8] kthread at ffffffffa28c5da1
             #4 [ffff9a685cb9bf50] ret_from_fork_nospec_begin at ffffffffa2f95de4
            

             
            Attaching the output of foreach bt just in case as foreach_bt.log

            Qian, thanks, I will try to get more debug data but I don't have a perfect reproducer yet, we're just seeing that problem often when we restart nodes.

            sthiell Stephane Thiell added a comment - Hi Lai, Unfortunately, the process is gone, same for the other ones. However for 30018, there is also a ll_agl_30018: USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND root 34631 0.0 0.0 0 0 ? S Jun12 0:00 [ll_sa_30018] root 34632 0.0 0.0 0 0 ? S Jun12 0:00 [ll_agl_30018] But I don't think it's releaving anything interesting: PID: 34632 TASK: ffff9b754a280000 CPU: 31 COMMAND: "ll_agl_30018" #0 [ffff9a685cb9bd88] __schedule at ffffffffa2f8899f #1 [ffff9a685cb9be18] schedule at ffffffffa2f88eb9 #2 [ffff9a685cb9be28] ll_agl_thread at ffffffffc1336d9e [lustre] #3 [ffff9a685cb9bec8] kthread at ffffffffa28c5da1 #4 [ffff9a685cb9bf50] ret_from_fork_nospec_begin at ffffffffa2f95de4   Attaching the output of foreach bt just in case as foreach_bt.log Qian, thanks, I will try to get more debug data but I don't have a perfect reproducer yet, we're just seeing that problem often when we restart nodes.
            laisiyao Lai Siyao added a comment -

            For statahead thread name "ll_sa_24325", "24325" is the process that triggered statahead, can you check whether these processes are still there?

            laisiyao Lai Siyao added a comment - For statahead thread name "ll_sa_24325", "24325" is the process that triggered statahead, can you check whether these processes are still there?
            qian_wc Qian Yingjin added a comment - - edited

            Hi Stephane,

            Could you please provide the reproducer program?
            Or could you please provide the Lustre debug log?

            lctl set_param debug=reada
            lctl set_param debug=+vfstrace
            lctl set_param subsystem_debug=llite
            lctl clear
            run the reproducer
            lctl dk > log
            
            Get the unfinalized stat ahead thread:
            ps -el | grep ll_sa
            

            Thanks,
            Qian

            qian_wc Qian Yingjin added a comment - - edited Hi Stephane, Could you please provide the reproducer program? Or could you please provide the Lustre debug log? lctl set_param debug=reada lctl set_param debug=+vfstrace lctl set_param subsystem_debug=llite lctl clear run the reproducer lctl dk > log Get the unfinalized stat ahead thread: ps -el | grep ll_sa Thanks, Qian

            Hi Qian,

            It looks like it's relatively easy to reproduce on this cluster, yes. I checked and as we are based on Lustre 2.12.6 on clients and servers, we don't have the patch that was recently reverted in master for LU-14868.

            Thanks!

            sthiell Stephane Thiell added a comment - Hi Qian, It looks like it's relatively easy to reproduce on this cluster, yes. I checked and as we are based on Lustre 2.12.6 on clients and servers, we don't have the patch that was recently reverted in master for LU-14868 . Thanks!
            qian_wc Qian Yingjin added a comment -

            Sure, Peter.

            Hi Stephane,
            Is this bug easy to reproduce?
            Did your installed Lustre include the reverted patch in master: https://review.whamcloud.com/#/c/44371/?

            Thanks,
            Qian

            qian_wc Qian Yingjin added a comment - Sure, Peter. Hi Stephane, Is this bug easy to reproduce? Did your installed Lustre include the reverted patch in master: https://review.whamcloud.com/#/c/44371/? Thanks, Qian
            pjones Peter Jones added a comment -

            Yingjin

            Could you please assist

            Thanks

            Peter

            pjones Peter Jones added a comment - Yingjin Could you please assist Thanks Peter

            People

              qian_wc Qian Yingjin
              sthiell Stephane Thiell
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: