Lustre / LU-9193

Multiple hangs observed with many open/getattr

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.7
    • Affects Version/s: Lustre 2.7.0, Lustre 2.5.3, Lustre 2.8.0, Lustre 2.9.0
    • Labels: None
    • Environment: CentOS 7.2
      CentOS 6.[7-8]
      SELinux enforcing
    • Severity: 3

    Description

      Tested (reproduced) on 2.5, 2.7, 2.8 and 2.9.

      MPI job on 300 nodes: 2/3 open and 1/3 stat on the same file => hang. (The MDS server threads are idle and the load is close to 0; threads are waiting for a lock, but no thread holds an active lock. After a long time, 15-30 min, threads become responsive again and resume operations normally.)
      Same job with only stat => no problem
      Same job with only open => no problem

      Some of the logs were similar to LU-5497 and LU-4579, but their patches did not fix the issue.
      If all of the job's clients are evicted manually, Lustre recovers and resumes normal operation.
      Lustre 2.7.2 with the patch from LU-5781 (solve a race for LRU lock cancel) was tested too.

      So far, what prevents the issue is disabling SELinux xattr support for Lustre:

      # cat policy-noxattr-lustre.patch
      --- serefpolicy-3.13.1/policy/modules/kernel/filesystem.te.orig    2016-08-02 19:56:29.997519918 +0000
      +++ serefpolicy-3.13.1/policy/modules/kernel/filesystem.te         2016-08-02 19:57:10.124519918 +0000
      @@ -32,7 +32,8 @@ fs_use_xattr gfs2 gen_context(system_u:o
       fs_use_xattr gpfs gen_context(system_u:object_r:fs_t,s0);
       fs_use_xattr jffs2 gen_context(system_u:object_r:fs_t,s0);
       fs_use_xattr jfs gen_context(system_u:object_r:fs_t,s0);
      -fs_use_xattr lustre gen_context(system_u:object_r:fs_t,s0);
      +# Lustre is not supported Selinux correctly
      +#fs_use_xattr lustre gen_context(system_u:object_r:fs_t,s0);
       fs_use_xattr ocfs2 gen_context(system_u:object_r:fs_t,s0);
       fs_use_xattr overlay gen_context(system_u:object_r:fs_t,s0);
       fs_use_xattr xfs gen_context(system_u:object_r:fs_t,s0);
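
      (Not from the ticket itself: a quicker way to test the same hypothesis, without rebuilding the refpolicy package, is to switch SELinux to permissive on the clients and rerun the reproducer. These are stock SELinux commands, not Lustre-specific.)

      getenforce        # check the current mode on a client
      setenforce 0      # temporarily switch to permissive (reverts at reboot);
                        # if the hang disappears, the SELinux xattr path is implicated
      sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config   # make it persistent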

      Reproducer (127 clients: vm3 to vm130; vm0 is the MDS, vm1 and vm2 are OSSes):
      mkdir /lustre/testfs/testuser/testdir; sleep 4; clush -bw vm[3-130] 'seq 0 1000 | xargs -P 7 -I{} sh -c "(({}%3==0)) && touch /lustre/testfs/testuser/testdir/foo$(hostname -s | tr -d vm) || stat /lustre/testfs/testuser/testdir > /dev/null"'
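
      (The same per-client workload as the one-liner above, rewritten as a standalone script for readability; the directory path and the 7-way xargs parallelism are taken from that command, and the (( )) arithmetic assumes /bin/sh is bash, as on CentOS.)

      #!/bin/sh
      # Per-client load: 1001 iterations, 7 in parallel. One iteration in three
      # touches (opens/creates) a per-host file; the other two stat the shared
      # test directory -- the open/getattr mix that triggers the MDS hang.
      DIR=/lustre/testfs/testuser/testdir
      HOSTID=$(hostname -s | tr -d vm)

      seq 0 1000 | xargs -P 7 -I{} sh -c \
          "(({}%3==0)) && touch $DIR/foo$HOSTID || stat $DIR > /dev/null"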

      Tested disabling statahead. No impact.
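
      (The ticket does not record the exact command used; the usual way to turn statahead off on the clients is to zero the statahead_max tunable.)

      lctl set_param llite.*.statahead_max=0     # 0 disables statahead on this client
      lctl get_param llite.*.statahead_stats     # confirm there is no statahead activity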

      Traces of stuck processes on the MDS all look the same (could be related to DDN-366):
      PID: 8631   TASK: ffff880732202280   CPU:
      [ffff88071587f760] __schedule at ffffffff8163b6cd
      [ffff88071587f7c8] schedule at ffffffff8163bd69
      [ffff88071587f7d8] schedule_timeout at ffffffff816399c5
      [ffff88071587f880] ldlm_completion_ast at ffffffffa08b7fe1
      [ffff88071587f920] ldlm_cli_enqueue_local at ffffffffa08b9c20
      [ffff88071587f9b8] mdt_object_local_lock at ffffffffa0f3c6d2
      [ffff88071587fa60] mdt_object_lock_internal at ffffffffa0f3cffb
      [ffff88071587faa0] mdt_getattr_name_lock at ffffffffa0f3ddf6
      [ffff88071587fb28] mdt_intent_getattr at ffffffffa0f3f3f0
      [ffff88071587fb68] mdt_intent_policy at ffffffffa0f42e7c
      [ffff88071587fbd0] ldlm_lock_enqueue at ffffffffa089f1f7
      [ffff88071587fc28] ldlm_handle_enqueue0 at ffffffffa08c4fb2
      [ffff88071587fcb8] tgt_enqueue at ffffffffa09493f2
      [ffff88071587fcd8] tgt_request_handle at ffffffffa094ddf5
      [ffff88071587fd20] ptlrpc_server_handle_request at ffffffffa08f87cb
      [ffff88071587fde8] ptlrpc_main at ffffffffa08fc0f0
      [ffff88071587fec8] kthread at ffffffff810a5b8f
      [ffff88071587ff50] ret_from_fork at ffffffff81646c98

      I do not yet have the client traces, but the last call on the clients is mdc_enqueue().

      Crash dump analysis has started (in progress). So far there is nothing obvious about SELinux; the first analysis leads to ldlm_handle_enqueue0() and ldlm_lock_enqueue(). We see some processes idle for 280s; the system is not entirely stuck but very, very slow (we had to crash the VM to get a proper dump, because it was very hard to use crash while the threads are not 100% stuck).
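
      (For reference, a crash session for this kind of analysis looks roughly as follows; the vmlinux/vmcore paths are placeholders and the commands are standard crash utility commands, not output from this particular dump.)

      crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux /var/crash/<dump-dir>/vmcore
      crash> ps -m            # tasks sorted by time since they last ran (shows the ~280s-idle threads)
      crash> foreach UN bt    # backtraces of every task in uninterruptible sleep
      crash> bt 8631          # backtrace of the mdt thread shown above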

      Attachments

        1. logs_050717.tar.gz
          89.55 MB
          Jean-Baptiste Riaux
        2. logs-9193-patchset11-tests.tar.gz
          8.46 MB
          Jean-Baptiste Riaux
        3. LU-9193.tar.gz
          16.73 MB
          Jean-Baptiste Riaux
        4. LU-9193-patchset8.tar.gz
          71.05 MB
          Jean-Baptiste Riaux
        5. lustre-logs.tar.gz
          12.65 MB
          Jean-Baptiste Riaux
        6. lustre-logs-210617.tar.gz
          14.31 MB
          Jean-Baptiste Riaux
        7. lustre-LU9193-240817.tar.gz
          79.36 MB
          Jean-Baptiste Riaux
        8. vm0.tar.gz
          13.93 MB
          Jean-Baptiste Riaux
        9. vm105.tar.gz
          134 kB
          Jean-Baptiste Riaux
        10. vm62.tar.gz
          133 kB
          Jean-Baptiste Riaux

        Issue Links

          Activity


            gerrit Gerrit Updater added a comment -
            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41387/
            Subject: LU-9193 security: return security context for metadata ops
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: cd17e1a2e6367a7c3f07753e71fa569c28000c81

            gerrit Gerrit Updater added a comment -
            Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41387
            Subject: LU-9193 security: return security context for metadata ops
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 68554f5ebf1479748b9bc9f6f86d4eb4c5dc9873

            tappro Mikhail Pershin added a comment -
            Created LU-12212 for reconnect issue

            tappro Mikhail Pershin added a comment -
            This patch somehow causes frequent client reconnects:

            Apr 21 03:09:50 nodez kernel: Lustre: 4239:0:(client.c:2134:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1555830583/real 1555830583] req@ffff8800ad5fc740 x1631406542374112/t0(0) o101->lustre-MDT0000-mdc-ffff8800af2fb800@0@lo:12/10 lens 616/4752 e 0 to 1 dl 1555830590 ref 2 fl Rpc:X/2/ffffffff rc 0/-1
            Apr 21 03:09:50 nodez kernel: Lustre: lustre-MDT0000-mdc-ffff8800af2fb800: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
            Apr 21 03:09:50 nodez kernel: Lustre: lustre-MDT0000: Client 2f22eecb-5055-b4d6-16b5-a958700fcbda (at 0@lo) reconnecting
            Apr 21 03:09:50 nodez kernel: Lustre: lustre-MDT0000: Connection restored to 4e8c3a45-3b8a-cd6a-047f-22363e8171e6 (at 0@lo)
            Apr 21 03:09:50 nodez kernel: Lustre: Skipped 1 previous similar message

            I am seeing many such messages during a dbench run, and git bisect shows this patch is the source of the problem.

            gerrit Gerrit Updater added a comment -
            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34573/
            Subject: LU-9193 security: return security context for metadata ops
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 7aa5ae2673f70ef851fb903b280a0fc9a47c476b

            gerrit Gerrit Updater added a comment -
            Sebastien Piechurski (sebastien.piechurski@atos.net) uploaded a new patch: https://review.whamcloud.com/34573
            Subject: LU-9193 security: return security context for metadata ops
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 24b11428909ea77c7f4a101b4e2b7f8b2c490f06

            pjones Peter Jones added a comment -
            Landed for 2.13

            gerrit Gerrit Updater added a comment -
            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/26831/
            Subject: LU-9193 security: return security context for metadata ops
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: fca35f74f9ec5c5ed77e774f3e3209d9df057a01

            adilger Andreas Dilger added a comment -
            Bruno, unless you think your current patch is fundamentally moving in the wrong direction (I haven't looked at it yet), I think it makes sense to land the current patch, which solves the first problem, and then make a second patch to address this next problem.

            At least we are moving in the right direction to fix the first problem, which is what the application was originally hitting, and the second problem you are able to reproduce is possibly not actually being seen by applications, so we have more time to work on a solution.

            bfaccini Bruno Faccini (Inactive) added a comment -
            The last deadlock situation encountered in this ticket is unrelated to the original issue the ticket was created for; it appears to be more generic and is now being addressed in LU-10235/LU-10262.

            People

              Assignee: bfaccini Bruno Faccini (Inactive)
              Reporter: riauxjb Jean-Baptiste Riaux (Inactive)
              Votes: 0
              Watchers: 23

              Dates

                Created:
                Updated:
                Resolved: