Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • None
    • Lustre 2.1.3
    • 3
    • 6118

    Description

      We have 4 OSSes that crashed at the same time, at umount, with the following bt:

      PID: 18173 TASK: ffff8803376dc040 CPU: 4 COMMAND: "umount"
      #0 [ffff8802b115f8d0] machine_kexec at ffffffff8102895b
      #1 [ffff8802b115f930] crash_kexec at ffffffff810a4622
      #2 [ffff8802b115fa00] panic at ffffffff81484657
      #3 [ffff8802b115fa80] lbug_with_loc at ffffffffa04ade5b [libcfs]
      #4 [ffff8802b115faa0] llog_recov_thread_stop at ffffffffa072e55b [ptlrpc]
      #5 [ffff8802b115fad0] llog_recov_thread_fini at ffffffffa072e593 [ptlrpc]
      #6 [ffff8802b115faf0] filter_llog_finish at ffffffffa0c7d3dd [obdfilter]
      #7 [ffff8802b115fb20] obd_llog_finish at ffffffffa057c2f8 [obdclass]
      #8 [ffff8802b115fb40] filter_precleanup at ffffffffa0c7cdaf [obdfilter]
      #9 [ffff8802b115fba0] class_cleanup at ffffffffa05a3ca7 [obdclass]
      #10 [ffff8802b115fc20] class_process_config at ffffffffa05a5feb [obdclass]
      #11 [ffff8802b115fcb0] class_manual_cleanup at ffffffffa05a6d29 [obdclass]
      #12 [ffff8802b115fd70] server_put_super at ffffffffa05b2c0c [obdclass]
      #13 [ffff8802b115fe40] generic_shutdown_super at ffffffff8116542b
      #14 [ffff8802b115fe60] kill_anon_super at ffffffff81165546
      #15 [ffff8802b115fe80] lustre_kill_super at ffffffffa05a8966 [obdclass]
      #16 [ffff8802b115fea0] deactivate_super at ffffffff811664e0
      #17 [ffff8802b115fec0] mntput_no_expire at ffffffff811826bf
      #18 [ffff8802b115fef0] sys_umount at ffffffff81183188
      #19 [ffff8802b115ff80] system_call_fastpath at ffffffff810030f2
      RIP: 00007f62ddfbdd67 RSP: 00007fffab738308 RFLAGS: 00010202
      RAX: 00000000000000a6 RBX: ffffffff810030f2 RCX: 0000000000000010
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f62deeb3bb0
      RBP: 00007f62deeb3b80 R8: 00007f62deeb3bd0 R9: 0000000000000000
      R10: 00007fffab738130 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 00007f62deeb3c10
      ORIG_RAX: 00000000000000a6 CS: 0033 SS: 002b

      This bt is identical to the one shown in LU-1194, which is supposed to be fixed in 2.1.3.

      The site is classified, so I can't upload the binary crash dump, but I can export the contents of some structures upon request.

      Attachments

        1. ptlrpcd.c
          32 kB
        2. recov_thread.c
          24 kB

        Activity

          [LU-2615] group of OSS crashed at umount

          Ah, I see. Thank you very much!

          lixi Li Xi (Inactive) added a comment

          In ptlrpcd_stop(), cfs_wait_for_completion(&pc->pc_finishing) is called to wait for the pending RPCs to complete.

          Can the issue be reproduced at your site?

          hongchao.zhang Hongchao Zhang added a comment
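
          For reference, the shutdown pattern referred to above looks roughly like the sketch below. This is a minimal illustration using the generic Linux completion API with hypothetical names, not the actual ptlrpcd code (the real code uses the cfs_* wrappers and struct ptlrpcd_ctl):

            #include <linux/completion.h>
            #include <linux/kthread.h>
            #include <linux/sched.h>

            /* hypothetical stand-in for a ptlrpcd-like worker control block;
             * wc_finishing is assumed to have been set up with init_completion() */
            struct worker_ctl {
                    struct completion wc_finishing;   /* signalled when the thread is done */
                    int               wc_stop;        /* set by the stopper */
            };

            static int worker_main(void *arg)
            {
                    struct worker_ctl *wc = arg;

                    while (!wc->wc_stop) {
                            /* ... process queued RPCs ... */
                            cond_resched();
                    }
                    /* tell the stopper that all pending work has been drained */
                    complete(&wc->wc_finishing);
                    return 0;
            }

            /* analogue of ptlrpcd_stop(): blocks until the worker has finished */
            static void worker_stop(struct worker_ctl *wc)
            {
                    wc->wc_stop = 1;
                    wait_for_completion(&wc->wc_finishing);
            }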

          Hi Hongchao,

          Sorry, maybe 'race' is not the right word to express my thought.

          At the time llcd_send() returns, the completion handler llcd_interpret() might not have been called yet, right? While the llcd is still in use by the RPC in flight, llog_recov_thread_stop() will hit an LBUG. I can't find any code in filter_llog_finish() that waits for the RPC to finish, so I guess it is possible that the RPC is still in flight when llog_recov_thread_stop() is called. Am I right?

          Thanks
          Li Xi

          lixi Li Xi (Inactive) added a comment
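
          The window described above can be illustrated with the following simplified sketch (hypothetical names only; the real logic lives in llcd_send()/llcd_interpret() in recov_thread.c and uses ptlrpc interpret callbacks):

            /* illustrative sketch, not the actual Lustre source */
            #include <linux/bug.h>

            static int llcds_in_flight;    /* descriptors handed to an async RPC */

            /* analogue of llcd_interpret(): runs only once the RPC reply
             * (or an error) has been processed by a ptlrpcd thread */
            static void demo_interpret(void)
            {
                    llcds_in_flight--;     /* the descriptor is freed here */
            }

            /* analogue of llcd_send(): queues the async RPC and returns at once;
             * demo_interpret() will be called later, when the reply arrives */
            static void demo_send(void)
            {
                    llcds_in_flight++;
            }

            /* analogue of llog_recov_thread_stop(): if nothing between
             * demo_send() and this point waits for the reply, the counter can
             * still be non-zero here, the condition that trips the LBUG */
            static void demo_stop(void)
            {
                    if (llcds_in_flight != 0)
                            BUG();         /* the Lustre code calls LBUG() */
            }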

          What are the two threads involved in the race?
          Normally, llog_recov_thread_stop() is only called by llog_recov_thread_fini(), which in turn is called in two places:
          one is the cleanup for a failed llog_recov_thread_init() call, the other is the normal cleanup phase during device cleanup
          (called from filter_llog_finish()). They cannot be called simultaneously.

          Could you please attach some more info about this issue, and can it be reproduced at your site?

          hongchao.zhang Hongchao Zhang added a comment

          We hit the same problem on lustre-2.1.6 too.

          After reading some of the code, I am wondering whether the following race could happen. Please correct me if I am wrong.

          filter_llog_finish
          --llog_recov_thread_fini
          ----llog_sync
          ------llog_obd_repl_sync
          --------llog_cancel
          ----------llog_obd_repl_cancel
          ------------llcd_push
          --------------llcd_send
          ----------------Sending async
          ----llog_recov_thread_stop
          ------LBUG, because llcd_send() has sent an llcd and llcd_interpret() has not been called yet, since no reply has been received.

          Thanks!

          lixi Li Xi (Inactive) added a comment

          Hi,

          I have asked people on site for the results of the tests.

          Cheers,
          Sebastien.

          sebastien.buisson Sebastien Buisson (Inactive) added a comment

          Hi, what is the output of the test? Thanks

          hongchao.zhang Hongchao Zhang added a comment

          Hi,

          Yes, it will disable the ptlrpcd thread pools (although it does not remove the patch completely), and it should still be a relevant test.

          Thanks

          hongchao.zhang Hongchao Zhang added a comment

          Hi,

          It might be difficult to get the opportunity to install packages with those 2 patches reverted at the customer site.
          Instead, could we just set ptlrpcd_bind_policy=1 and max_ptlrpcds=2 as options for the ptlrpc kernel module, so that it behaves as if the patch from ORNL-22 had not been applied?
          Is that still a relevant test for you?

          Thanks,
          Sebastien.

          sebastien.buisson Sebastien Buisson (Inactive) added a comment
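
          If that approach is acceptable, those module options would typically be placed in a modprobe configuration file (the file name and path below are only an example) and take effect the next time the ptlrpc module is loaded:

            # /etc/modprobe.d/lustre.conf  (example location)
            # limit ptlrpcd to 2 threads and use binding policy 1,
            # approximating the behaviour without the ORNL-22 patch
            options ptlrpc max_ptlrpcds=2 ptlrpcd_bind_policy=1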

          the remaining "llcd" should have been sent over ptlrpc_request for llog_ctxt->loc_llcd == NULL, and the request could not finish, then "llcd_interpret" wasn't
          called to free the "llcd", there are 2 patches (ORNL-22 general ptlrpcd threads pool support; LU-1144 implement a NUMA aware ptlrpcd binding policy) among
          the patches applied currently is related to it, could you please help to revert the 2 patches and test it, Thanks!

          hongchao.zhang Hongchao Zhang added a comment

          People

            Assignee: hongchao.zhang Hongchao Zhang
            Reporter: louveta Alexandre Louvet (Inactive)
            Votes: 0
            Watchers: 8

            Dates

              Created:
              Updated:
              Resolved: