Lustre / LU-10871

MDS hit LBUG: LustreError: 2566:0:(osd_handler.c:3304:osd_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.11.0, Lustre 2.14.0
    • Environment: 2.11.0 RC3, lustre-master-ib build 73
    • Severity: 3
    • 9223372036854775807

    Description

      After soak testing had been running for about 7 hours, the MDS hit the following LBUG:

      Apr 1 23:58:08 soak-11 multipathd: sdp: mark as failed
      Apr 1 23:58:08 soak-11 multipathd: 360080e50001fedb80000015952012962: Entering recovery mode: max_retries=300
      Apr 1 23:58:08 soak-11 multipathd: 360080e50001fedb80000015952012962: remaining active paths: 0
      Apr 1 23:58:09 soak-11 kernel: LustreError: 2566:0:(osd_handler.c:3853:osd_ref_del()) soaked-MDT0003: nlink == 0 on [0x2c000d70d:0x389:0x0], maybe an upgraded file? (LU-3915)
      Apr 1 23:58:09 soak-11 kernel: LustreError: 2566:0:(osd_handler.c:3304:osd_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed: 
      Apr 1 23:58:09 soak-11 kernel: LustreError: 2566:0:(osd_handler.c:3304:osd_destroy()) LBUG
      Apr 1 23:58:09 soak-11 kernel: Pid: 2566, comm: mdt_out00_004
      Apr 1 23:58:09 soak-11 kernel: Call Trace:
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc0dc47ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc0dc483c>] lbug_with_loc+0x4c/0xb0 [libcfs]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc142a0b0>] osd_destroy+0x4a0/0x760 [osd_ldiskfs]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc1429237>] ? osd_ref_del+0x2c7/0x6a0 [osd_ldiskfs]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc1428589>] ? osd_attr_set+0x199/0xb80 [osd_ldiskfs]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffff816b3232>] ? down_write+0x12/0x3d
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11c9141>] out_obj_destroy+0x101/0x2c0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11c93b0>] out_tx_destroy_exec+0x20/0x190 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11c3e91>] out_tx_end+0xe1/0x5c0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11c7a72>] out_handle+0x1442/0x1bb0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11568a2>] ? lustre_msg_get_opc+0x22/0xf0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11be0a9>] ? tgt_request_preprocess.isra.28+0x299/0x7a0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11beeda>] tgt_request_handle+0x92a/0x13b0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc1164813>] ptlrpc_server_handle_request+0x253/0xab0 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffffc11616c8>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
      Apr 1 23:58:09 soak-11 kernel: [<ffffffff810c7c82>] ? default_wake_function+0x12/0x20
      Apr 1 23:58:09 soak-11 kernel: [<ffffffff810bdc4b>] ? __wake_up_common+0x5b/0x90
      Apr 1 23:58:10 soak-11 kernel: [<ffffffffc1167fc2>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
      Apr 1 23:58:10 soak-11 kernel: [<ffffffffc1167530>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
      Apr 1 23:58:10 soak-11 kernel: [<ffffffff810b4031>] kthread+0xd1/0xe0
      Apr 1 23:58:10 soak-11 kernel: [<ffffffff810b3f60>] ? kthread+0x0/0xe0
      Apr 1 23:58:10 soak-11 kernel: [<ffffffff816c0577>] ret_from_fork+0x77/0xb0
      Apr 1 23:58:10 soak-11 kernel: [<ffffffff810b3f60>] ? kthread+0x0/0xe0
      Apr 1 23:58:10 soak-11 kernel:
      
      Apr 1 23:58:10 soak-11 kernel: Kernel panic - not syncing: LBUG
      Apr 2 00:02:02 soak-11 systemd: Starting Stop Read-Ahead Data Collection...
      Apr 2 00:02:02 soak-11 rsyslogd: action 'action 0' resumed (module 'builtin:omfwd') [v8.24.0 try http://www.rsyslog.com/e/2359 ]
      Apr 2 00:02:02 soak-11 rsyslogd: action 'action 0' resumed (module 'builtin:omfwd') [v8.24.0 try http://www.rsyslog.com/e/2359 ]
      Apr 2 00:02:02 soak-11 systemd: Started Stop Read-Ahead Data Collection.
      Apr 2 00:03:37 soak-11 chronyd[1280]: Source 64.6.144.6 replaced with 69.10.161.7
      Apr 2 00:10:01 soak-11 systemd: Created slice User Slice of root.
      Apr 2 00:10:01 soak-11 systemd: Starting User Slice of root.
      Apr 2 00:10:01 soak-11 systemd: Started Session 1 of user root.
      Apr 2 00:10:01 soak-11 systemd: Starting Session 1 of user root.
      Apr 2 00:10:01 soak-11 CROND[2234]: (root) CMD (/usr/lib64/sa/sa1 1 1)
      Apr 2 00:10:01 soak-11 systemd: Removed slice User Slice of root.
      Apr 2 00:10:01 soak-11 systemd: Stopping User Slice of root.
      Apr 2 00:15:54 soak-11 systemd: Starting Cleanup of Temporary Directories...
      Apr 2 00:15:54 soak-11 systemd: Started Cleanup of Temporary Directories.
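
      The crashed thread is an OUT (object update) RPC handler (mdt_out00_004): out_handle() -> out_tx_destroy_exec() -> out_obj_destroy() -> osd_destroy() in osd-ldiskfs, right after osd_ref_del() in the same thread reported nlink == 0, apparently on the same object. The assertion means the destroy was issued against an object whose lu_object header was already marked dying. Below is a minimal, simplified sketch of the guard that fires; it is not the actual Lustre source, and the type layering is an assumption based on the assertion text and the stack trace:

      /*
       * Simplified sketch only, not the osd-ldiskfs source.  It assumes the
       * relationships implied by the assertion text: a dt_object embeds a
       * struct lu_object as do_lu, whose lo_header carries the "dying" state
       * tested by lu_object_is_dying().
       */
      static int osd_destroy(const struct lu_env *env, struct dt_object *dt,
                             struct thandle *th)
      {
              /*
               * The LBUG in this ticket: osd_destroy() must not run on an
               * object that is already being torn down.  A duplicate or
               * racing destroy (e.g. after nlink already dropped to 0, as
               * osd_ref_del() warned just before the crash) trips this
               * assertion and panics the MDS.
               */
              LASSERT(!lu_object_is_dying(dt->do_lu.lo_header));

              /* ... declare credits, remove the OI mapping, free the inode ... */
              return 0;
      }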
      
      

      Attachments

        Issue Links

          Activity

            [LU-10871] MDS hit LBUG: LustreError: 2566:0:(osd_handler.c:3304:osd_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed
            adilger Andreas Dilger added a comment - Hit in Janitor testing: https://testing.whamcloud.com/gerrit-janitor/39666/testresults/sanityn-ldiskfs-DNE-centos7_x86_64-centos7_x86_64/
            adilger Andreas Dilger made changes -
            Assignee Original: Hongchao Zhang [ hongchao.zhang ] New: WC Triage [ wc-triage ]
            Resolution Original: Cannot Reproduce [ 5 ]
            Status Original: Resolved [ 5 ] New: Reopened [ 4 ]
            adilger Andreas Dilger made changes -
            Resolution New: Cannot Reproduce [ 5 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            sarah Sarah Liu added a comment -

            hit this again on lustre-master-ib build#437

            jamesanunez James Nunez (Inactive) made changes -
            Remote Link New: This issue links to "Page (Whamcloud Community Wiki)" [ 24508 ]
            sarah Sarah Liu added a comment -

            Hit the problem again on the master branch (tag 2.13.52). The crash dump can be found on spirit at /scratch/dumps/soak-8.spirit.whamcloud.com/10.10.1.108-2020-02-15-11\:36\:21/

            [ 2575.647060] Lustre: Skipped 17 previous similar messages
            [ 2575.694765] LustreError: dumping log to /tmp/lustre-log.1581765979.5236
            [ 2576.220180] LNet: 4683:0:(o2iblnd_cb.c:3393:kiblnd_check_conns()) Timed out tx for 192.168.1.110@o2ib: 0 seconds
            [ 2576.231562] LNet: 4683:0:(o2iblnd_cb.c:3393:kiblnd_check_conns()) Skipped 6 previous similar messages
            [ 2576.241901] LNetError: 4683:0:(lib-msg.c:479:lnet_handle_local_failure()) ni 192.168.1.108@o2ib added to recovery queue. Health = 900
            [ 2606.654791] LustreError: 6202:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1581765710, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-soaked-MDT0000_UUID lock: ffff88f8c632d580/0xa21a133420fc92f4 lrc: 3/1,0 mode: --/PR res: [0x20000040e:0x3:0x0].0x0 bits 0x13/0x0 rrc: 13 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 6202 timeout: 0 lvb_type: 0
            [ 2606.698855] LustreError: 6202:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) Skipped 3 previous similar messages
            [ 2643.197397] LNetError: 4683:0:(lib-msg.c:479:lnet_handle_local_failure()) ni 192.168.1.108@o2ib added to recovery queue. Health = 900
            [ 2643.210833] LNetError: 4683:0:(lib-msg.c:479:lnet_handle_local_failure()) Skipped 1 previous similar message
            [ 2767.192219] Lustre: MGS: Connection restored to 5c9f140e-4423-4 (at 192.168.1.110@o2ib)
            [ 2767.201198] Lustre: Skipped 1 previous similar message
            [ 2769.137827] Lustre: soaked-MDT0000: Received new LWP connection from 192.168.1.110@o2ib, removing former export from same NID
            [ 2769.150490] Lustre: Skipped 1 previous similar message
            [ 2853.947102] Lustre: soaked-MDT0002-osp-MDT0000: Connection restored to 192.168.1.110@o2ib (at 192.168.1.110@o2ib)
            [ 2853.958590] Lustre: Skipped 2 previous similar messages
            [ 2870.647167] Lustre: 4940:0:(service.c:1440:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply  req@ffff88f8af2fda00 x1658449052245760/t0(0) o36->775f92d9-5437-4@192.168.1.127@o2ib:509/0 lens 528/2888 e 24 to 0 dl 1581766279 ref 2 fl Interpret:/0/0 rc 0/0 job:''
            [ 2871.318872] Lustre: 4858:0:(service.c:1440:ptlrpc_at_send_early_reply()) @@@ Could not add any time (5/5), not sending early reply  req@ffff88f4963b2d00 x1658449307111936/t0(0) o36->4d2dbe7c-0a77-4@192.168.1.126@o2ib:509/0 lens 528/2888 e 24 to 0 dl 1581766279 ref 2 fl Interpret:/0/0 rc 0/0 job:''
            [ 2871.348299] Lustre: 4858:0:(service.c:1440:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages
            [ 2877.572936] Lustre: soaked-MDT0000: Client 4d2dbe7c-0a77-4 (at 192.168.1.126@o2ib) reconnecting
            [ 3153.983138] Lustre: soaked-MDT0000: Received new LWP connection from 192.168.1.110@o2ib, removing former export from same NID
            [ 3153.995796] Lustre: Skipped 1 previous similar message
            [ 3153.996927] Lustre: soaked-MDT0000: Connection restored to soaked-MDT0001-mdtlov_UUID (at 192.168.1.109@o2ib)
            [ 3153.996931] Lustre: Skipped 4 previous similar messages
            [ 3154.093652] LustreError: 5720:0:(osd_handler.c:4169:osd_ref_del()) soaked-MDT0000: nlink == 0 on [0x200011d4e:0x9eaf:0x0], maybe an upgraded file? (LU-3915)
            [ 3154.109319] LustreError: 5720:0:(osd_handler.c:3592:osd_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed:
            [ 3154.122342] LustreError: 5720:0:(osd_handler.c:3592:osd_destroy()) LBUG
            [ 3154.129735] Pid: 5720, comm: mdt_out01_006 3.10.0-1062.9.1.el7_lustre.x86_64 #1 SMP Wed Feb 12 06:45:58 UTC 2020
            [ 3154.141104] Call Trace:
            [ 3154.143850]  [<ffffffffc0b0ffac>] libcfs_call_trace+0x8c/0xc0 [libcfs]
            [ 3154.151193]  [<ffffffffc0b1005c>] lbug_with_loc+0x4c/0xa0 [libcfs]
            [ 3154.158133]  [<ffffffffc1224f80>] osd_destroy+0x4a0/0x760 [osd_ldiskfs]
            [ 3154.165558]  [<ffffffffc0f92321>] out_obj_destroy+0x101/0x2c0 [ptlrpc]
            [ 3154.172976]  [<ffffffffc0f92590>] out_tx_destroy_exec+0x20/0x190 [ptlrpc]
            [ 3154.180640]  [<ffffffffc0f8c991>] out_tx_end+0xe1/0x5c0 [ptlrpc]
            [ 3154.187425]  [<ffffffffc0f90c52>] out_handle+0x1442/0x1bb0 [ptlrpc]
            [ 3154.194490]  [<ffffffffc0f8972a>] tgt_request_handle+0x95a/0x1610 [ptlrpc]
            [ 3154.202224]  [<ffffffffc0f2b0f6>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
            [ 3154.210846]  [<ffffffffc0f2f4f4>] ptlrpc_main+0xbb4/0x1550 [ptlrpc]
            [ 3154.217898]  [<ffffffffa58c61f1>] kthread+0xd1/0xe0
            [ 3154.223385]  [<ffffffffa5f8dd37>] ret_from_fork_nospec_end+0x0/0x39
            [ 3154.230407]  [<ffffffffffffffff>] 0xffffffffffffffff
            [ 3154.235988] Kernel panic - not syncing: LBUG
            [ 3154.240763] CPU: 30 PID: 5720 Comm: mdt_out01_006 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1062.9.1.el7_lustre.x86_64 #1
            [ 3154.254834] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
            [ 3154.267355] Call Trace:
            [ 3154.270096]  [<ffffffffa5f7ac23>] dump_stack+0x19/0x1b
            [ 3154.275834]  [<ffffffffa5f74967>] panic+0xe8/0x21f
            [ 3154.281187]  [<ffffffffc0b100ab>] lbug_with_loc+0x9b/0xa0 [libcfs]
            [ 3154.288092]  [<ffffffffc1224f80>] osd_destroy+0x4a0/0x760 [osd_ldiskfs]
            [ 3154.295481]  [<ffffffffa5f80caa>] ? _cond_resched+0x3a/0x50
            [ 3154.301699]  [<ffffffffa5f7fd42>] ? down_write+0x12/0x3d
            [ 3154.307674]  [<ffffffffc0f92321>] out_obj_destroy+0x101/0x2c0 [ptlrpc]
            [ 3154.314999]  [<ffffffffc0f92590>] out_tx_destroy_exec+0x20/0x190 [ptlrpc]
            [ 3154.322612]  [<ffffffffc0f8c991>] out_tx_end+0xe1/0x5c0 [ptlrpc]
            [ 3154.329353]  [<ffffffffc0f90c52>] out_handle+0x1442/0x1bb0 [ptlrpc]
            [ 3154.336372]  [<ffffffffa596ad65>] ? tracing_is_on+0x15/0x30
            [ 3154.342627]  [<ffffffffc0f8972a>] tgt_request_handle+0x95a/0x1610 [ptlrpc]
            [ 3154.350305]  [<ffffffffc0af700e>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
            [ 3154.358212]  [<ffffffffc0f2b0f6>] ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
            [ 3154.366791]  [<ffffffffc0f2787b>] ? ptlrpc_wait_event+0x12b/0x4f0 [ptlrpc]
            [ 3154.374495]  [<ffffffffa58d3360>] ? task_rq_unlock+0x20/0x20
            [ 3154.380825]  [<ffffffffa58d3903>] ? __wake_up+0x13/0x20
            [ 3154.386689]  [<ffffffffc0f2f4f4>] ptlrpc_main+0xbb4/0x1550 [ptlrpc]
            [ 3154.393738]  [<ffffffffc0f2e940>] ? ptlrpc_register_service+0xf90/0xf90 [ptlrpc]
            [ 3154.401985]  [<ffffffffa58c61f1>] kthread+0xd1/0xe0
            [ 3154.407429]  [<ffffffffa58c6120>] ? insert_kthread_work+0x40/0x40
            [ 3154.414231]  [<ffffffffa5f8dd37>] ret_from_fork_nospec_begin+0x21/0x21
            [ 3154.421517]  [<ffffffffa58c6120>] ? insert_kthread_work+0x40/0x40
            [    0.000000] Initializing cgroup subsys cpuset
            [    0.000000] Initializing cgroup subsys cpu
            
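            For reference, the assertion firing in both traces is the negation of lu_object_is_dying(), which tests whether the object header has already been flagged for teardown. Below is a minimal sketch of that predicate, assuming the upstream lu_object_header layout from lustre/include/lu_object.h; field and flag names may differ between the Lustre versions reported here:

            /*
             * Sketch only; field and flag names follow the upstream
             * lustre/include/lu_object.h definitions but are not verified
             * against the exact builds reported in this ticket.
             */
            enum lu_object_header_flags {
                    LU_OBJECT_HEARD_BANSHEE = 0,    /* object scheduled for removal from cache */
            };

            struct lu_object_header {
                    unsigned long loh_flags;        /* bit field holding the flags above */
                    /* ... reference counts, hash linkage, list of layer objects ... */
            };

            /*
             * osd_destroy() asserts the negation of this predicate: destroying
             * an object whose header is already "dying" indicates a duplicate
             * or racing destroy, which matches the traces above where
             * osd_ref_del() had just found nlink == 0.
             */
            static inline int lu_object_is_dying(const struct lu_object_header *h)
            {
                    return test_bit(LU_OBJECT_HEARD_BANSHEE, &h->loh_flags);
            }
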
            sarah Sarah Liu made changes -
            Affects Version/s New: Lustre 2.14.0 [ 14490 ]

            hongchao.zhang Hongchao Zhang added a comment -

            Hi Cliff,
            Thanks, but I can't log in to spirit anymore (it was okay previously). I have reopened DCO-5820 to request access to spirit.

            cliffw Cliff White (Inactive) added a comment -

            The dump is on spirit: spirit.hpdd.intel.com

            hongchao.zhang Hongchao Zhang added a comment -

            Hi Cliff,

            Is the core dump under /scratch/dumps on onyx.hpdd.intel.com? I can't find it in that directory.
            Thanks!

            People

              Assignee: wc-triage WC Triage
              Reporter: sarah Sarah Liu
              Votes: 0
              Watchers: 7
