
LU-11232: replay-ost-single test_0b: BUG: unable to handle kernel NULL pointer dereference

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.10.5
    • Components: None
    • Severity: 3

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/7c77283e-9882-11e8-b0aa-52540065bddc

      test_0b failed with the following error:

      trevis-9vm10 crashed during replay-ost-single test_0b
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      replay-ost-single test_0b - trevis-9vm10 crashed during replay-ost-single test_0b

      Attachments

      Issue Links

      Activity

            [LU-11232] replay-ost-single test_0b: BUG: unable to handle kernel NULL pointer dereference
            Andreas Dilger (adilger) added a comment - edited

            There is a second issue happening here. The LFSCK check at MDT mount is detecting that the MDT was restored from backup and then running a full scrub on the filesystem.

            osd_scrub_setup()) dm-3: UUID has been changed from 664b5c1b-7333-88c0-43f0-1645592e9480 to 470cd461-31b1-a8d5-48b2-c736bed6d0db
            
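            As a side note for triage: the OI scrub state on an ldiskfs target is visible through lctl. A minimal sketch of checking it, assuming an ldiskfs MDT; the target name lustre-MDT0000 is illustrative, not taken from these logs:

            # Hypothetical status check; substitute the real target name.
            # The oi_scrub file reports the scrub's status and progress.
            lctl get_param osd-ldiskfs.lustre-MDT0000.oi_scrub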

            This causes the MDT to be remounted read-only because a deleted inode was referenced:

            [  313.551387] LDISKFS-fs error (device dm-3): ldiskfs_lookup:1968: inode #40972: comm OI_scrub: deleted inode referenced: 65549
            [  313.567933] Aborting journal on device dm-3-8.
            [  313.933723] LDISKFS-fs (dm-3): Remounting filesystem read-only
            
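            Since the target is mounted with errors=remount-ro, ldiskfs also records the error in the superblock, so it can be inspected after the node is back up. A sketch, assuming the device name from the console above and a reasonably recent e2fsprogs:

            # Dump the superblock header and pick out the recorded error
            # fields (error count, first/last error time and function).
            dumpe2fs -h /dev/dm-3 | grep -i error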

            It isn't clear whether this is a race between OI Scrub and another thread deleting a file that OI Scrub is accessing, or whether something is looking up a stale inode via the OI and unlinking it while OI Scrub is processing it. This might happen because LFSCK runs without full locking on the files, to avoid blocking other MDS threads that are accessing the filesystem.

            If this is happening regularly, it would be useful to change this error message to include the parent name/inode so that we can see what kind of file it is (Lustre-internal or part of the namespace). Failing that, running "debugfs -c -R 'ncheck <inode_number>' /dev/mdtdev" on the MDT filesystem after the failure would report the pathname, so long as we have access to the device before it is reformatted (maybe as part of the test script).
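            For example, with the inode numbers from the console message above (40972 is the directory being searched, 65549 the deleted inode that its entry references), a sketch assuming the device path from the log:

            # -c opens the device read-only in "catastrophic" mode, -R runs
            # a single command; ncheck maps inode numbers back to pathnames.
            debugfs -c -R 'ncheck 40972 65549' /dev/dm-3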

            Andreas Dilger (adilger) added a comment - edited

            It seems that this obd_zombie_impexp_cull() issue may be a duplicate of LU-9806, LU-10806, and LU-10573, which all appear to be related to problems with the zombie export list pointing to an export that is no longer present. This issue could be tracked under LU-10806 so that this ticket can address the OI Scrub problem.
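            For what it's worth, the faulting address in the trace below (CR2: 0000000000000378) fits that reading: a lock embedded in a structure is taken through a base pointer that is NULL, so the fault lands exactly at the field's offset. A minimal C sketch of the pattern, with hypothetical struct and field names rather than the actual Lustre types:

            /* Hypothetical illustration only: a NULL base pointer plus an
             * embedded-field access faults at a small address equal to the
             * field's offset, matching CR2 = 0x378 in the oops below. */
            #include <stddef.h>
            #include <stdio.h>

            struct fake_device {
                char pad[0x378];             /* fields preceding the lock */
                int  grant_lock;             /* stand-in for a spinlock_t */
            };

            struct fake_export {
                struct fake_device *exp_dev; /* NULL once the device is gone */
            };

            int main(void)
            {
                /* spin_lock(&exp->exp_dev->grant_lock) with exp_dev == NULL
                 * would touch address offsetof(fake_device, grant_lock): */
                printf("fault address would be %#zx\n",
                       offsetof(struct fake_device, grant_lock));
                return 0;
            }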

            Sarah Liu (sarah) added a comment - edited

            Server/client: SLES12 SP3, Lustre 2.10.5 RC1

            OST dmesg

            [  270.466705] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_OSS/P4
            [  270.504171] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P4
            [  270.539602] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost4; mount -t lustre /dev/lvm-Role_OSS/P4 /mnt/lustre-ost4
            [  270.580165] LDISKFS-fs warning (device dm-3): ldiskfs_multi_mount_protect:328: MMP interval 42 higher than expected, please wait.
                           
            [  273.124433] Lustre: lustre-OST0001: Denying connection for new client de942e5a-e67a-c940-aec3-3a8bef12e248(at 10.9.4.98@tcp), waiting for 4 known clients (1 recovered, 0 in progress, and 0 evicted) to recover in 0:20
            [  273.124438] Lustre: Skipped 21 previous similar messages
            [  275.002105] Lustre: lustre-OST0002: Will be in recovery for at least 1:00, or until 4 clients reconnect
            [  275.002167] Lustre: lustre-OST0002: Connection restored to 10.9.4.104@tcp (at 10.9.4.104@tcp)
            [  279.319191] Lustre: lustre-OST0002: Connection restored to 5252abe8-f295-e986-07d7-b18acab9614a (at 10.9.4.101@tcp)
            [  279.319198] Lustre: Skipped 1 previous similar message
            [  288.125028] LustreError: 137-5: lustre-OST0003_UUID: not available for connect from 10.9.4.98@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
            [  288.125034] LustreError: Skipped 75 previous similar messages
            [  293.887723] Lustre: lustre-OST0001: recovery is timed out, evict stale exports
            [  293.887738] Lustre: lustre-OST0001: disconnecting 3 stale clients
            [  293.939466] Lustre: lustre-OST0001: Recovery over after 1:06, of 4 clients 1 recovered and 3 were evicted.
            [  294.285458] Lustre: lustre-OST0001: Connection restored to cdaa6359-d98e-8831-5ce1-1b42a13d2b24 (at 10.9.4.100@tcp)
            [  313.518413] LDISKFS-fs (dm-3): recovery complete
            [  313.530904] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
            [  313.545257] LustreError: 3564:0:(osd_scrub.c:2784:osd_scrub_setup()) dm-3: UUID has been changed from 664b5c1b-7333-88c0-43f0-1645592e9480 to 470cd461-31b1-a8d5-48b2-c736bed6d0db
            [  313.551387] LDISKFS-fs error (device dm-3): ldiskfs_lookup:1968: inode #40972: comm OI_scrub: deleted inode referenced: 65549
            [  313.567933] Aborting journal on device dm-3-8.
            [  313.933723] LDISKFS-fs (dm-3): Remounting filesystem read-only
            [  313.946278] LustreError: 3564:0:(llog.c:1313:llog_backup()) MGC10.9.4.104@tcp: failed to open backup logfile lustre-OST0003T: rc = -30
            [  313.946284] LustreError: 3564:0:(mgc_request.c:1897:mgc_llog_local_copy()) MGC10.9.4.104@tcp: failed to copy remote log lustre-OST0003: rc = -30
            [  313.952302] LustreError: 3579:0:(tgt_lastrcvd.c:1122:tgt_client_del()) lustre-OST0003: failed to update server data, skip client 6155c338-c467-e54e-4cd8-161d004528a0 zeroing, rc -30
            [  313.954793] BUG: unable to handle kernel NULL pointer dereference at 0000000000000378
            [  313.956197] IP: [<ffffffff81618ffc>] _raw_spin_lock+0xc/0x20
            [  313.957212] PGD 800000007a65c067 PUD 373f7067 PMD 0 
            [  313.958270] Oops: 0002 [#1] SMP 
            [  313.959012] Modules linked in:
            [  313.959459] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff88007be93040[0x0, 1, [0x1:0x0:0x0] hash exist]{
            [  313.959463] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff88007be93090
            [  313.959468] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-ldiskfs@ffff88007c44d500osd-ldiskfs-object@ffff88007c44d500(i:ffff88006867e750:81/3077371105)[plain]
            [  313.959471] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff88007be93040
            [  313.959554] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8800657b17c0[0x0, 1, [0x200000003:0x0:0x0] hash exist]{
            [  313.959557] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8800657b1810
            [  313.959561] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-ldiskfs@ffff8800374b9d00osd-ldiskfs-object@ffff8800374b9d00(i:ffff880068578bd8:80/615556883)[plain]
            [  313.959564] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8800657b17c0
            [  313.959845] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8800655af040[0x0, 1, [0xa:0x0:0x0] hash exist]{
            [  313.959848] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8800655af090
            [  313.959852] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-ldiskfs@ffff8800374b9800osd-ldiskfs-object@ffff8800374b9800(i:ffff88006868b2c8:82/3077371139)[plain]
            [  313.959854] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8800655af040
            [  313.959920] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff8800655af580[0x0, 1, [0xa:0x2:0x0] hash exist]{
            [  313.959923] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff8800655af5d0
            [  313.959929] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-ldiskfs@ffff8800370e3000osd-ldiskfs-object@ffff8800370e3000(i:ffff88006865ab58:83/3077371140)[plain]
            [  313.959931] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff8800655af580
            [  313.960365] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) header@ffff880065b4edc0[0x0, 1, [0x200000001:0x1017:0x0] hash exist]{
            [  313.960368] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....local_storage@ffff880065b4ee10
            [  313.960373] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) ....osd-ldiskfs@ffff88007b667800osd-ldiskfs-object@ffff88007b667800(i:ffff880068656208:12/3002225808)[plain]
            [  313.960375] LustreError: 3579:0:(ofd_dev.c:251:ofd_stack_fini()) } header@ffff880065b4edc0
            [  313.967885] LustreError: 3579:0:(obd_config.c:558:class_setup()) setup lustre-OST0003 failed (-30)
            [  313.967889] LustreError: 3579:0:(obd_config.c:1682:class_config_llog_handler()) MGC10.9.4.104@tcp: cfg command failed: rc = -30
            [  313.967893] Lustre:    cmd=cf003 0:lustre-OST0003  1:dev  2:0  3:f
            [  313.967952] LustreError: 15c-8: MGC10.9.4.104@tcp: The configuration from log 'lustre-OST0003' failed (-30). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
            [  313.967954] LustreError: 3564:0:(obd_mount_server.c:1386:server_start_targets()) failed to start server lustre-OST0003: -30
            [  313.967990] LustreError: 3564:0:(obd_mount_server.c:1879:server_fill_super()) Unable to start targets: -30
            [  313.968004] LustreError: 3564:0:(obd_config.c:609:class_cleanup()) Device 12 not setup
            [  313.994053]  osp(OEN)
            [  313.994392]  ofd(OEN) lfsck(OEN) ost(OEN) mgc(OEN) osd_ldiskfs(OEN) ldiskfs(OEN) lquota(OEN) fid(OEN) fld(OEN) ksocklnd(OEN) ptlrpc(OEN) obdclass(OEN) lnet(OEN) libcfs(OEN) sd_mod(E) iscsi_tcp(E) libiscsi_tcp(E) rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) fscache(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) rpcrdma(E) sunrpc(E) ib_isert(E) iscsi_target_mod(E) ib_iser(E) libiscsi(E) scsi_transport_iscsi(E) ib_srpt(E) target_core_mod(E) ib_srp(E) scsi_transport_srp(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) configfs(E) ib_cm(E) iw_cm(E) ib_core(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) jitterentropy_rng(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E)
            [  314.004777]  ablk_helper(E) cryptd(E) 8139too(E) 8139cp(E) pcspkr(E) joydev(E) virtio_balloon(E) mii(E) i2c_piix4(E) processor(E) button(E) ext4(E) crc16(E) jbd2(E) mbcache(E) ata_generic(E) virtio_blk(E) ata_piix(E) floppy(E) serio_raw(E) ahci(E) libahci(E) libata(E) virtio_pci(E) virtio_ring(E) virtio(E) uhci_hcd(E) ehci_hcd(E) usbcore(E) usb_common(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) autofs4(E)
            [  314.010930] Supported: No, Unsupported modules are loaded
            [  314.011495] CPU: 1 PID: 2363 Comm: obd_zombid Tainted: G           OE   N  4.4.132-94.33_lustre-default #1
            [  314.012447] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
            [  314.013041] task: ffff88007a724680 ti: ffff88007be74000 task.ti: ffff88007be74000
            [  314.013797] RIP: 0010:[<ffffffff81618ffc>]  [<ffffffff81618ffc>] _raw_spin_lock+0xc/0x20
            [  314.014688] RSP: 0018:ffff88007be77dd0  EFLAGS: 00010246
            [  314.015243] RAX: 0000000000000000 RBX: ffff880066b09c00 RCX: ffffea00018c57e0
            [  314.015964] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000378
            [  314.016692] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
            [  314.017466] R10: 0000000000000000 R11: ffff88006317dbaf R12: 0000000000000378
            [  314.018196] R13: ffff8800650f3008 R14: ffffffffa08b93a4 R15: 0000000000000000
            [  314.018916] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
            [  314.019721] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            [  314.020323] CR2: 0000000000000378 CR3: 000000007c61c000 CR4: 0000000000060670
            [  314.021062] Stack:
            [  314.021319]  ffffffffa0b59f5f ffffffffa0cdfab6 ffff880066b09c00 ffff880066b09c00
            [  314.022326]  ffff880066b09c54 ffff880065028000 ffff880066b09c00 ffffffffa0ccb56a
            [  314.023331]  ffff880066b09c00 ffff8800650f3008 ffff88007a724680 ffffffffa0817309
            [  314.024338] Call Trace:
            [  314.024777]  [<ffffffffa0b59f5f>] tgt_grant_discard+0x2f/0x180 [ptlrpc]
            [  314.025483]  [<ffffffffa0ccb56a>] ofd_destroy_export+0x6a/0x160 [ofd]
            [  314.026185]  [<ffffffffa0817309>] class_export_destroy+0xe9/0x470 [obdclass]
            [  314.026928]  [<ffffffffa081fe9b>] obd_zombie_impexp_cull+0x37b/0x530 [obdclass]
            [  314.027688]  [<ffffffffa08200a0>] obd_zombie_impexp_thread+0x50/0x190 [obdclass]
            [  314.028463]  [<ffffffff8109eb89>] kthread+0xc9/0xe0
            [  314.029000]  [<ffffffff81619805>] ret_from_fork+0x55/0x80
            [  314.032089] DWARF2 unwinder stuck at ret_from_fork+0x55/0x80
            [  314.032690] 
            [  314.032902] Leftover inexact backtrace:
                           
            [  314.033465]  [<ffffffff8109eac0>] ? kthread_park+0x50/0x50
            [  314.034036] Code: 66 66 66 90 8b 07 85 c0 74 03 31 c0 c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 f0 b0 01 c3 66 90 66 66 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 02 f3 c3 89 c6 e9 4e 85 b7 ff 0f 1f 00 66 
            [  314.040103] RIP  [<ffffffff81618ffc>] _raw_spin_lock+0xc/0x20
            [  314.040775]  RSP <ffff88007be77dd0>
            [  314.041161] CR2: 0000000000000378
            

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Maloo (maloo)
              Votes: 0
              Watchers: 3
