Lustre / LU-12881

PPC client: replay-single test_88: OSS crash


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.12.3
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/ff7eba46-eb0e-11e9-b62b-52540065bddc

      test_88 failed with the following error:

      trevis-55vm1 crashed during replay-single test_88
      

      OSS console log:

      [27261.744883] Lustre: DEBUG MARKER: trevis-55vm1.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
      [27261.944250] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      [27262.327156] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      [27262.751745] Lustre: DEBUG MARKER: e2label /dev/mapper/ost1_flakey 2>/dev/null
      [27274.129532] Lustre: lustre-OST0000: Client 4cb36394-c82c-73c3-1556-513200b2dec9 (at 10.9.0.82@tcp) reconnected, waiting for 3 clients in recovery for 0:47
      [27281.129518] Lustre: lustre-OST0000: Client 4cb36394-c82c-73c3-1556-513200b2dec9 (at 10.9.0.82@tcp) reconnected, waiting for 3 clients in recovery for 0:40
      [27281.132018] Lustre: lustre-OST0000: Connection restored to 7869087e-c8f3-7e36-bfea-1a6a80815089 (at 10.9.0.82@tcp)
      [27281.133764] Lustre: Skipped 258 previous similar messages
      [27288.129381] Lustre: lustre-OST0000: Client 4cb36394-c82c-73c3-1556-513200b2dec9 (at 10.9.0.82@tcp) reconnected, waiting for 3 clients in recovery for 0:33
      [27295.129230] Lustre: lustre-OST0000: Client 4cb36394-c82c-73c3-1556-513200b2dec9 (at 10.9.0.82@tcp) reconnected, waiting for 3 clients in recovery for 0:26
      [27302.129006] Lustre: lustre-OST0000: Client 4cb36394-c82c-73c3-1556-513200b2dec9 (at 10.9.0.82@tcp) reconnected, waiting for 3 clients in recovery for 0:19
      [27316.128667] Lustre: lustre-OST0000: Client 4cb36394-c82c-73c3-1556-513200b2dec9 (at 10.9.0.82@tcp) reconnected, waiting for 3 clients in recovery for 0:05
      [27316.131145] Lustre: Skipped 1 previous similar message
      [27321.904160] Lustre: lustre-OST0000: recovery is timed out, evict stale exports
      [27321.905603] Lustre: lustre-OST0000: disconnecting 1 stale clients
      [27322.321807] Lustre: lustre-OST0000: Recovery over after 1:00, of 3 clients 2 recovered and 1 was evicted.
      [27323.285845] general protection fault: 0000 [#1] SMP 
      [27323.286917] Modules linked in: loop dm_flakey lustre(OE) obdecho(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) ptlrpc_gss(OE) osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc dm_mod iosf_mbi crc32_pclmul ppdev ghash_clmulni_intel aesni_intel lrw parport_pc gf128mul glue_helper ablk_helper cryptd joydev pcspkr virtio_balloon i2c_piix4 parport ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk
      [27323.301087]  8139too ata_piix crct10dif_pclmul crct10dif_common libata crc32c_intel serio_raw 8139cp virtio_pci virtio_ring virtio mii floppy [last unloaded: dm_flakey]
      [27323.303822] CPU: 1 PID: 24518 Comm: 00tso_ll Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.x86_64 #1
      [27323.305811] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [27323.306759] task: ffff92b9099dc100 ti: ffff92b8fb228000 task.ti: ffff92b8fb228000
      [27323.307985] RIP: 0010:[<ffffffff828db6e3>]  [<ffffffff828db6e3>] account_system_time+0x73/0x180
      [27323.309499] RSP: 0018:ffff92b93fd03e48  EFLAGS: 00010002
      [27323.310380] RAX: 0000000000000000 RBX: ffff92b9099dc100 RCX: 00000000000f4240
      [27323.311553] RDX: 00000000000f4240 RSI: 0000000000010000 RDI: 0000000000010000
      [27323.312755] RBP: ffff92b93fd03e70 R08: ffff92b93fd1ab80 R09: 0000000000006abc
      [27323.313928] R10: 000000003b9aca00 R11: ffff92b90742c100 R12: 0000000000000002
      [27323.315098] R13: 00c94907b992ffff R14: 0000000000000000 R15: 0000000000000000
      [27323.316279] FS:  0000000000000000(0000) GS:ffff92b93fd00000(0000) knlGS:0000000000000000
      [27323.317603] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [27323.318556] CR2: 00007f7bd0665000 CR3: 000000007ada8000 CR4: 00000000000606e0
      [27323.319734] Call Trace:
      [27323.320177]  <IRQ> 
      [27323.320522]  [<ffffffff828db9a1>] account_process_tick+0x61/0x170
      [27323.321624]  [<ffffffff8290c160>] ? tick_sched_do_timer+0x50/0x50
      [27323.322643]  [<ffffffff828ac36c>] update_process_times+0x2c/0x80
      [27323.323643]  [<ffffffff8290bed0>] tick_sched_handle+0x30/0x70
      [27323.324594]  [<ffffffff8290c199>] tick_sched_timer+0x39/0x80
      [27323.325538]  [<ffffffff828c71e3>] __hrtimer_run_queues+0xf3/0x270
      [27323.326549]  [<ffffffff828c776f>] hrtimer_interrupt+0xaf/0x1d0
      [27323.327532]  [<ffffffff8285a61b>] local_apic_timer_interrupt+0x3b/0x60
      [27323.328635]  [<ffffffff82f7c6e3>] smp_apic_timer_interrupt+0x43/0x60
      [27323.329699]  [<ffffffff82f78df2>] apic_timer_interrupt+0x162/0x170
      [27323.330720]  <EOI> 
      [27323.331104]  [<ffffffffc0d7aae9>] ? ksocknal_tx_prep+0x39/0x50 [ksocklnd]
      [27323.332291]  [<ffffffffc0d7aad2>] ? ksocknal_tx_prep+0x22/0x50 [ksocklnd]
      [27323.333418]  [<ffffffffc0d7ab40>] ksocknal_queue_tx_locked+0x40/0x4e0 [ksocklnd]
      [27323.334661]  [<ffffffffc0d7b30e>] ksocknal_launch_packet+0x10e/0x430 [ksocklnd]
      [27323.335868]  [<ffffffffc0d7c0e8>] ksocknal_send+0x188/0x470 [ksocklnd]
      [27323.337035]  [<ffffffffc0a76594>] lnet_ni_send+0x44/0xd0 [lnet]
      [27323.338037]  [<ffffffffc0a7dd32>] lnet_send+0x82/0x1c0 [lnet]
      [27323.339004]  [<ffffffffc0a7e13c>] LNetPut+0x2cc/0xb50 [lnet]
      [27323.340281]  [<ffffffffc0e0fb56>] ptl_send_buf+0x146/0x530 [ptlrpc]
      [27323.341364]  [<ffffffffc0e12f0b>] ptlrpc_send_reply+0x29b/0x840 [ptlrpc]
      [27323.342504]  [<ffffffffc0dd169e>] target_send_reply_msg+0x8e/0x170 [ptlrpc]
      [27323.343701]  [<ffffffffc0ddbbde>] target_send_reply+0x30e/0x730 [ptlrpc]
      [27323.344842]  [<ffffffffc0e1a177>] ? lustre_msg_set_last_committed+0x27/0xa0 [ptlrpc]
      [27323.346247]  [<ffffffffc0e7ef17>] tgt_request_handle+0x697/0x1580 [ptlrpc]
      [27323.347456]  [<ffffffffc0918f97>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      [27323.348606]  [<ffffffffc0e2624b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
      [27323.349882]  [<ffffffff828cfeb4>] ? __wake_up+0x44/0x50
      [27323.350786]  [<ffffffffc0e29bac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
      [27323.351829]  [<ffffffff828d1ad0>] ? finish_task_switch+0x50/0x1c0
      [27323.352867]  [<ffffffffc0e29080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
      [27323.354096]  [<ffffffff828c2e81>] kthread+0xd1/0xe0
      [27323.354910]  [<ffffffff828c2db0>] ? insert_kthread_work+0x40/0x40
      [27323.355924]  [<ffffffff82f77c37>] ret_from_fork_nospec_begin+0x21/0x21
      [27323.357000]  [<ffffffff828c2db0>] ? insert_kthread_work+0x40/0x40
      [27323.358009] Code: 41 bc 04 00 00 00 89 c7 81 e7 00 00 ff 03 39 fe 0f 84 82 00 00 00 4c 8b ab 68 07 00 00 48 01 93 98 05 00 00 48 01 8b a8 05 00 00 <41> 8b 85 40 01 00 00 85 c0 74 28 4d 8d b5 44 01 00 00 48 89 55 
      [27323.363280] RIP  [<ffffffff828db6e3>] account_system_time+0x73/0x180
      [27323.364369]  RSP <ffff92b93fd03e48>
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      replay-single test_88 - trevis-55vm1 crashed during replay-single test_88


    People

      Assignee: WC Triage (wc-triage)
      Reporter: Maloo (maloo)
      Votes: 0
      Watchers: 1
