LU-14988: crash in ll_migrate in racer

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.15.0
    • Labels: None
    • Severity: 3

    Description

      The patch https://review.whamcloud.com/#/c/43964/ seems to cause a crash in ll_migrate, similar to the one reported in the now-landed LU-13157.

      It is very reproducible on recent master-next.

       
      [  628.058780] LustreError: 25573:0:(ldlm_resource.c:1124:ldlm_resource_complain()) Skipped 23 previous similar messages
      [  628.489874] LustreError: 14648:0:(osp_sync.c:1094:osp_sync_process_committed()) lustre-OST0003-osc-MDT0001: can't cancel 279 records: rc = -30
      [  628.599298] LustreError: 14648:0:(osp_sync.c:1094:osp_sync_process_committed()) Skipped 15 previous similar messages
      [  628.701810] Lustre: lustre-OST0000-osc-ffff88029a0e8008: Connection restored to 192.168.123.100@tcp (at 0@lo)
      [  629.004000] LustreError: 14050:0:(osp_sync.c:1079:osp_sync_process_committed()) lustre-OST0001-osc-MDT0001: can't cancel record: rc = -30
      [  629.035251] LustreError: 14050:0:(osp_sync.c:1079:osp_sync_process_committed()) Skipped 5 previous similar messages
      [  630.287812] LustreError: 25548:0:(llite_lib.c:1836:ll_md_setattr()) md_setattr fails: rc = -30
      [  630.894353] Lustre: mdt07_002: service thread pid 11113 was inactive for 66.027 seconds. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
      [  630.930762] Lustre: Skipped 6 previous similar messages
      [  631.237306] LustreError: 11-0: lustre-MDT0001-mdc-ffff88009e7fb7e8: operation ldlm_enqueue to node 0@lo failed: rc = -30
      [  631.396604] LustreError: 25697:0:(llite_lib.c:1836:ll_md_setattr()) md_setattr fails: rc = -30
      [  631.768361] LustreError: 25697:0:(llite_lib.c:1836:ll_md_setattr()) Skipped 1 previous similar message
      [  632.634314] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [  632.658290] IP: [<ffffffffa1051e52>] ll_migrate+0x9b2/0xec0 [lustre]
      [  632.660785] PGD 800000028b3a6067 PUD 28b3a7067 PMD 0 
      [  632.662892] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  632.665067] Modules linked in: loop zfs(PO) zunicode(PO) zzstd(O) zlua(O) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) jbd2 mbcache lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) dm_flakey dm_mod libcfs(OE) crc_t10dif crct10dif_generic sb_edac edac_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd virtio_balloon virtio_console pcspkr i2c_piix4 ip_tables rpcsec_gss_krb5 drm_kms_helper ttm ata_generic pata_acpi drm crct10dif_pclmul crct10dif_common crc32c_intel drm_panel_orientation_quirks ata_piix serio_raw virtio_blk i2c_core libata floppy
      [  632.698829] CPU: 6 PID: 25076 Comm: lfs Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-7.9-debug #2
      [  632.701212] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  632.702454] task: ffff8802912324f0 ti: ffff88024ad2c000 task.ti: ffff88024ad2c000
      [  632.705158] RIP: 0010:[<ffffffffa1051e52>]  [<ffffffffa1051e52>] ll_migrate+0x9b2/0xec0 [lustre]
      [  632.708792] RSP: 0018:ffff88024ad2fbc8  EFLAGS: 00010206
      [  632.710770] RAX: 0000000000000000 RBX: ffff880253c01458 RCX: 0000000000000000
      [  632.713070] RDX: 0000000000000000 RSI: ffff880327331138 RDI: ffff880327331118
      [  632.715401] RBP: ffff88024ad2fc48 R08: ffff8802668e2058 R09: 0000000000000001
      [  632.717769] R10: 0000000000000000 R11: ffff88024ad2f5e6 R12: 0000000000000000
      [  632.720049] R13: ffff8800848d48e8 R14: ffff88029b7166d8 R15: 0000000000000030
      [  632.722352] FS:  00007fa279316740(0000) GS:ffff880331b80000(0000) knlGS:0000000000000000
      [  632.781764] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  632.784011] CR2: 0000000000000008 CR3: 0000000266662000 CR4: 00000000001607e0
      [  632.786316] Call Trace:
      [  632.788315]  [<ffffffff81242213>] ? __check_object_size+0x1c3/0x220
      [  632.790913]  [<ffffffffa103f561>] ll_dir_ioctl+0x5d01/0x6ed0 [lustre]
      [  632.793261]  [<ffffffff81411979>] ? do_raw_spin_unlock+0x49/0x90
      [  632.795476]  [<ffffffff8115260f>] ? delayacct_end+0x8f/0xb0
      [  632.817827]  [<ffffffff81152744>] ? __delayacct_blkio_end+0x34/0x60
      [  632.820259]  [<ffffffff817e0257>] ? io_schedule_timeout+0xe7/0x130
      [  632.822620]  [<ffffffff811b62dd>] ? find_get_pages_tag+0x10d/0x260
      [  632.824759]  [<ffffffff811c3691>] ? pagevec_lookup_tag+0x21/0x30
      [  632.827037]  [<ffffffff811b400e>] ? __filemap_fdatawait_range+0xbe/0x1a0
      [  632.830403]  [<ffffffff8125b3fd>] do_vfs_ioctl+0x40d/0x6c0
      [  632.833084]  [<ffffffff81264d2b>] ? iput+0x3b/0x180
      [  632.835354]  [<ffffffff8125b751>] SyS_ioctl+0xa1/0xc0
      [  632.838391]  [<ffffffff817ee00c>] system_call_fastpath+0x1f/0x24
      [  632.840775] Code: 03 49 89 45 50 48 8b 44 24 38 49 89 85 38 01 00 00 48 8b 43 20 41 81 8d 30 01 00 00 00 20 00 00 49 89 85 40 01 00 00 48 8b 43 18 <48> 8b 78 08 48 83 c7 18 e8 b1 13 79 e0 48 8b 43 18 48 8b 40 08
      [  632.884740] RIP  [<ffffffffa1051e52>] ll_migrate+0x9b2/0xec0 [lustre]
      [  632.891978]  RSP <ffff88024ad2fbc8>
      [  632.912110] CR2: 0000000000000008
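
      Triage note on the trace above: the byte marked in the Code: line, `<48> 8b 78 08`, decodes to `mov 0x8(%rax),%rdi`, and the instruction just before it, `48 8b 43 18`, is `mov 0x18(%rbx),%rax`. With RAX = 0 in the register dump, this matches the fault address CR2 = 0x8: a pointer loaded from offset 0x18 of the object in RBX was NULL and was then dereferenced at offset 8. Which structure and field this corresponds to in ll_migrate is not determined here. The following is a minimal sketch, assuming the oops format shown in this report, for pulling those fields out of a saved console log; `parse_oops` is a hypothetical helper name, not part of any Lustre or kernel tool.

```python
import re

def parse_oops(text):
    """Extract faulting symbol, offset, CR2 and the marked opcode bytes from oops text."""
    info = {}
    # e.g. "RIP  [<ffffffffa1051e52>] ll_migrate+0x9b2/0xec0 [lustre]"
    m = re.search(
        r"RIP\s+\[<([0-9a-f]+)>\]\s+(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)(?:\s+\[(\w+)\])?",
        text,
    )
    if m:
        info["rip"] = int(m.group(1), 16)
        info["symbol"] = m.group(2)
        info["offset"] = int(m.group(3), 16)
        info["size"] = int(m.group(4), 16)
        info["module"] = m.group(5)
    # Faulting virtual address
    m = re.search(r"CR2:\s+([0-9a-f]+)", text)
    if m:
        info["cr2"] = int(m.group(1), 16)
    # The byte wrapped in <..> on the Code: line is the first byte at RIP
    m = re.search(r"Code:\s+((?:<?[0-9a-f]{2}>?\s*)+)", text)
    if m:
        tokens = m.group(1).split()
        idx = next((i for i, t in enumerate(tokens) if t.startswith("<")), None)
        if idx is not None:
            # Assumption: take four bytes for display only; x86 instruction
            # length varies, so use a real disassembler for anything serious.
            info["fault_bytes"] = [t.strip("<>") for t in tokens[idx:idx + 4]]
    return info

# Shortened excerpt of the log in this report
sample = """\
[  632.840775] Code: 48 8b 43 18 <48> 8b 78 08 48 83 c7 18
[  632.884740] RIP  [<ffffffffa1051e52>] ll_migrate+0x9b2/0xec0 [lustre]
[  632.912110] CR2: 0000000000000008
"""
info = parse_oops(sample)
print(info["symbol"], hex(info["offset"]), hex(info["cr2"]), info["fault_bytes"])
```

      The symbol and offset (ll_migrate+0x9b2) can then be resolved to a source line with the debug build of the lustre module, for example with gdb's `list *(ll_migrate+0x9b2)`.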

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Oleg Drokin (green)