Details

    • Bug
    • Resolution: Unresolved
    • Blocker
    • RHEL 8.4 debug kernel.
      Does not reproduce with ONLY=38, but reproduces easily with a full test suite run.
    • 3

    Description

      [ 8554.748991] Lustre: DEBUG MARKER: == sanity-flr test 36d: write/punch FLR file update OST layout version ========================================================== 21:00:04 (1703786404)
      [ 8588.313641] Lustre: DEBUG MARKER: == sanity-flr test 37: mirror I/O API verification ======= 21:00:38 (1703786438)
      [ 8600.635491] Lustre: Unmounted lustre-client
      [ 8600.944134] Lustre: Mounted lustre-client
      [ 8630.238315] Lustre: DEBUG MARKER: == sanity-flr test 38: resync ============================ 21:01:20 (1703786480)
      [ 8633.346584] page:ffffea00071e1d80 refcount:0 mapcount:1 mapping:dead000000000400 index:0x7f2232076 compound_mapcount: 1
      [ 8633.348480] anon flags: 0x17ffffc0000000()
      [ 8633.349531] raw: 0017ffffc0000000 ffffea00071e0001 dead000000000200 dead000000000400
      [ 8633.351227] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
      [ 8633.352187] page dumped because: VM_BUG_ON_PAGE(PageTail(page))
      [ 8633.352932] ------------[ cut here ]------------
      [ 8633.353521] kernel BUG at include/linux/page-flags.h:505!
      [ 8633.354219] invalid opcode: 0000 [#1] SMP KASAN PTI
      [ 8633.354797] CPU: 1 PID: 237586 Comm: lt-lfs Tainted: G    B   W  OE    ---------r-  - 4.18.0-305.25.1.el8_4.x86_64+debug #1
      [ 8633.356130] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-4.module_el8.9.0+3659+9c8643f3 04/01/2014
      [ 8633.357256] RIP: 0010:lov_page_init_empty+0x2a4/0x330 [lov]
      [ 8633.357985] Code: c0 59 c2 c7 05 51 76 05 00 01 00 00 00 e8 34 28 d6 fe 5b 31 c0 5d 41 5c 41 5d c3 48 c7 c6 e0 34 56 c2 48 89 df e8 9c c8 ed ca <0f> 0b 48 c7 c7 c0 bf 59 c2 e8 89 55 62 cb 48 89 ef e8 66 ac fd ca
      [ 8633.360240] RSP: 0018:ffff8882067c73f0 EFLAGS: 00010282
      [ 8633.360853] RAX: dffffc0000000000 RBX: ffffea00071e1d80 RCX: 0000000000000007
      [ 8633.361704] RDX: 1ffffd4000e3c3b7 RSI: 0000000000000000 RDI: ffffea00071e1db8
      [ 8633.362568] RBP: ffffffffc12fc380 R08: ffffed1044f3bda5 R09: ffffed1044f3bda5
      [ 8633.363409] R10: ffff8882279ded23 R11: ffffed1044f3bda4 R12: ffff8881f6e3fc38
      [ 8633.364263] R13: ffff8881f6e3fc20 R14: ffff8881f5e3fc90 R15: ffff888224cdb458
      [ 8633.365116] FS:  00007f2233d02480(0000) GS:ffff888227800000(0000) knlGS:0000000000000000
      [ 8633.366068] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 8633.366746] CR2: 00007f2231e75000 CR3: 000000021cf8c006 CR4: 0000000000020ee0
      [ 8633.367607] Call Trace:
      [ 8633.367922]  lov_page_init_composite+0x95d/0x10a0 [lov]
      [ 8633.368588]  ? lov_page_init_empty+0x330/0x330 [lov]
      [ 8633.369274]  ? cl_page_alloc+0xac5/0x13f0 [obdclass]
      [ 8633.369901]  cl_page_alloc+0x8cf/0x13f0 [obdclass]
      [ 8633.370516]  ? __kmalloc_node+0x17a/0x2a0
      [ 8633.371069]  ? cl_page_make_ready+0xa80/0xa80 [obdclass]
      [ 8633.371707]  ? iov_iter_get_pages_alloc+0x23d/0x10e0
      [ 8633.372491]  ? __init_waitqueue_head+0x9c/0x110
      [ 8633.373066]  ? memset+0x1f/0x40
      [ 8633.373499]  cl_page_find+0x3d3/0x620 [obdclass]
      [ 8633.374138]  ll_direct_IO_impl+0x10d5/0x2ab0 [lustre]
      [ 8633.374773]  ? ll_write_end+0x12b0/0x12b0 [lustre]
      [ 8633.375379]  ? rcu_read_unlock+0x50/0x50
      [ 8633.375846]  ? touch_atime+0xca/0x250
      [ 8633.376314]  generic_file_read_iter+0x1ed/0x4c0
      [ 8633.376853]  ? trace_hardirqs_on+0x20/0x195
      [ 8633.377410]  vvp_io_read_start+0x1042/0x18f0 [lustre]
      [ 8633.378071]  ? vvp_io_setattr_fini+0x180/0x180 [lustre]
      [ 8633.378706]  ? lov_lock_init_composite+0x1b1/0x1f0 [lov]
      [ 8633.379408]  ? cl_lock_request+0x148/0x370 [obdclass]
      [ 8633.380073]  cl_io_start+0x187/0x3a0 [obdclass]
      [ 8633.380667]  cl_io_loop+0x183/0x490 [obdclass]
      [ 8633.381265]  ll_file_io_generic+0x937/0x2540 [lustre]
      [ 8633.381897]  ? ll_io_init+0x1080/0x1080 [lustre]
      [ 8633.382518]  ll_file_read_iter+0x1505/0x2a60 [lustre]
      [ 8633.383181]  ? ll_file_write_iter+0x21a0/0x21a0 [lustre]
      [ 8633.383809]  ? lock_downgrade+0x710/0x710
      [ 8633.384348]  ? ll_getattr_dentry+0xaeb/0x2600 [lustre]
      [ 8633.385084]  new_sync_read+0x390/0x550
      [ 8633.385529]  ? do_iter_readv_writev+0x6d0/0x6d0
      [ 8633.386096]  ? lock_downgrade+0x710/0x710
      [ 8633.386569]  ? rcu_read_unlock+0x50/0x50
      [ 8633.387067]  ? __ia32_sys_lstat+0x70/0x70
      [ 8633.387559]  ? fsnotify_first_mark+0x150/0x150
      [ 8633.388116]  vfs_read+0xff/0x300
      [ 8633.388511]  ksys_pread64+0x11b/0x140
      [ 8633.388949]  ? __audit_syscall_exit+0x796/0xab0
      [ 8633.389516]  ? __ia32_sys_write+0xb0/0xb0
      [ 8633.389996]  ? trace_hardirqs_on_thunk+0x1a/0x20
      [ 8633.390570]  ? trace_hardirqs_on_caller+0x22/0x1a0
      [ 8633.391213]  ? do_syscall_64+0x22/0x430
      [ 8633.391670]  do_syscall_64+0xa5/0x430
      [ 8633.392141]  entry_SYSCALL_64_after_hwframe+0x6a/0xdf
      [ 8633.392734] RIP: 0033:0x7f22334491c8
      [ 8633.393193] Code: b8 ff ff ff ff eb c5 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 8b 05 86 d2 20 00 49 89 ca 85 c0 75 17 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 60 c3 0f 1f 80 00 00 00 00 41 55 49 89 cd 41
      [ 8633.395464] RSP: 002b:00007ffc7847e528 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
      [ 8633.396380] RAX: ffffffffffffffda RBX: 0000000000400000 RCX: 00007f22334491c8
      [ 8633.397236] RDX: 0000000000400000 RSI: 00007f2231e76000 RDI: 0000000000000003
      [ 8633.398098] RBP: 00007f2231e76000 R08: 0000000000000000 R09: 0000000000000000
      [ 8633.398956] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [ 8633.399816] R13: 0000000000000003 R14: 0000000000000fff R15: 0000000000000000
      [ 8633.400675] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) jbd2 mbcache rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache iTCO_wdt iTCO_vendor_support joydev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel qxl drm_ttm_helper ttm pcspkr drm_kms_helper syscopyarea sysfillrect i6300esb virtio_balloon sysimgblt fb_sys_fops drm lpc_ich i2c_i801 sunrpc vfat fat ip_tables xfs libcrc32c ahci libahci virtio_console crc32c_intel virtio_scsi e1000 virtio_blk libata serio_raw [last unloaded: libcfs]
      [ 8633.408363] ---[ end trace f6b2871834a024d8 ]---
      [ 8633.409254] RIP: 0010:lov_page_init_empty+0x2a4/0x330 [lov]
      [ 8633.410292] Code: c0 59 c2 c7 05 51 76 05 00 01 00 00 00 e8 34 28 d6 fe 5b 31 c0 5d 41 5c 41 5d c3 48 c7 c6 e0 34 56 c2 48 89 df e8 9c c8 ed ca <0f> 0b 48 c7 c7 c0 bf 59 c2 e8 89 55 62 cb 48 89 ef e8 66 ac fd ca
      [ 8633.413491] RSP: 0018:ffff8882067c73f0 EFLAGS: 00010282
      [ 8633.414539] RAX: dffffc0000000000 RBX: ffffea00071e1d80 RCX: 0000000000000007
      [ 8633.415772] RDX: 1ffffd4000e3c3b7 RSI: 0000000000000000 RDI: ffffea00071e1db8
      [ 8633.417060] RBP: ffffffffc12fc380 R08: ffffed1044f3bda5 R09: ffffed1044f3bda5
      [ 8633.418332] R10: ffff8882279ded23 R11: ffffed1044f3bda4 R12: ffff8881f6e3fc38
      [ 8633.419697] R13: ffff8881f6e3fc20 R14: ffff8881f5e3fc90 R15: ffff888224cdb458
      [ 8633.421061] FS:  00007f2233d02480(0000) GS:ffff888227800000(0000) knlGS:0000000000000000
      [ 8633.422602] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 8633.423714] CR2: 00007f2231e75000 CR3: 000000021cf8c006 CR4: 0000000000020ee0
      [ 8633.425090] Kernel panic - not syncing: Fatal exception
      
      503 static __always_inline void SetPageUptodate(struct page *page)
      504 {
      505         VM_BUG_ON_PAGE(PageTail(page), page);
      506         /*
      507          * Memory barrier must be issued before setting the PG_uptodate bit,
      508          * so that all previous stores issued in order to bring the page
      509          * uptodate are actually visible before PageUptodate becomes true.
      510          */
      511         smp_wmb();
      512         set_bit(PG_uptodate, &page->flags);
      513 }
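
The page dump above can be checked by hand against the assertion that fired. The following is a minimal userspace sketch, not kernel code: it assumes the second raw word printed by the dump is the page's compound_head field (a tail page stores its head page address there with bit 0 set) and that sizeof(struct page) is 64 bytes, which is the usual x86_64 value but is not stated in the report. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: struct page is 64 bytes, as is typical on x86_64. */
#define STRUCT_PAGE_SIZE 64ULL

/* A tail page keeps its head page address in compound_head with bit 0 set. */
static bool word_marks_tail(uint64_t compound_head_word)
{
	return (compound_head_word & 1ULL) != 0;
}

/* Clear bit 0 to recover the address of the head struct page. */
static uint64_t head_page_addr(uint64_t compound_head_word)
{
	return compound_head_word & ~1ULL;
}

/* Position of a tail page inside its compound page, counted in base pages. */
static uint64_t tail_index(uint64_t page_addr, uint64_t compound_head_word)
{
	assert(word_marks_tail(compound_head_word));
	return (page_addr - head_page_addr(compound_head_word)) / STRUCT_PAGE_SIZE;
}
```

With the values from the dump (page ffffea00071e1d80, second raw word ffffea00071e0001), these helpers report a tail page whose head is at ffffea00071e0000, at position 118 (0x76) within the compound page. That matches the low bits of index:0x7f2232076 and is consistent with SetPageUptodate() being called on a tail page of a huge page backing the 4 MiB pread64() buffer.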
      

        Activity

          [LU-17388] sanity-flr test 38: resync panic.

          adilger Andreas Dilger added a comment - It seems possible that this might relate to some of the other stale page in cache issues that have been seen recently.

          People

            wc-triage WC Triage
            shadow Alexey Lyashkov