Details
Bug
Resolution: Unresolved
Blocker
RHEL 8.4 debug kernel.
Does not reproduce with ONLY=38, but is easy to reproduce with a full test suite run.
Description
[ 8554.748991] Lustre: DEBUG MARKER: == sanity-flr test 36d: write/punch FLR file update OST layout version ========================================================== 21:00:04 (1703786404)
[ 8588.313641] Lustre: DEBUG MARKER: == sanity-flr test 37: mirror I/O API verification ======= 21:00:38 (1703786438)
[ 8600.635491] Lustre: Unmounted lustre-client
[ 8600.944134] Lustre: Mounted lustre-client
[ 8630.238315] Lustre: DEBUG MARKER: == sanity-flr test 38: resync ============================ 21:01:20 (1703786480)
[ 8633.346584] page:ffffea00071e1d80 refcount:0 mapcount:1 mapping:dead000000000400 index:0x7f2232076 compound_mapcount: 1
[ 8633.348480] anon flags: 0x17ffffc0000000()
[ 8633.349531] raw: 0017ffffc0000000 ffffea00071e0001 dead000000000200 dead000000000400
[ 8633.351227] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
[ 8633.352187] page dumped because: VM_BUG_ON_PAGE(PageTail(page))
[ 8633.352932] ------------[ cut here ]------------
[ 8633.353521] kernel BUG at include/linux/page-flags.h:505!
[ 8633.354219] invalid opcode: 0000 [#1] SMP KASAN PTI
[ 8633.354797] CPU: 1 PID: 237586 Comm: lt-lfs Tainted: G B W OE ---------r- - 4.18.0-305.25.1.el8_4.x86_64+debug #1
[ 8633.356130] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-4.module_el8.9.0+3659+9c8643f3 04/01/2014
[ 8633.357256] RIP: 0010:lov_page_init_empty+0x2a4/0x330 [lov]
[ 8633.357985] Code: c0 59 c2 c7 05 51 76 05 00 01 00 00 00 e8 34 28 d6 fe 5b 31 c0 5d 41 5c 41 5d c3 48 c7 c6 e0 34 56 c2 48 89 df e8 9c c8 ed ca <0f> 0b 48 c7 c7 c0 bf 59 c2 e8 89 55 62 cb 48 89 ef e8 66 ac fd ca
[ 8633.360240] RSP: 0018:ffff8882067c73f0 EFLAGS: 00010282
[ 8633.360853] RAX: dffffc0000000000 RBX: ffffea00071e1d80 RCX: 0000000000000007
[ 8633.361704] RDX: 1ffffd4000e3c3b7 RSI: 0000000000000000 RDI: ffffea00071e1db8
[ 8633.362568] RBP: ffffffffc12fc380 R08: ffffed1044f3bda5 R09: ffffed1044f3bda5
[ 8633.363409] R10: ffff8882279ded23 R11: ffffed1044f3bda4 R12: ffff8881f6e3fc38
[ 8633.364263] R13: ffff8881f6e3fc20 R14: ffff8881f5e3fc90 R15: ffff888224cdb458
[ 8633.365116] FS: 00007f2233d02480(0000) GS:ffff888227800000(0000) knlGS:0000000000000000
[ 8633.366068] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8633.366746] CR2: 00007f2231e75000 CR3: 000000021cf8c006 CR4: 0000000000020ee0
[ 8633.367607] Call Trace:
[ 8633.367922] lov_page_init_composite+0x95d/0x10a0 [lov]
[ 8633.368588] ? lov_page_init_empty+0x330/0x330 [lov]
[ 8633.369274] ? cl_page_alloc+0xac5/0x13f0 [obdclass]
[ 8633.369901] cl_page_alloc+0x8cf/0x13f0 [obdclass]
[ 8633.370516] ? __kmalloc_node+0x17a/0x2a0
[ 8633.371069] ? cl_page_make_ready+0xa80/0xa80 [obdclass]
[ 8633.371707] ? iov_iter_get_pages_alloc+0x23d/0x10e0
[ 8633.372491] ? __init_waitqueue_head+0x9c/0x110
[ 8633.373066] ? memset+0x1f/0x40
[ 8633.373499] cl_page_find+0x3d3/0x620 [obdclass]
[ 8633.374138] ll_direct_IO_impl+0x10d5/0x2ab0 [lustre]
[ 8633.374773] ? ll_write_end+0x12b0/0x12b0 [lustre]
[ 8633.375379] ? rcu_read_unlock+0x50/0x50
[ 8633.375846] ? touch_atime+0xca/0x250
[ 8633.376314] generic_file_read_iter+0x1ed/0x4c0
[ 8633.376853] ? trace_hardirqs_on+0x20/0x195
[ 8633.377410] vvp_io_read_start+0x1042/0x18f0 [lustre]
[ 8633.378071] ? vvp_io_setattr_fini+0x180/0x180 [lustre]
[ 8633.378706] ? lov_lock_init_composite+0x1b1/0x1f0 [lov]
[ 8633.379408] ? cl_lock_request+0x148/0x370 [obdclass]
[ 8633.380073] cl_io_start+0x187/0x3a0 [obdclass]
[ 8633.380667] cl_io_loop+0x183/0x490 [obdclass]
[ 8633.381265] ll_file_io_generic+0x937/0x2540 [lustre]
[ 8633.381897] ? ll_io_init+0x1080/0x1080 [lustre]
[ 8633.382518] ll_file_read_iter+0x1505/0x2a60 [lustre]
[ 8633.383181] ? ll_file_write_iter+0x21a0/0x21a0 [lustre]
[ 8633.383809] ? lock_downgrade+0x710/0x710
[ 8633.384348] ? ll_getattr_dentry+0xaeb/0x2600 [lustre]
[ 8633.385084] new_sync_read+0x390/0x550
[ 8633.385529] ? do_iter_readv_writev+0x6d0/0x6d0
[ 8633.386096] ? lock_downgrade+0x710/0x710
[ 8633.386569] ? rcu_read_unlock+0x50/0x50
[ 8633.387067] ? __ia32_sys_lstat+0x70/0x70
[ 8633.387559] ? fsnotify_first_mark+0x150/0x150
[ 8633.388116] vfs_read+0xff/0x300
[ 8633.388511] ksys_pread64+0x11b/0x140
[ 8633.388949] ? __audit_syscall_exit+0x796/0xab0
[ 8633.389516] ? __ia32_sys_write+0xb0/0xb0
[ 8633.389996] ? trace_hardirqs_on_thunk+0x1a/0x20
[ 8633.390570] ? trace_hardirqs_on_caller+0x22/0x1a0
[ 8633.391213] ? do_syscall_64+0x22/0x430
[ 8633.391670] do_syscall_64+0xa5/0x430
[ 8633.392141] entry_SYSCALL_64_after_hwframe+0x6a/0xdf
[ 8633.392734] RIP: 0033:0x7f22334491c8
[ 8633.393193] Code: b8 ff ff ff ff eb c5 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 8b 05 86 d2 20 00 49 89 ca 85 c0 75 17 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 60 c3 0f 1f 80 00 00 00 00 41 55 49 89 cd 41
[ 8633.395464] RSP: 002b:00007ffc7847e528 EFLAGS: 00000246 ORIG_RAX: 0000000000000011
[ 8633.396380] RAX: ffffffffffffffda RBX: 0000000000400000 RCX: 00007f22334491c8
[ 8633.397236] RDX: 0000000000400000 RSI: 00007f2231e76000 RDI: 0000000000000003
[ 8633.398098] RBP: 00007f2231e76000 R08: 0000000000000000 R09: 0000000000000000
[ 8633.398956] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 8633.399816] R13: 0000000000000003 R14: 0000000000000fff R15: 0000000000000000
[ 8633.400675] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) jbd2 mbcache rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache iTCO_wdt iTCO_vendor_support joydev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel qxl drm_ttm_helper ttm pcspkr drm_kms_helper syscopyarea sysfillrect i6300esb virtio_balloon sysimgblt fb_sys_fops drm lpc_ich i2c_i801 sunrpc vfat fat ip_tables xfs libcrc32c ahci libahci virtio_console crc32c_intel virtio_scsi e1000 virtio_blk libata serio_raw [last unloaded: libcfs]
[ 8633.408363] ---[ end trace f6b2871834a024d8 ]---
[ 8633.409254] RIP: 0010:lov_page_init_empty+0x2a4/0x330 [lov]
[ 8633.410292] Code: c0 59 c2 c7 05 51 76 05 00 01 00 00 00 e8 34 28 d6 fe 5b 31 c0 5d 41 5c 41 5d c3 48 c7 c6 e0 34 56 c2 48 89 df e8 9c c8 ed ca <0f> 0b 48 c7 c7 c0 bf 59 c2 e8 89 55 62 cb 48 89 ef e8 66 ac fd ca
[ 8633.413491] RSP: 0018:ffff8882067c73f0 EFLAGS: 00010282
[ 8633.414539] RAX: dffffc0000000000 RBX: ffffea00071e1d80 RCX: 0000000000000007
[ 8633.415772] RDX: 1ffffd4000e3c3b7 RSI: 0000000000000000 RDI: ffffea00071e1db8
[ 8633.417060] RBP: ffffffffc12fc380 R08: ffffed1044f3bda5 R09: ffffed1044f3bda5
[ 8633.418332] R10: ffff8882279ded23 R11: ffffed1044f3bda4 R12: ffff8881f6e3fc38
[ 8633.419697] R13: ffff8881f6e3fc20 R14: ffff8881f5e3fc90 R15: ffff888224cdb458
[ 8633.421061] FS: 00007f2233d02480(0000) GS:ffff888227800000(0000) knlGS:0000000000000000
[ 8633.422602] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8633.423714] CR2: 00007f2231e75000 CR3: 000000021cf8c006 CR4: 0000000000020ee0
[ 8633.425090] Kernel panic - not syncing: Fatal exception
503 static __always_inline void SetPageUptodate(struct page *page)
504 {
505 	VM_BUG_ON_PAGE(PageTail(page), page);
506 	/*
507 	 * Memory barrier must be issued before setting the PG_uptodate bit,
508 	 * so that all previous stores issued in order to bring the page
509 	 * uptodate are actually visible before PageUptodate becomes true.
510 	 */
511 	smp_wmb();
512 	set_bit(PG_uptodate, &page->flags);
513 }
This may be related to some of the other stale-page-in-cache issues that have been seen recently.