Client panic "Freechain corrupt"/"Redzone Overwritten"

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.5
    • None
    • None
    • Environment: CentOS 7.7
      MDS/OSS Lustre version: 2.12.3
      Client Lustre version: 2.12.3_93_gb75f04d-1
    • 3

    Description

       

      We are running 2.12.3_93_gb75f04d-1 on clients to fix a panic when deleting files while using data on MDT (LU-12462). This has resolved the panic on file deletion; however, we are now experiencing 2-3 kernel panics a day across our 6 cluster login machines.

      We do not yet know what is triggering these panics, but they all start with either a kmalloc-8 "Freechain corrupt" or a kmalloc-8 "Redzone overwritten" report. I've reproduced samples of both below, taken from the vmcore-dmesg files generated by kdump. This looks similar to LU-12581 to me.

      Typical dmesg from crashed client:

      [171210.346747] =============================================================================
      [171210.346754] BUG kmalloc-8 (Tainted: G OE ------------ ): Freechain corrupt
      [171210.346756] -----------------------------------------------------------------------------
      [171210.346759] Disabling lock debugging due to kernel taint
      [171210.346763] INFO: Slab 0xffffeb5450defb40 objects=102 used=6 fp=0xffff8eb6b7bedfa8 flags=0x6fffff00000081
      [171210.346765] INFO: Object 0xffff8eb6b7bedf30 @offset=3888 fp=0x7fff8eb6b7bedf08
      [171210.346770] Redzone ffff8eb6b7bedf28: bb bb bb bb bb bb bb bb ........
      [171210.346773] Object ffff8eb6b7bedf30: 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkk.
      [171210.346775] Redzone ffff8eb6b7bedf38: bb bb bb bb bb bb bb bb ........
      [171210.346778] Padding ffff8eb6b7bedf48: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
      [171210.346783] CPU: 21 PID: 8721 Comm: pool Kdump: loaded Tainted: G B OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
      [171210.346785] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
      [171210.346787] Call Trace:
      [171210.346799] [<ffffffffa777ac23>] dump_stack+0x19/0x1b
      [171210.346805] [<ffffffffa7221561>] print_trailer+0x161/0x280
      [171210.346808] [<ffffffffa7221ebf>] on_freelist+0xff/0x270
      [171210.346813] [<ffffffffa77774cc>] free_debug_processing+0x18d/0x270
      [171210.346818] [<ffffffffa71ddcb5>] ? kvfree+0x35/0x40
      [171210.346822] [<ffffffffa7223bee>] __slab_free+0x1ce/0x290
      [171210.346829] [<ffffffffa7272e58>] ? generic_setxattr+0x68/0x80
      [171210.346834] [<ffffffffa7273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
      [171210.346840] [<ffffffffa732b7ae>] ? evm_inode_setxattr+0xe/0x10
      [171210.346844] [<ffffffffa71ddcb5>] ? kvfree+0x35/0x40
      [171210.346847] [<ffffffffa7223db6>] kfree+0x106/0x140
      [171210.346851] [<ffffffffa71ddcb5>] kvfree+0x35/0x40
      [171210.346855] [<ffffffffa727399b>] setxattr+0x15b/0x1e0
      [171210.346861] [<ffffffffa725c3ed>] ? putname+0x3d/0x60
      [171210.346865] [<ffffffffa725d602>] ? user_path_at_empty+0x72/0xc0
      [171210.346871] [<ffffffffa724d828>] ? __sb_start_write+0x58/0x120
      [171210.346876] [<ffffffffa7273c87>] SyS_setxattr+0xb7/0x100
      [171210.346882] [<ffffffffa778dede>] system_call_fastpath+0x25/0x2a
      [171210.346885] =============================================================================
      [171210.346888] BUG kmalloc-8 (Tainted: G B OE ------------ ): Wrong object count. Counter is 6 but counted were 98
      [171210.346889] -----------------------------------------------------------------------------
      [171210.346893] INFO: Slab 0xffffeb5450defb40 objects=102 used=6 fp=0xffff8eb6b7bedfa8 flags=0x6fffff00000081
      [171210.346897] CPU: 21 PID: 8721 Comm: pool Kdump: loaded Tainted: G B OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
      [171210.346899] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
      [171210.346901] Call Trace:
      [171210.346905] [<ffffffffa777ac23>] dump_stack+0x19/0x1b
      [171210.346908] [<ffffffffa7221b54>] slab_err+0xb4/0xe0
      [171210.346915] [<ffffffffa7030a1e>] ? show_stack+0x4e/0x60
      [171210.346918] [<ffffffffa7221561>] ? print_trailer+0x161/0x280
      [171210.346921] [<ffffffffa7221f85>] on_freelist+0x1c5/0x270
      [171210.346925] [<ffffffffa77774cc>] free_debug_processing+0x18d/0x270
      [171210.346929] [<ffffffffa71ddcb5>] ? kvfree+0x35/0x40
      [171210.346932] [<ffffffffa7223bee>] __slab_free+0x1ce/0x290
      [171210.346937] [<ffffffffa7272e58>] ? generic_setxattr+0x68/0x80
      [171210.346941] [<ffffffffa7273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
      [171210.346944] [<ffffffffa732b7ae>] ? evm_inode_setxattr+0xe/0x10
      [171210.346948] [<ffffffffa71ddcb5>] ? kvfree+0x35/0x40
      [171210.346951] [<ffffffffa7223db6>] kfree+0x106/0x140
      [171210.346955] [<ffffffffa71ddcb5>] kvfree+0x35/0x40
      [171210.346959] [<ffffffffa727399b>] setxattr+0x15b/0x1e0
      [171210.346963] [<ffffffffa725c3ed>] ? putname+0x3d/0x60
      [171210.346967] [<ffffffffa725d602>] ? user_path_at_empty+0x72/0xc0
      [171210.346971] [<ffffffffa724d828>] ? __sb_start_write+0x58/0x120
      [171210.346976] [<ffffffffa7273c87>] SyS_setxattr+0xb7/0x100
      [171210.346980] [<ffffffffa778dede>] system_call_fastpath+0x25/0x2a
      [171210.346983] FIX kmalloc-8: Object count adjusted.
      [171210.346985] =============================================================================
      [171210.346988] BUG kmalloc-8 (Tainted: G B OE ------------ ): Redzone overwritten
      [171210.346989] -----------------------------------------------------------------------------
      [171210.346993] INFO: 0xffff8eb6b7bed0b0-0xffff8eb6b7bed0b7. First byte 0x4c instead of 0xcc
      [171210.346996] INFO: Slab 0xffffeb5450defb40 objects=102 used=98 fp=0xffff8eb6b7bedfa8 flags=0x6fffff00000081
      [171210.346998] INFO: Object 0xffff8eb6b7bed0a8 @offset=168 fp=0x7f7f0e36373e5050
      [171210.347001] Redzone ffff8eb6b7bed0a0: cc cc cc cc cc cc cc cc ........
      [171210.347004] Object ffff8eb6b7bed0a8: d0 0b d6 0b 88 01 00 25 .......%
      [171210.347006] Redzone ffff8eb6b7bed0b0: 4c 4c 4c 4c 4c 4c 4c 4c LLLLLLLL
      [171210.347009] Padding ffff8eb6b7bed0c0: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
      [171210.347012] CPU: 21 PID: 8721 Comm: pool Kdump: loaded Tainted: G B OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
      [171210.347014] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
      [171210.347016] Call Trace:
      [171210.347020] [<ffffffffa777ac23>] dump_stack+0x19/0x1b
      [171210.347023] [<ffffffffa7221561>] print_trailer+0x161/0x280
      [171210.347026] [<ffffffffa72217ef>] check_bytes_and_report+0xcf/0x110
      [171210.347030] [<ffffffffa722237d>] check_object+0x1dd/0x2a0
      [171210.347033] [<ffffffffa77773cc>] free_debug_processing+0x8d/0x270
      [171210.347037] [<ffffffffa71ddcb5>] ? kvfree+0x35/0x40
      [171210.347040] [<ffffffffa7223bee>] __slab_free+0x1ce/0x290
      [171210.347045] [<ffffffffa7272e58>] ? generic_setxattr+0x68/0x80
      [171210.347049] [<ffffffffa7273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
      [171210.347258] [<ffffffffa732b7ae>] ? evm_inode_setxattr+0xe/0x10
      [171210.347262] [<ffffffffa71ddcb5>] ? kvfree+0x35/0x40
      [171210.347265] [<ffffffffa7223db6>] kfree+0x106/0x140
      [171210.347449] [<ffffffffa71ddcb5>] kvfree+0x35/0x40
      [171210.347627] [<ffffffffa727399b>] setxattr+0x15b/0x1e0
      [171210.347823] [<ffffffffa725c3ed>] ? putname+0x3d/0x60
      [171210.348010] [<ffffffffa725d602>] ? user_path_at_empty+0x72/0xc0
      [171210.348204] [<ffffffffa724d828>] ? __sb_start_write+0x58/0x120
      [171210.348209] [<ffffffffa7273c87>] SyS_setxattr+0xb7/0x100
      [171210.348392] [<ffffffffa778dede>] system_call_fastpath+0x25/0x2a
      [171210.348578] FIX kmalloc-8: Restoring 0xffff8eb6b7bed0b0-0xffff8eb6b7bed0b7=0xcc
      [171210.349139] FIX kmalloc-8: Object at 0xffff8eb6b7bed0a8 not freed
      [171210.462694] general protection fault: 0000 [#1] SMP 
      [171210.488281] Modules linked in: fuse can_bcm sctp can_raw can nfsd mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) cts lnet(OE) rpcsec_gss_krb5 nfsv4 dns_resolver libcfs(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_recent xt_conntrack nf_conntrack iptable_filter dm_mirror dm_region_hash dm_log dm_mod mlx4_ib(OE) ib_uverbs(OE) ib_core(OE) sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm mlx4_core(OE) mgag200 ttm irqbypass drm_kms_helper crc32_pclmul iTCO_wdt crc32c_intel iTCO_vendor_support ghash_clmulni_intel aesni_intel
      [171210.866605] syscopyarea lrw sysfillrect sysimgblt fb_sys_fops gf128mul glue_helper drm ablk_helper cryptd ses enclosure drm_panel_orientation_quirks ipmi_si mlx_compat(OE) pcspkr ipmi_devintf devlink ioatdma ipmi_msghandler pcc_cpufreq wmi i2c_i801 hpwdt lpc_ich acpi_power_meter binfmt_misc knem(OE) auth_rpcgss ip_tables smartpqi bridge stp llc xfs isci libsas qla3xxx e1000e igb i2c_algo_bit megaraid_sas aacraid aic79xx ata_piix mpt2sas raid_class mptspi scsi_transport_spi mptsas mptscsih mptbase arcmsr ahci libahci sata_nv sata_svw bnx2x libcrc32c bnx2 ext4 mbcache jbd2 sata_sil libata tg3 e1000 nfsv3 nfs_acl nfs lockd grace sunrpc fscache tun sd_mod crc_t10dif crct10dif_generic sg ixgbe crct10dif_pclmul crct10dif_common hpsa dca mdio ptp hpilo scsi_transport_sas pps_core [last unloaded: ipmi_msghandler]
      [171211.240594] 
      [171211.242206] CPU: 21 PID: 8721 Comm: pool Kdump: loaded Tainted: G B OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
      [171211.300030] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
      [171211.336094] task: ffff8e972f73b150 ti: ffff8e9cdf368000 task.ti: ffff8e9cdf368000
      [171211.375938] RIP: 0010:[<ffffffffc112fcdc>] [<ffffffffc112fcdc>] cl_page_delete0+0x6c/0x220 [obdclass]
      [171211.423586] RSP: 0018:ffff8e9cdf36bb98 EFLAGS: 00010287
      [171211.451848] RAX: 7fffffffc1439900 RBX: ffff8eb76f7e4a90 RCX: 000000000000001c
      [171211.490610] RDX: ffff8e88df67bb50 RSI: ffff8ea253a26b58 RDI: ffff8e8fd9ff89a4
      [171211.530173] RBP: ffff8e9cdf36bbb0 R08: ffff8ea253a26b58 R09: 0000000000000046
      [171211.568509] R10: 0000000000000230 R11: ffff8eae63eebc00 R12: ffff8eb76f7e4a28
      [171211.606963] R13: ffffffffc118d878 R14: ffff8e9cdf36bcd0 R15: ffff8e9cdf36bc60
      [171211.645074] FS: 00002aaabe446700(0000) GS:ffff8ec27f3c0000(0000) knlGS:0000000000000000
      [171211.689665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [171211.722223] CR2: 0000000001440b30 CR3: 0000003bd6a22000 CR4: 00000000003607e0
      [171211.762135] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [171211.801929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [171211.842015] Call Trace:
      [171211.855677] [<ffffffffc112fec3>] cl_page_delete+0x33/0x110 [obdclass]
      [171211.892125] [<ffffffffc158a9ff>] ll_invalidatepage+0x7f/0x170 [lustre]
      [171211.931786] [<ffffffffa71ce22d>] do_invalidatepage_range+0x7d/0x90
      [171211.965295] [<ffffffffa71ce2d7>] truncate_inode_page+0x77/0x80
      [171211.996796] [<ffffffffa71ce50a>] truncate_inode_pages_range+0x1ea/0x750
      [171212.033187] [<ffffffffa71ceadf>] truncate_inode_pages_final+0x4f/0x60
      [171212.071886] [<ffffffffc1570acf>] ll_delete_inode+0x4f/0x230 [lustre]
      [171212.106401] [<ffffffffa7268544>] evict+0xb4/0x180
      [171212.133368] [<ffffffffa726896c>] iput+0xfc/0x190
      [171212.162024] [<ffffffffa725cbde>] do_unlinkat+0x1ae/0x2d0
      [171212.191747] [<ffffffffa725dc96>] SyS_unlink+0x16/0x20
      [171212.221730] [<ffffffffa778dede>] system_call_fastpath+0x25/0x2a
      [171212.253488] Code: 89 e6 ba 04 00 00 00 4c 89 ef e8 80 fb ff ff 49 8b 44 24 30 49 83 c4 28 49 39 c4 48 8d 58 e0 74 2b 66 0f 1f 44 00 00 48 8b 43 18 <48> 8b 40 40 48 85 c0 74 0b 48 89 de 4c 89 ef e8 20 2e 26 e6 48 
      [171212.353283] RIP [<ffffffffc112fcdc>] cl_page_delete0+0x6c/0x220 [obdclass]
      [171212.391650] RSP <ffff8e9cdf36bb98>
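For readers decoding the reports above, here is a minimal sketch (not part of the original report) that names the SLUB debug byte patterns seen in the dump and compares the logged values with what the allocator expected. The pattern values are the ones defined in the kernel's include/linux/poison.h; the helper names are made up for illustration, and the canonical-address check assumes x86_64 with 48-bit virtual addresses.

```python
# Byte patterns written by the kernel's SLUB debug code
# (values as defined in include/linux/poison.h).
SLUB_PATTERNS = {
    0xBB: "SLUB_RED_INACTIVE (redzone of a free object)",
    0xCC: "SLUB_RED_ACTIVE (redzone of an allocated object)",
    0x6B: "POISON_FREE (body of a freed object)",
    0xA5: "POISON_END (last byte of a freed object)",
    0x5A: "POISON_INUSE (padding / uninitialised memory)",
}


def parse_hex_bytes(text):
    """Turn a dump fragment such as '4c 4c 4c' into bytes."""
    return bytes(int(token, 16) for token in text.split())


def describe(byte):
    """Name a byte value if it is one of the SLUB debug patterns."""
    return SLUB_PATTERNS.get(byte, "not a SLUB debug pattern")


def changed_bits(expected, actual):
    """Return a mask of the bits that differ between two values."""
    return expected ^ actual


def is_canonical(addr):
    """Assumes 48-bit virtual addresses: bits 63..47 must all match."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


if __name__ == "__main__":
    # Overwritten redzone ("First byte 0x4c instead of 0xcc").
    redzone = parse_hex_bytes("4c 4c 4c 4c 4c 4c 4c 4c")
    masks = sorted({changed_bits(0xCC, b) for b in redzone})
    print("redzone bits changed:", [hex(m) for m in masks])

    # Intact patterns from the first "Freechain corrupt" report.
    for b in parse_hex_bytes("bb 6b a5 5a"):
        print(hex(b), "->", describe(b))

    # RAX at the time of the general protection fault.
    rax = 0x7FFFFFFFC1439900
    print("RAX canonical:", is_canonical(rax))
    print("RAX with bit 63 set:", hex(rax | (1 << 63)))
```

Run against the values above, this shows that every overwritten redzone byte differs from 0xcc only in bit 7 (mask 0x80), and that RAX is a non-canonical address which becomes 0xffffffffc1439900, in the same range as the Lustre module addresses in the call trace, once bit 63 is set. This is only an observation about the logged values, not a statement of the root cause.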
       
      

      Second dmesg, from a crash that starts with the "Redzone overwritten" message:

       

      [588058.598892] =============================================================================
      [588058.598898] BUG kmalloc-8 (Tainted: G OE ------------ ): Redzone overwritten
      [588058.598900] -----------------------------------------------------------------------------
      [588058.598903] Disabling lock debugging due to kernel taint
      [588058.598906] INFO: 0xffff8b70ccbdde48-0xffff8b70ccbdde4f. First byte 0x4c instead of 0xcc
      [588058.598908] INFO: Slab 0xffffd60ee632f740 objects=102 used=93 fp=0xffff8b70ccbddcd8 flags=0x6fffff00000081
      [588058.598910] INFO: Object 0xffff8b70ccbdde40 @offset=3648 fp=0x7f7f0b704c3d5d48
      [588058.598914] Redzone ffff8b70ccbdde38: cc cc cc cc cc cc cc cc ........
      [588058.598916] Object ffff8b70ccbdde40: d0 0b d6 0b 88 01 00 25 .......%
      [588058.598918] Redzone ffff8b70ccbdde48: 4c 4c 4c 4c 4c 4c 4c 4c LLLLLLLL
      [588058.598920] Padding ffff8b70ccbdde58: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
      [588058.598924] CPU: 27 PID: 12194 Comm: pool Kdump: loaded Tainted: G B OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
      [588058.598926] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
      [588058.598927] Call Trace:
      [588058.598939] [<ffffffffb477ac23>] dump_stack+0x19/0x1b
      [588058.598944] [<ffffffffb4221561>] print_trailer+0x161/0x280
      [588058.598947] [<ffffffffb42217ef>] check_bytes_and_report+0xcf/0x110
      [588058.598950] [<ffffffffb422237d>] check_object+0x1dd/0x2a0
      [588058.598953] [<ffffffffb47773cc>] free_debug_processing+0x8d/0x270
      [588058.598958] [<ffffffffb41ddcb5>] ? kvfree+0x35/0x40
      [588058.598962] [<ffffffffb4223bee>] __slab_free+0x1ce/0x290
      [588058.598968] [<ffffffffb4272e58>] ? generic_setxattr+0x68/0x80
      [588058.598972] [<ffffffffb4273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
      [588058.598977] [<ffffffffb432b7ae>] ? evm_inode_setxattr+0xe/0x10
      [588058.598980] [<ffffffffb41ddcb5>] ? kvfree+0x35/0x40
      [588058.598982] [<ffffffffb4223db6>] kfree+0x106/0x140
      [588058.598985] [<ffffffffb41ddcb5>] kvfree+0x35/0x40
      [588058.598989] [<ffffffffb427399b>] setxattr+0x15b/0x1e0
      [588058.598994] [<ffffffffb425c3ed>] ? putname+0x3d/0x60
      [588058.598998] [<ffffffffb425d602>] ? user_path_at_empty+0x72/0xc0
      [588058.599003] [<ffffffffb424d828>] ? __sb_start_write+0x58/0x120
      [588058.599008] [<ffffffffb42802f1>] ? do_utimes+0xf1/0x180
      [588058.599011] [<ffffffffb4273c87>] SyS_setxattr+0xb7/0x100
      [588058.599016] [<ffffffffb478dede>] system_call_fastpath+0x25/0x2a
      [588058.599019] FIX kmalloc-8: Restoring 0xffff8b70ccbdde48-0xffff8b70ccbdde4f=0xcc
      [588058.599022] FIX kmalloc-8: Object at 0xffff8b70ccbdde40 not freed
      [588060.269020] WebExtensions[13188]: segfault at 1fff8 ip 00001f17cb0e5fbb sp 00007fffffffb998 error 4
      [588076.827561] atom[21965]: segfault at 21ea75682310 ip 00002aaaab0d6550 sp 00007fffffffc3a8 error 4 in libnode.so[2aaaaaccf000+12ba000]
      [588128.154889] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) page@ffff8b70fd832600[0 ffff8b5e72b18270 4 1 (null)]
      [588128.154903] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) vvp-page@ffff8b70fd832650(0:0) vm@ffffd60ec75c3dc0 6fffff00000009 3:0 0 449438 lru
      [588128.154910] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) lov-page@ffff8b70fd832690, comp index: 30002, gen: 8
      [588128.154924] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) osc-page@ffff8b70fd8326c8 112542: 1< 0x845fed 2 0 - - > 2< 460972032 0 4096 0x0 0x420 | (null) ffff8b872b233648 ffff8b49c2ff9540 > 3< 0 0 0 > 4< 0 0 8 1879965695 - | - - - - > 5< - - - - | 0 - | 0 - ->
      [588128.154930] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) end page@ffff8b70fd832600
      [588128.154935] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) list_empty(&page->cp_batch)
      [588128.154939] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) ASSERTION( 0 ) failed: 
      [588128.196374] LustreError: 32046:0:(cl_page.c:394:cl_pagevec_put()) LBUG
      [588128.230830] Pid: 32046, comm: wget 3.10.0-1062.9.1.el7.x86_64 #1 SMP Fri Dec 6 15:49:49 UTC 2019
      [588128.230832] Call Trace:
      [588128.230848] [<ffffffffc10d97cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [588128.230867] [<ffffffffc10d987c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [588128.230876] [<ffffffffc12fd1f3>] cl_pagevec_put+0x3a3/0x3e0 [obdclass]
      [588128.230922] [<ffffffffc12fd240>] cl_page_put+0x10/0x20 [obdclass]
      [588128.230944] [<ffffffffc175d895>] ll_releasepage+0xb5/0x1a0 [lustre]
      [588128.230971] [<ffffffffb41bd565>] try_to_release_page+0x35/0x50
      [588128.230979] [<ffffffffb41d2a19>] shrink_page_list+0xa09/0xc30
      [588128.230985] [<ffffffffb41d3266>] shrink_inactive_list+0x1c6/0x5d0
      [588128.230989] [<ffffffffb41d3d65>] shrink_lruvec+0x385/0x740
      [588128.230993] [<ffffffffb41d4196>] shrink_zone+0x76/0x1a0
      [588128.230997] [<ffffffffb41d4680>] do_try_to_free_pages+0xf0/0x520
      [588128.231002] [<ffffffffb41d4d0a>] try_to_free_mem_cgroup_pages+0xda/0x190
      [588128.231006] [<ffffffffb423c7ce>] mem_cgroup_reclaim+0x4e/0x120
      [588128.231011] [<ffffffffb423d19c>] __mem_cgroup_try_charge+0x4ec/0x670
      [588128.231014] [<ffffffffb423db09>] mem_cgroup_charge_common+0x59/0xc0
      [588128.231018] [<ffffffffb423f4ca>] mem_cgroup_cache_charge+0x8a/0xb0
      [588128.231022] [<ffffffffb41be1ee>] __add_to_page_cache_locked+0x4e/0x190
      [588128.231026] [<ffffffffb41be387>] add_to_page_cache_lru+0x37/0xb0
      [588128.231030] [<ffffffffb41be449>] grab_cache_page_nowait+0x49/0xa0
      [588128.231033] [<ffffffffc175e1d5>] ll_write_begin+0xd5/0xc00 [lustre]
      [588128.231048] [<ffffffffb41bd28f>] generic_file_buffered_write+0x10f/0x270
      [588128.231052] [<ffffffffb41bfaf2>] __generic_file_aio_write+0x1e2/0x400
      [588128.231056] [<ffffffffc176c51b>] __generic_file_write_iter+0xcb/0x340 [lustre]
      [588128.231072] [<ffffffffc1770704>] vvp_io_write_start+0x4c4/0x970 [lustre]
      [588128.231088] [<ffffffffc13011a8>] cl_io_start+0x68/0x130 [obdclass]
      [588128.231112] [<ffffffffc130338c>] cl_io_loop+0xcc/0x1c0 [obdclass]
      [588128.231134] [<ffffffffc1725f4b>] ll_file_io_generic+0x63b/0xc90 [lustre]
      [588128.231147] [<ffffffffc1726a39>] ll_file_aio_write+0x289/0x660 [lustre]
      [588128.231158] [<ffffffffc1726f10>] ll_file_write+0x100/0x1c0 [lustre]
      [588128.231170] [<ffffffffb424a7f0>] vfs_write+0xc0/0x1f0
      [588128.231175] [<ffffffffb424b60f>] SyS_write+0x7f/0xf0
      [588128.231179] [<ffffffffb478dede>] system_call_fastpath+0x25/0x2a
      [588128.231185] [<ffffffffffffffff>] 0xffffffffffffffff
      [588128.231226] Kernel panic - not syncing: LBUG
      [588128.255989] CPU: 14 PID: 32046 Comm: wget Kdump: loaded Tainted: G B OE ------------ 3.10.0-1062.9.1.el7.x86_64 #1
      [588128.315914] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 10/21/2019
      [588128.353586] Call Trace:
      [588128.367273] [<ffffffffb477ac23>] dump_stack+0x19/0x1b
      [588128.395924] [<ffffffffb4774967>] panic+0xe8/0x21f
      [588128.423537] [<ffffffffc10d98cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [588128.458128] [<ffffffffc12fd1f3>] cl_pagevec_put+0x3a3/0x3e0 [obdclass]
      [588128.496480] [<ffffffffc12fbcf0>] ? cl_page_delete0+0x80/0x220 [obdclass]
      [588128.536173] [<ffffffffc12fd240>] cl_page_put+0x10/0x20 [obdclass]
      [588128.569680] [<ffffffffc175d895>] ll_releasepage+0xb5/0x1a0 [lustre]
      [588128.606245] [<ffffffffb41bd565>] try_to_release_page+0x35/0x50
      [588128.637150] [<ffffffffb41d2a19>] shrink_page_list+0xa09/0xc30
      [588128.668853] [<ffffffffb41d3266>] shrink_inactive_list+0x1c6/0x5d0
      [588128.701180] [<ffffffffb41d3d65>] shrink_lruvec+0x385/0x740
      [588128.729829] [<ffffffffb41d4196>] shrink_zone+0x76/0x1a0
      [588128.757029] [<ffffffffb41d4680>] do_try_to_free_pages+0xf0/0x520
      [588128.788834] [<ffffffffb41d4d0a>] try_to_free_mem_cgroup_pages+0xda/0x190
      [588128.825660] [<ffffffffb423c7ce>] mem_cgroup_reclaim+0x4e/0x120
      [588128.857225] [<ffffffffb423d19c>] __mem_cgroup_try_charge+0x4ec/0x670
      [588128.890392] [<ffffffffb423db09>] mem_cgroup_charge_common+0x59/0xc0
      [588128.926221] [<ffffffffb423f4ca>] mem_cgroup_cache_charge+0x8a/0xb0
      [588128.959155] [<ffffffffb41be1ee>] __add_to_page_cache_locked+0x4e/0x190
      [588128.997956] [<ffffffffb41be387>] add_to_page_cache_lru+0x37/0xb0
      [588129.030741] [<ffffffffb41be449>] grab_cache_page_nowait+0x49/0xa0
      [588129.068075] [<ffffffffc175e1d5>] ll_write_begin+0xd5/0xc00 [lustre]
      [588129.108147] [<ffffffffb41bd28f>] generic_file_buffered_write+0x10f/0x270
      [588129.147957] [<ffffffffb41bfaf2>] __generic_file_aio_write+0x1e2/0x400
      [588129.187721] [<ffffffffc176c51b>] __generic_file_write_iter+0xcb/0x340 [lustre]
      [588129.227843] [<ffffffffc1770704>] vvp_io_write_start+0x4c4/0x970 [lustre]
      [588129.267789] [<ffffffffc13011a8>] cl_io_start+0x68/0x130 [obdclass]
      [588129.300695] [<ffffffffc130338c>] cl_io_loop+0xcc/0x1c0 [obdclass]
      [588129.337964] [<ffffffffc1725f4b>] ll_file_io_generic+0x63b/0xc90 [lustre]
      [588129.377915] [<ffffffffc1726a39>] ll_file_aio_write+0x289/0x660 [lustre]
      [588129.418350] [<ffffffffc1726f10>] ll_file_write+0x100/0x1c0 [lustre]
      [588129.458398] [<ffffffffb424a7f0>] vfs_write+0xc0/0x1f0
      [588129.488082] [<ffffffffb424b60f>] SyS_write+0x7f/0xf0
      [588129.518291] [<ffffffffb478dede>] system_call_fastpath+0x25/0x2a
      

       

       


          Activity

            [LU-13168] Client panic "Freechain corrupt"/"Redzone Overwritten"
            pjones Peter Jones added a comment -

            Fix confirmed in 2.12.5


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38604/
            Subject: LU-13168 tests: verify truncated xattr is handled
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 75c0eb51332639a09c720fb41f3a2cdb5b029afb

            gerrit Gerrit Updater added a comment -

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38604
            Subject: LU-13168 tests: verify truncated xattr is handled
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 7d506d0989c0467e5a5fce7f6295eae578aef82e

            pjones Peter Jones added a comment -

            Landed for 2.14


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38434/
            Subject: LU-13168 tests: verify truncated xattr is handled
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: cb74546354201434a6fd3d53acd1a0808fbfcb1c

            pjones Peter Jones added a comment -

            > Will this be included in the next 2.12 LTS release?

            Very likely but it needs to land to master first.


            cjm14 Christopher Mountford added a comment -

            I've applied the patch to our 2.12.4 clients. This appears to build fine. Our test code now returns an error as expected rather than triggering a panic. Will this be included in the next 2.12 LTS release?

            cjm14 Christopher Mountford added a comment -

            I've tested the 38434 patch: I downloaded the tarball and built the client RPMs from it. This seems to have fixed the problem; the system call now returns an error without causing a panic. I'll try applying the diff to the 2.12.4 source and check that it works as well.

            I'll grab the latest gio as well and check whether its handling of extended attributes is fixed; if not, I'll file a bug report with GNOME.
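The reporter's test code is not included in the ticket. The following is a hypothetical sketch of how such a check could look, based only on the setxattr call traces above and the patch subject ("verify truncated xattr is handled"). The xattr name `lustre.lov`, the 8-byte truncation length (chosen to match the kmalloc-8 cache named in the reports), the function name, and the example paths are all assumptions rather than details confirmed by the ticket.

```python
import errno
import os

# Assumption: the Lustre layout xattr; the ticket does not name the attribute.
LAYOUT_XATTR = "lustre.lov"


def set_truncated_xattr(src, dst, name=LAYOUT_XATTR, keep=8,
                        getxattr=os.getxattr, setxattr=os.setxattr):
    """Copy only the first `keep` bytes of xattr `name` from src to dst.

    Returns 0 if the kernel accepted the truncated value, otherwise the
    errno it reported. The getxattr/setxattr parameters exist so the
    function can be exercised without a Lustre mount.
    """
    value = getxattr(src, name)
    try:
        setxattr(dst, name, value[:keep])
    except OSError as exc:
        return exc.errno
    return 0


if __name__ == "__main__":
    # Hypothetical paths on a Lustre test mount. Run this only on a
    # disposable test client: an unpatched client may panic.
    src = "/mnt/lustre/existing_file"
    dst = "/mnt/lustre/new_empty_file"
    if os.path.exists(src) and os.path.exists(dst):
        rc = set_truncated_xattr(src, dst)
        print("setxattr result:", errno.errorcode.get(rc, rc))
```

Going by the comments in this ticket, a patched client would be expected to return a non-zero errno from a call like this instead of corrupting memory; which error code is returned is not stated in the ticket.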

            adilger Andreas Dilger added a comment -

            Christopher, it looks like there was still a problem with PFL layouts, even after the 36589 patch. I've updated the 38434 patch to also fix the PFL layout problem.

            gerrit Gerrit Updater added a comment -

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38434
            Subject: LU-13168 tests: verify truncated xattr is handled
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: e0e7f0e4ef0aa868c0d9d38bb517357ef50cea25

            People

              Assignee: adilger Andreas Dilger
              Reporter: cjm14 Christopher Mountford
              Votes: 0
              Watchers: 8
