[LU-12882] PPC client: sanity-quota test_51: OSS crashed
Created: 18/Oct/19  Updated: 24/Oct/19

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.3
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Maloo
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: ppc

Issue Links:
Duplicate
duplicates LU-11997 Crash in lustre_swab_fiemap Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/0acb8bb8-eb0f-11e9-b62b-52540065bddc

test_51 failed with the following error:

trevis-55vm1 crashed during sanity-quota test_51

OSS crash

[17319.609460] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity-quota test 51: Test project accounting with mv\/cp ========================================== 16:13:54 \(1570637634\)
[17319.806668] Lustre: DEBUG MARKER: == sanity-quota test 51: Test project accounting with mv/cp ========================================== 16:13:54 (1570637634)
[17321.764506] Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0
[17322.699241] Lustre: DEBUG MARKER: lctl set_param -n osd*.*OS*.force_sync=1
[17324.443057] Lustre: DEBUG MARKER: lctl set_param -n osd*.*OS*.force_sync=1
[17325.461198] general protection fault: 0000 [#1] SMP 
[17325.462589] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core sunrpc dm_mod iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw parport_pc gf128mul glue_helper ablk_helper cryptd joydev i2c_piix4 pcspkr virtio_balloon parport ip_tables ext4 mbcache jbd2 ata_generic pata_acpi ata_piix virtio_blk libata crct10dif_pclmul crct10dif_common 8139too crc32c_intel serio_raw
[17325.482979]  virtio_pci virtio_ring virtio 8139cp mii floppy
[17325.484320] CPU: 1 PID: 5126 Comm: in.mrshd Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.x86_64 #1
[17325.487118] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[17325.488461] task: ffff8b9f37aa2080 ti: ffff8b9f20560000 task.ti: ffff8b9f20560000
[17325.490296] RIP: 0010:[<ffffffffb221d6a4>]  [<ffffffffb221d6a4>] kmem_cache_alloc+0x74/0x1f0
[17325.492465] RSP: 0018:ffff8b9f20563ce0  EFLAGS: 00010286
[17325.493698] RAX: 0000000000000000 RBX: ffff8b9ef6479900 RCX: 0000000001496816
[17325.495475] RDX: 0000000001496815 RSI: 0000000000000200 RDI: ffff8b9f3d001b00
[17325.496816] RBP: ffff8b9f20563d10 R08: 000000000001f0a0 R09: ffffffffb21f9b94
[17325.498036] R10: ffff8b9f392155e8 R11: ffff8b9f1f8b3400 R12: c0368b1f9f8bffff
[17325.499218] R13: 0000000000000200 R14: ffff8b9f3d001b00 R15: ffff8b9f3d001b00
[17325.500410] FS:  00007f6aeefac780(0000) GS:ffff8b9f3fd00000(0000) knlGS:0000000000000000
[17325.501740] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17325.502725] CR2: 00007f6aeefbb000 CR3: 00000000641ec000 CR4: 00000000000606e0
[17325.503985] Call Trace:
[17325.504485]  [<ffffffffb21f9b94>] anon_vma_clone+0x64/0x1c0
[17325.505660]  [<ffffffffb21f9d22>] anon_vma_fork+0x32/0x130
[17325.506667]  [<ffffffffb20959f3>] dup_mm+0x473/0x750
[17325.507537]  [<ffffffffb2097152>] copy_process+0x1452/0x1a40
[17325.508549]  [<ffffffffb20978f1>] do_fork+0x91/0x320
[17325.509450]  [<ffffffffb2777d15>] ? system_call_after_swapgs+0xa2/0x146
[17325.510613]  [<ffffffffb2777d21>] ? system_call_after_swapgs+0xae/0x146
[17325.511772]  [<ffffffffb2777d15>] ? system_call_after_swapgs+0xa2/0x146
[17325.512935]  [<ffffffffb2777d21>] ? system_call_after_swapgs+0xae/0x146
[17325.514110]  [<ffffffffb2777d15>] ? system_call_after_swapgs+0xa2/0x146
[17325.515260]  [<ffffffffb2097c06>] SyS_clone+0x16/0x20
[17325.516140]  [<ffffffffb27781b4>] stub_clone+0x44/0x70
[17325.517067]  [<ffffffffb2777ddb>] ? system_call_fastpath+0x22/0x27
[17325.518158] Code: 3b df 4d 49 8b 50 08 4d 8b 20 49 8b 40 10 4d 85 e4 0f 84 28 01 00 00 48 85 c0 0f 84 1f 01 00 00 49 63 46 20 48 8d 4a 01 4d 8b 06 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 ba 49 63 
[17325.523887] RIP  [<ffffffffb221d6a4>] kmem_cache_alloc+0x74/0x1f0
[17325.525008]  RSP <ffff8b9f20563ce0>
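
On x86-64, a general protection fault in a memory load is commonly raised when the address is non-canonical. The following is a minimal sketch, not part of the original report, that checks the general-purpose register values copied from the dump above. It assumes 48-bit virtual addressing (4-level paging), and the helper name is_canonical is hypothetical. The faulting bytes marked in the Code line (49 8b 1c 04) appear to decode as a load from R12 + RAX, so R12 is the value of interest.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits-1) are all zeros or all ones.

    Assumption: 48-bit virtual addresses (4-level paging).
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


# Register values copied from the oops above.
REGS = {
    "RAX": 0x0000000000000000, "RBX": 0xFFFF8B9EF6479900,
    "RCX": 0x0000000001496816, "RDX": 0x0000000001496815,
    "RSI": 0x0000000000000200, "RDI": 0xFFFF8B9F3D001B00,
    "RBP": 0xFFFF8B9F20563D10, "R08": 0x000000000001F0A0,
    "R09": 0xFFFFFFFFB21F9B94, "R10": 0xFFFF8B9F392155E8,
    "R11": 0xFFFF8B9F1F8B3400, "R12": 0xC0368B1F9F8BFFFF,
    "R13": 0x0000000000000200, "R14": 0xFFFF8B9F3D001B00,
    "R15": 0xFFFF8B9F3D001B00,
}

# Collect the registers whose contents could not be a valid pointer.
suspects = [name for name, value in REGS.items() if not is_canonical(value)]
print(suspects)  # ['R12']
```

Under these assumptions only R12 holds a non-canonical value, which would be consistent with kmem_cache_alloc following a corrupted pointer.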

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-quota test_51 - trevis-55vm1 crashed during sanity-quota test_51



 Comments   
Comment by Oleg Drokin [ 24/Oct/19 ]

PPC crashes in kmem_cache_alloc are all currently due to LU-11997.

It's just memory corruption due to improper swabbing code.
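
One observation that seems consistent with this comment: byte-reversing the R12 value from the register dump gives an address that looks like an ordinary kernel pointer and falls in the same 4 KiB page as R11. The sketch below is an interpretation of the dump, not something stated in the ticket or taken from Lustre code; swab64 here is a hypothetical Python helper, and the 4 KiB page size is an assumption.

```python
def swab64(value: int) -> int:
    """Reverse the byte order of a 64-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(8, "little"), "big")


# Values copied from the register dump in the description.
R12 = 0xC0368B1F9F8BFFFF  # register value that looks corrupted
R11 = 0xFFFF8B9F1F8B3400  # nearby value that looks like a valid pointer

swapped = swab64(R12)
print(hex(swapped))  # 0xffff8b9f1f8b36c0

# Assumption: 4 KiB pages, so the page number is the address >> 12.
same_page = (swapped >> 12) == (R11 >> 12)
print(same_page)  # True
```

If this reading is right, a valid 64-bit pointer in slab memory was byte-swapped in place, which is the kind of damage an endianness conversion running past its intended buffer could cause.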

Generated at Sat Feb 10 02:56:28 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.