[LU-3281] IO Fails - client stack overrun Created: 06/May/13  Updated: 16/Mar/16  Resolved: 08/May/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0, Lustre 2.8.0

Type: Bug Priority: Blocker
Reporter: Cliff White (Inactive) Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: HB
Environment:

hyperion/LLNL


Attachments: File console.iwc113     File console.iwc21     File console.iwc4    
Issue Links:
Related
is related to LU-3188 IOR fails due to client stack overrun Resolved
Severity: 3
Rank (Obsolete): 8119

 Description   

Testing the fix for LU-3188 (http://review.whamcloud.com/#change,6191).
The client panics immediately when running IOR.

2013-05-05 12:06:15 Lustre: DEBUG MARKER: == test iorssf: iorssf == 12:06:15
2013-05-05 12:30:42 BUG: scheduling while atomic: ior/5692/0x10000002
2013-05-05 12:30:42 BUG: unable to handle kernel paging request at 0000000315c2e000
2013-05-05 12:30:42 IP: [<ffffffff810568e4>] update_curr+0x144/0x1f0
2013-05-05 12:30:42 PGD 106a964067 PUD 0
2013-05-05 12:30:42 Thread overran stack, or stack corrupted
2013-05-05 12:30:42 Oops: 0000 [#1] SMP
2013-05-05 12:30:42 last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0/infiniband/mlx4_0/ports/1/pkeys/127
2013-05-05 12:30:42 CPU 9
2013-05-05 12:30:42 Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipmi_devintf acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core ext4 mbcache jbd2 dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm sg sd_mod crc_t10dif wmi dcdbas sb_edac edac_core ahci i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc mlx4_en mlx4_core igb dca ptp pps_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: cpufreq_ondemand]
2013-05-05 12:30:43

2013-05-05 12:30:43 Pid: 5692, comm: ior Not tainted 2.6.32-358.2.1.el6.x86_64 #1 Dell Inc. PowerEdge C6220/0HYFFG
2013-05-05 12:30:43 RIP: 0010:[<ffffffff810568e4>]  [<ffffffff810568e4>] update_curr+0x144/0x1f0
2013-05-05 12:30:43 RSP: 0018:ffff88089c423db8  EFLAGS: 00010086
2013-05-05 12:30:43 RAX: ffff880840d79540 RBX: 0000000072806048 RCX: ffff880877f101c0
2013-05-05 12:30:43 RDX: 00000000000192d8 RSI: 0000000000000000 RDI: ffff880840d79578
2013-05-05 12:30:43 RBP: ffff88089c423de8 R08: ffffffff8160bb65 R09: 0000000000000007
2013-05-05 12:30:43 R10: 0000000000000010 R11: 0000000000000007 R12: ffff88089c436768
2013-05-05 12:30:43 R13: 00000000007c9fa8 R14: 0000082565f22284 R15: ffff880840d79540
2013-05-05 12:30:43 FS:  00002aaaafebf8c0(0000) GS:ffff88089c420000(0000) knlGS:0000000000000000
2013-05-05 12:30:43 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2013-05-05 12:30:43 CR2: 0000000315c2e000 CR3: 000000106aea6000 CR4: 00000000000407e0
2013-05-05 12:30:43 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2013-05-05 12:30:43 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2013-05-05 12:30:43 Process ior (pid: 5692, threadinfo ffff880872806000, task ffff880840d79540)
2013-05-05 12:30:43 Stack:
2013-05-05 12:30:43  ffff88089c423dc8 ffffffff81013783 ffff880840d79578 ffff88089c436768
2013-05-05 12:30:43 <d> 0000000000000000 0000000000000000 ffff88089c423e18 ffffffff81056e9b
2013-05-05 12:30:43 <d> ffff88089c436700 0000000000000009 0000000000016700 0000000000000009
2013-05-05 12:30:43 Call Trace:
2013-05-05 12:30:43  <IRQ>
2013-05-05 12:30:43  [<ffffffff81013783>] ? native_sched_clock+0x13/0x80
2013-05-05 12:30:43 BUG: unable to handle kernel paging request at 000000000001182f
2013-05-05 12:30:43 IP: [<ffffffff8100f4dd>] print_context_stack+0xad/0x140
2013-05-05 12:30:43 PGD 106a964067 PUD 106a825067 PMD 0
2013-05-05 12:30:43 Thread overran stack, or stack corrupted
2013-05-05 12:30:43 Oops: 0000 [#2] SMP
2013-05-05 12:30:43 last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0/infiniband/mlx4_0/ports/1/pkeys/127
2013-05-05 12:30:43 CPU 9

Same as LU-3188: continuous stack dumps until the node crashes.



 Comments   
Comment by Keith Mannthey (Inactive) [ 06/May/13 ]

iwc4 (a client) did not report stack corruption, but the scheduler state seems to have been corrupted.

Comment by Keith Mannthey (Inactive) [ 06/May/13 ]

iwc113 (a client) logged the explicit stack-corruption error messages. There is not much data in the stacks.

Comment by Zhenyu Xu [ 06/May/13 ]

If possible, would you try this patch?

http://review.whamcloud.com/6270

Commit message:
LU-3281 osc: don't hold a spinlock before waking up process

ptlrpcd_add_req() will wake up another process; do not hold a spinlock
before calling ptlrpcd_queue_work()->ptlrpcd_add_req().
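
For illustration, here is a minimal C sketch of the locking pattern the patch addresses (all names are hypothetical, not the actual Lustre code): a callee that may sleep or wake another process must not run while a spinlock is held, or the kernel reports "BUG: scheduling while atomic" as in the console log above.

#include <linux/spinlock.h>

struct item { int state; };

static DEFINE_SPINLOCK(obj_lock);	/* hypothetical lock */

void queue_and_wake(struct item *it);	/* hypothetical: may sleep or wake */

static void buggy(struct item *it)
{
	spin_lock(&obj_lock);
	it->state = 1;
	queue_and_wake(it);		/* unsafe: still atomic here */
	spin_unlock(&obj_lock);
}

static void fixed(struct item *it)
{
	spin_lock(&obj_lock);
	it->state = 1;
	spin_unlock(&obj_lock);		/* drop the lock first... */
	queue_and_wake(it);		/* ...then sleeping/waking is safe */
}
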
Comment by Keith Mannthey (Inactive) [ 06/May/13 ]

Another client, iwc21, was able to dump a stack into the kernel logs. This looks very relevant.

Comment by Keith Mannthey (Inactive) [ 06/May/13 ]

This is the long read stack seen on client 21:

2013-05-05 14:07:48 ------------[ cut here ]------------
2013-05-05 14:07:48 WARNING: at kernel/sched_fair.c:1846 hrtick_start_fair+0x18b/0x190() (Not tainted)
2013-05-05 14:07:48 Hardware name: PowerEdge C6220
2013-05-05 14:07:48 Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipmi_devintf acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core ext4 mbcache jbd2 dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm sg sd_mod crc_t10dif wmi dcdbas sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support shpchp ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc mlx4_en mlx4_core igb dca ptp pps_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: cpufreq_ondemand]
2013-05-05 14:07:49 Pid: 10797, comm: ior Not tainted 2.6.32-358.2.1.el6.x86_64 #1
2013-05-05 14:07:49 Call Trace:
2013-05-05 14:07:49  [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0
2013-05-05 14:07:49 BUG: unable to handle kernel NULL pointer dereference at 0000000000000a08
2013-05-05 14:07:49 IP: [<ffffffff8100f4dd>] print_context_stack+0xad/0x140
2013-05-05 14:07:49 PGD c3255b067 PUD fe2643067 PMD 0 
2013-05-05 14:07:49 Thread overran stack, or stack corrupted
2013-05-05 14:07:49 Oops: 0000 [#1] SMP 
2013-05-05 14:07:49 last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0/infiniband/mlx4_0/ports/1/pkeys/127
2013-05-05 14:07:49 CPU 28 
2013-05-05 14:07:49 Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipmi_devintf acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core ext4 mbcache jbd2 dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm sg sd_mod crc_t10dif wmi dcdbas sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support shpchp ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc mlx4_en mlx4_core igb dca ptp pps_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: cpufreq_ondemand]
2013-05-05 14:07:49 
2013-05-05 14:07:49 Pid: 10797, comm: ior Not tainted 2.6.32-358.2.1.el6.x86_64 #1 Dell Inc. PowerEdge C6220/0W6W6G
2013-05-05 14:07:49 RIP: 0010:[<ffffffff8100f4dd>]  [<ffffffff8100f4dd>] print_context_stack+0xad/0x140
2013-05-05 14:07:49 RSP: 0018:ffff8808725282d8  EFLAGS: 00010006
2013-05-05 14:07:49 RAX: 0000000000000020 RBX: ffff880872528460 RCX: 00000000000015d3
2013-05-05 14:07:49 RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
2013-05-05 14:07:49 RBP: ffff880872528338 R08: 0000000000000000 R09: ffffffff8163fde0
2013-05-05 14:07:49 R10: 0000000000000001 R11: 0000000000000000 R12: ffff8808725283d8
2013-05-05 14:07:49 R13: ffff880872528000 R14: ffffffff81600460 R15: 0000000000000000
2013-05-05 14:07:49 FS:  00002aaaafebf8c0(0000) GS:ffff88089c580000(0000) knlGS:0000000000000000
2013-05-05 14:07:49 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2013-05-05 14:07:49 CR2: 0000000000000a08 CR3: 0000000dfedb4000 CR4: 00000000000407e0
2013-05-05 14:07:49 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2013-05-05 14:07:49 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2013-05-05 14:07:49 Process ior (pid: 10797, threadinfo ffff880872528000, task ffff88085d734080)
2013-05-05 14:07:49 Stack:
2013-05-05 14:07:49  ffff880872528378 ffff880872529ff8 ffff8808725283d8 ffffffffffffe000
2013-05-05 14:07:49 <d> ffffffff817c2fa0 ffffffff8106e2e7 ffff880872528348 ffff880872528440
2013-05-05 14:07:49 <d> 000000000000cbe0 ffffffff81600460 ffffffff817c2fa0 ffff88089c583fc0
2013-05-05 14:07:49 Call Trace:
2013-05-05 14:07:49  [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0
2013-05-05 14:07:49  [<ffffffff8100e4a0>] dump_trace+0x190/0x3b0
2013-05-05 14:07:49  [<ffffffff810575eb>] ? hrtick_start_fair+0x18b/0x190
2013-05-05 14:07:49  [<ffffffff8100f245>] show_trace_log_lvl+0x55/0x70
2013-05-05 14:07:49  [<ffffffff8100f275>] show_trace+0x15/0x20
2013-05-05 14:07:49  [<ffffffff8150d19a>] dump_stack+0x6f/0x76
2013-05-05 14:07:49  [<ffffffff810b2dfa>] ? print_modules+0x5a/0xf0
2013-05-05 14:07:49  [<ffffffff8106e2e7>] warn_slowpath_common+0x87/0xc0
2013-05-05 14:07:49  [<ffffffff8106e33a>] warn_slowpath_null+0x1a/0x20
2013-05-05 14:07:49  [<ffffffff810575eb>] hrtick_start_fair+0x18b/0x190
2013-05-05 14:07:49  [<ffffffff81064a00>] pick_next_task_fair+0xd0/0x130
2013-05-05 14:07:49  [<ffffffff8150d6d8>] schedule+0x178/0x3b2
2013-05-05 14:07:49  [<ffffffff81064d6a>] __cond_resched+0x2a/0x40
2013-05-05 14:07:49  [<ffffffff8150e320>] _cond_resched+0x30/0x40
2013-05-05 14:07:50  [<ffffffffa0708df5>] cl_page_free+0xd5/0x5b0 [obdclass]
2013-05-05 14:07:50  [<ffffffffa07094bf>] cl_page_put+0x1ef/0x4c0 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0707628>] ? cl_page_disown0+0x88/0x180 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0708502>] ? cl_page_disown+0xa2/0x120 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0a0c7e7>] discard_pagevec+0x47/0x110 [osc]
2013-05-05 14:07:50  [<ffffffffa0a0d2ac>] osc_lru_shrink+0x9fc/0xe40 [osc]
2013-05-05 14:07:50  [<ffffffffa0a0e346>] osc_lru_del+0x3c6/0x560 [osc]
2013-05-05 14:07:50  [<ffffffffa057ed84>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
2013-05-05 14:07:50  [<ffffffffa0a0e9c4>] osc_page_delete+0xe4/0x320 [osc]
2013-05-05 14:07:50  [<ffffffffa0709eb5>] cl_page_delete0+0xc5/0x4e0 [obdclass]
2013-05-05 14:07:50  [<ffffffffa070a312>] cl_page_delete+0x42/0x120 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0b6674b>] ll_releasepage+0x12b/0x1a0 [lustre]
2013-05-05 14:07:50  [<ffffffff81172160>] ? mem_cgroup_uncharge_cache_page+0x10/0x20
2013-05-05 14:07:50  [<ffffffff811190a0>] try_to_release_page+0x30/0x60
2013-05-05 14:07:50  [<ffffffff81132827>] shrink_page_list.clone.3+0x517/0x650
2013-05-05 14:07:50  [<ffffffff8117086b>] ? mem_cgroup_lru_del_list+0x2b/0xb0
2013-05-05 14:07:50  [<ffffffff81132bf7>] ? isolate_lru_pages.clone.0+0xd7/0x170
2013-05-05 14:07:50  [<ffffffff81133343>] shrink_inactive_list+0x343/0x830
2013-05-05 14:07:50  [<ffffffff81133bde>] shrink_mem_cgroup_zone+0x3ae/0x610
2013-05-05 14:07:50  [<ffffffff8117257d>] ? mem_cgroup_iter+0xfd/0x280
2013-05-05 14:07:50  [<ffffffff81133ea3>] shrink_zone+0x63/0xb0
2013-05-05 14:07:50  [<ffffffff81134aa9>] zone_reclaim+0x349/0x400
2013-05-05 14:07:50  [<ffffffffa0703279>] ? cl_env_hops_keycmp+0x19/0x70 [obdclass]
2013-05-05 14:07:50  [<ffffffff8112a62c>] get_page_from_freelist+0x69c/0x830
2013-05-05 14:07:50  [<ffffffff8112bbc3>] __alloc_pages_nodemask+0x113/0x8d0
2013-05-05 14:07:50  [<ffffffff810e6dd5>] ? call_rcu_sched+0x15/0x20
2013-05-05 14:07:50  [<ffffffffa057ed84>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
2013-05-05 14:07:50  [<ffffffff81166ab2>] kmem_getpages+0x62/0x170
2013-05-05 14:07:50  [<ffffffff8116711f>] cache_grow+0x2cf/0x320
2013-05-05 14:07:50  [<ffffffff81167372>] cache_alloc_refill+0x202/0x240
2013-05-05 14:07:50  [<ffffffffa056ab60>] ? cfs_alloc+0x30/0x60 [libcfs]
2013-05-05 14:07:50  [<ffffffff81168239>] __kmalloc+0x1a9/0x220
2013-05-05 14:07:50  [<ffffffffa056ab60>] cfs_alloc+0x30/0x60 [libcfs]
2013-05-05 14:07:50  [<ffffffffa088ba00>] null_alloc_reqbuf+0x190/0x420 [ptlrpc]
2013-05-05 14:07:50  [<ffffffffa087b5c9>] sptlrpc_cli_alloc_reqbuf+0x69/0x220 [ptlrpc]
2013-05-05 14:07:50  [<ffffffffa084f891>] lustre_pack_request+0x81/0x180 [ptlrpc]
2013-05-05 14:07:50  [<ffffffffa083d085>] __ptlrpc_request_bufs_pack+0xe5/0x3b0 [ptlrpc]
2013-05-05 14:07:50  [<ffffffffa083d3ac>] ptlrpc_request_bufs_pack+0x5c/0x80 [ptlrpc]
2013-05-05 14:07:50  [<ffffffffa083d3f4>] ptlrpc_request_pack+0x24/0x70 [ptlrpc]
2013-05-05 14:07:50  [<ffffffffa09fe6e5>] osc_brw_prep_request+0x155/0x1140 [osc]
2013-05-05 14:07:50  [<ffffffffa0a12cec>] ? osc_req_attr_set+0x16c/0x5b0 [osc]
2013-05-05 14:07:50  [<ffffffffa0712aa1>] ? cl_req_attr_set+0xd1/0x230 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0a0491b>] osc_build_rpc+0x86b/0x1730 [osc]
2013-05-05 14:07:50  [<ffffffffa0a1c003>] osc_send_read_rpc+0x6a3/0x880 [osc]
2013-05-05 14:07:50  [<ffffffffa057f522>] ? cfs_hash_bd_from_key+0x42/0xd0 [libcfs]
2013-05-05 14:07:50  [<ffffffffa06fcabc>] ? lu_object_put+0x12c/0x330 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0a1fb76>] osc_io_unplug0+0xb46/0x12c0 [osc]
2013-05-05 14:07:50  [<ffffffffa070a83c>] ? cl_page_find0+0x44c/0x810 [obdclass]
2013-05-05 14:07:50  [<ffffffff8112bbc3>] ? __alloc_pages_nodemask+0x113/0x8d0
2013-05-05 14:07:50  [<ffffffffa0a22091>] osc_io_unplug+0x11/0x20 [osc]
2013-05-05 14:07:50  [<ffffffffa0a28c80>] osc_queue_sync_pages+0x1d0/0x360 [osc]
2013-05-05 14:07:50  [<ffffffffa0a137e0>] osc_io_submit+0x330/0x4a0 [osc]
2013-05-05 14:07:50  [<ffffffffa071240c>] cl_io_submit_rw+0x6c/0x160 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0aa904a>] lov_io_submit+0x4ea/0xbc0 [lov]
2013-05-05 14:07:50  [<ffffffffa071240c>] cl_io_submit_rw+0x6c/0x160 [obdclass]
2013-05-05 14:07:50  [<ffffffffa0714a2e>] cl_io_read_page+0xae/0x170 [obdclass]
2013-05-05 14:07:50  [<ffffffffa07087f7>] ? cl_page_assume+0xf7/0x220 [obdclass]
2013-05-05 14:07:51  [<ffffffffa0b4e196>] ll_readpage+0x96/0x1a0 [lustre]
2013-05-05 14:07:51  [<ffffffff8111b7ac>] generic_file_aio_read+0x1fc/0x700
2013-05-05 14:07:51  [<ffffffffa0b7b297>] vvp_io_read_start+0x257/0x470 [lustre]
2013-05-05 14:07:51  [<ffffffffa071256a>] cl_io_start+0x6a/0x140 [obdclass]
2013-05-05 14:07:51  [<ffffffffa0716ca4>] cl_io_loop+0xb4/0x1b0 [obdclass]
2013-05-05 14:07:51  [<ffffffffa0b22fcf>] ll_file_io_generic+0x33f/0x600 [lustre]
2013-05-05 14:07:51  [<ffffffffa0b233cf>] ll_file_aio_read+0x13f/0x2c0 [lustre]
2013-05-05 14:07:51  [<ffffffffa0b23c6c>] ll_file_read+0x16c/0x2a0 [lustre]
2013-05-05 14:07:51  [<ffffffff811817a5>] vfs_read+0xb5/0x1a0
2013-05-05 14:07:51  [<ffffffff811818e1>] sys_read+0x51/0x90
2013-05-05 14:07:51  [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
2013-05-05 14:07:51 Code: 38 08 00 85 c0 74 2c 49 8d 44 24 08 48 39 c3 74 7d 31 d2 48 8b 75 c8 48 8b 7d c0 41 ff 56 10 48 81 7d c8 41 ae 00 81 49 8b 45 00 <8b> 90 e8 09 00 00 74 1b 48 83 c3 08 4d 85 ff 75 92 4c 39 eb 76 
2013-05-05 14:07:51 RIP  [<ffffffff8100f4dd>] print_context_stack+0xad/0x140
2013-05-05 14:07:51  RSP <ffff8808725282d8>
2013-05-05 14:07:51 CR2: 0000000000000a08

I thought one of the goals of the last patch was to move to osc_io_unplug_async? In general, it seems that calling osc_io_unplug from an arbitrary context can be unsafe for the stack, as memory pressure can really lengthen the call chain; a rough sketch of the async hand-off idea follows.
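
For context, a hedged sketch of the async hand-off idea (hypothetical names throughout, not the actual osc_io_unplug_async implementation): queueing the RPC build to a worker thread lets it run on a fresh kernel stack instead of on top of an already-deep reclaim call chain.

#include <linux/workqueue.h>
#include <linux/slab.h>

struct unplug_work {
	struct work_struct work;
	void *obj;			/* hypothetical: object to flush */
};

void build_and_send_rpc(void *obj);	/* hypothetical heavy function */

static void unplug_worker(struct work_struct *w)
{
	struct unplug_work *uw = container_of(w, struct unplug_work, work);

	build_and_send_rpc(uw->obj);	/* runs on the worker's own stack */
	kfree(uw);
}

static int unplug_async(void *obj)
{
	struct unplug_work *uw = kmalloc(sizeof(*uw), GFP_NOFS);

	if (uw == NULL)
		return -ENOMEM;
	uw->obj = obj;
	INIT_WORK(&uw->work, unplug_worker);
	schedule_work(&uw->work);	/* defer; caller's stack stays shallow */
	return 0;
}
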

Comment by Andreas Dilger [ 06/May/13 ]

Has anyone actually looked at the stack usage of these functions? It is entirely possible that there is some stack-hogging function that could be cleaned up. Please run "make checkstack" and look at all of the functions in these stacks to see which are the biggest stack users.

Comment by Keith Mannthey (Inactive) [ 06/May/13 ]

I took a quick pass at extracting the stack-usage numbers.

checkstack and the first version of my script don't resolve all of the calls in the stack, but this is what I got as a first pass.

There are a few heavy functions in this code path (numbers are per-function stack usage in bytes; see the sketch after the list for the usual remedy).

360 osc:osc_send_read_rpc
296 osc:osc_build_rpc
216 osc:osc_io_unplug0
184 osc:osc_lru_shrink
168 osc:osc_brw_prep_request
144 obdclass:cl_page_find0
104 osc:osc_io_submit
96 lustre:vvp_io_read_start
96 lustre:ll_file_read
88 lov:lov_io_submit
80 ptlrpc:__ptlrpc_request_bufs_pack
80 obdclass:cl_page_put
72 osc:osc_lru_del
72 lustre:ll_file_io_generic
64 ptlrpc:ptlrpc_request_bufs_pack
64 lustre:ll_releasepage
56 osc:osc_queue_sync_pages
56 obdclass:lu_object_put
48 osc:osc_req_attr_set
48 lustre:ll_readpage
40 ptlrpc:lustre_pack_request
40 obdclass:lprocfs_alloc_stats
40 obdclass:cl_req_attr_set
40 lustre:ll_file_aio_read
32 ptlrpc:sptlrpc_cli_alloc_reqbuf
32 ptlrpc:ptlrpc_request_pack
32 ptlrpc:null_alloc_reqbuf
32 osc:osc_page_delete
32 obdclass:lprocfs_alloc_obd_stats
32 obdclass:lprocfs_alloc_md_stats
32 libcfs:cfs_hash_dual_bd_unlock
32 libcfs:cfs_hash_bd_from_key
32 libcfs:cfs_alloc
24 osc:discard_pagevec
24 obdclass:cl_page_assume
24 obdclass:cl_io_submit_rw
24 obdclass:cl_io_read_page
16 obdclass:cl_page_delete
8 obdclass:cl_page_free
8 obdclass:cl_page_delete0
8 obdclass:cl_io_start
8 obdclass:cl_env_hops_keycmp
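
For reference, the usual remedy for the heaviest entries above is to move large locals off the stack. A generic, hedged sketch (hypothetical names, not a patch from this ticket):

#include <linux/slab.h>

#define NSLOTS 128

int heavy_on_stack(void)
{
	void *slots[NSLOTS];		/* 1 KB of stack on x86_64 */

	/* ... fill and use slots ... */
	return 0;
}

int light_on_stack(void)
{
	void **slots = kcalloc(NSLOTS, sizeof(*slots), GFP_NOFS);
	int rc = 0;

	if (slots == NULL)
		return -ENOMEM;
	/* ... fill and use slots ... */
	kfree(slots);
	return rc;
}
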
Comment by Zhenyu Xu [ 07/May/13 ]

Is iwc21 still using the old client image (without the http://review.whamcloud.com/#change,6191 patch)? Its stack seems to indicate that.

And iwc4 shows another long call path (memory pressure leading to a long call chain).

Extract from the iwc4 log:
2013-05-05 13:05:45 WARNING: at kernel/sched_fair.c:1846 hrtick_start_fair+0x18b/0x190() (Not tainted)
2013-05-05 13:05:45 Hardware name: PowerEdge C6220
2013-05-05 13:05:45 Modules linked in: lmv(U) mgc(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ko2iblnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipmi_devintf acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr mlx4_ib ib_sa ib_mad iw_cxgb4 iw_cxgb3 ib_core ext4 mbcache jbd2 dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm sg sd_mod crc_t10dif wmi dcdbas sb_edac edac_core i2c_i801 i2c_core ahci iTCO_wdt iTCO_vendor_support shpchp ioatdma nfs lockd fscache auth_rpcgss nfs_acl sunrpc mlx4_en mlx4_core igb dca ptp pps_core be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: cpufreq_ondemand]
2013-05-05 13:05:45 Pid: 7442, comm: ior Not tainted 2.6.32-358.2.1.el6.x86_64 #1
2013-05-05 13:05:45 Call Trace:
2013-05-05 13:05:45  [<ffffffff8106e2e7>] ? warn_slowpath_common+0x87/0xc0
2013-05-05 13:05:45  [<ffffffff8106e33a>] ? warn_slowpath_null+0x1a/0x20
2013-05-05 13:05:45  [<ffffffff810575eb>] ? hrtick_start_fair+0x18b/0x190
2013-05-05 13:05:45  [<ffffffff81064a00>] ? pick_next_task_fair+0xd0/0x130
2013-05-05 13:05:45  [<ffffffff8150d6d8>] ? schedule+0x178/0x3b2
2013-05-05 13:05:45  [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
2013-05-05 13:05:45  [<ffffffff81064d6a>] ? __cond_resched+0x2a/0x40
2013-05-05 13:05:45  [<ffffffff8150e320>] ? _cond_resched+0x30/0x40
2013-05-05 13:05:45  [<ffffffffa0712df5>] ? cl_page_free+0xd5/0x5b0 [obdclass]
2013-05-05 13:05:45  [<ffffffffa07134bf>] ? cl_page_put+0x1ef/0x4c0 [obdclass]
2013-05-05 13:05:45  [<ffffffff8112c74f>] ? free_hot_page+0x2f/0x60
2013-05-05 13:05:45  [<ffffffff8112f36e>] ? __put_single_page+0x1e/0x30
2013-05-05 13:05:45  [<ffffffffa0a9beca>] ? lov_page_fini+0x6a/0x170 [lov]
2013-05-05 13:05:45  [<ffffffffa0712e2b>] ? cl_page_free+0x10b/0x5b0 [obdclass]
2013-05-05 13:05:45  [<ffffffffa07134bf>] ? cl_page_put+0x1ef/0x4c0 [obdclass]
2013-05-05 13:05:45  [<ffffffffa0711628>] ? cl_page_disown0+0x88/0x180 [obdclass]
2013-05-05 13:05:45  [<ffffffffa0712502>] ? cl_page_disown+0xa2/0x120 [obdclass]
2013-05-05 13:05:45  [<ffffffffa0a087e7>] ? discard_pagevec+0x47/0x110 [osc]
2013-05-05 13:05:45  [<ffffffffa0a092ac>] ? osc_lru_shrink+0x9fc/0xe40 [osc]
2013-05-05 13:05:45  [<ffffffffa0a0a346>] ? osc_lru_del+0x3c6/0x560 [osc]
2013-05-05 13:05:45  [<ffffffffa0588d84>] ? cfs_hash_dual_bd_unlock+0x34/0x60 [libcfs]
2013-05-05 13:05:45  [<ffffffffa0a0a9c4>] ? osc_page_delete+0xe4/0x320 [osc]
2013-05-05 13:05:45  [<ffffffffa0713eb5>] ? cl_page_delete0+0xc5/0x4e0 [obdclass]
2013-05-05 13:05:45  [<ffffffffa0714312>] ? cl_page_delete+0x42/0x120 [obdclass]
2013-05-05 13:05:45  [<ffffffffa0b6274b>] ? ll_releasepage+0x12b/0x1a0 [lustre]
2013-05-05 13:05:45  [<ffffffff811190a0>] ? try_to_release_page+0x30/0x60
2013-05-05 13:05:45  [<ffffffff81132827>] ? shrink_page_list.clone.3+0x517/0x650
2013-05-05 13:05:45  [<ffffffff8117086b>] ? mem_cgroup_lru_del_list+0x2b/0xb0
2013-05-05 13:05:45  [<ffffffff81132bf7>] ? isolate_lru_pages.clone.0+0xd7/0x170
2013-05-05 13:05:45  [<ffffffff81133343>] ? shrink_inactive_list+0x343/0x830
2013-05-05 13:05:45  [<ffffffff8112d04a>] ? determine_dirtyable_memory+0x1a/0x30
2013-05-05 13:05:45  [<ffffffff8112d0f7>] ? get_dirty_limits+0x27/0x2f0
2013-05-05 13:05:45  [<ffffffff81133bde>] ? shrink_mem_cgroup_zone+0x3ae/0x610
2013-05-05 13:05:45  [<ffffffff8117257d>] ? mem_cgroup_iter+0xfd/0x280
2013-05-05 13:05:45  [<ffffffff81133ea3>] ? shrink_zone+0x63/0xb0
2013-05-05 13:05:46  [<ffffffff81134aa9>] ? zone_reclaim+0x349/0x400
2013-05-05 13:05:46  [<ffffffff8112a62c>] ? get_page_from_freelist+0x69c/0x830
2013-05-05 13:05:46  [<ffffffffa0710005>] ? cl_page_slice_add+0x55/0x140 [obdclass]
2013-05-05 13:05:46  [<ffffffff8112bbc3>] ? __alloc_pages_nodemask+0x113/0x8d0
2013-05-05 13:05:46  [<ffffffff8116095a>] ? alloc_pages_current+0xaa/0x110
2013-05-05 13:05:46  [<ffffffff8111a1d7>] ? __page_cache_alloc+0x87/0x90
2013-05-05 13:05:46  [<ffffffff81119bbe>] ? find_get_page+0x1e/0xa0
2013-05-05 13:05:46  [<ffffffff8111b560>] ? grab_cache_page_nowait+0x70/0xc0
2013-05-05 13:05:46  [<ffffffffa0b4afeb>] ? ll_readahead+0x8bb/0x10f0 [lustre]
2013-05-05 13:05:46  [<ffffffffa0b75af5>] ? vvp_io_read_page+0x305/0x340 [lustre]
2013-05-05 13:05:46  [<ffffffffa071ea0d>] ? cl_io_read_page+0x8d/0x170 [obdclass]
2013-05-05 13:05:46  [<ffffffffa07127f7>] ? cl_page_assume+0xf7/0x220 [obdclass]
2013-05-05 13:05:46  [<ffffffffa0b4a196>] ? ll_readpage+0x96/0x1a0 [lustre]
2013-05-05 13:05:46  [<ffffffff81096ce0>] ? wake_bit_function+0x0/0x50
2013-05-05 13:05:46  [<ffffffff8111b7ac>] ? generic_file_aio_read+0x1fc/0x700
2013-05-05 13:05:46  [<ffffffffa0b77297>] ? vvp_io_read_start+0x257/0x470 [lustre]
2013-05-05 13:05:46  [<ffffffffa071c56a>] ? cl_io_start+0x6a/0x140 [obdclass]
2013-05-05 13:05:46  [<ffffffffa0720ca4>] ? cl_io_loop+0xb4/0x1b0 [obdclass]
2013-05-05 13:05:46  [<ffffffffa0b1efcf>] ? ll_file_io_generic+0x33f/0x600 [lustre]
2013-05-05 13:05:46  [<ffffffffa0b1f3cf>] ? ll_file_aio_read+0x13f/0x2c0 [lustre]
2013-05-05 13:05:46  [<ffffffffa0b1fc6c>] ? ll_file_read+0x16c/0x2a0 [lustre]
2013-05-05 13:05:46  [<ffffffff811817a5>] ? vfs_read+0xb5/0x1a0
2013-05-05 13:05:46  [<ffffffff811818e1>] ? sys_read+0x51/0x90
2013-05-05 13:05:46  [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
2013-05-05 13:05:46 ---[ end trace a96e3bc7fcd2c0a4 ]---
Comment by Keith Mannthey (Inactive) [ 07/May/13 ]

OK, I ran v3 of the patch through checkstack:

Overall it is a good reduction. Note: I believe gcc is handling some of the static symbols differently, which explains why osc_send_read_rpc has disappeared from the list (see the illustration after the list).

376 osc:osc_io_unplug0
200 osc:osc_build_rpc
184 osc:osc_lru_shrink
168 osc:osc_brw_prep_request
144 obdclass:cl_page_find0
104 osc:osc_io_submit
96 lustre:vvp_io_read_start
96 lustre:ll_file_read
88 lov:lov_io_submit
80 ptlrpc:__ptlrpc_request_bufs_pack
80 obdclass:cl_page_put
72 osc:osc_lru_del
72 lustre:ll_file_io_generic
64 ptlrpc:ptlrpc_request_bufs_pack
64 lustre:ll_releasepage
56 osc:osc_queue_sync_pages
56 obdclass:lu_object_put
48 osc:osc_req_attr_set
48 lustre:ll_readpage
40 ptlrpc:lustre_pack_request
40 obdclass:lprocfs_alloc_stats
40 obdclass:cl_req_attr_set
40 lustre:ll_file_aio_read
32 ptlrpc:sptlrpc_cli_alloc_reqbuf
32 ptlrpc:ptlrpc_request_pack
32 ptlrpc:null_alloc_reqbuf
32 osc:osc_page_delete
32 obdclass:lprocfs_alloc_obd_stats
32 obdclass:lprocfs_alloc_md_stats
32 libcfs:cfs_hash_dual_bd_unlock
32 libcfs:cfs_hash_bd_from_key
32 libcfs:cfs_alloc
24 osc:discard_pagevec
24 obdclass:cl_page_assume
24 obdclass:cl_io_submit_rw
24 obdclass:cl_io_read_page
16 obdclass:cl_page_delete
8 obdclass:cl_page_free
8 obdclass:cl_page_delete0
8 obdclass:cl_io_start
8 obdclass:cl_env_hops_keycmp
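
To illustrate the gcc behavior mentioned above (hypothetical code, not from the patch): when a static function is inlined, checkstack attributes its frame to the caller, so the helper disappears from the list while the caller's number grows, which would be consistent with osc_send_read_rpc vanishing while osc_io_unplug0 grew from 216 to 376 bytes.

int fill(char *buf);			/* hypothetical user of the buffer */

static int helper(void)			/* static: gcc is free to inline it */
{
	char buf[256];			/* 256 bytes of stack */

	return fill(buf);
}

int caller(void)
{
	/* if helper() is inlined, checkstack reports its 256 bytes here */
	return helper();
}
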
Comment by Jodi Levi (Inactive) [ 08/May/13 ]

Closing ticket as the patch landed to master. Please let me know if more work is needed and I will reopen.

Comment by Gerrit Updater [ 30/Sep/15 ]

Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/16692
Subject: LU-3281 obdclass: remove structure holes to reduce memory
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d4bee9e4e76e3e7e9701595f6f45b7e27cff3a78

Comment by Gerrit Updater [ 07/Oct/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16692/
Subject: LU-3281 obdclass: remove structure holes to reduce memory
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 790ca7b3c47d27639390bd7a2b9693182d3f24d6
