Details
- Bug
- Resolution: Cannot Reproduce
- Blocker
- None
- Lustre 2.2.0
- None
- autotest
  build: http://build.whamcloud.com/job/lustre-reviews/3282/
- 3
- 6520
Description
While automating NFS over Lustre testing, I hit this assertion failure on the MDS when running compilebench:
https://maloo.whamcloud.com/test_sets/7e7f7290-103e-11e1-8338-52540025f9af
21:11:22:Lustre: DEBUG MARKER: == sanityn test compilebench: compilebench == 21:11:16 (1321420276)
21:11:22:Lustre: DEBUG MARKER: ./compilebench -D /mnt/lustre/d0.compilebench -i 2 -r 2 --makej
21:34:43:LustreError: 2805:0:(lvfs_lib.c:94:lprocfs_counter_add()) ASSERTION(!cfs_in_interrupt()) failed
21:34:43:BUG: unable to handle kernel paging request at ffffffff894c99e0
21:34:43:IP: [<ffffffff8104fc84>] update_curr+0x134/0x1e0
21:34:43:PGD 1a27067 PUD 1a2b063 PMD 0
21:34:43:Thread overran stack, or stack corrupted
21:34:43:Oops: 0000 [#1] SMP
21:34:43:last sysfs file: /sys/module/lockd/initstate
21:34:43:CPU 0
21:34:43:Modules linked in: nfsd lockd nfs_acl auth_rpcgss lmv(U) cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) exportfs mgs(U) mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) jbd2 autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core dm_mirror dm_region_hash dm_log virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk ata_generic pata_acpi ata_piix virtio_pci virtio_ring virtio dm_mod [last unloaded: speedstep_lib]
21:34:43:
21:34:44:Modules linked in: nfsd lockd nfs_acl auth_rpcgss lmv(U) cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) exportfs mgs(U) mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) jbd2 autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core dm_mirror dm_region_hash dm_log virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk ata_generic pata_acpi ata_piix virtio_pci virtio_ring virtio dm_mod [last unloaded: speedstep_lib]
21:34:44:Pid: 2805, comm: nfsd Not tainted 2.6.32-131.6.1.el6_lustre.gc202086.x86_64 #1 KVM
21:34:44:RIP: 0010:[<ffffffff8104fc84>] [<ffffffff8104fc84>] update_curr+0x134/0x1e0
21:34:44:RSP: 0018:ffff880002003c38 EFLAGS: 00010086
21:34:44:RAX: ffff8800402ecb40 RBX: 0000000000f26a00 RCX: ffff88007faa60c0
21:34:44:RDX: 0000000000018ac8 RSI: ffff88007819f4f8 RDI: ffff8800402ecb78
21:34:44:RBP: ffff880002003c68 R08: 0000000000000000 R09: 0000000000000001
21:34:44:R10: 0000000000000000 R11: 0000000000000000 R12: ffff880002015fe8
21:34:44:R13: 00000000000f4095 R14: 0000000000000001 R15: 0000000000000000
21:34:44:FS: 0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
21:34:44:CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
21:34:44:CR2: ffffffff894c99e0 CR3: 0000000072e5a000 CR4: 00000000000006f0
21:34:44:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
21:34:44:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
21:34:44:Process nfsd (pid: 2805, threadinfo ffff880040330000, task ffff8800402ecb40)
21:34:44:Stack:
21:34:44: 0000000000000000 ffff880002016950 ffff880037d37578 ffff880002015fe8
21:34:44:<0> 0000000000000003 0000000000000000 ffff880002003ca8 ffffffff81050499
21:34:44:<0> ffff880002003ca8 ffff880037d37578 ffff880002015f80 0000000000000000
21:34:44:Call Trace:
21:34:44: <IRQ>
21:34:44: [<ffffffff81050499>] enqueue_entity+0x39/0x340
21:34:44: [<ffffffff81050be3>] enqueue_task_fair+0x43/0x90
21:34:44: [<ffffffff81057389>] enqueue_task+0x79/0x90
21:34:44: [<ffffffff810573ce>] activate_task+0x2e/0x40
21:34:44: [<ffffffff8105dad3>] try_to_wake_up+0x273/0x400
21:34:45: [<ffffffff8105dc72>] default_wake_function+0x12/0x20
21:34:45: [<ffffffff8108e176>] autoremove_wake_function+0x16/0x40
21:34:45: [<ffffffff8141cac3>] ? __napi_complete+0x23/0x40
21:34:45: [<ffffffff8104af29>] __wake_up_common+0x59/0x90
21:34:45: [<ffffffff8104f838>] __wake_up+0x48/0x70
21:34:45: [<ffffffff8109dff0>] ? tick_sched_timer+0x0/0xc0
21:34:45: [<ffffffff81067ac7>] printk_tick+0x47/0x50
21:34:45: [<ffffffff810798fd>] update_process_times+0x4d/0x70
21:34:45: [<ffffffff8109e056>] tick_sched_timer+0x66/0xc0
21:34:45: [<ffffffff8109289e>] __run_hrtimer+0x8e/0x1a0
21:34:45: [<ffffffff81036259>] ? kvm_clock_get_cycles+0x9/0x10
21:34:45: [<ffffffff81092c46>] hrtimer_interrupt+0xe6/0x250
21:34:45: [<ffffffff814e3f3b>] smp_apic_timer_interrupt+0x6b/0x9b
21:34:45: [<ffffffff8100bc93>] apic_timer_interrupt+0x13/0x20
21:34:45: <EOI>
21:34:45: [<ffffffff814de457>] ? _spin_unlock_irqrestore+0x17/0x20
21:34:45: [<ffffffffa036a274>] cfs_trace_unlock_tcd+0x44/0x80 [libcfs]
21:34:45: [<ffffffffa037579d>] libcfs_debug_vmsg2+0x5dd/0xb50 [libcfs]
21:34:45: [<ffffffffa0375691>] ? libcfs_debug_vmsg2+0x4d1/0xb50 [libcfs]
21:34:46: [<ffffffffa04826c9>] ? cl_env_hops_keycmp+0x19/0x70 [obdclass]
21:34:46: [<ffffffffa03795c2>] ? cfs_hash_bd_add_locked+0x62/0x90 [libcfs]
21:34:46: [<ffffffffa0375d69>] libcfs_assertion_failed+0x59/0x70 [libcfs]
21:34:46: [<ffffffffa03d89c5>] lprocfs_counter_add+0x165/0x196 [lvfs]
21:34:46: [<ffffffffa0558537>] ldlm_pool_shrink+0x57/0xf0 [ptlrpc]
21:34:46: [<ffffffff81093e7f>] ? up+0x2f/0x50
21:34:46: [<ffffffffa055931b>] ldlm_pools_shrink+0x27b/0x320 [ptlrpc]
21:34:46: [<ffffffffa05593f3>] ldlm_pools_srv_shrink+0x13/0x20 [ptlrpc]
21:34:46: [<ffffffff81125a6a>] shrink_slab+0x13a/0x1a0
21:34:46: [<ffffffff81127dfb>] do_try_to_free_pages+0x2fb/0x520
21:34:46: [<ffffffff8112820f>] try_to_free_pages+0x9f/0x130
21:34:46: [<ffffffff81129350>] ? isolate_pages_global+0x0/0x380
21:34:46: [<ffffffff8111fe6d>] __alloc_pages_nodemask+0x40d/0x8b0
21:34:46: [<ffffffff81103c00>] ? perf_event_context_sched_out+0x260/0x290
21:34:46: [<ffffffff81159ac2>] kmem_getpages+0x62/0x170
21:34:46: [<ffffffff8115a6da>] fallback_alloc+0x1ba/0x270
21:34:46: [<ffffffff8115a12f>] ? cache_grow+0x2cf/0x320
21:34:46: [<ffffffff8115a459>] ____cache_alloc_node+0x99/0x160
21:34:46: [<ffffffffa036ba13>] ? cfs_alloc+0x63/0x90 [libcfs]
21:34:47: [<ffffffff8115b069>] __kmalloc+0x199/0x230
21:34:47: [<ffffffffa036ba13>] cfs_alloc+0x63/0x90 [libcfs]
21:34:47: [<ffffffffa056379a>] ptlrpc_prep_bulk_imp+0x7a/0x350 [ptlrpc]
21:34:47: [<ffffffffa057165c>] ? lustre_msg_set_timeout+0x9c/0x110 [ptlrpc]
21:34:47: [<ffffffffa06404df>] osc_brw_prep_request+0x88f/0x1040 [osc]
21:34:47: [<ffffffffa06559eb>] ? osc_req_attr_set+0xfb/0x2a0 [osc]
21:34:47: [<ffffffffa07b3238>] ? ccc_req_attr_set+0x78/0x150 [lustre]
21:34:47: [<ffffffffa04912d4>] ? cl_req_prep+0x84/0x190 [obdclass]
21:34:47: [<ffffffffa0641ca5>] osc_send_oap_rpc+0x1015/0x1be0 [osc]
21:34:47: [<ffffffffa0633f71>] ? osc_consume_write_grant+0x81/0x160 [osc]
21:34:47: [<ffffffffa0642b4e>] osc_check_rpcs+0x2de/0x470 [osc]
21:34:47: [<ffffffffa06390d3>] ? on_list+0x43/0x50 [osc]
21:34:47: [<ffffffffa06436f3>] osc_queue_async_io+0x3c3/0x8f0 [osc]
21:34:47: [<ffffffff81037132>] ? pvclock_clocksource_read+0x72/0xd0
21:34:47: [<ffffffff8126d306>] ? vsnprintf+0x2b6/0x5f0
21:34:47: [<ffffffff81098a8a>] ? do_gettimeofday+0x1a/0x50
21:34:47: [<ffffffffa06515ef>] osc_page_cache_add+0xcf/0x200 [osc]
21:34:48: [<ffffffffa0485ab8>] cl_page_invoke+0xb8/0x160 [obdclass]
21:34:48: [<ffffffff8119b791>] ? __mark_inode_dirty+0x41/0x160
21:34:48: [<ffffffffa0486838>] cl_page_cache_add+0x58/0x240 [obdclass]
21:34:48: [<ffffffffa07a9503>] ? ll_set_page_dirty+0x13/0x90 [lustre]
21:34:48: [<ffffffffa07754b6>] ? vvp_write_pending+0x56/0x150 [lustre]
21:34:48: [<ffffffffa07ba5a3>] vvp_io_commit_write+0x343/0x5a0 [lustre]
21:34:48: [<ffffffffa037a8f2>] ? cfs_hash_lookup+0x82/0xa0 [libcfs]
21:34:48: [<ffffffffa049481f>] cl_io_commit_write+0xaf/0x1f0 [obdclass]
21:34:48: [<ffffffffa0484ab9>] ? cl_env_get+0x29/0x350 [obdclass]
21:34:48: [<ffffffffa0791c4d>] ll_commit_write+0xed/0x300 [lustre]
21:34:48: [<ffffffffa07a92c0>] ll_write_end+0x30/0x60 [lustre]
21:34:48: [<ffffffff8110dd44>] generic_file_buffered_write+0x174/0x2a0
21:34:48: [<ffffffff8106dd57>] ? current_fs_time+0x27/0x30
21:34:48: [<ffffffff8110f630>] __generic_file_aio_write+0x250/0x480
21:34:48: [<ffffffff8110f8cf>] generic_file_aio_write+0x6f/0xe0
21:34:48: [<ffffffffa07baec1>] vvp_io_write_start+0xa1/0x270 [lustre]
21:34:48: [<ffffffffa0491718>] cl_io_start+0x68/0x170 [obdclass]
21:34:48: [<ffffffffa0495bd0>] cl_io_loop+0x110/0x1c0 [obdclass]
21:34:49: [<ffffffffa076261b>] ll_file_io_generic+0x44b/0x580 [lustre]
21:34:49: [<ffffffffa037ceeb>] ? cfs_hash_add_unique+0x1b/0x40 [libcfs]
21:34:49: [<ffffffffa0484c2e>] ? cl_env_get+0x19e/0x350 [obdclass]
21:34:49: [<ffffffffa076288f>] ll_file_aio_write+0x13f/0x310 [lustre]
21:34:49: [<ffffffffa0762750>] ? ll_file_aio_write+0x0/0x310 [lustre]
21:34:49: [<ffffffff8117247b>] do_sync_readv_writev+0xfb/0x140
21:34:49: [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
21:34:49: [<ffffffff812053a6>] ? security_file_permission+0x16/0x20
21:34:49: [<ffffffff8117353f>] do_readv_writev+0xcf/0x1f0
21:34:49: [<ffffffff8115b19a>] ? kmem_cache_alloc+0x9a/0x190
21:34:49: [<ffffffff81205706>] ? security_task_setgroups+0x16/0x20
21:34:49: [<ffffffff81096e55>] ? set_groups+0x25/0x1a0
21:34:49: [<ffffffff811736a6>] vfs_writev+0x46/0x60
21:34:49: [<ffffffffa091a3d7>] nfsd_vfs_write+0x107/0x430 [nfsd]
21:34:49: [<ffffffff810db8be>] ? rcu_start_gp+0x1be/0x230
21:34:49: [<ffffffffa0918852>] ? nfsd_setuser_and_check_port+0x62/0xb0 [nfsd]
21:34:49: [<ffffffffa091ce39>] nfsd_write+0x99/0x100 [nfsd]
21:34:49: [<ffffffffa0927590>] nfsd4_write+0x100/0x130 [nfsd]
21:34:50: [<ffffffffa0927f01>] nfsd4_proc_compound+0x3d1/0x490 [nfsd]
21:34:50: [<ffffffffa091543e>] nfsd_dispatch+0xfe/0x240 [nfsd]
21:34:50: [<ffffffffa027a514>] svc_process_common+0x344/0x640 [sunrpc]
21:34:50: [<ffffffff8105dc60>] ? default_wake_function+0x0/0x20
21:34:50: [<ffffffffa027ab50>] svc_process+0x110/0x160 [sunrpc]
21:34:50: [<ffffffffa0915b62>] nfsd+0xc2/0x160 [nfsd]
21:34:50: [<ffffffffa0915aa0>] ? nfsd+0x0/0x160 [nfsd]
21:34:50: [<ffffffff8108ddf6>] kthread+0x96/0xa0
21:34:50: [<ffffffff8100c1ca>] child_rip+0xa/0x20
21:34:50: [<ffffffff8108dd60>] ? kthread+0x0/0xa0
21:34:50: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
21:34:50:Code: 00 44 8b 35 5b 62 9e 00 45 85 f6 74 32 48 8b 50 08 8b 5a 18 48 8b 90 10 09 00 00 48 8b 4a 50 48 85 c9 74 1b 48 63 db 48 8b 51 20 <48> 03 14 dd e0 49 b9 81 4c 01 2a 48 8b 49 78 48 85 c9 75 e8 48
21:34:50:RIP [<ffffffff8104fc84>] update_curr+0x134/0x1e0
21:34:50: RSP <ffff880002003c38>
21:34:50:CR2: ffffffff894c99e0
21:34:50:---[ end trace 670fbcb3104badfe ]---
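The trace above is long: the nfsd write path runs through the Lustre client and osc layers into `__kmalloc`, then into memory reclaim (`shrink_slab` -> `ldlm_pools_shrink`), and the kernel reports "Thread overran stack, or stack corrupted". To get a quick view of where the frames come from, a small script along these lines could summarize the call trace per module. This is only a rough sketch and not part of the test suite: the names `parse_frames` and `frames_per_module` are hypothetical, the regex assumes the `[<addr>] func+0xoff/0xsize [module]` frame format seen in this log, and frames marked with `?` are treated as unreliable and left out of the counts.

```python
import re
from collections import Counter

# Matches frames such as:
#   [<ffffffffa03d89c5>] lprocfs_counter_add+0x165/0x196 [lvfs]
#   [<ffffffff81093e7f>] ? up+0x2f/0x50
# Note: RIP/IP lines use the same shape and will match as well, so pass in
# only the "Call Trace:" portion of the log if those should be excluded.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack frame found in the log text."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue
        frames.append({
            "func": m.group("func"),
            # Frames without a [module] suffix belong to the core kernel.
            "module": m.group("module") or "kernel",
            # A leading "?" means the unwinder is not sure about the frame.
            "reliable": m.group("guess") is None,
        })
    return frames


def frames_per_module(frames):
    """Count reliable frames per module."""
    return Counter(f["module"] for f in frames if f["reliable"])


# A few lines copied from the trace in this ticket.
SAMPLE = """\
21:34:46: [<ffffffffa0375d69>] libcfs_assertion_failed+0x59/0x70 [libcfs]
21:34:46: [<ffffffffa03d89c5>] lprocfs_counter_add+0x165/0x196 [lvfs]
21:34:46: [<ffffffffa0558537>] ldlm_pool_shrink+0x57/0xf0 [ptlrpc]
21:34:46: [<ffffffff81093e7f>] ? up+0x2f/0x50
21:34:46: [<ffffffff81125a6a>] shrink_slab+0x13a/0x1a0
"""

if __name__ == "__main__":
    for module, count in frames_per_module(parse_frames(SAMPLE)).most_common():
        print(module, count)
```

Running it over the full console log instead of `SAMPLE` should give a rough idea of how many frames each layer (nfsd, lustre, osc, ptlrpc, core kernel) contributes to the stack depth.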
Issue Links
- duplicates LU-969 2.1 client stack overruns (Resolved)