[LU-3063] osp_sync.c:866:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 29 changes, 26 in progress, 7 in flight: -5 Created: 29/Mar/13 Updated: 12/Sep/13 Resolved: 12/Sep/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.1, Lustre 2.5.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jinshan Xiong (Inactive) | Assignee: | Keith Mannthey (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | LB | ||
| Attachments: |
|
| Issue Links: |
|
| Severity: | 2 |
| Rank (Obsolete): | 7462 |
| Description |
|
LBUG happens quite often when running racer under a memory pressure environment. For example, I'm using 2G memory vmware, and set up lustre with: OSTSIZE=$((512*1024)) REFORMAT=1 sh racer.sh It can hit LBUG as follows: ll_ost_io01_005: page allocation failure. order:4, mode:0x50 Pid: 22344, comm: ll_ost_io01_005 Not tainted 2.6.32-279.19.1.el6.x86_64.debug #1 Call Trace: [<ffffffff81139b2a>] ? __alloc_pages_nodemask+0x6aa/0xa20 [<ffffffff81175dfe>] ? kmem_getpages+0x6e/0x170 [<ffffffff8117884b>] ? fallback_alloc+0x1cb/0x2b0 [<ffffffff811780a9>] ? cache_grow+0x4c9/0x530 [<ffffffff8117852b>] ? ____cache_alloc_node+0xab/0x200 [<ffffffff81179d08>] ? __kmalloc+0x288/0x330 [<ffffffffa048ebb0>] ? cfs_alloc+0x30/0x60 [libcfs] [<ffffffffa048ebb0>] ? cfs_alloc+0x30/0x60 [libcfs] [<ffffffffa0e35ad8>] ? ost_io_thread_init+0x48/0x300 [ost] [<ffffffffa0888703>] ? ptlrpc_main+0xa3/0x1810 [ptlrpc] [<ffffffff810aebad>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8105960d>] ? finish_task_switch+0x7d/0x110 [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffff810097dc>] ? __switch_to+0x1ac/0x320 [<ffffffffa0888660>] ? ptlrpc_main+0x0/0x1810 [ptlrpc] [<ffffffff8100c1ca>] ? child_rip+0xa/0x20 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8100bb10>] ? restore_args+0x0/0x30 [<ffffffffa0888660>] ? ptlrpc_main+0x0/0x1810 [ptlrpc] [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 186, btch: 31 usd: 40 CPU 1: hi: 186, btch: 31 usd: 0 CPU 2: hi: 186, btch: 31 usd: 0 CPU 3: hi: 186, btch: 31 usd: 38 active_anon:74087 inactive_anon:150981 isolated_anon:0 active_file:14723 inactive_file:15191 isolated_file:5 unevictable:0 dirty:1405 writeback:0 unstable:0 free:57272 slab_reclaimable:6236 slab_unreclaimable:76201 mapped:1475 shmem:214401 pagetables:2281 bounce:0 Node 0 DMA free:8756kB min:332kB low:412kB high:496kB active_anon:656kB inactive_anon:1672kB active_file:68kB inactive_file:1488kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15300kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:2328kB slab_reclaimable:8kB slab_unreclaimable:3028kB kernel_stack:16kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 2004 2004 2004 Node 0 DMA32 free:231280kB min:44720kB low:55900kB high:67080kB active_anon:298048kB inactive_anon:602252kB active_file:58824kB inactive_file:59020kB unevictable:0kB isolated(anon):0kB isolated(file):20kB present:2052192kB mlocked:0kB dirty:5804kB writeback:0kB mapped:5900kB shmem:855276kB slab_reclaimable:24936kB slab_unreclaimable:287616kB kernel_stack:4200kB pagetables:9860kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 203*4kB 104*8kB 24*16kB 6*32kB 5*64kB 3*128kB 3*256kB 2*512kB 2*1024kB 1*2048kB 0*4096kB = 8812kB Node 0 DMA32: 35286*4kB 9007*8kB 1989*16kB 87*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 251840kB 244228 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 0kB Total swap = 0kB 524272 pages RAM 77789 pages reserved 65618 pages shared 325369 pages non-shared LustreError: 47322:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 17112 of inode ffff880038147728 failed -28 LustreError: 39799:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 1022 of inode ffff8800598402a8 failed -28 LustreError: 44035:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 255 of inode ffff880046224268 failed -28 LustreError: 46598:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 304 of inode ffff8800203b02a8 failed -28 Buffer I/O error on device loop0, logical block 17163 lost page write due to I/O error on loop0 LustreError: 48203:0:(file.c:2707:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000401:0x6f4d:0x0] error: rc = -116 Buffer I/O error on device loop0, logical block 29955 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 29956 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 29960 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 12699 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 12723 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 12783 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 15429 lost page write due to I/O error on loop0 Buffer I/O error on device loop0, logical block 15430 lost page write due to I/O error on loop0 LDISKFS-fs error (device loop0): ldiskfs_find_entry: reading directory #50271 offset 0 Aborting journal on device loop0-8. LustreError: 4993:0:(file.c:158:ll_close_inode_openhandle()) inode 144115205272531443 mdc close failed: rc = -30 LustreError: 4972:0:(llite_lib.c:1292:ll_md_setattr()) md_setattr fails: rc = -30 LustreError: 3150:0:(osd_handler.c:635:osd_trans_commit_cb()) transaction @0xffff88005eef12a0 commit error: 2 LDISKFS-fs error (device loop0): ldiskfs_journal_start_sb: Detected aborted journal LDISKFS-fs (loop0): Remounting filesystem read-only LustreError: 3150:0:(osd_handler.c:635:osd_trans_commit_cb()) transaction @0xffff880058789818 commit error: 2 LustreError: 3498:0:(osd_io.c:997:osd_ldiskfs_read()) lustre-MDT0000: can't read 4096@172032 on ino 110: rc = -5 LustreError: 3498:0:(llog_osd.c:562:llog_osd_next_block()) lustre-MDT0000-osd: can't read llog block from log [0x1:0xc:0x0] offset 172032: rc = -5 LustreError: 3498:0:(osp_sync.c:866:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 29 changes, 26 in progress, 7 in flight: -5 LustreError: 3498:0:(osp_sync.c:866:osp_sync_thread()) LBUG Pid: 3498, comm: osp-syn-1 Call Trace: [<ffffffffa048d8c5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] [<ffffffffa048dec7>] lbug_with_loc+0x47/0xb0 [libcfs] [<ffffffffa0ea4a73>] osp_sync_thread+0x783/0x800 [osp] [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8105960d>] ? finish_task_switch+0x7d/0x110 [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffffa0ea42f0>] ? 
osp_sync_thread+0x0/0x800 [osp] LustreError: 3496:0:(osd_io.c:997:osd_ldiskfs_read()) lustre-MDT0000: can't read 4096@180224 on ino 108: rc = -5 LustreError: 3496:0:(llog_osd.c:562:llog_osd_next_block()) lustre-MDT0000-osd: can't read llog block from log [0x1:0xa:0x0] offset 180224: rc = -5 LustreError: 3496:0:(osp_sync.c:866:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 14 changes, 30 in progress, 6 in flight: -5 LustreError: 3496:0:(osp_sync.c:866:osp_sync_thread()) LBUG Pid: 3496, comm: osp-syn-0 Call Trace: [<ffffffffa048d8c5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] [<ffffffffa048dec7>] lbug_with_loc+0x47/0xb0 [libcfs] [<ffffffffa0ea4a73>] osp_sync_thread+0x783/0x800 [osp] [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8105960d>] ? finish_task_switch+0x7d/0x110 [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffffa0ea42f0>] ? osp_sync_thread+0x0/0x800 [osp] [<ffffffff8100c1ca>] child_rip+0xa/0x20 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8100bb10>] ? restore_args+0x0/0x30 [<ffffffffa0ea42f0>] ? osp_sync_thread+0x0/0x800 [osp] [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 LustreError: 50716:0:(file.c:158:ll_close_inode_openhandle()) inode 144115205255753869 mdc close failed: rc = -30 LustreError: 50716:0:(file.c:158:ll_close_inode_openhandle()) Skipped 1 previous similar message LDISKFS-fs (loop0): Remounting filesystem read-only Kernel panic - not syncing: LBUG Pid: 3496, comm: osp-syn-0 Not tainted 2.6.32-279.19.1.el6.x86_64.debug #1 Call Trace: [<ffffffff8151baba>] ? panic+0xa0/0x16d [<ffffffffa048df1b>] ? lbug_with_loc+0x9b/0xb0 [libcfs] [<ffffffffa0ea4a73>] ? osp_sync_thread+0x783/0x800 [osp] [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8105960d>] ? finish_task_switch+0x7d/0x110 [<ffffffff810595d8>] ? finish_task_switch+0x48/0x110 [<ffffffffa0ea42f0>] ? osp_sync_thread+0x0/0x800 [osp] [<ffffffff8100c1ca>] ? child_rip+0xa/0x20 [<ffffffff8151f690>] ? _spin_unlock_irq+0x30/0x40 [<ffffffff8100bb10>] ? restore_args+0x0/0x30 [<ffffffffa0ea42f0>] ? osp_sync_thread+0x0/0x800 [osp] [<ffffffff8100c1c0>] ? child_rip+0x0/0x20 |
| Comments |
| Comment by Oleg Drokin [ 29/Mar/13 ] |
|
So this stems from this:
struct ost_thread_local_cache {
/*
* pool of nio buffers used by write-path
*/
struct niobuf_local local[OST_THREAD_POOL_SIZE];
unsigned int temporary:1;
};
and OST_THREAD_POOL_SIZE is defined as PTLRPC_MAX_BRW_PAGES. |
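For rough intuition on why this shows up as an order-4 allocation: with 4 MB RPCs, PTLRPC_MAX_BRW_PAGES works out to 1024 (4 MB / 4 KB pages), so the per-thread pool holds 1024 niobuf_local entries, and kmalloc serves the request from a power-of-two bucket. Below is a minimal userspace sketch of that arithmetic; the per-entry size used here is an assumption, not the real sizeof(struct niobuf_local).

#include <stdio.h>

#define PAGE_SIZE_BYTES       4096UL
#define PTLRPC_MAX_BRW_PAGES  1024UL  /* assumption: 4 MB RPC / 4 KB pages */
#define NIOBUF_LOCAL_SIZE       64UL  /* assumption: rough per-entry size  */

int main(void)
{
        unsigned long pool = PTLRPC_MAX_BRW_PAGES * NIOBUF_LOCAL_SIZE;
        unsigned long bucket = PAGE_SIZE_BYTES;
        unsigned int order = 0;

        /* kmalloc rounds requests up to a power-of-two bucket */
        while (bucket < pool) {
                bucket <<= 1;
                order++;
        }
        printf("pool %lu KB -> kmalloc bucket %lu KB -> order-%u allocation\n",
               pool / 1024, bucket / 1024, order);
        return 0;
}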
| Comment by Keith Mannthey (Inactive) [ 29/Mar/13 ] |
ll_ost_io01_005: page allocation failure. order:4, mode:0x50
It fails a page allocation at the kernel level.
Node 0 DMA: 203*4kB 104*8kB 24*16kB 6*32kB 5*64kB 3*128kB 3*256kB 2*512kB 2*1024kB 1*2048kB 0*4096kB = 8812kB
Node 0 DMA32: 35286*4kB 9007*8kB 1989*16kB 87*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 251840kB
Higher-order pages are scarce in both zones (order 4 is 16 pages, 64 KB). The kernel may not willingly give up its last few pages for anything less than the memory reclaim processes. This is memory fragmentation: the longer a system runs, the less likely higher-order allocations are to succeed. Perhaps racer needs a minimum memory configuration to run properly; I don't know much about that test. We should also be a little concerned about the LBUG after the filesystem is remounted read-only; it is not the softest landing for a memory allocation error. |
| Comment by Oleg Drokin [ 01/Apr/13 ] |
|
We just need to convert the allocation to ALLOC_LARGE for the short term, I imagine; it is probably not all that easy to reduce the size itself. The reason for the failure is that racer puts pressure on the OST threads and ptlrpc decides to spawn a few more, but those threads need some TLS cache to handle IO, and we fail to allocate it, especially now that 4M RPCs increase the required allocation size. |
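A minimal sketch of that conversion, assuming the usual "large alloc" pattern in this code base (try kmalloc quietly, fall back to vmalloc when high-order pages are unavailable). The helper names here are illustrative rather than the macros from the landed patch, and the three-argument __vmalloc() matches the RHEL6-era kernel in these logs.

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

/* Sketch only, in the spirit of OBD_ALLOC_LARGE/OBD_FREE_LARGE. */
static void *tls_alloc_large(size_t size)
{
	void *ptr;

	/* don't warn or retry hard on the high-order kmalloc path */
	ptr = kmalloc(size, GFP_NOFS | __GFP_NOWARN);
	if (ptr == NULL)
		/* vmalloc only needs order-0 pages, so it survives fragmentation */
		ptr = __vmalloc(size, GFP_NOFS | __GFP_HIGHMEM, PAGE_KERNEL);
	return ptr;
}

static void tls_free_large(void *ptr)
{
	if (is_vmalloc_addr(ptr))
		vfree(ptr);
	else
		kfree(ptr);
}

Either branch must be paired with the matching free, which is why the pattern is normally wrapped in a macro pair rather than open-coded at each call site.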
| Comment by Alex Zhuravlev [ 01/Apr/13 ] |
|
Keith, you're right about the handling, of course. I think in this case the approach should be "relax for a moment, try again". We can repeat as long as we need; in the worst case the OST(s) will fill up until we get some memory to handle the llog and send RPCs. |
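A sketch of that "relax for a moment, try again" shape for the osp_sync_thread loop, replacing the fatal assertion with a bounded pause before reprocessing the llog. The surrounding variables (env, llh, d) and the helper names are taken from the messages in this ticket or assumed; this is not the landed patch.

	do {
		rc = llog_cat_process(&env, llh, osp_sync_process_queues, d, 0, 0);
		if (rc == 0 || rc == LLOG_PROC_BREAK)
			break;

		CERROR("%s: llog processing failed: rc = %d; will retry\n",
		       d->opd_obd->obd_name, rc);

		/* relax for a moment, then try again; in the worst case the
		 * OST just lags until memory/IO recovers */
		set_current_state(TASK_INTERRUPTIBLE);
		schedule_timeout(cfs_time_seconds(5));
	} while (osp_sync_running(d));  /* assumed stop-condition helper */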
| Comment by Keith Mannthey (Inactive) [ 08/May/13 ] |
|
I just wanted to note that I am digging into this issue a bit and pulled fresh code today. I am able to reproduce panics with a 2 GB VM and 4 GB of swap. I use memhog to trigger the order-4 allocations (memhog 5200M, I think) and then wait around. I just set up a serial console to capture the full panics and see whether they are the same as reported; the panic traces don't always make it to disk. My goal is to allow the system to fail the order-4 allocations without causing any panics. |
| Comment by Keith Mannthey (Inactive) [ 09/May/13 ] |
|
Well, I hit this other fun thing with memory pressure after running a bit longer than the original issue: it occurs during the unmount/mount cycles, and normally before the order-4 memory allocations start to fail. The mount/umount happens every 300 seconds or so as per the standard racer run.
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
LustreError: 3320:0:(lprocfs_status.c:489:lprocfs_register()) Lproc: Attempting to register llite more than once
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
SLAB: cache with size 256 has lost its name
SLAB: cache with size 40 has lost its name
SLAB: cache with size 192 has lost its name
SLAB: cache with size 1216 has lost its name
LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
BUG: unable to handle kernel paging request at ffffffffffffffde IP: [<ffffffffa05a8498>] lprocfs_srch+0x48/0x80 [obdclass] PGD 1a87067 PUD 1a88067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/possible CPU 2 Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) osd_ldiskfs(U) fsfilt_ldiskfs(U) ldiskfs(U) mdd(U) mgs(U) lquota(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) exportfs jbd ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ppdev parport_pc parport microcode i2c_piix4 i2c_core sg snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom ahci pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lvfs] Pid: 4433, comm: mount.lustre Not tainted 2.6.32.may03master #1 innotek GmbH VirtualBox RIP: 0010:[<ffffffffa05a8498>] [<ffffffffa05a8498>] lprocfs_srch+0x48/0x80 [obdclass] RSP: 0000:ffff8800425aba58 EFLAGS: 00010286 RAX: ffffffffa0630ac0 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8800425abb68 RDI: ffffffffa0630ac0 RBP: ffff8800425aba78 R08: 00000000fffffffe R09: 0000000000000000 R10: 000000000000000f R11: 000000000000000f R12: ffff8800425abb68 R13: ffffffffffffff8e R14: 0000000000000000 R15: ffff8800425abb68 FS: 00007f3789752700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffffffffffde CR3: 000000001bcb7000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process mount.lustre (pid: 4433, threadinfo ffff8800425aa000, task ffff8800650c1540) Stack: ffffffffffffff8e ffff8800425abb68 ffffffffffffff8e 0000000000000000 <d> ffff8800425abac8 ffffffffa05aa3e4 ffff8800425aba98 ffffffff81281356 <d> 0000000000000004 0000000affffffff ffff88000017c000 0000000000000006 Call Trace: [<ffffffffa05aa3e4>] lprocfs_register+0x34/0x100 [obdclass] [<ffffffff81281356>] ? vsnprintf+0x336/0x5e0 [<ffffffffa188eb7e>] lprocfs_register_mountpoint+0x12e/0xb20 [lustre] [<ffffffffa187e3b6>] client_common_fill_super+0x1a6/0x5280 [lustre] [<ffffffff81281640>] ? sprintf+0x40/0x50 [<ffffffffa1884204>] ll_fill_super+0xd74/0x1500 [lustre] [<ffffffffa05d557d>] lustre_fill_super+0x3dd/0x530 [obdclass] [<ffffffffa05d51a0>] ? lustre_fill_super+0x0/0x530 [obdclass] [<ffffffff811841df>] get_sb_nodev+0x5f/0xa0 [<ffffffffa05ccde5>] lustre_get_sb+0x25/0x30 [obdclass] [<ffffffff8118381b>] vfs_kern_mount+0x7b/0x1b0 [<ffffffff811839c2>] do_kern_mount+0x52/0x130 [<ffffffff811a3c72>] do_mount+0x2d2/0x8d0 [<ffffffff811a4300>] sys_mount+0x90/0xe0 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Code: a0 49 89 f4 e8 3a 72 f6 e0 4d 85 ed 75 1d 48 c7 c7 c0 0a 63 a0 e8 c9 3c af e0 48 83 c4 08 48 89 d8 5b 41 5c 41 5d c9 c3 0f 1f 00 <49> 8b 5d 50 48 85 db 75 10 eb d8 0f 1f 44 00 00 48 8b 5b 40 48 RIP [<ffffffffa05a8498>] lprocfs_srch+0x48/0x80 [obdclass] |
| Comment by Keith Mannthey (Inactive) [ 15/May/13 ] |
|
Today testing the lproc patch I encountered: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff8116823c>] kmem_cache_alloc+0x6c/0x190 PGD 3ac6b067 PUD 608bc067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/module/ptlrpc/initstate CPU 3 Modules linked in: obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) exportfs jbd ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs(U) ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ppdev parport_pc parport microcode i2c_piix4 i2c_core snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 sg ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom ahci pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lvfs] Pid: 10558, comm: ptlrpcd_rcv Not tainted 2.6.32.may03master #1 innotek GmbH VirtualBox RIP: 0010:[<ffffffff8116823c>] [<ffffffff8116823c>] kmem_cache_alloc+0x6c/0x190 RSP: 0018:ffff880000ebfd10 EFLAGS: 00010046 RAX: 0000000000000003 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000050 RDI: 0000000000000000 RBP: ffff880000ebfd40 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000050 R13: 0000000000000000 R14: 0000000000000050 R15: 0000000000000246 FS: 00007ff97f825700(0000) GS:ffff880002380000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 000000002ca45000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process ptlrpcd_rcv (pid: 10558, threadinfo ffff880000ebe000, task ffff88003af34aa0) Stack: ffff880000ebfd40 0000000000000000 ffff880000ebfea0 ffffffffa08ef840 <d> 0000000000000000 ffff880000ebfee0 ffff880000ebfd60 ffffffffa034ca72 <d> ffff880000ebfd60 0000000000000040 ffff880000ebfd80 ffffffffa08c243a Call Trace: [<ffffffffa034ca72>] cfs_mem_cache_alloc+0x22/0x30 [libcfs] [<ffffffffa08c243a>] osc_key_init+0x3a/0x200 [osc] [<ffffffffa04e11ef>] keys_fill+0x6f/0x190 [obdclass] [<ffffffffa04e132d>] lu_context_refill+0x1d/0x30 [obdclass] [<ffffffffa04e1356>] lu_env_refill+0x16/0x30 [obdclass] [<ffffffffa06c773c>] ptlrpcd_check+0x4c/0x560 [ptlrpc] [<ffffffff8109703c>] ? remove_wait_queue+0x3c/0x50 [<ffffffffa06c80c0>] ptlrpcd+0x1a0/0x390 [ptlrpc] [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290 [<ffffffff81063310>] ? default_wake_function+0x0/0x20 [<ffffffffa06c7f20>] ? ptlrpcd+0x0/0x390 [ptlrpc] [<ffffffff8100c0ca>] child_rip+0xa/0x20 [<ffffffffa06c7f20>] ? ptlrpcd+0x0/0x390 [ptlrpc] [<ffffffffa06c7f20>] ? ptlrpcd+0x0/0x390 [ptlrpc] [<ffffffff8100c0c0>] ? child_rip+0x0/0x20 Code: 00 49 89 c7 fa 66 0f 1f 44 00 00 65 48 8b 04 25 c0 cb 00 00 f7 40 14 00 00 00 12 0f 85 88 00 00 00 65 8b 04 25 b8 e0 00 00 48 98 <49> 8b 44 c5 00 44 8b 00 45 85 c0 0f 84 d7 00 00 00 8b 10 c7 40 RIP [<ffffffff8116823c>] kmem_cache_alloc+0x6c/0x190 I will attach the full serial log shortly. |
| Comment by Keith Mannthey (Inactive) [ 15/May/13 ] |
|
Racer 2gb kmem_cache_alloc crash -v1 serial out log. |
| Comment by Keith Mannthey (Inactive) [ 15/May/13 ] |
|
With the latest error it seems that osc_init is not checking the creation of the osc caches. I am working with a test patch in this area to fail osc_init in the case where lu_kmem_init fails. If it helps I will post it for review. |
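A sketch of what that check could look like, assuming the module-init path follows the usual lu_kmem pattern; the registration wrapper below is hypothetical and stands in for the real class_register_type() call.

static int __init osc_init(void)
{
	int rc;

	rc = lu_kmem_init(osc_caches);   /* creates the osc slab caches */
	if (rc != 0)
		return rc;               /* out of memory: fail module init,
					  * so osc_key_init() never sees a
					  * NULL cache */

	rc = register_osc_obd_type();    /* hypothetical wrapper for the real
					  * class_register_type() call */
	if (rc != 0)
		lu_kmem_fini(osc_caches);

	return rc;
}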
| Comment by Keith Mannthey (Inactive) [ 16/May/13 ] |
|
Well, I have not hit the kmem_cache_alloc crash in the last while. I hit a second lproc issue (documented in LU-3331) yesterday, and today is a bad situation. The system has not crashed, but my root file system has stopped responding and all I/O is hanging. My last terminal was fine until I interacted with the root filesystem. Sorry, not a lot of debug, but this is what is happening on the serial console:
LustreError: 15631:0:(service.c:844:ptlrpc_register_service()) Skipped 1 previous similar message
LustreError: 15631:0:(ost_handler.c:2677:ost_setup()) failed to start OST I/O service: -12
LustreError: 15631:0:(ost_handler.c:2677:ost_setup()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_config.c:572:class_setup()) setup OSS failed (-12)
LustreError: 15631:0:(obd_config.c:572:class_setup()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount.c:201:lustre_start_simple()) OSS setup error -12
LustreError: 15631:0:(obd_mount.c:201:lustre_start_simple()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount_server.c:1225:server_start_targets()) failed to start OSS: -12
LustreError: 15631:0:(obd_mount_server.c:1225:server_start_targets()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount_server.c:1699:server_fill_super()) Unable to start targets: -12
LustreError: 15631:0:(obd_mount_server.c:1699:server_fill_super()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount_server.c:844:lustre_disconnect_lwp()) lustre-MDT0000-lwp-OST0000: Can't end config log lustre-client.
LustreError: 15631:0:(obd_mount_server.c:844:lustre_disconnect_lwp()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount_server.c:1426:server_put_super()) lustre-OST0000: failed to disconnect lwp. (rc=-2)
LustreError: 15631:0:(obd_mount_server.c:1426:server_put_super()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount_server.c:1456:server_put_super()) no obd lustre-OST0000
LustreError: 15631:0:(obd_mount_server.c:1456:server_put_super()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount_server.c:135:server_deregister_mount()) lustre-OST0000 not registered
LustreError: 15631:0:(obd_mount_server.c:135:server_deregister_mount()) Skipped 1 previous similar message
LustreError: 15631:0:(obd_mount.c:1267:lustre_fill_super()) Unable to mount (-12)
LustreError: 15631:0:(obd_mount.c:1267:lustre_fill_super()) Skipped 1 previous similar message
LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
LustreError: Skipped 1 previous similar message LustreError: 24257:0:(file.c:2718:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000400:0x269:0x0] error: rc = -116 LustreError: 24264:0:(file.c:2718:ll_inode_revalidate_fini()) lustre: revalidate FID [0x200000400:0x269:0x0] error: rc = -116 Out of memory: Kill process 17806 (memhog) score 890 or sacrifice child Killed process 17806, UID 0, (memhog) total-vm:5503100kB, anon-rss:1446724kB, file-rss:16kB LustreError: 19093:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli 77404d61-8728-23ef-e160-d009880a0b48/ffff880057434000 left 66191360 < tot_grant 68330240 unstable 0 pending 0 LustreError: 22611:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli f6b05f1d-8d84-085e-aff5-8b8b01874ede/ffff880065281400 left 66191360 < tot_grant 68330240 unstable 0 pending 98304 LustreError: 4974:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli f6b05f1d-8d84-085e-aff5-8b8b01874ede/ffff880065281400 left 60882944 < tot_grant 64422656 unstable 12288 pending 12288 LustreError: 4974:0:(ofd_grant.c:255:ofd_grant_space_left()) Skipped 82 previous similar messages LustreError: 30176:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli f6b05f1d-8d84-085e-aff5-8b8b01874ede/ffff880065281400 left 56311808 < tot_grant 58970880 unstable 0 pending 0 LustreError: 30176:0:(ofd_grant.c:255:ofd_grant_space_left()) Skipped 84 previous similar messages LustreError: 25853:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli 77404d61-8728-23ef-e160-d009880a0b48/ffff880057434000 left 43249664 < tot_grant 48173824 unstable 0 pending 0 LustreError: 25853:0:(ofd_grant.c:255:ofd_grant_space_left()) Skipped 119 previous similar messages LustreError: 25853:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli 77404d61-8728-23ef-e160-d009880a0b48/ffff880057434000 left 27381760 < tot_grant 27546368 unstable 81920 pending 0 LustreError: 25853:0:(ofd_grant.c:255:ofd_grant_space_left()) Skipped 388 previous similar messages LustreError: 20014:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 42946 of inode ffff880053550b38 failed -28 LustreError: 5617:0:(vvp_io.c:1086:vvp_io_commit_write()) Write page 35089 of inode ffff88001874dbf8 failed -28 LustreError: 19936:0:(ofd_grant.c:255:ofd_grant_space_left()) lustre-OST0001: cli f6b05f1d-8d84-085e-aff5-8b8b01874ede/ffff880065281400 left 0 < tot_grant 537344 unstable 0 pending 0 LustreError: 19936:0:(ofd_grant.c:255:ofd_grant_space_left()) Skipped 575 previous similar messages INFO: task mdt_rdpg00_000:18958 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task dir_create.sh:19649 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25366 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25367 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25368 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25369 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25370 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25371 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25372 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. INFO: task ls:25373 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. LustreError: dumping log to /tmp/lustre-log.1368691592.18956 LustreError: 11-0: lustre-MDT0000-mdc-ffff88007cd05000: Communicating with 0@lo, operation mds_connect failed with -16. LustreError: Skipped 1 previous similar message LustreError: 11-0: lustre-MDT0000-mdc-ffff88007cd05000: Communicating with 0@lo, operation mds_connect failed with -16. LustreError: Skipped 3 previous similar messages Outside of the oom killer hitting memhog there are no Linux messages. I can't kill memhog or the racer test (stalled in the kernel somewhere). I am working to at least get process stacks from the kernel. |
| Comment by Keith Mannthey (Inactive) [ 16/May/13 ] |
|
Well, the VM was non-responsive. I am setting up better RAS in my VM... I will attach the /var/log/messages from the boot. |
| Comment by Keith Mannthey (Inactive) [ 16/May/13 ] |
|
System was up but rootfs I/O hung. No real info outside of what is in this log. |
| Comment by Keith Mannthey (Inactive) [ 23/May/13 ] |
|
After a little lproc issue I have good data from this odd I/O hang. All userspace processes were hung, but some parts of the kernel were running just fine. I am still looking at all the data (full logs and vmcore) and hope to have a better idea of what happened tomorrow. |
| Comment by Keith Mannthey (Inactive) [ 23/May/13 ] |
|
Full console from odd hang. There is a sysrq dump of the memory state and all the processes at the end. |
| Comment by Keith Mannthey (Inactive) [ 23/May/13 ] |
|
So it seems general processes are all in TASK_UNINTERRUPTIBLE, waiting for a common mutex. The mutex is held by "crond" (doing who knows what), and its stack looks like this:
PID: 1433 TASK: ffff8800375c2ae0 CPU: 2 COMMAND: "crond"
#0 [ffff88007d4dd9b8] schedule at ffffffff8150da92
#1 [ffff88007d4dda80] __wait_on_freeing_inode at ffffffff8119c8c8
#2 [ffff88007d4ddaf0] find_inode_fast at ffffffff8119c948
#3 [ffff88007d4ddb20] ifind_fast at ffffffff8119daac
#4 [ffff88007d4ddb50] iget_locked at ffffffff8119dd49
#5 [ffff88007d4ddb90] ext4_iget at ffffffffa00baf27 [ext4]
#6 [ffff88007d4ddc00] ext4_lookup at ffffffffa00c1bb5 [ext4]
#7 [ffff88007d4ddc40] do_lookup at ffffffff81190465
#8 [ffff88007d4ddca0] __link_path_walk at ffffffff81190c24
#9 [ffff88007d4ddd60] path_walk at ffffffff811917aa
#10 [ffff88007d4ddda0] do_path_lookup at ffffffff8119197b
#11 [ffff88007d4dddd0] user_path_at at ffffffff81192607
#12 [ffff88007d4ddea0] vfs_fstatat at ffffffff81186a1c
#13 [ffff88007d4ddee0] vfs_stat at ffffffff81186b8b
#14 [ffff88007d4ddef0] sys_newstat at ffffffff81186bb4
#15 [ffff88007d4ddf80] system_call_fastpath at ffffffff8100b072
All other processes either don't make it past do_path_lookup or have no filesystem usage. I am now looking into why crond is stuck where it is. |
| Comment by Keith Mannthey (Inactive) [ 24/May/13 ] |
|
Hmm, crash didn't want to show me the inode that was being waited on (I expected it to be very close to __wait_on_freeing_inode(inode)). An example racer task:
PID: 23088 TASK: ffff880037598aa0 CPU: 1 COMMAND: "ls"
#0 [ffff8800179ef7d8] schedule at ffffffff8150da92
#1 [ffff8800179ef8a0] __mutex_lock_slowpath at ffffffff8150f13e
#2 [ffff8800179ef910] mutex_lock at ffffffff8150efdb
#3 [ffff8800179ef930] mdc_close at ffffffffa0a7064b [mdc]
#4 [ffff8800179ef980] lmv_close at ffffffffa0a3dcd8 [lmv]
#5 [ffff8800179ef9d0] ll_close_inode_openhandle at ffffffffa14edd0e [lustre]
#6 [ffff8800179efa50] ll_release_openhandle at ffffffffa14f2e11 [lustre]
#7 [ffff8800179efa80] ll_file_open at ffffffffa14f53c7 [lustre]
#8 [ffff8800179efb70] ll_dir_open at ffffffffa14d6e1b [lustre]
#9 [ffff8800179efb90] __dentry_open at ffffffff8117e0ca
#10 [ffff8800179efbf0] lookup_instantiate_filp at ffffffff8117e4b9
#11 [ffff8800179efc10] ll_revalidate_nd at ffffffffa14d5a8a [lustre]
#12 [ffff8800179efc40] do_lookup at ffffffff81190326
#13 [ffff8800179efca0] __link_path_walk at ffffffff81190c24
#14 [ffff8800179efd60] path_walk at ffffffff811917aa
#15 [ffff8800179efda0] do_path_lookup at ffffffff8119197b
#16 [ffff8800179efdd0] do_filp_open at ffffffff811928bb
#17 [ffff8800179eff20] do_sys_open at ffffffff8117de79
#18 [ffff8800179eff70] sys_open at ffffffff8117df90
#19 [ffff8800179eff80] system_call_fastpath at ffffffff8100b072
RIP: 00000035d68dac10 RSP: 00007fff4a55b8f8 RFLAGS: 00010202
RAX: 0000000000000002 RBX: ffffffff8100b072 RCX: 0000000000000020
RDX: 0000000000000001 RSI: 0000000000090800 RDI: 0000000000f74fd0
RBP: 00007fd577d776a0 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000f74bf0 R11: 0000000000000246 R12: ffffffff8117df90
R13: ffff8800179eff78 R14: 0000000000000000 R15: 0000000000f74fd0
ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b
I am going to turn on Mutex debugging to speed things up a bit and retest. |
| Comment by Keith Mannthey (Inactive) [ 01/Jun/13 ] |
|
Well, after syncing with master a few days ago I have not crashed. I am running 2 lproc patches and http://review.whamcloud.com/6514. I don't seem to get the same IO errors and I have not seen the original assertion, but I will keep testing. |
| Comment by Diego Moreno (Inactive) [ 01/Aug/13 ] |
|
Just to let you know that I'm hitting this one on 2.3.63, and I guess it is also present in 2.4.0. I see the patch has been integrated into master but not into 2.4. I think it would be worthwhile to see it integrated in a future 2.4.1 release. |
| Comment by Sebastien Buisson (Inactive) [ 20/Aug/13 ] |
|
Hi, I have pushed the b2_4 version of the patch here: TIA, |
| Comment by Peter Jones [ 12/Sep/13 ] |
|
Landed for 2.4.1 and 2.5.0 |