[LU-3007] Umount of OST caused an oops in ptlrpc_free_rqbd Created: 21/Mar/13  Updated: 17/Jul/13  Resolved: 17/Jul/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: James A Simmons Assignee: Bruno Faccini (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

OSS server running Lustre 2.3.61


Attachments: File lu-3007.log    
Severity: 3
Rank (Obsolete): 7320

 Description   

After finishing a test run on the 2.4 file system, when we went to umount the entire file system, one of the OSSes experienced an oops during the umount.
The oops is:

Mar 9 20:11:12 widow-oss11c2 kernel: [112998.231220] Lustre: server umount routed1-OST00b1 complete
Mar 9 20:11:13 widow-oss11c2 kernel: [112999.326796] LustreError: 20552:0:(obd_mount.c:2115:server_put_super()) routed1-OST01d1: failed to disconnect lwp. (rc=-110)
Mar 9 20:11:13 widow-oss11c2 kernel: [112999.359171] LustreError: 20552:0:(obd_mount.c:2115:server_put_super()) Skipped 1 previous similar message
Mar 9 20:11:13 widow-oss11c2 kernel: [112999.370669] Lustre: server umount routed1-OST0171 complete
Mar 9 20:11:13 widow-oss11c2 kernel: [112999.370672] Lustre: Skipped 1 previous similar message
Mar 9 20:11:13 widow-oss11c2 kernel: [112999.429394] Lustre: Failing over routed1-OST01d1
Mar 9 20:11:13 widow-oss11c2 kernel: [112999.448337] Lustre: Skipped 1 previous similar message
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.073001] BUG: soft lockup - CPU#0 stuck for 67s! [obd_zombid:5676]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.073001] Modules linked in: osp(U) ofd(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) ldiskfs(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) lquota(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ext4 mbcache jbd2 sha512_generic sha256_generic ipmi_devintf ipmi_si ipmi_msghandler autofs4 dm_round_robin scsi_wait_scan sd_mod crc_t10dif ib_srp scsi_transport_srp scsi_tgt nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_REJECT xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa dm_mirror dm_region_hash dm_log dm_multipath dm_mod sg sr_mod cdrom radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core ib_mthca ib_mad ib_core shpchp microcode dcdbas serio_raw ata_generic pata_acpi ata_piix iTCO_wdt iTCO_vendor_support i5000_edac edac_core i5k_amb ioatdma dca ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc bnx2 [last unloaded: libcfs]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] Pid: 5676, comm: obd_zombid Not tainted 2.6.32-279.22.1.el6.head.x86_64 #1 Dell Inc. PowerEdge 1950/0TT740
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] RIP: 0010:[<ffffffff81147e76>] [<ffffffff81147e76>] remove_vm_area+0x86/0xa0
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] RSP: 0018:ffff880419427cb0 EFLAGS: 00000287
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] RAX: ffff8800a66fb540 RBX: ffff880419427cd0 RCX: ffff88011e073440
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] RDX: ffff88011e0733c0 RSI: 0000000000000001 RDI: ffffffff81abea30
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] RBP: ffffffff8100bb8e R08: 0000000000000000 R09: ffffc900cc6ee000
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] R10: ffff880028402500 R11: ffff88008acf3800 R12: ffff880419427c80
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] R13: 0000000000000297 R14: ffff8800000206c0 R15: ffffea0002410270
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] FS: 00007f4834066700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] CR2: 00000000004050b9 CR3: 0000000001a85000 CR4: 00000000000006f0
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] Process obd_zombid (pid: 5676, threadinfo ffff880419426000, task ffff8803e2329540)
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] Stack:
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] ffff88001b6df840 ffff8800a66fb540 ffff88008ad17850 ffffc900cc6ee000
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] <d> ffff880419427d00 ffffffff81147f4e 000000000000005a ffff88006f537800
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] <d> ffff8803d5e93400 ffff8803d5e93430 ffff880419427d10 ffffffff811480ba
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] Call Trace:
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff81147f4e>] ? __vunmap+0x2e/0x120
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff811480ba>] ? vfree+0x2a/0x40
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa06e3b0e>] ? cfs_free_large+0xe/0x10 [libcfs]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa0e307c2>] ? ptlrpc_free_rqbd+0xc2/0x3b0 [ptlrpc]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa0e33c45>] ? ptlrpc_unregister_service+0x5d5/0x10b0 [ptlrpc]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff81052223>] ? __wake_up+0x53/0x70
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa1253292>] ? ost_cleanup+0x52/0x1a0 [ost]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07f1d92>] ? class_decref+0x212/0x580 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa06f32e1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf549>] ? obd_zombie_impexp_cull+0x309/0x5d0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf8d5>] ? obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf810>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf810>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf810>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] Code: 30 ea ab 81 48 89 45 e8 e8 88 4d 3a 00 48 8b 15 19 73 e6 00 48 c7 c1 78 f1 fa 81 48 8b 45 e8 48 39 d3 74 0c 90 48 89 d1 48 8b 12 <48> 39 d3 75 f5 48 8b 13 48 89 11 f0 81 05 a4 6b 97 00 00 00 00
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] Call Trace:
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff81147f4e>] ? __vunmap+0x2e/0x120
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff811480ba>] ? vfree+0x2a/0x40
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa06e3b0e>] ? cfs_free_large+0xe/0x10 [libcfs]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa0e307c2>] ? ptlrpc_free_rqbd+0xc2/0x3b0 [ptlrpc]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa0e33c45>] ? ptlrpc_unregister_service+0x5d5/0x10b0 [ptlrpc]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff81052223>] ? __wake_up+0x53/0x70
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa1253292>] ? ost_cleanup+0x52/0x1a0 [ost]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07f1d92>] ? class_decref+0x212/0x580 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa06f32e1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf549>] ? obd_zombie_impexp_cull+0x309/0x5d0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf8d5>] ? obd_zombie_impexp_thread+0xc5/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf810>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf810>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffffa07cf810>] ? obd_zombie_impexp_thread+0x0/0x1c0 [obdclass]
Mar 9 20:12:30 widow-oss11c2 kernel: [113076.102203] [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Mar 9 20:13:35 widow-oss11c2 kernel: [113140.803317] Lustre: server umount routed1-OST0231 complete
Mar 9 20:13:35 widow-oss11c2 kernel: [113140.819063] Lustre: Skipped 1 previous similar message



 Comments   
Comment by Bruno Faccini (Inactive) [ 21/Mar/13 ]

James,
It is only a soft-lockup trace/stack; was it a one-shot occurrence, or were there multiple?
How many OSTs were being stopped?
A full log would be welcome.

Comment by Oleg Drokin [ 21/Mar/13 ]

From the later messages it seems the unmount actually succeeded.
The only issue is that we are probably taking too long to unregister some buffers when there are many of them, so perhaps we should not be holding any locks over that operation?

Comment by James A Simmons [ 22/Mar/13 ]

Bruno, the soft-lockup only happened once, on a single OSS, when unmounting this particular OST. There were 376 OSTs in total, if I remember correctly. The others were fine.

Comment by Bruno Faccini (Inactive) [ 22/Mar/13 ]

Hmm, having a look at some disassembly of the remove_vm_area() routine, it seems the soft-lockup occurred in the for loop that searches for the matching vm_area:

struct vm_struct *remove_vm_area(const void *addr)
{
        struct vmap_area *va;

        va = find_vmap_area((unsigned long)addr);
        if (va && va->flags & VM_VM_AREA) {
                struct vm_struct *vm = va->private;

                if (!(vm->flags & VM_UNLIST)) {
                        struct vm_struct *tmp, **p;
                        /*
                         * remove from list and disallow access to
                         * this vm_struct before unmap. (address range
                         * confliction is maintained by vmap.)
                         */
                        write_lock(&vmlist_lock);
                        for (p = &vmlist; (tmp = *p) != vm; p = &tmp->next)  <<<<<<<<<<<<
                                ;
                        *p = tmp->next;
                        write_unlock(&vmlist_lock);
                }

                vmap_debug_free_range(va->va_start, va->va_end);
                free_unmap_vmap_area(va);
                vm->size -= PAGE_SIZE;

                return vm;
        }
        return NULL;
}

Thus it is kind of strange that this thread was the only one "impacted"!
Also, is it reasonable to think that it spent more than 67s in this loop and finally exited from it?
Maybe it spent most of that time trying to acquire write_lock(&vmlist_lock) first?

Comment by James A Simmons [ 28/Mar/13 ]

I found out from our admin that more than one OST umount had a problem. The logs you saw were for one that completed. The rest never completed, and after 30 minutes the admin rebooted the machine instead.

Comment by Oleg Drokin [ 28/Mar/13 ]

can we get logs from one of the nodes that did not complete unmounting, please?

Comment by James A Simmons [ 28/Mar/13 ]

Here is the entire log of the day for this OSS that crashed when unmounting.

Comment by Bruno Faccini (Inactive) [ 05/Apr/13 ]

Thanks for the log. Next time this kind of hang requires a hard reboot/reset, would it be possible to take a crash-dump instead?

Comment by Bruno Faccini (Inactive) [ 15/Apr/13 ]

James,
There is nothing more of interest in the log, thus we need at least a full backtrace (Alt+SysRq+T, or echo t > /proc/sysrq-trigger) taken at the time the soft-lockup is reported, and a forced crash-dump would be a must.
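For reference, the commands above could be run as follows (a sketch, to be run as root on the affected node; the final command immediately panics the machine and only produces a dump if kdump/kexec is configured):

```shell
# Make sure SysRq functions are enabled (may already be the default)
echo 1 > /proc/sys/kernel/sysrq

# Dump the stack of every task to the kernel log (same as Alt+SysRq+T)
echo t > /proc/sysrq-trigger

# Force a crash-dump -- requires kdump to be set up, as this panics the node
echo c > /proc/sysrq-trigger
```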

Comment by James A Simmons [ 23/Apr/13 ]

We have changed our policy now for test shots. For the last test shot we did not encounter this problem. If we don't encounter it at the next test shot, I say we can close the ticket.

Comment by Bruno Faccini (Inactive) [ 17/Jul/13 ]

Hello James,
How did the further test shots go in the end?
If OK, can we close this ticket? If not, have you been able to take a crash-dump upon a new occurrence?

Comment by James A Simmons [ 17/Jul/13 ]

Yes, you can close this ticket. If we encounter it again we can reopen, but we haven't seen this problem lately.

Comment by Bruno Faccini (Inactive) [ 17/Jul/13 ]

OK, thanks James.

Generated at Sat Feb 10 01:30:09 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.