Details
- Bug
- Resolution: Fixed
- Blocker
- Lustre 2.3.0, Lustre 2.1.4, Lustre 2.1.5
- None
- 3
- 4604
Description
This issue was created by maloo for Oleg Drokin <green@whamcloud.com>
This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/667d0d32-80d8-11e1-997d-525400d2bfa6.
MDT crashed:
07:51:48:Lustre: DEBUG MARKER: == replay-single test 44c: race in target handle connect ============================================= 04:51:46 (1333799506)
07:51:49:Turning device dm-0 (0xfd00000) read-only
07:51:50:Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
07:52:19:Lustre: lustre-OST0003-osc-MDT0000: Connection to service lustre-OST0003 via nid 172.29.3.36@tcp was lost; in progress operations using this service will wait for recovery to complete.
07:52:20:LustreError: 8320:0:(osc_create.c:614:osc_create()) lustre-OST0003-osc-MDT0000: oscc recovery failed: -11
07:52:20:LustreError: 8320:0:(lov_obd.c:1068:lov_clear_orphans()) error in orphan recovery on OST idx 3/7: rc = -11
07:52:20:LustreError: 8320:0:(mds_lov.c:884:__mds_lov_synchronize()) lustre-OST0003_UUID failed at mds_lov_clear_orphans: -11
07:52:20:LustreError: 8320:0:(mds_lov.c:905:__mds_lov_synchronize()) lustre-OST0003_UUID sync failed -11, deactivating
07:52:20:general protection fault: 0000 [#1] SMP
07:52:20:last sysfs file: /sys/devices/system/cpu/possible
07:52:20:CPU 0
07:52:21:Modules linked in: nfs fscache cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) lustre(U) lquota(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) jbd2 nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: obdecho]
07:52:21:
07:52:21:Pid: 8320, comm: ll_sync_03 Not tainted 2.6.32-220.4.2.el6_lustre.gf957caa.x86_64 #1 Red Hat KVM
07:52:21:RIP: 0010:[<ffffffffa08da03a>] [<ffffffffa08da03a>] lov_notify+0x1ba/0x1040 [lov]
07:52:21:RSP: 0018:ffff88007c625dc0 EFLAGS: 00010246
07:52:21:RAX: 5a5a5a5a5a5a5a5a RBX: ffff88007bdf42b8 RCX: 0000000000000000
07:52:21:RDX: 0000000000000000 RSI: ffff88005ad54ac0 RDI: ffff88007bdf4868
07:52:21:RBP: ffff88007c625e20 R08: 000000005a5a5a5a R09: 0000000000000000
07:52:21:R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
07:52:21:R13: 0000000000000003 R14: 0000000000000000 R15: ffff8800716228f8
07:52:21:FS: 00007f0f8a81e700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
07:52:21:CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
07:52:21:CR2: 00007ff18a036000 CR3: 0000000071be4000 CR4: 00000000000006f0
07:52:21:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
07:52:21:DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
07:52:22:Process ll_sync_03 (pid: 8320, threadinfo ffff88007c624000, task ffff88007c3fe100)
07:52:22:Stack:
07:52:22: 0000000000000286 01000000fffffff5 ffff8800716223b8 000000036a958c00
07:52:22:<0> ffff8800580664d8 00000000716223b8 ffff88007c625e50 ffff88007bdf42b8
07:52:23:<0> ffff8800716223b8 0000000000000003 ffff8800580664d8 ffff8800716223b8
07:52:23:Call Trace:
07:52:23: [<ffffffffa07ef93c>] obd_notify.clone.0+0xac/0x290 [mds]
07:52:23: [<ffffffffa07f2f74>] __mds_lov_synchronize+0x2c4/0x12e0 [mds]
07:52:23: [<ffffffff811a6433>] ? copy_fs_struct+0x83/0x90
07:52:23: [<ffffffffa07f47a0>] ? mds_lov_synchronize+0x0/0xb0 [mds]
07:52:23: [<ffffffffa07f47ff>] mds_lov_synchronize+0x5f/0xb0 [mds]
07:52:23: [<ffffffff81003330>] ? xen_emergency_restart+0x20/0x30
07:52:23: [<ffffffff8100c14a>] child_rip+0xa/0x20
07:52:23: [<ffffffffa07f47a0>] ? mds_lov_synchronize+0x0/0xb0 [mds]
07:52:23: [<ffffffffa07f47a0>] ? mds_lov_synchronize+0x0/0xb0 [mds]
07:52:23: [<ffffffff8100c140>] ? child_rip+0x0/0x20
07:52:24:Code: 20 05 00 00 49 81 c7 40 05 00 00 45 85 c0 0f 84 ed 05 00 00 45 31 f6 45 31 e4 0f 1f 80 00 00 00 00 48 8b 83 78 05 00 00 49 63 d4 <48> 8b 04 d0 48 85 c0 74 67 48 83 78 40 00 74 60 f6 05 d3 05 b8
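A note on reading the register dump above: RAX holds 5a5a5a5a5a5a5a5a and R08 holds 000000005a5a5a5a. A repeated-byte value like this often indicates a pointer loaded from poisoned (already freed) memory. It is also not a canonical x86-64 address, which would be consistent with a general protection fault rather than a page fault. Below is a minimal Python sketch for spotting such registers in an oops. Treating 0x5a as a free-poison byte is an assumption not stated in this issue, the canonical check assumes 48-bit virtual addresses, and the helper names are hypothetical.

```python
import re

# Register lines copied verbatim from the oops in this issue.
OOPS = """\
07:52:21:RAX: 5a5a5a5a5a5a5a5a RBX: ffff88007bdf42b8 RCX: 0000000000000000
07:52:21:RDX: 0000000000000000 RSI: ffff88005ad54ac0 RDI: ffff88007bdf4868
07:52:21:RBP: ffff88007c625e20 R08: 000000005a5a5a5a R09: 0000000000000000
"""

# Matches "RAX: <16 hex digits>" style pairs; RIP/RSP lines carry a
# segment prefix ("0010:") and are intentionally not matched.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b")


def find_poisoned(text, byte="5a"):
    """Return register names whose value is only `byte` repeated
    (at least 4 times), ignoring leading zeros.

    ASSUMPTION: `byte` is the allocator's free-poison pattern.
    """
    hits = []
    for name, value in REG_RE.findall(text):
        stripped = value.lstrip("0")
        n = len(stripped)
        if n >= 8 and n % 2 == 0 and stripped == byte * (n // 2):
            hits.append(name)
    return hits


def is_canonical(addr):
    """True if bits 63..47 of a 64-bit address are all equal
    (canonical form with 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == (1 << 17) - 1


if __name__ == "__main__":
    for reg in find_poisoned(OOPS):
        print(reg, "looks like poisoned memory")
    print("RAX canonical:", is_canonical(0x5A5A5A5A5A5A5A5A))
```

RSI (ffff88005ad54ac0) contains the byte 5a but is not flagged, because the check requires the whole non-zero part of the value to be the repeated pattern.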
Issue Links
- is duplicated by
  - LU-1391 lov device is released while some osc devices are still remaining on a Client node (Resolved)