Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major
-
None
-
Lustre 2.7.0, Lustre 2.10.0
-
None
-
RHEL6
-
3
Description
We have just upgraded our servers to Lustre 2.7. Since then, one of the MDSs has hit the following assertion:
Message from syslogd@cs04r-sc-mds03-02 at Jun 9 16:56:29 ...
kernel:LustreError: 7605:0:(osd_handler.c:2530:osd_object_destroy()) ASSERTION( !lu_object_is_dying(dt->do_lu.lo_header) ) failed:
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: LustreError: 7605:0:(osd_handler.c:2530:osd_object_destroy()) LBUG
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: Pid: 7605, comm: mdt02_006
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel:
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: Call Trace:
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0410895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0410e97>] lbug_with_loc+0x47/0xb0 [libcfs]
Could you advise a suitable course of action?
Activity
I created the bug https://jira.hpdd.intel.com/browse/LU-8496 (Race is changelog clear path). The assertion is different, but the code path seems to be the same, so please review.
Mike, we had the following fix for Lustre 2.1:
MRP-1443 llog: avoid llog cancel race

Concurrently running two or more lfs changelog_clear need to be protected against races. llog_process_thread() used to read llogs without taking into account that the llog being read may be destroyed by another process. This patch serializes changelog cancellings using llog_ctxt's mutex.

diff --git a/lustre/mdd/mdd_device.c b/lustre/mdd/mdd_device.c
index 1642f0a..8140208 100644
--- a/lustre/mdd/mdd_device.c
+++ b/lustre/mdd/mdd_device.c
@@ -386,6 +386,7 @@ int mdd_changelog_llog_cancel(const struct lu_env *env,
         if (ctxt == NULL)
                 return -ENXIO;
+        cfs_mutex_lock(&ctxt->loc_mutex);
         cfs_spin_lock(&mdd->mdd_cl.mc_lock);
         cur = (long long)mdd->mdd_cl.mc_index;
         cfs_spin_unlock(&mdd->mdd_cl.mc_lock);
@@ -413,6 +414,7 @@ int mdd_changelog_llog_cancel(const struct lu_env *env,
         rc = llog_cancel(ctxt, NULL, 1, (struct llog_cookie *)&endrec, 0);
 out:
+        cfs_mutex_unlock(&ctxt->loc_mutex);
         llog_ctxt_put(ctxt);
         return rc;
 }
Do you think it might be useful for 2.7+?
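For readers unfamiliar with the pattern, the serialization that the patch describes can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: `struct demo_ctxt`, its fields, and `demo_changelog_cancel()` are hypothetical names invented for the sketch, a pthread mutex stands in for `loc_mutex`, and none of this is Lustre code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a changelog context; not a Lustre structure. */
struct demo_ctxt {
	pthread_mutex_t lock;     /* plays the role of loc_mutex in the patch */
	long long first_live_idx; /* lowest record index not yet cancelled */
	long long last_idx;       /* highest record index written so far */
	int destroy_calls;        /* how many times records were really freed */
};

/*
 * Cancel all records up to and including endrec.
 * Returns 0 on success, -ENOENT if those records are already gone,
 * and -EINVAL if endrec is beyond the last record written.
 * The whole check-then-free sequence runs under one mutex, so two
 * concurrent callers can never both free the same records.
 */
static int demo_changelog_cancel(struct demo_ctxt *c, long long endrec)
{
	int rc = 0;

	pthread_mutex_lock(&c->lock);
	if (endrec > c->last_idx) {
		rc = -EINVAL;
	} else if (endrec < c->first_live_idx) {
		rc = -ENOENT;
	} else {
		c->first_live_idx = endrec + 1;
		c->destroy_calls++;
	}
	pthread_mutex_unlock(&c->lock);
	return rc;
}
```

With a context initialised as `{ PTHREAD_MUTEX_INITIALIZER, 1, 200, 0 }`, a first call with `endrec = 100` succeeds and a second identical call returns `-ENOENT` without freeing anything twice. That is the behaviour one would hope for from two overlapping `lfs changelog_clear` invocations.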
This system was upgraded from 2.5, and I have only seen this once. We did, however, have an issue with an MDS not responding after a minor network outage, but I do not have a compelling set of logs to suggest the two are related.
I will put the console output below:
Jun 8 13:27:57 cs04r-sc-mds03-01 kernel: LNet: There was an unexpected network error while writing to 172.23.148.22: -110.
Jun 8 13:30:32 cs04r-sc-mds03-01 kernel: Lustre: MGS: haven't heard from client de0451fe-3e87-4bcb-2ca6-d2af988671be (at 172.23.148.35@tcp) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881fcf0bcc00, cur 1433766632 expire 1433766482 last 1433766405
Jun 8 13:30:32 cs04r-sc-mds03-01 kernel: Lustre: Skipped 1 previous similar message
Jun 8 13:30:48 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client bb255a22-f3c1-835b-8049-eab34c95ba65 (at 172.23.148.64@tcp) reconnecting
Jun 8 13:30:59 cs04r-sc-mds03-01 kernel: Lustre: lustre03-MDT0000: Client db5a1353-f37b-fe0a-ccf8-9bc50f7a62ad (at 172.23.148.65@tcp) reconnecting
Jun 8 13:31:03 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client b85575c0-8d63-0c39-a18e-c25179bf68dd (at 172.23.148.26@tcp) reconnecting
Jun 8 13:31:08 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client 0e2a3416-2996-0da3-aab5-16ab1d68433f (at 172.23.148.24@tcp) reconnecting
Jun 8 13:31:08 cs04r-sc-mds03-01 kernel: Lustre: Skipped 2 previous similar messages
Jun 8 13:31:38 cs04r-sc-mds03-01 kernel: Lustre: MGS: Client 8945eb8e-242f-a306-9ce7-98c47b58cd6c (at 172.23.148.38@tcp) reconnecting
Jun 8 13:31:38 cs04r-sc-mds03-01 kernel: Lustre: Skipped 1 previous similar message
Jun 8 13:38:59 cs04r-sc-mds03-01 kernel: LustreError: 20218:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 8 13:38:59 cs04r-sc-mds03-01 kernel: LustreError: 20218:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 18 previous similar messages
Jun 8 13:38:59 cs04r-sc-mds03-01 kernel: LustreError: 20218:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 52222 of catalog 0x8:10 rc=-2
Jun 8 13:38:59 cs04r-sc-mds03-01 kernel: LustreError: 20218:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 18 previous similar messages
Jun 8 13:49:04 cs04r-sc-mds03-01 kernel: LustreError: 18959:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 8 13:49:04 cs04r-sc-mds03-01 kernel: LustreError: 18959:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 17 previous similar messages
Jun 8 13:49:04 cs04r-sc-mds03-01 kernel: LustreError: 18959:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 52247 of catalog 0x8:10 rc=-2
Jun 8 13:49:04 cs04r-sc-mds03-01 kernel: LustreError: 18959:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 17 previous similar messages
Jun 8 14:02:30 cs04r-sc-mds03-01 kernel: LustreError: 18965:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 8 14:02:30 cs04r-sc-mds03-01 kernel: LustreError: 18965:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 25 previous similar messages
Jun 8 14:02:30 cs04r-sc-mds03-01 kernel: LustreError: 18965:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 52272 of catalog 0x8:10 rc=-2
Jun 8 14:02:30 cs04r-sc-mds03-01 kernel: LustreError: 18965:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 25 previous similar messages
Jun 8 14:04:33 cs04r-sc-mds03-01 kernel: LustreError: 19000:0:(llog_cat.c:163:llog_cat_id2handle()) lustre03-MDD0000: error opening log id 0x10f8e:1:0: rc = -2
Jun 8 14:04:33 cs04r-sc-mds03-01 kernel: LustreError: 19000:0:(llog_cat.c:537:llog_cat_process_cb()) lustre03-MDD0000: cannot find handle for llog 0x10f8e:1: -2
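The `rc` values in these messages are negated errno codes. On Linux, -2 is `-ENOENT` (the llog record or file was not found) and -110 is `-ETIMEDOUT`. A small helper for decoding such values when reading logs might look like the sketch below; `demo_rc_str()` is a hypothetical name, and only the standard `strerror()` call is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Translate a kernel-style return code (a negated errno such as -2)
 * into the C library's message for that errno. Positive values are
 * accepted as well and treated as plain errno numbers.
 */
static const char *demo_rc_str(int rc)
{
	return strerror(rc < 0 ? -rc : rc);
}
```

Calling `demo_rc_str(-2)` yields the C library's text for `ENOENT`, which fits the "cannot find handle for llog" message above.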
Also, is this hitting repeatedly, or did it go away when the system was restarted?
Mike, it looks like this is related to llog handling during ChangeLog processing. Is it possible there is a race with multiple threads cancelling the same records? In any case, there shouldn't be an LASSERT() when deleting a log file if the file is already being deleted?
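To make the suggestion above concrete, here is a minimal user-space sketch of a destroy routine that reports an error instead of asserting when the object is already being destroyed. It is not how `osd_object_destroy()` is implemented, and it is not a proposed patch: `struct demo_object` and `demo_object_destroy()` are hypothetical, and a C11 `atomic_flag` stands in for whatever "dying" state the real object header keeps.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical object with a "dying" marker; not a Lustre structure. */
struct demo_object {
	atomic_flag dying; /* set once some caller has begun destroying it */
};

/*
 * Destroy the object at most once.
 * The first caller atomically sets the flag and does the work; any
 * later or concurrent caller sees the flag already set and gets
 * -ENOENT back instead of tripping an assertion.
 */
static int demo_object_destroy(struct demo_object *o)
{
	if (atomic_flag_test_and_set(&o->dying))
		return -ENOENT;

	/* ... the real teardown would happen here ... */
	return 0;
}
```

The design choice illustrated is to treat "already being deleted" as a normal, reportable outcome of a race between cancellers, so that losing the race produces an error code rather than an LBUG.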
Preceding messages:
Jun 9 11:28:23 cs04r-sc-mds03-02 kernel: LustreError: 8077:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 11:28:23 cs04r-sc-mds03-02 kernel: LustreError: 8077:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 52990 of catalog 0x8:10 rc=-2 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81450a6a>] ? skb_release_head_state+0x6a/0x110 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8145086e>] ? __kfree_skb+0x1e/0xa0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa0310182>] ? 
bnx2x_drain_tx_queues+0xd2/0x140 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 11:34:22 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? 
start_secondary+0x2be/0x301 Jun 9 11:35:26 cs04r-sc-mds03-02 kernel: LustreError: 7140:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 11:35:26 cs04r-sc-mds03-02 kernel: LustreError: 7140:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53002 of catalog 0x8:10 rc=-2 Jun 9 11:36:24 cs04r-sc-mds03-02 kernel: LustreError: 8124:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 11:36:24 cs04r-sc-mds03-02 kernel: LustreError: 8124:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53004 of catalog 0x8:10 rc=-2 Jun 9 11:57:34 cs04r-sc-mds03-02 kernel: LustreError: 8149:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 11:57:34 cs04r-sc-mds03-02 kernel: LustreError: 8149:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53028 of catalog 0x8:10 rc=-2 Jun 9 12:06:33 cs04r-sc-mds03-02 kernel: LustreError: 7605:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 12:06:33 cs04r-sc-mds03-02 kernel: LustreError: 7605:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53035 of catalog 0x8:10 rc=-2 Jun 9 12:21:21 cs04r-sc-mds03-02 kernel: LustreError: 8118:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 12:21:21 cs04r-sc-mds03-02 kernel: LustreError: 8118:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53047 of catalog 0x8:10 rc=-2 Jun 9 12:26:42 cs04r-sc-mds03-02 kernel: LustreError: 8086:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 12:26:42 cs04r-sc-mds03-02 kernel: LustreError: 8086:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53051 of catalog 0x8:10 
rc=-2 Jun 9 13:12:41 cs04r-sc-mds03-02 kernel: LustreError: 8124:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 13:12:41 cs04r-sc-mds03-02 kernel: LustreError: 8124:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53090 of catalog 0x8:10 rc=-2 Jun 9 13:48:26 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 13:48:26 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53117 of catalog 0x8:10 rc=-2 Jun 9 13:53:05 cs04r-sc-mds03-02 kernel: LustreError: 7433:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 13:53:05 cs04r-sc-mds03-02 kernel: LustreError: 7433:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53120 of catalog 0x8:10 rc=-2 Jun 9 13:58:31 cs04r-sc-mds03-02 kernel: LustreError: 7176:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 13:58:31 cs04r-sc-mds03-02 kernel: LustreError: 7176:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53124 of catalog 0x8:10 rc=-2 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: kswapd0: page allocation failure. order:2, mode:0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: Pid: 340, comm: kswapd0 Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? 
fallback_alloc+0x1ba/0x270 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa03153c7>] ? bnx2x_rx_int+0xfc7/0x1670 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa03122a1>] ? bnx2x_msix_fp_int+0xd1/0x170 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d90f>] ? __do_softirq+0x11f/0x1e0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff81175f52>] ? 
kfree+0x122/0x320 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0eb605d>] ? osd_object_free+0x11d/0x160 [osd_ldiskfs] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0573d43>] ? lu_object_free+0x113/0x1a0 [obdclass] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0574c07>] ? lu_site_purge+0x2e7/0x4f0 [obdclass] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0574f98>] ? lu_cache_shrink+0x188/0x310 [obdclass] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8113d8da>] ? shrink_slab+0x11a/0x1a0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81140c5a>] ? balance_pgdat+0x57a/0x800 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81141014>] ? kswapd+0x134/0x3b0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81140ee0>] ? kswapd+0x0/0x3b0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: kswapd0: page allocation failure. order:2, mode:0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: Pid: 340, comm: kswapd0 Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? 
bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa03153c7>] ? bnx2x_rx_int+0xfc7/0x1670 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa03122a1>] ? bnx2x_msix_fp_int+0xd1/0x170 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d90f>] ? __do_softirq+0x11f/0x1e0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff81175f52>] ? kfree+0x122/0x320 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0eb605d>] ? osd_object_free+0x11d/0x160 [osd_ldiskfs] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0573d43>] ? lu_object_free+0x113/0x1a0 [obdclass] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0574c07>] ? 
lu_site_purge+0x2e7/0x4f0 [obdclass] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0574f98>] ? lu_cache_shrink+0x188/0x310 [obdclass] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8113d8da>] ? shrink_slab+0x11a/0x1a0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81140c5a>] ? balance_pgdat+0x57a/0x800 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81141014>] ? kswapd+0x134/0x3b0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81140ee0>] ? kswapd+0x0/0x3b0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff814b9133>] ? tcp_v4_do_rcv+0x2e3/0x490 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? 
bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81450a6a>] ? skb_release_head_state+0x6a/0x110 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa03102f2>] ? bnx2x_napi_disable_cnic+0x102/0x120 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 14:00:47 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? 
start_secondary+0x2be/0x301 Jun 9 14:20:37 cs04r-sc-mds03-02 kernel: LustreError: 8091:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 14:20:37 cs04r-sc-mds03-02 kernel: LustreError: 8091:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53140 of catalog 0x8:10 rc=-2 Jun 9 14:25:39 cs04r-sc-mds03-02 kernel: LustreError: 8075:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 14:25:39 cs04r-sc-mds03-02 kernel: LustreError: 8075:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53144 of catalog 0x8:10 rc=-2 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? 
swiotlb_sync_single+0x28/0xd0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81450897>] ? __kfree_skb+0x47/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa03102fa>] ? bnx2x_napi_disable_cnic+0x10a/0x120 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? start_secondary+0x2be/0x301 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. 
order:2, mode:0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa03153c7>] ? bnx2x_rx_int+0xfc7/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8101fa8a>] ? amd_pmu_cpu_prepare+0x7a/0x100 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81039787>] ? native_apic_msr_write+0x37/0x40 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81030042>] ? generic_set_all+0xb2/0x340 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? 
__do_softirq+0xc1/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? start_secondary+0x2be/0x301 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: mdt00_028: page allocation failure. order:2, mode:0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Pid: 8137, comm: mdt00_028 Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? 
bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81064ba2>] ? default_wake_function+0x12/0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810577e9>] ? __wake_up_common+0x59/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d90f>] ? __do_softirq+0x11f/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81133ea0>] ? drain_local_pages+0x0/0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff810b743e>] ? smp_call_function_many+0x1ee/0x260 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81133ea0>] ? drain_local_pages+0x0/0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810b74d2>] ? smp_call_function+0x22/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d594>] ? on_each_cpu+0x24/0x50 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81131d8c>] ? 
drain_all_pages+0x1c/0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8113465d>] ? __alloc_pages_nodemask+0x5ed/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa07c5bd5>] ? null_alloc_rs+0xc5/0x390 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa07c5bd5>] ? null_alloc_rs+0xc5/0x390 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa07b4684>] ? sptlrpc_svc_alloc_rs+0x74/0x360 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa078afad>] ? lustre_pack_reply_v2+0x9d/0x280 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa078b236>] ? lustre_pack_reply_flags+0xa6/0x1e0 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa078b381>] ? lustre_pack_reply+0x11/0x20 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa07b1ec3>] ? req_capsule_server_pack+0x53/0x100 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa1125bf5>] ? mdt_getxattr+0x635/0x1470 [mdt] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa11042c5>] ? mdt_object_lock_internal+0x65/0x360 [mdt] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa110472c>] ? mdt_intent_getxattr+0x9c/0x150 [mdt] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa05755b6>] ? lu_object_find+0x16/0x20 [obdclass] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa10fbcf4>] ? mdt_intent_policy+0x494/0xce0 [mdt] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa073f4f9>] ? ldlm_lock_enqueue+0x129/0x9d0 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa076b46b>] ? 
ldlm_handle_enqueue0+0x51b/0x13f0 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0422c8a>] ? lc_watchdog_touch+0x7a/0x190 [libcfs] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa07eb921>] ? tgt_enqueue+0x61/0x230 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa07ec56e>] ? tgt_request_handle+0x8be/0x1000 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa079c5a1>] ? ptlrpc_main+0xe41/0x1960 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa079b760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? 
bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81450897>] ? __kfree_skb+0x47/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa03101b2>] ? bnx2x_drain_tx_queues+0x102/0x140 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? start_secondary+0x2be/0x301 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. 
order:2, mode:0x20 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff814b9133>] ? tcp_v4_do_rcv+0x2e3/0x490 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? 
irq_exit+0x85/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? start_secondary+0x2be/0x301 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81450897>] ? __kfree_skb+0x47/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? 
bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0310262>] ? bnx2x_napi_disable_cnic+0x72/0x120 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 14:28:14 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? 
start_secondary+0x2be/0x301
Jun 9 14:29:55 cs04r-sc-mds03-02 kernel: LustreError: 8125:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 14:29:55 cs04r-sc-mds03-02 kernel: LustreError: 8125:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53147 of catalog 0x8:10 rc=-2
Jun 9 14:30:06 cs04r-sc-mds03-02 kernel: LustreError: 8130:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 14:30:06 cs04r-sc-mds03-02 kernel: LustreError: 8130:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53147 of catalog 0x8:10 rc=-2
Jun 9 14:52:27 cs04r-sc-mds03-02 kernel: LustreError: 8089:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 14:52:27 cs04r-sc-mds03-02 kernel: LustreError: 8089:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53177 of catalog 0x8:10 rc=-2
Jun 9 14:56:40 cs04r-sc-mds03-02 kernel: LustreError: 8121:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 14:56:40 cs04r-sc-mds03-02 kernel: LustreError: 8121:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53181 of catalog 0x8:10 rc=-2
Jun 9 15:06:27 cs04r-sc-mds03-02 kernel: LustreError: 8081:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:06:27 cs04r-sc-mds03-02 kernel: LustreError: 8081:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53191 of catalog 0x8:10 rc=-2
Jun 9 15:06:37 cs04r-sc-mds03-02 kernel: LustreError: 8137:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:06:37 cs04r-sc-mds03-02 kernel: LustreError: 8137:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53192 of catalog 0x8:10 rc=-2
Jun 9 15:09:51 cs04r-sc-mds03-02 kernel: LustreError: 8148:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:09:51 cs04r-sc-mds03-02 kernel: LustreError: 8148:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53196 of catalog 0x8:10 rc=-2
Jun 9 15:19:17 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:19:17 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53204 of catalog 0x8:10 rc=-2
Jun 9 15:28:50 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:28:50 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53212 of catalog 0x8:10 rc=-2
Jun 9 15:37:10 cs04r-sc-mds03-02 kernel: LustreError: 8089:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:37:10 cs04r-sc-mds03-02 kernel: LustreError: 8089:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53219 of catalog 0x8:10 rc=-2
Jun 9 15:40:18 cs04r-sc-mds03-02 kernel: LustreError: 8107:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:40:18 cs04r-sc-mds03-02 kernel: LustreError: 8107:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53223 of catalog 0x8:10 rc=-2
Jun 9 15:42:49 cs04r-sc-mds03-02 kernel: LustreError: 7142:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 15:42:49 cs04r-sc-mds03-02 kernel: LustreError: 7142:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53232 of catalog 0x8:10 rc=-2
Jun 9 15:47:41
cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2 Jun 9 15:47:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 9 previous similar messages Jun 9 15:47:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53263 of catalog 0x8:10 rc=-2 Jun 9 15:47:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 9 previous similar messages Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff814b9133>] ? tcp_v4_do_rcv+0x2e3/0x490 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? 
swiotlb_sync_single+0x28/0xd0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa031055a>] ? bnx2x_free_msix_irqs+0x8a/0x190 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? start_secondary+0x2be/0x301 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: swapper: page allocation failure. order:2, mode:0x20 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? 
__alloc_pages_nodemask+0x74a/0x8d0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa031250c>] ? bnx2x_free_tx_pkt+0x1cc/0x2e0 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa03102fa>] ? bnx2x_napi_disable_cnic+0x10a/0x120 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff810b0760>] ? tick_sched_timer+0x0/0xc0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? 
irq_exit+0x85/0x90 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? start_secondary+0x2be/0x301 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-504.8.1.el6_lustre.x86_64 #1 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: Call Trace: Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: <IRQ> [<ffffffff811347ba>] ? __alloc_pages_nodemask+0x74a/0x8d0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff811736e2>] ? kmem_getpages+0x62/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff811742fa>] ? fallback_alloc+0x1ba/0x270 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81173d4f>] ? cache_grow+0x2cf/0x320 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81174079>] ? ____cache_alloc_node+0x99/0x160 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81174cc9>] ? __kmalloc+0x199/0x230 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa030fa27>] ? bnx2x_frag_alloc+0x17/0x20 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0314277>] ? bnx2x_alloc_rx_data+0x47/0x1d0 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff812a2c88>] ? swiotlb_sync_single+0x28/0xd0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0314ea9>] ? bnx2x_rx_int+0xaa9/0x1670 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8145a2c3>] ? __napi_complete+0x23/0x40 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffffa0315c8f>] ? 
bnx2x_poll+0x10f/0x400 [bnx2x] Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81462a23>] ? net_rx_action+0x103/0x2f0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff810eaec0>] ? handle_IRQ_event+0x60/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100fb55>] ? do_softirq+0x65/0xa0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81534405>] ? do_IRQ+0x75/0xf0 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: <EOI> [<ffffffff812eaf5e>] ? intel_idle+0xde/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff812eaf41>] ? intel_idle+0xc1/0x170 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81426517>] ? cpuidle_idle_call+0xa7/0x140 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 Jun 9 16:00:44 cs04r-sc-mds03-02 kernel: [<ffffffff815236e7>] ? 
start_secondary+0x2be/0x301
Jun 9 16:01:46 cs04r-sc-mds03-02 kernel: LustreError: 8137:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 16:01:46 cs04r-sc-mds03-02 kernel: LustreError: 8137:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 14 previous similar messages
Jun 9 16:01:46 cs04r-sc-mds03-02 kernel: LustreError: 8137:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53286 of catalog 0x8:10 rc=-2
Jun 9 16:01:46 cs04r-sc-mds03-02 kernel: LustreError: 8137:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 14 previous similar messages
Jun 9 16:18:57 cs04r-sc-mds03-02 kernel: LustreError: 8081:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 16:18:57 cs04r-sc-mds03-02 kernel: LustreError: 8081:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 3 previous similar messages
Jun 9 16:18:57 cs04r-sc-mds03-02 kernel: LustreError: 8081:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53302 of catalog 0x8:10 rc=-2
Jun 9 16:18:57 cs04r-sc-mds03-02 kernel: LustreError: 8081:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 3 previous similar messages
Jun 9 16:40:16 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 16:40:16 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 2 previous similar messages
Jun 9 16:40:16 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53323 of catalog 0x8:10 rc=-2
Jun 9 16:40:16 cs04r-sc-mds03-02 kernel: LustreError: 8133:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 2 previous similar messages
Jun 9 16:41:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 16:41:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 2 previous similar messages
Jun 9 16:41:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53326 of catalog 0x8:10 rc=-2
Jun 9 16:41:41 cs04r-sc-mds03-02 kernel: LustreError: 7408:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 2 previous similar messages
Jun 9 16:45:02 cs04r-sc-mds03-02 kernel: LustreError: 8118:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 16:45:02 cs04r-sc-mds03-02 kernel: LustreError: 8118:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 1 previous similar message
Jun 9 16:45:02 cs04r-sc-mds03-02 kernel: LustreError: 8118:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53331 of catalog 0x8:10 rc=-2
Jun 9 16:45:02 cs04r-sc-mds03-02 kernel: LustreError: 8118:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 1 previous similar message
Jun 9 16:50:09 cs04r-sc-mds03-02 kernel: LustreError: 7548:0:(llog_cat.c:508:llog_cat_cancel_records()) lustre03-MDD0000: fail to cancel 0 of 1 llog-records: rc = -2
Jun 9 16:50:09 cs04r-sc-mds03-02 kernel: LustreError: 7548:0:(llog_cat.c:508:llog_cat_cancel_records()) Skipped 3 previous similar messages
Jun 9 16:50:09 cs04r-sc-mds03-02 kernel: LustreError: 7548:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: cancel idx 53340 of catalog 0x8:10 rc=-2
Jun 9 16:50:09 cs04r-sc-mds03-02 kernel: LustreError: 7548:0:(mdd_device.c:260:llog_changelog_cancel()) Skipped 3 previous similar messages
The actual failure:
Jun 9 16:55:05 cs04r-sc-mds03-02 kernel: LustreError: 8140:0:(llog_cat.c:163:llog_cat_id2handle()) lustre03-MDD0000: error opening log id 0x11564:1:0: rc = -2
Jun 9 16:55:05 cs04r-sc-mds03-02 kernel: LustreError: 8140:0:(llog_cat.c:537:llog_cat_process_cb()) lustre03-MDD0000: cannot find handle for llog 0x11564:1: -2
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: LustreError: 7605:0:(osd_handler.c:2530:osd_object_destroy()) ASSERTION(!lu_object_is_dying(dt->do_lu.lo_header) ) failed:
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: LustreError: 7605:0:(osd_handler.c:2530:osd_object_destroy()) LBUG
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: Pid: 7605, comm: mdt02_006
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel:
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: Call Trace:
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0410895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0410e97>] lbug_with_loc+0x47/0xb0 [libcfs]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0ebc011>] osd_object_destroy+0x3b1/0x460 [osd_ldiskfs]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0536cf4>] llog_osd_destroy+0x5d4/0xd40 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052d3f1>] llog_destroy+0x51/0x170 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052f0c8>] llog_cat_process_cb+0x3a8/0x5f0 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052a5a9>] llog_process_thread+0xaa9/0xe80 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052ed20>] ? llog_cat_process_cb+0x0/0x5f0 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052aabf>] llog_process_or_fork+0x13f/0x540 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052dc5d>] llog_cat_process_or_fork+0x1ad/0x300 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa1242f50>] ? llog_changelog_cancel_cb+0x0/0x1d0 [mdd]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa052ddc9>] llog_cat_process+0x19/0x20 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa1242d7f>] llog_changelog_cancel+0x5f/0x230 [mdd]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa04211c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa0530df8>] llog_cancel+0x58/0x240 [obdclass]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa12491aa>] mdd_changelog_user_purge+0x46a/0x6f0 [mdd]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa1249a8c>] mdd_iocontrol+0x65c/0xb70 [mdd]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa10f8119>] mdt_ioc_child+0x149/0x1d0 [mdt]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa1104c4b>] mdt_iocontrol+0x2fb/0x8e0 [mdt]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa11054b1>] mdt_set_info+0x281/0x430 [mdt]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa078b381>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa07ec56e>] tgt_request_handle+0x8be/0x1000 [ptlrpc]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa079c5a1>] ptlrpc_main+0xe41/0x1960 [ptlrpc]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffff8106c4f0>] ? pick_next_task_fair+0xd0/0x130
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffffa079b760>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffff8109e66e>] kthread+0x9e/0xc0
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Jun 9 16:56:29 cs04r-sc-mds03-02 kernel:
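For triage, the repeated llog_changelog_cancel() errors that precede the LBUG (all with rc=-2, i.e. -ENOENT) can be pulled out of a syslog dump with a short script. This is only a sketch: the function name cancel_failures and the sample string are illustrative and not part of any Lustre tooling, and the regular expression assumes the exact message layout shown in this ticket.

```python
import re

# Assumed layout, copied from the messages in this ticket:
#   <date> <host> kernel: LustreError: <pid>:0:(mdd_device.c:<line>:
#   llog_changelog_cancel()) <device>: cancel idx <N> of catalog <id> rc=<rc>
CANCEL_RE = re.compile(
    r"(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+LustreError:\s+"
    r"(?P<pid>\d+):\d+:\(mdd_device\.c:\d+:llog_changelog_cancel\(\)\)\s+"
    r"\S+:\s+cancel idx (?P<idx>\d+) of catalog \S+ rc=(?P<rc>-?\d+)"
)


def cancel_failures(text):
    """Return (timestamp, pid, changelog index, rc) per failed cancel."""
    return [
        (m["ts"], int(m["pid"]), int(m["idx"]), int(m["rc"]))
        for m in CANCEL_RE.finditer(text)
    ]


# Two entries taken from the log in this ticket, joined on one line the way
# the pasted log appears. "Skipped N previous similar messages" lines do not
# match the pattern and are ignored.
sample = (
    "Jun 9 14:29:55 cs04r-sc-mds03-02 kernel: LustreError: "
    "8125:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: "
    "cancel idx 53147 of catalog 0x8:10 rc=-2 "
    "Jun 9 14:30:06 cs04r-sc-mds03-02 kernel: LustreError: "
    "8130:0:(mdd_device.c:260:llog_changelog_cancel()) lustre03-MDD0000: "
    "cancel idx 53147 of catalog 0x8:10 rc=-2"
)

for ts, pid, idx, rc in cancel_failures(sample):
    print(ts, pid, idx, rc)
```

In the sample, two different service threads (8125 and 8130) fail to cancel the same index 53147 eleven seconds apart. Whether that reflects concurrent changelog clears is an inference from the log, not something the ticket confirms.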
Outdated issue. Several patches have since landed to fix llog races and related issues, so this may already be fixed. Please reopen if it appears again.