Details
Bug
Resolution: Duplicate
Critical
None
Lustre 2.4.0
None
2
9581
Description
The system crashed after a reboot/remount of the OSTs. We have a crash dump and can upload it if needed.
Lustre: nbp7-OST002b: deleting orphan objects from 0x0:131250 to 0x0:131297
Lustre: Skipped 3 previous similar messages
Lustre: nbp7-OST0037: deleting orphan objects from 0x0:131412 to 0x0:131457
Lustre: nbp7-OST0027: deleting orphan objects from 0x0:131476 to 0x0:131585
-----------[ cut here ]-----------
WARNING: at mm/page_alloc.c:1361 get_page_from_freelist+0x818/0x830() (Tainted: G --------------- T)
-----------[ cut here ]-----------
WARNING: at mm/page_alloc.c:1361 get_page_from_freelist+0x818/0x830() (Tainted: G --------------- T)
Hardware name: SUMMIT
Modules linked in: osp(U) ofd(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) ldiskfs(U) lquota(U) jbd2 mdd(U) acpi_cpufreq freq_table mperf lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic libcfs(U) dm_round_robin scsi_dh_rdac lpfc(U) scsi_transport_fc nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc bonding 8021q garp stp llc ib_ucm(U) rdma_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_srp(U) scsi_transport_srp scsi_tgt ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U) mlx4_ib(U) ib_mad(U) ib_core(U) tcp_bic microcode sg i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support mlx4_core(U) memtrack(U) igb dca shpchp ext3 jbd isci libsas sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class ahci wmi dm_multipath dm_mirror dm_region_hash dm_log dm_mod gru [last unloaded: scsi_wait_scan]
31 out of 32 cpus in kdb, waiting for the rest, timeout in 10 second(s)
Pid: 24478, comm: ll_ost_io03_003 Tainted: G --------------- T 2.6.32-358.6.2.el6.20130607.x86_64.lustre240 #1
Call Trace:
[<ffffffff8106d607>] ? warn_slowpath_common+0x87/0xc0
[<ffffffff8106d65a>] ? warn_slowpath_null+0x1a/0x20
[<ffffffff81126ee8>] ? get_page_from_freelist+0x818/0x830
[<ffffffff81128303>] ? __alloc_pages_nodemask+0x113/0x8d0
[<ffffffff81128303>] ? __alloc_pages_nodemask+0x113/0x8d0
[<ffffffff81124bc6>] ? __rmqueue+0x156/0x490
[<ffffffff81163302>] ? kmem_getpages+0x62/0x170
[<ffffffff8116396f>] ? cache_grow+0x2cf/0x320
[<ffffffff81163bc2>] ? cache_alloc_refill+0x202/0x240
[<ffffffff811be01d>] ? bio_integrity_prep+0x7d/0x320
[<ffffffff81164a89>] ? __kmalloc+0x1a9/0x220
[<ffffffff811be01d>] ? bio_integrity_prep+0x7d/0x320
[<ffffffff81266ca0>] ? generic_make_request+0x2b0/0x550
[<ffffffff81118de3>] ? mempool_alloc+0x63/0x140
[<ffffffffa0015f67>] ? dm_merge_bvec+0xc7/0x100 [dm_mod]
[<ffffffff81266fcd>] ? submit_bio+0x8d/0x120
[<ffffffffa05b238e>] ? lprocfs_oh_tally+0x2e/0x50 [obdclass]
[<ffffffffa0c37dac>] ? osd_submit_bio+0x1c/0x60 [osd_ldiskfs]
[<ffffffffa0c38581>] ? osd_do_bio+0x791/0x810 [osd_ldiskfs]
[<ffffffffa03c802c>] ? fsfilt_map_nblocks+0xcc/0xf0 [fsfilt_ldiskfs]
[<ffffffffa03c82d5>] ? fsfilt_ldiskfs_map_inode_pages+0x85/0x90 [fsfilt_ldiskfs]
[<ffffffffa0c3af68>] ? osd_write_commit+0x328/0x610 [osd_ldiskfs]
[<ffffffffa0cd7cc4>] ? ofd_commitrw_write+0x684/0x11b0 [ofd]
[<ffffffffa0cdaa2d>] ? ofd_commitrw+0x5cd/0xbb0 [ofd]
[<ffffffffa02817e5>] ? lprocfs_counter_add+0x125/0x182 [lvfs]
[<ffffffffa0c9a1d8>] ? obd_commitrw+0x128/0x3d0 [ost]
[<ffffffffa0ca40d1>] ? ost_brw_write+0xea1/0x15d0 [ost]
[<ffffffff81128303>] ? __alloc_pages_nodemask+0x113/0x8d0
[<ffffffffa07301c0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
[<ffffffffa0caa32b>] ? ost_handle+0x3ecb/0x48e0 [ost]
[<ffffffffa0777bab>] ? ptlrpc_update_export_timer+0x4b/0x560 [ptlrpc]
[<ffffffffa0780388>] ? ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
[<ffffffffa048ab60>] ? cfs_alloc+0x30/0x60 [libcfs]
[<ffffffffa028154f>] ? lprocfs_stats_alloc_one+0x36f/0x390 [lvfs]
[<ffffffffa078171e>] ? ptlrpc_main+0xace/0x1700 [ptlrpc]
[<ffffffffa0780c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffff8100c0ca>] ? child_rip+0xa/0x20
[<ffffffffa0780c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffffa0780c50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20
--[ end trace b1c8704e155f856a ]--
BUG: unable to handle kernel NULL pointer dereference at 0000000000000036
IP: [<ffffffffa0c3af3f>] osd_write_commit+0x2ff/0x610 [osd_ldiskfs]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:80/0000:80:03.0/0000:86:00.1/host16/rport-16:0-0/target16:0:0/16:0:0:72/state
.All cpus are now in kdb
Entering kdb (current=0xffff881f9c5d0040, pid 24479) on processor 30 Oops: (null)
due to oops @ 0xffffffffa0c3af3f
r15 = 0x0000000000000100 r14 = 0x0000000000000100
r13 = 0xffff881f99ad3000 r12 = 0xffff880f1eac3a10
bp = 0xffff881f99acb940 bx = 0x0000000000400000
r11 = 0x0000000000000000 r10 = 0xffff881f99ac8f28
r9 = 0x0000000000000000 r8 = 0x0000000000000001
ax = 0xfffffffffffffffe cx = 0xffff881f99947000
dx = 0x0000000000000100 si = 0xffff881f99bdd800
di = 0xffff880f1eac3a10 orig_ax = 0xffffffffffffffff
ip = 0xffffffffa0c3af3f cs = 0x0000000000000010
flags = 0x0000000000010246 sp = 0xffff881f99acb8d0
ss = 0x0000000000000018 &regs = 0xffff881f99acb838