Details
- Bug
- Resolution: Duplicate
- Minor
- None
- Lustre 1.8.9
- None
- CentOS6.4
- 3
- 12207
Description
When 2 of the 32 OSTs became unhealthy (their LUNs went offline and the Lustre server reported I/O errors and refused service), accessing files striped across those OSTs caused a client kernel panic. The client log from the time of the panic follows:
Jan 4 01:12:43 trestles-2-17.sdsc.edu: kernel: LustreError: 3595:0:(osc_request.c:1652:osc_brw_redo_request()) @@@ redo for recoverable error 5 req@ffff881023c69c00 x1454016746327144/t0 o3->puma-OST0000_UUID@172.25.33.113@tcp:6/4 lens 448/592 e 0 to 1 dl 1388826770 ref 2 fl Interpret:R/0/0 rc -5/-5
Jan 4 01:12:43 trestles-2-17.sdsc.edu: kernel: LustreError: 3595:0:(osc_request.c:1652:osc_brw_redo_request()) Skipped 2 previous similar messages
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: LustreError: 3595:0:(osc_request.c:2330:brw_interpret()) puma-OST0000-osc-ffff880c2515e400: too many resent retries for object: 23302047, rc = -5.
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: LustreError: 3595:0:(osc_request.c:2357:brw_interpret()) ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) failed
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: LustreError: 3595:0:(osc_request.c:2357:brw_interpret()) LBUG
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Pid: 3595, comm: ptlrpcd
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel:
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Call Trace:
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: IP: [<(null)>] (null)
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: PGD 0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Oops: 0010 [#1] SMP
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: last sysfs file: /sys/devices/system/node/node7/meminfo
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: CPU 30
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ptlrpc(U) nfs lockd fscache auth_rpcgss nfs_acl limic(U) knem(U) autofs4 ksocklnd(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) sunrpc ipmi_devintf ipt_REJECT iptable_filter ip_tables rdma_ucm(U) ib_ucm(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ipv6 ib_uverbs(U) ib_umad(U) mlx4_vnic(U) mlx4_en(U) mlx4_ib(U) ib_sa(U) ib_mad(U) ib_core(U) mlx4_core(U) compat(U) tcp_htcp igb dca ptp pps_core microcode sg serio_raw k10temp amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core shpchp ext4 jbd2 mbcache sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel:
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Pid: 3595, comm: ptlrpcd Tainted: G W --------------- 2.6.32-358.23.2.el6.x86_64 #1 Supermicro H8QG6/H8QG6
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RIP: 0010:[<0000000000000000>] [<(null)>] (null)
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RSP: 0018:ffff8807d27bdb48 EFLAGS: 00010246
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RAX: ffff8807d27bdbac RBX: ffff8807d27bdba0 RCX: ffffffffa0366260
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RDX: ffff8807d27bdbe0 RSI: ffff8807d27bdba0 RDI: ffff8807d27bc000
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RBP: ffff8807d27bdbe0 R08: 0000000000000000 R09: 0000000000000000
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: R10: 0000000000000003 R11: 0000000000000000 R12: 000000000000cbe0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: R13: ffffffffa0366260 R14: 0000000000000000 R15: ffff880e2f483fc0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: FS: 00002b844dc5ed80(0000) GS:ffff880e2f480000(0000) knlGS:0000000000000000
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000000007e0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Process ptlrpcd (pid: 3595, threadinfo ffff8807d27bc000, task ffff88082474e040)
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Stack:
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: ffffffff8100e4a0 ffff8807d27bdbac ffff88082474e040 ffffffffa0699f78
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: <d> 00000000a069a9a8 ffff8807d27bc000 ffff8807d27bdfd8 ffff8807d27bc000
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: <d> 000000000000001e ffff880e2f480000 ffff8807d27bdbe0 ffff8807d27bdbb0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Call Trace:
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffff8100e4a0>] ? dump_trace+0x190/0x3b0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa035a835>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa035ae65>] lbug_with_loc+0x75/0xe0 [libcfs]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa03635d6>] libcfs_assertion_failed+0x66/0x70 [libcfs]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa06903ff>] brw_interpret+0xcff/0xe90 [osc]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa04b6a9a>] ptlrpc_check_set+0x24a/0x16b0 [ptlrpc]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffff81081b5b>] ? try_to_del_timer_sync+0x7b/0xe0
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffff81081be2>] ? del_timer_sync+0x22/0x30
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa04ed7ad>] ptlrpcd_check+0x18d/0x270 [ptlrpc]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa04eda50>] ptlrpcd+0x160/0x270 [ptlrpc]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffff81063990>] ? default_wake_function+0x0/0x20
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffff8100c0ca>] child_rip+0xa/0x20
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffffa04ed8f0>] ? ptlrpcd+0x0/0x270 [ptlrpc]
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Code: Bad RIP value.
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RIP [<(null)>] (null)
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: RSP <ffff8807d27bdb48>
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: CR2: 0000000000000000
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: ---[ end trace e64f567342ffc045 ]---
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Kernel panic - not syncing: Fatal exception
Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: Pid: 3595, comm: ptlrpcd Tainted: G D W --------------- 2.6.32-358.23.2.el6.x86_64 #1
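For triage, the failed assertion can be pulled out of a client log like the one above with a short script. This is only a sketch: the regular expression is inferred from the `LustreError:` lines in this report rather than from any Lustre specification, and the names `LUSTRE_LINE`, `find_assertions`, and `errno_name` are made up for illustration. It also decodes the `rc = -5` seen before the LBUG, which is `-EIO` on Linux.

```python
import errno
import re

# Pattern inferred from the log lines in this report (an assumption, not a spec):
#   LustreError: <pid>:<n>:(<file>:<line>:<func>()) <message>
LUSTRE_LINE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)"
)

PREFIX = "ASSERTION("
SUFFIX = ") failed"


def find_assertions(lines):
    """Yield (pid, file, line, func, expression) for each failed ASSERTION line."""
    for text in lines:
        match = LUSTRE_LINE.search(text)
        if match is None:
            continue
        msg = match.group("msg").strip()
        if msg.startswith(PREFIX) and msg.endswith(SUFFIX):
            # Strip the ASSERTION( ... ) failed wrapper to keep only the expression.
            expression = msg[len(PREFIX):-len(SUFFIX)]
            yield (
                int(match.group("pid")),
                match.group("file"),
                int(match.group("line")),
                match.group("func"),
                expression,
            )


def errno_name(rc):
    """Map a negative kernel return code such as -5 to its symbolic errno name."""
    return errno.errorcode.get(abs(rc), "unknown")


if __name__ == "__main__":
    sample = [
        "Jan 4 01:12:55 trestles-2-17.sdsc.edu: kernel: LustreError: "
        "3595:0:(osc_request.c:2357:brw_interpret()) "
        "ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) failed",
    ]
    for hit in find_assertions(sample):
        print(hit)
    print(errno_name(-5))
```

Feeding it the full log would report the single assertion in `brw_interpret()` at `osc_request.c:2357`, which is the string to search for when looking for an existing ticket.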
Issue Links
- duplicates LU-3067 ASSERTION(!(aa->aa_oa->o_valid & OBD_MD_FLHANDLE)) (Resolved)