[LU-1097] Oops: EIP is at osc_rd_lockless_truncate+0xd/0x30 [osc]
Created: 13/Feb/12  Updated: 29/May/17  Resolved: 29/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.1
Fix Version/s: None

Type: Bug
Priority: Major
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

Lustre Tag: v2_1_1_0_RC2
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/41/
Distro/Arch: RHEL6/x86_64(server), RHEL6/i686(client)
Network: TCP (1GigE)
ENABLE_QUOTA=yes


Severity: 3
Rank (Obsolete): 10312

 Description   

recovery-small test 57 hung as follows:

== recovery-small test 57: read procfs entries causes kernel crash == 23:13:25 (1329030805)
fail_loc=0x80000B00
Stopping client client-11vm1.lab.whamcloud.com /mnt/lustre (opts:)

On client client-11vm1, the following oops occurred:

23:13:26:Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash == 23:13:25 (1329030805)
23:13:38:LustreError: 12253:0:(fail.c:126:__cfs_fail_timeout_set()) cfs_fail_timeout id b00 sleeping for 10000ms
23:13:39:LustreError: 12253:0:(fail.c:130:__cfs_fail_timeout_set()) cfs_fail_timeout id b00 awake
23:13:39:BUG: unable to handle kernel NULL pointer dereference at 0000003c
23:13:39:IP: [<f83d976d>] osc_rd_lockless_truncate+0xd/0x30 [osc]
23:13:39:*pdpt = 0000000030d5a001 *pde = 000000007edd6067 
23:13:39:Oops: 0000 [#1] SMP 
23:13:39:last sysfs file: /sys/module/lov/initstate
23:13:39:Modules linked in: lustre(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) lquota(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) nfs fscache nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa ib_mad ib_core microcode virtio_balloon 8139too 8139cp mii i2c_piix4 i2c_core ext3 jbd mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libcfs]
23:13:39:
23:13:39:Pid: 12253, comm: lctl Not tainted 2.6.32-220.el6.i686 #1 Red Hat KVM
23:13:39:EIP: 0060:[<f83d976d>] EFLAGS: 00010282 CPU: 0
23:13:52:EIP is at osc_rd_lockless_truncate+0xd/0x30 [osc]
23:13:52:EAX: f0fc8000 EBX: f83d9760 ECX: 00000000 EDX: 00000000
23:13:52:ESI: 00001000 EDI: 08d18e68 EBP: f0d47f9c ESP: f0d47f10
23:13:52: DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
23:13:52:Process lctl (pid: 12253, ti=f0d46000 task=f31d4030 task.ti=f0d46000)
23:13:52:Stack:
23:13:52: fffffffe 00001000 08d18e68 f0d47f9c fafc7a40 00001000 f0d47f3c c206e060
23:13:52:<0> f5b76ac0 f0fc8000 f0fc8000 00000001 00000000 f5b76ac0 c254f3c0 fafc7960
23:13:52:<0> fffffffb c0578564 f0d47f9c 00001000 08d18e68 c254f3c0 00001000 08d18e68
23:13:52:Call Trace:
23:13:52: [<fafc7a40>] ? lprocfs_fops_read+0xe0/0x1b0 [obdclass]
23:13:52: [<fafc7960>] ? lprocfs_fops_read+0x0/0x1b0 [obdclass]
23:13:52: [<c0578564>] ? proc_reg_read+0x64/0xa0
23:13:52: [<c0578500>] ? proc_reg_read+0x0/0xa0
23:13:53: [<c052b01d>] ? vfs_read+0x9d/0x190
23:13:53: [<c04afccc>] ? audit_syscall_entry+0x21c/0x240
23:13:53: [<c052b151>] ? sys_read+0x41/0x70
23:13:53: [<c0409a9f>] ? sysenter_do_call+0x12/0x28
23:13:54:Code: 24 b8 de ff ff ff 85 d2 7e e0 89 96 04 04 00 00 89 d8 eb d6 8d 76 00 8d bc 27 00 00 00 00 83 ec 10 8b 54 24 1c 8b 92 b0 00 00 00 <8b> 52 3c c7 44 24 08 3d ad 3e f8 89 04 24 89 54 24 0c 8b 54 24 
23:13:54:EIP: [<f83d976d>] osc_rd_lockless_truncate+0xd/0x30 [osc] SS:ESP 0068:f0d47f10
23:13:54:CR2: 000000000000003c
23:13:54:---[ end trace 067b36d4fa8cb4e7 ]---
23:13:54:Kernel panic - not syncing: Fatal exception
23:13:54:Pid: 12253, comm: lctl Tainted: G      D    ----------------   2.6.32-220.el6.i686 #1
23:13:54:Call Trace:
23:13:54: [<c082e23d>] ? panic+0x42/0xf9
23:13:54: [<c083214c>] ? oops_end+0xbc/0xd0
23:13:54: [<c0433132>] ? no_context+0xc2/0x190
23:13:54: [<c04333ab>] ? bad_area+0x3b/0x50
23:13:54: [<c043384e>] ? __do_page_fault+0x34e/0x420
23:13:54: [<c0833aaa>] ? do_page_fault+0x2a/0x90
23:13:54: [<f83d9760>] ? osc_rd_lockless_truncate+0x0/0x30 [osc]
23:13:54: [<c0833a80>] ? do_page_fault+0x0/0x90
23:13:54: [<c0831537>] ? error_code+0x73/0x78
23:13:54: [<f83d9760>] ? osc_rd_lockless_truncate+0x0/0x30 [osc]
23:13:54: [<f902007b>] ? lcw_cb+0xb/0x330 [libcfs]
23:13:54: [<f902007b>] ? lcw_cb+0xb/0x330 [libcfs]
23:13:54: [<f83d976d>] ? osc_rd_lockless_truncate+0xd/0x30 [osc]
23:13:54: [<fafc7a40>] ? lprocfs_fops_read+0xe0/0x1b0 [obdclass]
23:13:54: [<fafc7960>] ? lprocfs_fops_read+0x0/0x1b0 [obdclass]
23:14:06: [<c0578564>] ? proc_reg_read+0x64/0xa0
23:14:06: [<c0578500>] ? proc_reg_read+0x0/0xa0
23:14:06: [<c052b01d>] ? vfs_read+0x9d/0x190
23:14:06: [<c04afccc>] ? audit_syscall_entry+0x21c/0x240
23:14:06: [<c052b151>] ? sys_read+0x41/0x70
23:14:06: [<c0409a9f>] ? sysenter_do_call+0x12/0x28
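
The fault address in the oops (CR2 = 0x3c) can be cross-checked against the "Code:" bytes. The following is a minimal sketch, not a general disassembler: it assumes the faulting instruction has the simple "mov r32, [r32+disp8]" encoding, and the helper names split_code and decode_mov_load_disp8 are illustrative only. It locates the byte marked with <..>, confirms that the function prologue sits 0xd bytes earlier (matching osc_rd_lockless_truncate+0xd), and decodes the faulting load.

```python
# Bytes copied from the "Code:" line of the oops above.
CODE_LINE = (
    "24 b8 de ff ff ff 85 d2 7e e0 89 96 04 04 00 00 89 d8 eb d6 "
    "8d 76 00 8d bc 27 00 00 00 00 83 ec 10 8b 54 24 1c 8b 92 b0 "
    "00 00 00 <8b> 52 3c c7 44 24 08 3d ad 3e f8 89 04 24 89 54 24 "
    "0c 8b 54 24"
)

# 32-bit register names in x86 ModRM encoding order.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_code(line):
    """Return (raw bytes, index of the byte the kernel marked with <..>)."""
    tokens = line.split()
    fault_index = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    data = bytes(int(t.strip("<>"), 16) for t in tokens)
    return data, fault_index


def decode_mov_load_disp8(data, pos):
    """Decode 'mov r32, [r32+disp8]' (opcode 0x8b, mod=01, no SIB byte).

    Hypothetical helper: any other encoding raises ValueError.
    """
    opcode, modrm, disp = data[pos], data[pos + 1], data[pos + 2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("not the simple 'mov r32, [r32+disp8]' form")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS32[reg], REGS32[rm], disp


data, fault_index = split_code(CODE_LINE)

# EIP is at function+0xd, so the function should start 0xd bytes earlier.
func_start = fault_index - 0xD
prologue = data[func_start:func_start + 3]  # expected: sub esp, 0x10

dst, base, disp = decode_mov_load_disp8(data, fault_index)
print(f"prologue bytes: {prologue.hex(' ')}")
print(f"faulting insn:  mov {dst}, [{base}+{disp:#x}]")
```

The decoded instruction reads from [edx+0x3c]. With EDX: 00000000 in the register dump, the effective address is 0x3c, which agrees with CR2. The preceding bytes (8b 92 b0 00 00 00) load EDX from offset 0xb0 of another pointer, so the value read there was NULL at the time of the crash.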

Maloo report: https://maloo.whamcloud.com/test_sets/57a27058-55d2-11e1-9aa8-5254004bbbd3



 Comments   
Comment by Peter Jones [ 14/Feb/12 ]

Only seen with an i686 client.

Comment by Andreas Dilger [ 29/May/17 ]

Close old ticket.
