[LU-8766] obdfilter-survey causes a kernel panic Created: 27/Oct/16  Updated: 31/Oct/16  Resolved: 31/Oct/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.9.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: sebg-crd-pm (Inactive) Assignee: Peter Jones
Resolution: Duplicate Votes: 0
Labels: None
Environment:

CentOS 7.2 3.10.0-327.el7.x86_64
Lustre 2.8.59_62_g165c308 + zfs 0.6.5.7


Attachments: Text File vmcore-dmesg.txt    
Issue Links:
Duplicate
duplicates LU-8748 set block size for zfs echo object Resolved
Related
is related to LU-8748 set block size for zfs echo object Resolved
Epic/Theme: zfs
Severity: 3

 Description   
[82488.026431] Lustre: Echo OBD driver; http://www.lustre.org/
[82505.984937] BUG: unable to handle kernel paging request at 00000000000c0000
[82505.984993] IP: [<ffffffff811c0ea5>] kmem_cache_alloc+0x75/0x1d0
[82505.985041] PGD 0 
[82505.985058] Oops: 0000 [#1] SMP 
[82505.985810] CPU: 4 PID: 7181 Comm: txg_sync Tainted: P           OE  ------------   3.10.0-327.el7.x86_64 #1
[82505.985871] Hardware name: FOXCONN AGLAIA/02010HE02600G, BIOS 4.6.3 05/06/2011
[82505.985916] task: ffff883f52a74500 ti: ffff883f5206c000 task.ti: ffff883f5206c000
[82505.985963] RIP: 0010:[<ffffffff811c0ea5>]  [<ffffffff811c0ea5>] kmem_cache_alloc+0x75/0x1d0
[82505.986677] Call Trace:
[82505.986711]  [<ffffffffa000b939>] ? spl_kmem_cache_alloc+0x99/0x150 [spl]
[82505.986763]  [<ffffffffa000b939>] spl_kmem_cache_alloc+0x99/0x150 [spl]
[82505.986864]  [<ffffffffa0169bd7>] zio_buf_alloc+0x57/0x60 [zfs]
[82505.986946]  [<ffffffffa016d81c>] zio_write_phys+0x9c/0xd0 [zfs]
[82505.987028]  [<ffffffffa012b100>] ? vdev_file_open+0x50/0x50 [zfs]
[82505.988999]  [<ffffffffa012b22f>] vdev_label_write+0x6f/0x80 [zfs]
[82505.990964]  [<ffffffffa012b100>] ? vdev_file_open+0x50/0x50 [zfs]
[82505.992925]  [<ffffffffa012b4be>] vdev_uberblock_sync+0x15e/0x1f0 [zfs]
[82505.994891]  [<ffffffffa012b100>] ? vdev_file_open+0x50/0x50 [zfs]
[82505.996816]  [<ffffffffa012b3ac>] vdev_uberblock_sync+0x4c/0x1f0 [zfs]
[82505.998703]  [<ffffffffa012ce75>] vdev_uberblock_sync_list+0x85/0x120 [zfs]
[82506.000546]  [<ffffffffa012d19f>] vdev_config_sync+0x11f/0x140 [zfs]
[82506.002348]  [<ffffffffa011072c>] spa_sync+0x91c/0xb70 [zfs]
[82506.004062]  [<ffffffff810a6b0b>] ? autoremove_wake_function+0x2b/0x40
[82506.005798]  [<ffffffffa01228ac>] txg_sync_thread+0x3cc/0x640 [zfs]
[82506.007487]  [<ffffffffa01224e0>] ? txg_fini+0x2a0/0x2a0 [zfs]
[82506.009122]  [<ffffffffa000c861>] thread_generic_wrapper+0x71/0x80 [spl]
[82506.010752]  [<ffffffffa000c7f0>] ? __thread_exit+0x20/0x20 [spl]
[82506.012366]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[82506.013982]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[82506.015575]  [<ffffffff81645858>] ret_from_fork+0x58/0x90
[82506.017122]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[82506.018632] Code: ce 00 00 49 8b 50 08 4d 8b 20 49 8b 40 10 4d 85 e4 0f 84 1f 01 00 00 48 85 c0 0f 84 16 01 00 00 49 63 46 20 48 8d 4a 01 4d 8b 06 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 b9 49 63 
[82506.021768] RIP  [<ffffffff811c0ea5>] kmem_cache_alloc+0x75/0x1d0


 Comments   
Comment by sebg-crd-pm (Inactive) [ 27/Oct/16 ]

Environment
CentOS 7.2
Lustre 2.8.59_62_g165c308 + zfs 0.6.5.7

Run command
targets="obdtest-OST0000" nobjlo=1 nobjhi=512 thrlo=1 thrhi=512 size=1024 rszlo=1024 rszhi=1024 tests_str="write read" case=disk ./obdfilter-survey

The attached file (vmcore-dmesg.txt) is the kernel panic log.

PS: Lustre 2.8.0 + zfs 0.6.5.7 works fine.

Comment by Andreas Dilger [ 27/Oct/16 ]

We have found a similar issue recently and have a patch. Could you please try the patch http://review.whamcloud.com/23323

LU-8748 osd-zfs: set block size of echo object

Set block size for zfs echo object.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I6efab645181ab3de6686bf82f4ecbf9ea3384b1b

If that does not fix the problem, would you be able to do a git bisect between 2.8.0 and commit 165c308 to see which patch introduced the problem?
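
For reference, a manual bisect along those lines might look like the sketch below. It is only an outline: `2.8.0` is assumed to resolve to the 2.8.0 release in the local checkout (the actual tag name may be spelled differently), `start_bisect` is a hypothetical helper name, and each step has to be rebuilt and tested by hand because a bad commit panics the node.

```shell
#!/bin/sh
# Sketch of a manual bisect; run from inside a Lustre git checkout.
# Assumption: GOOD names the 2.8.0 release in your checkout (tag spelling may differ).
GOOD="${GOOD:-2.8.0}"
BAD="${BAD:-165c308}"

# Hypothetical helper: begin bisecting between the known-bad and known-good revisions.
start_bisect() {
    # git bisect start takes the bad revision first, then the good one.
    git bisect start "$BAD" "$GOOD"
}

# After start_bisect, git checks out a candidate commit at each step.
# Rebuild and reinstall Lustre, rerun the obdfilter-survey command, then record:
#   git bisect good    # the survey completed
#   git bisect bad     # the kernel panicked
# Once git reports the first bad commit, return to the original branch with:
#   git bisect reset
```

Because a bad revision crashes the node, the result of each step has to be recorded after the reboot rather than automated with `git bisect run`.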

Comment by sebg-crd-pm (Inactive) [ 31/Oct/16 ]

Yes, it works in 2.8.59_78_g7d4106c.

Comment by Peter Jones [ 31/Oct/16 ]

OK, thanks. The fix for LU-8748 has landed on master for 2.9.
