[LU-689] Add bz22221 patch Created: 18/Sep/11  Updated: 04/Nov/11  Resolved: 04/Nov/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.6
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Zhenyu Xu Assignee: Zhenyu Xu
Resolution: Won't Fix Votes: 0
Labels: None

Severity: 3
Bugzilla ID: 22221
Rank (Obsolete): 6555

 Description   

Client nodes crash on a file system with an inactive OST.

An OST was deactivated (lctl conf_param
xxx.osc.active=0). On the XT, clients crash as a result, with:

[2010-02-26 02:36:59][c11-1c0s6n2]Unable to handle kernel NULL pointer dereference at 00000000000000c0 RIP:
[2010-02-26 02:36:59][c11-1c0s6n2]<ffffffff88438431>{:lov:lov_prep_async_page+1521}
[2010-02-26 02:36:59][c11-1c0s6n2]PGD 1a30b4067 PUD 1f44c4067 PMD 0
[2010-02-26 02:36:59][c11-1c0s6n2]Oops: 0000 [1] SMP
[2010-02-26 02:36:59][c11-1c0s6n2]last sysfs file: /devices/system/node/node1/nr_hugepages
[2010-02-26 02:37:00][c11-1c0s6n2]CPU 5
[2010-02-26 02:37:00][c11-1c0s6n2]Modules linked in: mgc lustre lov mdc lquota osc ptlrpc obdclass lvfs kptllnd lnet libcfs portals rca heartbeat
[2010-02-26 02:37:00][c11-1c0s6n2]Pid: 28287, comm: testexe Tainted: PF U 2.6.16.60-0.39_1.0102.4787.2.2.41-cnl #1
[2010-02-26 02:37:00][c11-1c0s6n2]RIP: 0010:[<ffffffff88438431>] <ffffffff88438431>{:lov:lov_prep_async_page+1521}
[2010-02-26 02:37:00][c11-1c0s6n2]RSP: 0018:ffff81018b1ad678 EFLAGS: 00010246
[2010-02-26 02:37:00][c11-1c0s6n2]RAX: 0000000000000000 RBX: ffff8101e0df5be0 RCX: ffff8101e0df5bf8
[2010-02-26 02:37:00][c11-1c0s6n2]RDX: ffff8101ec32c000 RSI: ffff81018b1ad62c RDI: ffff8103777e4ec0
[2010-02-26 02:37:00][c11-1c0s6n2]RBP: ffff81018b1ad708 R08: 0000000000200000 R09: ffffffff8852a260
[2010-02-26 02:37:00][c11-1c0s6n2]R10: 0000000000000002 R11: 0000000000000000 R12: ffff8103777e4ec0
[2010-02-26 02:37:00][c11-1c0s6n2]R13: 0000000000000000 R14: ffff8103e26d00c0 R15: 0000000000000000
[2010-02-26 02:37:00][c11-1c0s6n2]FS: 0000000000c5a860(0063) GS:ffff8104004c5240(0000) knlGS:0000000000000000
[2010-02-26 02:37:00][c11-1c0s6n2]CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2010-02-26 02:37:00][c11-1c0s6n2]CR2: 00000000000000c0 CR3: 00000001f4139000 CR4: 00000000000006e0
[2010-02-26 02:37:00][c11-1c0s6n2]Process testexe (pid: 28287, threadinfo ffff81018b1ac000, task ffff8101ff694040)
[2010-02-26 02:37:00][c11-1c0s6n2]Stack: 0000000000000282 ffff81000001bd10 0000000000000044 0000000000020052
[2010-02-26 02:37:00][c11-1c0s6n2] 0000000000000000 ffff81000001bd10 ffff81018b1ad6f8 0000000000000086
[2010-02-26 02:37:00][c11-1c0s6n2] ffff8101e0df5bf8 0000000000000000
[2010-02-26 02:37:00][c11-1c0s6n2]Call Trace: <ffffffff884c44ee>{:lustre:llap_from_page_with_lockh+2062}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff801674eb>{add_to_page_cache+75} <ffffffff884c772a>{:lustre:ll_read_ahead_page+330}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8844327a>{:lov:lov_stripe_size+698} <ffffffff882477fe>{:ptlrpc:search_queue+286}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff884c7e63>{:lustre:ll_read_ahead_pages+307}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff884c629e>{:lustre:ll_ra_count_get+126} <ffffffff884c869e>{:lustre:ll_readahead+1774}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff88377a7f>{:osc:osc_set_data_with_check+415}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8844379f>{:lov:lov_stripe_intersects+79} <ffffffff884ca38e>{:lustre:ll_readpage+1310}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8843a733>{:lov:lov_enqueue+1907} <ffffffff8843fd58>{:lov:lov_stripe_lock+72}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff80168211>{do_generic_mapping_read+657} <ffffffff80168460>{file_read_actor+0}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8825cc40>{:ptlrpc:ldlm_completion_ast+0} <ffffffff801686e2>{__generic_file_aio_read+402}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8016a492>{generic_file_readv+178} <ffffffff884a43c7>{:lustre:ll_extent_unlock+1047}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff801486a0>{autoremove_wake_function+0} <ffffffff801aa6bd>{__touch_atime+125}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff884a5a6e>{:lustre:ll_file_readv+3806} <ffffffff884a5cae>{:lustre:ll_file_read+30}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8018e050>{vfs_read+176} <ffffffff8018e3d0>{sys_read+80}
[2010-02-26 02:37:00][c11-1c0s6n2] <ffffffff8010be16>{system_call+126}

[2010-02-26 02:37:00][c11-1c0s6n2]Code: 49 8b 95 c0 00 00 00 48 8b 02 48 85 c0 74 12 48 8b 40 10 48
[2010-02-26 02:37:00][c11-1c0s6n2]RIP <ffffffff88438431>{:lov:lov_prep_async_page+1521} RSP <ffff81018b1ad678>
[2010-02-26 02:37:00][c11-1c0s6n2]CR2: 00000000000000c0
[2010-02-26 02:37:00][c11-1c0s6n2] <0>Kernel panic - not syncing: Oops
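
The register dump above can be cross-checked against the Code: bytes. The following is a minimal sketch, not part of the original report, that decodes only the first instruction and compares its effective address with CR2. It assumes the Code: bytes begin at the faulting RIP; the helper name decode_mov_load is hypothetical, and it handles just the one x86-64 encoding seen here (REX.W 8B /r with a 32-bit displacement and no SIB byte).

```python
import struct

# Values copied from the oops above.
CODE = bytes.fromhex("49 8b 95 c0 00 00 00 48 8b 02 48 85 c0 74 12 48 8b 40 10 48")
R13 = 0x0000000000000000
CR2 = 0x00000000000000C0

# x86-64 general-purpose registers in encoding order.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code):
    """Decode `mov disp32(base), dest` (REX.W 8B /r, mod=10, no SIB).

    Hypothetical helper: returns (dest, base, disp) and rejects
    every other encoding.
    """
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF8 != 0x48 or opcode != 0x8B:
        raise ValueError("not a 64-bit MOV r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only mod=10 without SIB is handled")
    reg |= (rex & 0x4) << 1  # REX.R extends the destination register
    rm |= (rex & 0x1) << 3   # REX.B extends the base register
    disp = struct.unpack_from("<i", code, 3)[0]
    return REG64[reg], REG64[rm], disp


dest, base, disp = decode_mov_load(CODE)
print(f"mov {disp:#x}(%{base}),%{dest}")
print("effective address matches CR2:", R13 + disp == CR2)
```

Under that assumption the first instruction is a load from offset 0xc0 of R13, and R13 is zero in the dump, which is consistent with the reported NULL pointer dereference at 00000000000000c0.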



 Comments   
Comment by Zhenyu Xu [ 18/Sep/11 ]

Patch tracked at http://review.whamcloud.com/1394

Comment by Peter Jones [ 03/Nov/11 ]

Bobi

Is this patch also needed on master?

Peter

Comment by Zhenyu Xu [ 04/Nov/11 ]

No, master does not need this patch.

Comment by Peter Jones [ 04/Nov/11 ]

Thanks, Bobi. Then let's close this ticket for now.
