[LU-11814] conf-sanity test_93 osd_handler.c:7132:osd_device_init0()) ASSERTION( info ) failed:
Created: 09/Dec/18  Updated: 11/Aug/20  Resolved: 04/Jul/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.2
Fix Version/s: Lustre 2.14.0

Type: Bug
Priority: Major
Reporter: Shuichi Ihara
Assignee: Yang Sheng
Resolution: Fixed
Votes: 0
Labels: None
Environment:

2.10.5-ddn6


Issue Links:
Duplicate
duplicates LU-13313 conf-sanity test_93: Crashed while pa... Resolved
is duplicated by LU-8346 conf-sanity test_93: test failed to r... Resolved
is duplicated by LU-12300 conf-sanity test 93: osd_handler.c:77... Resolved
Related
Severity: 2

 Description   

The OSS crashes when multiple OSTs are mounted in parallel.

[208315.026819] Lustre: Skipped 1 previous similar message
[208323.352453] LustreError: 12457:0:(osd_handler.c:7132:osd_device_init0()) ASSERTION( info ) failed: 
[208323.355270] LustreError: 12457:0:(osd_handler.c:7132:osd_device_init0()) LBUG
[208323.355518] LDISKFS-fs (sdd): file extents enabled, maximum tree depth=5
[208323.357619] Pid: 12457, comm: mount.lustre 3.10.0-862.9.1.el7_lustre.ddn1.x86_64 #1 SMP Tue Sep 11 19:05:37 JST 2018
[208323.357620] Call Trace:
[208323.357631]  [<ffffffffc042f7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[208323.357648]  [<ffffffffc042f87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[208323.357654]  [<ffffffffc10cc6d5>] osd_device_alloc+0x615/0x770 [osd_ldiskfs]
[208323.357671]  [<ffffffffc099ae2a>] obd_setup+0x11a/0x2b0 [obdclass]
[208323.357703]  [<ffffffffc099cc58>] class_setup+0x2a8/0x840 [obdclass]
[208323.357724]  [<ffffffffc09a08ed>] class_process_config+0x196d/0x2420 [obdclass]
[208323.357744]  [<ffffffffc09a45f8>] do_lcfg+0x258/0x500 [obdclass]
[208323.357763]  [<ffffffffc09a8e68>] lustre_start_simple+0x88/0x210 [obdclass]
[208323.357784]  [<ffffffffc09d59b4>] server_fill_super+0xf34/0x185a [obdclass]
[208323.357808]  [<ffffffffc09abfe8>] lustre_fill_super+0x328/0x950 [obdclass]
[208323.357841]  [<ffffffff9781f3bf>] mount_nodev+0x4f/0xb0
[208323.357868]  [<ffffffffc09a4008>] lustre_mount+0x38/0x60 [obdclass]
[208323.357871]  [<ffffffff9781ff3e>] mount_fs+0x3e/0x1b0
[208323.357873]  [<ffffffff9783d4c7>] vfs_kern_mount+0x67/0x110
[208323.357876]  [<ffffffff9783faef>] do_mount+0x1ef/0xce0
[208323.357878]  [<ffffffff97840923>] SyS_mount+0x83/0xd0
[208323.357882]  [<ffffffff97d20795>] system_call_fastpath+0x1c/0x21
[208323.357897]  [<ffffffffffffffff>] 0xffffffffffffffff
[208323.357899] Kernel panic - not syncing: LBUG
[208323.357902] CPU: 14 PID: 12457 Comm: mount.lustre Kdump: loaded Tainted: G           OE  ------------ T 3.10.0-862.9.1.el7_lustre.ddn1.x86_64 #1
[208323.357903] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[208323.357904] Call Trace:
[208323.357909]  [<ffffffff97d0e84e>] dump_stack+0x19/0x1b
[208323.357911]  [<ffffffff97d08b50>] panic+0xe8/0x21f
[208323.357919]  [<ffffffffc042f8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[208323.357929]  [<ffffffffc10cc6d5>] osd_device_alloc+0x615/0x770 [osd_ldiskfs]
[208323.357947]  [<ffffffffc099ae2a>] obd_setup+0x11a/0x2b0 [obdclass]
[208323.357964]  [<ffffffffc099cc58>] class_setup+0x2a8/0x840 [obdclass]
[208323.357983]  [<ffffffffc09a08ed>] class_process_config+0x196d/0x2420 [obdclass]
[208323.357989]  [<ffffffff979568d3>] ? number.isra.2+0x323/0x360
[208323.358009]  [<ffffffffc09a45f8>] do_lcfg+0x258/0x500 [obdclass]
[208323.358027]  [<ffffffffc09a8e68>] lustre_start_simple+0x88/0x210 [obdclass]
[208323.358045]  [<ffffffffc09d59b4>] server_fill_super+0xf34/0x185a [obdclass]
[208323.358055]  [<ffffffffc043ad07>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
[208323.358075]  [<ffffffffc09abfe8>] lustre_fill_super+0x328/0x950 [obdclass]
[208323.358092]  [<ffffffffc09abcc0>] ? lustre_common_put_super+0x270/0x270 [obdclass]
[208323.358095]  [<ffffffff9781f3bf>] mount_nodev+0x4f/0xb0
[208323.358113]  [<ffffffffc09a4008>] lustre_mount+0x38/0x60 [obdclass]
[208323.358117]  [<ffffffff9781ff3e>] mount_fs+0x3e/0x1b0
[208323.358119]  [<ffffffff9783d4c7>] vfs_kern_mount+0x67/0x110
[208323.358122]  [<ffffffff9783faef>] do_mount+0x1ef/0xce0
[208323.358125]  [<ffffffff977f7c2c>] ? kmem_cache_alloc_trace+0x3c/0x200
[208323.358127]  [<ffffffff97840923>] SyS_mount+0x83/0xd0
[208323.358131]  [<ffffffff97d20795>] system_call_fastpath+0x1c/0x21
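
The parallel-mount condition described above can be approximated with a small driver script. This is only a minimal sketch, not the reproducer uploaded to Gerrit for this ticket: it assumes the OSTs are already formatted, that it runs as root on the OSS, and that the device paths and mount points passed in are placeholders chosen by the caller. The helper names build_mount_cmd and mount_in_parallel are hypothetical. The only external command used is the standard "mount -t lustre <device> <mountpoint>" form, and the script may well not trigger the assertion on a given run, since the failure depends on timing.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_mount_cmd(device, mountpoint):
    """Return the argv used to mount one Lustre target."""
    return ["mount", "-t", "lustre", device, mountpoint]


def mount_in_parallel(targets, runner=None):
    """Start one mount per (device, mountpoint) pair concurrently.

    Returns a dict mapping each device to the exit code of its mount.
    `runner` is injectable so the logic can be exercised without root;
    by default it executes the real command.
    """
    if runner is None:
        # Assumption: running as root on an OSS with real OST devices.
        def runner(cmd):
            return subprocess.run(cmd).returncode

    if not targets:
        return {}

    # One worker per target so that all mounts are issued at about the same time.
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {
            dev: pool.submit(runner, build_mount_cmd(dev, mnt))
            for dev, mnt in targets
        }
        return {dev: fut.result() for dev, fut in futures.items()}
```

For example, mount_in_parallel([("/dev/sdb", "/mnt/ost0"), ("/dev/sdc", "/mnt/ost1"), ("/dev/sdd", "/mnt/ost2")]) would issue three mounts at once; those paths are illustrative and must be replaced with the actual OST devices on the node under test.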

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
conf-sanity test_93 - trevis-17vm12 crashed during conf-sanity test_93



 Comments   
Comment by Shuichi Ihara [ 10/Dec/18 ]

Nope, this is different: here the OSS crashes when the OSTs are mounted.

Comment by Andreas Dilger [ 19/Dec/18 ]

+1 on master https://testing.whamcloud.com/test_sets/dcd260fe-039e-11e9-b837-52540065bddc

Comment by Andreas Dilger [ 07/Feb/19 ]

+4, 2 on master, 2 on b2_10:
https://testing.whamcloud.com/test_sets/0baf5b7a-2974-11e9-a318-52540065bddc
https://testing.whamcloud.com/test_sets/282ee652-286c-11e9-b97f-52540065bddc
https://testing.whamcloud.com/test_sets/2051afde-1a91-11e9-a2cc-52540065bddc
https://testing.whamcloud.com/test_sets/558839f6-2555-11e9-af90-52540065bddc

Comment by Alexander Boyko [ 21/Mar/19 ]

master https://testing.whamcloud.com/sub_tests/f4d79dee-4b4d-11e9-92fe-52540065bddc

Comment by Andreas Dilger [ 05/May/19 ]

+1 on b2_12:
https://testing.whamcloud.com/test_sets/748b6f5c-6f57-11e9-aeec-52540065bddc

Comment by Gu Zheng (Inactive) [ 31/May/19 ]

another instance from master:

https://testing.whamcloud.com/test_sessions/10776e76-fc32-4c8c-9bf1-a4111f992278

Comment by Minh Diep [ 28/Aug/19 ]

+1 on b2_12 https://testing.whamcloud.com/test_sets/51e08d04-c56d-11e9-90ad-52540065bddc

Comment by Li Xi [ 17/Feb/20 ]

Hit on b2_12 https://testing.whamcloud.com/test_sets/46eb4e32-5011-11ea-8c2d-52540065bddc

Comment by Gerrit Updater [ 18/Mar/20 ]

Yang Sheng (ys@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37974
Subject: LU-11814 obd: Crashed while mount in parallel
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: a318d65621afefd9897ffe202c7fcf44ea680f0a

Comment by Gerrit Updater [ 19/Mar/20 ]

Yang Sheng (ys@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37985
Subject: LU-11814 obd: Reproducer
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ad8bf03ad9329fadff6e10f4ed9c28143f2e529b

Comment by Gerrit Updater [ 14/Apr/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37974/
Subject: LU-11814 obd: Crashed while mount in parallel
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e4fd618ff498814145002b2c3f56746b3d172e07

Comment by Yang Sheng [ 14/Apr/20 ]

Patch landed. Close ticket.

Comment by Mikhail Pershin [ 20/Apr/20 ]

it is still crashing:
https://testing.whamcloud.com/test_sessions/0d88e2cc-285e-46d4-9cdb-19ee56b6010b

crashed on the latest master

Comment by Mikhail Pershin [ 20/Apr/20 ]

Reopening; please check the last reported crash.

Comment by Gerrit Updater [ 29/Apr/20 ]

Yang Sheng (ys@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38416
Subject: LU-11814 obdcalss: ensure LCT_QUIESCENT take sync
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 88b5b58f32350e41d6d396678477b9360e1fd674

Comment by Andreas Dilger [ 06/May/20 ]

+1 on master https://testing.whamcloud.com/test_sets/152e6895-3cfc-4994-8fee-230fb4c1087e

Comment by Gerrit Updater [ 04/Jul/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38416/
Subject: LU-11814 obdcalss: ensure LCT_QUIESCENT take sync
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 979f5e1db041dc49585b97c4915a6bc3e58435da

Comment by Yang Sheng [ 04/Jul/20 ]

Second patch landed.
