Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.16.0
- None
- 3
- 9223372036854775807
Description
[ 3093.416284] Lustre: DEBUG MARKER: == conf-sanity test 41c: concurrent mounts of MDT/OST should all fail but one ========================================================== 19:54:14 (1699300454)
...
[ 3149.141357] LustreError: 187855:0:(libcfs_fail.h:190:cfs_race()) cfs_race id 716 sleeping
[ 3149.143276] LustreError: 187854:0:(libcfs_fail.h:201:cfs_race()) cfs_fail_race id 716 waking
[ 3149.143494] LustreError: 187855:0:(libcfs_fail.h:199:cfs_race()) cfs_fail_race id 716 awake: rc=500
[ 3149.143591] LustreError: 187855:0:(obd_config.c:696:class_setup()) Device 0 setup in progress (type osd-zfs)
[ 3149.143660] LustreError: 187855:0:(obd_mount.c:213:lustre_start_simple()) lustre-MDT0000-osd setup error -17
[ 3149.143731] LustreError: 187855:0:(tgt_mount.c:2183:server_fill_super()) Unable to start osd on lustre-mdt1/mdt1: -17
[ 3149.143804] LustreError: 187855:0:(super25.c:188:lustre_fill_super()) llite: Unable to mount <unknown>: rc = -17
[ 3149.143896] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 3149.144137] CPU: 0 PID: 187854 Comm: mount.lustre Tainted: G W O --------- - - 4.18.0 #2
[ 3149.144266] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc36 04/01/2014
[ 3149.144445] RIP: 0010:class_setup+0x610/0xad0 [obdclass]
[ 3149.144519] Code: 05 61 f0 09 00 00 00 00 00 e8 2c 3a ea ff 31 d2 be 2f 02 00 00 48 c7 c7 10 3b 98 c0 e8 49 65 7a e2 e8 b4 ed c6 e2 48 8b 04 24 <48> 8b 40 28 48 83 f8 01 0f 84 8e 03 00 00 48 8b 04 24 48 8b 48 28
[ 3149.144747] RSP: 0018:ffff9206ab77bae8 EFLAGS: 00010246
[ 3149.144814] RAX: 6b6b6b6b6b6b6b6b RBX: ffff9206a6cf4600 RCX: 000000000002d000
[ 3149.144912] RDX: 0000000000000000 RSI: 000000000000022f RDI: ffffffffc0983b10
[ 3149.145018] RBP: ffff9206b42b0530 R08: ffffffffc07d5000 R09: ffffffffa3e0bbc0
[ 3149.145117] R10: ffff9206ab77ba20 R11: ffff9206ad3457a3 R12: ffff9206b42b0110
[ 3149.145233] R13: ffff9206b42b02b8 R14: ffff9206b42b0048 R15: 0000000000000000
[ 3149.145334] FS: 00007f1f838808c0(0000) GS:ffff9206cfe00000(0000) knlGS:0000000000000000
[ 3149.145434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3149.145528] CR2: 0000000000667000 CR3: 00000001908f7003 CR4: 0000000000370eb0
[ 3149.145634] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3149.145736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3149.145835] Call Trace:
[ 3149.146062] ? libcfs_debug_msg+0x9be/0xb00 [libcfs]
[ 3149.146380] ? xas_load+0x8/0x80
[ 3149.146452] ? xas_find+0x173/0x1b0
[ 3149.146854] ? xa_find+0xae/0xe0
[ 3149.146911] ? do_raw_spin_unlock+0x44/0xc0
[ 3149.146973] ? _raw_spin_unlock+0x1a/0x30
[ 3149.147061] class_process_config+0x14fa/0x2e60 [obdclass]
[ 3149.147154] ? do_lcfg+0x15a/0x4b0 [obdclass]
[ 3149.147247] do_lcfg+0x223/0x4b0 [obdclass]
[ 3149.147322] lustre_start_simple+0x72/0x1c0 [obdclass]
[ 3149.147471] osd_start+0x565/0x7b0 [ptlrpc]
[ 3149.147536] ? kstrtou16+0x1b/0x40
[ 3149.147607] ? target_name2index+0x106/0x140 [obdclass]
[ 3149.147721] server_fill_super+0x327/0x1100 [ptlrpc]
[ 3149.147814] ? obd_zombie_barrier+0x36/0x90 [obdclass]
[ 3149.147889] ? debug_mutex_init+0x31/0x40
[ 3149.147978] lustre_fill_super+0x390/0x480 [lustre]
[ 3149.148066] ? lustre_mount+0x10/0x10 [lustre]
[ 3149.148141] mount_nodev+0x41/0x90
IMO this problem was introduced in commit c5e5060d950 ("LU-8802 obd: remove MAX_OBD_DEVICES"):
if (class_name2dev(new_obd->obd_name) == -1) {
        class_incref(new_obd, "obd_device_list", new_obd);
        rc = xa_alloc(&obd_devs, &dev_no, new_obd, xa_limit_31b, GFP_ATOMIC);
The name check and the xa_alloc() call do not appear to be atomic, so two threads can both pass the check and create OBDs with the same name:
00000020:00000080:0.0:1699293418.519360:0:185838:0:(genops.c:417:class_newdev()) Allocate new device lustre-OST0000-osd (00000000b8694366)
00000020:00000080:1.0:1699293418.519360:0:185839:0:(genops.c:417:class_newdev()) Allocate new device lustre-OST0000-osd (00000000e7494c1a)
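The check-then-insert pattern above can be illustrated outside the kernel. What follows is a minimal userspace sketch in C using pthreads, not the Lustre code and not the actual fix. It assumes that one way to close the window is to do the name lookup and the insertion under a single lock. All names in it (register_dev, race_register, name2dev_locked, dev_lock, MAX_DEVS, NAME_LEN) are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEVS 64   /* hypothetical table size for this sketch */
#define NAME_LEN 64   /* hypothetical maximum name length */

static char dev_names[MAX_DEVS][NAME_LEN];
static int dev_count;
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Look a name up in the table. The caller must hold dev_lock. */
static int name2dev_locked(const char *name)
{
	for (int i = 0; i < dev_count; i++)
		if (strcmp(dev_names[i], name) == 0)
			return i;
	return -1;
}

/*
 * Register a device name and return its index, or a negative errno.
 * The lookup and the insertion share one critical section, so at most
 * one caller can succeed for a given name; the others get -EEXIST.
 */
static int register_dev(const char *name)
{
	int rc;

	if (strlen(name) >= NAME_LEN)
		return -ENAMETOOLONG;

	pthread_mutex_lock(&dev_lock);
	if (name2dev_locked(name) != -1) {
		rc = -EEXIST;
	} else if (dev_count == MAX_DEVS) {
		rc = -ENOSPC;
	} else {
		strcpy(dev_names[dev_count], name);
		rc = dev_count++;
	}
	pthread_mutex_unlock(&dev_lock);
	return rc;
}

/* Thread body: returns 1 if the registration succeeded, 0 otherwise. */
static void *register_thread(void *arg)
{
	return (void *)(intptr_t)(register_dev(arg) >= 0);
}

/*
 * Start up to 16 threads that all register the same name and
 * return how many of them succeeded.
 */
static int race_register(const char *name, int nthreads)
{
	pthread_t tids[16];
	int started = 0, wins = 0;

	if (nthreads > 16)
		nthreads = 16;
	for (int i = 0; i < nthreads; i++)
		if (pthread_create(&tids[started], NULL, register_thread,
				   (void *)name) == 0)
			started++;
	for (int i = 0; i < started; i++) {
		void *res;

		pthread_join(tids[i], &res);
		wins += (int)(intptr_t)res;
	}
	return wins;
}
```

Calling race_register("lustre-OST0000-osd", 8) on an empty table should report exactly one winner, and the losing threads see -EEXIST, which is -17 on Linux and matches the "setup error -17" in the log above. If the lock were released between name2dev_locked() and the insertion, two threads could both see -1 and both insert, which is the situation the two class_newdev() log lines show.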
Issue Links
- is related to: LU-8802 Dynamically allocate obd_devices. (Open)