[LU-4134] obdfilter-survey bugs and panics (ioctl API isn't protected over shutdown/setup property). Created: 22/Oct/13 Updated: 04/Jan/18 Resolved: 11/Dec/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0, Lustre 2.5.0 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.3 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexey Lyashkov | Assignee: | Yang Sheng |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | patch | ||
| Environment: |
lustre 2.1/2.5 on any OS. |
||
| Severity: | 3 | ||||||||||||
| Rank (Obsolete): | 11207 | ||||||||||||
| Description |
|
Using the ioctl API is not safe: nothing protects the result of a name2obd / uuid2obd / num2obd call from being used while the device is being shut down. Xyratex bugs: MRP-509, MRP-1396 |
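The race described above can be illustrated with a minimal sketch, assuming a lookup that takes its reference under the same lock that shutdown uses. All names here (`demo_dev`, `demo_name2dev_get`, `demo_dev_put`, `demo_shutdown`) are hypothetical and are not the Lustre API or the code from the patch; they only show the general "look up and pin in one step" pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an OBD device; NOT the real struct obd_device. */
struct demo_dev {
    char name[32];
    int  refcount;  /* references held by in-flight callers (e.g. ioctls) */
    int  stopping;  /* set once shutdown has begun */
};

#define DEMO_MAX_DEVS 4
static struct demo_dev *demo_devs[DEMO_MAX_DEVS];
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Publish a device in the table; returns its slot, or -1 if the table is full. */
int demo_register(struct demo_dev *dev)
{
    int slot = -1;

    pthread_mutex_lock(&demo_lock);
    for (int i = 0; i < DEMO_MAX_DEVS; i++) {
        if (demo_devs[i] == NULL) {
            demo_devs[i] = dev;
            slot = i;
            break;
        }
    }
    pthread_mutex_unlock(&demo_lock);
    return slot;
}

/*
 * Look up a device by name and take a reference while still holding the
 * lock, so shutdown cannot slip in between the lookup and the first use.
 * Returns NULL if the device is missing or already stopping.
 */
struct demo_dev *demo_name2dev_get(const char *name)
{
    struct demo_dev *found = NULL;

    pthread_mutex_lock(&demo_lock);
    for (int i = 0; i < DEMO_MAX_DEVS; i++) {
        struct demo_dev *dev = demo_devs[i];

        if (dev != NULL && !dev->stopping && strcmp(dev->name, name) == 0) {
            dev->refcount++;
            found = dev;
            break;
        }
    }
    pthread_mutex_unlock(&demo_lock);
    return found;
}

/* Drop a reference taken by demo_name2dev_get(); returns the remaining count. */
int demo_dev_put(struct demo_dev *dev)
{
    int left;

    pthread_mutex_lock(&demo_lock);
    left = --dev->refcount;
    pthread_mutex_unlock(&demo_lock);
    return left;
}

/*
 * Begin shutdown: mark the device as stopping and unpublish it.
 * Returns the number of references still held; the caller must wait
 * for that to reach zero before releasing the device memory.
 */
int demo_shutdown(struct demo_dev *dev)
{
    int refs;

    pthread_mutex_lock(&demo_lock);
    dev->stopping = 1;
    for (int i = 0; i < DEMO_MAX_DEVS; i++) {
        if (demo_devs[i] == dev)
            demo_devs[i] = NULL;
    }
    refs = dev->refcount;
    pthread_mutex_unlock(&demo_lock);
    return refs;
}
```

With this shape, a caller that obtained the device before shutdown keeps it alive until its matching `demo_dev_put()`, and a caller that arrives after shutdown has started gets `NULL` instead of a pointer to a device that is going away.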
| Comments |
| Comment by Alexey Lyashkov [ 23/Oct/13 ] |
|
The patch tries to make OBD device shutdown / setup much clearer, in order to fix various bugs in the Lustre code. |
| Comment by Cliff White (Inactive) [ 20/Jan/14 ] |
|
Alexey, the patch needs a rebase, are you able to re-submit? Please see the Gerrit comments |
| Comment by Alexey Lyashkov [ 20/Jan/14 ] |
|
Cliff, I was busy with |
| Comment by Cliff White (Inactive) [ 04/Mar/14 ] |
|
Alexey, there are a few issues with the patch in our last tests, would a refresh be possible? Please see the comments on macros in Gerrit. |
| Comment by Peter Jones [ 10/Jul/15 ] |
|
Needs rebasing to make any progress. |
| Comment by Alexey Lyashkov [ 10/Jul/15 ] |
|
Gerrit needs to be fixed so that it is possible to log in via Google. |
| Comment by James A Simmons [ 18/Sep/15 ] |
|
Alexey, I rebased your patch against the latest master. Please let me know if it is correct. P.S |
| Comment by Alexander Boyko [ 21/Jun/17 ] |
|
I updated the patch; all tests pass. |
| Comment by Gerrit Updater [ 24/Oct/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/8045/ |
| Comment by Peter Jones [ 24/Oct/17 ] |
|
Landed for 2.11 |
| Comment by Gerrit Updater [ 24/Oct/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/29740 |
| Comment by Yang Sheng [ 07/Nov/17 ] |
|
This patch will cause a double free in the failure path, as shown below:
[23427.124766] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == conf-sanity test 41c: concurrent mounts of MDT\/OST should all fail but one ======================== 16:20:02 \(1509664802\)
[23427.175328] Lustre: DEBUG MARKER: == conf-sanity test 41c: concurrent mounts of MDT/OST should all fail but one ======================== 16:20:02 (1509664802)
[23427.320400] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts
[23427.356134] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
[23433.423356] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm7: executing set_default_debug -1 all 4
[23433.459190] Lustre: DEBUG MARKER: trevis-33vm7: executing set_default_debug -1 all 4
[23434.318418] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm3: executing set_default_debug -1 all 4
[23434.356057] Lustre: DEBUG MARKER: trevis-33vm3: executing set_default_debug -1 all 4
[23435.195522] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm7: executing set_default_debug -1 all 4
[23435.238307] Lustre: DEBUG MARKER: trevis-33vm7: executing set_default_debug -1 all 4
[23436.043367] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm3: executing set_default_debug -1 all 4
[23436.078104] Lustre: DEBUG MARKER: trevis-33vm3: executing set_default_debug -1 all 4
[23436.305850] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1
[23436.354028] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_OSS/P1
[23436.392706] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P1
[23436.427055] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1; mount -t lustre /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
[23436.493237] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[23436.602541] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[23436.637692] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
[23436.877956] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm8: executing set_default_debug -1 all 4
[23436.915153] Lustre: DEBUG MARKER: trevis-33vm8: executing set_default_debug -1 all 4
[23436.944609] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
[23436.978188] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
[23437.019478] Lustre: DEBUG MARKER: e2label /dev/lvm-Role_OSS/P1 2>/dev/null
[23439.851173] Lustre: lustre-OST0000: deleting orphan objects from 0x0:3 to 0x0:33
[23443.391660] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts
[23443.424213] Lustre: DEBUG MARKER: umount -f /mnt/lustre-ost1
[23449.552602] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
[23473.647115] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
[23473.914217] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm2: executing load_modules_local
[23473.920620] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm7: executing load_modules_local
[23473.927710] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm8: executing load_modules_local
[23473.936093] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm3: executing load_modules_local
[23473.966215] Lustre: DEBUG MARKER: trevis-33vm2: executing load_modules_local
[23473.973080] Lustre: DEBUG MARKER: trevis-33vm7: executing load_modules_local
[23473.996666] Lustre: DEBUG MARKER: trevis-33vm8: executing load_modules_local
[23474.007128] Lustre: DEBUG MARKER: trevis-33vm3: executing load_modules_local
[23475.560023] Lustre: 9927:0:(gss_svc_upcall.c:1186:gss_init_svc_upcall()) Init channel is not opened by lsvcgssd, following request might be dropped until lsvcgssd is active
[23475.560034] Lustre: 9927:0:(gss_mech_switch.c:71:lgss_mech_register()) Register gssnull mechanism
[23475.560039] Key type lgssc registered
[23475.624355] Lustre: Echo OBD driver; http://www.lustre.org/
[23476.635995] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm3: executing set_default_debug -1 all 4
[23476.669582] Lustre: DEBUG MARKER: trevis-33vm3: executing set_default_debug -1 all 4
[23477.478297] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm7: executing set_default_debug -1 all 4
[23477.513178] Lustre: DEBUG MARKER: trevis-33vm7: executing set_default_debug -1 all 4
[23478.326013] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm3: executing set_default_debug -1 all 4
[23478.362859] Lustre: DEBUG MARKER: trevis-33vm3: executing set_default_debug -1 all 4
[23478.582138] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
[23478.862393] Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-33vm8: executing lsmod
[23478.898933] Lustre: DEBUG MARKER: trevis-33vm8: executing lsmod
[23478.942337] Lustre: DEBUG MARKER: test -b /dev/lvm-Role_OSS/P1
[23478.974285] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x80000716
[23479.008341] Lustre: DEBUG MARKER: mount -t lustre /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
[23479.010416] Lustre: DEBUG MARKER: mount -t lustre /dev/lvm-Role_OSS/P1 /mnt/lustre-ost1
[23479.061423] LustreError: 10335:0:(libcfs_fail.h:165:cfs_race()) cfs_race id 716 sleeping
[23479.063837] LustreError: 10338:0:(libcfs_fail.h:170:cfs_race()) cfs_fail_race id 716 waking
[23479.063860] LustreError: 10335:0:(libcfs_fail.h:168:cfs_race()) cfs_fail_race id 716 awake, rc=0
[23479.063923] LustreError: 10335:0:(genops.c:489:class_register_device()) lustre-OST0000-osd: already exists, won't add
[23479.063932] LustreError: 10335:0:(genops.c:415:class_free_dev()) ASSERTION( obd->obd_magic == OBD_DEVICE_MAGIC ) failed: ffff8800793ae9c8 obd_magic 5a5a5a5a != ab5cd6ef
[23479.075186] LustreError: 10335:0:(genops.c:415:class_free_dev()) LBUG
[23479.075187] Pid: 10335, comm: mount.lustre
[23479.075187]
[23479.075187]
[ 0.075856] ioremap error for 0x7ffff000-0x80000000, requested 0x2, got 0x0
[ 0.076057] dmi: Firmware registration failed.
|
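The assertion in the log above compares a magic field against an expected constant and finds `5a5a5a5a` instead of `ab5cd6ef`. As a minimal sketch of how such a check can catch a second release of the same object, assuming that the `5a` bytes are a poison pattern written when the object was first released: the names below (`demo_obj`, `demo_obj_init`, `demo_obj_release`) are hypothetical, and this is not the code in genops.c. The caller owns the storage here, so reading the poisoned field in the demo is safe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC  0xab5cd6efu /* expected value, as printed in the assertion message */
#define DEMO_POISON 0x5a        /* assumed poison byte, matching the 5a5a5a5a in the log */

/* Hypothetical object guarded by a magic number. */
struct demo_obj {
    uint32_t magic;
    int      payload;
};

/* Mark the object as live. */
void demo_obj_init(struct demo_obj *o)
{
    o->magic = DEMO_MAGIC;
    o->payload = 0;
}

/*
 * Release the object. Returns 0 on the first release and -1 if the magic
 * no longer matches, which is what a second release of the same object
 * looks like once the first release has poisoned it.
 */
int demo_obj_release(struct demo_obj *o)
{
    if (o->magic != DEMO_MAGIC)
        return -1;

    /* Overwrite the whole object so later misuse is detectable. */
    memset(o, DEMO_POISON, sizeof(*o));
    return 0;
}
```

In this sketch the second call reports an error rather than asserting; the log above shows the stricter behaviour, where the mismatch triggers an LBUG.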
| Comment by Gerrit Updater [ 07/Nov/17 ] |
|
Yang Sheng (yang.sheng@intel.com) uploaded a new patch: https://review.whamcloud.com/29967 |
| Comment by Gerrit Updater [ 11/Dec/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/29967/ |
| Comment by Yang Sheng [ 11/Dec/17 ] |
|
Patch landed to 2.11.0. Close this ticket. |
| Comment by Gerrit Updater [ 12/Dec/17 ] |
|
Minh Diep (minh.diep@intel.com) uploaded a new patch: https://review.whamcloud.com/30487 |
| Comment by Gerrit Updater [ 04/Jan/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/29740/ |
| Comment by Gerrit Updater [ 04/Jan/18 ] |
|
John L. Hammond (john.hammond@intel.com) merged in patch https://review.whamcloud.com/30487/ |