Details
- Bug
- Resolution: Not a Bug
- Minor
- None
- Lustre 2.10.0
- None
- 3
- 9223372036854775807
Description
OSS0-A213:~/oss-213-tar-files/benchmark/obd # sh ./obdsurvey-script.sh
/usr/bin/obdfilter-survey: line 242: ( << 16) | ( << 8) | : syntax error: operand expected (error token is "<< 16) | ( << 8) | ")
/usr/bin/obdfilter-survey: line 254: [: -lt: unary operator expected
Fri Jul 14 13:14:41 PDT 2017 Obdfilter-survey for case=disk from OSS0-A213
ost 8 sz 2048000000K rsz 1024K obj 8 thr 8 write 3354.53 [ 0.00, 1769.93] rewrite 3272.87 [ 0.00, 2786.88] read 3064.40 ERROR
ost 8 sz 2048000000K rsz 1024K obj 8 thr 16 write 3461.43 [ 0.00, 1821.87] rewrite 3395.52 [ 0.00, 1627.93] read 3629.66 ERROR
ost 8 sz 2048000000K rsz 1024K obj 8 thr 32 write 3561.64 [ 0.00, 1824.88] rewrite 3465.79 [ 0.00, 1714.86] read 10853.31 ERROR
ost 8 sz 2048000000K rsz 1024K obj 8 thr 64 write 3636.38 [ 0.00, 1966.92] rewrite
Message from syslogd@OSS0-A213 at Jul 14 14:50:09 ...
kernel:[347802.884464] VERIFY3(0 == remove_reference(hdr, ((void *)0), tag)) failed (0 == 3)
Message from syslogd@OSS0-A213 at Jul 14 14:50:09 ...
kernel:[347802.884465] PANIC at arc.c:3069:arc_buf_destroy()
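The line 242 message is the kind of error bash reports when an arithmetic expression of the form `(a << 16) | (b << 8) | c` is expanded with all three operands empty, and the `[: -lt: unary operator expected` on line 254 is consistent with the resulting empty value then being used in a numeric test. Below is a minimal sketch of a guarded version-packing helper. The name `version_code` and the dotted `major.minor.patch` input are assumptions for illustration, not a copy of what /usr/bin/obdfilter-survey contains.

```shell
#!/bin/sh
# Hypothetical helper: pack "major.minor.patch" into a single integer,
# refusing to evaluate the arithmetic when any field is missing or non-numeric.
version_code() {
    ver=$1
    # Split the argument on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $ver
    IFS=$old_ifs
    major=${1:-} minor=${2:-} patch=${3:-}
    for part in "$major" "$minor" "$patch"; do
        case $part in
            ''|*[!0-9]*)
                # An empty field here is what would otherwise produce
                # "( << 16) | ( << 8) | : syntax error: operand expected".
                echo "version_code: cannot parse '$ver'" >&2
                return 1
                ;;
        esac
    done
    echo $(( (major << 16) | (minor << 8) | patch ))
}

version_code 2.10.0   # prints 133632
```

With a check like this, an unparsable version string fails early with a readable message instead of leaking an empty operand into later comparisons.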
OSS0-A213:~/oss-213-tar-files/benchmark/obd # zpool status
  pool: mgs
 state: ONLINE
  scan: none requested
config:

    NAME                   STATE     READ WRITE CKSUM
    mgs                    ONLINE       0     0     0
      35000c5008355dc6f    ONLINE       0     0     0

errors: No known data errors

  pool: ost1
 state: ONLINE
  scan: none requested
config:

    NAME                   STATE     READ WRITE CKSUM
    ost1                   ONLINE       0     0     0
      raidz2-0             ONLINE       0     0     0
        35000c5008357281b  ONLINE       0     0     0
        35000c50083552aeb  ONLINE       0     0     0
        35000c500837f27fb  ONLINE       0     0     0
        35000c5008355705f  ONLINE       0     0     0
        35000c500837f234b  ONLINE       0     0     0
        35000c500837ecd6f  ONLINE       0     0     0
        35000c500837f2b37  ONLINE       0     0     0
        35000c500835535cf  ONLINE       0     0     0
        35000c5008355cad3  ONLINE       0     0     0
        35000c500835554f7  ONLINE       0     0     0
        35000c5008355c11b  ONLINE       0     0     0

errors: No known data errors

  pool: ost2
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: none requested
config:

    NAME                   STATE     READ WRITE CKSUM
    ost2                   ONLINE       0     0     0
      raidz2-0             ONLINE       0     0     0
        35000c50083552443  ONLINE       0     0     0
        35000c50083555057  ONLINE       0     0     0
        35000c5008355cedb  ONLINE       0     0     0
        35000c5008355ddcf  ONLINE       0     0     0
        35000c500837f2b93  ONLINE       0     0     0
        35000c5008355bbdf  ONLINE       0     0     0
        35000c500837f249f  ONLINE       0     0     0
        35000c5008355c33b  ONLINE       0     0     0
        35000c50083571913  ONLINE       0     0     0
        35000c5008355dd6f  ONLINE       0     0    10
        35000c5008355d73f  ONLINE       0     0     7

errors: No known data errors

  pool: ost3
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: none requested
config:

    NAME                   STATE     READ WRITE CKSUM
    ost3                   ONLINE       0     0   134
      raidz2-0             ONLINE       0     0   901
        35000c50083571d7b  ONLINE       0     0     4
        35000c50083570b87  ONLINE       0     0     0
        35000c5008355bba7  ONLINE       0     0     0
        35000c50083571937  ONLINE       0     0     0
        35000c5008355bd23  ONLINE       0     0     7
        35000c5008355d007  ONLINE       0     0     4
        35000c50083572bb3  ONLINE       0     0     0
        35000c500837f11ef  ONLINE       0     0     0
        35000c5008355e177  ONLINE       0     0     0
        35000c500837f23f7  ONLINE       0     0     1
        35000c500835529ef  ONLINE       0     0    21

errors: 7 data errors, use '-v' for a list

  pool: ost4
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: none requested
config:

    NAME                   STATE     READ WRITE CKSUM
    ost4                   ONLINE       0     0    24
      raidz2-0             ONLINE       0     0   354
        35000c5008355e3ef  ONLINE       0     0    12
        35000c5008355c19b  ONLINE       0     0     6
        35000c5008355b603  ONLINE       0     0     0
        35000c5008355cd5b  ONLINE       0     0     3
        35000c5008355c0f3  ONLINE       0     0     0
        35000c500837f24c7  ONLINE       0     0     1
        35000c500837f2a37  ONLINE       0     0     0
        35000c50083555217  ONLINE       0     0     0
        35000c500837f2a77  ONLINE       0     0     0
        35000c5008355e01f  ONLINE       0     0     7
        35000c5008355ce6b  ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list
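To pull only the rows with checksum errors out of `zpool status` output like the above, a small awk filter may help. This is a sketch that assumes the five-column NAME STATE READ WRITE CKSUM layout shown here and plain integer counters; `cksum_errors` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` text on stdin and print
# "<pool> <name> <cksum>" for every row whose CKSUM counter is non-zero.
cksum_errors() {
    awk '
        # Remember which pool the following rows belong to.
        $1 == "pool:" { pool = $2; next }
        # Device rows have five fields with three integer counters at the end;
        # the header row and "errors: No known data errors" fail the integer check.
        NF == 5 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $5 + 0 > 0 {
            print pool, $1, $5
        }
    '
}

# Usage (on a host with ZFS installed):
#   zpool status | cksum_errors
```

Run against the output in this ticket, it would list the pool-level and raidz2-0 rows for ost3 and ost4 as well as the individual disks, since those rows also carry non-zero CKSUM counts.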
OSS0-A213:~/oss-213-tar-files/benchmark/obd # dmesg
[262188.139753] Lustre: Echo OBD driver; http://www.lustre.org/
[262532.913462] perf interrupt took too long (2604 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[263037.870130] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[263037.870205] LustreError: 28451:0:(class_obd.c:387:class_handle_ioctl()) OBD ioctl: device not setup 30
[264430.833796] perf interrupt took too long (5036 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[277208.904437] perf interrupt took too long (10037 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[334273.467060] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[334273.467074] LustreError: 23712:0:(class_obd.c:387:class_handle_ioctl()) OBD ioctl: device not setup 30
[334273.467077] LustreError: 23712:0:(class_obd.c:387:class_handle_ioctl()) Skipped 3 previous similar messages
[335724.942682] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[335724.942696] LustreError: 14287:0:(class_obd.c:387:class_handle_ioctl()) OBD ioctl: device not setup 30
[335724.942698] LustreError: 14287:0:(class_obd.c:387:class_handle_ioctl()) Skipped 3 previous similar messages
[338549.537566] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[338549.537596] LustreError: 30054:0:(class_obd.c:387:class_handle_ioctl()) OBD ioctl: device not setup 30
[338549.537598] LustreError: 30054:0:(class_obd.c:387:class_handle_ioctl()) Skipped 3 previous similar messages
[338981.231985] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[338981.232155] LustreError: 13121:0:(class_obd.c:387:class_handle_ioctl()) OBD ioctl: device not setup 30
[338981.232157] LustreError: 13121:0:(class_obd.c:387:class_handle_ioctl()) Skipped 3 previous similar messages
[347802.884464] VERIFY3(0 == remove_reference(hdr, ((void *)0), tag)) failed (0 == 3)
[347802.884465] PANIC at arc.c:3069:arc_buf_destroy()
[347802.884466] Showing stack for process 9032
[347802.884468] CPU: 4 PID: 9032 Comm: z_rd_int_1 Tainted: P OE N 4.4.21-69-default #1
[347802.884471] Hardware name: Supermicro X10DRH/X10DRH-IT, BIOS 2.0 12/17/2015
[347802.884478] 0000000000000000 ffffffff8130d890 ffffffffa12eff30 ffff881f6c40fd30
[347802.884483] ffffffffa066f08f ffff881fb4ab42c0 0000000000000030 ffff881f6c40fd40
[347802.884487] ffff881f6c40fce0 2833594649524556 6d6572203d3d2030 656665725f65766f
[347802.884488] Call Trace:
[347802.884497] [<ffffffff81019a59>] dump_trace+0x59/0x310
[347802.884502] [<ffffffff81019dfa>] show_stack_log_lvl+0xea/0x170
[347802.884505] [<ffffffff8101ab81>] show_stack+0x21/0x40
[347802.884507] [<ffffffff8130d890>] dump_stack+0x5c/0x7c
[347802.884518] [<ffffffffa066f08f>] spl_panic+0xbf/0xf0 [spl]
[347802.884589] [<ffffffffa11c1b0f>] arc_buf_destroy+0xef/0x100 [zfs]
[347802.884609] [<ffffffffa11c7ad9>] dbuf_read_done+0x79/0xd0 [zfs]
[347802.884626] [<ffffffffa11bf00f>] arc_read_done+0x15f/0x2a0 [zfs]
[347802.884655] [<ffffffffa126da2b>] zio_done+0x2eb/0xb80 [zfs]
[347802.884683] [<ffffffffa12686c1>] zio_execute+0x81/0xe0 [zfs]
[347802.884690] [<ffffffffa066d24d>] taskq_thread+0x22d/0x430 [spl]
[347802.884695] [<ffffffff8109980d>] kthread+0xbd/0xe0
[347802.884699] [<ffffffff815e177f>] ret_from_fork+0x3f/0x70
[347802.885767] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[347802.885768] Leftover inexact backtrace:
[347802.885771] [<ffffffff81099750>] ? kthread_park+0x50/0x50
[578289.860341] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578290.864148] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578292.864333] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578292.864337] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) Skipped 1 previous similar message
[578295.864567] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578295.864571] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) Skipped 2 previous similar messages
[578300.865021] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578300.865025] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) Skipped 4 previous similar messages
[578309.865780] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578309.865784] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) Skipped 8 previous similar messages
[578326.867232] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
[578326.867236] LustreError: 10923:0:(echo_client.c:1039:echo_device_free()) Skipped 16 previous similar messages
OSS0-A213:~/oss-213-tar-files/benchmark/obd #
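When comparing this panic against other reports, it can be convenient to reduce the call trace to the function names from a single module. The filter below is a rough sketch that assumes the trace line layout seen in this dmesg output (`[<address>] symbol+0xoff/0xlen [module]`); `trace_frames` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print the symbol names of
# call-trace frames that belong to the kernel module given as $1 (e.g. zfs).
trace_frames() {
    awk -v mod="[$1]" '
        # Only look at lines that come after a "Call Trace:" marker.
        /Call Trace:/ { in_trace = 1; next }
        in_trace && $NF == mod {
            # The field before the module tag looks like name+0xoff/0xlen.
            sym = $(NF - 1)
            sub(/\+.*/, "", sym)
            print sym
        }
    '
}

# Usage:
#   dmesg | trace_frames zfs
```

For the trace in this ticket, the zfs frames run from arc_buf_destroy down to zio_execute, which is the path worth quoting when searching for similar reports.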