[LU-9305] Running File System Aging creates write checksum errors Created: 29/Sep/16  Updated: 10/May/19  Due: 28/Feb/17  Resolved: 29/Sep/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.10.0, Lustre 2.11.0

Type: Bug Priority: Major
Reporter: John Salinas (Inactive) Assignee: Alex Zhuravlev
Resolution: Fixed Votes: 1
Labels: LS_RZ, prod, rel-note
Environment:

My most recent reproduction of this was:
ZFS based on 0.7.0 RC4 fs/zfs:coral-rc1-combined
Lustre tagged release 2.9.57 (but 2.9.58 fails as well)
CentOS 7.3 3.10.0-514.16.1.el7.x86_64

I have personally verified this fails on Lustre 2.8, 2.9, and the latest tagged release; on zfs 0.6.5 through current ZoL master; and on the most recent CentOS 7.1, 7.2, and 7.3 kernels.


Attachments: File BasicLibs.py     File FileAger-wolf6.py     File FileAger-wolf7.py     File FileAger-wolf8.py     File FileAger-wolf9.py     PDF File Linux_x64_Memory_Address_Mapping.pdf     File debug_info.20170406_143409_48420_wolf-3.wolf.hpdd.intel.com.tgz     Text File debug_vmalloc.patch     Text File debug_vmalloc_lustre.patch     Text File debug_vmalloc_spl.patch     File wolf-6_client.tgz    
Issue Links:
Duplicate
is duplicated by LU-9304 BUG: Bad page state in process ll_ost... Resolved
Related
is related to LU-9279 coral-beta-combined build 124 kernel ... Resolved
is related to LU-9854 Lustre 2.10.0 mmap() issues Resolved
Severity: 1
Rank (Obsolete): 9223372036854775807

 Description   

My most recent reproduction of this was:
ZFS based on 0.7.0 RC4 fs/zfs:coral-rc1-combined
Lustre tagged release 2.9.57 (but 2.9.58 fails as well)
CentOS 7.3 3.10.0-514.16.1.el7.x86_64

I have personally verified this fails on Lustre 2.8, 2.9, and the latest tagged release; on zfs 0.6.5 through current ZoL master; and on the most recent CentOS 7.1, 7.2, and 7.3 kernels.

This may well be a Lustre issue; I need to try to reproduce it on raidz, without large RPCs, etc.

On both the clients and the OSS nodes we see checksum errors while the file aging test is running, such as:
[ 9354.968454] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000401:0x254:0x0] object 0x0:292 extent [117440512-125698047]: client csum de357896, server csum 5cd77893

[ 9394.315856] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000401:0x28c:0x0] object 0x0:320 extent [67108864-82968575]: client csum df6bd34a, server csum 8480d352
[ 9404.371609] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000401:0x298:0x0] object 0x0:326 extent [67108864-74448895]: client csum 2ced4ec0, server csum 1f814ec4
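
For triage it can be handy to pull the mismatching checksum pairs out of dmesg. A minimal sketch (the regex is inferred from the 168-f messages above; it is not an official tool):

import re, sys

# Message format inferred from the server-side 168-f lines quoted in this ticket.
PAT = re.compile(
    r"BAD WRITE CHECKSUM: (\S+) from \S+ inode \[([^\]]+)\] object (\S+) "
    r"extent \[(\d+)-(\d+)\]: client csum ([0-9a-f]+), server csum ([0-9a-f]+)")

for m in PAT.finditer(sys.stdin.read()):
    ost, fid, obj, start, end, ccsum, scsum = m.groups()
    length = int(end) - int(start) + 1
    # XOR of the two csums gives a rough idea of which bits differ.
    print(ost, fid, obj, f"len={length}", f"xor={int(ccsum, 16) ^ int(scsum, 16):08x}")

Fed "dmesg" output on an OSS, this lists the affected objects and extent sizes.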



 Comments   
Comment by John Salinas (Inactive) [ 09/Jan/17 ]

Waiting on Jinshan - moving out to next month

Comment by Kalyana Chadalavada (Inactive) [ 23/Feb/17 ]

A bad test configuration resulted in this issue. Fixed the test.

Comment by John Salinas (Inactive) [ 07/Apr/17 ]

We have seen this occur even with one file ager process on each client, each with its own top-level directory:

Server

[32845.401121] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000405:0x179:0x0] object 0x0:833 extent [34406400-50331647]: client csum 40cd640, server csum 315ad637
[34838.145682] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000405:0x89a:0x0] object 0x0:1743 extent [83886080-91881471]: client csum 1ed46ca1, server csum 42996cae
[34843.171489] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000405:0x8a4:0x0] object 0x0:1748 extent [83886080-96468991]: client csum 4a943dde, server csum 4f793de6

Client:

[1442511.437096] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000405:0xc8:0x0] object 0x0:493 extent [50331648-66977791]
[1442511.466353] LustreError: 23155:0:(osc_request.c:1369:check_write_checksum()) original client csum 26eef72b (type 2), server csum 6a2afc80 (type 2), client csum now 26eef72b
[1442511.492569] LustreError: 23155:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff8805b936d100 x1562671616998000/t4295713190(4295713190) o4->lsdraid-OST0001-osc-ffff880da148f800@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491483899 ref 3 fl Interpret:R/4/0 rc 0/0
[1442586.196168] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000405:0x130:0x0] object 0x0:545 extent [68812800-83886079]
[1442586.220163] LustreError: 23181:0:(osc_request.c:1369:check_write_checksum()) original client csum 7f41af68 (type 2), server csum f877af67 (type 2), client csum now 7f41af68
[1442586.242785] LustreError: 23181:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff88032f1d7b00 x1562671617250864/t4295718640(4295718640) o4->lsdraid-OST0001-osc-ffff880da148f800@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491483974 ref 3 fl Interpret:R/4/0 rc 0/0
[1442614.589662] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.3@o2ib inode [0x200000405:0x179:0x0] object 0x0:833 extent [34406400-50331647]
[1442614.614320] LustreError: 23206:0:(osc_request.c:1369:check_write_checksum()) original client csum 40cd640 (type 2), server csum 315ad637 (type 2), client csum now 40cd640
[1442614.637858] LustreError: 23206:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff880971531500 x1562671617382816/t47244690905(47244690905) o4->lsdraid-OST0000-osc-ffff880da148f800@192.168.1.3@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491484041 ref 3 fl Interpret:R/4/0 rc 0/0
[1442677.105050] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000405:0x30e:0x0] object 0x0:783 extent [67108864-82313215]
[1442677.133091] LustreError: 23207:0:(osc_request.c:1369:check_write_checksum()) original client csum bd02b56a (type 2), server csum 8f588935 (type 2), client csum now bd02b56a
[1442677.155271] LustreError: 23207:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff8808bfa3d400 x1562671617640272/t4295723360(4295723360) o4->lsdraid-OST0001-osc-ffff880da148f800@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491484103 ref 3 fl Interpret:R/4/0 rc 0/0
[1442727.932598] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000405:0x3df:0x0] object 0x0:887 extent [67108864-81018879]
[1442727.955261] LustreError: 23206:0:(osc_request.c:1369:check_write_checksum()) original client csum 54933a67 (type 2), server csum 31bca8f7 (type 2), client csum now 54933a67
[1442727.976294] LustreError: 23206:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff880b4d432d00 x1562671617842304/t4295725770(4295725770) o4->lsdraid-OST0001-osc-ffff880da148f800@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491484154 ref 3 fl Interpret:R/4/0 rc 0/0
[1444428.717785] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000405:0x6d3:0x0] object 0x0:1265 extent [83886080-100007935]
[1444428.742999] LustreError: 23196:0:(osc_request.c:1369:check_write_checksum()) original client csum c62adf42 (type 2), server csum 47f2df45 (type 2), client csum now c62adf42
[1444428.767062] LustreError: 23196:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff8809b6092d00 x1562671628068048/t4295747039(4295747039) o4->lsdraid-OST0001-osc-ffff880da148f800@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491485852 ref 3 fl Interpret:R/4/0 rc 0/0
[1444607.282988] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.3@o2ib inode [0x200000405:0x89a:0x0] object 0x0:1743 extent [83886080-91881471]
[1444607.308197] LustreError: 23147:0:(osc_request.c:1369:check_write_checksum()) original client csum 1ed46ca1 (type 2), server csum 42996cae (type 2), client csum now 1ed46ca1
[1444607.331424] LustreError: 23147:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff880814117b00 x1562671628483776/t47244951956(47244951956) o4->lsdraid-OST0000-osc-ffff880da148f800@192.168.1.3@o2ib:6/4 lens 608/416 e 0 to 0 dl 1491486032 ref 3 fl Interpret:R/4/0 rc 0/0
[1444612.350218] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.3@o2ib inode [0x200000405:0x8a4:0x0] object 0x0:1748 extent [83886080-96468991]
[1444612.375107] LustreError: 23148:0:(osc_request.c:1369:check_write_checksum()) original client csum 4a943dde (type 2), server csum 4f793de6 (type 2), client csum now 4a943dde

...etc..

This is followed by a crash:
https://jira.hpdd.intel.com/browse/LU-9304

There are also some messages similar to those in:
https://jira.hpdd.intel.com/browse/LDEV-199

Comment by John Salinas (Inactive) [ 07/Apr/17 ]

This has started reproducing again.

Comment by Jodi Levi (Inactive) [ 13/Apr/17 ]

John is seeing this often now, crash results are above. Please comment in the ticket or reach out to John S if you have any questions.

Comment by Andreas Dilger [ 13/Apr/17 ]

John, since the server-side checksum is re-validated after the data has been submitted to ZFS (so that it overlaps with ZFS computing its own data checksum), this appears to be a problem with the data on the OST being modified after it arrives at the OSS.
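
As a toy illustration of that point (a concept sketch only, not Lustre code; "type 2" in the client logs and the adler32_update frames in the crashes below suggest the adler checksum is in use):

import zlib

def bulk_checksum(pages, seed=1):
    # Checksum the bulk pages in order, as a client would before sending.
    csum = seed
    for page in pages:
        csum = zlib.adler32(page, csum)
    return csum & 0xFFFFFFFF

pages = [b"A" * 4096, b"B" * 4096]      # the data the client sent
client_csum = bulk_checksum(pages)

pages[1] = b"X" * 4096                  # a page scribbled on after arrival
server_csum = bulk_checksum(pages)      # recomputed after submission to ZFS

assert client_csum != server_csum       # what the OSS reports as 168-f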

What features are enabled for this testing? Does this include dRAID, 16MB RPC, 16MB recordsize, anything else? If those features are enabled, have you tried reproducing this with vanilla ZFS + 1MB RPC + 1MB recordsize to see if it reproduces there, and if not, isolating it to one of the new CORAL features?

We obviously don't see this with our normal testing, but it could potentially also be workload related, so a description of the IO pattern of your filesystem aging tool might help.

Comment by John Salinas (Inactive) [ 14/Apr/17 ]

Greetings,

Here are the non-default settings for these runs. I have not observed that brw_size is set automatically, though I haven't tried recently.

ZFS OSS
echo "16777216" > /sys/module/zfs/parameters/zfs_max_recordsize
echo "65536" > /sys/module/spl/parameters/spl_kmem_alloc_warn
zpool create -f -o ashift=12 -o cachefile=none -O recordsize=16MB ost0 draid2 cfg=test_2_5_4_18_draidcfg.nvl mpathaa mpathab mpathac mpathad mpathae mpathaf mpathag mpathah mpathai mpathaj mpathak mpathal mpatham mpathan mpathao mpathap mpathaq mpathar

Lustre OSS
mkfs.lustre --backfstype=zfs --reformat --fsname=lsdraid --ost --index=0 --mgsnode=192.168.1.5@o2ib ost0/ost0
lctl set_param obdfilter.lsdraid-OST0000.brw_size=16

Lustre client
/usr/bin/mount -o flock -t lustre 192.168.1.5@o2ib:/lsdraid /mnt/lustre/
/usr/sbin/lctl set_param osc.*.max_pages_per_rpc=1024
or
/usr/sbin/lctl set_param osc.*.max_pages_per_rpc=4096
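
(For reference: with the 4 KiB x86_64 page size, those two values correspond to 4 MiB and 16 MiB RPCs, the latter matching brw_size=16. The quick check below is just that arithmetic, not from the ticket.)

PAGE_SIZE = 4096  # bytes; default page size on these x86_64 nodes
for pages in (1024, 4096):
    print(f"max_pages_per_rpc={pages} -> {pages * PAGE_SIZE >> 20} MiB per RPC")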

I have seen this on RAIDz pools as well; it is not specific to dRAID. I will attach the file ager Python script to this bug.
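
(The attached FileAger-wolf*.py scripts are the actual workload; as a rough, hypothetical outline of what such an aging pass can look like, and not the attached code:)

import os, random

# Hypothetical outline of an aging pass: churn a directory tree with a mix
# of creates/appends, in-place overwrites, and deletes so extents get
# rewritten repeatedly, as a file-aging workload tends to do.
def age_tree(root, files=100, passes=1000, chunk=1 << 20):
    os.makedirs(root, exist_ok=True)
    paths = [os.path.join(root, f"f{i}") for i in range(files)]
    for _ in range(passes):
        p = random.choice(paths)
        op = random.random()
        if op < 0.4 or not os.path.exists(p):        # create or append
            with open(p, "ab") as f:
                f.write(os.urandom(chunk))
        elif op < 0.8:                               # overwrite mid-file
            with open(p, "r+b") as f:
                f.seek(random.randrange(max(1, os.path.getsize(p))))
                f.write(os.urandom(chunk))
        else:                                        # delete
            os.unlink(p)

age_tree("/mnt/lustre/ager-test")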

Comment by John Salinas (Inactive) [ 14/Apr/17 ]

What would we like to try first: raidz, 1MB RPCs, or 1MB brw_size/recordsize?

Comment by Jinshan Xiong (Inactive) [ 14/Apr/17 ]

John, there is too much new code in your test. I would prefer to start with ZFS (with ABD) + 1M RPC, then add 16M RPC, then add dRAID, and then the MD class.

This series of problems, for example LU-9279 and LU-7280, can also be narrowed down this way.

Comment by John Salinas (Inactive) [ 14/Apr/17 ]

Yes, indeed, although perhaps one solution fixes them all.

I should be able to start this testing tomorrow and will update this ticket with the results of the lowest-factor configuration that still shows this. These checksum errors are actually the hardest to reproduce of all of these bugs, but I have seen them on and off for a while now, almost 9 months.

Comment by John Salinas (Inactive) [ 15/Apr/17 ]

Lustre 2.9.0 + 0.7.0 RC3 (none of our patches), recordsize 1M on OST0 and 16M on OST1, brw_size=16 on both, raidz: messages but no crash. Manual dumps: 10.8.1.4-2017-04-15-00:26:17 10.8.1.4-2017-04-15-01:47:43 10.8.1.3-2017-04-15-13:22:45 10.8.1.4-2017-04-15-13:22:47

wolf-4 OSS

[  163.434692] Lustre: lsdraid-OST0001: Recovery over after 0:06, of 5 clients 5 recovered and 0 were evicted.
[  163.480746] Lustre: lsdraid-OST0001: deleting orphan objects from 0x0:720 to 0x0:1025
[  370.631336] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x3b0:0x0] object 0x0:1225 extent [83886080-92680191]: client csum d5f42113, server csum 1a89e99c
[  480.339896] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x49c:0x0] object 0x0:4041 extent [33554432-47890431]: client csum e47bcdcb, server csum 86becdcf
[  488.890964] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x4ae:0x0] object 0x0:5107 extent [67108864-73793535]: client csum b74b30df, server csum 20c030ec
[  509.914190] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x52f:0x0] object 0x0:6348 extent [33554432-43007999]: client csum cbc76f28, server csum 4b241635
[  539.505532] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x5be:0x0] object 0x0:7700 extent [67108864-78381055]: client csum b6e2021c, server csum c5ce4f88
[  560.736133] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x5f1:0x0] object 0x0:8747 extent [67108864-81104895]: client csum ddc22e54, server csum 894f5e1a
[  618.743576] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x6d0:0x0] object 0x0:11762 extent [67108864-81694719]: client csum 734e4939, server csum 175394a5
[  618.764867] LustreError: Skipped 1 previous similar message
[ 1080.395798] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x7fa:0x0] object 0x0:14839 extent [40140800-50331647]: client csum 937c50bf, server csum f71e2e65
[ 1080.417120] LustreError: Skipped 2 previous similar messages
[ 3001.142322] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0xd10:0x0] object 0x0:49284 extent [100663296-108527615]: client csum ab9466a8, server csum 10b4e228
[ 3400.563954] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0xfb0:0x0] object 0x0:54388 extent [67108864-82837503]: client csum 71e8cd52, server csum 35becd53
[ 3461.970072] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x1052:0x0] object 0x0:55534 extent [67108864-74973183]: client csum c0a766ab, server csum ab5a66bb
[ 3762.672549] BUG: Bad page state in process ll_ost_io01_003  pfn:182ec6d
[ 3762.680002] page:ffffea0060bb1b40 count:-1 mapcount:0 mapping:          (null) index:0x0
[ 3762.689091] page flags: 0x6fffff00000000()
[ 3762.693727] page dumped because: nonzero _count
[ 3762.700757] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate ses dm_service_time enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas lrw gf128mul glue_helper ablk_helper cryptd raid_class scsi_transport_sas mei_me iTCO_wdt ipmi_ssif iTCO_vendor_support mei ipmi_devintf sb_edac sg
[ 3762.790920]  ioatdma lpc_ich shpchp edac_core pcspkr i2c_i801 ipmi_si mfd_core ipmi_msghandler acpi_pad acpi_power_meter wmi nfsd dm_multipath dm_mod auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb crct10dif_pclmul ptp crct10dif_common ttm ahci crc32c_intel pps_core mlx4_core libahci drm dca i2c_algo_bit libata i2c_core
[ 3762.850233] CPU: 31 PID: 9096 Comm: ll_ost_io01_003 Tainted: P          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 3762.864178] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 3762.877501]  ffffea0060bb1b40 000000006cbfa991 ffff880fd6a47908 ffffffff81636431
[ 3762.887516]  ffff880fd6a47930 ffffffff81631645 ffffea0060bb1b40 0000000000000000
[ 3762.897491]  000fffff00000000 ffff880fd6a47978 ffffffff811714dd fff00000fe000000
[ 3762.907458] Call Trace:
[ 3762.912046]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[ 3762.919394]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[ 3762.927333]  [<ffffffff811714dd>] free_pages_prepare+0x16d/0x190
[ 3762.935630]  [<ffffffff81171e21>] free_hot_cold_page+0x31/0x140
[ 3762.943790]  [<ffffffff8117200f>] __free_pages+0x3f/0x60
[ 3762.951264]  [<ffffffffa0fa1ad3>] osd_bufs_put+0x123/0x1f0 [osd_zfs]
[ 3762.959874]  [<ffffffffa109b84a>] ofd_commitrw_write+0xea/0x1c20 [ofd]
[ 3762.968646]  [<ffffffffa109ff2d>] ofd_commitrw+0x51d/0xa40 [ofd]
[ 3762.976868]  [<ffffffffa0e0f8d2>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[ 3762.985338]  [<ffffffffa0de7f71>] tgt_brw_write+0xea1/0x1640 [ptlrpc]
[ 3762.993957]  [<ffffffffa0d3e560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[ 3763.003453]  [<ffffffffa0de4225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 3763.012530]  [<ffffffffa0d901ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 3763.022429]  [<ffffffffa0a33128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 3763.031354]  [<ffffffffa0d8dd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 3763.040220]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[ 3763.048476]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[ 3763.056267]  [<ffffffffa0d94260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 3763.064562]  [<ffffffffa0d937c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 3763.074037]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 3763.080685]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 3763.089162]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 3763.096349]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 3855.476573] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x12e3:0x0] object 0x0:58439 extent [67108864-82837503]: client csum 71e8cd52, server csum 14e5cd5e
[ 3923.650281] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x13bc:0x0] object 0x0:59171 extent [33554432-48742399]: client csum 9005f4a9, server csum db87ac4c
[ 5698.551136] perf interrupt took too long (2521 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 5904.311835] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x1734:0x0] object 0x0:66681 extent [67108864-80281599]: client csum 1eaa58ca, server csum 44a378f0
[ 8708.045614] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x1d31:0x0] object 0x0:67733 extent [121729024-134217727]: client csum 99efe98c, server csum e23d22e1
[ 9738.442312] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x2051:0x0] object 0x0:68278 extent [100663296-116666367]: client csum d42f69dc, server csum 8732074f
[10448.854337] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x237a:0x0] object 0x0:68809 extent [100663296-112549887]: client csum 7a8b3e1a, server csum 1dbd0291
[10480.902373] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x2396:0x0] object 0x0:68834 extent [85426176-100663295]: client csum f43a36f0, server csum 9d10e702
[11720.767365] BUG: Bad page state in process ll_ost_io01_001  pfn:15d132f
[11720.777259] page:ffffea005744cbc0 count:-1 mapcount:0 mapping:          (null) index:0x0
[11720.788693] page flags: 0x6fffff00000000()
[11720.795463] page dumped because: nonzero _count
[11720.802596] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate ses dm_service_time enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas lrw gf128mul glue_helper ablk_helper cryptd raid_class scsi_transport_sas mei_me iTCO_wdt ipmi_ssif iTCO_vendor_support mei ipmi_devintf sb_edac sg
[11720.893130]  ioatdma lpc_ich shpchp edac_core pcspkr i2c_i801 ipmi_si mfd_core ipmi_msghandler acpi_pad acpi_power_meter wmi nfsd dm_multipath dm_mod auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb crct10dif_pclmul ptp crct10dif_common ttm ahci crc32c_intel pps_core mlx4_core libahci drm dca i2c_algo_bit libata i2c_core
[11720.951749] CPU: 35 PID: 8509 Comm: ll_ost_io01_001 Tainted: P    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[11720.965393] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[11720.978463]  ffffea005744cbc0 00000000a971f860 ffff880fdb6bf908 ffffffff81636431
[11720.988249]  ffff880fdb6bf930 ffffffff81631645 ffffea005744cbc0 0000000000000000
[11720.998053]  000fffff00000000 ffff880fdb6bf978 ffffffff811714dd fff00000fe000000
[11721.007838] Call Trace:
[11721.012009]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[11721.019195]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[11721.026948]  [<ffffffff811714dd>] free_pages_prepare+0x16d/0x190
[11721.035167]  [<ffffffff81171e21>] free_hot_cold_page+0x31/0x140
[11721.043294]  [<ffffffff8117200f>] __free_pages+0x3f/0x60
[11721.050752]  [<ffffffffa0fa1ad3>] osd_bufs_put+0x123/0x1f0 [osd_zfs]
[11721.059372]  [<ffffffffa109b84a>] ofd_commitrw_write+0xea/0x1c20 [ofd]
[11721.068157]  [<ffffffffa109ff2d>] ofd_commitrw+0x51d/0xa40 [ofd]
[11721.076424]  [<ffffffffa0e0f8d2>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[11721.085001]  [<ffffffffa0de7f71>] tgt_brw_write+0xea1/0x1640 [ptlrpc]
[11721.094001]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[11721.101706]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[11721.109340]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[11721.117924]  [<ffffffffa0d3e560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[11721.127469]  [<ffffffffa0de4225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[11721.136601]  [<ffffffffa0d901ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[11721.146564]  [<ffffffffa0a33128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[11721.155726]  [<ffffffffa0d8dd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[11721.164815]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[11721.173099]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[11721.180948]  [<ffffffffa0d94260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[11721.189287]  [<ffffffffa0d937c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[11721.198828]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[11721.205490]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[11721.214017]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[11721.221178]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[11906.409714] perf interrupt took too long (5056 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[12369.576466] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x28dd:0x0] object 0x0:69605 extent [100663296-115441663]: client csum 34b2200, server csum 5f29220d
[12574.297235] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x2a16:0x0] object 0x0:69767 extent [100663296-114409471]: client csum c953b2e4, server csum f3b9a3f5
[12583.154014] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000405:0x2a22:0x0] object 0x0:69773 extent [100663296-117309439]: client csum fa39f722, server csum 17548bac

wolf-3 OSS

[  702.495373] Lustre: lsdraid-OST0000: Connection restored to 5dd53d1b-72ff-64c0-86f7-b4ab04036f55 (at 192.168.1.6@o2ib)
[  712.111566] LustreError: 35894:0:(ofd_grant.c:641:ofd_grant_check()) lsdraid-OST0000: cli 5dd53d1b-72ff-64c0-86f7-b4ab04036f55 claims 17432576 GRANT, real grant 0
[  712.629997] LustreError: 39491:0:(ofd_grant.c:641:ofd_grant_check()) lsdraid-OST0000: cli 5dd53d1b-72ff-64c0-86f7-b4ab04036f55 claims 17432576 GRANT, real grant 0
[  712.649481] LustreError: 39491:0:(ofd_grant.c:641:ofd_grant_check()) Skipped 8 previous similar messages
[  713.660785] LustreError: 38266:0:(ofd_grant.c:641:ofd_grant_check()) lsdraid-OST0000: cli 5dd53d1b-72ff-64c0-86f7-b4ab04036f55 claims 17432576 GRANT, real grant 0
[  713.679875] LustreError: 38266:0:(ofd_grant.c:641:ofd_grant_check()) Skipped 5 previous similar messages
[  715.665680] LustreError: 38165:0:(ofd_grant.c:641:ofd_grant_check()) lsdraid-OST0000: cli 5dd53d1b-72ff-64c0-86f7-b4ab04036f55 claims 17432576 GRANT, real grant 0
[  715.685499] LustreError: 38165:0:(ofd_grant.c:641:ofd_grant_check()) Skipped 48 previous similar messages
[  835.423369] Lustre: lsdraid-OST0000: Connection restored to 4e5e1424-c5a7-dbfe-ccf8-a041ec520cb5 (at 192.168.1.9@o2ib)
[  835.437468] Lustre: Skipped 2 previous similar messages
[11228.546836] perf interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[28193.720410] LNet: Service thread pid 91775 was inactive for 200.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[28193.743765] Pid: 91775, comm: ll_ost00_010
[28193.750363] Call Trace:
[28193.758633]  [<ffffffff8163bb39>] schedule+0x29/0x70
[28193.765982]  [<ffffffffa05cb2fd>] cv_wait_common+0x10d/0x130 [spl]
[28193.774687]  [<ffffffff810a6b80>] ? autoremove_wake_function+0x0/0x40
[28193.783567]  [<ffffffffa05cb335>] __cv_wait+0x15/0x20 [spl]
[28193.791608]  [<ffffffffa1439c23>] txg_wait_open+0xb3/0xf0 [zfs]
[28193.799877]  [<ffffffffa13e264d>] dmu_free_long_range+0x25d/0x3d0 [zfs]
[28193.808919]  [<ffffffffa1092468>] osd_unlinked_object_free+0x28/0x280 [osd_zfs]
[28193.818586]  [<ffffffffa10927d3>] osd_unlinked_list_emptify+0x63/0xa0 [osd_zfs]
[28193.828178]  [<ffffffffa1094dba>] osd_trans_stop+0x31a/0x5b0 [osd_zfs]
[28193.836927]  [<ffffffffa119516f>] ofd_trans_stop+0x1f/0x60 [ofd]
[28193.845026]  [<ffffffffa1198d82>] ofd_object_destroy+0x2b2/0x890 [ofd]
[28193.853770]  [<ffffffffa1191987>] ofd_destroy_by_fid+0x307/0x510 [ofd]
[28193.862440]  [<ffffffffa0cdcbe0>] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[28193.871264]  [<ffffffffa0cd71f0>] ? ldlm_completion_ast+0x0/0x910 [ptlrpc]
[28193.880161]  [<ffffffffa1181627>] ofd_destroy_hdl+0x267/0xa50 [ofd]
[28193.888454]  [<ffffffffa0d6b225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[28193.897329]  [<ffffffffa0d171ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[28193.907053]  [<ffffffffa09c7128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[28193.915785]  [<ffffffffa0d14d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[28193.924476]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[28193.932565]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[28193.940211]  [<ffffffffa0d1b260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[28193.948394]  [<ffffffffa0d1a7c0>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc]
[28193.956493]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[28193.963027]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
[28193.969635]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[28193.976729]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0

[28193.985950] LustreError: dumping log to /tmp/lustre-log.1492246924.91775
[28199.712751] LNet: Service thread pid 91775 completed after 206.29s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[31329.310375] perf interrupt took too long (5002 > 5000), lowering kernel.perf_event_max_sample_rate to 25000

[root@wolf-4 combined]# ps aux |grep 9096
root 9096 0.6 0.0 0 0 ? S 01:55 4:21 [ll_ost_io01_003]
root 77386 0.0 0.0 112656 976 pts/0 S+ 12:56 0:00 grep --color=auto 9096
[root@wolf-4 combined]# ps aux |grep 8509
root 8509 4.3 0.0 0 0 ? D 01:55 28:56 [ll_ost_io01_001]
root 84813 0.0 0.0 112656 976 pts/0 S+ 12:57 0:00 grep --color=auto 8509

[root@wolf-4 combined]# cat /proc/9096/stack
[<ffffffffa0d8dff5>] ptlrpc_wait_event+0x325/0x340 [ptlrpc]
[<ffffffffa0d93fcb>] ptlrpc_main+0x80b/0x1de0 [ptlrpc]
[<ffffffff810a5b8f>] kthread+0xcf/0xe0
[<ffffffff81646a98>] ret_from_fork+0x58/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
[root@wolf-4 combined]# cat /proc/8509/stack
[<ffffffff8108c04f>] usleep_range+0x4f/0x70
[<ffffffffa269c99a>] dmu_tx_wait+0x33a/0x360 [zfs]
[<ffffffffa269ca45>] dmu_tx_assign+0x85/0x3f0 [zfs]
[<ffffffffa0f94fea>] osd_trans_start+0xaa/0x3c0 [osd_zfs]
[<ffffffffa10960db>] ofd_trans_start+0x6b/0xe0 [ofd]
[<ffffffffa109c0a3>] ofd_commitrw_write+0x943/0x1c20 [ofd]
[<ffffffffa109ff2d>] ofd_commitrw+0x51d/0xa40 [ofd]
[<ffffffffa0e0f8d2>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[<ffffffffa0de7f71>] tgt_brw_write+0xea1/0x1640 [ptlrpc]
[<ffffffffa0de4225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[<ffffffffa0d901ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[<ffffffffa0d94260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[<ffffffff810a5b8f>] kthread+0xcf/0xe0
[<ffffffff81646a98>] ret_from_fork+0x58/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
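
One way to tell whether a D-state thread like ll_ost_io01_001 above is genuinely stuck (here in dmu_tx_wait) rather than slowly making progress is to sample /proc/<pid>/stack a few times and compare. A small hypothetical helper for that:

import sys, time

def sample_stack(pid, samples=5, interval=2.0):
    # Read /proc/<pid>/stack repeatedly; an unchanging stack on a D-state
    # thread suggests it is genuinely blocked, not just slow.
    seen = {}
    for _ in range(samples):
        with open(f"/proc/{pid}/stack") as f:
            stack = f.read()
        seen[stack] = seen.get(stack, 0) + 1
        time.sleep(interval)
    for stack, count in sorted(seen.items(), key=lambda kv: -kv[1]):
        print(f"--- seen {count}/{samples} times ---\n{stack}")

if __name__ == "__main__":
    sample_stack(int(sys.argv[1]))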

Comment by John Salinas (Inactive) [ 15/Apr/17 ]

Yesterday I tried the following combinations:
Lustre 2.9.0 + latest coral_beta_combined record size 16M brw_size=16 draid zfs_abd_scatter_enabled = 0, max_pages_per_rpc=4096 – crash 10.8.1.3-2017-04-14-22:46:09
Lustre 2.9.0 + latest coral_beta_combined record size 16M brw_size=16 draid zfs_abd_scatter_enabled = 0, max_pages_per_rpc=256 – crash 10.8.1.3-2017-04-15-00:39:07

wolf-3 OSS 10.8.1.3-2017-04-14-22:46:09

[147931.299899] Lustre: lsdraid-OST0000: new disk, initializing
[147931.307239] Lustre: srv-lsdraid-OST0000: No data found on store. Initialize space
[147936.355608] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[147963.624729] Lustre: lsdraid-OST0000: Connection restored to bd4f4e40-dbac-a829-f1fd-3c4450a08dcb (at 192.168.1.6@o2ib)
[147970.995882] Lustre: lsdraid-OST0000: Connection restored to b9fbce4c-a90b-3f7f-770e-f9863c38efb5 (at 192.168.1.8@o2ib)
[147975.210049] Lustre: lsdraid-OST0000: Connection restored to 862f84d1-bf42-0dd3-ba54-1e1a9568317e (at 192.168.1.7@o2ib)
[147975.223042] Lustre: Skipped 1 previous similar message
[148306.620448] Lustre: lsdraid-OST0000: Connection restored to b9fbce4c-a90b-3f7f-770e-f9863c38efb5 (at 192.168.1.8@o2ib)
[148306.633674] Lustre: Skipped 1 previous similar message
[233987.779195] perf interrupt took too long (10163 > 9615), lowering kernel.perf_event_max_sample_rate to 13000
[414188.327658] Lustre: 83697:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1492208717/real 1492208717]  req@ffff880f11ac8300 x1564414877971952/t0(0) o39->lsdraid-MDT0000-lwp-OST0000@192.168.1.5@o2ib:12/10 lens 224/224 e 0 to 1 dl 1492208723 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[414188.364839] Lustre: Failing over lsdraid-OST0000
[414192.689319] Lustre: 118209:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1492208721/real 1492208721]  req@ffff8815f6846f00 x1564414877971968/t0(0) o400->MGC192.168.1.5@o2ib@192.168.1.5@o2ib:26/25 lens 224/224 e 0 to 1 dl 1492208728 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[414194.373337] Lustre: 83697:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1492208723/real 1492208723]  req@ffff880f11ac8300 x1564414877972032/t0(0) o251->MGC192.168.1.5@o2ib@192.168.1.5@o2ib:26/25 lens 224/224 e 0 to 1 dl 1492208729 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[414194.411850] Lustre: server umount lsdraid-OST0000 complete
[414368.256969] Lustre: lsdraid-OST0000: new disk, initializing
[414368.265405] Lustre: srv-lsdraid-OST0000: No data found on store. Initialize space
[414375.147139] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[414533.259382] Lustre: Failing over lsdraid-OST0000
[414533.276260] Lustre: server umount lsdraid-OST0000 complete
[414724.001373] Lustre: lsdraid-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[414725.696637] Lustre: lsdraid-OST0000: Will be in recovery for at least 2:30, or until 1 client reconnects
[414725.709414] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[414725.874431] Lustre: lsdraid-OST0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted.
[415336.132350] Lustre: lsdraid-OST0000: Connection restored to bd4f4e40-dbac-a829-f1fd-3c4450a08dcb (at 192.168.1.6@o2ib)
[415406.632740] ------------[ cut here ]------------
[415406.633861] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000401:0x7a:0x0] object 0x0:88 extent [50331648-57343999]: client csum 41b33fd5, server csum 649d3feb
[415406.665939] kernel BUG at include/linux/scatterlist.h:65!
[415406.674352] invalid opcode: 0000 [#1] SMP 
[415406.681344] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null xfs libcrc32c rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_service_time ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper mpt3sas ablk_helper cryptd raid_class scsi_transport_sas ipmi_devintf ipmi_ssif iTCO_wdt
[415406.776798]  sg pcspkr iTCO_vendor_support ipmi_si ipmi_msghandler mei_me sb_edac acpi_power_meter ioatdma lpc_ich edac_core acpi_pad shpchp mei wmi i2c_i801 mfd_core dm_multipath dm_mod nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb crct10dif_pclmul ttm crct10dif_common ptp crc32c_intel ahci pps_core drm mlx4_core libahci dca i2c_algo_bit libata i2c_core [last unloaded: zunicode]
[415406.848441] CPU: 29 PID: 89865 Comm: ll_ost_io01_000 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[415406.863708] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[415406.878344] task: ffff8817d96e5c00 ti: ffff881a6b35c000 task.ti: ffff881a6b35c000
[415406.889651] RIP: 0010:[<ffffffffa0c0cfef>]  [<ffffffffa0c0cfef>] cfs_crypto_hash_update_page+0x9f/0xb0 [libcfs]
[415406.903951] RSP: 0018:ffff881a6b35fab8  EFLAGS: 00010202
[415406.912870] RAX: 0000000000000002 RBX: ffff8820050b5900 RCX: 0000000000000000
[415406.923849] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff881a6b35fad8
[415406.934787] RBP: ffff881a6b35fb00 R08: 00000000000195a0 R09: ffff881a6b35fab8
[415406.945693] R10: ffff88103e807900 R11: 0000000000000001 R12: 3534333231303635
[415406.956568] R13: 0000000032313036 R14: 0000000000000433 R15: 0000000000000000
[415406.967407] FS:  0000000000000000(0000) GS:ffff88203e6c0000(0000) knlGS:0000000000000000
[415406.979287] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[415406.988490] CR2: 00007fc89400b008 CR3: 000000000194a000 CR4: 00000000001407e0
[415406.999227] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[415407.009940] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[415407.020607] Stack:
[415407.025494]  0000000000000002 0000000000000000 0000000000000000 0000000000000000
[415407.036487]  00000000ced088e5 0000000000000000 ffff882024772701 ffff880db7053000
[415407.047418]  0000000000000000 ffff881a6b35fb68 ffffffffa0f8e459 ffff8819d6ea98a8
[415407.058319] Call Trace:
[415407.063640]  [<ffffffffa0f8e459>] tgt_checksum_bulk.isra.33+0x35a/0x4e7 [ptlrpc]
[415407.074501]  [<ffffffffa0f6721d>] tgt_brw_write+0x114d/0x1640 [ptlrpc]
[415407.084323]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[415407.092958]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[415407.102588]  [<ffffffffa0ebd560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[415407.113192]  [<ffffffffa0f63225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[415407.123952]  [<ffffffffa0f0f1ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[415407.135575]  [<ffffffffa0c13128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[415407.146329]  [<ffffffffa0f0cd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[415407.156963]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[415407.166363]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[415407.175301]  [<ffffffffa0f13260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[415407.184635]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[415407.193114]  [<ffffffffa0f127c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[415407.204113]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[415407.212374]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[415407.222423]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[415407.231187]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[415407.241105] Code: 89 43 38 48 8b 43 20 ff 50 c0 48 8b 55 d8 65 48 33 14 25 28 00 00 00 75 0d 48 83 c4 28 5b 41 5c 41 5d 41 5e 5d c3 e8 61 e0 46 e0 <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 
[415407.268624] RIP  [<ffffffffa0c0cfef>] cfs_crypto_hash_update_page+0x9f/0xb0 [libcfs]
[415407.279914]  RSP <ffff881a6b35fab8>

wolf-3 OSS 10.8.1.3-2017-04-15-00:39:07

[ 6415.538534] Lustre: lsdraid-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[ 6422.155237] Lustre: lsdraid-OST0000: Will be in recovery for at least 2:30, or until 1 client reconnects
[ 6422.165992] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[ 6422.291438] Lustre: lsdraid-OST0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted.
[ 6422.301549] Lustre: lsdraid-OST0000: deleting orphan objects from 0x0:91 to 0x0:129
[ 6474.856831] Lustre: lsdraid-OST0000: Connection restored to  (at 192.168.1.8@o2ib)
[ 6565.960924] BUG: Bad page state in process ll_ost_io01_007  pfn:18eecce
[ 6565.961668] BUG: Bad page state in process ll_ost_io01_006  pfn:18eecca
[ 6565.961672] page:ffffea0063bb3280 count:-1 mapcount:0 mapping:          (null) index:0x0
[ 6565.961674] page flags: 0x6fffff00000000()
[ 6565.961675] page dumped because: nonzero _count
[ 6565.961726] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas raid_class lrw gf128mul glue_helper ablk_helper cryptd scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me ipmi_devintf ipmi_ssif lpc_ich sb_edac ipmi_si
[ 6565.961778]  edac_core sg ipmi_msghandler mei shpchp pcspkr ioatdma mfd_core i2c_i801 wmi acpi_pad acpi_power_meter nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm ptp crct10dif_pclmul pps_core crct10dif_common ahci drm crc32c_intel dca libahci mlx4_core i2c_algo_bit libata i2c_core
[ 6565.961782] CPU: 31 PID: 10886 Comm: ll_ost_io01_006 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 6565.961784] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 6565.961792]  ffffea0063bb3280 000000008d05d0f2 ffff88202236f6f8 ffffffff81636431
[ 6565.961797]  ffff88202236f720 ffffffff81631645 ffff88203e759c68 0000000000003735
[ 6565.961803]  0000000000000001 ffff88202236f828 ffffffff81173028 ffff881022e59370
[ 6565.961804] Call Trace:
[ 6565.961819]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[ 6565.961824]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[ 6565.961833]  [<ffffffff81173028>] get_page_from_freelist+0x848/0x9b0
[ 6565.961844]  [<ffffffffa06cadaa>] ? spl_kmem_free+0x2a/0x40 [spl]
[ 6565.961848]  [<ffffffff81173327>] __alloc_pages_nodemask+0x197/0xba0
[ 6565.961862]  [<ffffffffa01f9f02>] ? mlx4_ib_post_send+0x4e2/0xb20 [mlx4_ib]
[ 6565.961910]  [<ffffffffa0b68f8d>] ? lu_obj_hop_keycmp+0x1d/0x30 [obdclass]
[ 6565.961927]  [<ffffffffa081d717>] ? cfs_hash_bd_lookup_intent+0x57/0x160 [libcfs]
[ 6565.961935]  [<ffffffff811b4afa>] alloc_pages_current+0xaa/0x170
[ 6565.961952]  [<ffffffffa0d5786b>] osd_bufs_get+0x4cb/0xba0 [osd_zfs]
[ 6565.961970]  [<ffffffffa10ade3d>] ofd_preprw_write.isra.29+0x1bd/0xcd0 [ofd]
[ 6565.961980]  [<ffffffffa10af13a>] ofd_preprw+0x7ea/0x10c0 [ofd]
[ 6565.962092]  [<ffffffffa0e8fce7>] tgt_brw_write+0xc17/0x1640 [ptlrpc]
[ 6565.962098]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[ 6565.962105]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[ 6565.962110]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[ 6565.962115]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[ 6565.962178]  [<ffffffffa0e8c225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 6565.962234]  [<ffffffffa0e381ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 6565.962249]  [<ffffffffa081a128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 6565.962302]  [<ffffffffa0e35d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 6565.962311]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[ 6565.962315]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[ 6565.962368]  [<ffffffffa0e3c260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 6565.962377]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[ 6565.962428]  [<ffffffffa0e3b7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 6565.962436]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 6565.962441]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6565.962449]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 6565.962454]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6565.962456] Disabling lock debugging due to kernel taint
[ 6565.962539] BUG: Bad page state in process ll_ost_io01_006  pfn:18eecc5
[ 6565.962541] page:ffffea0063bb3140 count:-1 mapcount:0 mapping:          (null) index:0x0
[ 6565.962542] page flags: 0x6fffff00000000()
[ 6565.962543] page dumped because: nonzero _count
[ 6565.962576] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas raid_class lrw gf128mul glue_helper ablk_helper cryptd scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me ipmi_devintf ipmi_ssif lpc_ich sb_edac ipmi_si
[ 6565.962601]  edac_core sg ipmi_msghandler mei shpchp pcspkr ioatdma mfd_core i2c_i801 wmi acpi_pad acpi_power_meter nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm ptp crct10dif_pclmul pps_core crct10dif_common ahci drm crc32c_intel dca libahci mlx4_core i2c_algo_bit libata i2c_core
[ 6565.962604] CPU: 31 PID: 10886 Comm: ll_ost_io01_006 Tainted: G    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 6565.962605] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 6565.962612]  ffffea0063bb3140 000000008d05d0f2 ffff88202236f6f8 ffffffff81636431
[ 6565.962619]  ffff88202236f720 ffffffff81631645 ffff88203e759c68 0000000000003735
[ 6565.962625]  0000000000000001 ffff88202236f828 ffffffff81173028 ffff881022e59370
[ 6565.962626] Call Trace:
[ 6565.962632]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[ 6565.962636]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[ 6565.962641]  [<ffffffff81173028>] get_page_from_freelist+0x848/0x9b0
[ 6565.962650]  [<ffffffffa06cadaa>] ? spl_kmem_free+0x2a/0x40 [spl]
[ 6565.962655]  [<ffffffff81173327>] __alloc_pages_nodemask+0x197/0xba0
[ 6565.962669]  [<ffffffffa01f9f02>] ? mlx4_ib_post_send+0x4e2/0xb20 [mlx4_ib]
[ 6565.962711]  [<ffffffffa0b68f8d>] ? lu_obj_hop_keycmp+0x1d/0x30 [obdclass]
[ 6565.962727]  [<ffffffffa081d717>] ? cfs_hash_bd_lookup_intent+0x57/0x160 [libcfs]
[ 6565.962733]  [<ffffffff811b4afa>] alloc_pages_current+0xaa/0x170
[ 6565.962745]  [<ffffffffa0d5786b>] osd_bufs_get+0x4cb/0xba0 [osd_zfs]
[ 6565.962767]  [<ffffffffa10ade3d>] ofd_preprw_write.isra.29+0x1bd/0xcd0 [ofd]
[ 6565.962781]  [<ffffffffa10af13a>] ofd_preprw+0x7ea/0x10c0 [ofd]
[ 6565.962855]  [<ffffffffa0e8fce7>] tgt_brw_write+0xc17/0x1640 [ptlrpc]
[ 6565.962861]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[ 6565.962866]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[ 6565.962870]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[ 6565.962875]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[ 6565.962949]  [<ffffffffa0e8c225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 6565.963019]  [<ffffffffa0e381ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 6565.963034]  [<ffffffffa081a128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 6565.963103]  [<ffffffffa0e35d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 6565.963109]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[ 6565.963112]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[ 6565.963181]  [<ffffffffa0e3c260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 6565.963187]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[ 6565.963256]  [<ffffffffa0e3b7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 6565.963262]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 6565.963267]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6565.963273]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 6565.963278]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6565.963280] BUG: Bad page state in process ll_ost_io01_006  pfn:18eecc6
[ 6565.963282] page:ffffea0063bb3180 count:-1 mapcount:0 mapping:          (null) index:0x0
[ 6565.963284] page flags: 0x6fffff00000000()
[ 6565.963285] page dumped because: nonzero _count
[ 6565.963320] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas raid_class lrw gf128mul glue_helper ablk_helper cryptd scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me ipmi_devintf ipmi_ssif lpc_ich sb_edac ipmi_si
[ 6565.963346]  edac_core sg ipmi_msghandler mei shpchp pcspkr ioatdma mfd_core i2c_i801 wmi acpi_pad acpi_power_meter nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm ptp crct10dif_pclmul pps_core crct10dif_common ahci drm crc32c_intel dca libahci mlx4_core i2c_algo_bit libata i2c_core
[ 6565.963349] CPU: 31 PID: 10886 Comm: ll_ost_io01_006 Tainted: G    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 6565.963350] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 6565.963358]  ffffea0063bb3180 000000008d05d0f2 ffff88202236f6f8 ffffffff81636431
[ 6565.963365]  ffff88202236f720 ffffffff81631645 ffff88203e759c68 0000000000003735
[ 6565.963372]  0000000000000001 ffff88202236f828 ffffffff81173028 ffff881022e59370
[ 6565.963372] Call Trace:
[ 6565.963378]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[ 6565.963383]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[ 6565.963388]  [<ffffffff81173028>] get_page_from_freelist+0x848/0x9b0
[ 6565.963397]  [<ffffffffa06cadaa>] ? spl_kmem_free+0x2a/0x40 [spl]
[ 6565.963403]  [<ffffffff81173327>] __alloc_pages_nodemask+0x197/0xba0
[ 6565.963416]  [<ffffffffa01f9f02>] ? mlx4_ib_post_send+0x4e2/0xb20 [mlx4_ib]
[ 6565.963458]  [<ffffffffa0b68f8d>] ? lu_obj_hop_keycmp+0x1d/0x30 [obdclass]
[ 6565.963473]  [<ffffffffa081d717>] ? cfs_hash_bd_lookup_intent+0x57/0x160 [libcfs]
[ 6565.963479]  [<ffffffff811b4afa>] alloc_pages_current+0xaa/0x170
[ 6565.963491]  [<ffffffffa0d5786b>] osd_bufs_get+0x4cb/0xba0 [osd_zfs]
[ 6565.963506]  [<ffffffffa10ade3d>] ofd_preprw_write.isra.29+0x1bd/0xcd0 [ofd]
[ 6565.963519]  [<ffffffffa10af13a>] ofd_preprw+0x7ea/0x10c0 [ofd]
[ 6565.963593]  [<ffffffffa0e8fce7>] tgt_brw_write+0xc17/0x1640 [ptlrpc]
[ 6565.963599]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[ 6565.963603]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[ 6565.963607]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[ 6565.963612]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[ 6565.963686]  [<ffffffffa0e8c225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 6565.963756]  [<ffffffffa0e381ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 6565.963778]  [<ffffffffa081a128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 6565.963847]  [<ffffffffa0e35d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 6565.963853]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[ 6565.963856]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[ 6565.963925]  [<ffffffffa0e3c260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 6565.963931]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[ 6565.964000]  [<ffffffffa0e3b7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 6565.964006]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 6565.964011]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6565.964016]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 6565.964021]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6567.436859] page:ffffea0063bb3380 count:-1 mapcount:0 mapping:          (null) index:0x0
[ 6567.447916] page flags: 0x6fffff00000000()
[ 6567.454287] page dumped because: nonzero _count
[ 6567.461107] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas raid_class lrw gf128mul glue_helper ablk_helper cryptd scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me ipmi_devintf ipmi_ssif lpc_ich sb_edac ipmi_si
[ 6567.549458]  edac_core sg ipmi_msghandler mei shpchp pcspkr ioatdma mfd_core i2c_i801 wmi acpi_pad acpi_power_meter nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm ptp crct10dif_pclmul pps_core crct10dif_common ahci drm crc32c_intel dca libahci mlx4_core i2c_algo_bit libata i2c_core
[ 6567.606553] CPU: 19 PID: 11266 Comm: ll_ost_io01_007 Tainted: G    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 6567.619967] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 6567.632682]  ffffea0063bb3380 0000000029637c7c ffff880f32283908 ffffffff81636431
[ 6567.642074]  ffff880f32283930 ffffffff81631645 ffffea0063bb3380 0000000000000000
[ 6567.651459]  000fffff00000000 ffff880f32283978 ffffffff811714dd fff00000fe000000
[ 6567.660857] Call Trace:
[ 6567.664645]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[ 6567.671441]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[ 6567.678829]  [<ffffffff811714dd>] free_pages_prepare+0x16d/0x190
[ 6567.686591]  [<ffffffff81171e21>] free_hot_cold_page+0x31/0x140
[ 6567.694250]  [<ffffffff8117200f>] __free_pages+0x3f/0x60
[ 6567.701235]  [<ffffffffa0d56ad3>] osd_bufs_put+0x123/0x1f0 [osd_zfs]
[ 6567.709381]  [<ffffffffa10ab84a>] ofd_commitrw_write+0xea/0x1c20 [ofd]
[ 6567.717717]  [<ffffffffa10aff2d>] ofd_commitrw+0x51d/0xa40 [ofd]
[ 6567.725522]  [<ffffffffa0eb78d2>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[ 6567.733604]  [<ffffffffa0e8ff71>] tgt_brw_write+0xea1/0x1640 [ptlrpc]
[ 6567.741863]  [<ffffffffa0de6560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[ 6567.751008]  [<ffffffffa0e8c225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 6567.760002]  [<ffffffffa0e381ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 6567.769852]  [<ffffffffa081a128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 6567.778843]  [<ffffffffa0e35d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 6567.787757]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[ 6567.796038]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[ 6567.803866]  [<ffffffffa0e3c260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 6567.812150]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[ 6567.819620]  [<ffffffffa0e3b7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 6567.828951]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 6567.835460]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6567.843817]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 6567.850900]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6591.647844] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000402:0x38:0x0] object 0x0:151 extent [67108864-74711039]: client csum 10225ab5, server csum d83f5ab1
[ 6602.366408] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000402:0x46:0x0] object 0x0:158 extent [67108864-82968575]: client csum df6bd34a, server csum a629d34d
[ 6611.821644] general protection fault: 0000 [#1] SMP 
[ 6611.829518] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas raid_class lrw gf128mul glue_helper ablk_helper cryptd scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me ipmi_devintf ipmi_ssif lpc_ich sb_edac ipmi_si
[ 6611.923714]  edac_core sg ipmi_msghandler mei shpchp pcspkr ioatdma mfd_core i2c_i801 wmi acpi_pad acpi_power_meter nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm ptp crct10dif_pclmul pps_core crct10dif_common ahci drm crc32c_intel dca libahci mlx4_core i2c_algo_bit libata i2c_core
[ 6611.985416] CPU: 55 PID: 9668 Comm: ll_ost_io01_000 Tainted: G    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 6611.999894] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 6612.013786] task: ffff880fd957e780 ti: ffff880fdb368000 task.ti: ffff880fdb368000
[ 6612.024361] RIP: 0010:[<ffffffffa0814e30>]  [<ffffffffa0814e30>] adler32_update+0x70/0x250 [libcfs]
[ 6612.036764] RSP: 0018:ffff880fdb36b990  EFLAGS: 00010212
[ 6612.044902] RAX: 0000000000000cce RBX: 0000000000000cce RCX: 3433323130363534
[ 6612.055097] RDX: 0000000000000cce RSI: 0cd1944c0d8d4332 RDI: 0cd1944c0d8d4332
[ 6612.065272] RBP: ffff880fdb36b9f8 R08: 00000000000195a0 R09: 0000000000000cce
[ 6612.075453] R10: ffff88103e807900 R11: 0000000000000001 R12: 3433323130363534
[ 6612.085641] R13: 0000000031303635 R14: ffffffffa0834410 R15: 0000000000000001
[ 6612.095830] FS:  0000000000000000(0000) GS:ffff88203e8c0000(0000) knlGS:0000000000000000
[ 6612.107119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6612.115792] CR2: 00007f19c6c7c000 CR3: 000000000194a000 CR4: 00000000001407e0
[ 6612.126030] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6612.136265] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 6612.146492] Stack:
[ 6612.150994]  ffff881d1a0d1cd0 00000ccedb36b9c8 0cd1944c0d8d4332 0000000000000000
[ 6612.161627]  00000cce00000000 ffffffffa0834410 ffff882027752a08 ffff880fdb36b9f0
[ 6612.172284]  0cd1944c0d8d4332 3433323130363534 0000000031303635 ffffffffa0834410
[ 6612.182948] Call Trace:
[ 6612.187988]  [<ffffffff812b1a78>] crypto_shash_update+0x38/0x100
[ 6612.197017]  [<ffffffff812b1d6e>] shash_ahash_update+0x3e/0x70
[ 6612.205854]  [<ffffffff812b1db2>] shash_async_update+0x12/0x20
[ 6612.214676]  [<ffffffffa0813fce>] cfs_crypto_hash_update_page+0x7e/0xb0 [libcfs]
[ 6612.225344]  [<ffffffffa0eb7459>] tgt_checksum_bulk.isra.33+0x35a/0x4e7 [ptlrpc]
[ 6612.236606]  [<ffffffffa0e9021d>] tgt_brw_write+0x114d/0x1640 [ptlrpc]
[ 6612.246831]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[ 6612.255910]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[ 6612.265910]  [<ffffffffa0de6560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[ 6612.276879]  [<ffffffffa0e8c225>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 6612.287460]  [<ffffffffa0e381ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 6612.298869]  [<ffffffffa081a128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 6612.309312]  [<ffffffffa0e35d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 6612.319759]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[ 6612.329610]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[ 6612.338955]  [<ffffffffa0e3c260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 6612.348565]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[ 6612.357360]  [<ffffffffa0e3b7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 6612.368146]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 6612.376092]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6612.385802]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 6612.394179]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 6612.403767] Code: 44 00 00 8b 5d b8 b8 b0 15 00 00 81 fb b0 15 00 00 0f 46 c3 29 45 b8 83 f8 0f 89 45 a4 0f 8e f8 00 00 00 48 8b 7d a8 89 45 bc 90 <44> 0f b6 2f 44 0f b6 77 01 48 83 c7 10 44 0f b6 67 f2 0f b6 5f 
[ 6612.430647] RIP  [<ffffffffa0814e30>] adler32_update+0x70/0x250 [libcfs]
[ 6612.440428]  RSP <ffff880fdb36b990>
Comment by John Salinas (Inactive) [ 15/Apr/17 ]

The above dumps are now on onyx:
/scratch/johnsali/
-rw-r--r-- 1 johnsali johnsali 430M Apr 15 08:24 10.8.1.3-2017-04-14-224609.tgz
-rw-r--r-- 1 johnsali johnsali 142M Apr 15 08:24 10.8.1.3-2017-04-15-003907.tgz
-rw-r--r-- 1 johnsali johnsali 707M Apr 15 08:26 10.8.1.3-2017-04-15-132245.tgz
-rw-r--r-- 1 johnsali johnsali 274M Apr 15 08:27 10.8.1.4-2017-04-15-002617.tgz
-rw-r--r-- 1 johnsali johnsali 485M Apr 15 08:29 10.8.1.4-2017-04-15-014743.tgz
-rw-r--r-- 1 johnsali johnsali 782M Apr 15 08:31 10.8.1.4-2017-04-15-132247.tgz

Comment by John Salinas (Inactive) [ 17/Apr/17 ]

I did a mostly successful run with Lustre 2.9.0 + ZFS 0.7.0 RC3 (none of our patches), with a 1M record size on both OSSes (which forces a max RPC size of 1MB), both raidz. With this combination set I did not see BAD WRITE CHECKSUM. My assumption based on these runs is that the 16MB ZFS record size and the associated brw size are the issue. The one outstanding question is whether this is only caused by FileAger. I will do some runs today without this application and see if I can reproduce it with any others. If it is helpful, I could narrow it down to a routine in FileAger.
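For reference, a minimal sketch of how those sizes can be set and checked, assuming the standard ZFS/Lustre tunables (pool/dataset names below are placeholders, not the actual setup):

# On the OSS: cap the ZFS record size (takes effect for newly written data)
zfs set recordsize=1M lsdraid/ost0
# Server-side maximum bulk RPC size
lctl get_param obdfilter.*.brw_size
# Client-side: effective RPC size is max_pages_per_rpc * PAGE_SIZE
lctl get_param osc.*.max_pages_per_rpc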

Comment by John Salinas (Inactive) [ 19/Apr/17 ]

Here is a test case:

#!/usr/bin/python3.4

import os
import uuid
import tempfile

def WriteFileBasic(multiplysize, blocksize, writechar, path):
    """
    Writes a basic file sequentially, filling it with writechar repeated
    multiplysize times per chunk and writing blocksize such chunks, so the
    file size is len(writechar) * multiplysize * blocksize bytes.
    """
    writethis = writechar * multiplysize
    unique_filename = uuid.uuid4()
    filetowrite = path + '/basic-' + str(unique_filename)

    # Write the same chunk blocksize times; the context manager closes the file.
    with open(filetowrite, 'wb') as fd:
        for x in range(blocksize):
            fd.write(bytes(writethis, 'UTF-8'))


directory = tempfile.mkdtemp(dir="/mnt/lustre")
if not os.path.exists(directory):
    os.makedirs(directory)

print("Writing files to: {0}".format(directory))
for i in range(100):
    for x in range(2):
        WriteFileBasic(multiplysize=204800, blocksize=1024, writechar='0123456', path=directory)
        WriteFileBasic(multiplysize=204800, blocksize=1024, writechar='0123456', path=directory)
        WriteFileBasic(multiplysize=204800, blocksize=128, writechar='0', path=directory)
        WriteFileBasic(multiplysize=204800, blocksize=128, writechar='0', path=directory)
        WriteFileBasic(multiplysize=204800, blocksize=1024, writechar='0123456', path=directory)
        WriteFileBasic(multiplysize=124, blocksize=1024, writechar='0', path=directory) 

I start multiple copies of that on one client node: at least 2, though 4 copies sometimes make this show up a little quicker.

Here is an example from tonight running this.
mpirun -np 4 -wdir /mnt/lustre -machinefile hosts -env I_MPI_EXTRA_FILESYSTEM=on -env I_MPI_EXTRA_FILESYSTEM_LIST=lustre /home/johnsali/wolf-3/ior/src/ior -a POSIX -F -N 4 -d 2 -i 1 -s 20480 -b 8m -t 8m
./testcase_pagestate.py &
./testcase_pagestate.py &

Almost immediately:
[103132.581173] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000bd0:0xa98:0x0] object 0x0:1018 extent [83886080-92979199]: client csum 3e2f59b2, server csum cf13a5a5

But it took several minutes to get:
[103332.411485] BUG: Bad page state in process ll_ost_io01_000 pfn:171cd2c
[103332.420695] page:ffffea005c734b00 count:-1 mapcount:0 mapping: (null) index:0x0
[103332.431396] page flags: 0x6fffff00000000()
[103332.437504] page dumped because: nonzero _count
...etc...

The case is basically writing large file, large file, very small file (or large file, very small file) while streaming IO from ior is going on.

The client sees:

[ 2618.770550] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.3@o2ib inode [0x200000bd0:0x51:0x0] object 0x0:55 extent [83886080-91750399]
[ 2618.789138] LustreError: 4304:0:(osc_request.c:1369:check_write_checksum()) original client csum 361b6a59 (type 1), server csum eba84c4d (type 1), client csum now 361b6a59
[ 2618.806340] LustreError: 4304:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff88104b076900 x1564965266913648/t8589935484(8589935484) o4->lsdraid-OST0000-osc-ffff880831e2b000@192.168.1.3@o2ib:6/4 lens 608/416 e 0 to 0 dl 1492467295 ref 3 fl Interpret:R/4/0 rc 0/0
[ 3863.562459] perf interrupt took too long (2511 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 7147.464039] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000bd0:0x26a:0x0] object 0x0:418 extent [50331648-56172543]
[ 7147.482834] LustreError: 4330:0:(osc_request.c:1369:check_write_checksum()) original client csum acbcc47a (type 1), server csum 669e8240 (type 1), client csum now acbcc47a
[ 7147.500040] LustreError: 4330:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff880cd3e78c00 x1564965267472464/t8589944546(8589944546) o4->lsdraid-OST0001-osc-ffff880831e2b000@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1492471826 ref 3 fl Interpret:R/4/0 rc 0/0
[11418.640873] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.3@o2ib inode [0x200000bd0:0x56f:0x0] object 0x0:544 extent [134217728-135659519]
[11418.659867] LustreError: 4329:0:(osc_request.c:1369:check_write_checksum()) original client csum 30220b9d (type 1), server csum a413949 (type 1), client csum now 30220b9d
[11418.677004] LustreError: 4329:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff88071d3c6600 x1564965270676240/t8590001480(8590001480) o4->lsdraid-OST0000-osc-ffff880831e2b000@192.168.1.3@o2ib:6/4 lens 608/416 e 0 to 0 dl 1492476105 ref 3 fl Interpret:R/4/0 rc 0/0
[99151.313874] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.4@o2ib inode [0x200000bd0:0xa34:0x0] object 0x0:1674 extent [16777216-26214399]
[99151.332763] LustreError: 4275:0:(osc_request.c:1369:check_write_checksum()) original client csum 504d71aa (type 1), server csum 4a62c464 (type 1), client csum now 504d71aa
[99151.349965] LustreError: 4275:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff880ddb749e00 x1564965274742880/t8590111250(8590111250) o4->lsdraid-OST0001-osc-ffff880831e2b000@192.168.1.4@o2ib:6/4 lens 608/416 e 0 to 0 dl 1492563918 ref 3 fl Interpret:R/4/0 rc 0/0
[99195.553911] LustreError: 132-0: BAD WRITE CHECKSUM: changed in transit before arrival at OST: from 192.168.1.3@o2ib inode [0x200000bd0:0xa98:0x0] object 0x0:1018 extent [83886080-92979199]
[99195.572806] LustreError: 4264:0:(osc_request.c:1369:check_write_checksum()) original client csum 3e2f59b2 (type 1), server csum cf13a5a5 (type 1), client csum now 3e2f59b2
[99195.595271] LustreError: 4264:0:(osc_request.c:1533:osc_brw_redo_request()) @@@ redo for recoverable error -11  req@ffff88060d47ce00 x1564965274910512/t8590045682(8590045682) o4->lsdraid-OST0000-osc-ffff880831e2b000@192.168.1.3@o2ib:6/4 lens 608/416 e 0 to 0 dl 1492563962 ref 3 fl Interpret:R/4/0 rc 0/0
Comment by Jinshan Xiong (Inactive) [ 19/Apr/17 ]

Thanks for the test case - I'm looking at it now

Comment by Jinshan Xiong (Inactive) [ 19/Apr/17 ]

I strongly suspect this is a use-after-free bug, based on the crash that occurred in:

cfs_crypto_hash_update_page+0x9f

[415406.889651] RIP: 0010:[<ffffffffa0c0cfef>]  [<ffffffffa0c0cfef>] cfs_crypto_hash_update_page+0x9f/0xb0 [libcfs]
[415406.903951] RSP: 0018:ffff881a6b35fab8  EFLAGS: 00010202
[415406.912870] RAX: 0000000000000002 RBX: ffff8820050b5900 RCX: 0000000000000000
[415406.923849] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff881a6b35fad8
[415406.934787] RBP: ffff881a6b35fb00 R08: 00000000000195a0 R09: ffff881a6b35fab8
[415406.945693] R10: ffff88103e807900 R11: 0000000000000001 R12: 3534333231303635
[415406.956568] R13: 0000000032313036 R14: 0000000000000433 R15: 0000000000000000

where %r12 is supposed to be a pointer to a struct page, but the value is now 3534333231303635.
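(Illustration only: decoding that register value as ASCII in Python, using nothing beyond the standard library, shows it is file data rather than a pointer:

>>> bytes.fromhex('3534333231303635').decode('ascii')
'54321065'
>>> bytes.fromhex('3534333231303635')[::-1].decode('ascii')  # memory byte order on x86
'56012345'

and '56012345' is a substring of the repeating '0123456' pattern the test case writes.)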

The bad page state occurred not only in the page free path, but also in the page allocation path.

[ 6565.963282] page:ffffea0063bb3180 count:-1 mapcount:0 mapping:          (null) index:0x0
[ 6565.963284] page flags: 0x6fffff00000000()
[ 6565.963285] page dumped because: nonzero _count
[ 6565.963320] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas raid_class lrw gf128mul glue_helper ablk_helper cryptd scsi_transport_sas iTCO_wdt iTCO_vendor_support mei_me ipmi_devintf ipmi_ssif lpc_ich sb_edac ipmi_si
[ 6565.963346]  edac_core sg ipmi_msghandler mei shpchp pcspkr ioatdma mfd_core i2c_i801 wmi acpi_pad acpi_power_meter nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm ptp crct10dif_pclmul pps_core crct10dif_common ahci drm crc32c_intel dca libahci mlx4_core i2c_algo_bit libata i2c_core
[ 6565.963349] CPU: 31 PID: 10886 Comm: ll_ost_io01_006 Tainted: G    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 6565.963350] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 6565.963358]  ffffea0063bb3180 000000008d05d0f2 ffff88202236f6f8 ffffffff81636431
[ 6565.963365]  ffff88202236f720 ffffffff81631645 ffff88203e759c68 0000000000003735
[ 6565.963372]  0000000000000001 ffff88202236f828 ffffffff81173028 ffff881022e59370
[ 6565.963372] Call Trace:
[ 6565.963378]  [<ffffffff81636431>] dump_stack+0x19/0x1b
[ 6565.963383]  [<ffffffff81631645>] bad_page.part.59+0xdf/0xfc
[ 6565.963388]  [<ffffffff81173028>] get_page_from_freelist+0x848/0x9b0

Even a wild pointer would be unlikely to point into the page array.

Comment by Jinshan Xiong (Inactive) [ 20/Apr/17 ]

As Don and John pointed out, the page pointer was actually overwritten by data, namely the ASCII codes of "54321065".

This page pointer is in the page array of the bulk buffer descriptor. I would strongly suggest trying other RPC sizes and seeing how it goes.
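For example, the client RPC size could be lowered like this for an experiment (values illustrative; 256 pages * 4KB = 1MB RPCs, versus 4096 pages for 16MB):

lctl set_param osc.*.max_pages_per_rpc=256
lctl get_param osc.*.max_pages_per_rpc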

Comment by Gerrit Updater [ 20/Apr/17 ]

Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/26749
Subject: LU-9305 target: debug patch for memory problem
Project: fs/lustre-release
Branch: b2_9
Current Patch Set: 1
Commit: 3db1d6171732665b5b99e47899cc5a1edd63901d

Comment by John Salinas (Inactive) [ 20/Apr/17 ]
[  680.163717] Lustre: lsdraid-OST0001: new disk, initializing
[  680.170127] Lustre: srv-lsdraid-OST0001: No data found on store. Initialize space
[  680.282057] Lustre: lsdraid-OST0001: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[  686.506162] Lustre: lsdraid-OST0001: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[  807.056123] Lustre: lsdraid-OST0001: Connection restored to 2db09e97-e362-e59d-705b-766599d48fc1 (at 192.168.1.7@o2ib)
[ 1091.277796] BUG: unable to handle kernel paging request at ffffc9044065c000
[ 1091.285701] IP: [<ffffffffa0558fd3>] __spl_cache_grow+0x53/0x2d0 [spl]
[ 1091.289923] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0001 from 12345-192.168.1.6@o2ib inode [0x200000401:0xf0:0x0] object 0x0:122 extent [16777216-26214399]: client csum 504d71aa, server csum 91a9afd8
[ 1091.314026] PGD 103e842067 PUD 1d6edd2067 PMD 1d6edd6067 PTE 8c118c0c0c0c0163
[ 1091.322114] Oops: 000b [#1] SMP 
[ 1091.325755] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache zfs(OE) zunicode(OE) zavl(OE) icp(OE) xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm zcommon(OE) ib_ucm znvpair(OE) ib_uverbs ib_umad spl(OE) rdma_cm ib_cm zlib_deflate iw_cm dm_service_time ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul mpt3sas iTCO_wdt glue_helper ablk_helper cryptd iTCO_vendor_support mei_me raid_class scsi_transport_sas mei lpc_ich sg ipmi_devintf sb_edac i2c_i801
[ 1091.406299]  ioatdma shpchp ipmi_ssif pcspkr mfd_core edac_core ipmi_si ipmi_msghandler acpi_pad acpi_power_meter wmi nfsd dm_multipath dm_mod nfs_acl lockd grace binfmt_misc auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ttm crct10dif_pclmul igb crct10dif_common ptp crc32c_intel ahci pps_core drm mlx4_core libahci dca libata i2c_algo_bit i2c_core
[ 1091.457502] CPU: 23 PID: 2206 Comm: spl_kmem_cache Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 1091.469686] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 1091.481391] task: ffff8820062d5c00 ti: ffff882005cdc000 task.ti: ffff882005cdc000
[ 1091.490950] RIP: 0010:[<ffffffffa0558fd3>]  [<ffffffffa0558fd3>] __spl_cache_grow+0x53/0x2d0 [spl]
[ 1091.502162] RSP: 0018:ffff882005cdfda8  EFLAGS: 00010286
[ 1091.509211] RAX: ffffc9044065c000 RBX: 00000000002aa040 RCX: 8000001e22e5a163
[ 1091.518301] RDX: ffff880da3e51320 RSI: 0000000000000000 RDI: ffffffff819c4a80
[ 1091.527375] RBP: ffff882005cdfdf0 R08: 8000000000000163 R09: ffffea00788b96c0
[ 1091.537364] R10: 0000000000003735 R11: 000000000000007d R12: ffff88201476a3c0
[ 1091.547332] R13: ffffc9044065c000 R14: ffff88201476a440 R15: ffff881ff9cf4800
[ 1091.557229] FS:  0000000000000000(0000) GS:ffff88203e540000(0000) knlGS:0000000000000000
[ 1091.568335] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1091.576885] CR2: ffffc9044065c000 CR3: 000000000194a000 CR4: 00000000001407e0
[ 1091.586967] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1091.597005] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1091.606947] Stack:
[ 1091.611234]  ffffffff8163b4e8 00000004000a0000 ffff882005cdffd8 ffff882005cdffd8
[ 1091.621938]  ffff881ff9cf4800 ffff88201476a3c0 ffff8820236fe9d0 ffff88201476a440
[ 1091.632723]  0000000000000000 ffff882005cdfe10 ffffffffa055926d ffff882008688a00
[ 1091.643454] Call Trace:
[ 1091.648425]  [<ffffffff8163b4e8>] ? __schedule+0x2d8/0x900
[ 1091.656983]  [<ffffffffa055926d>] spl_cache_grow_work+0x1d/0x40 [spl]
[ 1091.666325]  [<ffffffffa055aed6>] taskq_thread+0x246/0x470 [spl]
[ 1091.675091]  [<ffffffff810b8940>] ? wake_up_state+0x20/0x20
[ 1091.683310]  [<ffffffffa055ac90>] ? taskq_thread_spawn+0x60/0x60 [spl]
[ 1091.692681]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 1091.699315]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 1091.707826]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 1091.715012]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 1091.723419] Code: 48 14 00 00 0a 00 89 75 c4 8b 77 64 48 8b 7f 50 89 5d c0 81 65 c0 00 00 0a 00 e8 29 ff ff ff 48 85 c0 49 89 c5 0f 84 af 01 00 00 <c7> 00 22 22 22 22 49 8d 5d 20 41 8b 57 60 c7 45 c8 00 00 00 00 
[ 1091.747870] RIP  [<ffffffffa0558fd3>] __spl_cache_grow+0x53/0x2d0 [spl]
[ 1091.756319]  RSP <ffff882005cdfda8>
[ 1091.761197] CR2: ffffc9044065c000

Will work on getting dump uploaded

Comment by John Salinas (Inactive) [ 20/Apr/17 ]

after cfs_fail_loc 718 is set:

[  435.209479] Lustre: lsdraid-OST0001: Recovery over after 0:19, of 5 clients 5 recovered and 0 were evicted.
[  435.209867] Lustre: lsdraid-OST0001: deleting orphan objects from 0x0:124 to 0x0:129
[  533.512212] Lustre: *** cfs_fail_loc=718, val=0***
[  533.517927] BUG: unable to handle kernel paging request at ffff88010252eb80
[  533.525799] IP: [<ffffffff813018d9>] memset+0x9/0xb0
[  533.531412] PGD 1f36067 PUD 80000001000001e1 
[  533.536360] Oops: 0003 [#1] SMP 
[  533.540036] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate ses dm_service_time enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel mpt3sas lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt raid_class scsi_transport_sas mei_me sb_edac ipmi_devintf ipmi_ssif sg i2c_i801 mei iTCO_vendor_support
[  533.621233]  edac_core ioatdma ipmi_si lpc_ich shpchp pcspkr mfd_core ipmi_msghandler acpi_power_meter acpi_pad wmi dm_multipath dm_mod nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_ib ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm crct10dif_pclmul ptp crct10dif_common crc32c_intel ahci pps_core drm mlx4_core libahci dca libata i2c_algo_bit i2c_core
[  533.677646] CPU: 32 PID: 11307 Comm: ll_ost_io01_001 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[  533.691971] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[  533.705642] task: ffff881023490b80 ti: ffff880424ec0000 task.ti: ffff880424ec0000
[  533.716011] RIP: 0010:[<ffffffff813018d9>]  [<ffffffff813018d9>] memset+0x9/0xb0
[  533.726303] RSP: 0018:ffff880424ec3948  EFLAGS: 00010246
[  533.734246] RAX: 0000000000000000 RBX: ffff88010252fa00 RCX: 0000000000000080
[  533.744233] RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff88010252eb80
[  533.754220] RBP: ffff880424ec3980 R08: 00000000000195a0 R09: ffff88010252eb80
[  533.764200] R10: ffff88103e807900 R11: 0000000000000001 R12: 00000000000080d0
[  533.774184] R13: ffff88010252eb80 R14: 0000000000000064 R15: ffff88103e807900
[  533.784166] FS:  0000000000000000(0000) GS:ffff88203e780000(0000) knlGS:0000000000000000
[  533.795262] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  533.803720] CR2: ffff88010252eb80 CR3: 000000000194a000 CR4: 00000000001407e0
[  533.813755] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  533.823787] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  533.833820] Stack:
[  533.838100]  ffffffff811c2603 ffffffff812aa9ac ffff882004649548 ffffffff81691140
[  533.848528]  ffffffffa060e050 0000000000000008 ffffffff816910e0 ffff880424ec39b0
[  533.858946]  ffffffff812aa9ac ffff882004649548 ffffffffa060e050 ffffffffa060e050
[  533.869359] Call Trace:
[  533.874228]  [<ffffffff811c2603>] ? __kmalloc+0x1f3/0x230
[  533.882358]  [<ffffffff812aa9ac>] ? crypto_create_tfm+0x2c/0xd0
[  533.891067]  [<ffffffff812aa9ac>] crypto_create_tfm+0x2c/0xd0
[  533.899544]  [<ffffffff812b2607>] crypto_init_shash_ops_async+0x37/0xe0
[  533.908976]  [<ffffffff812b11d7>] crypto_ahash_init_tfm+0x37/0xb0
[  533.917782]  [<ffffffff812aa9c3>] crypto_create_tfm+0x43/0xd0
[  533.926170]  [<ffffffff812aaac3>] crypto_alloc_tfm+0x73/0xf0
[  533.934455]  [<ffffffff8108c29b>] ? lock_timer_base.isra.33+0x2b/0x50
[  533.943611]  [<ffffffff812b1269>] crypto_alloc_ahash+0x19/0x20
[  533.952069]  [<ffffffffa0a1f06a>] cfs_crypto_hash_alloc+0x6a/0x340 [libcfs]
[  533.961765]  [<ffffffffa0a1f47f>] cfs_crypto_hash_init+0x2f/0x60 [libcfs]
[  533.971325]  [<ffffffffa0e485b3>] tgt_checksum_bulk.isra.33+0xa1/0x4e7 [ptlrpc]
[  533.981867]  [<ffffffffa0e17abf>] ? do_kernel_tlb_flush+0x5f/0x70 [ptlrpc]
[  533.991484]  [<ffffffffa0e215d4>] tgt_brw_write+0x1494/0x1930 [ptlrpc]
[  534.000650]  [<ffffffff810c5618>] ? load_balance+0x218/0x890
[  534.008863]  [<ffffffff810bb7c5>] ? sched_clock_cpu+0x85/0xc0
[  534.017149]  [<ffffffff810c3ea8>] ? enqueue_task_fair+0x208/0x6c0
[  534.025789]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[  534.033578]  [<ffffffffa0e1d545>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[  534.043058]  [<ffffffffa0dc91ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[  534.053344]  [<ffffffffa0a25128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[  534.062655]  [<ffffffffa0dc6d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[  534.071893]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[  534.080490]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[  534.088602]  [<ffffffffa0dcd260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[  534.097161]  [<ffffffffa0dcc7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[  534.106930]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[  534.113801]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  534.122495]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[  534.129892]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  534.138519] Code: 16 fe 66 44 89 1f 66 44 89 54 17 fe eb 0c 48 83 fa 01 72 06 44 8a 1e 44 88 1f c3 90 90 90 90 90 90 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 
[  534.163663] RIP  [<ffffffff813018d9>] memset+0x9/0xb0
[  534.170708]  RSP <ffff880424ec3948>
[  534.175973] CR2: ffff88010252eb80

/scratch/dumps/wolf-4.wolf.hpdd.intel.com/10.8.1.4-2017-04-20-16:21:36/

Comment by John Salinas (Inactive) [ 20/Apr/17 ]

with patchset 2

[  173.090934] FS-Cache: Loaded
[  173.134655] FS-Cache: Netfs 'nfs' registered for caching
[  173.152193] Key type dns_resolver registered
[  173.185711] NFS: Registering the id_resolver key type
[  173.192435] Key type id_resolver registered
[  173.198100] Key type id_legacy registered
[  452.300699] LNet: HW CPU cores: 72, npartitions: 2
[  452.310445] alg: No test for adler32 (adler32-zlib)
[  452.317175] alg: No test for crc32 (crc32-table)
[  461.673655] LNet: HW CPU cores: 72, npartitions: 2
[  461.682181] alg: No test for adler32 (adler32-zlib)
[  461.688830] alg: No test for crc32 (crc32-table)
[  469.713466] Lustre: Lustre: Build Version: 2.9.0_1_g35befcd_dirty
[  469.790097] LNet: Added LNI 192.168.1.3@o2ib [8/256/0/180]
[  491.961472] SPL: using hostid 0x61303830
[  495.486200] Lustre: lsdraid-OST0000: new disk, initializing
[  495.493762] Lustre: srv-lsdraid-OST0000: No data found on store. Initialize space
[  500.820839] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[  526.243239] Lustre: lsdraid-OST0000: Connection restored to 2db09e97-e362-e59d-705b-766599d48fc1 (at 192.168.1.7@o2ib)
[  557.233889] Lustre: lsdraid-OST0000: Connection restored to c589bc26-079a-4258-d464-134826a46aab (at 192.168.1.8@o2ib)
[  557.247226] Lustre: Skipped 1 previous similar message
[  561.088981] Lustre: lsdraid-OST0000: Connection restored to fae0dc5a-dd13-67e6-6ad5-19863b0bf1b8 (at 192.168.1.6@o2ib)
[ 1139.285936] Lustre: lsdraid-OST0000: Connection restored to fae0dc5a-dd13-67e6-6ad5-19863b0bf1b8 (at 192.168.1.6@o2ib)
[ 1139.299353] Lustre: Skipped 1 previous similar message
[ 1270.492825] Lustre: *** cfs_fail_loc=718, val=0***
[ 1271.961394] Lustre: *** cfs_fail_loc=718, val=0***
[ 1273.070965] Lustre: *** cfs_fail_loc=718, val=0***
[ 1304.728187] Lustre: *** cfs_fail_loc=718, val=0***
[ 1304.735012] Lustre: Skipped 1 previous similar message
[ 1363.976425] Lustre: *** cfs_fail_loc=718, val=0***
[ 1371.987749] Lustre: *** cfs_fail_loc=718, val=0***
[ 1371.993779] Lustre: Skipped 425 previous similar messages
[ 1387.983061] Lustre: *** cfs_fail_loc=718, val=0***
[ 1387.989713] Lustre: Skipped 594 previous similar messages
[ 1397.290238] BUG: unable to handle kernel paging request at ffff881d4fb4ddd0
[ 1397.298860] IP: [<ffffffff810594e2>] native_set_pte+0x12/0x30
[ 1397.306019] PGD 1f36067 PUD 1fb0c60063 PMD 1fca1db063 PTE 8000001d4fb4d161
[ 1397.314507] Oops: 0003 [#1] SMP 
[ 1397.318869] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate ses enclosure dm_service_time intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel mpt3sas aesni_intel lrw gf128mul glue_helper ablk_helper ipmi_ssif cryptd iTCO_wdt iTCO_vendor_support ipmi_devintf
[ 1397.404650]  raid_class scsi_transport_sas lpc_ich ipmi_si sg mei_me shpchp mei pcspkr wmi mfd_core sb_edac acpi_power_meter edac_core ioatdma ipmi_msghandler acpi_pad i2c_i801 dm_multipath dm_mod ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt igb drm_kms_helper crct10dif_pclmul crct10dif_common ttm ptp ahci crc32c_intel pps_core libahci drm mlx4_core dca i2c_algo_bit libata i2c_core [last unloaded: libcfs]
[ 1397.462418] CPU: 33 PID: 25885 Comm: ll_ost_io01_008 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[ 1397.475779] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 1397.488478] task: ffff882011a8c500 ti: ffff881f25dc4000 task.ti: ffff881f25dc4000
[ 1397.497892] RIP: 0010:[<ffffffff810594e2>]  [<ffffffff810594e2>] native_set_pte+0x12/0x30
[ 1397.508120] RSP: 0018:ffff881f25dc7968  EFLAGS: 00010202
[ 1397.515140] RAX: 8000000d3f3ba161 RBX: ffff881d4fb4ddd0 RCX: 0000000d3f3ba000
[ 1397.524186] RDX: 8000000d3f3ba161 RSI: 8000000d3f3ba161 RDI: ffff881d4fb4ddd0
[ 1397.533247] RBP: ffff881f25dc7978 R08: 8000000d3f3ba163 R09: 0000000000d3f3ba
[ 1397.542307] R10: ffff881d349309f1 R11: 000000000000000f R12: 8000000d3f3ba161
[ 1397.551368] R13: ffff880d3f3ba000 R14: 0000000000000008 R15: ffff881f25dc7aa0
[ 1397.560432] FS:  0000000000000000(0000) GS:ffff88203e7c0000(0000) knlGS:0000000000000000
[ 1397.570571] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1397.578099] CR2: ffff881d4fb4ddd0 CR3: 000000000194a000 CR4: 00000000001407e0
[ 1397.587168] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1397.596240] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1397.605297] Stack:
[ 1397.608611]  ffff881f25dc7aa0 0000000000000000 ffff881f25dc7a68 ffffffff81061f2b
[ 1397.618028]  0000000000000002 0000000000000000 0000000000000246 0000000000d3f3bb
[ 1397.627446]  0000000800000001 0000000000d3f3ba ffff880dbf3ba000 ffff881f25dc79c0
[ 1397.636849] Call Trace:
[ 1397.640636]  [<ffffffff81061f2b>] __change_page_attr_set_clr+0xa3b/0xb40
[ 1397.649201]  [<ffffffff811a4e57>] ? vm_unmap_aliases+0x187/0x1b0
[ 1397.656971]  [<ffffffff8106218c>] change_page_attr_set_clr+0x15c/0x4d0
[ 1397.665322]  [<ffffffff810629cf>] set_memory_ro+0x2f/0x40
[ 1397.672472]  [<ffffffffa0e9d162>] set_niobuf_ro+0x9b/0xe0 [ptlrpc]
[ 1397.680512]  [<ffffffffa0e76202>] tgt_brw_write+0x1062/0x1720 [ptlrpc]
[ 1397.688894]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[ 1397.696110]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[ 1397.704316]  [<ffffffff812f29a5>] ? cpumask_next_and+0x35/0x50
[ 1397.711936]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[ 1397.718799]  [<ffffffffa0e71a95>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 1397.727414]  [<ffffffffa0e1e1ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 1397.737100]  [<ffffffffa09b9128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 1397.745824]  [<ffffffffa0e1bd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 1397.754313]  [<ffffffff8163dc7b>] ? _raw_spin_unlock_irqrestore+0x1b/0x40
[ 1397.762820]  [<ffffffffa0e22260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 1397.770720]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[ 1397.777748]  [<ffffffffa0e217c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 1397.786874]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[ 1397.793159]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 1397.801280]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[ 1397.808119]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[ 1397.816197] Code: 5c 5d c3 66 0f 1f 44 00 00 48 89 d7 e8 c8 d7 00 00 4c 89 23 5b 41 5c 5d c3 55 48 89 e5 41 54 49 89 f4 53 48 89 fb 0f 1f 44 00 00 <4c> 89 23 5b 41 5c 5d c3 66 0f 1f 44 00 00 e8 9b d7 00 00 4c 89 
[ 1397.839619] RIP  [<ffffffff810594e2>] native_set_pte+0x12/0x30
[ 1397.846922]  RSP <ffff881f25dc7968>
[ 1397.851564] CR2: ffff881d4fb4ddd0

Dump is on wolf: /scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-04-20-22:07:48/

Comment by John Salinas (Inactive) [ 20/Apr/17 ]

second hit with patchset 2

[  857.262725] Lustre: Lustre: Build Version: 2.9.0_1_g35befcd_dirty
[  857.384040] LNet: Added LNI 192.168.1.3@o2ib [8/256/0/180]
[  857.520676] ZFS: Unloaded module v0.7.0-rc3_36_g401aa88
[  857.533582] SPL: Unloaded module v0.7.0-rc3
[  857.547935] SPL: Loaded module v0.7.0-rc3
[  859.612907] ZFS: Loaded module v0.7.0-rc3_36_g401aa88, ZFS pool version 5000, ZFS filesystem version 5
[  870.553672] NOTICE: metaslab_init: vd-0, ms-0 forcing segregate bias via SMO 92
[  870.562005] NOTICE: metaslab_init: vd-0, ms-0 creating custom group
[  870.583160] SPL: using hostid 0x61303830
[  870.857423] NOTICE: metaslab_init: vd-0, ms-0 forcing segregate bias via SMO 92
[  870.865795] NOTICE: metaslab_init: vd-0, ms-0 creating custom group
[  876.472864] LustreError: 11-0: lsdraid-MDT0000-lwp-OST0000: operation mds_connect to node 192.168.1.5@o2ib failed: rc = -114
[  876.504803] Lustre: lsdraid-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[  912.586603] Lustre: lsdraid-OST0000: Will be in recovery for at least 2:30, or until 5 clients reconnect
[  912.597331] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[  919.714545] Lustre: lsdraid-OST0000: Connection restored to 07f342f3-1696-e195-8d16-791033c4157f (at 192.168.1.6@o2ib)
[  924.892432] Lustre: lsdraid-OST0000: Connection restored to af2d5328-0eb5-fb82-8955-57688f7071c8 (at 192.168.1.7@o2ib)
[  924.904473] Lustre: Skipped 1 previous similar message
[  925.135654] Lustre: *** cfs_fail_loc=718, val=0***
[  925.709380] Lustre: *** cfs_fail_loc=718, val=0***
[  925.714769] Lustre: Skipped 7 previous similar messages
[  926.739968] Lustre: *** cfs_fail_loc=718, val=0***
[  926.745354] Lustre: Skipped 59 previous similar messages
[  929.467794] Lustre: lsdraid-OST0000: Recovery over after 0:17, of 5 clients 5 recovered and 0 were evicted.
[  929.468012] Lustre: lsdraid-OST0000: deleting orphan objects from 0x0:17 to 0x0:33
[  929.482465] Lustre: *** cfs_fail_loc=718, val=0***
[  929.482469] Lustre: Skipped 24 previous similar messages
[  933.503369] Lustre: *** cfs_fail_loc=718, val=0***
[  933.508760] Lustre: Skipped 234 previous similar messages
[  941.692986] Lustre: *** cfs_fail_loc=718, val=0***
[  941.698373] Lustre: Skipped 576 previous similar messages
[  957.692264] Lustre: *** cfs_fail_loc=718, val=0***
[  957.697648] Lustre: Skipped 973 previous similar messages
[  989.055076] BUG: unable to handle kernel paging request at ffff881cd08a8000
[  989.062920] IP: [<ffffffff812fffa7>] clear_page_c_e+0x7/0x10
[  989.070052] PGD 1f36067 PUD 1d887c2063 PMD 1e0754d063 PTE 8000001cd08a8161
[  989.078525] Oops: 0003 [#1] SMP 
[  989.082865] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_service_time ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm mpt3sas crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd raid_class scsi_transport_sas sb_edac iTCO_wdt iTCO_vendor_support edac_core
[  989.167378]  ipmi_devintf mei_me ipmi_ssif lpc_ich pcspkr sg ipmi_si mei mfd_core shpchp ipmi_msghandler i2c_i801 ioatdma wmi acpi_power_meter acpi_pad dm_multipath dm_mod nfsd nfs_acl lockd grace binfmt_misc auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb crct10dif_pclmul ttm ahci crct10dif_common ptp crc32c_intel libahci pps_core drm mlx4_core dca libata i2c_algo_bit i2c_core [last unloaded: zunicode]
[  989.228104] CPU: 27 PID: 10318 Comm: ll_ost_io01_009 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[  989.241413] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[  989.254015] task: ffff88200a3c9700 ti: ffff880e42e08000 task.ti: ffff880e42e08000
[  989.263324] RIP: 0010:[<ffffffff812fffa7>]  [<ffffffff812fffa7>] clear_page_c_e+0x7/0x10
[  989.273352] RSP: 0018:ffff880e42e0b7c8  EFLAGS: 00010246
[  989.280239] RAX: 0000000000000000 RBX: 0000000073422a00 RCX: 0000000000001000
[  989.289149] RDX: 00000000000003c0 RSI: 0000000000000000 RDI: ffff881cd08a8000
[  989.298067] RBP: ffff880e42e0b8c8 R08: ffffffff818790bf R09: ffffea0073422c00
[  989.306984] R10: 0000000000000010 R11: 000000000000007d R12: ffff880e42e0bfd8
[  989.315897] R13: 0000000073422c00 R14: ffffea0073422800 R15: ffff880e42e08000
[  989.324812] FS:  0000000000000000(0000) GS:ffff88203e640000(0000) knlGS:0000000000000000
[  989.334786] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  989.342124] CR2: ffff881cd08a8000 CR3: 000000000194a000 CR4: 00000000001407e0
[  989.351012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  989.359883] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  989.368744] Stack:
[  989.371888]  ffffffff81172cca ffffffff81304e4a ffff88207ffd8008 0000000000000002
[  989.381128]  0000000000000000 0000000025501a00 ffffc90129c0c010 ffff880e42e0b898
[  989.390367]  ffff88207ffd8000 ffff880e42e0b858 0000000000000000 0000000225501ab8
[  989.399608] Call Trace:
[  989.403265]  [<ffffffff81172cca>] ? get_page_from_freelist+0x4ea/0x9b0
[  989.411492]  [<ffffffff81304e4a>] ? sg_init_table+0x1a/0x40
[  989.418647]  [<ffffffff81173327>] __alloc_pages_nodemask+0x197/0xba0
[  989.426713]  [<ffffffffa11c1627>] ? zio_data_buf_alloc+0x57/0x60 [zfs]
[  989.434934]  [<ffffffffa10ebd2b>] ? arc_get_data_buf.isra.28+0x3b/0x60 [zfs]
[  989.443730]  [<ffffffffa145d9d2>] ? osd_bufs_get+0x632/0xba0 [osd_zfs]
[  989.451939]  [<ffffffff811b4afa>] alloc_pages_current+0xaa/0x170
[  989.459565]  [<ffffffff8116e32e>] __get_free_pages+0xe/0x50
[  989.466708]  [<ffffffff811bfd8e>] kmalloc_order_trace+0x2e/0xa0
[  989.474233]  [<ffffffff811c2629>] __kmalloc+0x219/0x230
[  989.481032]  [<ffffffffa0e52372>] ? ptlrpc_new_bulk+0x52/0x860 [ptlrpc]
[  989.489368]  [<ffffffffa0e52779>] ptlrpc_new_bulk+0x459/0x860 [ptlrpc]
[  989.497599]  [<ffffffffa0e5edfd>] ptlrpc_prep_bulk_exp+0x5d/0x170 [ptlrpc]
[  989.506220]  [<ffffffffa0ec9e1c>] tgt_brw_write+0xc7c/0x1720 [ptlrpc]
[  989.514324]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[  989.521352]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[  989.529346]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[  989.536181]  [<ffffffffa0ec5a95>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[  989.544747]  [<ffffffffa0e721ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[  989.554156]  [<ffffffffa0a10128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[  989.563441]  [<ffffffffa0e6fd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[  989.572921]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[  989.581769]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[  989.590130]  [<ffffffffa0e76260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[  989.598893]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[  989.606772]  [<ffffffffa0e757c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[  989.616706]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[  989.623780]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  989.632663]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[  989.640240]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  989.649047] Code: bc 0f 1f 00 e8 cb b0 d7 ff 90 90 90 90 90 90 90 90 90 90 90 b9 00 02 00 00 31 c0 f3 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 <f3> aa c3 66 0f 1f 44 00 00 eb ee 0f 1f 84 00 00 00 00 00 0f 1f 
[  989.674496] RIP  [<ffffffff812fffa7>] clear_page_c_e+0x7/0x10
[  989.682493]  RSP <ffff880e42e0b7c8>
[  989.687942] CR2: ffff881cd08a8000

/scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-04-20-22:27:43

Comment by John Salinas (Inactive) [ 21/Apr/17 ]

first hit with patchset 4

[  196.980256] Lustre: lsdraid-OST0000: new disk, initializing
[  196.987888] Lustre: srv-lsdraid-OST0000: No data found on store. Initialize space
[  203.007167] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[  242.522855] Lustre: lsdraid-OST0000: Connection restored to 07f342f3-1696-e195-8d16-791033c4157f (at 192.168.1.6@o2ib)
[  242.536218] Lustre: Skipped 1 previous similar message
[  246.676409] Lustre: lsdraid-OST0000: Connection restored to af2d5328-0eb5-fb82-8955-57688f7071c8 (at 192.168.1.7@o2ib)
[  317.831618] Lustre: *** cfs_fail_loc=718, val=0***
[  317.838452] Lustre: Skipped 1 previous similar message
[  318.365837] Lustre: *** cfs_fail_loc=718, val=0***
[  318.371955] Lustre: Skipped 8 previous similar messages
[  319.443920] Lustre: *** cfs_fail_loc=718, val=0***
[  319.450175] Lustre: Skipped 13 previous similar messages
[  321.530983] Lustre: *** cfs_fail_loc=718, val=0***
[  321.537800] Lustre: Skipped 29 previous similar messages
[  325.532999] Lustre: *** cfs_fail_loc=718, val=0***
[  325.539815] Lustre: Skipped 56 previous similar messages
[  333.692456] Lustre: *** cfs_fail_loc=718, val=0***
[  333.698586] Lustre: Skipped 87 previous similar messages
[  349.711139] Lustre: *** cfs_fail_loc=718, val=0***
[  349.717261] Lustre: Skipped 110 previous similar messages
[  381.759538] Lustre: *** cfs_fail_loc=718, val=0***
[  381.765684] Lustre: Skipped 308 previous similar messages
[  446.160621] Lustre: *** cfs_fail_loc=718, val=0***
[  446.166751] Lustre: Skipped 367 previous similar messages
[  469.361039] Lustre: lsdraid-OST0000: haven't heard from client 07f342f3-1696-e195-8d16-791033c4157f (at 192.168.1.6@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882026c2cc00, cur 1492739200 expire 1492739050 last 1492738973
[  557.162229] Lustre: lsdraid-OST0000: Connection restored to af2d5328-0eb5-fb82-8955-57688f7071c8 (at 192.168.1.7@o2ib)
[  557.175994] Lustre: Skipped 1 previous similar message
[  573.640222] Lustre: lsdraid-OST0000: Connection restored to 07f342f3-1696-e195-8d16-791033c4157f (at 192.168.1.6@o2ib)
[  573.654129] Lustre: Skipped 1 previous similar message
[  593.981345] Lustre: *** cfs_fail_loc=718, val=0***
[  593.988918] Lustre: Skipped 167 previous similar messages
[  608.411987] BUG: unable to handle kernel paging request at ffff881f6732b000
[  608.420688] IP: [<ffffffff810594e2>] native_set_pte+0x12/0x30
[  608.427896] PGD 1f36067 PUD 1f8281c063 PMD 1f715f4063 PTE 8000001f6732b161
[  608.436393] Oops: 0003 [#1] SMP 
[  608.440766] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xprtrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm zfs(OE) ses zunicode(OE) enclosure zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate dm_service_time intel_powerclamp coretemp intel_rapl kvm_intel kvm mpt3sas crc32_pclmul ghash_clmulni_intel aesni_intel raid_class scsi_transport_sas lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ipmi_ssif
[  608.526014]  ipmi_devintf iTCO_vendor_support sg shpchp wmi ipmi_si pcspkr i2c_i801 sb_edac edac_core ipmi_msghandler ioatdma lpc_ich acpi_pad acpi_power_meter mei_me mei mfd_core dm_multipath dm_mod ip_tables ext4 mbcache jbd2 mlx4_ib ib_sa ib_mad mlx4_en vxlan ip6_udp_tunnel ib_core udp_tunnel ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb crct10dif_pclmul crct10dif_common ttm ptp ahci crc32c_intel libahci pps_core drm mlx4_core dca libata i2c_algo_bit i2c_core [last unloaded: libcfs]
[  608.583845] CPU: 27 PID: 78970 Comm: ll_ost_io01_006 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[  608.597165] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[  608.609810] task: ffff880c88b20b80 ti: ffff881f88454000 task.ti: ffff881f88454000
[  608.619155] RIP: 0010:[<ffffffff810594e2>]  [<ffffffff810594e2>] native_set_pte+0x12/0x30
[  608.629312] RSP: 0018:ffff881f88457968  EFLAGS: 00010286
[  608.636233] RAX: 8000000d17200163 RBX: ffff881f6732b000 RCX: 80000000000001e3
[  608.645190] RDX: fffffffffffffbff RSI: 8000000d17200163 RDI: ffff881f6732b000
[  608.654124] RBP: ffff881f88457978 R08: 00003ffffffff000 R09: ffff880000000000
[  608.663060] R10: 0000000000003735 R11: ffffffff812f2a89 R12: 8000000d17200163
[  608.671998] R13: 0000000000000001 R14: ffff881f6732c000 R15: 8000000000000163
[  608.680937] FS:  0000000000000000(0000) GS:ffff88203e640000(0000) knlGS:0000000000000000
[  608.690947] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  608.698339] CR2: ffff881f6732b000 CR3: 000000000194a000 CR4: 00000000001407e0
[  608.707293] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  608.716246] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  608.725191] Stack:
[  608.728415]  ffff881f6732b000 0000000000d17200 ffff881f88457a68 ffffffff81061843
[  608.737735]  0000000000000002 0000000000000000 0000000000000246 0000000000d17201
[  608.747061]  0000013600000001 0000000000d17200 ffff880d97200000 0000001f6732b000
[  608.756391] Call Trace:
[  608.760124]  [<ffffffff81061843>] __change_page_attr_set_clr+0x353/0xb40
[  608.768619]  [<ffffffff811a4e57>] ? vm_unmap_aliases+0x187/0x1b0
[  608.776336]  [<ffffffff8106218c>] change_page_attr_set_clr+0x15c/0x4d0
[  608.784635]  [<ffffffff810629cf>] set_memory_ro+0x2f/0x40
[  608.791716]  [<ffffffffa1168244>] set_niobuf_ro+0xd4/0xe0 [ptlrpc]
[  608.799634]  [<ffffffffa11411fb>] tgt_brw_write+0x105b/0x17c0 [ptlrpc]
[  608.807899]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[  608.815079]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[  608.822138]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[  608.830159]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[  608.837022]  [<ffffffffa113ca95>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[  608.845637]  [<ffffffffa10e91ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[  608.855115]  [<ffffffffa09a1128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[  608.863926]  [<ffffffffa10e6d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[  608.872632]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[  608.880820]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[  608.888467]  [<ffffffffa10ed260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[  608.896626]  [<ffffffff81013588>] ? __switch_to+0xf8/0x4b0
[  608.903881]  [<ffffffffa10ec7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[  608.913258]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[  608.919755]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  608.928110]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[  608.935143]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  608.943379] Code: 5c 5d c3 66 0f 1f 44 00 00 48 89 d7 e8 c8 d7 00 00 4c 89 23 5b 41 5c 5d c3 55 48 89 e5 41 54 49 89 f4 53 48 89 fb 0f 1f 44 00 00 <4c> 89 23 5b 41 5c 5d c3 66 0f 1f 44 00 00 e8 9b d7 00 00 4c 89 
[  608.967131] RIP  [<ffffffff810594e2>] native_set_pte+0x12/0x30
[  608.974611]  RSP <ffff881f88457968>
[  608.979419] CR2: ffff881f6732b000

Dump is at: /scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-04-21-01:49:25/

Comment by John Salinas (Inactive) [ 21/Apr/17 ]

second hit with patchset 4

 
[  152.871109] Lustre: Lustre: Build Version: 2.9.0_1_g53542ca_dirty
[  152.997124] LNet: Added LNI 192.168.1.3@o2ib [8/256/0/180]
[  153.162730] ZFS: Unloaded module v0.7.0-rc3_36_g401aa88
[  153.173764] SPL: Unloaded module v0.7.0-rc3
[  153.190539] SPL: Loaded module v0.7.0-rc3
[  155.229474] ZFS: Loaded module v0.7.0-rc3_36_g401aa88, ZFS pool version 5000, ZFS filesystem version 5
[  167.916347] NOTICE: metaslab_init: vd-0, ms-0 forcing segregate bias via SMO 92
[  167.924719] NOTICE: metaslab_init: vd-0, ms-0 creating custom group
[  168.039258] SPL: using hostid 0x61303830
[  168.357919] NOTICE: metaslab_init: vd-0, ms-0 forcing segregate bias via SMO 92
[  168.366282] NOTICE: metaslab_init: vd-0, ms-0 creating custom group
[  170.663169] LustreError: 11-0: lsdraid-MDT0000-lwp-OST0000: operation mds_connect to node 192.168.1.5@o2ib failed: rc = -114
[  170.680720] Lustre: lsdraid-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[  174.877306] Lustre: lsdraid-OST0000: Will be in recovery for at least 2:30, or until 5 clients reconnect
[  174.888049] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[  207.371303] Lustre: lsdraid-OST0000: Connection restored to 5e8c8fa0-be14-2b41-67e0-3466f31ea8e7 (at 192.168.1.7@o2ib)
[  209.652713] Lustre: lsdraid-OST0000: Connection restored to 4dcd08e3-cafc-4ca1-7235-43953f33f442 (at 192.168.1.8@o2ib)
[  223.525746] Lustre: lsdraid-OST0000: Connection restored to 296dffc9-ea18-7945-3471-93b3383a80df (at 192.168.1.9@o2ib)
[  223.537806] Lustre: Skipped 1 previous similar message
[  223.655093] Lustre: lsdraid-OST0000: Recovery over after 0:49, of 5 clients 5 recovered and 0 were evicted.
[  223.658928] Lustre: lsdraid-OST0000: deleting orphan objects from 0x0:8 to 0x0:33
[  223.672318] Lustre: *** cfs_fail_loc=718, val=0***
[  224.197809] Lustre: *** cfs_fail_loc=718, val=0***
[  224.203212] Lustre: Skipped 14 previous similar messages
[  225.236523] Lustre: *** cfs_fail_loc=718, val=0***
[  225.241917] Lustre: Skipped 58 previous similar messages
[  227.238561] Lustre: *** cfs_fail_loc=718, val=0***
[  227.244349] Lustre: Skipped 99 previous similar messages
[  237.043823] Lustre: *** cfs_fail_loc=718, val=0***
[  237.049245] Lustre: Skipped 130 previous similar messages
[  237.504376] BUG: unable to handle kernel paging request at ffff881f8ac40000
[  237.512234] IP: [<ffffffff811befb5>] new_slab+0x1d5/0x300
[  237.518316] PGD 1f36067 PUD 1fbf0f8063 PMD 1fdba24063 PTE 8000001f8ac40161
[  237.526095] Oops: 0003 [#1] SMP 
[  237.529741] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses dm_service_time enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel mpt3sas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd raid_class iTCO_wdt iTCO_vendor_support lpc_ich scsi_transport_sas ipmi_ssif
[  237.614813]  ipmi_devintf mei_me mfd_core ipmi_si pcspkr mei sg sb_edac ipmi_msghandler shpchp edac_core i2c_i801 ioatdma wmi acpi_pad acpi_power_meter dm_multipath dm_mod nfsd nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ib_sa ip6_udp_tunnel ib_mad udp_tunnel ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper igb ttm crct10dif_pclmul crct10dif_common ahci ptp crc32c_intel libahci pps_core drm mlx4_core dca libata i2c_algo_bit i2c_core [last unloaded: zunicode]
[  237.675607] CPU: 64 PID: 6818 Comm: ll_ost_io01_007 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[  237.688770] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[  237.701368] task: ffff880fe93b4500 ti: ffff880fa81b4000 task.ti: ffff880fa81b4000
[  237.710675] RIP: 0010:[<ffffffff811befb5>]  [<ffffffff811befb5>] new_slab+0x1d5/0x300
[  237.720400] RSP: 0018:ffff880fa81b7698  EFLAGS: 00010046
[  237.727286] RAX: 0000000000000000 RBX: ffff88103e807300 RCX: 0000000000000003
[  237.736222] RDX: 0000000000000008 RSI: ffff881f8ac40000 RDI: ffff88103e807300
[  237.745160] RBP: ffff880fa81b76c0 R08: 00000000ffffffe8 R09: 0000000000000000
[  237.754098] R10: ffffffffffffffe0 R11: ffffffffffffff83 R12: ffff881f8ac40000
[  237.763022] R13: ffffea007e2b1000 R14: ffff881f8ac40000 R15: ffff881f8ac40000
[  237.771946] FS:  0000000000000000(0000) GS:ffff88203eb00000(0000) knlGS:0000000000000000
[  237.781942] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  237.789312] CR2: ffff881f8ac40000 CR3: 000000000194a000 CR4: 00000000001407e0
[  237.798242] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  237.807091] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  237.815920] Stack:
[  237.819015]  0000000000000002 0000000000000000 ffff88203eb19660 ffff88103e807300
[  237.828211]  ffff88103e8001c0 ffff880fa81b7798 ffffffff8163339c ffff880fe93b4500
[  237.837410]  ffff88203eb19660 0000000000002000 ffff880fe93b4500 0000000000000000
[  237.846603] Call Trace:
[  237.850197]  [<ffffffff8163339c>] __slab_alloc+0x315/0x48f
[  237.857226]  [<ffffffffa0505329>] ? spl_kmem_cache_alloc+0x99/0x150 [spl]
[  237.865647]  [<ffffffff811c1693>] kmem_cache_alloc+0x193/0x1d0
[  237.873000]  [<ffffffffa0505329>] ? spl_kmem_cache_alloc+0x99/0x150 [spl]
[  237.881419]  [<ffffffffa0505329>] spl_kmem_cache_alloc+0x99/0x150 [spl]
[  237.889702]  [<ffffffffa11e9627>] zio_data_buf_alloc+0x57/0x60 [zfs]
[  237.897659]  [<ffffffffa110f4ac>] abd_alloc_linear+0xac/0xf0 [zfs]
[  237.905430]  [<ffffffffa110f7fc>] abd_alloc+0x30c/0x470 [zfs]
[  237.912690]  [<ffffffff811c1535>] ? kmem_cache_alloc+0x35/0x1d0
[  237.920141]  [<ffffffffa1113dbe>] arc_hdr_alloc_pabd+0x6e/0x110 [zfs]
[  237.928192]  [<ffffffffa1113f1c>] arc_hdr_alloc+0xbc/0x120 [zfs]
[  237.935770]  [<ffffffffa11189bd>] arc_alloc_buf+0x3d/0xc0 [zfs]
[  237.943230]  [<ffffffffa1118a63>] arc_loan_buf+0x23/0x30 [zfs]
[  237.950621]  [<ffffffffa112813a>] dmu_request_arcbuf+0x1a/0x20 [zfs]
[  237.958611]  [<ffffffffa0bc8931>] osd_bufs_get+0x591/0xba0 [osd_zfs]
[  237.966488]  [<ffffffffa159ae3d>] ofd_preprw_write.isra.29+0x1bd/0xcd0 [ofd]
[  237.975129]  [<ffffffffa159c13a>] ofd_preprw+0x7ea/0x10c0 [ofd]
[  237.982530]  [<ffffffffa0ef1dba>] tgt_brw_write+0xc1a/0x17c0 [ptlrpc]
[  237.990434]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[  237.997354]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[  238.004154]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[  238.011910]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[  238.018508]  [<ffffffffa0eeda95>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[  238.026838]  [<ffffffffa0e9a1ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[  238.036063]  [<ffffffffa0a10128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[  238.044264]  [<ffffffffa0e97d68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[  238.052434]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[  238.060000]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[  238.067086]  [<ffffffffa0e9e260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[  238.074646]  [<ffffffffa0e9d7c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[  238.083438]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[  238.089406]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  238.097228]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[  238.103772]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  238.111576] Code: 7f 00 00 0f af 43 18 48 98 4c 01 f0 49 39 c6 73 40 4d 89 f7 eb 06 4d 89 e7 49 89 cc 4c 89 fe 48 89 df e8 5f fc ff ff 48 63 43 20 <4d> 89 24 07 41 0f b7 55 1a 8b 43 18 81 e2 ff 7f 00 00 48 63 c8 
[  238.134485] RIP  [<ffffffff811befb5>] new_slab+0x1d5/0x300
[  238.141178]  RSP <ffff880fa81b7698>
[  238.145611] CR2: ffff881f8ac40000

Dump is in /scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-04-21-01:55:40/

SPL
module/spl/spl-kmem-cache.c:1439:
1434 * Allocate an object from the per-cpu magazine, or if the magazine
1435 * is empty directly allocate from a slab and repopulate the magazine.
1436 */
1437 void *
1438 spl_kmem_cache_alloc(spl_kmem_cache_t *skc, int flags)
1439 {
1440 spl_kmem_magazine_t *skm;
1441 void *obj = NULL;
1442
1443 ASSERT0(flags & ~KM_PUBLIC_MASK);

ZFS
module/zfs/zio.c:287:
279 /*
280 * Use zio_data_buf_alloc to allocate data. The data will not appear in a
281 * crashdump if the kernel panics. This exists so that we will limit the amount
282 * of ZFS data that shows up in a kernel crashdump. (Thus reducing the amount
283 * of kernel heap dumped to disk when the kernel panics)
284 */
285 void *
286 zio_data_buf_alloc(size_t size)
287 {
288 size_t c = (size - 1) >> SPA_MINBLOCKSHIFT;
289
290 VERIFY3U(c, <, SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT);
291
292 return (kmem_cache_alloc(zio_data_buf_cache[c], KM_PUSHPAGE));
293 }
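
As a worked example (assuming the standard ZFS constants SPA_MINBLOCKSHIFT = 9 and SPA_MAXBLOCKSIZE = 16M with large blocks enabled): a 16M zio data buffer gives c = (16777216 - 1) >> 9 = 32767, the last valid index, since SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT = 32768. Each index selects one of the zio_data_buf_cache kmem caches, so large-block writes are served by multi-megabyte SPL slab caches; that is the spl_kmem_cache_alloc() -> new_slab() path that faults in the backtrace above.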

abd_alloc_linear+0xac/0xf0
abd_alloc+0x30c/0x470
arc_hdr_alloc_pabd+0x6e/0x110
arc_hdr_alloc+0xbc/0x120

arc_loan_buf+0x23/0x30 -> module/zfs/arc.c:2441:
2433 /*
2434 * Loan out an anonymous arc buffer. Loaned buffers are not counted as in
2435 * flight data by arc_tempreserve_space() until they are "returned". Loaned
2436 * buffers must be returned to the arc before they can be used by the DMU or
2437 * freed.
2438 */
2439 arc_buf_t *
2440 arc_loan_buf(spa_t *spa, boolean_t is_metadata, int size)
2441 {
2442 	arc_buf_t *buf = arc_alloc_buf(spa, arc_onloan_tag,
2443 	    is_metadata ? ARC_BUFC_METADATA : ARC_BUFC_DATA, size);
2444 
2445 	atomic_add_64(&arc_loaned_bytes, size);
2446 	return (buf);
2447 }


2448
2449 arc_buf_t *
2450 arc_loan_compressed_buf(spa_t *spa, uint64_t psize, uint64_t lsize,
2451 enum zio_compress compression_type)
2452 {
2453 	arc_buf_t *buf = arc_alloc_compressed_buf(spa, arc_onloan_tag,
2454 	    psize, lsize, compression_type);
2455 
2456 	atomic_add_64(&arc_loaned_bytes, psize);
2457 	return (buf);
2458 }

dmu_request_arcbuf+0x1a/0x20 -> module/zfs/dmu.c:1379:
1374 /*
1375 * Allocate a loaned anonymous arc buffer.
1376 */
1377 arc_buf_t *
1378 dmu_request_arcbuf(dmu_buf_t *handle, int size)
1379 {
1380 	dmu_buf_impl_t *db = (dmu_buf_impl_t *)handle;
1381 
1382 	return (arc_loan_buf(db->db_objset->os_spa, B_FALSE, size));
1383 }

osd
lustre/osd-zfs/osd_io.c:498:
493 }
494
495 static int osd_bufs_get(const struct lu_env *env, struct dt_object *dt,
496 loff_t offset, ssize_t len, struct niobuf_local *lnb,
497 int rw)
498 {
499 struct osd_object *obj = osd_dt_obj(dt);
500 int rc;
501
502 LASSERT(dt_object_exists(dt));
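
Reading the excerpts against the backtrace bottom-up: tgt_brw_write() -> ofd_preprw_write() -> osd_bufs_get() asks ZFS to loan an ARC buffer for the incoming bulk write, the allocation falls through zio_data_buf_alloc() into spl_kmem_cache_alloc(), and the kernel appears to fault in new_slab() while initializing a freshly mapped slab page (note CR2 == R12 == ffff881f8ac40000 in the register dump).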

Comment by John Salinas (Inactive) [ 21/Apr/17 ]

dump 3 from patchset 4

[  156.320964] LustreError: 11-0: lsdraid-MDT0000-lwp-OST0000: operation mds_connect to node 192.168.1.5@o2ib failed: rc = -114
[  156.338809] Lustre: lsdraid-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
[  159.331568] Lustre: lsdraid-OST0000: Will be in recovery for at least 2:30, or until 5 clients reconnect
[  159.342332] Lustre: lsdraid-OST0000: Connection restored to 5e8c8fa0-be14-2b41-67e0-3466f31ea8e7 (at 192.168.1.7@o2ib)
[  181.282127] LustreError: 11-0: lsdraid-MDT0000-lwp-OST0000: operation mds_connect to node 192.168.1.5@o2ib failed: rc = -114
[  200.687163] Lustre: lsdraid-OST0000: Connection restored to 296dffc9-ea18-7945-3471-93b3383a80df (at 192.168.1.9@o2ib)
[  202.510262] Lustre: lsdraid-OST0000: Connection restored to lsdraid-MDT0000-mdtlov_UUID (at 192.168.1.5@o2ib)
[  202.521437] Lustre: Skipped 2 previous similar messages
[  202.644651] Lustre: lsdraid-OST0000: Recovery over after 0:43, of 5 clients 5 recovered and 0 were evicted.
[  202.662213] Lustre: *** cfs_fail_loc=718, val=0***
[  202.664553] Lustre: lsdraid-OST0000: deleting orphan objects from 0x0:42 to 0x0:65
[  202.676128] Lustre: Skipped 1 previous similar message
[  203.204051] Lustre: *** cfs_fail_loc=718, val=0***
[  203.209468] Lustre: Skipped 15 previous similar messages
[  204.205239] Lustre: *** cfs_fail_loc=718, val=0***
[  204.210635] Lustre: Skipped 54 previous similar messages
[  219.576985] Lustre: *** cfs_fail_loc=718, val=0***
[  219.582413] Lustre: Skipped 25 previous similar messages
[  219.937469] BUG: unable to handle kernel paging request at ffff881ffc24e320
[  219.945327] IP: [<ffffffff810594e2>] native_set_pte+0x12/0x30
[  219.951798] PGD 1f36067 PUD 2021953063 PMD 1fca2d6063 PTE 8000001ffc24e161
[  219.959587] Oops: 0003 [#1] SMP 
[  219.963242] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate lustre(OE) lmv(OE) mdc(OE) lov(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_service_time ses enclosure intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd mpt3sas ipmi_devintf raid_class sb_edac ipmi_ssif iTCO_wdt iTCO_vendor_support sg
[  220.048604]  scsi_transport_sas lpc_ich mfd_core pcspkr edac_core mei_me ipmi_si mei shpchp ipmi_msghandler acpi_power_meter ioatdma acpi_pad i2c_i801 wmi nfsd dm_multipath dm_mod auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ip6_udp_tunnel udp_tunnel ib_sa ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper crct10dif_pclmul ttm igb crct10dif_common ahci ptp crc32c_intel libahci pps_core drm mlx4_core libata dca i2c_algo_bit i2c_core [last unloaded: zunicode]
[  220.110799] CPU: 54 PID: 8651 Comm: ll_ost_io01_004 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64 #1
[  220.124226] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[  220.137091] task: ffff881fe5fcb980 ti: ffff881fe0714000 task.ti: ffff881fe0714000
[  220.146672] RIP: 0010:[<ffffffff810594e2>]  [<ffffffff810594e2>] native_set_pte+0x12/0x30
[  220.157064] RSP: 0018:ffff881fe0717968  EFLAGS: 00010212
[  220.164225] RAX: 3036353433323032 RBX: ffff881ffc24e320 RCX: 0000353433323000
[  220.173434] RDX: 3036353433323032 RSI: 3036353433323032 RDI: ffff881ffc24e320
[  220.182649] RBP: ffff881fe0717978 R08: 3036353433323130 R09: 0000000353433323
[  220.191867] R10: ffffea00809c1a00 R11: ffffffff812f2a89 R12: 3036353433323032
[  220.201085] R13: ffff881fb8a64000 R14: 0000000000000020 R15: ffff881fe0717aa0
[  220.210280] FS:  0000000000000000(0000) GS:ffff88203e880000(0000) knlGS:0000000000000000
[  220.220567] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  220.228231] CR2: ffff881ffc24e320 CR3: 000000000194a000 CR4: 00000000001407e0
[  220.237226] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  220.246227] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  220.255218] Stack:
[  220.258445]  ffff881fe0717aa0 0000000000000002 ffff881fe0717a68 ffffffff81061f2b
[  220.267757]  0000000000000002 0000000000000000 0000000000000246 0000000001fb8a65
[  220.277071]  0000002000000001 0000000001fb8a64 ffff882038a64000 ffff881fe07179c0
[  220.286371] Call Trace:
[  220.290081]  [<ffffffff81061f2b>] __change_page_attr_set_clr+0xa3b/0xb40
[  220.298530]  [<ffffffff811a4e57>] ? vm_unmap_aliases+0x187/0x1b0
[  220.306183]  [<ffffffff8106218c>] change_page_attr_set_clr+0x15c/0x4d0
[  220.314412]  [<ffffffff8106298f>] set_memory_rw+0x2f/0x40
[  220.321426]  [<ffffffffa0e0e20b>] set_niobuf_ro+0x9b/0xe0 [ptlrpc]
[  220.329289]  [<ffffffffa0de7256>] tgt_brw_write+0x10b6/0x17c0 [ptlrpc]
[  220.337498]  [<ffffffff81632d15>] ? __slab_free+0x10e/0x277
[  220.344639]  [<ffffffff810c15cc>] ? update_curr+0xcc/0x150
[  220.351676]  [<ffffffff810be46e>] ? account_entity_dequeue+0xae/0xd0
[  220.359688]  [<ffffffff81639d72>] ? mutex_lock+0x12/0x2f
[  220.366557]  [<ffffffffa0de2a95>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[  220.375177]  [<ffffffffa0d8f1ab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[  220.384663]  [<ffffffffa0a4c128>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[  220.393188]  [<ffffffffa0d8cd68>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[  220.401686]  [<ffffffff810b8952>] ? default_wake_function+0x12/0x20
[  220.409595]  [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90
[  220.417034]  [<ffffffffa0d93260>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[  220.424942]  [<ffffffffa0d927c0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[  220.434078]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[  220.440375]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  220.448504]  [<ffffffff81646a98>] ret_from_fork+0x58/0x90
[  220.455344]  [<ffffffff810a5ac0>] ? kthread_create_on_node+0x140/0x140
[  220.463428] Code: 5c 5d c3 66 0f 1f 44 00 00 48 89 d7 e8 c8 d7 00 00 4c 89 23 5b 41 5c 5d c3 55 48 89 e5 41 54 49 89 f4 53 48 89 fb 0f 1f 44 00 00 <4c> 89 23 5b 41 5c 5d c3 66 0f 1f 44 00 00 e8 9b d7 00 00 4c 89 
[  220.486860] RIP  [<ffffffff810594e2>] native_set_pte+0x12/0x30
[  220.494168]  RSP <ffff881fe0717968>
[  220.498816] CR2: ffff881ffc24e320

/scratch/dumps/wolf-3.wolf.hpdd.intel.com/10.8.1.3-2017-04-21-02:01:36/

Comment by Jinshan Xiong (Inactive) [ 21/Apr/17 ]

It turned out that the block size of the file in question is only 128KB (b_psize/b_lsize in the header below are in 512-byte units, so 256 = 128KB):

struct arc_buf_hdr_t {
  b_dva = {
    dva_word = {0, 0}
  }, 
  b_birth = 0, 
  b_type = ARC_BUFC_DATA, 
  b_hash_next = 0x0, 
  b_flags = (ARC_FLAG_HAS_L1HDR | ARC_FLAG_COMPRESSED_ARC | ARC_FLAG_COMPRESS_1), 
  b_psize = 256, 
  b_lsize = 256, 

The page table was overwritten by another thread


crash> rd -64 0xffff881ffc24e000 64
ffff881ffc24e000: 3534333231303635 3635343332313036 5601234560123456
ffff881ffc24e010: 3036353433323130 3130363534333231 0123456012345601
ffff881ffc24e020: 3231303635343332 3332313036353433 2345601234560123
ffff881ffc24e030: 3433323130363534 3534333231303635 4560123456012345
ffff881ffc24e040: 3635343332313036 3036353433323130 6012345601234560
ffff881ffc24e050: 3130363534333231 3231303635343332 1234560123456012
ffff881ffc24e060: 3332313036353433 3433323130363534 3456012345601234


even before this thread started the RDMA transfer.

I also checked the local_nb and bulk descriptor and they all look good.

Comment by Gerrit Updater [ 21/Apr/17 ]

Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/26784
Subject: LU-9305 target: debug patch for memory problem
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8be516a5a091c108a2e7b8e57cff5cc7933d2381

Comment by Jinshan Xiong (Inactive) [ 22/Apr/17 ]

After the debug kernel was loaded, the kernel crashed differently:

wolf-4 login: [ 8062.662592] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 8062.671380] IP: [<ffffffffa057d80b>] avl_remove+0x6b/0x260 [zavl]
[ 8062.678204] PGD 0 
[ 8062.680469] Oops: 0000 [#1] SMP 
[ 8062.684093] Modules linked in: lustre(OE) lmv(OE) mdc(OE) lov(OE) osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) zfs(OE) zunicode(OE) zavl(OE) icp(OE) zcommon(OE) znvpair(OE) spl(OE) zlib_deflate rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel iTCO_wdt lrw gf128mul glue_helper ipmi_devintf ablk_helper cryptd iTCO_vendor_support pcspkr dm_service_time sb_edac edac_core ses enclosure lpc_ich ipmi_ssif mei_me mei mfd_core i2c_i801 sg ioatdma ipmi_si shpchp ipmi_msghandler wmi acpi_power_meter acpi_pad dm_multipath dm_mod nfsd binfmt_misc nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en mlx4_ib vxlan ip6_udp_tunnel ib_sa udp_tunnel ib_mad ib_core ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea crct10dif_pclmul sysfillrect sysimgblt crct10dif_common drm_kms_helper crc32c_intel ttm igb ahci ptp drm libahci mpt3sas pps_core mlx4_core libata raid_class dca i2c_algo_bit scsi_transport_sas i2c_core [last unloaded: zunicode]
[ 8062.819766] CPU: 4 PID: 29605 Comm: ll_ost_io00_007 Tainted: G          IOE  ------------   3.10.0-327.36.3.el7.x86_64.debug #1
[ 8062.832580] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 8062.844233] task: ffff881d8459cf40 ti: ffff8817d25f8000 task.ti: ffff8817d25f8000
[ 8062.852589] RIP: 0010:[<ffffffffa057d80b>]  [<ffffffffa057d80b>] avl_remove+0x6b/0x260 [zavl]
[ 8062.862127] RSP: 0018:ffff8817d25fb760  EFLAGS: 00010202
[ 8062.868052] RAX: 0000000000000000 RBX: ffff880e7b8409e0 RCX: ffff880e7b840c20
[ 8062.876019] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[ 8062.883978] RBP: ffff8817d25fb798 R08: 0000000000000000 R09: 0000000000000001
[ 8062.891939] R10: 0000000000000000 R11: 0000000000000400 R12: ffff880d1e8ea440
[ 8062.899907] R13: 0000000000000000 R14: ffff880e7b840a38 R15: 0000000000000001
[ 8062.907874] FS:  0000000000000000(0000) GS:ffff881037600000(0000) knlGS:0000000000000000
[ 8062.916906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8062.923316] CR2: 0000000000000008 CR3: 0000000001a42000 CR4: 00000000001407e0
[ 8062.931277] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8062.939243] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 8062.947203] Stack:
[ 8062.949446]  ffff880e7b840ae0 00000000391623c4 ffff880e7b8409e0 00000000391623c4
[ 8062.957747]  ffff880e7b8409e0 ffff880d1e8e9f40 0000000000000000 ffff8817d25fb7e0
[ 8062.966049]  ffffffffa0a3ca2e 0000000000000246 ffff8819536e6048 ffff880e7b8409e0
[ 8062.974366] Call Trace:
[ 8062.977146]  [<ffffffffa0a3ca2e>] dbuf_destroy+0x16e/0x530 [zfs]
[ 8062.983880]  [<ffffffffa0a3c440>] dbuf_rele_and_unlock+0x3c0/0x650 [zfs]
[ 8062.991372]  [<ffffffff8110fdfd>] ? trace_hardirqs_on+0xd/0x10
[ 8062.997912]  [<ffffffffa0a3c86c>] ? dbuf_rele+0x2c/0x80 [zfs]
[ 8063.004355]  [<ffffffffa0a3c86c>] ? dbuf_rele+0x2c/0x80 [zfs]
[ 8063.010784]  [<ffffffffa0560dca>] ? spl_kmem_free+0x2a/0x40 [spl]
[ 8063.017603]  [<ffffffffa0a3e0a9>] ? dbuf_hold_impl+0xb9/0xd0 [zfs]
[ 8063.024530]  [<ffffffffa0a3c880>] dbuf_rele+0x40/0x80 [zfs]
[ 8063.030786]  [<ffffffffa0a494fb>] dmu_assign_arcbuf+0x13b/0x260 [zfs]
[ 8063.037997]  [<ffffffffa140057a>] osd_write_commit+0x3ca/0x9d0 [osd_zfs]
[ 8063.045502]  [<ffffffffa17b9a54>] ofd_commitrw_write+0x1014/0x1c20 [ofd]
[ 8063.052994]  [<ffffffffa17bd20d>] ofd_commitrw+0x51d/0xa50 [ofd]
[ 8063.059795]  [<ffffffffa152260a>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[ 8063.066866]  [<ffffffffa14fa446>] tgt_brw_write+0xec6/0x17d0 [ptlrpc]
[ 8063.074115]  [<ffffffffa14f0e44>] ? req_can_reconstruct+0x44/0x110 [ptlrpc]
[ 8063.081894]  [<ffffffff8110fdfd>] ? trace_hardirqs_on+0xd/0x10
[ 8063.088457]  [<ffffffffa144e980>] ? target_send_reply_msg+0x180/0x180 [ptlrpc]
[ 8063.096585]  [<ffffffffa14f5e1b>] tgt_request_handle+0x93b/0x1350 [ptlrpc]
[ 8063.104320]  [<ffffffffa14a1739>] ptlrpc_server_handle_request+0x239/0xad0 [ptlrpc]
[ 8063.112892]  [<ffffffffa11537b8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 8063.120527]  [<ffffffffa149f248>] ? ptlrpc_wait_event+0xb8/0x370 [ptlrpc]
[ 8063.128164]  [<ffffffffa14a5980>] ptlrpc_main+0xac0/0x1e50 [ptlrpc]
[ 8063.135174]  [<ffffffff8170ac3c>] ? _raw_spin_unlock_irq+0x2c/0x50
[ 8063.142133]  [<ffffffffa14a4ec0>] ? ptlrpc_register_service+0xe70/0xe70 [ptlrpc]
[ 8063.150402]  [<ffffffff810bfdcd>] kthread+0xed/0x100
[ 8063.155950]  [<ffffffff810bfce0>] ? insert_kthread_work+0x80/0x80
[ 8063.162764]  [<ffffffff81715218>] ret_from_fork+0x58/0x90
[ 8063.168796]  [<ffffffff810bfce0>] ? insert_kthread_work+0x80/0x80

and this:

wolf-3 login: [44768.868775] Kernel panic - not syncing: avl_find() succeeded inside avl_add()
[44768.876759] CPU: 19 PID: 13199 Comm: ll_ost_io01_014 Tainted: G    B     IOE  ------------   3.10.0-327.36.3.el7.x86_64.debug #1
[44768.889675] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[44768.901332]  ffffffffa05991b0 000000006acc281b ffff880fd2b075a0 ffffffff817010d6
[44768.909707]  ffff880fd2b07620 ffffffff816f9c9b 0000000000000008 ffff880fd2b07630
[44768.918075]  ffff880fd2b075d0 000000006acc281b 0000000000000000 ffff881a88d75b58
[44768.926427] Call Trace:
[44768.929182]  [<ffffffff817010d6>] dump_stack+0x19/0x1b
[44768.934936]  [<ffffffff816f9c9b>] panic+0xdd/0x1f1
[44768.940310]  [<ffffffffa0598790>] avl_add+0x70/0x80 [zavl]
[44768.946495]  [<ffffffffa0cd67d7>] dbuf_create+0x377/0x7d0 [zfs]
[44768.953121]  [<ffffffff8170ab37>] ? _raw_spin_unlock+0x27/0x40
[44768.959679]  [<ffffffffa0cd4800>] ? dbuf_find+0x100/0x320 [zfs]
[44768.966335]  [<ffffffffa0cd9b6c>] __dbuf_hold_impl+0x61c/0xaa0 [zfs]
[44768.973473]  [<ffffffffa0cda099>] dbuf_hold_impl+0xa9/0xd0 [zfs]
[44768.980241]  [<ffffffffa0cda0f5>] dbuf_hold_level+0x35/0x60 [zfs]
[44768.987091]  [<ffffffffa0cda136>] dbuf_hold+0x16/0x20 [zfs]
[44768.993358]  [<ffffffffa0ce4226>] dmu_buf_hold_array_by_dnode+0x116/0xa00 [zfs]
[44769.001564]  [<ffffffffa0ce517c>] dmu_buf_hold_array.constprop.12+0x6c/0xb0 [zfs]
[44769.010048]  [<ffffffffa0ce52f1>] dmu_write.part.9+0x51/0x100 [zfs]
[44769.017097]  [<ffffffffa0ce5576>] dmu_assign_arcbuf+0x1b6/0x260 [zfs]
[44769.024314]  [<ffffffffa0c7457a>] osd_write_commit+0x3ca/0x9d0 [osd_zfs]
[44769.031823]  [<ffffffffa182aa54>] ofd_commitrw_write+0x1014/0x1c20 [ofd]
[44769.039327]  [<ffffffffa182e20d>] ofd_commitrw+0x51d/0xa50 [ofd]
[44769.046124]  [<ffffffffa163460a>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[44769.053212]  [<ffffffffa160c446>] tgt_brw_write+0xec6/0x17d0 [ptlrpc]
[44769.060485]  [<ffffffffa1602e44>] ? req_can_reconstruct+0x44/0x110 [ptlrpc]
[44769.068278]  [<ffffffff8110fdfd>] ? trace_hardirqs_on+0xd/0x10
[44769.074868]  [<ffffffffa1560980>] ? target_send_reply_msg+0x180/0x180 [ptlrpc]
[44769.083019]  [<ffffffffa1607e1b>] tgt_request_handle+0x93b/0x1350 [ptlrpc]
[44769.090757]  [<ffffffffa15b3739>] ptlrpc_server_handle_request+0x239/0xad0 [ptlrpc]
[44769.099347]  [<ffffffffa0b447b8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[44769.107004]  [<ffffffffa15b1248>] ? ptlrpc_wait_event+0xb8/0x370 [ptlrpc]
[44769.114654]  [<ffffffffa15b7980>] ptlrpc_main+0xac0/0x1e50 [ptlrpc]
[44769.121730]  [<ffffffffa15b6ec0>] ? ptlrpc_register_service+0xe70/0xe70 [ptlrpc]
[44769.130008]  [<ffffffff810bfdcd>] kthread+0xed/0x100
[44769.135574]  [<ffffffff810bfce0>] ? insert_kthread_work+0x80/0x80
[44769.142400]  [<ffffffff81715218>] ret_from_fork+0x58/0x90
[44769.148440]  [<ffffffff810bfce0>] ? insert_kthread_work+0x80/0x80

I suspect there is something wrong with the AVL tree handling in ZFS.

Comment by Andreas Dilger [ 27/Apr/17 ]

Attached are some old patches against old versions of Lustre, SPL, and the kernel that were used for debugging similar issues in the past. They originate from https://bugzilla.lustre.org/show_bug.cgi?id=22471 for the backstory, if interested.

Comment by Gerrit Updater [ 01/May/17 ]

Jinshan Xiong (jinshan.xiong@intel.com) uploaded a new patch: https://review.whamcloud.com/26895
Subject: LU-9305 osd-zfs: set minimum block size to (PAGE_SIZE << 2)
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 46010a8b21c9b542094086bc48759fcafcd4b2ff

Comment by Jinshan Xiong (Inactive) [ 01/May/17 ]

With patch https://review.whamcloud.com/26895 or https://github.com/zfsonlinux/zfs/pull/6084 applied, there is no memory corruption problem any more. However, the BAD WRITE CHECKSUM errors persist. I discovered a known problem with the kernel crypto framework, where it can cause out-of-bounds access; I have seen it at least twice, as follows:

[12925.744435] IP: [<ffffffffa0a390c0>] adler32_update+0x70/0x250 [libcfs]
[12925.751854] PGD 1f3e067 PUD 20723c4067 PMD 2072289067 PTE 8000000ba7548060
[12925.759585] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[12925.764684] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) zfs(O) zunicode(O) zavl(O) icp(O) zcommon(O) znvpair(O) spl(O) zlib_deflate sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper dm_service_time ablk_helper iTCO_wdt iTCO_vendor_support ses cryptd ipmi_devintf ipmi_ssif ioatdma enclosure mei_me sb_edac ipmi_si edac_core pcspkr mei sg shpchp lpc_ich mfd_core
[12925.844878]  ipmi_msghandler i2c_i801 wmi acpi_power_meter acpi_pad nfsd nfs_acl lockd dm_multipath dm_mod auth_rpcgss grace sunrpc binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_ib mlx4_en ib_sa ib_mad vxlan ib_core ip6_udp_tunnel udp_tunnel ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ttm igb mlx4_core crct10dif_pclmul drm ahci crct10dif_common ptp crc32c_intel libahci mpt3sas pps_core libata dca raid_class i2c_algo_bit scsi_transport_sas i2c_core [last unloaded: zunicode]
[12925.896243] CPU: 3 PID: 136176 Comm: ll_ost_io00_009 Tainted: G          IOE  ------------   3.10.0-327.36.3_debug #3
[12925.908091] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[12925.919737] task: ffff881d9c86f2c0 ti: ffff881b6ff28000 task.ti: ffff881b6ff28000
[12925.928095] RIP: 0010:[<ffffffffa0a390c0>]  [<ffffffffa0a390c0>] adler32_update+0x70/0x250 [libcfs]
[12925.938229] RSP: 0018:ffff881b6ff2b988  EFLAGS: 00010212
[12925.944164] RAX: 0000000000001000 RBX: 0000000000001000 RCX: ffffea002e9d5200
[12925.952124] RDX: 0000000000001000 RSI: ffff880ba7548000 RDI: ffff880ba7548000
[12925.960092] RBP: ffff881b6ff2b9f0 R08: 0000000000000000 R09: 0000000000001000
[12925.968054] R10: 0000000000265464 R11: 0000000000221140 R12: ffffea002e9d5202
[12925.976021] R13: 0000000000001000 R14: ffffffffa0a58410 R15: 00000000000044e8
[12925.983984] FS:  0000000000000000(0000) GS:ffff88103ecc0000(0000) knlGS:0000000000000000
[12925.993009] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12925.999427] CR2: ffff880ba7548000 CR3: 0000000001952000 CR4: 00000000001407e0
[12926.007397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12926.015365] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[12926.023333] Stack:
[12926.025576]  ffff880ad3900770 0000100000000000 ffff880ba7548000 0000000000000bf6
[12926.033913]  0000100000000000 00000000000442b4 000000000004427c 0000000000044244
[12926.042238]  ffff880ba7548000 ffffea002e9d5202 0000000000001000 ffffffffa0a58410
[12926.050569] Call Trace:
[12926.053305]  [<ffffffff812b58f8>] crypto_shash_update+0x38/0x100
[12926.060017]  [<ffffffff812b5bee>] shash_ahash_update+0x3e/0x70
[12926.066535]  [<ffffffff812b5c32>] shash_async_update+0x12/0x20
[12926.073067]  [<ffffffffa0a38083>] cfs_crypto_hash_update_page+0x93/0xc0 [libcfs]
[12926.081409]  [<ffffffffa140e3da>] tgt_checksum_bulk.isra.33+0x338/0x4c5 [ptlrpc]
[12926.089753]  [<ffffffffa13e71bd>] tgt_brw_write+0x114d/0x1640 [ptlrpc]
[12926.097108]  [<ffffffffa13d2720>] ? tgt_ses_key_init+0x20/0xd0 [ptlrpc]
[12926.104554]  [<ffffffffa13d2720>] ? tgt_ses_key_init+0x20/0xd0 [ptlrpc]
[12926.111945]  [<ffffffff810234fb>] ? save_stack_trace+0x2b/0x50
[12926.118468]  [<ffffffff811c0710>] ? set_track+0x70/0x1d0
[12926.124444]  [<ffffffffa133d560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[12926.132570]  [<ffffffffa13e31c5>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[12926.140299]  [<ffffffffa138f1cb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[12926.148868]  [<ffffffffa0a3e3b8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[12926.156498]  [<ffffffffa138cd88>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[12926.164088]  [<ffffffff810b8d02>] ? default_wake_function+0x12/0x20
[12926.171091]  [<ffffffff810af468>] ? __wake_up_common+0x58/0x90
[12926.177662]  [<ffffffffa1393280>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[12926.184707]  [<ffffffffa13927e0>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[12926.192975]  [<ffffffff810a5f3f>] kthread+0xcf/0xe0
[12926.198427]  [<ffffffff810a5f3f>] ? kthread+0xcf/0xe0
[12926.204077]  [<ffffffff810a5e70>] ? kthread_create_on_node+0x140/0x140
[12926.211372]  [<ffffffff8164b658>] ret_from_fork+0x58/0x90
[12926.217408]  [<ffffffff810a5e70>] ? kthread_create_on_node+0x140/0x140

I turned on page allocation debugging, so the kernel crashed when this occurred; otherwise it would just include the wrong data in the checksum computation. Another occurrence:

[  959.530510] IP: [<ffffffff8106ad08>] _begin+0x28/0x187
[  959.536263] PGD 1f3e067 PUD 20709b7067 PMD 207092c067 PTE 8000000ed1465060
[  959.543972] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[  959.549053] Modules linked in: libcfs(OE+) zfs(O) zunicode(O) zavl(O) icp(O) zcommon(O) znvpair(O) spl(O) zlib_deflate sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_powerclamp coretemp intel_rapl kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel nfsd lrw ipmi_devintf iTCO_wdt iTCO_vendor_support gf128mul glue_helper nfs_acl ses ablk_helper cryptd dm_service_time sb_edac lockd ipmi_ssif edac_core enclosure ioatdma ipmi_si sg pcspkr shpchp ipmi_msghandler auth_rpcgss mei_me mei wmi lpc_ich acpi_power_meter i2c_i801 mfd_core grace sunrpc acpi_pad dm_multipath dm_mod
[  959.628384]  binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_ib mlx4_en ib_sa ib_mad vxlan ib_core ip6_udp_tunnel udp_tunnel ib_addr sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt drm_kms_helper ttm igb mlx4_core ahci drm crct10dif_pclmul libahci crct10dif_common crc32c_intel ptp mpt3sas libata pps_core raid_class dca scsi_transport_sas i2c_algo_bit i2c_core [last unloaded: zunicode]
[  959.667972] CPU: 3 PID: 19005 Comm: modprobe Tainted: G          IOE  ------------   3.10.0-327.36.3_debug #3
[  959.679030] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[  959.690670] task: ffff880f641f6600 ti: ffff880fec7ec000 task.ti: ffff880fec7ec000
[  959.699013] RIP: 0010:[<ffffffff8106ad08>]  [<ffffffff8106ad08>] _begin+0x28/0x187
[  959.707466] RSP: 0018:ffff880fec7ef558  EFLAGS: 00010202
[  959.713388] RAX: 0000000063b91701 RBX: 00000000fcd8a08b RCX: 00000000f362822f
[  959.721346] RDX: 00000000baa5a8d4 RSI: 00000000b1a46a3b RDI: 00000000280dab4e
[  959.729303] RBP: 00000000080d2940 R08: ffffffff8165c280 R09: ffff880ea32c1ee0
[  959.737261] R10: ffff880ed1464f80 R11: ffff880ed1465040 R12: 000000006c5045fe
[  959.745218] R13: ffff880ed1464fc0 R14: ffff880fec7ef558 R15: ffff880fec7ef7f8
[  959.753176] FS:  00007fe14014f740(0000) GS:ffff88103ecc0000(0000) knlGS:0000000000000000
[  959.762199] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  959.768605] CR2: ffff880ed1465000 CR3: 0000000e91549000 CR4: 00000000001407e0
[  959.776564] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  959.784522] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  959.792479] Stack:
[  959.794718]  0830274608302746 0830274608302746 0830274608302746 0830274608302746
[  959.803008]  0830274608302746 0830274608302746 0830274608302746 0830274608302746
[  959.811297]  0830274608302746 0830274608302746 0830274608302746 0830274608302746
[  959.819588] Call Trace:
[  959.822319]  [<ffffffff810e8222>] ? __module_text_address+0x12/0x70
[  959.829319]  [<ffffffff810eec6e>] ? is_module_text_address+0xe/0x20
[  959.836322]  [<ffffffff810234fb>] ? save_stack_trace+0x2b/0x50
[  959.842843]  [<ffffffff811c0710>] ? set_track+0x70/0x1d0
[  959.848778]  [<ffffffff811c15bd>] ? init_object+0x3d/0x70
[  959.854809]  [<ffffffff8106a71d>] ? sha1_apply_transform_avx2+0x1d/0x30
[  959.862187]  [<ffffffff8106a463>] ? __sha1_ssse3_update+0x53/0xd0
[  959.868984]  [<ffffffff8106a788>] ? sha1_ssse3_update+0x58/0xf0
[  959.875592]  [<ffffffff812b58f8>] ? crypto_shash_update+0x38/0x100
[  959.882492]  [<ffffffff812b5bee>] ? shash_ahash_update+0x3e/0x70
[  959.889199]  [<ffffffff812b5c32>] ? shash_async_update+0x12/0x20
[  959.895910]  [<ffffffffa0e74083>] ? cfs_crypto_hash_update_page+0x93/0xc0 [libcfs]
[  959.904367]  [<ffffffffa0e7482a>] ? cfs_crypto_register+0x2ca/0x4e0 [libcfs]
[  959.912237]  [<ffffffffa079d000>] ? 0xffffffffa079cfff
[  959.917981]  [<ffffffffa079d1a9>] ? libcfs_init+0x1a9/0x1000 [libcfs]
[  959.925171]  [<ffffffff810020e8>] ? do_one_initcall+0xb8/0x230
[  959.931685]  [<ffffffff810edc1e>] ? load_module+0x134e/0x1b50
[  959.938095]  [<ffffffff8131af20>] ? ddebug_proc_write+0xf0/0xf0
[  959.944705]  [<ffffffff810e9eb3>] ? copy_module_from_fd.isra.42+0x53/0x150
[  959.952380]  [<ffffffff810ee5d6>] ? SyS_finit_module+0xa6/0xd0
[  959.958885]  [<ffffffff8164b709>] ? system_call_fastpath+0x16/0x1b

This matches the bug description at https://www.spinics.net/lists/linux-crypto/msg22859.html; the corresponding Red Hat bug is https://bugzilla.redhat.com/show_bug.cgi?id=1399754, but unfortunately I don't have permission to view the details.

I tend to think this is a kernel bug, and this piece of code has been changed in the RHEL 7.3 kernels. I also exercised the same code on CentOS 7.3 and didn't see the issue, so I would recommend upgrading our distro to 7.3.
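
For reference, below is a minimal sketch of the per-page checksum walk the backtraces above go through (tgt_checksum_bulk() -> cfs_crypto_hash_update_page() -> crypto_shash_update()). The function name, descriptor handling, and page array are simplified assumptions, not the actual Lustre code; the point is that the hash update reads each bulk page directly, so a page that was freed and unmapped (DEBUG_PAGEALLOC) faults inside the hash routine, while without the debug option it silently hashes whatever now occupies that memory and the client/server checksums disagree.

#include <crypto/hash.h>
#include <linux/highmem.h>

/* Sketch: feed every page of a bulk write through the shash, one page
 * at a time, as cfs_crypto_hash_update_page() does. */
static int checksum_bulk_sketch(struct shash_desc *desc,
				struct page **pages, int npages)
{
	int i, rc = 0;

	for (i = 0; i < npages && rc == 0; i++) {
		u8 *data = kmap(pages[i]);

		/* Reads PAGE_SIZE bytes from the page.  With DEBUG_PAGEALLOC
		 * a freed page is unmapped from the kernel linear mapping,
		 * so this access oopses (adler32_update / crc32c in the
		 * traces above) instead of silently checksumming reused
		 * memory. */
		rc = crypto_shash_update(desc, data, PAGE_SIZE);
		kunmap(pages[i]);
	}
	return rc;
}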

Comment by Gerrit Updater [ 16/May/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/26895/
Subject: LU-9305 osd-zfs: arc_buf could be non-pagesize aligned
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3800d05641158608e5a139336c00a781fa9fb8c7

Comment by Peter Jones [ 16/May/17 ]

Landed for 2.10

Comment by John Salinas (Inactive) [ 18/May/17 ]

We have moved to CentOS 7.3 and rebased our ZFS patches on 0.4.0, which contains Jinshan's ZFS fix for this issue. However, we still see this issue while running with large RPCs and large blocks:
[ 3874.558287] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.6@o2ib inode [0x200000409:0x1cb7:0x0] object 0x0:1972 extent [33554432-40140799]: client csum aa4dd555, server csum 7aa8a961
[ 4251.083446] perf: interrupt took too long (5080 > 4966), lowering kernel.perf_event_max_sample_rate to 39000
[ 5708.667871] perf: interrupt took too long (6430 > 6350), lowering kernel.perf_event_max_sample_rate to 31000
[ 7819.451501] perf: interrupt took too long (8170 > 8037), lowering kernel.perf_event_max_sample_rate to 24000
[ 8255.183667] BUG: Bad page state in process ll_ost_io01_020 pfn:1e50eea
[ 8255.192884] page:ffffea007943ba80 count:-1 mapcount:0 mapping: (null) index:0x0
[ 8255.203581] page flags: 0x6fffff00000000()
[ 8255.209724] page dumped because: nonzero _count
[ 8255.216322] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate dm_service_time ses enclosure intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd mpt3sas raid_class scsi_transport_sas iTCO_wdt ipmi_devintf ipmi_ssif iTCO_vendor_support ipmi_si sg ipmi_msghandler
[ 8255.305720] mei_me sb_edac mei wmi shpchp lpc_ich edac_core ioatdma i2c_i801 acpi_pad acpi_power_meter pcspkr nfsd dm_multipath dm_mod nfs_acl lockd binfmt_misc grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt igb fb_sys_fops crct10dif_pclmul ttm crct10dif_common ptp crc32c_intel ahci pps_core drm mlx4_core libahci dca i2c_algo_bit libata i2c_core devlink fjes
[ 8255.356725] CPU: 23 PID: 9463 Comm: ll_ost_io01_020 Tainted: P IOE ------------ 3.10.0-514.16.1.el7.x86_64 #1
[ 8255.370500] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 8255.383755] ffffea007943ba80 00000000b9135f7a ffff881ed71f7910 ffffffff81686ac3
[ 8255.393696] ffff881ed71f7938 ffffffff81681cf8 ffffea007943ba80 0000000000000000
[ 8255.403640] 000fffff00000000 ffff881ed71f7980 ffffffff811893dd fff00000fe000000
[ 8255.413528] Call Trace:
[ 8255.417780] [<ffffffff81686ac3>] dump_stack+0x19/0x1b
[ 8255.425063] [<ffffffff81681cf8>] bad_page.part.75+0xdf/0xfc
[ 8255.432845] [<ffffffff811893dd>] free_pages_prepare+0x16d/0x190
[ 8255.441047] [<ffffffff81189e54>] free_hot_cold_page+0x74/0x160
[ 8255.449077] [<ffffffff8118a005>] __free_pages+0x25/0x30
[ 8255.456423] [<ffffffffa0d80373>] osd_bufs_put+0x123/0x1f0 [osd_zfs]
[ 8255.464931] [<ffffffffa0e074ea>] ofd_commitrw_write+0xea/0x1c80 [ofd]
[ 8255.473594] [<ffffffffa0e0bc3f>] ofd_commitrw+0x4af/0xa80 [ofd]
[ 8255.481667] [<ffffffffa0b44589>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[ 8255.491039] [<ffffffffa127057c>] obd_commitrw+0x2ed/0x330 [ptlrpc]
[ 8255.499374] [<ffffffffa1244290>] tgt_brw_write+0x1010/0x1700 [ptlrpc]
[ 8255.508017] [<ffffffff810ce474>] ? update_curr+0x104/0x190
[ 8255.515546] [<ffffffff810c9ba8>] ? __enqueue_entity+0x78/0x80
[ 8255.523505] [<ffffffffa11968a0>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[ 8255.532921] [<ffffffffa1240615>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 8255.541949] [<ffffffffa11e8eab>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 8255.551821] [<ffffffffa09f89a8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 8255.560938] [<ffffffffa11e6a58>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 8255.569835] [<ffffffff810c54d2>] ? default_wake_function+0x12/0x20
[ 8255.578105] [<ffffffff810ba628>] ? __wake_up_common+0x58/0x90
[ 8255.585892] [<ffffffffa11eced0>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 8255.594199] [<ffffffffa11ec430>] ? ptlrpc_register_service+0xe60/0xe60 [ptlrpc]
[ 8255.603645] [<ffffffff810b0a4f>] kthread+0xcf/0xe0
[ 8255.610217] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
[ 8255.618602] [<ffffffff816970d8>] ret_from_fork+0x58/0x90
[ 8255.625702] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
[ 8264.352388] LustreError: 168-f: BAD WRITE CHECKSUM: lsdraid-OST0000 from 12345-192.168.1.8@o2ib inode [0x20000040b:0x30e:0x0] object 0x0:7751 extent [83886080-91357183]: client csum a25342f0, server csum ec458710

We have verified that we see this on draid pools, raidz pools, 16MB blocks, and 4MB blocks.

Comment by Jinshan Xiong (Inactive) [ 18/May/17 ]

This is unfortunate, and the symptom looks exactly the same as before. Let me take a look.

Comment by Jinshan Xiong (Inactive) [ 18/May/17 ]

After loading a debug kernel, I'm seeing the same issue in the crypto framework, as follows:

wolf-3 login: [ 2646.125764] BUG: unable to handle kernel paging request at ffff881dbaa90000
[ 2646.135260] IP: [<ffffffffa00ec39b>] full_block+0x22/0x22 [crc32c_intel]
[ 2646.144385] PGD 1fa6067 PUD 2078fca067 PMD 2078df4067 PTE 8000001dbaa90163
[ 2646.153763] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 2646.160489] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate sha512_ssse3 sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd dm_service_time iTCO_wdt ipmi_devintf nfsd sb_edac ioatdma iTCO_vendor_support mei_me edac_core ses mei enclosure sg i2c_i801 pcspkr shpchp lpc_ich ipmi_ssif ipmi_si ipmi_msghandler auth_rpcgss nfs_acl lockd acpi_power_meter grace wmi acpi_pad sunrpc dm_multipath dm_mod binfmt_misc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx4_en mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_pclmul ttm crct10dif_common crc32c_intel igb drm ahci mlx4_core libahci ptp mpt3sas libata pps_core raid_class dca i2c_algo_bit scsi_transport_sas devlink i2c_core fjes [last unloaded: zunicode]
[ 2646.315278] CPU: 18 PID: 5239 Comm: ll_ost_io01_034 Tainted: P          IOE  ------------   3.10.0-514.16.1.el7_debug.x86_64 #0
[ 2646.332691] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 2646.346775] task: ffff880f42a5a1e0 ti: ffff881010120000 task.ti: ffff881010120000
[ 2646.357540] RIP: 0010:[<ffffffffa00ec39b>]  [<ffffffffa00ec39b>] full_block+0x22/0x22 [crc32c_intel]
[ 2646.370219] RSP: 0018:ffff8810101239b8  EFLAGS: 00010246
[ 2646.378558] RAX: 0000000000000080 RBX: 0000000000001000 RCX: ffff881dbaa90400
[ 2646.389046] RDX: ffff881dbaa90800 RSI: 0000000000001000 RDI: 0000000000000000
[ 2646.399530] RBP: ffff8810101239f0 R08: 00000000ad073f81 R09: 0000000000000000
[ 2646.409882] R10: 0000000000000000 R11: ffff881dbaa90c00 R12: ffff8820001d75e0
[ 2646.420238] R13: ffff881dbaa90000 R14: ffffffffa00ef050 R15: ffff8820001d75e0
[ 2646.430613] FS:  0000000000000000(0000) GS:ffff88202e400000(0000) knlGS:0000000000000000
[ 2646.442039] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2646.450807] CR2: ffff881dbaa90000 CR3: 00000000019c2000 CR4: 00000000001407e0
[ 2646.461154] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2646.471557] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2646.481851] Stack:
[ 2646.486462]  0000000000001000 ffff881dbaa90000 0000000000001000 ffffffffa00ec2b6
[ 2646.497256]  ffff881dbaa90000 ffffea0076eaa400 0000000000001000 ffff881010123a48
[ 2646.508054]  ffffffff812da387 ffffffff812da274 ffff8820001d75e0 ffffea0076eaa400
[ 2646.518840] Call Trace:
[ 2646.523941]  [<ffffffffa00ec2b6>] ? crc32c_pcl_intel_update+0x66/0xa0 [crc32c_intel]
[ 2646.535024]  [<ffffffff812da387>] crypto_shash_update+0x47/0x120
[ 2646.544161]  [<ffffffff812da274>] ? crypto_shash_setkey+0x34/0xc0
[ 2646.553399]  [<ffffffff812da6ae>] shash_ahash_update+0x3e/0x70
[ 2646.562319]  [<ffffffff812da6f2>] shash_async_update+0x12/0x20
[ 2646.571200]  [<ffffffffa0dc806e>] cfs_crypto_hash_update_page+0x7e/0xb0 [libcfs]
[ 2646.581949]  [<ffffffffa1196417>] tgt_checksum_bulk.isra.33+0x338/0x4c5 [ptlrpc]
[ 2646.592589]  [<ffffffffa0dc7e77>] ? cfs_crypto_hash_speed+0x17/0x30 [libcfs]
[ 2646.602883]  [<ffffffffa116f1fd>] tgt_brw_write+0x114d/0x1640 [ptlrpc]
[ 2646.612493]  [<ffffffff811dbd90>] ? set_track+0x70/0x1d0
[ 2646.620657]  [<ffffffff811dcc3d>] ? init_object+0x3d/0x70
[ 2646.628917]  [<ffffffff810c9f08>] ? __enqueue_entity+0x78/0x80
[ 2646.637733]  [<ffffffffa10c5560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[ 2646.648096]  [<ffffffffa116b205>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 2646.658044]  [<ffffffffa111720b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 2646.668790]  [<ffffffffa0dce3c8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 2646.678614]  [<ffffffffa1114dc8>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 2646.688302]  [<ffffffff810c57f2>] ? default_wake_function+0x12/0x20
[ 2646.697368]  [<ffffffff810ba948>] ? __wake_up_common+0x58/0x90
[ 2646.705956]  [<ffffffffa111b2c0>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 2646.714933]  [<ffffffff81029569>] ? __switch_to+0xd9/0x4c0
[ 2646.723039]  [<ffffffffa111a820>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 2646.733214]  [<ffffffff810b0d6f>] kthread+0xcf/0xe0
[ 2646.740500]  [<ffffffff810b0ca0>] ? kthread_create_on_node+0x140/0x140
[ 2646.749606]  [<ffffffff8169c598>] ret_from_fork+0x58/0x90
[ 2646.757376]  [<ffffffff810b0ca0>] ? kthread_create_on_node+0x140/0x140
[ 2646.766378] Code: 00 00 00 48 8d 3c 37 ff e7 48 c7 c0 80 00 00 00 48 8d 91 00 08 00 00 4c 8d 99 00 0c 00 00 48 81 c1 00 04 00 00 4d 31 c9 4d 31 d2 <f2> 4c 0f 38 f1 81 00 fc ff ff f2 4c 0f 38 f1 8a 00 fc ff ff f2 
[ 2646.792052] RIP  [<ffffffffa00ec39b>] full_block+0x22/0x22 [crc32c_intel]
[ 2646.801371]  RSP <ffff8810101239b8>
[ 2646.806927] CR2: ffff881dbaa90000

I can confirm that this issue is not yet fixed in the RHEL 7.3 kernel.

Comment by Jinshan Xiong (Inactive) [ 19/May/17 ]

There is nothing we can do but disable checksums at the moment.

Comment by John Salinas (Inactive) [ 19/May/17 ]

Is there danger of bad data being written to disk if we disable checksums?

I suppose that could allow us to test some other Lustre things.

Comment by Jinshan Xiong (Inactive) [ 19/May/17 ]

With checksums disabled, the server can no longer detect data corruption. However, it allows us to proceed with testing.

I have disabled PAGEALLOC debugging and will see if I can discover new issues.

Comment by Andreas Dilger [ 23/May/17 ]

Can we apply a patch from the upstream kernel to RHEL to verify that this problem is fixed?

If an upstream patch fixes the problem for us, we can still apply the patch to our Lustre kernels, and also request RH to include the fix into their RHEL kernels.

Comment by John Salinas (Inactive) [ 24/May/17 ]

We can – which patch did you have in mind?

Comment by Jinshan Xiong (Inactive) [ 29/May/17 ]

Somehow this issue has become hard to reproduce.

A new backtrace was seen today:

wolf-4 login: [ 4720.733857] BUG: unable to handle kernel paging request at ffff8814eef5a000
[ 4720.741675] IP: [<ffffffff8132bfc6>] memcpy+0x6/0x110
[ 4720.747337] PGD 1fa6067 PUD 207d5ed067 PMD 207d475067 PTE 80000014eef5a060
[ 4720.755053] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 4720.760151] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate sha512_ssse3 sha512_generic crypto_null rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw nfsd gf128mul glue_helper sb_edac ablk_helper dm_service_time ses iTCO_wdt ipmi_devintf iTCO_vendor_support cryptd enclosure nfs_acl lockd ioatdma edac_core mei_me shpchp mei pcspkr auth_rpcgss ipmi_ssif lpc_ich grace sg i2c_i801 ipmi_si wmi ipmi_msghandler sunrpc acpi_power_meter acpi_pad dm_multipath dm_mod binfmt_misc ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_pclmul ttm crct10dif_common igb crc32c_intel ahci drm libahci mlx4_core ptp mpt3sas libata pps_core dca i2c_algo_bit raid_class i2c_core scsi_transport_sas devlink fjes [last unloaded: zunicode]
[ 4720.891622] CPU: 68 PID: 57946 Comm: ll_ost_io01_032 Tainted: P          IOE  ------------   3.10.0-514.16.1.el7_debug.x86_64 #0
[ 4720.904542] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 4720.916202] task: ffff8820240543c0 ti: ffff881dc541c000 task.ti: ffff881dc541c000
[ 4720.924561] RIP: 0010:[<ffffffff8132bfc6>]  [<ffffffff8132bfc6>] memcpy+0x6/0x110
[ 4720.932939] RSP: 0018:ffff881dc541f8b8  EFLAGS: 00010286
[ 4720.938879] RAX: ffffc90a1f5c3000 RBX: ffff8819e3d73b10 RCX: 0000000000001000
[ 4720.946844] RDX: 0000000000001000 RSI: ffff8814eef5a000 RDI: ffffc90a1f5c3000
[ 4720.954811] RBP: ffff881dc541f908 R08: ffff88197a4bc930 R09: ffff88017fc08b40
[ 4720.962783] R10: ffff88017fc08b40 R11: ffff88197a4bc830 R12: 0000000000001000
[ 4720.970746] R13: 0000000007000000 R14: ffff88187d978000 R15: 0000000000001000
[ 4720.978713] FS:  0000000000000000(0000) GS:ffff88202ec00000(0000) knlGS:0000000000000000
[ 4720.987751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4720.994166] CR2: ffff8814eef5a000 CR3: 00000000019c2000 CR4: 00000000001407e0
[ 4721.002130] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4721.010103] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 4721.018068] Stack:
[ 4721.020320]  ffffffffa0bcb7ac ffff8819e3d73b18 ffff881ffe507be8 0000000000000000
[ 4721.028618]  ffff8814eef5a000 0000000007000000 ffff881df792c0f8 0000000000000346
[ 4721.036917]  0000000000001000 ffff881dc541f938 ffff881dc541f970 ffffffffa0bcd43f
[ 4721.045214] Call Trace:
[ 4721.047990]  [<ffffffffa0bcb7ac>] ? dmu_write_impl+0x5c/0xd0 [zfs]
[ 4721.054926]  [<ffffffffa0bcd43f>] dmu_write.part.7+0x6f/0x100 [zfs]
[ 4721.061960]  [<ffffffffa0bcd4e9>] dmu_write+0x19/0x20 [zfs]
[ 4721.068197]  [<ffffffffa12d6929>] osd_write_commit+0x549/0x9a0 [osd_zfs]
[ 4721.075699]  [<ffffffffa13d0774>] ofd_commitrw_write+0x1014/0x1c20 [ofd]
[ 4721.083195]  [<ffffffffa13d3f2d>] ofd_commitrw+0x51d/0xa40 [ofd]
[ 4721.089970]  [<ffffffffa119d890>] obd_commitrw+0x2ec/0x32f [ptlrpc]
[ 4721.097034]  [<ffffffffa1175f51>] tgt_brw_write+0xea1/0x1640 [ptlrpc]
[ 4721.104236]  [<ffffffff811dbd90>] ? set_track+0x70/0x1d0
[ 4721.110174]  [<ffffffff811dcc3d>] ? init_object+0x3d/0x70
[ 4721.116209]  [<ffffffff810ce7e4>] ? update_curr+0x104/0x190
[ 4721.122438]  [<ffffffff810c9f08>] ? __enqueue_entity+0x78/0x80
[ 4721.128960]  [<ffffffff810d097c>] ? enqueue_entity+0x26c/0xb60
[ 4721.135514]  [<ffffffffa10cc560>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
[ 4721.143642]  [<ffffffffa1172205>] tgt_request_handle+0x915/0x1320 [ptlrpc]
[ 4721.151365]  [<ffffffffa111e20b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[ 4721.159937]  [<ffffffffa0a7b3c8>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[ 4721.167557]  [<ffffffffa111bdc8>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[ 4721.175144]  [<ffffffff810c57f2>] ? default_wake_function+0x12/0x20
[ 4721.182140]  [<ffffffff810ba948>] ? __wake_up_common+0x58/0x90
[ 4721.188693]  [<ffffffffa11222c0>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc]
[ 4721.195728]  [<ffffffffa1121820>] ? ptlrpc_register_service+0xe40/0xe40 [ptlrpc]
[ 4721.203994]  [<ffffffff810b0d6f>] kthread+0xcf/0xe0
[ 4721.209447]  [<ffffffff810b0ca0>] ? kthread_create_on_node+0x140/0x140
[ 4721.216736]  [<ffffffff8169c598>] ret_from_fork+0x58/0x90
[ 4721.222766]  [<ffffffff810b0ca0>] ? kthread_create_on_node+0x140/0x140
[ 4721.230060] Code: 43 60 48 2b 43 50 88 43 4e 5b 5d c3 66 0f 1f 84 00 00 00 00 00 e8 7b fc ff ff eb e2 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 
[ 4721.251830] RIP  [<ffffffff8132bfc6>] memcpy+0x6/0x110
[ 4721.257578]  RSP <ffff881dc541f8b8>
[ 4721.261470] CR2: ffff8814eef5a000

I have PAGEALLOC debugging enabled, so this backtrace clearly shows the page was somehow freed by the time dmu_write() touched it. Checksums were disabled for this run.

Comment by John Salinas (Inactive) [ 30/May/17 ]

You hit this after checksums were disabled?

Comment by Jinshan Xiong (Inactive) [ 30/May/17 ]

Yes.

Comment by John Salinas (Inactive) [ 30/May/17 ]

So where in dmu_write() is the memcpy that trips across this, and how many of those are there? I don't see a dump; is that still having issues?

zfs dmu_write_impl -> module/zfs/dmu.c:974:

969	}
970	
971	static void
972	dmu_write_impl(dmu_buf_t **dbp, int numbufs, uint64_t offset, uint64_t size,
973	    const void *buf, dmu_tx_t *tx)
974	{
975		int i;
976	
977		for (i = 0; i < numbufs; i++) {
978			uint64_t tocpy; 

zfs dmu_write -> module/zfs/dmu.c:1006:

1001			buf = (char *)buf + tocpy;
1002		}
1003	}
1004	
1005	void
1006	dmu_write(objset_t *os, uint64_t object, uint64_t offset, uint64_t size,
1007	    const void *buf, dmu_tx_t *tx)
1008	{
1009		dmu_buf_t **dbp;
1010		int numbufs; 

osd_zfs osd_write_commit -> lustre/osd-zfs/osd_io.c:762:

757	}
758	
759	static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
760				struct niobuf_local *lnb, int npages,
761				struct thandle *th)
762	{
763		struct osd_object  *obj  = osd_dt_obj(dt);
764		struct osd_device  *osd = osd_obj2dev(obj);
765		struct osd_thandle *oh;
766		uint64_t            new_size = 0; 

ofd_commitrw -> lustre/ofd/ofd_io.c:1190:

1185	 */
1186	int ofd_commitrw(const struct lu_env *env, int cmd, struct obd_export *exp,
1187			 struct obdo *oa, int objcount, struct obd_ioobj *obj,
1188			 struct niobuf_remote *rnb, int npages,
1189			 struct niobuf_local *lnb, int old_rc)
1190	{
1191		struct ofd_thread_info	*info = ofd_info(env);
1192		struct ofd_mod_data	*fmd;
1193		__u64			 valid;
1194		struct ofd_device	*ofd = ofd_exp(exp);

Comment by Jinshan Xiong (Inactive) [ 30/May/17 ]

It's in dmu_write_impl() at line 994.

 989                 if (tocpy == db->db_size)                                       
 990                         dmu_buf_will_fill(db, tx);                              
 991                 else                                                            
 992                         dmu_buf_will_dirty(db, tx);                             
 993                                                                                 
 994                 (void) memcpy((char *)db->db_data + bufoff, buf, tocpy);        
 995                                                                                 
 996                 if (tocpy == db->db_size)                                       
 997                         dmu_buf_fill_done(db, tx); 

By analyzing the registers and the stack, ffff8814eef5a000 was the value of buf, and the size was 4096, i.e. one page.

Comment by Joseph Gmitter (Inactive) [ 06/Jun/17 ]

Jinshan reports that he is not able to reproduce this bug with a debug kernel.

John, is this something that you can still reproduce?

Comment by Joseph Gmitter (Inactive) [ 06/Jun/17 ]

Per John, this is failing on every run without the debug kernel enabled.

John will try to reproduce with a debug kernel. John, do you need anything from us to do that?

Comment by John Salinas (Inactive) [ 06/Jun/17 ]

Where is the debug kernel?

Comment by John Hammond [ 07/Jun/17 ]

Is this a Lustre issue, or an issue with the CORAL patches? It's very hard to understand from the comments here.

John, could you please edit the issue description to include the kernel version, Lustre version, ZFS version, and any patches applied?

Comment by John Salinas (Inactive) [ 08/Jun/17 ]

This is an issue with Lustre when running with ZFS. It is probably worth your time to go read the other bugs mentioned so you can get a sense of the work that has been done on this bug and various software combinations and configurations that have been tried.

We have tracked this bug back at least as far as Lustre 2.7, through current master. My theory is that it has been there since the very beginning of our use of ZFS.

It has nothing to do with the CORAL patches; it fails with vanilla ZFS on Linux + Lustre (any version we have tried). It is easy to reproduce, and there are several known ways to reproduce it, one of them attached to this bug. I have never been able to run a system configured as it would be for CORAL without hitting this; for me it fails 100% of the time, every run, and completely blocks my Lustre testing.

I have multiple dumps from last night; if it would be helpful I can attach those as well. Jinshan has done the most work on this bug; have you talked to him?

Comment by John Salinas (Inactive) [ 08/Jun/17 ]

I updated the env. field with some version information.

Comment by Gerrit Updater [ 12/Jun/17 ]

Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: https://review.whamcloud.com/27568
Subject: LU-9305 tests: Run Reproducer
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 161e89a5b5de1f231bc73c84ae6a10095af61495

Comment by Jinshan Xiong (Inactive) [ 13/Jun/17 ]

The symptom of this issue is that pages allocated in osd_bufs_get_write() by alloc_page(OSD_GFP_IO) were somehow freed, so a later attempt to access such a page triggers a checksum error or a kernel crash (when page-alloc debugging is enabled).

It's easier to observe this issue with DEBUG_PAGEALLOC enabled. Enabling KASAN doesn't help, which implies it's not caused by a buffer overflow or by accessing memory after free.

I also ran the test script with DEBUG_VM, because I would expect it to catch pages being double-freed. It didn't help either.
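
To make that lifecycle concrete, here is a minimal sketch of the allocate/free pattern involved; the struct and function names are stand-ins (not the actual osd_io.c code), and GFP_NOFS is assumed in place of OSD_GFP_IO:

#include <linux/gfp.h>
#include <linux/mm.h>

/* Stand-in for the one field of struct niobuf_local used here. */
struct niobuf_local_sketch {
	struct page *lnb_page;
};

/* Sketch of osd_bufs_get_write(): one freshly allocated page per niobuf,
 * to be filled by the bulk transfer and copied into ZFS on commit. */
static int bufs_get_write_sketch(struct niobuf_local_sketch *lnb, int npages)
{
	int i;

	for (i = 0; i < npages; i++) {
		lnb[i].lnb_page = alloc_page(GFP_NOFS);
		if (lnb[i].lnb_page == NULL)
			return -ENOMEM;
	}
	return 0;
}

/* Sketch of osd_bufs_put(): the pages are only released here, after the
 * write has been committed.  If anything else frees one of these pages
 * early, the allocator can hand it to another user while this path still
 * references it, and the eventual __free_page() sees the "Bad page state
 * ... nonzero _count" condition reported in the logs. */
static void bufs_put_sketch(struct niobuf_local_sketch *lnb, int npages)
{
	int i;

	for (i = 0; i < npages; i++)
		if (lnb[i].lnb_page != NULL)
			__free_page(lnb[i].lnb_page);
}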

Comment by Jinshan Xiong (Inactive) [ 13/Jun/17 ]

With KASAN enabled I can no longer reproduce the issue with the same test script. I guess this is because it makes everything slower.

I suspect there may be a race in the kernel page allocator that causes pages to be reallocated before they are really freed.

Comment by Jinshan Xiong (Inactive) [ 13/Jun/17 ]

My next idea is to load a vanilla 4.11 kernel without KASAN and see if we can still reproduce this issue.

Comment by Nathaniel Clark [ 13/Jun/17 ]

I saw LU-9305 with ZFS 0.7.0-rc4 and master, but with this breakdown:

ZFS Record Size | ZPOOL Record Size | brw_size | BAD WRITE CHECKSUM
16M             | 128K              | 16       | YES
8M              | 8M                | 8        | NO
16M             | 16M               | 16       | NO
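
The only combination that fails is the one where the dataset recordsize (128K) does not match brw_size (16M); the matched-size configurations pass.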

I have hit a crash, but it appears to be identical to LU-6649.

Comment by Nathaniel Clark [ 14/Jun/17 ]

Reproducer caused issue at 4MB: https://testing.hpdd.intel.com/test_sets/24865c18-507f-11e7-a743-5254006e85c2

This is with stock ZFS 0.7.0-rc4 and Lustre master

OST1/2 console log:

20:09:59:[  471.989304] Lustre: DEBUG MARKER: == lu9305-test test 1: Run LU-9305 Reproducer 4MB ==================================================== 20:02:29 (1497384149)
20:09:59:[  472.178497] Lustre: DEBUG MARKER: echo 4194304 > /sys/module/zfs/parameters/zfs_max_recordsize
20:09:59:[  472.452383] Lustre: DEBUG MARKER: zfs set recordsize=4M lustre-ost1
20:09:59:[  472.724658] Lustre: DEBUG MARKER: zfs set recordsize=4M lustre-ost2
20:09:59:[  473.534956] Lustre: DEBUG MARKER: lctl set_param obdfilter.*-OST*.brw_size=4
20:10:02:[  764.437671] LustreError: 168-f: lustre-OST0000: BAD WRITE CHECKSUM: from 12345-10.9.4.141@tcp inode [0x200000404:0x5:0x0] object 0x0:7 extent [1258422272-1262616575]: client csum e7575e64, server csum df9ad04f
20:10:02:[  908.431065] general protection fault: 0000 [#1] SMP 
20:10:02:[  908.432048] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate libcfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ata_generic pata_acpi ext4 mbcache jbd2 cirrus drm_kms_helper virtio_blk syscopyarea sysfillrect ata_piix sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common 8139too crc32c_intel drm virtio_pci virtio_ring serio_raw virtio libata i2c_core 8139cp mii floppy
20:10:02:[  908.432048] CPU: 1 PID: 6560 Comm: socknal_sd00_01 Tainted: P           OE  ------------   3.10.0-514.21.1.el7_lustre.x86_64 #1
20:10:02:[  908.432048] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
20:10:02:[  908.432048] task: ffff88007a520000 ti: ffff880036358000 task.ti: ffff880036358000
20:10:02:[  908.432048] RIP: 0010:[<ffffffff8118edaa>]  [<ffffffff8118edaa>] put_page+0xa/0x60
20:10:02:[  908.432048] RSP: 0018:ffff88003635ba70  EFLAGS: 00010206
20:10:02:[  908.432048] RAX: 0000000000000030 RBX: 0000000000000001 RCX: ffffffff81aa61e0
20:10:02:[  908.432048] RDX: 0000000000000008 RSI: ffff880040c8c2aa RDI: 3231303635343332
20:10:02:[  908.432048] RBP: ffff88003635ba78 R08: 3433323130363534 R09: 3130363534333231
20:10:02:[  908.432048] R10: ffff88007a7aae80 R11: 0000000000066f38 R12: ffff880040c8c2c0
20:10:02:[  908.432048] R13: ffff88007ad5c900 R14: 00000000000005a8 R15: 0000000000000000
20:10:02:[  908.432048] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
20:10:02:[  908.432048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
20:10:02:[  908.432048] CR2: 00007f8ccceb0e30 CR3: 000000003685c000 CR4: 00000000000406e0
20:10:02:[  908.432048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
20:10:02:[  908.432048] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
20:10:02:[  908.432048] Stack:
20:10:02:[  908.432048]  0000000000000001 ffff88003635baa0 ffffffff8155ec9f ffff88007ad5c900
20:10:02:[  908.432048]  0000000000000000 ffff88007a7ab3e0 ffff88003635bab8 ffffffff8155ed74
20:10:02:[  908.432048]  ffff88007ad5c900 ffff88003635bad0 ffffffff8155ed92 ffff88007ad5c900
20:10:02:[  908.432048] Call Trace:
20:10:02:[  908.432048]  [<ffffffff8155ec9f>] skb_release_data+0x8f/0x140
20:10:02:[  908.432048]  [<ffffffff8155ed74>] skb_release_all+0x24/0x30
20:10:02:[  908.432048]  [<ffffffff8155ed92>] __kfree_skb+0x12/0x20
20:10:02:[  908.432048]  [<ffffffff815c05a0>] tcp_recvmsg+0x7e0/0xb50
20:10:02:[  908.432048]  [<ffffffff815eda6b>] inet_recvmsg+0x7b/0xa0
20:10:02:[  908.432048]  [<ffffffff8155637f>] sock_recvmsg+0xbf/0x100
20:10:02:[  908.432048]  [<ffffffff8155637f>] ? sock_recvmsg+0xbf/0x100
20:10:02:[  908.432048]  [<ffffffff8168ea8b>] ? _raw_spin_unlock_bh+0x1b/0x40
20:10:02:[  908.432048]  [<ffffffffa0b177f8>] ? ksocknal_recv+0x118/0x2a0 [ksocklnd]
20:10:02:[  908.432048]  [<ffffffff815563fa>] kernel_recvmsg+0x3a/0x50
20:10:02:[  908.432048]  [<ffffffffa0b19bde>] ksocknal_lib_recv_kiov+0x15e/0x3a0 [ksocklnd]
20:10:02:[  908.432048]  [<ffffffffa0b16bef>] ksocknal_process_receive+0x29f/0xd90 [ksocklnd]
20:10:02:[  908.432048]  [<ffffffffa0b17a6e>] ksocknal_scheduler+0xee/0x670 [ksocklnd]
20:10:02:[  908.432048]  [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30
20:10:02:[  908.432048]  [<ffffffffa0b17980>] ? ksocknal_recv+0x2a0/0x2a0 [ksocklnd]
20:10:02:[  908.432048]  [<ffffffff810b0a4f>] kthread+0xcf/0xe0
20:10:02:[  908.432048]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
20:10:02:[  908.432048]  [<ffffffff81697798>] ret_from_fork+0x58/0x90
20:10:02:[  908.432048]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
20:10:02:[  908.432048] Code: fb a8 20 74 05 e8 97 fe ff ff 48 89 df 31 f6 e8 cd b0 ff ff 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 53 <48> f7 07 00 c0 00 00 48 89 fb 75 31 f0 ff 4f 1c 0f 94 c0 84 c0 
20:10:02:[  908.432048] RIP  [<ffffffff8118edaa>] put_page+0xa/0x60
20:10:02:[  908.432048]  RSP <ffff88003635ba70>

EDIT: I've uploaded a new version of the reproducer that runs with stock settings before setting up large blocks.

Comment by Andreas Dilger [ 16/Jun/17 ]

Has anyone been able to reproduce this problem with ZFS 0.6.5 and/or 1MB recordsize? If it always requires larger than 1MB recordsize, then my understanding is that this requires a custom ZFS build to enable, so it isn't something that could be hit during normal usage.

Comment by John Salinas (Inactive) [ 16/Jun/17 ]

We have reproduced this with ZFS 0.6.5.7; not sure about 0.6.5. My test case did not reproduce at a 1MB record size, only at 2, 4, 8, and 16MB. We debated whether that is because it actually works at 1MB, or because it just takes forever and a day to hit at that record size.

No, recent ZFS builds require no special build for large record sizes (up to 16MB is supported out of the box).

Comment by Gerrit Updater [ 16/Jun/17 ]

Wei Liu (wei3.liu@intel.com) uploaded a new patch: https://review.whamcloud.com/27693
Subject: LU-9305 tests: Run Reproducer
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e2ae8c3f2d95a154336a3e9e96e0070ff16a7983

Comment by Sarah Liu [ 16/Jun/17 ]

The above patch is the reproducer based on the ZFS version that master is currently using (zfs-0.6.5.9-1.el7.x86_64).

Comment by Alex Zhuravlev [ 18/Jun/17 ]

Any success reproducing this with very few (ideally just one) clients?

Comment by John Salinas (Inactive) [ 19/Jun/17 ]

I could not reproduce with a client that was on the OSS node; however, with 1 client and 1 OSS it reproduces easily.

Comment by Nathaniel Clark [ 21/Jun/17 ]

I've reproduced this with 4 clients and a record size of 4MB (retesting with 1 client):
https://testing.hpdd.intel.com/test_sets/4ff95cfc-56be-11e7-9221-5254006e85c2

OST3,OST4 console:

19:00:33:[ 2346.715685] LNetError: 6844:0:(socklnd_cb.c:1149:ksocknal_process_receive()) [ffff880061190000] Error -14 on read from 12345-10.9.4.124@tcp ip 10.9.4.124:1021
19:00:33:
19:00:33:[ 2346.721187] general protection fault: 0000 [#1] [ 2346.721362] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f936c00
19:00:33:[ 2346.721398] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f937c00
19:00:33:[ 2346.721401] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f937c00
19:00:33:[ 2346.721404] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f937c00
19:00:33:[ 2346.721407] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f937c00
19:00:33:[ 2346.721410] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f936600
19:00:33:[ 2346.721413] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f936600
19:00:33:[ 2346.721416] LustreError: 6844:0:(events.c:449:server_bulk_callback()) event type 5, status -5, desc ffff88005f936600
19:00:33:[ 2346.739405] SMP 
19:00:33:[ 2346.739405] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate libcfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ppdev pcspkr virtio_balloon parport_pc parport i2c_piix4 nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm 8139too crct10dif_pclmul crct10dif_common drm i2c_core crc32c_intel virtio_pci 8139cp serio_raw mii virtio_ring virtio ata_piix libata floppy
19:00:33:[ 2346.766822] CPU: 1 PID: 6842 Comm: socknal_reaper Tainted: P           OE  ------------   3.10.0-514.21.1.el7_lustre.x86_64 #1
19:00:33:[ 2346.766822] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
19:00:33:[ 2346.766822] task: ffff88007930bec0 ti: ffff880069c44000 task.ti: ffff880069c44000
19:00:33:[ 2346.766822] RIP: 0010:[<ffffffff8118edaa>]  [<ffffffff8118edaa>] put_page+0xa/0x60
19:00:33:[ 2346.766822] RSP: 0018:ffff880069c47cc8  EFLAGS: 00010206
19:00:33:[ 2346.766822] RAX: 0000000000000030 RBX: 0000000000000001 RCX: ffffffffffffffec
19:00:33:[ 2346.766822] RDX: 0000000000000014 RSI: 0000000000000000 RDI: 3231303635343332
19:00:33:[ 2346.766822] RBP: ffff880069c47cd0 R08: 0000000000000004 R09: 0000000000000003
19:00:33:[ 2346.766822] R10: 00000000000000e4 R11: 0000000000000000 R12: ffff8800626a3b40
19:00:33:[ 2346.766822] R13: ffff88007b85b700 R14: ffff8800796a9f90 R15: 0000000000000008
19:00:33:[ 2346.766822] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
19:00:33:[ 2346.766822] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
19:00:33:[ 2346.766822] CR2: 00007f1d484fe000 CR3: 000000007ad1c000 CR4: 00000000000406e0
19:00:33:[ 2346.766822] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
19:00:33:[ 2346.766822] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
19:00:33:[ 2346.766822] Stack:
19:00:33:[ 2346.766822]  0000000000000001 ffff880069c47cf8 ffffffff8155ec9f ffff88007b85b700
19:00:33:[ 2346.766822]  ffff88007b85b700 ffffffff8155ed1e ffff880069c47d10 ffffffff8155ed74
19:00:33:[ 2346.766822]  0000000000000000 ffff880069c47d38 ffffffff8155edd2 0000000000000000
19:00:33:[ 2346.766822] Call Trace:
19:00:33:[ 2346.766822]  [<ffffffff8155ec9f>] skb_release_data+0x8f/0x140
19:00:33:[ 2346.766822]  [<ffffffff8155ed1e>] ? skb_release_data+0x10e/0x140
19:00:33:[ 2346.766822]  [<ffffffff8155ed74>] skb_release_all+0x24/0x30
19:00:33:[ 2346.842652]  [<ffffffff8155edd2>] kfree_skb+0x32/0xa0
19:00:33:[ 2346.842652]  [<ffffffff8155ed1e>] skb_release_data+0x10e/0x140
19:00:33:[ 2346.842652]  [<ffffffff8155ed74>] skb_release_all+0x24/0x30
19:00:33:[ 2346.842652]  [<ffffffff8155ed92>] __kfree_skb+0x12/0x20
19:00:33:[ 2346.842652]  [<ffffffff815c3574>] tcp_close+0x284/0x3f0
19:00:33:[ 2346.842652]  [<ffffffff815ec357>] inet_release+0x77/0x80
19:00:33:[ 2346.842652]  [<ffffffff8155547f>] sock_release+0x1f/0x80
19:00:33:[ 2346.842652]  [<ffffffffa0b1728d>] ksocknal_terminate_conn+0xcd/0x200 [ksocklnd]
19:00:33:[ 2346.842652]  [<ffffffffa0b20c76>] ksocknal_reaper+0xb6/0x3d0 [ksocklnd]
19:00:33:[ 2346.842652]  [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20
19:00:33:[ 2346.842652]  [<ffffffffa0b20bc0>] ? ksocknal_recv_hello+0x830/0x830 [ksocklnd]
19:00:33:[ 2346.842652]  [<ffffffff810b0a4f>] kthread+0xcf/0xe0
19:00:33:[ 2346.842652]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
19:00:33:[ 2346.842652]  [<ffffffff81697798>] ret_from_fork+0x58/0x90
19:00:33:[ 2346.842652]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
19:00:33:[ 2346.842652] Code: fb a8 20 74 05 e8 97 fe ff ff 48 89 df 31 f6 e8 cd b0 ff ff 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 53 <48> f7 07 00 c0 00 00 48 89 fb 75 31 f0 ff 4f 1c 0f 94 c0 84 c0 
19:00:33:[ 2346.842652] RIP  [<ffffffff8118edaa>] put_page+0xa/0x60
Comment by Alex Zhuravlev [ 22/Jun/17 ]

I think it makes sense to try the following scenarios:
1) fixed blocksize
a) I/O matches the blocksize
b) I/O doesn't match the blocksize
2) growing blocksize
a) I/O matches the blocksize
b) I/O doesn't match the blocksize

Hopefully we'll be able to nail it down to zero-copy or copy I/O.

Comment by Andreas Dilger [ 22/Jun/17 ]

One thought I had (though I haven't done any investigation on it) is if recordsize > RPC size and/or DLM lock size, then it is possible that multiple OSS threads are modifying the same ZFS block at one time. Do we have protections in place to avoid this? With ldiskfs, the underlying blocksize is always smaller than the RPC and DLM lock size.

Comment by Alex Zhuravlev [ 22/Jun/17 ]

Well, if it were just a rewrite, then no issues with the page structures should be possible, right?

Also, if this is a pure ZFS issue, then I'd expect to be able to reproduce it with obdfilter-survey?

Comment by Nathaniel Clark [ 22/Jun/17 ]

Reproduced with 16M recordsize on 0.6.5.9
https://testing.hpdd.intel.com/test_sets/26671780-56e1-11e7-a749-5254006e85c2

22:41:42:[ 4006.884664] LustreError: 168-f: lustre-OST0002: BAD WRITE CHECKSUM: from 12345-10.9.4.98@tcp inode [0x200000404:0x31:0x0] object 0x0:53 extent [29360128-33554431]: client csum 6bb7b362, server csum 3b557958
22:41:42:[ 4007.637386] ------------[ cut here ]------------
22:41:42:[ 4007.833042] WARNING: at lib/list_debug.c:29 __list_add+0x65/0xc0()
22:41:42:[ 4007.833042] list_add corruption. next->prev should be prev (ffff8800797a6c90), but was 3332313036353433. (next=ffffc9000c8f1010).
22:41:42:
22:41:42:[ 4007.833042] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE) mgc(OE) osd_zfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) zfs(POE) zunicode(POE) zavl(POE) zcommon(POE) znvpair(POE) spl(OE) zlib_deflate libcfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel ppdev aesni_intel lrw gf128mul glue_helper ablk_helper cryptd virtio_balloon pcspkr i2c_piix4 parport_pc parport nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus crct10dif_pclmul crct10dif_common 8139too drm_kms_helper syscopyarea sysfillrect crc32c_intel sysimgblt fb_sys_fops ttm ata_piix virtio_pci 8139cp virtio_ring serio_raw mii virtio drm libata i2c_core floppy
22:41:42:[ 4008.044390] CPU: 1 PID: 13454 Comm: ll_ost_io00_017 Tainted: P           OE  ------------   3.10.0-514.21.1.el7_lustre.x86_64 #1
22:41:42:[ 4008.044390] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
22:41:42:[ 4008.044390]  ffff880055b63548 000000008b0ef237 ffff880055b63500 ffffffff8168717f
22:41:42:[ 4008.044390]  ffff880055b63538 ffffffff81085cb0 ffffc9000e1e1010 ffffc9000c8f1010
22:41:42:[ 4008.044390]  ffff8800797a6c90 ffff88007982c180 ffffc9000f1e2018 ffff880055b635a0
22:41:42:[ 4008.044390] Call Trace:
22:41:42:[ 4008.206860]  [<ffffffff8168717f>] dump_stack+0x19/0x1b
22:41:42:[ 4008.206860]  [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0
22:41:42:[ 4008.206860]  [<ffffffff81085d4c>] warn_slowpath_fmt+0x5c/0x80
22:41:42:[ 4008.206860]  [<ffffffff813339d5>] __list_add+0x65/0xc0
22:41:42:[ 4008.206860]  [<ffffffffa06d8b89>] spl_cache_refill+0x555/0x594 [spl]
22:41:42:[ 4008.358866]  [<ffffffff811de665>] ? kmem_cache_alloc+0x35/0x1e0
22:41:42:[ 4008.358866]  [<ffffffffa06d09c2>] spl_kmem_cache_alloc+0x132/0x150 [spl]
22:41:42:[ 4008.358866]  [<ffffffff811de665>] ? kmem_cache_alloc+0x35/0x1e0
22:41:42:[ 4008.358866]  [<ffffffffa082c027>] zio_data_buf_alloc+0x57/0x60 [zfs]
22:41:42:[ 4008.358866]  [<ffffffffa077f479>] arc_get_data_buf.isra.20+0x129/0x3a0 [zfs]
22:41:42:[ 4008.358866]  [<ffffffffa078268f>] arc_buf_alloc+0x10f/0x190 [zfs]
22:41:42:[ 4008.358866]  [<ffffffffa078a697>] dbuf_read+0x357/0x8d0 [zfs]
22:41:42:[ 4008.358866]  [<ffffffffa07a1306>] dmu_tx_check_ioerr+0x66/0xe0 [zfs]
22:41:42:[ 4008.358866]  [<ffffffffa07a15d6>] dmu_tx_count_write+0x256/0x6b0 [zfs]
22:41:42:[ 4008.358866]  [<ffffffffa0788e46>] ? dbuf_rele+0x26/0x30 [zfs]
22:41:42:[ 4008.477431]  [<ffffffff811dda61>] ? __kmalloc_node+0x1d1/0x2b0
22:41:42:[ 4008.477431]  [<ffffffffa07a50ba>] ? dnode_hold_impl+0x3da/0x5c0 [zfs]
22:41:42:[ 4008.477431]  [<ffffffffa06cfb50>] ? spl_kmem_zalloc+0xc0/0x170 [spl]
22:41:42:[ 4008.477431]  [<ffffffffa06cfb50>] ? spl_kmem_zalloc+0xc0/0x170 [spl]
22:41:42:[ 4008.477431]  [<ffffffffa06cfb50>] ? spl_kmem_zalloc+0xc0/0x170 [spl]
22:41:42:[ 4008.477431]  [<ffffffffa07a1a66>] dmu_tx_hold_write+0x36/0x50 [zfs]
22:41:42:[ 4008.477431]  [<ffffffffa0f9f09e>] osd_declare_write_commit+0x58e/0x7a0 [osd_zfs]
22:41:42:[ 4008.477431]  [<ffffffffa0b27329>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
22:41:42:[ 4008.477431]  [<ffffffffa109c22d>] ofd_commitrw_write+0x74d/0x1c50 [ofd]
22:41:42:[ 4008.477431]  [<ffffffffa10a02ef>] ofd_commitrw+0x4af/0xa80 [ofd]
22:41:42:[ 4008.477431]  [<ffffffffa0e02045>] obd_commitrw+0x2ed/0x330 [ptlrpc]
22:41:42:[ 4008.532407]  [<ffffffffa0dd58b0>] tgt_brw_write+0x1010/0x1740 [ptlrpc]
22:41:42:[ 4008.543616]  [<ffffffff8132396b>] ? string.isra.7+0x3b/0xf0
22:41:42:[ 4008.543616]  [<ffffffffa0d27920>] ? target_send_reply_msg+0x170/0x170 [ptlrpc]
22:41:42:[ 4008.543616]  [<ffffffffa0dd0755>] tgt_request_handle+0x915/0x1360 [ptlrpc]
22:41:42:[ 4008.543616]  [<ffffffffa0d7a003>] ptlrpc_server_handle_request+0x233/0xa90 [ptlrpc]
22:41:42:[ 4008.677153]  [<ffffffff810ba628>] ? __wake_up_common+0x58/0x90
22:41:42:[ 4008.677153]  [<ffffffffa0d7dfe0>] ptlrpc_main+0xaa0/0x1dd0 [ptlrpc]
22:41:42:[ 4008.677153]  [<ffffffffa0d7d540>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
22:41:42:[ 4008.677153]  [<ffffffff810b0a4f>] kthread+0xcf/0xe0
22:41:42:[ 4008.677153]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
22:41:42:[ 4008.677153]  [<ffffffff81697798>] ret_from_fork+0x58/0x90
22:41:42:[ 4008.677153]  [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140
22:41:42:[ 4008.677153] ---[ end trace 7d43a0ec1971c9bb ]---
Comment by Nathaniel Clark [ 22/Jun/17 ]

This doesn't seem to reproduce with a single client (several runs, no reproduction, still trying though). But the multi-client tests don't write to the same files; they all write to separate files in the same directory. So there shouldn't be multiple writes coming into the same block, unless the blocks from single clients aren't aligned? But then a single client should be able to cause the error...

Comment by Gerrit Updater [ 22/Jun/17 ]

Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: https://review.whamcloud.com/27785
Subject: LU-9305 tests: Run Reproducer
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5226fcc3fa788615bebd2eab204d25c1fcc9bc07

Comment by Alex Zhuravlev [ 22/Jun/17 ]

The patch above tries to reproduce with a fixed (big) blocksize.

Comment by Andreas Dilger [ 22/Jun/17 ]

Nathaniel, even a single client will have multiple RPCs in flight at one time. If the client RPC size is below the ZFS recordsize, then it is possible that multiple threads are processing IO for the same block. It should be possible to record (at least) the DLMTRACE and RPCTRACE messages in the Lustre debug log (possibly using "lctl debug_daemon" if this takes a while to be hit and the logs are large), and then after the bad checksum message is hit go back through the logs to find what threads are accessing that object.

That said, given that there is also list_head corruption hit at the same time, this looks more like memory corruption (e.g. use after free) than a Lustre-level problem, unless there is some kind of bug in the BAD WRITE CHECKSUM handling that is causing the list_head corruption.

Comment by Jinshan Xiong (Inactive) [ 23/Jun/17 ]

I have tried most of the ideas above, but it won't hurt to try them one more time.

I previously loaded a 4.11 kernel with KASAN enabled and could no longer reproduce the problem. So the idea is to try a vanilla 4.11 kernel and see how it goes.

I have talked with John S. about the idea, but unfortunately he has no time to run it.

Comment by Andreas Dilger [ 23/Jun/17 ]

Now that the CORAL team is past their customer demo, it makes sense for John to reproduce this problem on one of his test systems and work with Alex and Jinshan to debug it there.

Comment by Alex Zhuravlev [ 23/Jun/17 ]

Is it possible to get a dumped log from autotest/Maloo?

Comment by Sarah Liu [ 23/Jun/17 ]

Alex, for your patch #27785, Maloo only has the console and test logs, since the tests passed. I think autotest only collects debug/dmesg etc. when a test fails. Is that what you need?

Comment by Alex Zhuravlev [ 24/Jun/17 ]

well, I have this line in the logs:
12:20:31:[11737.507659] LustreError: dumping log to /tmp/lustre-log.1498220428.13468
and the test failed with TIMEOUT (as the OST crashed).
Is it possible to get /tmp/lustre-log.1498220428.13468?

Comment by Alex Zhuravlev [ 26/Jun/17 ]

Unfortunately I can't reproduce this locally, so it becomes vital to access the dumped logs.

Comment by Peter Jones [ 26/Jun/17 ]

Alex

I suggest that you connect with John Salinas and see if you can get direct access to the system on which he can reproduce this 100% of the time.

Peter

Comment by Alex Zhuravlev [ 26/Jun/17 ]

Sure, it would be great to access that system. But having the debug logs is still useful, IMO, as we can run many tests concurrently, trying different ideas, etc. And for sure this is not the last situation where that can help.

Comment by Sarah Liu [ 26/Jun/17 ]

Hi Alex, can you please point me to the link/logs you are referring to?

Comment by Alex Zhuravlev [ 26/Jun/17 ]

for example: https://testing.hpdd.intel.com/test_sessions/1bf95369-a387-4d1e-bdb3-51f394f95749
trevis-36vm4 dumped a lustre log:
12:20:31:[11737.507659] LustreError: dumping log to /tmp/lustre-log.1498220428.13468

Comment by Sarah Liu [ 26/Jun/17 ]

Alex, please go to trevis and check the following directory; I hope it has what you are looking for:

-sh-4.1$ ls /scratch/dumps/trevis-36vm4.trevis.hpdd.intel.com
10.9.5.195-2017-06-22-22:11:54  10.9.5.195-2017-06-23-12:20:38
10.9.5.195-2017-06-23-06:33:26  2016-06-26-05:38

When an autotest node crashes, the core dump location will be /scratch/dumps/<hostname>

Comment by Alex Zhuravlev [ 26/Jun/17 ]

Thanks. Did you upload those files, or did autotest do it automatically?

Comment by Sarah Liu [ 26/Jun/17 ]

It is automatically generated when a node crashes. Please refer to https://wiki.hpdd.intel.com/display/TEI/Core+dump+location+for+autotest+nodes

Comment by Alex Zhuravlev [ 29/Jun/17 ]

I added a few CDEBUG() calls and a BUG() in tgt_warn_on_cksum():

00080000:00080000:1.0:1498681682.290436:0:13445:0:(osd_io.c:541:osd_bufs_get()) pages: ffffea00003ac3c0/4096@12582912
00000020:00080000:1.0:1498681682.363189:0:13445:0:(tgt_handler.c:2121:tgt_warn_on_cksum()) pages: ffffea00003ac3c0/4096@12582912

00080000:00080000:0.0:1498681682.396783:0:14108:0:(osd_io.c:541:osd_bufs_get()) pages: ffffea00003ac3c0/4096@20086784

So tgt_warn_on_cksum() in thread 13445 hit the BUG() and could not proceed to osd_bufs_put() to release page ffffea00003ac3c0, in theory. Yet thread 14108 was able to get this page a few cycles later.
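
For reference, the instrumentation amounted to fragments along these lines (a sketch, not the exact debug patch; the D_CACHE mask and the loop variables are assumptions, though the format string matches the log lines above):

/* in osd_bufs_get(), for each local niobuf being pinned */
CDEBUG(D_CACHE, "pages: %p/%u@%llu\n",
       lnb[i].lnb_page, lnb[i].lnb_len, lnb[i].lnb_file_offset);

/* in tgt_warn_on_cksum(), next to the BAD WRITE CHECKSUM report */
CDEBUG(D_CACHE, "pages: %p/%u@%llu\n",
       local_nb[i].lnb_page, local_nb[i].lnb_len,
       local_nb[i].lnb_file_offset);
BUG();	/* halt this thread so it can never reach osd_bufs_put() */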

Comment by Jinshan Xiong (Inactive) [ 30/Jun/17 ]

Alex:

That's a good finding. What would be the next step?

Comment by Alex Zhuravlev [ 03/Jul/17 ]

May I ask someone to run https://review.whamcloud.com/27913 on real iron, please?
I'm not sure about the exact sequence, but my theory is that osd_bufs_put() may try to release pages already released by dmu_assign_arcbuf() when the blocksizes mismatch.
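
A minimal sketch of that idea (this is not the merged patch; using lnb_data as the "already consumed by dmu_assign_arcbuf()" marker is an assumption for illustration): once the ARC owns a buffer's pages, the put path must skip them instead of releasing them a second time.

/* sketch only; relies on Lustre's struct niobuf_local from lustre_net.h */
static void osd_bufs_put_sketch(struct niobuf_local *lnb, int npages)
{
	int i;

	for (i = 0; i < npages; i++) {
		if (lnb[i].lnb_page == NULL)
			continue;
		if (lnb[i].lnb_data != NULL) {
			/* hypothetical marker: this buffer was handed to
			 * dmu_assign_arcbuf(), so ZFS will free the pages;
			 * releasing them here would be the double free */
			lnb[i].lnb_page = NULL;
			continue;
		}
		__free_page(lnb[i].lnb_page);	/* our own allocation */
		lnb[i].lnb_page = NULL;
	}
}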

Comment by Jinshan Xiong (Inactive) [ 03/Jul/17 ]

The wolf cluster should be available to use. Do you have access to wolf?

Comment by Alex Zhuravlev [ 03/Jul/17 ]

No, I'm not familiar with the wolf cluster.

Comment by Gerrit Updater [ 06/Jul/17 ]

Alex Zhuravlev (alexey.zhuravlev@intel.com) uploaded a new patch: https://review.whamcloud.com/27950
Subject: LU-9305 osd: do not release pages twice
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 12d3b3dcfbc17fc201dc9de463720e3a3a994f49

Comment by Alex Zhuravlev [ 06/Jul/17 ]

With https://review.whamcloud.com/27950 applied, I can't reproduce the issue.

Comment by Jinshan Xiong (Inactive) [ 06/Jul/17 ]

This is indeed a race condition. I wonder why I couldn't catch the race by enabling VM_BUG_ON_PAGE() in put_page_testzero().

Comment by Jinshan Xiong (Inactive) [ 07/Jul/17 ]

The patch should fix the problem; I have run the patched Lustre for several hours without hitting it. Without the patch it died quickly.

Comment by Alex Zhuravlev [ 07/Jul/17 ]

Jinshan, thanks for testing.

Comment by Gerrit Updater [ 10/Jul/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27950/
Subject: LU-9305 osd: do not release pages twice
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 6d039389735c52f965505643de9d8e4772e3f87f

Comment by Phil Schwan (Inactive) [ 08/Sep/17 ]

Do we think that this bug is inherently to blame for the "Error -14" / put_page crash that I see referenced in a couple of the comments?

19:00:33:[ 2346.715685] LNetError: 6844:0:(socklnd_cb.c:1149:ksocknal_process_receive()) [ffff880061190000] Error -14 on read from 12345-10.9.4.124@tcp ip 10.9.4.124:1021
19:00:33:[ 2346.766822] CPU: 1 PID: 6842 Comm: socknal_reaper Tainted: P OE ------------ 3.10.0-514.21.1.el7_lustre.x86_64 #1
19:00:33:[ 2346.766822] RIP: 0010:[<ffffffff8118edaa>] [<ffffffff8118edaa>] put_page+0xa/0x60

(I give up trying to get it to format correctly)

?

We've hit this crash about 20 times in the past month in normal production use (ZFS 0.7.1-1 / Lustre 2.9.51_45_g3b3eeeb), but generally without a "BAD WRITE CHECKSUM" error anywhere in sight.

Comment by Jinshan Xiong (Inactive) [ 10/Sep/17 ]

Hi Phil,

The problem discovered in this ticket is about memory corruption, so I wouldn't be surprised if the symptom is related to this bug. Please go ahead and try the patch and see how it goes.

Comment by Phil Schwan (Inactive) [ 12/Sep/17 ]

Thanks, Jinshan! After another 20x crashes yesterday, we installed the patch about 12 hours ago. So far so good.
