[LU-6160] all test builds should enable SPL/ZFS debugging Created: 26/Jan/15  Updated: 07/Nov/18  Resolved: 07/Jun/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Improvement Priority: Major
Reporter: Isaac Huang (Inactive) Assignee: Oleg Drokin
Resolution: Fixed Votes: 0
Labels: zfs

Issue Links:
Related
is related to LU-6155 osd_count_not_mapped() calls dbuf_hol... Resolved
is related to LU-6195 osd-zfs: osd_declare_object_destroy()... Resolved
Rank (Obsolete): 17226

 Description   

ASSERTs in SPL/ZFS are off by default. It would be a good idea to enable SPL/ZFS assertions at least for test builds, which would pinpoint problems much earlier, before they manifest as hard-to-diagnose symptoms.

I have a patch that enables SPL/ZFS debugging:
http://review.whamcloud.com/#/c/13431/

But there are build failures for i686 in SPL/ZFS, so upstream SPL/ZFS would need a fix for that.



 Comments   
Comment by Alex Zhuravlev [ 27/Jan/15 ]

I think Lustre isn't quite ready for this yet. In some cases (like appending to an llog) we can't declare writes in the way ZFS's debugging expects.

Comment by Isaac Huang (Inactive) [ 10/Feb/15 ]

Alex, can you please point me to the code?

Comment by Alex Zhuravlev [ 10/Feb/15 ]

Well, I don't remember the exact lines, but basically in debug mode the DMU checks that all writes are declared properly (including offset/range), while LLOG doesn't know the exact offset at declare time.
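
To illustrate the mismatch described in this comment, here is a toy model in Python. It is only a sketch and not the real DMU or LLOG code: the `Tx` class, `declare_write`, `write`, and `append_record` are hypothetical names invented for this example. It assumes a debug build rejects any write that falls outside a previously declared offset/range, while a non-debug build lets it through unchecked.

```python
class Tx:
    """Toy transaction (hypothetical, not the DMU API)."""

    def __init__(self, debug):
        self.debug = debug
        self.declared = []  # (offset, length) ranges declared up front

    def declare_write(self, offset, length):
        self.declared.append((offset, length))

    def write(self, offset, length):
        # A write is covered if it lies entirely inside one declared range.
        covered = any(
            start <= offset and offset + length <= start + size
            for start, size in self.declared
        )
        # Only the debug variant enforces the check.
        if self.debug and not covered:
            raise AssertionError(
                f"undeclared write at [{offset}, {offset + length})"
            )
        return covered


def append_record(tx, size_at_declare, size_at_write, rec_len):
    """Append to a log whose end may move between declare and write."""
    # Declare phase: the best guess is the log size seen right now.
    tx.declare_write(size_at_declare, rec_len)
    # Execute phase: the log may have grown, so the real offset can differ.
    return tx.write(size_at_write, rec_len)
```

In this sketch, `append_record(Tx(debug=False), 100, 164, 64)` completes even though the write lands outside the declared range, while the same call with `debug=True` raises. That is the kind of failure the comment anticipates for llog appends.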

Comment by Olaf Faaland [ 11/Aug/17 ]

I hear that Oleg successfully ran Lustre using ZFS built with debug enabled, and submitted patches for some bugs he found that way.

Was Lustre patched to enable this, so that this ticket can now go forward? Or did he just make some one-off change in his environment, perhaps to ZFS, to enable this for testing purposes?

Comment by Alex Zhuravlev [ 11/Aug/17 ]

"Ran" doesn't exactly mean it's reliable. Not that I'm against running with debug enabled, but there were known issues we never fixed, and some can't be fixed outside of the ZFS code.

Comment by Peter Jones [ 11/Aug/17 ]

Assigning to Oleg for his comment

Comment by Gerrit Updater [ 14/Aug/17 ]

Giuseppe Di Natale (dinatale2@llnl.gov) uploaded a new patch: https://review.whamcloud.com/28544
Subject: LU-6160 osd-zfs: Fix refcount_add call
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 510f1536aa60e11271d621e2171dc19bcf8c7a5b

Comment by Giuseppe Di Natale (Inactive) [ 14/Aug/17 ]

I'm submitting https://review.whamcloud.com/28544 just so Lustre builds against ZFS packages that have debug enabled.

Comment by Gerrit Updater [ 07/Jun/18 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/28544/
Subject: LU-6160 osd-zfs: Fix refcount_add call
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ad7e62cc15e9e90d33a7302308d47566e4af3593

Comment by Peter Jones [ 07/Jun/18 ]

Landed for 2.12

Comment by Olaf Faaland [ 07/Jun/18 ]

I see that Oleg landed the patch to allow Lustre to build against zfs packages with debug enabled.

But were the appropriate change(s) made to your build system to enable debug in zfs for test builds?

Comment by Gerrit Updater [ 07/Nov/18 ]

Nathaniel Clark (nclark@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33604
Subject: LU-6160 osd-zfs: Fix refcount_add call
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: fe7bf8a315c6c995f9eb674bd15308ff8def7633
