[LU-7776] lustre-single lnet-selftest test failed Created: 15/Feb/16 Updated: 15/Jun/16 Resolved: 15/Jun/16 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0, Lustre 2.9.0 |
| Fix Version/s: | Lustre 2.9.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Abrar-ahmed | Assignee: | WC Triage |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Solo setup |
||
| Epic/Theme: | test |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
lnet-selftest test fails in test setup

stdout.log
1 UP mgs MGS MGS 5
2 UP mgc MGC192.168.108.18@tcp c0ab2420-8f51-ad18-f779-591cad596879 5
3 UP mds MDS MDS_uuid 3
4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 4
5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 11
6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 4
7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 4
8 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 5
9 UP osd-ldiskfs lustre-OST0000-osd lustre-OST0000-osd_UUID 5
10 UP ost OSS OSS_uuid 3
11 UP obdfilter lustre-OST0000 lustre-OST0000_UUID 7
12 UP lwp lustre-MDT0000-lwp-OST0000 lustre-MDT0000-lwp-OST0000_UUID 5
13 UP osd-ldiskfs lustre-OST0001-osd lustre-OST0001-osd_UUID 5
14 UP obdfilter lustre-OST0001 lustre-OST0001_UUID 7
15 UP lwp lustre-MDT0000-lwp-OST0001 lustre-MDT0000-lwp-OST0001_UUID 5
21 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
22 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
Modules still loaded: lustre/osp/osp.o lustre/lod/lod.o lustre/ost/ost.o lustre/mdt/mdt.o lustre/mdd/mdd.o lustre/mgs/mgs.o ldiskfs/ldiskfs.o lustre/quota/lquota.o lustre/lfsck/lfsck.o lustre/mgc/mgc.o lustre/fid/fid.o lustre/fld/fld.o lustre/ptlrpc/ptlrpc.o lustre/obdclass/obdclass.o lnet/klnds/socklnd/ksocklnd.o lnet/lnet/lnet.o libcfs/libcfs/libcfs.o |
| Comments |
| Comment by Abrar-ahmed [ 15/Feb/16 ] |
|
The lnet-selftest.sh script fails while executing cleanupall() during test setup. cleanupall() in turn fails because it tries to remove modules that are still in use. This happens on a solo setup, where local_mode returns true and the variable CLIENTONLY is set. cleanupall() internally calls stopall(), which checks CLIENTONLY and, if it is set, returns midway without cleaning up the mgs, mds and ost. This causes cleanupall() to fail at a later stage, when it tries to remove the loaded modules.
stopall() {
...
[ "$CLIENTONLY" ] && return
The change history shows that this regression was introduced by a debug patch http://review.whamcloud.com/12469 ( |
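The failure path described in this comment can be sketched with simplified stand-ins. This is only an illustration, not the real test-framework.sh code: stopall, cleanupall and CLIENTONLY are the names used in the ticket, while SERVERS_STOPPED and unload_modules_sketch are hypothetical helpers introduced here to model "modules still in use".

```shell
#!/bin/bash
# Simplified stand-ins; the real test-framework.sh functions do far more.
# SERVERS_STOPPED and unload_modules_sketch are hypothetical names.

SERVERS_STOPPED=no   # becomes "yes" once the server targets are stopped

stopall() {
    echo "stopping clients"
    # Early return quoted in the ticket: with CLIENTONLY set,
    # the mgs, mds and ost are never stopped.
    [ "$CLIENTONLY" ] && return
    echo "stopping mgs, mds and ost"
    SERVERS_STOPPED=yes
}

unload_modules_sketch() {
    # Models module removal, which cannot succeed while servers are running.
    if [ "$SERVERS_STOPPED" != yes ]; then
        echo "Modules still loaded"
        return 1
    fi
    echo "modules unloaded"
}

cleanupall() {
    stopall
    unload_modules_sketch
}

# Solo setup: CLIENTONLY is set, so cleanupall fails.
CLIENTONLY=yes
rc_clientonly=0
cleanupall || rc_clientonly=$?

# CLIENTONLY unset: servers are stopped first, so cleanupall succeeds.
CLIENTONLY=
rc_full=0
cleanupall || rc_full=$?

echo "CLIENTONLY=yes rc=$rc_clientonly, CLIENTONLY unset rc=$rc_full"
```

In this model the first call returns non-zero because stopall returned before the server targets were stopped, which matches the "Modules still loaded" message in the description.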
| Comment by Andreas Dilger [ 18/Feb/16 ] |
|
I don't think that reverting the patch is a good idea, since I believe this will cause lnet-selftest to begin failing again in our test configuration. Instead, I think it should be enough to change the "cleanupall" to "stopall" so that it doesn't try to unload the modules, which isn't necessary. The goal of the |
| Comment by Abrar-ahmed [ 31/Mar/16 ] |
|
@Andreas Dilger Here is my understanding of the debug patch submitted via commit <a8ba5c645f91faf86a84c99dd2cc049bc54e12b1>:
- local_mode && CLIENTONLY=yes
- stopall
- RESTORE_MOUNT=yes
+ local_mode && CLIENTONLY=yes
+ RESTORE_MOUNT=yes
+ LOAD_MODULES_REMOTE=true
+ cleanupall
So changing cleanupall to stopall would functionally revert the debug patch. Would this not cause your test setup to fail again? |
| Comment by Abrar-ahmed [ 31/Mar/16 ] |
|
@Andreas Dilger An alternative solution that keeps the debug patch functionality would be to avoid calling cleanupall on local_mode setups. Something like the following:
- local_mode && CLIENTONLY=yes
+ if local_mode; then
+     CLIENTONLY=yes
+     stopall
+ else
+     LOAD_MODULES_REMOTE=true
+     cleanupall
+ fi
Let me know which solution works for you or if you want to suggest alternatives. I can upload a patch for the same. |
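The branch proposed above can be exercised in isolation with stubbed helpers. This is a sketch under assumptions: local_mode, stopall and cleanupall are the names from the ticket, but the bodies here are placeholders that only record which function ran, and LOCAL_SETUP, CALLED and selftest_setup_sketch are hypothetical names introduced for the illustration.

```shell
#!/bin/bash
# Placeholder bodies: the real helpers live in test-framework.sh.
# LOCAL_SETUP, CALLED and selftest_setup_sketch are hypothetical.

LOCAL_SETUP=yes
CALLED=""

local_mode() { [ "$LOCAL_SETUP" = yes ]; }
stopall()    { CALLED=stopall; }
cleanupall() { CALLED=cleanupall; }

selftest_setup_sketch() {
    if local_mode; then
        # Solo setup: stop only, never try to unload modules.
        CLIENTONLY=yes
        stopall
    else
        # Multi-node setup: keep the behaviour of the debug patch.
        LOAD_MODULES_REMOTE=true
        cleanupall
    fi
}

LOCAL_SETUP=yes
selftest_setup_sketch
called_local=$CALLED

LOCAL_SETUP=no
selftest_setup_sketch
called_remote=$CALLED

echo "local: $called_local, remote: $called_remote"
```

The point of the split is that cleanupall, and with it the module unload, is reached only when local_mode is false, so the CLIENTONLY early return in stopall can no longer leave cleanupall with modules in use.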
| Comment by Andreas Dilger [ 01/Apr/16 ] |
|
Looks reasonable, and if this patch works for you then you can submit it and it can be tested. |
| Comment by Gerrit Updater [ 02/Apr/16 ] |
|
Abrarahmed Momin (kais_abrar@yahoo.co.in) uploaded a new patch: http://review.whamcloud.com/19308 |
| Comment by Abrar-ahmed [ 14/Apr/16 ] |
|
@Andreas Dilger: Hi Andreas, I have uploaded the discussed patch, and the test run results were fine. Could you and others kindly review the patch? |
| Comment by Gerrit Updater [ 14/Jun/16 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/19308/ |
| Comment by Joseph Gmitter (Inactive) [ 15/Jun/16 ] |
|
The patch has landed on master for 2.9.0. |