[LU-3121] lustre-initialization-1: Can't load module 'osd-zfs' Created: 08/Apr/13 Updated: 02/Jun/15 Resolved: 05/Aug/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Chris Gearing (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | HB, zfs | ||
| Severity: | 3 |
| Rank (Obsolete): | 7578 |
| Description |
|
This issue was created by maloo for Li Wei <liwei@whamcloud.com>.

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/418e2830-9f9f-11e2-9f27-52540035b04c.

The sub-test lustre-initialization_1 failed with the following error:

Info required for matching: lustre-initialization-1 lustre-initialization_1

For some unknown reason, the MDS couldn't load osd-zfs.ko:

08:17:45:LNet: Accept all, port 7988
08:17:45:LustreError: 158-c: Can't load module 'osd-zfs'
08:17:45:LustreError: 2944:0:(genops.c:304:class_newdev()) OBD: unknown type: osd-zfs
08:17:45:LustreError: 2944:0:(obd_config.c:374:class_attach()) Cannot create device lustre-MDT0000-osd of type osd-zfs : -19
08:17:45:LustreError: 2944:0:(obd_mount.c:196:lustre_start_simple()) lustre-MDT0000-osd attach error -19
08:17:45:LustreError: 2944:0:(obd_mount_server.c:1664:server_fill_super()) Unable to start osd on lustre-mdt1/mdt1: -19
08:17:45:LustreError: 2944:0:(obd_mount.c:1264:lustre_fill_super()) Unable to mount (-19) |
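Every failing call in the log above reports the same code, -19. As a reading aid only, here is a minimal Python sketch that pulls the trailing error codes out of such log lines and maps them to their errno names; on Linux, 19 is ENODEV, which matches the "unknown type: osd-zfs" message. The function name `decode_errors` and the assumption that the code always sits at the end of the line are illustrative, not anything Lustre or Maloo provides.

```python
import errno
import os
import re

# Sample lines copied from the console log in the description.
LOG = """\
08:17:45:LustreError: 158-c: Can't load module 'osd-zfs'
08:17:45:LustreError: 2944:0:(obd_config.c:374:class_attach()) Cannot create device lustre-MDT0000-osd of type osd-zfs : -19
08:17:45:LustreError: 2944:0:(obd_mount.c:1264:lustre_fill_super()) Unable to mount (-19)
"""


def decode_errors(log_text):
    """Return sorted (code, errno name, message) tuples found in the log.

    Assumption: the error code is a negative integer at the end of the
    line, optionally followed by a closing parenthesis, as in the sample.
    """
    codes = set()
    for line in log_text.splitlines():
        match = re.search(r"(-\d+)\)?\s*$", line)
        if match:
            codes.add(-int(match.group(1)))
    return [
        (code, errno.errorcode.get(code, "UNKNOWN"), os.strerror(code))
        for code in sorted(codes)
    ]


if __name__ == "__main__":
    for code, name, message in decode_errors(LOG):
        print(f"-{code}: {name} ({message})")
```

Lines without a trailing code, such as the "Can't load module" line, are simply skipped.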
| Comments |
| Comment by Peter Jones [ 08/Apr/13 ] |
|
Lai, could you please comment? Thanks, Peter |
| Comment by Li Wei (Inactive) [ 08/Apr/13 ] |
|
I spent some time provisioning the build manually on Toro. After "yum install"ing the lustre-osd-zfs RPM and the ZFS RPMs, the MDT could be mounted successfully. This suggests something might be wrong with Autotest/Toro. |
| Comment by Jodi Levi (Inactive) [ 08/Apr/13 ] |
|
Duplicate of |
| Comment by Peter Jones [ 08/Apr/13 ] |
|
Seems to only hit rarely but we'd still like to understand why this happens |
| Comment by Chris Gearing (Inactive) [ 08/Apr/13 ] |
|
I will add extra debug to Autotest, which will require a restart. This issue must be intermittent, because zfs testing does run. https://maloo.whamcloud.com/test_sessions?utf8=%E2%9C%93&test_group=review-zfs&commit=Apply+Filter |
| Comment by Nathaniel Clark [ 09/Apr/13 ] |
|
I think the https://maloo.whamcloud.com/test_sessions/3ef2158c-9f9f-11e2-9f27-52540035b04c failure is due to the patch being tested. |
| Comment by Li Wei (Inactive) [ 12/Apr/13 ] |
|
Nathaniel, could you explain how the patch caused the failure, please? I went through the patch again, but did not figure out why it would cause such failures. |
| Comment by Nathaniel Clark [ 16/Apr/13 ] |
|
Li, I am wildly incorrect. I must have been looking at a different patch. My comment from the 9th is wrong. Sorry for the confusion. |
| Comment by Li Wei (Inactive) [ 17/Apr/13 ] |
|
No problem. Ironically, my http://review.whamcloud.com/5785 has been hitting this issue every time recently, although I can't see any fault in the patch itself. Probably there really is a problem in the patch... |
| Comment by Li Wei (Inactive) [ 17/Apr/13 ] |
|
Ah, there are also a considerable number of failures when testing other patches. |
| Comment by Zhenyu Xu [ 17/Apr/13 ] |
|
another hit at https://maloo.whamcloud.com/test_sets/d170d43c-a767-11e2-b3cc-52540035b04c |
| Comment by Jian Yu [ 06/May/13 ] |
|
Another instance: https://maloo.whamcloud.com/test_sets/2352a39e-b658-11e2-bf90-52540035b04c |
| Comment by Jian Yu [ 07/May/13 ] |
|
This is blocking the patch review testing on zfs: |
| Comment by Jian Yu [ 08/May/13 ] |
|
Another one: https://maloo.whamcloud.com/test_sets/cd712682-b72c-11e2-bd0f-52540035b04c |
| Comment by Nathaniel Clark [ 09/May/13 ] |
|
The RPMs seem to install and run cleanly, so this must be an install issue of some sort. Possibly a depmod issue? |
| Comment by Nathaniel Clark [ 09/May/13 ] |
|
I can reproduce these exact symptoms by installing all of lustre except lustre-osd-zfs normally, then installing lustre-osd-zfs with --noscripts (causing it to skip running depmod). lustre-osd-zfs may skip depmod if the kernel isn't installed yet; I am still trying to figure out this exact vector. |
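One way to test the depmod hypothesis on a node would be to check whether the kernel's modules.dep file has an entry for the osd-zfs module at all. The Python sketch below does that for modules.dep text passed in as a string. The helper name `module_in_modules_dep` and the sample paths are hypothetical; the only facts relied on are the `path: dependencies` layout of modules.dep and that modprobe treats `-` and `_` in module names as equivalent.

```python
import os


def module_in_modules_dep(modules_dep_text, module_name):
    """Return True if the modules.dep text has an entry for module_name."""
    # modprobe treats '-' and '_' as equivalent, so normalise both sides.
    wanted = module_name.replace("-", "_")
    for line in modules_dep_text.splitlines():
        path, sep, _deps = line.partition(":")
        if not sep:
            continue  # not a "path: deps" entry
        base = os.path.basename(path.strip())
        # Drop ".ko" and any compression suffix such as ".ko.gz" or ".ko.xz".
        name = base.split(".ko")[0]
        if name.replace("-", "_") == wanted:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical modules.dep content: depmod was not rerun after the
    # osd-zfs module was installed, so only osd_ldiskfs is listed.
    stale = "extra/kernel/fs/lustre/osd_ldiskfs.ko: extra/kernel/fs/lustre/obdclass.ko\n"
    print(module_in_modules_dep(stale, "osd-zfs"))
```

On a real node the input would be the contents of /lib/modules/<kernel version>/modules.dep. A missing entry is consistent with depmod not having run, but it is equally consistent with the module package not being installed at all.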
| Comment by Nathaniel Clark [ 09/May/13 ] |
|
If it's a module dependency ordering issue, it may be solved by http://review.whamcloud.com/6259 which fixes up lustre-osd-* / lustre-modules rpm dependencies. |
| Comment by Nathaniel Clark [ 14/May/13 ] |
|
The above linked patch has landed for |
| Comment by Jian Yu [ 15/May/13 ] |
|
Lustre Branch: master The issue in this ticket still occurred: |
| Comment by Jian Yu [ 18/May/13 ] |
|
Lustre Tag: v2_4_0_RC1 Hit the issue again: |
| Comment by Nathaniel Clark [ 24/May/13 ] |
+ yum install -y kernel-2.6.32-358.6.1.el6_lustre.x86_64 lustre-ldiskfs lustre-modules lustre lustre-tests
Loaded plugins: fastestmirror, security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:2.6.32-358.6.1.el6_lustre will be installed
---> Package lustre.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
--> Processing Dependency: lustre-osd for package: lustre-2.4.50-2.6.32_358.6.1.el6_lustre.x86_64.x86_64
--> Processing Dependency: libnetsnmpmibs.so.20()(64bit) for package: lustre-2.4.50-2.6.32_358.6.1.el6_lustre.x86_64.x86_64
--> Processing Dependency: libnetsnmphelpers.so.20()(64bit) for package: lustre-2.4.50-2.6.32_358.6.1.el6_lustre.x86_64.x86_64
--> Processing Dependency: libnetsnmpagent.so.20()(64bit) for package: lustre-2.4.50-2.6.32_358.6.1.el6_lustre.x86_64.x86_64
--> Processing Dependency: libnetsnmp.so.20()(64bit) for package: lustre-2.4.50-2.6.32_358.6.1.el6_lustre.x86_64.x86_64
---> Package lustre-ldiskfs.x86_64 0:4.1.0-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
---> Package lustre-modules.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
---> Package lustre-tests.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
--> Running transaction check
---> Package lustre-osd-ldiskfs.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
---> Package net-snmp-libs.x86_64 1:5.5-44.el6_4.1 will be installed
--> Processing Dependency: libsensors.so.4()(64bit) for package: 1:net-snmp-libs-5.5-44.el6_4.1.x86_64
--> Running transaction check
---> Package lm_sensors-libs.x86_64 0:3.1.1-17.el6 will be installed
--> Finished Dependency Resolution

The problem is that lustre requires lustre-osd, and yum picks lustre-osd-ldiskfs to satisfy it, so lustre-osd-zfs is never installed.

The yum line should explicitly install lustre-osd-ldiskfs and lustre-osd-zfs, or we could add a virtual package lustre-all that requires both osd packages. |
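The gap described above can be detected mechanically from the resolver output. The following is a minimal Python sketch, not part of Autotest, that collects the package names yum reports as "will be installed" and lists which of the two osd packages are absent. The names `packages_to_install` and `missing_osd_packages` are hypothetical, and the regular expression assumes the `---> Package name.arch version will be installed` line format seen in this transcript.

```python
import re

# Excerpt of the yum transcript quoted in the comment.
YUM_OUTPUT = """\
---> Package lustre.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
--> Processing Dependency: lustre-osd for package: lustre-2.4.50-2.6.32_358.6.1.el6_lustre.x86_64.x86_64
---> Package lustre-modules.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
---> Package lustre-osd-ldiskfs.x86_64 0:2.4.50-2.6.32_358.6.1.el6_lustre.x86_64 will be installed
"""

# Captures the package name (group 1) and architecture (group 2).
PACKAGE_LINE = re.compile(r"^---> Package (\S+)\.(\w+) \S+ will be installed$")


def packages_to_install(yum_output):
    """Return the set of package names yum says it will install."""
    names = set()
    for line in yum_output.splitlines():
        match = PACKAGE_LINE.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def missing_osd_packages(yum_output,
                         required=("lustre-osd-ldiskfs", "lustre-osd-zfs")):
    """Return the required osd packages that yum did not schedule."""
    scheduled = packages_to_install(yum_output)
    return [name for name in required if name not in scheduled]


if __name__ == "__main__":
    print(missing_osd_packages(YUM_OUTPUT))
```

A provisioning script could run a check like this after dependency resolution and fail early, rather than letting the mount fail later with -19.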
| Comment by Nathaniel Clark [ 24/May/13 ] |
|
zfs and ancillary packages are installed after kickstart, but lustre-osd-zfs isn't one of them. |
| Comment by Nathaniel Clark [ 18/Jul/13 ] |
|
I believe this is fixed now. There is still some bug where lustre-initialization-1 fails, but it does not appear to be failing on loading osd-zfs. |