[LU-6707] EL7 client cannot find loop device for posix test Created: 31/Jan/15 Updated: 09/May/18 Resolved: 16/Mar/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0, Lustre 2.10.0 |
| Fix Version/s: | Lustre 2.10.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Sarah Liu | Assignee: | Jian Yu |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | triage | ||
| Environment: |
server: lustre-master #2770 RHEL6 |
||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 17309 |
| Description |
|
https://testing.hpdd.intel.com/test_sets/209ce6aa-7ee3-11e4-ab67-5254006e85c2 posix test_1: @@@@@@ FAIL: /dev/loop/1 and /dev/loop1 gone? |
| Comments |
| Comment by Jian Yu [ 01/May/15 ] |
|
Lustre build: https://build.hpdd.intel.com/job/lustre-b_ieel2_0/176/
The same failure occurred. On a RHEL 7.1 client:
# ls /dev/loop*
/dev/loop-control
There is no loop device. |
| Comment by Minh Diep [ 01/May/15 ] |
|
This needs to be investigated by testing manually. I can't find anything that points to TEI. |
| Comment by Jodi Levi (Inactive) [ 04/May/15 ] |
|
Sarah, |
| Comment by Sarah Liu [ 04/May/15 ] |
|
Jodi, Sure, I will do it today and update the ticket when I have some results |
| Comment by Sarah Liu [ 08/May/15 ] |
|
Right after provisioning the EL7 client, before anything has been run, there is no loop device under /dev. I will investigate further.
[root@eagle-39vm3 ~]# ls /dev|grep loop
loop-control
[root@eagle-39vm3 ~]# rpm -a|grep lustre
[root@eagle-39vm3 ~]# rpm -qa|grep lustre
lustre-client-modules-2.7.52-3.10.0_229.1.2.el7.x86_64_gbd07c02.x86_64
lustre-iokit-2.7.52-3.10.0_229.1.2.el7.x86_64_gbd07c02.x86_64
lustre-client-2.7.52-3.10.0_229.1.2.el7.x86_64_gbd07c02.x86_64
lustre-client-tests-2.7.52-3.10.0_229.1.2.el7.x86_64_gbd07c02.x86_64
[root@eagle-39vm3 ~]# uname -a
Linux eagle-39vm3.eagle.hpdd.intel.com 3.10.0-229.1.2.el7.x86_64 #1 SMP Fri Mar 27 03:04:26 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@eagle-39vm3 ~]#
This is what I got from EL6:
[root@eagle-39vm1 ~]# uname -a
Linux eagle-39vm1.eagle.hpdd.intel.com 2.6.32-504.16.2.el6_lustre.gec772b8.x86_64 #1 SMP Thu Apr 30 21:09:20 PDT 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@eagle-39vm1 ~]# ls /dev|grep loop
loop0
loop1
loop2
loop3
loop4
loop5
loop6
loop7 |
| Comment by Sarah Liu [ 09/May/15 ] |
|
On EL7, no loop device is set up during system initialization, and I cannot find the setup commands in /etc/rc.d/init.d/functions:
[root@eagle-39vm3 init.d]# pwd
/etc/rc.d/init.d
[root@eagle-39vm3 init.d]# grep -r "losetup" .
[root@eagle-39vm3 init.d]# uname -a
Linux eagle-39vm3.eagle.hpdd.intel.com 3.10.0-229.1.2.el7.x86_64 #1 SMP Fri Mar 27 03:04:26 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@eagle-39vm3 init.d]# rpm -qf functions
initscripts-9.49.17-1.el7_0.1.x86_64
[root@eagle-39vm3 init.d]#
On EL6, by contrast, /etc/rc.d/init.d/functions has the following commands:
[root@eagle-39vm1 init.d]# grep -r "losetup" .
./functions: losetup $dev > /dev/null 2>&1 && \
./functions: losetup -d $dev
[root@eagle-39vm1 init.d]# rpm -qf functions
initscripts-9.03.46-1.el6.centos.1.x86_64
[root@eagle-39vm1 init.d]# pwd
/etc/rc.d/init.d
[root@eagle-39vm1 init.d]#
I think this is the reason why EL7 doesn't have loop devices set up after booting. Maybe our provisioning script should handle this. |
| Comment by Minh Diep [ 11/May/15 ] |
|
This needs more investigation, but if EL7 doesn't create the loop devices, perhaps the POSIX test suite needs to do that. I wonder if we are missing any packages on EL7. |
| Comment by Sarah Liu [ 11/May/15 ] |
|
I have verified that a loop device can be created manually with losetup. I used the same provisioning command for EL6 and EL7. |
| Comment by Jian Yu [ 12/May/15 ] |
|
The same issue is on SLES12 client: |
| Comment by Minh Diep [ 15/May/15 ] |
|
So it looks like, starting with the new kernel, the default loop devices are not there. This means we have to create them from the POSIX test suite (Lustre QE), so this ticket should be owned by the Lustre QE group. |
| Comment by Andrea Garcia (Inactive) [ 18/May/15 ] |
|
Yu Jian will update this ticket with what needs to be done (I think it is the POSIX test suite that needs to be modified/updated). |
| Comment by Jian Yu [ 18/May/15 ] |
|
Hi Minh, |
| Comment by Minh Diep [ 18/May/15 ] |
|
Hi Yujian, I think the test suite has to take care of creating what's needed to run the test (and ideally removing it afterward). Creating sanityusr... was not the right way to begin with. |
| Comment by Jian Yu [ 18/May/15 ] |
|
OK, so the changes will be adding functions into test-framework.sh to check, create and remove loop devices. And in lustre/tests/posix.sh, make setup_loop_dev() and cleanup_loop_dev() call those functions. |
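The plan above could be sketched roughly as follows. This is a minimal illustration, not the patch that eventually landed: the function names `loop_dev_exists`, `create_loop_dev` and `release_loop_dev` are hypothetical, and the sketch assumes the conventional Linux numbering in which loop devices are block devices with major number 7 and, when loop partitions are not enabled, a minor number equal to the device index.

```shell
#!/bin/sh
# Hypothetical helper names for illustration; not taken from test-framework.sh.

# Succeed if $1 exists and is a block device.
loop_dev_exists() {
	[ -b "$1" ]
}

# Create the device node for loop index $1 in directory $2 (default /dev)
# if it is not there yet. Assumes block major 7 and minor == index.
# mknod needs root privileges.
create_loop_dev() {
	cld_idx=$1
	cld_dir=${2:-/dev}
	loop_dev_exists "$cld_dir/loop$cld_idx" && return 0
	mknod "$cld_dir/loop$cld_idx" b 7 "$cld_idx"
}

# Detach any backing file from loop device $1. The device node itself is
# left in place; a missing node or an unattached device is not an error.
release_loop_dev() {
	loop_dev_exists "$1" || return 0
	losetup -d "$1" 2>/dev/null || true
}
```

In this sketch, setup_loop_dev() would call `create_loop_dev` before attaching its backing file, and cleanup_loop_dev() would call `release_loop_dev` once the file system is unmounted.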
| Comment by Sarah Liu [ 03/Jun/15 ] |
|
It turns out that if the loop module is loaded with the option max_loop=8, the system will have loop devices:
[root@eagle-54vm1 modprobe.d]# modprobe loop max_loop=8
[ 307.399441] loop: module loaded
[root@eagle-54vm1 modprobe.d]# ls /dev/loop*
/dev/loop0 /dev/loop2 /dev/loop4 /dev/loop6 /dev/loop-control
/dev/loop1 /dev/loop3 /dev/loop5 /dev/loop7 |
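To keep the option across reboots and automatic module loads, it could be recorded in a modprobe configuration file. The sketch below is an assumption about how a provisioning script might do that, not something taken from this ticket: the helper name `write_loop_options` and the default file name /etc/modprobe.d/loop.conf are illustrative, and the setting only matters when loop is built as a module, as it is in the session above.

```shell
#!/bin/sh
# Hypothetical helper: record "max_loop" for the loop module.
#   $1  number of loop devices to pre-create (default 8)
#   $2  configuration file to write (default is an illustrative name)
write_loop_options() {
	wlo_count=${1:-8}
	wlo_conf=${2:-/etc/modprobe.d/loop.conf}
	# "options <module> <param>=<value>" is the modprobe.d syntax for
	# passing parameters whenever the module is loaded.
	printf 'options loop max_loop=%s\n' "$wlo_count" > "$wlo_conf"
}
```

After writing the file as root, the module would need to be reloaded (for example `modprobe -r loop` followed by `modprobe loop`) for the devices to appear.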
| Comment by Gerrit Updater [ 03/Jun/15 ] |
|
Wei Liu (wei3.liu@intel.com) uploaded a new patch: http://review.whamcloud.com/15130 |
| Comment by Sarah Liu [ 18/Jun/15 ] |
|
This issue is blocked by TEI-3627: the POSIX test needs stropts.h and xtitypes.h, which are missing from the current EL7 build. |
| Comment by Yang Sheng [ 26/Jun/15 ] |
|
RHEL7 uses /dev/loop-control to manage loop devices, so no /dev/loopXX devices are created in advance. But you can still use losetup to manage loop devices. |
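For example, `losetup --find --show` asks for the first unused loop device, attaches the backing file to it and prints the device name, so a script does not have to assume that /dev/loop0 already exists. This is a minimal sketch, assuming a util-linux losetup that supports these options and root privileges; the helper name `attach_loop_file` and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: attach backing file $1 to the first free loop
# device and print the device name chosen by losetup.
attach_loop_file() {
	alf_file=$1
	[ -f "$alf_file" ] || { echo "no such file: $alf_file" >&2; return 1; }
	# --find picks an unused loop device; --show prints its name.
	losetup --find --show "$alf_file"
}

# Usage (as root), with illustrative paths:
#   dd if=/dev/zero of=/tmp/posix.img bs=1M count=64
#   dev=$(attach_loop_file /tmp/posix.img)
#   losetup -d "$dev"    # detach when finished
```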
| Comment by Jian Yu [ 26/Aug/15 ] |
|
Here is the test report of POSIX compliance testing on RHEL 7.1 client and server:
FAILURE SUMMARY:
POSIX failures: 6
Test Name    Baseline     Lustre Report
access.43    Succeeded    Unresolved
chmod.18     Succeeded    Unresolved
chown.18     Succeeded    Unresolved
creat.28     Succeeded    Unresolved
creat.30     Succeeded    Unresolved
link.23      Succeeded    Unresolved
FAILURE DESCRIPTIONS:
####################################################
Test Name: access.43 Unresolved
Test Description: If the implementation supports a read-only file system, EROFS in errno and a return value of -1 on a call to access(path, amode) when write access is requested for a file on a read-only file system.
Posix Ref: Component ACCESS Assertion 5.6.3.4-48(C)
Test Information: deletion reason: mnt_ro(/dev/loop0, access-d.43) failed
####################################################
Test Name: chmod.18 Unresolved
Test Description: If the implementation supports a read-only file system, EROFS in errno and a return value of -1 on a call to chmod(path, mode) when the named file resides on a read-only file system. No change to the file mode shall occur.
Posix Ref: Component CHMOD Assertion 5.6.4.4-39(C)
Test Information: deletion reason: mnt_ro(/dev/loop0, chmod-d.18) failed
####################################################
Test Name: chown.18 Unresolved
Test Description: If the implementation supports a read-only file system, EROFS in errno and a return value of -1 on a call to chown(path, owner, group) when the named file resides on a read-only file system. No change shall be made to the owner and group of the file.
Posix Ref: Component CHOWN Assertion 5.6.5.4-40(C)
Test Information: deletion reason: mnt_ro(/dev/loop0, chown-d.18) failed
####################################################
Test Name: creat.28 Unresolved
Test Description: EROFS in errno and a return value of -1 on a call to creat(path, mode) when:
a. the file exists and the named file resides on a read-only file system;
b. the named file is to reside on a read-only file system and the file does not exist.
The time related elements st_ctime and st_mtime field of the parent directory shall not be updated and the file shall not be truncated.
Posix Ref: Component CREAT Assertion 5.3.2.4-55(C)
Posix Ref: Component CREAT Assertion 5.3.2.4-56(C)
Test Information: deletion reason: mnt_ro(/dev/loop0, ./creat-d.28) failed
####################################################
Test Name: creat.30 Unresolved
Test Description: ENOSPC in errno and a return value of -1 on a call to creat(path) when the directory or file system which would contain the new file cannot be extended.
Posix Ref: Component CREAT Assertion 5.3.2.4-53(B)
Test Information: File system not set up correctly for ENOSPC tests
####################################################
Test Name: link.23 Unresolved
Test Description: EROFS in errno and a return value of -1 on a call to link() when the requested link requires writing in a directory on a read-only file system.
Posix Ref: Component LINK Assertion 5.3.4.4-63(C)
Test Information: deletion reason: mnt_ro(/dev/loop0, link-d.23) failed
####################################################
posix test_1: @@@@@@ FAIL: Run POSIX testsuite on /mnt/lustre failed
Except for creat.30, the failures are all related to /dev/loop0.
In addition, the following errors occurred before running the POSIX test against the Lustre filesystem:
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/mkfifo/d.mkfifo/mkfifo-d.17': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/rmdir/d.rmdir/rmdir-d.9': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/mkdir/d.mkdir/mkdir-d.19': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/link/d.link/link-d.25': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/unlink/d.unlink/unlink-d.9': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/open/d.open/open-d.46': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/files/rename/d.rename/rename-d.17': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/ioprim/write/d.write/write-d.16': Device or resource busy
rm: cannot remove '/usr/src/posix/ext4/TESTROOT/tset/POSIX.os/ioprim/write/d.write/write-d.25': Device or resource busy
lsof and fuser output nothing:
# lsof +D /usr/src/posix/ext4/TESTROOT/
# lsof +D /usr/src/posix/ext4/TESTROOT/tset/POSIX.os/ioprim/write/d.write/write-d.16/
# fuser -u /usr/src/posix/ext4/TESTROOT/tset/POSIX.os/ioprim/write/d.write/write-d.16/
# ls -la /usr/src/posix/ext4/TESTROOT/tset/POSIX.os/ioprim/write/d.write/write-d.16/
total 2
drwxr-xr-x 2 root root 1024 Aug 25 17:34 .
drwxrwsrwx 4 vsx0 vsxg0 1024 Aug 25 17:34 ..
Still investigating. |
| Comment by Saurabh Tandan (Inactive) [ 11/Dec/15 ] |
|
master, build# 3264, 2.7.64 tag |
| Comment by Saurabh Tandan (Inactive) [ 16/Dec/15 ] |
|
Server: 2.5.5, b2_5_fe/62 |
| Comment by Saurabh Tandan (Inactive) [ 19/Dec/15 ] |
|
Another instance for EL7.1 Server/EL7.1 Client - DNE |
| Comment by Saurabh Tandan (Inactive) [ 20/Jan/16 ] |
|
Another instance found for interop : 2.5.5 Server/EL7 Client |
| Comment by Saurabh Tandan (Inactive) [ 03/Feb/16 ] |
|
Encountered another instance for tag 2.7.66 for FULL - EL7.1 Server/EL7.1 Client , master , build# 3314. Another instance for FULL - EL7.1 Server/EL7.1 Client - DNE, master, build# 3314 |
| Comment by Saurabh Tandan (Inactive) [ 10/Feb/16 ] |
|
Another instance found for interop tag 2.7.66 - 2.7.1 Server/EL7 Client, build# 3316 Another instance found for interop tag 2.7.66 - 2.5.5 Server/EL7 Client, build# 3316 Another instance found for Full tag 2.7.66 - EL7.1 Server/EL7.1 Client, build# 3314 Another instance found for Full tag 2.7.66 -EL7.1 Server/EL7.1 Client - DNE, build# 3314 |
| Comment by Saurabh Tandan (Inactive) [ 24/Feb/16 ] |
|
Another instance found for interop - EL7 Server/2.7.1 Client, tag 2.7.90. |
| Comment by Sarah Liu [ 10/Feb/17 ] |
|
This ticket is only for the loop device issue; the missing-header problem is being tracked under |
| Comment by Gerrit Updater [ 16/Mar/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/15130/ |
| Comment by Peter Jones [ 16/Mar/17 ] |
|
Landed for 2.10 |
| Comment by Gerrit Updater [ 09/May/17 ] |
|
James Nunez (james.a.nunez@intel.com) uploaded a new patch: https://review.whamcloud.com/27012 |
| Comment by Gerrit Updater [ 24/May/17 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/27012/ |
| Comment by James Casper [ 26/Sep/17 ] |
|
2.10.1 b26 <--> 2.9.0 b22: |