Details
Type: Bug
Resolution: Cannot Reproduce
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.9.0
Labels: None
Environment: RHEL 6.7 2.8.x <-> RHEL 7, master build# 3456
Severity: 3
Description
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>
Please provide additional information about the failure here.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/b7dc839c-957d-11e6-bc10-5254006e85c2.
The sanity test suite failed after downgrading the MDS, with the message:
unable to write to /mnt/lustre/d0_runas_test as UID 500
suite_log:
-----============= acceptance-small: sanity ============----- Sat Oct 15 23:27:53 PDT 2016
Running: bash /usr/lib64/lustre/tests/sanity.sh
onyx-23vm7: Checking config lustre mounted on /mnt/lustre
onyx-23vm8: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients onyx-23vm7,onyx-23vm8 environments
Using TIMEOUT=20
disable quota as required
osd-ldiskfs.track_declares_assert=1
osd-ldiskfs.track_declares_assert=1
running as uid/gid/euid/egid 500/500/500/500, groups:
[touch] [/mnt/lustre/d0_runas_test/f11686]
touch: cannot touch `/mnt/lustre/d0_runas_test/f11686': Permission denied
sanity : @@@@@@ FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500.
Please set RUNAS_ID to some UID which exists on MDS and client or
add user 500:500 on these nodes.
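The fix the log itself suggests is straightforward. A minimal sketch, assuming the only problem is that UID/GID 500 does not exist on the MDS (the user name "runas500" and the UID 1000 in the alternative are placeholders, not values from this run):

# On the MDS (and any node missing it), create a user and group with the expected IDs;
# only the numeric IDs matter to sanity.sh, the name is arbitrary.
groupadd -g 500 runas500
useradd -u 500 -g 500 runas500

# Alternatively, point the test framework at a UID that already exists on MDS and clients,
# e.g. an existing UID 1000:
export RUNAS_ID=1000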
Steps Followed:
1. Set up a Lustre file system with the old version, 2.8.x.
2. Upgraded the OSS and ran sanity.sh.
3. Upgraded the MDS and ran sanity.sh.
4. Upgraded the clients and ran sanity.sh.
5. Downgraded the clients and ran sanity.sh.
6. Downgraded the MDS (this required the extra step of remounting with abort_recovery and unmounting with '-f' again; a sketch of the commands is included below). The file system mounted with no issues, but when sanity.sh was run from a client, the error message above appeared.
Since the test failed, I also tried unmounting the file system once and remounting it on all nodes.
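For reference, the forced unmount and recovery-skipping remount from step 6 were roughly of the following form. This is only a sketch: the device and mount point are taken from the MDS mount log below, abort_recov is the mount.lustre option corresponding to "abort_recovery", and the exact options actually used were not recorded.

umount -f /mnt/mds0                                                # forced unmount of the MDT
mount -t lustre -o abort_recov,acl,user_xattr /dev/sdb1 /mnt/mds0  # remount, skipping client recovery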
MDS log when mounted:
[root@onyx-25 ~]# mount -t lustre -o acl,user_xattr /dev/sdb1 /mnt/mds0
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 1024 to 16384
LNet: HW CPU cores: 32, npartitions: 4
alg: No test for adler32 (adler32-zlib)
alg: No test for crc32 (crc32-table)
alg: No test for crc32 (crc32-pclmul)
Lustre: Lustre: Build Version: jenkins-arch=x86_64,build_type=server,distro=el6.7,ib_stack=inkernel-39-gf08239d-PRISTINE-2.6.32-573.26.1.el6_lustre.g948c890.x86_64
LNet: Added LNI 10.2.4.47@tcp [8/256/0/180]
LNet: Accept secure, port 988
LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: MGS: Connection restored to MGC10.2.4.47@tcp_0 (at 0@lo)
MDS log when sanity.sh was run:
Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Sat Oct 15 23:27:53 PDT 2016
Lustre: DEBUG MARKER: Using TIMEOUT=20
LustreError: 20325:0:(mdt_identity.c:135:mdt_identity_do_upcall()) lustre-MDT0000: error invoking upcall /sbin/l_getidentity lustre-MDT0000 0: rc -2; check /proc/fs/lustre/mdt/lustre-MDT0000/identity_upcall, time 117us
LustreError: 20325:0:(mdt_identity.c:135:mdt_identity_do_upcall()) lustre-MDT0000: error invoking upcall /sbin/l_getidentity lustre-MDT0000 0: rc -2; check /proc/fs/lustre/mdt/lustre-MDT0000/identity_upcall, time 220us
LustreError: 20325:0:(mdt_identity.c:135:mdt_identity_do_upcall()) Skipped 4 previous similar messages
Lustre: DEBUG MARKER: sanity : @@@@@@ FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500.
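The identity upcall failures above (rc -2, i.e. ENOENT) suggest that /sbin/l_getidentity was missing or not usable on the downgraded MDS, which is consistent with the Permission denied seen for UID 500. A minimal diagnostic sketch on the MDS, assuming the MDT name lustre-MDT0000 from the log:

# Which upcall is the MDT configured to run?
lctl get_param mdt.lustre-MDT0000.identity_upcall
# Is the helper actually present after the downgrade?
ls -l /sbin/l_getidentity
# Test-only workaround: disable the identity upcall entirely.
lctl set_param mdt.lustre-MDT0000.identity_upcall=NONE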
OSS log when sanity.sh was run:
[root@onyx-26 ~]#
[118430.582487] Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Sat Oct 15 23:27:53 PDT 2016
[118433.837028] Lustre: DEBUG MARKER: Using TIMEOUT=20
[118436.286200] Lustre: DEBUG MARKER: sanity : @@@@@@ FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500.