[LU-8725] Rolling Downgrade 2.8.x<->master : FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500 Created: 18/Oct/16 Updated: 11/May/17 Resolved: 11/May/17 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.9.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Saurabh Tandan (Inactive) |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
RHEL 6.7 2.8.x <-> RHEL 7 |
||
| Attachments: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>
Please provide additional information about the failure here.
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/b7dc839c-957d-11e6-bc10-5254006e85c2.
unable to write to /mnt/lustre/d0_runas_test as UID 500
suite_log:
-----============= acceptance-small: sanity ============----- Sat Oct 15 23:27:53 PDT 2016
Running: bash /usr/lib64/lustre/tests/sanity.sh
onyx-23vm7: Checking config lustre mounted on /mnt/lustre
onyx-23vm8: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients onyx-23vm7,onyx-23vm8 environments
Using TIMEOUT=20
disable quota as required
osd-ldiskfs.track_declares_assert=1
osd-ldiskfs.track_declares_assert=1
running as uid/gid/euid/egid 500/500/500/500, groups:
[touch] [/mnt/lustre/d0_runas_test/f11686]
touch: cannot touch `/mnt/lustre/d0_runas_test/f11686': Permission denied
sanity : @@@@@@ FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500.
Please set RUNAS_ID to some UID which exists on MDS and client or
add user 500:500 on these nodes.
Steps followed: After the test failed, I tried unmounting the file system once and remounting it on all nodes as well.
MDS log when mounted:
[root@onyx-25 ~]# mount -t lustre -o acl,user_xattr /dev/sdb1 /mnt/mds0
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 1024 to 16384
LNet: HW CPU cores: 32, npartitions: 4
alg: No test for adler32 (adler32-zlib)
alg: No test for crc32 (crc32-table)
alg: No test for crc32 (crc32-pclmul)
Lustre: Lustre: Build Version: jenkins-arch=x86_64,build_type=server,distro=el6.7,ib_stack=inkernel-39-gf08239d-PRISTINE-2.6.32-573.26.1.el6_lustre.g948c890.x86_64
LNet: Added LNI 10.2.4.47@tcp [8/256/0/180]
LNet: Accept secure, port 988
LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: MGS: Connection restored to MGC10.2.4.47@tcp_0 (at 0@lo)
MDS log when sanity.sh was run:
Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Sat Oct 15 23:27:53 PDT 2016
Lustre: DEBUG MARKER: Using TIMEOUT=20
LustreError: 20325:0:(mdt_identity.c:135:mdt_identity_do_upcall()) lustre-MDT0000: error invoking upcall /sbin/l_getidentity lustre-MDT0000 0: rc -2; check /proc/fs/lustre/mdt/lustre-MDT0000/identity_upcall, time 117us
LustreError: 20325:0:(mdt_identity.c:135:mdt_identity_do_upcall()) lustre-MDT0000: error invoking upcall /sbin/l_getidentity lustre-MDT0000 0: rc -2; check /proc/fs/lustre/mdt/lustre-MDT0000/identity_upcall, time 220us
LustreError: 20325:0:(mdt_identity.c:135:mdt_identity_do_upcall()) Skipped 4 previous similar messages
Lustre: DEBUG MARKER: sanity : @@@@@@ FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500.
OSS log when sanity.sh was run:
[root@onyx-26 ~]#
[118430.582487] Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Sat Oct 15 23:27:53 PDT 2016
[118433.837028] Lustre: DEBUG MARKER: Using TIMEOUT=20
[118436.286200] Lustre: DEBUG MARKER: sanity : @@@@@@ FAIL: unable to write to /mnt/lustre/d0_runas_test as UID 500. |
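The failure message above asks for a RUNAS_ID that exists on both the MDS and the clients. A minimal sketch of that check, to be run on each node, could look like this. The helper name `uid_exists` is hypothetical and not part of the Lustre test framework, and the fallback to /etc/passwd is an assumption for nodes where getent is unavailable.

```shell
#!/bin/bash
# Hypothetical helper (not part of the Lustre test framework):
# succeed if the numeric UID resolves to a passwd entry on this node.
uid_exists() {
    local uid=$1
    # Reject anything that is not a plain non-negative integer.
    case "$uid" in
        ''|*[!0-9]*) return 2 ;;
    esac
    if command -v getent > /dev/null 2>&1; then
        # getent consults every configured NSS source (files, LDAP, ...).
        getent passwd "$uid" > /dev/null
    else
        # Assumed fallback: local accounts only.
        awk -F: -v u="$uid" '$3 == u { found = 1 } END { exit !found }' /etc/passwd
    fi
}
```

A call such as `uid_exists "${RUNAS_ID:-500}" || echo "UID missing on $(hostname)"` reports nodes where the account is absent. In this ticket the MDS log points at the identity upcall rather than a missing account, so a passing check does not rule out the failure.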
| Comments |
| Comment by Joseph Gmitter (Inactive) [ 19/Oct/16 ] |
|
Hi Saurabh, can you try the changes to the test script for l_getidentity that we discussed on the QE call? Thanks. |
| Comment by Andreas Dilger [ 19/Oct/16 ] |
|
It looks like this is a problem with identity_upcall being set explicitly by the test framework when the MDT is formatted, which doesn't happen for normal Lustre configurations, but is needed when developers are running directly out of the build tree:
export L_GETIDENTITY=${L_GETIDENTITY:-"$LUSTRE/utils/l_getidentity"}
if [ ! -f "$L_GETIDENTITY" ]; then
    if `which l_getidentity > /dev/null 2>&1`; then
        export L_GETIDENTITY=$(which l_getidentity)
    else
        export L_GETIDENTITY=NONE
    fi
fi
opts+=${L_GETIDENTITY:+" --param=mdt.identity_upcall=$L_GETIDENTITY"}
|
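Because the framework stores a fixed path at format time, it can help to check whether the configured upcall still exists on the MDS. The following is a minimal sketch under stated assumptions: `check_upcall` is a hypothetical helper, not a framework function, and it treats NONE (the value the snippet above falls back to) as a deliberately disabled upcall.

```shell
#!/bin/bash
# Hypothetical helper: classify an identity_upcall value.
# Prints "disabled", "ok", or "missing"; returns non-zero for "missing".
check_upcall() {
    local path=$1
    if [ "$path" = "NONE" ]; then
        echo disabled
    elif [ -f "$path" ] && [ -x "$path" ]; then
        echo ok
    else
        echo missing
        return 1
    fi
}
```

On the MDS the value could be fed in with `check_upcall "$(lctl get_param -n mdt.lustre-MDT0000.identity_upcall)"`. A "missing" result would be consistent with the `rc -2` (ENOENT) reported by mdt_identity_do_upcall() in the MDS log.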
| Comment by Andreas Dilger [ 19/Oct/16 ] |
|
I believe that on earlier distros (RHEL6-) the upcall is /usr/sbin/l_getidentity, but on newer distros (RHEL7+) it is /sbin/l_getidentity, presumably because of distro packaging changes. If this path is explicitly stored in the configuration log, it will be incorrect after a downgrade. That said, I don't see where it gets set to /sbin/l_getidentity on RHEL7 installs except by the test framework, so there may still be a problem for normal usage? |
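If the stored path does differ between distros, one way to recover after a downgrade is to probe the candidate locations on the MDS and pick whichever exists. This is only a sketch: `find_upcall` is a hypothetical helper, and the two candidate paths are the ones named in the comment above, not a verified list.

```shell
#!/bin/bash
# Hypothetical helper: print the first candidate path that is an
# executable regular file, or NONE if no candidate qualifies.
find_upcall() {
    local candidate
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo NONE
    return 1
}
```

For example, `lctl set_param mdt.lustre-MDT0000.identity_upcall="$(find_upcall /usr/sbin/l_getidentity /sbin/l_getidentity)"` would point the running MDT at the binary that is actually installed. `lctl set_param` changes only the running value; whether the value stored in the configuration log also needs updating is not settled in this ticket.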
| Comment by Saurabh Tandan (Inactive) [ 11/May/17 ] |
|
Cannot reproduce, hence closing the ticket. |