[LU-4123] lfsck: @@@@@@ FAIL: /data/test/output isn't a shared directory Created: 18/Oct/13  Updated: 31/Dec/13  Resolved: 23/Dec/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.1
Fix Version/s: Lustre 2.6.0, Lustre 2.5.1

Type: Bug Priority: Minor
Reporter: Stephen Champion Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: patch
Environment:

Lustre 1.8 -> master
multi rail IB cluster


Severity: 3
Rank (Obsolete): 11126

 Description   

check_write_access() compares the local node name to the host name passed via
xxx_HOST environment variables. This breaks when the two names differ.

My cluster uses the nodename as the host name of the management interface
(ethernet). The Infiniband interface(s) use 'hostname-ibX'. To use an IB
interface for acceptance tests, the IB host name must be used in xxx_HOST
env variables.

I have a patch to use the remote nodename in check_write_access().



 Comments   
Comment by Stephen Champion [ 18/Oct/13 ]

http://review.whamcloud.com/#/c/8009/

Comment by Stephen Champion [ 18/Oct/13 ]

A little more detail - my startup script runs from an NFS exported directory. It specifies hosts by name of an IB interface:

    # grep _HOST= run-acc-accfs.sh
    mgs_HOST=n013-ib1
    mds_HOST=n013-ib1
    mds1_HOST=n013-ib1
    ost1_HOST=n008-ib1
    ost_HOST=n008-ib1
    ost2_HOST=n009-ib1

check_logdir() creates the files:

    touch $dir/check_file.$(hostname -s)

but it must run on each node to actually be useful:

    do_rpc_nodes "$list" check_logdir $dir
    check_write_access $dir "$list" || return 1

This works just fine:

    # ls
    check_file.n008 rpmlist-client.n006 rpmlist-server.n009 shared
    check_file.n009 rpmlist-client.n007 rpmlist-server.n013
    check_file.n013 rpmlist-server.n008 run-acc-accfs.sh

But because 'n013' != 'n013-ib1', the check for shared access fails.

The patch ensures that it will check for the same file name that is created, regardless of host/interface name arrangement.
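
The mismatch can be illustrated with a small standalone sketch. This is not the code from the patch: check_by_host_name, check_by_node_name and remote_short_hostname are hypothetical names, and remote_short_hostname only simulates asking a node for its `hostname -s` by stripping a trailing -ibN suffix, so that the example runs on a single machine. In the real test framework the node name would have to be obtained from the remote node itself.

```shell
#!/bin/bash
# Illustrative sketch only; all function names here are hypothetical.

# Stand-in for running `hostname -s` on a remote node.
# ASSUMPTION: the node name is the xxx_HOST name without its -ibN suffix.
# A real implementation would query the node instead of editing the string.
remote_short_hostname() {
    local host=$1
    echo "${host%-ib[0-9]*}"
}

# Behaviour described in the report: look for check_file.<xxx_HOST name>.
check_by_host_name() {
    local dir=$1 list=$2 host
    for host in ${list//,/ }; do
        [ -f "$dir/check_file.$host" ] || return 1
    done
    return 0
}

# Behaviour the patch aims for: look for check_file.<node name>, which is
# the name the node used when it created the file.
check_by_node_name() {
    local dir=$1 list=$2 host node
    for host in ${list//,/ }; do
        node=$(remote_short_hostname "$host")
        [ -f "$dir/check_file.$node" ] || return 1
    done
    return 0
}
```

With check_file.n013 and check_file.n008 present in a directory, check_by_host_name "$dir" "n013-ib1,n008-ib1" fails because it looks for check_file.n013-ib1, while check_by_node_name succeeds on the same arguments.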

Comment by Jian Yu [ 20/Nov/13 ]

Patch landed on master branch.

Comment by Jian Yu [ 20/Nov/13 ]

Patch was back-ported to Lustre b2_4 branch: http://review.whamcloud.com/8343
