Lustre / LU-12839

‘lctl pcc add …’ fails with “: Not a directory (20)”


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.13.0
    • 3
    • 9223372036854775807

    Description

      Running the first manual test in the PCC test plan, “Pressure Test via compilebench”, I get an error when adding a PCC backend on a client (trevis-62vm8):

      # lctl pcc add /lustre/scratch/lpcc /dev/vda3 -p "projid={100} rwid=2"
      lctl pcc pcc: error: setting llite.scratch-ffff9383a119e000.pcc="add /dev/vda3 projid={100} rwid=2"
      : Not a directory (20)
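
Errno 20 is ENOTDIR, so one of the two paths passed to `lctl pcc add` was presumably rejected because it is not a directory. The sketch below classifies a path before it is used. It assumes (not confirmed in this report) that the PCC backend path is expected to be a directory on a mounted local filesystem rather than a raw device node such as `/dev/vda3`. `check_pcc_path` is a hypothetical helper, not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helper (not a Lustre tool): report what kind of object a path is.
# Assumption: a PCC backend path should resolve to a directory.
check_pcc_path() {
    path="$1"
    if [ -d "$path" ]; then
        echo "directory"
    elif [ -b "$path" ]; then
        echo "block-device"
    elif [ -e "$path" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Paths taken from the command above.
check_pcc_path /lustre/scratch/lpcc
check_pcc_path /dev/vda3
```

If the second call reports `block-device`, one thing to try under the same assumption is to create a filesystem on the partition, mount it, and pass the mount point instead.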
      

      On the client (vm8) dmesg, I see many of these messages:

      [68485.718305] LustreError: 11-0: scratch-OST0003-osc-ffff9383f9c47800: operation ost_connect to node 10.9.6.176@tcp failed: rc = -30
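
For reference, the two return codes in this report map to standard Linux errno values: 20 is ENOTDIR and 30 is EROFS. The small lookup below is only an illustration covering those two values. `errno_name` is a hypothetical helper name.

```shell
#!/bin/sh
# Map the return codes seen in this report to Linux errno names.
# Only 20 and 30 are covered. Kernel messages print errors as negative
# values (rc = -30), so a leading minus sign is stripped first.
errno_name() {
    code="${1#-}"
    case "$code" in
        20) echo "ENOTDIR (Not a directory)" ;;
        30) echo "EROFS (Read-only file system)" ;;
        *)  echo "unknown" ;;
    esac
}

errno_name -30
errno_name 20
```

A result of EROFS for `ost_connect` would be consistent with the backing filesystem of OST0003 being read-only, which is why the OSS logs are the next place to look.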
      

      Running ‘lfs df’ on the client (vm8) hangs before printing the information for OST0003, yet the same command completes successfully on other clients (vm7).

      Since there seems to be a problem on an OSS, we look at dmesg on the OSS (vm4) and see the following messages:

      [68313.219321] LDISKFS-fs (dm-1): error count since last fsck: 54
      [68313.220003] LDISKFS-fs (dm-1): initial error at time 1570550293: ldiskfs_mb_check_ondisk_bitmap:3719
      [68313.220929] LDISKFS-fs (dm-1): last error at time 1570559969: ldiskfs_lookup:1816: inode 399
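
The LDISKFS messages report their error times as seconds since the epoch. A minimal sketch for converting them to readable UTC timestamps follows, assuming a `date` that accepts the GNU `-d @SECONDS` syntax. `epoch_to_utc` is a hypothetical helper name.

```shell
#!/bin/sh
# Convert epoch seconds from the LDISKFS messages to UTC.
# Assumes a date(1) that supports the GNU "-d @SECONDS" syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1570550293   # initial error
epoch_to_utc 1570559969   # last error
```

Both values fall on 2019-10-08 UTC, earlier than the 1570566464 timestamp that `lhsmtool_posix` prints in the client transcript later in this description. This suggests the ldiskfs errors on dm-1 predate the PCC steps.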
      

      and then many of the following messages:

      [68368.364485] LustreError: 24267:0:(tgt_lastrcvd.c:1027:tgt_client_new()) scratch-OST0003: Failed to write client lcd at idx 3, rc -30
      [68368.365722] LustreError: 24267:0:(tgt_lastrcvd.c:1027:tgt_client_new()) Skipped 477 previous similar messages
      [68630.939863] LustreError: 25860:0:(tgt_grant.c:248:tgt_grant_sanity_check()) ofd_destroy_export: tot_granted 277312 != fo_tot_granted 8715072
      [68630.941251] LustreError: 25860:0:(tgt_grant.c:248:tgt_grant_sanity_check()) Skipped 232 previous similar messages 
      

      What did we do to get here? We set project quotas on the MDTs and OSTs and enabled HSM coordinators on all four MDTs. Then, on the client (vm8), we mounted Lustre and ran the following commands:

      # mount | grep lustre
      10.9.6.172@tcp:/scratch on /lustre/scratch type lustre (rw,flock,user_xattr,lazystatfs)
      10.9.6.172@tcp:/scratch on /lustre/scratch2 type lustre (rw,flock,user_xattr,lazystatfs)
      # lsblk
      NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      vda    253:0    0  120G  0 disk 
      ├─vda1 253:1    0   20G  0 part /
      ├─vda2 253:2    0    2G  0 part [SWAP]
      └─vda3 253:3    0   98G  0 part 
      # lhsmtool_posix --daemon --hsm-root /dev/vda3 --archive=2 /lustre/scratch < /dev/null >
      -bash: syntax error near unexpected token `newline'
      # lhsmtool_posix --daemon --hsm-root /dev/vda3 --archive=2 /lustre/scratch
      lhsmtool_posix: 1570566464.446881 lhsmtool_posix[1256]: action=0 src=(null) dst=(null) mount_point=/lustre/scratch
      # lhsmtool_posix: 1570566464.459653 lhsmtool_posix[1257]: waiting for message from kernel
      
      # ps -aux | grep hsm
      root      1257  0.0  0.0  18500  1464 ?        Ss   20:27   0:00 lhsmtool_posix --daemon --hsm-root /dev/vda3 --archive=2 /lustre/scratch
      # 
      # /tmp/copytool_log 2>&1
      -bash: /tmp/copytool_log: No such file or directory
      # mkdir /lustre/scratch/lpcc
      # lfs project -sp 100 /lustre/scratch/lpcc
      # lctl pcc add /lustre/scratch /dev/vda3 -p "projid={100} rwid=2"
      lctl pcc pcc: error: setting llite.scratch-ffff9383a119e000.pcc="add /dev/vda3 projid={100} rwid=2"
      : Not a directory (20)
      
      

          People

            wc-triage WC Triage
            jamesanunez James Nunez (Inactive)