LU-8967: directory entries for nonexistent files

    Description

      We have several directories that contain entries for nonexistent files: the name appears in the directory listing, but ls cannot stat it. For example:

      [root@quartz2311:~]# ls -l /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0
      ls: cannot access /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0/filler.003: No such file or directory
      total 3154
      -rw------- 1 casses1 casses1 1048576 Dec 21 16:43 filler.000
      -rw------- 1 casses1 casses1 1048576 Dec 21 16:43 filler.001
      -rw------- 1 casses1 casses1 1048576 Dec 21 16:43 filler.002
      -????????? ? ?       ?             ?            ? filler.003
      drwx------ 2 casses1 casses1   25600 Dec 21 16:43 ~dmtmp
      

      The directory itself is a remote directory on a single MDT (stripe count 0, MDT index 3):

      [root@quartz2311:~]# lfs getdirstripe -d /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0
      lmv_stripe_count: 0 lmv_stripe_offset: 3
      

      We are still able to get striping information for the missing file:

      [root@quartz2311:~]# lfs getstripe /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0/filler.003
      /p/lscratchh/casses1/quartz-zinc_3/19519/dbench/quartz2322/clients/client0/filler.003
      lmm_stripe_count:   1
      lmm_stripe_size:    1048576
      lmm_pattern:        1
      lmm_layout_gen:     0
      lmm_stripe_offset:  27
              obdidx           objid           objid           group
                  27        20538776      0x1396598      0xcc0000402
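
To relate this output to server-side logs, it can help to turn the obdidx into an OST target name and to confirm that the decimal and hexadecimal objid columns describe the same object. The following is a minimal sketch, not part of any Lustre tooling: the helper names are hypothetical, and the filesystem name "lsh" is taken from the console log lines quoted in this report. It relies only on the usual target naming convention of the filesystem name followed by "-OST" and the index as four hex digits.

```python
def ost_target_name(fsname: str, obdidx: int) -> str:
    """Build an OST target name: <fsname>-OST<index as 4 hex digits>.

    Hypothetical helper; fsname is an input the caller must supply.
    """
    return f"{fsname}-OST{obdidx:04x}"


def same_object(objid_dec: int, objid_hex: str) -> bool:
    """Check that the decimal and hex objid columns agree."""
    return objid_dec == int(objid_hex, 16)


if __name__ == "__main__":
    # Values copied from the lfs getstripe output above.
    print(ost_target_name("lsh", 27))
    print(same_object(20538776, "0x1396598"))
```

With the values from this report, obdidx 27 corresponds to the target lsh-OST001b, and both objid columns refer to object 20538776.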
      

      It looks like the OSS serving that OST (lsh-OST001b, OST index 27) was rebooted and the OST went through recovery around the time the missing file was created. In particular, the file's object, 20538776 in sequence 0xcc0000402, falls within the orphan object ranges that the OST reported deleting between 16:30:56 and 16:40:19:

      [root@zinci:~]# grep 0xcc0000402 /var/log/conman/console.zinc*
      /var/log/conman/console.zinc43:2016-12-21 16:30:56 [189484.767900] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538706 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:33:30 [189639.110247] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:35:41 [189769.704490] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:40:19 [190047.449320] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
      /var/log/conman/console.zinc43:2016-12-21 16:44:45 [190313.751155] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538820 to 0xcc0000402:20541649
      /var/log/conman/console.zinc44:2016-12-21 16:49:27 [  159.838420] Lustre: lsh-OST001b: deleting orphan objects from 0xcc0000402:20538820 to 0xcc0000402:20541649
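
The range check described above can be automated when many files are affected. The sketch below is an illustration only: the function name and the regular expression are hypothetical, written against the message format visible in the console lines quoted here, and treating both ends of each range as inclusive is an assumption.

```python
import re

# Matches console lines such as:
#   ... 2016-12-21 16:33:30 [...] Lustre: lsh-OST001b: deleting orphan
#   objects from 0xcc0000402:20538766 to 0xcc0000402:20541649
ORPHAN_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"Lustre: (?P<ost>\S+): deleting orphan objects from "
    r"(?P<seq>0x[0-9a-f]+):(?P<first>\d+) to (?P=seq):(?P<last>\d+)"
)


def matching_deletions(lines, seq, objid):
    """Return (timestamp, ost) for each logged range that contains objid.

    Assumption: both range bounds are treated as inclusive.
    """
    hits = []
    for line in lines:
        m = ORPHAN_RE.search(line)
        if not m or int(m["seq"], 16) != seq:
            continue
        if int(m["first"]) <= objid <= int(m["last"]):
            hits.append((m["ts"], m["ost"]))
    return hits


# Two of the console lines quoted in this report.
SAMPLE = [
    "/var/log/conman/console.zinc43:2016-12-21 16:33:30 [189639.110247] "
    "Lustre: lsh-OST001b: deleting orphan objects from "
    "0xcc0000402:20538766 to 0xcc0000402:20541649",
    "/var/log/conman/console.zinc43:2016-12-21 16:44:45 [190313.751155] "
    "Lustre: lsh-OST001b: deleting orphan objects from "
    "0xcc0000402:20538820 to 0xcc0000402:20541649",
]

if __name__ == "__main__":
    for ts, ost in matching_deletions(SAMPLE, 0xCC0000402, 20538776):
        print(ts, ost)
```

Of the two sample lines, only the 16:33:30 range contains object 20538776; the 16:44:45 range starts at 20538820, above the object ID.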
      

      I will attach server console logs separately.

      Attachments

        1. LU-8967.console.zinc4.mds
          8 kB
          Ned Bass
        2. LU-8967.console.zinc43
          26 kB
          Ned Bass
        3. LU-8967.console.zinc44
          105 kB
          Ned Bass

            People

              tappro Mikhail Pershin
              nedbass Ned Bass (Inactive)