LU-4953: /proc/fs/lustre/lmv/*/target_obds missing on second mounts


    Description

      If I double-mount a client (for example, with MOUNT_2=y llmount.sh), the second mount point will be missing its /proc/fs/lustre/lmv/*/target_obds directory.

      u:~# umount /mnt/lustre
      u:~# umount /mnt/lustre2
      u:~# mount -o user_xattr,flock u@tcp:/lustre /mnt/lustre -t lustre
      u:~# ls /proc/fs/lustre/lmv/
      lustre-clilmv-ffff8801f6d64e60
      u:~# ls /proc/fs/lustre/lmv/lustre-clilmv-ffff8801f6d64e60/
      activeobd  desc_uuid  md_stats  numobd  placement  target_obd  target_obds  uuid
      u:~# ls -l /proc/fs/lustre/lmv/lustre-clilmv-ffff8801f6d64e60/target_obds/
      total 0
      lrwxrwxrwx 1 root root 48 Apr 24 12:49 lustre-MDT0000-mdc-ffff8801f6d64e60 -> ../../../mdc/lustre-MDT0000-mdc-ffff8801f6d64e60
      u:~# 
      u:~# mount -o user_xattr,flock u@tcp:/lustre /mnt/lustre2 -t lustre
      u:~# ls /proc/fs/lustre/lmv/
      lustre-clilmv-ffff8801f487f778  lustre-clilmv-ffff8801f6d64e60
      u:~# ls /proc/fs/lustre/lmv/lustre-clilmv-ffff8801f487f778
      activeobd  desc_uuid  md_stats  numobd  placement  target_obd  uuid
      u:~# 
      

      I see this when I unmount the second client mount:

      [  263.149071] LustreError: 3790:0:(lmv_obd.c:761:lmv_disconnect()) /proc/fs/lustre/lmv/lustre-clilmv-ffff8802169bc688/target_obds missing
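
      The console message comes from the LMV disconnect path noticing that the per-instance directory it expects to tear down was never created for the second mount. Below is a minimal sketch of that kind of check, using the stock <linux/proc_fs.h> API with hypothetical names (lmv_instance, lmv_instance_disconnect), not the real lmv_obd.c code:

      #include <linux/proc_fs.h>
      #include <linux/printk.h>

      /* Hypothetical stand-in for an LMV instance; not the real Lustre struct. */
      struct lmv_instance {
              const char *name;                   /* e.g. "lustre-clilmv-..." */
              struct proc_dir_entry *proc_root;   /* /proc/fs/lustre/lmv/<name> */
              struct proc_dir_entry *targets_dir; /* .../target_obds, NULL if absent */
      };

      static void lmv_instance_disconnect(struct lmv_instance *inst)
      {
              if (inst->targets_dir) {
                      remove_proc_entry("target_obds", inst->proc_root);
                      inst->targets_dir = NULL;
              } else {
                      /* The situation the console error reports: this mount
                       * never got its own target_obds directory. */
                      pr_err("/proc/fs/lustre/lmv/%s/target_obds missing\n",
                             inst->name);
              }
      }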
      


        Activity


          jlevi Jodi Levi (Inactive) added a comment -

          Patch landed to Master. Flagged for backport to b2_5. Closing ticket for fix in Master. Please reopen if additional work (other than backporting) is needed.

          simmonsja James A Simmons added a comment -

          Patch landed for master. Need to make a patch for b2_5 and possibly b2_4 as well.

          simmonsja James A Simmons added a comment -

          Patch at http://review.whamcloud.com/#/c/10192.

          I take back what I said about upstream. It does handle this correctly. We will still need patches for b2_5 and possibly b2_4, though.
          simmonsja James A Simmons added a comment - edited

          Okay, I figured it out. The problem is the use of procsym itself. I created procsym so that a top-level symlink could point to another top-level proc entry; examples are lod -> lov and osp -> osc. A procsym exists once per object type, whereas the target_obds directories exist for each object instance, so we can't use the procsym field in obd_type. The reason it showed up in lmv and not lov is that lov was not checking whether procsym was already set, which means lov is leaking memory on each module unload. This bug also exists in the upstream Lustre client.

          Please note this problem exists in the 2.4 and 2.5 code bases as well. There we have a single proc_dir_entry in each module even though multiple instances need to be created, so module unloads leak memory there too.
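
          To make the ownership issue concrete, here is a minimal sketch (hypothetical struct and function names, plain <linux/proc_fs.h> calls, not the actual Lustre code) contrasting a single per-type pointer with a per-instance one:

          #include <linux/proc_fs.h>
          #include <linux/errno.h>

          /* Hypothetical stand-ins for Lustre's obd_type / obd_device. */
          struct my_type {
                  /* Broken for target_obds: one pointer shared by every
                   * instance of the type, like the procsym field above. */
                  struct proc_dir_entry *procsym;
          };

          struct my_device {
                  struct my_type *type;
                  struct proc_dir_entry *proc_root;   /* /proc/fs/lustre/lmv/<uuid> */
                  /* Fix: each instance owns its own target_obds entry. */
                  struct proc_dir_entry *targets_dir;
          };

          static int setup_target_obds(struct my_device *dev)
          {
                  /* With the per-type field, the second mount either finds
                   * type->procsym already set and skips creation (the lmv
                   * symptom), or overwrites it and leaks the old entry (the
                   * lov leak). A per-instance field sidesteps both. */
                  dev->targets_dir = proc_mkdir("target_obds", dev->proc_root);
                  return dev->targets_dir ? 0 : -ENOMEM;
          }

          static void cleanup_target_obds(struct my_device *dev)
          {
                  if (dev->targets_dir) {
                          remove_proc_entry("target_obds", dev->proc_root);
                          dev->targets_dir = NULL;
                  }
          }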

          simmonsja James A Simmons added a comment -

          Yep, I can duplicate this problem. Let me look into it.
          jhammond John Hammond added a comment -

          James, can you comment here?

          People

            bogl Bob Glossman (Inactive)
            jhammond John Hammond
