Lustre / LU-13794

Changing comp-flags with lfs setstripe can get stuck when an OST is down

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Severity: 3

    Description

      We are testing an FLR (File Level Redundancy) setup and found that lfs setstripe can get stuck.

      After creating the FLR layout and running lfs mirror resync, the file layout is:

      [root@bss022 test_file_replica]# lfs getstripe -v test-file1 
      test-file1
      composite_header:
        lcm_magic:         0x0BD60BD0
        lcm_size:          272
        lcm_flags:         ro
        lcm_layout_gen:    11
        lcm_mirror_count:  2
        lcm_entry_count:   2
      components:
        - lcme_id:             65537
          lcme_mirror_id:      1
          lcme_flags:          init
          lcme_extent.e_start: 0
          lcme_extent.e_end:   EOF
          lcme_offset:         128
          lcme_size:           72
          sub_layout:
            lmm_magic:         0x0BD30BD0
            lmm_seq:           0xa80000d30
            lmm_object_id:     0x6
            lmm_fid:           [0xa80000d30:0x6:0x0]
            lmm_stripe_count:  1
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 9
            lmm_pool:          primary
            lmm_objects:
            - 0: { l_ost_idx: 9, l_fid: [0x440000419:0x17e2:0x0] }

        - lcme_id:             131074
          lcme_mirror_id:      2
          lcme_flags:          init
          lcme_extent.e_start: 0
          lcme_extent.e_end:   EOF
          lcme_offset:         200
          lcme_size:           72
          sub_layout:
            lmm_magic:         0x0BD30BD0
            lmm_seq:           0xa80000d30
            lmm_object_id:     0x6
            lmm_fid:           [0xa80000d30:0x6:0x0]
            lmm_stripe_count:  1
            lmm_stripe_size:   1048576
            lmm_pattern:       raid0
            lmm_layout_gen:    0
            lmm_stripe_offset: 13
            lmm_pool:          secondary
            lmm_objects:
            - 0: { l_ost_idx: 13, l_fid: [0x2800000408:0x1802:0x0] }
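
      To decide which component to mark as preferred when an OST goes down, it helps to map each component ID to the OSTs that hold its objects. The following is a minimal sketch, not part of Lustre: the helper names `components_by_ost` and `components_avoiding` are hypothetical, and the parser assumes the `lfs getstripe -v` text layout shown above (trimmed here to the relevant fields).

```python
import re

# Trimmed copy of the `lfs getstripe -v` output above (only fields the parser needs).
SAMPLE = """\
components:
  - lcme_id:             65537
    lcme_mirror_id:      1
    lmm_objects:
    - 0: { l_ost_idx: 9, l_fid: [0x440000419:0x17e2:0x0] }

  - lcme_id:             131074
    lcme_mirror_id:      2
    lmm_objects:
    - 0: { l_ost_idx: 13, l_fid: [0x2800000408:0x1802:0x0] }
"""


def components_by_ost(text):
    """Map each lcme_id to the list of OST indices its objects live on."""
    result = {}
    current = None
    for line in text.splitlines():
        m = re.search(r"lcme_id:\s+(\d+)", line)
        if m:
            # A new component starts; later l_ost_idx lines belong to it.
            current = int(m.group(1))
            result[current] = []
            continue
        m = re.search(r"l_ost_idx:\s*(\d+)", line)
        if m and current is not None:
            result[current].append(int(m.group(1)))
    return result


def components_avoiding(text, down_ost):
    """Return the component IDs that have no object on the given OST index."""
    return [cid for cid, osts in components_by_ost(text).items()
            if down_ost not in osts]


print(components_by_ost(SAMPLE))
print(components_avoiding(SAMPLE, 9))
```

      With ost9 down, the only component that avoids it is 131074, which is the ID passed to `-I` in the command below.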
      

      After taking ost9 offline we are still able to read the file.

      However, to be able to write to the file we need to set the prefer flag on the other component (ID 131074, on ost13):

      lfs setstripe --comp-set -I 131074 --comp-flags=prefer test-file1
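
      The component ID passed to `-I` encodes the mirror it belongs to. The sketch below decodes the two IDs from the layout above. It assumes the `MIRROR_ID_MASK` and `MIRROR_ID_SHIFT` values from `lustre_user.h` as I understand them; treating the low 16 bits as a per-file component sequence number is likewise an assumption, and `split_lcme_id` is a hypothetical helper.

```python
# Assumed to match the definitions in lustre_user.h.
MIRROR_ID_MASK = 0x7FFF0000
MIRROR_ID_SHIFT = 16


def mirror_id_of(lcme_id):
    """Extract the mirror ID stored in the upper bits of a component ID."""
    return (lcme_id & MIRROR_ID_MASK) >> MIRROR_ID_SHIFT


def split_lcme_id(lcme_id):
    """Return (mirror_id, low_16_bits) for a component ID."""
    return mirror_id_of(lcme_id), lcme_id & 0xFFFF


for comp_id in (65537, 131074):
    mirror, seq = split_lcme_id(comp_id)
    print(f"lcme_id {comp_id} (0x{comp_id:x}): mirror {mirror}, sequence {seq}")
```

      The decoded mirror IDs (1 and 2) agree with the `lcme_mirror_id` values printed by `lfs getstripe -v`.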
      

      However, the command gets stuck because lfs is trying to flush the client cache to ost9, which is offline. The stack of the stuck process shows it waiting in the data-version ioctl:

      [<ffffffffc1433f94>] osc_io_data_version_end+0x34/0x190 [osc]
      [<ffffffffc0fc4ee0>] cl_io_end+0x60/0x150 [obdclass]
      [<ffffffffc0e0a0bb>] lov_io_end_wrapper+0xdb/0xe0 [lov]
      [<ffffffffc0e0ad38>] lov_io_data_version_end+0x78/0x1d0 [lov]
      [<ffffffffc0fc4ee0>] cl_io_end+0x60/0x150 [obdclass]
      [<ffffffffc0fc779a>] cl_io_loop+0xda/0x1c0 [obdclass]
      [<ffffffffc1513bcb>] ll_ioc_data_version+0x20b/0x340 [lustre]
      [<ffffffffc15283e0>] ll_file_ioctl+0x19d0/0x49f0 [lustre]
      [<ffffffffb665d9e0>] do_vfs_ioctl+0x3a0/0x5a0
      [<ffffffffb665dc81>] SyS_ioctl+0xa1/0xc0
      [<ffffffffb6b8cede>] system_call_fastpath+0x25/0x2a
      [<ffffffffffffffff>] 0xffffffffffffffff
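
      To make the call path easier to read, the trace above can be reduced to (function, module) pairs. This is only an illustrative sketch: the regular expression assumes the frame format shown in this report, `parse_frames` is a hypothetical helper, and frames without a module tag are labeled "kernel".

```python
import re

TRACE = """\
[<ffffffffc1433f94>] osc_io_data_version_end+0x34/0x190 [osc]
[<ffffffffc0fc4ee0>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffc0e0a0bb>] lov_io_end_wrapper+0xdb/0xe0 [lov]
[<ffffffffc0e0ad38>] lov_io_data_version_end+0x78/0x1d0 [lov]
[<ffffffffc0fc4ee0>] cl_io_end+0x60/0x150 [obdclass]
[<ffffffffc0fc779a>] cl_io_loop+0xda/0x1c0 [obdclass]
[<ffffffffc1513bcb>] ll_ioc_data_version+0x20b/0x340 [lustre]
[<ffffffffc15283e0>] ll_file_ioctl+0x19d0/0x49f0 [lustre]
[<ffffffffb665d9e0>] do_vfs_ioctl+0x3a0/0x5a0
[<ffffffffb665dc81>] SyS_ioctl+0xa1/0xc0
[<ffffffffb6b8cede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
"""

# address, symbol+offset/size, optional [module]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(trace):
    """Return a list of (function, module) tuples, innermost frame first."""
    frames = []
    for line in trace.splitlines():
        m = FRAME.search(line)
        if m:  # the terminating 0xffffffffffffffff line does not match
            frames.append((m.group(1), m.group(2) or "kernel"))
    return frames


frames = parse_frames(TRACE)
for func, module in frames:
    print(f"{module:>8}  {func}")
```

      Read from the bottom up, the ioctl enters through `ll_file_ioctl` and `ll_ioc_data_version` in the lustre module, passes through lov, and the innermost frame is `osc_io_data_version_end` in osc.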
      

          People

            Assignee: Zhenyu Xu (bobijam)
            Reporter: Dongyang Li (dongyang)