Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.15.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      Currently, in lov_init_composite() will quit with error when reading an unknown LOV pattern:

      			CERROR("%s: unknown composite layout entry type %i\n",
      			       lov2obd(dev->ld_lov)->obd_name,
      			       lsm->lsm_entries[i]->lsme_pattern);
      

      Since FLR will be used for upcoming new features, like PCC-RO, an old client should still be able to read the old format mirrors and ignore the new types of mirrors.

      On the MDS, it should skip mirrors with unknown/foreign component types and mark them stale when selecting which component to use for writes.

      Attachments

        Issue Links

          Activity

            [LU-13602] Skip unknown FLR component types
            simmonsja James A Simmons added a comment - - edited

            For some reason the change in the patch LCM_FL_PCC_RDONLY exposes a bug that I have seen with the port to the native Linux client. With the patch I get:

            Lustre: DEBUG MARKER: == sanity test 63b: async write errors should be returned to fsync ============================

            ======= 10:58:33 (1627916313)

            Lustre: *** cfs_fail_loc=406, val=0***

            LustreError: 20036:0:(osc_request.c:2586:osc_build_rpc()) prep_req failed: -12

            LustreError: 20036:0:(osc_cache.c:2259:osc_check_rpcs()) Write request failed with -12

             LustreError: 20042:0:(lu_object.c:215:lu_object_put()) ASSERTION( list_empty(&top->loh_lru) ) failed:

            LustreError: 20042:0:(lu_object.c:215:lu_object_put()) LBUG

            kernel: Pid: 20042, comm: lctl 5.7.0-rc7+ #1 SMP PREEMPT Sat Jul 31 13:04:46 EDT 2021

            kernel: Call Trace:

            kernel: libcfs_call_trace+0x62/0x80 [libcfs]

            kernel: lbug_with_loc+0x41/0xa0 [libcfs]

            kernel: lu_object_put+0x31b/0x340 [obdclass]

            kernel: vvp_pgcache_current+0x7e/0x150 [lustre]

            kernel: seq_read+0x200/0x3c0

            kernel: full_proxy_read+0x4d/0x70

            kernel: vfs_read+0xb3/0x17

            kernel: ksys_read+0x5f/0xd0

            kernel: do_syscall_64+0x6d/0x4f0

            kernel: entry_SYSCALL_64_after_hwframe+0x49/0xb3

            kernel: Kernel panic - not syncing: LBUG

            simmonsja James A Simmons added a comment - - edited For some reason the change in the patch LCM_FL_PCC_RDONLY exposes a bug that I have seen with the port to the native Linux client. With the patch I get: Lustre: DEBUG MARKER: == sanity test 63b: async write errors should be returned to fsync ============================ ======= 10:58:33 (1627916313) Lustre: *** cfs_fail_loc=406, val=0*** LustreError: 20036:0:(osc_request.c:2586:osc_build_rpc()) prep_req failed: -12 LustreError: 20036:0:(osc_cache.c:2259:osc_check_rpcs()) Write request failed with -12  LustreError: 20042:0:(lu_object.c:215:lu_object_put()) ASSERTION( list_empty(&top->loh_lru) ) failed: LustreError: 20042:0:(lu_object.c:215:lu_object_put()) LBUG kernel: Pid: 20042, comm: lctl 5.7.0-rc7+ #1 SMP PREEMPT Sat Jul 31 13:04:46 EDT 2021 kernel: Call Trace: kernel: libcfs_call_trace+0x62/0x80 [libcfs] kernel: lbug_with_loc+0x41/0xa0 [libcfs] kernel: lu_object_put+0x31b/0x340 [obdclass] kernel: vvp_pgcache_current+0x7e/0x150 [lustre] kernel: seq_read+0x200/0x3c0 kernel: full_proxy_read+0x4d/0x70 kernel: vfs_read+0xb3/0x17 kernel: ksys_read+0x5f/0xd0 kernel: do_syscall_64+0x6d/0x4f0 kernel: entry_SYSCALL_64_after_hwframe+0x49/0xb3 kernel: Kernel panic - not syncing: LBUG
            pjones Peter Jones added a comment -

            Landed for 2.15

            pjones Peter Jones added a comment - Landed for 2.15

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40813/
            Subject: LU-13602 pcc: add LCM_FL_PCC_RDONLY layout flag
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: adc1bbbf20e0a8a53274aa4590ed0935f954d1bc

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40813/ Subject: LU-13602 pcc: add LCM_FL_PCC_RDONLY layout flag Project: fs/lustre-release Branch: master Current Patch Set: Commit: adc1bbbf20e0a8a53274aa4590ed0935f954d1bc

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39513/
            Subject: LU-13602 flr: skip unknown FLR component types
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 61a002cd8631ecd35a0326dba68f0eb6e48dc44f

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39513/ Subject: LU-13602 flr: skip unknown FLR component types Project: fs/lustre-release Branch: master Current Patch Set: Commit: 61a002cd8631ecd35a0326dba68f0eb6e48dc44f
            qian_wc Qian Yingjin added a comment -

            Updated the patch https://review.whamcloud.com/#/c/39513/.

            It will skip the unknown FOREIGN layout component that the client does not support or understand.

            qian_wc Qian Yingjin added a comment - Updated the patch https://review.whamcloud.com/#/c/39513/ . It will skip the unknown FOREIGN layout component that the client does not support or understand.

            Yingjin, neither of these patches are addressing the main issue of this ticket, which is to allow the MDS and clients to skip layout types that they do not understand. In particular, in the new PCC code, it is using a foreign layout as part of a file, and if that file is being modified by a client then the PCC component needs to be marked STALE. The same should be done for any other layout type that the MDS does not understand.

            On the client, it should not try to read from a component type that it doesn't understand.

            adilger Andreas Dilger added a comment - Yingjin, neither of these patches are addressing the main issue of this ticket, which is to allow the MDS and clients to skip layout types that they do not understand. In particular, in the new PCC code, it is using a foreign layout as part of a file, and if that file is being modified by a client then the PCC component needs to be marked STALE. The same should be done for any other layout type that the MDS does not understand. On the client, it should not try to read from a component type that it doesn't understand.

            Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/40813
            Subject: LU-13602 pcc: add LCM_FL_PCC_RDONLY layout flag
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2dc8af64b58ab5cc6d3ae4964ceda4271d2baef4

            gerrit Gerrit Updater added a comment - Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/40813 Subject: LU-13602 pcc: add LCM_FL_PCC_RDONLY layout flag Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 2dc8af64b58ab5cc6d3ae4964ceda4271d2baef4
            qian_wc Qian Yingjin added a comment -

            Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/40791
            Subject: LU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d2638ccd5c1a327840c68d875eb30a0a3f4f895d

            qian_wc Qian Yingjin added a comment - Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/40791 Subject: LU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: d2638ccd5c1a327840c68d875eb30a0a3f4f895d
            qian_wc Qian Yingjin added a comment - - edited

            The old on-disk data for lcm_flags are:

            /**
             * on-disk data for lcm_flags. Valid if lcm_magic is LOV_MAGIC_COMP_V1.
             */
            enum lov_comp_md_flags {
            	/* the least 2 bits are used by FLR to record file state */
            	LCM_FL_NONE          = 0,
            	LCM_FL_RDONLY           = 1,
            	LCM_FL_WRITE_PENDING    = 2,
            	LCM_FL_SYNC_PENDING     = 3,
            	LCM_FL_FLR_MASK         = 0x3,
            };
            
            

            We have also changed the value of LCM_FL_SYNC_PENDING...

            Maybe the new lov_comp_md_flags should be:

            /**
             * on-disk data for lcm_flags. Valid if lcm_magic is LOV_MAGIC_COMP_V1.
             */
            enum lov_comp_md_flags {
            	/* the least 2 bits are used by FLR to record file state */
            	LCM_FL_NONE          = 0x00,
            	LCM_FL_RDONLY           = 0x01,
            	LCM_FL_WRITE_PENDING    = 0x02,
            	LCM_FL_SYNC_PENDING     = 0x03,
                    LCM_FL_PCC_RDONLY = 0x10,
            	LCM_FL_FLR_MASK         = 0x13,
            };
            
            

            we don't change the old FLR component flags, add a new LCM_FL_PCC_RDONLY from 0x10.

            qian_wc Qian Yingjin added a comment - - edited The old on-disk data for lcm_flags are: /** * on-disk data for lcm_flags. Valid if lcm_magic is LOV_MAGIC_COMP_V1. */ enum lov_comp_md_flags { /* the least 2 bits are used by FLR to record file state */ LCM_FL_NONE = 0, LCM_FL_RDONLY = 1, LCM_FL_WRITE_PENDING = 2, LCM_FL_SYNC_PENDING = 3, LCM_FL_FLR_MASK = 0x3, }; We have also changed the value of LCM_FL_SYNC_PENDING... Maybe the new lov_comp_md_flags should be: /** * on-disk data for lcm_flags. Valid if lcm_magic is LOV_MAGIC_COMP_V1. */ enum lov_comp_md_flags { /* the least 2 bits are used by FLR to record file state */ LCM_FL_NONE = 0x00, LCM_FL_RDONLY = 0x01, LCM_FL_WRITE_PENDING = 0x02, LCM_FL_SYNC_PENDING = 0x03, LCM_FL_PCC_RDONLY = 0x10, LCM_FL_FLR_MASK = 0x13, }; we don't change the old FLR component flags, add a new LCM_FL_PCC_RDONLY from 0x10.

            Combining with FLR, we need to extend the data structure `enum lov_comp_md_flags`.

            /**
             * on-disk data for lcm_flags. Valid if lcm_magic is LOV_MAGIC_COMP_V1.
             */
            enum lov_comp_md_flags {
            				/* the least 2 bits are used by FLR to record file state */
            				LCM_FL_NONE          		= 0x0,
            				LCM_FL_RDONLY           = 0x1,
            				LCM_FL_WRITE_PENDING    = 0x2,
            				LCM_FL_SYNC_PENDING     = 0x4,
            				LCM_FL_PCC_RDONLY		= ox8, /* The file is RO-PCC cached */
            				LCM_FL_FLR_MASK         = 0xF,
            };
            

            Please note the flags LCM_FL_PCC_RDONLY is different from LCM_FL_RDONLY here.
            A PCC-RO cached file could be in the state:

            • LCM_FL_PCC_RDONLY | LCM_FL_RDONLY: it means that all components are synced and in up-to-date state, and then one client attaches the file into the PCC backend FS.
            • LCM_FL_PCC_RDONLY | LCM_FL_WRITE_PENDING: It means the file was once modified by a client, the data content of all components are not synced. The MDT has already picked a primary replication and marked other components as STALE. At this time, a client can still PCC-RO attach the file. For this client, the primary component and the PCC copy are both in up-to-date state.
            • ...

            From my own opinion, I do not think PCC-RO attaching a file in LCM_FL_WRITE_PENDING state needs to sync all mirrors and transmit the file into LCM_FL_RDONLY state which may be time-consuming. In our implementation, we only need to add a new PCC-RO foreign layout component and add LCM_FL_PCC_RDONLY into the FLR flags.

            As we add a new LCM_FL_PCC_RDONLY state, thus the old client may not understand this new FLR component flag, and may result in inconsistent access.

            Regards,
            Qian

            qian_wc Qian Yingjin added a comment - Combining with FLR, we need to extend the data structure `enum lov_comp_md_flags`. /** * on-disk data for lcm_flags. Valid if lcm_magic is LOV_MAGIC_COMP_V1. */ enum lov_comp_md_flags { /* the least 2 bits are used by FLR to record file state */ LCM_FL_NONE = 0x0, LCM_FL_RDONLY = 0x1, LCM_FL_WRITE_PENDING = 0x2, LCM_FL_SYNC_PENDING = 0x4, LCM_FL_PCC_RDONLY = ox8, /* The file is RO-PCC cached */ LCM_FL_FLR_MASK = 0xF, }; Please note the flags LCM_FL_PCC_RDONLY is different from LCM_FL_RDONLY here. A PCC-RO cached file could be in the state: LCM_FL_PCC_RDONLY | LCM_FL_RDONLY: it means that all components are synced and in up-to-date state, and then one client attaches the file into the PCC backend FS. LCM_FL_PCC_RDONLY | LCM_FL_WRITE_PENDING: It means the file was once modified by a client, the data content of all components are not synced. The MDT has already picked a primary replication and marked other components as STALE. At this time, a client can still PCC-RO attach the file. For this client, the primary component and the PCC copy are both in up-to-date state. ... From my own opinion, I do not think PCC-RO attaching a file in LCM_FL_WRITE_PENDING state needs to sync all mirrors and transmit the file into LCM_FL_RDONLY state which may be time-consuming. In our implementation, we only need to add a new PCC-RO foreign layout component and add LCM_FL_PCC_RDONLY into the FLR flags. As we add a new LCM_FL_PCC_RDONLY state, thus the old client may not understand this new FLR component flag, and may result in inconsistent access. Regards, Qian

            People

              qian_wc Qian Yingjin
              lixi_wc Li Xi
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: