Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version/s: Lustre 2.10.0, Lustre 2.10.1
- Environment: client: lustre-client-2.10.0-1.el7.x86_64, lustre-2.10.1_RC1_srcc01-1.el7.centos.x86_64 (2.10.1-RC1 + patch from LU-9929)
- Severity: 2
Description
We're using the nodemap feature with map_mode=gid_only in production, and we are seeing more and more issues with GID mapping, which seems to fall back to squash_gid instead of being properly mapped. The nodemap configuration hasn't changed for these groups; we just add new groups from time to time.
Example: the nodemap configuration for 'sherlock' on the MGS:
[root@oak-md1-s1 sherlock]# pwd
/proc/fs/lustre/nodemap/sherlock
[root@oak-md1-s1 sherlock]# cat ranges
[
 { id: 6, start_nid: 0.0.0.0@o2ib4, end_nid: 255.255.255.255@o2ib4 },
 { id: 5, start_nid: 0.0.0.0@o2ib3, end_nid: 255.255.255.255@o2ib3 }
]
[root@oak-md1-s1 sherlock]# cat idmap
[
 { idtype: gid, client_id: 3525, fs_id: 3741 }
 { idtype: gid, client_id: 6401, fs_id: 3752 }
 { idtype: gid, client_id: 99001, fs_id: 3159 }
 { idtype: gid, client_id: 10525, fs_id: 3351 }
 { idtype: gid, client_id: 11886, fs_id: 3593 }
 { idtype: gid, client_id: 12193, fs_id: 3636 }
 { idtype: gid, client_id: 13103, fs_id: 3208 }
 { idtype: gid, client_id: 17079, fs_id: 3700 }
 { idtype: gid, client_id: 19437, fs_id: 3618 }
 { idtype: gid, client_id: 22959, fs_id: 3745 }
 { idtype: gid, client_id: 24369, fs_id: 3526 }
 { idtype: gid, client_id: 26426, fs_id: 3352 }
 { idtype: gid, client_id: 29361, fs_id: 3746 }
 { idtype: gid, client_id: 29433, fs_id: 3479 }
 { idtype: gid, client_id: 30289, fs_id: 3262 }
 { idtype: gid, client_id: 32264, fs_id: 3199 }
 { idtype: gid, client_id: 32774, fs_id: 3623 }
 { idtype: gid, client_id: 38517, fs_id: 3702 }
 { idtype: gid, client_id: 40387, fs_id: 3708 }
 { idtype: gid, client_id: 47235, fs_id: 3674 }
 { idtype: gid, client_id: 48931, fs_id: 3325 }
 { idtype: gid, client_id: 50590, fs_id: 3360 }
 { idtype: gid, client_id: 52892, fs_id: 3377 }
 { idtype: gid, client_id: 56316, fs_id: 3353 }
 { idtype: gid, client_id: 56628, fs_id: 3411 }
 { idtype: gid, client_id: 59943, fs_id: 3372 }
 { idtype: gid, client_id: 63938, fs_id: 3756 }
 { idtype: gid, client_id: 100533, fs_id: 3281 }
 { idtype: gid, client_id: 244300, fs_id: 3617 }
 { idtype: gid, client_id: 254778, fs_id: 3362 }
 { idtype: gid, client_id: 267829, fs_id: 3748 }
 { idtype: gid, client_id: 270331, fs_id: 3690 }
 { idtype: gid, client_id: 305454, fs_id: 3371 }
 { idtype: gid, client_id: 308753, fs_id: 3367 }
]
[root@oak-md1-s1 sherlock]# cat squash_gid
99
[root@oak-md1-s1 sherlock]# cat map_mode
gid_only
[root@oak-md1-s1 sherlock]# cat admin_nodemap
0
[root@oak-md1-s1 sherlock]# cat deny_unknown
1
[root@oak-md1-s1 sherlock]# cat trusted_nodemap
0
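For reference, that configuration corresponds roughly to the following lctl nodemap commands (a sketch rather than the exact commands we ran; the NID range expression shown is illustrative, and the accepted value string for map_mode may differ by Lustre version):

# on the MGS, with the nodemap feature already activated (lctl nodemap_activate 1)
lctl nodemap_add sherlock
lctl nodemap_add_range --name sherlock --range '10.9.[0-255].[0-255]@o2ib4'   # illustrative; our ranges cover the whole o2ib3/o2ib4 networks as shown above
lctl nodemap_modify --name sherlock --property map_mode --value gid_only
lctl nodemap_modify --name sherlock --property squash_gid --value 99
lctl nodemap_modify --name sherlock --property deny_unknown --value 1
lctl nodemap_add_idmap --name sherlock --idtype gid --idmap 11886:3593        # client GID 11886 -> filesystem GID 3593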
Issue with group: GID 3593 (mapped to GID 11886 on sherlock)
lfs quota, not mapped (using canonical GID 3593):
[root@oak-rbh01 ~]# lfs quota -g oak_euan /oak
Disk quotas for group oak_euan (gid 3593):
Filesystem kbytes quota limit grace files quota limit grace
/oak 33255114444 50000000000 50000000000 - 526016 7500000 7500000 -
Broken lfs quota, mapped on sherlock (o2ib4):
[root@sh-113-01 ~]# lfs quota -g euan /oak
Disk quotas for grp euan (gid 11886):
Filesystem kbytes quota limit grace files quota limit grace
/oak 2875412844* 1 1 - 26* 1 1 -
[root@sh-113-01 ~]# lctl list_nids
10.9.113.1@o2ib4
It matches the quota usage for the squash_gid (GID 99):
[root@oak-rbh01 ~]# lfs quota -g 99 /oak
Disk quotas for group 99 (gid 99):
Filesystem kbytes quota limit grace files quota limit grace
/oak 2875412844* 1 1 - 26* 1 1 -
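For reference, the server-side view of the mapping can also be checked on the MGS with the nodemap test helpers (a sketch; I'd expect these to report 'sherlock' and '3593' here):

# on the MGS
lctl nodemap_test_nid 10.9.113.1@o2ib4                                # which nodemap does this client NID fall into?
lctl nodemap_test_id --nid 10.9.113.1@o2ib4 --idtype gid --id 11886   # how is client GID 11886 mapped for that NID?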
Please note that GID mapping works fine for most of the groups, though:
3199 -> 32264 (sherlock)
canonical:
[root@oak-rbh01 ~]# lfs quota -g oak_ruthm /oak
Disk quotas for group oak_ruthm (gid 3199):
Filesystem kbytes quota limit grace files quota limit grace
/oak 10460005688 20000000000 20000000000 - 1683058 3000000 3000000 -
mapped (sherlock):
[root@sh-113-01 ~]# lfs quota -g ruthm /oak
Disk quotas for grp ruthm (gid 32264):
Filesystem kbytes quota limit grace files quota limit grace
/oak 10460005688 20000000000 20000000000 - 1683058 3000000 3000000 -
Failing over the MDT resolved the problem for a few groups, but not all of them. Failing the MDT back showed the issue on exactly the same groups as originally (currently 4-5 groups affected).
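To keep track of exactly which groups are affected, the canonical and mapped usage can be compared for every idmap entry, for example with a rough script like this (hostnames, mount point and GID pairs are illustrative; a group whose mapped usage matches the squash_gid usage instead of its canonical usage is broken):

#!/bin/bash
# Sketch: compare block usage for each "client_gid:fs_gid" pair as seen from
# an unmapped node (canonical GID) and from a mapped sherlock client.
CANONICAL=oak-rbh01      # node that sees canonical filesystem GIDs
MAPPED=sh-113-01         # client on o2ib4, mapped by the 'sherlock' nodemap
MNT=/oak
for pair in 11886:3593 32264:3199; do                 # extend with the full idmap list
    client_gid=${pair%%:*}; fs_gid=${pair##*:}
    canon=$(ssh "$CANONICAL" "lfs quota -q -g $fs_gid $MNT" | awk '{print $2; exit}')
    mapped=$(ssh "$MAPPED" "lfs quota -q -g $client_gid $MNT" | awk '{print $2; exit}')
    [ "$canon" != "$mapped" ] && echo "fs GID $fs_gid (client GID $client_gid): canonical=$canon mapped=$mapped <-- MISMATCH"
done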
While I haven't seen it myself yet, the issue seems to affect users, as a few of them have reported erroneous EDQUOT errors. This is why it is quite urgent to figure out what's wrong. Please note that the issue was already present before applying the patch from LU-9929.
I'm willing to attach some debug logs, but what debug flags should I enable to troubleshoot such a quota+nodemap issue on the client and server?
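For reference, here is how I'd capture them once the right mask is known (a sketch; 'quota' and 'sec' are just my guesses at the relevant flags):

# on both the client and the MDS: add the flags, reproduce, then dump the buffer
lctl set_param debug="+quota +sec"
lctl set_param debug_mb=512                     # enlarge the debug buffer if needed
lctl clear
#   ... reproduce: lfs quota -g euan /oak on sh-113-01 ...
lctl dk > /tmp/lustre-debug.$(hostname).log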
Thanks!
Stephane
Attachments
Issue Links
- is related to: LU-10135 nodemap_del_idmap() calls nodemap_idx_idmap_del() while holding rwlock (Closed)