Details
Type: Bug
Resolution: Duplicate
Priority: Critical
Fix Version/s: None
Affects Version/s: Lustre 2.8.0
Severity: 3
Description
We saw this during a DNE failover soak test.
Lustre: 4374:0:(client.c:2063:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1455822992/real 1455822992] req@ffff8807bba3b980 x1526529651962748/t0(0) o250->MGC192.168.1.108@o2ib10@0@lo:26/25 lens 520/544 e 0 to 1 dl 1455823038 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Lustre: 4374:0:(client.c:2063:ptlrpc_expire_one_request()) Skipped 37 previous similar messages
LustreError: 8115:0:(tgt_lastrcvd.c:1464:tgt_clients_data_init()) soaked-MDT0001: duplicate export for client generation 3
LustreError: 8115:0:(obd_config.c:578:class_setup()) setup soaked-MDT0001 failed (-114)
LustreError: 8115:0:(obd_config.c:1666:class_config_llog_handler()) MGC192.168.1.108@o2ib10: cfg command failed: rc = -114
Lustre: cmd=cf003 0:soaked-MDT0001 1:soaked-MDT0001_UUID 2:1 3:soaked-MDT0001-mdtlov 4:f
As a result, the failed-over MDT cannot be mounted on the new MDS. This is similar to LU-7430, but there was no panic this time.
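For context, rc = -114 is -EALREADY. Below is a minimal, self-contained sketch of the kind of duplicate-generation check that can return it while per-client records from a last_rcvd-style file are scanned at mount time; the struct and function names are hypothetical and only illustrate the idea, not the actual tgt_clients_data_init() code.

/*
 * Hypothetical sketch only: reject a second client record that carries an
 * already-seen generation number, returning -EALREADY (-114).  Names and
 * layout are invented for illustration; this is not the Lustre code.
 */
#include <errno.h>
#include <stdio.h>

struct client_record {
	const char   *uuid;       /* client UUID from the on-disk slot */
	unsigned int  generation; /* connection generation counter     */
};

/* Return 0 if all generations are unique, -EALREADY on a duplicate. */
static int check_client_records(const struct client_record *recs, int count)
{
	for (int i = 0; i < count; i++) {
		for (int j = 0; j < i; j++) {
			if (recs[i].generation == recs[j].generation) {
				fprintf(stderr,
					"duplicate export for client generation %u\n",
					recs[i].generation);
				return -EALREADY; /* -114 on Linux */
			}
		}
	}
	return 0;
}

int main(void)
{
	/* Two slots ended up with generation 3, as in the log above. */
	const struct client_record recs[] = {
		{ "client-a", 2 },
		{ "client-b", 3 },
		{ "client-c", 3 },
	};

	int rc = check_client_records(recs, 3);
	printf("setup rc = %d\n", rc);
	return rc ? 1 : 0;
}

Whatever the exact trigger, once that check fails the setup of soaked-MDT0001 aborts with -114, which matches the class_setup() and cfg command failures in the log.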
I uploaded the debug log to ftp.hpdd.intel.com:/uploads/lu-7794/
The setup is as follows: there are 4 MDS nodes, each serving 2 MDTs: MDS0 (MDT0/1), MDS1 (MDT2/3), MDS2 (MDT4/5), MDS3 (MDT6/7). MDS0 and MDS1 are configured as an active/active failover pair, and MDS2 and MDS3 are configured as another active/active failover pair.
The test randomly chooses one MDS to reboot, after which its MDTs should fail over to the paired MDS (see the sketch below). In this case MDS0 was restarted, so MDT0/MDT1 should have been mounted on MDS1, but the mount failed because of this error. No, I do not think this can be easily reproduced.
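For reference, here is a tiny sketch of the pairing described above, assuming the node and target names are just shorthand for the soak cluster rather than actual configuration: given the MDS that was rebooted, it lists the MDTs that should come up on its peer.

/*
 * Illustrative sketch of the soak-test failover layout described above.
 * Node/target names are shorthand, not real cluster configuration.
 */
#include <stdio.h>
#include <string.h>

struct mdt_map {
	const char *mdt;      /* metadata target                */
	const char *primary;  /* MDS that normally serves it    */
	const char *peer;     /* active/active failover partner */
};

static const struct mdt_map layout[] = {
	{ "MDT0", "MDS0", "MDS1" }, { "MDT1", "MDS0", "MDS1" },
	{ "MDT2", "MDS1", "MDS0" }, { "MDT3", "MDS1", "MDS0" },
	{ "MDT4", "MDS2", "MDS3" }, { "MDT5", "MDS2", "MDS3" },
	{ "MDT6", "MDS3", "MDS2" }, { "MDT7", "MDS3", "MDS2" },
};

int main(void)
{
	const char *rebooted = "MDS0"; /* the MDS taken down by the test */

	/* Every MDT whose primary was rebooted should mount on the peer. */
	for (size_t i = 0; i < sizeof(layout) / sizeof(layout[0]); i++) {
		if (strcmp(layout[i].primary, rebooted) == 0)
			printf("%s fails over to %s\n",
			       layout[i].mdt, layout[i].peer);
	}
	return 0;
}

With rebooted = "MDS0" this prints that MDT0 and MDT1 fail over to MDS1, which is exactly the mount that failed here.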
Maybe Frank or Cliff will know more.