Di Wang (Inactive)
added a comment - edited
Does it mean there have been several occurrences of the issue?
Could it be possible to have the logs of the first occurrence?
Yes, there are several occurrences according to the log. You can tell me one place, and I can upload the debug log there. Unfortunately the debug log is too big (more than 100 MB) to be uploaded here.
Cliff White (Inactive)
added a comment -
There are 8 clients connected to the filesystem. I will check on client differences, but afaik they are uniform.
Gregoire Pichon
added a comment -
Why is the error reported in the description of the ticket (soaked-MDT0001: duplicate export for client generation 3) not the same as the error reported in the first comment from Di Wang (soaked-MDT0001: duplicate export for client generation 1)?
Does it mean there have been several occurrences of the issue?
Could it be possible to have the logs of the first occurrence?
By the way, how many Lustre clients are connected to the filesystem?
Are the two clients with UUIDs c2781877-e222-f9a8-07f4-a1250eba2af6 and 150208e6-71a8-375e-d139-d478eec5b761 different from the others?
Gregoire Pichon
added a comment -
Di Wang,
Unfortunately, I cannot get the debug log from ftp.hpdd.intel.com:/uploads/lu-7794/, since the anonymous login only provides write access.
Would there be another way to provide these logs?
Alex Zhuravlev
added a comment -
It is interesting that in this case and in LDEV-180 the duplicated entry had generation=1, meaning that for some reason the counter got reset.
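To illustrate why generation=1 points at a reset counter, here is a minimal, purely illustrative Python model of a per-target client table with a monotonically increasing generation counter. The class and field names are invented for the sketch and are not the Lustre data structures; the point is only that a counter re-initialized on restart, instead of being seeded from the highest generation already recorded on disk, hands a new client a generation that collides with an existing entry.

# Illustrative model only -- not Lustre code.
class TargetModel:
    def __init__(self, on_disk_clients=None):
        # on_disk_clients: {uuid: generation} as recovered from the client table
        self.clients = dict(on_disk_clients or {})
        # The suspected problem: the counter restarts at 0 instead of
        # max(existing generations).
        self.generation_counter = 0

    def client_new(self, uuid):
        self.generation_counter += 1
        gen = self.generation_counter
        for old_uuid, old_gen in self.clients.items():
            if old_gen == gen:
                raise RuntimeError("duplicate export for client generation %d "
                                   "(existing %s, new %s)" % (gen, old_uuid, uuid))
        self.clients[uuid] = gen
        return gen

# A restarted target that already has clients recorded at generations 1 and 2:
tgt = TargetModel({"c2781877-e222-f9a8-07f4-a1250eba2af6": 1,
                   "150208e6-71a8-375e-d139-d478eec5b761": 2})
tgt.client_new("new-client-uuid")  # raises: duplicate export for client generation 1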
Frank Heckes (Inactive)
added a comment - edited
Gregoire: I hope the info below answers your question. Please let me know if something is missing or unclear.
Small environment info addition:
lola-8 --> MDS0, lola-9 --> MDS1, lola-10 --> MDS2, lola-11 --> MDS3
MDTs formatted with ldiskfs, OSTs using zfs
Failover procedure
Triggered by the automated framework:
A randomly selected node is powercycled at a randomly chosen time
Wait until the node is up again
Mount resources on the failover partner:
Mount MDT resources in sequence
Wait for the mount command to complete
If an error occurs, retry mounting the Lustre MDT(s) (see the sketch after this list)
The framework is configured for this test session NOT to wait for the RECOVERY process to complete
umount MDTs on the secondary node
mount MDTs on the primary node again
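As a rough sketch of the mount/retry step in the list above (the function name and command invocation are placeholders, not the actual soak framework code), assuming mount.lustre exits with 114 when the target is already running:

# Rough sketch of the serialized mount/retry step described above.
# Placeholder names -- not the actual soak framework implementation.
import subprocess
import time

def mount_mdt(node, device, mountpoint, retries=5, delay=60):
    """Mount one MDT on the failover partner, retrying on failure.

    Mount commands are issued strictly one at a time; a new attempt only
    starts after the previous mount command has finished.
    """
    for attempt in range(retries):
        proc = subprocess.run(
            ["ssh", node, "mount", "-t", "lustre", device, mountpoint],
            capture_output=True, text=True)
        if proc.returncode == 0:
            return True
        # 114 == EALREADY ("Operation already in progress"): the target
        # service is reported as already running.
        print("mount of %s on %s failed with %d, retrying ..."
              % (device, node, proc.returncode))
        time.sleep(delay)
    return False

# MDTs are mounted in sequence on the failover partner, e.g. on lola-9:
# mount_mdt("lola-9", "/dev/mapper/<mdt1-device>", "/mnt/soaked-mdt1")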
NOTE:
I checked the soak framework. I'm very sure that the implementation won't execute multiple mount commands at the same time or start a new mount command while the previously started one hasn't finished yet.
The framework does not check whether there is already a mount command running that was started outside the framework.
Concerning the 'sequence of events':
2016-02-18 11:07:14,437:fsmgmt.fsmgmt:INFO triggering fault mds_failover
2016-02-18 11:07:14,438:fsmgmt.fsmgmt:INFO reseting MDS node lola-8 (--> Node was powercycled!)
2016-02-18 11:07:14,439:fsmgmt.fsmgmt:INFO executing cmd pm -h powerman -c lola-8> /dev/null
2016-02-18 11:07:28,291:fsmgmt.fsmgmt:INFO trying to connect to lola-8 ...
2016-02-18 11:07:38,307:fsmgmt.fsmgmt:INFO trying to connect to lola-8 ...
2016-02-18 11:07:46,410:fsmgmt.fsmgmt:INFO trying to connect to lola-8 ...
...
...
2016-02-18 11:13:36,132:fsmgmt.fsmgmt:INFO trying to connect to lola-8 ...
2016-02-18 11:13:37,060:fsmgmt.fsmgmt:INFO lola-8 is up!!!
2016-02-18 11:13:48,072:fsmgmt.fsmgmt:INFO Failing over soaked-MDT0001 ...
2016-02-18 11:13:48,073:fsmgmt.fsmgmt:INFO Mounting soaked-MDT0001 on lola-9 ...
2016-02-18 11:16:16,760:fsmgmt.fsmgmt:ERROR ... mount of soaked-MDT0001 on lola-9 failed with 114, retrying ...
2016-02-18 11:16:16,760:fsmgmt.fsmgmt:INFO mount.lustre: increased /sys/block/dm-10/queue/max_sectors_kb from 1024 to 16383
mount.lustre: increased /sys/block/dm-8/queue/max_sectors_kb from 1024 to 16383
mount.lustre: increased /sys/block/sdg/queue/max_sectors_kb from 1024 to 16383
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 1024 to 16383
mount.lustre: mount /dev/mapper/360080e50002ffd820000024f52013094p1 at /mnt/soaked-mdt1 failed: Operation already in progress
The target service is already running. (/dev/mapper/360080e50002ffd820000024f52013094p1)
2016-02-18 11:18:12,738:fsmgmt.fsmgmt:ERROR ... mount of soaked-MDT0001 on lola-9 failed with 114, retrying ...
2016-02-18 11:18:12,739:fsmgmt.fsmgmt:INFO mount.lustre: mount /dev/mapper/360080e50002ffd820000024f52013094p1 at /mnt/soaked-mdt1 failed: Operation already in progress
The target service is already running. (/dev/mapper/360080e50002ffd820000024f52013094p1)
2016-02-18 11:26:58,512:fsmgmt.fsmgmt:INFO ... soaked-MDT0001 mounted successfully on lola-9
2016-02-18 11:26:58,513:fsmgmt.fsmgmt:INFO ... soaked-MDT0001 failed over
2016-02-18 11:26:58,513:fsmgmt.fsmgmt:INFO Failing over soaked-MDT0000 ...
2016-02-18 11:26:58,513:fsmgmt.fsmgmt:INFO Mounting soaked-MDT0000 on lola-9 ...
2016-02-18 11:27:51,049:fsmgmt.fsmgmt:INFO ... soaked-MDT0000 mounted successfully on lola-9
2016-02-18 11:27:51,049:fsmgmt.fsmgmt:INFO ... soaked-MDT0000 failed over
2016-02-18 11:28:11,430:fsmgmt.fsmgmt:DEBUG Recovery Result Record: {'lola-9': {'soaked-MDT0001': 'RECOVERING', 'soaked-MDT0000': 'RECOVERING', 'soaked-MDT0003': 'COMPLETE', 'soaked-MDT0002': 'COMPLETE'}}
2016-02-18 11:28:11,431:fsmgmt.fsmgmt:INFO soaked-MDT0001 in status 'RECOVERING'.
2016-02-18 11:28:11,431:fsmgmt.fsmgmt:INFO soaked-MDT0000 in status 'RECOVERING'.
2016-02-18 11:28:11,431:fsmgmt.fsmgmt:INFO Don't wait for recovery to complete. Failback MDT's immediately
2016-02-18 11:28:11,431:fsmgmt.fsmgmt:INFO Failing back soaked-MDT0001 ...
2016-02-18 11:28:11,431:fsmgmt.fsmgmt:INFO Unmounting soaked-MDT0001 on lola-9 ...
2016-02-18 11:28:12,078:fsmgmt.fsmgmt:INFO ... soaked-MDT0001 unmounted successfully on lola-9
2016-02-18 11:28:12,079:fsmgmt.fsmgmt:INFO Mounting soaked-MDT0001 on lola-8 ...
2016-02-18 11:29:03,122:fsmgmt.fsmgmt:INFO ... soaked-MDT0001 mounted successfully on lola-8
2016-02-18 11:29:03,122:fsmgmt.fsmgmt:INFO ... soaked-MDT0001 failed back
2016-02-18 11:29:03,123:fsmgmt.fsmgmt:INFO Failing back soaked-MDT0000 ...
2016-02-18 11:29:03,123:fsmgmt.fsmgmt:INFO Unmounting soaked-MDT0000 on lola-9 ...
2016-02-18 11:29:09,942:fsmgmt.fsmgmt:INFO ... soaked-MDT0000 unmounted successfully on lola-9
2016-02-18 11:29:09,942:fsmgmt.fsmgmt:INFO Mounting soaked-MDT0000 on lola-8 ...
2016-02-18 11:29:24,023:fsmgmt.fsmgmt:INFO ... soaked-MDT0000 mounted successfully on lola-8
2016-02-18 11:29:24,023:fsmgmt.fsmgmt:INFO ... soaked-MDT0000 failed back
2016-02-18 11:29:24,024:fsmgmt.fsmgmt:INFO mds_failover just completed
2016-02-18 11:29:24,024:fsmgmt.fsmgmt:INFO next fault in 1898s
The error message at 2016-02-18 11:16:16,760 (... failed with 114 ...) is strange because, as stated above, the soak framework
doesn't execute multiple mounts of the same device file.
I'm not sure whether some manual mount of MDT1 (outside the soak framework) took place. I'm sure I didn't execute anything on the node at that time.
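For reference, exit code 114 corresponds to EALREADY ("Operation already in progress") on Linux, which matches the mount.lustre message quoted in the log; this can be checked quickly in Python:

import errno
import os

assert errno.EALREADY == 114           # errno value on Linux
print(os.strerror(errno.EALREADY))     # -> "Operation already in progress"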
Applications executed:
mdtest (single shared file, file per process)
IOR (single shared file, file per process)
simul
blogbench
kcompile
pct (producer, consumer inhouse application)
All applications are started with random sizes, file counts and block sizes. If needed, I could provide the (slurm) list of active jobs at the time the error occurred.
Di Wang (Inactive)
added a comment -
I uploaded the debug log to ftp.hpdd.intel.com:/uploads/lu-7794/
The whole process works like this: there are 4 MDSs, each with 2 MDTs: MDS0 (MDT0/1), MDS1 (MDT2/3), MDS2 (MDT4/5), MDS3 (MDT6/7). MDS0 and MDS1 are configured as an active/active failover pair, and MDS2 and MDS3 are configured as an active/active failover pair.
The test randomly chooses one MDS to reboot; its MDTs then fail over to its partner MDS. In this case MDS0 was restarted, so MDT0/MDT1 should have been mounted on MDS1, but the mount failed because of this issue. No, I do not think this can be easily reproduced.
Maybe Frank or Cliff will know more.
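For readers following along, the layout described above, combined with the node mapping Frank gives elsewhere in this ticket (lola-8 through lola-11), can be summarized as data. This is only a summary of the configuration as described, not code from the test setup:

# MDS -> node, MDTs and failover partner, as described in this thread.
mds_layout = {
    "MDS0": {"node": "lola-8",  "mdts": ["MDT0000", "MDT0001"], "partner": "MDS1"},
    "MDS1": {"node": "lola-9",  "mdts": ["MDT0002", "MDT0003"], "partner": "MDS0"},
    "MDS2": {"node": "lola-10", "mdts": ["MDT0004", "MDT0005"], "partner": "MDS3"},
    "MDS3": {"node": "lola-11", "mdts": ["MDT0006", "MDT0007"], "partner": "MDS2"},
}

def failover_destination(failed_mds):
    """Return the node that should mount the failed MDS's MDTs."""
    partner = mds_layout[failed_mds]["partner"]
    return mds_layout[partner]["node"], mds_layout[failed_mds]["mdts"]

# In this incident MDS0 (lola-8) was restarted:
print(failover_destination("MDS0"))   # ('lola-9', ['MDT0000', 'MDT0001'])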
Gregoire Pichon
added a comment -
Yes. Could you upload the debug logs as an attachment to this JIRA ticket?
Are the D_INFO messages written to the logs?
It would be helpful to look for the D_INFO messages logged in tgt_client_new():
CDEBUG(D_INFO, "%s: new client at index %d (%llu) with UUID '%s' "
       "generation %d\n",
       tgt->lut_obd->obd_name, ted->ted_lr_idx, ted->ted_lr_off,
       ted->ted_lcd->lcd_uuid, ted->ted_lcd->lcd_generation);
And see in which context they were called.
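In case it helps when digging through the large debug log, a rough Python filter for those tgt_client_new() D_INFO lines could look like the sketch below; the regular expression simply follows the CDEBUG format string quoted above and may need adjusting to the actual dumped log layout.

# Rough filter for "new client at index ..." D_INFO messages in a dumped
# (text) Lustre debug log. The pattern mirrors the CDEBUG format above.
import re
import sys

PATTERN = re.compile(
    r"(?P<target>\S+): new client at index (?P<idx>\d+) \((?P<off>\d+)\) "
    r"with UUID '(?P<uuid>[^']+)' generation (?P<gen>\d+)")

def scan(path):
    with open(path, errors="replace") as log:
        for line in log:
            m = PATTERN.search(line)
            if m:
                print(m.group("target"), "idx", m.group("idx"),
                      "uuid", m.group("uuid"), "gen", m.group("gen"))

if __name__ == "__main__":
    scan(sys.argv[1])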
And again, could you detail what tests were running on the filesystem, please?
Di Wang (Inactive)
added a comment -
Hmm, I am not sure we can reproduce the problem easily. I am not sure there are node logs. If you need the debug log, maybe I can upload it somewhere.
Gregoire Pichon
added a comment -
Honestly, I have no idea what could lead to that duplicate export situation.
Could you detail what tests were running on the filesystem?
Do you think you could reconstruct the operations (client connections/disconnections, MDT mounts/unmounts, other MDTs' connections/disconnections, ...) that occurred on the filesystem from the node logs?