[LU-12304] replay-single test_62: 'unlinkmany /mnt/lustre/d62.replay-single/f62.replay-single failed' Created: 15/May/19 Updated: 14/Jul/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0, Lustre 2.13.0, Lustre 2.12.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Patrick Farrell (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Severity: | 3 |
| Description |
|
Recent failure in 2.12.2 testing: https://testing.whamcloud.com/test_sets/81a8b6dc-fdf0-11e8-b837-52540065bddc
Earlier hit, erroneously attached to https://testing.whamcloud.com/test_sets/81a8b6dc-fdf0-11e8-b837-52540065bddc |
| Comments |
| Comment by Patrick Farrell (Inactive) [ 15/May/19 ] |
|
jamesanunez highlighted this slightly scary log snippet:

"We're seeing something similar with replay-single test 62 for ldiskfs/DNE for 2.12.2 RC1 at https://testing.whamcloud.com/test_sets/78994818-753c-11e9-a6f9-52540065bddc . We see the following in the client 2 dmesg"

[64633.042303] Lustre: DEBUG MARKER: == replay-single test 0d: expired recovery with no clients =========================================== 09:24:46 (1557653086)
[64633.892025] Lustre: DEBUG MARKER: mcreate /mnt/lustre/fsa-$(hostname); rm /mnt/lustre/fsa-$(hostname)
[64634.219536] Lustre: DEBUG MARKER: if [ -d /mnt/lustre2 ]; then mcreate /mnt/lustre2/fsa-$(hostname); rm /mnt/lustre2/fsa-$(hostname); fi
[64647.402309] LustreError: 166-1: MGC10.2.4.96@tcp: Connection to MGS (at 10.2.4.96@tcp) was lost; in progress operations using this service will fail
[64655.316668] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-30vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[64655.543605] Lustre: DEBUG MARKER: onyx-30vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[64657.419063] Lustre: Evicted from MGS (at 10.2.4.96@tcp) after server handle changed from 0x4e283717d3799726 to 0x4e283717d3799d54
[64657.421383] LustreError: 17190:0:(import.c:1267:ptlrpc_connect_interpret()) lustre-MDT0000_UUID went back in time (transno 4295093706 was previously committed, server now claims 4295093699)! See https://bugzilla.lustre.org/show_bug.cgi?id=9646
[64837.930298] LustreError: 11-0: lustre-MDT0000-mdc-ffff91839c315000: operation mds_reint to node 10.2.4.96@tcp failed: rc = -107
[64842.704776] LustreError: 167-0: lustre-MDT0000-mdc-ffff91839c315000: This client was evicted by lustre-MDT0000; in progress operations using this service will fail.
[64843.836846] Lustre: DEBUG MARKER: lctl set_param -n fail_loc=0 fail_val=0 2>/dev/null |
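
For anyone scanning other client dmesg logs for the same symptom, here is a minimal sketch that pulls the two transno values out of the "went back in time" message and reports how far the server's claimed value falls behind the one the client saw committed. The regular expression is derived only from the single line quoted above, and the helper name transno_regression is hypothetical; other Lustre versions may word the message differently.

```python
import re

# Pattern derived from the one "went back in time" line quoted in this ticket.
# Assumption: other occurrences of the message use the same wording.
WENT_BACK = re.compile(
    r"(?P<target>\S+) went back in time \(transno (?P<committed>\d+) "
    r"was previously committed, server now claims (?P<claimed>\d+)\)"
)


def transno_regression(line):
    """Return (target, committed, claimed, gap), or None if the line does not match.

    gap is committed - claimed, i.e. how far the server's claimed transno
    is behind what the client had already seen committed.
    """
    m = WENT_BACK.search(line)
    if m is None:
        return None
    committed = int(m.group("committed"))
    claimed = int(m.group("claimed"))
    return m.group("target"), committed, claimed, committed - claimed


# The line from the client 2 dmesg above (trailing bugzilla reference dropped).
line = (
    "[64657.421383] LustreError: 17190:0:(import.c:1267:ptlrpc_connect_interpret()) "
    "lustre-MDT0000_UUID went back in time (transno 4295093706 was previously "
    "committed, server now claims 4295093699)!"
)

print(transno_regression(line))
```

For the line in this ticket the gap comes out as 7 (4295093706 - 4295093699).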