[LU-2270] failure in replay-dual.sh test_0b: class_newdev() Device lustre-MDT0000-mdc already exists at 3, won't add: rc = -17 Created: 03/Nov/12 Updated: 27/Nov/12 Resolved: 27/Nov/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Maloo | Assignee: | Alex Zhuravlev |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Environment: |
lustre master build #1011 SLES11 SP2 client |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 5429 | ||||||||
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>. This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/47f07908-253d-11e2-9e7c-52540035b04c. The sub-test test_0b failed with the following error:
Client 2 dmesg shows:
[ 9000.375340] Lustre: Mounted lustre-client
[ 9000.393643] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre2
[ 9000.399956] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,acl,flock client-26vm7@tcp:/lustre /mnt/lustre2
[ 9000.414481] LustreError: 18958:0:(genops.c:316:class_newdev()) Device lustre-MDT0000-mdc-ffff880079c03400 already exists at 3, won't add
[ 9000.414489] LustreError: 18958:0:(obd_config.c:374:class_attach()) Cannot create device lustre-MDT0000-mdc-ffff880079c03400 of type mdc : -17
[ 9000.414500] LustreError: 18958:0:(obd_config.c:1546:class_config_llog_handler()) MGC10.10.4.154@tcp: cfg command failed: rc = -17
[ 9000.414511] Lustre: cmd=cf001 0:lustre-MDT0000-mdc 1:mdc 2:lustre-clilmv_UUID
[ 9000.414581] LustreError: 15c-8: MGC10.10.4.154@tcp: The configuration from log 'lustre-client' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[ 9000.414588] LustreError: 18955:0:(llite_lib.c:1012:ll_fill_super()) Unable to process log: -17
[ 9000.415048] LustreError: 18955:0:(obd_mount.c:2987:lustre_fill_super()) Unable to mount (-17)
[ 9000.822327] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-dual test_0b: @@@@@@ FAIL: mount2 fais
[ 9000.877904] Lustre: DEBUG MARKER: replay-dual test_0b: @@@@@@ FAIL: mount2 fais
[ 9001.019498] Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2012-11-01/lustre-master-el6-x86_64-sles11sp2-x86_64__1011__-69983126486880-163552/replay-dual.test_0b.debug_log.$(hostname -s).1351848424.log;
[ 9001.019502] dmesg > /logdir/test_logs/2012-11-01/lustre-master-el6-x86_64-sl |
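For readers triaging similar logs: the return codes in the messages above are negated Linux errno values, so -17 is EEXIST ("File exists"), which matches the "already exists" message from class_newdev(). The following is a minimal sketch, not part of the original report, of how a trailing return code could be decoded from such a line. The helper name `decode_rc` and the regular expression are hypothetical and only handle a code at the end of a line (forms like `rc = -17`, `: -17`, or `(-17)`).

```python
import errno
import re

# Hypothetical pattern: matches a negative return code at the end of a line,
# written as "rc = -17", ": -17", or "(-17)".
RC_PATTERN = re.compile(r"(?:rc = |: |\()(-\d+)\)?\s*$")


def decode_rc(line):
    """Return the symbolic errno name for a trailing return code, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = -int(match.group(1))  # log shows the negated errno value
    return errno.errorcode.get(code, "UNKNOWN")


if __name__ == "__main__":
    sample = (
        "LustreError: 18955:0:(obd_mount.c:2987:lustre_fill_super()) "
        "Unable to mount (-17)"
    )
    print(decode_rc(sample))
```

Lines without a trailing code, such as the "won't add" message itself, yield `None` in this sketch.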
| Comments |
| Comment by Sarah Liu [ 05/Nov/12 ] |
|
Another failure:
[ 3940.919180] Lustre: Unmounted lustre-client
[ 3958.686052] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre
[ 3958.693235] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,acl,flock fat-intel-3vm7@tcp:/lustre /mnt/lustre
[ 3958.702476] LustreError: 152-6: Ignoring deprecated mount option 'acl'.
[ 3958.703435] Lustre: MGC10.10.4.92@tcp: Reactivating import
[ 3958.704833] LustreError: 28998:0:(mgc_request.c:248:do_config_log_add()) failed processing sptlrpc log: -2
[ 3958.708656] LustreError: 29002:0:(genops.c:316:class_newdev()) Device lustre-MDT0000-mdc-ffff8800699d0000 already exists at 13, won't add
[ 3958.708661] LustreError: 29002:0:(obd_config.c:374:class_attach()) Cannot create device lustre-MDT0000-mdc-ffff8800699d0000 of type mdc : -17
[ 3958.708664] LustreError: 29002:0:(obd_config.c:1546:class_config_llog_handler()) MGC10.10.4.92@tcp: cfg command failed: rc = -17
[ 3958.708669] Lustre: cmd=cf001 0:lustre-MDT0000-mdc 1:mdc 2:lustre-clilmv_UUID
[ 3958.708692] LustreError: 15c-8: MGC10.10.4.92@tcp: The configuration from log 'lustre-client' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[ 3958.708696] LustreError: 28998:0:(llite_lib.c:1012:ll_fill_super()) Unable to process log: -17
[ 3958.708803] Lustre: Unmounted lustre-client |
| Comment by Jodi Levi (Inactive) [ 06/Nov/12 ] |
|
Alex, |
| Comment by Sarah Liu [ 07/Nov/12 ] |
|
Failure found in replay-vbr. Both server and client are running RHEL6: https://maloo.whamcloud.com/test_sets/421702b0-2838-11e2-9c12-52540035b04c
Client 1 dmesg shows:
Lustre: DEBUG MARKER: == replay-vbr test 7c: create, {lost}, rename == 15:57:46 (1352159866)
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre2
Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,acl,flock fat-intel-3vm3@tcp:/lustre /mnt/lustre2
LustreError: 22925:0:(genops.c:316:class_newdev()) Device lustre-MDT0000-mdc-ffff88007d0fe400 already exists at 3, won't add
LustreError: 22925:0:(obd_config.c:374:class_attach()) Cannot create device lustre-MDT0000-mdc-ffff88007d0fe400 of type mdc : -17
LustreError: 22925:0:(obd_config.c:1546:class_config_llog_handler()) MGC10.10.4.88@tcp: cfg command failed: rc = -17
Lustre: cmd=cf001 0:lustre-MDT0000-mdc 1:mdc 2:lustre-clilmv_UUID
LustreError: 15c-8: MGC10.10.4.88@tcp: The configuration from log 'lustre-client' failed (-17). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 22922:0:(llite_lib.c:1012:ll_fill_super()) Unable to process log: -17
LustreError: 22922:0:(obd_mount.c:2987:lustre_fill_super()) Unable to mount (-17)
Lustre: DEBUG MARKER: mcreate /mnt/lustre/fsa-$(hostname); rm /mnt/lustre/fsa-$(hostname)
Lustre: DEBUG MARKER: if [ -d /mnt/lustre2 ]; then mcreate /mnt/lustre2/fsa-$(hostname); rm /mnt/lustre2/fsa-$(hostname); fi
Lustre: DEBUG MARKER: rm /mnt/lustre2/d0.replay-vbr/d7/f.replay-vbr.7c-0; createmany -o /mnt/lustre2/d0.replay-vbr/d7/f.replay-vbr.7c- 1
Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-vbr test_7c: @@@@@@ FAIL: test_7c.1: Cannot do \'lost\' operations
Lustre: DEBUG MARKER: replay-vbr test_7c: @@@@@@ FAIL: test_7c.1: Cannot do 'lost' operations
Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2012-11-05/lustre-master-el6-x86_64__1018__-69983081643020-022304/replay-vbr.test_7c.debug_log.$(hostname -s).1352159868.log;
dmesg > /logdir/test_logs/2012-11-05/lustre-master-el6-x86_64__1018__-699830816430
|
| Comment by Andreas Dilger [ 20/Nov/12 ] |
|
Alex, can you please take a quick look at this to figure out if it should remain a 2.4.0 blocker? It is currently on the priority list to fix before we open master for landing again. |
| Comment by Alex Zhuravlev [ 20/Nov/12 ] |
|
Not absolutely sure, but this is probably the mountpoint being pinned by a leaked request (http://jira.whamcloud.com/browse/LU-2275). I wouldn't make this a blocker. |
| Comment by Andreas Dilger [ 27/Nov/12 ] |
|
The most recent report of this bug causing a failure in Maloo is from 2012-11-06 (https://maloo.whamcloud.com/test_sets/421702b0-2838-11e2-9c12-52540035b04c), and there are only three reports of the failure in Maloo at all, so it probably does not need to be a blocker given that it hasn't hit in some time (unless failures are no longer being triaged to this bug). |
| Comment by Andreas Dilger [ 27/Nov/12 ] |
|
Linking to |
| Comment by Andreas Dilger [ 27/Nov/12 ] |
|
Marking as a duplicate of |