[LU-3347] (local_storage.c:872:local_oid_storage_init()) ASSERTION( (*los)->los_last_oid >= first_oid ) failed: 0 < 1 Created: 15/May/13 Updated: 21/Oct/13 Resolved: 16/Sep/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.5.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Keith Mannthey (Inactive) | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: |
LBUG encountered during normal review testing |
||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 8276 |
| Description |
|
This is from conf-sanity test_32a. There is lots of other badness going on in conf-sanity and I am not sure how often THIS particular error occurs. It may be related to

The test run: https://maloo.whamcloud.com/test_sets/32b2ffc4-bd3c-11e2-9324-52540035b04c

Highlighted LBUG:

21:39:35:Lustre: DEBUG MARKER: mount -t lustre -o loop,mgsnode=10.10.4.198@tcp /tmp/t32/ost /tmp/t32/mnt/ost
21:39:35:LDISKFS-fs (loop1): mounted filesystem with ordered data mode. quota=off. Opts:
21:39:35:LustreError: 23362:0:(local_storage.c:872:local_oid_storage_init()) ASSERTION( (*los)->los_last_oid >= first_oid ) failed: 0 < 1
21:39:35:LustreError: 23362:0:(local_storage.c:872:local_oid_storage_init()) LBUG
21:39:35:Pid: 23362, comm: mount.lustre
21:39:35:
21:39:35:Call Trace:
21:39:35: [<ffffffffa0478895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
21:39:35: [<ffffffffa0478e97>] lbug_with_loc+0x47/0xb0 [libcfs]
21:39:35: [<ffffffffa05ca646>] local_oid_storage_init+0x426/0xe50 [obdclass]
21:39:35: [<ffffffffa05a3660>] llog_osd_setup+0xc0/0x360 [obdclass]
21:39:35: [<ffffffffa05a0162>] llog_setup+0x352/0x920 [obdclass]
21:39:35: [<ffffffffa0d3508b>] mgc_set_info_async+0x12eb/0x1970 [mgc]
21:39:35: [<ffffffffa04892c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
21:39:35: [<ffffffffa0607f70>] server_mgc_set_fs+0x120/0x520 [obdclass]
21:39:35: [<ffffffffa060e9a5>] server_start_targets+0x85/0x19c0 [obdclass]
21:39:35: [<ffffffffa0483d88>] ? libcfs_log_return+0x28/0x40 [libcfs]
21:39:35: [<ffffffffa05dfc40>] ? lustre_start_mgc+0x4e0/0x1ee0 [obdclass]
21:39:35: [<ffffffffa04892c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
21:39:35: [<ffffffffa0610e8c>] server_fill_super+0xbac/0x1660 [obdclass]
21:39:35: [<ffffffffa05e1818>] lustre_fill_super+0x1d8/0x530 [obdclass]
21:39:35: [<ffffffffa05e1640>] ? lustre_fill_super+0x0/0x530 [obdclass]
21:39:35: [<ffffffff811842bf>] get_sb_nodev+0x5f/0xa0
21:39:35: [<ffffffffa05d91b5>] lustre_get_sb+0x25/0x30 [obdclass]
21:39:35: [<ffffffff811838fb>] vfs_kern_mount+0x7b/0x1b0
21:39:35: [<ffffffff81183aa2>] do_kern_mount+0x52/0x130
21:39:35: [<ffffffff811a3cf2>] do_mount+0x2d2/0x8d0
21:39:35: [<ffffffff811a4380>] sys_mount+0x90/0xe0
21:39:35: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b |
| Comments |
| Comment by Jodi Levi (Inactive) [ 15/May/13 ] |
|
Mike, |
| Comment by Andreas Dilger [ 15/May/13 ] |
|
Keith, if this is repeatable, could you please submit a quick patch to change the LASSERT() to LASSERTF() and print out the actual values in this condition? That would make debugging this much easier. This LASSERT() was just recently added in |
| Comment by Andreas Dilger [ 15/May/13 ] |
|
Keith, also, if you file a bug related to a failure in Maloo, please "Associate" the bug with the failed test, and search all of the other recent failures of the same test (e.g. in the past 2 weeks) and Associate the same bug with those as well. This is easily done in Maloo with Results->Search->Name=conf-sanity,Status=TIMEOUT,ResultsWithin=2weeks and then looking to see which ones failed in test_32a and verifying those have the same ASSERT failure in the MDS console log. |
| Comment by Keith Mannthey (Inactive) [ 15/May/13 ] |
|
I am not sure if it will reproduce or not. It is a pretty large patch that triggered it, and there are a lot of timeout errors with this conf-sanity run and this test_32a. http://review.whamcloud.com/5512 is the patch set. The patch seems like it could have caused it, but with so many timeouts in this test I opened the LU to track the issue. Maloo tells me that in the last 4 weeks (master review ldiskfs) there have been 3 occurrences, all in the last 24 hours and none before that... 2 were review-dne and this one. So far it has been a one-shot issue with a large patch set on ldiskfs/master. I can submit the assert change if you want it in master. |
| Comment by Keith Mannthey (Inactive) [ 16/May/13 ] |
|
I "associated" the single issue. With review-dne I didn't see a way to do so, but the error messages were the same for the 2 I see. |
| Comment by Mikhail Pershin [ 16/May/13 ] |
|
Keith, please rebase that patch again; I've fixed this issue in http://review.whamcloud.com/5049, which is at the top of the patch set. |
| Comment by Keith Mannthey (Inactive) [ 20/May/13 ] |
|
I did not hit the issue again when I did a retest. |