Imperative recovery bugs go here (LU-1252)

[LU-1701] CLONE - mgc_apply_recover_logs() ASSERTION(entry->mne_length <= CFS_PAGE_SIZE) Created: 02/Aug/12  Updated: 06/Aug/12  Resolved: 06/Aug/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.3.0
Fix Version/s: None

Type: Technical task Priority: Blocker
Reporter: Andreas Dilger Assignee: Keith Mannthey (Inactive)
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Kernel 2.6.32-220.4.2.bgq.2012_0308_0004.2llnl.bgq62.ppc64


Issue Links:
Duplicate
duplicates LU-1644 lustre b2_2<->master failure on lustr... Resolved
Story Points: 2
Rank (Obsolete): 2207

 Description   

A BGQ I/O node hits this assertion when mounting the lustre/orion filesystem. The filesystem was reformatted earlier this week. The client node is running ppc64 lustre/orion.

2012-03-29 09:26:20.427354 LustreError: 11-0: MGC172.20.5.1@o2ib500: Communicating with 172.20.5.1@o2ib500, operation llog_origin_handle_create failed with -2
2012-03-29 09:26:20.427798 LustreError: 3396:0:(mgc_request.c:251:do_config_log_add()) failed processing sptlrpc log: -2
2012-03-29 09:26:21.040833 LustreError: 3396:0:(mgc_request.c:1164:mgc_apply_recover_logs()) ASSERTION(entry->mne_length <= CFS_PAGE_SIZE) failed
2012-03-29 09:26:21.041260 LustreError: 3396:0:(mgc_request.c:1164:mgc_apply_recover_logs()) LBUG
2012-03-29 09:26:21.041721 Call Trace:
2012-03-29 09:26:21.042293 [c0000003c697af70] [c000000000008190] .show_stack+0x7c/0x184
2012-03-29 09:26:21.042743  (unreliable)
2012-03-29 09:26:21.043657 [c0000003c697b020] [8000000000800a5c] .libcfs_debug_dumpstack+0x5c/0x70 [libcfs]
2012-03-29 09:26:21.044615 [c0000003c697b0a0] [8000000000800ec0] .lbug_with_loc+0x50/0xc0 [libcfs]
2012-03-29 09:26:21.045557 [c0000003c697b130] [800000000080dd54] .libcfs_assertion_failed+0x34/0x40 [libcfs]
2012-03-29 09:26:21.046441 [c0000003c697b1b0] [8000000005a82eb8] .mgc_apply_recover_logs+0x1028/0x1390 [mgc]
2012-03-29 09:26:21.047393 [c0000003c697b370] [8000000005a84e7c] .mgc_process_log+0xf4c/0x1450 [mgc]
2012-03-29 09:26:21.048330 [c0000003c697b520] [8000000005a86908] .mgc_process_config+0x7c8/0xe00 [mgc]
2012-03-29 09:26:21.049228 [c0000003c697b600] [8000000001ff525c] .lustre_log_process+0xa1c/0x1050 [obdclass]
2012-03-29 09:26:21.050184 [c0000003c697b730] [800000000581d4b0] .ll_fill_super+0x990/0x5370 [lustre]
2012-03-29 09:26:21.051121 [c0000003c697b8a0] [8000000001ffbee4] .lustre_mount+0x694/0x880 [obdclass]
2012-03-29 09:26:21.052002 [c0000003c697b960] [8000000001fb1f2c] .lustre_fill_super+0x1c/0x30 [obdclass]
2012-03-29 09:26:21.052964 [c0000003c697b9d0] [c0000000000c03c4] .get_sb_nodev+0x84/0xe8
2012-03-29 09:26:21.053903 [c0000003c697ba80] [8000000001fb1ef8] .lustre_get_sb+0x28/0x40 [obdclass]
2012-03-29 09:26:21.054782 [c0000003c697bb10] [c0000000000beb68] .vfs_kern_mount+0x80/0x114
2012-03-29 09:26:21.055814 [c0000003c697bbc0] [c0000000000bec64] .do_kern_mount+0x58/0x130
2012-03-29 09:26:21.056732 [c0000003c697bc80] [c0000000000dd9d4] .do_mount+0x8c8/0x984
2012-03-29 09:26:21.057564 [c0000003c697bd70] [c0000000000ddb48] .SyS_mount+0xb8/0x124
2012-03-29 09:26:21.058509 [c0000003c697be30] [c000000000000580] syscall_exit+0x0/0x2c
2012-03-29 09:26:21.059430 Kernel panic - not syncing: LBUG
2012-03-29 09:26:21.059889 Call Trace:
2012-03-29 09:26:21.060305 [c0000003c697af60] [c000000000008190] .show_stack+0x7c/0x184
2012-03-29 09:26:21.060801  (unreliable)
2012-03-29 09:26:21.061764 [c0000003c697b010] [c000000000412a7c] .panic+0x80/0x1a8
2012-03-29 09:26:21.062683 [c0000003c697b0a0] [8000000000800f20] .lbug_with_loc+0xb0/0xc0 [libcfs]
2012-03-29 09:26:21.063676 [c0000003c697b130] [800000000080dd54] .libcfs_assertion_failed+0x34/0x40 [libcfs]
2012-03-29 09:26:21.064616 [c0000003c697b1b0] [8000000005a82eb8] .mgc_apply_recover_logs+0x1028/0x1390 [mgc]
2012-03-29 09:26:21.065534 [c0000003c697b370] [8000000005a84e7c] .mgc_process_log+0xf4c/0x1450 [mgc]
2012-03-29 09:26:21.066428 [c0000003c697b520] [8000000005a86908] .mgc_process_config+0x7c8/0xe00 [mgc]
2012-03-29 09:26:21.067373 [c0000003c697b600] [8000000001ff525c] .lustre_log_process+0xa1c/0x1050 [obdclass]
2012-03-29 09:26:21.068294 [c0000003c697b730] [800000000581d4b0] .ll_fill_super+0x990/0x5370 [lustre]
2012-03-29 09:26:21.069232 [c0000003c697b8a0] [8000000001ffbee4] .lustre_mount+0x694/0x880 [obdclass]
2012-03-29 09:26:21.070241 [c0000003c697b960] [8000000001fb1f2c] .lustre_fill_super+0x1c/0x30 [obdclass]
2012-03-29 09:26:21.071132 [c0000003c697b9d0] [c0000000000c03c4] .get_sb_nodev+0x84/0xe8
2012-03-29 09:26:21.071984 [c0000003c697ba80] [8000000001fb1ef8] .lustre_get_sb+0x28/0x40 [obdclass]
2012-03-29 09:26:21.085900 [c0000003c697bb10] [c0000000000beb68] .vfs_kern_mount+0x80/0x114
2012-03-29 09:26:21.086893 [c0000003c697bbc0] [c0000000000bec64] .do_kern_mount+0x58/0x130
2012-03-29 09:26:21.087828 [c0000003c697bc80] [c0000000000dd9d4] .do_mount+0x8c8/0x984
2012-03-29 09:26:21.088936 [c0000003c697bd70] [c0000000000ddb48] .SyS_mount+0xb8/0x124
2012-03-29 09:26:21.089646 [c0000003c697be30] [c000000000000580] syscall_exit+0x0/0x2c


 Comments   
Comment by Andreas Dilger [ 02/Aug/12 ]

Cloned from ORI-609. Raising to blocker because of possible interoperability issues between the current 2.3 master code and 2.2.

Comment by Andreas Dilger [ 02/Aug/12 ]

The original patch, http://review.whamcloud.com/2410, has landed on master.

Comment by Andreas Dilger [ 06/Aug/12 ]

Closed as a duplicate of LU-1644, where a compatible fix for this problem is being discussed.
