[LU-2958] LBUG triggered in seq_client_alloc_fid() - ASSERTION( seq != ((void *)0) ) failed
Created: 13/Mar/13 | Updated: 20/Mar/13 | Resolved: 13/Mar/13 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Prakash Surya (Inactive) | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | sequoia, topsequoia |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 7213 |
| Description |
|
We're continually hitting this LBUG when running 2.3.62-2chaos.

2013-03-13 10:28:27.644618 {DefaultControlEventListener} [mmcs]{8}.5.1: LustreError: 12943:0:(fid_request.c:329:seq_client_alloc_fid()) ASSERTION( seq != ((void *)0) ) failed:
2013-03-13 10:28:27.645378 {DefaultControlEventListener} [mmcs]{8}.5.1: LustreError: 12943:0:(fid_request.c:329:seq_client_alloc_fid()) LBUG
2013-03-13 10:28:27.645734 {DefaultControlEventListener} [mmcs]{8}.5.1: Call Trace:
2013-03-13 10:28:27.646114 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b3b0] [c000000000008160] .show_stack+0x7c/0x184 (unreliable)
2013-03-13 10:28:27.646480 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b460] [8000000000420cb8] .libcfs_debug_dumpstack+0xd8/0x150 [libcfs]
2013-03-13 10:28:27.646853 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b510] [8000000000421480] .lbug_with_loc+0x50/0xc0 [libcfs]
2013-03-13 10:28:27.647213 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b5a0] [8000000001958b80] .seq_client_alloc_fid+0x4f0/0x740 [fid]
2013-03-13 10:28:27.647576 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b6c0] [8000000001a4e310] .mdc_fid_alloc+0x140/0x1e0 [mdc]
2013-03-13 10:28:27.647944 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b770] [8000000001a65818] .mdc_intent_lock+0x508/0x838 [mdc]
2013-03-13 10:28:27.648345 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b8f0] [8000000001ce72e0] .ll_lookup_it+0x450/0xfb0 [lustre]
2013-03-13 10:28:27.648774 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268ba70] [8000000001ce7f10] .ll_lookup_nd+0xd0/0x580 [lustre]
2013-03-13 10:28:27.649207 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268bb30] [c0000000000df10c] .__lookup_hash+0x180/0x1c8
2013-03-13 10:28:27.649699 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268bbd0] [c0000000000e3690] .do_filp_open+0x260/0xadc
2013-03-13 10:28:27.650185 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268bd80] [c0000000000d1ca8] .do_sys_open+0x8c/0x18c
2013-03-13 10:28:27.650632 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268be30] [c000000000000580] syscall_exit+0x0/0x2c
2013-03-13 10:28:27.651100 {DefaultControlEventListener} [mmcs]{8}.5.1: Kernel panic - not syncing: LBUG
2013-03-13 10:28:27.651554 {DefaultControlEventListener} [mmcs]{8}.5.1: Call Trace:
2013-03-13 10:28:27.652039 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b3d0] [c000000000008160] .show_stack+0x7c/0x184 (unreliable)
2013-03-13 10:28:27.652453 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b480] [c0000000004557cc] .panic+0xb8/0x1e0
2013-03-13 10:28:27.652880 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b510] [80000000004214e0] .lbug_with_loc+0xb0/0xc0 [libcfs]
2013-03-13 10:28:27.653312 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b5a0] [8000000001958b80] .seq_client_alloc_fid+0x4f0/0x740 [fid]
2013-03-13 10:28:27.653734 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b6c0] [8000000001a4e310] .mdc_fid_alloc+0x140/0x1e0 [mdc]
2013-03-13 10:28:27.654166 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b770] [8000000001a65818] .mdc_intent_lock+0x508/0x838 [mdc]
2013-03-13 10:28:27.654595 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268b8f0] [8000000001ce72e0] .ll_lookup_it+0x450/0xfb0 [lustre]
2013-03-13 10:28:27.655016 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268ba70] [8000000001ce7f10] .ll_lookup_nd+0xd0/0x580 [lustre]
2013-03-13 10:28:27.655440 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268bb30] [c0000000000df10c] .__lookup_hash+0x180/0x1c8
2013-03-13 10:28:27.655861 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268bbd0] [c0000000000e3690] .do_filp_open+0x260/0xadc
2013-03-13 10:28:27.656281 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268bd80] [c0000000000d1ca8] .do_sys_open+0x8c/0x18c
2013-03-13 10:28:27.656711 {DefaultControlEventListener} [mmcs]{8}.5.1: [c0000003c268be30] [c000000000000580] syscall_exit+0x0/0x2c
2013-03-13 10:28:27.657108 {DefaultControlEventListener} [mmcs]{8}.5.1: LustreError: dumping log to /tmp/lustre-log.1363195707.12943
|
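When comparing this trace against other reports of the same assertion, it can help to reduce the console log to its bare call chain. The following is a minimal sketch that assumes the frame layout seen in the log above (`[stack pointer] [address] .function+0xoffset/0xsize [module]`). The names `FRAME_RE` and `parse_frames`, and the `"kernel"` label for frames without a module tag, are illustrative choices, not part of any Lustre tooling.

```python
import re

# Matches stack frames as they appear in the console log above, e.g.
#   [c0000003c268b5a0] [8000000001958b80] .seq_client_alloc_fid+0x4f0/0x740 [fid]
# The leading dot and the trailing [module] tag are both optional.
FRAME_RE = re.compile(
    r"\[(?P<sp>[0-9a-f]{16})\] \[(?P<pc>[0-9a-f]{16})\] "
    r"\.?(?P<func>[A-Za-z_]\w*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (function, module) pairs for every stack frame in log_text.

    Frames without a [module] tag are labeled "kernel"; that label is an
    assumption made for this sketch.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("func"), match.group("module") or "kernel"))
    return frames
```

Applied to the first call trace in the description, this yields the chain from `show_stack` down to `syscall_exit`, with the `fid`, `mdc`, and `lustre` frames tagged by module.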
| Comments |
| Comment by Andreas Dilger [ 13/Mar/13 ] |
|
Prakash, any info on what kind of workload/operation is triggering this problem? |
| Comment by Prakash Surya (Inactive) [ 13/Mar/13 ] |
|
This is on one of our production machines, so I'm unsure what kind of workload the users are putting on the system. We just recently upgraded the clients to run the new 2.3.62 tag (from 2.3.58), which I'm sure is why we're seeing it now. Also, I just noticed that the clients are mounting a file system running 2.1.2-3chaos, so this isn't exactly a supported configuration (2.3.62 clients using 2.1.2 servers). |
| Comment by Di Wang [ 13/Mar/13 ] |
|
Duplicate of LU-2911. |
| Comment by Andreas Dilger [ 16/Mar/13 ] |
|
Prakash, it is my expectation that 2.4 clients will work with 2.1 servers. I thought the |
| Comment by Ned Bass [ 20/Mar/13 ] |
|
I think this is in fact the same problem as |