[LU-7820] jobs crash with llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5 Created: 26/Feb/16 Updated: 24/Jan/17 Resolved: 24/Jan/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Frank Heckes (Inactive) | Assignee: | WC Triage |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | soak | ||
| Environment: |
lola |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The error happens during soak testing of build '20160224' (b2_8 RC2) (see:
Application mdtest (1 file per process) jobs crash with the following errors:

JOBID ERROR MESSAGE
--
445604 : 201602 25 15:08:35 : Process 1(lola-31.lola.whamcloud.com): FAILED in main, Unable to change to test directory: Input/output error
--
445605 : 201602 25 15:07:42 : Process 3(lola-32.lola.whamcloud.com): FAILED in main, Unable to change to test directory: Input/output error
--
445415 : 201602 25 11:27:11 : Process 3(lola-34.lola.whamcloud.com): FAILED in main, Unable to change to test directory: Input/output error
--
445416 : 201602 25 11:28:45 : Process 3(lola-32.lola.whamcloud.com): FAILED in main, Unable to change to test directory: Input/output error
--
445270 : 201602 25 08:05:01 : Process 4(lola-31.lola.whamcloud.com): FAILED in main, Unable to change to test directory: Input/output error
--
445271 : 201602 25 08:04:34 : Process 1(lola-29.lola.whamcloud.com): FAILED in main, Unable to change to test directory: Input/output error

On MDS and client nodes the following Lustre errors can be correlated:

---- Incident 25 15:08:35 ----
lola-11.log:Feb 25 15:08:35 lola-11 kernel: Lustre: soaked-MDT0006: Connection restored to 300cd577-7ec5-3892-b093-9d631f897cda (at 192.168.1.131@o2ib100)
lola-11.log:Feb 25 15:08:35 lola-11 kernel: Lustre: Skipped 254 previous similar messages
lola-31.log:Feb 25 15:08:35 lola-31 kernel: LustreError: 167-0: soaked-MDT0006-mdc-ffff88086597e800: This client was evicted by soaked-MDT0006; in progress operations using this service will fail.
lola-31.log:Feb 25 15:08:35 lola-31 kernel: LustreError: 120434:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5
lola-31.log:Feb 25 15:08:35 lola-31 kernel: Lustre: soaked-MDT0006-mdc-ffff88086597e800: Connection restored to 192.168.1.111@o2ib10 (at 192.168.1.111@o2ib10)

---- Incident 25 15:07:42 ----
lola-32.log:Feb 25 15:07:42 lola-32 kernel: LustreError: 167-0: soaked-MDT0006-mdc-ffff88082f4c4000: This client was evicted by soaked-MDT0006; in progress operations using this service will fail.
lola-32.log:Feb 25 15:07:42 lola-32 kernel: LustreError: 133347:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5
lola-32.log:Feb 25 15:07:42 lola-32 kernel: LustreError: 133347:0:(llite_lib.c:2309:ll_prep_inode()) Skipped 2 previous similar messages
lola-32.log:Feb 25 15:07:42 lola-32 kernel: Lustre: soaked-MDT0006-mdc-ffff88082f4c4000: Connection restored to 192.168.1.111@o2ib10 (at 192.168.1.111@o2ib10)

---- Incident 25 11:27:11 ----
lola-31.log:Feb 25 11:27:11 lola-31 kernel: LustreError: 105033:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -4
lola-34.log:Feb 25 11:27:11 lola-34 kernel: LustreError: 167-0: soaked-MDT0002-mdc-ffff88102fa38000: This client was evicted by soaked-MDT0002; in progress operations using this service will fail.
lola-34.log:Feb 25 11:27:11 lola-34 kernel: LustreError: 105947:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5
lola-34.log:Feb 25 11:27:11 lola-34 kernel: Lustre: soaked-MDT0002-mdc-ffff88102fa38000: Connection restored to 192.168.1.109@o2ib10 (at 192.168.1.109@o2ib10)

---- Incident 25 11:28:45 ----
lola-32.log:Feb 25 11:28:45 lola-32 kernel: LustreError: 167-0: soaked-MDT0002-mdc-ffff88082f4c4000: This client was evicted by soaked-MDT0002; in progress operations using this service will fail.
lola-32.log:Feb 25 11:28:45 lola-32 kernel: LustreError: 117554:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5
lola-32.log:Feb 25 11:28:45 lola-32 kernel: Lustre: soaked-MDT0002-mdc-ffff88082f4c4000: Connection restored to 192.168.1.109@o2ib10 (at 192.168.1.109@o2ib10)
lola-32.log:Feb 25 11:28:45 lola-32 kernel: LustreError: 117554:0:(llite_lib.c:2309:ll_prep_inode()) Skipped 2 previous similar messages

---- Incident 25 08:05:01 ----
lola-31.log:Feb 25 08:05:01 lola-31 kernel: LustreError: 167-0: soaked-MDT0002-mdc-ffff88086597e800: This client was evicted by soaked-MDT0002; in progress operations using this service will fail.
lola-31.log:Feb 25 08:05:01 lola-31 kernel: LustreError: 89849:0:(file.c:180:ll_close_inode_openhandle()) soaked-clilmv-ffff88086597e800: inode [0x28000bf82:0x69f4:0x0] mdc close failed: rc = -5
lola-31.log:Feb 25 08:05:01 lola-31 kernel: LustreError: 91182:0:(llite_lib.c:2309:ll_prep_inode()) new_inode -fatal: rc -5
lola-31.log:Feb 25 08:05:01 lola-31 kernel: Lustre: soaked-MDT0002-mdc-ffff88086597e800: Connection restored to 192.168.1.109@o2ib10 (at 192.168.1.109@o2ib10)

---- Incident 25 08:04:34 ----
lola-29.log:Feb 25 08:04:34 lola-29 kernel: LustreError: 167-0: soaked-MDT0002-mdc-ffff880871eec800: This client was evicted by soaked-MDT0002; in progress operations using this service will fail.
lola-29.log:Feb 25 08:04:34 lola-29 kernel: LustreError: 1037:0:(file.c:180:ll_close_inode_openhandle()) soaked-clilmv-ffff880871eec800: inode [0x28000bf82:0x66f3:0x0] mdc close failed: rc = -5
lola-29.log:Feb 25 08:04:34 lola-29 kernel: LustreError: 1043:0:(vvp_io.c:1519:vvp_io_init()) soaked: refresh file layout [0x28000a816:0x1c0e2:0x0] error -5.
lola-29.log:Feb 25 08:04:34 lola-29 kernel: Lustre: soaked-MDT0002-mdc-ffff880871eec800: Connection restored to 192.168.1.109@o2ib10 (at 192.168.1.109@o2ib10)
lola-29.log:Feb 25 08:04:34 lola-29 kernel: LustreError: 1037:0:(file.c:180:ll_close_inode_openhandle()) Skipped 3 previous similar messages

The errors happened after the following failovers:
mds_failover : 2016-02-25 14:52:36,099 - 2016-02-25 14:59:44,541 lola-11
mds_failover : 2016-02-25 11:06:59,431 - 2016-02-25 11:16:18,956 lola-9
mds_failover : 2016-02-25 07:45:03,939 - 2016-02-25 07:54:18,970 lola-9

Is the eviction an expected part of the workflow? |
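The correlation between client evictions and the preceding mds_failover windows can be checked mechanically. The following is only a rough sketch, not part of the soak framework: the function names (`parse_evictions`, `preceding_failover`), the regular expression, and the assumption that all syslog timestamps fall in 2016 are illustrative choices. The sample lines and failover windows are copied from the description above.

```python
import re
from datetime import datetime

# Failover windows (start, end, node) copied from the description.
FAILOVERS = [
    ("2016-02-25 14:52:36,099", "2016-02-25 14:59:44,541", "lola-11"),
    ("2016-02-25 11:06:59,431", "2016-02-25 11:16:18,956", "lola-9"),
    ("2016-02-25 07:45:03,939", "2016-02-25 07:54:18,970", "lola-9"),
]

# Three eviction messages copied from the description (one per failover).
LOG_LINES = [
    "lola-31.log:Feb 25 15:08:35 lola-31 kernel: LustreError: 167-0: "
    "soaked-MDT0006-mdc-ffff88086597e800: This client was evicted by "
    "soaked-MDT0006; in progress operations using this service will fail.",
    "lola-34.log:Feb 25 11:27:11 lola-34 kernel: LustreError: 167-0: "
    "soaked-MDT0002-mdc-ffff88102fa38000: This client was evicted by "
    "soaked-MDT0002; in progress operations using this service will fail.",
    "lola-29.log:Feb 25 08:04:34 lola-29 kernel: LustreError: 167-0: "
    "soaked-MDT0002-mdc-ffff880871eec800: This client was evicted by "
    "soaked-MDT0002; in progress operations using this service will fail.",
]

# Hypothetical pattern for the grep-style "file:syslog line" format shown above.
EVICT_RE = re.compile(
    r"^\S+:(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) kernel: LustreError: 167-0: "
    r".*evicted by ([^;\s]+);"
)


def parse_evictions(lines, year=2016):
    """Return (timestamp, client, target) for every eviction message.

    Syslog lines carry no year, so it is supplied as an assumption.
    """
    evictions = []
    for line in lines:
        m = EVICT_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            evictions.append((ts, m.group(2), m.group(3)))
    return evictions


def preceding_failover(ts, failovers=FAILOVERS):
    """Return (end_time, node, delay) of the latest failover that ended before ts."""
    best = None
    for _start, end, node in failovers:
        end_ts = datetime.strptime(end, "%Y-%m-%d %H:%M:%S,%f")
        if end_ts <= ts and (best is None or end_ts > best[0]):
            best = (end_ts, node, ts - end_ts)
    return best


if __name__ == "__main__":
    for ts, client, target in parse_evictions(LOG_LINES):
        hit = preceding_failover(ts)
        if hit is None:
            print(f"{ts} {client} evicted by {target}: no earlier failover")
        else:
            _end, node, delay = hit
            print(f"{ts} {client} evicted by {target}: "
                  f"{delay.total_seconds():.0f}s after failover on {node}")
```

Run against the full client logs, a script along these lines would show whether every eviction follows a failover window by a similar interval, which helps narrow down the question above.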
| Comments |
| Comment by Cliff White (Inactive) [ 24/Jan/17 ] |
|
Old issue from 2.8 |