[LU-10844] sanity test_27z hang during rolling downgrade from 2.11 to 2.10.3 Created: 23/Mar/18  Updated: 11/May/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.11.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Sarah Liu
Assignee: WC Triage
Resolution: Unresolved
Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

During a rolling downgrade of the client and MDS from 2.11 (file system formatted under 2.10.3) to 2.10.3, sanity test_27z hung. The OSS remained at 2.11 (also formatted under 2.10.3). The following messages were found on the MDS. After unmounting the file system and restarting the subsequent test suites, the problem was not seen again.

[ 4736.420979] Lustre: DEBUG MARKER: == sanity test 27z: check SEQ/OID on the MDT and OST filesystems ===================================== 02:15:21 (1521684921)
[ 4742.183870] Lustre: DEBUG MARKER: check file /mnt/lustre/d27z.sanity/f27z.sanity-1
[ 4742.667467] Lustre: DEBUG MARKER: FID seq 0x200011571, oid 0x4662 ver 0x0
[ 4743.140965] Lustre: DEBUG MARKER: LOV seq 0x200011571, oid 0x4662, count: 1
[ 4743.623018] Lustre: DEBUG MARKER: want: stripe:0 ost:0 oid:809526/0xc5a36 seq:0
[ 4746.434545] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 10.2.2.48@tcp failed: rc = -107
[ 4746.449648] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.2.2.48@tcp) was lost; in progress operations using this service will wait for recovery to complete
[17500.566604] mce: [Hardware Error]: Machine check events logged
[17500.566686] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17500.566688] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
[17500.566689] {1}[Hardware Error]: event severity: corrected
[17500.566691] {1}[Hardware Error]: Error 0, type: corrected
[17500.566693] {1}[Hardware Error]: fru_text: DIMM ??
[17500.566694] {1}[Hardware Error]: section_type: memory error
[17500.566695] [Firmware Warn]: error section length is too small
[17500.643302] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17500.643306] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17500.643308] EDAC sbridge MC0: TSC 0 
[17500.643311] EDAC sbridge MC0: ADDR fa81e5000 
[17500.643313] EDAC sbridge MC0: MISC 90002000200048c 
[17500.643316] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697686 SOCKET 1 APIC 40
[17501.195345] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa81e5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17502.471597] mce: [Hardware Error]: Machine check events logged
[17502.471675] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17502.471676] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
[17502.471677] {2}[Hardware Error]: event severity: corrected
[17502.471678] {2}[Hardware Error]: Error 0, type: corrected
[17502.471680] {2}[Hardware Error]: fru_text: DIMM ??
[17502.471681] {2}[Hardware Error]: section_type: memory error
[17502.471681] [Firmware Warn]: error section length is too small
[17502.547250] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17502.547255] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17502.547257] EDAC sbridge MC0: TSC 0 
[17502.547259] EDAC sbridge MC0: ADDR fa8595000 
[17502.547263] EDAC sbridge MC0: MISC 90002000200048c 
[17502.547266] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697688 SOCKET 1 APIC 40
[17503.227328] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa8595 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17504.390627] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17504.390633] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17504.390636] EDAC sbridge MC0: TSC 0 
[17504.390638] EDAC sbridge MC0: ADDR fa8935000 
[17504.390641] EDAC sbridge MC0: MISC 90002000200048c 
[17504.390643] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697690 SOCKET 1 APIC 40
[17504.480491] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17504.480497] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17504.480499] EDAC sbridge MC0: TSC 0 
[17504.480502] EDAC sbridge MC0: ADDR fa8965000 
[17504.480504] EDAC sbridge MC0: MISC 90002000200048c 
[17504.480506] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697690 SOCKET 1 APIC 40
[17505.257328] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa8935 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17505.283795] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa8965 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17506.801585] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17506.801591] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17506.801593] EDAC sbridge MC0: TSC 0 
[17506.801596] EDAC sbridge MC0: ADDR fa8d45000 
[17506.801598] EDAC sbridge MC0: MISC 90002000200048c 
[17506.801600] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697692 SOCKET 1 APIC 40
[17506.801640] ghes_print_estatus: 2 callbacks suppressed
[17506.810011] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17506.821795] {3}[Hardware Error]: It has been corrected by h/w and requires no further action
[17506.833545] {3}[Hardware Error]: event severity: corrected
[17506.841941] {3}[Hardware Error]: Error 0, type: corrected
[17506.850283] {3}[Hardware Error]: fru_text: DIMM ??
[17506.857756] {3}[Hardware Error]: section_type: memory error
[17506.866137] [Firmware Warn]: error section length is too small
[17507.313326] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa8d45 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17508.712415] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17508.712421] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17508.712423] EDAC sbridge MC0: TSC 0 
[17508.712425] EDAC sbridge MC0: ADDR fa9095000 
[17508.712427] EDAC sbridge MC0: MISC 90002000200048c 
[17508.712430] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697694 SOCKET 1 APIC 40
[17508.712490] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17508.723899] {4}[Hardware Error]: It has been corrected by h/w and requires no further action
[17508.735546] {4}[Hardware Error]: event severity: corrected
[17508.743674] {4}[Hardware Error]: Error 0, type: corrected
[17508.751767] {4}[Hardware Error]: fru_text: DIMM ??
[17508.759000] {4}[Hardware Error]: section_type: memory error
[17508.767180] [Firmware Warn]: error section length is too small
[17508.790296] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17508.790303] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17508.790305] EDAC sbridge MC0: TSC 0 
[17508.790308] EDAC sbridge MC0: ADDR fa90b5000 
[17508.790310] EDAC sbridge MC0: MISC 90002000200048c 
[17508.790312] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697694 SOCKET 1 APIC 40
[17508.910332] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17508.910338] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17508.910340] EDAC sbridge MC0: TSC 0 
[17508.910342] EDAC sbridge MC0: ADDR fa90f5000 
[17508.910344] EDAC sbridge MC0: MISC 90002000200048c 
[17508.910346] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697694 SOCKET 1 APIC 40
[17509.343315] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9095 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
[17509.369179] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa90b5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17509.394808] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#1_DIMM#0 (channel:5 slot:0 page:0xfa90f5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:2 rank:0)
[17510.934471] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17510.934477] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17510.934479] EDAC sbridge MC0: TSC 0 
[17510.934482] EDAC sbridge MC0: ADDR fa9485000 
[17510.934484] EDAC sbridge MC0: MISC 90002000200048c 
[17510.934486] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697696 SOCKET 1 APIC 40
[17511.095088] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17511.095093] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17511.095096] EDAC sbridge MC0: TSC 0 
[17511.095098] EDAC sbridge MC0: ADDR fa94b5000 
[17511.095099] EDAC sbridge MC0: MISC 90002000200048c 
[17511.095102] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697696 SOCKET 1 APIC 40
[17511.423316] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9485 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
[17511.449450] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#1_DIMM#0 (channel:5 slot:0 page:0xfa94b5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:2 rank:0)
[17513.070547] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17513.070552] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17513.070555] EDAC sbridge MC0: TSC 0 
[17513.070557] EDAC sbridge MC0: ADDR fa9815000 
[17513.070559] EDAC sbridge MC0: MISC 90002000200048c 
[17513.070561] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697698 SOCKET 1 APIC 40
[17513.123604] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17513.123611] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17513.123613] EDAC sbridge MC0: TSC 0 
[17513.123615] EDAC sbridge MC0: ADDR fa9835000 
[17513.123617] EDAC sbridge MC0: MISC 90002000200048c 
[17513.123620] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697698 SOCKET 1 APIC 40
[17513.177477] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17513.177484] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17513.177486] EDAC sbridge MC0: TSC 0 
[17513.177489] EDAC sbridge MC0: ADDR fa9855000 
[17513.177491] EDAC sbridge MC0: MISC 90002000200048c 
[17513.177493] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697698 SOCKET 1 APIC 40
[17513.479315] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9815 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
[17513.505445] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa9835 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17513.531370] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa9855 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17515.333443] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17515.333449] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17515.333452] EDAC sbridge MC0: TSC 0 
[17515.333454] EDAC sbridge MC0: ADDR fa9c05000 
[17515.333456] EDAC sbridge MC0: MISC 90002000200048c 
[17515.333458] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
[17515.333518] ghes_print_estatus: 7 callbacks suppressed
[17515.341874] {5}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17515.353666] {5}[Hardware Error]: It has been corrected by h/w and requires no further action
[17515.360474] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17515.360477] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17515.360479] EDAC sbridge MC0: TSC 0 
[17515.360479] EDAC sbridge MC0: ADDR fa9c15000 
[17515.360481] EDAC sbridge MC0: MISC 90002000200048c 
[17515.360482] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
[17515.366408] {5}[Hardware Error]: event severity: corrected
[17515.374719] {5}[Hardware Error]: Error 0, type: corrected
[17515.383031] {5}[Hardware Error]: fru_text: DIMM ??
[17515.390516] {5}[Hardware Error]: section_type: memory error
[17515.398977] [Firmware Warn]: error section length is too small
[17515.474390] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17515.474396] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17515.474398] EDAC sbridge MC0: TSC 0 
[17515.474400] EDAC sbridge MC0: ADDR fa9c45000 
[17515.474402] EDAC sbridge MC0: MISC 90002000200048c 
[17515.474405] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
[17515.474450] {6}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17515.485731] {6}[Hardware Error]: It has been corrected by h/w and requires no further action
[17515.497710] {6}[Hardware Error]: event severity: corrected
[17515.506104] {6}[Hardware Error]: Error 0, type: corrected
[17515.514441] {6}[Hardware Error]: fru_text: DIMM ??
[17515.519350] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17515.519353] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17515.519356] EDAC sbridge MC0: TSC 0 
[17515.519357] EDAC sbridge MC0: ADDR fa9c55000 
[17515.519359] EDAC sbridge MC0: MISC 90002000200048c 
[17515.519359] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
[17515.522783] {6}[Hardware Error]: section_type: memory error
[17515.531179] [Firmware Warn]: error section length is too small
[17515.559307] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9c05 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
[17515.562537] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17515.562540] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17515.562542] EDAC sbridge MC0: TSC 0 
[17515.562543] EDAC sbridge MC0: ADDR fa9c65000 
[17515.562545] EDAC sbridge MC0: MISC 90002000200048c 
[17515.562545] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
[17515.586088] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa9c15 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17515.605394] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17515.605397] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17515.605399] EDAC sbridge MC0: TSC 0 
[17515.605400] EDAC sbridge MC0: ADDR fa9c75000 
[17515.605402] EDAC sbridge MC0: MISC 90002000200048c 
[17515.605403] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
[17515.612619] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa9c45 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17515.638195] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9c55 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
[17516.665303] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#1_DIMM#0 (channel:5 slot:0 page:0xfa9c65 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:2 rank:0)
[17516.692108] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa9c75 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17519.811483] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17519.811488] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17519.811491] EDAC sbridge MC0: TSC 0 
[17519.811493] EDAC sbridge MC0: ADDR faa385000 
[17519.811495] EDAC sbridge MC0: MISC 90002000200048c 
[17519.811497] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697705 SOCKET 1 APIC 40
[17520.727300] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaa385 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17522.347722] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17522.347728] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17522.347731] EDAC sbridge MC0: TSC 0 
[17522.347733] EDAC sbridge MC0: ADDR faa7c5000 
[17522.347735] EDAC sbridge MC0: MISC 90002000200048c 
[17522.347737] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697708 SOCKET 1 APIC 40
[17522.347783] ghes_print_estatus: 3 callbacks suppressed
[17522.356398] {7}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17522.368481] {7}[Hardware Error]: It has been corrected by h/w and requires no further action
[17522.380508] {7}[Hardware Error]: event severity: corrected
[17522.389182] {7}[Hardware Error]: Error 0, type: corrected
[17522.397801] {7}[Hardware Error]: fru_text: DIMM ??
[17522.405530] {7}[Hardware Error]: section_type: memory error
[17522.414171] [Firmware Warn]: error section length is too small
[17522.759299] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaa7c5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17524.226532] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17524.226537] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17524.226540] EDAC sbridge MC0: TSC 0 
[17524.226542] EDAC sbridge MC0: ADDR faab05000 
[17524.226544] EDAC sbridge MC0: MISC 90002000200048c 
[17524.226546] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697710 SOCKET 1 APIC 40
[17524.226612] {8}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[17524.238454] {8}[Hardware Error]: It has been corrected by h/w and requires no further action
[17524.250391] {8}[Hardware Error]: event severity: corrected
[17524.258781] {8}[Hardware Error]: Error 0, type: corrected
[17524.267115] {8}[Hardware Error]: fru_text: DIMM ??
[17524.274563] {8}[Hardware Error]: section_type: memory error
[17524.282938] [Firmware Warn]: error section length is too small
[17524.402614] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17524.402620] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17524.402622] EDAC sbridge MC0: TSC 0 
[17524.402625] EDAC sbridge MC0: ADDR faab65000 
[17524.402627] EDAC sbridge MC0: MISC 90002000200048c 
[17524.402629] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697710 SOCKET 1 APIC 40
[17524.789294] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaab05 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17524.815442] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfaab65 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17526.670601] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17526.670607] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17526.670609] EDAC sbridge MC0: TSC 0 
[17526.670611] EDAC sbridge MC0: ADDR faaf45000 
[17526.670613] EDAC sbridge MC0: MISC 90002000200048c 
[17526.670615] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697712 SOCKET 1 APIC 40
[17526.695370] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17526.695376] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17526.695378] EDAC sbridge MC0: TSC 0 
[17526.695380] EDAC sbridge MC0: ADDR faaf55000 
[17526.695382] EDAC sbridge MC0: MISC 90002000200048c 
[17526.695384] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697712 SOCKET 1 APIC 40
[17526.745348] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17526.745354] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17526.745356] EDAC sbridge MC0: TSC 0 
[17526.745358] EDAC sbridge MC0: ADDR faaf75000 
[17526.745360] EDAC sbridge MC0: MISC 90002000200048c 
[17526.745363] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697712 SOCKET 1 APIC 40
[17526.845288] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaaf45 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
[17526.871716] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfaaf55 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
[17526.897945] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfaaf75 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17528.579406] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17528.579412] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17528.579415] EDAC sbridge MC0: TSC 0 
[17528.579417] EDAC sbridge MC0: ADDR fab285000 
[17528.579419] EDAC sbridge MC0: MISC 90002000200048c 
[17528.579422] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
[17528.645721] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17528.645727] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17528.645729] EDAC sbridge MC0: TSC 0 
[17528.645732] EDAC sbridge MC0: ADDR fab2a5000 
[17528.645734] EDAC sbridge MC0: MISC 90002000200048c 
[17528.645736] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
[17528.662703] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17528.662709] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17528.662711] EDAC sbridge MC0: TSC 0 
[17528.662714] EDAC sbridge MC0: ADDR fab2b5000 
[17528.662716] EDAC sbridge MC0: MISC 90002000200048c 
[17528.662719] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
[17528.720528] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17528.720534] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17528.720536] EDAC sbridge MC0: TSC 0 
[17528.720538] EDAC sbridge MC0: ADDR fab2d5000 
[17528.720540] EDAC sbridge MC0: MISC 90002000200048c 
[17528.720543] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
[17528.762541] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
[17528.762547] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
[17528.762549] EDAC sbridge MC0: TSC 0 
[17528.762551] EDAC sbridge MC0: ADDR fab2f5000 
[17528.762553] EDAC sbridge MC0: MISC 90002000200048c 
[17528.762556] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
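
Most of the log above is corrected-memory-error (EDAC/MCE) output that begins long after the last Lustre message, so it may help to separate the two when triaging. The following is only a rough sketch, not part of the original report: the helper name summarize() and the regular expressions are assumptions made for illustration, and the sample input is four lines copied verbatim from the MDS log above. On Linux, rc = -107 corresponds to ENOTCONN.

```python
import re
from collections import Counter

# Kernel log line: "[ 4746.434545] message..." (seconds since boot).
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
# EDAC summary line naming the DIMM label that reported a corrected error.
DIMM_RE = re.compile(r"CE memory scrubbing error on (\S+)")


def summarize(log_text):
    """Hypothetical helper: split a dmesg excerpt into Lustre messages
    and a per-DIMM count of corrected-error (CE) reports."""
    lustre = []
    dimm_counts = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, msg = float(m.group("ts")), m.group("msg")
        if msg.startswith(("Lustre:", "LustreError:")):
            lustre.append((ts, msg))
        else:
            d = DIMM_RE.search(msg)
            if d:
                dimm_counts[d.group(1)] += 1
    return lustre, dimm_counts


# Four lines taken verbatim from the MDS log in this ticket.
sample = """\
[ 4746.434545] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 10.2.2.48@tcp failed: rc = -107
[ 4746.449648] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.2.2.48@tcp) was lost; in progress operations using this service will wait for recovery to complete
[17501.195345] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa81e5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
[17503.227328] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa8595 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
"""

lustre, dimms = summarize(sample)
for ts, msg in lustre:
    print(f"{ts:12.6f}  {msg[:80]}")
for label, count in sorted(dimms.items()):
    print(f"{count:4d}  {label}")
```

Feeding the full MDS log through the same function would list the two Lustre messages around 4746 s separately from the per-DIMM corrected-error counts that start around 17500 s.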

