  1. Lustre
  2. LU-10844

sanity test_27z hangs during rolling downgrade from 2.11 to 2.10.3

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • Lustre 2.11.0
    • None
    • 3
    • 9223372036854775807

    Description

      During a rolling downgrade of the client and MDS from 2.11 (on a filesystem that was formatted as 2.10.3) to 2.10.3, sanity test_27z hung. The OSS remained at 2.11 (also formatted as 2.10.3). The following messages were found on the MDS. After unmounting the system and restarting the following test suites, the problem did not reappear.

      [ 4736.420979] Lustre: DEBUG MARKER: == sanity test 27z: check SEQ/OID on the MDT and OST filesystems ===================================== 02:15:21 (1521684921)
      [ 4742.183870] Lustre: DEBUG MARKER: check file /mnt/lustre/d27z.sanity/f27z.sanity-1
      [ 4742.667467] Lustre: DEBUG MARKER: FID seq 0x200011571, oid 0x4662 ver 0x0
      [ 4743.140965] Lustre: DEBUG MARKER: LOV seq 0x200011571, oid 0x4662, count: 1
      [ 4743.623018] Lustre: DEBUG MARKER: want: stripe:0 ost:0 oid:809526/0xc5a36 seq:0
      [ 4746.434545] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 10.2.2.48@tcp failed: rc = -107
      [ 4746.449648] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.2.2.48@tcp) was lost; in progress operations using this service will wait for recovery to complete
      [17500.566604] mce: [Hardware Error]: Machine check events logged
      [17500.566686] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17500.566688] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17500.566689] {1}[Hardware Error]: event severity: corrected
      [17500.566691] {1}[Hardware Error]: Error 0, type: corrected
      [17500.566693] {1}[Hardware Error]: fru_text: DIMM ??
      [17500.566694] {1}[Hardware Error]: section_type: memory error
      [17500.566695] [Firmware Warn]: error section length is too small
      [17500.643302] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17500.643306] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17500.643308] EDAC sbridge MC0: TSC 0 
      [17500.643311] EDAC sbridge MC0: ADDR fa81e5000 
      [17500.643313] EDAC sbridge MC0: MISC 90002000200048c 
      [17500.643316] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697686 SOCKET 1 APIC 40
      [17501.195345] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa81e5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17502.471597] mce: [Hardware Error]: Machine check events logged
      [17502.471675] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17502.471676] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17502.471677] {2}[Hardware Error]: event severity: corrected
      [17502.471678] {2}[Hardware Error]: Error 0, type: corrected
      [17502.471680] {2}[Hardware Error]: fru_text: DIMM ??
      [17502.471681] {2}[Hardware Error]: section_type: memory error
      [17502.471681] [Firmware Warn]: error section length is too small
      [17502.547250] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17502.547255] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17502.547257] EDAC sbridge MC0: TSC 0 
      [17502.547259] EDAC sbridge MC0: ADDR fa8595000 
      [17502.547263] EDAC sbridge MC0: MISC 90002000200048c 
      [17502.547266] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697688 SOCKET 1 APIC 40
      [17503.227328] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa8595 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17504.390627] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17504.390633] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17504.390636] EDAC sbridge MC0: TSC 0 
      [17504.390638] EDAC sbridge MC0: ADDR fa8935000 
      [17504.390641] EDAC sbridge MC0: MISC 90002000200048c 
      [17504.390643] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697690 SOCKET 1 APIC 40
      [17504.480491] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17504.480497] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17504.480499] EDAC sbridge MC0: TSC 0 
      [17504.480502] EDAC sbridge MC0: ADDR fa8965000 
      [17504.480504] EDAC sbridge MC0: MISC 90002000200048c 
      [17504.480506] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697690 SOCKET 1 APIC 40
      [17505.257328] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa8935 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17505.283795] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa8965 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17506.801585] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17506.801591] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17506.801593] EDAC sbridge MC0: TSC 0 
      [17506.801596] EDAC sbridge MC0: ADDR fa8d45000 
      [17506.801598] EDAC sbridge MC0: MISC 90002000200048c 
      [17506.801600] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697692 SOCKET 1 APIC 40
      [17506.801640] ghes_print_estatus: 2 callbacks suppressed
      [17506.810011] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17506.821795] {3}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17506.833545] {3}[Hardware Error]: event severity: corrected
      [17506.841941] {3}[Hardware Error]: Error 0, type: corrected
      [17506.850283] {3}[Hardware Error]: fru_text: DIMM ??
      [17506.857756] {3}[Hardware Error]: section_type: memory error
      [17506.866137] [Firmware Warn]: error section length is too small
      [17507.313326] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa8d45 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17508.712415] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17508.712421] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17508.712423] EDAC sbridge MC0: TSC 0 
      [17508.712425] EDAC sbridge MC0: ADDR fa9095000 
      [17508.712427] EDAC sbridge MC0: MISC 90002000200048c 
      [17508.712430] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697694 SOCKET 1 APIC 40
      [17508.712490] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17508.723899] {4}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17508.735546] {4}[Hardware Error]: event severity: corrected
      [17508.743674] {4}[Hardware Error]: Error 0, type: corrected
      [17508.751767] {4}[Hardware Error]: fru_text: DIMM ??
      [17508.759000] {4}[Hardware Error]: section_type: memory error
      [17508.767180] [Firmware Warn]: error section length is too small
      [17508.790296] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17508.790303] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17508.790305] EDAC sbridge MC0: TSC 0 
      [17508.790308] EDAC sbridge MC0: ADDR fa90b5000 
      [17508.790310] EDAC sbridge MC0: MISC 90002000200048c 
      [17508.790312] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697694 SOCKET 1 APIC 40
      [17508.910332] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17508.910338] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17508.910340] EDAC sbridge MC0: TSC 0 
      [17508.910342] EDAC sbridge MC0: ADDR fa90f5000 
      [17508.910344] EDAC sbridge MC0: MISC 90002000200048c 
      [17508.910346] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697694 SOCKET 1 APIC 40
      [17509.343315] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9095 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
      [17509.369179] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa90b5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17509.394808] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#1_DIMM#0 (channel:5 slot:0 page:0xfa90f5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:2 rank:0)
      [17510.934471] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17510.934477] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17510.934479] EDAC sbridge MC0: TSC 0 
      [17510.934482] EDAC sbridge MC0: ADDR fa9485000 
      [17510.934484] EDAC sbridge MC0: MISC 90002000200048c 
      [17510.934486] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697696 SOCKET 1 APIC 40
      [17511.095088] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17511.095093] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17511.095096] EDAC sbridge MC0: TSC 0 
      [17511.095098] EDAC sbridge MC0: ADDR fa94b5000 
      [17511.095099] EDAC sbridge MC0: MISC 90002000200048c 
      [17511.095102] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697696 SOCKET 1 APIC 40
      [17511.423316] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9485 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
      [17511.449450] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#1_DIMM#0 (channel:5 slot:0 page:0xfa94b5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:2 rank:0)
      [17513.070547] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17513.070552] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17513.070555] EDAC sbridge MC0: TSC 0 
      [17513.070557] EDAC sbridge MC0: ADDR fa9815000 
      [17513.070559] EDAC sbridge MC0: MISC 90002000200048c 
      [17513.070561] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697698 SOCKET 1 APIC 40
      [17513.123604] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17513.123611] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17513.123613] EDAC sbridge MC0: TSC 0 
      [17513.123615] EDAC sbridge MC0: ADDR fa9835000 
      [17513.123617] EDAC sbridge MC0: MISC 90002000200048c 
      [17513.123620] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697698 SOCKET 1 APIC 40
      [17513.177477] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17513.177484] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17513.177486] EDAC sbridge MC0: TSC 0 
      [17513.177489] EDAC sbridge MC0: ADDR fa9855000 
      [17513.177491] EDAC sbridge MC0: MISC 90002000200048c 
      [17513.177493] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697698 SOCKET 1 APIC 40
      [17513.479315] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9815 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
      [17513.505445] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa9835 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17513.531370] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa9855 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17515.333443] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17515.333449] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17515.333452] EDAC sbridge MC0: TSC 0 
      [17515.333454] EDAC sbridge MC0: ADDR fa9c05000 
      [17515.333456] EDAC sbridge MC0: MISC 90002000200048c 
      [17515.333458] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
      [17515.333518] ghes_print_estatus: 7 callbacks suppressed
      [17515.341874] {5}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17515.353666] {5}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17515.360474] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17515.360477] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17515.360479] EDAC sbridge MC0: TSC 0 
      [17515.360479] EDAC sbridge MC0: ADDR fa9c15000 
      [17515.360481] EDAC sbridge MC0: MISC 90002000200048c 
      [17515.360482] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
      [17515.366408] {5}[Hardware Error]: event severity: corrected
      [17515.374719] {5}[Hardware Error]: Error 0, type: corrected
      [17515.383031] {5}[Hardware Error]: fru_text: DIMM ??
      [17515.390516] {5}[Hardware Error]: section_type: memory error
      [17515.398977] [Firmware Warn]: error section length is too small
      [17515.474390] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17515.474396] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17515.474398] EDAC sbridge MC0: TSC 0 
      [17515.474400] EDAC sbridge MC0: ADDR fa9c45000 
      [17515.474402] EDAC sbridge MC0: MISC 90002000200048c 
      [17515.474405] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
      [17515.474450] {6}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17515.485731] {6}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17515.497710] {6}[Hardware Error]: event severity: corrected
      [17515.506104] {6}[Hardware Error]: Error 0, type: corrected
      [17515.514441] {6}[Hardware Error]: fru_text: DIMM ??
      [17515.519350] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17515.519353] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17515.519356] EDAC sbridge MC0: TSC 0 
      [17515.519357] EDAC sbridge MC0: ADDR fa9c55000 
      [17515.519359] EDAC sbridge MC0: MISC 90002000200048c 
      [17515.519359] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
      [17515.522783] {6}[Hardware Error]: section_type: memory error
      [17515.531179] [Firmware Warn]: error section length is too small
      [17515.559307] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9c05 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
      [17515.562537] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17515.562540] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17515.562542] EDAC sbridge MC0: TSC 0 
      [17515.562543] EDAC sbridge MC0: ADDR fa9c65000 
      [17515.562545] EDAC sbridge MC0: MISC 90002000200048c 
      [17515.562545] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
      [17515.586088] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa9c15 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17515.605394] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17515.605397] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17515.605399] EDAC sbridge MC0: TSC 0 
      [17515.605400] EDAC sbridge MC0: ADDR fa9c75000 
      [17515.605402] EDAC sbridge MC0: MISC 90002000200048c 
      [17515.605403] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697701 SOCKET 1 APIC 40
      [17515.612619] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfa9c45 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17515.638195] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfa9c55 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
      [17516.665303] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#1_DIMM#0 (channel:5 slot:0 page:0xfa9c65 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:2 rank:0)
      [17516.692108] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfa9c75 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17519.811483] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17519.811488] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17519.811491] EDAC sbridge MC0: TSC 0 
      [17519.811493] EDAC sbridge MC0: ADDR faa385000 
      [17519.811495] EDAC sbridge MC0: MISC 90002000200048c 
      [17519.811497] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697705 SOCKET 1 APIC 40
      [17520.727300] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaa385 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17522.347722] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17522.347728] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17522.347731] EDAC sbridge MC0: TSC 0 
      [17522.347733] EDAC sbridge MC0: ADDR faa7c5000 
      [17522.347735] EDAC sbridge MC0: MISC 90002000200048c 
      [17522.347737] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697708 SOCKET 1 APIC 40
      [17522.347783] ghes_print_estatus: 3 callbacks suppressed
      [17522.356398] {7}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17522.368481] {7}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17522.380508] {7}[Hardware Error]: event severity: corrected
      [17522.389182] {7}[Hardware Error]: Error 0, type: corrected
      [17522.397801] {7}[Hardware Error]: fru_text: DIMM ??
      [17522.405530] {7}[Hardware Error]: section_type: memory error
      [17522.414171] [Firmware Warn]: error section length is too small
      [17522.759299] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaa7c5 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17524.226532] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17524.226537] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17524.226540] EDAC sbridge MC0: TSC 0 
      [17524.226542] EDAC sbridge MC0: ADDR faab05000 
      [17524.226544] EDAC sbridge MC0: MISC 90002000200048c 
      [17524.226546] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697710 SOCKET 1 APIC 40
      [17524.226612] {8}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
      [17524.238454] {8}[Hardware Error]: It has been corrected by h/w and requires no further action
      [17524.250391] {8}[Hardware Error]: event severity: corrected
      [17524.258781] {8}[Hardware Error]: Error 0, type: corrected
      [17524.267115] {8}[Hardware Error]: fru_text: DIMM ??
      [17524.274563] {8}[Hardware Error]: section_type: memory error
      [17524.282938] [Firmware Warn]: error section length is too small
      [17524.402614] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17524.402620] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17524.402622] EDAC sbridge MC0: TSC 0 
      [17524.402625] EDAC sbridge MC0: ADDR faab65000 
      [17524.402627] EDAC sbridge MC0: MISC 90002000200048c 
      [17524.402629] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697710 SOCKET 1 APIC 40
      [17524.789294] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaab05 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17524.815442] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfaab65 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17526.670601] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17526.670607] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17526.670609] EDAC sbridge MC0: TSC 0 
      [17526.670611] EDAC sbridge MC0: ADDR faaf45000 
      [17526.670613] EDAC sbridge MC0: MISC 90002000200048c 
      [17526.670615] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697712 SOCKET 1 APIC 40
      [17526.695370] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17526.695376] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17526.695378] EDAC sbridge MC0: TSC 0 
      [17526.695380] EDAC sbridge MC0: ADDR faaf55000 
      [17526.695382] EDAC sbridge MC0: MISC 90002000200048c 
      [17526.695384] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697712 SOCKET 1 APIC 40
      [17526.745348] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17526.745354] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17526.745356] EDAC sbridge MC0: TSC 0 
      [17526.745358] EDAC sbridge MC0: ADDR faaf75000 
      [17526.745360] EDAC sbridge MC0: MISC 90002000200048c 
      [17526.745363] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697712 SOCKET 1 APIC 40
      [17526.845288] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xfaaf45 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:2 rank:0)
      [17526.871716] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0xfaaf55 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:0 channel_mask:1 rank:0)
      [17526.897945] EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#1_Chan#0_DIMM#0 (channel:4 slot:0 page:0xfaaf75 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 ha:1 channel_mask:1 rank:0)
      [17528.579406] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17528.579412] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17528.579415] EDAC sbridge MC0: TSC 0 
      [17528.579417] EDAC sbridge MC0: ADDR fab285000 
      [17528.579419] EDAC sbridge MC0: MISC 90002000200048c 
      [17528.579422] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
      [17528.645721] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17528.645727] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17528.645729] EDAC sbridge MC0: TSC 0 
      [17528.645732] EDAC sbridge MC0: ADDR fab2a5000 
      [17528.645734] EDAC sbridge MC0: MISC 90002000200048c 
      [17528.645736] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
      [17528.662703] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17528.662709] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17528.662711] EDAC sbridge MC0: TSC 0 
      [17528.662714] EDAC sbridge MC0: ADDR fab2b5000 
      [17528.662716] EDAC sbridge MC0: MISC 90002000200048c 
      [17528.662719] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
      [17528.720528] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17528.720534] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17528.720536] EDAC sbridge MC0: TSC 0 
      [17528.720538] EDAC sbridge MC0: ADDR fab2d5000 
      [17528.720540] EDAC sbridge MC0: MISC 90002000200048c 
      [17528.720543] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
      [17528.762541] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
      [17528.762547] EDAC sbridge MC0: CPU 18: Machine Check Event: 0 Bank 13: 8c000042000800c0
      [17528.762549] EDAC sbridge MC0: TSC 0 
      [17528.762551] EDAC sbridge MC0: ADDR fab2f5000 
      [17528.762553] EDAC sbridge MC0: MISC 90002000200048c 
      [17528.762556] EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1521697714 SOCKET 1 APIC 40
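
Most of the log above is corrected-memory-error output rather than Lustre messages, so it may help to separate the two when triaging. The following is a minimal sketch, not part of the original report: `split_log` and `SAMPLE` are hypothetical names, and the sample contains only four lines copied from the log above. It assumes every relevant line carries a dmesg-style `[seconds.micros]` prefix and that Lustre messages begin with `Lustre`.

```python
import re

# Matches a dmesg line such as "[ 4746.434545] LustreError: 11-0: ..."
LINE_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s+(.*)$")


def split_log(text):
    """Return (lustre, other) lists of (timestamp_seconds, message) tuples."""
    lustre, other = [], []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank or non-dmesg lines
        ts, msg = float(m.group(1)), m.group(2)
        # "Lustre:" and "LustreError:" both start with "Lustre"
        (lustre if msg.startswith("Lustre") else other).append((ts, msg))
    return lustre, other


# Four lines copied from the MDS log in this ticket (illustrative subset only)
SAMPLE = """\
[ 4746.434545] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 10.2.2.48@tcp failed: rc = -107
[ 4746.449648] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 10.2.2.48@tcp) was lost; in progress operations using this service will wait for recovery to complete
[17500.566604] mce: [Hardware Error]: Machine check events logged
[17500.643302] EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
"""

lustre, other = split_log(SAMPLE)
# Seconds between the last Lustre message and the first non-Lustre message
gap = other[0][0] - lustre[-1][0]
print(f"{len(lustre)} Lustre lines, {len(other)} other lines")
print(f"gap between last Lustre line and first other line: {gap:.0f} s")
```

On these sample lines, the first machine-check message comes about 12754 seconds (roughly 3.5 hours) after the last Lustre message, which is the gap between timestamps 4746.449648 and 17500.566604 in the log.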
      
      

            People

              wc-triage WC Triage
              sarah Sarah Liu