
admin

Poor Contact of the Obliquely Inserted Memory Module — Troubleshooting Guide
Posted: 2017-5-3 15:30:19
Applicable scope:
All CM cards in high-end switches that use obliquely inserted memory module sockets, as shown in the following figure.
The models include but are not limited to the following:
M18010-CM
M18010-CM II
M18014-CM
M18014-CM II
M18007-CM II
M18007-CM II LITE
M8600E-CM
M7800E-CM

Fault symptom:
The two common fault logs are as follows:

The device restarts repeatedly, and the following exception information is displayed during boot:
Boot 1.2.2-eaf8aaa (Build time: Apr 21 2014 - 10:12:42)
DRAM: 4 GiB
Boot 1.2.2-eaf8aaa (Build time: Apr 21 2014 - 10:12:42)
DRAM: 4 GiB
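
When reviewing a saved console capture, the repeated-restart symptom can be spotted by counting boot banners. The following is a minimal sketch, not a vendor tool: the function names, the banner pattern, and the threshold of two banners are assumptions derived only from the sample output above.

```python
import re

# Matches a boot banner such as "Boot 1.2.2-eaf8aaa (Build time: ...)".
# The pattern is inferred from the sample log only (assumption).
BOOT_BANNER = re.compile(r"^Boot \d+(?:\.\d+)*-\w+ \(Build time:")


def count_boot_banners(console_text: str) -> int:
    """Count how many times the boot banner appears in a console capture."""
    return sum(
        1 for line in console_text.splitlines()
        if BOOT_BANNER.match(line.strip())
    )


def looks_like_boot_loop(console_text: str, threshold: int = 2) -> bool:
    """Heuristic: several banners in one capture suggest repeated restarts."""
    return count_boot_banners(console_text) >= threshold


sample = """\
Boot 1.2.2-eaf8aaa (Build time: Apr 21 2014 - 10:12:42)
DRAM: 4 GiB
Boot 1.2.2-eaf8aaa (Build time: Apr 21 2014 - 10:12:42)
DRAM: 4 GiB
"""
print(count_boot_banners(sample), looks_like_boot_loop(sample))
```

A capture that spans an intentional reboot also contains more than one banner, so treat the result as a hint rather than a diagnosis.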

The device restarts automatically, and the following exception information is displayed (the ECC error is reported repeatedly):
NAND:  512 MiB
Flash: 8 MiB
SETMAC: Setmac operation was performed at 2014-06-16 21:16:11 (version: 11.0)
Press Ctrl+C to enter Boot Menu
Bootloader: Done loading app on coremask: 0xf
[    0.000000] ERROR PBANK_LSB: 4, ROW_LSB: 2, Row bits: 16, Col bits: 10, Row mask: 0xffff, Col mask: 0x3ff
[    0.000000] ERROR LMC0 ECC: sec_err:8 ded_err:0
[    0.000000] LMC0 ECC:        Failing dimm:   0
[    0.000000] LMC0 ECC:        Failing rank:   0
[    0.000000] LMC0 ECC:        Failing bank:   7
[    0.000000] LMC0 ECC:        Failing row:    0xff0b
[    0.000000] LMC0 ECC:        Failing column: 0x2dbe
[    0.000000] LMC0 ECC:        syndrome: 0xce
[    0.000000] Failing  Address: 0x000000010f0b6cf8, Data: 0xc00627d8c006cfec
[    0.000000] ERROR PBANK_LSB: 4, ROW_LSB: 2, Row bits: 16, Col bits: 10, Row mask: 0xffff, Col mask: 0x3ff
[    0.000000] ERROR LMC0 ECC: sec_err:1 ded_err:0
[    0.000000] LMC0 ECC:        Failing dimm:   0
[    0.000000] LMC0 ECC:        Failing rank:   0
[    0.000000] LMC0 ECC:        Failing bank:   5
[    0.000000] LMC0 ECC:        Failing row:    0x14
[    0.000000] LMC0 ECC:        Failing column: 0x1110
[    0.000000] LMC0 ECC:        syndrome: 0xce
[    0.000000] Failing  Address: 0x0000000000144480, Data: 0x080510000083102d
[    9.235671] ERROR PBANK_LSB: 4, ROW_LSB: 2, Row bits: 16, Col bits: 10, Row mask: 0xffff, Col mask: 0x3ff
[    9.350371] ERROR LMC0 ECC: sec_err:8 ded_err:0
[    9.350374] LMC0 ECC:        Failing dimm:   0
[    9.350377] LMC0 ECC:        Failing rank:   0
[    9.350379] LMC0 ECC:        Failing bank:   6
[    9.350382] LMC0 ECC:        Failing row:    0xdd
[    9.350385] LMC0 ECC:        Failing column: 0x379a
[    9.350388] LMC0 ECC:        syndrome: 0xce
[    9.350390] Failing  Address: 0x0000000000dde458, Data: 0xcccccccccccccccc
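
To record these errors for later troubleshooting, the ECC lines can be grouped into one record per error. The sketch below assumes the log layout shown above; the function names and the record fields are hypothetical, and the values of `sec_err` and `ded_err` are kept as raw integers because this guide does not define them.

```python
import re
from collections import Counter

# "ERROR LMC0 ECC: sec_err:8 ded_err:0" opens one error record.
SUMMARY = re.compile(r"ERROR LMC(\d+) ECC: sec_err:(\d+) ded_err:(\d+)")
# "LMC0 ECC:        Failing bank:   7" carries one field of that record.
FIELD = re.compile(r"LMC\d+ ECC:\s+Failing (dimm|rank|bank|row|column):\s+(\S+)")


def parse_ecc_records(log_text):
    """Group ECC log lines into one dict per reported error."""
    records = []
    for line in log_text.splitlines():
        summary = SUMMARY.search(line)
        if summary:
            records.append({
                "lmc": int(summary.group(1)),
                "sec_err": int(summary.group(2)),
                "ded_err": int(summary.group(3)),
            })
            continue
        field = FIELD.search(line)
        if field and records:
            # Base 0 accepts both decimal ("7") and hex ("0xff0b") values.
            records[-1][field.group(1)] = int(field.group(2), 0)
    return records


def errors_per_dimm(records):
    """Count error records per (memory controller, DIMM) pair."""
    counts = Counter((r["lmc"], r.get("dimm")) for r in records)
    return dict(counts)


ecc_sample = """\
[    0.000000] ERROR LMC0 ECC: sec_err:8 ded_err:0
[    0.000000] LMC0 ECC:        Failing dimm:   0
[    0.000000] LMC0 ECC:        Failing rank:   0
[    0.000000] LMC0 ECC:        Failing bank:   7
[    0.000000] LMC0 ECC:        Failing row:    0xff0b
[    0.000000] ERROR LMC0 ECC: sec_err:1 ded_err:0
[    0.000000] LMC0 ECC:        Failing dimm:   0
[    0.000000] LMC0 ECC:        Failing bank:   5
"""
ecc_records = parse_ecc_records(ecc_sample)
print(len(ecc_records), errors_per_dimm(ecc_records))
```

In the sample log, the errors land on the same DIMM but on different banks and rows, which fits the contact problem this guide describes. That reading is an interpretation, not a rule stated by the vendor.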

Troubleshooting suggestion:
If a card shows either of the preceding fault symptoms, the fault may be caused by poor contact between the memory module and its socket. In this case, perform the following steps to try to eliminate the poor contact:
Step 1: Remove the faulty card from the chassis and place it on a flat surface.
Step 2: Put on ESD gloves or an ESD wrist strap. Hold the memory module by the middle of its edge, where no components are mounted (as shown in Figure 2), and gently rock it up and down, perpendicular to the plane of the memory module (as shown in Figure 3). Keep the amplitude below 5 mm to prevent damage to the memory module and socket.
                                    Figure 2

                            Figure 3

Step 3: Grip both ends of the memory module and the socket between your index fingers and thumbs, and press the memory module firmly into the socket in the direction parallel to the plane of the memory module, as shown in Figure 4.
                                   Figure 4

Step 4: Reinsert the card into the chassis and power on the device.
If the fault is rectified and the device runs properly after these steps, the poor contact has been eliminated and will not suddenly recur on the memory module during subsequent operation.

If the fault persists after these steps, perform the following operations:
Step 1: While the faulty card is restarting repeatedly, press and hold Ctrl+T until the card resets and enters the memory self-check state, then release the keys. After the memory self-check is complete, save the collected log for future troubleshooting.
Step 2: Record the customer name, device running duration, device serial number, and other common information.
Step 3: Start the DOA or RMA process for the faulty card.
