Ruijie Community
Title: Troubleshooting high memory utilization
Author: admin Time: 2018-4-9 19:21
1) A switch works properly. However, the show memory command output shows that the memory utilization ranges from 80% to 90% or even higher and increases continuously. (As the available memory decreases, the memory utilization increases.)
2) If the memory utilization increases until the memory is exhausted, the following symptom appears:
When you type characters on the console port, the switch does not respond and a log indicating insufficient memory is displayed.
Ruijie>en
not enough memory! cli execute fail!
Or:
*Sep 6 08:54:14: %SCHED-0-NOSTACK: Could not allocate 40960 bytes for stack from memory.
Note: Some Ruijie switches have a small memory. After they start, memory utilization ranges from 50% to 75%, and it may even exceed 80% while the switches are running. As long as the switches work properly and the memory utilization does not rise sharply to 90% or above, no fault has occurred.
Check whether the memory utilization increases continuously. If the network management software shows that the memory utilization increases sharply, a memory leak may have occurred.
Author: admin Time: 2018-4-9 19:22
Possible Causes
1) Due to software faults, the memory occupied by a function cannot be released, causing a rapid or slow memory leak. If a switch has been working properly for a long time but a rapid memory leak occurs after a new function is applied, the new function is most likely at fault.
2) When functions change, for example when new unicast route entries or multicast entries are added, the memory utilization usually increases in a stable, predictable way. Analyze such cases based on the actual network.
Troubleshooting procedure
Step 1: Check whether the memory utilization increases continuously.
When functions change, for example when new unicast route entries or multicast entries are added, the memory utilization usually increases in a stable, predictable way. (For example, 1,000 routes occupy about 2 MB of memory. If network expansion or reconstruction causes the switch to learn 1,000 more routes, the available memory decreases by about 2 MB.) This is normal.
Therefore, if you suspect a memory leak, check for a continuous increase in memory utilization.
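The route example above implies a rough per-entry cost, which helps separate expected growth from a possible leak. The following is a minimal sketch of that arithmetic, assuming the "1,000 routes occupy about 2 MB" figure scales linearly. Real per-route cost depends on the platform and software version, and the function names are illustrative only.

```python
# Rough estimate only: assumes memory grows linearly with route count,
# using the "1,000 routes occupy about 2 MB" figure from the example above.
KB_PER_1000_ROUTES = 2 * 1024  # about 2 MB, expressed in KB


def expected_route_memory_kb(new_routes: int) -> float:
    """Return the approximate extra memory (KB) used by new_routes routes."""
    return new_routes * KB_PER_1000_ROUTES / 1000


def unexplained_drop_kb(free_before_kb: int, free_after_kb: int, new_routes: int) -> float:
    """Return the part of the free-memory drop not explained by new routes."""
    observed_drop = free_before_kb - free_after_kb
    return observed_drop - expected_route_memory_kb(new_routes)
```

A result near zero suggests the drop matches the new entries. A large positive result that keeps growing is a reason to continue with the checks below.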
1. Run the show memory command three times at 2-second intervals.
2. Check whether Used Rate (memory utilization) increases continuously and Current Free Memory (available memory in KB) decreases continuously.
For example:
Ruijie#show memory
System Memory Statistic:
Free pages: 2898
watermarks : min 433, lower 866, low 1299, high 1732
System Total Memory : 128MB, Current Free Memory : 14580KB
Used Rate : 89%
· If Current Free Memory decreases sharply (by about 2 KB each time), run the show memory command every 5 to 10 minutes.
· If Current Free Memory changes slightly, run the show memory command every couple of hours or every day. If it still changes slightly, run the show memory command every week or month.
3. Evaluate the results:
1) No fault has occurred if Current Free Memory changes only slightly over a long period.
2) If Current Free Memory decreases steadily (whether rapidly or slowly), run the following commands again and contact Ruijie technical support.
1) Run the show memory command three times at 5-second intervals.
2) Run the show memory protocols command three times at 5-second intervals.
3) Run the show version, show version slots, show running, show interface status, show arp counter, show mac-address-table count, show ip route, show vlan, and show log commands once each to collect general information.
Notes
If the memory utilization exceeds 90% and continues to increase, agree with the customer on a time to restart the switch so as to minimize the effect on the customer's services. After the switch is restarted, collect information according to Step 3 and contact Ruijie technical support.
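The comparison in Step 1 can be scripted once the show memory output has been captured as text. The following is a minimal sketch, assuming the output format matches the example above. The function names and the three trend labels are illustrative and are not part of any Ruijie tool.

```python
import re

# Patterns follow the sample "show memory" output shown in Step 1.
FREE_RE = re.compile(r"Current Free Memory\s*:\s*(\d+)\s*KB")
RATE_RE = re.compile(r"Used\s*Rate\s*:\s*(\d+)\s*%")


def parse_show_memory(text: str) -> dict:
    """Extract free memory (KB) and used rate (%) from show memory output."""
    free = FREE_RE.search(text)
    rate = RATE_RE.search(text)
    if free is None or rate is None:
        raise ValueError("unrecognized show memory output")
    return {"free_kb": int(free.group(1)), "used_rate": int(rate.group(1))}


def free_memory_trend(samples_kb: list) -> str:
    """Classify consecutive free-memory samples.

    'decreasing' means every sample is lower than the one before it,
    'stable' means no change at all (also returned for a single sample),
    and 'mixed' covers everything else.
    """
    deltas = [b - a for a, b in zip(samples_kb, samples_kb[1:])]
    if deltas and all(d < 0 for d in deltas):
        return "decreasing"
    if all(d == 0 for d in deltas):
        return "stable"
    return "mixed"
```

For example, feeding the three captured outputs through parse_show_memory and passing the free_kb values to free_memory_trend returns "decreasing" only when free memory dropped on every sample, which is the pattern that calls for collecting information and contacting technical support.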
Step 2: Check whether the memory is exhausted
1. Log in to the switch and obtain the memory information (using either of the following methods).
· Connect to the switch through the console port and type characters to see whether the input is echoed in SecureCRT or HyperTerminal. If yes, log in to the switch and run the show memory command twice.
· Remotely log in to the switch through Telnet or SSH and run the show memory command twice.
2. Check the memory utilization of the switch.
· If the switch displays either of the following logs, a severe memory leak has occurred and the system cannot allocate memory properly. In this case, the switch fails to work and services are interrupted.
log1: not enough memory! cli execute fail!
log2: *Sep 6 08:54:14: %SCHED-0-NOSTACK: Could not allocate 40960 bytes for stack from memory.
In this case, go to Step 3 to collect information and recover services in emergency.
· If the show memory command displays output like the following, there is enough available memory for the system to operate normally.
For example:
Ruijie#show memory
System Memory Statistic:
Free pages: 2898
watermarks : min 433, lower 866, low 1299, high 1732
System Total Memory : 128MB, Current Free Memory : 14580KB
Used Rate : 89%
If the memory utilization is high (over 70%, for example) but the switch works properly, or the memory utilization is low (below 70%, for example) but you are concerned about faults, go to Step 3 to collect information.
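The decision in Step 2 can be summarized as a small triage function. This is a sketch only: it assumes the two log strings quoted above are the only exhaustion indicators and uses the 70% example figure as the threshold. The function name and return labels are hypothetical.

```python
from typing import Optional

# Substrings taken from the two logs quoted in Step 2.
EXHAUSTION_MARKERS = (
    "not enough memory! cli execute fail!",
    "%SCHED-0-NOSTACK",
)

HIGH_USED_RATE = 70  # example threshold from the text, not a vendor limit


def triage(console_text: str, used_rate: Optional[int]) -> str:
    """Return 'exhausted', 'unknown', 'high', or 'normal'."""
    if any(marker in console_text for marker in EXHAUSTION_MARKERS):
        # The system can no longer allocate memory: collect and recover.
        return "exhausted"
    if used_rate is None:
        # No show memory output could be obtained.
        return "unknown"
    if used_rate > HIGH_USED_RATE:
        return "high"
    return "normal"
```

An "exhausted" result corresponds to emergency collection and recovery, while "high" and "normal" correspond to optional information collection while the switch is still working.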
Step 3: Collect information and recover services in emergency
1) Risks: Collection poses high risks. Because of its high priority and frequent interruptions, collection may affect customer services or even interrupt the customer's network (in which case you need to restart the switch to recover services).
2) If the switches work as a VSU group, collect information on each standby device (see the "Note" above). If there are more than three VSU members, collect information on any three members.
3) Enter @@@@C on the standby device to enable console printing.
Notes
· Collect information as follows if the customer wants to restart the switch immediately to recover services but the switch cannot be managed through the console, Telnet, or SSH.
· If the customer agrees to collect information after being informed of the risks, you can restart the switch.
· Information collection must be completed within the downtime of the customer's services. If it is not, restart the switch anyway.
· Before restarting the switch, confirm the restart time with the customer so that the customer can be well prepared.
1. Connect to the faulty switch through the console port.
Run the following commands in sequence. If the customer wants to recover services, restart the switch whether or not the collection is complete. (4Esc indicates pressing the Esc key four times.)
4Esc + d : debug_show_all_locks
4Esc + c : open and close console
4Esc + f : dump tech-support info
4Esc + h : help
4Esc + i : dump cli debug info
4Esc + j : dump irq info
4Esc + k + pid + # : kill pid proc
4Esc + l : dump start process
4Esc + m : dump mem info
4Esc + n : start hrtimer
4Esc + o : stop hrtimer
4Esc + p : close logging message (same as: no logging on)
4Esc + q : dump context switches and runtime
4Esc + r : dump other cpu dump_stack
4Esc + s : show 5@ info
4Esc + t : show task states
4Esc + x : start dot task
4Esc + y : stop dot task
2. If the switches work as a VSU group, collect information on each standby device. (If there are more than three VSU members, collect information on any three members.)
Enter @@@@C on the standby device to enable console printing.
Ruijie#debug su
Ruijie(support)#tech-support package
Notes
· If console printing fails although the console cable connection is not faulty, contact Ruijie technical support.
3. After information collection is complete, power off and restart the switch as the customer requires. Keep the console cable connected to the device during the restart, observe the logs on the device, and ensure that the device restarts properly.
4. After the device has restarted properly, run the following commands to collect information and report it to Ruijie technical support for analysis. Comparing the information collected during normal operation with that collected in emergency during the failure helps identify the fault quickly.
1) Run the show memory command three times at 5-second intervals.
2) Run the show memory protocols command three times at 5-second intervals.
3) Run the show version, show version slots, show running, show interface status, show arp counter, show mac-address-table count, show ip route, show vlan, and show log commands once each to collect general information.
Perform the preceding steps to collect information, and contact Ruijie technical support for assistance.
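The post-restart collection sequence can be automated if you already have a way to send a command to the switch and read back its output. The following is a minimal sketch under that assumption. send_command is a hypothetical callable that you supply (for example, a wrapper around your own Telnet or SSH session), and the command lists simply mirror the steps above.

```python
import time
from typing import Callable, List, Tuple

# Commands run three times at 5-second intervals, per the steps above.
REPEATED = ["show memory", "show memory protocols"]

# Commands run once to collect general information.
ONCE = [
    "show version", "show version slots", "show running",
    "show interface status", "show arp counter",
    "show mac-address-table count", "show ip route", "show vlan", "show log",
]


def collect(send_command: Callable[[str], str],
            repeats: int = 3,
            interval_s: float = 5.0,
            sleep: Callable[[float], None] = time.sleep) -> List[Tuple[str, str]]:
    """Run the collection sequence and return (command, output) pairs."""
    results = []
    for command in REPEATED:
        for attempt in range(repeats):
            results.append((command, send_command(command)))
            if attempt < repeats - 1:
                sleep(interval_s)  # wait between repeated samples only
    for command in ONCE:
        results.append((command, send_command(command)))
    return results
```

Saving the returned pairs to a file after each run gives you a baseline from normal operation that can later be compared with the information collected during a failure.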