Friday, October 20, 2023

Why is NE40E CPU usage too high?

Fault description

The customer reports that the CPU usage of a Huawei NE40E-X3 has risen too high, reaching 60%.

 

Processing procedures

1. Check the CPU usage to find out which tasks are consuming the most CPU resources.

 

<NE40E-X3>dis cpu-usage
Cpu utilization statistics at 2019-02-22 15:10:26 916 ms
System cpu use rate is : 74%
Cpu utilization for five seconds: 72% ;  one minute: 65% ;  five minutes: 77%.
Max CPU Usage : 99%
Max CPU Usage Stat. Time : 2018-09-10 18:58:15 410 ms
---------------------------
ServiceName UseRate 
---------------------------
SYSTEM           40%
BRAS             26%
CMF               3%
FEC               3%
IP STACK          2%
AAA               0%
ARP               0%

 

The output shows that the SYSTEM and BRAS services account for most of the CPU usage.

This suggests that either BRAS-related configuration or a device fault is causing the problem.
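
When this check has to be repeated, the per-service table can be ranked with a small script. The following is only a sketch in Python: it assumes the output has been saved as text in the same layout as above, and the names parse_service_rates and top_services, as well as the 10% threshold, are made up for illustration.

```python
import re

# Table portion of the "dis cpu-usage" output shown above.
SAMPLE = """\
---------------------------
ServiceName UseRate 
---------------------------
SYSTEM           40%
BRAS             26%
CMF               3%
FEC               3%
IP STACK          2%
AAA               0%
ARP               0%
"""

# Service name (may contain spaces), whitespace, then an integer percentage.
SERVICE_ROW = re.compile(r"^(?P<name>\S.*?)\s+(?P<rate>\d+)%\s*$")

def parse_service_rates(text):
    """Return (service, percent) pairs found after the ServiceName header."""
    rows, in_table = [], False
    for line in text.splitlines():
        if line.startswith("ServiceName"):
            in_table = True   # summary lines before the header are skipped
            continue
        m = SERVICE_ROW.match(line) if in_table else None
        if m:
            rows.append((m.group("name"), int(m.group("rate"))))
    return rows

def top_services(text, threshold=10):
    """Services at or above the threshold (hypothetical value), highest first."""
    rows = [r for r in parse_service_rates(text) if r[1] >= threshold]
    return sorted(rows, key=lambda r: r[1], reverse=True)

print(top_services(SAMPLE))
```

With the sample above, only SYSTEM and BRAS pass the threshold, which matches the reading of the table by eye.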

 

2.  Check the health of the boards in the device.

 

<NE40E-X3>dis health
----------------------------------------------------------------
Slot                       CPU Usage  Memory Usage(Used/Total)
----------------------------------------------------------------
4      MPU(Master)            53%          38%  1562MB/4022MB
1      LPU                    39%          22%   859MB/3736MB
3      VSU                     3%          14%   509MB/3545MB
5      MPU(Slave)             18%          31%  1274MB/4022MB
----------------------------------------------------------------
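
The same kind of script can pick out the busiest board from the health table. Again this is a rough Python sketch that assumes the row layout shown above; parse_health and busiest_board are hypothetical helper names, not part of any Huawei tool.

```python
import re

# Rows copied from the "dis health" output above.
HEALTH = """\
4      MPU(Master)            53%          38%  1562MB/4022MB
1      LPU                    39%          22%   859MB/3736MB
3      VSU                     3%          14%   509MB/3545MB
5      MPU(Slave)             18%          31%  1274MB/4022MB
"""

# slot, board type, CPU %, memory %, used MB / total MB
HEALTH_ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(\d+)%\s+(\d+)%\s+(\d+)MB/(\d+)MB\s*$"
)

def parse_health(text):
    """Return one dict per board row; header and separator lines are ignored."""
    boards = []
    for line in text.splitlines():
        m = HEALTH_ROW.match(line)
        if m:
            slot, board, cpu, mem, used, total = m.groups()
            boards.append({
                "slot": int(slot), "board": board,
                "cpu": int(cpu), "mem": int(mem),
                "used_mb": int(used), "total_mb": int(total),
            })
    return boards

def busiest_board(boards):
    """Board with the highest CPU usage."""
    return max(boards, key=lambda b: b["cpu"])

b = busiest_board(parse_health(HEALTH))
print(b["slot"], b["board"], b["cpu"])
```

Here the master MPU in slot 4 stands out, which is consistent with the control-plane services (SYSTEM, BRAS) seen in step 1.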

 

3.  Check the services running on the boards. A misconfiguration was found that prevented a large number of users from getting online, which caused the high CPU usage. The configuration was then modified.

 

Root Cause

A misconfiguration prevented a large number of users from getting online, which caused the high CPU usage.

Previous configuration:

acl 3001
 rule 5 permit ip source 10.5.x.0 0.0.0.255

 

Modified configuration:

acl 3001
 rule 5 permit ip source 10.5.x.0 0.0.255.255
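
To see why the wildcard mask matters, the matching rule can be reproduced in a few lines of Python. In an ACL wildcard, a 0 bit must match and a 1 bit is ignored, so 0.0.0.255 covers 256 addresses while 0.0.255.255 covers 65536. This is only an illustration: the third octet is elided in the configuration above, so 10.5.7.0 is a made-up stand-in, and the assumption that the affected users sat in other 10.5.x subnets is not stated in the original case.

```python
import ipaddress

def wildcard_match(addr, base, wildcard):
    """True if addr matches base under an ACL wildcard mask.

    Bits set to 1 in the wildcard are "don't care"; bits set to 0 must match.
    """
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    w = int(ipaddress.IPv4Address(wildcard))
    care = ~w & 0xFFFFFFFF          # bits that must be equal
    return (a & care) == (b & care)

def matched_addresses(wildcard):
    """Number of addresses one rule covers: 2 ** (count of wildcard 1 bits)."""
    w = int(ipaddress.IPv4Address(wildcard))
    return 2 ** bin(w).count("1")

BASE = "10.5.7.0"  # hypothetical stand-in for the elided 10.5.x.0

print(wildcard_match("10.5.7.20", BASE, "0.0.0.255"))    # same /24
print(wildcard_match("10.5.9.20", BASE, "0.0.0.255"))    # other /24, old mask
print(wildcard_match("10.5.9.20", BASE, "0.0.255.255"))  # other /24, new mask
print(matched_addresses("0.0.0.255"), matched_addresses("0.0.255.255"))
```

Under the old mask a user at the hypothetical address 10.5.9.20 would not match rule 5; under the new mask it does.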

Solution

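The misconfigured ACL rule prevented a large number of users from getting online, which generated a flood of re-authentication messages and, as a result, exhausted the device CPU. Correct the rule's wildcard mask as shown in the modified configuration so that those users can get online.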
