
ADC / Netscaler MGMT CPU 100%


Philipp Zenz


Hi guys,

 

I've read a lot about this problem.

Apparently some people even fixed it by replacing their licence files (wtf?) and similar things.

 

Currently I have no idea how to fix my problem; maybe someone has a nice idea for me :)

 

System:

VPX 3000

4 vCPU

4GB RAM

HA Cluster - the other VM doesn't have this problem.

 

Problem:

- Management CPU runs at 100%

 

sh ns version
        NetScaler NS13.0: Build 36.27.nc, Date: May 13 2019, 11:42:13   (64-bit)

 

Reboot already done, it doesn't fix this.

 

top output:

last pid: 71776;  load averages:  4.00,  3.34,  3.12                                                             up 0+03:39:47  11:40:07
76 processes:  4 running, 71 sleeping, 1 zombie
CPU: 61.7% user,  0.0% nice, 13.3% system,  0.1% interrupt, 25.0% idle
Mem: 219M Active, 82M Inact, 2862M Wired, 16K Cache, 152M Buf, 326M Free
Swap: 4198M Total, 4198M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1177 root          1  44   r0   839M   822M CPU1    1 219:36 100.00% NSPPE-00
 1178 root          1  44   r0   839M   822M CPU2    2 219:36 100.00% NSPPE-01
 2119 root          1  76    0  8320K  1924K piperd  0  32:36 15.19% sh
59643 nobody        1  47    0   118M 40604K select  0   0:07  6.98% httpd
32588 nobody        1  45    0   116M 37388K select  0   0:10  3.37% httpd
 1180 root          1  44    0 31608K 18276K kqread  0   7:03  0.10% nsnetsvc
 1256 root          1  44    0 30652K 17136K kqread  0   0:02  0.10% nsconfigd
 1318 root          1  44    0 18100K  7608K kqread  0   0:27  0.00% nsrised
 1462 root          1  44    0 18368K  3788K nanslp  0   0:18  0.00% nsprofmon
 1464 root          1  44    0 18368K  3816K nanslp  0   0:17  0.00% nsprofmon
 1207 root          1  44    0 53808K 29252K kqread  0   0:10  0.00% nsaggregatord
  996 root          1  44    0 31664K  4256K select  0   0:05  0.00% vmtoolsd
32590 nobody        1  44    0   114M 36708K accept  0   0:05  0.00% httpd
 1355 root          2  76    0   281M   223M ucond   0   0:03  0.00% nscopo
32591 nobody        1  44    0   112M 34740K accept  0   0:02  0.00% httpd
   25 root          1  44    0 10592K 10644K kqread  0   0:02  0.00% pitboss
 1258 root          1  44    0 44996K 20104K nanslp  0   0:02  0.00% nsgslbautosync
45305 nobody        1  44    0   114M 36672K accept  0   0:01  0.00% httpd
 1313 root          1  44    0 27012K  7020K kqread  0   0:01  0.00% snmpd
32587 nobody        1  44    0   110M 32236K accept  0   0:01  0.00% httpd
 1352 root          1  44    0  8264K  2160K accept  0   0:01  0.00% datadaemon
 1336 root          1  44    0 59492K 35224K nanslp  0   0:01  0.00% nscollect
 1345 root          2  44    0 46452K  6908K kqread  0   0:01  0.00% metricscollector
 1208 root          1  44    0 76196K  3784K kqread  0   0:01  0.00% nsclusterd
32589 nobody        1  44    0   112M 36004K accept  0   0:01  0.00% httpd
 1286 root          1  44    0 43588K 14004K select  0   0:01  0.00% php
  973 root          1  44    0   108M 27308K select  0   0:00  0.00% httpd
  966 root          1  44    0  6896K  1472K select  0   0:00  0.00% syslogd
  976 root          1  44    0 10196K  2748K nanslp  0   0:00  0.00% monit
 1471 root          1  44    0 15072K  7996K select  0   0:00  0.00% ntpd
 1296 root         11  76    0 53908K 14404K ucond   0   0:00  0.00% nsaaad
 1301 root          1  44    0  7436K  3216K select  0   0:00  0.00% iked

 

> stat system cpu

CPU statistics
ID         Usage
2              1
1              1
 Done

> stat ns | grep -i cpu
Packet CPU usage (%)                0.80
Management CPU usage (%)          100.00

 

I think there's something wrong with:

2119 root          1  76    0  8320K  1924K piperd  0  32:36 15.19% sh

I don't know how to troubleshoot that.

 

nsconmsg output:

reltime:mili second between two records Fri Mar 20 10:02:07 2020
  Index   rtime totalcount-val      delta rate/sec symbol-name&device-no&time
    135    7006              5          5        0 mgmt_cpu_use  Fri Mar 20 10:02:07 2020
    136    7004              7          2        0 mgmt_cpu_use  Fri Mar 20 10:02:14 2020
    137    7004            629        622       88 mgmt_cpu_use  Fri Mar 20 10:02:21 2020
    138    7003           1000        371       52 mgmt_cpu_use  Fri Mar 20 10:02:28 2020
    139    7002            637       -363      -51 mgmt_cpu_use  Fri Mar 20 11:01:23 2020
    140    7003              2       -635      -90 mgmt_cpu_use  Fri Mar 20 11:01:30 2020
    141    7003              1         -1        0 mgmt_cpu_use  Fri Mar 20 11:01:37 2020
    142    7006              0         -1        0 mgmt_cpu_use  Fri Mar 20 11:01:44 2020
    143    7005              2          2        0 mgmt_cpu_use  Fri Mar 20 11:01:51 2020
    144    7004              1         -1        0 mgmt_cpu_use  Fri Mar 20 11:01:58 2020
    145    7006              4          3        0 mgmt_cpu_use  Fri Mar 20 11:02:05 2020
    146    7005              6          2        0 mgmt_cpu_use  Fri Mar 20 11:02:12 2020
    147    7005              4         -2        0 mgmt_cpu_use  Fri Mar 20 11:02:19 2020
    148    7003            633        629       89 mgmt_cpu_use  Fri Mar 20 11:02:26 2020
    149    7003           1000        367       52 mgmt_cpu_use  Fri Mar 20 11:02:33 2020

It seems that the log entries stop once the management CPU is busy (totalcount-val = 1000, with no new timestamps after that).
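To pick out the saturated samples from a longer capture, a small awk filter can help. This is only a sketch: the function name `saturated_samples` is made up for illustration, it assumes the column layout shown in the output above (value in column 3, counter name in column 6, timestamp after that), and it assumes the value is reported in tenths of a percent, so that 1000 corresponds to 100%.

```shell
# Print the timestamp and percentage of every mgmt_cpu_use record whose
# value is at or above a threshold (default 1000, assumed to mean 100%).
# Reads nsconmsg-style text on stdin.
saturated_samples() {
    awk -v limit="${1:-1000}" '
        $6 == "mgmt_cpu_use" && $3 ~ /^[0-9]+$/ && $3 + 0 >= limit + 0 {
            # Fields 7 onward hold the timestamp of the record.
            ts = $7
            for (i = 8; i <= NF; i++) ts = ts " " $i
            printf "%s\t%.1f%%\n", ts, $3 / 10
        }'
}

# Possible usage on the appliance shell (check the flags on your build):
#   nsconmsg -K /var/nslog/newnslog -d current -g mgmt_cpu_use | saturated_samples
```

Passing a lower threshold, for example `saturated_samples 600`, would also list the ramp-up records just before the counter hits its ceiling.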

 

 

Maybe someone could help me

 

ty and stay safe

Philipp

 

 


  • 10 months later...
  • 3 weeks later...

Just as a final comment on this in case anyone else comes across it: after digging around, I discovered it was caused by the nslog folder being, or trying to be, zipped up.

 


 

After trying various things, I did the following, which seemed to work:

  • Deleted all nslog folders both zipped and unzipped (/var/nslog/)
  • Set the value in the nslog.nextfile file to 1 
  • Deleted the nslog.nextzip file
  • Rebooted NS

This effectively clears the archived logs and resets the counter. When the cron job that does the zipping is called, it will recreate the nslog.nextzip file.
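The steps above could be scripted roughly as follows. Treat this as a sketch under assumptions, not a tested procedure: the function name `reset_nslog_archives` is made up, and the `newnslog.*` pattern is my guess at how the archived logs (zipped and unzipped) are named under /var/nslog/. It defaults to a dry run that only prints what it would do, leaves the active `newnslog` alone, and does not perform the reboot.

```shell
# Sketch of the cleanup: remove archived nslog entries, reset nslog.nextfile
# to 1 and delete nslog.nextzip. The directory and the newnslog.* pattern are
# assumptions; verify them on the appliance before using "apply".
reset_nslog_archives() {
    dir="${1:-/var/nslog}"
    mode="${2:-dry-run}"   # pass "apply" as second argument to really change things

    for entry in "$dir"/newnslog.*; do
        [ -e "$entry" ] || continue          # pattern matched nothing
        if [ "$mode" = "apply" ]; then
            rm -rf -- "$entry"
        else
            echo "would remove $entry"
        fi
    done

    if [ "$mode" = "apply" ]; then
        echo 1 > "$dir/nslog.nextfile"       # reset the file counter
        rm -f -- "$dir/nslog.nextzip"        # recreated later by the zip job
    else
        echo "would write 1 to $dir/nslog.nextfile"
        echo "would remove $dir/nslog.nextzip"
    fi
}

# Example: reset_nslog_archives /var/nslog          (dry run)
#          reset_nslog_archives /var/nslog apply    (then reboot manually)
```

Running the dry run first shows exactly which entries the pattern matches, which matters here because the deletion cannot be undone.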

 

All has been good since.

 


  • 2 months later...

I encountered the same issue today. The problem was that nslog.nextzip was set to 0, which was incorrect.
It is safe to kill the dozip process. Management CPU usage will return to normal immediately.

Find the PID: ps aux | grep dozip

kill -9 pid

Then set nslog.nextzip to the correct value, or get a fresh start by wiping the newnslog files and setting both nslog.nextfile and nslog.nextzip to 1.
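The find-and-kill step can be wrapped in two small helpers so the PID does not have to be copied by hand. This is a sketch, and the function names `dozip_pids` and `kill_dozip` are made up for illustration. It assumes the stuck process has "dozip" somewhere in its command line, as the `ps aux | grep dozip` command above implies.

```shell
# Read `ps aux` output on stdin and print the PID (column 2) of every line
# mentioning dozip, skipping the grep/awk helper processes themselves.
dozip_pids() {
    awk '$0 ~ /dozip/ && $0 !~ /grep|awk/ { print $2 }'
}

# Kill every dozip process found, using the same signal as in the post.
kill_dozip() {
    for pid in $(ps aux | dozip_pids); do
        echo "killing dozip process $pid"
        kill -9 "$pid"
    done
}

# Example: ps aux | dozip_pids    (just list the PIDs)
#          kill_dozip             (list and kill them)
```

Listing the PIDs first and checking them against `top` is a cheap way to confirm that the match is the process actually burning management CPU.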


 
