Topic: System becomes VERY SLOW, "wa" percent very high, SWAP too high

xtrgeo

As the title says, my Zentyal server (2.2.5) becomes extremely slow many times a day. When running top, I get an extremely high percentage in the CPU "wa" (wait) field. All the others (idle, user, etc.) are OK.

I have traffic balancing enabled, plus squid and monitoring.

My swap usage is at about 30%.

What could be happening?

What other information should I provide in order to get help?

Thanks

PS. I am attaching both zentyal.log and error.log
« Last Edit: February 17, 2012, 02:28:05 pm by xtrgeo »

christian

« Reply #1 on: February 09, 2012, 07:46:41 am »
I don't know the root cause, but "wa" shows the amount of "IO wait", which is consistent with your point about "swap".
If your system needs to swap, this generates a lot of disk IO to move data between memory and disk. If the disk is slow (and disk IS indeed slow compared to memory, at least an HDD), then your system spends time "waiting".
So you should look for what is generating that much swap.
Systems are not supposed to swap. What is your hardware configuration, and how many applications are running on this system?

You will never get a healthy system if swap is not under tight control  8)
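
To put a number on the "wa" figure outside of top, here is a minimal sketch that computes the iowait share between two samples of the aggregate cpu line in /proc/stat. The function names are made up for illustration, and the 5 second interval is an arbitrary choice.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples.

    Field order is user, nice, system, idle, iowait, irq, softirq, steal.
    Only these eight are summed, because guest time is already part of user.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if len(deltas) < 5 or total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(seconds=5, path="/proc/stat"):
    """Read /proc/stat twice, `seconds` apart, and return the iowait share."""
    with open(path) as handle:
        before = parse_cpu_line(handle.readline())
    time.sleep(seconds)
    with open(path) as handle:
        after = parse_cpu_line(handle.readline())
    return iowait_percent(before, after)
```

Calling `sample_iowait(5)` while the machine feels slow should give a value comparable to what top reports for "wa" over the same period.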

xtrgeo

« Reply #2 on: February 09, 2012, 09:35:27 pm »
Hi Christian, and thanks for the answer.

My hardware configuration is:
Intel Core 2 Duo 2.2 GHz
1 GB RAM
160 GB HD

Most of the time my swap statistics are:

2805752 total
199268 used
The rest is free
153808 cached

Sometimes I see a postgres process causing a huge CPU load, about 85%.

What do you mean by applications? The modules?

The Module Status page lists:

Network
Firewall
Antivirus
Events*
Logs*
Monitoring*
Traffic shaping
Users and groups
Bandwidth monitor
Http Proxy


*Are these 3 related to the high load from postgres?

How can we determine what causes my system to "wait"?

Thanks!
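
One way to narrow this down is to separate how much swap is occupied from how actively the system is swapping right now. Swap that was filled once and then left alone causes little IO, while a steady stream of pages moving in and out would fit the "wait" symptom. The sketch below reads the pswpin and pswpout counters from /proc/vmstat twice and reports pages per second. The function names and the interval are illustrative assumptions, not part of any Zentyal tooling.

```python
import time


def parse_vmstat(text):
    """Turn the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, seconds):
    """Return (pages swapped in per second, pages swapped out per second)."""
    swapped_in = (after["pswpin"] - before["pswpin"]) / seconds
    swapped_out = (after["pswpout"] - before["pswpout"]) / seconds
    return swapped_in, swapped_out


def sample_swap_rates(seconds=5, path="/proc/vmstat"):
    """Sample /proc/vmstat twice, `seconds` apart, and return the swap rates."""
    with open(path) as handle:
        before = parse_vmstat(handle.read())
    time.sleep(seconds)
    with open(path) as handle:
        after = parse_vmstat(handle.read())
    return swap_rates(before, after, seconds)
```

If `sample_swap_rates()` stays near zero during a slow period, the disk wait is probably coming from something other than swapping, for example database or log writes.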

xtrgeo

« Reply #3 on: February 17, 2012, 02:26:55 pm »
Using a tool that counts swap usage per process, I get:

Code:
Overall swap used: 412568
PID=25234 - Swap used: 4 - (gvfs-gdu-volume )
PID=25239 - Swap used: 8 - (gvfs-gphoto2-vo )
PID=24987 - Swap used: 12 - (gnome-pty-helpe )
PID=25241 - Swap used: 12 - (gvfs-afc-volume )
PID=25237 - Swap used: 20 - (udisks-daemon )
PID=25236 - Swap used: 32 - (udisks-daemon )
PID=12157 - Swap used: 44 - (unlinkd )
PID=24796 - Swap used: 52 - (gvfsd )
PID=24983 - Swap used: 60 - (lxterminal )
PID=24792 - Swap used: 72 - (gnome-keyring-d )
PID=11019 - Swap used: 76 - (gam_server )
PID=1326 - Swap used: 96 - (getty )
PID=1370 - Swap used: 100 - (getty )
PID=1329 - Swap used: 104 - (getty )
PID=1371 - Swap used: 104 - (getty )
PID=1373 - Swap used: 104 - (getty )
PID=4214 - Swap used: 104 - (getty )
PID=5766 - Swap used: 108 - (zbwmonitor )
PID=1375 - Swap used: 112 - (acpid )
PID=24988 - Swap used: 112 - (bash )
PID=24829 - Swap used: 116 - (notification-da )
PID=11047 - Swap used: 136 - (hald-addon-stor )
PID=11046 - Swap used: 144 - (hald-addon-stor )
PID=326 - Swap used: 152 - (upstart-udev-br )
PID=1381 - Swap used: 156 - (atd )
PID=10950 - Swap used: 160 - (lxsession )
PID=1380 - Swap used: 168 - (cron )
PID=11048 - Swap used: 200 - (hald-addon-cpuf )
PID=11036 - Swap used: 208 - (hald-addon-inpu )
PID=11049 - Swap used: 208 - (hald-addon-acpi )
PID=11424 - Swap used: 224 - (gconfd-2 )
PID=11026 - Swap used: 248 - (hald-runner )
PID=10983 - Swap used: 268 - (ssh-agent )
PID=11053 - Swap used: 276 - (gvfsd )
PID=10986 - Swap used: 296 - (dbus-launch )
PID=10999 - Swap used: 296 - (pcmanfm )
PID=11421 - Swap used: 304 - (dbus-launch )
PID=11025 - Swap used: 324 - (menu-cached )
PID=1484 - Swap used: 324 - (ntpd )
PID=743 - Swap used: 332 - (rsyslogd )
PID=12148 - Swap used: 348 - (ldap_auth )
PID=12149 - Swap used: 352 - (ldap_auth )
PID=12152 - Swap used: 352 - (ldap_auth )
PID=12153 - Swap used: 352 - (ldap_auth )
PID=12151 - Swap used: 356 - (ldap_auth )
PID=11017 - Swap used: 368 - (polkitd )
PID=10987 - Swap used: 388 - (dbus-daemon )
PID=11422 - Swap used: 392 - (dbus-daemon )
PID=776 - Swap used: 404 - (dbus-daemon )
PID=10998 - Swap used: 412 - (lxpanel )
PID=11011 - Swap used: 440 - (gnome-keyring-d )
PID=3325 - Swap used: 516 - (collectd )
PID=1 - Swap used: 524 - (init )
PID=1144 - Swap used: 540 - (sshd )
PID=2472 - Swap used: 608 - (dansguardian )
PID=2477 - Swap used: 608 - (dansguardian )
PID=2487 - Swap used: 608 - (dansguardian )
PID=25779 - Swap used: 608 - (dansguardian )
PID=25781 - Swap used: 608 - (dansguardian )
PID=26734 - Swap used: 608 - (dansguardian )
PID=26737 - Swap used: 608 - (dansguardian )
PID=26738 - Swap used: 608 - (dansguardian )
PID=26745 - Swap used: 608 - (dansguardian )
PID=26746 - Swap used: 608 - (dansguardian )
PID=26748 - Swap used: 608 - (dansguardian )
PID=26749 - Swap used: 608 - (dansguardian )
PID=2816 - Swap used: 608 - (dansguardian )
PID=22454 - Swap used: 612 - (dansguardian )
PID=24329 - Swap used: 612 - (dansguardian )
PID=24330 - Swap used: 612 - (dansguardian )
PID=2473 - Swap used: 612 - (dansguardian )
PID=2474 - Swap used: 612 - (dansguardian )
PID=2475 - Swap used: 612 - (dansguardian )
PID=2476 - Swap used: 612 - (dansguardian )
PID=2478 - Swap used: 612 - (dansguardian )
PID=2479 - Swap used: 612 - (dansguardian )
PID=2480 - Swap used: 612 - (dansguardian )
PID=2481 - Swap used: 612 - (dansguardian )
PID=2482 - Swap used: 612 - (dansguardian )
PID=2484 - Swap used: 612 - (dansguardian )
PID=2486 - Swap used: 612 - (dansguardian )
PID=25475 - Swap used: 612 - (dansguardian )
PID=25780 - Swap used: 612 - (dansguardian )
PID=26733 - Swap used: 612 - (dansguardian )
PID=26735 - Swap used: 612 - (dansguardian )
PID=26739 - Swap used: 612 - (dansguardian )
PID=26740 - Swap used: 612 - (dansguardian )
PID=26741 - Swap used: 612 - (dansguardian )
PID=26742 - Swap used: 612 - (dansguardian )
PID=26743 - Swap used: 612 - (dansguardian )
PID=26747 - Swap used: 612 - (dansguardian )
PID=26750 - Swap used: 612 - (dansguardian )
PID=2820 - Swap used: 612 - (dansguardian )
PID=2821 - Swap used: 612 - (dansguardian )
PID=2825 - Swap used: 612 - (dansguardian )
PID=2826 - Swap used: 612 - (dansguardian )
PID=2827 - Swap used: 612 - (dansguardian )
PID=2829 - Swap used: 612 - (dansguardian )
PID=2830 - Swap used: 612 - (dansguardian )
PID=2831 - Swap used: 612 - (dansguardian )
PID=2832 - Swap used: 612 - (dansguardian )
PID=2833 - Swap used: 612 - (dansguardian )
PID=2834 - Swap used: 612 - (dansguardian )
PID=2836 - Swap used: 612 - (dansguardian )
PID=10011 - Swap used: 616 - (dansguardian )
PID=12099 - Swap used: 616 - (dansguardian )
PID=15683 - Swap used: 616 - (dansguardian )
PID=16368 - Swap used: 616 - (dansguardian )
PID=22047 - Swap used: 616 - (dansguardian )
PID=2471 - Swap used: 616 - (dansguardian )
PID=24731 - Swap used: 616 - (dansguardian )
PID=2483 - Swap used: 616 - (dansguardian )
PID=26736 - Swap used: 616 - (dansguardian )
PID=26744 - Swap used: 616 - (dansguardian )
PID=2818 - Swap used: 616 - (dansguardian )
PID=2824 - Swap used: 616 - (dansguardian )
PID=2828 - Swap used: 616 - (dansguardian )
PID=2835 - Swap used: 616 - (dansguardian )
PID=16364 - Swap used: 620 - (dansguardian )
PID=22046 - Swap used: 620 - (dansguardian )
PID=2470 - Swap used: 620 - (dansguardian )
PID=2822 - Swap used: 620 - (dansguardian )
PID=2823 - Swap used: 620 - (dansguardian )
PID=6446 - Swap used: 620 - (dansguardian )
PID=6615 - Swap used: 620 - (dansguardian )
PID=2817 - Swap used: 624 - (dansguardian )
PID=2819 - Swap used: 624 - (dansguardian )
PID=6447 - Swap used: 624 - (dansguardian )
PID=9560 - Swap used: 624 - (dansguardian )
PID=13713 - Swap used: 628 - (dansguardian )
PID=16366 - Swap used: 628 - (dansguardian )
PID=17799 - Swap used: 628 - (dansguardian )
PID=20062 - Swap used: 628 - (dansguardian )
PID=2485 - Swap used: 628 - (dansguardian )
PID=7271 - Swap used: 628 - (dansguardian )
PID=7328 - Swap used: 628 - (dansguardian )
PID=7879 - Swap used: 628 - (dansguardian )
PID=9154 - Swap used: 628 - (dansguardian )
PID=20061 - Swap used: 632 - (dansguardian )
PID=11426 - Swap used: 636 - (vino-server )
PID=11761 - Swap used: 636 - (dansguardian )
PID=12100 - Swap used: 636 - (dansguardian )
PID=20696 - Swap used: 636 - (dansguardian )
PID=13714 - Swap used: 640 - (dansguardian )
PID=17797 - Swap used: 640 - (dansguardian )
PID=17800 - Swap used: 640 - (dansguardian )
PID=20813 - Swap used: 640 - (dansguardian )
PID=4310 - Swap used: 640 - (dansguardian )
PID=15682 - Swap used: 644 - (dansguardian )
PID=16365 - Swap used: 644 - (dansguardian )
PID=17798 - Swap used: 644 - (dansguardian )
PID=20814 - Swap used: 644 - (dansguardian )
PID=7272 - Swap used: 644 - (dansguardian )
PID=8146 - Swap used: 648 - (dansguardian )
PID=16367 - Swap used: 652 - (dansguardian )
PID=8145 - Swap used: 652 - (dansguardian )
PID=17811 - Swap used: 656 - (dansguardian )
PID=4309 - Swap used: 656 - (dansguardian )
PID=16363 - Swap used: 668 - (dansguardian )
PID=10010 - Swap used: 676 - (dansguardian )
PID=9558 - Swap used: 688 - (dansguardian )
PID=22103 - Swap used: 692 - (sshd )
PID=12183 - Swap used: 696 - (dansguardian )
PID=464 - Swap used: 696 - (udevd )
PID=2837 - Swap used: 700 - (dansguardian )
PID=2838 - Swap used: 700 - (dansguardian )
PID=2839 - Swap used: 700 - (dansguardian )
PID=2840 - Swap used: 700 - (dansguardian )
PID=2841 - Swap used: 700 - (dansguardian )
PID=2842 - Swap used: 700 - (dansguardian )
PID=2843 - Swap used: 700 - (dansguardian )
PID=465 - Swap used: 700 - (udevd )
PID=10132 - Swap used: 712 - (dansguardian )
PID=6598 - Swap used: 712 - (sshd )
PID=352 - Swap used: 716 - (udevd )
PID=10997 - Swap used: 724 - (xscreensaver )
PID=6674 - Swap used: 724 - (sshd )
PID=22177 - Swap used: 732 - (sshd )
PID=12182 - Swap used: 736 - (dansguardian )
PID=12184 - Swap used: 752 - (dansguardian )
PID=7270 - Swap used: 752 - (dansguardian )
PID=11024 - Swap used: 772 - (hald )
PID=1931 - Swap used: 776 - (redis-server )
PID=1529 - Swap used: 828 - (bandwidthd )
PID=9559 - Swap used: 828 - (dansguardian )
PID=3083 - Swap used: 836 - (postgres )
PID=10133 - Swap used: 848 - (dansguardian )
PID=10994 - Swap used: 852 - (openbox )
PID=8826 - Swap used: 924 - (dansguardian )
PID=11000 - Swap used: 960 - (polkit-gnome-au )
PID=3812 - Swap used: 964 - (dansguardian )
PID=1490 - Swap used: 996 - (postgres )
PID=10878 - Swap used: 1016 - (console-kit-dae )
PID=1492 - Swap used: 1016 - (postgres )
PID=1528 - Swap used: 1036 - (bandwidthd )
PID=1495 - Swap used: 1048 - (postgres )
PID=1494 - Swap used: 1056 - (postgres )
PID=6675 - Swap used: 1068 - (bash )
PID=1493 - Swap used: 1080 - (postgres )
PID=4210 - Swap used: 1160 - (lxdm-binary )
PID=1946 - Swap used: 1176 - (redis-server )
PID=1527 - Swap used: 1396 - (bandwidthd )
PID=28189 - Swap used: 1948 - (apache2 )
PID=22178 - Swap used: 2084 - (bash )
PID=3625 - Swap used: 2120 - (slapd )
PID=28191 - Swap used: 2256 - (apache2 )
PID=28192 - Swap used: 2256 - (apache2 )
PID=28193 - Swap used: 2256 - (apache2 )
PID=28194 - Swap used: 2256 - (apache2 )
PID=28195 - Swap used: 2256 - (apache2 )
PID=1530 - Swap used: 2304 - (bandwidthd )
PID=27885 - Swap used: 2676 - (postgres )
PID=4217 - Swap used: 2772 - (Xorg )
PID=6743 - Swap used: 30772 - (apache2 )
PID=12143 - Swap used: 39932 - (squid )
PID=5730 - Swap used: 49796 - (clamd )
PID=6745 - Swap used: 75860 - (apache2 )
PID=27821 - Swap used: 77972 - (loggerd )

The numbers are in kilobytes.
loggerd, apache2, clamd and squid seem to be the most harmful.
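
Since the list is per PID, services that fork many workers are easy to underestimate: each dansguardian entry is small, but there are a lot of them. A minimal sketch that groups the report above by process name, assuming the exact "PID=... - Swap used: ... - (name )" line format shown in this post (the function name is made up):

```python
import re
from collections import defaultdict

# Matches lines such as: PID=12143 - Swap used: 39932 - (squid )
LINE_RE = re.compile(r"PID=(\d+) - Swap used: (\d+) - \((.*?)\s*\)$")


def swap_by_name(report):
    """Sum swap usage per process name, largest total first."""
    totals = defaultdict(int)
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # the "Overall swap used" header does not match and is skipped
            totals[match.group(3)] += int(match.group(2))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Forked workers can share swapped pages, so a per-name sum may overstate the real total. It is still a quick way to see which service is worth looking at first.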


The command dpkg -l | grep zentyal gives the output below:

Code:
ii  l7-filter-userspace                  0.11-4+zentyal                        Userspace layer 7 packet classifier
ii  libhtml-mason-perl                   1:1.44-1+zentyal1                     HTML::Mason Perl module
ii  liblog-any-perl                      0.11-1+zentyal1                       Log anywhere
ii  libredis-perl                        2:2.0.1-0ubuntu1+zentyal1             persistent key-value database with network interface (P
ii  zentyal                              2.2                                   Zentyal - Core metapackage
ii  zentyal-antivirus                    2.2                                   Zentyal - Antivirus
ii  zentyal-bwmonitor                    2.2.4                                 Zentyal - Bandwidth Monitor
ii  zentyal-ca                           2.2.2                                 Zentyal - Certification Authority
ii  zentyal-common                       2.2.3                                 Zentyal - Common Library
ii  zentyal-core                         2.2.5                                 Zentyal - Core
ii  zentyal-firewall                     2.2                                   Zentyal - Firewall
ii  zentyal-l7-protocols                 2.2                                   Zentyal - Layer-7 Filter
ii  zentyal-monitor                      2.2.2                                 Zentyal - Monitor
ii  zentyal-network                      2.2.5                                 Zentyal - Network Configuration
ii  zentyal-objects                      2.2                                   Zentyal - Network Objects
ii  zentyal-openvpn                      2.2.1                                 Zentyal - VPN Service
ii  zentyal-remoteservices               2.2.3                                 Zentyal - Cloud Client
ii  zentyal-services                     2.2                                   Zentyal - Network Services
ii  zentyal-software                     2.2.2                                 Zentyal - Software Management
ii  zentyal-squid                        2.2.2                                 Zentyal - HTTP Proxy (Cache and Filter)
ii  zentyal-trafficshaping               2.2                                   Zentyal - Traffic Shaping
ii  zentyal-users                        2.2.5                                 Zentyal - Users and Groups

And the output of the top command:

Code:
top - 15:22:21 up 2 days, 23:04,  3 users,  load average: 3.17, 3.02, 2.73
Tasks: 271 total,   1 running, 266 sleeping,   0 stopped,   4 zombie
Cpu(s):  1.4%us,  1.9%sy,  0.0%ni, 93.1%id,  3.2%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:    957160k total,   901132k used,    56028k free,    14148k buffers
Swap:  2805752k total,   346404k used,  2459348k free,   218628k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5730 root      20   0  548m 116m 1552 S    1 12.5 199:36.54 clamd
 1527 root      21   1 49456 4464 2492 S    1  0.5  21:07.39 bandwidthd
 1528 root      21   1 49032 4320 2480 S    1  0.5  22:07.09 bandwidthd
 1529 root      21   1 48608 4320 2480 S    1  0.5  21:21.76 bandwidthd
12143 proxy     20   0  153m  93m 1404 S    1 10.0   4:42.19 squid
 1530 root      21   1 48608 2612 2456 S    0  0.3  23:07.17 bandwidthd
 5563 dansguar  20   0 51972 1680 1208 S    0  0.2   0:00.02 dansguardian
 5766 root      20   0 18416 2684 2472 S    0  0.3  25:17.53 zbwmonitor
 5782 xtrgeo    20   0 19356 1520 1028 R    0  0.2   0:00.14 top
10998 xtrgeo    20   0  158m 7600 5480 S    0  0.8   0:23.79 lxpanel
11426 root      20   0  205m  11m 9584 S    0  1.2   0:16.49 vino-server
13713 dansguar  20   0 52492 2056 1188 S    0  0.2   0:00.90 dansguardian
20813 dansguar  20   0 53048 2592 1188 S    0  0.3   0:01.00 dansguardian
22046 dansguar  20   0 52460 2024 1188 S    0  0.2   0:00.61 dansguardian
    1 root      20   0 62128 1636  912 S    0  0.2   0:28.13 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.01 kthreadd
    3 root      RT   0     0    0    0 S    0  0.0   0:01.38 migration/0
    4 root      20   0     0    0    0 S    0  0.0   0:29.85 ksoftirqd/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    6 root      RT   0     0    0    0 S    0  0.0   0:01.48 migration/1
    7 root      20   0     0    0    0 S    0  0.0   1:06.61 ksoftirqd/1
    8 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    9 root      20   0     0    0    0 S    0  0.0   0:06.96 events/0
   10 root      20   0     0    0    0 S    0  0.0   0:02.38 events/1
   11 root      20   0     0    0    0 S    0  0.0   0:00.00 cpuset
   12 root      20   0     0    0    0 S    0  0.0   0:00.69 khelper
   13 root      20   0     0    0    0 S    0  0.0   0:00.00 netns
   14 root      20   0     0    0    0 S    0  0.0   0:00.00 async/mgr
   15 root      20   0     0    0    0 S    0  0.0   0:00.00 pm

This server only works as a gateway for users and balances traffic across 4 gateways. Nothing else.

What do you think?
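
One detail in the top output above: the load average is around 3 while the CPU is 93.1% idle and only 1 task is running. On Linux the load average also counts processes in uninterruptible sleep (state D), which usually means they are blocked on disk IO. A rough sketch for listing such processes from /proc/<pid>/stat follows. The function names are hypothetical, and the list is only a snapshot, so it may need to be run several times during a slow period.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so the split is done on the LAST ')' in the line.
    """
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx])
    comm = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

If the same names (for example postgres or loggerd) keep showing up in `blocked_processes()`, that points at who is waiting on the disk, though not necessarily at who is causing the IO.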