Friday, December 29, 2023

[SOLVED] 100% CPU Utilization on my VPS - Django app gunicorn

December 29, 2023 django, docker-compose, gunicorn, nginx, ubuntu

Issue

I have a VPS running Nginx, Django, Postgres, and a Golang microservice in a Docker Compose environment. Recently I noticed that it consistently hits 100% CPU utilization and stops responding. I suspect this may be due to a DDoS attack or odd gunicorn behavior.

VPS OS: Ubuntu 22.04.2 LTS

Observations:

The high CPU usage started about 24 hours ago.

Steps Taken:

Set up and configured ufw and a DigitalOcean firewall.

NGINX Logs

...
89.44.9.51 - - [29/Sep/2023:05:09:58 +0000] "OPTIONS /webclient/api/MyPhone/session HTTP/1.1" 444 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) 3CXDesktopApp/18.13.959 Chrome/112.0.5615.165 Electron/24.3.0 Safari/537.36" "-"
89.44.9.51 - - [29/Sep/2023:05:10:01 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:07 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:12 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:17 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:22 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:27 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:32 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
89.44.9.51 - - [29/Sep/2023:05:10:38 +0000] "GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" 444 0 "-" "electron-fetch/1.0 electron (+https://github.com/arantes555/electron-fetch)" "-"
...
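
To judge whether the traffic in a log like this looks like a flood, it can help to tally requests per client IP and status code. The following is a minimal sketch, assuming the Nginx log format shown above; the `tally` helper and the shortened user-agent strings in `sample` are illustrative and not part of the original setup.

```python
import re
from collections import Counter

# Matches the start of an access log line in the format shown above:
# client IP, timestamp, request line, status code.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) '
)


def tally(lines):
    """Count requests per (client IP, status code) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match["ip"], match["status"])] += 1
    return counts


# Two lines from the excerpt, with the user-agent strings shortened.
sample = [
    '89.44.9.51 - - [29/Sep/2023:05:09:58 +0000] '
    '"OPTIONS /webclient/api/MyPhone/session HTTP/1.1" 444 0 "-" "Mozilla/5.0" "-"',
    '89.44.9.51 - - [29/Sep/2023:05:10:01 +0000] '
    '"GET /provisioning/5lbr5h6kse0q/TcxProvFiles/3cxProv_YU8SR32OGF200.xml HTTP/1.1" '
    '444 0 "-" "electron-fetch/1.0" "-"',
]

for (ip, status), n in tally(sample).most_common():
    print(ip, status, n)
```

In the excerpt, every request comes from a single IP roughly once every five seconds and is answered with 444, the Nginx-specific status for closing the connection without a response, so these requests should not reach gunicorn at all.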

TOP

  15284 root      20   0  298336 133468  18568 S  90.7   3.3   4:06.24 gunicorn
  15477 lxd       20   0  216928  18104  15104 R   8.6   0.5   0:03.21 postgres

When I access the domain, the CPU immediately hits its limit. I was implementing some new features that worked fine locally, but I can't think of anything in them that would cause this. I also rolled everything back and still had the same issue.

I'm stuck on where to look next.

I would appreciate any hints.

PS: If you need more information, let me know in the comments. I didn't want to make this post longer than necessary.

Thank you in advance!

EDIT with additional information

docker logs 7e3e93cda248 -f
Collect static files

254 static files copied to '/staticfiles', 756 post-processed.
Apply database migrations
System check identified some issues:

WARNINGS:
?: (urls.W005) URL namespace 'v1' isn't unique. You may not be able to reverse all URLs in this namespace
Operations to perform:
  Apply all migrations: admin, app, auth, contenttypes, sessions
Running migrations:
  No migrations to apply.
System check identified some issues:

WARNINGS:
?: (urls.W005) URL namespace 'v1' isn't unique. You may not be able to reverse all URLs in this namespace
No changes detected in app 'app'
System check identified some issues:

WARNINGS:
?: (urls.W005) URL namespace 'v1' isn't unique. You may not be able to reverse all URLs in this namespace
Operations to perform:
  Apply all migrations: admin, app, auth, contenttypes, sessions
Running migrations:
  No migrations to apply.
[2023-09-29 04:52:38 +0000] [10] [INFO] Starting gunicorn 21.2.0
[2023-09-29 04:52:38 +0000] [10] [INFO] Listening at: http://0.0.0.0:8000 (10)
[2023-09-29 04:52:38 +0000] [10] [INFO] Using worker: gthread
[2023-09-29 04:52:38 +0000] [11] [INFO] Booting worker with pid: 11
Not Found: /favicon.ico

gunicorn --pythonpath . app.wsgi:application --bind 0.0.0.0:8000 --timeout 120 --threads=3

2 vCPUs
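
For diagnosis, the same flags can be written as a gunicorn config file with request timing added to the access log, so that slow endpoints show up by path. This is a sketch, not the original setup: the file name `gunicorn.conf.py` and the two access-log settings are assumptions added here, while the other values mirror the command above.

```python
# gunicorn.conf.py (sketch): mirrors the flags from the command above.
wsgi_app = "app.wsgi:application"
pythonpath = "."
bind = "0.0.0.0:8000"
threads = 3
timeout = 120

# Assumption: added for diagnosis, not part of the original command.
accesslog = "-"  # "-" writes the access log to stdout
# %(h)s = remote address, %(r)s = request line, %(s)s = status,
# %(L)s = request time in decimal seconds.
access_log_format = '%(h)s "%(r)s" %(s)s %(L)s'
```

It would be started with `gunicorn -c gunicorn.conf.py`. Requests with a large time value in the last column are the ones worth profiling first.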

Solution

So it was an application-level problem. I could have guessed that when I saw that the Postgres process was also consuming CPU.

There was a poorly performing ORM operation that somehow ended in a timeout.
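
The answer does not say which ORM operation was responsible. One way to track down such a query is to have Django log every SQL statement together with its duration. The fragment below is a minimal sketch of a `settings.py` logging configuration, assuming a local or staging copy of the project; the handler name `console` is an arbitrary choice.

```python
# settings.py fragment (sketch). Django only emits SQL through the
# "django.db.backends" logger when DEBUG = True, so this is meant for
# a local or staging copy rather than production.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Print log records to the console (stderr).
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # Each executed query is logged at DEBUG level with its duration.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}
```

With this in place, loading the slow page prints each query and the time it took, which makes an unusually slow or heavily repeated query easier to spot.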

Answered By - Softwareentwicklung Freelancer

Answer Checked By - Cary Denson (WPSolving Admin)

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
