Thursday, October 6, 2022

[SOLVED] Too Many Cron Jobs giving me "bash: fork: Resource temporarily unavailable"

Issue

Today I noticed that some cron jobs were not performing as they should, and that the log files contained the following error line:

/bin/sh: fork: Resource temporarily unavailable

I did some research and found out it may have to do with the number of processes that a user is allowed to run.

I then ran:

top -u

and indeed there are tons of sh and curl processes which I think shouldn't be there.

Most of the processes are simple calls to local php files running some DB maintenance tasks.

All the processes are running on my local machine, so I have full access to everything. If I knew how, I would change the limit, but I can't find any information about this issue that applies specifically to Mac OS X Lion.

Also, I am not sure why the processes don't disappear after they are executed.

Is there any way to kill the process after it is executed?

Any hint in the right direction will be much appreciated! Thanks


Solution

The only sane thing to do is to use lockfiles to guarantee that only one instance of each particular cron job runs at a time. The simplest way is to handle the lockfile from within the cron scripts themselves ("cooperative locking"):

  • On startup, the (cron) job tests whether the lockfile exists.
  • If the lockfile exists, the job reads the pid stored in it and performs a kill -0 <pid> on that other process (#1).
  • If the exit status of the kill is zero, the process actually exists and belongs to the same userid. The new job should exit (#2).
  • If the exit status of the kill is not zero, either the process no longer exists (good) or the pid now belongs to an unrelated process owned by a different uid.
  • If the lockfile does not exist, or the process it names is gone, the new job can continue by creating the lockfile and writing its own pid into it (#3).
  • Now the actual payload can be executed.
  • Finally, the lockfile can be removed.

#1: kill -0 sends no signal; it only checks that the pid exists and that we are allowed to signal it.

#2: there is a small chance that the pid has been reused by an unrelated process owned by our uid. We can refine the check by inspecting the output of ps and verifying that the pid actually belongs to an older instance of our cron job.
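
The ps refinement from #2 could look roughly like the sketch below. The function name pid_runs_job and the idea of matching on a job-name substring are assumptions for illustration, not part of the original answer; any string that reliably appears in the job's command line would do.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): succeeds only if the pid is
# alive AND its command line contains the expected job name.
pid_runs_job() {
    pid=$1
    job_name=$2

    # kill -0 sends no signal; it only tests that the pid can be signalled.
    kill -0 "$pid" 2>/dev/null || return 1

    # "-o args=" prints the full command line of that pid with no header row.
    cmdline=$(ps -p "$pid" -o args= 2>/dev/null) || return 1

    # Substring match on the command line of the running process.
    case $cmdline in
        *"$job_name"*) return 0 ;;
        *)             return 1 ;;
    esac
}
```

A job would call it as pid_runs_job "$old_pid" "maintenance.php" (a made-up script name) and treat the lockfile as stale whenever it returns non-zero.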

#3: this is not race-free, but for a cronjob that runs once a minute it is probably good enough.
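
The steps above can be sketched as a small shell wrapper. This is only a minimal illustration under assumptions: the lockfile path and the function names acquire_lock, release_lock and run_locked are hypothetical, and each cron job would need its own lockfile.

```shell
#!/bin/sh
# Hypothetical lockfile location; every cron job should use a distinct path.
LOCKFILE=${LOCKFILE:-/tmp/db-maintenance.lock}

# Returns 0 if the lock was taken, 1 if a live instance already holds it.
acquire_lock() {
    if [ -f "$LOCKFILE" ]; then
        old_pid=$(cat "$LOCKFILE" 2>/dev/null)
        # kill -0 sends no signal; status 0 means the recorded pid is alive.
        if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
            return 1
        fi
        # Otherwise the lockfile is stale and is simply overwritten below.
    fi
    # Not atomic (see #3): two jobs starting together could both get here.
    echo "$$" > "$LOCKFILE"
}

release_lock() {
    rm -f "$LOCKFILE"
}

# Runs the given command as the payload, but only if no other instance is alive.
run_locked() {
    if ! acquire_lock; then
        echo "previous instance still running, skipping" >&2
        return 0
    fi
    "$@"
    status=$?
    release_lock
    return "$status"
}
```

The cron script would then end with something like run_locked /usr/bin/php /path/to/task.php (an illustrative path). If the job is killed before release_lock runs, the leftover lockfile is detected as stale on the next run because kill -0 fails for the dead pid. Creating the lockfile atomically, for example with set -o noclobber or mkdir, would narrow the race mentioned in #3, but that is a different technique from the one described here.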



Answered By - wildplasser
Answer Checked By - Willingham (WPSolving Volunteer)