Issue
I have a cron job set up on a Kubernetes cluster to process millions of records from a database. Sometimes the pod for the cron job gets Evicted or OOMKilled. The issue is that whenever the cron job starts again, it processes all the records from the beginning.
I want to understand how I should approach storing the progress of this cron job somewhere. If I store it in a database, how frequently should I make a DB call to save the state?
Solution
I recommend fixing the OOM issue instead of looking for workarounds. I've listed my thoughts on both.
Fixing OOM: If the CronJob is processing millions of records and hitting OOM, this is most likely due to a memory leak. I would check whether data structures and other resources are released once the job is done with them. Another option is to increase the memory allocated to the pod.
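One way to check for a leak, sketched here in Python on the assumption that the job is written in Python (the question does not say), is to measure allocated memory after each batch with the standard library's `tracemalloc`. The functions `measure_growth`, `leaky_batch`, and `clean_batch` are hypothetical names made up for illustration. If the reading keeps climbing from batch to batch, something is holding on to processed records.

```python
import tracemalloc


def measure_growth(step, iterations=3):
    """Run `step` repeatedly and return traced memory in bytes after each run."""
    tracemalloc.start()
    try:
        readings = []
        for _ in range(iterations):
            step()
            current, _peak = tracemalloc.get_traced_memory()
            readings.append(current)
        return readings
    finally:
        tracemalloc.stop()


# Hypothetical stand-ins for one batch of the real job.
_leaky_cache = []


def leaky_batch():
    # Bug being illustrated: every processed record stays referenced forever.
    _leaky_cache.extend([str(i) * 10 for i in range(10_000)])


def clean_batch():
    # Records go out of scope when the function returns, so memory is freed.
    rows = [str(i) * 10 for i in range(10_000)]
    return sum(len(r) for r in rows)


if __name__ == "__main__":
    print("leaky:", measure_growth(leaky_batch))
    print("clean:", measure_growth(clean_batch))
```

With the leaky version the readings grow on every iteration, while the clean version stays roughly flat. For a real job, `tracemalloc.take_snapshot()` and `Snapshot.compare_to()` can then point at the lines responsible.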
Workaround: If you are already using a database, it doesn't make much sense to introduce another technology just to save the progress. You can create a table for the cron job's progress and update it after each batch of records is processed, storing the page number or offset the job has reached.
Answered By - Jaswanthi Kolla Answer Checked By - Candace Johnson (WPSolving Volunteer)