Issue
I am having difficulties running an R script that uses the h2o library via cron on Linux.
The script runs perfectly fine in interactive mode, but it fails when scheduled with cron.
Part of the code causing the error:
automl_h2o_models <- h2o.automl(
  x = predictors,
  y = target,
  training_frame = train_conv_h2o,
  leaderboard_frame = valid_conv_h2o,
  max_runtime_secs = 3600,
  seed = 1234
)
When max_runtime_secs is set to 1800 there is no issue, but any larger value results in the error below.
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
Unexpected CURL error: getaddrinfo() thread failed to start
I am on Ubuntu 20.04, R version 3.6.3, and h2o version 3.32.1.3.
Solution
The issue is related to the limit on the number of open file descriptors in Linux. The environment cron runs jobs in differs from the environment of an interactive session, so the limit in effect under cron is not necessarily the one you see interactively.
As a solution, I raised the soft limit in my cron entry before calling Rscript:
0 18 21 6 * ulimit -nS 1048576 && Rscript <script_name>
Then the error disappeared and the script ran correctly.
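To check whether the limits really differ between cron and an interactive shell, and to raise the soft limit without asking for more than the hard limit allows, something like the following could be used. This is only a sketch: the function names report_nofile_limits and raise_nofile_limit are made up for illustration and are not part of cron, R, or h2o. It assumes a POSIX sh with the usual ulimit -S/-H/-n options (as in dash and bash on Ubuntu).

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# Print the soft and hard open-file limits of the current shell.
# Running this once interactively and once from cron shows whether they differ.
report_nofile_limits() {
    printf 'soft=%s hard=%s\n' "$(ulimit -Sn)" "$(ulimit -Hn)"
}

# Set the soft open-file limit to the requested value, capped at the hard
# limit, because an unprivileged process cannot raise the soft limit above it.
raise_nofile_limit() {
    nofile_wanted=$1
    nofile_hard=$(ulimit -Hn)
    if [ "$nofile_hard" != "unlimited" ] && [ "$nofile_wanted" -gt "$nofile_hard" ]; then
        nofile_wanted=$nofile_hard
    fi
    ulimit -Sn "$nofile_wanted"
}
```

In a wrapper script started by cron, one would call report_nofile_limits (redirected to a log file) and then raise_nofile_limit 1048576 before the Rscript line, so the limit applies to the R process started from that same shell.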
Answered By - Tomas