Issue
I have a peculiar issue while trying to set up Selenium WebDriver on an AWS EC2 instance running Red Hat. I'm using Google Chrome version 120.0.6099.109 on this instance.
I have been trying to install the appropriate ChromeDriver for this Chrome version, but I've encountered an intermittent problem. My Selenium Python script sometimes runs successfully, and other times, it reports that "chrome has crashed." I have tried multiple versions of ChromeDriver without consistent success.
Here's the version of Google Chrome I have:
$ google-chrome --version
Google Chrome 120.0.6099.109
I've attempted to download the corresponding ChromeDriver version (e.g., 120.0.6099.71), but the issue persists. I've also tried various versions of ChromeDriver with no consistent results.
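For reference, ChromeDriver's versioning is documented as pairing with a Chrome whose major, minor, and build numbers match, while the final patch number may differ. The sketch below checks that rule against the two version strings mentioned above. The function names `parse_version` and `builds_match` are made up for illustration, and the input strings are assumed to look like the output of `google-chrome --version` and `chromedriver --version`.

```python
import re


def parse_version(text: str) -> tuple:
    """Extract the first four-part dotted version number from a --version line."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def builds_match(chrome_output: str, driver_output: str) -> bool:
    """True when major, minor, and build agree; the patch number is ignored."""
    return parse_version(chrome_output)[:3] == parse_version(driver_output)[:3]


# The pair from the question: same 120.0.6099 build, different patch numbers.
print(builds_match("Google Chrome 120.0.6099.109", "ChromeDriver 120.0.6099.71"))
```

If this check passes, as it does for 120.0.6099.109 and 120.0.6099.71, a version mismatch is probably not what causes the intermittent failure.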
Below is a snippet of the Python script using Selenium:
from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
# Download preferences; file_path_descargas_guias is defined elsewhere in the script.
prefs = {
    "download.prompt_for_download": False,
    "plugins.always_open_pdf_externally": True,
    "download.open_pdf_in_system_reader": False,
    "profile.default_content_settings.popups": 0,
    "download.default_directory": file_path_descargas_guias
}
chrome_options.add_experimental_option('prefs', prefs)
# Flags for running Chrome without a display.
chrome_options.add_argument("--headless")
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument("--disable-dev-shm-usage")
driver = webdriver.Chrome(options=chrome_options)
The weird thing is that sometimes it does work, but most of the time it does not: the cell in JupyterLab keeps "running" until I get this error:
---------------------------------------------------------------------------
SessionNotCreatedException Traceback (most recent call last)
Cell In[11], line 16
13 chrome_options.add_argument('--no-sandbox')
14 chrome_options.add_argument("--disable-dev-shm-usage")
---> 16 driver = webdriver.Chrome(options=chrome_options)
File ~/anaconda3/lib/python3.11/site-packages/selenium/webdriver/chrome/webdriver.py:45, in WebDriver.__init__(self, options, service, keep_alive)
42 service = service if service else Service()
43 options = options if options else Options()
---> 45 super().__init__(
46 browser_name=DesiredCapabilities.CHROME["browserName"],
47 vendor_prefix="goog",
48 options=options,
49 service=service,
50 keep_alive=keep_alive,
51 )
File ~/anaconda3/lib/python3.11/site-packages/selenium/webdriver/chromium/webdriver.py:61, in ChromiumDriver.__init__(self, browser_name, vendor_prefix, options, service, keep_alive)
52 executor = ChromiumRemoteConnection(
53 remote_server_addr=self.service.service_url,
54 browser_name=browser_name,
(...)
57 ignore_proxy=options._ignore_local_proxy,
58 )
60 try:
---> 61 super().__init__(command_executor=executor, options=options)
62 except Exception:
63 self.quit()
File ~/anaconda3/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py:209, in WebDriver.__init__(self, command_executor, keep_alive, file_detector, options)
207 self._authenticator_id = None
208 self.start_client()
--> 209 self.start_session(capabilities)
File ~/anaconda3/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py:293, in WebDriver.start_session(self, capabilities)
286 """Creates a new session with the desired capabilities.
287
288 :Args:
289 - capabilities - a capabilities dict to start the session with.
290 """
292 caps = _create_caps(capabilities)
--> 293 response = self.execute(Command.NEW_SESSION, caps)["value"]
294 self.session_id = response.get("sessionId")
295 self.caps = response.get("capabilities")
File ~/anaconda3/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py:348, in WebDriver.execute(self, driver_command, params)
346 response = self.command_executor.execute(driver_command, params)
347 if response:
--> 348 self.error_handler.check_response(response)
349 response["value"] = self._unwrap_value(response.get("value", None))
350 return response
File ~/anaconda3/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py:229, in ErrorHandler.check_response(self, response)
227 alert_text = value["alert"].get("text")
228 raise exception_class(message, screen, stacktrace, alert_text) # type: ignore[call-arg] # mypy is not smart enough here
--> 229 raise exception_class(message, screen, stacktrace)
SessionNotCreatedException: Message: session not created: DevToolsActivePort file doesn't exist
Stacktrace:
#0 0x5635242acd33 <unknown>
#1 0x563523f69f87 <unknown>
#2 0x563523fa5e21 <unknown>
#3 0x563523fa1d9f <unknown>
#4 0x563523f9e4de <unknown>
#5 0x563523feea90 <unknown>
#6 0x563523fe30e3 <unknown>
#7 0x563523fab044 <unknown>
#8 0x563523fac44e <unknown>
#9 0x563524271861 <unknown>
#10 0x563524275785 <unknown>
#11 0x56352425f285 <unknown>
#12 0x56352427641f <unknown>
#13 0x56352424320f <unknown>
#14 0x56352429a028 <unknown>
#15 0x56352429a1f7 <unknown>
#16 0x5635242abed4 <unknown>
#17 0x7f84fd69f802 start_thread
or:
I did download and unzip the ChromeDriver and put it on my PATH, but I'm lost right now.
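One way to see which driver binary is actually being picked up is to resolve it explicitly and hand the result to Selenium's `Service`. This is only a sketch: `resolve_chromedriver` and `build_service` are hypothetical helper names, and the fallback comment assumes a Selenium 4 release recent enough (4.6 or later, as far as I know) to ship Selenium Manager.

```python
import shutil
from typing import Optional


def resolve_chromedriver(explicit_path: Optional[str] = None,
                         name: str = "chromedriver") -> Optional[str]:
    """Return the driver binary to use, or None if nothing is found on PATH."""
    if explicit_path:
        return explicit_path
    # shutil.which searches PATH the same way the shell would.
    return shutil.which(name)


def build_service(driver_path: Optional[str]):
    """Create a Selenium Service; Selenium is imported lazily, only when needed."""
    from selenium.webdriver.chrome.service import Service

    # With no explicit path, Selenium is left to locate a driver on its own
    # (assumption: Selenium Manager is available in the installed version).
    return Service(executable_path=driver_path) if driver_path else Service()
```

Usage would look like `webdriver.Chrome(service=build_service(resolve_chromedriver()), options=chrome_options)`. Printing the resolved path first shows whether an older ChromeDriver earlier on PATH is shadowing the one that was just unzipped.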
I am seeking any insights into the following:
- The recommended version of ChromeDriver for Google Chrome 120.0.6099.109 on Linux/Red Hat.
- Any specific configuration or adjustments needed to run Selenium WebDriver in headless mode on an AWS EC2 instance without display hardware.
- The proper way to set the path of the WebDriver.
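On the headless-configuration question, a minimal sketch is shown below. It rests on an assumption rather than a diagnosis: that Chrome processes left behind by interrupted notebook cells may contribute to intermittent startup failures on a small instance, so the driver is always quit. `headless_flags` and `chrome_session` are hypothetical names, and `--headless=new` is the newer headless mode that, as far as I know, Chrome accepts from version 109 onward.

```python
from contextlib import contextmanager

# Flags taken from the question, with the newer headless mode swapped in.
HEADLESS_FLAGS = [
    "--headless=new",
    "--no-sandbox",
    "--disable-dev-shm-usage",
]


def headless_flags(extra=()):
    """Return the base flags followed by any caller-supplied extras."""
    return [*HEADLESS_FLAGS, *extra]


@contextmanager
def chrome_session(flags=None):
    """Yield a Chrome driver and always quit it, even if the body raises."""
    from selenium import webdriver  # imported lazily, only when a session starts

    options = webdriver.ChromeOptions()
    for flag in (flags if flags is not None else headless_flags()):
        options.add_argument(flag)
    driver = webdriver.Chrome(options=options)
    try:
        yield driver
    finally:
        driver.quit()  # releases the Chrome and ChromeDriver processes
```

A cell would then use `with chrome_session() as driver: driver.get(...)`, so a failed or interrupted run does not leave a browser process behind.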
Solution
Instead of using selenium, you can try playwright. Playwright is a newer Python module that does not require a separate WebDriver binary.
You can install it using: pip install playwright
Then install the browsers it manages: playwright install
This downloads the Chromium, WebKit, and Firefox builds that playwright uses. You can also drive an installed Google Chrome or Microsoft Edge by selecting it through a channel.
For example:
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(channel="chrome")  # launches the installed Google Chrome
    page = browser.new_page()
    page.goto('https://example.com')
    print(page.title())  # title() is a method, so it has to be called
    page.click('button#something')  # placeholder CSS selector; replace with a real one
    page.close()
    browser.close()
That should work. Note that in the line browser = p.chromium.launch(channel="chrome") you are not launching Chromium itself; you are launching Chrome. The launcher is still p.chromium because Chrome and Microsoft Edge are Chromium-based browsers and are selected through the channel argument. You can do the same with Firefox and WebKit by replacing chromium with firefox or webkit, but only pass the channel argument when you are running chrome or msedge.
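The browser and channel rule described above can be sketched as a small lookup table. `LAUNCH_TARGETS`, `launch_target`, and `open_title` are hypothetical names used only for illustration; the Playwright calls themselves (`sync_playwright`, `launch`, `new_page`, `goto`, `title`, `close`) are the same ones used in the example.

```python
# Maps a friendly browser name to (Playwright launcher attribute, channel or None).
LAUNCH_TARGETS = {
    "chromium": ("chromium", None),
    "chrome": ("chromium", "chrome"),
    "msedge": ("chromium", "msedge"),
    "firefox": ("firefox", None),
    "webkit": ("webkit", None),
}


def launch_target(browser: str):
    """Return the launcher name and channel for a browser, or raise ValueError."""
    try:
        return LAUNCH_TARGETS[browser]
    except KeyError:
        raise ValueError(f"unsupported browser: {browser!r}") from None


def open_title(browser: str, url: str) -> str:
    """Open url in the chosen browser and return the page title."""
    from playwright.sync_api import sync_playwright  # imported only when used

    engine, channel = launch_target(browser)
    # The channel argument is passed only for Chrome and Edge.
    kwargs = {"channel": channel} if channel else {}
    with sync_playwright() as p:
        instance = getattr(p, engine).launch(**kwargs)
        try:
            page = instance.new_page()
            page.goto(url)
            return page.title()
        finally:
            instance.close()
```

For example, `open_title("firefox", "https://example.com")` would go through `p.firefox.launch()` with no channel, while `open_title("chrome", ...)` goes through `p.chromium.launch(channel="chrome")`.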
Note that playwright launches the browser in headless mode by default (no visible window is opened), much like selenium does when given the --headless option. To turn headless mode off, pass headless=False when launching the browser. On an EC2 instance without a display, keep the default.
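A rough way to choose the headless setting automatically is to look for a display before launching. This sketch assumes an X11-style Linux setup where a visible window is only possible when the `DISPLAY` environment variable is set; `should_run_headless` and `launch_chromium` are hypothetical helper names.

```python
import os


def should_run_headless(environ=None) -> bool:
    """Run headless unless a display is advertised through DISPLAY."""
    env = os.environ if environ is None else environ
    return not env.get("DISPLAY")


def launch_chromium(playwright, environ=None):
    """Launch Chromium, showing a window only when a display is available."""
    return playwright.chromium.launch(headless=should_run_headless(environ))
```

Inside `with sync_playwright() as p:` this would be called as `browser = launch_chromium(p)`. On a typical EC2 instance `DISPLAY` is unset, so the browser stays headless.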
The reason I suggest playwright instead of selenium is that selenium is older and, in my opinion, not the best choice for browser automation. Playwright is a newer, separate library (not a version of selenium) that does not use ChromeDriver, so there is no driver version to keep in step with Chrome, and it will probably help you with your crashing problem.
Playwright: https://playwright.dev/python/docs/intro
Answered By - 5rod Answer Checked By - Katrina (WPSolving Volunteer)