Step-by-Step Guide to Use R and Selenium on Windows

[This article was first published on pacha.dev/blog, and kindly contributed to R-bloggers].

Because of delays with my scholarship payment, if this post is useful to you I kindly ask for a minimal donation on Buy Me a Coffee. It will be used to continue my Open Source efforts. The full explanation is here: A Personal Message from an Open Source Contributor.

You can send me questions for the blog using this form and subscribe to receive an email when there is a new post.

Motivation

I got this question: I followed your Selenium post and it does not work on Windows. How can I fix that?

The post in question is here, and after testing on a Windows machine I realised that the issue was related to the fact that newer Google Chrome versions (>119) do not provide ChromeDriver, the software that Selenium uses to control the browser, and do not work with the most recent version you can download from Google.

Here is how to use Mozilla Firefox instead.

Required software

- Mozilla Firefox and GeckoDriver: web browser and remote control program
- RSelenium: R-Selenium integration
- rvest: HTML processing
- dplyr: to load the pipe operator (can be used later for data cleaning)
- purrr: iteration (i.e., repeated operations)

I installed Mozilla Firefox from the official website and followed the installer.

For GeckoDriver, I downloaded it from here for Windows 64-bit and saved “geckodriver.exe” to a new folder “C:”. Then, I had to add the folder to the PATH like this:

1. Press Win + S.
2. Type “Environment variables”.
3. Open “Edit the system environment variables”.
4. Click “Environment variables”.
5. In “System variables”, find and select “Path”, then click “Edit”.
6. Click “New” and add “C:” without quotes.
7. Click OK to save.

Then restart RStudio and close PowerShell if it is open.
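After restarting RStudio, one way to confirm that the PATH change took effect is to ask R where it finds the executable. The following is a minimal sketch, assuming the file kept its default name "geckodriver.exe"; the helper `find_driver()` is a hypothetical name introduced here for illustration and is not part of RSelenium or of the original post.

```r
# Hypothetical helper (not from the original post): returns the full path of
# an executable that R can find on the PATH, or NA if it cannot find one.
find_driver <- function(name = "geckodriver") {
  # Sys.which() is base R; on Windows it also tries the ".exe" extension.
  path <- unname(Sys.which(name))
  if (!is.na(path) && nzchar(path)) path else NA_character_
}

driver_path <- find_driver()

if (is.na(driver_path)) {
  message("geckodriver was not found on the PATH; check the folder you added and restart RStudio.")
} else {
  message("geckodriver found at: ", driver_path)
}
```

If the first message appears, the PATH entry is probably missing or RStudio was not restarted after the change.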
Without GeckoDriver installed, R only returns this error message: “Unable to create new service geckodriverservice.”

I installed RSelenium from the R console:

    if (!require(RSelenium)) install.packages("RSelenium")
    # or
    remotes::install_github("ropensci/RSelenium")

For the rest of the packages:

    if (!require(rvest)) install.packages("rvest")
    if (!require(dplyr)) install.packages("dplyr")
    if (!require(purrr)) install.packages("purrr")

Running Selenium Server

I tried to start Selenium as described in the official guide and in the post linked above, and it did not work.

I also had to download Selenium Server, so I used this link and, from a new PowerShell window, I ran:

    cd Downloads
    java -jar selenium-server-standalone-3.9.1.jar

From RStudio (same for an R terminal), I could control the browser from R:

    library(RSelenium)
    library(rvest)
    library(dplyr)
    library(purrr)
    rmDr