Issue
I would like to create a new column that records which keyword matched in a grep search. I have a data frame and a list of keywords, and I want to check whether each row of the data frame contains any of them. When a row contains a keyword, the new column should show which one.
My data looks like this:
id // skills
1 // this is a skill for xoobst
2 // artificial intelligence
3 // logistic regression
I used the code below to grep for the keywords:
keyword <- "xoobst|logistic|intelligence"
result <- df[grep(keyword, df$skills, ignore.case = TRUE), ]
This is the outcome I want:
id // skills // words
1 // this is a skill for xoobst // xoobst
2 // artificial intelligence // intelligence
3 // logistic regression // logistic
I tried the code below, but it gave me the full sentence rather than the keyword that matched.
keys <- sprintf(".*(%s).*", keyword)
df$words <- sub(keys, "\\1", df$skills)
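A minimal sketch of one way this attempt can return a whole sentence: `sub()` leaves a string unchanged when the pattern does not match it, so any row without a keyword keeps its full text. The second string below is a hypothetical example, not part of the original data.

```r
keyword <- "xoobst|logistic|intelligence"
keys <- sprintf(".*(%s).*", keyword)

# The second element is a hypothetical row that contains no keyword
skills <- c("this is a skill for xoobst", "no keyword here")

# Matching strings are reduced to the captured keyword;
# non-matching strings are returned unchanged
result <- sub(keys, "\\1", skills)
print(result)
```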
What alternative approach should I use? Thank you in advance!
Solution
You can use stringr:
df <- data.frame(
  id = c(1, 2, 3),
  skills = c("this is a skill for xoobst", "artificial intelligence", "logistic regression")
)
# Extract the first keyword found in each row; rows with no match get NA
df |>
  dplyr::mutate(words = stringr::str_extract(skills, "xoobst|logistic|intelligence"))
#>   id                     skills        words
#> 1  1 this is a skill for xoobst       xoobst
#> 2  2    artificial intelligence intelligence
#> 3  3        logistic regression     logistic
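If you would rather avoid extra packages, the same idea can be sketched in base R with `regexpr()` and `regmatches()`. This is only one possible approach. It keeps the `ignore.case = TRUE` setting from the original `grep()` call, and the fourth row is a hypothetical addition to show what happens when no keyword is found.

```r
df <- data.frame(
  id = c(1, 2, 3, 4),
  skills = c("this is a skill for xoobst", "artificial intelligence",
             "logistic regression", "data cleaning"),  # row 4 is hypothetical
  stringsAsFactors = FALSE
)
keyword <- "xoobst|logistic|intelligence"

# regexpr() gives the position of the first match in each string, or -1 if none
m <- regexpr(keyword, df$skills, ignore.case = TRUE)

# Start with NA everywhere, then fill in the rows that matched;
# regmatches() returns one value per matching row only
df$words <- NA_character_
df$words[m != -1] <- regmatches(df$skills, m)

print(df)
```

As with `str_extract()`, only the first keyword found in each row is kept, and rows without a keyword get `NA`.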
Answered By - Matt
Answer Checked By - Dawn Plyler (WPSolving Volunteer)