Saturday, October 29, 2022

[SOLVED] detect duplicated words within string

Issue

In the column below (part of a data frame, df) I want to extract the strings in which TRUE appears at least twice. I guess I could use strsplit and then detect duplicates, but is there a way to do it directly?

head(df$Filter)
[1] "FALSE_TRUE_FALSE_FALSE" "FALSE_TRUE_FALSE_FALSE" "FALSE_TRUE_TRUE_FALSE"  "FALSE_TRUE_FALSE_FALSE" "FALSE_TRUE_FALSE_FALSE"
[6] "FALSE_TRUE_FALSE_FALSE"

Expected output for this example:

FALSE_TRUE_TRUE_FALSE

Solution

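We can use str_count() from stringr to count the occurrences of TRUE in each string, then keep the rows where that count is greater than 1 (that is, at least two).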

library(dplyr)
library(stringr)
# keep only the rows whose Filter value contains "TRUE" more than once
df %>%
    filter(str_count(Filter, "TRUE") > 1)
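If adding stringr and dplyr is not an option, the same idea can be sketched in base R. This is only an illustration, not part of the original answer: the sample df is rebuilt from the values shown in the question, and the names n_match, n_token and result are made up for this example. It also shows the strsplit route mentioned in the question, which counts whole tokens rather than substring matches.

```r
# Sample data rebuilt from the head(df$Filter) output in the question
df <- data.frame(
  Filter = c("FALSE_TRUE_FALSE_FALSE", "FALSE_TRUE_FALSE_FALSE",
             "FALSE_TRUE_TRUE_FALSE",  "FALSE_TRUE_FALSE_FALSE",
             "FALSE_TRUE_FALSE_FALSE", "FALSE_TRUE_FALSE_FALSE"),
  stringsAsFactors = FALSE
)

# Option 1: count literal matches of "TRUE" in each string.
# regmatches() returns character(0) when there is no match, so lengths() gives 0.
n_match <- lengths(regmatches(df$Filter,
                              gregexpr("TRUE", df$Filter, fixed = TRUE)))

# Option 2: split on "_" and count the tokens that are exactly "TRUE"
n_token <- vapply(strsplit(df$Filter, "_", fixed = TRUE),
                  function(parts) sum(parts == "TRUE"),
                  integer(1))

# Keep the rows where TRUE occurs at least twice
result <- df[n_match > 1, , drop = FALSE]
print(result$Filter)
```

Both counts agree on this data. They would differ only if a token merely contained TRUE as a substring (for example a hypothetical value such as "UNTRUE"), in which case the token-based count is the stricter one.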


Answered By - akrun
Answer Checked By - Timothy Miller (WPSolving Admin)