Friday, January 25, 2019

regex - gsub() everything except specified characters?




How do I gsub() everything except a specified character in R?



In my problem I have the following string...



"the quick brown fox jumps over a lazy dog"


I have to generate a new string by removing all characters except 'r' and 'o', to get the following output...



"roooro"


Assuming all characters are lower case, how do I go about this? I tried...



gsub(!"r","",gsub(!"o","",str1))


But the '!' doesn't work.


Answer



We need to use ^ as the first character inside [ ] to negate the character class, i.e. to match every character except 'r' and 'o'. Here, [^ro]+ matches one or more consecutive characters that are not 'r' or 'o', and gsub() replaces each such run with an empty string ("").



gsub("[^ro]+", "", str1)
#[1] "roooro"
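
As a self-contained sketch, the snippet below defines str1 (taken from the question) and cross-checks the regex result against a plain character filter. The names kept, chars and kept_check are only illustrative and do not come from the original answer.

```r
str1 <- "the quick brown fox jumps over a lazy dog"

# Negated character class: delete every run of characters that is not 'r' or 'o'
kept <- gsub("[^ro]+", "", str1)

# Cross-check without a regex: split into single characters and keep 'r' and 'o'
chars <- strsplit(str1, "")[[1]]
kept_check <- paste(chars[chars %in% c("r", "o")], collapse = "")

identical(kept, kept_check)
```

Both approaches should give the same string. The regex version is shorter, while the filter version avoids any question of regex metacharacters.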





If the characters to keep are stored in a vector, we can build the pattern with paste:



v1 <- c("r", "o")
gsub(paste0("[^", paste(v1, collapse=""), "]+"), "", str1)
#[1] "roooro"
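
If the vector might contain characters that are special inside a character class (such as "-", "^", "]" or a backslash), pasting them in unescaped can change the meaning of the pattern. One possible way to guard against that is sketched below. The helper name keep_only is hypothetical, and the sketch assumes perl = TRUE so that backslash escapes inside the class follow PCRE rules.

```r
# Hypothetical helper: keep only the characters listed in `keep`, drop the rest
keep_only <- function(x, keep) {
  stopifnot(length(keep) > 0)
  # Backslash-escape the characters that are special inside a PCRE class: \ ] ^ -
  escaped <- gsub("([\\\\\\]\\^-])", "\\\\\\1", keep, perl = TRUE)
  pattern <- paste0("[^", paste(escaped, collapse = ""), "]+")
  gsub(pattern, "", x, perl = TRUE)
}

res_plain  <- keep_only("the quick brown fox jumps over a lazy dog", c("r", "o"))
res_hyphen <- keep_only("a-b-z", c("a", "-", "z"))
```

Without the escaping step, the second call would build the class [^a-z], which reads "a-z" as a range rather than as three separate characters.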
