3

I would like to remove the last 'V' from each string in a vector containing a large number of strings. The strings look similar to the following example:

str <- c("VDM 000 V2.1.1",
         "ABVC 001 V10.15.0",
         "ASDV 123 V1.20.0")

I know that the 'V' I would like to remove is always the last one in the string. I also know that this character is the sixth, seventh or eighth character from the end of the string.

I was not really able to come up with a nice solution. I know that I have to use sub or gsub, but so far I can only remove all V's rather than only the last one.

Has anyone got an idea?

Thank you!

2
  • Try sub("V([^V]*)$", "\\1", str), adapted from here. Commented Dec 5, 2016 at 21:24
  • @Richard: That would replace the last "V", but it would not restrict the match to the sixth through eighth character from the end. Commented Dec 5, 2016 at 22:02

3 Answers

3

This regex pattern is written to match a "V" that is followed by 5 to 7 non-"V" characters and then the end of the string. The "[...]" construct is a character class, and within such a construct a leading "^" causes negation. The "{...}" construct takes two numbers specifying the minimum and maximum number of repetitions, and the "$" matches the zero-length end of string, which I think is what you meant when you wrote "sixth, seventh or eighth last character":

sub("(V)([^V]{5,7})$", "\\2", str)
[1] "VDM 000 2.1.1"    "ABVC 001 10.15.0" "ASDV 123 1.20.0" 

Since you only wanted a single substitution I used sub instead of gsub.
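
As a small illustrative check of why the negated character class matters, the sketch below compares `[^V]{5,7}` with the looser `.{5,7}` on a made-up string. The input `x` and the names `strict` and `loose` are hypothetical and not taken from the question; on the three example strings in the question both patterns give the same result.

```r
# Hypothetical input: a second "V" sits inside the last 7 characters.
x <- "AB 01 VXV1.2.0"

# Negated class: the 5 to 7 trailing characters may not contain "V",
# so only the last "V" can match.
strict <- sub("V([^V]{5,7})$", "\\1", x)

# Plain dot: any 5 to 7 trailing characters are accepted,
# so the leftmost "V" inside that window matches instead.
loose <- sub("V(.{5,7})$", "\\1", x)

print(strict)  # "AB 01 VX1.2.0"
print(loose)   # "AB 01 XV1.2.0"
```

If the data can never contain two V's that close to the end, the difference does not show up, but the negated class is the safer choice when the goal is strictly the last "V".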


2

You can use:

gsub("V(\\d+\\.\\d+\\.\\d+)$", "\\1", str)
##[1] "VDM 000 2.1.1"    "ABVC 001 10.15.0" "ASDV 123 1.20.0" 

The regex V(\\d+\\.\\d+\\.\\d+)$ matches the "version" at the end of the string (i.e., $): the character "V" followed by three runs of digits (i.e., \\d+) separated by two literal "." characters. The dots are escaped as \\. because an unescaped . matches any character. The parentheses around \\d+\\.\\d+\\.\\d+ form a group within the match that can be referenced by \\1. Therefore, gsub replaces the whole match with the group, thereby removing that "V".
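
One detail worth checking in a pattern like this is whether the dots are escaped: in a regex an unescaped `.` matches any character, so only `\\.` requires a literal period. A minimal sketch on a made-up string (the input `x` and the names `unescaped` and `escaped` are hypothetical, not from the question):

```r
# Hypothetical input: "V" followed by digits only, with no periods at all.
x <- "BUILD V12345"

# Unescaped dots match any character, so "12345" is read as
# digit(s), any character, digit(s), any character, digit(s).
unescaped <- sub("V(\\d+.\\d+.\\d+)$", "\\1", x)

# Escaped dots require literal periods, so this string is left untouched.
escaped <- sub("V(\\d+\\.\\d+\\.\\d+)$", "\\1", x)

print(unescaped)  # "BUILD 12345"
print(escaped)    # "BUILD V12345"
```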


1

Since you know it's the last V you want to remove from the string, try this regex V(?=[^V]*$):

gsub("V(?=[^V]*$)", "", str, perl = TRUE)
# [1] "VDM 000 2.1.1"    "ABVC 001 10.15.0" "ASDV 123 1.20.0" 

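The regex matches a V only when it is followed by [^V]*$, that is, by nothing but non-V characters up to the end of the string. This guarantees that the matched V is the last V in the string. The lookahead (?=...) is not part of the match itself, so only the V is replaced; it requires perl = TRUE.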
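
The lookahead can also carry the position restriction raised in the comments on the question (the last "V" being the sixth to eighth character from the end). The following is only a sketch under that assumption; the helper name `remove_last_v` and the variable names `versions` and `cleaned` are made up for illustration.

```r
versions <- c("VDM 000 V2.1.1", "ABVC 001 V10.15.0", "ASDV 123 V1.20.0")

# Lookahead: "V" followed by 5 to 7 non-"V" characters and then end of string.
# perl = TRUE is needed because lookaheads are a PCRE feature.
remove_last_v <- function(x) sub("V(?=[^V]{5,7}$)", "", x, perl = TRUE)

cleaned <- remove_last_v(versions)
print(cleaned)

# A string whose last "V" is outside that window is left unchanged.
print(remove_last_v("V1.2"))
```

Using sub rather than gsub is enough here, since at most one "V" per string can satisfy the lookahead.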
