0

I'm working with regexes using the Regex.Replace method, and I'm running into some strings that my patterns don't work for.

For strings such as:

"Av.Av. Italy"
"Av.  Av. Italy"
"Av. . Av. Italy"

I'm trying to collapse the repeated Av.'s and remove all the extra periods and whitespace, so I tried this regex:

 rgx = new Regex(@"(Av\.).(Av\.)");
 address = rgx.Replace(address, replacement);

[edit] I want all of the above strings to end up just saying

"Av. Italy"

But it doesn't change anything.

I also wanted to use a regex to remove stray periods that appear in some strings (e.g. "word . other word") with

rgx= new Regex(@"\b\.\b");

But that doesn't do anything either...

Am I using the escape sequences wrong?

  • Can you give an example of what "Av. . Av. Italy" should end up as? Commented Dec 30, 2013 at 19:55
  • Oh, I forgot that, added it now; sorry Commented Dec 30, 2013 at 20:01

2 Answers

1

You can perhaps use this regex:

rgx = new Regex(@".*(?:(Av\.)\s*)+");
address = rgx.Replace(address, replacement);

regex101 demo

ideone demo

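The leading .* greedily consumes everything up to the last Av. in the string, so it swallows the duplicate Av.'s together with any stray periods and spaces between them. The group then matches that last Av. and the whitespace after it. The whole match is replaced with a single Av. (plus a space, since the regex also ate the space after the last Av.).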
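As a rough check of the explanation above, here is a minimal sketch that runs the pattern over the three strings from the question. The class and method names are made up for illustration, and the replacement string "Av. " is an assumption, since the question never shows what replacement contains.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AvCollapser
{
    // Pattern from the answer: a greedy prefix, then one or more "Av." tokens,
    // each optionally followed by whitespace.
    private static readonly Regex DuplicateAv = new Regex(@".*(?:(Av\.)\s*)+");

    // Assumption: the replacement is "Av. " (with a trailing space), because the
    // match also consumes the whitespace after the last "Av.".
    public static string Collapse(string address)
    {
        return DuplicateAv.Replace(address, "Av. ");
    }

    public static void Main()
    {
        string[] samples = { "Av.Av. Italy", "Av.  Av. Italy", "Av. . Av. Italy" };
        foreach (string sample in samples)
        {
            Console.WriteLine(Collapse(sample)); // prints "Av. Italy" for each sample
        }
    }
}
```

One thing to keep in mind: because of the leading .*, anything that comes before the last Av. is also replaced, so an input like "Calle 5 Av. Italy" would come out as "Av. Italy". If addresses can have text in front of the Av., the pattern would need adjusting.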


For the second one, maybe this?

rgx= new Regex(@" \.(?= )");

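\b matches at a word boundary, that is, between a \w and a \W character, no matter in what order they come. \w is roughly [a-zA-Z0-9_] (in .NET it also covers Unicode letters and digits unless RegexOptions.ECMAScript is set), while \W is the opposite. Since both space and . are in \W, there is no boundary on either side of a period surrounded by spaces, so you wouldn't have a match. Then, I used a space instead of \s because \s matches newlines, which I don't think is what you're looking for :)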

The lookahead is there so that only one of the two spaces is removed. Otherwise Word . Word would become WordWord.
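The points above can be checked with a short sketch. The class and method names are hypothetical, and the empty replacement string is an assumption (the answer only gives the pattern). The last line also shows what happens if the period is left unescaped, since an unescaped . matches any character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrayPeriods
{
    // Pattern from the answer: a space, a literal period, and a lookahead that
    // requires (but does not consume) a following space.
    // Assumption: the match is replaced with an empty string.
    public static string Remove(string text)
    {
        return Regex.Replace(text, @" \.(?= )", "");
    }

    public static void Main()
    {
        // No word boundary next to a period that sits between two spaces.
        Console.WriteLine(Regex.IsMatch("word . other word", @"\b\.\b")); // False
        // A period directly between two word characters does have boundaries.
        Console.WriteLine(Regex.IsMatch("a.b", @"\b\.\b"));               // True

        Console.WriteLine(Remove("word . other word")); // word other word
        Console.WriteLine(Remove("The 5 soldiers"));    // The 5 soldiers (unchanged)

        // Unescaped dot: " ." matches a space plus ANY character, digits included.
        Console.WriteLine(Regex.Replace("The 5 soldiers", @" .(?= )", "")); // The soldiers
    }
}
```

Note that a period at the very end of a string (for example "word .") is not matched by this pattern, because the lookahead requires a space after it.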

7 Comments

The first regexp works perfectly and is just what I wanted, thank you! The second one, however, didn't work. Any ideas?
@ConnorU Are you sure? Word . word has to become Word word right? I made a demo. Otherwise, let me know what you're really expecting!
Yes, you are right. word . word does become word word. I'm getting an error with random .'s at the end of strings [so word . \newline] but I should be able to handle that. Thanks
@ConnorU Oh, I guess that's something that will require I see the code to debug that part ^^;. Hopefully you'll be able to get it fixed (^.^)b Good luck!
I figured out how to fix that. One question though, the regex you gave me to take care of .'s between words also kills any numbers between words -- The 5 soldiers becomes The Soldiers. How do I avoid matching numbers?
1

For the first one, rgx = new Regex(@"(Av\.*[\s]*)*"); will work for you. For the second, you need to provide an example.

8 Comments

You can take the third string I posted as an example, with that random . in between the "Av."'s Other examples: "Edinburgh . Plaza", "Av. . France"
I tried your suggestion, this is the output I got: "AvAvItaly\n Av.\n AvAvAvAv."...
I missed a dot: rgx = new Regex(@"(Av\.*[\s.]*)*");. By the third, do you mean this: "Av. . Av. Italy"?
Lol, that regex gives me Av.v1Av8Av. AvdAveAv. AvJAvuAvlAviAvoAv. Av. Av.v1Av8Av. AvdAveAv. AvJAvuAvlAviAvoAv. Av. Av.Av8Av. AvdAveAv. AvjAvuAvlAviAvoAv. :p
@dimimpou: The outermost quantifier should be +, not *. Your pattern is matching empty strings.
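The empty-match point in the last comment can be illustrated with a small sketch. The class, constant, and method names are made up for the example, and the replacement string "Av. " is again an assumption. With * as the outermost quantifier the pattern succeeds with a zero-length match even where there is no Av. at all, so Regex.Replace inserts the replacement at those positions; with + it needs at least one real Av.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Pattern from the comment above, with * as the outermost quantifier.
    public const string StarPattern = @"(Av\.*[\s.]*)*";
    // Same pattern with the outermost quantifier changed to +, as suggested.
    public const string PlusPattern = @"(Av\.*[\s.]*)+";

    // Assumption: the replacement is "Av. " (with a trailing space).
    public static string CollapseWithPlus(string address)
    {
        return Regex.Replace(address, PlusPattern, "Av. ");
    }

    public static void Main()
    {
        // The * version "matches" text that contains no Av., with length zero.
        Match empty = Regex.Match("Italy", StarPattern);
        Console.WriteLine(empty.Success + " " + empty.Length); // True 0

        // The + version requires at least one Av., so it does not match here.
        Console.WriteLine(Regex.IsMatch("Italy", PlusPattern)); // False

        Console.WriteLine(CollapseWithPlus("Av. . Av. Italy")); // Av. Italy
    }
}
```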