I need a regex that checks what follows a string (a space, a line break, etc.).

The conditions are:

  1. Allow the special characters ., _, -, + inside the string, e.g. @hello.world, @hello_world, @helloworld, etc.
  2. Discard any trailing special characters that are not followed by an alphanumeric string, e.g. @helloworld.<space>, @helloworld-<space>, @helloworld.?, etc. must be parsed as @helloworld

My existing regex is /@([A-Za-z0-9+_.-]+)/, which works perfectly for condition #1, but still fails on condition #2.

I am using the above regex in preg_replace().
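
To make the problem concrete, here is a minimal sketch of what the original pattern captures. The helper name findMentionsOriginal and the sample sentence are made up for illustration; only the pattern comes from the question.

```php
<?php
// Hypothetical helper: returns every full match of the original pattern.
function findMentionsOriginal(string $text): array
{
    // The character class allows ., _, - and + anywhere, including at the end.
    preg_match_all('/@([A-Za-z0-9+_.-]+)/', $text, $matches);
    return $matches[0];
}

// The second mention is returned with its trailing dot still attached.
print_r(findMentionsOriginal('Hi @hello.world and @helloworld. Bye'));
```

The greedy character class has no way to tell a dot inside the name from a dot that ends the sentence, which is why condition #2 fails.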

Solution:

$str = preg_replace('#@[\w+.\-]+\b#', '[[$0]]', $str);

This works perfectly.

Tested with http://gskinner.com/RegExr/

3 Answers

You can use a word boundary (\b) to find the position between a word character (a letter, digit, or underscore) and a non-word character:

$str = preg_replace('#@[\w+.\-]+\b#', '[[$0]]', $str);

Working example: http://ideone.com/0ShCm
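
As a rough sketch of how this pattern behaves on the inputs from the question, the snippet below wraps it in a helper. The name wrapMentions and the sample text are assumptions for illustration; the pattern and the [[$0]] replacement are the ones shown above.

```php
<?php
// Hypothetical wrapper around the preg_replace() call from this answer.
function wrapMentions(string $text): string
{
    // [\w+.\-]+ is greedy; the trailing \b forces it to backtrack until the
    // match ends on a word character (letter, digit, or underscore).
    // preg_replace() returns null on error, so fall back to the input.
    return preg_replace('#@[\w+.\-]+\b#', '[[$0]]', $text) ?? $text;
}

echo wrapMentions('Hi @helloworld. and @hello.world, @helloworld-?'), "\n";
// Prints: Hi [[@helloworld]]. and [[@hello.world]], [[@helloworld]]-?
```

Because \b treats _ as a word character, a trailing underscore is kept (@helloworld_ stays whole), while a trailing ., - or + is dropped.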

Here's an idea:

  1. Use strrev to reverse the string
  2. Use strcspn to find the longest prefix of the reversed string that does not contain any alphanumeric characters
  3. Cut the prefix off with substr
  4. Reverse the string again; this is your final result

See it in action.
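
The four steps above can be sketched roughly as follows. The function name trimTrailingNonAlnum and the explicit ASCII alphanumeric list are assumptions for illustration, not the code from the linked demo.

```php
<?php
// Hypothetical helper implementing the strrev/strcspn/substr steps.
function trimTrailingNonAlnum(string $s): string
{
    // Assumption: "alphanumeric" means ASCII letters and digits only.
    $alnum = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

    $reversed = strrev($s);                 // 1. reverse the string
    $skip = strcspn($reversed, $alnum);     // 2. length of the non-alphanumeric prefix
    // 3. cut the prefix off (cast because older PHP versions may return false)
    $trimmed = (string) substr($reversed, $skip);
    return strrev($trimmed);                // 4. reverse back
}

echo trimTrailingNonAlnum('@helloworld.'), "\n"; // Prints: @helloworld
echo trimTrailingNonAlnum('@helloworld_'), "\n"; // Prints: @helloworld
```

Note that this sketch also strips a trailing underscore, since _ is not in the alphanumeric list.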

I'm not taking into account any requirement that restricts the legal characters in the string to some subset, but you can use your regular expression for that (or even strspn, which might be faster).

3 Comments

Your code works perfectly for me, but I needed an exact solution, which @kobi has provided.
@whoru: That code won't work as you describe if I get the description correctly though. Try it with the input @helloworld_. Word of advice: don't use a solution you don't completely understand.
I needed @helloworld_ to be accepted because @helloworld_ is retrieved from something like [email protected]

The reason is that the regex reads the string as a whole. If you want to strip everything after the alphanumeric section, you might have to do something like end(explode()) and check whether the last piece is valid, removing it if it isn't. But then you'd have to check the end for every possible explode point, i.e. ., -, ~, etc.

Then again, another trap you might run into is that, for any item with an alphanumeric value, it might just parse everything after the last alphanumeric character.

Sorry that this isn't much help, but I figured thinking aloud does help.
