Okay I have made some progress on a problem I am solving, but need some help with a small glitch.

I need to strip every character that precedes the first digit from filenames under the specific path images/prices/. The exception is filenames containing from_: in that case, strip every character that precedes from_ instead.

Examples:

BEFORE                                AFTER
images/prices/abcde40.gif           > images/prices/40.gif
images/prices/UgfVe5559.gif         > images/prices/5559.gif
images/prices/wedsxcdfrom_88457.gif > images/prices/from_88457.gif

What I've done:

$pattern = '%images/(.+?)/([^0-9]+?)(from_|)([0-9]+?)\.gif%';
$replace = 'images/\\1/\\3\\4.gif';
$string = "AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD";
$newstring = str_ireplace('from_','733694521548',$string);
while(preg_match($pattern,$newstring)){
    $newstring=preg_replace($pattern,$replace,$newstring);
}
$newstring=str_ireplace('733694521548','from_',$newstring);
echo "Original:\n$string\n\nNew:\n$newstring";

My expected output is:

AAA images/prices/40.gif BBB images/prices/from_88457.gif CCC images/prices/5559.gif DDD

But instead I am getting:

AAA images/prices/40.gif BBB images/from_88457.gif CCC images/5559.gif DDD

The prices/ part of the path is missing from the last two paths.

Note that the AAA, BBB etc. portions are just placeholders. In reality the paths are scattered all across a raw HTML file parsed into a string, so we cannot rely on any pattern in between occurrences of the text to be replaced.

Also, I know the method I am using of substituting from_ is hacky, but this is purely for a local file operation and not for a production server, so I am okay with it. However if there is a better way, I am all ears!

Thanks for any assistance.

4 Answers

You can use lookaround assertions:

preg_replace('~(?<=/)(?:([a-z]+)(?=\d+\.gif)|(\w+)(?=from_))~i', '', $value);

Explanation:

(?<=/)          # If preceded by a '/':
(?:             # Begin group
 ([a-z]+)       #   Match letters a-z, one or more times
 (?=\d+\.gif)   #   If followed by digit(s) and '.gif'
 |              #   OR
 (\w+)          #   Match word characters, one or more times
 (?=from_)      #   If followed by 'from_'
)               # End group

Code:

$pattern = '~(?<=/)(?:([a-z]+)(?=\d+\.gif)|(\w+)(?=from_))~i';
echo preg_replace($pattern, '', $string);
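
To limit the replacement to a single directory, the lookbehind can spell out the full path. The following is a minimal sketch, assuming the paths sit inside one long string as in the question. The images/other/ entry is a made-up control case that should be left untouched, and the capture groups are dropped because the replacement never uses them.

```php
<?php
// Lookbehind names the whole directory, so only images/prices/ files qualify.
$pattern = '~(?<=images/prices/)(?:[a-z]+(?=\d+\.gif)|\w+(?=from_))~i';

// Sample string from the question plus a hypothetical path in another directory.
$string = 'AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif '
        . 'CCC images/prices/UgfVe5559.gif DDD images/other/abcde40.gif';

// Delete the unwanted filename prefix wherever the pattern matches.
$result = preg_replace($pattern, '', $string);
echo $result, PHP_EOL;
```

The lookbehind has a fixed length, which PCRE requires, so this form works as long as the directory name is a literal string.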

7 Comments

Thanks, but I cannot work with nice and clean arrays here - the problem is such that I need the replacement to work with all matching instances inside a long string.
@RC: It doesn't matter. It should work with a plain string as well. See demo.
Yes it does! Sorry I had seen so many solutions with an array input that failed with a long string, so I didn't test it. Thanks so much! And also for the effort in mapping out the syntax - much appreciated!
Sorry, just realised something: how do I amend your pattern so that it ONLY replaces filenames within the images/prices/ path, and ignores all other similar filenames in other paths?
@RC: Simply change the lookbehind to (?<=images/prices/).
0

You can match with this regex:

^(images/prices/)\D*?(from_)?(\d+\..+)$

And use this as the replacement string:

$1$2$3

Code:

$re = '~^(images/prices/)\D*?(from_)?(\d+\..+)$~m';
$str = "images/prices/abcde40.gif\nimages/prices/UgfVe5559.gif\nimages/prices/wedsxcdfrom_88457.gif";
$result = preg_replace($re, '$1$2$3', $str);
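
The anchored pattern above expects one path per line. For paths scattered through one long string, a possible adaptation (a sketch, not the answerer's exact code) drops ^ and $ and swaps the open-ended .+ for the literal gif extension, so a match cannot run past the end of the filename. It assumes every target file ends in .gif, as in the question's examples.

```php
<?php
// No anchors and no /m flag: the pattern may match anywhere in the string.
// \.gif replaces \..+ so the third group stops at the end of the filename.
$re = '~(images/prices/)\D*?(from_)?(\d+\.gif)~';

// Single-line sample string taken from the question.
$str = 'AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD';

// $2 is empty when the optional from_ group does not take part in the match.
$result = preg_replace($re, '$1$2$3', $str);
echo $result, PHP_EOL;
```

Note that \D*? will skip over any run of non-digit characters, including spaces, so this sketch relies on each images/prices/ occurrence being directly followed by its filename.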

4 Comments

Thanks, but if I replace the line breaks with spaces, the code fails. Only the first item is replaced - the remaining two are not. I need every matching instance in a long string to be replaced. Any ideas?
Did you check my linked demo, where all the images are replaced? You might be missing the m flag.
I'm really bad with Regex! I tried running your code with the line breaks replaced with spaces and this is the result: sandbox.onlinephpfunctions.com/code/… - the last two are not replaced.
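In that case, drop the anchors and match the literal extension instead of .+, which would otherwise consume the rest of the string: $re = '~(images/prices/)\D*?(from_)?(\d+\.gif)~';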
0

You can also try lookarounds. Just replace the match with an empty string.

(?<=^images\/prices\/).*?(?=(from_)?\d+\.gif$)

Sample code (generated by regex101):

$re = "/(?<=^images\\/prices\\/).*?(?=(from_)?\\d+\\.gif$)/m";
$str = "images/prices/abcde40.gif\nimages/prices/UgfVe5559.gif\nimages/prices/wedsxcdfrom_88457.gif";
$subst = '';

$result = preg_replace($re, $subst, $str);

If the string is not multi-line, use \b (a word boundary) instead of ^ and $, which match the start and end of the line or string.

(?<=\bimages\/prices\/).*?(?=(from_)?\d+\.gif\b)
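
In PHP the pattern has to be wrapped in delimiters before it is passed to preg_replace. Without them, PHP is likely to read the leading parenthesis as a delimiter and report an "Unknown modifier" warning. A minimal sketch, assuming ~ as the delimiter so the slashes need no escaping:

```php
<?php
// Same \b-based lookarounds, wrapped in ~ delimiters.
$re = '~(?<=\bimages/prices/).*?(?=(from_)?\d+\.gif\b)~';

// Single-line sample string taken from the question.
$str = 'AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD';

// Replace the matched filename prefix with an empty string.
$result = preg_replace($re, '', $str);
echo $result, PHP_EOL;
```

Single quotes keep the backslashes in \b and \d literal, so they reach the regex engine unchanged.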

2 Comments

Thanks, but if I replace the line breaks with spaces, the code fails and only displays the last item. I need every matching instance in a long string to be replaced. Any ideas?
Thanks. I'm really bad with Regex syntax, so I may be missing something, but I set $re = "(?<=\bimages\/prices\/).*?(?=(from_)?\d+\.gif\b)"; and ran the code but got an Unknown modifier '.' error.
0
$arr = array(
    'images/prices/abcde40.gif',
    'images/prices/UgfVe5559.gif',
    'images/prices/wedsxcdfrom_88457.gif'
);

foreach($arr as $str){
    // PHP_EOL puts each result on its own line
    echo preg_replace('#images/prices/.*?((from_|\d).*)#i','images/prices/$1',$str), PHP_EOL;
}

EDIT:

$str = 'AAA images/prices/abcde40.gif BBB images/prices/wedsxcdfrom_88457.gif CCC images/prices/UgfVe5559.gif DDD';

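// The lookahead (?=\s|$) ends the filename at whitespace or end of string without consuming it
echo preg_replace('#images/prices/.*?((from_|\d).*?(?=\s|$))#i','images/prices/$1',$str), PHP_EOL;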

2 Comments

Thanks, but I cannot work with nice and clean arrays here - the problem is such that I need the replacement to work with all matching instances inside a long string.
@RC You can still use, kind of, the same regex. I updated the answer.
