0

I tried to fix bad data in postgres DB where photo tags are appended twice.

The trip is wonderful.<photo=2-1-1601981-7-1.jpg><photo=2-1-1601981-5-2.jpg>We enjoyed it very much.<photo=2-1-1601981-5-2.jpg><photo=2-1-1601981-7-1.jpg>

As you can see in the string, photo tags were added already, but they were appended to the text again. I want to remove the second occurrence: . The first occurrence has certain order and I want to keep them.

I wrote a function that could construct a regex pattern:

CREATE OR REPLACE FUNCTION dd_trip_photo_tags(tagId int) RETURNS text 
LANGUAGE sql IMMUTABLE  
AS $$
SELECT string_agg(concat('<photo=',media_name,'>.*?(<photo=',media_name,'>)'),'|')  FROM   t_ddtrip_media WHERE tag_id=tagId $$;

This captures the second occurrence of a certain photo tag. Then, I use regex_replace to replace the second occurrence: update t_ddtrip_content set content = regexp_replace(content,dd_trip_photo_tags(332761),'') from t_ddtrip_content where tag_id=332761;

Yet, it would remove all matched tags. I looked up online for days but still couldn't figure out a way to fix this. Appreciate any help.

1 Answer 1

1

This Should Work.

Regex 1:

<photo=.+?>

See: https://regex101.com/r/thHmlq/1

Regex 2:

<.+?>

See: https://regex101.com/r/thHmlq/2

Input:

The trip is wonderful.<photo=2-1-1601981-7-1.jpg><photo=2-1-1601981-5-2.jpg>We enjoyed it very much.<photo=2-1-1601981-5-2.jpg><photo=2-1-1601981-7-1.jpg>

Output:

<photo=2-1-1601981-7-1.jpg>
<photo=2-1-1601981-5-2.jpg>
<photo=2-1-1601981-5-2.jpg>
<photo=2-1-1601981-7-1.jpg>
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.