REGEX replace in T-SQL

Question

I have an article table full of incorrect descriptions, for example Ger-teschutz, because somebody replaced every ä with -.

Now I want to get Geräteschutz instead of Ger-teschutz, but the table also contains strings that have to stay as they are, for example TX-40 or WA-I30.

So I only want to replace the - in strings like Ger-teschutz, not all of them.

I want to replace them by regex where the char before the - is upper case and after lower case.

Can anybody help me?

Since you tagged it, I just wanted to confirm that the database version is SQL Server 2000. Is that right?
– Shiva, Commented Feb 14, 2014 at 18:55
Your regex description doesn't seem to match Ger-teschutz, is that a problem?
– Tim Lehner, Commented Feb 14, 2014 at 18:59
Are you absolutely sure that nothing which shouldn't be transformed matches that format?
– tenub, Commented Feb 14, 2014 at 19:00

Community · Accepted Answer · 2017-05-23 11:49:48Z

I want to replace them by regex where the char before the - is upper case and after lower case.

I'm not sure if this regex you describe will capture all of your data in the way you seem to intend in your example, but here is one possibility in SQL:

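-- patindex returns the position of the upper-case letter in front of the '-'
update MyTable
set MyColumn = left(MyColumn, patindex('%[A-Z]-[a-z]%', MyColumn collate Latin1_General_BIN))  -- text up to and including that letter
                + 'ä'  -- takes the place of the '-'
                + right(MyColumn, len(MyColumn) - 1 - patindex('%[A-Z]-[a-z]%', MyColumn collate Latin1_General_BIN))  -- text after the '-'
where MyColumn collate Latin1_General_BIN like '%[A-Z]-[a-z]%'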

GeR-teschutz -> GeRäteschutz
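
The same replacement can probably be written more compactly with STUFF, which overwrites a given number of characters at a given position. The sketch below reuses the placeholder names MyTable and MyColumn from the query above and is an untested alternative, not a drop-in. One difference worth knowing: LEN ignores trailing spaces, so the LEFT/RIGHT arithmetic can cut at the wrong place when a value ends in spaces, whereas STUFF does not depend on the length at all.

```sql
-- Sketch: same update using STUFF instead of LEFT/RIGHT.
-- patindex(...) is the position of the upper-case letter, so + 1 is the '-'.
update MyTable
set MyColumn = stuff(MyColumn,
                     patindex('%[A-Z]-[a-z]%', MyColumn collate Latin1_General_BIN) + 1,
                     1,      -- overwrite exactly one character (the '-')
                     'ä')    -- use N'ä' if MyColumn is nvarchar
where MyColumn collate Latin1_General_BIN like '%[A-Z]-[a-z]%'
```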

Note that both like and patindex can understand character sets, much like regex. I am also specifically using a ~~case-sensitive~~ binary collation with each of them, as I don't know your database.

You'll also have to run this multiple times if there are multiple matches in one value ("GeR-tescH-tz").
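
Instead of re-running the statement by hand, the repetition could be wrapped in a loop that stops once an update touches no rows. This is a sketch only: it assumes the placeholder names MyTable and MyColumn, and it uses STUFF to overwrite the single '-' character. The loop ends because every pass removes one matching '-' from each affected row and the inserted 'ä' cannot create a new match.

```sql
-- Sketch: repeat the update until no row matches the pattern any more.
declare @rows int
set @rows = 1

while @rows > 0
begin
    update MyTable
    set MyColumn = stuff(MyColumn,
                         patindex('%[A-Z]-[a-z]%', MyColumn collate Latin1_General_BIN) + 1,
                         1, 'ä')   -- position + 1 is the '-' itself
    where MyColumn collate Latin1_General_BIN like '%[A-Z]-[a-z]%'

    -- must be read immediately after the update
    set @rows = @@rowcount
end
```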

This does not check for boundary cases that may exist in your data (word endings, etc.).

UPDATE: I've updated the query to use the more common character range for the sets, and used a binary collation. If a non-binary collation is necessary, one would have to put each letter in the set. source: How does SQL Server Wildcard Character Range, eg [A-D], work with Case-sensitive Collation?
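
To illustrate the last point: with a case-sensitive but non-binary collation, a range such as [A-Z] follows the collation's sort order and can include lower-case letters, so each letter has to be listed. The query below is a preview sketch; Latin1_General_CS_AS is only an example collation, and MyTable and MyColumn are the same placeholders as above.

```sql
-- Sketch: case-sensitive, non-binary collation with every letter spelled out.
-- Latin1_General_CS_AS is an example; use whichever collation your data requires.
select MyColumn
from MyTable
where MyColumn collate Latin1_General_CS_AS
      like '%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]-[abcdefghijklmnopqrstuvwxyz]%'
```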

Mike Perrenoud · Answer · Feb 14, 2014 at 19:07

So what you say you want contradicts the values in the question a bit. You say you want the letter before the - to be UPPER and the letter after to be lower. That regex looks like this:

([A-Z]-[a-z])

Debuggex Demo

However, you'll notice in the demo that it matches only the second of these two values:

Ger-teschutz
GeR-teschutz

Either way, if what you say you want is what you want then this handles it.
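
If the real data looks like Ger-teschutz rather than GeR-teschutz, a pattern that requires a lower-case letter on both sides of the - may be closer to what is needed. That is an assumption about the data, not something the question states. The sketch below only previews which of the sample values from this thread such a pattern would touch; it changes nothing. Under a binary collation, only Ger-teschutz should be flagged, since TX-40, WA-I30 and GeR-teschutz have no lower-case letter directly before the -.

```sql
-- Sketch: preview which sample values a lower-case/lower-case pattern would match.
select v.val,
       case when v.val collate Latin1_General_BIN like '%[a-z]-[a-z]%'
            then 1 else 0 end as would_change
from (
    select 'Ger-teschutz' as val
    union all select 'GeR-teschutz'
    union all select 'TX-40'
    union all select 'WA-I30'
) v
```

Running the same CASE expression against the real column before updating would show whether any value that must stay untouched also matches.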

Now, using that regex in SQL 2000 is a bit of a trick. At this point you're going to be wishing you were in MySQL. But here is a post that does a great job of explaining how to implement the usages of regular expressions: TSQL Replace all non a-z/A-Z characters with an empty string.

NOTE: in that post the answerer leveraged a stored procedure. You could also use a function if necessary, since a function can be called inline in a query.
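
As a rough sketch of the function idea, a scalar user-defined function could apply the replacement to one value until no match is left, which also covers values with several matches. The function name dbo.FixUmlautHyphen is hypothetical, the pattern is the one from the question, and MyTable and MyColumn are placeholders; this is not the code from the linked post.

```sql
-- Sketch: hypothetical scalar function that fixes every match in one value.
create function dbo.FixUmlautHyphen (@s varchar(8000))
returns varchar(8000)
as
begin
    declare @p int
    set @p = patindex('%[A-Z]-[a-z]%', @s collate Latin1_General_BIN)

    while @p > 0
    begin
        -- @p is the upper-case letter, @p + 1 is the '-'
        set @s = stuff(@s, @p + 1, 1, 'ä')
        set @p = patindex('%[A-Z]-[a-z]%', @s collate Latin1_General_BIN)
    end

    return @s
end
go

-- Usage: a single pass over the rows that contain at least one match.
update MyTable
set MyColumn = dbo.FixUmlautHyphen(MyColumn)
where MyColumn collate Latin1_General_BIN like '%[A-Z]-[a-z]%'
```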

edited May 23, 2017 at 11:49

answered Feb 14, 2014 at 19:07

Mike Perrenoud

2 Comments

Tim Lehner Over a year ago

I think the OP wants to do this in SQL Server 2000.

Mike Perrenoud Over a year ago

@TimLehner: good point. Though I could just repeat it here, I think it's just worth linking to this post (stackoverflow.com/questions/2374594/…); it does a good job explaining how you'd accomplish that in the query itself.
