I want to read the company table and take out all possible suffixes from the name. Here's what I have so far:

declare @badStrings table (item varchar(50))

INSERT INTO @badStrings(item)
SELECT 'company' UNION ALL
SELECT 'co.' UNION ALL
SELECT 'incorporated' UNION ALL
SELECT 'inc.' UNION ALL
SELECT 'llc' UNION ALL
SELECT 'llp' UNION ALL
SELECT 'ltd'

select id, (companyname = Replace(name, item, '') FROM @badStrings)
from companies
where name != ''
  • What database engine is this? Commented Mar 29, 2012 at 5:05
  • @user990016 What error do you get when you run these queries? Commented Mar 29, 2012 at 5:18
  • This is MS SQL and I get a syntax error. Commented Mar 29, 2012 at 17:37

2 Answers

Ed Northridge's answer will work, and I have upvoted it, but in case multiple replacements are required I am adding another option using his sample data. If, for example, one of the companies were called "The PC Company LTD", his query would return two rows for it: one with "The PC LTD" and the other with "The PC Company". There are two ways to resolve this, depending on your desired outcome. The first is to replace the "bad strings" only when they occur at the end of the name.

SELECT  c.ID, RTRIM(x.Name) [Name]
FROM    @companies c
        OUTER APPLY 
        (   SELECT  REPLACE(c.name, item, '') AS [Name]
            FROM    @badStrings
                    -- WHERE CLAUSE ADDED HERE
            WHERE   CHARINDEX(item, c.Name) = 1 + LEN(c.Name) - LEN(Item)
        ) x
WHERE   c.name != '' 
AND     x.[Name] != c.Name

This would yield "The PC Company" with no duplicates.
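One caveat with the query above: REPLACE removes every occurrence of the suffix, and CHARINDEX reports only the first occurrence, so a name that contains the same suffix twice will not match the end-of-name test. As a hedged variation, here is a minimal sketch that cuts off only the trailing suffix with LEFT. It assumes a case-insensitive collation, that a suffix is always preceded by a space, and that no suffix contains LIKE wildcard characters (%, _, [); none of these assumptions comes from the original thread, and the two-row sample data is made up for illustration.

```sql
-- Sample data in the same style as the thread (rows are illustrative only)
DECLARE @companies TABLE (id int, name nvarchar(50));
INSERT INTO @companies (id, name)
SELECT 1, 'The PC Company LTD' UNION ALL
SELECT 2, 'Seven ltd';

DECLARE @badStrings TABLE (item varchar(50));
INSERT INTO @badStrings (item)
SELECT 'company' UNION ALL
SELECT 'ltd';

-- Keep everything to the left of the trailing suffix, then trim the
-- space that separated the suffix from the rest of the name.
SELECT  c.id,
        RTRIM(LEFT(c.name, LEN(c.name) - LEN(b.item))) AS [Name]
FROM    @companies c
        JOIN @badStrings b
            ON c.name LIKE '% ' + b.item   -- suffix must end the name, after a space
WHERE   c.name != '';
```

Because this uses an inner join, companies without a matching suffix are left out of the result, just as in the query above.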

The other option is to replace all occurrences of the bad strings recursively:

;WITH CTE AS
(   SELECT  c.ID, c.Name [OriginalName], RTRIM(x.Name) [Name], 1 [Level]
    FROM    @companies c
            OUTER APPLY 
            (   SELECT  REPLACE(c.name, item, '') AS [Name]
                FROM    @badStrings
                WHERE   CHARINDEX(item, c.Name) = 1 + LEN(c.Name) - LEN(Item)
            ) x
    WHERE   c.name != '' 
    AND     RTRIM(x.Name) != c.Name
    UNION ALL
    SELECT  c.ID, OriginalName, RTRIM(x.Name) [Name], Level + 1 [Level]
    FROM    CTE c
            OUTER APPLY 
            (   SELECT  REPLACE(c.name, item, '') AS [Name]
                FROM    @badStrings
                WHERE   CHARINDEX(item, c.Name) = 1 + LEN(c.Name) - LEN(Item)
            ) x
    WHERE   c.name != '' 
    AND     x.[Name] != c.Name  
)

SELECT  DISTINCT ID, Name, OriginalName
FROM    (   SELECT  *, MAX(Level) OVER(PARTITION BY ID) [MaxLevel]
            FROM    CTE
        ) c
WHERE   Level = maxLevel

This would yield "The PC" from "The PC Company".
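If a recursive CTE feels heavy, the same repeated stripping could also be written as a loop. This is only a sketch of an alternative technique, not the method used in this answer: the working table @work and the counter @rows are hypothetical names, the sample rows are made up, and it assumes a case-insensitive collation and a space before each suffix.

```sql
-- Sample data in the same style as the thread (rows are illustrative only)
DECLARE @companies TABLE (id int, name nvarchar(50));
INSERT INTO @companies (id, name)
SELECT 1, 'The PC Company LTD' UNION ALL
SELECT 2, 'Seven ltd';

DECLARE @badStrings TABLE (item varchar(50));
INSERT INTO @badStrings (item)
SELECT 'company' UNION ALL
SELECT 'ltd';

-- Hypothetical working copy so the original names stay untouched
DECLARE @work TABLE (id int, originalName nvarchar(50), name nvarchar(50));
INSERT INTO @work (id, originalName, name)
SELECT id, name, name FROM @companies WHERE name != '';

DECLARE @rows int;
SET @rows = 1;

-- Strip one trailing suffix per pass; stop when a pass changes nothing.
-- Each pass shortens the names it touches, so the loop terminates.
WHILE @rows > 0
BEGIN
    UPDATE w
    SET    name = RTRIM(LEFT(w.name, LEN(w.name) - LEN(b.item)))
    FROM   @work w
           JOIN @badStrings b
               ON w.name LIKE '% ' + b.item;

    SET @rows = @@ROWCOUNT;   -- must be read immediately after the UPDATE
END

SELECT id, name AS [Name], originalName AS [OriginalName]
FROM   @work;
```

Unlike the CTE query, this returns every non-empty company name, including those that needed no change.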


7 Comments

+1 exactly, a recursive CTE is what I immediately thought about after reading the question.
I'm getting "Incorrect syntax near 'Name'." but I don't see it.
For which query, the top or bottom one?
Bottom one, the line before UNION ALL - remove [Name]
@EdNorthridge Good Spot, I must have missed that when I added all the RTRIMs, I've edited the answer now so it should work.

The error I got running the snippet was:

Msg 102, Level 15, State 1, Line 12
Incorrect syntax near '='.

The code below isn't an ideal solution: it returns only the companies whose names were changed by the REPLACE function.

declare @companies table (id int, name nvarchar(50))
INSERT INTO @companies(id, name)
SELECT 1,'One Company' UNION ALL
SELECT 2, 'Two co.' UNION ALL
SELECT 3, 'Three incorporated' UNION ALL
SELECT 4, 'Four inc.' UNION ALL
SELECT 5, 'Five llc' UNION ALL
SELECT 6, 'Six llp' UNION ALL
SELECT 7, 'Seven ltd'

select * from @companies

declare @badStrings table (item varchar(50))

INSERT INTO @badStrings(item)
SELECT 'company' UNION ALL
SELECT 'co.' UNION ALL
SELECT 'incorporated' UNION ALL
SELECT 'inc.' UNION ALL
SELECT 'llc' UNION ALL
SELECT 'llp' UNION ALL
SELECT 'ltd'

select * from @badStrings

Here is the edited query:

select id, x.Name
from @companies c
OUTER APPLY (
    SELECT  Replace(c.name, item, '') AS [Name]
    FROM    @badStrings
) x    
where c.name != '' 
AND x.[Name] != c.Name

This returns:

id          Name
----------- --------
1           One 
2           Two 
3           Three 
4           Four 
5           Five 
6           Six 
7           Seven 

(7 row(s) affected)

Hopefully it's useful.

Edit: An alternative that applies the replacement only to company names that end with one of the @badStrings values:

select id, x.Name
from @companies c
OUTER APPLY (
    SELECT  Replace(c.name, item, '') AS [Name]
    FROM    @badStrings 
    WHERE   c.Name LIKE '%'+item
) x
where c.name != '' 

4 Comments

One problem with this code is it returns NULL if the company name is ABCD L.L.P. but if it's My Company LLP it returns My Company.
@user990016 - Correct, this code only returns entries where a replacement has been made - it's not looking for L.L.P only LLP
I'm using MS SQL 2005 Studio and have tested several versions on this solution. All of a sudden Studio crashes with an unhandled exception in SQLWB.EXE. I can't even create the temp table @badstrings and select *.
@user990016 Post the issue as a separate question, it's slightly out of scope for a short answer in a comment
