0

I have an unusual query, which got me stuck right now

Table Fields are:

id    bigint  20
name  varchar 255
desc  text

There are many records with same name and desc, but desc have some extra spaces in between words

like

1   't1'   'hello world'
2   't2'   'hello                world'

I need to find those rows that have similar data

How can I find these, thanks.

2 Answers 2

2

This is pretty close. Assuming:

+-------+---------+------+-----+---------+-------+

| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| d     | text    | YES  |     | NULL    |       |
| id    | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+

Then this query:

select x.id,x2.id,x.d,x2.d from x left join x as x2 on replace(x.d," ","") = replace(x2.d," ","") and x.id != x2.id having !(x2.id is null);

Gets you the duplicate rows. It fails if you have "Helloworld" (i.e. with no space) and you don't want that to match.

Sign up to request clarification or add additional context in comments.

Comments

0

Unless you need to preserve the original data, something like this is best done at insertion time, when you create/update the record, rather than later, at comparison time.

That being said, you could do something like

SELECT id, name, desc, REPLACE(desc, '  ', ' ') as replaced
                                      xx    x  <--note the number of spaces
FROM table
GROUP replaced
HAVING replaced > 1

Probably won't perfectly, and you'll have to adjust the replacement portion a few times, but this should get you started.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.