MySQL: remove duplicates where multiple columns match

Question

I have a table with more than 500,000 rows that contains duplicate entries. They need to be removed, but only where a few columns match.

The main table has the following columns:

id,
countryID,
postalCode,
adminName1,
adminName2,
placeName,
adminName3,
latitude,
longitude

I need to remove duplicates (leaving the first record) where placeName, latitude and longitude match.

I searched and found the following, which looks right but doesn't work for me. I have duplicated the original table structure into a new table (tblTemp):

INSERT INTO tblTemp(id,countryID,postalCode,adminName1,adminName2,placeName,adminName3,latitude,longitude)
SELECT DISTINCT placeName,latitude,longitude
FROM tblCountry_admin;

But I get the error:

Column count doesn't match value count at row 1

Of course it will. You are trying to insert 3 values into 9 columns.
– Gurwinder Singh, Commented Feb 11, 2017 at 17:16
Please specify which row to keep when multiple rows share the same placeName, latitude and longitude. Maybe the one with the max id?
– Gurwinder Singh, Commented Feb 11, 2017 at 17:20

Gurwinder Singh · Accepted Answer · 2017-02-11 17:27:13Z

1

If you want to delete the duplicate rows by placeName, latitude and longitude, leaving the oldest one (lowest id), you can first check which rows would be removed this way:

  select * from tblCountry_admin
  where id not in (
     select min(id)
     from tblCountry_admin
     group by placename, latitude, longitude
     )

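Then you can delete them this way. As far as I know, MySQL rejects a DELETE whose subquery reads directly from the table being deleted from (error 1093), so the MIN(id) subquery is wrapped in a derived table:

delete from tblCountry_admin
where id not in (
   select id from (
      select min(id) as id
      from tblCountry_admin
      group by placename, latitude, longitude
   ) as keep_rows
)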

The error you get in your INSERT ... SELECT is due to the fact that the number of columns in the INSERT list doesn't match the number of columns in the SELECT.
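To see which rows the min(id) approach keeps, here is a minimal sketch on a handful of made-up rows. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and a cut-down, hypothetical version of tblCountry_admin, so treat it as an illustration of the logic rather than a MySQL test.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL (assumption);
# the table is a cut-down, hypothetical version of tblCountry_admin.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCountry_admin ("
    " id INTEGER PRIMARY KEY, placeName TEXT, latitude REAL, longitude REAL)"
)
conn.executemany(
    "INSERT INTO tblCountry_admin VALUES (?, ?, ?, ?)",
    [
        (1, "Springfield", 39.80, -89.64),
        (2, "Springfield", 39.80, -89.64),  # duplicate of id 1
        (3, "Springfield", 37.21, -93.29),  # same name, different coordinates
        (4, "Shelbyville", 39.41, -88.79),
        (5, "Springfield", 39.80, -89.64),  # duplicate of id 1
    ],
)

# Keep the lowest id of every (placeName, latitude, longitude) group and
# delete everything else. The extra derived table is harmless in SQLite;
# as far as I know MySQL needs it, because it refuses to select directly
# from the table it is deleting from.
conn.execute(
    """
    DELETE FROM tblCountry_admin
    WHERE id NOT IN (
        SELECT id FROM (
            SELECT MIN(id) AS id
            FROM tblCountry_admin
            GROUP BY placeName, latitude, longitude
        ) AS keep_rows
    )
    """
)

remaining_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tblCountry_admin ORDER BY id")]
print(remaining_ids)  # [1, 3, 4]
```

Rows 2 and 5 are removed because they repeat row 1 on all three columns; row 3 survives because only the name matches.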

edited Feb 11, 2017 at 17:27

Gurwinder Singh

answered Feb 11, 2017 at 17:25

ScaisEdge

Gordon Linoff · Accepted Answer · 2017-02-11 17:21:36Z

0

Use WHERE:

INSERT INTO tblTemp (id, countryID, postalCode, adminName1,adminName2,     
                     placeName, adminName3, latitude, longitude)
    SELECT id, countryID, postalCode, adminName1, adminName2,
           placeName, adminName3, latitude, longitude
    FROM tblCountry_admin a
    WHERE a.id = (SELECT MIN(a2.id)
                  FROM tblCountry_admin a2
                  WHERE a2.placeName = a.placeName AND
                        a2.latitude = a.latitude AND
                        a2.longitude = a.longitude
                 );
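A small sketch of the same copy-into-tblTemp idea, run against an in-memory SQLite database from Python as a stand-in for MySQL. The reduced column set, the sample rows and the index name idx_place_lat_lon are assumptions made for the illustration; the index is only a guess at what would help the correlated subquery on a large table.

```python
import sqlite3

# SQLite stand-in for MySQL (assumption), with a reduced, hypothetical schema.
conn = sqlite3.connect(":memory:")
cols = "id INTEGER PRIMARY KEY, placeName TEXT, latitude REAL, longitude REAL"
conn.execute(f"CREATE TABLE tblCountry_admin ({cols})")
conn.execute(f"CREATE TABLE tblTemp ({cols})")
conn.executemany(
    "INSERT INTO tblCountry_admin VALUES (?, ?, ?, ?)",
    [
        (1, "Springfield", 39.80, -89.64),
        (2, "Springfield", 39.80, -89.64),  # duplicate of id 1
        (3, "Springfield", 37.21, -93.29),
        (4, "Shelbyville", 39.41, -88.79),
        (5, "Springfield", 39.80, -89.64),  # duplicate of id 1
    ],
)

# Hypothetical index covering the three columns the subquery compares.
conn.execute(
    "CREATE INDEX idx_place_lat_lon"
    " ON tblCountry_admin (placeName, latitude, longitude)"
)

# Copy only the row with the lowest id from each group of duplicates.
# The SELECT list has exactly as many columns as the INSERT list.
conn.execute(
    """
    INSERT INTO tblTemp (id, placeName, latitude, longitude)
    SELECT id, placeName, latitude, longitude
    FROM tblCountry_admin a
    WHERE a.id = (SELECT MIN(a2.id)
                  FROM tblCountry_admin a2
                  WHERE a2.placeName = a.placeName
                    AND a2.latitude = a.latitude
                    AND a2.longitude = a.longitude)
    """
)

copied_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tblTemp ORDER BY id")]
source_count = conn.execute(
    "SELECT COUNT(*) FROM tblCountry_admin").fetchone()[0]
print(copied_ids, source_count)  # [1, 3, 4] 5
```

Unlike a DELETE, this leaves the original table untouched, so the result in tblTemp can be checked before anything is dropped or renamed.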

answered Feb 11, 2017 at 17:21

Gordon Linoff

Gurwinder Singh · Accepted Answer · 2017-02-11 17:24:33Z

0

Assuming tblTemp has the same set of columns as tblCountry_admin and you want to keep the row with the max id in case of duplicates, you can use this (to keep the row with the lowest id instead, change a.id < b.id to a.id > b.id):

INSERT INTO tblTemp
select a.*
from tblCountry_admin a left join tblCountry_admin b on a.placeName = b.placeName
    and a.latitude = b.latitude
    and a.longitude = b.longitude
    and a.id < b.id
where b.id is null;

If you want to create the table from the select, use:

create table tblTemp as
select a.*
from tblCountry_admin a left join tblCountry_admin b on a.placeName = b.placeName
    and a.latitude = b.latitude
    and a.longitude = b.longitude
    and a.id < b.id
where b.id is null;
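The direction of the id comparison decides which row of each duplicate group survives the anti-join. A minimal sketch, again using an in-memory SQLite database from Python as a stand-in for MySQL with made-up rows and a reduced, hypothetical schema, runs the join both ways:

```python
import sqlite3

# SQLite stand-in for MySQL (assumption), with a reduced, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCountry_admin ("
    " id INTEGER PRIMARY KEY, placeName TEXT, latitude REAL, longitude REAL)"
)
conn.executemany(
    "INSERT INTO tblCountry_admin VALUES (?, ?, ?, ?)",
    [
        (1, "Springfield", 39.80, -89.64),
        (2, "Springfield", 39.80, -89.64),  # duplicate of id 1
        (3, "Springfield", 37.21, -93.29),
        (4, "Shelbyville", 39.41, -88.79),
        (5, "Springfield", 39.80, -89.64),  # duplicate of id 1
    ],
)

# Anti-join: a row is kept when no matching row b satisfies the id comparison.
# {op} is filled in below with "<" or ">".
query = """
SELECT a.id
FROM tblCountry_admin a
LEFT JOIN tblCountry_admin b
  ON a.placeName = b.placeName
 AND a.latitude = b.latitude
 AND a.longitude = b.longitude
 AND a.id {op} b.id
WHERE b.id IS NULL
ORDER BY a.id
"""

# a.id < b.id: kept rows have no duplicate with a higher id (max id wins).
keeps_newest = [r[0] for r in conn.execute(query.format(op="<"))]
# a.id > b.id: kept rows have no duplicate with a lower id (min id wins).
keeps_oldest = [r[0] for r in conn.execute(query.format(op=">"))]

print(keeps_newest)  # [3, 4, 5]
print(keeps_oldest)  # [1, 3, 4]
```

Since the question asks to leave the first record, the second variant (a.id > b.id) is the one that matches it.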

answered Feb 11, 2017 at 17:24

Gurwinder Singh

2 Comments

lifeson Over a year ago

This option is taking a very long time to run, and if I interrupt it, no entries have been added to tblTemp.

Gurwinder Singh Over a year ago

Working on half a million records is going to take time. It will, however, be faster than a GROUP BY operation, provided proper indexes are in place on your original table.
