Currently browsing 'SQL Programming'
SQL finding duplicate records.
When adding unique indexs to tables to prevent data duplication you discover there are already duplicates in the table you wish to fix/improve.
Here’s a quick way to list records with duplicates.
select count( id ) as cnt, id from table
group by id
having cnt > 1
You still have to go through and figure out what data is valid and such but it definitely saves time to run that query first. In MySQL you can run the following query to view all of the records in a table that are duplicate (make’s it easier to know which ones to delete then).
select * from table where id in (
select id from (
select count( id ) as cnt, id from table group by id
having cnt > 1 ) as table_alias )
order by id
This will list of the rows that have duplicates in order so you can see which records to delete in your favorite mysql client.
If you want something simpler and are not necessarily interested in what record might have better data.
Follow these instructions here on creating a new table to get rid of duplicate rows
Finding and Removing duplicate data from SQL Tables
Here’s an decent article with various methods of finding duplicate data in databases with SQL.
http://www.delphifaq.com/faq/delphi/database/f20.shtml
Removing duplicate data can be a troublesome task as well. One of the fastest ways to do so involves creating a new table with a unique index to the column you wish to purge of duplicate data.
Then you copy data over form the original table to the new using a insert ignore into table ( columns ) select columns from table type of query. Don’t forget the “ignore” otherwise the batch insert will fail at the first case of a duplicate piece of data.