By Michael Marr
Article Date: 2010-07-08
Repetitive information goes against good database design, which is reason enough to find and remove duplicate values. However, when you use a UNIQUE key or another unique index, duplicates do more than clutter the table: they cause statements to fail, for example when you try to add the key to a column that already holds duplicate values.
Step 1: Determine if duplicates exist.
mysql> SELECT value, COUNT(*) AS num
FROM t1
GROUP BY value
HAVING COUNT(*) > 1;
Replace the column name `value` with the column that contains duplicate data. If duplicates exist, the query above should return something like this:
+----------------+-----+
| value          | num |
+----------------+-----+
| Pirate         |   6 |
| Ninja          |   5 |
| Jedi           |   2 |
+----------------+-----+
3 rows in set (0.00 sec)
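To inspect the individual rows behind each count, the grouped query can be joined back to the table. This is a minimal sketch, assuming `t1` has an `id` column alongside `value` (as the Step 2 output suggests); the alias `d` is only illustrative.

```sql
-- List every row whose value occurs more than once.
-- Assumption: t1 has columns id and value; d is a hypothetical alias.
SELECT t1.id, t1.value
FROM t1
JOIN (
    SELECT value
    FROM t1
    GROUP BY value
    HAVING COUNT(*) > 1
) AS d ON d.value = t1.value
ORDER BY t1.value, t1.id;
```

Reviewing these rows before deleting anything helps confirm that the duplicates really are redundant and do not differ in other columns.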
Step 2: Grab unique rows. Now that we know we have duplicate values, we can execute the following to find and store a single row for each unique value in our table:
mysql> CREATE TEMPORARY TABLE t2 SELECT * FROM t1 GROUP BY value;
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM t2;
+----+------------------+
| id | value            |
+----+------------------+
|  9 | Pirate           |
| 15 | Ninja            |
|  1 | Jedi             |
| 10 | Plumber          |
|  7 | Princess         |
| 29 | Spy              |
+----+------------------+
6 rows in set (0.00 sec)
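One caveat: on MySQL 5.7.5 and later, the default SQL mode includes ONLY_FULL_GROUP_BY, and `SELECT * ... GROUP BY value` is then rejected because `id` is neither grouped nor aggregated. A possible variant is sketched below, assuming `t1` contains only the `id` and `value` columns and that keeping the lowest `id` for each value is acceptable (the original statement does not specify which row survives).

```sql
-- Keep one row per distinct value, choosing the smallest id.
-- Assumption: t1 has only the columns id and value.
CREATE TEMPORARY TABLE t2
SELECT MIN(id) AS id, value
FROM t1
GROUP BY value;
```

If the table has more columns, each of them would need its own aggregate or a different selection strategy, so treat this as a starting point rather than a drop-in replacement.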
Step 3: Re-create the table from the unique rows. After clearing out the existing table, you can re-insert the unique rows with the following commands:
mysql> TRUNCATE t1;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO t1 SELECT * FROM t2;
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM t1;
+----+------------------+
| id | value            |
+----+------------------+
|  9 | Pirate           |
| 15 | Ninja            |
|  1 | Jedi             |
| 10 | Plumber          |
|  7 | Princess         |
| 29 | Spy              |
+----+------------------+
6 rows in set (0.00 sec)
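With the duplicates gone, it may be worth confirming the result and then adding the kind of UNIQUE key mentioned in the introduction, so that new duplicates are rejected at insert time. The following is a sketch only; the index name `uq_value` is a hypothetical choice, not something from the steps above.

```sql
-- Re-run the Step 1 check; an empty result means no duplicates remain.
SELECT value, COUNT(*) AS num
FROM t1
GROUP BY value
HAVING COUNT(*) > 1;

-- Prevent duplicates from reappearing.
-- Assumption: uq_value is an illustrative index name.
ALTER TABLE t1 ADD UNIQUE KEY uq_value (value);
```

If the ALTER TABLE statement is run while duplicates are still present, MySQL refuses it with a duplicate-entry error, which is the failure the introduction describes.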