How to Find and Remove Duplicates in Excel
“I’ve never been a natural, all I do is try, try, try.”
Those Taylor Swift lyrics in the song “Mirrorball” perfectly explain my relationship with numbers, math, and anything to do with data analysis.
As a marketer, however, data analysis is one of the most important aspects of my job. But like most marketers who prefer strategy and creativity, I don’t take naturally to numbers and Excel reports.
That is why it is important to learn how to work in Excel and to find shortcuts that make the process easier.
Today we’re going to dive into one of these processes – how to find and remove duplicates in Excel.
How to remove duplicates in Excel
- Find and mark duplicates in Excel using conditional formatting.
- Count duplicates in Excel.
- Remove duplicates using the Remove Duplicates feature.
1. Find and mark duplicates in Excel using conditional formatting.
The first step in removing duplicates is to find them. One easy way to do this is through conditional formatting.
You can do this by following these steps:
- Make sure you are on the Home tab.
- Select the entire sheet by clicking the Select All button in the top-left corner.
- Click Conditional Formatting → Highlight Cells Rules → Duplicate Values.
- In the box labeled “Format with”, choose how duplicates are highlighted. You can fill the cell, bold the text, change the text color, and so on.
And voila. Your duplicates are now marked. It should look something like this:
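If you ever need the same check outside Excel, here is a minimal Python sketch of the idea behind the Duplicate Values rule: flag every value that appears more than once. The function name `flag_duplicates` and the sample titles are made up for illustration, and the comparison is an exact text match, which may not mirror how Excel compares cells.

```python
from collections import Counter

def flag_duplicates(values):
    """Return one flag per value: True if that value appears more than once."""
    counts = Counter(values)
    return [counts[v] > 1 for v in values]

# Hypothetical blog titles, invented for this example
titles = ["Email Tips", "SEO Basics", "Email Tips", "Brand Voice"]

# Print a "*" next to every title that has a duplicate
for title, is_dup in zip(titles, flag_duplicates(titles)):
    print("*" if is_dup else " ", title)
```

Both copies of a repeated title get flagged, just as conditional formatting highlights every matching cell rather than only the second one.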
2. Count duplicates in Excel.
After you find your duplicates, you may want to count them, especially if you have a large data set.
You can do this with the formula =COUNTIF(A:A, A2). It tells Excel to count how many times a given value appears in a given range.
Here, A:A is the range you are searching, in this case all of column A. It will likely be a different range in your spreadsheet. A2 is the cell containing the value whose frequency you want to count.
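To make the logic of the formula concrete, here is a rough Python sketch of what COUNTIF does for each row. The `countif` helper and the sample titles are hypothetical. It ignores upper and lower case, as COUNTIF does, but it does not handle COUNTIF's wildcards or comparison operators.

```python
def countif(column, value):
    """Count the cells in `column` that match `value`, ignoring case."""
    target = str(value).casefold()
    return sum(1 for cell in column if str(cell).casefold() == target)

# Hypothetical column A: blog titles invented for this example
titles = ["Email Tips", "SEO Basics", "Email Tips", "Brand Voice"]

# One count per row, like filling =COUNTIF(A:A, A2) down the Occurrences column
occurrences = [countif(titles, title) for title in titles]
print(occurrences)  # [2, 1, 2, 1]
```

Any row with a count above 1 has at least one duplicate somewhere else in the column.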
To do this, proceed as follows:
- Create a new sheet in your Excel document.
I’ve found that the easiest way to count duplicates in Excel is to create a new sheet in your Excel workbook.
Then copy and paste the column you want to count duplicates in. In the following example I copied and pasted the blog titles from the editorial calendar to see if there were duplicate titles.
Then create another column for “Occurrences”. This is where we use the formula. Your new sheet should look something like this:
- Paste the formula.
Now enter the formula in the first cell under Occurrences, either by typing it or by copying and pasting it. Then highlight A:A in the formula (this is the part you replace with your own data set) and click the sheet in your workbook that contains the data. There, click the top-left corner to select the entire sheet, or highlight just the columns or rows that hold your data.
For the second value, go back to your new sheet, highlight A2 in the formula, and select the cell to the left of it. In most cases this will stay A2, A3, A4, and so on as you move down the column.
See how that looks in action here:
3. Remove duplicates using the Remove Duplicates feature.
Now is the time to remove the duplicates from your dataset.
Before doing this, I recommend copying your data set to another sheet or a separate workbook. Always keep your original data intact, even though Excel lets you filter and remove data in place. You don’t want to lose anything to a stray click.
Now that you’ve made a copy of your data, it’s time to remove the duplicates.
To remove duplicates, do the following:
- Select the worksheet with duplicate values that you want to remove. Click Data → Table Tools → Remove Duplicates.
- Select the columns where you want to remove the duplicates.
In this case, I only want to remove duplicate blog titles, so I selected column D. I also checked “My list has headers” because there are two rows of headings before the data starts on this sheet.
Remember, Excel removes the entire row that contains the duplicate value.
- Check your data.
Excel now tells you how many duplicate values were found and removed and how many unique values remain.
You can now check your data. If you compare my original data set with this one, you can see that every repeated row with the same blog title has been deleted.
This is what the sheet looked like before:
And this is how it looks now:
When removing duplicates in Excel, be deliberate about which column you check for duplicates, and remember that Excel only removes duplicates within the selected range. You can select the entire worksheet or just the rows that contain data.
Excel automatically keeps the first occurrence of the value.
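The keep-the-first-occurrence behavior can be sketched in a few lines of Python. This is only an illustration of the rule described above, not how Excel implements it. The `remove_duplicates` helper, the `title` and `author` fields, and the sample rows are all hypothetical, and the match is exact text, which may differ from Excel's own comparison.

```python
def remove_duplicates(rows, key_column):
    """Keep the first row for each value in `key_column`; drop later rows entirely."""
    seen = set()
    kept = []
    for row in rows:
        key = row[key_column]
        if key not in seen:
            seen.add(key)     # first time we see this value: remember it
            kept.append(row)  # and keep the whole row
    return kept

# Hypothetical editorial calendar rows, invented for this example
rows = [
    {"title": "Email Tips", "author": "Ana"},
    {"title": "SEO Basics", "author": "Ben"},
    {"title": "Email Tips", "author": "Cy"},
]

unique_rows = remove_duplicates(rows, "title")
removed = len(rows) - len(unique_rows)
print(f"{removed} duplicate row(s) removed; {len(unique_rows)} unique row(s) remain")
```

Note that the third row disappears completely, author included, because only the title column is checked. That is the same reason to work on a copy of your data.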
Working on marketing reports or a marketing Excel spreadsheet can leave you frustrated and banging your head against the wall (or is that just me?). That is why using Excel templates and following simple instructions like these can help you up your game.