Fuzzy Search and Replace in Excel
You can find similar entries from a list or table in Excel by doing a fuzzy search in Excel. This gives you a way to consider the following to effectively be the same.
John Q. Smith
So if you just want to look for “John Smith” and simply find those entries that are pretty close to that. You want to do a fuzzy text search (not just a wildcard search at the beginning or end of a string). This post describes how to use the Fuzzy Find and Replace feature of the Excel PowerUps add-in for Excel to find those approximate matches. Fuzzy text search in Excel is here.
Sample Use Cases
There are many ways you can use this capability to help you get more out of Excel or improve your data cleansing and analysis capabilities. Here, I’ll describe two common scenarios. The first will be simply finding cells with values similar to your search string. The second will be in data transformation and normalization.
Just Search for Cells Containing Similar Values
If you have a large list of data and you need to find all occurrences of a particular value you can use the basic Fuzzy Search part of the Fuzzy Search and Replace. You might be looking for a name, or perhaps an address that may have been entered in multiple ways.
You may have to deal with different abbreviations, characters being transposed, names being misspelled, etc. You need to be able to find these and know how consistent the data you have is so you can decide how best to handle it. In this scenario let’s just say we want to find all records that may match ‘1234 Columbia Blvd’.
The first thing to do is select the range of data that contains the data you want to search. You can see that I’ve done that in the image above.
Next, select Fuzzy Find and Replace from the PowerUp Tools menu. You’ll find that on the PowerUps tab across the top of Excel.
(if you don’t have the PowerUps tab you need to install the Excel PowerUps add-in)
In the Fuzzy Find and Replace tool, type in the term or phrase that you’re looking for and click the Find button. You can control how exacting the match is by moving the Fuzziness Scale slider back and forth.
In the next scenario, we’ll “fix” the data that has been provided so that you can create a useful report from it.
Data Transformation or Normalization
If you have data from a number of sources, or let’s say your data comes in hand-entered from multiple people you might wind up with variations in the way the same value is represented. For the purpose of creating reports it would be best if you can consolidate all of the similar values into a single consistent value that everybody recognizes, avoids duplication of information, and also avoids distribution of the information across similar values that are really the same thing.
More formally, this “cleaning” is taken care of in the “T” of your ETL processes. Your goal is to get the data into a single canonical form from which all downstream reporting and analysis can be managed.
Using the Fuzzy Find and Replace feature this is made super easy.
Sample Data Cleanup Task
Let’s say you want to clean up the data in the example above so that all address entries similar to 1234 Columbia Blvd are entered as a single consistent string value. In the Fuzzy Find and Replace dialog box where you’ve done your search you can adjust the Fuzziness Scale slider to see the matches that are being returned until you are satisfied. As you slide back and forth, the matches listed should update in nearly real time so this step is really easy.
Once you’re satisfied with the list of approximate matches returned, you can use one of the buttons to the right of the column of matches.
If you click the Replace Cell Contents in Place button each of the occurrences of the matches will be replaced in the current data set you were searching.
If you click the Replace Contents and Export button each of the occurrences of the matches will also be replaced, but the updated range of data will be inserted in a new worksheet automatically added to your workbook.
Finally, if you click the Export Matches to New Sheet button, only the matches will be exported to a new worksheet added to your workbook.
See it in Action
You can watch the short video below to see the Fuzzy Find and Replace in action.
Fuzzy Matching in a Formula
If you need a fuzzy matching capability within your worksheet formulas, you should look at the pwrSIMILARITY function that is also part of the Excel PowerUps. I have a post here that outlines its usage.
Fuzzy Matching with VLOOKUP
If you just love VLOOKUP, but wish you could do the same thing with a fuzzy match instead of exact you should look at the pwrVLOOKUP function that is part of the Excel PowerUps. I have a post here<link> that outlines its usage.
INDEX/MATCH, but Fuzzy
If you prefer to use the INDEX/MATCH duo for your lookup needs, but just wish you could use a fuzzy match instead you should look at the pwrMATCH function that is part of the Excel PowerUps. You can use it in the same manner as the MATCH function.
There you go.