This is part 46 of the series of article on SSIS
In this article we are going to see on how to use the Fuzzy Lookup transformation in SSIS. Fuzzy lookup transformation uses an equi join to do a check for the matching records across the tables. Fuzzy lookup can be used in place where we have a large number of corrupted data and we need to consider doing a cleanup and processing the data to be available across the systems.[more]
Take an example when we need to write a package which fetches the details from the customer table and process the data to some systems, in that case if there is some mismatch in the name then also we need to process the data at that situation we can have this fuzzy lookup which takes the matchup as per the threshold and process the missing records so that the accuracy comes into picture. Let’s jump start on how to use this task in real time and see the steps to do the configurations.
You can look into my series of article on SSIS at the url – http://f5debug.net/tutorial/ssis.php
Follow steps 1 to 3 on my first article to open the BIDS project and select the right project to work on integration services project. Once the project is created, we will see on how to use the Fuzzy Lookup control. Once you open the project just drag and drop the Fuzzy Lookup control and a source provider as shown in the below image.
There are some Red Cross icons on the tasks which indicate that the controls are not configured yet. Now let’s start to configure the controls in the coming sections. First configure the Source provider as shown in the below task.
Now the Source provider is configured, which mean we have the data to process in our package, here we need to see the corrupted data that is like any data repeated and anything against the policy for the business. Now let’s configure the Fuzzy Lookup as shown in the below screen
Configure for each tabs as shown below
Here we have an option to create a new index or use an existing index, normally Fuzzy lookup creates an index to do the check for the sorting and do the transformation for checking the duplication of values accordingly. If we have an existing index on the table then we have option to use the same instead of creating a new one to maintain the performance of the table.
The above image shows on which column we should map and which column holds the responsibility of doing the column check.
The above screen shows the advanced setting on to use the fuzzy lookup transformation like providing the threshold and giving the exact match for the fuzzy transformation
After finishing the configuration your screen looks like below image
Executing the package (Press F5) will execute the package and your screen looks like below. This indicates that the package is executed perfectly.
So in this article we have seen on how to use the Fuzzy Lookup transformation task and the key configurations used in order to use this task handy.