Let us discuss about “How to define Type 1 SCD in SSIS using Slowly Changing Dimension Transformation” in this post. As most of us know that there are many types of SCDS available, here in this post we will cover only SCD Type 1. Before jumping into the demonstration, first let us know what this SCD Type 1 says –
In Type 1 SCD we will overwrite the existing data with the new data. Lets take a small example here. Lets say I have data like given below in my existing table.
Let’s say the employee is not moved to USA from India. Now, the data that is present in the database table will be directly replaced with the new data and we will have the following data in table.
The major disadvantage of using this method is HISTORY will be lost. If we implement this methodology then we can’t track the history of a particular employee. So, this is not at all suggestible in the case where HISTORY needs to be maintained.
This is the easiest way to implement of all th SCD types available. So, Type 1 slowly changing dimension should be used when it is NOT NECESSARY for the data warehouse to keep track of historical changes. I hope you got some useful info regarding SCD type 1 and now let’s jump into SCD Transformation.
I have created two tables one Test_Stage(Which I am using as Staging table) and another one is Test(Which I am using as Main table) with the following syntax.
Create table Test_Stage
Create table Test
Now I inserted some records into my staging table i.e Test_Stage and PFB the scripts used –
Insert into test_stage Values(1,’Roopesh’,’India’);
Insert into test_stage Values(2,’Lokesh’,’Pakistan’);
Insert into test_stage Values(3,’Vinay’,’USA’);
Insert into test_stage Values(4,’Rakesh’,’China’);
Insert into test_stage Values(5,’Venki’,’Japan’);
Now Source data is ready and PFB the steps you have to follow to use Slowly Changing Dimension Transformation.
- Open SSIS Package and drag a dataFlow Task from toolbox to control Flow Pane as shown below.
- Either double click or Right click on Data Flow Task and select EDIT as shown below.
- One the Data Flow pane is opened then drag and drop OLEDB SOURCE from “Data Flow Sources” as shown below.
- Now select EDIT by Right clicking on OLE DB Source and provide the Source details. remember that you have to provide Staging table connection details here.
- Select columns tab and check whether columns are coming correctly or not.
- Now drag and drop Slowly Changing Dimension Transformation in to the Data Flow Pane and provide the connection between OLE DB Source and SCD Transformation.
- Now right click on the SCD Transformation and click on EDIT menu.
- Select the dimension table(in our case it is “Test”) and Keys under “Select Dimension Table and Keys” page as shown below.
- In the above screenshot, the Business Key is the column based on which we will filter the Newly added records from the existing one. In simple this is the column which we use to look up.
- After selecting Next, Under “Slowly Changing Dimension Columns” select the columns that MAY change and select Change Type as “Changing Attribute” as we are dealing with Type 1 SCD here. Once the selection is done the click on NEXT.
- Check the option “Change all the matching records including the outdated records…….” if you wish to have that functionality and click Next.
- Click Next and finally click on Finish by leaving the default options as is.
- Once you click on Finish, SSIS server creates all the functionality that is required to implement SCD Type 1 based on the information provided by us. It will add Data flow tasks for Inserting new records and updating existing records as shown below.
- Now if I run the package then it will check for newly added records and inserts those records and updates the already existing records if there is any change in data. As there is NO RECORDS in TEST table in this run it will load all the records under New Insert as shown below.
- PFB the Screenshot of the data tables output before running the package and After running it. You can see data got loaded in to our table “TEST”.
- Now I updated the existing records in staging table using the following queries and once I run the package again we can see data getting updated into our main table.
update test_stage set Country = ‘USA’ where Emp_Id = 1
- Once I run the package you can see the package showing “1 rows” under “Changing Attributes Updates Output” section as shown below.
- Now if you run the queries you can see the data updated in TEST table.
That’s it guys .. Now we implemented SCD Type 1 using Slowly Changing Dimension Transformation. This is as simple as this. Hopw you understood the concept. Happy Coding !!
Roopesh Babu V