Aggregate Transformation in SSIS 2008R2 with Example



The Aggregate transformation is used to perform aggregate operations/functions on groups in a dataset. The aggregate functions available are- Count, Count Distinct, Sum, Average, Minimum and Maximum. The Aggregate transformation has one input and one or more outputs. It does not support an error output. The Aggregate transformation supports the following operations.

Group By: Divides datasets into groups. Columns of any data type can be used for grouping.
Sum: Sums the values in a column. Only columns with numeric data types can be summed.
Average: Returns the average of the column values in a column. Only columns with numeric data types can be averaged.
Count: Returns the number of items in a group.
Count distinct: Returns the number of unique non null values in a group.
Minimum: Returns the minimum value in a group. This operation can be used only with numeric, date, and time data types.
Maximum: Returns the maximum value in a group. This operation can be used only with numeric, date, and time data types.

The Aggregate transformation handles null values in the same way as the SQL Server relational database engine.

  • In a GROUP BY clause, nulls are treated like other column values. If the grouping column contains more than one null value, the null values are put into a single group.
  • In the COUNT (column name) and COUNT (DISTINCT column name) functions, nulls are ignored and the result excludes rows that contain null values in the named column.
  • In the COUNT (*) function, all rows are counted, including rows with null values.

Now let me demonstrate how you can create an SSIS package with Aggregate   transformation

Go to START -> Microsoft SQL Server 2008 -> SQL Server Business Intelligence Development Studio to launch BIDS.

Then go to File menu -> New -> Project -> Select “Business Intelligence Projects” in the left tree pane -> Select “Integration Services Projects” and name the project as you wish and click OK.

Here in this example we want to get the sum of the sales amount  for each Color and English product name  based on the Dimproduct and FactInternetsales tables  data from AdventureworkDW Database. We want to perform database equivalent of SUM(SALESAMOUNT) GROUP BY Color and EnglishproductName  operation.

Here, we have Dimproduct and FactInternetsales  tables are OLEDB Source.

Now Drag and Drop Aggregate Transformation As Show below.

Double-click the Aggregate transform to open the editor. Next in the lower pane we select   the Input Column, set Output Alias to columns, select the Operation i.e. Group By clause or any aggregate functions as below:

Note: The operation will appear in drop down depending upon the data type of field. If your input column is string type, you will not see Sum, Count and other aggregate   functions but only Group by. Always make sure to convert data type of column to proper data type before providing as input to Aggregate Transformation.

Instead of using Table as destination, I used Sort Transformation for testing purpose and used Data viewer to show output data produced by Aggregate Transformation. After testing our package successfully we can delete Sort Transformation and Data Viewer and bring our destination and connect to aggregate transformation for input columns for our destination.

Now Execute the package to see the Results. This is the aggregated output of the data we provided as input.

This is it guys .. Give a try !!

Thanks Aparna for the document.

Roopesh Babu V