3

I'm loading a pipe delimited flat file into a staging table. During the load process an SSIS script component performs some operations on a row. It may set a flag in one field based on values in another field, add a prefix to certain columns, or apply formatting. For example, if a date is missing, the field is assigned to a default date. (if Row.EndDate_isNull then Row.EndDate = defaultDate)

These scripts become cumbersome when the same transformation needs to be applied to a series of rows. For example, a medical record file can describe each of 9 diagnoses with three fields: Diagnosis01_Date, Diagnosis01_Code, Diagnosis01_System....Diagnosis09_Date, Diagnosis09_Code, Diagnosis09_System.

I want to use a loop to perform the operations over each of the 9 groups of 3 fields, instead of writing the same operation 9 times.

If I were dealing with a collection in VB, for example, I'd write something like this in in the Input0_ProcessInputRow sub:

For i = 1 to 9
   row.("Diagnosis0"+ i + "_Date").Value = diagnosisDate
   row.("Diagnosis0"+ i + "_System").value = "ICD10"
next i 

In the SSIS object model, however, the columns are exposed as properties of Input0Buffer and I can't find a way to use a variable to refer to them. So, how can I create a loop that operates on columns by name in an SSIS script component?

Edit: I found the following sources, especially the first two, to be helpful while I was doing my research on this problem. It seems like there should be a solution using system.reflection, but I just don't know .NET well enough to figure it out.

http://agilebi.com/jwelch/2007/10/21/address-columns-generically-in-a-script-task/

http://agilebi.com/jwelch/2007/06/02/xml-destination-script-component/

http://microsoft-ssis.blogspot.com/2010/12/do-something-for-all-columns-in-your.html

http://toddmcdermid.blogspot.com/2011/05/iterating-over-columns-in-ssis-script.html

http://bidn.com/blogs/MikeDavis/ssis/1800/ssis-for-each-column-in-a-data-flow

https://social.msdn.microsoft.com/Forums/en-US/edbac1df-f05f-40db-820a-e009fae201a4/using-script-destination-object-to-create-and-write-to-new-text-file?forum=sqlintegrationservices&forum=sqlintegrationservices

https://social.msdn.microsoft.com/Forums/en-US/757d11c8-8ad4-4021-a959-1d13c8dfdaa7/how-to-run-a-loop-for-all-columns-in-script-component-input-column-collection-for-each-row?forum=sqlintegrationservices

How can I get the column Value in Script Component in SSIS?

1 Answer 1

3
+50

Simple Workaround

You can store columns name in a List(of string) using loops , and use Row.GetType().GetProperties() to manipulate columns dynamically.

Example:

Note: You have to import System.Reflection , System.Linq and System.Collections.Generic libraries

Dim lstDateColumns as new List(of string)
Dim lstSystemColumns as new List(of string)

For i = 1 to 9
    lstDateColumns.Add("Diagnosis0" & i.toString() & "_Date")
    lstSystemColumns.Add("Diagnosis0" & i.toString() & "_System")
Next


For each  dataColumn as PropertyInfo in Row.GetType().GetProperties()


    If lstDateColumns.Contains(dataColumn.Name) Then

                 dataColumn.SetValue(Row, diagnosisDate, Nothing)

     ElseIf lstSystemColumns.Contains(dataColumn.Name) Then

                dataColumn.SetValue(Row, "ICD10", Nothing)

     End IF
Next

And you can filter over columns names from the lists

    Dim lstDateColumns As New List(Of String)
    Dim lstSystemColumns As New List(Of String)

    For i As Integer = 1 To 9
        lstDateColumns.Add("Diagnosis0" & i.ToString() & "_Date")
        lstSystemColumns.Add("Diagnosis0" & i.ToString() & "_System")
    Next

    For Each dataColumn As PropertyInfo In Row.GetType().GetProperties().Where(Function(x) lstDateColumns.Contains(x.Name))

        dataColumn.SetValue(Row, diagnosisDate, Nothing)

    Next


    For Each dataColumn As PropertyInfo In Row.GetType().GetProperties().Where(Function(x) lstSystemColumns.Contains(x.Name))

        dataColumn.SetValue(Row, "ICD10", Nothing)

    Next

References

Sign up to request clarification or add additional context in comments.

11 Comments

I see that this gets me part way there--It will allow me to loop through my fields because they're in the list. Is there a way to avoid iterating over all of the columns in the source?
Is my worry about the overhead of looping through all columns unjustified?
You're right. I think you can filter on column name using linq. Few moments and i'll update
@DataWriter i updated my answer, remember to imports System.Linq library to use Where function
Can I declare the lists as private module-level variables and add the column names in the event Public Overrides Sub PreExecute() so that I only have to assign the values once once for the package execution, instead of doing it for each row when Input0_ProcessInputRow fires?
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.