I have a BIDS project set up to load data from several flat files into a SQL Server 2008 database. The data is provided by another organization.

A lot of the data has trailing or leading spaces. This is enough of a problem that I would have to widen the columns in my table to accommodate it. I could use a Derived Column transformation to resolve this, but there are enough columns that setting it all up manually would be impractical.

I'm trying to use a script component (transformation) to remove leading and trailing spaces from every field before being uploaded. However, this is my first stab at using a script component and I'm having no luck.

Trying a simple foreach loop:

foreach(DataColumn i in Row)
  {
      /* do something */
  }

This gives me the error "foreach statement cannot operate on variables of type 'Input0Buffer' because 'Input0Buffer' does not contain a public definition of 'GetEnumerator'". What do I need to do to resolve this?

2 Answers

Row in a script component is NOT a System.Data.DataRow; it is an Input0Buffer. The Input0Buffer class is generated from your SSIS package and exposes the column names as properties.

So you can use GetType().GetProperties() to get all of the System.Reflection.PropertyInfo objects on the row and go through them to do what you want. You will have to do some research on how to use reflection to actually read and write each property dynamically, because I don't know that answer off the top of my head.

using System.Linq;

var properties = Row.GetType().GetProperties().Where(p => !p.Name.EndsWith("_IsNull")).Select(p => p.Name).ToArray();
foreach (var p in properties)
{
    //Do Something
}
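The dynamic read and write mentioned above can be done with PropertyInfo.GetValue and PropertyInfo.SetValue. The following is only a minimal sketch, not tested inside SSIS: FakeRow is a hypothetical stand-in for the autogenerated Input0Buffer, and RowTrimmer and TrimStringColumns are names made up for this illustration. It assumes string columns are exposed as writable string properties (which, in a script component, generally means the input columns are marked ReadWrite) and that each column has a matching _IsNull property.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the autogenerated Input0Buffer class.
public class FakeRow
{
    public string FirstName { get; set; }
    public bool FirstName_IsNull { get { return FirstName == null; } }
    public string City { get; set; }
    public bool City_IsNull { get { return City == null; } }
    public int Age { get; set; }
    public bool Age_IsNull { get { return false; } }
}

public static class RowTrimmer
{
    // Trims every readable and writable string property on the given row object.
    public static void TrimStringColumns(object row)
    {
        Type rowType = row.GetType();
        foreach (PropertyInfo p in rowType.GetProperties())
        {
            // Only plain string properties with both a getter and a setter.
            if (p.PropertyType != typeof(string) || !p.CanRead || !p.CanWrite) continue;
            if (p.GetIndexParameters().Length > 0) continue;

            // Skip columns flagged as NULL via the companion <Name>_IsNull property.
            PropertyInfo isNull = rowType.GetProperty(p.Name + "_IsNull");
            if (isNull != null && isNull.PropertyType == typeof(bool)
                && (bool)isNull.GetValue(row, null)) continue;

            string value = (string)p.GetValue(row, null);
            if (value != null) p.SetValue(row, value.Trim(), null);
        }
    }
}
```

In a script component this would presumably be called as RowTrimmer.TrimStringColumns(Row) from Input0_ProcessInputRow. Reflection on every row is not free, so for large files it may be worth caching the PropertyInfo array once instead of calling GetProperties() per row.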

3 Comments

Has anybody found out how the columns can then be mapped to a DataTable? Something like DataTable dt = new DataTable(); DataRow dr = dt.NewRow(); ... foreach (var p in properties) { dr[p] = p; } dt.Rows.Add(dr);? I get for this code: Column 'myCol' does not belong to table .
@questionto42 This should be a new question rather than a continuation here, but a search on how to properly create a DataTable (dt) will get you the answer. Basically, you need to add the property names as columns first, e.g. dt.Columns.Add(p). Because this transformation works at row level (not table or DataSet level), I am not sure what you really want to do with a DataTable. You might want to open a new question detailing your constraints, desired outcomes, and what you have tried.
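As a rough illustration of the dt.Columns.Add(p) advice in the comment above: the "does not belong to table" error goes away once the columns exist before the row is filled. This is only a sketch. SampleRow is a hypothetical stand-in for the row object, and RowToTable and ToSingleRowTable are made-up names. It assumes every property has a type that DataTable accepts as a column type (nullable value types would need unwrapping first).

```csharp
using System;
using System.Data;
using System.Reflection;

// Hypothetical stand-in for a row object with column properties.
public class SampleRow
{
    public string MyCol { get; set; }
    public int Qty { get; set; }
}

public static class RowToTable
{
    // Builds a one-row DataTable whose columns mirror the object's properties.
    public static DataTable ToSingleRowTable(object row)
    {
        DataTable dt = new DataTable();
        PropertyInfo[] props = row.GetType().GetProperties();

        // The columns must be added before any value is assigned by column name.
        foreach (PropertyInfo p in props)
            dt.Columns.Add(p.Name, p.PropertyType);

        DataRow dr = dt.NewRow();
        foreach (PropertyInfo p in props)
            dr[p.Name] = p.GetValue(row, null) ?? DBNull.Value; // store the value, not the name

        dt.Rows.Add(dr);
        return dt;
    }
}
```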
There is also an answer at "Apply row transformation for multiple input columns in Script Transformation" that uses typeof(Input0Buffer).GetProperties() instead of the GetType().GetProperties() shown in the other answer here. The two calls should return the same properties, but that answer goes slightly further: it builds a Dictionary of the column names and their indexes inside a new Input0ByIndexBuffer class, which replaces Input0Buffer:

You actually can access the underlying object in a script and iterate over a buffer without using the named columns. You need to inherit the buffer and use this inherited class in public override void ProcessInput(int InputID, string InputName, PipelineBuffer Buffer, OutputNameMap OutputMap). I usually use a class inherited from Input0Buffer (the autogenerated one) so I can access both the named and the iterable columns. You can get the indexes to the columns by using Reflection, e.g.:

//inherit via BufferWrapper 
   public class Input0ByIndexBuffer : Input0Buffer

After that, I was able to loop over all columns of the Row object by changing from:

public override void Input0_ProcessInputRow(Input0Buffer Row)

to the self-written:

public void Input0_ProcessInputRow(Input0ByIndexBuffer Row)

This self-written Input0ByIndexBuffer lets you get all the columns in one go, along with other members that were protected before this change. The code looks long, but it is just a copy of the Input0Buffer class with the small changes needed so that those members are no longer protected. You can hide it in a #region Input0ByIndexBuffer block.

As a small test: after copying everything into my code, I had the Dictionary ColumnIndexes, with all column names and their indexes to loop over, which was not there before. This Dictionary also seems to be the main trick for getting the loop done, as the answerer writes:

The actual indexes to the column names I'll get from the Dictionary ColumnIndexes

This worked in my test: the next Debug line then showed the first column of the Row object.

The Dictionary is made in the following lines right at the top of the long code:

    public Dictionary<string, int> ColumnIndexes = new Dictionary<string, int>();

    public Input0ByIndexBuffer(PipelineBuffer Buffer, int[] BufferColumnIndexes, OutputNameMap OutputMap) : base(Buffer, BufferColumnIndexes, OutputMap)
    {
        IList<string> propertyList = new List<string>();
        foreach (PropertyInfo property in typeof(Input0Buffer).GetProperties())
            if (!property.Name.EndsWith("_IsNull"))
                propertyList.Add(property.Name.ToUpperInvariant());
        for (int i = 0; i < propertyList.Count; i++)
            ColumnIndexes[propertyList[i]] = i;
    }

The answerer has embedded this in the new Input0ByIndexBuffer class. For this class to work you need the Reflection namespace, that is, using System.Reflection;, as the answerer writes. Overall, the namespaces I needed to get my own code to run (the last three were needed only after the code change) are:

#region Namespaces
using System;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Reflection;
using Microsoft.SqlServer.Dts.Pipeline;
using System.Collections.Generic;
#endregion

The propertyList collects all of the column names from typeof(Input0Buffer).GetProperties(), and ColumnIndexes[propertyList[i]] = i then maps each column name in that list to its index. In the end, the Dictionary ColumnIndexes holds everything you need: each column name and its index.
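To try the mapping logic outside of SSIS, the same steps can be run against any class. The sketch below uses FakeBuffer as a hypothetical stand-in for Input0Buffer. ColumnIndexDemo and BuildColumnIndexes are names invented for this example and are not part of the linked answer. One caveat to keep in mind: the .NET documentation states that GetProperties() does not guarantee any particular order, so the indexes reflect whatever order reflection happens to return.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for the autogenerated Input0Buffer class.
public class FakeBuffer
{
    public string Name { get; set; }
    public bool Name_IsNull { get { return Name == null; } }
    public string City { get; set; }
    public bool City_IsNull { get { return City == null; } }
}

public static class ColumnIndexDemo
{
    // Maps each upper-cased column property name to its position
    // among the non-_IsNull properties returned by reflection.
    public static Dictionary<string, int> BuildColumnIndexes(Type bufferType)
    {
        Dictionary<string, int> indexes = new Dictionary<string, int>();
        int i = 0;
        foreach (PropertyInfo property in bufferType.GetProperties())
        {
            if (property.Name.EndsWith("_IsNull")) continue; // skip the null flags
            indexes[property.Name.ToUpperInvariant()] = i++;
        }
        return indexes;
    }
}
```

Calling ColumnIndexDemo.BuildColumnIndexes(typeof(FakeBuffer)) returns a dictionary with the keys NAME and CITY, which can then be looped over with foreach (KeyValuePair<string, int> entry in ...).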
