I have a Data Flow with an OLE DB Source, a Script Component (Transformation), and a Flat File Destination.

The OLE DB Source has 100+ columns. The Script Component will clean up the data in each column and then pass it to the Flat File Destination.

Adding output columns by hand in Script Component is unthinkable to me.

What options do I have to mirror the input columns as output columns in the Script Component? The output column names will stay the same, but I plan to change the data type from DT_STR to DT_WSTR.

Thank you.

  • Possibly related? This answer seems to suggest that everything passes through by default. Can't verify right now though. stackoverflow.com/a/54809782 Commented Jul 24, 2023 at 14:47
  • @GeneralGrievance No. The other question, "is it possible to pass columns through an SSIS script transformation?", is about passing almost all columns through and changing only a few of them. The asker there apparently did not know that columns are passed through by default if you leave them as "ReadOnly" (the default). This question is about creating a lot of new output columns, so many that the task should be scripted somehow. Commented Mar 25, 2024 at 17:42
  • You can skip the transformation and change the code page on the Flat File Destination to 65001, which means UTF-8. Commented Mar 28, 2024 at 19:00

3 Answers

2

You are out of luck here. Possible scenarios:

  • You use the Script Component and key in all columns and their properties manually. In your case, you have to set the proper data type on each one.
  • You create your own Custom Component, which can be programmed to create output columns based on the input columns. It is not easy and I cannot recommend a simple guideline, but it can be done. This might make sense if you have to repeat similar operations in many places, so it is not a one-time task.
  • You create a BIML script that generates a package based on metadata. However, the metadata (the list of columns and their data types) has to be prepared before running the BIML script, or you have to use some tricks to get it during script execution. Again, some proficiency with BIML is essential.

So, for a one-time job with little BIML experience, I would go for the purely manual approach.

1 Comment

Well, there is always the option of directly editing the XML (dtsx file) if you're careful about it. Then you can at least copy-paste a lot of the tedious stuff.
1

You need to open the .dtsx file as XML and add the output columns as text, copying an existing element and replacing the parts that differ. You can find out what needs replacing by first adding a few columns to the output in the designer and inspecting the XML they produce.

After saving the XML, open the script code from the Script Component editor in the UI so that the component's metadata is updated.
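
The copy-and-replace step can be scripted rather than done by hand. Below is a minimal Python sketch, assuming you have already added one output column in the designer and copied its XML element out of the .dtsx file to use as a template. The template string, the `{name}` placeholder, the `render_output_columns` helper, and the column names are all hypothetical. The real element in your package may carry different attributes, so paste your own instead of relying on this one.

```python
from xml.sax.saxutils import escape

# Hypothetical template: replace it with the element you copied from your own
# .dtsx file after adding a single output column in the designer.
TEMPLATE = (
    '<outputColumn '
    'refId="Package\\Data Flow Task\\Script Component.Outputs[Output 0].Columns[{name}]" '
    'dataType="wstr" length="50" '
    'lineageId="Package\\Data Flow Task\\Script Component.Outputs[Output 0].Columns[{name}]" '
    'name="{name}" />'
)


def render_output_columns(column_names, template=TEMPLATE):
    """Return one XML element per column name, built from the template."""
    lines = []
    for name in column_names:
        # Escape characters that are not allowed inside an XML attribute value.
        safe = escape(name, {'"': "&quot;"})
        lines.append(template.replace("{name}", safe))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical column names; in practice, read them from your source query.
    print(render_output_columns(["my_column1", "my_column2"]))
```

The generated lines would then be pasted into the output's column list in the .dtsx file. Keep a backup of the package before editing it.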

0

Puzzle together your XML with formulas

The other answers are right that you can work directly on the XML of the package.

I once did this for 100+ columns when I had to build a mapping for the Lookup or Derived Column component:

  • I took the XML of the .dtsx file,
  • set up the needed default column types in MS Excel,
  • mapped the SQL data types from the column list of the CREATE statement to their SSIS data types,
  • built all of the needed default XML patterns by trial and error in a dummy project,
  • and used placeholders for the column names so that they could be replaced or concatenated into those default XML blocks.

On the whole, I puzzled together the needed XML by hand. This worked and saved me not just a lot of time but also mistakes and, above all, nerves. This kind of mapping sheet has helped me again and again.

I can assure you that a spreadsheet is enough to do the full mapping if you know how to concatenate strings and do lookups on more than one search key.
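
The type-mapping step of the spreadsheet can also be sketched in a few lines of Python. This is only an illustration under assumptions: the `SQL_TO_SSIS` table covers a small subset of common SQL Server types, the `map_columns` helper and the sample column list are hypothetical, and the regular expression only handles simple one-column-per-line definitions.

```python
import re

# Partial SQL Server -> SSIS type mapping (a subset only, extend as needed).
SQL_TO_SSIS = {
    "bit": "DT_BOOL",
    "tinyint": "DT_UI1",
    "smallint": "DT_I2",
    "int": "DT_I4",
    "bigint": "DT_I8",
    "real": "DT_R4",
    "float": "DT_R8",
    "decimal": "DT_NUMERIC",
    "numeric": "DT_NUMERIC",
    "date": "DT_DBDATE",
    "datetime": "DT_DBTIMESTAMP",
    "varchar": "DT_STR",
    "nvarchar": "DT_WSTR",
}

# Matches "[name] type" or "name type(size ...", one column per line.
COLUMN_RE = re.compile(r"^\[?(\w+)\]?\s+\[?(\w+)\]?(?:\s*\(\s*(\w+))?")


def map_columns(column_list_sql):
    """Return (column name, SSIS type, first size argument or None) per line."""
    rows = []
    for line in column_list_sql.splitlines():
        line = line.strip().rstrip(",")
        match = COLUMN_RE.match(line)
        if not match:
            continue  # skip blank or unparseable lines
        name, sql_type, size = match.group(1), match.group(2).lower(), match.group(3)
        if sql_type not in SQL_TO_SSIS:
            raise ValueError(f"no SSIS mapping defined for SQL type {sql_type!r}")
        rows.append((name, SQL_TO_SSIS[sql_type], size))
    return rows


if __name__ == "__main__":
    # Hypothetical column list copied from a CREATE TABLE statement.
    sample = "[id] int NOT NULL,\n[name] varchar(50) NULL,\ntitle nvarchar(100)"
    for row in map_columns(sample):
        print(row)
```

Each resulting row can then be concatenated into an XML pattern, which is what the spreadsheet formulas do.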

Other ways

You can do the same thing in a programming language instead of spreadsheet formulas. One keyword to search for is BIML, but see all of the hints at How to Map Input and Output Columns dynamically in SSIS?.

PS: unchanged ReadOnly columns are passed through by default

This is just a small remark for those who are new to the Script Component.

In the question above, the columns need output columns because they are cleaned or changed in some way. Do not assume that you have to create an output column for each input column. Columns you do not change pass through to the output automatically, with no new output columns needed in the component editor. They stay available downstream under the same name, so there is no need to select them as output columns of the Script Component to keep them alive in the flow. See also is it possible to pass columns through an SSIS script transformation?.

If you still try to add an output column for each input column and give it the same name, you will see the error:

Script Component [153]]: Column name "my_column1" in output "Output 0" cannot be used because it conflicts with a column of the same name on synchronous input "Input 0".

PPS: List of SSIS data types

For a full mapping of the data types, I found this list in the SetObject() function of the BufferWrapper class, which is called by the Input0Buffer class in C#:

    public void SetObject(int columnIndex, object value, bool failSilently = true)
    {
        BufferColumn columnInfo = GetColumnInfo(columnIndex);
        try
        {
            switch (columnInfo.DataType)
            {
                case DataType.DT_BOOL:
                case DataType.DT_BYREF_BOOL:
                    SetBoolean(columnIndex, (bool)value);
                    break;
                case DataType.DT_I2:
                case DataType.DT_BYREF_I2:
                    SetInt16(columnIndex, (short)value);
                    break;
                case DataType.DT_I4:
                case DataType.DT_BYREF_I4:
                    SetInt32(columnIndex, (int)value);
                    break;
                case DataType.DT_R4:
                case DataType.DT_BYREF_R4:
                    SetSingle(columnIndex, (float)value);
                    break;
                case DataType.DT_R8:
                case DataType.DT_BYREF_R8:
                    SetDouble(columnIndex, (double)value);
                    break;
                case DataType.DT_CY:
                case DataType.DT_BYREF_CY:
                case DataType.DT_DECIMAL:
                case DataType.DT_NUMERIC:
                case DataType.DT_BYREF_DECIMAL:
                case DataType.DT_BYREF_NUMERIC:
                    SetDecimal(columnIndex, (decimal)value);
                    break;
                case DataType.DT_I1:
                case DataType.DT_BYREF_I1:
                    SetSByte(columnIndex, (sbyte)value);
                    break;
                case DataType.DT_UI1:
                case DataType.DT_BYREF_UI1:
                    SetByte(columnIndex, (byte)value);
                    break;
                case DataType.DT_UI2:
                case DataType.DT_BYREF_UI2:
                    SetUInt16(columnIndex, (ushort)value);
                    break;
                case DataType.DT_UI4:
                case DataType.DT_BYREF_UI4:
                    SetUInt32(columnIndex, (uint)value);
                    break;
                case DataType.DT_I8:
                case DataType.DT_BYREF_I8:
                    SetInt64(columnIndex, (long)value);
                    break;
                case DataType.DT_UI8:
                case DataType.DT_BYREF_UI8:
                    SetUInt64(columnIndex, (ulong)value);
                    break;
                case DataType.DT_DBDATE:
                case DataType.DT_BYREF_DBDATE:
                    SetDate(columnIndex, (DateTime)value);
                    break;
                case DataType.DT_DATE:
                case DataType.DT_BYREF_DATE:
                case DataType.DT_FILETIME:
                case DataType.DT_BYREF_FILETIME:
                case DataType.DT_DBTIME:
                case DataType.DT_BYREF_DBTIME:
                case DataType.DT_DBTIMESTAMP:
                case DataType.DT_BYREF_DBTIMESTAMP:
                case DataType.DT_DBTIME2:
                case DataType.DT_BYREF_DBTIME2:
                case DataType.DT_DBTIMESTAMPOFFSET:
                case DataType.DT_BYREF_DBTIMESTAMPOFFSET:
                case DataType.DT_DBTIMESTAMP2:
                case DataType.DT_BYREF_DBTIMESTAMP2:
                    SetDateTime(columnIndex, (DateTime)value);
                    break;
                default:
                    throw new Exception(columnInfo.DataType.ToString() + " not yet supported ");
            }
        }
        catch (Exception e)
        {
            if (failSilently == false)
                throw e;
            else
                try { SetNull(columnIndex); } catch { }
        }
    }
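
For use in a mapping sheet or generator script, the switch above can be condensed into a lookup table. The sketch below is derived only from the cases shown in that function. The names `CAST_GROUPS` and `cast_type_for` are hypothetical. Note that DT_STR and DT_WSTR do not appear in the switch, so they would end up in its default branch.

```python
# C# cast type -> SSIS data types, copied from the cases of SetObject() above.
CAST_GROUPS = {
    "bool": ["DT_BOOL"],
    "short": ["DT_I2"],
    "int": ["DT_I4"],
    "float": ["DT_R4"],
    "double": ["DT_R8"],
    "decimal": ["DT_CY", "DT_DECIMAL", "DT_NUMERIC"],
    "sbyte": ["DT_I1"],
    "byte": ["DT_UI1"],
    "ushort": ["DT_UI2"],
    "uint": ["DT_UI4"],
    "long": ["DT_I8"],
    "ulong": ["DT_UI8"],
    "DateTime": [
        "DT_DBDATE", "DT_DATE", "DT_FILETIME", "DT_DBTIME", "DT_DBTIMESTAMP",
        "DT_DBTIME2", "DT_DBTIMESTAMPOFFSET", "DT_DBTIMESTAMP2",
    ],
}

# Invert the groups into a flat SSIS type -> C# cast type lookup.
SSIS_TO_CAST = {
    ssis_type: cast_type
    for cast_type, ssis_types in CAST_GROUPS.items()
    for ssis_type in ssis_types
}


def cast_type_for(ssis_type):
    """Return the C# cast used in the switch, or None if the type is not handled."""
    # Every DT_BYREF_* case in the switch shares a branch with its plain DT_* type.
    if ssis_type.startswith("DT_BYREF_"):
        ssis_type = "DT_" + ssis_type[len("DT_BYREF_"):]
    return SSIS_TO_CAST.get(ssis_type)


if __name__ == "__main__":
    for t in ("DT_I4", "DT_BYREF_NUMERIC", "DT_WSTR"):
        print(t, "->", cast_type_for(t))
```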
