I'm working with the craziest file format I've seen. It is fixed width and contains multiple record types (in the sense that each row may have different columns and widths). There's a file header, a trailer, and then a fixed number of rows that, when put together, make up one record. The problem I'm having is that nothing in the rows tells you they belong to the same record other than their sort order and a row number attribute.

Example:

001 David          Wellingsworth    Mr.
002 312-555-5555      3060 W Maple St.           Chicago
001 Jimothy        Bogendath        Dr.
002 563-555-5432      123 Main St.               Davenport

My question is therefore: is it possible, without using a Script Component, to process a file like this? I understand the basic concept of how to handle disparate record types in a fixed width file (making use of conditional splits and substrings), but I can't get past how to join up all this data after the splits if the rows don't have identifiers.

If it helps, my question is basically this previous question but in reverse.

1 Answer

It's possible, but it takes some work. I've worked with data like this, and this is the approach we used to solve it.

  1. Build a table that gives each row its own unique RecordID.
  2. Create another table for your files that logs each filename with a unique FileID.
  3. Link the FileID to the RecordID so you know which file each record came from.
  4. Build all your sub-tables so that they link to each unique RecordID.

Building your tables this way will give you:

  1. A unique RecordID for each row (there may be duplicates in the file, but in your tables they are unique).
  2. A way to know which file each record came from.
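
As a rough illustration of the staging idea above, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for the real target database. The table names (`Files`, `StagingRows`), the column names, the filename `people.txt`, and the assumption that the row type is the first three characters of each line are all hypothetical choices for the example, not something stated in the question. The point is only that an auto-incrementing key, populated in file order, preserves the sort order that the file itself relies on.

```python
import sqlite3

# Sample rows copied from the question (file header/trailer omitted).
SAMPLE_LINES = [
    "001 David          Wellingsworth    Mr.",
    "002 312-555-5555      3060 W Maple St.           Chicago",
    "001 Jimothy        Bogendath        Dr.",
    "002 563-555-5432      123 Main St.               Davenport",
]


def load_staging(conn, filename, lines):
    """Insert raw lines in file order and return the new FileID."""
    # Hypothetical schema: one row per file, one row per raw line.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Files ("
        "FileID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "FileName TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS StagingRows ("
        "RecordID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "FileID INTEGER NOT NULL REFERENCES Files(FileID), "
        "RowType TEXT NOT NULL, "
        "RawLine TEXT NOT NULL)"
    )
    cur = conn.execute("INSERT INTO Files (FileName) VALUES (?)", (filename,))
    file_id = cur.lastrowid
    for line in lines:
        # Assumption: the first three characters identify the row type.
        conn.execute(
            "INSERT INTO StagingRows (FileID, RowType, RawLine) VALUES (?, ?, ?)",
            (file_id, line[:3], line),
        )
    conn.commit()
    return file_id


conn = sqlite3.connect(":memory:")
file_id = load_staging(conn, "people.txt", SAMPLE_LINES)
rows = conn.execute(
    "SELECT RecordID, FileID, RowType FROM StagingRows ORDER BY RecordID"
).fetchall()
for row in rows:
    print(row)
```

Because the rows are inserted one at a time in file order, sorting by `RecordID` later reproduces the original line order, which is the only thing that ties a `002` row to the `001` row before it.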

7 Comments

Thanks for the thoughts! To clarify though, there is only one file, it is just multiple records spread across multiple rows in the same file. In my example above, there are two separate records.
Right. Whether it's one file or multiple files, the same approach applies. Since there is no SSN or other unique ID for each one, each record should have its own RecordID. Then you can link all the rows of the same record based on their qualifying data, whether it's name, address, etc.
Each record has an ID currently, but it is only provided on the first row of data. The subsequent rows do not have any attributes that would allow us to know they're linked to that first row beyond the fact that they come after the first row. I guess I'm confused as to how I get the data into a table in the first place in order to be able to assign an ID to each related row in the output so that I can group them together into one record later on.
The ID provided in the file should be stored as just another column. Since it is not reliable data, you can't use it to link. That's not the RecordID I am talking about. The RecordID that you create in the table will be auto-populated by the system on each import through SSIS. Therefore, the RecordID is effectively the row number of each insert.
The first goal is to INSERT the rows as they are in the file, each with its own unique RecordID. Tie each RecordID to the FileID so you know where each record came from. Your next goal is to create a script that ties them all together into one line. I am not sure if this is where you are having issues.
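
To make the "tie them all together" step concrete, here is a small Python sketch of one way it could work once the rows are available in their original order. It assumes, based only on the example in the question, that a row of type `001` always starts a new record and that every following row belongs to that record until the next `001`. The function name `stitch_records` and the choice to key each piece by its row type are hypothetical. The same idea could probably be expressed in SQL as a running count of `001` rows ordered by RecordID.

```python
# Rows in original file order, copied from the question.
ORDERED_LINES = [
    "001 David          Wellingsworth    Mr.",
    "002 312-555-5555      3060 W Maple St.           Chicago",
    "001 Jimothy        Bogendath        Dr.",
    "002 563-555-5432      123 Main St.               Davenport",
]


def stitch_records(ordered_lines, first_row_type="001"):
    """Group consecutive rows into records; a first_row_type row opens a new one."""
    records = []
    for line in ordered_lines:
        # Assumption: 3-character row type, one separator, then the payload.
        row_type = line[:3]
        payload = line[4:].rstrip()
        if row_type == first_row_type:
            records.append({})
        elif not records:
            raise ValueError("data row appeared before any first-row marker")
        # Attach this row's payload to the record currently being built.
        records[-1][row_type] = payload
    return records


records = stitch_records(ORDERED_LINES)
for number, record in enumerate(records, start=1):
    print(number, record)
```

Each dictionary holds the raw payload of every row type for one logical record, so the fixed-width substring logic for each row type can then be applied per key.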