I have a very large data set in which I need to create new binary (0/1) columns based on text strings in other columns. It has a person ID column and 99 "Diagnosis Code" columns, each holding a text string that corresponds to a particular health condition.
Sample of Original Data
| Person ID | Diagnosis Code 1 | Diagnosis Code 2 | Diagnosis Code 3 |
|---|---|---|---|
| 10 | N18.3 | V34.2 | E73 |
| 11 | F35.9 | X29 | D4.0 |
| 12 | G27.2 | J05.1 | J60 |
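I need to get the data into this format, with one column per distinct code and a 1 wherever that person has the code in any of the diagnosis columns:
| Person ID | N18.3 | V34.2 | E73 | F35.9 | G27.2 | (plus all other codes) |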
|---|---|---|---|---|---|---|
| 10 | 1 | 1 | 1 | 0 | 0 | etc |
| 11 | 0 | 0 | 0 | 1 | 0 | etc |
| 12 | 0 | 0 | 0 | 0 | 1 | etc |
I have tried transposing, tabulation, and many other approaches, and nothing seems to work. I'd appreciate any help!
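The post does not say which tool is in use, so the following is only a minimal sketch assuming Python with pandas. It rebuilds the three-row sample above, reshapes the diagnosis columns from wide to long with `melt`, then cross-tabulates person against code. The names `df`, `long`, and `indicators` are illustrative, and the column names are assumed to match the sample table exactly.

```python
import pandas as pd

# Sample data copied from the table above (the real data has 99 code columns).
df = pd.DataFrame({
    "Person ID": [10, 11, 12],
    "Diagnosis Code 1": ["N18.3", "F35.9", "G27.2"],
    "Diagnosis Code 2": ["V34.2", "X29", "J05.1"],
    "Diagnosis Code 3": ["E73", "D4.0", "J60"],
})

# Wide -> long: one row per (person, code) pair; drop empty code slots.
long = df.melt(id_vars="Person ID", value_name="code").dropna(subset=["code"])

# Count each code per person, then cap at 1 so a code repeated for the
# same person still yields a binary indicator.
indicators = (
    pd.crosstab(long["Person ID"], long["code"])
    .clip(upper=1)
    .reset_index()
)
indicators.columns.name = None  # drop the leftover "code" axis label

print(indicators)
```

Because `melt` treats every column other than `Person ID` as a diagnosis column, the same code should work unchanged with all 99 columns, provided the data set has no other columns; if it does, they would need to be listed in `id_vars` or dropped first. The code columns come out sorted alphabetically rather than in order of first appearance.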