3

I Have a column with many flags that were parsed from a XML parser. Data looks like this:

USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;

I have to create a table with all these column names to capture the flags. Like:

USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED 
Y  N N N N Y N Y N 
Y  N N N N Y Y Y N 
Y  N N Y N Y N Y N 

How can I capture values dynamically in SAS? Either in a DATA step or a PROC step?

Thanks in advance.

6
  • Read in the data with the first variable as Variable and the second Variable as Value. Then transpose the data. Commented Nov 13, 2015 at 17:57
  • Is the string in a character variable in a dataset or in a text file that you want to read? If it is in a text file then perhaps you can read using named input style. But you would need to know the variable names to write the input statement. Commented Nov 13, 2015 at 18:05
  • Thank you for the responses. For now I tried with the string functions and it is working fine. Eventhough it is working, the code is not looking satisfactory for me as in future if more flags are added to the column, it requires a code change. Commented Nov 13, 2015 at 18:49
  • Perhaps you should post that code, so we can work off that in giving you suggestions. Commented Nov 13, 2015 at 18:54
  • my apologies. I just posted. In comments, it was not coming good. so answered my own question. Still I request you to provide me a better solution, please. Commented Nov 13, 2015 at 18:56

4 Answers 4

2

Let's start with your example output data.

data expect ;
  id+1;
  length USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION 
         USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED $1 ;
  input USERKEYED -- UMAPPED;
cards4;
Y  N N N N Y N Y N 
Y  N N N N Y Y Y N 
Y  N N Y N Y N Y N 
;;;;

Now we can recreate your example input data:

data have ;
  do until (last.id);
    set expect ;
    by id ;
    array flag _character_;
    length string $200 ;
    do _n_=1 to dim(flag);
      string=catx(';',string,catx('=',vname(flag(_n_)),flag(_n_)));
    end;
  end;
  keep id string;
run;

Which will look like this:

USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N 
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N 
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N 

So to process this we need to parse out the pairs from the variable STRING into multiple observations with the individual pairs' values split into NAME and VALUE variables.

data middle ;
  set have ;
  do _n_=1 by 1 while(_n_=1 or scan(string,_n_,';')^=' ');
    length name $32 ;
    name = scan(scan(string,_n_,';'),1,'=');
    value = scan(scan(string,_n_,';'),2,'=');
    output;
  end;
  keep id name value ;
run;

Then we can use PROC TRANSPOSE to convert those observations into variables.

proc transpose data=middle out=want (drop=_name_) ;
  by id;
  id name ;
  var value ;
run;
Sign up to request clarification or add additional context in comments.

Comments

1

The data that you have is a series of name/value pairs, using a ; as a delimiter. We can extract each name/value pair one at a time, and then parse those into values:

data tmp;
  length my_string next_pair name value $200;
  my_string = "USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;";
  cnt = 1;
  next_pair = scan(my_string,cnt,";");
  do while (next_pair ne "");
    name = scan(next_pair,1,"=");
    value = scan(next_pair,2,"=");
    output;
    cnt = cnt + 1;
    next_pair = scan(my_string,cnt,";");
  end;
  keep name value;
run;

Gives us:

name                value
=================== =====
USERKEYED           Y
VALMATCH            N
DEVICEVERIFIED      N
EXCEPTION           N
USERREGISTRD        N
ASSOCIATE           Y
EXTERNAL            N
GROSSGIVEN          Y
UMAPPED             N

We can then transpose the data so that the name is used for the column names:

proc transpose data=tmp out=want(drop=_name_);
  id name;
  var value;
run;

Which gives you the desired table.

4 Comments

This works for single record scenario. If I am not wrong, Transpose will not work if there are multiple records that needs to be passed to my_string from the data step. ???
You need a variable or variables to uniquely identify the original row that you can pass to PROC TRANSPOSE in a BY statement.
@NagaVemprala If you update your data I will update my code =)
@NagaVemprala Nevermind - looks like Tom's done it for you.
0
DATA <MY_DATASET>;
SET INPUT_DATASET;
USERKEYED = substr(input_column, find(input_column, 'USERKEYED=')+10,1);
VALMATCH = substr(input_column, find(input_column, 'VALMATCH=')+9,1);
DEVICEVERIFIED = substr(input_column, find(input_column, 'DEVICEVERIFIED=')+15,1);
EXCEPTION = substr(input_column, find(input_column, 'EXCEPTION=')+10,1);
USERREGISTRD = substr(input_column, find(input_column, 'USERREGISTRD=')+13,1);
ASSOCIATE = substr(input_column, find(input_column, 'ASSOCIATE=')+10,1); EXTERNAL = substr(input_column, find(input_column, 'EXTERNAL=')+9,1);
GROSSGIVEN = substr(input_column, find(input_column, 'GROSSGIVEN=')+11,1);
UMAPPED = substr(input_column, find(input_column, UMAPPED=')+8,1);
run; 

Comments

0

My answer is essentially in the first block of code, the rest is just explanation, one alternative and a nice tip.

Based on the answer you gave, the input data is already in a SAS data set, so that can be read to create a file of SAS code which can then be run using %include and so proc transpose is not required:

filename tempcode '<path><file-name.txt>'; /* set this up yourself */

/* write out SAS code to the fileref tempcode */
data _null_;
  file tempcode;
  set have;
  if _n_=1 then
    put 'Y="Y"; N="N"; drop Y N;';
  put input_column;
  put 'output;';
run;

/* %include the code to create the desired output */
data want;
  %include tempcode;
run;

As the input data already almost looks like SAS assignment statements, we have taken advantage of that and so the SAS code that has been run from fileref tempcode using %include should look like:

Y="Y"; N="N"; drop Y N;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;

As an alternative, fileref tempcode could contain all of the code for data step "data want;":

/* write out entire SAS data step code to the fileref tempcode */
data _null_;
  file tempcode;
  set have end=lastrec;
  if _n_=1 then
    put 'data want;'
       /'Y="Y"; N="N"; drop Y N;';

  put input_column;
  put 'output;';

  if lastrec then
    put 'run;';
run;

%include tempcode; /* no need for surrounding SAS code */

As a tip, to see the code being processed by %include in the log you can use the following variation:

%include tempcode / source2;

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.