I have a variable called var1 that holds two kinds of values, both stored as character strings. One is "ND"; the other is a number from 0 to 100, also stored as a string. I want to convert "ND" to 0 and the numeric strings to numeric values, for example "1" (character) to 1 (numeric).

Here's my code attempt:

data cleaned_up(drop = exam_1);
    set dataset.df(rename=(exam1=exam_1));
select (exam1);
    when ('ND') do;
        exam1 = 0;
    end;
    when ; 
        exam1 = input(exam_1,2.);
    end;
    otherwise;
end;

Clearly not working. What am I doing wrong?

  • Just to add a comment, I have no idea what to put in that second when statement that can make it like when (0-100) since 0-100 aren't numeric... Commented Nov 1, 2018 at 6:49

3 Answers


There are a couple of problems with your code. Putting the rename as a dataset option on the input dataset performs the rename as the data is read in. Therefore exam1 won't exist in the incoming data, as it is now called exam_1. That column is still defined as character, so the conversion won't work as written.

You need to keep the existing column, create a new numeric column to do the conversion, then drop the old column and rename the new one. This can be done as a dataset option against the output dataset.

The tranwrd function replaces every occurrence of 'ND' with '0'; input with the best12. informat then reads all the values as numbers. You don't have to match the informat width to the number of digits (i.e. 2. for 2 digits, 3. for 3 digits, etc.).

data cleaned_up (drop=exam1 rename=(exam_1=exam1));
set df;
exam_1 = input(tranwrd(exam1,'ND','0'),best12.);
run;
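
To try the step above end to end, here is a minimal sketch. The sample data set df and its four values are made up for illustration; only the second data step comes from the answer.

```sas
/* Hypothetical sample data: exam1 is character, holding a score or 'ND' */
data df;
    input exam1 $3.;
    datalines;
1
57
100
ND
;
run;

/* Build a numeric column, then drop the old one and swap names on output */
data cleaned_up (drop=exam1 rename=(exam_1=exam1));
    set df;
    exam_1 = input(tranwrd(exam1, 'ND', '0'), best12.);
run;

/* Inspect the result: exam1 should now be numeric */
proc print data=cleaned_up;
run;
```

With this sample data, exam1 in cleaned_up should come out as the numeric values 1, 57, 100 and 0. Note that tranwrd replaces 'ND' wherever it appears in the string, which is fine as long as the only non-numeric value is exactly 'ND'.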

You are using select(exam1) where it should be select(exam_1). You can use select for this purpose, but I think a simple if condition solves this much more easily:

data test;
    length source $32;
    do source='99', '34.5', '105', 'ND';
        output;
    end;
run;

data result(drop = convertedValue);
    set test;

    if (source eq 'ND') then do;
        result = 0;
    end;
    else do;
        convertedValue = input(source,??best.);
        if not missing(convertedValue) then do;
            if (0 <= round(convertedValue, 1E-12) <= 100) then do;
                result = convertedValue;
            end;
        end;
    end;
run;

input(source,??best.) tries to convert source to a number. If the conversion fails (e.g. the value contains a word), it returns a missing value, does not print an error, and simply continues execution.
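
Because ?? keeps the log quiet, invalid values can slip through unnoticed. A minimal sketch of one way to flag them explicitly is shown here; the data set name flagged, the variable invalid and the test values are hypothetical, not part of the original code.

```sas
/* Hypothetical test values: a valid score, 'ND', and a non-numeric word */
data test;
    length source $32;
    do source = '42', 'ND', 'abc';
        output;
    end;
run;

data flagged;
    set test;
    if source = 'ND' then result = 0;
    else result = input(source, ?? best32.);
    /* ?? suppresses the log message and leaves _ERROR_ at 0,
       so mark failed conversions ourselves (1 = could not convert) */
    invalid = (source ne 'ND' and missing(result));
run;
```

Here 'abc' should end up with a missing result and invalid = 1, while '42' and 'ND' get invalid = 0.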

round(convertedValue,1E-12) is used to avoid precision errors during the comparison. If you want to be absolutely safe, you have to use something like:

if (0 < round(convertedValue,1E-12) < 100
    or abs(round(convertedValue,1E-12)) < 1E-10
    or abs(round(convertedValue-100,1E-12)) < 1E-10
) then result = convertedValue;

3 Comments

Are there any potential issues with using ?? that the OP should be aware of as well?
Hi, I do not know of any potential issues with the '??' modifier. It will not print a message for invalid values and will not set the automatic _ERROR_ variable. I guess if the OP wants to see the invalid values, another else branch can be added to handle the case where convertedValue is missing, and one more for the case where the number is outside the 0-100 range.
I chose the other answer because its code was succinct from a readability standpoint. But this is also a nice answer. Thank you.

Try the ifc function to swap 'ND' for '0', then convert the result to a numeric variable.

data have;
input x $3.;
_x=input(ifc(x='ND','0',x),best12.);
cards;
3
10
ND
;
