2

I currently have two variables, state and year, which I want to combine into one variable, stateyear.

I want the stateyear variable to have values in the following form: state_year (e.g. Texas_1962).

How can I reference the values in the state and year variables to create the new stateyear variable?

3 Answers

9

That could be

 gen state_year = state + "_" + string(year) 

where I am assuming that year is numeric. Or it could be

 egen state_year = concat(state year), p(_) 

which takes care of any type conversions needed.

Or it could be

 egen state_year = group(state year), label 

which doesn't give you a connecting underscore. That raises a key point: why do you think you need that underscore? It will just look ugly on graphs or tables. If spaces are (thought to be) a problem, what about "North Carolina_2013", and so forth?

For a miniature review of this question, see http://www.stata-journal.com/sjpdf.html?articlenum=dm0034


6

Here is an example:

// create some example data
clear
input ///
str13 state      int year
"Noord-Holland"  1962
"Zuid-Holland"   1963
"Utrecht"        1964
"Zeeland"        1965
"Noord-Brabant"  1966
"Limburg"        1967
"Gelderland"     1968
"Flevoland"      1969
"Overijssel"     1970
"Drente"         1971
"Friesland"      1972
"Groningen"      1973
end

// create the variable
gen str18 state_year = state + "_" + string(year)

// admire the result
list    

If the + operator appears between two strings, Stata concatenates them.

So, the part state + "_" means add the string "_" after the content of the string variable state. To make sure that + also means concatenate for the part "_" + string(year), I used the string() function, which turns the numeric values of the variable year into strings.

The str18 part means that you want the variable state_year to be a string with 18 characters. This works for the Dutch states in this example, but in your case you will need to find the number of characters in the longest state name and add 5 (1 for the underscore and 4 for the year) to get the maximum length of the string. Say that number is 21; then you need to replace str18 with str21.
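If you would rather not count characters by hand, the width can be computed from the data. The following is only a sketch, meant to be run in place of the gen line above; the names state_len and width are made up for illustration, and it assumes that state is a string variable and year is a four-digit number.

```stata
// measure the length of each state name
generate int state_len = strlen(state)

// summarize leaves the largest length behind in r(max)
summarize state_len, meanonly

// 1 character for the underscore plus 4 for the year
local width = r(max) + 5

// create the variable with the computed width
generate str`width' state_year = state + "_" + string(year)

// the helper variable is no longer needed
drop state_len
```

Alternatively, leaving the storage type out altogether, as in gen state_year = state + "_" + string(year), should let Stata pick a suitable string width itself.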


1

One addition to Nick's solution. If the state variable is stored as numeric with value labels (e.g. 1 "Alabama" 2 "Alaska" etc.), I believe you will also need to specify the decode option, which converts the labels into strings:

 egen state_year = concat(state year), p(_) decode
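A small sketch of that situation, using made-up data and a hypothetical label name statelbl. As I understand the decode option, leaving it out would make concat() use the underlying numeric codes rather than the labels.

```stata
// hypothetical example data: state is numeric with value labels
clear
input byte state int year
1 1962
2 1963
end
label define statelbl 1 "Alabama" 2 "Alaska"
label values state statelbl

// decode asks concat() to use the value labels instead of the codes
egen state_year = concat(state year), punct(_) decode

// inspect the result
list
```

Here punct(_) is the spelled-out form of the p(_) option used above.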
