Why does Stata complain with a cryptic error when I use string variables in the table command?

Consider the following toy example:

sysuse auto, clear
decode foreign, g(foreign_str)
table foreign, contents(n foreign_str mean mpg)

In Stata 13.1 this fails with the error "variable __000002 not found", r(111).

Tracing the error shows that table tries to run format __000002 %9.0gc and fails because the variable does not exist. If I switch the order of the variables in the clist, that is, if I run table foreign, contents(mean mpg n foreign_str), I get the same error but with __000003 instead of __000002.

So it appears that Stata crashes when it finds the string variable. If I replace the string variable with a numeric variable, the error doesn't occur.

I know it is not meaningful to compute summary statistics on string variables, but counting the number of observations of a string variable (in each group specified by the rowvar) makes perfect sense.

  • You can use table foreign, contents(freq mean mpg) to get around this. Commented Nov 6, 2014 at 0:18
  • @DimitriyV.Masterov Right, that does exactly what I want; I'm mainly curious as to why Stata throws this (in my opinion) cryptic error. It seems like the function should check for string variables first, and throw a type mismatch or string variable not supported error, instead of a variable not found error (since the variable obviously exists). Commented Nov 6, 2014 at 14:12
  • Unfortunately, I don't really know the answer. Commented Nov 6, 2014 at 19:50
  • @DimitriyV.Masterov see my answer about why this happens in Stata 13. Commented Jun 20, 2019 at 12:00

1 Answer

Stata complains because variable __000002 (or __000003 if you change the order) is not created by the collapse command (which is used internally by table) due to the following error:

collapse (count) foreign_str
type mismatch
r(109);

The failure is not visible to the user because table runs collapse under capture, which suppresses the error message. The output from set trace on confirms this:

- capture collapse `clist' `wgt', by(`varlist' `by') fast `cw'
= capture collapse  (count) __000002=foreign_str (mean) __000003=mpg , by(foreign ) fast
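
To see the suppressed error directly, a similar collapse call can be run by hand under capture and the return code inspected afterwards. This is a minimal sketch, not the code that table itself executes: the target names n_str and mean_mpg are made up for illustration, and preserve/restore is only there to keep the dataset intact.

```stata
sysuse auto, clear
decode foreign, generate(foreign_str)

* keep a copy of the data, since collapse replaces it when it succeeds
preserve

* capture hides any error message; the return code is left in _rc
* (n_str and mean_mpg are hypothetical names chosen for this example)
capture collapse (count) n_str = foreign_str (mean) mean_mpg = mpg, by(foreign)
display "collapse return code: " _rc

restore
```

On Stata 13 the displayed code should be nonzero, matching the type mismatch shown above. Newer releases may return 0 if their collapse accepts string variables for this statistic.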

The table code only handles return codes 111 and 135 from this call, so the r(109) goes unnoticed. The command keeps running until it reaches the format statement, which fails because the temporary variables that collapse should have created do not exist.
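
The difference between handling two specific return codes and handling every failure can be sketched as follows. This is an illustration under assumptions, not the actual source of table: what table does for codes 111 and 135 is not shown here, so the sketch simply passes them through, and the catch-all branch is hypothetical.

```stata
sysuse auto, clear
decode foreign, generate(foreign_str)

* n_str and mean_mpg are hypothetical names chosen for this example
capture collapse (count) n_str = foreign_str (mean) mean_mpg = mpg, by(foreign)

if _rc == 111 | _rc == 135 {
    * the only two codes the answer says are provided for
    exit _rc
}
else if _rc {
    * hypothetical catch-all: without a branch like this, any other
    * failure (such as a type mismatch) is silently ignored
    display as error "collapse failed with return code " _rc
    exit _rc
}
```

With a catch-all branch, the user would see the real cause at the point of failure instead of a later "variable not found" message.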

Stata 14 and later versions check the variable(s) provided by the user in the contents() option and only accept numeric types, issuing a more informative error message if this is not the case.
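
A similar check can be done by hand in older versions with confirm numeric variable. The sketch below is one possible approach, not the check Stata itself performs: the local macro v and the indicator variable has_foreign_str are made up for illustration, and it uses the contents() syntax from the question, which releases with a redesigned table command may not accept.

```stata
sysuse auto, clear
decode foreign, generate(foreign_str)

* v is a hypothetical local holding the variable we want to count
local v foreign_str

* confirm sets a nonzero _rc if the variable is not numeric
capture confirm numeric variable `v'
if _rc {
    * string variable: count non-empty values through a numeric indicator
    generate byte has_`v' = !missing(`v')
    table foreign, contents(sum has_`v' mean mpg)
}
else {
    * numeric variable: the n statistic can be used directly
    table foreign, contents(n `v' mean mpg)
}
```

Summing the indicator gives the number of non-empty strings in each group, which is the count the question was after.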

It is also worth pointing out that collapse treats strings differently in more recent Stata versions.
