I can't figure out how this is possibly happening in Stata. The data are integers with no missing values.

gen aid=bid*10000+cid
gen temp=0
replace temp=1 if aid!=bid*10000+cid
sum temp

The mean of temp equals 0.27, and I can see many observations where aid does not match the formula. How is this possible? I have tried running it in Stata 12 and 13 and got the same results. In every case with an error, aid is off by 1 or 2 in the ones digit.

Here is a reproducible example:

set obs 1

gen wid=2107 
gen fid=2104 

gen mid=fid*10000+wid 
di mid
  • Looks like a precision issue but you don't give away enough information. Please post a reproducible example. Commented Nov 20, 2014 at 3:20
  • See help data types and help float. Commented Nov 20, 2014 at 3:26
  • Here is a reproducible example: gen wid=2107 gen fid=2104 gen mid=fid*10000+wid di mid 21042108 Commented Nov 20, 2014 at 3:27
  • This doesn't seem like a precision issue to me. The numbers are only 8 digits, and they differ in the ones digit. Commented Nov 20, 2014 at 3:28

1 Answer

An example:

clear 
set more off

set obs 1

gen wid=2107 
gen fid=2104 

gen mid = fid*10000 + wid 
gen double mid2 = fid*10000 + wid 

display mid
display mid2

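The default data type is a float, and "floats have about 7 digits of accuracy". More precisely, a float has a 24-bit significand, so it holds every integer exactly only up to 16,777,216 (2^24). Between 2^24 and 2^25 only even integers are representable, so 21,042,107 is stored as the nearest one, 21,042,108. That is why an 8-digit number can be wrong in the ones digit.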

If you increase the precision of your data type, you see the expected result. Read the references I gave in previous comments: help data types and the references within.
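As a minimal sketch of the same idea, the code below stores the combined value both as the default float and as a long, then repeats the kind of check used in the question. The variable names mid_f and mid_l are illustrative only, and the sketch assumes the combined values stay within the range of a long (roughly plus or minus 2.1 billion). Stata evaluates expressions in double precision, so a float variable should be compared against float() of the expression rather than against the raw expression.

```stata
clear
set obs 1

gen wid = 2107
gen fid = 2104

* Default storage type (float): not every integer above 2^24 is representable
gen mid_f = fid*10000 + wid

* long stores integers exactly, as long as they fit in its range
gen long mid_l = fid*10000 + wid

* Use an explicit format so display shows every digit
display %12.0f mid_f
display %12.0f mid_l

* The right-hand side is evaluated in double precision, so the first
* count flags the rounded float; wrapping the expression in float()
* rounds it the same way the stored variable was rounded
count if mid_f != fid*10000 + wid
count if mid_f != float(fid*10000 + wid)
```

Either long or double works for integers of this size; long uses 4 bytes per observation and double uses 8, and double is needed once values exceed what a long can hold.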

2 Comments

Thanks for the fix! I wonder how many times I have gotten the wrong answer without even knowing. I never realized that adding integers was so difficult.
The good news is that often such details make little difference to practical results. After all, you could use doubles rather than floats for measured variables too, but you'd often not notice any scientific consequences. But strange errors can occur if large integers are used as identifiers, which appears to be what you are doing here. Identifiers are often better held as strings.
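A minimal sketch of the string approach suggested in the comment above, assuming each component identifier has at most four digits (that width, and the name mid_s, are assumptions made for illustration). Zero-padding each part keeps the combined key a fixed width, so two different pairs cannot collapse into the same string.

```stata
clear
set obs 1

gen wid = 2107
gen fid = 2104

* Convert each part to a zero-padded 4-character string, then concatenate
gen str8 mid_s = string(fid, "%04.0f") + string(wid, "%04.0f")

display mid_s[1]
```

A string key is never rounded, at the cost of not being usable directly in arithmetic.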
