I currently have this do-file in Stata, which is a simple test for significance in a matched-pairs regression. I understand some basic Python, but given my limited knowledge I do not know whether something like this is possible in Python. I am doing this for my uncle, who uses Python at his company. If anyone can point me to some resources or explain how I would do this, please let me know.

*import delimited "data"

drop if missing(v1,v2,v3)

regress v3 v2

test v2

generate pvalue = r(p)

if pvalue > .01 {
display "notsig"
display pvalue
}

if pvalue <= .01 {
display "sig"
display pvalue
}

drop pvalue
  • The variable pvalue is not needed, as you can condition on r(p) directly. The test is given in the regress output anyway. Commented Jan 26, 2018 at 7:54

1 Answer

I would look into pandas (http://pandas.pydata.org/pandas-docs/stable/) and statsmodels (http://www.statsmodels.org/dev/index.html). Pandas is good for reading data into DataFrames in Python, and you can then fit statistical models to that data with statsmodels. I am not well-versed in statsmodels, so you may have to look into the documentation yourself.

Here is an example that mirrors what you showed in your question:

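import pandas as pd
import statsmodels.formula.api as smf

# Read the CSV file into a DataFrame
df = pd.read_csv("data.csv", sep=",")
# Drop rows with a missing value in v1, v2, or v3, like Stata's "drop if missing(v1,v2,v3)".
# dropna returns a new DataFrame, so the result has to be assigned back.
df = df.dropna(subset=["v1", "v2", "v3"])

# Regress v3 on v2, then test the null hypothesis that the coefficient on v2 is zero
results = smf.ols(formula="v3 ~ v2", data=df).fit()
t_test = results.t_test("v2 = 0")
p_value = t_test.pvalue

if p_value > 0.01:
    print("notsig")
else:
    print("sig")
print(p_value)

The p-value returned by t_test is already two-sided (the alternative hypothesis is that the coefficient is not equal to zero), so it should be compared with the threshold directly rather than multiplied by 2.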
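
A note on the missing-value step: `dropna` returns a new DataFrame by default, so its result has to be assigned back, and the `subset` argument limits the check to the listed columns, which is closer to Stata's `drop if missing(v1,v2,v3)`. Here is a minimal sketch with made-up data; the column names v1 to v3 come from the question, while the `note` column and all the values are only assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Made-up data: rows 1 and 2 have a missing value in v1 or v2,
# row 3 is only missing the unrelated "note" column.
df = pd.DataFrame({
    "v1": [1.0, np.nan, 3.0, 4.0],
    "v2": [0.5, 1.5, np.nan, 3.5],
    "v3": [2.0, 4.0, 6.0, 8.0],
    "note": ["a", "b", "c", None],
})

# Calling dropna without assigning the result leaves df unchanged.
df.dropna(subset=["v1", "v2", "v3"])
print(len(df))

# Assigning the result keeps only rows where v1, v2 and v3 are all present.
cleaned = df.dropna(subset=["v1", "v2", "v3"])
print(len(cleaned))

# Without subset, a missing value in any column drops the row.
strict = df.dropna()
print(len(strict))
```

Whether to pass `subset` depends on whether the file has other columns with gaps that should not cause a row to be dropped.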

4 Comments

tvalues and pvalues for testing that a parameter is zero are directly available in the results instance; t_test is more general and provides the same results.
The pvalue is for the two-sided hypothesis (the alternative is "not equal"), so the *2 needs to be removed. (Currently the tests in the model results are always two-sided; only the standalone t_tests for means allow for one-sided alternatives.)
statsmodels is misspelled (without an "l")
@CPBL Thank you, I edited it to show the correct spelling.
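
The points in the comments above can be checked with a short sketch. It fits the same kind of model on synthetic data (the seed, sample size, and coefficient are arbitrary assumptions, not values from the question) and compares the p-value stored on the results object, the one from `t_test`, and a two-sided p-value computed by hand from the t statistic:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Synthetic data, purely for illustration
rng = np.random.default_rng(0)
df = pd.DataFrame({"v2": rng.normal(size=50)})
df["v3"] = 0.5 * df["v2"] + rng.normal(size=50)

results = smf.ols("v3 ~ v2", data=df).fit()

# p-value for the v2 coefficient, read directly from the fitted results
p_direct = results.pvalues["v2"]

# Same hypothesis through the more general t_test interface
p_ttest = float(np.squeeze(results.t_test("v2 = 0").pvalue))

# Two-sided p-value computed by hand from the t statistic
p_manual = 2 * stats.t.sf(abs(results.tvalues["v2"]), results.df_resid)

print(p_direct, p_ttest, p_manual)
print("sig" if p_direct <= 0.01 else "notsig")
```

All three numbers should agree up to floating-point error, which suggests that `results.pvalues["v2"]` can be compared with the 0.01 threshold as is.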
