Here is my CSV file:

uipid,shid,pass,camera,pointheight,pointxpos,PointZPos,deffound,HighestHeight,XPosition,ZPosition,RLevel,Rejected,MixedP
50096853911,6345214,1,SXuXeXCamera,218,12600,82570,no,-1,-1,-1,880,no,498
49876879038,6391743,1,SZuZeZCamera,313,210400,187807,no,-1,-1,-1,880,no,388

Here is my code:

df=pd.read_csv('.\sources\data.csv', delimiter=',', names=['uipid','shid','pass','camera','pointheight','pointxpos','PointZPos','deffound','HighestHeight', 'XPosition','ZPosition','RLevel','Rejected','MixedP'], skip_blank_lines=True, skipinitialspace=True, engine='python')

When I filter rows on a column with print(df.loc[(df['uipid']==50096853911)]), I get an empty DataFrame:

Empty DataFrame
Columns: [uipid, shid, pass, camera, pointheight, pointxpos, PointZPos, deffound, HighestHeight, XPosition, ZPosition, RLevel, Rejected, MixedP]
Index: []

And when I set the dtype in pd.read_csv:

df=pd.read_csv('.\sources\data.csv', delimiter=',' ,dtype={'uipid':int, 'shid': int, 'pass':int, 'camera':str, 'pointheight':int, 'pointxpos':int , 'PointZPos':int, 'deffound':str, 'HighestHeight':int, 'XPosition':int,'ZPosition':int, 'RLevel':int, 'Rejected':str, 'MixedP':int}, names=['uipid','shid','pass','camera','pointheight','pointxpos','PointZPos','deffound','HighestHeight', 'XPosition','ZPosition','RLevel','Rejected','MixedP'], skip_blank_lines=True, index_col=False, encoding="utf-8", skipinitialspace=True)

I get this error:

TypeError: Cannot cast array from dtype('O') to dtype('int32') according to the rule 'safe'

ValueError: invalid literal for int() with base 10: 'uipid'

  • Did you check whether that value exists in that column at all? Your original code looks correct to me. Also, you shouldn't need to specify those arguments; the following should work fine: df=pd.read_csv('.\sources\data.csv', skip_blank_lines=True, skipinitialspace=True) Commented Apr 7, 2017 at 10:28
  • Yes, the value exists. Commented Apr 7, 2017 at 10:32
  • Try header=0 in your read_csv call together with names=. Commented Apr 7, 2017 at 10:39
  • @pshep123 is correct. By passing names here, you're treating the existing header row as a data row, so it becomes the first row of the DataFrame. That forces the column dtypes to object (in fact str) for all the rows. You can prove this by trying print(df.loc[df['uipid']=='50096853911']). It also seems unnecessary to pass the column names if they match the existing header row. Commented Apr 7, 2017 at 10:44
  • Show us the output of df.info(). Commented Apr 7, 2017 at 12:12
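
The behaviour described in the comments above can be reproduced with a small sketch. It assumes the two sample rows from the question are held in an in-memory string (CSV_TEXT and COLUMNS are illustrative names, not part of the original code) instead of being read from '.\sources\data.csv'.

```python
import io

import pandas as pd

# Sample data copied from the question, held in memory for illustration.
CSV_TEXT = """uipid,shid,pass,camera,pointheight,pointxpos,PointZPos,deffound,HighestHeight,XPosition,ZPosition,RLevel,Rejected,MixedP
50096853911,6345214,1,SXuXeXCamera,218,12600,82570,no,-1,-1,-1,880,no,498
49876879038,6391743,1,SZuZeZCamera,313,210400,187807,no,-1,-1,-1,880,no,388
"""
COLUMNS = CSV_TEXT.splitlines()[0].split(",")

# Passing names= without header=0 makes pandas read the header line as data.
df_bad = pd.read_csv(io.StringIO(CSV_TEXT), names=COLUMNS)

# The first "data" row is the header text, so every column holds strings.
first_cell = df_bad.loc[0, "uipid"]

# Comparing against an integer matches nothing; comparing against a string does.
int_matches = df_bad.loc[df_bad["uipid"] == 50096853911]
str_matches = df_bad.loc[df_bad["uipid"] == "50096853911"]

print(len(df_bad), repr(first_cell), len(int_matches), len(str_matches))
```

The frame ends up with one more row than the file has data rows, which is a quick way to spot that the header line was read as data.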

2 Answers

Try adding header=0 to your second read_csv call and let us know if it works.

3 Comments

OK, if I have header=0 then I don't get the error: Cannot cast array from dtype('O') to dtype('int32') according to the rule 'safe'
Can you be a little more descriptive about your new problem?
When I used df=pd.read_csv('.\sources\data.csv', delimiter=',', header=0) and print(df.loc[(df['uipid']==50096853911)]), I got the result. Thanks for your solution, and sorry if I was confusing!
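
As a minimal sketch of the fix confirmed in the comment above, the snippet below reads the question's sample rows with header=0 and no names=. The in-memory CSV_TEXT string is an assumption made for illustration; the original code reads '.\sources\data.csv'.

```python
import io

import pandas as pd

# Sample data copied from the question, held in memory for illustration.
CSV_TEXT = """uipid,shid,pass,camera,pointheight,pointxpos,PointZPos,deffound,HighestHeight,XPosition,ZPosition,RLevel,Rejected,MixedP
50096853911,6345214,1,SXuXeXCamera,218,12600,82570,no,-1,-1,-1,880,no,498
49876879038,6391743,1,SZuZeZCamera,313,210400,187807,no,-1,-1,-1,880,no,388
"""

# header=0 tells pandas that the first line holds the column names.
df = pd.read_csv(io.StringIO(CSV_TEXT), delimiter=",", header=0)

# uipid is now parsed as an integer column, so an integer comparison works.
match = df.loc[df["uipid"] == 50096853911]
print(match)
```

Because the header line is no longer part of the data, the numeric columns are inferred as integers and the original filter returns the expected row.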
Try this:

df_trail=pd.read_csv('/content/New Text Document.txt',
  delimiter=',',
  names=['uipid', 'shid', 'pass', 'camera', 'pointheight', 'pointxpos', 'PointZPos', 'deffound', 'HighestHeight', 'XPosition', 'ZPosition', 'RLevel', 'Rejected', 'MixedP'],
  skip_blank_lines=True, skipinitialspace=True, engine='python',header=0)
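
To illustrate why combining names= with header=0 behaves differently from names= alone, here is a hedged sketch that replaces the file's header with custom names and sets explicit dtypes. NEW_NAMES and the in-memory CSV_TEXT are hypothetical and exist only for this example. The error message in the question mentions int32, and the uipid values do not fit in 32 bits, so the sketch spells out int64 as a precaution.

```python
import io

import pandas as pd

# Sample data copied from the question, held in memory for illustration.
CSV_TEXT = """uipid,shid,pass,camera,pointheight,pointxpos,PointZPos,deffound,HighestHeight,XPosition,ZPosition,RLevel,Rejected,MixedP
50096853911,6345214,1,SXuXeXCamera,218,12600,82570,no,-1,-1,-1,880,no,498
49876879038,6391743,1,SZuZeZCamera,313,210400,187807,no,-1,-1,-1,880,no,388
"""

# Hypothetical replacement names: lowercase versions of the file's header.
NEW_NAMES = [name.lower() for name in CSV_TEXT.splitlines()[0].split(",")]

# header=0 discards the file's own header line; names= supplies the labels.
df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    names=NEW_NAMES,
    header=0,
    dtype={"uipid": "int64", "camera": str},
)

print(df.dtypes["uipid"], list(df.columns)[:3], len(df))
```

With header=0 the header text never reaches the dtype conversion, so the cast to an integer type no longer fails on the literal 'uipid'.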
