pandas.read_csv returns empty dataframe

Question

Here is my CSV file:

uipid,shid,pass,camera,pointheight,pointxpos,PointZPos,deffound,HighestHeight,XPosition,ZPosition,RLevel,Rejected,MixedP
50096853911,6345214,1,SXuXeXCamera,218,12600,82570,no,-1,-1,-1,880,no,498
49876879038,6391743,1,SZuZeZCamera,313,210400,187807,no,-1,-1,-1,880,no,388

Here is my code:

df=pd.read_csv('.\sources\data.csv', delimiter=',', names=['uipid','shid','pass','camera','pointheight','pointxpos','PointZPos','deffound','HighestHeight', 'XPosition','ZPosition','RLevel','Rejected','MixedP'], skip_blank_lines=True, skipinitialspace=True, engine='python')

When I filter on a column with print(df.loc[(df['uipid']==50096853911)]), I get an empty DataFrame:

Empty DataFrame
Columns: [uipid, shid, pass, camera, pointheight, pointxpos, PointZPos, deffound, HighestHeight, XPosition, ZPosition, RLevel, Rejected, MixedP]
Index: []

When I set the dtype in pd.read_csv:

df=pd.read_csv('.\sources\data.csv', delimiter=',' ,dtype={'uipid':int, 'shid': int, 'pass':int, 'camera':str, 'pointheight':int, 'pointxpos':int , 'PointZPos':int, 'deffound':str, 'HighestHeight':int, 'XPosition':int,'ZPosition':int, 'RLevel':int, 'Rejected':str, 'MixedP':int}, names=['uipid','shid','pass','camera','pointheight','pointxpos','PointZPos','deffound','HighestHeight', 'XPosition','ZPosition','RLevel','Rejected','MixedP'], skip_blank_lines=True, index_col=False, encoding="utf-8", skipinitialspace=True)

I get this error:

TypeError: Cannot cast array from dtype('O') to dtype('int32') according to the rule 'safe'

ValueError: invalid literal for int() with base 10: 'uipid'

Did you check whether that value exists in that column at all? Your original code looks correct to me. Also, you shouldn't need to specify those arguments; the following should work fine: df=pd.read_csv('.\sources\data.csv', skip_blank_lines=True, skipinitialspace=True)
– EdChum, Commented Apr 7, 2017 at 10:28
@pshep123 is correct: by passing names here, you are treating the existing header row as a data row, so it becomes the first row of the DataFrame. That turns every column's dtype into object (in effect str). You can confirm this by trying print(df.loc[df['uipid']=='50096853911']). It also seems unnecessary to pass the column names if they match the existing header row.
– EdChum, Commented Apr 7, 2017 at 10:44
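The behaviour described in the comment above can be reproduced with a minimal sketch. It assumes a cut-down, three-column version of the question's file held in an in-memory string (CSV_TEXT is a hypothetical stand-in, not the asker's actual file):

```python
import io

import pandas as pd

# Assumption: a three-column excerpt of the CSV from the question.
CSV_TEXT = """uipid,shid,pass
50096853911,6345214,1
49876879038,6391743,1
"""

# Passing names= without header= makes pandas behave as if header=None,
# so the header line is kept as the first data row.
df = pd.read_csv(io.StringIO(CSV_TEXT), names=["uipid", "shid", "pass"])

print(df.shape)           # three rows: the header text plus two data rows
print(df["uipid"].dtype)  # a non-numeric dtype, because 'uipid' is not a number

# Comparing the string values against an int matches nothing ...
int_match = df.loc[df["uipid"] == 50096853911]
# ... while comparing against the string form finds the row.
str_match = df.loc[df["uipid"] == "50096853911"]

print(len(int_match), len(str_match))
```

This is also why forcing an integer dtype fails in the question: the first value pandas tries to convert is the literal text 'uipid'.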

elPastor · Accepted Answer · 2017-04-07 12:53:09Z

1

Try passing header=0 in your second read_csv call and let us know if it works.
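As a rough illustration of this suggestion, the sketch below combines names, dtype and header=0 on a four-column excerpt of the question's data. The in-memory CSV_TEXT and the shortened column list are assumptions made to keep the example self-contained. The integer columns are spelled as "int64" because the uipid values do not fit in 32 bits.

```python
import io

import pandas as pd

# Assumption: a four-column excerpt of the CSV from the question.
CSV_TEXT = """uipid,shid,pass,camera
50096853911,6345214,1,SXuXeXCamera
49876879038,6391743,1,SZuZeZCamera
"""

names = ["uipid", "shid", "pass", "camera"]
dtypes = {"uipid": "int64", "shid": "int64", "pass": "int64", "camera": str}

# header=0 marks the first line as the header, so names= replaces it
# instead of pushing it into the data, and the integer dtypes can be applied.
df = pd.read_csv(io.StringIO(CSV_TEXT), names=names, header=0, dtype=dtypes)

match = df.loc[df["uipid"] == 50096853911]
print(match)
```

With the header row consumed, only the two data rows remain, and the integer comparison finds the expected row.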

edited Apr 7, 2017 at 12:53

answered Apr 7, 2017 at 12:42

elPastor


3 Comments

SillyPerson Over a year ago

OK, if I pass header=0, I no longer get the error: Cannot cast array from dtype('O') to dtype('int32') according to the rule 'safe'

elPastor Over a year ago

Can you be a little more descriptive with your new problem?

SillyPerson Over a year ago

When I used df=pd.read_csv('.\sources\data.csv', delimiter=',', header=0) and print(df.loc[(df['uipid']==50096853911)]), I got the result. Thanks for your solution, and sorry if I was confusing!
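The call that worked in the end drops names entirely and lets the first line supply the column names. A minimal sketch of that variant follows. It assumes a temporary file as a stand-in for the asker's data.csv, and the four-column excerpt is likewise an assumption:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Assumption: a four-column excerpt of the CSV from the question.
CSV_TEXT = (
    "uipid,shid,pass,camera\n"
    "50096853911,6345214,1,SXuXeXCamera\n"
    "49876879038,6391743,1,SZuZeZCamera\n"
)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the data.csv file used in the question.
    csv_path = Path(tmp) / "data.csv"
    csv_path.write_text(CSV_TEXT, encoding="utf-8")

    # No names= argument: the first line supplies the column names,
    # and pandas infers a numeric dtype for the purely numeric columns.
    df = pd.read_csv(csv_path, header=0)

print(df.dtypes)

result = df.loc[df["uipid"] == 50096853911]
print(result)
```

Since the column names in the file already match the ones the asker wanted, nothing is lost by omitting names here.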

Boredguy · Answer · 2019-12-12 19:46:12Z

0

Try this:

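import pandas as pd

# header=0 tells pandas that the first line of the file is a header row;
# combined with names=..., that row is discarded and replaced by the given names.
df_trail = pd.read_csv('/content/New Text Document.txt',
  delimiter=',',
  names=['uipid', 'shid', 'pass', 'camera', 'pointheight', 'pointxpos', 'PointZPos', 'deffound', 'HighestHeight', 'XPosition', 'ZPosition', 'RLevel', 'Rejected', 'MixedP'],
  skip_blank_lines=True, skipinitialspace=True, engine='python', header=0)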

edited Dec 12, 2019 at 19:46

JohanC

answered Dec 12, 2019 at 17:41

Boredguy
