2

I am unable to load a CSV file using the pandas command below.

f1 = pd.read_csv(r'C:\Users\sana.mohan.reddy\Desktop\Python_Practice\Test1.CSV', skiprows=[0,1,2], skip_footer=[0], sep = ',')

I have to skip the first 3 rows and the last row.

Below is the sample data.

Contacts - Total Opens by Campaign

Email Open Date/Time,"Total Opens"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:51 AM,"1"
3/25/2016 6:00:52 AM,"1"
Total,"796"

Could you please tell me where I am going wrong?

2 Answers

1

I think you can use read_csv with other parameters (sep = ',' is omitted because , is the default value of sep):

import pandas as pd
import io

temp=u'''Email Open Date/Time,"Total Opens"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:51 AM,"1"
3/25/2016 6:00:52 AM,"1"
Total,"796"'''
# After testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 skipfooter=1,      # skip the last row
                 engine='python',   # skipfooter needs the python engine; this avoids the warning
                 skiprows=[0,1,2],  # skip the first 3 rows
                 header=None)       # no header row; columns are named 0, 1, ...
print (df)

                      0  1
0  3/25/2016 6:00:46 AM  1
1  3/25/2016 6:00:46 AM  1
2  3/25/2016 6:00:51 AM  1
3  3/25/2016 6:00:52 AM  1

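EDIT after testing with the real data:

The main problem was the encoding: I had to set it to utf-16. A UTF-16 file read with a single-byte encoding contains NUL bytes, which is consistent with the 'line contains NULL byte' error reported in the comments.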

import pandas as pd

df = pd.read_csv('Test 1.csv',
                 skipfooter=1, #skip last row
                 engine='python', #remove warning
                 skiprows=[0,1], #remove first 2 rows
                 encoding='utf-16', #set encoding
                 parse_dates=[0]) #convert first column to datetime 
print (df)

    Email Open Date/Time  Total Opens
0    2016-03-25 06:00:35            1
1    2016-03-25 06:00:35            1
2    2016-03-25 06:00:46            1
3    2016-03-25 06:00:46            1
4    2016-03-25 06:00:51            1
5    2016-03-25 06:00:52            1
6    2016-03-25 06:00:57            1
7    2016-03-25 06:00:58            1
8    2016-03-25 06:01:03            1
9    2016-03-25 06:01:20            1
10   2016-03-25 06:01:20            1
11   2016-03-25 06:01:25            1
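Since the fix came down to the encoding, one way to check a file before calling read_csv is to look for a byte-order mark at its start. The following is only a minimal sketch using the standard library; the helper name sniff_bom_encoding is hypothetical, and it assumes the file actually begins with a BOM, which not every UTF-16 file does.

```python
import codecs


def sniff_bom_encoding(raw: bytes):
    """Return an encoding name suggested by a leading byte-order mark, or None.

    Hypothetical helper for illustration; it only inspects the first bytes.
    """
    # Check UTF-32 first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return None


# One line from the question's sample, encoded as UTF-16 (Python adds a BOM).
sample = 'Total,"796"'.encode("utf-16")
print(sniff_bom_encoding(sample))  # encoding suggested by the BOM
print(b"\x00" in sample)           # ASCII text in UTF-16 carries NUL bytes
```

For a real file, the first few bytes can be read with open(path, 'rb').read(4) and the result passed as encoding= to pd.read_csv. When the helper returns None, the encoding has to be determined some other way.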

11 Comments

Error: line contains NULL byte. I am getting this error; any help? In my CSV file the first row is a string, the second row is blank, the third row is a string, the actual data starts at the fourth row, and I am skipping the last row.
Is it possible to share your file?
How do I share the file? I don't see any attachment link or option.
You have to upload the file to a file-sharing server, e.g. wetransfer.com, and then share the link.
Could you give me your email ID so I can send you the file?
1

You need to correct your read_csv call to:

f1 = pd.read_csv('yourFile.csv', skiprows=3, skipfooter=1, sep=',', engine='python')

since skipfooter (spelled skip_footer in older pandas versions) requires an integer value: the number of lines to skip at the bottom of the file. See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
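As a rough illustration of integer skip counts, here is a sketch on an in-memory copy of the question's sample, laid out as the asker describes it (title line, blank line, header line, data, total row). Skipping 3 rows also drops the header line, so the sketch passes header=None; the column names opened_at and total_opens are made up for the example.

```python
import io

import pandas as pd

# Sample modeled on the question: title, blank line, header, data rows, total row.
raw = "\n".join([
    "Contacts - Total Opens by Campaign",
    "",
    'Email Open Date/Time,"Total Opens"',
    '3/25/2016 6:00:35 AM,"1"',
    '3/25/2016 6:00:35 AM,"1"',
    '3/25/2016 6:00:46 AM,"1"',
    '3/25/2016 6:00:46 AM,"1"',
    '3/25/2016 6:00:51 AM,"1"',
    '3/25/2016 6:00:52 AM,"1"',
    'Total,"796"',
])

df = pd.read_csv(
    io.StringIO(raw),
    skiprows=3,       # integer: number of lines to skip at the top
    skipfooter=1,     # integer: number of lines to skip at the bottom
    engine="python",  # skipfooter is only supported by the python engine
    header=None,      # the header line was among the skipped rows
    names=["opened_at", "total_opens"],  # hypothetical column names
)
print(df)
```

To keep the file's own header instead, skip only the first two lines and drop header=None and names, as the other answer does with skiprows=[0,1].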

2 Comments

f1 = pd.read_csv(r'C:\Users\sana.mohan.reddy\Desktop\Python_Practice\Test 1.csv', skiprows=3, skipfooter=1, sep = ',', engine='python') throws the error 'Line contains NULL byte'. How do I deal with this?
Error: line contains NULL byte. I am getting this error; any help? In my CSV file the first row is a string, the second row is blank, the third row is a string, the actual data starts at the fourth row, and I am skipping the last row.
