How to read only certain rows and cells from csv with Python pandas?

Question

I have a CSV file with this structure:

Last Name   First Name  Start Date  End Date            
Example     Eva         1.1.2021    15.6.2021
                                        
Here is some random information.                                        
                                        
------- Header-------                       
Index   Date    Time        Reading
0   10.4.2021   16:26:01    0,1             
1   10.4.2021   16:25:44    0,1             
2   10.4.2021   16:00:00    0,1             
3   10.4.2021   16:00:00    0,1             
4   10.4.2021   14:00:00    0,1             
5   10.4.2021   14:00:00    0,1             
6   10.4.2021   13:00:00    0,3             

------- Header------- 
Index   Date    Time        Reading
0   10.4.2021   16:26:01    0,1             
1   10.4.2021   16:25:44    0,1             
2   10.4.2021   16:00:00    0,1             
3   10.4.2021   16:00:00    0,1             
4   10.4.2021   14:00:00    0,1             
5   10.4.2021   14:00:00    0,1             
6   10.4.2021   13:00:00    0,3

I want to read the file with pandas and build a dictionary from the data, for example {'last_name': 'Example', 'first_name': 'Eva'} and so on. How can I read specific values into variables? At the moment I read the CSV file like this: data = pd.read_csv(file, sep='delimiter').

So, you don't care about everything after the first space?

– mozway, Commented Aug 31, 2021 at 12:55

mozway · Accepted Answer · 2021-08-31 13:07:25Z


header

If you only want to read the beginning of the file as a dictionary, you can do:

import pandas as pd
pd.read_csv('filename.csv', sep=r'\s\s+', engine='python', nrows=1).loc[0].to_dict()

output:

{'Last Name': 'Example',
 'First Name': 'Eva',
 'Start Date': '1.1.2021',
 'End Date': '15.6.2021'}
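
The question asks for keys such as 'last_name' rather than 'Last Name'. A minimal sketch of that renaming step follows, assuming the first two lines of the file look like the sample in the question. The names `sample` and `person` are made up for illustration, and the in-memory string stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the first two lines of the file from the question.
sample = (
    "Last Name   First Name  Start Date  End Date\n"
    "Example     Eva         1.1.2021    15.6.2021\n"
)

# Two or more whitespace characters separate fields, so "Last Name" stays one field.
header = (
    pd.read_csv(io.StringIO(sample), sep=r"\s\s+", engine="python", nrows=1)
    .loc[0]
    .to_dict()
)

# Normalise the column names to snake_case keys.
person = {key.strip().lower().replace(" ", "_"): value for key, value in header.items()}
print(person)
# {'last_name': 'Example', 'first_name': 'Eva', 'start_date': '1.1.2021', 'end_date': '15.6.2021'}
```

The values stay as strings here; parsing the dates is a separate step.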

rest of the file

To read the rest of the file:

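df = (pd.read_csv('filename.csv',
                  sep=r'\s+',   # any run of whitespace separates fields
                  skiprows=6,   # skip everything up to and including the first '------- Header-------' line
                  index_col=0,
                 )
        .drop(['Index', '-------']) # get rid of the extra headers from later blocks
     )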

output:

            Date      Time Reading
Index                             
0      10.4.2021  16:26:01     0,1
1      10.4.2021  16:25:44     0,1
2      10.4.2021  16:00:00     0,1
3      10.4.2021  16:00:00     0,1
4      10.4.2021  14:00:00     0,1
5      10.4.2021  14:00:00     0,1
6      10.4.2021  13:00:00     0,3
0      10.4.2021  16:26:01     0,1
1      10.4.2021  16:25:44     0,1
2      10.4.2021  16:00:00     0,1
3      10.4.2021  16:00:00     0,1
4      10.4.2021  14:00:00     0,1
5      10.4.2021  14:00:00     0,1
6      10.4.2021  13:00:00     0,3
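
All columns in this result are still strings, because the repeated header rows force pandas to read them as text. If numeric readings and real timestamps are needed, something along these lines might work. It is only a sketch: it assumes the dates are day-first (as '15.6.2021' in the first block suggests) and that the comma is a decimal separator. The shortened `sample` and the `Timestamp` column name are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for the table part of the file.
sample = """------- Header-------
Index   Date    Time        Reading
0   10.4.2021   16:26:01    0,1
6   10.4.2021   13:00:00    0,3

------- Header-------
Index   Date    Time        Reading
0   10.4.2021   16:26:01    0,1
"""

df = (
    pd.read_csv(io.StringIO(sample), sep=r"\s+", skiprows=1, index_col=0)
    .drop(["Index", "-------"])  # remove the repeated header rows
)

# Decimal comma -> float (assumption: ',' is the decimal separator).
df["Reading"] = df["Reading"].str.replace(",", ".", regex=False).astype(float)

# Combine date and time into one datetime column (assumption: day.month.year).
df["Timestamp"] = pd.to_datetime(df["Date"] + " " + df["Time"], format="%d.%m.%Y %H:%M:%S")

# The index was read as text because of the extra header rows.
df.index = df.index.astype(int)
print(df.dtypes)
```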

If you need to determine the number of lines to skip programmatically:

with open('filename.csv') as f:
    skip = 1  # start at 1 so the marker line itself is skipped too
    for line in f:
        if line.startswith('-------'):
            break
        skip += 1

skip: 6
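
The computed value can then be passed to skiprows instead of the hard-coded 6. Here is a rough sketch of the two steps together, using an in-memory copy of the start of the file. The helper name `lines_before_table` and the `sample` text are hypothetical, and the marker is assumed to always start with '-------'.

```python
import io

import pandas as pd

# Hypothetical stand-in for the beginning of the file from the question.
sample = (
    "Last Name   First Name  Start Date  End Date\n"
    "Example     Eva         1.1.2021    15.6.2021\n"
    "\n"
    "Here is some random information.\n"
    "\n"
    "------- Header-------\n"
    "Index   Date    Time        Reading\n"
    "0   10.4.2021   16:26:01    0,1\n"
    "1   10.4.2021   16:25:44    0,1\n"
)


def lines_before_table(handle, marker="-------"):
    """Return how many lines to skip: everything up to and including the first marker line."""
    for number, line in enumerate(handle, start=1):
        if line.startswith(marker):
            return number
    raise ValueError(f"no line starts with {marker!r}")


skip = lines_before_table(io.StringIO(sample))
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", skiprows=skip, index_col=0)
print(skip)  # 6
print(df)
```

Raising an error when the marker is missing is a design choice in this sketch; without it, a file with no marker line would silently skip every line.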

edited Aug 31, 2021 at 13:07

answered Aug 31, 2021 at 13:00

mozway


2 Comments

lr_optim Over a year ago

Thank you @mozway, this is definitely the right direction. My goal is to get a clean dict out of the file with only the information I need. Let's say I want only the name, dates and readings, structured like this: {'last_name': 'Example', 'first_name': 'Eva', 'measurements': [{'date': 'some_date', 'reading': 'some_reading'}, ...]}. How can I iterate through the columns and only get the ones I need?

mozway Over a year ago

Difficult to answer without having the exact format ;) As this is a different question, I suggest you give it a try first and open a new question if need be.
