1

I have a .csv file filled with observations from sensors in the field. The sensors write data as millimeters and I need it as meters to import into another application. My idea was to use Python and possibly pandas to: 1. Read in the .csv as dataframe 2. Find the headers of the data I need to modify (divide each actual value by 1000) 3. Divide each value in the chosen column by 1000 to convert it to meters 4. Write the resulting updated file to disk

Objective: I need to modify all the values except those with a header that contains "rad" in it.

This is what the data looks like: Any column with with the text "rad" should be left alone, zero's don't matter

Here is what I have done so far:

  1. Read data into a dataframe:

    import pandas as pd
    import numpy as np
    delta_df = pd.read_csv('SAAF_121581_67_500.dat',index_col=False)
    
  2. Filter out all the data that I don't want to touch:

    delta_df.filter(like='rad', axis=1)
    

Here is where I got stuck as I couldn't filter the dataframe to

not like = 'rad'

How can I do this?

2 Answers 2

1

Its easier if you post the dataframe rather than the image as the image is not reproducible. You can use dataframe.filter to keep all the columns containing 'rad'

delta_df = delta_df.select(lambda x: re.search('rad', x), axis=1)

Incase you are trying to remove all the columns containing 'rad', use

delta_df = delta_df.select(lambda x: not re.search('rad', x), axis=1)

Alternate solution without regex:

df.filter(like='rad',axis=1)

EDIT: Given the dataframes containing rad and not containing rad like this

df_norad = df.select(lambda x: not re.search('rad', x), axis=1)

df_rad = df.select(lambda x: re.search('rad', x), axis=1)

You can convert the values of df_norad df to meters and then merge it with df_rad

merged = pd.concat([df_norad, df_rad], axis = 1)

You can convert the dataframe merged to csv using to_csv

merged.to_csv('yourfilename.csv')
Sign up to request clarification or add additional context in comments.

4 Comments

delta_df = delta_df.select(lambda x: not re.search('rad', x), axis=1) worked like a charm! Thanks for that! Update: I managed to use delta_df.iloc[1:,1:]/1000 and this divided all the remaining data into the correct format of meters. However, as a final step I need to update the changed columns back to the original dataframe, and output it as CSV - is this possible?
@André Lubbe, I am glad it worked. See the edit for answer to the rest of the questions.
thanks for the updated solution. You have saved me so much time! Just a last question though, is there anyway to retain the original structure of the file? i.e. when finally exporting everything (the changed non-rad columns along with the divided values) is there a way to export so that the columns are in there original imported position?
You can re arrange the columns like this merged.columns = list(df.columns)
0

Off the top of my head I believe you can do something like this:

delta_df.filter(regex='^rad', axis=1)

Where we use the regex parameter instead of the like parameter (**note regex and like are mutually exclusive).

The actual regex selects everything that does not match what follow the '^' operator.

Again, I don't have an environment set up to test this but I hope this motivates the idea well enough.

3 Comments

Thank you for the suggestion, I used this in my notebook, but the '^rad' returns no results at all, instead of returning only the columns that don't have "rad" as part of the name. Is there a way that I could filter the headers according to a partial match to the actual header? Something like contains ?
This might be because the carrot operator also denotes the beginning of a sequence i.e. ^rad would match radical, radish, etc but not eradicate and so on. To get around this in regular python you put the negative lookahead (in this case 'rad') in a capture group. i.e. `'^(rad)'. However, I'm not sure how pandas implements their regex engine although I would assume it uses python's built-in library.
I think this works: df.filter(regex='^((?!rad).)*$' ) via stackoverflow.com/questions/406230/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.