2

I am trying to use apply function to assign new values to two existing columns in a dataframe slice using a .loc query.

To reproduce - first create a dataframe:

import re
import panads as pd
data = [[1000, "MSL", "Test string"], [2000, 'AGL', 'other string'], [0, 'AGL', "xxxx SFC-10000ft MSLXXX"]]
df = pd.DataFrame(data=data, columns=['Alt', "AltType",'FreeText'])

Then create the apply function

def testapply(row):
    try:
        match = re.findall("SFC-([0-9]+)FT (MSL|AGL|FL)", row.FreeText)[0]
        return (int(match[0]), match[1])
    except:
        return (0, row.AltType)

When I run

df.loc[df['Alt']==0, ['Alt', 'AltType']] = df.loc[df['Alt']==0].apply(testapply, axis=1)

I would like to get as a result is:

     Alt AltType                 FreeText
0   1000     MSL              Test string
1   2000     AGL             other string
2  10000     MSL  xxxx SFC-10000ft MSLXXX

but what I end up getting is:

            Alt       AltType                 FreeText
0          1000           MSL              Test string
1          2000           AGL             other string
2  (10000, MSL)  (10000, MSL)  xxxx SFC-10000FT MSLXXX

Does anyone know how to make this work in one fell swoop?

1
  • I explained in my answer you need to return a list instead of tuple. the problem will be solved by adding a tolist() at the end of return (int(match[0]), match[1]) in your function Commented Oct 9, 2020 at 14:03

4 Answers 4

2

Just add tolist()

df.loc[df['Alt']==0, ['Alt', 'AltType']] = df.loc[df['Alt']==0].apply(testapply, axis=1).tolist()
Sign up to request clarification or add additional context in comments.

1 Comment

Chaining tolist to the apply function is what worked for me, not converting the return value from within the apply function. as @mhDG7 specified.
1

Use loc accessor to select relevant Alt column. Compute value by extracting digit from matching FreeText using regex

df.loc[df['Alt']==0,'Alt']=df.loc[df['Alt']==0,'FreeText'].str.extract('(\d+)')[0]



     Alt AltType                 FreeText
0   1000     MSL              Test string
1   2000     AGL             other string
2  10000     AGL  xxxx SFC-10000ft MSLXXX

1 Comment

This would technically work except in the more general case of more complicated workflows. I simplified my actual code for the purposes of this thread.
1

Let's try Series.str.extract and use boolean indexing with loc to replace the values in columns Alt and AltType where the column Alt contains 0:

m = df['Alt'].eq(0)
df.loc[m, ['Alt', 'AltType']] = df.loc[m, 'FreeText'].str.extract(r'(?i)SFC-(\d+)FT\s(MSL|AGL|FL)').values

     Alt AltType                 FreeText
0   1000     MSL              Test string
1   2000     AGL             other string
2  10000     MSL  xxxx SFC-10000ft MSLXXX

1 Comment

This would technically work except in the more general case of more complicated workflows. I simplified my actual code for the purposes of this thread.
0

Only change your testapply function to this:

def testapply(row):
    try:
        match = re.findall("SFC-([0-9]+)ft (MSL|AGL|FL)", row.FreeText)[0]
        print(match)
        return (int(match[0]), match[1]).tolist()
    except:
        return (0, row.AltType)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.