I wanted to try out the applymap method of the pandas DataFrame object. Here is the use case:

Let's say my DataFrame df1 is as follows:

   Age   ID   Name
0   27  101   John
1   22  102    Bob
2   19  103   Alok
3   27  104    Tom
4   32  105   Matt
5   19  106  Steve
6    5  107    Tom
7   55  108   Dick
8   67  109  Harry

Now I want to create a flag variable with the following logic: if the length of an element is less than 2, then flag=1, else flag=0.

To run this element-wise, I wanted to use the applymap method, so I wrote the following user-defined function:

def f(x):
    if len(str(x)) > 2:
        df1['Flag'] = 1
    else:
        df1['Flag'] = 0

Then I ran df1.applymap(f) which gave:

    Age    ID  Name
0  None  None  None
1  None  None  None
2  None  None  None
3  None  None  None
4  None  None  None
5  None  None  None
6  None  None  None
7  None  None  None
8  None  None  None

instead of creating a flag variable with the flag value. How can I achieve the desired functionality using applymap?

Can't we use the DataFrame variable name or a pandas statement inside the user-defined function? I.e., is df1['Flag'] valid inside the definition of f()?

1 Answer

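The function f(x) is not special to pandas; it is just a regular Python function. applymap calls it once per element and passes only that single value as x, so f has no information about which row or column the value came from. Because f as written has no return statement, every call returns None, which is why the result is filled with None. (df1['Flag'] = 1 is legal inside f, since df1 is a global, but it overwrites the whole Flag column on every call rather than setting a single row.)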
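The None result can be reproduced in isolation. The sketch below is an illustration, not part of the original answer: the helper name elementwise, the function names no_return and with_return, and the two-row sample frame are assumptions made for the example. It also assumes that newer pandas releases expose the element-wise method as DataFrame.map (with applymap deprecated there), so it picks whichever name is available.

```python
import pandas as pd

def elementwise(df, func):
    # Hypothetical helper: prefer DataFrame.map (newer pandas),
    # fall back to applymap on older versions.
    method = df.map if hasattr(pd.DataFrame, "map") else df.applymap
    return method(func)

# Small sample frame (a subset of df1 from the question)
df1 = pd.DataFrame({"Age": [27, 22], "ID": [101, 102], "Name": ["John", "Bob"]})

def no_return(x):
    # Computes a value but never returns it, so each call yields None
    flag = 1 if len(str(x)) > 2 else 0

def with_return(x):
    # Same logic, but the value is handed back to pandas
    return 1 if len(str(x)) > 2 else 0

empty_result = elementwise(df1, no_return)   # every cell is missing (None)
flag_result = elementwise(df1, with_return)  # every cell is 0 or 1
print(empty_result)
print(flag_result)
```

The only difference between the two functions is the return statement, which is what decides whether the resulting frame holds flags or None.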

From applymap docs:

func : function

Python function, returns a single value from a single value

So you could try this:

def f(x):
    # 1 if the string form of x has at most 3 characters, else 0
    if len(str(x)) <= 3:
        return 1
    else:
        return 0

Outputting 1/0 for each element in the frame when applied:

df1.applymap(f)

>>>
   Age  ID  Name
0    1   1     0
1    1   1     1
2    1   1     0
3    1   1     1
4    1   1     0
5    1   1     0
6    1   1     1
7    1   1     0
8    1   1     0
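As an alternative sketch (not the answer's method), the same 1/0 table can be computed column by column with the string accessor instead of an element-wise Python function. It assumes df1 holds the data from the question and keeps the answer's threshold of 3 characters; the variable name flags is made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Age": [27, 22, 19, 27, 32, 19, 5, 55, 67],
    "ID": [101, 102, 103, 104, 105, 106, 107, 108, 109],
    "Name": ["John", "Bob", "Alok", "Tom", "Matt", "Steve", "Tom", "Dick", "Harry"],
})

# Convert every column to strings, measure each string's length per column,
# then turn the boolean comparison into 1/0.
flags = df1.astype(str).apply(lambda col: col.str.len() <= 3).astype(int)
print(flags)
```

This avoids calling a Python function once per cell, which may matter on large frames, at the cost of being less general than an arbitrary f.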

To use the result to add another variable to each row, you need one value per row, e.g.:

df1['Flag'] = df1.applymap(f).all(axis=1).astype(bool)

>>> df1

   Age   ID   Name   Flag
0   27  101   John  False
1   22  102    Bob   True
2   19  103   Alok  False
3   27  104    Tom   True
4   32  105   Matt  False
5   19  106  Steve  False
6    5  107    Tom   True
7   55  108   Dick  False
8   67  109  Harry  False
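The choice of row-wise reduction matters, which is what the follow-up comments are about. The sketch below is an illustration under the same assumptions as the answer (df1 from the question, f with the 3-character threshold) and contrasts all(axis=1) with sum(axis=1).astype(bool). To stay independent of the pandas version, it builds the element-wise table with a per-column Series.map rather than applymap; the names elementwise_flags, flag_all and flag_sum are made up for the example.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Age": [27, 22, 19, 27, 32, 19, 5, 55, 67],
    "ID": [101, 102, 103, 104, 105, 106, 107, 108, 109],
    "Name": ["John", "Bob", "Alok", "Tom", "Matt", "Steve", "Tom", "Dick", "Harry"],
})

def f(x):
    # 1 if the string form of x has at most 3 characters, else 0
    return 1 if len(str(x)) <= 3 else 0

# Apply f to every element, one column at a time
elementwise_flags = df1.apply(lambda col: col.map(f))

# all: True only when every column in the row is flagged
flag_all = elementwise_flags.all(axis=1)
# sum then bool: True when at least one column in the row is flagged
flag_sum = elementwise_flags.sum(axis=1).astype(bool)

print(pd.DataFrame({"all": flag_all, "sum_as_bool": flag_sum}))
```

With this data every row has at least two flagged columns, so the sum-based version is True everywhere, while all(axis=1) separates rows by the length of Name.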

Also check out https://stackoverflow.com/a/19798528/1643946 which covers apply, map as well as applymap.

3 Comments

Thanks. Just a follow-up: in the statement df1['Flag'] = df1.applymap(f).sum(axis=1).astype(bool), when we sum across the columns, shouldn't the first row have a value of 2 (1+1+0)? Then bool of that should be True, right? So why is it False?
Yes, sorry. I copied the result from the all function but wrote the sum function in the code (bool(sum) gives True for all of the rows, which isn't a good example). Fixed now.
@Baktaawar if this solved your problem, then in addition to "thanks", it would be great if you could accept the answer!
