I need to write a parameterized for loop.

# This works, but...
df["ID"] = np_get_defined(df["methodA" + "ID"], df["methodB" + "ID"], df["methodC" + "ID"])

# I need a for loop, as follows
df["ID"] = np_get_defined(df[sm + "ID"] for sm in strmethods)

and I get the following error:

ValueError: Length of values does not match length of index

Remaining definitions:

import numpy as np
import pandas as pd

df is a pandas.DataFrame

strmethods=['methodA','methodB','methodC']

def get_defined(*args):
    strs = [str(arg) for arg in args if not pd.isnull(arg) and 'N/A' not in str(arg) and arg!='0']
    return ''.join(strs) if strs else None
np_get_defined = np.vectorize(get_defined)

2 Answers

df["ID"]=np_get_defined(df[sm+"ID"] for sm in strmethods) passes the generator itself as a single argument, so the function never receives the individual columns.

To expand the generated sequence into separate arguments, use the * operator:

df["ID"] = np_get_defined(*(df[sm + "ID"] for sm in strmethods))
# or:
df["ID"] = np_get_defined(*[df[sm + "ID"] for sm in strmethods])

The first unpacks a generator expression and the second unpacks a list comprehension. The result is the same in either case.
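Here is a minimal sketch of the difference, run on made-up sample data. The count_args helper and the DataFrame contents are hypothetical, and otypes=[object] is an addition not present in the question; it is there so that None results stay None rather than depending on the dtype NumPy infers from the first row.

```python
import numpy as np
import pandas as pd


def count_args(*args):
    # Hypothetical helper: reports how many positional arguments arrived.
    return len(args)


def get_defined(*args):
    # Keep values that are present, do not contain 'N/A', and are not the string '0'.
    strs = [str(arg) for arg in args
            if not pd.isnull(arg) and 'N/A' not in str(arg) and arg != '0']
    return ''.join(strs) if strs else None


# otypes=[object] is an assumption added for this sketch (see the note above).
np_get_defined = np.vectorize(get_defined, otypes=[object])

strmethods = ['methodA', 'methodB', 'methodC']

# Without *, the generator itself is the one and only argument.
n_without_star = count_args(sm + 'ID' for sm in strmethods)
# With *, each generated element becomes its own argument.
n_with_star = count_args(*(sm + 'ID' for sm in strmethods))
print(n_without_star, n_with_star)

# Hypothetical sample data with one column per method.
df = pd.DataFrame({
    'methodAID': ['a1', 'N/A', None],
    'methodBID': ['0', 'b2', 'N/A'],
    'methodCID': [None, 'N/A', '0'],
})

# Unpack the columns so the vectorized function sees three arguments per row.
ids = np_get_defined(*[df[sm + 'ID'] for sm in strmethods])
df['ID'] = ids
print(list(ids))
```

With the unpacked call, the vectorized function returns one value per row, so the assignment to df['ID'] matches the length of the index.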

I think the reason why it doesn't work is that your DataFrame consists of columns with different lengths.
