I'm trying to create a new column based on a plain Python variable, ModelType, and a DataFrame column, model.

Currently I'm doing it this way:

if ModelType == 'FRSG':
    df=df.withColumn(MODEL_NAME+'_veh', F.when(df["model"].isin(MDL_CD), df["ford_cd"]))
elif ModelType == 'TYSG':
    df=df.withColumn(MODEL_NAME+'_veh', F.when(df["model"].isin(MDL_CD), df["toyota_cd"]))
else:
    df=df.withColumn(MODEL_NAME+'_veh', F.when(df["model"].isin(MDL_CD), df["cm_cd"]))

I have also tried this:

df=df.withColumn(MODEL_NAME+'_veh', F.when((ModelType == 'FRSG') &(df["model"].isin(MDL_CD)), df["ford_cd"]))

but since ModelType is a plain Python variable and not a column, it raises an error:

TypeError: condition should be a Column

Is there a more efficient way to do this?

2 Answers

You can also use a dict that holds the possible mappings for ModelType and use it like this:

model_mapping = {"FRSG": "ford_cd", "TYSG": "toyota_cd"}

df = df.withColumn(
    MODEL_NAME + '_veh', 
    F.when(df["model"].isin(MDL_CD), df[model_mapping.get(ModelType, "cm_cd")])
)
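
As a minimal sketch of the dict-based lookup above, the column choice can be isolated in a small plain-Python helper so it can be checked without a Spark session. The names `MODEL_MAPPING`, `DEFAULT_COLUMN`, and `source_column` are hypothetical and not part of the original answer; the column names are the ones from the question.

```python
# Hypothetical helper: maps the plain Python variable ModelType to the name
# of the DataFrame column that should feed the new "<MODEL_NAME>_veh" column.
MODEL_MAPPING = {"FRSG": "ford_cd", "TYSG": "toyota_cd"}
DEFAULT_COLUMN = "cm_cd"  # used for any model type not listed in the mapping


def source_column(model_type: str) -> str:
    """Return the source column name for model_type, or the default."""
    return MODEL_MAPPING.get(model_type, DEFAULT_COLUMN)


# Intended use inside the Spark job (assumed context, not executed here):
# df = df.withColumn(
#     MODEL_NAME + "_veh",
#     F.when(df["model"].isin(MDL_CD), df[source_column(ModelType)]),
# )
print(source_column("FRSG"))
```

Because ModelType is resolved in Python before the expression is built, Spark only sees a single `when` on the model column, so no Column-typed condition on ModelType is needed. Since there is no `otherwise`, rows whose model is not in MDL_CD get null in the new column.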

I would probably use a variable to hold the name of the column returned by the when clause:

if ModelType == 'FRSG':
    x = "ford_cd"
elif ModelType == 'TYSG':
    x = "toyota_cd"
else:
    x = "cm_cd"

df=df.withColumn(MODEL_NAME+'_veh', F.when(df["model"].isin(MDL_CD), df[x]))
