2

I have the following dataframe

   text         ext_nms ext_dt  ext_cd  ext_addr
Stonetrust      CRR     xxx     xxx     xxx
Wilkes          CRR     xxx     xxx     xxx
wcv 0086967 05  xxx     xxx     COD     xxx
SILVER  INC     xxx     xxx     xxx     ADD
PO BOX 988      xxx     xxx     xxx     ADD
LA 70520        xxx     xxx     xxx     ADD
02/12/2019      xxx     DAT     xxx     ADD

My expected output:

text            ext_nms ext_dt  ext_cd  ext_addr    out
Stonetrust      CRR     xxx     xxx     xxx         CRR
Wilkes          CRR     xxx     xxx     xxx         CRR
wcv 0086967 05  xxx     xxx     COD     xxx         COD
SILVER  INC     xxx     xxx     xxx     ADD         ADD
PO BOX 988      xxx     xxx     xxx     ADD         ADD
LA 70520        xxx     xxx     xxx     ADD         ADD
02/12/2019      xxx     DAT     xxx     xxx         DAT

Essentially, I want to create a new column that has a non-xxx string and that value needs to be moved to that new column.

What I tried:

df['qs_key'] = df.filter(like='^ext').replace('^xxx', '').apply(lambda x: ''.join(x), axis=1)

I am not getting the output that I desire.

Any help will be appreciated. thanks

0

2 Answers 2

1

IIUC use df.ne("xxx"), stack and map:

   df["out"] = df["text"].map(df[df.ne("xxx")].set_index("text").stack().droplevel(1))
   
   print (df)
   
                text ext_nms ext_dt ext_cd ext_addr  out
   0      Stonetrust     CRR    xxx    xxx      xxx  CRR
   1          Wilkes     CRR    xxx    xxx      xxx  CRR
   2  wcv 0086967 05     xxx    xxx    COD      xxx  COD
   3      SILVER INC     xxx    xxx    xxx      ADD  ADD
   4      PO BOX 988     xxx    xxx    xxx      ADD  ADD
   5        LA 70520     xxx    xxx    xxx      ADD  ADD
   6      02/12/2019     xxx    DAT    xxx      xxx  DAT
Sign up to request clarification or add additional context in comments.

Comments

1

In your code use regex for filter by regex code, then add regex=True for substring replacement and last lambda should be omitted:

df['qs_key'] = df.filter(regex='^ext').replace('^xxx', '', regex=True).apply(''.join, axis=1)
print (df)
             text ext_nms ext_dt ext_cd ext_addr qs_key
0      Stonetrust     CRR    xxx    xxx      xxx    CRR
1          Wilkes     CRR    xxx    xxx      xxx    CRR
2  wcv 0086967 05     xxx    xxx    COD      xxx    COD
3     SILVER  INC     xxx    xxx    xxx      ADD    ADD
4      PO BOX 988     xxx    xxx    xxx      ADD    ADD
5        LA 70520     xxx    xxx    xxx      ADD    ADD
6      02/12/2019     xxx    DAT    xxx      xxx    DAT

If possible multiple values and need first replace ^xxx to missing values, then back filling missing values and last seelct first column by position:

df['qs_key'] = df.filter(regex='^ext').replace('^xxx', np.nan, regex=True).bfill(axis=1).iloc[:, 0]

print (df)
             text ext_nms ext_dt ext_cd ext_addr qs_key
0      Stonetrust     CRR    xxx    xxx      xxx    CRR
1          Wilkes     CRR    xxx    xxx      xxx    CRR
2  wcv 0086967 05     xxx    xxx    COD      xxx    COD
3     SILVER  INC     xxx    xxx    xxx      ADD    ADD
4      PO BOX 988     xxx    xxx    xxx      ADD    ADD
5        LA 70520     xxx    xxx    xxx      ADD    ADD
6      02/12/2019     xxx    DAT    xxx      xxx    DAT

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.