I have a dataframe like the one below, and I need to create a new column, Block, holding the value 1 or 2 based on a partial string match in the column Program Number: 1 where the value contains _block_1 and 2 where it contains _block_2. I've tried if statements and .str.contains but can't get either to work. How would you do this?

148  0209-3SP_block_1  ['g76p010060q00250r.0005'           'JEBD0507160 REV A'  CHNCIII
149  0209-3SP_block_2  ['g76x.3761z-.500p03067q03067f.05'  'JEBD0507160 REV A'  CHNC III
150  0209-5SP_block_1  ['g76p020060q00250r.0005'           'JEBD0507160 REV A'  CHNC III
151  0209-5SP_block_2  ['g76x.3767z-.48p03067q03067f.05'   'JEBD0507160 REV A'  CHNC III
152  0210-3SP_block_1  ['g76p010060q00250r.0005'           'JEBD0507160 REV A'  CHNC III
  • Just tried another method: block1 = df['Machine'].str.contains('_block_1') followed by df['Block'] = block1.replace((True,False), ('1','2')), but this put 2 in every row of the new column. Commented Jan 21, 2019 at 1:22
  • See https://pandas.pydata.org/pandas-docs/stable/text.html, specifically the regex part. Commented Jan 21, 2019 at 1:23
  • Thanks for the link. That's actually what I've been using on this last attempt: the section "Testing for Strings that Match or Contain a Pattern". Commented Jan 21, 2019 at 1:28

1 Answer

You could use NumPy's where function, nested so that rows matching neither pattern fall back to 0:

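import numpy as np

# 1 where the string contains '_block_1', 2 where it contains '_block_2',
# and 0 for any row that matches neither pattern.
df['Block'] = np.where(
    df['Machine'].str.contains('_block_1'), 1,
    np.where(df['Machine'].str.contains('_block_2'), 2, 0)
)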
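
To try this end to end, here is a minimal self-contained sketch of the same nested np.where idea on the program numbers shown in the question. The column name "Program Number" is an assumption taken from the question's wording (the snippet above uses 'Machine'), and regex=False is an optional addition that treats the pattern as a literal substring.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question. The column name "Program Number"
# is an assumption based on the question's wording.
df = pd.DataFrame({
    "Program Number": [
        "0209-3SP_block_1",
        "0209-3SP_block_2",
        "0209-5SP_block_1",
        "0209-5SP_block_2",
        "0210-3SP_block_1",
    ]
})

programs = df["Program Number"]

# regex=False makes str.contains do a plain substring test.
df["Block"] = np.where(
    programs.str.contains("_block_1", regex=False), 1,
    np.where(programs.str.contains("_block_2", regex=False), 2, 0),
)

print(df)
```

Note that if str.contains is run against a column that never holds the substring, every row evaluates to False. That may be why the attempt described in the comments filled the column with 2, though the question does not show enough to confirm it.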

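Alternatively, if every string has the same length (16 characters in the sample rows), slice off the final character by position and convert it to an integer:

df['Block'] = df['Machine'].str[15:].astype(int)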
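
If the strings might not all share one length, a regular expression avoids the fixed offset. The following is a sketch rather than part of the original answer: it uses pandas' str.extract to capture the digits after a trailing _block_, and it assumes the column is named "Program Number" as in the question's wording. The last two sample values are hypothetical, added only to show a different length and a non-matching row.

```python
import pandas as pd

# The first two values come from the question; the last two are hypothetical
# and only illustrate a shorter string and a row without the suffix.
df = pd.DataFrame({
    "Program Number": [
        "0209-3SP_block_1",
        "0209-3SP_block_2",
        "10-SP_block_2",
        "no_suffix",
    ]
})

# Capture the digits that follow "_block_" at the end of the string.
# expand=False returns a Series; rows that do not match come back as missing.
block = df["Program Number"].str.extract(r"_block_(\d+)$", expand=False)

# Convert to a nullable integer dtype so missing rows stay missing
# instead of forcing the whole column to float.
df["Block"] = pd.to_numeric(block).astype("Int64")

print(df)
```

Unlike the np.where version, this leaves non-matching rows as missing values rather than 0, which may or may not be what you want.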