
I have the following DataFrame called df:

    country ticker   
01  ST      ENRO.ST
02  ST      ERICb.ST
03  ST      BTSb.ST
04  US      MSFT
05  HK      0070.HK
06  ST      SAABb.ST
07  ST      SaA.ST

I want to do the following:

If country == 'ST', select the string in the ticker column.

Check whether it contains any lowercase characters.

If there is a lowercase character, add a hyphen before it and make the letter uppercase, like this:

    country ticker   
01  ST      ENRO.ST
02  ST      ERIC-B.ST
03  ST      BTS-B.ST
04  US      MSFT
05  HK      0070.HK
06  ST      SAAB-B.ST
07  ST      S-AA.ST

I would do the following if it were just one string:

teststr = list("ERICb.ST")
mod = None
for i, v in enumerate(teststr):
    if v.islower():
        mod = i  # index of the lowercase character

if mod is not None:
    teststr[mod] = teststr[mod].upper()
    teststr.insert(mod, '-')
teststr = ''.join(teststr)

But I don't know how to apply it to every row that meets that condition.

  • Is it possible that there are multiple lowercase letters which have to be replaced? Commented May 5, 2020 at 23:30
  • no, there can only be one. Commented May 5, 2020 at 23:30

2 Answers

First we split each string on the lowercase letter. Because the pattern is a capturing group, the matched letter is kept as its own element. Then we join the first two parts with - as the delimiter and uppercase the result, and append the remaining part unchanged. Finally, we use Series.where to modify only the rows where country == 'ST':

s1 = df['ticker'].str.split('([a-z])')
s2 = s1.str[:2].str.join('-').str.upper() + s1.str[2:].str.join('')
df['ticker'] = s2.where(df['country'].eq('ST'), df['ticker'])

  country     ticker
0      ST    ENRO.ST
1      ST  ERIC-B.ST
2      ST   BTS-B.ST
3      US       MSFT
4      HK    0070.HK
5      ST  SAAB-B.ST
6      ST    S-AA.ST
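
To make the split step easier to follow, here is a minimal sketch of the same idea on single strings, using Python's re module instead of pandas. The helper name hyphenate_split is hypothetical and not part of the answer above, and the sketch assumes that the pandas regex split treats this pattern the same way re.split does.

```python
import re

def hyphenate_split(ticker: str) -> str:
    # A capturing group makes re.split keep the matched letter,
    # e.g. "ERICb.ST" -> ["ERIC", "b", ".ST"].
    parts = re.split(r"([a-z])", ticker)
    # Join the text before the letter and the letter itself with "-", then uppercase.
    head = "-".join(parts[:2]).upper()
    # Everything after the first lowercase letter is appended unchanged.
    tail = "".join(parts[2:])
    return head + tail

parts_demo = re.split(r"([a-z])", "ERICb.ST")
results = [hyphenate_split(t) for t in ["ENRO.ST", "ERICb.ST", "SaA.ST"]]
print(parts_demo)
print(results)
```

A string without a lowercase letter splits into a single element, so the join adds no hyphen and the tail is empty.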

1 Comment

Hey, sorry, what about selecting only the rows with 'ST' in country? There could be rows with lowercase characters where the country is not ST, and I don't want to touch those. I mentioned this in the first part of the question.

You can use str.replace with a replacement function:

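# Replacement callable: prefix the matched letter with a hyphen and uppercase it
repl = lambda x: '-' + x.group(0).upper()

# regex=True is needed in pandas 2.0+, where str.replace defaults to regex=False
df.loc[df.country.eq('ST'), 'ticker'] = (df.loc[df.country.eq('ST'), 'ticker']
                                           .str.replace('([a-z])', repl, regex=True))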

Out[58]:
  country     ticker
1      ST    ENRO.ST
2      ST  ERIC-B.ST
3      ST   BTS-B.ST
4      US       MSFT
5      HK    0070.HK
6      ST  SAAB-B.ST
7      ST    S-AA.ST

Note: since you said each string contains only a single lowercase character, I use the pattern [a-z].
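
For anyone who wants to run this end to end, here is a self-contained sketch of the masked str.replace approach. It rebuilds the question's data and adds one hypothetical non-ST row ("US", "BRKb"), which is not in the original question, to show that rows outside the mask stay untouched. The names is_st and add_hyphen are illustrative choices, and regex=True is passed explicitly on the assumption that a recent pandas version is in use.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["ST", "ST", "ST", "US", "HK", "ST", "ST", "US"],
    "ticker": ["ENRO.ST", "ERICb.ST", "BTSb.ST", "MSFT", "0070.HK",
               "SAABb.ST", "SaA.ST", "BRKb"],  # last row is hypothetical
})

# Boolean mask: only rows whose country is 'ST' are modified.
is_st = df["country"].eq("ST")

def add_hyphen(match):
    # Prefix the matched lowercase letter with "-" and uppercase it.
    return "-" + match.group(0).upper()

df.loc[is_st, "ticker"] = df.loc[is_st, "ticker"].str.replace(
    r"[a-z]", add_hyphen, regex=True
)

result = df["ticker"].tolist()
print(result)
```

Computing the mask once and reusing it on both sides of the assignment avoids evaluating the same comparison twice.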
