I have a CSV with a column of strings that mix text and numbers (e.g., '6 Years and 12 Months'). I am trying to find a way to convert the years and months into a new array containing just the total number of months.

YrsAndMonths=np.array(['6 Years and 12 Months','7 Years and 8 Months','2 Years'])

I am trying to get an output of something like Months=['84','92','24']. I'm not really sure how to proceed from here.

  • Not related, but the best practice in Python is to use snake_case_vars instead of CamelCaseVars. Using years_and_months would make the code more readable. Commented Jun 29, 2022 at 11:38

3 Answers

There is a specific approach that should work with the pattern of your sentences:

sentences = ['6 Years and 12 Months','7 Years and 8 Months','2 Years']
res = []

for x in [sentence.lower() for sentence in sentences]:
    local_res = 0
    if "year" in x:
        year = x.split("year")
        cnt = year[0]
        local_res += int(cnt) * 12
    if "month" in x:
        month = x.split("month")[0].split("and")[1].strip()
        cnt = month
        local_res += int(cnt)
    res.append(local_res)
    
print(res)
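Note that the loop above relies on the exact "N Years and M Months" wording: an entry with only months (say, '8 Months') has no "and" to split on and would raise an IndexError. A minimal regex-based sketch that tolerates a missing part, assuming the same wording otherwise; the function name to_months is my own and not part of the original answer:

```python
import re

def to_months(text):
    """Convert a string like '6 Years and 12 Months' to a total month count."""
    # Look for each unit independently so either part may be absent.
    years = re.search(r"(\d+)\s*year", text, flags=re.I)
    months = re.search(r"(\d+)\s*month", text, flags=re.I)
    total = 0
    if years:
        total += int(years.group(1)) * 12
    if months:
        total += int(months.group(1))
    return total

sentences = ['6 Years and 12 Months', '7 Years and 8 Months', '2 Years']
res = [to_months(s) for s in sentences]
print(res)  # [84, 92, 24]
```

If you need strings as in the question, wrap the call in str(), e.g. [str(to_months(s)) for s in sentences].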

This falls a bit outside of what NumPy can do natively.

You could, however, use pandas, which is built on top of NumPy and thus also enables vectorized operations:

import re
import pandas as pd

out = (pd
  # extract year and month values independently
 .Series(YrsAndMonths).str.extractall(r'(\d+)\s*year|(\d+)\s*month', flags=re.I)
 .astype(float)           # convert strings to float (unmatched groups are NaN)
 .groupby(level=0).sum()  # sum per original row
 .mul([12, 1])            # multiply years by 12
 .sum(axis=1).astype(int) # total months per row, as integers
 .to_numpy()              # convert to a NumPy array
)

output: array([84, 92, 24])
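If the single chained expression is hard to follow, the same idea can be split into two separate extractions. This is only a sketch under the same wording assumptions; the helper name extract_count is hypothetical, and the input Series is rebuilt here so the snippet runs on its own:

```python
import re
import pandas as pd

s = pd.Series(['6 Years and 12 Months', '7 Years and 8 Months', '2 Years'])

def extract_count(series, unit):
    """Return the integer preceding `unit` in each string, or 0 if absent."""
    # expand=False with a single capture group yields a Series, not a DataFrame
    found = series.str.extract(rf'(\d+)\s*{unit}', flags=re.I, expand=False)
    # rows without a match are NaN; treat them as zero
    return pd.to_numeric(found).fillna(0).astype(int)

# years contribute 12 months each; months are added as-is
months_total = (extract_count(s, 'year') * 12 + extract_count(s, 'month')).to_numpy()
print(months_total)
```

This trades the single extractall pass for two simpler passes, which may be easier to debug on messy data.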

The code below does this using a list comprehension:

import numpy as np

YrsAndMonths = np.array(['6 Years and 12 Months', '7 Years and 8 Months', '2 Years'])

# i.split()[0] takes the whole leading number, so multi-digit years also work
[str(int(i.split()[0]) * 12 + int(i.split('and')[-1].split('Months')[0])) if 'and' in i else str(int(i.split()[0]) * 12) for i in YrsAndMonths]

Output:

['84', '92', '24']
