I have a string that contains salary information in the following way:
salaryMixed = "£25,000 - £30,000"
Sometimes it will look like this:
salaryMixed = "EUR25,000 - EUR30,000"
And other times like this:
salaryMixed = "£37.50 - £50.00"
What I want to do is to remove all characters but the numeric values and then split the two values so as to place them into their own respective variables that reflect low banding and high banding. So far I have:
if salaryMixed.find('£')!=-1: # found £ char
    salaryMixed = salaryMixed.replace("£", "")
if salaryMixed.find('-')!=-1: # found hyphen
    salaryMixed = salaryMixed.replace("-", "")
if salaryMixed.find(',')!=-1: # found comma
    salaryMixed = salaryMixed.replace(",", "")
if salaryMixed.find('EUR')!=-1: # found EUR
    salaryMixed = salaryMixed.replace("EUR", "")
salaryMixed = re.sub('\s{2,}', ' ', salaryMixed) # to remove multiple space
if len(salaryList) == 1:
    salaryLow = map(int, 0) in salaryList
    salaryHigh = 00000
else:
    salaryLow = int(salaryList.index(1))
    salaryHigh = int(salaryList.index(2))
But I am stumped on how to split the two values up, and also on how to handle the decimal point when salaryMixed isn't an annual salary but an hourly rate, as in the case of salaryMixed = "£37.50 - £50.00", because isn't that a float?
I want to store this information in a MySQL DB later on in the code, but I have defined the table as:
CREATE TABLE jobs(
job_id INT NOT NULL AUTO_INCREMENT,
job_title VARCHAR(300) NOT NULL,
job_salary_low INT(25),
job_salary_high INT(25),
PRIMARY KEY ( job_id )
);
What is the best approach here? Thanks.
Why not use a regex such as [\d,.]+? That will tell you where both numbers are in the string. Then you can preprocess each match (remove commas, etc.) and convert it into a number.
You have declared INT for the salary columns, but some of your values (the hourly rates) are floats. Better to switch to DECIMAL at the MySQL end.
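A minimal sketch of the regex approach suggested above, in case it helps. The function name parse_salary_range and the pattern are my own assumptions, not something from the question. The pattern is a slightly stricter variant of [\d,.]+ so that a stray comma or full stop on its own is not picked up as a number. Decimal is used instead of float so the hourly rates keep their exact pence values.

```python
import re
from decimal import Decimal

# Assumed pattern: a digit, then optional digits/commas, then an optional
# decimal part. Stricter than [\d,.]+ so lone punctuation is not matched.
NUMBER_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_salary_range(salary_mixed):
    """Return (low, high) as Decimal values.

    high is None when only one number is found; both are None when the
    string contains no number at all. (Hypothetical helper.)
    """
    # Find every number in the string and strip the thousands separators.
    amounts = [Decimal(match.replace(",", ""))
               for match in NUMBER_RE.findall(salary_mixed)]
    if not amounts:
        return None, None
    if len(amounts) == 1:
        return amounts[0], None
    return amounts[0], amounts[1]


for text in ("£25,000 - £30,000", "EUR25,000 - EUR30,000", "£37.50 - £50.00"):
    salary_low, salary_high = parse_salary_range(text)
    print(text, "->", salary_low, salary_high)
```

Because the numbers are located directly, there is no need to strip the currency symbols or the hyphen first. On the MySQL side, a column type along the lines of DECIMAL(10,2) (the precision here is only an example) would hold both the annual and the hourly figures.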