I have a column of about 100 values, a mix of integers and decimals (e.g. 27, 27.2, 28), but they are stored as strings (e.g. '27', '27.2', '28'). The data was compiled from multiple sources, and some of the people compiling it did not have the precision needed for the decimal values, so they entered values with '>' or '<' characters. For example, with '>27' added, the column looks like this:
col_1
27
>27
27.2
28
The values are out of order. I would like to sort them from lowest to highest and end up with strings again. My plan is to convert everything to float, sort on the numerical values, and then convert everything back to strings, but the values without precision are getting in the way.
My thinking was to append some characters, say '.001', to the end of those values and remove the '>' and '<' characters before converting, sorting, and then converting everything back. To append '.001' to the string values, I do this:
df['col_1'].loc[df['col_1'].str.contains('>')] = df['col_1'].loc[df['col_1'].str.contains('>')] + '.001'
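To make the goal concrete, here is a minimal plain-Python sketch of the ordering I am after. It uses a sort key instead of the '.001' suffix, so it is a different technique from the line above. It assumes a '>' value should sort just after the bare number and a '<' value just before it (my assumption). The helper name sort_key is made up for illustration.

```python
def sort_key(value):
    """Return (number, tie_breaker) for strings such as '27', '>27', '<27.2'."""
    text = value.strip()
    if text.startswith('<'):
        # Assumption: '<27' should sort just before '27'.
        return (float(text[1:]), -1)
    if text.startswith('>'):
        # Assumption: '>27' should sort just after '27'.
        return (float(text[1:]), 1)
    return (float(text), 0)


# Sample values from the example column, deliberately out of order.
values = ['28', '>27', '27.2', '27']

# Sorting the original strings by the key means nothing has to be
# converted back to a string afterwards.
ordered = sorted(values, key=sort_key)
print(ordered)
```

Because only the key is numeric, the '>' and '<' markers survive the sort unchanged, and a value like '>27.2' does not end up with two decimal points.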
Is there a better, more acceptable, or maybe more efficient way to do this?