
I have the following list:

[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06
  1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]

When I return its type, I get:

<class 'str'>

Is that because of the scientific notation (e.g. e-04)?

In this case, how can I convert the above list to an integer or float?

Thanks.

EDIT

The above list snippet comes from this CSV file under the "Feature" column.

  • What you posted is neither a string nor syntactically valid. Please provide a minimal reproducible example. My guess is that the above is surrounded by quote marks that you did not include, but in that case there is no mystery as to why it is a string. Commented Nov 30, 2021 at 15:40
  • I would change your title to say "Converting scientific notation to a float". Commented Nov 30, 2021 at 15:40
  • Does this answer your question? Convert Scientific Notation to Float Commented Nov 30, 2021 at 15:43
  • I added the CSV file where the data is coming from to my question. Commented Nov 30, 2021 at 16:35
  • Does this answer your question? Parsing the string-representation of a numpy array Commented Nov 30, 2021 at 18:10

3 Answers

It looks a lot like you have NumPy's string representation of an array. As I linked above, there doesn't seem to be a nice way of parsing this back, but in your case it might not matter: pandas and NumPy can get there reasonably easily:

import pandas as pd
import numpy as np

# read in the data
df = pd.read_csv("features_thresholds.csv")

# use numpy to parse that column
df.Feature = df.Feature.apply(lambda x: np.fromstring(x[2:-2], sep=' '))

Note that x[2:-2] trims off the leading [[ and trailing ]]; otherwise this is standard pandas usage that most data science tutorials cover.
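For anyone who wants to try this without the original file, here is a minimal sketch that swaps the file for an in-memory CSV. The CSV text is a hypothetical stand-in that assumes the cell is quoted and contains the line break shown in the question, and parse_array is an illustrative helper name, not part of pandas or NumPy. It strips brackets wherever they occur instead of slicing with x[2:-2], so it does not depend on the exact position of the brackets.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV file: one quoted cell holding the array
# text, including the line break NumPy inserts when printing long arrays.
csv_text = (
    "Feature\n"
    '"[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06\n'
    '  1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]"\n'
)

df = pd.read_csv(io.StringIO(csv_text))


def parse_array(cell: str) -> np.ndarray:
    """Turn the printed form of a 1-row array back into a flat float array."""
    # Replace every bracket with a space, then split on any whitespace
    # (spaces and newlines alike) to get the individual number strings.
    tokens = cell.replace("[", " ").replace("]", " ").split()
    return np.array(tokens, dtype=float)


df["Feature"] = df["Feature"].apply(parse_array)
first = df["Feature"].iloc[0]
print(type(first), first.dtype, first.shape)
```

Each cell of the Feature column then holds a NumPy array of floats rather than a string.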


1 Comment

If you can, I'd suggest asking whoever gave you that file to provide it in a format that's easier to parse; e.g. a JSON dump of the array would likely be more portable than what you've got.
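To illustrate the JSON suggestion, here is a rough sketch of what the round trip could look like using only the standard library. The values list is a shortened copy of the numbers from the question; if the producer holds a NumPy array, calling its tolist() method first would give a nested list like this one.

```python
import json

# Shortened copy of the data from the question, as a plain nested list.
values = [[1.01782362e-05, 1.93798303e-04, 7.96163586e-05, 5.08812627e-06]]

# What the producer of the file would write into the cell (or a .json file).
text = json.dumps(values)

# What the consumer does to get numbers back: no bracket trimming needed.
restored = json.loads(text)

print(type(restored[0][0]))
print(restored == values)
```

The text produced this way uses commas between elements, so any JSON parser can read it back without custom string handling.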

What you posted must be part of a string literal:

s = '[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06 1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]'

In which case

list(map(float, s.lstrip('[').rstrip(']').split()))

evaluates to

[1.01782362e-05, 0.000193798303, 7.96163586e-05, 5.08812627e-06, 1.39600188e-05, 0.000394912873, 0.000233748418, 1.22856018e-05]
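As a small check that the scientific notation itself is not the problem, the sketch below parses a shortened version of the string, deliberately including the line break from the question. The variable names are only illustrative.

```python
# Shortened sample with the same layout as the question, line break included.
s = "[[1.01782362e-05 1.93798303e-04\n  7.96163586e-05]]"

# float() understands exponent notation directly.
single = float("1.93798303e-04")

# strip("[]") removes bracket characters from both ends; split() with no
# argument splits on any run of whitespace, so the newline is harmless.
values = [float(token) for token in s.strip("[]").split()]

print(type(s), type(values[0]))
print(values)
```

The value is a str because it was read as text, not because of the e-04 notation; once split into tokens, every element converts with float().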


We can use Python's ast (Abstract Syntax Trees) module to parse it:

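import ast
x = '[[1.01782362e-05 1.93798303e-04 7.96163586e-05 5.08812627e-06 1.39600188e-05 3.94912873e-04 2.33748418e-04 1.22856018e-05]]'
# Join the whitespace-separated tokens with commas so the text becomes a
# valid Python list literal (this also copes with repeated spaces and newlines).
x = ast.literal_eval(",".join(x.split()))
print(x)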

1 Comment

Using literal_eval is a good idea.
