68

I am trying to convert the threshold array of an isolation forest (loaded from a scikit-learn pickle file) from float64 to float32:

for i in range(len(tree.tree_.threshold)):
    tree.tree_.threshold[i] = tree.tree_.threshold[i].astype(np.float32)

Then I print it:

for value in tree.tree_.threshold[:5]:
    print(type(value))
    print(value)

The output I am getting is:

<class 'numpy.float64'>
526226.0
<class 'numpy.float64'>
91.9514312744
<class 'numpy.float64'>
3.60330319405
<class 'numpy.float64'>
-2.0
<class 'numpy.float64'>
-2.0

The array is not being converted to float32. I want both the values and their type to be float32. Does anybody have a workaround for this?

3
  • No, there are no missing values, and the max value is 526225.98822. Commented Aug 30, 2017 at 9:06
  • Can you give us the output of print tree.tree_.threshold.flags? Commented Aug 30, 2017 at 9:53
  • C_CONTIGUOUS : False, F_CONTIGUOUS : False, OWNDATA : False, WRITEABLE : True, ALIGNED : True, UPDATEIFCOPY : False. Commented Aug 31, 2017 at 4:30

3 Answers 3

60

The problem is that you never convert the type of the numpy array itself. You compute a float32 value and store it as an entry in a float64 numpy array; numpy then converts it back to float64 on assignment.

Try something like this:

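import numpy as np

a = np.zeros(4, dtype="float64")  # a float64 array of four zeros
print(a.dtype)
print(type(a[0]))
a = np.float32(a)  # builds a new float32 array and rebinds the name a
print(a.dtype)
print(type(a[0]))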

The output (tested with python 2.7)

float64
<type 'numpy.float64'>
float32
<type 'numpy.float32'>

In your case, a corresponds to the array tree.tree_.threshold.
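To illustrate the difference between element-wise assignment and whole-array conversion, here is a minimal sketch. The variable name thresholds and its values are placeholders taken from the question's output, not a real fitted tree. It suggests that the loop from the question rounds each value to float32 precision but leaves the array's dtype unchanged, whereas astype on the whole array returns a new float32 array.

```python
import numpy as np

# Placeholder data standing in for tree.tree_.threshold (float64 by default).
thresholds = np.array([526226.0, 91.9514312744, 3.60330319405, -2.0, -2.0])

# Element-wise assignment: each float32 value is cast back to the
# array's own dtype when it is stored, so the dtype never changes.
for i in range(len(thresholds)):
    thresholds[i] = thresholds[i].astype(np.float32)
dtype_after_loop = thresholds.dtype

# Whole-array conversion: astype returns a new array with the requested dtype.
converted = thresholds.astype(np.float32)

print(dtype_after_loop)       # dtype of the original array after the loop
print(converted.dtype)        # dtype of the converted copy
print(type(converted[0]))     # scalar type of an element of the copy
```

The converted array is a separate object, so it only helps if the code that consumes the thresholds can be pointed at the copy.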


8 Comments

Can you explain the parameters of np.zeros, especially the "4"?
It was the fastest way of creating a numpy array that is filled with something, here zeros. The 4 is just the number of zeros. I chose 4 because I like the number...
--------------------------------------------------------------------------- AttributeError Traceback (most recent call last) <ipython-input-125-d20ee23285ad> in <module>() ----> 1 tree.tree_.threshold = np.zeros(4,dtype="float64") 2 print (tree.tree_.threshold.dtype) 3 tree.tree_.threshold= float32(tree.tree_.threshold) 4 print (a.dtype) AttributeError: attribute 'threshold' of 'sklearn.tree._tree.Tree' objects is not writable
getting same error: AttributeError Traceback (most recent call last) <ipython-input-139-0ba8c9d97878> in <module>() ----> 1 tree.tree_.threshold = np.zeros(4,dtype="float64") 2 print (tree.tree_.threshold.dtype) 3 print (tree.tree_.threshold[0].dtype) 4 tree.tree_.threshold= float32(a) 5 print (tree.tree_.threshold.dtype) AttributeError: attribute 'threshold' of 'sklearn.tree._tree.Tree' objects is not writable
Well, if it's not a writable attribute, you cannot overwrite it, so this solution won't work as written. It only works if you use a new numpy array, which you probably don't want.
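The situation described in these comments can be sketched without scikit-learn. FakeTree below is a hypothetical stand-in with a read-only property; it is not how sklearn.tree._tree.Tree is implemented, it only mimics the "attribute is not writable" behaviour. Rebinding the attribute fails, while taking a float32 copy into a new variable works and leaves the original untouched.

```python
import numpy as np


class FakeTree:
    """Hypothetical stand-in: exposes an array through a read-only property."""

    def __init__(self, values):
        self._threshold = np.asarray(values, dtype=np.float64)

    @property
    def threshold(self):
        return self._threshold


fake = FakeTree([526226.0, 91.9514312744, -2.0])

# Rebinding the attribute fails because the property has no setter.
try:
    fake.threshold = fake.threshold.astype(np.float32)
    rebind_failed = False
except AttributeError:
    rebind_failed = True

# A float32 copy in a new variable works; the original stays float64.
thresholds32 = fake.threshold.astype(np.float32)

print(rebind_failed, thresholds32.dtype, fake.threshold.dtype)
```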
2

You can try this:

tree.tree_.threshold[i] = tree.tree_.threshold[i].astype('float32', casting='same_kind')

Comments

1

Actually, I tried hard but was not able to do it, because the attributes of 'sklearn.tree._tree.Tree' objects are not writable.

This was causing a precision issue while generating a PMML file, so I raised a bug with that project, and they provided an updated solution that avoids converting to float64 internally.

For more info, you can follow this link: Precision Issue
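As a rough illustration of the kind of precision effect involved (an assumption about what the linked issue covers, since the link target is not reproduced here), the sketch below uses the maximum value mentioned in the question's comments. Rounding it to float32 and widening it back to float64 does not recover the original number.

```python
import numpy as np

# Maximum value reported in the comments on the question.
original = np.float64(526225.98822)

# Round to float32, then widen back to float64.
as_float32 = np.float32(original)
widened = np.float64(as_float32)

print(original)              # the original float64 value
print(widened)               # the float32-rounded value, stored as float64
print(widened == original)   # whether the round trip preserved the value
```

Near 526226 the spacing between adjacent float32 values is 0.0625, so the nearest representable value is 526226.0, which matches the first threshold printed in the question.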

Comments
