Is there a way to include a string in an array of floats without NumPy converting every float to a string? I want the string element to stay a string and the floats to stay floats.

eg.

import numpy as np

a = np.array([ 'hi' , 1. , 2. , 3. ])

Ideally the array would keep the format it has when entered as 'a' above.

Instead, this gives:

array(['hi', '1.0', '2.0', '3.0'], dtype='|S3')

And then how would one save such an array as a text file?

Many thanks,

J

  • mention python in the tag (since that's what you are using I assume) Commented Jul 5, 2017 at 8:58
  • oops there we go, sorry about that- yes I'm using Python Commented Jul 5, 2017 at 8:59
  • Aren't lists in python heterogeneous anyways? I don't understand the problem you are facing... Commented Jul 5, 2017 at 9:00
  • The problem is that when I create this array and try to save it as a text file, it fails because of a mismatch between the array dtype and the format specifier. Essentially I'm asking how to save an array containing both strings and floats so that I can read the text file back in later and extract the strings and floats. Commented Jul 5, 2017 at 9:06
  • While specifying dtype=object might solve some of your problems, this is not how NumPy was designed to work, and using object arrays will cause weird incompatibilities and destroy most of the advantages NumPy arrays have over plain lists. Commented Jul 5, 2017 at 9:08

2 Answers

I'm guessing your problem is this: you want to dump out the array np.array([ 'hi' , 1. , 2. , 3. ]) using np.savetxt() but are getting this error:

TypeError: Mismatch between array dtype ('|S3') and format specifier ('%.18e')

If this is the case, you just need to set the fmt kwarg in np.savetxt. Instead of the default %.18e, which is for formatting floating point data, you can use %s, which formats things as a string, even if the original value in the array was numerical.

So this will work:

import numpy as np
a = np.array([ 'hi' , 1. , 2. , 3. ])
np.savetxt("test.out",a,fmt="%s")

Note that you can just do this with the original list - numpy will convert it to an array for you. So for example you can do:

np.savetxt("test.out",[ 'hi' , 1. , 2. , 3. ],fmt="%s")

and it should work fine too.
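As a minimal sketch of the full round trip (the temporary path and the names raw, label and values are illustrative, not from the question): reading the file back needs dtype=str, because np.loadtxt() tries to parse every entry as a float by default. The numeric entries can then be converted separately.

```python
import os
import tempfile

import numpy as np

# Mixing a string with floats makes NumPy store every element as a string.
a = np.array(["hi", 1.0, 2.0, 3.0])

# Illustrative temporary location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "test.out")
np.savetxt(path, a, fmt="%s")

# Read everything back as strings, then convert the numeric part.
raw = np.loadtxt(path, dtype=str)
label = raw[0]
values = raw[1:].astype(float)

print(label, values)
```

This assumes the string sits in a known position (here the first entry), so the slice picks out only the numeric entries.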

For the first part of the question, this is not really what numpy arrays are intended for. If you are trying to put different data types into the same array, then you probably want a different data structure. A vanilla python list would do it, but depending on your situation, a dict is probably what you're looking for.
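One possible shape for such a structure, as a sketch only (the keys "label" and "values" are made-up names): keep the string next to a purely numeric array, so the array keeps its float dtype.

```python
import numpy as np

# Hypothetical layout: the string lives beside the array, not inside it.
record = {"label": "hi", "values": np.array([1.0, 2.0, 3.0])}

# The numeric part is still a float array, so arithmetic works as usual.
doubled = record["values"] * 2

print(record["label"], record["values"].dtype, doubled)
```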

Edit: Based on the comment threads & the specific question, it looks like this is an attempt to put a header on a data file. If a holds only the numeric data, this can be done directly through

np.savetxt("a.txt",a,header="title goes here")

The file can then be read directly with np.loadtxt(), because by default np.savetxt() prefixes the header with "# ", and by default np.loadtxt() ignores lines that start with #.
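A short sketch of the header approach end to end, assuming the array contains only floats (the temporary path and variable names are illustrative):

```python
import os
import tempfile

import numpy as np

values = np.array([1.0, 2.0, 3.0])  # numeric data only

# Illustrative temporary location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "a.txt")
np.savetxt(path, values, header="title goes here")

# The header is written as a comment line at the top of the file.
with open(path) as f:
    first_line = f.readline().rstrip("\n")

# loadtxt skips comment lines, so only the numbers come back.
loaded = np.loadtxt(path)

print(first_line)
print(loaded)
```

The title itself is not returned by np.loadtxt(); if it is needed later, it has to be read from the first line separately, as done here.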

6 Comments

That is the exact error I'm having, thank you. That worked for saving the file but if you want to then load that array you get an error of not being able to convert string to float for the 'hi' element. Any ideas on that one? Thank you by the way!
Never mind I did a quick search and found out how to do this, thank you Astrokiwi!
Could you give the method you used so I can add it to my answer? It will be useful for googlers of the future.
Yeah, that's fine. It went as follows: data = np.loadtxt('a.txt', dtype=str) and then, on a new line, data_floats = data[0,1:].astype(float)
Hmm - is the reason you're mixing strings & floats that you want a header/title line at the top of your output file? In that case, you can do np.savetxt("a.txt",a,header="title goes here"), and it will start the file with "# title goes here". Then you can just use loadtxt directly, because it will ignore any line that starts with #.
Use pickle:

import pickle

a = ['abc',3,4,5,6,7.0]

# Use context managers so both file handles are closed promptly.
with open("save.p", "wb") as f:
    pickle.dump(a, f)
with open("save.p", "rb") as f:
    b = pickle.load(f)

print(b)

Output:

['abc', 3, 4, 5, 6, 7.0]

3 Comments

Thanks! How would this work if you try and vstack two arrays of the same style as 'a' and then save/open?
If you only need to vstack, you can still do it with lists: c = [a] + [b]. However, if you need more of numpy, you should consider using pandas if you really have to mix datatypes.
Okay, thanks. I'll bear that in mind and try saving as one data type, extracting the columns I need and then converting those to different data types separately instead.
