
What I have is a list of strings. I would like to convert it to a 2D numpy array, where result[i, j] is the ASCII code of the j-th character of the i-th string (preferably as a float).

I know I can use list(map(float, map(ord, single_line_from_list))) to get a list of floats for one string, convert it to a 1D array, and then loop over every string to build the final array. But I wonder if there's a more elegant way to do this.

2 Comments
  • Is there a particular reason you're using a list of str instead of an ndarray with one of numpy's string types? Commented Aug 30, 2017 at 1:30
  • Also, I'm not sure what you think you gain by having dtype=float when all the values fit in dtype=uint8, which is much less storage and the values usually convert as needed. Commented Aug 30, 2017 at 1:32
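
The two comments above point at numpy's own string types and at uint8 storage. A minimal sketch of that idea follows, assuming every string is pure ASCII (non-ASCII characters would fail to encode); the sample list and the variable names are illustrative only and not part of the original question.

```python
import numpy as np

lst = ['hello', 'world!!']  # sample data, assumed to be pure ASCII

# dtype='S' stores each string as fixed-width bytes; shorter entries
# are padded with NUL bytes (value 0) up to the longest length.
fixed = np.array(lst, dtype='S')

# Reinterpret the raw bytes as unsigned 8-bit integers, one row per string.
codes = fixed.view(np.uint8).reshape(len(lst), -1)

# Convert to float only if a later step really needs it.
as_float = codes.astype(float)
```

Here codes has shape (2, 7), with the shorter string padded by zeros, and it uses one byte per character until it is converted to float.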

2 Answers


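You can use a nested list comprehension.

import numpy as np

words = ['hello', 'world']  # example input; strings of equal length assumed

# ord() returns the character code, float() converts it to a float
array = np.array([[float(ord(character)) for character in word] for word in words])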

2 Comments

Pre-building an ndarray and then filling it will avoid the temporaries.
This doesn't actually return a 2D array when the strings differ in length, just an array of lists. Any idea how to fix that, for example by padding rows that are shorter than the maximum length with zeros?
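
One way to combine both comments, pre-building the array and padding short rows with zeros, might look like the sketch below. The helper name strings_to_codes and its dtype parameter are hypothetical, introduced only for illustration.

```python
import numpy as np

def strings_to_codes(words, dtype=float):
    """Return a 2D array where out[i, j] is ord(words[i][j]), zero-padded."""
    # Width of the result is the length of the longest string (0 if no strings).
    width = max(map(len, words), default=0)
    # Allocate the final array once; untouched cells stay 0 and act as padding.
    out = np.zeros((len(words), width), dtype=dtype)
    for i, word in enumerate(words):
        # Fill only the first len(word) cells of row i.
        out[i, :len(word)] = [ord(c) for c in word]
    return out

result = strings_to_codes(['hello', 'world!!'])
```

Because the output is allocated up front, rows of different lengths no longer produce an array of lists; the only temporary is the short per-row list of codes.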

One option is to create a sparse matrix using scipy.sparse.coo_matrix and then convert it to a dense one:

from scipy.sparse import coo_matrix

lst = ['hello', 'world!!']

idx, idy, val = zip(*((i, j, ord(c)) for i, s in enumerate(lst) for j, c in enumerate(s)))
coo_matrix((val, (idx, idy)), shape=(max(idx)+1, max(idy)+1)).todense()

#matrix([[104, 101, 108, 108, 111,   0,   0],
#        [119, 111, 114, 108, 100,  33,  33]])

Or use zip_longest from itertools (on Python 2 the same function is named izip_longest):

from itertools import zip_longest  # Python 2: from itertools import izip_longest

list(zip(*zip_longest(*map(lambda s: map(ord, s), lst))))
# [(104, 101, 108, 108, 111, None, None), (119, 111, 114, 108, 100, 33, 33)]

This gives a 2D list. You can use the fillvalue parameter to replace the Nones:

list(zip(*zip_longest(*map(lambda s: map(ord, s), lst), fillvalue=0)))
# [(104, 101, 108, 108, 111, 0, 0), (119, 111, 114, 108, 100, 33, 33)]
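
To get from the padded tuples to the float array the question asks for, something like the following should work. It is a sketch for Python 3, and the names columns and result are illustrative.

```python
from itertools import zip_longest

import numpy as np

lst = ['hello', 'world!!']

# zip_longest walks the strings in parallel, so each tuple it yields is one
# column of the final array; fillvalue=0 pads the shorter strings.
columns = zip_longest(*(map(ord, s) for s in lst), fillvalue=0)

# Stack the columns, then transpose so that each row is one string.
result = np.array(list(columns), dtype=float).T
```

The transpose replaces the outer zip(*...) used above; result has shape (len(lst), length of the longest string).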
