
What data structure can I use to combine several lists of strings of different lengths?

E.g.,

a_list = ['h','i']
b_list = ['t','h','e','r','e']
c_list = ['fr', 'ie','nd']

desired structure:

my_structure = [ ['h','i'],
                 ['t','h','e','r','e'],
                 ['fr', 'ie','nd']
               ]

and then pad it with 'null' strings so that every list has the same length:

my_structure = [ ['h','i','null','null','null'],
                 ['t','h','e','r','e'],
                 ['fr', 'ie','nd','null', 'null']
               ]
  • Do you want the string 'null', or do you want the null string? Commented Jul 3, 2018 at 15:51
  • It does not matter; any arbitrary string will do. It is just to make every list the same length! Commented Jul 3, 2018 at 16:24

2 Answers

4

You could use itertools.zip_longest:

import itertools
import numpy as np

np.array(list(itertools.zip_longest(a_list, b_list, c_list, fillvalue='null'))).T

array([['h', 'i', 'null', 'null', 'null'],
       ['t', 'h', 'e', 'r', 'e'],
       ['fr', 'ie', 'nd', 'null', 'null']],
      dtype='<U4')

Edit: Following your comment that you want to add new lists to your array, it is probably more straightforward to keep a list of the lists you want to use and append to that list as needed:

a_list = ['h','i']
b_list = ['t','h','e','r','e']
c_list = ['fr', 'ie','nd']

my_list = [a_list, b_list, c_list]

my_arr = np.array(list(itertools.zip_longest(*my_list, fillvalue='null'))).T

>>> my_arr
array([['h', 'i', 'null', 'null', 'null'],
       ['t', 'h', 'e', 'r', 'e'],
       ['fr', 'ie', 'nd', 'null', 'null']],
      dtype='<U4')

Then you can add a new list to my_list:

d_list = ['x']

my_list.append(d_list)

my_arr = np.array(list(itertools.zip_longest(*my_list, fillvalue='null'))).T

>>> my_arr
array([['h', 'i', 'null', 'null', 'null'],
       ['t', 'h', 'e', 'r', 'e'],
       ['fr', 'ie', 'nd', 'null', 'null'],
       ['x', 'null', 'null', 'null', 'null']],
      dtype='<U4')

3 Comments

I think that would work. One question: if I need to add a new list (let's say list_d) to that already built structure, how could I do it?
I think the most straightforward way would be to re-make your array: add list_d to your zip_longest. An alternative would be to pad list_d with null strings so that it is the same length as the longest element, and then use np.vstack or something like that, but it seems complicated, and wouldn't actually save you much performance.
The thing is that I have to concatenate the lists on the go in a loop, since the lists are stored in different files. So I cannot really pass (a_list, b_list, c_list, ..., zzz_list) all at once; I will have to append them one by one. @sacul
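
For the loop case described in the comment above, one possible approach is to collect the lists in a plain Python list as they arrive and pad only once, after the loop. This is a minimal sketch, not a definitive implementation: `load_lists` is a hypothetical stand-in for reading one list per file, and the result is kept as nested Python lists rather than a NumPy array.

```python
import itertools

def load_lists():
    # Hypothetical stand-in for reading one list from each file.
    yield ['h', 'i']
    yield ['t', 'h', 'e', 'r', 'e']
    yield ['fr', 'ie', 'nd']

collected = []
for lst in load_lists():
    # Appending to a Python list is cheap; nothing is padded or rebuilt here.
    collected.append(lst)

# Pad once at the end. zip_longest yields padded columns, so zip(*...)
# transposes them back into one padded row per original list.
columns = itertools.zip_longest(*collected, fillvalue='null')
padded = [list(row) for row in zip(*columns)]

print(padded)
```

If an array is needed afterwards, `padded` could presumably be passed to `np.array` once all files have been read.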
2

Here's one way using a list comprehension. It involves calculating the maximum length of your lists as an initial step:

L = (a_list, b_list, c_list)
maxlen = max(map(len, L))

res = [i+['null']*(maxlen-len(i)) for i in L]

print(res)

[['h', 'i', 'null', 'null', 'null'],
 ['t', 'h', 'e', 'r', 'e'],
 ['fr', 'ie', 'nd', 'null', 'null']]
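
The same idea can be wrapped in a small helper so it can be reused whenever a new list is added. This is only a sketch: the name `pad_lists` and its `fill` parameter are made up for illustration, and it assumes `lists` is a sequence (it is iterated twice).

```python
def pad_lists(lists, fill='null'):
    """Return new lists, each right-padded with `fill` to the longest length."""
    # default=0 avoids a ValueError from max() when `lists` is empty.
    maxlen = max(map(len, lists), default=0)
    # list(lst) + [...] builds a new list, so the inputs are not modified.
    return [list(lst) + [fill] * (maxlen - len(lst)) for lst in lists]

a_list = ['h', 'i']
b_list = ['t', 'h', 'e', 'r', 'e']
c_list = ['fr', 'ie', 'nd']

res = pad_lists([a_list, b_list, c_list])
print(res)
```

Because the helper returns new lists, the original `a_list`, `b_list` and `c_list` keep their lengths, so calling it again after appending another list recomputes the padding from scratch.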
