How to combine unicode of string and list to a python list

Question

I have this unicode string:

uni_list = u'["","aa","bb","cc"]

I also have the following unicode string:

uni_str = u'dd'

I need to combine them into a single list and drop the empty entries. The desired result is:

["aa","bb","cc","dd"]

However, I don't know in advance whether a given value will be a uni_list or a uni_str, because I am reading them from a JSON file that splits the results this way. Is there a unified way to convert both kinds of value and combine them into a Python list or set?

I tried ast.literal_eval, but it seems to handle only uni_list; it gives a "malformed string" error when the value is a uni_str.

Many Thanks!

There's no such thing as a "unicode of list". You can have a unicode string representing a list. You can have a list of unicode strings. It's really unclear what your situation is here. Can you post something that demonstrates your problem?
– khelwood, Commented Feb 23, 2017 at 10:15
Please show what you have tried so far as well as the input you are using. u'["","aa","bb","cc"] is neither a valid unicode string nor a valid list.
– Christian König, Commented Feb 23, 2017 at 10:15
If you are getting that string from json you have other problems, like who wrote that to json in the first place.
– Mark Tolonen, Commented Feb 23, 2017 at 21:17

Moinuddin Quadri · Accepted Answer · 2017-02-23 10:19:34Z

4

You can use ast.literal_eval to convert your string to a list:

>>> import ast
>>> my_unicode = u'["","aa","bb","cc"]'

# convert string to list
>>> my_list = ast.literal_eval(my_unicode)
>>> my_list
['', 'aa', 'bb', 'cc']

# Filter empty string from list
>>> new_list = [i for i in my_list if i]
>>> new_list
['aa', 'bb', 'cc']

# append `"dd"` string to the list
>>> new_list.append("dd")  # OR, `str(u"dd")` if `"dd"` is unicode string
>>> new_list
['aa', 'bb', 'cc', 'dd']
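
Because the input may be either a plain string or a string holding a list literal, the steps above can be wrapped in one helper that falls back to treating the value as plain text when ast.literal_eval rejects it. This is only a minimal sketch: the names to_list and combined are made up for illustration, and it assumes every value is a string of one of the two kinds shown in the question.

```python
import ast

def to_list(value):
    """Return the non-empty items represented by `value` as a list.

    `to_list` is a hypothetical helper name, not part of any library.
    """
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Not a Python literal (for example the bare text dd),
        # so treat the value as a plain string.
        parsed = None
    if isinstance(parsed, list):
        # Drop empty strings from the parsed list.
        return [item for item in parsed if item]
    # Plain string: wrap it in a list, skipping empty input.
    return [value] if value else []

combined = []
for raw in (u'["","aa","bb","cc"]', u'dd', u'[]'):
    combined.extend(to_list(raw))

print(combined)  # on Python 3: ['aa', 'bb', 'cc', 'dd']
```

Catching both ValueError and SyntaxError matters here, since ast.literal_eval can raise either one depending on how the text fails to parse.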

edited Feb 23, 2017 at 10:19

answered Feb 23, 2017 at 10:16


8 Comments

Sapling Over a year ago

I have tried ast.literal_eval. The problem is that I don't know whether I have a u"dd" or a u'["","aa","bb","cc"]', so when I call ast.literal_eval on all of them, it gives me a "malformed string" error whenever the value is a u"dd". Do you know how to solve this? Thanks!

Moinuddin Quadri Over a year ago

But are you sure that the string will be of only these two types, i.e. u"dd" or u'["","aa","bb","cc"]'?

Kruupös Over a year ago

Using filter to remove empty elements from a list is faster than a list comprehension: new_list = filter(None, my_list)

Sapling Over a year ago

For my data, yes. After json.loads() I have either u"aa", u'[]', or u'["aa","bb"]', but I don't know in advance which of the three it will be.

Moinuddin Quadri Over a year ago

@MaxChrétien That's not the case, at least not in Python 2.7. In Python 3.x it will be faster because it returns an iterator object; if you convert it to a list, it will be slower again.
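
To make the filter discussion concrete, here is a small sketch of the two ways of dropping empty items, reusing my_list from the answer. It makes no timing claim. It only shows that both give the same list, and that on Python 3 filter returns an iterator that has to be wrapped in list(), whereas on Python 2 it returns a list directly.

```python
my_list = ['', 'aa', 'bb', 'cc']

# List comprehension: keeps only truthy (non-empty) items.
by_comprehension = [i for i in my_list if i]

# filter(None, ...) drops falsy items; list() is needed on Python 3,
# where filter returns an iterator rather than a list.
by_filter = list(filter(None, my_list))

print(by_comprehension == by_filter)  # True
```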


RomanPerekhrest · Accepted Answer · 2017-02-23 11:32:03Z

1

Alternative solution using re.match and re.findall functions:

import re

result = []

def getValues(s):
    global result
    # check whether the input string is a list representation
    is_list = re.match(r'^\[[^][]+\]$', s, re.UNICODE)

    if is_list:
        # collect the non-empty quoted items from the list text
        result = result + re.findall(r'\"([^",]+)\"', s, re.UNICODE)
    else:
        # plain string: keep it as a single item
        result.append(s)


getValues(u'["","aa","bb","cc"]')
getValues(u'dd')

print(result)

The output:

['aa', 'bb', 'cc', 'dd']
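
One edge case worth noting: the list pattern above requires at least one character between the brackets, so an input of u'[]' (which the asker says can occur) is not recognised as a list and is appended as the literal string '[]'. Since the list strings in the question are also valid JSON, a possible variant is to try json.loads first and fall back to the raw string. This is a sketch under that assumption, with no global state; the name values_from is hypothetical.

```python
import json

def values_from(s):
    """Return the non-empty items in `s`, which is either a JSON list or plain text.

    `values_from` is a hypothetical helper name used for illustration.
    """
    try:
        parsed = json.loads(s)
    except ValueError:
        # Not valid JSON (for example the bare text dd): treat as plain string.
        parsed = None
    if isinstance(parsed, list):
        # Keep only non-empty items of the decoded list.
        return [item for item in parsed if item]
    return [s] if s else []

result = []
for raw in (u'["","aa","bb","cc"]', u'[]', u'dd'):
    result.extend(values_from(raw))

print(result)  # ['aa', 'bb', 'cc', 'dd']
```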

edited Feb 23, 2017 at 11:32

answered Feb 23, 2017 at 11:18
