I have this unicode string:

uni_list = u'["","aa","bb","cc"]

I also have the following unicode string:

uni_str = u'dd'

I need to combine them into a single list and drop the empty strings. The desired result is:

["aa","bb","cc","dd"]

But I don't know in advance whether a value will be a uni_list or a uni_str, because I am reading a JSON file that splits these results. Is there a unified way to convert them and combine them into a Python list or set?

I tried ast.literal_eval, but it seems to handle only uni_list; it gives me a "malformed string" error when the value is a uni_str.
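The failure can be reproduced with a minimal sketch like the one below. It assumes Python 3, where the message reads "malformed node or string" rather than "malformed string"; the variable names `parsed` and `error_message` are only illustrative.

```python
import ast

# A string that holds a list literal is parsed without trouble.
parsed = ast.literal_eval(u'["","aa","bb","cc"]')

# A bare word such as dd is not a Python literal, so literal_eval raises ValueError.
try:
    ast.literal_eval(u'dd')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(parsed)
print(error_message)
```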

Many Thanks!

3
  • 1
    ... There's no such thing as a "unicode of list". You can have a unicode string representing a list. You can have a list of unicode strings. It's really unclear what your situation is here. Can you post something that demonstrates your problem? Commented Feb 23, 2017 at 10:15
  • 1
    Please show what you have tried so far as well as the input you are using. u'["","aa","bb","cc"] is neither a valid unicode string nor a valid list. Commented Feb 23, 2017 at 10:15
  • If you are getting that string from json you have other problems, like who wrote that to json in the first place. Commented Feb 23, 2017 at 21:17

2 Answers

You can use ast.literal_eval to convert your string to a list:

>>> import ast
>>> my_unicode = u'["","aa","bb","cc"]'

# convert string to list
>>> my_list = ast.literal_eval(my_unicode)
>>> my_list
['', 'aa', 'bb', 'cc']

# Filter empty string from list
>>> new_list = [i for i in my_list if i]
>>> new_list
['aa', 'bb', 'cc']

# append `"dd"` string to the list
>>> new_list.append("dd")  # OR, `str(u"dd")` if `"dd"` is unicode string
>>> new_list
['aa', 'bb', 'cc', 'dd']
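
When it is not known in advance whether the input is a list-like string or a plain string, one possible sketch is to wrap the call in try/except and fall back to the raw string. The helper name `to_items` is hypothetical and not part of `ast`; the sketch assumes that anything which does not evaluate to a list should be kept as-is.

```python
import ast

def to_items(value):
    """Return the non-empty strings found in value (hypothetical helper)."""
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Not a Python literal (e.g. u'dd'): treat it as a plain string.
        parsed = None
    if isinstance(parsed, list):
        # Drop empty strings from the decoded list.
        return [item for item in parsed if item]
    # Anything else is kept as the original string, unless it is empty.
    return [value] if value else []

combined = []
for raw in (u'["","aa","bb","cc"]', u'dd', u'[]'):
    combined.extend(to_items(raw))

print(combined)  # ['aa', 'bb', 'cc', 'dd']
```

Catching both ValueError and SyntaxError matters here, because which one is raised depends on how far the parser gets with the input.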

8 Comments

I have tried ast.literal_eval. The problem is that I don't know whether I have a u"dd" or a u'["","aa","bb","cc"]', so when I call ast.literal_eval on all of them, a u"dd" gives me a "malformed string" error. Do you know how to solve this? Thanks!
But are you sure the string will only be one of these two types, i.e. u"dd" or u'["","aa","bb","cc"]'?
Using filter to remove empty elements from the list is faster than a list comprehension: new_list = filter(None, my_list)
For my data, yes. After json.loads(), I have either u"aa", u'[]', or u'["aa","bb"]', but I don't know which of the three it will be.
@MaxChrétien That's not the case, at least not in Python 2.7. In Python 3.x it will be faster because it returns an iterator object. If you cast it to a list, it will be slower again.
Alternative solution using re.match and re.findall functions:

import re

result = []

def getValues(s):
    global result
    # check whether the input string contains a list representation
    # (`*` rather than `+`, so an empty list such as u'[]' is recognised too)
    is_list = re.match(r'^\[[^][]*\]$', s, re.UNICODE)

    if is_list:
        # collect every non-empty double-quoted item
        result = result + re.findall(r'\"([^",]+)\"', s, re.UNICODE)
    else:
        result.append(s)


getValues(u'["","aa","bb","cc"]')
getValues(u'dd')

print(result)

The output:

['aa', 'bb', 'cc', 'dd']
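
Since the list-like strings in the question are also valid JSON, a variant without regular expressions or a global variable might look like the sketch below. The function name `collect_values` is hypothetical, and the sketch assumes that any string which does not decode to a JSON array should be kept unchanged.

```python
import json

def collect_values(strings):
    """Flatten list-like and plain strings into one list (hypothetical helper)."""
    values = []
    for s in strings:
        try:
            decoded = json.loads(s)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            decoded = None
        if isinstance(decoded, list):
            # Keep only the non-empty items of the decoded array.
            values.extend(item for item in decoded if item)
        elif s:
            # Plain, non-empty string: keep it unchanged.
            values.append(s)
    return values

result = collect_values([u'["","aa","bb","cc"]', u'dd', u'[]'])
print(result)  # ['aa', 'bb', 'cc', 'dd']
```

Returning a new list instead of mutating a module-level `result` makes the helper safe to call more than once.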
