
I'm doing some data wrangling based on an example in O'Reilly's "Python for Data Analysis".

We start with data of the following format:

In [108]: data.CATEGORY[:5]
Out[108]: 
0          1. Urgences | Emergency, 3. Public Health, 
4                            1. Urgences | Emergency, 
5                       5e. Communication lines down, 
6    4. Menaces | Security Threats, 4e. Assainissem...
7                      4. Menaces | Security Threats, 
Name: CATEGORY, dtype: object

The book then lays out a procedure for removing the periods and '|' from each entry, with the goal of creating a dictionary, using the following definitions:

def get_all_categories(cat_series):
    cat_sets = (set(to_cat_list(x)) for x in cat_series)
    return sorted(set.union(*cat_sets))

def get_english(cat):
    code, names = cat.split('.')
    if '|' in names:
        names = names.split(' | ')[1]
    return code, names.strip()

The first step goes fine, creating the list of unique categories:

In [109]: all_cats = get_all_categories(data.CATEGORY)

In [110]: all_cats[:5]
Out[110]: 
['1. Urgences | Emergency',
 '1a. Highly vulnerable',
 '1b. Urgence medicale | Medical Emergency',
 '1c. Personnes prises au piege | People trapped',
 '1d. Incendie | Fire']

However, using the second definition results in the following:

In [116]: english_mapping = dict(get_english(x) for x in all_cats)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-116-e69c3419c341> in <module>()
----> 1 english_mapping = dict(get_english(x) for x in all_cats)

TypeError: cannot convert dictionary update sequence element #1 to a sequence

A little help for a Python noob please :)

  • 1
    I do not get this error when run on the input data you have provided. Commented Dec 20, 2014 at 0:09
  • 1
    Just tried it again on a different computer and I continue to get the same TypeError. I forgot to type in the following definition, which is necessary for the get_all_categories definition:

        def to_cat_list(catstr):
            stripped = (x.strip() for x in catstr.split(','))
            return [x for x in stripped if x]

    Not sure how you got the code to work without that, Scott... Commented Dec 20, 2014 at 16:54
  • 1
    @user3334415 I also did not get an error. Link here: codepad.org/4uYUiIoS Caveat: I took a shortcut and simply used the excerpt you posted for all_cats (the first five elements) in a direct assignment, but I don't know enough about the full data you are loading to know whether that would cause your outcome to be different. Commented Dec 21, 2014 at 2:51
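
For readers following along, here is a minimal sketch of how the to_cat_list helper from the comment above fits together with get_all_categories. The three sample strings are copied from the data.CATEGORY excerpt in the question; a plain list stands in for the pandas Series, which is an assumption made only to keep the sketch self-contained.

```python
def to_cat_list(catstr):
    # Split on commas, strip whitespace, and drop empty fragments
    # (the raw entries end with a trailing ", ").
    stripped = (x.strip() for x in catstr.split(','))
    return [x for x in stripped if x]

def get_all_categories(cat_series):
    # One set of categories per entry, then the sorted union of all sets.
    cat_sets = (set(to_cat_list(x)) for x in cat_series)
    return sorted(set.union(*cat_sets))

# Stand-in for data.CATEGORY: a plain list instead of a pandas Series.
sample = [
    '1. Urgences | Emergency, 3. Public Health, ',
    '1. Urgences | Emergency, ',
    '5e. Communication lines down, ',
]

all_cats = get_all_categories(sample)
print(all_cats)
```

Each entry is split into its individual categories, duplicates collapse in the set union, and the result is sorted alphabetically.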

1 Answer

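Here is the solution:

  1. "cannot convert dictionary update sequence element #1 to a sequence" is raised because the generator expression "get_english(x) for x in all_cats" yields None for at least one element, and dict() cannot unpack None into a key/value pair.

  2. Why? Because get_english is probably indented like this in the code you actually ran: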

def get_english(cat):
    code,names=cat.split('.')
    if '|' in names:
        names=names.split('|')[1]
        return code,names.strip()

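The problem is the last line: the return statement should not be indented inside the if block. As written, the function implicitly returns None for any category that has no '|'. That matches the error message: element #1 of all_cats is '1a. Highly vulnerable', which contains no '|'.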
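
The sketch below reproduces the failure and the fix side by side, using the five categories shown in the question. The name get_english_buggy is made up here for illustration; it is a guess at the indentation that was actually run, not code taken from the book.

```python
def get_english_buggy(cat):
    # Hypothetical reconstruction of the mis-indented version.
    code, names = cat.split('.')
    if '|' in names:
        names = names.split(' | ')[1]
        return code, names.strip()  # only reached when '|' is present
    # No '|': execution falls off the end and the function returns None.

def get_english(cat):
    # Version as printed in the question: return sits outside the if block.
    code, names = cat.split('.')
    if '|' in names:
        names = names.split(' | ')[1]
    return code, names.strip()

all_cats = ['1. Urgences | Emergency',
            '1a. Highly vulnerable',
            '1b. Urgence medicale | Medical Emergency',
            '1c. Personnes prises au piege | People trapped',
            '1d. Incendie | Fire']

# The buggy version yields None for '1a. Highly vulnerable', so dict() fails.
try:
    dict(get_english_buggy(x) for x in all_cats)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# The corrected version always returns a (code, name) pair.
english_mapping = dict(get_english(x) for x in all_cats)
print(english_mapping['1a'])
```

If your own copy of get_english behaves like the first function, moving the return statement one level to the left should make the dict() call succeed.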
