"Series objects are mutable and cannot be hashed" error

Question

I am trying to get the following script to work. The input file consists of 3 columns: gene association type, gene name, and disease name.

cols = ['Gene type', 'Gene name', 'Disorder name']
no_headers = pd.read_csv('orphanet_infoneeded.csv', sep=',',header=None,names=cols)

gene_type = no_headers.iloc[1:,[0]]
gene_name = no_headers.iloc[1:,[1]]
disease_name = no_headers.iloc[1:,[2]]

query = 'Disease-causing germline mutation(s) in' ###add query as required

orph_dict = {}

for x in gene_name:
    if gene_name[x] in orph_dict:
        if gene_type[x] == query:
            orph_dict[gene_name[x]]=+ 1
        else:
            pass
    else:
        orph_dict[gene_name[x]] = 0

I keep getting an error that says:

Series objects are mutable and cannot be hashed

Any help would be dearly appreciated!

Show us the full traceback so we can see the line on which the error is thrown. My guess is that it's orph_dict[gene_name[x]] = 0. The traceback would also show us the class of the error.
– abcd, Commented Apr 17, 2015 at 13:50

Community · Accepted Answer · 2017-05-23 12:18:04Z

39

In short: gene_name[x] is a mutable object, so it cannot be hashed. To use an object as a key in a dictionary, Python needs its hash value, which is why you get the error.

Further explanation:

Mutable objects are objects whose value can be changed. For example, a list is a mutable object, since you can append to it. An int is an immutable object, because you can't change it. When you do:

a = 5
a = 3

you don't change the value of the object 5; you create a new object and make a point to it.

Python's built-in mutable containers (list, dict, set) cannot be hashed, and a pandas Series likewise refuses to be hashed. See this answer.

To solve your problem, use immutable objects as keys in your dictionary, for example a tuple, a string, or an int.
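As a rough illustration of the point above, the following sketch uses only built-in types. The gene name and the tuple contents are placeholder values chosen for the example, not data from the question.

```python
# Immutable built-ins are hashable, so they work as dictionary keys.
counts = {}
counts["BRCA1"] = 1            # str key (placeholder gene name)
counts[("BRCA1", 17)] = 2      # tuple key (placeholder pair of values)
counts[42] = 3                 # int key

# A list is mutable and defines no hash, so using it as a key fails
# with a TypeError, the same class of error the question ran into.
try:
    counts[["BRCA1", 17]] = 4
    list_key_error = None
except TypeError as exc:
    list_key_error = str(exc)

print(list_key_error)
```

The exact wording of the message varies between Python versions, but it is a TypeError mentioning an unhashable type in each case.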

edited May 23, 2017 at 12:18

CommunityBot

answered Apr 17, 2015 at 14:42

Ella Sharakanski

jkitchen · Accepted Answer · 2015-04-17 14:36:25Z

13

gene_name = no_headers.iloc[1:,[1]]

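This creates a DataFrame because you passed a list of columns (a single column, but still a list). When you later do this:

gene_name[x]

x is a column label (iterating over a DataFrame yields its column labels), so you get back a Series holding that whole column. You can't hash a Series.

The solution is to create Series from the start, by passing integer column positions instead of lists.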

gene_type = no_headers.iloc[1:,0]
gene_name = no_headers.iloc[1:,1]
disease_name = no_headers.iloc[1:,2]

Also, where you have orph_dict[gene_name[x]] =+ 1, I'm guessing that's a typo and you really mean orph_dict[gene_name[x]] += 1 to increment the counter. As written, =+ 1 simply assigns the value +1 every time.
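Putting both fixes together, a minimal sketch might look like the one below. Two things here are assumptions rather than part of the original post: the inline CSV text is made-up stand-in data for 'orphanet_infoneeded.csv', and the intended result is taken to be a per-gene count of rows whose type matches query. Note that iterating over a Series yields its values rather than its index labels, so the sketch walks the two Series in parallel with zip instead of indexing with gene_name[x].

```python
import io
import pandas as pd

# Made-up stand-in for 'orphanet_infoneeded.csv' (the first row is a header).
csv_text = """type,gene,disorder
Disease-causing germline mutation(s) in,GENE_A,Disorder 1
Disease-causing germline mutation(s) in,GENE_A,Disorder 2
Modifying germline mutation in,GENE_A,Disorder 3
Disease-causing germline mutation(s) in,GENE_B,Disorder 4
Modifying germline mutation in,GENE_C,Disorder 5
"""

cols = ['Gene type', 'Gene name', 'Disorder name']
no_headers = pd.read_csv(io.StringIO(csv_text), header=None, names=cols)

# Integer column positions (not lists) return Series rather than DataFrames.
gene_type = no_headers.iloc[1:, 0]
gene_name = no_headers.iloc[1:, 1]

query = 'Disease-causing germline mutation(s) in'

orph_dict = {}
# zip yields plain strings, which are hashable and safe to use as keys.
for name, kind in zip(gene_name, gene_type):
    orph_dict.setdefault(name, 0)   # every gene gets an entry, even with no match
    if kind == query:
        orph_dict[name] += 1        # += increments; =+ would assign +1

print(orph_dict)
```

If only the counts of matching rows are needed, filtering on gene_type == query and calling value_counts() on the gene names would likely be shorter, though genes with no matching row would then be absent from the result.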

answered Apr 17, 2015 at 14:36

jkitchen

3 Comments

Ali Over a year ago

How could I apply this technique of creating the Series from the start when I am splitting into a training and testing dataset?

X_train, X_test, y_train, y_test = train_test_split(training_feature_set, training_feature_label, test_size = 0.1, random_state=42)

@jkitchen

jkitchen Over a year ago

@Alvis, if your function returns DataFrames, you can still select individual items from those. Read the docs for indexing. .loc or .iloc are probably what you want.

Ali Over a year ago

Thank you @jkitchen I'll check out the documentation :-)
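
For the case raised in these comments, where a function hands back a DataFrame, a small sketch of the .iloc selections might look as follows. The frame labels_df and its contents are hypothetical and only stand in for whatever the splitting function returns.

```python
import pandas as pd

# Hypothetical single-column DataFrame standing in for a returned label set.
labels_df = pd.DataFrame({'label': ['a', 'b', 'a']})

first_col = labels_df.iloc[:, 0]      # Series: integer position, not a list
by_name = labels_df['label']          # Series: single column label
still_frame = labels_df.iloc[:, [0]]  # DataFrame: list of positions

one_value = labels_df.iloc[0, 0]      # plain scalar, usable as a dict key
seen = {one_value: True}
```

The distinction is the same as in the answer: a list of positions keeps the DataFrame, while a single position or label gives a Series, and a row-and-column pair gives a scalar.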
