
I'm trying to compute the average movie rating for each of four release-year intervals: (a) 1970 to 1979, (b) 1980 to 1989, etc. I'm new to data science and I wonder what I did wrong here.

EDIT

  1. Since the dataset has no year column, I extract the release year embedded in the title column and assign it as a new column:
year = df['title'].str.findall(r'\((\d{4})\)').str.get(0)
year_df = df.assign(year = year.values)

1.5. Because the extracted values are strings, I convert the entire "year" column to int. Then I use groupby to group the years into 10-year intervals.

year_df['year'] = year_df['year'].astype(int)
year_df = year_df.groupby(year_df.year // 10 * 10)
  2. After that, I want to label each year group with its 10-year interval:
year_desc = {1910: "1910 – 1919", 1920: "1920 – 1929", 1930: "1930 – 1939", 1940: "1940 – 1949", 1950: "1950 – 1959", 1960: "1960 – 1969", 1970: "1970 – 1979", 1980: "1980 – 1989", 1990: "1990 – 1999", 2000: "2000 – 2009"}
year_df['year'] = [year_desc[x] for x in year_df['year']]

When I run my code after trying to assign the year group, I get this error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

UPDATES:

I tried to follow @ozacha's suggestion and I am still getting an error, but this time it is:

'SeriesGroupBy' object has no attribute 'map'

1 Answer


Ad 1) Your year_df already has a year column, so there is no need to recreate it using df.assign(). .assign() is an alternative way of (re)defining columns in a dataframe.

Ad 2) Not sure what your test_group is, so it is difficult to tell what the source of the error is. However, I believe this is what you want, using pd.Series.map:

year_df = ...
year_df['year'] = year_df['year'].astype(int)
year_desc = {...}
year_df['year_group'] = year_df['year'].map(year_desc)
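As a minimal sketch of the mapping step, here is the same idea on a few made-up years. The values and the shortened lookup table are assumptions for illustration only; the years are floored to the start of their decade first so that they match the dictionary keys.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up).
years = pd.Series([1975, 1983, 1999, 2015], name="year")

# Floor each year to the start of its decade: 1975 -> 1970, 1983 -> 1980.
decade = years // 10 * 10

# Hypothetical lookup table that covers only some decades.
year_desc = {1970: "1970 – 1979", 1980: "1980 – 1989", 1990: "1990 – 1999"}

# Series.map looks each value up in the dict; values with no key become NaN.
year_group = decade.map(year_desc)
print(year_group.tolist())
```

Any decade that is missing from the dictionary ends up as NaN rather than raising an error, so it is worth checking the result for missing labels.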

Alternatively, you can also generate year groups dynamically:

year_df['year_group'] = year_df['year'].map(lambda year: f"{year} – {year + 9}")
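Putting the pieces together, the whole task might look roughly like the sketch below. The toy titles, the "rating" column name and the rating values are assumptions, since the question does not show the dataset. The label is built from the decade start rather than the raw year, because applying the f-string to 1975 directly would give "1975 – 1984". Floor division is used instead of rounding, since rounding to the nearest ten would move a year such as 1976 into the 1980 bucket.

```python
import pandas as pd

# Toy data; the "rating" column name and all values are assumptions.
df = pd.DataFrame({
    "title": ["Jaws (1975)", "Alien (1979)", "Aliens (1986)", "Heat (1995)"],
    "rating": [4.0, 4.5, 4.0, 3.5],
})

# Pull the four-digit year out of the title; expand=False returns a Series.
year = df["title"].str.extract(r"\((\d{4})\)", expand=False)

# Drop rows without a year before casting the column to int.
year_df = df.assign(year=year).dropna(subset=["year"])
year_df = year_df.assign(year=year_df["year"].astype(int))

# Floor to the decade start, then build the label from it.
decade = year_df["year"] // 10 * 10
year_df = year_df.assign(year_group=decade.map(lambda d: f"{d} – {d + 9}"))

# Group only at the end and keep the result in a separate variable,
# so year_df stays a DataFrame instead of becoming a GroupBy object.
avg_rating = year_df.groupby("year_group")["rating"].mean()
print(avg_rating)
```

Keeping the groupby result in its own variable avoids the situation from the question, where year_df was overwritten with a GroupBy object and no longer supported column assignment or .map.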

3 Comments

Thank you for the suggestion! The test_group is just a dataframe copied from the original. I tried both of your suggestions, and this time I got an error saying: "'SeriesGroupBy' object has no attribute 'map'". Any thoughts on what it might be?
groupby is used for grouping rows into logical sets that allow you to apply aggregations or transformations. Replace your part 1.5 with year_df['year'] = year_df['year'].astype(int).round(-1) instead
Everything works now! Thank you so much!
