I have a data file that looks like this -
[Table 1]
Terms Author Frequency
Hepatitis Christopher 2
Acid Subrata 1
Acid Kal 3
Kinase Pramod 31
Kinase Steve 5
Kinase Sharon 10
Acid Rob 5
Acid Christopher 2
Hepatitis Sharon 3
which I want to convert into a frequency matrix like this -
[Table 2]
Terms Christopher Subrata Kal Pramod Steve Sharon Rob
Hepatitis 2 0 0 0 0 3 0
Acid 2 1 3 0 0 0 5
Kinase 0 0 0 31 5 10 0
I have figured out how to do that, and I am using this code -
import pandas as pd
a = pd.read_csv("C:\\Users\\robert\\Desktop\\Python Project\\Publications Data\\New Merged Title Terms Corrected\\Python generated file\\Terms_Frequency_File.csv")
b = a.groupby(['Terms']).apply(lambda x:x.set_index(['Terms','Author']).unstack()['Frequency'])
This worked absolutely fine until yesterday. Today I regenerated the [Table 1] data because I had to add one additional author, and when I try to build the frequency matrix again as in [Table 2], I get this error -
KeyError: 'Terms'
I am pretty sure this has something to do with the index column in the dataframe, or with whitespace in the column name (in this case the 'Terms' column). I read several answers on this, such as "KeyError: 'column_name'" and "Key error when selecting columns in pandas dataframe after read_csv", and tried those methods, but they aren't helping.
Any help on this will be much appreciated! Thanks much!
What does print(a.columns) give you?
pd.pivot_table(df, index='Terms', columns='Author', values='Frequency', fill_value=0)
In your code, Terms doesn't exist in the context you have selected when you try to set_index.
You could use crosstab here: pd.crosstab(df.Terms, df.Author, values=df.Frequency, aggfunc='sum').fillna(0)
Index([''FINGER-LOOP'', 'Kukolj G', '1'], dtype='object')
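The Index output above suggests that the regenerated CSV no longer has a header row, so pandas used the first data row as the column names and 'Terms' is missing. The following is a minimal sketch of that diagnosis and one possible fix, not a confirmed account of what happened to the file. It assumes the file is comma-separated with exactly three fields per row. The inline csv_text is a hypothetical stand-in for the real file, built from the rows of [Table 1].

```python
import io
import pandas as pd

# Hypothetical stand-in for the regenerated file: the rows of [Table 1],
# written without the "Terms,Author,Frequency" header line (an assumption).
csv_text = """Hepatitis,Christopher,2
Acid,Subrata,1
Acid,Kal,3
Kinase,Pramod,31
Kinase,Steve,5
Kinase,Sharon,10
Acid,Rob,5
Acid,Christopher,2
Hepatitis,Sharon,3
"""

# Default read: the first data row is taken as the header, so there is
# no 'Terms' column and any lookup of it raises KeyError.
bad = pd.read_csv(io.StringIO(csv_text))
print(bad.columns.tolist())

# Supplying the column names explicitly restores the expected layout.
a = pd.read_csv(io.StringIO(csv_text), header=None,
                names=["Terms", "Author", "Frequency"])
a.columns = a.columns.str.strip()  # guard against stray whitespace in names

# Build the frequency matrix; missing Term/Author pairs become 0.
matrix = pd.pivot_table(a, index="Terms", columns="Author",
                        values="Frequency", aggfunc="sum", fill_value=0)
print(matrix)

# Same result via crosstab, as suggested in the comments.
ct = pd.crosstab(a.Terms, a.Author, values=a.Frequency,
                 aggfunc="sum").fillna(0).astype(int)
```

If print(a.columns) on the real file does show 'Terms', 'Author' and 'Frequency' (possibly with surrounding spaces), the header=None and names arguments are unnecessary and the strip() line alone may be enough.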