df = pd.DataFrame({'a': ['Anakin Ana', 'Anakin Ana, Chris Cannon', 'Chris Cannon', 'Bella Bold'],
                   'b': ['Bella Bold, Chris Cannon', 'Donald Deakon', 'Bella Bold', 'Bella Bold'],
                   'c': ['Chris Cannon', 'Chris Cannon, Donald Deakon', 'Chris Cannon', 'Anakin Ana, Bella Bold']},
                  index=[0, 1, 2, 3])  # four rows, so the index needs four labels
Hi everyone,
I'm trying to count how many names the columns have in common. Above is an example of what my data looks like. At first I got the error 'float' object has no attribute 'split'. After some searching, it seems the error comes from my missing data, which is read as float. But even when I try to convert the column to string, I still get an error. Below is my code.
import pandas as pd
import csv
filepath = "C:/Users/data/Untitled Folder/creditdata2.csv"
df = pd.read_csv(filepath,encoding='utf-8')
df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
df['overlap_count'] = df['word_overlap'].str.len()
df.to_csv('creditdata3.csv',mode='a',index=False)
And here is the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-21-b85ac8637aae> in <module>
4 df = pd.read_csv(filepath,encoding='utf-8')
5
----> 6 df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
7 df['overlap_count'] = df['word_overlap'].str.len()
8
<ipython-input-21-b85ac8637aae> in <listcomp>(.0)
4 df = pd.read_csv(filepath,encoding='utf-8')
5
----> 6 df['word_overlap'] = [set(x[8].astype(str).split(",")) & set(x[10].astype(str).split(",")) for x in df.values]
7 df['overlap_count'] = df['word_overlap'].str.len()
8
AttributeError: 'float' object has no attribute 'astype'
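For reference, here is a minimal sketch of one way the overlap could be computed without tripping over missing values. The traceback shows that the failing element is a plain Python float (the NaN from an empty cell), which has neither `split` nor `astype`. The helper name `split_names`, the choice of columns 'a' and 'c', and the extra fifth row with a missing value are assumptions made for illustration only; they are not taken from the real CSV.

```python
import numpy as np
import pandas as pd


def split_names(cell):
    """Return the set of names in a comma-separated cell (hypothetical helper).

    Missing values (NaN/None) give an empty set, so they never count as a name.
    """
    if pd.isna(cell):
        return set()
    # str() is the builtin conversion for a single value; strip() removes the
    # space that follows each comma, e.g. ' Chris Cannon' -> 'Chris Cannon'.
    return {name.strip() for name in str(cell).split(",") if name.strip()}


# Columns 'a' and 'c' from the example data, plus one assumed row with a
# missing value to mimic the empty cells in the CSV.
df = pd.DataFrame({
    'a': ['Anakin Ana', 'Anakin Ana, Chris Cannon', 'Chris Cannon', 'Bella Bold', np.nan],
    'c': ['Chris Cannon', 'Chris Cannon, Donald Deakon', 'Chris Cannon',
          'Anakin Ana, Bella Bold', 'Chris Cannon'],
})

# Row-wise intersection of the two name sets.
df['word_overlap'] = [split_names(a) & split_names(c) for a, c in zip(df['a'], df['c'])]
df['overlap_count'] = df['word_overlap'].map(len)

print(df[['word_overlap', 'overlap_count']])
```

Returning an empty set for missing cells matters because `str(float('nan'))` is the text 'nan', which would otherwise be treated as a name and could show up as a false match when both cells are empty.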