I am new to Python and not familiar with iterating using the groupby function in pandas. I modified the code below, and it works fine for creating a pandas DataFrame:
from itertools import groupby
import pandas as pd

i=['J,Smith,200 G Ct,',
   'E,Johnson,200 G Ct,',
   'A,Johnson,200 G Ct,',
   'M,Simpson,63 F Wy,',
   'L,Diablo,60 N Blvd,',
   'H,Simpson,63 F Wy,',
   'B,Simpson,63 F Wy,']
dbn=[]  # display names, one per household
dba=[]  # addresses, one per household
# Sort by (last name, address), then group consecutive rows by address
for z,g in groupby(
        sorted([l.split(',') for l in i],
               key=lambda x:x[1:]),
        lambda x:x[2:]
        ):
    l=list(g)
    r=len(l)
    Address=','.join(z)
    o=l[0]
    if r>2:
        dbn.append('The '+o[1]+" Family,")
        dba.append(Address)
    elif r>1:
        dbn.append(o[0]+" and "+l[1][0]+", "+o[1]+",")
        dba.append(Address)
    else:
        dbn.append(o[0]+" "+o[1])
        # print(','.join(o))
        dba.append(Address)
Hdf=pd.DataFrame({'Address':dba,'Name':dbn})
print(Hdf)
      Address                 Name
0  60 N Blvd,             L Diablo
1   200 G Ct,    E and A, Johnson,
2    63 F Wy,  The Simpson Family,
3   200 G Ct,              J Smith
How would I modify the for loop to yield the same results if I am starting from a pandas DataFrame instead of raw CSV data?
df=pd.DataFrame({'Name':['J','E','A','M','L','H','B'],
                 'Lastname':['Smith','Johnson','Johnson','Simpson','Diablo','Simpson','Simpson'],
                 'Address':['200 G Ct','200 G Ct','200 G Ct','63 F Wy','60 N Blvd','63 F Wy','63 F Wy']})
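
For reference, here is a minimal sketch of the kind of loop I have in mind, using `DataFrame.groupby` on the DataFrame above. It rests on some assumptions: I group on both `Lastname` and `Address` (the original loop groups consecutive sorted rows by address only, which gives the same result for this data but not necessarily in general), `household_label` is a hypothetical helper name of my own, and the trailing comma on the address is added by hand to mimic the raw CSV strings.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['J', 'E', 'A', 'M', 'L', 'H', 'B'],
    'Lastname': ['Smith', 'Johnson', 'Johnson', 'Simpson', 'Diablo', 'Simpson', 'Simpson'],
    'Address': ['200 G Ct', '200 G Ct', '200 G Ct', '63 F Wy', '60 N Blvd', '63 F Wy', '63 F Wy'],
})


def household_label(group):
    """Hypothetical helper: build one display name for a household group."""
    first_names = list(group['Name'])
    last_name = group['Lastname'].iloc[0]
    if len(first_names) > 2:
        return 'The ' + last_name + ' Family,'
    if len(first_names) == 2:
        return first_names[0] + ' and ' + first_names[1] + ', ' + last_name + ','
    return first_names[0] + ' ' + last_name


rows = []
# Grouping on two keys yields (Lastname, Address) tuples, sorted by key.
for (last_name, address), group in df.groupby(['Lastname', 'Address'], sort=True):
    # Assumption: append a comma so the address matches the CSV-based output.
    rows.append({'Address': address + ',', 'Name': household_label(group)})

Hdf = pd.DataFrame(rows, columns=['Address', 'Name'])
print(Hdf)
```

With the sample data this should give the same four rows as the CSV version, since the groups come out sorted by last name and rows keep their original order within each group.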