I am writing a script to read a CSV file and write the data to a graph using pygraphml.
The issue is that the first column of the file contains data like the following, and I am not able to read it:
Master Muppet ™ joèl b Kýrie, eléison
This is my Python script:
import csv
import sys
from pygraphml import Graph
from pygraphml import GraphMLParser

#reload(sys)
#sys.setdefaultencoding("utf8")

data = []    # network data to write
g = Graph()  # graph for networks

# Open the file and retrieve the target rows
with open(r"C:\Users\csvlabuser\Downloads\test.csv","r") as fp:
    reader = csv.reader(fp)
    unread_count = 2
    completed_list = []
    try:
        for rows in reader:
            if "tweeter_id" == rows[2]:  # skip and check the header
                print("tweeter_id column found")
                continue
            #if rows[2] not in completed_list:
            n = g.add_node(rows[2].encode("utf8"))
            completed_list.append(rows[2])
            n['username'] = rows[0].encode("utf8")
            n['userid'] = rows[1]
            if rows[3] != "NULL":  # edges exist only when there is a retweet id
                g.add_edge_by_label(rows[2], rows[3])
            print unread_count
            unread_count += 1
    except:
        pass
fp.close()
print unread_count
g.show()

# Write the graph into GraphML file format
parser = GraphMLParser()
parser.write(g, "myGraph.graphml")
Kindly let me know where the issue is.
Thanks in advance.
except with a pass for a body... It would help to know if this is Python 2 or Python 3, though; Python 3's native support for Unicode is much better and more seamless, while in Python 2 you're going to have a harder time. In addition, we (and Python) need to know the encoding of the file being read; if the file is utf-8 and you read it as latin-1 or utf-16, or vice versa, you won't interpret the file correctly.
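To illustrate the two points in this comment, here is a minimal sketch. It assumes Python 3 and that the file is UTF-8 encoded; neither is confirmed by the question. The helper names read_rows, load_rows and demo_mismatch are hypothetical, made up for this example. The idea is to pass the encoding explicitly to open() and to catch only the decoding error, so a wrong encoding is reported instead of being silently swallowed.

```python
import csv


def read_rows(path, encoding="utf-8"):
    """Return all rows of a CSV file as lists of str, decoded with `encoding`."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, newline="", encoding=encoding) as fp:
        return list(csv.reader(fp))


def load_rows(path, encoding="utf-8"):
    """Like read_rows, but report a wrong encoding instead of hiding it."""
    try:
        return read_rows(path, encoding)
    except UnicodeDecodeError as exc:
        # Catch only the specific error and re-raise with context;
        # a bare `except: pass` would hide exactly this failure.
        raise ValueError(f"{path} is not valid {encoding}: {exc}") from exc


def demo_mismatch(text="joèl"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec (latin-1)."""
    return text.encode("utf-8").decode("latin-1")
```

If the file turns out to be in some other encoding (for example cp1252, which is common for CSV files exported on Windows), pass that name as the encoding argument instead. demo_mismatch shows why guessing wrong matters: the decode succeeds without error but the accented characters come out garbled.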