How to set up a Pandas DataFrame and create a networkx plot in Python

Question

I have the following dataframe:

import pandas as pd

data = [['tom', 'matt', 'alex', 10, 1, 'a'], ['adam', 'matt', 'james', 15, 1, 'a'], ['tom', 'adam', 'alex', 20, 1, 'a'], ['alex', 'matt', 'james', 12, 1, 'a']]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Person1', 'Person2', 'Person3', 'Attempts', 'Score', 'Category'])
print(df)

  Person1 Person2 Person3  Attempts  Score Category
0     tom    matt    alex        10      1        a
1    adam    matt   james        15      1        a
2     tom    adam    alex        20      1        a
3    alex    matt   james        12      1        a

I am hoping to create a network graph where:

a) there is a node for each unique person across Person1, Person2, Person3

b) the node size is the sum of Attempts for each person

c) there is an edge between each pair of people who appear in the same row, and its thickness is the sum of the Attempts they share.

I have read through the documentation but am still struggling to work out how to set up my dataframe and then plot it. Any ideas on how to do this? Thanks very much!

Does this answer your question? Construct NetworkX graph from Pandas DataFrame
– Trenton McKinney, Commented Jun 8, 2020 at 5:29

yatu · Accepted Answer · 2020-06-08 10:36:59Z

You can start by taking the length-two combinations of the people in each row and building a dictionary keyed by pair, adding together the attempts of pairs that appear in a different order:

from itertools import combinations

seen = set()
d = {}
# Each row is (Person1, Person2, Person3, Attempts)
for *people, att in df.values[:, :4].tolist():
    for edge in combinations(people, r=2):
        edge_rev = tuple(reversed(edge))
        if edge in seen:
            d[edge] += att
        elif edge_rev in seen:
            # Same pair in the opposite order: add to the existing entry
            d[edge_rev] += att
        else:
            seen.add(edge)
            d[edge] = att

# A list rather than a generator, so it can be printed and still passed to networkx
w_edges = [(*edge, w) for edge, w in d.items()]
# ('tom', 'matt', 10) ('tom', 'alex', 30) ('matt', 'alex', 22) ('adam', 'matt', 15)...
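
The same pair weights can also be accumulated without the `seen` set by sorting each pair before using it as a key. This is only a sketch of an alternative, not part of the approach above: the names `rows`, `edge_weights` and `w_edges_alt` are made up for illustration, and `rows` simply repeats the first four columns of the question's data so the block runs on its own.

```python
from collections import Counter
from itertools import combinations

# First four columns of the question's data: three people, then Attempts.
rows = [
    ['tom', 'matt', 'alex', 10],
    ['adam', 'matt', 'james', 15],
    ['tom', 'adam', 'alex', 20],
    ['alex', 'matt', 'james', 12],
]

edge_weights = Counter()
for *people, attempts in rows:
    for a, b in combinations(people, 2):
        # Sort the pair so ('alex', 'matt') and ('matt', 'alex') share one key.
        edge_weights[tuple(sorted((a, b)))] += attempts

# (node, node, weight) triples, the shape add_weighted_edges_from expects.
w_edges_alt = [(a, b, w) for (a, b), w in edge_weights.items()]
```

Sorting changes which name comes first in some pairs, which should not matter for an undirected `nx.Graph`.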

And build a graph from the list of weighted edges with add_weighted_edges_from:

import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from(w_edges)

You can then obtain the weights of the graph and set them as edge width (downscaled by some factor) with:

import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
# Edge weights, in the same order as G.edges()
weights = nx.get_edge_attributes(G, 'weight').values()

pos = nx.circular_layout(G)
nx.draw(G, pos,
        edge_color='lightgreen',
        node_color='lightblue',
        width=[i / 3 for i in weights],
        with_labels=True,
        node_size=1000,
        alpha=0.7)
plt.show()
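
The plot above uses a fixed `node_size`, while the question asks for each node to be sized by that person's total Attempts. One possible way to get there, sketched under assumptions: sum Attempts per person, then build a list of sizes in node order. The helper `node_sizes`, the `scale` factor and the `rows` list (the first four columns of the question's data) are hypothetical names introduced only for this example.

```python
from collections import Counter

# First four columns of the question's data: three people, then Attempts.
rows = [
    ['tom', 'matt', 'alex', 10],
    ['adam', 'matt', 'james', 15],
    ['tom', 'adam', 'alex', 20],
    ['alex', 'matt', 'james', 12],
]

# Sum Attempts per person across all three person columns.
totals = Counter()
for *people, attempts in rows:
    for person in people:
        totals[person] += attempts


def node_sizes(nodes, totals, scale=30):
    """Return one size per node, in the same order as `nodes`.

    `scale` is an arbitrary multiplier (an assumption) so the raw
    totals become visible marker sizes.
    """
    return [totals[n] * scale for n in nodes]
```

With the graph `G` built earlier, passing `node_size=node_sizes(G.nodes(), totals)` to `nx.draw` in place of `node_size=1000` should size each node by its total; the list follows the order of `G.nodes()`, which is the order `nx.draw` uses by default.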

edited Jun 8, 2020 at 10:36

answered Jun 8, 2020 at 10:08


2 Comments

SOK Over a year ago

This is fantastic @yatu. Thanks so much, I've got it working, and if I have any questions I'll let you know!

SOK Over a year ago

Do you know how I could get the node_size to be the sum of each person's attempts? Would I need to create another list of the unique names and the sum of their attempts? Thanks!
