Create undirected graph in NetworkX in python from pandas dataframe

Question

I am new to the NetworkX package in Python and want to solve the following problem.

Let's say this is my data set:

import pandas as pd 
d = {'label': [1, 2, 3, 4, 5], 'size': [10, 8, 6, 4, 2], 'dist': [0, 2, -2, 4, -4]}
df = pd.DataFrame(data=d)
df

label and size in the df are quite self-explanatory. The dist column measures the distance from the biggest label (label 1) to the rest of the labels. Hence dist is 0 in the case of label 1.

I want to produce something similar to the picture below:

The label with the biggest size (label 1) should be in the central position. Edges represent the distance from label 1 to every other label, and node sizes are proportional to the size of each label. Is this possible?

Thank you very much in advance. Please let me know if the question is unclear.

Are negative distance values important, or is it only a matter of drawing the graph? My first guess would be to create another dataframe based on a list of edges, then use networkx to convert the dataframe to a graph.
– Beinje, Commented Nov 4, 2021 at 11:09
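The edge-list approach suggested in this comment could look roughly like the sketch below. It uses `nx.from_pandas_edgelist` and `nx.set_node_attributes`; the column names `source` and `target`, and the choice of the `dist == 0` row as the hub, are assumptions made for illustration and are not part of the original question.

```python
import networkx as nx
import pandas as pd

df = pd.DataFrame({'label': [1, 2, 3, 4, 5],
                   'size': [10, 8, 6, 4, 2],
                   'dist': [0, 2, -2, 4, -4]})

# Assumption: the hub is the row whose dist is 0 (label 1 in the question).
hub = int(df.loc[df['dist'] == 0, 'label'].iloc[0])

# One row per edge: hub -> every other label, keeping dist as an edge attribute.
others = df[df['label'] != hub]
edges = pd.DataFrame({'source': hub,
                      'target': others['label'].to_numpy(),
                      'dist': others['dist'].to_numpy()})

# nx.Graph is the default, so the result is undirected.
G = nx.from_pandas_edgelist(edges, source='source', target='target',
                            edge_attr='dist')

# Attach each label's size as a node attribute for later use when drawing.
nx.set_node_attributes(G, df.set_index('label')['size'].to_dict(), 'size')
```

Storing `dist` on the edges keeps the sign, so whether negative values matter can be decided later, at drawing time.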

Reti43 · Accepted Answer · 2021-11-04 11:44:00Z


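import matplotlib.pyplot as plt
import networkx as nx

# df is the DataFrame defined in the question.
G = nx.Graph()
for _, row in df.iterrows():
    # Place every node on a horizontal line, with x equal to its distance.
    G.add_node(row['label'], pos=(row['dist'], 0), size=row['size'])

# Connect the central (biggest) node to every other node.
biggest_node = 1
for node in G.nodes:
    if node != biggest_node:
        G.add_edge(biggest_node, node)

nx.draw(G,
        pos={node: attrs['pos'] for node, attrs in G.nodes.items()},
        # G.nodes.values() yields the attribute dict of each node
        node_size=[attrs['size'] * 100 for attrs in G.nodes.values()],
        with_labels=True
        )
plt.show()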

Which plots

Notes:

You will notice that the edges 1-3 and 1-2 are thicker, because they overlap with sections of the edges 1-5 and 1-4 respectively. You can address that by drawing only one edge from the center to the furthest node in each direction. Since every node is on the same line, it will look the same.

coords = [(attrs['pos'][0], node) for node, attrs in G.nodes.items()]
nx.draw(G,
        # same arguments as before and also add
        edgelist=[(biggest_node, min(coords)[1]), (biggest_node, max(coords)[1])]
        )
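For reference, here is a self-contained sketch of the same idea with the elided arguments filled in. The helper names `build_graph`, `outermost_edges` and `draw_on_line` are hypothetical and introduced only for this example. The sketch also skips an edge from the hub to itself, which could otherwise appear if the hub were already the leftmost or rightmost node.

```python
import networkx as nx


def build_graph(labels, sizes, dists, hub):
    """Star graph around `hub`; each node sits at (dist, 0)."""
    G = nx.Graph()
    for label, size, dist in zip(labels, sizes, dists):
        G.add_node(label, pos=(dist, 0), size=size)
    G.add_edges_from((hub, n) for n in list(G.nodes) if n != hub)
    return G


def outermost_edges(G, hub):
    """Edges from the hub to the leftmost and rightmost nodes only."""
    leftmost = min(G.nodes, key=lambda n: G.nodes[n]['pos'][0])
    rightmost = max(G.nodes, key=lambda n: G.nodes[n]['pos'][0])
    # Skip the hub itself in case it is already an extreme node.
    return [(hub, n) for n in (leftmost, rightmost) if n != hub]


def draw_on_line(G, hub, scale=100):
    """Draw the graph with only the two outermost edges visible."""
    import matplotlib.pyplot as plt  # local import: only needed for drawing
    nx.draw(G,
            pos=nx.get_node_attributes(G, 'pos'),
            node_size=[attrs['size'] * scale for attrs in G.nodes.values()],
            edgelist=outermost_edges(G, hub),
            with_labels=True)
    plt.show()


G = build_graph([1, 2, 3, 4, 5], [10, 8, 6, 4, 2], [0, 2, -2, 4, -4], hub=1)
edges = outermost_edges(G, hub=1)
# draw_on_line(G, hub=1)  # uncomment to render the figure
```

The graph itself still contains all four edges; only the drawing is restricted to the two outermost ones.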

The 100 factor in the list for the node_size argument is just a scaling factor. You can change that to whatever you want.

edited Nov 4, 2021 at 11:44

answered Nov 4, 2021 at 11:18


4 Comments

Avto Abashishvili Over a year ago

Thanks for the answer. It is exactly what I was aiming for. One quick question: when I run the code, it fails with: module 'matplotlib.cbook' has no attribute 'iterable'. Is it something related to the matplotlib package? I guess I need to update or downgrade the current version.

Reti43 Over a year ago

@AvtoAbashishvili According to this, you probably need to update networkx to the most recent version.

Reti43 Over a year ago

@AvtoAbashishvili By the way, I hardcoded which node is the biggest one here. But you can compute that dynamically by finding the node with the biggest size, or the one whose distance is 0.

Avto Abashishvili Over a year ago

Updating NetworkX solved the problem. Thanks.
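Computing the biggest node dynamically, as suggested in the comment above, might look like the following minimal sketch. It assumes the column names from the question; the variable names `biggest_by_size` and `biggest_by_dist` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'label': [1, 2, 3, 4, 5],
                   'size': [10, 8, 6, 4, 2],
                   'dist': [0, 2, -2, 4, -4]})

# Option 1: label of the row with the largest size.
biggest_by_size = df.loc[df['size'].idxmax(), 'label']

# Option 2: label of the row whose distance is 0.
biggest_by_dist = df.loc[df['dist'] == 0, 'label'].iloc[0]

# Convert to a plain Python int so it can replace the hardcoded value.
biggest_node = int(biggest_by_size)
```

Both options give the same label for this data set. If several rows tie for the largest size, `idxmax` returns the first of them.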
