I am new to the NetworkX package in Python and want to solve the following problem.

Let's say this is my data set:

import pandas as pd 
d = {'label': [1, 2, 3, 4, 5], 'size': [10, 8, 6, 4, 2], 'dist': [0, 2, -2, 4, -4]}
df = pd.DataFrame(data=d)
df 

label and size in the df are quite self-explanatory. The dist column measures the distance from the biggest label (label 1) to the rest of the labels. Hence dist is 0 in the case of label 1.

I want to produce something similar to the picture below:

[image]

The label with the biggest size (label 1) should be in the central position. Each edge represents the distance from label 1 to one of the other labels, and the node sizes are proportional to the size of each label. Is this possible?

Thank you very much in advance. Please let me know if the question is unclear.

  • Are negative distance values meaningful, or is this only a matter of drawing the graph? My first guess would be to create another DataFrame from a list of edges, then use NetworkX to convert that DataFrame to a graph. Commented Nov 4, 2021 at 11:09
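A minimal sketch of the edge-list approach suggested in this comment, assuming label 1 is the hub (hardcoded as `center`) and using hypothetical column names `source` and `target` for the edge table. `nx.from_pandas_edgelist` builds the graph, and the signed distance is kept as an edge attribute:

```python
import pandas as pd
import networkx as nx

d = {'label': [1, 2, 3, 4, 5], 'size': [10, 8, 6, 4, 2], 'dist': [0, 2, -2, 4, -4]}
df = pd.DataFrame(data=d)

center = 1  # assumption: label 1 is the hub, as stated in the question

# One row per edge: hub -> every other label, keeping the signed distance
others = df[df['label'] != center]
edges = pd.DataFrame({
    'source': center,
    'target': others['label'].to_numpy(),
    'dist': others['dist'].to_numpy(),
})

G = nx.from_pandas_edgelist(edges, source='source', target='target', edge_attr='dist')

# The edge list carries no node data, so attach the sizes separately
nx.set_node_attributes(G, df.set_index('label')['size'].to_dict(), 'size')
```

The edge table only describes connections, so node attributes such as `size` have to be added in a second step, as in the last line.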

1 Answer

import matplotlib.pyplot as plt
import networkx as nx

# One node per label, placed on the x-axis at its distance from label 1
G = nx.Graph()
for _, row in df.iterrows():
    G.add_node(row['label'], pos=(row['dist'], 0), size=row['size'])

# Connect the biggest node (hardcoded here) to every other node
biggest_node = 1
for node in G.nodes:
    if node != biggest_node:
        G.add_edge(biggest_node, node)

nx.draw(G,
        pos={node: attrs['pos'] for node, attrs in G.nodes.items()},
        node_size=[attrs['size'] * 100 for attrs in G.nodes.values()],
        with_labels=True
        )
plt.show()

Which plots:

[image]

Notes:

You will notice that the edges 1-3 and 1-2 look thicker, because they overlap with sections of the edges 1-5 and 1-4 respectively. You can address that by drawing only one edge from the center to the furthest node in each direction; since every node lies on the same line, the graph will look the same.

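# (x position, node) pairs, so min/max pick the leftmost and rightmost nodes
coords = [(attrs['pos'][0], node) for node, attrs in G.nodes.items()]
nx.draw(G,
        pos={node: attrs['pos'] for node, attrs in G.nodes.items()},
        node_size=[attrs['size'] * 100 for attrs in G.nodes.values()],
        with_labels=True,
        # draw only the edges to the furthest node on each side
        edgelist=[(biggest_node, min(coords)[1]), (biggest_node, max(coords)[1])]
        )
plt.show()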

The 100 factor in the list for the node_size argument is just a scaling factor. You can change that to whatever you want.

4 Comments

Thanks for the answer. It is exactly what I aim for. One quick question, when I run the code it states: module 'matplotlib.cbook' has no attribute 'iterable'. Is it something related to the matplotlib package? I guess I need to update or downgrade the current version.
@AvtoAbashishvili According to this, you probably need to update networkx to the most recent version.
@AvtoAbashishvili By the way, I hardcoded which node is the biggest one here. But you can compute that dynamically by checking which node has the biggest size, or whose distance is 0.
Updating NetworkX solved the problem. Thanks.
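The dynamic lookup mentioned in the comments could be sketched as follows. The variable names `biggest_by_size` and `biggest_by_dist` are illustrative, and the second option assumes exactly one row has a distance of 0:

```python
import pandas as pd

d = {'label': [1, 2, 3, 4, 5], 'size': [10, 8, 6, 4, 2], 'dist': [0, 2, -2, 4, -4]}
df = pd.DataFrame(data=d)

# Option 1: the label on the row with the largest size
biggest_by_size = int(df.loc[df['size'].idxmax(), 'label'])

# Option 2: the label whose distance is 0 (assumes exactly one such row)
biggest_by_dist = int(df.loc[df['dist'] == 0, 'label'].iloc[0])
```

Either value can then replace the hardcoded `biggest_node = 1` in the answer.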
