Adjacency matrix not square error from square dataframe with networkx

Question

I have code that builds a graph from an adjacency matrix, which is itself derived from a table mapping workers to their managers. The source is a table with two columns (Worker, manager). The code works perfectly on a small mock data set, but fails unexpectedly with the real data:

import pandas as pd
import networkx as nx

# Read input
df = pd.read_csv("org.csv")

# Create the input adjacency matrix
am = pd.DataFrame(0, columns=df["Worker"], index=df["Worker"])
# This way, it is impossible that the dataframe is not square,
# or that index and columns don't match

# Fill the matrix
for ix, row in df.iterrows():
    am.at[row["manager"], row["Worker"]] = 1

# At this point, am.shape returns a square dataframe (2825,2825)
# Generate the graph
G = nx.from_pandas_adjacency(am, create_using=nx.DiGraph)

This returns: NetworkXError: Adjacency matrix not square: nx,ny=(2825, 2829)

And indeed, the dimensions reported in the error differ from those of the input dataframe am.

Does anyone have an idea of what happens in from_pandas_adjacency that could lead to this mismatch?

mozway · Accepted Answer · 2025-02-25 15:42:55Z


In:

am = pd.DataFrame(0, columns=df["Worker"], index=df["Worker"])
# This way, it is impossible that the dataframe is not square,

your DataFrame is indeed square, but when you later assign values in the loop, if you have a manager that is not in "Worker", this will create a new row:

am.at[row["manager"], row["Worker"]]
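To see the enlargement happen, here is a minimal sketch on toy data (the three-row frame is an assumption for illustration, not the real org.csv). It only shows that assigning with .at to a row label that does not exist yet adds a row instead of raising:

```python
import pandas as pd

# Toy data (assumed): managers A and B never appear in the "Worker" column.
df = pd.DataFrame({"manager": ["A", "B", "A"], "Worker": ["D", "E", "F"]})

# Same initialization as in the question: square at this point.
am = pd.DataFrame(0, columns=df["Worker"], index=df["Worker"])
shape_before = am.shape

# Same loop as in the question: unknown row labels are silently appended.
for _, row in df.iterrows():
    am.at[row["manager"], row["Worker"]] = 1

shape_after = am.shape

# Managers that are not listed as workers are the labels that get appended.
missing = set(df["manager"]) - set(df["Worker"])
print(shape_before, shape_after, missing)
```

Comparing am.shape before and after the loop, and checking whether missing is empty, is a quick way to tell if this is what happens with your own data.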

It is better to avoid the loop: use a crosstab, then reindex on the whole set of nodes:

am = pd.crosstab(df['manager'], df['Worker'])
nodes = am.index.union(am.columns)
am = am.reindex(index=nodes, columns=nodes, fill_value=0)
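A self-contained version of the crosstab approach, as a sketch on assumed toy data, including the final call to from_pandas_adjacency:

```python
import pandas as pd
import networkx as nx

# Toy data (assumed for illustration).
df = pd.DataFrame({"manager": ["A", "B", "A"], "Worker": ["D", "E", "F"]})

# Rows are managers, columns are workers: rectangular in general.
am = pd.crosstab(df["manager"], df["Worker"])

# Use the union of both label sets on both axes so the matrix is square.
nodes = am.index.union(am.columns)
am = am.reindex(index=nodes, columns=nodes, fill_value=0)

G = nx.from_pandas_adjacency(am, create_using=nx.DiGraph)
print(am.shape, sorted(G.edges()))
```

Note that crosstab counts occurrences, so a (manager, Worker) pair that appears twice in the input ends up as 2 in the matrix, which networkx reads as an edge weight of 2.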

Even better, if you don't really need the adjacency matrix, directly create the graph with nx.from_pandas_edgelist:

G = nx.from_pandas_edgelist(df, source='manager', target='Worker',
                            create_using=nx.DiGraph)
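As a runnable sketch on assumed toy data, this also shows that the adjacency matrix can still be recovered from the graph afterwards with nx.to_pandas_adjacency, should you need it:

```python
import pandas as pd
import networkx as nx

# Toy data (assumed for illustration).
df = pd.DataFrame({"manager": ["A", "B", "A"], "Worker": ["D", "E", "F"]})

# Each row of df becomes one directed edge manager -> Worker.
G = nx.from_pandas_edgelist(df, source="manager", target="Worker",
                            create_using=nx.DiGraph)

# Optional: derive a square adjacency matrix from the graph itself.
adj = nx.to_pandas_adjacency(G, nodelist=sorted(G.nodes), dtype=int)
print(adj)
```

Because the matrix is built from the graph's own node list, it is square by construction.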

Example:

# input
df = pd.DataFrame({'manager': ['A', 'B', 'A'], 'Worker': ['D', 'E', 'F']})

# adjacency matrix
   A  B  D  E  F
A  0  0  1  0  1
B  0  0  0  1  0
D  0  0  0  0  0
E  0  0  0  0  0
F  0  0  0  0  0

# adjacency matrix with your code
Worker    D    E    F
Worker               
D       0.0  0.0  0.0
E       0.0  0.0  0.0
F       0.0  0.0  0.0
A       1.0  NaN  1.0  # those rows are created 
B       NaN  1.0  NaN  # after initializing am


edited Feb 25 at 15:42

answered Feb 25 at 15:37

mozway


3 Comments

mrgou Feb 25 at 16:05

Thanks for this, it works! Not sure why though, as the dimensions remain the same before and after the loop. However, with your method, it's (2823,2823) instead of (2825,2825). So something was wrong!

mozway Feb 25 at 16:17

@mrgou have you checked the list of values in each column? Do you have None/NaNs?
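A few checks along those lines, as a sketch on hypothetical data (the frame below is made up to contain each kind of problem). Besides None/NaN, duplicated labels may be worth a look: pd.DataFrame(0, columns=df["Worker"], index=df["Worker"]) keeps repeated names, whereas crosstab collapses them. Whether any of this applies to the real data is an assumption to verify.

```python
import pandas as pd

# Hypothetical data containing the kinds of problems worth ruling out.
df = pd.DataFrame({
    "Worker":  ["D", "E", "F", "E"],
    "manager": ["A", "B", None, "B"],
})

# Number of missing values (None/NaN) per column.
n_missing = df[["Worker", "manager"]].isna().sum()

# Worker labels that occur more than once.
dup_workers = df.loc[df["Worker"].duplicated(keep=False), "Worker"].unique()

# Managers that never appear in the "Worker" column.
not_workers = set(df["manager"].dropna()) - set(df["Worker"])

print(n_missing, dup_workers, not_workers, sep="\n")
```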

ThomasIsCoding Feb 26 at 0:59

Nice solution with crosstab as a bridge from the edge list to the adjacency matrix, +1! I am a bit surprised that networkx doesn't have a function that can build a graph based on an incidence matrix.

ThomasIsCoding · Accepted Answer · 2025-02-26 01:04:20Z

First of all, your "adjacency matrix" is not a true adjacency matrix; it is actually an "incidence matrix".

I didn't find a straightforward utility in networkx that supports generating a directed graph from an incidence matrix. However, the igraph package in the R environment has such functionality, which shows how it should work. For example:

library(igraph)

df <- data.frame(
    manager = c("A", "B", "A"),
    worker = c("D", "E", "F")
)
am <- table(df)
g <- graph_from_biadjacency_matrix(am, directed = TRUE, mode = "out")
plot(g)

where

> print(df)
  manager worker
1       A      D
2       B      E
3       A      F

> print(am)
       worker
manager D E F
      A 1 0 1
      B 0 1 0

so that g is a directed graph with edges pointing from managers to workers.

Again, the real "adjacency matrix" should look like this:

> as_adjacency_matrix(g)
5 x 5 sparse Matrix of class "dgCMatrix"
  A B D E F
A . . 1 . 1
B . . . 1 .
D . . . . .
E . . . . .
F . . . . .
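For comparison, a rough Python counterpart of the R snippet, as a sketch only: it does not rely on a dedicated networkx helper, it simply walks the manager-by-worker table and adds one edge per non-zero cell. The toy data and variable names are assumptions for illustration.

```python
import pandas as pd
import networkx as nx

# Toy data (assumed), mirroring the R example.
df = pd.DataFrame({"manager": ["A", "B", "A"], "Worker": ["D", "E", "F"]})

# Rectangular manager x worker table, like table(df) in R.
ct = pd.crosstab(df["manager"], df["Worker"])

G = nx.DiGraph()
G.add_nodes_from(ct.index.union(ct.columns))

# One directed edge manager -> worker for every non-zero cell.
for manager, row in ct.iterrows():
    for worker, count in row.items():
        if count > 0:
            G.add_edge(manager, worker)

# The square adjacency matrix over all five nodes.
adj = nx.to_pandas_adjacency(G, nodelist=sorted(G.nodes), dtype=int)
print(adj)
```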
