I have code that builds a graph from an adjacency matrix, which is itself built from a table relating workers to their managers. The source is a table with two columns (Worker, manager). The code works perfectly with a small mock data set, but fails unexpectedly with the real data:
import pandas as pd
import networkx as nx
# Read input
df = pd.read_csv("org.csv")
# Create the input adjacency matrix
am = pd.DataFrame(0, columns=df["Worker"], index=df["Worker"])
# This way, it is impossible that the dataframe is not square,
# or that index and columns don't match
# Fill the matrix
for ix, row in df.iterrows():
    am.at[row["manager"], row["Worker"]] = 1
# At this point, am.shape reports a square dataframe: (2825, 2825)
# Generate the graph
G = nx.from_pandas_adjacency(am, create_using=nx.DiGraph)
This returns: NetworkXError: Adjacency matrix not square: nx,ny=(2825, 2829)
And indeed, the dimensions reported in the error do not match those of the input dataframe am.
Does anyone have an idea of what happens in from_pandas_adjacency that could lead to this mismatch?
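One check that might narrow this down is whether the Worker column contains repeated names. pandas allows duplicate labels, and selecting columns by a list of labels returns every column that matches each label, so a label occurring n times yields n*n columns. If from_pandas_adjacency reorders the columns by the index internally (an assumption about its internals on my part, not something I have confirmed for every version), the matrix would get wider than am.shape suggests. The helper names below are hypothetical and use only the standard library:

```python
from collections import Counter


def duplicate_labels(labels):
    """Return {label: count} for every label that occurs more than once."""
    return {label: n for label, n in Counter(labels).items() if n > 1}


def width_after_label_selection(labels):
    """Column count if each label in `labels` selects every column carrying it.

    A label that occurs n times is requested n times and matches n columns
    each time, so it contributes n * n columns.
    """
    return sum(n * n for n in Counter(labels).values())


# Hypothetical stand-in for df["Worker"]; "ann" appears twice.
workers = ["ann", "bob", "ann", "cid"]
print(duplicate_labels(workers))             # {'ann': 2}
print(width_after_label_selection(workers))  # 6, although len(workers) == 4
```

With the real data this would be called as duplicate_labels(df["Worker"]). Under this assumption, 2825 workers with two names that each appear twice would give 2821 + 2*4 = 2829 columns, which is consistent with the error message but not proof of the cause.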

