1

I would like to create a matrix in Python from a DataFrame that holds the frequency of each state-transition combination. For example, with 3 states (a, b, c) and two time periods (1 & 2), I could have the following DataFrame:

# Import pandas library
import pandas as pd

# initialize list of lists
data = [['a to a', 20],['a to b', 10], ['a to c', 5],
        ['b to a', 7],['b to b', 30],['b to c', 15],
        ['c to a', 9],['c to b', 6],['c to c', 25]]
  
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Transition', 'Frequency'])

From this DataFrame, is there a way to create a 3x3 matrix (array) in Python and populate it with the values from the "Frequency" column?

ex matrix:
      a   b   c
  a [20, 10,  5]
  b [ 7, 30, 15]
  c [ 9,  6, 25]

bonus points for adding the axis values of "a, b, c" to the matrix output ;)

2 Answers

1

You can split the 'x to y' strings into 'x' and 'y' columns, pivot, then convert the result with to_numpy:

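a = (df
 # the split produces two new columns named 0 (source) and 1 (target)
 .join(df['Transition'].str.split(r'\s*to\s*', expand=True))
 # pivot takes keyword-only arguments in pandas 2.0 and later
 .pivot(index=0, columns=1, values='Frequency')
 .to_numpy()
 )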

Output:

array([[20, 10,  5],
       [ 7, 30, 15],
       [ 9,  6, 25]])

Intermediate pivoted DataFrame (before conversion to numpy):

1   a   b   c
0            
a  20  10   5
b   7  30  15
c   9   6  25
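
For the "bonus points" part of the question, here is a minimal sketch of the same split-and-pivot idea that keeps the a, b, c labels alongside the array. The column names 'from' and 'to' and the variable names are illustrative assumptions, not part of the original answer, and the pattern is tightened to require whitespace around "to" so that state names containing the letters "to" would not be split by accident.

```python
import pandas as pd

data = [['a to a', 20], ['a to b', 10], ['a to c', 5],
        ['b to a', 7], ['b to b', 30], ['b to c', 15],
        ['c to a', 9], ['c to b', 6], ['c to c', 25]]
df = pd.DataFrame(data, columns=['Transition', 'Frequency'])

# Split 'x to y' into two columns and give them readable (hypothetical) names.
states = df['Transition'].str.split(r'\s+to\s+', expand=True)
states.columns = ['from', 'to']

# Rows are source states, columns are target states.
labeled = df.join(states).pivot(index='from', columns='to', values='Frequency')

matrix = labeled.to_numpy()              # plain 3x3 NumPy array
row_labels = labeled.index.tolist()      # labels for the rows
col_labels = labeled.columns.tolist()    # labels for the columns

print(labeled)
```

Printing `labeled` shows the matrix with its axis labels, while `matrix` holds only the numbers.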

0

Try the following code:

import pandas as pd
import numpy as np

# initialize list of lists
data = [['a to a', 20],['a to b', 10], ['a to c', 5],
        ['b to a', 7],['b to b', 30],['b to c', 15],
        ['c to a', 9],['c to b', 6],['c to c', 25]]
  
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Transition', 'Frequency'])
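# Start with an empty DataFrame; .loc below adds rows and columns as needed
df1 = pd.DataFrame()
l = []
for i, j in df[['Transition', 'Frequency']].values:
    l.append(j)
    # i is a string such as 'a to b': i[0] is the source state, i[-1] the target.
    # This only works for single-character state names.
    df1.loc[i[0], i[-1]] = j
arr = np.array(l).reshape(3, 3)  # NumPy array; assumes 3 states listed row by row
print(df1)  # shows the row and column labels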

Output of pandas dataframe:

      a     b     c
a  20.0  10.0   5.0
b   7.0  30.0  15.0
c   9.0   6.0  25.0
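
The values print as floats because df1 starts empty, so each newly created cell is NaN until it is filled. As a rough sketch of a variant (not the code from the answer above), the table can be created up front with integer zeros and the state names taken from a split instead of from i[0] and i[-1], which also allows multi-character state names. The names `pairs`, `states` and `table` are assumptions made for illustration, and the sketch assumes every transition is written exactly as 'x to y'.

```python
import pandas as pd

data = [['a to a', 20], ['a to b', 10], ['a to c', 5],
        ['b to a', 7], ['b to b', 30], ['b to c', 15],
        ['c to a', 9], ['c to b', 6], ['c to c', 25]]
df = pd.DataFrame(data, columns=['Transition', 'Frequency'])

# 'a to b' -> ['a', 'b']; assumes the separator is exactly ' to '.
pairs = [t.split(' to ') for t in df['Transition']]

# Unique state names in order of first appearance.
states = list(dict.fromkeys(s for pair in pairs for s in pair))

# Pre-filled integer table, so the assignments below keep an integer dtype.
table = pd.DataFrame(0, index=states, columns=states)
for (src, dst), freq in zip(pairs, df['Frequency']):
    table.loc[src, dst] = freq   # row label = source state, column label = target

arr = table.to_numpy()
print(table)
```

Here `table.loc[src, dst] = freq` does the same job as `df1.loc[i[0], i[-1]] = j`: it addresses one cell by its row and column labels and writes the frequency into it.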

1 Comment

Thank you, this works like a charm! Would you mind explaining to a noob what happens in this line? => df1.loc[i[0],i[-1]] = j
