I'm using Python 3.x. I have a NumPy array of shape (29982, 29982) and a list of length 29982. The sample array looks like

array([[1,5,7,2,9...],
       [2,6,4,1,5...],
       [7,9,1,12,4...],
       ...
       ...
       [6,8,13,2,4...]])

The sample list looks like

['John','David','Josua',......,'Martin']

I would like to build a pandas dataframe from this array, using the list as both the row and column labels, that keeps only the array values greater than 5 and sets all other values to 0. The dataframe should look like

        'John'  'David'   'Josua'
'John'    0       0         7
'David'   0       6         0
'Josua'   7       9         0
....
'Martin'  6       8         13

Can you please suggest how I should do this?

2 Answers

Just create the dataframe from the array with pd.DataFrame, passing your list as index and columns. Then use df.where to keep only values that are greater than 5:

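import pandas as pd

arr = [...]  # your (29982, 29982) array
lst = ['John','David','Josua',...,'Martin']

# Use the names as both row and column labels
df = pd.DataFrame(arr, index=lst, columns=lst)
# Keep values greater than 5; replace everything else with 0
df = df.where(df > 5, 0)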
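
As a quick check, here is a minimal runnable sketch of the same approach. It assumes a 3x3 array built from the top-left corner of the sample in the question (the real array is far larger), and the name `filtered` is only illustrative:

```python
import numpy as np
import pandas as pd

# Illustrative stand-in: top-left 3x3 corner of the question's sample array
arr = np.array([[1, 5, 7],
                [2, 6, 4],
                [7, 9, 1]])
lst = ['John', 'David', 'Josua']

df = pd.DataFrame(arr, index=lst, columns=lst)

# where() keeps entries where the condition is True and
# substitutes the second argument (0) everywhere else
filtered = df.where(df > 5, 0)
print(filtered)
```

Under these assumptions the result should match the first three rows and columns of the desired output in the question.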

You can use numpy.ma.masked_where to filter the NumPy array before building the dataframe:

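import numpy as np
import pandas as pd

arr = np.array([[1,5,7,2],
                [2,6,4,1],
                [7,9,1,12],
                [6,8,13,2]])

lst = ['John','David','Josua', 'Martin']

# Mask every value <= 5, then fill the masked positions with 0
df = pd.DataFrame(np.ma.masked_where(arr<=5, arr).filled(0), index=lst, columns=lst)
print(df)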

        John  David  Josua  Martin
John       0      0      7       0
David      0      6      0       0
Josua      7      9      0      12
Martin     6      8     13       0
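
A possible variation, not part of the answer above: `np.where` can do the same filtering without building an intermediate masked array, which may matter for an array as large as the one in the question. This is a sketch using the same small sample; the names `filtered_arr` and `expected` are hypothetical:

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 5, 7, 2],
                [2, 6, 4, 1],
                [7, 9, 1, 12],
                [6, 8, 13, 2]])
lst = ['John', 'David', 'Josua', 'Martin']

# Keep values greater than 5, write 0 elsewhere
filtered_arr = np.where(arr > 5, arr, 0)
df = pd.DataFrame(filtered_arr, index=lst, columns=lst)

# Sanity check against the masked_where approach
expected = np.ma.masked_where(arr <= 5, arr).filled(0)
print(np.array_equal(filtered_arr, expected))
```

Both routes should give the same values on this sample; which one is faster or lighter on memory for the full array is something to measure rather than assume.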
