I have a numpy array arr of the form:

array([[ 0.00021284, -0.04443965,  0.03926146, ...,  0.04830161,
    -0.11913304,  0.03370821],
   [ 0.01778569, -0.05192029, -0.00792321, ..., -0.01799901,
    -0.09819183,  0.06020728],
   [-0.00748426, -0.02401578,  0.01762747, ...,  0.09334017,
    -0.11837556,  0.00603597],
   [-0.03505319, -0.01932572, -0.03248611, ...,  0.00356432,
    -0.082398  ,  0.03887841],
   [-0.05111802, -0.0309066 ,  0.03542011, ..., -0.01343899,
    -0.10434885, -0.0315006 ]], dtype=float32)

Assume its shape is (5, 512).

I also have a pandas dataframe df of the form:

    Message
0   How are you?
1   What is your name?
2   What do you do?
3   What is your address?
4   Let's hang out?

I would like to attach each row in arr as an element in df by creating a new column:

    Message                Vector
0   How are you?           [ 0.00021284, -0.04443965,  0.03926146, ...,  0.04830161, -0.11913304, 0.03370821] 
1   What is your name?     [ 0.01778569, -0.05192029, -0.00792321, ..., -0.01799901, -0.09819183,  0.06020728]
2   What do you do?        [-0.00748426, -0.02401578,  0.01762747, ...,  0.09334017, -0.11837556,  0.00603597]
3   What is your address?  [-0.03505319, -0.01932572, -0.03248611, ...,  0.00356432, -0.082398,  0.03887841]
4   Let's hang out?        [-0.05111802, -0.0309066 ,  0.03542011, ..., -0.01343899, -0.10434885, -0.0315006 ]

What is an efficient way to achieve this?

  • Do you want the vectors in their own columns or in one big list like this? Commented Jan 30, 2020 at 19:21
  • When you say in their own columns, what would that look like? I'm not able to picture it, sorry. Commented Jan 30, 2020 at 19:21
  • Well, your vectors column could be a column for each item in the list. Or you could have it like you do there. I'll do it both ways. Commented Jan 30, 2020 at 19:22
  • Got it, so in this case like 512 columns? No, I want them all under one column. I know we can do pd.DataFrame(arr) and then stack it horizontally with the dataframe for the 512-column structure, but that is not what I am trying for. Commented Jan 30, 2020 at 19:24

1 Answer

Create an array for the problem and convert it to a list:

import numpy as np

a = np.array([[ 0.00021284, -0.04443965,  0.03926146,  0.04830161,
    -0.11913304,  0.03370821],
   [ 0.01778569, -0.05192029, -0.00792321, -0.01799901,
    -0.09819183,  0.06020728],
   [-0.00748426, -0.02401578,  0.01762747,  0.09334017,
    -0.11837556,  0.00603597],
   [-0.03505319, -0.01932572, -0.03248611,  0.00356432,
    -0.082398  ,  0.03887841],
   [-0.05111802, -0.0309066 ,  0.03542011, -0.01343899,
    -0.10434885, -0.0315006 ]]).tolist()

Results in:

print(a)

[[0.00021284, -0.04443965, 0.03926146, 0.04830161, -0.11913304, 0.03370821], [0.01778569, -0.05192029, -0.00792321, -0.01799901, -0.09819183, 0.06020728], [-0.00748426, -0.02401578, 0.01762747, 0.09334017, -0.11837556, 0.00603597], [-0.03505319, -0.01932572, -0.03248611, 0.00356432, -0.082398, 0.03887841], [-0.05111802, -0.0309066, 0.03542011, -0.01343899, -0.10434885, -0.0315006]]

Then add the list to the dataframe.

import pandas as pd

df = pd.DataFrame({"Message": [
"How are you?",
"What is your name?",
"What do you do?",
"What is your address?",
"Let's hang out?"]})
df['Array'] = a
print(df)

Output:

                Message                                              Array
0           How are you?  [0.00021284, -0.04443965, 0.03926146, 0.048301...
1     What is your name?  [0.01778569, -0.05192029, -0.00792321, -0.0179...
2        What do you do?  [-0.00748426, -0.02401578, 0.01762747, 0.09334...
3  What is your address?  [-0.03505319, -0.01932572, -0.03248611, 0.0035...
4        Let's hang out?  [-0.05111802, -0.0309066, 0.03542011, -0.01343...

To create everything at the beginning, you can use a dictionary:

df = pd.DataFrame({"Message": [
"How are you?",
"What is your name?",
"What do you do?",
"What is your address?",
"Let's hang out?"], "Array": a})
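
If the rows should stay as float32 arrays rather than Python lists, one alternative (a sketch, not part of the original answer) is to assign `list(arr)` instead of `arr.tolist()`. The random `arr` below is only a stand-in for the question's (5, 512) array, and the column name `Vector` is taken from the question's desired output.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's array: random values, shape (5, 512), float32.
rng = np.random.default_rng(0)
arr = rng.standard_normal((5, 512)).astype(np.float32)

df = pd.DataFrame({"Message": [
    "How are you?",
    "What is your name?",
    "What do you do?",
    "What is your address?",
    "Let's hang out?"]})

# list(arr) splits the 2-D array into five 1-D arrays, one per row.
# Each array is stored as a single cell in an object-dtype column.
df["Vector"] = list(arr)

first = df["Vector"].iloc[0]
print(type(first), first.dtype, first.shape)
```

Each cell then holds a NumPy array, so the original dtype is kept and no per-element conversion to Python floats takes place.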

3 Comments

Thanks @run-out. The generated list has a long trail of digits. I imagine this occurs because of the original float32 datatype? Regardless, is there any way to fix it to avoid changing the representation?
@run-out, have you tested the solution? It doesn't work for me.
The long trail of digits is a display artifact of the lengthy cell. The values are there.
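
On the trailing digits raised in the comments: one likely contributor, offered as a sketch rather than a diagnosis of the commenter's exact case, is that `tolist()` converts float32 elements to Python floats, which are double precision. The float32 value is unchanged, but its double-precision representation prints with more digits. The single value 0.1 below is an illustrative choice, not data from the question.

```python
import numpy as np

x = np.array([0.1], dtype=np.float32)

# tolist() converts each element to a built-in Python float (double precision).
as_list = x.tolist()

print(x[0])              # shown with float32 precision
print(type(as_list[0]))  # <class 'float'>
print(as_list[0])        # same value, printed with double precision

# Casting back to float32 recovers the original element exactly,
# so no information is lost by the conversion.
round_trip = np.float32(as_list[0])
print(round_trip == x[0])
```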