I have a column, column3, in df that holds an array of values. I want to access each element of column3, but I am unable to.

I tried indexing with df.column3[1], but it returns all the values. I have seen other SO threads that mention applymap or an unpack method, but I am not sure whether they apply here or, based on the examples, how to use them. For simplicity I am showing only one row. The data is a time series, so one of the columns is a timestamp; again for simplicity I show just column1 with the value a.

       column1   column 2   column 3
Row1   a         2          [{port_no=1, status=0, ts=1624467015}, {port_no=2, status=0, ts=1624467015}]

Expected output: the ability to query each port_no and its status. I don't mind creating independent df columns for each port_no and its associated values.

Expected output dataframe

       col1  col2  column 3                              col 4
Row1   a     2     [{port_no=1,status=0,ts=1624467015}]  [{port_no=1,status=0,ts=1624467015}]

2 Answers

Using apply to create an unknown number of columns:

import pandas as pd

data = [
    ['a', 2, [{'port_no': 1, 'status': 0, 'ts': 1234}, {'port_no': 2, 'status': 0, 'ts': 2345}]],
    ['b', 3, [{'port_no': 1, 'status': 0, 'ts': 3456}, {'port_no': 2, 'status': 0, 'ts': 4567}, {'port_no': 3, 'status': 0, 'ts': 5678}]]
]
columns = ['column1', 'column2', 'column3']
df = pd.DataFrame(data=data, columns=columns)

df
  column1  column2                                            column3
0       a        2  [{'port_no': 1, 'status': 0, 'ts': 1234}, {'po...
1       b        3  [{'port_no': 1, 'status': 0, 'ts': 3456}, {'po...


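def split_to_columns(row):
    # Spread the list in column3 across column3, column4, ... (one dict per column).
    # Note that column3 itself is overwritten with the first element of the list.
    column_3 = row['column3']
    for i, item in enumerate(column_3):
        row[f'column{i + 3}'] = item
    return row


# Rows with fewer elements get NaN in the extra columns.
df = df.apply(split_to_columns, axis=1)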
df
  column1  column2                                  column3                                  column4                                  column5
0       a        2  {'port_no': 1, 'status': 0, 'ts': 1234}  {'port_no': 2, 'status': 0, 'ts': 2345}                                      NaN
1       b        3  {'port_no': 1, 'status': 0, 'ts': 3456}  {'port_no': 2, 'status': 0, 'ts': 4567}  {'port_no': 3, 'status': 0, 'ts': 5678}

1 Comment

Sorry for the delayed response, and thanks a lot for the help. It works!

Just try explode, then build a new DataFrame from the resulting dicts:

out = pd.DataFrame(df['column 3'].explode().tolist())
Out[26]: 
   port_no  status          ts
0        1       0  1624467015
1        2       0  1624467015
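
The snippet above returns only the unpacked port fields. If the other columns need to stay attached to each port, something along these lines may work. It is a minimal sketch that assumes column3 holds real Python lists of dicts (not strings) and that no list is empty; the sample data and the names exploded, ports and long_df are made up for illustration.

```python
import pandas as pd

# Illustrative data in the same shape as the question (values are made up).
df = pd.DataFrame({
    "column1": ["a", "b"],
    "column2": [2, 3],
    "column3": [
        [{"port_no": 1, "status": 0, "ts": 1234}, {"port_no": 2, "status": 1, "ts": 2345}],
        [{"port_no": 1, "status": 0, "ts": 3456}],
    ],
})

# One row per list element; reset the index so each row has a unique label.
exploded = df.explode("column3").reset_index(drop=True)

# Turn each dict into its own set of columns (port_no, status, ts).
ports = pd.DataFrame(exploded["column3"].tolist())

# Put the original columns and the unpacked fields side by side.
long_df = pd.concat([exploded.drop(columns="column3"), ports], axis=1)

# Example query: every row that describes port 2.
port_2 = long_df[long_df["port_no"] == 2]
```

With the data in this long form, each port_no and its status can be filtered like any ordinary column.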

1 Comment

Thanks for the response. I see your point with explode, but it messes up the larger dataframe. Is there any way to select elements from the array/list and create a new column out of them?
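
One possible way to pick a single element out of each list and store it in a new column, without changing the number of rows, is sketched below. It assumes column3 holds Python lists of dicts; the helper status_of_port and the column names port1_status and port3_status are hypothetical and only for illustration.

```python
import pandas as pd

# Illustrative data in the same shape as the question (values are made up).
df = pd.DataFrame({
    "column1": ["a", "b"],
    "column2": [2, 3],
    "column3": [
        [{"port_no": 1, "status": 0, "ts": 1234}, {"port_no": 2, "status": 1, "ts": 2345}],
        [{"port_no": 1, "status": 1, "ts": 3456}, {"port_no": 3, "status": 1, "ts": 5678}],
    ],
})


def status_of_port(ports, port_no):
    """Return the status of the first entry matching port_no, or None if absent."""
    for entry in ports:
        if entry["port_no"] == port_no:
            return entry["status"]
    return None


# One new column per port of interest; the row count of df is unchanged.
df["port1_status"] = df["column3"].apply(lambda ports: status_of_port(ports, 1))
df["port3_status"] = df["column3"].apply(lambda ports: status_of_port(ports, 3))
```

Rows whose list has no entry for the requested port end up with a missing value in that column, so the frame keeps its original shape.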
