I have a pandas DataFrame with a column that holds values like the following:

Identifier
[1;12;7;3;0]
[4;5;2;6;0]

I want to split the values inside the square brackets into 5 new columns, keeping each new value on the same row (index) as the original value, like this:

Identifier,a,b,c,d,e
[1;12;7;3;0],1,12,7,3,0
[4;5;2;6;0],4,5,2,6,0

pattern = re.compile(r'(\d+)')
for g in raw_data["Identifier"]:
    new_id = raw_data.Identifier.str.findall(pattern) # this converts the Identifier into a list of the 5 values
raw_data.append({'a':new_id[0],'b':new_id[1],'c':new_id[2],'d':new_id[3],'d':new_id[4]}, ignore_index=True)

The code above appends the extracted values as new rows at the end of the DataFrame rather than placing them in the corresponding rows. How can I add the extracted values to the same row/index as the original 'Identifier' column?

1 Answer

One way is to use the str methods to extract the numbers, build a new DataFrame from them, and then join (or concatenate) the result with the original. For example,

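import pandas as pd

# Strip the brackets and split each string on ";" into a list of pieces
id_data = df.Identifier.str.strip("[]").str.split(";").tolist()
# Reuse df.index so the new columns line up with the original rows
df_id = pd.DataFrame(id_data, columns=list("abcde"), index=df.index).astype(int)
df2 = df.join(df_id)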

produces something like

      Identifier  a   b  c  d  e
10  [1;12;7;3;0]  1  12  7  3  0
20   [4;5;2;6;0]  4   5  2  6  0
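
For a version that runs on its own, here is a minimal sketch of the same idea. It uses `str.split(..., expand=True)` in place of the `tolist()` step, which is a variation on the approach above rather than the exact code from it. The sample DataFrame and its index labels (10 and 20) are assumptions made up to match the output shown, and the sketch assumes every row contains exactly five integers.

```python
import pandas as pd

# Hypothetical sample data mirroring the question; the index labels 10 and 20
# are assumed so that the result matches the output shown above.
df = pd.DataFrame({"Identifier": ["[1;12;7;3;0]", "[4;5;2;6;0]"]}, index=[10, 20])

# Strip the brackets, then split on ";" straight into separate columns.
# expand=True returns a DataFrame that keeps the index of df.
parts = df["Identifier"].str.strip("[]").str.split(";", expand=True)
parts.columns = list("abcde")  # assumes exactly five values per row
parts = parts.astype(int)      # the split pieces are strings until converted

# join aligns on the index, so each value stays on its original row.
df2 = df.join(parts)
print(df2)
```

The original attempt put the values at the end because `DataFrame.append` adds rows, not columns. Joining on the index avoids that, since the new columns carry the same row labels as `Identifier`.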