I am trying to extract data from a pandas DataFrame column that follows a specific pattern, so that each occurrence becomes a new row. The data looks like this:

id: id_101
description: id_name1
id: id_102
description: id_name2
id: id_103
description: id_name3

All of the above content is stored in a single row. I want to convert it as shown below, with each occurrence in its own row:

 , id, description
0, id_101, id_name1 
1, id_102, id_name2
2, id_103, id_name3 

1 Answer

If the data always comes in pairs, first use Series.str.split, then DataFrame.pivot with a helper column created by GroupBy.cumcount:

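# split each "key: value" string into two columns: 0 = key, 1 = value
df = df['col'].str.split(': ', expand=True)
# helper column: running count per key, so the nth id pairs with the nth description
df['g'] = df.groupby(0)[1].cumcount()
# pivot takes keyword arguments only in pandas 2.0+
df = df.pivot(index='g', columns=0, values=1).rename_axis(index=None, columns=None)
print(df)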
  description      id
0    id_name1  id_101
1    id_name2  id_102
2    id_name3  id_103
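
To try the pivot approach end to end, here is a minimal sketch. It assumes each "key: value" string sits in its own row of a column named col (the name used in the snippet above); the sample frame is rebuilt from the question, and the names parts and out are only illustrative. The final column selection restores the id, description order asked for, since pivot sorts the new columns alphabetically.

```python
import pandas as pd

# Sample data rebuilt from the question; one "key: value" string per row (an assumption).
df = pd.DataFrame({'col': ['id: id_101', 'description: id_name1',
                           'id: id_102', 'description: id_name2',
                           'id: id_103', 'description: id_name3']})

# Column 0 holds the key ("id" / "description"), column 1 holds the value.
parts = df['col'].str.split(': ', expand=True)

# Running count per key: the nth "id" and the nth "description" share the same g.
parts['g'] = parts.groupby(0).cumcount()

out = (parts.pivot(index='g', columns=0, values=1)
            .rename_axis(index=None, columns=None)
            [['id', 'description']])  # reorder columns to match the requested layout
print(out)
```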

Alternatively, starting again from the original df, take the values after ': ', convert them to a NumPy array, and reshape it into a new DataFrame:

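import pandas as pd
# keep only the text after ': ' from each row
a = df['col'].str.split(': ').str[1].to_numpy()
# every two consecutive values form one (id, description) row
df = pd.DataFrame(a.reshape(-1, 2), columns=['id','description'])
print(df)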
       id description
0  id_101    id_name1
1  id_102    id_name2
2  id_103    id_name3
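
Both snippets above treat every "key: value" string as a separate row of df['col'], while the question says the whole block is stored in a single row. If the block really sits in one cell, a preliminary split may be needed. The sketch below assumes the lines inside that cell are separated by '\n' (the question does not state the separator), and the names raw, lines and result are illustrative only.

```python
import pandas as pd

# Assumption: the whole block lives in ONE cell, with lines separated by '\n'.
raw = pd.DataFrame({'col': ['id: id_101\ndescription: id_name1\n'
                            'id: id_102\ndescription: id_name2\n'
                            'id: id_103\ndescription: id_name3']})

# Split the cell into a list of lines, then explode so each line gets its own row.
lines = raw['col'].str.split('\n').explode().reset_index(drop=True)

# From here the reshape approach applies unchanged.
values = lines.str.split(': ').str[1].to_numpy()
result = pd.DataFrame(values.reshape(-1, 2), columns=['id', 'description'])
print(result)
```

The reshape step relies on the lines strictly alternating between id and description; if a pair can be incomplete, the pivot variant with cumcount is likely the safer starting point.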