I have extracted several values from a file and now want to build a DataFrame from the data of interest. I tried the following:

anticodon = re.findall(r'(at.\w\w-\w\w)', line)
for line in anticodon:
    anticod = line.replace('at ', '')

import pandas as pd

df1 = pd.DataFrame({'id': [m_id], 'cod': [anticod]})
print(df1)

* I extracted m_id in a similar way.

But the output contains only the last row of both columns, not the entire columns. How can I get the complete data?

1 Answer

You are overwriting anticod on every iteration over anticodon, so after the loop it holds only the final value. You need to store each value instead: for example, create a list before the loop, anticods = [], and append to it inside the loop:

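import re

anticods = []

# findall returns every match on the line; keep each one instead of overwriting
anticodon = re.findall(r'(at.\w\w-\w\w)', line)
for match in anticodon:  # separate name so the loop does not shadow line
    anticod = match.replace('at ', '')
    anticods.append(anticod)

m_ids = []
# similar logic for m_id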

To then convert it into a DataFrame, pass your lists as the column values (both lists must have the same length):

import pandas as pd

d = {'id': m_ids, 'cod': anticods}
df1 = pd.DataFrame(data=d)
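
Putting the pieces together, here is a minimal end-to-end sketch. The question does not show the file format or how m_id is extracted, so sample_lines, the build_frame helper, and the "first token is the id" rule are all assumptions made up for illustration; only the regex and the replace call come from the question.

```python
import re

import pandas as pd

# Hypothetical input lines; the real file format is not shown in the question.
sample_lines = [
    "m001 found at GT-AC",
    "m002 found at CA-TG",
    "m003 found at TT-AA",
]


def build_frame(lines):
    """Collect one (id, cod) pair per regex match and return a DataFrame."""
    m_ids = []
    anticods = []
    for line in lines:
        # Assumption: the id is the first whitespace-separated token.
        m_id = line.split()[0]
        for match in re.findall(r'(at.\w\w-\w\w)', line):
            # Append to both lists together so they stay the same length.
            m_ids.append(m_id)
            anticods.append(match.replace('at ', ''))
    return pd.DataFrame({'id': m_ids, 'cod': anticods})


df1 = build_frame(sample_lines)
print(df1)
```

Appending to both lists in the same loop body keeps the columns aligned, which matters because pandas raises an error when the lists passed for the columns differ in length.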