I have a CSV file that I'm trying to extract and break down into parts. It has 10 columns. My current script (one line of which is shown below) asks the user for values of two columns (say columns A and C), gets the matching data from a third column (column F), and writes columns A and F to a new CSV file.

df1 = data.columnF[(data['columnA'] == data_name) & (data['columnC'] == study_name)]

Current output looks something like this:

name1,study1
name1,study2
name1,study2
name5,study9
name6,study6
name6,study0

Instead, I want the output to be multiple text files, one per name, skipping the step of writing everything to a CSV file and then breaking it into chunks.

File 'name1.txt' should have
study1
study2 (only once, without repetition) 

Similarly,

name5.txt > study9
name6.txt > study6
            study0

How can I do that?

1 Answer

Use groupby and loop over the groups, writing one file per group:

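mask = (data['columnA'] == data_name) & (data['columnC'] == study_name)
# Keep columnA alongside columnF so there is still a column to group by
df_grouped = data.loc[mask, ['columnA', 'columnF']].drop_duplicates().groupby('columnA')
for name, group in df_grouped:
    # One file per columnA value, one columnF value per line
    group['columnF'].to_csv(name + '.txt', index=False, header=False)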
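For reference, here is a minimal self-contained sketch of the same idea that can be run without the original CSV. The helper name `write_group_files`, the sample DataFrame, and the temporary output directory are assumptions made for illustration; only the column names `columnA` and `columnF` come from the question.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_group_files(df, key_col, value_col, out_dir):
    """Write one text file per unique key, listing that key's unique values.

    Hypothetical helper for illustration; returns {key: [values written]}.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    # Drop repeated (key, value) pairs first, then visit each key's rows.
    pairs = df[[key_col, value_col]].drop_duplicates()
    for key, group in pairs.groupby(key_col, sort=False):
        values = group[value_col].astype(str).tolist()
        (out_dir / f"{key}.txt").write_text("\n".join(values) + "\n")
        written[str(key)] = values
    return written


# Made-up sample data mirroring the output shown in the question.
sample = pd.DataFrame({
    "columnA": ["name1", "name1", "name1", "name5", "name6", "name6"],
    "columnF": ["study1", "study2", "study2", "study9", "study6", "study0"],
})

with tempfile.TemporaryDirectory() as tmp:
    result = write_group_files(sample, "columnA", "columnF", tmp)
    name1_text = (Path(tmp) / "name1.txt").read_text()
```

Here the files go to a temporary directory so the example cleans up after itself; in a real script you would pass the folder where the `.txt` files should end up.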