
How can I write the output of a for loop to a pandas DataFrame?

Input data is a list of data-frames (df_elements).

[                          seq  score    status
1652  TGGCTTCGATTTTGTTATCGATG  -0.22  negative
1277  GTACTGTGGAATCTCGGCAGGCT   4.87  negative
302   CCAAAGTCTCACTTGTTGAGAAC  -4.66  negative
1756  TGGCGGTGGTGGCGGCGCAGAGC   1.55  negative
5043  TGACGAAACATCTTATAAAGGAA   1.96  negative
3859  CAGAGCTCTTCAAACTTAAGAAC  -0.39  negative
1937  GTATGCTTGTGCTTCTCCAAAAA  -0.91  negative
2805  GGCCGGCCTGTGGTCGACGGGGA  -3.26  negative
3353                CCGATGGGC  -1.97  negative
5352  ACTTACTATTTACTGATCAGCAC   3.53  negative
5901  TTGAGGCTCTCCTTATCCAGATT   6.37  negative
5790  AAGGAAACGTGTAATGATAGGCG  -2.69  negative,                           seq  score    status
2197  CTTCCATTGAGCTGCTCCAGCAC  -0.97  negative
1336  CCAAATGCAACAATTCAAAGCCC  -0.44  negative
4825                CAATTTTGT  -6.44  negative
4991  ATACTGTTTGCTCACAAAAGGAG   2.15  negative
1652  TGGCTTCGATTTTGTTATCGATG  -0.22  negative
1964  ACCACTTTGTGGACGAATACGAC  -4.51  negative
4443  TTCCTCGTCTAGCCTTTCAGTGC   3.05  negative
4208  TGGCTGTGAACCCCTATCAGCTG   2.70  negative
212   CTGTCGTTTCAATGTTTAAGATA   6.43  negative
775                 GCTTTAAGT   0.06  negative
3899                GAGCAAAGC  -6.61  negative

I am trying to write the output of the for loop below to a data-frame. I tried creating an empty list (data) and appending the output row by row with data.append, but I get an error like: cannot concatenate object of type "";

The code below prints the output to the console:


cut_off = [0,1,2]

for co in cut_off:
    for df in df_elements:
        print co, "\t", str((df['score'] > co).sum())

The code should compare each cut_off value to the score column and, for each data-frame element, print the number of rows where the score is greater than the cut_off.

The output should look like this:

cutoff number
0   5  #for first dataframe element
0   5  #for second dataframe element

  • This works for me. Are the elements of the list pandas DataFrames? Are you sure? Commented Dec 4, 2019 at 11:10
  • Also, if you are using Python 3, don't forget to add parentheses to print. Commented Dec 4, 2019 at 11:13

1 Answer

# pandas is needed to build the DataFrame
import pandas as pd

# create empty lists for cutoff and number
cutoff_list = []
number_list = []

# loop through cutoff values and dataframes to populate the lists
# (cut_off and df_elements are the variables defined in the question)
for co in cut_off:
    for df in df_elements:
        cutoff_list.append(co)
        number_list.append((df['score'] > co).sum())

# create a dataframe from the lists
df = pd.DataFrame(list(zip(cutoff_list, number_list)),
                  columns=['cutoff', 'number'])

# print the desired output
print(df)
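
For reference, the same idea can probably be written more compactly by collecting one dict per row and passing the list to pd.DataFrame. This is only a sketch: the two small data-frames below are made-up stand-ins for df_elements (not the data from the question), and the extra "element" column, which records the position of each data-frame in the list, is my own addition.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df_elements and cut_off
df_elements = [
    pd.DataFrame({"score": [-0.22, 4.87, -4.66, 1.55, 1.96]}),
    pd.DataFrame({"score": [-0.97, 2.15, 3.05, 0.06]}),
]
cut_off = [0, 1, 2]

# One dict per (cutoff, data-frame) pair; "element" is an illustrative
# extra column holding the data-frame's position in the list
rows = [
    {"cutoff": co, "element": i, "number": int((df["score"] > co).sum())}
    for co in cut_off
    for i, df in enumerate(df_elements)
]

# Build the result in one step from the list of dicts
result = pd.DataFrame(rows, columns=["cutoff", "element", "number"])
print(result)
```

Building the rows first and creating the DataFrame once at the end tends to be simpler than appending to a DataFrame inside the loop.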

3 Comments

If this is the solution you were looking for, please consider accepting it. Otherwise, let me know how to improve it. Cheers.
It worked perfectly!! Could you please explain this part to me? pd.DataFrame(list(zip(cutoff_list , number_list)), columns =['cutoff', 'number']). What do the list and zip functions do?
@ranusharma In Python 3, zip returns an iterator that pairs up the items of its arguments position by position, and list() collects those tuples into a list. To see this, you can run a toy example: define a = [1, 2], b = [3, 4] and c = [5, 6], then run for item in zip(a, b, c): print(item), and finally list(zip(a, b, c)).
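
The toy example from the comment above, laid out as a runnable snippet. The variable name triples is just an illustrative choice.

```python
a = [1, 2]
b = [3, 4]
c = [5, 6]

# zip yields one tuple per position, taking one item from each list
for item in zip(a, b, c):
    print(item)
# prints (1, 3, 5) and then (2, 4, 6)

# list() collects those tuples into a list, which is the form
# passed to pd.DataFrame in the answer
triples = list(zip(a, b, c))
print(triples)
```

Each tuple becomes one row of the DataFrame, and the columns argument names the positions within the tuple.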
