Extracting Data With Python

Question

I am new to Python and I am trying to extract data from a large unsorted text file. I would like to know whether it is possible to extract all the data on every line where the word "stop_codon" occurs anywhere in the text document. This is what I have so far:

import re
regex = re.compile("stop_codon([^U]+)")

contigdata = open("contigs.txt").read()

for match in regex.finditer(contigdata):
    rules = match.group(0).splitlines()
    for rule in rules:
        if rule and not rule.startswith("#"):
            print rule

This is the output the script produces; I would prefer it to be all on one line.

contig00002 A
stop_codon  2076    2078    .   +   0   transcript_id "g2.t1"; gene_id "g2";

Any help would be greatly appreciated!

thefourtheye · Accepted Answer · 2013-09-02 11:03:27Z


If you just want to print all the output on a single line, change

print rule

to

print rule,
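The trailing comma is Python 2 syntax. On Python 3, where print is a function, the closest equivalent is the end argument. A minimal sketch, assuming Python 3; the helper name one_line is hypothetical, and it writes to an in-memory buffer only so the result can be inspected:

```python
import io

def one_line(rules):
    # Hypothetical helper: collect what print() writes into a string.
    out = io.StringIO()
    for rule in rules:
        if rule and not rule.startswith("#"):
            # end=" " replaces the default newline, much like `print rule,` in Python 2
            print(rule, end=" ", file=out)
    return out.getvalue().rstrip()

# Sample strings shaped like the output shown in the question
print(one_line(["contig00002 A", "stop_codon  2076    2078"]))
```

In a plain script the buffer is not needed; print(rule, end=" ") inside the existing loop has the same effect.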

We don't really need regular expressions for this. Checking each line for the substring is enough:

with open("contigs.txt") as f:
    for line in f:
        if "stop_codon" in line and line[0] != "#":
            print line,
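On Python 3 the same line-by-line filter might look like the sketch below. The function name stop_codon_lines and its keyword parameter are illustrative and not part of the original answer; startswith is used in place of line[0] purely as a matter of style.

```python
def stop_codon_lines(path, keyword="stop_codon"):
    """Yield every non-comment line of the file at `path` that contains `keyword`."""
    with open(path) as f:
        # Iterating over the file object reads one line at a time,
        # so the whole file is never held in memory.
        for line in f:
            if keyword in line and not line.startswith("#"):
                # Strip the trailing newline so the caller decides how to print.
                yield line.rstrip("\n")
```

Calling it as for line in stop_codon_lines("contigs.txt"): print(line) would then print each matching line in full, one per row.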
