
I am new to Python and am trying to extract data from a large, unsorted text file. Is it possible to extract every line in which the word "stop_codon" occurs anywhere in the document? This is what I have so far:

import re
regex = re.compile("stop_codon([^U]+)")

contigdata = open("contigs.txt").read()

for match in regex.finditer(contigdata):
    rules = match.group(0).splitlines()
    for rule in rules:
        if rule and not rule.startswith("#"):
            print rule

This is the output the script produces; I would prefer it to be all on one line.

contig00002 A
stop_codon  2076    2078    .   +   0   transcript_id "g2.t1"; gene_id "g2";

Any help would be gratefully appreciated!

1 Answer

If you just want to print all the output on a single line, change

print rule

to

print rule,
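
The trailing comma is Python 2 syntax. In Python 3, `print` is a function and the direct equivalent is `print(rule, end=" ")`. Another option that works the same way in both versions is to join the pieces before printing. A minimal sketch, assuming Python 3; the `rules` list here is made-up sample data standing in for the `rules` variable from the question:

```python
# Made-up sample standing in for `rules` from the question (an assumption).
rules = [
    "# a comment line that should be skipped",
    "contig00002 A",
    'stop_codon 2076 2078 . + 0 transcript_id "g2.t1"; gene_id "g2";',
]

# Keep non-empty, non-comment pieces, then join them with single spaces.
kept = [rule for rule in rules if rule and not rule.startswith("#")]
one_line = " ".join(kept)

# One call to print, so everything lands on one line.
print(one_line)
```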

We don't really need regular expressions for this; reading the file line by line and checking each line is enough:

with open("contigs.txt") as f:
    for line in f:
        if "stop_codon" in line and line[0] != "#":
            print line,
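
The same loop can be wrapped in a small generator so it is easy to reuse and test. This is only a sketch, assuming Python 3; the function name `stop_codon_lines`, its `keyword` parameter, and the sample lines are hypothetical, and the real layout of `contigs.txt` may differ:

```python
def stop_codon_lines(lines, keyword="stop_codon"):
    """Yield lines that contain `keyword`, skipping lines that start with '#'."""
    for line in lines:
        # startswith() is also safe on empty strings, unlike line[0].
        if keyword in line and not line.startswith("#"):
            yield line.rstrip("\n")

# Made-up sample lines, loosely modelled on the output in the question.
sample = [
    "# stop_codon mentioned in a comment\n",
    "contig00002\tA\tstart_codon\t1\t3\n",
    'contig00002\tA\tstop_codon\t2076\t2078\t.\t+\t0\ttranscript_id "g2.t1"; gene_id "g2";\n',
]

for line in stop_codon_lines(sample):
    print(line)

# With the real file, the call would look like:
#     with open("contigs.txt") as f:
#         for line in stop_codon_lines(f):
#             print(line)
```

Because the function accepts any iterable of strings, it reads the file lazily, one line at a time, so a large file never has to be loaded into memory in full.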