I have this program to generate N random sequences.

import random
N = 5
def randseq(abc, length):
    return "".join([random.choice(abc) for i in range(random.randint(1, length))])
for i in range(N):
    print(f'Sequence {i+1}:')
    print(randseq("ATCG", 120))

I got these sequences:

Sequence 1:

TGGTACACGTGCTTAATGTTAACCTGTCTGGCGCAGGGTAACTATTTCATCCCT

Sequence 2:

CGTATATAATGCTTCCTCTTCAGGCGACCTTGCGATAGTGTCCGGCCATGTGAGTCCCTGTGGAGTGCCTTTAGATGACCTATACGTCTTTAGACTATGTTTATGGGG

Sequence 3:

CACAGCCTTCCTCCAATG . . .

Sequence N:

How can I print the longest and the shortest of the N sequences, along with their lengths?

  • Create variables such as "seq_min" and "seq_max" and assign the first sequence to both. Then, while iterating through the remaining sequences, compare the length of the current sequence with the lengths of seq_min and seq_max; if it is shorter or longer, set seq_min or seq_max to that sequence. Commented Oct 9, 2021 at 3:09
  • @MichaelButscher Could you tell me how to assign the first sequence to the variables? Sorry, I am new to this field :/ Commented Oct 9, 2021 at 3:56
  • seq_max = seq_min = randseq("ATCG", 120). If you have not done so already, you should work through the Python tutorial. Commented Oct 9, 2021 at 4:32
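
The approach described in these comments could look roughly like the sketch below. It is an illustration rather than the commenter's own code: the names seq_min and seq_max come from the comment, while the variable first and the loop layout are assumptions.

```python
import random


def randseq(abc, length):
    # Same generator as in the question: a random string of 1..length symbols.
    return "".join(random.choice(abc) for _ in range(random.randint(1, length)))


N = 5

# Generate the first sequence and use it as both the initial
# shortest and the initial longest sequence.
first = randseq("ATCG", 120)
print("Sequence 1:")
print(first)
seq_min = seq_max = first

# Compare each remaining sequence against the current extremes.
for i in range(1, N):
    seq = randseq("ATCG", 120)
    print(f"Sequence {i + 1}:")
    print(seq)
    if len(seq) < len(seq_min):
        seq_min = seq
    if len(seq) > len(seq_max):
        seq_max = seq

print("Shortest:", seq_min, len(seq_min))
print("Longest:", seq_max, len(seq_max))
```

Because both variables start from a sequence that was actually generated, the reported shortest and longest are always among the printed sequences.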

1 Answer

Please have a look at my code. The explanations are in the comments.

import random


def randseq(abc, length):
    return "".join([random.choice(abc) for i in range(random.randint(1, length))])


# Define the input value in the main part of the script, below the
# function definitions; at module level it is a global variable
N = 5

# Initialize the longest sequence with the shortest possible value
# (an empty string) so that every random sequence is longer than it
longest_seq = ""

# Initialize the shortest sequence with a long one
# (assuming that randseq("ATCG", 1000) is long enough)
# so that every random sequence is shorter than it
shortest_seq = randseq("ATCG", 1000)

for i in range(N):
    print(f'Sequence {i+1}:')
    seq = randseq("ATCG", 120)
    
    # Find the longest one then update it to the longest_seq variable
    if len(seq) > len(longest_seq):
        longest_seq = seq
    
    # Find the shortest one then update it to the shortest_seq variable
    if len(seq) < len(shortest_seq):
        shortest_seq = seq
    
    print(seq)
   
print()
print('The longest seq is', longest_seq)
print('The length of the longest seq is', len(longest_seq))
print('The shortest seq is', shortest_seq)
print('The length of the shortest seq is', len(shortest_seq))

Example result (the sequences are random, so your output will differ when you run it):

Sequence 1:
CGGTGATCGCGATTACTGCCCGGCCTTGTCCACTCACAGCGATAACAGTGCTTATAGATCTCTCAAGTCTACCGTCTCACCCGTTGATTACCAA
Sequence 2:
AAGGTCAAGATTCGAATTCGTATCGCCGTATGGATAGGCGAAACGAGGGGTGGCTAAGGGGTAGACAGCAGAGCCGCTTTTGTACACCGTAAAACGGACGGTTCAGAACCGGAGGTACG
Sequence 3:
ACGGCCTCATGGATAATGCCCGGGGGAACAGGGAAGGAAAGATTTTGTCAAACTGATTCAGTTAC
Sequence 4:
GATACA
Sequence 5:
ATCGAAAGGAATATCTGTACGGGACGTTTGGTCTCGAGCCTAGCGTAAGCCGCCCGCAATTCGCTCTGATGAGCTACCG

The longest seq is AAGGTCAAGATTCGAATTCGTATCGCCGTATGGATAGGCGAAACGAGGGGTGGCTAAGGGGTAGACAGCAGAGCCGCTTTTGTACACCGTAAAACGGACGGTTCAGAACCGGAGGTACG
The length of the longest seq is 119
The shortest seq is GATACA
The length of the shortest seq is 6

Precaution:

In rare cases, the initial value of shortest_seq may itself be no longer than any of the generated sequences. The program will not crash, but it will report that initial sequence, which was never printed, as the shortest one. Passing a larger length to randseq lowers the probability of this happening, although it does not eliminate it.

For example, you can change it from:

shortest_seq = randseq("ATCG", 1000)

to:

shortest_seq = randseq("ATCG", 10000)
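
One way to avoid relying on a guessed initial value at all is to collect the sequences in a list and let the built-in min and max select by length. The following is a minimal sketch of that alternative, not part of the code above; the list name sequences is an assumption introduced for illustration.

```python
import random


def randseq(abc, length):
    # Same generator as above: a random string of 1..length symbols.
    return "".join(random.choice(abc) for _ in range(random.randint(1, length)))


N = 5

# Store every sequence so the extremes can only be chosen
# from sequences that were actually generated.
sequences = [randseq("ATCG", 120) for _ in range(N)]

for i, seq in enumerate(sequences, start=1):
    print(f"Sequence {i}:")
    print(seq)

# key=len makes min/max compare by length instead of alphabetically.
shortest_seq = min(sequences, key=len)
longest_seq = max(sequences, key=len)

print()
print("The longest seq is", longest_seq)
print("The length of the longest seq is", len(longest_seq))
print("The shortest seq is", shortest_seq)
print("The length of the shortest seq is", len(shortest_seq))
```

If several sequences share the same length, min and max each return the first one encountered. This version keeps all N sequences in memory, which is not a concern for small N.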